Compare commits

...

259 Commits

Author SHA1 Message Date
Shannon Sands bad9fe2452 add generic gateway startup readiness checks 2026-04-15 10:03:23 +10:00
Teknium 10494b42a1 feat(discord): register skills under /skill command group with category subcommands (#9909)
Instead of consuming one top-level slash command slot per skill (hitting the
100-command limit with ~26 built-ins + 74 skills), skills are now organized
under a single /skill group command with category-based subcommand groups:

  /skill creative ascii-art [args]
  /skill media gif-search [args]
  /skill mlops axolotl [args]

Discord supports 25 subcommand groups × 25 subcommands = 625 max skills,
well beyond the previous 74-slot ceiling.

Categories are derived from the skill directory structure:
- skills/creative/ascii-art/ → category 'creative'
- skills/mlops/training/axolotl/ → category 'mlops' (top-level parent)
- skills/dogfood/ → uncategorized (direct subcommand)

Changes:
- hermes_cli/commands.py: add discord_skill_commands_by_category() with
  category grouping, hub/disabled filtering, Discord limit enforcement
- gateway/platforms/discord.py: replace top-level skill registration with
  _register_skill_group() using app_commands.Group hierarchy
- tests: 7 new tests covering group creation, category grouping,
  uncategorized skills, hub exclusion, deep nesting, empty skills,
  and handler dispatch

Inspired by Discord community suggestion from bottium.
2026-04-14 16:27:02 -07:00
Teknium 039023f497 diag: log all hermes processes on unexpected gateway shutdown (#9905)
When the gateway receives SIGTERM/SIGINT, the shutdown handler now
runs 'ps aux' and logs every hermes/gateway-related process (excluding
itself). This will show in agent.log as:

  WARNING: Shutdown diagnostic — other hermes processes running:
    hermes  1234 ... hermes update --gateway
    hermes  5678 ... hermes gateway restart

This is the missing diagnostic for #5646 / #6666 — we can prove
the restarts are from systemctl but can't determine WHO issues the
systemctl command. Next time it happens, the agent.log will contain
the evidence (the process that sent the signal or called systemctl
should still be alive when the handler fires).
2026-04-14 16:26:36 -07:00
Teknium 6448e1da23 feat(zai): add GLM-5V-Turbo support for coding plan (#9907)
- Add glm-5v-turbo to OpenRouter, Nous, and native Z.AI model lists
- Add glm-5v context length entry (200K tokens) to model metadata
- Update Z.AI endpoint probe to try multiple candidate models per
  endpoint (glm-5.1, glm-5v-turbo, glm-4.7) — fixes detection for
  newer coding plan accounts that lack older models
- Add zai to _PROVIDER_VISION_MODELS so auxiliary vision tasks
  (vision_analyze, browser screenshots) route through 5v

Fixes #9888
2026-04-14 16:26:01 -07:00
Teknium 1e5e1e822b fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902)
- Add ESC key binding (eager) for secret_state and sudo_state modal
  prompts — fires immediately, same behavior as Ctrl+C cancel
- Update placeholder text: 'Enter to submit · ESC to skip' (was
  'Enter to skip' which was confusing — Enter on empty looked like
  submitting nothing rather than intentionally skipping)
- Update widget body text: 'ESC or Ctrl+C to skip'
- Change feedback message from 'Secret entry cancelled' to 'Secret
  entry skipped' — more accurate for the action taken
- getpass fallback prompt also updated for non-TUI mode
2026-04-14 16:11:37 -07:00
Teknium 55ce76b372 feat: add architecture-diagram skill (Cocoon AI port) (#9906)
Port of Cocoon AI's architecture-diagram-generator (MIT) as a Hermes skill.
Generates professional dark-themed system architecture diagrams as standalone
HTML/SVG files. Self-contained output, no dependencies.

- SKILL.md with design system specs, color palette, layout rules
- HTML template with all component types, arrow styles, legend examples
- Fits alongside excalidraw in creative/ category

Source: https://github.com/Cocoon-AI/architecture-diagram-generator
2026-04-14 16:10:18 -07:00
Teknium 1525624904 fix: block agent from self-destructing gateway via terminal (#6666)
Add dangerous command patterns that require approval when the agent
tries to run gateway lifecycle commands via the terminal tool:

- hermes gateway stop/restart — kills all running agents mid-work
- hermes update — pulls code and restarts the gateway
- systemctl restart/stop (with optional flags like --user)

These patterns fire the approval prompt so the user must explicitly
approve before the agent can kill its own gateway process. In YOLO
mode, the commands run without approval (by design — YOLO means the
user accepts all risks).

Also fixes the existing systemctl pattern to handle flags between
the command and action (e.g. 'systemctl --user restart' was previously
undetected because the regex expected the action immediately after
'systemctl').

Root cause: issue #6666 reported agents running 'hermes gateway
restart' via terminal, killing the gateway process mid-agent-loop.
The user sees the agent suddenly stop responding with no explanation.
Combined with the SIGTERM auto-recovery from PR #9875, the gateway
now both prevents accidental self-destruction AND recovers if it
happens anyway.

Test plan:
- Updated test_systemctl_restart_not_flagged → test_systemctl_restart_flagged
- All 119 approval tests pass
- E2E verified: hermes gateway restart, hermes update, systemctl
  --user restart all detected; hermes gateway status, systemctl
  status remain safe
2026-04-14 15:43:31 -07:00
Teknium 353b5bacbd test: add tests for /health/detailed endpoint and gateway health probe
- TestHealthDetailedEndpoint: 3 tests for the new API server endpoint
  (returns runtime data, handles missing status, no auth required)
- TestProbeGatewayHealth: 5 tests for _probe_gateway_health()
  (URL normalization, successful/failed probes, fallback chain)
- TestStatusRemoteGateway: 4 tests for /api/status remote fallback
  (remote probe triggers, skipped when local PID found, null PID handling)
2026-04-14 15:41:30 -07:00
Hermes Agent 139a5e37a4 docs(docker): add dashboard section, expose API port, update Compose example
- Running in gateway mode: expose port 8642 for the API server and
  health endpoint, with a note on when it's needed.
- New 'Running the dashboard' section: docker run command with
  GATEWAY_HEALTH_URL and env var reference table.
- Docker Compose example: updated to include both gateway and dashboard
  services with internal network connectivity (hermes-net), so the
  dashboard probes the gateway via http://hermes:8642.
- Concurrent access warning: clarified that running a read-only
  dashboard alongside the gateway is safe.
2026-04-14 15:41:30 -07:00
Hermes Agent 673acf22ae fix: override stale 'stopped' state when health probe confirms gateway alive
When the gateway responds to the health probe but the local
gateway_state.json has a stale 'stopped' state (common in cross-container
setups where the file was written before the gateway restarted), the
dashboard would show 'Running (remote)' but with a 'Stopped' badge.

Now if the HTTP probe succeeded (remote_health_body is not None) and
gateway_state is 'stopped' or None, override it to 'running'. Also
handles the no-shared-volume case where runtime is None entirely.
2026-04-14 15:41:30 -07:00
Hermes Agent 6ed682f111 fix: normalise GATEWAY_HEALTH_URL to base URL before probing
The probe was appending '/detailed' to whatever URL was provided,
so GATEWAY_HEALTH_URL=http://host:8642 would try /8642/detailed
and /8642 — neither of which are valid routes.

Now strips any trailing /health or /health/detailed from the env var
and always probes {base}/health/detailed then {base}/health.
Accepts bare base URL, /health, or /health/detailed forms.
2026-04-14 15:41:30 -07:00
Hermes Agent 45595f4805 feat(dashboard): add HTTP health probe for cross-container gateway detection
The dashboard's gateway status detection relied solely on local PID checks
(os.kill + /proc), which fails when the gateway runs in a separate container.

Changes:
- web_server.py: Add _probe_gateway_health() that queries the gateway's HTTP
  /health/detailed endpoint when the local PID check fails. Activated by
  setting the GATEWAY_HEALTH_URL env var (e.g. http://gateway:8642/health).
  Falls back to standard PID check when the env var is not set.
- api_server.py: Add GET /health/detailed endpoint that returns full gateway
  state (platforms, gateway_state, active_agents, pid, etc.) without auth.
  The existing GET /health remains unchanged for backwards compatibility.
- StatusPage.tsx: Handle the case where gateway_pid is null but the gateway
  is running remotely, displaying 'Running (remote)' instead of 'PID null'.

Environment variables:
- GATEWAY_HEALTH_URL: URL of the gateway health endpoint (e.g.
  http://gateway-container:8642/health). Unset = local PID check only.
- GATEWAY_HEALTH_TIMEOUT: Probe timeout in seconds (default: 3).
2026-04-14 15:41:30 -07:00
Teknium 397386cae2 fix: gateway auto-recovers from unexpected SIGTERM via systemd (#5646)
Root cause: when the gateway received SIGTERM (from hermes update,
external kill, WSL2 runtime, etc.), it exited with status 0. systemd's
Restart=on-failure only restarts on non-zero exit, so the gateway
stayed dead permanently. Users had to manually restart.

Fix 1: Signal-initiated shutdown exits non-zero
When SIGTERM/SIGINT is received and no restart was requested (via
/restart, /update, or SIGUSR1), start_gateway() returns False which
causes sys.exit(1). systemd sees a failure exit and auto-restarts
after RestartSec=30.

This is safe because systemctl stop tracks its own stop-requested
state independently of exit code — Restart= never fires for a
deliberate stop, regardless of exit code.

Also logs 'Received SIGTERM/SIGINT — initiating shutdown' so the
cause of unexpected shutdowns is visible in agent.log.

Fix 2: PID file ownership guard
remove_pid_file() now checks that the PID file belongs to the current
process before removing it. During --replace handoffs, the old
process's atexit handler could fire AFTER the new process wrote its
PID file, deleting the new record. This left the gateway running but
invisible to get_running_pid(), causing 'Another gateway already
running' errors on next restart.

Test plan:
- All restart drain tests pass (13)
- All gateway service tests pass (84)
- All update gateway restart tests pass (34)
2026-04-14 15:35:58 -07:00
Teknium eed891f1bb security: supply chain hardening — CI pinning, dep pinning, and code fixes (#9801)
CI/CD Hardening:
- Pin all 12 GitHub Actions to full commit SHAs (was mutable @vN tags)
- Add explicit permissions: {contents: read} to 4 workflows
- Pin CI pip installs to exact versions (pyyaml==6.0.2, httpx==0.28.1)
- Extend supply-chain-audit.yml to scan workflow, Dockerfile, dependency
  manifest, and Actions version changes

Dependency Pinning:
- Pin git-based Python deps to commit SHAs (atroposlib, tinker, yc-bench)
- Pin WhatsApp Baileys from mutable branch to commit SHA

Tool Registry:
- Reject tool name shadowing from different tool families (plugins/MCP
  cannot overwrite built-in tools). MCP-to-MCP overwrites still allowed.

MCP Security:
- Add tool description content scanning for prompt injection patterns
- Log detailed change diff on dynamic tool refresh at WARNING level

Skill Manager:
- Fix dangerous verdict bug: agent-created skills with dangerous
  findings were silently allowed (ask->None->allow). Now blocked.
2026-04-14 14:23:37 -07:00
Teknium 9bbf7659e9 chore: add Roy-oss1 to AUTHOR_MAP 2026-04-14 14:22:11 -07:00
Roy-oss1 1aa76620d4 fix(feishu): keep approval clicks synchronized with callback card state
Feishu approval clicks need the resolved card to come back from the
synchronous callback path itself. Leaving approval resolution to the
generic asynchronous card-action flow made button feedback depend on
later loop work instead of the callback response the client is waiting
for.

Change-Id: I574997cbbcaa097fdba759b47367e28d1b56b040
Constraint: Feishu card-action callbacks must acknowledge quickly and reflect final approval state from the callback response path
Rejected: Keep approval handling on the generic async card-action route | leaves card state synchronization vulnerable to callback timing and follow-up update ordering
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep approval callback response construction separate from async queue unblocking unless Feishu callback semantics change
Tested: pytest tests/gateway/test_feishu.py tests/gateway/test_feishu_approval_buttons.py tests/gateway/test_approve_deny_commands.py tests/gateway/test_slack_approval_buttons.py tests/gateway/test_telegram_approval_buttons.py -q
Not-tested: Live Feishu workspace end-to-end callback rendering
2026-04-14 14:22:11 -07:00
Teknium fa8c448f7d fix: notify active sessions on gateway shutdown + update health check
Three fixes for gateway lifecycle stability:

1. Notify active sessions before shutdown (#new)
   When the gateway receives SIGTERM or /restart, it now sends a
   notification to every chat with an active agent BEFORE starting
   the drain. Users see:
   - Shutdown: 'Gateway shutting down — your task will be interrupted.'
   - Restart: 'Gateway restarting — use /retry after restart to continue.'
   Deduplicates per-chat so group sessions with multiple users get
   one notification. Best-effort: send failures are logged and swallowed.

2. Skip .clean_shutdown marker when drain timed out
   Previously, a graceful SIGTERM always wrote .clean_shutdown, even if
   agents were force-interrupted when the drain timed out. This meant
   the next startup skipped session suspension, leaving interrupted
   sessions in a broken state (trailing tool response, no final message).
   Now the marker is only written if the drain completed without timeout,
   so interrupted sessions get properly suspended on next startup.

3. Post-restart health check for hermes update (#6631)
   cmd_update() now verifies the gateway actually survived after
   systemctl restart (sleep 3s + is-active check). If the service
   crashed immediately, it retries once. If still dead, prints
   actionable diagnostics (journalctl command, manual restart hint).

Also closes #8104 — already fixed on main (the /restart handler
correctly detects systemd via INVOCATION_ID and uses via_service=True).

Test plan:
- 6 new tests for shutdown notifications (dedup, restart vs shutdown
  messaging, sentinel filtering, send failure resilience)
- Existing restart drain + update tests pass (47 total)
2026-04-14 14:21:57 -07:00
Teknium 95d11dfd8e docs: automation templates gallery + comparison post (#9821)
* feat(skills): add fitness-nutrition skill to optional-skills

Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'

* docs: add automation templates gallery and comparison post

- New docs page: guides/automation-templates.md with 15+ ready-to-use
  automation recipes covering development workflow, devops, research,
  GitHub events, and business operations
- Comparison post (hermes-already-has-routines.md) showing Hermes has
  had schedule/webhook/API triggers since March 2026
- Added automation-templates to sidebar navigation

---------

Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>
2026-04-14 12:30:50 -07:00
Teknium a37a095980 fix: detect qwen-oauth provider via CLI tokens in /model picker
Seed qwen-oauth credentials from resolve_qwen_runtime_credentials() in
_seed_from_singletons(). Users who authenticate via 'qwen auth qwen-oauth'
store tokens in ~/.qwen/oauth_creds.json which the runtime resolver reads
but the credential pool couldn't detect — same gap pattern as copilot.

Uses refresh_if_expiring=False to avoid network calls during discovery.
2026-04-14 11:16:26 -07:00
Marvae 0bd3f521ae fix: detect copilot provider via gh auth token in /model picker
Seed copilot credentials from resolve_copilot_token() in the credential
pool's _seed_from_singletons(), alongside the existing anthropic and
openai-codex seeding logic. This makes copilot appear in the /model
provider picker when the user authenticates solely through gh auth token.

Cherry-picked from PR #9767 by Marvae.
2026-04-14 11:16:26 -07:00
Teknium 3e0bccc54c fix: update existing webhook tests to use _webhook_register_url
Follow-up for cherry-picked PR #9746 — three pre-existing tests used
adapter._webhook_url (bare URL) in mock data, but _register_webhook
and _unregister_webhook now compare against _webhook_register_url
(password-bearing URL). Updated to match.
2026-04-14 11:02:48 -07:00
cypres0099 326cbbe40e fix(gateway/bluebubbles): embed password in registered webhook URL for inbound auth
When BlueBubbles posts webhook events to the adapter, it uses the exact
URL registered via /api/v1/webhook — and BB's registration API does not
support custom headers. The adapter currently registers the bare URL
(no credentials), but then requires password auth on inbound POSTs,
rejecting every webhook with HTTP 401.

This is masked on fresh BB installs by a race condition: the webhook
might register once with a prior (possibly patched) URL and keep working
until the first restart. On v0.9.0, _unregister_webhook runs on clean
shutdown, so the next startup re-registers with the bare URL and the
401s begin. Users see the bot go silent with no obvious cause.

Root cause: there's no way to pass auth credentials from BB to the
webhook handler except via the URL itself. BB accepts query params and
preserves them on outbound POSTs.

## Fix

Introduce `_webhook_register_url` — the URL handed to BB's registration
API, with the configured password appended as a `?password=<value>`
query param. The existing webhook auth handler already accepts this
form (it reads `request.query.get("password")`), so no change to the
receive side is needed.

The bare `_webhook_url` is still used for logging and for binding the
local listener, so credentials don't leak into log output. Only the
registration/find/unregister paths use the password-bearing form.

## Notes

- Password is URL-encoded via urllib.parse.quote, handling special
  characters (&, *, @, etc.) that would otherwise break parsing.
- Storing the password in BB's webhook table is not a new disclosure:
  anyone with access to that table already has the BB admin password
  (same credential used for every other API call).
- If `self.password` is empty (no auth configured), the register URL
  is the bare URL — preserves current behavior for unauthenticated
  local-only setups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:02:48 -07:00
cypres0099 8b52356849 fix(gateway/bluebubbles): fall back to data.chats[0].guid when chatGuid missing
BlueBubbles v1.9+ webhook payloads for new-message events do not always
include a top-level chatGuid field on the message data object. Instead,
the chat GUID is nested under data.chats[0].guid.

The adapter currently checks five top-level fallback locations (record and
payload, snake_case and camelCase, plus payload.guid) but never looks
inside the chats array. When none of those top-level fields contain the
GUID, the adapter falls through to using the sender's phone/email as the
session chat ID.

This causes two observable bugs when a user is a participant in both a DM
and a group chat with the bot:

1. DM and group sessions merge. Every message from that user ends up with
   the same session_chat_id (their own address), so the bot cannot
   distinguish which thread the message came from.

2. Outbound routing becomes ambiguous. _resolve_chat_guid() iterates all
   chats and returns the first one where the address appears as a
   participant; group chats typically sort ahead of DMs by activity, so
   replies and cron messages intended for the DM can land in a group.

This was observed in production: a user's morning brief cron delivered to
a group chat with his spouse instead of his DM thread.

The fix adds a single fallback that extracts chat_guid from
record["chats"][0]["guid"] when the top-level fields are empty. The chats
array is included in every new-message webhook payload in BB v1.9.9
(verified against a live server). It is backwards compatible: if a future
BB version starts including chatGuid at the top level, that still wins.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:02:48 -07:00
cypres0099 064f8d74de fix(gateway/bluebubbles): remove invalid "message" from webhook event registration
The BlueBubbles adapter registers its webhook with three events:
["new-message", "updated-message", "message"]. The third, "message",
is not a valid event type in the BlueBubbles server API — BB rejects
the registration payload with HTTP 400 Bad Request.

Currently this is masked by the "crash resilience" check in
_register_webhook, which reuses any existing registration matching the
webhook URL and short-circuits before reaching the API call. So an
already-registered webhook from a prior run keeps working. But any fresh
install, or any restart after _unregister_webhook has run during a clean
shutdown, fails to re-register and silently stops receiving messages.

Observed in production: after a gateway restart in v0.9.0 (which auto-
unregisters on shutdown), the next startup hit this 400 and the bot went
silent until the invalid event was removed.

BlueBubbles documents "new-message" and "updated-message" as the message
event types (see https://docs.bluebubbles.app/). There is no "message"
event, and no harm in dropping it — the two remaining events cover all
inbound message webhooks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:02:48 -07:00
Teknium 99bcc2de5b fix(security): harden dashboard API against unauthenticated access (#9800)
Addresses responsible disclosure from FuzzMind Security Lab (CVE pending).

The web dashboard API server had 36 endpoints, of which only 5 checked
the session token. The token itself was served from an unauthenticated
GET /api/auth/session-token endpoint, rendering the protection circular.
When bound to 0.0.0.0 (--host flag), all API keys, config, and cron
management were accessible to any machine on the network.

Changes:
- Add auth middleware requiring session token on ALL /api/ routes except
  a small public whitelist (status, config/defaults, config/schema,
  model/info)
- Remove GET /api/auth/session-token endpoint entirely; inject the token
  into index.html via a <script> tag at serve time instead
- Replace all inline token comparisons (!=) with hmac.compare_digest()
  to prevent timing side-channel attacks
- Block non-localhost binding by default; require --insecure flag to
  override (with warning log)
- Update frontend fetchJSON() to send Authorization header on all
  requests using the injected window.__HERMES_SESSION_TOKEN__

Credit: Callum (@0xca1x) and @migraine-sudo at FuzzMind Security Lab
2026-04-14 10:57:56 -07:00
asheriif b583210c97 fix(gateway): fix regression causing display.streaming to override root streaming key 2026-04-14 10:52:23 -07:00
Teknium 8bb5973950 docs: add proxy mode documentation
- Matrix docs: full Proxy Mode section with architecture diagram,
  step-by-step setup (host + Docker), docker-compose.yml/Dockerfile
  examples, configuration reference, and limitations notes
- API Server docs: add Proxy Mode section explaining the api_server
  serves as the backend for gateway proxy mode
- Environment variables reference: add GATEWAY_PROXY_URL and
  GATEWAY_PROXY_KEY entries
2026-04-14 10:49:48 -07:00
Teknium 90c98345c9 feat: gateway proxy mode — forward messages to remote API server
When GATEWAY_PROXY_URL (or gateway.proxy_url in config.yaml) is set,
the gateway becomes a thin relay: it handles platform I/O (encryption,
threading, media) and delegates all agent work to a remote Hermes API
server via POST /v1/chat/completions with SSE streaming.

This enables the primary use case of running a Matrix E2EE gateway in
Docker on Linux while the actual agent runs on the host (e.g. macOS)
with full access to local files, memory, skills, and a unified session
store. Works for any platform adapter, not just Matrix.

Configuration:
  - GATEWAY_PROXY_URL env var (Docker-friendly)
  - gateway.proxy_url in config.yaml
  - GATEWAY_PROXY_KEY env var for API auth (matches API_SERVER_KEY)
  - X-Hermes-Session-Id header for session continuity

Architecture:
  - _get_proxy_url() checks env var first, then config.yaml
  - _run_agent_via_proxy() handles HTTP forwarding with SSE streaming
  - _run_agent() delegates to proxy path when URL is configured
  - Platform streaming (GatewayStreamConsumer) works through proxy
  - Returns compatible result dict for session store recording

Files changed:
  - gateway/run.py: proxy mode implementation (~250 lines)
  - hermes_cli/config.py: GATEWAY_PROXY_URL + GATEWAY_PROXY_KEY env vars
  - tests/gateway/test_proxy_mode.py: 17 tests covering config
    resolution, dispatch, HTTP forwarding, error handling, message
    filtering, and result shape validation

Closes discussion from Cars29 re: Matrix gateway mixed-mode issue.
2026-04-14 10:49:48 -07:00
zhiheng.liu 1ace9b4dc4 fix: memory_setup.py - write non-secret env vars, check all fields in status
Critical bug fixes only (no redundant changes):

1. **Write non-secret fields to .env** - Add non-secret fields with env_var to env_writes so they get saved to .env
2. **Status checks all fields** - Check all fields with env_var (both secret and non-secret), not just secrets

Fixes:
- OPENVIKING_ENDPOINT and similar non-secret env vars now get written to .env
- hermes memory status now shows ALL missing required fields
2026-04-14 10:49:35 -07:00
dirtyfancy e964cfc403 fix(gateway): trigger memory provider shutdown on /new and /reset
The /new and /reset commands were not calling shutdown_memory_provider()
on the cached agent before eviction. This caused OpenViking (and any
memory provider that relies on session-end shutdown) to skip commit,
leaving memories un-indexed until idle timeout or gateway shutdown.

Add the missing shutdown_memory_provider() call in _handle_reset_command(),
matching the behavior already present in the session expiry watcher.

Fixes #7759
2026-04-14 10:49:35 -07:00
Disaster-Terminator 9bdfcd1b93 feat: sort tool search results by score and add corresponding unit test 2026-04-14 10:49:35 -07:00
Teknium b867171291 fix: preserve profile name completion in dynamic shell completion
The dynamic parser walker from the contributor's commit lost the profile
name tab-completion that existed in the old static generators. This adds
it back for all three shells:

- Bash: _hermes_profiles() helper, -p/--profile completion, profile
  action→name completion (use/delete/show/alias/rename/export)
- Zsh: _hermes_profiles() function, -p/--profile argument spec, profile
  action case with name completion
- Fish: __hermes_profiles function, -s p -l profile flag, profile action
  completions

Also removes the dead fallback path in cmd_completion() that imported
the old static generators from profiles.py (parser is always available
via the lambda wiring) and adds 11 regression-prevention tests for
profile completion.
2026-04-14 10:45:42 -07:00
leozeli c95b1c5096 fix(install): add fish shell support in install.sh
Fish users' $SHELL is /usr/bin/fish, which fell into the '*' case and
incorrectly wrote 'export PATH=...' to ~/.bashrc and ~/.zshrc — neither
of which fish reads.

- setup_path(): add fish) case that writes fish_add_path to
  ~/.config/fish/config.fish (fish-compatible PATH syntax)
- setup_path(): skip ~/.profile for fish (not sourced by fish)
- print_success(): show correct reload instruction for fish:
  source ~/.config/fish/config.fish

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:45:42 -07:00
leozeli a686dbdd26 feat(cli): add dynamic shell completion for bash, zsh, and fish
Replaces the hardcoded completion stubs in profiles.py with a dynamic
generator that walks the live argparse parser tree at runtime.

- New hermes_cli/completion.py: _walk() recursively extracts all
  subcommands and flags; generate_bash/zsh/fish() produce complete
  scripts with nested subcommand support
- cmd_completion now accepts the parser via closure so completions
  always reflect the actual registered commands (including plugin-
  registered ones like honcho)
- completion subcommand now accepts bash | zsh | fish (fish requested
  in issue comments)
- Fix _SUBCOMMANDS set: add honcho, claw, plugins, acp, webhook,
  memory, dump, debug, backup, import, completion, logs so that
  multi-word session names after -c/-r are not broken by these commands
- Add tests/hermes_cli/test_completion.py: 17 tests covering parser
  extraction, alias deduplication, bash/zsh/fish output content,
  bash syntax validation, fish syntax validation, and subcommand
  drift prevention

Tested on Linux (Arch). bash and fish completion verified live.
zsh script passes syntax check (zsh not installed on test machine).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:45:42 -07:00
N0nb0at b21b3bfd68 feat(plugins): namespaced skill registration for plugin skill bundles
Add ctx.register_skill() API so plugins can ship SKILL.md files under
a 'plugin:skill' namespace, preventing name collisions with built-in
Hermes skills. skill_view() detects the ':' separator and routes to
the plugin registry while bare names continue through the existing
flat-tree scan unchanged.

Key additions:
- agent/skill_utils: parse_qualified_name(), is_valid_namespace()
- hermes_cli/plugins: PluginContext.register_skill(), PluginManager
  skill registry (find/list/remove)
- tools/skills_tool: qualified name dispatch in skill_view(),
  _serve_plugin_skill() with full guards (disabled, platform,
  injection scan), bundle context banner with sibling listing,
  stale registry self-heal
- Hoisted _INJECTION_PATTERNS to module level (dedup)
- Updated skill_view schema description

Based on PR #9334 by N0nb0at. Lean P1 salvage — omits autogen shim
(P2) for a simpler first merge.

Closes #8422
2026-04-14 10:42:58 -07:00
Dusk1e 4b47856f90 fix: load credentials from HERMES_HOME .env in trajectory_compressor 2026-04-14 10:24:19 -07:00
Teknium 8a002d4efc chore: add ChimingLiu to AUTHOR_MAP 2026-04-14 10:22:11 -07:00
Teknium 8ea9ceb44c fix: guard reply_to_text against DeletedReferencedMessage
Use getattr() for resolved.content since discord.py's
DeletedReferencedMessage lacks a content attribute. Adds test
for the deleted-message edge case.
2026-04-14 10:22:11 -07:00
ChimingLiu 7636baf49c feat(discord): extract reply text from message references 2026-04-14 10:22:11 -07:00
Teknium 0e7dd30acc fix(browser): fix Camofox JS eval endpoint, userId, and package rename (#9774)
- Fix _camofox_eval() endpoint: /tabs/{id}/eval → /tabs/{id}/evaluate
  (correct Camofox REST API path)
- Add required userId field to JS eval request body (all other Camofox
  endpoints already include it)
- Update npm package from @askjo/camoufox-browser ^1.0.0 to
  @askjo/camofox-browser ^1.5.2 (upstream package was renamed)
- Update tools_config.py post-setup to reference new package directory
  and npx command
- Bump Node engine requirement from >=18 to >=20 (required by
  camoufox-js dependency in camofox-browser v1.5.2)
- Regenerate package-lock.json

Fixes issues reported in PRs #9472, #8267, #7208 (stale).
2026-04-14 10:21:54 -07:00
Teknium 5f36b42b2e fix: nest msvcrt import inside fcntl except block
Match cron/scheduler.py pattern — only attempt msvcrt import when
fcntl is unavailable. Pre-declare msvcrt = None at module level so
_file_lock() references don't NameError on Linux.
2026-04-14 10:18:05 -07:00
Dusk1e 420d27098f fix(tools): keep memory tool available when fcntl is unavailable 2026-04-14 10:18:05 -07:00
Zhuofeng Wang 449c17e9a9 fix(gateway): support Telegram MarkdownV2 expandable blockquotes 2026-04-14 10:16:49 -07:00
shijianzhi 70611879de fix(cli): fix doctor checks for Kimi China credentials 2026-04-14 10:16:30 -07:00
Austin Pickett 206259d111 Merge pull request #9701 from NousResearch/fix/dashboard-routing-v2
feat(web): re-apply dashboard UI improvements on top of i18n
2026-04-14 08:46:17 -07:00
Austin Pickett 4ffaac542b fix(web): i18n fixes for sidebar and dropdown labels
- Add missing translation keys: skills.resultCount, skills.toolsetLabel
- Replace hardcoded "result(s)" and "toolset" with translated strings
- Fix stale useMemo in SkillsPage allCategories (missing `t` dependency)
  causing sidebar category names to stay in English after language switch

Made-with: Cursor
2026-04-14 10:32:51 -04:00
Austin Pickett e88aa8a58c feat(web): re-apply dashboard UI improvements on top of i18n
Re-applies changes from #9471 that were overwritten by the i18n PR:

- URL-based routing via react-router-dom (NavLink, Routes, BrowserRouter)
- Replace emoji icons with lucide-react in ConfigPage and SkillsPage
- Sidebar layout for ConfigPage, SkillsPage, and LogsPage
- Custom dropdown Select component (SelectOption) in CronPage
- Remove all non-functional rounded borders across the UI
- Fixed header with proper content offset

Made-with: Cursor
2026-04-14 10:23:43 -04:00
Ben Barclay 16f9d02084 Merge pull request #9475 from NousResearch/docs/fix-docker-version-command
docs: update docker version check command
2026-04-14 20:27:24 +10:00
Teknium 7ad47ace51 fix: resolve remaining 4 CI test failures (#9543)
- test_auth_commands: suppress _seed_from_singletons auto-seeding that
  adds extra credentials from CI env (same pattern as nearby tests)
- test_interrupt: clear stale _interrupted_threads set to prevent
  thread ident reuse from prior tests in same xdist worker
- test_code_execution: add watch_patterns to _BLOCKED_TERMINAL_PARAMS
  to match production _TERMINAL_BLOCKED_PARAMS
2026-04-14 02:18:38 -07:00
Teknium b4fcec6412 fix: prevent streaming cursor from appearing as standalone messages (#9538)
During rapid tool-calling, the model often emits 1-2 tokens before
switching to tool calls. The stream consumer would create a new message
with 'X ▉' (short text + cursor), and if the follow-up edit to strip
the cursor was rate-limited by the platform, the cursor remained as
a permanent standalone message — reported on Telegram as 'white box'
artifacts.

Add a minimum-content guard in _send_or_edit: when creating a new
standalone message (no existing message_id), require at least 4
visible characters alongside the cursor before sending. Shorter text
accumulates into the next streaming segment instead.

This prevents cursor-only 'tofu' messages across all platforms without
affecting normal streaming (edits to existing messages, final sends
without cursor, and messages with substantial text are all unaffected).

Reported by @michalkomar on X.
2026-04-14 01:52:42 -07:00
Teknium 2558d28a9b fix: resolve CI test failures — add missing functions, fix stale tests (#9483)
Production fixes:
- Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors)
- Add clear_session() to tools/approval.py (fixes 9 setup errors)
- Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix)
- Fall back to inline api_key in named custom providers when key_env
  is absent (runtime_provider.py)

Test fixes:
- test_memory_user_id: use builtin+external provider pair, fix honcho
  peer_name override test to match production behavior
- test_display_config: remove TestHelpers for non-existent functions
- test_auxiliary_client: fix OAuth tokens to match _is_oauth_token
  patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client
- test_cli_interrupt_subagent: add missing _execution_thread_id attr
- test_compress_focus: add model/provider/api_key/base_url/api_mode
  to mock compressor
- test_auth_provider_gate: add autouse fixture to clean Anthropic env
  vars that leak from CI secrets
- test_opencode_go_in_model_list: accept both 'built-in' and 'hermes'
  source (models.dev API unavailable in CI)
- test_email: verify email Platform enum membership instead of source
  inspection (build_channel_directory now uses dynamic enum loop)
- test_feishu: add bot_added/bot_deleted handler mocks to _Builder
- test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch,
  add _pending_megolm and _joined_rooms to Matrix adapter mocks
- test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets
  this in CI, changing the restart call signature)
- test_session_hygiene: add user_id to SessionSource
- test_session_env: use relative baseline for contextvar clear check
  (pytest-xdist workers share context)
2026-04-14 01:43:45 -07:00
Jiawen-lee 2cfd2dafc6 feat(gateway): add ignored_threads config for Telegram 2026-04-14 01:40:32 -07:00
Teknium 1acf81fdf5 docs: add QQBot to all 14 docs pages (full platform parity)
- sidebars.ts: sidebar navigation entry
- webhooks.md: deliver field routing table
- configuration.md: platform keys list
- sessions.md: platform identifiers table
- features/cron.md: delivery target table
- developer-guide/architecture.md: adapter listing
- developer-guide/cron-internals.md: delivery target table
- developer-guide/gateway-internals.md: file tree listing
- guides/cron-troubleshooting.md: supported platforms list
- integrations/index.md: platform links list
- reference/toolsets-reference.md: toolset table

(qqbot.md, environment-variables.md, and messaging/index.md were
already included in the contributor's original PR)
2026-04-14 00:11:49 -07:00
Teknium 8d545da3ff fix: add platform lock, send retry, message splitting, REST one-shot, shared strip_markdown
Improvements from our earlier #8269 salvage work applied to #7616:

- Platform token lock: acquire_scoped_lock/release_scoped_lock prevents
  two profiles from double-connecting the same QQ bot simultaneously
- Send retry with exponential backoff (3 attempts, 1s/2s/4s) with
  permanent vs transient error classification (matches Telegram pattern)
- Proper long-message splitting via truncate_message() instead of
  hard-truncating at MAX_MESSAGE_LENGTH (preserves code blocks, adds 1/N)
- REST-based one-shot send in send_message_tool — uses QQ Bot REST API
  directly with httpx instead of creating a full WebSocket adapter per
  message (fixes the connect→send race condition)
- Use shared strip_markdown() from helpers.py instead of 15 lines of
  inline regex with import-inside-method (DRY, same as BlueBubbles/SMS)
- format_message() now wired into send() pipeline
2026-04-14 00:11:49 -07:00
Teknium 4654f75627 fix: QQBot missing integration points, timestamp parsing, test fix
- Add Platform.QQBOT to _UPDATE_ALLOWED_PLATFORMS (enables /update command)
- Add 'qqbot' to webhook cross-platform delivery routing
- Add 'qqbot' to hermes dump platform detection
- Fix test_name_property casing: 'QQBot' not 'QQBOT'
- Add _parse_qq_timestamp() for ISO 8601 + integer ms compatibility
  (QQ API changed timestamp format — from PR #2411 finding)
- Wire timestamp parsing into all 4 message handlers
2026-04-14 00:11:49 -07:00
walli 884cd920d4 feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions
- Rename platform from 'qq' to 'qqbot' across all integration points
  (Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py)
- Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown)
- Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ
  (prevents duplicate messages from non-editable partial + final sends)
- Add _send_qqbot() standalone send function for cron/send_message tool
- Add interactive _setup_qq() wizard in hermes_cli/setup.py
- Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback
  functions that were lost during the original merge
2026-04-14 00:11:49 -07:00
Junjun Zhang 87bfc28e70 feat: add QQ Bot platform adapter (Official API v2)
Add full QQ Bot integration via the Official QQ Bot API (v2):
- WebSocket gateway for inbound events (C2C, group, guild, DM)
- REST API for outbound text/markdown/media messages
- Voice transcription (Tencent ASR + configurable STT provider)
- Attachment processing (images, voice, files)
- User authorization (allowlist + allow-all + DM pairing)

Integration points:
- gateway: Platform.QQ enum, adapter factory, allowlist maps
- CLI: setup wizard, gateway config, status display, tools config
- tools: send_message cross-platform routing, toolsets
- cron: delivery platform support
- docs: QQ Bot setup guide
2026-04-14 00:11:49 -07:00
Teknium eb44abd6b1 feat: improve file search UX — fuzzy @ completions, mtime sorting, better suggestions (#9467)
Three improvements to file search based on user feedback:

1. Fuzzy @ completions (commands.py):
   - Bare @query now does project-wide fuzzy file search instead of
     prefix-only directory listing
   - Uses rg --files with 5-second cache for responsive completions
   - Scoring: exact name (100) > prefix (80) > substring (60) >
     path contains (40) > subsequence with boundary bonus (35/25)
   - Bare @ with no query shows recently modified files first

2. Mtime-sorted file search (file_operations.py):
   - _search_files_rg now uses --sortr=modified (rg 13+) to surface
     recently edited files first
   - Falls back to unsorted on older rg versions

3. Improved file-not-found suggestions (file_operations.py):
   - Replaced crude character-set overlap with ranked scoring:
     same basename (90) > prefix (70) > substring (60) >
     reverse substring (40) > same extension (30)
   - search_files path-not-found now suggests similar directories
     from the parent
2026-04-13 23:54:45 -07:00
Greer Guthrie c7e2fe655a fix: make tool registry reads thread-safe 2026-04-13 23:52:32 -07:00
Teknium 6dc8f8e9c0 feat(skin): add warm-lightmode skin from PR #4811
Add a second light-mode skin option with warm brown/parchment tones,
adapted from ygd58's contribution in PR #4811. Includes completion
menu and status bar color keys for full light-terminal support.

Co-authored-by: buray <78954051+ygd58@users.noreply.github.com>
2026-04-13 23:51:21 -07:00
Liu Chongwei bc93641c4f feat(skins): add built-in daylight skin 2026-04-13 23:51:21 -07:00
Ben Barclay 9ffc26bc8f docs: update docker version check command
Replace `docker exec hermes hermes version` with
`docker run -it --rm nousresearch/hermes-agent:latest version`
2026-04-14 06:37:50 +00:00
Teknium a2ea237db2 feat: add internationalization (i18n) to web dashboard — English + Chinese (#9453)
Add a lightweight i18n system to the web dashboard with English (default) and
Chinese language support. A language switcher with flag icons is placed in the
header bar, allowing users to toggle between languages. The choice persists
to localStorage.

Implementation:
- src/i18n/ — types, translation files (en.ts, zh.ts), React context + hook
- LanguageSwitcher component shows the *other* language's flag as the toggle
- I18nProvider wraps the app in main.tsx
- All 8 pages + OAuth components updated to use t() translation calls
- Zero new dependencies — pure React context + localStorage
2026-04-13 23:19:13 -07:00
Teknium 19199cd38d fix: clamp 'minimal' reasoning effort to 'low' on Responses API (#9429)
GPT-5.4 supports none/low/medium/high/xhigh but not 'minimal'.
Users may configure 'minimal' via OpenRouter conventions, which would
cause a 400 on native OpenAI. Clamp to 'low' in the codex_responses
path before sending.
2026-04-13 23:11:13 -07:00
Teknium 38ad158b6b fix: auto-correct close model name matches in /model validation (#9424)
* feat(skills): add fitness-nutrition skill to optional-skills

Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'

* fix: auto-correct close model name matches in /model validation

When a user types a model name with a minor typo (e.g. gpt5.3-codex instead
of gpt-5.3-codex), the validation now auto-corrects to the closest match
instead of accepting the wrong name with a warning.

Uses difflib get_close_matches with cutoff=0.9 to avoid false corrections
(e.g. gpt-5.3 should not silently become gpt-5.4). Applied consistently
across all three validation paths: codex provider, custom endpoints, and
generic API-probed providers.

The validate_requested_model() return dict gains an optional corrected_model
key that switch_model() applies before building the result.

Reported by Discord user — /model gpt5.3-codex was accepted with a warning
but would fail at the API level.

---------

Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>
2026-04-13 23:09:39 -07:00
Teknium 35424f8fc1 chore: add bennytimz to AUTHOR_MAP 2026-04-13 23:03:08 -07:00
oluwadareab12 a91b9bb855 feat(skills): add drug-discovery optional skill — ChEMBL, PubChem, OpenFDA, ADMET analysis
Pharmaceutical research skill covering bioactive compound search (ChEMBL),
drug-likeness screening (Lipinski Ro5 + Veber via PubChem), drug-drug
interaction lookups (OpenFDA), gene-disease associations (OpenTargets
GraphQL), and ADMET reasoning guidance. All free public APIs, zero auth,
stdlib-only Python. Includes helper scripts for batch Ro5 screening and
target-to-compound pipelines.

Moved to optional-skills/research/ (niche domain skill, not built-in).
Fixed: authors→author frontmatter, removed unused jq prerequisite,
bare except→except Exception.

Co-authored-by: bennytimz <oluwadareab12@gmail.com>
Salvaged from PR #8695.
2026-04-13 23:03:08 -07:00
Teknium d631431872 feat: prompt for display name when adding custom providers (#9420)
During custom endpoint setup, users are now asked for a display name
with the auto-generated name as the default. Typing 'Ollama' or
'LM Studio' replaces the generic 'Local (localhost:11434)' in the
provider menu.

Extracts _auto_provider_name() for reuse and adds a name= parameter
to _save_custom_provider() so the caller can pass through the
user-chosen label.
2026-04-13 22:41:00 -07:00
Kenny Xie cdd44817f2 fix(anthropic): send fast mode speed via extra_body 2026-04-13 22:32:39 -07:00
Teknium 110892ff69 docs: move Xiaomi MiMo up in README provider list 2026-04-13 22:30:44 -07:00
Teknium 3de2b98503 fix(streaming): filter <think> blocks from gateway stream consumer
Models like MiniMax emit inline <think>...</think> reasoning blocks in
their content field. The CLI already suppresses these via a state machine
in _stream_delta, but the gateway's GatewayStreamConsumer had no
equivalent filtering — raw think blocks were streamed directly to
Discord/Telegram/Slack.

The fix adds a _filter_and_accumulate() method that mirrors the CLI's
approach: a state machine tracks whether we're inside a reasoning block
and silently discards the content. Includes the same block-boundary
check (tag must appear at line start or after whitespace-only prefix)
to avoid false positives when models mention <think> in prose.

Handles all tag variants: <think>, <thinking>, <THINKING>, <thought>,
<reasoning>, <REASONING_SCRATCHPAD>.

Also handles edge cases:
- Tags split across streaming deltas (partial tag buffering)
- Unclosed blocks (content suppressed until stream ends)
- Multiple consecutive blocks
- _flush_think_buffer on stream end for held-back partial tags

Adds 22 unit tests + 1 integration test covering all scenarios.
2026-04-13 22:16:20 -07:00
helix4u e08590888a fix: honor interrupts during MCP tool waits 2026-04-13 22:14:55 -07:00
Teknium 69d619cf89 docs: add Hugging Face and Xiaomi MiMo to README provider list (#9406)
* feat(skills): add fitness-nutrition skill to optional-skills

Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'

* docs: add Hugging Face and Xiaomi MiMo to README provider list

---------

Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>
2026-04-13 22:12:46 -07:00
haileymarshall f0b353bade feat(skills): add fitness-nutrition skill to optional-skills
Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'
2026-04-13 22:10:00 -07:00
Teknium 62fb6b2cd8 fix: guard zero context length display + add 19 tests for model info
- ModelInfoCard: hide card when effective_context_length <= 0 instead
  of showing 'Context Window: 0 auto-detected'
- Add tests for _normalize_config_for_web model_context_length extraction
- Add tests for _denormalize_config_from_web round-trip (write back,
  remove on zero, upgrade bare string to dict, coerce string input)
- Add tests for CONFIG_SCHEMA ordering (model_context_length after model)
- Add tests for GET /api/model/info endpoint (dict config, bare string,
  empty model, capabilities, graceful error handling)
2026-04-13 22:04:35 -07:00
kshitijk4poor 8fd3093f49 feat(web): add context window support to dashboard config
- Add GET /api/model/info endpoint that resolves model metadata using the
  same 10-step context-length detection chain the agent uses. Returns
  auto-detected context length, config override, effective value, and
  model capabilities (tools, vision, reasoning, max output, model family).

- Surface model.context_length as model_context_length virtual field in
  the config normalize/denormalize cycle. 0 = auto-detect (default),
  positive value overrides. Writing 0 removes context_length from the
  model dict on disk.

- Add ModelInfoCard component showing resolved context window (e.g. '1M
  auto-detected' or '500K override — auto: 1M'), max output tokens, and
  colored capability badges (Tools, Vision, Reasoning, model family).

- Inject ModelInfoCard between model field and context_length override in
  ConfigPage General tab. Card re-fetches on model change and after save.

- Insert model_context_length right after model in CONFIG_SCHEMA ordering
  so the three elements (model input → info card → override) are adjacent.
2026-04-13 22:04:35 -07:00
Gianfranco Piana eabc0a2f66 feat(plugins): let pre_tool_call hooks block tool execution
Plugins can now return {"action": "block", "message": "reason"} from
their pre_tool_call hook to prevent a tool from executing. The error
message is returned to the model as a tool result so it can adjust.

Covers both execution paths: handle_function_call (model_tools.py) and
agent-level tools (run_agent.py _invoke_tool + sequential/concurrent).
Blocked tools skip all side effects (counter resets, checkpoints,
callbacks, read-loop tracker).

Adds skip_pre_tool_call_hook flag to avoid double-firing the hook when
run_agent.py already checked and then calls handle_function_call.

Salvaged from PR #5385 (gianfrancopiana) and PR #4610 (oredsecurity).
2026-04-13 22:01:49 -07:00
Austin Pickett ea74f61d98 Merge pull request #9370 from NousResearch/fix/dashboard-routing
feat: react-router, sidebar layout, sticky header, dropdown component…
2026-04-13 21:23:48 -07:00
Teknium 943c01536f feat: add openrouter/elephant-alpha to curated model lists (#9378)
* Add hermes debug share instructions to all issue templates

- bug_report.yml: Add required Debug Report section with hermes debug share
  and /debug instructions, make OS/Python/Hermes version optional (covered
  by debug report), demote old logs field to optional supplementary
- setup_help.yml: Replace hermes doctor reference with hermes debug share,
  add Debug Report section with fallback chain (debug share -> --local -> doctor)
- feature_request.yml: Add optional Debug Report section for environment context

All templates now guide users to run hermes debug share (or /debug in chat)
and paste the resulting paste.rs links, giving maintainers system info,
config, and recent logs in one step.

* feat: add openrouter/elephant-alpha to curated model lists

- Add to OPENROUTER_MODELS (free, positioned above GPT models)
- Add to _PROVIDER_MODELS["nous"] mirror list
- Add 256K context window fallback in model_metadata.py
2026-04-13 21:16:14 -07:00
Teknium dd86deef13 feat(ci): add contributor attribution check on PRs (#9376)
Adds a CI workflow that blocks PRs introducing commits with
unmapped author emails. Checks each new commit's author email
against AUTHOR_MAP in scripts/release.py — GitHub noreply emails
auto-pass, but personal/work emails must be mapped.

Also adds --strict and --diff-base flags to contributor_audit.py
for programmatic use. --strict exits 1 when new unmapped emails
are found; --diff-base scopes the check to only flag emails from
commits after a given ref (grandfathers existing unknowns).

Prevention for the 97-unmapped-email gap found in the April 2026
contributor audit.
2026-04-13 21:13:08 -07:00
Teknium 5719c1f391 fix: add 75 contributor email→username mappings + .mailmap (#9358)
Audit of all external contributor PRs revealed 97 commit emails
not mapped in AUTHOR_MAP, meaning contributors weren't properly
credited in release notes. Cross-referenced via:
- GitHub API email search (9 resolved before rate limit)
- Salvage PR body mentions (@username in descriptions)
- Git noreply email cross-reference (same person, both emails)
- GH contributor list username matching

Also adds .mailmap for git shortlog/log display consistency.

Remaining 22 unmapped emails need GH API resolution when rate
limit resets — the contributor_audit.py script will flag them.

Addresses ColourfulWhite's report about missing contributor tags.
2026-04-13 21:10:39 -07:00
Austin Pickett bc3844c907 feat: react-router, sidebar layout, sticky header, dropdown component, remove emojis, rounded corners 2026-04-14 00:01:18 -04:00
Teknium 5621fc449a chore: rename AI Gateway → Vercel AI Gateway, move Xiaomi to #5 (#9326)
- Rename 'AI Gateway' to 'Vercel AI Gateway' across auth, models,
  doctor, setup, and tests.
- Move Xiaomi MiMo to position #5 in the provider picker.
2026-04-13 19:51:54 -07:00
Teknium 0cc7f79016 fix(streaming): prevent duplicate Telegram replies when stream task is cancelled (#9319)
When the 5-second stream_task timeout in gateway/run.py expires (due to
slow Telegram API calls from rate limiting after several messages), the
stream consumer is cancelled via asyncio.CancelledError. The
CancelledError handler did a best-effort final edit but never set
final_response_sent, so the gateway fell through to the normal send path
and delivered the full response again as a reply — causing a duplicate.

The fix: in the CancelledError handler, set final_response_sent = True
when already_sent is True (i.e., the stream consumer had already
delivered content to the user). This tells the gateway's already_sent
check that the response was delivered, preventing the duplicate send.

Adds two tests verifying the cancellation behavior:
- Cancelled with already_sent=True → final_response_sent=True (no dup)
- Cancelled with already_sent=False → final_response_sent=False (normal
  send path proceeds)

Reported by community user hume on Discord.
2026-04-13 19:22:43 -07:00
Teknium d15efc9c1b fix: correct GPT-5 family context lengths in fallback defaults (#9309)
The generic 'gpt-5' fallback was set to 128,000 — which is the max
OUTPUT tokens, not the context window. GPT-5 base and most variants
(codex, mini) have 400,000 context. This caused /model to report
128k for models like gpt-5.3-codex when models.dev was unavailable.

Added specific entries for GPT-5 variants with different context sizes:
- gpt-5.4, gpt-5.4-pro: 1,050,000 (1.05M)
- gpt-5.4-mini, gpt-5.4-nano: 400,000
- gpt-5.3-codex-spark: 128,000 (reduced)
- gpt-5.1-chat: 128,000 (chat variant)
- gpt-5 (catch-all): 400,000

Sources: https://developers.openai.com/api/docs/models
2026-04-13 19:22:23 -07:00
Teknium f6626fccee refactor: remove provider tier system — flat picker in hermes model (#9303)
Remove the two-tier (top/extended) provider picker that hid most
providers behind a 'More providers...' submenu. All providers now
appear in a single flat list.

- Remove tier field from ProviderEntry namedtuple
- Remove tier values from all CANONICAL_PROVIDERS entries
- Flatten the hermes model picker (no more 'More...' submenu)
- Move 'Custom endpoint' to the bottom of the main list
2026-04-13 18:51:13 -07:00
Teknium f324222b79 fix: add vLLM/local server error patterns + MCP initial connection retry (#9281)
Port two improvements inspired by Kilo-Org/kilocode analysis:

1. Error classifier: add context overflow patterns for vLLM, Ollama,
   and llama.cpp/llama-server. These local inference servers return
   different error formats than cloud providers (e.g., 'exceeds the
   max_model_len', 'context length exceeded', 'slot context'). Without
   these patterns, context overflow errors from local servers are
   misclassified as format errors, causing infinite retries instead
   of triggering compression.

2. MCP initial connection retry: previously, if the very first
   connection attempt to an MCP server failed (e.g., transient DNS
   blip at startup), the server was permanently marked as failed with
   no retry. Post-connect reconnection had 5 retries with exponential
   backoff, but initial connection had zero. Now initial connections
   retry up to 3 times with backoff before giving up, matching the
   resilience of post-connect reconnection.
   (Inspired by Kilo Code's MCP server disappearing fix in v1.3.3)

Tests: 6 new error classifier tests, 4 new MCP retry tests, 1
updated existing test. All 276 affected tests pass.
2026-04-13 18:46:14 -07:00
arthurbr11 0a4cf5b3e1 feat(providers): add Arcee AI as direct API provider
Adds Arcee AI as a standard direct provider (ARCEEAI_API_KEY) with
Trinity models: trinity-large-thinking, trinity-large-preview, trinity-mini.

Standard OpenAI-compatible provider checklist: auth.py, config.py,
models.py, main.py, providers.py, doctor.py, model_normalize.py,
model_metadata.py, setup.py, trajectory_compressor.py.

Based on PR #9274 by arthurbr11, simplified to a standard direct
provider without dual-endpoint OpenRouter routing.
2026-04-13 18:40:06 -07:00
Agent 78fa758451 feat(web): make Web UI responsive for mobile
- Nav: icons only on mobile, icon+label on sm+
- Brand: abbreviated "H A" on mobile, full "Hermes Agent" on sm+
- Content: reduced padding on mobile (px-3 vs px-6)
- StatusPage: session cards stack vertically on mobile, truncate
  overflow text, strip model namespace for brevity
- ConfigPage: sidebar becomes horizontal scrollable pills on mobile
  instead of fixed left column, search hidden on mobile
- SessionsPage: title + search stack vertically on mobile, search
  goes full-width
- Card component: add overflow-hidden to prevent content bleed
- Body/root: add overflow-x-hidden to prevent horizontal scroll
- Footer: reduced font sizes on mobile

All changes use Tailwind responsive breakpoints (sm: prefix).
No logic changes — purely layout/CSS adjustments.
2026-04-13 17:16:28 -07:00
Teknium ac80bd61ad test: add regression tests for custom_providers multi-model dedup and grouping
Tests for salvaged PRs #9233 and #8011.
2026-04-13 16:41:30 -07:00
Ubuntu ec9bf9e378 feat(model-picker): group custom_providers by name into a single row per provider
The /model picker currently renders one row per ``custom_providers``
entry. When several entries share the same provider name (e.g. four
``ollama-cloud`` entries for ``qwen3-coder``, ``glm-5.1``, ``kimi-k2``,
``minimax-m2.7``), users see four separate "Ollama Cloud" rows in the
picker, which is confusing UX — there is only one Ollama Cloud
provider, so there should be one row containing four models.

This PR groups ``custom_providers`` entries that share the same provider
name into a single picker row while keeping entries with distinct names
as separate rows. So:

* Four entries named ``Ollama Cloud`` → one "Ollama Cloud" row with
  four models inside.
* One entry named ``Ollama Cloud`` and one named ``Moonshot`` → two
  separate rows, one model each.

Implementation
--------------
Replaces the single-pass loop in ``list_authenticated_providers()`` with
a two-pass approach:

1. First pass: build an ``OrderedDict`` keyed by ``custom_provider_slug(name)``,
   accumulating ``models`` per group while preserving discovery order.
2. Second pass: iterate the groups and append one result row per group,
   skipping any slug that already appeared in an earlier provider source
   (the existing ``seen_slugs`` guard).

Insertion order is preserved via ``OrderedDict``, so providers and
their models still appear in the order the user listed them in
``custom_providers``. No new dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:41:30 -07:00
akhater 01f71007d0 fix(config): include model field in custom_providers dedup key
get_compatible_custom_providers() deduplicates by (name, base_url) which
collapses multiple models under the same provider into a single entry.
For example, 7 Ollama Cloud entries with different models become 1.
Adding model to the tuple preserves all entries.
2026-04-13 16:41:30 -07:00
Teknium 32cea0c08d fix: dashboard shows Nous Portal as 'not connected' despite active auth (#9261)
The dashboard device-code flow (_nous_poller in web_server.py) saved
credentials to the credential pool only, while get_nous_auth_status()
only checked the auth store (auth.json). This caused the Keys tab to
show 'not connected' even when the backend was fully authenticated.

Two fixes:
1. get_nous_auth_status() now checks the credential pool first (like
   get_codex_auth_status() already does), then falls back to the auth
   store.
2. _nous_poller now also persists to the auth store after saving to
   the credential pool, matching the CLI flow (_login_nous).

Adds 3 tests covering pool-only, auth-store-fallback, and empty-state
scenarios.
2026-04-13 16:32:11 -07:00
Teknium 8d023e43ed refactor: remove dead code — 1,784 lines across 77 files (#9180)
Deep scan with vulture, pyflakes, and manual cross-referencing identified:
- 41 dead functions/methods (zero callers in production)
- 7 production-dead functions (only test callers, tests deleted)
- 5 dead constants/variables
- ~35 unused imports across agent/, hermes_cli/, tools/, gateway/

Categories of dead code removed:
- Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection,
  rebuild_lookups, clear_session_context, get_logs_dir, clear_session
- Unused API surface: search_models_dev, get_pricing, skills_categories,
  get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list
- Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob
- Stale debug helpers: get_debug_session_info copies in 4 tool files
  (centralized version in debug_helpers.py already exists)
- Dead gateway methods: send_emote, send_notice (matrix), send_reaction
  (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history
  (matrix), _start_typing_indicator (signal), parse_feishu_post_content
- Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION,
  FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR
- Unused UI code: _interactive_provider_selection,
  _interactive_model_selection (superseded by prompt_toolkit picker)

Test suite verified: 609 tests covering affected files all pass.
Tests for removed functions deleted. Tests using removed utilities
(clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.
2026-04-13 16:32:04 -07:00
Teknium a66fc1365d fix: add files:read to SLACK_BOT_TOKEN description in config.py
Missed in the original PR — the env var description also lists required scopes.
2026-04-13 16:31:38 -07:00
helix4u 448b8bfb7c docs: add slack files:read scope 2026-04-13 16:31:38 -07:00
Teknium def8b959b8 fix: add contributor audit script + fix missed contributors (#9264)
Three problems fixed:

1. bobashopcashier missing from v0.9.0 contributor list despite
   authoring the gateway drain PR (#7290, salvaged into #7503).
   Their email (kennyx102@gmail.com) was missing from AUTHOR_MAP.

2. release.py only scanned git commit authors, missing Co-authored-by
   trailers. Now parse_coauthors() extracts trailers from commit bodies.

3. No mechanism to detect contributors from salvaged PRs (where original
   author only appears in PR description, not git log).

Changes:
- scripts/release.py: add kennyx102@gmail.com to AUTHOR_MAP, enhance
  get_commits() to parse Co-authored-by trailers, filter AI assistants
  (Claude, Copilot, Cursor Agent) from co-author lists
- scripts/contributor_audit.py: new script that cross-references git
  authors, co-author trailers, and salvaged PR descriptions. Reports
  unknown emails and contributors missing from release notes.
- RELEASE_v0.9.0.md: add bobashopcashier to community contributors

Usage:
  python scripts/contributor_audit.py --since-tag v2026.4.8
  python scripts/contributor_audit.py --since-tag v2026.4.8 --release-file RELEASE_v0.9.0.md
2026-04-13 16:31:27 -07:00
helix4u f94f53cc22 fix(matrix): disable streaming cursor decoration on Matrix 2026-04-13 16:31:02 -07:00
helix4u 0ffb6f2dae fix(matrix): skip cursor-only stream placeholder messages 2026-04-13 16:31:02 -07:00
Teknium b27eaaa4db fix: improve ACP type check and restore comment accuracy
- Use isinstance() with try/except import for CopilotACPClient check
  in _to_async_client instead of fragile __class__.__name__ string check
- Restore accurate comment: GPT-5.x models *require* (not 'often require')
  the Responses API on OpenAI/OpenRouter; ACP is the exception, not a
  softening of the requirement
- Add inline comment explaining the ACP exclusion rationale
2026-04-13 16:17:43 -07:00
helix4u 8680f61f8b fix(copilot-acp): keep acp runtime off responses path 2026-04-13 16:17:43 -07:00
Teknium 063244bb16 test: add coverage for plugin context engine init (#9071)
Verify that plugin context engines receive update_model() with correct
context_length during AIAgent init — regression test for the ctx -- bug.
2026-04-13 15:00:57 -07:00
Stephen Schoettler c763ed5801 fix(agent): resolve context_length for plugin context engines
Plugin context engines loaded via load_context_engine() were never
given context_length, causing the CLI status bar to show "ctx --"
with an empty progress bar. Call update_model() immediately after
loading the plugin engine, mirroring what switch_model() already does.

Fixes NousResearch/hermes-agent#9071

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 15:00:57 -07:00
Teknium 204e9190c4 fix: consolidate provider lists into single CANONICAL_PROVIDERS source of truth (#9237)
Three separate hardcoded provider lists (/model, /provider, hermes model)
diverged over time, causing providers to be missing from some commands.

- Create CANONICAL_PROVIDERS in hermes_cli/models.py as the single source
  of truth for all provider identity, labels, and TUI ordering
- Derive _PROVIDER_LABELS and list_available_providers() from canonical list
- Add step 2b in list_authenticated_providers() to cross-check canonical
  list — catches providers with credentials that weren't found via
  PROVIDER_TO_MODELS_DEV or HERMES_OVERLAYS mappings
- Derive hermes model TUI provider menus from canonical list
- Add deepseek and xai as first-class providers (were missing from TUI)
- Add grok/x-ai/x.ai aliases for xai provider

Fixes: /model command not showing all providers that hermes model shows
2026-04-13 14:59:50 -07:00
Teknium 952a885fbf fix(gateway): /stop no longer resets the session (#9224)
/stop was calling suspend_session() which marked the session for auto-reset
on the next message. This meant users lost their conversation history every
time they stopped a running agent — especially painful for untitled sessions
that can't be resumed by name.

Now /stop just interrupts the agent and cleans the session lock. The session
stays intact so users can continue the conversation.

The suspend behavior was introduced in #7536 to break stuck session resume
loops on gateway restart. That case is already handled by
suspend_recently_active() which runs at gateway startup, so removing it from
/stop doesn't regress the original fix.
2026-04-13 14:59:05 -07:00
SHL0MS d5fd74cac2 fix(ci): don't fail supply chain scan when PR comment can't be posted on fork PRs (#6681)
The GITHUB_TOKEN for fork PRs is read-only — gh pr comment fails with
'Resource not accessible by integration'. This caused the supply chain
scan to show a red X on every fork PR even when no findings were detected.

The scan itself still runs and the 'Fail on critical findings' step
still exits 1 on real issues. Only the comment posting is gracefully
skipped for fork PRs.

Closes #6679

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-04-13 13:58:59 -07:00
Teknium a6f07a6c37 docs: fix hermes web → hermes dashboard in web-dashboard.md (#9207)
The actual CLI command is 'hermes dashboard', not 'hermes web'.
cli-commands.md already had the correct name.
2026-04-13 13:26:21 -07:00
Sabin Iacob a27b3c8725 add git to the container installed packages (fixes #8439) 2026-04-13 13:08:19 -07:00
Teknium 1af2e18d40 chore: release v0.9.0 (v2026.4.13) (#9182)
The everywhere release — Hermes goes mobile with Termux/Android, adds
iMessage and WeChat, ships Fast Mode for OpenAI and Anthropic,
introduces background process monitoring, launches a local web
dashboard, and delivers the deepest security hardening pass yet
across 16 supported platforms.

487 commits, 269 merged PRs, 167 resolved issues, 24 contributors.
2026-04-13 11:52:09 -07:00
Teknium 0e60a9dc25 fix: add kimi-coding-cn to remaining provider touchpoints
Follow-up for salvaged PR #7637. Adds kimi-coding-cn to:
- model_normalize.py (prefix strip)
- providers.py (models.dev mapping)
- runtime_provider.py (credential resolution)
- setup.py (model list + setup label)
- doctor.py (health check)
- trajectory_compressor.py (URL detection)
- models_dev.py (registry mapping)
- integrations/providers.md (docs)
2026-04-13 11:20:37 -07:00
hcshen0111 2b3aa36242 feat(providers): add kimi-coding-cn provider for mainland China users
Cherry-picked from PR #7637 by hcshen0111.
Adds kimi-coding-cn provider with dedicated KIMI_CN_API_KEY env var
and api.moonshot.cn/v1 endpoint for China-region Moonshot users.
2026-04-13 11:20:37 -07:00
Teknium ef180880aa fix: guard anthropic_adapter import + use canonical authorize URL
- Wrap module-level import from agent.anthropic_adapter in try/except
  so hermes web still starts if the adapter is unavailable; Phase 2
  PKCE endpoints return 501 in that case.
- Change authorize URL from console.anthropic.com to claude.ai to
  match the canonical adapter code.
2026-04-13 11:18:18 -07:00
kshitijk4poor 247929b0dd feat: dashboard OAuth provider management
Add OAuth provider management to the Hermes dashboard with full
lifecycle support for Anthropic (PKCE), Nous and OpenAI Codex
(device-code) flows.

## Backend (hermes_cli/web_server.py)

- 6 new API endpoints:
  GET /api/providers/oauth — list providers with connection status
  POST /api/providers/oauth/{id}/start — initiate PKCE or device-code
  POST /api/providers/oauth/{id}/submit — exchange PKCE auth code
  GET /api/providers/oauth/{id}/poll/{session} — poll device-code
  DELETE /api/providers/oauth/{id} — disconnect provider
  DELETE /api/providers/oauth/sessions/{id} — cancel pending session
- OAuth constants imported from anthropic_adapter (no duplication)
- Blocking I/O wrapped in run_in_executor for async safety
- In-memory session store with 15-minute TTL and automatic GC
- Auth token required on all mutating endpoints

## Frontend

- OAuthLoginModal — PKCE (paste auth code) and device-code (poll) flows
- OAuthProvidersCard — status, token preview, connect/disconnect actions
- Toast fix: createPortal to document.body for correct z-index
- App.tsx: skip animation key bump on initial mount (prevent double-mount)
- Integrated into the Env/Keys page
2026-04-13 11:18:18 -07:00
yongtenglei 2773b18b56 fix(run_agent): refresh activity during streaming responses
Previously, long-running streamed responses could be incorrectly treated
as idle by the gateway/cron inactivity timeout even while tokens were
actively arriving. The _touch_activity() call (which feeds
get_activity_summary() polled by the external timeout) was either called
only on the first chunk (chat completions) or not at all (Anthropic,
Codex, Codex fallback).

Add _touch_activity() on every chunk/event in all four streaming paths
so the inactivity monitor knows data is still flowing.

Fixes #8760
2026-04-13 10:55:51 -07:00
Teknium ba50fa3035 docs: fix 30+ inaccuracies across documentation (#9023)
Cross-referenced all docs pages against the actual codebase and fixed:

Reference docs (cli-commands.md, slash-commands.md, profile-commands.md):
- Fix: hermes web -> hermes dashboard (correct subparser name)
- Fix: Wrong provider list (removed deepseek, ai-gateway, opencode-zen,
  opencode-go, alibaba; added gemini)
- Fix: Missing tts in hermes setup section choices
- Add: Missing --image flag for hermes chat
- Add: Missing --component flag for hermes logs
- Add: Missing CLI commands: debug, backup, import
- Fix: /status incorrectly marked as messaging-only (available everywhere)
- Fix: /statusbar moved from Session to Configuration category
- Add: Missing slash commands: /fast, /snapshot, /image, /debug
- Add: Missing /restart from messaging commands table
- Fix: /compress description to match COMMAND_REGISTRY
- Add: --no-alias flag to profile create docs

Configuration docs (configuration.md, environment-variables.md):
- Fix: Vision timeout default 30s -> 120s
- Fix: TTS providers missing minimax and mistral
- Fix: STT providers missing mistral
- Fix: TTS openai base_url shown with wrong default
- Fix: Compression config showing stale summary_model/provider/base_url
  keys (migrated out in config v17) -> target_ratio/protect_last_n

Getting-started docs:
- Fix: Redundant faster-whisper install (already in voice extra)
- Fix: Messaging extra description missing Slack

Developer guide:
- Fix: architecture.md tool count 48 -> 47, toolset count 40 -> 19
- Fix: run_agent.py line count 9,200 -> 10,700
- Fix: cli.py line count 8,500 -> 10,000
- Fix: main.py line count 5,500 -> 6,000
- Fix: gateway/run.py line count 7,500 -> 9,000
- Fix: Browser tools count 11 -> 10
- Fix: Platform adapter count 15 -> 18 (add wecom_callback, api_server)
- Fix: agent-loop.md wrong budget sharing (not shared, independent)
- Fix: agent-loop.md non-existent _get_budget_warning() reference
- Fix: context-compression-and-caching.md non-existent function name
- Fix: toolsets-reference.md safe toolset includes mixture_of_agents (it doesn't)
- Fix: toolsets-reference.md hermes-cli tool count 38 -> 36

Guides:
- Fix: automate-with-cron.md claims daily at 9am is valid (it's not)
- Fix: delegation-patterns.md Max 3 presented as hard cap (configurable)
- Fix: sessions.md group thread key format (shared by default, not per-user)
- Fix: cron-internals.md job ID format and JSON structure
2026-04-13 10:53:10 -07:00
Teknium 4ca6668daf docs: comprehensive update for recent merged PRs (#9019)
Audit and update documentation across 12 files to match changes from
~50 recently merged PRs. Key updates:

Slash commands (slash-commands.md):
- Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart
- Fix /status incorrectly labeled as messaging-only (available in both)
- Add --global flag to /model docs
- Add [focus topic] arg to /compress docs

CLI commands (cli-commands.md):
- Add hermes debug share section with options and examples
- Add hermes backup section with --quick and --label flags
- Add hermes import section

Feature docs:
- TTS: document global tts.speed and per-provider speed for Edge/OpenAI
- Web dashboard: add docs for 5 missing pages (Sessions, Logs,
  Analytics, Cron, Skills) and 15+ API endpoints
- WhatsApp: add streaming, 4K chunking, and markdown formatting docs
- Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip
- Budget: document CLI notification on iteration budget exhaustion

Config migration (compression.summary_* → auxiliary.compression.*):
- Update configuration.md, environment-variables.md,
  fallback-providers.md, cli.md, and context-compression-and-caching.md
- Replace legacy compression.summary_model/provider/base_url references
  with auxiliary.compression.model/provider/base_url
- Add legacy migration info boxes explaining auto-migration

Minor fixes:
- wecom-callback.md: clarify 'text only' limitation (input only)
- Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
墨綠BG c449cd1af5 fix(config): restore custom providers after v11→v12 migration
The v11→v12 migration converts custom_providers (list) into providers
(dict), then deletes the list. But all runtime resolvers read from
custom_providers — after migration, named custom endpoints silently stop
resolving and fallback chains fail with AuthError.

Add get_compatible_custom_providers() that reads from both config schemas
(legacy custom_providers list + v12+ providers dict), normalizes entries,
deduplicates, and returns a unified list. Update ALL consumers:

- hermes_cli/runtime_provider.py: _get_named_custom_provider() + key_env
- hermes_cli/auth_commands.py: credential pool provider names
- hermes_cli/main.py: model picker + _model_flow_named_custom()
- agent/auxiliary_client.py: key_env + custom_entry model fallback
- agent/credential_pool.py: _iter_custom_providers()
- cli.py + gateway/run.py: /model switch custom_providers passthrough
- run_agent.py + gateway/run.py: per-model context_length lookup

Also: use config.pop() instead of del for safer migration, fix stale
_config_version assertions in tests, add pool mock to codex test.

Co-authored-by: 墨綠BG <s5460703@gmail.com>
Closes #8776, salvaged from PR #8814
2026-04-13 10:50:52 -07:00
Teknium 0dd26c9495 fix(tests): fix 78 CI test failures and remove dead test (#9036)
Production fixes:
- voice_mode.py: add is_recording property to AudioRecorder (parity with TermuxAudioRecorder)
- cronjob_tools.py: add sms example to deliver description

Test fixes:
- test_real_interrupt_subagent: add missing _execution_thread_id (fixes 19 cascading failures from leaked _build_system_prompt patch)
- test_anthropic_error_handling: add _FakeMessages, override _interruptible_streaming_api_call (6 fixes)
- test_ctx_halving_fix: add missing request_overrides attribute (4 fixes)
- test_context_token_tracking: set _disable_streaming=True for non-streaming test path (4 fixes)
- test_dict_tool_call_args: set _disable_streaming=True (1 fix)
- test_provider_parity: add model='gpt-4o' for AIGateway tests to meet 64K minimum context (4 fixes)
- test_session_race_guard: add user_id to SessionSource (5 fixes)
- test_restart_drain/helpers: add user_id to SessionSource (2 fixes)
- test_telegram_photo_interrupts: add user_id to SessionSource
- test_interrupt: target thread_id for per-thread interrupt system (2 fixes)
- test_zombie_process_cleanup: rewrite with object.__new__ for refactored GatewayRunner.stop() (1 fix)
- test_browser_camofox_state: update config version 15->17 (1 fix)
- test_trajectory_compressor_async: widen lookback window 10->20 for line-shifted AsyncOpenAI (1 fix)
- test_voice_mode: fixed by production is_recording addition (5 fixes)
- test_voice_cli_integration: add _attached_images to CLI stub (2 fixes)
- test_hermes_logging: explicit propagation/level reset for cross-test pollution defense (1 fix)
- test_run_agent: add base_url for OpenRouter detection tests (2 fixes)

Deleted:
- test_inline_think_blocks_reasoning_only_accepted: tested unimplemented inline <think> handling
2026-04-13 10:50:24 -07:00
kimsr96 b909a9efef fix: extend ASCII-locale UnicodeEncodeError recovery to full request payload
The existing ASCII codec handler only sanitized conversation messages,
leaving tool schemas, system prompts, ephemeral prompts, prefill messages,
and HTTP headers as unhandled sources of non-ASCII content. On systems
with LANG=C or non-UTF-8 locale, Unicode symbols in tool descriptions
(e.g. arrows, em-dashes from prompt_builder) and system prompt content
would cause UnicodeEncodeError that fell through to the error path.

Changes:
- Add _sanitize_structure_non_ascii() generic recursive walker for
  nested dict/list payloads
- Add _sanitize_tools_non_ascii() thin wrapper for tool schemas
- Add _force_ascii_payload flag: once ASCII locale is detected, all
  subsequent API calls get proactively sanitized (prevents recurring
  failures from new tool results bringing fresh Unicode each turn)
- Extend the ASCII codec error handler to sanitize: prefill_messages,
  tool schemas (self.tools), system prompt, ephemeral system prompt,
  and default HTTP headers
- Update stale comment that acknowledged the gap

Cherry-picked from PR #8834 (credential pool changes dropped as
separate concern).
2026-04-13 05:16:35 -07:00
Teknium 28a9c43f81 fix: resolve key_env to actual API key value instead of env var name
The cherry-picked code passed the env var NAME (e.g. 'MY_API_KEY') as the
api_key value. The caller's has_usable_secret() check would reject the
var name, so the actual key was never used. Now we os.getenv() the
key_env value to get the real API key before returning it.
2026-04-13 05:16:21 -07:00
Geoff 76eecf3819 fix(model): Support providers: dict for custom endpoints in /model
Two fixes for user-defined providers in config.yaml:

1. list_authenticated_providers() - now includes full models list from
   providers.*.models array, not just default_model. This fixes /model
   showing only one model when multiple are configured.

2. _get_named_custom_provider() - now checks providers: dict (new-style)
   in addition to custom_providers: list (legacy). This fixes credential
   resolution errors when switching models via /model command.

Both changes are backwards compatible with existing custom_providers list format.

Fixes: Only one model appears for custom providers in /model selection
2026-04-13 05:16:21 -07:00
konsisumer 311dac1971 fix(file_tools): block /private/etc writes on macOS symlink bypass
On macOS, /etc is a symlink to /private/etc, so os.path.realpath()
resolves /etc/hosts to /private/etc/hosts. The sensitive path check
only matched /etc/ prefixes against the resolved path, allowing
writes to system files on macOS.

- Add /private/etc/ and /private/var/ to _SENSITIVE_PATH_PREFIXES
- Check both realpath-resolved and normpath-normalized paths
- Add regression tests for macOS symlink bypass

Closes #8734
Co-authored-by: ElhamDevelopmentStudio (PR #8829)
2026-04-13 05:15:05 -07:00
Teknium 587eeb56b9 chore: remove duplicate dead _try_gh_cli_token / _gh_cli_candidates from auth.py
These functions were duplicated between auth.py and copilot_auth.py.
The auth.py copies had zero production callers — only copilot_auth.py's
versions are used. Redirect the test import to the live copy and update
monkeypatch targets accordingly.
2026-04-13 05:12:36 -07:00
HearthCore 2a9e50c104 fix(copilot): resolve GHE token poisoning when GITHUB_TOKEN is set
When GITHUB_TOKEN is present in the environment (e.g. for gh CLI or
GitHub Actions), two issues broke Copilot authentication against
GitHub Enterprise (GHE) instances:

1. The copilot provider had no base_url_env_var, so COPILOT_API_BASE_URL
   was silently ignored — requests always went to public GitHub.

2. `gh auth token` (the CLI fallback) treats GITHUB_TOKEN as an override
   and echoes it back instead of reading from its credential store
   (hosts.yml). This caused the same rejected token to be used even
   after env var priority correctly skipped it.

Fix:
- Add base_url_env_var="COPILOT_API_BASE_URL" to copilot ProviderConfig
- Strip GITHUB_TOKEN/GH_TOKEN from the subprocess env when calling
  `gh auth token` so it reads from hosts.yml
- Pass --hostname from COPILOT_GH_HOST when set so gh returns the
  GHE-specific OAuth token
2026-04-13 05:12:36 -07:00
luyao618 8ec1608642 fix(agent): propagate api_mode to vision provider resolution
resolve_vision_provider_client() computed resolved_api_mode from config
but never passed it to downstream resolve_provider_client() or
_get_cached_client() calls, causing custom providers with
api_mode: anthropic_messages to crash when used for vision tasks.

Also remove the for_vision special case in _normalize_aux_provider()
that incorrectly discarded named custom provider identifiers.

Fixes #8857

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 05:02:54 -07:00
Teknium e3ffe5b75f fix: remove legacy compression.summary_* config and env var fallbacks (#8992)
Remove the backward-compat code paths that read compression provider/model
settings from legacy config keys and env vars, which caused silent failures
when auto-detection resolved to incompatible backends.

What changed:
- Remove compression.summary_model, summary_provider, summary_base_url from
  DEFAULT_CONFIG and cli.py defaults
- Remove backward-compat block in _resolve_task_provider_model() that read
  from the legacy compression section
- Remove _get_auxiliary_provider() and _get_auxiliary_env_override() helper
  functions (AUXILIARY_*/CONTEXT_* env var readers)
- Remove env var fallback chain for per-task overrides
- Update hermes config show to read from auxiliary.compression
- Add config migration (v16→17) that moves non-empty legacy values to
  auxiliary.compression and strips the old keys
- Update example config and openclaw migration script
- Remove/update tests for deleted code paths

Compression model/provider is now configured exclusively via:
  auxiliary.compression.provider / auxiliary.compression.model

Closes #8923
2026-04-13 04:59:26 -07:00
WorldInnovationsDepartment c1809e85e7 fix(gateway): handle stale lock files in acquire_scoped_lock
Updated the acquire_scoped_lock function to treat empty or corrupt lock files as stale. This change ensures that if a lock file exists but is invalid, it will be removed to prevent issues with stale locks. Added tests to verify recovery from both empty and corrupt lock files.
2026-04-13 04:59:25 -07:00
Teknium 23f668d66e fix: extract Gemma 4 <thought> reasoning in _extract_reasoning() (#8991)
Add <thought>(.*?)</thought> to inline_patterns so Gemma 4
reasoning content is captured for /reasoning display, not just
stripped from visible output.


Closes #8891

Co-authored-by: RhushabhVaghela <rhushabhvaghela@users.noreply.github.com>
2026-04-13 04:59:06 -07:00
flobo3 d8a521092b fix(weixin): rename send_document parameter to match base class 2026-04-13 04:58:30 -07:00
Teknium a5bd56eae3 fix: eliminate provider hang dead zones in retry/timeout architecture (#8985)
Three targeted changes to close the gaps between retry layers that
caused users to experience 'No response from provider for 580s' and
'No activity for 15 minutes' despite having 5 layers of retry:

1. Remove non-streaming fallback from streaming path

   Previously, when all 3 stream retries exhausted, the code fell back
   to _interruptible_api_call() which had no stale detection and no
   activity tracking — a black hole that could hang for up to 1800s.
   Now errors propagate to the main retry loop which has richer recovery
   (credential rotation, provider fallback, backoff).

   For 'stream not supported' errors, sets _disable_streaming flag so
   the main retry loop automatically switches to non-streaming on the
   next attempt.

2. Add _touch_activity to recovery dead zones

   The gateway inactivity monitor relies on _touch_activity() to know
   the agent is alive, but activity was never touched during:
   - Stale stream detection/kill cycles (180-300s gaps)
   - Stream retry connection rebuilds
   - Main retry backoff sleeps (up to 120s)
   - Error recovery classification

   Now all these paths touch activity every ~30s, keeping the gateway
   informed during recovery cycles.

3. Add stale-call detector to non-streaming path

   _interruptible_api_call() now has the same stale detection pattern
   as the streaming path: kills hung connections after 300s (default,
   configurable via HERMES_API_CALL_STALE_TIMEOUT), scaled for large
   contexts (450s for 50K+ tokens, 600s for 100K+ tokens), disabled
   for local providers.

   Also touches activity every ~30s during the wait so the gateway
   monitor stays informed.

Env vars:
- HERMES_API_CALL_STALE_TIMEOUT: non-streaming stale timeout (default 300s)
- HERMES_STREAM_STALE_TIMEOUT: unchanged (default 180s)

Before: worst case ~2+ hours of sequential retries with no feedback
After: worst case bounded by gateway inactivity timeout (default 1800s)
with continuous activity reporting
2026-04-13 04:55:20 -07:00
Teknium acdff020b7 test: add multi-word query tests for truncation match strategy
Tests phrase matching, proximity co-occurrence, and sliding window
coverage maximisation — the three new tiers from the truncation fix.
2026-04-13 04:54:42 -07:00
Al Sayed Hoota a5bc698b9a fix(session_search): improve truncation to center on actual query matches
Three-tier match strategy for _truncate_around_matches():
1. Full-phrase search (exact query string positions)
2. Proximity co-occurrence (all terms within 200 chars)
3. Individual terms (fallback, preserves existing behavior)

Sliding window picks the start offset covering the most matches.

Moved inline import re to module level.

Co-authored-by: Al Sayed Hoota <78100282+AlsayedHoota@users.noreply.github.com>
2026-04-13 04:54:42 -07:00
landy dbed40f39b fix: reopen resumed gateway sessions in sqlite 2026-04-13 04:54:07 -07:00
flobo3 d945cf6b1a fix(docker): add .venv to .dockerignore 2026-04-13 04:52:00 -07:00
twilwa 3a64348772 fix(discord): voice session continuity and signal handler thread safety
- Store source metadata on /voice channel join so voice input shares the
  same session as the linked text channel conversation
- Treat voice-linked text channels as free-response (skip @mention and
  auto-thread) while voice is active
- Scope the voice-linked exemption to the exact bound channel, not
  sibling threads
- Guard signal handler registration in start_gateway() for non-main
  threads (prevents RuntimeError when gateway runs in a daemon thread)
- Clean up _voice_sources on leave_voice_channel

Salvaged from PR #3475 by twilwa (Modal runtime portions excluded).
2026-04-13 04:49:21 -07:00
Teknium 381810ad50 feat: fix SQLite safety in hermes backup + add --quick snapshots + /snapshot command (#8971)
Three changes consolidated into the existing backup system:

1. Fix: hermes backup now uses sqlite3.Connection.backup() for .db files
   instead of raw file copy. Raw copy of a WAL-mode database can produce
   a corrupted backup — the backup() API handles this correctly.

2. hermes backup --quick: fast snapshot of just critical state files
   (config.yaml, state.db, .env, auth.json, cron/jobs.json, etc.)
   stored in ~/.hermes/state-snapshots/. Auto-prunes to 20 snapshots.

3. /snapshot slash command (alias /snap): in-session interface for
   quick state snapshots. create/list/restore/prune subcommands.
   Restore by ID or number. Powered by the same backup module.

No new modules — everything lives in hermes_cli/backup.py alongside
the existing full backup/import code.

No hooks in run_agent.py — purely on-demand, zero runtime overhead.

Closes the use case from PRs #8406 and #7813 with ~200 lines of new
logic instead of a 1090-line content-addressed storage engine.
2026-04-13 04:46:13 -07:00
Richard Li 82901695ff feat(wecom): add platform hint for native media sending 2026-04-13 04:46:04 -07:00
Teknium 3365abdddf fix: use correct 'completed' state in status badge map, clean up blank lines
The cron backend uses 'completed' (not 'exhausted') when repeat count
is reached. Also removes extra blank lines from cherry-pick.
2026-04-13 04:45:29 -07:00
jonny 70f490a12a fix(web): CronPage crash when rendering schedule object
The cron API returns schedule as {kind, expr, display} object but
CronPage.tsx rendered it directly as a React child, crashing with
'Objects are not valid as a React child'.

- Update CronJob interface in api.ts to match actual API response
- Use schedule_display (string) instead of schedule (object)
- Use state instead of status for job state
- Use last_error instead of error for error display
2026-04-13 04:45:29 -07:00
Teknium 8dfee98d06 fix: clean up description escaping, add string-data tests
Follow-up for cherry-picked PR #8918.
2026-04-13 04:45:07 -07:00
dippwho bca22f3090 fix(homeassistant): #8912 resolve XML tool calling loop by casting nested object to JSON string 2026-04-13 04:45:07 -07:00
MaybeRichard 11e2e04667 fix(telegram): pass proxy URL explicitly to HTTPXRequest when proxy env vars are set
When HTTPS_PROXY / HTTP_PROXY / ALL_PROXY env vars are set (or macOS system proxy
is detected), pass the proxy URL explicitly via HTTPXRequest(proxy=proxy_url) instead
of relying on httpx's trust_env mechanism, which is unreliable for HTTP CONNECT
proxies (e.g. Clash / ClashMac in fake-ip mode).

Uses the shared resolve_proxy_url() from base.py (handles env vars + macOS system
proxy detection) instead of duplicating env var reading inline. Consolidates the
proxy_configured boolean into a single proxy_url = resolve_proxy_url() call that
serves as both the gate for skipping fallback-IP transport and the value passed
to HTTPXRequest.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
Salvaged from PR #8931 by MaybeRichard.
2026-04-13 04:45:05 -07:00
XiaoXiao0221 860489600a fix(cli): sanitize surrogate characters in handle_paste
Prevents UTF-8 encoding crash when pasting text from Word or Google Docs,
which may contain lone surrogate code points (U+D800-U+DFFF).
Reuses existing _sanitize_surrogates() from run_agent module.
2026-04-13 04:42:45 -07:00
Teknium 0998a57007 refactor: remove 5 dead utility functions from utils.py (#8975)
Remove read_json_file, read_jsonl, append_jsonl, env_str, env_lower —
all added in #7917 but never imported anywhere in the codebase. Also
remove unused List and Optional typing imports.

env_int, env_bool, and the other helpers that have real consumers are
kept.
2026-04-13 04:39:59 -07:00
Teknium cea34dc7ef fix: follow-up for salvaged PR #8939
- Move test file to tests/hermes_cli/ (consistent with test layout)
- Remove unused imports (os, pytest) from test file
- Update _sanitize_env_lines docstring: now used on read + write paths
2026-04-13 04:35:37 -07:00
Mil Wang (from Dev Box) e469f3f3db fix: sanitize .env before loading to prevent token duplication (#8908)
When .env files become corrupted (e.g. concatenated KEY=VALUE pairs on
a single line due to concurrent writes or encoding issues), both
python-dotenv and load_env() would parse the entire concatenated string
as a single value. This caused bot tokens to appear duplicated up to 8×,
triggering InvalidToken errors from the Telegram API.

Root cause: _sanitize_env_lines() — which correctly splits concatenated
lines — was only called during save_env_value() writes, not during reads.

Fix:
- load_env() now calls _sanitize_env_lines() before parsing
- env_loader.load_hermes_dotenv() sanitizes the .env file on disk
  before python-dotenv reads it, so os.getenv() also returns clean values
- Added tests reproducing the exact corruption pattern from #8908

Closes #8908
2026-04-13 04:35:37 -07:00
ismell0992-afk e77f135ed8 fix(cli): narrow Nous Hermes non-agentic warning to actual hermes-3/-4 models
The startup warning that Nous Research Hermes 3 & 4 models are not agentic
fired on any model whose name contained "hermes" anywhere, via a plain
substring check. That false-positived on unrelated local Modelfiles such
as `hermes-brain:qwen3-14b-ctx16k` — a tool-capable Qwen3 wrapper that
happens to live under a custom "hermes" tag namespace — making the warning
noise for legitimate setups.

Replace the substring check with a narrow regex anchored on `^`, `/`, or
`:` boundaries that only matches the real Hermes-3 / Hermes-4 chat family
(e.g. `NousResearch/Hermes-3-Llama-3.1-70B`, `hermes-4-405b`,
`openrouter/hermes3:70b`). Consolidate into a single helper
`is_nous_hermes_non_agentic()` in `hermes_cli.model_switch` so the CLI
and the canonical check don't drift, and route the duplicate inline site
in `cli.HermesCLI._print_warnings()` through the helper.

Add a parametrized test covering positive matches (real Hermes-3/-4
names) and a broad set of negatives (custom Modelfiles, Qwen/Claude/GPT,
older Nous-Hermes-2 families, bare "hermes", empty string, and the
"brain-hermes-3-impostor" boundary case).
2026-04-13 04:33:52 -07:00
ismell0992-afk 3e99964789 fix(agent): prefer Ollama Modelfile num_ctx over GGUF training max
_query_local_context_length was checking model_info.context_length
(the GGUF training max) before num_ctx (the Modelfile runtime override),
inverse to query_ollama_num_ctx. The two helpers therefore disagreed on
the same model:

  hermes-brain:qwen3-14b-ctx32k     # Modelfile: num_ctx 32768
  underlying qwen3:14b GGUF         # qwen3.context_length: 40960

query_ollama_num_ctx correctly returned 32768 (the value Ollama will
actually allocate KV cache for). _query_local_context_length returned
40960, which let ContextCompressor grow conversations past 32768 before
triggering compression — at which point Ollama silently truncated the
prefix, corrupting context.

Swap the order so num_ctx is checked first, matching query_ollama_num_ctx.
Adds a parametrized test that seeds both values and asserts num_ctx wins.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 04:24:07 -07:00
Teknium 39b83f3443 fix: remove sandbox language from tool descriptions
The terminal and execute_code tool schemas unconditionally mentioned
'cloud sandboxes' in their descriptions sent to the model. This caused
agents running on local backends to believe they were in a sandboxed
environment, refusing networking tasks and other operations. Worse,
agents sometimes saved this false belief to persistent memory, making
it persist across sessions.

Reported by multiple users (XLion, 林泽).
2026-04-13 04:23:27 -07:00
Teknium 67fece1176 feat(cli): show notification when iteration budget is reached
Displays a dim warning after the response panel when the agent hit
its max iterations, so the user knows the response may be incomplete.
2026-04-13 03:40:47 -07:00
Teknium 934318ba3a fix: budget-exhausted conversations now get a summary instead of empty response
The post-loop grace call mechanism was broken: it injected a user
message and set _budget_grace_call=True, but could never re-enter the
while loop (already exited).  Worse, the flag blocked the fallback
_handle_max_iterations from running, so final_response stayed None.

Users saw empty/no response when the agent hit max iterations.

Fix: remove the dead grace block and let _handle_max_iterations handle
it directly — it already injects a summary request and makes one extra
toolless API call.
2026-04-13 03:36:20 -07:00
Teknium 3804556cd9 fix: restore clarify toolset row removed in cherry-pick 2026-04-13 02:49:11 -07:00
Haoqing Wang 8e0ae66520 fix(skills): correct TTS/STT providers, add missing platforms/commands in hermes-agent skill
Fixes verified via 5-container parallel testing against v0.8.0 codebase.

Critical fixes:
- TTS providers: replace nonexistent kokoro/fish with actual minimax/mistral/neutts
- STT providers: add missing mistral (Voxtral Transcribe)
- Testing section: remove `source venv/bin/activate` (no venv dir in project)

Expanded coverage:
- Provider table: 13 → 22 entries (add Gemini, xAI, Xiaomi, Qwen OAuth, MiniMax CN, etc.)
- Platform list: add BlueBubbles (iMessage) and Weixin (WeChat), clarify Open WebUI
- Slash commands: add 14 undocumented commands (/approve, /deny, /branch, /fast, etc.)
- Toolsets: add 4 missing (messaging, search, todo, rl)
- Troubleshooting: expand from 6 to 10 sections with practical deployment fixes
  (Copilot OAuth 403, gateway linger, WSL2 systemd, Discord intents, etc.)

Minor fixes:
- agent/ directory description expanded
- delegation config keys completed
- /restart noted as gateway-only
- hermes honcho noted as plugin-dependent
2026-04-13 02:49:11 -07:00
Teknium 397eae5d93 fix: recover partial streamed content on connection failure
When streaming fails after partial content delivery (e.g. OpenRouter
timeout kills connection mid-response), the stub response now carries
the accumulated streamed text instead of content=None.

Two fixes:
1. The partial-stream stub response includes recovered content from
   _current_streamed_assistant_text — the text that was already
   delivered to the user via stream callbacks before the connection
   died.

2. The empty response recovery chain now checks for partial stream
   content BEFORE falling back to _last_content_with_tools (prior
   turn content) or wasting API calls on retries. This prevents:
   - Showing wrong content from a prior turn
   - Burning 3+ unnecessary retry API calls
   - Falling through to '(empty)' when the user already saw content

The root cause: OpenRouter has a ~125s inactivity timeout. When
Anthropic's SSE stream goes silent during extended reasoning, the
proxy kills the connection. The model's text was already partially
streamed but the stub discarded it, triggering the empty recovery
chain which would show stale prior-turn content or waste retries.
2026-04-13 02:12:01 -07:00
Teknium 35b11f48a5 docs: add web dashboard documentation (#8864)
- New docs page: user-guide/features/web-dashboard.md covering
  quick start, prerequisites, all three pages (Status, Config, API Keys),
  the /reload slash command, REST API endpoints, CORS config, and
  development workflow
- Added 'Management' category in sidebar for web-dashboard
- Added 'hermes web' to CLI commands reference with options table
- Added '/reload' to slash commands reference (both CLI and gateway tables)
2026-04-13 01:15:27 -07:00
Ubuntu 73ed09e145 fix(gateway): keep venv python symlink unresolved when remapping paths
_remap_path_for_user was calling .resolve() on the Python path, which
followed venv/bin/python into the base interpreter. On uv-managed venvs
this swaps the systemd ExecStart to a bare Python that has none of the
venv's site-packages, so the service crashes on first import. Classical
python -m venv installs were unaffected by accident: the resolved target
/usr/bin/python3.x lives outside $HOME so the path-remap branch was
skipped and the system Python's packages silently worked.

Remove .resolve() calls on both current_home and the path; use
.expanduser() for lexical tilde expansion only. The function does
lexical prefix substitution, which is all it needs to do for its
actual purpose (remapping /root/.hermes -> /home/<user>/.hermes when
installing system services as root for a different user).

Repro: on a uv-managed venv install, `sudo hermes gateway install
--system` writes ExecStart=.../uv/python/cpython-3.11.15-.../bin/python3.11
instead of .../hermes-agent/venv/bin/python, and the service crashes on
ModuleNotFoundError: yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 00:49:22 -07:00
Teknium 964ef681cf fix(gateway): improve /restart response with fallback instructions 2026-04-12 22:34:23 -07:00
Teknium 276d20e62c fix(gateway): /restart uses service restart under systemd instead of detached subprocess
The detached bash subprocess spawned by /restart gets killed by
systemd's KillMode=mixed cgroup cleanup, leaving the gateway dead.

Under systemd (detected via INVOCATION_ID env var), /restart now uses
via_service=True which exits with code 75 — RestartForceExitStatus=75
in the unit file makes systemd auto-restart the service. The detached
subprocess approach is preserved as fallback for non-systemd
environments (Docker, tmux, foreground mode).
2026-04-12 22:32:19 -07:00
Teknium e2a9b5369f feat: web UI dashboard for managing Hermes Agent (#8756)
* feat: web UI dashboard for managing Hermes Agent (salvage of #8204/#7621)

Adds an embedded web UI dashboard accessible via `hermes web`:
- Status page: agent version, active sessions, gateway status, connected platforms
- Config editor: schema-driven form with tabbed categories, import/export, reset
- API Keys page: set, clear, and view redacted values with category grouping
- Sessions, Skills, Cron, Logs, and Analytics pages

Backend:
- hermes_cli/web_server.py: FastAPI server with REST endpoints
- hermes_cli/config.py: reload_env() utility for hot-reloading .env
- hermes_cli/main.py: `hermes web` subcommand (--port, --host, --no-open)
- cli.py / commands.py: /reload slash command for .env hot-reload
- pyproject.toml: [web] optional dependency extra (fastapi + uvicorn)
- Both update paths (git + zip) auto-build web frontend when npm available

Frontend:
- Vite + React + TypeScript + Tailwind v4 SPA in web/
- shadcn/ui-style components, Nous design language
- Auto-refresh status page, toast notifications, masked password inputs

Security:
- Path traversal guard (resolve().is_relative_to()) on SPA file serving
- CORS localhost-only via allow_origin_regex
- Generic error messages (no internal leak), SessionDB handles closed properly

Tests: 47 tests covering reload_env, redact_key, API endpoints, schema
generation, path traversal, category merging, internal key stripping,
and full config round-trip.

Original work by @austinpickett (PR #1813), salvaged by @kshitijk4poor
(PR #7621#8204), re-salvaged onto current main with stale-branch
regressions removed.

* fix(web): clean up status page cards, always rebuild on `hermes web`

- Remove config version migration alert banner from status page
- Remove config version card (internal noise, not surfaced in TUI)
- Reorder status cards: Agent → Gateway → Active Sessions (3-col grid)
- `hermes web` now always rebuilds from source before serving,
  preventing stale web_dist when editing frontend files

* feat(web): full-text search across session messages

- Add GET /api/sessions/search endpoint backed by FTS5
- Auto-append prefix wildcards so partial words match (e.g. 'nimb' → 'nimby')
- Debounced search (300ms) with spinner in the search icon slot
- Search results show FTS5 snippets with highlighted match delimiters
- Expanding a search hit auto-scrolls to the first matching message
- Matching messages get a warning ring + 'match' badge
- Inline term highlighting within Markdown (text, bold, italic, headings, lists)
- Clear button (x) on search input for quick reset

---------

Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-12 22:26:28 -07:00
Dusk1e c052cf0eea fix(security): validate domain/service params in ha_call_service to prevent path traversal 2026-04-12 22:26:15 -07:00
Teknium 8a64f3e368 feat(gateway): notify /restart requester when gateway comes back online
When a user sends /restart, the gateway now persists their routing info
(platform, chat_id, thread_id) to .restart_notify.json. After the new
gateway process starts and adapters connect, it reads the file, sends a
'Gateway restarted successfully' message to that specific chat, and
cleans up the file.

This follows the same pattern as _send_update_notification (used by
/update). Thread IDs are preserved so the notification lands in the
correct Telegram topic or Discord thread.

Previously, after /restart the user had no feedback that the gateway was
back — they had to send a message to find out. Now they get a proactive
notification and know their session continues.
2026-04-12 22:23:48 -07:00
Teknium b22663ea69 docs: restore Orchestra Research attribution in research-paper-writing skill (#8800)
PR #4654 replaced ml-paper-writing with research-paper-writing, preserving
the writing philosophy and reference files but dropping the dedicated
'Sources Behind This Guidance' attribution table from the SKILL.md body.

Re-adds:
- The researcher attribution table (Nanda, Farquhar, Gopen & Swan, Lipton,
  Steinhardt, Perez, Karpathy) with affiliations and links to SKILL.md
- Orchestra Research credit as original compiler of the writing philosophy
- 'Origin & Attribution' section in sources.md documenting the full chain:
  Nanda blog → Orchestra skill → teknium integration → SHL0MS expansion
2026-04-12 22:03:18 -07:00
Teknium 83ca0844f7 fix: preserve dots in model names for OpenCode Zen and ZAI providers (#8794)
OpenCode Zen was in _DOT_TO_HYPHEN_PROVIDERS, causing all dotted model
names (minimax-m2.5-free, gpt-5.4, glm-5.1) to be mangled. The fix:

Layer 1 (model_normalize.py): Remove opencode-zen from the blanket
dot-to-hyphen set. Add an explicit block that preserves dots for
non-Claude models while keeping Claude hyphenated (Zen's Claude
endpoint uses anthropic_messages mode which expects hyphens).

Layer 2 (run_agent.py _anthropic_preserve_dots): Add opencode-zen and
zai to the provider allowlist. Broaden URL check from opencode.ai/zen/go
to opencode.ai/zen/ to cover both Go and Zen endpoints. Add bigmodel.cn
for ZAI URL detection.

Also adds glm-5.1 to ZAI model lists in models.py and setup.py.

Closes #7710

Salvaged from contributions by:
- konsisumer (PR #7739, #7719)
- DomGrieco (PR #8708)
- Esashiero (PR #7296)
- sharziki (PR #7497)
- XiaoYingGee (PR #8750)
- APTX4869-maker (PR #8752)
- kagura-agent (PR #7157)
2026-04-12 21:22:59 -07:00
Teknium a0cd2c5338 fix(gateway): verbose tool progress no longer truncates args when tool_preview_length is 0 (#8735)
When tool_preview_length is 0 (default for platforms without a tier
default, like Session), verbose mode was truncating args JSON to 200
characters.  Since the user explicitly opted into verbose mode, they
expect full tool call detail — the 200-char cap defeated the purpose.

Now: tool_preview_length=0 means no truncation in verbose mode.
Positive values still cap as before.  Platform message-length limits
handle overflow naturally.
2026-04-12 20:05:12 -07:00
Teknium 3636f64540 fix: resolve npm audit vulnerabilities in browser tools and whatsapp bridge (#8745)
* fix(telegram): use UTF-16 code units for message length splitting

Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units.

Previously, truncate_message() used Python's len() which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.

Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
  UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
  budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
  _prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
  truncate_message with len_fn (emoji splitting, content preservation,
  code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn

* fix: resolve npm audit vulnerabilities in browser tools and whatsapp bridge

Browser tools (agent-browser):
- Override lodash to 4.18.1 (fixes prototype pollution CVEs in transitive
  dep via node-simctl → @appium/logger). Not reachable in Hermes's code
  path but cleans the audit report.
- basic-ftp and brace-expansion updated via npm audit fix.

WhatsApp bridge:
- file-type updated (fixes infinite loop in ASF parser + ZIP bomb DoS)
- music-metadata updated (fixes infinite loop in ASF parser)
- path-to-regexp updated (fixes ReDoS, mitigated by localhost binding)

Both components now report 0 npm vulnerabilities.

Ref: https://gist.github.com/jacklevin74/b41b710d3e20ba78fb7e2d42e2b83819
2026-04-12 19:38:20 -07:00
Teknium 15b1a3aa69 fix: improve WhatsApp UX — chunking, formatting, streaming (#8723)
Three changes that address the poor WhatsApp experience reported by users:

1. Reclassify WhatsApp from TIER_LOW to TIER_MEDIUM in display_config.py
   — enables streaming and tool progress via the existing Baileys /edit
   bridge endpoint. Users now see progressive responses instead of
   minutes of silence followed by a wall of text.

2. Lower MAX_MESSAGE_LENGTH from 65536 to 4096 and add proper chunking
   — send() now calls format_message() and truncate_message() before
   sending, then loops through chunks with a small delay between them.
   The base class truncate_message() already handles code block boundary
   detection (closes/reopens fences at chunk boundaries). reply_to is
   only set on the first chunk.

3. Override format_message() with WhatsApp-specific markdown conversion
   — converts **bold** to *bold*, ~~strike~~ to ~strike~, headers to
   bold text, and [links](url) to text (url). Code blocks and inline
   code are protected from conversion via placeholder substitution.

Together these fix the two user complaints:
- 'sends the whole code all the time' → now chunked at 4K with proper
  formatting
- 'terminal gets interrupted and gets cooked' → streaming + tool progress
  give visual feedback so users don't accidentally interrupt with
  follow-up messages
2026-04-12 19:20:13 -07:00
Teknium 5fae356a85 fix: show full last assistant response when resuming a session (#8724)
When resuming a session with --resume or -c, the last assistant response
was truncated to 200 chars / 3 lines just like older messages in the recap.
This forced users to waste tokens re-asking for the response.

Now the last assistant message in the recap is shown in full with non-dim
styling, so users can see exactly where they left off. Earlier messages
remain truncated for compact display.

Changes:
- Track un-truncated text for the last assistant entry during collection
- Replace last entry with full text after history trimming
- Render last assistant entry with bold (non-dim) styling
- Update existing truncation tests to use multi-message histories
- Add new tests for full last response display (char + multiline)
2026-04-12 19:07:14 -07:00
Teknium 9e992df8ae fix(telegram): use UTF-16 code units for message length splitting (#8725)
Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units.

Previously, truncate_message() used Python's len() which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.

Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
  UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
  budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
  _prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
  truncate_message with len_fn (emoji splitting, content preservation,
  code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn
2026-04-12 19:06:20 -07:00
Teknium 3cd6cbee5f feat: add /debug slash command for all platforms
Adds /debug as a slash command available in CLI, Telegram, Discord,
Slack, and all other gateway platforms. Uploads debug report + full
logs to paste services and returns shareable URLs.

- commands.py: CommandDef in Info category (no cli_only/gateway_only)
- gateway/run.py: async handler with run_in_executor for blocking I/O
- cli.py: dispatch in process_command to run_debug_share
2026-04-12 18:08:45 -07:00
Teknium f724079d3b fix(gateway): reject known-weak placeholder credentials at startup
Port from openclaw/openclaw#64586: users who copy .env.example without
changing placeholder values now get a clear error at startup instead of
a confusing auth failure from the platform API. Also rejects placeholder
API_SERVER_KEY when binding to a network-accessible address.

Cherry-picked from PR #8677.
2026-04-12 18:05:41 -07:00
Teknium c7d8d109ff fix(matrix): trust m.mentions.user_ids as authoritative mention signal
Port from openclaw/openclaw#64796: Per MSC3952 / Matrix v1.7, the
m.mentions.user_ids field is the authoritative mention signal. Clients
that populate m.mentions but don't duplicate @bot in the body text
were being silently dropped when MATRIX_REQUIRE_MENTION=true.

Cherry-picked from PR #8673.
2026-04-12 18:05:41 -07:00
Teknium 88a12af58c feat: add hermes debug share — upload debug report to pastebin (#8681)
* feat: add `hermes debug share` — upload debug report to pastebin

Adds a new `hermes debug share` command that collects system info
(via hermes dump), recent logs (agent.log, errors.log, gateway.log),
and uploads the combined report to a paste service (paste.rs primary,
dpaste.com fallback). Returns a shareable URL for support.

Options:
  --lines N    Number of log lines per file (default: 200)
  --expire N   Paste expiry in days (default: 7, dpaste.com only)
  --local      Print report locally without uploading

Files:
  hermes_cli/debug.py           - New module: paste upload + report collection
  hermes_cli/main.py            - Wire cmd_debug + argparse subparser
  tests/hermes_cli/test_debug.py - 19 tests covering upload, collection, CLI

* feat: upload full agent.log and gateway.log as separate pastes

hermes debug share now uploads up to 3 pastes:
  1. Summary report (system info + log tails) — always
  2. Full agent.log (last ~500KB) — if file exists
  3. Full gateway.log (last ~500KB) — if file exists

Each paste uploads independently; log upload failures are noted
but don't block the main report. Output shows all links aligned:

  Report     https://paste.rs/abc
  agent.log  https://paste.rs/def
  gateway.log https://paste.rs/ghi

Also adds _read_full_log() with size-capped tail reading to stay
within paste service limits (~512KB per file).

* feat: prepend hermes dump to each log paste for self-contained context

Each paste (agent.log, gateway.log) now starts with the hermes dump
output so clicking any single link gives full system context without
needing to cross-reference the summary report.

Refactored dump capture into _capture_dump() — called once and
reused across the summary report and each log paste.

* fix: fall back to .1 rotated log when primary log is missing or empty

When gateway.log (or agent.log) doesn't exist or is empty, the debug
share now checks for the .1 rotation file. This is common — the
gateway rotates logs and the primary file may not exist yet.

Extracted _resolve_log_path() to centralize the fallback logic for
both _read_log_tail() and _read_full_log().

* chore: remove unused display_hermes_home import
2026-04-12 18:05:14 -07:00
Teknium bcad679799 fix(api_server): normalize array-based content parts in chat completions
Some OpenAI-compatible clients (Open WebUI, LobeChat, etc.) send
message content as an array of typed parts instead of a plain string:

    [{"type": "text", "text": "hello"}]

The agent pipeline expects strings, so these array payloads caused
silent failures or empty messages.

Add _normalize_chat_content() with defensive limits (recursion depth,
list size, output length) and apply it to both the Chat Completions
and Responses API endpoints. The Responses path had inline
normalization that only handled input_text/output_text — the shared
function also handles the standard 'text' type.

Salvaged from PR #7980 (ikelvingo) — only the content normalization;
the SSE and Weixin changes in that PR were regressions and are not
included.

Co-authored-by: ikelvingo <ikelvingo@users.noreply.github.com>
2026-04-12 18:03:16 -07:00
AaronWong1999 e8385f6f89 docs: add HermesClaw to community ecosystem
Adds a one-line entry for HermesClaw (community WeChat bridge) to the Community section. It lets users run Hermes Agent and OpenClaw on the same WeChat account.
2026-04-12 18:03:16 -07:00
Sicheng Li ea2829ab43 fix(weixin,wecom,matrix): respect system proxy via aiohttp trust_env
aiohttp.ClientSession defaults to trust_env=False, ignoring HTTP_PROXY/
HTTPS_PROXY env vars. This causes QR login and all API calls to fail for
users behind a proxy (e.g. Clash in fake-ip mode), which is common in
China where Weixin and WeCom are primarily used.

Added trust_env=True to all aiohttp.ClientSession instantiations that
connect to external hosts (weixin: 3 places, wecom: 1, matrix: 1).
WhatsApp sessions are excluded as they only connect to localhost.

httpx-based adapters (dingtalk, signal, wecom_callback) are unaffected
as httpx defaults to trust_env=True.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 18:03:16 -07:00
Teknium bc4e2744c3 test: add tests for compression config_context_length passthrough
- Test that auxiliary.compression.context_length from config is forwarded
  to get_model_context_length (positive case)
- Test that invalid/non-integer config values are silently ignored
- Fix _make_agent() to set config=None (cherry-picked code reads self.config)
2026-04-12 17:52:34 -07:00
ygd58 4a9c356559 fix(compression): pass configured context_length to feasibility check
_check_compression_model_feasibility() called get_model_context_length()
without passing config_context_length, so custom endpoints that do not
support /models API queries always fell through to the 128K default,
ignoring auxiliary.compression.context_length in config.yaml.

Fix: read auxiliary.compression.context_length from config and pass it
as config_context_length (highest-priority hint) so the user-configured
value is always respected regardless of API availability.

Fixes #8499
2026-04-12 17:52:34 -07:00
Teknium 0d0d27d45e test(tts): add speed config tests for Edge, OpenAI, and MiniMax
12 tests covering:
- Provider-specific speed overrides global speed
- Global speed used as fallback
- Default (no speed) preserves existing behavior
- Edge SSML rate string conversion (positive/negative)
- OpenAI speed clamping to 0.25-4.0 range
2026-04-12 16:46:18 -07:00
0xbyt4 8ec0656f53 feat(tts): add speed support for Edge TTS and OpenAI TTS
Read tts.speed (global) or tts.<provider>.speed (provider-specific) from
config. Provider-specific takes precedence over global.

- Edge TTS: converts speed float to SSML prosody rate string
- OpenAI TTS: passes speed param clamped to 0.25-4.0
- MiniMax: wired into global tts.speed fallback for consistency

Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
2026-04-12 16:46:18 -07:00
Teknium 651419b014 fix: make mimo-v2-pro the default model for Nous portal users
Users who set up Nous auth without explicitly selecting a model via
`hermes model` were silently falling back to anthropic/claude-opus-4.6
(the first entry in _PROVIDER_MODELS['nous']), causing unexpected
charges on their Nous plan. Move xiaomi/mimo-v2-pro to the first
position so unconfigured users default to a free model instead.
2026-04-12 16:44:03 -07:00
Teknium a266238e1e fix(weixin): streaming cursor, media uploads, markdown links, blank messages (#8665)
Four fixes for the Weixin/WeChat adapter, synthesized from the best
aspects of community PRs #8407, #8521, #8360, #7695, #8308, #8525,
#7531, #8144, #8251.

1. Streaming cursor (▉) stuck permanently — WeChat doesn't support
   message editing, so the cursor appended during streaming can never
   be removed.  Add SUPPORTS_MESSAGE_EDITING = False to WeixinAdapter
   and check it in gateway/run.py to use an empty cursor for non-edit
   platforms.  (Fixes #8307, #8326)

2. Media upload failures — two bugs in _send_file():
   a) upload_full_url path used PUT (404 on WeChat CDN); now uses POST.
   b) aes_key was base64(raw_bytes) but the iLink API expects
      base64(hex_string); images showed as grey boxes.  (Fixes #8352, #7529)
   Also: unified both upload paths into _upload_ciphertext(), preferring
   upload_full_url.  Added send_video/send_voice methods and voice_item
   media builder for audio/.silk files.  Added video_md5 field.

3. Markdown links stripped — WeChat can't render [text](url), so
   format_message() now converts them to 'text (url)' plaintext.
   Code blocks are preserved.  (Fixes #7617)

4. Blank message prevention — three guards:
   a) _split_text_for_weixin_delivery('') returns [] not ['']
   b) send() filters empty/whitespace chunks before _send_text_chunk
   c) _send_message() raises ValueError for empty text as safety net

Community credit: joei4cm (#8407), lyonDan (#8521), SKFDJKLDG (#8360),
tomqiaozc (#7695), joshleeeeee (#8308), luoxiao6645(#8525),
longsizhuo (#7531), Astral-Yang (#8144), QingWei-Li (#8251).
2026-04-12 16:43:25 -07:00
Teknium c83674dd77 fix: unify OpenClaw detection, add isatty guard, fix print_warning import
Combines detection from both PRs into _detect_openclaw_processes():
- Cross-platform process scan (pgrep/tasklist/PowerShell) from PR #8102
- systemd service check from PR #8555
- Returns list[str] with details about what's found

Fixes in cleanup warning (from PR #8555):
- print_warning -> print_error/print_info (print_warning not in import chain)
- Added isatty() guard for non-interactive sessions
- Removed duplicate _check_openclaw_running() in favor of shared function

Updated all tests to match new API.
2026-04-12 16:40:37 -07:00
Serhat Dolmac 76f7411fca fix(claw): warn and prompt if OpenClaw is still running before archival (fixes #8502) 2026-04-12 16:40:37 -07:00
dirtyfancy 9fb36738a7 fix(claw): address Copilot review on Windows detection and non-interactive prompt
- Use PowerShell to inspect node.exe command lines on Windows,
  since tasklist output does not include them.
- Also check for dedicated openclaw.exe/clawd.exe processes.
- Skip the interactive prompt in non-interactive sessions so the
  preview-only behavior is preserved.
- Update tests accordingly.

Relates to #7907
2026-04-12 16:40:37 -07:00
dirtyfancy 5af9614f6d fix(claw): warn if OpenClaw is running before migration
Add _is_openclaw_running() and _warn_if_openclaw_running() to detect
OpenClaw processes (via pgrep/tasklist) before hermes claw migrate.
Warns the user that messaging platforms only allow one active session
per bot token, and lets them cancel or continue.

Fixes #7907
2026-04-12 16:40:37 -07:00
Teknium 76019320fb feat(skills): centralized skills index — eliminate GitHub API calls for search/install
Add a CI-built skills index served from the docs site. The index is
crawled daily by GitHub Actions, resolves all GitHub paths upfront, and
is cached locally by the client. When the index is available:

- Search uses the cached index (0 GitHub API calls, was 23+)
- Install uses resolved paths from index (6 API calls for file
  downloads only, was 31-45 for discovery + downloads)

Total: 68 → 6 GitHub API calls for a typical search + install flow.
Unauthenticated users (60 req/hr) can now search and install without
hitting rate limits.

Components:
- scripts/build_skills_index.py: Crawl all sources (skills.sh, GitHub
  taps, official, clawhub, lobehub), batch-resolve GitHub paths via
  tree API, output JSON index
- tools/skills_hub.py: HermesIndexSource class — search/fetch/inspect
  backed by the index, with lazy GitHubSource for file downloads
- parallel_search_sources() skips external API sources when index is
  available (0 GitHub calls for search)
- .github/workflows/skills-index.yml: twice-daily CI build + deploy
- .github/workflows/deploy-site.yml: also builds index during docs deploy

Graceful degradation: when the index is unavailable (first run, network
down, stale), all methods return empty/None and downstream sources
handle the request via direct API as before.
2026-04-12 16:39:04 -07:00
Teknium 7e0e5ea03b fix(skills): cache GitHub repo trees to avoid rate-limit exhaustion on install
Skills.sh installs hit the GitHub API 45 times per install because the
same repo tree was fetched 6 times redundantly. Combined with search
(23 API calls), this totals 68 — exceeding the unauthenticated rate
limit of 60 req/hr, causing 'Could not fetch' errors for users without
a GITHUB_TOKEN.

Changes:
- Add _get_repo_tree() cache to GitHubSource — repo info + recursive
  tree fetched once per repo per source instance, eliminating 10
  redundant API calls (6 tree + 4 candidate 404s)
- _download_directory_via_tree returns {} (not None) when cached tree
  shows path doesn't exist, skipping unnecessary Contents API fallback
- _check_rate_limit_response() detects exhausted quota and sets
  is_rate_limited flag
- do_install() shows actionable hint when rate limited: set
  GITHUB_TOKEN or install gh CLI

Before: 45 API calls per install (68 total with search)
After:  31 API calls per install (54 total with search — under 60/hr)

Reported by community user from Vietnam (no GitHub auth configured).
2026-04-12 16:39:04 -07:00
Teknium 4c6ebd077e chore: sync uv.lock with matrix extra deps (aiosqlite, asyncpg) (#8661)
These were already declared in pyproject.toml but missing from the lockfile.
2026-04-12 16:38:15 -07:00
alt-glitch 5e1197a42e fix(gateway): harden Docker/container gateway pathway
Centralize container detection in hermes_constants.is_container() with
process-lifetime caching, matching existing is_wsl()/is_termux() patterns.
Dedup _is_inside_container() in config.py to delegate to the new function.

Add _run_systemctl() wrapper that converts FileNotFoundError to RuntimeError
for defense-in-depth — all 10 bare subprocess.run(_systemctl_cmd(...)) call
sites now route through it.

Make supports_systemd_services() return False in containers and when
systemctl binary is absent (shutil.which check).

Add Docker-specific guidance in gateway_command() for install/uninstall/start
subcommands — exit 0 with helpful instructions instead of crashing.

Make 'hermes status' show 'Manager: docker (foreground)' and 'hermes dump'
show 'running (docker, pid N)' inside containers.

Fix setup_gateway() to use supports_systemd instead of _is_linux for all
systemd-related branches, and show Docker restart policy instructions in
containers.

Replace inline /.dockerenv check in voice_mode.py with is_container().

Fixes #7420

Co-authored-by: teknium1 <teknium1@users.noreply.github.com>
2026-04-12 16:36:11 -07:00
sprmn24 18ab5c99d1 fix(backup): correct marker filenames in _validate_backup_zip
The backup validation checked for 'hermes_state.db' and 'memory_store.db'
as telltale markers of a valid Hermes backup zip. Neither name exists in a
real Hermes installation — the actual database file is 'state.db'
(hermes_state.py: DEFAULT_DB_PATH = get_hermes_home() / 'state.db').

A fresh Hermes installation produces:
  ~/.hermes/state.db        (actual name)
  ~/.hermes/config.yaml
  ~/.hermes/.env

Because the marker set never matched 'state.db', a backup zip containing
only 'state.db' plus 'config.yaml' would fail validation with:
  'zip does not appear to be a Hermes backup'
and the import would exit with sys.exit(1), silently rejecting a valid backup.

Fix: replace the wrong marker names with the correct filename.

Adds TestValidateBackupZip with three cases:
- state.db is accepted as a valid marker
- old wrong names (hermes_state.db, memory_store.db) alone are rejected
- config.yaml continues to pass (existing behaviour preserved)
2026-04-12 16:35:56 -07:00
Teknium d6785dc4d4 fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609)
Three fixes for the (empty) response bug affecting open reasoning models:

1. Allow retries after prefill exhaustion — models like mimo-v2-pro always
   populate reasoning fields via OpenRouter, so the old 'not _has_structured'
   guard on the retry path blocked retries for EVERY reasoning model after
   the 2 prefill attempts.  Now: 2 prefills + 3 retries = 6 total attempts
   before (empty).

2. Reset prefill/retry counters on tool-call recovery — the counters
   accumulated across the entire conversation, never resetting during
   tool-calling turns.  A model cycling empty→prefill→tools→empty burned
   both prefill attempts and the third empty got zero recovery.  Now
   counters reset when prefill succeeds with tool calls.

3. Strip think blocks before _truly_empty check — inline <think> content
   made the string non-empty, skipping both retry paths.

Reported by users on Telegram with xiaomi/mimo-v2-pro and qwen3.5 models.
Reproduced: qwen3.5-9b emits tool calls as XML in reasoning field instead
of proper function calls, causing content=None + tool_calls=None + reasoning
with embedded <tool_call> XML.  Prefill recovery works but counter
accumulation caused permanent (empty) in long sessions.
2026-04-12 15:38:11 -07:00
Teknium a4593f8b21 feat: make gateway 'still working' notification interval configurable (#8572)
Add agent.gateway_notify_interval config option (default 600s).
Set to 0 to disable periodic 'still working' notifications.
Bridged to HERMES_AGENT_NOTIFY_INTERVAL env var (same pattern as
gateway_timeout and gateway_timeout_warning).

The inactivity warning (gateway_timeout_warning) was already
configurable; this makes the wall-clock ping configurable too.
2026-04-12 13:06:34 -07:00
Teknium 1179918746 fix: salvage follow-ups for Feishu QR onboarding (#7706)
- Remove duplicate _setup_feishu() definition (old 3-line version left
  behind by cherry-pick — Python picked the new one but dead code
  remained)
- Remove misleading 'Disable direct messages' DM option — the Feishu
  adapter has no DM policy mechanism, so 'disable' produced identical
  env vars to 'pairing'. Users who chose 'disable' would still see
  pairing prompts. Reduced to 3 options: pairing, allow-all, allowlist.
- Fix test_probe_returns_bot_info_on_success and
  test_probe_returns_none_on_failure: patch FEISHU_AVAILABLE=True so
  probe_bot() takes the SDK path when lark_oapi is not installed
2026-04-12 13:05:56 -07:00
Shuo d7785f4d5b feat(feishu): add scan-to-create onboarding for Feishu / Lark
Add a QR-based onboarding flow to `hermes gateway setup` for Feishu / Lark.
Users scan a QR code with their phone and the platform creates a fully
configured bot application automatically — matching the existing WeChat
QR login experience.

Setup flow:
- Choose between QR scan-to-create (new app) or manual credential input (existing app)
- Connection mode selection (WebSocket / Webhook)
- DM security policy (pairing / open / allowlist / disabled)
- Group chat policy (open with @mention / disabled)

Implementation:
- Onboard functions (init/begin/poll/QR/probe) in gateway/platforms/feishu.py
- _setup_feishu() in hermes_cli/gateway.py with manual fallback
- probe_bot uses lark_oapi SDK when available, raw HTTP fallback otherwise
- qr_register() catches expected errors (network/protocol), propagates bugs
- Poll handles HTTP 4xx JSON responses and feishu/lark domain auto-detection

Tests:
- 25 tests for onboard module (registration, QR, probe, contract, negative paths)
- 16 tests for setup flow (credentials, connection mode, DM policy, group policy,
  adapter integration verifying env vars produce valid FeishuAdapterSettings)

Change-Id: I720591ee84755f32dda95fbac4b26dc82cbcf823
2026-04-12 13:05:56 -07:00
Teknium a9ebb331bc fix: contextual error diagnostics for invalid API responses (#8565)
Previously, all invalid API responses (choices=None) were diagnosed
as 'fast response often indicates rate limiting' regardless of actual
response time or error code. A 738s Cloudflare 524 timeout was labeled
as 'fast response' and 'possible rate limit'.

Now extracts the error code from response.error and classifies:
- 524: upstream provider timed out (Cloudflare)
- 504: upstream gateway timeout
- 429: rate limited by upstream provider
- 500/502: upstream server error
- 503/529: upstream provider overloaded
- Other codes: shown with code number
- No code + <10s: likely rate limited (timing heuristic)
- No code + >60s: likely upstream timeout
- No code + 10-60s: neutral response time

All downstream messages (retry status, final error, interrupt message)
now use the classified hint instead of generic rate-limit language.

Reported by community member Lumen Radley (MiMo provider timeouts).
2026-04-12 13:00:07 -07:00
Teknium 400fe9b2a1 fix: add <thought> stripping to auxiliary_client + tests
auxiliary_client.py had its own regex mirroring _strip_think_blocks
but was missing the <thought> variant. Also adds test coverage for
<thought> paired and orphaned tags.
2026-04-12 12:44:49 -07:00
Chen Chia Yang 326d5febe5 fix: also strip <thought> tags during streaming in cli.py 2026-04-12 12:44:49 -07:00
Chen Chia Yang a372c14fc5 fix: strip <thought> tags from Gemma 4 responses in _strip_think_blocks
Gemma 4 (26B/31B) uses <thought>...</thought> to wrap its reasoning
output. This tag was not included in the existing list of reasoning tag
variants stripped by _strip_think_blocks(), causing raw thinking blocks
to leak into the visible response.

Added a new re.sub() line for <thought> and extended the cleanup regex
to include 'thought' alongside the existing variants.

Fixes #6148
2026-04-12 12:44:49 -07:00
Teknium f295b17d92 fix: make agent_thread daemon to prevent orphan CLI processes on tab close (#8557)
When a user closes a terminal tab, SIGHUP exits the main thread but
the non-daemon agent_thread kept the entire Python process alive —
stuck in the API call loop with no interrupt signal. Over many
conversations, these orphan processes accumulate and cause massive
swap usage (reported: 77GB on a 32GB M1 Pro).

Changes:
- Make agent_thread daemon=True so the process exits when the main
  thread finishes its cleanup. Under normal operation this changes
  nothing — the main thread already waits on agent_thread.is_alive().
- Interrupt the agent in the finally/exit path so the daemon thread
  stops making API calls promptly rather than being killed mid-flight.
2026-04-12 12:38:55 -07:00
Teknium 06290f6a2f fix: handle broken stdin in prompt_toolkit startup (#6393) (#8560)
On macOS with uv-managed Python, stdin (fd 0) can be invalid or
unregisterable with the asyncio selector, causing:

  KeyError: '0 is not registered'

during prompt_toolkit's app.run() → asyncio.run() → _add_reader(0).

Three-layer fix:
1. Pre-flight fstat(0) check before app.run() — detects broken stdin
   early and prints actionable guidance instead of a raw traceback.
2. Catch KeyError/OSError around app.run() as fallback for edge cases
   that slip past the fstat guard.
3. Extend asyncio exception handler to suppress selector registration
   KeyErrors in async callbacks.

Fixes #6393
2026-04-12 12:38:03 -07:00
Teknium 06a17c57ae fix: improve profile creation UX — seed SOUL.md + credential warning (#8553)
Fresh profiles (created without --clone) now:
- Auto-seed a default SOUL.md immediately, so users have a file to
  customize right away instead of discovering it only after first use
- Print a clear warning that the profile has no API keys and will
  inherit from the shell environment unless configured separately
- Show the SOUL.md path for personality customization

Previously, fresh profiles started with no SOUL.md (only seeded on
first use via ensure_hermes_home), no mention of credential isolation,
and no guidance about customizing personality. Users reported confusion
about profiles using the wrong model/plan tokens and SOUL.md not
being read — both traced to operational gaps in the creation UX.

Closes #8093 (investigated: code correctly loads SOUL.md from profile
HERMES_HOME; issue was operational, not a code bug).
2026-04-12 12:22:34 -07:00
Teknium 4eecaf06e4 fix: prevent duplicate update prompt spam in gateway watcher (#8343)
The _watch_update_progress() poll loop never deleted .update_prompt.json
after forwarding the prompt to the user, causing the same prompt to be
re-sent every poll cycle (2s). Two fixes:

1. Delete .update_prompt.json after forwarding — the update process only
   polls for .update_response, it doesn't need the prompt file to persist.
2. Guard re-sends with _update_prompt_pending check — belt-and-suspenders
   to prevent duplicates even under race conditions.

Add regression test asserting the prompt is sent exactly once.
2026-04-12 04:52:59 -07:00
Teknium 7a67b13506 fix: title_generator no longer logs as 'compression' task
Changed task='compression' to task='title_generation' so auto-title
calls don't pollute logs with false compression alarms.
2026-04-12 04:17:18 -07:00
Teknium 45e60904c6 fix: fall back to provider's default model when model config is empty (#8303)
When a user configures a provider (e.g. `hermes auth add openai-codex`)
but never selects a model via `hermes model`, the gateway and CLI would
pass an empty model string to the API, causing:
  'Codex Responses request model must be a non-empty string'

Now both gateway (_resolve_session_agent_runtime) and CLI
(_ensure_runtime_credentials) detect an empty model and fill it from
the provider's first catalog entry in _PROVIDER_MODELS. This covers
all providers that have a static model list (openai-codex, anthropic,
gemini, copilot, etc.).

The fix is conservative: it only triggers when model is truly empty
and a known provider was resolved. Explicit model choices are never
overridden.
2026-04-12 03:53:30 -07:00
Teknium 17c72f176d fix: make skill loading instructions more aggressive in system prompt (#8286)
The previous wording ('If one clearly matches') set too high a threshold,
and 'If none match, proceed normally' was an easy escape hatch for lazy
models. Now:

- Lowered threshold: 'matches or is even partially relevant'
- Added MUST directive and 'err on the side of loading' guidance
- Replaced permissive closer with 'only proceed without if genuinely none
  are relevant'

This should reduce cases where the agent skips loading relevant skills
unless explicitly forced.
2026-04-12 03:03:16 -07:00
Teknium b6b6b02f0f fix: prevent unwanted session auto-reset after graceful gateway restarts (#8299)
When the gateway shuts down gracefully (hermes update, gateway restart,
/restart), it now writes a .clean_shutdown marker file. On the next
startup, if this marker exists, suspend_recently_active() is skipped
and the marker is cleaned up.

Previously, suspend_recently_active() fired on EVERY startup —
including planned restarts from hermes update or hermes gateway restart.
This caused users to lose their conversation history unexpectedly: the
session would be marked as suspended, and the next message would
trigger an auto-reset with a notification the user never asked for.

The original purpose of suspend_recently_active() is crash recovery —
preventing stuck sessions that were mid-processing when the gateway
died unexpectedly. Graceful shutdowns already drain active agents via
_drain_active_agents(), so there is no stuck-session risk. After a
crash (no marker written), suspension still fires as before.

Fixes the scenario where a user asks the agent to run hermes update,
the gateway restarts, and the user's next message gets an unwanted
'Session automatically reset' notification with their history cleared.
2026-04-12 03:03:07 -07:00
Teknium 56e3ee2440 fix: write update exit code before gateway restart (cgroup kill race) (#8288)
When /update runs via Telegram, hermes update --gateway is spawned inside
the gateway's systemd cgroup.  The update process itself calls
systemctl restart hermes-gateway, which tears down the cgroup with
KillMode=mixed — SIGKILL to all remaining processes.  The wrapping bash
shell is killed before it can execute the exit-code epilogue, so
.update_exit_code is never created.  The new gateway's update watcher
then polls for 30 minutes and sends a spurious timeout message.

Fix: write .update_exit_code from Python inside cmd_update() immediately
after the git pull + pip install succeed ("Update complete!"), before
attempting the gateway restart.  The shell epilogue still writes it too
(idempotent overwrite), but now the marker exists even when the process
is killed mid-restart.
2026-04-12 02:33:21 -07:00
Teknium b321330362 feat: add WSL environment hint to system prompt (#8285)
When running inside WSL (Windows Subsystem for Linux), inject a hint into
the system prompt explaining that the Windows host filesystem is mounted
at /mnt/c/, /mnt/d/, etc. This lets the agent naturally translate Windows
paths (Desktop, Documents) to their /mnt/ equivalents without the user
needing to configure anything.

Uses the existing is_wsl() detection from hermes_constants (cached,
checks /proc/version for 'microsoft'). Adds build_environment_hints()
in prompt_builder.py — extensible for Termux, Docker, etc. later.

Closes the UX gap where WSL users had to manually explain path
translation to the agent every session.
2026-04-12 02:26:28 -07:00
Teknium dd5b1063d0 fix: register MATRIX_RECOVERY_KEY env var + document migration path
Follow-up for cherry-picked PR #8272:
- Add MATRIX_RECOVERY_KEY to module docstring header in matrix.py
- Register in OPTIONAL_ENV_VARS (config.py) with password=True, advanced=True
- Add to _NON_SETUP_ENV_VARS set
- Document cross-signing verification in matrix.md E2EE section
- Update migration guide with recovery key step (step 3)
- Add to environment-variables.md reference
2026-04-12 02:18:03 -07:00
elkimek b9af4955b9 fix(matrix): restore verify_with_recovery_key after device key rotation
After the PgCryptoStore migration in v0.8.0, the verify_with_recovery_key
call that previously ran after share_keys() was dropped. On any rotation
that uploads fresh device keys (fresh crypto.db, server had stale keys
from a prior install, etc.), the new device keys carry no valid self-
signing signature because the bot has no access to the self-signing
private key.

Peers like Element then refuse to share Megolm sessions with the
rotated device, so the bot silently stops decrypting incoming messages.

This restores the recovery-key bootstrap: on startup, if
MATRIX_RECOVERY_KEY is set, import the cross-signing private keys from
SSSS and sign_own_device(), producing a valid signature server-side.

Idempotent and gated on MATRIX_RECOVERY_KEY — no behavior change for
users who don't configure a recovery key.

Verified end-to-end by deleting crypto.db and restarting: the bot
rotates device identity keys, re-uploads, self-signs via recovery key,
and decrypts+replies to fresh messages from a paired Element client.
2026-04-12 02:18:03 -07:00
Ben Barclay b0d65c333a Merge pull request #8279 from NousResearch/chore/simplify-docker-tags
chore: simplify Docker image tags
2026-04-12 19:09:05 +10:00
Ben 00adbd0de0 chore: simplify Docker image tags
- Main branch push: only push :latest (remove SHA tag)
- Release push: only push release tag name (remove :latest and SHA tag)
2026-04-12 19:08:16 +10:00
Teknium 95fa78eb6c fix: write refreshed Codex tokens back to ~/.codex/auth.json (#8277)
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When Hermes refreshes a Codex token, it consumed the old refresh_token
but never wrote the new pair back to ~/.codex/auth.json. This caused
Codex CLI and VS Code to fail with 'refresh_token_reused' on their
next refresh attempt.

This mirrors the existing Anthropic write-back pattern where refreshed
tokens are written to ~/.claude/.credentials.json via
_write_claude_code_credentials().

Changes:
- Add _write_codex_cli_tokens() in hermes_cli/auth.py (parallel to
  _write_claude_code_credentials in anthropic_adapter.py)
- Call it from _refresh_codex_auth_tokens() (non-pool refresh path)
- Call it from credential_pool._refresh_entry() (pool happy path + retry)
- Add tests for the new write-back behavior
- Update existing test docstring to clarify _save_codex_tokens vs
  _write_codex_cli_tokens separation

Fixes refresh token conflict reported by @ec12edfae2cb221
2026-04-12 02:05:20 -07:00
Teknium 6d05e3d56f fix(gateway): evict cached agent on /model switch + add diagnostic logging (#8276)
After /model switches the model (both picker and text paths), the cached
agent's config signature becomes stale — the agent was updated in-place
via switch_model() but the cache tuple's signature was never refreshed.
The next turn *should* detect the signature mismatch and create a fresh
agent, but this relies on the new model's signature differing from the
old one in _agent_config_signature().

Evicting the cached agent explicitly after storing the session override
is more defensive — the next turn is guaranteed to create a fresh agent
from the override without depending on signature mismatch detection.

Also adds debug logging at three key decision points so we can trace
exactly what happens when /model + /retry interact:
- _resolve_session_agent_runtime: which override path is taken (fast
  with api_key vs fallback), or why no override was found
- _run_agent.run_sync: final resolved model/provider before agent
  creation

Reported: /model switch to xiaomi/mimo-v2-pro followed by /retry still
used the old model (glm-5.1).
2026-04-12 01:58:17 -07:00
Teknium 4aa534eae5 fix(gateway): peek at pending message during interrupt instead of consuming it
The monitor_for_interrupt() and backup interrupt checks were calling
get_pending_message() which pops the message from the adapter's queue.
This created a race condition: if the agent finished naturally before
checking _interrupt_requested, the pending message was permanently lost.

Timeline of the race:
1. Agent near completion, user sends message
2. Level 1 guard stores message in adapter._pending_messages, sets event
3. monitor_for_interrupt() detects event, POPS message, calls agent.interrupt()
4. Agent's run_conversation() was already returning (interrupted=False)
5. Post-run dequeue finds nothing (monitor already consumed it)
6. result.get('interrupted') is False so interrupt_message fallback doesn't fire
7. User message permanently lost — agent finishes without processing it

Fix: change all three interrupt detection sites (primary monitor + two
backup checks) from get_pending_message() (pop) to
_pending_messages.get() (peek). The message stays in the adapter's queue
until _dequeue_pending_event() consumes it in the post-run handler,
which runs regardless of whether the agent was interrupted or finished
naturally.

Reported by @_SushantSays — intermittent message loss during long
terminal command execution, persisting after the previous fix (73f970fa)
which addressed monitor task death but not this consumption race.
2026-04-12 01:57:34 -07:00
Teknium ae6820a45a fix(setup): validate base URL input in hermes model flow (#8264)
Reject non-URL values (e.g. shell commands typed by mistake) in the
base URL prompt during provider setup. Previously any string was saved
as-is to .env, breaking connectivity when the garbage value was used
as the API endpoint.

Adds http:// / https:// prefix check with a clear error message.
The custom-endpoint flow already had this validation (line 1620);
this brings the generic API-key provider flow to parity.

Triggered by a user support case where 'nano ~/.hermes/.env' was
accidentally entered as GLM_BASE_URL during Z.AI setup.
2026-04-12 01:51:57 -07:00
Teknium a1220977d3 fix: make skill loading instructions more aggressive in system prompt (#8209)
The previous wording ('If one clearly matches') set too high a threshold,
and 'If none match, proceed normally' was an easy escape hatch for lazy
models. Now:

- Lowered threshold: 'matches or is even partially relevant'
- Added MUST directive and 'err on the side of loading' guidance
- Replaced permissive closer with 'only proceed without if genuinely none
  are relevant'

This should reduce cases where the agent skips loading relevant skills
unless explicitly forced.
2026-04-12 01:46:34 -07:00
Teknium 078dba015d fix: three provider-related bugs (#8161, #8181, #8147) (#8243)
- Add openai/openai-codex -> openai mapping to PROVIDER_TO_MODELS_DEV
  so context-length lookups use models.dev data instead of 128k fallback.
  Fixes #8161.

- Set api_mode from custom_providers entry when switching via hermes model,
  and clear stale api_mode when the entry has none. Also extract api_mode
  in _named_custom_provider_map(). Fixes #8181.

- Convert OpenAI image_url content blocks to Anthropic image blocks when
  the endpoint is Anthropic-compatible (MiniMax, MiniMax-CN, or any URL
  containing /anthropic). Fixes #8147.
2026-04-12 01:44:18 -07:00
Harish Kukreja b1f13a8c5f fix(agent): route compression aux through live session runtime 2026-04-12 01:34:52 -07:00
Teknium c52f6348b6 fix: list all available toolsets in delegate_task schema description (#8231)
* fix: list all available toolsets in delegate_task schema description

The delegate_task tool's toolsets parameter description only mentioned
'terminal', 'file', and 'web' as examples. Models (especially smaller
ones like Gemma) would substitute 'web' for 'browser' because they
didn't know 'browser' was a valid option.

Now dynamically builds the toolset list from the TOOLSETS dict at import
time, excluding blocked, composite, and platform-specific toolsets.
Auto-updates when new toolsets are added.

Reported by jeffutter on Discord.

* chore: exclude moa and rl from delegate_task toolset list
2026-04-12 00:54:35 -07:00
Teknium 3162472674 feat(tips): add 69 deeper hidden-gem tips (279 total) (#8237)
Add lesser-known power-user tips covering:
- BOOT.md gateway startup automation
- Cron script attachment for data collection pipelines
- Prefill messages for few-shot priming
- Focus topic compression (/compress <topic>)
- Terminal exit code annotations and auto-retry
- Automatic sudo password piping
- execute_code built-in helpers (json_parse, shell_quote, retry)
- File loop detection and staleness warnings
- MCP sampling and dynamic tool discovery
- Delegation heartbeat and ACP child agents (Claude Code)
- 402 auto-fallback in auxiliary client
- Container mode, HERMES_HOME_MODE, subprocess HOME isolation
- Ctrl+C 5-tier priority system
- Browser CDP URL override and stealth mode
- Skills quarantine, audit log, and well-known protocol
- Per-platform display overrides, human delay mode
- And many more deep-cut features
2026-04-12 00:54:07 -07:00
Teknium 8b9d22a74b revert: keep debian:13.4 full image instead of slim
The slim image drops packages that may be needed at runtime.
Keep the full Debian base for compatibility.
2026-04-12 00:53:16 -07:00
m0n5t3r fee0e0d35e fix(docker): run as non-root user, use virtualenv (salvage #5811)
- Add gosu for runtime privilege dropping from root to hermes user
- Support HERMES_UID/HERMES_GID env vars for host mount permission matching
- Switch to debian:13.4-slim base image
- Use uv venv instead of pip install --break-system-packages
- Pin uv and gosu multi-stage images with SHA256 digests
- Set PLAYWRIGHT_BROWSERS_PATH to /opt/hermes/.playwright so build-time
  chromium install survives the /opt/data volume mount
- Keep procps for container debugging

Based on work by m0n5t3r in PR #5811. Stripped to hardening-only
changes (non-root, virtualenv, slim base); matrix deps, fonts, xvfb,
and entrypoint playwright download deferred to follow-up.
2026-04-12 00:53:16 -07:00
bravohenry 81ac62c0e9 fix(weixin): split chatty short replies into separate bubbles, keep structured content together
Add content-aware splitting to compact mode: short chat-like exchanges
(2-6 short lines without headings/lists/quotes) get separate message
bubbles for a natural chat feel, while structured content (tables,
headings with body, numbered lists) stays in a single message.

Cherry-picked from PR #7587 by bravohenry, adapted to the compact/legacy
split_per_line architecture from #7903.
2026-04-12 00:38:07 -07:00
Teknium f53a5a7fe1 fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228)
When the agent calls process(action='wait') or process(action='poll')
and gets the exited status, the completion_queue notification is
redundant — the agent already has the output from the tool return.
Previously, the drain loops in CLI and gateway would still inject
the [SYSTEM: Background process completed] message, causing the
agent to receive the same information twice.

Fix: track session IDs in _completion_consumed set when wait/poll/log
returns an exited process. Drain loops in cli.py and gateway watcher
skip completion events for consumed sessions. Watch pattern events
are never suppressed (they have independent semantics).

Adds 4 tests covering wait/poll/log marking and running-process
negative case.
2026-04-12 00:36:22 -07:00
Teknium fdf55e0fe9 feat(cli): show random tip on new session start (#8225)
Add a 'tip of the day' feature that displays a random one-liner about
Hermes Agent features on every new session — CLI startup, /clear, /new,
and gateway /new across all messaging platforms.

- New hermes_cli/tips.py module with 210 curated tips covering slash
  commands, keybindings, CLI flags, config options, tools, gateway
  platforms, profiles, sessions, memory, skills, cron, voice, security,
  and more
- CLI: tips display in skin-aware dim gold color after the welcome line
- Gateway: tips append to the /new and /reset response on all platforms
- Fully wrapped in try/except — tips are non-critical and never break
  startup or reset

Display format (CLI):
  ✦ Tip: /btw <question> asks a quick side question without tools or history.

Display format (gateway):
   Session reset! Starting fresh.
  ✦ Tip: hermes -c resumes your most recent CLI session.
2026-04-12 00:34:01 -07:00
opriz 36f57dbc51 fix(migration): don't auto-archive OpenClaw source directory
Remove auto-archival from hermes claw migrate — not its
responsibility (hermes claw cleanup is still there for that).

Skip MESSAGING_CWD when it points inside the OpenClaw source
directory, which was the actual root cause of agent confusion
after migration. Use Path.is_relative_to() for robust path
containment check.

Salvaged from PR #8192 by opriz.
Co-authored-by: opriz <opriz@users.noreply.github.com>
2026-04-12 00:33:54 -07:00
Teknium 1871227198 feat: rebrand OpenClaw references to Hermes during migration
- Add rebrand_text() that replaces OpenClaw, Open Claw, Open-Claw,
  ClawdBot, and MoltBot with Hermes (case-insensitive, word-boundary)
- Apply rebranding to memory entries (MEMORY.md, USER.md, daily memory)
- Apply rebranding to SOUL.md and workspace instructions via new
  transform parameter on copy_file()
- Fix moldbot -> moltbot typo across codebase (claw.py, migration
  script, docs, tests)
- Add unit tests for rebrand_text and integration tests for memory
  and soul migration rebranding
2026-04-12 00:33:54 -07:00
Teknium eb2a49f95a fix: openai-codex and anthropic not appearing in /model picker for external credentials (#8224)
Users whose credentials exist only in external files — OpenAI Codex
OAuth tokens in ~/.codex/auth.json or Anthropic Claude Code credentials
in ~/.claude/.credentials.json — would not see those providers in the
/model picker, even though hermes auth and hermes model detected them.

Root cause: list_authenticated_providers() only checked the raw Hermes
auth store and env vars. External credential file fallbacks (Codex CLI
import, Claude Code file discovery) were never triggered.

Fix (three parts):
1. _seed_from_singletons() in credential_pool.py: openai-codex now
   imports from ~/.codex/auth.json when the Hermes auth store is empty,
   mirroring resolve_codex_runtime_credentials().
2. list_authenticated_providers() in model_switch.py: auth store + pool
   checks now run for ALL providers (not just OAuth auth_type), catching
   providers like anthropic that support both API key and OAuth.
3. list_authenticated_providers(): direct check for anthropic external
   credential files (Claude Code, Hermes PKCE). The credential pool
   intentionally gates anthropic behind is_provider_explicitly_configured()
   to prevent auxiliary tasks from silently consuming tokens. The /model
   picker bypasses this gate since it is discovery-oriented.
2026-04-12 00:33:42 -07:00
Teknium 73f970fa4d fix: make gateway interrupt detection resilient to monitor task failures
The interrupt mechanism for regular text messages (non-commands) during
active agent runs relied on a single async polling task
(monitor_for_interrupt) with no error handling. If this task died
silently due to an unhandled exception, stale adapter reference after
reconnect, or any other failure, user messages sent during agent
execution would be queued but never trigger an actual interrupt — the
agent would continue running until it finished naturally, then process
the queued message.

Three improvements:

1. Error handling in monitor_for_interrupt(): wrap the polling body in
   try/except so transient errors are logged and retried instead of
   silently killing the task.

2. Fresh adapter reference on each poll iteration: re-resolve
   self.adapters.get(source.platform) every 200ms instead of capturing
   the adapter once at task creation time. This prevents stale
   references after adapter reconnects.

3. Backup interrupt check in the inactivity poll loop: both the
   unlimited and timeout-enabled paths now check for pending interrupts
   every 5 seconds (the existing poll interval). Uses a shared
   _interrupt_detected asyncio.Event to avoid double-firing when the
   primary monitor already handled the interrupt. Logs at INFO level
   with monitor task state for debugging.
2026-04-12 00:25:05 -07:00
Teknium 4cadfef8e3 fix(cli): restore stacked tool progress scrollback in TUI (#8201)
The TUI transition (4970705, f83e86d) replaced stacked per-tool history
lines with a single live-updating spinner widget. While the spinner
provides a nice live timer, it removed the scrollback history that
users relied on to see what the agent did during a session.

This restores stacked tool progress lines in 'all' and 'new' modes by
printing persistent scrollback lines via _cprint() when tools complete,
in addition to the existing live spinner display.

Behavior per mode:
- off: no scrollback lines, no spinner (unchanged)
- new: scrollback line on completion, skipping consecutive same-tool repeats
- all: scrollback line on every tool completion
- verbose: no scrollback (run_agent.py handles verbose output directly)

Implementation:
- Store function_args from tool.started events in _pending_tool_info
- On tool.completed, pop stored args and format via get_cute_tool_message()
- FIFO queue per function_name handles concurrent tool execution
- 'new' mode tracks _last_scrollback_tool for dedup
- State cleared at end of agent run

Reported by community user Mr.D — the stacked history provides
transparency into what the agent is doing, which builds trust.

Addresses user report from Discord about lost tool call visibility.
2026-04-11 23:22:34 -07:00
Teknium 8e00b3a69e fix(cron): steer model away from explicit deliver targets that lose topic context (#8187)
Rewrite the cronjob tool's 'deliver' parameter description to strongly
guide models toward omitting the parameter (which auto-detects origin
including thread/topic). The previous description listed all platform
names equally, inviting models to construct explicit targets like
'telegram:<chat_id>' which silently drops the thread_id.

New description:
- Leads with 'Omit this parameter' as the recommended path
- Explicitly warns that platform:chat_id without :thread_id loses topics
- Removes the long flat list of platform names that invited construction

Also adds diagnostic logging at two key points:
- _origin_from_env(): logs when thread_id is captured during job creation
- _deliver_result(): warns when origin has thread_id but delivery target
  lost it; logs at debug when delivering to a specific thread

Helps diagnose user-reported issue where cron responses from Telegram
topics are delivered to the main chat instead of the originating topic.
2026-04-11 23:20:39 -07:00
Teknium 1ca9b19750 feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196)
On servers with broken or unreachable IPv6, Python's socket.getaddrinfo
returns AAAA records first. urllib/httpx/requests all try IPv6 connections
first and hang for the full TCP timeout before falling back to IPv4. This
affects web_extract, web_search, the OpenAI SDK, and all HTTP tools.

Adds network.force_ipv4 config option (default: false) that monkey-patches
socket.getaddrinfo to resolve as AF_INET when the caller didn't specify a
family. Falls back to full resolution if no A record exists, so pure-IPv6
hosts still work.

Applied early at all three entry points (CLI, gateway, cron scheduler)
before any HTTP clients are created.

Reported by user @29n — Chinese Ubuntu server with unreachable IPv6 causing
timeouts on lobste.rs and other IPv6-enabled sites while Google/GitHub
worked fine (IPv4-only resolution).
2026-04-11 23:12:11 -07:00
Teknium 1cec910b6a fix: improve context compaction to prevent model answering stale questions (#8107)
After compression, models (especially Kimi 2.5) would sometimes respond
to questions from the summary instead of the latest user message. This
happened ~30% of the time on Telegram.

Root cause: the summary's 'Next Steps' section read as active instructions,
and the SUMMARY_PREFIX didn't explicitly tell the model to ignore questions
in the summary. When the summary merged into the first tail message, there
was no clear separator between historical context and the actual user message.

Changes inspired by competitor analysis (Claude Code, OpenCode, Codex):

1. SUMMARY_PREFIX rewritten with explicit 'Do NOT answer questions from
   this summary — respond ONLY to the latest user message AFTER it'

2. Summarizer preamble (shared by both prompts) adds:
   - 'Do NOT respond to any questions' (from OpenCode's approach)
   - 'Different assistant' framing (from Codex) to create psychological
     distance between summary content and active conversation

3. New summary sections:
   - '## Resolved Questions' — tracks already-answered questions with
     their answers, preventing re-answering (from Claude Code's
     'Pending user asks' pattern)
   - '## Pending User Asks' — explicitly marks unanswered questions
   - '## Remaining Work' replaces '## Next Steps' — passive framing
     avoids reading as active instructions

4. merge-summary-into-tail path now inserts a clear separator:
   '--- END OF CONTEXT SUMMARY — respond to the message below ---'

5. Iterative update prompt now instructs: 'Move answered questions to
   Resolved Questions' to maintain the resolved/pending distinction
   across multiple compactions.
2026-04-11 19:43:58 -07:00
Tom Qiao 8a48c58bd3 fix(gateway): add missing RedactingFormatter import
The gateway startup path references RedactingFormatter without
importing it, causing a NameError crash when launched with a
verbosity flag (e.g. via launchd --replace).

Fixes #8044

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 19:38:05 -07:00
Teknium a0a02c1bc0 feat: /compress <focus> — guided compression with focus topic (#8017)
Adds an optional focus topic to /compress: `/compress database schema`
guides the summariser to preserve information related to the focus topic
(60-70% of summary budget) while compressing everything else more aggressively.
Inspired by Claude Code's /compact <focus>.

Changes:
- context_compressor.py: focus_topic parameter on _generate_summary() and
  compress(); appends FOCUS TOPIC guidance block to the LLM prompt
- run_agent.py: focus_topic parameter on _compress_context(), passed through
  to the compressor
- cli.py: _manual_compress() extracts focus topic from command string,
  preserves existing manual_compression_feedback integration (no regression)
- gateway/run.py: _handle_compress_command() extracts focus from event args
  and passes through — full gateway parity
- commands.py: args_hint="[focus topic]" on /compress CommandDef

Salvaged from PR #7459 (CLI /compress focus only — /context command deferred).
15 new tests across CLI, compressor, and gateway.
2026-04-11 19:23:29 -07:00
helix4u cfbfc4c3f1 fix(discord): decouple readiness from slash sync 2026-04-11 19:22:14 -07:00
Teknium fa7cd44b92 feat: add hermes backup and hermes import commands (#7997)
* feat: add `hermes backup` and `hermes import` commands

hermes backup — creates a zip of ~/.hermes/ (config, skills, sessions,
profiles, memories, skins, cron jobs, etc.) excluding the hermes-agent
codebase, __pycache__, and runtime PID files. Defaults to
~/hermes-backup-<timestamp>.zip, customizable with -o.

hermes import <zipfile> — restores from a backup zip, validating it
looks like a hermes backup before extracting. Handles .hermes/ prefix
stripping, path traversal protection, and confirmation prompts (skip
with --force).

29 tests covering exclusion rules, backup creation, import validation,
prefix detection, path traversal blocking, confirmation flow, and a
full round-trip test.

* test: improve backup/import coverage to 97%

Add 17 additional tests covering:
- _format_size helper (bytes through terabytes)
- Nonexistent hermes home error exit
- Output path is a directory (auto-names inside it)
- Output without .zip suffix (auto-appends)
- Empty hermes home (all files excluded)
- Permission errors during backup and import
- Output zip inside hermes root (skips itself)
- Not-a-zip file rejection
- EOFError and KeyboardInterrupt during confirmation
- 500+ file progress display
- Directory-only zip prefix detection

Remove dead code branch in _detect_prefix (unreachable guard).

* feat: auto-restore profile wrapper scripts on import

After extracting backup files, hermes import now scans profiles/ for
subdirectories with config.yaml or .env and recreates the ~/.local/bin
wrapper scripts so profile aliases (e.g. 'coder chat') work immediately.

Also prints guidance for re-installing gateway services per profile.

Handles edge cases:
- Skips profile dirs without config (not real profiles)
- Skips aliases that collide with existing commands
- Gracefully degrades if hermes_cli.profiles isn't available (fresh install)
- Shows PATH hint if ~/.local/bin isn't in PATH

3 new profile restoration tests (49 total).
2026-04-11 19:15:50 -07:00
Siddharth Balyan 50d86b3c71 fix(matrix): replace pickle crypto store with SQLite, fix E2EE decryption (#7981)
Fixes #7952 — Matrix E2EE completely broken after mautrix migration.

- Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's
  PgCryptoStore backed by SQLite via aiosqlite. Crypto state now
  persists reliably across restarts without fragile serialization.

- Add handle_sync() call on initial sync response so to-device events
  (queued Megolm key shares) are dispatched to OlmMachine instead of
  being silently dropped.

- Add _verify_device_keys_on_server() after loading crypto state.
  Detects missing keys (re-uploads), stale keys from migration
  (attempts re-upload), and corrupted state (refuses E2EE).

- Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy
  mautrix crypto's StateStore interface (is_encrypted,
  get_encryption_info, find_shared_rooms).

- Remove redundant share_keys() call from sync loop — OlmMachine
  already handles this via DEVICE_OTK_COUNT event handler.

- Fix datetime vs float TypeError in session.py suspend_recently_active()
  that crashed gateway startup.

- Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml.

- Update test mocks for PgCryptoStore/Database and add query_keys mock
  for key verification. 174 tests pass.

- Add E2EE upgrade/migration docs to Matrix user guide.
2026-04-12 07:24:46 +05:30
Siddharth Balyan 27eeea0555 perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014)
* perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive

SSH: symlink-staging + tar -ch piped over SSH in a single TCP stream.
Eliminates per-file scp round-trips. Handles timeout (kills both
processes), SSH Popen failure (kills tar), and tar create failure.

Modal: in-memory gzipped tar archive, base64-encoded, decoded+extracted
in one exec call. Checks exit code and raises on failure.

Both backends use shared helpers extracted into file_sync.py:
- quoted_mkdir_command() — mirrors existing quoted_rm_command()
- unique_parent_dirs() — deduplicates parent dirs from file pairs

Migrates _ensure_remote_dirs to use the new helpers.

28 new tests (21 SSH + 7 Modal), all passing.

Closes #7465
Closes #7467

* fix(modal): pipe stdin to avoid ARG_MAX, clean up review findings

- Modal bulk upload: stream base64 payload through proc.stdin in 1MB
  chunks instead of embedding in command string (Modal SDK enforces
  64KB ARG_MAX_BYTES — typical payloads are ~4.3MB)
- Modal single-file upload: same stdin fix, add exit code checking
- Remove what-narrating comments in ssh.py and modal.py (keep WHY
  comments: symlink staging rationale, SIGPIPE, deadlock avoidance)
- Remove unnecessary `sandbox = self._sandbox` alias in modal bulk
- Daytona: use shared helpers (unique_parent_dirs, quoted_mkdir_command)
  instead of inlined duplicates

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-04-12 06:18:05 +05:30
Teknium fd73937ec8 feat: component-separated logging with session context and filtering (#7991)
* feat: component-separated logging with session context and filtering

Phase 1 — Gateway log isolation:
- gateway.log now only receives records from gateway.* loggers
  (platform adapters, session management, slash commands, delivery)
- agent.log remains the catch-all (all components)
- errors.log remains WARNING+ catch-all
- Moved gateway.log handler creation from gateway/run.py into
  hermes_logging.setup_logging(mode='gateway') with _ComponentFilter

Phase 2 — Session ID injection:
- Added set_session_context(session_id) / clear_session_context() API
  using threading.local() for per-thread session tracking
- _SessionFilter enriches every log record with session_tag attribute
- Log format: '2026-04-11 10:23:45 INFO [session_id] logger.name: msg'
- Session context set at start of run_conversation() in run_agent.py
- Thread-isolated: gateway conversations on different threads don't leak

Phase 3 — Component filtering in hermes logs:
- Added --component flag: hermes logs --component gateway|agent|tools|cli|cron
- COMPONENT_PREFIXES maps component names to logger name prefixes
- Works with all existing filters (--level, --session, --since, -f)
- Logger name extraction handles both old and new log formats

Files changed:
- hermes_logging.py: _SessionFilter, _ComponentFilter, COMPONENT_PREFIXES,
  set/clear_session_context(), gateway.log creation in setup_logging()
- gateway/run.py: removed redundant gateway.log handler (now in hermes_logging)
- run_agent.py: set_session_context() at start of run_conversation()
- hermes_cli/logs.py: --component filter, logger name extraction
- hermes_cli/main.py: --component argument on logs subparser

Addresses community request for component-separated, filterable logging.
Zero changes to existing logger names — __name__ already provides hierarchy.

* fix: use LogRecord factory instead of per-handler _SessionFilter

The _SessionFilter approach required attaching a filter to every handler
we create. Any handler created outside our _add_rotating_handler (like
the gateway stderr handler, or third-party handlers) would crash with
KeyError: 'session_tag' if it used our format string.

Replace with logging.setLogRecordFactory() which injects session_tag
into every LogRecord at creation time — process-global, zero per-handler
wiring needed. The factory is installed at import time (before
setup_logging) so session_tag is available from the moment hermes_logging
is imported.

- Idempotent: marker attribute prevents double-wrapping on module reload
- Chains with existing factory: won't break third-party record factories
- Removes _SessionFilter from _add_rotating_handler and setup_verbose_logging
- Adds tests: record factory injection, idempotency, arbitrary handler compat
2026-04-11 17:23:36 -07:00
Teknium 723b5bec85 feat: per-platform display verbosity configuration (#8006)
Add display.platforms section to config.yaml for per-platform overrides of
display settings (tool_progress, show_reasoning, streaming, tool_preview_length).

Each platform gets sensible built-in defaults based on capability tier:
- High (telegram, discord): tool_progress=all, streaming follows global
- Medium (slack, mattermost, matrix, feishu): tool_progress=new
- Low (signal, whatsapp, bluebubbles, wecom, etc.): tool_progress=off, streaming=false
- Minimal (email, sms, webhook, homeassistant): tool_progress=off, streaming=false

Example config:
  display:
    platforms:
      telegram:
        tool_progress: all
        show_reasoning: true
      slack:
        tool_progress: off

Resolution order: platform override > global setting > built-in platform default.

Changes:
- New gateway/display_config.py: resolver module with tier-based platform defaults
- gateway/run.py: tool_progress, tool_preview_length, streaming, show_reasoning
  all resolve per-platform via the new resolver
- /verbose command: now cycles tool_progress per-platform (saves to
  display.platforms.<platform>.tool_progress instead of global)
- /reasoning show|hide: now saves show_reasoning per-platform
- Config version 15 -> 16: migrates tool_progress_overrides into display.platforms
- Backward compat: legacy tool_progress_overrides still read as fallback
- 27 new tests for resolver, normalization, migration, backward compat
- Updated verbose command tests for per-platform behavior

Addresses community request for per-channel verbosity control (Guillaume Meyer,
Nathan Danielsen) — high verbosity on backchannel Telegram, low on customer-facing
Slack, none on email.
2026-04-11 17:20:34 -07:00
Teknium 14ccd32cee refactor(terminal): remove check_interval parameter (#8001)
The check_interval parameter on terminal_tool sent periodic output
updates to the gateway chat, but these were display-only — the agent
couldn't see or act on them. This added schema bloat and introduced
a bug where notify_on_complete=True was silently dropped when
check_interval was also set (the not-check_interval guard skipped
fast-watcher registration, and the check_interval watcher dict
was missing the notify_on_complete key).

Removing check_interval entirely:
- Eliminates the notify_on_complete interaction bug
- Reduces tool schema size (one fewer parameter for the model)
- Simplifies the watcher registration path
- notify_on_complete (agent wake-on-completion) still works
- watch_patterns (output alerting) still works
- process(action='poll') covers manual status checking

Closes #7947 (root cause eliminated rather than patched).
2026-04-11 17:16:11 -07:00
Mateus Scheuer Macedo 06f862fa1b feat(cli): add native /model picker modal for provider → model selection
When /model is called with no arguments in the interactive CLI, open a
two-step prompt_toolkit modal instead of the previous text-only listing:

1. Provider selection — curses_single_select with all authenticated providers
2. Model selection — live API fetch with curated fallback

Also fixes:
- OpenAI Codex model normalization (openai/gpt-5.4 → gpt-5.4)
- Dedicated Codex validation path using provider_model_ids()

Preserves curses_radiolist (used by setup, tools, plugins) alongside the
new curses_single_select. Retains tool elapsed timer in spinner.

Cherry-picked from PR #7438 by MestreY0d4-Uninter.
2026-04-11 17:16:06 -07:00
Teknium 39cd57083a refactor: remove budget warning injection system (dead code)
The _get_budget_warning() method already returned None unconditionally —
the entire budget warning system was disabled. Remove all dead code:

- _BUDGET_WARNING_RE regex
- _strip_budget_warnings_from_history() function and its call site
- Both injection blocks (concurrent + sequential tool execution)
- _get_budget_warning() method
- 7 tests for the removed functions

The budget exhaustion grace call system (_budget_exhausted_injected,
_budget_grace_call) is a separate recovery mechanism and is preserved.
2026-04-11 16:56:33 -07:00
waxinz d99e2a29d6 feat: standardize message whitespace and JSON formatting
Normalize api_messages before each API call for consistent prefix
matching across turns:

1. Strip leading/trailing whitespace from system prompt parts
2. Strip leading/trailing whitespace from message content strings
3. Normalize tool-call arguments to compact sorted JSON

This enables KV cache reuse on local inference servers (llama.cpp,
vLLM, Ollama) and improves cache hit rates for cloud providers.

All normalization operates on the api_messages copy — the original
conversation history in messages is never mutated.  Tool-call JSON
normalization creates new dicts via spread to avoid the shallow-copy
mutation bug in the original PR.

Salvaged from PR #7875 by @waxinz with mutation fix.
2026-04-11 16:49:44 -07:00
Siddharth Balyan cab814af15 feat(nix): container-aware CLI — auto-route into managed container (#7543)
* feat(nix): container-aware CLI — auto-route all subcommands into managed container

When container.enable = true, the host `hermes` CLI transparently execs
every subcommand into the managed Docker/Podman container. A symlink
bridge (~/.hermes -> /var/lib/hermes/.hermes) unifies state between host
and container so sessions, config, and memories are shared.

CLI changes:
- Global routing before subcommand dispatch (all commands forwarded)
- docker exec with -u exec_user, env passthrough (TERM, COLORTERM,
  LANG, LC_ALL), TTY-aware flags
- Retry with spinner on failure (TTY: 5s, non-TTY: 10s silent)
- Hard fail instead of silent fallback
- HERMES_DEV=1 env var bypasses routing for development
- No routing messages (invisible to user)

NixOS module changes:
- container.hostUsers option: lists users who get ~/.hermes symlink
  and automatic hermes group membership
- Activation script creates symlink bridge (with backup of existing
  ~/.hermes dirs), writes exec_user to .container-mode
- Cleanup on disable: removes symlinks + .container-mode + stops service
- Warning when hostUsers set without addToSystemPackages

* fix: address review — reuse sudo var, add chown -h on symlink update

- hermes_cli/main.py: reuse the existing `sudo` variable instead of
  redundant `shutil.which("sudo")` call that could return None
- nix/nixosModules.nix: add missing `chown -h` when updating an
  existing symlink target so ownership stays consistent with the
  fresh-create and backup-replace branches

* fix: address remaining review items from cursor bugbot

- hermes_cli/main.py: move container routing BEFORE parse_args() so
  --help, unrecognised flags, and all subcommands are forwarded
  transparently into the container instead of being intercepted by
  argparse on the host (high severity)

- nix/nixosModules.nix: resolve home dirs via
  config.users.users.${user}.home instead of hardcoding /home/${user},
  supporting users with custom home directories (medium severity)

- nix/nixosModules.nix: gate hostUsers group membership on
  container.enable so setting hostUsers without container mode doesn't
  silently add users to the hermes group (low severity)

* fix: simplify container routing — execvp, no retries, let it crash

- Replace subprocess.run retry loop with os.execvp (no idle parent process)
- Extract _probe_container helper for sudo detection with 15s timeout
- Narrow exception handling: FileNotFoundError only in get_container_exec_info,
  catch TimeoutExpired specifically, remove silent except Exception: pass
- Collapse needs_sudo + sudo into single sudo_path variable
- Simplify NixOS symlink creation from 4 branches to 2
- Gate NixOS sudoers hint with "On NixOS:" prefix
- Full test rewrite: 18 tests covering execvp, sudo probe, timeout, permissions

---------

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-12 05:17:46 +05:30
Teknium 5c2ecdec49 fix: use ceiling division for token estimation, deduplicate inline formula
Switch estimate_tokens_rough(), estimate_messages_tokens_rough(), and
estimate_request_tokens_rough() from floor division (len // 4) to
ceiling division ((len + 3) // 4). Short texts (1-3 chars) previously
estimated as 0 tokens, causing the compressor and pre-flight checks to
systematically undercount when many short tool results are present.

Also replaced the inline duplicate formula in run_conversation()
(total_chars // 4) with a call to the shared
estimate_messages_tokens_rough() function.

Updated 4 tests that hardcoded floor-division expected values.

Related: issue #6217, PR #6629
2026-04-11 16:33:40 -07:00
WAXLYY 6d272ba477 fix(tools): enforce ID uniqueness in TODO store during replace operations
Deduplicate todo items by ID before writing to the store, keeping the
last occurrence. Prevents ghost entries when the model sends duplicate
IDs in a single write() call, which corrupts subsequent merge operations.

Co-authored-by: WAXLYY <WAXLYY@users.noreply.github.com>
2026-04-11 16:22:50 -07:00
asheriif 97b0cd51ee feat(gateway): surface natural mid-turn assistant messages in chat platforms
Add display.interim_assistant_messages config (enabled by default) that
forwards completed assistant commentary between tool calls to the user
as separate chat messages. Models already emit useful status text like
'I'll inspect the repo first.' — this surfaces it on Telegram, Discord,
and other messaging platforms instead of swallowing it.

Independent from tool_progress and gateway streaming. Disabled for
webhooks. Uses GatewayStreamConsumer when available, falls back to
direct adapter send. Tracks response_previewed to prevent double-delivery
when interim message matches the final response.

Also fixes: cursor not stripped from fallback prefix in stream consumer
(affected continuation calculation on no-edit platforms like Signal).

Cherry-picked from PR #7885 by asheriif, default changed to enabled.
Fixes #5016
2026-04-11 16:21:39 -07:00
Teknium 6ee0005e8c docs: expand tool-use enforcement documentation (#7984)
- Fix auto list (was only gpt, actually includes codex/gemini/gemma/grok)
- Document the three guidance layers (general, OpenAI-specific, Google-specific)
- Add 'When to turn it on' section for users on non-default models
- Clarify that substring matching is case-insensitive
2026-04-11 16:20:27 -07:00
Teknium c8aff74632 fix: prevent agent from stopping mid-task — compression floor, budget overhaul, activity tracking
Three root causes of the 'agent stops mid-task' gateway bug:

1. Compression threshold floor (64K tokens minimum)
   - The 50% threshold on a 100K-context model fired at 50K tokens,
     causing premature compression that made models lose track of
     multi-step plans.  Now threshold_tokens = max(50% * context, 64K).
   - Models with <64K context are rejected at startup with a clear error.

2. Budget warning removal — grace call instead
   - Removed the 70%/90% iteration budget warnings entirely.  These
     injected '[BUDGET WARNING: Provide your final response NOW]' into
     tool results, causing models to abandon complex tasks prematurely.
   - Now: no warnings during normal execution.  When the budget is
     actually exhausted (90/90), inject a user message asking the model
     to summarise, allow one grace API call, and only then fall back
     to _handle_max_iterations.

3. Activity touches during long terminal execution
   - _wait_for_process polls every 0.2s but never reported activity.
     The gateway's inactivity timeout (default 1800s) would fire during
     long-running commands that appeared 'idle.'
   - Now: thread-local activity callback fires every 10s during the
     poll loop, keeping the gateway's activity tracker alive.
   - Agent wires _touch_activity into the callback before each tool call.

Also: docs update noting 64K minimum context requirement.

Closes #7915 (root cause was agent-loop termination, not Weixin delivery limits).
2026-04-11 16:18:57 -07:00
Teknium 08f35076c9 fix: always log outer loop exception traceback at DEBUG level
Replace the verbose_logging-gated logging.exception() with an
unconditional logger.debug(exc_info=True). The full traceback now
always lands in agent.log when debug logging is enabled, without
requiring the verbose_logging flag or spamming the console.

Previously, production errors in the 700-line response processing
block (normalization, tool dispatch, final response handling) were
logged as one-line messages with the traceback hidden behind
verbose_logging — making post-mortem debugging difficult.
2026-04-11 15:52:07 -07:00
Teknium 289d2745af docs: add platform adapter developer guide + WeCom Callback docs (#7969)
Add the missing 'Adding a Platform Adapter' developer guide — a
comprehensive step-by-step checklist covering all 20+ integration
points (enum, adapter, config, runner, CLI, tools, toolsets, cron,
webhooks, tests, and docs). Includes common patterns for long-poll,
callback/webhook, and token-lock adapters with reference implementations.

Also adds full docs coverage for the WeCom Callback platform:
- New docs page: user-guide/messaging/wecom-callback.md
- Environment variables reference (9 WECOM_CALLBACK_* vars)
- Toolsets reference (hermes-wecom-callback)
- Messaging index (comparison table, architecture diagram, toolsets,
  security, next-steps links)
- Integrations index listing
- Sidebar entries for both new pages
2026-04-11 15:50:54 -07:00
Koichi Tsutsumi fc417ed049 fix(cli): add ChatConsole.status for /skills search 2026-04-11 15:38:43 -07:00
0xbyt4 32519066dc fix(gateway): add HERMES_SESSION_KEY to session_context contextvars
Complete the contextvars migration by adding HERMES_SESSION_KEY to the
unified _VAR_MAP in session_context.py. Without this, concurrent gateway
handlers race on os.environ["HERMES_SESSION_KEY"].

- Add _SESSION_KEY ContextVar to _VAR_MAP, set_session_vars(), clear_session_vars()
- Wire session_key through _set_session_env() from SessionContext
- Replace os.getenv fallback in tools/approval.py with get_session_env()
  (function-level import to avoid cross-layer coupling)
- Keep os.environ set as CLI/cron fallback

Cherry-picked from PR #7878 by 0xbyt4.
2026-04-11 15:35:04 -07:00
syaor4n 689c515090 feat: add --env and --preset support to hermes mcp add
- Add --env KEY=VALUE for passing environment variables to stdio MCP servers
- Add --preset for known MCP server templates (empty for now, extensible)
- Validate env var names, reject --env for HTTP servers
- Explicit --command/--url overrides preset defaults
- Remove unused getpass import

Based on PR #7936 by @syaor4n (stitch preset removed, generic infra kept).
2026-04-11 15:34:57 -07:00
Teknium 758c4ad1ef fix: remove dead hasattr checks for retry counters initialized in reset block
All retry counters (_invalid_tool_retries, _invalid_json_retries,
_empty_content_retries, _incomplete_scratchpad_retries,
_codex_incomplete_retries) are initialized to 0 at the top of
run_conversation() (lines 7566-7570). The hasattr guards added before
the reset block existed are now dead code — the attributes always exist.

Removed 7 redundant hasattr checks (5 original targets + 2 bonus for
_codex_incomplete_retries found during cleanup).
2026-04-11 15:29:15 -07:00
Teknium 000a881fcf fix: reset compression_attempts and primary_recovery_attempted on fallback activation
When _try_activate_fallback() switches to a new provider, retry_count was
reset to 0 but compression_attempts and primary_recovery_attempted were
not. This meant a fallback provider that hit context overflow would only
get the leftover compression budget from the failed primary provider,
and transport recovery was blocked because the flag was still True from
the old provider's attempt.

Reset both counters at all 5 fallback activation sites inside the retry
loop so each fallback provider gets a fresh compression budget (3 attempts)
and its own transport recovery opportunity.
2026-04-11 15:26:24 -07:00
436 changed files with 52539 additions and 4882 deletions
+1
View File
@@ -5,6 +5,7 @@
# Dependencies
node_modules
.venv
# CI/CD
.github
+9
View File
@@ -43,6 +43,15 @@
# KIMI_BASE_URL=https://api.kimi.com/coding/v1 # Default for sk-kimi- keys
# KIMI_BASE_URL=https://api.moonshot.ai/v1 # For legacy Moonshot keys
# KIMI_BASE_URL=https://api.moonshot.cn/v1 # For Moonshot China keys
# KIMI_CN_API_KEY= # Dedicated Moonshot China key
# =============================================================================
# LLM PROVIDER (Arcee AI)
# =============================================================================
# Arcee AI provides access to Trinity models (trinity-mini, trinity-large-*)
# Get an Arcee key at: https://chat.arcee.ai/
# ARCEEAI_API_KEY=
# ARCEE_BASE_URL= # Override default base URL
# =============================================================================
# LLM PROVIDER (MiniMax)
+2
View File
@@ -0,0 +1,2 @@
# Auto-generated files — collapse diffs and exclude from language stats
web/package-lock.json linguist-generated=true
+24 -6
View File
@@ -11,6 +11,7 @@ body:
**Before submitting**, please:
- [ ] Search [existing issues](https://github.com/NousResearch/hermes-agent/issues) to avoid duplicates
- [ ] Update to the latest version (`hermes update`) and confirm the bug still exists
- [ ] Run `hermes debug share` and paste the links below (see Debug Report section)
- type: textarea
id: description
@@ -82,6 +83,25 @@ body:
- Slack
- WhatsApp
- type: textarea
id: debug-report
attributes:
label: Debug Report
description: |
Run `hermes debug share` from your terminal and paste the links it prints here.
This uploads your system info, config, and recent logs to a paste service automatically.
If you're in an interactive chat session, you can also use the `/debug` slash command — it does the same thing.
If the upload fails, run `hermes debug share --local` and paste the output directly.
placeholder: |
Report https://paste.rs/abc123
agent.log https://paste.rs/def456
gateway.log https://paste.rs/ghi789
render: shell
validations:
required: true
- type: input
id: os
attributes:
@@ -97,8 +117,6 @@ body:
label: Python Version
description: Output of `python --version`
placeholder: "3.11.9"
validations:
required: true
- type: input
id: hermes-version
@@ -106,14 +124,14 @@ body:
label: Hermes Version
description: Output of `hermes version`
placeholder: "2.1.0"
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant Logs / Traceback
description: Paste any error output, traceback, or log messages. This will be auto-formatted as code.
label: Additional Logs / Traceback (optional)
description: |
The debug report above covers most logs. Use this field for any extra error output,
tracebacks, or screenshots not captured by `hermes debug share`.
render: shell
- type: textarea
@@ -71,3 +71,15 @@ body:
label: Contribution
options:
- label: I'd like to implement this myself and submit a PR
- type: textarea
id: debug-report
attributes:
label: Debug Report (optional)
description: |
If this feature request is related to a problem you're experiencing, run `hermes debug share` and paste the links here.
In an interactive chat session, you can use `/debug` instead.
This helps us understand your environment and any related logs.
placeholder: |
Report https://paste.rs/abc123
render: shell
+16 -4
View File
@@ -9,7 +9,8 @@ body:
Sorry you're having trouble! Please fill out the details below so we can help.
**Quick checks first:**
- Run `hermes doctor` and include the output below
- Run `hermes debug share` and paste the links in the Debug Report section below
- If you're in a chat session, you can use `/debug` instead — it does the same thing
- Try `hermes update` to get the latest version
- Check the [README troubleshooting section](https://github.com/NousResearch/hermes-agent#troubleshooting)
- For general questions, consider the [Nous Research Discord](https://discord.gg/NousResearch) for faster help
@@ -74,10 +75,21 @@ body:
placeholder: "2.1.0"
- type: textarea
id: doctor-output
id: debug-report
attributes:
label: Output of `hermes doctor`
description: Run `hermes doctor` and paste the full output. This will be auto-formatted.
label: Debug Report
description: |
Run `hermes debug share` from your terminal and paste the links it prints here.
This uploads your system info, config, and recent logs to a paste service automatically.
If you're in an interactive chat session, you can also use the `/debug` slash command — it does the same thing.
If the upload fails or install didn't get that far, run `hermes debug share --local` and paste the output directly.
If even that doesn't work, run `hermes doctor` and paste that output instead.
placeholder: |
Report https://paste.rs/abc123
agent.log https://paste.rs/def456
gateway.log https://paste.rs/ghi789
render: shell
- type: textarea
+73
View File
@@ -0,0 +1,73 @@
name: Contributor Attribution Check
on:
pull_request:
branches: [main]
paths:
# Only run when code files change (not docs-only PRs)
- '*.py'
- '**/*.py'
- '.github/workflows/contributor-check.yml'
permissions:
contents: read
jobs:
check-attribution:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0 # Full history needed for git log
- name: Check for unmapped contributor emails
run: |
# Get the merge base between this PR and main
MERGE_BASE=$(git merge-base origin/main HEAD)
# Find any new author emails in this PR's commits
NEW_EMAILS=$(git log ${MERGE_BASE}..HEAD --format='%ae' --no-merges | sort -u)
if [ -z "$NEW_EMAILS" ]; then
echo "No new commits to check."
exit 0
fi
# Check each email against AUTHOR_MAP in release.py
MISSING=""
while IFS= read -r email; do
# Skip teknium and bot emails
case "$email" in
*teknium*|*noreply@github.com*|*dependabot*|*github-actions*|*anthropic.com*|*cursor.com*)
continue ;;
esac
# Check if email is in AUTHOR_MAP (either as a key or matches noreply pattern)
if echo "$email" | grep -qP '\+.*@users\.noreply\.github\.com'; then
continue # GitHub noreply emails auto-resolve
fi
if ! grep -qF "\"${email}\"" scripts/release.py 2>/dev/null; then
AUTHOR=$(git log --author="$email" --format='%an' -1)
MISSING="${MISSING}\n ${email} (${AUTHOR})"
fi
done <<< "$NEW_EMAILS"
if [ -n "$MISSING" ]; then
echo ""
echo "⚠️ New contributor email(s) not in AUTHOR_MAP:"
echo -e "$MISSING"
echo ""
echo "Please add mappings to scripts/release.py AUTHOR_MAP:"
echo -e "$MISSING" | while read -r line; do
email=$(echo "$line" | sed 's/^ *//' | cut -d' ' -f1)
[ -z "$email" ] && continue
echo " \"${email}\": \"<github-username>\","
done
echo ""
echo "To find the GitHub username for an email:"
echo " gh api 'search/users?q=EMAIL+in:email' --jq '.items[0].login'"
exit 1
else
echo "✅ All contributor emails are mapped in AUTHOR_MAP."
fi
+14 -6
View File
@@ -28,24 +28,32 @@ jobs:
name: github-pages
url: ${{ steps.deploy.outputs.page_url }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/setup-node@v4
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20
cache: npm
cache-dependency-path: website/package-lock.json
- uses: actions/setup-python@v5
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: '3.11'
- name: Install PyYAML for skill extraction
run: pip install pyyaml
run: pip install pyyaml==6.0.2 httpx==0.28.1
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Build skills index (if not already present)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
if [ ! -f website/static/api/skills-index.json ]; then
python3 scripts/build_skills_index.py || echo "Skills index build failed (non-fatal)"
fi
- name: Install dependencies
run: npm ci
working-directory: website
@@ -65,10 +73,10 @@ jobs:
echo "hermes-agent.nousresearch.com" > _site/CNAME
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa # v3
with:
path: _site
- name: Deploy to GitHub Pages
id: deploy
uses: actions/deploy-pages@v4
uses: actions/deploy-pages@d6db90164ac5ed86f2b6aed7e0febac5b3c0c03e # v4
+9 -14
View File
@@ -23,21 +23,21 @@ jobs:
timeout-minutes: 60
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
submodules: recursive
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
uses: docker/setup-qemu-action@c7c53464625b32c7a7e944ae62b3e17d2b600130 # v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
# Build amd64 only so we can `load` the image for smoke testing.
# `load: true` cannot export a multi-arch manifest to the local daemon.
# The multi-arch build follows on push to main / release.
- name: Build image (amd64, smoke test)
uses: docker/build-push-action@v6
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: .
file: Dockerfile
@@ -56,36 +56,31 @@ jobs:
- name: Log in to Docker Hub
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Push multi-arch image (main branch)
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
uses: docker/build-push-action@v6
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: .
file: Dockerfile
push: true
platforms: linux/amd64,linux/arm64
tags: |
nousresearch/hermes-agent:latest
nousresearch/hermes-agent:${{ github.sha }}
tags: nousresearch/hermes-agent:latest
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Push multi-arch image (release)
if: github.event_name == 'release'
uses: docker/build-push-action@v6
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: .
file: Dockerfile
push: true
platforms: linux/amd64,linux/arm64
tags: |
nousresearch/hermes-agent:latest
nousresearch/hermes-agent:${{ github.event.release.tag_name }}
nousresearch/hermes-agent:${{ github.sha }}
tags: nousresearch/hermes-agent:${{ github.event.release.tag_name }}
cache-from: type=gha
cache-to: type=gha,mode=max
+6 -3
View File
@@ -7,13 +7,16 @@ on:
- '.github/workflows/docs-site-checks.yml'
workflow_dispatch:
permissions:
contents: read
jobs:
docs-site-checks:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/setup-node@v4
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20
cache: npm
@@ -23,7 +26,7 @@ jobs:
run: npm ci
working-directory: website
- uses: actions/setup-python@v5
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: '3.11'
+4 -1
View File
@@ -14,6 +14,9 @@ on:
- 'run_agent.py'
- 'acp_adapter/**'
permissions:
contents: read
concurrency:
group: nix-${{ github.ref }}
cancel-in-progress: true
@@ -26,7 +29,7 @@ jobs:
runs-on: ${{ matrix.os }}
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: DeterminateSystems/nix-installer-action@ef8a148080ab6020fd15196c2084a2eea5ff2d25 # v22
- uses: DeterminateSystems/magic-nix-cache-action@565684385bcd71bad329742eefe8d12f2e765b39 # v13
- name: Check flake
+101
View File
@@ -0,0 +1,101 @@
name: Build Skills Index
on:
schedule:
# Run twice daily: 6 AM and 6 PM UTC
- cron: '0 6,18 * * *'
workflow_dispatch: # Manual trigger
push:
branches: [main]
paths:
- 'scripts/build_skills_index.py'
- '.github/workflows/skills-index.yml'
permissions:
contents: read
jobs:
build-index:
# Only run on the upstream repository, not on forks
if: github.repository == 'NousResearch/hermes-agent'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: '3.11'
- name: Install dependencies
run: pip install httpx==0.28.1 pyyaml==6.0.2
- name: Build skills index
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: python scripts/build_skills_index.py
- name: Upload index artifact
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: skills-index
path: website/static/api/skills-index.json
retention-days: 7
deploy-with-index:
needs: build-index
runs-on: ubuntu-latest
permissions:
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deploy.outputs.page_url }}
# Only deploy on schedule or manual trigger (not on every push to the script)
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: skills-index
path: website/static/api/
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20
cache: npm
cache-dependency-path: website/package-lock.json
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: '3.11'
- name: Install PyYAML for skill extraction
run: pip install pyyaml==6.0.2
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Install dependencies
run: npm ci
working-directory: website
- name: Build Docusaurus
run: npm run build
working-directory: website
- name: Stage deployment
run: |
mkdir -p _site/docs
cp -r landingpage/* _site/
cp -r website/build/* _site/docs/
echo "hermes-agent.nousresearch.com" > _site/CNAME
- name: Upload artifact
uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa # v3
with:
path: _site
- name: Deploy to GitHub Pages
id: deploy
uses: actions/deploy-pages@d6db90164ac5ed86f2b6aed7e0febac5b3c0c03e # v4
+58 -2
View File
@@ -14,7 +14,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
@@ -149,6 +149,62 @@ jobs:
"
fi
# --- CI/CD workflow files modified ---
WORKFLOW_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -E '\.github/workflows/.*\.ya?ml$' || true)
if [ -n "$WORKFLOW_HITS" ]; then
FINDINGS="${FINDINGS}
### ⚠️ WARNING: CI/CD workflow files modified
Changes to workflow files can alter build pipelines, inject steps, or modify permissions. Verify no unauthorized actions or secrets access were added.
**Files:**
\`\`\`
${WORKFLOW_HITS}
\`\`\`
"
fi
# --- Dockerfile / container build files modified ---
DOCKER_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -iE '(Dockerfile|\.dockerignore|docker-compose)' || true)
if [ -n "$DOCKER_HITS" ]; then
FINDINGS="${FINDINGS}
### ⚠️ WARNING: Container build files modified
Changes to Dockerfiles or compose files can alter base images, add build steps, or expose ports. Verify base image pins and build commands.
**Files:**
\`\`\`
${DOCKER_HITS}
\`\`\`
"
fi
# --- Dependency manifest files modified ---
DEP_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -E '(pyproject\.toml|requirements.*\.txt|package\.json|Gemfile|go\.mod|Cargo\.toml)$' || true)
if [ -n "$DEP_HITS" ]; then
FINDINGS="${FINDINGS}
### ⚠️ WARNING: Dependency manifest files modified
Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.
**Files:**
\`\`\`
${DEP_HITS}
\`\`\`
"
fi
# --- GitHub Actions version unpinning (mutable tags instead of SHAs) ---
ACTIONS_UNPIN=$(echo "$DIFF" | grep -n '^\+' | grep 'uses:' | grep -v '#' | grep -E '@v[0-9]' | head -10 || true)
if [ -n "$ACTIONS_UNPIN" ]; then
FINDINGS="${FINDINGS}
### ⚠️ WARNING: GitHub Actions with mutable version tags
Actions should be pinned to full commit SHAs (not \`@v4\`, \`@v5\`). Mutable tags can be retargeted silently if a maintainer account is compromised.
**Matches:**
\`\`\`
${ACTIONS_UNPIN}
\`\`\`
"
fi
# --- Output results ---
if [ -n "$FINDINGS" ]; then
echo "found=true" >> "$GITHUB_OUTPUT"
@@ -183,7 +239,7 @@ jobs:
---
*Automated scan triggered by [supply-chain-audit](/.github/workflows/supply-chain-audit.yml). If this is a false positive, a maintainer can approve after manual review.*"
gh pr comment "${{ github.event.pull_request.number }}" --body "$BODY"
gh pr comment "${{ github.event.pull_request.number }}" --body "$BODY" || echo "::warning::Could not post PR comment (expected for fork PRs — GITHUB_TOKEN is read-only)"
- name: Fail on critical findings
if: steps.scan.outputs.critical == 'true'
+7 -4
View File
@@ -6,6 +6,9 @@ on:
pull_request:
branches: [main]
permissions:
contents: read
# Cancel in-progress runs for the same PR/branch
concurrency:
group: tests-${{ github.ref }}
@@ -17,13 +20,13 @@ jobs:
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Install system dependencies
run: sudo apt-get update && sudo apt-get install -y ripgrep
- name: Install uv
uses: astral-sh/setup-uv@v5
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
- name: Set up Python 3.11
run: uv python install 3.11
@@ -49,10 +52,10 @@ jobs:
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Install uv
uses: astral-sh/setup-uv@v5
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
- name: Set up Python 3.11
run: uv python install 3.11
+4
View File
@@ -51,6 +51,9 @@ ignored/
.worktrees/
environments/benchmarks/evals/
# Web UI build output
hermes_cli/web_dist/
# Release script temp files
.release_notes.md
mini-swe-agent/
@@ -58,3 +61,4 @@ mini-swe-agent/
# Nix
.direnv/
result
website/static/api/skills-index.json
+107
View File
@@ -0,0 +1,107 @@
# .mailmap — canonical author mapping for git shortlog / git log / GitHub
# Format: Canonical Name <canonical@email> <commit@email>
# See: https://git-scm.com/docs/gitmailmap
#
# This maps commit emails to GitHub noreply addresses so that:
# 1. `git shortlog -sn` shows deduplicated contributor counts
# 2. GitHub's contributor graph can attribute commits correctly
# 3. Contributors with personal/work emails get proper credit
#
# When adding entries: use the contributor's GitHub noreply email as canonical
# so GitHub can link commits to their profile.
# === Teknium (multiple emails) ===
Teknium <127238744+teknium1@users.noreply.github.com> <teknium1@gmail.com>
Teknium <127238744+teknium1@users.noreply.github.com> <teknium@nousresearch.com>
# === Contributors — personal/work emails mapped to GitHub noreply ===
# Format: Canonical Name <GH-noreply> <commit-email>
# Verified via GH API email search
luyao618 <364939526@qq.com> <364939526@qq.com>
ethernet8023 <arilotter@gmail.com> <arilotter@gmail.com>
nicoloboschi <boschi1997@gmail.com> <boschi1997@gmail.com>
cherifya <chef.ya@gmail.com> <chef.ya@gmail.com>
BongSuCHOI <chlqhdtn98@gmail.com> <chlqhdtn98@gmail.com>
dsocolobsky <dsocolobsky@gmail.com> <dsocolobsky@gmail.com>
pefontana <fontana.pedro93@gmail.com> <fontana.pedro93@gmail.com>
Helmi <frank@helmschrott.de> <frank@helmschrott.de>
hata1234 <hata1234@gmail.com> <hata1234@gmail.com>
# Verified via PR investigation / salvage PR bodies
DeployFaith <agents@kylefrench.dev> <agents@kylefrench.dev>
flobo3 <floptopbot33@gmail.com> <floptopbot33@gmail.com>
gaixianggeng <gaixg94@gmail.com> <gaixg94@gmail.com>
KUSH42 <xush@xush.org> <xush@xush.org>
konsisumer <der@konsi.org> <der@konsi.org>
WorldInnovationsDepartment <vorvul.danylo@gmail.com> <vorvul.danylo@gmail.com>
m0n5t3r <iacobs@m0n5t3r.info> <iacobs@m0n5t3r.info>
sprmn24 <oncuevtv@gmail.com> <oncuevtv@gmail.com>
fancydirty <fancydirty@gmail.com> <fancydirty@gmail.com>
fxfitz <francis.x.fitzpatrick@gmail.com> <francis.x.fitzpatrick@gmail.com>
limars874 <limars874@gmail.com> <limars874@gmail.com>
AaronWong1999 <aaronwong1999@icloud.com> <aaronwong1999@icloud.com>
dippwho <dipp.who@gmail.com> <dipp.who@gmail.com>
duerzy <duerzy@gmail.com> <duerzy@gmail.com>
geoffwellman <geoff.wellman@gmail.com> <geoff.wellman@gmail.com>
hcshen0111 <shenhaocheng19990111@gmail.com> <shenhaocheng19990111@gmail.com>
jamesarch <han.shan@live.cn> <han.shan@live.cn>
stephenschoettler <stephenschoettler@gmail.com> <stephenschoettler@gmail.com>
Tranquil-Flow <tranquil_flow@protonmail.com> <tranquil_flow@protonmail.com>
Dusk1e <yusufalweshdemir@gmail.com> <yusufalweshdemir@gmail.com>
Awsh1 <ysfalweshcan@gmail.com> <ysfalweshcan@gmail.com>
WAXLYY <ysfwaxlycan@gmail.com> <ysfwaxlycan@gmail.com>
donrhmexe <don.rhm@gmail.com> <don.rhm@gmail.com>
hqhq1025 <1506751656@qq.com> <1506751656@qq.com>
BlackishGreen33 <s5460703@gmail.com> <s5460703@gmail.com>
tomqiaozc <zqiao@microsoft.com> <zqiao@microsoft.com>
MagicRay1217 <mingjwan@microsoft.com> <mingjwan@microsoft.com>
aaronagent <1115117931@qq.com> <1115117931@qq.com>
YoungYang963 <young@YoungdeMacBook-Pro.local> <young@YoungdeMacBook-Pro.local>
LongOddCode <haolong@microsoft.com> <haolong@microsoft.com>
Cafexss <coffeemjj@gmail.com> <coffeemjj@gmail.com>
Cygra <sjtuwbh@gmail.com> <sjtuwbh@gmail.com>
DomGrieco <dgrieco@redhat.com> <dgrieco@redhat.com>
# Duplicate email mapping (same person, multiple emails)
Sertug17 <104278804+Sertug17@users.noreply.github.com> <srhtsrht17@gmail.com>
yyovil <birdiegyal@gmail.com> <tanishq231003@gmail.com>
DomGrieco <dgrieco@redhat.com> <dgrieco@redhat.com>
dsocolobsky <dsocolobsky@gmail.com> <dylan.socolobsky@lambdaclass.com>
olafthiele <programming@olafthiele.com> <olafthiele@gmail.com>
# Verified via git display name matching GH contributor username
cokemine <aptx4561@gmail.com> <aptx4561@gmail.com>
dalianmao000 <dalianmao0107@gmail.com> <dalianmao0107@gmail.com>
emozilla <emozilla@nousresearch.com> <emozilla@nousresearch.com>
jjovalle99 <juan.ovalle@mistral.ai> <juan.ovalle@mistral.ai>
kagura-agent <kagura.chen28@gmail.com> <kagura.chen28@gmail.com>
spniyant <niyant@spicefi.xyz> <niyant@spicefi.xyz>
olafthiele <programming@olafthiele.com> <programming@olafthiele.com>
r266-tech <r2668940489@gmail.com> <r2668940489@gmail.com>
xingkongliang <tianliangjay@gmail.com> <tianliangjay@gmail.com>
win4r <win4r@outlook.com> <win4r@outlook.com>
zhouboli <zhouboli@gmail.com> <zhouboli@gmail.com>
yongtenglei <yongtenglei@gmail.com> <yongtenglei@gmail.com>
# Nous Research team
benbarclay <ben@nousresearch.com> <ben@nousresearch.com>
jquesnelle <jonny@nousresearch.com> <jonny@nousresearch.com>
# GH contributor list verified
spideystreet <dhicham.pro@gmail.com> <dhicham.pro@gmail.com>
dorukardahan <dorukardahan@hotmail.com> <dorukardahan@hotmail.com>
MustafaKara7 <karamusti912@gmail.com> <karamusti912@gmail.com>
Hmbown <hmbown@gmail.com> <hmbown@gmail.com>
kamil-gwozdz <kamil@gwozdz.me> <kamil@gwozdz.me>
kira-ariaki <kira@ariaki.me> <kira@ariaki.me>
knopki <knopki@duck.com> <knopki@duck.com>
Unayung <unayung@gmail.com> <unayung@gmail.com>
SeeYangZhi <yangzhi.see@gmail.com> <yangzhi.see@gmail.com>
Julientalbot <julien.talbot@ergonomia.re> <julien.talbot@ergonomia.re>
lesterli <lisicheng168@gmail.com> <lisicheng168@gmail.com>
JiayuuWang <jiayuw794@gmail.com> <jiayuw794@gmail.com>
tesseracttars-creator <tesseracttars@gmail.com> <tesseracttars@gmail.com>
xinbenlv <zzn+pa@zzn.im> <zzn+pa@zzn.im>
SaulJWu <saul.jj.wu@gmail.com> <saul.jj.wu@gmail.com>
angelos <angelos@oikos.lan.home.malaiwah.com> <angelos@oikos.lan.home.malaiwah.com>
+4 -3
View File
@@ -55,7 +55,7 @@ hermes-agent/
├── gateway/ # Messaging platform gateway
│ ├── run.py # Main loop, slash commands, message dispatch
│ ├── session.py # SessionStore — conversation persistence
│ └── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal
│ └── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal, qqbot
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains integration)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── environments/ # RL training environments (Atropos)
@@ -351,8 +351,9 @@ Cache-breaking forces dramatically higher costs. The ONLY time we alter context
### Background Process Notifications (Gateway)
When `terminal(background=true, check_interval=...)` is used, the gateway runs a watcher that
pushes status updates to the user's chat. Control verbosity with `display.background_process_notifications`
When `terminal(background=true, notify_on_complete=true)` is used, the gateway runs a watcher that
detects process completion and triggers a new agent turn. Control verbosity of background process
messages with `display.background_process_notifications`
in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
- `all` — running-output updates + final message (default)
+23 -6
View File
@@ -1,27 +1,44 @@
FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df22866bd7857e5d304b67a564f4feab6ac22044dde719b AS uv_source
FROM tianon/gosu:1.19-trixie@sha256:3b176695959c71e123eb390d427efc665eeb561b1540e82679c15e992006b8b9 AS gosu_source
FROM debian:13.4
# Disable Python stdout buffering to ensure logs are printed immediately
ENV PYTHONUNBUFFERED=1
# Store Playwright browsers outside the volume mount so the build-time
# install survives the /opt/data volume overlay at runtime.
ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright
# Install system dependencies in one layer, clear APT cache
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev procps && \
build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git && \
rm -rf /var/lib/apt/lists/*
# Non-root user for runtime; UID can be overridden via HERMES_UID at runtime
RUN useradd -u 10000 -m -d /opt/data hermes
COPY --chmod=0755 --from=gosu_source /gosu /usr/local/bin/
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/
COPY . /opt/hermes
WORKDIR /opt/hermes
# Install Python and Node dependencies in one layer, no cache
RUN pip install --no-cache-dir uv --break-system-packages && \
uv pip install --system --break-system-packages --no-cache -e ".[all]" && \
npm install --prefer-offline --no-audit && \
# Install Node dependencies and Playwright as root (--with-deps needs apt)
RUN npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
cd /opt/hermes/scripts/whatsapp-bridge && \
npm install --prefer-offline --no-audit && \
npm cache clean --force
WORKDIR /opt/hermes
# Hand ownership to hermes user, then install Python deps in a virtualenv
RUN chown -R hermes:hermes /opt/hermes
USER hermes
RUN uv venv && \
uv pip install --no-cache-dir -e ".[all]"
USER root
RUN chmod +x /opt/hermes/docker/entrypoint.sh
ENV HERMES_HOME=/opt/data
+2 -1
View File
@@ -13,7 +13,7 @@
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
@@ -167,6 +167,7 @@ python -m pytest tests/ -q
- 📚 [Skills Hub](https://agentskills.io)
- 🐛 [Issues](https://github.com/NousResearch/hermes-agent/issues)
- 💡 [Discussions](https://github.com/NousResearch/hermes-agent/discussions)
- 🔌 [HermesClaw](https://github.com/AaronWong1999/hermesclaw) — Community WeChat bridge: Run Hermes Agent and OpenClaw on the same WeChat account.
---
+329
View File
@@ -0,0 +1,329 @@
# Hermes Agent v0.9.0 (v2026.4.13)
**Release Date:** April 13, 2026
**Since v0.8.0:** 487 commits · 269 merged PRs · 167 resolved issues · 493 files changed · 63,281 insertions · 24 contributors
> The everywhere release — Hermes goes mobile with Termux/Android, adds iMessage and WeChat, ships Fast Mode for OpenAI and Anthropic, introduces background process monitoring, launches a local web dashboard for managing your agent, and delivers the deepest security hardening pass yet across 16 supported platforms.
---
## ✨ Highlights
- **Local Web Dashboard** — A new browser-based dashboard for managing your Hermes Agent locally. Configure settings, monitor sessions, browse skills, and manage your gateway — all from a clean web interface without touching config files or the terminal. The easiest way to get started with Hermes.
- **Fast Mode (`/fast`)** — Priority processing for OpenAI and Anthropic models. Toggle `/fast` to route through priority queues for significantly lower latency on supported models (GPT-5.4, Codex, Claude). Expands across all OpenAI Priority Processing models and Anthropic's fast tier. ([#6875](https://github.com/NousResearch/hermes-agent/pull/6875), [#6960](https://github.com/NousResearch/hermes-agent/pull/6960), [#7037](https://github.com/NousResearch/hermes-agent/pull/7037))
- **iMessage via BlueBubbles** — Full iMessage integration through BlueBubbles, bringing Hermes to Apple's messaging ecosystem. Auto-webhook registration, setup wizard integration, and crash resilience. ([#6437](https://github.com/NousResearch/hermes-agent/pull/6437), [#6460](https://github.com/NousResearch/hermes-agent/pull/6460), [#6494](https://github.com/NousResearch/hermes-agent/pull/6494))
- **WeChat (Weixin) & WeCom Callback Mode** — Native WeChat support via iLink Bot API and a new WeCom callback-mode adapter for self-built enterprise apps. Streaming cursor, media uploads, markdown link handling, and atomic state persistence. Hermes now covers the Chinese messaging ecosystem end-to-end. ([#7166](https://github.com/NousResearch/hermes-agent/pull/7166), [#7943](https://github.com/NousResearch/hermes-agent/pull/7943))
- **Termux / Android Support** — Run Hermes natively on Android via Termux. Adapted install paths, TUI optimizations for mobile screens, voice backend support, and the `/image` command work on-device. ([#6834](https://github.com/NousResearch/hermes-agent/pull/6834))
- **Background Process Monitoring (`watch_patterns`)** — Set patterns to watch for in background process output and get notified in real-time when they match. Monitor for errors, wait for specific events ("listening on port"), or watch build logs — all without polling. ([#7635](https://github.com/NousResearch/hermes-agent/pull/7635))
- **Native xAI & Xiaomi MiMo Providers** — First-class provider support for xAI (Grok) and Xiaomi MiMo, with direct API access, model catalogs, and setup wizard integration. Plus Qwen OAuth with portal request support. ([#7372](https://github.com/NousResearch/hermes-agent/pull/7372), [#7855](https://github.com/NousResearch/hermes-agent/pull/7855))
- **Pluggable Context Engine** — Context management is now a pluggable slot via `hermes plugins`. Swap in custom context engines that control what the agent sees each turn — filtering, summarization, or domain-specific context injection. ([#7464](https://github.com/NousResearch/hermes-agent/pull/7464))
- **Unified Proxy Support** — SOCKS proxy, `DISCORD_PROXY`, and system proxy auto-detection across all gateway platforms. Hermes behind corporate firewalls just works. ([#6814](https://github.com/NousResearch/hermes-agent/pull/6814))
- **Comprehensive Security Hardening** — Path traversal protection in checkpoint manager, shell injection neutralization in sandbox writes, SSRF redirect guards in Slack image uploads, Twilio webhook signature validation (SMS RCE fix), API server auth enforcement, git argument injection prevention, and approval button authorization. ([#7933](https://github.com/NousResearch/hermes-agent/pull/7933), [#7944](https://github.com/NousResearch/hermes-agent/pull/7944), [#7940](https://github.com/NousResearch/hermes-agent/pull/7940), [#7151](https://github.com/NousResearch/hermes-agent/pull/7151), [#7156](https://github.com/NousResearch/hermes-agent/pull/7156))
- **`hermes backup` & `hermes import`** — Full backup and restore of your Hermes configuration, sessions, skills, and memory. Migrate between machines or create snapshots before major changes. ([#7997](https://github.com/NousResearch/hermes-agent/pull/7997))
- **16 Supported Platforms** — With BlueBubbles (iMessage) and WeChat joining Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, SMS, DingTalk, Feishu, WeCom, Mattermost, Home Assistant, and Webhooks, Hermes now runs on 16 messaging platforms out of the box.
- **`/debug` & `hermes debug share`** — New debugging toolkit: `/debug` slash command across all platforms for quick diagnostics, plus `hermes debug share` to upload a full debug report to a pastebin for easy sharing when troubleshooting. ([#8681](https://github.com/NousResearch/hermes-agent/pull/8681))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
- **Native xAI (Grok) provider** with direct API access and model catalog ([#7372](https://github.com/NousResearch/hermes-agent/pull/7372))
- **Xiaomi MiMo as first-class provider** — setup wizard, model catalog, empty response recovery ([#7855](https://github.com/NousResearch/hermes-agent/pull/7855))
- **Qwen OAuth provider** with portal request support ([#6282](https://github.com/NousResearch/hermes-agent/pull/6282))
- **Fast Mode** — `/fast` toggle for OpenAI Priority Processing + Anthropic fast tier ([#6875](https://github.com/NousResearch/hermes-agent/pull/6875), [#6960](https://github.com/NousResearch/hermes-agent/pull/6960), [#7037](https://github.com/NousResearch/hermes-agent/pull/7037))
- **Structured API error classification** for smart failover decisions ([#6514](https://github.com/NousResearch/hermes-agent/pull/6514))
- **Rate limit header capture** shown in `/usage` ([#6541](https://github.com/NousResearch/hermes-agent/pull/6541))
- **API server model name** derived from profile name ([#6857](https://github.com/NousResearch/hermes-agent/pull/6857))
- **Custom providers** now included in `/model` listings and resolution ([#7088](https://github.com/NousResearch/hermes-agent/pull/7088))
- **Fallback provider activation** on repeated empty responses with user-visible status ([#7505](https://github.com/NousResearch/hermes-agent/pull/7505))
- **OpenRouter variant tags** (`:free`, `:extended`, `:fast`) preserved during model switch ([#6383](https://github.com/NousResearch/hermes-agent/pull/6383))
- **Credential exhaustion TTL** reduced from 24 hours to 1 hour ([#6504](https://github.com/NousResearch/hermes-agent/pull/6504))
- **OAuth credential lifecycle** hardening — stale pool keys, auth.json sync, Codex CLI race fixes ([#6874](https://github.com/NousResearch/hermes-agent/pull/6874))
- Empty response recovery for reasoning models (MiMo, Qwen, GLM) ([#8609](https://github.com/NousResearch/hermes-agent/pull/8609))
- MiniMax context lengths, thinking guard, endpoint corrections ([#6082](https://github.com/NousResearch/hermes-agent/pull/6082), [#7126](https://github.com/NousResearch/hermes-agent/pull/7126))
- Z.AI endpoint auto-detect via probe and cache ([#5763](https://github.com/NousResearch/hermes-agent/pull/5763))
### Agent Loop & Conversation
- **Pluggable context engine slot** via `hermes plugins` ([#7464](https://github.com/NousResearch/hermes-agent/pull/7464))
- **Background process monitoring** — `watch_patterns` for real-time output alerts ([#7635](https://github.com/NousResearch/hermes-agent/pull/7635))
- **Improved context compression** — higher limits, tool tracking, degradation warnings, token-budget tail protection ([#6395](https://github.com/NousResearch/hermes-agent/pull/6395), [#6453](https://github.com/NousResearch/hermes-agent/pull/6453))
- **`/compress <focus>`** — guided compression with a focus topic ([#8017](https://github.com/NousResearch/hermes-agent/pull/8017))
- **Tiered context pressure warnings** with gateway dedup ([#6411](https://github.com/NousResearch/hermes-agent/pull/6411))
- **Staged inactivity warning** before timeout escalation ([#6387](https://github.com/NousResearch/hermes-agent/pull/6387))
- **Prevent agent from stopping mid-task** — compression floor, budget overhaul, activity tracking ([#7983](https://github.com/NousResearch/hermes-agent/pull/7983))
- **Propagate child activity to parent** during `delegate_task` ([#7295](https://github.com/NousResearch/hermes-agent/pull/7295))
- **Truncated streaming tool call detection** before execution ([#6847](https://github.com/NousResearch/hermes-agent/pull/6847))
- Empty response retry (3 attempts with nudge) ([#6488](https://github.com/NousResearch/hermes-agent/pull/6488))
- Adaptive streaming backoff + cursor strip to prevent message truncation ([#7683](https://github.com/NousResearch/hermes-agent/pull/7683))
- Compression uses live session model instead of stale persisted config ([#8258](https://github.com/NousResearch/hermes-agent/pull/8258))
- Strip `<thought>` tags from Gemma 4 responses ([#8562](https://github.com/NousResearch/hermes-agent/pull/8562))
- Prevent `<think>` in prose from suppressing response output ([#6968](https://github.com/NousResearch/hermes-agent/pull/6968))
- Turn-exit diagnostic logging to agent loop ([#6549](https://github.com/NousResearch/hermes-agent/pull/6549))
- Scope tool interrupt signal per-thread to prevent cross-session leaks ([#7930](https://github.com/NousResearch/hermes-agent/pull/7930))
### Memory & Sessions
- **Hindsight memory plugin** — feature parity, setup wizard, config improvements — @nicoloboschi ([#6428](https://github.com/NousResearch/hermes-agent/pull/6428))
- **Honcho** — opt-in `initOnSessionStart` for tools mode — @Kathie-yu ([#6995](https://github.com/NousResearch/hermes-agent/pull/6995))
- Orphan children instead of cascade-deleting in prune/delete ([#6513](https://github.com/NousResearch/hermes-agent/pull/6513))
- Doctor command only checks the active memory provider ([#6285](https://github.com/NousResearch/hermes-agent/pull/6285))
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **BlueBubbles (iMessage)** — full adapter with auto-webhook registration, setup wizard, and crash resilience ([#6437](https://github.com/NousResearch/hermes-agent/pull/6437), [#6460](https://github.com/NousResearch/hermes-agent/pull/6460), [#6494](https://github.com/NousResearch/hermes-agent/pull/6494), [#7107](https://github.com/NousResearch/hermes-agent/pull/7107))
- **Weixin (WeChat)** — native support via iLink Bot API with streaming, media uploads, markdown links ([#7166](https://github.com/NousResearch/hermes-agent/pull/7166), [#8665](https://github.com/NousResearch/hermes-agent/pull/8665))
- **WeCom Callback Mode** — self-built enterprise app adapter with atomic state persistence ([#7943](https://github.com/NousResearch/hermes-agent/pull/7943), [#7928](https://github.com/NousResearch/hermes-agent/pull/7928))
### Discord
- **Allowed channels whitelist** config — @jarvis-phw ([#7044](https://github.com/NousResearch/hermes-agent/pull/7044))
- **Forum channel topic inheritance** in thread sessions — @hermes-agent-dhabibi ([#6377](https://github.com/NousResearch/hermes-agent/pull/6377))
- **DISCORD_REPLY_TO_MODE** setting ([#6333](https://github.com/NousResearch/hermes-agent/pull/6333))
- Accept `.log` attachments, raise document size limit — @kira-ariaki ([#6467](https://github.com/NousResearch/hermes-agent/pull/6467))
- Decouple readiness from slash sync ([#8016](https://github.com/NousResearch/hermes-agent/pull/8016))
### Slack
- **Consolidated Slack improvements** — 7 community PRs salvaged into one ([#6809](https://github.com/NousResearch/hermes-agent/pull/6809))
- Handle assistant thread lifecycle events ([#6433](https://github.com/NousResearch/hermes-agent/pull/6433))
### Matrix
- **Migrated from matrix-nio to mautrix-python** ([#7518](https://github.com/NousResearch/hermes-agent/pull/7518))
- SQLite crypto store replacing pickle (fixes E2EE decryption) — @alt-glitch ([#7981](https://github.com/NousResearch/hermes-agent/pull/7981))
- Cross-signing recovery key verification for E2EE migration ([#8282](https://github.com/NousResearch/hermes-agent/pull/8282))
- DM mention threads + group chat events for Feishu ([#7423](https://github.com/NousResearch/hermes-agent/pull/7423))
### Gateway Core
- **Unified proxy support** — SOCKS, DISCORD_PROXY, multi-platform with macOS auto-detection ([#6814](https://github.com/NousResearch/hermes-agent/pull/6814))
- **Inbound text batching** for Discord, Matrix, WeCom + adaptive delay ([#6979](https://github.com/NousResearch/hermes-agent/pull/6979))
- **Surface natural mid-turn assistant messages** in chat platforms ([#7978](https://github.com/NousResearch/hermes-agent/pull/7978))
- **WSL-aware gateway** with smart systemd detection ([#7510](https://github.com/NousResearch/hermes-agent/pull/7510))
- **All missing platforms added to setup wizard** ([#7949](https://github.com/NousResearch/hermes-agent/pull/7949))
- **Per-platform `tool_progress` overrides** ([#6348](https://github.com/NousResearch/hermes-agent/pull/6348))
- **Configurable 'still working' notification interval** ([#8572](https://github.com/NousResearch/hermes-agent/pull/8572))
- `/model` switch persists across messages ([#7081](https://github.com/NousResearch/hermes-agent/pull/7081))
- `/usage` shows rate limits, cost, and token details between turns ([#7038](https://github.com/NousResearch/hermes-agent/pull/7038))
- Drain in-flight work before restart ([#7503](https://github.com/NousResearch/hermes-agent/pull/7503))
- Don't evict cached agent on failed runs — prevents MCP restart loop ([#7539](https://github.com/NousResearch/hermes-agent/pull/7539))
- Replace `os.environ` session state with `contextvars` ([#7454](https://github.com/NousResearch/hermes-agent/pull/7454))
- Derive channel directory platforms from enum instead of hardcoded list ([#7450](https://github.com/NousResearch/hermes-agent/pull/7450))
- Validate image downloads before caching (cross-platform) ([#7125](https://github.com/NousResearch/hermes-agent/pull/7125))
- Cross-platform webhook delivery for all platforms ([#7095](https://github.com/NousResearch/hermes-agent/pull/7095))
- Cron Discord thread_id delivery support ([#7106](https://github.com/NousResearch/hermes-agent/pull/7106))
- Feishu QR-based bot onboarding ([#8570](https://github.com/NousResearch/hermes-agent/pull/8570))
- Gateway status scoped to active profile ([#7951](https://github.com/NousResearch/hermes-agent/pull/7951))
- Prevent background process notifications from triggering false pairing requests ([#6434](https://github.com/NousResearch/hermes-agent/pull/6434))
---
## 🖥️ CLI & User Experience
### Interactive CLI
- **Termux / Android support** — adapted install paths, TUI, voice, `/image` ([#6834](https://github.com/NousResearch/hermes-agent/pull/6834))
- **Native `/model` picker modal** for provider → model selection ([#8003](https://github.com/NousResearch/hermes-agent/pull/8003))
- **Live per-tool elapsed timer** restored in TUI spinner ([#7359](https://github.com/NousResearch/hermes-agent/pull/7359))
- **Stacked tool progress scrollback** in TUI ([#8201](https://github.com/NousResearch/hermes-agent/pull/8201))
- **Random tips on new session start** (CLI + gateway, 279 tips) ([#8225](https://github.com/NousResearch/hermes-agent/pull/8225), [#8237](https://github.com/NousResearch/hermes-agent/pull/8237))
- **`hermes dump`** — copy-pasteable setup summary for debugging ([#6550](https://github.com/NousResearch/hermes-agent/pull/6550))
- **`hermes backup` / `hermes import`** — full config backup and restore ([#7997](https://github.com/NousResearch/hermes-agent/pull/7997))
- **WSL environment hint** in system prompt ([#8285](https://github.com/NousResearch/hermes-agent/pull/8285))
- **Profile creation UX** — seed SOUL.md + credential warning ([#8553](https://github.com/NousResearch/hermes-agent/pull/8553))
- Shell-aware sudo detection, empty password support ([#6517](https://github.com/NousResearch/hermes-agent/pull/6517))
- Flush stdin after curses/terminal menus to prevent escape sequence leakage ([#7167](https://github.com/NousResearch/hermes-agent/pull/7167))
- Handle broken stdin in prompt_toolkit startup ([#8560](https://github.com/NousResearch/hermes-agent/pull/8560))
### Setup & Configuration
- **Per-platform display verbosity** configuration ([#8006](https://github.com/NousResearch/hermes-agent/pull/8006))
- **Component-separated logging** with session context and filtering ([#7991](https://github.com/NousResearch/hermes-agent/pull/7991))
- **`network.force_ipv4`** config to fix IPv6 timeout issues ([#8196](https://github.com/NousResearch/hermes-agent/pull/8196))
- **Standardize message whitespace and JSON formatting** ([#7988](https://github.com/NousResearch/hermes-agent/pull/7988))
- **Rebrand OpenClaw → Hermes** during migration ([#8210](https://github.com/NousResearch/hermes-agent/pull/8210))
- Config.yaml takes priority over env vars for auxiliary settings ([#7889](https://github.com/NousResearch/hermes-agent/pull/7889))
- Harden setup provider flows + live OpenRouter catalog refresh ([#7078](https://github.com/NousResearch/hermes-agent/pull/7078))
- Normalize reasoning effort ordering across all surfaces ([#6804](https://github.com/NousResearch/hermes-agent/pull/6804))
- Remove dead `LLM_MODEL` env var + migration to clear stale entries ([#6543](https://github.com/NousResearch/hermes-agent/pull/6543))
- Remove `/prompt` slash command — prefix expansion footgun ([#6752](https://github.com/NousResearch/hermes-agent/pull/6752))
- `HERMES_HOME_MODE` env var to override permissions — @ygd58 ([#6993](https://github.com/NousResearch/hermes-agent/pull/6993))
- Fall back to default model when model config is empty ([#8303](https://github.com/NousResearch/hermes-agent/pull/8303))
- Warn when compression model context is too small ([#7894](https://github.com/NousResearch/hermes-agent/pull/7894))
---
## 🔧 Tool System
### Environments & Execution
- **Unified spawn-per-call execution layer** for environments ([#6343](https://github.com/NousResearch/hermes-agent/pull/6343))
- **Unified file sync** with mtime tracking, deletion, and transactional state ([#7087](https://github.com/NousResearch/hermes-agent/pull/7087))
- **Persistent sandbox envs** survive between turns ([#6412](https://github.com/NousResearch/hermes-agent/pull/6412))
- **Bulk file sync** via tar pipe for SSH/Modal backends — @alt-glitch ([#8014](https://github.com/NousResearch/hermes-agent/pull/8014))
- **Daytona** — bulk upload, config bridge, silent disk cap ([#7538](https://github.com/NousResearch/hermes-agent/pull/7538))
- Foreground timeout cap to prevent session deadlocks ([#7082](https://github.com/NousResearch/hermes-agent/pull/7082))
- Guard invalid command values ([#6417](https://github.com/NousResearch/hermes-agent/pull/6417))
### MCP
- **`hermes mcp add --env` and `--preset`** support ([#7970](https://github.com/NousResearch/hermes-agent/pull/7970))
- Combine `content` and `structuredContent` when both present ([#7118](https://github.com/NousResearch/hermes-agent/pull/7118))
- MCP tool name deconfliction fixes ([#7654](https://github.com/NousResearch/hermes-agent/pull/7654))
### Browser
- Browser hardening — dead code removal, caching, scroll perf, security, thread safety ([#7354](https://github.com/NousResearch/hermes-agent/pull/7354))
- `/browser connect` auto-launch uses dedicated Chrome profile dir ([#6821](https://github.com/NousResearch/hermes-agent/pull/6821))
- Reap orphaned browser sessions on startup ([#7931](https://github.com/NousResearch/hermes-agent/pull/7931))
### Voice & Vision
- **Voxtral TTS provider** (Mistral AI) ([#7653](https://github.com/NousResearch/hermes-agent/pull/7653))
- **TTS speed support** for Edge TTS, OpenAI TTS, MiniMax ([#8666](https://github.com/NousResearch/hermes-agent/pull/8666))
- **Vision auto-resize** for oversized images, raise limit to 20 MB, retry-on-failure ([#7883](https://github.com/NousResearch/hermes-agent/pull/7883), [#7902](https://github.com/NousResearch/hermes-agent/pull/7902))
- STT provider-model mismatch fix (whisper-1 vs faster-whisper) ([#7113](https://github.com/NousResearch/hermes-agent/pull/7113))
### Other Tools
- **`hermes dump`** command for setup summary ([#6550](https://github.com/NousResearch/hermes-agent/pull/6550))
- TODO store enforces ID uniqueness during replace operations ([#7986](https://github.com/NousResearch/hermes-agent/pull/7986))
- List all available toolsets in `delegate_task` schema description ([#8231](https://github.com/NousResearch/hermes-agent/pull/8231))
- API server: tool progress as custom SSE event to prevent model corruption ([#7500](https://github.com/NousResearch/hermes-agent/pull/7500))
- API server: share one Docker container across all conversations ([#7127](https://github.com/NousResearch/hermes-agent/pull/7127))
---
## 🧩 Skills Ecosystem
- **Centralized skills index + tree cache** — eliminates rate-limit failures on install ([#8575](https://github.com/NousResearch/hermes-agent/pull/8575))
- **More aggressive skill loading instructions** in system prompt (v3) ([#8209](https://github.com/NousResearch/hermes-agent/pull/8209), [#8286](https://github.com/NousResearch/hermes-agent/pull/8286))
- **Google Workspace skill** migrated to GWS CLI backend ([#6788](https://github.com/NousResearch/hermes-agent/pull/6788))
- **Creative divergence strategies** skill — @SHL0MS ([#6882](https://github.com/NousResearch/hermes-agent/pull/6882))
- **Creative ideation** — constraint-driven project generation — @SHL0MS ([#7555](https://github.com/NousResearch/hermes-agent/pull/7555))
- Parallelize skills browse/search to prevent hanging ([#7301](https://github.com/NousResearch/hermes-agent/pull/7301))
- Read name from SKILL.md frontmatter in skills_sync ([#7623](https://github.com/NousResearch/hermes-agent/pull/7623))
---
## 🔒 Security & Reliability
### Security Hardening
- **Twilio webhook signature validation** — SMS RCE fix ([#7933](https://github.com/NousResearch/hermes-agent/pull/7933))
- **Shell injection neutralization** in `_write_to_sandbox` via path quoting ([#7940](https://github.com/NousResearch/hermes-agent/pull/7940))
- **Git argument injection** and path traversal prevention in checkpoint manager ([#7944](https://github.com/NousResearch/hermes-agent/pull/7944))
- **SSRF redirect bypass** in Slack image uploads + base.py cache helpers ([#7151](https://github.com/NousResearch/hermes-agent/pull/7151))
- **Path traversal, credential gate, DANGEROUS_PATTERNS gaps** ([#7156](https://github.com/NousResearch/hermes-agent/pull/7156))
- **API bind guard** — enforce `API_SERVER_KEY` for non-loopback binding ([#7455](https://github.com/NousResearch/hermes-agent/pull/7455))
- **Approval button authorization** — require auth for session continuation — @Cafexss ([#6930](https://github.com/NousResearch/hermes-agent/pull/6930))
- Path boundary enforcement in skill manager operations ([#7156](https://github.com/NousResearch/hermes-agent/pull/7156))
- DingTalk/API webhook URL origin validation, header injection rejection ([#7455](https://github.com/NousResearch/hermes-agent/pull/7455))
### Reliability
- **Contextual error diagnostics** for invalid API responses ([#8565](https://github.com/NousResearch/hermes-agent/pull/8565))
- **Prevent 400 format errors** from triggering compression loop on Codex ([#6751](https://github.com/NousResearch/hermes-agent/pull/6751))
- **Don't halve context_length** on output-cap-too-large errors — @KUSH42 ([#6664](https://github.com/NousResearch/hermes-agent/pull/6664))
- **Recover primary client** on OpenAI transport errors ([#7108](https://github.com/NousResearch/hermes-agent/pull/7108))
- **Credential pool rotation** on billing-classified 400s ([#7112](https://github.com/NousResearch/hermes-agent/pull/7112))
- **Auto-increase stream read timeout** for local LLM providers ([#6967](https://github.com/NousResearch/hermes-agent/pull/6967))
- **Fall back to default certs** when CA bundle path doesn't exist ([#7352](https://github.com/NousResearch/hermes-agent/pull/7352))
- **Disambiguate usage-limit patterns** in error classifier — @sprmn24 ([#6836](https://github.com/NousResearch/hermes-agent/pull/6836))
- Harden cron script timeout and provider recovery ([#7079](https://github.com/NousResearch/hermes-agent/pull/7079))
- Gateway interrupt detection resilient to monitor task failures ([#8208](https://github.com/NousResearch/hermes-agent/pull/8208))
- Prevent unwanted session auto-reset after graceful gateway restarts ([#8299](https://github.com/NousResearch/hermes-agent/pull/8299))
- Prevent duplicate update prompt spam in gateway watcher ([#8343](https://github.com/NousResearch/hermes-agent/pull/8343))
- Deduplicate reasoning items in Responses API input ([#7946](https://github.com/NousResearch/hermes-agent/pull/7946))
### Infrastructure
- **Multi-arch Docker image** — amd64 + arm64 ([#6124](https://github.com/NousResearch/hermes-agent/pull/6124))
- **Docker runs as non-root user** with virtualenv — @benbarclay contributing ([#8226](https://github.com/NousResearch/hermes-agent/pull/8226))
- **Use `uv`** for Docker dependency resolution to fix resolution-too-deep ([#6965](https://github.com/NousResearch/hermes-agent/pull/6965))
- **Container-aware Nix CLI** — auto-route into managed container — @alt-glitch ([#7543](https://github.com/NousResearch/hermes-agent/pull/7543))
- **Nix shared-state permission model** for interactive CLI users — @alt-glitch ([#6796](https://github.com/NousResearch/hermes-agent/pull/6796))
- **Per-profile subprocess HOME isolation** ([#7357](https://github.com/NousResearch/hermes-agent/pull/7357))
- Profile paths fixed in Docker — profiles go to mounted volume ([#7170](https://github.com/NousResearch/hermes-agent/pull/7170))
- Docker container gateway pathway hardened ([#8614](https://github.com/NousResearch/hermes-agent/pull/8614))
- Enable unbuffered stdout for live Docker logs ([#6749](https://github.com/NousResearch/hermes-agent/pull/6749))
- Install procps in Docker image — @HiddenPuppy ([#7032](https://github.com/NousResearch/hermes-agent/pull/7032))
- Shallow git clone for faster installation — @sosyz ([#8396](https://github.com/NousResearch/hermes-agent/pull/8396))
- `hermes update` always reset on stash conflict ([#7010](https://github.com/NousResearch/hermes-agent/pull/7010))
- Write update exit code before gateway restart (cgroup kill race) ([#8288](https://github.com/NousResearch/hermes-agent/pull/8288))
- Nix: `setupSecrets` optional, tirith runtime dep — @devorun, @ethernet8023 ([#6261](https://github.com/NousResearch/hermes-agent/pull/6261), [#6721](https://github.com/NousResearch/hermes-agent/pull/6721))
- launchd stop uses `bootout` so `KeepAlive` doesn't respawn ([#7119](https://github.com/NousResearch/hermes-agent/pull/7119))
---
## 🐛 Notable Bug Fixes
- Fix: `/model` switch not persisting across gateway messages ([#7081](https://github.com/NousResearch/hermes-agent/pull/7081))
- Fix: session-scoped gateway model overrides ignored — @Hygaard ([#7662](https://github.com/NousResearch/hermes-agent/pull/7662))
- Fix: compaction model context length ignoring config — 3 related issues ([#8258](https://github.com/NousResearch/hermes-agent/pull/8258), [#8107](https://github.com/NousResearch/hermes-agent/pull/8107))
- Fix: OpenCode.ai context window resolved to 128K instead of 1M ([#6472](https://github.com/NousResearch/hermes-agent/pull/6472))
- Fix: Codex fallback auth-store lookup — @cherifya ([#6462](https://github.com/NousResearch/hermes-agent/pull/6462))
- Fix: duplicate completion notifications when process killed ([#7124](https://github.com/NousResearch/hermes-agent/pull/7124))
- Fix: agent daemon thread prevents orphan CLI processes on tab close ([#8557](https://github.com/NousResearch/hermes-agent/pull/8557))
- Fix: stale image attachment on text paste and voice input ([#7077](https://github.com/NousResearch/hermes-agent/pull/7077))
- Fix: DM thread session seeding causing cross-thread contamination ([#7084](https://github.com/NousResearch/hermes-agent/pull/7084))
- Fix: OpenClaw migration shows dry-run preview before executing ([#6769](https://github.com/NousResearch/hermes-agent/pull/6769))
- Fix: auth errors misclassified as retryable — @kuishou68 ([#7027](https://github.com/NousResearch/hermes-agent/pull/7027))
- Fix: Copilot-Integration-Id header missing ([#7083](https://github.com/NousResearch/hermes-agent/pull/7083))
- Fix: ACP session capabilities — @luyao618 ([#6985](https://github.com/NousResearch/hermes-agent/pull/6985))
- Fix: ACP PromptResponse usage from top-level fields ([#7086](https://github.com/NousResearch/hermes-agent/pull/7086))
- Fix: several failing/flaky tests on main — @dsocolobsky ([#6777](https://github.com/NousResearch/hermes-agent/pull/6777))
- Fix: backup marker filenames — @sprmn24 ([#8600](https://github.com/NousResearch/hermes-agent/pull/8600))
- Fix: `NoneType` in fast_mode check — @0xbyt4 ([#7350](https://github.com/NousResearch/hermes-agent/pull/7350))
- Fix: missing imports in uninstall.py — @JiayuuWang ([#7034](https://github.com/NousResearch/hermes-agent/pull/7034))
---
## 📚 Documentation
- Platform adapter developer guide + WeCom Callback docs ([#7969](https://github.com/NousResearch/hermes-agent/pull/7969))
- Cron troubleshooting guide ([#7122](https://github.com/NousResearch/hermes-agent/pull/7122))
- Streaming timeout auto-detection for local LLMs ([#6990](https://github.com/NousResearch/hermes-agent/pull/6990))
- Tool-use enforcement documentation expanded ([#7984](https://github.com/NousResearch/hermes-agent/pull/7984))
- BlueBubbles pairing instructions ([#6548](https://github.com/NousResearch/hermes-agent/pull/6548))
- Telegram proxy support section ([#6348](https://github.com/NousResearch/hermes-agent/pull/6348))
- `hermes dump` and `hermes logs` CLI reference ([#6552](https://github.com/NousResearch/hermes-agent/pull/6552))
- `tool_progress_overrides` configuration reference ([#6364](https://github.com/NousResearch/hermes-agent/pull/6364))
- Compression model context length warning docs ([#7879](https://github.com/NousResearch/hermes-agent/pull/7879))
---
## 👥 Contributors
**269 merged PRs** from **24 contributors** across **487 commits**.
### Community Contributors
- **@alt-glitch** (6 PRs) — Nix container-aware CLI, shared-state permissions, Matrix SQLite crypto store, bulk SSH/Modal file sync, Matrix mautrix compat
- **@SHL0MS** (2 PRs) — Creative divergence strategies skill, creative ideation skill
- **@sprmn24** (2 PRs) — Error classifier disambiguation, backup marker fix
- **@nicoloboschi** — Hindsight memory plugin feature parity
- **@Hygaard** — Session-scoped gateway model override fix
- **@jarvis-phw** — Discord allowed_channels whitelist
- **@Kathie-yu** — Honcho initOnSessionStart for tools mode
- **@hermes-agent-dhabibi** — Discord forum channel topic inheritance
- **@kira-ariaki** — Discord .log attachments and size limit
- **@cherifya** — Codex fallback auth-store lookup
- **@Cafexss** — Security: auth for session continuation
- **@KUSH42** — Compaction context_length fix
- **@kuishou68** — Auth error retryable classification fix
- **@luyao618** — ACP session capabilities
- **@ygd58** — HERMES_HOME_MODE env var override
- **@0xbyt4** — Fast mode NoneType fix
- **@JiayuuWang** — CLI uninstall import fix
- **@HiddenPuppy** — Docker procps installation
- **@dsocolobsky** — Test suite fixes
- **@bobashopcashier** (1 PR) — Graceful gateway drain before restart (salvaged into #7503 from #7290)
- **@benbarclay** — Docker image tag simplification
- **@sosyz** — Shallow git clone for faster install
- **@devorun** — Nix setupSecrets optional
- **@ethernet8023** — Nix tirith runtime dep
---
**Full Changelog**: [v2026.4.8...v2026.4.13](https://github.com/NousResearch/hermes-agent/compare/v2026.4.8...v2026.4.13)
+8 -7
View File
@@ -1230,9 +1230,10 @@ def build_anthropic_kwargs(
When *base_url* points to a third-party Anthropic-compatible endpoint,
thinking block signatures are stripped (they are Anthropic-proprietary).
When *fast_mode* is True, adds ``speed: "fast"`` and the fast-mode beta
header for ~2.5x faster output throughput on Opus 4.6. Currently only
supported on native Anthropic endpoints (not third-party compatible ones).
When *fast_mode* is True, adds ``extra_body["speed"] = "fast"`` and the
fast-mode beta header for ~2.5x faster output throughput on Opus 4.6.
Currently only supported on native Anthropic endpoints (not third-party
compatible ones).
"""
system, anthropic_messages = convert_messages_to_anthropic(messages, base_url=base_url)
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
@@ -1333,11 +1334,11 @@ def build_anthropic_kwargs(
kwargs["max_tokens"] = max(effective_max_tokens, budget + 4096)
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds speed:"fast" + the fast-mode beta header for ~2.5x output speed.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
# Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
# output speed. Only for native Anthropic endpoints — third-party
# providers would reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
kwargs["speed"] = "fast"
kwargs.setdefault("extra_body", {})["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
betas = list(_common_betas_for_base_url(base_url))
+210 -80
View File
@@ -27,10 +27,6 @@ Per-task overrides are configured in config.yaml under the ``auxiliary:`` sectio
(e.g. ``auxiliary.vision.provider``, ``auxiliary.compression.model``).
Default "auto" follows the chains above.
Legacy env var overrides (AUXILIARY_{TASK}_PROVIDER, AUXILIARY_{TASK}_MODEL,
AUXILIARY_{TASK}_BASE_URL, etc.) are still read as a backward-compat fallback
but config.yaml takes priority. New configuration should always use config.yaml.
Payment / credit exhaustion fallback:
When a resolved provider returns HTTP 402 or a credit-related error,
call_llm() automatically retries with the next available provider in the
@@ -68,6 +64,8 @@ _PROVIDER_ALIASES = {
"zhipu": "zai",
"kimi": "kimi-coding",
"moonshot": "kimi-coding",
"kimi-cn": "kimi-coding-cn",
"moonshot-cn": "kimi-coding-cn",
"minimax-china": "minimax-cn",
"minimax_cn": "minimax-cn",
"claude": "anthropic",
@@ -75,13 +73,13 @@ _PROVIDER_ALIASES = {
}
def _normalize_aux_provider(provider: Optional[str], *, for_vision: bool = False) -> str:
def _normalize_aux_provider(provider: Optional[str]) -> str:
normalized = (provider or "auto").strip().lower()
if normalized.startswith("custom:"):
suffix = normalized.split(":", 1)[1].strip()
if not suffix:
return "custom"
normalized = suffix if not for_vision else "custom"
normalized = suffix
if normalized == "codex":
return "openai-codex"
if normalized == "main":
@@ -98,6 +96,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
"kimi-coding-cn": "kimi-k2-turbo-preview",
"minimax": "MiniMax-M2.7",
"minimax-cn": "MiniMax-M2.7",
"anthropic": "claude-haiku-4-5-20251001",
@@ -113,6 +112,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
# "exotic provider" branch checks this before falling back to the main model.
_PROVIDER_VISION_MODELS: Dict[str, str] = {
"xiaomi": "mimo-v2-omni",
"zai": "glm-5v-turbo",
}
# OpenRouter app attribution headers
@@ -753,30 +753,6 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
# ── Provider resolution helpers ─────────────────────────────────────────────
def _get_auxiliary_provider(task: str = "") -> str:
"""Read the provider override for a specific auxiliary task.
Checks AUXILIARY_{TASK}_PROVIDER first (e.g. AUXILIARY_VISION_PROVIDER),
then CONTEXT_{TASK}_PROVIDER (for the compression section's summary_provider),
then falls back to "auto". Returns one of: "auto", "openrouter", "nous", "main".
"""
if task:
for prefix in ("AUXILIARY_", "CONTEXT_"):
val = os.getenv(f"{prefix}{task.upper()}_PROVIDER", "").strip().lower()
if val and val != "auto":
return val
return "auto"
def _get_auxiliary_env_override(task: str, suffix: str) -> Optional[str]:
"""Read an auxiliary env override from AUXILIARY_* or CONTEXT_* prefixes."""
if not task:
return None
for prefix in ("AUXILIARY_", "CONTEXT_"):
val = os.getenv(f"{prefix}{task.upper()}_{suffix}", "").strip()
if val:
return val
return None
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
@@ -1021,6 +997,23 @@ _AUTO_PROVIDER_LABELS = {
_AGGREGATOR_PROVIDERS = frozenset({"openrouter", "nous"})
_MAIN_RUNTIME_FIELDS = ("provider", "model", "base_url", "api_key", "api_mode")
def _normalize_main_runtime(main_runtime: Optional[Dict[str, Any]]) -> Dict[str, str]:
"""Return a sanitized copy of a live main-runtime override."""
if not isinstance(main_runtime, dict):
return {}
normalized: Dict[str, str] = {}
for field in _MAIN_RUNTIME_FIELDS:
value = main_runtime.get(field)
if isinstance(value, str) and value.strip():
normalized[field] = value.strip()
provider = normalized.get("provider")
if provider:
normalized["provider"] = provider.lower()
return normalized
def _get_provider_chain() -> List[tuple]:
"""Return the ordered provider detection chain.
@@ -1130,7 +1123,7 @@ def _try_payment_fallback(
return None, None, ""
def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain.
Priority:
@@ -1142,6 +1135,12 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
"""
global auxiliary_is_nous, _stale_base_url_warned
auxiliary_is_nous = False # Reset — _try_nous() will set True if it wins
runtime = _normalize_main_runtime(main_runtime)
runtime_provider = runtime.get("provider", "")
runtime_model = runtime.get("model", "")
runtime_base_url = runtime.get("base_url", "")
runtime_api_key = runtime.get("api_key", "")
runtime_api_mode = runtime.get("api_mode", "")
# ── Warn once if OPENAI_BASE_URL is set but config.yaml uses a named
# provider (not 'custom'). This catches the common "env poisoning"
@@ -1149,7 +1148,7 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
# old OPENAI_BASE_URL lingers in ~/.hermes/.env. ──
if not _stale_base_url_warned:
_env_base = os.getenv("OPENAI_BASE_URL", "").strip()
_cfg_provider = _read_main_provider()
_cfg_provider = runtime_provider or _read_main_provider()
if (_env_base and _cfg_provider
and _cfg_provider != "custom"
and not _cfg_provider.startswith("custom:")):
@@ -1163,12 +1162,25 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
_stale_base_url_warned = True
# ── Step 1: non-aggregator main provider → use main model directly ──
main_provider = _read_main_provider()
main_model = _read_main_model()
main_provider = runtime_provider or _read_main_provider()
main_model = runtime_model or _read_main_model()
if (main_provider and main_model
and main_provider not in _AGGREGATOR_PROVIDERS
and main_provider not in ("auto", "")):
client, resolved = resolve_provider_client(main_provider, main_model)
resolved_provider = main_provider
explicit_base_url = None
explicit_api_key = None
if runtime_base_url and (main_provider == "custom" or main_provider.startswith("custom:")):
resolved_provider = "custom"
explicit_base_url = runtime_base_url
explicit_api_key = runtime_api_key or None
client, resolved = resolve_provider_client(
resolved_provider,
main_model,
explicit_base_url=explicit_base_url,
explicit_api_key=explicit_api_key,
api_mode=runtime_api_mode or None,
)
if client is not None:
logger.info("Auxiliary auto-detect: using main provider %s (%s)",
main_provider, resolved or main_model)
@@ -1212,6 +1224,12 @@ def _to_async_client(sync_client, model: str):
return AsyncCodexAuxiliaryClient(sync_client), model
if isinstance(sync_client, AnthropicAuxiliaryClient):
return AsyncAnthropicAuxiliaryClient(sync_client), model
try:
from agent.copilot_acp_client import CopilotACPClient
if isinstance(sync_client, CopilotACPClient):
return sync_client, model
except ImportError:
pass
async_kwargs = {
"api_key": sync_client.api_key,
@@ -1249,6 +1267,7 @@ def resolve_provider_client(
explicit_base_url: str = None,
explicit_api_key: str = None,
api_mode: str = None,
main_runtime: Optional[Dict[str, Any]] = None,
) -> Tuple[Optional[Any], Optional[str]]:
"""Central router: given a provider name and optional model, return a
configured client with the correct auth, base URL, and API format.
@@ -1319,7 +1338,7 @@ def resolve_provider_client(
# ── Auto: try all providers in priority order ────────────────────
if provider == "auto":
client, resolved = _resolve_auto()
client, resolved = _resolve_auto(main_runtime=main_runtime)
if client is None:
return None, None
# When auto-detection lands on a non-OpenRouter provider (e.g. a
@@ -1429,10 +1448,14 @@ def resolve_provider_client(
custom_entry = _get_named_custom_provider(provider)
if custom_entry:
custom_base = custom_entry.get("base_url", "").strip()
custom_key = custom_entry.get("api_key", "").strip() or "no-key-required"
custom_key = custom_entry.get("api_key", "").strip()
custom_key_env = custom_entry.get("key_env", "").strip()
if not custom_key and custom_key_env:
custom_key = os.getenv(custom_key_env, "").strip()
custom_key = custom_key or "no-key-required"
if custom_base:
final_model = _normalize_resolved_model(
model or _read_main_model() or "gpt-4o-mini",
model or custom_entry.get("model") or _read_main_model() or "gpt-4o-mini",
provider,
)
client = OpenAI(api_key=custom_key, base_url=custom_base)
@@ -1451,7 +1474,11 @@ def resolve_provider_client(
# ── API-key providers from PROVIDER_REGISTRY ─────────────────────
try:
from hermes_cli.auth import PROVIDER_REGISTRY, resolve_api_key_provider_credentials
from hermes_cli.auth import (
PROVIDER_REGISTRY,
resolve_api_key_provider_credentials,
resolve_external_process_provider_credentials,
)
except ImportError:
logger.debug("hermes_cli.auth not available for provider %s", provider)
return None, None
@@ -1525,6 +1552,41 @@ def resolve_provider_client(
return (_to_async_client(client, final_model) if async_mode
else (client, final_model))
if pconfig.auth_type == "external_process":
creds = resolve_external_process_provider_credentials(provider)
final_model = _normalize_resolved_model(model or _read_main_model(), provider)
if provider == "copilot-acp":
api_key = str(creds.get("api_key", "")).strip()
base_url = str(creds.get("base_url", "")).strip()
command = str(creds.get("command", "")).strip() or None
args = list(creds.get("args") or [])
if not final_model:
logger.warning(
"resolve_provider_client: copilot-acp requested but no model "
"was provided or configured"
)
return None, None
if not api_key or not base_url:
logger.warning(
"resolve_provider_client: copilot-acp requested but external "
"process credentials are incomplete"
)
return None, None
from agent.copilot_acp_client import CopilotACPClient
client = CopilotACPClient(
api_key=api_key,
base_url=base_url,
command=command,
args=args,
)
logger.debug("resolve_provider_client: %s (%s)", provider, final_model)
return (_to_async_client(client, final_model) if async_mode
else (client, final_model))
logger.warning("resolve_provider_client: external-process provider %s not "
"directly supported", provider)
return None, None
elif pconfig.auth_type in ("oauth_device_code", "oauth_external"):
# OAuth providers — route through their specific try functions
if provider == "nous":
@@ -1543,15 +1605,19 @@ def resolve_provider_client(
# ── Public API ──────────────────────────────────────────────────────────────
def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optional[str]]:
def get_text_auxiliary_client(
task: str = "",
*,
main_runtime: Optional[Dict[str, Any]] = None,
) -> Tuple[Optional[OpenAI], Optional[str]]:
"""Return (client, default_model_slug) for text-only auxiliary tasks.
Args:
task: Optional task name ("compression", "web_extract") to check
for a task-specific provider override.
Callers may override the returned model with a per-task env var
(e.g. CONTEXT_COMPRESSION_MODEL, AUXILIARY_WEB_EXTRACT_MODEL).
Callers may override the returned model via config.yaml
(e.g. auxiliary.compression.model, auxiliary.web_extract.model).
"""
provider, model, base_url, api_key, api_mode = _resolve_task_provider_model(task or None)
return resolve_provider_client(
@@ -1560,10 +1626,11 @@ def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optiona
explicit_base_url=base_url,
explicit_api_key=api_key,
api_mode=api_mode,
main_runtime=main_runtime,
)
def get_async_text_auxiliary_client(task: str = ""):
def get_async_text_auxiliary_client(task: str = "", *, main_runtime: Optional[Dict[str, Any]] = None):
"""Return (async_client, model_slug) for async consumers.
For standard providers returns (AsyncOpenAI, model). For Codex returns
@@ -1578,6 +1645,7 @@ def get_async_text_auxiliary_client(task: str = ""):
explicit_base_url=base_url,
explicit_api_key=api_key,
api_mode=api_mode,
main_runtime=main_runtime,
)
@@ -1588,7 +1656,7 @@ _VISION_AUTO_PROVIDER_ORDER = (
def _normalize_vision_provider(provider: Optional[str]) -> str:
return _normalize_aux_provider(provider, for_vision=True)
return _normalize_aux_provider(provider)
def _resolve_strict_vision_backend(provider: str) -> Tuple[Optional[Any], Optional[str]]:
@@ -1671,6 +1739,7 @@ def resolve_vision_provider_client(
async_mode=async_mode,
explicit_base_url=resolved_base_url,
explicit_api_key=resolved_api_key,
api_mode=resolved_api_mode,
)
if client is None:
return "custom", None, None
@@ -1695,7 +1764,8 @@ def resolve_vision_provider_client(
# Use provider-specific vision model if available, otherwise main model.
vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
rpc_client, rpc_model = resolve_provider_client(
main_provider, vision_model)
main_provider, vision_model,
api_mode=resolved_api_mode)
if rpc_client is not None:
logger.info(
"Vision auto-detect: using active provider %s (%s)",
@@ -1719,7 +1789,8 @@ def resolve_vision_provider_client(
sync_client, default_model = _resolve_strict_vision_backend(requested)
return _finalize(requested, sync_client, default_model)
client, final_model = _get_cached_client(requested, resolved_model, async_mode)
client, final_model = _get_cached_client(requested, resolved_model, async_mode,
api_mode=resolved_api_mode)
if client is None:
return requested, None, None
return requested, client, final_model
@@ -1892,6 +1963,7 @@ def _get_cached_client(
base_url: str = None,
api_key: str = None,
api_mode: str = None,
main_runtime: Optional[Dict[str, Any]] = None,
) -> Tuple[Optional[Any], Optional[str]]:
"""Get or create a cached client for the given provider.
@@ -1915,7 +1987,9 @@ def _get_cached_client(
loop_id = id(current_loop)
except RuntimeError:
pass
cache_key = (provider, async_mode, base_url or "", api_key or "", api_mode or "", loop_id)
runtime = _normalize_main_runtime(main_runtime)
runtime_key = tuple(runtime.get(field, "") for field in _MAIN_RUNTIME_FIELDS) if provider == "auto" else ()
cache_key = (provider, async_mode, base_url or "", api_key or "", api_mode or "", loop_id, runtime_key)
with _client_cache_lock:
if cache_key in _client_cache:
cached_client, cached_default, cached_loop = _client_cache[cache_key]
@@ -1940,6 +2014,7 @@ def _get_cached_client(
explicit_base_url=base_url,
explicit_api_key=api_key,
api_mode=api_mode,
main_runtime=runtime,
)
if client is not None:
# For async clients, remember which loop they were created on so we
@@ -1964,9 +2039,8 @@ def _resolve_task_provider_model(
Priority:
1. Explicit provider/model/base_url/api_key args (always win)
2. Config file (auxiliary.{task}.* or compression.*)
3. Env var overrides (backward-compat: AUXILIARY_{TASK}_*, CONTEXT_{TASK}_*)
4. "auto" (full auto-detection chain)
2. Config file (auxiliary.{task}.provider/model/base_url)
3. "auto" (full auto-detection chain)
Returns (provider, model, base_url, api_key, api_mode) where model may
be None (use provider default). When base_url is set, provider is forced
@@ -1997,22 +2071,8 @@ def _resolve_task_provider_model(
cfg_api_key = str(task_config.get("api_key", "")).strip() or None
cfg_api_mode = str(task_config.get("api_mode", "")).strip() or None
# Backwards compat: compression section has its own keys.
# The auxiliary.compression defaults to provider="auto", so treat
# both None and "auto" as "not explicitly configured".
if task == "compression" and (not cfg_provider or cfg_provider == "auto"):
comp = config.get("compression", {}) if isinstance(config, dict) else {}
if isinstance(comp, dict):
cfg_provider = comp.get("summary_provider", "").strip() or None
cfg_model = cfg_model or comp.get("summary_model", "").strip() or None
_sbu = comp.get("summary_base_url") or ""
cfg_base_url = cfg_base_url or _sbu.strip() or None
# Env vars are backward-compat fallback only — config.yaml is primary.
env_model = _get_auxiliary_env_override(task, "MODEL") if task else None
env_api_mode = _get_auxiliary_env_override(task, "API_MODE") if task else None
resolved_model = model or cfg_model or env_model
resolved_api_mode = cfg_api_mode or env_api_mode
resolved_model = model or cfg_model
resolved_api_mode = cfg_api_mode
if base_url:
return "custom", resolved_model, base_url, api_key, resolved_api_mode
@@ -2026,17 +2086,6 @@ def _resolve_task_provider_model(
if cfg_provider and cfg_provider != "auto":
return cfg_provider, resolved_model, None, None, resolved_api_mode
# Env vars are backward-compat fallback for users who haven't
# migrated to config.yaml yet.
env_base_url = _get_auxiliary_env_override(task, "BASE_URL")
env_api_key = _get_auxiliary_env_override(task, "API_KEY")
if env_base_url:
return "custom", resolved_model, env_base_url, env_api_key, resolved_api_mode
env_provider = _get_auxiliary_provider(task)
if env_provider != "auto":
return env_provider, resolved_model, None, None, resolved_api_mode
return "auto", resolved_model, None, None, resolved_api_mode
return "auto", resolved_model, None, None, resolved_api_mode
@@ -2065,6 +2114,75 @@ def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float
return default
# ---------------------------------------------------------------------------
# Anthropic-compatible endpoint detection + image block conversion
# ---------------------------------------------------------------------------
# Providers that use Anthropic-compatible endpoints (via OpenAI SDK wrapper).
# Their image content blocks must use Anthropic format, not OpenAI format.
_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-cn"})
def _is_anthropic_compat_endpoint(provider: str, base_url: str) -> bool:
"""Detect if an endpoint expects Anthropic-format content blocks.
Returns True for known Anthropic-compatible providers (MiniMax) and
any endpoint whose URL contains ``/anthropic`` in the path.
"""
if provider in _ANTHROPIC_COMPAT_PROVIDERS:
return True
url_lower = (base_url or "").lower()
return "/anthropic" in url_lower
def _convert_openai_images_to_anthropic(messages: list) -> list:
"""Convert OpenAI ``image_url`` content blocks to Anthropic ``image`` blocks.
Only touches messages that have list-type content with ``image_url`` blocks;
plain text messages pass through unchanged.
"""
converted = []
for msg in messages:
content = msg.get("content")
if not isinstance(content, list):
converted.append(msg)
continue
new_content = []
changed = False
for block in content:
if block.get("type") == "image_url":
image_url_val = (block.get("image_url") or {}).get("url", "")
if image_url_val.startswith("data:"):
# Parse data URI: data:<media_type>;base64,<data>
header, _, b64data = image_url_val.partition(",")
media_type = "image/png"
if ":" in header and ";" in header:
media_type = header.split(":", 1)[1].split(";", 1)[0]
new_content.append({
"type": "image",
"source": {
"type": "base64",
"media_type": media_type,
"data": b64data,
},
})
else:
# URL-based image
new_content.append({
"type": "image",
"source": {
"type": "url",
"url": image_url_val,
},
})
changed = True
else:
new_content.append(block)
converted.append({**msg, "content": new_content} if changed else msg)
return converted
def _build_call_kwargs(
provider: str,
model: str,
@@ -2149,6 +2267,7 @@ def call_llm(
model: str = None,
base_url: str = None,
api_key: str = None,
main_runtime: Optional[Dict[str, Any]] = None,
messages: list,
temperature: float = None,
max_tokens: int = None,
@@ -2214,6 +2333,7 @@ def call_llm(
base_url=resolved_base_url,
api_key=resolved_api_key,
api_mode=resolved_api_mode,
main_runtime=main_runtime,
)
if client is None:
# When the user explicitly chose a non-OpenRouter provider but no
@@ -2234,7 +2354,7 @@ def call_llm(
if not resolved_base_url:
logger.info("Auxiliary %s: provider %s unavailable, trying auto-detection chain",
task or "call", resolved_provider)
client, final_model = _get_cached_client("auto")
client, final_model = _get_cached_client("auto", main_runtime=main_runtime)
if client is None:
raise RuntimeError(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
@@ -2255,6 +2375,11 @@ def call_llm(
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
# Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
_client_base = str(getattr(client, "base_url", "") or "")
if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
# Handle max_tokens vs max_completion_tokens retry, then payment fallback.
try:
return _validate_llm_response(
@@ -2331,9 +2456,9 @@ def extract_content_or_reasoning(response) -> str:
if content:
# Strip inline think/reasoning blocks (mirrors _strip_think_blocks)
cleaned = re.sub(
r"<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>"
r"<(?:think|thinking|reasoning|thought|REASONING_SCRATCHPAD)>"
r".*?"
r"</(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>",
r"</(?:think|thinking|reasoning|thought|REASONING_SCRATCHPAD)>",
"", content, flags=re.DOTALL | re.IGNORECASE,
).strip()
if cleaned:
@@ -2443,6 +2568,11 @@ async def async_call_llm(
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
# Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
_client_base = str(getattr(client, "base_url", "") or "")
if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
try:
return _validate_llm_response(
await client.chat.completions.create(**kwargs), task)
+127 -73
View File
@@ -4,8 +4,12 @@ Self-contained class with its own OpenAI client for summarization.
Uses auxiliary model (cheap/fast) to summarize middle turns while
protecting head and tail context.
Improvements over v1:
- Structured summary template (Goal, Progress, Decisions, Files, Next Steps)
Improvements over v2:
- Structured summary template with Resolved/Pending question tracking
- Summarizer preamble: "Do not respond to any questions" (from OpenCode)
- Handoff framing: "different assistant" (from Codex) to create separation
- "Remaining Work" replaces "Next Steps" to avoid reading as active instructions
- Clear separator when summary merges into tail message
- Iterative summary updates (preserves info across multiple compactions)
- Token-budget tail protection instead of fixed message count
- Tool output pruning before LLM summarization (cheap pre-pass)
@@ -20,6 +24,7 @@ from typing import Any, Dict, List, Optional
from agent.auxiliary_client import call_llm
from agent.context_engine import ContextEngine
from agent.model_metadata import (
MINIMUM_CONTEXT_LENGTH,
get_model_context_length,
estimate_messages_tokens_rough,
)
@@ -27,12 +32,13 @@ from agent.model_metadata import (
logger = logging.getLogger(__name__)
SUMMARY_PREFIX = (
"[CONTEXT COMPACTION] Earlier turns in this conversation were compacted "
"to save context space. The summary below describes work that was "
"already completed, and the current session state may still reflect "
"that work (for example, files may already be changed). Use the summary "
"and the current state to continue from where things left off, and "
"avoid repeating work:"
"[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted "
"into the summary below. This is a handoff from a previous context "
"window — treat it as background reference, NOT as active instructions. "
"Do NOT answer questions or fulfill requests mentioned in this summary; "
"they were already addressed. Respond ONLY to the latest user message "
"that appears AFTER this summary. The current session state (files, "
"config, etc.) may reflect work described here — avoid repeating it:"
)
LEGACY_SUMMARY_PREFIX = "[CONTEXT SUMMARY]:"
@@ -80,14 +86,19 @@ class ContextCompressor(ContextEngine):
base_url: str = "",
api_key: str = "",
provider: str = "",
api_mode: str = "",
) -> None:
"""Update model info after a model switch or fallback activation."""
self.model = model
self.base_url = base_url
self.api_key = api_key
self.provider = provider
self.api_mode = api_mode
self.context_length = context_length
self.threshold_tokens = int(context_length * self.threshold_percent)
self.threshold_tokens = max(
int(context_length * self.threshold_percent),
MINIMUM_CONTEXT_LENGTH,
)
def __init__(
self,
@@ -102,11 +113,13 @@ class ContextCompressor(ContextEngine):
api_key: str = "",
config_context_length: int | None = None,
provider: str = "",
api_mode: str = "",
):
self.model = model
self.base_url = base_url
self.api_key = api_key
self.provider = provider
self.api_mode = api_mode
self.threshold_percent = threshold_percent
self.protect_first_n = protect_first_n
self.protect_last_n = protect_last_n
@@ -118,7 +131,14 @@ class ContextCompressor(ContextEngine):
config_context_length=config_context_length,
provider=provider,
)
self.threshold_tokens = int(self.context_length * threshold_percent)
# Floor: never compress below MINIMUM_CONTEXT_LENGTH tokens even if
# the percentage would suggest a lower value. This prevents premature
# compression on large-context models at 50% while keeping the % sane
# for models right at the minimum.
self.threshold_tokens = max(
int(self.context_length * threshold_percent),
MINIMUM_CONTEXT_LENGTH,
)
self.compression_count = 0
# Derive token budgets: ratio is relative to the threshold, not total context
@@ -295,13 +315,20 @@ class ContextCompressor(ContextEngine):
return "\n\n".join(parts)
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]]) -> Optional[str]:
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]], focus_topic: str = None) -> Optional[str]:
"""Generate a structured summary of conversation turns.
Uses a structured template (Goal, Progress, Decisions, Files, Next Steps)
inspired by Pi-mono and OpenCode. When a previous summary exists,
Uses a structured template (Goal, Progress, Decisions, Resolved/Pending
Questions, Files, Remaining Work) with explicit preamble telling the
summarizer not to answer questions. When a previous summary exists,
generates an iterative update instead of summarizing from scratch.
Args:
focus_topic: Optional focus string for guided compression. When
provided, the summariser prioritises preserving information
related to this topic and is more aggressive about compressing
everything else. Inspired by Claude Code's ``/compact``.
Returns None if all attempts fail — the caller should drop
the middle turns without a summary rather than inject a useless
placeholder.
@@ -317,60 +344,27 @@ class ContextCompressor(ContextEngine):
summary_budget = self._compute_summary_budget(turns_to_summarize)
content_to_summarize = self._serialize_for_summary(turns_to_summarize)
if self._previous_summary:
# Iterative update: preserve existing info, add new progress
prompt = f"""You are updating a context compaction summary. A previous compaction produced the summary below. New conversation turns have occurred since then and need to be incorporated.
# Preamble shared by both first-compaction and iterative-update prompts.
# Inspired by OpenCode's "do not respond to any questions" instruction
# and Codex's "another language model" framing.
_summarizer_preamble = (
"You are a summarization agent creating a context checkpoint. "
"Your output will be injected as reference material for a DIFFERENT "
"assistant that continues the conversation. "
"Do NOT respond to any questions or requests in the conversation — "
"only output the structured summary. "
"Do NOT include any preamble, greeting, or prefix."
)
PREVIOUS SUMMARY:
{self._previous_summary}
NEW TURNS TO INCORPORATE:
{content_to_summarize}
Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new progress. Move items from "In Progress" to "Done" when completed. Remove information only if it is clearly obsolete.
## Goal
[What the user is trying to accomplish — preserve from previous summary, update if goal evolved]
## Constraints & Preferences
[User preferences, coding style, constraints, important decisions — accumulate across compactions]
## Progress
### Done
[Completed work — include specific file paths, commands run, results obtained]
### In Progress
[Work currently underway]
### Blocked
[Any blockers or issues encountered]
## Key Decisions
[Important technical decisions and why they were made]
## Relevant Files
[Files read, modified, or created — with brief note on each. Accumulate across compactions.]
## Next Steps
[What needs to happen next to continue the work]
## Critical Context
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation]
## Tools & Patterns
[Which tools were used, how they were used effectively, and any tool-specific discoveries. Accumulate across compactions.]
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions.
Write only the summary body. Do not include any preamble or prefix."""
else:
# First compaction: summarize from scratch
prompt = f"""Create a structured handoff summary for a later assistant that will continue this conversation after earlier turns are compacted.
TURNS TO SUMMARIZE:
{content_to_summarize}
Use this exact structure:
## Goal
# Shared structured template (used by both paths).
# Key changes vs v1:
# - "Pending User Asks" section (from Claude Code) explicitly tracks
# unanswered questions so the model knows what's resolved vs open
# - "Remaining Work" replaces "Next Steps" to avoid reading as active
# instructions
# - "Resolved Questions" makes it clear which questions were already
# answered (prevents model from re-answering them)
_template_sections = f"""## Goal
[What the user is trying to accomplish]
## Constraints & Preferences
@@ -387,25 +381,74 @@ Use this exact structure:
## Key Decisions
[Important technical decisions and why they were made]
## Resolved Questions
[Questions the user asked that were ALREADY answered — include the answer so the next assistant does not re-answer them]
## Pending User Asks
[Questions or requests from the user that have NOT yet been answered or fulfilled. If none, write "None."]
## Relevant Files
[Files read, modified, or created — with brief note on each]
## Next Steps
[What needs to happen next to continue the work]
## Remaining Work
[What remains to be done — framed as context, not instructions]
## Critical Context
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation]
## Tools & Patterns
[Which tools were used, how they were used effectively, and any tool-specific discoveries (e.g., preferred flags, working invocations, successful command patterns)]
[Which tools were used, how they were used effectively, and any tool-specific discoveries]
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions. The goal is to prevent the next assistant from repeating work or losing important details.
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions.
Write only the summary body. Do not include any preamble or prefix."""
if self._previous_summary:
# Iterative update: preserve existing info, add new progress
prompt = f"""{_summarizer_preamble}
You are updating a context compaction summary. A previous compaction produced the summary below. New conversation turns have occurred since then and need to be incorporated.
PREVIOUS SUMMARY:
{self._previous_summary}
NEW TURNS TO INCORPORATE:
{content_to_summarize}
Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new progress. Move items from "In Progress" to "Done" when completed. Move answered questions to "Resolved Questions". Remove information only if it is clearly obsolete.
{_template_sections}"""
else:
# First compaction: summarize from scratch
prompt = f"""{_summarizer_preamble}
Create a structured handoff summary for a different assistant that will continue this conversation after earlier turns are compacted. The next assistant should be able to understand what happened without re-reading the original turns.
TURNS TO SUMMARIZE:
{content_to_summarize}
Use this exact structure:
{_template_sections}"""
# Inject focus topic guidance when the user provides one via /compress <focus>.
# This goes at the end of the prompt so it takes precedence.
if focus_topic:
prompt += f"""
FOCUS TOPIC: "{focus_topic}"
The user has requested that this compaction PRIORITISE preserving all information related to the focus topic above. For content related to "{focus_topic}", include full detail — exact values, file paths, command outputs, error messages, and decisions. For content NOT related to the focus topic, summarise more aggressively (brief one-liners or omit if truly irrelevant). The focus topic sections should receive roughly 60-70% of the summary token budget."""
try:
call_kwargs = {
"task": "compression",
"main_runtime": {
"model": self.model,
"provider": self.provider,
"base_url": self.base_url,
"api_key": self.api_key,
"api_mode": self.api_mode,
},
"messages": [{"role": "user", "content": prompt}],
"max_tokens": summary_budget * 2,
# timeout resolved from auxiliary.compression.timeout config by call_llm
@@ -620,7 +663,7 @@ Write only the summary body. Do not include any preamble or prefix."""
# Main compression entry point
# ------------------------------------------------------------------
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None) -> List[Dict[str, Any]]:
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None, focus_topic: str = None) -> List[Dict[str, Any]]:
"""Compress conversation messages by summarizing middle turns.
Algorithm:
@@ -632,6 +675,12 @@ Write only the summary body. Do not include any preamble or prefix."""
After compression, orphaned tool_call / tool_result pairs are cleaned
up so the API never receives mismatched IDs.
Args:
focus_topic: Optional focus string for guided compression. When
provided, the summariser will prioritise preserving information
related to this topic and be more aggressive about compressing
everything else. Inspired by Claude Code's ``/compact``.
"""
n_messages = len(messages)
# Only need head + 3 tail messages minimum (token budget decides the real tail size)
@@ -689,7 +738,7 @@ Write only the summary body. Do not include any preamble or prefix."""
)
# Phase 3: Generate structured summary
summary = self._generate_summary(turns_to_summarize)
summary = self._generate_summary(turns_to_summarize, focus_topic=focus_topic)
# Phase 4: Assemble compressed message list
compressed = []
@@ -744,7 +793,12 @@ Write only the summary body. Do not include any preamble or prefix."""
msg = messages[i].copy()
if _merge_summary_into_tail and i == compress_end:
original = msg.get("content") or ""
msg["content"] = summary + "\n\n" + original
msg["content"] = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---\n\n"
+ original
)
_merge_summary_into_tail = False
compressed.append(msg)
+1 -1
View File
@@ -26,7 +26,7 @@ Lifecycle:
"""
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List
class ContextEngine(ABC):
+98 -1
View File
@@ -18,12 +18,12 @@ import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
KIMI_CODE_BASE_URL,
PROVIDER_REGISTRY,
_auth_store_lock,
_codex_access_token_is_expiring,
_decode_jwt_claims,
_import_codex_cli_tokens,
_write_codex_cli_tokens,
_load_auth_store,
_load_provider_state,
_resolve_kimi_base_url,
@@ -288,6 +288,14 @@ def _iter_custom_providers(config: Optional[dict] = None):
return
custom_providers = config.get("custom_providers")
if not isinstance(custom_providers, list):
# Fall back to the v12+ providers dict via the compatibility layer
try:
from hermes_cli.config import get_compatible_custom_providers
custom_providers = get_compatible_custom_providers(config)
except Exception:
return
if not custom_providers:
return
for entry in custom_providers:
if not isinstance(entry, dict):
@@ -693,6 +701,14 @@ class CredentialPool:
self._replace_entry(synced, updated)
self._persist()
self._sync_device_code_entry_to_auth_store(updated)
try:
_write_codex_cli_tokens(
updated.access_token,
updated.refresh_token,
last_refresh=updated.last_refresh,
)
except Exception as wexc:
logger.debug("Failed to write refreshed Codex tokens to CLI file (retry): %s", wexc)
return updated
except Exception as retry_exc:
logger.debug("Codex retry refresh also failed: %s", retry_exc)
@@ -718,6 +734,17 @@ class CredentialPool:
# _seed_from_singletons() on the next load_pool() sees fresh state
# instead of re-seeding stale/consumed tokens.
self._sync_device_code_entry_to_auth_store(updated)
# Write refreshed tokens back to ~/.codex/auth.json so Codex CLI
# and VS Code don't hit "refresh_token_reused" on their next refresh.
if self.provider == "openai-codex":
try:
_write_codex_cli_tokens(
updated.access_token,
updated.refresh_token,
last_refresh=updated.last_refresh,
)
except Exception as wexc:
logger.debug("Failed to write refreshed Codex tokens to CLI file: %s", wexc)
return updated
def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
@@ -1125,9 +1152,79 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
},
)
elif provider == "copilot":
# Copilot tokens are resolved dynamically via `gh auth token` or
# env vars (COPILOT_GITHUB_TOKEN / GH_TOKEN). They don't live in
# the auth store or credential pool, so we resolve them here.
try:
from hermes_cli.copilot_auth import resolve_copilot_token
token, source = resolve_copilot_token()
if token:
source_name = "gh_cli" if "gh" in source.lower() else f"env:{source}"
active_sources.add(source_name)
changed |= _upsert_entry(
entries,
provider,
source_name,
{
"source": source_name,
"auth_type": AUTH_TYPE_API_KEY,
"access_token": token,
"label": source,
},
)
except Exception as exc:
logger.debug("Copilot token seed failed: %s", exc)
elif provider == "qwen-oauth":
# Qwen OAuth tokens live in ~/.qwen/oauth_creds.json, written by
# the Qwen CLI (`qwen auth qwen-oauth`). They aren't in the
# Hermes auth store or env vars, so resolve them here.
# Use refresh_if_expiring=False to avoid network calls during
# pool loading / provider discovery.
try:
from hermes_cli.auth import resolve_qwen_runtime_credentials
creds = resolve_qwen_runtime_credentials(refresh_if_expiring=False)
token = creds.get("api_key", "")
if token:
source_name = creds.get("source", "qwen-cli")
active_sources.add(source_name)
changed |= _upsert_entry(
entries,
provider,
source_name,
{
"source": source_name,
"auth_type": AUTH_TYPE_OAUTH,
"access_token": token,
"expires_at_ms": creds.get("expires_at_ms"),
"base_url": creds.get("base_url", ""),
"label": creds.get("auth_file", source_name),
},
)
except Exception as exc:
logger.debug("Qwen OAuth token seed failed: %s", exc)
elif provider == "openai-codex":
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
# Fallback: import from Codex CLI (~/.codex/auth.json) if Hermes auth
# store has no tokens. This mirrors resolve_codex_runtime_credentials()
# so that load_pool() and list_authenticated_providers() detect tokens
# that only exist in the Codex CLI shared file.
if not (isinstance(tokens, dict) and tokens.get("access_token")):
try:
from hermes_cli.auth import _import_codex_cli_tokens, _save_codex_tokens
cli_tokens = _import_codex_cli_tokens()
if cli_tokens:
logger.info("Importing Codex CLI tokens into Hermes auth store.")
_save_codex_tokens(cli_tokens)
# Re-read state after import
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
except Exception as exc:
logger.debug("Codex CLI token import failed: %s", exc)
if isinstance(tokens, dict) and tokens.get("access_token"):
active_sources.add("device_code")
changed |= _upsert_entry(
-6
View File
@@ -77,12 +77,6 @@ def _diff_ansi() -> dict[str, str]:
return _diff_colors_cached
def reset_diff_colors() -> None:
"""Reset cached diff colors (call after /skin switch)."""
global _diff_colors_cached
_diff_colors_cached = None
# Module-level helpers — each call resolves from the active skin lazily.
def _diff_dim(): return _diff_ansi()["dim"]
def _diff_file(): return _diff_ansi()["file"]
+12 -1
View File
@@ -13,7 +13,6 @@ from __future__ import annotations
import enum
import logging
import re
from dataclasses import dataclass, field
from typing import Any, Dict, Optional
@@ -157,6 +156,18 @@ _CONTEXT_OVERFLOW_PATTERNS = [
"prompt exceeds max length",
"max_tokens",
"maximum number of tokens",
# vLLM / local inference server patterns
"exceeds the max_model_len",
"max_model_len",
"prompt length", # "engine prompt length X exceeds"
"input is too long",
"maximum model length",
# Ollama patterns
"context length exceeded",
"truncating input",
# llama.cpp / llama-server patterns
"slot context", # "slot context: N tokens, prompt N tokens"
"n_ctx_slot",
# Chinese error messages (some providers return these)
"超过最大长度",
"上下文长度",
-1
View File
@@ -27,7 +27,6 @@ from agent.usage_pricing import (
DEFAULT_PRICING,
estimate_usage_cost,
format_duration_compact,
get_pricing,
has_known_pricing,
)
-1
View File
@@ -28,7 +28,6 @@ Usage in run_agent.py:
from __future__ import annotations
import json
import logging
import re
from typing import Any, Dict, List, Optional
+41 -15
View File
@@ -5,7 +5,6 @@ and run_agent.py for pre-flight context checks.
"""
import logging
import os
import re
import time
from pathlib import Path
@@ -24,17 +23,19 @@ logger = logging.getLogger(__name__)
# are preserved so the full model name reaches cache lookups and server queries.
_PROVIDER_PREFIXES: frozenset[str] = frozenset({
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"gemini", "zai", "kimi-coding", "minimax", "minimax-cn", "anthropic", "deepseek",
"gemini", "zai", "kimi-coding", "kimi-coding-cn", "minimax", "minimax-cn", "anthropic", "deepseek",
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
"qwen-oauth",
"xiaomi",
"arcee",
"custom", "local",
# Common aliases
"google", "google-gemini", "google-ai-studio",
"glm", "z-ai", "z.ai", "zhipu", "github", "github-copilot",
"github-models", "kimi", "moonshot", "claude", "deep-seek",
"github-models", "kimi", "moonshot", "kimi-cn", "moonshot-cn", "claude", "deep-seek",
"opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
"mimo", "xiaomi-mimo",
"arcee-ai", "arceeai",
"qwen-portal",
})
@@ -85,6 +86,11 @@ CONTEXT_PROBE_TIERS = [
# Default context length when no detection method succeeds.
DEFAULT_FALLBACK_CONTEXT = CONTEXT_PROBE_TIERS[0]
# Minimum context length required to run Hermes Agent. Models with fewer
# tokens cannot maintain enough working memory for tool-calling workflows.
# Sessions, model switches, and cron jobs should reject models below this.
MINIMUM_CONTEXT_LENGTH = 64_000
# Thin fallback defaults — only broad model family patterns.
# These fire only when provider is unknown AND models.dev/OpenRouter/Anthropic
# all miss. Replaced the previous 80+ entry dict.
@@ -100,9 +106,15 @@ DEFAULT_CONTEXT_LENGTHS = {
"claude-sonnet-4.6": 1000000,
# Catch-all for older Claude models (must sort after specific entries)
"claude": 200000,
# OpenAI
# OpenAI — GPT-5 family (most have 400k; specific overrides first)
# Source: https://developers.openai.com/api/docs/models
"gpt-5.4-nano": 400000, # 400k (not 1.05M like full 5.4)
"gpt-5.4-mini": 400000, # 400k (not 1.05M like full 5.4)
"gpt-5.4": 1050000, # GPT-5.4, GPT-5.4 Pro (1.05M context)
"gpt-5.3-codex-spark": 128000, # Spark variant has reduced 128k context
"gpt-5.1-chat": 128000, # Chat variant has 128k context
"gpt-5": 400000, # GPT-5.x base, mini, codex variants (400k)
"gpt-4.1": 1047576,
"gpt-5": 128000,
"gpt-4": 128000,
# Google
"gemini": 1048576,
@@ -144,6 +156,8 @@ DEFAULT_CONTEXT_LENGTHS = {
"kimi": 262144,
# Arcee
"trinity": 262144,
# OpenRouter
"elephant": 262144,
# Hugging Face Inference Providers — model IDs use org/name format
"Qwen/Qwen3.5-397B-A17B": 131072,
"Qwen/Qwen3.5-35B-A3B": 131072,
@@ -206,7 +220,9 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"api.anthropic.com": "anthropic",
"api.z.ai": "zai",
"api.moonshot.ai": "kimi-coding",
"api.moonshot.cn": "kimi-coding-cn",
"api.kimi.com": "kimi-coding",
"api.arcee.ai": "arcee",
"api.minimax": "minimax",
"dashscope.aliyuncs.com": "alibaba",
"dashscope-intl.aliyuncs.com": "alibaba",
@@ -770,12 +786,12 @@ def _query_local_context_length(model: str, base_url: str) -> Optional[int]:
resp = client.post(f"{server_url}/api/show", json={"name": model})
if resp.status_code == 200:
data = resp.json()
# Check model_info for context length
model_info = data.get("model_info", {})
for key, value in model_info.items():
if "context_length" in key and isinstance(value, (int, float)):
return int(value)
# Check parameters string for num_ctx
# Prefer explicit num_ctx from Modelfile parameters: this is
# the *runtime* context Ollama will actually allocate KV cache
# for. The GGUF model_info.context_length is the training max,
# which can be larger than num_ctx — using it here would let
# Hermes grow conversations past the runtime limit and Ollama
# would silently truncate. Matches query_ollama_num_ctx().
params = data.get("parameters", "")
if "num_ctx" in params:
for line in params.split("\n"):
@@ -786,6 +802,11 @@ def _query_local_context_length(model: str, base_url: str) -> Optional[int]:
return int(parts[-1])
except ValueError:
pass
# Fall back to GGUF model_info context_length (training max)
model_info = data.get("model_info", {})
for key, value in model_info.items():
if "context_length" in key and isinstance(value, (int, float)):
return int(value)
# LM Studio native API: /api/v1/models returns max_context_length.
# This is more reliable than the OpenAI-compat /v1/models which
@@ -1040,16 +1061,21 @@ def get_model_context_length(
def estimate_tokens_rough(text: str) -> int:
"""Rough token estimate (~4 chars/token) for pre-flight checks."""
"""Rough token estimate (~4 chars/token) for pre-flight checks.
Uses ceiling division so short texts (1-3 chars) never estimate as
0 tokens, which would cause the compressor and pre-flight checks to
systematically undercount when many short tool results are present.
"""
if not text:
return 0
return len(text) // 4
return (len(text) + 3) // 4
def estimate_messages_tokens_rough(messages: List[Dict[str, Any]]) -> int:
"""Rough token estimate for a message list (pre-flight only)."""
total_chars = sum(len(str(msg)) for msg in messages)
return total_chars // 4
return (total_chars + 3) // 4
def estimate_request_tokens_rough(
@@ -1072,4 +1098,4 @@ def estimate_request_tokens_rough(
total_chars += sum(len(str(msg)) for msg in messages)
if tools:
total_chars += len(str(tools))
return total_chars // 4
return (total_chars + 3) // 4
+3 -96
View File
@@ -18,10 +18,8 @@ Other modules should import the dataclasses and query functions from here
rather than parsing the raw JSON themselves.
"""
import difflib
import json
import logging
import os
import time
from dataclasses import dataclass
from pathlib import Path
@@ -144,8 +142,11 @@ class ProviderInfo:
PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"openrouter": "openrouter",
"anthropic": "anthropic",
"openai": "openai",
"openai-codex": "openai",
"zai": "zai",
"kimi-coding": "kimi-for-coding",
"kimi-coding-cn": "kimi-for-coding",
"minimax": "minimax",
"minimax-cn": "minimax-cn",
"deepseek": "deepseek",
@@ -174,13 +175,6 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
_MODELS_DEV_TO_PROVIDER: Optional[Dict[str, str]] = None
def _get_reverse_mapping() -> Dict[str, str]:
"""Return models.dev ID → Hermes provider ID mapping."""
global _MODELS_DEV_TO_PROVIDER
if _MODELS_DEV_TO_PROVIDER is None:
_MODELS_DEV_TO_PROVIDER = {v: k for k, v in PROVIDER_TO_MODELS_DEV.items()}
return _MODELS_DEV_TO_PROVIDER
def _get_cache_path() -> Path:
"""Return path to disk cache file."""
@@ -461,93 +455,6 @@ def list_agentic_models(provider: str) -> List[str]:
return result
def search_models_dev(
query: str, provider: str = None, limit: int = 5
) -> List[Dict[str, Any]]:
"""Fuzzy search across models.dev catalog. Returns matching model entries.
Args:
query: Search string to match against model IDs.
provider: Optional Hermes provider ID to restrict search scope.
If None, searches across all providers in PROVIDER_TO_MODELS_DEV.
limit: Maximum number of results to return.
Returns:
List of dicts, each containing 'provider', 'model_id', and the full
model 'entry' from models.dev.
"""
data = fetch_models_dev()
if not data:
return []
# Build list of (provider_id, model_id, entry) candidates
candidates: List[tuple] = []
if provider is not None:
# Search only the specified provider
mdev_provider_id = PROVIDER_TO_MODELS_DEV.get(provider)
if not mdev_provider_id:
return []
provider_data = data.get(mdev_provider_id, {})
if isinstance(provider_data, dict):
models = provider_data.get("models", {})
if isinstance(models, dict):
for mid, mdata in models.items():
candidates.append((provider, mid, mdata))
else:
# Search across all mapped providers
for hermes_prov, mdev_prov in PROVIDER_TO_MODELS_DEV.items():
provider_data = data.get(mdev_prov, {})
if isinstance(provider_data, dict):
models = provider_data.get("models", {})
if isinstance(models, dict):
for mid, mdata in models.items():
candidates.append((hermes_prov, mid, mdata))
if not candidates:
return []
# Use difflib for fuzzy matching — case-insensitive comparison
model_ids_lower = [c[1].lower() for c in candidates]
query_lower = query.lower()
# First try exact substring matches (more intuitive than pure edit-distance)
substring_matches = []
for prov, mid, mdata in candidates:
if query_lower in mid.lower():
substring_matches.append({"provider": prov, "model_id": mid, "entry": mdata})
# Then add difflib fuzzy matches for any remaining slots
fuzzy_ids = difflib.get_close_matches(
query_lower, model_ids_lower, n=limit * 2, cutoff=0.4
)
seen_ids: set = set()
results: List[Dict[str, Any]] = []
# Prioritize substring matches
for match in substring_matches:
key = (match["provider"], match["model_id"])
if key not in seen_ids:
seen_ids.add(key)
results.append(match)
if len(results) >= limit:
return results
# Add fuzzy matches
for fid in fuzzy_ids:
# Find original-case candidates matching this lowered ID
for prov, mid, mdata in candidates:
if mid.lower() == fid:
key = (prov, mid)
if key not in seen_ids:
seen_ids.add(key)
results.append({"provider": prov, "model_id": mid, "entry": mdata})
if len(results) >= limit:
return results
return results
# ---------------------------------------------------------------------------
# Rich dataclass constructors — parse raw models.dev JSON into dataclasses
+60 -4
View File
@@ -12,7 +12,7 @@ import threading
from collections import OrderedDict
from pathlib import Path
from hermes_constants import get_hermes_home, get_skills_dir
from hermes_constants import get_hermes_home, get_skills_dir, is_wsl
from typing import Optional
from agent.skill_utils import (
@@ -364,8 +364,56 @@ PLATFORM_HINTS = {
"documents. You can also include image URLs in markdown format ![alt](url) and they "
"will be downloaded and sent as native media when possible."
),
"wecom": (
"You are on WeCom (企业微信 / Enterprise WeChat). Markdown formatting is supported. "
"You CAN send media files natively — to deliver a file to the user, include "
"MEDIA:/absolute/path/to/file in your response. The file will be sent as a native "
"WeCom attachment: images (.jpg, .png, .webp) are sent as photos (up to 10 MB), "
"other files (.pdf, .docx, .xlsx, .md, .txt, etc.) arrive as downloadable documents "
"(up to 20 MB), and videos (.mp4) play inline. Voice messages are supported but "
"must be in AMR format — other audio formats are automatically sent as file attachments. "
"You can also include image URLs in markdown format ![alt](url) and they will be "
"downloaded and sent as native photos. Do NOT tell the user you lack file-sending "
"capability — use MEDIA: syntax whenever a file delivery is appropriate."
),
"qqbot": (
"You are on QQ, a popular Chinese messaging platform. QQ supports markdown formatting "
"and emoji. You can send media files natively: include MEDIA:/absolute/path/to/file in "
"your response. Images are sent as native photos, and other files arrive as downloadable "
"documents."
),
}
# ---------------------------------------------------------------------------
# Environment hints — execution-environment awareness for the agent.
# Unlike PLATFORM_HINTS (which describe the messaging channel), these describe
# the machine/OS the agent's tools actually run on.
# ---------------------------------------------------------------------------
WSL_ENVIRONMENT_HINT = (
"You are running inside WSL (Windows Subsystem for Linux). "
"The Windows host filesystem is mounted under /mnt/ — "
"/mnt/c/ is the C: drive, /mnt/d/ is D:, etc. "
"The user's Windows files are typically at "
"/mnt/c/Users/<username>/Desktop/, Documents/, Downloads/, etc. "
"When the user references Windows paths or desktop files, translate "
"to the /mnt/c/ equivalent. You can list /mnt/c/Users/ to discover "
"the Windows username if needed."
)
def build_environment_hints() -> str:
"""Return environment-specific guidance for the system prompt.
Detects WSL, and can be extended for Termux, Docker, etc.
Returns an empty string when no special environment is detected.
"""
hints: list[str] = []
if is_wsl():
hints.append(WSL_ENVIRONMENT_HINT)
return "\n\n".join(hints)
CONTEXT_FILE_MAX_CHARS = 20_000
CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
@@ -726,8 +774,16 @@ def build_skills_system_prompt(
result = (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"Before replying, scan the skills below. If a skill matches or is even partially relevant "
"to your task, you MUST load it with skill_view(name) and follow its instructions. "
"Err on the side of loading — it is always better to have context you don't need "
"than to miss critical steps, pitfalls, or established workflows. "
"Skills contain specialized knowledge — API endpoints, tool-specific commands, "
"and proven workflows that outperform general-purpose approaches. Load the skill "
"even if you think you could handle the task with basic tools like web_search or terminal. "
"Skills also encode the user's preferred approach, conventions, and quality standards "
"for tasks like code review, planning, and testing — load them even for tasks you "
"already know how to do, because the skill defines how it should be done here.\n"
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
@@ -737,7 +793,7 @@ def build_skills_system_prompt(
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"If none match, proceed normally without loading a skill."
"Only proceed without loading a skill if genuinely none are relevant to the task."
)
# ── Store in LRU cache ────────────────────────────────────────────
+1 -1
View File
@@ -24,7 +24,7 @@ from __future__ import annotations
import time
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional
from typing import Any, Mapping, Optional
@dataclass
+23 -1
View File
@@ -10,7 +10,7 @@ import os
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Set, Tuple
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import get_config_path, get_skills_dir
@@ -441,3 +441,25 @@ def iter_skill_index_files(skills_dir: Path, filename: str):
matches.append(Path(root) / filename)
for path in sorted(matches, key=lambda p: str(p.relative_to(skills_dir))):
yield path
# ── Namespace helpers for plugin-provided skills ───────────────────────────
_NAMESPACE_RE = re.compile(r"^[a-zA-Z0-9_-]+$")
def parse_qualified_name(name: str) -> Tuple[Optional[str], str]:
"""Split ``'namespace:skill-name'`` into ``(namespace, bare_name)``.
Returns ``(None, name)`` when there is no ``':'``.
"""
if ":" not in name:
return None, name
return tuple(name.split(":", 1)) # type: ignore[return-value]
def is_valid_namespace(candidate: Optional[str]) -> bool:
"""Check whether *candidate* is a valid namespace (``[a-zA-Z0-9_-]+``)."""
if not candidate:
return False
return bool(_NAMESPACE_RE.match(candidate))
+1 -1
View File
@@ -36,7 +36,7 @@ def generate_title(user_message: str, assistant_response: str, timeout: float =
try:
response = call_llm(
task="compression", # reuse compression task config (cheap/fast model)
task="title_generation",
messages=messages,
max_tokens=30,
temperature=0.3,
-19
View File
@@ -575,25 +575,6 @@ def has_known_pricing(
return entry is not None
def get_pricing(
model_name: str,
provider: Optional[str] = None,
base_url: Optional[str] = None,
api_key: Optional[str] = None,
) -> Dict[str, float]:
"""Backward-compatible thin wrapper for legacy callers.
Returns only non-cache input/output fields when a pricing entry exists.
Unknown routes return zeroes.
"""
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url, api_key=api_key)
if not entry:
return {"input": 0.0, "output": 0.0}
return {
"input": float(entry.input_cost_per_million or _ZERO),
"output": float(entry.output_cost_per_million or _ZERO),
}
def format_duration_compact(seconds: float) -> str:
if seconds < 60:
+12 -11
View File
@@ -25,6 +25,7 @@ model:
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "xiaomi" - Xiaomi MiMo (requires: XIAOMI_API_KEY)
# "arcee" - Arcee AI Trinity models (requires: ARCEEAI_API_KEY)
# "kilocode" - KiloCode gateway (requires: KILOCODE_API_KEY)
# "ai-gateway" - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
#
@@ -309,15 +310,8 @@ compression:
# compression of older turns.
protect_last_n: 20
# Model to use for generating summaries (fast/cheap recommended)
# This model compresses the middle turns into a concise summary.
# IMPORTANT: it receives the full middle section of the conversation, so it
# MUST support a context length at least as large as your main model's.
summary_model: "google/gemini-3-flash-preview"
# Provider for the summary model (default: "auto")
# Options: "auto", "openrouter", "nous", "main"
# summary_provider: "auto"
# To pin a specific model/provider for compression summaries, use the
# auxiliary section below (auxiliary.compression.provider / model).
# =============================================================================
# Auxiliary Models (Advanced — Experimental)
@@ -529,7 +523,7 @@ agent:
# - A preset like "hermes-cli" or "hermes-telegram" (curated tool set)
# - A list of individual toolsets to compose your own (see list below)
#
# Supported platform keys: cli, telegram, discord, whatsapp, slack
# Supported platform keys: cli, telegram, discord, whatsapp, slack, qqbot
#
# Examples:
#
@@ -558,6 +552,7 @@ agent:
# slack: hermes-slack (same as telegram)
# signal: hermes-signal (same as telegram)
# homeassistant: hermes-homeassistant (same as telegram)
# qqbot: hermes-qqbot (same as telegram)
#
platform_toolsets:
cli: [hermes-cli]
@@ -567,6 +562,7 @@ platform_toolsets:
slack: [hermes-slack]
signal: [hermes-signal]
homeassistant: [hermes-homeassistant]
qqbot: [hermes-qqbot]
# ─────────────────────────────────────────────────────────────────────────────
# Available toolsets (use these names in platform_toolsets or the toolsets list)
@@ -774,6 +770,11 @@ display:
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# Gateway-only natural mid-turn assistant updates.
# When true, completed assistant status messages are sent as separate chat
# messages. This is independent of tool_progress and gateway streaming.
interim_assistant_messages: true
# What Enter does when Hermes is already busy in the CLI.
# interrupt: Interrupt the current run and redirect Hermes (default)
# queue: Queue your message for the next turn
@@ -782,7 +783,7 @@ display:
# Background process notifications (gateway/messaging only).
# Controls how chatty the process watcher is when you use
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.
# terminal(background=true, notify_on_complete=true) from Telegram/Discord/etc.
# off: No watcher messages at all
# result: Only the final completion message
# error: Only the final message when exit code != 0
+732 -87
View File
File diff suppressed because it is too large Load Diff
+26
View File
@@ -45,6 +45,7 @@ _KNOWN_DELIVERY_PLATFORMS = frozenset({
"telegram", "discord", "slack", "whatsapp", "signal",
"matrix", "mattermost", "homeassistant", "dingtalk", "feishu",
"wecom", "wecom_callback", "weixin", "sms", "email", "webhook", "bluebubbles",
"qqbot",
})
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
@@ -219,6 +220,21 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
chat_id = target["chat_id"]
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
"Job '%s': origin has thread_id=%s but delivery target lost it "
"(deliver=%s, target=%s)",
job["id"], origin_thread, job.get("deliver", "local"), target,
)
elif thread_id:
logger.debug(
"Job '%s': delivering to %s:%s thread_id=%s",
job["id"], platform_name, chat_id, thread_id,
)
from tools.send_message_tool import _send_to_platform
from gateway.config import load_gateway_config, Platform
@@ -239,6 +255,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
"email": Platform.EMAIL,
"sms": Platform.SMS,
"bluebubbles": Platform.BLUEBUBBLES,
"qqbot": Platform.QQBOT,
}
platform = platform_map.get(platform_name.lower())
if not platform:
@@ -626,6 +643,15 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
except Exception as e:
logger.warning("Job '%s': failed to load config.yaml, using defaults: %s", job_id, e)
# Apply IPv4 preference if configured.
try:
from hermes_constants import apply_ipv4_preference
_net_cfg = _cfg.get("network", {})
if isinstance(_net_cfg, dict) and _net_cfg.get("force_ipv4"):
apply_ipv4_preference(force=True)
except Exception:
pass
# Reasoning config from config.yaml
from hermes_constants import parse_reasoning_effort
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
+27
View File
@@ -5,6 +5,33 @@ set -e
HERMES_HOME="/opt/data"
INSTALL_DIR="/opt/hermes"
# --- Privilege dropping via gosu ---
# When started as root (the default), optionally remap the hermes user/group
# to match host-side ownership, fix volume permissions, then re-exec as hermes.
if [ "$(id -u)" = "0" ]; then
if [ -n "$HERMES_UID" ] && [ "$HERMES_UID" != "$(id -u hermes)" ]; then
echo "Changing hermes UID to $HERMES_UID"
usermod -u "$HERMES_UID" hermes
fi
if [ -n "$HERMES_GID" ] && [ "$HERMES_GID" != "$(id -g hermes)" ]; then
echo "Changing hermes GID to $HERMES_GID"
groupmod -g "$HERMES_GID" hermes
fi
actual_hermes_uid=$(id -u hermes)
if [ "$(stat -c %u "$HERMES_HOME" 2>/dev/null)" != "$actual_hermes_uid" ]; then
echo "$HERMES_HOME is not owned by $actual_hermes_uid, fixing"
chown -R hermes:hermes "$HERMES_HOME"
fi
echo "Dropping root privileges"
exec gosu hermes "$0" "$@"
fi
# --- Running as hermes from here ---
source "${INSTALL_DIR}/.venv/bin/activate"
# Create essential directory structure. Cache and platform directories
# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
# demand by the application — don't pre-create them here so new installs
+1 -1
View File
@@ -118,7 +118,7 @@ For executed migrations, the full report is saved to `~/.hermes/migration/opencl
## Troubleshooting
### "OpenClaw directory not found"
The migration looks for `~/.openclaw` by default, then tries `~/.clawdbot` and `~/.moldbot`. If your OpenClaw is installed elsewhere, use `--source`:
The migration looks for `~/.openclaw` by default, then tries `~/.clawdbot` and `~/.moltbot`. If your OpenClaw is installed elsewhere, use `--source`:
```bash
hermes claw migrate --source /path/to/.openclaw
```
+8
View File
@@ -41,6 +41,14 @@ colors:
session_label: "#DAA520" # Session label
session_border: "#8B8682" # Session ID dim color
# TUI surfaces
status_bar_bg: "#1a1a2e" # Status / usage bar background
voice_status_bg: "#1a1a2e" # Voice-mode badge background
completion_menu_bg: "#1a1a2e" # Completion list background
completion_menu_current_bg: "#333355" # Active completion row background
completion_menu_meta_bg: "#1a1a2e" # Completion meta column background
completion_menu_meta_current_bg: "#333355" # Active completion meta background
# ── Spinner ─────────────────────────────────────────────────────────────────
# Customize the animated spinner shown during API calls and tool execution.
spinner:
+329
View File
@@ -0,0 +1,329 @@
# Container-Aware CLI Review Fixes Spec
**PR:** NousResearch/hermes-agent#7543
**Review:** cursor[bot] bugbot review (4094049442) + two prior rounds
**Date:** 2026-04-12
**Branch:** `feat/container-aware-cli-clean`
## Review Issues Summary
Six issues were raised across three bugbot review rounds. Three were fixed in intermediate commits (38277a6a, 726cf90f). This spec addresses remaining design concerns surfaced by those reviews and simplifies the implementation based on interview decisions.
| # | Issue | Severity | Status |
|---|-------|----------|--------|
| 1 | `os.execvp` retry loop unreachable | Medium | Fixed in 79e8cd12 (switched to subprocess.run) |
| 2 | Redundant `shutil.which("sudo")` | Medium | Fixed in 38277a6a (reuses `sudo` var) |
| 3 | Missing `chown -h` on symlink update | Low | Fixed in 38277a6a |
| 4 | Container routing after `parse_args()` | High | Fixed in 726cf90f |
| 5 | Hardcoded `/home/${user}` | Medium | Fixed in 726cf90f |
| 6 | Group membership not gated on `container.enable` | Low | Fixed in 726cf90f |
The mechanical fixes are in place but the overall design needs revision. The retry loop, error swallowing, and process model have deeper issues than what the bugbot flagged.
---
## Spec: Revised `_exec_in_container`
### Design Principles
1. **Let it crash.** No silent fallbacks. If `.container-mode` exists but something goes wrong, the error propagates naturally (Python traceback). The only case where container routing is skipped is when `.container-mode` doesn't exist or `HERMES_DEV=1`.
2. **No retries.** Probe once for sudo, exec once. If it fails, docker/podman's stderr reaches the user verbatim.
3. **Completely transparent.** No error wrapping, no prefixes, no spinners. Docker's output goes straight through.
4. **`os.execvp` on the happy path.** Replace the Python process entirely so there's no idle parent during interactive sessions. Note: `execvp` never returns on success (process is replaced) and raises `OSError` on failure (it does not return a value). The container process's exit code becomes the process exit code by definition — no explicit propagation needed.
5. **One human-readable exception to "let it crash".** `subprocess.TimeoutExpired` from the sudo probe gets a specific catch with a readable message, since a raw traceback for "your Docker daemon is slow" is confusing. All other exceptions propagate naturally.
### Execution Flow
```
1. get_container_exec_info()
- HERMES_DEV=1 → return None (skip routing)
- Inside container → return None (skip routing)
- .container-mode doesn't exist → return None (skip routing)
- .container-mode exists → parse and return dict
- .container-mode exists but malformed/unreadable → LET IT CRASH (no try/except)
2. _exec_in_container(container_info, sys.argv[1:])
a. shutil.which(backend) → if None, print "{backend} not found on PATH" and sys.exit(1)
b. Sudo probe: subprocess.run([runtime, "inspect", "--format", "ok", container_name], timeout=15)
- If succeeds → needs_sudo = False
- If fails → try subprocess.run([sudo, "-n", runtime, "inspect", ...], timeout=15)
- If succeeds → needs_sudo = True
- If fails → print error with sudoers hint (including why -n is required) and sys.exit(1)
- If TimeoutExpired → catch specifically, print human-readable message about slow daemon
c. Build exec_cmd: [sudo? + runtime, "exec", tty_flags, "-u", exec_user, env_flags, container, hermes_bin, *cli_args]
d. os.execvp(exec_cmd[0], exec_cmd)
- On success: process is replaced — Python is gone, container exit code IS the process exit code
- On OSError: let it crash (natural traceback)
```
### Changes to `hermes_cli/main.py`
#### `_exec_in_container` — rewrite
Remove:
- The entire retry loop (`max_retries`, `for attempt in range(...)`)
- Spinner logic (`"Waiting for container..."`, dots)
- Exit code classification (125/126/127 handling)
- `subprocess.run` for the exec call (keep it only for the sudo probe)
- Special TTY vs non-TTY retry counts
- The `time` import (no longer needed)
Change:
- Use `os.execvp(exec_cmd[0], exec_cmd)` as the final call
- Keep the `subprocess` import only for the sudo probe
- Keep TTY detection for the `-it` vs `-i` flag
- Keep env var forwarding (TERM, COLORTERM, LANG, LC_ALL)
- Keep the sudo probe as-is (it's the one "smart" part)
- Bump probe `timeout` from 5s to 15s — cold podman on a loaded machine needs headroom
- Catch `subprocess.TimeoutExpired` specifically on both probe calls — print a readable message about the daemon being unresponsive instead of a raw traceback
- Expand the sudoers hint error message to explain *why* `-n` (non-interactive) is required: a password prompt would hang the CLI or break piped commands
The function becomes roughly:
```python
def _exec_in_container(container_info: dict, cli_args: list):
"""Replace the current process with a command inside the managed container.
Probes whether sudo is needed (rootful containers), then os.execvp
into the container. If exec fails, the OS error propagates naturally.
"""
import shutil
import subprocess
backend = container_info["backend"]
container_name = container_info["container_name"]
exec_user = container_info["exec_user"]
hermes_bin = container_info["hermes_bin"]
runtime = shutil.which(backend)
if not runtime:
print(f"Error: {backend} not found on PATH. Cannot route to container.",
file=sys.stderr)
sys.exit(1)
# Probe whether we need sudo to see the rootful container.
# Timeout is 15s — cold podman on a loaded machine can take a while.
# TimeoutExpired is caught specifically for a human-readable message;
# all other exceptions propagate naturally.
needs_sudo = False
sudo = None
try:
probe = subprocess.run(
[runtime, "inspect", "--format", "ok", container_name],
capture_output=True, text=True, timeout=15,
)
except subprocess.TimeoutExpired:
print(
f"Error: timed out waiting for {backend} to respond.\n"
f"The {backend} daemon may be unresponsive or starting up.",
file=sys.stderr,
)
sys.exit(1)
if probe.returncode != 0:
sudo = shutil.which("sudo")
if sudo:
try:
probe2 = subprocess.run(
[sudo, "-n", runtime, "inspect", "--format", "ok", container_name],
capture_output=True, text=True, timeout=15,
)
except subprocess.TimeoutExpired:
print(
f"Error: timed out waiting for sudo {backend} to respond.",
file=sys.stderr,
)
sys.exit(1)
if probe2.returncode == 0:
needs_sudo = True
else:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"\n"
f"The NixOS service runs the container as root. Your user cannot\n"
f"see it because {backend} uses per-user namespaces.\n"
f"\n"
f"Fix: grant passwordless sudo for {backend}. The -n (non-interactive)\n"
f"flag is required because the CLI calls sudo non-interactively —\n"
f"a password prompt would hang or break piped commands:\n"
f"\n"
f' security.sudo.extraRules = [{{\n'
f' users = [ "{os.getenv("USER", "your-user")}" ];\n'
f' commands = [{{ command = "{runtime}"; options = [ "NOPASSWD" ]; }}];\n'
f' }}];\n'
f"\n"
f"Or run: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
else:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"The container may be running under root. Try: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
is_tty = sys.stdin.isatty()
tty_flags = ["-it"] if is_tty else ["-i"]
env_flags = []
for var in ("TERM", "COLORTERM", "LANG", "LC_ALL"):
val = os.environ.get(var)
if val:
env_flags.extend(["-e", f"{var}={val}"])
cmd_prefix = [sudo, "-n", runtime] if needs_sudo else [runtime]
exec_cmd = (
cmd_prefix + ["exec"]
+ tty_flags
+ ["-u", exec_user]
+ env_flags
+ [container_name, hermes_bin]
+ cli_args
)
# execvp replaces this process entirely — it never returns on success.
# On failure it raises OSError, which propagates naturally.
os.execvp(exec_cmd[0], exec_cmd)
```
#### Container routing call site in `main()` — remove try/except
Current:
```python
try:
from hermes_cli.config import get_container_exec_info
container_info = get_container_exec_info()
if container_info:
_exec_in_container(container_info, sys.argv[1:])
sys.exit(1) # exec failed if we reach here
except SystemExit:
raise
except Exception:
pass # Container routing unavailable, proceed locally
```
Revised:
```python
from hermes_cli.config import get_container_exec_info
container_info = get_container_exec_info()
if container_info:
_exec_in_container(container_info, sys.argv[1:])
# Unreachable: os.execvp never returns on success (process is replaced)
# and raises OSError on failure (which propagates as a traceback).
# This line exists only as a defensive assertion.
sys.exit(1)
```
No try/except. If `.container-mode` doesn't exist, `get_container_exec_info()` returns `None` and we skip routing. If it exists but is broken, the exception propagates with a natural traceback.
Note: `sys.exit(1)` after `_exec_in_container` is dead code in all paths — `os.execvp` either replaces the process or raises. It's kept as a belt-and-suspenders assertion with a comment marking it unreachable, not as actual error handling.
### Changes to `hermes_cli/config.py`
#### `get_container_exec_info` — remove inner try/except
Current code catches `(OSError, IOError)` and returns `None`. This silently hides permission errors, corrupt files, etc.
Change: Remove the try/except around file reading. Keep the early returns for `HERMES_DEV=1` and `_is_inside_container()`. The `FileNotFoundError` from `open()` when `.container-mode` doesn't exist should still return `None` (this is the "container mode not enabled" case). All other exceptions propagate.
```python
def get_container_exec_info() -> Optional[dict]:
if os.environ.get("HERMES_DEV") == "1":
return None
if _is_inside_container():
return None
container_mode_file = get_hermes_home() / ".container-mode"
try:
with open(container_mode_file, "r") as f:
# ... parse key=value lines ...
except FileNotFoundError:
return None
# All other exceptions (PermissionError, malformed data, etc.) propagate
return { ... }
```
---
## Spec: NixOS Module Changes
### Symlink creation — simplify to two branches
Current: 4 branches (symlink exists, directory exists, other file, doesn't exist).
Revised: 2 branches.
```bash
if [ -d "${symlinkPath}" ] && [ ! -L "${symlinkPath}" ]; then
# Real directory — back it up, then create symlink
_backup="${symlinkPath}.bak.$(date +%s)"
echo "hermes-agent: backing up existing ${symlinkPath} to $_backup"
mv "${symlinkPath}" "$_backup"
fi
# For everything else (symlink, doesn't exist, etc.) — just force-create
ln -sfn "${target}" "${symlinkPath}"
chown -h ${user}:${cfg.group} "${symlinkPath}"
```
`ln -sfn` handles: existing symlink (replaces), doesn't exist (creates), and after the `mv` above (creates). The only case that needs special handling is a real directory, because `ln -sfn` cannot atomically replace a directory.
Note: there is a theoretical race between the `[ -d ... ]` check and the `mv` (something could create/remove the directory in between). In practice this is a NixOS activation script running as root during `nixos-rebuild switch` — no other process should be touching `~/.hermes` at that moment. Not worth adding locking for.
### Sudoers — document, don't auto-configure
Do NOT add `security.sudo.extraRules` to the module. Document the sudoers requirement in the module's description/comments and in the error message the CLI prints when sudo probe fails.
### Group membership gating — keep as-is
The fix in 726cf90f (`cfg.container.enable && cfg.container.hostUsers != []`) is correct. Leftover group membership when container mode is disabled is harmless. No cleanup needed.
---
## Spec: Test Rewrite
The existing test file (`tests/hermes_cli/test_container_aware_cli.py`) has 16 tests. With the simplified exec model, several are obsolete.
### Tests to keep (update as needed)
- `test_is_inside_container_dockerenv` — unchanged
- `test_is_inside_container_containerenv` — unchanged
- `test_is_inside_container_cgroup_docker` — unchanged
- `test_is_inside_container_false_on_host` — unchanged
- `test_get_container_exec_info_returns_metadata` — unchanged
- `test_get_container_exec_info_none_inside_container` — unchanged
- `test_get_container_exec_info_none_without_file` — unchanged
- `test_get_container_exec_info_skipped_when_hermes_dev` — unchanged
- `test_get_container_exec_info_not_skipped_when_hermes_dev_zero` — unchanged
- `test_get_container_exec_info_defaults` — unchanged
- `test_get_container_exec_info_docker_backend` — unchanged
### Tests to add
- `test_get_container_exec_info_crashes_on_permission_error` — verify that `PermissionError` propagates (no silent `None` return)
- `test_exec_in_container_calls_execvp` — verify `os.execvp` is called with correct args (runtime, tty flags, user, env, container, binary, cli args)
- `test_exec_in_container_sudo_probe_sets_prefix` — verify that when first probe fails and sudo probe succeeds, `os.execvp` is called with `sudo -n` prefix
- `test_exec_in_container_no_runtime_hard_fails` — keep existing, verify `sys.exit(1)` when `shutil.which` returns None
- `test_exec_in_container_non_tty_uses_i_only` — update to check `os.execvp` args instead of `subprocess.run` args
- `test_exec_in_container_probe_timeout_prints_message` — verify that `subprocess.TimeoutExpired` from the probe produces a human-readable error and `sys.exit(1)`, not a raw traceback
- `test_exec_in_container_container_not_running_no_sudo` — verify the path where runtime exists (`shutil.which` returns a path) but probe returns non-zero and no sudo is available. Should print the "container may be running under root" error. This is distinct from `no_runtime_hard_fails` which covers `shutil.which` returning None.
### Tests to delete
- `test_exec_in_container_tty_retries_on_container_failure` — retry loop removed
- `test_exec_in_container_non_tty_retries_silently_exits_126` — retry loop removed
- `test_exec_in_container_propagates_hermes_exit_code` — no subprocess.run to check exit codes; execvp replaces the process. Note: exit code propagation still works correctly — when `os.execvp` succeeds, the container's process *becomes* this process, so its exit code is the process exit code by OS semantics. No application code needed, no test needed. A comment in the function docstring documents this intent for future readers.
---
## Out of Scope
- Auto-configuring sudoers rules in the NixOS module
- Any changes to `get_container_exec_info` parsing logic beyond the try/except narrowing
- Changes to `.container-mode` file format
- Changes to the `HERMES_DEV=1` bypass
- Changes to container detection logic (`_is_inside_container`)
-2
View File
@@ -18,9 +18,7 @@ suppress delivery.
"""
import logging
import os
import threading
from pathlib import Path
logger = logging.getLogger("hooks.boot-md")
+71 -1
View File
@@ -66,6 +66,7 @@ class Platform(Enum):
WECOM_CALLBACK = "wecom_callback"
WEIXIN = "weixin"
BLUEBUBBLES = "bluebubbles"
QQBOT = "qqbot"
@dataclass
@@ -303,6 +304,9 @@ class GatewayConfig:
# BlueBubbles uses extra dict for local server config
elif platform == Platform.BLUEBUBBLES and config.extra.get("server_url") and config.extra.get("password"):
connected.append(platform)
# QQBot uses extra dict for app credentials
elif platform == Platform.QQBOT and config.extra.get("app_id") and config.extra.get("client_secret"):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
@@ -621,6 +625,11 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["TELEGRAM_FREE_RESPONSE_CHATS"] = str(frc)
ignored_threads = telegram_cfg.get("ignored_threads")
if ignored_threads is not None and not os.getenv("TELEGRAM_IGNORED_THREADS"):
if isinstance(ignored_threads, list):
ignored_threads = ",".join(str(v) for v in ignored_threads)
os.environ["TELEGRAM_IGNORED_THREADS"] = str(ignored_threads)
if "reactions" in telegram_cfg and not os.getenv("TELEGRAM_REACTIONS"):
os.environ["TELEGRAM_REACTIONS"] = str(telegram_cfg["reactions"]).lower()
@@ -665,6 +674,17 @@ def load_gateway_config() -> GatewayConfig:
_apply_env_overrides(config)
# --- Validate loaded values ---
_validate_gateway_config(config)
return config
def _validate_gateway_config(config: "GatewayConfig") -> None:
"""Validate and sanitize a loaded GatewayConfig in place.
Called by ``load_gateway_config()`` after all config sources are merged.
Extracted as a separate function for testability.
"""
policy = config.default_reset_policy
if not (0 <= policy.at_hour <= 23):
@@ -701,7 +721,31 @@ def load_gateway_config() -> GatewayConfig:
platform.value, env_name,
)
return config
# Reject known-weak placeholder tokens.
# Ported from openclaw/openclaw#64586: users who copy .env.example
# without changing placeholder values get a clear startup error instead
# of a confusing "auth failed" from the platform API.
try:
from hermes_cli.auth import has_usable_secret
except ImportError:
has_usable_secret = None # type: ignore[assignment]
if has_usable_secret is not None:
for platform, pconfig in config.platforms.items():
if not pconfig.enabled:
continue
env_name = _token_env_names.get(platform)
if not env_name:
continue
token = pconfig.token
if token and token.strip() and not has_usable_secret(token, min_length=4):
logger.error(
"%s is enabled but %s is set to a placeholder value ('%s'). "
"Set a real bot token before starting the gateway. "
"The adapter will NOT be started.",
platform.value, env_name, token.strip()[:6] + "...",
)
pconfig.enabled = False
def _apply_env_overrides(config: GatewayConfig) -> None:
@@ -1074,6 +1118,32 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
)
# QQ (Official Bot API v2)
qq_app_id = os.getenv("QQ_APP_ID")
qq_client_secret = os.getenv("QQ_CLIENT_SECRET")
if qq_app_id or qq_client_secret:
if Platform.QQBOT not in config.platforms:
config.platforms[Platform.QQBOT] = PlatformConfig()
config.platforms[Platform.QQBOT].enabled = True
extra = config.platforms[Platform.QQBOT].extra
if qq_app_id:
extra["app_id"] = qq_app_id
if qq_client_secret:
extra["client_secret"] = qq_client_secret
qq_allowed_users = os.getenv("QQ_ALLOWED_USERS", "").strip()
if qq_allowed_users:
extra["allow_from"] = qq_allowed_users
qq_group_allowed = os.getenv("QQ_GROUP_ALLOWED_USERS", "").strip()
if qq_group_allowed:
extra["group_allow_from"] = qq_group_allowed
qq_home = os.getenv("QQ_HOME_CHANNEL", "").strip()
if qq_home:
config.platforms[Platform.QQBOT].home_channel = HomeChannel(
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQ_HOME_CHANNEL_NAME", "Home"),
)
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
+1 -1
View File
@@ -12,7 +12,7 @@ import logging
from pathlib import Path
from datetime import datetime
from dataclasses import dataclass
from typing import Dict, List, Optional, Any, Union
from typing import Dict, List, Optional, Any
from hermes_cli.config import get_hermes_home
+194
View File
@@ -0,0 +1,194 @@
"""Per-platform display/verbosity configuration resolver.
Provides ``resolve_display_setting()`` the single entry-point for reading
display settings with platform-specific overrides and sensible defaults.
Resolution order (first non-None wins):
1. ``display.platforms.<platform>.<key>`` explicit per-platform user override
2. ``display.<key>`` global user setting
3. ``_PLATFORM_DEFAULTS[<platform>][<key>]`` built-in sensible default
4. ``_GLOBAL_DEFAULTS[<key>]`` built-in global default
Exception: ``display.streaming`` is CLI-only. Gateway streaming follows the
top-level ``streaming`` config unless ``display.platforms.<platform>.streaming``
sets an explicit per-platform override.
Backward compatibility: ``display.tool_progress_overrides`` is still read as a
fallback for ``tool_progress`` when no ``display.platforms`` entry exists. A
config migration (version bump) automatically moves the old format into the new
``display.platforms`` structure.
"""
from __future__ import annotations
from typing import Any
# ---------------------------------------------------------------------------
# Overrideable display settings and their global defaults
# ---------------------------------------------------------------------------
# These are the settings that can be configured per-platform.
# Other display settings (compact, personality, skin, etc.) are CLI-only
# and don't participate in per-platform resolution.
_GLOBAL_DEFAULTS: dict[str, Any] = {
"tool_progress": "all",
"show_reasoning": False,
"tool_preview_length": 0,
"streaming": None, # None = follow top-level streaming config
}
# ---------------------------------------------------------------------------
# Sensible per-platform defaults — tiered by platform capability
# ---------------------------------------------------------------------------
# Tier 1 (high): Supports message editing, typically personal/team use
# Tier 2 (medium): Supports editing but often workspace/customer-facing
# Tier 3 (low): No edit support — each progress msg is permanent
# Tier 4 (minimal): Batch/non-interactive delivery
_TIER_HIGH = {
"tool_progress": "all",
"show_reasoning": False,
"tool_preview_length": 40,
"streaming": None, # follow global
}
_TIER_MEDIUM = {
"tool_progress": "new",
"show_reasoning": False,
"tool_preview_length": 40,
"streaming": None,
}
_TIER_LOW = {
"tool_progress": "off",
"show_reasoning": False,
"tool_preview_length": 40,
"streaming": False,
}
_TIER_MINIMAL = {
"tool_progress": "off",
"show_reasoning": False,
"tool_preview_length": 0,
"streaming": False,
}
_PLATFORM_DEFAULTS: dict[str, dict[str, Any]] = {
# Tier 1 — full edit support, personal/team use
"telegram": _TIER_HIGH,
"discord": _TIER_HIGH,
# Tier 2 — edit support, often customer/workspace channels
"slack": _TIER_MEDIUM,
"mattermost": _TIER_MEDIUM,
"matrix": _TIER_MEDIUM,
"feishu": _TIER_MEDIUM,
# Tier 3 — no edit support, progress messages are permanent
"signal": _TIER_LOW,
"whatsapp": _TIER_MEDIUM, # Baileys bridge supports /edit
"bluebubbles": _TIER_LOW,
"weixin": _TIER_LOW,
"wecom": _TIER_LOW,
"wecom_callback": _TIER_LOW,
"dingtalk": _TIER_LOW,
# Tier 4 — batch or non-interactive delivery
"email": _TIER_MINIMAL,
"sms": _TIER_MINIMAL,
"webhook": _TIER_MINIMAL,
"homeassistant": _TIER_MINIMAL,
"api_server": {**_TIER_HIGH, "tool_preview_length": 0},
}
# Canonical set of per-platform overrideable keys (for validation).
OVERRIDEABLE_KEYS = frozenset(_GLOBAL_DEFAULTS.keys())
def resolve_display_setting(
user_config: dict,
platform_key: str,
setting: str,
fallback: Any = None,
) -> Any:
"""Resolve a display setting with per-platform override support.
Parameters
----------
user_config : dict
The full parsed config.yaml dict.
platform_key : str
Platform config key (e.g. ``"telegram"``, ``"slack"``). Use
``_platform_config_key(source.platform)`` from gateway/run.py.
setting : str
Display setting name (e.g. ``"tool_progress"``, ``"show_reasoning"``).
fallback : Any
Fallback value when the setting isn't found anywhere.
Returns
-------
The resolved value, or *fallback* if nothing is configured.
"""
display_cfg = user_config.get("display") or {}
# 1. Explicit per-platform override (display.platforms.<platform>.<key>)
platforms = display_cfg.get("platforms") or {}
plat_overrides = platforms.get(platform_key)
if isinstance(plat_overrides, dict):
val = plat_overrides.get(setting)
if val is not None:
return _normalise(setting, val)
# 1b. Backward compat: display.tool_progress_overrides.<platform>
if setting == "tool_progress":
legacy = display_cfg.get("tool_progress_overrides")
if isinstance(legacy, dict):
val = legacy.get(platform_key)
if val is not None:
return _normalise(setting, val)
# 2. Global user setting (display.<key>). Skip display.streaming because
# that key controls only CLI terminal streaming; gateway token streaming is
# governed by the top-level streaming config plus per-platform overrides.
if setting != "streaming":
val = display_cfg.get(setting)
if val is not None:
return _normalise(setting, val)
# 3. Built-in platform default
plat_defaults = _PLATFORM_DEFAULTS.get(platform_key)
if plat_defaults:
val = plat_defaults.get(setting)
if val is not None:
return val
# 4. Built-in global default
val = _GLOBAL_DEFAULTS.get(setting)
if val is not None:
return val
return fallback
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _normalise(setting: str, value: Any) -> Any:
"""Normalise YAML quirks (bare ``off`` → False in YAML 1.1)."""
if setting == "tool_progress":
if value is False:
return "off"
if value is True:
return "all"
return str(value).lower()
if setting in ("show_reasoning", "streaming"):
if isinstance(value, str):
return value.lower() in ("true", "1", "yes", "on")
return bool(value)
if setting == "tool_preview_length":
try:
return int(value)
except (TypeError, ValueError):
return 0
return value
+25 -1
View File
@@ -3,11 +3,12 @@ Event Hook System
A lightweight event-driven system that fires handlers at key lifecycle points.
Hooks are discovered from ~/.hermes/hooks/ directories, each containing:
- HOOK.yaml (metadata: name, description, events list)
- HOOK.yaml (metadata: name, description, events list, optional startup_readiness)
- handler.py (Python handler with async def handle(event_type, context))
Events:
- gateway:startup -- Gateway process starts
- gateway:shutdown -- Gateway process is shutting down
- session:start -- New session created (first message of a new session)
- session:end -- Session ends (user ran /new or /reset)
- session:reset -- Session reset completed (new session entry created)
@@ -31,6 +32,26 @@ from hermes_cli.config import get_hermes_home
HOOKS_DIR = get_hermes_home() / "hooks"
def _normalize_startup_readiness(hook_name: str, manifest: dict[str, Any]) -> Optional[dict[str, Any]]:
"""Validate and normalize optional startup readiness metadata."""
readiness = manifest.get("startup_readiness")
if readiness is None:
return None
if not isinstance(readiness, dict):
print(f"[hooks] Ignoring startup_readiness for {hook_name}: expected mapping", flush=True)
return None
check_id = str(readiness.get("id", "")).strip()
if not check_id:
print(f"[hooks] Ignoring startup_readiness for {hook_name}: missing id", flush=True)
return None
return {
"id": check_id,
"required": bool(readiness.get("required", True)),
}
class HookRegistry:
"""
Discovers, loads, and fires event hooks.
@@ -62,6 +83,7 @@ class HookRegistry:
"description": "Run ~/.hermes/BOOT.md on gateway startup",
"events": ["gateway:startup"],
"path": "(builtin)",
"startup_readiness": None,
})
except Exception as e:
print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
@@ -102,6 +124,7 @@ class HookRegistry:
if not events:
print(f"[hooks] Skipping {hook_name}: no events declared", flush=True)
continue
startup_readiness = _normalize_startup_readiness(hook_name, manifest)
# Dynamically load the handler module
spec = importlib.util.spec_from_file_location(
@@ -128,6 +151,7 @@ class HookRegistry:
"description": manifest.get("description", ""),
"events": events,
"path": str(hook_dir),
"startup_readiness": startup_readiness,
})
print(f"[hooks] Loaded hook '{hook_name}' for events: {events}", flush=True)
+2
View File
@@ -9,9 +9,11 @@ Each adapter handles:
"""
from .base import BasePlatformAdapter, MessageEvent, SendResult
from .qqbot import QQAdapter
__all__ = [
"BasePlatformAdapter",
"MessageEvent",
"SendResult",
"QQAdapter",
]
+102 -13
View File
@@ -10,6 +10,7 @@ Exposes an HTTP server with endpoints:
- POST /v1/runs start a run, returns run_id immediately (202)
- GET /v1/runs/{run_id}/events SSE stream of structured lifecycle events
- GET /health health check
- GET /health/detailed rich status for cross-container dashboard probing
Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
AnythingLLM, NextChat, ChatBox, etc.) can connect to hermes-agent
@@ -54,6 +55,66 @@ DEFAULT_PORT = 8642
MAX_STORED_RESPONSES = 100
MAX_REQUEST_BYTES = 1_000_000 # 1 MB default limit for POST bodies
CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS = 30.0
MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
"""Normalize OpenAI chat message content into a plain text string.
Some clients (Open WebUI, LobeChat, etc.) send content as an array of
typed parts instead of a plain string::
[{"type": "text", "text": "hello"}, {"type": "input_text", "text": "..."}]
This function flattens those into a single string so the agent pipeline
(which expects strings) doesn't choke.
Defensive limits prevent abuse: recursion depth, list size, and output
length are all bounded.
"""
if _depth > _max_depth:
return ""
if content is None:
return ""
if isinstance(content, str):
return content[:MAX_NORMALIZED_TEXT_LENGTH] if len(content) > MAX_NORMALIZED_TEXT_LENGTH else content
if isinstance(content, list):
parts: List[str] = []
items = content[:MAX_CONTENT_LIST_SIZE] if len(content) > MAX_CONTENT_LIST_SIZE else content
for item in items:
if isinstance(item, str):
if item:
parts.append(item[:MAX_NORMALIZED_TEXT_LENGTH])
elif isinstance(item, dict):
item_type = str(item.get("type") or "").strip().lower()
if item_type in {"text", "input_text", "output_text"}:
text = item.get("text", "")
if text:
try:
parts.append(str(text)[:MAX_NORMALIZED_TEXT_LENGTH])
except Exception:
pass
# Silently skip image_url / other non-text parts
elif isinstance(item, list):
nested = _normalize_chat_content(item, _max_depth=_max_depth, _depth=_depth + 1)
if nested:
parts.append(nested)
# Check accumulated size
if sum(len(p) for p in parts) >= MAX_NORMALIZED_TEXT_LENGTH:
break
result = "\n".join(parts)
return result[:MAX_NORMALIZED_TEXT_LENGTH] if len(result) > MAX_NORMALIZED_TEXT_LENGTH else result
# Fallback for unexpected types (int, float, bool, etc.)
try:
result = str(content)
return result[:MAX_NORMALIZED_TEXT_LENGTH] if len(result) > MAX_NORMALIZED_TEXT_LENGTH else result
except Exception:
return ""
def check_api_server_requirements() -> bool:
@@ -505,6 +566,27 @@ class APIServerAdapter(BasePlatformAdapter):
"""GET /health — simple health check."""
return web.json_response({"status": "ok", "platform": "hermes-agent"})
async def _handle_health_detailed(self, request: "web.Request") -> "web.Response":
"""GET /health/detailed — rich status for cross-container dashboard probing.
Returns gateway state, connected platforms, PID, and uptime so the
dashboard can display full status without needing a shared PID file or
/proc access. No authentication required.
"""
from gateway.status import read_runtime_status
runtime = read_runtime_status() or {}
return web.json_response({
"status": "ok",
"platform": "hermes-agent",
"gateway_state": runtime.get("gateway_state"),
"platforms": runtime.get("platforms", {}),
"active_agents": runtime.get("active_agents", 0),
"exit_reason": runtime.get("exit_reason"),
"updated_at": runtime.get("updated_at"),
"pid": os.getpid(),
})
async def _handle_models(self, request: "web.Request") -> "web.Response":
"""GET /v1/models — return hermes-agent as an available model."""
auth_err = self._check_auth(request)
@@ -553,7 +635,7 @@ class APIServerAdapter(BasePlatformAdapter):
for msg in messages:
role = msg.get("role", "")
content = msg.get("content", "")
content = _normalize_chat_content(msg.get("content", ""))
if role == "system":
# Accumulate system messages
if system_prompt is None:
@@ -926,18 +1008,7 @@ class APIServerAdapter(BasePlatformAdapter):
input_messages.append({"role": "user", "content": item})
elif isinstance(item, dict):
role = item.get("role", "user")
content = item.get("content", "")
# Handle content that may be a list of content parts
if isinstance(content, list):
text_parts = []
for part in content:
if isinstance(part, dict) and part.get("type") == "input_text":
text_parts.append(part.get("text", ""))
elif isinstance(part, dict) and part.get("type") == "output_text":
text_parts.append(part.get("text", ""))
elif isinstance(part, str):
text_parts.append(part)
content = "\n".join(text_parts)
content = _normalize_chat_content(item.get("content", ""))
input_messages.append({"role": role, "content": content})
else:
return web.json_response(_openai_error("'input' must be a string or array"), status=400)
@@ -1734,6 +1805,7 @@ class APIServerAdapter(BasePlatformAdapter):
self._app = web.Application(middlewares=mws)
self._app["api_server_adapter"] = self
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get("/health/detailed", self._handle_health_detailed)
self._app.router.add_get("/v1/health", self._handle_health)
self._app.router.add_get("/v1/models", self._handle_models)
self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
@@ -1770,6 +1842,23 @@ class APIServerAdapter(BasePlatformAdapter):
)
return False
# Refuse to start network-accessible with a placeholder key.
# Ported from openclaw/openclaw#64586.
if is_network_accessible(self._host) and self._api_key:
try:
from hermes_cli.auth import has_usable_secret
if not has_usable_secret(self._api_key, min_length=8):
logger.error(
"[%s] Refusing to start: API_SERVER_KEY is set to a "
"placeholder value. Generate a real secret "
"(e.g. `openssl rand -hex 32`) and set API_SERVER_KEY "
"before exposing the API server on %s.",
self.name, self._host,
)
return False
except ImportError:
pass
# Port conflict detection — fail fast if port is already in use
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
+82 -9
View File
@@ -21,6 +21,59 @@ from urllib.parse import urlsplit
logger = logging.getLogger(__name__)
def utf16_len(s: str) -> int:
"""Count UTF-16 code units in *s*.
Telegram's message-length limit (4 096) is measured in UTF-16 code units,
**not** Unicode code-points. Characters outside the Basic Multilingual
Plane (emoji like 😀, CJK Extension B, musical symbols, ) are encoded as
surrogate pairs and therefore consume **two** UTF-16 code units each, even
though Python's ``len()`` counts them as one.
Ported from nearai/ironclaw#2304 which discovered the same discrepancy in
Rust's ``chars().count()``.
"""
return len(s.encode("utf-16-le")) // 2
def _prefix_within_utf16_limit(s: str, limit: int) -> str:
"""Return the longest prefix of *s* whose UTF-16 length ≤ *limit*.
Unlike a plain ``s[:limit]``, this respects surrogate-pair boundaries so
we never slice a multi-code-unit character in half.
"""
if utf16_len(s) <= limit:
return s
# Binary search for the longest safe prefix
lo, hi = 0, len(s)
while lo < hi:
mid = (lo + hi + 1) // 2
if utf16_len(s[:mid]) <= limit:
lo = mid
else:
hi = mid - 1
return s[:lo]
def _custom_unit_to_cp(s: str, budget: int, len_fn) -> int:
"""Return the largest codepoint offset *n* such that ``len_fn(s[:n]) <= budget``.
Used by :meth:`BasePlatformAdapter.truncate_message` when *len_fn* measures
length in units different from Python codepoints (e.g. UTF-16 code units).
Falls back to binary search which is O(log n) calls to *len_fn*.
"""
if len_fn(s) <= budget:
return len(s)
lo, hi = 0, len(s)
while lo < hi:
mid = (lo + hi + 1) // 2
if len_fn(s[:mid]) <= budget:
lo = mid
else:
hi = mid - 1
return lo
def is_network_accessible(host: str) -> bool:
"""Return True if *host* would expose the server beyond loopback.
@@ -1886,7 +1939,11 @@ class BasePlatformAdapter(ABC):
return content
@staticmethod
def truncate_message(content: str, max_length: int = 4096) -> List[str]:
def truncate_message(
content: str,
max_length: int = 4096,
len_fn: Optional["Callable[[str], int]"] = None,
) -> List[str]:
"""
Split a long message into chunks, preserving code block boundaries.
@@ -1898,11 +1955,16 @@ class BasePlatformAdapter(ABC):
Args:
content: The full message content
max_length: Maximum length per chunk (platform-specific)
len_fn: Optional length function for measuring string length.
Defaults to ``len`` (Unicode code-points). Pass
``utf16_len`` for platforms that measure message
length in UTF-16 code units (e.g. Telegram).
Returns:
List of message chunks
"""
if len(content) <= max_length:
_len = len_fn or len
if _len(content) <= max_length:
return [content]
INDICATOR_RESERVE = 10 # room for " (XX/XX)"
@@ -1921,22 +1983,33 @@ class BasePlatformAdapter(ABC):
# How much body text we can fit after accounting for the prefix,
# a potential closing fence, and the chunk indicator.
headroom = max_length - INDICATOR_RESERVE - len(prefix) - len(FENCE_CLOSE)
headroom = max_length - INDICATOR_RESERVE - _len(prefix) - _len(FENCE_CLOSE)
if headroom < 1:
headroom = max_length // 2
# Everything remaining fits in one final chunk
if len(prefix) + len(remaining) <= max_length - INDICATOR_RESERVE:
if _len(prefix) + _len(remaining) <= max_length - INDICATOR_RESERVE:
chunks.append(prefix + remaining)
break
# Find a natural split point (prefer newlines, then spaces)
region = remaining[:headroom]
# Find a natural split point (prefer newlines, then spaces).
# When _len != len (e.g. utf16_len for Telegram), headroom is
# measured in the custom unit. We need codepoint-based slice
# positions that stay within the custom-unit budget.
#
# _safe_slice_pos() maps a custom-unit budget to the largest
# codepoint offset whose custom length ≤ budget.
if _len is not len:
# Map headroom (custom units) → codepoint slice length
_cp_limit = _custom_unit_to_cp(remaining, headroom, _len)
else:
_cp_limit = headroom
region = remaining[:_cp_limit]
split_at = region.rfind("\n")
if split_at < headroom // 2:
if split_at < _cp_limit // 2:
split_at = region.rfind(" ")
if split_at < 1:
split_at = headroom
split_at = _cp_limit
# Avoid splitting inside an inline code span (`...`).
# If the text before split_at has an odd number of unescaped
@@ -1956,7 +2029,7 @@ class BasePlatformAdapter(ABC):
safe_split = candidate.rfind(" ", 0, last_bt)
nl_split = candidate.rfind("\n", 0, last_bt)
safe_split = max(safe_split, nl_split)
if safe_split > headroom // 4:
if safe_split > _cp_limit // 4:
split_at = safe_split
chunk_body = remaining[:split_at]
+24 -32
View File
@@ -224,6 +224,21 @@ class BlueBubblesAdapter(BasePlatformAdapter):
host = "localhost"
return f"http://{host}:{self.webhook_port}{self.webhook_path}"
@property
def _webhook_register_url(self) -> str:
"""Webhook URL registered with BlueBubbles, including the password as
a query param so inbound webhook POSTs carry credentials.
BlueBubbles posts events to the exact URL registered via
``/api/v1/webhook``. Its webhook registration API does not support
custom headers, so embedding the password in the URL is the only
way to authenticate inbound webhooks without disabling auth.
"""
base = self._webhook_url
if self.password:
return f"{base}?password={quote(self.password, safe='')}"
return base
async def _find_registered_webhooks(self, url: str) -> list:
"""Return list of BB webhook entries matching *url*."""
try:
@@ -245,7 +260,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
if not self.client:
return False
webhook_url = self._webhook_url
webhook_url = self._webhook_register_url
# Crash resilience — reuse an existing registration if present
existing = await self._find_registered_webhooks(webhook_url)
@@ -257,7 +272,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
payload = {
"url": webhook_url,
"events": ["new-message", "updated-message", "message"],
"events": ["new-message", "updated-message"],
}
try:
@@ -292,7 +307,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
if not self.client:
return False
webhook_url = self._webhook_url
webhook_url = self._webhook_register_url
removed = False
try:
@@ -604,35 +619,6 @@ class BlueBubblesAdapter(BasePlatformAdapter):
# Tapback reactions
# ------------------------------------------------------------------
async def send_reaction(
self,
chat_id: str,
message_guid: str,
reaction: str,
part_index: int = 0,
) -> SendResult:
"""Send a tapback reaction (requires Private API helper)."""
if not self._private_api_enabled or not self._helper_connected:
return SendResult(
success=False, error="Private API helper not connected"
)
guid = await self._resolve_chat_guid(chat_id)
if not guid:
return SendResult(success=False, error=f"Chat not found: {chat_id}")
try:
res = await self._api_post(
"/api/v1/message/react",
{
"chatGuid": guid,
"selectedMessageGuid": message_guid,
"reaction": reaction,
"partIndex": part_index,
},
)
return SendResult(success=True, raw_response=res)
except Exception as exc:
return SendResult(success=False, error=str(exc))
# ------------------------------------------------------------------
# Chat info
# ------------------------------------------------------------------
@@ -864,6 +850,12 @@ class BlueBubblesAdapter(BasePlatformAdapter):
payload.get("chat_guid"),
payload.get("guid"),
)
# Fallback: BlueBubbles v1.9+ webhook payloads omit top-level chatGuid;
# the chat GUID is nested under data.chats[0].guid instead.
if not chat_guid:
_chats = record.get("chats") or []
if _chats and isinstance(_chats[0], dict):
chat_guid = _chats[0].get("guid") or _chats[0].get("chatGuid")
chat_identifier = self._value(
record.get("chatIdentifier"),
record.get("identifier"),
-1
View File
@@ -21,7 +21,6 @@ import asyncio
import logging
import os
import re
import time
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional
+118 -37
View File
@@ -10,7 +10,6 @@ Uses discord.py library for:
"""
import asyncio
import json
import logging
import os
import struct
@@ -19,7 +18,6 @@ import tempfile
import threading
import time
from collections import defaultdict
from pathlib import Path
from typing import Callable, Dict, Optional, Any
logger = logging.getLogger(__name__)
@@ -442,6 +440,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._voice_text_channels: Dict[int, int] = {} # guild_id -> text_channel_id
self._voice_sources: Dict[int, Dict[str, Any]] = {} # guild_id -> linked text channel source metadata
self._voice_timeout_tasks: Dict[int, asyncio.Task] = {} # guild_id -> timeout task
# Phase 2: voice listening
self._voice_receivers: Dict[int, VoiceReceiver] = {} # guild_id -> VoiceReceiver
@@ -456,6 +455,7 @@ class DiscordAdapter(BasePlatformAdapter):
# show the standard typing gateway event for bots)
self._typing_tasks: Dict[str, asyncio.Task] = {}
self._bot_task: Optional[asyncio.Task] = None
self._post_connect_task: Optional[asyncio.Task] = None
# Dedup cache: prevents duplicate bot responses when Discord
# RESUME replays events after reconnects.
self._dedup = MessageDeduplicator()
@@ -545,15 +545,14 @@ class DiscordAdapter(BasePlatformAdapter):
# Resolve any usernames in the allowed list to numeric IDs
await adapter_self._resolve_allowed_usernames()
# Sync slash commands with Discord
try:
synced = await adapter_self._client.tree.sync()
logger.info("[%s] Synced %d slash command(s)", adapter_self.name, len(synced))
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", adapter_self.name, e, exc_info=True)
adapter_self._ready_event.set()
if adapter_self._post_connect_task and not adapter_self._post_connect_task.done():
adapter_self._post_connect_task.cancel()
adapter_self._post_connect_task = asyncio.create_task(
adapter_self._run_post_connect_initialization()
)
@self._client.event
async def on_message(message: DiscordMessage):
# Dedup: Discord RESUME replays events after reconnects (#4777)
@@ -686,14 +685,36 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Error during disconnect: %s", self.name, e, exc_info=True)
if self._post_connect_task and not self._post_connect_task.done():
self._post_connect_task.cancel()
try:
await self._post_connect_task
except asyncio.CancelledError:
pass
self._running = False
self._client = None
self._ready_event.clear()
self._post_connect_task = None
self._release_platform_lock()
logger.info("[%s] Disconnected", self.name)
async def _run_post_connect_initialization(self) -> None:
"""Finish non-critical startup work after Discord is connected."""
if not self._client:
return
try:
synced = await asyncio.wait_for(self._client.tree.sync(), timeout=30)
logger.info("[%s] Synced %d slash command(s)", self.name, len(synced))
except asyncio.TimeoutError:
logger.warning("[%s] Slash command sync timed out after 30s", self.name)
except asyncio.CancelledError:
raise
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", self.name, e, exc_info=True)
async def _add_reaction(self, message: Any, emoji: str) -> bool:
"""Add an emoji reaction to a Discord message."""
if not message or not hasattr(message, "add_reaction"):
@@ -1023,6 +1044,7 @@ class DiscordAdapter(BasePlatformAdapter):
if task:
task.cancel()
self._voice_text_channels.pop(guild_id, None)
self._voice_sources.pop(guild_id, None)
# Maximum seconds to wait for voice playback before giving up
PLAYBACK_TIMEOUT = 120
@@ -1714,46 +1736,90 @@ class DiscordAdapter(BasePlatformAdapter):
async def slash_btw(interaction: discord.Interaction, question: str):
await self._run_simple_slash(interaction, f"/btw {question}")
# Register installed skills as native slash commands (parity with
# Telegram, which uses telegram_menu_commands() in commands.py).
# Discord allows up to 100 application commands globally.
_DISCORD_CMD_LIMIT = 100
# Register skills under a single /skill command group with category
# subcommand groups. This uses 1 top-level slot instead of N,
# supporting up to 25 categories × 25 skills = 625 skills.
self._register_skill_group(tree)
def _register_skill_group(self, tree) -> None:
"""Register a ``/skill`` command group with category subcommand groups.
Skills are organized by their directory category under ``SKILLS_DIR``.
Each category becomes a subcommand group; root-level skills become
direct subcommands. Discord supports 25 subcommand groups × 25
subcommands each = 625 skills well beyond the old 100-command cap.
"""
try:
from hermes_cli.commands import discord_skill_commands
from hermes_cli.commands import discord_skill_commands_by_category
existing_names = {cmd.name for cmd in tree.get_commands()}
remaining_slots = max(0, _DISCORD_CMD_LIMIT - len(existing_names))
existing_names = set()
try:
existing_names = {cmd.name for cmd in tree.get_commands()}
except Exception:
pass
skill_entries, skipped = discord_skill_commands(
max_slots=remaining_slots,
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
for discord_name, description, cmd_key in skill_entries:
# Closure factory to capture cmd_key per iteration
def _make_skill_handler(_key: str):
async def _skill_slash(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(interaction, f"{_key} {args}".strip())
return _skill_slash
if not categories and not uncategorized:
return
handler = _make_skill_handler(cmd_key)
handler.__name__ = f"skill_{discord_name.replace('-', '_')}"
skill_group = discord.app_commands.Group(
name="skill",
description="Run a Hermes skill",
)
# ── Helper: build a callback for a skill command key ──
def _make_handler(_key: str):
@discord.app_commands.describe(args="Optional arguments for the skill")
async def _handler(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(interaction, f"{_key} {args}".strip())
_handler.__name__ = f"skill_{_key.lstrip('/').replace('-', '_')}"
return _handler
# ── Uncategorized (root-level) skills → direct subcommands ──
for discord_name, description, cmd_key in uncategorized:
cmd = discord.app_commands.Command(
name=discord_name,
description=description,
callback=handler,
description=description or f"Run the {discord_name} skill",
callback=_make_handler(cmd_key),
)
discord.app_commands.describe(args="Optional arguments for the skill")(cmd)
tree.add_command(cmd)
skill_group.add_command(cmd)
if skipped:
# ── Category subcommand groups ──
for cat_name in sorted(categories):
cat_desc = f"{cat_name.replace('-', ' ').title()} skills"
if len(cat_desc) > 100:
cat_desc = cat_desc[:97] + "..."
cat_group = discord.app_commands.Group(
name=cat_name,
description=cat_desc,
parent=skill_group,
)
for discord_name, description, cmd_key in categories[cat_name]:
cmd = discord.app_commands.Command(
name=discord_name,
description=description or f"Run the {discord_name} skill",
callback=_make_handler(cmd_key),
)
cat_group.add_command(cmd)
tree.add_command(skill_group)
total = sum(len(v) for v in categories.values()) + len(uncategorized)
logger.info(
"[%s] Registered /skill group: %d skill(s) across %d categories"
" + %d uncategorized",
self.name, total, len(categories), len(uncategorized),
)
if hidden:
logger.warning(
"[%s] Discord slash command limit reached (%d): %d skill(s) not registered",
self.name, _DISCORD_CMD_LIMIT, skipped,
"[%s] %d skill(s) not registered (Discord subcommand limits)",
self.name, hidden,
)
except Exception as exc:
logger.warning("[%s] Failed to register skill slash commands: %s", self.name, exc)
logger.warning("[%s] Failed to register /skill group: %s", self.name, exc)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
@@ -2222,6 +2288,7 @@ class DiscordAdapter(BasePlatformAdapter):
thread_id = str(message.channel.id)
parent_channel_id = self._get_parent_channel_id(message.channel)
is_voice_linked_channel = False
if not isinstance(message.channel, discord.DMChannel):
channel_ids = {str(message.channel.id)}
if parent_channel_id:
@@ -2248,7 +2315,12 @@ class DiscordAdapter(BasePlatformAdapter):
channel_ids.add(parent_channel_id)
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
is_free_channel = bool(channel_ids & free_channels)
# Voice-linked text channels act as free-response while voice is active.
# Only the exact bound channel gets the exemption, not sibling threads.
voice_linked_ids = {str(ch_id) for ch_id in self._voice_text_channels.values()}
current_channel_id = str(message.channel.id)
is_voice_linked_channel = current_channel_id in voice_linked_ids
is_free_channel = bool(channel_ids & free_channels) or is_voice_linked_channel
# Skip the mention check if the message is in a thread where
# the bot has previously participated (auto-created or replied in).
@@ -2272,7 +2344,7 @@ class DiscordAdapter(BasePlatformAdapter):
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
if auto_thread and not skip_thread:
if auto_thread and not skip_thread and not is_voice_linked_channel:
thread = await self._auto_create_thread(message)
if thread:
is_thread = True
@@ -2446,6 +2518,14 @@ class DiscordAdapter(BasePlatformAdapter):
_parent_id = str(getattr(_chan, "parent_id", "") or "")
_chan_id = str(getattr(_chan, "id", ""))
_skills = self._resolve_channel_skills(_chan_id, _parent_id or None)
reply_to_id = None
reply_to_text = None
if message.reference:
reply_to_id = str(message.reference.message_id)
if message.reference.resolved:
reply_to_text = getattr(message.reference.resolved, "content", None) or None
event = MessageEvent(
text=event_text,
message_type=msg_type,
@@ -2454,7 +2534,8 @@ class DiscordAdapter(BasePlatformAdapter):
message_id=str(message.id),
media_urls=media_urls,
media_types=media_types,
reply_to_message_id=str(message.reference.message_id) if message.reference else None,
reply_to_message_id=reply_to_id,
reply_to_text=reply_to_text,
timestamp=message.created_at,
auto_skill=_skills,
)
+450 -87
View File
@@ -34,6 +34,9 @@ from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Optional
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
# aiohttp/websockets are independent optional deps — import outside lark_oapi
# so they remain available for tests and webhook mode even if lark_oapi is missing.
@@ -69,7 +72,10 @@ try:
UpdateMessageRequestBody,
)
from lark_oapi.core.const import FEISHU_DOMAIN, LARK_DOMAIN
from lark_oapi.event.callback.model.p2_card_action_trigger import P2CardActionTriggerResponse
from lark_oapi.event.callback.model.p2_card_action_trigger import (
CallBackCard,
P2CardActionTriggerResponse,
)
from lark_oapi.event.dispatcher_handler import EventDispatcherHandler
from lark_oapi.ws import Client as FeishuWSClient
@@ -77,6 +83,7 @@ try:
except ImportError:
FEISHU_AVAILABLE = False
lark = None # type: ignore[assignment]
CallBackCard = None # type: ignore[assignment]
P2CardActionTriggerResponse = None # type: ignore[assignment]
EventDispatcherHandler = None # type: ignore[assignment]
FeishuWSClient = None # type: ignore[assignment]
@@ -166,9 +173,35 @@ _FEISHU_WEBHOOK_BODY_TIMEOUT_SECONDS = 30 # max seconds to read request
_FEISHU_WEBHOOK_ANOMALY_THRESHOLD = 25 # consecutive error responses before WARNING log
_FEISHU_WEBHOOK_ANOMALY_TTL_SECONDS = 6 * 60 * 60 # anomaly tracker TTL (6 hours) — matches openclaw
_FEISHU_CARD_ACTION_DEDUP_TTL_SECONDS = 15 * 60 # card action token dedup window (15 min)
_APPROVAL_CHOICE_MAP: Dict[str, str] = {
"approve_once": "once",
"approve_session": "session",
"approve_always": "always",
"deny": "deny",
}
_APPROVAL_LABEL_MAP: Dict[str, str] = {
"once": "Approved once",
"session": "Approved for session",
"always": "Approved permanently",
"deny": "Denied",
}
_FEISHU_BOT_MSG_TRACK_SIZE = 512 # LRU size for tracking sent message IDs
_FEISHU_REPLY_FALLBACK_CODES = frozenset({230011, 231003}) # reply target withdrawn/missing → create fallback
_FEISHU_ACK_EMOJI = "OK"
# QR onboarding constants
_ONBOARD_ACCOUNTS_URLS = {
"feishu": "https://accounts.feishu.cn",
"lark": "https://accounts.larksuite.com",
}
_ONBOARD_OPEN_URLS = {
"feishu": "https://open.feishu.cn",
"lark": "https://open.larksuite.com",
}
_REGISTRATION_PATH = "/oauth/v1/app/registration"
_ONBOARD_REQUEST_TIMEOUT_S = 10
# ---------------------------------------------------------------------------
# Fallback display strings
# ---------------------------------------------------------------------------
@@ -414,14 +447,6 @@ def _build_markdown_post_payload(content: str) -> str:
)
def parse_feishu_post_content(raw_content: str) -> FeishuPostParseResult:
try:
parsed = json.loads(raw_content) if raw_content else {}
except json.JSONDecodeError:
return FeishuPostParseResult(text_content=FALLBACK_POST_TEXT)
return parse_feishu_post_payload(parsed)
def parse_feishu_post_payload(payload: Any) -> FeishuPostParseResult:
resolved = _resolve_post_payload(payload)
if not resolved:
@@ -1482,14 +1507,12 @@ class FeishuAdapter(BasePlatformAdapter):
logger.warning("[Feishu] send_exec_approval failed: %s", exc)
return SendResult(success=False, error=str(exc))
async def _update_approval_card(
self, message_id: str, label: str, user_name: str, choice: str,
) -> None:
"""Replace the approval card with a resolved status card."""
if not self._client or not message_id:
return
@staticmethod
def _build_resolved_approval_card(*, choice: str, user_name: str) -> Dict[str, Any]:
"""Build raw card JSON for a resolved approval action."""
icon = "" if choice == "deny" else ""
card = {
label = _APPROVAL_LABEL_MAP.get(choice, "Resolved")
return {
"config": {"wide_screen_mode": True},
"header": {
"title": {"content": f"{icon} {label}", "tag": "plain_text"},
@@ -1502,13 +1525,6 @@ class FeishuAdapter(BasePlatformAdapter):
},
],
}
try:
payload = json.dumps(card, ensure_ascii=False)
body = self._build_update_message_body(msg_type="interactive", content=payload)
request = self._build_update_message_request(message_id=message_id, request_body=body)
await asyncio.to_thread(self._client.im.v1.message.update, request)
except Exception as exc:
logger.warning("[Feishu] Failed to update approval card %s: %s", message_id, exc)
async def send_voice(
self,
@@ -1837,20 +1853,82 @@ class FeishuAdapter(BasePlatformAdapter):
future.add_done_callback(self._log_background_failure)
def _on_card_action_trigger(self, data: Any) -> Any:
"""Schedule Feishu card actions on the adapter loop and acknowledge immediately."""
"""Handle card-action callback from the Feishu SDK (synchronous).
For approval actions: parses the event once, returns the resolved card
inline (the only reliable way to sync all clients), and schedules a
lightweight async method to actually unblock the agent.
For other card actions: delegates to ``_handle_card_action_event``.
"""
loop = self._loop
if loop is None or bool(getattr(loop, "is_closed", lambda: False)()):
if not self._loop_accepts_callbacks(loop):
logger.warning("[Feishu] Dropping card action before adapter loop is ready")
else:
future = asyncio.run_coroutine_threadsafe(
self._handle_card_action_event(data),
loop,
)
future.add_done_callback(self._log_background_failure)
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
event = getattr(data, "event", None)
action = getattr(event, "action", None)
action_value = getattr(action, "value", {}) or {}
hermes_action = action_value.get("hermes_action") if isinstance(action_value, dict) else None
if hermes_action:
return self._handle_approval_card_action(event=event, action_value=action_value, loop=loop)
self._submit_on_loop(loop, self._handle_card_action_event(data))
if P2CardActionTriggerResponse is None:
return None
return P2CardActionTriggerResponse()
@staticmethod
def _loop_accepts_callbacks(loop: Any) -> bool:
"""Return True when the adapter loop can accept thread-safe submissions."""
return loop is not None and not bool(getattr(loop, "is_closed", lambda: False)())
def _submit_on_loop(self, loop: Any, coro: Any) -> None:
"""Schedule background work on the adapter loop with shared failure logging."""
future = asyncio.run_coroutine_threadsafe(coro, loop)
future.add_done_callback(self._log_background_failure)
def _handle_approval_card_action(self, *, event: Any, action_value: Dict[str, Any], loop: Any) -> Any:
"""Schedule approval resolution and build the synchronous callback response."""
approval_id = action_value.get("approval_id")
if approval_id is None:
logger.debug("[Feishu] Card action missing approval_id, ignoring")
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
choice = _APPROVAL_CHOICE_MAP.get(action_value.get("hermes_action"), "deny")
operator = getattr(event, "operator", None)
open_id = str(getattr(operator, "open_id", "") or "")
user_name = self._get_cached_sender_name(open_id) or open_id
self._submit_on_loop(loop, self._resolve_approval(approval_id, choice, user_name))
if P2CardActionTriggerResponse is None:
return None
response = P2CardActionTriggerResponse()
if CallBackCard is not None:
card = CallBackCard()
card.type = "raw"
card.data = self._build_resolved_approval_card(choice=choice, user_name=user_name)
response.card = card
return response
async def _resolve_approval(self, approval_id: Any, choice: str, user_name: str) -> None:
"""Pop approval state and unblock the waiting agent thread."""
state = self._approval_state.pop(approval_id, None)
if not state:
logger.debug("[Feishu] Approval %s already resolved or unknown", approval_id)
return
try:
from tools.approval import resolve_gateway_approval
count = resolve_gateway_approval(state["session_key"], choice)
logger.info(
"Feishu button resolved %d approval(s) for session %s (choice=%s, user=%s)",
count, state["session_key"], choice, user_name,
)
except Exception as exc:
logger.error("Failed to resolve gateway approval from Feishu button: %s", exc)
async def _handle_reaction_event(self, event_type: str, data: Any) -> None:
"""Fetch the reacted-to message; if it was sent by this bot, emit a synthetic text event."""
if not self._client:
@@ -1942,51 +2020,6 @@ class FeishuAdapter(BasePlatformAdapter):
action_tag = str(getattr(action, "tag", "") or "button")
action_value = getattr(action, "value", {}) or {}
# --- Exec approval button intercept ---
hermes_action = action_value.get("hermes_action") if isinstance(action_value, dict) else None
if hermes_action:
approval_id = action_value.get("approval_id")
state = self._approval_state.pop(approval_id, None)
if not state:
logger.debug("[Feishu] Approval %s already resolved or unknown", approval_id)
return
choice_map = {
"approve_once": "once",
"approve_session": "session",
"approve_always": "always",
"deny": "deny",
}
choice = choice_map.get(hermes_action, "deny")
label_map = {
"once": "Approved once",
"session": "Approved for session",
"always": "Approved permanently",
"deny": "Denied",
}
label = label_map.get(choice, "Resolved")
# Resolve sender name for the status card
sender_id = SimpleNamespace(open_id=open_id, user_id=None, union_id=None)
sender_profile = await self._resolve_sender_profile(sender_id)
user_name = sender_profile.get("user_name") or open_id
# Resolve the approval — unblocks the agent thread
try:
from tools.approval import resolve_gateway_approval
count = resolve_gateway_approval(state["session_key"], choice)
logger.info(
"Feishu button resolved %d approval(s) for session %s (choice=%s, user=%s)",
count, state["session_key"], choice, user_name,
)
except Exception as exc:
logger.error("Failed to resolve gateway approval from Feishu button: %s", exc)
# Update the card to show the decision
await self._update_approval_card(state.get("message_id", ""), label, user_name, choice)
return
synthetic_text = f"/card {action_tag}"
if action_value:
try:
@@ -2672,12 +2705,6 @@ class FeishuAdapter(BasePlatformAdapter):
return self._resolve_media_message_type(media_types[0] if media_types else "", default=MessageType.DOCUMENT)
return MessageType.TEXT
def _normalize_inbound_text(self, text: str) -> str:
"""Strip Feishu mention placeholders from inbound text."""
text = _MENTION_RE.sub(" ", text or "")
text = _MULTISPACE_RE.sub(" ", text)
return text.strip()
async def _maybe_extract_text_document(self, cached_path: str, media_type: str) -> str:
if not cached_path or not media_type.startswith("text/"):
return ""
@@ -2895,6 +2922,19 @@ class FeishuAdapter(BasePlatformAdapter):
"user_id_alt": union_id,
}
def _get_cached_sender_name(self, sender_id: Optional[str]) -> Optional[str]:
"""Return a cached sender name only while its TTL is still valid."""
if not sender_id:
return None
cached = self._sender_name_cache.get(sender_id)
if cached is None:
return None
name, expire_at = cached
if time.time() < expire_at:
return name
self._sender_name_cache.pop(sender_id, None)
return None
async def _resolve_sender_name_from_api(self, sender_id: Optional[str]) -> Optional[str]:
"""Fetch the sender's display name from the Feishu contact API with a 10-minute cache.
@@ -2907,11 +2947,9 @@ class FeishuAdapter(BasePlatformAdapter):
if not trimmed:
return None
now = time.time()
cached = self._sender_name_cache.get(trimmed)
if cached is not None:
name, expire_at = cached
if now < expire_at:
return name
cached_name = self._get_cached_sender_name(trimmed)
if cached_name is not None:
return cached_name
try:
from lark_oapi.api.contact.v3 import GetUserRequest # lazy import
if trimmed.startswith("ou_"):
@@ -3621,3 +3659,328 @@ class FeishuAdapter(BasePlatformAdapter):
return _FEISHU_FILE_UPLOAD_TYPE, "file"
return _FEISHU_FILE_UPLOAD_TYPE, "file"
# =============================================================================
# QR scan-to-create onboarding
#
# Device-code flow: user scans a QR code with Feishu/Lark mobile app and the
# platform creates a fully configured bot application automatically.
# Called by `hermes gateway setup` via _setup_feishu() in hermes_cli/gateway.py.
# =============================================================================
def _accounts_base_url(domain: str) -> str:
return _ONBOARD_ACCOUNTS_URLS.get(domain, _ONBOARD_ACCOUNTS_URLS["feishu"])
def _onboard_open_base_url(domain: str) -> str:
return _ONBOARD_OPEN_URLS.get(domain, _ONBOARD_OPEN_URLS["feishu"])
def _post_registration(base_url: str, body: Dict[str, str]) -> dict:
"""POST form-encoded data to the registration endpoint, return parsed JSON.
The registration endpoint returns JSON even on 4xx (e.g. poll returns
authorization_pending as a 400). We always parse the body regardless of
HTTP status.
"""
url = f"{base_url}{_REGISTRATION_PATH}"
data = urlencode(body).encode("utf-8")
req = Request(url, data=data, headers={"Content-Type": "application/x-www-form-urlencoded"})
try:
with urlopen(req, timeout=_ONBOARD_REQUEST_TIMEOUT_S) as resp:
return json.loads(resp.read().decode("utf-8"))
except HTTPError as exc:
body_bytes = exc.read()
if body_bytes:
try:
return json.loads(body_bytes.decode("utf-8"))
except (ValueError, json.JSONDecodeError):
raise exc from None
raise
def _init_registration(domain: str = "feishu") -> None:
"""Verify the environment supports client_secret auth.
Raises RuntimeError if not supported.
"""
base_url = _accounts_base_url(domain)
res = _post_registration(base_url, {"action": "init"})
methods = res.get("supported_auth_methods") or []
if "client_secret" not in methods:
raise RuntimeError(
f"Feishu / Lark registration environment does not support client_secret auth. "
f"Supported: {methods}"
)
def _begin_registration(domain: str = "feishu") -> dict:
"""Start the device-code flow. Returns device_code, qr_url, user_code, interval, expire_in."""
base_url = _accounts_base_url(domain)
res = _post_registration(base_url, {
"action": "begin",
"archetype": "PersonalAgent",
"auth_method": "client_secret",
"request_user_info": "open_id",
})
device_code = res.get("device_code")
if not device_code:
raise RuntimeError("Feishu / Lark registration did not return a device_code")
qr_url = res.get("verification_uri_complete", "")
if "?" in qr_url:
qr_url += "&from=hermes&tp=hermes"
else:
qr_url += "?from=hermes&tp=hermes"
return {
"device_code": device_code,
"qr_url": qr_url,
"user_code": res.get("user_code", ""),
"interval": res.get("interval") or 5,
"expire_in": res.get("expire_in") or 600,
}
def _poll_registration(
*,
device_code: str,
interval: int,
expire_in: int,
domain: str = "feishu",
) -> Optional[dict]:
"""Poll until the user scans the QR code, or timeout/denial.
Returns dict with app_id, app_secret, domain, open_id on success.
Returns None on failure.
"""
deadline = time.time() + expire_in
current_domain = domain
domain_switched = False
poll_count = 0
while time.time() < deadline:
base_url = _accounts_base_url(current_domain)
try:
res = _post_registration(base_url, {
"action": "poll",
"device_code": device_code,
"tp": "ob_app",
})
except (URLError, OSError, json.JSONDecodeError):
time.sleep(interval)
continue
poll_count += 1
if poll_count == 1:
print(" Fetching configuration results...", end="", flush=True)
elif poll_count % 6 == 0:
print(".", end="", flush=True)
# Domain auto-detection
user_info = res.get("user_info") or {}
tenant_brand = user_info.get("tenant_brand")
if tenant_brand == "lark" and not domain_switched:
current_domain = "lark"
domain_switched = True
# Fall through — server may return credentials in this same response.
# Success
if res.get("client_id") and res.get("client_secret"):
if poll_count > 0:
print() # newline after "Fetching configuration results..." dots
return {
"app_id": res["client_id"],
"app_secret": res["client_secret"],
"domain": current_domain,
"open_id": user_info.get("open_id"),
}
# Terminal errors
error = res.get("error", "")
if error in ("access_denied", "expired_token"):
if poll_count > 0:
print()
logger.warning("[Feishu onboard] Registration %s", error)
return None
# authorization_pending or unknown — keep polling
time.sleep(interval)
if poll_count > 0:
print()
logger.warning("[Feishu onboard] Poll timed out after %ds", expire_in)
return None
try:
import qrcode as _qrcode_mod
except (ImportError, TypeError):
_qrcode_mod = None # type: ignore[assignment]
def _render_qr(url: str) -> bool:
"""Try to render a QR code in the terminal. Returns True if successful."""
if _qrcode_mod is None:
return False
try:
qr = _qrcode_mod.QRCode()
qr.add_data(url)
qr.make(fit=True)
qr.print_ascii(invert=True)
return True
except Exception:
return False
def probe_bot(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Verify bot connectivity via /open-apis/bot/v3/info.
Uses lark_oapi SDK when available, falls back to raw HTTP otherwise.
Returns {"bot_name": ..., "bot_open_id": ...} on success, None on failure.
"""
if FEISHU_AVAILABLE:
return _probe_bot_sdk(app_id, app_secret, domain)
return _probe_bot_http(app_id, app_secret, domain)
def _build_onboard_client(app_id: str, app_secret: str, domain: str) -> Any:
"""Build a lark Client for the given credentials and domain."""
sdk_domain = LARK_DOMAIN if domain == "lark" else FEISHU_DOMAIN
return (
lark.Client.builder()
.app_id(app_id)
.app_secret(app_secret)
.domain(sdk_domain)
.log_level(lark.LogLevel.WARNING)
.build()
)
def _parse_bot_response(data: dict) -> Optional[dict]:
"""Extract bot_name and bot_open_id from a /bot/v3/info response."""
if data.get("code") != 0:
return None
bot = data.get("bot") or data.get("data", {}).get("bot") or {}
return {
"bot_name": bot.get("bot_name"),
"bot_open_id": bot.get("open_id"),
}
def _probe_bot_sdk(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Probe bot info using lark_oapi SDK."""
try:
client = _build_onboard_client(app_id, app_secret, domain)
resp = client.request(
method="GET",
url="/open-apis/bot/v3/info",
body=None,
raw_response=True,
)
return _parse_bot_response(json.loads(resp.content))
except Exception as exc:
logger.debug("[Feishu onboard] SDK probe failed: %s", exc)
return None
def _probe_bot_http(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Fallback probe using raw HTTP (when lark_oapi is not installed)."""
base_url = _onboard_open_base_url(domain)
try:
token_data = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
token_req = Request(
f"{base_url}/open-apis/auth/v3/tenant_access_token/internal",
data=token_data,
headers={"Content-Type": "application/json"},
)
with urlopen(token_req, timeout=_ONBOARD_REQUEST_TIMEOUT_S) as resp:
token_res = json.loads(resp.read().decode("utf-8"))
access_token = token_res.get("tenant_access_token")
if not access_token:
return None
bot_req = Request(
f"{base_url}/open-apis/bot/v3/info",
headers={
"Authorization": f"Bearer {access_token}",
"Content-Type": "application/json",
},
)
with urlopen(bot_req, timeout=_ONBOARD_REQUEST_TIMEOUT_S) as resp:
bot_res = json.loads(resp.read().decode("utf-8"))
return _parse_bot_response(bot_res)
except (URLError, OSError, KeyError, json.JSONDecodeError) as exc:
logger.debug("[Feishu onboard] HTTP probe failed: %s", exc)
return None
def qr_register(
*,
initial_domain: str = "feishu",
timeout_seconds: int = 600,
) -> Optional[dict]:
"""Run the Feishu / Lark scan-to-create QR registration flow.
Returns on success::
{
"app_id": str,
"app_secret": str,
"domain": "feishu" | "lark",
"open_id": str | None,
"bot_name": str | None,
"bot_open_id": str | None,
}
Returns None on expected failures (network, auth denied, timeout).
Unexpected errors (bugs, protocol regressions) propagate to the caller.
"""
try:
return _qr_register_inner(initial_domain=initial_domain, timeout_seconds=timeout_seconds)
except (RuntimeError, URLError, OSError, json.JSONDecodeError) as exc:
logger.warning("[Feishu onboard] Registration failed: %s", exc)
return None
def _qr_register_inner(
*,
initial_domain: str,
timeout_seconds: int,
) -> Optional[dict]:
"""Run init → begin → poll → probe. Raises on network/protocol errors."""
print(" Connecting to Feishu / Lark...", end="", flush=True)
_init_registration(initial_domain)
begin = _begin_registration(initial_domain)
print(" done.")
print()
qr_url = begin["qr_url"]
if _render_qr(qr_url):
print(f"\n Scan the QR code above, or open this URL directly:\n {qr_url}")
else:
print(f" Open this URL in Feishu / Lark on your phone:\n\n {qr_url}\n")
print(" Tip: pip install qrcode to display a scannable QR code here next time")
print()
result = _poll_registration(
device_code=begin["device_code"],
interval=begin["interval"],
expire_in=min(begin["expire_in"], timeout_seconds),
domain=initial_domain,
)
if not result:
return None
# Probe bot — best-effort, don't fail the registration
bot_info = probe_bot(result["app_id"], result["app_secret"], result["domain"])
if bot_info:
result["bot_name"] = bot_info.get("bot_name")
result["bot_open_id"] = bot_info.get("bot_open_id")
else:
result["bot_name"] = None
result["bot_open_id"] = None
return result
+209 -118
View File
@@ -18,13 +18,13 @@ Environment variables:
MATRIX_REQUIRE_MENTION Require @mention in rooms (default: true)
MATRIX_FREE_RESPONSE_ROOMS Comma-separated room IDs exempt from mention requirement
MATRIX_AUTO_THREAD Auto-create threads for room messages (default: true)
MATRIX_RECOVERY_KEY Recovery key for cross-signing verification after device key rotation
MATRIX_DM_MENTION_THREADS Create a thread when bot is @mentioned in a DM (default: false)
"""
from __future__ import annotations
import asyncio
import json
import logging
import mimetypes
import os
@@ -104,7 +104,7 @@ MAX_MESSAGE_LENGTH = 4000
# Uses get_hermes_home() so each profile gets its own Matrix store.
from hermes_constants import get_hermes_dir as _get_hermes_dir
_STORE_DIR = _get_hermes_dir("platforms/matrix/store", "matrix/store")
_CRYPTO_PICKLE_PATH = _STORE_DIR / "crypto_store.pickle"
_CRYPTO_DB_PATH = _STORE_DIR / "crypto.db"
# Grace period: ignore messages older than this many seconds before startup.
_STARTUP_GRACE_SECONDS = 5
@@ -165,6 +165,33 @@ def check_matrix_requirements() -> bool:
return True
class _CryptoStateStore:
"""Adapter that satisfies the mautrix crypto StateStore interface.
OlmMachine requires a StateStore with ``is_encrypted``,
``get_encryption_info``, and ``find_shared_rooms``. The basic
``MemoryStateStore`` from ``mautrix.client`` doesn't implement these,
so we provide simple implementations that consult the client's room
state.
"""
def __init__(self, client_state_store: Any, joined_rooms: set):
self._ss = client_state_store
self._joined_rooms = joined_rooms
async def is_encrypted(self, room_id: str) -> bool:
return (await self.get_encryption_info(room_id)) is not None
async def get_encryption_info(self, room_id: str):
if hasattr(self._ss, "get_encryption_info"):
return await self._ss.get_encryption_info(room_id)
return None
async def find_shared_rooms(self, user_id: str) -> list:
# Return all joined rooms — simple but correct for a single-user bot.
return list(self._joined_rooms)
class MatrixAdapter(BasePlatformAdapter):
"""Gateway adapter for Matrix (any homeserver)."""
@@ -199,6 +226,7 @@ class MatrixAdapter(BasePlatformAdapter):
)
self._client: Any = None # mautrix.client.Client
self._crypto_db: Any = None # mautrix.util.async_db.Database
self._sync_task: Optional[asyncio.Task] = None
self._closing = False
self._startup_ts: float = 0.0
@@ -252,6 +280,92 @@ class MatrixAdapter(BasePlatformAdapter):
self._processed_events_set.add(event_id)
return False
# ------------------------------------------------------------------
# E2EE helpers
# ------------------------------------------------------------------
async def _verify_device_keys_on_server(self, client: Any, olm: Any) -> bool:
"""Verify our device keys are on the homeserver after loading crypto state.
Returns True if keys are valid or were successfully re-uploaded.
Returns False if verification fails (caller should refuse E2EE).
"""
try:
resp = await client.query_keys({client.mxid: [client.device_id]})
except Exception as exc:
logger.error(
"Matrix: cannot verify device keys on server: %s — refusing E2EE", exc,
)
return False
# query_keys returns typed objects (QueryKeysResponse, DeviceKeys
# with KeyID keys). Normalise to plain strings for comparison.
device_keys_map = getattr(resp, "device_keys", {}) or {}
our_user_devices = device_keys_map.get(str(client.mxid)) or {}
our_keys = our_user_devices.get(str(client.device_id))
if not our_keys:
logger.warning("Matrix: device keys missing from server — re-uploading")
olm.account.shared = False
try:
await olm.share_keys()
except Exception as exc:
logger.error("Matrix: failed to re-upload device keys: %s", exc)
return False
return True
# DeviceKeys.keys is a dict[KeyID, str]. Iterate to find the
# ed25519 key rather than constructing a KeyID for lookup.
server_ed25519 = None
keys_dict = getattr(our_keys, "keys", {}) or {}
for key_id, key_value in keys_dict.items():
if str(key_id).startswith("ed25519:"):
server_ed25519 = str(key_value)
break
local_ed25519 = olm.account.identity_keys.get("ed25519")
if server_ed25519 != local_ed25519:
if olm.account.shared:
# Restored account from DB but server has different keys — corrupted state.
logger.error(
"Matrix: server has different identity keys for device %s"
"local crypto state is stale. Delete %s and restart.",
client.device_id,
_CRYPTO_DB_PATH,
)
return False
# Fresh account (never uploaded). Server has stale keys from a
# previous installation. Try to delete the old device and re-upload.
logger.warning(
"Matrix: server has stale keys for device %s — attempting re-upload",
client.device_id,
)
try:
await client.api.request(
client.api.Method.DELETE
if hasattr(client.api, "Method")
else "DELETE",
f"/_matrix/client/v3/devices/{client.device_id}",
)
logger.info("Matrix: deleted stale device %s from server", client.device_id)
except Exception:
# Device deletion often requires UIA or may simply not be
# permitted — that's fine, share_keys will try to overwrite.
pass
try:
await olm.share_keys()
except Exception as exc:
logger.error(
"Matrix: cannot upload device keys for %s: %s. "
"Try generating a new access token to get a fresh device.",
client.device_id,
exc,
)
return False
return True
# ------------------------------------------------------------------
# Required overrides
# ------------------------------------------------------------------
@@ -350,54 +464,67 @@ class MatrixAdapter(BasePlatformAdapter):
return False
try:
from mautrix.crypto import OlmMachine
from mautrix.crypto.store import MemoryCryptoStore
from mautrix.crypto.store.asyncpg import PgCryptoStore
from mautrix.util.async_db import Database
_STORE_DIR.mkdir(parents=True, exist_ok=True)
# Remove legacy pickle file from pre-SQLite era.
legacy_pickle = _STORE_DIR / "crypto_store.pickle"
if legacy_pickle.exists():
logger.info("Matrix: removing legacy crypto_store.pickle (migrated to SQLite)")
legacy_pickle.unlink()
# Open SQLite-backed crypto store.
crypto_db = Database.create(
f"sqlite:///{_CRYPTO_DB_PATH}",
upgrade_table=PgCryptoStore.upgrade_table,
)
await crypto_db.start()
self._crypto_db = crypto_db
# account_id and pickle_key are required by mautrix ≥0.21.
# Use the Matrix user ID as account_id for stable identity.
# pickle_key secures in-memory serialisation; derive from
# the same user_id:device_id pair used for the on-disk HMAC.
_acct_id = self._user_id or "hermes"
_pickle_key = f"{_acct_id}:{self._device_id}"
crypto_store = MemoryCryptoStore(
_pickle_key = f"{_acct_id}:{self._device_id or 'default'}"
crypto_store = PgCryptoStore(
account_id=_acct_id,
pickle_key=_pickle_key,
db=crypto_db,
)
await crypto_store.open()
# Restore persisted crypto state from a previous run.
# Uses HMAC to verify integrity before unpickling.
pickle_path = _CRYPTO_PICKLE_PATH
if pickle_path.exists():
try:
import hashlib, hmac, pickle
raw = pickle_path.read_bytes()
# Format: 32-byte HMAC-SHA256 signature + pickle data.
if len(raw) > 32:
sig, payload = raw[:32], raw[32:]
# Key is derived from the device_id + user_id (stable per install).
hmac_key = f"{self._user_id}:{self._device_id}".encode()
expected = hmac.new(hmac_key, payload, hashlib.sha256).digest()
if hmac.compare_digest(sig, expected):
saved = pickle.loads(payload) # noqa: S301
if isinstance(saved, MemoryCryptoStore):
crypto_store = saved
logger.info("Matrix: restored E2EE crypto store from %s", pickle_path)
else:
logger.warning("Matrix: crypto store HMAC mismatch — ignoring stale/tampered file")
except Exception as exc:
logger.warning("Matrix: could not restore crypto store: %s", exc)
crypto_state = _CryptoStateStore(state_store, self._joined_rooms)
olm = OlmMachine(client, crypto_store, crypto_state)
olm = OlmMachine(client, crypto_store, state_store)
# Set trust policy: accept unverified devices so senders
# share Megolm session keys with us automatically.
# Accept unverified devices so senders share Megolm
# session keys with us automatically.
olm.share_keys_min_trust = TrustState.UNVERIFIED
olm.send_keys_min_trust = TrustState.UNVERIFIED
await olm.load()
# Verify our device keys are still on the homeserver.
if not await self._verify_device_keys_on_server(client, olm):
await crypto_db.stop()
await api.session.close()
return False
# Import cross-signing private keys from SSSS and self-sign
# the current device. Required after any device-key rotation
# (fresh crypto.db, share_keys re-upload) — otherwise the
# device's self-signing signature is stale and peers refuse
# to share Megolm sessions with the rotated device.
recovery_key = os.getenv("MATRIX_RECOVERY_KEY", "").strip()
if recovery_key:
try:
await olm.verify_with_recovery_key(recovery_key)
logger.info("Matrix: cross-signing verified via recovery key")
except Exception as exc:
logger.warning("Matrix: recovery key verification failed: %s", exc)
client.crypto = olm
logger.info(
"Matrix: E2EE enabled (store: %s%s)",
str(_STORE_DIR),
str(_CRYPTO_DB_PATH),
f", device_id={client.device_id}" if client.device_id else "",
)
except Exception as exc:
@@ -438,6 +565,15 @@ class MatrixAdapter(BasePlatformAdapter):
)
# Build DM room cache from m.direct account data.
await self._refresh_dm_cache()
# Dispatch events from the initial sync so the OlmMachine
# receives to-device key shares queued while we were offline.
try:
tasks = client.handle_sync(sync_data)
if tasks:
await asyncio.gather(*tasks)
except Exception as exc:
logger.warning("Matrix: initial sync event dispatch error: %s", exc)
else:
logger.warning("Matrix: initial sync returned unexpected type %s", type(sync_data).__name__)
except Exception as exc:
@@ -466,21 +602,12 @@ class MatrixAdapter(BasePlatformAdapter):
except (asyncio.CancelledError, Exception):
pass
# Persist E2EE crypto store before closing so the next restart
# can decrypt events using sessions from this run.
if self._client and self._encryption and getattr(self._client, "crypto", None):
# Close the SQLite crypto store database.
if hasattr(self, "_crypto_db") and self._crypto_db:
try:
import hashlib, hmac, pickle
crypto_store = self._client.crypto.crypto_store
_STORE_DIR.mkdir(parents=True, exist_ok=True)
pickle_path = _CRYPTO_PICKLE_PATH
payload = pickle.dumps(crypto_store)
hmac_key = f"{self._user_id}:{self._device_id}".encode()
sig = hmac.new(hmac_key, payload, hashlib.sha256).digest()
pickle_path.write_bytes(sig + payload)
logger.info("Matrix: persisted E2EE crypto store to %s", pickle_path)
await self._crypto_db.stop()
except Exception as exc:
logger.debug("Matrix: could not persist crypto store on disconnect: %s", exc)
logger.debug("Matrix: could not close crypto DB on disconnect: %s", exc)
if self._client:
try:
@@ -654,7 +781,7 @@ class MatrixAdapter(BasePlatformAdapter):
# Try aiohttp first (always available), fall back to httpx
try:
import aiohttp as _aiohttp
async with _aiohttp.ClientSession() as http:
async with _aiohttp.ClientSession(trust_env=True) as http:
async with http.get(image_url, timeout=_aiohttp.ClientTimeout(total=30)) as resp:
resp.raise_for_status()
data = await resp.read()
@@ -831,6 +958,16 @@ class MatrixAdapter(BasePlatformAdapter):
sync_data = await client.sync(
since=next_batch, timeout=30000,
)
# nio returns SyncError objects (not exceptions) for auth
# failures like M_UNKNOWN_TOKEN. Detect and stop immediately.
_sync_msg = getattr(sync_data, "message", None)
if _sync_msg and isinstance(_sync_msg, str):
_lower = _sync_msg.lower()
if "m_unknown_token" in _lower or "unknown_token" in _lower:
logger.error("Matrix: permanent auth error from sync: %s — stopping", _sync_msg)
return
if isinstance(sync_data, dict):
# Update joined rooms from sync response.
rooms_join = sync_data.get("rooms", {}).get("join", {})
@@ -853,13 +990,6 @@ class MatrixAdapter(BasePlatformAdapter):
except Exception as exc:
logger.warning("Matrix: sync event dispatch error: %s", exc)
# Share keys periodically if E2EE is enabled.
if self._encryption and getattr(client, "crypto", None):
try:
await client.crypto.share_keys()
except Exception as exc:
logger.warning("Matrix: E2EE key share failed: %s", exc)
# Retry any buffered undecrypted events.
if self._pending_megolm:
await self._retry_pending_decryptions()
@@ -1014,7 +1144,10 @@ class MatrixAdapter(BasePlatformAdapter):
thread_id = relates_to.get("event_id")
formatted_body = source_content.get("formatted_body")
is_mentioned = self._is_bot_mentioned(body, formatted_body)
# m.mentions.user_ids (MSC3952 / Matrix v1.7) — authoritative mention signal.
mentions_block = source_content.get("m.mentions") or {}
mention_user_ids = mentions_block.get("user_ids") if isinstance(mentions_block, dict) else None
is_mentioned = self._is_bot_mentioned(body, formatted_body, mention_user_ids)
# Require-mention gating.
if not is_dm:
@@ -1488,52 +1621,6 @@ class MatrixAdapter(BasePlatformAdapter):
logger.warning("Matrix: redact error: %s", exc)
return False
# ------------------------------------------------------------------
# Room history
# ------------------------------------------------------------------
async def fetch_room_history(
self,
room_id: str,
limit: int = 50,
start: str = "",
) -> list:
"""Fetch recent messages from a room."""
if not self._client:
return []
try:
resp = await self._client.get_messages(
RoomID(room_id),
direction=PaginationDirection.BACKWARD,
from_token=SyncToken(start) if start else None,
limit=limit,
)
except Exception as exc:
logger.warning("Matrix: get_messages failed for %s: %s", room_id, exc)
return []
if not resp:
return []
events = getattr(resp, "chunk", []) or (resp.get("chunk", []) if isinstance(resp, dict) else [])
messages = []
for event in reversed(events):
body = ""
content = getattr(event, "content", None)
if content:
if hasattr(content, "body"):
body = content.body or ""
elif isinstance(content, dict):
body = content.get("body", "")
messages.append({
"event_id": str(getattr(event, "event_id", "")),
"sender": str(getattr(event, "sender", "")),
"body": body,
"timestamp": getattr(event, "timestamp", 0) or getattr(event, "server_timestamp", 0),
"type": type(event).__name__,
})
return messages
# ------------------------------------------------------------------
# Room creation & management
# ------------------------------------------------------------------
@@ -1637,18 +1724,6 @@ class MatrixAdapter(BasePlatformAdapter):
except Exception as exc:
return SendResult(success=False, error=str(exc))
async def send_emote(
self, chat_id: str, text: str, metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an emote message (/me style action)."""
return await self._send_simple_message(chat_id, text, "m.emote")
async def send_notice(
self, chat_id: str, text: str, metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notice message (bot-appropriate, non-alerting)."""
return await self._send_simple_message(chat_id, text, "m.notice")
# ------------------------------------------------------------------
# Helpers
# ------------------------------------------------------------------
@@ -1701,8 +1776,24 @@ class MatrixAdapter(BasePlatformAdapter):
# Mention detection helpers
# ------------------------------------------------------------------
def _is_bot_mentioned(self, body: str, formatted_body: Optional[str] = None) -> bool:
"""Return True if the bot is mentioned in the message."""
def _is_bot_mentioned(
self,
body: str,
formatted_body: Optional[str] = None,
mention_user_ids: Optional[list] = None,
) -> bool:
"""Return True if the bot is mentioned in the message.
Per MSC3952, ``m.mentions.user_ids`` is the authoritative mention
signal in the Matrix spec. When the sender's client populates that
field with the bot's user-id, we trust it — even when the visible
body text does not contain an explicit ``@bot`` string (some clients
only render mention "pills" in ``formatted_body`` or use display
names).
"""
# m.mentions.user_ids — authoritative per MSC3952 / Matrix v1.7.
if mention_user_ids and self._user_id and self._user_id in mention_user_ids:
return True
if not body and not formatted_body:
return False
if self._user_id and self._user_id in body:
File diff suppressed because it is too large Load Diff
-16
View File
@@ -17,7 +17,6 @@ import json
import logging
import os
import random
import re
import time
from datetime import datetime, timezone
from pathlib import Path
@@ -781,21 +780,6 @@ class SignalAdapter(BasePlatformAdapter):
# Typing Indicators
# ------------------------------------------------------------------
async def _start_typing_indicator(self, chat_id: str) -> None:
"""Start a typing indicator loop for a chat."""
if chat_id in self._typing_tasks:
return # Already running
async def _typing_loop():
try:
while True:
await self.send_typing(chat_id)
await asyncio.sleep(TYPING_INTERVAL)
except asyncio.CancelledError:
pass
self._typing_tasks[chat_id] = asyncio.create_task(_typing_loop())
async def _stop_typing_indicator(self, chat_id: str) -> None:
"""Stop a typing indicator loop for a chat."""
task = self._typing_tasks.pop(chat_id, None)
+57 -12
View File
@@ -65,7 +65,10 @@ from gateway.platforms.base import (
cache_image_from_bytes,
cache_audio_from_bytes,
cache_document_from_bytes,
resolve_proxy_url,
SUPPORTED_DOCUMENT_TYPES,
utf16_len,
_prefix_within_utf16_limit,
)
from gateway.platforms.telegram_network import (
TelegramFallbackTransport,
@@ -537,10 +540,7 @@ class TelegramAdapter(BasePlatformAdapter):
"write_timeout": _env_float("HERMES_TELEGRAM_HTTP_WRITE_TIMEOUT", 20.0),
}
proxy_configured = any(
(os.getenv(k) or "").strip()
for k in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy")
)
proxy_url = resolve_proxy_url()
disable_fallback = (os.getenv("HERMES_TELEGRAM_DISABLE_FALLBACK_IPS", "").strip().lower() in ("1", "true", "yes", "on"))
fallback_ips = self._fallback_ips()
if not fallback_ips:
@@ -551,7 +551,7 @@ class TelegramAdapter(BasePlatformAdapter):
", ".join(fallback_ips),
)
if fallback_ips and not proxy_configured and not disable_fallback:
if fallback_ips and not proxy_url and not disable_fallback:
logger.info(
"[%s] Telegram fallback IPs active: %s",
self.name,
@@ -567,10 +567,12 @@ class TelegramAdapter(BasePlatformAdapter):
**request_kwargs,
httpx_kwargs={"transport": TelegramFallbackTransport(fallback_ips)},
)
elif proxy_url:
logger.info("[%s] Proxy detected; passing explicitly to HTTPXRequest: %s", self.name, proxy_url)
request = HTTPXRequest(**request_kwargs, proxy=proxy_url)
get_updates_request = HTTPXRequest(**request_kwargs, proxy=proxy_url)
else:
if proxy_configured:
logger.info("[%s] Proxy configured; skipping Telegram fallback-IP transport", self.name)
elif disable_fallback:
if disable_fallback:
logger.info("[%s] Telegram fallback-IP transport disabled via env", self.name)
request = HTTPXRequest(**request_kwargs)
get_updates_request = HTTPXRequest(**request_kwargs)
@@ -799,7 +801,9 @@ class TelegramAdapter(BasePlatformAdapter):
try:
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
chunks = self.truncate_message(
formatted, self.MAX_MESSAGE_LENGTH, len_fn=utf16_len,
)
if len(chunks) > 1:
# truncate_message appends a raw " (1/2)" suffix. Escape the
# MarkdownV2-special parentheses so Telegram doesn't reject the
@@ -970,7 +974,9 @@ class TelegramAdapter(BasePlatformAdapter):
# streaming). Truncate and succeed so the stream consumer can
# split the overflow into a new message instead of dying.
if "message_too_long" in err_str or "too long" in err_str:
truncated = content[: self.MAX_MESSAGE_LENGTH - 20] + ""
truncated = _prefix_within_utf16_limit(
content, self.MAX_MESSAGE_LENGTH - 20
) + ""
try:
await self._bot.edit_message_text(
chat_id=int(chat_id),
@@ -1910,9 +1916,20 @@ class TelegramAdapter(BasePlatformAdapter):
)
# 9) Convert blockquotes: > at line start → protect > from escaping
# Handle both regular blockquotes (> text) and expandable blockquotes
# (Telegram MarkdownV2: **> for expandable start, || to end the quote)
def _convert_blockquote(m):
prefix = m.group(1) # >, >>, >>>, **>, or **>> etc.
content = m.group(2)
# Check if content ends with || (expandable blockquote end marker)
# In this case, preserve the trailing || unescaped for Telegram
if prefix.startswith('**') and content.endswith('||'):
return _ph(f'{prefix} {_escape_mdv2(content[:-2])}||')
return _ph(f'{prefix} {_escape_mdv2(content)}')
text = re.sub(
r'^(>{1,3}) (.+)$',
lambda m: _ph(m.group(1) + ' ' + _escape_mdv2(m.group(2))),
r'^((?:\*\*)?>{1,3}) (.+)$',
_convert_blockquote,
text,
flags=re.MULTILINE,
)
@@ -1985,6 +2002,27 @@ class TelegramAdapter(BasePlatformAdapter):
return {str(part).strip() for part in raw if str(part).strip()}
return {part.strip() for part in str(raw).split(",") if part.strip()}
def _telegram_ignored_threads(self) -> set[int]:
raw = self.config.extra.get("ignored_threads")
if raw is None:
raw = os.getenv("TELEGRAM_IGNORED_THREADS", "")
if isinstance(raw, list):
values = raw
else:
values = str(raw).split(",")
ignored: set[int] = set()
for value in values:
text = str(value).strip()
if not text:
continue
try:
ignored.add(int(text))
except (TypeError, ValueError):
logger.warning("[%s] Ignoring invalid Telegram thread id: %r", self.name, value)
return ignored
def _compile_mention_patterns(self) -> List[re.Pattern]:
"""Compile optional regex wake-word patterns for group triggers."""
patterns = self.config.extra.get("mention_patterns")
@@ -2096,6 +2134,13 @@ class TelegramAdapter(BasePlatformAdapter):
"""
if not self._is_group_chat(message):
return True
thread_id = getattr(message, "message_thread_id", None)
if thread_id is not None:
try:
if int(thread_id) in self._telegram_ignored_threads():
return False
except (TypeError, ValueError):
logger.warning("[%s] Ignoring non-numeric Telegram message_thread_id: %r", self.name, thread_id)
if str(getattr(getattr(message, "chat", None), "id", "")) in self._telegram_free_response_chats():
return True
if not self._telegram_require_mention():
-1
View File
@@ -12,7 +12,6 @@ from __future__ import annotations
import asyncio
import ipaddress
import logging
import os
import socket
from typing import Iterable, Optional
+1 -1
View File
@@ -27,7 +27,6 @@ import hashlib
import hmac
import json
import logging
import os
import re
import subprocess
import time
@@ -204,6 +203,7 @@ class WebhookAdapter(BasePlatformAdapter):
"wecom_callback",
"weixin",
"bluebubbles",
"qqbot",
):
return await self._deliver_cross_platform(
deliver_type, content, delivery
+1 -2
View File
@@ -37,7 +37,6 @@ import logging
import mimetypes
import os
import re
import time
import uuid
from datetime import datetime, timezone
from pathlib import Path
@@ -266,7 +265,7 @@ class WeComAdapter(BasePlatformAdapter):
async def _open_connection(self) -> None:
"""Open and authenticate a websocket connection."""
await self._cleanup_ws()
self._session = aiohttp.ClientSession()
self._session = aiohttp.ClientSession(trust_env=True)
self._ws = await self._session.ws_connect(
self._ws_url,
heartbeat=HEARTBEAT_INTERVAL_SECONDS * 2,
+146 -51
View File
@@ -112,6 +112,7 @@ TYPING_STOP = 2
_HEADER_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
_TABLE_RULE_RE = re.compile(r"^\s*\|?(?:\s*:?-{3,}:?\s*\|)+\s*:?-{3,}:?\s*\|?\s*$")
_FENCE_RE = re.compile(r"^```([^\n`]*)\s*$")
_MARKDOWN_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
def check_weixin_requirements() -> bool:
@@ -398,15 +399,16 @@ async def _send_message(
context_token: Optional[str],
client_id: str,
) -> None:
if not text or not text.strip():
raise ValueError("_send_message: text must not be empty")
message: Dict[str, Any] = {
"from_user_id": "",
"to_user_id": to,
"client_id": client_id,
"message_type": MSG_TYPE_BOT,
"message_state": MSG_STATE_FINISH,
"item_list": [{"type": ITEM_TEXT, "text_item": {"text": text}}],
}
if text:
message["item_list"] = [{"type": ITEM_TEXT, "text_item": {"text": text}}]
if context_token:
message["context_token"] = context_token
await _api_post(
@@ -499,13 +501,15 @@ async def _upload_ciphertext(
session: "aiohttp.ClientSession",
*,
ciphertext: bytes,
cdn_base_url: str,
upload_param: str,
filekey: str,
upload_url: str,
) -> str:
url = _cdn_upload_url(cdn_base_url, upload_param, filekey)
"""Upload encrypted media to the CDN.
Accepts either a constructed CDN URL (from upload_param) or a direct
upload_full_url both use POST with the raw ciphertext as the body.
"""
timeout = aiohttp.ClientTimeout(total=120)
async with session.post(url, data=ciphertext, headers={"Content-Type": "application/octet-stream"}, timeout=timeout) as response:
async with session.post(upload_url, data=ciphertext, headers={"Content-Type": "application/octet-stream"}, timeout=timeout) as response:
if response.status == 200:
encrypted_param = response.headers.get("x-encrypted-param")
if encrypted_param:
@@ -649,7 +653,7 @@ def _normalize_markdown_blocks(content: str) -> str:
result.append(_rewrite_table_block_for_weixin(table_lines))
continue
result.append(_rewrite_headers_for_weixin(line))
result.append(_MARKDOWN_LINK_RE.sub(r"\1 (\2)", _rewrite_headers_for_weixin(line)))
i += 1
normalized = "\n".join(item.rstrip() for item in result)
@@ -734,6 +738,42 @@ def _split_delivery_units_for_weixin(content: str) -> List[str]:
return [unit for unit in units if unit]
def _looks_like_chatty_line_for_weixin(line: str) -> bool:
"""Return True when a line looks like a standalone chat utterance."""
stripped = line.strip()
if not stripped:
return False
if len(stripped) > 48:
return False
if line.startswith((" ", "\t")):
return False
if stripped.startswith((">", "-", "*", "")):
return False
if re.match(r"^\*\*[^*]+\*\*$", stripped):
return False
if re.match(r"^\d+\.\s", stripped):
return False
return True
def _looks_like_heading_line_for_weixin(line: str) -> bool:
"""Return True when a short line behaves like a plain-text heading."""
stripped = line.strip()
if not stripped:
return False
return len(stripped) <= 24 and stripped.endswith((":", ""))
def _should_split_short_chat_block_for_weixin(block: str) -> bool:
"""Split only chat-like multiline blocks into separate bubbles."""
lines = [line for line in block.splitlines() if line.strip()]
if not 2 <= len(lines) <= 6:
return False
if _looks_like_heading_line_for_weixin(lines[0]):
return False
return all(_looks_like_chatty_line_for_weixin(line) for line in lines)
def _pack_markdown_blocks_for_weixin(content: str, max_length: int) -> List[str]:
if len(content) <= max_length:
return [content]
@@ -775,6 +815,8 @@ def _split_text_for_weixin_delivery(
``platforms.weixin.extra.split_multiline_messages`` (``true`` / ``false``)
or the env var ``WEIXIN_SPLIT_MULTILINE_MESSAGES``.
"""
if not content:
return []
if split_per_line:
# Legacy: one message per top-level delivery unit.
if len(content) <= max_length and "\n" not in content:
@@ -785,11 +827,17 @@ def _split_text_for_weixin_delivery(
chunks.append(unit)
continue
chunks.extend(_pack_markdown_blocks_for_weixin(unit, max_length))
return chunks or [content]
return [c for c in chunks if c] or [content]
# Compact (default): single message when under the limit.
# Compact (default): single message when under the limit — unless the
# content looks like a short chatty exchange, in which case split into
# separate bubbles for a more natural chat feel.
if len(content) <= max_length:
return [content]
return (
[u for u in _split_delivery_units_for_weixin(content) if u]
if _should_split_short_chat_block_for_weixin(content)
else [content]
)
return _pack_markdown_blocks_for_weixin(content, max_length) or [content]
@@ -887,7 +935,7 @@ async def qr_login(
if not AIOHTTP_AVAILABLE:
raise RuntimeError("aiohttp is required for Weixin QR login")
async with aiohttp.ClientSession() as session:
async with aiohttp.ClientSession(trust_env=True) as session:
try:
qr_resp = await _api_get(
session,
@@ -1000,6 +1048,10 @@ class WeixinAdapter(BasePlatformAdapter):
MAX_MESSAGE_LENGTH = 4000
# WeChat does not support editing sent messages — streaming must use the
# fallback "send-final-only" path so the cursor (▉) is never left visible.
SUPPORTS_MESSAGE_EDITING = False
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WEIXIN)
extra = config.extra or {}
@@ -1082,7 +1134,7 @@ class WeixinAdapter(BasePlatformAdapter):
except Exception as exc:
logger.debug("[%s] Token lock unavailable (non-fatal): %s", self.name, exc)
self._session = aiohttp.ClientSession()
self._session = aiohttp.ClientSession(trust_env=True)
self._token_store.restore(self._account_id)
self._poll_task = asyncio.create_task(self._poll_loop(), name="weixin-poll")
self._mark_connected()
@@ -1409,7 +1461,7 @@ class WeixinAdapter(BasePlatformAdapter):
context_token = self._token_store.get(self._account_id, chat_id)
last_message_id: Optional[str] = None
try:
chunks = self._split_text(self.format_message(content))
chunks = [c for c in self._split_text(self.format_message(content)) if c and c.strip()]
for idx, chunk in enumerate(chunks):
client_id = f"hermes-weixin-{uuid.uuid4().hex}"
await self._send_text_chunk(
@@ -1495,24 +1547,51 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
return await self.send_document(chat_id, path, caption=caption, metadata=metadata)
return await self.send_document(chat_id, file_path=path, caption=caption, metadata=metadata)
async def send_document(
self,
chat_id: str,
path: str,
file_path: str,
caption: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
if not self._session or not self._token:
return SendResult(success=False, error="Not connected")
try:
message_id = await self._send_file(chat_id, path, caption)
message_id = await self._send_file(chat_id, file_path, caption)
return SendResult(success=True, message_id=message_id)
except Exception as exc:
logger.error("[%s] send_document failed to=%s: %s", self.name, _safe_id(chat_id), exc)
return SendResult(success=False, error=str(exc))
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
if not self._session or not self._token:
return SendResult(success=False, error="Not connected")
try:
message_id = await self._send_file(chat_id, video_path, caption or "")
return SendResult(success=True, message_id=message_id)
except Exception as exc:
logger.error("[%s] send_video failed to=%s: %s", self.name, _safe_id(chat_id), exc)
return SendResult(success=False, error=str(exc))
async def send_voice(
self,
chat_id: str,
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
return await self.send_document(chat_id, audio_path, caption=caption or "", metadata=metadata)
async def _download_remote_media(self, url: str) -> str:
from tools.url_safety import is_safe_url
@@ -1535,6 +1614,7 @@ class WeixinAdapter(BasePlatformAdapter):
filekey = secrets.token_hex(16)
aes_key = secrets.token_bytes(16)
rawsize = len(plaintext)
rawfilemd5 = hashlib.md5(plaintext).hexdigest()
upload_response = await _get_upload_url(
self._session,
base_url=self._base_url,
@@ -1543,41 +1623,42 @@ class WeixinAdapter(BasePlatformAdapter):
media_type=media_type,
filekey=filekey,
rawsize=rawsize,
rawfilemd5=hashlib.md5(plaintext).hexdigest(),
rawfilemd5=rawfilemd5,
filesize=_aes_padded_size(rawsize),
aeskey_hex=aes_key.hex(),
)
upload_param = str(upload_response.get("upload_param") or "")
upload_full_url = str(upload_response.get("upload_full_url") or "")
ciphertext = _aes128_ecb_encrypt(plaintext, aes_key)
if upload_param:
encrypted_query_param = await _upload_ciphertext(
self._session,
ciphertext=ciphertext,
cdn_base_url=self._cdn_base_url,
upload_param=upload_param,
filekey=filekey,
)
elif upload_full_url:
timeout = aiohttp.ClientTimeout(total=120)
async with self._session.put(
upload_full_url,
data=ciphertext,
headers={"Content-Type": "application/octet-stream"},
timeout=timeout,
) as response:
response.raise_for_status()
encrypted_query_param = response.headers.get("x-encrypted-param") or filekey
# Prefer upload_full_url (direct CDN), fall back to constructed CDN URL
# from upload_param. Both paths use POST — the old PUT for
# upload_full_url caused 404s on the WeChat CDN.
if upload_full_url:
upload_url = upload_full_url
elif upload_param:
upload_url = _cdn_upload_url(self._cdn_base_url, upload_param, filekey)
else:
raise RuntimeError(f"getUploadUrl returned neither upload_param nor upload_full_url: {upload_response}")
encrypted_query_param = await _upload_ciphertext(
self._session,
ciphertext=ciphertext,
upload_url=upload_url,
)
context_token = self._token_store.get(self._account_id, chat_id)
# The iLink API expects aes_key as base64(hex_string), not base64(raw_bytes).
# Sending base64(raw_bytes) causes images to show as grey boxes on the
# receiver side because the decryption key doesn't match.
aes_key_for_api = base64.b64encode(aes_key.hex().encode("ascii")).decode("ascii")
media_item = item_builder(
encrypt_query_param=encrypted_query_param,
aes_key_b64=base64.b64encode(aes_key).decode("ascii"),
aes_key_for_api=aes_key_for_api,
ciphertext_size=len(ciphertext),
plaintext_size=rawsize,
filename=Path(path).name,
rawfilemd5=rawfilemd5,
)
last_message_id = None
@@ -1617,39 +1698,53 @@ class WeixinAdapter(BasePlatformAdapter):
def _outbound_media_builder(self, path: str):
mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
if mime.startswith("image/"):
return MEDIA_IMAGE, lambda **kwargs: {
return MEDIA_IMAGE, lambda **kw: {
"type": ITEM_IMAGE,
"image_item": {
"media": {
"encrypt_query_param": kwargs["encrypt_query_param"],
"aes_key": kwargs["aes_key_b64"],
"encrypt_query_param": kw["encrypt_query_param"],
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"mid_size": kwargs["ciphertext_size"],
"mid_size": kw["ciphertext_size"],
},
}
if mime.startswith("video/"):
return MEDIA_VIDEO, lambda **kwargs: {
return MEDIA_VIDEO, lambda **kw: {
"type": ITEM_VIDEO,
"video_item": {
"media": {
"encrypt_query_param": kwargs["encrypt_query_param"],
"aes_key": kwargs["aes_key_b64"],
"encrypt_query_param": kw["encrypt_query_param"],
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"video_size": kwargs["ciphertext_size"],
"video_size": kw["ciphertext_size"],
"play_length": kw.get("play_length", 0),
"video_md5": kw.get("rawfilemd5", ""),
},
}
return MEDIA_FILE, lambda **kwargs: {
if mime.startswith("audio/") or path.endswith(".silk"):
return MEDIA_VOICE, lambda **kw: {
"type": ITEM_VOICE,
"voice_item": {
"media": {
"encrypt_query_param": kw["encrypt_query_param"],
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"playtime": kw.get("playtime", 0),
},
}
return MEDIA_FILE, lambda **kw: {
"type": ITEM_FILE,
"file_item": {
"media": {
"encrypt_query_param": kwargs["encrypt_query_param"],
"aes_key": kwargs["aes_key_b64"],
"encrypt_query_param": kw["encrypt_query_param"],
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"file_name": kwargs["filename"],
"len": str(kwargs["plaintext_size"]),
"file_name": kw["filename"],
"len": str(kw["plaintext_size"]),
},
}
@@ -1689,7 +1784,7 @@ async def send_weixin_direct(
token_store.restore(account_id)
context_token = token_store.get(account_id, chat_id)
async with aiohttp.ClientSession() as session:
async with aiohttp.ClientSession(trust_env=True) as session:
adapter = WeixinAdapter(
PlatformConfig(
enabled=True,
+103 -26
View File
@@ -120,8 +120,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
- session_path: Path to store WhatsApp session data
"""
# WhatsApp message limits
MAX_MESSAGE_LENGTH = 65536 # WhatsApp allows longer messages
# WhatsApp message limits — practical UX limit, not protocol max.
# WhatsApp allows ~65K but long messages are unreadable on mobile.
MAX_MESSAGE_LENGTH = 4096
# Default bridge location relative to the hermes-agent install
_DEFAULT_BRIDGE_DIR = Path(__file__).resolve().parents[2] / "scripts" / "whatsapp-bridge"
@@ -531,6 +532,63 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._close_bridge_log()
print(f"[{self.name}] Disconnected")
def format_message(self, content: str) -> str:
"""Convert standard markdown to WhatsApp-compatible formatting.
WhatsApp supports: *bold*, _italic_, ~strikethrough~, ```code```,
and monospaced `inline`. Standard markdown uses different syntax
for bold/italic/strikethrough, so we convert here.
Code blocks (``` fenced) and inline code (`) are protected from
conversion via placeholder substitution.
"""
if not content:
return content
# --- 1. Protect fenced code blocks from formatting changes ---
_FENCE_PH = "\x00FENCE"
fences: list[str] = []
def _save_fence(m: re.Match) -> str:
fences.append(m.group(0))
return f"{_FENCE_PH}{len(fences) - 1}\x00"
result = re.sub(r"```[\s\S]*?```", _save_fence, content)
# --- 2. Protect inline code ---
_CODE_PH = "\x00CODE"
codes: list[str] = []
def _save_code(m: re.Match) -> str:
codes.append(m.group(0))
return f"{_CODE_PH}{len(codes) - 1}\x00"
result = re.sub(r"`[^`\n]+`", _save_code, result)
# --- 3. Convert markdown formatting to WhatsApp syntax ---
# Bold: **text** or __text__ → *text*
result = re.sub(r"\*\*(.+?)\*\*", r"*\1*", result)
result = re.sub(r"__(.+?)__", r"*\1*", result)
# Strikethrough: ~~text~~ → ~text~
result = re.sub(r"~~(.+?)~~", r"~\1~", result)
# Italic: *text* is already WhatsApp italic — leave as-is
# _text_ is already WhatsApp italic — leave as-is
# --- 4. Convert markdown headers to bold text ---
# # Header → *Header*
result = re.sub(r"^#{1,6}\s+(.+)$", r"*\1*", result, flags=re.MULTILINE)
# --- 5. Convert markdown links: [text](url) → text (url) ---
result = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"\1 (\2)", result)
# --- 6. Restore protected sections ---
for i, fence in enumerate(fences):
result = result.replace(f"{_FENCE_PH}{i}\x00", fence)
for i, code in enumerate(codes):
result = result.replace(f"{_CODE_PH}{i}\x00", code)
return result
async def send(
self,
chat_id: str,
@@ -538,38 +596,57 @@ class WhatsAppAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge."""
"""Send a message via the WhatsApp bridge.
Formats markdown for WhatsApp, splits long messages into chunks
that preserve code block boundaries, and sends each chunk sequentially.
"""
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
if not content or not content.strip():
return SendResult(success=True, message_id=None)
try:
import aiohttp
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
# Format and chunk the message
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
last_message_id = None
for chunk in chunks:
payload: Dict[str, Any] = {
"chatId": chat_id,
"message": chunk,
}
if reply_to and last_message_id is None:
# Only reply-to on the first chunk
payload["replyTo"] = reply_to
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
last_message_id = data.get("messageId")
else:
error = await resp.text()
return SendResult(success=False, error=error)
# Small delay between chunks to avoid rate limiting
if len(chunks) > 1:
await asyncio.sleep(0.3)
return SendResult(
success=True,
message_id=last_message_id,
)
except Exception as e:
return SendResult(success=False, error=str(e))
+1090 -154
View File
File diff suppressed because it is too large Load Diff
+10 -4
View File
@@ -12,7 +12,6 @@ import hashlib
import logging
import os
import json
import re
import threading
import uuid
from pathlib import Path
@@ -807,9 +806,9 @@ class SessionStore:
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
"""
import time as _time
from datetime import timedelta
cutoff = _time.time() - max_age_seconds
cutoff = _now() - timedelta(seconds=max_age_seconds)
count = 0
with self._lock:
self._ensure_loaded_locked()
@@ -878,7 +877,8 @@ class SessionStore:
Used by ``/resume`` to restore a previously-named session.
Ends the current session in SQLite (like reset), but instead of
generating a fresh session ID, re-uses ``target_session_id`` so the
old transcript is loaded on the next message.
old transcript is loaded on the next message. If the target session was
previously ended, re-open it so gateway resume semantics match the CLI.
"""
db_end_session_id = None
new_entry = None
@@ -918,6 +918,12 @@ class SessionStore:
except Exception as e:
logger.debug("Session DB end_session failed: %s", e)
if self._db:
try:
self._db.reopen_session(target_session_id)
except Exception as e:
logger.debug("Session DB reopen_session failed: %s", e)
return new_entry
def list_sessions(self, active_minutes: Optional[int] = None) -> List[SessionEntry]:
+5
View File
@@ -48,6 +48,7 @@ _SESSION_CHAT_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_NAME", def
_SESSION_THREAD_ID: ContextVar[str] = ContextVar("HERMES_SESSION_THREAD_ID", default="")
_SESSION_USER_ID: ContextVar[str] = ContextVar("HERMES_SESSION_USER_ID", default="")
_SESSION_USER_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_USER_NAME", default="")
_SESSION_KEY: ContextVar[str] = ContextVar("HERMES_SESSION_KEY", default="")
_VAR_MAP = {
"HERMES_SESSION_PLATFORM": _SESSION_PLATFORM,
@@ -56,6 +57,7 @@ _VAR_MAP = {
"HERMES_SESSION_THREAD_ID": _SESSION_THREAD_ID,
"HERMES_SESSION_USER_ID": _SESSION_USER_ID,
"HERMES_SESSION_USER_NAME": _SESSION_USER_NAME,
"HERMES_SESSION_KEY": _SESSION_KEY,
}
@@ -66,6 +68,7 @@ def set_session_vars(
thread_id: str = "",
user_id: str = "",
user_name: str = "",
session_key: str = "",
) -> list:
"""Set all session context variables and return reset tokens.
@@ -82,6 +85,7 @@ def set_session_vars(
_SESSION_THREAD_ID.set(thread_id),
_SESSION_USER_ID.set(user_id),
_SESSION_USER_NAME.set(user_name),
_SESSION_KEY.set(session_key),
]
return tokens
@@ -97,6 +101,7 @@ def clear_session_vars(tokens: list) -> None:
_SESSION_THREAD_ID,
_SESSION_USER_ID,
_SESSION_USER_NAME,
_SESSION_KEY,
]
for var, token in zip(vars_in_order, tokens):
var.reset(token)
+179 -19
View File
@@ -26,6 +26,8 @@ _GATEWAY_KIND = "hermes-gateway"
_RUNTIME_STATUS_FILE = "gateway_state.json"
_LOCKS_DIRNAME = "gateway-locks"
_IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
_VALID_STARTUP_CHECK_STATES = {"pending", "ready", "failed"}
def _get_pid_path() -> Path:
@@ -161,11 +163,39 @@ def _build_runtime_status_record() -> dict[str, Any]:
"restart_requested": False,
"active_agents": 0,
"platforms": {},
"startup_checks": {},
"updated_at": _utc_now_iso(),
})
return payload
def _normalize_startup_check_entries(
startup_checks: Optional[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
"""Normalize persisted startup readiness entries."""
if not isinstance(startup_checks, dict):
return {}
now = _utc_now_iso()
normalized: dict[str, dict[str, Any]] = {}
for raw_id, raw_payload in startup_checks.items():
check_id = str(raw_id).strip()
if not check_id:
continue
payload = raw_payload if isinstance(raw_payload, dict) else {}
state = str(payload.get("state", "pending")).strip().lower()
if state not in _VALID_STARTUP_CHECK_STATES:
state = "pending"
normalized[check_id] = {
"state": state,
"required": bool(payload.get("required", True)),
"source": payload.get("source"),
"detail": payload.get("detail"),
"updated_at": payload.get("updated_at") or now,
}
return normalized
def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
if not path.exists():
return None
@@ -218,14 +248,15 @@ def write_pid_file() -> None:
def write_runtime_status(
*,
gateway_state: Optional[str] = None,
exit_reason: Optional[str] = None,
restart_requested: Optional[bool] = None,
active_agents: Optional[int] = None,
platform: Optional[str] = None,
platform_state: Optional[str] = None,
error_code: Optional[str] = None,
error_message: Optional[str] = None,
gateway_state: Any = _UNSET,
exit_reason: Any = _UNSET,
restart_requested: Any = _UNSET,
active_agents: Any = _UNSET,
startup_checks: Any = _UNSET,
platform: Any = _UNSET,
platform_state: Any = _UNSET,
error_code: Any = _UNSET,
error_message: Any = _UNSET,
) -> None:
"""Persist gateway runtime health information for diagnostics/status."""
path = _get_runtime_status_path()
@@ -236,22 +267,24 @@ def write_runtime_status(
payload["start_time"] = _get_process_start_time(os.getpid())
payload["updated_at"] = _utc_now_iso()
if gateway_state is not None:
if gateway_state is not _UNSET:
payload["gateway_state"] = gateway_state
if exit_reason is not None:
if exit_reason is not _UNSET:
payload["exit_reason"] = exit_reason
if restart_requested is not None:
if restart_requested is not _UNSET:
payload["restart_requested"] = bool(restart_requested)
if active_agents is not None:
if active_agents is not _UNSET:
payload["active_agents"] = max(0, int(active_agents))
if startup_checks is not _UNSET:
payload["startup_checks"] = _normalize_startup_check_entries(startup_checks)
if platform is not None:
if platform is not _UNSET:
platform_payload = payload["platforms"].get(platform, {})
if platform_state is not None:
if platform_state is not _UNSET:
platform_payload["state"] = platform_state
if error_code is not None:
if error_code is not _UNSET:
platform_payload["error_code"] = error_code
if error_message is not None:
if error_message is not _UNSET:
platform_payload["error_message"] = error_message
platform_payload["updated_at"] = _utc_now_iso()
payload["platforms"][platform] = platform_payload
@@ -261,13 +294,131 @@ def write_runtime_status(
def read_runtime_status() -> Optional[dict[str, Any]]:
"""Read the persisted gateway runtime health/status information."""
return _read_json_file(_get_runtime_status_path())
payload = _read_json_file(_get_runtime_status_path())
if payload is None:
return None
payload.setdefault("platforms", {})
payload["startup_checks"] = _normalize_startup_check_entries(payload.get("startup_checks"))
return payload
def reset_startup_checks(checks: Optional[list[dict[str, Any]]] = None) -> dict[str, dict[str, Any]]:
"""Replace persisted startup readiness checks for the current run."""
normalized: dict[str, dict[str, Any]] = {}
now = _utc_now_iso()
for hook in checks or []:
if not isinstance(hook, dict):
continue
readiness = hook.get("startup_readiness")
if not isinstance(readiness, dict):
continue
check_id = str(readiness.get("id", "")).strip()
if not check_id:
continue
normalized[check_id] = {
"state": "pending",
"required": bool(readiness.get("required", True)),
"source": hook.get("name"),
"detail": None,
"updated_at": now,
}
write_runtime_status(startup_checks=normalized)
return normalized
def update_startup_check(
check_id: str,
state: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
"""Update a single startup readiness check in the runtime status file."""
normalized_id = str(check_id).strip()
if not normalized_id:
raise ValueError("startup readiness check id is required")
normalized_state = str(state).strip().lower()
if normalized_state not in _VALID_STARTUP_CHECK_STATES:
raise ValueError(f"invalid startup readiness state: {state}")
path = _get_runtime_status_path()
payload = _read_json_file(path) or _build_runtime_status_record()
checks = _normalize_startup_check_entries(payload.get("startup_checks"))
existing = checks.get(normalized_id, {})
now = _utc_now_iso()
checks[normalized_id] = {
"state": normalized_state,
"required": bool(existing.get("required", True) if required is _UNSET else required),
"source": existing.get("source") if source is _UNSET else source,
"detail": existing.get("detail") if detail is _UNSET else detail,
"updated_at": now,
}
payload["startup_checks"] = checks
payload.setdefault("platforms", {})
payload.setdefault("kind", _GATEWAY_KIND)
payload["pid"] = os.getpid()
payload["start_time"] = _get_process_start_time(os.getpid())
payload["updated_at"] = now
_write_json_file(path, payload)
return checks[normalized_id]
def mark_startup_check_pending(
check_id: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
return update_startup_check(check_id, "pending", detail=detail, required=required, source=source)
def mark_startup_check_ready(
check_id: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
return update_startup_check(check_id, "ready", detail=detail, required=required, source=source)
def mark_startup_check_failed(
check_id: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
return update_startup_check(check_id, "failed", detail=detail, required=required, source=source)
def remove_pid_file() -> None:
"""Remove the gateway PID file if it exists."""
"""Remove the gateway PID file, but only if it belongs to this process.
During --replace handoffs, the old process's atexit handler can fire AFTER
the new process has written its own PID file. Blindly removing the file
would delete the new process's record, leaving the gateway running with no
PID file (invisible to ``get_running_pid()``).
"""
try:
_get_pid_path().unlink(missing_ok=True)
path = _get_pid_path()
record = _read_json_file(path)
if record is not None:
try:
file_pid = int(record["pid"])
except (KeyError, TypeError, ValueError):
file_pid = None
if file_pid is not None and file_pid != os.getpid():
# PID file belongs to a different process — leave it alone.
return
path.unlink(missing_ok=True)
except Exception:
pass
@@ -289,6 +440,15 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
}
existing = _read_json_file(lock_path)
if existing is None and lock_path.exists():
# Lock file exists but is empty or contains invalid JSON — treat as
# stale. This happens when a previous process was killed between
# O_CREAT|O_EXCL and the subsequent json.dump() (e.g. DNS failure
# during rapid Slack reconnect retries).
try:
lock_path.unlink(missing_ok=True)
except OSError:
pass
if existing:
try:
existing_pid = int(existing["pid"])
+248 -31
View File
@@ -32,6 +32,10 @@ _DONE = object()
# new one so that subsequent text appears below tool progress messages.
_NEW_SEGMENT = object()
# Queue marker for a completed assistant commentary message emitted between
# API/tool iterations (for example: "I'll inspect the repo first.").
_COMMENTARY = object()
@dataclass
class StreamConsumerConfig:
@@ -60,6 +64,18 @@ class GatewayStreamConsumer:
# progressive edits for the remainder of the stream.
_MAX_FLOOD_STRIKES = 3
# Reasoning/thinking tags that models emit inline in content.
# Must stay in sync with cli.py _OPEN_TAGS/_CLOSE_TAGS and
# run_agent.py _strip_think_blocks() tag variants.
_OPEN_THINK_TAGS = (
"<REASONING_SCRATCHPAD>", "<think>", "<reasoning>",
"<THINKING>", "<thinking>", "<thought>",
)
_CLOSE_THINK_TAGS = (
"</REASONING_SCRATCHPAD>", "</think>", "</reasoning>",
"</THINKING>", "</thinking>", "</thought>",
)
def __init__(
self,
adapter: Any,
@@ -75,20 +91,47 @@ class GatewayStreamConsumer:
self._accumulated = ""
self._message_id: Optional[str] = None
self._already_sent = False
self._edit_supported = True # Disabled on first edit failure (Signal/Email/HA)
self._edit_supported = True # Disabled when progressive edits are no longer usable
self._last_edit_time = 0.0
self._last_sent_text = "" # Track last-sent text to skip redundant edits
self._fallback_final_send = False
self._fallback_prefix = ""
self._flood_strikes = 0 # Consecutive flood-control edit failures
self._current_edit_interval = self.cfg.edit_interval # Adaptive backoff
self._final_response_sent = False
# Think-block filter state (mirrors CLI's _stream_delta tag suppression)
self._in_think_block = False
self._think_buffer = ""
@property
def already_sent(self) -> bool:
"""True if at least one message was sent/edited — signals the base
adapter to skip re-sending the final response."""
"""True if at least one message was sent or edited during the run."""
return self._already_sent
@property
def final_response_sent(self) -> bool:
"""True when the stream consumer delivered the final assistant reply."""
return self._final_response_sent
def on_segment_break(self) -> None:
"""Finalize the current stream segment and start a fresh message."""
self._queue.put(_NEW_SEGMENT)
def on_commentary(self, text: str) -> None:
"""Queue a completed interim assistant commentary message."""
if text:
self._queue.put((_COMMENTARY, text))
def _reset_segment_state(self, *, preserve_no_edit: bool = False) -> None:
if preserve_no_edit and self._message_id == "__no_edit__":
return
self._message_id = None
self._accumulated = ""
self._last_sent_text = ""
self._fallback_final_send = False
self._fallback_prefix = ""
def on_delta(self, text: str) -> None:
"""Thread-safe callback — called from the agent's worker thread.
@@ -99,12 +142,118 @@ class GatewayStreamConsumer:
if text:
self._queue.put(text)
elif text is None:
self._queue.put(_NEW_SEGMENT)
self.on_segment_break()
def finish(self) -> None:
"""Signal that the stream is complete."""
self._queue.put(_DONE)
# ── Think-block filtering ────────────────────────────────────────
# Models like MiniMax emit inline <think>...</think> blocks in their
# content. The CLI's _stream_delta suppresses these via a state
# machine; we do the same here so gateway users never see raw
# reasoning tags. The agent also strips them from the final
# response (run_agent.py _strip_think_blocks), but the stream
# consumer sends intermediate edits before that stripping happens.
def _filter_and_accumulate(self, text: str) -> None:
"""Add a text delta to the accumulated buffer, suppressing think blocks.
Uses a state machine that tracks whether we are inside a
reasoning/thinking block. Text inside such blocks is silently
discarded. Partial tags at buffer boundaries are held back in
``_think_buffer`` until enough characters arrive to decide.
"""
buf = self._think_buffer + text
self._think_buffer = ""
while buf:
if self._in_think_block:
# Look for the earliest closing tag
best_idx = -1
best_len = 0
for tag in self._CLOSE_THINK_TAGS:
idx = buf.find(tag)
if idx != -1 and (best_idx == -1 or idx < best_idx):
best_idx = idx
best_len = len(tag)
if best_len:
# Found closing tag — discard block, process remainder
self._in_think_block = False
buf = buf[best_idx + best_len:]
else:
# No closing tag yet — hold tail that could be a
# partial closing tag prefix, discard the rest.
max_tag = max(len(t) for t in self._CLOSE_THINK_TAGS)
self._think_buffer = buf[-max_tag:] if len(buf) > max_tag else buf
return
else:
# Look for earliest opening tag at a block boundary
# (start of text / preceded by newline + optional whitespace).
# This prevents false positives when models *mention* tags
# in prose (e.g. "the <think> tag is used for…").
best_idx = -1
best_len = 0
for tag in self._OPEN_THINK_TAGS:
search_start = 0
while True:
idx = buf.find(tag, search_start)
if idx == -1:
break
# Block-boundary check (mirrors cli.py logic)
if idx == 0:
is_boundary = (
not self._accumulated
or self._accumulated.endswith("\n")
)
else:
preceding = buf[:idx]
last_nl = preceding.rfind("\n")
if last_nl == -1:
is_boundary = (
(not self._accumulated
or self._accumulated.endswith("\n"))
and preceding.strip() == ""
)
else:
is_boundary = preceding[last_nl + 1:].strip() == ""
if is_boundary and (best_idx == -1 or idx < best_idx):
best_idx = idx
best_len = len(tag)
break # first boundary hit for this tag is enough
search_start = idx + 1
if best_len:
# Emit text before the tag, enter think block
self._accumulated += buf[:best_idx]
self._in_think_block = True
buf = buf[best_idx + best_len:]
else:
# No opening tag — check for a partial tag at the tail
held_back = 0
for tag in self._OPEN_THINK_TAGS:
for i in range(1, len(tag)):
if buf.endswith(tag[:i]) and i > held_back:
held_back = i
if held_back:
self._accumulated += buf[:-held_back]
self._think_buffer = buf[-held_back:]
else:
self._accumulated += buf
return
def _flush_think_buffer(self) -> None:
"""Flush any held-back partial-tag buffer into accumulated text.
Called when the stream ends (got_done) so that partial text that
was held back waiting for a possible opening tag is not lost.
"""
if self._think_buffer and not self._in_think_block:
self._accumulated += self._think_buffer
self._think_buffer = ""
async def run(self) -> None:
"""Async task that drains the queue and edits the platform message."""
# Platform message length limit — leave room for cursor + formatting
@@ -116,6 +265,7 @@ class GatewayStreamConsumer:
# Drain all available items from the queue
got_done = False
got_segment_break = False
commentary_text = None
while True:
try:
item = self._queue.get_nowait()
@@ -125,21 +275,32 @@ class GatewayStreamConsumer:
if item is _NEW_SEGMENT:
got_segment_break = True
break
self._accumulated += item
if isinstance(item, tuple) and len(item) == 2 and item[0] is _COMMENTARY:
commentary_text = item[1]
break
self._filter_and_accumulate(item)
except queue.Empty:
break
# Flush any held-back partial-tag buffer on stream end
# so trailing text that was waiting for a potential open
# tag is not lost.
if got_done:
self._flush_think_buffer()
# Decide whether to flush an edit
now = time.monotonic()
elapsed = now - self._last_edit_time
should_edit = (
got_done
or got_segment_break
or commentary_text is not None
or (elapsed >= self._current_edit_interval
and self._accumulated)
or len(self._accumulated) >= self.cfg.buffer_threshold
)
current_update_visible = False
if should_edit and self._accumulated:
# Split overflow: if accumulated text exceeds the platform
# limit, split into properly sized chunks.
@@ -161,6 +322,7 @@ class GatewayStreamConsumer:
self._last_sent_text = ""
self._last_edit_time = time.monotonic()
if got_done:
self._final_response_sent = self._already_sent
return
if got_segment_break:
self._message_id = None
@@ -192,10 +354,10 @@ class GatewayStreamConsumer:
self._last_sent_text = ""
display_text = self._accumulated
if not got_done and not got_segment_break:
if not got_done and not got_segment_break and commentary_text is None:
display_text += self.cfg.cursor
await self._send_or_edit(display_text)
current_update_visible = await self._send_or_edit(display_text)
self._last_edit_time = time.monotonic()
if got_done:
@@ -206,12 +368,20 @@ class GatewayStreamConsumer:
if self._accumulated:
if self._fallback_final_send:
await self._send_fallback_final(self._accumulated)
elif current_update_visible:
self._final_response_sent = True
elif self._message_id:
await self._send_or_edit(self._accumulated)
self._final_response_sent = await self._send_or_edit(self._accumulated)
elif not self._already_sent:
await self._send_or_edit(self._accumulated)
self._final_response_sent = await self._send_or_edit(self._accumulated)
return
if commentary_text is not None:
self._reset_segment_state()
await self._send_commentary(commentary_text)
self._last_edit_time = time.monotonic()
self._reset_segment_state()
# Tool boundary: reset message state so the next text chunk
# creates a fresh message below any tool-progress messages.
#
@@ -220,17 +390,14 @@ class GatewayStreamConsumer:
# github_comment delivery). Resetting to None would re-enter
# the "first send" path on every tool boundary and post one
# platform message per tool call — that is what caused 155
# comments under a single PR. Instead, keep all state so the
# full continuation is delivered once via _send_fallback_final.
# comments under a single PR. Instead, preserve the sentinel
# so the full continuation is delivered once via
# _send_fallback_final.
# (When editing fails mid-stream due to flood control the id is
# a real string like "msg_1", not "__no_edit__", so that case
# still resets and creates a fresh segment as intended.)
if got_segment_break and self._message_id != "__no_edit__":
self._message_id = None
self._accumulated = ""
self._last_sent_text = ""
self._fallback_final_send = False
self._fallback_prefix = ""
if got_segment_break:
self._reset_segment_state(preserve_no_edit=True)
await asyncio.sleep(0.05) # Small yield to not busy-loop
@@ -241,6 +408,14 @@ class GatewayStreamConsumer:
await self._send_or_edit(self._accumulated)
except Exception:
pass
# If we delivered any content before being cancelled, mark the
# final response as sent so the gateway's already_sent check
# doesn't trigger a duplicate message. The 5-second
# stream_task timeout (gateway/run.py) can cancel us while
# waiting on a slow Telegram API call — without this flag the
# gateway falls through to the normal send path.
if self._already_sent:
self._final_response_sent = True
except Exception as e:
logger.error("Stream consumer error: %s", e)
@@ -339,6 +514,7 @@ class GatewayStreamConsumer:
if not continuation.strip():
# Nothing new to send — the visible partial already matches final text.
self._already_sent = True
self._final_response_sent = True
return
raw_limit = getattr(self.adapter, "MAX_MESSAGE_LENGTH", 4096)
@@ -373,6 +549,7 @@ class GatewayStreamConsumer:
# the base gateway final-send path so we don't resend the
# full response and create another duplicate.
self._already_sent = True
self._final_response_sent = True
self._message_id = last_message_id
self._last_sent_text = last_successful_chunk
self._fallback_prefix = ""
@@ -390,6 +567,7 @@ class GatewayStreamConsumer:
self._message_id = last_message_id
self._already_sent = True
self._final_response_sent = True
self._last_sent_text = chunks[-1]
self._fallback_prefix = ""
@@ -420,6 +598,24 @@ class GatewayStreamConsumer:
except Exception:
pass # best-effort — don't let this block the fallback path
async def _send_commentary(self, text: str) -> bool:
"""Send a completed interim assistant commentary message."""
text = self._clean_for_display(text)
if not text.strip():
return False
try:
result = await self.adapter.send(
chat_id=self.chat_id,
content=text,
metadata=self.metadata,
)
if result.success:
self._already_sent = True
return True
except Exception as e:
logger.error("Commentary send error: %s", e)
return False
async def _send_or_edit(self, text: str) -> bool:
"""Send or edit the streaming message.
@@ -431,8 +627,31 @@ class GatewayStreamConsumer:
# Media files are delivered as native attachments after the stream
# finishes (via _deliver_media_from_response in gateway/run.py).
text = self._clean_for_display(text)
# A bare streaming cursor is not meaningful user-visible content and
# can render as a stray tofu/white-box message on some clients.
visible_without_cursor = text
if self.cfg.cursor:
visible_without_cursor = visible_without_cursor.replace(self.cfg.cursor, "")
_visible_stripped = visible_without_cursor.strip()
if not _visible_stripped:
return True # cursor-only / whitespace-only update
if not text.strip():
return True # nothing to send is "success"
# Guard: do not create a brand-new standalone message when the only
# visible content is a handful of characters alongside the streaming
# cursor. During rapid tool-calling the model often emits 1-2 tokens
# before switching to tool calls; the resulting "X ▉" message risks
# leaving the cursor permanently visible if the follow-up edit (to
# strip the cursor on segment break) is rate-limited by the platform.
# This was reported on Telegram, Matrix, and other clients where the
# ▉ block character renders as a visible white box ("tofu").
# Existing messages (edits) are unaffected — only first sends gated.
_MIN_NEW_MSG_CHARS = 4
if (self._message_id is None
and self.cfg.cursor
and self.cfg.cursor in text
and len(_visible_stripped) < _MIN_NEW_MSG_CHARS):
return True # too short for a standalone message — accumulate more
try:
if self._message_id is not None:
if self._edit_supported:
@@ -501,23 +720,21 @@ class GatewayStreamConsumer:
content=text,
metadata=self.metadata,
)
if result.success and result.message_id:
self._message_id = result.message_id
if result.success:
if result.message_id:
self._message_id = result.message_id
else:
self._edit_supported = False
self._already_sent = True
self._last_sent_text = text
if not result.message_id:
self._fallback_prefix = self._visible_prefix()
self._fallback_final_send = True
# Sentinel prevents re-entering the first-send path on
# every delta/tool boundary when platforms accept a
# message but do not return an editable message id.
self._message_id = "__no_edit__"
return True
elif result.success:
# Platform accepted the message but returned no message_id
# (e.g. Signal). Can't edit without an ID — switch to
# fallback mode: suppress intermediate deltas, send only
# the missing tail once the final response is ready.
self._already_sent = True
self._edit_supported = False
self._fallback_prefix = self._clean_for_display(text)
self._fallback_final_send = True
# Sentinel prevents re-entering this branch on every delta
self._message_id = "__no_edit__"
return True # platform accepted, just can't edit
else:
# Initial send failed — disable streaming for this session
self._edit_supported = False
+160
View File
@@ -0,0 +1,160 @@
# Hermes Agent Has Had "Routines" Since March
Anthropic just announced [Claude Code Routines](https://claude.com/blog/introducing-routines-in-claude-code) — scheduled tasks, GitHub event triggers, and API-triggered agent runs. Bundled prompt + repo + connectors, running on their infrastructure.
It's a good feature. We shipped it two months ago.
---
## The Three Trigger Types — Side by Side
Claude Code Routines offers three ways to trigger an automation:
**1. Scheduled (cron)**
> "Every night at 2am: pull the top bug from Linear, attempt a fix, and open a draft PR."
Hermes equivalent — works today:
```bash
hermes cron create "0 2 * * *" \
"Pull the top bug from the issue tracker, attempt a fix, and open a draft PR." \
--name "Nightly bug fix" \
--deliver telegram
```
**2. GitHub Events (webhook)**
> "Flag PRs that touch the /auth-provider module and post to #auth-changes."
Hermes equivalent — works today:
```bash
hermes webhook subscribe auth-watch \
--events "pull_request" \
--prompt "PR #{pull_request.number}: {pull_request.title} by {pull_request.user.login}. Check if it touches the auth-provider module. If yes, summarize the changes." \
--deliver slack
```
**3. API Triggers**
> "Read the alert payload, find the owning service, post a triage summary to #oncall."
Hermes equivalent — works today:
```bash
hermes webhook subscribe alert-triage \
--prompt "Alert: {alert.name} — Severity: {alert.severity}. Find the owning service, investigate, and post a triage summary with proposed first steps." \
--deliver slack
```
Every use case in their blog post — backlog triage, docs drift, deploy verification, alert correlation, library porting, bespoke PR review — has a working Hermes implementation. No new features needed. It's been shipping since March 2026.
---
## What's Different
| | Claude Code Routines | Hermes Agent |
|---|---|---|
| **Scheduled tasks** | ✅ Schedule-based | ✅ Any cron expression + human-readable intervals |
| **GitHub triggers** | ✅ PR, issue, push events | ✅ Any GitHub event via webhook subscriptions |
| **API triggers** | ✅ POST to unique endpoint | ✅ POST to webhook routes with HMAC auth |
| **MCP connectors** | ✅ Native connectors | ✅ Full MCP client support |
| **Script pre-processing** | ❌ | ✅ Python scripts run before agent, inject context |
| **Skill chaining** | ❌ | ✅ Load multiple skills per automation |
| **Daily limit** | 5-25 runs/day | **Unlimited** |
| **Model choice** | Claude only | **Any model** — Claude, GPT, Gemini, DeepSeek, Qwen, local |
| **Delivery targets** | GitHub comments | Telegram, Discord, Slack, SMS, email, GitHub comments, webhooks, local files |
| **Infrastructure** | Anthropic's servers | **Your infrastructure** — VPS, home server, laptop |
| **Data residency** | Anthropic's cloud | **Your machines** |
| **Cost** | Pro/Max/Team/Enterprise subscription | Your API key, your rates |
| **Open source** | No | **Yes** — MIT license |
---
## Things Hermes Does That Routines Can't
### Script Injection
Run a Python script *before* the agent. The script's stdout becomes context. The script handles mechanical work (fetching, diffing, computing); the agent handles reasoning.
```bash
hermes cron create "every 1h" \
"If CHANGE DETECTED, summarize what changed. If NO_CHANGE, respond with [SILENT]." \
--script ~/.hermes/scripts/watch-site.py \
--name "Pricing monitor" \
--deliver telegram
```
The `[SILENT]` pattern means you only get notified when something actually happens. No spam.
### Multi-Skill Workflows
Chain specialized skills together. Each skill teaches the agent a specific capability, and the prompt ties them together.
```bash
hermes cron create "0 8 * * *" \
"Search arXiv for papers on language model reasoning. Save the top 3 as Obsidian notes." \
--skills "arxiv,obsidian" \
--name "Paper digest"
```
### Deliver Anywhere
One automation, any destination:
```bash
--deliver telegram # Telegram home channel
--deliver discord # Discord home channel
--deliver slack # Slack channel
--deliver sms:+15551234567 # Text message
--deliver telegram:-1001234567890:42 # Specific Telegram forum topic
--deliver local # Save to file, no notification
```
### Model-Agnostic
Your nightly triage can run on Claude. Your deploy verification can run on GPT. Your cost-sensitive monitors can run on DeepSeek or a local model. Same automation system, any backend.
---
## The Limits Tell the Story
Claude Code Routines: **5 routines per day** on Pro. **25 on Enterprise.** That's their ceiling.
Hermes has no daily limit. Run 500 automations a day if you want. The only constraint is your API budget, and you choose which models to use for which tasks.
A nightly backlog triage on Sonnet costs roughly $0.02-0.05. A monitoring check on DeepSeek costs fractions of a cent. You control the economics.
---
## Get Started
Hermes Agent is open source and free. The automation infrastructure — cron scheduler, webhook platform, skill system, multi-platform delivery — is built in.
```bash
pip install hermes-agent
hermes setup
```
Set up a scheduled task in 30 seconds:
```bash
hermes cron create "0 9 * * 1" \
"Generate a weekly AI news digest. Search the web for major announcements, trending repos, and notable papers. Keep it under 500 words with links." \
--name "Weekly digest" \
--deliver telegram
```
Set up a GitHub webhook in 60 seconds:
```bash
hermes gateway setup # enable webhooks
hermes webhook subscribe pr-review \
--events "pull_request" \
--prompt "Review PR #{pull_request.number}: {pull_request.title}" \
--skills "github-code-review" \
--deliver github_comment
```
Full automation templates gallery: [hermes-agent.nousresearch.com/docs/guides/automation-templates](https://hermes-agent.nousresearch.com/docs/guides/automation-templates)
Documentation: [hermes-agent.nousresearch.com](https://hermes-agent.nousresearch.com)
GitHub: [github.com/NousResearch/hermes-agent](https://github.com/NousResearch/hermes-agent)
---
*Hermes Agent is built by [Nous Research](https://nousresearch.com). Open source, model-agnostic, runs on your infrastructure.*
+2 -2
View File
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.8.0"
__release_date__ = "2026.4.8"
__version__ = "0.9.0"
__release_date__ = "2026.4.13"
+140 -73
View File
@@ -127,6 +127,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="api_key",
inference_base_url=DEFAULT_GITHUB_MODELS_BASE_URL,
api_key_env_vars=("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"),
base_url_env_var="COPILOT_API_BASE_URL",
),
"copilot-acp": ProviderConfig(
id="copilot-acp",
@@ -159,6 +160,21 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("KIMI_API_KEY",),
base_url_env_var="KIMI_BASE_URL",
),
"kimi-coding-cn": ProviderConfig(
id="kimi-coding-cn",
name="Kimi / Moonshot (China)",
auth_type="api_key",
inference_base_url="https://api.moonshot.cn/v1",
api_key_env_vars=("KIMI_CN_API_KEY",),
),
"arcee": ProviderConfig(
id="arcee",
name="Arcee AI",
auth_type="api_key",
inference_base_url="https://api.arcee.ai/api/v1",
api_key_env_vars=("ARCEEAI_API_KEY",),
base_url_env_var="ARCEE_BASE_URL",
),
"minimax": ProviderConfig(
id="minimax",
name="MiniMax",
@@ -208,7 +224,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
),
"ai-gateway": ProviderConfig(
id="ai-gateway",
name="AI Gateway",
name="Vercel AI Gateway",
auth_type="api_key",
inference_base_url="https://ai-gateway.vercel.sh/v1",
api_key_env_vars=("AI_GATEWAY_API_KEY",),
@@ -307,44 +323,6 @@ def _resolve_kimi_base_url(api_key: str, default_url: str, env_override: str) ->
return default_url
def _gh_cli_candidates() -> list[str]:
"""Return candidate ``gh`` binary paths, including common Homebrew installs."""
candidates: list[str] = []
resolved = shutil.which("gh")
if resolved:
candidates.append(resolved)
for candidate in (
"/opt/homebrew/bin/gh",
"/usr/local/bin/gh",
str(Path.home() / ".local" / "bin" / "gh"),
):
if candidate in candidates:
continue
if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
candidates.append(candidate)
return candidates
def _try_gh_cli_token() -> Optional[str]:
"""Return a token from ``gh auth token`` when the GitHub CLI is available."""
for gh_path in _gh_cli_candidates():
try:
result = subprocess.run(
[gh_path, "auth", "token"],
capture_output=True,
text=True,
timeout=5,
)
except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
logger.debug("gh CLI token lookup failed (%s): %s", gh_path, exc)
continue
if result.returncode == 0 and result.stdout.strip():
return result.stdout.strip()
return None
_PLACEHOLDER_SECRET_VALUES = {
"*",
@@ -405,13 +383,16 @@ def _resolve_api_key_provider_secret(
# Z.AI has separate billing for general vs coding plans, and global vs China
# endpoints. A key that works on one may return "Insufficient balance" on
# another. We probe at setup time and store the working endpoint.
# Each entry lists candidate models to try in order — newer coding plan accounts
# may only have access to recent models (glm-5.1, glm-5v-turbo) while older
# ones still use glm-4.7.
ZAI_ENDPOINTS = [
# (id, base_url, default_model, label)
("global", "https://api.z.ai/api/paas/v4", "glm-5", "Global"),
("cn", "https://open.bigmodel.cn/api/paas/v4", "glm-5", "China"),
("coding-global", "https://api.z.ai/api/coding/paas/v4", "glm-4.7", "Global (Coding Plan)"),
("coding-cn", "https://open.bigmodel.cn/api/coding/paas/v4", "glm-4.7", "China (Coding Plan)"),
# (id, base_url, probe_models, label)
("global", "https://api.z.ai/api/paas/v4", ["glm-5"], "Global"),
("cn", "https://open.bigmodel.cn/api/paas/v4", ["glm-5"], "China"),
("coding-global", "https://api.z.ai/api/coding/paas/v4", ["glm-5.1", "glm-5v-turbo", "glm-4.7"], "Global (Coding Plan)"),
("coding-cn", "https://open.bigmodel.cn/api/coding/paas/v4", ["glm-5.1", "glm-5v-turbo", "glm-4.7"], "China (Coding Plan)"),
]
@@ -419,35 +400,37 @@ def detect_zai_endpoint(api_key: str, timeout: float = 8.0) -> Optional[Dict[str
"""Probe z.ai endpoints to find one that accepts this API key.
Returns {"id": ..., "base_url": ..., "model": ..., "label": ...} for the
first working endpoint, or None if all fail.
first working endpoint, or None if all fail. For endpoints with multiple
candidate models, tries each in order and returns the first that succeeds.
"""
for ep_id, base_url, model, label in ZAI_ENDPOINTS:
try:
resp = httpx.post(
f"{base_url}/chat/completions",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
},
json={
"model": model,
"stream": False,
"max_tokens": 1,
"messages": [{"role": "user", "content": "ping"}],
},
timeout=timeout,
)
if resp.status_code == 200:
logger.debug("Z.AI endpoint probe: %s (%s) OK", ep_id, base_url)
return {
"id": ep_id,
"base_url": base_url,
"model": model,
"label": label,
}
logger.debug("Z.AI endpoint probe: %s returned %s", ep_id, resp.status_code)
except Exception as exc:
logger.debug("Z.AI endpoint probe: %s failed: %s", ep_id, exc)
for ep_id, base_url, probe_models, label in ZAI_ENDPOINTS:
for model in probe_models:
try:
resp = httpx.post(
f"{base_url}/chat/completions",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
},
json={
"model": model,
"stream": False,
"max_tokens": 1,
"messages": [{"role": "user", "content": "ping"}],
},
timeout=timeout,
)
if resp.status_code == 200:
logger.debug("Z.AI endpoint probe: %s (%s) model=%s OK", ep_id, base_url, model)
return {
"id": ep_id,
"base_url": base_url,
"model": model,
"label": label,
}
logger.debug("Z.AI endpoint probe: %s model=%s returned %s", ep_id, model, resp.status_code)
except Exception as exc:
logger.debug("Z.AI endpoint probe: %s model=%s failed: %s", ep_id, model, exc)
return None
@@ -929,6 +912,8 @@ def resolve_provider(
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
"google": "gemini", "google-gemini": "gemini", "google-ai-studio": "gemini",
"kimi": "kimi-coding", "kimi-for-coding": "kimi-coding", "moonshot": "kimi-coding",
"kimi-cn": "kimi-coding-cn", "moonshot-cn": "kimi-coding-cn",
"arcee-ai": "arcee", "arceeai": "arcee",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
"claude": "anthropic", "claude-code": "anthropic",
"github": "copilot", "github-copilot": "copilot",
@@ -1303,6 +1288,49 @@ def _read_codex_tokens(*, _lock: bool = True) -> Dict[str, Any]:
}
def _write_codex_cli_tokens(
access_token: str,
refresh_token: str,
*,
last_refresh: Optional[str] = None,
) -> None:
"""Write refreshed tokens back to ~/.codex/auth.json.
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When Hermes refreshes a token it consumes the old refresh_token; if we
don't write the new pair back, the Codex CLI (or VS Code extension) will
fail with ``refresh_token_reused`` on its next refresh attempt.
This mirrors the Anthropic write-back to ~/.claude/.credentials.json
via ``_write_claude_code_credentials()``.
"""
codex_home = os.getenv("CODEX_HOME", "").strip()
if not codex_home:
codex_home = str(Path.home() / ".codex")
auth_path = Path(codex_home).expanduser() / "auth.json"
try:
existing: Dict[str, Any] = {}
if auth_path.is_file():
existing = json.loads(auth_path.read_text(encoding="utf-8"))
if not isinstance(existing, dict):
existing = {}
tokens_dict = existing.get("tokens")
if not isinstance(tokens_dict, dict):
tokens_dict = {}
tokens_dict["access_token"] = access_token
tokens_dict["refresh_token"] = refresh_token
existing["tokens"] = tokens_dict
if last_refresh is not None:
existing["last_refresh"] = last_refresh
auth_path.parent.mkdir(parents=True, exist_ok=True)
auth_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
auth_path.chmod(0o600)
except (OSError, IOError) as exc:
logger.debug("Failed to write refreshed tokens to %s: %s", auth_path, exc)
def _save_codex_tokens(tokens: Dict[str, str], last_refresh: str = None) -> None:
"""Save Codex OAuth tokens to Hermes auth store (~/.hermes/auth.json)."""
if last_refresh is None:
@@ -1425,6 +1453,12 @@ def _refresh_codex_auth_tokens(
updated_tokens["refresh_token"] = refreshed["refresh_token"]
_save_codex_tokens(updated_tokens)
# Write back to ~/.codex/auth.json so Codex CLI / VS Code stay in sync.
_write_codex_cli_tokens(
refreshed["access_token"],
refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
)
return updated_tokens
@@ -2233,7 +2267,40 @@ def resolve_nous_runtime_credentials(
# =============================================================================
def get_nous_auth_status() -> Dict[str, Any]:
"""Status snapshot for `hermes status` output."""
"""Status snapshot for `hermes status` output.
Checks the credential pool first (where the dashboard device-code flow
and ``hermes auth`` store credentials), then falls back to the legacy
auth-store provider state.
"""
# Check credential pool first — the dashboard device-code flow saves
# here but may not have written to the auth store yet.
try:
from agent.credential_pool import load_pool
pool = load_pool("nous")
if pool and pool.has_credentials():
entry = pool.select()
if entry is not None:
access_token = (
getattr(entry, "access_token", None)
or getattr(entry, "runtime_api_key", "")
)
if access_token:
return {
"logged_in": True,
"portal_base_url": getattr(entry, "portal_base_url", None)
or getattr(entry, "base_url", None),
"inference_base_url": getattr(entry, "inference_base_url", None)
or getattr(entry, "base_url", None),
"access_token": access_token,
"access_expires_at": getattr(entry, "expires_at", None),
"agent_key_expires_at": getattr(entry, "agent_key_expires_at", None),
"has_refresh_token": bool(getattr(entry, "refresh_token", None)),
}
except Exception:
pass
# Fall back to auth-store provider state
state = get_provider_auth_state("nous")
if not state:
return {
+9 -9
View File
@@ -36,25 +36,23 @@ _OAUTH_CAPABLE_PROVIDERS = {"anthropic", "nous", "openai-codex", "qwen-oauth"}
def _get_custom_provider_names() -> list:
"""Return list of (display_name, pool_key) tuples for custom_providers in config."""
"""Return list of (display_name, pool_key, provider_key) tuples."""
try:
from hermes_cli.config import load_config
from hermes_cli.config import get_compatible_custom_providers, load_config
config = load_config()
except Exception:
return []
custom_providers = config.get("custom_providers")
if not isinstance(custom_providers, list):
return []
result = []
for entry in custom_providers:
for entry in get_compatible_custom_providers(config):
if not isinstance(entry, dict):
continue
name = entry.get("name")
if not isinstance(name, str) or not name.strip():
continue
pool_key = f"{CUSTOM_POOL_PREFIX}{_normalize_custom_pool_name(name)}"
result.append((name.strip(), pool_key))
provider_key = str(entry.get("provider_key", "") or "").strip()
result.append((name.strip(), pool_key, provider_key))
return result
@@ -66,9 +64,11 @@ def _resolve_custom_provider_input(raw: str) -> str | None:
# Direct match on 'custom:name' format
if normalized.startswith(CUSTOM_POOL_PREFIX):
return normalized
for display_name, pool_key in _get_custom_provider_names():
for display_name, pool_key, provider_key in _get_custom_provider_names():
if _normalize_custom_pool_name(display_name) == normalized:
return pool_key
if provider_key and provider_key.strip().lower() == normalized:
return pool_key
return None
@@ -405,7 +405,7 @@ def _pick_provider(prompt: str = "Provider") -> str:
known = sorted(set(list(PROVIDER_REGISTRY.keys()) + ["openrouter"]))
custom_names = _get_custom_provider_names()
if custom_names:
custom_display = [name for name, _key in custom_names]
custom_display = [name for name, _key, _provider_key in custom_names]
print(f"\nKnown providers: {', '.join(known)}")
print(f"Custom endpoints: {', '.join(custom_display)}")
else:
+655
View File
@@ -0,0 +1,655 @@
"""
Backup and import commands for hermes CLI.
`hermes backup` creates a zip archive of the entire ~/.hermes/ directory
(excluding the hermes-agent repo and transient files).
`hermes import` restores from a backup zip, overlaying onto the current
HERMES_HOME root.
"""
import json
import logging
import os
import shutil
import sqlite3
import sys
import tempfile
import time
import zipfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from hermes_constants import get_default_hermes_root, get_hermes_home, display_hermes_home
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Exclusion rules
# ---------------------------------------------------------------------------
# Directory names to skip entirely (matched against each path component)
_EXCLUDED_DIRS = {
"hermes-agent", # the codebase repo — re-clone instead
"__pycache__", # bytecode caches — regenerated on import
".git", # nested git dirs (profiles shouldn't have these, but safety)
"node_modules", # js deps if website/ somehow leaks in
}
# File-name suffixes to skip
_EXCLUDED_SUFFIXES = (
".pyc",
".pyo",
)
# File names to skip (runtime state that's meaningless on another machine)
_EXCLUDED_NAMES = {
"gateway.pid",
"cron.pid",
}
def _should_exclude(rel_path: Path) -> bool:
"""Return True if *rel_path* (relative to hermes root) should be skipped."""
parts = rel_path.parts
# Any path component matches an excluded dir name
for part in parts:
if part in _EXCLUDED_DIRS:
return True
name = rel_path.name
if name in _EXCLUDED_NAMES:
return True
if name.endswith(_EXCLUDED_SUFFIXES):
return True
return False
# ---------------------------------------------------------------------------
# SQLite safe copy
# ---------------------------------------------------------------------------
def _safe_copy_db(src: Path, dst: Path) -> bool:
"""Copy a SQLite database safely using the backup() API.
Handles WAL mode produces a consistent snapshot even while
the DB is being written to. Falls back to raw copy on failure.
"""
try:
conn = sqlite3.connect(f"file:{src}?mode=ro", uri=True)
backup_conn = sqlite3.connect(str(dst))
conn.backup(backup_conn)
backup_conn.close()
conn.close()
return True
except Exception as exc:
logger.warning("SQLite safe copy failed for %s: %s", src, exc)
try:
shutil.copy2(src, dst)
return True
except Exception as exc2:
logger.error("Raw copy also failed for %s: %s", src, exc2)
return False
# ---------------------------------------------------------------------------
# Backup
# ---------------------------------------------------------------------------
def _format_size(nbytes: int) -> str:
"""Human-readable file size."""
for unit in ("B", "KB", "MB", "GB"):
if nbytes < 1024:
return f"{nbytes:.1f} {unit}" if unit != "B" else f"{nbytes} {unit}"
nbytes /= 1024
return f"{nbytes:.1f} TB"
def run_backup(args) -> None:
"""Create a zip backup of the Hermes home directory."""
hermes_root = get_default_hermes_root()
if not hermes_root.is_dir():
print(f"Error: Hermes home directory not found at {hermes_root}")
sys.exit(1)
# Determine output path
if args.output:
out_path = Path(args.output).expanduser().resolve()
# If user gave a directory, put the zip inside it
if out_path.is_dir():
stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
out_path = out_path / f"hermes-backup-{stamp}.zip"
else:
stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
out_path = Path.home() / f"hermes-backup-{stamp}.zip"
# Ensure the suffix is .zip
if out_path.suffix.lower() != ".zip":
out_path = out_path.with_suffix(out_path.suffix + ".zip")
# Ensure parent directory exists
out_path.parent.mkdir(parents=True, exist_ok=True)
# Collect files
print(f"Scanning {display_hermes_home()} ...")
files_to_add: list[tuple[Path, Path]] = [] # (absolute, relative)
skipped_dirs = set()
for dirpath, dirnames, filenames in os.walk(hermes_root, followlinks=False):
dp = Path(dirpath)
rel_dir = dp.relative_to(hermes_root)
# Prune excluded directories in-place so os.walk doesn't descend
orig_dirnames = dirnames[:]
dirnames[:] = [
d for d in dirnames
if d not in _EXCLUDED_DIRS
]
for removed in set(orig_dirnames) - set(dirnames):
skipped_dirs.add(str(rel_dir / removed))
for fname in filenames:
fpath = dp / fname
rel = fpath.relative_to(hermes_root)
if _should_exclude(rel):
continue
# Skip the output zip itself if it happens to be inside hermes root
try:
if fpath.resolve() == out_path.resolve():
continue
except (OSError, ValueError):
pass
files_to_add.append((fpath, rel))
if not files_to_add:
print("No files to back up.")
return
# Create the zip
file_count = len(files_to_add)
print(f"Backing up {file_count} files ...")
total_bytes = 0
errors = []
t0 = time.monotonic()
with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED, compresslevel=6) as zf:
for i, (abs_path, rel_path) in enumerate(files_to_add, 1):
try:
# Safe copy for SQLite databases (handles WAL mode)
if abs_path.suffix == ".db":
with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp:
tmp_db = Path(tmp.name)
if _safe_copy_db(abs_path, tmp_db):
zf.write(tmp_db, arcname=str(rel_path))
total_bytes += tmp_db.stat().st_size
tmp_db.unlink(missing_ok=True)
else:
tmp_db.unlink(missing_ok=True)
errors.append(f" {rel_path}: SQLite safe copy failed")
continue
else:
zf.write(abs_path, arcname=str(rel_path))
total_bytes += abs_path.stat().st_size
except (PermissionError, OSError) as exc:
errors.append(f" {rel_path}: {exc}")
continue
# Progress every 500 files
if i % 500 == 0:
print(f" {i}/{file_count} files ...")
elapsed = time.monotonic() - t0
zip_size = out_path.stat().st_size
# Summary
print()
print(f"Backup complete: {out_path}")
print(f" Files: {file_count}")
print(f" Original: {_format_size(total_bytes)}")
print(f" Compressed: {_format_size(zip_size)}")
print(f" Time: {elapsed:.1f}s")
if skipped_dirs:
print(f"\n Excluded directories:")
for d in sorted(skipped_dirs):
print(f" {d}/")
if errors:
print(f"\n Warnings ({len(errors)} files skipped):")
for e in errors[:10]:
print(e)
if len(errors) > 10:
print(f" ... and {len(errors) - 10} more")
print(f"\nRestore with: hermes import {out_path.name}")
# ---------------------------------------------------------------------------
# Import
# ---------------------------------------------------------------------------
def _validate_backup_zip(zf: zipfile.ZipFile) -> tuple[bool, str]:
"""Check that a zip looks like a Hermes backup.
Returns (ok, reason).
"""
names = zf.namelist()
if not names:
return False, "zip archive is empty"
# Look for telltale files that a hermes home would have
markers = {"config.yaml", ".env", "state.db"}
found = set()
for n in names:
# Could be at the root or one level deep (if someone zipped the directory)
basename = Path(n).name
if basename in markers:
found.add(basename)
if not found:
return False, (
"zip does not appear to be a Hermes backup "
"(no config.yaml, .env, or state databases found)"
)
return True, ""
def _detect_prefix(zf: zipfile.ZipFile) -> str:
"""Detect if the zip has a common directory prefix wrapping all entries.
Some tools zip as `.hermes/config.yaml` instead of `config.yaml`.
Returns the prefix to strip (empty string if none).
"""
names = [n for n in zf.namelist() if not n.endswith("/")]
if not names:
return ""
# Find common prefix
parts_list = [Path(n).parts for n in names]
# Check if all entries share a common first directory
first_parts = {p[0] for p in parts_list if len(p) > 1}
if len(first_parts) == 1:
prefix = first_parts.pop()
# Only strip if it looks like a hermes dir name
if prefix in (".hermes", "hermes"):
return prefix + "/"
return ""
def run_import(args) -> None:
"""Restore a Hermes backup from a zip file."""
zip_path = Path(args.zipfile).expanduser().resolve()
if not zip_path.is_file():
print(f"Error: File not found: {zip_path}")
sys.exit(1)
if not zipfile.is_zipfile(zip_path):
print(f"Error: Not a valid zip file: {zip_path}")
sys.exit(1)
hermes_root = get_default_hermes_root()
with zipfile.ZipFile(zip_path, "r") as zf:
# Validate
ok, reason = _validate_backup_zip(zf)
if not ok:
print(f"Error: {reason}")
sys.exit(1)
prefix = _detect_prefix(zf)
members = [n for n in zf.namelist() if not n.endswith("/")]
file_count = len(members)
print(f"Backup contains {file_count} files")
print(f"Target: {display_hermes_home()}")
if prefix:
print(f"Detected archive prefix: {prefix!r} (will be stripped)")
# Check for existing installation
has_config = (hermes_root / "config.yaml").exists()
has_env = (hermes_root / ".env").exists()
if (has_config or has_env) and not args.force:
print()
print("Warning: Target directory already has Hermes configuration.")
print("Importing will overwrite existing files with backup contents.")
print()
try:
answer = input("Continue? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\nAborted.")
sys.exit(1)
if answer not in ("y", "yes"):
print("Aborted.")
return
# Extract
print(f"\nImporting {file_count} files ...")
hermes_root.mkdir(parents=True, exist_ok=True)
errors = []
restored = 0
t0 = time.monotonic()
for member in members:
# Strip prefix if detected
if prefix and member.startswith(prefix):
rel = member[len(prefix):]
else:
rel = member
if not rel:
continue
target = hermes_root / rel
# Security: reject absolute paths and traversals
try:
target.resolve().relative_to(hermes_root.resolve())
except ValueError:
errors.append(f" {rel}: path traversal blocked")
continue
try:
target.parent.mkdir(parents=True, exist_ok=True)
with zf.open(member) as src, open(target, "wb") as dst:
dst.write(src.read())
restored += 1
except (PermissionError, OSError) as exc:
errors.append(f" {rel}: {exc}")
if restored % 500 == 0:
print(f" {restored}/{file_count} files ...")
elapsed = time.monotonic() - t0
# Summary
print()
print(f"Import complete: {restored} files restored in {elapsed:.1f}s")
print(f" Target: {display_hermes_home()}")
if errors:
print(f"\n Warnings ({len(errors)} files skipped):")
for e in errors[:10]:
print(e)
if len(errors) > 10:
print(f" ... and {len(errors) - 10} more")
# Post-import: restore profile wrapper scripts
profiles_dir = hermes_root / "profiles"
restored_profiles = []
if profiles_dir.is_dir():
try:
from hermes_cli.profiles import (
create_wrapper_script, check_alias_collision,
_is_wrapper_dir_in_path, _get_wrapper_dir,
)
for entry in sorted(profiles_dir.iterdir()):
if not entry.is_dir():
continue
profile_name = entry.name
# Only create wrappers for directories with config
if not (entry / "config.yaml").exists() and not (entry / ".env").exists():
continue
collision = check_alias_collision(profile_name)
if collision:
print(f" Skipped alias '{profile_name}': {collision}")
restored_profiles.append((profile_name, False))
else:
wrapper = create_wrapper_script(profile_name)
restored_profiles.append((profile_name, wrapper is not None))
if restored_profiles:
created = [n for n, ok in restored_profiles if ok]
skipped = [n for n, ok in restored_profiles if not ok]
if created:
print(f"\n Profile aliases restored: {', '.join(created)}")
if skipped:
print(f" Profile aliases skipped: {', '.join(skipped)}")
if not _is_wrapper_dir_in_path():
print(f"\n Note: {_get_wrapper_dir()} is not in your PATH.")
print(' Add to your shell config (~/.bashrc or ~/.zshrc):')
print(' export PATH="$HOME/.local/bin:$PATH"')
except ImportError:
# hermes_cli.profiles might not be available (fresh install)
if any(profiles_dir.iterdir()):
print(f"\n Profiles detected but aliases could not be created.")
print(f" Run: hermes profile list (after installing hermes)")
# Guidance
print()
if not (hermes_root / "hermes-agent").is_dir():
print("Note: The hermes-agent codebase was not included in the backup.")
print(" If this is a fresh install, run: hermes update")
if restored_profiles:
gw_profiles = [n for n, _ in restored_profiles]
print("\nTo re-enable gateway services for profiles:")
for pname in gw_profiles:
print(f" hermes -p {pname} gateway install")
print("Done. Your Hermes configuration has been restored.")
# ---------------------------------------------------------------------------
# Quick state snapshots (used by /snapshot slash command and hermes backup --quick)
# ---------------------------------------------------------------------------
# Critical state files to include in quick snapshots (relative to HERMES_HOME).
# Everything else is either regeneratable (logs, cache) or managed separately
# (skills, repo, sessions/).
_QUICK_STATE_FILES = (
"state.db",
"config.yaml",
".env",
"auth.json",
"cron/jobs.json",
"gateway_state.json",
"channel_directory.json",
"processes.json",
)
_QUICK_SNAPSHOTS_DIR = "state-snapshots"
_QUICK_DEFAULT_KEEP = 20
def _quick_snapshot_root(hermes_home: Optional[Path] = None) -> Path:
home = hermes_home or get_hermes_home()
return home / _QUICK_SNAPSHOTS_DIR
def create_quick_snapshot(
label: Optional[str] = None,
hermes_home: Optional[Path] = None,
) -> Optional[str]:
"""Create a quick state snapshot of critical files.
Copies STATE_FILES to a timestamped directory under state-snapshots/.
Auto-prunes old snapshots beyond the keep limit.
Returns:
Snapshot ID (timestamp-based), or None if no files found.
"""
home = hermes_home or get_hermes_home()
root = _quick_snapshot_root(home)
ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
snap_id = f"{ts}-{label}" if label else ts
snap_dir = root / snap_id
snap_dir.mkdir(parents=True, exist_ok=True)
manifest: Dict[str, int] = {} # rel_path -> file size
for rel in _QUICK_STATE_FILES:
src = home / rel
if not src.exists() or not src.is_file():
continue
dst = snap_dir / rel
dst.parent.mkdir(parents=True, exist_ok=True)
try:
if src.suffix == ".db":
if not _safe_copy_db(src, dst):
continue
else:
shutil.copy2(src, dst)
manifest[rel] = dst.stat().st_size
except (OSError, PermissionError) as exc:
logger.warning("Could not snapshot %s: %s", rel, exc)
if not manifest:
shutil.rmtree(snap_dir, ignore_errors=True)
return None
# Write manifest
meta = {
"id": snap_id,
"timestamp": ts,
"label": label,
"file_count": len(manifest),
"total_size": sum(manifest.values()),
"files": manifest,
}
with open(snap_dir / "manifest.json", "w") as f:
json.dump(meta, f, indent=2)
# Auto-prune
_prune_quick_snapshots(root, keep=_QUICK_DEFAULT_KEEP)
logger.info("State snapshot created: %s (%d files)", snap_id, len(manifest))
return snap_id
def list_quick_snapshots(
limit: int = 20,
hermes_home: Optional[Path] = None,
) -> List[Dict[str, Any]]:
"""List existing quick state snapshots, most recent first."""
root = _quick_snapshot_root(hermes_home)
if not root.exists():
return []
results = []
for d in sorted(root.iterdir(), reverse=True):
if not d.is_dir():
continue
manifest_path = d / "manifest.json"
if manifest_path.exists():
try:
with open(manifest_path) as f:
results.append(json.load(f))
except (json.JSONDecodeError, OSError):
results.append({"id": d.name, "file_count": 0, "total_size": 0})
if len(results) >= limit:
break
return results
def restore_quick_snapshot(
snapshot_id: str,
hermes_home: Optional[Path] = None,
) -> bool:
"""Restore state from a quick snapshot.
Overwrites current state files with the snapshot's copies.
Returns True if at least one file was restored.
"""
home = hermes_home or get_hermes_home()
root = _quick_snapshot_root(home)
snap_dir = root / snapshot_id
if not snap_dir.is_dir():
return False
manifest_path = snap_dir / "manifest.json"
if not manifest_path.exists():
return False
with open(manifest_path) as f:
meta = json.load(f)
restored = 0
for rel in meta.get("files", {}):
src = snap_dir / rel
if not src.exists():
continue
dst = home / rel
dst.parent.mkdir(parents=True, exist_ok=True)
try:
if dst.suffix == ".db":
# Atomic-ish replace for databases
tmp = dst.parent / f".{dst.name}.snap_restore"
shutil.copy2(src, tmp)
dst.unlink(missing_ok=True)
shutil.move(str(tmp), str(dst))
else:
shutil.copy2(src, dst)
restored += 1
except (OSError, PermissionError) as exc:
logger.error("Failed to restore %s: %s", rel, exc)
logger.info("Restored %d files from snapshot %s", restored, snapshot_id)
return restored > 0
def _prune_quick_snapshots(root: Path, keep: int = _QUICK_DEFAULT_KEEP) -> int:
"""Remove oldest quick snapshots beyond the keep limit. Returns count deleted."""
if not root.exists():
return 0
dirs = sorted(
(d for d in root.iterdir() if d.is_dir()),
key=lambda d: d.name,
reverse=True,
)
deleted = 0
for d in dirs[keep:]:
try:
shutil.rmtree(d)
deleted += 1
except OSError as exc:
logger.warning("Failed to prune snapshot %s: %s", d.name, exc)
return deleted
def prune_quick_snapshots(
keep: int = _QUICK_DEFAULT_KEEP,
hermes_home: Optional[Path] = None,
) -> int:
"""Manually prune quick snapshots. Returns count deleted."""
return _prune_quick_snapshots(_quick_snapshot_root(hermes_home), keep=keep)
def run_quick_backup(args) -> None:
"""CLI entry point for hermes backup --quick."""
label = getattr(args, "label", None)
snap_id = create_quick_snapshot(label=label)
if snap_id:
print(f"State snapshot created: {snap_id}")
snaps = list_quick_snapshots()
print(f" {len(snaps)} snapshot(s) stored in {display_hermes_home()}/state-snapshots/")
print(f" Restore with: /snapshot restore {snap_id}")
else:
print("No state files found to snapshot.")
-1
View File
@@ -5,7 +5,6 @@ Pure display functions with no HermesCLI state dependency.
import json
import logging
import os
import shutil
import subprocess
import threading
+3 -3
View File
@@ -75,12 +75,12 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
if not hasattr(cli, "_secret_deadline"):
cli._secret_deadline = 0
try:
value = getpass.getpass(f"{prompt} (hidden, Enter to skip): ")
value = getpass.getpass(f"{prompt} (hidden, ESC or empty Enter to skip): ")
except (EOFError, KeyboardInterrupt):
value = ""
if not value:
cprint(f"\n{_DIM} ⏭ Secret entry cancelled{_RST}")
cprint(f"\n{_DIM} ⏭ Secret entry skipped{_RST}")
return {
"success": True,
"reason": "cancelled",
@@ -133,7 +133,7 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
cli._app.invalidate()
if not value:
cprint(f"\n{_DIM} ⏭ Secret entry cancelled{_RST}")
cprint(f"\n{_DIM} ⏭ Secret entry skipped{_RST}")
return {
"success": True,
"reason": "cancelled",
+131 -61
View File
@@ -11,6 +11,7 @@ Usage:
import importlib.util
import logging
import subprocess
import sys
from datetime import datetime
from pathlib import Path
@@ -50,7 +51,100 @@ _OPENCLAW_SCRIPT_INSTALLED = (
)
# Known OpenClaw directory names (current + legacy)
_OPENCLAW_DIR_NAMES = (".openclaw", ".clawdbot", ".moldbot")
_OPENCLAW_DIR_NAMES = (".openclaw", ".clawdbot", ".moltbot")
def _detect_openclaw_processes() -> list[str]:
"""Detect running OpenClaw processes and services.
Returns a list of human-readable descriptions of what was found.
An empty list means nothing was detected.
"""
found: list[str] = []
# -- systemd service (Linux) ------------------------------------------
if sys.platform != "win32":
try:
result = subprocess.run(
["systemctl", "--user", "is-active", "openclaw-gateway.service"],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip() == "active":
found.append("systemd service: openclaw-gateway.service")
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
# -- process scan ------------------------------------------------------
if sys.platform == "win32":
try:
for exe in ("openclaw.exe", "clawd.exe"):
result = subprocess.run(
["tasklist", "/FI", f"IMAGENAME eq {exe}"],
capture_output=True, text=True, timeout=5,
)
if exe in result.stdout.lower():
found.append(f"process: {exe}")
# Node.js-hosted OpenClaw — tasklist doesn't show command lines,
# so fall back to PowerShell.
ps_cmd = (
'Get-CimInstance Win32_Process -Filter "Name = \'node.exe\'" | '
'Where-Object { $_.CommandLine -match "openclaw|clawd" } | '
'Select-Object -First 1 ProcessId'
)
result = subprocess.run(
["powershell", "-NoProfile", "-Command", ps_cmd],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip():
found.append(f"node.exe process with openclaw in command line (PID {result.stdout.strip()})")
except Exception:
pass
else:
try:
result = subprocess.run(
["pgrep", "-f", "openclaw"],
capture_output=True, text=True, timeout=3,
)
if result.returncode == 0:
pids = result.stdout.strip().split()
found.append(f"openclaw process(es) (PIDs: {', '.join(pids)})")
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
return found
def _warn_if_openclaw_running(auto_yes: bool) -> None:
"""Warn if OpenClaw is still running before migration.
Telegram, Discord, and Slack only allow one active connection per bot
token. Migrating while OpenClaw is running causes both to fight for the
same token.
"""
running = _detect_openclaw_processes()
if not running:
return
print()
print_error("OpenClaw appears to be running:")
for detail in running:
print_info(f" * {detail}")
print_info(
"Messaging platforms (Telegram, Discord, Slack) only allow one "
"active session per bot token. If you continue, both OpenClaw and "
"Hermes may try to use the same token, causing disconnects."
)
print_info("Recommendation: stop OpenClaw before migrating.")
print()
if auto_yes:
return
if not sys.stdin.isatty():
print_info("Non-interactive session — continuing to preview only.")
return
if not prompt_yes_no("Continue anyway?", default=False):
print_info("Migration cancelled. Stop OpenClaw and try again.")
sys.exit(0)
def _warn_if_gateway_running(auto_yes: bool) -> None:
"""Check if a Hermes gateway is running with connected platforms.
@@ -87,8 +181,8 @@ def _warn_if_gateway_running(auto_yes: bool) -> None:
print_info("Migration cancelled. Stop the gateway and try again.")
sys.exit(0)
# State files commonly found in OpenClaw workspace directories that cause
# confusion after migration (the agent discovers them and writes to them)
# State files commonly found in OpenClaw workspace directories — listed
# during cleanup to help the user decide whether to archive
_WORKSPACE_STATE_GLOBS = (
"*/todo.json",
"*/sessions/*",
@@ -133,7 +227,7 @@ def _find_openclaw_dirs() -> list[Path]:
def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
"""Scan an OpenClaw directory for workspace state files that cause confusion.
"""Scan an OpenClaw directory for workspace state files.
Returns a list of (path, description) tuples.
"""
@@ -216,7 +310,7 @@ def _cmd_migrate(args):
source_dir = Path.home() / ".openclaw"
if not source_dir.is_dir():
# Try legacy directory names
for legacy in (".clawdbot", ".moldbot"):
for legacy in (".clawdbot", ".moltbot"):
candidate = Path.home() / legacy
if candidate.is_dir():
source_dir = candidate
@@ -287,8 +381,11 @@ def _cmd_migrate(args):
print_info(f"Workspace: {workspace_target}")
print()
# Check if a gateway is running with connected platforms — migrating tokens
# while the gateway is active will cause conflicts (e.g. Telegram 409).
# Check if OpenClaw is still running — migrating tokens while both are
# active will cause conflicts (e.g. Telegram 409).
_warn_if_openclaw_running(auto_yes)
# Check if a Hermes gateway is running with connected platforms.
_warn_if_gateway_running(auto_yes)
# Ensure config.yaml exists before migration tries to read it
@@ -384,65 +481,16 @@ def _cmd_migrate(args):
# Print results
_print_migration_report(report, dry_run=False)
# After successful migration, offer to archive the source directory
if report.get("summary", {}).get("migrated", 0) > 0:
_offer_source_archival(source_dir, auto_yes)
def _offer_source_archival(source_dir: Path, auto_yes: bool = False):
"""After migration, offer to rename the source directory to prevent state fragmentation.
OpenClaw workspace directories contain state files (todo.json, sessions, etc.)
that the agent may discover and write to, causing confusion. Renaming the
directory prevents this.
"""
if not source_dir.is_dir():
return
# Scan for state files that could cause problems
state_files = _scan_workspace_state(source_dir)
print()
print_header("Post-Migration Cleanup")
print_info("The OpenClaw directory still exists and contains workspace state files")
print_info("that can confuse the agent (todo lists, sessions, logs).")
if state_files:
print()
print(color(" Found state files:", Colors.YELLOW))
# Show up to 10 most relevant findings
for path, desc in state_files[:10]:
print(f" {desc}")
if len(state_files) > 10:
print(f" ... and {len(state_files) - 10} more")
print()
print_info(f"Recommend: rename {source_dir.name}/ to {source_dir.name}.pre-migration/")
print_info("This prevents the agent from discovering old workspace directories.")
print_info("You can always rename it back if needed.")
print()
if not auto_yes and not sys.stdin.isatty():
print_info("Non-interactive session — skipping archival.")
print_info("Run later with: hermes claw cleanup")
return
if auto_yes or prompt_yes_no(f"Archive {source_dir} now?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir}{archive_path}")
print_info("The original directory has been renamed, not deleted.")
print_info(f"To undo: mv {archive_path} {source_dir}")
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"You can do it manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped. You can archive later with: hermes claw cleanup")
# Source directory is left untouched — archiving is not the migration
# tool's responsibility. Users who want to clean up can run
# 'hermes claw cleanup' separately.
def _cmd_cleanup(args):
"""Archive leftover OpenClaw directories after migration.
Scans for OpenClaw directories that still exist after migration and offers
to rename them to .pre-migration to prevent state fragmentation.
to rename them to .pre-migration to free disk space.
"""
dry_run = getattr(args, "dry_run", False)
auto_yes = getattr(args, "yes", False)
@@ -479,6 +527,28 @@ def _cmd_cleanup(args):
print_success("No OpenClaw directories found. Nothing to clean up.")
return
# Warn if OpenClaw is still running — archiving while the service is
# active causes it to recreate an empty skeleton directory (#8502).
running = _detect_openclaw_processes()
if running:
print()
print_error("OpenClaw appears to be still running:")
for detail in running:
print_info(f" * {detail}")
print_info(
"Archiving .openclaw/ while the service is active may cause it to "
"immediately recreate an empty skeleton directory, destroying your config."
)
print_info("Stop OpenClaw first: systemctl --user stop openclaw-gateway.service")
print()
if not auto_yes:
if not sys.stdin.isatty():
print_info("Non-interactive session — aborting. Stop OpenClaw and re-run.")
return
if not prompt_yes_no("Proceed anyway?", default=False):
print_info("Aborted. Stop OpenClaw first, then re-run: hermes claw cleanup")
return
total_archived = 0
for source_dir in dirs_to_check:
@@ -517,7 +587,7 @@ def _cmd_cleanup(args):
if state_files:
print()
print(color(f" {len(state_files)} state file(s) that could cause confusion:", Colors.YELLOW))
print(color(f" {len(state_files)} state file(s) found:", Colors.YELLOW))
for path, desc in state_files[:8]:
print(f" {desc}")
if len(state_files) > 8:
-1
View File
@@ -6,7 +6,6 @@ mcp_config.py, and memory_setup.py.
"""
import getpass
import sys
from hermes_cli.colors import Colors, color
+249 -81
View File
@@ -12,6 +12,9 @@ from __future__ import annotations
import os
import re
import shutil
import subprocess
import time
from collections.abc import Callable, Mapping
from dataclasses import dataclass
from typing import Any
@@ -69,9 +72,12 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="[name]"),
CommandDef("branch", "Branch the current session (explore a different path)", "Session",
aliases=("fork",), args_hint="[name]"),
CommandDef("compress", "Manually compress conversation context", "Session"),
CommandDef("compress", "Manually compress conversation context", "Session",
args_hint="[focus topic]"),
CommandDef("rollback", "List or restore filesystem checkpoints", "Session",
args_hint="[number]"),
CommandDef("snapshot", "Create or restore state snapshots of Hermes config/state", "Session",
aliases=("snap",), args_hint="[create|restore <id>|prune]"),
CommandDef("stop", "Kill all running background processes", "Session"),
CommandDef("approve", "Approve a pending dangerous command", "Session",
gateway_only=True, args_hint="[session|always]"),
@@ -128,6 +134,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
cli_only=True, args_hint="[subcommand]",
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills"),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
aliases=("reload_mcp",)),
CommandDef("browser", "Connect browser tools to your live Chrome via CDP", "Tools & Skills",
@@ -153,6 +160,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True, args_hint="<path>"),
CommandDef("update", "Update Hermes Agent to the latest version", "Info",
gateway_only=True),
CommandDef("debug", "Upload debug report (system info + logs) and get shareable links", "Info"),
# Exit
CommandDef("quit", "Exit the CLI", "Exit",
@@ -185,52 +193,6 @@ def resolve_command(name: str) -> CommandDef | None:
return _COMMAND_LOOKUP.get(name.lower().lstrip("/"))
def rebuild_lookups() -> None:
"""Rebuild all derived lookup dicts from the current COMMAND_REGISTRY.
Called after plugin commands are registered so they appear in help,
autocomplete, gateway dispatch, Telegram menu, and Slack mapping.
"""
global GATEWAY_KNOWN_COMMANDS
_COMMAND_LOOKUP.clear()
_COMMAND_LOOKUP.update(_build_command_lookup())
COMMANDS.clear()
for cmd in COMMAND_REGISTRY:
if not cmd.gateway_only:
COMMANDS[f"/{cmd.name}"] = _build_description(cmd)
for alias in cmd.aliases:
COMMANDS[f"/{alias}"] = f"{cmd.description} (alias for /{cmd.name})"
COMMANDS_BY_CATEGORY.clear()
for cmd in COMMAND_REGISTRY:
if not cmd.gateway_only:
cat = COMMANDS_BY_CATEGORY.setdefault(cmd.category, {})
cat[f"/{cmd.name}"] = COMMANDS[f"/{cmd.name}"]
for alias in cmd.aliases:
cat[f"/{alias}"] = COMMANDS[f"/{alias}"]
SUBCOMMANDS.clear()
for cmd in COMMAND_REGISTRY:
if cmd.subcommands:
SUBCOMMANDS[f"/{cmd.name}"] = list(cmd.subcommands)
for cmd in COMMAND_REGISTRY:
key = f"/{cmd.name}"
if key in SUBCOMMANDS or not cmd.args_hint:
continue
m = _PIPE_SUBS_RE.search(cmd.args_hint)
if m:
SUBCOMMANDS[key] = m.group(0).split("|")
GATEWAY_KNOWN_COMMANDS = frozenset(
name
for cmd in COMMAND_REGISTRY
if not cmd.cli_only or cmd.gateway_config_gate
for name in (cmd.name, *cmd.aliases)
)
def _build_description(cmd: CommandDef) -> str:
"""Build a CLI-facing description string including usage hint."""
if cmd.args_hint:
@@ -620,6 +582,116 @@ def discord_skill_commands(
)
def discord_skill_commands_by_category(
reserved_names: set[str],
) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
"""Return skill entries organized by category for Discord ``/skill`` subcommand groups.
Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
(e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
category. Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
*uncategorized* the caller should register them as direct subcommands
of the ``/skill`` group.
The same filtering as :func:`discord_skill_commands` is applied: hub
skills excluded, per-platform disabled excluded, names clamped.
Returns:
``(categories, uncategorized, hidden_count)``
- *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
- *uncategorized*: ``[(name, description, cmd_key), ...]``
- *hidden_count*: skills dropped due to Discord group limits
(25 subcommand groups, 25 subcommands per group)
"""
from pathlib import Path as _P
_platform_disabled: set[str] = set()
try:
from agent.skill_utils import get_disabled_skill_names
_platform_disabled = get_disabled_skill_names(platform="discord")
except Exception:
pass
# Collect raw skill data --------------------------------------------------
categories: dict[str, list[tuple[str, str, str]]] = {}
uncategorized: list[tuple[str, str, str]] = []
_names_used: set[str] = set(reserved_names)
hidden = 0
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
_skills_dir = SKILLS_DIR.resolve()
_hub_dir = (SKILLS_DIR / ".hub").resolve()
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path:
continue
sp = _P(skill_path).resolve()
# Skip skills outside SKILLS_DIR or from the hub
if not str(sp).startswith(str(_skills_dir)):
continue
if str(sp).startswith(str(_hub_dir)):
continue
skill_name = info.get("name", "")
if skill_name in _platform_disabled:
continue
raw_name = cmd_key.lstrip("/")
# Clamp to 32 chars (Discord limit)
discord_name = raw_name[:32]
if discord_name in _names_used:
continue
_names_used.add(discord_name)
desc = info.get("description", "")
if len(desc) > 100:
desc = desc[:97] + "..."
# Determine category from the relative path within SKILLS_DIR.
# e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
try:
rel = sp.parent.relative_to(_skills_dir)
except ValueError:
continue
parts = rel.parts
if len(parts) >= 2:
cat = parts[0]
categories.setdefault(cat, []).append((discord_name, desc, cmd_key))
else:
uncategorized.append((discord_name, desc, cmd_key))
except Exception:
pass
# Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
_MAX_GROUPS = 25
_MAX_PER_GROUP = 25
trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
group_count = 0
for cat in sorted(categories):
if group_count >= _MAX_GROUPS:
hidden += len(categories[cat])
continue
entries = categories[cat][:_MAX_PER_GROUP]
hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
trimmed_categories[cat] = entries
group_count += 1
# Uncategorized skills also count against the 25 top-level limit
remaining_slots = _MAX_GROUPS - group_count
if len(uncategorized) > remaining_slots:
hidden += len(uncategorized) - remaining_slots
uncategorized = uncategorized[:remaining_slots]
return trimmed_categories, uncategorized, hidden
def slack_subcommand_map() -> dict[str, str]:
"""Return subcommand -> /command mapping for Slack /hermes handler.
@@ -651,6 +723,10 @@ class SlashCommandCompleter(Completer):
) -> None:
self._skill_commands_provider = skill_commands_provider
self._command_filter = command_filter
# Cached project file list for fuzzy @ completions
self._file_cache: list[str] = []
self._file_cache_time: float = 0.0
self._file_cache_cwd: str = ""
def _command_allowed(self, slash_command: str) -> bool:
if self._command_filter is None:
@@ -835,46 +911,138 @@ class SlashCommandCompleter(Completer):
count += 1
return
# Bare @ or @partial — show matching files/folders from cwd
# Bare @ or @partial — fuzzy project-wide file search
query = word[1:] # strip the @
if not query:
search_dir, match_prefix = ".", ""
else:
expanded = os.path.expanduser(query)
if expanded.endswith("/"):
search_dir, match_prefix = expanded, ""
else:
search_dir = os.path.dirname(expanded) or "."
match_prefix = os.path.basename(expanded)
yield from self._fuzzy_file_completions(word, query, limit)
try:
entries = os.listdir(search_dir)
except OSError:
def _get_project_files(self) -> list[str]:
"""Return cached list of project files (refreshed every 5s)."""
cwd = os.getcwd()
now = time.monotonic()
if (
self._file_cache
and self._file_cache_cwd == cwd
and now - self._file_cache_time < 5.0
):
return self._file_cache
files: list[str] = []
# Try rg first (fast, respects .gitignore), then fd, then find.
for cmd in [
["rg", "--files", "--sortr=modified", cwd],
["rg", "--files", cwd],
["fd", "--type", "f", "--base-directory", cwd],
]:
tool = cmd[0]
if not shutil.which(tool):
continue
try:
proc = subprocess.run(
cmd, capture_output=True, text=True, timeout=2,
cwd=cwd,
)
if proc.returncode == 0 and proc.stdout.strip():
raw = proc.stdout.strip().split("\n")
# Store relative paths
for p in raw[:5000]:
rel = os.path.relpath(p, cwd) if os.path.isabs(p) else p
files.append(rel)
break
except (subprocess.TimeoutExpired, OSError):
continue
self._file_cache = files
self._file_cache_time = now
self._file_cache_cwd = cwd
return files
@staticmethod
def _score_path(filepath: str, query: str) -> int:
"""Score a file path against a fuzzy query. Higher = better match."""
if not query:
return 1 # show everything when query is empty
filename = os.path.basename(filepath)
lower_file = filename.lower()
lower_path = filepath.lower()
lower_q = query.lower()
# Exact filename match
if lower_file == lower_q:
return 100
# Filename starts with query
if lower_file.startswith(lower_q):
return 80
# Filename contains query as substring
if lower_q in lower_file:
return 60
# Full path contains query
if lower_q in lower_path:
return 40
# Initials / abbreviation match: e.g. "fo" matches "file_operations"
# Check if query chars appear in order in filename
qi = 0
for c in lower_file:
if qi < len(lower_q) and c == lower_q[qi]:
qi += 1
if qi == len(lower_q):
# Bonus if matches land on word boundaries (after _, -, /, .)
boundary_hits = 0
qi = 0
prev = "_" # treat start as boundary
for c in lower_file:
if qi < len(lower_q) and c == lower_q[qi]:
if prev in "_-./":
boundary_hits += 1
qi += 1
prev = c
if boundary_hits >= len(lower_q) * 0.5:
return 35
return 25
return 0
def _fuzzy_file_completions(self, word: str, query: str, limit: int = 20):
"""Yield fuzzy file completions for bare @query."""
files = self._get_project_files()
if not query:
# No query — show recently modified files (already sorted by mtime)
for fp in files[:limit]:
is_dir = fp.endswith("/")
filename = os.path.basename(fp)
kind = "folder" if is_dir else "file"
meta = "dir" if is_dir else _file_size_label(
os.path.join(os.getcwd(), fp)
)
yield Completion(
f"@{kind}:{fp}",
start_position=-len(word),
display=filename,
display_meta=meta,
)
return
count = 0
prefix_lower = match_prefix.lower()
for entry in sorted(entries):
if match_prefix and not entry.lower().startswith(prefix_lower):
continue
if entry.startswith("."):
continue # skip hidden files in bare @ mode
if count >= limit:
break
full_path = os.path.join(search_dir, entry)
is_dir = os.path.isdir(full_path)
display_path = os.path.relpath(full_path)
suffix = "/" if is_dir else ""
# Score and rank
scored = []
for fp in files:
s = self._score_path(fp, query)
if s > 0:
scored.append((s, fp))
scored.sort(key=lambda x: (-x[0], x[1]))
for _, fp in scored[:limit]:
is_dir = fp.endswith("/")
filename = os.path.basename(fp)
kind = "folder" if is_dir else "file"
meta = "dir" if is_dir else _file_size_label(full_path)
completion = f"@{kind}:{display_path}{suffix}"
yield Completion(
completion,
start_position=-len(word),
display=entry + suffix,
display_meta=meta,
meta = "dir" if is_dir else _file_size_label(
os.path.join(os.getcwd(), fp)
)
yield Completion(
f"@{kind}:{fp}",
start_position=-len(word),
display=filename,
display_meta=f"{fp} {meta}" if meta else fp,
)
count += 1
def _model_completions(self, sub_text: str, sub_lower: str):
"""Yield completions for /model from config aliases + built-in aliases."""
+315
View File
@@ -0,0 +1,315 @@
"""Shell completion script generation for hermes CLI.
Walks the live argparse parser tree to generate accurate, always-up-to-date
completion scripts no hardcoded subcommand lists, no extra dependencies.
Supports bash, zsh, and fish.
"""
from __future__ import annotations
import argparse
from typing import Any
def _walk(parser: argparse.ArgumentParser) -> dict[str, Any]:
"""Recursively extract subcommands and flags from a parser.
Uses _SubParsersAction._choices_actions to get canonical names (no aliases)
along with their help text.
"""
flags: list[str] = []
subcommands: dict[str, Any] = {}
for action in parser._actions:
if isinstance(action, argparse._SubParsersAction):
# _choices_actions has one entry per canonical name; aliases are
# omitted, which keeps completion lists clean.
seen: set[str] = set()
for pseudo in action._choices_actions:
name = pseudo.dest
if name in seen:
continue
seen.add(name)
subparser = action.choices.get(name)
if subparser is None:
continue
info = _walk(subparser)
info["help"] = _clean(pseudo.help or "")
subcommands[name] = info
elif action.option_strings:
flags.extend(o for o in action.option_strings if o.startswith("-"))
return {"flags": flags, "subcommands": subcommands}
def _clean(text: str, maxlen: int = 60) -> str:
"""Strip shell-unsafe characters and truncate."""
return text.replace("'", "").replace('"', "").replace("\\", "")[:maxlen]
# ---------------------------------------------------------------------------
# Bash
# ---------------------------------------------------------------------------
def generate_bash(parser: argparse.ArgumentParser) -> str:
tree = _walk(parser)
top_cmds = " ".join(sorted(tree["subcommands"]))
cases: list[str] = []
for cmd in sorted(tree["subcommands"]):
info = tree["subcommands"][cmd]
if cmd == "profile" and info["subcommands"]:
# Profile subcommand: complete actions, then profile names for
# actions that accept a profile argument.
subcmds = " ".join(sorted(info["subcommands"]))
profile_actions = "use delete show alias rename export"
cases.append(
f" profile)\n"
f" case \"$prev\" in\n"
f" profile)\n"
f" COMPREPLY=($(compgen -W \"{subcmds}\" -- \"$cur\"))\n"
f" return\n"
f" ;;\n"
f" {profile_actions.replace(' ', '|')})\n"
f" COMPREPLY=($(compgen -W \"$(_hermes_profiles)\" -- \"$cur\"))\n"
f" return\n"
f" ;;\n"
f" esac\n"
f" ;;"
)
elif info["subcommands"]:
subcmds = " ".join(sorted(info["subcommands"]))
cases.append(
f" {cmd})\n"
f" COMPREPLY=($(compgen -W \"{subcmds}\" -- \"$cur\"))\n"
f" return\n"
f" ;;"
)
elif info["flags"]:
flags = " ".join(info["flags"])
cases.append(
f" {cmd})\n"
f" COMPREPLY=($(compgen -W \"{flags}\" -- \"$cur\"))\n"
f" return\n"
f" ;;"
)
cases_str = "\n".join(cases)
return f"""# Hermes Agent bash completion
# Add to ~/.bashrc:
# eval "$(hermes completion bash)"
_hermes_profiles() {{
local profiles_dir="$HOME/.hermes/profiles"
local profiles="default"
if [ -d "$profiles_dir" ]; then
profiles="$profiles $(ls "$profiles_dir" 2>/dev/null)"
fi
echo "$profiles"
}}
_hermes_completion() {{
local cur prev
COMPREPLY=()
cur="${{COMP_WORDS[COMP_CWORD]}}"
prev="${{COMP_WORDS[COMP_CWORD-1]}}"
# Complete profile names after -p / --profile
if [[ "$prev" == "-p" || "$prev" == "--profile" ]]; then
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
fi
if [[ $COMP_CWORD -ge 2 ]]; then
case "${{COMP_WORDS[1]}}" in
{cases_str}
esac
fi
if [[ $COMP_CWORD -eq 1 ]]; then
COMPREPLY=($(compgen -W "{top_cmds}" -- "$cur"))
fi
}}
complete -F _hermes_completion hermes
"""
# ---------------------------------------------------------------------------
# Zsh
# ---------------------------------------------------------------------------
def generate_zsh(parser: argparse.ArgumentParser) -> str:
tree = _walk(parser)
top_cmds_lines: list[str] = []
for cmd in sorted(tree["subcommands"]):
help_text = _clean(tree["subcommands"][cmd].get("help", ""))
top_cmds_lines.append(f" '{cmd}:{help_text}'")
top_cmds_str = "\n".join(top_cmds_lines)
sub_cases: list[str] = []
for cmd in sorted(tree["subcommands"]):
info = tree["subcommands"][cmd]
if not info["subcommands"]:
continue
if cmd == "profile":
# Profile subcommand: complete actions, then profile names for
# actions that accept a profile argument.
sub_lines: list[str] = []
for sc in sorted(info["subcommands"]):
sh = _clean(info["subcommands"][sc].get("help", ""))
sub_lines.append(f" '{sc}:{sh}'")
sub_str = "\n".join(sub_lines)
sub_cases.append(
f" profile)\n"
f" case ${{line[2]}} in\n"
f" use|delete|show|alias|rename|export)\n"
f" _hermes_profiles\n"
f" ;;\n"
f" *)\n"
f" local -a profile_cmds\n"
f" profile_cmds=(\n"
f"{sub_str}\n"
f" )\n"
f" _describe 'profile command' profile_cmds\n"
f" ;;\n"
f" esac\n"
f" ;;"
)
else:
sub_lines = []
for sc in sorted(info["subcommands"]):
sh = _clean(info["subcommands"][sc].get("help", ""))
sub_lines.append(f" '{sc}:{sh}'")
sub_str = "\n".join(sub_lines)
safe = cmd.replace("-", "_")
sub_cases.append(
f" {cmd})\n"
f" local -a {safe}_cmds\n"
f" {safe}_cmds=(\n"
f"{sub_str}\n"
f" )\n"
f" _describe '{cmd} command' {safe}_cmds\n"
f" ;;"
)
sub_cases_str = "\n".join(sub_cases)
return f"""#compdef hermes
# Hermes Agent zsh completion
# Add to ~/.zshrc:
# eval "$(hermes completion zsh)"
_hermes_profiles() {{
local -a profiles
profiles=(default)
if [[ -d "$HOME/.hermes/profiles" ]]; then
profiles+=("${{(@f)$(ls $HOME/.hermes/profiles 2>/dev/null)}}")
fi
_describe 'profile' profiles
}}
_hermes() {{
local context state line
typeset -A opt_args
_arguments -C \\
'(-h --help){{-h,--help}}[Show help and exit]' \\
'(-V --version){{-V,--version}}[Show version and exit]' \\
'(-p --profile){{-p,--profile}}[Profile name]:profile:_hermes_profiles' \\
'1:command:->commands' \\
'*::arg:->args'
case $state in
commands)
local -a subcmds
subcmds=(
{top_cmds_str}
)
_describe 'hermes command' subcmds
;;
args)
case ${{line[1]}} in
{sub_cases_str}
esac
;;
esac
}}
_hermes "$@"
"""
# ---------------------------------------------------------------------------
# Fish
# ---------------------------------------------------------------------------
def generate_fish(parser: argparse.ArgumentParser) -> str:
tree = _walk(parser)
top_cmds = sorted(tree["subcommands"])
top_cmds_str = " ".join(top_cmds)
lines: list[str] = [
"# Hermes Agent fish completion",
"# Add to your config:",
"# hermes completion fish | source",
"",
"# Helper: list available profiles",
"function __hermes_profiles",
" echo default",
" if test -d $HOME/.hermes/profiles",
" ls $HOME/.hermes/profiles 2>/dev/null",
" end",
"end",
"",
"# Disable file completion by default",
"complete -c hermes -f",
"",
"# Complete profile names after -p / --profile",
"complete -c hermes -f -s p -l profile"
" -d 'Profile name' -xa '(__hermes_profiles)'",
"",
"# Top-level subcommands",
]
for cmd in top_cmds:
info = tree["subcommands"][cmd]
help_text = _clean(info.get("help", ""))
lines.append(
f"complete -c hermes -f "
f"-n 'not __fish_seen_subcommand_from {top_cmds_str}' "
f"-a {cmd} -d '{help_text}'"
)
lines.append("")
lines.append("# Subcommand completions")
profile_name_actions = {"use", "delete", "show", "alias", "rename", "export"}
for cmd in top_cmds:
info = tree["subcommands"][cmd]
if not info["subcommands"]:
continue
lines.append(f"# {cmd}")
for sc in sorted(info["subcommands"]):
sinfo = info["subcommands"][sc]
sh = _clean(sinfo.get("help", ""))
lines.append(
f"complete -c hermes -f "
f"-n '__fish_seen_subcommand_from {cmd}' "
f"-a {sc} -d '{sh}'"
)
# For profile subcommand, complete profile names for relevant actions
if cmd == "profile":
for action in sorted(profile_name_actions):
lines.append(
f"complete -c hermes -f "
f"-n '__fish_seen_subcommand_from {action}; "
f"and __fish_seen_subcommand_from profile' "
f"-a '(__hermes_profiles)' -d 'Profile name'"
)
lines.append("")
return "\n".join(lines)
+419 -18
View File
@@ -45,11 +45,15 @@ _EXTRA_ENV_KEYS = frozenset({
"WEIXIN_HOME_CHANNEL", "WEIXIN_HOME_CHANNEL_NAME", "WEIXIN_DM_POLICY", "WEIXIN_GROUP_POLICY",
"WEIXIN_ALLOWED_USERS", "WEIXIN_GROUP_ALLOWED_USERS", "WEIXIN_ALLOW_ALL_USERS",
"BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_PASSWORD",
"QQ_APP_ID", "QQ_CLIENT_SECRET", "QQ_HOME_CHANNEL", "QQ_HOME_CHANNEL_NAME",
"QQ_ALLOWED_USERS", "QQ_GROUP_ALLOWED_USERS", "QQ_ALLOW_ALL_USERS", "QQ_MARKDOWN_SUPPORT",
"QQ_STT_API_KEY", "QQ_STT_BASE_URL", "QQ_STT_MODEL",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
"WHATSAPP_MODE", "WHATSAPP_ENABLED",
"MATTERMOST_HOME_CHANNEL", "MATTERMOST_REPLY_MODE",
"MATRIX_PASSWORD", "MATRIX_ENCRYPTION", "MATRIX_DEVICE_ID", "MATRIX_HOME_ROOM",
"MATRIX_REQUIRE_MENTION", "MATRIX_FREE_RESPONSE_ROOMS", "MATRIX_AUTO_THREAD",
"MATRIX_RECOVERY_KEY",
})
import yaml
@@ -143,6 +147,55 @@ def managed_error(action: str = "modify configuration"):
print(format_managed_message(action), file=sys.stderr)
# =============================================================================
# Container-aware CLI (NixOS container mode)
# =============================================================================
def get_container_exec_info() -> Optional[dict]:
"""Read container mode metadata from HERMES_HOME/.container-mode.
Returns a dict with keys: backend, container_name, exec_user, hermes_bin
or None if container mode is not active, we're already inside the
container, or HERMES_DEV=1 is set.
The .container-mode file is written by the NixOS activation script when
container.enable = true. It tells the host CLI to exec into the container
instead of running locally.
"""
if os.environ.get("HERMES_DEV") == "1":
return None
from hermes_constants import is_container
if is_container():
return None
container_mode_file = get_hermes_home() / ".container-mode"
try:
info = {}
with open(container_mode_file, "r") as f:
for line in f:
line = line.strip()
if "=" in line and not line.startswith("#"):
key, _, value = line.partition("=")
info[key.strip()] = value.strip()
except FileNotFoundError:
return None
# All other exceptions (PermissionError, malformed data, etc.) propagate
backend = info.get("backend", "docker")
container_name = info.get("container_name", "hermes-agent")
exec_user = info.get("exec_user", "hermes")
hermes_bin = info.get("hermes_bin", "/data/current-package/bin/hermes")
return {
"backend": backend,
"container_name": container_name,
"exec_user": exec_user,
"hermes_bin": hermes_bin,
}
# =============================================================================
# Config paths
# =============================================================================
@@ -287,6 +340,10 @@ DEFAULT_CONFIG = {
# threshold before escalating to a full timeout. The warning fires
# once per run and does not interrupt the agent. 0 = disable warning.
"gateway_timeout_warning": 900,
# Periodic "still working" notification interval (seconds).
# Sends a status message every N seconds so the user knows the
# agent hasn't died during long tasks. 0 = disable notifications.
"gateway_notify_interval": 600,
},
"terminal": {
@@ -360,9 +417,7 @@ DEFAULT_CONFIG = {
"threshold": 0.50, # compress when context usage exceeds this ratio
"target_ratio": 0.20, # fraction of threshold to preserve as recent tail
"protect_last_n": 20, # minimum recent messages to keep uncompressed
"summary_model": "", # empty = use main configured model
"summary_provider": "auto",
"summary_base_url": None,
},
"smart_model_routing": {
"enabled": False,
@@ -448,9 +503,11 @@ DEFAULT_CONFIG = {
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
"interim_assistant_messages": True, # Gateway: show natural mid-turn assistant status messages
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_progress_overrides": {}, # Per-platform overrides: {"signal": "off", "telegram": "all"}
"tool_progress_overrides": {}, # DEPRECATED — use display.platforms instead
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
"platforms": {}, # Per-platform display overrides: {"telegram": {"tool_progress": "all"}, "slack": {"tool_progress": "off"}}
},
# Privacy settings
@@ -637,8 +694,16 @@ DEFAULT_CONFIG = {
"backup_count": 3, # Number of rotated backup files to keep
},
# Network settings — workarounds for connectivity issues.
"network": {
# Force IPv4 connections. On servers with broken or unreachable IPv6,
# Python tries AAAA records first and hangs for the full TCP timeout
# before falling back to IPv4. Set to true to skip IPv6 entirely.
"force_ipv4": False,
},
# Config schema version - bump this when adding new required fields
"_config_version": 14,
"_config_version": 17,
}
# =============================================================================
@@ -754,6 +819,30 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"KIMI_CN_API_KEY": {
"description": "Kimi / Moonshot China API key",
"prompt": "Kimi (China) API key",
"url": "https://platform.moonshot.cn/",
"password": True,
"category": "provider",
"advanced": True,
},
"ARCEEAI_API_KEY": {
"description": "Arcee AI API key",
"prompt": "Arcee AI API key",
"url": "https://chat.arcee.ai/",
"password": True,
"category": "provider",
"advanced": True,
},
"ARCEE_BASE_URL": {
"description": "Arcee AI base URL override",
"prompt": "Arcee base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"MINIMAX_API_KEY": {
"description": "MiniMax API key (international)",
"prompt": "MiniMax API key",
@@ -1106,7 +1195,7 @@ OPTIONAL_ENV_VARS = {
"SLACK_BOT_TOKEN": {
"description": "Slack bot token (xoxb-). Get from OAuth & Permissions after installing your app. "
"Required scopes: chat:write, app_mentions:read, channels:history, groups:history, "
"im:history, im:read, im:write, users:read, files:write",
"im:history, im:read, im:write, users:read, files:read, files:write",
"prompt": "Slack Bot Token (xoxb-...)",
"url": "https://api.slack.com/apps",
"password": True,
@@ -1216,6 +1305,14 @@ OPTIONAL_ENV_VARS = {
"category": "messaging",
"advanced": True,
},
"MATRIX_RECOVERY_KEY": {
"description": "Matrix recovery key for cross-signing verification after device key rotation (from Element: Settings → Security → Recovery Key)",
"prompt": "Matrix recovery key",
"url": None,
"password": True,
"category": "messaging",
"advanced": True,
},
"BLUEBUBBLES_SERVER_URL": {
"description": "BlueBubbles server URL for iMessage integration (e.g. http://192.168.1.10:1234)",
"prompt": "BlueBubbles server URL",
@@ -1237,6 +1334,53 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "messaging",
},
"BLUEBUBBLES_ALLOW_ALL_USERS": {
"description": "Allow all BlueBubbles users without allowlist",
"prompt": "Allow All BlueBubbles Users",
"category": "messaging",
},
"QQ_APP_ID": {
"description": "QQ Bot App ID from QQ Open Platform (q.qq.com)",
"prompt": "QQ App ID",
"url": "https://q.qq.com",
"category": "messaging",
},
"QQ_CLIENT_SECRET": {
"description": "QQ Bot Client Secret from QQ Open Platform",
"prompt": "QQ Client Secret",
"password": True,
"category": "messaging",
},
"QQ_ALLOWED_USERS": {
"description": "Comma-separated QQ user IDs allowed to use the bot",
"prompt": "QQ Allowed Users",
"category": "messaging",
},
"QQ_GROUP_ALLOWED_USERS": {
"description": "Comma-separated QQ group IDs allowed to interact with the bot",
"prompt": "QQ Group Allowed Users",
"category": "messaging",
},
"QQ_ALLOW_ALL_USERS": {
"description": "Allow all QQ users without an allowlist (true/false)",
"prompt": "Allow All QQ Users",
"category": "messaging",
},
"QQ_HOME_CHANNEL": {
"description": "Default QQ channel/group for cron delivery and notifications",
"prompt": "QQ Home Channel",
"category": "messaging",
},
"QQ_HOME_CHANNEL_NAME": {
"description": "Display name for the QQ home channel",
"prompt": "QQ Home Channel Name",
"category": "messaging",
},
"QQ_SANDBOX": {
"description": "Enable QQ sandbox mode for development testing (true/false)",
"prompt": "QQ Sandbox Mode",
"category": "messaging",
},
"GATEWAY_ALLOW_ALL_USERS": {
"description": "Allow all users to interact with messaging bots (true/false). Default: false.",
"prompt": "Allow all users (true/false)",
@@ -1285,6 +1429,22 @@ OPTIONAL_ENV_VARS = {
"category": "messaging",
"advanced": True,
},
"GATEWAY_PROXY_URL": {
"description": "URL of a remote Hermes API server to forward messages to (proxy mode). When set, the gateway handles platform I/O only — all agent work is delegated to the remote server. Use for Docker E2EE containers that relay to a host agent. Also configurable via gateway.proxy_url in config.yaml.",
"prompt": "Remote Hermes API server URL (e.g. http://192.168.1.100:8642)",
"url": None,
"password": False,
"category": "messaging",
"advanced": True,
},
"GATEWAY_PROXY_KEY": {
"description": "Bearer token for authenticating with the remote Hermes API server (proxy mode). Must match the API_SERVER_KEY on the remote host.",
"prompt": "Remote API server auth key",
"url": None,
"password": True,
"category": "messaging",
"advanced": True,
},
"WEBHOOK_ENABLED": {
"description": "Enable the webhook platform adapter for receiving events from GitHub, GitLab, etc.",
"prompt": "Enable webhooks (true/false)",
@@ -1474,6 +1634,137 @@ def get_missing_skill_config_vars() -> List[Dict[str, Any]]:
return missing
def _normalize_custom_provider_entry(
entry: Any,
*,
provider_key: str = "",
) -> Optional[Dict[str, Any]]:
"""Return a runtime-compatible custom provider entry or ``None``."""
if not isinstance(entry, dict):
return None
base_url = ""
for url_key in ("api", "url", "base_url"):
raw_url = entry.get(url_key)
if isinstance(raw_url, str) and raw_url.strip():
base_url = raw_url.strip()
break
if not base_url:
return None
name = ""
raw_name = entry.get("name")
if isinstance(raw_name, str) and raw_name.strip():
name = raw_name.strip()
elif provider_key.strip():
name = provider_key.strip()
if not name:
return None
normalized: Dict[str, Any] = {
"name": name,
"base_url": base_url,
}
provider_key = provider_key.strip()
if provider_key:
normalized["provider_key"] = provider_key
api_key = entry.get("api_key")
if isinstance(api_key, str) and api_key.strip():
normalized["api_key"] = api_key.strip()
key_env = entry.get("key_env")
if isinstance(key_env, str) and key_env.strip():
normalized["key_env"] = key_env.strip()
api_mode = entry.get("api_mode") or entry.get("transport")
if isinstance(api_mode, str) and api_mode.strip():
normalized["api_mode"] = api_mode.strip()
model_name = entry.get("model") or entry.get("default_model")
if isinstance(model_name, str) and model_name.strip():
normalized["model"] = model_name.strip()
models = entry.get("models")
if isinstance(models, dict) and models:
normalized["models"] = models
context_length = entry.get("context_length")
if isinstance(context_length, int) and context_length > 0:
normalized["context_length"] = context_length
rate_limit_delay = entry.get("rate_limit_delay")
if isinstance(rate_limit_delay, (int, float)) and rate_limit_delay >= 0:
normalized["rate_limit_delay"] = rate_limit_delay
return normalized
def providers_dict_to_custom_providers(providers_dict: Any) -> List[Dict[str, Any]]:
"""Normalize ``providers`` config entries into the legacy custom-provider shape."""
if not isinstance(providers_dict, dict):
return []
custom_providers: List[Dict[str, Any]] = []
for key, entry in providers_dict.items():
normalized = _normalize_custom_provider_entry(entry, provider_key=str(key))
if normalized is not None:
custom_providers.append(normalized)
return custom_providers
def get_compatible_custom_providers(
config: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
"""Return a deduplicated custom-provider view across legacy and v12+ config.
``custom_providers`` remains the on-disk legacy format, while ``providers``
is the newer keyed schema. Runtime and picker flows still need a single
list-shaped view, but we should not materialise that compatibility layer
back into config.yaml because it duplicates entries in UIs.
"""
if config is None:
config = load_config()
compatible: List[Dict[str, Any]] = []
seen_provider_keys: set = set()
seen_name_url_pairs: set = set()
def _append_if_new(entry: Optional[Dict[str, Any]]) -> None:
if entry is None:
return
provider_key = str(entry.get("provider_key", "") or "").strip().lower()
name = str(entry.get("name", "") or "").strip().lower()
base_url = str(entry.get("base_url", "") or "").strip().rstrip("/").lower()
model = str(entry.get("model", "") or "").strip().lower()
pair = (name, base_url, model)
if provider_key and provider_key in seen_provider_keys:
return
if name and base_url and pair in seen_name_url_pairs:
return
compatible.append(entry)
if provider_key:
seen_provider_keys.add(provider_key)
if name and base_url:
seen_name_url_pairs.add(pair)
custom_providers = config.get("custom_providers")
if custom_providers is not None:
if not isinstance(custom_providers, list):
return []
for entry in custom_providers:
_append_if_new(_normalize_custom_provider_entry(entry))
for entry in providers_dict_to_custom_providers(config.get("providers")):
_append_if_new(entry)
return compatible
def check_config_version() -> Tuple[int, int]:
"""
Check config version.
@@ -1791,8 +2082,8 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if migrated_count > 0:
config["providers"] = providers_dict
# Remove the old list
del config["custom_providers"]
# Remove the old list — runtime reads via get_compatible_custom_providers()
config.pop("custom_providers", None)
save_config(config)
if not quiet:
print(f" ✓ Migrated {migrated_count} custom provider(s) to providers: section")
@@ -1865,6 +2156,81 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if not quiet:
print(f" ✓ Migrated legacy stt.model to provider-specific config")
# ── Version 14 → 15: add explicit gateway interim-message gate ──
if current_ver < 15:
config = read_raw_config()
display = config.get("display", {})
if not isinstance(display, dict):
display = {}
if "interim_assistant_messages" not in display:
display["interim_assistant_messages"] = True
config["display"] = display
results["config_added"].append("display.interim_assistant_messages=true (default)")
save_config(config)
if not quiet:
print(" ✓ Added display.interim_assistant_messages=true")
# ── Version 15 → 16: migrate tool_progress_overrides into display.platforms ──
if current_ver < 16:
config = read_raw_config()
display = config.get("display", {})
if not isinstance(display, dict):
display = {}
old_overrides = display.get("tool_progress_overrides")
if isinstance(old_overrides, dict) and old_overrides:
platforms = display.get("platforms", {})
if not isinstance(platforms, dict):
platforms = {}
for plat, mode in old_overrides.items():
if plat not in platforms:
platforms[plat] = {}
if "tool_progress" not in platforms[plat]:
platforms[plat]["tool_progress"] = mode
display["platforms"] = platforms
config["display"] = display
save_config(config)
if not quiet:
migrated = ", ".join(f"{p}={m}" for p, m in old_overrides.items())
print(f" ✓ Migrated tool_progress_overrides → display.platforms: {migrated}")
results["config_added"].append("display.platforms (migrated from tool_progress_overrides)")
# ── Version 16 → 17: remove legacy compression.summary_* keys ──
if current_ver < 17:
config = read_raw_config()
comp = config.get("compression", {})
if isinstance(comp, dict):
s_model = comp.pop("summary_model", None)
s_provider = comp.pop("summary_provider", None)
s_base_url = comp.pop("summary_base_url", None)
migrated_keys = []
# Migrate non-empty, non-default values to auxiliary.compression
if s_model and str(s_model).strip():
aux = config.setdefault("auxiliary", {})
aux_comp = aux.setdefault("compression", {})
if not aux_comp.get("model"):
aux_comp["model"] = str(s_model).strip()
migrated_keys.append(f"model={s_model}")
if s_provider and str(s_provider).strip() not in ("", "auto"):
aux = config.setdefault("auxiliary", {})
aux_comp = aux.setdefault("compression", {})
if not aux_comp.get("provider") or aux_comp.get("provider") == "auto":
aux_comp["provider"] = str(s_provider).strip()
migrated_keys.append(f"provider={s_provider}")
if s_base_url and str(s_base_url).strip():
aux = config.setdefault("auxiliary", {})
aux_comp = aux.setdefault("compression", {})
if not aux_comp.get("base_url"):
aux_comp["base_url"] = str(s_base_url).strip()
migrated_keys.append(f"base_url={s_base_url}")
if migrated_keys or s_model is not None or s_provider is not None or s_base_url is not None:
config["compression"] = comp
save_config(config)
if not quiet:
if migrated_keys:
print(f" ✓ Migrated compression.summary_* → auxiliary.compression: {', '.join(migrated_keys)}")
else:
print(" ✓ Removed unused compression.summary_* keys")
if current_ver < latest_ver and not quiet:
print(f"Config version: {current_ver}{latest_ver}")
@@ -2177,6 +2543,7 @@ _FALLBACK_COMMENT = """
# nous (OAuth — hermes auth) — Nous Portal
# zai (ZAI_API_KEY) — Z.AI / GLM
# kimi-coding (KIMI_API_KEY) — Kimi / Moonshot
# kimi-coding-cn (KIMI_CN_API_KEY) — Kimi / Moonshot (China)
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
#
@@ -2220,6 +2587,7 @@ _COMMENTED_SECTIONS = """
# nous (OAuth — hermes auth) — Nous Portal
# zai (ZAI_API_KEY) — Z.AI / GLM
# kimi-coding (KIMI_API_KEY) — Kimi / Moonshot
# kimi-coding-cn (KIMI_CN_API_KEY) — Kimi / Moonshot (China)
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
#
@@ -2274,7 +2642,13 @@ def save_config(config: Dict[str, Any]):
def load_env() -> Dict[str, str]:
"""Load environment variables from ~/.hermes/.env."""
"""Load environment variables from ~/.hermes/.env.
Sanitizes lines before parsing so that corrupted files (e.g.
concatenated KEY=VALUE pairs on a single line) are handled
gracefully instead of producing mangled values such as duplicated
bot tokens. See #8908.
"""
env_path = get_env_path()
env_vars = {}
@@ -2283,17 +2657,21 @@ def load_env() -> Dict[str, str]:
# fail on UTF-8 .env files. Use explicit UTF-8 only on Windows.
open_kw = {"encoding": "utf-8", "errors": "replace"} if _IS_WINDOWS else {}
with open(env_path, **open_kw) as f:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _, value = line.partition('=')
env_vars[key.strip()] = value.strip().strip('"\'')
raw_lines = f.readlines()
# Sanitize before parsing: split concatenated lines & drop stale
# placeholders so corrupted .env files don't produce invalid tokens.
lines = _sanitize_env_lines(raw_lines)
for line in lines:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _, value = line.partition('=')
env_vars[key.strip()] = value.strip().strip('"\'')
return env_vars
def _sanitize_env_lines(lines: list) -> list:
"""Fix corrupted .env lines before writing.
"""Fix corrupted .env lines before reading or writing.
Handles two known corruption patterns:
1. Concatenated KEY=VALUE pairs on a single line (missing newline between
@@ -2526,6 +2904,28 @@ def save_env_value_secure(key: str, value: str) -> Dict[str, Any]:
def reload_env() -> int:
"""Re-read ~/.hermes/.env into os.environ. Returns count of vars updated.
Adds/updates vars that changed and removes vars that were deleted from
the .env file (but only vars known to Hermes OPTIONAL_ENV_VARS and
_EXTRA_ENV_KEYS to avoid clobbering unrelated environment).
"""
env_vars = load_env()
known_keys = set(OPTIONAL_ENV_VARS.keys()) | _EXTRA_ENV_KEYS
count = 0
for key, value in env_vars.items():
if os.environ.get(key) != value:
os.environ[key] = value
count += 1
# Remove known Hermes vars that are no longer in .env
for key in known_keys:
if key not in env_vars and key in os.environ:
del os.environ[key]
count += 1
return count
def get_env_value(key: str) -> Optional[str]:
"""Get a value from ~/.hermes/.env or environment."""
# Check environment first
@@ -2648,10 +3048,11 @@ def show_config():
print(f" Threshold: {compression.get('threshold', 0.50) * 100:.0f}%")
print(f" Target ratio: {compression.get('target_ratio', 0.20) * 100:.0f}% of threshold preserved")
print(f" Protect last: {compression.get('protect_last_n', 20)} messages")
_sm = compression.get('summary_model', '') or '(main model)'
_aux_comp = config.get('auxiliary', {}).get('compression', {})
_sm = _aux_comp.get('model', '') or '(auto)'
print(f" Model: {_sm}")
comp_provider = compression.get('summary_provider', 'auto')
if comp_provider != 'auto':
comp_provider = _aux_comp.get('provider', 'auto')
if comp_provider and comp_provider != 'auto':
print(f" Provider: {comp_provider}")
# Auxiliary models
+18 -2
View File
@@ -117,14 +117,30 @@ def _gh_cli_candidates() -> list[str]:
def _try_gh_cli_token() -> Optional[str]:
"""Return a token from ``gh auth token`` when the GitHub CLI is available."""
"""Return a token from ``gh auth token`` when the GitHub CLI is available.
When COPILOT_GH_HOST is set, passes ``--hostname`` so gh returns the
correct host's token. Also strips GITHUB_TOKEN / GH_TOKEN from the
subprocess environment so ``gh`` reads from its own credential store
(hosts.yml) instead of just echoing the env var back.
"""
hostname = os.getenv("COPILOT_GH_HOST", "").strip()
# Build a clean env so gh doesn't short-circuit on GITHUB_TOKEN / GH_TOKEN
clean_env = {k: v for k, v in os.environ.items()
if k not in ("GITHUB_TOKEN", "GH_TOKEN")}
for gh_path in _gh_cli_candidates():
cmd = [gh_path, "auth", "token"]
if hostname:
cmd += ["--hostname", hostname]
try:
result = subprocess.run(
[gh_path, "auth", "token"],
cmd,
capture_output=True,
text=True,
timeout=5,
env=clean_env,
)
except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
logger.debug("gh CLI token lookup failed (%s): %s", gh_path, exc)
+123
View File
@@ -287,6 +287,129 @@ def _radio_numbered_fallback(
return cancel_returns
def curses_single_select(
title: str,
items: List[str],
default_index: int = 0,
*,
cancel_label: str = "Cancel",
) -> int | None:
"""Curses single-select menu. Returns selected index or None on cancel.
Works inside prompt_toolkit because curses.wrapper() restores the terminal
safely, unlike simple_term_menu which conflicts with /dev/tty.
"""
if not sys.stdin.isatty():
return None
try:
import curses
result_holder: list = [None]
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
def _draw(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = min(default_index, len(all_items) - 1)
scroll_offset = 0
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
try:
hattr = curses.A_BOLD
if curses.has_colors():
hattr |= curses.color_pair(2)
stdscr.addnstr(0, 0, title, max_x - 1, hattr)
stdscr.addnstr(
1, 0,
" ↑↓ navigate ENTER confirm ESC/q cancel",
max_x - 1, curses.A_DIM,
)
except curses.error:
pass
visible_rows = max_y - 3
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
scroll_offset = cursor - visible_rows + 1
for draw_i, i in enumerate(
range(scroll_offset, min(len(all_items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {all_items[i]}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(all_items)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(all_items)
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
result_holder[0] = None
return
curses.wrapper(_draw)
flush_stdin()
if result_holder[0] is not None and result_holder[0] >= cancel_idx:
return None
return result_holder[0]
except Exception:
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
return _numbered_single_fallback(title, all_items, cancel_idx)
def _numbered_single_fallback(
title: str,
items: List[str],
cancel_idx: int,
) -> int | None:
"""Text-based numbered fallback for single-select."""
print(f"\n {title}\n")
for i, label in enumerate(items, 1):
print(f" {i}. {label}")
print()
try:
val = input(f" Choice [1-{len(items)}]: ").strip()
if not val:
return None
idx = int(val) - 1
if 0 <= idx < len(items) and idx < cancel_idx:
return idx
if idx == cancel_idx:
return None
except (ValueError, KeyboardInterrupt, EOFError):
pass
return None
def _numbered_fallback(
title: str,
items: List[str],
+336
View File
@@ -0,0 +1,336 @@
"""``hermes debug`` — debug tools for Hermes Agent.
Currently supports:
hermes debug share Upload debug report (system info + logs) to a
paste service and print a shareable URL.
"""
import io
import sys
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Optional
from hermes_constants import get_hermes_home
# ---------------------------------------------------------------------------
# Paste services — try paste.rs first, dpaste.com as fallback.
# ---------------------------------------------------------------------------
_PASTE_RS_URL = "https://paste.rs/"
_DPASTE_COM_URL = "https://dpaste.com/api/"
# Maximum bytes to read from a single log file for upload.
# paste.rs caps at ~1 MB; we stay under that with headroom.
_MAX_LOG_BYTES = 512_000
def _upload_paste_rs(content: str) -> str:
"""Upload to paste.rs. Returns the paste URL.
paste.rs accepts a plain POST body and returns the URL directly.
"""
data = content.encode("utf-8")
req = urllib.request.Request(
_PASTE_RS_URL, data=data, method="POST",
headers={
"Content-Type": "text/plain; charset=utf-8",
"User-Agent": "hermes-agent/debug-share",
},
)
with urllib.request.urlopen(req, timeout=30) as resp:
url = resp.read().decode("utf-8").strip()
if not url.startswith("http"):
raise ValueError(f"Unexpected response from paste.rs: {url[:200]}")
return url
def _upload_dpaste_com(content: str, expiry_days: int = 7) -> str:
"""Upload to dpaste.com. Returns the paste URL.
dpaste.com uses multipart form data.
"""
boundary = "----HermesDebugBoundary9f3c"
def _field(name: str, value: str) -> str:
return (
f"--{boundary}\r\n"
f'Content-Disposition: form-data; name="{name}"\r\n'
f"\r\n"
f"{value}\r\n"
)
body = (
_field("content", content)
+ _field("syntax", "text")
+ _field("expiry_days", str(expiry_days))
+ f"--{boundary}--\r\n"
).encode("utf-8")
req = urllib.request.Request(
_DPASTE_COM_URL, data=body, method="POST",
headers={
"Content-Type": f"multipart/form-data; boundary={boundary}",
"User-Agent": "hermes-agent/debug-share",
},
)
with urllib.request.urlopen(req, timeout=30) as resp:
url = resp.read().decode("utf-8").strip()
if not url.startswith("http"):
raise ValueError(f"Unexpected response from dpaste.com: {url[:200]}")
return url
def upload_to_pastebin(content: str, expiry_days: int = 7) -> str:
"""Upload *content* to a paste service, trying paste.rs then dpaste.com.
Returns the paste URL on success, raises on total failure.
"""
errors: list[str] = []
# Try paste.rs first (simple, fast)
try:
return _upload_paste_rs(content)
except Exception as exc:
errors.append(f"paste.rs: {exc}")
# Fallback: dpaste.com (supports expiry)
try:
return _upload_dpaste_com(content, expiry_days=expiry_days)
except Exception as exc:
errors.append(f"dpaste.com: {exc}")
raise RuntimeError(
"Failed to upload to any paste service:\n " + "\n ".join(errors)
)
# ---------------------------------------------------------------------------
# Log file reading
# ---------------------------------------------------------------------------
def _resolve_log_path(log_name: str) -> Optional[Path]:
"""Find the log file for *log_name*, falling back to the .1 rotation.
Returns the path if found, or None.
"""
from hermes_cli.logs import LOG_FILES
filename = LOG_FILES.get(log_name)
if not filename:
return None
log_dir = get_hermes_home() / "logs"
primary = log_dir / filename
if primary.exists() and primary.stat().st_size > 0:
return primary
# Fall back to the most recent rotated file (.1).
rotated = log_dir / f"{filename}.1"
if rotated.exists() and rotated.stat().st_size > 0:
return rotated
return None
def _read_log_tail(log_name: str, num_lines: int) -> str:
"""Read the last *num_lines* from a log file, or return a placeholder."""
from hermes_cli.logs import _read_last_n_lines
log_path = _resolve_log_path(log_name)
if log_path is None:
return "(file not found)"
try:
lines = _read_last_n_lines(log_path, num_lines)
return "".join(lines).rstrip("\n")
except Exception as exc:
return f"(error reading: {exc})"
def _read_full_log(log_name: str, max_bytes: int = _MAX_LOG_BYTES) -> Optional[str]:
"""Read a log file for standalone upload.
Returns the file content (last *max_bytes* if truncated), or None if the
file doesn't exist or is empty.
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
return None
try:
size = log_path.stat().st_size
if size == 0:
return None
if size <= max_bytes:
return log_path.read_text(encoding="utf-8", errors="replace")
# File is larger than max_bytes — read the tail.
with open(log_path, "rb") as f:
f.seek(size - max_bytes)
# Skip partial line at the seek point.
f.readline()
content = f.read().decode("utf-8", errors="replace")
return f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{content}"
except Exception:
return None
# ---------------------------------------------------------------------------
# Debug report collection
# ---------------------------------------------------------------------------
def _capture_dump() -> str:
"""Run ``hermes dump`` and return its stdout as a string."""
from hermes_cli.dump import run_dump
class _FakeArgs:
show_keys = False
old_stdout = sys.stdout
sys.stdout = capture = io.StringIO()
try:
run_dump(_FakeArgs())
except SystemExit:
pass
finally:
sys.stdout = old_stdout
return capture.getvalue()
def collect_debug_report(*, log_lines: int = 200, dump_text: str = "") -> str:
"""Build the summary debug report: system dump + log tails.
Parameters
----------
log_lines
Number of recent lines to include per log file.
dump_text
Pre-captured dump output. If empty, ``hermes dump`` is run
internally.
Returns the report as a plain-text string ready for upload.
"""
buf = io.StringIO()
if not dump_text:
dump_text = _capture_dump()
buf.write(dump_text)
# ── Recent log tails (summary only) ──────────────────────────────────
buf.write("\n\n")
buf.write(f"--- agent.log (last {log_lines} lines) ---\n")
buf.write(_read_log_tail("agent", log_lines))
buf.write("\n\n")
errors_lines = min(log_lines, 100)
buf.write(f"--- errors.log (last {errors_lines} lines) ---\n")
buf.write(_read_log_tail("errors", errors_lines))
buf.write("\n\n")
buf.write(f"--- gateway.log (last {errors_lines} lines) ---\n")
buf.write(_read_log_tail("gateway", errors_lines))
buf.write("\n")
return buf.getvalue()
# ---------------------------------------------------------------------------
# CLI entry points
# ---------------------------------------------------------------------------
def run_debug_share(args):
"""Collect debug report + full logs, upload each, print URLs."""
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
dump_text = _capture_dump()
report = collect_debug_report(log_lines=log_lines, dump_text=dump_text)
agent_log = _read_full_log("agent")
gateway_log = _read_full_log("gateway")
# Prepend dump header to each full log so every paste is self-contained.
if agent_log:
agent_log = dump_text + "\n\n--- full agent.log ---\n" + agent_log
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
if local_only:
print(report)
if agent_log:
print(f"\n\n{'=' * 60}")
print("FULL agent.log")
print(f"{'=' * 60}\n")
print(agent_log)
if gateway_log:
print(f"\n\n{'=' * 60}")
print("FULL gateway.log")
print(f"{'=' * 60}\n")
print(gateway_log)
return
print("Uploading...")
urls: dict[str, str] = {}
failures: list[str] = []
# 1. Summary report (required)
try:
urls["Report"] = upload_to_pastebin(report, expiry_days=expiry)
except RuntimeError as exc:
print(f"\nUpload failed: {exc}", file=sys.stderr)
print("\nFull report printed below — copy-paste it manually:\n")
print(report)
sys.exit(1)
# 2. Full agent.log (optional)
if agent_log:
try:
urls["agent.log"] = upload_to_pastebin(agent_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"agent.log: {exc}")
# 3. Full gateway.log (optional)
if gateway_log:
try:
urls["gateway.log"] = upload_to_pastebin(gateway_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"gateway.log: {exc}")
# Print results
label_width = max(len(k) for k in urls)
print(f"\nDebug report uploaded:")
for label, url in urls.items():
print(f" {label:<{label_width}} {url}")
if failures:
print(f"\n (failed to upload: {', '.join(failures)})")
print(f"\nShare these links with the Hermes team for support.")
def run_debug(args):
"""Route debug subcommands."""
subcmd = getattr(args, "debug_command", None)
if subcmd == "share":
run_debug_share(args)
else:
# Default: show help
print("Usage: hermes debug share [--lines N] [--expire N] [--local]")
print()
print("Commands:")
print(" share Upload debug report to a paste service and print URL")
print()
print("Options:")
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
+5 -2
View File
@@ -42,6 +42,7 @@ _PROVIDER_ENV_HINTS = (
"ZAI_API_KEY",
"Z_AI_API_KEY",
"KIMI_API_KEY",
"KIMI_CN_API_KEY",
"MINIMAX_API_KEY",
"MINIMAX_CN_API_KEY",
"KILOCODE_API_KEY",
@@ -721,13 +722,15 @@ def run_doctor(args):
_apikey_providers = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
("AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
("OpenCode Go", ("OPENCODE_GO_API_KEY",), "https://opencode.ai/zen/go/v1/models", "OPENCODE_GO_BASE_URL", True),
@@ -747,7 +750,7 @@ def run_doctor(args):
print(f" Checking {_pname} API...", end="", flush=True)
try:
import httpx
_base = os.getenv(_base_env, "")
_base = os.getenv(_base_env, "") if _base_env else ""
# Auto-detect Kimi Code keys (sk-kimi-) → api.kimi.com
if not _base and _key.startswith("sk-kimi-"):
_base = "https://api.kimi.com/coding/v1"
+11
View File
@@ -44,6 +44,16 @@ def _redact(value: str) -> str:
def _gateway_status() -> str:
"""Return a short gateway status string."""
if sys.platform.startswith("linux"):
from hermes_constants import is_container
if is_container():
try:
from hermes_cli.gateway import find_gateway_pids
pids = find_gateway_pids()
if pids:
return f"running (docker, pid {pids[0]})"
return "stopped (docker)"
except Exception:
return "stopped (docker)"
try:
from hermes_cli.gateway import get_service_name
svc = get_service_name()
@@ -121,6 +131,7 @@ def _configured_platforms() -> list[str]:
"wecom": "WECOM_BOT_ID",
"wecom_callback": "WECOM_CALLBACK_CORP_ID",
"weixin": "WEIXIN_ACCOUNT_ID",
"qqbot": "QQ_APP_ID",
}
return [name for name, env in checks.items() if os.getenv(env)]
+49
View File
@@ -15,6 +15,51 @@ def _load_dotenv_with_fallback(path: Path, *, override: bool) -> None:
load_dotenv(dotenv_path=path, override=override, encoding="latin-1")
def _sanitize_env_file_if_needed(path: Path) -> None:
"""Pre-sanitize a .env file before python-dotenv reads it.
python-dotenv does not handle corrupted lines where multiple
KEY=VALUE pairs are concatenated on a single line (missing newline).
This produces mangled values e.g. a bot token duplicated 8×
(see #8908).
We delegate to ``hermes_cli.config._sanitize_env_lines`` which
already knows all valid Hermes env-var names and can split
concatenated lines correctly.
"""
if not path.exists():
return
try:
from hermes_cli.config import _sanitize_env_lines
except ImportError:
return # early bootstrap — config module not available yet
read_kw = {"encoding": "utf-8", "errors": "replace"}
try:
with open(path, **read_kw) as f:
original = f.readlines()
sanitized = _sanitize_env_lines(original)
if sanitized != original:
import tempfile
fd, tmp = tempfile.mkstemp(
dir=str(path.parent), suffix=".tmp", prefix=".env_"
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
f.writelines(sanitized)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
except BaseException:
try:
os.unlink(tmp)
except OSError:
pass
raise
except Exception:
pass # best-effort — don't block gateway startup
def load_hermes_dotenv(
*,
hermes_home: str | os.PathLike | None = None,
@@ -34,6 +79,10 @@ def load_hermes_dotenv(
user_env = home_path / ".env"
project_env_path = Path(project_env) if project_env else None
# Fix corrupted .env files before python-dotenv parses them (#8908).
if user_env.exists():
_sanitize_env_file_if_needed(user_env)
if user_env.exists():
_load_dotenv_with_fallback(user_env, override=True)
loaded.append(user_env)
+405 -39
View File
@@ -10,6 +10,7 @@ import shutil
import signal
import subprocess
import sys
import time
from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -37,6 +38,10 @@ from hermes_cli.setup import (
from hermes_cli.colors import Colors, color
_SERVICE_READINESS_TIMEOUT = 30.0
_SERVICE_READINESS_POLL_INTERVAL = 0.2
# =============================================================================
# Process Management (for manual gateway runs)
# =============================================================================
@@ -331,7 +336,7 @@ def is_linux() -> bool:
return sys.platform.startswith('linux')
from hermes_constants import is_termux, is_wsl
from hermes_constants import is_container, is_termux, is_wsl
def _wsl_systemd_operational() -> bool:
@@ -353,7 +358,9 @@ def _wsl_systemd_operational() -> bool:
def supports_systemd_services() -> bool:
if not is_linux() or is_termux():
if not is_linux() or is_termux() or is_container():
return False
if shutil.which("systemctl") is None:
return False
if is_wsl():
return _wsl_systemd_operational()
@@ -483,6 +490,21 @@ def _journalctl_cmd(system: bool = False) -> list[str]:
return ["journalctl"] if system else ["journalctl", "--user"]
def _run_systemctl(args: list[str], *, system: bool = False, **kwargs) -> subprocess.CompletedProcess:
"""Run a systemctl command, raising RuntimeError if systemctl is missing.
Defense-in-depth: callers are gated by ``supports_systemd_services()``,
but this ensures any future caller that bypasses the gate still gets a
clear error instead of a raw ``FileNotFoundError`` traceback.
"""
try:
return subprocess.run(_systemctl_cmd(system) + args, **kwargs)
except FileNotFoundError:
raise RuntimeError(
"systemctl is not available on this system"
) from None
def _service_scope_label(system: bool = False) -> str:
return "system" if system else "user"
@@ -751,14 +773,22 @@ def _remap_path_for_user(path: str, target_home_dir: str) -> str:
/root/.hermes/hermes-agent -> /home/alice/.hermes/hermes-agent
/opt/hermes -> /opt/hermes (kept as-is)
Note: this function intentionally does NOT resolve symlinks. A venv's
``bin/python`` is typically a symlink to the base interpreter (e.g. a
uv-managed CPython at ``~/.local/share/uv/python/.../python3.11``);
resolving that symlink swaps the unit's ``ExecStart`` to a bare Python
that has none of the venv's site-packages, so the service crashes on
the first ``import``. Keep the symlinked path so the venv activates
its own environment. Lexical expansion only via ``expanduser``.
"""
current_home = Path.home().resolve()
resolved = Path(path).resolve()
current_home = Path.home()
p = Path(path).expanduser()
try:
relative = resolved.relative_to(current_home)
relative = p.relative_to(current_home)
return str(Path(target_home_dir) / relative)
except ValueError:
return str(resolved)
return str(p)
def _hermes_home_for_target_user(target_home_dir: str) -> str:
@@ -929,7 +959,7 @@ def refresh_systemd_unit_if_needed(system: bool = False) -> bool:
expected_user = _read_systemd_user_from_unit(unit_path) if system else None
unit_path.write_text(generate_systemd_unit(system=system, run_as_user=expected_user), encoding="utf-8")
subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True, timeout=30)
_run_systemctl(["daemon-reload"], system=system, check=True, timeout=30)
print(f"↻ Updated gateway {_service_scope_label(system)} service definition to match the current Hermes install")
return True
@@ -1025,7 +1055,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
if not systemd_unit_is_current(system=system):
print(f"↻ Repairing outdated {_service_scope_label(system)} systemd service at: {unit_path}")
refresh_systemd_unit_if_needed(system=system)
subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True, timeout=30)
_run_systemctl(["enable", get_service_name()], system=system, check=True, timeout=30)
print(f"{_service_scope_label(system).capitalize()} service definition updated")
return
print(f"Service already installed at: {unit_path}")
@@ -1036,8 +1066,8 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
print(f"Installing {_service_scope_label(system)} systemd service to: {unit_path}")
unit_path.write_text(generate_systemd_unit(system=system, run_as_user=run_as_user), encoding="utf-8")
subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True, timeout=30)
subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True, timeout=30)
_run_systemctl(["daemon-reload"], system=system, check=True, timeout=30)
_run_systemctl(["enable", get_service_name()], system=system, check=True, timeout=30)
print()
print(f"{_service_scope_label(system).capitalize()} service installed and enabled!")
@@ -1063,24 +1093,135 @@ def systemd_uninstall(system: bool = False):
if system:
_require_root_for_system_service("uninstall")
subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=False, timeout=90)
subprocess.run(_systemctl_cmd(system) + ["disable", get_service_name()], check=False, timeout=30)
_run_systemctl(["stop", get_service_name()], system=system, check=False, timeout=90)
_run_systemctl(["disable", get_service_name()], system=system, check=False, timeout=30)
unit_path = get_systemd_unit_path(system=system)
if unit_path.exists():
unit_path.unlink()
print(f"✓ Removed {unit_path}")
subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True, timeout=30)
_run_systemctl(["daemon-reload"], system=system, check=True, timeout=30)
print(f"{_service_scope_label(system).capitalize()} service uninstalled")
def _describe_startup_check(check_id: str, check: dict) -> str:
source = check.get("source")
detail = check.get("detail")
label = f"{check_id} ({source})" if source and source != check_id else check_id
return f"{label}: {detail}" if detail else label
def _classify_startup_checks(state: dict | None) -> tuple[list[str], list[str], list[str]]:
checks = (state or {}).get("startup_checks") or {}
pending_required: list[str] = []
failed_required: list[str] = []
optional_warnings: list[str] = []
if not isinstance(checks, dict):
return pending_required, failed_required, optional_warnings
for check_id, raw_check in checks.items():
check = raw_check if isinstance(raw_check, dict) else {}
label = _describe_startup_check(str(check_id), check)
check_state = str(check.get("state", "pending")).strip().lower()
required = bool(check.get("required", True))
if check_state == "ready":
continue
if required:
if check_state == "failed":
failed_required.append(label)
else:
pending_required.append(label)
else:
prefix = "failed" if check_state == "failed" else "pending"
optional_warnings.append(f"{prefix}: {label}")
return pending_required, failed_required, optional_warnings
def _wait_for_service_readiness(
*,
action: str,
previous_pid: int | None = None,
timeout: float = _SERVICE_READINESS_TIMEOUT,
poll_interval: float = _SERVICE_READINESS_POLL_INTERVAL,
) -> list[str]:
from gateway.status import get_running_pid, read_runtime_status
deadline = time.monotonic() + timeout
last_pending: list[str] = []
while time.monotonic() < deadline:
live_pid = get_running_pid()
if live_pid is None or (previous_pid is not None and live_pid == previous_pid):
time.sleep(poll_interval)
continue
runtime = read_runtime_status() or {}
try:
runtime_pid = int(runtime.get("pid"))
except (TypeError, ValueError):
runtime_pid = None
if runtime_pid != live_pid:
time.sleep(poll_interval)
continue
gateway_state = runtime.get("gateway_state")
pending_required, failed_required, optional_warnings = _classify_startup_checks(runtime)
last_pending = pending_required
if gateway_state == "startup_failed":
reason = runtime.get("exit_reason") or f"gateway {action} failed during startup"
raise RuntimeError(reason)
if failed_required:
raise RuntimeError(
"required startup checks failed: " + "; ".join(failed_required)
)
if gateway_state == "running" and not pending_required:
return optional_warnings
time.sleep(poll_interval)
if last_pending:
raise RuntimeError(
"timed out waiting for required startup checks: " + "; ".join(last_pending)
)
if previous_pid is not None:
raise RuntimeError(
f"timed out waiting for gateway {action}; previous process is still active or no new runtime became ready"
)
raise RuntimeError(f"timed out waiting for gateway {action} readiness")
def _await_service_ready_or_exit(
*,
action: str,
previous_pid: int | None = None,
timeout: float = _SERVICE_READINESS_TIMEOUT,
) -> None:
try:
optional_warnings = _wait_for_service_readiness(
action=action,
previous_pid=previous_pid,
timeout=timeout,
)
except RuntimeError as exc:
print_error(f" Gateway {action} did not become ready: {exc}")
raise SystemExit(1) from exc
for warning in optional_warnings:
print_warning(f" Optional startup check {warning}")
def systemd_start(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("start")
refresh_systemd_unit_if_needed(system=system)
subprocess.run(_systemctl_cmd(system) + ["start", get_service_name()], check=True, timeout=30)
_run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
_await_service_ready_or_exit(action="start")
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -1089,7 +1230,7 @@ def systemd_stop(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("stop")
subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=True, timeout=90)
_run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
print(f"{_service_scope_label(system).capitalize()} service stopped")
@@ -1103,9 +1244,11 @@ def systemd_restart(system: bool = False):
pid = get_running_pid()
if pid is not None and _request_gateway_self_restart(pid):
print(f"{_service_scope_label(system).capitalize()} service restart requested")
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print(f"{_service_scope_label(system).capitalize()} service restarted")
return
subprocess.run(_systemctl_cmd(system) + ["reload-or-restart", get_service_name()], check=True, timeout=90)
_run_systemctl(["reload-or-restart", get_service_name()], system=system, check=True, timeout=90)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print(f"{_service_scope_label(system).capitalize()} service restarted")
@@ -1129,14 +1272,16 @@ def systemd_status(deep: bool = False, system: bool = False):
print(f" Run: {'sudo ' if system else ''}hermes gateway restart{scope_flag} # auto-refreshes the unit")
print()
subprocess.run(
_systemctl_cmd(system) + ["status", get_service_name(), "--no-pager"],
_run_systemctl(
["status", get_service_name(), "--no-pager"],
system=system,
capture_output=False,
timeout=10,
)
result = subprocess.run(
_systemctl_cmd(system) + ["is-active", get_service_name()],
result = _run_systemctl(
["is-active", get_service_name()],
system=system,
capture_output=True,
text=True,
timeout=10,
@@ -1362,6 +1507,7 @@ def launchd_start():
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
_await_service_ready_or_exit(action="start")
print("✓ Service started")
return
@@ -1374,6 +1520,7 @@ def launchd_start():
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
_await_service_ready_or_exit(action="start")
print("✓ Service started")
def launchd_stop():
@@ -1444,7 +1591,8 @@ def launchd_restart():
try:
pid = get_running_pid()
if pid is not None and _request_gateway_self_restart(pid):
print("✓ Service restart requested")
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print("✓ Service restarted")
return
if pid is not None:
try:
@@ -1456,6 +1604,7 @@ def launchd_restart():
if not exited:
print(f"⚠ Gateway drain timed out after {drain_timeout:.0f}s — forcing launchd restart")
subprocess.run(["launchctl", "kickstart", "-k", target], check=True, timeout=90)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print("✓ Service restarted")
except subprocess.CalledProcessError as e:
if e.returncode not in (3, 113):
@@ -1465,6 +1614,7 @@ def launchd_restart():
plist_path = get_launchd_plist_path()
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
subprocess.run(["launchctl", "kickstart", target], check=True, timeout=30)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print("✓ Service restarted")
def launchd_status(deep: bool = False):
@@ -1607,7 +1757,7 @@ _PLATFORMS = [
" Create an App-Level Token with scope: connections:write → copy xapp-... token",
"3. Add Bot Token Scopes: Features → OAuth & Permissions → Scopes",
" Required: chat:write, app_mentions:read, channels:history, channels:read,",
" groups:history, im:history, im:read, im:write, users:read, files:write",
" groups:history, im:history, im:read, im:write, users:read, files:read, files:write",
"4. Subscribe to Events: Features → Event Subscriptions → Enable",
" Required events: message.im, message.channels, app_mention",
" Optional: message.groups (for private channels)",
@@ -1886,6 +2036,29 @@ _PLATFORMS = [
"help": "Phone number or Apple ID to deliver cron results and notifications to."},
],
},
{
"key": "qqbot",
"label": "QQ Bot",
"emoji": "🐧",
"token_var": "QQ_APP_ID",
"setup_instructions": [
"1. Register a QQ Bot application at q.qq.com",
"2. Note your App ID and App Secret from the application page",
"3. Enable the required intents (C2C, Group, Guild messages)",
"4. Configure sandbox or publish the bot",
],
"vars": [
{"name": "QQ_APP_ID", "prompt": "QQ Bot App ID", "password": False,
"help": "Your QQ Bot App ID from q.qq.com."},
{"name": "QQ_CLIENT_SECRET", "prompt": "QQ Bot App Secret", "password": True,
"help": "Your QQ Bot App Secret from q.qq.com."},
{"name": "QQ_ALLOWED_USERS", "prompt": "Allowed user OpenIDs (comma-separated, leave empty for open access)", "password": False,
"is_allowlist": True,
"help": "Optional — restrict DM access to specific user OpenIDs."},
{"name": "QQ_HOME_CHANNEL", "prompt": "Home channel (user/group OpenID for cron delivery, or empty)", "password": False,
"help": "OpenID to deliver cron results and notifications to."},
],
},
]
@@ -2100,12 +2273,6 @@ def _setup_dingtalk():
_setup_standard_platform(dingtalk_platform)
def _setup_feishu():
"""Configure Feishu / Lark via the standard platform setup."""
feishu_platform = next(p for p in _PLATFORMS if p["key"] == "feishu")
_setup_standard_platform(feishu_platform)
def _setup_wecom():
"""Configure WeCom (Enterprise WeChat) via the standard platform setup."""
wecom_platform = next(p for p in _PLATFORMS if p["key"] == "wecom")
@@ -2129,24 +2296,24 @@ def _is_service_running() -> bool:
if user_unit_exists:
try:
result = subprocess.run(
_systemctl_cmd(False) + ["is-active", get_service_name()],
capture_output=True, text=True, timeout=10,
result = _run_systemctl(
["is-active", get_service_name()],
system=False, capture_output=True, text=True, timeout=10,
)
if result.stdout.strip() == "active":
return True
except subprocess.TimeoutExpired:
except (RuntimeError, subprocess.TimeoutExpired):
pass
if system_unit_exists:
try:
result = subprocess.run(
_systemctl_cmd(True) + ["is-active", get_service_name()],
capture_output=True, text=True, timeout=10,
result = _run_systemctl(
["is-active", get_service_name()],
system=True, capture_output=True, text=True, timeout=10,
)
if result.stdout.strip() == "active":
return True
except subprocess.TimeoutExpired:
except (RuntimeError, subprocess.TimeoutExpired):
pass
return False
@@ -2290,6 +2457,178 @@ def _setup_weixin():
print_info(f" User ID: {user_id}")
def _setup_feishu():
"""Interactive setup for Feishu / Lark — scan-to-create or manual credentials."""
print()
print(color(" ─── 🪽 Feishu / Lark Setup ───", Colors.CYAN))
existing_app_id = get_env_value("FEISHU_APP_ID")
existing_secret = get_env_value("FEISHU_APP_SECRET")
if existing_app_id and existing_secret:
print()
print_success("Feishu / Lark is already configured.")
if not prompt_yes_no(" Reconfigure Feishu / Lark?", False):
return
# ── Choose setup method ──
print()
method_choices = [
"Scan QR code to create a new bot automatically (recommended)",
"Enter existing App ID and App Secret manually",
]
method_idx = prompt_choice(" How would you like to set up Feishu / Lark?", method_choices, 0)
credentials = None
used_qr = False
if method_idx == 0:
# ── QR scan-to-create ──
try:
from gateway.platforms.feishu import qr_register
except Exception as exc:
print_error(f" Feishu / Lark onboard import failed: {exc}")
qr_register = None
if qr_register is not None:
try:
credentials = qr_register()
except KeyboardInterrupt:
print()
print_warning(" Feishu / Lark setup cancelled.")
return
except Exception as exc:
print_warning(f" QR registration failed: {exc}")
if credentials:
used_qr = True
if not credentials:
print_info(" QR setup did not complete. Continuing with manual input.")
# ── Manual credential input ──
if not credentials:
print()
print_info(" Go to https://open.feishu.cn/ (or https://open.larksuite.com/ for Lark)")
print_info(" Create an app, enable the Bot capability, and copy the credentials.")
print()
app_id = prompt(" App ID", password=False)
if not app_id:
print_warning(" Skipped — Feishu / Lark won't work without an App ID.")
return
app_secret = prompt(" App Secret", password=True)
if not app_secret:
print_warning(" Skipped — Feishu / Lark won't work without an App Secret.")
return
domain_choices = ["feishu (China)", "lark (International)"]
domain_idx = prompt_choice(" Domain", domain_choices, 0)
domain = "lark" if domain_idx == 1 else "feishu"
# Try to probe the bot with manual credentials
bot_name = None
try:
from gateway.platforms.feishu import probe_bot
bot_info = probe_bot(app_id, app_secret, domain)
if bot_info:
bot_name = bot_info.get("bot_name")
print_success(f" Credentials verified — bot: {bot_name or 'unnamed'}")
else:
print_warning(" Could not verify bot connection. Credentials saved anyway.")
except Exception as exc:
print_warning(f" Credential verification skipped: {exc}")
credentials = {
"app_id": app_id,
"app_secret": app_secret,
"domain": domain,
"open_id": None,
"bot_name": bot_name,
}
# ── Save core credentials ──
app_id = credentials["app_id"]
app_secret = credentials["app_secret"]
domain = credentials.get("domain", "feishu")
open_id = credentials.get("open_id")
bot_name = credentials.get("bot_name")
save_env_value("FEISHU_APP_ID", app_id)
save_env_value("FEISHU_APP_SECRET", app_secret)
save_env_value("FEISHU_DOMAIN", domain)
# Bot identity is resolved at runtime via _hydrate_bot_identity().
# ── Connection mode ──
if used_qr:
connection_mode = "websocket"
else:
print()
mode_choices = [
"WebSocket (recommended — no public URL needed)",
"Webhook (requires a reachable HTTP endpoint)",
]
mode_idx = prompt_choice(" Connection mode", mode_choices, 0)
connection_mode = "webhook" if mode_idx == 1 else "websocket"
if connection_mode == "webhook":
print_info(" Webhook defaults: 127.0.0.1:8765/feishu/webhook")
print_info(" Override with FEISHU_WEBHOOK_HOST / FEISHU_WEBHOOK_PORT / FEISHU_WEBHOOK_PATH")
print_info(" For signature verification, set FEISHU_ENCRYPT_KEY and FEISHU_VERIFICATION_TOKEN")
save_env_value("FEISHU_CONNECTION_MODE", connection_mode)
if bot_name:
print()
print_success(f" Bot created: {bot_name}")
# ── DM security policy ──
print()
access_choices = [
"Use DM pairing approval (recommended)",
"Allow all direct messages",
"Only allow listed user IDs",
]
access_idx = prompt_choice(" How should direct messages be authorized?", access_choices, 0)
if access_idx == 0:
save_env_value("FEISHU_ALLOW_ALL_USERS", "false")
save_env_value("FEISHU_ALLOWED_USERS", "")
print_success(" DM pairing enabled.")
print_info(" Unknown users can request access; approve with `hermes pairing approve`.")
elif access_idx == 1:
save_env_value("FEISHU_ALLOW_ALL_USERS", "true")
save_env_value("FEISHU_ALLOWED_USERS", "")
print_warning(" Open DM access enabled for Feishu / Lark.")
else:
save_env_value("FEISHU_ALLOW_ALL_USERS", "false")
default_allow = open_id or ""
allowlist = prompt(" Allowed user IDs (comma-separated)", default_allow, password=False).replace(" ", "")
save_env_value("FEISHU_ALLOWED_USERS", allowlist)
print_success(" Allowlist saved.")
# ── Group policy ──
print()
group_choices = [
"Respond only when @mentioned in groups (recommended)",
"Disable group chats",
]
group_idx = prompt_choice(" How should group chats be handled?", group_choices, 0)
if group_idx == 0:
save_env_value("FEISHU_GROUP_POLICY", "open")
print_info(" Group chats enabled (bot must be @mentioned).")
else:
save_env_value("FEISHU_GROUP_POLICY", "disabled")
print_info(" Group chats disabled.")
# ── Home channel ──
print()
home_channel = prompt(" Home chat ID (optional, for cron/notifications)", password=False)
if home_channel:
save_env_value("FEISHU_HOME_CHANNEL", home_channel)
print_success(f" Home channel set to {home_channel}")
print()
print_success("🪽 Feishu / Lark configured!")
print_info(f" App ID: {app_id}")
print_info(f" Domain: {domain}")
if bot_name:
print_info(f" Bot: {bot_name}")
def _setup_signal():
"""Interactive setup for Signal messenger."""
import shutil
@@ -2467,6 +2806,8 @@ def gateway_setup():
_setup_signal()
elif platform["key"] == "weixin":
_setup_weixin()
elif platform["key"] == "feishu":
_setup_feishu()
else:
_setup_standard_platform(platform)
@@ -2606,6 +2947,15 @@ def gateway_command(args):
print(" tmux new -s hermes 'hermes gateway run' # persistent via tmux")
print(" nohup hermes gateway run > ~/.hermes/logs/gateway.log 2>&1 & # background")
sys.exit(1)
elif is_container():
print("Service installation is not needed inside a Docker container.")
print("The container runtime is your service manager — use Docker restart policies instead:")
print()
print(" docker run --restart unless-stopped ... # auto-restart on crash/reboot")
print(" docker restart <container> # manual restart")
print()
print("To run the gateway: hermes gateway run")
sys.exit(0)
else:
print("Service installation not supported on this platform.")
print("Run manually: hermes gateway run")
@@ -2624,10 +2974,17 @@ def gateway_command(args):
systemd_uninstall(system=system)
elif is_macos():
launchd_uninstall()
elif is_container():
print("Service uninstall is not applicable inside a Docker container.")
print("To stop the gateway, stop or remove the container:")
print()
print(" docker stop <container>")
print(" docker rm <container>")
sys.exit(0)
else:
print("Not supported on this platform.")
sys.exit(1)
elif subcmd == "start":
system = getattr(args, 'system', False)
if is_termux():
@@ -2648,10 +3005,19 @@ def gateway_command(args):
print()
print("To enable systemd: add systemd=true to /etc/wsl.conf and run 'wsl --shutdown' from PowerShell.")
sys.exit(1)
elif is_container():
print("Service start is not applicable inside a Docker container.")
print("The gateway runs as the container's main process.")
print()
print(" docker start <container> # start a stopped container")
print(" docker restart <container> # restart a running container")
print()
print("Or run the gateway directly: hermes gateway run")
sys.exit(0)
else:
print("Not supported on this platform.")
sys.exit(1)
elif subcmd == "stop":
stop_all = getattr(args, 'all', False)
system = getattr(args, 'system', False)
+64 -9
View File
@@ -1,16 +1,18 @@
"""``hermes logs`` — view and filter Hermes log files.
Supports tailing, following, session filtering, level filtering, and
relative time ranges. All log files live under ``~/.hermes/logs/``.
Supports tailing, following, session filtering, level filtering,
component filtering, and relative time ranges. All log files live
under ``~/.hermes/logs/``.
Usage examples::
hermes logs # last 50 lines of agent.log
hermes logs -f # follow agent.log in real time
hermes logs errors # last 50 lines of errors.log
hermes logs gateway -n 100 # last 100 lines of gateway.log
hermes logs gateway -n 100 # last 100 lines of gateway.log
hermes logs --level WARNING # only WARNING+ lines
hermes logs --session abc123 # filter by session ID substring
hermes logs --component tools # only tool-related lines
hermes logs --since 1h # lines from the last hour
hermes logs --since 30m -f # follow, starting 30 min ago
"""
@@ -20,7 +22,7 @@ import sys
import time
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional
from typing import Optional, Sequence
from hermes_constants import get_hermes_home, display_hermes_home
@@ -38,6 +40,15 @@ _TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})")
# Level extraction — matches " INFO ", " WARNING ", " ERROR ", " DEBUG ", " CRITICAL "
_LEVEL_RE = re.compile(r"\s(DEBUG|INFO|WARNING|ERROR|CRITICAL)\s")
# Logger name extraction — after level and optional session tag, the next
# non-space token before ":" is the logger name.
# Matches: "INFO gateway.run:" or "INFO [sess_abc] tools.terminal_tool:"
_LOGGER_NAME_RE = re.compile(
r"\s(?:DEBUG|INFO|WARNING|ERROR|CRITICAL)" # level
r"(?:\s+\[.*?\])?" # optional session tag
r"\s+(\S+):" # logger name
)
# Level ordering for >= filtering
_LEVEL_ORDER = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3, "CRITICAL": 4}
@@ -79,12 +90,27 @@ def _extract_level(line: str) -> Optional[str]:
return m.group(1) if m else None
def _extract_logger_name(line: str) -> Optional[str]:
"""Extract the logger name from a log line."""
m = _LOGGER_NAME_RE.search(line)
return m.group(1) if m else None
def _line_matches_component(line: str, prefixes: Sequence[str]) -> bool:
"""Check if a log line's logger name starts with any of *prefixes*."""
name = _extract_logger_name(line)
if name is None:
return False
return name.startswith(tuple(prefixes))
def _matches_filters(
line: str,
*,
min_level: Optional[str] = None,
session_filter: Optional[str] = None,
since: Optional[datetime] = None,
component_prefixes: Optional[Sequence[str]] = None,
) -> bool:
"""Check if a log line passes all active filters."""
if since is not None:
@@ -102,6 +128,10 @@ def _matches_filters(
if session_filter not in line:
return False
if component_prefixes is not None:
if not _line_matches_component(line, component_prefixes):
return False
return True
@@ -113,6 +143,7 @@ def tail_log(
level: Optional[str] = None,
session: Optional[str] = None,
since: Optional[str] = None,
component: Optional[str] = None,
) -> None:
"""Read and display log lines, optionally following in real time.
@@ -130,6 +161,8 @@ def tail_log(
Session ID substring to filter on.
since
Relative time string (e.g. ``"1h"``, ``"30m"``).
component
Component name to filter by (e.g. ``"gateway"``, ``"tools"``).
"""
filename = LOG_FILES.get(log_name)
if filename is None:
@@ -155,13 +188,29 @@ def tail_log(
print(f"Invalid --level: {level!r}. Use DEBUG, INFO, WARNING, ERROR, or CRITICAL.")
sys.exit(1)
has_filters = min_level is not None or session is not None or since_dt is not None
# Resolve component to logger name prefixes
component_prefixes = None
if component:
from hermes_logging import COMPONENT_PREFIXES
component_lower = component.lower()
if component_lower not in COMPONENT_PREFIXES:
available = ", ".join(sorted(COMPONENT_PREFIXES))
print(f"Unknown component: {component!r}. Available: {available}")
sys.exit(1)
component_prefixes = COMPONENT_PREFIXES[component_lower]
has_filters = (
min_level is not None
or session is not None
or since_dt is not None
or component_prefixes is not None
)
# Read and display the tail
try:
lines = _read_tail(log_path, num_lines, has_filters=has_filters,
min_level=min_level, session_filter=session,
since=since_dt)
since=since_dt, component_prefixes=component_prefixes)
except PermissionError:
print(f"Permission denied: {log_path}")
sys.exit(1)
@@ -172,6 +221,8 @@ def tail_log(
filter_parts.append(f"level>={min_level}")
if session:
filter_parts.append(f"session={session}")
if component:
filter_parts.append(f"component={component}")
if since:
filter_parts.append(f"since={since}")
filter_desc = f" [{', '.join(filter_parts)}]" if filter_parts else ""
@@ -190,7 +241,7 @@ def tail_log(
# Follow mode — poll for new content
try:
_follow_log(log_path, min_level=min_level, session_filter=session,
since=since_dt)
since=since_dt, component_prefixes=component_prefixes)
except KeyboardInterrupt:
print("\n--- stopped ---")
@@ -203,6 +254,7 @@ def _read_tail(
min_level: Optional[str] = None,
session_filter: Optional[str] = None,
since: Optional[datetime] = None,
component_prefixes: Optional[Sequence[str]] = None,
) -> list:
"""Read the last *num_lines* matching lines from a log file.
@@ -215,7 +267,8 @@ def _read_tail(
filtered = [
l for l in raw_lines
if _matches_filters(l, min_level=min_level,
session_filter=session_filter, since=since)
session_filter=session_filter, since=since,
component_prefixes=component_prefixes)
]
return filtered[-num_lines:]
else:
@@ -284,6 +337,7 @@ def _follow_log(
min_level: Optional[str] = None,
session_filter: Optional[str] = None,
since: Optional[datetime] = None,
component_prefixes: Optional[Sequence[str]] = None,
) -> None:
"""Poll a log file for new content and print matching lines."""
with open(path, "r", encoding="utf-8", errors="replace") as f:
@@ -293,7 +347,8 @@ def _follow_log(
line = f.readline()
if line:
if _matches_filters(line, min_level=min_level,
session_filter=session_filter, since=since):
session_filter=session_filter, since=since,
component_prefixes=component_prefixes):
print(line, end="")
sys.stdout.flush()
else:
+508 -130
View File
@@ -151,6 +151,18 @@ try:
except Exception:
pass # best-effort — don't crash the CLI if logging setup fails
# Apply IPv4 preference early, before any HTTP clients are created.
try:
from hermes_cli.config import load_config as _load_config_early
from hermes_constants import apply_ipv4_preference as _apply_ipv4
_early_cfg = _load_config_early()
_net = _early_cfg.get("network", {})
if isinstance(_net, dict) and _net.get("force_ipv4"):
_apply_ipv4(force=True)
del _early_cfg, _net
except Exception:
pass # best-effort — don't crash if config isn't available yet
import logging
import time as _time
from datetime import datetime
@@ -528,6 +540,113 @@ def _resolve_last_cli_session() -> Optional[str]:
return None
def _probe_container(cmd: list, backend: str, via_sudo: bool = False):
"""Run a container inspect probe, returning the CompletedProcess.
Catches TimeoutExpired specifically for a human-readable message;
all other exceptions propagate naturally.
"""
try:
return subprocess.run(cmd, capture_output=True, text=True, timeout=15)
except subprocess.TimeoutExpired:
label = f"sudo {backend}" if via_sudo else backend
print(
f"Error: timed out waiting for {label} to respond.\n"
f"The {backend} daemon may be unresponsive or starting up.",
file=sys.stderr,
)
sys.exit(1)
def _exec_in_container(container_info: dict, cli_args: list):
"""Replace the current process with a command inside the managed container.
Probes whether sudo is needed (rootful containers), then os.execvp
into the container. On success the Python process is replaced entirely
and the container's exit code becomes the process exit code (OS semantics).
On failure, OSError propagates naturally.
Args:
container_info: dict with backend, container_name, exec_user, hermes_bin
cli_args: the original CLI arguments (everything after 'hermes')
"""
import shutil
backend = container_info["backend"]
container_name = container_info["container_name"]
exec_user = container_info["exec_user"]
hermes_bin = container_info["hermes_bin"]
runtime = shutil.which(backend)
if not runtime:
print(f"Error: {backend} not found on PATH. Cannot route to container.",
file=sys.stderr)
sys.exit(1)
# Rootful containers (NixOS systemd service) are invisible to unprivileged
# users — Podman uses per-user namespaces, Docker needs group access.
# Probe whether the runtime can see the container; if not, try via sudo.
sudo_path = None
probe = _probe_container(
[runtime, "inspect", "--format", "ok", container_name], backend,
)
if probe.returncode != 0:
sudo_path = shutil.which("sudo")
if sudo_path:
probe2 = _probe_container(
[sudo_path, "-n", runtime, "inspect", "--format", "ok", container_name],
backend, via_sudo=True,
)
if probe2.returncode != 0:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"\n"
f"The container is likely running as root. Your user cannot see it\n"
f"because {backend} uses per-user namespaces. Grant passwordless\n"
f"sudo for {backend} — the -n (non-interactive) flag is required\n"
f"because a password prompt would hang or break piped commands.\n"
f"\n"
f"On NixOS:\n"
f"\n"
f' security.sudo.extraRules = [{{\n'
f' users = [ "{os.getenv("USER", "your-user")}" ];\n'
f' commands = [{{ command = "{runtime}"; options = [ "NOPASSWD" ]; }}];\n'
f' }}];\n'
f"\n"
f"Or run: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
else:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"The container may be running under root. Try: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
is_tty = sys.stdin.isatty()
tty_flags = ["-it"] if is_tty else ["-i"]
env_flags = []
for var in ("TERM", "COLORTERM", "LANG", "LC_ALL"):
val = os.environ.get(var)
if val:
env_flags.extend(["-e", f"{var}={val}"])
cmd_prefix = [sudo_path, "-n", runtime] if sudo_path else [runtime]
exec_cmd = (
cmd_prefix + ["exec"]
+ tty_flags
+ ["-u", exec_user]
+ env_flags
+ [container_name, hermes_bin]
+ cli_args
)
os.execvp(exec_cmd[0], exec_cmd)
def _resolve_session_by_name_or_id(name_or_id: str) -> Optional[str]:
"""Resolve a session name (title) or ID to a session ID.
@@ -880,7 +999,7 @@ def select_provider_and_model(args=None):
from hermes_cli.auth import (
resolve_provider, AuthError, format_auth_error,
)
from hermes_cli.config import load_config, get_env_value
from hermes_cli.config import get_compatible_custom_providers, load_config, get_env_value
config = load_config()
current_model = config.get("model")
@@ -915,28 +1034,9 @@ def select_provider_and_model(args=None):
if active == "openrouter" and get_env_value("OPENAI_BASE_URL"):
active = "custom"
provider_labels = {
"openrouter": "OpenRouter",
"nous": "Nous Portal",
"openai-codex": "OpenAI Codex",
"qwen-oauth": "Qwen OAuth",
"copilot-acp": "GitHub Copilot ACP",
"copilot": "GitHub Copilot",
"anthropic": "Anthropic",
"gemini": "Google AI Studio",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
"minimax-cn": "MiniMax (China)",
"opencode-zen": "OpenCode Zen",
"opencode-go": "OpenCode Go",
"ai-gateway": "AI Gateway",
"kilocode": "Kilo Code",
"alibaba": "Alibaba Cloud (DashScope)",
"huggingface": "Hugging Face",
"xiaomi": "Xiaomi MiMo",
"custom": "Custom endpoint",
}
from hermes_cli.models import CANONICAL_PROVIDERS, _PROVIDER_LABELS
provider_labels = dict(_PROVIDER_LABELS) # derive from canonical list
active_label = provider_labels.get(active, active) if active else "none"
print()
@@ -944,38 +1044,12 @@ def select_provider_and_model(args=None):
print(f" Active provider: {active_label}")
print()
# Step 1: Provider selection — top providers shown first, rest behind "More..."
top_providers = [
("nous", "Nous Portal (Nous Research subscription)"),
("openrouter", "OpenRouter (100+ models, pay-per-use)"),
("anthropic", "Anthropic (Claude models — API key or Claude Code)"),
("openai-codex", "OpenAI Codex"),
("qwen-oauth", "Qwen OAuth (reuses local Qwen CLI login)"),
("copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
("huggingface", "Hugging Face Inference Providers (20+ open models)"),
]
extended_providers = [
("copilot-acp", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
("gemini", "Google AI Studio (Gemini models — OpenAI-compatible endpoint)"),
("zai", "Z.AI / GLM (Zhipu AI direct API)"),
("kimi-coding", "Kimi / Moonshot (Moonshot AI direct API)"),
("minimax", "MiniMax (global direct API)"),
("minimax-cn", "MiniMax China (domestic direct API)"),
("kilocode", "Kilo Code (Kilo Gateway API)"),
("opencode-zen", "OpenCode Zen (35+ curated models, pay-as-you-go)"),
("opencode-go", "OpenCode Go (open models, $10/month subscription)"),
("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
("alibaba", "Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
("xiaomi", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
]
# Step 1: Provider selection — flat list from CANONICAL_PROVIDERS
all_providers = [(p.slug, p.tui_desc) for p in CANONICAL_PROVIDERS]
def _named_custom_provider_map(cfg) -> dict[str, dict[str, str]]:
custom_providers_cfg = cfg.get("custom_providers") or []
custom_provider_map = {}
if not isinstance(custom_providers_cfg, list):
return custom_provider_map
for entry in custom_providers_cfg:
for entry in get_compatible_custom_providers(cfg):
if not isinstance(entry, dict):
continue
name = (entry.get("name") or "").strip()
@@ -983,11 +1057,20 @@ def select_provider_and_model(args=None):
if not name or not base_url:
continue
key = "custom:" + name.lower().replace(" ", "-")
provider_key = (entry.get("provider_key") or "").strip()
if provider_key:
try:
resolve_provider(provider_key)
except AuthError:
key = provider_key
custom_provider_map[key] = {
"name": name,
"base_url": base_url,
"api_key": entry.get("api_key", ""),
"key_env": entry.get("key_env", ""),
"model": entry.get("model", ""),
"api_mode": entry.get("api_mode", ""),
"provider_key": provider_key,
}
return custom_provider_map
@@ -999,29 +1082,22 @@ def select_provider_and_model(args=None):
short_url = base_url.replace("https://", "").replace("http://", "").rstrip("/")
saved_model = provider_info.get("model", "")
model_hint = f"{saved_model}" if saved_model else ""
top_providers.append((key, f"{name} ({short_url}){model_hint}"))
all_providers.append((key, f"{name} ({short_url}){model_hint}"))
top_keys = {k for k, _ in top_providers}
extended_keys = {k for k, _ in extended_providers}
# If the active provider is in the extended list, promote it into top
if active and active in extended_keys:
promoted = [(k, l) for k, l in extended_providers if k == active]
extended_providers = [(k, l) for k, l in extended_providers if k != active]
top_providers = promoted + top_providers
top_keys.add(active)
# Build the primary menu
# Build the menu
ordered = []
default_idx = 0
for key, label in top_providers:
for key, label in all_providers:
if active and key == active:
ordered.append((key, f"{label} ← currently active"))
default_idx = len(ordered) - 1
else:
ordered.append((key, label))
ordered.append(("more", "More providers..."))
ordered.append(("custom", "Custom endpoint (enter URL manually)"))
_has_saved_custom_list = isinstance(config.get("custom_providers"), list) and bool(config.get("custom_providers"))
if _has_saved_custom_list:
ordered.append(("remove-custom", "Remove a saved custom provider"))
ordered.append(("cancel", "Cancel"))
provider_idx = _prompt_provider_choice(
@@ -1033,22 +1109,6 @@ def select_provider_and_model(args=None):
selected_provider = ordered[provider_idx][0]
# "More providers..." — show the extended list
if selected_provider == "more":
ext_ordered = list(extended_providers)
ext_ordered.append(("custom", "Custom endpoint (enter URL manually)"))
if _custom_provider_map:
ext_ordered.append(("remove-custom", "Remove a saved custom provider"))
ext_ordered.append(("cancel", "Cancel"))
ext_idx = _prompt_provider_choice(
[label for _, label in ext_ordered], default=0,
)
if ext_idx is None or ext_ordered[ext_idx][0] == "cancel":
print("No change.")
return
selected_provider = ext_ordered[ext_idx][0]
# Step 2: Provider-specific setup + model selection
if selected_provider == "openrouter":
_model_flow_openrouter(config, current_model)
@@ -1064,7 +1124,7 @@ def select_provider_and_model(args=None):
_model_flow_copilot(config, current_model)
elif selected_provider == "custom":
_model_flow_custom(config)
elif selected_provider.startswith("custom:"):
elif selected_provider.startswith("custom:") or selected_provider in _custom_provider_map:
provider_info = _named_custom_provider_map(load_config()).get(selected_provider)
if provider_info is None:
print(
@@ -1079,7 +1139,7 @@ def select_provider_and_model(args=None):
_model_flow_anthropic(config, current_model)
elif selected_provider == "kimi-coding":
_model_flow_kimi(config, current_model)
elif selected_provider in ("gemini", "zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface", "xiaomi"):
elif selected_provider in ("gemini", "deepseek", "xai", "zai", "kimi-coding-cn", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface", "xiaomi", "arcee"):
_model_flow_api_key_provider(config, selected_provider, current_model)
# ── Post-switch cleanup: clear stale OPENAI_BASE_URL ──────────────
@@ -1558,6 +1618,10 @@ def _model_flow_custom(config):
model_name = input("Model name (e.g. gpt-4, llama-3-70b): ").strip()
context_length_str = input("Context length in tokens [leave blank for auto-detect]: ").strip()
# Prompt for a display name — shown in the provider menu on future runs
default_name = _auto_provider_name(effective_url)
display_name = input(f"Display name [{default_name}]: ").strip() or default_name
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return
@@ -1613,15 +1677,37 @@ def _model_flow_custom(config):
print("Endpoint saved. Use `/model` in chat or `hermes model` to set a model.")
# Auto-save to custom_providers so it appears in the menu next time
_save_custom_provider(effective_url, effective_key, model_name or "", context_length=context_length)
_save_custom_provider(effective_url, effective_key, model_name or "",
context_length=context_length, name=display_name)
def _save_custom_provider(base_url, api_key="", model="", context_length=None):
def _auto_provider_name(base_url: str) -> str:
"""Generate a display name from a custom endpoint URL.
Returns a human-friendly label like "Local (localhost:11434)" or
"RunPod (xyz.runpod.io)". Used as the default when prompting the
user for a display name during custom endpoint setup.
"""
import re
clean = base_url.replace("https://", "").replace("http://", "").rstrip("/")
clean = re.sub(r"/v1/?$", "", clean)
name = clean.split("/")[0]
if "localhost" in name or "127.0.0.1" in name:
name = f"Local ({name})"
elif "runpod" in name.lower():
name = f"RunPod ({name})"
else:
name = name.capitalize()
return name
def _save_custom_provider(base_url, api_key="", model="", context_length=None,
name=None):
"""Save a custom endpoint to custom_providers in config.yaml.
Deduplicates by base_url if the URL already exists, updates the
model name and context_length but doesn't add a duplicate entry.
Auto-generates a display name from the URL hostname.
Uses *name* when provided, otherwise auto-generates from the URL.
"""
from hermes_cli.config import load_config, save_config
@@ -1649,20 +1735,9 @@ def _save_custom_provider(base_url, api_key="", model="", context_length=None):
save_config(cfg)
return # already saved, updated if needed
# Auto-generate a name from the URL
import re
clean = base_url.replace("https://", "").replace("http://", "").rstrip("/")
# Remove /v1 suffix for cleaner names
clean = re.sub(r"/v1/?$", "", clean)
# Use hostname:port as the name
name = clean.split("/")[0]
# Capitalize for readability
if "localhost" in name or "127.0.0.1" in name:
name = f"Local ({name})"
elif "runpod" in name.lower():
name = f"RunPod ({name})"
else:
name = name.capitalize()
# Use provided name or auto-generate from URL
if not name:
name = _auto_provider_name(base_url)
entry = {"name": name, "base_url": base_url}
if api_key:
@@ -1749,7 +1824,9 @@ def _model_flow_named_custom(config, provider_info):
name = provider_info["name"]
base_url = provider_info["base_url"]
api_key = provider_info.get("api_key", "")
key_env = provider_info.get("key_env", "")
saved_model = provider_info.get("model", "")
provider_key = (provider_info.get("provider_key") or "").strip()
print(f" Provider: {name}")
print(f" URL: {base_url}")
@@ -1832,15 +1909,41 @@ def _model_flow_named_custom(config, provider_info):
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = "custom"
model["base_url"] = base_url
if api_key:
model["api_key"] = api_key
if provider_key:
model["provider"] = provider_key
model.pop("base_url", None)
model.pop("api_key", None)
else:
model["provider"] = "custom"
model["base_url"] = base_url
if api_key:
model["api_key"] = api_key
# Apply api_mode from custom_providers entry, or clear stale value
custom_api_mode = provider_info.get("api_mode", "")
if custom_api_mode:
model["api_mode"] = custom_api_mode
else:
model.pop("api_mode", None) # let runtime auto-detect from URL
save_config(cfg)
deactivate_provider()
# Save model name to the custom_providers entry for next time
_save_custom_provider(base_url, api_key, model_name)
# Persist the selected model back to whichever schema owns this endpoint.
if provider_key:
cfg = load_config()
providers_cfg = cfg.get("providers")
if isinstance(providers_cfg, dict):
provider_entry = providers_cfg.get(provider_key)
if isinstance(provider_entry, dict):
provider_entry["default_model"] = model_name
if api_key and not str(provider_entry.get("api_key", "") or "").strip():
provider_entry["api_key"] = api_key
if key_env and not str(provider_entry.get("key_env", "") or "").strip():
provider_entry["key_env"] = key_env
cfg["providers"] = providers_cfg
save_config(cfg)
else:
# Save model name to the custom_providers entry for next time
_save_custom_provider(base_url, api_key, model_name)
print(f"\n✅ Model set to: {model_name}")
print(f" Provider: {name} ({base_url})")
@@ -2373,8 +2476,11 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
print()
override = ""
if override and base_url_env:
save_env_value(base_url_env, override)
effective_base = override
if not override.startswith(("http://", "https://")):
print(" Invalid URL — must start with http:// or https://. Keeping current value.")
else:
save_env_value(base_url_env, override)
effective_base = override
# Model selection — resolution order:
# 1. models.dev registry (cached, filtered for agentic/tool-capable models)
@@ -2537,13 +2643,12 @@ def _run_anthropic_oauth_flow(save_env_value):
def _model_flow_anthropic(config, current_model=""):
"""Flow for Anthropic provider — OAuth subscription, API key, or Claude Code creds."""
import os
from hermes_cli.auth import (
PROVIDER_REGISTRY, _prompt_model_selection, _save_model_choice,
_prompt_model_selection, _save_model_choice,
deactivate_provider,
)
from hermes_cli.config import (
get_env_value, save_env_value, load_config, save_config,
save_env_value, load_config, save_config,
save_anthropic_api_key,
)
from hermes_cli.models import _PROVIDER_MODELS
@@ -2705,12 +2810,34 @@ def cmd_dump(args):
run_dump(args)
def cmd_debug(args):
"""Debug tools (share report, etc.)."""
from hermes_cli.debug import run_debug
run_debug(args)
def cmd_config(args):
"""Configuration management."""
from hermes_cli.config import config_command
config_command(args)
def cmd_backup(args):
"""Back up Hermes home directory to a zip file."""
if getattr(args, "quick", False):
from hermes_cli.backup import run_quick_backup
run_quick_backup(args)
else:
from hermes_cli.backup import run_backup
run_backup(args)
def cmd_import(args):
"""Restore a Hermes backup from a zip file."""
from hermes_cli.backup import run_import
run_import(args)
def cmd_version(args):
"""Show version."""
print(f"Hermes Agent v{__version__} ({__release_date__})")
@@ -2829,6 +2956,44 @@ def _gateway_prompt(prompt_text: str, default: str = "", timeout: float = 300.0)
return default
def _build_web_ui(web_dir: Path, *, fatal: bool = False) -> bool:
"""Build the web UI frontend if npm is available.
Args:
web_dir: Path to the ``web/`` source directory.
fatal: If True, print error guidance and return False on failure
instead of a soft warning (used by ``hermes web``).
Returns True if the build succeeded or was skipped (no package.json).
"""
if not (web_dir / "package.json").exists():
return True
import shutil
npm = shutil.which("npm")
if not npm:
if fatal:
print("Web UI frontend not built and npm is not available.")
print("Install Node.js, then run: cd web && npm install && npm run build")
return not fatal
print("→ Building web UI...")
r1 = subprocess.run([npm, "install", "--silent"], cwd=web_dir, capture_output=True)
if r1.returncode != 0:
print(f" {'' if fatal else ''} Web UI npm install failed"
+ ("" if fatal else " (hermes web will not be available)"))
if fatal:
print(" Run manually: cd web && npm install && npm run build")
return False
r2 = subprocess.run([npm, "run", "build"], cwd=web_dir, capture_output=True)
if r2.returncode != 0:
print(f" {'' if fatal else ''} Web UI build failed"
+ ("" if fatal else " (hermes web will not be available)"))
if fatal:
print(" Run manually: cd web && npm install && npm run build")
return False
print(" ✓ Web UI built")
return True
def _update_via_zip(args):
"""Update Hermes Agent by downloading a ZIP archive.
@@ -2923,7 +3088,10 @@ def _update_via_zip(args):
check=True,
)
_install_python_dependencies_with_optional_fallback(pip_cmd)
# Build web UI frontend (optional — requires npm)
_build_web_ui(PROJECT_ROOT / "web")
# Sync skills
try:
from tools.skills_sync import sync_skills
@@ -3670,7 +3838,10 @@ def cmd_update(args):
if shutil.which("npm"):
print("→ Updating Node.js dependencies...")
subprocess.run(["npm", "install", "--silent"], cwd=PROJECT_ROOT, check=False)
# Build web UI frontend (optional — requires npm)
_build_web_ui(PROJECT_ROOT / "web")
print()
print("✓ Code updated!")
@@ -3798,6 +3969,26 @@ def cmd_update(args):
print()
print("✓ Update complete!")
# Write exit code *before* the gateway restart attempt.
# When running as ``hermes update --gateway`` (spawned by the gateway's
# /update command), this process lives inside the gateway's systemd
# cgroup. ``systemctl restart hermes-gateway`` kills everything in the
# cgroup (KillMode=mixed → SIGKILL to remaining processes), including
# us and the wrapping bash shell. The shell never reaches its
# ``printf $status > .update_exit_code`` epilogue, so the exit-code
# marker file is never created. The new gateway's update watcher then
# polls for 30 minutes and sends a spurious timeout message.
#
# Writing the marker here — after git pull + pip install succeed but
# before we attempt the restart — ensures the new gateway sees it
# regardless of how we die.
if gateway_mode:
_exit_code_path = get_hermes_home() / ".update_exit_code"
try:
_exit_code_path.write_text("0")
except OSError:
pass
# Auto-restart ALL gateways after update.
# The code update (git pull) is shared across all profiles, so every
# running gateway needs restarting to pick up the new code.
@@ -3845,7 +4036,40 @@ def cmd_update(args):
capture_output=True, text=True, timeout=15,
)
if restart.returncode == 0:
restarted_services.append(svc_name)
# Verify the service actually survived the
# restart. systemctl restart returns 0 even
# if the new process crashes immediately.
import time as _time
_time.sleep(3)
verify = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True, text=True, timeout=5,
)
if verify.stdout.strip() == "active":
restarted_services.append(svc_name)
else:
# Retry once — transient startup failures
# (stale module cache, import race) often
# resolve on the second attempt.
print(f"{svc_name} died after restart, retrying...")
retry = subprocess.run(
scope_cmd + ["restart", svc_name],
capture_output=True, text=True, timeout=15,
)
_time.sleep(3)
verify2 = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True, text=True, timeout=5,
)
if verify2.stdout.strip() == "active":
restarted_services.append(svc_name)
print(f"{svc_name} recovered on retry")
else:
print(
f"{svc_name} failed to stay running after restart.\n"
f" Check logs: journalctl --user -u {svc_name} --since '2 min ago'\n"
f" Restart manually: systemctl {'--user ' if scope == 'user' else ''}restart {svc_name}"
)
else:
print(f" ⚠ Failed to restart {svc_name}: {restart.stderr.strip()}")
except (FileNotFoundError, subprocess.TimeoutExpired):
@@ -3932,7 +4156,9 @@ def _coalesce_session_name_args(argv: list) -> list:
"chat", "model", "gateway", "setup", "whatsapp", "login", "logout", "auth",
"status", "cron", "doctor", "config", "pairing", "skills", "tools",
"mcp", "sessions", "insights", "version", "update", "uninstall",
"profile",
"profile", "dashboard",
"honcho", "claw", "plugins", "acp",
"webhook", "memory", "dump", "debug", "backup", "import", "completion", "logs",
}
_SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}
@@ -4082,18 +4308,24 @@ def cmd_profile(args):
print(f' Add to your shell config (~/.bashrc or ~/.zshrc):')
print(f' export PATH="$HOME/.local/bin:$PATH"')
# Profile dir for display
try:
profile_dir_display = "~/" + str(profile_dir.relative_to(Path.home()))
except ValueError:
profile_dir_display = str(profile_dir)
# Next steps
print(f"\nNext steps:")
print(f" {name} setup Configure API keys and model")
print(f" {name} chat Start chatting")
print(f" {name} gateway start Start the messaging gateway")
if clone or clone_all:
try:
profile_dir_display = "~/" + str(profile_dir.relative_to(Path.home()))
except ValueError:
profile_dir_display = str(profile_dir)
print(f"\n Edit {profile_dir_display}/.env for different API keys")
print(f" Edit {profile_dir_display}/SOUL.md for different personality")
else:
print(f"\n ⚠ This profile has no API keys yet. Run '{name} setup' first,")
print(f" or it will inherit keys from your shell environment.")
print(f" Edit {profile_dir_display}/SOUL.md to customize personality")
print()
except (ValueError, FileExistsError, FileNotFoundError) as e:
@@ -4204,14 +4436,38 @@ def cmd_profile(args):
sys.exit(1)
def cmd_completion(args):
def cmd_dashboard(args):
"""Start the web UI server."""
try:
import fastapi # noqa: F401
import uvicorn # noqa: F401
except ImportError:
print("Web UI dependencies not installed.")
print("Install them with: pip install hermes-agent[web]")
sys.exit(1)
if not _build_web_ui(PROJECT_ROOT / "web", fatal=True):
sys.exit(1)
from hermes_cli.web_server import start_server
start_server(
host=args.host,
port=args.port,
open_browser=not args.no_open,
allow_public=getattr(args, "insecure", False),
)
def cmd_completion(args, parser=None):
"""Print shell completion script."""
from hermes_cli.profiles import generate_bash_completion, generate_zsh_completion
from hermes_cli.completion import generate_bash, generate_zsh, generate_fish
shell = getattr(args, "shell", "bash")
if shell == "zsh":
print(generate_zsh_completion())
print(generate_zsh(parser))
elif shell == "fish":
print(generate_fish(parser))
else:
print(generate_bash_completion())
print(generate_bash(parser))
def cmd_logs(args):
@@ -4231,6 +4487,7 @@ def cmd_logs(args):
level=getattr(args, "level", None),
session=getattr(args, "session", None),
since=getattr(args, "since", None),
component=getattr(args, "component", None),
)
@@ -4268,6 +4525,7 @@ Examples:
hermes logs -f Follow agent.log in real time
hermes logs errors View errors.log
hermes logs --since 1h Lines from the last hour
hermes debug share Upload debug report for support
hermes update Update to latest version
For more help on a command:
@@ -4354,7 +4612,7 @@ For more help on a command:
)
chat_parser.add_argument(
"--provider",
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "gemini", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "xiaomi"],
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "gemini", "huggingface", "zai", "kimi-coding", "kimi-coding-cn", "minimax", "minimax-cn", "kilocode", "xiaomi", "arcee"],
default=None,
help="Inference provider (default: auto)"
)
@@ -4796,7 +5054,90 @@ For more help on a command:
help="Show redacted API key prefixes (first/last 4 chars) instead of just set/not set"
)
dump_parser.set_defaults(func=cmd_dump)
# =========================================================================
# debug command
# =========================================================================
debug_parser = subparsers.add_parser(
"debug",
help="Debug tools — upload logs and system info for support",
description="Debug utilities for Hermes Agent. Use 'hermes debug share' to "
"upload a debug report (system info + recent logs) to a paste "
"service and get a shareable URL.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
hermes debug share Upload debug report and print URL
hermes debug share --lines 500 Include more log lines
hermes debug share --expire 30 Keep paste for 30 days
hermes debug share --local Print report locally (no upload)
""",
)
debug_sub = debug_parser.add_subparsers(dest="debug_command")
share_parser = debug_sub.add_parser(
"share",
help="Upload debug report to a paste service and print a shareable URL",
)
share_parser.add_argument(
"--lines", type=int, default=200,
help="Number of log lines to include per log file (default: 200)",
)
share_parser.add_argument(
"--expire", type=int, default=7,
help="Paste expiry in days (default: 7)",
)
share_parser.add_argument(
"--local", action="store_true",
help="Print the report locally instead of uploading",
)
debug_parser.set_defaults(func=cmd_debug)
# =========================================================================
# backup command
# =========================================================================
backup_parser = subparsers.add_parser(
"backup",
help="Back up Hermes home directory to a zip file",
description="Create a zip archive of your entire Hermes configuration, "
"skills, sessions, and data (excludes the hermes-agent codebase). "
"Use --quick for a fast snapshot of just critical state files."
)
backup_parser.add_argument(
"-o", "--output",
help="Output path for the zip file (default: ~/hermes-backup-<timestamp>.zip)"
)
backup_parser.add_argument(
"-q", "--quick",
action="store_true",
help="Quick snapshot: only critical state files (config, state.db, .env, auth, cron)"
)
backup_parser.add_argument(
"-l", "--label",
help="Label for the snapshot (only used with --quick)"
)
backup_parser.set_defaults(func=cmd_backup)
# =========================================================================
# import command
# =========================================================================
import_parser = subparsers.add_parser(
"import",
help="Restore a Hermes backup from a zip file",
description="Extract a previously created Hermes backup into your "
"Hermes home directory, restoring configuration, skills, "
"sessions, and data"
)
import_parser.add_argument(
"zipfile",
help="Path to the backup zip file"
)
import_parser.add_argument(
"--force", "-f",
action="store_true",
help="Overwrite existing files without confirmation"
)
import_parser.set_defaults(func=cmd_import)
# =========================================================================
# config command
# =========================================================================
@@ -5146,6 +5487,8 @@ For more help on a command:
mcp_add_p.add_argument("--command", help="Stdio command (e.g. npx)")
mcp_add_p.add_argument("--args", nargs="*", default=[], help="Arguments for stdio command")
mcp_add_p.add_argument("--auth", choices=["oauth", "header"], help="Auth method")
mcp_add_p.add_argument("--preset", help="Known MCP preset name")
mcp_add_p.add_argument("--env", nargs="*", default=[], help="Environment variables for stdio servers (KEY=VALUE)")
mcp_rm_p = mcp_sub.add_parser("remove", aliases=["rm"], help="Remove an MCP server")
mcp_rm_p.add_argument("name", help="Server name to remove")
@@ -5604,13 +5947,30 @@ For more help on a command:
# =========================================================================
completion_parser = subparsers.add_parser(
"completion",
help="Print shell completion script (bash or zsh)",
help="Print shell completion script (bash, zsh, or fish)",
)
completion_parser.add_argument(
"shell", nargs="?", default="bash", choices=["bash", "zsh"],
"shell", nargs="?", default="bash", choices=["bash", "zsh", "fish"],
help="Shell type (default: bash)",
)
completion_parser.set_defaults(func=cmd_completion)
completion_parser.set_defaults(func=lambda args: cmd_completion(args, parser))
# =========================================================================
# dashboard command
# =========================================================================
dashboard_parser = subparsers.add_parser(
"dashboard",
help="Start the web UI dashboard",
description="Launch the Hermes Agent web dashboard for managing config, API keys, and sessions",
)
dashboard_parser.add_argument("--port", type=int, default=9119, help="Port (default 9119)")
dashboard_parser.add_argument("--host", default="127.0.0.1", help="Host (default 127.0.0.1)")
dashboard_parser.add_argument("--no-open", action="store_true", help="Don't open browser automatically")
dashboard_parser.add_argument(
"--insecure", action="store_true",
help="Allow binding to non-localhost (DANGEROUS: exposes API keys on the network)",
)
dashboard_parser.set_defaults(func=cmd_dashboard)
# =========================================================================
# logs command
@@ -5628,6 +5988,7 @@ Examples:
hermes logs gateway -n 100 Show last 100 lines of gateway.log
hermes logs --level WARNING Only show WARNING and above
hermes logs --session abc123 Filter by session ID
hermes logs --component tools Only show tool-related lines
hermes logs --since 1h Lines from the last hour
hermes logs --since 30m -f Follow, starting from 30 min ago
hermes logs list List available log files with sizes
@@ -5657,6 +6018,10 @@ Examples:
"--since", metavar="TIME",
help="Show lines since TIME ago (e.g. 1h, 30m, 2d)",
)
logs_parser.add_argument(
"--component", metavar="NAME",
help="Filter by component: gateway, agent, tools, cli, cron",
)
logs_parser.set_defaults(func=cmd_logs)
# =========================================================================
@@ -5665,9 +6030,22 @@ Examples:
# Pre-process argv so unquoted multi-word session names after -c / -r
# are merged into a single token before argparse sees them.
# e.g. ``hermes -c Pokemon Agent Dev`` → ``hermes -c 'Pokemon Agent Dev'``
# ── Container-aware routing ────────────────────────────────────────
# When NixOS container mode is active, route ALL subcommands into
# the managed container. This MUST run before parse_args() so that
# --help, unrecognised flags, and every subcommand are forwarded
# transparently instead of being intercepted by argparse on the host.
from hermes_cli.config import get_container_exec_info
container_info = get_container_exec_info()
if container_info:
_exec_in_container(container_info, sys.argv[1:])
# Unreachable: os.execvp never returns on success (process is replaced)
# and raises OSError on failure (which propagates as a traceback).
sys.exit(1)
_processed_argv = _coalesce_session_name_args(sys.argv[1:])
args = parser.parse_args(_processed_argv)
# Handle --version flag
if args.version:
cmd_version(args)
+85 -3
View File
@@ -9,7 +9,6 @@ configuration in ~/.hermes/config.yaml under the ``mcp_servers`` key.
"""
import asyncio
import getpass
import logging
import os
import re
@@ -28,6 +27,11 @@ from hermes_constants import display_hermes_home
logger = logging.getLogger(__name__)
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_MCP_PRESETS: Dict[str, Dict[str, Any]] = {}
# ─── UI Helpers ───────────────────────────────────────────────────────────────
@@ -98,6 +102,59 @@ def _env_key_for_server(name: str) -> str:
return f"MCP_{name.upper().replace('-', '_')}_API_KEY"
def _parse_env_assignments(raw_env: Optional[List[str]]) -> Dict[str, str]:
"""Parse ``KEY=VALUE`` strings from CLI args into an env dict."""
parsed: Dict[str, str] = {}
for item in raw_env or []:
text = str(item or "").strip()
if not text:
continue
if "=" not in text:
raise ValueError(f"Invalid --env value '{text}' (expected KEY=VALUE)")
key, value = text.split("=", 1)
key = key.strip()
if not key:
raise ValueError(f"Invalid --env value '{text}' (missing variable name)")
if not _ENV_VAR_NAME_RE.match(key):
raise ValueError(f"Invalid --env variable name '{key}'")
parsed[key] = value
return parsed
def _apply_mcp_preset(
name: str,
*,
preset_name: Optional[str],
url: Optional[str],
command: Optional[str],
cmd_args: List[str],
server_config: Dict[str, Any],
) -> tuple[Optional[str], Optional[str], List[str], bool]:
"""Apply a known MCP preset when transport details were omitted."""
if not preset_name:
return url, command, cmd_args, False
preset = _MCP_PRESETS.get(preset_name)
if not preset:
raise ValueError(f"Unknown MCP preset: {preset_name}")
if url or command:
return url, command, cmd_args, False
url = preset.get("url")
command = preset.get("command")
cmd_args = list(preset.get("args") or [])
if url:
server_config["url"] = url
if command:
server_config["command"] = command
if cmd_args:
server_config["args"] = cmd_args
return url, command, cmd_args, True
# ─── Discovery (temporary connect) ───────────────────────────────────────────
def _probe_single_server(
@@ -166,13 +223,35 @@ def cmd_mcp_add(args):
command = getattr(args, "command", None)
cmd_args = getattr(args, "args", None) or []
auth_type = getattr(args, "auth", None)
preset_name = getattr(args, "preset", None)
raw_env = getattr(args, "env", None)
server_config: Dict[str, Any] = {}
try:
explicit_env = _parse_env_assignments(raw_env)
url, command, cmd_args, _preset_applied = _apply_mcp_preset(
name,
preset_name=preset_name,
url=url,
command=command,
cmd_args=list(cmd_args),
server_config=server_config,
)
except ValueError as exc:
_error(str(exc))
return
if url and explicit_env:
_error("--env is only supported for stdio MCP servers (--command or stdio presets)")
return
# Validate transport
if not url and not command:
_error("Must specify --url <endpoint> or --command <cmd>")
_error("Must specify --url <endpoint>, --command <cmd>, or --preset <name>")
_info("Examples:")
_info(' hermes mcp add ink --url "https://mcp.ml.ink/mcp"')
_info(' hermes mcp add github --command npx --args @modelcontextprotocol/server-github')
_info(' hermes mcp add myserver --preset mypreset')
return
# Check if server already exists
@@ -183,13 +262,15 @@ def cmd_mcp_add(args):
return
# Build initial config
server_config: Dict[str, Any] = {}
if url:
server_config["url"] = url
else:
server_config["command"] = command
if cmd_args:
server_config["args"] = cmd_args
if explicit_env:
server_config["env"] = explicit_env
# ── Authentication ────────────────────────────────────────────────
@@ -627,6 +708,7 @@ def mcp_command(args):
_info("hermes mcp serve Run as MCP server")
_info("hermes mcp add <name> --url <endpoint> Add an MCP server")
_info("hermes mcp add <name> --command <cmd> Add a stdio server")
_info("hermes mcp add <name> --preset <preset> Add from a known preset")
_info("hermes mcp remove <name> Remove a server")
_info("hermes mcp list List servers")
_info("hermes mcp test <name> Test connection")
+9 -5
View File
@@ -324,6 +324,9 @@ def cmd_setup(args) -> None:
val = _prompt(desc, default=str(effective_default) if effective_default else None)
if val:
provider_config[key] = val
# Also write to .env if this field has an env_var
if env_var and env_var not in env_writes:
env_writes[env_var] = val
# Write activation key to config.yaml
config["memory"]["provider"] = name
@@ -409,12 +412,13 @@ def cmd_status(args) -> None:
else:
print(f" Status: not available ✗")
schema = p.get_config_schema() if hasattr(p, "get_config_schema") else []
secrets = [f for f in schema if f.get("secret")]
if secrets:
# Check all fields that have env_var (both secret and non-secret)
required_fields = [f for f in schema if f.get("env_var")]
if required_fields:
print(f" Missing:")
for s in secrets:
env_var = s.get("env_var", "")
url = s.get("url", "")
for f in required_fields:
env_var = f.get("env_var", "")
url = f.get("url", "")
is_set = bool(os.environ.get(env_var))
mark = "" if is_set else ""
line = f" {mark} {env_var}"
+25 -6
View File
@@ -8,8 +8,9 @@ Different LLM providers expect model identifiers in different formats:
hyphens: ``claude-sonnet-4-6``.
- **Copilot** expects bare names *with* dots preserved:
``claude-sonnet-4.6``.
- **OpenCode Zen** follows the same dot-to-hyphen convention as
Anthropic: ``claude-sonnet-4-6``.
- **OpenCode Zen** preserves dots for GPT/GLM/Gemini/Kimi/MiniMax-style
model IDs, but Claude still uses hyphenated native names like
``claude-sonnet-4-6``.
- **OpenCode Go** preserves dots in model names: ``minimax-m2.7``.
- **DeepSeek** only accepts two model identifiers:
``deepseek-chat`` and ``deepseek-reasoner``.
@@ -50,6 +51,7 @@ _VENDOR_PREFIXES: dict[str, str] = {
"grok": "x-ai",
"qwen": "qwen",
"mimo": "xiaomi",
"trinity": "arcee-ai",
"nemotron": "nvidia",
"llama": "meta-llama",
"step": "stepfun",
@@ -67,20 +69,19 @@ _AGGREGATOR_PROVIDERS: frozenset[str] = frozenset({
# Providers that want bare names with dots replaced by hyphens.
_DOT_TO_HYPHEN_PROVIDERS: frozenset[str] = frozenset({
"anthropic",
"opencode-zen",
})
# Providers that want bare names with dots preserved.
_STRIP_VENDOR_ONLY_PROVIDERS: frozenset[str] = frozenset({
"copilot",
"copilot-acp",
"openai-codex",
})
# Providers whose native naming is authoritative -- pass through unchanged.
_AUTHORITATIVE_NATIVE_PROVIDERS: frozenset[str] = frozenset({
"gemini",
"huggingface",
"openai-codex",
})
# Direct providers that accept bare native names but should repair a matching
@@ -88,11 +89,13 @@ _AUTHORITATIVE_NATIVE_PROVIDERS: frozenset[str] = frozenset({
_MATCHING_PREFIX_STRIP_PROVIDERS: frozenset[str] = frozenset({
"zai",
"kimi-coding",
"kimi-coding-cn",
"minimax",
"minimax-cn",
"alibaba",
"qwen-oauth",
"xiaomi",
"arcee",
"custom",
})
@@ -329,6 +332,9 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
>>> normalize_model_for_provider("claude-sonnet-4.6", "opencode-zen")
'claude-sonnet-4-6'
>>> normalize_model_for_provider("minimax-m2.5-free", "opencode-zen")
'minimax-m2.5-free'
>>> normalize_model_for_provider("deepseek-v3", "deepseek")
'deepseek-chat'
@@ -351,7 +357,16 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
if provider in _AGGREGATOR_PROVIDERS:
return _prepend_vendor(name)
# --- Anthropic / OpenCode: strip matching provider prefix, dots -> hyphens ---
# --- OpenCode Zen: Claude stays hyphenated; other models keep dots ---
if provider == "opencode-zen":
bare = _strip_matching_provider_prefix(name, provider)
if "/" in bare:
return bare
if bare.lower().startswith("claude-"):
return _dots_to_hyphens(bare)
return bare
# --- Anthropic: strip matching provider prefix, dots -> hyphens ---
if provider in _DOT_TO_HYPHEN_PROVIDERS:
bare = _strip_matching_provider_prefix(name, provider)
if "/" in bare:
@@ -360,7 +375,11 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
# --- Copilot: strip matching provider prefix, keep dots ---
if provider in _STRIP_VENDOR_ONLY_PROVIDERS:
return _strip_matching_provider_prefix(name, provider)
stripped = _strip_matching_provider_prefix(name, provider)
if stripped == name and name.startswith("openai/"):
# openai-codex maps openai/gpt-5.4 -> gpt-5.4
return name.split("/", 1)[1]
return stripped
# --- DeepSeek: map to one of two canonical names ---
if provider == "deepseek":
+161 -15
View File
@@ -21,6 +21,7 @@ OpenRouter variant suffixes (``:free``, ``:extended``, ``:fast``).
from __future__ import annotations
import logging
import re
from dataclasses import dataclass
from typing import List, NamedTuple, Optional
@@ -40,7 +41,6 @@ from agent.models_dev import (
get_model_capabilities,
get_model_info,
list_provider_models,
search_models_dev,
)
logger = logging.getLogger(__name__)
@@ -57,10 +57,36 @@ _HERMES_MODEL_WARNING = (
"(Claude, GPT, Gemini, DeepSeek, etc.)."
)
# Match only the real Nous Research Hermes 3 / Hermes 4 chat families.
# The previous substring check (`"hermes" in name.lower()`) false-positived on
# unrelated local Modelfiles like ``hermes-brain:qwen3-14b-ctx16k`` that just
# happen to carry "hermes" in their tag but are fully tool-capable.
#
# Positive examples the regex must match:
# NousResearch/Hermes-3-Llama-3.1-70B, hermes-4-405b, openrouter/hermes3:70b
# Negative examples it must NOT match:
# hermes-brain:qwen3-14b-ctx16k, qwen3:14b, claude-opus-4-6
_NOUS_HERMES_NON_AGENTIC_RE = re.compile(
r"(?:^|[/:])hermes[-_ ]?[34](?:[-_.:]|$)",
re.IGNORECASE,
)
def is_nous_hermes_non_agentic(model_name: str) -> bool:
"""Return True if *model_name* is a real Nous Hermes 3/4 chat model.
Used to decide whether to surface the non-agentic warning at startup.
Callers in :mod:`cli.py` and here should go through this single helper
so the two sites don't drift.
"""
if not model_name:
return False
return bool(_NOUS_HERMES_NON_AGENTIC_RE.search(model_name))
def _check_hermes_model_warning(model_name: str) -> str:
"""Return a warning string if *model_name* looks like a Hermes LLM model."""
if "hermes" in model_name.lower():
"""Return a warning string if *model_name* is a Nous Hermes 3/4 chat model."""
if is_nous_hermes_non_agentic(model_name):
return _HERMES_MODEL_WARNING
return ""
@@ -679,6 +705,10 @@ def switch_model(
error_message=msg,
)
# Apply auto-correction if validation found a closer match
if validation.get("corrected_model"):
new_model = validation["corrected_model"]
# --- OpenCode api_mode override ---
if target_provider in {"opencode-zen", "opencode-go", "opencode", "opencode-go"}:
api_mode = opencode_model_api_mode(target_provider, new_model)
@@ -839,8 +869,11 @@ def list_authenticated_providers(
if any(os.environ.get(ev) for ev in pcfg.api_key_env_vars):
has_creds = True
break
if not has_creds and overlay.auth_type in ("oauth_device_code", "oauth_external", "external_process"):
# These use auth stores, not env vars — check for auth.json entries
# Check auth store and credential pool for non-env-var credentials.
# This applies to OAuth providers AND api_key providers that also
# support OAuth (e.g. anthropic supports both API key and Claude Code
# OAuth via external credential files).
if not has_creds:
try:
from hermes_cli.auth import _load_auth_store
store = _load_auth_store()
@@ -853,6 +886,38 @@ def list_authenticated_providers(
has_creds = True
except Exception as exc:
logger.debug("Auth store check failed for %s: %s", pid, exc)
# Fallback: check the credential pool with full auto-seeding.
# This catches credentials that exist in external stores (e.g.
# Codex CLI ~/.codex/auth.json) which _seed_from_singletons()
# imports on demand but aren't in the raw auth.json yet.
if not has_creds:
try:
from agent.credential_pool import load_pool
pool = load_pool(hermes_slug)
if pool.has_credentials():
has_creds = True
except Exception as exc:
logger.debug("Credential pool check failed for %s: %s", hermes_slug, exc)
# Fallback: check external credential files directly.
# The credential pool gates anthropic behind
# is_provider_explicitly_configured() to prevent auxiliary tasks
# from silently consuming Claude Code tokens (PR #4210).
# But the /model picker is discovery-oriented — we WANT to show
# providers the user can switch to, even if they aren't currently
# configured.
if not has_creds and hermes_slug == "anthropic":
try:
from agent.anthropic_adapter import (
read_claude_code_credentials,
read_hermes_oauth_credentials,
)
hermes_creds = read_hermes_oauth_credentials()
cc_creds = read_claude_code_credentials()
if (hermes_creds and hermes_creds.get("accessToken")) or \
(cc_creds and cc_creds.get("accessToken")):
has_creds = True
except Exception as exc:
logger.debug("Anthropic external creds check failed: %s", exc)
if not has_creds:
continue
@@ -873,6 +938,65 @@ def list_authenticated_providers(
seen_slugs.add(pid)
seen_slugs.add(hermes_slug)
# --- 2b. Cross-check canonical provider list ---
# Catches providers that are in CANONICAL_PROVIDERS but weren't found
# in PROVIDER_TO_MODELS_DEV or HERMES_OVERLAYS (keeps /model in sync
# with `hermes model`).
try:
from hermes_cli.models import CANONICAL_PROVIDERS as _canon_provs
except ImportError:
_canon_provs = []
for _cp in _canon_provs:
if _cp.slug in seen_slugs:
continue
# Check credentials via PROVIDER_REGISTRY (auth.py)
_cp_config = _auth_registry.get(_cp.slug)
_cp_has_creds = False
if _cp_config and _cp_config.api_key_env_vars:
_cp_has_creds = any(os.environ.get(ev) for ev in _cp_config.api_key_env_vars)
# Also check auth store and credential pool
if not _cp_has_creds:
try:
from hermes_cli.auth import _load_auth_store
_cp_store = _load_auth_store()
_cp_providers_store = _cp_store.get("providers", {})
_cp_pool_store = _cp_store.get("credential_pool", {})
if _cp_store and (
_cp.slug in _cp_providers_store
or _cp.slug in _cp_pool_store
):
_cp_has_creds = True
except Exception:
pass
if not _cp_has_creds:
try:
from agent.credential_pool import load_pool
_cp_pool = load_pool(_cp.slug)
if _cp_pool.has_credentials():
_cp_has_creds = True
except Exception:
pass
if not _cp_has_creds:
continue
_cp_model_ids = curated.get(_cp.slug, [])
_cp_total = len(_cp_model_ids)
_cp_top = _cp_model_ids[:max_models]
results.append({
"slug": _cp.slug,
"name": _cp.label,
"is_current": _cp.slug == current_provider,
"is_user_defined": False,
"models": _cp_top,
"total_models": _cp_total,
"source": "canonical",
})
seen_slugs.add(_cp.slug)
# --- 3. User-defined endpoints from config ---
if user_providers and isinstance(user_providers, dict):
for ep_name, ep_cfg in user_providers.items():
@@ -882,9 +1006,16 @@ def list_authenticated_providers(
api_url = ep_cfg.get("api", "") or ep_cfg.get("url", "") or ""
default_model = ep_cfg.get("default_model", "")
# Build models list from both default_model and full models array
models_list = []
if default_model:
models_list.append(default_model)
# Also include the full models list from config
cfg_models = ep_cfg.get("models", [])
if isinstance(cfg_models, list):
for m in cfg_models:
if m and m not in models_list:
models_list.append(m)
# Try to probe /v1/models if URL is set (but don't block on it)
# For now just show what we know from config
@@ -900,7 +1031,17 @@ def list_authenticated_providers(
})
# --- 4. Saved custom providers from config ---
# Each ``custom_providers`` entry represents one model under a named
# provider. Entries sharing the same provider name are grouped into a
# single picker row so that e.g. four Ollama Cloud entries
# (qwen3-coder, glm-5.1, kimi-k2, minimax-m2.7) appear as one
# "Ollama Cloud" row with four models inside instead of four
# duplicate "Ollama Cloud" rows. Entries with distinct provider names
# still produce separate rows (e.g. Ollama Cloud vs Moonshot).
if custom_providers and isinstance(custom_providers, list):
from collections import OrderedDict
groups: "OrderedDict[str, dict]" = OrderedDict()
for entry in custom_providers:
if not isinstance(entry, dict):
continue
@@ -916,23 +1057,28 @@ def list_authenticated_providers(
continue
slug = custom_provider_slug(display_name)
if slug not in groups:
groups[slug] = {
"name": display_name,
"api_url": api_url,
"models": [],
}
default_model = (entry.get("model") or "").strip()
if default_model and default_model not in groups[slug]["models"]:
groups[slug]["models"].append(default_model)
for slug, grp in groups.items():
if slug in seen_slugs:
continue
models_list = []
default_model = (entry.get("model") or "").strip()
if default_model:
models_list.append(default_model)
results.append({
"slug": slug,
"name": display_name,
"name": grp["name"],
"is_current": slug == current_provider,
"is_user_defined": True,
"models": models_list,
"total_models": len(models_list),
"models": grp["models"],
"total_models": len(grp["models"]),
"source": "user-config",
"api_url": api_url,
"api_url": grp["api_url"],
})
seen_slugs.add(slug)
+155 -42
View File
@@ -12,7 +12,7 @@ import os
import urllib.request
import urllib.error
from difflib import get_close_matches
from typing import Any, Optional
from typing import Any, NamedTuple, Optional
COPILOT_BASE_URL = "https://api.githubcopilot.com"
COPILOT_MODELS_URL = f"{COPILOT_BASE_URL}/models"
@@ -29,6 +29,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("qwen/qwen3.6-plus", ""),
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openrouter/elephant-alpha", "free"),
("openai/gpt-5.4", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2-pro", ""),
@@ -43,6 +44,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("minimax/minimax-m2.7", ""),
("minimax/minimax-m2.5", ""),
("z-ai/glm-5.1", ""),
("z-ai/glm-5v-turbo", ""),
("z-ai/glm-5-turbo", ""),
("moonshotai/kimi-k2.5", ""),
("x-ai/grok-4.20", ""),
@@ -70,13 +72,13 @@ def _codex_curated_models() -> list[str]:
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"xiaomi/mimo-v2-pro",
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5",
"anthropic/claude-haiku-4.5",
"openai/gpt-5.4",
"openai/gpt-5.4-mini",
"xiaomi/mimo-v2-pro",
"openai/gpt-5.3-codex",
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
@@ -88,6 +90,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"minimax/minimax-m2.7",
"minimax/minimax-m2.5",
"z-ai/glm-5.1",
"z-ai/glm-5v-turbo",
"z-ai/glm-5-turbo",
"moonshotai/kimi-k2.5",
"x-ai/grok-4.20-beta",
@@ -97,6 +100,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"arcee-ai/trinity-large-thinking",
"openai/gpt-5.4-pro",
"openai/gpt-5.4-nano",
"openrouter/elephant-alpha",
],
"openai-codex": _codex_curated_models(),
"copilot-acp": [
@@ -130,7 +134,9 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gemma-4-26b-it",
],
"zai": [
"glm-5.1",
"glm-5",
"glm-5v-turbo",
"glm-5-turbo",
"glm-4.7",
"glm-4.5",
@@ -157,6 +163,12 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"kimi-coding-cn": [
"kimi-k2.5",
"kimi-k2-thinking",
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"moonshot": [
"kimi-k2.5",
"kimi-k2-thinking",
@@ -193,6 +205,11 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"mimo-v2-omni",
"mimo-v2-flash",
],
"arcee": [
"trinity-large-thinking",
"trinity-large-preview",
"trinity-mini",
],
"opencode-zen": [
"gpt-5.4-pro",
"gpt-5.4",
@@ -478,29 +495,52 @@ def check_nous_free_tier() -> bool:
return False # default to paid on error — don't block users
_PROVIDER_LABELS = {
"openrouter": "OpenRouter",
"openai-codex": "OpenAI Codex",
"copilot-acp": "GitHub Copilot ACP",
"nous": "Nous Portal",
"copilot": "GitHub Copilot",
"gemini": "Google AI Studio",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
"minimax-cn": "MiniMax (China)",
"anthropic": "Anthropic",
"deepseek": "DeepSeek",
"opencode-zen": "OpenCode Zen",
"opencode-go": "OpenCode Go",
"ai-gateway": "AI Gateway",
"kilocode": "Kilo Code",
"alibaba": "Alibaba Cloud (DashScope)",
"qwen-oauth": "Qwen OAuth (Portal)",
"huggingface": "Hugging Face",
"xiaomi": "Xiaomi MiMo",
"custom": "Custom endpoint",
}
# ---------------------------------------------------------------------------
# Canonical provider list — single source of truth for provider identity.
# Every code path that lists, displays, or iterates providers derives from
# this list: hermes model, /model, /provider, list_authenticated_providers.
#
# Fields:
# slug — internal provider ID (used in config.yaml, --provider flag)
# label — short display name
# tui_desc — longer description for the `hermes model` interactive picker
# ---------------------------------------------------------------------------
class ProviderEntry(NamedTuple):
slug: str
label: str
tui_desc: str # detailed description for `hermes model` TUI
CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
ProviderEntry("copilot", "GitHub Copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
ProviderEntry("copilot-acp", "GitHub Copilot ACP", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
ProviderEntry("huggingface", "Hugging Face", "Hugging Face Inference Providers (20+ open models)"),
ProviderEntry("gemini", "Google AI Studio", "Google AI Studio (Gemini models — OpenAI-compatible endpoint)"),
ProviderEntry("deepseek", "DeepSeek", "DeepSeek (DeepSeek-V3, R1, coder — direct API)"),
ProviderEntry("xai", "xAI", "xAI (Grok models — direct API)"),
ProviderEntry("zai", "Z.AI / GLM", "Z.AI / GLM (Zhipu AI direct API)"),
ProviderEntry("kimi-coding", "Kimi / Moonshot", "Kimi / Moonshot (Moonshot AI direct API)"),
ProviderEntry("kimi-coding-cn", "Kimi / Moonshot (China)", "Kimi / Moonshot China (Moonshot CN direct API)"),
ProviderEntry("minimax", "MiniMax", "MiniMax (global direct API)"),
ProviderEntry("minimax-cn", "MiniMax (China)", "MiniMax China (domestic direct API)"),
ProviderEntry("alibaba", "Alibaba Cloud (DashScope)","Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
ProviderEntry("arcee", "Arcee AI", "Arcee AI (Trinity models — direct API)"),
ProviderEntry("kilocode", "Kilo Code", "Kilo Code (Kilo Gateway API)"),
ProviderEntry("opencode-zen", "OpenCode Zen", "OpenCode Zen (35+ curated models, pay-as-you-go)"),
ProviderEntry("opencode-go", "OpenCode Go", "OpenCode Go (open models, $10/month subscription)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway (200+ models, pay-per-use)"),
]
# Derived dicts — used throughout the codebase
_PROVIDER_LABELS = {p.slug: p.label for p in CANONICAL_PROVIDERS}
_PROVIDER_LABELS["custom"] = "Custom endpoint" # special case: not a named provider
_PROVIDER_ALIASES = {
"glm": "zai",
@@ -518,6 +558,10 @@ _PROVIDER_ALIASES = {
"google-ai-studio": "gemini",
"kimi": "kimi-coding",
"moonshot": "kimi-coding",
"kimi-cn": "kimi-coding-cn",
"moonshot-cn": "kimi-coding-cn",
"arcee-ai": "arcee",
"arceeai": "arcee",
"minimax-china": "minimax-cn",
"minimax_cn": "minimax-cn",
"claude": "anthropic",
@@ -543,9 +587,26 @@ _PROVIDER_ALIASES = {
"huggingface-hub": "huggingface",
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
"grok": "xai",
"x-ai": "xai",
"x.ai": "xai",
}
def get_default_model_for_provider(provider: str) -> str:
"""Return the default model for a provider, or empty string if unknown.
Uses the first entry in _PROVIDER_MODELS as the default. This is the
model a user would be offered first in the ``hermes model`` picker.
Used as a fallback when the user has configured a provider but never
selected a model (e.g. ``hermes auth add openai-codex`` without
``hermes model``).
"""
models = _PROVIDER_MODELS.get(provider, [])
return models[0] if models else ""
def _openrouter_model_is_free(pricing: Any) -> bool:
"""Return True when both prompt and completion pricing are zero."""
if not isinstance(pricing, dict):
@@ -615,13 +676,6 @@ def model_ids(*, force_refresh: bool = False) -> list[str]:
return [mid for mid, _ in fetch_openrouter_models(force_refresh=force_refresh)]
def menu_labels(*, force_refresh: bool = False) -> list[str]:
"""Return display labels like 'anthropic/claude-opus-4.6 (recommended)'."""
labels = []
for mid, desc in fetch_openrouter_models(force_refresh=force_refresh):
labels.append(f"{mid} ({desc})" if desc else mid)
return labels
# ---------------------------------------------------------------------------
@@ -821,23 +875,20 @@ def list_available_providers() -> list[dict[str, str]]:
Each dict has ``id``, ``label``, and ``aliases``.
Checks which providers have valid credentials configured.
Derives the provider list from :data:`CANONICAL_PROVIDERS` (single
source of truth shared with ``hermes model``, ``/model``, etc.).
"""
# Canonical providers in display order
_PROVIDER_ORDER = [
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"gemini", "huggingface",
"zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"qwen-oauth", "xiaomi",
"opencode-zen", "opencode-go",
"ai-gateway", "deepseek", "custom",
]
# Derive display order from canonical list + custom
provider_order = [p.slug for p in CANONICAL_PROVIDERS] + ["custom"]
# Build reverse alias map
aliases_for: dict[str, list[str]] = {}
for alias, canonical in _PROVIDER_ALIASES.items():
aliases_for.setdefault(canonical, []).append(alias)
result = []
for pid in _PROVIDER_ORDER:
for pid in provider_order:
label = _PROVIDER_LABELS.get(pid, pid)
alias_list = aliases_for.get(pid, [])
# Check if this provider has credentials available
@@ -1772,6 +1823,17 @@ def validate_requested_model(
"message": None,
}
# Auto-correct if the top match is very similar (e.g. typo)
auto = get_close_matches(requested_for_lookup, api_models, n=1, cutoff=0.9)
if auto:
return {
"accepted": True,
"persist": True,
"recognized": True,
"corrected_model": auto[0],
"message": f"Auto-corrected `{requested}` → `{auto[0]}`",
}
suggestions = get_close_matches(requested, api_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
@@ -1809,6 +1871,45 @@ def validate_requested_model(
"message": message,
}
# OpenAI Codex has its own catalog path; /v1/models probing is not the right validation path.
if normalized == "openai-codex":
try:
codex_models = provider_model_ids("openai-codex")
except Exception:
codex_models = []
if codex_models:
if requested_for_lookup in set(codex_models):
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
# Auto-correct if the top match is very similar (e.g. typo)
auto = get_close_matches(requested_for_lookup, codex_models, n=1, cutoff=0.9)
if auto:
return {
"accepted": True,
"persist": True,
"recognized": True,
"corrected_model": auto[0],
"message": f"Auto-corrected `{requested}` → `{auto[0]}`",
}
suggestions = get_close_matches(requested_for_lookup, codex_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
f"It may still work if your account has access to it."
f"{suggestion_text}"
),
}
# Probe the live API to check if the model actually exists
api_models = fetch_api_models(api_key, base_url)
@@ -1826,6 +1927,18 @@ def validate_requested_model(
# the user may have access to models not shown in the public
# listing (e.g. Z.AI Pro/Max plans can use glm-5 on coding
# endpoints even though it's not in /models). Warn but allow.
# Auto-correct if the top match is very similar (e.g. typo)
auto = get_close_matches(requested_for_lookup, api_models, n=1, cutoff=0.9)
if auto:
return {
"accepted": True,
"persist": True,
"recognized": True,
"corrected_model": auto[0],
"message": f"Auto-corrected `{requested}` → `{auto[0]}`",
}
suggestions = get_close_matches(requested, api_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
+2
View File
@@ -33,7 +33,9 @@ PLATFORMS: OrderedDict[str, PlatformInfo] = OrderedDict([
("dingtalk", PlatformInfo(label="💬 DingTalk", default_toolset="hermes-dingtalk")),
("feishu", PlatformInfo(label="🪽 Feishu", default_toolset="hermes-feishu")),
("wecom", PlatformInfo(label="💬 WeCom", default_toolset="hermes-wecom")),
("wecom_callback", PlatformInfo(label="💬 WeCom Callback", default_toolset="hermes-wecom-callback")),
("weixin", PlatformInfo(label="💬 Weixin", default_toolset="hermes-weixin")),
("qqbot", PlatformInfo(label="💬 QQBot", default_toolset="hermes-qqbot")),
("webhook", PlatformInfo(label="🔗 Webhook", default_toolset="hermes-webhook")),
("api_server", PlatformInfo(label="🌐 API Server", default_toolset="hermes-api-server")),
])
+107 -11
View File
@@ -31,7 +31,6 @@ import importlib
import importlib.metadata
import importlib.util
import logging
import os
import sys
import types
from dataclasses import dataclass, field
@@ -263,6 +262,53 @@ class PluginContext:
self._manager._hooks.setdefault(hook_name, []).append(callback)
logger.debug("Plugin %s registered hook: %s", self.manifest.name, hook_name)
# -- skill registration -------------------------------------------------
def register_skill(
self,
name: str,
path: Path,
description: str = "",
) -> None:
"""Register a read-only skill provided by this plugin.
The skill becomes resolvable as ``'<plugin_name>:<name>'`` via
``skill_view()``. It does **not** enter the flat
``~/.hermes/skills/`` tree and is **not** listed in the system
prompt's ``<available_skills>`` index — plugin skills are
opt-in explicit loads only.
Raises:
ValueError: if *name* contains ``':'`` or invalid characters.
FileNotFoundError: if *path* does not exist.
"""
from agent.skill_utils import _NAMESPACE_RE
if ":" in name:
raise ValueError(
f"Skill name '{name}' must not contain ':' "
f"(the namespace is derived from the plugin name "
f"'{self.manifest.name}' automatically)."
)
if not name or not _NAMESPACE_RE.match(name):
raise ValueError(
f"Invalid skill name '{name}'. Must match [a-zA-Z0-9_-]+."
)
if not path.exists():
raise FileNotFoundError(f"SKILL.md not found at {path}")
qualified = f"{self.manifest.name}:{name}"
self._manager._plugin_skills[qualified] = {
"path": path,
"plugin": self.manifest.name,
"bare_name": name,
"description": description,
}
logger.debug(
"Plugin %s registered skill: %s",
self.manifest.name, qualified,
)
# ---------------------------------------------------------------------------
# PluginManager
@@ -279,6 +325,8 @@ class PluginManager:
self._context_engine = None # Set by a plugin via register_context_engine()
self._discovered: bool = False
self._cli_ref = None # Set by CLI after plugin discovery
# Plugin skill registry: qualified name → metadata dict.
self._plugin_skills: Dict[str, Dict[str, Any]] = {}
# -----------------------------------------------------------------------
# Public
@@ -555,6 +603,28 @@ class PluginManager:
)
return result
# -----------------------------------------------------------------------
# Plugin skill lookups
# -----------------------------------------------------------------------
def find_plugin_skill(self, qualified_name: str) -> Optional[Path]:
"""Return the ``Path`` to a plugin skill's SKILL.md, or ``None``."""
entry = self._plugin_skills.get(qualified_name)
return entry["path"] if entry else None
def list_plugin_skills(self, plugin_name: str) -> List[str]:
"""Return sorted bare names of all skills registered by *plugin_name*."""
prefix = f"{plugin_name}:"
return sorted(
e["bare_name"]
for qn, e in self._plugin_skills.items()
if qn.startswith(prefix)
)
def remove_plugin_skill(self, qualified_name: str) -> None:
"""Remove a stale registry entry (silently ignores missing keys)."""
self._plugin_skills.pop(qualified_name, None)
# ---------------------------------------------------------------------------
# Module-level singleton & convenience functions
@@ -584,18 +654,44 @@ def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
return get_plugin_manager().invoke_hook(hook_name, **kwargs)
def get_plugin_tool_names() -> Set[str]:
"""Return the set of tool names registered by plugins."""
return get_plugin_manager()._plugin_tool_names
def get_pre_tool_call_block_message(
tool_name: str,
args: Optional[Dict[str, Any]],
task_id: str = "",
session_id: str = "",
tool_call_id: str = "",
) -> Optional[str]:
"""Check ``pre_tool_call`` hooks for a blocking directive.
def get_plugin_cli_commands() -> Dict[str, dict]:
"""Return CLI commands registered by general plugins.
Plugins that need to enforce policy (rate limiting, security
restrictions, approval workflows) can return::
Returns a dict of ``{name: {help, setup_fn, handler_fn, ...}}``
suitable for wiring into argparse subparsers.
{"action": "block", "message": "Reason the tool was blocked"}
from their ``pre_tool_call`` callback. The first valid block
directive wins. Invalid or irrelevant hook return values are
silently ignored so existing observer-only hooks are unaffected.
"""
return dict(get_plugin_manager()._cli_commands)
hook_results = invoke_hook(
"pre_tool_call",
tool_name=tool_name,
args=args if isinstance(args, dict) else {},
task_id=task_id,
session_id=session_id,
tool_call_id=tool_call_id,
)
for result in hook_results:
if not isinstance(result, dict):
continue
if result.get("action") != "block":
continue
message = result.get("message")
if isinstance(message, str) and message:
return message
return None
def get_plugin_context_engine():
@@ -622,7 +718,7 @@ def get_plugin_toolsets() -> List[tuple]:
toolset_tools: Dict[str, List[str]] = {}
toolset_plugin: Dict[str, LoadedPlugin] = {}
for tool_name in manager._plugin_tool_names:
entry = registry._tools.get(tool_name)
entry = registry.get_entry(tool_name)
if not entry:
continue
ts = entry.toolset
@@ -631,7 +727,7 @@ def get_plugin_toolsets() -> List[tuple]:
# Map toolsets back to the plugin that registered them
for _name, loaded in manager._plugins.items():
for tool_name in loaded.tools_registered:
entry = registry._tools.get(tool_name)
entry = registry.get_entry(tool_name)
if entry and entry.toolset in toolset_tools:
toolset_plugin.setdefault(entry.toolset, loaded)
+10
View File
@@ -459,6 +459,16 @@ def create_profile(
dst.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(src, dst)
# Seed a default SOUL.md so the user has a file to customize immediately.
# Skipped when the profile already has one (from --clone / --clone-all).
soul_path = profile_dir / "SOUL.md"
if not soul_path.exists():
try:
from hermes_cli.default_soul import DEFAULT_SOUL_MD
soul_path.write_text(DEFAULT_SOUL_MD, encoding="utf-8")
except Exception:
pass # best-effort — don't fail profile creation over this
return profile_dir
+10
View File
@@ -136,6 +136,11 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
transport="openai_chat",
base_url_env_var="XIAOMI_BASE_URL",
),
"arcee": HermesOverlay(
transport="openai_chat",
base_url_override="https://api.arcee.ai/api/v1",
base_url_env_var="ARCEE_BASE_URL",
),
}
@@ -179,6 +184,7 @@ ALIASES: Dict[str, str] = {
# kimi-for-coding (models.dev ID)
"kimi": "kimi-for-coding",
"kimi-coding": "kimi-for-coding",
"kimi-coding-cn": "kimi-for-coding",
"moonshot": "kimi-for-coding",
# minimax-cn
@@ -230,6 +236,10 @@ ALIASES: Dict[str, str] = {
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
# arcee
"arcee-ai": "arcee",
"arceeai": "arcee",
# Local server aliases → virtual "local" concept (resolved via user config)
"lmstudio": "lmstudio",
"lm-studio": "lmstudio",
+64 -10
View File
@@ -26,7 +26,7 @@ from hermes_cli.auth import (
resolve_external_process_provider_credentials,
has_usable_secret,
)
from hermes_cli.config import load_config
from hermes_cli.config import get_compatible_custom_providers, load_config
from hermes_constants import OPENROUTER_BASE_URL
@@ -275,14 +275,59 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
return None
config = load_config()
# First check providers: dict (new-style user-defined providers)
providers = config.get("providers")
if isinstance(providers, dict):
for ep_name, entry in providers.items():
if not isinstance(entry, dict):
continue
# Match exact name or normalized name
name_norm = _normalize_custom_provider_name(ep_name)
# Resolve the API key from the env var name stored in key_env
key_env = str(entry.get("key_env", "") or "").strip()
resolved_api_key = os.getenv(key_env, "").strip() if key_env else ""
# Fall back to inline api_key when key_env is absent or unresolvable
if not resolved_api_key:
resolved_api_key = str(entry.get("api_key", "") or "").strip()
if requested_norm in {ep_name, name_norm, f"custom:{name_norm}"}:
# Found match by provider key
base_url = entry.get("api") or entry.get("url") or entry.get("base_url") or ""
if base_url:
return {
"name": entry.get("name", ep_name),
"base_url": base_url.strip(),
"api_key": resolved_api_key,
"model": entry.get("default_model", ""),
}
# Also check the 'name' field if present
display_name = entry.get("name", "")
if display_name:
display_norm = _normalize_custom_provider_name(display_name)
if requested_norm in {display_name, display_norm, f"custom:{display_norm}"}:
# Found match by display name
base_url = entry.get("api") or entry.get("url") or entry.get("base_url") or ""
if base_url:
return {
"name": display_name,
"base_url": base_url.strip(),
"api_key": resolved_api_key,
"model": entry.get("default_model", ""),
}
# Fall back to custom_providers: list (legacy format)
custom_providers = config.get("custom_providers")
if not isinstance(custom_providers, list):
if isinstance(custom_providers, dict):
logger.warning(
"custom_providers in config.yaml is a dict, not a list. "
"Each entry must be prefixed with '-' in YAML. "
"Run 'hermes doctor' for details."
)
if isinstance(custom_providers, dict):
logger.warning(
"custom_providers in config.yaml is a dict, not a list. "
"Each entry must be prefixed with '-' in YAML. "
"Run 'hermes doctor' for details."
)
return None
custom_providers = get_compatible_custom_providers(config)
if not custom_providers:
return None
for entry in custom_providers:
@@ -294,13 +339,21 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
continue
name_norm = _normalize_custom_provider_name(name)
menu_key = f"custom:{name_norm}"
if requested_norm not in {name_norm, menu_key}:
provider_key = str(entry.get("provider_key", "") or "").strip()
provider_key_norm = _normalize_custom_provider_name(provider_key) if provider_key else ""
provider_menu_key = f"custom:{provider_key_norm}" if provider_key_norm else ""
if requested_norm not in {name_norm, menu_key, provider_key_norm, provider_menu_key}:
continue
result = {
"name": name.strip(),
"base_url": base_url.strip(),
"api_key": str(entry.get("api_key", "") or "").strip(),
}
key_env = str(entry.get("key_env", "") or "").strip()
if key_env:
result["key_env"] = key_env
if provider_key:
result["provider_key"] = provider_key
api_mode = _parse_api_mode(entry.get("api_mode"))
if api_mode:
result["api_mode"] = api_mode
@@ -342,6 +395,7 @@ def _resolve_named_custom_runtime(
api_key_candidates = [
(explicit_api_key or "").strip(),
str(custom_provider.get("api_key", "") or "").strip(),
os.getenv(str(custom_provider.get("key_env", "") or "").strip(), "").strip(),
os.getenv("OPENAI_API_KEY", "").strip(),
os.getenv("OPENROUTER_API_KEY", "").strip(),
]
@@ -557,7 +611,7 @@ def _resolve_explicit_runtime(
base_url = explicit_base_url
if not base_url:
if provider == "kimi-coding":
if provider in ("kimi-coding", "kimi-coding-cn"):
creds = resolve_api_key_provider_credentials(provider)
base_url = creds.get("base_url", "").rstrip("/")
else:

Some files were not shown because too many files have changed in this diff Show More