Compare commits

...

558 Commits

Author SHA1 Message Date
ethernet 2b4062f964 fix(ci): upload artifacts for lint-reports 2026-05-06 17:13:38 -04:00
ethernet 1dba69095d fix(ci): diff typecheck fixes against PR branch-off point 2026-05-06 17:05:00 -04:00
ethernet c96eb06dc4 fix: TerminalMenu returntype is int when multiselect is false 2026-05-06 17:05:00 -04:00
ethernet f6cfff4ac6 fix: bad import of DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT 2026-05-06 17:05:00 -04:00
brooklyn! f1a8e99942 fix(tui): honor skin highlight colors (#20895) 2026-05-06 14:01:56 -07:00
brooklyn! da6019820a fix(tui): refresh virtual offsets after row resize (#20898) 2026-05-06 13:54:46 -07:00
brooklyn! 5044e1cbf1 fix(cli): submit LF enter in thin PTYs (#20896) 2026-05-06 13:51:13 -07:00
Teknium d8b85bfd1c chore: add guillaumemeyer to AUTHOR_MAP
For cherry-picked commits in PR #20801.
2026-05-06 13:39:43 -07:00
Guillaume Meyer 7df6115199 feat(gateway): also gate pre-restart "Gateway restarting" notification
Extend the gateway_restart_notification flag to cover
_notify_active_sessions_of_shutdown — the message that fires just
before drain ("⚠️ Gateway restarting — Your current task will be
interrupted. Send any message after restart and I'll try to resume
where you left off.") sent to active sessions and home channels.

Same operator/end-user reasoning: on a Slack workspace shared with
end users, "Gateway restarting" reads as "the bot is broken" — the
operator should be able to suppress it consistently with the other
two lifecycle pings rather than having a partial opt-out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 13:39:43 -07:00
Guillaume Meyer b71f80e6ce feat(gateway): per-platform gateway_restart_notification flag
Adds an opt-out toggle on PlatformConfig that gates both restart
lifecycle pings: the "♻ Gateway restarted" message sent to the chat
that issued /restart, and the "♻️ Gateway online" home-channel
startup notification. Defaults to True so existing deployments are
unaffected.

The motivating split is operator vs. end-user surfaces: a back-channel
like Telegram should keep these pings, while a Slack workspace shared
with end users should not surface gateway lifecycle noise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 13:39:43 -07:00
Teknium 33bf5f6292 fix(auth): fall back to global-root auth.json for providers missing in profile
Profile processes (kanban workers, cron subprocesses, delegated subagents)
read the profile's auth.json only. If a provider was authenticated at the
global root but not inside the profile, the profile's credential_pool
comes back empty and the process fails with 'No LLM provider configured'
— even though the credentials are sitting in ~/.hermes/auth.json. #18594
propagated HERMES_HOME correctly, which is what surfaced this: workers
now land in the right profile, and the profile turns out to shadow global
with no fallback.

Semantics (read-only, per-provider shadowing):
* Profile has any entries for provider X → use profile only (global ignored).
* Profile has zero entries for provider X → fall back to global.
* Writes (write_credential_pool, _save_auth_store) still target the profile.
* Classic mode (HERMES_HOME == global root) skips the fallback entirely —
  _global_auth_file_path() returns None.

Also mirrors the fallback in get_provider_auth_state so OAuth singletons
(nous, minimax-oauth, openai-codex, spotify) inherit cleanly — the Nous
shared-token store (PR #19712) remains the authoritative path for Nous
OAuth rotation, this just makes the read side consistent with it.

Seat belt: _load_global_auth_store() refuses to read the real user's
~/.hermes/auth.json under PYTEST_CURRENT_TEST even when HERMES_HOME points
to a profile-shaped path. Guard uses $HOME (stable across fixtures) rather
than Path.home() (which fixtures often monkeypatch to a tmp root).

Reported by @SeedsForbidden on Twitter as the credential_pool shadowing
follow-up to the #18594 fix.
2026-05-06 13:29:54 -07:00
Teknium d514dd4055 docs(tool-gateway): rewrite as pitch-first marketing page (#20827)
Previous version read like internal API docs \u2014 leading with env var tables,
config YAML, and 'precedence' rules before ever explaining the product.
Complete rewrite inverts the structure so readers see value first,
mechanics last.

Structure now:
- Lede: 'One subscription. Every tool built in.' + pitch paragraph
- CTA: subscribe/manage button styled as a real call-to-action
- What's included: emoji-led table with expanded descriptions per tool.
  Image gen lists all 9 models by name (FLUX 2 Klein/Pro, Z-Image Turbo,
  Nano Banana Pro, GPT Image 1.5/2, Ideogram V3, Recraft V4 Pro, Qwen)
- Why it's here: value bullets \u2014 one bill, one signup, one key, same
  quality, bring-your-own anytime
- Get started: two-command flow (hermes model \u2192 hermes status)
- Eligibility: paid-tier note with upgrade link
- Mix and match: three realistic usage patterns
- Using individual image models: ID reference table for power users
- --- separator ---
- Configuration reference (demoted): use_gateway flag, disabling,
  self-hosted gateway env vars moved below the fold where they belong
- FAQ: streamlined, removed redundant content

Fact-checked against code:
- 9 FAL models confirmed from tools/image_generation_tool.py FAL_MODELS
- Status section output verified against hermes_cli/status.py
- Portal subscription URL preserved
- Self-hosted env vars (TOOL_GATEWAY_DOMAIN etc.) kept accurate

Verified: docusaurus build SUCCESS, page renders, no new broken links.
2026-05-06 13:20:09 -07:00
asheriif 946ef0ea19 fix(tui): bound virtual history offset searches 2026-05-06 11:57:01 -07:00
ethernet a345f7b6e5 Merge pull request #19908 from NousResearch/typecheck
change: enable ruff/ty
2026-05-06 14:43:14 -04:00
kshitijk4poor a2ff193050 chore: follow-up cleanup for Kanban migration fix
- Expand migration comment to name the primary failure mode (missing
  column OperationalError from #20842) ahead of the secondary SQLite
  schema-reparse concern; also document the stale-cols-snapshot invariant
- Add clarifying comments on from_row() legacy fallback branches noting
  they are belt-and-suspenders dead code post-migration
- Add task_events comment in existing test explaining why the table is
  required by the migrator
- Add test_legacy_migration_no_legacy_columns_at_all: Scenario A —
  explicitly asserts the exact #20842 crash no longer occurs and that
  consecutive_failures defaults to 0 on a DB that never had spawn_failures
- Add test_legacy_migration_both_columns_already_present: Scenario D —
  asserts the migration is a no-op when both columns already exist,
  preserving the existing counter value
2026-05-06 11:25:16 -07:00
helix4u b1d420e75f fix(kanban): avoid fragile failure-column renames 2026-05-06 11:25:16 -07:00
kshitijk4poor 28299afc21 chore: follow-up cleanup for Feishu topic thread fix
- Remove dead metadata.get('reply_to') fallback in _send_raw_message;
  nothing in the codebase ever sets 'reply_to' inside a metadata dict —
  the key only appears as a top-level send_voice() keyword argument
- Simplify _status_thread_metadata construction in run.py to use a
  single dict literal instead of create-then-mutate pattern; the
  or-{} guard was dead since source.thread_id implies _progress_thread_id
  is also set for Feishu
- Add yuqian@zmetasoft.com to AUTHOR_MAP for contributor attribution
2026-05-06 10:52:51 -07:00
Yuqian 441ef75d15 fix(feishu): keep topic replies in threads
Route Feishu topic progress, status, approval, stream, and fallback messages through threaded replies by preserving the originating message id as the reply target. Add regressions for tool progress topic metadata and Feishu metadata-driven reply routing.
2026-05-06 10:52:51 -07:00
kshitij 48c241840a docs: add Web Search + Extract feature page with SearXNG setup guide 2026-05-06 10:20:05 -07:00
kshitij 94016dd1aa docs+skill: add searxng-search optional skill and documentation
Closes the remaining gaps from PR #11562 that weren't covered by the
core SearXNG integration landed in #20823.

- optional-skills/research/searxng-search/ — installable skill with
  SKILL.md (curl-based usage, category support, Python example) and
  searxng.sh helper script for health checks and instance queries
- website/docs/user-guide/configuration.md — SearXNG added to the
  Web Search Backends section (5 backends, backend table, per-capability
  split config example, correct search-only note)
- website/docs/reference/environment-variables.md — SEARXNG_URL row
- website/docs/reference/optional-skills-catalog.md — searxng-search entry

The core SearXNG code, OPTIONAL_ENV_VARS, hermes tools picker, and tests
were already on main via #20823.  This commit is purely additive docs +
the optional skill scaffold.

Credits from #11562 salvage:
  @w4rum — original _searxng_search structure
  @nathansdev — tools_config.py integration
  @moyomartin — category support and result formatting
  @0xMihai — config/env var approach
  @nicobailon — skill and documentation structure
  @searxng-fan — error handling patterns
  @local-first — self-hosted-first philosophy and docs
2026-05-06 10:15:56 -07:00
kshitij 5c906d7026 feat(web): add SearXNG as a native search-only backend
Adds SearXNG as a free, self-hosted web search provider.  SearXNG is a
privacy-respecting metasearch engine that requires no API key — just a
running instance and SEARXNG_URL pointing at it.

## What this adds

- `tools/web_providers/searxng.py` — `SearXNGSearchProvider` implementing
  `WebSearchProvider` (search only; no extract capability)
- `_is_backend_available("searxng")` — gates on SEARXNG_URL
- `_get_backend()` — accepts "searxng" as a configured value; adds it to
  auto-detect candidates (lower priority than paid services)
- `web_search_tool` — dispatches to SearXNG when it is the active backend
- `check_web_api_key()` — includes SearXNG in availability check
- `OPTIONAL_ENV_VARS["SEARXNG_URL"]` — registered with tools=["web_search"]
- `tools_config.py` — SearXNG appears in the `hermes tools` provider picker
- `nous_subscription.py` — `direct_searxng` detection, web_active / web_available
- `setup.py` — SEARXNG_URL listed in the missing-credential hint
- 23 tests covering: is_configured, happy-path search, score sorting, limit,
  HTTP/request errors, _is_backend_available, _get_backend, check_web_api_key

## Config

```yaml
# Use SearXNG for search, any paid provider for extract
web:
  search_backend: "searxng"
  extract_backend: "firecrawl"

# Or: SearXNG as the sole backend (web_extract will use the next available)
web:
  backend: "searxng"
```

SearXNG is search-only — it does not implement WebExtractProvider.  Users
who only configure SEARXNG_URL get web_search available; web_extract falls
back to the next available extract provider (or is unavailable if none).

Closes #19198 (Phase 2 Task 4 — SearXNG provider)
Ref: #11562 (original SearXNG PR)
2026-05-06 10:05:29 -07:00
kshitij cd2cbc73b7 refactor(web): per-capability backend selection for search/extract split
Introduce the foundation for independently selecting web search and
extract backends — enabling future combinations like SearXNG for
search + Firecrawl for extract.

Architecture:
- tools/web_providers/base.py: WebSearchProvider and WebExtractProvider
  ABCs with normalized result contracts (mirrors CloudBrowserProvider)
- tools/web_tools.py: _get_search_backend() and _get_extract_backend()
  read per-capability config keys, fall through to shared web.backend
- hermes_cli/config.py: web.search_backend and web.extract_backend in
  DEFAULT_CONFIG (empty = inherit from web.backend)

Behavioral change:
- web_search_tool() now dispatches via _get_search_backend()
- web_extract_tool() now dispatches via _get_extract_backend()
- When per-capability keys are empty (default), behavior is identical
  to before — _get_search_backend() falls through to _get_backend()

This is purely structural — no new backends are added. SearXNG and
other search-only/extract-only providers can now be added as simple
drop-in modules in follow-up PRs.

12 new tests, 49 existing tests pass with zero regressions.

Ref: #19198
2026-05-06 09:16:25 -07:00
Teknium 6388aafbd6 feat(dashboard): add 'default-large' built-in theme with 18px base size (#20820)
Same Hermes Teal palette as the default theme, but with baseSize 18px,
lineHeight 1.65, and spacious density so the whole dashboard scales up.
Gives users a one-click bigger-text preset and a copyable reference for
authoring custom YAML themes with their own typography settings.
2026-05-06 09:10:44 -07:00
Teknium a24789d738 fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802)
OpenCode Go and OpenCode Zen are flat-namespace model resellers — their
/v1/models returns bare IDs (deepseek-v4-flash, minimax-m2.7), and the
inference API rejects vendor-prefixed names with HTTP 401 'Model not
supported'. Two bugs fixed:

1. `switch_model` in hermes_cli/model_switch.py was silently switching the
   user off opencode-go to native deepseek when they typed
   `/model deepseek-v4-flash`. Step d found the model in opencode-go's live
   catalog, but step e (detect_provider_for_model) still ran and matched
   the bare name against deepseek's static catalog. Fix: track whether
   the live catalog resolved it; skip step e when it did.

2. `normalize_model_for_provider` in hermes_cli/model_normalize.py only
   stripped the exact `opencode-zen/` prefix, leaving arbitrary vendor
   prefixes like `minimax/minimax-m2.7` (commonly copied from aggregator
   slugs into fallback_model configs) intact — causing HTTP 401s when
   the fallback chain activated. Fix: opencode-go/opencode-zen strip ANY
   leading vendor prefix because their APIs are flat-namespace.

Tests: 11 new cases in tests/hermes_cli/test_opencode_go_flat_namespace.py
covering both normalization (prefix stripping, regression guards for
opencode-zen Claude hyphenation and openrouter vendor-prepending) and
switch_model (bare-name resolution on opencode-go's live catalog must
not trigger cross-provider hijack).

Reported by @Ufonik via Discord; Kimi K2.6 always worked because moonshotai
has no overlapping entry in a native provider's static catalog. Deepseek
and minimax failed because their v4/v2.7 names existed in the native
deepseek/minimax catalogs.
2026-05-06 09:08:33 -07:00
Teknium 773cf48c50 docs(plugins): close the gaps \u2014 image-gen-provider-plugin guide + publishing a skill tap (#20800)
Two pluggable surfaces were mentioned in the interfaces map without a
real authoring guide behind them:

1. **Image-gen backends** — only had 'See bundled examples' pointers.
   Now a full developer-guide/image-gen-provider-plugin.md (270 lines)
   mirroring the memory/context/model provider docs:
   - How discovery works, directory structure, plugin.yaml
   - ImageGenProvider ABC with every overridable method
     (name, display_name, is_available, list_models, default_model,
     get_setup_schema, generate)
   - Full authoring walkthrough with a working MyBackendImageGenProvider
   - Response-format reference (success_response / error_response)
   - Handling b64 vs URL output (save_b64_image helper)
   - User overrides at ~/.hermes/plugins/image_gen/<name>/
   - Testing recipe + pip distribution
   - Reference examples (openai, openai-codex, xai)

2. **Skill taps** — features/skills.md mentioned the CLI commands but
   never explained the repo contract for publishing a tap. Added
   'Publishing a custom skill tap' section under Skills Hub covering:
   - Repo layout (skills/<name>/SKILL.md by default)
   - Minimal working example
   - Non-default path configuration (taps.json)
   - Installing individual skills without subscribing
   - Trust-level handling
   - Full tap management CLI + in-session /skills tap commands

Wired into:
- website/sidebars.ts: image-gen-provider-plugin added to Extending group
- website/docs/user-guide/features/plugins.md: pluggable interfaces
  table + 'What plugins can do' table now link to the real guides
  instead of 'See bundled examples'
- website/docs/guides/build-a-hermes-plugin.md: top info map and
  inline sub-sections updated, 'Full guide:' line added to
  image-gen block, tap section mentions publishing

Verified: docusaurus build SUCCESS, new page renders at
/docs/developer-guide/image-gen-provider-plugin, anchor
#publishing-a-custom-skill-tap resolves from plugins.md +
build-a-hermes-plugin.md. Pre-existing zh-Hans broken links unchanged.
2026-05-06 08:40:05 -07:00
Teknium ad7aad251c feat(skills/linear): add Documents support + Python helper script (#20752)
* feat(skills/linear): add Documents support + Python helper script

The bundled Linear skill (PR #1230) covered issues, projects, teams, and
workflow states via curl. It had no coverage for Linear's Documents API,
so fetching an RFC/doc from a linear.app URL required hand-writing
GraphQL against an underdocumented schema.

Adds:
- Documents section in SKILL.md explaining slugId extraction from URLs,
  the contentState (markdown) vs contentState (ProseMirror) split, and
  four canonical curl examples (fetch by slugId, fetch by UUID, list
  recent, title-search).
- scripts/linear_api.py — stdlib-only Python CLI wrapping the most
  common operations (whoami, list-teams, list/get/search/create/update
  issues, add-comment, update-status, list/get/search documents, raw
  GraphQL passthrough). Zero deps, reads LINEAR_API_KEY from env.

Auth header quirk (personal key takes bare $LINEAR_API_KEY, no Bearer
prefix) is already documented in the skill.

Found during RFC review: the existing skill's lack of document support
forced falling back to the browser (which hit Linear's login wall).
Also fixes a schema gotcha — the Document field is `contentState`, not
`contentData` (which returns 400).

Tested end-to-end against the production API:
  python3 linear_api.py whoami
  python3 linear_api.py get-document 38359beef67c
Both return expected payloads.

* fix(skills/linear): point LINEAR_API_KEY setup to the correct page

The org-level Settings > API page (/settings/api) only shows OAuth apps
and workspace-member keys. Personal API keys live under Account,
Security, access (/settings/account/security). Update both the setup
link in config.py (shown during hermes setup) and the setup step in
SKILL.md so users land on the page that can create a personal key.
2026-05-06 08:27:21 -07:00
ethernet 9627ee70e5 feat(ci): add typecheck (warnings only in CI) 2026-05-06 10:58:12 -04:00
ethernet 63c51d8962 change: enable ruff/ty 2026-05-06 10:45:25 -04:00
Teknium b62a82e0c3 docs: pluggable surfaces coverage — model-provider guide, full plugin map, opt-in fix (#20749)
* docs(providers): add model-provider-plugin authoring guide + fix stale refs

New docs:
- website/docs/developer-guide/model-provider-plugin.md — full authoring
  guide (directory layout, minimal example, ProviderProfile fields,
  overridable hooks, user overrides, api_mode selection, auth types,
  testing, pip distribution)
- Wired into website/sidebars.ts under 'Extending'
- Cross-references added in:
  - guides/build-a-hermes-plugin.md (tip block)
  - developer-guide/adding-providers.md
  - developer-guide/provider-runtime.md

User guide:
- user-guide/features/plugins.md: Plugin types table grows from 3 to 4
  with 'Model providers' row

Stale comment cleanup (providers/*.py → plugins/model-providers/<name>/):
- hermes_cli/main.py:_is_profile_api_key_provider docstring
- hermes_cli/doctor.py:_build_apikey_providers_list docstring
- hermes_cli/auth.py: PROVIDER_REGISTRY + alias auto-extension comments
- hermes_cli/models.py: CANONICAL_PROVIDERS auto-extension comment

AGENTS.md:
- Project-structure tree: added plugins/model-providers/ row
- New section: 'Model-provider plugins' explaining discovery, override
  semantics, PluginManager integration, kind auto-coerce heuristic

Verified: docusaurus build succeeds, new page renders, all 3 cross-links
resolve. 347/347 targeted tests pass (tests/providers/,
tests/hermes_cli/test_plugins.py, tests/hermes_cli/test_runtime_provider_resolution.py,
tests/run_agent/test_provider_parity.py).

* docs(plugins): add 'pluggable interfaces at a glance' maps to plugins.md + build-a-hermes-plugin

Devs landing on either the user-guide plugin page or the build-a-plugin
guide now get an upfront table of every distinct pluggable surface with
a link to the right authoring doc. Previously they'd have to read the
full general-plugin guide to discover that model providers / platforms
/ memory / context engines are separate systems.

user-guide/features/plugins.md:
- New 'Pluggable interfaces — where to go for each' section below the
  existing 4-kinds table
- 10 rows covering every register_* surface (tool, hook, slash command,
  CLI subcommand, skill, model provider, platform, memory, context
  engine, image-gen)
- Explicit note: TTS/STT are NOT plugin-extensible yet — documented
  with a pointer to the current config.yaml 'command providers' pattern
  and a note that register_tts_provider()/register_stt_provider() may
  come later

guides/build-a-hermes-plugin.md:
- New :::info 'Not sure which guide you need?' map at the top so devs
  see all pluggable interfaces before investing in this 737-line
  general-plugin walkthrough
- Existing bottom :::tip expanded to include platform adapters alongside
  model/memory/context plugins

Verified:
- All 8 cross-doc links in the new plugins.md table resolve in a
  docusaurus build (SUCCESS, no new broken links)
- TTS link corrected (features/voice → features/tts; latter exists)
- Pre-existing broken links/anchors (cron-script-only, llms.txt,
  adding-platform-adapters#step-by-step-checklist) are unchanged

* docs(plugins): correct TTS/STT pluggability \u2014 they ARE plugins (command-providers)

Previous commit incorrectly said TTS/STT 'aren't plugin-extensible'. They
are, via the config-driven command-provider pattern \u2014 any CLI that reads
text and writes audio (or vice versa for STT) is automatically a plugin
with zero Python. The tts.md docs cover this extensively and I missed it.

plugins.md:
- TTS row: 'Config-driven (not a Python plugin)', points at
  tts.md#custom-command-providers
- STT row: points at tts.md#voice-message-transcription-stt (STT docs
  live in tts.md despite the filename)
- Expanded note: TTS/STT use config-driven shell-command templates as
  their plugin surface (full tts.providers.<name> registry for TTS;
  HERMES_LOCAL_STT_COMMAND escape hatch for STT)
- Any CLI that reads/writes files is automatically a plugin \u2014 no Python
  register_* API needed
- Future register_tts_provider()/register_stt_provider() hooks mentioned
  as nice-to-have for SDK/streaming cases, not as the primary story

build-a-hermes-plugin.md:
- Same map update: TTS/STT rows explicit, footer note corrected

Verified:
- tts.md anchors (custom-command-providers, voice-message-transcription-stt)
  exist and resolve in docusaurus build (SUCCESS, no new broken links)

* docs(plugins): expand pluggable interfaces table with MCP / event hooks / shell hooks / skill taps

Broadened the scope beyond Python register_* hooks. Hermes has MULTIPLE
plugin-style extension surfaces; they're now all in one table instead of
being scattered across feature docs.

Added rows for:
- **MCP servers** — config.yaml mcp_servers.<name> auto-registers external
  tools from any MCP server. Huge extensibility surface, previously not
  linked from the plugin map.
- **Gateway event hooks** — drop HOOK.yaml + handler.py into
  ~/.hermes/hooks/<name>/ to fire on gateway:startup, session:*, agent:*,
  command:* events. Separate from Python plugin hooks.
- **Shell hooks** — hooks: block in config.yaml runs shell commands on
  events (notifications, auditing, etc.).
- **Skill sources (taps)** — hermes skills tap add <repo> to pull in new
  skill registries beyond the built-in sources.

Both docs updated:
- user-guide/features/plugins.md: table column renamed to 'How' (mixes
  Python API + config-driven + drop-in-dir surfaces accurately)
- guides/build-a-hermes-plugin.md: :::info map at top mirrors the new
  surfaces with a forward-link to the consolidated table

Note block rewritten: instead of singling out TTS/STT as the 'different
style' exception, now honestly describes that Hermes deliberately
supports three plugin styles — Python APIs, config-driven commands, and
drop-in manifest directories — and devs should pick the one that fits
their integration.

Not included (considered and rejected):
- Transport layer (register_transport) — internal, not user-facing
- Tool-call parsers — internal, VLLM phase-2 thing
- Cloud browser providers — hardcoded registry, not drop-in yet
- Terminal backends — hardcoded if/elif, not drop-in yet
- Skill sources (the ABC) — hardcoded list, only taps are user-extensible

Verified:
- All 5 new anchors resolve (gateway-event-hooks, shell-hooks, skills-hub,
  custom-command-providers, voice-message-transcription-stt)
- Docusaurus build SUCCESS, zero new broken links
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
  adding-platform-adapters#step-by-step-checklist)

* docs(plugins): cover every pluggable surface in both the overview and how-to

Both plugins.md and build-a-hermes-plugin.md now cover every extension
surface end-to-end \u2014 general plugin APIs, specialized plugin types,
config-driven surfaces \u2014 with concrete authoring patterns for each.

plugins.md:
- 'What plugins can do' table grows from 9 rows (general ctx.register_*
  only) to 14 rows covering register_platform, register_image_gen_provider,
  register_context_engine, MemoryProvider subclass, register_provider
  (model). Each row links to its full authoring guide.
- New 'Plugin sub-categories' section under Plugin Discovery explains
  how plugins/platforms/, plugins/image_gen/, plugins/memory/,
  plugins/context_engine/, plugins/model-providers/ are routed to
  different loaders \u2014 PluginManager vs the per-category own-loader
  systems.
- Explicit mention of user-override semantics at
  ~/.hermes/plugins/model-providers/ and ~/.hermes/plugins/memory/.

build-a-hermes-plugin.md:
- New '## Specialized plugin types' section (5 sub-sections):
  - Model provider plugins \u2014 ProviderProfile + plugin.yaml example,
    auto-wiring summary, link to full guide
  - Platform plugins \u2014 BasePlatformAdapter + register_platform() skeleton
  - Memory provider plugins \u2014 MemoryProvider subclass example
  - Context engine plugins \u2014 ContextEngine subclass example
  - Image-generation backends \u2014 ImageGenProvider + kind: backend example
- New '## Non-Python extension surfaces' section (5 sub-sections):
  - MCP servers \u2014 config.yaml mcp_servers.<name> example
  - Gateway event hooks \u2014 HOOK.yaml + handler.py example
  - Shell hooks \u2014 hooks: block in config.yaml example
  - Skill sources (taps) \u2014 hermes skills tap add example
  - TTS / STT command templates \u2014 tts.providers.<name> with type: command
- Distribute via pip / NixOS promoted from ### to ## (they were orphaned
  after the reorganization)

Each specialized / non-Python section has a concrete, copy-pasteable
example plus a 'Full guide:' link to the authoritative doc. Devs arriving
at the build-a-hermes-plugin guide now see every extension surface at
their disposal, not just the general tool/hook/slash-command surface.

Verified:
- Docusaurus build SUCCESS, zero new broken links
- All new cross-links (developer-guide/model-provider-plugin,
  adding-platform-adapters, memory-provider-plugin, context-engine-plugin,
  user-guide/features/mcp, skills#skills-hub, hooks#gateway-event-hooks,
  hooks#shell-hooks, tts#custom-command-providers,
  tts#voice-message-transcription-stt) resolve
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
  adding-platform-adapters#step-by-step-checklist)

* docs(plugins): fix opt-in inconsistency — not every plugin is gated

The 'Every plugin is disabled by default' statement was wrong. Several
plugin categories intentionally bypass plugins.enabled:

- Bundled platform plugins (IRC, Teams) auto-load so shipped gateway
  channels are available out of the box. Activation per channel is via
  gateway.platforms.<name>.enabled.
- Bundled backends (plugins/image_gen/*) auto-load so the default
  backend 'just works'. Selection via <category>.provider config.
- Memory providers are all discovered; one is active via memory.provider.
- Context engines are all discovered; one is active via context.engine.
- Model providers: all 33 discovered at first get_provider_profile();
  user picks via --provider / config.

The plugins.enabled allow-list specifically gates:
- Standalone plugins (general tools/hooks/slash commands)
- User-installed backends
- User-installed platforms (third-party gateway adapters)
- Pip entry-point backends

Which matches the actual code in hermes_cli/plugins.py:737 where the
bundled+backend/platform check bypasses the allow-list.

Rewrote '## Plugins are opt-in' to:
- Retitle to 'Plugins are opt-in (with a few exceptions)'
- Narrow opening claim to 'General plugins and user-installed backends
  are disabled by default'
- Added 'What the allow-list does NOT gate' subsection with a full
  table of which bypass the gate and how they're activated instead
- Fixed migration section wording (bundled platform/backend plugins
  never needed grandfathering)

Verified: docusaurus build SUCCESS, zero new broken links.
2026-05-06 07:24:42 -07:00
Teknium 90a7adcb2e docs(wsl2): expand Windows (WSL2) guide — filesystem, networking, services, pitfalls (#20748)
Replaces the 22-line stub with a ~320-line guide covering the parts of the
Windows/WSL2 split that specifically affect Hermes users:

- Why WSL2 (and not native Windows)
- Install: distro choice, WSL1→2, systemd via /etc/wsl.conf
- Filesystem boundary: /mnt/c vs \\wsl$, perf/perms/watchers/case,
  wslpath/wslview, CRLF + git core.autocrlf, clone-where guidance
- Networking in both directions:
  - WSL → Windows services: links to the canonical WSL2 Networking section
    in integrations/providers.md (mirrored mode, NAT + host IP, bind addr,
    firewall) instead of duplicating
  - Windows/LAN → Hermes in WSL: mirrored vs NAT, netsh portproxy one-liner,
    firewall rule, webhook tunneling pointer
- Long-running services: systemd gateway + Task Scheduler wsl.exe --exec
  'sleep infinity' to keep the VM alive at login
- GPU passthrough: NVIDIA works, AMD/Intel out of matrix
- Common pitfalls: connection refused, /mnt/c slowness, CRLF ^M,
  UNC warnings, post-sleep clock drift, mirrored-mode DNS with VPN,
  PATH, Defender scanning, VHDX disk reclaim

All internal links use site-absolute /docs/... form (matches the rest of
user-guide/); all seven link targets verified to exist.
2026-05-06 06:45:32 -07:00
Teknium 3ce1233ae4 chore(release): map cleo@edaphic.xyz → curiouscleo
Follow-up to the salvaged fix for /goal ENAMETOOLONG drop — adds
AUTHOR_MAP entry so the release script resolves the commit author to
the correct GitHub user.
2026-05-06 06:34:48 -07:00
Cleo 906881c38b fix(cli): catch OSError in _resolve_attachment_path to prevent ENAMETOOLONG dropping long slash commands
When the user pastes a long slash command like \`/goal <long prose>\` into
\`hermes chat\`, the input flows into \`_detect_file_drop()\`, whose
\`starts_like_path\` prefilter accepts anything starting with \`/\` and
forwards it to \`_resolve_attachment_path()\`. That helper calls
\`Path.exists()\` which invokes \`os.stat()\`, which raises
\`OSError(errno=ENAMETOOLONG)\` — 63 on macOS, 36 on Linux — when the
candidate exceeds NAME_MAX (typically 255 bytes).

The OSError propagates up to the broad \`except Exception\` in
\`process_loop\` (cli.py:11798), gets logged at WARNING level, and the
user's input is silently dropped. From the user's POV the chat prompt
hangs — the only signal is in agent.log:

  WARNING cli: process_loop unhandled error (msg may be lost):
    [Errno 63] File name too long: "/goal Drive the space board..."

This affects any slash command with prose-length arguments — \`/goal\`
in particular but also \`/skill\`, \`/cron\`, custom user commands.

Fix: wrap the \`exists()\`/\`is_file()\` calls in try/except OSError so
structurally-invalid path candidates cleanly return None. The slash-
command dispatch path downstream (cli.py:11718) then handles the
input correctly.

Tests: two new regression cases in test_cli_file_drop.py cover the
original \`/goal\` reproducer and a synthetic long path. All 35 file-
drop tests pass.

Reproducer (without the fix):
  python -c "from cli import _detect_file_drop;
             _detect_file_drop('/goal ' + 'a'*300)"
  → OSError: [Errno 63] File name too long
2026-05-06 06:34:48 -07:00
Teknium a0fedfbb1b feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709)
Replaces the per-directory shadow-repo design with a single shared shadow
git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated
across every working directory the agent has ever touched; a dozen
worktrees of the same project cost near-zero in additional disk.

Why
---
Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/
grow to multi-GB on active machines:

1. Each working directory got its own full shadow git repo — no object
   dedup across projects or across worktrees of the same project.
2. _prune() was a documented no-op: max_snapshots only limited the
   /rollback listing. Loose objects accumulated forever.
3. Defaults: enabled=True, auto_prune=False — users paid the disk cost
   without ever asking for /rollback.

Field report on a single workstation: 847 MB across 47 shadow repos,
mostly redundant clones of the hermes-agent source tree.

Changes
-------
- tools/checkpoint_manager.py: full rewrite. Single bare store, per-project
  refs (refs/hermes/<hash>), per-project indexes (store/indexes/<hash>),
  per-project metadata (store/projects/<hash>.json with workdir +
  created_at + last_touch). On first v2 init, any pre-v2 per-directory
  shadow repos are auto-migrated into legacy-<timestamp>/ so the new
  store starts clean. _prune() now actually rewrites the per-project ref
  to the last max_snapshots commits and runs git gc --prune=now. New
  _enforce_size_cap() drops oldest commits round-robin across projects
  when the store exceeds max_total_size_mb. _drop_oversize_from_index()
  filters any single file larger than max_file_size_mb out of the snapshot.
- hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI
  (status / list / prune / clear / clear-legacy) for managing the store
  outside a session.
- hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20,
  auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10.
  Tightened DEFAULT_EXCLUDES (added target/, *.so/*.dylib/*.dll,
  *.mp4/*.mov, *.zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.).
- run_agent.py / cli.py / gateway/run.py: thread the new kwargs through
  AIAgent and the startup auto_prune hooks.
- Tests rewritten to match v2 storage while keeping backwards-compat
  coverage for the pre-v2 prune path (per-directory shadow repos under
  base/ are still swept correctly for anyone mid-migration).
- Docs updated: user-guide/checkpoints-and-rollback.md explains the
  shared store, new defaults, migration, and the new CLI;
  reference/cli-commands.md documents 'hermes checkpoints'.

E2E validated
-------------
- Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/.
- Object dedup: two projects with an identical shared.py blob resolve to
  7 total objects in the store (v1 would have stored the blob twice).
- max_snapshots=3 actually enforced: after 6 commits, list shows 3.
- Orphan prune: deleting a project's workdir + 'hermes checkpoints prune
  --retention-days 0' removes its ref, index, and metadata; GC reclaims
  the objects.
- max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the
  tracked source code files.
- hermes checkpoints {status,prune,clear,clear-legacy} all work from the
  CLI without an agent running.

Breaking / migration
--------------------
No in-place data migration — legacy per-directory shadow repos are moved
into legacy-<timestamp>/ on first run. Old /rollback history is still
accessible by inspecting the archive with git; run
'hermes checkpoints clear-legacy' to reclaim the space when ready. Users
relying on /rollback must now set checkpoints.enabled=true (or pass
--checkpoints) explicitly.
2026-05-06 05:44:35 -07:00
Teknium b045e7a2ba feat(skills): add shop-app personal shopping assistant (optional) (#20702)
Port Shop.app's upstream SKILL.md (https://shop.app/SKILL.md) into
optional-skills/productivity/shop-app/ with Hermes-native adaptations:

- Proper Hermes frontmatter (name, description<=60 chars, version,
  author, license, prerequisites, metadata.hermes tags + related_skills
  + homepage + upstream)
- Swap Shop.app's bespoke 'message()' tool references for Hermes
  conventions: gateway adapters handle platform formatting, so the
  skill just writes markdown (no Telegram/WhatsApp/iMessage sections
  referencing a tool Hermes doesn't ship)
- Name Hermes tools where relevant: curl via 'terminal', HTML policy
  pages via 'web_extract', try-on via 'image_generate'
- Reframe session state as 'hold in your reasoning context for this
  conversation only' and forbid writing tokens to .env / disk — matches
  Hermes ephemeral-memory discipline
- Drop NO_REPLY convention (Shop-app-runtime specific)
- Trigger-first description so the skill loader picks it up when the
  user wants to search products, track orders, returns, or reorder
2026-05-06 04:47:56 -07:00
helix4u 76074d9ee6 fix(cli): recover classic CLI output after resize 2026-05-06 04:20:54 -07:00
liuguangyong 17687911b7 fix(kanban): reset code element background inside board
The Nous DS globals.css applies a global rule:
  code { background: var(--midground); color: var(--background); }

This paints an opaque cream/yellow fill on every <code> element,
which hides text in the kanban drawer's event-payload, run-meta,
and worker-log panes (all rendered as <code>).

Fix: scope a reset inside .hermes-kanban so <code> elements inherit
their parent's color and stay transparent.
2026-05-06 04:20:52 -07:00
Teknium b1e0ef82f6 chore(release): map liuguangyong@hellobike -> liuguangyong93 2026-05-06 04:20:52 -07:00
Teknium a0556b861f fix(tui): restore gap before duration when verb segment is hidden
The verb-padding change dropped the leading space in durationSegment on
the assumption that the verb's trailing pad always supplies the gap. But
the unicode spinner style sets showVerb=false, making verbSegment an
empty string — in that mode the output would become `{frame}· {duration}`
with no separator. Add the space back; harmless when the verb segment
is shown (its trailing pad still provides the gap).
2026-05-06 04:02:09 -07:00
adybag14-cyber ca5febfed1 fix(tui): stabilize FaceTicker elapsed width to prevent composer drift 2026-05-06 04:02:09 -07:00
adybag14-cyber e45df2e81e fix(ui): reduce status-line jitter while scrolling 2026-05-06 04:02:09 -07:00
Teknium a869a523ee chore: AUTHOR_MAP entry for adybag14-cyber 2026-05-06 04:02:02 -07:00
adybag14-cyber 043a118d41 fix: harden install.sh against inherited Python env leakage 2026-05-06 04:02:02 -07:00
Teknium e70e49016f fix(cli): guard logger.debug in signal handler (#13710 regression) (#20673)
CPython's logging module is not reentrant-safe.  `Logger.isEnabledFor`
caches level results in `Logger._cache`; under shutdown races the cache
can be cleared (`Logger._clear_cache`, triggered by logging config changes
from another thread) or mid-mutation when a signal fires, raising
`KeyError: <level_int>` (e.g. `KeyError: 10` for DEBUG) inside the signal
handler.

When that happens, the KeyError escapes before the `raise KeyboardInterrupt()`
on the next line can fire, which bypasses prompt_toolkit's normal interrupt
unwind and surfaces as the EIO cascade originally reported in #13710.

Issue #13710 shipped two defenses (asyncio exception handler + outer
`except (KeyError, OSError)` with EIO suppression) that cover the EIO
unwind path.  This patch closes the remaining escape hatch: the
`logger.debug` call at the top of `_signal_handler` itself.  Wrap it in a
bare `try/except Exception: pass` so logging can never raise through a
signal handler.

Observed in the wild: debug report on 0.12.0 (commit 8163d371) shows the
exact stack — KeyError: 10 at logging/__init__.py:1742 inside the
signal handler's `logger.debug`, followed by the EIO cascade from
prompt_toolkit's emergency flush.

Tests: adds `TestSignalHandlerLoggingRace` to
`tests/hermes_cli/test_suppress_eio_on_interrupt.py` with 6 new cases:
- normal path still raises KeyboardInterrupt
- KeyError(10) from logger.debug does not escape
- any Exception from logger.debug is swallowed
- agent.interrupt still fires when logger.debug raises
- agent.interrupt raising also does not escape
- BaseException (SystemExit) is NOT swallowed — guard uses `except Exception`
  deliberately so real shutdown signals still propagate

Closes #13710 regression.
2026-05-06 03:55:47 -07:00
Teknium a6f5f9c484 fix(update): drop pip --quiet so slow installs don't look hung (#20679)
On Termux/Android aarch64 (and other platforms without prebuilt wheels
for some optional extras), 'pip install -e .[all]' compiles C/Rust
extensions from source. This can run for several minutes with zero
network activity and — with --quiet — zero stdout. Users report
'hermes update hangs at Updating Python dependencies', Ctrl+C it, then
re-run and see 'up to date' (because git pull already succeeded and the
pip step was still working when they interrupted).

Pip's default output is proportional to actual work (one line per
Collecting / Building wheel for X / Installing), so removing --quiet
costs nothing on fast hardware and prevents the false-hang interrupt
loop on slow hardware.

Reported via Discord on Termux/Android. Supersedes #20466 which
misdiagnosed the hang as PYTHONPATH shadowing (install.sh doesn't run
during 'hermes update', and terminal() doesn't inherit PYTHONPATH).
2026-05-06 03:55:02 -07:00
helix4u 466f3a11de fix(gateway): preserve model picker current context 2026-05-06 03:50:59 -07:00
Kshitij Kapoor 629d8b843d fix(browser): tighten Lightpanda fallback edge cases 2026-05-06 03:41:21 -07:00
Kshitij 68162eb18f fix(tui): collapse long system messages in transcript with expand toggle
System messages over 400 chars (system prompt, AGENTS.md, etc.) now
render as a collapsed \u25b8/\u25be toggle line in the transcript, matching
the Chevron convention used for runtime details. The summary shows
the first line + char count; clicking expands to full content.
2026-05-06 03:34:00 -07:00
Kshitij d78c34928f feat(tui): collapsible sections in startup banner (skills, system prompt, MCP)
The TUI SessionPanel banner now uses collapsible \u25b8/\u25be toggle
sections matching the existing Chevron convention used for runtime
agent details. Skills, system prompt, and MCP server lists are
collapsed by default; tools remain expanded as the most actionable
info.

- tui_gateway/server.py: _session_info() now passes agent._cached_system_prompt
  through to the TUI frontend
- ui-tui/src/types.ts: added system_prompt?: string to SessionInfo
- ui-tui/src/components/branding.tsx: rewrote SessionPanel with
  CollapseToggle helper + per-section useState toggles

Default states: tools=open, skills=collapsed, system=collapsed,
mcp=collapsed. Clicking any \u25b8/\u25be header toggles that section.
2026-05-06 03:34:00 -07:00
Kshitij Kapoor 3ebdd26449 fix(browser): surface Lightpanda Chrome fallback warnings 2026-05-06 03:23:19 -07:00
kshitijk4poor 395dbcc873 feat(browser): add Lightpanda engine support with automatic Chrome fallback
Add Lightpanda as an optional browser engine for local mode.
Lightpanda is a headless browser built from scratch in Zig -- faster
navigation than Chrome with significantly less memory.

One config line to enable:
  browser:
    engine: lightpanda

New functions in browser_tool.py:
- _get_browser_engine() -- config/env reader with validation + caching
- _should_inject_engine() -- only inject in local non-cloud mode
- _needs_lightpanda_fallback() -- detect empty/failed LP results
- _chrome_fallback_screenshot() -- temporary Chrome session for screenshots
- Engine injection in _run_browser_command (--engine flag)
- browser_vision pre-routes screenshots to Chrome when engine=lightpanda

Config:
- browser.engine in DEFAULT_CONFIG (auto/lightpanda/chrome)
- AGENT_BROWSER_ENGINE in OPTIONAL_ENV_VARS
- /browser status shows engine info in local mode

Rebased from PR #7144 onto current main. All existing code preserved --
pure additions only (+520/-2).

25 new tests + 81 total browser tests pass (0 failures).
2026-05-06 03:23:19 -07:00
kshitijk4poor aa88dcc57b fix: salvage batch — compaction guidance, memory authority, cache eviction after compression
- Fix /compact → /compress in context-overflow tips (closes #20020)
- Evict cached agent after session hygiene and /compress so system
  prompt refreshes with current SOUL.md, memory, and skills
- Restore memory authority across compaction: change 'informational
  background data' to 'authoritative reference data' in memory block
  and SUMMARY_PREFIX, with backward-compatible regex

Based on:
- PR #20027 by @LeonSGP43
- PR #18767 by @MacroAnarchy
- PR #17380 by @vominh1919

PR #17121 boundary marker fix already merged to main (2eef395e1).
PR #9262 user-message anchoring already on main via _ensure_last_user_message_in_tail().
2026-05-05 22:33:45 -07:00
Teknium f27fcb6a82 feat(models): add x-ai/grok-4.3 to OpenRouter + Nous Portal curated lists (#20497)
Endpoint validated over 6 conversational turns with tool calls (9 API
calls, 3 tool calls, 0 failures) and an 8-request burst (8/8 ok,
0 rate limits). Latency ~5-10s/call — slower than grok-4.20 but
expected for a reasoning model.

- hermes_cli/models.py: add to OPENROUTER_MODELS and _PROVIDER_MODELS['nous']
- website/static/api/model-catalog.json: regenerated
2026-05-05 19:15:10 -07:00
Teknium 477e4a2fe6 feat(models): add deepseek/deepseek-v4-pro to OpenRouter + Nous Portal curated lists (#20495)
Endpoint re-tested over 6 conversational turns (9 API calls, 3 tool calls)
and an 8-request burst — no rate limits, no errors, ~2-3s latency. The
historical rate-limit issues that caused its removal are gone.

- hermes_cli/models.py: add to OPENROUTER_MODELS and _PROVIDER_MODELS['nous']
- website/static/api/model-catalog.json: regenerated via build_model_catalog.py
2026-05-05 19:11:58 -07:00
Teknium e598e18529 docs: document custom model aliases for /model command (#20475)
User-defined model aliases (config.yaml model_aliases: and
model.aliases.*) have worked since early versions but were entirely
undocumented. Add a dedicated 'Custom model aliases' section to
slash-commands.md covering both YAML config formats and the
'hermes config set' shell form, mirror a shorter version into the
configuring-models 'Alternative methods' section, and cross-link from
the two /model table rows.

Flagged by @weehowe on Twitter — he wasn't aware the feature existed.
2026-05-05 19:11:20 -07:00
etherman-os 39f451f5ad fix: add Turkish locale references in config, tests, and docs
- hermes_cli/config.py: add tr to supported languages comment
- locales/en.yaml: add tr to locale file list comment
- tests/agent/test_i18n.py: add Turkish alias tests + explicit lang test
- website/docs/user-guide/configuration.md: add tr to supported values
2026-05-05 17:29:12 -07:00
etherman-os 985133852a feat(i18n): add Turkish (tr) locale
- Add locales/tr.yaml with Turkish translations for all approval.* and gateway.* keys
- Register 'tr' in SUPPORTED_LANGUAGES
- Add Turkish aliases: turkish, türkçe, tr-tr
2026-05-05 17:29:12 -07:00
Teknium fab3ad9777 chore(release): AUTHOR_MAP entries for suncokret12 and mioimotoai-lgtm 2026-05-05 17:26:15 -07:00
LeonSGP43 a49670c21b fix(kanban): wire dependency selects 2026-05-05 17:26:15 -07:00
Brecht-H 3f97297413 feat(kanban): surface task_runs.summary on dashboard cards + `kanban show`
The kanban-worker skill (built into the gateway dispatcher's spawn
prompt) instructs every worker to hand off via
``kanban_complete(summary=..., metadata=...)``. That writes the summary
onto the closing ``task_runs`` row, NOT onto ``tasks.result`` — the
latter is left NULL unless the caller passes ``result=`` explicitly.

Result: a glance at the dashboard or ``hermes kanban show <id>`` shows
a blank "Result:" section even when the worker did real work, which
on 2026-05-05 caused a Mac false-alarm ("Hermes did nothing") on a
task that had a 10-line completion summary on its run.

This patch surfaces the latest non-null run summary as
``latest_summary`` so the worker's actual handoff lands in front of
operators.

* New helpers ``kanban_db.latest_summary(conn, task_id)`` and
  ``kanban_db.latest_summaries(conn, task_ids)``. The batch variant
  uses a single window-function SELECT so the dashboard board endpoint
  doesn't pay an N+1 cost on multi-hundred-task boards.
* CLI ``hermes kanban show <id>`` prints a "Latest summary:" block
  when ``tasks.result`` is empty but a run has produced a summary
  (the existing "Result:" section still wins when populated, so the
  back-compat path for hand-edited results is untouched). JSON output
  gains a top-level ``latest_summary`` field.
* Dashboard ``/board`` and ``/tasks/{id}`` now include a
  ``latest_summary`` field on every task. Cards on /board carry a
  200-character preview (cheap to render, plenty for "what did this
  worker do?" at a glance); the drawer/detail endpoint returns the
  full summary.
* Five new tests cover: empty-runs case, post-complete surface,
  newest-of-multiple selection, empty-string skip, batch with
  missing tasks + empty input.

Smoke-tested locally against the live profile DB on the three
acceptance-criterion targets (t_f08fef91 cron-hygiene-audit,
t_007b7f1c EMA-analysis, t_05746fa4 self-assessment) — all three now
return their populated summaries via both ``latest_summary`` and
``latest_summaries``.

Test plan: 255/255 kanban tests pass + 91/91 dashboard plugin tests
pass. No regression on tasks where ``tasks.result`` is explicitly
populated (the existing "Result:" branch is preserved).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:26:15 -07:00
daixin1204 d2c6eceed9 fix(kanban): prevent child task dispatch when parent is not done
Add parent dependency guard to _set_status_direct so dragging
a task to the ready column is rejected (409) when its parents
are not all done. Previously the guard only existed in
recompute_ready, allowing direct status writes via the
dashboard API to bypass the dependency engine.

Root cause: after reclaiming stale workers, both T3 and T4
were set to ready via dashboard status writes in quick
succession, causing the writer to be spawned while the analyst
was blocked — upstream work wasn't done yet.
2026-05-05 17:26:15 -07:00
Teknium 8a1a42d098 test(kanban): backdate task_runs.started_at alongside tasks.started_at
After #19473 landed (enforce_max_runtime reads from task_runs.started_at
rather than tasks.started_at), a regression test added earlier still
only backdated the tasks column. Backdate both so the test is robust
regardless of which column the enforcer reads from.
2026-05-05 17:26:15 -07:00
澪 / Mio b28ab4fc3f fix(kanban): measure max runtime from current run 2026-05-05 17:26:15 -07:00
LeonSGP43 6d302b340e fix(kanban): accept created_cards linked as child of completing task
Widens _verify_created_cards to also accept ids that are children of the
completing task in task_links. Previously we only accepted cards where
created_by matched the completing task's assignee, which was too strict
for legitimate orchestrator flows: a specifier creates a card (so
created_by=specifier, not worker), then a worker picks it up and passes
parents=[current_task] to kanban_create. The explicit link proves the
relationship and should be trusted.

Salvaged from #20022 @LeonSGP43 (full PR superseded by #20232 +
this patch; the linked-children relaxation was the portable
improvement).
2026-05-05 17:26:15 -07:00
suncokret12 eda326df16 fix(doctor): report Kanban worker tools as runtime-gated 2026-05-05 17:26:15 -07:00
Teknium f0b95cc93d test(arcee): cover Trinity Large Thinking temperature + compression overrides
Salvage follow-up for PR #20344:
- AUTHOR_MAP entry for rob-maron (required by CI)
- 17 parametrized tests covering _is_arcee_trinity_thinking,
  _fixed_temperature_for_model Trinity override, and
  _compression_threshold_for_model, including sibling-model negatives
  (trinity-large-preview, trinity-mini) and the OpenRouter slug form.
2026-05-05 17:23:45 -07:00
rob-maron 2d4eaed111 arcee temperature + compression 2026-05-05 17:23:45 -07:00
teknium1 735349c679 chore: AUTHOR_MAP entry for olisikh 2026-05-05 17:21:59 -07:00
Oleksii Lisikh c4b287ba53 feat(i18n): add Ukrainian locale 2026-05-05 17:21:59 -07:00
Miniding 0d41e94ca9 feat(i18n): add French (fr) locale support
- Add fr.yaml with French translations for approval prompts and gateway messages
- Register 'fr' in SUPPORTED_LANGUAGES
- Add French aliases: french, français, fr-fr, fr-be, fr-ca, fr-ch
- Update locale sync comment in en.yaml
2026-05-05 15:13:57 -07:00
Teknium ee8edd4169 chore: AUTHOR_MAP entry for bogerman1 2026-05-05 15:13:36 -07:00
bogerman1 3188e63b05 fix(api_server): SSE token batching + error handling for Open WebUI performance
Reduces SSE event rate ~500/turn → ~20/turn via 50ms text-delta batching in
_dispatch(), which eliminates markdown re-render storms on Open WebUI. Also:

- Trim tool_call.arguments in the response.completed event to 100KB
  (prevents silent hangs on 848KB+ single-line SSE events).
- Catch-all exception handlers in _write_sse_responses() + _write_sse_chat_completion()
  emit a proper error chunk instead of TransferEncodingError from incomplete
  chunked encoding when the agent crashes mid-stream.
- MAX_REQUEST_BYTES 1MB → 10MB; pass client_max_size to aiohttp Application to
  avoid silent 400s on truncated request bodies for long conversations.

Salvage of #17552 (api_server portion only). The contrib/openwebui-filter/
payload from that PR — Open WebUI Filter Function + benchmark writeup — is
a client-side user-installable add-on and doesn't need to live in the repo;
dropped here. Closes #17537.

Co-authored-by: bogerman1 <93757150+bogerman1@users.noreply.github.com>
2026-05-05 15:13:36 -07:00
Nicolò Boschi 3082fa0829 feat(hindsight): probe API for update_mode='append' support, dedupe across processes
Mirrors the pattern already shipping in hindsight-integrations/openclaw:
probe `<api_url>/version` once per process, gate on Hindsight ≥ 0.5.0.
When supported, retains use a stable session-scoped `document_id`
(`session_id`) plus `update_mode='append'` so cross-process retains for
the same session merge into one document instead of producing
N-different-process-stamped duplicates. When unsupported (or probe
fails), fall back to the existing per-process unique
`f"{session_id}-{start_ts}"` document_id with no `update_mode` — the
resume-overwrite fix (#6654) keeps working unchanged on legacy servers.

Closes the dedup half of #20115. The proposed `document_id_strategy`
config knob isn't needed: auto-detection via the same /version probe
the OpenClaw plugin already uses gives the same outcome with no extra
config burden, and the choice is purely a function of what the server
can do.

Plumbing
--------
- Module-level helpers (`_meets_minimum_version`, `_fetch_hindsight_api_version`,
  `_check_api_supports_update_mode_append`) cache the result per api_url
  so every provider in the process gets one /version round-trip.
- One-time WARN logged when the API is older than 0.5.0, telling the
  user to upgrade for cross-session deduplication.
- New instance helper `_resolve_retain_target(fallback_doc_id)` returns
  `(document_id, update_mode)` based on cached capability. Wired into
  `sync_turn` and the `on_session_switch` flush path.
- For local_embedded mode, the probe URL is taken from the running
  client (`client.url`) so we hit the actual daemon port rather than
  the configured default.
- `update_mode` is set on the per-item dict; `aretain_batch` already
  threads `item['update_mode']` into the API call.

Tests
-----
- `TestUpdateModeAppendCapability` (5 cases): legacy fallback, modern
  stable+append, per-url cache, one-time warn, flush-on-switch resolves
  against the OLD session.
- Existing `_make_hindsight_provider` factory in the manager-side test
  file extended to seed `_mode`/`_api_url`/`_api_key`/`_client` and stub
  `_resolve_retain_target` so the bypass-init pattern keeps working.

E2E verified against installed `~/.hermes/hermes-agent`:
- Legacy probe (unreachable host) → `legacy-session-<ts>` doc_id,
  no `update_mode`.
- Modern probe (live local_embedded 0.5.6 daemon) → stable
  `modern-session` doc_id + `update_mode='append'`.
- `test_hermes_embedded_smoke.py` passes (90s).
2026-05-05 15:09:59 -07:00
Teknium 1efed67056 chore(release): AUTHOR_MAP entries for momowind and misery-hl 2026-05-05 15:09:28 -07:00
misery-hl 56b4795115 guard kanban worker lifecycle by run id 2026-05-05 15:09:28 -07:00
Moonyeah f0d278412f feat(gateway): respect kanban.max_spawn config to limit concurrent tasks
The dispatch_once function already accepts a max_spawn parameter but the
gateway was calling it without passing any value, effectively ignoring
the configuration. This change reads kanban.max_spawn from config.yaml
and passes it through, allowing users to limit concurrent kanban tasks.

This prevents resource exhaustion scenarios where kanban dispatcher
spawns too many parallel workers on constrained hardware.
2026-05-05 15:09:28 -07:00
0xVox 0b9cbc8b23 test(kanban): cover metadata handoff round-trip 2026-05-05 15:09:28 -07:00
Teknium 50ab0a85a7 chore: AUTHOR_MAP entry for formulahendry 2026-05-05 14:16:30 -07:00
Jun Han 0d945d1541 docs: update VS Code setup instructions for ACP Client integration 2026-05-05 14:16:30 -07:00
Teknium f97d022149 chore: AUTHOR_MAP entry for zhanggttry 2026-05-05 14:15:05 -07:00
zhangguangtao 05cdcac362 docs: add Chinese (zh-CN) README translation
Closes #12954

- Add README.zh-CN.md with complete Simplified Chinese translation
- Add language switcher badge in README.md linking to Chinese version
- Add language switcher badge in README.zh-CN.md linking to English version
2026-05-05 14:15:05 -07:00
haidao1919 74e4f5f97a docs(i18n): add zh-Hans Tool Gateway, image gen, and Windows WSL guide
Made-with: Cursor
2026-05-05 14:14:03 -07:00
Teknium a321874ab4 chore: AUTHOR_MAP entry for liu-collab 2026-05-05 14:12:49 -07:00
liuyuqi a11234dd68 docs(browser): document WSL-to-Windows Chrome MCP bridge 2026-05-05 14:12:49 -07:00
Teknium a860a1098f chore: AUTHOR_MAP entry for acesjohnny 2026-05-05 14:12:09 -07:00
Zhen Liu 1c42d8ff53 docs: add Open WebUI bootstrap script 2026-05-05 14:12:09 -07:00
Teknium 92a08c633f chore: AUTHOR_MAP entry for binhnt92 2026-05-05 14:11:16 -07:00
binhnt92 9a0a4c5831 docs(guides): add guide for running Hermes locally with Ollama
Step-by-step guide covering Ollama installation, model selection,
Hermes configuration, speed optimization, and optional gateway bot
setup — all running on local hardware with zero API cost.

Includes hardware requirements, model comparison table with tool-call
support status, context window tuning, GPU offloading tips, fallback
provider setup, troubleshooting, and cost comparison.
2026-05-05 14:11:16 -07:00
Teknium 1fc8733a69 fix(kanban): unify failure counter across spawn/timeout/crash outcomes (#20410)
The dispatcher's circuit breaker only protected against spawn-side
failures (profile missing, workspace mount error, exec failure).
Workers that successfully spawned but then timed out or crashed
re-queued to ``ready`` with no counter increment, so the next tick
re-spawned them — loops forever until someone noticed. Reported
externally on Twitter (Forbidden Seeds) and confirmed by walking the
kernel: ``enforce_max_runtime`` flipped the task back to ready, emitted
a ``timed_out`` event, and never touched ``spawn_failures``; same for
``detect_crashed_workers``.

Fix: unify the counter across all non-success outcomes.

Schema
------
* ``tasks.spawn_failures`` → ``tasks.consecutive_failures``
* ``tasks.last_spawn_error`` → ``tasks.last_failure_error``
* Migration renames the columns in-place on existing DBs (``ALTER
  TABLE RENAME COLUMN`` — SQLite >= 3.25) so historical counter
  values are preserved. Row mappers fall through to the legacy names
  if both column renames and a migration somehow got out of sync.

Counter lifecycle
-----------------
New helper ``_record_task_failure(conn, task_id, error, *, outcome,
release_claim, end_run, event_payload_extra)`` is the single point
every non-success outcome funnels through:

* ``spawn_failed``  → ``_record_spawn_failure`` (kept as alias)
  calls it with ``release_claim=True, end_run=True`` — transitions
  running→ready, clears claim, closes run.
* ``timed_out`` → ``enforce_max_runtime`` already does the status
  transition + run close + event emission, then calls
  ``_record_task_failure`` with ``release_claim=False, end_run=False``
  just to bump the counter (and trip the breaker if needed).
* ``crashed`` → ``detect_crashed_workers`` same pattern, but the
  counter increment runs after the main write_txn closes (SQLite
  doesn't nest write transactions).

If the counter hits the breaker threshold (``DEFAULT_FAILURE_LIMIT=5``,
same as before), the task transitions to ``blocked`` with a ``gave_up``
event on top of whatever outcome-specific event was already emitted.

Reset semantics changed: the counter now clears only on successful
``complete_task`` (and operator ``reclaim_task`` — an explicit "I've
looked at this, try again with a fresh budget"). Previously
``_clear_spawn_failures`` ran on every successful spawn, which would
have wiped the counter before a timeout could accumulate past threshold
— exactly the loop this fix prevents.

Diagnostics
-----------
* ``_rule_repeated_spawn_failures`` → ``_rule_repeated_failures``. Now
  fires regardless of which outcome is at fault. Classifies the most
  recent failure (spawn_failed / timed_out / crashed) from the run
  history so the title ("Agent timeout x3", "Agent crash x4", "Agent
  spawn x5") and suggested action (``doctor`` for spawn, ``log`` for
  timeout/crash) stay outcome-specific without N duplicate rules.
* ``_rule_repeated_crashes`` kept as a narrower early-warning at
  threshold 2 (vs 3 for the unified rule), but now suppresses itself
  when the unified rule would also fire — avoids double-flagging.
* Diagnostic ``data`` payload now carries
  ``{consecutive_failures, most_recent_outcome, last_error}`` instead
  of spawn-specific keys.

CLI
---
* ``Task.consecutive_failures`` / ``Task.last_failure_error`` are the
  public fields now. Existing callers that referenced the old names
  get migrated (tests updated in this commit).
* Backward-compat: ``DEFAULT_SPAWN_FAILURE_LIMIT``,
  ``_clear_spawn_failures``, ``_record_spawn_failure`` stay as aliases.

Tests
-----
* 6 new kernel tests: timeout increments counter, 3 consecutive
  timeouts trip the breaker (was the reported gap), crash increments
  counter, reclaim clears counter, completion clears counter, spawn
  success does NOT clear counter.
* Diagnostic tests: updated ``repeated_spawn_failures`` cases to use
  the new kind name and add a timeout-loop test.
* Dashboard API test: spawn_failures column update → consecutive_failures.

389/389 kanban-suite tests pass.

Live verification
-----------------
Seeded 4 tasks in an isolated HERMES_HOME: 3 timeouts, 4 crashes,
2-spawn-failed + 2-timed-out, and a task that had prior failures but
completed successfully. Board correctly shows "!! 3 tasks need
attention" (the successful one has no badge because the counter
reset). Drawer for the timeout-loop task renders "Agent timeout x3"
with most_recent_outcome=timed_out and the "Check logs" suggested
action (not the spawn-flavoured "Verify profile"). The successful
task has zero diagnostics.

Closes the Forbidden-Seeds-reported gap.
2026-05-05 13:55:37 -07:00
Teknium 587ef55f2c chore: AUTHOR_MAP entry for xsfX20 2026-05-05 13:55:21 -07:00
xsfx20 144ba71a33 docs(faq): use messaging extra for gateway deps 2026-05-05 13:55:21 -07:00
Teknium 391e3fff56 chore: AUTHOR_MAP entry for Hypnus-Yuan 2026-05-05 13:54:33 -07:00
Yuan Tao-Wen 39560c948d docs(voice): add Doubao speech integration examples (TTS + STT) 2026-05-05 13:54:33 -07:00
LeonSGP43 ca8e68822d docs(codex): clarify OAuth auth prerequisite 2026-05-05 13:53:55 -07:00
LeonSGP43 f13b349b9a docs: clarify Telegram group chat troubleshooting 2026-05-05 13:53:19 -07:00
Teknium bb2b129549 chore: AUTHOR_MAP entry for Fearvox 2026-05-05 13:52:46 -07:00
0xVox 5bd75c73ed docs(kanban): document handoff evidence metadata 2026-05-05 13:52:46 -07:00
Teknium 79902a0278 chore: AUTHOR_MAP entry for counterposition 2026-05-05 13:51:56 -07:00
Harish Kukreja 15be493055 docs(skills): modernize Obsidian file workflows 2026-05-05 13:51:56 -07:00
Michel Belleau 5f8e59b0f1 docs(discord): fix Server Members Intent + SSRC-mapping drift; add /voice join slash Choice
Salvage of #11350. Kept:
- Code: add an explicit /voice join Choice in the slash UI (runner accepts both 'join' and 'channel' but only 'channel' was in autocomplete).
- Docs: Server Members Intent is conditional (only needed if DISCORD_ALLOWED_USERS contains usernames); SSRC → user_id mapping uses the voice websocket SPEAKING opcode, not the Members intent.

Dropped from the original PR:
- HERMES_DISCORD_VOICE_PACKET_DUMP — this env var doesn't exist on main (it was in a different PR that isn't merged).
- DISCORD_PROXY docs — already documented on current main.
- DISCORD_ALLOW_MENTION_* docs — already on main.
- "barge-in mode" rewrite — current main actually does pause the listener during TTS (VoiceReceiver.pause() at discord.py:192); there is no barge_in_guard/barge_in_rms on main.

Co-authored-by: Michel Belleau <michel.belleau@malaiwah.com>
2026-05-05 13:50:43 -07:00
Teknium 1b1037171b chore: AUTHOR_MAP entry for CES4751 2026-05-05 13:48:37 -07:00
xiangyong de0ac21fff docs(docker): document API_SERVER_* env vars for exposing the OpenAI-compatible endpoint
Salvage of #11758. The PR's original diff was stale (the Docker Compose section on main has been heavily refactored — dashboard is now an embedded side-process, not a separate service), so the useful bit (API server env var requirements) is applied as a note on the basic `docker run` example.

Co-authored-by: xiangyong <xiangyong@zspace.cn>
2026-05-05 13:48:37 -07:00
Magicray1217 398efdb0fa docs(docker): add section on connecting to local inference servers (vLLM, Ollama)
Adds a comprehensive guide for connecting Dockerized Hermes to local
inference servers like vLLM and Ollama, covering:
- Docker Compose networking (recommended)
- Standalone Docker run with host.docker.internal / --network host
- Connectivity verification steps
- Ollama-specific example

Closes #12308
2026-05-05 13:47:13 -07:00
LeonSGP43 80c579a9dd docs(skills): explain restoring bundled skills 2026-05-05 13:46:20 -07:00
jani 3beef57825 docs: refresh stale platform/LOC/test counts; clarify gateway vs plugin platforms
AGENTS.md is the AI-assistant entry doc, so its counts get used as ground
truth. Several values had drifted, and the same drift had spread to a few
user-facing surfaces. Fixing all of them in one commit so the count claims
agree and clearly distinguish gateway-core from plugin-shipped platforms.

AGENTS.md:
- run_agent.py "~12k LOC" → "~14k LOC as of 2026-05-03" (actual 14,097)
- cli.py     "~11k LOC" → "~12k LOC as of 2026-05-03" (actual 12,043)
- tools/environments/ list now lists all 7 user-selectable terminal backends
  in canonical order, matching tools/terminal_tool.py:2214-2215
- gateway/platforms/ list adds yuanbao and wecom_callback; the 19 names
  match the user-facing list at website/docs/integrations/index.md
- plugins/ tree now mentions plugins/platforms/ (irc, teams)
- tests/ snapshot "~15k tests across ~700 files as of Apr 2026" →
  "~19k tests across ~890 files as of 2026-05-03"

User-facing count claims:
- hermes_cli/tips.py:195 — "19 platforms" → "21 messaging platforms" with
  IRC and Microsoft Teams added to the named list
- website/docs/index.md:49 — "6 terminal backends" → "7 terminal backends:
  ..., Vercel Sandbox" (also corrected by PR #19044; same edit content)
- website/docs/index.md:50 — "15+ platforms from one gateway" → "21+ messaging
  platforms (19 in the gateway, plus IRC and Microsoft Teams via plugins)"
- website/docs/integrations/index.md:83-85 — "15+ messaging platforms" → "19+",
  added yuanbao to the linked list. The surrounding text scopes it to "configured
  through the same gateway subsystem", so plugin platforms (IRC, Teams) are
  intentionally not in this list
- website/scripts/generate-llms-txt.py:205 — "15+ platforms" → "21+ messaging
  platforms — 19 native to the gateway plus IRC and Microsoft Teams via plugins"

LOC and date stamps follow the existing AGENTS.md "as of <date>" convention
(line 56 already used this pattern). Source of truth for the gateway count is
gateway/config.py:130-148 (PlatformID enum); plugin platforms live in
plugins/platforms/.

Out of scope:
- RELEASE_v0.9.0.md historical "16 platforms" claim (immutable history)
- userStories.json verbatim user quotes
- Programmatic count generation from gateway/config.py + plugin manifests
  is a worthwhile build-system change but separate from these content fixes
2026-05-05 13:45:47 -07:00
Teknium 7cc00087e7 chore: AUTHOR_MAP entry for deep-name 2026-05-05 13:44:09 -07:00
jani 0df80f4391 docs: align terminal-backend count and naming across docs and code
README:24 claimed "Six terminal backends" while tools/environments/ exposes
seven top-level backend choices through TERMINAL_ENV: local, docker, ssh,
singularity, modal, daytona, vercel_sandbox. Modal additionally has direct
and Nous-managed modes selected via terminal.modal_mode (the
ManagedModalEnvironment class is a Modal sub-mode, not a separate top-level
backend).

The same drift appeared in five other doc and code-comment sites with
inconsistent counts (six, seven, or implicit) and varying lists. Updated
all sites to a consistent seven-backend list in canonical order. The
configuration guide also clarifies how Modal's two modes are selected so
operators do not search for a non-existent backend: managed_modal value.

CONTRIBUTING.md:160 lists six backend filenames in a code tree but does
not carry the "Six terminal" prose; left out of scope per cohesion sweep
guidance to bundle only identical wording.

Files updated:
- README.md (line 24, marketing copy)
- website/docs/index.md (line 49, landing page)
- website/docs/user-guide/configuration.md (line 86, config guide)
- tools/environments/__init__.py (lines 3-6, package docstring)
- tools/file_operations.py (line 6, module docstring)
- environments/README.md (line 43, RL training docs — TERMINAL_ENV list)
2026-05-05 13:44:09 -07:00
Teknium 8fa5a03752 chore: AUTHOR_MAP entry for jethac 2026-05-05 13:43:04 -07:00
Jetha Chan b1476c76f6 docs(gemini): add Google Gemini guide 2026-05-05 13:43:04 -07:00
brooklyn! 794f48766c fix(tui): close slash parity gaps with CLI (#20339)
* fix(tui): close slash parity gaps with CLI

Route unsupported /skills subcommands through slash.exec, support /new <name>
titles, and handle /redraw natively so TUI behavior matches classic CLI. Also
filter gateway-only commands out of the TUI catalog while keeping /status
discoverable.

* fix(tui): run remaining CLI parity paths natively

Forward chat launch flags into the TUI runtime and handle live-session status
and skill reloads in the gateway process so TUI state no longer depends on the
slash worker's stale CLI instance.

* fix(tui): block stale snapshot restores

Prevent snapshot restore from running through the isolated slash worker because
it mutates disk state without refreshing the live TUI agent.

* chore: uptick

* fix(tui): guard async session title updates

Handle failures from the fire-and-forget session.title RPC so title-setting errors do not surface as unhandled promise rejections while preserving session-scoped messaging.
2026-05-05 15:42:39 -05:00
Jason Perlow acca3ec3af docs(providers): Together/Groq/Perplexity cookbook via custom_providers
Three worked recipes for OpenAI-compatible cloud providers, plus the
Copilot HTTP 401 auto-recovery info block and the GMI Cloud row in the
compatible providers table. All three additions were on the original
docs/custom-providers-cookbook branch but its merge base predated 1186
main commits, making the rebase impractical (84k+ line conflict).

Replays just the providers.md additions onto current main.
2026-05-05 13:42:20 -07:00
Wysie af312ccc97 docs: fix Camofox Docker setup instructions 2026-05-05 13:41:46 -07:00
JiaDe-Wu 7b05ccddc7 docs(bedrock): fix IAM permissions, add quickstart entry, add fallback provider, fix deployment section 2026-05-05 13:41:14 -07:00
Serhat Dolmac 84ec27616a docs(cli): expand hermes import reference — add description, warning, and examples 2026-05-05 13:40:26 -07:00
Teknium 9022804d78 feat(providers): make all 33 providers pluggable under plugins/model-providers/
Every provider profile is now a self-contained plugin under
plugins/model-providers/<name>/, mirroring the plugins/platforms/
pattern established for IRC and Teams. The ProviderProfile ABC
stays in providers/; the per-provider profile data moves out.

- plugins/model-providers/<name>/__init__.py calls register_provider()
- plugins/model-providers/<name>/plugin.yaml declares kind: model-provider
- providers/__init__.py._discover_providers() lazily scans bundled plugins
  then $HERMES_HOME/plugins/model-providers/<name>/ (user override path)
- User plugins with the same name override bundled ones (last-writer-wins
  in register_provider)
- Legacy providers/<name>.py layout still supported for back-compat with
  out-of-tree editable installs
- Hermes PluginManager: new kind=model-provider; skipped like memory
  plugins (providers/ discovery owns them); standalone plugins with
  register_provider+ProviderProfile in their __init__.py auto-coerce to
  this kind (same heuristic as memory providers)
- skip_names extended to include 'model-providers' so the general
  PluginManager doesn't double-scan the category
- 4 new tests in tests/providers/test_plugin_discovery.py covering
  bundled discovery, user override, and general-loader isolation
- Docs updated: website/docs/developer-guide/adding-providers.md,
  provider-runtime.md, providers/README.md, plugins/model-providers/README.md

No API break: auth.py / config.py / doctor.py / models.py / runtime_provider.py /
model_metadata.py / auxiliary_client.py / chat_completions.py / run_agent.py
all still consume providers via get_provider_profile() / list_providers() —
they just now see plugin-discovered entries instead of pkgutil-iterated ones.

Third parties can now drop a single directory into
~/.hermes/plugins/model-providers/<name>/ to add or override an inference
provider without touching the repo.
2026-05-05 13:40:01 -07:00
kshitijk4poor 20a4f79ed1 feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path
Introduces providers/ package — single source of truth for every
inference provider. Adding a simple api-key provider now requires one
providers/<name>.py file with zero edits anywhere else.

What this PR ships:
- providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes)
- ProviderProfile declarative fields: name, api_mode, aliases, display_name,
  env_vars, base_url, models_url, auth_type, fallback_models, hostname,
  default_headers, fixed_temperature, default_max_tokens, default_aux_model
- 4 overridable hooks: prepare_messages, build_extra_body,
  build_api_kwargs_extras, fetch_models
- chat_completions.build_kwargs: profile path via _build_kwargs_from_profile,
  legacy flag path retained for lmstudio/tencent-tokenhub (which have
  session-aware reasoning probing that doesn't map cleanly to hooks yet)
- run_agent.py: profile path for all registered providers; legacy path
  variable scoping fixed (all flags defined before branching)
- Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS,
  doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER
- GeminiProfile: thinking_config translation (native + openai-compat nested)
- New tests/providers/ (79 tests covering profile declarations, transport
  parity, hook overrides, e2e kwargs assembly)

Deltas vs original PR (salvaged onto current main):
- Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth
  (were added to main since original PR)
- Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their
  reasoning_effort probing has no clean hook equivalent yet)
- Removed lmstudio alias from custom profile (it's a separate provider now)
- Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension
  (resolve_provider special-cases them; adding breaks runtime resolution)
- runtime_provider: profile.api_mode only as fallback when URL detection
  finds nothing (was breaking minimax /v1 override)
- Preserved main's legacy-path improvements: deepseek reasoning_content
  preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M
  beta recovery, etc.
- Kept agent/copilot_acp_client.py in place (rejected PR's relocation —
  main has 7 fixes landed since; relocation would revert them)
- _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing
  test imports

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes #14418
2026-05-05 13:40:01 -07:00
Teknium 2b500ed68a chore: AUTHOR_MAP entry for asimons81 2026-05-05 13:34:03 -07:00
Tony Simons e4723f671a docs(cron): add context_from chaining section
Resolved merge against current main (new No-agent mode section added in parallel).

Co-authored-by: Tony Simons <tony@tonysimons.dev>
2026-05-05 13:34:03 -07:00
r266-tech b6e4e40df4 docs(guide): add Dispatch tools from slash commands section 2026-05-05 13:33:56 -07:00
r266-tech 91f339b981 docs(plugins): document ctx.dispatch_tool() in plugin capabilities table 2026-05-05 13:33:56 -07:00
Bartok9 72c33dfe95 docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings
The BuiltinMemoryProvider class was removed from the codebase but its
name lingered in the module-level docstrings of memory_manager.py and
memory_provider.py, creating false expectations:

- memory_manager.py docstring showed example code doing
  add_provider(BuiltinMemoryProvider(...)) which ImportError at runtime
- memory_provider.py docstring listed BuiltinMemoryProvider as
  'always present, not removable' — misleading for new contributors

The regression test (test_memory_user_id.py) already passes without
any reference to BuiltinMemoryProvider; it uses RecordingProvider
instances directly. The stale references were docs-only drift.

Update both docstrings to reflect the actual current architecture:
MemoryManager accepts external plugin providers only (one at a time).

Closes #14402
2026-05-05 13:33:49 -07:00
Teknium f67063ba81 feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals

Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.

Closes the follow-up from #20232 discussion.

New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.

v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
  ``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
  ``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
  fires when ``tasks.spawn_failures >= 3``; suggests
  ``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
  ``crashed`` run outcomes with no successful completion between;
  suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
  state with no comments / unblock attempts; suggests commenting.

Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.

Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.

API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
  ``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
  section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
  severity, sorted critical-first.

CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
  [--json]`` — fleet view or single-task view, matches dashboard rule
  output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
  the top with severity markers + suggested actions.

Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
  !!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
  diagnostics (not just hallucinations), severity-coloured, lists
  affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
  ``DiagnosticsSection`` rendering a card per active diagnostic:
  title + detail + structured data (task-id chips when payload keys
  look like id lists) + action buttons. Reassign profile picker is
  inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
  environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
  red for critical. Uses CSS variables so theming is straightforward.

Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
  covering each rule's positive/negative/threshold paths, severity
  sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
  populated, severity-filtered), ``/board`` exposes both diagnostic
  list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
  warnings_field_for_hallucinated_completions``) updated to reflect
  the new contract: warning summary keys by diagnostic kind
  (``hallucinated_cards``) not event kind.

379 kanban-suite tests pass (+16 net from this PR).

Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:

* Attention strip: shows ``!! 5 tasks need attention`` in the
  error-severity orange; Show expands to a list of 5 rows ordered
  critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
  render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
  diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
  ``broken-ml-worker → alice`` and the drawer refreshed with the
  new assignee + the same diagnostic still firing (correct:
  spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
  ``--severity error`` narrows to 3; ``kanban show <id>`` includes
  the Diagnostics block at the top with suggested action hint.

Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.

* feat(kanban/diagnostics): lead titles with the actual error text

The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.

New titles:

  Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
  Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
  Agent crashed 3x: provider auth error: 401 Unauthorized
  Agent spawn failed 4x: insufficient_quota: You exceeded your current

Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').

Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).

Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
r266-tech ec7f2f249e docs(cli): add skills reset subcommand to CLI reference
PR #11468 added `hermes skills reset` but cli-commands.md was not
updated. Adds the subcommand to the table and usage examples.

Closes #11543
2026-05-05 13:32:28 -07:00
Brooklyn Nicholson 00d25595c1 perf(ui-tui): narrow overlay subscriptions to focused selectors
Subscribe overlay components to computed theme/session selectors instead of the full UI store so unrelated UI state updates trigger fewer overlay renders.
2026-05-05 13:31:47 -07:00
r266-tech ee502e5640 docs(cli): add --deliver-only flag to hermes webhook subscribe
PR #12473 (merged 2026-04-19) added a new --deliver-only flag to
`hermes webhook subscribe` for zero-LLM direct delivery, but
website/docs/reference/cli-commands.md options table did not
reference it. Add the row so CLI users can discover the flag from
the reference page instead of having to read the source.
2026-05-05 13:30:06 -07:00
Teknium 0dc677f071 docs(skill/hermes-agent): sync slash commands + add durable-systems section
Mirrors the AGENTS.md #20226 additions (Toolsets / Delegation / Curator /
Cron / Kanban) into the user-facing hermes-agent skill, and closes the
drift in the in-session slash command list.

User report (wxrrior in Discord): the skill did not mention /goal, so a
brand-new session answering "/hermes-agent do you have any info on /goal"
confidently said it did not exist. Cross-check against the CommandDef
registry found 16 commands missing from the static list: /goal, /agents,
/busy, /copy, /curator, /debug, /footer, /gquota, /indicator, /kanban,
/redraw, /reload, /reload-skills, /snapshot, /steer, /topic.

Changes:
- Slash Commands header now tells the reader to run /help or check the
  live docs reference as the source of truth, and names the registry
  of record (hermes_cli/commands.py) so future drift gets flagged
  honestly instead of answered confidently wrong.
- Added all 16 missing commands, slotted into existing subsections
  (/goal and /steer in Session; /busy + /indicator + /footer in
  Configuration; /curator + /kanban + /reload-skills + /reload in
  Tools & Skills; /topic in Gateway; /copy in Utility; /gquota +
  /debug in Info).
- Toolsets table updated to the authoritative 30-key list from
  toolsets.py (added kanban, yuanbao, spotify, safe, debugging, video,
  feishu_doc, feishu_drive, discord, discord_admin, clarify; previously
  stopped at 20 keys).
- New "Durable & Background Systems" section before Troubleshooting
  covers Delegation, Cron, Curator, Kanban - each with a short rundown
  of CLI verbs, key invariants, and a pointer to the user-facing docs.
  Mirrors AGENTS.md #20226 but in the skill's user-facing register.
- Bumped version 2.0.0 -> 2.1.0.
2026-05-05 13:29:39 -07:00
r266-tech c28c2a2380 docs(tts): document per-provider max_text_length caps
PR #13743 replaced the global MAX_TEXT_LENGTH=4000 with a per-provider
table and a user-override 'max_text_length:' key, but the user-guide
TTS page documented no length behaviour at all. Users hitting truncation
had no way to discover the new caps or the override.

Add an 'Input length limits' subsection after the existing Configuration
YAML block: provider default caps (Edge 5000 / OpenAI 4096 / xAI 15000 /
MiniMax 10000 / Mistral 4000 / Gemini 5000 / ElevenLabs model-aware /
NeuTTS,KittenTTS 2000), ElevenLabs model_id -> cap table (5k-40k), an
override example, and the validation rules (non-positive / non-integer /
boolean values fall through to the provider default).
2026-05-05 13:28:53 -07:00
Teknium d5357f816d refactor(telegram): make typing thread-id resolver symmetric with send
Mirror _message_thread_id_for_typing() with _message_thread_id_for_send():
both now map the General forum topic (thread id "1") to None upfront.

That removes the need for the retry-without-thread fallback in send_typing()
entirely — if _message_thread_id_for_typing() returns a non-None value, it's
a real user-created topic and falling back to the root chat is never correct.
If Telegram rejects the typing action (e.g. topic deleted mid-session), we
swallow it at debug level instead of bleeding the indicator into All Messages.

Updates the General-topic typing regression test to assert the new single-call
contract.
2026-05-05 13:28:08 -07:00
helix4u 41545f7ec5 fix(telegram): keep DM topic typing scoped 2026-05-05 13:28:08 -07:00
WadydX 0664bf961a docs: fix broken nix-setup anchor for container-aware CLI 2026-05-05 13:27:38 -07:00
WadydX 58f93fb7d3 docs: remove dead papers.md link from saelens references 2026-05-05 13:27:12 -07:00
WadydX 2d5f20684a docs: remove dead reference links in flash-attention skill 2026-05-05 13:26:45 -07:00
Teknium c85a25faaa chore: AUTHOR_MAP entry for Beandon13 2026-05-05 13:26:12 -07:00
Brandon Zarnitz 27a8ba42ed docs(prompt): clarify supported customization surfaces 2026-05-05 13:26:12 -07:00
LeonSGP43 ce9888b52a docs(config): fix fallback provider config paths 2026-05-05 13:24:53 -07:00
beardthelion a6289927d3 docs(web_tools): correct web_extract summarizer timeout comment
The comment at tools/web_tools.py:700-702 stated the runtime default for
auxiliary.web_extract.timeout is 360s. The actual runtime default is 30s
(_DEFAULT_AUX_TIMEOUT in agent/auxiliary_client.py:3140), used by
_get_task_timeout when no auxiliary.web_extract.timeout key is present in
config.yaml.

The 360s figure is the config template default written by
hermes_cli/config.py:697 into freshly-generated config.yaml files. It only
takes effect when that key exists in the user's config — not as a fallback.
Users on configs that predate commit 20b4060d (Apr 5, 2026), or who removed
the key, fall through to the 30s _DEFAULT_AUX_TIMEOUT runtime default.

The comment was introduced in 20b4060d alongside the template-default bump
from 30 to 360. The runtime default in auxiliary_client.py was not changed
in that commit and has remained 30s since 839d9d74 (Mar 28, 2026).
2026-05-05 13:24:19 -07:00
Siddharth Balyan 3b750715a3 fix: resolve lazy session creation regressions (#18370 fallout) (#20363)
Fix three regressions introduced by PR #18370 (lazy session creation):

1. _finalize_session() uses stale session_key after compression (#20001)
2. session_key not synced after auto-compression in run_conversation (#20001)
3. pending_title ValueError leaves title wedged forever (#19029)
4. Gateway silently swallows null responses when agent did work (#18765)
5. One-time cleanup for accumulated ghost compression continuations (#20001)

Changes:
- tui_gateway/server.py: _finalize_session() now uses agent.session_id
  (falls back to session_key when agent is None). Refactor
  _sync_session_key_after_compress() with clear_pending_title and
  restart_slash_worker policy flags. Call it post-run_conversation()
  to sync session_key after auto-compression. Add ValueError handler
  to pending_title flush.
- gateway/run.py: Extract _normalize_empty_agent_response() helper that
  consolidates failed/partial/null response handling. Surfaces user-facing
  error when agent did work (api_calls > 0) but returned no text.
- hermes_state.py: Add finalize_orphaned_compression_sessions() — marks
  ghost continuation sessions as ended (non-destructive, preserves data).
- cli.py: One-time startup migration for orphaned compression sessions.

Test changes:
- tests/test_tui_gateway_server.py: Update pending_title ValueError test
  for post-#18370 architecture (title applied post-message, not at create).
- tests/test_lazy_session_regressions.py: 14 new regression tests covering
  all fixed paths.
2026-05-06 01:11:49 +05:30
Teknium 0397be5939 feat(tui): remove /provider alias for /model (#20358)
/model is the canonical command; /provider was a redundant alias that
dispatched to the same ModelPicker overlay. Drop the alias, the regex
branch in useCompletion, and the alias-coverage test.
2026-05-05 12:23:21 -07:00
Teknium 87b113c2e3 chore: AUTHOR_MAP entry for Tkander1715 2026-05-05 10:18:58 -07:00
Traemond Anderson 60235dba5e feat(cli): add list_picker_providers for credential-filtered picker
The Telegram/Discord /model pickers currently call
list_authenticated_providers(), which returns every provider whose
credentials resolve locally and every model in its curated snapshot.
Two failure modes fall out:

- OpenRouter rows can include IDs the live catalog no longer carries.
- Provider rows can surface with zero callable models (e.g. a slug
  whose credential pool entry exists but has nothing behind it).

list_picker_providers() wraps the base function and post-processes the
result so the interactive picker only shows models the user can
actually select:

- OpenRouter's models come from fetch_openrouter_models() (live-catalog
  filtered against the curated OPENROUTER_MODELS snapshot).
- Rows with an empty models list are dropped, except custom endpoints
  (is_user_defined=True with an api_url) where the user may enter
  model ids manually.
- All other fields pass through unchanged.

The gateway /model handler switches to the new helper for the
interactive picker payload only. Typed /model <name> and the text
fallback list stay on list_authenticated_providers() so nothing is
hidden from power users or platforms without a picker.

Covered by nine focused unit tests in
tests/hermes_cli/test_list_picker_providers.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 10:18:58 -07:00
Teknium cc2c820975 chore: AUTHOR_MAP entry for Aslaaen 2026-05-05 10:18:28 -07:00
Aslaaen e8e9147377 fix(acp): preserve assistant reasoning metadata in session persistence 2026-05-05 10:18:28 -07:00
Teknium dbe9b15fa1 chore: AUTHOR_MAP entry for zeejaytan 2026-05-05 10:15:57 -07:00
Zeejay f8ba265340 fix(aux): trigger fallback on 429 rate-limit errors in auxiliary client
When a provider returns a 429 rate-limit error (not billing-related),
the auxiliary client's call_llm/async_call_llm previously did NOT trigger
the fallback chain. This caused auxiliary tasks like session_search to
exhaust all 3 retries against the same rate-limited endpoint, losing
session metadata that depended on the summarization completing.

Root cause: `_is_payment_error()` only matched 429s containing billing
keywords ("credits", "insufficient funds", etc.). Provider-specific
rate-limit messages like Nous's "Hold up for a bit, you've exceeded the
rate limit on your API key" didn't match, so `_is_payment_error` returned
False, `_is_connection_error` returned False, and `should_fallback` was
False — all retries hit the same rate-limited provider.

Fix:
- New `_is_rate_limit_error()` function that detects 429 + rate-limit
  keywords, generic 429 without billing keywords, and OpenAI SDK
  `RateLimitError` class instances (which may omit .status_code).
- Updated `should_fallback` in both `call_llm` and `async_call_llm` to
  include `_is_rate_limit_error`.
- Updated the max_tokens retry path to also check for rate-limit errors.
- Updated the reason string to include "rate limit".

This complements the Nous rate guard (PR #10568) which prevents new calls
to Nous when already rate-limited — this fix handles the case where a
request is already in flight when the 429 arrives.

Related: #8023, #12554, #11034
Co-authored-by: Zeejay <zjtan1@gmail.com>
2026-05-05 10:15:57 -07:00
Teknium 8c0f254c06 chore: AUTHOR_MAP entry for LeonSGP43 2026-05-05 10:15:31 -07:00
LeonSGP43 244bacd0dc fix(skills): support category-qualified local skill names 2026-05-05 10:15:31 -07:00
Teknium 4553e32bc4 chore: AUTHOR_MAP entry for Es1la 2026-05-05 10:15:09 -07:00
Es1la a877c3f6d9 fix(feishu): tolerate malformed dedup timestamps
Salvages @Es1la's PR #13632 — a non-numeric timestamp in the persisted
feishu dedup state crashed adapter startup with ValueError/TypeError
from the unguarded float() call. Wrap the float() conversion in
try/except; skip the bad key and keep loading the rest.

The original PR also restructured existing TestDedupTTL tests to use
tempfile.TemporaryDirectory + HERMES_HOME patching — that was
test-hygiene scope creep unrelated to the bug. Kept only the
malformed-timestamp fix and added a focused regression test.
2026-05-05 10:15:09 -07:00
Teknium 77a102b7de chore: AUTHOR_MAP entry for jkausel-ai 2026-05-05 10:14:48 -07:00
Justin Kausel 526742199b Prefer fallback for Gemini CloudCode rate limits 2026-05-05 10:14:48 -07:00
Teknium 12135b4c8a chore: AUTHOR_MAP entry for wysie 2026-05-05 10:14:17 -07:00
Wysie 0120d8f31e fix: merge plugin tools into builtin toolsets 2026-05-05 10:14:17 -07:00
Teknium d9f0875591 chore: AUTHOR_MAP entry for hharry11 2026-05-05 10:13:55 -07:00
hharry11 247c9d468c fix(gateway): ensure deterministic thread eviction in helpers 2026-05-05 10:13:55 -07:00
Teknium 935cf2fcca chore: AUTHOR_MAP entry for JTroyerOvermatch 2026-05-05 10:13:34 -07:00
Jonathan Troyer 6430d67569 fix(openrouter): use canonical X-Title attribution header
OpenRouter's dashboard attributes usage via the `X-Title` header.
Hermes was sending `X-OpenRouter-Title`, which OpenRouter does not
recognize, so Hermes usage showed up unlabeled. Rename to `X-Title`
to match the canonical header (already used elsewhere in the same
file via _AI_GATEWAY_HEADERS).

Salvages the core fix from @JTroyerOvermatch's PR #13649. Dropped the
PR's `HERMES_OPENROUTER_TITLE` / `HERMES_OPENROUTER_REFERER` env-var
override plumbing per the '.env is for secrets only' policy — if
per-deployment attribution is needed later it should go under
`openrouter.title` / `openrouter.referer` in config.yaml instead.
2026-05-05 10:13:34 -07:00
Teknium 269be4ec84 chore: AUTHOR_MAP entry for Bongulielmi 2026-05-05 10:13:13 -07:00
Remigio Bongulielmi d8097d587f refactor(env): use shared Hermes dotenv loader 2026-05-05 10:13:13 -07:00
Teknium c62d8c9b74 chore: AUTHOR_MAP entry for Bartok9 2026-05-05 10:12:40 -07:00
Bartok dad62c4c47 fix(whatsapp): auto-convert mp3/wav to ogg/opus in send-media for native voice bubbles
WhatsApp bridge (bridge.js) only sets ptt:true when file extension is .ogg
or .opus, causing mp3/wav files (from Edge TTS, NeuTTS, etc.) to arrive
as file attachments instead of voice bubbles — silently, with no error.

Fix: when audio type is sent with a non-ogg/opus format, run ffmpeg
conversion to ogg/opus in a temp file before sending. This makes
send_voice() self-sufficient regardless of what format the caller provides.

Fallback: if ffmpeg is unavailable, original buffer is sent (previous
behaviour) with a console.warn — no crash.

Addresses veloguardian's review comment on PR #4992.
2026-05-05 10:12:40 -07:00
Teknium 45949e944a chore: AUTHOR_MAP entry for Junass1 2026-05-05 10:05:23 -07:00
Teknium e4e0090b54 test(acp): regression for #13675 — save_session preserves existing messages on encode failure 2026-05-05 10:05:23 -07:00
Junass1 5795b3be4e fix(acp): use SessionDB.replace_messages for atomic history rewrite
ACP's save_session() did a non-atomic clear_messages() + append_message()
loop. If any message hit an exception mid-loop (bad tool_call shape, etc.),
the DELETE had already committed and the persisted conversation was lost.

SessionDB.replace_messages() wraps DELETE + bulk INSERT in a single
BEGIN IMMEDIATE transaction that rolls back on any exception, so a bad
message can no longer clobber previously-persisted history.

Salvages @Awsh1's PR #13675 — uses the existing replace_messages()
helper (which covers more message fields than the PR's own copy)
instead of adding a duplicate.
2026-05-05 10:05:23 -07:00
Justin Kausel e805380b82 Discover plugin commands during CLI dispatch 2026-05-05 09:58:37 -07:00
sprmn24 ecc909de38 fix(session): serialize JSONL transcript appends under existing lock 2026-05-05 09:57:31 -07:00
sprmn24 db84c1535d fix(ssh): add scp availability check to preflight validation 2026-05-05 09:57:23 -07:00
WuTianyi 8e18d10318 fix(feishu): force text mode for markdown tables
Feishu post-type 'md' elements do not render markdown tables.
When table content is sent as post (triggered by **bold** matching
_MARKDOWN_HINT_RE), the message appears blank on the client.

Add _MARKDOWN_TABLE_RE to detect markdown table syntax and force
text mode for table content, ensuring it is visible as plain text.
2026-05-05 09:57:14 -07:00
Teknium b014a3d315 test(cron): update _isolate_tick_lock fixture for _get_lock_paths
After PR #13725 replaced the module-level _LOCK_DIR/_LOCK_FILE constants
with a dynamic _get_lock_paths() helper, the xdist-isolation fixture
needs to patch the function instead of the removed constants.
2026-05-05 09:57:06 -07:00
邓taoyuan 969bfff449 fix: merge _get_hermes_home() dynamic resolution and feishu receive_id_type detection
- scheduler.py: Replace static _hermes_home with dynamic _get_hermes_home() function
  to support profile switching at runtime (HERMES_HOME override)
- scheduler.py: Replace static _LOCK_DIR/_LOCK_FILE with _get_lock_paths() function
  for profile-aware lock path resolution
- feishu.py: Add receive_id_type detection (oc_/ou_ -> open_id, else chat_id)
  to fix Feishu API '[230001] ext=invalid receive_id' error for user DMs
2026-05-05 09:57:06 -07:00
Teknium de9238d37e feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.

Closes #20017.

Recovery UX (kernel + CLI + dashboard)
--------------------------------------

A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:

* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
  releases an active worker claim immediately (unlike
  ``release_stale_claims`` which only acts after claim_expires has
  passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
  switch a task to a different profile, optionally reclaiming a stuck
  running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
  ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
  CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
  ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
  dashboard plugin.

Dashboard surfacing
-------------------

* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
  tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
  with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
  Reassign (with profile picker + reclaim-first checkbox), and a
  copy-to-clipboard hint for ``hermes -p <profile> model`` since
  profile config lives on disk and can't be edited from the browser.
  Auto-opens when the task has warnings, collapsed otherwise.
  Keyed by task id so state doesn't leak between drawers.

Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.

Skill updates
-------------

* ``skills/devops/kanban-worker/SKILL.md`` documents the
  ``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
  stuck workers" section with the three actions and when to use each.

Tests
-----

* Kernel gate: verified-cards manifest, phantom rejection + audit
  event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
  returns False, reassign refuses running without reclaim_first,
  reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
  warnings cleared after clean completion, reclaim 200 + 409 paths,
  reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.

Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.

359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
Teknium 7de3c86c5a feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* feat(i18n): add display.language for static message translation (zh/ja/de/es)

Adds a thin-slice i18n layer covering the highest-impact static user-facing
messages: the CLI dangerous-command approval prompt and a handful of gateway
slash-command replies (restart-drain, goal cleared, approval expired, config
read/save errors).

Out of scope (stays English): agent responses, log lines, tool outputs,
slash-command descriptions, error tracebacks.

Infrastructure:
- agent/i18n.py: catalog loader, t() helper, language resolution
  (HERMES_LANGUAGE env var > display.language config > en)
- locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language
- display.language in DEFAULT_CONFIG (hermes_cli/config.py)

Tests:
- tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder
  parity across locales, fallback behavior, env-var override, alias
  normalization, missing-key graceful degradation.

Docs:
- website/docs/user-guide/configuration.md: display.language entry plus a
  short section explaining scope so users don't expect agent responses to
  translate via this knob.
2026-05-05 08:03:07 -07:00
Teknium b7bd177105 docs(AGENTS.md): add curator/cron/delegation/toolsets, fix plugin tree (#20226)
* docs(AGENTS.md): add curator/cron/delegation/toolsets, fix plugin tree, frontmatter, auto-discovery caveat

Closes #19101 and #19107 (@pty819).

Verified 16 claims from those two issues against current main. 12 were
real gaps; 2 were generated/hallucinated (#10 unverified --now flag is
actually real and already cited in AGENTS.md; #11 stale PR refs #5587
and #4950 do not appear in AGENTS.md at all); 2 were low-prio nits
(memory provider hierarchy, --now scope enumeration) deferred.

Changes:
- Project tree: add yuanbao to platforms comment; expand plugins/
  subtree with real directory names (kanban, hermes-achievements,
  observability, image_gen) instead of vague '<others>'.
- Test-count blurb: 15k/700 Apr → 17k/900 May (verified: 17,375 test
  defs, 915 files).
- Adding New Tools: clarify that auto-discovery wires up schemas but
  the tool only reaches an agent if its name is added to a toolset in
  toolsets.py. _HERMES_CORE_TOOLS is not dead code.
- Adding Configuration: enumerate top-level config.yaml sections
  including auxiliary and curator; note auxiliary is per-task
  overrides for side-LLM work.
- SKILL.md frontmatter: add author, license, related_skills. Note
  top-level tags/category are mirrored from metadata.hermes.*.
- New section 'Toolsets' — enumerates the 30 current TOOLSETS keys
  (including yuanbao, kanban, moa, spotify, safe, debugging).
- New section 'Delegation (delegate_task)' — sync semantics, batch
  mode, leaf vs orchestrator roles, config knobs, durability caveat.
- New section 'Curator (skill lifecycle)' — core files, 11 CLI verbs,
  telemetry sidecar, invariants (pin/delete split after PR #20220,
  bundled/hub off-limits), curator.* config section.
- New section 'Cron (scheduled jobs)' — 4 schedule formats, 7 CLI
  verbs, per-job fields, 3-min hard interrupt, catchup/grace windows,
  tick.lock, cron→session isolation.

Skipped (invalid claims):
- #19107 item 10: --now is real (hermes_cli/skills_hub.py:624/966/1013/1470)
- #19107 item 11: no '#5587' or '#4950' or 'async_delegation' in AGENTS.md

* docs(AGENTS.md): add Kanban section

Adds a Kanban entry alongside Curator / Cron / Delegation so the major
durable background systems are all represented. Covers the CLI verbs,
the HERMES_KANBAN_TASK-gated worker toolset, the in-gateway dispatcher,
plugin assets, and the board/tenant isolation model. Points at the full
742-line user docs for detail.
2026-05-05 07:56:29 -07:00
Teknium 7530ce04e0 chore: AUTHOR_MAP entry for MaHaoHao-ch 2026-05-05 06:12:42 -07:00
MaHaoHao-ch 02147cc850 fix(cli): sanitize bracketed paste markers during setup
Strip bracketed-paste control sequences from setup prompt input so pasted API keys work on Linux and WSL terminals, and add regression tests for normal/password prompts.

Closes #16491
2026-05-05 06:12:42 -07:00
Teknium 8ebb81fd76 chore: AUTHOR_MAP entry for rxdxxxx 2026-05-05 06:12:11 -07:00
rxdxxxx c46bc92949 fix(run_agent): use aux provider for compression context length lookup
Each auxiliary model must be resolved with its own provider so that
provider-specific paths (e.g. Bedrock static table, OpenRouter API)
are invoked for the correct client, not inherited from the main model.

When the main model is Bedrock, passing self.provider unconditionally
to get_model_context_length() for the aux model caused the Bedrock
static table hard-intercept (step 1b) to fire for non-Bedrock models,
returning BEDROCK_DEFAULT_CONTEXT_LENGTH=128K instead of the model's
real context window — triggering a false compression warning every session.

Fix: pass _aux_cfg_provider when explicitly set, falling back to
self.provider only when the aux provider is unset or "auto".

Closes #12977
Related: #13807, #17460
2026-05-05 06:12:11 -07:00
Teknium fb311952d7 chore: AUTHOR_MAP entry for Krionex 2026-05-05 06:11:38 -07:00
Teknium 285c208cf7 fix(gateway): also tolerate malformed env vars in custom human-delay mode
Widens @Krionex's PR #16933 fix to cover the second bug class at the sibling
site. natural mode used to pass env values through int() before the PR
caught mis-typed values crashing the gateway; custom mode had the exact
same bug one branch away (HERMES_HUMAN_DELAY_MIN_MS=oops in custom mode
still crashed). Same try/except/fallback pattern, scoped to the two
int() calls that feed random.uniform().
2026-05-05 06:11:38 -07:00
Krionex 3b16c590e0 fix(gateway): ignore malformed custom delay env vars in natural mode 2026-05-05 06:11:38 -07:00
Teknium 349d0da07e chore: AUTHOR_MAP entry for novax635 2026-05-05 06:11:03 -07:00
novax635 4e6f51167d fix(cli): fall back on invalid HERMES_MAX_ITERATIONS 2026-05-05 06:11:03 -07:00
Teknium 37b5731694 chore: AUTHOR_MAP entry for npmisantosh 2026-05-05 06:08:14 -07:00
Santosh f6677748a0 fix(claw): handle missing dir in _scan_workspace_state 2026-05-05 06:08:14 -07:00
Teknium f844e516d8 chore: AUTHOR_MAP entry for agentlinker 2026-05-05 06:07:44 -07:00
Leon 19eebf6e0d fix(openrouter): treat xiaomi models as reasoning-capable 2026-05-05 06:07:44 -07:00
vominh1919 96514de472 fix(auxiliary): avoid locking into custom path when api_key is empty
When auxiliary.<task> config has base_url set but api_key is empty
(common when user expects env var fallback), _resolve_task_provider_model()
returned provider="custom" with api_key=None. This caused downstream
client construction to make API calls without an Authorization header,
resulting in HTTP 401 errors.

Fix: only return "custom" when BOTH cfg_base_url AND cfg_api_key are
non-empty. When base_url is set without api_key but with a known
provider (e.g. "openrouter"), pass through to that provider so it can
resolve credentials from environment variables.

Fixes #16829
2026-05-05 06:07:07 -07:00
Teknium c7fc5af122 chore: AUTHOR_MAP entry for tangyuanjc 2026-05-05 06:04:20 -07:00
JC的AI分身 80b386a472 fix(feishu): refresh bot identity during hydration 2026-05-05 06:04:20 -07:00
Teknium 314361733f test(api_server): _run_agent result now carries session_id for #16938 2026-05-05 06:01:03 -07:00
vominh1919 7f735b4db2 fix: return effective session_id after context compression (#16938)
When context compression rotates the agent's session_id to a new
child session, the API server was still returning the stale parent
session_id in the X-Hermes-Session-Id response header.

This caused external clients to keep sending the old session_id,
loading uncompressed parent history instead of the compressed
continuation.

Fix: _run_agent() now includes the effective session_id in its
result dict, and the response header uses it instead of the
original provided session_id.
2026-05-05 06:01:03 -07:00
Hafiy Zakaria 34c6f93496 fix: resolve model.aliases from config.yaml in /model alias resolution
hermes config set model.aliases.xxx commands write to the model.aliases
nested key, but _load_direct_aliases() only read from the top-level
model_aliases key. This meant aliases set via hermes config set were
invisible to the /model command, and unrecognised inputs fell through
to the DeepSeek normaliser which mapped everything to deepseek-chat.

Add a second pass in _load_direct_aliases() that reads model.aliases
and converts string-value entries (provider/model format) into
DirectAlias objects. The provider is parsed from the slash prefix;
if no slash, the current default provider from config is used.

Also prevent simple aliases from overriding explicit model_aliases
dict entries when both exist.
2026-05-05 05:49:01 -07:00
briandevans c1a2710a32 test(aux): cover effort: 0 fallback in Codex reasoning translation
Copilot review on PR #17012 noted the docstring/comment lists `0`
among the falsy effort values that fall back to `medium`, but the
existing regression tests only cover `None` and `""`. Add the third
case to lock in the full contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:47:50 -07:00
briandevans 9e893d16d1 fix(aux): default Codex reasoning effort to medium when extra_body.reasoning.effort is falsy
auxiliary.<task>.extra_body.reasoning, but the new translation path in
_CodexCompletionsAdapter.create() reads the effort with
``reasoning_cfg.get("effort", "medium")``.  That returns the configured
value verbatim when the key is present, so ``effort: null`` /
``effort: ""`` (both common YAML shapes) flow through as
``{"effort": null, "summary": "auto"}`` and Codex rejects the request
with "Invalid value for parameter ``reasoning.effort``".

agent/transports/codex.py::build_kwargs() — which the new adapter is
documented to mirror — uses a truthy check (``elif
reasoning_config.get("effort"):``) so the same falsy values keep the
"medium" default.  Switch the auxiliary adapter to the same
``or "medium"`` truthy form so identical config produces identical
requests on both paths.

- [x] Two new regression tests cover ``effort: None`` and
  ``effort: ""`` and assert the request goes out as
  ``{"effort": "medium", "summary": "auto"}``.
- [x] Old behaviour fails the new tests (``{'effort': None} !=
  {'effort': 'medium'}``); fixed behaviour passes all 11 tests in the
  ``TestCodexAdapterReasoningTranslation`` class.
- [x] Adjacent suites green: ``tests/agent/test_auxiliary_client.py``
  (108 passed) and ``tests/agent/transports/test_codex_transport.py +
  test_chat_completions.py`` (73 passed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:47:50 -07:00
vominh1919 44cf33449d fix(mcp): add periodic keepalive to _wait_for_lifecycle_event
Sends a lightweight list_tools() probe every 3 minutes during idle
periods to prevent TCP connections from going stale behind LB / NAT
idle timeouts (commonly 300-600s).  When the keepalive fails, the
reconnect event fires so the transport rebuilds the session cleanly.

Salvages the keepalive portion of @vominh1919's PR #17016. The
circuit-breaker half-open recovery from the same PR was independently
landed on main via #benbarclay's commit 8cc3cebca ("fix(mcp): add
half-open state to circuit breaker", Apr 21); only the keepalive is
salvaged here.

Fixes #17003.
2026-05-05 05:47:33 -07:00
Teknium 005b2f4c5d chore: AUTHOR_MAP entry for beardthelion 2026-05-05 05:46:16 -07:00
beardthelion f15b0fbb4f fix: add PLATFORM_HINTS entry for api_server platform
The API server is a documented, first-class messaging platform with its own
gateway adapter, docs pages, and toolset. But it's the only messaging
platform missing from PLATFORM_HINTS in agent/prompt_builder.py.

Without a platform hint, the agent has no context about the API server's
rendering environment and defaults to markdown-heavy document-style outputs
(code fences, bold, bullet points) — which break on the plain-text frontends
most API server consumers wrap (Open WebUI, custom agents, third-party
bridges).

Adds a generic api_server entry that describes the medium (unknown rendering,
assume plain text) without encoding any specific use case. Individual consumers
can layer additional style guidance via ephemeral system prompts.

Before (DeepSeek V4 Pro via API server, no hint):
  **Sendblue bridge** at /opt/sendblue-bridge - **68MB** on disk

After (same prompt, with hint):
  Sendblue bridge at /opt/sendblue-bridge, 68MB on disk

No breaking changes — new dict entry only. Existing API server consumers see
no behavioral change except for models that previously defaulted to markdown
formatting, which now produce cleaner plain-text output.
2026-05-05 05:46:16 -07:00
Teknium b10e38e392 fix(skills): pin protects against deletion only, not edits (#20220)
Previously, pinning a skill blocked every skill_manage write action
(edit, patch, delete, write_file, remove_file). The 'hard fence'
design conflated two concerns:

  1. Pin as deletion protection — don't let the curator archive
     or the agent delete a stable skill.
  2. Pin as content freeze — don't let the agent rewrite it mid-conversation.

In practice (1) is what users pin for: they want a skill to survive
curator passes. (2) created friction — agents finding a new pitfall
in a pinned skill had to ask the user to unpin, then the agent
patches, then the user re-pins. The dance discouraged skill
maintenance and pinned skills went stale.

This narrows the _pinned_guard to skill_manage(action='delete') only.
Patches, edits, and supporting-file writes go through on pinned
skills so the agent can keep improving them. The curator's own
pinned-skip behavior (agent/curator.py:271 for auto-archive,
line 349 for the LLM review prompt) is unchanged — curator still
never touches pinned skills.

Changes:
- tools/skill_manager_tool.py: remove _pinned_guard calls from
  _edit_skill, _patch_skill, _write_file, _remove_file; keep on
  _delete_skill. Updated _pinned_guard docstring and error message.
- tools/skill_manager_tool.py: updated skill_manage model-facing tool
  description to reflect the new semantic.
- website/docs/user-guide/features/curator.md: updated pinning
  section.
- tests/tools/test_skill_manager_tool.py: flipped refuses-pinned
  tests for edit/patch/write_file/remove_file into allowed-when-pinned;
  kept test_delete_refuses_pinned (strengthened assertion to check the
  'cannot be deleted' wording).

Closes #18354
2026-05-05 05:43:10 -07:00
Teknium fe8560fc12 feat(api-server): X-Hermes-Session-Key header for long-term memory scoping (#20199)
* feat(api-server): X-Hermes-Session-Key header for long-term memory scoping

API Server integrations (Open WebUI, custom web UIs) can now pass a stable
per-channel identifier via X-Hermes-Session-Key that scopes long-term memory
(Honcho, etc.) independently of the transcript-scoped X-Hermes-Session-Id.
This matches the native gateway's session_key / session_id split: one stable
key per assistant channel, many independent transcripts that rotate on /new.

- _create_agent and _run_agent accept gateway_session_key and pass it to
  AIAgent(gateway_session_key=...), which is already honored by the Honcho
  memory provider (plugins/memory/honcho/client.py resolve_session_name).
- New shared helper _parse_session_key_header applies the same API-key
  gate, control-character sanitization, and a 256-char length cap as the
  existing session-id header.
- All three agent endpoints honor the header: /v1/chat/completions,
  /v1/responses, /v1/runs. JSON and SSE responses echo it back.
- /v1/capabilities advertises session_key_header so clients can
  feature-detect.

Closes #20060.

Co-authored-by: Andy Stewart <lazycat.manatee@gmail.com>

* chore: AUTHOR_MAP entry for manateelazycat

---------

Co-authored-by: Andy Stewart <lazycat.manatee@gmail.com>
2026-05-05 05:34:47 -07:00
Teknium 436672de0e feat(curator): add archive and prune subcommands (#20200)
* fix(curator): protect hub skills by frontmatter name

* test(skill_usage): add mark_agent_created to regression test

The cherry-picked test predates #19618/#19621 which rewrote
list_agent_created_skill_names() to require an explicit
created_by: 'agent' provenance marker. Without mark_agent_created(),
my-skill is excluded from the list and the positive assertion fails.

* feat(curator): add archive and prune subcommands

Adds 'hermes curator archive <skill>' and 'hermes curator prune
[--days N] [--yes] [--dry-run]' alongside the existing status, run,
pause, resume, pin, unpin, restore, backup, rollback verbs.

These are the two genuinely new user-facing verbs requested in #19384.
The other verbs proposed there ('stats' and 'restore') already exist
as 'curator status' and 'curator restore', so no duplicate surface is
added — all skill lifecycle commands live under the single 'hermes
curator' namespace.

- archive: manual archive of an agent-created skill. Refuses pinned
  skills with a hint pointing at 'hermes curator unpin'.
- prune: bulk-archive unpinned skills idle for >= N days (default 90).
  Falls back to created_at when last_activity_at is null so never-used
  skills can still be pruned. --dry-run previews, --yes skips prompt.

Adapted from @elmatadorgh's PR #19454 which placed the same verbs
under 'hermes skills' with a separate hermes_cli/skills_config.py
handler and rich table for stats. The 'stats' and 'restore' parts of
that PR duplicated existing surface, so only archive and prune are
kept, rewritten to match hermes_cli/curator.py's existing plain-text
handler style. Tests rewritten from scratch against the new handlers.

Closes #19384

Co-authored-by: elmatadorgh <coktinbaran5@gmail.com>

---------

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: elmatadorgh <coktinbaran5@gmail.com>
2026-05-05 05:15:54 -07:00
Teknium 4f76166cf0 chore: AUTHOR_MAP entry for qxxaa 2026-05-05 05:01:12 -07:00
qxxaa 0a7cc85eab fix(honcho): pass user_message as search_query in get_prefetch_context
The user_message parameter was accepted by get_prefetch_context but intentionally discarded, with the rationale that passing it would
expose conversation content in server access logs.

This rationale is inconsistent: Honcho already persists every message in full via saveMessages. The content is already in the database. A search query in an access log adds negligible additional exposure, and is moot for self-hosted Honcho deployments where the operator owns the logs.

Without search_query, Honcho returns the full peer representation -
all observations, deductive/inductive layers, and peer card - in
insertion order. When contextTokens is set, the most useful parts
(peer card, dialectic conclusions) are truncated because raw
observations fill the budget first.

Passing user_message as search_query enables Honcho's semantic
retrieval to return only conclusions relevant to the current session
topic, reducing injection noise and improving context quality on cold starts.

The _fetch_peer_context method already accepts and passes search_query to the Honcho API. This change simply connects the two.
2026-05-05 05:01:12 -07:00
Teknium 046c293183 chore: AUTHOR_MAP entry for chengoak 2026-05-05 05:00:41 -07:00
chengoak 8f4c0bf088 fix(wecom): pad base64 AES key before decode
WeCom doesn't pad base64 aeskey, causing Python strict mode decode failure
on media/image/file messages. Add automatic padding before base64 decode:
aes_key + '=' * ((4 - len(aes_key) % 4) % 4).

Salvages the AES padding fix from @chengoak's PR #17040. The SSRF whitelist
entry for a private COS bucket hostname was dropped as it belongs in user
config, not the built-in trusted-private-IP-hosts list. The debug-level
full-body info log was dropped to avoid logging potentially sensitive
message content at INFO level.
2026-05-05 05:00:41 -07:00
Teknium 83a07f4759 chore: AUTHOR_MAP entry for happy5318 2026-05-05 05:00:05 -07:00
Teknium 9e0ef2a1bc test: pin per-turn reasoning extraction semantics
Covers four scenarios for the reasoning-box extraction loop:
 - simple turn with reasoning
 - simple turn with no reasoning
 - tool-calling turn where reasoning lives on the tool-call step
 - prior turn had reasoning, current turn does not (the stale-display
   bug the fix exists for)
 - tool-calling turn where reasoning lives on BOTH steps (latest wins)
 - empty-string reasoning treated as missing

Also updates the four inline replica loops in tests/cli/test_reasoning_command.py
to match the new turn-boundary shape so the test file reflects
production semantics.
2026-05-05 05:00:05 -07:00
happy5318 efe1cb00c8 fix: prevent stale reasoning from being reused across turns
The reasoning-box extraction loop in run_conversation() walked backwards
through the entire message history looking for any assistant message
with a non-empty 'reasoning' field.  When the current turn produced
no reasoning (e.g. the provider returned reasoning_content=null for a
trivial response), the loop walked past the current turn and showed
reasoning from a prior turn — stale text from minutes or hours ago
displayed as if it belonged to the current reply.

Fix: stop the walk at the user message that started the current turn.
That picks the most recent reasoning WITHIN the turn (correct for
tool-calling turns where reasoning lands on the tool-call step and
the final-answer step has reasoning=None — common on Claude thinking,
DeepSeek v4, Codex Responses), and returns None cleanly when the
current turn genuinely had no reasoning.

Co-authored-by: happy5318 <happy5318@users.noreply.github.com>
2026-05-05 05:00:05 -07:00
Teknium 4577f392f9 chore: AUTHOR_MAP entry for ashermorse 2026-05-05 04:58:23 -07:00
Asher Morse 6b76ea4707 fix(gateway): load reply_to_mode from config.yaml for Discord and Telegram
The YAML-to-env-var bridge in load_gateway_config() mapped every Discord
and Telegram config key (require_mention, auto_thread, reactions, etc.)
except reply_to_mode. Users setting discord.reply_to_mode or
telegram.reply_to_mode in ~/.hermes/config.yaml got no effect — the
adapter only read the env var, which nothing populated from YAML.

Add the missing bridge for both platforms, following the existing pattern.
Top-level <platform>.reply_to_mode preferred, falls back to
<platform>.extra.reply_to_mode, env var never overwritten. Handles YAML
1.1 bare `off` → Python False coercion.

This is a re-submission of the work from #9837 and #13930, which both
implemented the same fix but neither landed (see co-authors below).

Co-authored-by: Matteo De Agazio <hypnosis.mda@gmail.com>
Co-authored-by: ishardo <239075732+ishardo@users.noreply.github.com>
2026-05-05 04:58:23 -07:00
LeonSGP43 354502ee48 fix(kanban): preserve dashboard completion summaries 2026-05-05 04:57:38 -07:00
Teknium cca8587d35 docs(quickstart): link Onchain AI Garage Hermes tutorials playlist (#20192)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* docs(quickstart): link Onchain AI Garage Hermes tutorials playlist

Adds a 'Prefer to watch?' tip callout near the top of the quickstart page pointing to @OnchainAIGarage's Hermes Agent Tutorials + Use Cases playlist, which includes a Masterclass series covering install, setup, and basic commands.

* docs(quickstart): embed Masterclass video in Prefer to watch section

Swaps the plain-link tip callout for an inline responsive YouTube embed of the Hermes Agent Masterclass (R3YOGfTBcQg) plus a kept link to the full Onchain AI Garage tutorials playlist.
2026-05-05 04:56:54 -07:00
Teknium 4d0f59fa5a test(skill_usage): add mark_agent_created to regression test
The cherry-picked test predates #19618/#19621 which rewrote
list_agent_created_skill_names() to require an explicit
created_by: 'agent' provenance marker. Without mark_agent_created(),
my-skill is excluded from the list and the positive assertion fails.
2026-05-05 04:55:22 -07:00
LeonSGP43 68c1a08ad1 fix(curator): protect hub skills by frontmatter name 2026-05-05 04:55:22 -07:00
Teknium 5168226d60 feat(file_tools): post-write delta lint on write_file + patch, add JSON/YAML/TOML/Python in-process linters (#20191)
Closes the gap where write_file skipped the post-edit syntax check that
patch already ran, so silent file corruption (bad quote escaping,
truncated writes, etc.) would persist on disk until a later read.

## Changes

tools/file_operations.py:
- Add in-process linters for .py, .json, .yaml, .toml (LINTERS_INPROC).
  Python uses ast.parse, JSON/YAML/TOML use stdlib/PyYAML parsers.
  Zero subprocess overhead; preferred over shell linters when both apply.
- _check_lint() now accepts optional content and routes to in-process
  linter first. Shell linter (py_compile, node --check, tsc, go vet,
  rustfmt) remains the fallback for languages without an in-process
  equivalent.
- New _check_lint_delta() implements the post-first/pre-lazy pattern
  borrowed from Cline and OpenCode: lint post-write state first; only
  if errors are found AND pre-content was captured does it lint the
  pre-state and diff. If the pre-existing file had the SAME errors the
  edit didn't introduce anything new, so the file is reported as 'still
  broken, pre-existing' with success=False but a message explaining the
  errors were pre-existing. If the edit introduced genuinely new errors,
  those are surfaced and pre-existing ones are filtered out.
- WriteResult gains a lint field.
- write_file() captures pre-content for in-process-lintable extensions
  and calls _check_lint_delta after a successful write.
- patch_replace() switches from _check_lint to _check_lint_delta,
  reusing the pre-edit content it already has in scope.

tools/file_tools.py:
- Update write_file schema description to mention the post-write lint.

tests/tools/test_file_operations_edge_cases.py:
- Update existing brace-path tests to use .js (shell linter) now that
  .py is in-process.
- Add TestCheckLintInproc (9 tests) covering Python/JSON/YAML/TOML
  in-process linters.
- Add TestCheckLintDelta (5 tests) covering the post-first/pre-lazy
  short-circuit, new-file path, and the single-error-parser caveat.

## Performance

In-process linters are microseconds per call (ast.parse, json.loads).
The hot path (clean write) runs exactly one lint — matches main's cost
for patch. Pre-state capture is skipped when the file has no applicable
linter. Measured 4.89ms/write average over 100 .py writes including lint.

## Inspiration

- Cline's DiffViewProvider.getNewDiagnosticProblems() — filters pre-write
  diagnostics from post-write diagnostics (src/integrations/editor/DiffViewProvider.ts).
- OpenCode's WriteTool — runs lsp.diagnostics() after write and appends
  errors to tool output (packages/opencode/src/tool/write.ts).
- Claude Code's DiagnosticTrackingService — captures baseline via
  beforeFileEdited() and returns new-diagnostics-only from
  getNewDiagnostics() (src/services/diagnosticTracking.ts).

## Validation

- tests/tools/test_file_operations.py + test_file_operations_edge_cases.py
  + test_file_tools.py + test_file_tools_live.py + test_file_write_safety.py
  + test_write_deny.py + test_patch_parser.py + test_file_ops_cwd_tracking.py:
  228 passed locally.
- Live E2E reproduction of the tips.py corruption incident: broken
  content written; lint field surfaces 'SyntaxError: invalid syntax.
  Perhaps you forgot a comma? (line 6, column 5)' — the exact error
  that would have self-corrected the bug on the next turn.
2026-05-05 04:54:17 -07:00
Teknium b93643c8fe chore: AUTHOR_MAP entry for wmagev 2026-05-05 04:51:29 -07:00
wmagev 2eef395e1c fix(compaction): mark end of context summary in role=user fallback
When the head ends with assistant/tool and the tail starts with assistant,
the summary is inserted as a standalone role="user" message. The body's
verbatim "## Active Task" quote then gets read as fresh user input by
weak/local models (#11475, #14521).

The merge-into-tail path already appends an explicit end-of-summary marker
for this reason. Mirror it on the standalone path so both insertion routes
give the model the same "summary above, not new input" signal.
2026-05-05 04:51:29 -07:00
Teknium c725d7d648 chore: AUTHOR_MAP entry for TheEpTic 2026-05-05 04:45:32 -07:00
Nexus 660ce7c54b fix(ui-tui): prevent React effect cleanup from killing python TUI gateway subprocess
The useEffect at useMainApp.ts:546-565 calls gw.kill() in its cleanup function. React calls cleanup on every re-render when the dependency array ([gw, sys]) shifts — which happens whenever sys changes identity (any system message). This sends SIGTERM to the Python TUI gateway subprocess, silently killing the backend mid-session.

The kill path was already handled by entry.tsx's setupGracefulExit for real app exits (SIGINT, uncaught exception). The die() function also calls gw.kill() for explicit user exit. Removing the cleanup kill leaves all exit paths covered while preventing accidental mid-session kills on ordinary React re-renders.
2026-05-05 04:45:32 -07:00
LeonSGP43 1a03e3b1c6 fix(kanban): detect darwin zombie workers 2026-05-05 04:43:40 -07:00
0xsir0000 f6b68f0f50 fix(gateway): keep DoH-confirmed Telegram IPs that match system DNS (#14520)
discover_fallback_ips() filtered out any DoH-resolved IP that also appeared
in the system resolver's answer set, on the assumption that the system IP
was unreachable. When DoH and system DNS agreed (a common case), the
function returned the hardcoded _SEED_FALLBACK_IPS list instead — and on
networks where those seed addresses are not routable, the Telegram fallback
transport had nothing usable to retry against and polling failed.

Drop the system_ips exclusion so DoH-confirmed IPs are preserved regardless
of system DNS overlap. The TelegramFallbackTransport already tries the
primary path first via system DNS, then falls through to the IP-rewrite
path on connect failure; including the same IP in both lanes lets a
transient primary failure recover via the explicit IP route instead of
escalating to seed addresses.

Update the two tests that codified the old exclusion to reflect the new,
inclusion-by-default behaviour.

Fixes #14520
2026-05-05 04:42:59 -07:00
revaraver aacf36e943 fix(cli): persist manual compress handoff 2026-05-05 04:42:48 -07:00
Teknium fe8dc26bc9 chore: AUTHOR_MAP entry for revaraver noreply 2026-05-05 04:42:44 -07:00
revaraver 4a3e3e20e5 fix(compression): preserve iterative summary continuity 2026-05-05 04:42:44 -07:00
Teknium f8a6db68ca test(kanban): isolate HERMES_KANBAN_BOARD writes in pin-env tests
The helper under test writes to os.environ directly, bypassing
monkeypatch tracking. Without an explicit snapshot/restore fixture,
the mutation leaks into subsequent tests and breaks TestSharedBoardPaths
(kanban path resolution reads HERMES_KANBAN_BOARD and routes through
boards/<leaked-slug>/ instead of the test's own HERMES_HOME).

Add an autouse fixture that snapshots the env var before the test and
restores (or pops) it after, regardless of what the helper did.
2026-05-05 04:37:47 -07:00
0xDevNinja b22b3f506a fix(cli): pin HERMES_KANBAN_BOARD at chat boot to stop subprocess board drift
Without an explicit pin, in-process kanban tools and shelled-out
`hermes kanban …` subprocesses resolve the active board on different
paths: the env var when set, otherwise the global `<root>/kanban/current`
file. When a concurrent session toggles the current-board pointer
mid-turn, the same chat ends up routing tool calls to board A while its
shell calls hit board B, surfacing as phantom "no such task" errors.

Pin the resolved board into env once at `cmd_chat` boot when
HERMES_KANBAN_BOARD isn't already set. Mirrors what the dispatcher does
for spawned workers (kanban_db.py:2622-2623). Idempotent and a no-op
when the env is already pinned by the caller.

Closes #20074
2026-05-05 04:37:47 -07:00
Teknium d472d697cd chore(release): map stevekelly622@gmail.com → @steezkelly 2026-05-05 04:34:45 -07:00
Steve Kelly 8c82d0664d fix(kanban): ignore stale current board pointers 2026-05-05 04:34:45 -07:00
Teknium 2a285d5ec2 fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924)

Per-delta _strip_think_blocks ran at _fire_stream_delta and destroyed
downstream state. When MiniMax-M2.7 / DeepSeek / Qwen3 streamed a tag
split across deltas (delta1='<think>', delta2='Let me check'), the
regex case-2 match erased delta1 entirely, so CLI/gateway state
machines never learned a block was open and leaked delta2 as content.
Raw consumers (ACP, api_server, TTS) had no downstream defense at all.

Replace the per-delta regex with a stateful StreamingThinkScrubber
that survives delta boundaries:
  - Closed <tag>X</tag> pairs always stripped (matches _strip_think_blocks
    case 1).
  - Unterminated open at block boundary enters a block; content
    discarded until close tag arrives.  At end-of-stream, held
    content is dropped.
  - Orphan close tags stripped without boundary gating.
  - Partial tags at delta boundaries held back until resolved.
  - Block-boundary rule (start-of-stream, after \n, or
    whitespace-only since last \n) preserves prose that mentions
    tag names.

Reset at turn start alongside the existing context scrubber; flush at
turn end so a benign '<' held back at end-of-stream reaches the UI.

E2E-verified on live OpenRouter->MiniMax-m2 streams: closed pairs
strip cleanly, first word of post-block content is preserved, pure
content passes through unchanged.  Stefan's screenshot case (#17924)
— 'Let me check' getting chopped to ' me check' — no longer happens.

Final _strip_think_blocks calls on completed strings (final_response,
replay, compression) are preserved; only the streaming per-delta call
site switched to the scrubber.
2026-05-05 04:33:38 -07:00
Chris Danis 28f4d6db63 fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s
MCP servers commonly emit JSON Schema `pattern` (e.g. `\\d{4}-\\d{2}-\\d{2}`
for date-time params) and `format` keywords. llama.cpp's
`json-schema-to-grammar` converter rejects regex escape classes
(\\d/\\w/\\s) and most format values, returning HTTP 400
"parse: error parsing grammar: unknown escape at \\d" — the whole request
fails.

Cloud providers (OpenAI, Anthropic, OpenRouter, Gemini) accept these
keywords fine and use them as prompting hints. Stripping unconditionally
loses useful hints for every cloud user to fix a llama.cpp-only bug.

Approach: classify the llama.cpp grammar-parse 400 in the error
classifier, and on match do a one-shot in-place strip of pattern/format
from `self.tools`, then retry. Follows the existing
`thinking_signature` recovery pattern. Cloud users hit zero overhead;
llama.cpp users pay one failed request per session.

Changes
- agent/error_classifier.py: new `FailoverReason.llama_cpp_grammar_pattern`
  + narrow HTTP-400 branch matching "error parsing grammar",
  "json-schema-to-grammar", or "unable to generate parser ... template".
- tools/schema_sanitizer.py: new `strip_pattern_and_format()` helper —
  reactive, walks schema nodes, skips property names (search_files.pattern
  survives). Returns strip count for logging.
- run_agent.py: new one-shot recovery block in the retry loop. Strips,
  logs, continues. Falls through to normal retry if nothing to strip.
- tests: 4 classifier tests (3 variants + 1 non-400 negative), 7 strip
  tests including the property-name preservation and idempotency checks.

Co-authored-by: Chris Danis <cdanis@gmail.com>
2026-05-05 04:25:18 -07:00
Interstellar-code 542e06c789 fix: include default profile in kanban assignees 2026-05-05 04:25:05 -07:00
Teknium fc4aa66ee4 feat(tips): add 100 new CLI startup tips (#20168)
Expands TIPS corpus from 280 to 380 entries covering untapped
territory across slash commands, CLI flags, env vars, config keys,
and platform features. Every tip verified against real code and
docs.

Batch 1 (50): advanced slash commands (/steer, /goal, /snapshot,
/copy, /redraw, /agents, /footer, /busy, /topic, /approve, /restart,
/kanban, /reload), no-agent cron, gateway hooks, curator, credential
pools, provider routing, TUI/dashboard env vars and themes, checkpoints,
Piper TTS, API server, GATEWAY_PROXY_URL, MATRIX_DEVICE_ID,
TELEGRAM_WEBHOOK_SECRET, batch_runner --resume.

Batch 2 (50): lesser-known slash commands (/new, /clear, /history,
/save, /status, /image, /platforms, /commands, /toolsets, /gquota,
/voice tts, /reload-skills, /indicator, /debug), CLI subcommands
(hermes -z, --pass-session-id, --image, --ignore-user-config,
--source tool, dump --show-keys, sessions rename/delete, import,
fallback, pairing, setup, status --deep), agent behavior env vars
(HERMES_AGENT_TIMEOUT, HERMES_ENABLE_PROJECT_PLUGINS,
HERMES_DISABLE_FILE_STATE_GUARD, HERMES_ALLOW_PRIVATE_URLS,
HERMES_OPTIONAL_SKILLS, HERMES_BUNDLED_SKILLS,
HERMES_DUMP_REQUEST_STDOUT, HERMES_OAUTH_TRACE, HERMES_STREAM_RETRIES),
gateway env vars, image_gen config, auxiliary.session_search,
tirith_fail_open, source tool filtering, API_SERVER_MODEL_NAME,
dashboard plugins.
2026-05-05 04:15:58 -07:00
Brecht-H f25d3ec917 fix(kanban): suppress dispatcher stuck-warn when ready queue holds only non-spawnable assignees
After PR #20105 (dispatcher skips ready tasks whose assignee fails
``profile_exists()`` to prevent the orion-cc/orion-research crash
loop), the gateway and CLI emit a spurious "kanban dispatcher stuck:
ready queue non-empty for N consecutive ticks but 0 workers spawned"
warning every 5 minutes on multi-lane setups where the queue is
steadily full of human-pulled work assigned to terminal lanes.

The warn is intended to catch real failure modes (broken PATH,
missing venv, credential loss for a real Hermes profile). On a
multi-lane host it fires forever even though everything is healthy:
the dispatcher correctly chose not to spawn, and there is nothing
for the operator to fix.

Changes:

* ``DispatchResult`` gains a ``skipped_nonspawnable`` field
  (separate from ``skipped_unassigned``) so callers can distinguish
  "task missing an owner — operator should route it" from "task
  owned by a control-plane lane — terminal will pull it".
* ``dispatch_once`` routes the ``not profile_exists(assignee)`` skip
  into the new bucket (was lumped into ``skipped_unassigned``).
* New helper ``has_spawnable_ready(conn)`` returns True iff at least
  one ready+assigned+unclaimed task in the DB has an assignee that
  maps to a real Hermes profile. Falls back to legacy "any
  ready+assigned" when ``profile_exists`` is unimportable so degraded
  installs still surface the original warn.
* The gateway dispatcher (``gateway/run.py``) and the CLI standalone
  daemon (``hermes_cli/kanban.py``) both swap their cheap
  ``ready_nonempty`` probe to use ``has_spawnable_ready``. Stuck-warn
  now fires only when there is genuine spawnable work the dispatcher
  failed to start.
* CLI dispatch output prints ``Skipped (non-spawnable assignee —
  terminal lane, OK)`` for visibility without alarm.

Tests:

* New ``has_spawnable_ready`` cases (empty queue, terminal-lane
  only, mixed real+terminal).
* New ``test_dispatch_skips_nonspawnable_into_separate_bucket``
  verifies the bucketing change.
* Updated ``test_dispatch_skips_unassigned`` to assert no
  cross-leak.
* Added ``all_assignees_spawnable`` fixture in
  ``tests/hermes_cli/conftest.py`` and threaded it through dispatcher
  tests that use synthetic assignees ("alice", "bob"). PR #20105
  (the parent commit) silently broke 8 such tests by routing those
  assignees into ``skipped_nonspawnable`` instead of spawning; this
  PR repairs them as part of the same code area.

Verified locally: 246/246 kanban-suite tests pass.

Stacks on top of fix/kanban-dispatcher-skip-missing-profile-2026-05-05
(PR #20105). Reviewer: this PR is meant to merge AFTER #20105.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 04:13:12 -07:00
Brecht-H ca5595fe7b fix(kanban): dispatcher skips ready tasks whose assignee is not a real profile
The kanban dispatcher's `_default_spawn` invokes
``hermes -p <task.assignee> chat -q ...``. When ``assignee``
names a control-plane lane (e.g. an interactive Claude Code
terminal like ``orion-cc`` / ``orion-research``) instead of a
real Hermes profile, the subprocess fails on startup with
"Profile 'X' does not exist", gets reaped as a zombie, the
TTL/crash detector marks the task back to ``ready``, and the
next tick re-spawns the same crashing worker. Result: a
permanent crash loop emitting ``spawned=2 crashed=2 every tick``
in the gateway log and burning CPU forever.

Reproduce on a fresh Hermes-agent install:

  # 1. Create a kanban task whose assignee names a non-profile.
  hermes kanban create --assignee orion-cc --status ready \
      --title "Review PR #N" --body "..."
  # 2. Start the gateway with the embedded dispatcher.
  hermes gateway run
  # gateway.log lines every minute:
  #   kanban dispatcher: tick spawned=1 reclaimed=0 crashed=1 ...
  # 3. ps -ef | grep '[h]ermes.*defunct' shows zombies.

Fix
---
``dispatch_once()`` now pre-checks ``hermes_cli.profiles.
profile_exists(assignee)`` before claiming. If False, the row
is added to ``skipped_unassigned`` (it's effectively
"unassigned-to-an-executable-profile") and the dispatcher
moves on without claiming, spawning, or counting a crash.

The check is opt-in safe: if the import fails (e.g. test
isolation, profile module restructured), ``profile_exists``
falls back to ``None`` and the original behaviour is preserved
unchanged.

This addresses the explicit hint in the kanban task body
(``t_2bab06e3``):

  "Should ready-state tasks auto-spawn at all, or only on
  explicit orion-cc claim? If spurious, gate the auto-spawn
  behind a config flag (e.g. only assignee=hermes or
  assignee=auto)."

Profile-existence is a tighter gate than a config flag — it
self-documents (the user already knows whether they have an
``orion-cc`` profile), and it doesn't require Mac to maintain
an allowlist as new lane names appear. New lanes that ARE
real profiles (created via ``hermes profile create``) auto-
qualify the moment the profile dir is created.

Validated live
--------------
On Orion's hermes-agent install, two ``orion-research``-
assigned tasks (Bug A and Bug C investigations) had been
crash-looping since 2026-05-05 06:58 local. After applying
the patch + restarting the gateway:

- Stale ``running`` claims released to ``ready`` cleanly.
- New gateway emitted ``kanban dispatcher: embedded`` and
  has ticked silently for 2+ minutes — no spawned=,
  crashed=, or stuck= log lines (all spawn skips are quiet).
- Tasks remain ``ready`` with ``claim_lock=None``,
  ``worker_pid=None``, ``spawn_failures=0``.
- Dashboard + telegram + freqtrade unaffected.

Confidence: high (live verified on Orion).
Scope-risk: narrow (additive guard inside one function).
Not-tested: behaviour when a profile is renamed mid-tick —
current code re-imports ``profile_exists`` per row so a
freshly created profile auto-qualifies on the next tick.
Machine: orion-terminal

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 04:13:12 -07:00
Teknium 91ce8fc000 fix(setup): offer Keep/Replace/Clear when API key already exists
hermes setup / hermes model used to silently skip the key prompt when
any value was present in .env — even a malformed paste — leaving users
with a stuck '✓' and no way to recover without hand-editing .env.

Replace the silent acknowledgement at all three API-key provider flows
(Kimi, Stepfun, generic) with a single [K]eep / [R]eplace / [C]lear
menu via a shared `_prompt_api_key` helper.

- K / Enter / Ctrl-C / unknown input → keep (never destroys the key)
- R → getpass for new key; empty input cancels and preserves existing
- C → clears the env var, tells user to rerun hermes setup, aborts flow

LM Studio's no-auth-placeholder substitution stays on first-time entry
only; on Replace an empty input means 'cancel', not 'overwrite with
dummy key'.

11 unit tests cover all branches incl. garbage-input-keeps-key, Ctrl-C
at the choice prompt, Replace-cancel preserving the old key, Clear
wiping only the target env var, and lmstudio placeholder semantics.

Fixes #16394
Reshapes #18355 — original PR pasted the menu inline at 3 sites with
no tests; this consolidates to one helper (+88/-66) with coverage.

Co-authored-by: Feranmi10 <89228157+Feranmi10@users.noreply.github.com>
2026-05-05 04:08:11 -07:00
simbam99 8ad5e98f8d fix(gateway): preserve pending update prompts across restarts 2026-05-05 03:59:39 -07:00
Teknium 2785355750 chore(release): map bjianhang@gmail.com → @bjianhang 2026-05-05 03:59:00 -07:00
baojianhang c3112adac5 fix(tui): improve clipboard copy fallbacks 2026-05-05 03:59:00 -07:00
Siddharth Balyan 13a7cbcd64 fix(nix): refresh stale tui npmDepsHash + fix cache-blind detection (#20144)
The fix-lockfiles script used 'nix build .#tui.npmDeps' to detect stale
hashes. This always succeeds when the OLD derivation is cached in Cachix
or cache.nixos.org — even when the source package-lock.json has changed.

Fix: use prefetch-npm-deps to compute the hash directly from the lockfile
and compare against what's in the nix file. Falls back to nix build only
if prefetch-npm-deps fails.
2026-05-05 15:32:20 +05:30
teknium1 601e5f1d57 fix(teams): log reply() fallback for diagnostics
The previous bare except swallowed every exception from app.reply()
silently. Log at debug so real failures (auth, chat gone) leave a
trace while keeping the group-chat 400 fallback working. Also fix
the Teams entry's indentation in the messaging flowchart.
2026-05-04 20:59:18 -07:00
Aamir Jawaid 2333b7a7ec fix(tests): patch TypingActivityInput after mock on Python <3.12
The SDK requires Python >=3.12 so CI (3.11) falls to the except
ImportError branch, leaving TypingActivityInput=None. After loading
the adapter module, explicitly restore it from the mock so
test_send_typing doesn't silently no-op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 3f023450dd fix(teams): fall back to flat send when threading returns 400
Group chats return 400 for threaded sends. Catch the error and
fall back to a flat send so messages always get delivered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 69aeba0df7 feat(teams): implement threading via app.reply()
Wire reply_to into send() using App.reply(conv_id, msg_id, content)
which constructs the threaded conversation ID internally.
Threads supported in channels and group chats.

Update comparison table: Threads 

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 10f89d7b72 docs(teams): add Teams to messaging/index.md
- Add to platform description and intro paragraph
- Add row to platform comparison table (images + typing)
- Add node to architecture mermaid diagram
- Add TEAMS_ALLOWED_USERS to security examples
- Add to platform-specific toolsets table
- Add to Next Steps links

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 93869b48ab docs: add Microsoft Teams to platform lists across docs
Update all platform enumeration lists to include Teams:
index.md, quickstart.md, integrations/index.md, sessions.md,
slash-commands.md, updating.md, hooks.md, hermes-agent skill.

Skipped PII redaction docs — Teams uses AAD object IDs, not
phone numbers, so redaction doesn't apply there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid ef94aa201f docs(teams): add Teams to sidebar
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Teknium c77a6e3faa chore(security): add OSV-Scanner CI + Dependabot for github-actions only (#20037)
Adds two supply-chain controls that complement our existing pinning
strategy (full-SHA action pins, exact-version source dep pins via
uv.lock / package-lock.json) without undermining it.

.github/workflows/osv-scanner.yml
  Detection-only scan of uv.lock and the ui-tui/website package-locks
  against the OSV vulnerability database. Runs on PRs that touch
  lockfiles, on push to main, and weekly against main so CVEs
  published after merge still surface. Uses Google's officially-
  recommended reusable workflow pinned by full SHA (v2.3.5).
  Findings upload to the Security tab; fail-on-vuln is disabled so
  pre-existing vulns in pinned deps do not block merges — we move
  pins deliberately, not under CI pressure.

.github/dependabot.yml
  Scoped to github-actions only. Action pins must be moved when
  upstream publishes patches (often themselves security fixes);
  Dependabot opens a PR with the new SHA + release notes for normal
  review. Source-dependency ecosystems (pip, npm) are deliberately
  NOT enabled — automatic version-bump PRs against uv.lock /
  package-lock.json would fight our pinning strategy. CVE-driven
  security updates for source deps are enabled separately via the
  repo's Dependabot security updates setting (GitHub UI), which
  fires only when a pinned version becomes known-vulnerable.
2026-05-04 20:58:21 -07:00
Stephen Schoettler 1d938832a7 test(kanban): patch dashboard websocket token stub 2026-05-04 20:50:24 -07:00
Stephen Schoettler f7918c9349 test(teams): mock ClientOptions in adapter tests 2026-05-04 20:50:24 -07:00
Teknium a1bed18194 docs: clarify that the Docker terminal backend is a single persistent container (#20003)
The docs were ambiguous about whether the Docker terminal backend spins up
a fresh container per command or reuses a long-lived one. It's the latter
— Hermes starts one container on first use and routes every terminal,
file, and execute_code call through docker exec into that same container
for the life of the process (across /new, /reset, and delegate_task
subagents). Working-directory changes, installed packages, and files in
/workspace persist from one tool call to the next, like a local shell.

- configuration.md: lead the Docker Backend section with the persistence
  model before the YAML example; sharpen the Backend Overview table row.
- features/tools.md: expand the Docker Backend block (previously just a
  2-line YAML stub) with a clear statement of the persistent-container
  semantics and a pointer to the full lifecycle section.
- docker.md: tighten the 'Docker as a terminal backend' bullet and the
  'Skills and credential files' paragraph to call out the single-container
  model explicitly.
2026-05-04 20:09:31 -07:00
Jeffrey Quesnelle d12f59aa53 Merge pull request #19866 from NousResearch/fix/clarify-placeholder-credential
clarify placeholder telegram credential in tests
2026-05-04 22:24:52 -04:00
helix4u b816fd4e26 fix(tui): complete absolute paths as paths 2026-05-04 16:14:40 -07:00
helix4u b632290166 fix(gateway): handle planned service stops 2026-05-04 16:00:49 -07:00
brooklyn! 20428f5e60 fix(tui): respect voice.record_key config (supersedes #19028, #19339) (#19835)
* fix(tui): respect voice.record_key config instead of hardcoded Ctrl+B

Classic CLI loaded ``voice.record_key`` from config.yaml and bound the
prompt-toolkit handler dynamically (``cli.py`` paths). The new TUI hard-
coded ``Ctrl+B`` everywhere — ``isVoiceToggleKey`` (input handler),
``/voice status`` ("Record key: Ctrl+B"), and ``/voice on`` ("Ctrl+B to
start/stop recording"). A user who set ``voice.record_key: ctrl+o``
(or any other key) saw the documented config silently ignored — only
Ctrl+B worked, the displayed shortcut lied about it.

Wire the configured key end to end through the existing channels:

* **Backend** (``tui_gateway/server.py``): ``voice.toggle`` action=status
  AND action=on/off responses now include ``record_key``, sourced from
  ``config.get('voice', {}).get('record_key', 'ctrl+b')``.
* **Backend types** (``ui-tui/src/gatewayTypes.ts``): ``ConfigFullResponse``
  now exposes ``config.voice.record_key`` and ``VoiceToggleResponse``
  carries ``record_key`` so the TUI can both bind and display it.
* **Frontend parser/formatter** (``ui-tui/src/lib/platform.ts``):
  ``parseVoiceRecordKey()`` accepts ``ctrl+b`` / ``alt+r`` / ``cmd+space``
  and the common aliases (``option``, ``cmd``, ``win``, …); falls back to
  the documented Ctrl+B for empty / multi-character / malformed input so
  a typo never silently disables the shortcut. ``formatVoiceRecordKey()``
  renders for status text. ``isVoiceToggleKey`` now takes a parsed
  ``ParsedVoiceRecordKey`` argument; the hardcoded ``ch === 'b'`` is
  gone. Default arg keeps existing call sites back-compat.
* **Hydration** (``ui-tui/src/app/useConfigSync.ts``,
  ``useMainApp.ts``): startup ``config.get full`` already runs; extract
  ``cfg.voice.record_key`` from it, parse, push into a new
  ``voiceRecordKey`` state, and forward to the input handler ctx
  (``InputHandlerContext.voice.recordKey``). Mtime-poll path also
  re-applies the parsed key so a hand-edit of config.yaml takes effect
  the next tick — matches existing behaviour for display options.
* **Input handler** (``ui-tui/src/app/useInputHandlers.ts``):
  ``isVoiceToggleKey(key, ch, voice.recordKey)`` so the configured
  binding fires.
* **Slash command** (``ui-tui/src/app/slash/commands/session.ts``):
  ``/voice status`` and ``/voice on`` use ``formatVoiceRecordKey`` on
  the response's ``record_key`` instead of the hardcoded label.

Tests:
* ``parseVoiceRecordKey`` covers ctrl/alt/cmd/super aliases, multi-char
  rejection, and empty fallback.
* ``formatVoiceRecordKey`` covers the doc examples (``Ctrl+B``,
  ``Ctrl+O``, ``Alt+R``, ``Cmd+B``).
* ``isVoiceToggleKey`` regression: ``ctrl+o`` configured → only ``o``
  matches, not ``b``; ``alt+r`` matches both alt-bit and meta-bit
  encodings (terminal protocol parity); omitted-arg call still binds
  Ctrl+B for back-compat.

Full TUI suite (555 tests) passes; ``tsc --noEmit`` clean.

Fixes #18994

Co-authored-by: asheriif <ahmedsherif95@gmail.com>

* fix(tui): support named-key tokens in voice.record_key (space, enter, …)

Reviewer caught that the round-1 parser in #18994 rejected every
multi-character token, so a config value like ``ctrl+space`` (which the
CLI happily binds via prompt_toolkit's ``c-space`` rewrite in
``cli.py``) silently fell back to the documented Ctrl+B default —
re-introducing the same false-shortcut bug the PR was meant to fix,
just at a different surface.

Add explicit named-key support that mirrors what the CLI accepts:

* ``space``         (alias: ``spc``)        → matches ``ch === ' '``
* ``enter``         (alias: ``return``, ``ret``) → matches ``key.return``
* ``tab``                                   → matches ``key.tab``
* ``escape``        (alias: ``esc``)        → matches ``key.escape``
* ``backspace``     (alias: ``bs``)         → matches ``key.backspace``
* ``delete``        (alias: ``del``)        → matches ``key.delete``

``ParsedVoiceRecordKey`` gains an optional ``named`` field; ``ch``
holds either a single char (back-compat) or the canonical named token,
and the runtime matcher dispatches on ``named`` before checking the
modifier shape. Aliases collapse to one canonical name so
``ctrl+esc`` and ``ctrl+escape`` behave identically.

Unrecognised multi-character tokens (e.g. ``ctrl+spcae`` typo, or
unsupported keys like ``ctrl+f5``) still fall back to the Ctrl+B
default rather than silently disabling the binding — keeps the "typo
never silently kills the shortcut" guarantee.

Tests:

* ``parseVoiceRecordKey`` parametrised over every named token + each
  alias variant.
* New ``isVoiceToggleKey`` cases for space (ch-based match), enter
  (``key.return``), tab, escape, backspace, delete, including
  modifier-mismatch negatives.
* ``formatVoiceRecordKey`` renders named keys in title case
  (``Ctrl+Space``, ``Ctrl+Enter``).
* Existing fall-back-to-Ctrl+B contract preserved for empty input
  AND unrecognised multi-char tokens.

Full TUI suite: 559/559 pass; ``tsc --noEmit`` clean.

Refs #18994 (round-1 review feedback)

Co-authored-by: asheriif <ahmedsherif95@gmail.com>

* test(tui): assert voice.toggle returns configured record_key

Salvage the backend regression from #19339 — asserts ``voice.toggle``
action=on AND action=status responses carry the configured
``voice.record_key`` end-to-end through ``_load_cfg()``. Keeps the
CLI→TUI parity contract visible in the Python test suite alongside
the existing frontend parser/matcher/formatter coverage from #19028.

* fix(tui): address Copilot review on #19835 voice.record_key wiring

Five tightenings on the parser + matcher + hydration surface, all
caught by the Copilot review on the PR — each one turns a silent
false-fire or display/binding skew into a deterministic behaviour.

* **isVoiceToggleKey ctrl branch was too permissive for named keys.**
  The doc-default macOS Cmd+B muscle-memory fallback
  (``isActionMod(key)`` on top of ``key.ctrl``) fired for every
  configured key, so bare Esc — which hermes-ink reports with
  ``key.meta`` on some macOS terminals — triggered ``ctrl+escape``,
  and Alt+Space / Alt+Tab triggered ``ctrl+space`` / ``ctrl+tab``.
  Gate the fallback to the literal ``ctrl+b`` binding so any custom
  chord requires the real Ctrl bit.
* **Alt branch guarded against Ctrl/Cmd co-press.** Without this,
  Ctrl+Alt+<letter> and Cmd+Alt+<letter> also fired ``alt+<letter>``.
* **Dropped the ``meta`` modifier variant and its alias.** In
  hermes-ink ``key.meta`` is Alt on xterm-style terminals and Cmd on
  legacy macOS ones, so a literal ``meta+b`` config displayed as
  ``Cmd+B`` while matching Alt+B — exactly the kind of false
  shortcut the PR was meant to remove. ``cmd`` / ``command`` now
  collapse onto ``super`` (kitty-style ``key.super``, with a macOS
  ``key.meta`` fallback) and render as ``Cmd+B``. Unknown modifier
  tokens fall back to the documented Ctrl+B default rather than
  silently coercing to Ctrl.
* **Slash-command display/binding skew.** ``/voice status`` and
  ``/voice on`` rendered from the fresh gateway ``record_key``
  response, but ``useInputHandlers()`` still bound the old key
  until the next 5s mtime poll. Thread ``setVoiceRecordKey``
  through ``SlashHandlerContext.voice`` and push the parsed spec
  into frontend state on every response so text and binding stay
  consistent.
* **Test coverage for the two paths Copilot flagged.** Added
  vitest coverage for (a) the three-case ``/voice`` slash output
  in ``createSlashHandler.test.ts`` and (b) the
  ``applyDisplay → voice.record_key`` hydration + omit-setter
  back-compat paths in ``useConfigSync.test.ts``. Plus regression
  cases for every false-fire scenario above.

Suite: 575/575 green, tsc --noEmit clean.

* fix(tui): address Copilot round-2 review on #19835

Three tightenings on the surface introduced in the round-1 fix:

* **``/voice tts`` reset custom bindings to Ctrl+B.** The ``tts`` branch
  of ``voice.toggle`` omitted ``record_key`` from its response, so the
  frontend's ``r.record_key ?? 'ctrl+b'`` coerced a user's custom
  binding back to the default on every TTS toggle. Two-sided fix:
  the backend now includes ``record_key`` on the ``tts`` branch (parity
  with ``status``/``on``/``off``), and the slash handler only pushes
  frontend state when the response actually carries ``record_key`` —
  belt-and-suspenders against any future branch forgetting to include
  it.

* **``super+b`` / ``win+b`` / ``cmd+b`` displayed "Cmd+B" on Linux and
  Windows.** ``formatVoiceRecordKey`` rendered ``mod === 'super'`` as
  ``Cmd`` universally, which told non-mac users the wrong modifier to
  press even though ``isVoiceToggleKey`` matched the right event bits.
  Gate the label to ``isMac`` so non-mac renders ``Super+B``.

* **``control+b`` / ``ctrl + b`` lost the macOS Cmd+B fallback.**
  ``_isDefaultVoiceKey`` keyed off ``parsed.raw`` — so
  semantically-equal aliases of the documented default dropped into
  the strict branch even though they bind Ctrl+B. Compare on the
  parsed spec (mod + ch + named) instead.

Coverage added: Linux ``Super+B`` rendering (and macOS ``Cmd+B``),
``control+b`` / ``ctrl + b`` accepting the Cmd+B fallback on darwin,
``/voice tts`` without ``record_key`` not clobbering cached binding,
and a backend regression asserting every ``voice.toggle`` branch
carries the configured key.

Suite: 579/579 TUI vitest green, 2/2 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-3 review on #19835

Three classes of robustness issue caught on the second pass — all
revolve around malformed YAML tipping ``parseVoiceRecordKey`` or
``_voice_record_key`` into a crash instead of the documented
fallback.

* **Parser crashed on non-string YAML scalars.** ``config.get full``
  returns raw ``yaml.safe_load`` output, so ``voice.record_key: 1``
  or ``voice.record_key: true`` in a hand-edited config would hit
  ``.trim()`` on a number/bool and throw, breaking startup and
  every mtime re-apply. Accept ``unknown`` at the signature, guard
  with ``typeof raw !== 'string'``, and fall back to the default.

* **Backend blew up on non-dict ``voice:``.** Same YAML hazard on
  the gateway side: ``voice: true`` / ``voice: cmd+b`` left
  ``_load_cfg().get("voice")`` as a bool/str, so ``.get("record_key")``
  raised AttributeError and took every ``voice.toggle`` branch down
  with it. Centralised the lookup in a single
  ``_voice_record_key()`` helper that ``isinstance``-guards both
  ``voice`` and ``record_key`` and falls back to ``ctrl+b``.

* **Multi-modifier chords silently dropped extras.** The previous
  validator only checked the first modifier token, so ``ctrl+alt+r``
  silently parsed as ``ctrl+r`` and ``cmd+ctrl+b`` as ``super+b`` —
  a typo bound a different shortcut than the user configured.
  Reject multi-modifier spellings outright; the classic CLI only
  supports single-modifier bindings via prompt_toolkit's ``c-x`` /
  ``a-x`` rewrite, so this matches CLI parity.

Coverage added:

* ``parseVoiceRecordKey`` fallback on ``1`` / ``true`` / ``null`` /
  ``undefined`` / ``{}``.
* ``parseVoiceRecordKey`` fallback on ``ctrl+alt+r`` /
  ``cmd+ctrl+b`` / ``alt+ctrl+space``.
* ``test_voice_toggle_handles_non_dict_voice_cfg`` exercises
  every non-dict ``voice:`` shape (bool, str, None, int, list) and
  asserts each falls back to ``record_key: 'ctrl+b'``.

Suite: 581/581 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-4 review on #19835

Four final corners of the voice.record_key surface:

* **Bare-char configs silently coerced to ``ctrl+<key>``.** A config
  like ``voice.record_key: o`` / ``space`` / ``escape`` fell through
  to the default ``mod = 'ctrl'`` and silently bound Ctrl+O, while
  the classic CLI's prompt_toolkit would bind the raw key (no
  rewrite) — so the two runtimes silently disagreed on what "o"
  means. Require an explicit modifier; bare-char configs fall back
  to the documented Ctrl+B default.

* **Reserved ctrl+<letter> bindings would never fire.**
  ``useInputHandlers()`` intercepts ``ctrl+c`` (interrupt),
  ``ctrl+d`` (quit), and ``ctrl+l`` (clear screen) before the voice
  check runs, so those configs would be advertised in /voice
  status but the advertised shortcut never actually triggers
  push-to-talk. Added ``_RESERVED_CTRL_CHARS`` at parse time so
  the user gets the documented default instead of a dead shortcut.
  (``alt+c``, ``cmd+l``, etc. are not intercepted and stay usable.)

* **``_load_cfg()`` root itself may be a non-dict.**
  ``_voice_record_key()`` isinstance-guarded the ``voice`` subkey
  but not the root — a malformed config.yaml that collapsed to a
  scalar/list at the top level (``config.yaml: true`` or ``[]``)
  would still raise on ``.get("voice")``. Added the top-level
  guard too so every malformed shape falls back to ``ctrl+b``.

* **Stale header comment on ``isVoiceToggleKey``.** The doc-comment
  still claimed "On macOS we additionally accept the platform
  action modifier (Cmd) for the configured letter" even though the
  implementation gates the Cmd fallback to the documented default
  only. Rewrote to match.

Coverage added:

* ``parseVoiceRecordKey`` fallback on bare chars (``o``, ``b``,
  ``space``, ``escape``).
* ``parseVoiceRecordKey`` fallback on ``ctrl+c`` / ``ctrl+d`` /
  ``ctrl+l``; positive case for ``alt+c`` / ``cmd+l`` still usable.
* Backend ``test_voice_toggle_handles_non_dict_voice_cfg`` now
  exercises 5 non-dict shapes at the YAML root too.

Suite: 583/583 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-5 review on #19835

Three follow-ups on the voice matcher's modifier + shift discipline:

* **``super`` branch falsely fired on Alt+<key> / bare Esc on macOS.**
  ``isVoiceToggleKey`` accepted ``isMac && key.meta`` as a Cmd
  fallback for the ``super`` modifier — but hermes-ink sets
  ``key.meta`` for plain Alt/Option AND for bare Escape on some
  macOS terminals. A ``cmd+b`` config silently fired on Alt+B;
  ``cmd+space`` on Alt+Space; ``cmd+escape`` on bare Esc. Drop the
  fallback and require the literal ``key.super`` bit. Legacy-
  terminal users who need Cmd should upgrade to a kitty-protocol
  terminal or bind ``alt+X`` explicitly.

* **Shift bit was never checked.** The parser rejects multi-
  modifier configs like ``ctrl+shift+tab``, but the runtime
  matcher didn't check ``key.shift`` — so ``ctrl+tab`` also fired
  on Ctrl+Shift+Tab and ``alt+enter`` on Alt+Shift+Enter.
  Early-return on ``key.shift === true`` so the runtime only fires
  the exact chord the user configured.

* **Test leaked ``HERMES_VOICE=1`` into later tests.**
  ``voice.toggle`` action=on writes to ``os.environ`` directly
  (CLI parity, runtime-only flag); ``test_voice_toggle_returns_
  configured_record_key`` dispatched action=on without letting
  monkeypatch take ownership of the var first. Any later test
  that read voice mode in the same Python process could inherit a
  stale enabled state. Added ``monkeypatch.setenv("HERMES_VOICE",
  "0")`` up front so monkeypatch restores the original value at
  teardown.

Coverage added:

* ``cmd+b`` / ``cmd+space`` / ``cmd+escape`` do NOT fire on
  ``key.meta``-only events on darwin.
* ``ctrl+tab`` / ``alt+enter`` / ``ctrl+o`` reject matches when
  ``key.shift`` is held; sanity cases without Shift still fire.

Suite: 585/585 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-6 review on #19835

Three classes of modifier-discipline tightening + one config-surface
honesty fix:

* **Default ``ctrl+b`` Cmd fallback leaked Alt+B.** The default's
  macOS Cmd+B muscle-memory path used ``isActionMod(key)``, which
  returns ``key.meta || key.super`` on darwin. hermes-ink also
  reports plain Alt as ``key.meta``, so Alt+B silently fired the
  default binding. Replaced with strict ``isMac && key.super ===
  true`` — kitty-style Cmd+B still works, Alt+B correctly
  rejected. Legacy-terminal mac users (Terminal.app without
  CSI-u) now get raw Ctrl+B only; the documented default still
  works everywhere.

* **ctrl / super branches accepted extra modifier bits.** The
  parser rejects multi-modifier configs like ``ctrl+alt+o``, but
  the runtime matcher was permissive — ``ctrl+o`` fired on
  Ctrl+Alt+O / Ctrl+Cmd+O, and ``super+b`` fired on Cmd+Alt+B /
  Ctrl+Cmd+B. Added strict ``!key.alt && !key.meta && key.super
  !== true`` on ctrl, and ``!key.ctrl && !key.alt && !key.meta``
  on super, so the runtime only fires the exact chord the parser
  would let you configure.

* **Dropped ``cmd`` / ``command`` aliases.** They parsed to
  ``super`` and rendered as ``Cmd+X``, but legacy macOS terminals
  report Cmd as ``key.meta`` (same signal as Alt), so a
  ``cmd+o`` config was advertised as working but never actually
  fired on Terminal.app-without-CSI-u. That recreated the
  "displayed shortcut does not work" problem this PR was meant to
  remove. Users who want the platform action modifier spell it
  ``super`` / ``win`` — that matches the unambiguous ``key.super``
  bit, and kitty-style macOS terminals render it as ``Cmd+X`` via
  platform-aware formatter.

Coverage updated:

* Default ctrl+b no longer fires on Alt+B via ``key.meta`` leak;
  raw Ctrl+B and kitty-style Cmd+B still fire.
* ``ctrl+o`` rejects Ctrl+Alt+O / Ctrl+Cmd+O / Ctrl+Meta+O chords.
* ``super+b`` rejects Cmd+Alt+B / Cmd+Meta+B / Ctrl+Cmd+B chords.
* ``cmd+b`` / ``command+b`` / ``meta+b`` all fall back to the
  documented default at parse time (joined the ambiguous-mac-mod
  rejection class).
* Round-2 expectations that asserted ``cmd+b`` parsed as super
  and accepted ``key.meta`` on darwin updated to reflect the new
  stricter contract.

Suite: 588/588 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot follow-up on wire typing + escape precedence

Two follow-ups from the latest Copilot pass:

* **Config wire typing honesty (`gatewayTypes.ts`)**
  `config.get full` forwards raw `yaml.safe_load()` output, so
  `voice.record_key` can be any scalar/container when hand-edited.
  Typing it as `string` suggests a normalized contract that the
  backend does not guarantee and makes unsafe callers more likely.
  Change `ConfigVoiceConfig.record_key` to `unknown` with an
  explicit comment that callers must normalize at runtime.

* **Escape-based voice bindings were swallowed before voice check**
  `useInputHandlers()` handled `key.escape` for queue-edit cancel and
  selection clear before `isVoiceToggleKey(...)`, so configured
  `ctrl+escape` / `alt+escape` / `super+escape` chords were advertised
  but never toggled recording in those UI states.
  Add an early escape+voice check before generic Esc handlers so
  escape-based voice bindings win when configured, while plain Esc
  behavior remains unchanged.

Also updated PR #19835 description text to remove stale cmd/command
alias claims and match the current parser contract.

* fix(tui): pass configured voice shortcut through TextInput layer

Thread the live parsed voiceRecordKey into TextInput so configured voice.record_key chords bubble to useInputHandlers instead of being consumed as editor input. This removes the last hardcoded Ctrl+B pass-through in the composer path while preserving existing global control chord behavior.

* fix(tui): require explicit alt bit for escape-based alt chords

Hermes-ink reports bare Escape as meta=true+escape=true on some terminals, so a configured alt+escape binding was firing on bare Esc. Require an explicit key.alt bit when the configured named key is escape so plain Esc stays plain Esc; kitty-style alt+escape still fires.

* fix(tui): harden voice.record + TextInput paste + super-mod reserved list

Three round-7 Copilot follow-ups on #19835:

- voice.record start handler used _load_cfg().get('voice', {}).get(...) without
  shape checks, so malformed YAML (bool/scalar/list) returned 5025 instead of
  using VAD defaults. Centralized _voice_cfg_dict() helper and type-guarded
  silence_threshold/silence_duration with numeric fallbacks.
- TextInput pass-through check moved above paste/copy handling so configured
  voice chords (ctrl+v / alt+v / cmd+v) beat the composer's paste/copy
  defaults.
- parser now also rejects super+{c,d,l,v} — on macOS those are
  copy/exit/clear/paste and would be advertised in /voice status but never
  actually toggle recording.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix(tui): round-8 Copilot review — allow ctrl+x, gate super reservations to macOS, preserve voice key on transient RPC failure

Three round-8 Copilot follow-ups on #19835:

- Revert ctrl+x addition to _RESERVED_CTRL_CHARS (landed via Copilot Autofix
  commit 731ec86): ctrl+x is only claimed during queue-edit
  (queueEditIdx !== null), so voice works the rest of the session and
  matches CLI ctrl+<letter> parity.
- Gate super+{c,d,l,v} reservation to isMac. Linux/Windows TUI globals key
  off Ctrl, so kitty/CSI-u super+<letter> configs don't collide on non-mac
  and should stay usable.
- applyDisplay() now skips setVoiceRecordKey when cfg is null so one
  transient quietRpc() failure after a config edit doesn't clobber the
  cached binding back to Ctrl+B until the next successful poll.

New coverage:
- parseVoiceRecordKey preserves ctrl+x on linux
- super+{c,d,l,v} rejected on darwin, allowed on linux
- applyDisplay(null, ...) leaves voiceRecordKey untouched

* fix(cli,tui): normalize voice.record_key aliases across CLI + TUI for parity

Round-9 Copilot review on #19835: TUI accepted control+/option+/opt+/super+/win+ aliases but the classic CLI only rewrote literal ctrl+/alt+ before handing to prompt_toolkit, so a TUI-valid config silently bound a different (or no) shortcut in the CLI.

- Added normalize_voice_record_key_for_prompt_toolkit() in hermes_cli/voice.py with a single alias table (ctrl/control/alt/option/opt → c-/a-).
- Wired it into all three cli.py sites (_enable_voice_mode hint, _show_voice_status display, and the prompt_toolkit binding in _register_voice_handler).
- /voice status display now renders control+x as Ctrl+X and option+x as Alt+X (canonical casing) to match TUI formatVoiceRecordKey.
- super/win/windows are intentionally left unchanged: prompt_toolkit has no super modifier, so the CLI will reject them loudly at startup rather than silently binding Ctrl+B. Documented this split at both the TUI _MOD_ALIASES comment and the CLI normalizer docstring.
- Added tests covering ctrl/control/alt/option/opt mapping, case-insensitivity, non-string fallback, empty-string fallback, and super/win pass-through.

* fix(cli): port TUI parser contract into CLI voice.record_key normalizer

Round-10 Copilot review on #19835.

hermes_cli/voice.py's normalize_voice_record_key_for_prompt_toolkit() previously did blind substring replacement with no trim/validate step, so the CLI diverged from the TUI parser on:
- whitespace ('ctrl + b' -> 'c- b' instead of 'c-b')
- typoed named keys ('ctrl+spcae' passed through as 'c-spcae' and prompt_toolkit would reject at startup)
- bare-char configs ('o' should fall back, not pass through as 'o')
- multi-modifier chords ('ctrl+alt+r')
- reserved ctrl chars ('ctrl+c/d/l')
- unknown modifiers ('meta+b' / 'shift+b')
- named-key aliases ('return'/'esc'/'bs'/'del' not collapsed to prompt_toolkit canonicals)

Port the TUI parser contract into Python (_VOICE_MOD_ALIASES, _VOICE_NAMED_KEYS, _VOICE_RESERVED_CTRL_CHARS) so one config value binds the same shortcut in both runtimes.

Also added format_voice_record_key_for_status() shared between the PTT hint and /voice status display. Non-string scalars (voice.record_key: true / 1) now surface as 'Ctrl+B' instead of the raw scalar — /voice status no longer advertises a shortcut that can never bind.

Tests: 29/29 in test_voice_wrapper.py, including 11 new regressions covering whitespace, named-key aliases, typos, bare-char, multi-modifier, reserved ctrl, unknown mods, non-string fallback, and formatter contract.

* fix(cli): shape-safe voice config read + graceful super/win fallback

Round-11 Copilot review on #19835.

Two remaining cross-runtime gaps:

1. load_config().get('voice', {}) still assumed voice was a dict, so a hand-edited voice: true / voice: cmd+b at the top level raised AttributeError before the voice UI could start. Added voice_record_key_from_config(cfg) to hermes_cli/voice.py that isinstance-guards both the root and the voice subkey. All three cli.py read sites (_enable_voice_mode hint, _show_voice_status, PTT binding) now use it.

2. The CLI normalizer previously passed super+/win+/windows+ through unrewritten so prompt_toolkit would reject them loudly at startup — but that crash was a worse UX than a silent fallback. Normalizer now returns c-b for those spellings, and the PTT binding site logs a warning so users see why their TUI-only shortcut isn't binding in the CLI.

Coverage: 34/34 in tests/hermes_cli/test_voice_wrapper.py (5 new cases for voice_record_key_from_config + malformed-root + malformed-voice + extractor/normalizer composition).

* fix(cli): self-audit cleanup — remaining voice-config shape safety + doc drift

Self-review of the voice.record_key change set turned up four remaining items Copilot would very likely flag next round:

1. cli.py _voice_start_continuous still read load_config().get('voice', {}).get('silence_threshold') without an isinstance guard, so a hand-edited voice: true / voice: cmd+b (non-dict) raised AttributeError on VAD recording start. Shape-safe coerce the voice dict and numeric-guard silence_threshold/silence_duration.

2. cli.py _enable_voice_mode's auto_tts check had the same bug — fixed with the same isinstance guard.

3. hermes_cli/voice.py module comment on _VOICE_MOD_ALIASES still said super/win/windows 'pass through unchanged and prompt_toolkit's add() call loudly rejects them at startup'. Round 11 changed the normalizer to silently fall back to c-b with a warning at the binding site; updated the comment to match.

4. ui-tui/src/lib/platform.ts header comment had the same stale 'CLI will loudly reject them at startup' claim; updated to 'falls back to the documented default and logs a warning'.

No behavior change on the code paths already covered by test_voice_wrapper.py; the two cli.py fixes are defensive against malformed YAML that previous rounds already hardened in tui_gateway/server.py but missed in the classic CLI.

* fix(cli,tui): round-12 Copilot review — alt-collide on mac, bool-in-int guards, voice UI hardcodes, mtime-reload test

Five round-12 Copilot review items on #19835:

1. platform.ts: hermes-ink reports Alt as key.meta on many terminals; isActionMod on darwin accepts key.meta as the action modifier. So alt+c/d/l get claimed by isCopyShortcut / isAction('d')/'l') before the voice check. Reject those configs at parse time on macOS only (non-mac keeps them usable).

2. cli.py: four remaining hardcoded 'Ctrl+B' sites in voice-facing UI (_get_voice_status_fragments status bar, _voice_start_recording hints, _get_placeholder composer text) were still lying about non-default configs. Added self._voice_record_key_label() shared helper and wired it into all three sites.

3. server.py + cli.py: bool is a subclass of int, so isinstance(silence_threshold, (int, float)) accepted True/False from malformed YAML and forwarded 1/0 to the VAD engine. Exclude bool explicitly so boolean typos fall back to the documented 200 / 3.0 defaults.

4. useConfigSync.ts: extracted the config.get-full fetch+apply body into a shared hydrateFullConfig() helper. Both the initial hydration and mtime-reload paths now use it, so the polling/RPC wiring is exercised by direct unit tests (4 new cases: fresh apply, reapply on new value, transient RPC failure preserves cache, back-compat without voice setter).

5. Added alt+{c,d,l} rejection regressions on darwin + allow on linux, and bool-leak regressions for both silence_threshold and silence_duration in tests/test_tui_gateway_server.py.

Suite: 602/602 TUI vitest, 38/38 backend voice tests, typecheck + lints clean.

* fix(cli): cache voice record-key label at binding time + status-bar coverage

Round-13 Copilot review on #19835.

_voice_record_key_label() was reading live config on every render, which caused two problems:

1. prompt_toolkit registers the push-to-talk binding once at session start (@kb.add(_voice_key)); the binding does NOT re-read config. Editing voice.record_key mid-session would switch the status-bar / placeholder / recording-hint label to the new shortcut while the actual keybinding stayed on the startup chord — reintroducing the display/binding drift this whole PR is fighting.

2. Hot render path: during recording the UI is invalidated every 150ms, so re-loading + deep-merging config on every call added avoidable UI overhead.

Fix: cache the label at the same site that registers the prompt_toolkit binding via new set_voice_record_key_cache(raw_key). _voice_record_key_label() now just returns the cached value (falls back to 'Ctrl+B' before startup). Status/placeholder/hint are always in sync with the live binding; no config reload per render.

Also added 4 regression cases to tests/cli/test_cli_status_bar.py: configured ctrl+<letter> renders in both wide and compact status bars, configured named key (ctrl+space) renders in the recording hint, pre-startup absent cache falls back to Ctrl+B, and malformed configs (bool True) fall through the formatter to Ctrl+B.

Suite: 60/60 test_cli_status_bar + test_voice_wrapper, typecheck + lints clean.

* fix(cli): route /voice on + /voice status through startup-pinned label; mac alt+cdl parity

Round-14 Copilot review on #19835. All three comments legit:

1. _enable_voice_mode still formatted label from live load_config() — mid-session config edit would make /voice on announce the new shortcut while the prompt_toolkit binding stayed the startup chord. Use self._voice_record_key_label() (cached at binding time, round-13) so /voice on cannot drift from the live binding.

2. _show_voice_status had the same bug — /voice status reported live config instead of the pinned startup binding. Fixed the same way.

3. CLI normalizer accepted alt+c/alt+d/alt+l even though the TUI parser rejects them on macOS (Copilot round-12 — hermes-ink reports Alt as key.meta, isActionMod on darwin accepts it, collides with isCopyShortcut / isAction). Added _VOICE_RESERVED_ALT_CHARS_MAC = {c,d,l} gated to sys.platform == 'darwin' so a shared config like option+c falls back to c-b on both runtimes on macOS; non-mac still binds a-c.

Coverage: 4 new tests in test_voice_wrapper.py covering mac alt+cdl rejection, linux alt+cdl allowed, option/opt alias forms, and mac-specific exclusions for other alt letters. 62/62 in voice wrapper + status bar suites.

---------

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
Co-authored-by: asheriif <ahmedsherif95@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-04 15:49:28 -07:00
kshitij 109c3e468c fix(terminal): guard background process spawn against deleted cwd (#19933)
Follow-up to #19928 which fixed the foreground path in _run_bash.
The background process spawn in process_registry.py had the same
vulnerability: Popen(cwd=session.cwd) and PtyProcess.spawn(cwd=...)
would raise FileNotFoundError if the directory was deleted.

Apply _resolve_safe_cwd() at session creation time so both the PTY
and pipe-mode Popen paths receive a validated cwd.
2026-05-04 15:35:34 -07:00
briandevans 9fa3a093f2 fix(local): test root as ancestor candidate; use real pipe for fake stdout
Address Copilot review on PR #17569:

1. _resolve_safe_cwd never tested the filesystem root because the loop
   exited when `os.path.dirname(parent) == parent`, which is true once
   `parent == '/'`. Restructure so the root is checked before the
   self-equal exit. Adds `test_returns_root_when_only_root_exists` —
   regression-guarded by reverting the loop and watching it fail.

2. The fake `Popen.stdout` was a `MagicMock`; `BaseEnvironment._wait_for_process`
   calls `proc.stdout.fileno()` then `select.select`/`os.read` against it,
   which raised `TypeError: fileno() returned a non-integer` (visible as a
   thread exception in test output) and could in theory read from an
   unrelated real fd. Hand `fake_popen` a real `os.pipe()` with the write
   end pre-closed so the drain loop sees EOF immediately. Helper records
   each fd so the test cleans up after itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:31:47 -07:00
briandevans 9644b8ae67 fix(local): recover when persistent_shell cwd is deleted (#17558)
When a tool call deletes its own working directory (`cd /tmp/foo &&
rm -rf /tmp/foo`), the next `subprocess.Popen(args, cwd=self.cwd)` raised
`FileNotFoundError: [Errno 2]` before bash even started — every subsequent
terminal/file-tool call hit the same wedge until the gateway restarted.

Fix in `LocalEnvironment._run_bash`: before handing `self.cwd` to Popen,
resolve a safe alternative when the path is gone (walk up to the nearest
existing ancestor, falling back to `tempfile.gettempdir()` only as a last
resort). Log a warning so the recovery is visible — not silent — and
update `self.cwd` so the next call doesn't repeat the message.

Defense in depth in `LocalEnvironment._update_cwd`: only adopt the new
cwd when it still exists as a directory. `pwd -P` from a deleted cwd can
leave a stale value in the marker file; refusing to store a missing path
keeps `self.cwd` valid by construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:31:47 -07:00
Teknium b8fb9270c4 refactor(cli): drop dead c-S-c key binding (follow-up to #19895) (#19919)
#19884 added a prompt_toolkit key binding for Ctrl+Shift+C to
"prevent Hermes from intercepting the keystroke as an interrupt
signal." #19895 then wrapped the binding in try/except after
discovering it crashed startup with ValueError on every platform.

Both PRs were based on a misreading of how terminal key events
propagate:

1. Terminal emulators (GNOME Terminal, iTerm2, kitty, Windows Terminal,
   etc.) intercept Ctrl+Shift+C before the keystroke reaches the
   application's stdin. prompt_toolkit never sees it. The binding
   could never have intercepted anything.

2. prompt_toolkit's key spec parser doesn't recognise 'c-S-c' on any
   platform — the Shift modifier is meaningless on control-sequence
   keys. Verified: every prompt_toolkit version raises 'Invalid key:
   c-S-c' at registration time.

The handler is dead code. Delete it and leave a comment explaining
why no binding is needed here. Ctrl+Q alias (#19884's other addition)
stays — that's a real prompt_toolkit key and a legitimate interrupt
shortcut.

Verified the CLI starts cleanly — key binding phase no longer raises
and the subsequent chat flow reaches the provider setup check without
error.
2026-05-04 14:49:38 -07:00
Teknium 56a78e74b2 feat(kanban-dashboard): sharper home-channel toggle contrast, drop → running action (#19916)
Follow-up polish to the kanban dashboard from #19864 and #19705.

**Home-channel toggle contrast.** The `.hermes-kanban-home-sub--on`
class previously used `color-mix(var(--color-ring) 14%, transparent)`
which was nearly invisible on both the default teal and NERV themes —
the on/off distinction relied almost entirely on the ✓ prefix glyph.
Bump to 32% fill + full-opacity ring border + inner ring shadow +
font-weight 600. Still theme-scoped (no hardcoded colors), but reads
at a glance on both tested themes.

**Drop the → running status action.** Since #19705, `PATCH /tasks/:id`
rejects `status=running` with HTTP 400 — only the dispatcher's
`claim_task` path legitimately enters that state (so the run row,
claim lock, and worker PID are created atomically). The UI button was
still present and produced a 400 on click, which is a confusing dead
affordance. Remove it from `StatusActions`; add a comment pointing to
#19535 so future editors know why it's missing.

Live-tested on the default Hermes Teal theme. 53/53 kanban dashboard
plugin tests still pass.
2026-05-04 14:48:19 -07:00
nftpoetrist 429b8eceb4 fix(cli): guard c-S-c key binding with try/except to prevent startup crash (#19895)
PR #19884 added @kb.add('c-S-c') unconditionally. prompt_toolkit raises
ValueError("Invalid key: c-S-c") during HermesCLI.__init__ on platforms
where this key spec is not recognised — the process exits before reaching
the prompt loop. Reported on macOS (#19894) and Linux (#19896) immediately
after #19884 landed.

Fix: wrap the registration in try/except ValueError so that startup
continues cleanly on any platform/version that rejects the spec. Where
the spec is accepted the binding is registered normally as a no-op,
allowing the terminal to handle Ctrl+Shift+C natively as before.

Fixes #19894
Fixes #19896
2026-05-04 14:45:01 -07:00
Rames Jusso e493b1c482 docs(skill): add hyperframes inspect command to cli.md + SKILL.md
- references/cli.md: add Inspect step (5/7) to Workflow + dedicated `## inspect` section between validate and preview, covering --json/--samples/--at flags and the legacy `hyperframes layout` alias
- SKILL.md: rename procedure step 7 to "Lint, validate, inspect, preview, render" with the full pipeline; explain inspect as the layout-side companion to validate (catches overflow / off-frame / occluded text issues that static lint can't see)
- SKILL.md verification: lint + validate + inspect as a single combined pass
- SKILL.md References list: include `inspect` in the cli.md command list

Brings the optional skill in sync with hyperframes-oss main as of 2026-05-03 — `inspect` was added in heygen-com/hyperframes#480 (2026-04-25) and is documented as a real workflow step in skills/hyperframes-cli/SKILL.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:13:17 -07:00
James 20859cc408 docs(skill): sync hyperframes skill with upstream changes
Pulls the hyperframes skill up to the latest state of heygen-com/hyperframes
skill content. Opened 2026-04-17; upstream has shipped CLI, layout, and path
changes since.

- SKILL.md: promote the visual-style check to a proper HARD-GATE
  (DESIGN.md > named style > ask 3 questions, with the #333/#3b82f6/Roboto
  tells); expand Step 6 to cover audio-reactive (mandatory per-frame
  tl.call() sampling loop — a single long tween does NOT react to audio),
  caption exit guarantee (hard tl.set kill after group.end), marker
  highlighting, and scene transitions; add the animation-map script to
  Verification; link the new features.md.

- references/cli.md: add capture and validate (both shipped commands, both
  referenced from the workflow but missing from the reference). Add
  --lang to tts with the voice-prefix auto-inference table and espeak-ng
  dependency note (heygen-com/hyperframes#351, 2026-04-20 — after this
  PR opened).

- references/website-to-video.md: update all paths to the capture/
  subfolder layout introduced in heygen-com/hyperframes#345
  (capture/screenshots/, capture/assets/, capture/extracted/tokens.json).
  Old captured/ prefix was broken — agents following the skill were
  looking for files in wrong locations.

- references/features.md (new): distilled coverage for captions (language
  rule, tone table, word grouping, fitTextFontSize, exit guarantee), TTS
  (multilingual phonemization, speed tuning), audio-reactive (data
  format, mapping table, sampling pattern), marker highlighting
  (highlight/circle/burst/scribble/sketchout), and transitions (energy/
  mood tables, presets, shader-compatible CSS rules). Five topics the
  original PR didn't cover.
2026-05-04 14:13:17 -07:00
James 50aabb9eb2 feat(skill): add hyperframes optional creative skill
Adds an optional creative skill that integrates HyperFrames, an
HTML-based video rendering framework, as a sibling to manim-video.
Complements manim's math-focused animation with motion-graphics,
captioned narration, audio-reactive visuals, shader transitions, and
website-to-video production.

Scope:
- optional-skills/creative/hyperframes/SKILL.md      — entry point
- references/composition.md                          — data-attr schema, timeline contract
- references/cli.md                                  — every npx hyperframes command
- references/gsap.md                                 — GSAP core API for compositions
- references/website-to-video.md                     — 7-step capture-to-video workflow
- references/troubleshooting.md                      — OpenClaw / Chromium 147 fix
- scripts/setup.sh                                   — idempotent one-time setup

OpenClaw / Chromium 147 fix (hyperframes#294):
Pinning hyperframes@>=0.4.2 (commit 4c72ba4 ships the
HeadlessExperimental.beginFrame auto-detect + screenshot fallback).
setup.sh pre-caches chrome-headless-shell so the fast BeginFrame path
is preferred over system Chrome. The PRODUCER_FORCE_SCREENSHOT=true
escape hatch is documented in troubleshooting.md and in SKILL.md
Pitfalls.

Placed under optional-skills/ (not bundled) per CONTRIBUTING.md
guidance for heavyweight deps: requires Node.js >= 22, FFmpeg, and
~300 MB chrome-headless-shell download.
2026-05-04 14:13:17 -07:00
Teknium 8fabef9d35 fix(docs): register cron-script-only guide in sidebar (#19893)
PR #19709 added website/docs/guides/cron-script-only.md but never added the entry to website/sidebars.ts, which is explicitly enumerated (not autogenerated). Two consequences:

1. The guide didn't show up in the left-nav "Guides & Tutorials" list — users could only reach it via cross-links from other pages.
2. Landing on the guide page directly made the sidebar disappear entirely (Docusaurus treats unregistered docs as orphaned and renders them without their parent sidebar).

Added 'guides/cron-script-only' next to 'guides/automate-with-cron' so it slots in alongside the other cron content. Verified with `npm run build`: no orphan warnings, no broken links, page builds with sidebar intact.

No content change, docs only.
2026-05-04 12:57:01 -07:00
briandevans 81cd678291 fix(google-workspace): restore required_credential_files in SKILL.md (#16452)
PR #9931 ("feat(google-workspace): add --from flag for custom sender display name")
accidentally removed the required_credential_files frontmatter block that tells
hermes to bind-mount google_token.json and google_client_secret.json into Docker
and Modal remote terminals before running setup.py.

Without this header the credential files are never registered in the session-scoped
ContextVar, so get_credential_file_mounts() returns an empty list at container
creation time and the OAuth files are invisible inside the sandbox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:43:14 -07:00
briandevans 60b143e9df fix(tui_gateway): guard sys.path against local package shadowing (#15989)
When the TUI backend (tui_gateway/entry.py) is spawned by Node.js with the
user's CWD containing a local utils/ directory, that directory shadows the
installed utils module, causing ImportError in run_agent and hermes_cli.

Strip '' and '.' from sys.path and prepend HERMES_PYTHON_SRC_ROOT (already
set by hermes_cli before spawning the subprocess) so installed packages
always win over CWD artifacts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 12:42:43 -07:00
Harry Riddle 645a2f482d fix(cli): fix shortcut config conflict in hermes_cli 2026-05-04 12:41:05 -07:00
Steven Chanin a919269eb5 fix(skills/email/himalaya): document v1.2.0 folder.aliases syntax
The bundled himalaya skill documented folder aliases using a stale
TOML schema (`[accounts.NAME.folder.alias]`, singular) that himalaya
v1.2.0 silently ignores. The TOML parses without error, but the
alias resolver never reads the sub-section — every lookup then falls
through to the canonical folder name.

Source: in `pimalaya/core` (the `email-lib` crate himalaya v1.2.0
depends on, currently v0.27.0), `email/src/folder/config.rs` defines
`FolderConfig { aliases: Option<HashMap<String, String>>, ... }`
(plural, no `#[serde(rename)]`/`alias` aliases, no
`deny_unknown_fields`), and `account/config/mod.rs::get_folder_alias`
returns the input verbatim when no alias is found. So the singular
`alias` key deserializes to nothing and lookups silently fall
through.

On Gmail (where `sent` resolves to `[Gmail]/Sent Mail`, not `Sent`)
this means save-to-Sent fails *after* SMTP delivery already
succeeded, and `himalaya message send` exits non-zero. Any caller
(agent, script, user) that retries on that exit code will re-run
the entire send — including SMTP — producing duplicate emails to
recipients. Silent ignore + caller-level retry is significantly
worse than a config that just doesn't work.

This commit updates SKILL.md and references/configuration.md to the
v1.2.0 `folder.aliases.X` syntax (plural, dotted keys, directly
under the account section), adds a Gmail-specific block with the
`[Gmail]/Sent Mail`-style mapping, and adds notes on the failure
mode so future readers don't hit the same trap. SKILL.md version
bumped 1.0.0 → 1.1.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:39:49 -07:00
Teknium 9cda237bb1 docs(cron): lead with agent-driven setup for no-agent mode (#19871)
The shipped no-agent docs introduced the feature via CLI first and
mentioned the chat path as a two-line afterthought. That buries the
actual value prop: the cronjob tool exposes no_agent directly to the
agent, so a user can describe a watchdog in plain language and Hermes
wires up the script + schedule + delivery without anyone opening an
editor.

Changes:

* cron-script-only.md: promote 'Create One from Chat' above
  'Create One from the CLI', flesh it out with a worked transcript
  (the actual tool calls the agent makes), add subsections covering
  'what the agent decides for you' (when to pick no_agent=True vs
  LLM mode) and 'managing watchdogs from chat' (pause/resume/edit/
  remove all agent-accessible).

* user-guide/features/cron.md:
  - Add 'no-agent mode' to the top-level feature list with a cross-
    link, plus a sentence up top making it clear everything is
    agent-accessible through the cronjob tool.
  - Add 'The agent sets these up for you' subsection to the no-agent
    section showing the exact tool call shape.

* automate-with-cron.md: tighten the existing tip box to mention the
  agent-driven path, not just CLI scheduling.

No behavior change — docs only.
2026-05-04 12:39:19 -07:00
briandevans eadf34633e fix(models): strip :cloud/-cloud suffix from models.dev Ollama Cloud IDs
models.dev appends :cloud and -cloud suffixes to Ollama Cloud model IDs
(e.g. kimi-k2.6:cloud, qwen3-coder:480b-cloud) that the live Ollama Cloud
API does not use. Without normalisation, these suffixed IDs bypass the
dedup check and appear alongside the correct clean IDs, causing 400/404
errors when users select them in /model or hermes model.

Add _strip_ollama_cloud_suffix() and apply it to mdev entries before the
dedup merge in fetch_ollama_cloud_models() so all model IDs stored in the
disk cache use the canonical form the API accepts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:38:15 -07:00
Yoimex c050ee6573 fix(file_ops): resolve search_files path/line collision for hyphenated numeric filenames 2026-05-04 12:37:47 -07:00
Ricardo-M-L fbc477df71 fix(run_agent): acquire lock in IterationBudget.used property
The `used` property was reading `self._used` without holding the lock,
while `consume()`, `refund()`, and `remaining` all properly acquire
`self._lock` before accessing `_used`. This means a concurrent call to
`used` during `consume()` or `refund()` could observe a partially-
updated value, leading to incorrect iteration budget metrics reported
to the gateway, or in extreme cases a ValueError from CPython's list
implementation when the internal array resizes during iteration.

Fix: acquire the lock in `used` just like `remaining` does.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 12:37:28 -07:00
ClawdIA 64ad7dec0d fix(file-ops): allow file search in hidden roots 2026-05-04 12:37:09 -07:00
briandevans 9e2628ee7c test(discord): annotate make_attachment content_type as Optional[str]
Copilot review: the helper accepted None in one test but was annotated str.
Matches actual usage where no-content-type attachments are a tested scenario.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:36:47 -07:00
Ioodu 1c7f47a58c fix(cron): add concurrency regression test for parallel job state writes
get_due_jobs() called load_jobs() and save_jobs() without holding
_jobs_file_lock, creating a race with the locked mark_job_run() and
advance_next_run(). Wrap get_due_jobs() with the lock (delegating to a
new _get_due_jobs_locked() inner function) so all load→modify→save
cycles are serialised. Add two regression tests: one verifying 3
concurrent mark_job_run() calls each land their correct last_status and
last_run_at without overwrites, and a stress test confirming 10 parallel
calls each increment their job's completed count to exactly 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:36:29 -07:00
lhysdl 6875471916 fix(tts): update MiniMax API endpoint to v1/text_to_speech
MiniMax deprecated the old v1/t2a_v2 endpoint (api.minimax.io) and
moved to v1/text_to_speech (api.minimax.chat). The new API:

- Uses a flat payload: {model, text, voice_id} instead of nested
  voice_setting / audio_setting objects
- Returns raw audio bytes (Content-Type: audio/mpeg) instead of
  JSON with hex-encoded audio
- Uses model 'speech-01' instead of 'speech-2.8-hd'
- Updated default voice_id to 'female-shaonv' for Chinese TTS

The implementation detects Content-Type to handle both old and new
API responses, maintaining backward compatibility for any users who
manually configured the legacy base_url.
2026-05-04 12:36:09 -07:00
briandevans 75bce317a3 fix(cron): expand \${VAR} refs in config.yaml during job execution (#15890)
The cron scheduler's run_job() loaded config.yaml with yaml.safe_load()
but never called _expand_env_vars(), so ${HERMES_MODEL} and similar
references in model:, fallback_providers:, and other config.yaml fields
were forwarded to the LLM API as literal strings, causing HTTP 400 errors.

The normal CLI path has always called _expand_env_vars() via load_config(),
so this was a cron-only gap. The .env load at the top of run_job() already
populates os.environ before config.yaml is read, so the expansion sees the
correct values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:35:46 -07:00
Albert.Zhou fd9c32c0f2 fix(email): drop non-allowlisted senders before dispatch to prevent mail loops
Add EMAIL_ALLOWED_USERS check in EmailAdapter._dispatch_message()
to silently discard emails from senders not in the allowlist.  This
prevents the adapter from creating thread context and dispatching a
MessageEvent for unauthorized senders, which could race with the
gateway authorization check and result in SMTP replies being sent
despite the handler returning None.

Test: tests/gateway/test_email.py::TestDispatchMessage::test_non_allowlisted_sender_dropped
Test: tests/gateway/test_email.py::TestDispatchMessage::test_allowlisted_sender_proceeds
Test: tests/gateway/test_email.py::TestDispatchMessage::test_empty_allowlist_allows_all
2026-05-04 12:35:22 -07:00
briandevans 20edca75e9 fix(update): sync bundled skills to all profiles, including active (#16176)
`hermes update` iterated only non-active profiles when seeding bundled
skills. `seed_profile_skills()` uses a subprocess with an explicit
HERMES_HOME so it correctly targets any profile path; the `p.name !=
active` filter was the only thing preventing the active profile from
being included, leaving it silently on stale skill content after every
update.

Drop the filter and update the header line from "other profiles" to
"all profiles". The active profile is now seeded on the same path as
every other profile. The earlier `sync_skills()` call (module-level
HERMES_HOME) remains for backward compatibility; the subprocess-based
loop is reliable regardless of which HERMES_HOME the CLI was invoked
with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:34:53 -07:00
jjjojoj 103f51ad34 fix(doctor): check gh auth status when GITHUB_TOKEN absent
hermes doctor showed 'No GITHUB_TOKEN (60 req/hr)' warning even when
users had authenticated via gh auth login. Now falls back to
gh auth status --json authenticated when GITHUB_TOKEN and GH_TOKEN
are both unset.

Fixes #16115
2026-05-04 12:34:31 -07:00
fiver 8ab9f61dcf fix(gateway): preserve WSL interop PATH in systemd units 2026-05-04 12:34:06 -07:00
Teknium d90f73bcec fix(gateway): use git HEAD SHA, not file mtimes, for stale-code check (#19740)
The stale-code self-check (Issue #17648) used sentinel-file mtimes to
decide whether the gateway survived a `hermes update` with stale
`sys.modules`. That signal false-positives on any write to the
sentinel files — including agent-driven edits during Hermes-on-Hermes
dev sessions. Telling the agent to patch `run_agent.py` would flip
the check to True on the next user message and force a gateway
restart even though no update happened.

Switch the signal to `git rev-parse HEAD`. Agent file edits don't
move HEAD; `hermes update` (git pull) always does. Reading .git/HEAD
directly (no subprocess) with a 5s cache keeps the overhead negligible
on bursty chats. Non-git installs short-circuit to False — the
stale-modules class can't occur without a git-backed update path, so
there's nothing to detect.

The legacy `_compute_repo_mtime` helper is kept but unused by
detection, reserved as a fallback hook for future pip-install update
paths.

- _read_git_head_sha(): resolves HEAD across main checkout, worktree
  (follows `gitdir:` + `commondir` pointers), and packed-refs layouts.
- _current_git_sha_cached(): per-runner 5s SHA cache.
- _detect_stale_code(): boot SHA vs current SHA, returns False when
  either is unavailable.
- Tests cover all four layouts, the agent-edits-don't-trigger
  regression, and cache behavior.

Refs #17648.
2026-05-04 12:33:21 -07:00
Teknium a21f364ad7 chore(release): AUTHOR_MAP entries for Tier 1g salvage batch 2026-05-04 12:32:10 -07:00
Teknium 1c7c7c3c5f feat(kanban-dashboard): per-platform home-channel notification toggles (#19864)
* revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718)

Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.

* feat(kanban-dashboard): per-platform home-channel notification toggles

Adds a "Notify home channels" section to the task drawer in the kanban
dashboard plugin. Each platform where the user has set a home channel
(/sethome, TELEGRAM_HOME_CHANNEL env var, gateway.platforms.<p>.home_channel
in config.yaml) gets a toggle pill. Toggling on writes a kanban_notify_subs
row keyed to that platform's home (chat_id + thread_id); toggling off
removes it. The existing gateway notifier watcher delivers completed /
blocked / gave_up events without any new plumbing — this is purely a GUI
surface over existing machinery.

Replaces the reverted auto-subscribe behavior from #19718 with an explicit,
per-task, per-platform, user-controlled opt-in. No implicit subscription
on tool-driven kanban_create; no CLI commands; no slash commands. Just a
toggle in the drawer.

Backend (plugins/kanban/dashboard/plugin_api.py):
- GET  /api/plugins/kanban/home-channels[?task_id=X]
  Returns every platform with a configured home, plus a per-entry
  subscribed: bool relative to task_id (false when task_id omitted).
  Reads the live GatewayConfig via load_gateway_config() so env-var
  overlays stay honored.
- POST /api/plugins/kanban/tasks/:id/home-subscribe/:platform
  Idempotent add_notify_sub keyed to the platform's home.
- DELETE /api/plugins/kanban/tasks/:id/home-subscribe/:platform
  remove_notify_sub for the same tuple.
- 404 when the platform has no home configured, or task_id doesn't
  exist (POST only).

Frontend (plugins/kanban/dashboard/dist/index.js):
- TaskDrawer fetches /home-channels on open, keyed on task_id.
- HomeSubsSection renders nothing when zero platforms have a home (so
  users who haven't set one up don't see an empty UI block).
- Optimistic toggle with busy flag + revert-on-failure. One pill per
  platform; ✓ prefix and --on class indicate the subscribed state.

CSS (plugins/kanban/dashboard/dist/style.css):
- .hermes-kanban-home-subs flex row + .hermes-kanban-home-sub pill
  style + --on subscribed variant (subtle ring-colored background).

Live-tested against a dashboard with TELEGRAM + DISCORD_BOT_TOKEN /
HOME_CHANNEL env vars set: drawer shows both pills, toggling each
flips its visual state AND writes/removes the correct kanban_notify_subs
row (verified via direct DB read).

Tests (tests/plugins/test_kanban_dashboard_plugin.py, 11 new, 53/53
pass total):
- home-channels lists only platforms with a home (slack with a
  token but no home is excluded)
- no task_id -> all subscribed=false
- subscribe creates notify_sub row with correct chat/thread/platform
- subscribed=true reflected in subsequent GET
- idempotent re-subscribe
- unknown platform -> 404
- unknown task -> 404
- unsubscribe removes the row
- telegram + discord subscribe/unsubscribe independent
- zero homes -> empty list
2026-05-04 12:31:21 -07:00
emozilla 2bc82bb504 clarify placeholder telegram credential in tests 2026-05-04 15:31:15 -04:00
Teknium 3db6b9cc87 feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709)
* feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern)

Adds a no_agent=True option to the cronjob system. When enabled, the
scheduler runs the attached script on schedule and delivers its stdout
directly to the job's target — no LLM, no agent loop, no token spend.
This is the classic bash-watchdog pattern (memory alert every 5 min,
disk alert every 15 min, CI ping) reimplemented as a first-class Hermes
primitive instead of a systemd timer + curl + bot token triplet living
outside the system.

## What

  hermes cron create "every 5m" \
    --no-agent \
    --script memory-watchdog.sh \
    --deliver telegram \
    --name memory-watchdog

Agent tool:

  cronjob(action='create',
          schedule='every 5m',
          script='memory-watchdog.sh',
          no_agent=True,
          deliver='telegram')

Semantics:
- Script stdout (trimmed) → delivered verbatim as the message
- Empty stdout          → silent tick (no delivery; watchdog pattern)
- wakeAgent=false gate  → silent tick (same gate LLM jobs use)
- Non-zero exit/timeout → delivered as an error alert
                          (broken watchdogs shouldn't fail silently)
- No LLM ever invoked; no tokens spent; no provider fallback applied

## Implementation

cron/jobs.py
  * create_job gains no_agent: bool = False
  * prompt becomes Optional (no_agent jobs don't need one)
  * Validation: no_agent=True requires a script at create time
  * Field roundtrips via load_jobs / save_jobs / update_job

cron/scheduler.py
  * run_job: new short-circuit branch at the top that runs the script,
    wraps its output into the (success, doc, final_response, error)
    tuple downstream delivery already expects, and returns before any
    AIAgent import or construction
  * _run_job_script: picks interpreter by extension — .sh/.bash run
    under /bin/bash, anything else under sys.executable (Python).
    Shell support unlocks the bash-watchdog pattern without wrapping
    scripts in Python. Extension is explicit; we deliberately do NOT
    trust the file's own shebang. Path-containment guard (scripts dir)
    unchanged.

tools/cronjob_tools.py
  * Schema: new no_agent boolean property with clear trigger guidance
  * cronjob() accepts no_agent and validates mode-specific shape:
    - no_agent=True requires script; prompt/skills optional
    - no_agent=False keeps the existing 'prompt or skill required' rule
  * update path rejects flipping no_agent=True on a job without a script
  * _format_job surfaces no_agent in list output
  * Handler lambda forwards no_agent from tool args

hermes_cli/main.py, hermes_cli/cron.py
  * 'hermes cron create --no-agent' and edit's --no-agent / --agent
    pair for toggling at CLI parity with the agent tool
  * Existing --script help text updated to describe both modes
  * List / create / edit output now shows 'Mode: no-agent (...)' when set

## Tests

tests/cron/test_cron_no_agent.py — 18 tests covering:
  * create_job: no_agent shape, validation, field persistence
  * update_job: flag roundtrip across reload
  * cronjob tool: schema validation, update toggling, mode-specific
    requirements, prompt-relaxation rule
  * run_job short-circuit:
    - success path delivers stdout verbatim
    - empty stdout → SILENT_MARKER (no delivery downstream)
    - wakeAgent=false gate → silent
    - script failure → error alert
    - run_job does NOT import AIAgent (verified via mock)
  * _run_job_script:
    - .sh executes via bash (no shebang required)
    - .bash executes via bash
    - .py still runs via sys.executable (regression)
    - path-traversal still blocked (security regression)

All 18 new tests pass. 341/342 pre-existing cron tests still pass; the
one failure (test_script_empty_output_noted) was already broken on main
and is unrelated to this change.

## Docs

website/docs/guides/cron-script-only.md — new dedicated guide covering
the watchdog pattern, interpreter rules, delivery mapping, worked
examples (memory / disk alerts), and the comparison table vs hermes send,
regular LLM cron jobs, and OS-level cron.

website/docs/user-guide/features/cron.md — new 'No-agent mode' section
in the cron feature reference, cross-linked to the guide.

website/docs/guides/automate-with-cron.md — new tip box pointing users
to no-agent mode when they don't need LLM reasoning.

## Compatibility

- Existing jobs: unchanged. no_agent defaults to False, existing code
  paths untouched until the flag is set.
- Schema additive only; older jobs.json without the field load fine
  via .get() with False default.
- New CLI flags are opt-in and don't alter existing flag behavior.

* fix(cron): lazy-import AIAgent + SessionDB so no_agent ticks pay zero

The unconditional `from run_agent import AIAgent` + SessionDB() init at
the top of run_job() meant every no_agent tick still paid the full agent
module load cost (~300ms + transitive imports + DB open) even though it
never touched any of that machinery.

Move both to live under the default (LLM) path, after the no_agent
short-circuit has returned. Now a no_agent tick's sys.modules stays
clean — verified end-to-end:

    assert 'run_agent' not in sys.modules  # before
    run_job(no_agent_job)
    assert 'run_agent' not in sys.modules  # after

The existing mock-based unit test (test_run_job_no_agent_never_invokes_aiagent)
kept passing because patch() replaces the class AFTER import; the leak
was only visible via real subprocess-style verification. End-to-end
demo confirmed: agent calls cronjob(no_agent=True) → script runs →
stdout delivered → no LLM machinery loaded.

* docs(cron): tighten no_agent tool schema — defaults, silent semantics, pick rule

Previous description buried the important bits in one long sentence.
Agents could plausibly miss three things an LLM-facing schema should
make unmissable:

1. What the default is — now first sentence + JSON Schema `default: false`
2. What 'silent run' actually means for the user — now spelled out:
   'nothing is sent to the user and they won't see anything happened'
3. When to pick True vs False — now a concrete decision rule with
   examples on both sides (watchdogs/metrics/pollers → True;
   summarize/draft/pick/rephrase → False)

Also adds explicit 'prompt and skills are ignored when True' since the
agent could otherwise still pass them out of habit.

No behavior change — schema text only.
2026-05-04 12:31:01 -07:00
teknium1 d35efb9898 feat(telegram): /topic off + help + auth gate + screenshot debounce
Four production-readiness additions to topic mode:

1. /topic off — clean disable path. Flips telegram_dm_topic_mode.enabled
   to 0 and clears telegram_dm_topic_bindings for this chat. Previously
   users had to edit state.db with sqlite3 to turn the feature off.
   Idempotent: calling /topic off when the chat was never enabled
   returns a friendly no-op message.

2. /topic help — inline usage printed in the DM so users don't have to
   visit docs to discover /topic off, /topic <session-id>, etc.

3. Authorization gate. /topic mutates SQLite side tables and flips the
   root DM into a lobby, so the action must be authorized. Now calls
   self._is_user_authorized(source); unauthorized DMs get a refusal
   instead of activation. Defense in depth on top of the gateway's
   existing pre-route auth.

4. BotFather screenshot debounce. A user repeatedly running /topic
   while Threads Settings is still disabled would previously re-upload
   the same screenshot every time. Now rate-limited to one send per
   5 minutes per chat. /topic off resets the counter so re-enabling
   starts fresh.

Command-def args hint updated: /topic [off|help|session-id].

Docs:
- New /topic subcommands table at the top of the multi-session section
- Disable instructions updated to recommend /topic off first, with the
  raw SQL fallback kept for bulk cleanup
- Under-the-hood list extended with the capability-hint debounce and
  the authorization gate

Tests (6 new):
- /topic help returns usage and doesn't create topic tables
- /topic off disables mode AND clears bindings
- /topic off is idempotent when never enabled
- Unauthorized users get refusal, no tables created
- Capability-hint debounce is per-chat
- /topic off resets both lobby and capability debounce counters

All 402 targeted tests pass. Full gateway sweep: 4809/4810
(pre-existing test_teams::test_send_typing unrelated).
2026-05-04 12:07:17 -07:00
teknium1 1381c89e56 fix(telegram): polish topic mode — CASCADE, General-topic handling, rename guard, debounce
Five follow-ups to topic mode based on integration audit:

1. ON DELETE CASCADE on telegram_dm_topic_bindings.session_id. Session
   pruning (manual /delete, auto-cleanup, any future prune job) would
   have thrown 'FOREIGN KEY constraint failed' for sessions bound to a
   topic. Migration bumped to v2, rebuilds the bindings table in place
   if FK lacks CASCADE. Idempotent; only runs once per DB.

2. Never auto-rename operator-declared topics. If an operator has
   extra.dm_topics configured AND a user runs /topic, messages in those
   pre-declared topics would previously trigger auto-rename and silently
   mutate operator config. _rename_telegram_topic_for_session_title now
   early-returns when _get_dm_topic_info returns a dict for this
   (chat_id, thread_id). Uses class-based lookup (not hasattr) so
   MagicMock test fixtures don't accidentally trip the guard.

3. General topic handling. Telegram's General (pinned top) topic in a
   forum-enabled private chat may send messages with message_thread_id=1
   or omit thread_id entirely depending on client. Both are now treated
   as the root lobby, not a topic lane. Prevents users from
   accidentally burning a session on the General topic.

4. Debounce the root-lobby reminder. 30-second cooldown per chat so a
   user who forgets topic mode is enabled and types ten messages in the
   root gets one reminder, not ten. Explicit command replies
   (/new-in-lobby, /topic <session-id>) still land every time.

5. Docs: added under-the-hood invariants for the above, plus a
   Downgrade section explaining that rolling back to a pre-/topic
   Hermes build leaves the DB tables orphaned but harmless — DMs just
   revert to native per-thread isolation.

Tests:
- test_operator_declared_topic_is_not_auto_renamed
- test_general_topic_is_treated_as_root_lobby
- test_lobby_reminder_is_debounced_per_chat
- test_binding_survives_session_deletion_via_cascade
- test_migration_rebuilds_v1_binding_table_with_cascade_fk

Validated: 4803/4804 tests pass (tests/gateway/ + tests/test_hermes_state.py).
Sole failure is a pre-existing test_teams::test_send_typing flake
unrelated to this PR.
2026-05-04 12:07:17 -07:00
teknium1 1a9542cf75 docs(telegram): document /topic multi-session DM mode
Adds a new section 'Multi-session DM mode (/topic)' to the Telegram
messaging docs, covering:

- Comparison table vs the existing config-driven extra.dm_topics
- BotFather prerequisites (Threads Settings, user-create permission)
- Activation flow and root-DM lobby behavior
- End-user flow for creating topics via the + button / All Messages
- Auto-renaming when Hermes generates session titles
- /new semantics inside a topic
- /topic <session-id> restore of previous sessions
- Persistence layout (SQLite side tables)
- How to disable the feature

Also:
- New /topic row in the messaging slash-commands reference
- Updated Bot API 9.4 summary to point at both topic features
2026-05-04 12:07:17 -07:00
teknium1 a7683d04a9 fix(telegram): harden DM topic binding — persist through switch_session, rebind on /new
Follow-up on @EmelyanenkoK's feat: add Telegram DM topic-mode sessions.

Three issues:

1. Split-brain session state. After get_or_create_session() returned a
   SessionEntry for a topic lane, the handler was mutating
   .session_id in place to the binding's target, but never persisting
   the switch through SessionStore. The sessions.json session_key →
   session_id map kept pointing at the lane's natural id; any reader
   that reloaded from disk saw the wrong id. Fixed by routing through
   SessionStore.switch_session(), which _save()s the mapping and ends
   the old session in SQLite like /resume does.

2. /new inside a topic was a one-message no-op. Reset created a new
   session but left the telegram_dm_topic_bindings row pointing at the
   old session_id, so the next message's binding lookup switched right
   back. Now _handle_reset_command rebinds the topic to the new
   session_id after reset.

3. is_telegram_session_linked_to_topic and
   list_unlinked_telegram_sessions_for_user both called
   apply_telegram_topic_migration() on read, contradicting the PR's
   own invariant that migration only runs on explicit /topic opt-in.
   They now tolerate missing topic tables and return empty/False.

Also: _telegram_topic_mode_enabled() now only treats True as enabled
(not any truthy return), so test fixtures with MagicMock session_db
don't accidentally flip every DM into lobby mode — this was breaking
4 pre-existing test_status_command tests.

Tests:
- New regression: /new inside a topic must update the binding row
  (test_new_inside_telegram_topic_rewrites_binding_to_new_session).
- _make_runner now stubs switch_session so existing restore tests
  still exercise the new code path.

Validated end-to-end with real SessionDB + SessionStore:
readers on fresh DB don't create topic tables; enable creates them;
binding override persists across SessionStore restart; /new rebinds
and the new id survives a restart.

Co-authored-by: EmelyanenkoK <emelyanenko.kirill@gmail.com>
2026-05-04 12:07:17 -07:00
EmelyanenkoK 25065283b3 fix: improve telegram topic mode setup 2026-05-04 12:07:17 -07:00
EmelyanenkoK d6615d8ec7 feat: add Telegram DM topic-mode sessions 2026-05-04 12:07:17 -07:00
asheriif 0ce1b9fe20 fix(tui): preserve prompt separator width (#19340)
* fix(tui): preserve prompt separator width

* fix(tui): align transcript height estimates with prompt width
2026-05-04 09:58:40 -07:00
brooklyn! d9c090fe36 Merge pull request #19338 from asheriif/fix/tui-plugin-slash-exec-live
fix(tui): run plugin slash commands live
2026-05-04 09:57:45 -07:00
kshitijk4poor 54e78cadb2 test: add regression test for Teams interactive_setup import fix
Adapted from PR #19188 by @LeonSGP43 — mocks cli_output helpers and
verifies interactive_setup persists credentials to .env without
crashing. Also adds megastary to AUTHOR_MAP.
2026-05-04 06:54:27 -07:00
megastary 38adfebe78 fix(teams): import prompt/print helpers from cli_output, not config
The Teams adapter's interactive_setup() tried to import prompt,
prompt_yes_no, print_info, print_success, and print_warning from
hermes_cli.config, but those helpers live in hermes_cli.cli_output.
Only get_env_value/save_env_value live in hermes_cli.config.

This caused 'hermes setup' to crash with ImportError as soon as the
user picked Teams in the messaging-platforms wizard.

Split the import accordingly.
2026-05-04 06:54:27 -07:00
kshitijk4poor cfd86dcdb8 chore: add bobashopcashier noreply email to AUTHOR_MAP 2026-05-04 06:23:52 -07:00
bobashopcashier d89e7a3cd4 fix(anthropic): restrict fast mode to Opus 4.6 (Anthropic API contract)
Per https://platform.claude.com/docs/en/build-with-claude/fast-mode:
"Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
with an unsupported model returns an error."

Pre-fix, _is_anthropic_fast_model() returned True for any claude-* model,
so /fast on Opus 4.7 (or Sonnet/Haiku) would persist agent.service_tier=fast
in config.yaml and the adapter would inject extra_body["speed"] = "fast"
on every subsequent request. Opus 4.7 returns:

  HTTP 400: 'claude-opus-4-7' does not support the `speed` parameter.

This wedged sessions across model upgrades (a user who ran /fast on Opus 4.6
and later switched the default model to 4.7 hit a hard 400 on every turn
until they manually edited config.yaml).

Changes:
- _is_anthropic_fast_model: gate on "opus-4-6" / "opus-4.6" only
- anthropic_adapter: add _supports_fast_mode predicate as defensive guard
  so stale request_overrides on an unsupported model are dropped silently
  instead of 400'ing
- Tests: flip the assertions that mirrored the bug (Sonnet/Haiku/Opus 4.7
  asserting fast-mode support) to match the documented API contract
2026-05-04 06:23:52 -07:00
JasonOA888 a7417f8a4a fix(compressor): skip non-string tool content in summarization pass to prevent AttributeError
Commit 408dd8aa added a non-string guard for Pass 1 (dedup), but the same
pattern exists in Pass 2 (summarization/pruning) where content.startswith()
and len() are called on potentially non-string tool content.

When a provider returns tool results with non-string content (e.g. dict or
int from llama.cpp or similar), the pruning pass crashes with AttributeError.

Add the same isinstance(content, str) guard to Pass 2 for consistency.
2026-05-04 06:23:52 -07:00
helix4u eeb05cf556 docs: default custom tool creation to plugins
Steers custom tool creation toward the plugin route by default.
The adding-tools.md guide is now explicitly for built-in core Hermes
tools only.

Key fixes:
- Plugin quickstart: ctx.register_tool() now uses correct keyword-arg
  API (name=, toolset=, schema=, handler=) instead of broken 3-arg call
- Handler signature: (params, **kwargs) instead of (params)
- Handler return: json.dumps({...}) instead of plain string
- AGENTS.md: mentions plugin route before built-in tool instructions
- learning-path.md: plugins listed before core tool development
- contributing.md: separates plugin vs core tool paths

Based on PR #13138 by @helix4u.
2026-05-04 05:53:16 -07:00
ygd58 74c1b946e0 fix(browser): inject --no-sandbox for root and AppArmor userns restrictions
On VPS/Docker and some Ubuntu 23.10+ hosts, Chromium refuses to start
without --no-sandbox:
  - uid=0 (root): hard requirement (VPS/Docker deployments)
  - AppArmor apparmor_restrict_unprivileged_userns=1 (Ubuntu 23.10+):
    non-root too, under systemd or unprivileged containers

Detect both conditions and inject AGENT_BROWSER_CHROME_FLAGS with
--no-sandbox --disable-dev-shm-usage when the user hasn't already
set the flags themselves.

Salvage of #15771 — only the browser_tool.py fix is cherry-picked.
The PR's accompanying MCP preset addition (new feature surface)
was dropped so the bug fix can land independently.

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-05-04 05:27:23 -07:00
briandevans ce22301dc6 test(sms): use clear=True in test_missing_phone_number_is_non_retryable
Prevents pre-existing TWILIO_PHONE_NUMBER or SMS_WEBHOOK_URL values in
the outer test environment from leaking into the assertion context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:25:09 -07:00
0668001438 83080772f2 fix(delegation): honor provider override for subagents
Clear inherited provider preference filters when delegation.provider is set so delegated children do not route back to the parent provider. Add a regression test for cross-provider delegation with parent OpenRouter filters.

Closes #10653
2026-05-04 05:22:35 -07:00
Pratik Rai 7a8ee8b29d fix(gateway): deduplicate Weixin messages by content fingerprint 2026-05-04 05:20:13 -07:00
briandevans 0b5fd40a01 fix(delegate): correct _spawn_child → _build_child_agent in comments
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:18:45 -07:00
briandevans 42d72b5922 fix(status): add missing popular provider API keys to hermes status display
Closes #16082.

`hermes status` silently omitted four widely-used LLM providers
(Google/Gemini, DeepSeek, xAI/Grok, NVIDIA NIM) from the API Keys
and API-Key Providers sections. Add them, along with tuple-valued
env var support (first found wins) so Google can accept either
GOOGLE_API_KEY or GEMINI_API_KEY.

Also deduplicates the "NVIDIA" and "NVIDIA NIM" rows that were
both pointing at NVIDIA_API_KEY.

Salvage of #16159 (core behavior preserved + NVIDIA dedup fixup
on top of the tuple-support refactor).

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
2026-05-04 05:14:13 -07:00
VinVC 5d6431c114 fix(doctor): resolve merge conflicts, add kimi-coding-cn test
- Rebased on upstream/main to resolve conflicts
- Added test_run_doctor_accepts_kimi_coding_cn_provider test
- All 30 tests pass
2026-05-04 05:12:42 -07:00
阿泥豆 0e9416036a test: add unit tests for heartbeat stale threshold increase 2026-05-04 05:08:51 -07:00
阿泥豆 0cc63043e0 fix(delegation): increase heartbeat stale thresholds
The heartbeat stale detection was too aggressive:
- idle: 5 * 30s = 150s — LLM inference on slow providers (Zhipu/GLM)
  frequently exceeds 150s, causing heartbeat to stop prematurely
- in-tool: 20 * 30s = 600s — borderline for long tool calls

When heartbeat stops, parent._last_activity_ts freezes, eventually
triggering gateway timeout and killing the entire delegation.

New thresholds:
- idle: 15 * 30s = 450s — accommodates slow LLM inference
- in-tool: 40 * 30s = 1200s — accommodates long-running tool calls

child_timeout_seconds (config: delegation.child_timeout_seconds) remains
the hard cap for total delegation duration.
2026-05-04 05:08:51 -07:00
briandevans 6b4ccb9b14 fix(session-search): report source from resolved parent, not FTS5 child session (#15909)
When a delegation child session (e.g. source='telegram') contains the
FTS5 hit but _resolve_to_parent() maps it to a different root session
(source='api_server'), the result entry was still reporting the child's
source because the loop discarded session_meta as `_` and fell back to
match_info.get('source'), which carries the child session's value.

Use the resolved parent's session_meta for source, model, and started_at
with match_info as a fallback, so the output accurately reflects the
session the user actually interacted with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:07:40 -07:00
briandevans b46b0c9888 fix(backup): floor pre-update backup_keep to 1 so the new backup survives
`updates.backup_keep: 0` (or any negative value) wiped the freshly-
created pre-update zip:

  _prune_pre_update_backups(backup_dir, keep=0):
      backups = sorted(..., reverse=True)   # newest first, includes
                                            # the zip we just wrote
      for p in backups[0:]:                 # = all of them
          p.unlink()

The wrapper in `main.py` then printed `Saved: <path>` for a file that
no longer existed (the size lookup is wrapped in `try/except OSError`
which silently degrades to "0 B"), leaving operators believing they had
a recovery point when they had none.

This is a real footgun because some config systems treat 0 as "keep
unlimited"; here it does the opposite — every backup is destroyed
right after creation.

Fix: clamp `keep` to a minimum of 1 inside `_prune_pre_update_backups`
since that helper is only invoked immediately after a fresh backup
is written.  Operators who genuinely want no backups should set
`updates.pre_update_backup: false` (which gates creation entirely)
rather than relying on `backup_keep: 0`.

Also extends the `backup_keep` config docstring to spell out the floor
and point at `pre_update_backup: false` as the off-switch.

## Tests

Three regression tests added in `TestPreUpdateBackup`:

  - `test_keep_zero_does_not_delete_freshly_created_backup` —
    asserts the file persists after `keep=0`
  - `test_keep_negative_does_not_delete_freshly_created_backup` —
    same for negative values
  - `test_keep_zero_still_prunes_older_backups` — proves the floor
    only protects the new backup; older ones are still rotated out

Verified the new tests fail on origin/main (without the floor) and
pass with it; full `tests/hermes_cli/test_backup.py` suite green
(84 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:07:13 -07:00
Sanhu Li ef8c213e88 fix(model-switch): soft-accept unlisted openai-codex models 2026-05-04 05:06:53 -07:00
0xsir0000 52882dade6 fix(agent): include name field on every role:tool message for Gemini compatibility (#16478)
Gemini's OpenAI-compatibility endpoint strictly requires the `name` field
on `role: tool` messages — it returns HTTP 400 ("Request contains an
invalid argument") when the function name is missing. OpenAI/Anthropic/
ollama tolerate the absence, so the gap stays invisible until the
conversation accumulates a tool turn and the user routes it through Gemini
(direct API or via ollama-cloud proxy).

Fix: add a `_get_tool_call_name_static()` helper alongside the existing
`_get_tool_call_id_static()`, and populate `name` at every site that
constructs a `role: tool` message — the pre-call sanitizer stub, the
tool-call args repair marker, both interrupt-skip paths, both
result-append paths (parallel + sequential), the invalid-tool-name
recovery, the invalid-JSON-args recovery, and the exception fallback.

Each call site was already in scope of the function name (`function_name`,
`skipped_name`, `name`, or a dict tool_call), so the change is local —
no new lookups, no behavior change for providers that already worked.

Fixes #16478
2026-05-04 05:06:33 -07:00
OpenClaw Bot 0443484115 fix(qqbot): honor proxy env vars for websocket 2026-05-04 05:06:09 -07:00
陈运波0668001438 6cf7a9e330 fix(vision): preserve explicit provider auth with custom base_url
Keep the configured vision provider when base_url is overridden so credential-pool lookup still resolves provider-specific API keys (e.g. ZAI_API_KEY), and add a regression test for this path.
2026-05-04 05:05:43 -07:00
swithek b7bbc62503 fix(compressor): _prune_old_tool_results boundary direction 2026-05-04 05:05:18 -07:00
Dejie Guo d29f90e89d fix(error_classifier): avoid large-context false overflow heuristics
Generic 400 and server-disconnect heuristics used absolute token/message-count fallbacks that are too aggressive for 1M context sessions. Gate those absolute fallbacks to smaller context windows while preserving relative pressure checks.

Fixes #16351
2026-05-04 05:04:56 -07:00
giwaov 026a5e47df fix(cli): preserve Windows hidden-dir paths in markdown 2026-05-04 05:04:36 -07:00
Teknium 3fb35520c6 revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718) (#19721)
Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.
2026-05-04 05:04:01 -07:00
Teknium 25b7b0f8e6 chore(release): AUTHOR_MAP entries for Tier 1f salvage batch 2026-05-04 05:03:10 -07:00
Teknium ff3d2773e2 feat(kanban): auto-subscribe gateway chat on tool-driven kanban_create (#19718)
Closes #19479.

When an orchestrator agent calls kanban_create from a gateway session
(e.g. a Telegram user delegating to an orchestrator profile), auto-
subscribe the originating (platform, chat, thread, user) to the new
task's terminal events. Mirrors the behavior of the /kanban create
slash command in gateway/run.py so tool-driven creation is at parity
with human-driven creation.

Without this, a user who interacts with an orchestrator exclusively
via the gateway never receives blocked / completed / gave_up
notifications for tasks the orchestrator created on their behalf —
silently breaking the gateway-first multi-agent flow the reporter
describes.

Reads the context-local HERMES_SESSION_* vars via get_session_env()
(not os.environ — those are contextvars for asyncio concurrency
safety). Falls through cleanly in CLI / cron contexts with no
session active (subscribed=False in the response). Best-effort: if
the gateway module isn't importable (test rigs stubbing gateway.*),
the task still creates, we just skip the subscription.

Response gains a 'subscribed' bool so the orchestrator knows whether
terminal events will land back in the originating chat or whether it
needs to poll / unblock manually.

Tests: 4 new in tests/tools/test_kanban_tools.py covering
CLI/no-subscribe, telegram/gateway-auto-subscribe, discord-DM/no-
thread subscribe, and partial-ctx/no-chat_id no-subscribe. 40/40
kanban tool tests pass.
2026-05-04 05:02:23 -07:00
Nikolay Gusev fdf9343c51 fix(tools): wrap bare scalars in single-element list for array-typed args
Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from #19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
2026-05-04 05:00:37 -07:00
ms-alan 6f864f8f94 fix(redact): add code_file param to skip false-positive ENV/JSON patterns
ENV-assignment and JSON-field regex patterns in redact_sensitive_text()
cause false positives when reading source code files:
- MAX_TOKENS=*** triggers the ENV assignment pattern
- "apiKey": "test" in test fixtures triggers the JSON field pattern

Add code_file=False parameter. When code_file=True, skip only the
ENV-assignment and JSON-field regex passes; all other patterns (prefixes,
auth headers, private keys, DB connstrings, JWTs, URL secrets) are
still applied.

Update file_tools.py (read_file and search_files) to pass code_file=True
so agent code analysis is not polluted by false-positive redactions.

Closes #15934
2026-05-04 04:56:28 -07:00
Teknium a175f39577 feat(nous): persist Nous OAuth across profiles via shared token store (#19712)
Mirrors the Codex auto-import UX. On successful Nous login (either
`hermes auth add nous --type oauth` or `hermes login nous`), tokens are
mirrored to `$HERMES_SHARED_AUTH_DIR/nous_auth.json` (default
`~/.hermes/shared/nous_auth.json`, outside any named profile's
HERMES_HOME). On next login in a new profile, the flow offers to import
those credentials ("Import these credentials? [Y/n]") and rehydrates via
a forced refresh+mint instead of running the full device-code flow.

Runtime refresh in any profile syncs the rotated refresh_token back to
the shared store so sibling profiles don't hit stale-token fallback
after rotation.

The volatile 24h agent_key is NOT persisted to the shared store —
only the long-lived OAuth tokens are cross-profile useful.

- `HERMES_SHARED_AUTH_DIR` env var for tests + custom layouts
- Pytest seat belt mirrors the existing `_auth_file_path` guard so
  forgetting to redirect the store in a test fails loudly
- File mode 0600 where platform supports it
- Runtime credential resolution is unchanged — shared store is only
  consulted during the login flow, so profile isolation at runtime is
  preserved
- Stale refresh_token + portal-down cases gracefully fall back to
  device-code

Addresses a user report from Mike Nguyen: running
`hermes --profile <name> auth add nous --type oauth` for every new
profile is unnecessary friction now that Codex has a shared-import
flow via `~/.codex/auth.json`.
2026-05-04 04:54:55 -07:00
QifengKuang 69fc6d9c1e fix(telegram): fall back to document on any send_photo failure, not just dim errors
Broadens the existing fallback (previously only fired for
Photo_invalid_dimensions) to cover every send_photo exception class:
rate limits, corrupt file markers, format edge cases. The expected
dimension case still logs at INFO (document is the right path); all
other cases log at WARNING with exc_info so they're visible in logs.

If send_document itself fails, we still fall back to the base adapter's
text-only 'Image: /path' rendering as a last resort.

Salvage of #15837 — original PR author QifengKuang proposed the broader
try/except-style fallback. Adapted to keep the existing INFO-vs-WARNING
log split for dimension errors (the expected case).

Co-authored-by: QifengKuang <k2767567815@gmail.com>
2026-05-04 04:54:54 -07:00
Teknium d3b22b76d8 fix(kanban): enforce worker task-ownership on destructive tool calls (#19713)
Closes #19534 (security).

A worker spawned by the kanban dispatcher has HERMES_KANBAN_TASK set
to its own task id. The destructive tools (kanban_complete,
kanban_block, kanban_heartbeat) resolved task_id via
_default_task_id() which preferred an explicit arg over the env var,
with no ownership check — so a buggy or prompt-injected worker could
complete / block / heartbeat any OTHER task (sibling, cross-tenant,
anything) by supplying its id. Reporter's repro: worker for t_A
passed task_id=t_B to kanban_complete and got {"ok": true}.

Fix: add _enforce_worker_task_ownership(tid). If HERMES_KANBAN_TASK
is set and tid doesn't match, return a structured tool error with
guidance to use kanban_comment (for information handoff across tasks)
or kanban_create (for follow-up work). Orchestrator profiles (no env
var, but kanban toolset enabled per #18968) are exempt — their job
is routing and sometimes includes closing out child tasks.

Kept unrestricted (deliberately):
- kanban_show — workers legitimately read parent/sibling handoff context
- kanban_comment — cross-task comments are the handoff mechanism
- kanban_create — orchestrator fan-out, worker follow-up spawning
- kanban_link — parent/child linking

Tests: 5 new regression tests in tests/tools/test_kanban_tools.py
covering the grid (worker-attacks-foreign ×3 tools, worker-own-task
preserved, orchestrator-unrestricted). 36/36 pass.
2026-05-04 04:54:02 -07:00
Teknium 1bd5ac7f2f fix(self-improvement-loop): bump background-review budget to 16 and suppress status leaks (#19710)
The background memory/skill review fork had two user-visible issues:

1. max_iterations=8 was too tight for multi-step reviews. A review that
   needs to skill_view one or two candidate skills, add a memory entry,
   and patch a skill routinely blew the budget — surfacing an 'Iteration
   budget exhausted (8/8)' warning to the user and leaving the review
   half-finished.

2. Mid-review lifecycle messages leaked into the user's terminal past the
   existing quiet_mode + redirect_stdout/stderr guards. _emit_status and
   _emit_warning route through _vprint(force=True) -> _print_fn /
   status_callback, which bypass sys.stdout entirely. The stdout redirect
   only catches raw print() calls.

Changes:
- Bump the review fork's max_iterations from 8 to 16.
- Set review_agent.suppress_status_output = True on the fork. This
  short-circuits _vprint unconditionally so _emit_status/_emit_warning
  emissions (iteration-budget warnings, rate-limit retries, compression
  messages) never reach the user. The only user-visible output remains
  the compact final summary line ('💾 Self-improvement review: ...')
  which is printed via self._safe_print on the *main* agent (outside
  the fork's redirect/suppress scope).

Summarizer filter is already correct — _summarize_background_review_actions
only surfaces tool calls with data.get('success') is truthy, so failed
attempts and reasoning text never reach the summary line.
2026-05-04 04:53:44 -07:00
Kathy a79b0ec461 fix: keep Feishu topic replies from falling back to new threads (local patch)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-04 04:53:28 -07:00
cong 3ccf723bf9 fix(gateway): read context_length from custom_providers in session info header 2026-05-04 04:51:13 -07:00
h0tp-ftw 8c8f95bc8e fix(gateway): show friendly error when service is not installed
Instead of an unhelpful CalledProcessError traceback when running
`hermes gateway start/stop/restart` without first installing the service,
check for the unit file and exit with an actionable install hint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 04:49:51 -07:00
Teknium c5789f4309 feat(achievements): share card render on unlocked badges (#19657)
* feat(achievements): share card render on unlocked badges

Adds a Share button to each unlocked achievement card that opens a
modal and renders a 1200x630 PNG share card client-side via Canvas2D
(no backend, no network, no new deps). Two actions: Download PNG and
Copy image to clipboard.

Card layout mirrors the in-dashboard visual language: tier-colored
glow, icon from the existing LUCIDE sprite set, achievement name,
tier badge pill, description, progress stat line, and a Hermes Agent
watermark. Sized for X/Twitter, Discord, LinkedIn, Bluesky link
previews.

Vendored on top of the upstream @PCinkusz bundle; the 'in-progress
scan banner' precedent already established this divergence pattern.
Manifest bumped 0.3.1 -> 0.4.0.

* feat(achievements): share-on-X as primary action on share dialog

Adds a 'Share on X' button as the primary action in the share dialog.
Opens https://x.com/intent/post with a pre-filled tweet referencing
the achievement name, tier, @NousResearch, and the Hermes docs URL.
Copy image and Download PNG become secondary actions: users who want
the badge attached can Copy image, paste into the X composer, post.

Primary button styled as X's signature black-on-white fill so the
action is unambiguous.
2026-05-04 04:47:53 -07:00
ygd58 297eaa3533 fix(api_server): emit run.failed when run_conversation returns failed=True
When run_conversation encounters a non-retryable client error (401, 400,
etc.), it returns a dict with failed=True instead of raising. The gateway's
_run_and_close only branched on exceptions, so it always emitted run.completed
even for failed runs — clients could not distinguish success from failure.

Inspect the result dict before emitting: if failed=True, emit run.failed
with the error message; otherwise emit run.completed as before. The existing
except Exception path is unchanged for genuine programming errors.

Fixes #15561
2026-05-04 04:47:36 -07:00
Teknium b2b479b40e docs(kanban): backfill multi-board refs in reference docs (#19704)
Followup to #19653. The feature PR updated the Kanban user guide but
missed four other pages that document the same surface. Caught when
Teknium asked 'did you add docs to the guide and any other kanban
related docs around this?'.

- reference/cli-commands.md: rewrite the `hermes kanban` section to
  document the `--board <slug>` global flag, the `boards`
  subcommand group (list/create/switch/show/rename/rm), board
  resolution order, and worked examples. Also fills in the
  `create` / `complete` flag lists that had drifted from the
  current CLI (`--summary`, `--metadata`, `--triage`,
  `--idempotency-key`, `--max-runtime`, `--skill`).
- reference/environment-variables.md: add `HERMES_KANBAN_BOARD`
  row, update `HERMES_KANBAN_DB` precedence note.
- reference/slash-commands.md: add `/kanban boards ...` and
  `/kanban --board <slug> ...` to the two `/kanban` rows (CLI
  table + gateway table).
- features/kanban-tutorial.md: the walkthrough uses the `default`
  board, so just a note pointing readers at the overview's Boards
  section if they want multiple queues, plus the corrected per-board
  DB path.

Skill docs (devops-kanban-orchestrator, -worker) intentionally not
updated: those are agent-facing lifecycle playbooks and boards are
transparent to workers (HERMES_KANBAN_BOARD env var pins the DB
automatically), so there's nothing new for a worker to know.
2026-05-04 04:47:19 -07:00
Teknium a8b689f0c2 test(kanban): regression for status=running rejection at dashboard PATCH
Reporter of #19535 explicitly asked for a regression test — covers it
here so a future refactor of _set_status_direct can't silently re-enable
the direct ready/todo -> running bypass.

Asserts both: (a) HTTP 400 with 'running' in the detail message, and
(b) the task's status is unchanged after the rejected PATCH (pre-request
status preserved, no partial mutation).
2026-05-04 04:46:47 -07:00
luyao618 6b3efcee49 fix(kanban): reject direct status transition to 'running' via dashboard API
The PATCH /tasks/:id endpoint allows setting status='running' via
_set_status_direct(), bypassing the dispatcher/claim path that creates
run rows, claim locks, expiry, and worker process metadata. This can
leave tasks stuck in 'running' with no active worker.

Fix: reject status='running' with HTTP 400, requiring all transitions
to 'running' to go through the canonical claim_task() path.

Closes #19535
2026-05-04 04:46:47 -07:00
vominh1919 652f8e6f3e fix(test): correct _coerce_number inf/nan test assertions
The test 'test_inf_stays_string_for_integer_only' incorrectly asserted
that _coerce_number('inf') returns float('inf'), but the function
correctly returns the original string 'inf' because infinity is not
JSON-serializable.

Fixed the assertion to expect the string 'inf', and added two new tests
for negative infinity and NaN edge cases to improve coverage of the
non-JSON-serializable number guard in _coerce_number().
2026-05-04 04:45:55 -07:00
Yoimex edf9c75621 fix(env): pass -- to cd for hyphen-prefixed workdirs 2026-05-04 04:45:03 -07:00
Teknium ae40fca955 fix(profiles): keep validate_profile_name strict; callers normalize first
Follow-up to @changchun989's cherry-pick: reverts the validate-via-
normalize change so validate_profile_name remains a strict regex check
on the input AS-GIVEN. Callers that accept mixed-case user input
(dashboard UI, CLI args, import flows) call normalize_profile_name()
first, then validate the result. This keeps validate honest about
what the on-disk directory name must look like — e.g. '  jules '
(trailing whitespace) is now rejected instead of silently trimmed
and accepted.

- validate_profile_name: strict lowercase/regex check again, 'UPPER'
  back in the invalid-names parametrize
- 8 call sites in profiles.py (create_profile, delete_profile,
  set_active_profile, export_profile, import_profile, rename_profile,
  resolve_profile_env, plus the clone_from branch): swap the
  normalize-then-validate order
- scripts/release.py: add changchun989@proton.me -> changchun989 to
  AUTHOR_MAP so CI doesn't block on the unmapped contributor email

All kanban + profile tests pass (268 across test_profiles.py +
test_kanban_db.py + test_kanban_core_functionality.py, plus 73 in
test_kanban_tools.py + test_kanban_dashboard_plugin.py).

Closes #18498.
2026-05-04 04:44:37 -07:00
changchun989 a31477dabb fix(profiles): normalize profile IDs for Kanban assignees and lookups
- Add normalize_profile_name() for lowercase canonical IDs and Default alias
- Use canonical names in create/delete/rename/export/import/set_active paths
- Canonicalize Kanban assignee on create/assign, list filter, and worker spawn
- Tests for mixed-case assignees and profile resolution (fixes #18498)
2026-05-04 04:44:37 -07:00
Yuyang Xu 60c4bc96fd fix(security): restore .env/auth.json/state.db with 0600 perms
`hermes import` was creating secret files with the process umask
(typically 0644) instead of 0600. zipfile.open() does not honor the
Unix mode bits stored in zip member external_attr; the restore loop
used open(target, "wb") which always falls back to umask.

Threat: silent privilege downgrade after a routine restore on
multi-user systems (shared dev boxes, CI runners, jump hosts) — any
local user could read API keys and OAuth tokens from ~/.hermes/.

Fix mirrors the convention already used at file creation
(hermes_cli/auth.py: stat.S_IRUSR | stat.S_IWUSR for auth.json).
The quick-snapshot restore path (restore_quick_snapshot) is
unaffected — it uses shutil.copy2 which preserves perms via
copystat().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 04:43:53 -07:00
MichaelWDanko da8654bb41 fix(dashboard): show custom theme palette swatches 2026-05-04 04:43:27 -07:00
Cameron Aragon 239ea1bdea fix(image-gen): preserve xAI API error status 2026-05-04 04:43:07 -07:00
atongrun 75b4a34670 fix(cli): check updates against upstream/main for fork users 2026-05-04 04:42:44 -07:00
Teknium 5ec6baa400 feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.

Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
  its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
  keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
  installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
  their env alongside the existing `HERMES_KANBAN_DB` /
  `HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
  other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
  per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
  own SQLite DB, same WAL+IMMEDIATE machinery as before).

CLI
---
  hermes kanban boards list|create|switch|show|rename|rm
  hermes kanban --board <slug> <any-subcommand>

Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.

Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.

Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
  with all boards + task counts, `+ New board` button, `Archive`
  button (non-default only). Hidden entirely when only `default` exists
  and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
  + "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
  shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
  new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
  `PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
  `POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
  opens a fresh WS against the new board.

Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.

Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.

Regression surface: all 264 pre-existing kanban tests continue to pass.

Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
vominh1919 135b4c8b35 fix(mcp): decouple AnyUrl import from mcp dependency
AnyUrl was imported inside the same try block as mcp.client.auth, so
when the mcp package was not installed, AnyUrl was undefined and
_build_client_metadata raised NameError at runtime.

Moved the AnyUrl import to its own try/except block so it's available
whenever pydantic is installed (which is a core dependency), regardless
of whether the mcp SDK is present.

Also added pytest.importorskip('mcp') to the three
test_build_client_metadata tests that exercise _build_client_metadata,
since that function depends on OAuthClientMetadata from the mcp package.
2026-05-04 04:42:18 -07:00
vominh1919 0d563621fb fix(test): skip bedrock adapter tests when botocore is not installed
Six tests in test_bedrock_adapter.py import botocore.exceptions
directly (ConnectionClosedError, EndpointConnectionError,
ReadTimeoutError, ClientError) without guarding the import. When
botocore is not installed (it's an optional dependency), these tests
fail with ModuleNotFoundError instead of being gracefully skipped.

Added pytest.importorskip('botocore') to each affected test function,
following the same pattern used elsewhere in the test suite (e.g.
test_voice_mode.py for numpy, test_mcp_oauth.py for mcp).

Tests affected:
- TestIsStaleConnectionError: 3 tests
- TestCallConverseInvalidatesOnStaleError: 3 tests

Before: 6 FAIL with ModuleNotFoundError
After:  6 SKIP with reason message
2026-05-04 04:41:55 -07:00
vominh1919 d1d2d43387 fix(test): add skip marker for transcription tests requiring faster_whisper
TestTranscribeLocalExtended patches faster_whisper.WhisperModel, which
triggers an ImportError when the faster_whisper package is not installed.
Added a pytest.mark.skipif marker using importlib.util.find_spec so
these tests are gracefully skipped instead of failing with
ModuleNotFoundError.
2026-05-04 04:41:36 -07:00
Teknium 844d4a32ce chore(release): AUTHOR_MAP entries for Tier 1e salvage batch 2026-05-04 04:40:34 -07:00
Teknium 110387d149 docs(open-webui): fill gaps in quick setup — verify curls, ollama flag, restart note (#19654)
Reported by @neopabo — the Open WebUI page was missing several steps users
hit in practice:

- Use hermes config set instead of hand-editing .env (matches current UX)
- Restart-gateway note after enabling API_SERVER_ENABLED
- curl /health + /v1/models verification step before jumping to Docker
- ENABLE_OLLAMA_API=false in both docker run and compose snippets to
  suppress the empty Ollama backend that otherwise clutters the picker
- 15-30s startup wait note for first-run embedding model download
- Troubleshooting entry for the empty-Ollama-shadowing case
- /v1/models troubleshoot command now includes the Authorization header
2026-05-04 04:36:18 -07:00
Siddharth Balyan af6f9bc2a1 fix: refresh systemd unit on gateway boot (not just start/restart) (#19684)
The resilient restart settings from PR #18639 only took effect when
the gateway was started via `hermes gateway start` or `hermes gateway
restart` — both of which call refresh_systemd_unit_if_needed() which
writes the new unit and runs daemon-reload.

However, when the gateway self-restarts via exit-code-75 (stale-code
detection after `hermes update`, or the /restart command), systemd
respawns the process directly without going through any CLI function.
The unit file on disk stays stale, and systemd keeps using the old
cached settings (StartLimitBurst=5, RestartSec=30) until someone
manually runs `hermes gateway restart`.

This meant that after PR #18639 was deployed, users who never ran
`hermes gateway restart` manually were still vulnerable to the
permanent-death-on-network-outage bug.

Fix: call refresh_systemd_unit_if_needed() at the top of run_gateway()
(the foreground entry point that systemd's ExecStart invokes). This
ensures that on every boot — whether triggered by systemd restart,
exit-75 respawn, or manual foreground run — the unit definition and
daemon state are current. The call is best-effort (exceptions caught)
and a no-op when the unit is already current (one stat + string compare).
2026-05-04 16:27:51 +05:30
Teknium 33f554d83c feat(kanban-dashboard): workspace kind + path inputs in inline create form (#19679)
Closes #18718. Exposes the existing `workspace_kind` + `workspace_path`
fields (already accepted by POST /api/plugins/kanban/tasks) in the
dashboard's per-column inline-create form so users can create tasks
targeting a git worktree or an explicit directory without dropping
back to the CLI.

- Add a workspace-kind Select (scratch / worktree / dir) to
  InlineCreate in plugins/kanban/dashboard/dist/index.js.
- Conditionally render a workspace_path Input next to the select when
  kind != scratch; placeholder tells the user whether the path is
  required (dir) or optional (worktree — derived from assignee when
  blank).
- Submit wires `workspace_kind` / `workspace_path` into the POST body
  only when they're non-default, keeping the request shape small and
  interoperable with older dispatcher versions.

E2E verified in a dashboard pointed at the worktree: selecting dir +
typing /tmp/test-18718 produces a POST body with
{workspace_kind: 'dir', workspace_path: '/tmp/test-18718'} and the
task lands in sqlite with those fields set. 42/42 kanban dashboard
plugin tests pass.
2026-05-04 03:40:39 -07:00
Grey0202 a219a0a4df fix(anthropic): strip top-level oneOf/allOf/anyOf from tool input_schema
Extends the existing _normalize_tool_input_schema to also drop top-level
union keywords that Anthropic's tool schema validator rejects with HTTP 400.

Several upstream and plugin tools ship schemas with a top-level oneOf/
allOf/anyOf (common for Pydantic discriminated unions). The existing
strip_nullable_unions pass only handles anyOf-with-null patterns; a
non-null top-level union keyword sails through and hits the API.

Salvage of #16471 — approach folded into the existing normalize helper
rather than introducing a parallel _sanitize_input_schema function, to
avoid two schema-munging code paths running against the same input.

Co-authored-by: Grey0202 <grey0202@users.noreply.github.com>
2026-05-04 03:17:35 -07:00
charliekerfoot 412f2389f1 fix(google_oauth): close TOCTOU window when saving credentials 2026-05-04 03:16:19 -07:00
Ioodu e50809b771 fix(file-tools): cap read_file result size to prevent context window overflow
Set max_result_size_chars=100_000 on the read_file registry entry (was
float('inf')), closing the Layer 2 defense-in-depth gap in
tool_result_storage.py. The existing Layer 1 guard inside
_handle_read_file already returns a JSON error for oversized reads;
this aligns the registry cap with every other tool.

Update test_read_file_never_persisted → test_read_file_result_size_cap
to assert 100_000, and add test_read_file_registry_cap_is_100k as an
explicit regression guard against re-introducing float('inf').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:14:59 -07:00
Teknium 5b6d413476 fix(cli,gateway): surface title errors from /new <name>
The contributor's PR silently swallowed ValueError from
SessionDB.set_session_title() with bare except Exception: pass.
Users typing /new <title> with an already-in-use title got an
untitled session and no feedback.

Changes:
- cli.py: catch ValueError from both sanitize_title() and
  set_session_title(); print the error and mark the session
  untitled in the banner (never echo the rejected title back).
- gateway/run.py: append a warning note to the reset reply on
  title rejection; reflect the accepted title in the header.
- Add regression tests for the duplicate-title path in CLI and
  gateway.

Also map exx@example.com -> @exxmen in scripts/release.py.
2026-05-04 03:14:50 -07:00
Exx f720751d79 feat(cli,gateway): /new accepts optional session name argument
Allow users to start a fresh session and immediately set its title by
passing a name to /new (or /reset):

    /new Refactor auth module

Changes:
- hermes_cli/commands.py: add args_hint='[name]' to /new command
- cli.py: parse title argument in process_command(), pass to new_session()
- cli.py: new_session() accepts title=None, sets title via SessionDB
- gateway/run.py: _handle_reset_command() parses title, sets on new entry
- gateway/session.py: reset_session() accepts optional display_name
- tests: add test_new_session_with_title, test_reset_command_with_title,
  test_new_command_in_help_output

All 36 affected tests pass.
2026-05-04 03:14:50 -07:00
ms-alan 055fde40e0 fix(doctor): check global agent-browser when local install not found
When agent-browser is globally installed via 'npm install -g agent-browser'
but not present in the local node_modules, doctor falsely warns that it's
not installed. Add shutil.which('agent-browser') as a fallback check after
the local path check.

Closes #15951
2026-05-04 03:13:22 -07:00
xyiy001 e69d11d30c fix(browser): allow CDP override to pass requirement checks
Treat explicit CDP override mode as a valid browser backend even when agent-browser is absent, and add a regression test to prevent false-negative availability gating.
2026-05-04 03:12:30 -07:00
kshitijk4poor 46072425fe fix(model-picker): exclude providers with empty credential pool entries
The auth check in list_authenticated_providers used mere key presence in
credential_pool to conclude a provider is authenticated.  An empty entry
(pool_store key with no actual credentials) caused providers like
ollama-cloud to appear as authenticated in the model picker even when no
OLLAMA_API_KEY was set.

The user's picker then offered nemotron-3-super under Ollama Cloud;
selecting it routed every subsequent turn to https://ollama.com/v1, which
rejected the requests with HTTP 400.

Fix: drop the pool_store key-existence check from both section 2
(HERMES_OVERLAYS) and section 2b (CANONICAL_PROVIDERS).  The following
load_pool().has_credentials() call already handles the legitimate pooled-
credential case; checking for an empty key just ahead of it was redundant
and actively harmful.
2026-05-04 03:12:12 -07:00
briandevans c8ecb56f27 fix(cli): reject invalid argv values from -p/--profile before resolving
`_apply_profile_override()` scans `sys.argv` for `-p / --profile` at
module import time. When `hermes_cli.main` is imported inside pytest
with `-p no:xdist` on the command line, it picks up `'no:xdist'` as a
profile name candidate, then passes it to `resolve_profile_env()` which
raises `ValueError` (invalid format), and the function calls
`sys.exit(1)` — aborting test collection with an INTERNALERROR before
any test runs.

The same conflict affects any tool or wrapper that uses `-p` for its
own flag and then imports `hermes_cli.main`.

Fix: add a format guard immediately after step 1 (explicit flag scan).
If `consume == 2` (the value came from `-p <value>`, not
`--profile=value`) and the candidate doesn't match the canonical
profile-name pattern `[a-z0-9][a-z0-9_-]{0,63}` (mirrored from
`hermes_cli.profiles._PROFILE_ID_RE`), discard it and continue as if
no `-p` flag was found. The `active_profile` file-based fallback
(step 2) only reads a file written by hermes itself, so it always
produces valid names and needs no guard.

Regression guard: with the guard reverted, importing
`hermes_cli.main` with `sys.argv = ['pytest', '-p', 'no:xdist', ...]`
raises `SystemExit(1)`. With the guard in place, the import succeeds
and `sys.argv` is left intact for pytest. Legitimate `-p coder` still
flows through to `resolve_profile_env()` unchanged.

Rebased onto current `origin/main` (`e5dad4ac5`) — the prior branch
base (`4fade39c9`) was 824 commits behind and the PR was DIRTY /
CONFLICTING. The 1.5 HERMES_HOME-set early-return block has since
landed between the original insertion point and step 2; the new guard
is positioned correctly before the early return so a bogus `-p` value
no longer prevents the early return from kicking in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:11:47 -07:00
ChanlerDev e3461e0b2a fix(cli): remove dead 'q' check from quit command resolution
The 'q' alias is defined for 'queue' command in commands.py:93.
The hardcoded 'q' in cli.py:5910 was dead code - resolve_command('q')
returns the queue CommandDef, so canonical would never be 'q'.

Removes the misleading check without changing any behavior:
- /quit and /exit still exit (defined aliases)
- /q still maps to queue (as intended)
2026-05-04 03:11:30 -07:00
YAMAGUCHI Seiji cba86b7303 fix(cronjob): treat bare 'custom' provider as unspecified in override
`_resolve_model_override` treated any non-empty `provider` string from
the LLM as user-specified and skipped the pin-to-current-provider
fallback. When the LLM wrote bare `'custom'` (instead of the canonical
`'custom:<name>'` referring to a custom_providers entry), the value
serialized into jobs.json as `"provider": "custom"` and the scheduler
could never resolve a provider from it — the cron job failed silently
at run time.

Treat bare `'custom'` as "no provider supplied" so the current main
provider gets pinned instead, matching behaviour for the omitted case.

Defence-in-depth complement to a schema-description fix (#15477) that
discourages the LLM from emitting bare `'custom'` in the first place.
2026-05-04 03:11:11 -07:00
pander 6b88f46c54 fix(compressor): trigger fallback on timeout errors alongside model-not-found
Previously only HTTP 404/503 and specific error strings triggered a fallback
to the main model when the summary model was unavailable. Timeout errors
(HTTP 408/429/502/504, or error strings containing 'timeout') entered a
short cooldown instead, leaving context to grow unbounded for the rest of
the session.

Add _is_timeout detection alongside _is_model_not_found so that transient
timeout errors on the summary model also trigger immediate fallback to the
main model, preventing compression failure from cascading.

Closes #15935
2026-05-04 03:10:53 -07:00
DaniuXie a45bd28598 fix(wecom): set SUPPORTS_MESSAGE_EDITING=False to prevent broken streaming 2026-05-04 03:10:36 -07:00
zng8418 d2ea959fe9 fix(doctor): skip /models health check for MiniMax CN (returns 404)
MiniMax China (api.minimaxi.com) does not expose a /v1/models endpoint.
The doctor command was probing it and reporting HTTP 404 as a warning,
even though the API works correctly for chat completions.

Set supports_health_check=False for MiniMax CN so doctor shows
"(key configured)" instead of the false 404 warning.

Refs #12768, #13757
2026-05-04 03:10:17 -07:00
ideathinklab01-source d17eff29d5 fix(delegate): guard _load_config() against delegation: null in config.yaml
YAML parses `delegation: null` as Python None. `dict.get(key, {})`
only uses the default when the key is *missing*, not when it exists with
a None value, so `cfg.get("max_concurrent_children")` crashes with
`'NoneType' object has no attribute 'get'`.

Same pattern as fd9b692d (fix(tui): tolerate null top-level sections).
Use `dict.get(key) or {}` to handle both missing and None-valued keys.

Closes: delegation null config crash (same class as #7215, #7346)
2026-05-04 03:09:59 -07:00
ygd58 2d3d1d9736 fix(tui): use --outdir instead of --outfile in hermes-ink build script
esbuild raises 'Must use outdir when there are multiple input files'
on Android/Termux ARM64 with esbuild >=0.25. The build script used
--outfile=dist/ink-bundle.js which is only valid for a single entry
point with no code splitting. Switching to --outdir=dist fixes the
error and names the output file dist/entry-exports.js (matching the
input file name). Update index.js to import from the new path.

Fixes #16072
2026-05-04 03:09:41 -07:00
LLing486 145a38a875 fix(agent): preserve dots in model names for Xiaomi MiMo provider
Add 'xiaomi' to the _anthropic_preserve_dots() provider whitelist and
'xiaomimimo.com' to the URL-based fallback check. Without this,
normalize_model_name() converts mimo-v2.5 to mimo-v2-5, which the
Xiaomi API rejects with HTTP 400.

Fixes #16156
2026-05-04 03:09:24 -07:00
YAMAGUCHI Seiji 0896944382 fix(cronjob): advertise 'custom:<name>' provider format in tool schema
The `provider` field in CRONJOB_SCHEMA only showed examples like
'openrouter' and 'anthropic', with no mention of the canonical
'custom:<name>' form required for custom_providers entries. When the
user has custom providers configured, LLMs tend to write the bare type
name ('custom') because the schema does not advertise the ':<name>'
suffix. The bare value then serializes into jobs.json and causes the
cron job to fail silently at run time — `_resolve_model_override`
treats it as a user-specified provider and skips the pin-to-current
fallback, but no provider ever resolves from the bare 'custom' string.

Clarifying the schema so the canonical form is discoverable addresses
the root cause at the tool-definition boundary.
2026-05-04 03:09:07 -07:00
jjjojoj 9c64d09610 fix(status): show NVIDIA NIM api key status
hermes status was missing NVIDIA API key from its API keys display.
Now shows NVIDIA NIM ✓/✗ with key hash like other providers.

Fixes #16082
2026-05-04 03:08:50 -07:00
Teknium 64b39d835e chore(release): AUTHOR_MAP entries for Tier 1d salvage batch 2026-05-04 03:07:30 -07:00
taeng0204 20a06c586f fix(dashboard): render null instead of flashing spinner during plugin load 2026-05-04 03:06:45 -07:00
taeng0204 06a6d6967a fix(dashboard): defer unknown-route redirect while dashboard plugins load 2026-05-04 03:06:45 -07:00
Teknium 986ec04048 docs: document /kanban slash command (#19584)
* docs: document /kanban slash command

The kanban user guide and slash-commands reference only mentioned the
/kanban slash command in passing. Add a proper section covering:

- CLI and gateway both expose the full hermes kanban surface via
  hermes_cli.kanban.run_slash (identical argument surface)
- Mid-run usage: /kanban bypasses the running-agent guard, so reads
  and writes land immediately while an agent is still in a turn
- Auto-subscribe on /kanban create from the gateway — originating
  chat is subscribed to terminal events, with a worked example
- Output truncation (~3800 chars) in messaging
- Autocomplete hint list vs full subcommand surface

Also adds /kanban rows to both slash-command tables (CLI + messaging)
in reference/slash-commands.md and moves it into the 'works in both'
notes bucket.

* docs(kanban): frame the model's tool surface as primary, CLI as the human surface

The kanban user guide and CLI reference read as if you drive the board
by running `hermes kanban` commands everywhere. In practice:

- **You** (human, scripts, cron, dashboard) use the `hermes kanban …`
  CLI, the `/kanban …` slash command, or the REST/dashboard.
- **Workers** spawned by the dispatcher use a dedicated `kanban_*`
  toolset (`kanban_show`, `kanban_complete`, `kanban_block`,
  `kanban_heartbeat`, `kanban_comment`, `kanban_create`,
  `kanban_link`) and never shell out to the CLI.

Changes to `user-guide/features/kanban.md`:

- New 'Two surfaces' intro distinguishes the two front doors up front.
- Quick-start section re-labelled so each step says who is running it
  (you vs. orchestrator vs. worker).
- 'How workers interact with the board' rewritten:
  - Lead with "Workers do not shell out to `hermes kanban`."
  - Tool table extended with required params.
  - Concrete worker-turn example (`kanban_show` → `kanban_heartbeat`
    → `kanban_complete`) and an orchestrator fan-out example
    (`kanban_create` x N with `parents=[...]`).
  - Moved 'Why tools not CLI' from a defensive aside to a clean
    follow-up section.
- 'Worker skill' section explicitly says the lifecycle is taught
  in tool calls, not CLI commands.
- 'Pinning extra skills' reordered — orchestrator tool form first
  (the usual case), human/CLI second, dashboard third.
- 'Orchestrator skill' now shows a canonical `kanban_create` /
  `kanban_link` / `kanban_complete` tool-call sequence instead of
  only describing what the skill teaches.
- CLI-command-reference heading now clarifies this is the human
  surface, with a cross-link to the tool-surface section.
- 'Runs — one row per attempt' structured-handoff example replaced:
  the primary example is now `kanban_complete(summary=..., metadata=...)`
  (what a worker actually does), with the CLI form retained as
  "when you, the human, need to close a task a worker can't."

Changes to `reference/cli-commands.md`:

- `hermes kanban` intro marks itself as the human / scripting surface
  and links out to the worker tool surface.
- Corrected `comment <id>` description — the next worker reads it via
  `kanban_show()`, not by running `hermes kanban show`.

* docs(kanban-tutorial): reframe worker actions as tool calls

Honest answer to Teknium's follow-up: no, the first pass missed the
tutorial. The four stories all showed `hermes kanban claim /
complete / block / unblock` as if the backend-dev, pm, and reviewer
personas were humans running CLI commands. In a real hermes kanban
run those agents are dispatcher-spawned workers driving the board
through the `kanban_*` tool surface.

Changes:

- Setup intro now distinguishes the three surfaces up front
  (dashboard / CLI for you, `kanban_*` tools for workers) and
  establishes the convention: `bash` blocks are commands *you* run,
  `# worker tool calls` blocks are what the agent emits.
- Story 1 (solo dev schema): 'Claim the schema task, do the work,
  hand off' block replaced with the dispatcher spawning the
  backend-dev worker and a `kanban_show → kanban_heartbeat →
  kanban_complete` tool-call sequence. The 'On the CLI' `hermes
  kanban show / runs` block re-labelled as 'you peeking at the board'
  to keep it correct as a human inspection step.
- Story 2 (fleet farming): note about structured handoff updated
  from `--summary` / `--metadata` CLI flags to
  `kanban_complete(summary=..., metadata=...)` tool form.
- Story 3 (role pipeline): the big PM/engineer/reviewer block fully
  rewritten as three worker tool-call sequences — PM worker
  completes spec, engineer worker blocks, human/reviewer
  `hermes kanban unblock` (or `/kanban unblock`), engineer worker
  respawns and completes. The respawn-as-new-run mechanic is now
  explicit.
- Reviewer paragraph: `build_worker_context` replaced with
  `kanban_show()` — that's the tool that delivers the parent
  handoff to the model.
- Structured handoff section heading and body updated:
  `--summary`/`--metadata` → `summary`/`metadata` (tool params),
  with a note that the tool surface doesn't expose a bulk variant
  for the same reason the CLI refuses multi-task `complete`.

Story 4 (circuit breaker) unchanged — its workers fail to spawn,
so there are no tool calls to show; the `hermes kanban create` and
`hermes kanban runs` commands in it are correctly human-driven.
2026-05-04 03:05:34 -07:00
Teknium 0628004709 docs(model-catalog): rename x-ai/grok-4.20-beta to x-ai/grok-4.20 (#19640)
OpenRouter and Nous Portal dropped the -beta suffix from the Grok 4.20 slug.
The OpenRouter section already used the new slug; this updates the Nous
Portal section and bumps updated_at.
2026-05-04 02:48:30 -07:00
ms-alan c659a16899 fix(cli): detect quoted relative paths in _detect_file_drop
Closes #15197
2026-05-04 02:48:20 -07:00
ms-alan 08b8465ca9 fix(email): add required Date header to send_message_tool._send_email
Adds RFC 5322 Date header to the _send_email tool path in tools/send_message_tool.py.

Issue #15160 noted that both gateway/platforms/email.py and tools/send_message_tool.py
construct MIMEMultipart/MIMEText messages without setting a Date header. RFC 5322
requires the Date header; mail filters reject messages that lack it.

PR #15207 fixed the gateway/platforms/email.py path but did not cover
tools/send_message_tool._send_email, which is used by the send_message tool
for cross-channel messaging.

This change adds msg["Date"] = formatdate(localtime=True) to _send_email,
mirroring the fix applied to the gateway email adapter.

Closes #15160
2026-05-04 02:48:20 -07:00
thchen 51dc98d314 fix(agent): detect Qwen3/Ollama inline thinking after tool calls
Ollama serves Qwen3 thinking inside the content field as <think>...</think>
blocks rather than in the API-level reasoning_content field.  This means
_has_structured was False for these responses, so an empty-looking reply
after a tool call triggered the nudge instead of the prefill continuation,
causing a double-response loop.

Fix: detect <think>/<thinking>/<reasoning> in final_response and:
  1. Skip the nudge when thinking is present (model is still reasoning)
  2. Include _has_inline_thinking in _has_structured so prefill kicks in
2026-05-04 02:47:29 -07:00
LeonSGP43 0df7e61d2c fix(cli): omit empty api_mode when probing custom models 2026-05-04 02:46:41 -07:00
QifengKuang 52c539d53a fix(agent): disable SDK retries on per-request OpenAI clients
Per-request OpenAI-wire clients (used by both non-streaming and
streaming chat-completions paths in _interruptible_api_call) should
not run the SDK's built-in retry loop: the agent's outer loop owns
retries with credential rotation, provider fallback, and backoff that
the SDK can't see.

Leaving SDK retries on (default 2) compounds with our outer retries
and lets a single hung provider request stretch to ~3x the per-call
timeout before our stale detector reports it.

Shared/primary clients and Anthropic / Bedrock paths are unaffected
(they don't go through here).

Salvage of #15811 core improvement — the timeout push-down in the
original PR required scaffolding that has since been refactored on
main, so only the max_retries=0 change is preserved.

Co-authored-by: QifengKuang <k2767567815@gmail.com>
2026-05-04 02:43:20 -07:00
Teknium 3c070f9f9d fix(curator): only mark agent-created for background-review sediment (#19621)
Tighten the provenance semantics added in #19618: skills a user asks a
foreground agent to write via skill_manage(create) now stay invisible to
the curator. Only skills the background self-improvement review fork
sediments through skill_manage get the created_by=agent marker.

- tools/skill_provenance.py — new ContextVar module mirroring the
  _approval_session_key pattern: set_current_write_origin / reset /
  get / is_background_review. Default origin is 'foreground'; the
  review fork sets 'background_review'.
- run_agent.py — run_conversation() binds the ContextVar from
  self._memory_write_origin at the top of each call. The review fork
  runs on its own thread (fresh context), so foreground and review
  contexts never cross-contaminate.
- tools/skill_manager_tool.py — skill_manage(action='create') now
  only calls mark_agent_created() when is_background_review(). All
  other cases (foreground create, patch, edit, write_file, delete)
  continue as before.
- tests: test_skill_provenance.py (6 tests covering the ContextVar
  surface), split test_full_create_via_dispatcher into foreground
  vs. review-fork variants, curator status tests now mark-first.

Why: the agent routinely edits existing user skills on the user's
behalf; those writes must never flip provenance. And when a user
explicitly asks the foreground agent to create a skill, that skill
belongs to the user. The curator should only be cleaning up after
its own autonomous sediment from the review nudge loop.
2026-05-04 02:42:16 -07:00
Teknium bff484a51b fix(kanban-dashboard): widen drawer, bump body fonts, fix code-block contrast (#19638)
Closes #18576. Addresses three of four complaints from the readability
report; live-verified in a dashboard against a seeded task with body,
comments, and run history.

- Drawer default width 480px → 640px, exposed as the CSS var
  `--hermes-kanban-drawer-width` so deployments / user themes can
  override without forking the plugin.
- Bump body/meta/pre/log/run-history font sizes from the 0.65-0.75rem
  cluster to the 0.78-0.85rem cluster. Long paths and code snippets in
  task bodies, run metadata, and worker logs are legible again instead
  of requiring a squint.
- Fix the black-text-on-dark-theme regression in fenced markdown code
  blocks. Root cause: themes that don't define `--color-foreground`
  (NERV, at least) leave `color: var(--color-foreground)` resolving
  empty on <code>, which then falls back to the UA default (near-black)
  instead of inheriting from the drawer's <body>. Fix: force
  `color: inherit` on both inline and fenced code, and give the fenced
  block background via `currentColor` instead of `--color-foreground`
  so there's a visible card even when the theme var is absent.

Out of scope for this PR (comments added to #18576):
- Draggable resize handle (structural JS work; plugin ships built-only,
  no src/ in-tree).
- Live worker-log viewer for running tasks (backend WS + component).
- Sibling fix: themes like NERV should define --color-foreground. The
  current changes make the drawer robust against that gap, but the
  root fix belongs in the theme layer.
2026-05-04 02:41:51 -07:00
alt-glitch 2a52e28568 fix(setup): skip AUXILIARY_VISION_MODEL write when input is blank
Guard the save_env_value('AUXILIARY_VISION_MODEL', ...) call with
'if _selected_vision_model:' so blank input at the non-OpenAI vision
model prompt doesn't nuke existing values in .env.

save_env_value has no internal guard against empty strings — it
faithfully writes whatever it receives, including empty values that
shadow the previously-configured model.

Salvage of #15504 (core hunk). Contributor's test was dropped because
it collided with subsequent test refactors; the fix stands on its own.

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-04 02:41:47 -07:00
LeonSGP43 7d36533aeb fix(pty): default TERM for resize probes
Preserve explicit caller overrides, but backfill a sensible default
TERM=xterm-256color when missing or blank in the spawn env. CI often
runs without TERM in the parent process, which makes terminal probes
like 'tput cols' fail before winsize reads.

Salvage of #15278's core code fix only — the test changes conflict
with subsequent test refactors on main that now exercise TIOCGWINSZ
directly instead of via 'tput'.

Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-04 02:38:54 -07:00
Bart 99faac212e fix(tui): prevent trailing space in picker-command completions
Commands that open pickers (/model, /skin, /personality) previously
received a trailing space in their completions to keep the dropdown
visible in the classic CLI. However, the TUI's submit handler applies
the completion when Enter is pressed and the result differs from the
input — so '/model' + space became '/model ' and the command was never
executed.

Picker commands now omit the trailing space for exact matches, allowing
Enter to submit and open the picker. Non-picker commands (/help, etc.)
are unaffected.
2026-05-04 02:35:33 -07:00
analista 6da970f15d fix(tui): close AIAgent on session teardown to prevent FD leak
session.close only closed the slash_worker subprocess but never called
agent.close() on the AIAgent instance.  In the long-lived TUI gateway
process, this left httpx clients for GC to finalize.  When the OS
recycled a closed FD number for a new active connection, the stale
finalizer would close the live socket, causing intermittent
[Errno 9] Bad file descriptor on subsequent LLM API calls.

Call agent.close() (which properly shuts down the httpx transport pool
and TCP sockets) before closing the slash_worker.
2026-05-04 02:34:53 -07:00
nftpoetrist 4e2b20b705 fix(cli): sync use_gateway in _reconfigure_provider for tts, browser, and web
_reconfigure_provider() updates cloud_provider/backend/tts.provider when
switching tool providers via "hermes setup tools → Reconfigure", but did
not update the matching use_gateway flag. _configure_provider() (the
initial-setup path) sets use_gateway on all three tool categories. The
omission in _reconfigure_provider leaves a stale value in config.yaml:
switching from a Nous-managed provider (use_gateway=True) to a self-hosted
one keeps use_gateway=True, continuing to route requests through the Nous
gateway; switching the other way leaves use_gateway unset so the managed
feature does not activate.

Fix: mirror _configure_provider's use_gateway = bool(managed_feature)
assignment in the tts, browser, and web blocks of _reconfigure_provider.
Symmetric across all three tool categories. No behavior change for any
provider that does not set tts_provider, browser_provider, or web_backend.

Fixes #15229
2026-05-04 02:33:55 -07:00
flobo3 ba8337464d fix(gemini): extract usageMetadata from streaming chunks for token tracking 2026-05-04 02:33:30 -07:00
ee-blog f6aa1965d7 fix(telegram): fallback to document when photo dimensions exceed limits
Telegram's send_photo has dimension limits (sum of width+height <= 10000px).
When sending large screenshots or tall images, the API returns
'Photo_invalid_dimensions' error.

Fix: Catch this specific error in send_image_file() and automatically
fallback to send_document() which has no dimension limits (only 50MB size).

This is similar to the existing 5MB URL fallback (commit 542faf22) but
handles local files with dimension issues instead of URL size issues.
2026-05-04 02:33:09 -07:00
barteq ad4542bf6d fix(gateway): allow free_response_channels to override DISCORD_IGNORE_NO_MENTION
When DISCORD_IGNORE_NO_MENTION is true (default), the bot ignores
messages without @mention. However, this check ran before evaluating
free_response_channels, so messages in free-response channels were
wrongly dropped unless they contained a mention.

This change adds a carve-out: if the message lands in a channel that
is configured as a free response channel (or its parent category is),
the ignore-no-mention rule is skipped.

Also removes the unconditional skip_thread for free response channels
so that auto_thread still creates threads there unless explicitly
disabled via DISCORD_NO_THREAD_CHANNELS.
2026-05-04 02:32:39 -07:00
hex-clawd 54cd633366 fix(cron): skip AI call when script produces no output
When a cron job has a pre-run script that runs successfully but produces
no output (e.g. email checker with no new mail), the scheduler previously
injected "[Script ran successfully but produced no output.]" into the
prompt and still called the AI model. This wastes tokens on every cycle.

Now _build_job_prompt() returns None when script output is empty, and
run_job() short-circuits with a SILENT response - zero API calls when
there is nothing to report.
2026-05-04 02:32:18 -07:00
dpaluy e2248045f5 fix(cron): drop stale env-var override of persisted provider
Cron jobs were passing os.getenv("HERMES_INFERENCE_PROVIDER") as the
"requested" arg to resolve_runtime_provider(), which short-circuited
the resolver's own precedence (explicit arg → persisted config → env)
and let stale shell/.env values outrank the user's saved provider.

Long-lived cron daemons inherit env from the shell that launched them,
so a since-changed provider (e.g. DeepSeek) could keep firing for jobs
that don't pin provider/model. Same bug class as f0b763c74 fixed for
the TUI /model switch.

Pass only job.get("provider") and let resolve_requested_provider fall
through to persisted config and env in the documented order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 02:31:57 -07:00
flobo3 d7663c7808 fix(docker): exclude compose/profile runtime state from build context 2026-05-04 02:31:39 -07:00
helix4u f236cbfec3 fix(tui): declare nanostores dependency 2026-05-04 02:31:22 -07:00
B1GGersnow dc63ad0ad2 fix(anthropic): cap max_tokens at 65536 for Qwen models via DashScope
DashScope's Anthropic-compatible endpoint enforces max_tokens ∈ [1, 65536].
Adding "qwen3" to _ANTHROPIC_OUTPUT_LIMITS prevents 400 errors that were
misclassified as context overflow, triggering premature compression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 02:31:05 -07:00
Emilien Domenge 83bbe9b458 fix(delegation): pass target_model to resolve_runtime_provider in _resolve_delegation_credentials
When delegation.model differs from model.default and the provider is
opencode-go or opencode-zen, the wrong api_mode is computed because
resolve_runtime_provider falls back to model_cfg.get('default') — the
main model — instead of the configured delegation model.

For example, with model.default=minimax-m2.7 (anthropic_messages) and
delegation.model=glm-5.1 (chat_completions), subagents get
anthropic_messages, which strips /v1 from the base URL and causes a 404.

resolve_runtime_provider already accepts target_model for exactly this
purpose; _resolve_delegation_credentials just wasn't passing it.

Fixes #15319
Related: #13678
2026-05-04 02:30:48 -07:00
nftpoetrist e2211b2683 fix(compressor): reset _summary_failure_cooldown_until in on_session_reset()
on_session_reset() cleared _previous_summary, _last_summary_error, and
_ineffective_compression_count but left _summary_failure_cooldown_until
intact. When a transient summary error sets a 60 s cooldown (or 600 s
for a missing-provider RuntimeError) and the user immediately runs /reset
or /new, the cooldown carries into the new session. If the new session
reaches the compression threshold before the cooldown expires,
_generate_summary() returns None early, middle turns are silently dropped
without a summary, and the agent continues with no indication that
compaction was skipped.

Fix: set _summary_failure_cooldown_until = 0.0 in on_session_reset(),
matching the value assigned in __init__ and symmetric with the other
per-session fields already cleared there.

Fixes #15547
2026-05-04 02:30:31 -07:00
Teknium 3e1559b910 chore(release): AUTHOR_MAP entries for Tier 1c salvage batch
Pre-adds author-email mappings for upcoming Tier 1c salvage PRs
(small Apr 24-25 fixes).
2026-05-04 02:29:18 -07:00
Teknium baf834cc0f chore(release): map cine.dreamer.one@gmail.com to @LeonSGP43 2026-05-04 02:19:28 -07:00
LeonSGP43 abcaf05229 fix(skills): keep manual skills out of curator 2026-05-04 02:19:28 -07:00
asheriif 21c7c9f0ca fix(tui): harden plugin slash exec errors 2026-05-04 09:07:37 +00:00
Teknium cac4f2c0e6 test(kanban): update worker-prompt header assertion to match #19427
PR #19427 dropped the 'You are a Kanban worker' identity line from
KANBAN_GUIDANCE so SOUL.md stays authoritative for profile identity.
This test assertion was stale against that change; update it to the
new protocol-only header.
2026-05-04 02:00:42 -07:00
pdonizete deb59eab72 fix: allow kanban tools for orchestrator profiles with kanban toolset
The _check_kanban_mode() gating function only checked for
HERMES_KANBAN_TASK env var, which is only set by the dispatcher
when spawning workers. This prevented orchestrator profiles (like
techlead) from using kanban_create, kanban_link, etc. even when
they had 'kanban' explicitly in their toolsets config.

Now uses load_config() from hermes_cli.config (which has mtime-based
caching) to check if 'kanban' is in the profile's toolsets list.
This enables orchestrators to route work via Kanban while workers
continue using the dispatcher env var.

Fixes #18968
2026-05-04 02:00:42 -07:00
nftpoetrist 9faaa292b4 fix(delegate): inherit parent fallback_chain in _build_child_agent
_build_child_agent constructed child AIAgents without passing
fallback_model, leaving _fallback_chain=[] for every subagent.
When a subagent hit a rate-limit or credential exhaustion the
runtime fallback check (run_agent.py:7486 / 12267) found an empty
chain and failed immediately — even though the parent agent was
configured with fallback_providers and would have recovered.

The cron scheduler already propagates fallback_model correctly
(scheduler.py:1038). Fix closes the parity gap by reading the
parent's _fallback_chain (the normalised list form accepted by
AIAgent's fallback_model parameter) and threading it through.

Empty chains coerce to None so AIAgent initialises _fallback_chain=[]
as usual rather than iterating an empty list.
2026-05-04 01:48:56 -07:00
molvikar cb33c73418 fix(run_agent): gate iteration-limit provider routing to OpenRouter 2026-05-04 01:45:59 -07:00
Asunfly 8a364df2c8 fix: inherit reasoning config in API server runs 2026-05-04 01:44:16 -07:00
SHL0MS aede94e757 fix: back up config.yaml before hermes setup modifies it
Create a timestamped backup (~/.hermes/config.yaml.bak.YYYYMMDD_HHMMSS)
before the setup wizard runs any configuration sections. After setup
completes, show the backup path and a restore command.

This protects user-customized values (compression thresholds, provider
routing, PII redaction, auxiliary model configs) from being silently
overwritten by setup defaults.

Addresses #3522
2026-05-04 01:43:17 -07:00
memosr 2c7d7a9b2f fix(security): bind Meet node server to localhost and restrict token file to owner read 2026-05-04 01:42:59 -07:00
yuehei cdde0c8411 fix(feishu): enable MEDIA attachment delivery in send_message tool
The _send_feishu() function already supports media_files (images, video,
audio, documents) via the adapter's send_image_file/send_video/send_voice
/send_document methods, but _send_to_platform() never routed Feishu into
the early media-handling branch — media attachments were silently dropped
with a "not supported" warning.

Add a Feishu-specific media branch (matching the existing Yuanbao/Signal
pattern) so that MEDIA:<path> tags in send_message calls are correctly
delivered as native Feishu attachments. Also update the two error/warning
message strings to include feishu in the supported platform list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 01:42:40 -07:00
WanderWang 45fd45103d fix: _chromium_installed() now checks AGENT_BROWSER_EXECUTABLE_PATH and system Chrome
Before this fix, _chromium_installed() only searched Playwright-style
chromium-* / chromium_headless_shell-* directories, which meant users
with system Chrome or AGENT_BROWSER_EXECUTABLE_PATH configured still
had all browser_* tools gated.

Now checks three sources in priority order:
1. AGENT_BROWSER_EXECUTABLE_PATH env var (if set and points to a real binary)
2. System Chrome/Chromium via shutil.which() (google-chrome, chromium-browser, chrome)
3. Playwright browser cache (existing logic, kept as fallback)

Closes #19294
2026-05-04 01:42:23 -07:00
Yanzhong Su c653f5dc3f Clarify session_search auxiliary model docs 2026-05-04 01:42:07 -07:00
ai-ag2026 8bdec80882 fix(agent): surface preflight compression status
Preflight compression can run synchronously before the first model call when a loaded session exceeds the active context threshold. Gateway users saw no visible progress while the compression LLM call was in flight, which can look like a dropped message during long compactions.\n\nEmit the existing lifecycle status through _emit_status before starting preflight compression so CLI, gateway, and WebUI status callbacks all get immediate feedback.\n\nAdds a regression assertion for the preflight path.
2026-05-04 01:41:51 -07:00
qiqufang d8be50d772 fix(web): add missing icons for config page category sidebar
Add icon mappings for 9 categories that fell back to FileQuestion:
- bedrock (Cloud), curator (Sparkles), kanban (LayoutDashboard)
- model_catalog (BookOpen), openrouter (Route), sessions (History)
- tool_loop_guardrails (Shield), tool_output (FileOutput), updates (RefreshCw)
2026-05-04 01:41:27 -07:00
Teknium 06031229e8 fix(tests): tolerate ps ancestor-walk in find_gateway_pids fallback test (#19590)
Follow-up to #19586 (@cixuuz salvage): _get_ancestor_pids walks ps -o ppid=
up the process tree, which the pre-existing mock in
test_find_gateway_pids_falls_back_to_pid_file_when_process_scan_fails didn't
expect. Return empty stdout so the ancestor loop terminates cleanly and the
original fallback assertion still passes.
2026-05-04 01:40:39 -07:00
liuhao1024 9c93fc5775 fix(tui): call process.exit(0) after Ink exit to trigger terminal cleanup
Ink's exit() calls unmount() which resets terminal modes (kitty keyboard,
mouse, etc.) but does NOT call process.exit().  The Node process stays
alive because stdin is still open (Ink listens on it), so the
process.on('exit') handler in entry.tsx — which sends the final
resetTerminalModes() — never fires.

This left kitty keyboard protocol and other terminal modes enabled in the
parent shell after /quit, Ctrl+C, or Ctrl+D, breaking arrow keys and
other input in subsequent programs.

Add explicit process.exit(0) after exit() in die() so the process
actually terminates and the exit handler runs.

Fixes #19194
2026-05-04 01:39:39 -07:00
Hermes Agent 74c997d985 fix(gateway): move quick-command dispatch before built-in handlers
Quick commands of type "alias" that target built-in slash commands
(e.g. /h -> /model) were processed too late in _handle_message — after
the if-canonical=="model" checks. This meant alias expansion never
reached the target handler and fell through to the LLM as raw text.

Two fixes:
1. Move the quick_commands block before built-in dispatch so alias
   targets (like /model) hit the correct handler after expansion.
2. Extract bare command name from target_command via .split()[0] to
   feed _resolve_cmd() correctly (was using the full arg-string).
2026-05-04 01:39:23 -07:00
holynn c857592558 fix(cli): allow custom:* provider slugs in model validation
Two related fixes for custom_providers model switching:

1. validate_requested_model() now recognizes custom:<name> slugs
   (e.g. custom:volcengine) as custom endpoints, not generic providers.
   Previously only the bare 'custom' slug matched the relaxed validation
   branch, causing model validation to fail with 'not found in provider
   listing' for all named custom providers.

2. switch_model() now consults the custom_providers list when deciding
   whether to override a validation rejection. If the requested model
   matches the entry's 'model' field or any key in its 'models' dict,
   the switch is accepted even when the remote /v1/models endpoint does
   not list it.

Both changes are covered by existing tests (86 passed).
2026-05-04 01:39:06 -07:00
Byrn Tong e8cdcf5328 fix: exclude ancestor PIDs from gateway process scan (#13242)
_scan_gateway_pids() uses ps-based pattern matching to find running
gateways. When invoked from the CLI (e.g. `hermes gateway status`),
the calling process itself matches gateway patterns, causing false
positives — the CLI is mistakenly counted as a running gateway.

Add _get_ancestor_pids() that walks the process tree from the current
PID up to init (PID 1). Merge this set into exclude_pids at the top
of _scan_gateway_pids() so the entire ancestor chain is filtered out.

This complements the existing os.getpid() exclusion in
_append_unique_pid() by also covering parent/grandparent processes
(e.g. when hermes is invoked via a wrapper script or shell).

Closes #13242
2026-05-04 01:38:41 -07:00
Aleksandr Pasevin 8a4fe80f8d fix(signal): skip reactions for unauthorized senders
The on_processing_start hook fired a reaction emoji (👀) on every
inbound Signal message before run.py's _is_user_authorized check.
This meant contacts not in SIGNAL_ALLOWED_USERS would see the bot
react to their messages even though Hermes silently dropped them —
leaking the presence of the bot and causing confusing UX.

Two changes to gateway/platforms/signal.py:

1. Read SIGNAL_ALLOWED_USERS into self.dm_allow_from in __init__
   (mirrors the group_allow_from pattern already in place).

2. Add _reactions_enabled(event) — two-gate check:
   - SIGNAL_REACTIONS=false/0/no disables reactions globally
   - If SIGNAL_ALLOWED_USERS is set, only react to senders in
     the allowlist (skips unauthorized contacts)

Both on_processing_start and on_processing_complete now call this
guard before sending any reaction.

Telegram already has an equivalent _reactions_enabled() guard
(controlled by TELEGRAM_REACTIONS). This brings Signal to parity.
2026-05-04 01:38:21 -07:00
nftpoetrist e89376d66f fix(setup): add missing SLACK_HOME_CHANNEL prompt to _setup_slack()
_setup_slack() was the only platform setup function that did not prompt
for a home channel. All four sibling setups (_setup_telegram,
_setup_discord, _setup_mattermost, _setup_bluebubbles) close with an
identical home-channel block, and setup_gateway() already checks for
SLACK_HOME_CHANNEL presence at the end of the wizard — but the value
was never collected, leaving cron delivery and cross-platform
notifications silently broken for Slack after a fresh hermes setup run.

Add the standard home-channel prompt at the end of _setup_slack(),
symmetric with the Discord implementation. Add two unit tests that
verify the prompt is saved when provided and skipped when left blank.
2026-05-04 01:37:18 -07:00
Byrn Tong 81ce945450 fix(gateway): show other profiles in gateway status to prevent confusion
When multiple gateway profiles are running (e.g. default and wx1),
`hermes gateway status` can be misleading — stopping one profile's
gateway and checking status may still show the other profile's process
without indicating which profile it belongs to.

Add `_print_other_profiles_gateway_status()` which displays running
gateways from other profiles at the bottom of the status output:

    Other profiles:
      ✓ wx1              — PID 166893

This uses the existing `find_profile_gateway_processes()` and
`get_active_profile_name()` — no new dependencies.

Closes #19113
Related: #4402, #4587
2026-05-04 01:37:02 -07:00
wanazhar df88375f0d fix: treat ctrl-c as curses cancel 2026-05-04 01:36:44 -07:00
leavr ccb5d87076 test: cover max-iterations summary message sanitization 2026-05-04 01:36:27 -07:00
tmdgusya a1cb811cb8 fix(cli): avoid voice TTS restart race 2026-05-04 01:36:07 -07:00
Teknium 314fe9f827 chore(release): add AUTHOR_MAP entries for upcoming salvage batch
Pre-adds author-email mappings for the 21 Tier 1b salvage PRs so
their cherry-picked commits land with mapped GitHub logins in the
release notes.
2026-05-04 01:34:32 -07:00
ethan 645b99aadd test(cron): cover null next_run_at recovery and non-dict origin tolerance
Adds four regression tests guarding the bugfix in the previous commit:
- TestGetDueJobs::test_broken_cron_without_next_run_is_recovered exercises
  cron schedules whose next_run_at was lost; expects compute_next_run to
  repopulate it within get_due_jobs() rather than silently skipping the job.
- TestGetDueJobs::test_broken_interval_without_next_run_is_recovered does
  the same for interval schedules.
- TestResolveOrigin::test_string_origin_is_tolerated and
  test_non_dict_origin_is_tolerated confirm _resolve_origin() returns None
  for legacy/hand-edited origins (string, list, int) instead of raising.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
ethan 78b635ee3c fix(cron): recover null next_run_at jobs and tolerate non-dict origin
Fixes #18722

get_due_jobs() now recomputes next_run_at via compute_next_run() for
cron/interval jobs that arrived with null next_run_at (e.g. via direct
jobs.json edits) instead of silently skipping them. _resolve_origin()
guards with isinstance(origin, dict), and _deliver_result() now routes
through _resolve_origin() so string/non-dict origins no longer crash
the ticker.

References: references #18735 (open competing fix from automated bulk PR touching 79 files); this PR is a focused single-issue contribution and adds the missing interval-recovery test variant

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
Teknium 91ea3ae4b2 test(skills): add bytes-vs-str equivalence and on-disk hash parity tests
Follow-up on #9925 cherry-pick adding two additional tests:
- bytes content hashes identically to its str-decoded form
- mixed bytes+str bundle hash equals the on-disk content_hash from
  skills_guard (the production invariant used to detect drift)

Also map dodofun@126.com and 1615063567@qq.com in AUTHOR_MAP so the
CI contributor check passes for the cherry-picked commit.

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: zhao0112 <1615063567@qq.com>
2026-05-04 01:28:12 -07:00
dh 3072e5543b skills-hub: hash binary skill bundle files correctly 2026-05-04 01:28:12 -07:00
Teknium c90f25dd1f chore(release): map daixin1204@gmail.com to @SimbaKingjoe 2026-05-04 01:21:23 -07:00
daixin1204 744079ffe6 fix(curator): prevent false-positive consolidation from substring matching
_classify_removed_skills used naive 'in' substring matching to detect
whether a removed skill's name appeared in skill_manage arguments.
Short/common skill names (api, git, test, foo, etc.) matched
incorrectly when they appeared as substrings of longer words in file
paths (references/api-design.md) or content (latest, testing).

Replace with field-aware matching:
- file_path: needle must match a complete filename stem or directory
  name, with -/_ normalised for variant tolerance
- content fields: word-boundary regex (\b) prevents embedding in
  longer words

Also add 3 regression tests covering the false-positive scenarios.
2026-05-04 01:21:23 -07:00
Clooooode c0300575c1 fix(kanban): use get_default_hermes_root() in list_profiles_on_disk
Path.home() / ".hermes" / "profiles" breaks custom-root deployments
(e.g. HERMES_HOME=/opt/data). Switch to get_default_hermes_root() so
profile discovery is consistent with kanban_db_path() and
workspaces_root() fixed in #18985.

Fixes #19017.
Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Clooooode 1964b0565b test(kanban): add failing test for list_profiles_on_disk with custom HERMES_HOME
list_profiles_on_disk() hardcodes Path.home() / ".hermes" / "profiles",
ignoring HERMES_HOME when set to a custom root (e.g. /opt/data).

Add test_list_profiles_on_disk_custom_root to cover this case.

Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Siddharth Balyan 8163d37192 fix(skill): reference built-in video_analyze/vision_analyze tools in kanban-video-orchestrator (#19562)
The tool-matrix.md had a vague 'Gemini multimodal / Claude vision' entry
in the external tools table that didn't point to the actual built-in
Hermes tools. Now that video_analyze exists (merged in #19301), update
the skill to reference it properly:

- Add 'Built-in Hermes tools for media review' section with proper
  toolset names, enablement instructions, and capability details
- Add video + vision toolsets to cinematographer, editor, and reviewer
  profile configs
- Update role-archetypes.md to reference tools by name
- Update API key table to explain video_analyze routing
2026-05-04 12:54:50 +05:30
Siddharth Balyan a11aed1acc fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by:
1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd`
2. Inherited TERMINAL_CWD from parent hermes processes
3. Only resolved when config had a placeholder value (not explicit paths)

Fix:
- load_cli_config() unconditionally uses os.getcwd() for local backend
- TERMINAL_CWD always force-exported in CLI mode (overrides stale values)
- Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber
- Remove terminal.cwd from config-set .env sync map (prevents re-poisoning)
- Clarify setup wizard label as 'Gateway working directory'

Closes #19214
2026-05-04 11:36:19 +05:30
Ben Barclay 434d70d8bc Merge pull request #19540 from NousResearch/single_container_for_all
feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
2026-05-04 15:38:19 +10:00
Ben 5671059f62 feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
Adds an optional dashboard side-process to the container entrypoint,
toggled by `HERMES_DASHBOARD=1` (also accepts `true` / `yes`).  When set,
the entrypoint backgrounds `hermes dashboard` before `exec`-ing the main
command so the user's chosen foreground process (gateway, chat, `sleep
infinity`, …) remains PID-of-interest for the container runtime.
  docker run -d \
    -v ~/.hermes:/opt/data \
    -p 8642:8642 -p 9119:9119 \
    -e HERMES_DASHBOARD=1 \
    nousresearch/hermes-agent gateway run
Defaults chosen for the container case:
 - Host: 0.0.0.0 (reachable through published port; can override to
   127.0.0.1 via HERMES_DASHBOARD_HOST for sidecar/reverse-proxy setups)
 - Port: 9119 (matches `hermes dashboard`)
 - Auto-adds `--insecure` when binding to non-localhost, matching the
   dashboard's own safety gate for exposing API keys
 - HERMES_DASHBOARD_TUI is read by `hermes dashboard` directly — no
   entrypoint plumbing needed
Dashboard output is prefixed with `[dashboard]` via `stdbuf`+`sed -u` so
it's easy to separate from gateway logs in `docker logs`.  No supervision:
if the dashboard crashes it stays down until the container restarts
(documented in the `:::note` panel).
Other changes bundled in:
 - Deprecate GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT env vars in
   hermes_cli/web_server.py with a DEPRECATED block comment and a
   `.. deprecated::` note on _probe_gateway_health.  The feature still
   works for this release; it'll be removed alongside the move to a
   first-class dashboard config key.
 - Rewrite the "Running the dashboard" doc section around the new
   single-container pattern.  Drops the previously-documented
   dashboard-as-its-own-container setup — that pattern relied on the
   deprecated env vars for cross-container gateway-liveness detection,
   and without them the dashboard would permanently report the gateway
   as "not running".
 - Collapse the two-service Compose example (gateway + dashboard
   container) into a single service with HERMES_DASHBOARD=1.  Removes
   the now-unnecessary bridge network and `depends_on`.
 - Drop the ":::warning" caveat about "Running a dashboard container
   alongside the gateway is safe" — that case no longer exists.
2026-05-04 15:37:27 +10:00
Ben Barclay 95f395027f Merge pull request #19520 from NousResearch/fix_docker_tui
fix(docker/tui): tolerate npm's peer-flag drop in lockfile comparison
2026-05-04 14:29:43 +10:00
Ben 2f2998bb1b fix(tui): tolerate npm's peer-flag drop in lockfile comparison
`_tui_need_npm_install()` compares the canonical `package-lock.json` against
the hidden `node_modules/.package-lock.json` to decide whether `npm install`
needs to re-run. npm 9 drops the `"peer": true` field from the hidden lock
on dev-deps that are *also* declared as peers (the canonical lock preserves
the dual annotation). That made the check flag 16 packages (`@babel/core`,
`@types/node`, `@types/react`, `@typescript-eslint/*`, `react`, `vite`,
`tsx`, `typescript`, …) as mismatched on every launch, triggering a runtime
`npm install`.
Inside the Docker image, that runtime install then fails with EACCES because
`/opt/hermes/ui-tui/node_modules/` is root-owned from build time, so
`docker run … hermes-agent --tui` prints:
    Installing TUI dependencies…
    npm install failed.
…and exits 1, with no preview. The empty preview is a second bug: the
launcher captured only stderr, but npm 9 writes EACCES to stdout, which
was DEVNULL'd.
Fixes:
 - Add `"peer"` to `_NPM_LOCK_RUNTIME_KEYS` so the comparison ignores the
   non-deterministic field, alongside the existing `"ideallyInert"`.
 - Capture stdout as well as stderr in the install subprocess so future
   failures surface a useful preview instead of a bare "failed." line.
Regression tests:
 - `test_no_install_when_only_peer_annotation_differs` — the exact scenario
 - `test_install_when_version_differs_even_with_peer_drop` — guards against
   the peer-drop tolerance masking a real version skew
On-host impact: the same false-positive was firing on every `hermes --tui`
invocation from a normal checkout, silently running a no-op `npm install`
each time (it converged because the host's `node_modules/` is writable).
Startup time on the TUI should drop noticeably.
2026-05-04 14:13:38 +10:00
Chris Danis 363cc93674 fix(cron): bump skill usage when cron jobs load skills
Cron jobs that reference skills via their skills: config never bumped
the usage counters in .usage.json, so the curator could auto-archive
skills actively used by cron jobs based on stale timestamps.

Now _build_job_prompt() calls bump_use(skill_name) for each
successfully loaded skill so the curator sees them as active.
2026-05-03 17:06:48 -07:00
nftpoetrist 808fee151d fix(auxiliary): propagate explicit_api_key to _try_anthropic()
_try_anthropic() lacked the explicit_api_key parameter added to
_try_openrouter() in #18768. When resolve_provider_client() is called
with provider="anthropic" and an explicit key (e.g. from a fallback_model
entry with api_key set), the key was silently ignored — _try_anthropic()
always fell back to resolve_anthropic_token(), so the fallback returned
None,None for users without a default Anthropic credential configured.

Fix: add explicit_api_key: str = None to _try_anthropic() and use
explicit_api_key or <pool/env fallback> in both the pool-present and
no-pool paths. Pass explicit_api_key=explicit_api_key at the call site
in resolve_provider_client(). Symmetric with the _try_openrouter() fix.
No behavior change when explicit_api_key is None.
2026-05-03 17:00:55 -07:00
molvikar 74636f9c4a fix(gateway): clear queued reload-skills notes on new/resume/branch 2026-05-03 17:00:31 -07:00
Kenny Wang 222767e5e8 fix: sanitize Telegram help command mentions 2026-05-03 17:00:09 -07:00
konsisumer 6fda92aa7f fix(gateway): bridge top-level require_mention to Telegram config
Users commonly place `require_mention: true` at the top level of
config.yaml alongside `group_sessions_per_user`, expecting it to gate
Telegram group messages. The key was silently ignored because the
config loader only checked `yaml_cfg["telegram"]["require_mention"]`.

When `require_mention` is found at the top level and no telegram-specific
value is set, the fix now:
- adds it to platforms_data["telegram"]["extra"] so _telegram_require_mention()
  picks it up via the primary config.extra path
- sets TELEGRAM_REQUIRE_MENTION env var for the secondary fallback path

A telegram-specific value (telegram.require_mention) still takes
precedence over the top-level shorthand.

Also corrects telegram.md: bare /cmd without @botname is rejected when
require_mention is enabled; only /cmd@botname (bot-menu form) passes.

Fixes #3979
2026-05-03 16:59:46 -07:00
clawbot 1bd975c0ba fix(gateway): suppress duplicate voice transcripts
Deduplicate exact and near-exact Discord voice STT transcripts per guild/user over a short window to avoid duplicate delayed agent replies.

Adds regression tests for exact and near-duplicate voice transcript suppression.
2026-05-03 16:59:21 -07:00
Teknium b58db237e4 fix(kanban): drop worker identity claim from KANBAN_GUIDANCE (#19427)
KANBAN_GUIDANCE layer 3 of the system prompt started with 'You are a
Kanban worker', overriding the profile's SOUL.md identity at layer 1.
Profiles with strict role boundaries (e.g. a reviewer profile that
never writes code) still executed implementation tasks because the
kanban identity claim diluted SOUL's.

Drop the identity line. Layer 3 now describes the task-execution
protocol only; SOUL.md remains the sole identity slot.

Fixes #19351
2026-05-03 16:59:00 -07:00
LeonSGP43 6713274a42 fix(file): strip leaked terminal fences from reads 2026-05-03 16:58:50 -07:00
Alan Chen 2d7543c61f fix(windows): enforce UTF-8 stdout/stderr to prevent UnicodeEncodeError crash
On Windows, services and terminals default to cp1252 encoding. The CLI
uses box-drawing characters (┌│├└─) in banners, doctor output, and
status displays. When print() tries to encode these under cp1252, an
unhandled UnicodeEncodeError crashes the gateway on startup.

This fix adds early UTF-8 enforcement in hermes_cli/__init__.py:
- Sets PYTHONUTF8=1 and PYTHONIOENCODING=utf-8
- Re-opens stdout/stderr with UTF-8 encoding if not already UTF-8

Runs at import time so it protects all CLI subcommands. No effect on
Unix (gated on sys.platform == "win32"). Backwards-compatible: on
systems already using UTF-8, the function is a no-op.

Fixes #10956
2026-05-03 16:58:25 -07:00
Teknium 2ababfe6ed chore(release): map 0xKingBack noreply email 2026-05-03 16:55:16 -07:00
0xKingBack 3c42024539 fix(curator): pass auxiliary curator api_key/base_url into runtime resolution
Curator review fork now forwards per-slot credentials from auxiliary.curator
and legacy curator.auxiliary to resolve_runtime_provider, matching the
canonical aux task schema. Add regression tests for binding and main fallback.
2026-05-03 16:55:16 -07:00
Kiala 3792b77bd1 fix(send_message): support QQBot C2C and group chats
The _send_qqbot function was hardcoded to use the guild channel
endpoint (/channels/{id}/messages), which fails for C2C private
chats and QQ groups with 'channel does not exist' (code 11263).

This change tries the appropriate endpoints in order:
1. /channels/{id}/messages     (guild channels)
2. /v2/users/{id}/messages     (C2C private chats)
3. /v2/groups/{id}/messages    (QQ groups)

Fixes active sending to QQBot C2C and group recipients.
2026-05-03 16:54:39 -07:00
MrBob 86e64c1d3b fix(gateway): hide required-arg commands from Telegram menu 2026-05-03 15:29:06 -07:00
sprmn24 408dd8aa28 fix(compressor): skip non-string tool content in dedup pass to prevent AttributeError 2026-05-03 15:28:30 -07:00
sprmn24 5bd937533c fix(vision): guard user_prompt type in video_analyze_tool before debug_call_data construction 2026-05-03 15:28:04 -07:00
sprmn24 6c4aca7adc fix(vision): guard user_prompt type before debug_call_data construction 2026-05-03 15:27:40 -07:00
Zyproth a5cae16496 fix(api_server): fall back to default port on malformed API_SERVER_PORT 2026-05-03 15:27:03 -07:00
Amit Gaur 65bebb9b80 fix(cli): follow 307 redirects in MiniMax OAuth httpx clients
The MiniMax OAuth API endpoints have moved from api.minimax.io to
account.minimax.io and the old paths now respond with HTTP 307.
httpx defaults to follow_redirects=False (unlike requests), so the
device-code and token-refresh flows fail with "Temporary Redirect".

Adds follow_redirects=True to the two httpx.Client instances in
hermes_cli/auth.py used by the MiniMax OAuth flow. This is forward-
compatible -- if endpoints move again, the redirect chain is
followed automatically.

Repro before patch:
  curl -i -X POST https://api.minimax.io/oauth/code  # -> 307
  curl -i -X POST https://api.minimax.io/oauth/token # -> 307

Verified end-to-end against a real MiniMax Plus account on macOS;
the existing tests/test_minimax_oauth.py suite (15 tests) still
passes.
2026-05-03 15:26:33 -07:00
Zyproth dfdd7b6e6f fix(codex-transport): preserve request override headers for xai responses 2026-05-03 15:25:45 -07:00
LeonSGP43 4a2f822137 fix(mcp): reconnect on terminated sessions 2026-05-03 15:23:33 -07:00
teknium1 2658494e81 fix(kanban): add per-path env overrides + dispatcher env injection
Layers defense-in-depth on top of the shared-root anchoring (base commit).

Changes in hermes_cli/kanban_db.py:
- kanban_db_path() now honours HERMES_KANBAN_DB first, then falls through
  to kanban_home()/kanban.db.
- workspaces_root() now honours HERMES_KANBAN_WORKSPACES_ROOT first, then
  falls through to kanban_home()/kanban/workspaces.
- All three overrides (HERMES_KANBAN_HOME, HERMES_KANBAN_DB,
  HERMES_KANBAN_WORKSPACES_ROOT) now call .expanduser() for consistency.
- _default_spawn() injects HERMES_KANBAN_DB and
  HERMES_KANBAN_WORKSPACES_ROOT into the worker subprocess env. Even
  when the worker's get_default_hermes_root() resolution somehow
  disagrees with the dispatcher's (symlinks, unusual Docker layouts),
  the two processes still open the same SQLite file.

Module docstring updated to describe all three overrides and the
dispatcher env-injection contract.

Tests (tests/hermes_cli/test_kanban_db.py, TestSharedBoardPaths):
- test_hermes_kanban_db_pin_beats_kanban_home
- test_hermes_kanban_workspaces_root_pin_beats_kanban_home
- test_empty_per_path_overrides_fall_through
- test_dispatcher_spawn_injects_kanban_db_and_workspaces_root
  (monkeypatches subprocess.Popen, asserts both env vars reach the
  child even after HERMES_HOME is rewritten by `hermes -p <profile>`.)

Docs: website/docs/reference/environment-variables.md gets entries
for the three kanban env vars.

This fusion is built on the cleanest of the seven competing PRs that
targeted issue #18442:

* Base commit (from PR #19350 by @GodsBoy): add `kanban_home()` helper
  anchored at `get_default_hermes_root()`, reroute all 5 kanban path
  sites through it (including the 3 sibling log-dir sites that the
  other six PRs missed), 8-test regression class.
* Dispatcher env-var injection approach drawn from PRs #18300
  (@quocanh261997) and #19100 (@cg2aigc).
* Per-path env overrides drawn from PR #19100 (@cg2aigc).
* get_default_hermes_root() resolution direction first proposed in
  PR #18503 (@beibi9966) and PR #18985 (@Gosuj).

Closes the duplicate/competing PRs: #18300, #18503, #18670, #18985,
#19037, #19056, #19100. Fixes #18442 and #19348.

Co-authored-by: quocanh261997 <17986614+quocanh261997@users.noreply.github.com>
Co-authored-by: cg2aigc <232694053+cg2aigc@users.noreply.github.com>
Co-authored-by: beibi9966 <beibei1988@proton.me>
Co-authored-by: Gosuj <123411271+Gosuj@users.noreply.github.com>
Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-03 15:13:39 -07:00
GodsBoy f5bd77b3e1 fix(kanban): anchor board, workspaces, and worker logs at the shared Hermes root
The Kanban board is documented as shared across all Hermes profiles, but
`kanban_db_path()` and `workspaces_root()` resolved through `get_hermes_home()`,
which returns the active profile's HERMES_HOME. When the dispatcher spawned a
worker with `hermes -p <profile> --skills kanban-worker chat -q "work kanban
task <id>"`, the worker rewrote HERMES_HOME to the profile subdirectory before
kanban_db.py imported, opening a profile-local `kanban.db` that did not contain
the dispatcher's task. `kanban_show` and `kanban_complete` failed; the
dispatcher's row stayed `running` and was retried/crashed. The same defect
applied to `_default_spawn`'s log directory and `worker_log_path`, so
`hermes kanban tail` did not see the worker's output.

Add `kanban_home()` in `hermes_cli/kanban_db.py` that resolves through
`HERMES_KANBAN_HOME` (explicit override) then `get_default_hermes_root()`,
which already understands the `<root>/profiles/<name>` and Docker / custom
HERMES_HOME shapes. Reroute `kanban_db_path`, `workspaces_root`, the
`_default_spawn` log directory, `gc_worker_logs`, and `worker_log_path`
through it. Profile-specific config, `.env`, memory, and sessions stay
isolated as before; only the kanban surface is shared.

Add a `TestSharedBoardPaths` regression class to `tests/hermes_cli/test_kanban_db.py`
covering: default install, profile-worker convergence, Docker custom HERMES_HOME,
Docker profile layout, explicit `HERMES_KANBAN_HOME` override, and a real
SQLite round-trip across dispatcher and worker HERMES_HOME perspectives.
The dispatcher/worker convergence tests fail on origin/main and pass after
the fix.

Update the `kanban.md` user-guide page and the misleading docstrings in
`kanban_db.py` to describe the shared-root behavior.

Fixes #19348
2026-05-03 15:13:39 -07:00
asheriif 7e780f4832 fix(tui): run plugin slash commands live 2026-05-03 19:42:16 +00:00
Siddharth Balyan 167b5648ea Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329)
This reverts commit 9eaddfafa3.
2026-05-04 00:43:58 +05:30
Siddharth Balyan 9eaddfafa3 fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)
CLI/TUI sessions on the local backend now unconditionally use
os.getcwd() as the working directory. The terminal.cwd config value is
only consumed by gateway/cron/delegation modes (where there's no shell
to cd from).

Previously, 'hermes setup' would write an absolute path (e.g. $HOME)
into terminal.cwd which then pinned the CLI to that directory regardless
of where the user launched hermes from. This was a silent foot-gun —
the user's 'cd' was being ignored.

Changes:

1. cli.py: Restructured CWD resolution — if TERMINAL_CWD is not already
   set by the gateway, and the backend is local, always use os.getcwd().
   Config terminal.cwd is irrelevant for interactive CLI/TUI sessions.

2. setup.py: Moved the cwd prompt from setup_terminal_backend() to
   setup_gateway(). It now only appears when configuring messaging
   platforms and is labeled 'Gateway working directory'.

3. Tests: Rewrote test_cwd_env_respect.py to validate the new behavior:
   explicit config paths are ignored for CLI, gateway pre-set values are
   preserved, non-local backends keep their config paths.

4. Docs: Updated configuration.md, profiles.md, and
   environment-variables.md to clarify that terminal.cwd only affects
   gateway/cron mode on local backend.

Closes #19214
2026-05-04 00:14:36 +05:30
GodsBoy b8ae8cc801 fix(debug): redact log content at upload time in hermes debug share
Apply agent.redact.redact_sensitive_text with force=True to log content
captured by _capture_log_snapshot before it reaches upload_to_pastebin.
On-disk logs are untouched. Compatible with the off-by-default local
redaction policy from #16794: this is upload-time-only and applies
regardless of security.redact_secrets because the public paste service
is the leak surface. A visible banner is prepended to each uploaded log
paste so reviewers know redaction was applied. --no-redact preserves
deliberate unredacted sharing for maintainer-coordinated cases.

The bug-report, setup-help, and feature-request issue templates direct
users to run hermes debug share and paste the resulting public URLs.
With redaction off by default per #16794, those uploads have been
carrying credentials onto paste.rs and dpaste.com.

force=True is non-negotiable: without it, redact_sensitive_text
short-circuits at agent/redact.py:322 when the env var is unset, so the
fix would silently be a no-op for its target audience. A regression
test pins this down.

Fixes #19316
2026-05-03 11:42:20 -07:00
Siddharth Balyan c9a3f36f56 feat: add video_analyze tool for native video understanding (#19301)
* feat: add video_analyze tool for native video understanding

Adds a video_analyze tool that sends video files to multimodal LLMs
(e.g. Gemini) for analysis via the OpenRouter-compatible video_url
content type. Mirrors vision_analyze in structure, error handling,
and registration pattern.

Key design:
- Base64 encodes entire video (no frame extraction, no ffmpeg dep)
- Uses 'video_url' content block type (OpenRouter standard)
- Supports mp4, webm, mov, avi, mkv, mpeg formats
- 50 MB hard cap, 20 MB warning threshold
- 180s minimum timeout (videos take longer than images)
- AUXILIARY_VIDEO_MODEL env override, falls back to AUXILIARY_VISION_MODEL
- Same SSRF protection, retry logic, and cleanup as vision_analyze

Default disabled: registered in 'video' toolset (not in _HERMES_CORE_TOOLS).
Users opt in via: hermes tools enable video, or enabled_toolsets=['video'].

* feat(video): add models.dev capability pre-check + CONFIGURABLE_TOOLSETS entry

- Pre-checks model video capability via models.dev modalities.input
  before expensive base64 encoding. Fails early with helpful message
  suggesting video-capable alternatives (gemini, mimo-v2.5-pro).
- Passes optimistically if model unknown or lookup fails.
- Adds ModelInfo.supports_video_input() helper.
- Adds 'video' to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS
  so 'hermes tools enable video' works from CLI.
- 8 new tests for the capability check (37 total).

* refactor(video): remove models.dev capability pre-check

Removes _check_video_model_capability and ModelInfo.supports_video_input.
The vision_analyze tool doesn't pre-check image capability either — both
tools rely on the same pattern: send request, handle API errors gracefully
with categorized user-facing messages. The pre-check was inconsistent
(only worked for some providers/models) so drop it for parity.

* cleanup: compress comments, fix fragile timeout coupling

- Replace _VISION_DOWNLOAD_TIMEOUT * 2 with hardcoded 60s (no silent
  breakage if vision timeout changes independently)
- Strip verbose comments and redundant log lines throughout
- No behavioral changes
2026-05-04 00:04:36 +05:30
SHL0MS 0dd8e3f8d8 rename: video-orchestrator → kanban-video-orchestrator
The kanban prefix makes the skill discoverable alongside `kanban-orchestrator`
and `kanban-worker`, and signals up front that this skill drives the kanban
plugin rather than being a generic video tool.

Updated:
- directory rename
- SKILL.md frontmatter `name:` and H1
- setup.sh.tmpl header
2026-05-03 10:26:54 -07:00
SHL0MS 511add7249 feat(skill): add video-orchestrator optional creative skill
Meta-pipeline that wraps any video request — narrative film, product /
marketing, music video, explainer, ASCII, generative, comic, 3D,
real-time/installation — in a Hermes Kanban pipeline. Performs adaptive
discovery, designs an appropriate team for the requested style, generates
the setup script that creates Hermes profiles + initial kanban task, and
helps monitor execution.

Routes scenes to whichever existing Hermes skill fits each beat
(`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`,
`blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`,
`songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and
image-to-video. Kanban orchestration uses the `kanban-orchestrator` and
`kanban-worker` skills.

The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, and `--workspace dir:<path>` discipline are
adapted from alt-glitch's original kanban-video-pipeline at
https://github.com/NousResearch/kanban-video-pipeline. This skill
generalizes those patterns across video styles and replaces the original
string-replacement config patcher with a PyYAML-based one that touches
only `toolsets` and `skills.always_load` (preserving security-sensitive
fields like `approvals.mode`).

Includes:
- SKILL.md — workflow + critical rules
- references/ — intake, role archetypes, tool matrix, kanban setup,
  monitoring, six worked examples
- assets/ — brief / setup.sh / soul.md templates
- scripts/ — bootstrap_pipeline.py (plan.json -> setup.sh) and
  monitor.py (poll + issue detection)

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-03 10:26:54 -07:00
brooklyn! e97a9993b9 Merge pull request #19307 from NousResearch/bb/fix-terminal-resize-jumble
fix(tui): clear Apple Terminal resize artifacts
2026-05-03 10:17:15 -07:00
Brooklyn Nicholson 279b656adc fix(tui): clear Apple Terminal resize artifacts
Use a deeper alt-screen clear for Apple Terminal resize repaints so host reflow artifacts do not survive the recovery frame.
2026-05-03 12:11:24 -05:00
Bartok9 e527240b27 fix(tools): write_file handler now rejects missing 'content'/'path' args instead of silently writing zero-byte files (#19096)
Under context pressure, frontier models sometimes emit tool calls with
required fields dropped. Previously _handle_write_file() used
args.get('content', '') which substituted an empty string for the missing
key, returned success with bytes_written=0, and created a zero-byte file
on disk. The model had no way to detect the failure.

Changes:
- Reject calls where 'path' is absent or not a non-empty string
- Reject calls where 'content' key is entirely absent (key-presence check,
  not truthiness) — distinguishing a legitimately empty file from a dropped arg
- Reject calls where 'content' is a non-string type
- All error messages include guidance to re-emit the tool call or switch
  to execute_code with hermes_tools.write_file() for large payloads
- Explicit empty string content (file truncation) continues to work

Regression tests added for all four cases: missing path, missing content,
explicit-empty content, and wrong content type.

Fixes #19096
2026-05-03 08:52:41 -07:00
Tranquil-Flow 6b4fb9f878 fix(cron): treat non-dict origin as missing instead of crashing tick
``_resolve_origin`` called ``origin.get('platform')`` on whatever
``job.get('origin')`` returned. The leading ``if not origin: return None``
short-circuited the falsy cases (None, empty dict, "") but a non-empty
string passed that guard and then crashed with
``AttributeError: 'str' object has no attribute 'get'`` on every fire
attempt. Observed in the wild after a migration script tagged jobs with
free-form provenance strings (e.g.
``"combined-digest-replaces-x-and-y-20260503"``).

``mark_job_run`` did record ``last_status: error,
last_error: "'str' object has no attribute 'get'"`` once, but the next
tick re-loaded the same poisoned origin and crashed identically. The
job stayed enabled, fired every tick, and accumulated cascading errors
in the log until ``origin`` was patched manually.

Replace the falsy guard with ``isinstance(origin, dict)``. Non-dict
origins (string, int, list, tuple, float — anything that survived a
hand-edit, JSON-script write, or migration) are now treated the same
as a missing origin: the job continues with ``deliver`` falling back
through its normal home-channel path instead of crashing the scheduler
loop.

Test parametrises the non-dict shapes that can appear in jobs.json
through external writers and asserts ``_resolve_origin`` returns None
for each.

Note: this fix scope is the non-dict-``origin`` crash only. The
``next_run_at: null`` recurring-job recovery (the second sub-bug in
#18722) is independently addressed by the in-flight #18825, which
extends the never-silently-disable defense from #16265 to
``get_due_jobs()`` — that approach is well-aligned with the existing
recovery pattern and ships fine without a competing change here.

Fixes #18722 (non-dict origin crash; recurring-job recovery covered by #18825)
2026-05-03 08:51:50 -07:00
JasonOA888 69dd0f7cf1 fix(approval): extend sensitive write target to cover shell RC and credential files
Terminal commands can write to shell RC files (~/.bashrc, ~/.zshrc,
~/.profile) and credential files (~/.netrc, ~/.pgpass, ~/.npmrc,
~/.pypirc) via redirection or tee without triggering approval, even
though write_file already blocks these paths in file_safety.py.

This creates an inconsistency: write_file protects these paths but
terminal shell redirections bypass the same protection. An agent
prompted via indirect injection could install persistent backdoors
(e.g. PATH manipulation, alias overrides) or write credential entries
without user approval.

Extend _SENSITIVE_WRITE_TARGET with two new regex groups matching the
same paths that file_safety.py's WRITE_DENIED_PATHS already covers:
  _SHELL_RC_FILES  — ~/.bashrc, ~/.zshrc, ~/.profile, ~/.bash_profile,
                     ~/.zprofile
  _CREDENTIAL_FILES — ~/.netrc, ~/.pgpass, ~/.npmrc, ~/.pypirc

All 130 existing tests pass.
2026-05-03 08:49:13 -07:00
teknium1 3c59566cc5 chore(release): map leprincep35700 email for PR #18440 salvage 2026-05-03 08:47:49 -07:00
leprincep35700 b59bb4e351 fix(gateway): preserve home-channel thread targets across restart notifications 2026-05-03 08:47:49 -07:00
Teknium d87fd9f039 fix(goals): make /goal work in TUI and fix gateway verdict delivery (#19209)
/goal was silently broken outside the classic CLI.

TUI: /goal was routed through the HermesCLI slash-worker subprocess,
which set the goal row in SessionDB but then called
_pending_input.put(state.goal) — the subprocess has no reader for that
queue, so the kickoff message was discarded. No post-turn judge was
wired into prompt.submit either, so even a manual kickoff would not
continue the goal loop. Intercept /goal in command.dispatch instead,
drive GoalManager directly, and return {type: send, notice, message}
so the TUI client renders the Goal-set notice and fires the kickoff.
Run the judge in _run_prompt_submit after message.complete, surface
the verdict via status.update {kind: goal}, and chain the continuation
turn after the running guard is released.

Gateway: _post_turn_goal_continuation was gated on
hasattr(adapter, 'send_message'), but adapters only expose send().
That branch was dead on every platform — users never saw
'✓ Goal achieved', 'Continuing toward goal', or budget-exhausted
messages. Replace the dead call with adapter.send(chat_id, content,
metadata) and drop a broken reference to self._loop.

Tests:
- tests/tui_gateway/test_goal_command.py — full /goal dispatch matrix
  (set / status / pause / resume / clear / stop / done / whitespace)
  plus regressions for slash.exec → 4018 and 'goal' staying in
  _PENDING_INPUT_COMMANDS.
- tests/gateway/test_goal_verdict_send.py — locks in the adapter.send
  path for done / continue / budget-exhausted and verifies the hook
  no-ops when no goal is set or the adapter lacks send().
2026-05-03 05:49:12 -07:00
Teknium 55647a5813 fix(whatsapp): pin protobufjs >=7.5.5 via npm overrides to clear 3 critical vulns (#19204)
The whatsapp-bridge pulls @whiskeysockets/baileys at a pinned git
commit whose transitive dep tree ships protobufjs <7.5.5, triggering
GHSA-xq3m-2v4x-88gg (critical, arbitrary code execution). npm audit
reported 3 cascading criticals: protobufjs, @whiskeysockets/libsignal-node
(pulls protobufjs), and baileys itself (effect rollup).

Fix: add npm overrides block pinning protobufjs to ^7.5.5. Deduplicates
to a single 7.5.6 copy at node_modules/protobufjs that both libsignal-node
and any other consumers resolve through normal module resolution.

Why not bump baileys: npm-published baileys@6.17.16 is deprecated by the
maintainers (wrong version), 7.0.0-rc.* still pulls the same vulnerable
libsignal-node, and upstream Baileys HEAD adds a 4th vuln (music-metadata).
The override is the minimal, behavior-preserving fix.

Validation:
- npm audit: 3 critical -> 0 vulnerabilities
- node -e "import('@whiskeysockets/baileys')" -> all 5 named exports
  (makeWASocket, useMultiFileAuthState, DisconnectReason,
  fetchLatestBaileysVersion, downloadMediaMessage) resolve
- node bridge.js loads all modules and reaches Express bind
  (exits only on EADDRINUSE because the live gateway owns :3000)
- Single deduped protobufjs@7.5.6 in the tree
2026-05-03 05:22:30 -07:00
kshitijk4poor 6f2dab248a fix: update tests for resume_pending semantics + add AUTHOR_MAP entries
Tests updated to reflect suspend_recently_active now setting
resume_pending=True (preserves session) instead of suspended=True
(wipes session history).

AUTHOR_MAP entries: millerc79 (#19033), shellybotmoyer (#18915)
2026-05-03 03:54:03 -07:00
charliekerfoot 1148c46241 fix(gateway): correct ws scheme conversion for https urls 2026-05-03 03:54:03 -07:00
kshitijk4poor 7a22c639dc chore: add shellybotmoyer to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
Hermes Agent 934103476f fix(gateway): send /new response before cancel_session_processing to avoid race (#18912)
When /new is issued while an agent is actively processing, the confirmation response was never sent to the user because cancel_session_processing() was called before _send_with_retry(). Task cancellation side effects could silently drop the response.

Fix: reorder to send the response BEFORE cancelling the old task. Add logging at the send point (matching the pattern at line 2800 in _process_message_background) so future failures are visible.

Closes: #18912
2026-05-03 03:54:03 -07:00
kshitijk4poor bf3239472f chore: add millerc79 to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
millerc79 f1e0292517 fix(gateway): resume sessions after crash/restart instead of blanket suspend
suspend_recently_active() was unconditionally setting suspended=True on
startup, causing get_or_create_session() to wipe conversation history on
every restart. Change to set resume_pending=True instead, so sessions
auto-resume while still allowing stuck-loop escalation after 3 failures.
2026-05-03 03:54:03 -07:00
kshitijk4poor 0a97ce6bff chore: add nftpoetrist to AUTHOR_MAP 2026-05-03 03:47:49 -07:00
nftpoetrist 6c1322b997 fix(slack): close previous handler in connect() to prevent zombie Socket Mode connections
SlackAdapter.connect() overwrote self._handler, self._app, and
self._socket_mode_task without closing the prior AsyncSocketModeHandler
first. If connect() was called a second time on the same adapter (e.g.
during a gateway restart or in-process reconnect attempt), the old Socket
Mode websocket stayed alive. Both the old and new connections received
every Slack event and dispatched it twice — producing double responses
with different wording, the same bug that affected DiscordAdapter (#18187,
fixed in #18758).

Fix: add a close-before-reassign guard at the start of the connection
setup path, mirroring the guard DiscordAdapter.connect() already has.
When self._handler is None (fresh adapter, first connect()) the block is
a harmless no-op. Scoped to the handler/app fields only — no behavior
change for any path that does not call connect() twice.

Fixes #18980
2026-05-03 03:47:49 -07:00
kshitijk4poor c14bf441a3 chore: add 0xyg3n noreply email to AUTHOR_MAP 2026-05-03 03:44:55 -07:00
0xyg3n 19ba9e43b6 fix(gateway/discord): require allowlist auth on slash commands
Slash commands (_run_simple_slash, _handle_thread_create_slash) bypassed
every DISCORD_ALLOWED_* gate enforced by on_message. Any guild member
could invoke /background (RCE via terminal), /restart, /model, /skill,
etc. CVSS 9.8 Critical.

- _evaluate_slash_authorization mirrors on_message gates (user, role,
  channel, ignored channel) with fail-closed semantics
- _check_slash_authorization sends ephemeral reject + logs + admin alert
- Auth gate runs before defer() so rejections are ephemeral
- /skill autocomplete returns [] for unauthorized users (no catalog leak)
- Component views (ExecApproval, SlashConfirm, UpdatePrompt, ModelPicker)
  now honor role allowlists via shared _component_check_auth helper
- Optional DISCORD_HIDE_SLASH_COMMANDS defense-in-depth
- Cross-platform admin alert (Telegram/Slack fallback) on unauthorized attempts

Based on PR #18125 by @0xyg3n.
2026-05-03 03:44:55 -07:00
kshitijk4poor 5d5b8912be test: add tests for cmd_key preservation through name clamping
- TestClampCommandNamesTriples: unit tests for 3-tuple support in
  _clamp_command_names (short names, long names, collisions, multiple
  entries, backward compat with 2-tuples)
- TestDiscordSkillCmdKeyDispatch: integration test through the full
  discord_skill_commands pipeline verifying long skill names retain
  their original cmd_key after clamping
- Add contributor CharlieKerfoot to AUTHOR_MAP
2026-05-03 03:25:45 -07:00
charliekerfoot c4c0e5abc2 fix: After _clamp_command_names truncates skill names to fit the 32-cha… 2026-05-03 03:25:45 -07:00
kshitij 457c7b76cd feat(openrouter): add response caching support (#19132)
Enable OpenRouter's response caching feature (beta) via X-OpenRouter-Cache
headers. When enabled, identical API requests return cached responses for
free (zero billing), reducing both latency and cost.

Configuration via config.yaml:
  openrouter:
    response_cache: true       # default: on
    response_cache_ttl: 300    # 1-86400 seconds

Changes:
- Add openrouter config section to DEFAULT_CONFIG (response_cache + TTL)
- Add build_or_headers() in auxiliary_client.py that builds attribution
  headers plus optional cache headers based on config
- Replace inline _OR_HEADERS dicts with build_or_headers() at all 5 sites:
  run_agent.py __init__, _apply_client_headers_for_base_url(), and
  auxiliary_client.py _try_openrouter() + _to_async_client()
- Add _check_openrouter_cache_status() method to AIAgent that reads
  X-OpenRouter-Cache-Status from streaming response headers and logs
  HIT/MISS status
- Document in cli-config.yaml.example
- Add 28 tests (22 unit + 6 integration)

Ref: https://openrouter.ai/docs/guides/features/response-caching
2026-05-03 01:54:24 -07:00
Teknium 9b5b88b5e0 chore: add MottledShadow to AUTHOR_MAP 2026-05-03 01:51:33 -07:00
MottledShadow a22465e07a fix(weixin): send_weixin_direct cross-loop session check
When send_message tool is called from inside a running gateway, the
_run_async bridge spawns a worker thread with a separate event loop.
send_weixin_direct then reuses the live adapter's aiohttp session
which was created on the gateway's main loop.  aiohttp's TimerContext
checks asyncio.current_task(loop=session._loop) and sees None because
we're executing on the worker thread's loop → raises 'Timeout context
manager should be used inside a task'.

Fix: skip the live-adapter shortcut when the session belongs to a
different event loop, falling through to the fresh-session path.
2026-05-03 01:51:33 -07:00
Henkey 9987f3d824 fix(acp): compact Zed tool replay rendering 2026-05-03 01:44:23 -07:00
Henkey 19854c7cd2 Schedule ACP history replay and fence file output 2026-05-03 01:44:23 -07:00
Henkey eb612f5574 fix(acp): keep web extract rendering compact 2026-05-03 01:44:23 -07:00
Henkey b294d1d022 fix(acp): keep read-file starts compact 2026-05-03 01:44:23 -07:00
Henkey 72c8037a24 fix(acp): polish common tool rendering 2026-05-03 01:44:23 -07:00
Henkey ef9a08a872 fix(acp): polish Zed context and tool rendering 2026-05-03 01:44:23 -07:00
Henkey e26f9b2070 fix(acp): route Zed thoughts to reasoning callbacks 2026-05-03 01:44:23 -07:00
helix4u 4f37669170 fix(tools): reconfigure enabled unconfigured toolsets 2026-05-03 00:33:02 -07:00
helix4u d409a4409c fix(model): avoid bedrock credential probe in provider picker 2026-05-03 00:32:55 -07:00
Siddharth Balyan 5d3be898a8 docs(tts): mention xAI custom voice support (#18776)
Point users to xAI's custom voices feature — clone your voice in the
console, paste the voice_id into tts.xai.voice_id. No code changes
needed; the existing TTS pipeline already handles arbitrary voice IDs.

- config.py: link to xAI custom voices docs in voice_id comment
- setup.py: prompt accepts custom voice IDs during xAI TTS setup
- tts.md: short section linking to xAI console and docs
2026-05-02 16:08:01 +05:30
liuhao1024 af98122793 fix(auxiliary): propagate explicit_api_key to _try_openrouter()
When resolve_provider_client() passes explicit_api_key for OpenRouter auxiliary
tasks, _try_openrouter() now accepts and honors this parameter instead of
silently ignoring it and falling back to OPENROUTER_API_KEY env var.

Root cause: _try_openrouter() had no explicit_api_key parameter, so even
when callers wanted to pass a runtime credential pool key, it could not be used.

Fix:
- Add explicit_api_key: str = None parameter to _try_openrouter()
- Prioritize explicit_api_key over pool key and env var
- Update resolve_provider_client() call site to pass explicit_api_key

Regression coverage:
- Test that explicit_api_key is passed to OpenAI client when provided
- Test that fallback to OPENROUTER_API_KEY still works when explicit_api_key is None

Closes #18338
2026-05-02 02:27:49 -07:00
teknium1 73bcd83dba chore(release): map beibi9966 email for AUTHOR_MAP
Follow-up for PR #18502 salvage.
2026-05-02 02:23:37 -07:00
teknium1 762eb79f1e fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451)
Two mitigations for the CLOSE_WAIT accumulation reported against QQ Bot
+ Feishu on macOS behind Cloudflare Warp.

1. Shared httpx.Limits helper (gateway/platforms/_http_client_limits.py).
   Every long-lived platform adapter now constructs httpx.AsyncClient
   with max_keepalive_connections=10 and keepalive_expiry=2.0, vs httpx's
   default of unbounded keepalive pool and 5.0s expiry. On macOS/Warp the
   default 5s window let idle keepalive sockets sit in CLOSE_WAIT long
   enough for seven persistent adapters (QQ Bot, WeCom, DingTalk, Signal,
   BlueBubbles, WeCom-callback, plus the transient Feishu helper) to
   compound to the 256-fd ulimit. Tunable via
   HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY and
   HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE env vars.

2. whatsapp.send_typing aiohttp leak. The call was
   'await self._http_session.post(...)' with no 'async with' and no
   variable capture — the ClientResponse went out of scope unclosed,
   holding its TCP socket in CLOSE_WAIT until GC. Fixed by wrapping in
   'async with'. This was the only bare-await aiohttp leak in the
   gateway/tools/plugins tree per audit; all other aiohttp sites use
   the context-manager pattern correctly.

The underlying reporter also saw Feishu SDK (lark-oapi) connections in
CLOSE_WAIT — those are inside the SDK and out of our direct control, but
tightening httpx keepalive across adapters reduces the aggregate pool
pressure regardless of which individual adapter leaks.
2026-05-02 02:23:37 -07:00
beibi9966 38dd057e91 fix(feishu): finalize remote document downloads inside httpx.AsyncClient context (#18502)
Snapshot Content-Type and body while the client context is still
active so pooled connections fully release on exit. Previously the
read happened after `async with httpx.AsyncClient(...)` returned —
which works today only because httpx eagerly buffers non-streaming
responses; a future refactor to `.stream()` would silently read-
after-close.

Part of the #18451 connection-hygiene audit. Salvage of #18502.
2026-05-02 02:23:37 -07:00
Teknium e444d8f29c fix(gateway): config.yaml wins over .env for agent/display/timezone settings (#18764)
Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.
2026-05-02 02:14:35 -07:00
luyao618 13f344c5ce fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929)
When a provider's credential pool has a single entry in 429-cooldown,
resolve_provider_client returns None and AIAgent.__init__ raises a
misleading RuntimeError suggesting the API key is missing — even when
valid fallback_providers are configured.

This patch makes __init__ iterate the fallback chain before raising,
mirroring the existing in-flight fallback logic in the request loop.
If a fallback resolves, the agent initializes against it and sets
_fallback_activated=True so _restore_primary_runtime can pick the
primary back up after cooldown.

Closes #17929
2026-05-02 02:09:46 -07:00
Teknium 1dce908930 fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761)
* fix(gateway): config.yaml wins over .env for agent/display/timezone settings

Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.

* fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log)

Three issues observed in production gateway.log during a rapid restart
chain on 2026-05-02, all fixed here.

1. _send_restart_notification logged unconditional success
   adapter.send() catches provider errors (e.g. Telegram 'Chat not found')
   and returns SendResult(success=False); it never raises. The caller
   ignored the return value and always logged 'Sent restart notification
   to <chat>' at INFO, producing a misleading success line directly
   below the 'Failed to send Telegram message' traceback on every boot.
   Now inspects result.success and logs WARNING with the error otherwise.

2. WhatsApp bridge SIGTERM on shutdown classified as fatal error
   _check_managed_bridge_exit() saw the bridge's returncode -15 (our own
   SIGTERM from disconnect()) and fired the full fatal-error path,
   producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus
   'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every
   planned shutdown, immediately before the normal '✓ whatsapp
   disconnected'. Adds a _shutting_down flag that disconnect() sets
   before the terminate, and _check_managed_bridge_exit() returns None
   for returncode in {0, -2, -15} while shutting down. OOM-kill (137)
   and other non-signal exits still hit the fatal path.

3. restart_drain_timeout default 60s → 180s
   On 2026-05-02 01:43:27 a user /restart fired while three agents were
   mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget
   expired and all three were force-interrupted. 180s covers realistic
   in-flight agent turns; users on very-long-reasoning models can still
   raise it further via agent.restart_drain_timeout in config.yaml.
   Existing explicit user values are preserved by deep-merge.

Tests
- tests/gateway/test_restart_notification.py: two new tests assert INFO
  is only logged on SendResult(success=True) and WARNING with the error
  string is logged on SendResult(success=False).
- tests/gateway/test_whatsapp_connect.py: parametrized test for
  returncode in {0, -2, -15} proves shutdown-time exits are suppressed;
  separate test proves returncode 137 (SIGKILL/OOM) still surfaces as
  fatal even when _shutting_down is set.
- _check_managed_bridge_exit() reads _shutting_down via getattr-with-
  default so existing _make_adapter() test helpers that bypass __init__
  (pitfall #17 in AGENTS.md) keep working unmodified.
2026-05-02 02:08:06 -07:00
teknium1 50f9f389ec chore(release): map ambition0802 email for AUTHOR_MAP
Follow-up for PR #17939 salvage.
2026-05-02 02:07:14 -07:00
ambition0802 7696ddc59e fix(cli): robust paste file expansion and process_loop error handling (#17666)
Two narrow fixes for long pasted messages silently disappearing:

1. _expand_paste_references: replace path.exists() + read_text() with
   try/except (OSError, IOError). Closes the TOCTOU window where a paste
   file deleted between check and read raised FileNotFoundError, bubbled
   up through process_loop's outer except, and silently dropped the
   user's input. Failures now return the placeholder text and log a
   warning.

2. process_loop outer except: logger.warning() instead of print().
   prompt_toolkit's TUI swallows stdout, so 'Error: …' was invisible
   to the user. Logged errors are discoverable via hermes logs.

Dropped the larger interrupt_queue→pending_input drain that was part of
the original PR — that's a separate class of input-drop (in-progress
interrupt handling) unrelated to the paste-file TOCTOU reported in the
issue, and worth its own review.

Salvage of #17939.
2026-05-02 02:07:14 -07:00
Teknium 5eac6084bc fix(discord): warn on 32-char clamp collisions in the /skill collector (#18759)
Discord's per-command name limit is 32 chars. When two skill slugs
share the same first 32 chars (or a skill slug clamps onto a reserved
gateway command name), only the first seen wins — the second is
dropped from the /skill autocomplete. The old behavior incremented a
``hidden`` counter silently, so skill authors had no way to discover
the drop short of noticing their skill was missing from the picker.

Not an actively-biting bug today (no collisions on the default catalog
as of 2026-05), but a landmine the moment someone ships a skill with a
long name. The earlier series in #18745 / #18753 / #18754 dropped the
other silent data-loss paths in the Discord /skill collector; this one
lights up the last remaining one.

Fix: promote ``_names_used`` from a set to a dict keyed by the clamped
name, mapping to the source cmd_key (or a ``"<reserved>"`` sentinel
for names inherited via ``reserved_names``). On collision, log a
WARNING naming both sides — the winner, the loser, the clamped name,
and what to rename.

Two phrasings:

* skill-vs-skill — "both clamp to X on Discord's 32-char command-name
  limit; only the winner appears in /skill. Rename one skill's
  frontmatter ``name:`` to differ in its first 32 chars."
* skill-vs-reserved — "collides with a reserved gateway command name;
  the skill will not appear in /skill. Rename the skill's frontmatter
  ``name:``."

Tests: three cases in
``tests/hermes_cli/test_discord_skill_clamp_warning.py`` —
skill-vs-skill collision (warning names both cmd_keys + clamped prefix),
skill-vs-reserved collision (warning uses the distinct phrasing), and a
no-collision negative (zero warnings emitted).
2026-05-02 02:05:01 -07:00
teknium1 e363ced3c3 test(discord): regression coverage for zombie-websocket guard in connect()
Covers PR #18224 fix for issue #18187 — when DiscordAdapter.connect() is
called a second time without an intervening disconnect(), the previous
commands.Bot must be closed before a new one is created. Otherwise both
websockets stay connected to Discord's gateway and both fire on_message,
producing double responses with different wording.
2026-05-02 02:04:14 -07:00
luyao618 292d2fb42f fix(discord): close old client before reconnect to prevent zombie websockets (#18187)
When DiscordAdapter.connect() is called during reconnect, it creates a new
commands.Bot client without closing the previous one. The old client's
websocket remains connected to Discord's gateway, causing both to fire
on_message for every incoming event — resulting in double responses.

Fix: before creating a new Bot instance, check if a previous client exists
and close it. This ensures only one websocket connection is active at any
time.

Closes #18187
2026-05-02 02:04:14 -07:00
teknium1 0a6865b328 test(credential_pool): regression coverage for .env vs os.environ precedence
Covers PR #18256 fix for issue #18254 — when OPENROUTER_API_KEY is set in
BOTH os.environ (stale from parent shell) and ~/.hermes/.env (fresh),
_seed_from_env must prefer the .env value. Also guards the fallback case
where .env omits the key entirely (Docker/K8s/systemd deployments that
only inject via runtime env).
2026-05-02 02:00:32 -07:00
teknium1 9c626ef8ea chore(release): map franksong2702 email for AUTHOR_MAP
Follow-up for PR #18256 salvage.
2026-05-02 02:00:32 -07:00
Frank Song 2ef1ad280b fix: prefer ~/.hermes/.env over os.environ when seeding credential pool
When _seed_from_env() reads API keys to populate the credential pool, it
should treat ~/.hermes/.env as the authoritative source — not os.environ.
Stale env vars inherited from parent shell processes (Codex CLI, test
scripts, etc.) can shadow deliberate changes to the .env file, causing
auth.json to cache an outdated key that leads to silent 401 errors.

This is especially visible with OpenRouter: if a parent process exported
OPENROUTER_API_KEY=test-key-fresh and the user later updates .env with a
valid key, restarting Hermes still picks up the stale os.environ value,
writes it back to auth.json, and all API calls fail with 401.

Fixes #18254
2026-05-02 02:00:32 -07:00
Teknium 10297fa23c fix(discord): /reload-skills now refreshes the /skill autocomplete live (#18754)
`_register_skill_group` captured the skill catalog in closure variables
(`entries` and `skill_lookup`) so the single `tree.add_command` call at
startup owned the only live copy. The closure is never re-entered after
startup, so `/reload-skills` — which rescans the on-disk skills dir and
refreshes the in-process `_skill_commands` registry — had no way to
propagate results into the `/skill` autocomplete on Discord. New skills
stayed invisible in the dropdown, and deleted skills returned
"Unknown skill" when the stale autocomplete entry was clicked.

The fix is purely a dataflow change: promote `entries` and `skill_lookup`
to instance attributes (`_skill_entries`, `_skill_lookup`), split the
collector-driven rebuild into a helper (`_refresh_skill_catalog_state`),
and add a public `refresh_skill_group()` method that re-runs the helper
and is safe to call at any point after the initial registration.

The gateway's `_handle_reload_skills_command` then iterates
`self.adapters` and calls `refresh_skill_group()` on any adapter that
exposes it (currently only Discord). Both sync and async implementations
are supported; adapters that don't override the method (Telegram's
BotCommand menu, Slack subcommand map, etc.) are silently skipped — the
in-process `reload_skills()` call covers them.

No `tree.sync()` is required because Discord fetches autocomplete
options dynamically on every keystroke — mutating the instance state the
callbacks already read from is sufficient. That sidesteps the per-app
command-bucket rate limit (~5 writes / 20 s) that made the previous
bulk-sync-on-reload approach unusable (#16713 context).

Tests: tests/gateway/test_reload_skills_discord_resync.py — five cases
covering (1) refresh replaces entries, (2) entries stay sorted after
refresh, (3) collector exception leaves cached state intact, (4)
`_refresh_skill_catalog_state` populates the instance attrs, (5)
orchestrator calls `refresh_skill_group()` on sync + async adapters and
skips adapters that don't expose it.
2026-05-02 02:00:11 -07:00
Teknium 6ec74aec07 fix(gateway): match disabled/optional skills by frontmatter slug, not dir name (#18753)
_check_unavailable_skill is meant to turn a typed "/foo" command that
doesn't resolve into a specific hint — "disabled, enable with hermes
skills config" or "available but not installed, install with hermes
skills install …" — instead of the generic "unknown command" reply.

It was doing the match with `skill_md.parent.name.lower().replace("_", "-")`,
comparing that to the typed command. For every skill whose directory name
drifted from its declared frontmatter `name:`, that comparison failed and
the user got the unhelpful generic path. On a standard install today 19
skills have this drift, e.g.:

  dir: mlops/stable-diffusion
  frontmatter: name: Stable Diffusion Image Generation
  registered slug (what the user types): /stable-diffusion-image-generation

  dir: mlops/qdrant
  frontmatter: name: Qdrant Vector Search
  registered slug: /qdrant-vector-search

  dir: mlops/flash-attention
  frontmatter: name: Optimizing Attention Flash
  registered slug: /optimizing-attention-flash

In every case, _check_unavailable_skill would fall through because
"stable-diffusion" != "stable-diffusion-image-generation", even with the
skill sitting right there on disk.

Fix: extract a small `_skill_slug_from_frontmatter` helper that reads the
SKILL.md frontmatter and normalizes exactly like scan_skill_commands
(lower, spaces/underscores → hyphens, strip non-[a-z0-9-], collapse
runs of hyphens, strip edges). Use it in both the
disabled-skills branch and the optional-skills branch. The disabled-set
membership check now uses the declared frontmatter name (which is what
`hermes skills config` writes into skills.disabled / platform_disabled),
not the slug.

Tests: five cases in tests/gateway/test_unavailable_skill_hint.py —
the drift case for the disabled branch, unknown-command negative,
matched-but-not-disabled negative, non-alnum stripping, and the drift
case for the optional-skills branch. All five fail against main and
pass with the fix.
2026-05-02 02:00:09 -07:00
Teknium 8825e9044c fix(discord): complete #18741 for /skill autocomplete and drop legacy 25x25 caps (#18745)
``discord_skill_commands_by_category`` was lagging the flat
``discord_skill_commands`` collector on two counts. Both were actively
dropping skills from Discord's ``/skill`` autocomplete dropdown.

1. External-dir skills were filtered out. #18741 widened the flat
   collector to accept ``SKILLS_DIR + skills.external_dirs`` but left
   this sibling collector — the one ``_register_skill_group`` actually
   uses on Discord — still matching ``SKILLS_DIR`` only. External
   skills were visible in ``hermes skills list`` and the agent's
   ``/skill-name`` dispatch but silently absent from Discord's
   ``/skill`` picker. Widen the accepted roots to match, and derive
   categories from whichever root the skill lives under so
   ``<ext>/mlops/foo/SKILL.md`` still lands in the ``mlops`` group.

2. 25-group × 25-subcommand caps were still applied. PR #11580
   refactored ``/skill`` to a flat autocomplete (whose options Discord
   fetches dynamically — no per-command payload concern) and its
   docstring promises "no hidden skills." The collector kept the old
   nested-layout caps anyway, silently dropping anything past the 25th
   alphabetical category. On installs with 29 category dirs today (real
   example: tail categories ``social-media``, ``software-development``,
   ``yuanbao`` going missing) this was biting immediately. Remove the
   caps; ``hidden`` now reports only 32-char name-clamp collisions
   against reserved names.

Tests: guard both behaviors. ``test_no_legacy_25x25_cap`` builds 30
categories × 30 skills each and asserts all 900 are returned.
``test_external_dirs_skills_included`` monkeypatches
``get_external_skills_dirs`` and asserts an external-dir skill makes
it into the result grouped under its own top-level directory.
2026-05-02 02:00:06 -07:00
Jacob Lizarraga 2470434d60 fix(telegram): probe polling liveness after reconnect to detect wedged Updater
After a transient Telegram 502, _handle_polling_network_error's
stop()+start_polling() cycle can leave PTB's Updater with `running=True`
but a wedged consumer task that never makes progress. No error_callback
fires in that state, so the reconnect ladder never advances past attempt
1, the MAX_NETWORK_RETRIES fatal-error path is never reached, and the
gateway sits silent indefinitely.

Schedule a heartbeat probe (60s after a successful reconnect) that
verifies Updater.running is still True and bot.get_me() responds within
a tight asyncio.wait_for timeout. Either failure feeds back into the
reconnect ladder so the existing escalation path fires.

No PTB-internal coupling, no Application rebuild — minimal additive
defense inside the existing reconnect abstraction.

Tests cover healthy / Updater non-running / probe timeout / probe
network error / already-fatal cases, plus an integration check that the
probe is actually scheduled after a successful start_polling().

Closes the silent-wedge case observed in the wild after a transient
Telegram 502; existing reconnect tests updated to mock bot.get_me() now
that the success path schedules a heartbeat probe.
2026-05-02 01:55:04 -07:00
liuhao1024 9bf260472b fix(tools): deduplicate tool names at API boundary for Vertex/Azure/Bedrock
Providers like Google Vertex, Azure, and Amazon Bedrock reject API
requests with duplicate tool names (HTTP 400: 'Tool names must be
unique').  The upstream injection paths in run_agent.py already dedup
after PR #17335, but two API-boundary functions pass tools through
without checking:

- agent/auxiliary_client.py: _build_call_kwargs() (all non-Anthropic
  providers in chat_completions mode)
- agent/anthropic_adapter.py: convert_tools_to_anthropic() (Anthropic
  Messages API path)

Add defensive dedup guards at both sites.  Duplicates are dropped with
a warning log, converting a hard 400 failure into a recoverable
condition.  This is intentionally conservative — the root-cause dedup
in run_agent.py is the primary defense; these guards add resilience
against future injection-path regressions.

Includes 8 new tests covering unique passthrough, duplicate removal,
empty/None edge cases.

Closes #18478
2026-05-02 01:51:51 -07:00
Teknium 699b3679bc fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746)
When HERMES_HOME is unset but ~/.hermes/active_profile names a non-default
profile, any data this process writes lands in the default profile — not the
one the operator expects. Before this change the fallback was silent, so
cross-profile contamination (#18594) was invisible until a user noticed
their memory/state ended up in the wrong place.

Now we emit a one-shot warning to stderr the first time this happens in
a process. No raise — there are 30+ module-level callers of get_hermes_home()
and raising from any of them would brick import. Behavior is otherwise
unchanged; subprocess spawners (systemd template, kanban dispatcher, docker
entrypoint) already propagate HERMES_HOME correctly.

Bypasses logging.getLogger() because this runs before logging is configured
in a significant fraction of callers (module import time).

Refs #18594. Credit to @liuhao1024 for surfacing the silent-fallback case
in PR #18600; we kept the diagnostic signal without the import-time raise.
2026-05-02 01:49:55 -07:00
teknium1 98c98821ff chore(release): map CoreyNoDream email for AUTHOR_MAP
Follow-up for PR #18721 salvage.
2026-05-02 01:40:31 -07:00
CoreyNoDream c5e3a6fb5b fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows
Path.read_text() uses the system locale by default. On Windows CN/JP/KR
locales (GBK/CP932/CP949), reading a UTF-8 .env raises UnicodeDecodeError
as soon as it contains any non-ASCII byte (e.g. an em dash).

Pin encoding="utf-8" on every .env read in hermes_cli to match how the
rest of the codebase (load_dotenv at doctor.py:26) already decodes it.

Adds a regression test that monkeypatches Path.read_text to simulate a
GBK locale and asserts 'hermes doctor' no longer raises.

Refs #18637
2026-05-02 01:40:31 -07:00
Teknium e2cea6eeba fix(gateway): include external_dirs skills in Telegram/Discord slash commands (#18741)
Skills configured through `skills.external_dirs` in config.yaml were
visible via `hermes skills list`, `get_skill_commands()`, and the
agent's `/skill-name` dispatch, but silently excluded from the
Telegram and Discord slash-command menus. The filter in
`_collect_gateway_skill_entries` only accepted skills whose
`skill_md_path` started with `SKILLS_DIR`, so anything under an
external directory fell through.

Widen the accepted-prefix set to include all configured external
dirs alongside the local skills dir. Every prefix is now
slash-terminated so `/my-skills` cannot also admit
`/my-skills-extra`. Also guard against empty `skill_md_path`
values so they can't accidentally match.

Fixes #8110

Salvages #8790 by luyao618.

Co-authored-by: Yao <34041715+luyao618@users.noreply.github.com>
2026-05-02 01:36:57 -07:00
Teknium c73594fe41 fix(skills): rescan skill_commands cache when platform scope changes (#18739)
The process-global `_skill_commands` dict in agent/skill_commands.py
was seeded by whichever platform scanned first, and
`get_skill_commands()` only rescanned when the cache was empty. In a
long-lived gateway process serving multiple platforms (Telegram +
Discord + Slack), the first platform's
`skills.platform_disabled` view was silently inherited by the
others — so a skill disabled for Telegram would also disappear from
Discord's slash menu, and vice versa.

Track the platform scope the cache was populated for
(`_skill_commands_platform`) and rescan in `get_skill_commands()`
when the currently-active platform no longer matches. Platform
resolution uses the same precedence as `_is_skill_disabled`:
`HERMES_PLATFORM` env var then `HERMES_SESSION_PLATFORM` from the
gateway session context.

Fixes #14536

Salvages #14570 by LeonSGP43.

Co-authored-by: LeonSGP <leon@sgp43.com>
2026-05-02 01:36:53 -07:00
Teknium 97acd66b4c fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)
* fix(curator): authoritative absorbed_into declarations on skill delete

Closes #18671. The classification pipeline that feeds cron-ref rewriting
used to infer consolidation vs pruning from two brittle signals: the
curator model's post-hoc YAML summary block, and a substring heuristic
scanning other tool calls for the removed skill's name. Both miss in
real consolidations — the model forgets the YAML under reasoning
pressure, and the heuristic misses when the umbrella's patch content
describes the absorbed behavior abstractly instead of naming the old
slug. When both miss, the skill falls through to 'no-evidence fallback'
pruned, and #18253's cron rewriter drops the cron ref entirely instead
of mapping it to the umbrella. Same observable symptom as pre-#18253:
'Skill(s) not found and skipped' at the next cron run.

The fix makes the model declare intent at the moment of deletion.
skill_manage(action='delete') now accepts absorbed_into:
  - absorbed_into='<umbrella>'  -> consolidated, target must exist on disk
  - absorbed_into=''            -> explicit prune, no forwarding target
  - missing                     -> legacy path, falls through to heuristic/YAML

The curator reconciler reads these declarations off llm_meta.tool_calls
BEFORE either the YAML block or the substring heuristic. Declaration
wins. Fallback logic stays intact for backward compat with any caller
(human or older curator conversation) that doesn't populate the arg.

Changes
- tools/skill_manager_tool.py: add absorbed_into param to skill_manage
  + _delete_skill. Validate target exists when non-empty. Reject
  absorbed_into=<self>. Wire through dispatcher + registry + schema.
- agent/curator.py: new _extract_absorbed_into_declarations() walks
  tool calls for skill_manage(delete) with the arg. _reconcile_classification
  accepts absorbed_declarations= and treats them as authoritative. Curator
  prompt updated to require the arg on every delete.
- Tests: 7 new skill_manager tests covering the tool contract (valid
  target, empty string, nonexistent target, self-reference, whitespace,
  backward compat, dispatcher plumbing). 11 new curator tests covering
  the extractor + authoritative reconciler path + mixed-legacy-and-
  declared runs.

Validation
- 307/307 targeted tests pass (curator + cron + skill_manager suites).
- E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing
  all 3. Model emits NO YAML block. Heuristic misses (patch prose
  doesn't name old slugs). Delete calls carry absorbed_into. Result:
  both PR skills correctly classified 'consolidated' + cron rewritten
  ['pr-review-format', 'pr-review-checklist', 'stale-junk'] ->
  ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''.
- E2E backward-compat: delete without absorbed_into, model emits YAML
  -> routed via existing 'model' source, cron still rewritten correctly.

* feat(curator): capture + restore cron skill links across snapshot/rollback

Before this, rolling back a curator run restored the skills tree but cron
jobs still pointed at the umbrella skills the curator had rewritten them
to. The user would see their old narrow skills back on disk but their
cron jobs still configured with the merged umbrella — not actually 'back
to how it was'.

Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json
alongside the skills tarball, as cron-jobs.json. The manifest gets a new
'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI
confirm dialog) can surface what's in the snapshot. If jobs.json is
missing/unreadable/malformed, snapshot proceeds without cron data — the
skills backup is the core guarantee; cron is additive.

Rollback side: after the skills extract succeeds, the new
_restore_cron_skill_links() reconciles the backed-up jobs into the live
jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and
only on jobs matched by id. Everything else about a cron job — schedule,
last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live
state the user or scheduler has modified since the snapshot; overwriting
it would regress unrelated activity.

Reconciliation rules:
- Job in backup AND live, skills differ  → skills restored.
- Job in backup AND live, skills match   → no-op.
- Job in backup, NOT in live             → skipped (user deleted it
                                              after snapshot; their choice
                                              is later than the snapshot).
- Job in live, NOT in backup             → untouched (user created it
                                              after snapshot).
- Snapshot missing cron-jobs.json at all → rollback still succeeds,
                                              reports 'not captured'
                                              (older pre-feature snapshots
                                              keep working).

Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the
scheduler uses, so rollback doesn't race tick().

Also:
- hermes_cli/curator.py: rollback confirm dialog now shows
  'cron jobs: N (will be restored for skill-link fields only)' when the
  snapshot has cron data, or 'not in snapshot (<reason>)' otherwise.
- rollback()'s message string includes a 'cron links: ...' clause
  summarizing the reconciliation outcome.

Tests
- 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json
  captured-as-raw, full rollback-restores-skills-and-cron, rollback
  touches only skill fields, rollback skips user-deleted jobs, rollback
  leaves user-created jobs untouched, rollback still works with
  pre-feature snapshot that has no cron-jobs.json, standalone unit test
  on _restore_cron_skill_links exercising the full report shape.

Validation
- 484/484 targeted tests pass (curator + cron + skill_manager suites).
- E2E: real snapshot_skills, real cron rewrite, real rollback. Before:
  ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage'].
  After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format',
  'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id,
  name, prompt) preserved across the round trip.
2026-05-02 01:29:57 -07:00
Siddharth Balyan f98b5d00a4 fix: gateway systemd unit now retries indefinitely with backoff (#18639)
The old defaults (StartLimitIntervalSec=600, StartLimitBurst=5,
RestartSec=30) meant any network outage over ~5 minutes would
permanently kill the gateway until manual intervention.

Changes:
- StartLimitIntervalSec=0 (never give up)
- Restart=always (not just on-failure)
- RestartSec=60 with RestartMaxDelaySec=300, RestartSteps=5
  (exponential backoff: 60 → 120 → 180 → 240 → 300s cap)
- After=network-online.target + Wants= (both units now wait for
  actual connectivity, not just network.target)

Power outage → internet down → internet back = auto-recovery.
2026-05-02 08:51:30 +05:30
Siddharth Balyan 585d6778da fix: allow WebSocket connections from non-loopback IPs in --insecure mode (#18633)
When the dashboard is bound to 0.0.0.0 with --insecure (e.g. behind
Tailscale Serve), WebSocket endpoints (/api/pty, /api/ws, /api/pub,
/api/events) rejected connections from non-loopback client IPs with
code 4403 — causing 'events feed disconnected' in the UI.

Extract the repeated loopback check into _ws_client_is_allowed() which
respects the public bind flag. Session token auth still guards all
endpoints regardless of bind mode.
2026-05-02 08:17:45 +05:30
kshitijk4poor f903ceece0 chore: add contributors to AUTHOR_MAP for Slack batch salvage
Adds email→username mappings for:
- priveperfumes (PR #18456)
- amroessam (PR #17798)
- Hinotoi-agent (PR #9361)
- valda (PR #14932)
2026-05-01 14:01:26 -07:00
Amr Essam d05a87e686 fix(gateway): clear slack assistant thread status 2026-05-01 14:01:26 -07:00
hinotoi-agent a147164d3c fix(slack): preserve per-user slash-command session isolation 2026-05-01 14:01:26 -07:00
nightq 5cdc39e29a fix(gateway): preserve case-sensitive chat IDs in DeliveryTarget.parse
Fixes NousResearch/hermes-agent#11768

Root cause: target.strip().lower() was lowercasing the entire target string,
corrupting case-sensitive chat IDs like Slack C123ABC and Matrix !RoomABC.

Fix: Only lowercase the platform prefix for case-insensitive matching;
preserve the original case for chat_id and thread_id values.
2026-05-01 14:01:26 -07:00
YAMAGUCHI Seiji 2b3923ff13 fix(gateway): coerce scalar free_response_channels to str before split
YAML loads a bare numeric value such as
    discord:
      free_response_channels: 1491973769726791812
as an int.  _discord_free_response_channels() / _slack_free_response_channels()
checked `isinstance(raw, list)` and `isinstance(raw, str)` in that order and
then fell through to `return set()`, so a single-channel config that happened
to be unquoted was silently dropped with no log line — the bot kept demanding
@mentions even though the channel was configured to free-response.

A multi-channel value like `1234567890,9876543210` does not trip this because
the comma forces YAML to parse it as a string.  Single-channel configs are
the only case that breaks, which is exactly the footgun that's hardest to
diagnose (the config "looks right" and the feature just doesn't activate).

Note that the old-schema env-var bridge at gateway/config.py:614+ already
runs `str(frc)` when forwarding to SLACK_/DISCORD_FREE_RESPONSE_CHANNELS,
so the env-var fallback worked.  The bug only surfaces on the
`config.extra["free_response_channels"]` path populated by the `platforms:`
bridge at gateway/config.py:576, which passes the raw YAML value through
unchanged.

Fix at the reader: treat any non-list value as a scalar, coerce with str(),
then apply the same CSV split semantics.  This keeps the public contract
stable (list or str-like continues to work identically) while accepting
the ints that the YAML loader is free to hand us.

Added tests for both Discord and Slack covering:
  - bare int value in config.extra
  - list of ints in config.extra
2026-05-01 14:01:26 -07:00
Prive FE Coder a717199bbf fix(slack): exclude reserved Slack commands from native slash manifest
Slack has built-in slash commands (e.g. /status, /me, /join) that apps
cannot register. When running `hermes slack manifest --write`, the
generated manifest included /status, causing Slack to reject the entire
manifest with a reserved-command error.

Add _SLACK_RESERVED_COMMANDS frozenset of all known Slack built-ins and
skip them in slack_native_slashes(). Affected commands remain reachable
via /hermes <command>.

Tests updated:
- New test_excludes_slack_reserved_commands validates no leaks
- test_includes_canonical_commands no longer asserts /status
- test_telegram_parity accounts for expected Slack-only exclusions
2026-05-01 14:01:26 -07:00
kshitijk4poor 8fcc160f6b fix(gateway/slack): review fixes — scope ephemeral to commands, user isolation
Self-review fixes for the slash ephemeral ack:

- Only stash response_url when text starts with '/' (gateway command).
  Free-form questions via '/hermes <question>' must produce public agent
  replies visible to the whole channel, not ephemeral.
- Use a ContextVar (_slash_user_id) to thread the invoking user's ID
  from _handle_slash_command through to send().  _pop_slash_context now
  matches the exact (channel_id, user_id) key when the ContextVar is
  set, preventing concurrent users on the same channel from stealing
  each other's ephemeral context.  ContextVars propagate to child
  asyncio.Tasks, so the value survives through handle_message →
  _process_message_background → _send_with_retry → send().
- Add truncate_message() in _send_slash_ephemeral to prevent silent
  failures on long responses (response_url has the same ~40k limit).
- Log send_private_notice failures at debug level instead of bare
  except/pass — aids diagnostics without spamming.
- Document app_mention dedup dependency on shared event ts.
- Add tests: free-form question must NOT stash context, concurrent
  users on the same channel get isolated contexts, non-slash send()
  path fallback behavior.
2026-05-01 13:33:06 -07:00
kshitijk4poor f34d298495 chore: add probepark to AUTHOR_MAP
Required for contributor_audit.py strict mode on the salvaged
PR #9340 commit.
2026-05-01 13:33:06 -07:00
probepark 0ab2d752ff feat(gateway): private notice delivery and Slack format_message fixes
Adds platform-level private notice delivery abstraction so operational
messages (e.g. sethome prompt) can be sent ephemerally on Slack when
configured with `slack.notice_delivery: private`.

Changes:
- gateway/config.py: _normalize_notice_delivery() + GatewayConfig.get_notice_delivery()
  with per-platform config bridging
- gateway/platforms/base.py: send_private_notice() default implementation
  (falls through to send())
- gateway/platforms/slack.py: send_private_notice() via chat_postEphemeral
- gateway/run.py: _deliver_platform_notice() helper replaces direct
  adapter.send() for the sethome notice, with private→public fallback
- gateway/platforms/slack.py: app_mention handler now forwards to
  _handle_slack_message (safe due to ts-based dedup) instead of no-op pass,
  fixing edge-case Slack configs where mentions arrive only as app_mention
- gateway/platforms/slack.py format_message: negative lookbehind prevents
  markdown images (![]()) from becoming broken Slack links; italic regex
  now requires non-whitespace boundaries so 'a * b * c' stays literal

Based on PR #9340 by @probepark.
2026-05-01 13:33:06 -07:00
kshitijk4poor 7cda0e5224 fix(gateway/slack): ephemeral ack and routing for slash commands
Slack slash commands (/q, /btw, /stop, /model, etc.) previously showed
no user-visible acknowledgement and posted command replies as public
channel messages.  This diverged from Discord, which uses ephemeral
deferred responses for slash commands.

Changes:
- handle_hermes_command now passes response_type='ephemeral' and a
  'Running /cmd…' text to ack(), giving the user immediate 'Only visible
  to you' feedback when they invoke any native slash command.
- _handle_slash_command stashes the Slack response_url from the command
  payload in a per-channel context dict before dispatching to
  handle_message.
- send() checks for a pending slash context and, when found, POSTs to
  the response_url with replace_original=true to swap the initial ack
  with the real command reply (e.g. 'Queued for the next turn.'),
  keeping it ephemeral.
- Stale slash contexts are garbage-collected on lookup (120s TTL).
- The response_url POST is non-fatal: if it fails, the user already saw
  the initial ack, and send() returns success=True.

Fixes #18182
2026-05-01 13:33:06 -07:00
Jeffrey Quesnelle 0b76d23d1a makes the Persistent Goals docs accessible in the docs nav (and llms.txt) (#18481) 2026-05-01 10:29:22 -07:00
Teknium f99676e315 fix(gateway): auto-restart when source files change out from under us (#17648) (#18409)
Long-running gateway processes that survive 'hermes update' keep
pre-update modules cached in sys.modules. When new tool files on
disk then try to 'from hermes_cli.config import cfg_get' (added in
PR #17304), the import resolves against the stale module object
and raises ImportError — hitting users on Matrix, Telegram, Feishu,
and other platforms.

Two defenses:

1. Gateway self-check (gateway/run.py). On __init__, snapshot the
   newest mtime across sentinel source files (hermes_cli/config.py,
   run_agent.py, gateway/run.py, etc.). On every inbound message,
   re-read those mtimes; if any is newer than boot time + 2s slack,
   request a graceful restart via the normal drain path and return
   a one-line ack to the user. Idempotent, works regardless of how
   the update happened (hermes update, manual git pull, installer).

2. Post-restart survivor sweep ('hermes update'). After the existing
   restart loop, sleep 3s, rescan for gateway PIDs we already tried
   to kill, and SIGKILL any survivors. The detached profile watchers
   and systemd then relaunch with fresh code instead of waiting out
   the 120s watcher timeout.

Closes #17648.
2026-05-01 09:50:08 -07:00
Teknium 77c0bc6b13 fix(curator): defer first run and add --dry-run preview (#18373) (#18389)
* fix(curator): defer first run and add --dry-run preview (#18373)

Curator was meant to run 7 days after install, not on the very first
gateway tick. On a fresh install (no .curator_state), should_run_now()
returned True immediately because last_run_at was None — so the gateway
cron ticker fired Curator against a fresh skill library moments after
'hermes update'. Combined with the binary 'agent-created' provenance
model (anything not bundled and not hub-installed), this consolidated
hand-authored user workflow skills without consent.

Changes:
- should_run_now(): first observation seeds last_run_at='now' and returns
  False. The next real pass fires one full interval_hours later (7 days
  by default), matching the original design intent.
- hermes curator run --dry-run: produces the same review report without
  applying automatic transitions OR permitting the LLM to call
  skill_manage / terminal mv. A DRY-RUN banner is prepended to the
  prompt and the caller skips apply_automatic_transitions. State is
  NOT advanced so a preview doesn't defer the next scheduled real pass.
- hermes update: prints a one-liner on fresh installs pointing at
  --dry-run, pause, and the docs. Silent on steady state.
- Docs: curator.md and cli-commands.md explain the deferred first-run
  behavior and warn that hand-written SKILL.md files share the
  'agent-created' bucket, with guidance to pin or preview before the
  first pass.

Tests:
- test_first_run_defers replaces the old 'first run always eligible'
  assertion — same fixture, inverted expectation.
- test_maybe_run_curator_defers_on_fresh_install covers the gateway tick
  path end-to-end.
- Three new dry-run tests cover state-advance suppression, prompt
  banner injection, and apply_automatic_transitions skipping.

Fixes #18373.

* feat(curator): pre-run backup + rollback (#18373)

Every real curator pass now snapshots ~/.hermes/skills/ into
~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz before calling
apply_automatic_transitions or the LLM review. If a run consolidates or
archives something the user didn't want touched, 'hermes curator
rollback' restores the tree in one command. Dry-run is skipped — no
mutation means no snapshot needed.

Changes:
- agent/curator_backup.py (new): tar.gz snapshot + safe rollback. The
  snapshot excludes .curator_backups/ (would recurse) and .hub/ (managed
  by the skills hub). Extract refuses absolute paths and .. components,
  and uses tarfile's filter='data' on Python 3.12+. Rollback takes a
  pre-rollback safety snapshot FIRST, stages the current tree into
  .rollback-staging-<ts>/ so the extract lands in an empty dir, and
  cleans the staging dir on success. A failed extract restores the
  staged contents.
- agent/curator.py: run_curator_review() calls curator_backup.
  snapshot_skills(reason='pre-curator-run') before apply_automatic_
  transitions. Best-effort — a failed snapshot logs at debug and the
  run continues (a transient disk issue shouldn't silently disable
  curator forever).
- hermes_cli/curator.py: new 'hermes curator backup' and 'hermes curator
  rollback' subcommands. rollback supports --list, --id <ts>, -y.
- hermes_cli/config.py: curator.backup.{enabled, keep} config block
  with sane defaults (enabled=true, keep=5).
- Docs: curator.md gets a 'Backups and rollback' section; cli-commands
  .md table gets the new rows.

Tests (new file tests/agent/test_curator_backup.py, 16 cases):
- snapshot creates tarball + manifest with correct counts
- snapshot excludes .curator_backups/ (recursion guard) and .hub/
- snapshot disabled via config returns None without creating anything
- snapshot uniquifies ids within the same second (-01 suffix)
- prune honors keep count, newest-first
- list_backups + _resolve_backup cover newest-default and unknown-id
- rollback restores a deleted skill with content intact
- rollback is itself undoable — safety snapshot shows up in list_backups
- rollback with no snapshots returns an error
- rollback refuses tarballs with absolute paths or .. components
- real curator runs take a 'pre-curator-run' snapshot; dry-runs do not

All curator tests: 210 passing locally.
2026-05-01 09:49:59 -07:00
Siddharth Balyan c5b4c48165 fix: lazy session creation — defer DB row until first message (#18370)
Prevents ghost sessions from accumulating in state.db when the TUI/web
dashboard is opened and closed without sending a message.

Changes:
- run_agent.py: Add _ensure_db_session() gate method, called at
  run_conversation() entry. Remove eager create_session() from __init__.
  Handle compression rotation flag correctly.
- tui_gateway/server.py: Remove eager db.create_session() in
  _start_agent_build(). Add post-first-message pending_title re-apply.
- hermes_state.py: Extract _insert_session_row() shared helper (DRY).
  Add prune_empty_ghost_sessions() for one-time migration.
- cli.py: One-time ghost session prune on startup. Fix _pending_title
  to call _ensure_db_session() before set_session_title().
- hermes_cli/main.py: Guard TUI exit summary on message_count > 0.
- tests: Update test_860_dedup to call _ensure_db_session() before
  direct _flush_messages_to_session_db() calls.

Closes: ghost session clutter in hermes sessions list and web dashboard.
2026-05-01 18:39:12 +05:30
Austin Pickett 20132435c0 Merge pull request #18117 from NousResearch/austin/fix/model-selector
feat(tui): overhaul /model picker to match hermes model with inline auth
2026-05-01 05:30:05 -07:00
Austin Pickett 5ad030d19d Merge pull request #18095 from NousResearch/austin/feat/plugins-page
feat(dashboard): Plugins page — manage, enable/disable, auth status
2026-05-01 05:29:24 -07:00
Austin Pickett 05c63259b5 Merge pull request #18358 from NousResearch/fix/kanban-buton
fix: kanban button
2026-05-01 04:49:06 -07:00
Austin Pickett a01c1f7305 fix: kanban button 2026-05-01 07:33:54 -04:00
Siddharth Balyan 75e1339d4c fix(telegram): send seed message after creating DM topics (#18334)
Telegram's client does not display empty forum topics in the chat's
topic list. After createForumTopic succeeds, send a short pin message
into the new topic so it becomes immediately visible to the user.

Only fires for newly created topics (no thread_id in config yet).
Failure to send the seed is non-fatal (debug-logged, topic still works).
2026-05-01 15:21:56 +05:30
Austin Pickett c23c7c994b fix(tui): address remaining review feedback — ordering and digit shortcuts
- Emit providers in CANONICAL_PROVIDERS order (matching hermes model)
  with user-defined/custom providers appended after
- Remove digit quick-select (1-9,0) handler — inconsistent with
  absolute row numbering and already removed from hint text
- Remove unused windowOffset import
2026-04-30 23:41:19 -04:00
Austin Pickett c8e506c383 fix(tui): address code review feedback on model picker
- Reset keySaving on back() to prevent blocked key entry after Esc
- Show '(needs setup)' for non-API-key auth providers instead of
  generic '(no key)'
- Set is_current correctly for unauthenticated providers that happen
  to be the active session provider
- Guard model.save_key with is_managed() check — return error on
  managed installs where .env is read-only
2026-04-30 23:11:28 -04:00
Austin Pickett f4c761c6a0 feat(tui): add inline provider disconnect via 'd' keybind in /model picker
- New model.disconnect RPC method: clears API key env vars from .env
  and OAuth/credential pool state via clear_provider_auth()
- Press 'd' on an authenticated provider opens confirmation prompt
- y/Enter confirms disconnect, n/Esc cancels
- Provider flips to unauthenticated state in-place (re-selectable
  to re-auth by pressing Enter again)
2026-04-30 23:03:32 -04:00
Austin Pickett 26f7f68507 feat(tui): show all providers in /model picker with inline API key setup
- model.options now returns all canonical providers (not just
  authenticated), each with authenticated/auth_type/key_env fields
- New model.save_key RPC method: saves API key to .env, sets in
  process, returns refreshed provider with models
- Picker shows ● (authed) / ○ (no key) markers with dimmed styling
- Selecting an unauthenticated api_key provider opens inline masked
  key input — after save, transitions directly to model selection
- Non-api_key auth providers show guidance to run hermes model
- Row numbers now show absolute position in list
2026-04-30 23:03:32 -04:00
Austin Pickett 36fa8a4d28 fix(tui): show absolute position numbers in model picker
The model picker displayed row numbers 1-12 regardless of scroll
position, making it impossible to tell where you were in the list.
Now shows the actual item index (e.g. 5, 6, 7... when scrolled down).

Also removed '1-9,0 quick' from the hint text since digit shortcuts
still work relative to the visible window, which would be confusing
with absolute numbering.
2026-04-30 23:03:32 -04:00
Austin Pickett 443950e827 fix(tui): pass user_providers as dict to match CLI model-switch pipeline
The TUI's _apply_model_switch() was converting the config.yaml
`providers:` dict into a list of dicts before passing it to
switch_model(). This caused resolve_provider_full() →
resolve_user_provider() to fail, since that function expects a dict
and does `user_config.get(name)` to look up provider entries.

The result: user-defined providers (e.g. ollama) appeared in CLI's
/model picker but were invisible in the TUI.

Fix:
- tui_gateway/server.py: pass cfg.get('providers') directly (dict),
  matching what cli.py already does at line 5598.
- hermes_cli/model_switch.py: fix the validation-override block
  (line ~893) which iterated user_providers as a list — now correctly
  handles the dict format with support for both dict-keyed and
  list-format models arrays.
2026-04-30 23:03:32 -04:00
Austin Pickett c73b799de7 feat(dashboard): add hide/show toggle for dashboard plugins in sidebar
- New config key: dashboard.hidden_plugins (list of plugin names)
- GET /api/dashboard/plugins now filters out hidden plugins from sidebar
- POST /api/dashboard/plugins/{name}/visibility toggles visibility
- Hub response includes user_hidden boolean per plugin row
- Eye/EyeOff toggle on plugin cards with dashboard manifests
- i18n: 'Show in sidebar' / 'Hide from sidebar' (en/zh)
2026-04-30 20:29:37 -04:00
Austin Pickett a52363231f refactor(plugins): move rescan button to page header, remove redundant title
Use usePageHeader().setEnd to place the rescan button in the shared
header bar. Remove the inline H2 title (already shown by the header)
and the wrapper div.
2026-04-30 20:29:37 -04:00
Austin Pickett 9550d0fd46 fix(plugins): show 'Plugins' in page header instead of 'Web UI'
Add /plugins route to resolve-page-title BUILTIN map.
2026-04-30 20:29:37 -04:00
Austin Pickett 7dc85495e0 style(plugins): make page full width 2026-04-30 20:29:37 -04:00
Austin Pickett 6549b0f2b7 fix(security): address CodeQL path-traversal and info-exposure findings
- Add _validate_plugin_name() guard on all {name} path param endpoints
  (rejects /, \, .. before reaching plugin logic)
- Strip after_install_path from install response (no internal paths to client)
- Update nix/tui.nix lockfile hash to match committed package-lock.json
2026-04-30 20:29:37 -04:00
Austin Pickett e2a4905606 feat(dashboard): add Plugins page with enable/disable, auth status, install/remove
- New PluginsPage.tsx: full plugin management UI (list, enable/disable,
  install from git, remove, git pull updates, provider picker)
- Backend: dashboard_set_agent_plugin_enabled now also toggles the
  plugin's toolset in platform_toolsets so enabling actually makes
  tools visible in agent sessions
- Backend: /api/dashboard/plugins/hub returns auth_required + auth_command
  per plugin (checks tool registry check_fn)
- Frontend: auth_required shown as Badge + CommandBlock with copy-able
  auth command
- Fix: Select overflow in providers card (min-w-0 grid cells, removed
  truncate/overflow-hidden that clipped dropdown)
- Refactor: _install_plugin_core extracted for non-interactive reuse,
  PluginOperationError for structured error handling
- i18n: en/zh/types updated with all new plugin page strings
2026-04-30 20:29:37 -04:00
578 changed files with 65712 additions and 4688 deletions
+4
View File
@@ -25,3 +25,7 @@ ui-tui/packages/hermes-ink/dist/
# Runtime data (bind-mounted at /opt/data; must not leak into build context)
data/
# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
hermes-config/
runtime/
+9
View File
@@ -244,6 +244,15 @@ BROWSERBASE_PROXIES=true
# Uses custom Chromium build to avoid bot detection altogether
BROWSERBASE_ADVANCED_STEALTH=false
# Browser engine for local mode (default: auto = Chrome)
# "auto" — use Chrome (don't pass --engine flag)
# "lightpanda" — use Lightpanda (1.3-5.8x faster navigation, no screenshots)
# "chrome" — explicitly request Chrome
# Requires agent-browser v0.25.3+. Lightpanda commands that fail or return
# empty results are automatically retried with Chrome.
# Also configurable via browser.engine in config.yaml.
# AGENT_BROWSER_ENGINE=auto
# Browser session timeout in seconds (default: 300)
# Sessions are cleaned up after this duration of inactivity
BROWSER_SESSION_TIMEOUT=300
+44
View File
@@ -0,0 +1,44 @@
# Dependabot configuration for hermes-agent.
#
# Deliberately scoped to github-actions only.
#
# We do NOT enable Dependabot for pip / npm / any source-dependency ecosystem
# because we pin source dependencies exactly (uv.lock, package-lock.json) as
# part of our supply-chain posture. Automatic version-bump PRs against those
# pins would undermine the strategy — pins are moved deliberately, after
# review, not on a schedule.
#
# github-actions is the exception: action pins (we use full commit SHAs per
# supply-chain policy) must be updated when upstream actions publish
# patches — usually themselves security fixes. Dependabot opens a PR with
# the new SHA and release notes; we review and merge like any other PR.
#
# Security-update PRs for source dependencies (opened ONLY when a CVE is
# published affecting a currently-pinned version) are enabled separately
# via the repo's Dependabot security updates setting
# (Settings → Code security → Dependabot → Dependabot security updates).
# Those are CVE-only, not schedule-driven, and do not conflict with our
# pinning strategy — they fire when a pinned version becomes known-bad,
# which is exactly when we want to move the pin.
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
day: "monday"
open-pull-requests-limit: 5
labels:
- "dependencies"
- "github-actions"
commit-message:
prefix: "chore(actions)"
include: "scope"
groups:
# Batch routine action bumps into one PR per week to reduce noise.
# Security updates still open individually and bypass grouping.
actions-minor-patch:
update-types:
- "minor"
- "patch"
+158
View File
@@ -0,0 +1,158 @@
name: Lint (ruff + ty)
# Surface ruff and ty diagnostics as a diff vs the target branch.
# This check is advisory only ATM it always exits zero and never blocks merge.
# It posts a Markdown summary to the workflow run and, for pull requests,
# comments the same summary on the PR.
on:
push:
branches: [main]
paths-ignore:
- "**/*.md"
- "docs/**"
- "website/**"
pull_request:
branches: [main]
paths-ignore:
- "**/*.md"
- "docs/**"
- "website/**"
permissions:
contents: read
pull-requests: write # needed to post/update PR comments
concurrency:
group: lint-${{ github.ref }}
cancel-in-progress: true
jobs:
lint-diff:
name: ruff + ty diff
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0 # need full history for merge-base + worktree
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
- name: Install ruff + ty
run: |
uv tool install ruff
uv tool install ty
- name: Determine base ref
id: base
env:
PR_BASE_SHA: ${{ github.event.pull_request.base.sha }}
run: |
# For PRs, diff against the PR's pinned parent commit
# (github.event.pull_request.base.sha — snapshot at PR open time,
# so later pushes to main don't leak into the diff).
# For pushes to main, diff against the previous commit on main.
if [ "${{ github.event_name }}" = "pull_request" ]; then
BASE_SHA="${PR_BASE_SHA}"
BASE_REF="PR base (${BASE_SHA:0:7})"
else
BASE_SHA=$(git rev-parse HEAD~1 2>/dev/null || git rev-parse HEAD)
BASE_REF="HEAD~1"
fi
echo "sha=${BASE_SHA}" >> "$GITHUB_OUTPUT"
echo "ref=${BASE_REF}" >> "$GITHUB_OUTPUT"
echo "Base SHA: ${BASE_SHA}"
echo "Base ref: ${BASE_REF}"
- name: Run ruff + ty on HEAD
run: |
mkdir -p .lint-reports/head
ruff check --output-format json --exit-zero \
> .lint-reports/head/ruff.json || true
ty check --output-format gitlab --exit-zero \
> .lint-reports/head/ty.json || true
echo "HEAD ruff: $(wc -c < .lint-reports/head/ruff.json) bytes"
echo "HEAD ty: $(wc -c < .lint-reports/head/ty.json) bytes"
- name: Run ruff + ty on base (via git worktree)
run: |
mkdir -p .lint-reports/base
# Use a worktree so we don't clobber the main checkout. If the basex
# SHA is identical to HEAD (e.g. first commit), skip and leave the
# base reports empty — the diff script handles missing files.
HEAD_SHA=$(git rev-parse HEAD)
BASE_SHA="${{ steps.base.outputs.sha }}"
if [ "$BASE_SHA" = "$HEAD_SHA" ]; then
echo "Base SHA == HEAD SHA, skipping base scan."
echo '[]' > .lint-reports/base/ruff.json
echo '[]' > .lint-reports/base/ty.json
else
git worktree add --detach /tmp/lint-base "$BASE_SHA"
(
cd /tmp/lint-base
ruff check --output-format json --exit-zero \
> "$GITHUB_WORKSPACE/.lint-reports/base/ruff.json" || true
ty check --output-format gitlab --exit-zero \
> "$GITHUB_WORKSPACE/.lint-reports/base/ty.json" || true
)
git worktree remove --force /tmp/lint-base
fi
echo "base ruff: $(wc -c < .lint-reports/base/ruff.json) bytes"
echo "base ty: $(wc -c < .lint-reports/base/ty.json) bytes"
- name: Generate diff summary
run: |
python scripts/lint_diff.py \
--base-ruff .lint-reports/base/ruff.json \
--head-ruff .lint-reports/head/ruff.json \
--base-ty .lint-reports/base/ty.json \
--head-ty .lint-reports/head/ty.json \
--base-ref "${{ steps.base.outputs.ref }}" \
--head-ref "${{ github.event_name == 'pull_request' && github.head_ref || github.ref_name }}" \
--output .lint-reports/summary.md
cat .lint-reports/summary.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload reports as artifact
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: lint-reports
path: .lint-reports/
retention-days: 14
# .lint-reports/ is a dotfile-prefixed directory, and upload-artifact@v4
# skips hidden files by default (breaking change from v3). Opt back in.
include-hidden-files: true
- name: Post / update PR comment
if: github.event_name == 'pull_request'
uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7
with:
script: |
const fs = require('fs');
const body = fs.readFileSync('.lint-reports/summary.md', 'utf8');
const marker = '<!-- lint-diff-summary -->';
const fullBody = marker + '\n' + body;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const existing = comments.find(c => c.body && c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existing.id,
body: fullBody,
});
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
body: fullBody,
});
}
+67
View File
@@ -0,0 +1,67 @@
name: OSV-Scanner
# Scans lockfiles (uv.lock, package-lock.json) against the OSV vulnerability
# database. Runs on every PR that touches a lockfile and on a weekly schedule
# against main.
#
# This is detection-only — OSV-Scanner does NOT open PRs or modify pins.
# It reports known CVEs in currently-pinned dependency versions so we can
# decide when and how to patch on our own schedule. Our pinning strategy
# (full SHA / exact version) is preserved; only the notification signal
# is added.
#
# Complements the existing supply-chain-audit.yml workflow (which scans
# for malicious code patterns in PR diffs) by covering the orthogonal
# "currently-pinned dep became known-vulnerable" case.
#
# Uses Google's officially-recommended reusable workflow, pinned by SHA.
# Findings land in the repo's Security tab (Code Scanning > OSV-Scanner).
# fail-on-vuln is disabled so the job does not block merges on pre-existing
# vulnerabilities in pinned deps that we may need to patch deliberately.
on:
pull_request:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package.json'
- 'ui-tui/package-lock.json'
- 'website/package.json'
- 'website/package-lock.json'
- '.github/workflows/osv-scanner.yml'
push:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package-lock.json'
- 'website/package-lock.json'
schedule:
# Weekly scan against main — catches CVEs published after merge for
# deps that haven't changed since.
- cron: '0 9 * * 1'
workflow_dispatch:
permissions:
# Required by the reusable workflow to upload SARIF to the Security tab.
actions: read
contents: read
security-events: write
jobs:
scan:
name: Scan lockfiles
uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@c51854704019a247608d928f370c98740469d4b5 # v2.3.5
with:
# Scan explicit lockfiles rather than recursing, so we only look at
# the three sources of truth and skip vendored / test / worktree dirs.
scan-args: |-
--lockfile=uv.lock
--lockfile=ui-tui/package-lock.json
--lockfile=website/package-lock.json
fail-on-vuln: false
+231 -10
View File
@@ -37,12 +37,18 @@ hermes-agent/
│ ├── platforms/ # Adapter per platform (telegram, discord, slack, whatsapp,
│ │ # homeassistant, signal, matrix, mattermost, email, sms,
│ │ # dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
│ │ # webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ │ # yuanbao, webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ └── builtin_hooks/ # Extension point for always-registered gateway hooks (none shipped)
├── plugins/ # Plugin system (see "Plugins" section below)
│ ├── memory/ # Memory-provider plugins (honcho, mem0, supermemory, ...)
│ ├── context_engine/ # Context-engine plugins
── <others>/ # Dashboard, image-gen, disk-cleanup, examples, ...
── model-providers/ # Inference backend plugins (openrouter, anthropic, gmi, ...)
│ ├── kanban/ # Multi-agent board dispatcher + worker plugin
│ ├── hermes-achievements/ # Gamified achievement tracking
│ ├── observability/ # Metrics / traces / logs plugin
│ ├── image_gen/ # Image-generation providers
│ └── <others>/ # disk-cleanup, example-dashboard, google_meet, platforms,
│ # spotify, strike-freedom-cockpit, ...
├── optional-skills/ # Heavier/niche skills shipped but NOT active by default
├── skills/ # Built-in skills bundled with the repo
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
@@ -53,7 +59,7 @@ hermes-agent/
├── environments/ # RL training environments (Atropos)
├── scripts/ # run_tests.sh, release.py, auxiliary scripts
├── website/ # Docusaurus docs site
└── tests/ # Pytest suite (~15k tests across ~700 files as of Apr 2026)
└── tests/ # Pytest suite (~17k tests across ~900 files as of May 2026)
```
**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys only).
@@ -257,7 +263,16 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite. See `hermes
## Adding New Tools
Requires changes in **2 files**:
For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
`~/.hermes/plugins/<name>/__init__.py`, then register tools with
`ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
enabled or disabled without touching `tools/` or `toolsets.py`.
Use the built-in route below only when the user is explicitly contributing a new
core Hermes tool that should ship in the base system.
Built-in/core tools require changes in **2 files**:
**1. Create `tools/your_tool.py`:**
```python
@@ -280,9 +295,9 @@ registry.register(
)
```
**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset. **This step is required:** auto-discovery imports the tool and registers its schema, but the tool is only *exposed to an agent* if its name appears in a toolset. `_HERMES_CORE_TOOLS` is not dead code — it's the default bundle every platform's base toolset inherits from.
Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain.
Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain. Wiring into a toolset is still a deliberate, manual step.
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
@@ -304,6 +319,22 @@ The registry handles schema collection, dispatch, availability checking, and err
section is handled automatically by the deep-merge and does NOT require
a version bump.
### Top-level `config.yaml` sections (non-exhaustive):
`model`, `agent`, `terminal`, `compression`, `display`, `stt`, `tts`,
`memory`, `security`, `delegation`, `smart_model_routing`, `checkpoints`,
`auxiliary`, `curator`, `skills`, `gateway`, `logging`, `cron`, `profiles`,
`plugins`, `honcho`.
`auxiliary` holds per-task overrides for side-LLM work (curator, vision,
embedding, title generation, session_search, etc.) — each task can pin
its own provider/model/base_url/max_tokens/reasoning_effort. See
`agent/auxiliary_client.py::_resolve_auto` for resolution order.
`curator` holds the background skill-maintenance config —
`enabled`, `interval_hours`, `min_idle_hours`, `stale_after_days`,
`archive_after_days`, `backup` (nested).
### .env variables (SECRETS ONLY — API keys, tokens, passwords):
1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
```python
@@ -482,6 +513,31 @@ generic plugin surface (new hook, new ctx method) — never hardcode
plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
honcho argparse from `main.py` for exactly this reason.
### Model-provider plugins (`plugins/model-providers/<name>/`)
Every inference backend (openrouter, anthropic, gmi, deepseek, nvidia, …)
ships as a plugin here. Each plugin's `__init__.py` calls
`providers.register_provider(ProviderProfile(...))` at module load.
`providers/__init__.py._discover_providers()` is a **lazy, separate
discovery system** — scanned on first `get_provider_profile()` or
`list_providers()` call, NOT by the general PluginManager.
Scan order:
1. Bundled: `<repo>/plugins/model-providers/<name>/`
2. User: `$HERMES_HOME/plugins/model-providers/<name>/`
3. Legacy: `<repo>/providers/<name>.py` (back-compat)
User plugins of the same name override bundled ones — `register_provider()`
is last-writer-wins. This lets third parties swap out any built-in
profile without a repo patch.
The general PluginManager records `kind: model-provider` manifests but does
NOT import them (would double-instantiate `ProviderProfile`). Plugins
without an explicit `kind:` get auto-coerced via a source-text heuristic
(`register_provider` + `ProviderProfile` in `__init__.py`).
Full authoring guide: `website/docs/developer-guide/model-provider-plugin.md`.
### Dashboard / context-engine / image-gen plugin directories
`plugins/context_engine/`, `plugins/image_gen/`, `plugins/example-dashboard/`,
@@ -510,11 +566,176 @@ niche skills belong in `optional-skills/`.
### SKILL.md frontmatter
Standard fields: `name`, `description`, `version`, `platforms`
(OS-gating list: `[macos]`, `[linux, macos]`, ...),
Standard fields: `name`, `description`, `version`, `author`, `license`,
`platforms` (OS-gating list: `[macos]`, `[linux, macos]`, ...),
`metadata.hermes.tags`, `metadata.hermes.category`,
`metadata.hermes.config` (config.yaml settings the skill needs — stored
under `skills.config.<key>`, prompted during setup, injected at load time).
`metadata.hermes.related_skills`, `metadata.hermes.config` (config.yaml
settings the skill needs — stored under `skills.config.<key>`, prompted
during setup, injected at load time).
Top-level `tags:` and `category:` are also accepted and mirrored from
`metadata.hermes.*` by the loader.
---
## Toolsets
All toolsets are defined in `toolsets.py` as a single `TOOLSETS` dict.
Each platform's adapter picks a base toolset (e.g. Telegram uses
`"messaging"`); `_HERMES_CORE_TOOLS` is the default bundle most
platforms inherit from.
Current toolset keys: `browser`, `clarify`, `code_execution`, `cronjob`,
`debugging`, `delegation`, `discord`, `discord_admin`, `feishu_doc`,
`feishu_drive`, `file`, `homeassistant`, `image_gen`, `kanban`, `memory`,
`messaging`, `moa`, `rl`, `safe`, `search`, `session_search`, `skills`,
`spotify`, `terminal`, `todo`, `tts`, `video`, `vision`, `web`, `yuanbao`.
Enable/disable per platform via `hermes tools` (the curses UI) or the
`tools.<platform>.enabled` / `tools.<platform>.disabled` lists in
`config.yaml`.
---
## Delegation (`delegate_task`)
`tools/delegate_tool.py` spawns a subagent with an isolated
context + terminal session. Synchronous: the parent waits for the
child's summary before continuing its own loop — if the parent is
interrupted, the child is cancelled.
Two shapes:
- **Single:** pass `goal` (+ optional `context`, `toolsets`).
- **Batch (parallel):** pass `tasks: [...]` — each gets its own subagent
running concurrently. Concurrency is capped by
`delegation.max_concurrent_children` (default 3).
Roles:
- `role="leaf"` (default) — focused worker. Cannot call `delegate_task`,
`clarify`, `memory`, `send_message`, `execute_code`.
- `role="orchestrator"` — retains `delegate_task` so it can spawn its
own workers. Gated by `delegation.orchestrator_enabled` (default true)
and bounded by `delegation.max_spawn_depth` (default 2).
Key config knobs (under `delegation:` in `config.yaml`):
`max_concurrent_children`, `max_spawn_depth`, `child_timeout_seconds`,
`orchestrator_enabled`, `subagent_auto_approve`, `inherit_mcp_toolsets`,
`max_iterations`.
Synchronicity rule: delegate_task is **not** durable. For long-running
work that must outlive the current turn, use `cronjob` or
`terminal(background=True, notify_on_complete=True)` instead.
---
## Curator (skill lifecycle)
Background skill-maintenance system that tracks usage on agent-created
skills and auto-archives stale ones. Users never lose skills; archives
go to `~/.hermes/skills/.archive/` and are restorable.
- **Core:** `agent/curator.py` (review loop, auto-transitions, LLM review
prompt) + `agent/curator_backup.py` (pre-run tar.gz snapshots).
- **CLI:** `hermes_cli/curator.py` wires `hermes curator <verb>` where
verbs are: `status`, `run`, `pause`, `resume`, `pin`, `unpin`,
`archive`, `restore`, `prune`, `backup`, `rollback`.
- **Telemetry:** `tools/skill_usage.py` owns the sidecar
`~/.hermes/skills/.usage.json` — per-skill `use_count`, `view_count`,
`patch_count`, `last_activity_at`, `state` (active / stale /
archived), `pinned`.
Invariants:
- Curator only touches skills with `created_by: "agent"` provenance —
bundled + hub-installed skills are off-limits.
- Never deletes; max destructive action is archive.
- Pinned skills are exempt from every auto-transition and from the
LLM review pass.
- `skill_manage(action="delete")` refuses pinned skills; patch/edit/
write_file/remove_file go through so the agent can keep improving
pinned skills.
Config section (`curator:` in `config.yaml`):
`enabled`, `interval_hours`, `min_idle_hours`, `stale_after_days`,
`archive_after_days`, `backup.*`.
Full user-facing docs: `website/docs/user-guide/features/curator.md`.
---
## Cron (scheduled jobs)
`cron/jobs.py` (job store) + `cron/scheduler.py` (tick loop). Agents
schedule jobs via the `cronjob` tool; users via `hermes cron <verb>`
(`list`, `add`, `edit`, `pause`, `resume`, `run`, `remove`) or the
`/cron` slash command.
Supported schedule formats:
- Duration: `"30m"`, `"2h"`, `"1d"`
- "every" phrase: `"every 2h"`, `"every monday 9am"`
- 5-field cron expression: `"0 9 * * *"`
- ISO timestamp (one-shot): `"2026-06-01T09:00:00Z"`
Per-job fields include `skills` (load specific skills), `model` /
`provider` overrides, `script` (pre-run data-collection script whose
stdout is injected into the prompt; `no_agent=True` turns the script
into the entire job), `context_from` (chain job A's last output into
job B's prompt), `workdir` (run in a specific directory with its
`AGENTS.md`/`CLAUDE.md` loaded), and multi-platform delivery.
Hardening invariants:
- **3-minute hard interrupt** on cron sessions — runaway agent loops
cannot monopolize the scheduler.
- Catchup window: half the job's period, clamped to 120s2h.
- Grace window: 120s for one-shot jobs whose fire time was missed.
- File lock at `~/.hermes/cron/.tick.lock` prevents duplicate ticks
across processes.
- Cron sessions pass `skip_memory=True` by default; memory providers
intentionally do not run during cron.
Cron deliveries are **not** mirrored into the target gateway session —
they land in their own cron session with a header/footer frame so the
main conversation's message-role alternation stays intact.
---
## Kanban (multi-agent work queue)
Durable SQLite-backed board that lets multiple profiles / workers
collaborate on shared tasks. Users drive it via `hermes kanban <verb>`;
workers spawned by the dispatcher drive it via a dedicated `kanban_*`
toolset so their schema footprint is zero when they're not inside a
kanban task.
- **CLI:** `hermes_cli/kanban.py` wires `hermes kanban` with verbs
`init`, `create`, `list` (alias `ls`), `show`, `assign`, `link`,
`unlink`, `comment`, `complete`, `block`, `unblock`, `archive`,
`tail`, plus less-commonly-used `watch`, `stats`, `runs`, `log`,
`assignees`, `heartbeat`, `notify-*`, `dispatch`, `daemon`, `gc`.
- **Worker toolset:** `tools/kanban_tools.py` exposes `kanban_show`,
`kanban_complete`, `kanban_block`, `kanban_heartbeat`, `kanban_comment`,
`kanban_create`, `kanban_link` — gated by `HERMES_KANBAN_TASK` so
the schema only appears for processes actually running as a worker.
- **Dispatcher:** long-lived loop that (default every 60s) reclaims
stale claims, promotes ready tasks, atomically claims, and spawns
assigned profiles. Runs **inside the gateway** by default via
`kanban.dispatch_in_gateway: true`.
- **Plugin assets:** `plugins/kanban/dashboard/` (web UI) +
`plugins/kanban/systemd/` (`hermes-kanban-dispatcher.service` for
standalone dispatcher deployment).
Isolation model:
- **Board** is the hard boundary — workers are spawned with
`HERMES_KANBAN_BOARD` pinned in their env so they can't see other
boards.
- **Tenant** is a soft namespace *within* a board — one specialist
fleet can serve multiple businesses with workspace-path + memory-key
isolation.
- After ~5 consecutive spawn failures on the same task the dispatcher
auto-blocks it to prevent spin loops.
Full user-facing docs: `website/docs/user-guide/features/kanban.md`.
---
+2 -1
View File
@@ -9,6 +9,7 @@
<a href="https://discord.gg/NousResearch"><img src="https://img.shields.io/badge/Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord"></a>
<a href="https://github.com/NousResearch/hermes-agent/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-green?style=for-the-badge" alt="License: MIT"></a>
<a href="https://nousresearch.com"><img src="https://img.shields.io/badge/Built%20by-Nous%20Research-blueviolet?style=for-the-badge" alt="Built by Nous Research"></a>
<a href="README.zh-CN.md"><img src="https://img.shields.io/badge/Lang-中文-red?style=for-the-badge" alt="中文"></a>
</p>
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
@@ -21,7 +22,7 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
<tr><td><b>A closed learning loop</b></td><td>Agent-curated memory with periodic nudges. Autonomous skill creation after complex tasks. Skills self-improve during use. FTS5 session search with LLM summarization for cross-session recall. <a href="https://github.com/plastic-labs/honcho">Honcho</a> dialectic user modeling. Compatible with the <a href="https://agentskills.io">agentskills.io</a> open standard.</td></tr>
<tr><td><b>Scheduled automations</b></td><td>Built-in cron scheduler with delivery to any platform. Daily reports, nightly backups, weekly audits — all in natural language, running unattended.</td></tr>
<tr><td><b>Delegates and parallelizes</b></td><td>Spawn isolated subagents for parallel workstreams. Write Python scripts that call tools via RPC, collapsing multi-step pipelines into zero-context-cost turns.</td></tr>
<tr><td><b>Runs anywhere, not just your laptop</b></td><td>Six terminal backends — local, Docker, SSH, Daytona, Singularity, and Modal. Daytona and Modal offer serverless persistence — your agent's environment hibernates when idle and wakes on demand, costing nearly nothing between sessions. Run it on a $5 VPS or a GPU cluster.</td></tr>
<tr><td><b>Runs anywhere, not just your laptop</b></td><td>Seven terminal backends — local, Docker, SSH, Singularity, Modal, Daytona, and Vercel Sandbox. Daytona and Modal offer serverless persistence — your agent's environment hibernates when idle and wakes on demand, costing nearly nothing between sessions. Run it on a $5 VPS or a GPU cluster.</td></tr>
<tr><td><b>Research-ready</b></td><td>Batch trajectory generation, Atropos RL environments, trajectory compression for training the next generation of tool-calling models.</td></tr>
</table>
+186
View File
@@ -0,0 +1,186 @@
<p align="center">
<img src="assets/banner.png" alt="Hermes Agent" width="100%">
</p>
# Hermes Agent ☤
<p align="center">
<a href="https://hermes-agent.nousresearch.com/docs/"><img src="https://img.shields.io/badge/Docs-hermes--agent.nousresearch.com-FFD700?style=for-the-badge" alt="Documentation"></a>
<a href="https://discord.gg/NousResearch"><img src="https://img.shields.io/badge/Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord"></a>
<a href="https://github.com/NousResearch/hermes-agent/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-green?style=for-the-badge" alt="License: MIT"></a>
<a href="https://nousresearch.com"><img src="https://img.shields.io/badge/Built%20by-Nous%20Research-blueviolet?style=for-the-badge" alt="Built by Nous Research"></a>
<a href="README.md"><img src="https://img.shields.io/badge/Lang-English-lightgrey?style=for-the-badge" alt="English"></a>
</p>
**由 [Nous Research](https://nousresearch.com) 构建的自进化 AI 代理。** 它是唯一内置学习闭环的智能代理——从经验中创建技能,在使用中改进技能,主动持久化知识,搜索过往对话,并在跨会话中逐步构建对你的深度理解。可以在 $5 的 VPS 上运行,也可以在 GPU 集群上运行,或者使用几乎零成本的 Serverless 基础设施。它不绑定你的笔记本——你可以在 Telegram 上与它对话,而它在云端 VM 上工作。
支持任意模型——[Nous Portal](https://portal.nousresearch.com)、[OpenRouter](https://openrouter.ai)200+ 模型)、[NVIDIA NIM](https://build.nvidia.com)Nemotron)、[小米 MiMo](https://platform.xiaomimimo.com)、[z.ai/GLM](https://z.ai)、[Kimi/Moonshot](https://platform.moonshot.ai)、[MiniMax](https://www.minimax.io)、[Hugging Face](https://huggingface.co)、OpenAI,或自定义端点。使用 `hermes model` 即可切换——无需改代码,无锁定。
<table>
<tr><td><b>真正的终端界面</b></td><td>完整的 TUI,支持多行编辑、斜杠命令自动补全、对话历史、中断重定向和流式工具输出。</td></tr>
<tr><td><b>随你所在</b></td><td>Telegram、Discord、Slack、WhatsApp、Signal 和 CLI——全部从单个网关进程运行。语音备忘录转写、跨平台对话连续性。</td></tr>
<tr><td><b>闭环学习</b></td><td>代理管理记忆并定期自我提醒。复杂任务后自动创建技能。技能在使用中自我改进。FTS5 会话搜索配合 LLM 摘要实现跨会话回溯。<a href="https://github.com/plastic-labs/honcho">Honcho</a> 辩证式用户建模。兼容 <a href="https://agentskills.io">agentskills.io</a> 开放标准。</td></tr>
<tr><td><b>定时自动化</b></td><td>内置 cron 调度器,支持向任何平台投递。日报、夜间备份、周审计——全部用自然语言描述,无人值守运行。</td></tr>
<tr><td><b>委派与并行</b></td><td>生成隔离子代理处理并行工作流。编写 Python 脚本通过 RPC 调用工具,将多步管道压缩为零上下文开销的轮次。</td></tr>
<tr><td><b>随处运行</b></td><td>六种终端后端——本地、Docker、SSH、Daytona、Singularity 和 Modal。Daytona 和 Modal 提供 Serverless 持久化——代理环境空闲时休眠、按需唤醒,空闲期间几乎零成本。$5 VPS 或 GPU 集群都能跑。</td></tr>
<tr><td><b>研究就绪</b></td><td>批量轨迹生成、Atropos RL 环境、轨迹压缩——用于训练下一代工具调用模型。</td></tr>
</table>
---
## 快速安装
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
支持 Linux、macOS、WSL2 和 Android (Termux)。安装程序会自动处理平台特定的配置。
> **Android / Termux** 已测试的手动安装路径请参考 [Termux 指南](https://hermes-agent.nousresearch.com/docs/getting-started/termux)。在 Termux 上,Hermes 会安装精选的 `.[termux]` 扩展,因为完整的 `.[all]` 扩展会拉取 Android 不兼容的语音依赖。
>
> **Windows** 原生 Windows 不受支持。请安装 [WSL2](https://learn.microsoft.com/zh-cn/windows/wsl/install) 并运行上述命令。
安装后:
```bash
source ~/.bashrc # 重新加载 shell(或: source ~/.zshrc
hermes # 开始对话!
```
---
## 快速入门
```bash
hermes # 交互式 CLI — 开始对话
hermes model # 选择 LLM 提供商和模型
hermes tools # 配置启用的工具
hermes config set # 设置单个配置项
hermes gateway # 启动消息网关(Telegram、Discord 等)
hermes setup # 运行完整设置向导(一次性配置所有内容)
hermes claw migrate # 从 OpenClaw 迁移(如果来自 OpenClaw
hermes update # 更新到最新版本
hermes doctor # 诊断问题
```
📖 **[完整文档 →](https://hermes-agent.nousresearch.com/docs/)**
## CLI 与消息平台 快速对照
Hermes 有两种入口:用 `hermes` 启动终端 UI,或运行网关从 Telegram、Discord、Slack、WhatsApp、Signal 或 Email 与之对话。进入对话后,许多斜杠命令在两种界面中通用。
| 操作 | CLI | 消息平台 |
|------|-----|----------|
| 开始对话 | `hermes` | 运行 `hermes gateway setup` + `hermes gateway start`,然后给机器人发消息 |
| 开始新对话 | `/new``/reset` | `/new``/reset` |
| 更换模型 | `/model [provider:model]` | `/model [provider:model]` |
| 设置人格 | `/personality [name]` | `/personality [name]` |
| 重试或撤销上一轮 | `/retry``/undo` | `/retry``/undo` |
| 压缩上下文 / 查看用量 | `/compress``/usage``/insights [--days N]` | `/compress``/usage``/insights [days]` |
| 浏览技能 | `/skills``/<skill-name>` | `/skills``/<skill-name>` |
| 中断当前工作 | `Ctrl+C` 或发送新消息 | `/stop` 或发送新消息 |
| 平台特定状态 | `/platforms` | `/status``/sethome` |
完整命令列表请参阅 [CLI 指南](https://hermes-agent.nousresearch.com/docs/user-guide/cli) 和 [消息网关指南](https://hermes-agent.nousresearch.com/docs/user-guide/messaging)。
---
## 文档
所有文档位于 **[hermes-agent.nousresearch.com/docs](https://hermes-agent.nousresearch.com/docs/)**
| 章节 | 内容 |
|------|------|
| [快速开始](https://hermes-agent.nousresearch.com/docs/getting-started/quickstart) | 安装 → 设置 → 2 分钟内开始首次对话 |
| [CLI 使用](https://hermes-agent.nousresearch.com/docs/user-guide/cli) | 命令、快捷键、人格、会话 |
| [配置](https://hermes-agent.nousresearch.com/docs/user-guide/configuration) | 配置文件、提供商、模型、所有选项 |
| [消息网关](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram、Discord、Slack、WhatsApp、Signal、Home Assistant |
| [安全](https://hermes-agent.nousresearch.com/docs/user-guide/security) | 命令审批、DM 配对、容器隔离 |
| [工具与工具集](https://hermes-agent.nousresearch.com/docs/user-guide/features/tools) | 40+ 工具、工具集系统、终端后端 |
| [技能系统](https://hermes-agent.nousresearch.com/docs/user-guide/features/skills) | 过程记忆、技能中心、创建技能 |
| [记忆](https://hermes-agent.nousresearch.com/docs/user-guide/features/memory) | 持久记忆、用户画像、最佳实践 |
| [MCP 集成](https://hermes-agent.nousresearch.com/docs/user-guide/features/mcp) | 连接任意 MCP 服务器扩展能力 |
| [定时调度](https://hermes-agent.nousresearch.com/docs/user-guide/features/cron) | 定时任务与平台投递 |
| [上下文文件](https://hermes-agent.nousresearch.com/docs/user-guide/features/context-files) | 影响每次对话的项目上下文 |
| [架构](https://hermes-agent.nousresearch.com/docs/developer-guide/architecture) | 项目结构、代理循环、关键类 |
| [贡献](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) | 开发设置、PR 流程、代码风格 |
| [CLI 参考](https://hermes-agent.nousresearch.com/docs/reference/cli-commands) | 所有命令和标志 |
| [环境变量](https://hermes-agent.nousresearch.com/docs/reference/environment-variables) | 完整环境变量参考 |
---
## 从 OpenClaw 迁移
如果你来自 OpenClaw,Hermes 可以自动导入你的设置、记忆、技能和 API 密钥。
**首次安装时:** 安装向导(`hermes setup`)会自动检测 `~/.openclaw` 并在配置开始前提供迁移选项。
**安装后任意时间:**
```bash
hermes claw migrate # 交互式迁移(完整预设)
hermes claw migrate --dry-run # 预览将要迁移的内容
hermes claw migrate --preset user-data # 仅迁移用户数据,不含密钥
hermes claw migrate --overwrite # 覆盖已有冲突
```
导入内容:
- **SOUL.md** — 人格文件
- **记忆** — MEMORY.md 和 USER.md 条目
- **技能** — 用户创建的技能 → `~/.hermes/skills/openclaw-imports/`
- **命令白名单** — 审批模式
- **消息设置** — 平台配置、允许用户、工作目录
- **API 密钥** — 白名单中的密钥(Telegram、OpenRouter、OpenAI、Anthropic、ElevenLabs
- **TTS 资产** — 工作区音频文件
- **工作区指令** — AGENTS.md(使用 `--workspace-target`
使用 `hermes claw migrate --help` 查看所有选项,或使用 `openclaw-migration` 技能进行交互式代理引导迁移(含干运行预览)。
---
## 贡献
欢迎贡献!请参阅 [贡献指南](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) 了解开发设置、代码风格和 PR 流程。
贡献者快速开始——克隆并使用 `setup-hermes.sh`
```bash
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
./setup-hermes.sh # 安装 uv、创建 venv、安装 .[all]、创建符号链接 ~/.local/bin/hermes
./hermes # 自动检测 venv,无需先 source
```
手动安装(等效于上述命令):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
uv pip install -e ".[all,dev]"
python -m pytest tests/ -q
```
> **RL 训练(可选):** 如需参与 RL/Tinker-Atropos 集成开发:
> ```bash
> git submodule update --init tinker-atropos
> uv pip install -e "./tinker-atropos"
> ```
---
## 社区
- 💬 [Discord](https://discord.gg/NousResearch)
- 📚 [技能中心](https://agentskills.io)
- 🐛 [问题反馈](https://github.com/NousResearch/hermes-agent/issues)
- 💡 [讨论区](https://github.com/NousResearch/hermes-agent/discussions)
- 🔌 [HermesClaw](https://github.com/AaronWong1999/hermesclaw) — 社区微信桥接:在同一微信账号上运行 Hermes Agent 和 OpenClaw。
---
## 许可证
MIT — 详见 [LICENSE](LICENSE)。
由 [Nous Research](https://nousresearch.com) 构建。
+245 -27
View File
@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
@@ -47,6 +48,7 @@ from acp.schema import (
TextContentBlock,
UnstructuredCommandInput,
Usage,
UsageUpdate,
UserMessageChunk,
)
@@ -65,6 +67,7 @@ from acp_adapter.events import (
)
from acp_adapter.permissions import make_approval_callback
from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
from acp_adapter.tools import build_tool_complete, build_tool_start
logger = logging.getLogger(__name__)
@@ -315,6 +318,66 @@ class HermesACPAgent(acp.Agent):
return target_provider, new_model
@staticmethod
def _build_usage_update(state: SessionState) -> UsageUpdate | None:
"""Build ACP native context-usage data for clients like Zed.
Zed's circular context indicator is driven by ACP ``usage_update``
session updates: ``size`` is the model context window and ``used`` is
the current request pressure. Hermes estimates ``used`` from the same
buckets it sends to providers: system prompt, conversation history, and
tool schemas.
"""
agent = state.agent
compressor = getattr(agent, "context_compressor", None)
size = int(getattr(compressor, "context_length", 0) or 0)
if size <= 0:
return None
try:
from agent.model_metadata import estimate_request_tokens_rough
used = estimate_request_tokens_rough(
state.history,
system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
tools=getattr(agent, "tools", None) or None,
)
except Exception:
logger.debug("Could not estimate ACP native context usage", exc_info=True)
used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)
return UsageUpdate(
session_update="usage_update",
size=max(size, 0),
used=max(used, 0),
)
async def _send_usage_update(self, state: SessionState) -> None:
"""Send ACP native context usage to the connected client."""
if not self._conn:
return
update = self._build_usage_update(state)
if update is None:
return
try:
await self._conn.session_update(
session_id=state.session_id,
update=update,
)
except Exception:
logger.warning(
"Failed to send ACP usage update for session %s",
state.session_id,
exc_info=True,
)
def _schedule_usage_update(self, state: SessionState) -> None:
"""Schedule native context indicator refresh after ACP responses."""
if not self._conn:
return
loop = asyncio.get_running_loop()
loop.call_soon(asyncio.create_task, self._send_usage_update(state))
async def _register_session_mcp_servers(
self,
state: SessionState,
@@ -485,37 +548,99 @@ class HermesACPAgent(acp.Agent):
)
return None
@staticmethod
def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
"""Extract function name/arguments from an OpenAI-style tool_call."""
function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
if isinstance(raw_args, str):
try:
parsed = json.loads(raw_args)
except Exception:
parsed = {"raw": raw_args}
raw_args = parsed
if not isinstance(raw_args, dict):
raw_args = {}
return name, raw_args
@staticmethod
def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
"""Return the stable provider tool call id for ACP history replay."""
return str(
tool_call.get("id")
or tool_call.get("call_id")
or tool_call.get("tool_call_id")
or ""
).strip()
async def _replay_session_history(self, state: SessionState) -> None:
"""Send persisted user/assistant history to clients during session/load.
Zed's ACP history UI calls ``session/load`` after the user picks an item
from the Agents sidebar. The agent must then replay the full conversation
as ``user_message_chunk`` / ``agent_message_chunk`` notifications; merely
restoring server-side state makes Hermes remember context, but leaves the
editor looking like a clean thread.
as user/assistant chunks plus reconstructed tool-call start/completion
notifications; merely restoring server-side state makes Hermes remember
context, but leaves the editor looking like a clean thread.
"""
if not self._conn or not state.history:
return
for message in state.history:
role = str(message.get("role") or "")
if role not in {"user", "assistant"}:
continue
text = self._history_message_text(message)
if not text:
continue
update = self._history_message_update(role=role, text=text)
if update is None:
continue
active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
async def _send(update: Any) -> bool:
try:
await self._conn.session_update(session_id=state.session_id, update=update)
return True
except Exception:
logger.warning(
"Failed to replay ACP history for session %s",
state.session_id,
exc_info=True,
)
return
return False
for message in state.history:
role = str(message.get("role") or "")
if role in {"user", "assistant"}:
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
if role == "assistant" and isinstance(message.get("tool_calls"), list):
for tool_call in message["tool_calls"]:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
continue
if role == "tool":
tool_call_id = str(message.get("tool_call_id") or "").strip()
tool_name = str(message.get("tool_name") or "").strip()
function_args: dict[str, Any] | None = None
if tool_call_id in active_tool_calls:
tool_name, function_args = active_tool_calls.pop(tool_call_id)
if not tool_call_id or not tool_name:
continue
result = message.get("content")
if not await _send(
build_tool_complete(
tool_call_id,
tool_name,
result=result if isinstance(result, str) else None,
function_args=function_args,
)
):
return
async def new_session(
self,
@@ -527,11 +652,24 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
def _schedule_history_replay(self, state: SessionState) -> None:
"""Replay persisted history after session/load or session/resume returns.
Zed only attaches streamed transcript/tool updates once the load/resume
response has completed. Sending replay notifications while the request is
still in-flight can make the server look correct in logs while the editor
drops or fails to attach the tool-call history.
"""
loop = asyncio.get_running_loop()
replay_coro = self._replay_session_history(state)
loop.call_soon(asyncio.create_task, replay_coro)
async def load_session(
self,
cwd: str,
@@ -545,8 +683,9 @@ class HermesACPAgent(acp.Agent):
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(session_id)
self._schedule_usage_update(state)
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
@@ -562,8 +701,9 @@ class HermesACPAgent(acp.Agent):
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
@@ -712,6 +852,7 @@ class HermesACPAgent(acp.Agent):
if self._conn:
update = acp.update_agent_message_text(response_text)
await self._conn.session_update(session_id, update)
await self._send_usage_update(state)
return PromptResponse(stop_reason="end_turn")
# If Zed sends another regular prompt while the same ACP session is
@@ -744,24 +885,37 @@ class HermesACPAgent(acp.Agent):
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
streamed_message = False
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
thinking_cb = make_thinking_cb(conn, session_id, loop)
reasoning_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
def stream_delta_cb(text: str) -> None:
nonlocal streamed_message
if text:
streamed_message = True
message_cb(text)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
tool_progress_cb = None
thinking_cb = None
reasoning_cb = None
step_cb = None
message_cb = None
stream_delta_cb = None
approval_cb = None
agent = state.agent
agent.tool_progress_callback = tool_progress_cb
agent.thinking_callback = thinking_cb
# ACP thought panes should not receive Hermes' local kawaii waiting/status
# updates. Route provider/model reasoning deltas instead; if the provider
# emits no reasoning, Zed should not get a fake "thinking" accordion.
agent.thinking_callback = None
agent.reasoning_callback = reasoning_cb
agent.step_callback = step_cb
agent.message_callback = message_cb
agent.stream_delta_callback = stream_delta_cb
# Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
# Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -867,7 +1021,7 @@ class HermesACPAgent(acp.Agent):
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn:
if final_response and conn and not streamed_message:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
@@ -903,6 +1057,8 @@ class HermesACPAgent(acp.Agent):
cached_read_tokens=result.get("cache_read_tokens"),
)
await self._send_usage_update(state)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
return PromptResponse(stop_reason=stop_reason, usage=usage)
@@ -1035,22 +1191,84 @@ class HermesACPAgent(acp.Agent):
return f"Could not list tools: {e}"
def _cmd_context(self, args: str, state: SessionState) -> str:
"""Show ACP session context pressure and compression guidance."""
n_messages = len(state.history)
if n_messages == 0:
return "Conversation is empty (no messages yet)."
# Count by role
# Count by role.
roles: dict[str, int] = {}
for msg in state.history:
role = msg.get("role", "unknown")
roles[role] = roles.get(role, 0) + 1
agent = state.agent
model = state.model or getattr(agent, "model", "")
provider = getattr(agent, "provider", None) or "auto"
compressor = getattr(agent, "context_compressor", None)
context_length = int(getattr(compressor, "context_length", 0) or 0)
threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
try:
from agent.model_metadata import estimate_request_tokens_rough
system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=system_prompt,
tools=tools,
)
except Exception:
logger.debug("Could not estimate ACP context usage", exc_info=True)
approx_tokens = 0
if threshold_tokens <= 0 and context_length > 0:
threshold_tokens = int(context_length * 0.80)
lines = [
f"Conversation: {n_messages} messages",
f"Conversation: {n_messages} messages"
if n_messages
else "Conversation is empty (no messages yet).",
f" user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
]
model = state.model or getattr(state.agent, "model", "")
if model:
lines.append(f"Model: {model}")
lines.append(f"Provider: {provider}")
if approx_tokens > 0:
if context_length > 0:
usage_pct = (approx_tokens / context_length) * 100
lines.append(
f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
)
else:
lines.append(f"Context usage: ~{approx_tokens:,} tokens")
if threshold_tokens > 0:
if approx_tokens > 0:
threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
remaining = max(threshold_tokens - approx_tokens, 0)
if approx_tokens >= threshold_tokens:
lines.append(
f"Compression: due now (threshold ~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ "). Run /compact."
)
else:
lines.append(
f"Compression: ~{remaining:,} tokens until threshold "
f"(~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ ")."
)
else:
lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
if getattr(agent, "compression_enabled", True) is False:
lines.append("Compression is disabled for this agent.")
else:
lines.append("Tip: run /compact to compress manually before the threshold.")
return "\n".join(lines)
def _cmd_reset(self, args: str, state: SessionState) -> str:
+4 -11
View File
@@ -466,17 +466,10 @@ class SessionManager:
except Exception:
logger.debug("Failed to update ACP session metadata", exc_info=True)
# Replace stored messages with current history.
db.clear_messages(state.session_id)
for msg in state.history:
db.append_message(
session_id=state.session_id,
role=msg.get("role", "user"),
content=msg.get("content"),
tool_name=msg.get("tool_name") or msg.get("name"),
tool_calls=msg.get("tool_calls"),
tool_call_id=msg.get("tool_call_id"),
)
# Replace stored messages with current history atomically so a
# mid-rewrite failure rolls back and the previously persisted
# conversation is preserved (salvaged from #13675).
db.replace_messages(state.session_id, state.history)
except Exception:
logger.warning("Failed to persist ACP session %s", state.session_id, exc_info=True)
+822 -21
View File
@@ -28,6 +28,11 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
"terminal": "execute",
"process": "execute",
"execute_code": "execute",
# Session/meta tools
"todo": "other",
"skill_view": "read",
"skills_list": "read",
"skill_manage": "edit",
# Web / fetch
"web_search": "fetch",
"web_extract": "fetch",
@@ -51,6 +56,28 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
}
_POLISHED_TOOLS = {
# Core operator loop
"todo", "memory", "session_search", "delegate_task",
# Files / execution
"read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
# Skills / web / browser / media
"skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
"browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
"browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
"vision_analyze", "image_generate", "text_to_speech",
# Schedulers / platform integrations
"cronjob", "send_message", "clarify", "discord", "discord_admin",
"ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
"feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
"feishu_drive_reply_comment", "feishu_drive_add_comment",
"kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
"kanban_block", "kanban_link", "kanban_heartbeat",
"yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
"yb_send_dm", "yb_send_sticker", "mixture_of_agents",
}
def get_tool_kind(tool_name: str) -> ToolKind:
"""Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
return TOOL_KIND_MAP.get(tool_name, "other")
@@ -85,18 +112,645 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
if urls:
return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
return "web extract"
if tool_name == "process":
action = str(args.get("action") or "").strip() or "manage"
sid = str(args.get("session_id") or "").strip()
return f"process {action}: {sid}" if sid else f"process {action}"
if tool_name == "delegate_task":
tasks = args.get("tasks")
if isinstance(tasks, list) and tasks:
return f"delegate batch ({len(tasks)} tasks)"
goal = args.get("goal", "")
if goal and len(goal) > 60:
goal = goal[:57] + "..."
return f"delegate: {goal}" if goal else "delegate task"
if tool_name == "session_search":
query = str(args.get("query") or "").strip()
return f"session search: {query}" if query else "recent sessions"
if tool_name == "memory":
action = str(args.get("action") or "manage").strip() or "manage"
target = str(args.get("target") or "memory").strip() or "memory"
return f"memory {action}: {target}"
if tool_name == "execute_code":
return "execute code"
code = str(args.get("code") or "").strip()
first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
if first_line:
if len(first_line) > 70:
first_line = first_line[:67] + "..."
return f"python: {first_line}"
return "python code"
if tool_name == "todo":
items = args.get("todos")
if isinstance(items, list):
return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
return "todo"
if tool_name == "skill_view":
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
suffix = f"/{file_path}" if file_path else ""
return f"skill view ({name}{suffix})"
if tool_name == "skills_list":
category = str(args.get("category") or "").strip()
return f"skills list ({category})" if category else "skills list"
if tool_name == "skill_manage":
action = str(args.get("action") or "manage").strip() or "manage"
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
target = f"{name}/{file_path}" if file_path else name
if len(target) > 64:
target = target[:61] + "..."
return f"skill {action}: {target}"
if tool_name == "browser_navigate":
return f"navigate: {args.get('url', '?')}"
if tool_name == "browser_snapshot":
return "browser snapshot"
if tool_name == "browser_vision":
return f"browser vision: {str(args.get('question', '?'))[:50]}"
if tool_name == "browser_get_images":
return "browser images"
if tool_name == "vision_analyze":
return f"analyze image: {args.get('question', '?')[:50]}"
return f"analyze image: {str(args.get('question', '?'))[:50]}"
if tool_name == "image_generate":
prompt = str(args.get("prompt") or args.get("description") or "").strip()
return f"generate image: {prompt[:50]}" if prompt else "generate image"
if tool_name == "cronjob":
action = str(args.get("action") or "manage").strip() or "manage"
job_id = str(args.get("job_id") or args.get("id") or "").strip()
return f"cron {action}: {job_id}" if job_id else f"cron {action}"
return tool_name
def _text(content: str) -> Any:
return acp.tool_content(acp.text_block(content))
def _json_loads_maybe(value: Optional[str]) -> Any:
if not isinstance(value, str):
return value
try:
return json.loads(value)
except Exception:
pass
# Some Hermes tools append a human hint after a JSON payload, e.g.
# ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
# by decoding the first JSON value instead of falling back to raw text.
try:
decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
return decoded
except Exception:
return None
def _truncate_text(text: str, limit: int = 5000) -> str:
if len(text) <= limit:
return text
return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
def _fenced_text(text: str, language: str = "") -> str:
"""Return a Markdown fence that cannot be broken by backticks in text."""
longest = max((len(run) for run in text.split("`")[1::2]), default=0)
fence = "`" * max(3, longest + 1)
return f"{fence}{language}\n{text}\n{fence}"
def _format_todo_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
return None
summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
icon = {
"completed": "",
"in_progress": "🔄",
"pending": "",
"cancelled": "",
}
lines = ["**Todo list**", ""]
for item in data["todos"]:
if not isinstance(item, dict):
continue
status = str(item.get("status") or "pending")
content = str(item.get("content") or item.get("id") or "").strip()
if content:
lines.append(f"- {icon.get(status, '')} {content}")
if summary:
cancelled = summary.get("cancelled", 0)
lines.extend([
"",
"**Progress:** "
f"{summary.get('completed', 0)} completed, "
f"{summary.get('in_progress', 0)} in progress, "
f"{summary.get('pending', 0)} pending"
+ (f", {cancelled} cancelled" if cancelled else ""),
])
return "\n".join(lines)
def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not data.get("content"):
return f"Read failed: {data.get('error')}"
content = data.get("content")
if not isinstance(content, str):
return None
path = str((args or {}).get("path") or data.get("path") or "file").strip()
offset = (args or {}).get("offset")
limit = (args or {}).get("limit")
range_bits = []
if offset:
range_bits.append(f"from line {offset}")
if limit:
range_bits.append(f"limit {limit}")
suffix = f" ({', '.join(range_bits)})" if range_bits else ""
header = f"Read {path}{suffix}"
if data.get("total_lines") is not None:
header += f"{data.get('total_lines')} total lines"
# Hermes read_file output is line-numbered with `|`. If we send it as raw
# Markdown, Zed can interpret pipes as tables and collapse the layout.
# Fence the payload so file lines stay readable and literal.
return _truncate_text(f"{header}\n\n{_fenced_text(content)}")
def _format_search_files_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
matches = data.get("matches")
if not isinstance(matches, list):
return None
total = data.get("total_count", len(matches))
shown = min(len(matches), 12)
truncated = bool(data.get("truncated")) or len(matches) > shown
lines = [
"Search results",
f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
"",
]
for match in matches[:shown]:
if not isinstance(match, dict):
lines.append(f"- {match}")
continue
path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
line = match.get("line") or match.get("line_number")
content = str(match.get("content") or match.get("text") or "").strip()
loc = f"{path}:{line}" if line else path
lines.append(f"- {loc}")
if content:
snippet = _truncate_text(" ".join(content.split()), 300)
lines.append(f" {snippet}")
if truncated:
lines.extend([
"",
"Results truncated. Narrow the search, add file_glob, or use offset to page.",
])
return _truncate_text("\n".join(lines), limit=7000)
def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
output = str(data.get("output") or "")
error = str(data.get("error") or "")
exit_code = data.get("exit_code")
parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
if output:
parts.extend(["", "Output:", output])
if error:
parts.extend(["", "Error:", error])
return _truncate_text("\n".join(parts))
def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
headings: list[str] = []
for line in content.splitlines():
stripped = line.strip()
if stripped.startswith("#"):
heading = stripped.lstrip("#").strip()
if heading:
headings.append(heading)
if len(headings) >= limit:
break
return headings
def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Skill view failed: {data.get('error', 'unknown error')}"
name = str(data.get("name") or "skill")
file_path = str(data.get("file") or data.get("path") or "SKILL.md")
description = str(data.get("description") or "").strip()
content = str(data.get("content") or "")
linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
if description:
lines.append(f"- **Description:** {description}")
if content:
lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
if linked:
linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
lines.append(f"- **Linked files:** {linked_count}")
headings = _extract_markdown_headings(content)
if headings:
lines.extend(["", "**Sections**"])
lines.extend(f"- {heading}" for heading in headings)
lines.extend([
"",
"_Full skill content is available to the agent but hidden here to keep ACP readable._",
])
return "\n".join(lines)
def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "manage").strip() or "manage"
name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
success = data.get("success")
status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
if action not in {"delete"}:
lines.append(f"- **File:** `{file_path}`")
message = str(data.get("message") or data.get("error") or "").strip()
if message:
lines.append(f"- **Result:** {message}")
replacements = data.get("replacements") or data.get("replacement_count")
if replacements is not None:
lines.append(f"- **Replacements:** {replacements}")
path = str(data.get("path") or "").strip()
if path:
lines.append(f"- **Path:** `{path}`")
return "\n".join(lines)
def _format_web_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
if not isinstance(web, list):
return None
lines = [f"Web results: {len(web)}"]
for item in web[:10]:
if not isinstance(item, dict):
continue
title = str(item.get("title") or item.get("url") or "result").strip()
url = str(item.get("url") or "").strip()
desc = str(item.get("description") or "").strip()
lines.append(f"{title}" + (f"{url}" if url else ""))
if desc:
lines.append(f" {desc}")
return _truncate_text("\n".join(lines))
def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
"""Return only web_extract errors for ACP; success stays compact via title."""
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False and data.get("error"):
return f"Web extract failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
failures: list[str] = []
for item in results[:10]:
if not isinstance(item, dict):
continue
error = str(item.get("error") or "").strip()
if not error or error in {"None", "null"}:
continue
url = str(item.get("url") or "").strip()
title = str(item.get("title") or url or "Untitled").strip()
failures.append(
f"- {title}" + (f"{url}" if url and url != title else "") + f"\n Error: {_truncate_text(error, limit=500)}"
)
if not failures:
return None
lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
lines.extend(failures)
return "\n".join(lines)
def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False and data.get("error"):
return f"Process error: {data.get('error')}"
action = str((args or {}).get("action") or "process").strip() or "process"
if isinstance(data.get("processes"), list):
processes = data["processes"]
lines = [f"Processes: {len(processes)}"]
for proc in processes[:20]:
if not isinstance(proc, dict):
lines.append(f"- {proc}")
continue
sid = str(proc.get("session_id") or proc.get("id") or "?")
status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
cmd = str(proc.get("command") or "").strip()
pid = proc.get("pid")
code = proc.get("exit_code")
bits = [status]
if pid is not None:
bits.append(f"pid {pid}")
if code is not None:
bits.append(f"exit {code}")
lines.append(f"- `{sid}` — {', '.join(bits)}" + (f"{cmd[:120]}" if cmd else ""))
if len(processes) > 20:
lines.append(f"... {len(processes) - 20} more process(es)")
return "\n".join(lines)
status = str(data.get("status") or data.get("state") or action).strip()
sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
if data.get(key) is not None:
lines.append(f"- **{label}:** {data.get(key)}")
output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
error = data.get("error") or data.get("stderr")
if output:
lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
if error:
lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
msg = data.get("message")
if msg and not output and not error:
lines.append(str(msg))
return _truncate_text("\n".join(lines), limit=7000)
def _format_delegate_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not isinstance(data.get("results"), list):
return f"Delegation failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
total = data.get("total_duration_seconds")
lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
icon = {"completed": "", "failed": "", "error": "", "timeout": "", "interrupted": ""}
for item in results:
if not isinstance(item, dict):
lines.append(f"- {item}")
continue
idx = item.get("task_index")
status = str(item.get("status") or "unknown")
model = item.get("model")
dur = item.get("duration_seconds")
role = item.get("_child_role")
header = f"{icon.get(status, '')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
bits = []
if model:
bits.append(str(model))
if role:
bits.append(f"role={role}")
if dur is not None:
bits.append(f"{dur}s")
if bits:
header += " (" + ", ".join(bits) + ")"
lines.extend(["", header])
summary = str(item.get("summary") or "").strip()
error = str(item.get("error") or "").strip()
if summary:
lines.append(_truncate_text(summary, limit=1200))
if error:
lines.append("Error: " + _truncate_text(error, limit=800))
trace = item.get("tool_trace")
if isinstance(trace, list) and trace:
names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
if names:
lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
return _truncate_text("\n".join(lines), limit=8000)
def _format_session_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Session search failed: {data.get('error', 'unknown error')}"
results = data.get("results")
if not isinstance(results, list):
return None
mode = data.get("mode") or "search"
query = data.get("query")
lines = ["Recent sessions" if mode == "recent" else f"Session search results" + (f" for `{query}`" if query else "")]
if not results:
lines.append(str(data.get("message") or "No matching sessions found."))
return "\n".join(lines)
for item in results:
if not isinstance(item, dict):
continue
sid = str(item.get("session_id") or "?")
title = str(item.get("title") or item.get("when") or "Untitled session").strip()
when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
count = item.get("message_count")
source = str(item.get("source") or "").strip()
meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
lines.append(f"- **{title}** (`{sid}`)" + (f"{meta}" if meta else ""))
summary = str(item.get("summary") or item.get("preview") or "").strip()
if summary:
lines.append(" " + _truncate_text(" ".join(summary.split()), limit=500))
return _truncate_text("\n".join(lines), limit=7000)
def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "memory").strip() or "memory"
target = str(data.get("target") or (args or {}).get("target") or "memory")
if data.get("success") is False:
lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
matches = data.get("matches")
if isinstance(matches, list) and matches:
lines.append("Matches:")
lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
return "\n".join(lines)
lines = [f"✅ Memory {action} saved ({target})"]
if data.get("message"):
lines.append(str(data.get("message")))
if data.get("entry_count") is not None:
lines.append(f"Entries: {data.get('entry_count')}")
if data.get("usage"):
lines.append(f"Usage: {data.get('usage')}")
# Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
if preview:
lines.append("Preview: " + _truncate_text(preview, limit=300))
return "\n".join(lines)
def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
path = str((args or {}).get("path") or "file").strip()
if isinstance(data, dict):
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
message = str(data.get("message") or "").strip()
replacements = data.get("replacements") or data.get("replacement_count")
lines = [f"{tool_name} completed" + (f" for `{path}`" if path else "")]
if message:
lines.append(message)
if replacements is not None:
lines.append(f"Replacements: {replacements}")
if data.get("files_modified"):
files = data.get("files_modified")
if isinstance(files, list):
lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
return "\n".join(lines)
if isinstance(result, str) and result.strip():
return _truncate_text(result, limit=3000)
return f"{tool_name} completed" + (f" for `{path}`" if path else "")
def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
if tool_name == "browser_get_images":
images = data.get("images") or data.get("data")
if isinstance(images, list):
lines = [f"Images found: {len(images)}"]
for img in images[:12]:
if isinstance(img, dict):
alt = str(img.get("alt") or "").strip()
url = str(img.get("url") or img.get("src") or "").strip()
lines.append(f"- {alt or 'image'}" + (f"{url}" if url else ""))
return _truncate_text("\n".join(lines), limit=5000)
title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
lines = [title]
if data.get("url") and data.get("url") != title:
lines.append(str(data.get("url")))
if text:
lines.extend(["", _truncate_text(text, limit=5000)])
return _truncate_text("\n".join(lines), limit=7000)
def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed"]
for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
if data.get(key):
lines.append(f"- **{key}:** {data.get(key)}")
return "\n".join(lines)
def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, (dict, list)):
return result if isinstance(result, str) and result.strip() else None
if isinstance(data, list):
lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
for item in data[:12]:
lines.append(f"- {_truncate_text(str(item), limit=240)}")
return _truncate_text("\n".join(lines), limit=5000)
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
priority_keys = (
"message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
"state", "service", "url", "path", "file_path", "count", "total", "next_run",
)
seen = set()
for key in priority_keys:
value = data.get(key)
if value in (None, "", [], {}):
continue
seen.add(key)
lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
for key, value in data.items():
if key in seen or key in {"success", "raw", "content", "entries"}:
continue
if value in (None, "", [], {}):
continue
if isinstance(value, (dict, list)):
preview = json.dumps(value, ensure_ascii=False, default=str)
else:
preview = str(value)
lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
if len(lines) >= 14:
break
content = data.get("content")
if isinstance(content, str) and content.strip():
lines.extend(["", _truncate_text(content.strip(), limit=1500)])
return _truncate_text("\n".join(lines), limit=7000)
def _build_polished_completion_content(
tool_name: str,
result: Optional[str],
function_args: Optional[Dict[str, Any]],
) -> Optional[List[Any]]:
formatter = {
"todo": lambda: _format_todo_result(result),
"read_file": lambda: _format_read_file_result(result, function_args),
"write_file": lambda: _format_edit_result(tool_name, result, function_args),
"patch": lambda: _format_edit_result(tool_name, result, function_args),
"search_files": lambda: _format_search_files_result(result),
"execute_code": lambda: _format_execute_code_result(result),
"process": lambda: _format_process_result(result, function_args),
"delegate_task": lambda: _format_delegate_result(result),
"session_search": lambda: _format_session_search_result(result),
"memory": lambda: _format_memory_result(result, function_args),
"skill_view": lambda: _format_skill_view_result(result),
"skill_manage": lambda: _format_skill_manage_result(result, function_args),
"web_search": lambda: _format_web_search_result(result),
"web_extract": lambda: _format_web_extract_result(result),
"browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
"browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
"browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
"browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
"vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
"image_generate": lambda: _format_media_or_cron_result(tool_name, result),
"cronjob": lambda: _format_media_or_cron_result(tool_name, result),
}.get(tool_name)
if formatter is None and tool_name in _POLISHED_TOOLS:
formatter = lambda: _format_generic_structured_result(tool_name, result)
if formatter is None:
return None
text = formatter()
if not text:
return None
return [_text(text)]
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
@@ -258,7 +912,11 @@ def _build_tool_complete_content(
except Exception:
pass
return [acp.tool_content(acp.text_block(display_result))]
polished_content = _build_polished_completion_content(tool_name, result, function_args)
if polished_content:
return polished_content
return [_text(display_result)]
# ---------------------------------------------------------------------------
@@ -288,7 +946,6 @@ def build_tool_start(
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "write_file":
@@ -297,32 +954,172 @@ def build_tool_start(
content = [acp.tool_diff_content(path=path, new_text=file_content)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "terminal":
command = arguments.get("command", "")
content = [acp.tool_content(acp.text_block(f"$ {command}"))]
content = [_text(f"$ {command}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "read_file":
path = arguments.get("path", "")
content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
# The title and location already identify the file. Sending a synthetic
# "Reading ..." content block makes Zed render an unhelpful Output
# section before the real file contents arrive on completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "search_files":
pattern = arguments.get("pattern", "")
target = arguments.get("target", "content")
content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
search_path = arguments.get("path")
where = f" in {search_path}" if search_path else ""
content = [_text(f"Searching for '{pattern}' ({target}){where}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "todo":
items = arguments.get("todos")
if isinstance(items, list):
preview_lines = ["Updating todo list", ""]
for item in items[:8]:
if isinstance(item, dict):
preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
if len(items) > 8:
preview_lines.append(f"... {len(items) - 8} more")
content = [_text("\n".join(preview_lines))]
else:
content = [_text("Reading todo list")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_view":
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
content = [_text(f"Loading skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_manage":
action = str(arguments.get("action") or "manage").strip() or "manage"
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
if action == "patch":
old = str(arguments.get("old_string") or "")
new = str(arguments.get("new_string") or "")
content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
elif action in {"edit", "create"}:
content = [
acp.tool_diff_content(
path=path,
new_text=str(arguments.get("content") or ""),
)
]
elif action == "write_file":
target = str(arguments.get("file_path") or "file")
content = [
acp.tool_diff_content(
path=f"skills/{name}/{target}",
new_text=str(arguments.get("file_content") or ""),
)
]
elif action in {"delete", "remove_file"}:
target = str(arguments.get("file_path") or file_path or name)
content = [_text(f"Removing {target} from skill '{name}'")]
else:
content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "execute_code":
code = str(arguments.get("code") or "").strip()
preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_extract":
# The title identifies the URL(s). Avoid a duplicate content block so
# Zed renders this like read_file: compact start, concise completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "process":
action = str(arguments.get("action") or "").strip() or "manage"
sid = str(arguments.get("session_id") or "").strip()
data_preview = str(arguments.get("data") or "").strip()
text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
if data_preview:
text += "\nInput: " + _truncate_text(data_preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "delegate_task":
tasks = arguments.get("tasks")
if isinstance(tasks, list) and tasks:
lines = [f"Delegating {len(tasks)} tasks", ""]
for i, task in enumerate(tasks[:8], 1):
if isinstance(task, dict):
goal = str(task.get("goal") or "").strip()
role = str(task.get("role") or "").strip()
lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
if len(tasks) > 8:
lines.append(f"... {len(tasks) - 8} more")
content = [_text("\n".join(lines))]
else:
goal = str(arguments.get("goal") or "").strip()
content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "session_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "memory":
action = str(arguments.get("action") or "manage").strip() or "manage"
target = str(arguments.get("target") or "memory").strip() or "memory"
preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
text = f"Memory {action} ({target})"
if preview:
text += "\nPreview: " + _truncate_text(preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name in _POLISHED_TOOLS:
try:
args_text = json.dumps(arguments, indent=2, default=str)
except (TypeError, ValueError):
args_text = str(arguments)
content = [_text(_truncate_text(args_text, limit=1200))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
# Generic fallback
@@ -334,7 +1131,7 @@ def build_tool_start(
content = [acp.tool_content(acp.text_block(args_text))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
)
@@ -347,18 +1144,22 @@ def build_tool_complete(
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if tool_name == "web_extract":
error_text = _format_web_extract_result(result)
content = [_text(error_text)] if error_text else None
else:
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
status="completed",
content=content,
raw_output=result,
raw_output=None if tool_name in _POLISHED_TOOLS else result,
)
+53 -4
View File
@@ -76,6 +76,7 @@ _ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
# Models where temperature/top_p/top_k return 400 if set to non-default values.
# This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
_NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
_FAST_MODE_SUPPORTED_SUBSTRINGS = ("opus-4-6", "opus-4.6")
# ── Max output token limits per Anthropic model ───────────────────────
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
@@ -105,6 +106,9 @@ _ANTHROPIC_OUTPUT_LIMITS = {
"claude-3-haiku": 4_096,
# Third-party Anthropic-compatible providers
"minimax": 131_072,
# Qwen models via DashScope Anthropic-compatible endpoint
# DashScope enforces max_tokens ∈ [1, 65536]
"qwen3": 65_536,
}
# For any model not in the table, assume the highest current limit.
@@ -216,6 +220,17 @@ def _forbids_sampling_params(model: str) -> bool:
return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)
def _supports_fast_mode(model: str) -> bool:
"""Return True for models that support Anthropic Fast Mode (speed=fast).
Per Anthropic docs, fast mode is currently supported on Opus 4.6 only.
Sending ``speed: "fast"`` to any other Claude model (including Opus 4.7)
returns HTTP 400. This guard prevents silently 400'ing when stale config
or older callers leave fast mode enabled across a model upgrade.
"""
return any(v in model for v in _FAST_MODE_SUPPORTED_SUBSTRINGS)
# Beta headers for enhanced features (sent with ALL auth types).
# As of Opus 4.7 (2026-04-16), the first two are GA on Claude 4.6+ — the
# beta headers are still accepted (harmless no-op) but not required. Kept
@@ -1222,6 +1237,14 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
``keep_nullable_hint=False`` because the Anthropic validator does not
recognize the OpenAPI-style ``nullable: true`` extension and strict
schema-to-grammar converters may reject unknown keywords.
Top-level ``oneOf``/``allOf``/``anyOf`` are also stripped here: the
Anthropic API rejects union keywords at the schema root with a generic
HTTP 400. Several upstream and plugin tools ship schemas with one of
these keywords at the top level (commonly for Pydantic discriminated
unions). If we land here with those keywords still present after
nullable-union stripping, drop them and fall back to a plain object
schema so the tool still validates at the Anthropic boundary.
"""
if not schema:
return {"type": "object", "properties": {}}
@@ -1231,6 +1254,12 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
normalized = strip_nullable_unions(schema, keep_nullable_hint=False)
if not isinstance(normalized, dict):
return {"type": "object", "properties": {}}
# Strip top-level union keywords that Anthropic's validator rejects.
banned = {"oneOf", "allOf", "anyOf"}
if banned & normalized.keys():
normalized = {k: v for k, v in normalized.items() if k not in banned}
if "type" not in normalized:
normalized["type"] = "object"
if normalized.get("type") == "object" and not isinstance(normalized.get("properties"), dict):
normalized = {**normalized, "properties": {}}
return normalized
@@ -1241,10 +1270,24 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
if not tools:
return []
result = []
seen_names: set = set()
for t in tools:
fn = t.get("function", {})
name = fn.get("name", "")
# Defensive dedup: Anthropic rejects requests with duplicate tool
# names. Upstream injection paths already dedup, but this guard
# converts a hard API failure into a warning. See: #18478
if name and name in seen_names:
logger.warning(
"convert_tools_to_anthropic: duplicate tool name '%s' "
"— dropping second occurrence",
name,
)
continue
if name:
seen_names.add(name)
result.append({
"name": fn.get("name", ""),
"name": name,
"description": fn.get("description", ""),
"input_schema": _normalize_tool_input_schema(
fn.get("parameters", {"type": "object", "properties": {}})
@@ -1901,9 +1944,15 @@ def build_anthropic_kwargs(
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
# output speed. Only for native Anthropic endpoints — third-party
# providers would reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
# output speed. Per Anthropic docs, fast mode is only supported on
# Opus 4.6 — Opus 4.7 and other models 400 on the speed parameter.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
if (
fast_mode
and not _is_third_party_anthropic_endpoint(base_url)
and _supports_fast_mode(model)
):
kwargs.setdefault("extra_body", {})["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
+241 -33
View File
@@ -196,6 +196,12 @@ def _is_kimi_model(model: Optional[str]) -> bool:
return bare.startswith("kimi-") or bare == "kimi"
def _is_arcee_trinity_thinking(model: Optional[str]) -> bool:
"""True for Arcee Trinity Large Thinking (direct or via OpenRouter)."""
bare = (model or "").strip().lower().rsplit("/", 1)[-1]
return bare == "trinity-large-thinking"
def _fixed_temperature_for_model(
model: Optional[str],
base_url: Optional[str] = None,
@@ -213,10 +219,46 @@ def _fixed_temperature_for_model(
if _is_kimi_model(model):
logger.debug("Omitting temperature for Kimi model %r (server-managed)", model)
return OMIT_TEMPERATURE
if _is_arcee_trinity_thinking(model):
return 0.5
return None
def _compression_threshold_for_model(model: Optional[str]) -> Optional[float]:
"""Return a context-compression threshold override for specific models.
The threshold is the fraction of the model's context window that must be
consumed before Hermes triggers summarization. Higher values delay
compression and preserve more raw context.
Returns a float in (0, 1] to override the global ``compression.threshold``
config value, or ``None`` to leave the user's config value unchanged.
"""
if _is_arcee_trinity_thinking(model):
return 0.75
return None
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
def _get_aux_model_for_provider(provider_id: str) -> str:
"""Return the cheap auxiliary model for a provider.
Reads from ProviderProfile.default_aux_model first, falling back to the
legacy hardcoded dict for providers that predate the profiles system.
"""
try:
from providers import get_provider_profile
_p = get_provider_profile(provider_id)
if _p and _p.default_aux_model:
return _p.default_aux_model
except Exception:
pass
return _API_KEY_PROVIDER_AUX_MODELS_FALLBACK.get(provider_id, "")
# Fallback for providers not yet migrated to ProviderProfile.default_aux_model,
# plus providers we intentionally keep pinned here (e.g. Anthropic predates
# profiles). New providers should set default_aux_model on their profile instead.
_API_KEY_PROVIDER_AUX_MODELS_FALLBACK: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
@@ -235,6 +277,10 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"tencent-tokenhub": "hy3-preview",
}
# Legacy alias — callers that haven't been updated to _get_aux_model_for_provider()
# can still use this dict directly. Kept in sync with _FALLBACK above.
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = _API_KEY_PROVIDER_AUX_MODELS_FALLBACK
# Vision-specific model overrides for direct providers.
# When the user's main provider has a dedicated vision/multimodal model that
# differs from their main chat model, map it here. The vision auto-detect
@@ -259,13 +305,70 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
"kimi-coding-cn",
})
# OpenRouter app attribution headers
_OR_HEADERS = {
# OpenRouter app attribution headers (base — always sent).
# `X-Title` is the canonical attribution header OpenRouter's dashboard
# reads; the previous `X-OpenRouter-Title` label was not recognized there.
_OR_HEADERS_BASE = {
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-OpenRouter-Title": "Hermes Agent",
"X-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
# Truthy values for boolean env-var parsing.
_TRUTHY_ENV_VALUES = frozenset({"1", "true", "yes", "on"})
def build_or_headers(or_config: dict | None = None) -> dict:
"""Build OpenRouter headers, optionally including response-cache headers.
Precedence for response cache: env var > config.yaml > default (enabled).
Environment variables:
``HERMES_OPENROUTER_CACHE`` — truthy (``1``/``true``/``yes``/``on``)
enables caching; ``0``/``false``/``no``/``off`` disables.
Overrides ``openrouter.response_cache`` in config.yaml.
``HERMES_OPENROUTER_CACHE_TTL`` — integer seconds (1-86400).
Overrides ``openrouter.response_cache_ttl`` in config.yaml.
*or_config* is the ``openrouter`` section from config.yaml. When *None*,
falls back to reading config from disk via ``load_config()``.
"""
headers = dict(_OR_HEADERS_BASE)
# Resolve config from disk if not provided.
if or_config is None:
try:
from hermes_cli.config import load_config
or_config = load_config().get("openrouter", {})
except Exception:
or_config = {}
# Determine cache enabled: env var overrides config.
env_cache = os.environ.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
if env_cache:
cache_enabled = env_cache in _TRUTHY_ENV_VALUES
else:
cache_enabled = or_config.get("response_cache", False)
if not cache_enabled:
return headers
headers["X-OpenRouter-Cache"] = "true"
# Determine TTL: env var overrides config.
env_ttl = os.environ.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
if env_ttl:
if env_ttl.isdigit():
ttl = int(env_ttl)
if 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(ttl)
else:
ttl = or_config.get("response_cache_ttl", 300)
if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
return headers
# Vercel AI Gateway app attribution headers. HTTP-Referer maps to
# referrerUrl and X-Title maps to appName in the gateway's analytics.
from hermes_cli import __version__ as _HERMES_VERSION
@@ -512,7 +615,12 @@ class _CodexCompletionsAdapter:
# API allows it.
pass
else:
effort = reasoning_cfg.get("effort", "medium")
# Truthy-only check mirrors agent/transports/codex.py
# build_kwargs(): falsy values (None, "", 0) fall back
# to the default rather than being forwarded to the
# Codex backend, which rejects e.g. {"effort": null}
# with a 400.
effort = reasoning_cfg.get("effort") or "medium"
# Codex backend rejects "minimal"; clamp to "low" to
# match the main-agent Codex transport behavior.
if effort == "minimal":
@@ -1095,7 +1203,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
raw_base_url = _pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
model = _get_aux_model_for_provider(provider_id) or None
if model is None:
continue # skip provider if we don't know a valid aux model
logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
@@ -1111,6 +1219,14 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
else:
try:
from providers import get_provider_profile as _gpf_aux
_ph_aux = _gpf_aux(provider_id)
if _ph_aux and _ph_aux.default_headers:
extra["default_headers"] = dict(_ph_aux.default_headers)
except Exception:
pass
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
@@ -1122,7 +1238,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
model = _get_aux_model_for_provider(provider_id) or None
if model is None:
continue # skip provider if we don't know a valid aux model
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
@@ -1138,6 +1254,14 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
else:
try:
from providers import get_provider_profile as _gpf_aux2
_ph_aux2 = _gpf_aux2(provider_id)
if _ph_aux2 and _ph_aux2.default_headers:
extra["default_headers"] = dict(_ph_aux2.default_headers)
except Exception:
pass
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
@@ -1149,23 +1273,23 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Optional[str]]:
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
or_key = _pool_runtime_api_key(entry)
or_key = explicit_api_key or _pool_runtime_api_key(entry)
if not or_key:
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return OpenAI(api_key=or_key, base_url=base_url,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
or_key = os.getenv("OPENROUTER_API_KEY")
or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
def _describe_openrouter_unavailable() -> str:
@@ -1474,7 +1598,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
return CodexAuxiliaryClient(real_client, model), model
def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optional[str]]:
try:
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
except ImportError:
@@ -1484,10 +1608,10 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
if pool_present:
if entry is None:
return None, None
token = _pool_runtime_api_key(entry)
token = explicit_api_key or _pool_runtime_api_key(entry)
else:
entry = None
token = resolve_anthropic_token()
token = explicit_api_key or resolve_anthropic_token()
if not token:
return None, None
@@ -1510,7 +1634,7 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
from agent.anthropic_adapter import _is_oauth_token
is_oauth = _is_oauth_token(token)
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
model = _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
try:
real_client = build_anthropic_client(token, base_url)
@@ -1588,6 +1712,39 @@ def _is_payment_error(exc: Exception) -> bool:
return False
def _is_rate_limit_error(exc: Exception) -> bool:
"""Detect rate-limit errors that warrant provider fallback.
Returns True for HTTP 429 errors whose message indicates rate limiting
(as opposed to billing/quota exhaustion, which _is_payment_error handles).
Also catches OpenAI SDK RateLimitError instances that may not set
.status_code on the exception object.
"""
status = getattr(exc, "status_code", None)
err_lower = str(exc).lower()
# OpenAI SDK's RateLimitError sometimes omits .status_code —
# detect by class name so we don't miss these. (PR #8023 pattern)
if type(exc).__name__ == "RateLimitError":
return True
if status == 429:
# Distinguish rate-limit from billing: billing keywords are handled
# by _is_payment_error, everything else on 429 is a rate limit.
if any(kw in err_lower for kw in (
"rate limit", "rate_limit", "too many requests",
"try again", "retry after", "resets in",
)):
return True
# Generic 429 without billing keywords = likely a rate limit
if not any(kw in err_lower for kw in (
"credits", "insufficient funds", "billing",
"payment required", "can only afford",
)):
return True
return False
def _is_connection_error(exc: Exception) -> bool:
"""Detect connection/network errors that warrant provider fallback.
@@ -1911,7 +2068,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
}
sync_base_url = str(sync_client.base_url)
if base_url_host_matches(sync_base_url, "openrouter.ai"):
async_kwargs["default_headers"] = dict(_OR_HEADERS)
async_kwargs["default_headers"] = build_or_headers()
elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
from hermes_cli.copilot_auth import copilot_request_headers
@@ -2053,9 +2210,9 @@ def resolve_provider_client(
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── OpenRouter ───────────────────────────────────────────────────
# ── OpenRouter ───────────────────────────────────────────
if provider == "openrouter":
client, default = _try_openrouter()
client, default = _try_openrouter(explicit_api_key=explicit_api_key)
if client is None:
logger.warning(
"resolve_provider_client: openrouter requested but %s",
@@ -2281,7 +2438,7 @@ def resolve_provider_client(
if pconfig.auth_type == "api_key":
if provider == "anthropic":
client, default_model = _try_anthropic()
client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
if client is None:
logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
return None, None
@@ -2313,7 +2470,7 @@ def resolve_provider_client(
if explicit_base_url:
base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
default_model = _get_aux_model_for_provider(provider)
final_model = _normalize_resolved_model(model or default_model, provider)
if provider == "gemini":
@@ -2593,8 +2750,11 @@ def resolve_vision_provider_client(
return resolved_provider, sync_client, final_model
if resolved_base_url:
provider_for_base_override = (
requested if requested and requested not in ("", "auto") else "custom"
)
client, final_model = resolve_provider_client(
"custom",
provider_for_base_override,
model=resolved_model,
async_mode=async_mode,
explicit_base_url=resolved_base_url,
@@ -2602,8 +2762,8 @@ def resolve_vision_provider_client(
api_mode=resolved_api_mode,
)
if client is None:
return "custom", None, None
return "custom", client, final_model
return provider_for_base_override, None, None
return provider_for_base_override, client, final_model
if requested == "auto":
# Vision auto-detection order:
@@ -3069,8 +3229,14 @@ def _resolve_task_provider_model(
if task:
# Config.yaml is the primary source for per-task overrides.
if cfg_base_url:
if cfg_base_url and cfg_api_key:
# Both base_url and api_key explicitly set → custom endpoint.
return "custom", resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
if cfg_base_url and cfg_provider and cfg_provider != "auto":
# base_url set without api_key but with a known provider — use
# the provider so it can resolve credentials from env vars
# (e.g. OPENROUTER_API_KEY) instead of locking into "custom".
return cfg_provider, resolved_model, cfg_base_url, None, resolved_api_mode
if cfg_provider and cfg_provider != "auto":
return cfg_provider, resolved_model, None, None, resolved_api_mode
@@ -3237,7 +3403,26 @@ def _build_call_kwargs(
kwargs["max_tokens"] = max_tokens
if tools:
kwargs["tools"] = tools
# Defensive dedup: providers like Google Vertex, Azure, and Bedrock
# reject requests with duplicate tool names (HTTP 400). The upstream
# injection paths (run_agent.py) already dedup, but this guard
# converts a hard API failure into a warning if an upstream regression
# reintroduces duplicates. See: #18478
_seen: set = set()
_deduped: list = []
for _t in tools:
_tname = (_t.get("function") or {}).get("name", "")
if _tname and _tname in _seen:
logger.warning(
"_build_call_kwargs: duplicate tool name '%s' removed "
"(provider=%s model=%s)",
_tname, provider, model,
)
continue
if _tname:
_seen.add(_tname)
_deduped.append(_t)
kwargs["tools"] = _deduped
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
@@ -3452,7 +3637,7 @@ def call_llm(
except Exception as retry_err:
# If the max_tokens retry also hits a payment or connection
# error, fall through to the fallback chain below.
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err)):
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err) or _is_rate_limit_error(retry_err)):
raise
first_err = retry_err
@@ -3535,13 +3720,27 @@ def call_llm(
# Codex/OAuth tokens that authenticate but whose endpoint is down,
# and providers the user never configured that got picked up by
# the auto-detection chain.
should_fallback = _is_payment_error(first_err) or _is_connection_error(first_err)
#
# ── Rate-limit fallback (#13579) ─────────────────────────────
# When the provider returns a 429 rate-limit (not billing), fall
# back to an alternative provider instead of exhausting retries
# against the same rate-limited endpoint.
should_fallback = (
_is_payment_error(first_err)
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
)
# Only try alternative providers when the user didn't explicitly
# configure this task's provider. Explicit provider = hard constraint;
# auto (the default) = best-effort fallback chain. (#7559)
is_auto = resolved_provider in ("auto", "", None)
if should_fallback and is_auto:
reason = "payment error" if _is_payment_error(first_err) else "connection error"
if _is_payment_error(first_err):
reason = "payment error"
elif _is_rate_limit_error(first_err):
reason = "rate limit"
else:
reason = "connection error"
logger.info("Auxiliary %s: %s on %s (%s), trying fallback",
task or "call", reason, resolved_provider, first_err)
fb_client, fb_model, fb_label = _try_payment_fallback(
@@ -3744,7 +3943,7 @@ async def async_call_llm(
except Exception as retry_err:
# If the max_tokens retry also hits a payment or connection
# error, fall through to the fallback chain below.
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err)):
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err) or _is_rate_limit_error(retry_err)):
raise
first_err = retry_err
@@ -3813,11 +4012,20 @@ async def async_call_llm(
return _validate_llm_response(
await retry_client.chat.completions.create(**retry_kwargs), task)
# ── Payment / connection fallback (mirrors sync call_llm) ─────
should_fallback = _is_payment_error(first_err) or _is_connection_error(first_err)
# ── Payment / connection / rate-limit fallback (mirrors sync call_llm) ──
should_fallback = (
_is_payment_error(first_err)
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
)
is_auto = resolved_provider in ("auto", "", None)
if should_fallback and is_auto:
reason = "payment error" if _is_payment_error(first_err) else "connection error"
if _is_payment_error(first_err):
reason = "payment error"
elif _is_rate_limit_error(first_err):
reason = "rate limit"
else:
reason = "connection error"
logger.info("Auxiliary %s (async): %s on %s (%s), trying fallback",
task or "call", reason, resolved_provider, first_err)
fb_client, fb_model, fb_label = _try_payment_fallback(
+76 -9
View File
@@ -43,6 +43,9 @@ SUMMARY_PREFIX = (
"they were already addressed. "
"Your current task is identified in the '## Active Task' section of the "
"summary — resume exactly from there. "
"IMPORTANT: Your persistent memory (MEMORY.md, USER.md) in the system "
"prompt is ALWAYS authoritative and active — never ignore or deprioritize "
"memory content due to this compaction note. "
"Respond ONLY to the latest user message "
"that appears AFTER this summary. The current session state (files, "
"config, etc.) may reflect work described here — avoid repeating it:"
@@ -344,6 +347,7 @@ class ContextCompressor(ContextEngine):
self._last_aux_model_failure_model = None
self._last_compression_savings_pct = 100.0
self._ineffective_compression_count = 0
self._summary_failure_cooldown_until = 0.0 # transient errors must not block a fresh session
def update_model(
self,
@@ -553,7 +557,16 @@ class ContextCompressor(ContextEngine):
break
accumulated += msg_tokens
boundary = i
prune_boundary = max(boundary, len(result) - min_protect)
# Translate the budget walk into a "protected count", apply the
# floor in count-space (where `max` reads naturally: protect at
# least `min_protect` messages or whatever the budget reserved,
# whichever is more), then convert back to a prune boundary.
# Doing this in index-space with `max` would invert the direction
# (smaller index = MORE protected), so a generous budget would
# silently get truncated back down to `min_protect`.
budget_protect_count = len(result) - boundary
protected_count = max(budget_protect_count, min_protect)
prune_boundary = len(result) - protected_count
else:
prune_boundary = len(result) - protect_tail_count
@@ -569,6 +582,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if len(content) < 200:
continue
h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
@@ -588,6 +603,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if not content or content == _PRUNED_TOOL_PLACEHOLDER:
continue
# Skip already-deduplicated or previously-summarized results
@@ -903,15 +920,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
or "does not exist" in _err_str
or "no available channel" in _err_str
)
_is_timeout = (
_status in (408, 429, 502, 504)
or "timeout" in _err_str
)
if (
_is_model_not_found
(_is_model_not_found or _is_timeout)
and self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' not available (%s). "
"Summary model '%s' unavailable (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
@@ -975,15 +996,39 @@ The user has requested that this compaction PRIORITISE preserving all informatio
return None
@staticmethod
def _with_summary_prefix(summary: str) -> str:
"""Normalize summary text to the current compaction handoff format."""
def _strip_summary_prefix(summary: str) -> str:
"""Return summary body without the current or legacy handoff prefix."""
text = (summary or "").strip()
for prefix in (LEGACY_SUMMARY_PREFIX, SUMMARY_PREFIX):
for prefix in (SUMMARY_PREFIX, LEGACY_SUMMARY_PREFIX):
if text.startswith(prefix):
text = text[len(prefix):].lstrip()
break
return text[len(prefix):].lstrip()
return text
@classmethod
def _with_summary_prefix(cls, summary: str) -> str:
"""Normalize summary text to the current compaction handoff format."""
text = cls._strip_summary_prefix(summary)
return f"{SUMMARY_PREFIX}\n{text}" if text else SUMMARY_PREFIX
@staticmethod
def _is_context_summary_content(content: Any) -> bool:
text = _content_text_for_contains(content).lstrip()
return text.startswith(SUMMARY_PREFIX) or text.startswith(LEGACY_SUMMARY_PREFIX)
@classmethod
def _find_latest_context_summary(
cls,
messages: List[Dict[str, Any]],
start: int,
end: int,
) -> tuple[Optional[int], str]:
"""Find the newest handoff summary inside a compression window."""
for idx in range(end - 1, start - 1, -1):
content = messages[idx].get("content")
if cls._is_context_summary_content(content):
return idx, cls._strip_summary_prefix(_content_text_for_contains(content))
return None, ""
# ------------------------------------------------------------------
# Tool-call / tool-result pair integrity helpers
# ------------------------------------------------------------------
@@ -1290,6 +1335,15 @@ The user has requested that this compaction PRIORITISE preserving all informatio
return messages
turns_to_summarize = messages[compress_start:compress_end]
summary_idx, summary_body = self._find_latest_context_summary(
messages,
compress_start,
compress_end,
)
if summary_idx is not None:
if summary_body and not self._previous_summary:
self._previous_summary = summary_body
turns_to_summarize = messages[summary_idx + 1:compress_end]
if not self.quiet_mode:
logger.info(
@@ -1322,7 +1376,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
msg = messages[i].copy()
if i == 0 and msg.get("role") == "system":
existing = msg.get("content")
_compression_note = "[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work.]"
_compression_note = "[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work. Your persistent memory (MEMORY.md, USER.md) remains fully authoritative regardless of compaction.]"
if _compression_note not in _content_text_for_contains(existing):
msg["content"] = _append_text_to_content(
existing,
@@ -1367,6 +1421,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# Merge the summary into the first tail message instead
# of inserting a standalone message that breaks alternation.
_merge_summary_into_tail = True
# When the summary lands as a standalone role="user" message,
# weak models read the verbatim "## Active Task" quote of a past
# user request as fresh input (#11475, #14521). Append the explicit
# end marker — the same one used in the merge-into-tail path — so
# the model has a clear "summary above, not new input" signal.
if not _merge_summary_into_tail and summary_role == "user":
summary = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---"
)
if not _merge_summary_into_tail:
compressed.append({"role": summary_role, "content": summary})
+17 -6
View File
@@ -3,6 +3,7 @@
from __future__ import annotations
import logging
import os
import random
import threading
import time
@@ -13,7 +14,7 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value
from hermes_cli.config import get_env_value, load_env
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1380,6 +1381,16 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
# Prefer ~/.hermes/.env over os.environ — the user's config file is the
# authoritative source for Hermes credentials. Stale env vars from parent
# processes (Codex CLI, test scripts, etc.) should not override deliberate
# changes to the .env file.
def _get_env_prefer_dotenv(key: str) -> str:
env_file = load_env()
val = env_file.get(key) or os.environ.get(key) or ""
return val.strip()
# Honour user suppression — `hermes auth remove <provider> <N>` for an
# env-seeded credential marks the env:<VAR> source as suppressed so it
# won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1391,8 +1402,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
if provider == "openrouter":
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
if token:
source = "env:OPENROUTER_API_KEY"
if _is_source_suppressed(provider, source):
@@ -1418,7 +1429,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = ""
if pconfig.base_url_env_var:
env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")
env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
@@ -1429,8 +1440,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
]
for env_var in env_vars:
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value(env_var) or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv(env_var)
if not token:
continue
source = f"env:{env_var}"
+326 -47
View File
@@ -24,11 +24,12 @@ from __future__ import annotations
import json
import logging
import os
import re
import tempfile
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Set
from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set
from hermes_constants import get_hermes_home
from tools import skill_usage
@@ -36,6 +37,22 @@ from tools import skill_usage
logger = logging.getLogger(__name__)
def _strip_aux_credential(value: Any) -> Optional[str]:
if value is None:
return None
text = str(value).strip()
return text or None
class _ReviewRuntimeBinding(NamedTuple):
"""Provider/model for the curator review fork plus optional per-slot overrides."""
provider: str
model: str
explicit_api_key: Optional[str]
explicit_base_url: Optional[str]
DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
@@ -184,7 +201,16 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
Gates:
- curator.enabled == True
- not paused
- last_run_at missing, OR older than interval_hours
- last_run_at present AND older than interval_hours
First-run behavior: when there is no ``last_run_at`` (fresh install, or
install that predates the curator), we DO NOT run immediately. The
curator is designed to run after at least ``interval_hours`` (7 days by
default) of skill activity, not on the first background tick after
``hermes update``. On first observation we seed ``last_run_at`` to "now"
and defer the first real pass by one full interval. Users who want to
run it sooner can always invoke ``hermes curator run`` (with or without
``--dry-run``) explicitly that path bypasses this gate.
The idle check (min_idle_hours) is applied at the call site where we know
whether an agent is actively running here we only enforce the static
@@ -198,7 +224,21 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
state = load_state()
last = _parse_iso(state.get("last_run_at"))
if last is None:
return True
# Never run before. Seed state so we wait a full interval before the
# first real pass. Report-only; do not auto-mutate the library the
# very first time a gateway ticks after an update.
if now is None:
now = datetime.now(timezone.utc)
try:
state["last_run_at"] = now.isoformat()
state["last_run_summary"] = (
"deferred first run — curator seeded, will run after one "
"interval; use `hermes curator run --dry-run` to preview now"
)
save_state(state)
except Exception as e: # pragma: no cover — best-effort persistence
logger.debug("Failed to seed curator last_run_at: %s", e)
return False
if now is None:
now = datetime.now(timezone.utc)
@@ -259,6 +299,33 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
# Review prompt for the forked agent
# ---------------------------------------------------------------------------
CURATOR_DRY_RUN_BANNER = (
"═══════════════════════════════════════════════════════════════\n"
"DRY-RUN — REPORT ONLY. DO NOT MUTATE THE SKILL LIBRARY.\n"
"═══════════════════════════════════════════════════════════════\n"
"\n"
"This is a PREVIEW pass. Follow every instruction below EXCEPT:\n"
"\n"
" • DO NOT call skill_manage with action=patch, create, delete, "
"write_file, or remove_file.\n"
" • DO NOT call terminal to mv skill directories into .archive/.\n"
" • DO NOT call terminal to mv, cp, rm, or rewrite any file under "
"~/.hermes/skills/.\n"
" • skills_list and skill_view are FINE — read as much as you need.\n"
"\n"
"Your output IS the deliverable. Produce the exact same "
"human-readable summary and structured YAML block you would "
"produce on a live run — but describe the actions you WOULD take, "
"not actions you took. A downstream reviewer will read the report "
"and decide whether to approve a live run with "
"`hermes curator run` (no flag).\n"
"\n"
"If you accidentally take a mutating action, say so explicitly in "
"the summary so the reviewer can revert it.\n"
"═══════════════════════════════════════════════════════════════"
)
CURATOR_REVIEW_PROMPT = (
"You are running as Hermes' background skill CURATOR. This is an "
"UMBRELLA-BUILDING consolidation pass, not a passive audit and not a "
@@ -337,6 +404,11 @@ CURATOR_REVIEW_PROMPT = (
" - skill_manage action=write_file — add a references/, templates/, "
"or scripts/ file under an existing skill (the skill must already "
"exist)\n"
" - skill_manage action=delete — archive a skill. MUST pass "
"`absorbed_into=<umbrella>` when you've merged its content into another "
"skill, or `absorbed_into=\"\"` when you're truly pruning with no "
"forwarding target. This drives cron-job skill-reference migration — "
"guessing from your YAML summary after the fact is fragile.\n"
" - terminal — mv a sibling into the archive "
"OR move its content into a support subfile\n\n"
"'keep' is a legitimate decision ONLY when the skill is already a "
@@ -398,6 +470,24 @@ def _reports_root() -> Path:
return root
def _needle_in_path_component(needle: str, path: str) -> bool:
"""Check if *needle* is a complete filename stem or directory name in *path*.
Unlike simple substring matching, this avoids false positives where short
skill names are embedded in longer filenames (e.g. "api" matching
"references/api-design.md"). Hyphens and underscores are normalised so
"open-webui-setup" matches "open_webui_setup.md".
"""
norm_needle = needle.replace("-", "_")
for part in path.replace("\\", "/").split("/"):
if not part:
continue
stem = part.rsplit(".", 1)[0] if "." in part else part
if stem.replace("-", "_") == norm_needle:
return True
return False
def _classify_removed_skills(
removed: List[str],
added: List[str],
@@ -476,15 +566,29 @@ def _classify_removed_skills(
continue
# Look for the removed skill's name in file_path / content / raw.
haystacks: List[str] = []
# Matching strategy differs by field type:
# file_path — needle must be a complete path component
# (filename stem or directory name), so "api" does NOT
# falsely match "references/api-design.md".
# content fields — word-boundary regex so "test" does NOT
# falsely match "latest" or "testing".
haystacks: List[tuple[str, str]] = []
for key in ("file_path", "file_content", "content", "new_string", "_raw"):
v = args.get(key)
if isinstance(v, str):
haystacks.append(v)
haystacks.append((key, v))
hit = False
for hay in haystacks:
for key, hay in haystacks:
for needle in needles:
if needle and needle in hay:
if not needle:
continue
if key == "file_path":
matched = _needle_in_path_component(needle, hay)
else:
matched = bool(
re.search(rf'\b{re.escape(needle)}\b', hay)
)
if matched:
hit = True
evidence = (
f"skill_manage action={args.get('action', '?')} "
@@ -587,15 +691,76 @@ def _parse_structured_summary(
return out
def _extract_absorbed_into_declarations(
tool_calls: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
"""Walk this run's tool calls and extract model-declared absorption targets.
The curator prompt requires every ``skill_manage(action='delete')`` call
to pass ``absorbed_into=<umbrella>`` when consolidating, or
``absorbed_into=""`` when truly pruning. This is the single authoritative
signal for classification the model's own declaration at the moment of
deletion, which beats both post-hoc YAML summary parsing and substring
heuristics on other tool calls.
Returns ``{skill_name: {"into": "<umbrella>" | "", "declared": True}}``.
Entries with ``into == ""`` are explicit prunings.
Skills without a ``skill_manage(delete)`` call, or with one that omitted
``absorbed_into``, are not in the returned dict caller falls back to
the existing heuristic/YAML logic for those (backward compat with older
curator runs and any callers that don't populate the arg).
"""
out: Dict[str, Dict[str, Any]] = {}
for tc in tool_calls or []:
if not isinstance(tc, dict):
continue
if tc.get("name") != "skill_manage":
continue
raw = tc.get("arguments") or ""
args: Dict[str, Any] = {}
if isinstance(raw, dict):
args = raw
elif isinstance(raw, str):
try:
args = json.loads(raw)
except Exception:
continue
if not isinstance(args, dict):
continue
if args.get("action") != "delete":
continue
name = args.get("name")
if not isinstance(name, str) or not name.strip():
continue
# absorbed_into must be present (even empty string is meaningful);
# missing key means the model didn't declare intent.
if "absorbed_into" not in args:
continue
target = args.get("absorbed_into")
if target is None:
continue
if not isinstance(target, str):
continue
out[name.strip()] = {"into": target.strip(), "declared": True}
return out
def _reconcile_classification(
removed: List[str],
heuristic: Dict[str, List[Dict[str, Any]]],
model_block: Dict[str, List[Dict[str, str]]],
destinations: Set[str],
absorbed_declarations: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, List[Dict[str, Any]]]:
"""Merge heuristic (tool-call evidence) with the model's structured block.
Rules:
Rules (evaluated in order; first match wins):
- **Model-declared `absorbed_into` at delete time is authoritative.** Any
entry in ``absorbed_declarations`` beats every other signal. This is
the model telling us directly, at the moment of deletion, what it did.
``into != ""`` and target exists consolidated. ``into == ""``
pruned. ``into != ""`` but target doesn't exist → hallucination; fall
through to the usual signals.
- Model-declared consolidation wins when its ``into`` target exists
in ``destinations`` (survived or newly-created). This gives the
model authority over intent + rationale.
@@ -616,6 +781,8 @@ def _reconcile_classification(
model_cons = {e["from"]: e for e in model_block.get("consolidations", [])}
model_pruned = {e["name"]: e for e in model_block.get("prunings", [])}
declared = absorbed_declarations or {}
consolidated: List[Dict[str, Any]] = []
pruned: List[Dict[str, Any]] = []
@@ -623,6 +790,36 @@ def _reconcile_classification(
mc = model_cons.get(name)
mp = model_pruned.get(name)
hc = heur_cons.get(name)
dec = declared.get(name)
# Authoritative: model declared `absorbed_into` at the delete call.
if dec is not None:
into_claim = dec.get("into", "")
if into_claim and into_claim in destinations:
entry: Dict[str, Any] = {
"name": name,
"into": into_claim,
"source": "absorbed_into (model-declared at delete)",
"reason": (mc.get("reason") or "") if mc else "",
}
if hc and hc.get("evidence"):
entry["evidence"] = hc["evidence"]
consolidated.append(entry)
continue
if into_claim == "":
# Explicit prune declaration
pruned.append({
"name": name,
"source": "absorbed_into=\"\" (model-declared prune)",
"reason": (mp.get("reason") or "") if mp else "",
})
continue
# into_claim is non-empty but target doesn't exist: the model
# named a nonexistent umbrella at delete time. The tool already
# rejects this at the skill_manage layer, so we shouldn't see it
# in practice — but if it slips through (e.g. the umbrella was
# deleted LATER in the same run), fall through to the usual
# signals rather than trusting a broken reference.
# Model says consolidated — trust it if the destination is real.
if mc and mc.get("into") in destinations:
@@ -758,11 +955,20 @@ def _write_run_report(
)
model_block = _parse_structured_summary(llm_meta.get("final", "") or "")
destinations = set(after_names) | set(added or [])
# Authoritative signal: extract per-delete `absorbed_into` declarations
# from this run's tool calls. These beat both the YAML summary block and
# the substring heuristic — the model is telling us directly, at the
# moment of deletion, whether each archived skill was consolidated
# (into=<umbrella>) or pruned (into="").
absorbed_declarations = _extract_absorbed_into_declarations(
llm_meta.get("tool_calls", []) or []
)
classification = _reconcile_classification(
removed=removed,
heuristic=heuristic,
model_block=model_block,
destinations=destinations,
absorbed_declarations=absorbed_declarations,
)
consolidated = classification["consolidated"]
pruned = classification["pruned"]
@@ -1072,6 +1278,7 @@ def _render_candidate_list() -> str:
def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
dry_run: bool = False,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
@@ -1084,9 +1291,43 @@ def run_curator_review(
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
and the LLM review pass is instructed to produce a report only no
skill_manage mutations, no terminal archive moves. The REPORT.md still
gets written and ``state.last_report_path`` still records it so users
can read what the curator WOULD have done.
"""
start = datetime.now(timezone.utc)
counts = apply_automatic_transitions(now=start)
if dry_run:
# Count candidates without mutating state.
try:
report = skill_usage.agent_created_report()
counts = {
"checked": len(report),
"marked_stale": 0,
"archived": 0,
"reactivated": 0,
}
except Exception:
counts = {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
else:
# Pre-mutation snapshot — best-effort, never blocks the run. A
# failed snapshot logs at debug and continues (the alternative is
# that a transient disk issue silently disables curator forever,
# which is worse). Users who want to require snapshots can disable
# curator entirely until they can fix disk space.
try:
from agent import curator_backup
snap = curator_backup.snapshot_skills(reason="pre-curator-run")
if snap is not None and on_summary:
try:
on_summary(f"curator: snapshot created ({snap.name})")
except Exception:
pass
except Exception as e:
logger.debug("Curator pre-run snapshot failed: %s", e, exc_info=True)
counts = apply_automatic_transitions(now=start)
auto_summary_parts = []
if counts["marked_stale"]:
@@ -1098,11 +1339,16 @@ def run_curator_review(
auto_summary = ", ".join(auto_summary_parts) if auto_summary_parts else "no changes"
# Persist state before the LLM pass so a crash mid-review still records
# the run and doesn't immediately re-trigger.
# the run and doesn't immediately re-trigger. In dry-run we do NOT bump
# last_run_at or run_count — a preview shouldn't push the next scheduled
# real pass out. We still record a summary so `hermes curator status`
# shows that a preview ran.
state = load_state()
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
state["last_run_summary"] = f"auto: {auto_summary}"
if not dry_run:
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
prefix = "dry-run auto: " if dry_run else "auto: "
state["last_run_summary"] = f"{prefix}{auto_summary}"
save_state(state)
def _llm_pass():
@@ -1118,7 +1364,7 @@ def run_curator_review(
try:
candidate_list = _render_candidate_list()
if "No agent-created skills" in candidate_list:
final_summary = f"auto: {auto_summary}; llm: skipped (no candidates)"
final_summary = f"{prefix}{auto_summary}; llm: skipped (no candidates)"
llm_meta = {
"final": "",
"summary": "skipped (no candidates)",
@@ -1128,14 +1374,21 @@ def run_curator_review(
"error": None,
}
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
if dry_run:
prompt = (
f"{CURATOR_DRY_RUN_BANNER}\n\n"
f"{CURATOR_REVIEW_PROMPT}\n\n"
f"{candidate_list}"
)
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
llm_meta = _run_llm_review(prompt)
final_summary = (
f"auto: {auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
f"{prefix}{auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
)
except Exception as e:
logger.debug("Curator LLM pass failed: %s", e, exc_info=True)
final_summary = f"auto: {auto_summary}; llm: error ({e})"
final_summary = f"{prefix}{auto_summary}; llm: error ({e})"
llm_meta = {
"final": "",
"summary": f"error ({e})",
@@ -1194,6 +1447,52 @@ def run_curator_review(
}
def _resolve_review_runtime(cfg: Dict[str, Any]) -> _ReviewRuntimeBinding:
"""Resolve provider/model and per-slot credentials for the curator review fork.
Same precedence as `_resolve_review_model()`. Non-empty ``api_key`` /
``base_url`` from the active slot are returned as explicit overrides so
``resolve_runtime_provider`` does not silently reuse the main chat
credential chain for a routed auxiliary model.
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _ReviewRuntimeBinding(
_task_provider,
_task_model,
_strip_aux_credential(_cur_task.get("api_key")),
_strip_aux_credential(_cur_task.get("base_url")),
)
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _ReviewRuntimeBinding(
str(_legacy_provider),
str(_legacy_model),
_strip_aux_credential(_legacy.get("api_key")),
_strip_aux_credential(_legacy.get("base_url")),
)
# 3. Fall through to the main chat model
return _ReviewRuntimeBinding(_main_provider, _main_model, None, None)
def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
"""Pick (provider, model) for the curator review fork.
@@ -1209,32 +1508,8 @@ def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
2. Legacy ``curator.auxiliary.{provider,model}`` when both are set
3. Main ``model.{provider,default/model}`` pair
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _task_provider, _task_model
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _legacy_provider, _legacy_model
# 3. Fall through to the main chat model
return _main_provider, _main_model
b = _resolve_review_runtime(cfg)
return b.provider, b.model
def _run_llm_review(prompt: str) -> Dict[str, Any]:
@@ -1273,10 +1548,10 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
# arguments hits an auto-resolution path that fails for OAuth-only
# providers and for pool-backed credentials.
#
# `_resolve_review_model()` honors `auxiliary.curator.{provider,model}`
# `_resolve_review_runtime()` honors `auxiliary.curator.{provider,model,...}`
# (canonical aux-task slot, wired through `hermes model` → auxiliary
# picker and the dashboard Models tab), with a legacy fallback to
# `curator.auxiliary.{provider,model}`. See docs/user-guide/features/curator.md.
# `curator.auxiliary.{provider,model,...}`. See docs/user-guide/features/curator.md.
_api_key = None
_base_url = None
_api_mode = None
@@ -1286,9 +1561,13 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
from hermes_cli.config import load_config
from hermes_cli.runtime_provider import resolve_runtime_provider
_cfg = load_config()
_provider, _model_name = _resolve_review_model(_cfg)
_binding = _resolve_review_runtime(_cfg)
_provider, _model_name = _binding.provider, _binding.model
_rp = resolve_runtime_provider(
requested=_provider, target_model=_model_name
requested=_provider,
target_model=_model_name,
explicit_api_key=_binding.explicit_api_key,
explicit_base_url=_binding.explicit_base_url,
)
_api_key = _rp.get("api_key")
_base_url = _rp.get("base_url")
+693
View File
@@ -0,0 +1,693 @@
"""Curator snapshot + rollback.
A pre-run snapshot of ``~/.hermes/skills/`` (excluding ``.curator_backups/``
itself) is taken before any mutating curator pass. Snapshots are tar.gz
files under ``~/.hermes/skills/.curator_backups/<utc-iso>/`` with a
companion ``manifest.json`` describing the snapshot (reason, time, size,
counted skill files). Rollback picks a snapshot, moves the current
``skills/`` tree aside into another snapshot so even the rollback itself
is undoable, then extracts the chosen snapshot into place.
The snapshot does NOT include:
- ``.curator_backups/`` (would recurse)
- ``.hub/`` (hub-installed skills managed by the hub, not us)
It DOES include:
- all SKILL.md files + their directories (``scripts/``, ``references/``,
``templates/``, ``assets/``)
- ``.usage.json`` (usage telemetry needed to rehydrate state cleanly)
- ``.archive/`` (so rollback restores previously-archived skills too)
- ``.curator_state`` (so rolling back also restores the last-run-at
pointer otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
Alongside the skills tarball, each snapshot also captures a copy of
``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
jobs reference skills by name in their ``skills``/``skill`` fields; the
curator's consolidation pass rewrites those in place via
``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
rolling back the skills tree would leave cron jobs pointing at the
umbrella skills even though the narrow skills they were originally
configured with have been restored. We store the whole jobs.json for
fidelity but rollback only touches the ``skills``/``skill`` fields the
rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
we leave it alone.
"""
from __future__ import annotations
import json
import logging
import os
import re
import shutil
import tarfile
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
DEFAULT_KEEP = 5
# Entries under skills/ that should NEVER be rolled up into a snapshot.
# .hub/ is managed by the skills hub; rolling it back would break lockfile
# invariants. .curator_backups is the backup dir itself — recursion bomb.
_EXCLUDE_TOP_LEVEL = {".curator_backups", ".hub"}
# Snapshot id regex: UTC ISO with colons replaced by dashes so the filename
# is portable (Windows-safe). An optional ``-NN`` suffix handles two
# snapshots landing in the same wallclock second.
_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")
def _backups_dir() -> Path:
return get_hermes_home() / "skills" / ".curator_backups"
def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _cron_jobs_file() -> Path:
"""Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
return get_hermes_home() / "cron" / "jobs.json"
CRON_JOBS_FILENAME = "cron-jobs.json"
def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
"""Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
Returns a small dict describing what was captured so the caller can
fold it into the manifest. Never raises if the cron file is missing
or unreadable, the return dict has ``backed_up=False`` and the reason,
and the snapshot proceeds without cron data (the snapshot is still
useful for rolling back skills).
"""
src = _cron_jobs_file()
info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
if not src.exists():
info["reason"] = "no cron/jobs.json present"
return info
try:
raw = src.read_text(encoding="utf-8")
except OSError as e:
logger.debug("Failed to read cron/jobs.json for backup: %s", e)
info["reason"] = f"read error: {e}"
return info
# Count jobs as a nice diagnostic — but don't fail the snapshot if the
# file is unparseable; just store the raw text and let rollback deal
# with it (or not, if it's corrupted). jobs.json wraps the list as
# `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
# fall back to bare-list shape just in case the format ever changes.
try:
parsed = json.loads(raw)
if isinstance(parsed, dict):
inner = parsed.get("jobs")
if isinstance(inner, list):
info["jobs_count"] = len(inner)
elif isinstance(parsed, list):
info["jobs_count"] = len(parsed)
except (json.JSONDecodeError, TypeError):
info["jobs_count"] = 0
info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
try:
(dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
except OSError as e:
logger.debug("Failed to write cron backup file: %s", e)
info["reason"] = f"write error: {e}"
return info
info["backed_up"] = True
return info
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
now = datetime.now(timezone.utc)
# isoformat → "2026-05-01T13:05:42.123456+00:00"; strip subseconds and tz.
s = now.replace(microsecond=0).isoformat()
if s.endswith("+00:00"):
s = s[:-6]
return s.replace(":", "-") + "Z"
def _load_config() -> Dict[str, Any]:
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e:
logger.debug("Failed to load config for curator backup: %s", e)
return {}
if not isinstance(cfg, dict):
return {}
cur = cfg.get("curator") or {}
if not isinstance(cur, dict):
return {}
bk = cur.get("backup") or {}
return bk if isinstance(bk, dict) else {}
def is_enabled() -> bool:
"""Default ON — the whole point of the backup is safety by default."""
return bool(_load_config().get("enabled", True))
def get_keep() -> int:
cfg = _load_config()
try:
n = int(cfg.get("keep", DEFAULT_KEEP))
except (TypeError, ValueError):
n = DEFAULT_KEEP
return max(1, n)
# ---------------------------------------------------------------------------
# Snapshot
# ---------------------------------------------------------------------------
def _count_skill_files(base: Path) -> int:
try:
return sum(1 for _ in base.rglob("SKILL.md"))
except OSError:
return 0
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int,
cron_info: Optional[Dict[str, Any]] = None) -> None:
manifest = {
"id": dest.name,
"reason": reason,
"created_at": datetime.now(timezone.utc).isoformat(),
"archive": archive_path.name,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
if cron_info is not None:
manifest["cron_jobs"] = {
"backed_up": bool(cron_info.get("backed_up", False)),
"jobs_count": int(cron_info.get("jobs_count", 0)),
}
if not cron_info.get("backed_up"):
manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
if cron_info.get("parse_warning"):
manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
def snapshot_skills(reason: str = "manual") -> Optional[Path]:
"""Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
Returns the snapshot directory path, or ``None`` if the snapshot was
skipped (backup disabled, skills dir missing, or an IO error occurred
in which case we log at debug and return None so the curator never
aborts a pass because of a backup failure).
"""
if not is_enabled():
logger.debug("Curator backup disabled by config; skipping snapshot")
return None
skills = _skills_dir()
if not skills.exists():
logger.debug("No ~/.hermes/skills/ directory — nothing to back up")
return None
backups = _backups_dir()
try:
backups.mkdir(parents=True, exist_ok=True)
except OSError as e:
logger.debug("Failed to create backups dir %s: %s", backups, e)
return None
# Uniquify: if a snapshot with the same second already exists (can
# happen if two curator runs fire in the same second), append a short
# counter. Avoids clobbering and avoids timestamp collisions.
base_id = _utc_id()
snap_id = base_id
counter = 1
while (backups / snap_id).exists():
snap_id = f"{base_id}-{counter:02d}"
counter += 1
dest = backups / snap_id
try:
dest.mkdir(parents=True, exist_ok=False)
except OSError as e:
logger.debug("Failed to create snapshot dir %s: %s", dest, e)
return None
archive = dest / "skills.tar.gz"
try:
# Stream into the tarball — no tempdir copy needed.
with tarfile.open(archive, "w:gz", compresslevel=6) as tf:
for entry in sorted(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
# Capture cron/jobs.json alongside the tarball. Never fails the
# snapshot — the skills side is the core guarantee; cron is
# additive. We still record in the manifest whether it was
# captured so rollback can surface "no cron data in this snapshot".
cron_info = _backup_cron_jobs_into(dest)
_write_manifest(dest, reason, archive,
_count_skill_files(skills),
cron_info=cron_info)
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
try:
shutil.rmtree(dest, ignore_errors=True)
except OSError:
pass
return None
_prune_old(keep=get_keep())
logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
return dest
def _prune_old(keep: int) -> List[str]:
"""Delete regular snapshots beyond the newest *keep*. Returns deleted
ids. Staging dirs (``.rollback-staging-*``) are implementation detail
and pruned independently on every call."""
backups = _backups_dir()
if not backups.exists():
return []
entries: List[Tuple[str, Path]] = []
stale_staging: List[Path] = []
for child in backups.iterdir():
if not child.is_dir():
continue
if child.name.startswith(".rollback-staging-"):
# Staging dirs are only supposed to exist briefly during a
# rollback. If we find one here (e.g. from a crashed rollback),
# clean it up opportunistically.
stale_staging.append(child)
continue
if _ID_RE.match(child.name):
entries.append((child.name, child))
# Newest first (lexicographic works because the id is UTC ISO).
entries.sort(key=lambda t: t[0], reverse=True)
deleted: List[str] = []
for _, path in entries[keep:]:
try:
shutil.rmtree(path)
deleted.append(path.name)
except OSError as e:
logger.debug("Failed to prune %s: %s", path, e)
for path in stale_staging:
try:
shutil.rmtree(path)
except OSError as e:
logger.debug("Failed to clean stale staging dir %s: %s", path, e)
return deleted
# ---------------------------------------------------------------------------
# List + rollback
# ---------------------------------------------------------------------------
def _read_manifest(snap_dir: Path) -> Dict[str, Any]:
mf = snap_dir / "manifest.json"
if not mf.exists():
return {}
try:
return json.loads(mf.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return {}
def list_backups() -> List[Dict[str, Any]]:
"""Return all restorable snapshots, newest first. Only entries with a
real ``skills.tar.gz`` tarball are listed transient
``.rollback-staging-*`` directories created mid-rollback are
implementation detail and not shown."""
backups = _backups_dir()
if not backups.exists():
return []
out: List[Dict[str, Any]] = []
for child in sorted(backups.iterdir(), reverse=True):
if not child.is_dir():
continue
if not _ID_RE.match(child.name):
continue
if not (child / "skills.tar.gz").exists():
continue
mf = _read_manifest(child)
mf.setdefault("id", child.name)
mf.setdefault("path", str(child))
if "archive_bytes" not in mf:
arc = child / "skills.tar.gz"
try:
mf["archive_bytes"] = arc.stat().st_size
except OSError:
mf["archive_bytes"] = 0
out.append(mf)
return out
def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
"""Return the path of the requested backup, or the newest one if
*backup_id* is None. Returns None if no match."""
backups = _backups_dir()
if not backups.exists():
return None
if backup_id:
target = backups / backup_id
if (
target.is_dir()
and _ID_RE.match(backup_id)
and (target / "skills.tar.gz").exists()
):
return target
return None
candidates = [
c for c in sorted(backups.iterdir(), reverse=True)
if c.is_dir() and _ID_RE.match(c.name) and (c / "skills.tar.gz").exists()
]
return candidates[0] if candidates else None
def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
"""Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
We do NOT overwrite the whole cron file. Only the ``skills`` and
``skill`` fields are restored, and only on jobs that still exist in the
current file (matched by ``id``). Everything else about the job
schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks
is live state that the user/scheduler has modified since the snapshot;
overwriting it would regress unrelated cron activity.
Rules:
- Jobs present in backup AND live, with differing skills skills restored.
- Jobs present in backup AND live, with matching skills no-op.
- Jobs present in backup but gone from live (user deleted the job
after the snapshot) skipped, noted in the return report.
- Jobs present in live but not in backup (user created a new cron
job after the snapshot) left untouched.
Never raises; failures are captured in the return dict. Writes through
``cron.jobs`` to pick up the same lock + atomic-write path that tick()
uses, so we don't race the scheduler.
"""
report: Dict[str, Any] = {
"attempted": False,
"restored": [],
"skipped_missing": [],
"unchanged": 0,
"error": None,
}
backup_file = snapshot_dir / CRON_JOBS_FILENAME
if not backup_file.exists():
report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
return report
try:
backup_text = backup_file.read_text(encoding="utf-8")
backup_parsed = json.loads(backup_text)
except (OSError, json.JSONDecodeError) as e:
report["error"] = f"failed to load backed-up jobs: {e}"
return report
# jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
# that shape and a bare list for forward compat.
if isinstance(backup_parsed, dict):
backup_jobs = backup_parsed.get("jobs")
elif isinstance(backup_parsed, list):
backup_jobs = backup_parsed
else:
backup_jobs = None
if not isinstance(backup_jobs, list):
report["error"] = "backed-up cron-jobs.json has no jobs list"
return report
# Build a lookup of the backed-up skill state keyed by job id.
# We only need the two skill-ish fields (legacy single and modern list).
backup_by_id: Dict[str, Dict[str, Any]] = {}
for job in backup_jobs:
if not isinstance(job, dict):
continue
jid = job.get("id")
if not isinstance(jid, str) or not jid:
continue
backup_by_id[jid] = {
"skills": job.get("skills"),
"skill": job.get("skill"),
"name": job.get("name") or jid,
}
if not backup_by_id:
report["attempted"] = True # we tried but there was nothing to do
return report
# Load and rewrite the live jobs under the scheduler's lock.
try:
from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
except ImportError as e:
report["error"] = f"cron module unavailable: {e}"
return report
report["attempted"] = True
try:
with _jobs_file_lock:
live_jobs = load_jobs()
changed = False
live_ids = set()
for live in live_jobs:
if not isinstance(live, dict):
continue
jid = live.get("id")
if not isinstance(jid, str) or not jid:
continue
live_ids.add(jid)
backup = backup_by_id.get(jid)
if backup is None:
continue # live job didn't exist at snapshot time
cur_skills = live.get("skills")
cur_skill = live.get("skill")
bkp_skills = backup.get("skills")
bkp_skill = backup.get("skill")
if cur_skills == bkp_skills and cur_skill == bkp_skill:
report["unchanged"] += 1
continue
# Restore. Preserve absence (don't force the key to appear
# if the backup didn't have it either).
if bkp_skills is None:
live.pop("skills", None)
else:
live["skills"] = bkp_skills
if bkp_skill is None:
live.pop("skill", None)
else:
live["skill"] = bkp_skill
report["restored"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
"from": {"skills": cur_skills, "skill": cur_skill},
"to": {"skills": bkp_skills, "skill": bkp_skill},
})
changed = True
# Jobs in backup but not in live = user deleted them after snapshot
for jid, backup in backup_by_id.items():
if jid not in live_ids:
report["skipped_missing"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
})
if changed:
save_jobs(live_jobs)
except Exception as e: # noqa: BLE001 — rollback must not die mid-restore
logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
report["error"] = f"restore failed mid-flight: {e}"
return report
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
Strategy:
1. Resolve the target snapshot (explicit id or newest regular).
2. Take a safety snapshot of the CURRENT skills tree under
``.curator_backups/pre-rollback-<ts>/`` so the rollback itself is
undoable.
3. Move all current top-level entries (except ``.curator_backups``
and ``.hub``) into a tempdir.
4. Extract the chosen snapshot into ``~/.hermes/skills/``.
5. On failure during 4, move the tempdir contents back (best-effort)
and return failure.
Returns ``(ok, message, snapshot_path)``.
"""
target = _resolve_backup(backup_id)
if target is None:
return (
False,
f"no matching backup found"
+ (f" for id '{backup_id}'" if backup_id else "")
+ " (use `hermes curator rollback --list` to see available snapshots)",
None,
)
archive = target / "skills.tar.gz"
if not archive.exists():
return (False, f"snapshot {target.name} has no skills.tar.gz — corrupted?", None)
skills = _skills_dir()
skills.mkdir(parents=True, exist_ok=True)
backups = _backups_dir()
backups.mkdir(parents=True, exist_ok=True)
# Step 2: safety snapshot of current state FIRST. If this fails we bail
# out before touching anything — otherwise a failed extract could leave
# the user with no skills.
try:
snapshot_skills(reason=f"pre-rollback to {target.name}")
except Exception as e:
return (False, f"pre-rollback safety snapshot failed: {e}", None)
# Additionally move current entries into an internal staging dir so
# the extract happens into an empty skills tree (predictable result).
# This dir is implementation detail — not listed as a restorable
# backup. The safety snapshot above is the user-facing undo handle.
staged = backups / f".rollback-staging-{_utc_id()}"
try:
staged.mkdir(parents=True, exist_ok=False)
except OSError as e:
return (False, f"failed to create staging dir: {e}", None)
moved: List[Tuple[Path, Path]] = []
try:
for entry in list(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
dest = staged / entry.name
shutil.move(str(entry), str(dest))
moved.append((entry, dest))
except OSError as e:
# Best-effort rollback of the move
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"failed to stage current skills: {e}", None)
# Step 4: extract the snapshot into skills/
try:
with tarfile.open(archive, "r:gz") as tf:
# Python 3.12+ supports filter='data' for safer extraction.
# Fall back to the unfiltered call for older interpreters but
# still reject absolute paths and .. components defensively.
for member in tf.getmembers():
name = member.name
if name.startswith("/") or ".." in Path(name).parts:
raise tarfile.TarError(
f"refusing to extract unsafe path: {name!r}"
)
try:
tf.extractall(str(skills), filter="data") # type: ignore[call-arg]
except TypeError:
# Python < 3.12 — no filter kwarg
tf.extractall(str(skills))
except (OSError, tarfile.TarError) as e:
# Best-effort recover: move staged contents back
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"snapshot extract failed (state restored): {e}", None)
# Extract succeeded — the staging dir has served its purpose. The
# user's undo handle is the safety snapshot tarball we took earlier.
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
# Reconcile cron skill-links. Surgical: only the skills/skill fields
# on jobs matched by id. Everything else in jobs.json is live state
# (schedule, next_run_at, enabled, prompt, etc.) and we leave it
# alone. Failures here don't fail the overall rollback — the skills
# tree is already restored, which is the main guarantee.
cron_report = _restore_cron_skill_links(target)
summary_bits = [f"restored from snapshot {target.name}"]
if cron_report.get("attempted"):
restored_n = len(cron_report.get("restored") or [])
skipped_n = len(cron_report.get("skipped_missing") or [])
if cron_report.get("error"):
summary_bits.append(f"cron links: error — {cron_report['error']}")
elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
# Attempted but nothing matched — empty snapshot or no overlapping ids.
pass
else:
parts = []
if restored_n:
parts.append(f"{restored_n} job(s) had skill links restored")
if skipped_n:
parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
if cron_report.get("unchanged"):
parts.append(f"{cron_report['unchanged']} already matched")
summary_bits.append("cron links: " + ", ".join(parts))
logger.info("Curator rollback: restored from %s (cron_report=%s)",
target.name, cron_report)
return (True, "; ".join(summary_bits), target)
# ---------------------------------------------------------------------------
# Human-readable summary for CLI
# ---------------------------------------------------------------------------
def format_size(n: int) -> str:
for unit in ("B", "KB", "MB", "GB"):
if n < 1024 or unit == "GB":
return f"{n:.1f} {unit}" if unit != "B" else f"{n} B"
n /= 1024
return f"{n:.1f} GB"
def summarize_backups() -> str:
rows = list_backups()
if not rows:
return "No curator snapshots yet."
lines = [f"{'id':<24} {'reason':<40} {'skills':>6} {'size':>8}"]
lines.append("" * len(lines[0]))
for r in rows:
lines.append(
f"{r.get('id','?'):<24} "
f"{(r.get('reason','?') or '?')[:40]:<40} "
f"{r.get('skill_files', 0):>6} "
f"{format_size(int(r.get('archive_bytes', 0))):>8}"
)
return "\n".join(lines)
+38 -2
View File
@@ -55,6 +55,7 @@ class FailoverReason(enum.Enum):
thinking_signature = "thinking_signature" # Anthropic thinking block sig invalid
long_context_tier = "long_context_tier" # Anthropic "extra usage" tier gate
oauth_long_context_beta_forbidden = "oauth_long_context_beta_forbidden" # Anthropic OAuth subscription rejects 1M context beta — disable beta and retry
llama_cpp_grammar_pattern = "llama_cpp_grammar_pattern" # llama.cpp json-schema-to-grammar rejects regex escapes in `pattern` / `format` — strip from tools and retry
# Catch-all
unknown = "unknown" # Unclassifiable — retry with backoff
@@ -470,6 +471,31 @@ def classify_api_error(
should_compress=False,
)
# llama.cpp's ``json-schema-to-grammar`` converter (used by its OAI
# server to build GBNF tool-call parsers) rejects regex escape classes
# like ``\d``/``\w``/``\s`` and most ``format`` values. MCP servers
# routinely emit ``"pattern": "\\d{4}-\\d{2}-\\d{2}"`` for date/phone/
# email params. llama.cpp surfaces this as HTTP 400 with one of a few
# recognizable phrases; on match we strip ``pattern``/``format`` from
# ``self.tools`` in the retry loop and retry once. Cloud providers are
# unaffected — they accept these keywords and we never hit this branch.
if (
status_code == 400
and (
"error parsing grammar" in error_msg
or "json-schema-to-grammar" in error_msg
or (
"unable to generate parser" in error_msg
and "template" in error_msg
)
)
):
return _result(
FailoverReason.llama_cpp_grammar_pattern,
retryable=True,
should_compress=False,
)
# ── 2. HTTP status code classification ──────────────────────────
if status_code is not None:
@@ -520,7 +546,12 @@ def classify_api_error(
is_disconnect = any(p in error_msg for p in _SERVER_DISCONNECT_PATTERNS)
if is_disconnect and not status_code:
is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have hundreds of
# messages while still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.6 or (
context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
)
if is_large:
return _result(
FailoverReason.context_overflow,
@@ -766,7 +797,12 @@ def _classify_400(
if not err_body_msg:
err_body_msg = str(body.get("message") or "").strip().lower()
is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have many messages while
# still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.4 or (
context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
)
if is_generic and is_large:
return result_fn(
+15 -1
View File
@@ -679,7 +679,21 @@ def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices:
finish_reason_raw = str(cand.get("finishReason") or "")
if finish_reason_raw:
mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
finish_chunk = _make_stream_chunk(model=model, finish_reason=mapped)
# Attach usage from this event's usageMetadata so the streaming
# loop in run_agent.py can record token counts (mirrors the
# non-streaming path in translate_gemini_response).
usage_meta = event.get("usageMetadata") or {}
if usage_meta:
finish_chunk.usage = SimpleNamespace(
prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
total_tokens=int(usage_meta.get("totalTokenCount") or 0),
prompt_tokens_details=SimpleNamespace(
cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
),
)
chunks.append(finish_chunk)
return chunks
+15 -2
View File
@@ -489,16 +489,29 @@ def save_credentials(creds: GoogleCredentials) -> Path:
"""Atomically write creds to disk with 0o600 permissions."""
path = _credentials_path()
path.parent.mkdir(parents=True, exist_ok=True)
# Tighten parent dir to 0o700 so siblings can't traverse to the creds file.
# On Windows this is a no-op (POSIX mode bits aren't enforced); ignore failures.
try:
os.chmod(path.parent, 0o700)
except OSError:
pass
payload = json.dumps(creds.to_dict(), indent=2, sort_keys=True) + "\n"
with _credentials_lock():
tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
try:
with open(tmp_path, "w", encoding="utf-8") as fh:
# Create with 0o600 atomically to close the TOCTOU window where the
# default umask (often 0o644) would briefly expose tokens to other
# local users between open() and chmod().
fd = os.open(
str(tmp_path),
os.O_WRONLY | os.O_CREAT | os.O_EXCL,
stat.S_IRUSR | stat.S_IWUSR,
)
with os.fdopen(fd, "w", encoding="utf-8") as fh:
fh.write(payload)
fh.flush()
os.fsync(fh.fileno())
os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
atomic_replace(tmp_path, path)
finally:
try:
+233
View File
@@ -0,0 +1,233 @@
"""Lightweight internationalization (i18n) for Hermes static user-facing messages.
Scope (thin slice, by design): only the highest-impact static strings shown
to the user by Hermes itself -- approval prompts, a handful of gateway slash
command replies, restart-drain notices. Agent-generated output, log lines,
error tracebacks, tool outputs, and slash-command descriptions all stay in
English.
Catalog files live under ``locales/<lang>.yaml`` at the repo root. Each
catalog is a flat dict keyed by dotted paths (e.g. ``approval.choose`` or
``gateway.approval_expired``). Missing keys fall back to English; if English
is missing too, the key path itself is returned so a broken catalog never
crashes the agent.
Usage::
from agent.i18n import t
print(t("approval.choose_long")) # current lang
print(t("gateway.draining", count=3)) # {count} formatted
print(t("approval.choose_long", lang="zh")) # explicit override
Language resolution order:
1. Explicit ``lang=`` argument passed to :func:`t`
2. ``HERMES_LANGUAGE`` environment variable (for tests / quick override)
3. ``display.language`` from config.yaml
4. ``"en"`` (baseline)
Supported languages: en, zh, ja, de, es, fr, tr, uk. Unknown values fall back to en.
"""
from __future__ import annotations
import logging
import os
import threading
from functools import lru_cache
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
SUPPORTED_LANGUAGES: tuple[str, ...] = ("en", "zh", "ja", "de", "es", "fr", "tr", "uk")
DEFAULT_LANGUAGE = "en"
# Accept a few natural aliases so users who type "chinese" / "zh-CN" / "jp"
# get the right catalog instead of silently falling back to English.
_LANGUAGE_ALIASES: dict[str, str] = {
"english": "en", "en-us": "en", "en-gb": "en",
"chinese": "zh", "mandarin": "zh", "zh-cn": "zh", "zh-tw": "zh", "zh-hans": "zh", "zh-hant": "zh",
"japanese": "ja", "jp": "ja", "ja-jp": "ja",
"german": "de", "deutsch": "de", "de-de": "de",
"spanish": "es", "español": "es", "espanol": "es", "es-es": "es", "es-mx": "es",
"french": "fr", "français": "fr", "france": "fr", "fr-fr": "fr", "fr-be": "fr", "fr-ca": "fr", "fr-ch": "fr",
"ukrainian": "uk", "ukrainisch": "uk", "українська": "uk", "uk-ua": "uk", "ua": "uk",
"turkish": "tr", "türkçe": "tr", "tr-tr": "tr",
}
_catalog_cache: dict[str, dict[str, str]] = {}
_catalog_lock = threading.Lock()
def _locales_dir() -> Path:
"""Return the directory containing locale YAML files.
Lives next to the repo root so both the bundled install and editable
checkouts find it without PYTHONPATH gymnastics.
"""
# agent/i18n.py -> agent/ -> repo root
return Path(__file__).resolve().parent.parent / "locales"
def _normalize_lang(value: Any) -> str:
"""Normalize a user-supplied language value to a supported code.
Accepts supported codes directly, common aliases (``chinese`` -> ``zh``),
and case-insensitive regional tags (``zh-CN`` -> ``zh``). Returns the
default language for unknown values.
"""
if not isinstance(value, str):
return DEFAULT_LANGUAGE
key = value.strip().lower()
if not key:
return DEFAULT_LANGUAGE
if key in SUPPORTED_LANGUAGES:
return key
if key in _LANGUAGE_ALIASES:
return _LANGUAGE_ALIASES[key]
# Try stripping a region suffix (e.g. "pt-br" -> "pt" won't be supported,
# but "zh-CN" -> "zh" will).
base = key.split("-", 1)[0]
if base in SUPPORTED_LANGUAGES:
return base
return DEFAULT_LANGUAGE
def _load_catalog(lang: str) -> dict[str, str]:
"""Load and flatten one locale YAML file into a dotted-key dict.
YAML files can be nested for human readability; this produces the flat
key space :func:`t` expects. Cached per-language for the process.
"""
with _catalog_lock:
cached = _catalog_cache.get(lang)
if cached is not None:
return cached
path = _locales_dir() / f"{lang}.yaml"
if not path.is_file():
logger.debug("i18n catalog missing for %s at %s", lang, path)
with _catalog_lock:
_catalog_cache[lang] = {}
return {}
try:
import yaml # PyYAML is already a hermes dependency
with path.open("r", encoding="utf-8") as f:
raw = yaml.safe_load(f) or {}
except Exception as exc:
logger.warning("Failed to load i18n catalog %s: %s", path, exc)
with _catalog_lock:
_catalog_cache[lang] = {}
return {}
flat: dict[str, str] = {}
_flatten_into(raw, "", flat)
with _catalog_lock:
_catalog_cache[lang] = flat
return flat
def _flatten_into(node: Any, prefix: str, out: dict[str, str]) -> None:
if isinstance(node, dict):
for key, value in node.items():
child_key = f"{prefix}.{key}" if prefix else str(key)
_flatten_into(value, child_key, out)
elif isinstance(node, str):
out[prefix] = node
# Non-string, non-dict leaves are ignored -- catalogs are text-only.
@lru_cache(maxsize=1)
def _config_language_cached() -> str | None:
"""Read ``display.language`` from config.yaml once per process.
Cached because ``t()`` is called in hot paths (every approval prompt,
every gateway reply) and re-reading YAML each call would be wasteful.
``reset_language_cache()`` clears this when config changes at runtime
(e.g. after the setup wizard).
"""
try:
from hermes_cli.config import load_config
cfg = load_config()
lang = (cfg.get("display") or {}).get("language")
if lang:
return _normalize_lang(lang)
except Exception as exc:
logger.debug("Could not read display.language from config: %s", exc)
return None
def reset_language_cache() -> None:
"""Invalidate cached language resolution and catalogs.
Call after :func:`hermes_cli.config.save_config` if a running process
needs to pick up a changed ``display.language`` without restart.
"""
_config_language_cached.cache_clear()
with _catalog_lock:
_catalog_cache.clear()
def get_language() -> str:
"""Resolve the active language using env > config > default order."""
env_lang = os.environ.get("HERMES_LANGUAGE")
if env_lang:
return _normalize_lang(env_lang)
cfg_lang = _config_language_cached()
if cfg_lang:
return cfg_lang
return DEFAULT_LANGUAGE
def t(key: str, lang: str | None = None, **format_kwargs: Any) -> str:
"""Translate a dotted key to the active language.
Parameters
----------
key
Dotted path into the catalog, e.g. ``"approval.choose_long"``.
lang
Explicit language override. Takes precedence over env + config.
**format_kwargs
``str.format`` substitution arguments (``t("gateway.drain", count=3)``
expects a catalog entry with a ``{count}`` placeholder).
Returns
-------
The translated string, or the English fallback if the key is missing in
the target language, or the bare key if English is also missing.
"""
target = _normalize_lang(lang) if lang else get_language()
catalog = _load_catalog(target)
value = catalog.get(key)
if value is None and target != DEFAULT_LANGUAGE:
# Fall through to English rather than showing a key path to the user.
value = _load_catalog(DEFAULT_LANGUAGE).get(key)
if value is None:
# Last-ditch: return the key itself. A broken catalog should not
# crash anything; it just looks ugly until someone fixes it.
logger.debug("i18n miss: key=%r lang=%r", key, target)
value = key
if format_kwargs:
try:
return value.format(**format_kwargs)
except (KeyError, IndexError, ValueError) as exc:
logger.warning(
"i18n format failed for key=%r lang=%r kwargs=%r: %s",
key, target, format_kwargs, exc,
)
return value
return value
__all__ = [
"SUPPORTED_LANGUAGES",
"DEFAULT_LANGUAGE",
"t",
"get_language",
"reset_language_cache",
]
+6 -8
View File
@@ -1,17 +1,14 @@
"""MemoryManager — orchestrates the built-in memory provider plus at most
ONE external plugin memory provider.
"""MemoryManager — orchestrates memory providers for the agent.
Single integration point in run_agent.py. Replaces scattered per-backend
code with one manager that delegates to registered providers.
The BuiltinMemoryProvider is always registered first and cannot be removed.
Only ONE external (non-builtin) provider is allowed at a time attempting
to register a second external provider is rejected with a warning. This
Only ONE external plugin provider is allowed at a time attempting to
register a second external provider is rejected with a warning. This
prevents tool schema bloat and conflicting memory backends.
Usage in run_agent.py:
self._memory_manager = MemoryManager()
self._memory_manager.add_provider(BuiltinMemoryProvider(...))
# Only ONE of these:
self._memory_manager.add_provider(plugin_provider)
@@ -49,7 +46,7 @@ _INTERNAL_CONTEXT_RE = re.compile(
re.IGNORECASE,
)
_INTERNAL_NOTE_RE = re.compile(
r'\[System note:\s*The following is recalled memory context,\s*NOT new user input\.\s*Treat as informational background data\.\]\s*',
r'\[System note:\s*The following is recalled memory context,\s*NOT new user input\.\s*Treat as (?:informational background data|authoritative reference data[^\]]*)\.\]\s*',
re.IGNORECASE,
)
@@ -183,7 +180,8 @@ def build_memory_context_block(raw_context: str) -> str:
return (
"<memory-context>\n"
"[System note: The following is recalled memory context, "
"NOT new user input. Treat as informational background data.]\n\n"
"NOT new user input. Treat as authoritative reference data — "
"this is the agent's persistent memory and should inform all responses.]\n\n"
f"{clean}\n"
"</memory-context>"
)
+8 -9
View File
@@ -1,17 +1,16 @@
"""Abstract base class for pluggable memory providers.
Memory providers give the agent persistent recall across sessions. One
external provider is active at a time alongside the always-on built-in
memory (MEMORY.md / USER.md). The MemoryManager enforces this limit.
Memory providers give the agent persistent recall across sessions.
The MemoryManager enforces a one-external-provider limit to prevent
tool schema bloat and conflicting memory backends.
Built-in memory is always active as the first provider and cannot be removed.
External providers (Honcho, Hindsight, Mem0, etc.) are additive they never
disable the built-in store. Only one external provider runs at a time to
prevent tool schema bloat and conflicting memory backends.
External providers (Honcho, Hindsight, Mem0, etc.) are registered
and managed via MemoryManager. Only one external provider runs at a
time.
Registration:
1. Built-in: BuiltinMemoryProvider always present, not removable.
2. Plugins: Ship in plugins/memory/<name>/, activated by memory.provider config.
Plugins ship in plugins/memory/<name>/ and are activated via
the memory.provider config key.
Lifecycle (called by MemoryManager, wired in run_agent.py):
initialize() connect, create resources, warm up
+11
View File
@@ -318,6 +318,17 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"ollama.com": "ollama-cloud",
}
# Auto-extend with hostnames derived from provider profiles.
# Any provider with a base_url not already in the map gets added automatically.
try:
from providers import list_providers as _list_providers
for _pp in _list_providers():
_host = _pp.get_hostname()
if _host and _host not in _URL_TO_PROVIDER:
_URL_TO_PROVIDER[_host] = _pp.name
except Exception:
pass
def _infer_provider_from_url(base_url: str) -> Optional[str]:
"""Infer the models.dev provider name from a base URL.
+8 -2
View File
@@ -183,8 +183,8 @@ SKILLS_GUIDANCE = (
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"# Kanban task execution protocol\n"
"You have been assigned ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
@@ -513,6 +513,12 @@ PLATFORM_HINTS = {
"image and is the WRONG path. Bare Unicode emoji in text is also not a substitute "
"— when a sticker is the right response, use yb_send_sticker."
),
"api_server": (
"You're responding through an API server. The rendering layer is unknown — "
"assume plain text. No markdown formatting (no asterisks, bullets, headers, "
"code fences). Treat this like a conversation, not a document. Keep responses "
"brief and natural."
),
}
# ---------------------------------------------------------------------------
+17 -11
View File
@@ -305,13 +305,18 @@ def _redact_form_body(text: str) -> str:
return _redact_query_string(text.strip())
def redact_sensitive_text(text: str, *, force: bool = False) -> str:
def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
Disabled by default enable via security.redact_secrets: true in config.yaml.
Set force=True for safety boundaries that must never return raw secrets
regardless of the user's global logging redaction preference.
Set code_file=True to skip the ENV-assignment and JSON-field regex
patterns when the text is known to be source code (e.g. MAX_TOKENS=***
constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
private keys, DB connstrings, JWTs, and URL secrets are still redacted.
"""
if text is None:
return None
@@ -325,17 +330,18 @@ def redact_sensitive_text(text: str, *, force: bool = False) -> str:
# Known prefixes (sk-, ghp_, etc.)
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
# ENV assignments: OPENAI_API_KEY=sk-abc...
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# ENV assignments: OPENAI_API_KEY=*** (skip for code files — false positives)
if not code_file:
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# JSON fields: "apiKey": "value"
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# JSON fields: "apiKey": "***" (skip for code files — false positives)
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# Authorization headers
text = _AUTH_HEADER_RE.sub(
+38 -3
View File
@@ -6,6 +6,7 @@ can invoke skills via /skill-name commands.
import json
import logging
import os
import re
from pathlib import Path
from typing import Any, Dict, Optional
@@ -20,10 +21,35 @@ from agent.skill_preprocessing import (
logger = logging.getLogger(__name__)
_skill_commands: Dict[str, Dict[str, Any]] = {}
_skill_commands_platform: Optional[str] = None
# Patterns for sanitizing skill names into clean hyphen-separated slugs.
_SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
_SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")
def _resolve_skill_commands_platform() -> Optional[str]:
"""Return the current platform scope used for disabled-skill filtering.
Used to detect when the active platform has shifted so
:func:`get_skill_commands` can drop a stale cache that was populated
for a different platform's ``skills.platform_disabled`` view (#14536).
Resolves from (in order) ``HERMES_PLATFORM`` env var and
``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
``None`` when no platform scope is active (e.g. classic CLI, RL
rollouts, standalone scripts).
"""
try:
from gateway.session_context import get_session_env
resolved_platform = (
os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
)
except Exception:
resolved_platform = os.getenv("HERMES_PLATFORM")
return resolved_platform or None
def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
"""Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
raw_identifier = (skill_identifier or "").strip()
@@ -218,7 +244,8 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
Returns:
Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
"""
global _skill_commands
global _skill_commands, _skill_commands_platform
_skill_commands_platform = _resolve_skill_commands_platform()
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -278,8 +305,16 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
def get_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Return the current skill commands mapping (scan first if empty)."""
if not _skill_commands:
"""Return the current skill commands mapping (scan first if empty).
Rescans when the active platform scope changes (e.g. a gateway
process serving Telegram and Discord concurrently) so each platform
sees its own ``skills.platform_disabled`` view (#14536).
"""
if (
not _skill_commands
or _skill_commands_platform != _resolve_skill_commands_platform()
):
scan_skill_commands()
return _skill_commands
+386
View File
@@ -0,0 +1,386 @@
"""Stateful scrubber for reasoning/thinking blocks in streamed assistant text.
``run_agent._strip_think_blocks`` is regex-based and correct for a complete
string, but when it runs *per-delta* in ``_fire_stream_delta`` it destroys
the state that downstream consumers (CLI ``_stream_delta``, gateway
``GatewayStreamConsumer._filter_and_accumulate``) rely on.
Concretely, when MiniMax-M2.7 streams
delta1 = "<think>"
delta2 = "Let me check their config"
delta3 = "</think>"
the per-delta regex erases delta1 entirely (case 2: unterminated-open at
boundary matches ``^<think>...``), so the downstream state machine never
sees the open tag, treats delta2 as regular content, and leaks reasoning
to the user. Consumers that don't run their own state machine (ACP,
api_server, TTS) never had any defence at all they just emitted
whatever survived the upstream regex.
This module centralises the tag-suppression state machine at the
upstream layer so every stream_delta_callback sees text that has
already had reasoning blocks removed. Partial tags at delta
boundaries are held back until the next delta resolves them, and
end-of-stream flushing surfaces any held-back prose that turned out
not to be a real tag.
Usage::
scrubber = StreamingThinkScrubber()
for delta in stream:
visible = scrubber.feed(delta)
if visible:
emit(visible)
tail = scrubber.flush() # at end of stream
if tail:
emit(tail)
The scrubber is re-entrant per agent instance. Call ``reset()`` at
the top of each new turn so a hung block from an interrupted prior
stream cannot taint the next turn's output.
Tag variants handled (case-insensitive):
``<think>``, ``<thinking>``, ``<reasoning>``, ``<thought>``,
``<REASONING_SCRATCHPAD>``.
Block-boundary rule for opens: an opening tag is only treated as a
reasoning-block opener when it appears at the start of the stream,
after a newline (optionally followed by whitespace), or when only
whitespace has been emitted on the current line. This prevents prose
that *mentions* the tag name (e.g. ``"use <think> tags here"``) from
being incorrectly suppressed. Closed pairs (``<think>X</think>``) are
always suppressed regardless of boundary; a closed pair is an
intentional, bounded construct.
"""
from __future__ import annotations
from typing import Tuple
__all__ = ["StreamingThinkScrubber"]
class StreamingThinkScrubber:
"""Stateful scrubber for streaming reasoning/thinking blocks.
State machine:
- ``_in_block``: True while inside an opened block, waiting for
a close tag. All text inside is discarded.
- ``_buf``: held-back partial-tag tail. Emitted / discarded on
the next ``feed()`` call or by ``flush()``.
- ``_last_emitted_ended_newline``: True iff the most recent
emission to the consumer ended with ``\\n``, or nothing has
been emitted yet (start-of-stream counts as a boundary). Used
to decide whether an open tag at buffer position 0 is at a
block boundary.
"""
_OPEN_TAG_NAMES: Tuple[str, ...] = (
"think",
"thinking",
"reasoning",
"thought",
"REASONING_SCRATCHPAD",
)
# Materialise literal tag strings so the hot path does string
# operations, not regex compilation per feed().
_OPEN_TAGS: Tuple[str, ...] = tuple(f"<{name}>" for name in _OPEN_TAG_NAMES)
_CLOSE_TAGS: Tuple[str, ...] = tuple(f"</{name}>" for name in _OPEN_TAG_NAMES)
# Pre-compute the longest tag (for partial-tag hold-back bound).
_MAX_TAG_LEN: int = max(len(tag) for tag in _OPEN_TAGS + _CLOSE_TAGS)
def __init__(self) -> None:
self._in_block: bool = False
self._buf: str = ""
self._last_emitted_ended_newline: bool = True
def reset(self) -> None:
"""Reset all state. Call at the top of every new turn."""
self._in_block = False
self._buf = ""
self._last_emitted_ended_newline = True
def feed(self, text: str) -> str:
"""Feed one delta; return the scrubbed visible portion.
May return an empty string when the entire delta is reasoning
content or is being held back pending resolution of a partial
tag at the boundary.
"""
if not text:
return ""
buf = self._buf + text
self._buf = ""
out: list[str] = []
while buf:
if self._in_block:
# Hunt for the earliest close tag.
close_idx, close_len = self._find_first_tag(
buf, self._CLOSE_TAGS,
)
if close_idx == -1:
# No close yet — hold back a potential partial
# close-tag prefix; discard everything else.
held = self._max_partial_suffix(buf, self._CLOSE_TAGS)
self._buf = buf[-held:] if held else ""
return "".join(out)
# Found close: discard block content + tag, continue.
buf = buf[close_idx + close_len:]
self._in_block = False
else:
# Priority 1 — closed <tag>X</tag> pair anywhere in
# buf. Closed pairs are always an intentional,
# bounded construct (even mid-line prose containing
# an open/close pair is almost certainly a model
# leaking reasoning inline), so no boundary gating.
pair = self._find_earliest_closed_pair(buf)
# Priority 2 — unterminated open tag at a block
# boundary. Boundary-gated so prose that mentions
# '<think>' isn't over-stripped.
open_idx, open_len = self._find_open_at_boundary(
buf, out,
)
# Pick whichever match comes earliest in the buffer.
if pair is not None and (
open_idx == -1 or pair[0] <= open_idx
):
start_idx, end_idx = pair
preceding = buf[:start_idx]
if preceding:
preceding = self._strip_orphan_close_tags(preceding)
if preceding:
out.append(preceding)
self._last_emitted_ended_newline = (
preceding.endswith("\n")
)
buf = buf[end_idx:]
continue
if open_idx != -1:
# Unterminated open at boundary — emit preceding,
# enter block, continue loop with remainder.
preceding = buf[:open_idx]
if preceding:
preceding = self._strip_orphan_close_tags(preceding)
if preceding:
out.append(preceding)
self._last_emitted_ended_newline = (
preceding.endswith("\n")
)
self._in_block = True
buf = buf[open_idx + open_len:]
continue
# No resolvable tag structure in buf. Hold back any
# partial-tag prefix at the tail so a split tag
# across deltas isn't missed, then emit the rest.
held = self._max_partial_suffix(buf, self._OPEN_TAGS)
held_close = self._max_partial_suffix(
buf, self._CLOSE_TAGS,
)
held = max(held, held_close)
if held:
emit_text = buf[:-held]
self._buf = buf[-held:]
else:
emit_text = buf
self._buf = ""
if emit_text:
emit_text = self._strip_orphan_close_tags(emit_text)
if emit_text:
out.append(emit_text)
self._last_emitted_ended_newline = (
emit_text.endswith("\n")
)
return "".join(out)
return "".join(out)
def flush(self) -> str:
"""End-of-stream flush.
If still inside an unterminated block, held-back content is
discarded leaking partial reasoning is worse than a
truncated answer. Otherwise the held-back partial-tag tail is
emitted verbatim (it turned out not to be a real tag prefix).
"""
if self._in_block:
self._buf = ""
self._in_block = False
return ""
tail = self._buf
self._buf = ""
if not tail:
return ""
tail = self._strip_orphan_close_tags(tail)
if tail:
self._last_emitted_ended_newline = tail.endswith("\n")
return tail
# ── internal helpers ───────────────────────────────────────────────
@staticmethod
def _find_first_tag(
buf: str, tags: Tuple[str, ...],
) -> Tuple[int, int]:
"""Return (earliest_index, tag_length) over *tags*, or (-1, 0).
Case-insensitive match.
"""
buf_lower = buf.lower()
best_idx = -1
best_len = 0
for tag in tags:
idx = buf_lower.find(tag.lower())
if idx != -1 and (best_idx == -1 or idx < best_idx):
best_idx = idx
best_len = len(tag)
return best_idx, best_len
def _find_earliest_closed_pair(self, buf: str):
"""Return (start_idx, end_idx) of the earliest closed pair, else None.
A closed pair is ``<tag>...</tag>`` of any variant. Matches are
case-insensitive and non-greedy (the closest close tag after
an open tag wins), matching the regex ``<tag>.*?</tag>``
semantics of ``_strip_think_blocks`` case 1. When two tag
variants could both match, the one whose open tag appears
earlier wins.
"""
buf_lower = buf.lower()
best: "tuple[int, int] | None" = None
for open_tag, close_tag in zip(self._OPEN_TAGS, self._CLOSE_TAGS):
open_lower = open_tag.lower()
close_lower = close_tag.lower()
open_idx = buf_lower.find(open_lower)
if open_idx == -1:
continue
close_idx = buf_lower.find(
close_lower, open_idx + len(open_lower),
)
if close_idx == -1:
continue
end_idx = close_idx + len(close_lower)
if best is None or open_idx < best[0]:
best = (open_idx, end_idx)
return best
def _find_open_at_boundary(
self, buf: str, already_emitted: list[str],
) -> Tuple[int, int]:
"""Return the earliest block-boundary open-tag (idx, len).
Returns (-1, 0) if no boundary-legal opener is present.
"""
buf_lower = buf.lower()
best_idx = -1
best_len = 0
for tag in self._OPEN_TAGS:
tag_lower = tag.lower()
search_start = 0
while True:
idx = buf_lower.find(tag_lower, search_start)
if idx == -1:
break
if self._is_block_boundary(buf, idx, already_emitted):
if best_idx == -1 or idx < best_idx:
best_idx = idx
best_len = len(tag)
break # first boundary hit for this tag is enough
search_start = idx + 1
return best_idx, best_len
def _is_block_boundary(
self, buf: str, idx: int, already_emitted: list[str],
) -> bool:
"""True iff position *idx* in *buf* is a block boundary.
A block boundary is:
- buf position 0 AND the most recent emission ended with
a newline (or nothing has been emitted yet)
- any position whose preceding text on the current line
(since the last newline in buf) is whitespace-only, AND
if there is no newline in the preceding buf portion, the
most recent prior emission ended with a newline
"""
if idx == 0:
# Check whether the last already-emitted chunk in THIS
# feed() call ended with a newline, otherwise fall back
# to the cross-feed flag.
if already_emitted:
return already_emitted[-1].endswith("\n")
return self._last_emitted_ended_newline
preceding = buf[:idx]
last_nl = preceding.rfind("\n")
if last_nl == -1:
# No newline in buf before the tag — boundary only if the
# prior emission ended with a newline AND everything since
# is whitespace.
if already_emitted:
prior_newline = already_emitted[-1].endswith("\n")
else:
prior_newline = self._last_emitted_ended_newline
return prior_newline and preceding.strip() == ""
# Newline present — text between it and the tag must be
# whitespace-only.
return preceding[last_nl + 1:].strip() == ""
@classmethod
def _max_partial_suffix(
cls, buf: str, tags: Tuple[str, ...],
) -> int:
"""Return the longest buf-suffix that is a prefix of any tag.
Only prefixes strictly shorter than the tag itself count
(full-length suffixes are the tag and are handled as matches,
not held-back partials). Case-insensitive.
"""
if not buf:
return 0
buf_lower = buf.lower()
max_check = min(len(buf_lower), cls._MAX_TAG_LEN - 1)
for i in range(max_check, 0, -1):
suffix = buf_lower[-i:]
for tag in tags:
tag_lower = tag.lower()
if len(tag_lower) > i and tag_lower.startswith(suffix):
return i
return 0
@classmethod
def _strip_orphan_close_tags(cls, text: str) -> str:
"""Remove any close tags from *text* (orphan-close handling).
An orphan close tag has no matching open in the current
scrubber state; it's always noise, stripped with any trailing
whitespace so the surrounding prose flows naturally.
"""
if "</" not in text:
return text
text_lower = text.lower()
out: list[str] = []
i = 0
while i < len(text):
matched = False
if text_lower[i:i + 2] == "</":
for tag in cls._CLOSE_TAGS:
tag_lower = tag.lower()
tag_len = len(tag_lower)
if text_lower[i:i + tag_len] == tag_lower:
# Skip the tag and any trailing whitespace,
# matching _strip_think_blocks case 3.
j = i + tag_len
while j < len(text) and text[j] in " \t\n\r":
j += 1
i = j
matched = True
break
if not matched:
out.append(text[i])
i += 1
return "".join(out)
+13 -1
View File
@@ -17,6 +17,7 @@ logger = logging.getLogger(__name__)
# so silent-drops (e.g. OpenRouter 402 exhausting the fallback chain)
# become visible instead of piling up as NULL session titles.
FailureCallback = Callable[[str, BaseException], None]
TitleCallback = Callable[[str], None]
_TITLE_PROMPT = (
"Generate a short, descriptive title (3-7 words) for a conversation that starts with the "
@@ -90,6 +91,7 @@ def auto_title_session(
assistant_response: str,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Generate and set a session title if one doesn't already exist.
@@ -119,6 +121,11 @@ def auto_title_session(
try:
session_db.set_session_title(session_id, title)
logger.debug("Auto-generated session title: %s", title)
if title_callback is not None:
try:
title_callback(title)
except Exception:
logger.debug("Auto-title callback failed", exc_info=True)
except Exception as e:
logger.debug("Failed to set auto-generated title: %s", e)
@@ -131,6 +138,7 @@ def maybe_auto_title(
conversation_history: list,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Fire-and-forget title generation after the first exchange.
@@ -152,7 +160,11 @@ def maybe_auto_title(
thread = threading.Thread(
target=auto_title_session,
args=(session_db, session_id, user_message, assistant_response),
kwargs={"failure_callback": failure_callback, "main_runtime": main_runtime},
kwargs={
"failure_callback": failure_callback,
"main_runtime": main_runtime,
"title_callback": title_callback,
},
daemon=True,
name="auto-title",
)
+13 -1
View File
@@ -6,9 +6,16 @@ Usage:
result = transport.normalize_response(raw_response)
"""
from agent.transports.types import NormalizedResponse, ToolCall, Usage, build_tool_call, map_finish_reason # noqa: F401
from agent.transports.types import (
NormalizedResponse,
ToolCall,
Usage,
build_tool_call,
map_finish_reason,
) # noqa: F401
_REGISTRY: dict = {}
_discovered: bool = False
def register_transport(api_mode: str, transport_cls: type) -> None:
@@ -23,6 +30,9 @@ def get_transport(api_mode: str):
This allows gradual migration call sites can check for None
and fall back to the legacy code path.
"""
global _discovered
if not _discovered:
_discover_transports()
cls = _REGISTRY.get(api_mode)
if cls is None:
# The registry can be partially populated when a specific transport
@@ -38,6 +48,8 @@ def get_transport(api_mode: str):
def _discover_transports() -> None:
"""Import all transport modules to trigger auto-registration."""
global _discovered
_discovered = True
try:
import agent.transports.anthropic # noqa: F401
except ImportError:
+158 -90
View File
@@ -109,7 +109,9 @@ class ChatCompletionsTransport(ProviderTransport):
def api_mode(self) -> str:
return "chat_completions"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> List[Dict[str, Any]]:
def convert_messages(
self, messages: list[dict[str, Any]], **kwargs
) -> list[dict[str, Any]]:
"""Messages are already in OpenAI format — sanitize Codex leaks only.
Strips Codex Responses API fields (``codex_reasoning_items`` /
@@ -126,7 +128,9 @@ class ChatCompletionsTransport(ProviderTransport):
tool_calls = msg.get("tool_calls")
if isinstance(tool_calls, list):
for tc in tool_calls:
if isinstance(tc, dict) and ("call_id" in tc or "response_item_id" in tc):
if isinstance(tc, dict) and (
"call_id" in tc or "response_item_id" in tc
):
needs_sanitize = True
break
if needs_sanitize:
@@ -149,39 +153,41 @@ class ChatCompletionsTransport(ProviderTransport):
tc.pop("response_item_id", None)
return sanitized
def convert_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
def convert_tools(self, tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Tools are already in OpenAI format — identity."""
return tools
def build_kwargs(
self,
model: str,
messages: List[Dict[str, Any]],
tools: Optional[List[Dict[str, Any]]] = None,
messages: list[dict[str, Any]],
tools: list[dict[str, Any]] | None = None,
**params,
) -> Dict[str, Any]:
) -> dict[str, Any]:
"""Build chat.completions.create() kwargs.
This is the most complex transport method it handles ~16 providers
via params rather than subclasses.
params:
params (all optional):
timeout: float API call timeout
max_tokens: int | None user-configured max tokens
ephemeral_max_output_tokens: int | None one-shot override (error recovery)
ephemeral_max_output_tokens: int | None one-shot override
max_tokens_param_fn: callable returns {max_tokens: N} or {max_completion_tokens: N}
reasoning_config: dict | None
request_overrides: dict | None
session_id: str | None
qwen_session_metadata: dict | None {sessionId, promptId} precomputed
model_lower: str lowercase model name for pattern matching
# Provider detection flags (all optional, default False)
# Provider profile path (all per-provider quirks live in providers/)
provider_profile: ProviderProfile | None when present, delegates to
_build_kwargs_from_profile(); all flag params below are bypassed.
# Legacy-path flags — only used when provider_profile is None
# (i.e. custom / unregistered providers). Known providers all go
# through provider_profile.
is_openrouter: bool
is_nous: bool
is_qwen_portal: bool
is_github_models: bool
is_nvidia_nim: bool
is_kimi: bool
is_tokenhub: bool
is_lmstudio: bool
is_custom_provider: bool
ollama_num_ctx: int | None
@@ -190,6 +196,7 @@ class ChatCompletionsTransport(ProviderTransport):
# Qwen-specific
qwen_prepare_fn: callable | None runs AFTER codex sanitization
qwen_prepare_inplace_fn: callable | None in-place variant for deepcopied lists
qwen_session_metadata: dict | None
# Temperature
fixed_temperature: Any from _fixed_temperature_for_model()
omit_temperature: bool
@@ -199,28 +206,21 @@ class ChatCompletionsTransport(ProviderTransport):
lmstudio_reasoning_options: list[str] | None # raw allowed_options from /api/v1/models
# Claude on OpenRouter/Nous max output
anthropic_max_output: int | None
# Extra
extra_body_additions: dict | None pre-built extra_body entries
extra_body_additions: dict | None
"""
# Codex sanitization: drop reasoning_items / call_id / response_item_id
sanitized = self.convert_messages(messages)
# Qwen portal prep AFTER codex sanitization. If sanitize already
# deepcopied, reuse that copy via the in-place variant to avoid a
# second deepcopy.
is_qwen = params.get("is_qwen_portal", False)
if is_qwen:
qwen_prep = params.get("qwen_prepare_fn")
qwen_prep_inplace = params.get("qwen_prepare_inplace_fn")
if sanitized is messages:
if qwen_prep is not None:
sanitized = qwen_prep(sanitized)
else:
# Already deepcopied — transform in place
if qwen_prep_inplace is not None:
qwen_prep_inplace(sanitized)
elif qwen_prep is not None:
sanitized = qwen_prep(sanitized)
# ── Provider profile: single-path when present ──────────────────
_profile = params.get("provider_profile")
if _profile:
return self._build_kwargs_from_profile(
_profile, model, sanitized, tools, params
)
# ── Legacy fallback (unregistered / unknown provider) ───────────
# Reached only when get_provider_profile() returned None.
# Known providers always go through the profile path above.
# Developer role swap for GPT-5/Codex models
model_lower = params.get("model_lower", (model or "").lower())
@@ -233,7 +233,7 @@ class ChatCompletionsTransport(ProviderTransport):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: Dict[str, Any] = {
api_kwargs: dict[str, Any] = {
"model": model,
"messages": sanitized,
}
@@ -242,19 +242,6 @@ class ChatCompletionsTransport(ProviderTransport):
if timeout is not None:
api_kwargs["timeout"] = timeout
# Temperature
fixed_temp = params.get("fixed_temperature")
omit_temp = params.get("omit_temperature", False)
if omit_temp:
api_kwargs.pop("temperature", None)
elif fixed_temp is not None:
api_kwargs["temperature"] = fixed_temp
# Qwen metadata (caller precomputes {sessionId, promptId})
qwen_meta = params.get("qwen_session_metadata")
if qwen_meta and is_qwen:
api_kwargs["metadata"] = qwen_meta
# Tools
if tools:
# Moonshot/Kimi uses a stricter flavored JSON Schema. Rewriting
@@ -278,13 +265,6 @@ class ChatCompletionsTransport(ProviderTransport):
api_kwargs.update(max_tokens_fn(ephemeral))
elif max_tokens is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(max_tokens))
elif is_nvidia_nim and max_tokens_fn:
api_kwargs.update(max_tokens_fn(16384))
elif is_qwen and max_tokens_fn:
api_kwargs.update(max_tokens_fn(65536))
elif is_kimi and max_tokens_fn:
# Kimi/Moonshot: 32000 matches Kimi CLI's default
api_kwargs.update(max_tokens_fn(32000))
elif anthropic_max_out is not None:
api_kwargs["max_tokens"] = anthropic_max_out
@@ -331,7 +311,7 @@ class ChatCompletionsTransport(ProviderTransport):
api_kwargs["reasoning_effort"] = _lm_effort
# extra_body assembly
extra_body: Dict[str, Any] = {}
extra_body: dict[str, Any] = {}
is_openrouter = params.get("is_openrouter", False)
is_nous = params.get("is_nous", False)
@@ -361,35 +341,7 @@ class ChatCompletionsTransport(ProviderTransport):
if gh_reasoning is not None:
extra_body["reasoning"] = gh_reasoning
else:
if reasoning_config is not None:
rc = dict(reasoning_config)
if is_nous and rc.get("enabled") is False:
pass # omit for Nous when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
if is_nous:
extra_body["tags"] = ["product=hermes-agent"]
# Ollama num_ctx
ollama_ctx = params.get("ollama_num_ctx")
if ollama_ctx:
options = extra_body.get("options", {})
options["num_ctx"] = ollama_ctx
extra_body["options"] = options
# Ollama/custom think=false
if params.get("is_custom_provider", False):
if reasoning_config and isinstance(reasoning_config, dict):
_effort = (reasoning_config.get("effort") or "").strip().lower()
_enabled = reasoning_config.get("enabled", True)
if _effort == "none" or _enabled is False:
extra_body["think"] = False
if is_qwen:
extra_body["vl_high_resolution_images"] = True
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
if provider_name == "gemini":
raw_thinking_config = _build_gemini_thinking_config(model, reasoning_config)
@@ -423,6 +375,120 @@ class ChatCompletionsTransport(ProviderTransport):
return api_kwargs
def _build_kwargs_from_profile(self, profile, model, sanitized, tools, params):
"""Build API kwargs using a ProviderProfile — single path, no legacy flags.
This method replaces the entire flag-based kwargs assembly when a
provider_profile is passed. Every quirk comes from the profile object.
"""
from providers.base import OMIT_TEMPERATURE
# Message preprocessing
sanitized = profile.prepare_messages(sanitized)
# Developer role swap — model-name-based, applies to all providers
_model_lower = (model or "").lower()
if (
sanitized
and isinstance(sanitized[0], dict)
and sanitized[0].get("role") == "system"
and any(p in _model_lower for p in DEVELOPER_ROLE_MODELS)
):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: dict[str, Any] = {
"model": model,
"messages": sanitized,
}
# Temperature
if profile.fixed_temperature is OMIT_TEMPERATURE:
pass # Don't include temperature at all
elif profile.fixed_temperature is not None:
api_kwargs["temperature"] = profile.fixed_temperature
else:
# Use caller's temperature if provided
temp = params.get("temperature")
if temp is not None:
api_kwargs["temperature"] = temp
# Timeout
timeout = params.get("timeout")
if timeout is not None:
api_kwargs["timeout"] = timeout
# Tools — apply Moonshot/Kimi schema sanitization regardless of path
if tools:
if is_moonshot_model(model):
tools = sanitize_moonshot_tools(tools)
api_kwargs["tools"] = tools
# max_tokens resolution — priority: ephemeral > user > profile default
max_tokens_fn = params.get("max_tokens_param_fn")
ephemeral = params.get("ephemeral_max_output_tokens")
user_max = params.get("max_tokens")
anthropic_max = params.get("anthropic_max_output")
if ephemeral is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(ephemeral))
elif user_max is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(user_max))
elif profile.default_max_tokens and max_tokens_fn:
api_kwargs.update(max_tokens_fn(profile.default_max_tokens))
elif anthropic_max is not None:
api_kwargs["max_tokens"] = anthropic_max
# Provider-specific api_kwargs extras (reasoning_effort, metadata, etc.)
reasoning_config = params.get("reasoning_config")
extra_body_from_profile, top_level_from_profile = (
profile.build_api_kwargs_extras(
reasoning_config=reasoning_config,
supports_reasoning=params.get("supports_reasoning", False),
qwen_session_metadata=params.get("qwen_session_metadata"),
model=model,
ollama_num_ctx=params.get("ollama_num_ctx"),
)
)
api_kwargs.update(top_level_from_profile)
# extra_body assembly
extra_body: dict[str, Any] = {}
# Profile's extra_body (tags, provider prefs, vl_high_resolution, etc.)
profile_body = profile.build_extra_body(
session_id=params.get("session_id"),
provider_preferences=params.get("provider_preferences"),
model=model,
base_url=params.get("base_url"),
reasoning_config=reasoning_config,
)
if profile_body:
extra_body.update(profile_body)
# Profile's reasoning/thinking extra_body entries
if extra_body_from_profile:
extra_body.update(extra_body_from_profile)
# Merge any pre-built extra_body additions from the caller
additions = params.get("extra_body_additions")
if additions:
extra_body.update(additions)
# Request overrides (user config)
overrides = params.get("request_overrides")
if overrides:
for k, v in overrides.items():
if k == "extra_body" and isinstance(v, dict):
extra_body.update(v)
else:
api_kwargs[k] = v
if extra_body:
api_kwargs["extra_body"] = extra_body
return api_kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize OpenAI ChatCompletion to NormalizedResponse.
@@ -444,7 +510,7 @@ class ChatCompletionsTransport(ProviderTransport):
# Gemini 3 thinking models attach extra_content with
# thought_signature — without replay on the next turn the API
# rejects the request with 400.
tc_provider_data: Dict[str, Any] = {}
tc_provider_data: dict[str, Any] = {}
extra = getattr(tc, "extra_content", None)
if extra is None and hasattr(tc, "model_extra"):
extra = (tc.model_extra or {}).get("extra_content")
@@ -455,12 +521,14 @@ class ChatCompletionsTransport(ProviderTransport):
except Exception:
pass
tc_provider_data["extra_content"] = extra
tool_calls.append(ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
))
tool_calls.append(
ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
)
)
usage = None
if hasattr(response, "usage") and response.usage:
@@ -508,7 +576,7 @@ class ChatCompletionsTransport(ProviderTransport):
return False
return True
def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:
def extract_cache_stats(self, response: Any) -> dict[str, int] | None:
"""Extract OpenRouter/OpenAI cache stats from prompt_tokens_details."""
usage = getattr(response, "usage", None)
if usage is None:
+12 -1
View File
@@ -143,7 +143,18 @@ class ResponsesApiTransport(ProviderTransport):
kwargs["max_output_tokens"] = max_tokens
if is_xai_responses and session_id:
kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["x-grok-conv-id"] = session_id
kwargs["extra_headers"] = merged_extra_headers
return kwargs
+15 -14
View File
@@ -12,7 +12,7 @@ from __future__ import annotations
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from typing import Any
@dataclass
@@ -32,10 +32,10 @@ class ToolCall:
* Others: ``None``
"""
id: Optional[str]
id: str | None
name: str
arguments: str # JSON string
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
provider_data: dict[str, Any] | None = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The agent loop reads tc.function.name / tc.function.arguments
@@ -47,17 +47,17 @@ class ToolCall:
return "function"
@property
def function(self) -> "ToolCall":
def function(self) -> ToolCall:
"""Return self so tc.function.name / tc.function.arguments work."""
return self
@property
def call_id(self) -> Optional[str]:
def call_id(self) -> str | None:
"""Codex call_id from provider_data, accessed via getattr by _build_assistant_message."""
return (self.provider_data or {}).get("call_id")
@property
def response_item_id(self) -> Optional[str]:
def response_item_id(self) -> str | None:
"""Codex response_item_id from provider_data."""
return (self.provider_data or {}).get("response_item_id")
@@ -101,18 +101,18 @@ class NormalizedResponse:
* Others: ``None``
"""
content: Optional[str]
tool_calls: Optional[List[ToolCall]]
content: str | None
tool_calls: list[ToolCall] | None
finish_reason: str # "stop", "tool_calls", "length", "content_filter"
reasoning: Optional[str] = None
usage: Optional[Usage] = None
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
reasoning: str | None = None
usage: Usage | None = None
provider_data: dict[str, Any] | None = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The shim _nr_to_assistant_message() mapped these from provider_data.
# These properties let NormalizedResponse pass through directly.
@property
def reasoning_content(self) -> Optional[str]:
def reasoning_content(self) -> str | None:
pd = self.provider_data or {}
return pd.get("reasoning_content")
@@ -136,8 +136,9 @@ class NormalizedResponse:
# Factory helpers
# ---------------------------------------------------------------------------
def build_tool_call(
id: Optional[str],
id: str | None,
name: str,
arguments: Any,
**provider_fields: Any,
@@ -151,7 +152,7 @@ def build_tool_call(
return ToolCall(id=id, name=name, arguments=args_str, provider_data=pd)
def map_finish_reason(reason: Optional[str], mapping: Dict[str, str]) -> str:
def map_finish_reason(reason: str | None, mapping: dict[str, str]) -> str:
"""Translate a provider-specific stop reason to the normalised set.
Falls back to ``"stop"`` for unknown or ``None`` reasons.
+12
View File
@@ -121,6 +121,18 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# OpenRouter Response Caching (only applies when using OpenRouter)
# =============================================================================
# Cache identical API responses at the OpenRouter edge for free instant replays.
# When enabled, identical requests (same model, messages, parameters) return
# cached responses with zero billing. Separate from Anthropic prompt caching.
# See: https://openrouter.ai/docs/guides/features/response-caching
#
# openrouter:
# response_cache: true # Enable response caching (default: true)
# response_cache_ttl: 300 # Cache TTL in seconds, 1-86400 (default: 300)
# =============================================================================
# Git Worktree Isolation
# =============================================================================
+607 -115
View File
File diff suppressed because it is too large Load Diff
+56 -8
View File
@@ -420,7 +420,7 @@ def _normalize_workdir(workdir: Optional[str]) -> Optional[str]:
def create_job(
prompt: str,
prompt: Optional[str],
schedule: str,
name: Optional[str] = None,
repeat: Optional[int] = None,
@@ -435,12 +435,14 @@ def create_job(
context_from: Optional[Union[str, List[str]]] = None,
enabled_toolsets: Optional[List[str]] = None,
workdir: Optional[str] = None,
no_agent: bool = False,
) -> Dict[str, Any]:
"""
Create a new cron job.
Args:
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set)
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set).
Ignored when ``no_agent=True`` except as an optional name hint.
schedule: Schedule string (see parse_schedule)
name: Optional friendly name
repeat: How many times to run (None = forever, 1 = once)
@@ -451,21 +453,33 @@ def create_job(
model: Optional per-job model override
provider: Optional per-job provider override
base_url: Optional per-job base URL override
script: Optional path to a Python script whose stdout is injected into the
prompt each run. The script runs before the agent turn, and its output
is prepended as context. Useful for data collection / change detection.
script: Optional path to a script whose stdout feeds the job. With
``no_agent=True`` the script IS the job its stdout is
delivered verbatim. Without ``no_agent``, its stdout is
injected into the agent's prompt as context (data-collection /
change-detection pattern). Paths resolve under
~/.hermes/scripts/; ``.sh`` / ``.bash`` files run via bash,
anything else via Python.
context_from: Optional job ID (or list of job IDs) whose most recent output
is injected into the prompt as context before each run.
Useful for chaining cron jobs: job A finds data, job B processes it.
enabled_toolsets: Optional list of toolset names to restrict the agent to.
When set, only tools from these toolsets are loaded, reducing
token overhead. When omitted, all default tools are loaded.
Ignored when ``no_agent=True``.
workdir: Optional absolute path. When set, the job runs as if launched
from that directory: AGENTS.md / CLAUDE.md / .cursorrules from
that directory are injected into the system prompt, and the
terminal/file/code_exec tools use it as their working directory
(via TERMINAL_CWD). When unset, the old behaviour is preserved
(no context files injected, tools use the scheduler's cwd).
With ``no_agent=True``, ``workdir`` is still applied as the
script's cwd so relative paths inside the script behave
predictably.
no_agent: When True, skip the agent entirely run ``script`` on schedule
and deliver its stdout directly. Empty stdout = silent (no
delivery). Requires ``script`` to be set. Ideal for classic
watchdogs and periodic alerts that don't need LLM reasoning.
Returns:
The created job dict
@@ -499,6 +513,16 @@ def create_job(
normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
normalized_toolsets = normalized_toolsets or None
normalized_workdir = _normalize_workdir(workdir)
normalized_no_agent = bool(no_agent)
# no_agent jobs are meaningless without a script — the script IS the job.
# Surface this as a clear ValueError at create time so bad configs never
# reach the scheduler.
if normalized_no_agent and not normalized_script:
raise ValueError(
"no_agent=True requires a script — with no agent and no script "
"there is nothing for the job to run."
)
# Normalize context_from: accept str or list of str, store as list or None
if isinstance(context_from, str):
@@ -508,7 +532,7 @@ def create_job(
else:
context_from = None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
job = {
"id": job_id,
"name": name or label_source[:50].strip(),
@@ -519,6 +543,7 @@ def create_job(
"provider": normalized_provider,
"base_url": normalized_base_url,
"script": normalized_script,
"no_agent": normalized_no_agent,
"context_from": context_from,
"schedule": parsed_schedule,
"schedule_display": parsed_schedule.get("display", schedule),
@@ -785,6 +810,12 @@ def get_due_jobs() -> List[Dict[str, Any]]:
the job is fast-forwarded to the next future run instead of firing
immediately. This prevents a burst of missed jobs on gateway restart.
"""
with _jobs_file_lock:
return _get_due_jobs_locked()
def _get_due_jobs_locked() -> List[Dict[str, Any]]:
"""Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
now = _hermes_now()
raw_jobs = load_jobs()
jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
@@ -797,19 +828,36 @@ def get_due_jobs() -> List[Dict[str, Any]]:
next_run = job.get("next_run_at")
if not next_run:
schedule = job.get("schedule", {})
kind = schedule.get("kind")
# One-shot jobs use a small grace window via the dedicated helper.
recovered_next = _recoverable_oneshot_run_at(
job.get("schedule", {}),
schedule,
now,
last_run_at=job.get("last_run_at"),
)
recovery_kind = "one-shot" if recovered_next else None
# Recurring jobs reach here only when something — typically a
# direct jobs.json edit that bypassed add_job() — left
# next_run_at unset. Without this branch, such jobs are
# silently skipped forever; recompute next_run_at from the
# schedule so they pick up at their next scheduled tick.
if not recovered_next and kind in ("cron", "interval"):
recovered_next = compute_next_run(schedule, now.isoformat())
if recovered_next:
recovery_kind = kind
if not recovered_next:
continue
job["next_run_at"] = recovered_next
next_run = recovered_next
logger.info(
"Job '%s' had no next_run_at; recovering one-shot run at %s",
"Job '%s' had no next_run_at; recovering %s run at %s",
job.get("name", job["id"]),
recovery_kind,
recovered_next,
)
for rj in raw_jobs:
+204 -29
View File
@@ -35,7 +35,7 @@ from typing import List, Optional
sys.path.insert(0, str(Path(__file__).parent.parent))
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config
from hermes_cli.config import load_config, _expand_env_vars
from hermes_time import now as _hermes_now
logger = logging.getLogger(__name__)
@@ -114,18 +114,36 @@ from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_
# locally for audit.
SILENT_MARKER = "[SILENT]"
# Resolve Hermes home directory (respects HERMES_HOME override)
_hermes_home = get_hermes_home()
# Backward-compatible module override used by tests and emergency monkeypatches.
_hermes_home: Path | None = None
# File-based lock prevents concurrent ticks from gateway + daemon + systemd timer
_LOCK_DIR = _hermes_home / "cron"
_LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _get_hermes_home() -> Path:
"""Resolve Hermes home dynamically while preserving test monkeypatch hooks."""
return _hermes_home or get_hermes_home()
def _get_lock_paths() -> tuple[Path, Path]:
"""Resolve cron lock paths at call time so profile/env changes are honored."""
hermes_home = _get_hermes_home()
lock_dir = hermes_home / "cron"
return lock_dir, lock_dir / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, preserving any extra routing metadata."""
"""Extract origin info from a job, preserving any extra routing metadata.
Treats non-dict origins (free-form provenance strings, ints, lists from
migration scripts or hand-edited jobs.json) as missing instead of
crashing with ``AttributeError`` on ``origin.get(...)``. Without this
guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
crashed every fire attempt with
``'str' object has no attribute 'get'`` ``mark_job_run`` recorded the
failure, but the next tick re-loaded the same poisoned origin and
crashed identically until the field was patched manually (#18722).
"""
origin = job.get("origin")
if not origin:
if not isinstance(origin, dict):
return None
platform = origin.get("platform")
chat_id = origin.get("chat_id")
@@ -147,6 +165,19 @@ def _get_home_target_chat_id(platform_name: str) -> str:
return value
def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
"""Return the optional thread/topic ID for a platform home target."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return None
value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
return value or None
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
@@ -175,7 +206,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
return None
@@ -229,7 +260,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
@@ -394,7 +425,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin = _resolve_origin(job) or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
@@ -553,8 +584,18 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
prevent arbitrary script execution via path traversal or absolute
path injection.
Supported interpreters (chosen by file extension):
* ``.sh`` / ``.bash`` run with ``/bin/bash``
* anything else run with the current Python interpreter
(``sys.executable``), preserving the original behaviour for
Python-based pre-check and data-collection scripts.
Shell support lets ``no_agent=True`` jobs ship classic bash watchdogs
(the `memory-watchdog.sh` pattern) without wrapping them in Python.
Args:
script_path: Path to a Python script. Relative paths are resolved
script_path: Path to the script. Relative paths are resolved
against HERMES_HOME/scripts/. Absolute and ~-prefixed paths
are also validated to ensure they stay within the scripts dir.
@@ -564,7 +605,7 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
"""
from hermes_constants import get_hermes_home
scripts_dir = get_hermes_home() / "scripts"
scripts_dir = _get_hermes_home() / "scripts"
scripts_dir.mkdir(parents=True, exist_ok=True)
scripts_dir_resolved = scripts_dir.resolve()
@@ -591,9 +632,19 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
script_timeout = _get_script_timeout()
# Pick an interpreter by extension. Bash for .sh/.bash, Python for
# everything else. We deliberately do NOT honour the file's own
# shebang: the scripts dir is trusted, but keeping the interpreter
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
argv = ["/bin/bash", str(path)]
else:
argv = [sys.executable, str(path)]
try:
result = subprocess.run(
[sys.executable, str(path)],
argv,
capture_output=True,
text=True,
timeout=script_timeout,
@@ -683,10 +734,8 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
f"{prompt}"
)
else:
prompt = (
"[Script ran successfully but produced no output.]\n\n"
f"{prompt}"
)
# Script produced no output — nothing to report, skip AI call.
return None
else:
prompt = (
"## Script Error\n"
@@ -759,6 +808,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
return prompt
from tools.skills_tool import skill_view
from tools.skill_usage import bump_use
parts = []
skipped: list[str] = []
@@ -770,6 +820,12 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
skipped.append(skill_name)
continue
# Bump usage so the curator sees this skill as actively used.
try:
bump_use(skill_name)
except Exception:
logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
content = str(loaded.get("content") or "").strip()
if parts:
parts.append("")
@@ -802,8 +858,120 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
Returns:
Tuple of (success, full_output_doc, final_response, error_message)
"""
job_id = job["id"]
job_name = job["name"]
# ---------------------------------------------------------------
# no_agent short-circuit — the script IS the job, no LLM involvement.
# ---------------------------------------------------------------
# This mirrors the classic "run a bash script on a timer, send its
# stdout to telegram" watchdog pattern. The agent path is skipped
# entirely: no AIAgent, no prompt, no tool loop, no token spend.
#
# We check this BEFORE importing run_agent / constructing SessionDB so
# a pure-script tick never pays for the agent machinery it isn't going
# to use. Keep this block self-contained.
#
# Semantics:
# - script stdout (trimmed) → delivered verbatim as the final message
# - empty stdout → silent run (no delivery, success=True)
# - non-zero exit / timeout → delivered as an error alert, success=False
# - wakeAgent=false gate → treated like empty stdout (silent), since
# the whole point of no_agent is that there
# is no agent to wake
if job.get("no_agent"):
script_path = job.get("script")
if not script_path:
err = "no_agent=True but no script is set for this job"
logger.error("Job '%s': %s", job_id, err)
return False, "", "", err
# Apply workdir if configured — lets scripts use predictable relative
# paths. For no_agent jobs this is just the subprocess cwd (not an
# agent TERMINAL_CWD bridge).
_job_workdir = (job.get("workdir") or "").strip() or None
_prior_cwd = None
if _job_workdir and Path(_job_workdir).is_dir():
_prior_cwd = os.getcwd()
try:
os.chdir(_job_workdir)
except OSError:
_prior_cwd = None
try:
ok, output = _run_job_script(script_path)
finally:
if _prior_cwd is not None:
try:
os.chdir(_prior_cwd)
except OSError:
pass
now_iso = _hermes_now().strftime("%Y-%m-%d %H:%M:%S")
if not ok:
# Script crashed / timed out / exited non-zero. Deliver the
# error so the user knows the watchdog itself broke — silent
# failure for an alerting job is the worst-case outcome.
alert = (
f"⚠ Cron watchdog '{job_name}' script failed\n\n"
f"{output}\n\n"
f"Time: {now_iso}"
)
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** script failed\n\n"
f"{output}\n"
)
return False, doc, alert, output
# Honour the wakeAgent gate as a silent signal — `wakeAgent: false`
# means "nothing to report this tick", same as empty stdout.
if not _parse_wake_gate(output):
logger.info(
"Job '%s' (no_agent): wakeAgent=false gate — silent run", job_id
)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (wakeAgent=false)\n"
)
return True, silent_doc, SILENT_MARKER, None
if not output.strip():
logger.info("Job '%s' (no_agent): empty stdout — silent run", job_id)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (empty output)\n"
)
return True, silent_doc, SILENT_MARKER, None
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n\n"
f"---\n\n"
f"{output}\n"
)
return True, doc, output, None
# ---------------------------------------------------------------
# Default (LLM) path — import and construct the agent machinery now
# that we know we actually need it. Doing these imports here instead of
# at module top keeps no_agent ticks from paying for AIAgent / SessionDB
# construction costs.
# ---------------------------------------------------------------
from run_agent import AIAgent
# Initialize SQLite session store so cron job messages are persisted
# and discoverable via session_search (same pattern as gateway/run.py).
_session_db = None
@@ -812,9 +980,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_session_db = SessionDB()
except Exception as e:
logger.debug("Job '%s': SQLite session store not available: %s", job.get("id", "?"), e)
job_id = job["id"]
job_name = job["name"]
# Wake-gate: if this job has a pre-check script, run it BEFORE building
# the prompt so a ``{"wakeAgent": false}`` response can short-circuit
@@ -839,6 +1004,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
return True, silent_doc, SILENT_MARKER, None
prompt = _build_job_prompt(job, prerun_script=prerun_script)
if prompt is None:
logger.info("Job '%s': script produced no output, skipping AI call.", job_name)
return True, "", SILENT_MARKER, None
origin = _resolve_origin(job)
_cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
@@ -898,9 +1066,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
# changes take effect without a gateway restart.
from dotenv import load_dotenv
try:
load_dotenv(str(_hermes_home / ".env"), override=True, encoding="utf-8")
load_dotenv(str(_get_hermes_home() / ".env"), override=True, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(str(_hermes_home / ".env"), override=True, encoding="latin-1")
load_dotenv(str(_get_hermes_home() / ".env"), override=True, encoding="latin-1")
delivery_target = _resolve_delivery_target(job)
if delivery_target:
@@ -918,10 +1086,11 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_cfg = {}
try:
import yaml
_cfg_path = str(_hermes_home / "config.yaml")
_cfg_path = str(_get_hermes_home() / "config.yaml")
if os.path.exists(_cfg_path):
with open(_cfg_path) as _f:
_cfg = yaml.safe_load(_f) or {}
_cfg = _expand_env_vars(_cfg)
_model_cfg = _cfg.get("model", {})
if not job.get("model"):
if isinstance(_model_cfg, str):
@@ -951,7 +1120,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if prefill_file:
pfpath = Path(prefill_file).expanduser()
if not pfpath.is_absolute():
pfpath = _hermes_home / pfpath
pfpath = _get_hermes_home() / pfpath
if pfpath.exists():
try:
with open(pfpath, "r", encoding="utf-8") as _pf:
@@ -974,8 +1143,13 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
)
from hermes_cli.auth import AuthError
try:
# Do not inject HERMES_INFERENCE_PROVIDER here. resolve_runtime_provider()
# already prefers persisted config over stale shell/env overrides when
# no explicit provider is requested. Passing the env var here short-
# circuits that precedence and can resurrect old providers (for
# example DeepSeek) for cron jobs that do not pin provider/model.
runtime_kwargs = {
"requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
"requested": job.get("provider"),
}
if job.get("base_url"):
runtime_kwargs["explicit_base_url"] = job.get("base_url")
@@ -1270,12 +1444,13 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
Returns:
Number of jobs executed (0 if another tick is already running)
"""
_LOCK_DIR.mkdir(parents=True, exist_ok=True)
lock_dir, lock_file = _get_lock_paths()
lock_dir.mkdir(parents=True, exist_ok=True)
# Cross-platform file locking: fcntl on Unix, msvcrt on Windows
lock_fd = None
try:
lock_fd = open(_LOCK_FILE, "w")
lock_fd = open(lock_file, "w")
if fcntl:
fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
elif msvcrt:
+35
View File
@@ -86,6 +86,41 @@ if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
# Optionally start `hermes dashboard` as a side-process.
#
# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes", case-insensitive).
# Host/port/TUI can be overridden via:
# HERMES_DASHBOARD_HOST (default 0.0.0.0 — exposed outside the container)
# HERMES_DASHBOARD_PORT (default 9119, matches `hermes dashboard` default)
# HERMES_DASHBOARD_TUI (already honored by `hermes dashboard` itself)
#
# The dashboard is a long-lived server. We background it *before* the final
# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
# sleep infinity, …) remains PID-of-interest for the container runtime. When
# the container stops the whole process tree is torn down, so no explicit
# cleanup is needed.
case "${HERMES_DASHBOARD:-}" in
1|true|TRUE|True|yes|YES|Yes)
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
# Binding to anything other than localhost requires --insecure — the
# dashboard refuses otherwise because it exposes API keys. Inside a
# container this is the expected deployment (host reaches it via
# published port), so opt in automatically.
if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
dash_args+=(--insecure)
fi
echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
# Prefix dashboard output so it's distinguishable from the main
# process in `docker logs`. stdbuf keeps the pipe line-buffered.
(
stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
| sed -u 's/^/[dashboard] /'
) &
;;
esac
# Final exec: two supported invocation patterns.
#
# docker run <image> -> exec `hermes` with no args (legacy default)
@@ -0,0 +1,473 @@
# Telegram DM User-Managed Multi-Session Topics Implementation Plan
> **For Hermes:** Use test-driven-development for implementation. Use subagent-driven-development only after this plan is split into small reviewed tasks.
**Goal:** Add an opt-in Telegram DM multi-session mode where Telegram user-created private-chat topics become independent Hermes session lanes, while the root DM becomes a system lobby.
**Architecture:** Rely on Telegram's native private-chat topic UI. Users create new topics with the `+` button; Hermes maps each `message_thread_id` to a separate session lane. Hermes does not create topics for normal `/new` flow and does not try to manage topic lifecycle beyond activation/status, root-lobby behavior, and restoring legacy sessions into a user-created topic.
**Tech Stack:** Hermes gateway, Telegram Bot API 9.4+, python-telegram-bot adapter, SQLite SessionDB / side tables, pytest.
---
## 1. Product decisions
### Accepted
- PR-quality implementation: migrations, tests, docs, backwards compatibility.
- Use SQLite persistence, not JSON sidecars.
- Live status suffixes in topic titles are out of MVP.
- Topic title sync/editing is out of MVP except future-compatible storage if cheap.
- User creates Telegram topics manually through the Telegram bot interface.
- `/new` does **not** create Telegram topics.
- Root/main DM becomes a system lobby after activation.
- Existing Telegram behavior remains unchanged until the feature is activated/enabled.
- Migration of old sessions is supported through `/topic` listing and `/topic <session_id>` restore inside a user-created topic.
### Telegram API assumptions verified from Bot API docs
- `getMe` returns bot `User` fields:
- `has_topics_enabled`: forum/topic mode enabled in private chats.
- `allows_users_to_create_topics`: users may create/delete topics in private chats.
- `createForumTopic` works for private chats with a user, but MVP does not rely on it for normal flow.
- `Message.message_thread_id` identifies a topic in private chats.
- `sendMessage` supports `message_thread_id` for private-chat topics.
- `pinChatMessage` is allowed in private chats.
---
## 2. Target UX
### 2.1 Activation from root/main DM
User sends:
```text
/topic
```
Hermes:
1. calls Telegram `getMe`;
2. verifies `has_topics_enabled` and `allows_users_to_create_topics`;
3. enables multi-session topic mode for this Telegram DM user/chat;
4. sends an onboarding message;
5. pins the onboarding message if configured;
6. shows old/unlinked sessions that can be restored into topics.
Suggested onboarding text:
```text
Multi-session mode is enabled.
Create new Hermes chats with the + button in this bot interface. Each Telegram topic is an independent Hermes session, so you can work on different tasks in parallel.
This main chat is reserved for system commands, status, and session management.
To restore an old session:
1. Use /topic here to see unlinked sessions.
2. Create a new topic with the + button.
3. Send /topic <session_id> inside that topic.
```
### 2.2 Root/main DM after activation
Root DM is a system lobby.
Allowed/system commands include at least:
- `/topic`
- `/status`
- `/sessions` if available
- `/usage`
- `/help`
- `/platforms`
Normal user prompts in root DM do not enter the agent loop. Reply:
```text
This main chat is reserved for system commands.
To chat with Hermes, create a new topic using the + button in this bot interface. Each topic works as an independent Hermes session.
```
`/new` in root DM does not create a session/topic. Reply:
```text
To start a new parallel Hermes chat, create a new topic with the + button in this bot interface.
Each topic is an independent Hermes session. Use /new inside a topic only if you want to replace that topic's current session.
```
### 2.3 First message in a user-created topic
When a user creates a Telegram topic and sends the first message there:
1. Hermes receives a Telegram DM message with `message_thread_id`.
2. Hermes derives the existing thread-aware `session_key` from `(platform=telegram, chat_type=dm, chat_id, thread_id)`.
3. If no binding exists, Hermes creates a fresh Hermes session for this topic lane and persists the binding.
4. The message runs through the normal agent loop for that lane.
### 2.4 `/new` inside a non-main topic
`/new` remains supported but replaces the session attached to the current topic lane.
Hermes should warn:
```text
Started a new Hermes session in this topic.
Tip: for parallel work, create a new topic with the + button instead of using /new here. /new replaces the session attached to the current topic.
```
### 2.5 `/topic` in root/main DM after activation
Shows:
- mode enabled/disabled;
- last capability check result;
- whether intro message is pinned if known;
- count of known topic bindings;
- list of old/unlinked sessions.
Example:
```text
Telegram multi-session topics are enabled.
Create new Hermes chats with the + button in this bot interface.
Unlinked previous sessions:
1. 2026-05-01 Research notes — id: abc123
2. 2026-04-30 Deploy debugging — id: def456
3. Untitled session — id: ghi789
To restore one:
1. Create a new topic with the + button.
2. Open that topic.
3. Send /topic <id>
```
### 2.6 `/topic` inside a non-main topic
Without args, show the current topic binding:
```text
This topic is linked to:
Session: Research notes
ID: abc123
Use /new to replace this topic with a fresh session.
For parallel work, create another topic with the + button.
```
### 2.7 `/topic <session_id>` inside a non-main topic
Restore an old/unlinked session into the current user-created topic.
Behavior:
1. reject if not in Telegram DM topic;
2. verify session belongs to the same Telegram user/chat or is a safe legacy root DM session for this user;
3. reject if session is already linked to another active topic in MVP;
4. `SessionStore.switch_session(current_topic_session_key, target_session_id)`;
5. upsert binding with `managed_mode = restored`;
6. send two messages into the topic:
- session restored confirmation;
- last Hermes assistant message if available.
Example:
```text
Session restored: Research notes
Last Hermes message:
...
```
---
## 3. Persistence model
Use SQLite, but topic-mode schema changes are **explicit opt-in migrations**, not automatic startup reconciliation.
Important rollback-safety rule:
- upgrading Hermes and starting the gateway must not create Telegram topic-mode tables or columns;
- old/default Telegram behavior must keep working on the existing `state.db`;
- the first `/topic` activation path calls an idempotent explicit migration, then enables topic mode for that chat;
- if activation fails before the migration is needed, the database remains in the pre-topic-mode shape.
### 3.1 No eager `sessions` table mutation for MVP
Do **not** add `chat_id`, `chat_type`, `thread_id`, or `session_key` columns to `sessions` as part of ordinary `SessionDB()` startup. The existing declarative `_reconcile_columns()` mechanism would add them eagerly on every process start, which violates the managed-migration requirement.
For MVP, keep origin/session-lane data in topic-specific side tables created only by the explicit `/topic` migration. Legacy unlinked sessions can be discovered conservatively from existing data (`source = telegram`, `user_id = current Telegram user`) plus absence from topic bindings.
If future PRs need richer origin metadata for all gateway sessions, introduce it behind a separate explicit migration/command or a compatibility-reviewed schema bump.
### 3.2 Explicit `/topic` migration API
Add an idempotent method such as:
```python
def apply_telegram_topic_migration(self) -> None: ...
```
It creates only topic-mode side tables/indexes and records:
```text
state_meta.telegram_dm_topic_schema_version = 1
```
This method is called from `/topic` activation/status paths before reading or writing topic-mode state. It is not called from generic `SessionDB.__init__`, gateway startup, CLI startup, or auto-maintenance.
### 3.3 `telegram_dm_topic_mode`
Stores per-user/chat activation state. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id` primary key
- `user_id`
- `enabled`
- `activated_at`
- `updated_at`
- `has_topics_enabled`
- `allows_users_to_create_topics`
- `capability_checked_at`
- `intro_message_id`
- `pinned_message_id`
### 3.4 `telegram_dm_topic_bindings`
Stores Telegram topic/thread to Hermes session binding. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id`
- `thread_id`
- `user_id`
- `session_key`
- `session_id`
- `managed_mode`
- `auto`
- `restored`
- `new_replaced`
- `linked_at`
- `updated_at`
Recommended constraints:
- primary key `(chat_id, thread_id)`;
- unique index on `session_id` for MVP to prevent one session linked to multiple topics;
- index `(user_id, chat_id)` for status/listing.
### 3.5 Unlinked session semantics
For MVP, a session is unlinked if:
- `source = telegram`;
- `user_id = current Telegram user`;
- no row in `telegram_dm_topic_bindings` has `session_id = session_id`.
This is intentionally conservative until a future explicit migration adds richer cross-platform origin metadata.
Never dedupe by title.
---
## 4. Config
Suggested config block:
```yaml
platforms:
telegram:
extra:
multisession_topics:
enabled: false
mode: user_managed_topics
root_chat_behavior: system_lobby
pin_intro_message: true
```
Notes:
- `enabled: false` means existing Telegram behavior is unchanged.
- Activation via `/topic` may create per-chat enabled state only if global config permits it.
- `root_chat_behavior: system_lobby` is the MVP behavior for activated chats.
---
## 5. Command behavior summary
### `/topic` root/main DM
- If not activated: capability check, activate, send/pin onboarding, list unlinked sessions.
- If activated: show status and unlinked sessions.
### `/topic` non-main topic
- Show current binding.
### `/topic <session_id>` root/main DM
Reject with instructions:
```text
Create a new topic with the + button, open it, then send /topic <session_id> there to restore this session.
```
### `/topic <session_id>` non-main topic
Restore that session into this topic if ownership/linking checks pass.
### `/new` root/main DM when activated
Reply with instructions to use the `+` button. Do not enter agent loop.
### `/new` non-main topic
Create a new session in the current topic lane, persist/update binding, warn that `+` is preferred for parallel work.
### Normal text root/main DM when activated
Reply with system-lobby instruction. Do not enter agent loop.
### Normal text non-main topic
Normal Hermes agent flow for that topic's session lane.
---
## 6. PR breakdown
### PR 1 — Explicit topic-mode schema migration
**Goal:** Add rollback-safe SQLite support for Telegram topic mode without mutating `state.db` on ordinary upgrade/startup.
**Files likely touched:**
- `hermes_state.py`
- tests under `tests/`
**Tests first:**
1. opening an old/current DB with `SessionDB()` does not create topic-mode tables or `sessions` origin columns;
2. calling `apply_telegram_topic_migration()` creates `telegram_dm_topic_mode` and `telegram_dm_topic_bindings` idempotently;
3. migration records `state_meta.telegram_dm_topic_schema_version = 1`.
### PR 2 — Topic mode activation and binding APIs
**Goal:** Add SQLite persistence for activation and topic bindings.
**Tests first:**
1. enable/check mode row round-trips;
2. binding upsert and lookup by `(chat_id, user_id, thread_id)`;
3. linked sessions are excluded from unlinked list.
### PR 3 — `/topic` activation/status command
**Goal:** Implement root activation/status/listing behavior.
**Tests first:**
1. `/topic` in root checks `getMe` capabilities and records activation;
2. capability failure returns readable instructions;
3. activated root `/topic` lists unlinked sessions.
### PR 4 — System lobby behavior
**Goal:** Prevent root chat from entering agent loop after activation.
**Tests first:**
1. normal text in activated root returns lobby instruction;
2. `/new` in activated root returns `+` button instruction;
3. non-activated root behavior is unchanged.
### PR 5 — Auto-bind user-created topics
**Goal:** First message in non-main topic creates/uses an independent session lane.
**Tests first:**
1. new topic message creates binding with `auto_created`;
2. repeated topic message reuses same binding/lane;
3. two topics in same DM do not share sessions.
### PR 6 — Restore legacy sessions into a topic
**Goal:** Implement `/topic <session_id>` in non-main topics.
**Tests first:**
1. root `/topic <id>` rejects with instructions;
2. topic `/topic <id>` switches current topic lane to target session;
3. restore rejects sessions from other users/chats;
4. restore rejects already-linked sessions;
5. restore emits confirmation and last Hermes assistant message.
### PR 7 — `/new` inside topic updates binding
**Goal:** Keep existing `/new` semantics but persist topic binding replacement.
**Tests first:**
1. `/new` in topic creates a new session for same topic lane;
2. binding updates to `managed_mode = new_replaced`;
3. response includes guidance to use `+` for parallel work.
### PR 8 — Docs and polish
**Goal:** Document the feature and Telegram setup.
**Files likely touched:**
- `website/docs/user-guide/messaging/telegram.md`
- maybe `website/docs/user-guide/sessions.md`
Docs must explain:
- BotFather/Telegram settings for topic mode and user-created topics;
- `/topic` activation;
- root system lobby;
- using `+` for new parallel chats;
- restoring old sessions with `/topic <id>` inside a topic;
- limitations.
---
## 7. Testing / quality gates
Run targeted tests after each TDD cycle, then broader tests before completion.
Suggested commands after inspection confirms test paths:
```bash
python -m pytest tests/test_hermes_state.py -q
python -m pytest tests/gateway/ -q
python -m pytest tests/ -o 'addopts=' -q
```
Do not ship without verifying disabled-feature backwards compatibility.
---
## 8. Definition of done for MVP
- `/topic` activates/checks Telegram DM multi-session mode.
- Root DM becomes a system lobby after activation.
- Onboarding message tells users to create new chats with the Telegram `+` button.
- Onboarding message can be pinned in private chat.
- User-created topics automatically become independent Hermes session lanes.
- `/new` in root gives instructions, not a new agent run.
- `/new` in a topic creates a new session in that topic and warns that `+` is preferred for parallel work.
- `/topic` in root lists unlinked old sessions.
- `/topic <session_id>` inside a topic restores that session and sends confirmation + last Hermes assistant message.
- Ownership checks prevent restoring other users' sessions.
- Already-linked sessions are not restored into a second topic in MVP.
- Existing Telegram behavior is unchanged when the feature is disabled.
- Tests and docs are included.
+1 -1
View File
@@ -40,7 +40,7 @@ This directory contains the integration layer between **hermes-agent's** tool-ca
- `evaluate_log()` for saving eval results to JSON + samples.jsonl
**HermesAgentBaseEnv** (`hermes_base_env.py`) extends BaseEnv with hermes-agent specifics:
- Sets `os.environ["TERMINAL_ENV"]` to configure the terminal backend (local, docker, modal, daytona, ssh, singularity)
- Sets `os.environ["TERMINAL_ENV"]` to configure the terminal backend (local, docker, ssh, singularity, modal, daytona, vercel_sandbox)
- Resolves hermes-agent toolsets via `_resolve_tools_for_group()` (calls `get_tool_definitions()` which queries `tools/registry.py`)
- Implements `collect_trajectory()` which runs the full agent loop and computes rewards
- Supports two-phase operation (Phase 1: OpenAI server, Phase 2: VLLM ManagedServer)
Binary file not shown.

After

Width:  |  Height:  |  Size: 115 KiB

+105 -8
View File
@@ -65,6 +65,15 @@ def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> st
return default
def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
"""Normalize notice delivery mode to a supported value."""
if isinstance(value, str):
normalized = value.strip().lower()
if normalized in {"public", "private"}:
return normalized
return default
# Module-level cache for bundled platform plugin names (lives outside the
# enum so it doesn't become an accidental enum member).
_Platform__bundled_plugin_names: Optional[set] = None
@@ -177,18 +186,24 @@ class HomeChannel:
Default destination for a platform.
When a cron job specifies deliver="telegram" without a specific chat ID,
messages are sent to this home channel.
messages are sent to this home channel. Thread-aware platforms may also
store a thread/topic ID so the bare platform target routes to the exact
conversation where /sethome was run.
"""
platform: Platform
chat_id: str
name: str # Human-readable name for display
thread_id: Optional[str] = None
def to_dict(self) -> Dict[str, Any]:
return {
result = {
"platform": self.platform.value,
"chat_id": self.chat_id,
"name": self.name,
}
if self.thread_id:
result["thread_id"] = self.thread_id
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -196,6 +211,7 @@ class HomeChannel:
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
name=data.get("name", "Home"),
thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
)
@@ -255,15 +271,23 @@ class PlatformConfig:
# - "first": Only first chunk threads to user's message (default)
# - "all": All chunks in multi-part replies thread to user's message
reply_to_mode: str = "first"
# Whether the gateway is allowed to send "♻️ Gateway online" /
# "♻ Gateway restarted" lifecycle notifications on this platform.
# Default True preserves prior behavior. Set False on platforms used
# by end users (e.g. Slack) where operator-flavored restart pings are
# noise; keep True for back-channels where the operator wants them.
gateway_restart_notification: bool = True
# Platform-specific settings
extra: Dict[str, Any] = field(default_factory=dict)
def to_dict(self) -> Dict[str, Any]:
result = {
"enabled": self.enabled,
"extra": self.extra,
"reply_to_mode": self.reply_to_mode,
"gateway_restart_notification": self.gateway_restart_notification,
}
if self.token:
result["token"] = self.token
@@ -272,19 +296,22 @@ class PlatformConfig:
if self.home_channel:
result["home_channel"] = self.home_channel.to_dict()
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "PlatformConfig":
home_channel = None
if "home_channel" in data:
home_channel = HomeChannel.from_dict(data["home_channel"])
return cls(
enabled=_coerce_bool(data.get("enabled"), False),
token=data.get("token"),
api_key=data.get("api_key"),
home_channel=home_channel,
reply_to_mode=data.get("reply_to_mode", "first"),
gateway_restart_notification=_coerce_bool(
data.get("gateway_restart_notification"), True
),
extra=data.get("extra", {}),
)
@@ -592,6 +619,17 @@ class GatewayConfig:
)
return self.unauthorized_dm_behavior
def get_notice_delivery(self, platform: Optional[Platform] = None) -> str:
"""Return the effective notice-delivery mode for a platform."""
if platform:
platform_cfg = self.platforms.get(platform)
if platform_cfg and "notice_delivery" in platform_cfg.extra:
return _normalize_notice_delivery(
platform_cfg.extra.get("notice_delivery"),
"public",
)
return "public"
def load_gateway_config() -> GatewayConfig:
"""
@@ -707,6 +745,11 @@ def load_gateway_config() -> GatewayConfig:
platform_cfg.get("unauthorized_dm_behavior"),
gw_data.get("unauthorized_dm_behavior", "pair"),
)
if "notice_delivery" in platform_cfg:
bridged["notice_delivery"] = _normalize_notice_delivery(
platform_cfg.get("notice_delivery"),
"public",
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "reply_in_thread" in platform_cfg:
@@ -813,12 +856,36 @@ def load_gateway_config() -> GatewayConfig:
):
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# reply_to_mode: top-level preferred, falls back to extra.reply_to_mode
# YAML 1.1 parses bare 'off' as boolean False — coerce to string "off".
_discord_extra = discord_cfg.get("extra") if isinstance(discord_cfg.get("extra"), dict) else {}
_discord_rtm = (
discord_cfg["reply_to_mode"] if "reply_to_mode" in discord_cfg
else _discord_extra.get("reply_to_mode")
)
if _discord_rtm is not None and not os.getenv("DISCORD_REPLY_TO_MODE"):
_rtm_str = "off" if _discord_rtm is False else str(_discord_rtm).lower()
os.environ["DISCORD_REPLY_TO_MODE"] = _rtm_str
# Bridge top-level require_mention to Telegram when the telegram: section
# does not already provide one. Users often write "require_mention: true"
# at the top level alongside group_sessions_per_user, expecting it to work
# the same way (#3979).
_tl_require_mention = yaml_cfg.get("require_mention")
if _tl_require_mention is not None:
_tg_section = yaml_cfg.get("telegram") or {}
if "require_mention" not in _tg_section:
_tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
_tg_extra = _tg_plat.setdefault("extra", {})
_tg_extra.setdefault("require_mention", _tl_require_mention)
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
# Prefer telegram.require_mention; fall back to the top-level shorthand.
_effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
@@ -835,6 +902,16 @@ def load_gateway_config() -> GatewayConfig:
os.environ["TELEGRAM_REACTIONS"] = str(telegram_cfg["reactions"]).lower()
if "proxy_url" in telegram_cfg and not os.getenv("TELEGRAM_PROXY"):
os.environ["TELEGRAM_PROXY"] = str(telegram_cfg["proxy_url"]).strip()
# reply_to_mode: top-level preferred, falls back to extra.reply_to_mode
# YAML 1.1 parses bare 'off' as boolean False — coerce to string "off".
_telegram_extra = telegram_cfg.get("extra") if isinstance(telegram_cfg.get("extra"), dict) else {}
_telegram_rtm = (
telegram_cfg["reply_to_mode"] if "reply_to_mode" in telegram_cfg
else _telegram_extra.get("reply_to_mode")
)
if _telegram_rtm is not None and not os.getenv("TELEGRAM_REPLY_TO_MODE"):
_rtm_str = "off" if _telegram_rtm is False else str(_telegram_rtm).lower()
os.environ["TELEGRAM_REPLY_TO_MODE"] = _rtm_str
allowed_users = telegram_cfg.get("allow_from")
if allowed_users is not None and not os.getenv("TELEGRAM_ALLOWED_USERS"):
if isinstance(allowed_users, list):
@@ -1046,6 +1123,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.TELEGRAM,
chat_id=telegram_home,
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
)
# Discord
@@ -1062,6 +1140,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DISCORD,
chat_id=discord_home,
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
)
# Reply threading mode for Discord (off/first/all)
@@ -1083,6 +1162,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
)
# Slack
@@ -1110,6 +1190,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
)
# Signal
@@ -1130,6 +1211,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
)
# Mattermost
@@ -1149,6 +1231,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
)
# Matrix
@@ -1180,6 +1263,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
)
# Home Assistant
@@ -1213,6 +1297,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
)
# SMS (Twilio)
@@ -1228,6 +1313,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
)
# API Server
@@ -1290,6 +1376,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
)
# Feishu / Lark
@@ -1317,6 +1404,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom (Enterprise WeChat)
@@ -1339,6 +1427,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom callback mode (self-built apps)
@@ -1397,6 +1486,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WEIXIN,
chat_id=weixin_home,
name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
)
# BlueBubbles (iMessage)
@@ -1420,6 +1510,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.BLUEBUBBLES,
chat_id=bluebubbles_home,
name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
)
# QQ (Official Bot API v2)
@@ -1457,6 +1548,11 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
thread_id=(
os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
or None
),
)
# Yuanbao — YUANBAO_APP_ID preferred
@@ -1487,6 +1583,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.YUANBAO,
chat_id=yuanbao_home,
name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
)
yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
if yuanbao_dm_policy:
+9 -7
View File
@@ -53,9 +53,10 @@ class DeliveryTarget:
- "telegram" Telegram home channel
- "telegram:123456" specific Telegram chat
"""
target = target.strip().lower()
target_stripped = target.strip()
target_lower = target_stripped.lower()
if target == "origin":
if target_lower == "origin":
if origin:
return cls(
platform=origin.platform,
@@ -67,13 +68,14 @@ class DeliveryTarget:
# Fallback to local if no origin
return cls(platform=Platform.LOCAL, is_origin=True)
if target == "local":
if target_lower == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id or platform:chat_id:thread_id format
if ":" in target:
parts = target.split(":", 2)
platform_str = parts[0]
# Use the original case for chat_id/thread_id to preserve case-sensitive IDs
if ":" in target_stripped:
parts = target_stripped.split(":", 2)
platform_str = parts[0].lower() # Platform names are case-insensitive
chat_id = parts[1] if len(parts) > 1 else None
thread_id = parts[2] if len(parts) > 2 else None
try:
@@ -85,7 +87,7 @@ class DeliveryTarget:
# Just a platform name (use home channel)
try:
platform = Platform(target)
platform = Platform(target_lower)
return cls(platform=platform)
except ValueError:
# Unknown platform, treat as local
+84
View File
@@ -0,0 +1,84 @@
"""Shared HTTP client factory for long-lived platform adapters.
Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
alive for the adapter's lifetime. That amortises TLS/connection setup
across many API calls, but it also means the process's file-descriptor
pressure is sensitive to how aggressively the pool recycles idle keep-
alive connections.
httpx's default ``keepalive_expiry`` is 5 seconds. On macOS behind
Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
sit in ``CLOSE_WAIT`` longer than that before the local socket actually
drains which, multiplied across 7 long-lived adapters plus the LLM
client and MCP clients, walks straight into the default 256 fd limit.
See #18451.
``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
adapter factories use instead of the httpx default. The values chosen:
* ``max_keepalive_connections=10`` plenty for any single adapter;
platform APIs rarely parallelise beyond this.
* ``keepalive_expiry=2.0`` close idle sockets aggressively so a
proxy's lingering CLOSE_WAIT window can't starve the process.
Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
"""
from __future__ import annotations
import os
try:
import httpx
except ImportError: # pragma: no cover — optional dep
httpx = None # type: ignore[assignment]
_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
_DEFAULT_MAX_KEEPALIVE = 10
def platform_httpx_limits() -> "httpx.Limits | None":
"""Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
Returns ``None`` when httpx isn't importable, so callers can fall
back to httpx's built-in default without a hard dependency on this
helper being reachable.
"""
if httpx is None:
return None
def _env_float(name: str, default: float) -> float:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = float(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
def _env_int(name: str, default: int) -> int:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = int(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
keepalive_expiry = _env_float(
"HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
)
max_keepalive = _env_int(
"HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
)
return httpx.Limits(
max_keepalive_connections=max_keepalive,
# Leave max_connections at httpx default (100) — plenty of headroom.
keepalive_expiry=keepalive_expiry,
)
+300 -31
View File
@@ -2,8 +2,8 @@
OpenAI-compatible API server platform adapter.
Exposes an HTTP server with endpoints:
- POST /v1/chat/completions OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header)
- POST /v1/responses OpenAI Responses API format (stateful via previous_response_id)
- POST /v1/chat/completions OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header; opt-in long-term memory scoping via X-Hermes-Session-Key header)
- POST /v1/responses OpenAI Responses API format (stateful via previous_response_id; X-Hermes-Session-Key supported)
- GET /v1/responses/{response_id} Retrieve a stored response
- DELETE /v1/responses/{response_id} Delete a stored response
- GET /v1/models lists hermes-agent as an available model
@@ -56,12 +56,20 @@ logger = logging.getLogger(__name__)
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 8642
MAX_STORED_RESPONSES = 100
MAX_REQUEST_BYTES = 1_000_000 # 1 MB default limit for POST bodies
MAX_REQUEST_BYTES = 10_000_000 # 10 MB — accommodates long agent conversations with tool calls
CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS = 30.0
MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
"""Parse a listen port without letting malformed env/config values crash startup."""
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
@@ -573,7 +581,10 @@ class APIServerAdapter(BasePlatformAdapter):
super().__init__(config, Platform.API_SERVER)
extra = config.extra or {}
self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
raw_port = extra.get("port")
if raw_port is None:
raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -687,6 +698,71 @@ class APIServerAdapter(BasePlatformAdapter):
status=401,
)
# ------------------------------------------------------------------
# Session header helpers
# ------------------------------------------------------------------
# Soft length cap for session identifiers. Headers are bounded in
# aggregate by aiohttp (``client_max_size`` / default 8 KiB per
# header), but we impose a tighter limit on the session headers so a
# caller can't burn memory by passing a multi-kilobyte "session key".
# 256 chars is well above any realistic stable channel identifier
# (e.g. ``agent:main:webui:dm:user-42``) while staying small enough
# that the sanitized form is safe to pass into Honcho / state.db.
_MAX_SESSION_HEADER_LEN = 256
def _parse_session_key_header(
self, request: "web.Request"
) -> tuple[Optional[str], Optional["web.Response"]]:
"""Extract and validate the ``X-Hermes-Session-Key`` header.
The session key is a stable per-channel identifier that scopes
long-term memory (e.g. Honcho sessions) across transcripts. It
is independent of ``X-Hermes-Session-Id``: callers may send
either, both, or neither.
Returns ``(session_key, None)`` on success (with an empty/absent
header yielding ``None`` for the key), or ``(None, error_response)``
on validation failure.
Security: like session continuation, accepting a caller-supplied
memory scope requires API-key authentication so that an
unauthenticated client on a local-only server can't inject itself
into another user's long-term memory scope by guessing a key.
"""
raw = request.headers.get("X-Hermes-Session-Key", "").strip()
if not raw:
return None, None
if not self._api_key:
logger.warning(
"X-Hermes-Session-Key rejected: no API key configured. "
"Set API_SERVER_KEY to enable long-term memory scoping."
)
return None, web.json_response(
_openai_error(
"X-Hermes-Session-Key requires API key authentication. "
"Configure API_SERVER_KEY to enable this feature."
),
status=403,
)
# Reject control characters that could enable header injection on
# the echo path.
if re.search(r'[\r\n\x00]', raw):
return None, web.json_response(
{"error": {"message": "Invalid session key", "type": "invalid_request_error"}},
status=400,
)
if len(raw) > self._MAX_SESSION_HEADER_LEN:
return None, web.json_response(
{"error": {"message": "Session key too long", "type": "invalid_request_error"}},
status=400,
)
return raw, None
# ------------------------------------------------------------------
# Session DB helper
# ------------------------------------------------------------------
@@ -717,6 +793,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_progress_callback=None,
tool_start_callback=None,
tool_complete_callback=None,
gateway_session_key: Optional[str] = None,
) -> Any:
"""
Create an AIAgent instance using the gateway's runtime config.
@@ -725,12 +802,20 @@ class APIServerAdapter(BasePlatformAdapter):
base_url, etc. from config.yaml / env vars. Toolsets are resolved
from config.yaml platform_toolsets.api_server (same as all other
gateway platforms), falling back to the hermes-api-server default.
``gateway_session_key`` is a stable per-channel identifier supplied
by the client (via ``X-Hermes-Session-Key``). Unlike ``session_id``
which scopes the short-term transcript and rotates on /new, this
key is meant to persist across transcripts so long-term memory
providers (e.g. Honcho) can scope their per-chat state correctly
matching the semantics of the native gateway's ``session_key``.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
reasoning_config = GatewayRunner._load_reasoning_config()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
@@ -740,7 +825,6 @@ class APIServerAdapter(BasePlatformAdapter):
# Load fallback provider chain so the API server platform has the
# same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
from gateway.run import GatewayRunner
fallback_model = GatewayRunner._load_fallback_model()
agent = AIAgent(
@@ -759,6 +843,8 @@ class APIServerAdapter(BasePlatformAdapter):
tool_complete_callback=tool_complete_callback,
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
reasoning_config=reasoning_config,
gateway_session_key=gateway_session_key,
)
return agent
@@ -842,6 +928,7 @@ class APIServerAdapter(BasePlatformAdapter):
"run_stop": True,
"tool_progress_events": True,
"session_continuity_header": "X-Hermes-Session-Id",
"session_key_header": "X-Hermes-Session-Key",
"cors": bool(self._cors_origins),
},
"endpoints": {
@@ -913,6 +1000,15 @@ class APIServerAdapter(BasePlatformAdapter):
status=400,
)
# Allow caller to scope long-term memory (e.g. Honcho) with a
# stable per-channel identifier via X-Hermes-Session-Key. This
# is independent of X-Hermes-Session-Id: the key persists across
# transcripts while the id rotates when the caller starts a new
# transcript (i.e. /new semantics). See _parse_session_key_header.
gateway_session_key, key_err = self._parse_session_key_header(request)
if key_err is not None:
return key_err
# Allow caller to continue an existing session by passing X-Hermes-Session-Id.
# When provided, history is loaded from state.db instead of from the request body.
#
@@ -1047,11 +1143,13 @@ class APIServerAdapter(BasePlatformAdapter):
tool_start_callback=_on_tool_start,
tool_complete_callback=_on_tool_complete,
agent_ref=agent_ref,
gateway_session_key=gateway_session_key,
))
return await self._write_sse_chat_completion(
request, completion_id, model_name, created, _stream_q,
agent_task, agent_ref, session_id=session_id,
gateway_session_key=gateway_session_key,
)
# Non-streaming: run the agent (with optional Idempotency-Key)
@@ -1061,6 +1159,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation_history=history,
ephemeral_system_prompt=system_prompt,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
idempotency_key = request.headers.get("Idempotency-Key")
@@ -1110,11 +1209,17 @@ class APIServerAdapter(BasePlatformAdapter):
},
}
return web.json_response(response_data, headers={"X-Hermes-Session-Id": session_id})
response_headers = {
"X-Hermes-Session-Id": result.get("session_id", session_id),
}
if gateway_session_key:
response_headers["X-Hermes-Session-Key"] = gateway_session_key
return web.json_response(response_data, headers=response_headers)
async def _write_sse_chat_completion(
self, request: "web.Request", completion_id: str, model: str,
created: int, stream_q, agent_task, agent_ref=None, session_id: str = None,
gateway_session_key: str = None,
) -> "web.StreamResponse":
"""Write real streaming SSE from agent's stream_delta_callback queue.
@@ -1137,6 +1242,8 @@ class APIServerAdapter(BasePlatformAdapter):
sse_headers.update(cors)
if session_id:
sse_headers["X-Hermes-Session-Id"] = session_id
if gateway_session_key:
sse_headers["X-Hermes-Session-Key"] = gateway_session_key
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
@@ -1242,6 +1349,22 @@ class APIServerAdapter(BasePlatformAdapter):
except (asyncio.CancelledError, Exception):
pass
logger.info("SSE client disconnected; interrupted agent task %s", completion_id)
except Exception as _exc:
# Agent crashed mid-stream. Try to emit an error chunk
# so the client gets a proper response instead of a
# TransferEncodingError from incomplete chunked encoding.
import traceback as _tb
logger.error("Agent crashed mid-stream for %s: %s", completion_id, _tb.format_exc()[:300])
try:
error_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {}, "finish_reason": "error"}],
}
await response.write(f"data: {json.dumps(error_chunk)}\n\n".encode())
await response.write(b"data: [DONE]\n\n")
except Exception:
pass
return response
@@ -1260,6 +1383,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation: Optional[str],
store: bool,
session_id: str,
gateway_session_key: Optional[str] = None,
) -> "web.StreamResponse":
"""Write an SSE stream for POST /v1/responses (OpenAI Responses API).
@@ -1302,6 +1426,8 @@ class APIServerAdapter(BasePlatformAdapter):
sse_headers.update(cors)
if session_id:
sse_headers["X-Hermes-Session-Id"] = session_id
if gateway_session_key:
sse_headers["X-Hermes-Session-Key"] = gateway_session_key
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
@@ -1559,20 +1685,54 @@ class APIServerAdapter(BasePlatformAdapter):
async def _dispatch(it) -> None:
"""Route a queue item to the correct SSE emitter.
Plain strings are text deltas. Tagged tuples with
``__tool_started__`` / ``__tool_completed__`` prefixes
are tool lifecycle events.
Plain strings are text deltas they are batched (50ms)
to reduce Open WebUI re-render storms. Tagged tuples
with ``__tool_started__`` / ``__tool_completed__``
prefixes are tool lifecycle events and flush the buffer
before emitting.
"""
nonlocal _batch_timer
if isinstance(it, tuple) and len(it) == 2 and isinstance(it[0], str):
tag, payload = it
# Flush batched text before tool events
if _batch_buf:
await _flush_batch()
if tag == "__tool_started__":
await _emit_tool_started(payload)
elif tag == "__tool_completed__":
await _emit_tool_completed(payload)
# Unknown tags are silently ignored (forward-compat).
elif isinstance(it, str):
await _emit_text_delta(it)
# Other types (non-string, non-tuple) are silently dropped.
# Batch text deltas — append to buffer, flush on timer
_batch_buf.append(it)
if _batch_timer is None:
_batch_timer = asyncio.create_task(_batch_flush_after(0.05))
# Other types are silently dropped.
# ── Batching state ──
_batch_buf: List[str] = []
_batch_timer: Optional[asyncio.Task] = None
_batch_lock = asyncio.Lock()
async def _batch_flush_after(delay: float) -> None:
"""Wait delay seconds, then flush accumulated text deltas."""
try:
await asyncio.sleep(delay)
except asyncio.CancelledError:
return
# Clear timer reference BEFORE flush so new deltas
# can start a fresh timer while we emit
nonlocal _batch_buf, _batch_timer
_batch_timer = None
await _flush_batch()
async def _flush_batch() -> None:
"""Emit a single SSE delta for all accumulated text."""
nonlocal _batch_buf
async with _batch_lock:
if _batch_buf:
combined = "".join(_batch_buf)
_batch_buf = []
await _emit_text_delta(combined)
loop = asyncio.get_running_loop()
while True:
@@ -1597,11 +1757,21 @@ class APIServerAdapter(BasePlatformAdapter):
continue
if item is None: # EOS sentinel
# Cancel pending timer and flush remaining batched text
if _batch_timer and not _batch_timer.done():
_batch_timer.cancel()
_batch_timer = None
if _batch_buf:
await _flush_batch()
break
await _dispatch(item)
last_activity = time.monotonic()
# Flush any final batched text before processing result
if _batch_buf:
await _flush_batch()
# Pick up agent result + usage from the completed task
try:
result, agent_usage = await agent_task
@@ -1652,6 +1822,31 @@ class APIServerAdapter(BasePlatformAdapter):
# payload still see the assistant text. This mirrors the
# shape produced by _extract_output_items in the batch path.
final_items: List[Dict[str, Any]] = list(emitted_items)
# Trim large content from tool call arguments to keep the
# response.completed event under ~100KB. Clients already
# received full details via incremental events.
for _item in final_items:
if _item.get("type") == "function_call":
try:
_args = json.loads(_item.get("arguments", "{}")) if isinstance(_item.get("arguments"), str) else _item.get("arguments", {})
if isinstance(_args, dict):
for _k in ("content", "query", "pattern", "old_string", "new_string"):
if isinstance(_args.get(_k), str) and len(_args[_k]) > 500:
_args[_k] = "[" + str(len(_args[_k])) + " chars — truncated for response.completed]"
_item["arguments"] = json.dumps(_args)
except Exception:
pass
elif _item.get("type") == "function_call_output":
_output = _item.get("output", [])
if isinstance(_output, list) and _output:
_first = _output[0]
if isinstance(_first, dict) and _first.get("type") == "input_text":
_text = _first.get("text", "")
if len(_text) > 1000:
_first["text"] = _text[:500] + "...[" + str(len(_text) - 500) + " more chars]"
_item["output"] = [_first]
final_items.append({
"type": "message",
"role": "assistant",
@@ -1742,6 +1937,30 @@ class APIServerAdapter(BasePlatformAdapter):
agent_task.cancel()
logger.info("SSE task cancelled; persisted incomplete snapshot for %s", response_id)
raise
except Exception as _exc:
# Agent crashed with an unhandled error (e.g. model API error like
# BadRequestError, AuthenticationError). Emit a response.failed
# event and properly terminate the SSE stream so the client doesn't
# get a TransferEncodingError from incomplete chunked encoding.
import traceback as _tb
_persist_incomplete_if_needed()
agent_error = _tb.format_exc()
try:
failed_env = _envelope("failed")
failed_env["output"] = list(emitted_items)
failed_env["error"] = {"message": str(_exc)[:500], "type": "server_error"}
failed_env["usage"] = {
"input_tokens": usage.get("input_tokens", 0),
"output_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
}
await _write_event("response.failed", {
"type": "response.failed",
"response": failed_env,
})
except Exception:
pass
logger.error("Agent crashed mid-stream for %s: %s", response_id, str(agent_error)[:300])
return response
@@ -1751,6 +1970,11 @@ class APIServerAdapter(BasePlatformAdapter):
if auth_err:
return auth_err
# Long-term memory scope header (see chat_completions for details).
gateway_session_key, key_err = self._parse_session_key_header(request)
if key_err is not None:
return key_err
# Parse request body
try:
body = await request.json()
@@ -1902,6 +2126,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_start_callback=_on_tool_start,
tool_complete_callback=_on_tool_complete,
agent_ref=agent_ref,
gateway_session_key=gateway_session_key,
))
response_id = f"resp_{uuid.uuid4().hex[:28]}"
@@ -1922,6 +2147,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation=conversation,
store=store,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
async def _compute_response():
@@ -1930,6 +2156,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation_history=conversation_history,
ephemeral_system_prompt=instructions,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
idempotency_key = request.headers.get("Idempotency-Key")
@@ -2004,7 +2231,10 @@ class APIServerAdapter(BasePlatformAdapter):
if conversation:
self._response_store.set_conversation(conversation, response_id)
return web.json_response(response_data)
response_headers = {"X-Hermes-Session-Id": session_id}
if gateway_session_key:
response_headers["X-Hermes-Session-Key"] = gateway_session_key
return web.json_response(response_data, headers=response_headers)
# ------------------------------------------------------------------
# GET / DELETE response endpoints
@@ -2326,6 +2556,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_start_callback=None,
tool_complete_callback=None,
agent_ref: Optional[list] = None,
gateway_session_key: Optional[str] = None,
) -> tuple:
"""
Create an agent and run a conversation in a thread executor.
@@ -2348,6 +2579,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_progress_callback=tool_progress_callback,
tool_start_callback=tool_start_callback,
tool_complete_callback=tool_complete_callback,
gateway_session_key=gateway_session_key,
)
if agent_ref is not None:
agent_ref[0] = agent
@@ -2362,6 +2594,12 @@ class APIServerAdapter(BasePlatformAdapter):
"output_tokens": getattr(agent, "session_completion_tokens", 0) or 0,
"total_tokens": getattr(agent, "session_total_tokens", 0) or 0,
}
# Include the effective session ID in the result so callers
# (e.g. X-Hermes-Session-Id header) can track compression-
# triggered session rotations. (#16938)
_eff_sid = getattr(agent, "session_id", session_id)
if isinstance(_eff_sid, str) and _eff_sid:
result["session_id"] = _eff_sid
return result, usage
return await loop.run_in_executor(None, _run)
@@ -2441,6 +2679,11 @@ class APIServerAdapter(BasePlatformAdapter):
if auth_err:
return auth_err
# Long-term memory scope header (see chat_completions for details).
gateway_session_key, key_err = self._parse_session_key_header(request)
if key_err is not None:
return key_err
# Enforce concurrency limit
if len(self._run_streams) >= self._MAX_CONCURRENT_RUNS:
return web.json_response(
@@ -2549,6 +2792,7 @@ class APIServerAdapter(BasePlatformAdapter):
session_id=session_id,
stream_delta_callback=_text_cb,
tool_progress_callback=event_cb,
gateway_session_key=gateway_session_key,
)
self._active_run_agents[run_id] = agent
def _run_sync():
@@ -2566,21 +2810,39 @@ class APIServerAdapter(BasePlatformAdapter):
return r, u
result, usage = await asyncio.get_running_loop().run_in_executor(None, _run_sync)
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
# Check for structured failure (non-retryable client errors like
# 401/400 return failed=True instead of raising, so the except
# block below never fires — issue #15561).
if isinstance(result, dict) and result.get("failed"):
error_msg = result.get("error") or "agent run failed"
q.put_nowait({
"event": "run.failed",
"run_id": run_id,
"timestamp": time.time(),
"error": error_msg,
})
self._set_run_status(
run_id,
"failed",
error=error_msg,
last_event="run.failed",
)
else:
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
except asyncio.CancelledError:
self._set_run_status(
run_id,
@@ -2631,7 +2893,14 @@ class APIServerAdapter(BasePlatformAdapter):
if hasattr(task, "add_done_callback"):
task.add_done_callback(self._background_tasks.discard)
return web.json_response({"run_id": run_id, "status": "started"}, status=202)
response_headers = (
{"X-Hermes-Session-Key": gateway_session_key} if gateway_session_key else {}
)
return web.json_response(
{"run_id": run_id, "status": "started"},
status=202,
headers=response_headers,
)
async def _handle_get_run(self, request: "web.Request") -> "web.Response":
"""GET /v1/runs/{run_id} — return pollable run status for external UIs."""
@@ -2775,7 +3044,7 @@ class APIServerAdapter(BasePlatformAdapter):
try:
mws = [mw for mw in (cors_middleware, body_limit_middleware, security_headers_middleware) if mw is not None]
self._app = web.Application(middlewares=mws)
self._app = web.Application(middlewares=mws, client_max_size=MAX_REQUEST_BYTES)
self._app["api_server_adapter"] = self
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get("/health/detailed", self._handle_health_detailed)
+69 -12
View File
@@ -1593,6 +1593,26 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_private_notice(
self,
chat_id: str,
user_id: Optional[str],
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notice privately when the platform supports it.
The default implementation falls back to a normal send so callers can
use one code path across platforms.
"""
return await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""
Send a typing indicator.
@@ -2469,19 +2489,30 @@ class BasePlatformAdapter(ABC):
try:
response = await self._message_handler(event)
# Old adapter task (if any) is cancelled AFTER the runner has
# fully handled the command — keeps ordering deterministic.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
_text, _eph_ttl = self._unwrap_ephemeral(response)
# Send the response BEFORE cancelling the old task so the send
# cannot be affected by task-cancellation side effects (race
# condition fix — issue #18912). Previously the send happened
# after cancel_session_processing, which could silently drop the
# "/new" confirmation when an agent was actively running.
if _text:
logger.info(
"[%s] Sending command '/%s' response (%d chars) to %s",
self.name,
cmd,
len(_text),
event.source.chat_id,
)
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=event.message_id,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2490,6 +2521,13 @@ class BasePlatformAdapter(ABC):
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
# Old adapter task (if any) is cancelled AFTER the response has
# been sent — keeps ordering deterministic and avoids the race.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
@@ -2574,7 +2612,13 @@ class BasePlatformAdapter(ABC):
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=event.message_id,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2631,10 +2675,18 @@ class BasePlatformAdapter(ABC):
mode = os.getenv("HERMES_HUMAN_DELAY_MODE", "off").lower()
if mode == "off":
return 0.0
min_ms = int(os.getenv("HERMES_HUMAN_DELAY_MIN_MS", "800"))
max_ms = int(os.getenv("HERMES_HUMAN_DELAY_MAX_MS", "2500"))
if mode == "natural":
min_ms, max_ms = 800, 2500
return random.uniform(min_ms / 1000.0, max_ms / 1000.0)
# custom mode — tolerate malformed env vars instead of crashing.
try:
min_ms = int(os.getenv("HERMES_HUMAN_DELAY_MIN_MS", "800"))
except (TypeError, ValueError):
min_ms = 800
try:
max_ms = int(os.getenv("HERMES_HUMAN_DELAY_MAX_MS", "2500"))
except (TypeError, ValueError):
max_ms = 2500
return random.uniform(min_ms / 1000.0, max_ms / 1000.0)
async def _process_message_background(self, event: MessageEvent, session_key: str) -> None:
@@ -2778,10 +2830,15 @@ class BasePlatformAdapter(ABC):
# Send the text portion
if text_content:
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
_reply_anchor = (
event.reply_to_message_id
if event.source.platform == Platform.FEISHU and event.source.thread_id and event.reply_to_message_id
else event.message_id
)
result = await self._send_with_retry(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id,
reply_to=_reply_anchor,
metadata=_thread_metadata,
)
_record_delivery(result)
+3 -1
View File
@@ -162,7 +162,9 @@ class BlueBubblesAdapter(BasePlatformAdapter):
return False
from aiohttp import web
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
await self._api_get("/api/v1/ping")
info = await self._api_get("/api/v1/server/info")
+5 -1
View File
@@ -228,7 +228,11 @@ class DingTalkAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, limits=platform_httpx_limits(),
)
credential = dingtalk_stream.Credential(
self._client_id, self._client_secret
+531 -50
View File
@@ -497,6 +497,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
self._allowed_role_ids: set = set() # For DISCORD_ALLOWED_ROLES filtering
self.gateway_runner = None # Set by gateway/run.py for cross-platform delivery
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
self._voice_locks: Dict[int, asyncio.Lock] = {} # guild_id -> serialize join/leave
@@ -613,6 +614,21 @@ class DiscordAdapter(BasePlatformAdapter):
# so LLM output or echoed user content can't ping the whole
# server; override per DISCORD_ALLOW_MENTION_* env vars or the
# discord.allow_mentions.* block in config.yaml.
# Close any existing client to prevent zombie websocket connections
# on reconnect (see #18187). Without this, the old client remains
# connected to Discord gateway and both fire on_message, causing
# double responses.
if self._client is not None:
try:
if not self._client.is_closed():
await self._client.close()
except Exception:
logger.debug("[%s] Failed to close previous Discord client", self.name)
finally:
self._client = None
self._ready_event.clear()
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
@@ -704,11 +720,22 @@ class DiscordAdapter(BasePlatformAdapter):
return
# If humans are mentioned but we're not → not for us
# (preserves old DISCORD_IGNORE_NO_MENTION=true behavior)
# EXCEPT in free-response channels where the bot should
# answer regardless of who is mentioned.
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
return
_channel_id = str(message.channel.id)
_parent_id = None
if hasattr(message.channel, "parent_id") and message.channel.parent_id:
_parent_id = str(message.channel.parent_id)
_free_channels = adapter_self._discord_free_response_channels()
_channel_ids = {_channel_id}
if _parent_id:
_channel_ids.add(_parent_id)
if "*" not in _free_channels and not (_channel_ids & _free_channels):
return
await self._handle_message(message)
@@ -1914,6 +1941,225 @@ class DiscordAdapter(BasePlatformAdapter):
return True
return False
# ── Slash command authorization ─────────────────────────────────────
# Slash commands (``_run_simple_slash`` and ``_handle_thread_create_slash``)
# are a separate Discord interaction surface from regular messages and
# historically ran with NO authorization check — bypassing every gate
# ``on_message`` enforces (DISCORD_ALLOWED_USERS, DISCORD_ALLOWED_ROLES,
# DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS). Any guild member
# could invoke ``/background``, ``/restart``, ``/sethome``, etc. as the
# operator. ``_check_slash_authorization`` mirrors the on_message gates
# one-for-one so the slash surface honors the same trust boundary.
#
# By design, this is a no-op for deployments with no allowlist env vars
# set — ``_is_allowed_user`` returns True and the channel checks early-out
# — preserving the existing "single-tenant, all guild members trusted"
# default. Deployments that DO set any DISCORD_ALLOWED_* var get slash
# parity with on_message.
def _evaluate_slash_authorization(
self, interaction: "discord.Interaction",
) -> Tuple[bool, Optional[str]]:
"""Evaluate slash authorization without producing any response.
Returns ``(allowed, reason)``. ``reason`` is populated only when
``allowed`` is False. This is the shared core used by both the
responding wrapper (``_check_slash_authorization``) and side-effect-
free callers like the ``/skill`` autocomplete callback, which must
return an empty list for unauthorized users instead of leaking an
ephemeral rejection per-keystroke.
Fail-closed semantics for malformed payloads: when an allowlist is
configured but the interaction is missing the data needed to
evaluate it (no channel id with channel policy active, no user
with user/role policy active), the gate REJECTS rather than
falling through. Without these guards a guild interaction that
happens to deserialize without a channel id would silently bypass
``DISCORD_ALLOWED_CHANNELS`` and a payload missing ``user`` would
raise ``AttributeError`` in the user check below, surfacing as
an opaque interaction failure rather than a clean rejection.
"""
chan_obj = getattr(interaction, "channel", None)
in_dm = isinstance(chan_obj, discord.DMChannel) if chan_obj is not None else False
# ── Channel scope (mirrors on_message lines 3374-3388) ──
# DMs aren't channel-gated — DMs follow on_message's DM lockdown
# path which has its own user-allowlist enforcement.
if not in_dm:
chan_id_raw = getattr(interaction, "channel_id", None) or getattr(
chan_obj, "id", None,
)
channel_ids: set = set()
if chan_id_raw is not None:
channel_ids.add(str(chan_id_raw))
# Mirror on_message: also test the parent channel for threads
# so per-channel allow/deny lists work consistently.
if isinstance(chan_obj, discord.Thread):
parent_id = self._get_parent_channel_id(chan_obj)
if parent_id:
channel_ids.add(str(parent_id))
allowed_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
if allowed_raw:
allowed = {c.strip() for c in allowed_raw.split(",") if c.strip()}
if "*" not in allowed:
if not channel_ids:
# Channel policy is configured but the interaction
# has no resolvable channel id. Fail closed.
return (
False,
"channel id missing with DISCORD_ALLOWED_CHANNELS configured",
)
if not (channel_ids & allowed):
return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
# Ignored beats allowed: even when a thread's parent channel
# is on the allowlist, an explicit DISCORD_IGNORED_CHANNELS
# entry on the thread or its parent rejects the interaction.
ignored_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
if ignored_raw and channel_ids:
ignored = {c.strip() for c in ignored_raw.split(",") if c.strip()}
if "*" in ignored or (channel_ids & ignored):
return (False, "channel in DISCORD_IGNORED_CHANNELS")
# ── User / role allowlist (mirrors on_message line 681) ──
user = getattr(interaction, "user", None)
allowed_users = getattr(self, "_allowed_user_ids", set()) or set()
allowed_roles = getattr(self, "_allowed_role_ids", set()) or set()
if user is None or getattr(user, "id", None) is None:
# No identifiable user. With any user/role allowlist
# configured, fail closed rather than raise AttributeError
# on ``interaction.user.id`` below. With no allowlist this
# is the existing "no allowlist = everyone" backwards-compat.
if allowed_users or allowed_roles:
return (False, "missing interaction.user with allowlist configured")
return (True, None)
user_id = str(user.id)
if not self._is_allowed_user(user_id, author=user):
return (
False,
"user not in DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES",
)
return (True, None)
async def _check_slash_authorization(
self, interaction: "discord.Interaction", command_text: str,
) -> bool:
"""Mirror on_message's user/role/channel gates onto a slash invocation.
Returns True to proceed. Returns False *after* sending an ephemeral
rejection, logging a warning, and scheduling a cross-platform admin
alert the caller must stop on False (the interaction has already
been responded to).
"""
allowed, reason = self._evaluate_slash_authorization(interaction)
if allowed:
return True
return await self._reject_slash(
interaction, command_text, reason=reason or "unauthorized",
)
async def _reject_slash(
self, interaction: "discord.Interaction", command_text: str, *, reason: str,
) -> bool:
"""Send ephemeral reject + log warning + schedule admin alert. Returns False.
Tolerates a missing ``interaction.user`` -- the fail-closed branch
in ``_evaluate_slash_authorization`` deliberately routes here for
malformed payloads (no user) when an allowlist is configured, and
``str(interaction.user.id)`` would raise AttributeError before the
ephemeral rejection could be sent.
"""
user = getattr(interaction, "user", None)
if user is not None:
user_id = str(getattr(user, "id", "?"))
user_name = getattr(user, "name", "?")
else:
user_id = "?"
user_name = "?"
chan_id = getattr(interaction, "channel_id", None) or getattr(
getattr(interaction, "channel", None), "id", None,
)
guild_id = getattr(interaction, "guild_id", None)
logger.warning(
"[Discord] Unauthorized slash attempt: user=%s id=%s channel=%s "
"guild=%s cmd=%r reason=%r",
user_name, user_id, chan_id, guild_id, command_text, reason,
)
try:
await interaction.response.send_message(
"You're not authorized to use this command.",
ephemeral=True,
)
except Exception as e:
# Interaction may already be responded to (e.g. caller deferred
# before the auth check, or Discord retried). Best-effort only.
logger.debug("[Discord] Could not send unauthorized ephemeral: %s", e)
# Fire-and-forget: don't block the interaction handler on Telegram I/O.
try:
asyncio.create_task(self._notify_unauthorized_slash(
user_name, user_id, chan_id, guild_id, command_text, reason,
))
except Exception as e:
logger.debug("[Discord] Could not schedule admin notify task: %s", e)
return False
async def _notify_unauthorized_slash(
self, user_name: str, user_id: str, chan_id, guild_id,
command_text: str, reason: str,
) -> None:
"""Best-effort cross-platform alert to the gateway operator.
Tries TELEGRAM first (most operators set TELEGRAM_HOME_CHANNEL),
then SLACK. Silently no-ops if no other platform is configured
with a home channel.
A soft send failure -- adapter.send() returning a result with
``success=False`` rather than raising -- continues the fallback
chain. Treating a SendResult(success=False) as delivered would
mean a Telegram outage that the adapter politely surfaces (e.g.
rate-limit, auth failure) silently swallows the alert without
attempting Slack. Hard exceptions still take the same path via
the except branch below.
"""
runner = getattr(self, "gateway_runner", None)
if not runner:
return
for target in (Platform.TELEGRAM, Platform.SLACK):
try:
adapter = runner.adapters.get(target)
if not adapter:
continue
home = runner.config.get_home_channel(target)
if not home or not getattr(home, "chat_id", None):
continue
msg = (
"⚠️ Unauthorized Discord slash attempt\n"
f"User: {user_name} ({user_id})\n"
f"Channel: {chan_id} (guild {guild_id})\n"
f"Command: {command_text}\n"
f"Reason: {reason}"
)
result = await adapter.send(str(home.chat_id), msg)
# Only return on confirmed delivery. SendResult(success=False)
# -> continue to the next platform.
if getattr(result, "success", None) is False:
logger.debug(
"[Discord] Admin notify via %s returned success=False"
" (error=%r); falling through",
target, getattr(result, "error", None),
)
continue
return
except Exception as e:
logger.debug("[Discord] Admin notify via %s failed: %s", target, e)
async def send_image_file(
self,
chat_id: str,
@@ -2301,6 +2547,11 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception:
pass # logging must never block command dispatch
# Auth gate — must run before defer() so an ephemeral rejection can
# be delivered on the still-unresponded interaction.
if not await self._check_slash_authorization(interaction, command_text):
return
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
@@ -2403,9 +2654,14 @@ class DiscordAdapter(BasePlatformAdapter):
await self._run_simple_slash(interaction, "/reload-skills")
@tree.command(name="voice", description="Toggle voice reply mode")
@discord.app_commands.describe(mode="Voice mode: on, off, tts, channel, leave, or status")
@discord.app_commands.describe(mode="Voice mode: join, channel, leave, on, tts, off, or status")
@discord.app_commands.choices(mode=[
discord.app_commands.Choice(name="channel — join your voice channel", value="channel"),
# `join` and `channel` both route to _handle_voice_channel_join in
# gateway/run.py — expose both in the slash UI so autocomplete
# matches what the docs advertise and what the runner accepts when
# the command is typed as plain text.
discord.app_commands.Choice(name="join — join your voice channel", value="join"),
discord.app_commands.Choice(name="channel — join your voice channel (alias)", value="channel"),
discord.app_commands.Choice(name="leave — leave voice channel", value="leave"),
discord.app_commands.Choice(name="on — voice reply to voice messages", value="on"),
discord.app_commands.Choice(name="tts — voice reply to all messages", value="tts"),
@@ -2445,7 +2701,8 @@ class DiscordAdapter(BasePlatformAdapter):
message: str = "",
auto_archive_duration: int = 1440,
):
await interaction.response.defer(ephemeral=True)
# defer() is performed inside the handler *after* the auth gate
# so a rejected invoker can receive an ephemeral rejection.
await self._handle_thread_create_slash(interaction, name, message, auto_archive_duration)
@tree.command(name="queue", description="Queue a prompt for the next turn (doesn't interrupt)")
@@ -2566,6 +2823,54 @@ class DiscordAdapter(BasePlatformAdapter):
# supporting up to 25 categories × 25 skills = 625 skills.
self._register_skill_group(tree)
# Optional defense-in-depth: hide every slash command from non-admin
# guild members in Discord's slash picker. Server-side authorization
# (``_check_slash_authorization``) is the actual gate; this is purely
# UX so users don't see commands they can't invoke. Off by default
# to preserve the slash UX for deployments that intentionally allow
# everyone in the guild.
if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
"true", "1", "yes", "on",
):
self._apply_owner_only_visibility(tree)
def _apply_owner_only_visibility(self, tree) -> None:
"""Set default_member_permissions=0 on every registered slash command.
Discord interprets ``Permissions(0)`` as "requires no permissions",
which paradoxically means the command is hidden from every guild
member except those with the Administrator permission. Server admins
can re-grant per user/role via Server Settings Integrations
<bot> Permissions.
Authoritative gate is ``_check_slash_authorization`` on every
invocation, which catches stale clients, role grants made by
mistake, and direct API calls bypassing Discord's UI hide.
"""
try:
no_perms = discord.Permissions(0)
except Exception as e:
logger.warning(
"[Discord] _apply_owner_only_visibility: cannot build Permissions(0): %s",
e,
)
return
applied = 0
for cmd in tree.get_commands():
try:
cmd.default_permissions = no_perms
applied += 1
except Exception as e:
logger.debug(
"[Discord] Could not set default_permissions on %r: %s",
getattr(cmd, "name", "?"), e,
)
logger.info(
"[Discord] Hid %d slash command(s) from non-admin guild members "
"(opt-in defense in depth via DISCORD_HIDE_SLASH_COMMANDS).",
applied,
)
def _register_skill_group(self, tree) -> None:
"""Register a single ``/skill`` command with autocomplete on the name.
@@ -2584,40 +2889,32 @@ class DiscordAdapter(BasePlatformAdapter):
hidden skills. The slash picker also becomes more discoverable
Discord live-filters by the user's typed prefix against both the
skill name and its description.
The entries list and lookup dict are stored on ``self`` rather
than captured in closure variables so :meth:`refresh_skill_group`
can repopulate them when the user runs ``/reload-skills`` without
needing to touch the Discord slash-command tree or trigger a
``tree.sync()`` call.
"""
try:
from hermes_cli.commands import discord_skill_commands_by_category
existing_names = set()
try:
existing_names = {cmd.name for cmd in tree.get_commands()}
except Exception:
pass
# Reuse the existing collector for consistent filtering
# (per-platform disabled, hub-excluded, name clamping), then
# flatten — the category grouping was only useful for the
# nested layout.
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Populate the instance-level entries/lookup so the
# autocomplete + handler callbacks below always read the
# freshest state. refresh_skill_group() re-runs the same
# collector and mutates these two attributes in place.
self._skill_entries: list[tuple[str, str, str]] = []
self._skill_lookup: dict[str, tuple[str, str]] = {}
self._skill_group_reserved_names: set[str] = set(existing_names)
self._refresh_skill_catalog_state()
if not entries:
if not self._skill_entries:
return
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
# name -> (description, cmd_key) — used by both the autocomplete
# callback and the handler for O(1) dispatch.
skill_lookup: dict[str, tuple[str, str]] = {
n: (d, k) for n, d, k in entries
}
async def _autocomplete_name(
interaction: "discord.Interaction", current: str,
) -> list:
@@ -2627,10 +2924,29 @@ class DiscordAdapter(BasePlatformAdapter):
"/skill pdf" surfaces skills whose description mentions
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
Authorization: a quiet pre-check evaluates the slash
allowlists and returns ``[]`` for unauthorized users so
the installed skill catalog is not leaked to anyone who
can see the command in the picker. Returning a generic
empty list here is intentional sending a per-keystroke
ephemeral rejection would produce a barrage of error
popups during typing.
Reads ``self._skill_entries`` so a ``/reload-skills`` run
since process start shows up on the very next keystroke.
"""
try:
allowed, _reason = self._evaluate_slash_authorization(interaction)
except Exception:
# Defensive: never raise from autocomplete. Fail
# closed by returning an empty suggestion list.
return []
if not allowed:
return []
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in entries:
for name, desc, _key in self._skill_entries:
if not q or q in name.lower() or (desc and q in desc.lower()):
if desc:
label = f"{name}{desc}"
@@ -2654,7 +2970,13 @@ class DiscordAdapter(BasePlatformAdapter):
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
entry = skill_lookup.get(name)
# Authorize BEFORE any skill lookup so that known and
# unknown skill names produce identical rejections for
# unauthorized users (no probing the installed catalog
# via "Unknown skill: <name>" responses).
if not await self._check_slash_authorization(interaction, "/skill"):
return
entry = self._skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
f"Unknown skill: `{name}`. Start typing for "
@@ -2676,16 +2998,74 @@ class DiscordAdapter(BasePlatformAdapter):
logger.info(
"[%s] Registered /skill command with %d skill(s) via autocomplete",
self.name, len(entries),
self.name, len(self._skill_entries),
)
if hidden:
if self._skill_group_hidden_count:
logger.info(
"[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
self.name, hidden,
self.name, self._skill_group_hidden_count,
)
except Exception as exc:
logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)
def _refresh_skill_catalog_state(self) -> None:
"""Re-scan disk for skills and repopulate ``self._skill_entries``.
Called once from :meth:`_register_skill_group` at startup and
again from :meth:`refresh_skill_group` whenever the user runs
``/reload-skills``. No Discord API calls are made autocomplete
and the handler both read from these instance attributes
directly, so an in-place mutation is sufficient.
"""
from hermes_cli.commands import discord_skill_commands_by_category
reserved = getattr(self, "_skill_group_reserved_names", set())
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=set(reserved),
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
self._skill_entries = entries
self._skill_lookup = {n: (d, k) for n, d, k in entries}
self._skill_group_hidden_count = hidden
def refresh_skill_group(self) -> tuple[int, int]:
"""Rescan skills and update the live ``/skill`` autocomplete state.
Invoked by :meth:`gateway.run.GatewayOrchestrator._handle_reload_skills_command`
after :func:`agent.skill_commands.reload_skills` has refreshed
the in-process skill-command registry. Without this call, the
``/skill`` autocomplete dropdown keeps showing the list captured
at process start new skills stay invisible and deleted skills
return an "Unknown skill" error when clicked.
Because autocomplete options are fetched dynamically by Discord,
we only need to mutate the entries/lookup attributes read by the
callbacks no ``tree.sync()`` is required.
Returns ``(new_count, hidden_count)``.
"""
try:
self._refresh_skill_catalog_state()
except Exception as exc:
logger.warning(
"[%s] Failed to refresh /skill autocomplete after reload: %s",
self.name, exc,
)
return (len(getattr(self, "_skill_entries", [])), 0)
logger.info(
"[%s] Refreshed /skill autocomplete: %d skill(s) available (%d filtered)",
self.name,
len(self._skill_entries),
self._skill_group_hidden_count,
)
return (len(self._skill_entries), self._skill_group_hidden_count)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
is_dm = isinstance(interaction.channel, discord.DMChannel)
@@ -2743,6 +3123,9 @@ class DiscordAdapter(BasePlatformAdapter):
auto_archive_duration: int = 1440,
) -> None:
"""Create a Discord thread from a slash command and start a session in it."""
if not await self._check_slash_authorization(interaction, "/thread"):
return
await interaction.response.defer(ephemeral=True)
result = await self._create_thread(
interaction,
name=name,
@@ -2851,8 +3234,15 @@ class DiscordAdapter(BasePlatformAdapter):
raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# YAML parses a bare numeric value such as
# `free_response_channels: 1491973769726791812` as int, which was
# previously falling through the isinstance(str) branch and silently
# returning an empty set. str() here accepts whatever scalar the YAML
# loader hands us without changing existing string/CSV semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
return set()
def _thread_parent_channel(self, channel: Any) -> Any:
@@ -3030,6 +3420,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = ExecApprovalView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3068,6 +3459,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
confirm_id=confirm_id,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3102,6 +3494,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = UpdatePromptView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
return SendResult(success=True, message_id=str(msg.id))
@@ -3159,6 +3552,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
on_model_selected=on_model_selected,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3419,7 +3813,7 @@ class DiscordAdapter(BasePlatformAdapter):
if not is_thread and not isinstance(message.channel, discord.DMChannel):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
@@ -3714,6 +4108,72 @@ class DiscordAdapter(BasePlatformAdapter):
# Discord UI Components (outside the adapter class)
# ---------------------------------------------------------------------------
def _component_check_auth(
interaction,
allowed_user_ids: Optional[set],
allowed_role_ids: Optional[set],
) -> bool:
"""Shared user-or-role OR semantics for component view button clicks.
Mirrors ``DiscordAdapter._is_allowed_user`` / the slash and on_message
gates so every Discord interaction surface honors the same trust
boundary. Component views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) used to receive only
``allowed_user_ids``: in role-only deployments
(DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS empty) the user
set was empty and the legacy "no allowlist = allow everyone" branch
let any guild member click the buttons -- approving exec commands,
cancelling slash confirmations, switching the model.
Behavior:
- both allowlists empty -> allow (preserves existing no-allowlist
deployments, no regression)
- user is in user allowlist -> allow
- role allowlist set + user has a role in it -> allow
- role allowlist set + interaction.user has no resolvable
``roles`` attribute (e.g. DM context with a role policy active)
-> reject (fail closed)
- otherwise -> reject
"""
user_set = allowed_user_ids or set()
role_set = allowed_role_ids or set()
has_users = bool(user_set)
has_roles = bool(role_set)
if not has_users and not has_roles:
return True
user = getattr(interaction, "user", None)
if user is None:
return False
if has_users:
try:
uid = str(user.id)
except AttributeError:
uid = ""
if uid and uid in user_set:
return True
if has_roles:
roles_attr = getattr(user, "roles", None)
if roles_attr is None:
# Role policy is configured but the interaction doesn't
# carry role data (DM-context Member, raw User payload).
# Fail closed: a user without a resolvable role list cannot
# satisfy a role allowlist.
return False
try:
user_role_ids = {getattr(r, "id", None) for r in roles_attr}
except TypeError:
return False
if user_role_ids & role_set:
return True
return False
if DISCORD_AVAILABLE:
class ExecApprovalView(discord.ui.View):
@@ -3726,17 +4186,23 @@ if DISCORD_AVAILABLE:
Only users in the allowed list can click. Times out after 5 minutes.
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300) # 5-minute timeout
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
"""Verify the user clicking is authorized."""
if not self.allowed_user_ids:
return True # No allowlist = anyone can approve
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3828,17 +4294,24 @@ if DISCORD_AVAILABLE:
5 minutes (matches the gateway primitive's timeout).
"""
def __init__(self, session_key: str, confirm_id: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
confirm_id: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.confirm_id = confirm_id
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3916,16 +4389,22 @@ if DISCORD_AVAILABLE:
5-minute timeout on its side).
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _respond(
self, interaction: discord.Interaction, answer: str,
@@ -4002,6 +4481,7 @@ if DISCORD_AVAILABLE:
session_key: str,
on_model_selected,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=120)
self.providers = providers
@@ -4010,15 +4490,16 @@ if DISCORD_AVAILABLE:
self.session_key = session_key
self.on_model_selected = on_model_selected
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
self._selected_provider: str = ""
self._build_provider_select()
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
def _build_provider_select(self):
"""Build the provider dropdown menu."""
+12
View File
@@ -416,6 +416,18 @@ class EmailAdapter(BasePlatformAdapter):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
# Skip senders not in EMAIL_ALLOWED_USERS — prevents the adapter
# from creating a MessageEvent (and thus thread context) for senders
# that the gateway will never authorize. Without this early guard,
# a race between dispatch and authorization can result in the adapter
# sending a reply even though the handler returned None.
allowed_raw = os.getenv("EMAIL_ALLOWED_USERS", "").strip()
if allowed_raw:
allowed = {addr.strip().lower() for addr in allowed_raw.split(",") if addr.strip()}
if sender_addr.lower() not in allowed:
logger.debug("[Email] Dropping non-allowlisted sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
+87 -41
View File
@@ -153,6 +153,9 @@ _MARKDOWN_HINT_RE = re.compile(
r"(^#{1,6}\s)|(^\s*[-*]\s)|(^\s*\d+\.\s)|(^\s*---+\s*$)|(```)|(`[^`\n]+`)|(\*\*[^*\n].+?\*\*)|(~~[^~\n].+?~~)|(<u>.+?</u>)|(\*[^*\n]+\*)|(\[[^\]]+\]\([^)]+\))|(^>\s)",
re.MULTILINE,
)
# Detect markdown tables: a line starting with | followed by a separator line.
# Feishu post-type 'md' elements do not render tables, so we force text mode.
_MARKDOWN_TABLE_RE = re.compile(r"^\|.*\|\n\|[-|: ]+\|", re.MULTILINE)
_MARKDOWN_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
_MARKDOWN_FENCE_OPEN_RE = re.compile(r"^```([^\n`]*)\s*$")
_MARKDOWN_FENCE_CLOSE_RE = re.compile(r"^```\s*$")
@@ -2757,9 +2760,11 @@ class FeishuAdapter(BasePlatformAdapter):
if hint:
text = f"{hint}\n\n{text}" if text else hint
thread_id = getattr(message, "thread_id", None) or getattr(message, "root_id", None) or None
reply_to_message_id = (
getattr(message, "parent_id", None)
or getattr(message, "upper_message_id", None)
or getattr(message, "root_id", None)
or None
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
@@ -2791,7 +2796,7 @@ class FeishuAdapter(BasePlatformAdapter):
chat_type=self._resolve_source_chat_type(chat_info=chat_info, event_chat_type=chat_type),
user_id=sender_profile["user_id"],
user_name=sender_profile["user_name"],
thread_id=getattr(message, "thread_id", None) or None,
thread_id=thread_id,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
@@ -2922,13 +2927,18 @@ class FeishuAdapter(BasePlatformAdapter):
},
)
response.raise_for_status()
# Snapshot Content-Type and body while the client context is
# still active so pooled connections fully release on exit.
# See #18451.
content_type_hdr = str(response.headers.get("Content-Type", ""))
body = response.content
filename = self._derive_remote_filename(
file_url,
content_type=str(response.headers.get("Content-Type", "")),
content_type=content_type_hdr,
default_name=preferred_name,
default_ext=default_ext,
)
cached_path = cache_document_from_bytes(response.content, filename)
cached_path = cache_document_from_bytes(body, filename)
return cached_path, filename
@staticmethod
@@ -3855,47 +3865,50 @@ class FeishuAdapter(BasePlatformAdapter):
and self-sent bot event filtering.
Populates ``_bot_open_id`` and ``_bot_name`` from /open-apis/bot/v3/info
(no extra scopes required beyond the tenant access token). Falls back to
the application info endpoint for ``_bot_name`` only when the first probe
doesn't return it. Each field is hydrated independently — a value already
supplied via env vars (FEISHU_BOT_OPEN_ID / FEISHU_BOT_USER_ID /
FEISHU_BOT_NAME) is preserved and skips its probe.
(no extra scopes required beyond the tenant access token). The probe
always runs when a client is available so stale env vars from app/bot
migrations do not break group @mention gating. Falls back to the
application info endpoint for ``_bot_name`` only when the first probe
doesn't return it. If the probe fails, env-provided values are preserved.
"""
if not self._client:
return
if self._bot_open_id and self._bot_name:
# Everything the self-send filter and precise mention gate need is
# already in place; nothing to probe.
return
# Primary probe: /open-apis/bot/v3/info — returns bot_name + open_id, no
# extra scopes required. This is the same endpoint the onboarding wizard
# uses via probe_bot().
if not self._bot_open_id or not self._bot_name:
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/info")
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if content:
payload = json.loads(content)
parsed = _parse_bot_response(payload) or {}
open_id = (parsed.get("bot_open_id") or "").strip()
bot_name = (parsed.get("bot_name") or "").strip()
if open_id and not self._bot_open_id:
self._bot_open_id = open_id
if bot_name and not self._bot_name:
self._bot_name = bot_name
except Exception:
logger.debug(
"[Feishu] /bot/v3/info probe failed during hydration",
exc_info=True,
)
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/info")
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if content:
payload = json.loads(content)
parsed = _parse_bot_response(payload) or {}
open_id = (parsed.get("bot_open_id") or "").strip()
bot_name = (parsed.get("bot_name") or "").strip()
if open_id:
if self._bot_open_id and self._bot_open_id != open_id:
logger.warning(
"[Feishu] FEISHU_BOT_OPEN_ID is stale; using /bot/v3/info open_id for group @mention gating."
)
self._bot_open_id = open_id
if bot_name:
if self._bot_name and self._bot_name != bot_name:
logger.info(
"[Feishu] FEISHU_BOT_NAME differs from /bot/v3/info; using hydrated bot name for group @mention gating."
)
self._bot_name = bot_name
except Exception:
logger.debug(
"[Feishu] /bot/v3/info probe failed during hydration",
exc_info=True,
)
# Fallback probe for _bot_name only: application info endpoint. Needs
# admin:app.info:readonly or application:application:self_manage scope,
@@ -3940,7 +3953,14 @@ class FeishuAdapter(BasePlatformAdapter):
if isinstance(seen_data, list):
entries: Dict[str, float] = {str(item).strip(): 0.0 for item in seen_data if str(item).strip()}
elif isinstance(seen_data, dict):
entries = {k: float(v) for k, v in seen_data.items() if isinstance(k, str) and k.strip()}
entries = {}
for key, value in seen_data.items():
if not isinstance(key, str) or not key.strip():
continue
try:
entries[key] = float(value)
except (TypeError, ValueError):
continue
else:
return
# Filter out TTL-expired entries (entries saved with ts=0.0 are treated as immortal
@@ -3985,6 +4005,12 @@ class FeishuAdapter(BasePlatformAdapter):
# =========================================================================
def _build_outbound_payload(self, content: str) -> tuple[str, str]:
# Feishu post-type 'md' elements do not render markdown tables; sending
# table content as post causes the message to appear blank on the client.
# Force plain text for anything that looks like a markdown table.
if _MARKDOWN_TABLE_RE.search(content):
text_payload = {"text": content}
return "text", json.dumps(text_payload, ensure_ascii=False)
if _MARKDOWN_HINT_RE.search(content):
return "post", _build_markdown_post_payload(content)
text_payload = {"text": content}
@@ -4063,15 +4089,18 @@ class FeishuAdapter(BasePlatformAdapter):
reply_to: Optional[str],
metadata: Optional[Dict[str, Any]],
) -> Any:
effective_reply_to = reply_to
if not effective_reply_to and metadata and metadata.get("thread_id"):
effective_reply_to = metadata.get("reply_to_message_id")
reply_in_thread = bool((metadata or {}).get("thread_id"))
if reply_to:
if effective_reply_to:
body = self._build_reply_message_body(
content=payload,
msg_type=msg_type,
reply_in_thread=reply_in_thread,
uuid_value=str(uuid.uuid4()),
)
request = self._build_reply_message_request(reply_to, body)
request = self._build_reply_message_request(effective_reply_to, body)
return await asyncio.to_thread(self._client.im.v1.message.reply, request)
body = self._build_create_message_body(
@@ -4080,7 +4109,15 @@ class FeishuAdapter(BasePlatformAdapter):
content=payload,
uuid_value=str(uuid.uuid4()),
)
request = self._build_create_message_request("chat_id", body)
# Detect whether chat_id is a user open_id (DM) or a chat_id (group).
# Feishu API expects receive_id_type="open_id" for user DMs (ou_ prefix)
# and receive_id_type="chat_id" for group chats (oc_ prefix, which IS
# the chat_id format — see https://open.feishu.cn/document/).
if chat_id.startswith("ou_"):
receive_id_type = "open_id"
else:
receive_id_type = "chat_id"
request = self._build_create_message_request(receive_id_type, body)
return await asyncio.to_thread(self._client.im.v1.message.create, request)
@staticmethod
@@ -4222,6 +4259,15 @@ class FeishuAdapter(BasePlatformAdapter):
if active_reply_to and not self._response_succeeded(response):
code = getattr(response, "code", None)
if code in _FEISHU_REPLY_FALLBACK_CODES:
if (metadata or {}).get("thread_id"):
logger.warning(
"[Feishu] Reply to %s failed in thread %s (code %s — message withdrawn/missing); "
"skipping top-level fallback to avoid creating a new topic",
active_reply_to,
(metadata or {}).get("thread_id"),
code,
)
return response
logger.warning(
"[Feishu] Reply to %s failed (code %s — message withdrawn/missing); "
"falling back to new message in chat %s",
+10 -6
View File
@@ -222,33 +222,37 @@ class ThreadParticipationTracker:
def __init__(self, platform_name: str, max_tracked: int = 500):
self._platform = platform_name
self._max_tracked = max_tracked
self._threads: set = self._load()
self._threads: dict[str, None] = {
str(thread_id): None for thread_id in self._load()
}
def _state_path(self) -> Path:
from hermes_constants import get_hermes_home
return get_hermes_home() / f"{self._platform}_threads.json"
def _load(self) -> set:
def _load(self) -> list[str]:
path = self._state_path()
if path.exists():
try:
return set(json.loads(path.read_text(encoding="utf-8")))
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, list):
return [str(thread_id) for thread_id in data]
except Exception:
pass
return set()
return []
def _save(self) -> None:
path = self._state_path()
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
self._threads = {thread_id: None for thread_id in thread_list}
atomic_json_write(path, thread_list, indent=None)
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
if thread_id not in self._threads:
self._threads.add(thread_id)
self._threads[thread_id] = None
self._save()
def __contains__(self, thread_id: str) -> bool:
+1 -1
View File
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
async def _ws_connect(self) -> bool:
"""Establish WebSocket connection and authenticate."""
ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
ws_url = f"{ws_url}/api/websocket"
self._session = aiohttp.ClientSession(
+16 -1
View File
@@ -243,10 +243,14 @@ class QQAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive pool so idle CLOSE_WAIT sockets drain
# faster behind proxies like Cloudflare Warp (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
limits=platform_httpx_limits(),
)
# 1. Get access token
@@ -393,13 +397,24 @@ class QQAdapter(BasePlatformAdapter):
await self._session.close()
self._session = None
self._session = aiohttp.ClientSession()
# Honor WSL proxy env for QQ WebSocket. Hermes upgrades overwrite this
# local patch, so QQ can regress to direct-connect timeouts after update.
self._session = aiohttp.ClientSession(trust_env=True)
ws_proxy = (
os.getenv("WSS_PROXY")
or os.getenv("wss_proxy")
or os.getenv("HTTPS_PROXY")
or os.getenv("https_proxy")
or os.getenv("ALL_PROXY")
or os.getenv("all_proxy")
)
self._ws = await self._session.ws_connect(
gateway_url,
headers={
"User-Agent": build_user_agent(),
},
timeout=CONNECT_TIMEOUT_SECONDS,
proxy=ws_proxy,
)
logger.info("[%s] WebSocket connected to %s", self._log_tag, gateway_url)
+34 -1
View File
@@ -192,6 +192,15 @@ class SignalAdapter(BasePlatformAdapter):
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
# Stored here so the reaction hooks can skip unauthorized senders
# (reactions fire before run.py's auth gate, so without this check
# every inbound DM from any contact gets a 👀 reaction).
# "*" means all users allowed (open mode); empty means no restriction
# recorded at adapter level (run.py still enforces auth separately).
dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
@@ -248,7 +257,9 @@ class SignalAdapter(BasePlatformAdapter):
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
# Health check — verify signal-cli daemon is reachable
try:
@@ -1428,8 +1439,28 @@ class SignalAdapter(BasePlatformAdapter):
return None
return (author, ts)
def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
"""Check if message reactions are enabled for this event.
Two gates:
1. SIGNAL_REACTIONS env var set to false/0/no to disable globally.
2. DM allowlist if SIGNAL_ALLOWED_USERS is set, only react to
messages from senders in that list. This prevents unauthorized
contacts from seeing the 👀 reaction (which fires before run.py's
auth gate and would otherwise reveal that a bot is listening).
"""
if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
return False
if event is not None:
sender = getattr(getattr(event, "source", None), "user_id", None)
if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
return False
return True
async def on_processing_start(self, event: MessageEvent) -> None:
"""React with 👀 when processing begins."""
if not self._reactions_enabled(event):
return
target = self._extract_reaction_target(event)
if target:
await self.send_reaction(event.source.chat_id, "👀", *target)
@@ -1440,6 +1471,8 @@ class SignalAdapter(BasePlatformAdapter):
On CANCELLED we leave the 👀 in place no terminal outcome means
the reaction should keep reflecting "in progress" (matches Telegram).
"""
if not self._reactions_enabled(event):
return
if outcome == ProcessingOutcome.CANCELLED:
return
target = self._extract_reaction_target(event)
+236 -13
View File
@@ -9,6 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import contextvars
import json
import logging
import os
@@ -21,6 +22,7 @@ try:
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
import aiohttp
SLACK_AVAILABLE = True
except ImportError:
SLACK_AVAILABLE = False
@@ -50,6 +52,16 @@ from gateway.platforms.base import (
logger = logging.getLogger(__name__)
# ContextVar carrying the user_id of the slash-command invoker.
# Set in _handle_slash_command, read in send() to match the correct
# stashed response_url when multiple users issue commands on the same
# channel concurrently. ContextVars propagate to child asyncio.Tasks
# (Python 3.7+), so the value set in _handle_slash_command's task is
# visible in _process_message_background's child task.
_slash_user_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
"_slash_user_id", default=None,
)
@dataclass
class _ThreadContextCache:
@@ -310,6 +322,11 @@ class SlackAdapter(BasePlatformAdapter):
# Track active assistant thread status indicators so stop_typing can
# clear them (chat_id → thread_ts).
self._active_status_threads: Dict[str, str] = {}
# Slash-command contexts: stash response_url + user_id so send()
# can route the first reply ephemerally. Keyed by
# (channel_id, user_id) to avoid cross-user collisions.
# Each value: {"response_url": str, "ts": float}
self._slash_command_contexts: Dict[Tuple[str, str], Dict[str, Any]] = {}
def _describe_slack_api_error(self, response: Any, *, file_obj: Optional[Dict[str, Any]] = None) -> Optional[str]:
"""Convert Slack API auth/permission failures into actionable user-facing text."""
@@ -368,6 +385,103 @@ class SlackAdapter(BasePlatformAdapter):
)
return None
# ------------------------------------------------------------------
# Slash-command ephemeral helpers
# ------------------------------------------------------------------
_SLASH_CTX_TTL = 120.0 # seconds — response_url is valid for 30 min;
# we use a much shorter TTL to avoid routing unrelated messages
# as ephemeral if the command handler was slow or dropped.
def _pop_slash_context(
self, chat_id: str,
) -> Optional[Dict[str, Any]]:
"""Return and remove the slash-command context for *chat_id*, if fresh.
Contexts older than ``_SLASH_CTX_TTL`` seconds are silently discarded.
Uses the ``_slash_user_id`` ContextVar (set in ``_handle_slash_command``)
to match the exact ``(channel_id, user_id)`` key. This prevents a
concurrent slash command from a different user on the same channel from
stealing another user's ephemeral context. Falls back to a
channel-only scan when the ContextVar is unset (e.g. send() called
from a non-slash code path should not match anything).
"""
now = time.monotonic()
# Clean up stale entries on every lookup — dict is small.
stale_keys = [
k for k, v in self._slash_command_contexts.items()
if now - v["ts"] > self._SLASH_CTX_TTL
]
for k in stale_keys:
self._slash_command_contexts.pop(k, None)
# Precise match: (channel_id, user_id) from ContextVar.
uid = _slash_user_id.get()
if uid:
return self._slash_command_contexts.pop((chat_id, uid), None)
# Fallback: channel-only scan (only reachable when ContextVar is
# unset, i.e. send() called outside a slash-command async context).
match_key = None
for key in list(self._slash_command_contexts):
if key[0] == chat_id:
match_key = key
break
if match_key is None:
return None
return self._slash_command_contexts.pop(match_key)
async def _send_slash_ephemeral(
self,
ctx: Dict[str, Any],
content: str,
) -> "SendResult":
"""Replace the initial ephemeral ack via ``response_url``.
Slack's ``response_url`` accepts a POST with ``replace_original``
for up to 30 minutes after the slash command was invoked. This
lets us swap the "Running /cmd…" placeholder with the real reply,
and the message stays ephemeral ("Only visible to you").
Falls back to a simple ``True`` SendResult if the POST fails
the user already saw the initial ack, so a delivery failure here
is non-critical.
"""
formatted = self.format_message(content)
# Slack's response_url has the same ~40k char limit as chat_postMessage.
# Truncate to MAX_MESSAGE_LENGTH and use only the first chunk — the
# response_url replaces a single ephemeral ack, so multi-chunk isn't
# possible. Long responses are rare for command replies.
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
text = chunks[0] if chunks else formatted
payload = {
"response_type": "ephemeral",
"replace_original": True,
"text": text,
}
try:
async with aiohttp.ClientSession() as session:
async with session.post(
ctx["response_url"],
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=None)
body = await resp.text()
logger.warning(
"[Slack] response_url POST returned %s: %s",
resp.status,
body[:200],
)
except Exception as e:
logger.warning(
"[Slack] response_url POST failed: %s", e,
)
# Non-fatal — the user saw the initial ack already.
return SendResult(success=True, message_id=None)
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
if not SLACK_AVAILABLE:
@@ -414,6 +528,21 @@ class SlackAdapter(BasePlatformAdapter):
return False
lock_acquired = True
# Close any previous handler before creating a new one so that
# calling connect() a second time (e.g. during a gateway restart or
# in-process reconnect attempt) does not leave a zombie Socket Mode
# connection alive. Both the old and new connections would otherwise
# receive every Slack event and dispatch it twice, producing double
# responses — the same bug that affected DiscordAdapter (#18187).
if self._handler is not None:
try:
await self._handler.close_async()
except Exception:
logger.debug("[%s] Failed to close previous Slack handler", self.name)
finally:
self._handler = None
self._app = None
# First token is the primary — used for AsyncApp / Socket Mode
primary_token = bot_tokens[0]
self._app = AsyncApp(token=primary_token)
@@ -446,12 +575,16 @@ class SlackAdapter(BasePlatformAdapter):
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
# Handle app_mention explicitly. In some Slack app configurations,
# channel mentions arrive only as app_mention events rather than the
# generic message event. Forward them into the normal message
# pipeline so @mentions reliably produce replies.
# NOTE: when Slack fires BOTH message and app_mention for the same
# @mention, they share the same event ts — the dedup in
# _handle_slack_message (MessageDeduplicator) suppresses the second.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
pass
await self._handle_slack_message(event)
# File lifecycle events can arrive around snippet uploads even when
# the actual user message is what we care about. Ack them so Slack
@@ -502,7 +635,11 @@ class SlackAdapter(BasePlatformAdapter):
@self._app.command(_slash_pattern)
async def handle_hermes_command(ack, command):
await ack()
slash = (command.get("command") or "").lstrip("/")
await ack(
response_type="ephemeral",
text=f"Running `/{slash}`…",
)
await self._handle_slash_command(command)
# Register Block Kit action handlers for approval buttons
@@ -574,6 +711,17 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Not connected")
try:
# Check for a pending slash-command context. When the user ran a
# native slash command (e.g. /q, /stop, /model), the initial ack
# already showed an ephemeral "Running /cmd…" message. If we have
# a stashed response_url for this channel, replace that ack with
# the actual command reply ephemerally instead of posting publicly.
slash_ctx = self._pop_slash_context(chat_id)
if slash_ctx:
return await self._send_slash_ephemeral(
slash_ctx, content,
)
# Convert standard markdown → Slack mrkdwn
formatted = self.format_message(content)
@@ -601,6 +749,10 @@ class SlackAdapter(BasePlatformAdapter):
last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)
# Clear Slack Assistant status as soon as the final message is posted.
if thread_ts:
await self.stop_typing(chat_id)
# Track the sent message ts so we can auto-respond to thread
# replies without requiring @mention.
sent_ts = last_result.get("ts") if last_result else None
@@ -624,6 +776,42 @@ class SlackAdapter(BasePlatformAdapter):
logger.error("[Slack] Send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def send_private_notice(
self,
chat_id: str,
user_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a Slack ephemeral message visible only to one user."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not chat_id or not user_id:
return SendResult(success=False, error="chat_id and user_id are required")
try:
formatted = self.format_message(content)
thread_ts = self._resolve_thread_ts(reply_to, metadata)
kwargs = {
"channel": chat_id,
"user": user_id,
"text": formatted,
"mrkdwn": True,
}
if thread_ts:
kwargs["thread_ts"] = thread_ts
result = await self._get_client(chat_id).chat_postEphemeral(**kwargs)
return SendResult(
success=True,
message_id=result.get("message_ts") or result.get("ts"),
raw_response=result,
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Ephemeral send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def edit_message(
self,
chat_id: str,
@@ -642,6 +830,8 @@ class SlackAdapter(BasePlatformAdapter):
ts=message_id,
text=formatted,
)
if finalize:
await self.stop_typing(chat_id)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
@@ -682,7 +872,7 @@ class SlackAdapter(BasePlatformAdapter):
# in an assistant-enabled context. Falls back to reactions.
logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)
async def stop_typing(self, chat_id: str) -> None:
async def stop_typing(self, chat_id: str, metadata=None) -> None:
"""Clear the assistant thread status indicator."""
if not self._app:
return
@@ -969,7 +1159,7 @@ class SlackAdapter(BasePlatformAdapter):
return _ph(f'<{url}|{label}>')
text = re.sub(
r'\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
r'(?<!!)\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
_convert_markdown_link,
text,
)
@@ -1016,9 +1206,11 @@ class SlackAdapter(BasePlatformAdapter):
)
# 10) Convert italic: _text_ stays as _text_ (already Slack italic)
# Single *text* → _text_ (Slack italic)
# Single *text* → _text_ (Slack italic), but only when the
# emphasized text touches non-whitespace on both sides so literal
# delimiters like "a * b * c" are preserved.
text = re.sub(
r'(?<!\*)\*([^*\n]+)\*(?!\*)',
r'(?<!\*)\*(\S(?:[^*\n]*?\S)?)\*(?!\*)',
lambda m: _ph(f'_{m.group(1)}_'),
text,
)
@@ -2524,9 +2716,14 @@ class SlackAdapter(BasePlatformAdapter):
# gateway command dispatcher by prepending the slash.
text = f"/{slash_name} {text}".strip()
# Slack slash commands can originate from DMs or shared channels.
# Preserve DM semantics only for DM channel IDs; shared channels must
# keep group semantics so different users do not collide into one
# session key.
is_dm = str(channel_id).startswith("D")
source = self.build_source(
chat_id=channel_id,
chat_type="dm", # Slash commands are always in DM-like context
chat_type="dm" if is_dm else "group",
user_id=user_id,
)
@@ -2537,7 +2734,26 @@ class SlackAdapter(BasePlatformAdapter):
raw_message=command,
)
await self.handle_message(event)
# Stash the Slack response_url so the first reply for this
# channel+user can be routed ephemerally (replaces the initial
# "Running /cmd…" ack shown by handle_hermes_command).
# Only stash for COMMAND events (text starts with "/") — free-form
# questions via "/hermes <question>" must produce public replies so
# the whole channel can see the agent's answer.
response_url = command.get("response_url", "")
if response_url and user_id and channel_id and text.startswith("/"):
self._slash_command_contexts[(channel_id, user_id)] = {
"response_url": response_url,
"ts": time.monotonic(),
}
# Set the ContextVar so send() can match the correct stashed
# response_url even when multiple users slash concurrently.
_slash_user_id_token = _slash_user_id.set(user_id or None)
try:
await self.handle_message(event)
finally:
_slash_user_id.reset(_slash_user_id_token)
def _has_active_session_for_thread(
self,
@@ -2698,6 +2914,13 @@ class SlackAdapter(BasePlatformAdapter):
raw = os.getenv("SLACK_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# A bare numeric YAML value (`free_response_channels: 1234567890`) is
# loaded as int and was previously falling through the isinstance(str)
# branch to return an empty set. str() here accepts whatever scalar
# the YAML loader hands us without changing existing string/CSV
# semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
return set()
+9 -5
View File
@@ -10,7 +10,7 @@ Shares credentials with the optional telephony skill — same env vars:
Gateway-specific env vars:
- SMS_WEBHOOK_PORT (default 8080)
- SMS_WEBHOOK_HOST (default 0.0.0.0)
- SMS_WEBHOOK_HOST (default 127.0.0.1)
- SMS_WEBHOOK_URL (public URL for Twilio signature validation required)
- SMS_INSECURE_NO_SIGNATURE (true to disable signature validation dev only)
- SMS_ALLOWED_USERS (comma-separated E.164 phone numbers)
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
MAX_SMS_LENGTH = 1600 # ~10 SMS segments
DEFAULT_WEBHOOK_PORT = 8080
DEFAULT_WEBHOOK_HOST = "0.0.0.0"
DEFAULT_WEBHOOK_HOST = "127.0.0.1"
def check_sms_requirements() -> bool:
@@ -91,19 +91,23 @@ class SmsAdapter(BasePlatformAdapter):
from aiohttp import web
if not self._from_number:
logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
msg = "[sms] TWILIO_PHONE_NUMBER not set — cannot send replies"
logger.error(msg)
self._set_fatal_error("sms_missing_phone_number", msg, retryable=False)
return False
insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"
if not self._webhook_url and not insecure_no_sig:
logger.error(
msg = (
"[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
"signature validation. Set it to the public URL configured in your "
"Twilio console (e.g. https://example.com/webhooks/twilio). "
"For local development without validation, set "
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production)."
)
logger.error(msg)
self._set_fatal_error("sms_missing_webhook_url", msg, retryable=False)
return False
if insecure_no_sig and not self._webhook_url:
+153 -22
View File
@@ -353,7 +353,10 @@ class TelegramAdapter(BasePlatformAdapter):
@classmethod
def _message_thread_id_for_typing(cls, thread_id: Optional[str]) -> Optional[int]:
if not thread_id:
# Mirrors _message_thread_id_for_send: the General forum topic (thread id
# "1") is represented as "no thread id" on the wire. User-created topics
# keep their real id so typing stays scoped to that topic.
if not thread_id or str(thread_id) == cls._GENERAL_TOPIC_THREAD_ID:
return None
return int(thread_id)
@@ -512,6 +515,17 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, attempt,
)
self._polling_network_error_count = 0
# start_polling() returning is necessary but not sufficient:
# PTB's Updater can be left in a state where `running` is True
# but the underlying long-poll task is wedged on a stale httpx
# connection and never makes progress. No error_callback fires
# in that state, so the reconnect ladder won't advance on its
# own. Schedule a deferred probe to detect the wedge and
# re-enter the ladder if needed.
if not self.has_fatal_error:
probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
self._background_tasks.add(probe)
probe.add_done_callback(self._background_tasks.discard)
except Exception as retry_err:
logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
# start_polling failed — polling is dead and no further error
@@ -523,6 +537,50 @@ class TelegramAdapter(BasePlatformAdapter):
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
async def _verify_polling_after_reconnect(self) -> None:
"""Heartbeat probe scheduled after a successful reconnect.
PTB's Updater can survive a botched stop()+start_polling() cycle
with `running=True` but a wedged consumer task. No error callback
fires, so the reconnect ladder doesn't advance on its own. This
probe detects the wedge by:
1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
to complete at least one cycle.
2. Verifying `Updater.running` is still True.
3. Probing the bot endpoint with a tight asyncio timeout. A
wedged httpx pool fails this probe; a healthy one returns
well under the timeout.
On any failure, re-enter the reconnect ladder so the existing
MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
"""
HEARTBEAT_PROBE_DELAY = 60
PROBE_TIMEOUT = 10
await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
if self.has_fatal_error:
return
if not (self._app and self._app.updater and self._app.updater.running):
logger.warning(
"[%s] Updater not running %ds after reconnect — treating as wedged",
self.name, HEARTBEAT_PROBE_DELAY,
)
await self._handle_polling_network_error(
RuntimeError("Updater not running after reconnect heartbeat")
)
return
try:
await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
except Exception as probe_err:
logger.warning(
"[%s] Polling heartbeat probe failed %ds after reconnect: %s",
self.name, HEARTBEAT_PROBE_DELAY, probe_err,
)
await self._handle_polling_network_error(probe_err)
async def _handle_polling_conflict(self, error: Exception) -> None:
if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
return
@@ -633,6 +691,29 @@ class TelegramAdapter(BasePlatformAdapter):
)
return None
async def rename_dm_topic(
self,
chat_id: int,
thread_id: int,
name: str,
) -> None:
"""Rename a forum topic in a private (DM) chat."""
if not self._bot:
return
try:
chat_id_arg = int(chat_id)
except (TypeError, ValueError):
chat_id_arg = chat_id
await self._bot.edit_forum_topic(
chat_id=chat_id_arg,
message_thread_id=int(thread_id),
name=name,
)
logger.info(
"[%s] Renamed DM topic in chat %s thread_id=%s -> '%s'",
self.name, chat_id, thread_id, name,
)
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
@@ -761,6 +842,20 @@ class TelegramAdapter(BasePlatformAdapter):
# Persist thread_id to config so we don't recreate on next restart
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
# Send a seed message so the topic is visible in Telegram's client.
# Empty topics are hidden by the client UI until they contain a message.
try:
await self._bot.send_message(
chat_id=int(chat_id),
message_thread_id=thread_id,
text=f"\U0001f4cc {topic_name}",
)
except Exception as seed_err:
logger.debug(
"[%s] Could not send seed message to topic '%s': %s",
self.name, topic_name, seed_err,
)
async def connect(self) -> bool:
"""Connect to Telegram via polling or webhook.
@@ -2198,13 +2293,54 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
"[%s] Failed to send Telegram local image, falling back to base adapter: %s",
self.name,
e,
exc_info=True,
error_str = str(e)
# Dimension-related errors are the expected case for valid image
# files that Telegram just refuses as photos (screenshots, extreme
# aspect ratios). Log at INFO because the document fallback is
# the correct path. Any other send_photo failure also falls back
# to document (rate limits, corrupt file markers, format edge
# cases), but at WARNING because it's unexpected and worth
# surfacing in logs.
is_dim_error = (
"Photo_invalid_dimensions" in error_str
or "PHOTO_INVALID_DIMENSIONS" in error_str
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
if is_dim_error:
logger.info(
"[%s] Image dimensions exceed Telegram photo limits, "
"sending as document: %s",
self.name,
image_path,
)
else:
logger.warning(
"[%s] Failed to send Telegram local image as photo, "
"trying document fallback: %s",
self.name,
e,
exc_info=True,
)
# Fallback to sending as document (file) — no dimension limit,
# only 50MB size limit. If even that fails, fall back to the
# base adapter's text-only "Image: /path" rendering.
try:
return await self.send_document(
chat_id=chat_id,
file_path=image_path,
caption=caption,
file_name=os.path.basename(image_path),
reply_to=reply_to,
metadata=metadata,
)
except Exception as doc_err:
logger.error(
"[%s] Failed to send Telegram local image as document, "
"falling back to base adapter: %s",
self.name,
doc_err,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_document(
self,
@@ -2375,21 +2511,16 @@ class TelegramAdapter(BasePlatformAdapter):
try:
_typing_thread = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_typing(_typing_thread)
try:
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=message_thread_id,
)
except Exception as e:
if message_thread_id is not None and self._is_thread_not_found_error(e):
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=None,
)
else:
raise
# No retry-without-thread fallback here: _message_thread_id_for_typing
# already maps the forum General topic to None, so any non-None value
# reaching this call is a user-created topic. If Telegram rejects it
# (e.g. topic deleted mid-session), we swallow the failure rather than
# showing a typing indicator in the wrong chat/All Messages.
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=message_thread_id,
)
except Exception as e:
# Typing failures are non-fatal; log at debug level only.
logger.debug(
+10 -7
View File
@@ -185,10 +185,13 @@ async def _query_doh_provider(
async def discover_fallback_ips() -> list[str]:
"""Auto-discover Telegram API IPs via DNS-over-HTTPS.
Resolves api.telegram.org through Google and Cloudflare DoH, collects all
unique IPs, and excludes the system-DNS-resolved IP (which is presumably
unreachable on this network). Falls back to a hardcoded seed list when DoH
is also unavailable.
Resolves api.telegram.org through Google and Cloudflare DoH and returns all
unique A records. IPs that match the local system resolver are kept rather
than excluded: in many networks the system-DNS IP is the most reliable path
to api.telegram.org and a transient primary-path failure should be retried
against the same address via the IP-rewrite path before the seed list is
consulted (#14520). Falls back to a hardcoded seed list only when DoH
yields no usable answers.
"""
async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
@@ -203,11 +206,11 @@ async def discover_fallback_ips() -> list[str]:
if isinstance(r, list):
doh_ips.extend(r)
# Deduplicate preserving order, exclude system-DNS IPs
# Deduplicate preserving order
seen: set[str] = set()
candidates: list[str] = []
for ip in doh_ips:
if ip not in seen and ip not in system_ips:
if ip not in seen:
seen.add(ip)
candidates.append(ip)
@@ -219,7 +222,7 @@ async def discover_fallback_ips() -> list[str]:
return validated
logger.info(
"DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
"DoH discovery yielded no usable IPs (system DNS: %s); using seed fallback IPs %s",
", ".join(system_ips) or "unknown",
", ".join(_SEED_FALLBACK_IPS),
)
+8 -1
View File
@@ -142,6 +142,7 @@ class WeComAdapter(BasePlatformAdapter):
"""WeCom AI Bot adapter backed by a persistent WebSocket connection."""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
SUPPORTS_MESSAGE_EDITING = False
# Threshold for detecting WeCom client-side message splits.
# When a chunk is near the 4000-char limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 3900
@@ -206,7 +207,11 @@ class WeComAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
)
await self._open_connection()
self._mark_connected()
self._listen_task = asyncio.create_task(self._listen_loop())
@@ -1010,6 +1015,8 @@ class WeComAdapter(BasePlatformAdapter):
if not aes_key:
raise ValueError("aes_key is required")
# WeCom doesn't pad base64 keys; add padding if needed
aes_key = aes_key + '=' * ((4 - len(aes_key) % 4) % 4)
key = base64.b64decode(aes_key)
if len(key) != 32:
raise ValueError(f"Invalid WeCom AES key length: expected 32 bytes, got {len(key)}")
+3 -1
View File
@@ -119,7 +119,9 @@ class WecomCallbackAdapter(BasePlatformAdapter):
pass
try:
self._http_client = httpx.AsyncClient(timeout=20.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
self._app = web.Application()
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get(self._path, self._handle_verify)
+12 -3
View File
@@ -1333,6 +1333,15 @@ class WeixinAdapter(BasePlatformAdapter):
if message_id and self._dedup.is_duplicate(message_id):
return
# Secondary content-fingerprint dedup for text messages
item_list = message.get("item_list") or []
text = _extract_text(item_list)
if text:
content_key = f"content:{sender_id}:{hashlib.md5(text.encode()).hexdigest()}"
if self._dedup.is_duplicate(content_key):
logger.debug("[%s] Content-dedup: skipping duplicate message from %s", self.name, sender_id)
return
chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
if chat_type == "group":
if self._group_policy == "disabled":
@@ -1347,8 +1356,6 @@ class WeixinAdapter(BasePlatformAdapter):
self._token_store.set(self._account_id, sender_id, context_token)
asyncio.create_task(self._maybe_fetch_typing_ticket(sender_id, context_token or None))
item_list = message.get("item_list") or []
text = _extract_text(item_list)
media_paths: List[str] = []
media_types: List[str] = []
@@ -2030,7 +2037,9 @@ async def send_weixin_direct(
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if live_adapter is not None and send_session is not None and not send_session.closed:
if (live_adapter is not None and send_session is not None
and not send_session.closed
and send_session._loop is asyncio.get_running_loop()):
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
+32 -2
View File
@@ -185,6 +185,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
# Set to True by disconnect() before we SIGTERM our child bridge so
# _check_managed_bridge_exit() can distinguish an intentional
# shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
# Without this, every graceful gateway shutdown/restart would log
# "Fatal whatsapp adapter error" plus dispatch a fatal-error
# notification before the normal "✓ whatsapp disconnected" fires.
self._shutting_down: bool = False
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
@@ -555,6 +562,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if returncode is None:
return None
# Planned shutdown: disconnect() sets _shutting_down before it sends
# SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
# or 0 (clean exit) at that point is expected, not a crash. Treat it
# as informational and skip the fatal-error path.
# getattr-with-default keeps tests that construct the adapter via
# ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
# every _make_adapter() helper having to seed the attribute.
if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
logger.info(
"[%s] Bridge exited during shutdown (code %d).",
self.name,
returncode,
)
return None
message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
if not self.has_fatal_error:
logger.error("[%s] %s", self.name, message)
@@ -565,6 +587,10 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def disconnect(self) -> None:
"""Stop the WhatsApp bridge and clean up any orphaned processes."""
# Flip the shutdown flag BEFORE signalling the child so the exit-check
# path (which runs from other tasks like send() and the poll loop)
# doesn't race us and report the intentional termination as fatal.
self._shutting_down = True
if self._bridge_process:
try:
try:
@@ -876,11 +902,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
await self._http_session.post(
# Must wrap in `async with` — a bare `await session.post(...)`
# leaves the response object alive until GC, holding its TCP
# socket in CLOSE_WAIT. See #18451.
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
):
pass
except Exception:
pass # Ignore typing indicator failures
+1569 -213
View File
File diff suppressed because it is too large Load Diff
+22 -16
View File
@@ -1086,19 +1086,22 @@ class SessionStore:
return len(removed_keys)
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
"""Mark recently-active sessions as resumable after an unexpected exit.
Called on gateway startup to prevent sessions that were likely
in-flight when the gateway last exited from being blindly resumed
(#7536). Only suspends sessions updated within *max_age_seconds*
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
Called on gateway startup after a crash or fast restart to preserve
in-flight sessions instead of destroying their conversation history
(#7536). Only marks sessions updated within *max_age_seconds* to
avoid touching long-idle sessions. Sets ``resume_pending=True`` so
the next incoming message on the same session_key auto-resumes from
the existing transcript.
Entries flagged ``resume_pending=True`` are skipped those were
marked intentionally by the drain-timeout path as recoverable.
Terminal escalation for genuinely stuck ``resume_pending`` sessions
is handled by the existing ``.restart_failure_counts`` stuck-loop
counter, which runs after this method on startup.
Entries already flagged ``resume_pending=True`` are skipped. Entries
explicitly ``suspended=True`` (from /stop or stuck-loop escalation)
are also skipped. Terminal escalation for genuinely stuck sessions
is still handled by the existing ``.restart_failure_counts`` counter
(threshold 3), which runs after this method and sets ``suspended=True``.
Returns the number of sessions marked resumable.
"""
from datetime import timedelta
@@ -1110,13 +1113,15 @@ class SessionStore:
if entry.resume_pending:
continue
if not entry.suspended and entry.updated_at >= cutoff:
entry.suspended = True
entry.resume_pending = True
entry.resume_reason = "restart_interrupted"
entry.last_resume_marked_at = _now()
count += 1
if count:
self._save()
return count
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
def reset_session(self, session_key: str, display_name: Optional[str] = None) -> Optional[SessionEntry]:
"""Force reset a session, creating a new session ID."""
db_end_session_id = None
db_create_kwargs = None
@@ -1140,7 +1145,7 @@ class SessionStore:
created_at=now,
updated_at=now,
origin=old_entry.origin,
display_name=old_entry.display_name,
display_name=display_name if display_name is not None else old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
is_fresh_reset=True,
@@ -1271,8 +1276,9 @@ class SessionStore:
# Also write legacy JSONL (keeps existing tooling working during transition)
transcript_path = self.get_transcript_path(session_id)
with open(transcript_path, "a", encoding="utf-8") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
with self._lock:
with open(transcript_path, "a", encoding="utf-8") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
"""Replace the entire transcript for a session with new messages.
+107 -51
View File
@@ -637,6 +637,8 @@ def release_all_scoped_locks(
_TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
_TAKEOVER_MARKER_TTL_S = 60 # Marker older than this is treated as stale
_PLANNED_STOP_MARKER_FILENAME = ".gateway-planned-stop.json"
_PLANNED_STOP_MARKER_TTL_S = 60
def _get_takeover_marker_path() -> Path:
@@ -645,6 +647,67 @@ def _get_takeover_marker_path() -> Path:
return home / _TAKEOVER_MARKER_FILENAME
def _get_planned_stop_marker_path() -> Path:
"""Return the path to the intentional gateway stop marker file."""
home = get_hermes_home()
return home / _PLANNED_STOP_MARKER_FILENAME
def _marker_is_stale(written_at: str, ttl_s: int) -> bool:
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
return age > ttl_s
except (TypeError, ValueError):
return True
def _consume_pid_marker_for_self(
path: Path,
*,
pid_field: str,
start_time_field: str,
ttl_s: int,
) -> bool:
record = _read_json_file(path)
if not record:
return False
try:
target_pid = int(record[pid_field])
target_start_time = record.get(start_time_field)
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
if _marker_is_stale(written_at, ttl_s):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
)
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def write_takeover_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being replaced by the current process.
@@ -681,59 +744,13 @@ def consume_takeover_marker_for_self() -> bool:
Always unlinks the marker on match (and on detected staleness) so
subsequent unrelated signals don't re-trigger.
"""
path = _get_takeover_marker_path()
record = _read_json_file(path)
if not record:
return False
# Any malformed or stale marker → drop it and return False
try:
target_pid = int(record["target_pid"])
target_start_time = record.get("target_start_time")
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
stale = False
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
if age > _TAKEOVER_MARKER_TTL_S:
stale = True
except (TypeError, ValueError):
stale = True # Unparseable timestamp — treat as stale
if stale:
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# Does the marker name THIS process?
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
return _consume_pid_marker_for_self(
_get_takeover_marker_path(),
pid_field="target_pid",
start_time_field="target_start_time",
ttl_s=_TAKEOVER_MARKER_TTL_S,
)
# Consume the marker whether it matched or not — a marker that doesn't
# match our identity is stale-for-us anyway.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def clear_takeover_marker() -> None:
"""Remove the takeover marker unconditionally. Safe to call repeatedly."""
@@ -743,6 +760,45 @@ def clear_takeover_marker() -> None:
pass
def write_planned_stop_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being stopped intentionally.
The gateway exits non-zero for unexpected SIGTERM so service managers can
revive it. Service stop commands send the same SIGTERM, so the CLI writes
this short-lived marker first to let the target process exit cleanly.
"""
try:
target_start_time = _get_process_start_time(target_pid)
record = {
"target_pid": target_pid,
"target_start_time": target_start_time,
"stopper_pid": os.getpid(),
"written_at": _utc_now_iso(),
}
_write_json_file(_get_planned_stop_marker_path(), record)
return True
except (OSError, PermissionError):
return False
def consume_planned_stop_marker_for_self() -> bool:
"""Return True when the current process is being intentionally stopped."""
return _consume_pid_marker_for_self(
_get_planned_stop_marker_path(),
pid_field="target_pid",
start_time_field="target_start_time",
ttl_s=_PLANNED_STOP_MARKER_TTL_S,
)
def clear_planned_stop_marker() -> None:
"""Remove the planned-stop marker unconditionally."""
try:
_get_planned_stop_marker_path().unlink(missing_ok=True)
except OSError:
pass
def get_running_pid(
pid_path: Optional[Path] = None,
*,
+33 -1
View File
@@ -5,11 +5,43 @@ Provides subcommands for:
- hermes chat - Interactive chat (same as ./hermes)
- hermes gateway - Run gateway in foreground
- hermes gateway start - Start gateway service
- hermes gateway stop - Stop gateway service
- hermes gateway stop - Stop gateway service
- hermes setup - Interactive setup wizard
- hermes status - Show status of all components
- hermes cron - Manage cron jobs
"""
import os
import sys
__version__ = "0.12.0"
__release_date__ = "2026.4.30"
def _ensure_utf8():
"""Force UTF-8 stdout/stderr on Windows to prevent UnicodeEncodeError.
Windows services and terminals default to cp1252, which cannot encode
box-drawing characters used in CLI output. This causes unhandled
UnicodeEncodeError crashes on gateway startup.
"""
if sys.platform != "win32":
return
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
for stream_name in ("stdout", "stderr"):
stream = getattr(sys, stream_name, None)
if stream is None:
continue
try:
if getattr(stream, "encoding", "").lower().replace("-", "") != "utf8":
new_stream = open(
stream.fileno(), "w", encoding="utf-8",
buffering=1, closefd=False,
)
setattr(sys, stream_name, new_stream)
except (AttributeError, OSError):
pass
_ensure_utf8()
+432 -19
View File
@@ -416,6 +416,40 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
),
}
# Auto-extend PROVIDER_REGISTRY with any api-key provider registered in
# providers/ that is not already declared above. New providers only need a
# plugins/model-providers/<name>/ plugin — no edits to this file required.
try:
from providers import list_providers as _list_providers_for_registry
for _pp in _list_providers_for_registry():
if _pp.name in PROVIDER_REGISTRY:
continue
if _pp.auth_type != "api_key" or not _pp.env_vars:
continue
# Skip providers that need custom token resolution or are special-cased
# in resolve_provider() (copilot/kimi/zai have bespoke token refresh;
# openrouter/custom are aggregator/user-supplied and handled outside
# the registry — adding them here breaks runtime_provider resolution
# that relies on `openrouter not in PROVIDER_REGISTRY`).
if _pp.name in {"copilot", "kimi-coding", "kimi-coding-cn", "zai", "openrouter", "custom"}:
continue
_api_key_vars = tuple(v for v in _pp.env_vars if not v.endswith("_BASE_URL") and not v.endswith("_URL"))
_base_url_var = next((v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")), None)
PROVIDER_REGISTRY[_pp.name] = ProviderConfig(
id=_pp.name,
name=_pp.display_name or _pp.name,
auth_type="api_key",
inference_base_url=_pp.base_url,
api_key_env_vars=_api_key_vars or _pp.env_vars,
base_url_env_var=_base_url_var or "",
)
# Also register aliases so resolve_provider() resolves them
for _alias in _pp.aliases:
if _alias not in PROVIDER_REGISTRY:
PROVIDER_REGISTRY[_alias] = PROVIDER_REGISTRY[_pp.name]
except Exception:
pass
# =============================================================================
# Anthropic Key Helper
@@ -746,6 +780,73 @@ def _auth_file_path() -> Path:
return path
def _global_auth_file_path() -> Optional[Path]:
"""Return the global-root auth.json when the process is in profile mode.
Returns ``None`` when the profile and global root resolve to the same
directory (classic mode, or custom HERMES_HOME that is not a profile).
Used by read-only fallback paths so providers authed at the root are
visible to profile processes that haven't configured them locally.
See issue #18594 follow-up (credential_pool shadowing).
"""
try:
from hermes_constants import get_default_hermes_root
global_root = get_default_hermes_root()
except Exception:
return None
profile_home = get_hermes_home()
try:
if profile_home.resolve(strict=False) == global_root.resolve(strict=False):
return None
except Exception:
if profile_home == global_root:
return None
# No pytest seat belt here: this is a pure read-only path, and
# ``_load_global_auth_store()`` wraps the read in a try/except so an
# unreadable global file can never break the profile process. The
# write-side seat belt still lives on ``_auth_file_path()`` where it
# belongs (that's what protects the real user's auth store from being
# corrupted by a mis-configured test).
return global_root / "auth.json"
def _load_global_auth_store() -> Dict[str, Any]:
"""Load the global-root auth store (read-only fallback).
Returns an empty dict when no global fallback exists (classic mode,
or the global auth.json is absent). Never raises on missing file.
Seat belt: under pytest, refuses to read the real user's
``~/.hermes/auth.json`` even when HERMES_HOME is set to a profile
path. The hermetic conftest does not redirect ``HOME``, so
``get_default_hermes_root()`` for a profile-shaped HERMES_HOME can
still resolve to the real user's home on a dev machine. That would
leak real credentials into tests. This guard uses the unmodified
``HOME`` env var (what ``os.path.expanduser('~')`` would resolve to),
not ``Path.home()``, because ``Path.home`` is sometimes monkeypatched
by fixtures that want to relocate the global root to a tmp path.
"""
global_path = _global_auth_file_path()
if global_path is None or not global_path.exists():
return {}
if os.environ.get("PYTEST_CURRENT_TEST"):
real_home_env = os.environ.get("HOME", "")
if real_home_env:
real_root = Path(real_home_env) / ".hermes" / "auth.json"
try:
if global_path.resolve(strict=False) == real_root.resolve(strict=False):
return {}
except Exception:
pass
try:
return _load_auth_store(global_path)
except Exception:
# A malformed global store must not break profile reads. The
# profile's own auth store is still authoritative.
return {}
def _auth_lock_path() -> Path:
return _auth_file_path().with_suffix(".lock")
@@ -932,15 +1033,50 @@ def get_auth_provider_display_name(provider_id: str) -> str:
def read_credential_pool(provider_id: Optional[str] = None) -> Dict[str, Any]:
"""Return the persisted credential pool, or one provider slice."""
"""Return the persisted credential pool, or one provider slice.
In profile mode, the profile's credential pool is authoritative. If a
provider has no entries in the profile, entries from the global-root
``auth.json`` are used as a read-only fallback so workers spawned in a
profile can see providers that were only authenticated at global scope.
Profile entries always win: the global fallback only applies per-provider
when the profile has zero entries for that provider. Once the user runs
``hermes auth add <provider>`` inside the profile, profile entries
fully shadow global for that provider on the next read.
Writes always go to the profile (``write_credential_pool`` is unchanged).
See issue #18594 follow-up.
"""
auth_store = _load_auth_store()
pool = auth_store.get("credential_pool")
if not isinstance(pool, dict):
pool = {}
global_pool: Dict[str, Any] = {}
global_store = _load_global_auth_store()
maybe_global_pool = global_store.get("credential_pool") if global_store else None
if isinstance(maybe_global_pool, dict):
global_pool = maybe_global_pool
if provider_id is None:
return dict(pool)
merged = dict(pool)
for gp_key, gp_entries in global_pool.items():
if not isinstance(gp_entries, list) or not gp_entries:
continue
# Per-provider shadowing: profile wins whenever it has ANY entries.
existing = merged.get(gp_key)
if isinstance(existing, list) and existing:
continue
merged[gp_key] = list(gp_entries)
return merged
provider_entries = pool.get(provider_id)
return list(provider_entries) if isinstance(provider_entries, list) else []
if isinstance(provider_entries, list) and provider_entries:
return list(provider_entries)
# Profile has no entries for this provider — fall back to global.
global_entries = global_pool.get(provider_id)
return list(global_entries) if isinstance(global_entries, list) else []
def write_credential_pool(provider_id: str, entries: List[Dict[str, Any]]) -> Path:
@@ -999,9 +1135,25 @@ def unsuppress_credential_source(provider_id: str, source: str) -> bool:
def get_provider_auth_state(provider_id: str) -> Optional[Dict[str, Any]]:
"""Return persisted auth state for a provider, or None."""
"""Return persisted auth state for a provider, or None.
In profile mode, falls back to the global-root ``auth.json`` when the
profile has no state for this provider. Profile state always wins when
present. Writes (``_save_auth_store`` / ``persist_*_credentials``) are
unchanged they still target the profile only. This mirrors
``read_credential_pool``'s per-provider shadowing semantics so that
``_seed_from_singletons`` can reseed a profile's credential pool from
global-scope provider state (e.g. a globally-authenticated Anthropic
OAuth or Nous device-code session). See issue #18594 follow-up.
"""
auth_store = _load_auth_store()
return _load_provider_state(auth_store, provider_id)
state = _load_provider_state(auth_store, provider_id)
if state is not None:
return state
global_store = _load_global_auth_store()
if not global_store:
return None
return _load_provider_state(global_store, provider_id)
def get_active_provider() -> Optional[str]:
@@ -1195,6 +1347,17 @@ def resolve_provider(
"vllm": "custom", "llamacpp": "custom",
"llama.cpp": "custom", "llama-cpp": "custom",
}
# Extend with aliases declared in plugins/model-providers/<name>/ that aren't already mapped.
# This keeps providers/ as the single source for new aliases while the
# hardcoded dict above remains authoritative for existing ones.
try:
from providers import list_providers as _lp
for _pp in _lp():
for _alias in _pp.aliases:
if _alias not in _PROVIDER_ALIASES:
_PROVIDER_ALIASES[_alias] = _pp.name
except Exception:
pass
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
if normalized == "openrouter":
@@ -2589,6 +2752,208 @@ def _poll_for_token(
# Nous Portal — token refresh, agent key minting, model discovery
# =============================================================================
# -----------------------------------------------------------------------------
# Shared Nous token store — lets OAuth credentials persist across profiles
# so a new `hermes --profile <name> auth add nous --type oauth` can one-tap
# import instead of running the full device-code flow every time.
#
# File lives at ${HERMES_SHARED_AUTH_DIR}/nous_auth.json, defaulting to
# ~/.hermes/shared/nous_auth.json. It is OUTSIDE any named profile's
# HERMES_HOME so named profiles (which typically live under
# ~/.hermes/profiles/<name>/) all see the same file.
#
# Written on successful login and on every runtime refresh so the stored
# refresh_token stays current even if one profile refreshes and rotates it.
# If ever the stored refresh_token does go stale server-side, import fails
# gracefully and the user falls back to the normal device-code flow.
# -----------------------------------------------------------------------------
NOUS_SHARED_STORE_FILENAME = "nous_auth.json"
def _nous_shared_auth_dir() -> Path:
"""Resolve the directory that holds the shared Nous token store.
Honors ``HERMES_SHARED_AUTH_DIR`` so tests can redirect it to a tmp
path without touching the real user's home. Defaults to
``~/.hermes/shared/``.
"""
override = os.getenv("HERMES_SHARED_AUTH_DIR", "").strip()
if override:
return Path(override).expanduser()
return Path.home() / ".hermes" / "shared"
def _nous_shared_store_path() -> Path:
path = _nous_shared_auth_dir() / NOUS_SHARED_STORE_FILENAME
# Seat belt: if pytest is running and this resolves to a path under the
# real user's home, refuse rather than silently corrupt cross-profile
# state. Tests must set HERMES_SHARED_AUTH_DIR to a tmp_path (conftest
# does not do this automatically — mirror the _auth_file_path() guard
# so forgetting to set it fails loudly instead of writing to the real
# shared store).
if os.environ.get("PYTEST_CURRENT_TEST"):
real_home_shared = (
Path.home() / ".hermes" / "shared" / NOUS_SHARED_STORE_FILENAME
).resolve(strict=False)
try:
resolved = path.resolve(strict=False)
except Exception:
resolved = path
if resolved == real_home_shared:
raise RuntimeError(
f"Refusing to touch real user shared Nous auth store during test run: "
f"{path}. Set HERMES_SHARED_AUTH_DIR to a tmp_path in your test fixture."
)
return path
def _write_shared_nous_state(state: Dict[str, Any]) -> None:
"""Persist a minimal copy of the Nous OAuth state to the shared store.
Best-effort: any failure is swallowed after logging. The shared store
is a convenience layer; the per-profile auth.json remains the source
of truth.
We deliberately omit the short-lived ``agent_key`` (24h TTL, profile-
specific) only the long-lived OAuth tokens are cross-profile useful.
"""
refresh_token = state.get("refresh_token")
access_token = state.get("access_token")
if not (isinstance(refresh_token, str) and refresh_token.strip()):
# No refresh_token = nothing worth sharing across profiles
return
if not (isinstance(access_token, str) and access_token.strip()):
return
shared = {
"_schema": 1,
"access_token": access_token,
"refresh_token": refresh_token,
"token_type": state.get("token_type") or "Bearer",
"scope": state.get("scope") or DEFAULT_NOUS_SCOPE,
"client_id": state.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": state.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
"inference_base_url": state.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
"obtained_at": state.get("obtained_at"),
"expires_at": state.get("expires_at"),
"updated_at": datetime.now(timezone.utc).isoformat(),
}
try:
path = _nous_shared_store_path()
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(path.suffix + ".tmp")
tmp.write_text(json.dumps(shared, indent=2, sort_keys=True))
try:
os.chmod(tmp, 0o600)
except OSError:
pass
os.replace(tmp, path)
_oauth_trace(
"nous_shared_store_written",
path=str(path),
refresh_token_fp=_token_fingerprint(refresh_token),
)
except Exception as exc:
logger.debug("Failed to write shared Nous auth store: %s", exc)
def _read_shared_nous_state() -> Optional[Dict[str, Any]]:
"""Return the shared Nous OAuth state if present and well-formed.
Returns ``None`` when the file is missing, unreadable, malformed, or
lacks required fields. Callers should treat ``None`` as "no shared
credentials available fall through to device-code".
"""
try:
path = _nous_shared_store_path()
except RuntimeError:
# Test seat belt tripped — treat as missing
return None
if not path.is_file():
return None
try:
payload = json.loads(path.read_text())
except (OSError, ValueError) as exc:
logger.debug("Shared Nous auth store at %s is unreadable: %s", path, exc)
return None
if not isinstance(payload, dict):
return None
refresh_token = payload.get("refresh_token")
access_token = payload.get("access_token")
if not (isinstance(refresh_token, str) and refresh_token.strip()):
return None
if not (isinstance(access_token, str) and access_token.strip()):
return None
return payload
def _try_import_shared_nous_state(
*,
timeout_seconds: float = 15.0,
min_key_ttl_seconds: int = 5 * 60,
) -> Optional[Dict[str, Any]]:
"""Attempt to rehydrate Nous OAuth state from the shared store.
Reads the shared file (if present), runs a forced refresh+mint using
the stored refresh_token to produce a fresh access_token + agent_key
scoped to this profile, and returns the full auth_state dict ready
for ``persist_nous_credentials()``.
Returns ``None`` when no shared state is available or the rehydrate
fails for any reason (expired refresh_token, portal unreachable,
etc.) caller should then fall through to the normal device-code
flow.
"""
shared = _read_shared_nous_state()
if not shared:
return None
# Build a full state dict so refresh_nous_oauth_from_state has every
# field it needs. force_refresh=True gets us a fresh access_token
# for this profile; force_mint=True gets us a fresh agent_key.
state: Dict[str, Any] = {
"access_token": shared.get("access_token"),
"refresh_token": shared.get("refresh_token"),
"client_id": shared.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": shared.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
"inference_base_url": shared.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
"token_type": shared.get("token_type") or "Bearer",
"scope": shared.get("scope") or DEFAULT_NOUS_SCOPE,
"obtained_at": shared.get("obtained_at"),
"expires_at": shared.get("expires_at"),
"agent_key": None,
"agent_key_expires_at": None,
"tls": {"insecure": False, "ca_bundle": None},
}
try:
refreshed = refresh_nous_oauth_from_state(
state,
min_key_ttl_seconds=min_key_ttl_seconds,
timeout_seconds=timeout_seconds,
force_refresh=True,
force_mint=True,
)
except AuthError as exc:
_oauth_trace(
"nous_shared_import_failed",
error_type=type(exc).__name__,
error_code=getattr(exc, "code", None),
)
logger.debug("Shared Nous import failed: %s", exc)
return None
except Exception as exc:
_oauth_trace(
"nous_shared_import_failed",
error_type=type(exc).__name__,
)
logger.debug("Shared Nous import failed: %s", exc)
return None
return refreshed
def _refresh_access_token(
*,
client: httpx.Client,
@@ -2991,6 +3356,12 @@ def persist_nous_credentials(
_save_provider_state(auth_store, "nous", state)
_save_auth_store(auth_store)
# Mirror to the shared store so a new profile can one-tap import
# these credentials via `hermes auth add nous --type oauth`. Best-
# effort: any I/O failure is logged and swallowed (the per-profile
# auth.json is still the source of truth).
_write_shared_nous_state(state)
pool = load_pool("nous")
return next(
(e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
@@ -3059,6 +3430,11 @@ def resolve_nous_runtime_credentials(
refresh_token_fp=_token_fingerprint(state.get("refresh_token")),
access_token_fp=_token_fingerprint(state.get("access_token")),
)
# Mirror post-refresh state to the shared store so sibling
# profiles don't hold stale refresh_tokens after rotation.
# Best-effort — any failure is logged and swallowed inside
# _write_shared_nous_state.
_write_shared_nous_state(state)
verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
@@ -3840,7 +4216,7 @@ def _prompt_model_selection(
clear_screen=False,
title=effective_title,
)
idx = menu.show()
idx: int | None = menu.show() # ty:ignore[invalid-assignment] - TerminalMenu.show() is always `int | None` when multi_select is False / not provided.
from hermes_cli.curses_ui import flush_stdin
flush_stdin()
if idx is None:
@@ -4283,7 +4659,8 @@ def _minimax_oauth_login(
print(f"Portal: {portal_base_url}")
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
headers={"Accept": "application/json"}) as client:
headers={"Accept": "application/json"},
follow_redirects=True) as client:
code_data = _minimax_request_user_code(
client, portal_base_url=portal_base_url,
client_id=pconfig.client_id,
@@ -4360,7 +4737,8 @@ def _refresh_minimax_oauth_state(
return state
portal_base_url = state["portal_base_url"]
with httpx.Client(timeout=httpx.Timeout(timeout_seconds)) as client:
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
follow_redirects=True) as client:
response = client.post(
f"{portal_base_url}/oauth/token",
data={
@@ -4598,17 +4976,47 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
)
try:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
auth_state = None
# Codex-style auto-import: before launching a fresh device-code
# flow, check the shared store for an existing Nous credential
# from any other profile. If present, offer to rehydrate it.
shared = _read_shared_nous_state()
if shared:
try:
shared_path = _nous_shared_store_path()
except RuntimeError:
shared_path = None
print()
if shared_path:
print(f"Found existing Nous OAuth credentials at {shared_path}")
else:
print("Found existing shared Nous OAuth credentials")
try:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
print("Rehydrating Nous session from shared credentials...")
auth_state = _try_import_shared_nous_state(
timeout_seconds=timeout_seconds,
min_key_ttl_seconds=5 * 60,
)
if auth_state is None:
print("Could not refresh shared credentials — falling back to device-code login.")
if auth_state is None:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
inference_base_url = auth_state["inference_base_url"]
@@ -4625,6 +5033,11 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
_save_provider_state(auth_store, "nous", auth_state)
saved_to = _save_auth_store(auth_store)
# Mirror to the shared store so other profiles can one-tap import
# these credentials. Best-effort: any I/O failure is logged and
# swallowed inside the helper.
_write_shared_nous_state(auth_state)
print()
print("Login successful!")
print(f" Auth state: {saved_to}")
+41
View File
@@ -245,6 +245,47 @@ def auth_add_command(args) -> None:
return
if provider == "nous":
# Codex-style auto-import: if a shared Nous credential lives at
# ~/.hermes/shared/nous_auth.json (written by any previous
# successful login), offer to import it instead of running the
# full device-code flow. This makes `hermes --profile <name>
# auth add nous --type oauth` a one-tap operation for users who
# run multiple profiles.
shared = auth_mod._read_shared_nous_state()
if shared:
try:
path = auth_mod._nous_shared_store_path()
except RuntimeError:
path = None
print()
if path:
print(f"Found existing Nous OAuth credentials at {path}")
else:
print("Found existing shared Nous OAuth credentials")
try:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
print("Rehydrating Nous session from shared credentials...")
rehydrated = auth_mod._try_import_shared_nous_state(
timeout_seconds=getattr(args, "timeout", None) or 15.0,
min_key_ttl_seconds=max(
60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))
),
)
if rehydrated is not None:
custom_label = (getattr(args, "label", None) or "").strip() or None
entry = auth_mod.persist_nous_credentials(rehydrated, label=custom_label)
shown_label = entry.label if entry is not None else label_from_token(
rehydrated.get("access_token", ""), _oauth_default_label(provider, 1),
)
print(f'Imported {provider} OAuth credentials: "{shown_label}"')
return
# Rehydrate failed (expired refresh_token, portal down, etc.)
# — fall through to device-code flow.
print("Could not refresh shared credentials — falling back to device-code login.")
creds = auth_mod._nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
+15 -2
View File
@@ -61,6 +61,9 @@ _EXCLUDED_NAMES = {
"cron.pid",
}
# zipfile.open() drops Unix mode bits on extract; restore tightens these to 0600.
_SECRET_FILE_NAMES = {".env", "auth.json", "state.db"}
def _should_exclude(rel_path: Path) -> bool:
"""Return True if *rel_path* (relative to hermes root) should be skipped."""
@@ -381,6 +384,8 @@ def run_import(args) -> None:
target.parent.mkdir(parents=True, exist_ok=True)
with zf.open(member) as src, open(target, "wb") as dst:
dst.write(src.read())
if target.name in _SECRET_FILE_NAMES:
os.chmod(target, 0o600)
restored += 1
except (PermissionError, OSError) as exc:
errors.append(f" {rel}: {exc}")
@@ -788,9 +793,17 @@ def _prune_pre_update_backups(backup_dir: Path, keep: int) -> int:
Returns the number of files deleted. Only touches files matching
``pre-update-*.zip`` so hand-made zips dropped in the same directory
are never touched.
``keep`` is floored to 1 because this helper is only called immediately
after a fresh backup is written: deleting that backup right after the
user paid the disk/CPU cost to create it would leave them worse off
than no backup at all (and the wrapper in ``main.py`` would still print
a misleading ``Saved: <path>`` line for a file that no longer exists).
Operators who genuinely don't want a backup should set
``updates.pre_update_backup: false`` in config that gates creation.
"""
if keep < 0:
keep = 0
if keep < 1:
keep = 1
if not backup_dir.exists():
return 0
+244
View File
@@ -0,0 +1,244 @@
"""`hermes checkpoints` CLI subcommand.
Gives users direct visibility and control over the filesystem checkpoint
store at ``~/.hermes/checkpoints/``. Actions:
hermes checkpoints # same as `status`
hermes checkpoints status # total size, project count, breakdown
hermes checkpoints list # per-project checkpoint counts + workdir
hermes checkpoints prune [opts] # force a sweep (ignores the 24h marker)
hermes checkpoints clear [-f] # nuke the entire base (asks first)
hermes checkpoints clear-legacy # delete just the legacy-* archives
Examples::
hermes checkpoints
hermes checkpoints prune --retention-days 3 --max-size-mb 200
hermes checkpoints clear -f
None of these require the agent to be running. Safe to call any time.
"""
from __future__ import annotations
import argparse
import time
from datetime import datetime
from pathlib import Path
from typing import Any, Dict
def _fmt_bytes(n: int) -> str:
units = ("B", "KB", "MB", "GB", "TB")
size = float(n or 0)
for unit in units:
if size < 1024 or unit == units[-1]:
if unit == "B":
return f"{int(size)} {unit}"
return f"{size:.1f} {unit}"
size /= 1024
return f"{size:.1f} TB"
def _fmt_ts(ts: Any) -> str:
try:
return datetime.fromtimestamp(float(ts)).strftime("%Y-%m-%d %H:%M")
except (TypeError, ValueError):
return ""
def _fmt_age(ts: Any) -> str:
try:
age = time.time() - float(ts)
except (TypeError, ValueError):
return ""
if age < 0:
return "now"
if age < 60:
return f"{int(age)}s ago"
if age < 3600:
return f"{int(age / 60)}m ago"
if age < 86400:
return f"{int(age / 3600)}h ago"
return f"{int(age / 86400)}d ago"
def cmd_status(args: argparse.Namespace) -> int:
from tools.checkpoint_manager import store_status
info = store_status()
base = info["base"]
print(f"Checkpoint base: {base}")
print(f"Total size: {_fmt_bytes(info['total_size_bytes'])}")
print(f" store/ {_fmt_bytes(info['store_size_bytes'])}")
print(f" legacy-* {_fmt_bytes(info['legacy_size_bytes'])}")
print(f"Projects: {info['project_count']}")
projects = sorted(
info["projects"],
key=lambda p: (p.get("last_touch") or 0),
reverse=True,
)
if projects:
print()
print(f" {'WORKDIR':<60} {'COMMITS':>7} {'LAST TOUCH':>12} STATE")
for p in projects[: args.limit if hasattr(args, "limit") and args.limit else 20]:
wd = p.get("workdir") or "(unknown)"
if len(wd) > 60:
wd = "" + wd[-59:]
exists = p.get("exists")
state = "live" if exists else "orphan"
commits = p.get("commits", 0)
last = _fmt_age(p.get("last_touch"))
print(f" {wd:<60} {commits:>7} {last:>12} {state}")
legacy = info.get("legacy_archives", [])
if legacy:
print()
print(f"Legacy archives ({len(legacy)}):")
for arch in sorted(legacy, key=lambda a: a.get("mtime", 0), reverse=True):
print(f" {arch['name']:<40} {_fmt_bytes(arch['size_bytes']):>10}")
print()
print("Clear with: hermes checkpoints clear-legacy")
return 0
def cmd_list(args: argparse.Namespace) -> int:
# `list` is just a terser status — already covered.
return cmd_status(args)
def cmd_prune(args: argparse.Namespace) -> int:
from tools.checkpoint_manager import prune_checkpoints
retention_days = args.retention_days
max_size_mb = args.max_size_mb
print("Pruning checkpoint store…")
print(f" retention_days: {retention_days}")
print(f" delete_orphans: {not args.keep_orphans}")
print(f" max_total_size_mb: {max_size_mb}")
print()
result = prune_checkpoints(
retention_days=retention_days,
delete_orphans=not args.keep_orphans,
max_total_size_mb=max_size_mb,
)
print(f"Scanned: {result['scanned']}")
print(f"Deleted orphan: {result['deleted_orphan']}")
print(f"Deleted stale: {result['deleted_stale']}")
print(f"Errors: {result['errors']}")
print(f"Bytes reclaimed: {_fmt_bytes(result['bytes_freed'])}")
return 0
def _confirm(prompt: str) -> bool:
try:
resp = input(f"{prompt} [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
print()
return False
return resp in ("y", "yes")
def cmd_clear(args: argparse.Namespace) -> int:
from tools.checkpoint_manager import CHECKPOINT_BASE, clear_all, store_status
info = store_status()
if info["total_size_bytes"] == 0 and not Path(CHECKPOINT_BASE).exists():
print("Nothing to clear — checkpoint base does not exist.")
return 0
print(f"This will delete the ENTIRE checkpoint base at {info['base']}")
print(f" size: {_fmt_bytes(info['total_size_bytes'])}")
print(f" projects: {info['project_count']}")
print(f" legacy dirs: {len(info.get('legacy_archives', []))}")
print()
print("All /rollback history for every working directory will be lost.")
if not args.force and not _confirm("Proceed?"):
print("Aborted.")
return 1
result = clear_all()
if result["deleted"]:
print(f"Cleared. Reclaimed {_fmt_bytes(result['bytes_freed'])}.")
return 0
print("Could not clear checkpoint base (see logs).")
return 2
def cmd_clear_legacy(args: argparse.Namespace) -> int:
from tools.checkpoint_manager import clear_legacy, store_status
info = store_status()
legacy = info.get("legacy_archives", [])
if not legacy:
print("No legacy archives to clear.")
return 0
total = sum(a.get("size_bytes", 0) for a in legacy)
print(f"Found {len(legacy)} legacy archive(s), total {_fmt_bytes(total)}:")
for arch in legacy:
print(f" {arch['name']:<40} {_fmt_bytes(arch['size_bytes']):>10}")
print()
print("Legacy archives hold pre-v2 per-project shadow repos, moved aside")
print("during the single-store migration. Delete when you're confident")
print("you don't need the old /rollback history.")
if not args.force and not _confirm("Delete all legacy archives?"):
print("Aborted.")
return 1
result = clear_legacy()
print(f"Deleted {result['deleted']} archive(s), reclaimed {_fmt_bytes(result['bytes_freed'])}.")
return 0
def register_cli(parser: argparse.ArgumentParser) -> None:
"""Wire subcommands onto the ``hermes checkpoints`` parser."""
parser.set_defaults(func=cmd_status) # bare `hermes checkpoints` → status
subs = parser.add_subparsers(dest="checkpoints_command", metavar="COMMAND")
p_status = subs.add_parser(
"status",
help="Show total size, project count, and per-project breakdown",
)
p_status.add_argument("--limit", type=int, default=20,
help="Max projects to list (default 20)")
p_status.set_defaults(func=cmd_status)
p_list = subs.add_parser(
"list",
help="Alias for 'status'",
)
p_list.add_argument("--limit", type=int, default=20)
p_list.set_defaults(func=cmd_list)
p_prune = subs.add_parser(
"prune",
help="Delete orphan/stale checkpoints and GC the store",
)
p_prune.add_argument("--retention-days", type=int, default=7,
help="Drop projects whose last_touch is older than N days (default 7)")
p_prune.add_argument("--max-size-mb", type=int, default=500,
help="After orphan/stale prune, drop oldest commits "
"per project until total size <= this (default 500)")
p_prune.add_argument("--keep-orphans", action="store_true",
help="Skip deleting projects whose workdir no longer exists")
p_prune.set_defaults(func=cmd_prune)
p_clear = subs.add_parser(
"clear",
help="Delete the entire checkpoint base (all /rollback history)",
)
p_clear.add_argument("-f", "--force", action="store_true",
help="Skip confirmation prompt")
p_clear.set_defaults(func=cmd_clear)
p_legacy = subs.add_parser(
"clear-legacy",
help="Delete only the legacy-<ts>/ archives from v1 migration",
)
p_legacy.add_argument("-f", "--force", action="store_true",
help="Skip confirmation prompt")
p_legacy.set_defaults(func=cmd_clear_legacy)
+9 -1
View File
@@ -235,6 +235,9 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
"""
findings: list[tuple[Path, str]] = []
if not source_dir.exists():
return findings
# Direct state files in the root
for name in ("todo.json", "sessions", "logs"):
candidate = source_dir / name
@@ -243,7 +246,12 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
findings.append((candidate, f"Root {kind}: {name}"))
# State files inside workspace directories
for child in sorted(source_dir.iterdir()):
try:
children = sorted(source_dir.iterdir())
except OSError:
return findings
for child in children:
if not child.is_dir() or child.name.startswith("."):
continue
# Check for workspace-like subdirectories
+174 -61
View File
@@ -10,6 +10,7 @@ To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.
from __future__ import annotations
import logging
import os
import re
import shutil
@@ -21,6 +22,8 @@ from typing import Any
from utils import is_truthy_value
logger = logging.getLogger(__name__)
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
@@ -61,7 +64,9 @@ class CommandDef:
COMMAND_REGISTRY: list[CommandDef] = [
# Session
CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
aliases=("reset",)),
aliases=("reset",), args_hint="[name]"),
CommandDef("topic", "Enable or inspect Telegram DM topic sessions", "Session",
gateway_only=True, args_hint="[off|help|session-id]"),
CommandDef("clear", "Clear screen and start a new session", "Session",
cli_only=True),
CommandDef("redraw", "Force a full UI repaint (recovers from terminal drift)", "Session",
@@ -396,6 +401,11 @@ def _is_gateway_available(cmd: CommandDef, config_overrides: set[str] | None = N
return False
def _requires_argument(args_hint: str) -> bool:
"""Return True when selecting a command without text would be incomplete."""
return args_hint.strip().startswith("<")
def gateway_help_lines() -> list[str]:
"""Generate gateway help text lines from the registry."""
overrides = _resolve_config_gates()
@@ -452,7 +462,9 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
canonical command. Commands that require arguments are skipped because
selecting a Telegram BotCommand sends only ``/command`` and would execute
an incomplete command.
Plugin-registered slash commands are included so plugins get native
autocomplete in Telegram without touching core code.
@@ -462,10 +474,14 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
if _requires_argument(cmd.args_hint):
continue
tg_name = _sanitize_telegram_name(cmd.name)
if tg_name:
result.append((tg_name, cmd.description))
for name, description, _args_hint in _iter_plugin_command_entries():
for name, description, args_hint in _iter_plugin_command_entries():
if _requires_argument(args_hint):
continue
tg_name = _sanitize_telegram_name(name)
if tg_name:
result.append((tg_name, description))
@@ -499,9 +515,9 @@ def _sanitize_telegram_name(raw: str) -> str:
def _clamp_command_names(
entries: list[tuple[str, str]],
entries: list[tuple[str, ...]],
reserved: set[str],
) -> list[tuple[str, str]]:
) -> list[tuple[str, ...]]:
"""Enforce 32-char command name limit with collision avoidance.
Both Telegram and Discord cap slash command names at 32 characters.
@@ -509,10 +525,15 @@ def _clamp_command_names(
(against *reserved* names or earlier entries in the same batch), the name is
shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
If all 10 digit slots are taken the entry is silently dropped.
Accepts tuples of any length >= 2. Extra elements beyond ``(name, desc)``
(e.g. ``cmd_key``) are passed through unchanged, so callers can attach
metadata that survives the rename.
"""
used: set[str] = set(reserved)
result: list[tuple[str, str]] = []
for name, desc in entries:
result: list[tuple] = []
for entry in entries:
name, desc, *extra = entry
if len(name) > _CMD_NAME_LIMIT:
candidate = name[:_CMD_NAME_LIMIT]
if candidate in used:
@@ -528,7 +549,7 @@ def _clamp_command_names(
if name in used:
continue
used.add(name)
result.append((name, desc))
result.append((name, desc, *extra))
return result
@@ -611,13 +632,26 @@ def _collect_gateway_skill_entries(
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
from agent.skill_utils import get_external_skills_dirs
_skills_dir = str(SKILLS_DIR.resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve()).rstrip("/") + "/"
# Build set of allowed directory prefixes: local skills dir + any
# user-configured ``skills.external_dirs``. Ensure each prefix ends
# with ``/`` so ``/my-skills`` does not also match ``/my-skills-extra``.
# Without this widening, external skills are visible in
# ``hermes skills list`` and the agent's ``/skill-name`` dispatch but
# silently excluded from gateway slash menus (#8110).
_allowed_prefixes = [_skills_dir.rstrip("/") + "/"]
_allowed_prefixes.extend(
str(d).rstrip("/") + "/" for d in get_external_skills_dirs()
)
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path.startswith(_skills_dir):
if not skill_path:
continue
if not any(skill_path.startswith(prefix) for prefix in _allowed_prefixes):
continue
if skill_path.startswith(_hub_dir):
continue
@@ -635,17 +669,15 @@ def _collect_gateway_skill_entries(
except Exception:
pass
# Clamp names; _clamp_command_names works on (name, desc) pairs so we
# need to zip/unzip.
skill_pairs = [(n, d) for n, d, _ in skill_triples]
key_by_pair = {(n, d): k for n, d, k in skill_triples}
skill_pairs = _clamp_command_names(skill_pairs, reserved_names)
# Clamp names; cmd_key is passed through as extra payload so it survives
# any clamp-induced renames.
skill_triples = _clamp_command_names(skill_triples, reserved_names)
# Skills fill remaining slots — only tier that gets trimmed
remaining = max(0, max_slots - len(all_entries))
hidden_count = max(0, len(skill_pairs) - remaining)
for n, d in skill_pairs[:remaining]:
all_entries.append((n, d, key_by_pair.get((n, d), "")))
hidden_count = max(0, len(skill_triples) - remaining)
for n, d, k in skill_triples[:remaining]:
all_entries.append((n, d, k))
return all_entries[:max_slots], hidden_count
@@ -721,24 +753,40 @@ def discord_skill_commands(
def discord_skill_commands_by_category(
reserved_names: set[str],
) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
"""Return skill entries organized by category for Discord ``/skill`` subcommand groups.
"""Return skill entries organized by category for Discord ``/skill`` autocomplete.
Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
Skills whose directory is nested at least 2 levels under a scan root
(e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
category. Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
*uncategorized* the caller should register them as direct subcommands
of the ``/skill`` group.
*uncategorized*.
The same filtering as :func:`discord_skill_commands` is applied: hub
skills excluded, per-platform disabled excluded, names clamped.
Scan roots include the local ``SKILLS_DIR`` **and** any configured
``skills.external_dirs`` matching the widened filter applied to the
flat ``discord_skill_commands()`` collector in #18741. Without this
parity, external-dir skills are visible via ``hermes skills list`` and
the agent's ``/skill-name`` dispatch but silently absent from Discord's
``/skill`` autocomplete.
Filtering mirrors :func:`discord_skill_commands`: hub skills excluded,
per-platform disabled excluded, names clamped to 32 chars, descriptions
clamped to 100 chars.
The legacy 25-group × 25-subcommand caps (from the old nested
``/skill <cat> <name>`` layout) are **not** applied the live caller
(``_register_skill_group`` in ``gateway/platforms/discord.py``, refactored
in PR #11580) flattens these results and feeds them into a single
autocomplete callback, which scales to thousands of entries without any
per-command payload concerns. ``hidden_count`` is retained in the return
tuple for backward compatibility and still reports skills dropped for
other reasons (32-char clamp collision vs a reserved name).
Returns:
``(categories, uncategorized, hidden_count)``
- *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
- *uncategorized*: ``[(name, description, cmd_key), ...]``
- *hidden_count*: skills dropped due to Discord group limits
(25 subcommand groups, 25 subcommands per group)
- *hidden_count*: skills dropped due to name clamp collisions
against already-registered command names.
"""
from pathlib import Path as _P
@@ -752,14 +800,33 @@ def discord_skill_commands_by_category(
# Collect raw skill data --------------------------------------------------
categories: dict[str, list[tuple[str, str, str]]] = {}
uncategorized: list[tuple[str, str, str]] = []
_names_used: set[str] = set(reserved_names)
# Map clamped-32-char-name → what it came from, so we can emit an
# actionable warning on collision. Reserved (gateway-builtin) command
# names are marked with a sentinel so the warning distinguishes
# "skill collided with a reserved command" from "two skills collided
# on the 32-char clamp" — the latter is the rename-worthy case.
_names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
hidden = 0
try:
from agent.skill_commands import get_skill_commands
from agent.skill_utils import get_external_skills_dirs
from tools.skills_tool import SKILLS_DIR
_skills_dir = SKILLS_DIR.resolve()
_hub_dir = (SKILLS_DIR / ".hub").resolve()
# Build list of (resolved_root, is_local) tuples. Each external dir
# becomes its own scan root for category derivation — a skill at
# ``<external>/mlops/foo/SKILL.md`` is still categorized as "mlops".
_scan_roots: list[_P] = [_skills_dir]
try:
for ext in get_external_skills_dirs():
try:
_scan_roots.append(_P(ext).resolve())
except Exception:
continue
except Exception:
pass
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
@@ -768,33 +835,72 @@ def discord_skill_commands_by_category(
if not skill_path:
continue
sp = _P(skill_path).resolve()
# Skip skills outside SKILLS_DIR or from the hub
if not str(sp).startswith(str(_skills_dir)):
continue
# Hub skills are loaded via the skill hub, not surfaced as
# slash commands.
if str(sp).startswith(str(_hub_dir)):
continue
# Accept skill if it lives under any scan root; record the
# matching root so we can derive the category correctly.
matched_root: _P | None = None
for root in _scan_roots:
try:
sp.relative_to(root)
except ValueError:
continue
matched_root = root
break
if matched_root is None:
continue
skill_name = info.get("name", "")
if skill_name in _platform_disabled:
continue
raw_name = cmd_key.lstrip("/")
# Clamp to 32 chars (Discord limit)
# Clamp to 32 chars (Discord per-command name limit)
discord_name = raw_name[:32]
if discord_name in _names_used:
# Two skills whose first 32 chars are identical. One wins
# (the first one seen, which is alphabetical because the
# caller iterates ``sorted(skill_cmds)``); the other is
# dropped from Discord's /skill autocomplete.
#
# Silently counting this as ``hidden`` (the old behavior)
# meant skill authors had no way to discover the drop —
# their skill just didn't appear in the picker. Emit a
# WARNING naming both sides so the author can rename the
# losing skill's frontmatter name to something with a
# distinct 32-char prefix.
prior = _names_used[discord_name]
if prior == "<reserved>":
logger.warning(
"Discord /skill: %r (from %r) collides on its 32-char "
"clamp with a reserved gateway command name %r — the "
"skill will not appear in the /skill autocomplete. "
"Rename the skill's frontmatter ``name:`` to differ "
"in its first 32 chars.",
discord_name, cmd_key, discord_name,
)
else:
logger.warning(
"Discord /skill: %r and %r both clamp to %r on "
"Discord's 32-char command-name limit — only %r "
"will appear in the /skill autocomplete. Rename "
"one skill's frontmatter ``name:`` to differ in "
"its first 32 chars.",
prior, cmd_key, discord_name, prior,
)
hidden += 1
continue
_names_used.add(discord_name)
_names_used[discord_name] = cmd_key
desc = info.get("description", "")
if len(desc) > 100:
desc = desc[:97] + "..."
# Determine category from the relative path within SKILLS_DIR.
# e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
try:
rel = sp.parent.relative_to(_skills_dir)
except ValueError:
continue
# Determine category from the relative path within the matched
# scan root. e.g. creative/ascii-art/SKILL.md → ("creative", ...)
rel = sp.parent.relative_to(matched_root)
parts = rel.parts
if len(parts) >= 2:
cat = parts[0]
@@ -804,28 +910,7 @@ def discord_skill_commands_by_category(
except Exception:
pass
# Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
_MAX_GROUPS = 25
_MAX_PER_GROUP = 25
trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
group_count = 0
for cat in sorted(categories):
if group_count >= _MAX_GROUPS:
hidden += len(categories[cat])
continue
entries = categories[cat][:_MAX_PER_GROUP]
hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
trimmed_categories[cat] = entries
group_count += 1
# Uncategorized skills also count against the 25 top-level limit
remaining_slots = _MAX_GROUPS - group_count
if len(uncategorized) > remaining_slots:
hidden += len(uncategorized) - remaining_slots
uncategorized = uncategorized[:remaining_slots]
return trimmed_categories, uncategorized, hidden
return categories, uncategorized, hidden
# ---------------------------------------------------------------------------
@@ -838,6 +923,13 @@ def discord_skill_commands_by_category(
_SLACK_MAX_SLASH_COMMANDS = 50
_SLACK_NAME_LIMIT = 32
_SLACK_INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
_SLACK_RESERVED_COMMANDS = frozenset({
# Built-in Slack slash commands that cannot be registered by apps.
# https://slack.com/help/articles/201259356-Use-built-in-slash-commands
"me", "status", "away", "dnd", "shrug", "remind", "msg", "feed",
"who", "collapse", "expand", "leave", "join", "open", "search",
"topic", "mute", "pro", "shortcuts",
})
def _sanitize_slack_name(raw: str) -> str:
@@ -864,6 +956,10 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
documented form (e.g. ``/background``, ``/bg``, and ``/btw`` all work).
Plugin-registered slash commands are included too.
Commands whose sanitized name collides with a Slack built-in
(e.g. ``/status``, ``/me``, ``/join``) are silently skipped. Users
can still reach them via ``/hermes <command>``.
Results are clamped to Slack's 50-command limit with duplicate-name
avoidance. ``/hermes`` is always reserved as the first entry so the
legacy ``/hermes <subcommand>`` form keeps working for anything that
@@ -881,6 +977,8 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
slack_name = _sanitize_slack_name(name)
if not slack_name or slack_name in seen:
return
if slack_name in _SLACK_RESERVED_COMMANDS:
return
if len(entries) >= _SLACK_MAX_SLASH_COMMANDS:
return
# Slack description cap is 2000 chars; keep it short.
@@ -1030,6 +1128,12 @@ class SlashCommandCompleter(Completer):
except Exception:
return {}
# Commands that open pickers when run without arguments.
# These should NOT receive a trailing space in completions because:
# - The TUI's submit handler applies completions on Enter if input differs
# - Adding space makes "/model" → "/model " which blocks picker execution
_PICKER_COMMANDS = frozenset({"model", "skin", "personality"})
@staticmethod
def _completion_text(cmd_name: str, word: str) -> str:
"""Return replacement text for a completion.
@@ -1038,8 +1142,17 @@ class SlashCommandCompleter(Completer):
returning ``help`` would be a no-op and prompt_toolkit suppresses the
menu. Appending a trailing space keeps the dropdown visible and makes
backspacing retrigger it naturally.
However, commands that open pickers (model, skin, personality) should
NOT get a trailing space the TUI would apply the completion on Enter
and block the picker from opening.
"""
return f"{cmd_name} " if cmd_name == word else cmd_name
if cmd_name != word:
return cmd_name
# Don't add space for picker commands — allows Enter to execute them
if cmd_name in SlashCommandCompleter._PICKER_COMMANDS:
return cmd_name
return f"{cmd_name} "
@staticmethod
def _extract_path_word(text: str) -> str | None:
+153 -20
View File
@@ -400,7 +400,12 @@ DEFAULT_CONFIG = {
# The gateway stops accepting new work, waits for running agents
# to finish, then interrupts any remaining runs after the timeout.
# 0 = no drain, interrupt immediately.
"restart_drain_timeout": 60,
#
# 180s is calibrated for realistic in-flight agent turns: a typical
# coding conversation mid-reasoning runs 60150s per call, so a 60s
# budget routinely interrupted legitimate work on /restart. Raise
# further in config.yaml if you run very-long-reasoning models.
"restart_drain_timeout": 180,
# Max app-level retry attempts for API errors (connection drops,
# provider timeouts, 5xx, etc.) before the agent surfaces the
# failure. The OpenAI SDK already does its own low-level retries
@@ -539,12 +544,25 @@ DEFAULT_CONFIG = {
# via TERMINAL_LOCAL_PERSISTENT env var.
"persistent_shell": True,
},
"web": {
"backend": "", # shared fallback — applies to both search and extract
"search_backend": "", # per-capability override for web_search (e.g. "searxng")
"extract_backend": "", # per-capability override for web_extract (e.g. "native")
},
"browser": {
"inactivity_timeout": 120,
"command_timeout": 30, # Timeout for browser commands in seconds (screenshot, navigate, etc.)
"record_sessions": False, # Auto-record browser sessions as WebM videos
"allow_private_urls": False, # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
# Browser engine for local mode. Passed as ``--engine <value>`` to
# agent-browser v0.25.3+.
# "auto" — use Chrome (default, don't pass --engine at all)
# "lightpanda" — use Lightpanda (1.3-5.8x faster navigation, no screenshots)
# "chrome" — explicitly request Chrome
# Also settable via AGENT_BROWSER_ENGINE env var.
"engine": "auto",
"auto_local_for_private_urls": True, # When a cloud provider is set, auto-spawn local Chromium for LAN/localhost URLs instead of sending them to the cloud
"cdp_url": "", # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
# CDP supervisor — dialog + frame detection via a persistent WebSocket.
@@ -562,21 +580,39 @@ DEFAULT_CONFIG = {
},
# Filesystem checkpoints — automatic snapshots before destructive file ops.
# When enabled, the agent takes a snapshot of the working directory once per
# conversation turn (on first write_file/patch call). Use /rollback to restore.
# When enabled, the agent takes a snapshot of the working directory once
# per conversation turn (on first write_file/patch call). Use /rollback
# to restore.
#
# Defaults changed in v2 (single shared shadow store, real pruning):
# - enabled: True -> False (opt-in; most users never use /rollback)
# - max_snapshots: 50 -> 20 (now actually enforced via ref rewrite)
# - auto_prune: False -> True (orphans/stale pruned automatically)
# Opt in via ``hermes chat --checkpoints`` or set enabled=True here.
"checkpoints": {
"enabled": True,
"max_snapshots": 50, # Max checkpoints to keep per directory
# Auto-maintenance: shadow repos accumulate forever under
# ~/.hermes/checkpoints/ (one per cd'd working directory). Field
# reports put the typical offender at 1000+ repos / ~12 GB. When
# auto_prune is on, hermes sweeps at startup (at most once per
# min_interval_hours) and deletes:
# * orphan repos: HERMES_WORKDIR no longer exists on disk
# * stale repos: newest mtime older than retention_days
# Opt-in so users who rely on /rollback against long-ago sessions
# never lose data silently.
"auto_prune": False,
"enabled": False,
# Max checkpoints to keep per working directory. Pre-v2 this only
# limited the `/rollback` listing; v2 actually rewrites the ref and
# garbage-collects older commits.
"max_snapshots": 20,
# Hard ceiling on total ``~/.hermes/checkpoints/`` size (MB). When
# exceeded, the oldest checkpoint per project is dropped in a
# round-robin pass until total size falls under the cap.
# 0 disables the size cap.
"max_total_size_mb": 500,
# Skip any single file larger than this when staging a checkpoint.
# Prevents accidental snapshotting of datasets, model weights, and
# other large generated assets. 0 disables the filter.
"max_file_size_mb": 10,
# Auto-maintenance: hermes sweeps the checkpoint base at startup
# (at most once per ``min_interval_hours``) and:
# * deletes project entries whose workdir no longer exists (orphan)
# * deletes project entries whose last_touch is older than
# ``retention_days``
# * GCs the single shared store to reclaim unreachable objects
# * enforces ``max_total_size_mb`` across remaining projects
# * deletes ``legacy-*`` archives older than ``retention_days``
"auto_prune": True,
"retention_days": 7,
"delete_orphans": True,
"min_interval_hours": 24,
@@ -639,6 +675,18 @@ DEFAULT_CONFIG = {
"cache_ttl": "5m",
},
# OpenRouter-specific settings.
# response_cache: enable OpenRouter response caching (X-OpenRouter-Cache header).
# When enabled, identical requests return cached responses for free (zero billing).
# This is separate from Anthropic prompt caching and works alongside it.
# See: https://openrouter.ai/docs/guides/features/response-caching
# response_cache_ttl: how long cached responses remain valid, in seconds (1-86400).
# Default 300 (5 minutes). Only used when response_cache is enabled.
"openrouter": {
"response_cache": True,
"response_cache_ttl": 300,
},
# AWS Bedrock provider configuration.
# Only used when model.provider is "bedrock".
"bedrock": {
@@ -761,9 +809,19 @@ DEFAULT_CONFIG = {
"show_reasoning": False,
"streaming": False,
"final_response_markdown": "strip", # render | strip | raw
# Preserve recent classic CLI output across Ctrl+L, /redraw, and
# terminal resize full-screen clears. Disable if a terminal emulator
# behaves badly with replayed scrollback.
"persistent_output": True,
"persistent_output_max_lines": 200,
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
# UI language for static user-facing messages (approval prompts, a
# handful of gateway slash-command replies). Does NOT affect agent
# responses, log lines, tool outputs, or slash-command descriptions.
# Supported: en, zh, ja, de, es, fr, tr, uk. Unknown values fall back to en.
"language": "en",
# TUI busy indicator style: kaomoji (default), emoji, unicode (braille
# spinner), or ascii. Live-swappable via `/indicator <style>`.
"tui_status_indicator": "kaomoji",
@@ -792,6 +850,7 @@ DEFAULT_CONFIG = {
"enabled": False,
"fields": ["model", "context_pct", "cwd"], # Order shown; drop any to hide
},
"copy_shortcut": "auto", # "auto" (platform default) | "ctrl_c" | "ctrl_shift_c" | "disabled"
},
# Web dashboard settings
@@ -825,7 +884,7 @@ DEFAULT_CONFIG = {
# Voices: alloy, echo, fable, onyx, nova, shimmer
},
"xai": {
"voice_id": "eve",
"voice_id": "eve", # or custom voice ID — see https://docs.x.ai/developers/model-capabilities/audio/custom-voices
"language": "en",
"sample_rate": 24000,
"bit_rate": 128000,
@@ -1022,6 +1081,14 @@ DEFAULT_CONFIG = {
# Archive a skill (move to skills/.archive/) after this many days
# without use. Archived skills are recoverable — no auto-deletion.
"archive_after_days": 90,
# Pre-run backup: before every real curator pass (dry-run is
# skipped), snapshot ~/.hermes/skills/ into
# ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz so the
# user can roll back with `hermes curator rollback`.
"backup": {
"enabled": True,
"keep": 5, # retain last N regular snapshots
},
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -1261,7 +1328,10 @@ DEFAULT_CONFIG = {
# for a single update run.
"pre_update_backup": False,
# How many pre-update backup zips to retain. Older ones are pruned
# automatically after each successful backup.
# automatically after each successful backup. Values below 1 are
# floored to 1 — the backup just created is always preserved. To
# disable backups entirely, set ``pre_update_backup: false`` above
# rather than ``backup_keep: 0``.
"backup_keep": 5,
},
@@ -1762,6 +1832,14 @@ OPTIONAL_ENV_VARS = {
"password": True,
"category": "tool",
},
"SEARXNG_URL": {
"description": "URL of your SearXNG instance for free self-hosted web search",
"prompt": "SearXNG URL (e.g. http://localhost:8080)",
"url": "https://searxng.github.io/searxng/",
"tools": ["web_search"],
"password": False,
"category": "tool",
},
"BROWSERBASE_API_KEY": {
"description": "Browserbase API key for cloud browser (optional — local browser works without this)",
"prompt": "Browserbase API key",
@@ -1793,6 +1871,15 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "tool",
},
"AGENT_BROWSER_ENGINE": {
"description": "Browser engine for local mode: auto (default Chrome), lightpanda (faster, no screenshots), chrome",
"prompt": "Browser engine (auto/lightpanda/chrome)",
"url": "https://github.com/vercel-labs/agent-browser",
"tools": ["browser_navigate", "browser_snapshot", "browser_click", "browser_vision"],
"password": False,
"category": "tool",
"advanced": True,
},
"CAMOFOX_URL": {
"description": "Camofox browser server URL for local anti-detection browsing (e.g. http://localhost:9377)",
"prompt": "Camofox server URL",
@@ -1871,7 +1958,7 @@ OPTIONAL_ENV_VARS = {
"LINEAR_API_KEY": {
"description": "Linear personal API key (used by the `linear` skill)",
"prompt": "Linear API key",
"url": "https://linear.app/settings/api",
"url": "https://linear.app/settings/account/security",
"password": True,
"category": "skill",
"advanced": True,
@@ -3918,6 +4005,7 @@ _FALLBACK_COMMENT = """
# kimi-coding-cn (KIMI_CN_API_KEY) — Kimi / Moonshot (China)
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
# bedrock (AWS IAM / boto3) — AWS Bedrock (Converse API)
#
# For custom OpenAI-compatible endpoints, add base_url and key_env.
#
@@ -3949,6 +4037,7 @@ _COMMENTED_SECTIONS = """
# kimi-coding-cn (KIMI_CN_API_KEY) — Kimi / Moonshot (China)
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
# bedrock (AWS IAM / boto3) — AWS Bedrock (Converse API)
#
# For custom OpenAI-compatible endpoints, add base_url and key_env.
#
@@ -4650,7 +4739,9 @@ def set_config_value(key: str, value: str):
"terminal.vercel_runtime": "TERMINAL_VERCEL_RUNTIME",
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"terminal.cwd": "TERMINAL_CWD",
# terminal.cwd intentionally excluded — CLI resolves at runtime,
# gateway bridges it in gateway/run.py. Persisting to .env causes
# stale values to poison child processes.
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
@@ -4804,3 +4895,45 @@ def config_command(args):
print(" hermes config path Show config file path")
print(" hermes config env-path Show .env file path")
sys.exit(1)
# ── Profile-driven env var injection ─────────────────────────────────────────
# Any provider registered in providers/ with auth_type="api_key" automatically
# gets its env_vars exposed in OPTIONAL_ENV_VARS without editing this file.
# Runs once at import time.
_profile_env_vars_injected = False
def _inject_profile_env_vars() -> None:
"""Populate OPTIONAL_ENV_VARS from provider profiles not already listed.
Called once at module load time. Idempotent repeated calls are no-ops.
"""
global _profile_env_vars_injected
if _profile_env_vars_injected:
return
_profile_env_vars_injected = True
try:
from providers import list_providers
for _pp in list_providers():
if _pp.auth_type not in ("api_key",):
continue
for _var in _pp.env_vars:
if _var in OPTIONAL_ENV_VARS:
continue
_is_key = not _var.endswith("_BASE_URL") and not _var.endswith("_URL")
OPTIONAL_ENV_VARS[_var] = {
"description": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL override'}",
"prompt": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL (leave empty for default)'}",
"url": _pp.signup_url or None,
"password": _is_key,
"category": "provider",
"advanced": True,
}
except Exception:
pass
# Eagerly inject so that OPTIONAL_ENV_VARS is fully populated at import time.
_inject_profile_env_vars()
+8
View File
@@ -93,6 +93,8 @@ def cron_list(show_all: bool = False):
script = job.get("script")
if script:
print(f" Script: {script}")
if job.get("no_agent"):
print(f" Mode: {color('no-agent', Colors.DIM)} (script stdout delivered directly)")
workdir = job.get("workdir")
if workdir:
print(f" Workdir: {workdir}")
@@ -172,6 +174,7 @@ def cron_create(args):
skills=_normalize_skills(getattr(args, "skill", None), getattr(args, "skills", None)),
script=getattr(args, "script", None),
workdir=getattr(args, "workdir", None),
no_agent=getattr(args, "no_agent", False) or None,
)
if not result.get("success"):
print(color(f"Failed to create job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -184,6 +187,8 @@ def cron_create(args):
job_data = result.get("job", {})
if job_data.get("script"):
print(f" Script: {job_data['script']}")
if job_data.get("no_agent"):
print(" Mode: no-agent (script stdout delivered directly)")
if job_data.get("workdir"):
print(f" Workdir: {job_data['workdir']}")
print(f" Next run: {result['next_run_at']}")
@@ -225,6 +230,7 @@ def cron_edit(args):
skills=final_skills,
script=getattr(args, "script", None),
workdir=getattr(args, "workdir", None),
no_agent=getattr(args, "no_agent", None),
)
if not result.get("success"):
print(color(f"Failed to update job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -240,6 +246,8 @@ def cron_edit(args):
print(" Skills: none")
if updated.get("script"):
print(f" Script: {updated['script']}")
if updated.get("no_agent"):
print(" Mode: no-agent (script stdout delivered directly)")
if updated.get("workdir"):
print(f" Workdir: {updated['workdir']}")
return 0
+280 -7
View File
@@ -160,7 +160,11 @@ def _cmd_run(args) -> int:
print("curator: disabled via config; enable with `curator.enabled: true`")
return 1
print("curator: running review pass...")
dry = bool(getattr(args, "dry_run", False))
if dry:
print("curator: running DRY-RUN (report only, no mutations)...")
else:
print("curator: running review pass...")
def _on_summary(msg: str) -> None:
print(msg)
@@ -168,17 +172,29 @@ def _cmd_run(args) -> int:
result = curator.run_curator_review(
on_summary=_on_summary,
synchronous=bool(args.synchronous),
dry_run=dry,
)
auto = result.get("auto_transitions", {})
if auto:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if dry:
print(
f"auto (preview): {auto.get('checked', 0)} candidate skill(s) "
"— no transitions applied in dry-run"
)
else:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if not args.synchronous:
print("llm pass running in background — check `hermes curator status` later")
if dry:
print(
"dry-run: no changes applied. When the report lands, read it with "
"`hermes curator status` and run `hermes curator run` (no flag) to apply."
)
return 0
@@ -229,6 +245,203 @@ def _cmd_restore(args) -> int:
return 0 if ok else 1
def _cmd_archive(args) -> int:
"""Manually archive an agent-created skill. Refuses if pinned.
The auto-curator archives stale skills on its own schedule; this verb is
for the user who wants to archive *now* without waiting for a run.
"""
from tools import skill_usage
if skill_usage.get_record(args.skill).get("pinned"):
print(
f"curator: '{args.skill}' is pinned — unpin first with "
f"`hermes curator unpin {args.skill}`"
)
return 1
ok, msg = skill_usage.archive_skill(args.skill)
print(f"curator: {msg}")
return 0 if ok else 1
def _idle_days(record: dict) -> Optional[int]:
"""Days since the skill's last activity (view / use / patch).
Falls back to ``created_at`` so a skill that was authored but never used
can still be pruned otherwise never-touched skills would be immortal.
Returns None only when both fields are missing or unparseable.
"""
ts = record.get("last_activity_at") or record.get("created_at")
if not ts:
return None
try:
dt = datetime.fromisoformat(str(ts))
except (TypeError, ValueError):
return None
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return max(0, (datetime.now(timezone.utc) - dt).days)
def _cmd_prune(args) -> int:
"""Bulk-archive agent-created skills idle for >= N days.
Pinned skills are exempt. Already-archived skills are skipped. Default
``--days 90`` matches a conservative read of the curator's own archive
threshold; adjust with ``--days``. Use ``--dry-run`` to preview.
"""
from tools import skill_usage
days = getattr(args, "days", 90)
if days < 1:
print(f"curator: --days must be >= 1 (got {days})", file=sys.stderr)
return 2
dry_run = bool(getattr(args, "dry_run", False))
skip_confirm = bool(getattr(args, "yes", False))
candidates = []
for r in skill_usage.agent_created_report():
if r.get("pinned"):
continue
if r.get("state") == skill_usage.STATE_ARCHIVED:
continue
idle = _idle_days(r)
if idle is None or idle < days:
continue
candidates.append((r["name"], idle))
if not candidates:
print(f"curator: nothing to prune (no unpinned skills idle >= {days}d)")
return 0
candidates.sort(key=lambda c: -c[1])
print(f"curator: {len(candidates)} skill(s) idle >= {days}d:")
for name, idle in candidates:
print(f" {name:40s} idle {idle}d")
if dry_run:
print("\n(dry run — no changes made)")
return 0
if not skip_confirm:
try:
reply = input(f"\nArchive {len(candidates)} skill(s)? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\ncurator: aborted")
return 1
if reply not in ("y", "yes"):
print("curator: aborted")
return 1
archived = 0
failures = []
for name, _ in candidates:
ok, msg = skill_usage.archive_skill(name)
if ok:
archived += 1
else:
failures.append((name, msg))
print(f"\ncurator: archived {archived}/{len(candidates)}")
if failures:
print("failures:")
for name, msg in failures:
print(f" {name}: {msg}")
return 1
return 0
def _cmd_backup(args) -> int:
"""Take a manual snapshot of the skills tree. Same mechanism as the
automatic pre-run snapshot, just user-initiated."""
from agent import curator_backup
if not curator_backup.is_enabled():
print(
"curator: backups are disabled via config "
"(`curator.backup.enabled: false`); re-enable to snapshot"
)
return 1
reason = getattr(args, "reason", None) or "manual"
snap = curator_backup.snapshot_skills(reason=reason)
if snap is None:
print("curator: snapshot failed — check logs (backup disabled or IO error)")
return 1
print(f"curator: snapshot created at ~/.hermes/skills/.curator_backups/{snap.name}")
return 0
def _cmd_rollback(args) -> int:
"""Restore the skills tree from a snapshot. Defaults to newest.
``--list`` prints available snapshots and exits. ``--id <stamp>`` picks
a specific one. Without ``-y``, prompts for confirmation. A safety
snapshot of the current tree is always taken first, so rollbacks are
themselves undoable.
"""
from agent import curator_backup
if getattr(args, "list", False):
print(curator_backup.summarize_backups())
return 0
backup_id = getattr(args, "backup_id", None)
target_path = curator_backup._resolve_backup(backup_id)
if target_path is None:
rows = curator_backup.list_backups()
if not rows:
print(
"curator: no snapshots exist yet. Take one with "
"`hermes curator backup` or wait for the next curator run."
)
else:
print(
f"curator: no snapshot matching "
f"{'id ' + repr(backup_id) if backup_id else 'your query'}."
)
print("Available:")
print(curator_backup.summarize_backups())
return 1
manifest = curator_backup._read_manifest(target_path)
print(f"Rollback target: {target_path.name}")
if manifest:
print(f" reason: {manifest.get('reason', '?')}")
print(f" created_at: {manifest.get('created_at', '?')}")
print(f" skill files: {manifest.get('skill_files', '?')}")
cron = manifest.get("cron_jobs") or {}
if isinstance(cron, dict):
if cron.get("backed_up"):
print(
f" cron jobs: {cron.get('jobs_count', 0)} "
f"(will be restored for skill-link fields only)"
)
else:
reason = cron.get("reason", "not captured")
print(f" cron jobs: not in snapshot ({reason})")
print(
"\nThis will replace the current ~/.hermes/skills/ tree (a safety "
"snapshot of the current state is taken first so this is undoable). "
"Cron jobs that still exist will have their skills/skill fields "
"restored from the snapshot; all other cron fields are left alone."
)
if not getattr(args, "yes", False):
try:
ans = input("Proceed? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\ncancelled")
return 1
if ans not in ("y", "yes"):
print("cancelled")
return 1
ok, msg, _ = curator_backup.rollback(backup_id=target_path.name)
if ok:
print(f"curator: {msg}")
return 0
print(f"curator: rollback failed — {msg}")
return 1
# ---------------------------------------------------------------------------
# argparse wiring (called from hermes_cli.main)
# ---------------------------------------------------------------------------
@@ -250,6 +463,11 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
"--sync", "--synchronous", dest="synchronous", action="store_true",
help="Wait for the LLM review pass to finish (default: background thread)",
)
p_run.add_argument(
"--dry-run", dest="dry_run", action="store_true",
help="Report only — no state changes, no archives, no consolidation "
"(use this to preview what curator would do)",
)
p_run.set_defaults(func=_cmd_run)
p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
@@ -270,6 +488,61 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
p_restore.add_argument("skill", help="Skill name")
p_restore.set_defaults(func=_cmd_restore)
p_archive = subs.add_parser(
"archive",
help="Manually archive a skill (move to .archive/, excluded from prompt)",
)
p_archive.add_argument("skill", help="Skill name")
p_archive.set_defaults(func=_cmd_archive)
p_prune = subs.add_parser(
"prune",
help="Bulk-archive agent-created skills idle for >= N days (default 90)",
)
p_prune.add_argument(
"--days", type=int, default=90,
help="Archive skills idle for at least N days (default: 90)",
)
p_prune.add_argument(
"-y", "--yes", action="store_true",
help="Skip the confirmation prompt",
)
p_prune.add_argument(
"--dry-run", dest="dry_run", action="store_true",
help="Show what would be archived without doing it",
)
p_prune.set_defaults(func=_cmd_prune)
p_backup = subs.add_parser(
"backup",
help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
"(curator also does this automatically before every real run)",
)
p_backup.add_argument(
"--reason", default=None,
help="Free-text label stored in manifest.json (default: 'manual')",
)
p_backup.set_defaults(func=_cmd_backup)
p_rollback = subs.add_parser(
"rollback",
help="Restore ~/.hermes/skills/ from a curator snapshot "
"(defaults to the newest)",
)
p_rollback.add_argument(
"--list", action="store_true",
help="List available snapshots and exit without restoring",
)
p_rollback.add_argument(
"--id", dest="backup_id", default=None,
help="Snapshot id to restore (see `--list`); default: newest",
)
p_rollback.add_argument(
"-y", "--yes", action="store_true",
help="Skip confirmation prompt",
)
p_rollback.set_defaults(func=_cmd_rollback)
def cli_main(argv=None) -> int:
"""Standalone entry (also usable by hermes_cli.main fallthrough)."""
+6
View File
@@ -156,6 +156,8 @@ def curses_checklist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
@@ -278,6 +280,8 @@ def curses_radiolist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _radio_numbered_fallback(title, items, selected, cancel_returns)
@@ -401,6 +405,8 @@ def curses_single_select(
return None
return result_holder[0]
except KeyboardInterrupt:
return None
except Exception:
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
+82 -7
View File
@@ -1,12 +1,19 @@
"""``hermes debug`` debug tools for Hermes Agent.
"""``hermes debug`` debug tools for Hermes Agent.
Currently supports:
hermes debug share Upload debug report (system info + logs) to a
paste service and print a shareable URL.
By default, log content is run through
``agent.redact.redact_sensitive_text`` with
``force=True`` before upload so credentials in
``~/.hermes/logs/*.log`` are not leaked into
the public paste service. Pass ``--no-redact``
to disable.
"""
import io
import json
import logging
import sys
import time
import urllib.error
@@ -19,6 +26,16 @@ from typing import Optional
from hermes_constants import get_hermes_home
from utils import atomic_replace
logger = logging.getLogger(__name__)
# Banner prepended to upload-bound log content when redaction is enabled.
# Visible in the public paste so reviewers know the content was sanitized.
# Kept short; the trailing newline guarantees the banner sits on its own line.
_REDACTION_BANNER = (
"[hermes debug share: log content redacted at upload time. "
"run with --no-redact to disable]\n"
)
# ---------------------------------------------------------------------------
# Paste services — try paste.rs first, dpaste.com as fallback.
@@ -368,17 +385,40 @@ def _resolve_log_path(log_name: str) -> Optional[Path]:
return None
def _redact_log_text(text: str) -> str:
"""Run ``redact_sensitive_text`` with ``force=True`` over upload-bound text.
Uses ``force=True`` so redaction fires regardless of the operator's
``security.redact_secrets`` setting. The local on-disk log file is
not modified; only the in-memory copy headed for the public paste
service is sanitized. Returns the redacted text (or the original
when empty / non-string).
"""
if not text:
return text
from agent.redact import redact_sensitive_text
return redact_sensitive_text(text, force=True)
def _capture_log_snapshot(
log_name: str,
*,
tail_lines: int,
max_bytes: int = _MAX_LOG_BYTES,
redact: bool = True,
) -> LogSnapshot:
"""Capture a log once and derive summary/full-log views from it.
The report tail and standalone log upload must come from the same file
snapshot. Otherwise a rotation/truncate between reads can make the report
look newer than the uploaded ``agent.log`` paste.
When ``redact`` is True (the default), both ``tail_text`` and
``full_text`` are run through ``_redact_log_text`` so the snapshot
returned is upload-safe. The on-disk log file is never modified.
Pass ``redact=False`` to capture original log content (used by
``hermes debug share --no-redact``).
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
@@ -438,18 +478,34 @@ def _capture_log_snapshot(
if truncated:
full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"
if redact:
tail_text = _redact_log_text(tail_text)
full_text = _redact_log_text(full_text)
return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
except Exception as exc:
return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)
def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once."""
def _capture_default_log_snapshots(
log_lines: int, *, redact: bool = True
) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once.
``redact`` is forwarded to each ``_capture_log_snapshot`` call so all
captured logs share the same redaction policy for a given run.
"""
errors_lines = min(log_lines, 100)
return {
"agent": _capture_log_snapshot("agent", tail_lines=log_lines),
"errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
"gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
"agent": _capture_log_snapshot(
"agent", tail_lines=log_lines, redact=redact
),
"errors": _capture_log_snapshot(
"errors", tail_lines=errors_lines, redact=redact
),
"gateway": _capture_log_snapshot(
"gateway", tail_lines=errors_lines, redact=redact
),
}
@@ -532,6 +588,7 @@ def run_debug_share(args):
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
redact = not getattr(args, "no_redact", False)
if not local_only:
print(_PRIVACY_NOTICE)
@@ -539,8 +596,16 @@ def run_debug_share(args):
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
# The dump is already redacted at extract time via dump.py:_redact;
# log_snapshots are redacted by _capture_default_log_snapshots when
# redact=True so credentials never reach the public paste service.
dump_text = _capture_dump()
log_snapshots = _capture_default_log_snapshots(log_lines)
log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
if redact:
logger.info(
"hermes debug share: applied force-mode redaction to log snapshots before upload"
)
report = collect_debug_report(
log_lines=log_lines,
@@ -556,6 +621,15 @@ def run_debug_share(args):
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
# Visible banner so reviewers reading the public paste know redaction
# was applied at upload time. Banner is omitted under --no-redact.
if redact:
report = _REDACTION_BANNER + report
if agent_log:
agent_log = _REDACTION_BANNER + agent_log
if gateway_log:
gateway_log = _REDACTION_BANNER + gateway_log
if local_only:
print(report)
if agent_log:
@@ -666,6 +740,7 @@ def run_debug(args):
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
print(" --no-redact Disable upload-time secret redaction (default: redact)")
print()
print("Options (delete):")
print(" <url> ... One or more paste URLs to delete")
+132 -35
View File
@@ -12,6 +12,7 @@ import importlib.util
from pathlib import Path
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
from hermes_cli.env_loader import load_hermes_dotenv
from hermes_constants import display_hermes_home
PROJECT_ROOT = get_project_root()
@@ -19,15 +20,8 @@ HERMES_HOME = get_hermes_home()
_DHH = display_hermes_home() # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)
# Load environment variables from ~/.hermes/.env so API key checks work
from dotenv import load_dotenv
_env_path = get_env_path()
if _env_path.exists():
try:
load_dotenv(_env_path, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(_env_path, encoding="latin-1")
# Also try project .env as dev fallback
load_dotenv(PROJECT_ROOT / ".env", override=False, encoding="utf-8")
load_hermes_dotenv(hermes_home=_env_path.parent, project_env=PROJECT_ROOT / ".env")
from hermes_cli.colors import Colors, color
from hermes_cli.models import _HERMES_USER_AGENT
@@ -113,15 +107,35 @@ def _honcho_is_configured_for_doctor() -> bool:
return False
def _is_kanban_worker_env_gate(item: dict) -> bool:
"""Return True when Kanban is unavailable only because this is not a worker process."""
if item.get("name") != "kanban":
return False
if os.environ.get("HERMES_KANBAN_TASK"):
return False
tools = item.get("tools") or []
return bool(tools) and all(str(tool).startswith("kanban_") for tool in tools)
def _doctor_tool_availability_detail(toolset: str) -> str:
"""Optional explanatory suffix for toolsets whose doctor status needs context."""
if toolset == "kanban" and not os.environ.get("HERMES_KANBAN_TASK"):
return "(runtime-gated; loaded only for dispatcher-spawned workers)"
return ""
def _apply_doctor_tool_availability_overrides(available: list[str], unavailable: list[dict]) -> tuple[list[str], list[dict]]:
"""Adjust runtime-gated tool availability for doctor diagnostics."""
if not _honcho_is_configured_for_doctor():
return available, unavailable
updated_available = list(available)
updated_unavailable = []
for item in unavailable:
if item.get("name") == "honcho":
name = item.get("name")
if _is_kanban_worker_env_gate(item):
if "kanban" not in updated_available:
updated_available.append("kanban")
continue
if name == "honcho" and _honcho_is_configured_for_doctor():
if "honcho" not in updated_available:
updated_available.append("honcho")
continue
@@ -175,6 +189,85 @@ def _check_gateway_service_linger(issues: list[str]) -> None:
check_warn("Could not verify systemd linger", f"({linger_detail})")
_APIKEY_PROVIDERS_CACHE: list | None = None
def _build_apikey_providers_list() -> list:
"""Build the API-key provider health-check list once and cache it.
Tuple format: (name, env_vars, default_url, base_env, supports_models_endpoint)
Base list augmented with any ProviderProfile with auth_type="api_key" not
already present adding plugins/model-providers/<name>/ is sufficient to get into doctor.
"""
_static = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("GMI Cloud", ("GMI_API_KEY",), "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax global: /v1 endpoint supports /models.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
# MiniMax CN: /v1 endpoint does NOT support /models (returns 404).
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", False),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
]
_known_names = {t[0] for t in _static}
# Also index by profile canonical name so profiles without display_name
# don't create duplicate entries for providers already in the static list.
_known_canonical: set[str] = set()
_name_to_canonical = {
"Z.AI / GLM": "zai", "Kimi / Moonshot": "kimi-coding",
"StepFun Step Plan": "stepfun", "Kimi / Moonshot (China)": "kimi-coding-cn",
"Arcee AI": "arcee", "GMI Cloud": "gmi", "DeepSeek": "deepseek",
"Hugging Face": "huggingface", "NVIDIA NIM": "nvidia",
"Alibaba/DashScope": "alibaba", "MiniMax": "minimax",
"MiniMax (China)": "minimax-cn", "Vercel AI Gateway": "ai-gateway",
"Kilo Code": "kilocode", "OpenCode Zen": "opencode-zen",
"OpenCode Go": "opencode-go",
}
for _label, _canonical in _name_to_canonical.items():
_known_canonical.add(_canonical)
try:
from providers import list_providers
from providers.base import ProviderProfile as _PP
for _pp in list_providers():
if not isinstance(_pp, _PP) or _pp.auth_type != "api_key" or not _pp.env_vars:
continue
_label = _pp.display_name or _pp.name
if _label in _known_names or _pp.name in _known_canonical:
continue
# Separate API-key vars from base-URL override vars — the health-check
# loop sends the first found value as Authorization: Bearer, so a URL
# string must never be picked.
_key_vars = tuple(
v for v in _pp.env_vars
if not v.endswith("_BASE_URL") and not v.endswith("_URL")
)
_base_var = next(
(v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")),
None,
)
if not _key_vars:
continue
_models_url = (
(_pp.models_url or (_pp.base_url.rstrip("/") + "/models"))
if _pp.base_url else None
)
_static.append((_label, _key_vars, _models_url, _base_var, True))
except Exception:
pass
return _static
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, 'fix', False)
@@ -263,8 +356,11 @@ def run_doctor(args):
if env_path.exists():
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
# Check for common issues. Pin encoding to UTF-8 because .env files are
# written as UTF-8 everywhere in the codebase, while Path.read_text()
# defaults to the system locale — which crashes on non-UTF-8 Windows
# locales (e.g. GBK) as soon as the file contains any non-ASCII byte.
content = env_path.read_text(encoding="utf-8")
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
@@ -932,6 +1028,8 @@ def run_doctor(args):
agent_browser_path = PROJECT_ROOT / "node_modules" / "agent-browser"
if agent_browser_path.exists():
check_ok("agent-browser (Node.js)", "(browser automation)")
elif shutil.which("agent-browser"):
check_ok("agent-browser", "(browser automation)")
else:
if _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
@@ -1082,26 +1180,11 @@ def run_doctor(args):
# -- API-key providers --
# Tuple: (name, env_vars, default_url, base_env, supports_models_endpoint)
# If supports_models_endpoint is False, we skip the health check and just show "configured"
_apikey_providers = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("GMI Cloud", ("GMI_API_KEY",), "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
]
# Cached at module level after first build — profiles auto-extend it.
global _APIKEY_PROVIDERS_CACHE
if _APIKEY_PROVIDERS_CACHE is None:
_APIKEY_PROVIDERS_CACHE = _build_apikey_providers_list()
_apikey_providers = _APIKEY_PROVIDERS_CACHE
for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
_key = ""
for _ev in _env_vars:
@@ -1215,7 +1298,7 @@ def run_doctor(args):
for tid in available:
info = TOOLSET_REQUIREMENTS.get(tid, {})
check_ok(info.get("name", tid))
check_ok(info.get("name", tid), _doctor_tool_availability_detail(tid))
for item in unavailable:
env_vars = item.get("missing_vars") or item.get("env_vars") or []
@@ -1258,9 +1341,23 @@ def run_doctor(args):
check_warn("Skills Hub directory not initialized", "(run: hermes skills list)")
from hermes_cli.config import get_env_value
def _gh_authenticated() -> bool:
"""Check if gh CLI is authenticated via token file or device flow."""
try:
result = subprocess.run(
["gh", "auth", "status", "--json", "authenticated"],
capture_output=True, timeout=10,
)
return result.returncode == 0
except (FileNotFoundError, subprocess.TimeoutExpired):
return False
github_token = get_env_value("GITHUB_TOKEN") or get_env_value("GH_TOKEN")
if github_token:
check_ok("GitHub token configured (authenticated API access)")
elif _gh_authenticated():
check_ok("GitHub authenticated via gh CLI", "(full API access — no GITHUB_TOKEN needed)")
else:
check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
+5 -8
View File
@@ -14,6 +14,7 @@ import sys
from pathlib import Path
from hermes_cli.config import get_hermes_home, get_env_path, get_project_root, load_config
from hermes_cli.env_loader import load_hermes_dotenv
from hermes_constants import display_hermes_home
@@ -195,15 +196,11 @@ def run_dump(args):
show_keys = getattr(args, "show_keys", False)
# Load env from .env file so key checks work
from dotenv import load_dotenv
env_path = get_env_path()
if env_path.exists():
try:
load_dotenv(env_path, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(env_path, encoding="latin-1")
# Also try project .env as dev fallback
load_dotenv(get_project_root() / ".env", override=False, encoding="utf-8")
load_hermes_dotenv(
hermes_home=env_path.parent,
project_env=get_project_root() / ".env",
)
project_root = get_project_root()
hermes_home = get_hermes_home()
+155 -11
View File
@@ -188,7 +188,7 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
seconds), then exits with code 75. Both systemd (``Restart=on-failure``
seconds), then exits with code 75. Both systemd (``Restart=always``
+ ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
= false``) relaunch the process after the graceful exit.
@@ -237,6 +237,26 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
return False
def _get_ancestor_pids() -> set[int]:
"""Return the set of PIDs in the current process's ancestor chain.
Walks from the current PID up to PID 1 (init) so that process-table scans
never match the calling CLI process or any of its parents. This prevents
``hermes gateway status`` from falsely counting the ``hermes`` CLI that
invoked it as a running gateway instance (see #13242).
"""
ancestors: set[int] = set()
pid = os.getpid()
# Cap iterations to avoid infinite loops on exotic platforms.
for _ in range(64):
ancestors.add(pid)
parent = _get_parent_pid(pid)
if parent is None or parent <= 0 or parent in ancestors:
break
pid = parent
return ancestors
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
@@ -252,6 +272,10 @@ def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> li
a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
discover gateways outside the current profile.
"""
# Exclude the entire ancestor chain so the CLI process that invoked this
# scan (e.g. ``hermes gateway status``) is never mistaken for a running
# gateway. See #13242.
exclude_pids = exclude_pids | _get_ancestor_pids()
pids: list[int] = []
patterns = [
"hermes_cli.main gateway",
@@ -690,6 +714,32 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
print(" can refuse to start another copy until this process stops.")
def _print_other_profiles_gateway_status() -> None:
"""Print a summary of gateway status across all profiles.
Shown at the bottom of ``hermes gateway status`` output so users with
multiple profiles can tell at a glance which gateways are running and
avoid confusing another profile's process with the current one.
"""
try:
from hermes_cli.profiles import get_active_profile_name
current = get_active_profile_name()
other_processes = [
p for p in find_profile_gateway_processes()
if p.profile != current
]
if not other_processes:
return
print()
print("Other profiles:")
for proc in other_processes:
print(f"{proc.profile:<16s} — PID {proc.pid}")
except Exception:
pass
def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
all_profiles: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed.
@@ -735,6 +785,12 @@ def stop_profile_gateway() -> bool:
if pid is None:
return False
try:
from gateway.status import write_planned_stop_marker
write_planned_stop_marker(pid)
except Exception:
pass
try:
os.kill(pid, signal.SIGTERM)
except ProcessLookupError:
@@ -1558,6 +1614,46 @@ def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
return [p for p in candidates if p not in path_entries and Path(p).exists()]
def _build_wsl_interop_paths(path_entries: list[str]) -> list[str]:
"""Return WSL Windows interop PATH entries for generated systemd units.
WSL shells normally inherit Windows PATH entries such as
``/mnt/c/WINDOWS/System32``. systemd user services do not, so gateway tools
that call ``powershell.exe``/``cmd.exe`` work in a terminal but fail in the
background service unless we persist the relevant entries at install time.
"""
if not is_wsl():
return []
candidates: list[str] = []
for entry in os.environ.get("PATH", "").split(os.pathsep):
if entry.startswith("/mnt/"):
candidates.append(entry)
for executable in ("powershell.exe", "cmd.exe", "explorer.exe", "wsl.exe"):
resolved = shutil.which(executable)
if resolved:
candidates.append(str(Path(resolved).parent))
for entry in (
"/mnt/c/WINDOWS/system32",
"/mnt/c/WINDOWS",
"/mnt/c/WINDOWS/System32/Wbem",
"/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/",
"/mnt/c/WINDOWS/System32/OpenSSH/",
):
if Path(entry).exists():
candidates.append(entry)
result: list[str] = []
seen = set(path_entries)
for entry in candidates:
if entry and entry not in seen:
seen.add(entry)
result.append(entry)
return result
def _remap_path_for_user(path: str, target_home_dir: str) -> str:
"""Remap *path* from the current user's home to *target_home_dir*.
@@ -1649,14 +1745,14 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
node_bin = _remap_path_for_user(node_bin, home_dir)
path_entries = [_remap_path_for_user(p, home_dir) for p in path_entries]
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
path_entries.extend(_build_wsl_interop_paths(path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=600
StartLimitBurst=5
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1670,8 +1766,10 @@ Environment="LOGNAME={username}"
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -1687,13 +1785,14 @@ WantedBy=multi-user.target
hermes_home = str(get_hermes_home().resolve())
profile_arg = _profile_arg(hermes_home)
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
path_entries.extend(_build_wsl_interop_paths(path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
StartLimitIntervalSec=600
StartLimitBurst=5
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1702,8 +1801,10 @@ WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -1918,6 +2019,15 @@ def systemd_uninstall(system: bool = False):
print(f"{_service_scope_label(system).capitalize()} service uninstalled")
def _require_service_installed(action: str, system: bool = False) -> None:
unit_path = get_systemd_unit_path(system=system)
if not unit_path.exists():
scope_flag = " --system" if system else ""
print(f"✗ Gateway service is not installed")
print(f" Run: {'sudo ' if system else ''}hermes gateway install{scope_flag}")
sys.exit(1)
def systemd_start(system: bool = False):
system = _select_systemd_scope(system)
if system:
@@ -1927,6 +2037,7 @@ def systemd_start(system: bool = False):
# reachable (common on fresh RHEL/Debian SSH sessions without linger).
# Raises UserSystemdUnavailableError with a remediation message.
_preflight_user_systemd()
_require_service_installed("start", system=system)
refresh_systemd_unit_if_needed(system=system)
_run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -1937,6 +2048,14 @@ def systemd_stop(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("stop")
_require_service_installed("stop", system=system)
try:
from gateway.status import get_running_pid, write_planned_stop_marker
pid = get_running_pid(cleanup_stale=False)
if pid is not None:
write_planned_stop_marker(pid)
except Exception:
pass
_run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
print(f"{_service_scope_label(system).capitalize()} service stopped")
@@ -1948,6 +2067,7 @@ def systemd_restart(system: bool = False):
_require_root_for_system_service("restart")
else:
_preflight_user_systemd()
_require_service_installed("restart", system=system)
refresh_systemd_unit_if_needed(system=system)
from gateway.status import get_running_pid
@@ -2297,6 +2417,13 @@ def launchd_start():
def launchd_stop():
label = get_launchd_label()
target = f"{_launchd_domain()}/{label}"
try:
from gateway.status import get_running_pid, write_planned_stop_marker
pid = get_running_pid(cleanup_stale=False)
if pid is not None:
write_planned_stop_marker(pid)
except Exception:
pass
# bootout unloads the service definition so KeepAlive doesn't respawn
# the process. A plain `kill SIGTERM` only signals the process — launchd
# immediately restarts it because KeepAlive.SuccessfulExit = false.
@@ -2439,6 +2566,20 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
hasn't fully exited yet.
"""
sys.path.insert(0, str(PROJECT_ROOT))
# Refresh the systemd unit definition on every boot so that restart
# settings (RestartSec, StartLimitIntervalSec, etc.) stay current even
# when the process was respawned via exit-code-75 (stale-code or
# /restart) rather than through `hermes gateway restart` which already
# calls refresh_systemd_unit_if_needed(). Without this, a code update
# that ships new unit settings won't take effect until the next manual
# `hermes gateway start/restart` — leaving the gateway vulnerable to
# the exact failure mode the new settings were meant to prevent.
if supports_systemd_services():
try:
refresh_systemd_unit_if_needed(system=False)
except Exception:
pass # best-effort; don't block gateway startup
from gateway.run import start_gateway
@@ -2451,7 +2592,7 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
print()
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
# so systemd Restart=always will retry on transient errors
verbosity = None if quiet else verbose
try:
success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
@@ -4453,6 +4594,9 @@ def _gateway_command_inner(args):
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
# Show other profiles' gateway status for multi-profile awareness
_print_other_profiles_gateway_status()
elif subcmd == "migrate-legacy":
# Stop, disable, and remove legacy Hermes gateway unit files from
# pre-rename installs (e.g. hermes.service). Profile units and
+654 -11
View File
@@ -169,11 +169,93 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
"or docs/hermes-kanban-v1-spec.pdf for the full design."
),
)
# --- global --board flag ---
# Applies to every subcommand below. When set, scopes all reads and
# writes to that board's DB. When omitted, resolves via the
# HERMES_KANBAN_BOARD env var, then the persisted current-board
# file, then "default". See kanban_db.get_current_board().
kanban_parser.add_argument(
"--board",
default=None,
metavar="<slug>",
help=(
"Board slug to operate on. Defaults to the current board "
"(set via `hermes kanban boards switch <slug>` or the "
"HERMES_KANBAN_BOARD env var). Use `hermes kanban boards list` "
"to see all boards."
),
)
sub = kanban_parser.add_subparsers(dest="kanban_action")
# --- init ---
sub.add_parser("init", help="Create kanban.db if missing (idempotent)")
# --- boards (new in v2: multi-project support) ---
p_boards = sub.add_parser(
"boards",
help="Manage kanban boards (one board per project / workstream)",
description=(
"Boards let you separate unrelated streams of work "
"(projects, repos, domains) into isolated queues. Each "
"board has its own DB, workspaces directory, and dispatcher "
"loop — tasks on one board cannot collide with tasks on "
"another. The first board is 'default' and always exists."
),
)
boards_sub = p_boards.add_subparsers(dest="boards_action")
b_list = boards_sub.add_parser(
"list", aliases=["ls"],
help="List all boards with task counts",
)
b_list.add_argument("--json", action="store_true")
b_list.add_argument("--all", action="store_true",
help="Include archived boards too")
b_create = boards_sub.add_parser(
"create", aliases=["new"],
help="Create a new board",
)
b_create.add_argument("slug",
help="Board slug (kebab-case, e.g. atm10-server)")
b_create.add_argument("--name", default=None,
help="Human-readable display name (defaults to Title Case of slug)")
b_create.add_argument("--description", default=None,
help="Optional description")
b_create.add_argument("--icon", default=None,
help="Optional emoji or single-character icon for the dashboard")
b_create.add_argument("--color", default=None,
help="Optional hex color (e.g. '#8b5cf6') for the dashboard")
b_create.add_argument("--switch", action="store_true",
help="Switch to the new board after creating it")
b_rm = boards_sub.add_parser(
"rm", aliases=["remove", "delete"],
help="Archive (default) or delete a board",
)
b_rm.add_argument("slug")
b_rm.add_argument("--delete", action="store_true",
help="Hard-delete the board directory instead of archiving it. "
"Default is to move it to boards/_archived/ so it's recoverable.")
b_switch = boards_sub.add_parser(
"switch", aliases=["use"],
help="Set the active board for subsequent CLI calls",
)
b_switch.add_argument("slug")
boards_sub.add_parser(
"show", aliases=["current"],
help="Print the currently-active board slug",
)
b_rename = boards_sub.add_parser(
"rename",
help="Change a board's human-readable display name (slug is immutable)",
)
b_rename.add_argument("slug")
b_rename.add_argument("name", help="New display name")
# --- create ---
p_create = sub.add_parser("create", help="Create a new task")
p_create.add_argument("title", help="Task title")
@@ -226,6 +308,57 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
p_assign.add_argument("task_id")
p_assign.add_argument("profile", help="Profile name (or 'none' to unassign)")
# --- reclaim / reassign (recovery) ---
p_reclaim = sub.add_parser(
"reclaim",
help="Release an active worker claim on a running task",
)
p_reclaim.add_argument("task_id")
p_reclaim.add_argument(
"--reason", default=None,
help="Human-readable reason (recorded on the reclaimed event)",
)
p_reassign = sub.add_parser(
"reassign",
help="Reassign a task to a different profile, optionally reclaiming first",
)
p_reassign.add_argument("task_id")
p_reassign.add_argument(
"profile",
help="New profile name (or 'none' to unassign)",
)
p_reassign.add_argument(
"--reclaim", action="store_true",
help="Release any active claim before reassigning (required if task is running)",
)
p_reassign.add_argument(
"--reason", default=None,
help="Human-readable reason (recorded on the reclaimed event)",
)
# --- diagnostics (board-wide health) ---
p_diag = sub.add_parser(
"diagnostics",
aliases=["diag"],
help="List active diagnostics on the current board",
)
p_diag.add_argument(
"--severity",
choices=["warning", "error", "critical"],
default=None,
help="Only show diagnostics at or above this severity",
)
p_diag.add_argument(
"--task",
default=None,
help="Only show diagnostics for one task id",
)
p_diag.add_argument(
"--json", action="store_true",
help="Emit JSON (structured) instead of the default human table",
)
# --- link / unlink ---
p_link = sub.add_parser("link", help="Add a parent->child dependency")
p_link.add_argument("parent_id")
@@ -261,6 +394,27 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
help='JSON dict of structured facts (e.g. \'{"changed_files": [...], '
'"tests_run": 12}\'). Stored on the closing run.')
p_edit = sub.add_parser(
"edit",
help="Edit recovery fields on an already-completed task",
)
p_edit.add_argument("task_id")
p_edit.add_argument(
"--result",
required=True,
help="Backfilled task result text for a done task",
)
p_edit.add_argument(
"--summary",
default=None,
help="Structured handoff summary. Falls back to --result if omitted.",
)
p_edit.add_argument(
"--metadata",
default=None,
help="JSON dict of structured facts to store on the latest completed run.",
)
p_block = sub.add_parser("block", help="Mark one or more tasks blocked")
p_block.add_argument("task_id")
p_block.add_argument("reason", nargs="*", help="Reason (also appended as a comment)")
@@ -366,7 +520,7 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
# --- log ---
p_log = sub.add_parser(
"log",
help="Print the worker log for a task (from $HERMES_HOME/kanban/logs/)",
help="Print the worker log for a task (from <kanban-root>/kanban/logs/)",
)
p_log.add_argument("task_id")
p_log.add_argument("--tail", type=int, default=None,
@@ -442,6 +596,38 @@ def kanban_command(args: argparse.Namespace) -> int:
)
return 0
# `--board <slug>` applies to every subcommand below by way of an
# env-var pin for the duration of this call. Using HERMES_KANBAN_BOARD
# (rather than threading `board=` through 50+ kb.connect() sites)
# keeps the patch small and inherits the exact same resolution the
# dispatcher uses for workers — consistency is a feature here.
board_override = getattr(args, "board", None)
if board_override:
try:
normed = kb._normalize_board_slug(board_override)
except ValueError as exc:
print(f"kanban: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban: --board requires a slug", file=sys.stderr)
return 2
# Boards other than 'default' must already exist — typoed slugs
# would otherwise silently create an empty board.
if normed != kb.DEFAULT_BOARD and not kb.board_exists(normed):
print(
f"kanban: board {normed!r} does not exist. "
f"Create it with `hermes kanban boards create {normed}`.",
file=sys.stderr,
)
return 1
os.environ["HERMES_KANBAN_BOARD"] = normed
# Boards management doesn't touch the DB at all — dispatch early so
# fresh installs that haven't initialized any DB can still use
# `hermes kanban boards create …`.
if action == "boards":
return _dispatch_boards(args)
# Auto-initialize the DB before dispatching any subcommand. init_db
# is idempotent, so running it every invocation is cheap (one
# SELECT against sqlite_master when tables already exist) and
@@ -462,11 +648,16 @@ def kanban_command(args: argparse.Namespace) -> int:
"ls": _cmd_list,
"show": _cmd_show,
"assign": _cmd_assign,
"reclaim": _cmd_reclaim,
"reassign": _cmd_reassign,
"diagnostics": _cmd_diagnostics,
"diag": _cmd_diagnostics,
"link": _cmd_link,
"unlink": _cmd_unlink,
"claim": _cmd_claim,
"comment": _cmd_comment,
"complete": _cmd_complete,
"edit": _cmd_edit,
"block": _cmd_block,
"unblock": _cmd_unblock,
"archive": _cmd_archive,
@@ -513,6 +704,185 @@ def _profile_author() -> str:
return "user"
# ---------------------------------------------------------------------------
# Boards management (hermes kanban boards …)
# ---------------------------------------------------------------------------
def _dispatch_boards(args: argparse.Namespace) -> int:
"""Handle ``hermes kanban boards <action>``.
Boards management is deliberately separate from the task-level
commands: it operates on the filesystem (board directories,
``current`` pointer, ``board.json``), not on the per-board SQLite
DB, so a fresh HERMES_HOME that has never called ``kanban init``
can still run ``boards create`` / ``boards list``.
"""
sub = getattr(args, "boards_action", None) or "list"
if sub in ("list", "ls"):
return _cmd_boards_list(args)
if sub in ("create", "new"):
return _cmd_boards_create(args)
if sub in ("rm", "remove", "delete"):
return _cmd_boards_rm(args)
if sub in ("switch", "use"):
return _cmd_boards_switch(args)
if sub in ("show", "current"):
return _cmd_boards_show(args)
if sub == "rename":
return _cmd_boards_rename(args)
print(f"kanban boards: unknown action {sub!r}", file=sys.stderr)
return 2
def _board_task_counts(slug: str) -> dict[str, int]:
"""Return ``{status: count}`` for a board. Safe to call on an empty DB."""
try:
path = kb.kanban_db_path(board=slug)
if not path.exists():
return {}
with kb.connect(board=slug) as conn:
rows = conn.execute(
"SELECT status, COUNT(*) AS n FROM tasks GROUP BY status"
).fetchall()
return {r["status"]: int(r["n"]) for r in rows}
except Exception:
return {}
def _cmd_boards_list(args: argparse.Namespace) -> int:
include_archived = bool(getattr(args, "all", False))
boards = kb.list_boards(include_archived=include_archived)
# Enrich each entry with task counts + whether it's the current board.
current = kb.get_current_board()
for b in boards:
b["is_current"] = (b["slug"] == current)
b["counts"] = _board_task_counts(b["slug"])
b["total"] = sum(b["counts"].values())
if getattr(args, "json", False):
print(json.dumps(boards, indent=2, ensure_ascii=False))
return 0
# Human table: marker (•) for current, slug, display name, counts.
if not boards:
print("(no boards — create one with `hermes kanban boards create <slug>`)")
return 0
print(f"{'':2s} {'SLUG':24s} {'NAME':28s} COUNTS")
for b in boards:
marker = "" if b["is_current"] else " "
counts = b["counts"] or {}
counts_str = (
", ".join(f"{k}={v}" for k, v in sorted(counts.items()))
or "(empty)"
)
name = b.get("name") or ""
if b.get("archived"):
name += " [archived]"
print(f"{marker:2s} {b['slug']:24s} {name:28s} {counts_str}")
print()
print(f"Current board: {current}")
if len(boards) > 1:
print("Switch boards with `hermes kanban boards switch <slug>`.")
return 0
def _cmd_boards_create(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards create: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban boards create: slug is required", file=sys.stderr)
return 2
already = kb.board_exists(normed) and normed != kb.DEFAULT_BOARD
meta = kb.create_board(
normed,
name=args.name,
description=args.description,
icon=args.icon,
color=args.color,
)
verb = "already exists" if already else "created"
print(f"Board {meta['slug']!r} {verb}.")
print(f" Display name: {meta.get('name', '')}")
print(f" DB path: {meta['db_path']}")
if getattr(args, "switch", False):
kb.set_current_board(meta["slug"])
print(f" Switched to {meta['slug']!r}.")
else:
print(f" Use `hermes kanban boards switch {meta['slug']}` to make it current.")
return 0
def _cmd_boards_rm(args: argparse.Namespace) -> int:
try:
res = kb.remove_board(args.slug, archive=not getattr(args, "delete", False))
except ValueError as exc:
print(f"kanban boards rm: {exc}", file=sys.stderr)
return 1
if res["action"] == "archived":
print(f"Board {res['slug']!r} archived → {res['new_path']}")
print("Recover by moving the directory back to "
"<root>/kanban/boards/<slug>/.")
else:
print(f"Board {res['slug']!r} deleted.")
return 0
def _cmd_boards_switch(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards switch: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban boards switch: slug is required", file=sys.stderr)
return 2
if not kb.board_exists(normed):
print(
f"kanban boards switch: board {normed!r} does not exist. "
f"Create it with `hermes kanban boards create {normed}`.",
file=sys.stderr,
)
return 1
kb.set_current_board(normed)
print(f"Active board is now {normed!r}.")
return 0
def _cmd_boards_show(args: argparse.Namespace) -> int:
current = kb.get_current_board()
meta = kb.read_board_metadata(current)
counts = _board_task_counts(current)
total = sum(counts.values())
print(f"Current board: {current}")
print(f" Display name: {meta.get('name', '')}")
if meta.get("description"):
print(f" Description: {meta['description']}")
print(f" DB path: {meta['db_path']}")
print(f" Tasks: {total} total"
+ (f" ({', '.join(f'{k}={v}' for k, v in sorted(counts.items()))})"
if counts else ""))
return 0
def _cmd_boards_rename(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards rename: {exc}", file=sys.stderr)
return 2
if not normed or not kb.board_exists(normed):
print(f"kanban boards rename: board {args.slug!r} does not exist",
file=sys.stderr)
return 1
meta = kb.write_board_metadata(normed, name=args.name)
print(f"Board {normed!r} renamed to {meta['name']!r}.")
return 0
# ---------------------------------------------------------------------------
def _parse_duration(val) -> Optional[int]:
"""Parse ``30s`` / ``5m`` / ``2h`` / ``1d`` or a raw integer → seconds.
@@ -573,7 +943,12 @@ def _cmd_init(args: argparse.Namespace) -> int:
def _cmd_heartbeat(args: argparse.Namespace) -> int:
with kb.connect() as conn:
ok = kb.heartbeat_worker(conn, args.task_id, note=getattr(args, "note", None))
ok = kb.heartbeat_worker(
conn,
args.task_id,
note=getattr(args, "note", None),
expected_run_id=_worker_run_id_for(args.task_id),
)
if not ok:
print(f"cannot heartbeat {args.task_id} (not running?)", file=sys.stderr)
return 1
@@ -662,6 +1037,21 @@ def _cmd_list(args: argparse.Namespace) -> int:
if getattr(args, "json", False):
print(json.dumps([_task_to_dict(t) for t in tasks], indent=2, ensure_ascii=False))
return 0
# Passive discoverability: when the user has multiple boards, surface
# which one they're looking at in the list header. Single-board users
# never see this — the feature stays invisible until you opt in.
try:
all_boards = kb.list_boards(include_archived=False)
except Exception:
all_boards = []
if len(all_boards) > 1:
current = kb.get_current_board()
other_count = len(all_boards) - 1
print(
f"Board: {current} "
f"({other_count} other board{'s' if other_count != 1 else ''}"
f"`hermes kanban boards list`)\n"
)
if not tasks:
print("(no matching tasks)")
return 0
@@ -681,10 +1071,16 @@ def _cmd_show(args: argparse.Namespace) -> int:
parents = kb.parent_ids(conn, args.task_id)
children = kb.child_ids(conn, args.task_id)
runs = kb.list_runs(conn, args.task_id)
# Workers hand off via ``task_runs.summary`` (kanban-worker skill);
# ``tasks.result`` is left NULL unless the caller explicitly passed
# ``result=``. Surfacing the latest summary here keeps ``show`` from
# looking like a no-op when the worker actually did real work.
latest_summary = kb.latest_summary(conn, args.task_id)
if getattr(args, "json", False):
payload = {
"task": _task_to_dict(task),
"latest_summary": latest_summary,
"parents": parents,
"children": children,
"comments": [
@@ -730,6 +1126,31 @@ def _cmd_show(args: argparse.Namespace) -> int:
if task.skills:
print(f" skills: {', '.join(task.skills)}")
print(f" created: {_fmt_ts(task.created_at)} by {task.created_by or '-'}")
# Diagnostics section — surface active distress signals at the top
# of show output so CLI users see them before scrolling through
# comments / runs.
from hermes_cli import kanban_diagnostics as kd
diags = kd.compute_task_diagnostics(task, events, runs)
if diags:
sev_marker = {"warning": "", "error": "!!", "critical": "!!!"}
print(f"\n Diagnostics ({len(diags)}):")
for d in diags:
print(f" {sev_marker.get(d.severity, '?')} [{d.severity}] {d.title}")
if d.data:
bits = []
for k, v in d.data.items():
if isinstance(v, list):
bits.append(f"{k}={','.join(str(x) for x in v)}")
else:
bits.append(f"{k}={v}")
if bits:
print(f" data: {' | '.join(bits)}")
# Only show suggested actions in show output to keep it tight;
# full list is available via `kanban diagnostics --task <id>`.
for a in d.actions:
if a.suggested:
print(f"{a.label}")
if task.started_at:
print(f" started: {_fmt_ts(task.started_at)}")
if task.completed_at:
@@ -746,6 +1167,13 @@ def _cmd_show(args: argparse.Namespace) -> int:
print()
print("Result:")
print(task.result)
elif latest_summary:
# Worker handoff lives on the latest run, not on tasks.result.
# Surface it at top-level so a glance at ``hermes kanban show <id>``
# tells you what the worker did even if tasks.result is empty.
print()
print("Latest summary:")
print(latest_summary)
if comments:
print()
print(f"Comments ({len(comments)}):")
@@ -787,6 +1215,167 @@ def _cmd_assign(args: argparse.Namespace) -> int:
return 0
def _cmd_reclaim(args: argparse.Namespace) -> int:
with kb.connect() as conn:
ok = kb.reclaim_task(
conn, args.task_id,
reason=getattr(args, "reason", None),
)
if not ok:
print(
f"cannot reclaim {args.task_id} (not running or unknown id)",
file=sys.stderr,
)
return 1
print(f"Reclaimed {args.task_id}")
return 0
def _cmd_reassign(args: argparse.Namespace) -> int:
profile = None if args.profile.lower() in ("none", "-", "null") else args.profile
with kb.connect() as conn:
ok = kb.reassign_task(
conn, args.task_id, profile,
reclaim_first=bool(getattr(args, "reclaim", False)),
reason=getattr(args, "reason", None),
)
if not ok:
print(
f"cannot reassign {args.task_id} "
f"(unknown id, or still running — pass --reclaim to release first)",
file=sys.stderr,
)
return 1
print(
f"Reassigned {args.task_id} to "
f"{profile or '(unassigned)'}"
+ (" (claim reclaimed)" if getattr(args, "reclaim", False) else "")
)
return 0
def _cmd_diagnostics(args: argparse.Namespace) -> int:
"""List active diagnostics on the board. Wraps the same rule engine
the dashboard uses, so CLI output matches what the UI shows.
"""
from hermes_cli import kanban_diagnostics as kd
with kb.connect() as conn:
# Either one-task mode or fleet mode.
if getattr(args, "task", None):
task = kb.get_task(conn, args.task)
if task is None:
print(f"no such task: {args.task}", file=sys.stderr)
return 1
diags_by_task = {
args.task: kd.compute_task_diagnostics(
task,
kb.list_events(conn, args.task),
kb.list_runs(conn, args.task),
)
}
else:
# Fleet mode: pull all non-archived tasks + their events/runs.
rows = list(conn.execute(
"SELECT * FROM tasks WHERE status != 'archived'"
).fetchall())
ids = [r["id"] for r in rows]
if not ids:
diags_by_task = {}
else:
placeholders = ",".join(["?"] * len(ids))
ev_by = {i: [] for i in ids}
for row in conn.execute(
f"SELECT * FROM task_events WHERE task_id IN ({placeholders}) ORDER BY id",
tuple(ids),
):
ev_by.setdefault(row["task_id"], []).append(row)
run_by = {i: [] for i in ids}
for row in conn.execute(
f"SELECT * FROM task_runs WHERE task_id IN ({placeholders}) ORDER BY id",
tuple(ids),
):
run_by.setdefault(row["task_id"], []).append(row)
diags_by_task = {}
for r in rows:
tid = r["id"]
dl = kd.compute_task_diagnostics(r, ev_by.get(tid, []), run_by.get(tid, []))
if dl:
diags_by_task[tid] = dl
# Severity filter.
sev = getattr(args, "severity", None)
if sev:
for tid in list(diags_by_task.keys()):
kept = [d for d in diags_by_task[tid] if d.severity == sev]
if kept:
diags_by_task[tid] = kept
else:
del diags_by_task[tid]
# Map task_id → title/status/assignee for the table output.
meta: dict[str, dict] = {}
if diags_by_task:
placeholders = ",".join(["?"] * len(diags_by_task))
for r in conn.execute(
f"SELECT id, title, status, assignee FROM tasks WHERE id IN ({placeholders})",
tuple(diags_by_task.keys()),
):
meta[r["id"]] = {
"title": r["title"], "status": r["status"],
"assignee": r["assignee"],
}
if getattr(args, "json", False):
out_json = [
{
"task_id": tid,
**meta.get(tid, {}),
"diagnostics": [d.to_dict() for d in dl],
}
for tid, dl in diags_by_task.items()
]
print(json.dumps(out_json, indent=2, ensure_ascii=False))
return 0
if not diags_by_task:
print("No active diagnostics on this board.")
return 0
# Human-readable summary: grouped by task, severity-marked, with
# suggested actions inline.
sev_marker = {"warning": "", "error": "!!", "critical": "!!!"}
total = sum(len(dl) for dl in diags_by_task.values())
print(
f"{total} active diagnostic(s) across "
f"{len(diags_by_task)} task(s):\n"
)
for tid, dl in diags_by_task.items():
m = meta.get(tid, {})
title = m.get("title") or "(untitled)"
status = m.get("status") or "?"
assignee = m.get("assignee") or "(unassigned)"
print(f" {tid} {status:8s} @{assignee:18s} {title}")
for d in dl:
print(f" {sev_marker.get(d.severity, '?')} [{d.severity}] {d.kind}: {d.title}")
if d.data:
# Compact key:value pairs on one line.
bits = []
for k, v in d.data.items():
if isinstance(v, list):
bits.append(f"{k}={','.join(str(x) for x in v)}")
else:
bits.append(f"{k}={v}")
if bits:
print(f" data: {' | '.join(bits)}")
# Suggested actions first.
for a in d.actions:
if a.suggested:
print(f"{a.label}")
print()
return 0
def _cmd_link(args: argparse.Namespace) -> int:
with kb.connect() as conn:
kb.link_tasks(conn, args.parent_id, args.child_id)
@@ -835,6 +1424,18 @@ def _cmd_comment(args: argparse.Namespace) -> int:
return 0
def _worker_run_id_for(task_id: str) -> Optional[int]:
if os.environ.get("HERMES_KANBAN_TASK") != task_id:
return None
raw = os.environ.get("HERMES_KANBAN_RUN_ID")
if not raw:
return None
try:
return int(raw)
except ValueError:
return None
def _cmd_complete(args: argparse.Namespace) -> int:
"""Mark one or more tasks done. Supports a single id or a list."""
ids = list(args.task_ids or [])
@@ -871,6 +1472,7 @@ def _cmd_complete(args: argparse.Namespace) -> int:
result=args.result,
summary=summary,
metadata=metadata,
expected_run_id=_worker_run_id_for(tid),
):
failed.append(tid)
print(f"cannot complete {tid} (unknown id or terminal state)", file=sys.stderr)
@@ -879,6 +1481,34 @@ def _cmd_complete(args: argparse.Namespace) -> int:
return 0 if not failed else 1
def _cmd_edit(args: argparse.Namespace) -> int:
raw_meta = getattr(args, "metadata", None)
metadata = None
if raw_meta:
try:
metadata = json.loads(raw_meta)
if not isinstance(metadata, dict):
raise ValueError("must be a JSON object")
except (ValueError, json.JSONDecodeError) as exc:
print(f"kanban: --metadata: {exc}", file=sys.stderr)
return 2
with kb.connect() as conn:
if not kb.edit_completed_task_result(
conn,
args.task_id,
result=args.result,
summary=getattr(args, "summary", None),
metadata=metadata,
):
print(
f"cannot edit {args.task_id} (unknown id or task is not done)",
file=sys.stderr,
)
return 1
print(f"Edited {args.task_id}")
return 0
def _cmd_block(args: argparse.Namespace) -> int:
reason = " ".join(args.reason).strip() if args.reason else None
author = _profile_author()
@@ -888,7 +1518,12 @@ def _cmd_block(args: argparse.Namespace) -> int:
for tid in ids:
if reason:
kb.add_comment(conn, tid, author, f"BLOCKED: {reason}")
if not kb.block_task(conn, tid, reason=reason):
if not kb.block_task(
conn,
tid,
reason=reason,
expected_run_id=_worker_run_id_for(tid),
):
failed.append(tid)
print(f"cannot block {tid}", file=sys.stderr)
else:
@@ -966,6 +1601,7 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
for (tid, who, ws) in res.spawned
],
"skipped_unassigned": res.skipped_unassigned,
"skipped_nonspawnable": res.skipped_nonspawnable,
}, indent=2))
return 0
print(f"Reclaimed: {res.reclaimed}")
@@ -985,6 +1621,11 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
print(f" - {tid} -> {who} @ {ws or '-'}{tag}")
if res.skipped_unassigned:
print(f"Skipped (unassigned): {', '.join(res.skipped_unassigned)}")
if res.skipped_nonspawnable:
print(
f"Skipped (non-spawnable assignee — terminal lane, OK): "
f"{', '.join(res.skipped_nonspawnable)}"
)
return 0
@@ -1096,16 +1737,18 @@ def _cmd_daemon(args: argparse.Namespace) -> int:
)
def _ready_queue_nonempty() -> bool:
"""Cheap SELECT — just asks whether there's at least one ready
task with an assignee that the dispatcher could have picked up."""
"""Cheap probe — is there at least one ready+assigned+unclaimed
task whose assignee maps to a real Hermes profile (i.e. one the
dispatcher would actually try to spawn for)?
Filters out tasks assigned to control-plane lanes
(e.g. ``orion-cc``, ``orion-research``) that are pulled by
terminals via ``claim_task`` directly those are correctly idle
from the dispatcher's perspective, not stuck.
"""
try:
with kb.connect() as conn:
row = conn.execute(
"SELECT 1 FROM tasks "
"WHERE status = 'ready' AND assignee IS NOT NULL "
" AND claim_lock IS NULL LIMIT 1"
).fetchone()
return row is not None
return kb.has_spawnable_ready(conn)
except Exception:
return False
+1494 -172
View File
File diff suppressed because it is too large Load Diff
+649
View File
@@ -0,0 +1,649 @@
"""Kanban diagnostics — structured, actionable distress signals for tasks.
A ``Diagnostic`` is a machine-readable description of something that's wrong
with a kanban task: a hallucinated card id, a spawn crash-loop, a task
stuck blocked for too long, etc. Each one carries:
* A **kind** (canonical code; UI/tests match on this).
* A **severity** (``warning`` / ``error`` / ``critical``).
* A **title** (one-line human description) and **detail** (longer text).
* A list of **suggested actions** structured entries the dashboard
turns into buttons and the CLI turns into hints.
Rules run over (task, recent events, recent runs) and emit diagnostics.
They are stateless and read-only no DB writes. Callers compute
diagnostics on demand (on ``/board`` load, ``/tasks/:id`` fetch, or
``hermes kanban diagnostics``).
Design goals:
* Fixable-on-the-operator's-side signals only (missing config, phantom
ids, crash loop). Not "the provider returned 502 once" that's a
transient runtime blip, not a diagnostic.
* Recoverable: every diagnostic comes with at least one suggested
recovery action the operator can actually take from the UI.
* Auto-clearing: when the underlying failure mode resolves (a clean
``completed`` event arrives, a spawn succeeds, the task gets
unblocked), the diagnostic stops firing. The audit event trail stays.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, Optional
import json
import time
# Severity rungs, ordered least → most urgent. The UI colors them
# amber (warning), orange (error), red (critical). Sorted outputs put
# critical first so operators see the worst fires at the top.
SEVERITY_ORDER = ("warning", "error", "critical")
@dataclass
class DiagnosticAction:
"""A single recovery action attached to a diagnostic.
The ``kind`` determines how both the UI and CLI render it:
* ``reclaim`` / ``reassign`` POST to the matching /tasks/:id/*
endpoint; dashboard wires into the existing recovery popover.
* ``unblock`` PATCH status back to ``ready`` (for stuck-blocked
diagnostics).
* ``cli_hint`` print/copy a shell command (e.g.
``hermes -p <profile> auth``). No HTTP side effect.
* ``open_docs`` deep-link to the docs URL named in ``payload.url``.
* ``comment`` nudge the operator to add a comment (for
stuck-blocked tasks that need human input).
``suggested=True`` marks the action as the recommended first step;
the UI highlights it. Multiple actions can be suggested if they're
equally valid.
"""
kind: str
label: str
payload: dict = field(default_factory=dict)
suggested: bool = False
def to_dict(self) -> dict:
return {
"kind": self.kind,
"label": self.label,
"payload": self.payload,
"suggested": self.suggested,
}
@dataclass
class Diagnostic:
"""One active distress signal on a task."""
kind: str
severity: str # "warning" | "error" | "critical"
title: str
detail: str
actions: list[DiagnosticAction] = field(default_factory=list)
first_seen_at: int = 0
last_seen_at: int = 0
count: int = 1
# Optional: the run id this diagnostic is scoped to. None = task-wide.
run_id: Optional[int] = None
# Optional structured payload for the UI (phantom ids, failure count).
data: dict = field(default_factory=dict)
def to_dict(self) -> dict:
return {
"kind": self.kind,
"severity": self.severity,
"title": self.title,
"detail": self.detail,
"actions": [a.to_dict() for a in self.actions],
"first_seen_at": self.first_seen_at,
"last_seen_at": self.last_seen_at,
"count": self.count,
"run_id": self.run_id,
"data": self.data,
}
# ---------------------------------------------------------------------------
# Rule helpers
# ---------------------------------------------------------------------------
def _task_field(task, name, default=None):
"""Read a field from a task regardless of representation.
Callers pass sqlite3.Row (dict-like with [] but no attribute
access), kanban_db.Task dataclasses (attribute access), or plain
dicts (both). This normalises them so rule functions don't have
to branch on type each time.
"""
if task is None:
return default
# sqlite Row + plain dicts both support mapping access; Row also
# supports .keys().
try:
# Row raises IndexError if the key isn't a column in the query;
# dicts return default via .get. Handle both.
if hasattr(task, "keys") and name in task.keys():
return task[name]
except Exception:
pass
if isinstance(task, dict):
return task.get(name, default)
return getattr(task, name, default)
def _parse_payload(ev) -> dict:
"""Tolerate event.payload being either a dict or a JSON string."""
p = _task_field(ev, "payload", None)
if p is None:
return {}
if isinstance(p, dict):
return p
if isinstance(p, str):
try:
return json.loads(p) or {}
except Exception:
return {}
return {}
def _event_kind(ev) -> str:
return _task_field(ev, "kind", "") or ""
def _event_ts(ev) -> int:
t = _task_field(ev, "created_at", 0)
return int(t or 0)
def _active_hallucination_events(
events: Iterable[Any],
kind: str,
) -> list[Any]:
"""Return events of ``kind`` that have no ``completed``/``edited``
event *strictly after* them. Walks chronologically: each clean
event resets the accumulator; each matching event gets appended.
Events must be sorted by id (i.e. arrival order); callers pass the
task's full event list which the DB already returns in that order.
"""
# Events arrive sorted by id asc (chronological). Walk once, track
# which hallucination events are still "active" (no clean event
# supersedes them).
active: list[Any] = []
for ev in events:
k = _event_kind(ev)
if k in ("completed", "edited"):
active.clear()
elif k == kind:
active.append(ev)
return active
def _latest_clean_event_ts(events: Iterable[Any]) -> int:
"""Timestamp of the most recent clean completion / edit event.
Kept for general "has this task ever been successfully completed"
lookups; hallucination rules use ``_active_hallucination_events``
instead because they need strict ordering.
"""
latest = 0
for ev in events:
if _event_kind(ev) in ("completed", "edited"):
t = _event_ts(ev)
if t > latest:
latest = t
return latest
# Standard always-available actions. Every diagnostic can offer these as
# fallbacks regardless of kind — they're the two baseline recovery
# primitives the kernel supports.
def _generic_recovery_actions(task: Any, *, running: bool) -> list[DiagnosticAction]:
out: list[DiagnosticAction] = []
if running:
out.append(DiagnosticAction(
kind="reclaim",
label="Reclaim task",
payload={},
))
out.append(DiagnosticAction(
kind="reassign",
label="Reassign to different profile",
payload={"reclaim_first": running},
))
return out
# ---------------------------------------------------------------------------
# Rule implementations
# ---------------------------------------------------------------------------
# Each rule takes (task, events, runs, now_ts, config) and returns
# zero or more Diagnostic instances. ``events`` / ``runs`` are lists of
# kanban_db.Event / kanban_db.Run (or plain dicts matching the same
# shape — for test convenience).
RuleFn = Callable[[Any, list[Any], list[Any], int, dict], list[Diagnostic]]
def _rule_hallucinated_cards(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Blocked-hallucination gate fires: a worker called kanban_complete
with created_cards that didn't exist or weren't created by the
completing profile. Task stayed in its prior state; the operator
needs to decide how to proceed.
Auto-clears when a successful completion (or edit) follows the
blocked event.
"""
hits = _active_hallucination_events(events, "completion_blocked_hallucination")
if not hits:
return []
phantom_ids: list[str] = []
first = _event_ts(hits[0])
last = _event_ts(hits[-1])
for ev in hits:
payload = _parse_payload(ev)
for pid in payload.get("phantom_cards", []) or []:
if pid not in phantom_ids:
phantom_ids.append(pid)
running = _task_field(task, "status") == "running"
actions: list[DiagnosticAction] = []
actions.append(DiagnosticAction(
kind="comment",
label="Add a comment explaining what to do",
suggested=False,
))
actions.extend(_generic_recovery_actions(task, running=running))
return [Diagnostic(
kind="hallucinated_cards",
severity="error",
title="Worker claimed cards that don't exist",
detail=(
f"The completing worker declared created_cards that either didn't "
f"exist or weren't created by its profile. The completion was "
f"blocked and the task stayed in its prior state. "
f"Usually means the worker hallucinated ids instead of capturing "
f"return values from kanban_create."
),
actions=actions,
first_seen_at=first,
last_seen_at=last,
count=len(hits),
data={"phantom_ids": phantom_ids},
)]
def _rule_prose_phantom_refs(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Advisory prose-scan: the completion summary mentions ``t_<hex>``
ids that don't resolve. Non-blocking; surfaced as a warning only.
Auto-clears when a fresh clean completion arrives AFTER the
suspected event.
"""
hits = _active_hallucination_events(events, "suspected_hallucinated_references")
if not hits:
return []
phantom_refs: list[str] = []
for ev in hits:
for pid in _parse_payload(ev).get("phantom_refs", []) or []:
if pid not in phantom_refs:
phantom_refs.append(pid)
running = _task_field(task, "status") == "running"
return [Diagnostic(
kind="prose_phantom_refs",
severity="warning",
title="Completion summary references unknown task ids",
detail=(
"The completion summary mentions task ids that don't resolve "
"in this board's database. The completion itself succeeded, "
"but downstream consumers parsing the summary may be pointed "
"at cards that never existed."
),
actions=_generic_recovery_actions(task, running=running),
first_seen_at=_event_ts(hits[0]),
last_seen_at=_event_ts(hits[-1]),
count=len(hits),
data={"phantom_refs": phantom_refs},
)]
def _rule_repeated_failures(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Task's unified ``consecutive_failures`` counter is climbing —
something about this task+profile combo is broken and each retry
fails the same way. Triggers regardless of the specific failure
mode (spawn error, timeout, crash) because operationally they
all look the same: the kernel keeps retrying and the operator
needs to intervene.
Threshold: cfg["failure_threshold"] (default 3). A threshold of 3
is one below the circuit-breaker's default (5), so the diagnostic
surfaces BEFORE the breaker trips giving operators a window to
fix the problem while the dispatcher's still retrying.
Accepts the legacy ``spawn_failure_threshold`` config key for
back-compat.
"""
threshold = int(cfg.get(
"failure_threshold",
cfg.get("spawn_failure_threshold", 3),
))
# Read the new unified counter name, with a fallback to the legacy
# column name so this rule keeps working against old DB rows the
# caller somehow materialised without running the migration.
failures = (
_task_field(task, "consecutive_failures", None)
if _task_field(task, "consecutive_failures", None) is not None
else _task_field(task, "spawn_failures", 0)
)
if failures is None or failures < threshold:
return []
last_err = (
_task_field(task, "last_failure_error", None)
if _task_field(task, "last_failure_error", None) is not None
else _task_field(task, "last_spawn_error", None)
)
assignee = _task_field(task, "assignee")
# Classify the most recent failure by peeking at run outcomes so
# the title + suggested action can be specific without a separate
# per-outcome rule.
ordered_runs = sorted(runs, key=lambda r: _task_field(r, "id", 0))
most_recent_outcome = None
for r in reversed(ordered_runs):
oc = _task_field(r, "outcome")
if oc in ("spawn_failed", "timed_out", "crashed"):
most_recent_outcome = oc
break
actions: list[DiagnosticAction] = []
if most_recent_outcome == "spawn_failed" and assignee and assignee != "default":
# Spawn is failing specifically — profile setup issue.
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Verify profile: hermes -p {assignee} doctor",
payload={"command": f"hermes -p {assignee} doctor"},
suggested=True,
))
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Fix profile auth: hermes -p {assignee} auth",
payload={"command": f"hermes -p {assignee} auth"},
))
elif most_recent_outcome in ("timed_out", "crashed"):
# Worker got off the ground but died. Logs are the right place
# to diagnose; reclaim/reassign are the recovery levers.
task_id = _task_field(task, "id")
if task_id:
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Check logs: hermes kanban log {task_id}",
payload={"command": f"hermes kanban log {task_id}"},
suggested=True,
))
actions.extend(_generic_recovery_actions(
task, running=_task_field(task, "status") == "running",
))
severity = "critical" if failures >= threshold * 2 else "error"
err_text = (last_err or "").strip() if last_err else ""
err_snippet = err_text[:500] + ("" if len(err_text) > 500 else "") if err_text else ""
outcome_label = {
"spawn_failed": "spawn",
"timed_out": "timeout",
"crashed": "crash",
}.get(most_recent_outcome or "", "failure")
if err_snippet:
title = f"Agent {outcome_label} x{failures}: {err_snippet.splitlines()[0][:160]}"
detail = (
f"This task has failed {failures} times in a row "
f"(most recent: {outcome_label}). Full last error:\n\n"
f"{err_snippet}\n\n"
f"The dispatcher will keep retrying until the consecutive-"
f"failures counter trips the circuit breaker (default 5), "
f"at which point the task auto-blocks. Fix the root cause "
f"and reclaim to retry."
)
else:
title = f"Agent {outcome_label} x{failures} (no error recorded)"
detail = (
f"This task has failed {failures} times in a row "
f"(most recent: {outcome_label}) but no error text was "
f"captured. Check the suggested command or the worker log."
)
return [Diagnostic(
kind="repeated_failures",
severity=severity,
title=title,
detail=detail,
actions=actions,
first_seen_at=now,
last_seen_at=now,
count=failures,
data={
"consecutive_failures": failures,
"most_recent_outcome": most_recent_outcome,
"last_error": last_err,
},
)]
def _rule_repeated_crashes(task, events, runs, now, cfg) -> list[Diagnostic]:
"""The worker spawns fine but keeps crashing mid-run. Check the last
N runs' outcomes; N consecutive ``crashed`` without a successful
``completed`` means something about the task + profile combo is
broken (OOM, missing dependency, tool it needs is down).
Threshold: cfg["crash_threshold"] (default 2).
Narrower than ``repeated_failures`` fires earlier (2 crashes vs 3
total failures) so the operator gets a crash-specific heads-up
before the unified rule kicks in. Suppresses itself when the
unified rule is also about to fire, to avoid double-flagging.
"""
failure_threshold = int(cfg.get(
"failure_threshold",
cfg.get("spawn_failure_threshold", 3),
))
unified_counter = (
_task_field(task, "consecutive_failures", 0) or 0
)
# Unified rule will catch this — let it handle to avoid double fire.
if unified_counter >= failure_threshold:
return []
threshold = int(cfg.get("crash_threshold", 2))
ordered = sorted(runs, key=lambda r: _task_field(r, "id", 0))
# Count trailing consecutive 'crashed' outcomes.
consecutive = 0
last_err = None
for r in reversed(ordered):
outcome = _task_field(r, "outcome")
if outcome == "crashed":
consecutive += 1
if last_err is None:
last_err = _task_field(r, "error")
elif outcome in ("completed", "reclaimed"):
# A success (or manual reclaim) breaks the streak.
break
else:
# Other outcomes (timed_out, blocked, spawn_failed, gave_up)
# aren't crash signals — don't count them, but they also
# don't break the crash streak.
continue
if consecutive < threshold:
return []
task_id = _task_field(task, "id")
actions: list[DiagnosticAction] = []
if task_id:
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Check logs: hermes kanban log {task_id}",
payload={"command": f"hermes kanban log {task_id}"},
suggested=True,
))
running = _task_field(task, "status") == "running"
actions.extend(_generic_recovery_actions(task, running=running))
severity = "critical" if consecutive >= threshold * 2 else "error"
# Put the actual error up-front so operators see WHAT broke without
# having to open the logs. Truncate defensively — these can be huge
# (full tracebacks).
err_text = (last_err or "").strip() if last_err else ""
err_snippet = err_text[:500] + ("" if len(err_text) > 500 else "") if err_text else ""
if err_snippet:
title = f"Agent crashed {consecutive}x: {err_snippet.splitlines()[0][:160]}"
detail = (
f"The last {consecutive} runs ended with outcome=crashed. "
f"Full last error:\n\n{err_snippet}"
)
else:
title = f"Agent crashed {consecutive}x (no error recorded)"
detail = (
f"The last {consecutive} runs ended with outcome=crashed but "
f"no error text was captured. Check the worker log for more."
)
return [Diagnostic(
kind="repeated_crashes",
severity=severity,
title=title,
detail=detail,
actions=actions,
first_seen_at=now,
last_seen_at=now,
count=consecutive,
data={"consecutive_crashes": consecutive, "last_error": last_err},
)]
def _rule_stuck_in_blocked(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Task has been in ``blocked`` status for too long without a comment.
Threshold: cfg["blocked_stale_hours"] (default 24).
Surfaced as a warning so humans know there's a pending unblock.
"""
hours = float(cfg.get("blocked_stale_hours", 24))
status = _task_field(task, "status")
if status != "blocked":
return []
# Find the most recent ``blocked`` event.
last_blocked_ts = 0
for ev in events:
if _event_kind(ev) == "blocked":
t = _event_ts(ev)
if t > last_blocked_ts:
last_blocked_ts = t
if last_blocked_ts == 0:
return []
age_hours = (now - last_blocked_ts) / 3600.0
if age_hours < hours:
return []
# Any comment / unblock after the block breaks the "stale" signal.
for ev in events:
if _event_kind(ev) in ("commented", "unblocked") and _event_ts(ev) > last_blocked_ts:
return []
actions: list[DiagnosticAction] = [
DiagnosticAction(
kind="comment",
label="Add a comment / unblock the task",
suggested=True,
),
]
return [Diagnostic(
kind="stuck_in_blocked",
severity="warning",
title=f"Task has been blocked for {int(age_hours)}h",
detail=(
f"This task transitioned to blocked {int(age_hours)}h ago and "
f"has had no comments or unblock attempts since. Blocked tasks "
f"are waiting for human input — check the block reason and "
f"either unblock with feedback or answer with a comment."
),
actions=actions,
first_seen_at=last_blocked_ts,
last_seen_at=last_blocked_ts,
count=1,
data={"blocked_at": last_blocked_ts, "age_hours": round(age_hours, 1)},
)]
# Registry — order matters: rules higher on the list render first when
# severity ties. Add new rules here.
_RULES: list[RuleFn] = [
_rule_hallucinated_cards,
_rule_prose_phantom_refs,
_rule_repeated_failures,
_rule_repeated_crashes,
_rule_stuck_in_blocked,
]
# Known kinds (for the UI's filter / legend / i18n keys). Update when
# rules are added.
DIAGNOSTIC_KINDS = (
"hallucinated_cards",
"prose_phantom_refs",
"repeated_failures",
"repeated_crashes",
"stuck_in_blocked",
)
DEFAULT_CONFIG = {
"failure_threshold": 3,
# Legacy alias accepted at read time by _rule_repeated_failures.
"spawn_failure_threshold": 3,
"crash_threshold": 2,
"blocked_stale_hours": 24,
}
def compute_task_diagnostics(
task,
events: list,
runs: list,
*,
now: Optional[int] = None,
config: Optional[dict] = None,
) -> list[Diagnostic]:
"""Run every rule against a single task's state and return a
severity-sorted list of active diagnostics.
Sorting: critical first, then error, then warning; ties broken by
most-recent ``last_seen_at``.
"""
now_ts = int(now if now is not None else time.time())
cfg = {**DEFAULT_CONFIG, **(config or {})}
out: list[Diagnostic] = []
for rule in _RULES:
try:
out.extend(rule(task, events, runs, now_ts, cfg))
except Exception:
# A broken rule must never crash the dashboard. Rule bugs
# get caught in tests; in production we'd rather drop the
# diagnostic than 500 a whole /board request.
continue
severity_idx = {s: i for i, s in enumerate(SEVERITY_ORDER)}
out.sort(
key=lambda d: (
-severity_idx.get(d.severity, -1),
-(d.last_seen_at or 0),
)
)
return out
def severity_of_highest(diagnostics: Iterable[Diagnostic]) -> Optional[str]:
"""Highest severity present in the list, or None if empty. Useful
for card badges that need a single color."""
highest_idx = -1
highest = None
for d in diagnostics:
idx = SEVERITY_ORDER.index(d.severity) if d.severity in SEVERITY_ORDER else -1
if idx > highest_idx:
highest_idx = idx
highest = d.severity
return highest
+768 -230
View File
File diff suppressed because it is too large Load Diff
+1 -1
View File
@@ -361,7 +361,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
existing_lines = []
if env_path.exists():
existing_lines = env_path.read_text().splitlines()
existing_lines = env_path.read_text(encoding="utf-8").splitlines()
updated_keys = set()
new_lines = []
+15 -8
View File
@@ -393,14 +393,21 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
if provider in _AGGREGATOR_PROVIDERS:
return _prepend_vendor(name)
# --- OpenCode Zen: Claude stays hyphenated; other models keep dots ---
if provider == "opencode-zen":
bare = _strip_matching_provider_prefix(name, provider)
if "/" in bare:
return bare
if bare.lower().startswith("claude-"):
return _dots_to_hyphens(bare)
return bare
# --- OpenCode Zen / OpenCode Go: flat-namespace resellers.
# Their /v1/models API returns bare IDs only (no vendor prefix), and
# the inference endpoint rejects vendor-prefixed names with HTTP 401
# "Model not supported". Strip ANY leading ``vendor/`` so config
# entries like ``minimax/minimax-m2.7`` or ``deepseek/deepseek-v4-flash``
# — commonly copied from aggregator slugs into fallback_model lists —
# resolve to bare ``minimax-m2.7`` / ``deepseek-v4-flash`` the API
# actually serves. See PR reviewing opencode-go fallback 401s. ---
if provider in {"opencode-zen", "opencode-go"}:
if "/" in name:
_, bare_after_slash = name.split("/", 1)
name = bare_after_slash.strip() or name
if provider == "opencode-zen" and name.lower().startswith("claude-"):
return _dots_to_hyphens(name)
return name
# --- Anthropic: strip matching provider prefix, dots -> hyphens ---
if provider in _DOT_TO_HYPHEN_PROVIDERS:
+165 -16
View File
@@ -190,11 +190,18 @@ def _load_direct_aliases() -> dict[str, DirectAlias]:
model: "minimax-m2.7"
provider: custom
base_url: "https://ollama.com/v1"
Also reads ``model.aliases`` (set by ``hermes config set model.aliases.xxx``)
and converts simple string entries (``ds-flash: deepseek/deepseek-v4-flash``)
into DirectAlias objects. The provider is parsed from the ``provider/``
prefix in the value; if no slash, the current provider is used.
"""
merged = dict(_BUILTIN_DIRECT_ALIASES)
try:
from hermes_cli.config import load_config
cfg = load_config()
# --- model_aliases (dict-based format) ---
user_aliases = cfg.get("model_aliases")
if isinstance(user_aliases, dict):
for name, entry in user_aliases.items():
@@ -207,6 +214,30 @@ def _load_direct_aliases() -> dict[str, DirectAlias]:
merged[name.strip().lower()] = DirectAlias(
model=model, provider=provider, base_url=base_url,
)
# --- model.aliases (string-based format, from config set) ---
model_section = cfg.get("model", {})
if isinstance(model_section, dict):
simple_aliases = model_section.get("aliases")
if isinstance(simple_aliases, dict):
current_provider = model_section.get("provider", "")
for name, value in simple_aliases.items():
if not isinstance(value, str) or not value.strip():
continue
key = name.strip().lower()
if key in merged:
continue # don't override explicit model_aliases entries
val = value.strip()
if "/" in val:
provider, model = val.split("/", 1)
else:
provider = current_provider
model = val
merged[key] = DirectAlias(
model=model.strip(),
provider=provider.strip() or current_provider,
base_url="",
)
except Exception:
pass
return merged
@@ -768,6 +799,12 @@ def switch_model(
)
# --- Step d: Aggregator catalog search ---
# Track whether the live catalog of the CURRENT provider resolved the
# model — if so, step e must not second-guess and switch providers.
# Critical for flat-namespace resellers like opencode-go / opencode-zen
# whose live /v1/models returns bare IDs (e.g. "deepseek-v4-flash") that
# coincidentally match entries in native providers' static catalogs.
resolved_in_current_catalog = False
if is_aggregator(target_provider) and not resolved_alias:
catalog = list_provider_models(target_provider)
if catalog:
@@ -775,6 +812,7 @@ def switch_model(
for mid in catalog:
if mid.lower() == new_model_lower:
new_model = mid
resolved_in_current_catalog = True
break
else:
for mid in catalog:
@@ -782,6 +820,7 @@ def switch_model(
_, bare = mid.split("/", 1)
if bare.lower() == new_model_lower:
new_model = mid
resolved_in_current_catalog = True
break
# --- Step e: detect_provider_for_model() as last resort ---
@@ -794,6 +833,7 @@ def switch_model(
target_provider == current_provider
and not is_custom
and not resolved_alias
and not resolved_in_current_catalog
):
detected = detect_provider_for_model(new_model, current_provider)
if detected:
@@ -904,6 +944,26 @@ def switch_model(
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
# Also check custom_providers list — models declared there should be accepted
# even if the remote /v1/models endpoint doesn't list them.
if not override and custom_providers and isinstance(custom_providers, list):
for entry in custom_providers:
if not isinstance(entry, dict):
continue
# Match by provider slug (custom:<name>) or by base_url
entry_name = entry.get("name", "")
entry_slug = f"custom:{entry_name}" if entry_name else ""
entry_url = entry.get("base_url", "")
if entry_slug == target_provider or entry_url == base_url:
# Check if the requested model matches the entry's model
entry_model = entry.get("model", "")
entry_models = entry.get("models", {})
if new_model == entry_model:
override = True
break
if isinstance(entry_models, dict) and new_model in entry_models:
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
@@ -1057,6 +1117,45 @@ def list_authenticated_providers(
if normed:
_builtin_endpoints.add(normed)
def _has_fast_aws_sdk_signal() -> bool:
"""Return True when explicit AWS auth config is present.
This intentionally avoids botocore's full credential chain. Provider
picker/model-switch discovery can run for non-Bedrock providers, and
botocore may otherwise probe EC2 IMDS (169.254.169.254) on local
machines before returning no credentials.
"""
if os.environ.get("AWS_BEARER_TOKEN_BEDROCK", "").strip():
return True
if (
os.environ.get("AWS_ACCESS_KEY_ID", "").strip()
and os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()
):
return True
return any(
os.environ.get(name, "").strip()
for name in (
"AWS_PROFILE",
"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
"AWS_CONTAINER_CREDENTIALS_FULL_URI",
"AWS_WEB_IDENTITY_TOKEN_FILE",
)
)
def _has_aws_sdk_creds_for_listing(slug: str) -> bool:
"""Credential check for AWS SDK providers in non-runtime discovery."""
slug_norm = str(slug or "").strip().lower()
current_norm = str(current_provider or "").strip().lower()
if _has_fast_aws_sdk_signal():
return True
if slug_norm != current_norm:
return False
try:
from agent.bedrock_adapter import has_aws_credentials
return bool(has_aws_credentials())
except Exception:
return False
data = fetch_models_dev()
# Build curated model lists keyed by hermes provider ID
@@ -1184,7 +1283,9 @@ def list_authenticated_providers(
# Check if credentials exist
has_creds = False
if overlay.extra_env_vars:
if overlay.auth_type == "aws_sdk":
has_creds = _has_aws_sdk_creds_for_listing(hermes_slug)
elif overlay.extra_env_vars:
has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
# Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
if not has_creds and overlay.auth_type == "api_key":
@@ -1203,11 +1304,7 @@ def list_authenticated_providers(
from hermes_cli.auth import _load_auth_store
store = _load_auth_store()
providers_store = store.get("providers", {})
pool_store = store.get("credential_pool", {})
if store and (
pid in providers_store or hermes_slug in providers_store
or pid in pool_store or hermes_slug in pool_store
):
if store and (pid in providers_store or hermes_slug in providers_store):
has_creds = True
except Exception as exc:
logger.debug("Auth store check failed for %s: %s", pid, exc)
@@ -1303,11 +1400,7 @@ def list_authenticated_providers(
from hermes_cli.auth import _load_auth_store
_cp_store = _load_auth_store()
_cp_providers_store = _cp_store.get("providers", {})
_cp_pool_store = _cp_store.get("credential_pool", {})
if _cp_store and (
_cp.slug in _cp_providers_store
or _cp.slug in _cp_pool_store
):
if _cp_store and _cp.slug in _cp_providers_store:
_cp_has_creds = True
except Exception:
pass
@@ -1324,11 +1417,7 @@ def list_authenticated_providers(
# credentials come from the boto3 credential chain (env vars,
# ~/.aws/credentials, instance roles, etc.)
if not _cp_has_creds and _cp_config and getattr(_cp_config, "auth_type", "") == "aws_sdk":
try:
from agent.bedrock_adapter import has_aws_credentials
_cp_has_creds = has_aws_credentials()
except Exception:
pass
_cp_has_creds = _has_aws_sdk_creds_for_listing(_cp.slug)
if not _cp_has_creds:
continue
@@ -1603,3 +1692,63 @@ def list_authenticated_providers(
results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))
return results
def list_picker_providers(
current_provider: str = "",
current_base_url: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
max_models: int = 8,
current_model: str = "",
) -> List[dict]:
"""Interactive-picker variant of :func:`list_authenticated_providers`.
Post-processes the base list so the ``/model`` picker (Telegram/Discord
inline keyboards) only surfaces models that are actually callable in the
current install:
- OpenRouter's model list is replaced with the output of
:func:`hermes_cli.models.fetch_openrouter_models`, which filters the
curated ``OPENROUTER_MODELS`` snapshot against the live OpenRouter
catalog. IDs the live catalog no longer carries drop out, so the
picker never offers a model the user can't call.
- Provider rows whose model list ends up empty are dropped, except
custom endpoints (``is_user_defined=True`` with an ``api_url``) where
the user may supply their own model set through config.
All other providers and metadata fields are passed through unchanged.
The typed ``/model <name>`` path is unaffected -- only the interactive
picker payload is narrowed.
"""
from hermes_cli.models import fetch_openrouter_models
providers = list_authenticated_providers(
current_provider=current_provider,
current_base_url=current_base_url,
user_providers=user_providers,
custom_providers=custom_providers,
max_models=max_models,
current_model=current_model,
)
filtered: List[dict] = []
for p in providers:
slug = str(p.get("slug", "")).lower()
if slug == "openrouter":
try:
live = fetch_openrouter_models()
live_ids = [mid for mid, _ in live]
except Exception:
live_ids = list(p.get("models", []))
p = dict(p)
p["models"] = live_ids[:max_models]
p["total_models"] = len(live_ids)
has_models = bool(p.get("models"))
is_custom_endpoint = bool(p.get("is_user_defined")) and bool(p.get("api_url"))
if not has_models and not is_custom_endpoint:
continue
filtered.append(p)
return filtered
+85 -9
View File
@@ -61,12 +61,14 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("z-ai/glm-5v-turbo", ""),
("z-ai/glm-5-turbo", ""),
("x-ai/grok-4.20", ""),
("x-ai/grok-4.3", ""),
("nvidia/nemotron-3-super-120b-a12b", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
("arcee-ai/trinity-large-preview:free", "free"),
("arcee-ai/trinity-large-thinking", ""),
("openai/gpt-5.5-pro", ""),
("openai/gpt-5.4-nano", ""),
("deepseek/deepseek-v4-pro", ""),
]
_openrouter_catalog_cache: list[tuple[str, str]] | None = None
@@ -181,10 +183,12 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"z-ai/glm-5v-turbo",
"z-ai/glm-5-turbo",
"x-ai/grok-4.20-beta",
"x-ai/grok-4.3",
"nvidia/nemotron-3-super-120b-a12b",
"arcee-ai/trinity-large-thinking",
"openai/gpt-5.5-pro",
"openai/gpt-5.4-nano",
"deepseek/deepseek-v4-pro",
],
# Native OpenAI Chat Completions (api.openai.com). Used by /model counts and
# provider_model_ids fallback when /v1/models is unavailable.
@@ -806,6 +810,25 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway"),
]
# Auto-extend CANONICAL_PROVIDERS with any provider registered in providers/
# that is not already in the list above. Adding plugins/model-providers/<name>/
# is sufficient to expose a new provider in the model picker, /model, and all
# downstream consumers — no edits to this file needed.
_canonical_slugs = {p.slug for p in CANONICAL_PROVIDERS}
try:
from providers import list_providers as _list_providers_for_canonical
for _pp in _list_providers_for_canonical():
if _pp.name in _canonical_slugs:
continue
if _pp.auth_type in ("oauth_device_code", "oauth_external", "external_process", "aws_sdk", "copilot"):
continue # non-api-key flows need bespoke picker UX; skip auto-inject
_label = _pp.display_name or _pp.name
_desc = _pp.description or f"{_label} (direct API)"
CANONICAL_PROVIDERS.append(ProviderEntry(_pp.name, _label, _desc))
_canonical_slugs.add(_pp.name)
except Exception:
pass
# Derived dicts — used throughout the codebase
_PROVIDER_LABELS = {p.slug: p.label for p in CANONICAL_PROVIDERS}
_PROVIDER_LABELS["custom"] = "Custom endpoint" # special case: not a named provider
@@ -1740,10 +1763,20 @@ def model_supports_fast_mode(model_id: Optional[str]) -> bool:
def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
"""Return True if the model is a Claude model eligible for Anthropic Fast Mode."""
"""Return True if the model is a Claude model eligible for Anthropic Fast Mode.
Fast mode is currently supported on Claude Opus 4.6 only. Per Anthropic's
docs (https://platform.claude.com/docs/en/build-with-claude/fast-mode):
"Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
with an unsupported model returns an error." Opus 4.7 explicitly rejects
the ``speed`` parameter with HTTP 400.
"""
raw = _strip_vendor_prefix(str(model_id or ""))
base = raw.split(":")[0]
return base.startswith("claude-")
if not base.startswith("claude-"):
return False
# Only Opus 4.6 supports fast mode at present.
return "opus-4-6" in base or "opus-4.6" in base
def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
@@ -2013,6 +2046,34 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
return ids
except Exception:
pass
# ── Profile-based generic live fetch (all simple api-key providers) ──
# Handles any provider registered in providers/ with auth_type="api_key".
# Replaces per-provider copy-paste blocks (stepfun, gmi, zai, etc.).
try:
from providers import get_provider_profile
from hermes_cli.auth import resolve_api_key_provider_credentials
_p = get_provider_profile(normalized)
if _p and _p.auth_type == "api_key" and _p.base_url:
try:
creds = resolve_api_key_provider_credentials(normalized)
api_key = str(creds.get("api_key") or "").strip()
base_url = str(creds.get("base_url") or "").strip()
except Exception:
api_key, base_url = "", _p.base_url
if not base_url:
base_url = _p.base_url
if api_key:
live = _p.fetch_models(api_key=api_key)
if live:
return live
# Use profile's fallback_models if defined
if _p.fallback_models:
return list(_p.fallback_models)
except Exception:
pass
curated_static = list(_PROVIDER_MODELS.get(normalized, []))
if normalized in _MODELS_DEV_PREFERRED:
return _merge_with_models_dev(normalized, curated_static)
@@ -2896,6 +2957,19 @@ def fetch_api_models(
_OLLAMA_CLOUD_CACHE_TTL = 3600 # 1 hour
def _strip_ollama_cloud_suffix(model_id: str) -> str:
"""Strip :cloud / -cloud suffixes that models.dev appends to Ollama Cloud IDs.
The live API uses clean IDs (e.g. 'kimi-k2.6') while models.dev sometimes
returns them as 'kimi-k2.6:cloud'. Normalising before the dedup merge
prevents duplicate entries in the merged model list.
"""
for suffix in (":cloud", "-cloud"):
if model_id.endswith(suffix):
return model_id[: -len(suffix)]
return model_id
def _ollama_cloud_cache_path() -> Path:
"""Return the path for the Ollama Cloud model cache."""
from hermes_constants import get_hermes_home
@@ -2991,9 +3065,10 @@ def fetch_ollama_cloud_models(
seen.add(m)
merged.append(m)
for m in mdev_models:
if m and m not in seen:
seen.add(m)
merged.append(m)
normalized = _strip_ollama_cloud_suffix(m)
if normalized and normalized not in seen:
seen.add(normalized)
merged.append(normalized)
if merged:
_save_ollama_cloud_cache(merged)
return merged
@@ -3087,7 +3162,7 @@ def validate_requested_model(
"message": f"Model `{requested}` was not found in LM Studio's model listing.",
}
if normalized == "custom":
if normalized == "custom" or normalized.startswith("custom:"):
# Try probing with correct auth for the api_mode.
if api_mode == "anthropic_messages":
probe = probe_api_models(api_key, base_url, api_mode=api_mode)
@@ -3185,11 +3260,12 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": False,
"persist": False,
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Model `{requested}` was not found in the OpenAI Codex model listing."
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
"It may still work if your ChatGPT/Codex account has access to a newer or hidden model ID."
f"{suggestion_text}"
),
}
+16 -3
View File
@@ -255,6 +255,10 @@ def get_nous_subscription_features(
terminal_cfg = config.get("terminal") if isinstance(config.get("terminal"), dict) else {}
web_backend = str(web_cfg.get("backend") or "").strip().lower()
# Per-capability overrides: if set, they determine which backend is active for
# search/extract independently of web.backend.
web_search_backend = str(web_cfg.get("search_backend") or "").strip().lower()
web_extract_backend = str(web_cfg.get("extract_backend") or "").strip().lower()
tts_provider = str(tts_cfg.get("provider") or "edge").strip().lower()
browser_provider_explicit = "cloud_provider" in browser_cfg
browser_provider = normalize_browser_cloud_provider(
@@ -280,6 +284,7 @@ def get_nous_subscription_features(
direct_firecrawl = bool(get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL"))
direct_parallel = bool(get_env_value("PARALLEL_API_KEY"))
direct_tavily = bool(get_env_value("TAVILY_API_KEY"))
direct_searxng = bool(get_env_value("SEARXNG_URL"))
direct_fal = fal_key_is_configured()
direct_openai_tts = bool(resolve_openai_audio_api_key())
direct_elevenlabs = bool(get_env_value("ELEVENLABS_API_KEY"))
@@ -323,10 +328,18 @@ def get_nous_subscription_features(
or (web_backend == "firecrawl" and direct_firecrawl)
or (web_backend == "parallel" and direct_parallel)
or (web_backend == "tavily" and direct_tavily)
or (web_backend == "searxng" and direct_searxng)
# Per-capability overrides: search_backend or extract_backend may be set
# without web.backend (using the new split config from #20061)
or (web_search_backend == "searxng" and direct_searxng)
or (web_search_backend == "exa" and direct_exa)
or (web_search_backend == "firecrawl" and direct_firecrawl)
or (web_search_backend == "parallel" and direct_parallel)
or (web_search_backend == "tavily" and direct_tavily)
)
)
web_available = bool(
managed_web_available or direct_exa or direct_firecrawl or direct_parallel or direct_tavily
managed_web_available or direct_exa or direct_firecrawl or direct_parallel or direct_tavily or direct_searxng
)
image_managed = image_tool_enabled and managed_image_available and not direct_fal
@@ -412,8 +425,8 @@ def get_nous_subscription_features(
managed_by_nous=web_managed,
direct_override=web_active and not web_managed,
toolset_enabled=web_tool_enabled,
current_provider=web_backend or "",
explicit_configured=bool(web_backend),
current_provider=web_backend or web_search_backend or "",
explicit_configured=bool(web_backend or web_search_backend),
),
"image_gen": NousFeatureState(
key="image_gen",
+35 -5
View File
@@ -173,7 +173,7 @@ def _get_enabled_plugins() -> Optional[set]:
# Data classes
# ---------------------------------------------------------------------------
_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive", "platform"}
_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive", "platform", "model-provider"}
@dataclass
@@ -643,15 +643,17 @@ class PluginManager:
# - flat: ``plugins/disk-cleanup/plugin.yaml`` (standalone)
# - category: ``plugins/image_gen/openai/plugin.yaml`` (backend)
#
# ``memory/`` and ``context_engine/`` are skipped at the top level —
# they have their own discovery systems. ``platforms/`` is a category
# holding platform adapters (scanned one level deeper below).
# ``memory/``, ``context_engine/``, and ``model-providers/`` are
# skipped at the top level — they have their own discovery systems
# (plugins/memory/__init__.py, providers/__init__.py). ``platforms/``
# is a category holding platform adapters (scanned one level deeper
# below).
repo_plugins = get_bundled_plugins_dir()
manifests.extend(
self._scan_directory(
repo_plugins,
source="bundled",
skip_names={"memory", "context_engine", "platforms"},
skip_names={"memory", "context_engine", "platforms", "model-providers"},
)
)
manifests.extend(
@@ -709,6 +711,21 @@ class PluginManager:
)
continue
# Model provider plugins are loaded by providers/__init__.py
# (its own lazy discovery keyed off first get_provider_profile()
# call). We record the manifest here for introspection but do
# not import the module — a second import would create two
# ProviderProfile instances and break the "last writer wins"
# override semantics between bundled and user plugins.
if manifest.kind == "model-provider":
loaded = LoadedPlugin(manifest=manifest, enabled=True)
self._plugins[lookup_key] = loaded
logger.debug(
"Skipping '%s' (model-provider, handled by providers/ discovery)",
lookup_key,
)
continue
# Built-in backends auto-load — they ship with hermes and must
# just work. Selection among them (e.g. which image_gen backend
# services calls) is driven by ``<category>.provider`` config,
@@ -886,6 +903,19 @@ class PluginManager:
"treating as kind='exclusive'",
key,
)
elif (
"register_provider" in source_text
and "ProviderProfile" in source_text
):
# Model provider plugin (calls register_provider()
# from ``providers`` with a ProviderProfile). Route
# to providers/__init__.py discovery.
kind = "model-provider"
logger.debug(
"Plugin %s: detected model provider, "
"treating as kind='model-provider'",
key,
)
except Exception:
pass
+378 -113
View File
@@ -15,13 +15,18 @@ import shutil
import subprocess
import sys
from pathlib import Path
from typing import Optional
from typing import Any, Optional
from hermes_constants import get_hermes_home
from hermes_cli.config import cfg_get
logger = logging.getLogger(__name__)
class PluginOperationError(Exception):
"""Recoverable plugin install/update failure (CLI exits; HTTP maps to 4xx)."""
# Minimum manifest version this installer understands.
# Plugins may declare ``manifest_version: 1`` in plugin.yaml;
# future breaking changes to the manifest schema bump this.
@@ -150,6 +155,24 @@ def _copy_example_files(plugin_dir: Path, console) -> None:
)
def _missing_requires_env_names(manifest: dict) -> list[str]:
"""Return declared ``requires_env`` names that are unset in ``~/.hermes/.env``."""
requires_env = manifest.get("requires_env") or []
if not requires_env:
return []
from hermes_cli.config import get_env_value
env_specs: list[dict] = []
for entry in requires_env:
if isinstance(entry, str):
env_specs.append({"name": entry})
elif isinstance(entry, dict) and entry.get("name"):
env_specs.append(entry)
return [s["name"] for s in env_specs if s.get("name") and not get_env_value(s["name"])]
def _prompt_plugin_env_vars(manifest: dict, console) -> None:
"""Prompt for required environment variables declared in plugin.yaml.
@@ -283,6 +306,95 @@ def _require_installed_plugin(name: str, plugins_dir: Path, console) -> Path:
# ---------------------------------------------------------------------------
def _install_plugin_core(identifier: str, *, force: bool) -> tuple[Path, dict, str]:
"""Clone Git plugin into ``~/.hermes/plugins``.
Returns ``(target_dir, installed_manifest, canonical_name)``.
Raises ``PluginOperationError`` on failure.
"""
import tempfile
try:
git_url = _resolve_git_url(identifier)
except ValueError as e:
raise PluginOperationError(str(e)) from e
plugins_dir = _plugins_dir()
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError as e:
raise PluginOperationError(
"git is not installed or not in PATH.",
) from e
except subprocess.TimeoutExpired as e:
raise PluginOperationError(
"Git clone timed out after 60 seconds.",
) from e
if result.returncode != 0:
err = (result.stderr or result.stdout or "").strip()
raise PluginOperationError(f"Git clone failed:\n{err}")
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
raise PluginOperationError(str(e)) from e
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
raise PluginOperationError(
f"Plugin '{plugin_name}' has invalid manifest_version "
f"'{mv}' (expected an integer).",
) from None
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
raise PluginOperationError(
f"Plugin '{plugin_name}' requires manifest_version {mv}, "
f"but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}. "
f"Run {recommended_update_command()} to update Hermes.",
) from None
if target.exists():
if not force:
raise PluginOperationError(
f"Plugin '{plugin_name}' already exists. Use force reinstall "
f"or run `hermes plugins update {plugin_name}`.",
)
shutil.rmtree(target)
shutil.move(str(tmp_target), str(target))
has_yaml = (target / "plugin.yaml").exists() or (target / "plugin.yml").exists()
if not has_yaml and not (target / "__init__.py").exists():
logger.warning(
"%s has no plugin.yaml / __init__.py; may not be a valid plugin",
plugin_name,
)
from rich.console import Console
_copy_example_files(target, Console())
installed_manifest = _read_manifest(target)
installed_name = installed_manifest.get("name") or target.name
return target, installed_manifest, installed_name
def cmd_install(
identifier: str,
force: bool = False,
@@ -293,7 +405,6 @@ def cmd_install(
After install, prompt "Enable now? [y/N]" unless *enable* is provided
(True = auto-enable without prompting, False = install disabled).
"""
import tempfile
from rich.console import Console
console = Console()
@@ -304,114 +415,41 @@ def cmd_install(
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Warn about insecure / local URL schemes
if git_url.startswith(("http://", "file://")):
console.print(
"[yellow]Warning:[/yellow] Using insecure/local URL scheme. "
"Consider using https:// or git@ for production installs."
"Consider using https:// or git@ for production installs.",
)
plugins_dir = _plugins_dir()
console.print(f"[dim]Cloning {git_url}...[/dim]")
# Clone into a temp directory first so we can read plugin.yaml for the name
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
console.print(f"[dim]Cloning {git_url}...[/dim]")
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git clone timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(
f"[red]Error:[/red] Git clone failed:\n{result.stderr.strip()}"
)
sys.exit(1)
# Read manifest
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
# Sanitize plugin name against path traversal
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Check manifest_version compatibility
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' has invalid "
f"manifest_version '{mv}' (expected an integer)."
)
sys.exit(1)
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
)
sys.exit(1)
if target.exists():
if not force:
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' already exists at {target}.\n"
f"Use [bold]--force[/bold] to remove and reinstall, or "
f"[bold]hermes plugins update {plugin_name}[/bold] to pull latest."
)
sys.exit(1)
console.print(f"[dim] Removing existing {plugin_name}...[/dim]")
shutil.rmtree(target)
# Move from temp to final location
shutil.move(str(tmp_target), str(target))
# Validate it looks like a plugin
if not (target / "plugin.yaml").exists() and not (target / "__init__.py").exists():
if not (target / "plugin.yaml").exists() and not (target / "plugin.yml").exists() and not (
target / "__init__.py"
).exists():
console.print(
f"[yellow]Warning:[/yellow] {plugin_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin."
f"[yellow]Warning:[/yellow] {installed_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin.",
)
# Copy .example files to their real names (e.g. config.yaml.example → config.yaml)
_copy_example_files(target, console)
# Re-read manifest from installed location (for env var prompting)
installed_manifest = _read_manifest(target)
# Prompt for required environment variables before showing after-install docs
_prompt_plugin_env_vars(installed_manifest, console)
_display_after_install(target, identifier)
# Determine the canonical plugin name for enable-list bookkeeping.
installed_name = installed_manifest.get("name") or target.name
# Decide whether to enable: explicit flag > interactive prompt > default off
should_enable = enable
if should_enable is None:
# Interactive prompt unless stdin isn't a TTY (scripted install).
if sys.stdin.isatty() and sys.stdout.isatty():
try:
answer = input(
f" Enable '{installed_name}' now? [y/N]: "
f" Enable '{installed_name}' now? [y/N]: ",
).strip().lower()
should_enable = answer in ("y", "yes")
except (EOFError, KeyboardInterrupt):
@@ -427,12 +465,12 @@ def cmd_install(
_save_enabled_set(enabled)
_save_disabled_set(disabled)
console.print(
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled."
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled.",
)
else:
console.print(
f"[dim]Plugin installed but not enabled. "
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]"
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]",
)
console.print("[dim]Restart the gateway for the plugin to take effect:[/dim]")
@@ -462,36 +500,22 @@ def cmd_update(name: str) -> None:
console.print(f"[dim]Updating {name}...[/dim]")
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git pull timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(f"[red]Error:[/red] Git pull failed:\n{result.stderr.strip()}")
ok, output = _git_pull_plugin_dir(target)
if not ok:
console.print(f"[red]Error:[/red] {output}")
sys.exit(1)
# Copy any new .example files
_copy_example_files(target, console)
output = result.stdout.strip()
if "Already up to date" in output:
out = output.strip()
if "Already up to date" in out:
console.print(
f"[green]✓[/green] Plugin [bold]{name}[/bold] is already up to date."
)
else:
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] updated.")
console.print(f"[dim]{output}[/dim]")
console.print(f"[dim]{out}[/dim]")
def cmd_remove(name: str) -> None:
@@ -1244,6 +1268,247 @@ def _run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
print()
def dashboard_install_plugin(
identifier: str,
*,
force: bool,
enable: bool,
) -> dict[str, Any]:
"""Non-interactive install for the web dashboard. Returns a JSON-serializable dict."""
warnings: list[str] = []
try:
git_url = _resolve_git_url(identifier)
if git_url.startswith(("http://", "file://")):
warnings.append(
"Insecure URL scheme; prefer https:// or git@ for production installs.",
)
except ValueError:
pass
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as exc:
return {"ok": False, "error": str(exc)}
missing_env = _missing_requires_env_names(installed_manifest)
if enable:
en = _get_enabled_set()
dis = _get_disabled_set()
en.add(installed_name)
dis.discard(installed_name)
_save_enabled_set(en)
_save_disabled_set(dis)
hint: str | None = None
ap = target / "after-install.md"
if ap.exists():
hint = str(ap)
return {
"ok": True,
"plugin_name": installed_name,
"warnings": warnings,
"missing_env": missing_env,
"after_install_path": hint,
"enabled": enable,
}
def _get_plugin_toolset_key(name: str) -> Optional[str]:
"""Return the toolset key a plugin registers its tools under, or None.
Queries the live tool registry the plugin must already be loaded.
Falls back to reading ``provides_tools`` from plugin.yaml and looking
up the toolset from the registry for the first tool name found.
"""
try:
from tools.registry import registry
except Exception:
return None
# Check the plugin manager for tools this plugin registered
try:
from hermes_cli.plugins import discover_plugins, get_plugin_manager
discover_plugins() # idempotent — ensures plugins are loaded
manager = get_plugin_manager()
for _key, loaded in manager._plugins.items():
if loaded.manifest.name == name or _key == name:
for tool_name in loaded.tools_registered:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
break
except Exception:
pass
# Fallback: read provides_tools from manifest on disk and query registry
try:
from hermes_cli.plugins import get_bundled_plugins_dir
for base in (get_bundled_plugins_dir(), _plugins_dir()):
if not base.is_dir():
continue
candidate = base / name
if candidate.is_dir():
manifest = _read_manifest(candidate)
for tool_name in manifest.get("provides_tools") or []:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
except Exception:
pass
return None
def _toggle_plugin_toolset(name: str, *, enable: bool) -> None:
"""Add or remove a plugin's toolset from platform_toolsets for all platforms.
Only acts if the plugin actually provides tools (has a toolset key).
"""
toolset_key = _get_plugin_toolset_key(name)
if not toolset_key:
return
from hermes_cli.config import load_config, save_config
config = load_config()
platform_toolsets = config.get("platform_toolsets")
if not isinstance(platform_toolsets, dict):
platform_toolsets = {}
config["platform_toolsets"] = platform_toolsets
changed = False
for platform, ts_list in platform_toolsets.items():
if not isinstance(ts_list, list):
continue
if enable:
if toolset_key not in ts_list:
ts_list.append(toolset_key)
changed = True
else:
if toolset_key in ts_list:
ts_list.remove(toolset_key)
changed = True
# If enabling and no platforms have toolset lists yet, add to "cli" at minimum
if enable and not changed and not platform_toolsets:
platform_toolsets["cli"] = [toolset_key]
changed = True
if changed:
save_config(config)
def dashboard_set_agent_plugin_enabled(name: str, *, enabled: bool) -> dict[str, Any]:
"""Enable or disable a plugin in ``config.yaml`` (runtime allow/deny lists).
For plugins that provide tools (toolsets), also toggles the toolset in
``platform_toolsets`` so the agent actually sees the tools in sessions.
"""
if not _plugin_exists(name):
return {"ok": False, "error": f"Plugin '{name}' is not installed or bundled."}
en = _get_enabled_set()
dis = _get_disabled_set()
if enabled:
if name in en and name not in dis:
return {"ok": True, "name": name, "unchanged": True}
en.add(name)
dis.discard(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=True)
return {"ok": True, "name": name, "unchanged": False}
if name not in en and name in dis:
return {"ok": True, "name": name, "unchanged": True}
en.discard(name)
dis.add(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=False)
return {"ok": True, "name": name, "unchanged": False}
def _user_installed_plugin_dir(name: str) -> Optional[Path]:
"""Resolved path under ``~/.hermes/plugins/<name>`` if it exists."""
plugins_dir = _plugins_dir()
try:
target = _sanitize_plugin_name(name, plugins_dir)
except ValueError:
return None
return target if target.is_dir() else None
def dashboard_update_user_plugin(name: str) -> dict[str, Any]:
"""``git pull`` inside ``~/.hermes/plugins/<name>``."""
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {_plugins_dir()}.",
}
if not (target / ".git").exists():
return {
"ok": False,
"error": f"Plugin '{name}' is not a git checkout; cannot pull updates.",
}
ok, msg = _git_pull_plugin_dir(target)
if not ok:
return {"ok": False, "error": msg}
from rich.console import Console
_copy_example_files(target, Console())
unchanged = "Already up to date" in msg
return {"ok": True, "name": name, "output": msg, "unchanged": unchanged}
def _git_pull_plugin_dir(target: Path) -> tuple[bool, str]:
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
return False, "git is not installed or not in PATH."
except subprocess.TimeoutExpired:
return False, "Git pull timed out after 60 seconds."
if result.returncode != 0:
err = (result.stderr or "").strip() or result.stdout.strip()
return False, err or "git pull failed."
return True, result.stdout.strip()
def dashboard_remove_user_plugin(name: str) -> dict[str, Any]:
"""Delete a plugin tree under ``~/.hermes/plugins/`` only."""
plugins_dir = _plugins_dir()
for n, _ver, _d, src, _path in _discover_all_plugins():
if n == name and src == "bundled":
return {"ok": False, "error": "Bundled plugins cannot be removed from the dashboard."}
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {plugins_dir}.",
}
shutil.rmtree(target)
return {"ok": True, "name": name}
def plugins_command(args) -> None:
"""Dispatch hermes plugins subcommands."""
action = getattr(args, "plugins_action", None)
+111 -73
View File
@@ -179,8 +179,33 @@ def _get_wrapper_dir() -> Path:
# Validation
# ---------------------------------------------------------------------------
def normalize_profile_name(name: str) -> str:
"""Return the canonical profile id used on disk and in CLI ``-p`` argv.
Named profiles are stored lowercase under ``profiles/<id>/``. The special
alias ``default`` is matched case-insensitively (``Default`` ``default``).
Dashboards and tools may pass title-cased display labels; normalize before
validation, assignment, and subprocess spawn (see issue #18498).
"""
if not isinstance(name, str):
name = str(name)
stripped = name.strip()
if not stripped:
raise ValueError("profile name cannot be empty")
if stripped.casefold() == "default":
return "default"
return stripped.lower()
def validate_profile_name(name: str) -> None:
"""Raise ``ValueError`` if *name* is not a valid profile identifier."""
"""Raise ``ValueError`` if *name* is not a valid profile identifier.
Validates the input as-given strict lowercase match. Callers that accept
mixed-case or title-cased input from users (dashboard UI, CLI args) should
call :func:`normalize_profile_name` first. This separation keeps validate
honest about what the on-disk directory name must look like, while
ingress-point normalization handles UX flexibility (see #18498).
"""
if name == "default":
return # special alias for ~/.hermes
if not _PROFILE_ID_RE.match(name):
@@ -192,16 +217,18 @@ def validate_profile_name(name: str) -> None:
def get_profile_dir(name: str) -> Path:
"""Resolve a profile name to its HERMES_HOME directory."""
if name == "default":
canon = normalize_profile_name(name)
if canon == "default":
return _get_default_hermes_home()
return _get_profiles_root() / name
return _get_profiles_root() / canon
def profile_exists(name: str) -> bool:
"""Check whether a profile directory exists."""
if name == "default":
canon = normalize_profile_name(name)
if canon == "default":
return True
return get_profile_dir(name).is_dir()
return get_profile_dir(canon).is_dir()
# ---------------------------------------------------------------------------
@@ -213,28 +240,29 @@ def check_alias_collision(name: str) -> Optional[str]:
Checks: reserved names, hermes subcommands, existing binaries in PATH.
"""
if name in _RESERVED_NAMES:
return f"'{name}' is a reserved name"
if name in _HERMES_SUBCOMMANDS:
return f"'{name}' conflicts with a hermes subcommand"
canon = normalize_profile_name(name)
if canon in _RESERVED_NAMES:
return f"'{canon}' is a reserved name"
if canon in _HERMES_SUBCOMMANDS:
return f"'{canon}' conflicts with a hermes subcommand"
# Check existing commands in PATH
wrapper_dir = _get_wrapper_dir()
try:
result = subprocess.run(
["which", name], capture_output=True, text=True, timeout=5,
["which", canon], capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
existing_path = result.stdout.strip()
# Allow overwriting our own wrappers
if existing_path == str(wrapper_dir / name):
if existing_path == str(wrapper_dir / canon):
try:
content = (wrapper_dir / name).read_text()
content = (wrapper_dir / canon).read_text()
if "hermes -p" in content:
return None # it's our wrapper, safe to overwrite
except Exception:
pass
return f"'{name}' conflicts with an existing command ({existing_path})"
return f"'{canon}' conflicts with an existing command ({existing_path})"
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
@@ -252,6 +280,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:
Returns the path to the created wrapper, or None if creation failed.
"""
canon = normalize_profile_name(name)
wrapper_dir = _get_wrapper_dir()
try:
wrapper_dir.mkdir(parents=True, exist_ok=True)
@@ -259,9 +288,9 @@ def create_wrapper_script(name: str) -> Optional[Path]:
print(f"⚠ Could not create {wrapper_dir}: {e}")
return None
wrapper_path = wrapper_dir / name
wrapper_path = wrapper_dir / canon
try:
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {canon} "$@"\n')
wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
return wrapper_path
except OSError as e:
@@ -271,7 +300,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:
def remove_wrapper_script(name: str) -> bool:
"""Remove the wrapper script for a profile. Returns True if removed."""
wrapper_path = _get_wrapper_dir() / name
wrapper_path = _get_wrapper_dir() / normalize_profile_name(name)
if wrapper_path.exists():
try:
# Verify it's our wrapper before removing
@@ -421,16 +450,17 @@ def create_profile(
Path
The newly created profile directory.
"""
validate_profile_name(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
if name == "default":
if canon == "default":
raise ValueError(
"Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
)
profile_dir = get_profile_dir(name)
profile_dir = get_profile_dir(canon)
if profile_dir.exists():
raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
# Resolve clone source
source_dir = None
@@ -440,6 +470,7 @@ def create_profile(
from hermes_constants import get_hermes_home
source_dir = get_hermes_home()
else:
clone_from = normalize_profile_name(clone_from)
validate_profile_name(clone_from)
source_dir = get_profile_dir(clone_from)
if not source_dir.is_dir():
@@ -540,24 +571,25 @@ def delete_profile(name: str, yes: bool = False) -> Path:
Returns the path that was removed.
"""
validate_profile_name(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
if name == "default":
if canon == "default":
raise ValueError(
"Cannot delete the default profile (~/.hermes).\n"
"To remove everything, use: hermes uninstall"
)
profile_dir = get_profile_dir(name)
profile_dir = get_profile_dir(canon)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
raise FileNotFoundError(f"Profile '{canon}' does not exist.")
# Show what will be deleted
model, provider = _read_config_model(profile_dir)
gw_running = _check_gateway_running(profile_dir)
skill_count = _count_skills(profile_dir)
print(f"\nProfile: {name}")
print(f"\nProfile: {canon}")
print(f"Path: {profile_dir}")
if model:
print(f"Model: {model}" + (f" ({provider})" if provider else ""))
@@ -569,7 +601,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
]
# Check for service
wrapper_path = _get_wrapper_dir() / name
wrapper_path = _get_wrapper_dir() / canon
has_wrapper = wrapper_path.exists()
if has_wrapper:
items.append(f"Command alias ({wrapper_path})")
@@ -584,16 +616,16 @@ def delete_profile(name: str, yes: bool = False) -> Path:
if not yes:
print()
try:
confirm = input(f"Type '{name}' to confirm: ").strip()
confirm = input(f"Type '{canon}' to confirm: ").strip()
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return profile_dir
if confirm != name:
if confirm != canon:
print("Cancelled.")
return profile_dir
# 1. Disable service (prevents auto-restart)
_cleanup_gateway_service(name, profile_dir)
_cleanup_gateway_service(canon, profile_dir)
# 2. Stop running gateway
if gw_running:
@@ -601,7 +633,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
# 3. Remove wrapper script
if has_wrapper:
if remove_wrapper_script(name):
if remove_wrapper_script(canon):
print(f"✓ Removed {wrapper_path}")
# 4. Remove profile directory
@@ -614,13 +646,13 @@ def delete_profile(name: str, yes: bool = False) -> Path:
# 5. Clear active_profile if it pointed to this profile
try:
active = get_active_profile()
if active == name:
if active == canon:
set_active_profile("default")
print("✓ Active profile reset to default")
except Exception:
pass
print(f"\nProfile '{name}' deleted.")
print(f"\nProfile '{canon}' deleted.")
return profile_dir
@@ -730,22 +762,23 @@ def set_active_profile(name: str) -> None:
Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
"""
validate_profile_name(name)
if name != "default" and not profile_exists(name):
canon = normalize_profile_name(name)
validate_profile_name(canon)
if canon != "default" and not profile_exists(canon):
raise FileNotFoundError(
f"Profile '{name}' does not exist. "
f"Create it with: hermes profile create {name}"
f"Profile '{canon}' does not exist. "
f"Create it with: hermes profile create {canon}"
)
path = _get_active_profile_path()
path.parent.mkdir(parents=True, exist_ok=True)
if name == "default":
if canon == "default":
# Remove the file to indicate default
path.unlink(missing_ok=True)
else:
# Atomic write
tmp = path.with_suffix(".tmp")
tmp.write_text(name + "\n")
tmp.write_text(canon + "\n")
tmp.replace(path)
@@ -811,16 +844,17 @@ def export_profile(name: str, output_path: str) -> Path:
"""
import tempfile
validate_profile_name(name)
profile_dir = get_profile_dir(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
profile_dir = get_profile_dir(canon)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
raise FileNotFoundError(f"Profile '{canon}' does not exist.")
output = Path(output_path)
# shutil.make_archive wants the base name without extension
base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")
if name == "default":
if canon == "default":
# The default profile IS ~/.hermes itself — its parent is ~/ and its
# directory name is ".hermes", not "default". We stage a clean copy
# under a temp dir so the archive contains ``default/...``.
@@ -836,14 +870,14 @@ def export_profile(name: str, output_path: str) -> Path:
# Named profiles — stage a filtered copy to exclude credentials
with tempfile.TemporaryDirectory() as tmpdir:
staged = Path(tmpdir) / name
staged = Path(tmpdir) / canon
_CREDENTIAL_FILES = {"auth.json", ".env"}
shutil.copytree(
profile_dir,
staged,
ignore=lambda d, contents: _CREDENTIAL_FILES & set(contents),
)
result = shutil.make_archive(base, "gztar", tmpdir, name)
result = shutil.make_archive(base, "gztar", tmpdir, canon)
return Path(result)
@@ -952,16 +986,17 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
# Archives exported from the default profile have "default/" as top-level
# dir. Importing as "default" would target ~/.hermes itself — disallow
# that and guide the user toward a named profile.
if inferred_name == "default":
canon = normalize_profile_name(inferred_name)
validate_profile_name(canon)
if canon == "default":
raise ValueError(
"Cannot import as 'default' — that is the built-in root profile (~/.hermes). "
"Specify a different name: hermes profile import <archive> --name <name>"
)
validate_profile_name(inferred_name)
profile_dir = get_profile_dir(inferred_name)
profile_dir = get_profile_dir(canon)
if profile_dir.exists():
raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
profiles_root = _get_profiles_root()
profiles_root.mkdir(parents=True, exist_ok=True)
@@ -977,8 +1012,8 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
)
final_source = extracted
if archive_root != inferred_name:
final_source = staging_root / inferred_name
if archive_root != canon:
final_source = staging_root / canon
extracted.rename(final_source)
shutil.move(str(final_source), str(profile_dir))
@@ -1048,25 +1083,27 @@ def rename_profile(old_name: str, new_name: str) -> Path:
Returns the new profile directory.
"""
validate_profile_name(old_name)
validate_profile_name(new_name)
old_canon = normalize_profile_name(old_name)
new_canon = normalize_profile_name(new_name)
validate_profile_name(old_canon)
validate_profile_name(new_canon)
if old_name == "default":
if old_canon == "default":
raise ValueError("Cannot rename the default profile.")
if new_name == "default":
if new_canon == "default":
raise ValueError("Cannot rename to 'default' — it is reserved.")
old_dir = get_profile_dir(old_name)
new_dir = get_profile_dir(new_name)
old_dir = get_profile_dir(old_canon)
new_dir = get_profile_dir(new_canon)
if not old_dir.is_dir():
raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
raise FileNotFoundError(f"Profile '{old_canon}' does not exist.")
if new_dir.exists():
raise FileExistsError(f"Profile '{new_name}' already exists.")
raise FileExistsError(f"Profile '{new_canon}' already exists.")
# 1. Stop gateway if running
if _check_gateway_running(old_dir):
_cleanup_gateway_service(old_name, old_dir)
_cleanup_gateway_service(old_canon, old_dir)
_stop_gateway_process(old_dir)
# 2. Rename directory
@@ -1074,22 +1111,22 @@ def rename_profile(old_name: str, new_name: str) -> Path:
print(f"✓ Renamed {old_dir.name}{new_dir.name}")
# 3. Update profile-scoped Honcho host blocks, preserving aiPeer identity
_migrate_honcho_profile_host(old_name, new_name, new_dir)
_migrate_honcho_profile_host(old_canon, new_canon, new_dir)
# 4. Update wrapper script
remove_wrapper_script(old_name)
collision = check_alias_collision(new_name)
remove_wrapper_script(old_canon)
collision = check_alias_collision(new_canon)
if not collision:
create_wrapper_script(new_name)
print(f"✓ Alias updated: {new_name}")
create_wrapper_script(new_canon)
print(f"✓ Alias updated: {new_canon}")
else:
print(f"⚠ Cannot create alias '{new_name}'{collision}")
print(f"⚠ Cannot create alias '{new_canon}'{collision}")
# 5. Update active_profile if it pointed to old name
try:
if get_active_profile() == old_name:
set_active_profile(new_name)
print(f"✓ Active profile updated: {new_name}")
if get_active_profile() == old_canon:
set_active_profile(new_canon)
print(f"✓ Active profile updated: {new_canon}")
except Exception:
pass
@@ -1191,13 +1228,14 @@ def resolve_profile_env(profile_name: str) -> str:
Called early in the CLI entry point, before any hermes modules
are imported, to set the HERMES_HOME environment variable.
"""
validate_profile_name(profile_name)
profile_dir = get_profile_dir(profile_name)
canon = normalize_profile_name(profile_name)
validate_profile_name(canon)
profile_dir = get_profile_dir(canon)
if profile_name != "default" and not profile_dir.is_dir():
if canon != "default" and not profile_dir.is_dir():
raise FileNotFoundError(
f"Profile '{profile_name}' does not exist. "
f"Create it with: hermes profile create {profile_name}"
f"Profile '{canon}' does not exist. "
f"Create it with: hermes profile create {canon}"
)
return str(profile_dir)
+8 -3
View File
@@ -108,9 +108,14 @@ class PtyBridge:
"(or pip install -e '.[pty]')."
)
raise PtyUnavailableError("Pseudo-terminals are unavailable.")
# Let caller-supplied env fully override inheritance; if they pass
# None we inherit the server's env (same semantics as subprocess).
spawn_env = os.environ.copy() if env is None else env
# PTY-hosted programs expect TERM to describe the terminal type.
# CI often runs without TERM in the parent process, which makes
# simple terminal probes like `tput cols` fail before winsize reads.
# Preserve explicit caller overrides, but backfill a sensible default
# when TERM is missing or blank.
spawn_env = (os.environ.copy() if env is None else env.copy())
if not spawn_env.get("TERM"):
spawn_env["TERM"] = "xterm-256color"
proc = ptyprocess.PtyProcess.spawn( # type: ignore[union-attr]
list(argv),
cwd=cwd,
+71 -15
View File
@@ -15,6 +15,7 @@ import importlib.util
import json
import logging
import os
import re
import shutil
import sys
import copy
@@ -208,12 +209,23 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
else:
value = input(color(display, Colors.YELLOW))
return value.strip() or default or ""
cleaned = _sanitize_pasted_input(value)
return cleaned.strip() or default or ""
except (KeyboardInterrupt, EOFError):
print()
sys.exit(1)
_BRACKETED_PASTE_PATTERN = re.compile(r"\x1b\[\s*200~|\x1b\[\s*201~")
def _sanitize_pasted_input(value: str) -> str:
"""Strip terminal bracketed-paste control markers from pasted text."""
if not isinstance(value, str) or not value:
return value
return _BRACKETED_PASTE_PATTERN.sub("", value)
def _curses_prompt_choice(question: str, choices: list, default: int = 0, description: str | None = None) -> int:
"""Single-select menu using curses. Delegates to curses_radiolist."""
from hermes_cli.curses_ui import curses_radiolist
@@ -382,7 +394,7 @@ def _print_setup_summary(config: dict, hermes_home):
label = f"Web Search & Extract ({subscription_features.web.current_provider})"
tool_status.append((label, True, None))
else:
tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY/FIRECRAWL_API_URL, or TAVILY_API_KEY"))
tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY/FIRECRAWL_API_URL, TAVILY_API_KEY, or SEARXNG_URL"))
# Browser tools (local Chromium, Camofox, Browserbase, Browser Use, or Firecrawl)
browser_provider = subscription_features.browser.current_provider
@@ -964,7 +976,8 @@ def setup_model_provider(config: dict, *, quick: bool = False):
)
else:
_selected_vision_model = prompt(" Vision model (blank = use main/custom default)").strip()
save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
if _selected_vision_model:
save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
print_success(
f"Vision configured with {_base_url}"
+ (f" ({_selected_vision_model})" if _selected_vision_model else "")
@@ -1190,6 +1203,13 @@ def _setup_tts_provider(config: dict):
"Falling back to Edge TTS."
)
selected = "edge"
if selected == "xai":
print()
voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")
if voice_id and voice_id.strip():
config.setdefault("tts", {}).setdefault("xai", {})["voice_id"] = voice_id.strip()
print_success(f"xAI voice_id set to: {voice_id.strip()}")
elif selected == "minimax":
existing = get_env_value("MINIMAX_API_KEY")
@@ -1321,15 +1341,13 @@ def setup_terminal_backend(config: dict):
print_success("Terminal backend: Local")
print_info("Commands run directly on this machine.")
# CWD for messaging
# Gateway/cron working directory
print()
print_info("Working directory for messaging sessions:")
print_info(" When using Hermes via Telegram/Discord, this is where")
print_info(
" the agent starts. CLI mode always starts in the current directory."
)
print_info("Gateway working directory:")
print_info(" Used by Telegram/Discord/cron sessions.")
print_info(" CLI/TUI always uses your launch directory instead.")
current_cwd = cfg_get(config, "terminal", "cwd", default="")
cwd = prompt(" Messaging working directory", current_cwd or str(Path.home()))
cwd = prompt(" Gateway working directory", current_cwd or str(Path.home()))
if cwd:
config["terminal"]["cwd"] = cwd
@@ -1643,7 +1661,11 @@ def setup_terminal_backend(config: dict):
def _apply_default_agent_settings(config: dict):
"""Apply recommended defaults for all agent settings without prompting."""
config.setdefault("agent", {})["max_turns"] = 90
save_env_value("HERMES_MAX_ITERATIONS", "90")
# config.yaml is the authoritative source for max_turns; the gateway
# bridges it into HERMES_MAX_ITERATIONS at startup. We no longer write
# to .env to avoid the dual-source inconsistency that caused the
# 60-vs-500 bug (stale .env entry silently shadowing config.yaml).
remove_env_value("HERMES_MAX_ITERATIONS")
config.setdefault("display", {})["tool_progress"] = "all"
@@ -1673,9 +1695,10 @@ def setup_agent_settings(config: dict):
print()
# ── Max Iterations ──
current_max = get_env_value("HERMES_MAX_ITERATIONS") or str(
cfg_get(config, "agent", "max_turns", default=90)
)
# config.yaml is authoritative; read from there. If a legacy .env
# entry is still around (from pre-PR#18413 setups), prefer the
# config value so we don't surface a stale number to the user.
current_max = str(cfg_get(config, "agent", "max_turns", default=90))
print_info("Maximum tool-calling iterations per conversation.")
print_info("Higher = more complex tasks, but costs more tokens.")
print_info(
@@ -1686,9 +1709,13 @@ def setup_agent_settings(config: dict):
try:
max_iter = int(max_iter_str)
if max_iter > 0:
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
# Write to config.yaml (authoritative) only. Also clean up any
# stale .env entry from earlier setup runs — the gateway's
# bridge in gateway/run.py now unconditionally derives
# HERMES_MAX_ITERATIONS from agent.max_turns at startup.
config.setdefault("agent", {})["max_turns"] = max_iter
config.pop("max_turns", None)
remove_env_value("HERMES_MAX_ITERATIONS")
print_success(f"Max iterations set to {max_iter}")
except ValueError:
print_warning("Invalid number, keeping current value")
@@ -2033,6 +2060,16 @@ def _setup_slack():
print_warning("⚠️ No Slack allowlist set - unpaired users will be denied by default.")
print_info(" Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")
print()
print_info("📬 Home Channel: where Hermes delivers cron job results,")
print_info(" cross-platform messages, and notifications.")
print_info(" To get a channel ID: open the channel in Slack, then right-click")
print_info(" the channel name → Copy link — the ID starts with C (e.g. C01ABC2DE3F).")
print_info(" You can also set this later by typing /set-home in a Slack channel.")
home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
if home_channel:
save_env_value("SLACK_HOME_CHANNEL", home_channel.strip())
def _write_slack_manifest_and_instruct():
"""Generate the Slack manifest, write it under HERMES_HOME, and print
@@ -2979,6 +3016,21 @@ def run_setup_wizard(args):
config = load_config()
hermes_home = get_hermes_home()
# Back up existing config before setup modifies it (#3522)
config_path = get_config_path()
if config_path.exists():
from datetime import datetime as _dt
_backup_path = config_path.with_suffix(
f".yaml.bak.{_dt.now().strftime('%Y%m%d_%H%M%S')}"
)
try:
import shutil
shutil.copy2(config_path, _backup_path)
except Exception:
_backup_path = None
else:
_backup_path = None
# Detect non-interactive environments (headless SSH, Docker, CI/CD)
non_interactive = getattr(args, 'non_interactive', False)
if not non_interactive and not is_interactive_stdin():
@@ -3148,6 +3200,10 @@ def run_setup_wizard(args):
# Save and show summary
save_config(config)
if _backup_path and _backup_path.exists():
print_info(f"Previous config backed up to: {_backup_path}")
print_info("If setup changed a value you customized, restore it with:")
print_info(f" cp {_backup_path} {config_path}")
_print_setup_summary(config, hermes_home)
_offer_launch_chat()
+1
View File
@@ -42,6 +42,7 @@ All fields are optional. Missing values inherit from the ``default`` skin.
session_border: "#8B8682" # Session ID dim color
status_bar_bg: "#1a1a2e" # TUI status/usage bar background
voice_status_bg: "#1a1a2e" # TUI voice status background
selection_bg: "#333355" # TUI mouse-selection highlight background
completion_menu_bg: "#1a1a2e" # Completion menu background
completion_menu_current_bg: "#333355" # Active completion row background
completion_menu_meta_bg: "#1a1a2e" # Completion meta column background
+25 -5
View File
@@ -122,11 +122,16 @@ def show_status(args):
print()
print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))
keys = {
# Values may be a single env var name (str) or a tuple of alternates (first found wins).
keys: dict[str, str | tuple[str, ...]] = {
"OpenRouter": "OPENROUTER_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"NVIDIA": "NVIDIA_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Anthropic": ("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN"),
"Google / Gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
"DeepSeek": "DEEPSEEK_API_KEY",
"xAI / Grok": "XAI_API_KEY",
"NVIDIA NIM": "NVIDIA_API_KEY",
"Z.AI / GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
"MiniMax": "MINIMAX_API_KEY",
@@ -142,8 +147,23 @@ def show_status(args):
"GitHub": "GITHUB_TOKEN",
}
for name, env_var in keys.items():
value = get_env_value(env_var) or ""
def _resolve_env(env_ref) -> str:
"""Return first non-empty env var value from a str or tuple of names."""
if isinstance(env_ref, tuple):
for candidate in env_ref:
v = get_env_value(candidate) or ""
if v:
return v
return ""
return get_env_value(env_ref) or ""
for name, env_ref in keys.items():
# Anthropic already has a dedicated lookup below; keep that as the
# single source of truth (it also resolves OAuth tokens), skip here
# so we don't print two "Anthropic" rows.
if name == "Anthropic":
continue
value = _resolve_env(env_ref)
has_key = bool(value)
display = redact_key(value) if not show_all else value
print(f" {name:<12} {check_mark(has_key)} {display}")
+139 -1
View File
@@ -192,7 +192,7 @@ TIPS = [
"Voice messages on Telegram, Discord, WhatsApp, and Slack are auto-transcribed.",
# --- Gateway & Messaging ---
"Hermes runs on 18 platforms: Telegram, Discord, Slack, WhatsApp, Signal, Matrix, email, and more.",
"Hermes runs on 21 messaging platforms: Telegram, Discord, Slack, WhatsApp, Signal, Matrix, IRC, Microsoft Teams, email, and more.",
"hermes gateway install sets it up as a system service that starts on boot.",
"DingTalk uses Stream Mode — no webhooks or public URL needed.",
"BlueBubbles brings iMessage to Hermes via a local macOS server.",
@@ -334,6 +334,144 @@ TIPS = [
"MCP ${ENV_VAR} placeholders in config are resolved at server spawn — including vars from ~/.hermes/.env.",
"Skills from trusted repos (NousResearch) get a 'trusted' security level; community skills get extra scanning.",
"The skills quarantine at ~/.hermes/skills/.hub/quarantine/ holds skills pending security review.",
# --- Advanced Slash Commands ---
'/steer <prompt> injects a note after the next tool call — nudge direction mid-task without interrupting.',
'/goal <text> sets a standing Ralph-loop objective — Hermes auto-continues turn after turn until a judge says done.',
'/snapshot create [label] saves a full state snapshot of Hermes config; /snapshot restore <id> reverts later.',
'/copy [N] copies the last assistant response to your clipboard, or the Nth-from-last with a number.',
'/redraw forces a full UI repaint, fixing terminal drift after tmux resize or mouse selection artifacts.',
'/agents (alias /tasks) shows active agents and running background tasks across the current session.',
'/footer toggles the gateway footer on final replies showing model, tool counts, and turn timing.',
'/busy queue|steer|interrupt controls what pressing Enter does while Hermes is working.',
'/topic in Telegram DMs enables user-managed multi-session topic mode — /topic <id> restores past sessions inline.',
'/approve session|always runs a pending dangerous command with your chosen trust scope; /deny rejects it.',
'/restart gracefully restarts the gateway after draining active runs, then pings the requester when back up.',
'/kanban boards switch <slug> changes the active multi-project Kanban board from inside chat.',
'/reload reloads ~/.hermes/.env into the running session — pick up new API keys without restarting.',
# --- Cron (no-agent & scripts) ---
'cronjob with no_agent=True runs a script on schedule and sends its stdout directly — zero tokens, zero LLM.',
'An empty cron script stdout means silent tick — nothing is delivered, perfect for threshold watchdogs.',
"HERMES_CRON_MAX_PARALLEL (default 4) caps how many cron jobs run per tick so bursts don't saturate your keys.",
# --- Gateway Hooks ---
'Gateway hooks live under ~/.hermes/hooks/<name>/ with HOOK.yaml + handler.py — handler must be named `handle`.',
'Hook events include gateway:startup, session:start, agent:step, and command:* wildcard subscriptions.',
'Drop a ~/.hermes/BOOT.md checklist and a gateway:startup hook runs it as a one-shot agent every boot.',
# --- Curator ---
'hermes curator run --dry-run previews what the curator would archive or consolidate without mutating anything.',
"hermes curator pin <skill> hard-fences a skill against both auto-archival and the agent's skill_manage tool.",
'hermes curator rollback restores skills from a pre-run snapshot — backups live under skills/.curator_backups/.',
# --- Credential Pools & Routing ---
'hermes auth reset <provider> clears all cooldowns and exhaustion flags on a credential pool.',
'credential_pool_strategies.<provider>: round_robin cycles keys evenly instead of the fill_first default.',
'use_gateway: true per-tool routes web, image, tts, or browser through your Nous subscription — no extra keys.',
'provider_routing.data_collection: deny excludes data-storing providers on OpenRouter.',
'provider_routing.require_parameters: true only routes to providers that support every param in your request.',
# --- TUI & Dashboard ---
'HERMES_TUI_RESUME=1 auto-re-attaches to the most recent TUI session on launch — handy after SSH drops.',
"HERMES_TUI_THEME=light|dark|<hex> forces the TUI theme on terminals that don't set COLORFGBG.",
'Ctrl+G or Ctrl+X Ctrl+E in the TUI opens the input buffer in $EDITOR for long multi-line prompts.',
'The TUI renders LaTeX inline — $E=mc^2$ becomes Unicode math instead of raw TeX.',
'hermes dashboard launches a local web UI at 127.0.0.1:9119 — zero data leaves localhost.',
'hermes dashboard --tui embeds the full Hermes TUI in your browser via xterm.js and a WebSocket PTY.',
'Drop a YAML in ~/.hermes/dashboard-themes/ with two palette colors to reskin the entire dashboard.',
'Dashboard plugins are drop-in: manifest.json + JS bundle in ~/.hermes/dashboard-plugins/ — no npm build required.',
'layoutVariant: cockpit in a dashboard theme adds a 260px left rail that plugins can populate via the sidebar slot.',
# --- Env Vars & Config Gates ---
"display.tool_progress_command: true exposes /verbose on messaging platforms; it's CLI-only by default.",
'HERMES_BACKGROUND_NOTIFICATIONS=result only pings when background tasks finish (vs all/error/off).',
'HERMES_WRITE_SAFE_ROOT restricts write_file and patch to a directory prefix; writes outside require approval.',
'HERMES_IGNORE_RULES skips auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills.',
'HERMES_ACCEPT_HOOKS auto-approves unseen shell hooks declared in config.yaml without a TTY prompt.',
'auxiliary.goal_judge.model routes the /goal judge to a cheap fast model to keep loop cost near zero.',
'Checkpoints skip directories with more than 50,000 files to avoid slow git operations on massive monorepos.',
# --- TTS ---
'tts.provider: piper runs 44-language local TTS on CPU — voices auto-download to ~/.hermes/cache/piper-voices/.',
'tts.providers.<name>.type: command wires any CLI TTS engine with {input_path} and {output_path} placeholders.',
# --- API Server & Proxy ---
'API_SERVER_ENABLED=true runs an OpenAI-compatible endpoint alongside the gateway for Open WebUI and LibreChat.',
'GATEWAY_PROXY_URL runs a split setup: platform I/O locally, agent work delegated to a remote API server.',
# --- Platform-specific ---
'MATRIX_DEVICE_ID pins a stable device ID for E2EE — without it, keys rotate every start and historic decrypt breaks.',
'TELEGRAM_WEBHOOK_SECRET is required whenever TELEGRAM_WEBHOOK_URL is set — generate with openssl rand -hex 32.',
# --- Batch ---
"batch_runner.py --resume content-matches completed prompts by text so dataset reorders don't re-run finished work.",
# --- Less-Known Slash Commands ---
'/new starts a fresh session in place (alias /reset) — fresh session ID, clean history, CLI stays open.',
'/clear wipes the terminal screen AND starts a new session — one shortcut for a visual reset.',
'/history prints the current conversation in-line without leaving the CLI — useful for a quick re-read.',
'/save writes the current conversation to disk without ending the session.',
'/status shows session info at a glance: ID, title, model, token usage, and elapsed time.',
'/image <path> attaches a local image file for your next prompt without pasting or drag-and-drop.',
'/platforms shows gateway and messaging-platform connection status right from inside chat.',
'/commands paginates the full slash-command + installed-skill list — useful on platforms without tab completion.',
'/toolsets lists every available toolset so you know what -t/--toolsets accepts.',
'/gquota shows Google Gemini Code Assist quota usage with progress bars when that provider is active.',
'/voice tts toggles TTS-only mode — agent replies out loud but you still type your prompts.',
'/reload-skills re-scans ~/.hermes/skills/ so drop-in skills appear without restarting the session.',
'/indicator kaomoji|emoji|unicode|ascii picks the TUI busy-indicator style shown during agent runs.',
'/debug uploads a support bundle (system info + logs) and returns shareable links — works in chat too.',
# --- CLI Subcommands & Flags ---
'hermes -z "<prompt>" is the purest one-shot: final answer on stdout, nothing else — ideal for piping in scripts.',
'hermes chat --pass-session-id injects the session ID into the system prompt so the agent can self-reference it.',
'hermes chat --image path/to/pic.png attaches a local image to a single -q query without a separate upload step.',
'hermes chat --ignore-user-config skips ~/.hermes/config.yaml — reproducible bug reports and CI runs.',
"hermes chat --source tool tags programmatic chats so they don't clutter hermes sessions list.",
'hermes dump --show-keys includes redacted API key fingerprints for deeper support debugging.',
'hermes sessions rename <ID> "new title" renames any past session; hermes sessions delete <ID> removes one.',
'hermes import restores a session export or profile archive produced by sessions export or profile export.',
'hermes fallback manages the fallback_model chain interactively — no hand-editing config.yaml.',
'hermes pairing rotates the DM pairing token — the first messager after rotation claims access to the bot.',
'hermes setup walks first-time users through provider, keys, and platform wiring in one interactive flow.',
'hermes status --deep runs the full health sweep across every component; plain hermes status is the quick view.',
# --- Agent Behavior Env Vars ---
'HERMES_AGENT_TIMEOUT=0 disables the gateway inactivity kill for a running agent — use for long research runs.',
'HERMES_ENABLE_PROJECT_PLUGINS=1 auto-loads repo-local plugins from ./.hermes/plugins/ — trust-gated by design.',
"HERMES_DISABLE_FILE_STATE_GUARD=1 turns off the 'file changed since you read it' guard on patch and write_file.",
'HERMES_ALLOW_PRIVATE_URLS=true lets web tools hit localhost and private networks — off by default in gateway mode.',
'HERMES_OPTIONAL_SKILLS=name1,name2 auto-installs extra optional-catalog skills on first run per profile.',
'HERMES_BUNDLED_SKILLS points at a custom bundled-skill tree — used by Homebrew and Nix packaging.',
'HERMES_DUMP_REQUEST_STDOUT=1 dumps every API request payload to stdout instead of log files.',
'HERMES_OAUTH_TRACE=1 logs redacted OAuth token exchange and refresh attempts for debugging provider auth.',
'HERMES_STREAM_RETRIES (default 3) controls mid-stream reconnect attempts on transient network errors.',
# --- Gateway Behavior Env Vars ---
'HERMES_GATEWAY_BUSY_ACK_ENABLED=false silences the ⚡/⏳/⏩ ack messages when a user messages a busy agent.',
'HERMES_AGENT_NOTIFY_INTERVAL (default 180s) sets how often the gateway pings with progress on long turns.',
'HERMES_RESTART_DRAIN_TIMEOUT (default 900s) caps how long /restart waits for in-flight runs before forcing.',
'HERMES_CHECKPOINT_TIMEOUT (default 30s) caps filesystem checkpoint creation — raise it on huge monorepos.',
# --- Auxiliary Tasks & Image Generation ---
'image_gen.model in config.yaml picks the FAL model: flux-2/klein, gpt-image-2, nano-banana-pro, and more.',
'image_gen.provider routes image generation through a plugin (OpenAI Images, Codex, FAL) instead of the default.',
'AUXILIARY_VISION_BASE_URL + AUXILIARY_VISION_API_KEY point vision analysis at any OpenAI-compatible endpoint.',
'auxiliary.session_search.max_concurrency bounds how many matched sessions are summarized in parallel (default 3).',
'auxiliary.session_search.extra_body forwards provider-specific OpenAI-compatible fields on summarization calls.',
# --- Security ---
'security.tirith_fail_open: false makes Hermes block commands when the tirith scanner itself errors out.',
'TIRITH_FAIL_OPEN env var overrides the tirith_fail_open config — a quick toggle without editing config.yaml.',
# --- Sessions & Source Tags ---
'--source tool chats are excluded from hermes sessions list by default — set --source explicitly to see them.',
'Session IDs are timestamp-prefixed (20250305_091523_abcd) so sorting works naturally in ls and jq.',
# --- Misc ---
'API_SERVER_MODEL_NAME customizes the model name on /v1/models — essential for multi-profile Open WebUI setups.',
'Dashboard plugins are served from /dashboard-plugins/<name>/ — drop files into ~/.hermes/dashboard-plugins/.',
]
+44 -6
View File
@@ -56,6 +56,7 @@ CONFIGURABLE_TOOLSETS = [
("file", "📁 File Operations", "read, write, patch, search"),
("code_execution", "⚡ Code Execution", "execute_code"),
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
("image_gen", "🎨 Image Generation", "image_generate"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
@@ -78,7 +79,7 @@ CONFIGURABLE_TOOLSETS = [
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
@@ -298,6 +299,15 @@ TOOL_CATEGORIES = {
{"key": "FIRECRAWL_API_URL", "prompt": "Your Firecrawl instance URL (e.g., http://localhost:3002)"},
],
},
{
"name": "SearXNG",
"badge": "free · self-hosted · search only",
"tag": "Privacy-respecting metasearch engine — search only (pair with any extract provider)",
"web_backend": "searxng",
"env_vars": [
{"key": "SEARXNG_URL", "prompt": "Your SearXNG instance URL (e.g., http://localhost:8080)", "url": "https://searxng.github.io/searxng/"},
],
},
],
},
"image_gen": {
@@ -1822,7 +1832,7 @@ def _reconfigure_tool(config: dict):
cat = TOOL_CATEGORIES.get(ts_key)
reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
if cat or reqs:
if _toolset_has_keys(ts_key, config):
if _toolset_has_keys(ts_key, config) or _toolset_enabled_for_reconfigure(ts_key, config):
configurable.append((ts_key, ts_label))
if not configurable:
@@ -1848,6 +1858,28 @@ def _reconfigure_tool(config: dict):
save_config(config)
def _toolset_enabled_for_reconfigure(ts_key: str, config: dict) -> bool:
"""Return True if a configurable toolset is enabled anywhere.
Reconfigure must include enabled-but-unconfigured categories so users can
finish provider/API-key setup without disabling and re-enabling the toolset.
"""
for platform in PLATFORMS:
if not _toolset_allowed_for_platform(ts_key, platform):
continue
try:
enabled = _get_platform_tools(
config,
platform,
include_default_mcp_servers=False,
)
except Exception:
continue
if ts_key in enabled:
return True
return False
def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
"""Reconfigure a tool category - provider selection + API key update."""
icon = cat.get("icon", "")
@@ -1897,21 +1929,27 @@ def _reconfigure_provider(provider: dict, config: dict):
return
if provider.get("tts_provider"):
config.setdefault("tts", {})["provider"] = provider["tts_provider"]
tts_cfg = config.setdefault("tts", {})
tts_cfg["provider"] = provider["tts_provider"]
tts_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" TTS provider set to: {provider['tts_provider']}")
if "browser_provider" in provider:
bp = provider["browser_provider"]
browser_cfg = config.setdefault("browser", {})
if bp == "local":
config.setdefault("browser", {})["cloud_provider"] = "local"
browser_cfg["cloud_provider"] = "local"
_print_success(" Browser set to local mode")
elif bp:
config.setdefault("browser", {})["cloud_provider"] = bp
browser_cfg["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
browser_cfg["use_gateway"] = bool(managed_feature)
# Set web search backend in config if applicable
if provider.get("web_backend"):
config.setdefault("web", {})["backend"] = provider["web_backend"]
web_cfg = config.setdefault("web", {})
web_cfg["backend"] = provider["web_backend"]
web_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" Web backend set to: {provider['web_backend']}")
if managed_feature and managed_feature not in ("web", "tts", "browser"):
+186
View File
@@ -27,6 +27,192 @@ import sys
import threading
from typing import Any, Callable, Optional
# Modifier aliases mirrored from the TUI parser (``ui-tui/src/lib/platform.ts``)
# ``_MOD_ALIASES`` table — the contract that removes the cross-runtime
# mismatch Copilot flagged in round-9 on #19835.
#
# ``super``/``win``/``windows`` are intentionally absent: prompt_toolkit
# has no super/meta modifier for the Cmd key, so those spellings are
# TUI-only. The normalizer below returns the documented default
# (``c-b``) for them — a silent fallback was preferred to a hard
# startup crash (Copilot round-11). The CLI binding site
# (``_register_voice_handler`` in cli.py) logs a warning when that
# fallback fires so users see why their TUI-only shortcut isn't
# bound in the classic CLI.
_VOICE_MOD_ALIASES = {
"ctrl": "c-",
"control": "c-",
"alt": "a-",
"option": "a-",
"opt": "a-",
}
# Named keys prompt_toolkit accepts in ``c-<name>`` / ``a-<name>`` form.
# Aliases collapse to prompt_toolkit's canonical spelling so the same
# config value binds identically in both runtimes (Copilot round-10 on
# #19835).
_VOICE_NAMED_KEYS = {
"space": "space",
"spc": "space",
"enter": "enter",
"return": "enter",
"ret": "enter",
"tab": "tab",
"escape": "escape",
"esc": "escape",
"backspace": "backspace",
"bs": "backspace",
"delete": "delete",
"del": "delete",
}
# ``useInputHandlers()`` intercepts these before the voice check runs,
# so a binding like ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), or
# ``ctrl+l`` (clear screen) would be advertised in /voice status but
# never fire push-to-talk — the same blocklist the TUI parser uses.
_VOICE_RESERVED_CTRL_CHARS = frozenset({"c", "d", "l"})
# On macOS the classic CLI's prompt_toolkit bindings for copy / exit /
# clear also claim ``a-c`` / ``a-d`` / ``a-l`` via the action-modifier
# lookup, and hermes-ink reports Alt as ``key.meta`` on many terminals.
# Mirror the TUI parser's darwin-only reservation so ``option+c`` etc.
# don't bind Alt+C in the CLI while the TUI silently falls back to
# Ctrl+B (Copilot round-14 on #19835).
_VOICE_RESERVED_ALT_CHARS_MAC = frozenset({"c", "d", "l"})
_DEFAULT_PT_KEY = "c-b"
def voice_record_key_from_config(cfg: Any) -> Any:
"""Shape-safe ``cfg.voice.record_key`` lookup.
``load_config()`` deep-merges raw YAML and preserves scalar
overrides, so a hand-edited ``voice: true`` / ``voice: cmd+b``
leaves ``cfg["voice"]`` as a bool/str instead of a dict, and the
naive ``.get("voice", {}).get("record_key")`` chain raises
AttributeError before voice can even start (Copilot round-11 on
#19835). Return ``None`` for malformed shapes so call sites can
feed the result straight into the normalizer/formatter and get
the documented default.
"""
if not isinstance(cfg, dict):
return None
voice = cfg.get("voice")
if not isinstance(voice, dict):
return None
return voice.get("record_key")
def normalize_voice_record_key_for_prompt_toolkit(raw: Any) -> str:
"""Coerce ``voice.record_key`` into prompt_toolkit's ``c-x`` / ``a-x`` format.
Mirrors the TUI parser contract (``ui-tui/src/lib/platform.ts``)
so one config value binds the same shortcut in both runtimes:
* non-string / empty / typo'd / bare-char / multi-modifier / reserved
``ctrl+c|d|l`` documented default ``c-b``
* single-char keys: ``ctrl+o`` ``c-o``
* named keys: ``ctrl+space`` ``c-space`` (aliases collapse:
``ctrl+return`` ``c-enter``)
* ``super`` / ``win`` / ``windows`` ``c-b`` (TUI-only modifiers
prompt_toolkit has no super mod; the CLI binding site is
expected to warn when this fallback fires so users see the
cross-runtime split, Copilot round-11 on #19835)
"""
if not isinstance(raw, str):
return _DEFAULT_PT_KEY
lowered = raw.strip().lower()
if not lowered:
return _DEFAULT_PT_KEY
parts = [p.strip() for p in lowered.split("+") if p.strip()]
if not parts:
return _DEFAULT_PT_KEY
# Multi-modifier chords like ``ctrl+alt+r`` bind different shortcuts
# in prompt_toolkit (a-c-r form) and hermes-ink rejects them; collapse
# to the documented default instead of silently diverging.
if len(parts) > 2:
return _DEFAULT_PT_KEY
# Bare char / bare named key (no explicit modifier) — the CLI's
# prompt_toolkit binds the raw key without a modifier, which the TUI
# parser refuses; reject here too so both runtimes agree.
if len(parts) == 1:
return _DEFAULT_PT_KEY
modifier_token, key_token = parts
# ``super`` / ``win`` / ``windows`` are TUI-only (prompt_toolkit has
# no super modifier, so ``@kb.add(super+b)`` crashes the CLI at
# startup). Fall back to the documented default here; the CLI
# binding site is expected to log a warning when the configured
# value is one of these spellings so users know the TUI+CLI
# runtimes diverge on that shortcut (Copilot round-11 on #19835).
if modifier_token in {"super", "win", "windows"}:
return _DEFAULT_PT_KEY
normalized_mod = _VOICE_MOD_ALIASES.get(modifier_token)
if not normalized_mod:
return _DEFAULT_PT_KEY
# Single-char key: reject reserved-ctrl chords that the TUI would
# also block at parse time, plus the mac-only alt reservation.
if len(key_token) == 1:
if normalized_mod == "c-" and key_token in _VOICE_RESERVED_CTRL_CHARS:
return _DEFAULT_PT_KEY
if (
normalized_mod == "a-"
and sys.platform == "darwin"
and key_token in _VOICE_RESERVED_ALT_CHARS_MAC
):
return _DEFAULT_PT_KEY
return f"{normalized_mod}{key_token}"
# Multi-char key token must be a known named key; typos like
# ``ctrl+spcae`` fall back to the default rather than being passed
# through as ``c-spcae`` (which prompt_toolkit would reject).
named = _VOICE_NAMED_KEYS.get(key_token)
if not named:
return _DEFAULT_PT_KEY
return f"{normalized_mod}{named}"
def format_voice_record_key_for_status(raw: Any) -> str:
"""Render ``voice.record_key`` for ``/voice status`` in CLI-friendly form.
Mirrors the TUI's ``formatVoiceRecordKey``: returns ``Ctrl+B`` /
``Alt+Space`` / ``Ctrl+Enter``. Malformed configs surface as the
documented default so status never advertises a shortcut that
won't bind (Copilot round-10 on #19835).
"""
normalized = normalize_voice_record_key_for_prompt_toolkit(raw)
if normalized.startswith("c-"):
prefix, key = "Ctrl+", normalized[2:]
elif normalized.startswith("a-"):
prefix, key = "Alt+", normalized[2:]
elif "+" in normalized:
# ``super+<key>`` / ``win+<key>`` — CLI won't bind them, but
# render in title case so status output is still readable.
mod, key = normalized.split("+", 1)
prefix = mod[0].upper() + mod[1:] + "+"
else:
return "Ctrl+B"
if not key:
return prefix.rstrip("+")
if len(key) == 1:
return prefix + key.upper()
return prefix + key[0].upper() + key[1:]
from tools.voice_mode import (
create_audio_recorder,
is_whisper_hallucination,

Some files were not shown because too many files have changed in this diff Show More