Compare commits

...

208 Commits

Author SHA1 Message Date
kshitijk4poor 739b30bc02 fix: follow-up fixes for TinyFish browser provider salvage
- Remove ENV_VARS_BY_VERSION[23] entry: adding optional env vars
  does not require a config version bump (deep-merge handles it)
- Replace change-detector test (assert _config_version == 23) with
  invariant test (assert positive int)
- Add TinyFish case to setup.py missing_browser_hint
- Add TINYFISH_BROWSER_TIMEOUT to set_config_value allowed keys
- Add contributor simantak-dabhade to AUTHOR_MAP
2026-05-03 14:47:45 +05:30
Simantak Dabhade f41ebf7785 feat(tools): add TinyFish cloud browser provider
Adds TinyFish (tinyfish.ai) as a cloud browser provider alongside
Browserbase, Browser Use, and Firecrawl. Sessions are created via a
simple POST that returns a CDP websocket URL.

- tools/browser_providers/tinyfish.py — TinyFishBrowserProvider
- tools/browser_tool.py — register in _PROVIDER_REGISTRY
- hermes_cli/tools_config.py — add to onboarding provider picker
- hermes_cli/config.py — TINYFISH_API_KEY env var entries
- hermes_cli/nous_subscription.py — browser label + feature state
- website/docs — document env vars and setup

Based on PR #6329 by @simantak-dabhade.
2026-05-03 14:46:10 +05:30
kshitij 457c7b76cd feat(openrouter): add response caching support (#19132)
Enable OpenRouter's response caching feature (beta) via X-OpenRouter-Cache
headers. When enabled, identical API requests return cached responses for
free (zero billing), reducing both latency and cost.

Configuration via config.yaml:
  openrouter:
    response_cache: true       # default: on
    response_cache_ttl: 300    # 1-86400 seconds

Changes:
- Add openrouter config section to DEFAULT_CONFIG (response_cache + TTL)
- Add build_or_headers() in auxiliary_client.py that builds attribution
  headers plus optional cache headers based on config
- Replace inline _OR_HEADERS dicts with build_or_headers() at all 5 sites:
  run_agent.py __init__, _apply_client_headers_for_base_url(), and
  auxiliary_client.py _try_openrouter() + _to_async_client()
- Add _check_openrouter_cache_status() method to AIAgent that reads
  X-OpenRouter-Cache-Status from streaming response headers and logs
  HIT/MISS status
- Document in cli-config.yaml.example
- Add 28 tests (22 unit + 6 integration)

Ref: https://openrouter.ai/docs/guides/features/response-caching
2026-05-03 01:54:24 -07:00
Teknium 9b5b88b5e0 chore: add MottledShadow to AUTHOR_MAP 2026-05-03 01:51:33 -07:00
MottledShadow a22465e07a fix(weixin): send_weixin_direct cross-loop session check
When send_message tool is called from inside a running gateway, the
_run_async bridge spawns a worker thread with a separate event loop.
send_weixin_direct then reuses the live adapter's aiohttp session
which was created on the gateway's main loop.  aiohttp's TimerContext
checks asyncio.current_task(loop=session._loop) and sees None because
we're executing on the worker thread's loop → raises 'Timeout context
manager should be used inside a task'.

Fix: skip the live-adapter shortcut when the session belongs to a
different event loop, falling through to the fresh-session path.
2026-05-03 01:51:33 -07:00
Henkey 9987f3d824 fix(acp): compact Zed tool replay rendering 2026-05-03 01:44:23 -07:00
Henkey 19854c7cd2 Schedule ACP history replay and fence file output 2026-05-03 01:44:23 -07:00
Henkey eb612f5574 fix(acp): keep web extract rendering compact 2026-05-03 01:44:23 -07:00
Henkey b294d1d022 fix(acp): keep read-file starts compact 2026-05-03 01:44:23 -07:00
Henkey 72c8037a24 fix(acp): polish common tool rendering 2026-05-03 01:44:23 -07:00
Henkey ef9a08a872 fix(acp): polish Zed context and tool rendering 2026-05-03 01:44:23 -07:00
Henkey e26f9b2070 fix(acp): route Zed thoughts to reasoning callbacks 2026-05-03 01:44:23 -07:00
helix4u 4f37669170 fix(tools): reconfigure enabled unconfigured toolsets 2026-05-03 00:33:02 -07:00
helix4u d409a4409c fix(model): avoid bedrock credential probe in provider picker 2026-05-03 00:32:55 -07:00
Siddharth Balyan 5d3be898a8 docs(tts): mention xAI custom voice support (#18776)
Point users to xAI's custom voices feature — clone your voice in the
console, paste the voice_id into tts.xai.voice_id. No code changes
needed; the existing TTS pipeline already handles arbitrary voice IDs.

- config.py: link to xAI custom voices docs in voice_id comment
- setup.py: prompt accepts custom voice IDs during xAI TTS setup
- tts.md: short section linking to xAI console and docs
2026-05-02 16:08:01 +05:30
liuhao1024 af98122793 fix(auxiliary): propagate explicit_api_key to _try_openrouter()
When resolve_provider_client() passes explicit_api_key for OpenRouter auxiliary
tasks, _try_openrouter() now accepts and honors this parameter instead of
silently ignoring it and falling back to OPENROUTER_API_KEY env var.

Root cause: _try_openrouter() had no explicit_api_key parameter, so even
when callers wanted to pass a runtime credential pool key, it could not be used.

Fix:
- Add explicit_api_key: str = None parameter to _try_openrouter()
- Prioritize explicit_api_key over pool key and env var
- Update resolve_provider_client() call site to pass explicit_api_key

Regression coverage:
- Test that explicit_api_key is passed to OpenAI client when provided
- Test that fallback to OPENROUTER_API_KEY still works when explicit_api_key is None

Closes #18338
2026-05-02 02:27:49 -07:00
teknium1 73bcd83dba chore(release): map beibi9966 email for AUTHOR_MAP
Follow-up for PR #18502 salvage.
2026-05-02 02:23:37 -07:00
teknium1 762eb79f1e fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451)
Two mitigations for the CLOSE_WAIT accumulation reported against QQ Bot
+ Feishu on macOS behind Cloudflare Warp.

1. Shared httpx.Limits helper (gateway/platforms/_http_client_limits.py).
   Every long-lived platform adapter now constructs httpx.AsyncClient
   with max_keepalive_connections=10 and keepalive_expiry=2.0, vs httpx's
   default of unbounded keepalive pool and 5.0s expiry. On macOS/Warp the
   default 5s window let idle keepalive sockets sit in CLOSE_WAIT long
   enough for seven persistent adapters (QQ Bot, WeCom, DingTalk, Signal,
   BlueBubbles, WeCom-callback, plus the transient Feishu helper) to
   compound to the 256-fd ulimit. Tunable via
   HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY and
   HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE env vars.

2. whatsapp.send_typing aiohttp leak. The call was
   'await self._http_session.post(...)' with no 'async with' and no
   variable capture — the ClientResponse went out of scope unclosed,
   holding its TCP socket in CLOSE_WAIT until GC. Fixed by wrapping in
   'async with'. This was the only bare-await aiohttp leak in the
   gateway/tools/plugins tree per audit; all other aiohttp sites use
   the context-manager pattern correctly.

The underlying reporter also saw Feishu SDK (lark-oapi) connections in
CLOSE_WAIT — those are inside the SDK and out of our direct control, but
tightening httpx keepalive across adapters reduces the aggregate pool
pressure regardless of which individual adapter leaks.
2026-05-02 02:23:37 -07:00
beibi9966 38dd057e91 fix(feishu): finalize remote document downloads inside httpx.AsyncClient context (#18502)
Snapshot Content-Type and body while the client context is still
active so pooled connections fully release on exit. Previously the
read happened after `async with httpx.AsyncClient(...)` returned —
which works today only because httpx eagerly buffers non-streaming
responses; a future refactor to `.stream()` would silently read-
after-close.

Part of the #18451 connection-hygiene audit. Salvage of #18502.
2026-05-02 02:23:37 -07:00
Teknium e444d8f29c fix(gateway): config.yaml wins over .env for agent/display/timezone settings (#18764)
Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.
2026-05-02 02:14:35 -07:00
luyao618 13f344c5ce fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929)
When a provider's credential pool has a single entry in 429-cooldown,
resolve_provider_client returns None and AIAgent.__init__ raises a
misleading RuntimeError suggesting the API key is missing — even when
valid fallback_providers are configured.

This patch makes __init__ iterate the fallback chain before raising,
mirroring the existing in-flight fallback logic in the request loop.
If a fallback resolves, the agent initializes against it and sets
_fallback_activated=True so _restore_primary_runtime can pick the
primary back up after cooldown.

Closes #17929
2026-05-02 02:09:46 -07:00
Teknium 1dce908930 fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761)
* fix(gateway): config.yaml wins over .env for agent/display/timezone settings

Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.

* fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log)

Three issues observed in production gateway.log during a rapid restart
chain on 2026-05-02, all fixed here.

1. _send_restart_notification logged unconditional success
   adapter.send() catches provider errors (e.g. Telegram 'Chat not found')
   and returns SendResult(success=False); it never raises. The caller
   ignored the return value and always logged 'Sent restart notification
   to <chat>' at INFO, producing a misleading success line directly
   below the 'Failed to send Telegram message' traceback on every boot.
   Now inspects result.success and logs WARNING with the error otherwise.

2. WhatsApp bridge SIGTERM on shutdown classified as fatal error
   _check_managed_bridge_exit() saw the bridge's returncode -15 (our own
   SIGTERM from disconnect()) and fired the full fatal-error path,
   producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus
   'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every
   planned shutdown, immediately before the normal '✓ whatsapp
   disconnected'. Adds a _shutting_down flag that disconnect() sets
   before the terminate, and _check_managed_bridge_exit() returns None
   for returncode in {0, -2, -15} while shutting down. OOM-kill (137)
   and other non-signal exits still hit the fatal path.

3. restart_drain_timeout default 60s → 180s
   On 2026-05-02 01:43:27 a user /restart fired while three agents were
   mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget
   expired and all three were force-interrupted. 180s covers realistic
   in-flight agent turns; users on very-long-reasoning models can still
   raise it further via agent.restart_drain_timeout in config.yaml.
   Existing explicit user values are preserved by deep-merge.

Tests
- tests/gateway/test_restart_notification.py: two new tests assert INFO
  is only logged on SendResult(success=True) and WARNING with the error
  string is logged on SendResult(success=False).
- tests/gateway/test_whatsapp_connect.py: parametrized test for
  returncode in {0, -2, -15} proves shutdown-time exits are suppressed;
  separate test proves returncode 137 (SIGKILL/OOM) still surfaces as
  fatal even when _shutting_down is set.
- _check_managed_bridge_exit() reads _shutting_down via getattr-with-
  default so existing _make_adapter() test helpers that bypass __init__
  (pitfall #17 in AGENTS.md) keep working unmodified.
2026-05-02 02:08:06 -07:00
teknium1 50f9f389ec chore(release): map ambition0802 email for AUTHOR_MAP
Follow-up for PR #17939 salvage.
2026-05-02 02:07:14 -07:00
ambition0802 7696ddc59e fix(cli): robust paste file expansion and process_loop error handling (#17666)
Two narrow fixes for long pasted messages silently disappearing:

1. _expand_paste_references: replace path.exists() + read_text() with
   try/except (OSError, IOError). Closes the TOCTOU window where a paste
   file deleted between check and read raised FileNotFoundError, bubbled
   up through process_loop's outer except, and silently dropped the
   user's input. Failures now return the placeholder text and log a
   warning.

2. process_loop outer except: logger.warning() instead of print().
   prompt_toolkit's TUI swallows stdout, so 'Error: …' was invisible
   to the user. Logged errors are discoverable via hermes logs.

Dropped the larger interrupt_queue→pending_input drain that was part of
the original PR — that's a separate class of input-drop (in-progress
interrupt handling) unrelated to the paste-file TOCTOU reported in the
issue, and worth its own review.

Salvage of #17939.
2026-05-02 02:07:14 -07:00
Teknium 5eac6084bc fix(discord): warn on 32-char clamp collisions in the /skill collector (#18759)
Discord's per-command name limit is 32 chars. When two skill slugs
share the same first 32 chars (or a skill slug clamps onto a reserved
gateway command name), only the first seen wins — the second is
dropped from the /skill autocomplete. The old behavior incremented a
``hidden`` counter silently, so skill authors had no way to discover
the drop short of noticing their skill was missing from the picker.

Not an actively-biting bug today (no collisions on the default catalog
as of 2026-05), but a landmine the moment someone ships a skill with a
long name. The earlier series in #18745 / #18753 / #18754 dropped the
other silent data-loss paths in the Discord /skill collector; this one
lights up the last remaining one.

Fix: promote ``_names_used`` from a set to a dict keyed by the clamped
name, mapping to the source cmd_key (or a ``"<reserved>"`` sentinel
for names inherited via ``reserved_names``). On collision, log a
WARNING naming both sides — the winner, the loser, the clamped name,
and what to rename.

Two phrasings:

* skill-vs-skill — "both clamp to X on Discord's 32-char command-name
  limit; only the winner appears in /skill. Rename one skill's
  frontmatter ``name:`` to differ in its first 32 chars."
* skill-vs-reserved — "collides with a reserved gateway command name;
  the skill will not appear in /skill. Rename the skill's frontmatter
  ``name:``."

Tests: three cases in
``tests/hermes_cli/test_discord_skill_clamp_warning.py`` —
skill-vs-skill collision (warning names both cmd_keys + clamped prefix),
skill-vs-reserved collision (warning uses the distinct phrasing), and a
no-collision negative (zero warnings emitted).
2026-05-02 02:05:01 -07:00
teknium1 e363ced3c3 test(discord): regression coverage for zombie-websocket guard in connect()
Covers PR #18224 fix for issue #18187 — when DiscordAdapter.connect() is
called a second time without an intervening disconnect(), the previous
commands.Bot must be closed before a new one is created. Otherwise both
websockets stay connected to Discord's gateway and both fire on_message,
producing double responses with different wording.
2026-05-02 02:04:14 -07:00
luyao618 292d2fb42f fix(discord): close old client before reconnect to prevent zombie websockets (#18187)
When DiscordAdapter.connect() is called during reconnect, it creates a new
commands.Bot client without closing the previous one. The old client's
websocket remains connected to Discord's gateway, causing both to fire
on_message for every incoming event — resulting in double responses.

Fix: before creating a new Bot instance, check if a previous client exists
and close it. This ensures only one websocket connection is active at any
time.

Closes #18187
2026-05-02 02:04:14 -07:00
teknium1 0a6865b328 test(credential_pool): regression coverage for .env vs os.environ precedence
Covers PR #18256 fix for issue #18254 — when OPENROUTER_API_KEY is set in
BOTH os.environ (stale from parent shell) and ~/.hermes/.env (fresh),
_seed_from_env must prefer the .env value. Also guards the fallback case
where .env omits the key entirely (Docker/K8s/systemd deployments that
only inject via runtime env).
2026-05-02 02:00:32 -07:00
teknium1 9c626ef8ea chore(release): map franksong2702 email for AUTHOR_MAP
Follow-up for PR #18256 salvage.
2026-05-02 02:00:32 -07:00
Frank Song 2ef1ad280b fix: prefer ~/.hermes/.env over os.environ when seeding credential pool
When _seed_from_env() reads API keys to populate the credential pool, it
should treat ~/.hermes/.env as the authoritative source — not os.environ.
Stale env vars inherited from parent shell processes (Codex CLI, test
scripts, etc.) can shadow deliberate changes to the .env file, causing
auth.json to cache an outdated key that leads to silent 401 errors.

This is especially visible with OpenRouter: if a parent process exported
OPENROUTER_API_KEY=test-key-fresh and the user later updates .env with a
valid key, restarting Hermes still picks up the stale os.environ value,
writes it back to auth.json, and all API calls fail with 401.

Fixes #18254
2026-05-02 02:00:32 -07:00
Teknium 10297fa23c fix(discord): /reload-skills now refreshes the /skill autocomplete live (#18754)
`_register_skill_group` captured the skill catalog in closure variables
(`entries` and `skill_lookup`) so the single `tree.add_command` call at
startup owned the only live copy. The closure is never re-entered after
startup, so `/reload-skills` — which rescans the on-disk skills dir and
refreshes the in-process `_skill_commands` registry — had no way to
propagate results into the `/skill` autocomplete on Discord. New skills
stayed invisible in the dropdown, and deleted skills returned
"Unknown skill" when the stale autocomplete entry was clicked.

The fix is purely a dataflow change: promote `entries` and `skill_lookup`
to instance attributes (`_skill_entries`, `_skill_lookup`), split the
collector-driven rebuild into a helper (`_refresh_skill_catalog_state`),
and add a public `refresh_skill_group()` method that re-runs the helper
and is safe to call at any point after the initial registration.

The gateway's `_handle_reload_skills_command` then iterates
`self.adapters` and calls `refresh_skill_group()` on any adapter that
exposes it (currently only Discord). Both sync and async implementations
are supported; adapters that don't override the method (Telegram's
BotCommand menu, Slack subcommand map, etc.) are silently skipped — the
in-process `reload_skills()` call covers them.

No `tree.sync()` is required because Discord fetches autocomplete
options dynamically on every keystroke — mutating the instance state the
callbacks already read from is sufficient. That sidesteps the per-app
command-bucket rate limit (~5 writes / 20 s) that made the previous
bulk-sync-on-reload approach unusable (#16713 context).

Tests: tests/gateway/test_reload_skills_discord_resync.py — five cases
covering (1) refresh replaces entries, (2) entries stay sorted after
refresh, (3) collector exception leaves cached state intact, (4)
`_refresh_skill_catalog_state` populates the instance attrs, (5)
orchestrator calls `refresh_skill_group()` on sync + async adapters and
skips adapters that don't expose it.
2026-05-02 02:00:11 -07:00
Teknium 6ec74aec07 fix(gateway): match disabled/optional skills by frontmatter slug, not dir name (#18753)
_check_unavailable_skill is meant to turn a typed "/foo" command that
doesn't resolve into a specific hint — "disabled, enable with hermes
skills config" or "available but not installed, install with hermes
skills install …" — instead of the generic "unknown command" reply.

It was doing the match with `skill_md.parent.name.lower().replace("_", "-")`,
comparing that to the typed command. For every skill whose directory name
drifted from its declared frontmatter `name:`, that comparison failed and
the user got the unhelpful generic path. On a standard install today 19
skills have this drift, e.g.:

  dir: mlops/stable-diffusion
  frontmatter: name: Stable Diffusion Image Generation
  registered slug (what the user types): /stable-diffusion-image-generation

  dir: mlops/qdrant
  frontmatter: name: Qdrant Vector Search
  registered slug: /qdrant-vector-search

  dir: mlops/flash-attention
  frontmatter: name: Optimizing Attention Flash
  registered slug: /optimizing-attention-flash

In every case, _check_unavailable_skill would fall through because
"stable-diffusion" != "stable-diffusion-image-generation", even with the
skill sitting right there on disk.

Fix: extract a small `_skill_slug_from_frontmatter` helper that reads the
SKILL.md frontmatter and normalizes exactly like scan_skill_commands
(lower, spaces/underscores → hyphens, strip non-[a-z0-9-], collapse
runs of hyphens, strip edges). Use it in both the
disabled-skills branch and the optional-skills branch. The disabled-set
membership check now uses the declared frontmatter name (which is what
`hermes skills config` writes into skills.disabled / platform_disabled),
not the slug.

Tests: five cases in tests/gateway/test_unavailable_skill_hint.py —
the drift case for the disabled branch, unknown-command negative,
matched-but-not-disabled negative, non-alnum stripping, and the drift
case for the optional-skills branch. All five fail against main and
pass with the fix.
2026-05-02 02:00:09 -07:00
Teknium 8825e9044c fix(discord): complete #18741 for /skill autocomplete and drop legacy 25x25 caps (#18745)
``discord_skill_commands_by_category`` was lagging the flat
``discord_skill_commands`` collector on two counts. Both were actively
dropping skills from Discord's ``/skill`` autocomplete dropdown.

1. External-dir skills were filtered out. #18741 widened the flat
   collector to accept ``SKILLS_DIR + skills.external_dirs`` but left
   this sibling collector — the one ``_register_skill_group`` actually
   uses on Discord — still matching ``SKILLS_DIR`` only. External
   skills were visible in ``hermes skills list`` and the agent's
   ``/skill-name`` dispatch but silently absent from Discord's
   ``/skill`` picker. Widen the accepted roots to match, and derive
   categories from whichever root the skill lives under so
   ``<ext>/mlops/foo/SKILL.md`` still lands in the ``mlops`` group.

2. 25-group × 25-subcommand caps were still applied. PR #11580
   refactored ``/skill`` to a flat autocomplete (whose options Discord
   fetches dynamically — no per-command payload concern) and its
   docstring promises "no hidden skills." The collector kept the old
   nested-layout caps anyway, silently dropping anything past the 25th
   alphabetical category. On installs with 29 category dirs today (real
   example: tail categories ``social-media``, ``software-development``,
   ``yuanbao`` going missing) this was biting immediately. Remove the
   caps; ``hidden`` now reports only 32-char name-clamp collisions
   against reserved names.

Tests: guard both behaviors. ``test_no_legacy_25x25_cap`` builds 30
categories × 30 skills each and asserts all 900 are returned.
``test_external_dirs_skills_included`` monkeypatches
``get_external_skills_dirs`` and asserts an external-dir skill makes
it into the result grouped under its own top-level directory.
2026-05-02 02:00:06 -07:00
Jacob Lizarraga 2470434d60 fix(telegram): probe polling liveness after reconnect to detect wedged Updater
After a transient Telegram 502, _handle_polling_network_error's
stop()+start_polling() cycle can leave PTB's Updater with `running=True`
but a wedged consumer task that never makes progress. No error_callback
fires in that state, so the reconnect ladder never advances past attempt
1, the MAX_NETWORK_RETRIES fatal-error path is never reached, and the
gateway sits silent indefinitely.

Schedule a heartbeat probe (60s after a successful reconnect) that
verifies Updater.running is still True and bot.get_me() responds within
a tight asyncio.wait_for timeout. Either failure feeds back into the
reconnect ladder so the existing escalation path fires.

No PTB-internal coupling, no Application rebuild — minimal additive
defense inside the existing reconnect abstraction.

Tests cover healthy / Updater non-running / probe timeout / probe
network error / already-fatal cases, plus an integration check that the
probe is actually scheduled after a successful start_polling().

Closes the silent-wedge case observed in the wild after a transient
Telegram 502; existing reconnect tests updated to mock bot.get_me() now
that the success path schedules a heartbeat probe.
2026-05-02 01:55:04 -07:00
liuhao1024 9bf260472b fix(tools): deduplicate tool names at API boundary for Vertex/Azure/Bedrock
Providers like Google Vertex, Azure, and Amazon Bedrock reject API
requests with duplicate tool names (HTTP 400: 'Tool names must be
unique').  The upstream injection paths in run_agent.py already dedup
after PR #17335, but two API-boundary functions pass tools through
without checking:

- agent/auxiliary_client.py: _build_call_kwargs() (all non-Anthropic
  providers in chat_completions mode)
- agent/anthropic_adapter.py: convert_tools_to_anthropic() (Anthropic
  Messages API path)

Add defensive dedup guards at both sites.  Duplicates are dropped with
a warning log, converting a hard 400 failure into a recoverable
condition.  This is intentionally conservative — the root-cause dedup
in run_agent.py is the primary defense; these guards add resilience
against future injection-path regressions.

Includes 8 new tests covering unique passthrough, duplicate removal,
empty/None edge cases.

Closes #18478
2026-05-02 01:51:51 -07:00
Teknium 699b3679bc fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746)
When HERMES_HOME is unset but ~/.hermes/active_profile names a non-default
profile, any data this process writes lands in the default profile — not the
one the operator expects. Before this change the fallback was silent, so
cross-profile contamination (#18594) was invisible until a user noticed
their memory/state ended up in the wrong place.

Now we emit a one-shot warning to stderr the first time this happens in
a process. No raise — there are 30+ module-level callers of get_hermes_home()
and raising from any of them would brick import. Behavior is otherwise
unchanged; subprocess spawners (systemd template, kanban dispatcher, docker
entrypoint) already propagate HERMES_HOME correctly.

Bypasses logging.getLogger() because this runs before logging is configured
in a significant fraction of callers (module import time).

Refs #18594. Credit to @liuhao1024 for surfacing the silent-fallback case
in PR #18600; we kept the diagnostic signal without the import-time raise.
2026-05-02 01:49:55 -07:00
teknium1 98c98821ff chore(release): map CoreyNoDream email for AUTHOR_MAP
Follow-up for PR #18721 salvage.
2026-05-02 01:40:31 -07:00
CoreyNoDream c5e3a6fb5b fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows
Path.read_text() uses the system locale by default. On Windows CN/JP/KR
locales (GBK/CP932/CP949), reading a UTF-8 .env raises UnicodeDecodeError
as soon as it contains any non-ASCII byte (e.g. an em dash).

Pin encoding="utf-8" on every .env read in hermes_cli to match how the
rest of the codebase (load_dotenv at doctor.py:26) already decodes it.

Adds a regression test that monkeypatches Path.read_text to simulate a
GBK locale and asserts 'hermes doctor' no longer raises.

Refs #18637
2026-05-02 01:40:31 -07:00
Teknium e2cea6eeba fix(gateway): include external_dirs skills in Telegram/Discord slash commands (#18741)
Skills configured through `skills.external_dirs` in config.yaml were
visible via `hermes skills list`, `get_skill_commands()`, and the
agent's `/skill-name` dispatch, but silently excluded from the
Telegram and Discord slash-command menus. The filter in
`_collect_gateway_skill_entries` only accepted skills whose
`skill_md_path` started with `SKILLS_DIR`, so anything under an
external directory fell through.

Widen the accepted-prefix set to include all configured external
dirs alongside the local skills dir. Every prefix is now
slash-terminated so `/my-skills` cannot also admit
`/my-skills-extra`. Also guard against empty `skill_md_path`
values so they can't accidentally match.

Fixes #8110

Salvages #8790 by luyao618.

Co-authored-by: Yao <34041715+luyao618@users.noreply.github.com>
2026-05-02 01:36:57 -07:00
Teknium c73594fe41 fix(skills): rescan skill_commands cache when platform scope changes (#18739)
The process-global `_skill_commands` dict in agent/skill_commands.py
was seeded by whichever platform scanned first, and
`get_skill_commands()` only rescanned when the cache was empty. In a
long-lived gateway process serving multiple platforms (Telegram +
Discord + Slack), the first platform's
`skills.platform_disabled` view was silently inherited by the
others — so a skill disabled for Telegram would also disappear from
Discord's slash menu, and vice versa.

Track the platform scope the cache was populated for
(`_skill_commands_platform`) and rescan in `get_skill_commands()`
when the currently-active platform no longer matches. Platform
resolution uses the same precedence as `_is_skill_disabled`:
`HERMES_PLATFORM` env var then `HERMES_SESSION_PLATFORM` from the
gateway session context.

Fixes #14536

Salvages #14570 by LeonSGP43.

Co-authored-by: LeonSGP <leon@sgp43.com>
2026-05-02 01:36:53 -07:00
Teknium 97acd66b4c fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)
* fix(curator): authoritative absorbed_into declarations on skill delete

Closes #18671. The classification pipeline that feeds cron-ref rewriting
used to infer consolidation vs pruning from two brittle signals: the
curator model's post-hoc YAML summary block, and a substring heuristic
scanning other tool calls for the removed skill's name. Both miss in
real consolidations — the model forgets the YAML under reasoning
pressure, and the heuristic misses when the umbrella's patch content
describes the absorbed behavior abstractly instead of naming the old
slug. When both miss, the skill falls through to 'no-evidence fallback'
pruned, and #18253's cron rewriter drops the cron ref entirely instead
of mapping it to the umbrella. Same observable symptom as pre-#18253:
'Skill(s) not found and skipped' at the next cron run.

The fix makes the model declare intent at the moment of deletion.
skill_manage(action='delete') now accepts absorbed_into:
  - absorbed_into='<umbrella>'  -> consolidated, target must exist on disk
  - absorbed_into=''            -> explicit prune, no forwarding target
  - missing                     -> legacy path, falls through to heuristic/YAML

The curator reconciler reads these declarations off llm_meta.tool_calls
BEFORE either the YAML block or the substring heuristic. Declaration
wins. Fallback logic stays intact for backward compat with any caller
(human or older curator conversation) that doesn't populate the arg.

Changes
- tools/skill_manager_tool.py: add absorbed_into param to skill_manage
  + _delete_skill. Validate target exists when non-empty. Reject
  absorbed_into=<self>. Wire through dispatcher + registry + schema.
- agent/curator.py: new _extract_absorbed_into_declarations() walks
  tool calls for skill_manage(delete) with the arg. _reconcile_classification
  accepts absorbed_declarations= and treats them as authoritative. Curator
  prompt updated to require the arg on every delete.
- Tests: 7 new skill_manager tests covering the tool contract (valid
  target, empty string, nonexistent target, self-reference, whitespace,
  backward compat, dispatcher plumbing). 11 new curator tests covering
  the extractor + authoritative reconciler path + mixed-legacy-and-
  declared runs.

Validation
- 307/307 targeted tests pass (curator + cron + skill_manager suites).
- E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing
  all 3. Model emits NO YAML block. Heuristic misses (patch prose
  doesn't name old slugs). Delete calls carry absorbed_into. Result:
  both PR skills correctly classified 'consolidated' + cron rewritten
  ['pr-review-format', 'pr-review-checklist', 'stale-junk'] ->
  ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''.
- E2E backward-compat: delete without absorbed_into, model emits YAML
  -> routed via existing 'model' source, cron still rewritten correctly.

* feat(curator): capture + restore cron skill links across snapshot/rollback

Before this, rolling back a curator run restored the skills tree but cron
jobs still pointed at the umbrella skills the curator had rewritten them
to. The user would see their old narrow skills back on disk but their
cron jobs still configured with the merged umbrella — not actually 'back
to how it was'.

Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json
alongside the skills tarball, as cron-jobs.json. The manifest gets a new
'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI
confirm dialog) can surface what's in the snapshot. If jobs.json is
missing/unreadable/malformed, snapshot proceeds without cron data — the
skills backup is the core guarantee; cron is additive.

Rollback side: after the skills extract succeeds, the new
_restore_cron_skill_links() reconciles the backed-up jobs into the live
jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and
only on jobs matched by id. Everything else about a cron job — schedule,
last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live
state the user or scheduler has modified since the snapshot; overwriting
it would regress unrelated activity.

Reconciliation rules:
- Job in backup AND live, skills differ  → skills restored.
- Job in backup AND live, skills match   → no-op.
- Job in backup, NOT in live             → skipped (user deleted it
                                              after snapshot; their choice
                                              is later than the snapshot).
- Job in live, NOT in backup             → untouched (user created it
                                              after snapshot).
- Snapshot missing cron-jobs.json at all → rollback still succeeds,
                                              reports 'not captured'
                                              (older pre-feature snapshots
                                              keep working).

Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the
scheduler uses, so rollback doesn't race tick().

Also:
- hermes_cli/curator.py: rollback confirm dialog now shows
  'cron jobs: N (will be restored for skill-link fields only)' when the
  snapshot has cron data, or 'not in snapshot (<reason>)' otherwise.
- rollback()'s message string includes a 'cron links: ...' clause
  summarizing the reconciliation outcome.

Tests
- 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json
  captured-as-raw, full rollback-restores-skills-and-cron, rollback
  touches only skill fields, rollback skips user-deleted jobs, rollback
  leaves user-created jobs untouched, rollback still works with
  pre-feature snapshot that has no cron-jobs.json, standalone unit test
  on _restore_cron_skill_links exercising the full report shape.

Validation
- 484/484 targeted tests pass (curator + cron + skill_manager suites).
- E2E: real snapshot_skills, real cron rewrite, real rollback. Before:
  ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage'].
  After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format',
  'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id,
  name, prompt) preserved across the round trip.
2026-05-02 01:29:57 -07:00
Siddharth Balyan f98b5d00a4 fix: gateway systemd unit now retries indefinitely with backoff (#18639)
The old defaults (StartLimitIntervalSec=600, StartLimitBurst=5,
RestartSec=30) meant any network outage over ~5 minutes would
permanently kill the gateway until manual intervention.

Changes:
- StartLimitIntervalSec=0 (never give up)
- Restart=always (not just on-failure)
- RestartSec=60 with RestartMaxDelaySec=300, RestartSteps=5
  (exponential backoff: 60 → 120 → 180 → 240 → 300s cap)
- After=network-online.target + Wants= (both units now wait for
  actual connectivity, not just network.target)

Power outage → internet down → internet back = auto-recovery.
2026-05-02 08:51:30 +05:30
Siddharth Balyan 585d6778da fix: allow WebSocket connections from non-loopback IPs in --insecure mode (#18633)
When the dashboard is bound to 0.0.0.0 with --insecure (e.g. behind
Tailscale Serve), WebSocket endpoints (/api/pty, /api/ws, /api/pub,
/api/events) rejected connections from non-loopback client IPs with
code 4403 — causing 'events feed disconnected' in the UI.

Extract the repeated loopback check into _ws_client_is_allowed() which
respects the public bind flag. Session token auth still guards all
endpoints regardless of bind mode.
2026-05-02 08:17:45 +05:30
kshitijk4poor f903ceece0 chore: add contributors to AUTHOR_MAP for Slack batch salvage
Adds email→username mappings for:
- priveperfumes (PR #18456)
- amroessam (PR #17798)
- Hinotoi-agent (PR #9361)
- valda (PR #14932)
2026-05-01 14:01:26 -07:00
Amr Essam d05a87e686 fix(gateway): clear slack assistant thread status 2026-05-01 14:01:26 -07:00
hinotoi-agent a147164d3c fix(slack): preserve per-user slash-command session isolation 2026-05-01 14:01:26 -07:00
nightq 5cdc39e29a fix(gateway): preserve case-sensitive chat IDs in DeliveryTarget.parse
Fixes NousResearch/hermes-agent#11768

Root cause: target.strip().lower() was lowercasing the entire target string,
corrupting case-sensitive chat IDs like Slack C123ABC and Matrix !RoomABC.

Fix: Only lowercase the platform prefix for case-insensitive matching;
preserve the original case for chat_id and thread_id values.
2026-05-01 14:01:26 -07:00
YAMAGUCHI Seiji 2b3923ff13 fix(gateway): coerce scalar free_response_channels to str before split
YAML loads a bare numeric value such as
    discord:
      free_response_channels: 1491973769726791812
as an int.  _discord_free_response_channels() / _slack_free_response_channels()
checked `isinstance(raw, list)` and `isinstance(raw, str)` in that order and
then fell through to `return set()`, so a single-channel config that happened
to be unquoted was silently dropped with no log line — the bot kept demanding
@mentions even though the channel was configured to free-response.

A multi-channel value like `1234567890,9876543210` does not trip this because
the comma forces YAML to parse it as a string.  Single-channel configs are
the only case that breaks, which is exactly the footgun that's hardest to
diagnose (the config "looks right" and the feature just doesn't activate).

Note that the old-schema env-var bridge at gateway/config.py:614+ already
runs `str(frc)` when forwarding to SLACK_/DISCORD_FREE_RESPONSE_CHANNELS,
so the env-var fallback worked.  The bug only surfaces on the
`config.extra["free_response_channels"]` path populated by the `platforms:`
bridge at gateway/config.py:576, which passes the raw YAML value through
unchanged.

Fix at the reader: treat any non-list value as a scalar, coerce with str(),
then apply the same CSV split semantics.  This keeps the public contract
stable (list or str-like continues to work identically) while accepting
the ints that the YAML loader is free to hand us.

Added tests for both Discord and Slack covering:
  - bare int value in config.extra
  - list of ints in config.extra
2026-05-01 14:01:26 -07:00
Prive FE Coder a717199bbf fix(slack): exclude reserved Slack commands from native slash manifest
Slack has built-in slash commands (e.g. /status, /me, /join) that apps
cannot register. When running `hermes slack manifest --write`, the
generated manifest included /status, causing Slack to reject the entire
manifest with a reserved-command error.

Add _SLACK_RESERVED_COMMANDS frozenset of all known Slack built-ins and
skip them in slack_native_slashes(). Affected commands remain reachable
via /hermes <command>.

Tests updated:
- New test_excludes_slack_reserved_commands validates no leaks
- test_includes_canonical_commands no longer asserts /status
- test_telegram_parity accounts for expected Slack-only exclusions
2026-05-01 14:01:26 -07:00
kshitijk4poor 8fcc160f6b fix(gateway/slack): review fixes — scope ephemeral to commands, user isolation
Self-review fixes for the slash ephemeral ack:

- Only stash response_url when text starts with '/' (gateway command).
  Free-form questions via '/hermes <question>' must produce public agent
  replies visible to the whole channel, not ephemeral.
- Use a ContextVar (_slash_user_id) to thread the invoking user's ID
  from _handle_slash_command through to send().  _pop_slash_context now
  matches the exact (channel_id, user_id) key when the ContextVar is
  set, preventing concurrent users on the same channel from stealing
  each other's ephemeral context.  ContextVars propagate to child
  asyncio.Tasks, so the value survives through handle_message →
  _process_message_background → _send_with_retry → send().
- Add truncate_message() in _send_slash_ephemeral to prevent silent
  failures on long responses (response_url has the same ~40k limit).
- Log send_private_notice failures at debug level instead of bare
  except/pass — aids diagnostics without spamming.
- Document app_mention dedup dependency on shared event ts.
- Add tests: free-form question must NOT stash context, concurrent
  users on the same channel get isolated contexts, non-slash send()
  path fallback behavior.
2026-05-01 13:33:06 -07:00
kshitijk4poor f34d298495 chore: add probepark to AUTHOR_MAP
Required for contributor_audit.py strict mode on the salvaged
PR #9340 commit.
2026-05-01 13:33:06 -07:00
probepark 0ab2d752ff feat(gateway): private notice delivery and Slack format_message fixes
Adds platform-level private notice delivery abstraction so operational
messages (e.g. sethome prompt) can be sent ephemerally on Slack when
configured with `slack.notice_delivery: private`.

Changes:
- gateway/config.py: _normalize_notice_delivery() + GatewayConfig.get_notice_delivery()
  with per-platform config bridging
- gateway/platforms/base.py: send_private_notice() default implementation
  (falls through to send())
- gateway/platforms/slack.py: send_private_notice() via chat_postEphemeral
- gateway/run.py: _deliver_platform_notice() helper replaces direct
  adapter.send() for the sethome notice, with private→public fallback
- gateway/platforms/slack.py: app_mention handler now forwards to
  _handle_slack_message (safe due to ts-based dedup) instead of no-op pass,
  fixing edge-case Slack configs where mentions arrive only as app_mention
- gateway/platforms/slack.py format_message: negative lookbehind prevents
  markdown images (![]()) from becoming broken Slack links; italic regex
  now requires non-whitespace boundaries so 'a * b * c' stays literal

Based on PR #9340 by @probepark.
2026-05-01 13:33:06 -07:00
kshitijk4poor 7cda0e5224 fix(gateway/slack): ephemeral ack and routing for slash commands
Slack slash commands (/q, /btw, /stop, /model, etc.) previously showed
no user-visible acknowledgement and posted command replies as public
channel messages.  This diverged from Discord, which uses ephemeral
deferred responses for slash commands.

Changes:
- handle_hermes_command now passes response_type='ephemeral' and a
  'Running /cmd…' text to ack(), giving the user immediate 'Only visible
  to you' feedback when they invoke any native slash command.
- _handle_slash_command stashes the Slack response_url from the command
  payload in a per-channel context dict before dispatching to
  handle_message.
- send() checks for a pending slash context and, when found, POSTs to
  the response_url with replace_original=true to swap the initial ack
  with the real command reply (e.g. 'Queued for the next turn.'),
  keeping it ephemeral.
- Stale slash contexts are garbage-collected on lookup (120s TTL).
- The response_url POST is non-fatal: if it fails, the user already saw
  the initial ack, and send() returns success=True.

Fixes #18182
2026-05-01 13:33:06 -07:00
Jeffrey Quesnelle 0b76d23d1a makes the Persistent Goals docs accessible in the docs nav (and llms.txt) (#18481) 2026-05-01 10:29:22 -07:00
Teknium f99676e315 fix(gateway): auto-restart when source files change out from under us (#17648) (#18409)
Long-running gateway processes that survive 'hermes update' keep
pre-update modules cached in sys.modules. When new tool files on
disk then try to 'from hermes_cli.config import cfg_get' (added in
PR #17304), the import resolves against the stale module object
and raises ImportError — hitting users on Matrix, Telegram, Feishu,
and other platforms.

Two defenses:

1. Gateway self-check (gateway/run.py). On __init__, snapshot the
   newest mtime across sentinel source files (hermes_cli/config.py,
   run_agent.py, gateway/run.py, etc.). On every inbound message,
   re-read those mtimes; if any is newer than boot time + 2s slack,
   request a graceful restart via the normal drain path and return
   a one-line ack to the user. Idempotent, works regardless of how
   the update happened (hermes update, manual git pull, installer).

2. Post-restart survivor sweep ('hermes update'). After the existing
   restart loop, sleep 3s, rescan for gateway PIDs we already tried
   to kill, and SIGKILL any survivors. The detached profile watchers
   and systemd then relaunch with fresh code instead of waiting out
   the 120s watcher timeout.

Closes #17648.
2026-05-01 09:50:08 -07:00
Teknium 77c0bc6b13 fix(curator): defer first run and add --dry-run preview (#18373) (#18389)
* fix(curator): defer first run and add --dry-run preview (#18373)

Curator was meant to run 7 days after install, not on the very first
gateway tick. On a fresh install (no .curator_state), should_run_now()
returned True immediately because last_run_at was None — so the gateway
cron ticker fired Curator against a fresh skill library moments after
'hermes update'. Combined with the binary 'agent-created' provenance
model (anything not bundled and not hub-installed), this consolidated
hand-authored user workflow skills without consent.

Changes:
- should_run_now(): first observation seeds last_run_at='now' and returns
  False. The next real pass fires one full interval_hours later (7 days
  by default), matching the original design intent.
- hermes curator run --dry-run: produces the same review report without
  applying automatic transitions OR permitting the LLM to call
  skill_manage / terminal mv. A DRY-RUN banner is prepended to the
  prompt and the caller skips apply_automatic_transitions. State is
  NOT advanced so a preview doesn't defer the next scheduled real pass.
- hermes update: prints a one-liner on fresh installs pointing at
  --dry-run, pause, and the docs. Silent on steady state.
- Docs: curator.md and cli-commands.md explain the deferred first-run
  behavior and warn that hand-written SKILL.md files share the
  'agent-created' bucket, with guidance to pin or preview before the
  first pass.

Tests:
- test_first_run_defers replaces the old 'first run always eligible'
  assertion — same fixture, inverted expectation.
- test_maybe_run_curator_defers_on_fresh_install covers the gateway tick
  path end-to-end.
- Three new dry-run tests cover state-advance suppression, prompt
  banner injection, and apply_automatic_transitions skipping.

Fixes #18373.

* feat(curator): pre-run backup + rollback (#18373)

Every real curator pass now snapshots ~/.hermes/skills/ into
~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz before calling
apply_automatic_transitions or the LLM review. If a run consolidates or
archives something the user didn't want touched, 'hermes curator
rollback' restores the tree in one command. Dry-run is skipped — no
mutation means no snapshot needed.

Changes:
- agent/curator_backup.py (new): tar.gz snapshot + safe rollback. The
  snapshot excludes .curator_backups/ (would recurse) and .hub/ (managed
  by the skills hub). Extract refuses absolute paths and .. components,
  and uses tarfile's filter='data' on Python 3.12+. Rollback takes a
  pre-rollback safety snapshot FIRST, stages the current tree into
  .rollback-staging-<ts>/ so the extract lands in an empty dir, and
  cleans the staging dir on success. A failed extract restores the
  staged contents.
- agent/curator.py: run_curator_review() calls curator_backup.
  snapshot_skills(reason='pre-curator-run') before apply_automatic_
  transitions. Best-effort — a failed snapshot logs at debug and the
  run continues (a transient disk issue shouldn't silently disable
  curator forever).
- hermes_cli/curator.py: new 'hermes curator backup' and 'hermes curator
  rollback' subcommands. rollback supports --list, --id <ts>, -y.
- hermes_cli/config.py: curator.backup.{enabled, keep} config block
  with sane defaults (enabled=true, keep=5).
- Docs: curator.md gets a 'Backups and rollback' section; cli-commands
  .md table gets the new rows.

Tests (new file tests/agent/test_curator_backup.py, 16 cases):
- snapshot creates tarball + manifest with correct counts
- snapshot excludes .curator_backups/ (recursion guard) and .hub/
- snapshot disabled via config returns None without creating anything
- snapshot uniquifies ids within the same second (-01 suffix)
- prune honors keep count, newest-first
- list_backups + _resolve_backup cover newest-default and unknown-id
- rollback restores a deleted skill with content intact
- rollback is itself undoable — safety snapshot shows up in list_backups
- rollback with no snapshots returns an error
- rollback refuses tarballs with absolute paths or .. components
- real curator runs take a 'pre-curator-run' snapshot; dry-runs do not

All curator tests: 210 passing locally.
2026-05-01 09:49:59 -07:00
Siddharth Balyan c5b4c48165 fix: lazy session creation — defer DB row until first message (#18370)
Prevents ghost sessions from accumulating in state.db when the TUI/web
dashboard is opened and closed without sending a message.

Changes:
- run_agent.py: Add _ensure_db_session() gate method, called at
  run_conversation() entry. Remove eager create_session() from __init__.
  Handle compression rotation flag correctly.
- tui_gateway/server.py: Remove eager db.create_session() in
  _start_agent_build(). Add post-first-message pending_title re-apply.
- hermes_state.py: Extract _insert_session_row() shared helper (DRY).
  Add prune_empty_ghost_sessions() for one-time migration.
- cli.py: One-time ghost session prune on startup. Fix _pending_title
  to call _ensure_db_session() before set_session_title().
- hermes_cli/main.py: Guard TUI exit summary on message_count > 0.
- tests: Update test_860_dedup to call _ensure_db_session() before
  direct _flush_messages_to_session_db() calls.

Closes: ghost session clutter in hermes sessions list and web dashboard.
2026-05-01 18:39:12 +05:30
Austin Pickett 20132435c0 Merge pull request #18117 from NousResearch/austin/fix/model-selector
feat(tui): overhaul /model picker to match hermes model with inline auth
2026-05-01 05:30:05 -07:00
Austin Pickett 5ad030d19d Merge pull request #18095 from NousResearch/austin/feat/plugins-page
feat(dashboard): Plugins page — manage, enable/disable, auth status
2026-05-01 05:29:24 -07:00
Austin Pickett 05c63259b5 Merge pull request #18358 from NousResearch/fix/kanban-buton
fix: kanban button
2026-05-01 04:49:06 -07:00
Austin Pickett a01c1f7305 fix: kanban button 2026-05-01 07:33:54 -04:00
Siddharth Balyan 75e1339d4c fix(telegram): send seed message after creating DM topics (#18334)
Telegram's client does not display empty forum topics in the chat's
topic list. After createForumTopic succeeds, send a short pin message
into the new topic so it becomes immediately visible to the user.

Only fires for newly created topics (no thread_id in config yet).
Failure to send the seed is non-fatal (debug-logged, topic still works).
2026-05-01 15:21:56 +05:30
Ben Barclay 0159f25fd0 Merge pull request #18281 from NousResearch/bb/fix-tui-docker-ink-v2
fix: prevent tui rebuilding assets
2026-05-01 18:43:40 +10:00
UgwujaGeorge b7ad3f478f fix(yuanbao): enforce owner identity check on group slash commands
The bot-owner identity check inside OwnerCommandMiddleware was commented
out and replaced with a hardcoded `is_owner = True`, so any group member
could trigger allowlisted privileged commands (/approve, /deny, /stop,
/reset, /retry, /undo, /new, /background, /bg, /btw, /queue, /q) by
sending the slash command without @-mentioning the bot. The most severe
case is /approve: a non-owner could approve a dangerous tool call the
bot was waiting on the owner to confirm.

Re-enable the documented identity check (push.from_account ==
push.bot_owner_id) so only the configured owner can issue these
commands.
2026-04-30 23:57:55 -07:00
Teknium a2a32688ca docs(website): add User Stories and Use Cases collage page (#18282)
Adds a new top-of-sidebar docs page at /docs/user-stories that is a
masonry-style collage of 99 real user stories sourced from X/Twitter,
GitHub issues/PRs, Reddit, Hacker News, YouTube, blogs (Medium, Substack,
dev.to), podcasts, LinkedIn, GitHub Gists, and Product Hunt.

Every tile links to the original post/issue/video/gist where someone
described a specific use case: personal assistants, dev workflows,
trading bots, research briefs, family WhatsApp agents, Kubernetes
deployments, legal-domain self-hosted setups, and more.

- docs/user-stories.mdx: MDX entry mounting the collage component
- src/components/UserStoriesCollage: React component with category +
  source filters, CSS-columns masonry layout, per-category accent colors
- src/data/userStories.json: source-of-truth dataset (force-added; the
  root .gitignore's unanchored 'data/' rule would otherwise swallow it,
  same reason skills.json is explicitly listed in website/.gitignore)
- sidebars.ts: link added at the top of the docs sidebar
2026-04-30 23:56:59 -07:00
Ben a49f4c617d fix: prevent tui rebuilding assets 2026-05-01 16:29:46 +10:00
web-dev0521 dfe512c58d fix(paths): route achievements plugin + profile-tui through HERMES_HOME
Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME
check, breaking Docker deployments and profile isolation (hermes -p):

- plugins/hermes-achievements/dashboard/plugin_api.py:
  state_path(), snapshot_path(), checkpoint_path() bare-literal paths
- scripts/profile-tui.py:
  DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME
- hermes_cli/slack_cli.py:
  except-Exception fallback for slack-manifest.json dump
- optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py:
  --target argparse default

Use get_hermes_home() (with an ImportError shim for the standalone
scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")'
where importing hermes_constants is impractical.

E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and
both profile-tui defaults route under /tmp/x.

Salvaged from #18068 (original scope was broader mechanical cleanup
claiming 23 callsites were buggy; most were already respecting
HERMES_HOME via os.environ.get(key, default) — only these 4 had no env
check at all). Credit: @web-dev0521.
2026-04-30 23:21:54 -07:00
Teknium c6eebfc25a docs: publish llms.txt and llms-full.txt for agent-friendly ingestion (#18276)
Two machine-readable entry points to the Hermes Agent docs:

  /llms.txt         curated index of every doc page, one link per page
                    with short descriptions. ~17 KB, safe to load into
                    an LLM context window.
  /llms-full.txt    every page under website/docs/ concatenated as markdown.
                    ~1.8 MB. For one-shot ingestion by coding agents and
                    RAG pipelines.

Both files are also served from /docs/llms.txt and /docs/llms-full.txt
(Docusaurus serves website/static/ under baseUrl=/docs/). Some agents and
IDE plugins probe the classic site-root path; the deploy workflow now copies
both files to _site root so either URL works.

Conforms to the emerging llmstxt.org spec: H1 project name, blockquote
summary, short install command, GitHub link, then curated sections
mirroring the docs-site navigation (Getting Started, Using Hermes,
Features, Messaging, Integrations, Guides, Developer Guide, Reference).

Generated by website/scripts/generate-llms-txt.py. Wired into prebuild.mjs
so every 'npm run build' and 'npm run start' refreshes the files alongside
the existing skills.json extraction. Both outputs are gitignored (same
precedent as src/data/skills.json).

Descriptions in llms.txt are pulled from each page's frontmatter, so they
stay current automatically. All ~80 section slugs are validated against
the filesystem at generation time; an invalid slug would fail the prebuild.
2026-04-30 23:17:14 -07:00
Teknium cf2b2d31ce docs: add Persistent Goals (/goal) feature page (#18275)
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR #18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
2026-04-30 23:16:54 -07:00
teknium1 2af8b8ff37 fix(moonshot): also strip nullable/enum after anyOf collapse
The anyOf collapse in _repair_schema returned early, skipping the
nullable-strip and enum-cleanup steps. When a schema had anyOf
[{enum: [..., null, '']}, {type: null}] alongside a parent-level
'nullable: true', collapsing to the single non-null branch produced a
merged node that still had both 'nullable' and the bad enum values —
Moonshot would still 400 on it.

Fix: fall through to Rules 1/3 when the collapse produces a single
merged node; only return early for the multi-branch case (pure
anyOf preservation) or when there was no null branch to remove.

Adds a test that locks in the combined-case expectation.
2026-04-30 23:14:31 -07:00
teknium1 9cb5baeacf chore(release): map hendrixfreire for moonshot salvage 2026-04-30 23:14:31 -07:00
Hendrix 9ca72a69a7 fix(moonshot): fill missing type before enum cleanup to handle anyOf branches without explicit type
When a schema node inside anyOf has enum values but no explicit 'type',
Rule 3 (enum cleanup) ran before _fill_missing_type, so node_type was
None and the enum was never cleaned. Moonshot then rejected the schema
with 'enum value (<nil>) does not match any type in [string]'.

Fix: reorder operations — fill missing type first, strip nullable,
then clean enum. This ensures enum cleanup always has a type to check.

Also fixes test expectation: empty string in enum is now correctly
stripped (Moonshot rejects it too).

Closes #16875
2026-04-30 23:14:31 -07:00
Teknium 77dd6d5469 chore(release): add mikeyobrien to AUTHOR_MAP 2026-04-30 23:13:34 -07:00
Mikey O'Brien 1be3b74cfb fix(gateway): honor MATRIX_HOME_ROOM in onboarding 2026-04-30 23:13:34 -07:00
Teknium 265bd59c1d feat: /goal — persistent cross-turn goals (Ralph loop) (#18262)
Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
2026-04-30 23:10:20 -07:00
Teknium 7c6c5619a7 docs(sidebar): collapse exploding skills tree to a single Skills node (#18259)
* docs(sidebar): collapse exploding skills tree to a single Skills node

The Skills sub-tree in the left sidebar expanded to 200+ entries
(22 bundled categories + 15 optional categories, every skill a page).
That's most of the nav on a first visit — docs for the actual product
get drowned in it.

Collapse the sidebar to:

  Skills
    godmode              (hand-written spotlight)
    google-workspace     (hand-written spotlight)
    Bundled catalog      (reference/skills-catalog — table of all bundled)
    Optional catalog     (reference/optional-skills-catalog — table of all optional)

Per-skill pages still generate and are still reachable at their URLs;
they're linked from the two catalog tables and from the Skills overview
page. They just don't appear in the left nav anymore.

sidebars.ts goes from 649 lines to 247. generate-skill-docs.py loses
the bundled/optional sidebar render helpers.

Also picks up incidental generator output drift on current main
(comfyui skill content refresh; 4 new skill pages for
devops-kanban-orchestrator, devops-kanban-worker,
productivity-here-now, productivity-shopify; two catalog refreshes).
These are what the generator produces on main today — keeping them
committed avoids the next docs build showing 'working tree dirty'.

* docs(sidebar): drop godmode and google-workspace spotlight pages

Keep the Skills sidebar node strictly principled: two catalog links,
nothing else. There was no rule for which skills got spotlight pages
and which got auto-generated pages — just that these two happened to
be hand-written first.

Both pages still build and are still reachable at
/docs/user-guide/skills/godmode and
/docs/user-guide/skills/google-workspace. They're linked from the
catalog tables and the Skills overview page.

Sidebar Skills node now:
  Skills
    ├── Bundled catalog
    └── Optional catalog
2026-04-30 23:08:22 -07:00
Teknium 50c046331d feat(update): add --yes/-y flag to skip interactive prompts (#18261)
hermes update had two interactive [Y/n] prompts with no bypass:
  1. Config migration (after new env/config options are added)
  2. Autostash restore (when uncommitted work was stashed before pull)

hermes uninstall already has --yes/-y; mirrors that.

Under --yes:
  - Config-migrate prompt → auto-yes, migrate_config(interactive=False)
    so new config fields are applied but API-key prompts are skipped
    (user runs 'hermes config migrate' later for those). Matches
    gateway-mode semantics.
  - Stash-restore prompt → auto-yes, git stash apply runs automatically.

Closes the 'can I hermes update -y, No ! Fix' gap reported by @murelux.
2026-04-30 23:06:32 -07:00
Teknium 4caad285a6 feat(gateway): auto-delete slash-command system notices after TTL (#18266)
Adds opt-in auto-deletion for slash-command reply messages like
"New session started!", "Restarting gateway…", "Stopped.", and
YOLO toggles.  After the TTL elapses the gateway calls the adapter's
delete_message; on platforms without a delete API (everything except
Telegram today) the TTL is silently ignored and the message stays.

Requested on Twitter by @charlesmcdowell — tool-call bubbles are useful
real-time, but system notices clutter the thread once the agent finishes.

Implementation:

- EphemeralReply(str) sentinel in gateway/platforms/base.py.  Subclasses
  str so existing 'X' in response / response.startswith(...) checks in
  tests and call sites keep working unchanged; isinstance() still
  distinguishes it for the send path.
- _process_message_background and both busy-session bypass paths
  (in base.py) call _unwrap_ephemeral() on the handler return, send
  the unwrapped text, and schedule a detached delete task when the
  TTL > 0 AND the adapter class overrides delete_message.
- display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG.
  Handler can pass ttl_seconds explicitly to override.
- Wrapped the highest-noise return sites: /new, /reset, /stop,
  /yolo on/off, /restart success + "already in progress".  Draining
  notices and /help output left as plain strings — those are
  informational and users want to read them.

Backward-compat: default TTL 0 → no scheduling, no behavior change
for existing users.  Platforms without delete_message silently no-op.
2026-04-30 23:05:48 -07:00
Teknium e2eb561e8e fix(curator): rewrite cron job skill refs after consolidation (#18253)
When the curator consolidates skill X into umbrella Y, any cron job
that listed X in its skills field would fail to load X at run time —
the scheduler logs a warning and skips it, so the scheduled job runs
without the instructions it was scheduled to follow.

cron.jobs.rewrite_skill_refs(consolidated, pruned) now updates jobs
in-place: consolidated names route to the umbrella target (dedup
when umbrella is already present), pruned names are dropped.
agent.curator._write_run_report calls it after classification,
best-effort so a cron-side failure never breaks the curator itself.

Results are recorded in run.json (counts.cron_jobs_rewritten + full
cron_rewrites payload), a separate cron_rewrites.json for convenience
when jobs were touched, and a section in REPORT.md.

Reported by @tombielecki.
2026-04-30 23:04:50 -07:00
IMHaoyan bfb704684e fix(deepseek): use non-empty reasoning_content placeholder for V4 Pro thinking mode
DeepSeek V4 Pro tightened thinking-mode validation and rejects empty-string
reasoning_content with HTTP 400:

    The reasoning content in the thinking mode must be passed back to the API.

run_agent.py injected "" at three fallback sites — the tool-call pad in
_build_assistant_message and both injection branches of
_copy_reasoning_content_for_api (cross-provider poison guard + unconditional
thinking pad). All three now emit " " (single space), which satisfies the
non-empty check on V4 Pro without leaking fabricated reasoning.

Also upgrades stale empty-string placeholders on replay: sessions persisted
before this change have reasoning_content="" pinned at creation time; when
the active provider enforces thinking-mode echo, the replay path now rewrites
"" -> " " so existing users don't 400 on their first V4 Pro turn after
updating. Non-thinking providers still round-trip "" verbatim.

Updates 9 existing assertions + adds 2 regression tests (stale-placeholder
upgrade, non-thinking verbatim preservation).

Refs #15250, #17400.
Closes #17341.
2026-04-30 23:04:23 -07:00
Teknium f0dc919f92 fix(compression): include system prompt + tool schemas in token estimates (#18265)
The user-visible /compress banner and the post-compression last_prompt_tokens
writeback both counted only the raw message transcript (chars/4). With a 15KB
system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks
like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of
request pressure — a 234x gap.

Two user-facing consequences:
- Banner shows 'Compressing … (~45 tokens)…' while compression is actually
  firing on 10K+ tokens of real pressure, confusing users about why
  compression triggered (reported by @codecovenant on X; #6217).
- Post-compression last_prompt_tokens writeback omits tool schemas, so the
  next should_compress() check compares real usage against a stale
  underestimate — compression triggers late, potentially past the model's
  context limit on small-context models (#14695).

Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough()
at every user-visible banner and at the post-compression writeback.
estimate_request_tokens_rough() already existed for exactly this purpose
and includes system prompt + tool schemas.

Touched call sites:
- run_agent.py: post-compression last_prompt_tokens writeback, post-tool
  call should_compress() fallback when provider usage is missing
- cli.py: /compress banner + summary
- gateway/run.py: gateway /compress banner + summary
- tui_gateway/server.py: TUI /compress status + summary
- acp_adapter/server.py: ACP /compact before/after

Left intentionally alone:
- Session-hygiene fallback and the 'no agent' /status path in gateway/run.py
  — no agent instance is in scope to query for system prompt/tools, and the
  existing 30-50% overestimate wobble on hygiene is safety-accepted.
- Verbose-mode 'Request size' logging — informational only, already counts
  system prompt via api_messages[0].

Also relabels the feedback line from 'Rough transcript estimate' to
'Approx request size' so the metric label matches what it actually measures.

Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217);
user report @codecovenant on X (2026-04-30).

Closes #14695
Closes #6217
2026-04-30 23:03:54 -07:00
Teknium 41fa1f1b5c fix(acp): run /steer as a regular prompt on idle sessions (#18258)
When a user types /steer <text> on an ACP session that isn't actively
running a turn (and there's no interrupted-prompt salvage available),
_cmd_steer silently appended to state.queued_prompts and replied
"No active turn — queued for the next turn". That looks identical to
/queue output even though the user never typed /queue — @EddyLeeKhane
reported this as "/steer never works, gets queued instead".

Rewrite the payload to a plain user prompt before the slash-intercept
fires, matching the gateway's idle-/steer fallthrough in
gateway/run.py ~L4898.
2026-04-30 22:45:14 -07:00
Teknium fc78e708ed fix(update): don't crash hermes update if skill config scan fails (#18257)
`hermes update` ran the config migration (11 → 17) successfully then
crashed at `agent/skill_utils.py:340` during the post-migration
skill-config prompt. User @FlockonUS reported this on Twitter.

Root cause: `get_missing_skill_config_vars` in hermes_cli/config.py
only guarded the import of `discover_all_skill_config_vars`, not the
call. Any runtime exception inside the skill scan (malformed SKILL.md,
unreadable external skill dir, etc.) propagated up through
`migrate_config` and aborted `hermes update` after the version bump.

Wrap the call in try/except so skill-config prompting — which is a
post-migration nicety — can never block the migration itself.
2026-04-30 22:44:41 -07:00
Henkey ec1443b9f1 fix(acp): normalize Windows cwd for WSL tool execution 2026-04-30 20:55:14 -07:00
Henkey 78886365c2 fix(acp): replay interrupted prompts for steer 2026-04-30 20:54:37 -07:00
Henkey e27b0b7651 feat(acp): add steer and queue slash commands 2026-04-30 20:54:37 -07:00
Teknium 8fa44b1724 fix(guardrails): preserve display _detect_tool_failure semantics
The initial guardrail PR consolidated failure classification by pointing
display._detect_tool_failure at the new classify_tool_failure helper,
which was strictly broader: it flagged any JSON result with
"success": false / "failed": true / non-empty "error", plus plain-text
"traceback" and "error:" prefixes. That would uptick the user-visible
[error] tag on tools that return {"success": false} as a benign signal
(memory fullness, todo state, etc.) and feed the failure-streak counter
at the same time.

Restore display._detect_tool_failure to its pre-PR semantics verbatim.
Tighten classify_tool_failure (the guardrail's internal safety-fallback
used only when callers don't pass failed=) to match _detect_tool_failure
exactly, so the two never disagree. Production callers in run_agent.py
already pass an explicit failed= derived from _detect_tool_failure, so
the guardrail counter is driven by the same signal the CLI shows.
2026-04-30 20:43:15 -07:00
Mind-Dragon 0704589ceb fix(agent): make tool loop guardrails warning-first 2026-04-30 20:43:15 -07:00
Mind-Dragon 58b89965c8 fix(agent): add tool-call loop guardrails 2026-04-30 20:43:15 -07:00
Austin Pickett c23c7c994b fix(tui): address remaining review feedback — ordering and digit shortcuts
- Emit providers in CANONICAL_PROVIDERS order (matching hermes model)
  with user-defined/custom providers appended after
- Remove digit quick-select (1-9,0) handler — inconsistent with
  absolute row numbering and already removed from hint text
- Remove unused windowOffset import
2026-04-30 23:41:19 -04:00
Oxidane-bot 8d7500d80d fix(gateway): snapshot callback generation after agent binds it, not before
_process_message_background snapshotted callback_generation from the
interrupt event at the TOP of the task — before the handler ran.
_hermes_run_generation is only set on the event by
GatewayRunner._bind_adapter_run_generation during
_handle_message_with_agent, which runs DURING the handler await. The
early snapshot always captured None, which then flowed into
pop_post_delivery_callback(..., generation=None) in the finally block.

In pop_post_delivery_callback, generation=None with a tuple-registered
entry (generation, callback) bypasses the ownership check — it pops and
fires the callback regardless of which run owns it. Result: a stale run
could fire a fresher run's post-delivery callback (e.g. a
background-review notification attributed to the wrong turn).

Fix: move the snapshot into the finally block, after the handler has
run and _hermes_run_generation has been bound to the current run.

Regression test added: simulates a stale handler at generation=1 and a
fresher callback registered at generation=2. Pre-fix: snapshot=None →
pop fires the generation=2 callback under generation=1's ownership
("newer" fires). Post-fix: snapshot=1 → pop skips the mismatched
entry, callback stays in the dict for the correct run to claim.

Verified: test FAILS on current main (captures "newer" in fired list),
PASSES with this fix.

Salvaged from PR #12565 (the callback-ownership portion only; the
/status totals portion was already fixed on main in 7abc9ce4d via #17158).

Co-authored-by: Oxidane-bot <1317078257maroon@gmail.com>
2026-04-30 20:41:18 -07:00
Teknium 27ec74c68a fix: coerce show_reasoning and guard_agent_created config bools
Widens #16528 to two sibling sites that had the same quoted-boolean
bug: a YAML string "false" (or "0", "no", "off") silently evaluated
truthy under bool() / if-check.

- gateway/run.py _load_show_reasoning: is_truthy_value wrap
- tools/skill_manager_tool.py _guard_agent_created_enabled: is_truthy_value wrap
- regression tests for both
2026-04-30 20:40:46 -07:00
johnncenae bb706c3f38 fix(gateway): coerce tool_progress_command as a real boolean 2026-04-30 20:40:46 -07:00
Teknium a94841eaa0 fix(state): include finish_reason in conversation replay
SELECT in get_messages_as_conversation() was missing finish_reason, so
assistant messages round-tripped through replay (including /branch copies)
silently dropped the provider's stop signal. Adds it to the SELECT, restores
it on assistant rows, and locks it in with a round-trip test.
2026-04-30 20:40:28 -07:00
simbam99 7ba1a2b3df fix(gateway): preserve assistant metadata when branching sessions 2026-04-30 20:40:28 -07:00
Yukipukii1 55366510e5 fix(auth): make provider config writes atomic 2026-04-30 20:39:41 -07:00
Teknium 787b5c5f93 chore(release): map Mind-Dragon and JustinUssuri emails for AUTHOR_MAP 2026-04-30 20:38:09 -07:00
Mind-Dragon ab6c629ccc fix(terminal): skip sudo prompt when local NOPASSWD sudo works
When running on a host with sudoers NOPASSWD configured for the current
user, interactive Hermes sessions were unnecessarily entering the
password prompt path before executing sudo commands. Outside Hermes,
`sudo -n true` exits 0 for that user.

Add `_sudo_nopasswd_works()` that probes `sudo -n true` and, when it
succeeds, lets `_transform_sudo_command()` return the command unchanged
with no stdin password. The probe:

- Is scoped to the `local` terminal backend only, so Docker/SSH/Modal
  and other remote backends do not inherit host sudo state.
- Re-probes every call (no process-lifetime cache) so an expired sudo
  timestamp cannot silently make a later command block waiting for a
  password that Hermes never prompts for.
- Is bypassed entirely when `SUDO_PASSWORD` is configured or a cached
  password already exists, preserving existing explicit-password flows.

Co-authored-by: Junting Wu <juntingpublic@gmail.com>
2026-04-30 20:38:09 -07:00
simbam99 ccfe6a47c3 fix(gateway): coerce StreamingConfig booleans and malformed numerics safely 2026-04-30 20:37:49 -07:00
hharry11 24130b7e53 fix(approval): harden YOLO mode env parsing against quoted-bool strings 2026-04-30 20:37:37 -07:00
hharry11 158eb32686 fix(gateway): preserve document type when merging queued events 2026-04-30 20:37:27 -07:00
sprmn24 adaee2c72c test(skill_utils): add regression tests for non-dict metadata in extract_skill_conditions
The fix for this bug (isinstance guard) was merged via commit 3ff9e010,
but test coverage was not included. Adding 4 tests:
- dict metadata with hermes keys (normal case)
- string metadata (bug case — previously caused AttributeError)
- None metadata
- missing metadata key
2026-04-30 20:37:15 -07:00
teknium1 e21898ea98 test(discord_tool): add regression test for per-token capability cache
Proves token A's detected capabilities do not leak to token B after the
fix in the preceding commit. Before the fix this test would have seen
both tokens return token A's cached value.
2026-04-30 20:37:12 -07:00
sprmn24 fa7b0b0a67 fix(discord_tool): key capability cache by token instead of single global
_capability_cache was a single module-level dict shared across all
tokens. If the bot token rotates or multiple tokens are used in one
process, capabilities detected for token A would be returned for
token B, causing wrong schema gating and incorrect runtime behavior.

Replace the single Optional cache with a Dict keyed by token so each
token gets its own isolated capability entry.
2026-04-30 20:37:12 -07:00
Teknium 82b5786721 test(browser_supervisor): cover cache-hit healthcheck on dead thread/loop
Pure unit tests for _SupervisorRegistry — no Chrome required. Verified
to fail when the fix is reverted, pass with it in place.
2026-04-30 20:33:33 -07:00
sprmn24 73a6b80317 fix(browser_supervisor): verify thread and loop health before returning cached supervisor
_SupervisorRegistry.get_or_start() returned an existing supervisor
whenever the cdp_url matched, without checking if the supervisor's
thread or event loop was still alive. A crashed supervisor would be
silently reused, causing missed dialog/frame updates.

Now checks both _thread.is_alive() and _loop.is_running() before
returning the cached instance. An unhealthy supervisor is torn down
and recreated, matching the existing URL-changed code path.
2026-04-30 20:33:33 -07:00
sprmn24 ec4cb16a29 fix(honcho): guard _peers_cache and _sessions_cache reads under _cache_lock
_get_peer() and _get_or_create_honcho_session() accessed _peers_cache
and _sessions_cache without holding _cache_lock, while other paths
in the same class use the lock consistently. Under concurrent tool
calls or prefetch threads, this can produce stale reads or lost
cache updates.

Wrap both unguarded cache read sites in _cache_lock. Network calls
(honcho.peer() and honcho.session()) remain outside the lock to
avoid holding it during I/O.
2026-04-30 20:31:42 -07:00
sprmn24 bea2562fc4 fix(honcho): replace raw int() config parsing with safe helper
Three int() calls in HonchoClient.from_global_config() parsed
dialecticMaxChars, messageMaxChars, and dialecticMaxInputChars
directly without guards. A malformed value in honcho.json would
raise ValueError and abort provider initialization entirely.

Add _parse_int_config() helper following the existing
_parse_context_tokens() pattern, and replace all three raw
int() calls with it.
2026-04-30 20:31:32 -07:00
Roy-oss1 b94cb8e2c4 feat(feishu): operator-configurable bot admission and mention policy
Add two operator-facing toggles for inbound Feishu admission, enabling
bot-to-bot scenarios such as A2A orchestration and inter-bot
notifications:

  FEISHU_ALLOW_BOTS=none|mentions|all   (default: none)
    Accept messages from other bots. `mentions` requires the peer
    bot to @-mention Hermes; `all` admits every peer-bot message.

  FEISHU_REQUIRE_MENTION=true|false     (default: true)
    Whether group messages must @-mention the bot. Override per-chat
    via `group_rules.<chat_id>.require_mention` in config.yaml.

Defaults preserve prior behavior. Self-echo protection is always on:
when the bot's identity is unresolved (auto-detection failed and
FEISHU_BOT_OPEN_ID unset), peer-bot messages are rejected fail-closed
to avoid feedback loops.

Admitted peer bots bypass the human-user allowlist
(FEISHU_ALLOWED_USERS) to match existing Discord behavior; humans
still need an explicit allowlist entry. yaml feishu.allow_bots is
bridged to the env var so the adapter and gateway auth layer share
one source of truth.

Resolving peer-bot display names requires the
application:bot.basic_info:read scope; without it, peers still route
but appear as their open_id.

Test: tests/gateway/test_feishu_bot_admission.py covers the admission
pipeline, group-policy bot-bypass, hydration, and event-dispatch
plumbing as a parametrized matrix.

Change-Id: I363cccb578c2a5c8b8bf0f0a890c01c89909e256
2026-04-30 20:30:31 -07:00
buray fa9fd26acb fix(gateway): re-inject topic-bound skill after /new or /reset
reset_session() creates a fresh SessionEntry with created_at == updated_at,
but get_or_create_session() bumps updated_at on the next inbound message,
causing _is_new_session in _handle_message_with_agent to evaluate False.
The topic/channel skill auto-load gate (group_topics, channel_skill_bindings)
silently skips the first message after a manual reset.

Add an is_fresh_reset flag on SessionEntry, set by reset_session() and
consumed once by the message handler. Kept distinct from was_auto_reset
because that flag also drives a 'session expired due to inactivity'
user-facing notice and a context-note prepend — both wrong for an
explicit /new or /reset.

Persisted through to_dict/from_dict so the flag survives gateway
restart between /reset and the next message.

Fixes #6508

Co-authored-by: warabe1122 <45554392+warabe1122@users.noreply.github.com>
Co-authored-by: willy-scr <187001140+willy-scr@users.noreply.github.com>
2026-04-30 20:29:19 -07:00
Jezza Hehn 7abc9ce4df fix(gateway): read /status token totals from SessionDB (#17158)
/status was reading session_entry.total_tokens from the in-memory
SessionStore (gateway/session.py), which the agent never writes to —
so the token count was always 0.

The agent already persists token deltas to the SQLite SessionDB
(run_agent.py:11497) for every platform with a session_id. Route
/status through that single source of truth instead of duplicating
token writes into a second store.

Fix:
- gateway/run.py: _handle_status_command now calls
  self._session_db.get_session(session_id) and sums the five token
  component columns (input/output/cache_read/cache_write/reasoning).
  Falls back to 0 when no SessionDB is configured or no row exists.
- Two new regression tests covering the populated-row and
  missing-row paths.

Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>
2026-04-30 20:28:50 -07:00
Teknium a178081468 fix(gateway): use _session_key_for_source for native image buffer write
Minor follow-up to the native-image-buffer isolation fix. The write site
in _prepare_inbound_message_text was calling build_session_key directly,
while every other call site in gateway/run.py uses the _session_key_for_source
helper — which consults session_store._generate_session_key first and falls
back to build_session_key. Keeping the write key and consume key on the
same helper prevents key drift if the session store ever overrides the
default keying behavior.
2026-04-30 20:26:35 -07:00
Yukipukii1 bdb7edd89e fix(gateway): isolate pending native image paths by session 2026-04-30 20:26:35 -07:00
sprmn24 5ed27c0f74 fix(tui_gateway): guard env var parsing against invalid values at import
_SLASH_WORKER_TIMEOUT_S and _pool used raw float()/int() on env vars
at module level. A non-numeric value (e.g. HERMES_TUI_SLASH_TIMEOUT_S=abc)
raises ValueError during import, preventing TUI gateway from starting
with no useful error message.

Wrap both parses in try/except with safe fallbacks:
- HERMES_TUI_SLASH_TIMEOUT_S: fallback to 45.0s
- HERMES_TUI_RPC_POOL_WORKERS: fallback to 4 workers
2026-04-30 20:26:23 -07:00
Teknium 531ac20408 fix(state): JSON-encode multimodal message content for sqlite
sqlite3 can only bind str/bytes/int/float/None to query parameters.
Multimodal message content is a list of parts (text + image_url), which
raised 'Error binding parameter 3: type list is not supported' in
append_message and replace_messages.

In the CLI/TUI this surfaced as a visible crash when users pasted
screenshots. In the gateway it was silently swallowed by a bare except
in append_to_transcript, causing multimodal turns to be lost from the
session transcript.

Fix at the DB layer: _encode_content wraps lists/dicts as
'\\x00json:' + json.dumps(...) on write, _decode_content unwraps on
read. Plain strings are untouched, so existing FTS search, previews,
and JSONL compat are unaffected. Paired decode in get_messages,
get_messages_as_conversation, and search_messages context previews.

Regression test covers: list content round-trip, dict content
round-trip, string content stored unchanged, replace_messages with
multimodal content.

Also included: aligned fix #17522 for TUI image attachment with
paths containing spaces (see previous commit).
2026-04-30 20:25:52 -07:00
Harry Riddle cc340c4a4d fix(tui): always call input.detect_drop for reliable image attachment
Remove frontend regex pre-check that truncated paths containing spaces,
quotes, or Windows drive letters. Backend _detect_file_drop correctly
handles these patterns. This fixes image attachment for common filenames
like "Screenshot 2026-04-29.png".

Add tests:
- test_input_detect_drop_path_with_spaces: attaches image with spaces in name
- test_input_detect_drop_path_with_spaces_and_remainder: remainder handling

Also restored missing  in test_rollback_restore_resolves_number_and_file_path.

Scope: tui, vision, tests
2026-04-30 20:25:52 -07:00
Teknium 19136dfc07 chore: map jatingodnani email in AUTHOR_MAP 2026-04-30 20:24:39 -07:00
Teknium 9a75743496 fix(gateway): apply agent.disabled_toolsets in gateway message loop
Widens the cherry-picked fix from @jatingodnani (#17343) to the
gateway path. On main, user_config.agent.disabled_toolsets was only
honored by _get_platform_tools' name-level subtraction — it did not
catch tools pulled in implicitly by a composite toolset (browser
includes web_search, hermes-* platforms include most tools).

Changes:
- gateway/run.py: resolve disabled_toolsets alongside enabled_toolsets
  and pass to AIAgent at both user-facing construction sites (normal
  message loop + single-turn cron-like path). Hygiene/compression
  agents (fixed enabled_toolsets=[memory]) are intentionally untouched.
- gateway/run.py: add (agent, disabled_toolsets) to
  _CACHE_BUSTING_CONFIG_KEYS so editing the list in config.yaml
  invalidates the cached AIAgent on the next message.
- cli.py: drop unused 'import platform' left over from PR #17343's
  import churn; restore 'import sys' used throughout the file.
- model_tools.py: drop unused 'import os, sys' added by PR #17343;
  fix comment reference from #15291 (unrelated OAuth issue) to #17309.

Co-authored-by: jatin godnani <godnanijatin@gmail.com>
2026-04-30 20:24:39 -07:00
jatin godnani e3624e00db fix: enforce strictly subtractive toolset filtration
Refactor tool resolution logic in model_tools.py to ensure that
disabled_toolsets are always subtracted at the end, preventing
composite toolsets (e.g. 'browser') from implicitly enabling tools
that should be hidden.

- Added 'disabled_toolsets' to DEFAULT_CONFIG in hermes_cli/config.py
- Updated HermesCLI in cli.py to load and propagate disabled toolsets to AIAgent
- Implemented robust two-phase resolution (additive then subtractive) in model_tools.py
2026-04-30 20:24:39 -07:00
Teknium 8e58265b60 chore(release): map allard.quek@singtel.com → AllardQuek (#18196) 2026-04-30 20:23:31 -07:00
Allard Quek ebe60abc4f fix(dashboard): separate theme identity from layout scale
Themes previously embedded layout-affecting values (baseSize, lineHeight,
density, letterSpacing) alongside visual identity properties, coupling
user ergonomic preferences to color theme selection.

This change establishes a clear separation of concerns:

- Themes own: palette, font family, border-radius, and font-coupled
  letterSpacing (e.g. Inter's -0.005em tracking)
- Layout scale (baseSize, lineHeight, density) is standardized via
  DEFAULT_TYPOGRAPHY and DEFAULT_LAYOUT — not overridden per theme

All themes now spread DEFAULT_TYPOGRAPHY and DEFAULT_LAYOUT as their
base, removing silent divergence and making future layout settings
(e.g. user-configurable density) trivially applicable across all themes
without per-theme special-casing.
2026-04-30 20:22:54 -07:00
Allard Quek 33d24095c4 fix(dashboard): normalize typography and layout across built-in themes
All built-in themes now spread DEFAULT_TYPOGRAPHY, removing independent
baseSize overrides and converging on 15px. All themes also use
density: comfortable, removing the compact/spacious divergence that
caused item-count shifts on fixed-height pages (e.g. Skills).

Two additional per-theme overrides are also normalized:

- rose: lineHeight: "1.7" removed — was paired with density: spacious
  for an airy feel; once density was normalised the elevated line-height
  became an orphaned artefact causing nav item height drift.

- cyberpunk: letterSpacing changed from "0.02em" to "0" — extra tracking
  on top of an already-wide monospace font caused text to wrap earlier
  than in other themes.

Switching themes is now a purely cosmetic change — color palette,
font family, border-radius, and typographic style differ; font size,
spacing, line-height, and letter-spacing do not.
2026-04-30 20:22:54 -07:00
Teknium 01cc701e54 docs + nit: busy_ack_enabled follow-ups
- Move the disabled-ack guard above the debounce so we don't stamp
  _busy_ack_ts[session_key] when no ack was actually sent. Harmless
  (never read when disabled) but cosmetically off.
- Document display.busy_ack_enabled in user-guide/messaging/index.md
  and HERMES_GATEWAY_BUSY_ACK_ENABLED in reference/environment-variables.md.
- Add JezzaHehn to scripts/release.py AUTHOR_MAP for contributor credit.

Follow-up to #17491 (Jezza Hehn).
2026-04-30 20:22:30 -07:00
Jezza Hehn 2b512cbca4 feat(gateway): add busy_ack_enabled config option to suppress ack messages
When a user sends a message while the gateway is busy processing,
an acknowledgment message is sent. This can be spammy for users
who send rapid messages.

Add display.busy_ack_enabled config option (default: true) to allow
users to suppress these busy-input acknowledgment messages.

Fixes #17457
2026-04-30 20:22:30 -07:00
Yukipukii1 25cbe3e1d6 fix(gateway): preserve thread routing for /update progress and prompts 2026-04-30 20:19:23 -07:00
Teknium f48ba47d1e chore(release): map allard.quek@singtel.com → AllardQuek 2026-04-30 20:19:14 -07:00
Allard Quek 226fd79c8e feat(dashboard): add interactive column sorting to analytics tables 2026-04-30 20:19:14 -07:00
Teknium 0ddc8aba68 fix(fallback): let custom_providers shadow built-in aliases
When a user defines `custom_providers: [{name: kimi, ...}]` and references
`provider: kimi` from fallback_model or the main config, the built-in alias
rewriting (`kimi` → `kimi-coding`) was hijacking the request before the
named-custom lookup ran.  `_get_named_custom_provider` also refused to
return a match when the raw name resolved to any built-in (including aliases),
so the custom endpoint was unreachable.

Fix at both layers of the resolution chain so every caller benefits, not
just `_try_activate_fallback`:

- hermes_cli/runtime_provider.py: narrow `_get_named_custom_provider`'s
  built-in-wins guard to canonical provider names only.  An alias like
  `kimi` that resolves to a different canonical (`kimi-coding`) no longer
  blocks the custom lookup; a canonical name like `nous` still does.

- agent/auxiliary_client.py: in `resolve_provider_client`, try the named-
  custom lookup with the original (pre-alias-normalization) name before the
  alias-normalized one, so aliased requests reach the user's custom entry.
  Also honour `explicit_base_url` and `explicit_api_key` in the API-key
  provider branch so callers that pass explicit hints (e.g. fallback
  activation) can override the registered defaults.

Tests added for:
- custom `kimi` shadowing built-in alias (regression for #15743)
- custom `nous` NOT shadowing canonical built-in (behaviour preserved)
- bare `kimi` without any custom entry still routing to built-in
- explicit base_url/api_key override on the API-key provider branch

Original PR #17827 by @Feranmi10 identified the same bug class and
implemented a narrower fix in `_try_activate_fallback`; this reshapes the
fix to live in the shared resolution layer so all callers benefit.

Fixes #15743
Co-authored-by: Feranmi10 <89228157+Feranmi10@users.noreply.github.com>
2026-04-30 20:18:44 -07:00
Yukipukii1 38875d00a7 fix(gateway): ensure platform configs honor home_channel env overrides 2026-04-30 20:18:33 -07:00
Teknium 5089c55e0b refactor(state): compute last_active ordering at SQL level via recursive CTE
Follow-up to the previous commit. Replace the post-fetch Python re-sort (which
required dropping LIMIT/OFFSET from SQL and scanning every session row) with a
recursive CTE that walks compression-continuation chains and computes
effective_last_active per root at SQL level. The outer query can then ORDER BY
+ LIMIT efficiently, and the Python projection loop no longer has to handle
ordering.

This preserves the correctness win (old compression roots whose live tip was
touched recently surface correctly) without the O(N) scan, which matters for
users with thousands of sessions.

Adds a regression test pinning the compression-tip case at limit=1 — the
stress case that any bounded-oversample shortcut would get wrong.

Co-authored-by: simbam99 <simbamax99@gmail.com>
2026-04-30 20:17:15 -07:00
simbam99 142b4bf3ce fix(session_search): order recent mode by last activity instead of start time
- order session_search recent-mode results by last activity instead of session start time
- add an opt-in `order_by_last_active` path to `SessionDB.list_sessions_rich`
- add regression coverage for both the database ordering and recent-mode call path
2026-04-30 20:17:15 -07:00
Austin Pickett c8e506c383 fix(tui): address code review feedback on model picker
- Reset keySaving on back() to prevent blocked key entry after Esc
- Show '(needs setup)' for non-API-key auth providers instead of
  generic '(no key)'
- Set is_current correctly for unauthenticated providers that happen
  to be the active session provider
- Guard model.save_key with is_managed() check — return error on
  managed installs where .env is read-only
2026-04-30 23:11:28 -04:00
Austin Pickett f4c761c6a0 feat(tui): add inline provider disconnect via 'd' keybind in /model picker
- New model.disconnect RPC method: clears API key env vars from .env
  and OAuth/credential pool state via clear_provider_auth()
- Press 'd' on an authenticated provider opens confirmation prompt
- y/Enter confirms disconnect, n/Esc cancels
- Provider flips to unauthenticated state in-place (re-selectable
  to re-auth by pressing Enter again)
2026-04-30 23:03:32 -04:00
Austin Pickett 26f7f68507 feat(tui): show all providers in /model picker with inline API key setup
- model.options now returns all canonical providers (not just
  authenticated), each with authenticated/auth_type/key_env fields
- New model.save_key RPC method: saves API key to .env, sets in
  process, returns refreshed provider with models
- Picker shows ● (authed) / ○ (no key) markers with dimmed styling
- Selecting an unauthenticated api_key provider opens inline masked
  key input — after save, transitions directly to model selection
- Non-api_key auth providers show guidance to run hermes model
- Row numbers now show absolute position in list
2026-04-30 23:03:32 -04:00
Austin Pickett 36fa8a4d28 fix(tui): show absolute position numbers in model picker
The model picker displayed row numbers 1-12 regardless of scroll
position, making it impossible to tell where you were in the list.
Now shows the actual item index (e.g. 5, 6, 7... when scrolled down).

Also removed '1-9,0 quick' from the hint text since digit shortcuts
still work relative to the visible window, which would be confusing
with absolute numbering.
2026-04-30 23:03:32 -04:00
Austin Pickett 443950e827 fix(tui): pass user_providers as dict to match CLI model-switch pipeline
The TUI's _apply_model_switch() was converting the config.yaml
`providers:` dict into a list of dicts before passing it to
switch_model(). This caused resolve_provider_full() →
resolve_user_provider() to fail, since that function expects a dict
and does `user_config.get(name)` to look up provider entries.

The result: user-defined providers (e.g. ollama) appeared in CLI's
/model picker but were invisible in the TUI.

Fix:
- tui_gateway/server.py: pass cfg.get('providers') directly (dict),
  matching what cli.py already does at line 5598.
- hermes_cli/model_switch.py: fix the validation-override block
  (line ~893) which iterated user_providers as a list — now correctly
  handles the dict format with support for both dict-keyed and
  list-format models arrays.
2026-04-30 23:03:32 -04:00
Teknium 96691268df fix(gateway): drain manual profile gateways via SIGUSR1 before respawn
The PR wired in a detached watcher that respawns manual profile gateways
after they exit.  Pair that with a SIGUSR1 graceful drain (same path
systemd/launchd use) so in-flight agent runs finish instead of getting
SIGTERM'd.  Fall back to SIGTERM if SIGUSR1 isn't wired or the gateway
doesn't exit within the drain budget — the watcher sees the exit and
relaunches either way.

Tested end-to-end against an orphaned gateway: graceful drain exits in
0.5s and the watcher fires the relaunch command.
2026-04-30 20:00:31 -07:00
Michael Nguyen 77fe7ab6b2 feat(gateway): restart manual profile gateways after update 2026-04-30 20:00:31 -07:00
Teknium 84324d06b8 chore(release): add quocanh261997 to AUTHOR_MAP 2026-04-30 20:00:31 -07:00
Teknium 8b7b074df9 test(context_compressor): regression test for PR #17025 tail-protection off-by-one
When len(messages) <= protect_tail_count and a token budget is set, the
previous formula min(protect_tail_count, len(result) - 1) under-protected
the tail by one, allowing the oldest message to be summarized.

The test fails on the buggy formula (pruned == 1) and passes on the fix
(pruned == 0, tool content preserved verbatim).
2026-04-30 20:00:01 -07:00
0z! b194617d00 fix(context_compressor): off-by-one in tail protection for short conversations 2026-04-30 20:00:01 -07:00
hharry11 2997ef9446 fix(api-server): use session-scoped task IDs for tool isolation 2026-04-30 19:59:38 -07:00
johnncenae a83d579d5b fix(telegram): enforce gateway auth for inline approval callbacks 2026-04-30 19:59:31 -07:00
johnncenae 9ae1fa9e39 fix(delegate): honor runtime default model during provider resolution 2026-04-30 19:58:55 -07:00
Stephen Schoettler b29b709a71 fix(agent): sanitize Codex tool-call history summaries 2026-04-30 19:58:46 -07:00
Teknium f43b126677 fix(gateway): atomic writes for sibling recovery/dedup state files
Widen PR #17842's atomic-write fix to two sibling sites that exhibit the
same 'partial JSON on interrupted write' class of bug:

- gateway/platforms/feishu.py: dedup state (_dedup_state_path)
- gateway/platforms/helpers.py: ParticipatedThreadTracker save

Both are small recovery/coordination files that get rewritten frequently and
break cross-restart dedup if left partial.
2026-04-30 19:58:16 -07:00
johnncenae 1ef9e88549 fix(gateway): write restart markers atomically and fix Windows lock collisions 2026-04-30 19:58:16 -07:00
teknium1 447a2bba3a fix(plugins): bound async plugin command await with 30s timeout
Follow-up to #17963. The threaded branch of resolve_plugin_command_result
previously called Event.wait() with no timeout — a hung async plugin
handler would wedge the terminal indefinitely. Cap the wait at 30s and
raise TimeoutError instead. Added a regression test covering the hung
handler path.
2026-04-30 19:56:18 -07:00
hharry11 ca9a61ae38 fix(plugins): await async handlers in CLI and TUI dispatch 2026-04-30 19:56:18 -07:00
johnncenae 79cffa9232 auth: coerce tls insecure flag safely instead of using Python truthiness 2026-04-30 19:55:48 -07:00
johnncenae 2bf73fbe2c fix(cli): coerce tls insecure flag safely in auth state 2026-04-30 19:55:48 -07:00
Teknium 7cbe943d2d feat(skills): add here.now as an optional skill
Moves the here-now skill under optional-skills/productivity/here-now/ so
it's discoverable via the Skills Hub but not installed by default, and
tightens the SKILL.md description to a single line to match sibling
optional-skill descriptions.

Install with:
  hermes skills install official/productivity/here-now

Closes #378
2026-04-30 19:48:15 -07:00
adamludwin 21cc9c8d32 Update here.now skill bundle
Made-with: Cursor
2026-04-30 19:48:15 -07:00
adamludwin f7dfd4ae36 feat(skills): add built-in here.now skill
Add the here.now productivity skill with a bundled publish runtime so Hermes can publish files and folders to live URLs. Keep the skill thin and docs-first while fixing script path resolution and upload failure handling.

Made-with: Cursor
2026-04-30 19:48:15 -07:00
Yukipukii1 2110a3a0c4 fix(tui): return JSON-RPC errors for invalid request shapes 2026-04-30 19:47:00 -07:00
Yukipukii1 5f3f456784 fix(approval): wake blocked gateway approvals on session cleanup 2026-04-30 19:46:27 -07:00
Feranmi10 f4ba97ad9a fix(status): add NVIDIA_API_KEY to hermes status API keys display
Closes #16082

The `hermes status` command listed provider API keys under the
◆ API Keys section but NVIDIA_API_KEY was absent. Users configured
with NVIDIA NIM had no way to verify their key was set from status
output. Add it alongside the other inference provider keys.
2026-04-30 19:46:06 -07:00
Yukipukii1 75483b6db1 fix(curator): preserve last_report_path in state 2026-04-30 19:45:59 -07:00
Mind-Dragon aab5bcc6ac test(model_switch): cover private user_providers override 2026-04-30 19:44:26 -07:00
Mind-Dragon 5ad8281885 fix(model_switch): correct user_providers override for private models
The switch_model override logic incorrectly iterated over user_providers
as if it were a list of dicts, but it's actually a dict mapping
provider_slug -> config. This meant private models defined in a provider's
`models:` section (e.g. nahcrof-dedicated with discover_models: false)
were never accepted when the API /models list didn't include them.

Fix: iterate over user_providers.items(), match by slug, and handle both
dict and list forms of the models config.
2026-04-30 19:44:26 -07:00
Aamir Jawaid 1e5a23fa64 docs(teams): use teams app get --install-link for Step 6
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 67f1198ba9 docs(teams): fix CLI install tag and Step 6 install flow
- Keep @preview tag for teams CLI
- Step 3: note client secret won't be shown again
- Step 6: use the Install in Teams link from teams app create output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid d5e72ae17f docs(teams): fix CLI install tag and Step 6 install flow
- Keep @preview tag for teams CLI
- Step 3: note client secret won't be shown again
- Step 6: just open the Install in Teams link from teams app create output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid a5d60f42ee docs(teams): fix CLI install tag and Step 6 install flow
- Keep @preview tag for teams CLI
- Step 3: note client secret won't be shown again
- Step 6: use the install link printed by teams app create
  instead of a separate CLI command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 09aba91766 docs(teams): note that tunnel port 3978 is the default, not fixed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid f59693c075 fix(teams): pipe TEAMS_PORT through docker-compose properly
Was hardcoded to 3978; use ${TEAMS_PORT:-3978} so a custom port
set in .env is actually passed into the container.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid c997830e1e docs(teams): fix port references and add TEAMS_ALLOW_ALL_USERS
- Replace hardcoded 3978 with configurable TEAMS_PORT references
- Fix incorrect docker-compose port mapping claim (uses network_mode: host)
- Add missing TEAMS_ALLOW_ALL_USERS to config reference table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 4a6fac36d8 docs(teams): fix group chat behavior — @mention required
Group chats require @mention just like channels, not respond-to-all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 624057fce6 feat(teams): set User-Agent to Hermes via 2.0.0 client option
microsoft-teams-apps 2.0.0 added the `client` option to AppOptions,
accepting a ClientOptions instance. Use it to set the User-Agent
header to "Hermes" on all outgoing HTTP requests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
briandevans 97d6f25008 test(toolsets): include kanban in expected post-#17805 toolset assertions
The kanban PR (#17805, c86842546) added the `kanban` toolset and
`tools/kanban_tools.py`, but didn't update three pre-existing test
assertions that bake the full toolset/tool inventory:

* `tests/tools/test_registry.py::test_matches_previous_manual_builtin_tool_set`
  hard-codes the manual list of builtin tool modules. `tools.kanban_tools`
  was missing.
* `tests/test_tui_gateway_server.py::test_load_enabled_toolsets_rejects_disabled_mcp_env`
  and `test_load_enabled_toolsets_falls_back_when_tui_env_invalid` both
  expect `["memory"]` from `_load_enabled_toolsets()`. With kanban now
  auto-recovered by `_get_platform_tools` (its tools live in hermes-cli's
  universe but are not in CONFIGURABLE_TOOLSETS), the resolver returns
  `["kanban", "memory"]`.
* `tests/hermes_cli/test_tools_config.py::test_get_platform_tools_preserves_explicit_empty_selection`
  asserts `set()` for an explicit empty list. The recovery loop now also
  surfaces `kanban`. Reframed to assert the contract the test name
  describes — no CONFIGURABLE toolset gets re-enabled when the user
  explicitly saved an empty list — which stays correct as more
  non-configurable platform toolsets are added.

Verified the failures reproduce on clean origin/main (180a7036b) with
`.[all,dev]`-equivalent extras (fastapi, starlette, httpx, pytest-asyncio)
and that all four pass with this commit applied. CI on main itself is
currently red on these tests; this restores green for everyone's PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 19:43:03 -07:00
Chris Danis f61695ee73 fix(signal): skip contentless envelopes (profile key updates, empty messages)
Signal-cli sends dataMessage wrappers for profile key updates and other
metadata events that have no actual text content. These were reaching the
gateway as msg='' and triggering full agent turns for nothing.

Add early return in _handle_envelope() when both message field is empty/
missing/whitespace AND there are no attachments. Messages with media
attachments but no text still flow through.

- 12 lines added to gateway/platforms/signal.py
- 5 new tests in TestSignalContentlessEnvelope class
2026-04-30 19:42:59 -07:00
Teknium e2e6b6ff1a chore(models): move Vercel AI Gateway to bottom of provider picker (#18112)
It was sitting at position 4 of the `hermes model` list, ahead of Anthropic,
OpenAI, Xiaomi, and other first-class API providers. Move it to the end of
CANONICAL_PROVIDERS and drop the "(200+ models, $5 free credit, no markup)"
parenthetical so the entry just reads "Vercel AI Gateway".
2026-04-30 19:34:19 -07:00
Austin Pickett c73b799de7 feat(dashboard): add hide/show toggle for dashboard plugins in sidebar
- New config key: dashboard.hidden_plugins (list of plugin names)
- GET /api/dashboard/plugins now filters out hidden plugins from sidebar
- POST /api/dashboard/plugins/{name}/visibility toggles visibility
- Hub response includes user_hidden boolean per plugin row
- Eye/EyeOff toggle on plugin cards with dashboard manifests
- i18n: 'Show in sidebar' / 'Hide from sidebar' (en/zh)
2026-04-30 20:29:37 -04:00
Austin Pickett a52363231f refactor(plugins): move rescan button to page header, remove redundant title
Use usePageHeader().setEnd to place the rescan button in the shared
header bar. Remove the inline H2 title (already shown by the header)
and the wrapper div.
2026-04-30 20:29:37 -04:00
Austin Pickett 9550d0fd46 fix(plugins): show 'Plugins' in page header instead of 'Web UI'
Add /plugins route to resolve-page-title BUILTIN map.
2026-04-30 20:29:37 -04:00
Austin Pickett 7dc85495e0 style(plugins): make page full width 2026-04-30 20:29:37 -04:00
Austin Pickett 6549b0f2b7 fix(security): address CodeQL path-traversal and info-exposure findings
- Add _validate_plugin_name() guard on all {name} path param endpoints
  (rejects /, \, .. before reaching plugin logic)
- Strip after_install_path from install response (no internal paths to client)
- Update nix/tui.nix lockfile hash to match committed package-lock.json
2026-04-30 20:29:37 -04:00
Austin Pickett e2a4905606 feat(dashboard): add Plugins page with enable/disable, auth status, install/remove
- New PluginsPage.tsx: full plugin management UI (list, enable/disable,
  install from git, remove, git pull updates, provider picker)
- Backend: dashboard_set_agent_plugin_enabled now also toggles the
  plugin's toolset in platform_toolsets so enabling actually makes
  tools visible in agent sessions
- Backend: /api/dashboard/plugins/hub returns auth_required + auth_command
  per plugin (checks tool registry check_fn)
- Frontend: auth_required shown as Badge + CommandBlock with copy-able
  auth command
- Fix: Select overflow in providers card (min-w-0 grid cells, removed
  truncate/overflow-hidden that clipped dropdown)
- Refactor: _install_plugin_core extracted for non-interactive reuse,
  PluginOperationError for structured error handling
- i18n: en/zh/types updated with all new plugin page strings
2026-04-30 20:29:37 -04:00
Teknium e5dad4ac57 fix(agent): propagate ContextVars to concurrent tool worker threads (#18123)
Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages #16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes #16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
2026-04-30 16:26:26 -07:00
Teknium 180a7036bc feat(skills): add Shopify optional skill (Admin + Storefront GraphQL) (#18116)
Adds optional-skills/productivity/shopify — curl-based guide for the
Shopify Admin GraphQL API (products, orders, customers, inventory,
metafields, bulk operations, webhooks) and the Storefront GraphQL API.

- API version 2026-01 (current stable)
- Custom-app access tokens (shpat_...) with X-Shopify-Access-Token header
- Notes the 2026-01-01 deprecation of admin-created custom apps, points
  users at Dev Dashboard for new setups after that date
- Includes a reusable shop_gql() bash helper, cursor pagination,
  rate-limit cost inspection, GID conventions, userErrors check
- Safety section warns on destructive mutations (delete/refund/cancel)

Installs cleanly via: hermes skills install official/productivity/shopify
2026-04-30 15:58:44 -07:00
brooklyn! 8fed969618 Merge pull request #18113 from NousResearch/bb/tui-sgr-mouse-fragments
fix(tui): recover fragmented SGR mouse reports
2026-04-30 15:56:59 -07:00
Brooklyn Nicholson ded011c5a5 fix(tui): tighten SGR fragment matching 2026-04-30 17:50:49 -05:00
Brooklyn Nicholson 71b685aee0 fix(tui): recover fragmented SGR mouse reports 2026-04-30 17:43:21 -05:00
Teknium bbbce92651 feat(tui): render self-improvement review summaries in the transcript
The Ink TUI (\`hermes --tui\` + dashboard \`/chat\`) had no wiring for the
background self-improvement review. When the review fired and patched
a skill or saved a memory entry, the change landed but the user had
no visual indication it happened — only the CLI had a print surface
for the '💾 Self-improvement review: …' line.

Changes:

- tui_gateway/server.py: in _init_session, attach
  agent.background_review_callback to an _emit('review.summary',
  sid, {text}) closure. Wrapped in try/except so agents with locked
  attribute slots don't break session startup.
- ui-tui/src/app/createGatewayEventHandler.ts: handle 'review.summary'
  by routing ev.payload.text through sys(…), matching the existing
  'background.complete' pattern. Empty / whitespace payloads are
  ignored so the transcript never gets a blank system line.
- ui-tui/src/gatewayTypes.ts: extend the GatewayEvent discriminated
  union with { type: 'review.summary', payload?: { text?: string } }.

Gateway platforms (Telegram, Discord, Slack, …) already route the
review summary via background_review_callback → post-delivery queue
in gateway/run.py, so they pick up the new 'Self-improvement review:'
prefix from the companion run_agent change with no platform edits.

Tests:
- tests/tui_gateway/test_review_summary_callback.py (Python, 2 tests):
  _init_session attaches a callback that emits the right event; the
  callback path survives agents that can't accept the attribute.
- ui-tui/src/__tests__/createGatewayEventHandler.test.ts (vitest, 2
  new cases): review.summary events feed sys(...) with the full text;
  empty / missing payloads are no-ops.
- TypeScript type-check passes.
- tui_gateway suite: 64/64 pass.
2026-04-30 14:07:22 -07:00
Teknium 80a676658c fix(cli): surface self-improvement review summaries from bg thread
When the self-improvement background review fires after a turn, it runs
in a bg thread and emits a '  💾 <summary>' line to announce what it
saved to memory or skills. Two problems made this invisible to users
even when the review successfully modified a skill:

1. The print went through `_cprint` (prompt_toolkit's print_formatted_text)
   on a bg thread while the CLI's PromptSession was live. Direct
   print_formatted_text races with the input-area redraw and the line
   can land behind/above the prompt, scrolled off without the user
   seeing it.

2. The message said only '💾 Skill created.' / '💾 Memory updated'
   with no indication that the self-improvement loop was the one doing
   this. Users who did catch the line couldn't tell the background
   review from some other agent action.

Fixes:

- `_cprint` now detects when it's called from a non-app thread with a
  running prompt_toolkit Application, and routes through
  `run_in_terminal` via `loop.call_soon_threadsafe`. That pauses the
  input, prints the line above the prompt, and redraws — the normal
  prompt_toolkit contract for bg-thread output. Direct-print fallback
  preserved for the no-app / same-thread / import-error paths. Affects
  every bg-thread emission, not just the review summary (curator
  summaries and auxiliary failure prints benefit too).

- The summary now reads '  💾 Self-improvement review: <summary>' in
  both the CLI and the gateway `background_review_callback` path, so
  the origin is unambiguous.

Tests:
- New `tests/cli/test_cprint_bg_thread.py` covers all five routing
  branches (no app, app-not-running, cross-thread schedule, same-thread
  direct, app-loop-attribute-error, import-error).
- New case in `tests/run_agent/test_background_review.py` asserts the
  attributed prefix shows up in both `_safe_print` and
  `background_review_callback`.

Live E2E: exercised _cprint from a bg thread inside a real Application
event loop; confirmed get_app_or_none() sees the app, call_soon_threadsafe
schedules run_in_terminal, and the inner _pt_print runs.
2026-04-30 14:07:22 -07:00
Teknium c868425467 feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.

What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
  tasks, task_links, task_runs, task_comments, task_events,
  kanban_notify_subs tables. WAL mode, atomic claim via CAS,
  tenant-namespaced, skills JSON array per task, max-runtime timeouts,
  worker heartbeats, idempotency keys, circuit breaker on repeated
  spawn failures, crash detection via /proc/<pid>/status, run history
  preserved across attempts.
- Dispatcher — runs inside the gateway by default
  (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
  stale claims, promotes ready tasks, spawns `hermes -p <assignee>
  chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
  HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
  plus any per-task skills. Health telemetry warns on stuck ready
  queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
  (kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
  kanban_comment, kanban_create, kanban_link). Gated on
  HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
  sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
  injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
  UI: triage/todo/ready/running/blocked/done columns, drag-drop,
  inline create, task drawer with markdown, comments, run history,
  dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
  live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
  claim|comment|complete|block|unblock|archive|tail|dispatch|context|
  init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
  `/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
  kanban-orchestrator) — pattern library for good summary/metadata
  shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
  stored as JSON, threaded through to dispatcher argv as one
  `--skills X` pair per skill alongside the built-in kanban-worker.
  Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
  with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
  with 11 dashboard screenshots walking through four user stories
  (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
  dispatcher logic, circuit breaker, crash detection, max-runtime
  timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
  task skills round-trip + validation + dispatcher argv, tool surface
  (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
  + links + warnings), gateway-embedded dispatcher (config gate, env
  override, graceful shutdown), CLI deprecation stub, migration from
  legacy schemas.

Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
  task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
  via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
  in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
  env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
  `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
  Additive — no \_config_version bump needed.

Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
  NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
  worker_pid, last_heartbeat_at) so multi-attempt history is first-
  class from day one.

Closes #16102.

Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
ethernet 59c1a13f45 Merge pull request #15680 from NousResearch/fix/nix-package-lock
fix: let fixing nix pkgs command work without an initial build
2026-04-30 16:21:51 -04:00
Teknium 1d8068d71d feat(models): add openrouter/owl-alpha (free) to curated OpenRouter list (#18071) 2026-04-30 12:57:02 -07:00
Ari Lotter 9ac4a2e53e fix: let fixing nix pkgs command work without an initial build 2026-04-30 15:39:45 -04:00
Austin Pickett 6bc5d72271 Merge pull request #16419 from vincez-hms-coder/feat/dashboard-profiles-hms-coder
feat(dashboard): add profiles management page
2026-04-30 12:09:23 -07:00
ethernet b737af8226 Merge pull request #18047 from stephenschoettler/fix/acp-persist-user-message-test-mocks
test(acp): accept prompt persistence kwargs in MCP E2E mocks
2026-04-30 14:43:26 -04:00
Teknium 73bf3ab1b2 chore: release v0.12.0 (2026.4.30) (#18057)
The Curator release — Hermes Agent now maintains itself. Autonomous
background Curator grades, prunes, and consolidates the skill library;
self-improvement loop substantially upgraded; four new inference
providers; Microsoft Teams (via pluggable platforms) + Yuanbao as 18th
and 19th messaging platforms; Spotify + Google Meet native integrations;
ComfyUI + TouchDesigner-MCP bundled by default; Humanizer skill ported;
~57% cut to visible TUI cold start.

Stats since v0.11.0: 1,096 commits, 550 merged PRs, 1,270 files
changed, 217,776 insertions, 213 community contributors.
2026-04-30 11:31:01 -07:00
Teknium 76edc40ab0 fix(agent): extend thinking-mode reasoning_content pad to Kimi/Moonshot
Builds on #16855 (@lsdsjy) which fixed DeepSeek v4 reasoning_content
replay via model_extra fallback + capturing tool_calls at method entry.
Kimi / Moonshot thinking mode enforces the same echo-back contract and
hits the same 400 when a tool-call turn is persisted without
reasoning_content.

- _build_assistant_message: pad branch now uses _needs_thinking_reasoning_pad()
  (DeepSeek OR Kimi) instead of _needs_deepseek_tool_reasoning() alone.
- Extract _needs_thinking_reasoning_pad() and reuse it in
  _copy_reasoning_content_for_api so both sites share one predicate.
- tests/run_agent/test_deepseek_reasoning_content_echo.py: add
  TestBuildAssistantMessagePadsStrictProviders parametrized over DeepSeek
  (attr=None, attr-absent), Kimi (attr=None), Moonshot (via base_url),
  and an OpenRouter negative control that must NOT pad. Proven to fail
  2/5 cases on Kimi/Moonshot without this change.
- scripts/release.py: add AUTHOR_MAP entries for lsdsjy and season179.

Refs #17400.

Co-authored-by: season179 <season.saw@gmail.com>
2026-04-30 11:18:39 -07:00
lsdsjy b9b9ee3e6c fix(deepseek): preserve v4 reasoning_content on replay 2026-04-30 11:18:39 -07:00
ethernet 8fbc9d7d78 Merge pull request #18043 from NousResearch/feat/help-ui
feat(tui): add a mini help menu when u write ? in the input field
2026-04-30 14:02:28 -04:00
Stephen Schoettler 699a9c11a9 test(acp): accept prompt persistence kwargs in mocks 2026-04-30 10:47:23 -07:00
Teknium d60a9917d3 feat(curator): show most-used and least-used skills in hermes curator status (#18033)
Alongside the existing 'least recently used' section, surface two more
rankings so users can see which of their agent-created skills actually
get exercised:

- 'most used (top 5)' — sorted by use_count descending. Hidden when every
  skill has use_count=0 (noise suppression on fresh installs).
- 'least used (top 5)' — sorted by use_count ascending. Always shown
  when the catalog is non-empty.

use_count started tracking real agent skill activation in PR #17932
(bump_use wired into skill_view tool + slash invocation + --skill
preload), so these rankings are now meaningful.

Tests: 3 new in tests/hermes_cli/test_curator_status.py — happy path
with mixed use_counts, zero-use suppression of the most-used section,
and the no-skills clean-empty case.
2026-04-30 10:37:33 -07:00
ethernet 7c07422202 feat(tui): add a mini help menu when u write ? in the input field
it feels so nice :3 just a lil popup ! doesn't get in the way or take
any focus or anything, and directs users to /help for more info :3
2026-04-30 13:37:12 -04:00
VinceZ-Hms-Coder ca7f46beb5 Merge upstream/main and address Copilot review feedback
Merge resolved conflicts in web/src/{i18n/{en,zh,types}.ts,lib/api.ts}
by keeping both this branch's `profiles` additions and upstream's new
`models` page additions.

Copilot review feedback:
- Implement POST /api/profiles/{name}/open-terminal endpoint (already
  present); align Windows branch to `cmd.exe /c start "" <cmd>` so it
  matches the new test and spawns a fresh window instead of /k reusing
  the parent console.
- Move backslash escaping out of the macOS AppleScript f-string
  expression (Python <3.12 disallows backslashes inside f-string
  expression parts).
- Patch `_get_wrapper_dir` via monkeypatch in
  test_profiles_create_creates_wrapper_alias_when_safe so the test no
  longer writes to the real `~/.local/bin`.
- Extend test_dashboard_browser_safe_imports to scan `.ts` files in
  addition to `.tsx`.
- Switch upstream's new ModelsPage.tsx away from the `@nous-research/ui`
  root barrel onto per-component subpaths to satisfy the stricter scan.
- Fix NouiTypography `leading-1.4` -> `leading-[1.4]` so Tailwind
  actually emits the line-height for the `sm` variant.
- Guard ProfilesPage.openSoulEditor against out-of-order responses by
  tracking the latest requested profile via a ref.
- Replace ProfilesPage's hand-rolled setup command with a fetch to
  `/api/profiles/{name}/setup-command` so the copied command always
  matches what the backend would actually run (handles wrapper-alias
  collisions and reserved names correctly).
- Wire SOUL.md textarea label `htmlFor` -> textarea `id` so screen
  readers and clicking the label work as expected.
2026-04-30 06:43:22 -04:00
Austin Pickett cb0e2e2f36 Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-29 15:23:30 -04:00
vincez-hms-coder 4c0cc77e94 fix(dashboard): keep ui imports browser-safe after rebase 2026-04-29 01:47:13 -04:00
vincez-hms-coder 9b62c98170 chore(dashboard): restore package lock metadata 2026-04-29 01:43:21 -04:00
vincez-hms-coder 469e4df3c2 fix(profiles): preserve skills on dashboard profile creation 2026-04-29 01:42:51 -04:00
vincez-hms-coder ae11a31058 feat(profiles): add profile setup command endpoint and wrapper creation 2026-04-29 01:42:51 -04:00
vincez-hms-coder 3e200b64fb fix(profiles): update terminal command for copying based on profile name
Co-authored-by: Copilot <copilot@github.com>
2026-04-29 01:42:51 -04:00
vincez-hms-coder 1745cfc6d7 fix(dashboard): avoid node-only ui imports in browser 2026-04-29 01:42:50 -04:00
vincez-hms-coder 58c07867e3 fix(dashboard): keep profiles list resilient 2026-04-29 01:39:52 -04:00
vincez-hms-coder 4523965de9 feat(dashboard): add profiles management page
Copy profile dashboard changes onto a fresh branch under the vincez-hms-coder account.

Includes:
- Profiles dashboard route and sidebar entry
- Profile lifecycle REST endpoints
- SOUL.md read/write support
- i18n labels and helper text updates
- Targeted profile API tests

Test plan:
- pytest tests/hermes_cli/test_web_server.py -k profile -q
- cd web && npm run build
2026-04-29 01:39:51 -04:00
315 changed files with 46880 additions and 2271 deletions
+6
View File
@@ -9,6 +9,12 @@ node_modules
.venv
**/.venv
# Built artifacts that are regenerated inside the image. Excluded so local
# rebuilds on the developer's machine don't invalidate the npm-install layer
# that now depends on the full ui-tui/packages/hermes-ink/ tree being present.
ui-tui/dist/
ui-tui/packages/hermes-ink/dist/
# CI/CD
.github
+10
View File
@@ -76,6 +76,16 @@ jobs:
run: |
mkdir -p _site/docs
cp -r website/build/* _site/docs/
# llms.txt / llms-full.txt are also published at the site root
# (https://hermes-agent.nousresearch.com/llms.txt) because some
# agents and IDE plugins probe the classic root-level path rather
# than /docs/llms.txt. Same file, two URLs, one source of truth.
if [ -f website/build/llms.txt ]; then
cp website/build/llms.txt _site/llms.txt
fi
if [ -f website/build/llms-full.txt ]; then
cp website/build/llms-full.txt _site/llms-full.txt
fi
- name: Upload artifact
uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa # v3
+18 -8
View File
@@ -28,10 +28,26 @@ WORKDIR /opt/hermes
# ---------- Layer-cached dependency install ----------
# Copy only package manifests first so npm install + Playwright are cached
# unless the lockfiles themselves change.
#
# ui-tui/packages/hermes-ink/ is copied IN FULL (not just its manifests)
# because it is referenced as a `file:` workspace dependency from
# ui-tui/package.json. Copying the tree up front lets npm resolve the
# workspace to real content instead of stopping at a bare package.json.
COPY package.json package-lock.json ./
COPY web/package.json web/package-lock.json web/
COPY ui-tui/package.json ui-tui/package-lock.json ui-tui/
COPY ui-tui/packages/hermes-ink/package.json ui-tui/packages/hermes-ink/package-lock.json ui-tui/packages/hermes-ink/
COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
# `npm_config_install_links=false` forces npm to install `file:` deps as
# symlinks (the npm 10+ default) even on Debian's older bundled npm 9.x,
# which defaults to `install-links=true` and installs file deps as *copies*.
# The host-side package-lock.json is generated with a newer npm that uses
# symlinks, so an install-as-copy produces a hidden node_modules/.package-lock.json
# that permanently disagrees with the root lock on the @hermes/ink entry.
# That disagreement trips the TUI launcher's `_tui_need_npm_install()`
# check on every startup and triggers a runtime `npm install` that then
# fails with EACCES (node_modules/ is root-owned from build time).
ENV npm_config_install_links=false
RUN npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
@@ -45,13 +61,7 @@ COPY --chown=hermes:hermes . .
# Build browser dashboard and terminal UI assets.
RUN cd web && npm run build && \
cd ../ui-tui && npm run build && \
rm -rf node_modules/@hermes/ink && \
rm -rf packages/hermes-ink/node_modules && \
cp -R packages/hermes-ink node_modules/@hermes/ink && \
npm install --omit=dev --prefer-offline --no-audit --prefix node_modules/@hermes/ink && \
rm -rf node_modules/@hermes/ink/node_modules/react && \
node --input-type=module -e "await import('@hermes/ink')"
cd ../ui-tui && npm run build
# ---------- Permissions ----------
# Make install dir world-readable so any HERMES_UID can read it at runtime.
+505
View File
@@ -0,0 +1,505 @@
# Hermes Agent v0.12.0 (v2026.4.30)
**Release Date:** April 30, 2026
**Since v0.11.0:** 1,096 commits · 550 merged PRs · 1,270 files changed · 217,776 insertions · 213 community contributors (including co-authors)
> The Curator release — Hermes Agent now maintains itself. An autonomous background Curator grades, prunes, and consolidates your skill library on its own schedule. The self-improvement loop that reviews what to save got a substantial upgrade. Four new inference providers, a 18th messaging platform, a 19th via Teams plugin, native Spotify + Google Meet integrations, ComfyUI and TouchDesigner-MCP moved from optional to bundled-by-default, and a ~57% cut to visible TUI cold start.
---
## ✨ Highlights
- **Autonomous Curator** — `hermes curator` runs as a background agent on the gateway's cron ticker (7-day cycle default). It grades your skill library, consolidates related skills, prunes dead ones, and writes per-run reports to `logs/curator/run.json` + `REPORT.md`. Archived skills are classified consolidated-vs-pruned via model + heuristic. Defense-in-depth gates protect bundled/hub skills from mutation. Unified under `auxiliary.curator` — pick the curator's model in `hermes model`, manage it from the dashboard. `hermes curator status` ranks skills by usage (most-used / least-used). ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277), [#17307](https://github.com/NousResearch/hermes-agent/pull/17307), [#17941](https://github.com/NousResearch/hermes-agent/pull/17941), [#17868](https://github.com/NousResearch/hermes-agent/pull/17868), [#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Self-improvement loop — substantially upgraded** — The background review fork (the core of Hermes' self-improvement: after each turn it decides what memories/skills to save or update) is now class-first (rubric-based rather than free-form), active-update biased (prefers the skill the agent just loaded), handles `references/`/`templates/` sub-files, and properly inherits the parent's live runtime (provider, model, credentials actually propagate). Restricted to memory + skills toolsets so it can't sprawl. Memory providers shut down cleanly. Prior-turn tool messages excluded from the summary so the fork sees a clean context. ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026), [#17213](https://github.com/NousResearch/hermes-agent/pull/17213), [#16099](https://github.com/NousResearch/hermes-agent/pull/16099), [#16569](https://github.com/NousResearch/hermes-agent/pull/16569), [#16204](https://github.com/NousResearch/hermes-agent/pull/16204), [#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
- **Skill integrations — major expansion** — **ComfyUI v5** with official CLI + REST + hardware-gated local install, moved from optional to **built-in by default** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734)). **TouchDesigner-MCP** bundled by default, expanded with GLSL, post-FX, audio, geometry, and 9 new reference docs ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753), [#16624](https://github.com/NousResearch/hermes-agent/pull/16624), [#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @kshitijk4poor + @SHL0MS). **Humanizer** skill ports a text-cleaner that strips AI-isms ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787)). **claude-design** HTML artifact skill + design-md (Google DESIGN.md spec) + airtable salvage + `skill_manage` edits in `external_dirs` + direct-URL skill install + `/reload-skills` slash command. ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358), [#14876](https://github.com/NousResearch/hermes-agent/pull/14876), [#16291](https://github.com/NousResearch/hermes-agent/pull/16291), [#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#16323](https://github.com/NousResearch/hermes-agent/pull/16323), [#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **LM Studio — first-class provider** — upgraded from a custom-endpoint alias to a full-blown native provider: dedicated auth, `hermes doctor` checks, reasoning transport, live `/models` listing. (Salvage of @kshitijk4poor's #17061.) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **Four more new inference providers** — **GMI Cloud** (first-class, salvage of #11955@isaachuangGMICLOUD), **Azure AI Foundry** with auto-detection, **MiniMax OAuth** with PKCE browser flow (salvage #15203), **Tencent Tokenhub** (salvage of #16860). ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663), [#15845](https://github.com/NousResearch/hermes-agent/pull/15845), [#17524](https://github.com/NousResearch/hermes-agent/pull/17524), [#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
- **Pluggable gateway platforms + Microsoft Teams** — the gateway is now a plugin host. Drop-in messaging adapters live outside the core, and Microsoft Teams is the first plugin-shipped platform. (Salvage of #17664.) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751), [#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Tencent 元宝 (Yuanbao) — 18th messaging platform** — native gateway adapter with text + media delivery. ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424))
- **Spotify — native tools + bundled skill + wizard** — 7 tools (play, search, queue, playlists, devices) behind PKCE OAuth, interactive setup wizard, bundled skill, surfacing in `hermes tools`, cron usage documented. ([#15121](https://github.com/NousResearch/hermes-agent/pull/15121), [#15130](https://github.com/NousResearch/hermes-agent/pull/15130), [#15154](https://github.com/NousResearch/hermes-agent/pull/15154), [#15180](https://github.com/NousResearch/hermes-agent/pull/15180))
- **Google Meet plugin** — join calls, transcribe, speak, follow up. Realtime OpenAI transport + Node bot server, full pipeline bundled as a plugin. ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364))
- **`hermes -z` one-shot mode + `hermes update --check`** — non-interactive `hermes -z <prompt>` with `--model`/`--provider`/`HERMES_INFERENCE_MODEL`. `hermes update --check` preflight. Opt-in pre-update HERMES_HOME backup. ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702), [#15704](https://github.com/NousResearch/hermes-agent/pull/15704), [#15841](https://github.com/NousResearch/hermes-agent/pull/15841), [#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Models dashboard tab + in-browser model config** — rich per-model analytics, switch main + auxiliary models from the dashboard. ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745), [#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Remote model catalog manifest** — OpenRouter + Nous Portal model catalogs are now pulled from a remote manifest so new models show up without a release. ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- **Native multimodal image routing** — images now route based on the model's actual vision capability rather than provider defaults. ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Gateway media parity** — native multi-image sending across Telegram, Discord, Slack, Mattermost, Email, and Signal; centralized audio routing with FLAC support + Telegram document fallback. ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909), [#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **TUI catches up to (and past) the classic CLI** — LaTeX rendering (@austinpickett), `/reload` .env hot-reload, pluggable busy-indicator styles (@OutThisLife, #13610), opt-in auto-resume of last session, expanded light-terminal auto-detection, session delete from `/resume` picker with `d`, modified mouse-wheel line scroll, and a `/mouse` toggle that kills ConPTY's phantom mouse injection (@kevin-ho). ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175), [#17286](https://github.com/NousResearch/hermes-agent/pull/17286), [#17150](https://github.com/NousResearch/hermes-agent/pull/17150), [#17130](https://github.com/NousResearch/hermes-agent/pull/17130), [#17113](https://github.com/NousResearch/hermes-agent/pull/17113), [#17668](https://github.com/NousResearch/hermes-agent/pull/17668), [#17669](https://github.com/NousResearch/hermes-agent/pull/17669), [#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- **Observability + achievements plugins** — bundled Langfuse observability plugin (salvage #16845) + bundled hermes-achievements plugin that scans full session history. ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917), [#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **TTS provider registry + Piper local TTS** — pluggable `tts.providers.<name>` registry; Piper ships as a native local TTS provider. (Closes #8508.) ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843), [#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Vercel Sandbox backend** — Vercel sandboxes as an execute_code/terminal backend (@kshitijk4poor). ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Secret redaction off by default** — default flipped to off. Prevents the long-standing patch-corruption incidents where fake secret-shaped substrings mangled tool outputs. Opt in via `redaction.enabled: true` when you need it. ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **Cold-start performance** — visible TUI cold start cut **~57%** via lazy agent init (@OutThisLife), lazy imports of OpenAI / Anthropic / Firecrawl / account_usage, mtime-cached `load_config()`, memoized `get_tool_definitions()` with TTL-cached `check_fn` results, precompiled dangerous-command patterns. ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190), [#17046](https://github.com/NousResearch/hermes-agent/pull/17046), [#17041](https://github.com/NousResearch/hermes-agent/pull/17041), [#17098](https://github.com/NousResearch/hermes-agent/pull/17098), [#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Configurable prompt cache TTL** — `prompt_caching.cache_ttl` (5m default, 1h opt-in — cost savings for bursty sessions that keep cache warm). Salvage of #12659. ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
---
## 🧠 Autonomous Curator & Self-Improvement Loop
### Curator — autonomous skill maintenance
- **`hermes curator` as a background agent** — runs on the gateway's cron ticker, 7-day cycle by default, umbrella-first prompt, inherits parent config, unbounded iterations ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277) — issue #7816)
- **Per-run reports** — `logs/curator/run.json` + `REPORT.md` per cycle ([#17307](https://github.com/NousResearch/hermes-agent/pull/17307))
- **Consolidated vs pruned classification** — archived skills split with model + heuristic ([#17941](https://github.com/NousResearch/hermes-agent/pull/17941))
- **`hermes curator status`** — ranks skills by usage, shows most-used and least-used ([#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Unified under `auxiliary.curator`** — pick the model in `hermes model`, configure from the dashboard ([#17868](https://github.com/NousResearch/hermes-agent/pull/17868))
- **Documentation** — dedicated curator feature page on the docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- Fix: seed defaults on update, create `logs/curator/` directory, defer fire import ([#17927](https://github.com/NousResearch/hermes-agent/pull/17927))
- Fix: scan nested archive subdirs in `restore_skill` (@0xDevNinja) ([#17951](https://github.com/NousResearch/hermes-agent/pull/17951))
- Fix: use actual skill activity in curator status (@y0shua1ee) ([#17953](https://github.com/NousResearch/hermes-agent/pull/17953))
- Fix: `skill_manage` refuses writes on pinned skills; pinning now blocks curator writes ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562), [#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- Fix: `bump_use()` wired into skill invocation + preload + skill_view (salvage #17782) ([#17932](https://github.com/NousResearch/hermes-agent/pull/17932))
### Self-improvement loop (background review fork)
- **Class-first skill-review prompt** — rubric-based grading rather than free-form "should this update" ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026))
- **Active-update bias** — prefers updating skills the agent just loaded, handles `references/` + `templates/` sub-files ([#17213](https://github.com/NousResearch/hermes-agent/pull/17213))
- **Fork inherits parent's live runtime** — provider, model, credentials actually propagate now ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Scoped toolsets** — review fork restricted to memory + skills (no shell, no web) ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Clean shutdown** — background review memory providers exit properly (salvage #15289) ([#16204](https://github.com/NousResearch/hermes-agent/pull/16204))
- **Clean context** — prior-history tool messages excluded from review summary (salvage #14967) ([#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
---
## 🧩 Skills Ecosystem
### Skill integrations — newly bundled or promoted
- **ComfyUI v5** — official CLI + REST + hardware-gated local install; **moved from optional to built-in** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734), [#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **TouchDesigner-MCP** — **bundled by default** ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753) — @kshitijk4poor), expanded with GLSL, post-FX, audio, geometry references ([#16624](https://github.com/NousResearch/hermes-agent/pull/16624)), 9 new reference docs ([#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @SHL0MS)
- **Humanizer** — strips AI-isms from text ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787))
- **claude-design** — HTML artifact skill with disambiguation from other design skills ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358))
- **design-md** — Google's DESIGN.md spec skill ([#14876](https://github.com/NousResearch/hermes-agent/pull/14876))
- **airtable** — salvaged skill + skill API keys wired into `.env` (#15838) ([#16291](https://github.com/NousResearch/hermes-agent/pull/16291))
- **pretext** — creative browser demos with @chenglou/pretext ([#17259](https://github.com/NousResearch/hermes-agent/pull/17259))
- **spike** + **sketch** — throwaway experiments + HTML mockups, adapted from gsd-build ([#17421](https://github.com/NousResearch/hermes-agent/pull/17421))
### Skills UX
- **Install skills from a direct HTTP(S) URL** — `hermes skills install <url>` ([#16323](https://github.com/NousResearch/hermes-agent/pull/16323))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **`hermes skills list`** shows enabled/disabled status ([#16129](https://github.com/NousResearch/hermes-agent/pull/16129))
- **`skill_manage` refuses writes on pinned skills** ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562))
- **`skill_manage` edits external_dirs skills in place** (salvage #9966) ([#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#17289](https://github.com/NousResearch/hermes-agent/pull/17289))
- Fix: inline-shell rendering in `skill_view` ([#15376](https://github.com/NousResearch/hermes-agent/pull/15376))
- Fix: exclude `.archive/` from skill index walk (salvage #17639) ([#17931](https://github.com/NousResearch/hermes-agent/pull/17931))
- Fix: dedicated docs page per bundled + optional skill ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929))
- Fix: `google-workspace` shared HERMES_HOME helper + ship deps as optional extra ([#15405](https://github.com/NousResearch/hermes-agent/pull/15405))
- Fix: auto-wrap ASCII-art code blocks in generated skill pages ([#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
- Point agent at `hermes-agent` skill + docs site for Hermes questions ([#16535](https://github.com/NousResearch/hermes-agent/pull/16535))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
#### New providers
- **GMI Cloud** — first-class API-key provider on par with Arcee/Kilocode/Xiaomi (salvage of #11955@isaachuangGMICLOUD) ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663))
- **Azure AI Foundry** — auto-detection, full wiring ([#15845](https://github.com/NousResearch/hermes-agent/pull/15845))
- **LM Studio** — upgraded from custom-endpoint alias to first-class provider: dedicated auth, doctor checks, reasoning transport, live `/models` (salvage of #17061@kshitijk4poor) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **MiniMax OAuth** — PKCE browser flow with full OAuth integration (salvage #15203) ([#17524](https://github.com/NousResearch/hermes-agent/pull/17524))
- **Tencent Tokenhub** — new provider (salvage of #16860) ([#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
#### Model catalog
- **Remote model catalog manifest** — OpenRouter + Nous Portal catalogs pulled from remote manifest so new models show up without a release ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- `openai/gpt-5.5` and `gpt-5.5-pro` added to OpenRouter + Nous Portal ([#15343](https://github.com/NousResearch/hermes-agent/pull/15343))
- `deepseek-v4-pro` and `deepseek-v4-flash` added ([#14934](https://github.com/NousResearch/hermes-agent/pull/14934))
- `qwen3.6-plus` added to Alibaba-supported models ([#16896](https://github.com/NousResearch/hermes-agent/pull/16896))
- Gemini free-tier keys blocked at setup with 429 guidance surfacing ([#15100](https://github.com/NousResearch/hermes-agent/pull/15100))
#### Model configuration
- **Configurable `prompt_caching.cache_ttl`** — 5m default, 1h opt-in (salvage #12659) ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
- `/fast` whitelist broadened to all OpenAI + Anthropic models ([#16883](https://github.com/NousResearch/hermes-agent/pull/16883))
- `auxiliary.extra_body.reasoning` translates into Codex Responses API ([#17004](https://github.com/NousResearch/hermes-agent/pull/17004))
- `hermes fallback` command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
### Agent Loop & Conversation
- **Native multimodal image routing** — based on model vision capability, not provider defaults ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Delegate `child_timeout_seconds` default bumped to 600s** ([#14809](https://github.com/NousResearch/hermes-agent/pull/14809))
- **Diagnostic dump when subagent times out with 0 API calls** ([#15105](https://github.com/NousResearch/hermes-agent/pull/15105))
- **Gateway busts cached agent on compression/context_length config edits** ([#17008](https://github.com/NousResearch/hermes-agent/pull/17008))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- `/reload-mcp` awareness — rebuild cached agents + prompt-cache cost confirmation ([#17729](https://github.com/NousResearch/hermes-agent/pull/17729))
- Fix: repair CamelCase + `_tool` suffix tool-call emissions ([#15124](https://github.com/NousResearch/hermes-agent/pull/15124))
- Fix: retry on `json.JSONDecodeError` instead of treating as local validation error ([#15107](https://github.com/NousResearch/hermes-agent/pull/15107))
- Fix: handle unescaped control chars in `tool_call.arguments` ([#15356](https://github.com/NousResearch/hermes-agent/pull/15356))
- Fix: ordering fix in `_copy_reasoning_content_for_api` — cross-provider reasoning isolation (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749))
- Fix: inject empty `reasoning_content` for DeepSeek/Kimi `tool_calls` unconditionally (@Zjianru) ([#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- Fix: persist streamed `reasoning_content` on assistant turns (#16844) ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- Fix: cancel coroutine on timeout so worker thread exits; full traceback on tool failure ([#17428](https://github.com/NousResearch/hermes-agent/pull/17428))
- Fix: isolate `get_tool_definitions` quiet_mode cache + dedup LCM injection (#17335) ([#17889](https://github.com/NousResearch/hermes-agent/pull/17889))
- Fix: serialize concurrent `hermes_tools` RPC calls from `execute_code` (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- Fix: rename `[SYSTEM:``[IMPORTANT:` in all user-injected markers (dodges Azure content filter) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
### Compression
- **Retry summary on main model for unknown errors before giving up** ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774))
- **Notify users when configured aux model fails even if main-model fallback recovers** ([#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- `/compress` wrapped in `_busy_command` to block input during compression ([#15388](https://github.com/NousResearch/hermes-agent/pull/15388))
- Fix: reserve system + tools headroom when aux binds threshold ([#15631](https://github.com/NousResearch/hermes-agent/pull/15631))
- Fix: use text-char sum for multimodal token estimation in `_find_tail_cut_by_tokens` ([#16369](https://github.com/NousResearch/hermes-agent/pull/16369))
### Session, Memory & State
- **Trigram FTS5 index for CJK search, replace LIKE fallback** (@alt-glitch) ([#16651](https://github.com/NousResearch/hermes-agent/pull/16651))
- **Index `tool_name` + `tool_calls` in FTS5, with repair + migration** (salvages #16866) ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Checkpoints: auto-prune orphan and stale shadow repos at startup** ([#16303](https://github.com/NousResearch/hermes-agent/pull/16303))
- **Memory providers notified on mid-process session_id rotation** (#6672) ([#17409](https://github.com/NousResearch/hermes-agent/pull/17409))
- Fix: quote underscored terms in FTS5 query sanitization ([#16915](https://github.com/NousResearch/hermes-agent/pull/16915))
- Fix: resolve viking_read 500/412 on file URIs + pseudo-summary URIs (salvage #5886) ([#17869](https://github.com/NousResearch/hermes-agent/pull/17869))
- Fix: skip external-provider sync on interrupted turns ([#15395](https://github.com/NousResearch/hermes-agent/pull/15395))
- Fix: close embedded Hindsight async client cleanly (salvage #14605) ([#16209](https://github.com/NousResearch/hermes-agent/pull/16209))
- Fix: pass session transcript to `shutdown_memory_provider` on gateway + CLI (#15165) ([#16571](https://github.com/NousResearch/hermes-agent/pull/16571))
- Fix: write-origin metadata seam ([#15346](https://github.com/NousResearch/hermes-agent/pull/15346))
- Fix: preserve symlinks during atomic file writes ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- Refactor: remove `flush_memories` entirely ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
### Auxiliary models
- Fix: surface auxiliary failures in UI (previously silent) ([#15324](https://github.com/NousResearch/hermes-agent/pull/15324))
- Fix: surface title-gen auxiliary failures instead of silently dropping ([#16371](https://github.com/NousResearch/hermes-agent/pull/16371))
- Fix: generalize unsupported-parameter detector and harden `max_tokens` retry ([#15633](https://github.com/NousResearch/hermes-agent/pull/15633))
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **Microsoft Teams (19th platform)** — as a plugin, + xdist collision guard ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Yuanbao (Tencent 元宝, 18th platform)** — native adapter with text + media delivery ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424), [#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
### Pluggable Gateway Platforms
- **Drop-in messaging adapters** — the gateway is now a plugin host for platforms (salvage of #17664) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
### Telegram
- **Chat allowlists for groups and forums** (@web3blind) ([#15027](https://github.com/NousResearch/hermes-agent/pull/15027))
- **Send fresh finals for stale preview streams** (port openclaw#72038) ([#16261](https://github.com/NousResearch/hermes-agent/pull/16261))
- **Render markdown tables as row-group bullets + prompt hint** ([#16997](https://github.com/NousResearch/hermes-agent/pull/16997))
- Document fallback in centralized audio routing ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Discord
- **Opt-in toolsets + ID injection + tool split + Feishu wiring** (salvage #15457, #15458) ([#15610](https://github.com/NousResearch/hermes-agent/pull/15610), [#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
- Fix: coerce `limit` parameter to int before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
### Slack
- **Register every gateway command as a native slash (Discord/Telegram parity)** ([#16164](https://github.com/NousResearch/hermes-agent/pull/16164))
- **`strict_mention` config** — prevents thread auto-engagement ([#16193](https://github.com/NousResearch/hermes-agent/pull/16193))
- **`channel_skill_bindings`** — bind specific skills to specific Slack channels ([#16283](https://github.com/NousResearch/hermes-agent/pull/16283))
### Signal
- **Native formatting** — markdown → bodyRanges, reply quotes, reactions ([#17417](https://github.com/NousResearch/hermes-agent/pull/17417))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Feishu / Mattermost / Email / Signal
- All participate in **native multi-image sending** ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Gateway Core
- **Centralized audio routing + FLAC support + Telegram doc fallback** ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **Native multi-image sending** across Telegram, Discord, Slack, Mattermost, Email, Signal ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
- **Make hygiene hard message limit configurable** ([#17000](https://github.com/NousResearch/hermes-agent/pull/17000))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- **`pre_gateway_dispatch` hook** — plugins can intercept before dispatch ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` / `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- Fix: timeouts — guard `load_config()` call against runtime exceptions ([#16318](https://github.com/NousResearch/hermes-agent/pull/16318))
- Fix: support passing handler tools via registry ([#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
---
## 🔧 Tool System
### Plugin-first architecture
- **Pluggable gateway platforms** — platforms can ship as plugins ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
- **Microsoft Teams as first plugin-shipped platform** ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **`pre_gateway_dispatch` hook** ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` + `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- **`duration_ms` on `post_tool_call`** (inspired by Claude Code 2.1.119) ([#15429](https://github.com/NousResearch/hermes-agent/pull/15429))
- **Bundled plugins**: Spotify ([#15174](https://github.com/NousResearch/hermes-agent/pull/15174)), Google Meet ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364)), Langfuse observability ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917)), hermes-achievements ([#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **Page-scoped plugin slots for built-in dashboard pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
### Browser
- **CDP supervisor** — dialog detection + response + cross-origin iframe eval ([#14540](https://github.com/NousResearch/hermes-agent/pull/14540))
- **Auto-spawn local Chromium for LAN/localhost URLs** when cloud provider is configured ([#16136](https://github.com/NousResearch/hermes-agent/pull/16136))
### Execute code / Terminal
- **Vercel Sandbox backend** for `execute_code` / terminal (@kshitijk4poor) ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Collapse subagent `task_id`s to shared container** ([#16177](https://github.com/NousResearch/hermes-agent/pull/16177))
- **Docker: run container as host user** to avoid root-owned bind mounts (@benbarclay) ([#17305](https://github.com/NousResearch/hermes-agent/pull/17305))
- Fix: safely quote `~/` subpaths in wrapped `cd` commands ([#15394](https://github.com/NousResearch/hermes-agent/pull/15394))
- Fix: close file descriptor in `LocalEnvironment._update_cwd` ([#17300](https://github.com/NousResearch/hermes-agent/pull/17300))
- Fix: SSH — prevent tar from overwriting remote home dir permissions ([#17898](https://github.com/NousResearch/hermes-agent/pull/17898), [#17867](https://github.com/NousResearch/hermes-agent/pull/17867))
### Image generation
- See Provider section for updates; no new image providers this window.
### TTS / Voice
- **Pluggable TTS provider registry** under `tts.providers.<name>` ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843))
- **Piper** as native local TTS provider (closes #8508) ([#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Voice mode CLI parity in the TUI** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
- Fix: vision — use HERMES_HOME-based cache dir instead of cwd ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
### Cron
- **Honor `hermes tools` config for the cron platform** ([#14798](https://github.com/NousResearch/hermes-agent/pull/14798))
- **Per-job `workdir`** — project-aware cron runs ([#15110](https://github.com/NousResearch/hermes-agent/pull/15110))
- **`context_from` field** — chain cron job outputs ([#15606](https://github.com/NousResearch/hermes-agent/pull/15606))
- Fix: promote `croniter` to a core dependency ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
### Web search
- **Expose `limit` for `web_search`** ([#16934](https://github.com/NousResearch/hermes-agent/pull/16934))
### Maps
- Fix: include seconds in timezone UTC offset output ([#16300](https://github.com/NousResearch/hermes-agent/pull/16300))
### Approvals
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
- Perf: precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
### ACP
- **Advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
### API Server
- **POST `/v1/runs/{run_id}/stop`** (salvage of #15656) ([#15842](https://github.com/NousResearch/hermes-agent/pull/15842))
- **Expose run status for external UIs** (#17085) ([#17458](https://github.com/NousResearch/hermes-agent/pull/17458))
### Nix
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
- Fix: use `--rebuild` in fix-lockfiles to bypass cached FOD store paths ([#15444](https://github.com/NousResearch/hermes-agent/pull/15444))
- Fix: `extraPackages` now actually works via per-user profile ([#17047](https://github.com/NousResearch/hermes-agent/pull/17047))
- Fix: refresh web/ npm-deps hash to unblock main builds ([#17174](https://github.com/NousResearch/hermes-agent/pull/17174))
- Fix: replace magic-nix-cache with Cachix ([#17928](https://github.com/NousResearch/hermes-agent/pull/17928))
---
## 🖥️ TUI
### New features
- **LaTeX rendering** (@austinpickett) ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175))
- **`/reload` .env hot-reload** — ported from the classic CLI ([#17286](https://github.com/NousResearch/hermes-agent/pull/17286))
- **Pluggable busy-indicator styles** (@OutThisLife, #13610) ([#17150](https://github.com/NousResearch/hermes-agent/pull/17150))
- **Opt-in auto-resume of the most recent session** (@OutThisLife) ([#17130](https://github.com/NousResearch/hermes-agent/pull/17130))
- **Expanded light-terminal auto-detection** — `HERMES_TUI_THEME` + background hex (@OutThisLife) ([#17113](https://github.com/NousResearch/hermes-agent/pull/17113))
- **Delete sessions from `/resume` picker with `d`** (@OutThisLife) ([#17668](https://github.com/NousResearch/hermes-agent/pull/17668))
- **Line-by-line scroll on modified mouse wheel** (@OutThisLife) ([#17669](https://github.com/NousResearch/hermes-agent/pull/17669))
- **Delete queued message while editing with ctrl-x / cancel with esc** (@OutThisLife) ([#16707](https://github.com/NousResearch/hermes-agent/pull/16707))
- **Per-section visibility for the details accordion** (@OutThisLife) ([#14968](https://github.com/NousResearch/hermes-agent/pull/14968))
- **Voice mode CLI parity** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
- **Contextual first-touch hints ported to TUI** — `/busy`, `/verbose` ([#16054](https://github.com/NousResearch/hermes-agent/pull/16054))
- **Mini help menu on `?` in the input field** (@ethernet8023) ([#18043](https://github.com/NousResearch/hermes-agent/pull/18043))
### Fixes
- Fix: proactive mouse disable on ConPTY + `/mouse` toggle command (@kevin-ho, WSL2 ghost-mouse fix) ([#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- Fix: restore skills search RPC ([#15870](https://github.com/NousResearch/hermes-agent/pull/15870))
- Perf: cache text measurements across yoga flex re-passes ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
- Perf: stabilize long-session scrolling ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
- Perf: lazily seed virtual history heights ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
- Perf: cut visible cold start ~57% with lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
---
## 🖱️ CLI & User Experience
### New commands
- **`hermes -z <prompt>`** — non-interactive one-shot mode ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702))
- **`hermes -z` with `--model` / `--provider` / `HERMES_INFERENCE_MODEL`** ([#15704](https://github.com/NousResearch/hermes-agent/pull/15704))
- **`hermes update --check`** preflight flag ([#15841](https://github.com/NousResearch/hermes-agent/pull/15841))
- **`hermes fallback`** command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
- **`/busy`** slash command for busy input mode ([#15382](https://github.com/NousResearch/hermes-agent/pull/15382))
- **`/busy` input mode 'steer'** as a third option ([#16279](https://github.com/NousResearch/hermes-agent/pull/16279))
- **`/btw` as alias for `/background`** ([#16053](https://github.com/NousResearch/hermes-agent/pull/16053))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **Surface `/queue`, `/bg`, `/steer` in agent-running placeholder** ([#16118](https://github.com/NousResearch/hermes-agent/pull/16118))
### Setup / onboarding
- **Auto-reconfigure on existing installs** ([#15879](https://github.com/NousResearch/hermes-agent/pull/15879))
- **Contextual first-touch hints for `/busy` and `/verbose`** ([#16046](https://github.com/NousResearch/hermes-agent/pull/16046))
- **Cost-saving tips from the April 30 tip-of-the-day** ([#17841](https://github.com/NousResearch/hermes-agent/pull/17841))
- **Hyperlink startup banner title to the latest GitHub Release** ([#14945](https://github.com/NousResearch/hermes-agent/pull/14945))
### Update / backup
- **Snapshot pairing data before `git pull`** ([#16383](https://github.com/NousResearch/hermes-agent/pull/16383))
- **Auto-backup HERMES_HOME before `hermes update`** (opt-in, off by default) ([#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Exclude `checkpoints/` from backups** ([#16572](https://github.com/NousResearch/hermes-agent/pull/16572))
- **Exclude SQLite WAL/SHM/journal sidecars from backups** ([#16576](https://github.com/NousResearch/hermes-agent/pull/16576))
- **Installer FHS layout for root installs on Linux** ([#15608](https://github.com/NousResearch/hermes-agent/pull/15608))
- Fix: kill stale dashboards instead of warning ([#17832](https://github.com/NousResearch/hermes-agent/pull/17832))
- Fix: show correct update status on nix-built hermes ([#17550](https://github.com/NousResearch/hermes-agent/pull/17550))
### Slash-command housekeeping
- Refactor: drop `/provider`, `/plan` handler, and clean up slash registry ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
- Refactor: drop `persist_session` plumbing + fix broken `/btw` mid-turn bypass ([#16075](https://github.com/NousResearch/hermes-agent/pull/16075))
### OpenClaw migration (for folks coming from OpenClaw)
- **Hardened OpenClaw import** — plan-first apply, redaction, pre-migration backup ([#16911](https://github.com/NousResearch/hermes-agent/pull/16911))
- Fix: case-preserving brand rewrite + one-time `~/.openclaw` residue banner ([#16327](https://github.com/NousResearch/hermes-agent/pull/16327))
- Fix: resolve `openclaw` workspace files from `agents.defaults.workspace` ([#16879](https://github.com/NousResearch/hermes-agent/pull/16879))
- Fix: resolve model aliases against real OpenClaw catalog schema (salvage #16778) ([#16977](https://github.com/NousResearch/hermes-agent/pull/16977))
---
## 📊 Web Dashboard
- **Models tab** — rich per-model analytics ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745))
- **Configure main + auxiliary models from the Models page** ([#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Dashboard Chat tab — xterm.js + JSON-RPC sidecar** (supersedes #12710 + #13379, @OutThisLife) ([#14890](https://github.com/NousResearch/hermes-agent/pull/14890))
- **Dashboard layout refresh** (@austinpickett) ([#14899](https://github.com/NousResearch/hermes-agent/pull/14899))
- **`--stop` and `--status` flags** on the dashboard CLI ([#17840](https://github.com/NousResearch/hermes-agent/pull/17840))
- **Page-scoped plugin slots for built-in pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- Fix: replace all buttons for design system buttons ([#17007](https://github.com/NousResearch/hermes-agent/pull/17007))
---
## ⚡ Performance
- **TUI visible cold start cut ~57%** via lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
- **Lazy-import OpenAI, Anthropic, Firecrawl, account_usage** ([#17046](https://github.com/NousResearch/hermes-agent/pull/17046))
- **mtime-cache `load_config()` and `read_raw_config()`** ([#17041](https://github.com/NousResearch/hermes-agent/pull/17041))
- **Memoize `get_tool_definitions()` + TTL-cache `check_fn` results** ([#17098](https://github.com/NousResearch/hermes-agent/pull/17098))
- **Precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS** ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Cache Ink text measurements across yoga flex re-passes** ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
- **Stabilize long-session scrolling** ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
- **Lazily seed virtual history heights** ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
---
## 🔒 Security & Reliability
- **Secret redaction off by default** — stops corrupting patches / API payloads with fake-key substitutions. Opt in via `redaction.enabled: true` ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **`[SYSTEM:``[IMPORTANT:`** in all user-injected markers (Azure content filter dodge) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
- **Canonical `mask_secret` helper; fix status.py DIM drift** ([#17207](https://github.com/NousResearch/hermes-agent/pull/17207))
- **Sweep expired paste.rs uploads on a real timer** ([#16431](https://github.com/NousResearch/hermes-agent/pull/16431))
- **Preserve symlinks during atomic file writes** ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- **Probe `/dev/tty` by opening it, not bare existence** ([#17024](https://github.com/NousResearch/hermes-agent/pull/17024))
---
## 🐛 Notable Bug Fixes
This window includes 360 `fix:` PRs. Selected highlights from across the stack:
- **Background review fork inherits parent's live runtime** — provider/model/creds now propagate correctly ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Hindsight configurable `HINDSIGHT_TIMEOUT` env var** ([#15077](https://github.com/NousResearch/hermes-agent/pull/15077))
- **Tools: normalize numeric entries + clear stale `no_mcp` in `_save_platform_tools`** ([#15607](https://github.com/NousResearch/hermes-agent/pull/15607))
- **MCP: rewrite `definitions` refs to `$defs` in input schemas** — closes provider-side 400s
- **Azure content filter compatibility** — renamed `[SYSTEM:` markers so Azure's content filter stops flagging them ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
- **Vision cache uses HERMES_HOME instead of cwd** ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
- **FTS5 search** — tool_name + tool_calls indexing with repair + migration ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Streaming reasoning persists on assistant turns** ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- **execute_code concurrent RPC serialization** (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- **Background reviewer scoped to memory + skills toolsets** — no more accidental web/shell escapes ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Compression recovery** — retry on main before giving up; notify user when aux fails ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774), [#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- **`croniter` promoted to a core dependency** ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
- **Discord tool `limit` parameter coerced to int** before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
- **Yuanbao messaging platform entrance fix** ([#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
- **ACP advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
- **DeepSeek / Kimi reasoning content isolation** across cross-provider histories (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749), [#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- **Preserve reasoning_content replay on DeepSeek v4 + Kimi/Moonshot thinking** ([#18045](https://github.com/NousResearch/hermes-agent/pull/18045))
The vast majority of the 360 fixes landed in the streaming/compression/tool-calling paths across all providers — DeepSeek, Kimi, Moonshot, GLM, Qwen, MiniMax, Gemini, Anthropic, OpenAI — alongside TUI polish (resize, scroll, sticky-prompt) and gateway platform-specific edge cases.
---
## 🧪 Testing & CI
- Hermetic test parity (`scripts/run_tests.sh`) held across this window
- **Microsoft Teams xdist collision guard** — prevents worker collisions when Teams platform tests run in parallel ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- Chore: remove unused imports and dead locals (ruff F401, F841) ([#17010](https://github.com/NousResearch/hermes-agent/pull/17010))
---
## 📚 Documentation
- **Curator feature page** added to docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- **Document pin also blocking `skill_manage` writes** ([#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- **Direct-URL skill install documented** across features, reference, guide, and `hermes-agent` skill ([#16355](https://github.com/NousResearch/hermes-agent/pull/16355))
- **Hooks tutorial — build a BOOT.md startup checklist** (replaces the removed built-in hook) ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202))
- **ComfyUI docs: ask local vs cloud FIRST before hardware check** ([#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **Obliteratus skill: link YouTube video guide in SKILL.md** ([#15808](https://github.com/NousResearch/hermes-agent/pull/15808))
- Per-skill docs pages generated for bundled + optional skills; ASCII art code blocks auto-wrapped ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929), [#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
---
## ⚖️ Removed / Reverted
- **Kanban multi-profile collaboration board** — landed in #16081, reverted in ([#16098](https://github.com/NousResearch/hermes-agent/pull/16098)) while the design is reworked
- **computer-use cua-driver** — 3 preparatory PRs landed then were reverted in ([#16927](https://github.com/NousResearch/hermes-agent/pull/16927))
- **BOOT.md built-in hook** removed ([#17093](https://github.com/NousResearch/hermes-agent/pull/17093)); the hooks tutorial ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202)) shows how to build the same workflow yourself with a shell hook
- **`/provider` + `/plan` slash commands dropped** ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
- **`flush_memories` removed entirely** ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
---
## 👥 Contributors
### Core
- **@teknium1** (Teknium)
### Top Community Contributors (by merged PR count since v0.11.0)
- **@OutThisLife** (Brooklyn) — 52 PRs · TUI — light-terminal detection + pluggable busy styles + auto-resume + session-delete from /resume + mouse-wheel scrolling + xterm.js dashboard Chat tab + cold-start cut + accordion polish
- **@kshitijk4poor** — 12 PRs · LM Studio first-class provider (salvage), Vercel Sandbox backend, GMI Cloud salvage, bundled-by-default touchdesigner-mcp, many tool-call / reasoning fixes
- **@helix4u** — 10 PRs · MCP schema robustness, assorted stability fixes
- **@alt-glitch** — 8 PRs · trigram FTS5 CJK search, declarative Nix plugin install, matrix/feishu hints and fixes
- **@ethernet8023** — 4 PRs
- **@austinpickett** — 4 PRs · LaTeX rendering in TUI, dashboard layout refresh
- **@benbarclay** — 3 PRs · Docker run-as-host-user so bind mounts don't get root-owned
- **@vominh1919** — 2 PRs
- **@stephenschoettler** — 2 PRs
- **@kevin-ho** — ConPTY mouse-injection fix (#15488)
- **@Zjianru** — cross-provider reasoning_content isolation + DeepSeek/Kimi empty-reasoning injection (#15749, #15762)
- **@web3blind** — Telegram chat allowlists for groups and forums (#15027)
- **@SHL0MS** — 9 new TouchDesigner-MCP reference docs (#16768)
- **@0xDevNinja** — curator `restore_skill` nested-archive fix (#17951)
- **@y0shua1ee** — curator `use` activity fix (#17953)
### Also contributing
Salvaged or co-authored work from **@isaachuangGMICLOUD** (GMI Cloud), earlier upstream PRs from the original author of each salvage chain, and a long tail of one-shot fixes, documentation nudges, and skill contributions from the community.
### All Contributors (alphabetical, excluding @teknium1)
@0xbyt4, @0xharryriddle, @0xDevNinja, @0z1-ghb, @5park1e, @A-FdL-Prog, @aj-nt, @akhater, @alblez, @alexg0bot,
@alexzhu0, @AllardQuek, @alt-glitch, @amanning3390, @amanuel2, @AndreKurait, @andrewhosf, @Andy283, @andyylin,
@angel12, @AntAISecurityLab, @ash, @austinpickett, @badgerbees, @BadTechBandit, @Bartok9, @beenherebefore,
@beesrsj2500, @BeliefanX, @benbarclay, @benjaminsehl, @BlackishGreen33, @bloodcarter, @BlueBirdBack,
@briandevans, @brooklynnicholson, @bsgdigital, @buray, @bwjoke, @camaragon, @cdanis, @cgarwood82,
@charles-brooks, @chen1749144759, @chengoak, @ching-kaching, @Contentment003111, @crayfish-ai, @CruxExperts,
@cyclingwithelephants, @dandaka, @danklynn, @ddupont808, @dhabibi, @difujia, @dimitrovi, @dlkakbs,
@dontcallmejames, @EKKOLearnAI, @emozilla, @ericnicolaides, @Erosika, @ethernet8023, @exiao, @Feranmi10,
@flobo3, @foxion37, @georgeglessner, @georgex8001, @ghostmfr, @H-Ali13381, @HangGlidersRule, @harryplusplus,
@haru398801, @heathley, @hejuntt1014, @hekaru-agent, @helix4u, @Heltman, @HenkDz, @heyitsaamir, @hharry11,
@hhhonzik, @hhuang91, @HiddenPuppy, @htsh, @iamagenius00, @in-liberty420, @innocarpe, @irispillars, @iRonin,
@isaachuangGMICLOUD, @Ito-69, @j3ffffff, @jackjin1997, @jakubkrcmar, @Jason2031, @JayGwod, @jerome-benoit,
@johnncenae, @Kailigithub, @keiravoss94, @kevin-ho, @knockyai, @konsisumer, @kshitijk4poor, @kunlabs, @l0hde,
@Leihb, @leoneparise, @LeonSGP43, @liizfq, @liuhao1024, @loongzhao, @lsdsjy, @luyao618, @ma-pony, @Magaav,
@MagicRay1217, @math0r-be, @MattMaximo, @maxims-oss, @MaxyMoos, @maymuneth, @mcndjxlefnd, @memosr,
@MestreY0d4-Uninter, @mewwts, @Mirac1eSky, @MorAlekss, @mrhwick, @mrunmayee17, @mssteuer, @Nanako0129,
@nazirulhafiy, @Nerijusas, @Nicecsh, @nicoloboschi, @nightq, @ningfangbin, @octo-patch, @Octopus,
@OutThisLife, @Paperclip, @pein892, @perlowja, @prasadus92, @qike-ms, @qiyin-code, @Readon, @ReginaldasR,
@revaraver, @rfilgueiras, @rmoen, @romanornr, @rugvedS07, @rylena, @samrusani, @Sanjays2402, @sasha-id,
@Satoshi-agi, @scheidti, @scotttrinh, @season179, @SeeYangZhi, @sgaofen, @shamork, @shannonsands, @SHL0MS,
@simbam99, @Societus, @socrates1024, @Sonoyunchu, @sprmn24, @stephenschoettler, @tangyuanjc, @TechPrototyper,
@tekgnosis-net, @ThomassJonax, @tmimmanuel, @tochukwuada, @Tosko4, @Tranquil-Flow, @twozle, @txbxxx,
@UgwujaGeorge, @Versun, @vlwkaos, @voidborne-d, @vominh1919, @Wang-tianhao, @Wangshengyang2004, @web3blind,
@westers, @Wysie, @xandersbell, @xiahu88988, @XieNBi, @xinbenlv, @xnbi, @y0shua1ee, @yatesjalex, @yes999zc,
@yeyitech, @Yoimex, @YueLich, @Yukipukii1, @zhiyanliu, @zicochaos, @Zjianru, @zkl2333, @zons-zhaozhy,
@ztexydt-cqh.
Also: @Siddharth Balyan, @YuShu.
---
**Full Changelog**: [v2026.4.23...v2026.4.30](https://github.com/NousResearch/hermes-agent/compare/v2026.4.23...v2026.4.30)
+381 -30
View File
@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
@@ -47,6 +48,7 @@ from acp.schema import (
TextContentBlock,
UnstructuredCommandInput,
Usage,
UsageUpdate,
UserMessageChunk,
)
@@ -65,6 +67,7 @@ from acp_adapter.events import (
)
from acp_adapter.permissions import make_approval_callback
from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
from acp_adapter.tools import build_tool_complete, build_tool_start
logger = logging.getLogger(__name__)
@@ -164,6 +167,8 @@ class HermesACPAgent(acp.Agent):
"context": "Show conversation context info",
"reset": "Clear conversation history",
"compact": "Compress conversation context",
"steer": "Inject guidance into the currently running agent turn",
"queue": "Queue a prompt to run after the current turn finishes",
"version": "Show Hermes version",
}
@@ -193,6 +198,16 @@ class HermesACPAgent(acp.Agent):
"name": "compact",
"description": "Compress conversation context",
},
{
"name": "steer",
"description": "Inject guidance into the currently running agent turn",
"input_hint": "guidance for the active turn",
},
{
"name": "queue",
"description": "Queue a prompt to run after the current turn finishes",
"input_hint": "prompt to run next",
},
{
"name": "version",
"description": "Show Hermes version",
@@ -303,6 +318,66 @@ class HermesACPAgent(acp.Agent):
return target_provider, new_model
@staticmethod
def _build_usage_update(state: SessionState) -> UsageUpdate | None:
"""Build ACP native context-usage data for clients like Zed.
Zed's circular context indicator is driven by ACP ``usage_update``
session updates: ``size`` is the model context window and ``used`` is
the current request pressure. Hermes estimates ``used`` from the same
buckets it sends to providers: system prompt, conversation history, and
tool schemas.
"""
agent = state.agent
compressor = getattr(agent, "context_compressor", None)
size = int(getattr(compressor, "context_length", 0) or 0)
if size <= 0:
return None
try:
from agent.model_metadata import estimate_request_tokens_rough
used = estimate_request_tokens_rough(
state.history,
system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
tools=getattr(agent, "tools", None) or None,
)
except Exception:
logger.debug("Could not estimate ACP native context usage", exc_info=True)
used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)
return UsageUpdate(
session_update="usage_update",
size=max(size, 0),
used=max(used, 0),
)
async def _send_usage_update(self, state: SessionState) -> None:
"""Send ACP native context usage to the connected client."""
if not self._conn:
return
update = self._build_usage_update(state)
if update is None:
return
try:
await self._conn.session_update(
session_id=state.session_id,
update=update,
)
except Exception:
logger.warning(
"Failed to send ACP usage update for session %s",
state.session_id,
exc_info=True,
)
def _schedule_usage_update(self, state: SessionState) -> None:
"""Schedule native context indicator refresh after ACP responses."""
if not self._conn:
return
loop = asyncio.get_running_loop()
loop.call_soon(asyncio.create_task, self._send_usage_update(state))
async def _register_session_mcp_servers(
self,
state: SessionState,
@@ -473,37 +548,99 @@ class HermesACPAgent(acp.Agent):
)
return None
@staticmethod
def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
"""Extract function name/arguments from an OpenAI-style tool_call."""
function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
if isinstance(raw_args, str):
try:
parsed = json.loads(raw_args)
except Exception:
parsed = {"raw": raw_args}
raw_args = parsed
if not isinstance(raw_args, dict):
raw_args = {}
return name, raw_args
@staticmethod
def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
"""Return the stable provider tool call id for ACP history replay."""
return str(
tool_call.get("id")
or tool_call.get("call_id")
or tool_call.get("tool_call_id")
or ""
).strip()
async def _replay_session_history(self, state: SessionState) -> None:
"""Send persisted user/assistant history to clients during session/load.
Zed's ACP history UI calls ``session/load`` after the user picks an item
from the Agents sidebar. The agent must then replay the full conversation
as ``user_message_chunk`` / ``agent_message_chunk`` notifications; merely
restoring server-side state makes Hermes remember context, but leaves the
editor looking like a clean thread.
as user/assistant chunks plus reconstructed tool-call start/completion
notifications; merely restoring server-side state makes Hermes remember
context, but leaves the editor looking like a clean thread.
"""
if not self._conn or not state.history:
return
for message in state.history:
role = str(message.get("role") or "")
if role not in {"user", "assistant"}:
continue
text = self._history_message_text(message)
if not text:
continue
update = self._history_message_update(role=role, text=text)
if update is None:
continue
active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
async def _send(update: Any) -> bool:
try:
await self._conn.session_update(session_id=state.session_id, update=update)
return True
except Exception:
logger.warning(
"Failed to replay ACP history for session %s",
state.session_id,
exc_info=True,
)
return
return False
for message in state.history:
role = str(message.get("role") or "")
if role in {"user", "assistant"}:
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
if role == "assistant" and isinstance(message.get("tool_calls"), list):
for tool_call in message["tool_calls"]:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
continue
if role == "tool":
tool_call_id = str(message.get("tool_call_id") or "").strip()
tool_name = str(message.get("tool_name") or "").strip()
function_args: dict[str, Any] | None = None
if tool_call_id in active_tool_calls:
tool_name, function_args = active_tool_calls.pop(tool_call_id)
if not tool_call_id or not tool_name:
continue
result = message.get("content")
if not await _send(
build_tool_complete(
tool_call_id,
tool_name,
result=result if isinstance(result, str) else None,
function_args=function_args,
)
):
return
async def new_session(
self,
@@ -515,11 +652,24 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
def _schedule_history_replay(self, state: SessionState) -> None:
"""Replay persisted history after session/load or session/resume returns.
Zed only attaches streamed transcript/tool updates once the load/resume
response has completed. Sending replay notifications while the request is
still in-flight can make the server look correct in logs while the editor
drops or fails to attach the tool-call history.
"""
loop = asyncio.get_running_loop()
replay_coro = self._replay_session_history(state)
loop.call_soon(asyncio.create_task, replay_coro)
async def load_session(
self,
cwd: str,
@@ -533,8 +683,9 @@ class HermesACPAgent(acp.Agent):
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(session_id)
self._schedule_usage_update(state)
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
@@ -550,13 +701,17 @@ class HermesACPAgent(acp.Agent):
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
state = self.session_manager.get_session(session_id)
if state and state.cancel_event:
with state.runtime_lock:
if state.is_running and state.current_prompt_text:
state.interrupted_prompt_text = state.current_prompt_text
state.cancel_event.set()
try:
if getattr(state, "agent", None) and hasattr(state.agent, "interrupt"):
@@ -654,6 +809,39 @@ class HermesACPAgent(acp.Agent):
if not has_content:
return PromptResponse(stop_reason="end_turn")
# /steer on an idle session has no in-flight tool call to inject into.
# Rewrite it so the payload runs as a normal user prompt, matching the
# gateway's behavior (gateway/run.py ~L4898). Two sub-cases:
# 1. Zed-interrupt salvage — a prior prompt was cancelled by the
# client right before /steer arrived; replay it with the steer
# text attached as explicit correction/guidance so the user's
# in-flight work isn't lost.
# 2. Plain idle — no prior work to salvage; just run the steer
# payload as a regular prompt. Without this, _cmd_steer would
# silently append to state.queued_prompts and respond with
# "No active turn — queued for the next turn", which looks like
# /queue even though the user never typed /queue.
if isinstance(user_content, str) and user_text.startswith("/steer"):
steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
interrupted_prompt = ""
rewrite_idle = False
with state.runtime_lock:
if not state.is_running and steer_text:
if state.interrupted_prompt_text:
interrupted_prompt = state.interrupted_prompt_text
state.interrupted_prompt_text = ""
else:
rewrite_idle = True
if interrupted_prompt:
user_text = (
f"{interrupted_prompt}\n\n"
f"User correction/guidance after interrupt: {steer_text}"
)
user_content = user_text
elif rewrite_idle:
user_text = steer_text
user_content = steer_text
# Intercept slash commands — handle locally without calling the LLM.
# Slash commands are text-only; if the client included images/resources,
# send the whole multimodal prompt to the agent instead of treating it as
@@ -664,8 +852,27 @@ class HermesACPAgent(acp.Agent):
if self._conn:
update = acp.update_agent_message_text(response_text)
await self._conn.session_update(session_id, update)
await self._send_usage_update(state)
return PromptResponse(stop_reason="end_turn")
# If Zed sends another regular prompt while the same ACP session is
# still running, queue it instead of racing two AIAgent loops against
# the same state.history. /steer and /queue are handled above and can
# land immediately.
with state.runtime_lock:
if state.is_running:
queued_text = user_text or "[Image attachment]"
state.queued_prompts.append(queued_text)
depth = len(state.queued_prompts)
if self._conn:
update = acp.update_agent_message_text(
f"Queued for the next turn. ({depth} queued)"
)
await self._conn.session_update(session_id, update)
return PromptResponse(stop_reason="end_turn")
state.is_running = True
state.current_prompt_text = user_text or "[Image attachment]"
logger.info("Prompt on session %s: %s", session_id, user_text[:100])
conn = self._conn
@@ -678,24 +885,37 @@ class HermesACPAgent(acp.Agent):
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
streamed_message = False
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
thinking_cb = make_thinking_cb(conn, session_id, loop)
reasoning_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
def stream_delta_cb(text: str) -> None:
nonlocal streamed_message
if text:
streamed_message = True
message_cb(text)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
tool_progress_cb = None
thinking_cb = None
reasoning_cb = None
step_cb = None
message_cb = None
stream_delta_cb = None
approval_cb = None
agent = state.agent
agent.tool_progress_callback = tool_progress_cb
agent.thinking_callback = thinking_cb
# ACP thought panes should not receive Hermes' local kawaii waiting/status
# updates. Route provider/model reasoning deltas instead; if the provider
# emits no reasoning, Zed should not get a fake "thinking" accordion.
agent.thinking_callback = None
agent.reasoning_callback = reasoning_cb
agent.step_callback = step_cb
agent.message_callback = message_cb
agent.stream_delta_callback = stream_delta_cb
# Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
# Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -777,6 +997,9 @@ class HermesACPAgent(acp.Agent):
result = await loop.run_in_executor(_executor, ctx.run, _run_agent)
except Exception:
logger.exception("Executor error for session %s", session_id)
with state.runtime_lock:
state.is_running = False
state.current_prompt_text = ""
return PromptResponse(stop_reason="end_turn")
if result.get("messages"):
@@ -798,10 +1021,32 @@ class HermesACPAgent(acp.Agent):
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn:
if final_response and conn and not streamed_message:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
# Mark this turn idle before draining queued work so recursive prompt()
# calls can acquire the session. Queued turns are intentionally run as
# normal follow-up user prompts, preserving role alternation and history.
with state.runtime_lock:
state.is_running = False
state.current_prompt_text = ""
while True:
with state.runtime_lock:
if not state.queued_prompts:
break
next_prompt = state.queued_prompts.pop(0)
if conn:
await conn.session_update(
session_id,
acp.update_user_message_text(next_prompt),
)
await self.prompt(
prompt=[TextContentBlock(type="text", text=next_prompt)],
session_id=session_id,
)
usage = None
if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
usage = Usage(
@@ -812,6 +1057,8 @@ class HermesACPAgent(acp.Agent):
cached_read_tokens=result.get("cache_read_tokens"),
)
await self._send_usage_update(state)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
return PromptResponse(stop_reason=stop_reason, usage=usage)
@@ -879,6 +1126,8 @@ class HermesACPAgent(acp.Agent):
"context": self._cmd_context,
"reset": self._cmd_reset,
"compact": self._cmd_compact,
"steer": self._cmd_steer,
"queue": self._cmd_queue,
"version": self._cmd_version,
}.get(cmd)
@@ -942,22 +1191,84 @@ class HermesACPAgent(acp.Agent):
return f"Could not list tools: {e}"
def _cmd_context(self, args: str, state: SessionState) -> str:
"""Show ACP session context pressure and compression guidance."""
n_messages = len(state.history)
if n_messages == 0:
return "Conversation is empty (no messages yet)."
# Count by role
# Count by role.
roles: dict[str, int] = {}
for msg in state.history:
role = msg.get("role", "unknown")
roles[role] = roles.get(role, 0) + 1
agent = state.agent
model = state.model or getattr(agent, "model", "")
provider = getattr(agent, "provider", None) or "auto"
compressor = getattr(agent, "context_compressor", None)
context_length = int(getattr(compressor, "context_length", 0) or 0)
threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
try:
from agent.model_metadata import estimate_request_tokens_rough
system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=system_prompt,
tools=tools,
)
except Exception:
logger.debug("Could not estimate ACP context usage", exc_info=True)
approx_tokens = 0
if threshold_tokens <= 0 and context_length > 0:
threshold_tokens = int(context_length * 0.80)
lines = [
f"Conversation: {n_messages} messages",
f"Conversation: {n_messages} messages"
if n_messages
else "Conversation is empty (no messages yet).",
f" user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
]
model = state.model or getattr(state.agent, "model", "")
if model:
lines.append(f"Model: {model}")
lines.append(f"Provider: {provider}")
if approx_tokens > 0:
if context_length > 0:
usage_pct = (approx_tokens / context_length) * 100
lines.append(
f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
)
else:
lines.append(f"Context usage: ~{approx_tokens:,} tokens")
if threshold_tokens > 0:
if approx_tokens > 0:
threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
remaining = max(threshold_tokens - approx_tokens, 0)
if approx_tokens >= threshold_tokens:
lines.append(
f"Compression: due now (threshold ~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ "). Run /compact."
)
else:
lines.append(
f"Compression: ~{remaining:,} tokens until threshold "
f"(~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ ")."
)
else:
lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
if getattr(agent, "compression_enabled", True) is False:
lines.append("Compression is disabled for this agent.")
else:
lines.append("Tip: run /compact to compress manually before the threshold.")
return "\n".join(lines)
def _cmd_reset(self, args: str, state: SessionState) -> str:
@@ -975,10 +1286,16 @@ class HermesACPAgent(acp.Agent):
if not hasattr(agent, "_compress_context"):
return "Context compression not available for this agent."
from agent.model_metadata import estimate_messages_tokens_rough
from agent.model_metadata import estimate_request_tokens_rough
original_count = len(state.history)
approx_tokens = estimate_messages_tokens_rough(state.history)
# Include system prompt + tool schemas so the figure reflects real
# request pressure, not a transcript-only underestimate (#6217).
_sys_prompt = getattr(agent, "_cached_system_prompt", "") or ""
_tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history, system_prompt=_sys_prompt, tools=_tools
)
original_session_db = getattr(agent, "_session_db", None)
try:
@@ -998,7 +1315,13 @@ class HermesACPAgent(acp.Agent):
self.session_manager.save_session(state.session_id)
new_count = len(state.history)
new_tokens = estimate_messages_tokens_rough(state.history)
_sys_prompt_after = getattr(agent, "_cached_system_prompt", "") or _sys_prompt
_tools_after = getattr(agent, "tools", None) or _tools
new_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=_sys_prompt_after,
tools=_tools_after,
)
return (
f"Context compressed: {original_count} -> {new_count} messages\n"
f"~{approx_tokens:,} -> ~{new_tokens:,} tokens"
@@ -1006,6 +1329,34 @@ class HermesACPAgent(acp.Agent):
except Exception as e:
return f"Compression failed: {e}"
def _cmd_steer(self, args: str, state: SessionState) -> str:
steer_text = args.strip()
if not steer_text:
return "Usage: /steer <guidance>"
if state.is_running and hasattr(state.agent, "steer"):
try:
if state.agent.steer(steer_text):
preview = steer_text[:80] + ("..." if len(steer_text) > 80 else "")
return f"⏩ Steer queued for the active turn: {preview}"
except Exception as exc:
logger.warning("ACP steer failed for session %s: %s", state.session_id, exc)
return f"⚠️ Steer failed: {exc}"
with state.runtime_lock:
state.queued_prompts.append(steer_text)
depth = len(state.queued_prompts)
return f"No active turn — queued for the next turn. ({depth} queued)"
def _cmd_queue(self, args: str, state: SessionState) -> str:
queued_text = args.strip()
if not queued_text:
return "Usage: /queue <prompt>"
with state.runtime_lock:
state.queued_prompts.append(queued_text)
depth = len(state.queued_prompts)
return f"Queued for the next turn. ({depth} queued)"
def _cmd_version(self, args: str, state: SessionState) -> str:
return f"Hermes Agent v{HERMES_VERSION}"
+46 -7
View File
@@ -26,6 +26,33 @@ from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
def _win_path_to_wsl(path: str) -> str | None:
"""Convert a Windows drive path to its WSL /mnt/<drive>/... equivalent."""
match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
if not match:
return None
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
return f"/mnt/{drive}/{tail}"
def _translate_acp_cwd(cwd: str) -> str:
"""Translate Windows ACP cwd values when Hermes itself is running in WSL.
Windows ACP clients can launch ``hermes acp`` inside WSL while still sending
editor workspaces as Windows drive paths such as ``E:\\Projects``. Store
and execute against the WSL mount path so agents, tools, and persisted ACP
sessions all agree on the usable workspace. Native Linux/macOS keeps the
original cwd unchanged.
"""
from hermes_constants import is_wsl
if not is_wsl():
return cwd
translated = _win_path_to_wsl(str(cwd))
return translated if translated is not None else cwd
def _normalize_cwd_for_compare(cwd: str | None) -> str:
raw = str(cwd or ".").strip()
if not raw:
@@ -34,11 +61,9 @@ def _normalize_cwd_for_compare(cwd: str | None) -> str:
# Normalize Windows drive paths into the equivalent WSL mount form so
# ACP history filters match the same workspace across Windows and WSL.
match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
if match:
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
expanded = f"/mnt/{drive}/{tail}"
translated = _win_path_to_wsl(expanded)
if translated is not None:
expanded = translated
elif re.match(r"^/mnt/[A-Za-z]/", expanded):
expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"
@@ -96,12 +121,18 @@ def _acp_stderr_print(*args, **kwargs) -> None:
def _register_task_cwd(task_id: str, cwd: str) -> None:
"""Bind a task/session id to the editor's working directory for tools."""
"""Bind a task/session id to the editor's working directory for tools.
Zed can launch Hermes from a Windows workspace while the ACP process runs
inside WSL. In that case ACP sends cwd as e.g. ``E:\\Projects\\POTI``;
local tools need the WSL mount equivalent or subprocess creation fails
before the command can run.
"""
if not task_id:
return
try:
from tools.terminal_tool import register_task_env_overrides
register_task_env_overrides(task_id, {"cwd": cwd})
register_task_env_overrides(task_id, {"cwd": _translate_acp_cwd(cwd)})
except Exception:
logger.debug("Failed to register ACP task cwd override", exc_info=True)
@@ -145,6 +176,11 @@ class SessionState:
model: str = ""
history: List[Dict[str, Any]] = field(default_factory=list)
cancel_event: Any = None # threading.Event
is_running: bool = False
queued_prompts: List[str] = field(default_factory=list)
runtime_lock: Any = field(default_factory=Lock)
current_prompt_text: str = ""
interrupted_prompt_text: str = ""
class SessionManager:
@@ -175,6 +211,7 @@ class SessionManager:
"""Create a new session with a unique ID and a fresh AIAgent."""
import threading
cwd = _translate_acp_cwd(cwd)
session_id = str(uuid.uuid4())
agent = self._make_agent(session_id=session_id, cwd=cwd)
state = SessionState(
@@ -217,6 +254,7 @@ class SessionManager:
"""Deep-copy a session's history into a new session."""
import threading
cwd = _translate_acp_cwd(cwd)
original = self.get_session(session_id) # checks DB too
if original is None:
return None
@@ -318,6 +356,7 @@ class SessionManager:
def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
"""Update the working directory for a session and its tool overrides."""
cwd = _translate_acp_cwd(cwd)
state = self.get_session(session_id) # checks DB too
if state is None:
return None
+822 -21
View File
@@ -28,6 +28,11 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
"terminal": "execute",
"process": "execute",
"execute_code": "execute",
# Session/meta tools
"todo": "other",
"skill_view": "read",
"skills_list": "read",
"skill_manage": "edit",
# Web / fetch
"web_search": "fetch",
"web_extract": "fetch",
@@ -51,6 +56,28 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
}
_POLISHED_TOOLS = {
# Core operator loop
"todo", "memory", "session_search", "delegate_task",
# Files / execution
"read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
# Skills / web / browser / media
"skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
"browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
"browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
"vision_analyze", "image_generate", "text_to_speech",
# Schedulers / platform integrations
"cronjob", "send_message", "clarify", "discord", "discord_admin",
"ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
"feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
"feishu_drive_reply_comment", "feishu_drive_add_comment",
"kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
"kanban_block", "kanban_link", "kanban_heartbeat",
"yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
"yb_send_dm", "yb_send_sticker", "mixture_of_agents",
}
def get_tool_kind(tool_name: str) -> ToolKind:
"""Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
return TOOL_KIND_MAP.get(tool_name, "other")
@@ -85,18 +112,645 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
if urls:
return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
return "web extract"
if tool_name == "process":
action = str(args.get("action") or "").strip() or "manage"
sid = str(args.get("session_id") or "").strip()
return f"process {action}: {sid}" if sid else f"process {action}"
if tool_name == "delegate_task":
tasks = args.get("tasks")
if isinstance(tasks, list) and tasks:
return f"delegate batch ({len(tasks)} tasks)"
goal = args.get("goal", "")
if goal and len(goal) > 60:
goal = goal[:57] + "..."
return f"delegate: {goal}" if goal else "delegate task"
if tool_name == "session_search":
query = str(args.get("query") or "").strip()
return f"session search: {query}" if query else "recent sessions"
if tool_name == "memory":
action = str(args.get("action") or "manage").strip() or "manage"
target = str(args.get("target") or "memory").strip() or "memory"
return f"memory {action}: {target}"
if tool_name == "execute_code":
return "execute code"
code = str(args.get("code") or "").strip()
first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
if first_line:
if len(first_line) > 70:
first_line = first_line[:67] + "..."
return f"python: {first_line}"
return "python code"
if tool_name == "todo":
items = args.get("todos")
if isinstance(items, list):
return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
return "todo"
if tool_name == "skill_view":
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
suffix = f"/{file_path}" if file_path else ""
return f"skill view ({name}{suffix})"
if tool_name == "skills_list":
category = str(args.get("category") or "").strip()
return f"skills list ({category})" if category else "skills list"
if tool_name == "skill_manage":
action = str(args.get("action") or "manage").strip() or "manage"
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
target = f"{name}/{file_path}" if file_path else name
if len(target) > 64:
target = target[:61] + "..."
return f"skill {action}: {target}"
if tool_name == "browser_navigate":
return f"navigate: {args.get('url', '?')}"
if tool_name == "browser_snapshot":
return "browser snapshot"
if tool_name == "browser_vision":
return f"browser vision: {str(args.get('question', '?'))[:50]}"
if tool_name == "browser_get_images":
return "browser images"
if tool_name == "vision_analyze":
return f"analyze image: {args.get('question', '?')[:50]}"
return f"analyze image: {str(args.get('question', '?'))[:50]}"
if tool_name == "image_generate":
prompt = str(args.get("prompt") or args.get("description") or "").strip()
return f"generate image: {prompt[:50]}" if prompt else "generate image"
if tool_name == "cronjob":
action = str(args.get("action") or "manage").strip() or "manage"
job_id = str(args.get("job_id") or args.get("id") or "").strip()
return f"cron {action}: {job_id}" if job_id else f"cron {action}"
return tool_name
def _text(content: str) -> Any:
return acp.tool_content(acp.text_block(content))
def _json_loads_maybe(value: Optional[str]) -> Any:
if not isinstance(value, str):
return value
try:
return json.loads(value)
except Exception:
pass
# Some Hermes tools append a human hint after a JSON payload, e.g.
# ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
# by decoding the first JSON value instead of falling back to raw text.
try:
decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
return decoded
except Exception:
return None
def _truncate_text(text: str, limit: int = 5000) -> str:
if len(text) <= limit:
return text
return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
def _fenced_text(text: str, language: str = "") -> str:
"""Return a Markdown fence that cannot be broken by backticks in text."""
longest = max((len(run) for run in text.split("`")[1::2]), default=0)
fence = "`" * max(3, longest + 1)
return f"{fence}{language}\n{text}\n{fence}"
def _format_todo_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
return None
summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
icon = {
"completed": "",
"in_progress": "🔄",
"pending": "",
"cancelled": "",
}
lines = ["**Todo list**", ""]
for item in data["todos"]:
if not isinstance(item, dict):
continue
status = str(item.get("status") or "pending")
content = str(item.get("content") or item.get("id") or "").strip()
if content:
lines.append(f"- {icon.get(status, '')} {content}")
if summary:
cancelled = summary.get("cancelled", 0)
lines.extend([
"",
"**Progress:** "
f"{summary.get('completed', 0)} completed, "
f"{summary.get('in_progress', 0)} in progress, "
f"{summary.get('pending', 0)} pending"
+ (f", {cancelled} cancelled" if cancelled else ""),
])
return "\n".join(lines)
def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not data.get("content"):
return f"Read failed: {data.get('error')}"
content = data.get("content")
if not isinstance(content, str):
return None
path = str((args or {}).get("path") or data.get("path") or "file").strip()
offset = (args or {}).get("offset")
limit = (args or {}).get("limit")
range_bits = []
if offset:
range_bits.append(f"from line {offset}")
if limit:
range_bits.append(f"limit {limit}")
suffix = f" ({', '.join(range_bits)})" if range_bits else ""
header = f"Read {path}{suffix}"
if data.get("total_lines") is not None:
header += f"{data.get('total_lines')} total lines"
# Hermes read_file output is line-numbered with `|`. If we send it as raw
# Markdown, Zed can interpret pipes as tables and collapse the layout.
# Fence the payload so file lines stay readable and literal.
return _truncate_text(f"{header}\n\n{_fenced_text(content)}")
def _format_search_files_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
matches = data.get("matches")
if not isinstance(matches, list):
return None
total = data.get("total_count", len(matches))
shown = min(len(matches), 12)
truncated = bool(data.get("truncated")) or len(matches) > shown
lines = [
"Search results",
f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
"",
]
for match in matches[:shown]:
if not isinstance(match, dict):
lines.append(f"- {match}")
continue
path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
line = match.get("line") or match.get("line_number")
content = str(match.get("content") or match.get("text") or "").strip()
loc = f"{path}:{line}" if line else path
lines.append(f"- {loc}")
if content:
snippet = _truncate_text(" ".join(content.split()), 300)
lines.append(f" {snippet}")
if truncated:
lines.extend([
"",
"Results truncated. Narrow the search, add file_glob, or use offset to page.",
])
return _truncate_text("\n".join(lines), limit=7000)
def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
output = str(data.get("output") or "")
error = str(data.get("error") or "")
exit_code = data.get("exit_code")
parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
if output:
parts.extend(["", "Output:", output])
if error:
parts.extend(["", "Error:", error])
return _truncate_text("\n".join(parts))
def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
headings: list[str] = []
for line in content.splitlines():
stripped = line.strip()
if stripped.startswith("#"):
heading = stripped.lstrip("#").strip()
if heading:
headings.append(heading)
if len(headings) >= limit:
break
return headings
def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Skill view failed: {data.get('error', 'unknown error')}"
name = str(data.get("name") or "skill")
file_path = str(data.get("file") or data.get("path") or "SKILL.md")
description = str(data.get("description") or "").strip()
content = str(data.get("content") or "")
linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
if description:
lines.append(f"- **Description:** {description}")
if content:
lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
if linked:
linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
lines.append(f"- **Linked files:** {linked_count}")
headings = _extract_markdown_headings(content)
if headings:
lines.extend(["", "**Sections**"])
lines.extend(f"- {heading}" for heading in headings)
lines.extend([
"",
"_Full skill content is available to the agent but hidden here to keep ACP readable._",
])
return "\n".join(lines)
def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "manage").strip() or "manage"
name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
success = data.get("success")
status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
if action not in {"delete"}:
lines.append(f"- **File:** `{file_path}`")
message = str(data.get("message") or data.get("error") or "").strip()
if message:
lines.append(f"- **Result:** {message}")
replacements = data.get("replacements") or data.get("replacement_count")
if replacements is not None:
lines.append(f"- **Replacements:** {replacements}")
path = str(data.get("path") or "").strip()
if path:
lines.append(f"- **Path:** `{path}`")
return "\n".join(lines)
def _format_web_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
if not isinstance(web, list):
return None
lines = [f"Web results: {len(web)}"]
for item in web[:10]:
if not isinstance(item, dict):
continue
title = str(item.get("title") or item.get("url") or "result").strip()
url = str(item.get("url") or "").strip()
desc = str(item.get("description") or "").strip()
lines.append(f"{title}" + (f"{url}" if url else ""))
if desc:
lines.append(f" {desc}")
return _truncate_text("\n".join(lines))
def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
"""Return only web_extract errors for ACP; success stays compact via title."""
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False and data.get("error"):
return f"Web extract failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
failures: list[str] = []
for item in results[:10]:
if not isinstance(item, dict):
continue
error = str(item.get("error") or "").strip()
if not error or error in {"None", "null"}:
continue
url = str(item.get("url") or "").strip()
title = str(item.get("title") or url or "Untitled").strip()
failures.append(
f"- {title}" + (f"{url}" if url and url != title else "") + f"\n Error: {_truncate_text(error, limit=500)}"
)
if not failures:
return None
lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
lines.extend(failures)
return "\n".join(lines)
def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False and data.get("error"):
return f"Process error: {data.get('error')}"
action = str((args or {}).get("action") or "process").strip() or "process"
if isinstance(data.get("processes"), list):
processes = data["processes"]
lines = [f"Processes: {len(processes)}"]
for proc in processes[:20]:
if not isinstance(proc, dict):
lines.append(f"- {proc}")
continue
sid = str(proc.get("session_id") or proc.get("id") or "?")
status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
cmd = str(proc.get("command") or "").strip()
pid = proc.get("pid")
code = proc.get("exit_code")
bits = [status]
if pid is not None:
bits.append(f"pid {pid}")
if code is not None:
bits.append(f"exit {code}")
lines.append(f"- `{sid}` — {', '.join(bits)}" + (f"{cmd[:120]}" if cmd else ""))
if len(processes) > 20:
lines.append(f"... {len(processes) - 20} more process(es)")
return "\n".join(lines)
status = str(data.get("status") or data.get("state") or action).strip()
sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
if data.get(key) is not None:
lines.append(f"- **{label}:** {data.get(key)}")
output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
error = data.get("error") or data.get("stderr")
if output:
lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
if error:
lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
msg = data.get("message")
if msg and not output and not error:
lines.append(str(msg))
return _truncate_text("\n".join(lines), limit=7000)
def _format_delegate_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not isinstance(data.get("results"), list):
return f"Delegation failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
total = data.get("total_duration_seconds")
lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
icon = {"completed": "", "failed": "", "error": "", "timeout": "", "interrupted": ""}
for item in results:
if not isinstance(item, dict):
lines.append(f"- {item}")
continue
idx = item.get("task_index")
status = str(item.get("status") or "unknown")
model = item.get("model")
dur = item.get("duration_seconds")
role = item.get("_child_role")
header = f"{icon.get(status, '')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
bits = []
if model:
bits.append(str(model))
if role:
bits.append(f"role={role}")
if dur is not None:
bits.append(f"{dur}s")
if bits:
header += " (" + ", ".join(bits) + ")"
lines.extend(["", header])
summary = str(item.get("summary") or "").strip()
error = str(item.get("error") or "").strip()
if summary:
lines.append(_truncate_text(summary, limit=1200))
if error:
lines.append("Error: " + _truncate_text(error, limit=800))
trace = item.get("tool_trace")
if isinstance(trace, list) and trace:
names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
if names:
lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
return _truncate_text("\n".join(lines), limit=8000)
def _format_session_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Session search failed: {data.get('error', 'unknown error')}"
results = data.get("results")
if not isinstance(results, list):
return None
mode = data.get("mode") or "search"
query = data.get("query")
lines = ["Recent sessions" if mode == "recent" else f"Session search results" + (f" for `{query}`" if query else "")]
if not results:
lines.append(str(data.get("message") or "No matching sessions found."))
return "\n".join(lines)
for item in results:
if not isinstance(item, dict):
continue
sid = str(item.get("session_id") or "?")
title = str(item.get("title") or item.get("when") or "Untitled session").strip()
when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
count = item.get("message_count")
source = str(item.get("source") or "").strip()
meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
lines.append(f"- **{title}** (`{sid}`)" + (f"{meta}" if meta else ""))
summary = str(item.get("summary") or item.get("preview") or "").strip()
if summary:
lines.append(" " + _truncate_text(" ".join(summary.split()), limit=500))
return _truncate_text("\n".join(lines), limit=7000)
def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "memory").strip() or "memory"
target = str(data.get("target") or (args or {}).get("target") or "memory")
if data.get("success") is False:
lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
matches = data.get("matches")
if isinstance(matches, list) and matches:
lines.append("Matches:")
lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
return "\n".join(lines)
lines = [f"✅ Memory {action} saved ({target})"]
if data.get("message"):
lines.append(str(data.get("message")))
if data.get("entry_count") is not None:
lines.append(f"Entries: {data.get('entry_count')}")
if data.get("usage"):
lines.append(f"Usage: {data.get('usage')}")
# Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
if preview:
lines.append("Preview: " + _truncate_text(preview, limit=300))
return "\n".join(lines)
def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
path = str((args or {}).get("path") or "file").strip()
if isinstance(data, dict):
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
message = str(data.get("message") or "").strip()
replacements = data.get("replacements") or data.get("replacement_count")
lines = [f"{tool_name} completed" + (f" for `{path}`" if path else "")]
if message:
lines.append(message)
if replacements is not None:
lines.append(f"Replacements: {replacements}")
if data.get("files_modified"):
files = data.get("files_modified")
if isinstance(files, list):
lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
return "\n".join(lines)
if isinstance(result, str) and result.strip():
return _truncate_text(result, limit=3000)
return f"{tool_name} completed" + (f" for `{path}`" if path else "")
def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
if tool_name == "browser_get_images":
images = data.get("images") or data.get("data")
if isinstance(images, list):
lines = [f"Images found: {len(images)}"]
for img in images[:12]:
if isinstance(img, dict):
alt = str(img.get("alt") or "").strip()
url = str(img.get("url") or img.get("src") or "").strip()
lines.append(f"- {alt or 'image'}" + (f"{url}" if url else ""))
return _truncate_text("\n".join(lines), limit=5000)
title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
lines = [title]
if data.get("url") and data.get("url") != title:
lines.append(str(data.get("url")))
if text:
lines.extend(["", _truncate_text(text, limit=5000)])
return _truncate_text("\n".join(lines), limit=7000)
def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed"]
for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
if data.get(key):
lines.append(f"- **{key}:** {data.get(key)}")
return "\n".join(lines)
def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, (dict, list)):
return result if isinstance(result, str) and result.strip() else None
if isinstance(data, list):
lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
for item in data[:12]:
lines.append(f"- {_truncate_text(str(item), limit=240)}")
return _truncate_text("\n".join(lines), limit=5000)
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
priority_keys = (
"message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
"state", "service", "url", "path", "file_path", "count", "total", "next_run",
)
seen = set()
for key in priority_keys:
value = data.get(key)
if value in (None, "", [], {}):
continue
seen.add(key)
lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
for key, value in data.items():
if key in seen or key in {"success", "raw", "content", "entries"}:
continue
if value in (None, "", [], {}):
continue
if isinstance(value, (dict, list)):
preview = json.dumps(value, ensure_ascii=False, default=str)
else:
preview = str(value)
lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
if len(lines) >= 14:
break
content = data.get("content")
if isinstance(content, str) and content.strip():
lines.extend(["", _truncate_text(content.strip(), limit=1500)])
return _truncate_text("\n".join(lines), limit=7000)
def _build_polished_completion_content(
tool_name: str,
result: Optional[str],
function_args: Optional[Dict[str, Any]],
) -> Optional[List[Any]]:
formatter = {
"todo": lambda: _format_todo_result(result),
"read_file": lambda: _format_read_file_result(result, function_args),
"write_file": lambda: _format_edit_result(tool_name, result, function_args),
"patch": lambda: _format_edit_result(tool_name, result, function_args),
"search_files": lambda: _format_search_files_result(result),
"execute_code": lambda: _format_execute_code_result(result),
"process": lambda: _format_process_result(result, function_args),
"delegate_task": lambda: _format_delegate_result(result),
"session_search": lambda: _format_session_search_result(result),
"memory": lambda: _format_memory_result(result, function_args),
"skill_view": lambda: _format_skill_view_result(result),
"skill_manage": lambda: _format_skill_manage_result(result, function_args),
"web_search": lambda: _format_web_search_result(result),
"web_extract": lambda: _format_web_extract_result(result),
"browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
"browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
"browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
"browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
"vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
"image_generate": lambda: _format_media_or_cron_result(tool_name, result),
"cronjob": lambda: _format_media_or_cron_result(tool_name, result),
}.get(tool_name)
if formatter is None and tool_name in _POLISHED_TOOLS:
formatter = lambda: _format_generic_structured_result(tool_name, result)
if formatter is None:
return None
text = formatter()
if not text:
return None
return [_text(text)]
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
@@ -258,7 +912,11 @@ def _build_tool_complete_content(
except Exception:
pass
return [acp.tool_content(acp.text_block(display_result))]
polished_content = _build_polished_completion_content(tool_name, result, function_args)
if polished_content:
return polished_content
return [_text(display_result)]
# ---------------------------------------------------------------------------
@@ -288,7 +946,6 @@ def build_tool_start(
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "write_file":
@@ -297,32 +954,172 @@ def build_tool_start(
content = [acp.tool_diff_content(path=path, new_text=file_content)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "terminal":
command = arguments.get("command", "")
content = [acp.tool_content(acp.text_block(f"$ {command}"))]
content = [_text(f"$ {command}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "read_file":
path = arguments.get("path", "")
content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
# The title and location already identify the file. Sending a synthetic
# "Reading ..." content block makes Zed render an unhelpful Output
# section before the real file contents arrive on completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "search_files":
pattern = arguments.get("pattern", "")
target = arguments.get("target", "content")
content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
search_path = arguments.get("path")
where = f" in {search_path}" if search_path else ""
content = [_text(f"Searching for '{pattern}' ({target}){where}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "todo":
items = arguments.get("todos")
if isinstance(items, list):
preview_lines = ["Updating todo list", ""]
for item in items[:8]:
if isinstance(item, dict):
preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
if len(items) > 8:
preview_lines.append(f"... {len(items) - 8} more")
content = [_text("\n".join(preview_lines))]
else:
content = [_text("Reading todo list")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_view":
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
content = [_text(f"Loading skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_manage":
action = str(arguments.get("action") or "manage").strip() or "manage"
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
if action == "patch":
old = str(arguments.get("old_string") or "")
new = str(arguments.get("new_string") or "")
content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
elif action in {"edit", "create"}:
content = [
acp.tool_diff_content(
path=path,
new_text=str(arguments.get("content") or ""),
)
]
elif action == "write_file":
target = str(arguments.get("file_path") or "file")
content = [
acp.tool_diff_content(
path=f"skills/{name}/{target}",
new_text=str(arguments.get("file_content") or ""),
)
]
elif action in {"delete", "remove_file"}:
target = str(arguments.get("file_path") or file_path or name)
content = [_text(f"Removing {target} from skill '{name}'")]
else:
content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "execute_code":
code = str(arguments.get("code") or "").strip()
preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_extract":
# The title identifies the URL(s). Avoid a duplicate content block so
# Zed renders this like read_file: compact start, concise completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "process":
action = str(arguments.get("action") or "").strip() or "manage"
sid = str(arguments.get("session_id") or "").strip()
data_preview = str(arguments.get("data") or "").strip()
text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
if data_preview:
text += "\nInput: " + _truncate_text(data_preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "delegate_task":
tasks = arguments.get("tasks")
if isinstance(tasks, list) and tasks:
lines = [f"Delegating {len(tasks)} tasks", ""]
for i, task in enumerate(tasks[:8], 1):
if isinstance(task, dict):
goal = str(task.get("goal") or "").strip()
role = str(task.get("role") or "").strip()
lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
if len(tasks) > 8:
lines.append(f"... {len(tasks) - 8} more")
content = [_text("\n".join(lines))]
else:
goal = str(arguments.get("goal") or "").strip()
content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "session_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "memory":
action = str(arguments.get("action") or "manage").strip() or "manage"
target = str(arguments.get("target") or "memory").strip() or "memory"
preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
text = f"Memory {action} ({target})"
if preview:
text += "\nPreview: " + _truncate_text(preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name in _POLISHED_TOOLS:
try:
args_text = json.dumps(arguments, indent=2, default=str)
except (TypeError, ValueError):
args_text = str(arguments)
content = [_text(_truncate_text(args_text, limit=1200))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
# Generic fallback
@@ -334,7 +1131,7 @@ def build_tool_start(
content = [acp.tool_content(acp.text_block(args_text))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
)
@@ -347,18 +1144,22 @@ def build_tool_complete(
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if tool_name == "web_extract":
error_text = _format_web_extract_result(result)
content = [_text(error_text)] if error_text else None
else:
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
status="completed",
content=content,
raw_output=result,
raw_output=None if tool_name in _POLISHED_TOOLS else result,
)
+15 -1
View File
@@ -1241,10 +1241,24 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
if not tools:
return []
result = []
seen_names: set = set()
for t in tools:
fn = t.get("function", {})
name = fn.get("name", "")
# Defensive dedup: Anthropic rejects requests with duplicate tool
# names. Upstream injection paths already dedup, but this guard
# converts a hard API failure into a warning. See: #18478
if name and name in seen_names:
logger.warning(
"convert_tools_to_anthropic: duplicate tool name '%s' "
"— dropping second occurrence",
name,
)
continue
if name:
seen_names.add(name)
result.append({
"name": fn.get("name", ""),
"name": name,
"description": fn.get("description", ""),
"input_schema": _normalize_tool_input_schema(
fn.get("parameters", {"type": "object", "properties": {}})
+114 -12
View File
@@ -259,13 +259,68 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
"kimi-coding-cn",
})
# OpenRouter app attribution headers
_OR_HEADERS = {
# OpenRouter app attribution headers (base — always sent)
_OR_HEADERS_BASE = {
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-OpenRouter-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
# Truthy values for boolean env-var parsing.
_TRUTHY_ENV_VALUES = frozenset({"1", "true", "yes", "on"})
def build_or_headers(or_config: dict | None = None) -> dict:
"""Build OpenRouter headers, optionally including response-cache headers.
Precedence for response cache: env var > config.yaml > default (enabled).
Environment variables:
``HERMES_OPENROUTER_CACHE`` truthy (``1``/``true``/``yes``/``on``)
enables caching; ``0``/``false``/``no``/``off`` disables.
Overrides ``openrouter.response_cache`` in config.yaml.
``HERMES_OPENROUTER_CACHE_TTL`` integer seconds (1-86400).
Overrides ``openrouter.response_cache_ttl`` in config.yaml.
*or_config* is the ``openrouter`` section from config.yaml. When *None*,
falls back to reading config from disk via ``load_config()``.
"""
headers = dict(_OR_HEADERS_BASE)
# Resolve config from disk if not provided.
if or_config is None:
try:
from hermes_cli.config import load_config
or_config = load_config().get("openrouter", {})
except Exception:
or_config = {}
# Determine cache enabled: env var overrides config.
env_cache = os.environ.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
if env_cache:
cache_enabled = env_cache in _TRUTHY_ENV_VALUES
else:
cache_enabled = or_config.get("response_cache", False)
if not cache_enabled:
return headers
headers["X-OpenRouter-Cache"] = "true"
# Determine TTL: env var overrides config.
env_ttl = os.environ.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
if env_ttl:
if env_ttl.isdigit():
ttl = int(env_ttl)
if 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(ttl)
else:
ttl = or_config.get("response_cache_ttl", 300)
if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
return headers
# Vercel AI Gateway app attribution headers. HTTP-Referer maps to
# referrerUrl and X-Title maps to appName in the gateway's analytics.
from hermes_cli import __version__ as _HERMES_VERSION
@@ -1149,23 +1204,23 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Optional[str]]:
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
or_key = _pool_runtime_api_key(entry)
or_key = explicit_api_key or _pool_runtime_api_key(entry)
if not or_key:
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return OpenAI(api_key=or_key, base_url=base_url,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
or_key = os.getenv("OPENROUTER_API_KEY")
or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
def _describe_openrouter_unavailable() -> str:
@@ -1911,7 +1966,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
}
sync_base_url = str(sync_client.base_url)
if base_url_host_matches(sync_base_url, "openrouter.ai"):
async_kwargs["default_headers"] = dict(_OR_HEADERS)
async_kwargs["default_headers"] = build_or_headers()
elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
from hermes_cli.copilot_auth import copilot_request_headers
@@ -1977,6 +2032,12 @@ def resolve_provider_client(
(client, resolved_model) or (None, None) if auth is unavailable.
"""
_validate_proxy_env_urls()
# Preserve the original provider name before alias normalization so a
# user-declared ``custom_providers`` entry whose name coincidentally
# matches a built-in alias (e.g. user names their custom provider "kimi"
# which aliases to "kimi-coding") is still reachable via the named-custom
# branch below.
original_provider = (provider or "").strip().lower()
# Normalise aliases
provider = _normalize_aux_provider(provider)
@@ -2047,9 +2108,9 @@ def resolve_provider_client(
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── OpenRouter ───────────────────────────────────────────────────
# ── OpenRouter ───────────────────────────────────────────
if provider == "openrouter":
client, default = _try_openrouter()
client, default = _try_openrouter(explicit_api_key=explicit_api_key)
if client is None:
logger.warning(
"resolve_provider_client: openrouter requested but %s",
@@ -2163,7 +2224,18 @@ def resolve_provider_client(
# ── Named custom providers (config.yaml providers dict / custom_providers list) ───
try:
from hermes_cli.runtime_provider import _get_named_custom_provider
custom_entry = _get_named_custom_provider(provider)
# When the raw requested name is an alias (``kimi`` → ``kimi-coding``)
# and the user defined a ``custom_providers`` entry under that alias
# name, the custom entry is the intended target — the built-in alias
# rewriting would otherwise hijack the request. Only preferred when
# the raw name is an alias (not a canonical provider name) so custom
# entries that coincidentally match a canonical provider (e.g. ``nous``)
# still defer to the built-in per `_get_named_custom_provider`'s guard.
custom_entry = None
if original_provider and original_provider != provider:
custom_entry = _get_named_custom_provider(original_provider)
if custom_entry is None:
custom_entry = _get_named_custom_provider(provider)
if custom_entry:
custom_base = custom_entry.get("base_url", "").strip()
custom_key = custom_entry.get("api_key", "").strip()
@@ -2273,6 +2345,12 @@ def resolve_provider_client(
creds = resolve_api_key_provider_credentials(provider)
api_key = str(creds.get("api_key", "")).strip()
# Honour an explicit api_key override (e.g. from a fallback_model entry
# or a custom_providers entry) so callers that pass an explicit
# credential can authenticate against endpoints where no built-in
# credential is registered for this provider alias.
if explicit_api_key:
api_key = explicit_api_key.strip() or api_key
if not api_key:
tried_sources = list(pconfig.api_key_env_vars)
if provider == "copilot":
@@ -2284,6 +2362,11 @@ def resolve_provider_client(
raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
# Honour an explicit base_url override from the caller — used when a
# fallback_model entry (or custom_providers lookup) routes through a
# built-in provider name but targets a user-specified endpoint.
if explicit_base_url:
base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
final_model = _normalize_resolved_model(model or default_model, provider)
@@ -3209,7 +3292,26 @@ def _build_call_kwargs(
kwargs["max_tokens"] = max_tokens
if tools:
kwargs["tools"] = tools
# Defensive dedup: providers like Google Vertex, Azure, and Bedrock
# reject requests with duplicate tool names (HTTP 400). The upstream
# injection paths (run_agent.py) already dedup, but this guard
# converts a hard API failure into a warning if an upstream regression
# reintroduces duplicates. See: #18478
_seen: set = set()
_deduped: list = []
for _t in tools:
_tname = (_t.get("function") or {}).get("name", "")
if _tname and _tname in _seen:
logger.warning(
"_build_call_kwargs: duplicate tool name '%s' removed "
"(provider=%s model=%s)",
_tname, provider, model,
)
continue
if _tname:
_seen.add(_tname)
_deduped.append(_t)
kwargs["tools"] = _deduped
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
+3 -3
View File
@@ -538,7 +538,7 @@ class ContextCompressor(ContextEngine):
# Token-budget approach: walk backward accumulating tokens
accumulated = 0
boundary = len(result)
min_protect = min(protect_tail_count, len(result) - 1)
min_protect = min(protect_tail_count, len(result))
for i in range(len(result) - 1, -1, -1):
msg = result[i]
raw_content = msg.get("content") or ""
@@ -992,8 +992,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
def _get_tool_call_id(tc) -> str:
"""Extract the call ID from a tool_call entry (dict or SimpleNamespace)."""
if isinstance(tc, dict):
return tc.get("id", "")
return getattr(tc, "id", "") or ""
return tc.get("call_id", "") or tc.get("id", "") or ""
return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""
def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Fix orphaned tool_call / tool_result pairs after compression.
+17 -6
View File
@@ -3,6 +3,7 @@
from __future__ import annotations
import logging
import os
import random
import threading
import time
@@ -13,7 +14,7 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value
from hermes_cli.config import get_env_value, load_env
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1380,6 +1381,16 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
# Prefer ~/.hermes/.env over os.environ — the user's config file is the
# authoritative source for Hermes credentials. Stale env vars from parent
# processes (Codex CLI, test scripts, etc.) should not override deliberate
# changes to the .env file.
def _get_env_prefer_dotenv(key: str) -> str:
env_file = load_env()
val = env_file.get(key) or os.environ.get(key) or ""
return val.strip()
# Honour user suppression — `hermes auth remove <provider> <N>` for an
# env-seeded credential marks the env:<VAR> source as suppressed so it
# won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1391,8 +1402,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
if provider == "openrouter":
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
if token:
source = "env:OPENROUTER_API_KEY"
if _is_source_suppressed(provider, source):
@@ -1418,7 +1429,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = ""
if pconfig.base_url_env_var:
env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")
env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
@@ -1429,8 +1440,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
]
for env_var in env_vars:
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value(env_var) or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv(env_var)
if not token:
continue
source = f"env:{env_var}"
+296 -12
View File
@@ -55,6 +55,7 @@ def _default_state() -> Dict[str, Any]:
"last_run_at": None,
"last_run_duration_seconds": None,
"last_run_summary": None,
"last_report_path": None,
"paused": False,
"run_count": 0,
}
@@ -183,7 +184,16 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
Gates:
- curator.enabled == True
- not paused
- last_run_at missing, OR older than interval_hours
- last_run_at present AND older than interval_hours
First-run behavior: when there is no ``last_run_at`` (fresh install, or
install that predates the curator), we DO NOT run immediately. The
curator is designed to run after at least ``interval_hours`` (7 days by
default) of skill activity, not on the first background tick after
``hermes update``. On first observation we seed ``last_run_at`` to "now"
and defer the first real pass by one full interval. Users who want to
run it sooner can always invoke ``hermes curator run`` (with or without
``--dry-run``) explicitly that path bypasses this gate.
The idle check (min_idle_hours) is applied at the call site where we know
whether an agent is actively running here we only enforce the static
@@ -197,7 +207,21 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
state = load_state()
last = _parse_iso(state.get("last_run_at"))
if last is None:
return True
# Never run before. Seed state so we wait a full interval before the
# first real pass. Report-only; do not auto-mutate the library the
# very first time a gateway ticks after an update.
if now is None:
now = datetime.now(timezone.utc)
try:
state["last_run_at"] = now.isoformat()
state["last_run_summary"] = (
"deferred first run — curator seeded, will run after one "
"interval; use `hermes curator run --dry-run` to preview now"
)
save_state(state)
except Exception as e: # pragma: no cover — best-effort persistence
logger.debug("Failed to seed curator last_run_at: %s", e)
return False
if now is None:
now = datetime.now(timezone.utc)
@@ -258,6 +282,33 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
# Review prompt for the forked agent
# ---------------------------------------------------------------------------
CURATOR_DRY_RUN_BANNER = (
"═══════════════════════════════════════════════════════════════\n"
"DRY-RUN — REPORT ONLY. DO NOT MUTATE THE SKILL LIBRARY.\n"
"═══════════════════════════════════════════════════════════════\n"
"\n"
"This is a PREVIEW pass. Follow every instruction below EXCEPT:\n"
"\n"
" • DO NOT call skill_manage with action=patch, create, delete, "
"write_file, or remove_file.\n"
" • DO NOT call terminal to mv skill directories into .archive/.\n"
" • DO NOT call terminal to mv, cp, rm, or rewrite any file under "
"~/.hermes/skills/.\n"
" • skills_list and skill_view are FINE — read as much as you need.\n"
"\n"
"Your output IS the deliverable. Produce the exact same "
"human-readable summary and structured YAML block you would "
"produce on a live run — but describe the actions you WOULD take, "
"not actions you took. A downstream reviewer will read the report "
"and decide whether to approve a live run with "
"`hermes curator run` (no flag).\n"
"\n"
"If you accidentally take a mutating action, say so explicitly in "
"the summary so the reviewer can revert it.\n"
"═══════════════════════════════════════════════════════════════"
)
CURATOR_REVIEW_PROMPT = (
"You are running as Hermes' background skill CURATOR. This is an "
"UMBRELLA-BUILDING consolidation pass, not a passive audit and not a "
@@ -336,6 +387,11 @@ CURATOR_REVIEW_PROMPT = (
" - skill_manage action=write_file — add a references/, templates/, "
"or scripts/ file under an existing skill (the skill must already "
"exist)\n"
" - skill_manage action=delete — archive a skill. MUST pass "
"`absorbed_into=<umbrella>` when you've merged its content into another "
"skill, or `absorbed_into=\"\"` when you're truly pruning with no "
"forwarding target. This drives cron-job skill-reference migration — "
"guessing from your YAML summary after the fact is fragile.\n"
" - terminal — mv a sibling into the archive "
"OR move its content into a support subfile\n\n"
"'keep' is a legitimate decision ONLY when the skill is already a "
@@ -586,15 +642,76 @@ def _parse_structured_summary(
return out
def _extract_absorbed_into_declarations(
tool_calls: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
"""Walk this run's tool calls and extract model-declared absorption targets.
The curator prompt requires every ``skill_manage(action='delete')`` call
to pass ``absorbed_into=<umbrella>`` when consolidating, or
``absorbed_into=""`` when truly pruning. This is the single authoritative
signal for classification the model's own declaration at the moment of
deletion, which beats both post-hoc YAML summary parsing and substring
heuristics on other tool calls.
Returns ``{skill_name: {"into": "<umbrella>" | "", "declared": True}}``.
Entries with ``into == ""`` are explicit prunings.
Skills without a ``skill_manage(delete)`` call, or with one that omitted
``absorbed_into``, are not in the returned dict caller falls back to
the existing heuristic/YAML logic for those (backward compat with older
curator runs and any callers that don't populate the arg).
"""
out: Dict[str, Dict[str, Any]] = {}
for tc in tool_calls or []:
if not isinstance(tc, dict):
continue
if tc.get("name") != "skill_manage":
continue
raw = tc.get("arguments") or ""
args: Dict[str, Any] = {}
if isinstance(raw, dict):
args = raw
elif isinstance(raw, str):
try:
args = json.loads(raw)
except Exception:
continue
if not isinstance(args, dict):
continue
if args.get("action") != "delete":
continue
name = args.get("name")
if not isinstance(name, str) or not name.strip():
continue
# absorbed_into must be present (even empty string is meaningful);
# missing key means the model didn't declare intent.
if "absorbed_into" not in args:
continue
target = args.get("absorbed_into")
if target is None:
continue
if not isinstance(target, str):
continue
out[name.strip()] = {"into": target.strip(), "declared": True}
return out
def _reconcile_classification(
removed: List[str],
heuristic: Dict[str, List[Dict[str, Any]]],
model_block: Dict[str, List[Dict[str, str]]],
destinations: Set[str],
absorbed_declarations: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, List[Dict[str, Any]]]:
"""Merge heuristic (tool-call evidence) with the model's structured block.
Rules:
Rules (evaluated in order; first match wins):
- **Model-declared `absorbed_into` at delete time is authoritative.** Any
entry in ``absorbed_declarations`` beats every other signal. This is
the model telling us directly, at the moment of deletion, what it did.
``into != ""`` and target exists consolidated. ``into == ""``
pruned. ``into != ""`` but target doesn't exist → hallucination; fall
through to the usual signals.
- Model-declared consolidation wins when its ``into`` target exists
in ``destinations`` (survived or newly-created). This gives the
model authority over intent + rationale.
@@ -615,6 +732,8 @@ def _reconcile_classification(
model_cons = {e["from"]: e for e in model_block.get("consolidations", [])}
model_pruned = {e["name"]: e for e in model_block.get("prunings", [])}
declared = absorbed_declarations or {}
consolidated: List[Dict[str, Any]] = []
pruned: List[Dict[str, Any]] = []
@@ -622,6 +741,36 @@ def _reconcile_classification(
mc = model_cons.get(name)
mp = model_pruned.get(name)
hc = heur_cons.get(name)
dec = declared.get(name)
# Authoritative: model declared `absorbed_into` at the delete call.
if dec is not None:
into_claim = dec.get("into", "")
if into_claim and into_claim in destinations:
entry: Dict[str, Any] = {
"name": name,
"into": into_claim,
"source": "absorbed_into (model-declared at delete)",
"reason": (mc.get("reason") or "") if mc else "",
}
if hc and hc.get("evidence"):
entry["evidence"] = hc["evidence"]
consolidated.append(entry)
continue
if into_claim == "":
# Explicit prune declaration
pruned.append({
"name": name,
"source": "absorbed_into=\"\" (model-declared prune)",
"reason": (mp.get("reason") or "") if mp else "",
})
continue
# into_claim is non-empty but target doesn't exist: the model
# named a nonexistent umbrella at delete time. The tool already
# rejects this at the skill_manage layer, so we shouldn't see it
# in practice — but if it slips through (e.g. the umbrella was
# deleted LATER in the same run), fall through to the usual
# signals rather than trusting a broken reference.
# Model says consolidated — trust it if the destination is real.
if mc and mc.get("into") in destinations:
@@ -757,15 +906,57 @@ def _write_run_report(
)
model_block = _parse_structured_summary(llm_meta.get("final", "") or "")
destinations = set(after_names) | set(added or [])
# Authoritative signal: extract per-delete `absorbed_into` declarations
# from this run's tool calls. These beat both the YAML summary block and
# the substring heuristic — the model is telling us directly, at the
# moment of deletion, whether each archived skill was consolidated
# (into=<umbrella>) or pruned (into="").
absorbed_declarations = _extract_absorbed_into_declarations(
llm_meta.get("tool_calls", []) or []
)
classification = _reconcile_classification(
removed=removed,
heuristic=heuristic,
model_block=model_block,
destinations=destinations,
absorbed_declarations=absorbed_declarations,
)
consolidated = classification["consolidated"]
pruned = classification["pruned"]
# Rewrite cron job skill references. When the curator consolidates
# skill X into umbrella Y, any cron job that lists X fails to load
# it at run time — the scheduler skips it and the job runs without
# the instructions it was scheduled to follow. Rewriting the
# references in-place keeps scheduled jobs working across
# consolidation passes. Best-effort: never let a cron-module issue
# break the curator.
cron_rewrites: Dict[str, Any] = {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
try:
consolidated_map = {
e["name"]: e["into"]
for e in consolidated
if isinstance(e, dict) and e.get("name") and e.get("into")
}
pruned_names = [
e["name"] for e in pruned
if isinstance(e, dict) and e.get("name")
]
if consolidated_map or pruned_names:
from cron.jobs import rewrite_skill_refs as _rewrite_cron_refs
cron_rewrites = _rewrite_cron_refs(
consolidated=consolidated_map,
pruned=pruned_names,
)
except Exception as e:
logger.debug("Curator cron skill rewrite failed: %s", e, exc_info=True)
cron_rewrites = {
"rewrites": [],
"jobs_updated": 0,
"jobs_scanned": 0,
"error": str(e),
}
payload = {
"started_at": started_at.isoformat(),
"duration_seconds": round(elapsed_seconds, 2),
@@ -781,6 +972,7 @@ def _write_run_report(
"consolidated_this_run": len(consolidated),
"pruned_this_run": len(pruned),
"state_transitions": len(transitions),
"cron_jobs_rewritten": int(cron_rewrites.get("jobs_updated", 0)),
"tool_calls_total": sum(tc_counts.values()),
},
"tool_call_counts": tc_counts,
@@ -790,6 +982,7 @@ def _write_run_report(
"pruned_names": [p["name"] for p in pruned],
"added": added,
"state_transitions": transitions,
"cron_rewrites": cron_rewrites,
"llm_final": llm_meta.get("final", ""),
"llm_summary": llm_meta.get("summary", ""),
"llm_error": llm_meta.get("error"),
@@ -812,6 +1005,17 @@ def _write_run_report(
except Exception as e:
logger.debug("Curator REPORT.md write failed: %s", e)
# cron_rewrites.json — only when at least one job was touched, to
# keep run dirs uncluttered for the common no-op case.
try:
if int(cron_rewrites.get("jobs_updated", 0)) > 0:
(run_dir / "cron_rewrites.json").write_text(
json.dumps(cron_rewrites, indent=2, ensure_ascii=False) + "\n",
encoding="utf-8",
)
except Exception as e:
logger.debug("Curator cron_rewrites.json write failed: %s", e)
return run_dir
@@ -942,6 +1146,39 @@ def _render_report_markdown(p: Dict[str, Any]) -> str:
lines.append(f"- `{t.get('name')}`: {t.get('from')}{t.get('to')}")
lines.append("")
# Cron job rewrites — show which scheduled jobs had their skill
# references updated so users can audit that the auto-rewrite did
# the right thing. Only present when at least one job changed.
cron_rw = p.get("cron_rewrites") or {}
cron_rewrites_list = cron_rw.get("rewrites") or []
if cron_rewrites_list:
lines.append(f"### Cron job skill references rewritten ({len(cron_rewrites_list)})\n")
lines.append(
"_Cron jobs that referenced a consolidated or pruned skill were "
"updated in-place so they keep loading the right instructions "
"on their next run. See `cron_rewrites.json` for the full record._\n"
)
SHOW = 25
for entry in cron_rewrites_list[:SHOW]:
job_name = entry.get("job_name") or entry.get("job_id") or "?"
before = entry.get("before") or []
after = entry.get("after") or []
mapped = entry.get("mapped") or {}
dropped = entry.get("dropped") or []
lines.append(
f"- `{job_name}`: `{', '.join(before)}` → `{', '.join(after) or '(none)'}`"
)
for old, new in mapped.items():
lines.append(f" - `{old}` → `{new}` (consolidated)")
for name in dropped:
lines.append(f" - `{name}` dropped (pruned)")
if len(cron_rewrites_list) > SHOW:
lines.append(
f"- … and {len(cron_rewrites_list) - SHOW} more "
"(see `cron_rewrites.json`)"
)
lines.append("")
# Full LLM final response
final = (p.get("llm_final") or "").strip()
if final:
@@ -992,6 +1229,7 @@ def _render_candidate_list() -> str:
def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
dry_run: bool = False,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
@@ -1004,9 +1242,43 @@ def run_curator_review(
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
and the LLM review pass is instructed to produce a report only no
skill_manage mutations, no terminal archive moves. The REPORT.md still
gets written and ``state.last_report_path`` still records it so users
can read what the curator WOULD have done.
"""
start = datetime.now(timezone.utc)
counts = apply_automatic_transitions(now=start)
if dry_run:
# Count candidates without mutating state.
try:
report = skill_usage.agent_created_report()
counts = {
"checked": len(report),
"marked_stale": 0,
"archived": 0,
"reactivated": 0,
}
except Exception:
counts = {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
else:
# Pre-mutation snapshot — best-effort, never blocks the run. A
# failed snapshot logs at debug and continues (the alternative is
# that a transient disk issue silently disables curator forever,
# which is worse). Users who want to require snapshots can disable
# curator entirely until they can fix disk space.
try:
from agent import curator_backup
snap = curator_backup.snapshot_skills(reason="pre-curator-run")
if snap is not None and on_summary:
try:
on_summary(f"curator: snapshot created ({snap.name})")
except Exception:
pass
except Exception as e:
logger.debug("Curator pre-run snapshot failed: %s", e, exc_info=True)
counts = apply_automatic_transitions(now=start)
auto_summary_parts = []
if counts["marked_stale"]:
@@ -1018,11 +1290,16 @@ def run_curator_review(
auto_summary = ", ".join(auto_summary_parts) if auto_summary_parts else "no changes"
# Persist state before the LLM pass so a crash mid-review still records
# the run and doesn't immediately re-trigger.
# the run and doesn't immediately re-trigger. In dry-run we do NOT bump
# last_run_at or run_count — a preview shouldn't push the next scheduled
# real pass out. We still record a summary so `hermes curator status`
# shows that a preview ran.
state = load_state()
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
state["last_run_summary"] = f"auto: {auto_summary}"
if not dry_run:
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
prefix = "dry-run auto: " if dry_run else "auto: "
state["last_run_summary"] = f"{prefix}{auto_summary}"
save_state(state)
def _llm_pass():
@@ -1038,7 +1315,7 @@ def run_curator_review(
try:
candidate_list = _render_candidate_list()
if "No agent-created skills" in candidate_list:
final_summary = f"auto: {auto_summary}; llm: skipped (no candidates)"
final_summary = f"{prefix}{auto_summary}; llm: skipped (no candidates)"
llm_meta = {
"final": "",
"summary": "skipped (no candidates)",
@@ -1048,14 +1325,21 @@ def run_curator_review(
"error": None,
}
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
if dry_run:
prompt = (
f"{CURATOR_DRY_RUN_BANNER}\n\n"
f"{CURATOR_REVIEW_PROMPT}\n\n"
f"{candidate_list}"
)
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
llm_meta = _run_llm_review(prompt)
final_summary = (
f"auto: {auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
f"{prefix}{auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
)
except Exception as e:
logger.debug("Curator LLM pass failed: %s", e, exc_info=True)
final_summary = f"auto: {auto_summary}; llm: error ({e})"
final_summary = f"{prefix}{auto_summary}; llm: error ({e})"
llm_meta = {
"final": "",
"summary": f"error ({e})",
+693
View File
@@ -0,0 +1,693 @@
"""Curator snapshot + rollback.
A pre-run snapshot of ``~/.hermes/skills/`` (excluding ``.curator_backups/``
itself) is taken before any mutating curator pass. Snapshots are tar.gz
files under ``~/.hermes/skills/.curator_backups/<utc-iso>/`` with a
companion ``manifest.json`` describing the snapshot (reason, time, size,
counted skill files). Rollback picks a snapshot, moves the current
``skills/`` tree aside into another snapshot so even the rollback itself
is undoable, then extracts the chosen snapshot into place.
The snapshot does NOT include:
- ``.curator_backups/`` (would recurse)
- ``.hub/`` (hub-installed skills managed by the hub, not us)
It DOES include:
- all SKILL.md files + their directories (``scripts/``, ``references/``,
``templates/``, ``assets/``)
- ``.usage.json`` (usage telemetry needed to rehydrate state cleanly)
- ``.archive/`` (so rollback restores previously-archived skills too)
- ``.curator_state`` (so rolling back also restores the last-run-at
pointer otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
Alongside the skills tarball, each snapshot also captures a copy of
``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
jobs reference skills by name in their ``skills``/``skill`` fields; the
curator's consolidation pass rewrites those in place via
``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
rolling back the skills tree would leave cron jobs pointing at the
umbrella skills even though the narrow skills they were originally
configured with have been restored. We store the whole jobs.json for
fidelity but rollback only touches the ``skills``/``skill`` fields the
rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
we leave it alone.
"""
from __future__ import annotations
import json
import logging
import os
import re
import shutil
import tarfile
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
DEFAULT_KEEP = 5
# Entries under skills/ that should NEVER be rolled up into a snapshot.
# .hub/ is managed by the skills hub; rolling it back would break lockfile
# invariants. .curator_backups is the backup dir itself — recursion bomb.
_EXCLUDE_TOP_LEVEL = {".curator_backups", ".hub"}
# Snapshot id regex: UTC ISO with colons replaced by dashes so the filename
# is portable (Windows-safe). An optional ``-NN`` suffix handles two
# snapshots landing in the same wallclock second.
_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")
def _backups_dir() -> Path:
return get_hermes_home() / "skills" / ".curator_backups"
def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _cron_jobs_file() -> Path:
"""Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
return get_hermes_home() / "cron" / "jobs.json"
CRON_JOBS_FILENAME = "cron-jobs.json"
def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
"""Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
Returns a small dict describing what was captured so the caller can
fold it into the manifest. Never raises if the cron file is missing
or unreadable, the return dict has ``backed_up=False`` and the reason,
and the snapshot proceeds without cron data (the snapshot is still
useful for rolling back skills).
"""
src = _cron_jobs_file()
info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
if not src.exists():
info["reason"] = "no cron/jobs.json present"
return info
try:
raw = src.read_text(encoding="utf-8")
except OSError as e:
logger.debug("Failed to read cron/jobs.json for backup: %s", e)
info["reason"] = f"read error: {e}"
return info
# Count jobs as a nice diagnostic — but don't fail the snapshot if the
# file is unparseable; just store the raw text and let rollback deal
# with it (or not, if it's corrupted). jobs.json wraps the list as
# `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
# fall back to bare-list shape just in case the format ever changes.
try:
parsed = json.loads(raw)
if isinstance(parsed, dict):
inner = parsed.get("jobs")
if isinstance(inner, list):
info["jobs_count"] = len(inner)
elif isinstance(parsed, list):
info["jobs_count"] = len(parsed)
except (json.JSONDecodeError, TypeError):
info["jobs_count"] = 0
info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
try:
(dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
except OSError as e:
logger.debug("Failed to write cron backup file: %s", e)
info["reason"] = f"write error: {e}"
return info
info["backed_up"] = True
return info
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
now = datetime.now(timezone.utc)
# isoformat → "2026-05-01T13:05:42.123456+00:00"; strip subseconds and tz.
s = now.replace(microsecond=0).isoformat()
if s.endswith("+00:00"):
s = s[:-6]
return s.replace(":", "-") + "Z"
def _load_config() -> Dict[str, Any]:
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e:
logger.debug("Failed to load config for curator backup: %s", e)
return {}
if not isinstance(cfg, dict):
return {}
cur = cfg.get("curator") or {}
if not isinstance(cur, dict):
return {}
bk = cur.get("backup") or {}
return bk if isinstance(bk, dict) else {}
def is_enabled() -> bool:
"""Default ON — the whole point of the backup is safety by default."""
return bool(_load_config().get("enabled", True))
def get_keep() -> int:
cfg = _load_config()
try:
n = int(cfg.get("keep", DEFAULT_KEEP))
except (TypeError, ValueError):
n = DEFAULT_KEEP
return max(1, n)
# ---------------------------------------------------------------------------
# Snapshot
# ---------------------------------------------------------------------------
def _count_skill_files(base: Path) -> int:
try:
return sum(1 for _ in base.rglob("SKILL.md"))
except OSError:
return 0
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int,
cron_info: Optional[Dict[str, Any]] = None) -> None:
manifest = {
"id": dest.name,
"reason": reason,
"created_at": datetime.now(timezone.utc).isoformat(),
"archive": archive_path.name,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
if cron_info is not None:
manifest["cron_jobs"] = {
"backed_up": bool(cron_info.get("backed_up", False)),
"jobs_count": int(cron_info.get("jobs_count", 0)),
}
if not cron_info.get("backed_up"):
manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
if cron_info.get("parse_warning"):
manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
def snapshot_skills(reason: str = "manual") -> Optional[Path]:
"""Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
Returns the snapshot directory path, or ``None`` if the snapshot was
skipped (backup disabled, skills dir missing, or an IO error occurred
in which case we log at debug and return None so the curator never
aborts a pass because of a backup failure).
"""
if not is_enabled():
logger.debug("Curator backup disabled by config; skipping snapshot")
return None
skills = _skills_dir()
if not skills.exists():
logger.debug("No ~/.hermes/skills/ directory — nothing to back up")
return None
backups = _backups_dir()
try:
backups.mkdir(parents=True, exist_ok=True)
except OSError as e:
logger.debug("Failed to create backups dir %s: %s", backups, e)
return None
# Uniquify: if a snapshot with the same second already exists (can
# happen if two curator runs fire in the same second), append a short
# counter. Avoids clobbering and avoids timestamp collisions.
base_id = _utc_id()
snap_id = base_id
counter = 1
while (backups / snap_id).exists():
snap_id = f"{base_id}-{counter:02d}"
counter += 1
dest = backups / snap_id
try:
dest.mkdir(parents=True, exist_ok=False)
except OSError as e:
logger.debug("Failed to create snapshot dir %s: %s", dest, e)
return None
archive = dest / "skills.tar.gz"
try:
# Stream into the tarball — no tempdir copy needed.
with tarfile.open(archive, "w:gz", compresslevel=6) as tf:
for entry in sorted(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
# Capture cron/jobs.json alongside the tarball. Never fails the
# snapshot — the skills side is the core guarantee; cron is
# additive. We still record in the manifest whether it was
# captured so rollback can surface "no cron data in this snapshot".
cron_info = _backup_cron_jobs_into(dest)
_write_manifest(dest, reason, archive,
_count_skill_files(skills),
cron_info=cron_info)
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
try:
shutil.rmtree(dest, ignore_errors=True)
except OSError:
pass
return None
_prune_old(keep=get_keep())
logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
return dest
def _prune_old(keep: int) -> List[str]:
"""Delete regular snapshots beyond the newest *keep*. Returns deleted
ids. Staging dirs (``.rollback-staging-*``) are implementation detail
and pruned independently on every call."""
backups = _backups_dir()
if not backups.exists():
return []
entries: List[Tuple[str, Path]] = []
stale_staging: List[Path] = []
for child in backups.iterdir():
if not child.is_dir():
continue
if child.name.startswith(".rollback-staging-"):
# Staging dirs are only supposed to exist briefly during a
# rollback. If we find one here (e.g. from a crashed rollback),
# clean it up opportunistically.
stale_staging.append(child)
continue
if _ID_RE.match(child.name):
entries.append((child.name, child))
# Newest first (lexicographic works because the id is UTC ISO).
entries.sort(key=lambda t: t[0], reverse=True)
deleted: List[str] = []
for _, path in entries[keep:]:
try:
shutil.rmtree(path)
deleted.append(path.name)
except OSError as e:
logger.debug("Failed to prune %s: %s", path, e)
for path in stale_staging:
try:
shutil.rmtree(path)
except OSError as e:
logger.debug("Failed to clean stale staging dir %s: %s", path, e)
return deleted
# ---------------------------------------------------------------------------
# List + rollback
# ---------------------------------------------------------------------------
def _read_manifest(snap_dir: Path) -> Dict[str, Any]:
mf = snap_dir / "manifest.json"
if not mf.exists():
return {}
try:
return json.loads(mf.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return {}
def list_backups() -> List[Dict[str, Any]]:
"""Return all restorable snapshots, newest first. Only entries with a
real ``skills.tar.gz`` tarball are listed transient
``.rollback-staging-*`` directories created mid-rollback are
implementation detail and not shown."""
backups = _backups_dir()
if not backups.exists():
return []
out: List[Dict[str, Any]] = []
for child in sorted(backups.iterdir(), reverse=True):
if not child.is_dir():
continue
if not _ID_RE.match(child.name):
continue
if not (child / "skills.tar.gz").exists():
continue
mf = _read_manifest(child)
mf.setdefault("id", child.name)
mf.setdefault("path", str(child))
if "archive_bytes" not in mf:
arc = child / "skills.tar.gz"
try:
mf["archive_bytes"] = arc.stat().st_size
except OSError:
mf["archive_bytes"] = 0
out.append(mf)
return out
def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
"""Return the path of the requested backup, or the newest one if
*backup_id* is None. Returns None if no match."""
backups = _backups_dir()
if not backups.exists():
return None
if backup_id:
target = backups / backup_id
if (
target.is_dir()
and _ID_RE.match(backup_id)
and (target / "skills.tar.gz").exists()
):
return target
return None
candidates = [
c for c in sorted(backups.iterdir(), reverse=True)
if c.is_dir() and _ID_RE.match(c.name) and (c / "skills.tar.gz").exists()
]
return candidates[0] if candidates else None
def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
"""Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
We do NOT overwrite the whole cron file. Only the ``skills`` and
``skill`` fields are restored, and only on jobs that still exist in the
current file (matched by ``id``). Everything else about the job
schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks
is live state that the user/scheduler has modified since the snapshot;
overwriting it would regress unrelated cron activity.
Rules:
- Jobs present in backup AND live, with differing skills skills restored.
- Jobs present in backup AND live, with matching skills no-op.
- Jobs present in backup but gone from live (user deleted the job
after the snapshot) skipped, noted in the return report.
- Jobs present in live but not in backup (user created a new cron
job after the snapshot) left untouched.
Never raises; failures are captured in the return dict. Writes through
``cron.jobs`` to pick up the same lock + atomic-write path that tick()
uses, so we don't race the scheduler.
"""
report: Dict[str, Any] = {
"attempted": False,
"restored": [],
"skipped_missing": [],
"unchanged": 0,
"error": None,
}
backup_file = snapshot_dir / CRON_JOBS_FILENAME
if not backup_file.exists():
report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
return report
try:
backup_text = backup_file.read_text(encoding="utf-8")
backup_parsed = json.loads(backup_text)
except (OSError, json.JSONDecodeError) as e:
report["error"] = f"failed to load backed-up jobs: {e}"
return report
# jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
# that shape and a bare list for forward compat.
if isinstance(backup_parsed, dict):
backup_jobs = backup_parsed.get("jobs")
elif isinstance(backup_parsed, list):
backup_jobs = backup_parsed
else:
backup_jobs = None
if not isinstance(backup_jobs, list):
report["error"] = "backed-up cron-jobs.json has no jobs list"
return report
# Build a lookup of the backed-up skill state keyed by job id.
# We only need the two skill-ish fields (legacy single and modern list).
backup_by_id: Dict[str, Dict[str, Any]] = {}
for job in backup_jobs:
if not isinstance(job, dict):
continue
jid = job.get("id")
if not isinstance(jid, str) or not jid:
continue
backup_by_id[jid] = {
"skills": job.get("skills"),
"skill": job.get("skill"),
"name": job.get("name") or jid,
}
if not backup_by_id:
report["attempted"] = True # we tried but there was nothing to do
return report
# Load and rewrite the live jobs under the scheduler's lock.
try:
from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
except ImportError as e:
report["error"] = f"cron module unavailable: {e}"
return report
report["attempted"] = True
try:
with _jobs_file_lock:
live_jobs = load_jobs()
changed = False
live_ids = set()
for live in live_jobs:
if not isinstance(live, dict):
continue
jid = live.get("id")
if not isinstance(jid, str) or not jid:
continue
live_ids.add(jid)
backup = backup_by_id.get(jid)
if backup is None:
continue # live job didn't exist at snapshot time
cur_skills = live.get("skills")
cur_skill = live.get("skill")
bkp_skills = backup.get("skills")
bkp_skill = backup.get("skill")
if cur_skills == bkp_skills and cur_skill == bkp_skill:
report["unchanged"] += 1
continue
# Restore. Preserve absence (don't force the key to appear
# if the backup didn't have it either).
if bkp_skills is None:
live.pop("skills", None)
else:
live["skills"] = bkp_skills
if bkp_skill is None:
live.pop("skill", None)
else:
live["skill"] = bkp_skill
report["restored"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
"from": {"skills": cur_skills, "skill": cur_skill},
"to": {"skills": bkp_skills, "skill": bkp_skill},
})
changed = True
# Jobs in backup but not in live = user deleted them after snapshot
for jid, backup in backup_by_id.items():
if jid not in live_ids:
report["skipped_missing"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
})
if changed:
save_jobs(live_jobs)
except Exception as e: # noqa: BLE001 — rollback must not die mid-restore
logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
report["error"] = f"restore failed mid-flight: {e}"
return report
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
Strategy:
1. Resolve the target snapshot (explicit id or newest regular).
2. Take a safety snapshot of the CURRENT skills tree under
``.curator_backups/pre-rollback-<ts>/`` so the rollback itself is
undoable.
3. Move all current top-level entries (except ``.curator_backups``
and ``.hub``) into a tempdir.
4. Extract the chosen snapshot into ``~/.hermes/skills/``.
5. On failure during 4, move the tempdir contents back (best-effort)
and return failure.
Returns ``(ok, message, snapshot_path)``.
"""
target = _resolve_backup(backup_id)
if target is None:
return (
False,
f"no matching backup found"
+ (f" for id '{backup_id}'" if backup_id else "")
+ " (use `hermes curator rollback --list` to see available snapshots)",
None,
)
archive = target / "skills.tar.gz"
if not archive.exists():
return (False, f"snapshot {target.name} has no skills.tar.gz — corrupted?", None)
skills = _skills_dir()
skills.mkdir(parents=True, exist_ok=True)
backups = _backups_dir()
backups.mkdir(parents=True, exist_ok=True)
# Step 2: safety snapshot of current state FIRST. If this fails we bail
# out before touching anything — otherwise a failed extract could leave
# the user with no skills.
try:
snapshot_skills(reason=f"pre-rollback to {target.name}")
except Exception as e:
return (False, f"pre-rollback safety snapshot failed: {e}", None)
# Additionally move current entries into an internal staging dir so
# the extract happens into an empty skills tree (predictable result).
# This dir is implementation detail — not listed as a restorable
# backup. The safety snapshot above is the user-facing undo handle.
staged = backups / f".rollback-staging-{_utc_id()}"
try:
staged.mkdir(parents=True, exist_ok=False)
except OSError as e:
return (False, f"failed to create staging dir: {e}", None)
moved: List[Tuple[Path, Path]] = []
try:
for entry in list(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
dest = staged / entry.name
shutil.move(str(entry), str(dest))
moved.append((entry, dest))
except OSError as e:
# Best-effort rollback of the move
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"failed to stage current skills: {e}", None)
# Step 4: extract the snapshot into skills/
try:
with tarfile.open(archive, "r:gz") as tf:
# Python 3.12+ supports filter='data' for safer extraction.
# Fall back to the unfiltered call for older interpreters but
# still reject absolute paths and .. components defensively.
for member in tf.getmembers():
name = member.name
if name.startswith("/") or ".." in Path(name).parts:
raise tarfile.TarError(
f"refusing to extract unsafe path: {name!r}"
)
try:
tf.extractall(str(skills), filter="data") # type: ignore[call-arg]
except TypeError:
# Python < 3.12 — no filter kwarg
tf.extractall(str(skills))
except (OSError, tarfile.TarError) as e:
# Best-effort recover: move staged contents back
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"snapshot extract failed (state restored): {e}", None)
# Extract succeeded — the staging dir has served its purpose. The
# user's undo handle is the safety snapshot tarball we took earlier.
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
# Reconcile cron skill-links. Surgical: only the skills/skill fields
# on jobs matched by id. Everything else in jobs.json is live state
# (schedule, next_run_at, enabled, prompt, etc.) and we leave it
# alone. Failures here don't fail the overall rollback — the skills
# tree is already restored, which is the main guarantee.
cron_report = _restore_cron_skill_links(target)
summary_bits = [f"restored from snapshot {target.name}"]
if cron_report.get("attempted"):
restored_n = len(cron_report.get("restored") or [])
skipped_n = len(cron_report.get("skipped_missing") or [])
if cron_report.get("error"):
summary_bits.append(f"cron links: error — {cron_report['error']}")
elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
# Attempted but nothing matched — empty snapshot or no overlapping ids.
pass
else:
parts = []
if restored_n:
parts.append(f"{restored_n} job(s) had skill links restored")
if skipped_n:
parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
if cron_report.get("unchanged"):
parts.append(f"{cron_report['unchanged']} already matched")
summary_bits.append("cron links: " + ", ".join(parts))
logger.info("Curator rollback: restored from %s (cron_report=%s)",
target.name, cron_report)
return (True, "; ".join(summary_bits), target)
# ---------------------------------------------------------------------------
# Human-readable summary for CLI
# ---------------------------------------------------------------------------
def format_size(n: int) -> str:
for unit in ("B", "KB", "MB", "GB"):
if n < 1024 or unit == "GB":
return f"{n:.1f} {unit}" if unit != "B" else f"{n} B"
n /= 1024
return f"{n:.1f} GB"
def summarize_backups() -> str:
rows = list_backups()
if not rows:
return "No curator snapshots yet."
lines = [f"{'id':<24} {'reason':<40} {'skills':>6} {'size':>8}"]
lines.append("" * len(lines[0]))
for r in rows:
lines.append(
f"{r.get('id','?'):<24} "
f"{(r.get('reason','?') or '?')[:40]:<40} "
f"{r.get('skill_files', 0):>6} "
f"{format_size(int(r.get('archive_bytes', 0))):>8}"
)
return "\n".join(lines)
+5 -5
View File
@@ -20,25 +20,25 @@ def summarize_manual_compression(
headline = f"No changes from compression: {before_count} messages"
if after_tokens == before_tokens:
token_line = (
f"Rough transcript estimate: ~{before_tokens:,} tokens (unchanged)"
f"Approx request size: ~{before_tokens:,} tokens (unchanged)"
)
else:
token_line = (
f"Rough transcript estimate: ~{before_tokens:,}"
f"Approx request size: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
else:
headline = f"Compressed: {before_count}{after_count} messages"
token_line = (
f"Rough transcript estimate: ~{before_tokens:,}"
f"Approx request size: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
note = None
if not noop and after_count < before_count and after_tokens > before_tokens:
note = (
"Note: fewer messages can still raise this rough transcript estimate "
"when compression rewrites the transcript into denser summaries."
"Note: fewer messages can still raise this estimate when "
"compression rewrites the transcript into denser summaries."
)
return {
+45 -4
View File
@@ -81,15 +81,56 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
return repaired
# Rule 2: when anyOf is present, type belongs only on the children.
# Additionally, Moonshot rejects null-type branches inside anyOf
# (enum value (<nil>) does not match any type in [string]).
# Collapse the anyOf to the first non-null branch and infer its type.
if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
repaired.pop("type", None)
return repaired
non_null = [b for b in repaired["anyOf"]
if isinstance(b, dict) and b.get("type") != "null"]
if non_null and len(non_null) < len(repaired["anyOf"]):
# Drop the anyOf wrapper — keep only the non-null branch.
# If there's a single non-null branch, promote it and fall
# through to Rules 1/3 so nullable/enum cleanup still applies
# to the merged node.
if len(non_null) == 1:
merge = {k: v for k, v in repaired.items() if k != "anyOf"}
merge.update(non_null[0])
repaired = merge
else:
repaired["anyOf"] = non_null
return repaired
else:
# Nothing to collapse — parent type stripped, children already
# repaired by the recursive walk above.
return repaired
# Moonshot also rejects non-standard keywords like ``nullable`` on
# parameter schemas — strip it.
repaired.pop("nullable", None)
# Rule 1: property schemas without type need one. $ref nodes are exempt
# — their type comes from the referenced definition.
if "$ref" in repaired:
return repaired
return _fill_missing_type(repaired)
# Fill missing type BEFORE Rule 3 so enum cleanup can check the type.
if "$ref" not in repaired:
repaired = _fill_missing_type(repaired)
# Rule 3: Moonshot rejects null/empty-string values inside enum arrays
# when the parent type is a scalar (string, integer, etc.). The error:
# "enum value (<nil>) does not match any type in [string]"
# Strip null and empty-string from enum values, and if the enum becomes
# empty, drop it entirely.
if "enum" in repaired and isinstance(repaired["enum"], list):
node_type = repaired.get("type")
if node_type in ("string", "integer", "number", "boolean"):
cleaned = [v for v in repaired["enum"]
if v is not None and v != ""]
if cleaned:
repaired["enum"] = cleaned
else:
repaired.pop("enum")
return repaired
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
+58
View File
@@ -182,6 +182,64 @@ SKILLS_GUIDANCE = (
"Skills that aren't maintained become liabilities."
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
"they write directly to the shared SQLite DB and work regardless of terminal "
"backend (local/docker/modal/ssh).\n"
"\n"
"## Lifecycle\n"
"\n"
"1. **Orient.** Call `kanban_show()` first (no args — it defaults to your "
"task). The response includes title, body, parent-task handoffs (summary + "
"metadata), any prior attempts on this task if you're a retry, the full "
"comment thread, and a pre-formatted `worker_context` you can treat as "
"ground truth.\n"
"2. **Work inside the workspace.** `cd $HERMES_KANBAN_WORKSPACE` before "
"any file operations. The workspace is yours for this run. Don't modify "
"files outside it unless the task explicitly asks.\n"
"3. **Heartbeat on long operations.** Call `kanban_heartbeat(note=...)` "
"every few minutes during long subprocesses (training, encoding, crawling). "
"Skip heartbeats for short tasks.\n"
"4. **Block on genuine ambiguity.** If you need a human decision you cannot "
"infer (missing credentials, UX choice, paywalled source, peer output you "
"need first), call `kanban_block(reason=\"...\")` and stop. Don't guess. "
"The user will unblock with context and the dispatcher will respawn you.\n"
"5. **Complete with structured handoff.** Call `kanban_complete(summary=..., "
"metadata=...)`. `summary` is 13 human-readable sentences naming concrete "
"artifacts. `metadata` is machine-readable facts "
"(`{changed_files: [...], tests_run: N, decisions: [...]}`). Downstream "
"workers read both via their own `kanban_show`. Never put secrets / "
"tokens / raw PII in either field — run rows are durable forever.\n"
"6. **If follow-up work appears, create it; don't do it.** Use "
"`kanban_create(title=..., assignee=<right-profile>, parents=[your-task-id])` "
"to spawn a child task for the appropriate specialist profile instead of "
"scope-creeping into the next thing.\n"
"\n"
"## Orchestrator mode\n"
"\n"
"If your task is itself a decomposition task (e.g. a planner profile given "
"a high-level goal), use `kanban_create` to fan out into child tasks — one "
"per specialist, each with an explicit `assignee` and `parents=[...]` to "
"express dependencies. Then `kanban_complete` your own task with a summary "
"of the decomposition. Do NOT execute the work yourself; your job is "
"routing, not implementation.\n"
"\n"
"## Do NOT\n"
"\n"
"- Do not shell out to `hermes kanban <verb>` for board operations. Use "
"the `kanban_*` tools — they work across all terminal backends.\n"
"- Do not complete a task you didn't actually finish. Block it.\n"
"- Do not assign follow-up work to yourself. Assign it to the right "
"specialist profile.\n"
"- Do not call `delegate_task` as a board substitute. `delegate_task` is "
"for short reasoning subtasks inside your own run; board tasks are for "
"cross-agent handoffs that outlive one API loop."
)
TOOL_USE_ENFORCEMENT_GUIDANCE = (
"# Tool-use enforcement\n"
"You MUST use your tools to take action — do not describe what you would do "
+38 -3
View File
@@ -6,6 +6,7 @@ can invoke skills via /skill-name commands.
import json
import logging
import os
import re
from pathlib import Path
from typing import Any, Dict, Optional
@@ -20,10 +21,35 @@ from agent.skill_preprocessing import (
logger = logging.getLogger(__name__)
_skill_commands: Dict[str, Dict[str, Any]] = {}
_skill_commands_platform: Optional[str] = None
# Patterns for sanitizing skill names into clean hyphen-separated slugs.
_SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
_SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")
def _resolve_skill_commands_platform() -> Optional[str]:
"""Return the current platform scope used for disabled-skill filtering.
Used to detect when the active platform has shifted so
:func:`get_skill_commands` can drop a stale cache that was populated
for a different platform's ``skills.platform_disabled`` view (#14536).
Resolves from (in order) ``HERMES_PLATFORM`` env var and
``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
``None`` when no platform scope is active (e.g. classic CLI, RL
rollouts, standalone scripts).
"""
try:
from gateway.session_context import get_session_env
resolved_platform = (
os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
)
except Exception:
resolved_platform = os.getenv("HERMES_PLATFORM")
return resolved_platform or None
def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
"""Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
raw_identifier = (skill_identifier or "").strip()
@@ -218,7 +244,8 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
Returns:
Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
"""
global _skill_commands
global _skill_commands, _skill_commands_platform
_skill_commands_platform = _resolve_skill_commands_platform()
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -278,8 +305,16 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
def get_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Return the current skill commands mapping (scan first if empty)."""
if not _skill_commands:
"""Return the current skill commands mapping (scan first if empty).
Rescans when the active platform scope changes (e.g. a gateway
process serving Telegram and Discord concurrently) so each platform
sees its own ``skills.platform_disabled`` view (#14536).
"""
if (
not _skill_commands
or _skill_commands_platform != _resolve_skill_commands_platform()
):
scan_skill_commands()
return _skill_commands
+455
View File
@@ -0,0 +1,455 @@
"""Pure tool-call loop guardrail primitives.
The controller in this module is intentionally side-effect free: it tracks
per-turn tool-call observations and returns decisions. Runtime code owns whether
those decisions become warning guidance, synthetic tool results, or controlled
turn halts.
"""
from __future__ import annotations
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Mapping
from utils import safe_json_loads
IDEMPOTENT_TOOL_NAMES = frozenset(
{
"read_file",
"search_files",
"web_search",
"web_extract",
"session_search",
"browser_snapshot",
"browser_console",
"browser_get_images",
"mcp_filesystem_read_file",
"mcp_filesystem_read_text_file",
"mcp_filesystem_read_multiple_files",
"mcp_filesystem_list_directory",
"mcp_filesystem_list_directory_with_sizes",
"mcp_filesystem_directory_tree",
"mcp_filesystem_get_file_info",
"mcp_filesystem_search_files",
}
)
MUTATING_TOOL_NAMES = frozenset(
{
"terminal",
"execute_code",
"write_file",
"patch",
"todo",
"memory",
"skill_manage",
"browser_click",
"browser_type",
"browser_press",
"browser_scroll",
"browser_navigate",
"send_message",
"cronjob",
"delegate_task",
"process",
}
)
@dataclass(frozen=True)
class ToolCallGuardrailConfig:
"""Thresholds for per-turn tool-call loop detection.
Warnings are enabled by default and never prevent tool execution. Hard stops
are explicit opt-in so interactive CLI/TUI sessions get a gentle nudge unless
the user enables circuit-breaker behavior in config.yaml.
"""
warnings_enabled: bool = True
hard_stop_enabled: bool = False
exact_failure_warn_after: int = 2
exact_failure_block_after: int = 5
same_tool_failure_warn_after: int = 3
same_tool_failure_halt_after: int = 8
no_progress_warn_after: int = 2
no_progress_block_after: int = 5
idempotent_tools: frozenset[str] = field(default_factory=lambda: IDEMPOTENT_TOOL_NAMES)
mutating_tools: frozenset[str] = field(default_factory=lambda: MUTATING_TOOL_NAMES)
@classmethod
def from_mapping(cls, data: Mapping[str, Any] | None) -> "ToolCallGuardrailConfig":
"""Build config from the `tool_loop_guardrails` config.yaml section."""
if not isinstance(data, Mapping):
return cls()
warn_after = data.get("warn_after")
if not isinstance(warn_after, Mapping):
warn_after = {}
hard_stop_after = data.get("hard_stop_after")
if not isinstance(hard_stop_after, Mapping):
hard_stop_after = {}
defaults = cls()
return cls(
warnings_enabled=_as_bool(data.get("warnings_enabled"), defaults.warnings_enabled),
hard_stop_enabled=_as_bool(data.get("hard_stop_enabled"), defaults.hard_stop_enabled),
exact_failure_warn_after=_positive_int(
warn_after.get("exact_failure", data.get("exact_failure_warn_after")),
defaults.exact_failure_warn_after,
),
same_tool_failure_warn_after=_positive_int(
warn_after.get("same_tool_failure", data.get("same_tool_failure_warn_after")),
defaults.same_tool_failure_warn_after,
),
no_progress_warn_after=_positive_int(
warn_after.get("idempotent_no_progress", data.get("no_progress_warn_after")),
defaults.no_progress_warn_after,
),
exact_failure_block_after=_positive_int(
hard_stop_after.get("exact_failure", data.get("exact_failure_block_after")),
defaults.exact_failure_block_after,
),
same_tool_failure_halt_after=_positive_int(
hard_stop_after.get("same_tool_failure", data.get("same_tool_failure_halt_after")),
defaults.same_tool_failure_halt_after,
),
no_progress_block_after=_positive_int(
hard_stop_after.get("idempotent_no_progress", data.get("no_progress_block_after")),
defaults.no_progress_block_after,
),
)
@dataclass(frozen=True)
class ToolCallSignature:
"""Stable, non-reversible identity for a tool name plus canonical args."""
tool_name: str
args_hash: str
@classmethod
def from_call(cls, tool_name: str, args: Mapping[str, Any] | None) -> "ToolCallSignature":
canonical = canonical_tool_args(args or {})
return cls(tool_name=tool_name, args_hash=_sha256(canonical))
def to_metadata(self) -> dict[str, str]:
"""Return public metadata without raw argument values."""
return {"tool_name": self.tool_name, "args_hash": self.args_hash}
@dataclass(frozen=True)
class ToolGuardrailDecision:
"""Decision returned by the tool-call guardrail controller."""
action: str = "allow" # allow | warn | block | halt
code: str = "allow"
message: str = ""
tool_name: str = ""
count: int = 0
signature: ToolCallSignature | None = None
@property
def allows_execution(self) -> bool:
return self.action in {"allow", "warn"}
@property
def should_halt(self) -> bool:
return self.action in {"block", "halt"}
def to_metadata(self) -> dict[str, Any]:
data: dict[str, Any] = {
"action": self.action,
"code": self.code,
"message": self.message,
"tool_name": self.tool_name,
"count": self.count,
}
if self.signature is not None:
data["signature"] = self.signature.to_metadata()
return data
def canonical_tool_args(args: Mapping[str, Any]) -> str:
"""Return sorted compact JSON for parsed tool arguments."""
if not isinstance(args, Mapping):
raise TypeError(f"tool args must be a mapping, got {type(args).__name__}")
return json.dumps(
args,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]:
"""Safety-fallback classifier used only when callers don't pass ``failed``.
Mirrors ``agent.display._detect_tool_failure`` exactly so the guardrail
never disagrees with the CLI's user-visible ``[error]`` tag. Production
callers in ``run_agent.py`` always pass an explicit ``failed=`` derived
from ``_detect_tool_failure``; this function exists so standalone callers
(tests, tooling) still get consistent behavior.
"""
if result is None:
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
if isinstance(data, dict):
exit_code = data.get("exit_code")
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
return False, ""
if tool_name == "memory":
data = safe_json_loads(result)
if isinstance(data, dict):
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
lower = result[:500].lower()
if '"error"' in lower or '"failed"' in lower or result.startswith("Error"):
return True, " [error]"
return False, ""
class ToolCallGuardrailController:
"""Per-turn controller for repeated failed/non-progressing tool calls."""
def __init__(self, config: ToolCallGuardrailConfig | None = None):
self.config = config or ToolCallGuardrailConfig()
self.reset_for_turn()
def reset_for_turn(self) -> None:
self._exact_failure_counts: dict[ToolCallSignature, int] = {}
self._same_tool_failure_counts: dict[str, int] = {}
self._no_progress: dict[ToolCallSignature, tuple[str, int]] = {}
self._halt_decision: ToolGuardrailDecision | None = None
@property
def halt_decision(self) -> ToolGuardrailDecision | None:
return self._halt_decision
def before_call(self, tool_name: str, args: Mapping[str, Any] | None) -> ToolGuardrailDecision:
signature = ToolCallSignature.from_call(tool_name, _coerce_args(args))
if not self.config.hard_stop_enabled:
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
exact_count = self._exact_failure_counts.get(signature, 0)
if exact_count >= self.config.exact_failure_block_after:
decision = ToolGuardrailDecision(
action="block",
code="repeated_exact_failure_block",
message=(
f"Blocked {tool_name}: the same tool call failed {exact_count} "
"times with identical arguments. Stop retrying it unchanged; "
"change strategy or explain the blocker."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self._is_idempotent(tool_name):
record = self._no_progress.get(signature)
if record is not None:
_result_hash, repeat_count = record
if repeat_count >= self.config.no_progress_block_after:
decision = ToolGuardrailDecision(
action="block",
code="idempotent_no_progress_block",
message=(
f"Blocked {tool_name}: this read-only call returned the same "
f"result {repeat_count} times. Stop repeating it unchanged; "
"use the result already provided or try a different query."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
self._halt_decision = decision
return decision
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
def after_call(
self,
tool_name: str,
args: Mapping[str, Any] | None,
result: str | None,
*,
failed: bool | None = None,
) -> ToolGuardrailDecision:
args = _coerce_args(args)
signature = ToolCallSignature.from_call(tool_name, args)
if failed is None:
failed, _ = classify_tool_failure(tool_name, result)
if failed:
exact_count = self._exact_failure_counts.get(signature, 0) + 1
self._exact_failure_counts[signature] = exact_count
self._no_progress.pop(signature, None)
same_count = self._same_tool_failure_counts.get(tool_name, 0) + 1
self._same_tool_failure_counts[tool_name] = same_count
if self.config.hard_stop_enabled and same_count >= self.config.same_tool_failure_halt_after:
decision = ToolGuardrailDecision(
action="halt",
code="same_tool_failure_halt",
message=(
f"Stopped {tool_name}: it failed {same_count} times this turn. "
"Stop retrying the same failing tool path and choose a different approach."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self.config.warnings_enabled and exact_count >= self.config.exact_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="repeated_exact_failure_warning",
message=(
f"{tool_name} has failed {exact_count} times with identical arguments. "
"This looks like a loop; inspect the error and change strategy "
"instead of retrying it unchanged."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
if self.config.warnings_enabled and same_count >= self.config.same_tool_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="same_tool_failure_warning",
message=(
f"{tool_name} has failed {same_count} times this turn. "
"This looks like a loop; change approach before retrying."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=exact_count, signature=signature)
self._exact_failure_counts.pop(signature, None)
self._same_tool_failure_counts.pop(tool_name, None)
if not self._is_idempotent(tool_name):
self._no_progress.pop(signature, None)
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
result_hash = _result_hash(result)
previous = self._no_progress.get(signature)
repeat_count = 1
if previous is not None and previous[0] == result_hash:
repeat_count = previous[1] + 1
self._no_progress[signature] = (result_hash, repeat_count)
if self.config.warnings_enabled and repeat_count >= self.config.no_progress_warn_after:
return ToolGuardrailDecision(
action="warn",
code="idempotent_no_progress_warning",
message=(
f"{tool_name} returned the same result {repeat_count} times. "
"Use the result already provided or change the query instead of "
"repeating it unchanged."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=repeat_count, signature=signature)
def _is_idempotent(self, tool_name: str) -> bool:
if tool_name in self.config.mutating_tools:
return False
return tool_name in self.config.idempotent_tools
def toolguard_synthetic_result(decision: ToolGuardrailDecision) -> str:
"""Build a synthetic role=tool content string for a blocked tool call."""
return json.dumps(
{
"error": decision.message,
"guardrail": decision.to_metadata(),
},
ensure_ascii=False,
)
def append_toolguard_guidance(result: str, decision: ToolGuardrailDecision) -> str:
"""Append runtime guidance to the current tool result content."""
if decision.action not in {"warn", "halt"} or not decision.message:
return result
label = "Tool loop hard stop" if decision.action == "halt" else "Tool loop warning"
suffix = (
f"\n\n[{label}: "
f"{decision.code}; count={decision.count}; {decision.message}]"
)
return (result or "") + suffix
def _coerce_args(args: Mapping[str, Any] | None) -> Mapping[str, Any]:
return args if isinstance(args, Mapping) else {}
def _result_hash(result: str | None) -> str:
parsed = safe_json_loads(result or "")
if parsed is not None:
try:
canonical = json.dumps(
parsed,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
except TypeError:
canonical = str(parsed)
else:
canonical = result or ""
return _sha256(canonical)
def _as_bool(value: Any, default: bool) -> bool:
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, (int, float)):
return bool(value)
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in {"1", "true", "yes", "on", "enabled"}:
return True
if lowered in {"0", "false", "no", "off", "disabled"}:
return False
return default
def _positive_int(value: Any, default: int) -> int:
if value is None:
return default
try:
parsed = int(value)
except (TypeError, ValueError):
return default
return parsed if parsed >= 1 else default
def _sha256(value: str) -> str:
return hashlib.sha256(value.encode("utf-8")).hexdigest()
+5 -1
View File
@@ -477,9 +477,13 @@ class ChatCompletionsTransport(ProviderTransport):
# so keep them apart in provider_data rather than merging.
reasoning = getattr(msg, "reasoning", None)
reasoning_content = getattr(msg, "reasoning_content", None)
if reasoning_content is None and hasattr(msg, "model_extra"):
model_extra = getattr(msg, "model_extra", None) or {}
if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
reasoning_content = model_extra["reasoning_content"]
provider_data: Dict[str, Any] = {}
if reasoning_content:
if reasoning_content is not None:
provider_data["reasoning_content"] = reasoning_content
rd = getattr(msg, "reasoning_details", None)
if rd:
+31
View File
@@ -121,6 +121,18 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# OpenRouter Response Caching (only applies when using OpenRouter)
# =============================================================================
# Cache identical API responses at the OpenRouter edge for free instant replays.
# When enabled, identical requests (same model, messages, parameters) return
# cached responses with zero billing. Separate from Anthropic prompt caching.
# See: https://openrouter.ai/docs/guides/features/response-caching
#
# openrouter:
# response_cache: true # Enable response caching (default: true)
# response_cache_ttl: 300 # Cache TTL in seconds, 1-86400 (default: 300)
# =============================================================================
# Git Worktree Isolation
# =============================================================================
@@ -289,6 +301,25 @@ browser:
# after this period of no activity between agent loops (default: 120 = 2 minutes)
inactivity_timeout: 120
# =============================================================================
# Tool Loop Guardrails
# =============================================================================
# Soft warnings are enabled by default. They append guidance to repeated failed
# or non-progressing tool results but still let the tool execute. Hard stops are
# opt-in circuit breakers for autonomous/cron sessions where stopping a loop is
# preferable to spending the full iteration budget.
tool_loop_guardrails:
warnings_enabled: true
hard_stop_enabled: false
warn_after:
exact_failure: 2
same_tool_failure: 3
idempotent_no_progress: 2
hard_stop_after:
exact_failure: 5
same_tool_failure: 8
idempotent_no_progress: 5
# =============================================================================
# Context Compression (Auto-shrinks long conversations)
# =============================================================================
+327 -17
View File
@@ -15,7 +15,6 @@ Usage:
import logging
import os
import re
import shutil
import sys
import json
@@ -86,7 +85,7 @@ from hermes_cli.browser_connect import (
try_launch_chrome_debug,
)
from hermes_cli.env_loader import load_hermes_dotenv
from utils import base_url_host_matches
from utils import base_url_host_matches, is_truthy_value
_hermes_home = get_hermes_home()
_project_env = Path(__file__).parent / '.env'
@@ -600,6 +599,7 @@ def load_cli_config() -> Dict[str, Any]:
# Load configuration at module startup
CLI_CONFIG = load_cli_config()
# Initialize centralized logging early — agent.log + errors.log in ~/.hermes/logs/.
# This ensures CLI sessions produce a log trail even before AIAgent is instantiated.
try:
@@ -934,6 +934,20 @@ def _run_state_db_auto_maintenance(session_db) -> None:
try:
from hermes_cli.config import load_config as _load_full_config
from hermes_constants import get_hermes_home as _get_hermes_home
_hermes_home_maint = _get_hermes_home()
# One-time prune of empty TUI ghost sessions.
try:
if not session_db.get_meta("ghost_session_prune_v1"):
pruned = session_db.prune_empty_ghost_sessions(
sessions_dir=_hermes_home_maint / "sessions"
)
session_db.set_meta("ghost_session_prune_v1", "1")
if pruned:
logger.info("Pruned %d empty TUI ghost sessions", pruned)
except Exception as _prune_exc:
logger.debug("Ghost session prune skipped: %s", _prune_exc)
cfg = (_load_full_config().get("sessions") or {})
if not cfg.get("auto_prune", False):
return
@@ -941,7 +955,7 @@ def _run_state_db_auto_maintenance(session_db) -> None:
retention_days=int(cfg.get("retention_days", 90)),
min_interval_hours=int(cfg.get("min_interval_hours", 24)),
vacuum=bool(cfg.get("vacuum_after_prune", True)),
sessions_dir=_get_hermes_home() / "sessions",
sessions_dir=_hermes_home_maint / "sessions",
)
except Exception as exc:
logger.debug("state.db auto-maintenance skipped: %s", exc)
@@ -1240,8 +1254,73 @@ def _cprint(text: str):
Raw ANSI escapes written via print() are swallowed by patch_stdout's
StdoutProxy. Routing through print_formatted_text(ANSI(...)) lets
prompt_toolkit parse the escapes and render real colors.
When called from a background thread while a prompt_toolkit
``Application`` is running (the common case for the self-improvement
background review's ``💾 …`` summary, curator summaries, and other
bg-thread emissions), a direct ``_pt_print`` races with the input
area's redraw and the line can end up visually buried behind the
prompt. Route those cases through ``run_in_terminal`` via
``loop.call_soon_threadsafe``, which pauses the input area, prints
the line above it, and redraws the prompt cleanly.
"""
_pt_print(_PT_ANSI(text))
try:
from prompt_toolkit.application import get_app_or_none, run_in_terminal
except Exception:
_pt_print(_PT_ANSI(text))
return
app = None
try:
app = get_app_or_none()
except Exception:
app = None
# No active app, or we're already on the app's main thread: the
# direct prompt_toolkit print is safe and matches existing behavior
# (spinner frames, streamed tokens, tool activity prefixes, …).
if app is None or not getattr(app, "_is_running", False):
_pt_print(_PT_ANSI(text))
return
try:
loop = app.loop # type: ignore[attr-defined]
except Exception:
loop = None
if loop is None:
_pt_print(_PT_ANSI(text))
return
import asyncio as _asyncio
try:
current_loop = _asyncio.get_event_loop_policy().get_event_loop()
except Exception:
current_loop = None
# Same thread as the app's loop → safe to print directly.
if current_loop is loop and loop.is_running():
_pt_print(_PT_ANSI(text))
return
# Cross-thread emission: ask the app's event loop to schedule a
# ``run_in_terminal`` that wraps ``_pt_print``. This hides the
# prompt, prints, and redraws. Fire-and-forget — if scheduling
# fails we fall back to a direct print so the line isn't lost.
def _schedule():
try:
run_in_terminal(lambda: _pt_print(_PT_ANSI(text)))
except Exception:
try:
_pt_print(_PT_ANSI(text))
except Exception:
pass
try:
loop.call_soon_threadsafe(_schedule)
except Exception:
try:
_pt_print(_PT_ANSI(text))
except Exception:
pass
# ---------------------------------------------------------------------------
@@ -2053,6 +2132,8 @@ class HermesCLI:
# Parse and validate toolsets
self.enabled_toolsets = toolsets
self.disabled_toolsets = CLI_CONFIG["agent"].get("disabled_toolsets") or []
if toolsets and "all" not in toolsets and "*" not in toolsets:
# Validate each toolset — MCP server names are resolved via
# live registry aliases (registered during discover_mcp_tools),
@@ -2847,7 +2928,14 @@ class HermesCLI:
def _expand_ref(match):
path = Path(match.group(1))
return path.read_text(encoding="utf-8") if path.exists() else match.group(0)
# Use try/except instead of path.exists() to avoid TOCTOU race:
# the paste file may be deleted between check and read, causing
# the input to be silently dropped (#17666).
try:
return path.read_text(encoding="utf-8")
except (OSError, IOError):
logger.warning("Paste file gone or unreadable, returning placeholder: %s", path)
return match.group(0)
return paste_ref_re.sub(_expand_ref, text)
@@ -3503,6 +3591,7 @@ class HermesCLI:
credential_pool=runtime.get("credential_pool"),
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
disabled_toolsets=self.disabled_toolsets,
verbose_logging=self.verbose,
quiet_mode=not self.verbose,
ephemeral_system_prompt=self.system_prompt if self.system_prompt else None,
@@ -3550,14 +3639,18 @@ class HermesCLI:
tuple(runtime.get("args") or ()),
)
if self._pending_title and self._session_db:
# Force-create DB row on /title intent, then apply title.
if self._pending_title and self._session_db and self.agent:
try:
self._session_db.set_session_title(self.session_id, self._pending_title)
_cprint(f" Session title applied: {self._pending_title}")
self._pending_title = None
self.agent._ensure_db_session()
if self.agent._session_db_created:
self._session_db.set_session_title(self.session_id, self._pending_title)
_cprint(f" Session title applied: {self._pending_title}")
self._pending_title = None
# else: row creation failed transiently — keep _pending_title for retry
except (ValueError, Exception) as e:
_cprint(f" Could not apply pending title: {e}")
self._pending_title = None
# Keep _pending_title so it can be retried after row creation succeeds
return True
except Exception as e:
ChatConsole().print(f"[bold red]Failed to initialize agent: {e}[/]")
@@ -4885,6 +4978,7 @@ class HermesCLI:
if self._session_db:
try:
self.agent._session_db_created = False
self._session_db.create_session(
session_id=self.session_id,
source=os.environ.get("HERMES_SESSION_SOURCE", "cli"),
@@ -4894,6 +4988,7 @@ class HermesCLI:
"reasoning_config": self.reasoning_config,
},
)
self.agent._session_db_created = True
except Exception:
pass
# Notify memory providers that session_id rotated to a fresh
@@ -6087,6 +6182,27 @@ class HermesCLI:
except Exception as exc:
print(f"(._.) curator: {exc}")
def _handle_kanban_command(self, cmd: str):
"""Handle the /kanban command — delegate to the shared kanban CLI.
The string form passed here is the user's full ``/kanban ...``
including the leading slash; we strip it and hand the remainder
to ``kanban.run_slash`` which returns a single formatted string.
"""
from hermes_cli.kanban import run_slash
rest = cmd.strip()
if rest.startswith("/"):
rest = rest.lstrip("/")
if rest.startswith("kanban"):
rest = rest[len("kanban"):].lstrip()
try:
output = run_slash(rest)
except Exception as exc: # pragma: no cover - defensive
output = f"(._.) kanban error: {exc}"
if output:
print(output)
def _handle_skills_command(self, cmd: str):
"""Handle /skills slash command — delegates to hermes_cli.skills_hub."""
from hermes_cli.skills_hub import handle_skills_slash
@@ -6332,6 +6448,8 @@ class HermesCLI:
self._handle_cron_command(cmd_original)
elif canonical == "curator":
self._handle_curator_command(cmd_original)
elif canonical == "kanban":
self._handle_kanban_command(cmd_original)
elif canonical == "skills":
with self._busy_command(self._slow_command_status(cmd_original)):
self._handle_skills_command(cmd_original)
@@ -6449,6 +6567,8 @@ class HermesCLI:
# No active run — treat as a normal next-turn message.
self._pending_input.put(payload)
_cprint(f" No agent running; queued as next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "goal":
self._handle_goal_command(cmd_original)
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -6494,12 +6614,17 @@ class HermesCLI:
self._console_print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
# Check for plugin-registered slash commands
elif base_cmd.lstrip("/") in _get_plugin_cmd_handler_names():
from hermes_cli.plugins import get_plugin_command_handler
from hermes_cli.plugins import (
get_plugin_command_handler,
resolve_plugin_command_result,
)
plugin_handler = get_plugin_command_handler(base_cmd.lstrip("/"))
if plugin_handler:
user_args = cmd_original[len(base_cmd):].strip()
try:
result = plugin_handler(user_args)
result = resolve_plugin_command_result(
plugin_handler(user_args)
)
if result:
_cprint(str(result))
except Exception as e:
@@ -6924,6 +7049,166 @@ class HermesCLI:
print(" status Show current browser mode")
print()
# ────────────────────────────────────────────────────────────────
# /goal — persistent cross-turn goals (Ralph-style loop)
# ────────────────────────────────────────────────────────────────
def _get_goal_manager(self):
"""Return the GoalManager bound to the current session_id.
Cached on ``self._goal_manager`` and rebound lazily when
``session_id`` changes (e.g. after /new or a compression-driven
session split).
"""
try:
from hermes_cli.goals import GoalManager
from hermes_cli.config import load_config
except Exception as exc:
logging.debug("goal manager unavailable: %s", exc)
return None
sid = getattr(self, "session_id", None) or ""
if not sid:
return None
existing = getattr(self, "_goal_manager", None)
if existing is not None and getattr(existing, "session_id", None) == sid:
return existing
try:
cfg = load_config() or {}
goals_cfg = cfg.get("goals") or {}
max_turns = int(goals_cfg.get("max_turns", 20) or 20)
except Exception:
max_turns = 20
mgr = GoalManager(session_id=sid, default_max_turns=max_turns)
self._goal_manager = mgr
return mgr
def _handle_goal_command(self, cmd: str) -> None:
"""Dispatch /goal subcommands: set / status / pause / resume / clear."""
parts = (cmd or "").strip().split(None, 1)
arg = parts[1].strip() if len(parts) > 1 else ""
mgr = self._get_goal_manager()
if mgr is None:
_cprint(f" {_DIM}Goals unavailable (no active session).{_RST}")
return
lower = arg.lower()
# Bare /goal or /goal status → show current state
if not arg or lower == "status":
_cprint(f" {mgr.status_line()}")
return
if lower == "pause":
state = mgr.pause(reason="user-paused")
if state is None:
_cprint(f" {_DIM}No goal set.{_RST}")
else:
_cprint(f" ⏸ Goal paused: {state.goal}")
return
if lower == "resume":
state = mgr.resume()
if state is None:
_cprint(f" {_DIM}No goal to resume.{_RST}")
else:
_cprint(f" ▶ Goal resumed: {state.goal}")
_cprint(
f" {_DIM}Send any message (or press Enter on an empty prompt "
f"is a no-op; type 'continue' to kick it off).{_RST}"
)
return
if lower in ("clear", "stop", "done"):
had = mgr.has_goal()
mgr.clear()
if had:
_cprint(" ✓ Goal cleared.")
else:
_cprint(f" {_DIM}No active goal.{_RST}")
return
# Otherwise treat the arg as the goal text.
try:
state = mgr.set(arg)
except ValueError as exc:
_cprint(f" Invalid goal: {exc}")
return
_cprint(f" ⊙ Goal set ({state.max_turns}-turn budget): {state.goal}")
_cprint(
f" {_DIM}After each turn, a judge model will check if the goal is done. "
f"Hermes keeps working until it is, you pause/clear it, or the budget is "
f"exhausted. Use /goal status, /goal pause, /goal resume, /goal clear.{_RST}"
)
# Kick the loop off immediately so the user doesn't have to send a
# separate message after setting the goal.
try:
self._pending_input.put(state.goal)
except Exception:
pass
def _maybe_continue_goal_after_turn(self) -> None:
"""Hook run after every CLI turn. Judges + maybe re-queues.
Safe to call when no goal is set returns quickly.
Preemption is automatic: if a real user message is already in
``_pending_input`` we skip judging (the user's new input takes
priority and we'll re-judge after that turn). If judge says done,
mark it done and tell the user. If judge says continue and we're
under budget, push the continuation prompt onto the queue.
"""
mgr = self._get_goal_manager()
if mgr is None or not mgr.is_active():
return
# If a real user message is already queued, don't inject a
# continuation prompt on top — let the user's turn go first.
try:
if getattr(self, "_pending_input", None) is not None \
and not self._pending_input.empty():
return
except Exception:
pass
# Extract the agent's final response for this turn.
last_response = ""
try:
hist = self.conversation_history or []
for msg in reversed(hist):
if msg.get("role") == "assistant":
content = msg.get("content", "")
if isinstance(content, list):
# Multimodal content — flatten text parts.
parts = [
p.get("text", "")
for p in content
if isinstance(p, dict) and p.get("type") in ("text", "output_text")
]
last_response = "\n".join(t for t in parts if t)
else:
last_response = str(content or "")
break
except Exception:
last_response = ""
decision = mgr.evaluate_after_turn(last_response, user_initiated=True)
msg = decision.get("message") or ""
if msg:
_cprint(f" {msg}")
if decision.get("should_continue"):
prompt = decision.get("continuation_prompt")
if prompt:
try:
self._pending_input.put(prompt)
except Exception as exc:
logging.debug("goal continuation enqueue failed: %s", exc)
def _handle_skin_command(self, cmd: str):
"""Handle /skin [name] — show or change the display skin."""
try:
@@ -7050,7 +7335,7 @@ class HermesCLI:
import os
from hermes_cli.colors import Colors as _Colors
current = bool(os.environ.get("HERMES_YOLO_MODE"))
current = is_truthy_value(os.environ.get("HERMES_YOLO_MODE"))
if current:
os.environ.pop("HERMES_YOLO_MODE", None)
_cprint(
@@ -7247,10 +7532,20 @@ class HermesCLI:
original_count = len(self.conversation_history)
with self._busy_command("Compressing context..."):
try:
from agent.model_metadata import estimate_messages_tokens_rough
from agent.model_metadata import estimate_request_tokens_rough
from agent.manual_compression_feedback import summarize_manual_compression
original_history = list(self.conversation_history)
approx_tokens = estimate_messages_tokens_rough(original_history)
# Include system prompt + tool schemas in the estimate —
# a transcript-only number understates real request pressure
# and can even appear to grow after compression because a
# dense handoff summary replaces many short turns (#6217).
_sys_prompt = getattr(self.agent, "_cached_system_prompt", "") or ""
_tools = getattr(self.agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
original_history,
system_prompt=_sys_prompt,
tools=_tools,
)
if focus_topic:
print(f"🗜️ Compressing {original_count} messages (~{approx_tokens:,} tokens), "
f"focus: \"{focus_topic}\"...")
@@ -7282,7 +7577,11 @@ class HermesCLI:
):
self.session_id = self.agent.session_id
self._pending_title = None
new_tokens = estimate_messages_tokens_rough(self.conversation_history)
new_tokens = estimate_request_tokens_rough(
self.conversation_history,
system_prompt=_sys_prompt,
tools=_tools,
)
summary = summarize_manual_compression(
original_history,
self.conversation_history,
@@ -11248,6 +11547,17 @@ class HermesCLI:
app.invalidate() # Refresh status line
# Goal continuation: if a standing goal is active, ask
# the judge whether the turn satisfied it. If not, and
# there's no real user message already queued, push the
# continuation prompt back into _pending_input so the
# next loop iteration picks it up naturally (and any
# user input that arrives in between still preempts).
try:
self._maybe_continue_goal_after_turn()
except Exception as _goal_exc:
logging.debug("goal continuation hook failed: %s", _goal_exc)
# Continuous voice: auto-restart recording after agent responds.
# Dispatch to a daemon thread so play_beep (sd.wait) and
# AudioRecorder.start (lock acquire) never block process_loop —
@@ -11281,7 +11591,7 @@ class HermesCLI:
pass # Non-fatal — don't break the main loop
except Exception as e:
print(f"Error: {e}")
logger.warning("process_loop unhandled error (msg may be lost): %s", e)
# Start processing thread
process_thread = threading.Thread(target=process_loop, daemon=True)
+118
View File
@@ -882,3 +882,121 @@ def save_job_output(job_id: str, output: str):
raise
return output_file
# =============================================================================
# Skill reference rewriting (curator integration)
# =============================================================================
def rewrite_skill_refs(
consolidated: Optional[Dict[str, str]] = None,
pruned: Optional[List[str]] = None,
) -> Dict[str, Any]:
"""Rewrite cron job skill references after a curator consolidation pass.
When the curator consolidates a skill X into umbrella Y (or archives X
as pruned), any cron job that lists ``X`` in its ``skills`` field will
fail to load ``X`` at run time the scheduler logs a warning and
skips the skill, so the job runs without the instructions it was
scheduled to follow. See cron/scheduler.py where ``skill_view`` is
called per skill name.
This function repairs cron jobs in-place:
- A skill listed in ``consolidated`` is replaced with its umbrella
target (the ``into`` value). If the umbrella is already in the
job's skill list, the stale name is dropped without duplication.
- A skill listed in ``pruned`` is dropped outright there is no
forwarding target.
- Ordering and other skills in the list are preserved.
- The legacy ``skill`` field is realigned via ``_apply_skill_fields``.
Args:
consolidated: mapping of ``old_skill_name -> umbrella_skill_name``.
pruned: list of skill names that were archived with no forwarding
target.
Returns a report dict::
{
"rewrites": [
{
"job_id": ...,
"job_name": ...,
"before": [...],
"after": [...],
"mapped": {"old": "new", ...},
"dropped": ["old", ...],
},
...
],
"jobs_updated": N,
"jobs_scanned": M,
}
Best-effort: exceptions from loading/saving propagate to the caller so
tests can assert behaviour; the curator invocation site wraps this
call in a try/except so a failure here never breaks the curator.
"""
consolidated = dict(consolidated or {})
pruned_set = set(pruned or [])
# A skill listed in both wins as "consolidated" — it has a target,
# which is the more useful of the two outcomes.
pruned_set -= set(consolidated.keys())
if not consolidated and not pruned_set:
return {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
with _jobs_file_lock:
jobs = load_jobs()
rewrites: List[Dict[str, Any]] = []
changed = False
for job in jobs:
skills_before = _normalize_skill_list(job.get("skill"), job.get("skills"))
if not skills_before:
continue
mapped: Dict[str, str] = {}
dropped: List[str] = []
new_skills: List[str] = []
for name in skills_before:
if name in consolidated:
target = consolidated[name]
mapped[name] = target
if target and target not in new_skills:
new_skills.append(target)
elif name in pruned_set:
dropped.append(name)
else:
if name not in new_skills:
new_skills.append(name)
if not mapped and not dropped:
continue
job["skills"] = new_skills
job["skill"] = new_skills[0] if new_skills else None
changed = True
rewrites.append({
"job_id": job.get("id"),
"job_name": job.get("name") or job.get("id"),
"before": list(skills_before),
"after": list(new_skills),
"mapped": mapped,
"dropped": dropped,
})
if changed:
save_jobs(jobs)
logger.info(
"Curator rewrote skill references in %d cron job(s)", len(rewrites)
)
return {
"rewrites": rewrites,
"jobs_updated": len(rewrites),
"jobs_scanned": len(jobs),
}
+1 -1
View File
@@ -40,7 +40,7 @@ services:
# - TEAMS_CLIENT_SECRET=${TEAMS_CLIENT_SECRET}
# - TEAMS_TENANT_ID=${TEAMS_TENANT_ID}
# - TEAMS_ALLOWED_USERS=${TEAMS_ALLOWED_USERS}
# - TEAMS_PORT=3978
# - TEAMS_PORT=${TEAMS_PORT:-3978}
command: ["gateway", "run"]
dashboard:
Binary file not shown.
+64 -6
View File
@@ -36,6 +36,26 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return is_truthy_value(value, default=default)
def _coerce_float(value: Any, default: float) -> float:
"""Coerce numeric config values, falling back on malformed input."""
if value is None:
return default
try:
return float(value)
except (TypeError, ValueError):
return default
def _coerce_int(value: Any, default: int) -> int:
"""Coerce integer config values, falling back on malformed input."""
if value is None:
return default
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
"""Normalize unauthorized DM behavior to a supported value."""
if isinstance(value, str):
@@ -45,6 +65,15 @@ def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> st
return default
def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
"""Normalize notice delivery mode to a supported value."""
if isinstance(value, str):
normalized = value.strip().lower()
if normalized in {"public", "private"}:
return normalized
return default
# Module-level cache for bundled platform plugin names (lives outside the
# enum so it doesn't become an accidental enum member).
_Platform__bundled_plugin_names: Optional[set] = None
@@ -301,13 +330,13 @@ class StreamingConfig:
if not data:
return cls()
return cls(
enabled=data.get("enabled", False),
enabled=_coerce_bool(data.get("enabled"), False),
transport=data.get("transport", "edit"),
edit_interval=float(data.get("edit_interval", 1.0)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
edit_interval=_coerce_float(data.get("edit_interval"), 1.0),
buffer_threshold=_coerce_int(data.get("buffer_threshold"), 40),
cursor=data.get("cursor", ""),
fresh_final_after_seconds=float(
data.get("fresh_final_after_seconds", 60.0)
fresh_final_after_seconds=_coerce_float(
data.get("fresh_final_after_seconds"), 60.0
),
)
@@ -572,6 +601,17 @@ class GatewayConfig:
)
return self.unauthorized_dm_behavior
def get_notice_delivery(self, platform: Optional[Platform] = None) -> str:
"""Return the effective notice-delivery mode for a platform."""
if platform:
platform_cfg = self.platforms.get(platform)
if platform_cfg and "notice_delivery" in platform_cfg.extra:
return _normalize_notice_delivery(
platform_cfg.extra.get("notice_delivery"),
"public",
)
return "public"
def load_gateway_config() -> GatewayConfig:
"""
@@ -687,6 +727,11 @@ def load_gateway_config() -> GatewayConfig:
platform_cfg.get("unauthorized_dm_behavior"),
gw_data.get("unauthorized_dm_behavior", "pair"),
)
if "notice_delivery" in platform_cfg:
bridged["notice_delivery"] = _normalize_notice_delivery(
platform_cfg.get("notice_delivery"),
"public",
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "reply_in_thread" in platform_cfg:
@@ -900,6 +945,12 @@ def load_gateway_config() -> GatewayConfig:
if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()
# Feishu settings → env vars (env vars take precedence)
feishu_cfg = yaml_cfg.get("feishu", {})
if isinstance(feishu_cfg, dict):
if "allow_bots" in feishu_cfg and not os.getenv("FEISHU_ALLOW_BOTS"):
os.environ["FEISHU_ALLOW_BOTS"] = str(feishu_cfg["allow_bots"]).lower()
except Exception as e:
logger.warning(
"Failed to process config.yaml — falling back to .env / gateway.json values. "
@@ -1051,7 +1102,14 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if Platform.WHATSAPP not in config.platforms:
config.platforms[Platform.WHATSAPP] = PlatformConfig()
config.platforms[Platform.WHATSAPP].enabled = True
whatsapp_home = os.getenv("WHATSAPP_HOME_CHANNEL")
if whatsapp_home and Platform.WHATSAPP in config.platforms:
config.platforms[Platform.WHATSAPP].home_channel = HomeChannel(
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
)
# Slack
slack_token = os.getenv("SLACK_BOT_TOKEN")
if slack_token:
+9 -7
View File
@@ -53,9 +53,10 @@ class DeliveryTarget:
- "telegram" Telegram home channel
- "telegram:123456" specific Telegram chat
"""
target = target.strip().lower()
target_stripped = target.strip()
target_lower = target_stripped.lower()
if target == "origin":
if target_lower == "origin":
if origin:
return cls(
platform=origin.platform,
@@ -67,13 +68,14 @@ class DeliveryTarget:
# Fallback to local if no origin
return cls(platform=Platform.LOCAL, is_origin=True)
if target == "local":
if target_lower == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id or platform:chat_id:thread_id format
if ":" in target:
parts = target.split(":", 2)
platform_str = parts[0]
# Use the original case for chat_id/thread_id to preserve case-sensitive IDs
if ":" in target_stripped:
parts = target_stripped.split(":", 2)
platform_str = parts[0].lower() # Platform names are case-insensitive
chat_id = parts[1] if len(parts) > 1 else None
thread_id = parts[2] if len(parts) > 2 else None
try:
@@ -85,7 +87,7 @@ class DeliveryTarget:
# Just a platform name (use home channel)
try:
platform = Platform(target)
platform = Platform(target_lower)
return cls(platform=platform)
except ValueError:
# Unknown platform, treat as local
+84
View File
@@ -0,0 +1,84 @@
"""Shared HTTP client factory for long-lived platform adapters.
Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
alive for the adapter's lifetime. That amortises TLS/connection setup
across many API calls, but it also means the process's file-descriptor
pressure is sensitive to how aggressively the pool recycles idle keep-
alive connections.
httpx's default ``keepalive_expiry`` is 5 seconds. On macOS behind
Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
sit in ``CLOSE_WAIT`` longer than that before the local socket actually
drains which, multiplied across 7 long-lived adapters plus the LLM
client and MCP clients, walks straight into the default 256 fd limit.
See #18451.
``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
adapter factories use instead of the httpx default. The values chosen:
* ``max_keepalive_connections=10`` plenty for any single adapter;
platform APIs rarely parallelise beyond this.
* ``keepalive_expiry=2.0`` close idle sockets aggressively so a
proxy's lingering CLOSE_WAIT window can't starve the process.
Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
"""
from __future__ import annotations
import os
try:
import httpx
except ImportError: # pragma: no cover — optional dep
httpx = None # type: ignore[assignment]
_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
_DEFAULT_MAX_KEEPALIVE = 10
def platform_httpx_limits() -> "httpx.Limits | None":
"""Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
Returns ``None`` when httpx isn't importable, so callers can fall
back to httpx's built-in default without a hard dependency on this
helper being reachable.
"""
if httpx is None:
return None
def _env_float(name: str, default: float) -> float:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = float(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
def _env_int(name: str, default: int) -> int:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = int(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
keepalive_expiry = _env_float(
"HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
)
max_keepalive = _env_int(
"HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
)
return httpx.Limits(
max_keepalive_connections=max_keepalive,
# Leave max_connections at httpx default (100) — plenty of headroom.
keepalive_expiry=keepalive_expiry,
)
+4 -2
View File
@@ -2351,10 +2351,11 @@ class APIServerAdapter(BasePlatformAdapter):
)
if agent_ref is not None:
agent_ref[0] = agent
effective_task_id = session_id or str(uuid.uuid4())
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id="default",
task_id=effective_task_id,
)
usage = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2551,10 +2552,11 @@ class APIServerAdapter(BasePlatformAdapter):
)
self._active_run_agents[run_id] = agent
def _run_sync():
effective_task_id = session_id or run_id
r = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id="default",
task_id=effective_task_id,
)
u = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
+209 -13
View File
@@ -416,7 +416,7 @@ def is_host_excluded_by_no_proxy(hostname: str, no_proxy_value: str | None = Non
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple, Union
from enum import Enum
from pathlib import Path as _Path
@@ -981,7 +981,7 @@ def coerce_plaintext_gateway_command(event: "MessageEvent") -> None:
return
@dataclass
@dataclass
class SendResult:
"""Result of sending a message."""
success: bool
@@ -991,6 +991,45 @@ class SendResult:
retryable: bool = False # True for transient connection errors — base will retry automatically
class EphemeralReply(str):
"""System-notice reply that auto-deletes after a TTL.
Slash-command handlers in ``gateway/run.py`` can return this wrapper
instead of a plain string to request that the reply message be deleted
after ``ttl_seconds`` on platforms that support ``delete_message``.
Subclassing ``str`` keeps the wrapper transparent to anything that
treats handler return values as text (existing tests use ``in`` /
``startswith`` / equality; the ``_process_message_background`` pipeline
extracts attachments from the string content). ``isinstance(r,
EphemeralReply)`` still distinguishes ephemeral replies from plain
strings so the send path can schedule deletion.
Platforms that don't override :meth:`BasePlatformAdapter.delete_message`
silently ignore the TTL the message is sent normally and left in
place. When ``ttl_seconds`` is ``None``, the pipeline uses the
configured ``display.ephemeral_system_ttl`` default. A default of ``0``
disables auto-deletion globally, preserving prior behavior.
"""
ttl_seconds: Optional[int]
def __new__(cls, text: str, ttl_seconds: Optional[int] = None):
instance = super().__new__(cls, text)
instance.ttl_seconds = ttl_seconds
return instance
@property
def text(self) -> str:
"""Return the underlying text.
Provided for call sites that want an explicit string conversion,
though ``str(reply)`` and using ``reply`` directly where a string
is expected both work identically.
"""
return str.__str__(self)
def merge_pending_message_event(
pending_messages: Dict[str, MessageEvent],
session_key: str,
@@ -1034,6 +1073,11 @@ def merge_pending_message_event(
existing.text = event.text
if existing_is_photo or incoming_is_photo:
existing.message_type = MessageType.PHOTO
elif (
getattr(existing, "message_type", None) == MessageType.TEXT
and event.message_type != MessageType.TEXT
):
existing.message_type = event.message_type
return
if (
@@ -1068,8 +1112,10 @@ _RETRYABLE_ERROR_PATTERNS = (
)
# Type for message handlers
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
# Type for message handlers. Handlers may return a plain string (normal
# reply), an ``EphemeralReply`` to opt the reply into auto-deletion, or
# ``None`` when the response was already delivered (e.g. via streaming).
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[Union[str, "EphemeralReply"]]]]
def resolve_channel_prompt(
@@ -1454,6 +1500,64 @@ class BasePlatformAdapter(ABC):
"""
return False
def _get_ephemeral_system_ttl_default(self) -> int:
"""Read ``display.ephemeral_system_ttl`` from config.
Returns the TTL in seconds to use when an :class:`EphemeralReply`
does not specify one explicitly. ``0`` (the default) disables
auto-deletion. Non-fatal if config is unreadable.
"""
try:
from hermes_cli.config import load_config as _load_config
except Exception:
return 0
try:
cfg = _load_config()
except Exception:
return 0
display = cfg.get("display", {}) if isinstance(cfg, dict) else {}
if not isinstance(display, dict):
return 0
raw = display.get("ephemeral_system_ttl", 0)
try:
return int(raw)
except (TypeError, ValueError):
return 0
def _schedule_ephemeral_delete(
self,
chat_id: str,
message_id: str,
ttl_seconds: int,
) -> None:
"""Spawn a detached task that deletes ``message_id`` after ``ttl_seconds``.
Best-effort failures (gateway restart, permission denied, message
too old for Telegram's 48h window) are swallowed at debug level.
Does not block the caller.
"""
async def _run_delete() -> None:
try:
await asyncio.sleep(max(1, int(ttl_seconds)))
await self.delete_message(chat_id=chat_id, message_id=message_id)
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug(
"[%s] Ephemeral delete failed for %s/%s: %s",
self.name, chat_id, message_id, e,
)
coro = _run_delete()
try:
asyncio.create_task(coro)
except RuntimeError:
# No running loop (e.g. unit tests that never reach the async
# path). Close the coroutine cleanly so Python doesn't warn
# about it never being awaited, then drop silently.
coro.close()
async def send_slash_confirm(
self,
chat_id: str,
@@ -1489,6 +1593,26 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_private_notice(
self,
chat_id: str,
user_id: Optional[str],
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notice privately when the platform supports it.
The default implementation falls back to a normal send so callers can
use one code path across platforms.
"""
return await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""
Send a typing indicator.
@@ -2043,6 +2167,28 @@ class BasePlatformAdapter(ABC):
lowered = error.lower()
return "timed out" in lowered or "readtimeout" in lowered or "writetimeout" in lowered
def _unwrap_ephemeral(self, response: Any) -> Tuple[Optional[str], int]:
"""Unwrap a handler response into (text, ttl_seconds).
Accepts a plain string, ``None``, or an :class:`EphemeralReply`.
Returns ``(text, ttl)`` where ``ttl > 0`` means the caller should
schedule a deletion via :meth:`_schedule_ephemeral_delete` after
the send succeeds. ``ttl`` is forced to 0 when the adapter
doesn't override :meth:`delete_message` so non-supporting
platforms silently degrade to normal sends.
"""
if isinstance(response, EphemeralReply):
ttl = response.ttl_seconds
if ttl is None:
try:
ttl = int(self._get_ephemeral_system_ttl_default())
except Exception:
ttl = 0
if ttl and ttl > 0 and type(self).delete_message is BasePlatformAdapter.delete_message:
ttl = 0
return response.text, int(ttl or 0)
return response, 0
async def _send_with_retry(
self,
chat_id: str,
@@ -2350,13 +2496,20 @@ class BasePlatformAdapter(ABC):
release_guard=False,
discard_pending=False,
)
if response:
await self._send_with_retry(
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=response,
content=_text,
reply_to=event.message_id,
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
@@ -2436,13 +2589,20 @@ class BasePlatformAdapter(ABC):
try:
_thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
response = await self._message_handler(event)
if response:
await self._send_with_retry(
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=response,
content=_text,
reply_to=event.message_id,
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception as e:
logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
return
@@ -2516,7 +2676,6 @@ class BasePlatformAdapter(ABC):
# Fall back to a new Event only if the entry was removed externally.
interrupt_event = self._active_sessions.get(session_key) or asyncio.Event()
self._active_sessions[session_key] = interrupt_event
callback_generation = getattr(interrupt_event, "_hermes_run_generation", None)
# Start continuous typing indicator (refreshes every 2 seconds)
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
@@ -2549,7 +2708,16 @@ class BasePlatformAdapter(ABC):
# Call the handler (this can take a while with tool calls)
response = await self._message_handler(event)
# Slash-command handlers may return an EphemeralReply sentinel to
# request that their reply message auto-delete after a TTL (used
# for system notices like "✨ New session started!" that the user
# doesn't need to keep in the thread). Unwrap here so all the
# downstream extract_media / text-processing logic sees a plain
# string, and remember the TTL + platform capability so the
# post-send block can schedule the deletion.
response, _ephemeral_ttl = self._unwrap_ephemeral(response)
# Send response if any. A None/empty response is normal when
# streaming already delivered the text (already_sent=True) or
# when the message was queued behind an active agent. Log at
@@ -2638,6 +2806,21 @@ class BasePlatformAdapter(ABC):
)
_record_delivery(result)
# Schedule auto-deletion of system-notice replies.
# Detached so the handler returns immediately; errors
# (permission denied, message too old) are swallowed.
if (
_ephemeral_ttl
and _ephemeral_ttl > 0
and result.success
and result.message_id
):
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=result.message_id,
ttl_seconds=_ephemeral_ttl,
)
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@@ -2815,7 +2998,20 @@ class BasePlatformAdapter(ABC):
finally:
# Fire any one-shot post-delivery callback registered for this
# session (e.g. deferred background-review notifications).
_callback_generation = callback_generation
#
# Snapshot the callback generation HERE (after the agent has run),
# not at the top of this task. _hermes_run_generation is set on
# the interrupt event by GatewayRunner._bind_adapter_run_generation
# during _handle_message_with_agent — which happens DURING the
# self._message_handler(event) await above. Snapshotting earlier
# always captured None, which bypassed the generation-ownership
# check in pop_post_delivery_callback and let stale runs fire a
# fresher run's callbacks.
_callback_generation = getattr(
interrupt_event,
"_hermes_run_generation",
None,
)
if hasattr(self, "pop_post_delivery_callback"):
_post_cb = self.pop_post_delivery_callback(
session_key,
+3 -1
View File
@@ -162,7 +162,9 @@ class BlueBubblesAdapter(BasePlatformAdapter):
return False
from aiohttp import web
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
await self._api_get("/api/v1/ping")
info = await self._api_get("/api/v1/server/info")
+5 -1
View File
@@ -228,7 +228,11 @@ class DingTalkAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, limits=platform_httpx_limits(),
)
credential = dingtalk_stream.Credential(
self._client_id, self._client_secret
+109 -32
View File
@@ -613,6 +613,21 @@ class DiscordAdapter(BasePlatformAdapter):
# so LLM output or echoed user content can't ping the whole
# server; override per DISCORD_ALLOW_MENTION_* env vars or the
# discord.allow_mentions.* block in config.yaml.
# Close any existing client to prevent zombie websocket connections
# on reconnect (see #18187). Without this, the old client remains
# connected to Discord gateway and both fire on_message, causing
# double responses.
if self._client is not None:
try:
if not self._client.is_closed():
await self._client.close()
except Exception:
logger.debug("[%s] Failed to close previous Discord client", self.name)
finally:
self._client = None
self._ready_event.clear()
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
@@ -2584,40 +2599,32 @@ class DiscordAdapter(BasePlatformAdapter):
hidden skills. The slash picker also becomes more discoverable
Discord live-filters by the user's typed prefix against both the
skill name and its description.
The entries list and lookup dict are stored on ``self`` rather
than captured in closure variables so :meth:`refresh_skill_group`
can repopulate them when the user runs ``/reload-skills`` without
needing to touch the Discord slash-command tree or trigger a
``tree.sync()`` call.
"""
try:
from hermes_cli.commands import discord_skill_commands_by_category
existing_names = set()
try:
existing_names = {cmd.name for cmd in tree.get_commands()}
except Exception:
pass
# Reuse the existing collector for consistent filtering
# (per-platform disabled, hub-excluded, name clamping), then
# flatten — the category grouping was only useful for the
# nested layout.
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Populate the instance-level entries/lookup so the
# autocomplete + handler callbacks below always read the
# freshest state. refresh_skill_group() re-runs the same
# collector and mutates these two attributes in place.
self._skill_entries: list[tuple[str, str, str]] = []
self._skill_lookup: dict[str, tuple[str, str]] = {}
self._skill_group_reserved_names: set[str] = set(existing_names)
self._refresh_skill_catalog_state()
if not entries:
if not self._skill_entries:
return
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
# name -> (description, cmd_key) — used by both the autocomplete
# callback and the handler for O(1) dispatch.
skill_lookup: dict[str, tuple[str, str]] = {
n: (d, k) for n, d, k in entries
}
async def _autocomplete_name(
interaction: "discord.Interaction", current: str,
) -> list:
@@ -2627,10 +2634,13 @@ class DiscordAdapter(BasePlatformAdapter):
"/skill pdf" surfaces skills whose description mentions
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
Reads ``self._skill_entries`` so a ``/reload-skills`` run
since process start shows up on the very next keystroke.
"""
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in entries:
for name, desc, _key in self._skill_entries:
if not q or q in name.lower() or (desc and q in desc.lower()):
if desc:
label = f"{name}{desc}"
@@ -2654,7 +2664,7 @@ class DiscordAdapter(BasePlatformAdapter):
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
entry = skill_lookup.get(name)
entry = self._skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
f"Unknown skill: `{name}`. Start typing for "
@@ -2676,16 +2686,74 @@ class DiscordAdapter(BasePlatformAdapter):
logger.info(
"[%s] Registered /skill command with %d skill(s) via autocomplete",
self.name, len(entries),
self.name, len(self._skill_entries),
)
if hidden:
if self._skill_group_hidden_count:
logger.info(
"[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
self.name, hidden,
self.name, self._skill_group_hidden_count,
)
except Exception as exc:
logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)
def _refresh_skill_catalog_state(self) -> None:
"""Re-scan disk for skills and repopulate ``self._skill_entries``.
Called once from :meth:`_register_skill_group` at startup and
again from :meth:`refresh_skill_group` whenever the user runs
``/reload-skills``. No Discord API calls are made autocomplete
and the handler both read from these instance attributes
directly, so an in-place mutation is sufficient.
"""
from hermes_cli.commands import discord_skill_commands_by_category
reserved = getattr(self, "_skill_group_reserved_names", set())
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=set(reserved),
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
self._skill_entries = entries
self._skill_lookup = {n: (d, k) for n, d, k in entries}
self._skill_group_hidden_count = hidden
def refresh_skill_group(self) -> tuple[int, int]:
"""Rescan skills and update the live ``/skill`` autocomplete state.
Invoked by :meth:`gateway.run.GatewayOrchestrator._handle_reload_skills_command`
after :func:`agent.skill_commands.reload_skills` has refreshed
the in-process skill-command registry. Without this call, the
``/skill`` autocomplete dropdown keeps showing the list captured
at process start new skills stay invisible and deleted skills
return an "Unknown skill" error when clicked.
Because autocomplete options are fetched dynamically by Discord,
we only need to mutate the entries/lookup attributes read by the
callbacks no ``tree.sync()`` is required.
Returns ``(new_count, hidden_count)``.
"""
try:
self._refresh_skill_catalog_state()
except Exception as exc:
logger.warning(
"[%s] Failed to refresh /skill autocomplete after reload: %s",
self.name, exc,
)
return (len(getattr(self, "_skill_entries", [])), 0)
logger.info(
"[%s] Refreshed /skill autocomplete: %d skill(s) available (%d filtered)",
self.name,
len(self._skill_entries),
self._skill_group_hidden_count,
)
return (len(self._skill_entries), self._skill_group_hidden_count)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
is_dm = isinstance(interaction.channel, discord.DMChannel)
@@ -2851,8 +2919,15 @@ class DiscordAdapter(BasePlatformAdapter):
raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# YAML parses a bare numeric value such as
# `free_response_channels: 1491973769726791812` as int, which was
# previously falling through the isinstance(str) branch and silently
# returning an empty set. str() here accepts whatever scalar the YAML
# loader hands us without changing existing string/CSV semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
return set()
def _thread_parent_channel(self, channel: Any) -> Any:
@@ -3078,6 +3153,7 @@ class DiscordAdapter(BasePlatformAdapter):
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an interactive button-based update prompt (Yes / No).
@@ -3087,9 +3163,10 @@ class DiscordAdapter(BasePlatformAdapter):
if not self._client or not DISCORD_AVAILABLE:
return SendResult(success=False, error="Not connected")
try:
channel = self._client.get_channel(int(chat_id))
target_id = metadata.get("thread_id") if metadata and metadata.get("thread_id") else chat_id
channel = self._client.get_channel(int(target_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
channel = await self._client.fetch_channel(int(target_id))
default_hint = f" (default: {default})" if default else ""
embed = discord.Embed(
+214 -53
View File
@@ -64,7 +64,7 @@ from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Optional, Sequence
from typing import Any, Dict, List, Literal, Optional, Sequence
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
@@ -141,6 +141,7 @@ from gateway.platforms.base import (
)
from gateway.status import acquire_scoped_lock, release_scoped_lock
from hermes_constants import get_hermes_home
from utils import atomic_json_write
logger = logging.getLogger(__name__)
@@ -387,6 +388,8 @@ class FeishuAdapterSettings:
admins: frozenset[str] = frozenset()
default_group_policy: str = ""
group_rules: Dict[str, FeishuGroupRule] = field(default_factory=dict)
allow_bots: str = "none" # "none" | "mentions" | "all"
require_mention: bool = True
@dataclass
@@ -396,6 +399,7 @@ class FeishuGroupRule:
policy: str # "open" | "allowlist" | "blacklist" | "admin_only" | "disabled"
allowlist: set[str] = field(default_factory=set)
blacklist: set[str] = field(default_factory=set)
require_mention: Optional[bool] = None # None = inherit global
@dataclass
@@ -405,6 +409,40 @@ class FeishuBatchState:
counts: Dict[str, int] = field(default_factory=dict)
# ---------------------------------------------------------------------------
# Admission: policy types
# ---------------------------------------------------------------------------
RejectReason = Literal[
"self_echo",
"self_ids_unknown",
"bots_disabled",
"bot_not_mentioned",
"group_policy_rejected",
]
def _is_bot_sender(sender: Any) -> bool:
# receive_v1 docs say {user, bot}; accept "app" defensively.
return getattr(sender, "sender_type", "") in ("bot", "app")
def _sender_identity(sender: Any) -> frozenset:
# Take any non-empty id variant — tenant sender_id_type decides which are populated.
sid = getattr(sender, "sender_id", None)
if sid is None:
return frozenset()
return frozenset(
v for v in (
getattr(sid, "open_id", None),
getattr(sid, "user_id", None),
getattr(sid, "union_id", None),
)
if v
)
# ---------------------------------------------------------------------------
# Markdown rendering helpers
# ---------------------------------------------------------------------------
@@ -1377,10 +1415,16 @@ class FeishuAdapter(BasePlatformAdapter):
for chat_id, rule_cfg in raw_group_rules.items():
if not isinstance(rule_cfg, dict):
continue
# Only override when the key is explicitly set — missing vs false
# must not collapse.
per_chat_require_mention: Optional[bool] = None
if "require_mention" in rule_cfg:
per_chat_require_mention = _to_boolean(rule_cfg.get("require_mention"))
group_rules[str(chat_id)] = FeishuGroupRule(
policy=str(rule_cfg.get("policy", "open")).strip().lower(),
allowlist=set(str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()),
blacklist=set(str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()),
require_mention=per_chat_require_mention,
)
# Bot-level admins
@@ -1390,6 +1434,16 @@ class FeishuAdapter(BasePlatformAdapter):
# Default group policy (for groups not in group_rules)
default_group_policy = str(extra.get("default_group_policy", "")).strip().lower()
# Env-only so adapter and gateway auth bypass share one source; yaml
# feishu.allow_bots is bridged to this env var at config load.
allow_bots = os.getenv("FEISHU_ALLOW_BOTS", "none").strip().lower()
if allow_bots not in ("none", "mentions", "all"):
logger.warning(
"[Feishu] Unknown allow_bots=%r, falling back to 'none'. Valid: none, mentions, all.",
allow_bots,
)
allow_bots = "none"
return FeishuAdapterSettings(
app_id=str(extra.get("app_id") or os.getenv("FEISHU_APP_ID", "")).strip(),
app_secret=str(extra.get("app_secret") or os.getenv("FEISHU_APP_SECRET", "")).strip(),
@@ -1446,6 +1500,10 @@ class FeishuAdapter(BasePlatformAdapter):
admins=admins,
default_group_policy=default_group_policy,
group_rules=group_rules,
allow_bots=allow_bots,
require_mention=_to_boolean(
extra.get("require_mention", os.getenv("FEISHU_REQUIRE_MENTION", "true"))
),
)
def _apply_settings(self, settings: FeishuAdapterSettings) -> None:
@@ -1476,6 +1534,8 @@ class FeishuAdapter(BasePlatformAdapter):
self._ws_reconnect_interval = settings.ws_reconnect_interval
self._ws_ping_interval = settings.ws_ping_interval
self._ws_ping_timeout = settings.ws_ping_timeout
self._allow_bots = settings.allow_bots
self._require_mention = settings.require_mention
def _build_event_handler(self) -> Any:
if EventDispatcherHandler is None:
@@ -2189,30 +2249,28 @@ class FeishuAdapter(BasePlatformAdapter):
event = getattr(data, "event", None)
message = getattr(event, "message", None)
sender = getattr(event, "sender", None)
sender_id = getattr(sender, "sender_id", None)
if not message or not sender_id:
logger.debug("[Feishu] Dropping malformed inbound event: missing message or sender_id")
if not message or not sender or not getattr(sender, "sender_id", None):
logger.debug("[Feishu] Dropping malformed inbound event: missing message/sender")
return
message_id = getattr(message, "message_id", None)
if not message_id or self._is_duplicate(message_id):
logger.debug("[Feishu] Dropping duplicate/missing message_id: %s", message_id)
return
if self._is_self_sent_bot_message(event):
logger.debug("[Feishu] Dropping self-sent bot event: %s", message_id)
reason = self._admit(sender, message)
if reason is not None:
logger.debug("[Feishu] dropping inbound event: %s", reason)
return
chat_type = getattr(message, "chat_type", "p2p")
chat_id = getattr(message, "chat_id", "") or ""
if chat_type != "p2p" and not self._should_accept_group_message(message, sender_id, chat_id):
logger.debug("[Feishu] Dropping group message that failed mention/policy gate: %s", message_id)
return
await self._process_inbound_message(
data=data,
message=message,
sender_id=sender_id,
sender_id=getattr(sender, "sender_id", None),
chat_type=chat_type,
message_id=message_id,
is_bot=_is_bot_sender(sender),
)
def _on_message_read_event(self, data: P2ImMessageMessageReadV1) -> None:
@@ -2389,10 +2447,11 @@ class FeishuAdapter(BasePlatformAdapter):
msg = items[0] if items else None
if not msg:
return
# GET im/v1/messages returns sender.id=app_id for bot messages —
# peer bots and us share sender_type="app" but differ on app_id.
sender = getattr(msg, "sender", None)
sender_type = str(getattr(sender, "sender_type", "") or "").lower()
if sender_type != "app":
return # only route reactions on our own bot messages
if str(getattr(sender, "id", "") or "") != self._app_id:
return # only route reactions on this bot's own messages
chat_id = str(getattr(msg, "chat_id", "") or "")
chat_type_raw = str(getattr(msg, "chat_type", "p2p") or "p2p")
if not chat_id:
@@ -2679,6 +2738,7 @@ class FeishuAdapter(BasePlatformAdapter):
sender_id: Any,
chat_type: str,
message_id: str,
is_bot: bool = False,
) -> None:
text, inbound_type, media_urls, media_types, mentions = await self._extract_message_content(message)
@@ -2704,19 +2764,27 @@ class FeishuAdapter(BasePlatformAdapter):
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
sender_primary = (
getattr(sender_id, "open_id", None)
or getattr(sender_id, "user_id", None)
or getattr(sender_id, "union_id", None)
or "<unknown>"
)
logger.info(
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s text=%r media=%d",
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s sender=%s:%s text=%r media=%d",
"dm" if chat_type == "p2p" else "group",
message_id,
inbound_type.value,
getattr(message, "chat_id", "") or "",
"bot" if is_bot else "user",
sender_primary,
text[:120],
len(media_urls),
)
chat_id = getattr(message, "chat_id", "") or ""
chat_info = await self.get_chat_info(chat_id)
sender_profile = await self._resolve_sender_profile(sender_id)
sender_profile = await self._resolve_sender_profile(sender_id, is_bot=is_bot)
source = self.build_source(
chat_id=chat_id,
chat_name=chat_info.get("name") or chat_id or "Feishu Chat",
@@ -2725,6 +2793,7 @@ class FeishuAdapter(BasePlatformAdapter):
user_name=sender_profile["user_name"],
thread_id=getattr(message, "thread_id", None) or None,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
normalized = MessageEvent(
text=text,
@@ -2853,13 +2922,18 @@ class FeishuAdapter(BasePlatformAdapter):
},
)
response.raise_for_status()
# Snapshot Content-Type and body while the client context is
# still active so pooled connections fully release on exit.
# See #18451.
content_type_hdr = str(response.headers.get("Content-Type", ""))
body = response.content
filename = self._derive_remote_filename(
file_url,
content_type=str(response.headers.get("Content-Type", "")),
content_type=content_type_hdr,
default_name=preferred_name,
default_ext=default_ext,
)
cached_path = cache_document_from_bytes(response.content, filename)
cached_path = cache_document_from_bytes(body, filename)
return cached_path, filename
@staticmethod
@@ -3447,7 +3521,12 @@ class FeishuAdapter(BasePlatformAdapter):
return "dm"
return "group"
async def _resolve_sender_profile(self, sender_id: Any) -> Dict[str, Optional[str]]:
async def _resolve_sender_profile(
self,
sender_id: Any,
*,
is_bot: bool = False,
) -> Dict[str, Optional[str]]:
"""Map Feishu's three-tier user IDs onto Hermes' SessionSource fields.
Preference order for the primary ``user_id`` field:
@@ -3464,7 +3543,11 @@ class FeishuAdapter(BasePlatformAdapter):
union_id = getattr(sender_id, "union_id", None) or None
# Prefer tenant-scoped user_id; fall back to app-scoped open_id.
primary_id = user_id or open_id
display_name = await self._resolve_sender_name_from_api(primary_id or union_id)
# bot/v3/bots/basic_batch only accepts open_id.
name_lookup_id = open_id if is_bot else (primary_id or union_id)
display_name = await self._resolve_sender_name_from_api(
name_lookup_id, is_bot=is_bot,
)
return {
"user_id": primary_id,
"user_name": display_name,
@@ -3484,11 +3567,14 @@ class FeishuAdapter(BasePlatformAdapter):
self._sender_name_cache.pop(sender_id, None)
return None
async def _resolve_sender_name_from_api(self, sender_id: Optional[str]) -> Optional[str]:
"""Fetch the sender's display name from the Feishu contact API with a 10-minute cache.
ID-type detection mirrors openclaw: ou_ open_id, on_ union_id, else user_id.
Failures are silently suppressed; the message pipeline must not block on name resolution.
async def _resolve_sender_name_from_api(
self,
sender_id: Optional[str],
*,
is_bot: bool = False,
) -> Optional[str]:
"""Bots divert to bot/basic_batch — contact API doesn't return bot names.
Failures are silent so the pipeline never blocks on name resolution.
"""
if not sender_id or not self._client:
return None
@@ -3498,7 +3584,16 @@ class FeishuAdapter(BasePlatformAdapter):
now = time.time()
cached_name = self._get_cached_sender_name(trimmed)
if cached_name is not None:
return cached_name
return cached_name or None # "" cached means "known nameless"
if is_bot:
names = await self._fetch_bot_names([trimmed])
if names is None:
return None
expire_at = now + _FEISHU_SENDER_NAME_TTL_SECONDS
for oid, name in names.items():
self._sender_name_cache[oid] = (name, expire_at)
hit = self._sender_name_cache.get(trimmed)
return (hit[0] or None) if hit else None
try:
from lark_oapi.api.contact.v3 import GetUserRequest # lazy import
if trimmed.startswith("ou_"):
@@ -3527,6 +3622,35 @@ class FeishuAdapter(BasePlatformAdapter):
logger.debug("[Feishu] Failed to resolve sender name for %s", sender_id, exc_info=True)
return None
async def _fetch_bot_names(self, bot_ids: List[str]) -> Optional[Dict[str, str]]:
if not self._client or not bot_ids:
return None
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/bots/basic_batch")
.queries([("bot_ids", oid) for oid in bot_ids])
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if not content:
return None
payload = json.loads(content)
if payload.get("code") != 0:
return None
bots = (payload.get("data") or {}).get("bots") or {}
return {
oid: str(info.get("name") or "").strip()
for oid, info in bots.items()
if oid
}
except Exception:
logger.debug("[Feishu] Failed to fetch bot names for %s", bot_ids, exc_info=True)
return None
async def _fetch_message_text(self, message_id: str) -> Optional[str]:
if not self._client or not message_id:
return None
@@ -3590,10 +3714,60 @@ class FeishuAdapter(BasePlatformAdapter):
logger.exception("[Feishu] Background inbound processing failed")
# =========================================================================
# Group policy and mention gating
# Inbound admission
# =========================================================================
def _allow_group_message(self, sender_id: Any, chat_id: str = "") -> bool:
def _admit(self, sender: Any, message: Any) -> Optional[RejectReason]:
sender_ids = _sender_identity(sender)
self_ids = frozenset(v for v in (self._bot_open_id, self._bot_user_id) if v)
is_bot = _is_bot_sender(sender)
is_group = getattr(message, "chat_type", "p2p") != "p2p"
chat_id = getattr(message, "chat_id", "") or ""
require_mention = is_group and self._require_mention_for(chat_id)
# Defensive only — Feishu doesn't echo our outbound back as inbound,
# and open_id is always populated on both sides.
if self_ids and sender_ids & self_ids:
return "self_echo"
if is_bot:
mode = self._allow_bots
if mode != "mentions" and mode != "all":
return "bots_disabled"
# Defensive: pre-hydration or malformed payloads.
if not self_ids or not sender_ids:
return "self_ids_unknown"
# Step 4 covers mention enforcement for groups when require_mention
# is on; check here only on paths step 4 won't reach.
if mode == "mentions" and not require_mention and not self._mentions_self(message):
return "bot_not_mentioned"
if not is_group:
return None
if not self._allow_group_message(
getattr(sender, "sender_id", None), chat_id, is_bot=is_bot,
):
return "group_policy_rejected"
if require_mention and not self._mentions_self(message):
return "group_policy_rejected"
return None
def _require_mention_for(self, chat_id: str) -> bool:
rule = self._group_rules.get(chat_id) if chat_id else None
if rule and rule.require_mention is not None:
return rule.require_mention
return self._require_mention
# --- Group policy ---------------------------------------------------------
def _allow_group_message(
self,
sender_id: Any,
chat_id: str = "",
*,
is_bot: bool = False,
) -> bool:
"""Per-group policy gate for non-DM traffic."""
sender_open_id = getattr(sender_id, "open_id", None)
sender_user_id = getattr(sender_id, "user_id", None)
@@ -3612,12 +3786,17 @@ class FeishuAdapter(BasePlatformAdapter):
allowlist = self._allowed_group_users
blacklist = set()
# Channel locks apply to everyone; allowlist/blacklist only gate humans
# (bots were already cleared upstream by FEISHU_ALLOW_BOTS).
if policy == "disabled":
return False
if policy == "open":
return True
if policy == "admin_only":
return False
if is_bot:
return True
if policy == "allowlist":
return bool(sender_ids and (sender_ids & allowlist))
if policy == "blacklist":
@@ -3625,17 +3804,16 @@ class FeishuAdapter(BasePlatformAdapter):
return bool(sender_ids and (sender_ids & self._allowed_group_users))
def _should_accept_group_message(self, message: Any, sender_id: Any, chat_id: str = "") -> bool:
"""Require an explicit @mention before group messages enter the agent."""
if not self._allow_group_message(sender_id, chat_id):
return False
# @_all is Feishu's @everyone placeholder — always route to the bot.
# --- Mention detection ----------------------------------------------------
def _mentions_self(self, message: Any) -> bool:
# @_all is Feishu's @everyone placeholder.
raw_content = getattr(message, "content", "") or ""
if "@_all" in raw_content:
return True
mentions = getattr(message, "mentions", None) or []
if mentions:
return self._message_mentions_bot(mentions)
if mentions and self._message_mentions_bot(mentions):
return True
normalized = normalize_feishu_message(
message_type=getattr(message, "message_type", "") or "",
raw_content=raw_content,
@@ -3644,23 +3822,6 @@ class FeishuAdapter(BasePlatformAdapter):
)
return self._post_mentions_bot(normalized.mentions)
def _is_self_sent_bot_message(self, event: Any) -> bool:
"""Return True only for Feishu events emitted by this Hermes bot."""
sender = getattr(event, "sender", None)
sender_type = str(getattr(sender, "sender_type", "") or "").strip().lower()
if sender_type not in {"bot", "app"}:
return False
sender_id = getattr(sender, "sender_id", None)
sender_open_id = str(getattr(sender_id, "open_id", "") or "").strip()
sender_user_id = str(getattr(sender_id, "user_id", "") or "").strip()
if self._bot_open_id and sender_open_id == self._bot_open_id:
return True
if self._bot_user_id and sender_user_id == self._bot_user_id:
return True
return False
def _message_mentions_bot(self, mentions: List[Any]) -> bool:
# IDs trump names: when both sides have open_id (or both user_id),
# match requires equal IDs. Name fallback only when either side
@@ -3804,7 +3965,7 @@ class FeishuAdapter(BasePlatformAdapter):
recent = self._seen_message_order[-self._dedup_cache_size:]
# Save as {msg_id: timestamp} so TTL filtering works across restarts.
payload = {"message_ids": {k: self._seen_message_ids[k] for k in recent if k in self._seen_message_ids}}
self._dedup_state_path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
atomic_json_write(self._dedup_state_path, payload, indent=None)
except OSError:
logger.warning("[Feishu] Failed to persist dedup state to %s", self._dedup_state_path, exc_info=True)
+3 -2
View File
@@ -13,6 +13,8 @@ import time
from pathlib import Path
from typing import TYPE_CHECKING, Dict
from utils import atomic_json_write
if TYPE_CHECKING:
from gateway.platforms.base import MessageEvent
@@ -237,12 +239,11 @@ class ThreadParticipationTracker:
def _save(self) -> None:
path = self._state_path()
path.parent.mkdir(parents=True, exist_ok=True)
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
path.write_text(json.dumps(thread_list), encoding="utf-8")
atomic_json_write(path, thread_list, indent=None)
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
+4
View File
@@ -243,10 +243,14 @@ class QQAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive pool so idle CLOSE_WAIT sockets drain
# faster behind proxies like Cloudflare Warp (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
limits=platform_httpx_limits(),
)
# 1. Get access token
+15 -1
View File
@@ -248,7 +248,9 @@ class SignalAdapter(BasePlatformAdapter):
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
# Health check — verify signal-cli daemon is reachable
try:
@@ -534,6 +536,18 @@ class SignalAdapter(BasePlatformAdapter):
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Skip envelopes with no meaningful content (no text, no attachments).
# Catches profile key updates, empty messages, and other metadata-only
# envelopes that still carry a dataMessage wrapper but have nothing
# worth processing. See issue: signal-cli logs "Profile key update" +
# Hermes receives msg='' triggering a full agent turn for nothing.
if (not text or not text.strip()) and not media_urls:
logger.debug(
"Signal: skipping contentless envelope from %s (%d attachments)",
redact_phone(sender), len(media_urls) if media_urls else 0,
)
return
# Build session source
source = self.build_source(
chat_id=chat_id,
+221 -13
View File
@@ -9,6 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import contextvars
import json
import logging
import os
@@ -21,6 +22,7 @@ try:
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
import aiohttp
SLACK_AVAILABLE = True
except ImportError:
SLACK_AVAILABLE = False
@@ -50,6 +52,16 @@ from gateway.platforms.base import (
logger = logging.getLogger(__name__)
# ContextVar carrying the user_id of the slash-command invoker.
# Set in _handle_slash_command, read in send() to match the correct
# stashed response_url when multiple users issue commands on the same
# channel concurrently. ContextVars propagate to child asyncio.Tasks
# (Python 3.7+), so the value set in _handle_slash_command's task is
# visible in _process_message_background's child task.
_slash_user_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
"_slash_user_id", default=None,
)
@dataclass
class _ThreadContextCache:
@@ -310,6 +322,11 @@ class SlackAdapter(BasePlatformAdapter):
# Track active assistant thread status indicators so stop_typing can
# clear them (chat_id → thread_ts).
self._active_status_threads: Dict[str, str] = {}
# Slash-command contexts: stash response_url + user_id so send()
# can route the first reply ephemerally. Keyed by
# (channel_id, user_id) to avoid cross-user collisions.
# Each value: {"response_url": str, "ts": float}
self._slash_command_contexts: Dict[Tuple[str, str], Dict[str, Any]] = {}
def _describe_slack_api_error(self, response: Any, *, file_obj: Optional[Dict[str, Any]] = None) -> Optional[str]:
"""Convert Slack API auth/permission failures into actionable user-facing text."""
@@ -368,6 +385,103 @@ class SlackAdapter(BasePlatformAdapter):
)
return None
# ------------------------------------------------------------------
# Slash-command ephemeral helpers
# ------------------------------------------------------------------
_SLASH_CTX_TTL = 120.0 # seconds — response_url is valid for 30 min;
# we use a much shorter TTL to avoid routing unrelated messages
# as ephemeral if the command handler was slow or dropped.
def _pop_slash_context(
self, chat_id: str,
) -> Optional[Dict[str, Any]]:
"""Return and remove the slash-command context for *chat_id*, if fresh.
Contexts older than ``_SLASH_CTX_TTL`` seconds are silently discarded.
Uses the ``_slash_user_id`` ContextVar (set in ``_handle_slash_command``)
to match the exact ``(channel_id, user_id)`` key. This prevents a
concurrent slash command from a different user on the same channel from
stealing another user's ephemeral context. Falls back to a
channel-only scan when the ContextVar is unset (e.g. send() called
from a non-slash code path should not match anything).
"""
now = time.monotonic()
# Clean up stale entries on every lookup — dict is small.
stale_keys = [
k for k, v in self._slash_command_contexts.items()
if now - v["ts"] > self._SLASH_CTX_TTL
]
for k in stale_keys:
self._slash_command_contexts.pop(k, None)
# Precise match: (channel_id, user_id) from ContextVar.
uid = _slash_user_id.get()
if uid:
return self._slash_command_contexts.pop((chat_id, uid), None)
# Fallback: channel-only scan (only reachable when ContextVar is
# unset, i.e. send() called outside a slash-command async context).
match_key = None
for key in list(self._slash_command_contexts):
if key[0] == chat_id:
match_key = key
break
if match_key is None:
return None
return self._slash_command_contexts.pop(match_key)
async def _send_slash_ephemeral(
self,
ctx: Dict[str, Any],
content: str,
) -> "SendResult":
"""Replace the initial ephemeral ack via ``response_url``.
Slack's ``response_url`` accepts a POST with ``replace_original``
for up to 30 minutes after the slash command was invoked. This
lets us swap the "Running /cmd…" placeholder with the real reply,
and the message stays ephemeral ("Only visible to you").
Falls back to a simple ``True`` SendResult if the POST fails
the user already saw the initial ack, so a delivery failure here
is non-critical.
"""
formatted = self.format_message(content)
# Slack's response_url has the same ~40k char limit as chat_postMessage.
# Truncate to MAX_MESSAGE_LENGTH and use only the first chunk — the
# response_url replaces a single ephemeral ack, so multi-chunk isn't
# possible. Long responses are rare for command replies.
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
text = chunks[0] if chunks else formatted
payload = {
"response_type": "ephemeral",
"replace_original": True,
"text": text,
}
try:
async with aiohttp.ClientSession() as session:
async with session.post(
ctx["response_url"],
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=None)
body = await resp.text()
logger.warning(
"[Slack] response_url POST returned %s: %s",
resp.status,
body[:200],
)
except Exception as e:
logger.warning(
"[Slack] response_url POST failed: %s", e,
)
# Non-fatal — the user saw the initial ack already.
return SendResult(success=True, message_id=None)
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
if not SLACK_AVAILABLE:
@@ -446,12 +560,16 @@ class SlackAdapter(BasePlatformAdapter):
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
# Handle app_mention explicitly. In some Slack app configurations,
# channel mentions arrive only as app_mention events rather than the
# generic message event. Forward them into the normal message
# pipeline so @mentions reliably produce replies.
# NOTE: when Slack fires BOTH message and app_mention for the same
# @mention, they share the same event ts — the dedup in
# _handle_slack_message (MessageDeduplicator) suppresses the second.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
pass
await self._handle_slack_message(event)
# File lifecycle events can arrive around snippet uploads even when
# the actual user message is what we care about. Ack them so Slack
@@ -502,7 +620,11 @@ class SlackAdapter(BasePlatformAdapter):
@self._app.command(_slash_pattern)
async def handle_hermes_command(ack, command):
await ack()
slash = (command.get("command") or "").lstrip("/")
await ack(
response_type="ephemeral",
text=f"Running `/{slash}`…",
)
await self._handle_slash_command(command)
# Register Block Kit action handlers for approval buttons
@@ -574,6 +696,17 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Not connected")
try:
# Check for a pending slash-command context. When the user ran a
# native slash command (e.g. /q, /stop, /model), the initial ack
# already showed an ephemeral "Running /cmd…" message. If we have
# a stashed response_url for this channel, replace that ack with
# the actual command reply ephemerally instead of posting publicly.
slash_ctx = self._pop_slash_context(chat_id)
if slash_ctx:
return await self._send_slash_ephemeral(
slash_ctx, content,
)
# Convert standard markdown → Slack mrkdwn
formatted = self.format_message(content)
@@ -601,6 +734,10 @@ class SlackAdapter(BasePlatformAdapter):
last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)
# Clear Slack Assistant status as soon as the final message is posted.
if thread_ts:
await self.stop_typing(chat_id)
# Track the sent message ts so we can auto-respond to thread
# replies without requiring @mention.
sent_ts = last_result.get("ts") if last_result else None
@@ -624,6 +761,42 @@ class SlackAdapter(BasePlatformAdapter):
logger.error("[Slack] Send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def send_private_notice(
self,
chat_id: str,
user_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a Slack ephemeral message visible only to one user."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not chat_id or not user_id:
return SendResult(success=False, error="chat_id and user_id are required")
try:
formatted = self.format_message(content)
thread_ts = self._resolve_thread_ts(reply_to, metadata)
kwargs = {
"channel": chat_id,
"user": user_id,
"text": formatted,
"mrkdwn": True,
}
if thread_ts:
kwargs["thread_ts"] = thread_ts
result = await self._get_client(chat_id).chat_postEphemeral(**kwargs)
return SendResult(
success=True,
message_id=result.get("message_ts") or result.get("ts"),
raw_response=result,
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Ephemeral send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def edit_message(
self,
chat_id: str,
@@ -642,6 +815,8 @@ class SlackAdapter(BasePlatformAdapter):
ts=message_id,
text=formatted,
)
if finalize:
await self.stop_typing(chat_id)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
@@ -682,7 +857,7 @@ class SlackAdapter(BasePlatformAdapter):
# in an assistant-enabled context. Falls back to reactions.
logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)
async def stop_typing(self, chat_id: str) -> None:
async def stop_typing(self, chat_id: str, metadata=None) -> None:
"""Clear the assistant thread status indicator."""
if not self._app:
return
@@ -969,7 +1144,7 @@ class SlackAdapter(BasePlatformAdapter):
return _ph(f'<{url}|{label}>')
text = re.sub(
r'\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
r'(?<!!)\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
_convert_markdown_link,
text,
)
@@ -1016,9 +1191,11 @@ class SlackAdapter(BasePlatformAdapter):
)
# 10) Convert italic: _text_ stays as _text_ (already Slack italic)
# Single *text* → _text_ (Slack italic)
# Single *text* → _text_ (Slack italic), but only when the
# emphasized text touches non-whitespace on both sides so literal
# delimiters like "a * b * c" are preserved.
text = re.sub(
r'(?<!\*)\*([^*\n]+)\*(?!\*)',
r'(?<!\*)\*(\S(?:[^*\n]*?\S)?)\*(?!\*)',
lambda m: _ph(f'_{m.group(1)}_'),
text,
)
@@ -2524,9 +2701,14 @@ class SlackAdapter(BasePlatformAdapter):
# gateway command dispatcher by prepending the slash.
text = f"/{slash_name} {text}".strip()
# Slack slash commands can originate from DMs or shared channels.
# Preserve DM semantics only for DM channel IDs; shared channels must
# keep group semantics so different users do not collide into one
# session key.
is_dm = str(channel_id).startswith("D")
source = self.build_source(
chat_id=channel_id,
chat_type="dm", # Slash commands are always in DM-like context
chat_type="dm" if is_dm else "group",
user_id=user_id,
)
@@ -2537,7 +2719,26 @@ class SlackAdapter(BasePlatformAdapter):
raw_message=command,
)
await self.handle_message(event)
# Stash the Slack response_url so the first reply for this
# channel+user can be routed ephemerally (replaces the initial
# "Running /cmd…" ack shown by handle_hermes_command).
# Only stash for COMMAND events (text starts with "/") — free-form
# questions via "/hermes <question>" must produce public replies so
# the whole channel can see the agent's answer.
response_url = command.get("response_url", "")
if response_url and user_id and channel_id and text.startswith("/"):
self._slash_command_contexts[(channel_id, user_id)] = {
"response_url": response_url,
"ts": time.monotonic(),
}
# Set the ContextVar so send() can match the correct stashed
# response_url even when multiple users slash concurrently.
_slash_user_id_token = _slash_user_id.set(user_id or None)
try:
await self.handle_message(event)
finally:
_slash_user_id.reset(_slash_user_id_token)
def _has_active_session_for_thread(
self,
@@ -2698,6 +2899,13 @@ class SlackAdapter(BasePlatformAdapter):
raw = os.getenv("SLACK_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# A bare numeric YAML value (`free_response_channels: 1234567890`) is
# loaded as int and was previously falling through the isinstance(str)
# branch to return an empty set. str() here accepts whatever scalar
# the YAML loader hands us without changing existing string/CSV
# semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
return set()
+143 -7
View File
@@ -290,14 +290,53 @@ class TelegramAdapter(BasePlatformAdapter):
# and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
self._slash_confirm_state: Dict[str, str] = {}
@staticmethod
def _is_callback_user_authorized(user_id: str) -> bool:
def _is_callback_user_authorized(
self,
user_id: str,
*,
chat_id: Optional[str] = None,
chat_type: Optional[str] = None,
thread_id: Optional[str] = None,
user_name: Optional[str] = None,
) -> bool:
"""Return whether a Telegram inline-button caller may perform gated actions."""
normalized_user_id = str(user_id or "").strip()
if not normalized_user_id:
return False
runner = getattr(getattr(self, "_message_handler", None), "__self__", None)
auth_fn = getattr(runner, "_is_user_authorized", None)
if callable(auth_fn):
try:
from gateway.session import SessionSource
normalized_chat_type = str(chat_type or "dm").strip().lower() or "dm"
if normalized_chat_type == "private":
normalized_chat_type = "dm"
elif normalized_chat_type == "supergroup":
normalized_chat_type = "forum" if thread_id is not None else "group"
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id=str(chat_id or normalized_user_id),
chat_type=normalized_chat_type,
user_id=normalized_user_id,
user_name=str(user_name).strip() if user_name else None,
thread_id=str(thread_id) if thread_id is not None else None,
)
return bool(auth_fn(source))
except Exception:
logger.debug(
"[Telegram] Falling back to env-only callback auth for user %s",
normalized_user_id,
exc_info=True,
)
allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
if not allowed_csv:
return True
allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
return "*" in allowed_ids or user_id in allowed_ids
return "*" in allowed_ids or normalized_user_id in allowed_ids
@classmethod
def _metadata_thread_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[str]:
@@ -473,6 +512,17 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, attempt,
)
self._polling_network_error_count = 0
# start_polling() returning is necessary but not sufficient:
# PTB's Updater can be left in a state where `running` is True
# but the underlying long-poll task is wedged on a stale httpx
# connection and never makes progress. No error_callback fires
# in that state, so the reconnect ladder won't advance on its
# own. Schedule a deferred probe to detect the wedge and
# re-enter the ladder if needed.
if not self.has_fatal_error:
probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
self._background_tasks.add(probe)
probe.add_done_callback(self._background_tasks.discard)
except Exception as retry_err:
logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
# start_polling failed — polling is dead and no further error
@@ -484,6 +534,50 @@ class TelegramAdapter(BasePlatformAdapter):
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
async def _verify_polling_after_reconnect(self) -> None:
"""Heartbeat probe scheduled after a successful reconnect.
PTB's Updater can survive a botched stop()+start_polling() cycle
with `running=True` but a wedged consumer task. No error callback
fires, so the reconnect ladder doesn't advance on its own. This
probe detects the wedge by:
1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
to complete at least one cycle.
2. Verifying `Updater.running` is still True.
3. Probing the bot endpoint with a tight asyncio timeout. A
wedged httpx pool fails this probe; a healthy one returns
well under the timeout.
On any failure, re-enter the reconnect ladder so the existing
MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
"""
HEARTBEAT_PROBE_DELAY = 60
PROBE_TIMEOUT = 10
await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
if self.has_fatal_error:
return
if not (self._app and self._app.updater and self._app.updater.running):
logger.warning(
"[%s] Updater not running %ds after reconnect — treating as wedged",
self.name, HEARTBEAT_PROBE_DELAY,
)
await self._handle_polling_network_error(
RuntimeError("Updater not running after reconnect heartbeat")
)
return
try:
await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
except Exception as probe_err:
logger.warning(
"[%s] Polling heartbeat probe failed %ds after reconnect: %s",
self.name, HEARTBEAT_PROBE_DELAY, probe_err,
)
await self._handle_polling_network_error(probe_err)
async def _handle_polling_conflict(self, error: Exception) -> None:
if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
return
@@ -722,6 +816,20 @@ class TelegramAdapter(BasePlatformAdapter):
# Persist thread_id to config so we don't recreate on next restart
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
# Send a seed message so the topic is visible in Telegram's client.
# Empty topics are hidden by the client UI until they contain a message.
try:
await self._bot.send_message(
chat_id=int(chat_id),
message_thread_id=thread_id,
text=f"\U0001f4cc {topic_name}",
)
except Exception as seed_err:
logger.debug(
"[%s] Could not send seed message to topic '%s': %s",
self.name, topic_name, seed_err,
)
async def connect(self) -> bool:
"""Connect to Telegram via polling or webhook.
@@ -1321,6 +1429,7 @@ class TelegramAdapter(BasePlatformAdapter):
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an inline-keyboard update prompt (Yes / No buttons).
@@ -1338,11 +1447,14 @@ class TelegramAdapter(BasePlatformAdapter):
InlineKeyboardButton("✗ No", callback_data="update_prompt:n"),
]
])
thread_id = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_send(thread_id)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
message_thread_id=message_thread_id,
**self._link_preview_kwargs(),
)
return SendResult(success=True, message_id=str(msg.message_id))
@@ -1760,6 +1872,12 @@ class TelegramAdapter(BasePlatformAdapter):
if not query or not query.data:
return
data = query.data
query_message = getattr(query, "message", None)
query_chat_id = getattr(query_message, "chat_id", None)
query_chat = getattr(query_message, "chat", None)
query_chat_type = getattr(query_chat, "type", None)
query_thread_id = getattr(query_message, "message_thread_id", None)
query_user_name = getattr(query.from_user, "first_name", None)
# --- Model picker callbacks ---
if data.startswith(("mp:", "mm:", "mb", "mx", "mg:")):
@@ -1781,7 +1899,13 @@ class TelegramAdapter(BasePlatformAdapter):
# Only authorized users may click approval buttons.
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to approve commands.")
return
@@ -1831,8 +1955,14 @@ class TelegramAdapter(BasePlatformAdapter):
choice = parts[1] # once, always, cancel
confirm_id = parts[2]
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to answer this prompt.")
return
@@ -1891,7 +2021,13 @@ class TelegramAdapter(BasePlatformAdapter):
return
answer = data.split(":", 1)[1] # "y" or "n"
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to answer update prompts.")
return
await query.answer(text=f"Sent '{answer}' to the update process.")
+5 -1
View File
@@ -206,7 +206,11 @@ class WeComAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
)
await self._open_connection()
self._mark_connected()
self._listen_task = asyncio.create_task(self._listen_loop())
+3 -1
View File
@@ -119,7 +119,9 @@ class WecomCallbackAdapter(BasePlatformAdapter):
pass
try:
self._http_client = httpx.AsyncClient(timeout=20.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
self._app = web.Application()
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get(self._path, self._handle_verify)
+3 -1
View File
@@ -2030,7 +2030,9 @@ async def send_weixin_direct(
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if live_adapter is not None and send_session is not None and not send_session.closed:
if (live_adapter is not None and send_session is not None
and not send_session.closed
and send_session._loop is asyncio.get_running_loop()):
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
+32 -2
View File
@@ -185,6 +185,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
# Set to True by disconnect() before we SIGTERM our child bridge so
# _check_managed_bridge_exit() can distinguish an intentional
# shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
# Without this, every graceful gateway shutdown/restart would log
# "Fatal whatsapp adapter error" plus dispatch a fatal-error
# notification before the normal "✓ whatsapp disconnected" fires.
self._shutting_down: bool = False
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
@@ -555,6 +562,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if returncode is None:
return None
# Planned shutdown: disconnect() sets _shutting_down before it sends
# SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
# or 0 (clean exit) at that point is expected, not a crash. Treat it
# as informational and skip the fatal-error path.
# getattr-with-default keeps tests that construct the adapter via
# ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
# every _make_adapter() helper having to seed the attribute.
if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
logger.info(
"[%s] Bridge exited during shutdown (code %d).",
self.name,
returncode,
)
return None
message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
if not self.has_fatal_error:
logger.error("[%s] %s", self.name, message)
@@ -565,6 +587,10 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def disconnect(self) -> None:
"""Stop the WhatsApp bridge and clean up any orphaned processes."""
# Flip the shutdown flag BEFORE signalling the child so the exit-check
# path (which runs from other tasks like send() and the poll loop)
# doesn't race us and report the intentional termination as fatal.
self._shutting_down = True
if self._bridge_process:
try:
try:
@@ -876,11 +902,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
await self._http_session.post(
# Must wrap in `async with` — a bare `await session.post(...)`
# leaves the response object alive until GC, holding its TCP
# socket in CLOSE_WAIT. See #18451.
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
):
pass
except Exception:
pass # Ignore typing indicator failures
+6 -4
View File
@@ -1896,10 +1896,12 @@ class OwnerCommandMiddleware(InboundMiddleware):
if cmd not in cls.ALLOWLIST:
return None, None, False
# Sender identity check: bot owner <-> push.from_account == push.bot_owner_id
# owner_id = (push or {}).get("bot_owner_id") or ""
# is_owner = bool(owner_id) and owner_id == from_account
is_owner = True
# Sender identity check: bot owner <-> push.from_account == push.bot_owner_id.
# The allowlisted commands (/approve, /deny, /stop, /reset, ...) are
# privileged — leaking them to non-owners lets any group member approve
# a dangerous tool call, kill the owner's task, or wipe session state.
owner_id = str((push or {}).get("bot_owner_id") or "").strip()
is_owner = bool(owner_id) and owner_id == from_account
return cmd, cmd_line, is_owner
async def handle(self, ctx: InboundContext, next_fn) -> None:
+1246 -105
View File
File diff suppressed because it is too large Load Diff
+12
View File
@@ -458,6 +458,15 @@ class SessionEntry:
was_auto_reset: bool = False
auto_reset_reason: Optional[str] = None # "idle" or "daily"
reset_had_activity: bool = False # whether the expired session had any messages
# Set by reset_session() when the user explicitly sends /new or /reset.
# Consumed once by _handle_message_with_agent to trigger topic/channel
# skill re-injection on the first message of the new session. We can't
# reuse was_auto_reset for this because that flag fires the "session
# expired due to inactivity" user-facing notice and a misleading
# context-note prepend — both wrong for an explicit manual reset.
# See issue #6508.
is_fresh_reset: bool = False
# Set by the background expiry watcher after it finalizes an expired
# session (invoking on_session_finalize hooks and evicting the cached
@@ -508,6 +517,7 @@ class SessionEntry:
if self.last_resume_marked_at
else None
),
"is_fresh_reset": self.is_fresh_reset,
}
if self.origin:
result["origin"] = self.origin.to_dict()
@@ -556,6 +566,7 @@ class SessionEntry:
resume_pending=data.get("resume_pending", False),
resume_reason=data.get("resume_reason"),
last_resume_marked_at=last_resume_marked_at,
is_fresh_reset=data.get("is_fresh_reset", False),
)
@@ -1132,6 +1143,7 @@ class SessionStore:
display_name=old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
is_fresh_reset=True,
)
self._entries[session_key] = new_entry
+8 -4
View File
@@ -21,6 +21,7 @@ from datetime import datetime, timezone
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Any, Optional
from utils import atomic_json_write
if sys.platform == "win32":
import msvcrt
@@ -34,6 +35,10 @@ _IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
_GATEWAY_LOCK_FILENAME = "gateway.lock"
_gateway_lock_handle = None
# Windows byte-range locks are mandatory for other readers. Lock a byte well
# past the JSON payload so runtime status / PID readers can still read the file
# while another process holds the mutual-exclusion lock.
_WINDOWS_LOCK_OFFSET = 1024 * 1024
def _get_pid_path() -> Path:
@@ -205,8 +210,7 @@ def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
def _write_json_file(path: Path, payload: dict[str, Any]) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(payload))
atomic_json_write(path, payload, indent=None, separators=(",", ":"))
def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
@@ -286,7 +290,7 @@ def _try_acquire_file_lock(handle) -> bool:
if handle.tell() == 0:
handle.write("\n")
handle.flush()
handle.seek(0)
handle.seek(_WINDOWS_LOCK_OFFSET)
msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
@@ -298,7 +302,7 @@ def _try_acquire_file_lock(handle) -> bool:
def _release_file_lock(handle) -> None:
try:
if _IS_WINDOWS:
handle.seek(0)
handle.seek(_WINDOWS_LOCK_OFFSET)
msvcrt.locking(handle.fileno(), msvcrt.LK_UNLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
+2 -2
View File
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.11.0"
__release_date__ = "2026.4.23"
__version__ = "0.12.0"
__release_date__ = "2026.4.30"
+5 -5
View File
@@ -43,7 +43,7 @@ import yaml
from hermes_cli.config import get_hermes_home, get_config_path, read_raw_config
from hermes_constants import OPENROUTER_BASE_URL
from utils import atomic_replace
from utils import atomic_replace, atomic_yaml_write, is_truthy_value
logger = logging.getLogger(__name__)
@@ -2480,8 +2480,8 @@ def _resolve_verify(
tls_state = tls_state if isinstance(tls_state, dict) else {}
effective_insecure = (
bool(insecure) if insecure is not None
else bool(tls_state.get("insecure", False))
is_truthy_value(insecure, default=False) if insecure is not None
else is_truthy_value(tls_state.get("insecure", False), default=False)
)
effective_ca = (
ca_bundle
@@ -3653,7 +3653,7 @@ def _update_config_for_provider(
config["model"] = model_cfg
config_path.write_text(yaml.safe_dump(config, sort_keys=False))
atomic_yaml_write(config_path, config, sort_keys=False)
return config_path
@@ -3712,7 +3712,7 @@ def _reset_config_provider() -> Path:
model["provider"] = "auto"
if "base_url" in model:
model["base_url"] = OPENROUTER_BASE_URL
config_path.write_text(yaml.safe_dump(config, sort_keys=False))
atomic_yaml_write(config_path, config, sort_keys=False)
return config_path
+136 -45
View File
@@ -10,6 +10,7 @@ To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.
from __future__ import annotations
import logging
import os
import re
import shutil
@@ -19,6 +20,10 @@ from collections.abc import Callable, Mapping
from dataclasses import dataclass
from typing import Any
from utils import is_truthy_value
logger = logging.getLogger(__name__)
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
@@ -93,6 +98,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
aliases=("q",), args_hint="<prompt>"),
CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
args_hint="<prompt>"),
CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
args_hint="[text | pause | resume | clear | status]"),
CommandDef("status", "Show session info", "Session"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
@@ -151,6 +158,11 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("curator", "Background skill maintenance (status, run, pin, archive)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("status", "run", "pause", "resume", "pin", "unpin", "restore")),
CommandDef("kanban", "Multi-profile collaboration board (tasks, links, comments)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("list", "ls", "show", "create", "assign", "link", "unlink",
"claim", "comment", "complete", "block", "unblock", "archive",
"tail", "dispatch", "context", "init", "gc")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
cli_only=True),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
@@ -366,7 +378,7 @@ def _resolve_config_gates() -> set[str]:
else:
val = None
break
if val:
if is_truthy_value(val, default=False):
result.add(cmd.name)
return result
@@ -602,13 +614,26 @@ def _collect_gateway_skill_entries(
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
from agent.skill_utils import get_external_skills_dirs
_skills_dir = str(SKILLS_DIR.resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve()).rstrip("/") + "/"
# Build set of allowed directory prefixes: local skills dir + any
# user-configured ``skills.external_dirs``. Ensure each prefix ends
# with ``/`` so ``/my-skills`` does not also match ``/my-skills-extra``.
# Without this widening, external skills are visible in
# ``hermes skills list`` and the agent's ``/skill-name`` dispatch but
# silently excluded from gateway slash menus (#8110).
_allowed_prefixes = [_skills_dir.rstrip("/") + "/"]
_allowed_prefixes.extend(
str(d).rstrip("/") + "/" for d in get_external_skills_dirs()
)
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path.startswith(_skills_dir):
if not skill_path:
continue
if not any(skill_path.startswith(prefix) for prefix in _allowed_prefixes):
continue
if skill_path.startswith(_hub_dir):
continue
@@ -712,24 +737,40 @@ def discord_skill_commands(
def discord_skill_commands_by_category(
reserved_names: set[str],
) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
"""Return skill entries organized by category for Discord ``/skill`` subcommand groups.
"""Return skill entries organized by category for Discord ``/skill`` autocomplete.
Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
Skills whose directory is nested at least 2 levels under a scan root
(e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
category. Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
*uncategorized* the caller should register them as direct subcommands
of the ``/skill`` group.
*uncategorized*.
The same filtering as :func:`discord_skill_commands` is applied: hub
skills excluded, per-platform disabled excluded, names clamped.
Scan roots include the local ``SKILLS_DIR`` **and** any configured
``skills.external_dirs`` matching the widened filter applied to the
flat ``discord_skill_commands()`` collector in #18741. Without this
parity, external-dir skills are visible via ``hermes skills list`` and
the agent's ``/skill-name`` dispatch but silently absent from Discord's
``/skill`` autocomplete.
Filtering mirrors :func:`discord_skill_commands`: hub skills excluded,
per-platform disabled excluded, names clamped to 32 chars, descriptions
clamped to 100 chars.
The legacy 25-group × 25-subcommand caps (from the old nested
``/skill <cat> <name>`` layout) are **not** applied the live caller
(``_register_skill_group`` in ``gateway/platforms/discord.py``, refactored
in PR #11580) flattens these results and feeds them into a single
autocomplete callback, which scales to thousands of entries without any
per-command payload concerns. ``hidden_count`` is retained in the return
tuple for backward compatibility and still reports skills dropped for
other reasons (32-char clamp collision vs a reserved name).
Returns:
``(categories, uncategorized, hidden_count)``
- *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
- *uncategorized*: ``[(name, description, cmd_key), ...]``
- *hidden_count*: skills dropped due to Discord group limits
(25 subcommand groups, 25 subcommands per group)
- *hidden_count*: skills dropped due to name clamp collisions
against already-registered command names.
"""
from pathlib import Path as _P
@@ -743,14 +784,33 @@ def discord_skill_commands_by_category(
# Collect raw skill data --------------------------------------------------
categories: dict[str, list[tuple[str, str, str]]] = {}
uncategorized: list[tuple[str, str, str]] = []
_names_used: set[str] = set(reserved_names)
# Map clamped-32-char-name → what it came from, so we can emit an
# actionable warning on collision. Reserved (gateway-builtin) command
# names are marked with a sentinel so the warning distinguishes
# "skill collided with a reserved command" from "two skills collided
# on the 32-char clamp" — the latter is the rename-worthy case.
_names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
hidden = 0
try:
from agent.skill_commands import get_skill_commands
from agent.skill_utils import get_external_skills_dirs
from tools.skills_tool import SKILLS_DIR
_skills_dir = SKILLS_DIR.resolve()
_hub_dir = (SKILLS_DIR / ".hub").resolve()
# Build list of (resolved_root, is_local) tuples. Each external dir
# becomes its own scan root for category derivation — a skill at
# ``<external>/mlops/foo/SKILL.md`` is still categorized as "mlops".
_scan_roots: list[_P] = [_skills_dir]
try:
for ext in get_external_skills_dirs():
try:
_scan_roots.append(_P(ext).resolve())
except Exception:
continue
except Exception:
pass
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
@@ -759,33 +819,72 @@ def discord_skill_commands_by_category(
if not skill_path:
continue
sp = _P(skill_path).resolve()
# Skip skills outside SKILLS_DIR or from the hub
if not str(sp).startswith(str(_skills_dir)):
continue
# Hub skills are loaded via the skill hub, not surfaced as
# slash commands.
if str(sp).startswith(str(_hub_dir)):
continue
# Accept skill if it lives under any scan root; record the
# matching root so we can derive the category correctly.
matched_root: _P | None = None
for root in _scan_roots:
try:
sp.relative_to(root)
except ValueError:
continue
matched_root = root
break
if matched_root is None:
continue
skill_name = info.get("name", "")
if skill_name in _platform_disabled:
continue
raw_name = cmd_key.lstrip("/")
# Clamp to 32 chars (Discord limit)
# Clamp to 32 chars (Discord per-command name limit)
discord_name = raw_name[:32]
if discord_name in _names_used:
# Two skills whose first 32 chars are identical. One wins
# (the first one seen, which is alphabetical because the
# caller iterates ``sorted(skill_cmds)``); the other is
# dropped from Discord's /skill autocomplete.
#
# Silently counting this as ``hidden`` (the old behavior)
# meant skill authors had no way to discover the drop —
# their skill just didn't appear in the picker. Emit a
# WARNING naming both sides so the author can rename the
# losing skill's frontmatter name to something with a
# distinct 32-char prefix.
prior = _names_used[discord_name]
if prior == "<reserved>":
logger.warning(
"Discord /skill: %r (from %r) collides on its 32-char "
"clamp with a reserved gateway command name %r — the "
"skill will not appear in the /skill autocomplete. "
"Rename the skill's frontmatter ``name:`` to differ "
"in its first 32 chars.",
discord_name, cmd_key, discord_name,
)
else:
logger.warning(
"Discord /skill: %r and %r both clamp to %r on "
"Discord's 32-char command-name limit — only %r "
"will appear in the /skill autocomplete. Rename "
"one skill's frontmatter ``name:`` to differ in "
"its first 32 chars.",
prior, cmd_key, discord_name, prior,
)
hidden += 1
continue
_names_used.add(discord_name)
_names_used[discord_name] = cmd_key
desc = info.get("description", "")
if len(desc) > 100:
desc = desc[:97] + "..."
# Determine category from the relative path within SKILLS_DIR.
# e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
try:
rel = sp.parent.relative_to(_skills_dir)
except ValueError:
continue
# Determine category from the relative path within the matched
# scan root. e.g. creative/ascii-art/SKILL.md → ("creative", ...)
rel = sp.parent.relative_to(matched_root)
parts = rel.parts
if len(parts) >= 2:
cat = parts[0]
@@ -795,28 +894,7 @@ def discord_skill_commands_by_category(
except Exception:
pass
# Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
_MAX_GROUPS = 25
_MAX_PER_GROUP = 25
trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
group_count = 0
for cat in sorted(categories):
if group_count >= _MAX_GROUPS:
hidden += len(categories[cat])
continue
entries = categories[cat][:_MAX_PER_GROUP]
hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
trimmed_categories[cat] = entries
group_count += 1
# Uncategorized skills also count against the 25 top-level limit
remaining_slots = _MAX_GROUPS - group_count
if len(uncategorized) > remaining_slots:
hidden += len(uncategorized) - remaining_slots
uncategorized = uncategorized[:remaining_slots]
return trimmed_categories, uncategorized, hidden
return categories, uncategorized, hidden
# ---------------------------------------------------------------------------
@@ -829,6 +907,13 @@ def discord_skill_commands_by_category(
_SLACK_MAX_SLASH_COMMANDS = 50
_SLACK_NAME_LIMIT = 32
_SLACK_INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
_SLACK_RESERVED_COMMANDS = frozenset({
# Built-in Slack slash commands that cannot be registered by apps.
# https://slack.com/help/articles/201259356-Use-built-in-slash-commands
"me", "status", "away", "dnd", "shrug", "remind", "msg", "feed",
"who", "collapse", "expand", "leave", "join", "open", "search",
"topic", "mute", "pro", "shortcuts",
})
def _sanitize_slack_name(raw: str) -> str:
@@ -855,6 +940,10 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
documented form (e.g. ``/background``, ``/bg``, and ``/btw`` all work).
Plugin-registered slash commands are included too.
Commands whose sanitized name collides with a Slack built-in
(e.g. ``/status``, ``/me``, ``/join``) are silently skipped. Users
can still reach them via ``/hermes <command>``.
Results are clamped to Slack's 50-command limit with duplicate-name
avoidance. ``/hermes`` is always reserved as the first entry so the
legacy ``/hermes <subcommand>`` form keeps working for anything that
@@ -872,6 +961,8 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
slack_name = _sanitize_slack_name(name)
if not slack_name or slack_name in seen:
return
if slack_name in _SLACK_RESERVED_COMMANDS:
return
if len(entries) >= _SLACK_MAX_SLASH_COMMANDS:
return
# Slack description cap is 2000 chars; keep it short.
+125 -4
View File
@@ -400,7 +400,12 @@ DEFAULT_CONFIG = {
# The gateway stops accepting new work, waits for running agents
# to finish, then interrupts any remaining runs after the timeout.
# 0 = no drain, interrupt immediately.
"restart_drain_timeout": 60,
#
# 180s is calibrated for realistic in-flight agent turns: a typical
# coding conversation mid-reasoning runs 60150s per call, so a 60s
# budget routinely interrupted legitimate work on /restart. Raise
# further in config.yaml if you run very-long-reasoning models.
"restart_drain_timeout": 180,
# Max app-level retry attempts for API errors (connection drops,
# provider timeouts, 5xx, etc.) before the agent surfaces the
# failure. The OpenAI SDK already does its own low-level retries
@@ -457,6 +462,7 @@ DEFAULT_CONFIG = {
# remains available as a tool regardless of this setting — the routing
# only controls how inbound user images are presented.
"image_input_mode": "auto",
"disabled_toolsets": [],
},
"terminal": {
@@ -606,6 +612,24 @@ DEFAULT_CONFIG = {
"max_line_length": 2000,
},
# Tool loop guardrails nudge models when they repeat failed or
# non-progressing tool calls. Soft warnings are always-on by default;
# hard stops are opt-in so interactive CLI/TUI sessions keep flowing.
"tool_loop_guardrails": {
"warnings_enabled": True,
"hard_stop_enabled": False,
"warn_after": {
"exact_failure": 2,
"same_tool_failure": 3,
"idempotent_no_progress": 2,
},
"hard_stop_after": {
"exact_failure": 5,
"same_tool_failure": 8,
"idempotent_no_progress": 5,
},
},
"compression": {
"enabled": True,
"threshold": 0.50, # compress when context usage exceeds this ratio
@@ -620,6 +644,18 @@ DEFAULT_CONFIG = {
"cache_ttl": "5m",
},
# OpenRouter-specific settings.
# response_cache: enable OpenRouter response caching (X-OpenRouter-Cache header).
# When enabled, identical requests return cached responses for free (zero billing).
# This is separate from Anthropic prompt caching and works alongside it.
# See: https://openrouter.ai/docs/guides/features/response-caching
# response_cache_ttl: how long cached responses remain valid, in seconds (1-86400).
# Default 300 (5 minutes). Only used when response_cache is enabled.
"openrouter": {
"response_cache": True,
"response_cache_ttl": 300,
},
# AWS Bedrock provider configuration.
# Only used when model.provider is "bedrock".
"bedrock": {
@@ -756,6 +792,14 @@ DEFAULT_CONFIG = {
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_progress_overrides": {}, # DEPRECATED — use display.platforms instead
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
# Auto-delete system-notice replies (e.g. "✨ New session started!",
# "♻ Restarting gateway…", "⚡ Stopped…") after N seconds on platforms
# that support message deletion (currently Telegram; other platforms
# ignore and leave the message in place). Only affects slash-command
# replies wrapped with gateway.platforms.base.EphemeralReply — agent
# responses and content messages are never touched. Default 0
# (disabled) preserves prior behavior.
"ephemeral_system_ttl": 0,
"platforms": {}, # Per-platform display overrides: {"telegram": {"tool_progress": "all"}, "slack": {"tool_progress": "off"}}
# Gateway runtime-metadata footer appended to the FINAL message of a turn
# (disabled by default to keep replies minimal). When enabled, renders
@@ -798,7 +842,7 @@ DEFAULT_CONFIG = {
# Voices: alloy, echo, fable, onyx, nova, shimmer
},
"xai": {
"voice_id": "eve",
"voice_id": "eve", # or custom voice ID — see https://docs.x.ai/developers/model-capabilities/audio/custom-voices
"language": "en",
"sample_rate": 24000,
"bit_rate": 128000,
@@ -925,7 +969,23 @@ DEFAULT_CONFIG = {
# injected at the start of every API call for few-shot priming.
# Never saved to sessions, logs, or trajectories.
"prefill_messages_file": "",
# Goals — persistent cross-turn goals (Ralph-style loop).
# After every turn, a lightweight judge call asks the auxiliary model
# whether the active /goal is satisfied by the assistant's last
# response. If not, Hermes feeds a continuation prompt back into the
# same session and keeps working until the goal is done, the turn
# budget is exhausted, or the user pauses/clears it. Judge failures
# fail OPEN (continue) so a flaky judge never wedges progress — the
# turn budget is the real backstop.
"goals": {
# Max continuation turns before Hermes auto-pauses the goal and
# asks the user to /goal resume. Protects against judge false
# negatives (goal actually done but judge says continue) and
# unbounded model spend on fuzzy / unachievable goals.
"max_turns": 20,
},
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
@@ -979,6 +1039,14 @@ DEFAULT_CONFIG = {
# Archive a skill (move to skills/.archive/) after this many days
# without use. Archived skills are recoverable — no auto-deletion.
"archive_after_days": 90,
# Pre-run backup: before every real curator pass (dry-run is
# skipped), snapshot ~/.hermes/skills/ into
# ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz so the
# user can roll back with `hermes curator rollback`.
"backup": {
"enabled": True,
"keep": 5, # retain last N regular snapshots
},
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -1104,6 +1172,24 @@ DEFAULT_CONFIG = {
"max_parallel_jobs": None,
},
# Kanban multi-agent coordination — controls the dispatcher loop that
# spawns workers for ready tasks. The dispatcher ticks every N seconds
# (default 60), reclaims stale claims, promotes dependency-satisfied
# todos to ready, and fires `hermes -p <assignee> chat -q ...` for
# each claimable ready task. One dispatcher per profile is sufficient;
# running more than one on the same kanban.db will race for claims.
"kanban": {
# Run the dispatcher inside the gateway process. On by default —
# the cost is ~300µs every `dispatch_interval_seconds` when idle,
# and gateway is the supervisor users already have. Set to false
# only if you run the dispatcher as a separate systemd unit or
# don't want the gateway to spawn workers.
"dispatch_in_gateway": True,
# Seconds between dispatcher ticks (idle or not). Lower = snappier
# pickup of newly-ready tasks; higher = less SQL pressure.
"dispatch_interval_seconds": 60,
},
# execute_code settings — controls the tool used for programmatic tool calls.
"code_execution": {
# Execution mode:
@@ -1732,6 +1818,29 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "tool",
},
"TINYFISH_API_KEY": {
"description": "TinyFish API key for cloud browser, search, fetch, and agent",
"prompt": "TinyFish API key",
"url": "https://agent.tinyfish.ai/api-keys",
"tools": ["browser_navigate", "browser_click"],
"password": True,
"category": "tool",
},
"TINYFISH_API_URL": {
"description": "TinyFish browser API URL override (optional, for staging/dev)",
"prompt": "TinyFish API URL (leave empty for default)",
"url": None,
"tools": ["browser_navigate", "browser_click"],
"password": False,
"category": "tool",
},
"TINYFISH_BROWSER_TIMEOUT": {
"description": "TinyFish browser session inactivity timeout in seconds (optional, default 300)",
"prompt": "Browser session timeout (seconds)",
"tools": ["browser_navigate", "browser_click"],
"password": False,
"category": "tool",
},
"CAMOFOX_URL": {
"description": "Camofox browser server URL for local anti-detection browsing (e.g. http://localhost:9377)",
"prompt": "Camofox server URL",
@@ -2400,7 +2509,17 @@ def get_missing_skill_config_vars() -> List[Dict[str, Any]]:
except Exception:
return []
all_vars = discover_all_skill_config_vars()
try:
all_vars = discover_all_skill_config_vars()
except Exception as e:
# A malformed SKILL.md, unreadable external skill dir, or similar
# should never break `hermes update`. Skill-config prompting is a
# post-migration nicety, not a blocker.
import logging
logging.getLogger(__name__).debug(
"discover_all_skill_config_vars failed: %s", e
)
return []
if not all_vars:
return []
@@ -4337,6 +4456,7 @@ def show_config():
("TAVILY_API_KEY", "Tavily"),
("BROWSERBASE_API_KEY", "Browserbase"),
("BROWSER_USE_API_KEY", "Browser Use"),
("TINYFISH_API_KEY", "TinyFish"),
("FAL_KEY", "FAL"),
]
@@ -4521,6 +4641,7 @@ def set_config_value(key: str, value: str):
'FIRECRAWL_GATEWAY_URL', 'TOOL_GATEWAY_DOMAIN', 'TOOL_GATEWAY_SCHEME',
'TOOL_GATEWAY_USER_TOKEN', 'TAVILY_API_KEY',
'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',
'TINYFISH_API_KEY', 'TINYFISH_API_URL', 'TINYFISH_BROWSER_TIMEOUT',
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
'SUDO_PASSWORD', 'SLACK_BOT_TOKEN', 'SLACK_APP_TOKEN',
+193 -7
View File
@@ -108,6 +108,49 @@ def _cmd_status(args) -> int:
f"last_activity={last}"
)
# Show top 5 most-active and least-active skills by activity_count
# (use + view + patch). This is a different signal from
# least-recently-active: activity_count reflects frequency,
# last_activity_at reflects recency. A skill touched 30 times a year
# ago is high-frequency but stale; a skill touched once yesterday is
# recent but low-frequency. Both can matter.
active_all = by_state.get("active", [])
if active_all:
most_active = sorted(
active_all,
key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
reverse=True,
)[:5]
if most_active and (most_active[0].get("activity_count") or 0) > 0:
print("\nmost active (top 5):")
for r in most_active:
last = _fmt_ts(r.get("last_activity_at"))
print(
f" {r['name']:40s} "
f"activity={r.get('activity_count', 0):3d} "
f"use={r.get('use_count', 0):3d} "
f"view={r.get('view_count', 0):3d} "
f"patches={r.get('patch_count', 0):3d} "
f"last_activity={last}"
)
least_active = sorted(
active_all,
key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
)[:5]
if least_active:
print("\nleast active (top 5):")
for r in least_active:
last = _fmt_ts(r.get("last_activity_at"))
print(
f" {r['name']:40s} "
f"activity={r.get('activity_count', 0):3d} "
f"use={r.get('use_count', 0):3d} "
f"view={r.get('view_count', 0):3d} "
f"patches={r.get('patch_count', 0):3d} "
f"last_activity={last}"
)
return 0
@@ -117,7 +160,11 @@ def _cmd_run(args) -> int:
print("curator: disabled via config; enable with `curator.enabled: true`")
return 1
print("curator: running review pass...")
dry = bool(getattr(args, "dry_run", False))
if dry:
print("curator: running DRY-RUN (report only, no mutations)...")
else:
print("curator: running review pass...")
def _on_summary(msg: str) -> None:
print(msg)
@@ -125,17 +172,29 @@ def _cmd_run(args) -> int:
result = curator.run_curator_review(
on_summary=_on_summary,
synchronous=bool(args.synchronous),
dry_run=dry,
)
auto = result.get("auto_transitions", {})
if auto:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if dry:
print(
f"auto (preview): {auto.get('checked', 0)} candidate skill(s) "
"— no transitions applied in dry-run"
)
else:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if not args.synchronous:
print("llm pass running in background — check `hermes curator status` later")
if dry:
print(
"dry-run: no changes applied. When the report lands, read it with "
"`hermes curator status` and run `hermes curator run` (no flag) to apply."
)
return 0
@@ -186,6 +245,98 @@ def _cmd_restore(args) -> int:
return 0 if ok else 1
def _cmd_backup(args) -> int:
"""Take a manual snapshot of the skills tree. Same mechanism as the
automatic pre-run snapshot, just user-initiated."""
from agent import curator_backup
if not curator_backup.is_enabled():
print(
"curator: backups are disabled via config "
"(`curator.backup.enabled: false`); re-enable to snapshot"
)
return 1
reason = getattr(args, "reason", None) or "manual"
snap = curator_backup.snapshot_skills(reason=reason)
if snap is None:
print("curator: snapshot failed — check logs (backup disabled or IO error)")
return 1
print(f"curator: snapshot created at ~/.hermes/skills/.curator_backups/{snap.name}")
return 0
def _cmd_rollback(args) -> int:
"""Restore the skills tree from a snapshot. Defaults to newest.
``--list`` prints available snapshots and exits. ``--id <stamp>`` picks
a specific one. Without ``-y``, prompts for confirmation. A safety
snapshot of the current tree is always taken first, so rollbacks are
themselves undoable.
"""
from agent import curator_backup
if getattr(args, "list", False):
print(curator_backup.summarize_backups())
return 0
backup_id = getattr(args, "backup_id", None)
target_path = curator_backup._resolve_backup(backup_id)
if target_path is None:
rows = curator_backup.list_backups()
if not rows:
print(
"curator: no snapshots exist yet. Take one with "
"`hermes curator backup` or wait for the next curator run."
)
else:
print(
f"curator: no snapshot matching "
f"{'id ' + repr(backup_id) if backup_id else 'your query'}."
)
print("Available:")
print(curator_backup.summarize_backups())
return 1
manifest = curator_backup._read_manifest(target_path)
print(f"Rollback target: {target_path.name}")
if manifest:
print(f" reason: {manifest.get('reason', '?')}")
print(f" created_at: {manifest.get('created_at', '?')}")
print(f" skill files: {manifest.get('skill_files', '?')}")
cron = manifest.get("cron_jobs") or {}
if isinstance(cron, dict):
if cron.get("backed_up"):
print(
f" cron jobs: {cron.get('jobs_count', 0)} "
f"(will be restored for skill-link fields only)"
)
else:
reason = cron.get("reason", "not captured")
print(f" cron jobs: not in snapshot ({reason})")
print(
"\nThis will replace the current ~/.hermes/skills/ tree (a safety "
"snapshot of the current state is taken first so this is undoable). "
"Cron jobs that still exist will have their skills/skill fields "
"restored from the snapshot; all other cron fields are left alone."
)
if not getattr(args, "yes", False):
try:
ans = input("Proceed? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\ncancelled")
return 1
if ans not in ("y", "yes"):
print("cancelled")
return 1
ok, msg, _ = curator_backup.rollback(backup_id=target_path.name)
if ok:
print(f"curator: {msg}")
return 0
print(f"curator: rollback failed — {msg}")
return 1
# ---------------------------------------------------------------------------
# argparse wiring (called from hermes_cli.main)
# ---------------------------------------------------------------------------
@@ -207,6 +358,11 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
"--sync", "--synchronous", dest="synchronous", action="store_true",
help="Wait for the LLM review pass to finish (default: background thread)",
)
p_run.add_argument(
"--dry-run", dest="dry_run", action="store_true",
help="Report only — no state changes, no archives, no consolidation "
"(use this to preview what curator would do)",
)
p_run.set_defaults(func=_cmd_run)
p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
@@ -227,6 +383,36 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
p_restore.add_argument("skill", help="Skill name")
p_restore.set_defaults(func=_cmd_restore)
p_backup = subs.add_parser(
"backup",
help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
"(curator also does this automatically before every real run)",
)
p_backup.add_argument(
"--reason", default=None,
help="Free-text label stored in manifest.json (default: 'manual')",
)
p_backup.set_defaults(func=_cmd_backup)
p_rollback = subs.add_parser(
"rollback",
help="Restore ~/.hermes/skills/ from a curator snapshot "
"(defaults to the newest)",
)
p_rollback.add_argument(
"--list", action="store_true",
help="List available snapshots and exit without restoring",
)
p_rollback.add_argument(
"--id", dest="backup_id", default=None,
help="Snapshot id to restore (see `--list`); default: newest",
)
p_rollback.add_argument(
"-y", "--yes", action="store_true",
help="Skip confirmation prompt",
)
p_rollback.set_defaults(func=_cmd_rollback)
def cli_main(argv=None) -> int:
"""Standalone entry (also usable by hermes_cli.main fallthrough)."""
+5 -2
View File
@@ -263,8 +263,11 @@ def run_doctor(args):
if env_path.exists():
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
# Check for common issues. Pin encoding to UTF-8 because .env files are
# written as UTF-8 everywhere in the codebase, while Path.read_text()
# defaults to the system locale — which crashes on non-UTF-8 Windows
# locales (e.g. GBK) as soon as the file contains any non-ASCII byte.
content = env_path.read_text(encoding="utf-8")
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
+100 -12
View File
@@ -10,6 +10,7 @@ import shutil
import signal
import subprocess
import sys
import textwrap
from dataclasses import dataclass
from pathlib import Path
@@ -59,6 +60,13 @@ class GatewayRuntimeSnapshot:
def has_process_service_mismatch(self) -> bool:
return self.service_installed and self.running and not self.service_running
@dataclass(frozen=True)
class ProfileGatewayProcess:
profile: str
path: Path
pid: int
def _get_service_pids() -> set:
"""Return PIDs currently managed by systemd or launchd gateway services.
@@ -180,7 +188,7 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
seconds), then exits with code 75. Both systemd (``Restart=on-failure``
seconds), then exits with code 75. Both systemd (``Restart=always``
+ ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
= false``) relaunch the process after the graceful exit.
@@ -371,6 +379,83 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
return pids
def find_profile_gateway_processes(
exclude_pids: set | None = None,
) -> list[ProfileGatewayProcess]:
"""Return running gateway PIDs mapped to Hermes profiles via PID files."""
_exclude = set(exclude_pids or set())
processes: list[ProfileGatewayProcess] = []
try:
from gateway.status import get_running_pid
from hermes_cli.profiles import list_profiles
except Exception:
return processes
seen: set[int] = set()
for profile in list_profiles():
try:
pid = get_running_pid(profile.path / "gateway.pid", cleanup_stale=False)
except Exception:
continue
if pid is None or pid <= 0 or pid in _exclude or pid in seen:
continue
seen.add(pid)
processes.append(ProfileGatewayProcess(profile=profile.name, path=profile.path, pid=pid))
return processes
def _gateway_run_args_for_profile(profile: str) -> list[str]:
args = [get_python_path(), "-m", "hermes_cli.main"]
if profile != "default":
args.extend(["--profile", profile])
args.extend(["gateway", "run", "--replace"])
return args
def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
"""Relaunch a manually-run profile gateway after its current PID exits."""
if old_pid <= 0:
return False
watcher = textwrap.dedent(
"""
import os
import subprocess
import sys
import time
pid = int(sys.argv[1])
cmd = sys.argv[2:]
deadline = time.monotonic() + 120
while time.monotonic() < deadline:
try:
os.kill(pid, 0)
except ProcessLookupError:
break
except PermissionError:
pass
time.sleep(0.2)
subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
"""
).strip()
try:
subprocess.Popen(
[sys.executable, "-c", watcher, str(old_pid), *_gateway_run_args_for_profile(profile)],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
except OSError:
return False
return True
def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
selected_system = _select_systemd_scope(system)
unit_exists = get_systemd_unit_path(system=selected_system).exists()
@@ -1570,8 +1655,7 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
Description={SERVICE_DESCRIPTION}
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=600
StartLimitBurst=5
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1585,8 +1669,10 @@ Environment="LOGNAME={username}"
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -1606,9 +1692,9 @@ WantedBy=multi-user.target
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
StartLimitIntervalSec=600
StartLimitBurst=5
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1617,8 +1703,10 @@ WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -2366,7 +2454,7 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
print()
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
# so systemd Restart=always will retry on transient errors
verbosity = None if quiet else verbose
try:
success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
@@ -4377,4 +4465,4 @@ def _gateway_command_inner(args):
if not supports_systemd_services() and not is_macos():
print("Legacy unit migration only applies to systemd-based Linux hosts.")
return
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
+535
View File
@@ -0,0 +1,535 @@
"""Persistent session goals — the Ralph loop for Hermes.
A goal is a free-form user objective that stays active across turns. After
each turn completes, a small judge call asks an auxiliary model "is this
goal satisfied by the assistant's last response?". If not, Hermes feeds a
continuation prompt back into the same session and keeps working until the
goal is done, turn budget is exhausted, the user pauses/clears it, or the
user sends a new message (which takes priority and pauses the goal loop).
State is persisted in SessionDB's ``state_meta`` table keyed by
``goal:<session_id>`` so ``/resume`` picks it up.
Design notes / invariants:
- The continuation prompt is just a normal user message appended to the
session via ``run_conversation``. No system-prompt mutation, no toolset
swap prompt caching stays intact.
- Judge failures are fail-OPEN: ``continue``. A broken judge must not wedge
progress; the turn budget is the backstop.
- When a real user message arrives mid-loop it preempts the continuation
prompt and also pauses the goal loop for that turn (we still re-judge
after, so if the user's message happens to complete the goal the judge
will say ``done``).
- This module has zero hard dependency on ``cli.HermesCLI`` or the gateway
runner both wire the same ``GoalManager`` in.
Nothing in this module touches the agent's system prompt or toolset.
"""
from __future__ import annotations
import json
import logging
import re
import time
from dataclasses import dataclass, asdict
from typing import Any, Dict, Optional, Tuple
logger = logging.getLogger(__name__)
# ──────────────────────────────────────────────────────────────────────
# Constants & defaults
# ──────────────────────────────────────────────────────────────────────
DEFAULT_MAX_TURNS = 20
DEFAULT_JUDGE_TIMEOUT = 30.0
# Cap how much of the last response + recent messages we send to the judge.
_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
CONTINUATION_PROMPT_TEMPLATE = (
"[Continuing toward your standing goal]\n"
"Goal: {goal}\n\n"
"Continue working toward this goal. Take the next concrete step. "
"If you believe the goal is complete, state so explicitly and stop. "
"If you are blocked and need input from the user, say so clearly and stop."
)
JUDGE_SYSTEM_PROMPT = (
"You are a strict judge evaluating whether an autonomous agent has "
"achieved a user's stated goal. You receive the goal text and the "
"agent's most recent response. Your only job is to decide whether "
"the goal is fully satisfied based on that response.\n\n"
"A goal is DONE only when:\n"
"- The response explicitly confirms the goal was completed, OR\n"
"- The response clearly shows the final deliverable was produced, OR\n"
"- The response explains the goal is unachievable / blocked / needs "
"user input (treat this as DONE with reason describing the block).\n\n"
"Otherwise the goal is NOT done — CONTINUE.\n\n"
"Reply ONLY with a single JSON object on one line:\n"
'{\"done\": <true|false>, \"reason\": \"<one-sentence rationale>\"}'
)
JUDGE_USER_PROMPT_TEMPLATE = (
"Goal:\n{goal}\n\n"
"Agent's most recent response:\n{response}\n\n"
"Is the goal satisfied?"
)
# ──────────────────────────────────────────────────────────────────────
# Dataclass
# ──────────────────────────────────────────────────────────────────────
@dataclass
class GoalState:
"""Serializable goal state stored per session."""
goal: str
status: str = "active" # active | paused | done | cleared
turns_used: int = 0
max_turns: int = DEFAULT_MAX_TURNS
created_at: float = 0.0
last_turn_at: float = 0.0
last_verdict: Optional[str] = None # "done" | "continue" | "skipped"
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@classmethod
def from_json(cls, raw: str) -> "GoalState":
data = json.loads(raw)
return cls(
goal=data.get("goal", ""),
status=data.get("status", "active"),
turns_used=int(data.get("turns_used", 0) or 0),
max_turns=int(data.get("max_turns", DEFAULT_MAX_TURNS) or DEFAULT_MAX_TURNS),
created_at=float(data.get("created_at", 0.0) or 0.0),
last_turn_at=float(data.get("last_turn_at", 0.0) or 0.0),
last_verdict=data.get("last_verdict"),
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
)
# ──────────────────────────────────────────────────────────────────────
# Persistence (SessionDB state_meta)
# ──────────────────────────────────────────────────────────────────────
def _meta_key(session_id: str) -> str:
return f"goal:{session_id}"
_DB_CACHE: Dict[str, Any] = {}
def _get_session_db() -> Optional[Any]:
"""Return a SessionDB instance for the current HERMES_HOME.
SessionDB has no built-in singleton, but opening a new connection per
/goal call would thrash the file. We cache one instance per
``hermes_home`` path so profile switches still pick up the right DB.
Defensive against import/instantiation failures so tests and
non-standard launchers can still use the GoalManager.
"""
try:
from hermes_constants import get_hermes_home
from hermes_state import SessionDB
home = str(get_hermes_home())
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB bootstrap failed (%s)", exc)
return None
cached = _DB_CACHE.get(home)
if cached is not None:
return cached
try:
db = SessionDB()
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB() raised (%s)", exc)
return None
_DB_CACHE[home] = db
return db
def load_goal(session_id: str) -> Optional[GoalState]:
"""Load the goal for a session, or None if none exists."""
if not session_id:
return None
db = _get_session_db()
if db is None:
return None
try:
raw = db.get_meta(_meta_key(session_id))
except Exception as exc:
logger.debug("GoalManager: get_meta failed: %s", exc)
return None
if not raw:
return None
try:
return GoalState.from_json(raw)
except Exception as exc:
logger.warning("GoalManager: could not parse stored goal for %s: %s", session_id, exc)
return None
def save_goal(session_id: str, state: GoalState) -> None:
"""Persist a goal to SessionDB. No-op if DB unavailable."""
if not session_id:
return
db = _get_session_db()
if db is None:
return
try:
db.set_meta(_meta_key(session_id), state.to_json())
except Exception as exc:
logger.debug("GoalManager: set_meta failed: %s", exc)
def clear_goal(session_id: str) -> None:
"""Mark a goal cleared in the DB (preserved for audit, status=cleared)."""
state = load_goal(session_id)
if state is None:
return
state.status = "cleared"
save_goal(session_id, state)
# ──────────────────────────────────────────────────────────────────────
# Judge
# ──────────────────────────────────────────────────────────────────────
def _truncate(text: str, limit: int) -> str:
if not text:
return ""
if len(text) <= limit:
return text
return text[:limit] + "… [truncated]"
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
def _parse_judge_response(raw: str) -> Tuple[bool, str]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
Returns ``(done, reason)``.
"""
if not raw:
return False, "judge returned empty response"
text = raw.strip()
# Strip markdown code fences the model may wrap JSON in.
if text.startswith("```"):
text = text.strip("`")
# Peel off leading json/JSON/etc tag
nl = text.find("\n")
if nl != -1:
text = text[nl + 1:]
# First try: parse the whole blob.
data: Optional[Dict[str, Any]] = None
try:
data = json.loads(text)
except Exception:
# Second try: pull the first JSON object out.
match = _JSON_OBJECT_RE.search(text)
if match:
try:
data = json.loads(match.group(0))
except Exception:
data = None
if not isinstance(data, dict):
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
done_val = data.get("done")
if isinstance(done_val, str):
done = done_val.strip().lower() in ("true", "yes", "1", "done")
else:
done = bool(done_val)
reason = str(data.get("reason") or "").strip()
if not reason:
reason = "no reason provided"
return done, reason
def judge_goal(
goal: str,
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
) -> Tuple[str, str]:
"""Ask the auxiliary model whether the goal is satisfied.
Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
or ``"skipped"`` (when the judge couldn't be reached).
This is deliberately fail-open: any error returns ``("continue", "...")``
so a broken judge doesn't wedge progress — the turn budget is the
backstop.
"""
if not goal.strip():
return "skipped", "empty goal"
if not last_response.strip():
# No substantive reply this turn — almost certainly not done yet.
return "continue", "empty response (nothing to evaluate)"
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable"
try:
client, model = get_text_auxiliary_client("goal_judge")
except Exception as exc:
logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
return "continue", "auxiliary client unavailable"
if client is None or not model:
return "continue", "no auxiliary client configured"
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
)
try:
resp = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": JUDGE_SYSTEM_PROMPT},
{"role": "user", "content": prompt},
],
temperature=0,
max_tokens=200,
timeout=timeout,
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
return "continue", f"judge error: {type(exc).__name__}"
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
done, reason = _parse_judge_response(raw)
verdict = "done" if done else "continue"
logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
return verdict, reason
# ──────────────────────────────────────────────────────────────────────
# GoalManager — the orchestration surface CLI + gateway talk to
# ──────────────────────────────────────────────────────────────────────
class GoalManager:
"""Per-session goal state + continuation decisions.
The CLI and gateway each hold one ``GoalManager`` per live session.
Methods:
- ``set(goal)`` start a new standing goal.
- ``clear()`` remove the active goal.
- ``pause()`` / ``resume()`` explicit user controls.
- ``status()`` printable one-liner.
- ``evaluate_after_turn(last_response)`` call the judge, update state,
and return a decision dict the caller uses to drive the next turn.
- ``next_continuation_prompt()`` the canonical user-role message to
feed back into ``run_conversation``.
"""
def __init__(self, session_id: str, *, default_max_turns: int = DEFAULT_MAX_TURNS):
self.session_id = session_id
self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS)
self._state: Optional[GoalState] = load_goal(session_id)
# --- introspection ------------------------------------------------
@property
def state(self) -> Optional[GoalState]:
return self._state
def is_active(self) -> bool:
return self._state is not None and self._state.status == "active"
def has_goal(self) -> bool:
return self._state is not None and self._state.status in ("active", "paused")
def status_line(self) -> str:
s = self._state
if s is None or s.status in ("cleared",):
return "No active goal. Set one with /goal <text>."
turns = f"{s.turns_used}/{s.max_turns} turns"
if s.status == "active":
return f"⊙ Goal (active, {turns}): {s.goal}"
if s.status == "paused":
extra = f"{s.paused_reason}" if s.paused_reason else ""
return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
if s.status == "done":
return f"✓ Goal done ({turns}): {s.goal}"
return f"Goal ({s.status}, {turns}): {s.goal}"
# --- mutation -----------------------------------------------------
def set(self, goal: str, *, max_turns: Optional[int] = None) -> GoalState:
goal = (goal or "").strip()
if not goal:
raise ValueError("goal text is empty")
state = GoalState(
goal=goal,
status="active",
turns_used=0,
max_turns=int(max_turns) if max_turns else self.default_max_turns,
created_at=time.time(),
last_turn_at=0.0,
)
self._state = state
save_goal(self.session_id, state)
return state
def pause(self, reason: str = "user-paused") -> Optional[GoalState]:
if not self._state:
return None
self._state.status = "paused"
self._state.paused_reason = reason
save_goal(self.session_id, self._state)
return self._state
def resume(self, *, reset_budget: bool = True) -> Optional[GoalState]:
if not self._state:
return None
self._state.status = "active"
self._state.paused_reason = None
if reset_budget:
self._state.turns_used = 0
save_goal(self.session_id, self._state)
return self._state
def clear(self) -> None:
if self._state is None:
return
self._state.status = "cleared"
save_goal(self.session_id, self._state)
self._state = None
def mark_done(self, reason: str) -> None:
if not self._state:
return
self._state.status = "done"
self._state.last_verdict = "done"
self._state.last_reason = reason
save_goal(self.session_id, self._state)
# --- the main entry point called after every turn -----------------
def evaluate_after_turn(
self,
last_response: str,
*,
user_initiated: bool = True,
) -> Dict[str, Any]:
"""Run the judge and update state. Return a decision dict.
``user_initiated`` distinguishes a real user prompt (True) from a
continuation prompt we fed ourselves (False). Both increment
``turns_used`` because both consume model budget.
Decision keys:
- ``status``: current goal status after update
- ``should_continue``: bool caller should fire another turn
- ``continuation_prompt``: str or None
- ``verdict``: "done" | "continue" | "skipped" | "inactive"
- ``reason``: str
- ``message``: user-visible one-liner to print/send
"""
state = self._state
if state is None or state.status != "active":
return {
"status": state.status if state else None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
# Count the turn that just finished.
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
if verdict == "done":
state.status = "done"
save_goal(self.session_id, state)
return {
"status": "done",
"should_continue": False,
"continuation_prompt": None,
"verdict": "done",
"reason": reason,
"message": f"✓ Goal achieved: {reason}",
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
save_goal(self.session_id, state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
"Use /goal resume to keep going, or /goal clear to stop."
),
}
save_goal(self.session_id, state)
return {
"status": "active",
"should_continue": True,
"continuation_prompt": self.next_continuation_prompt(),
"verdict": "continue",
"reason": reason,
"message": (
f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}"
),
}
def next_continuation_prompt(self) -> Optional[str]:
if not self._state or self._state.status != "active":
return None
return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
__all__ = [
"GoalState",
"GoalManager",
"CONTINUATION_PROMPT_TEMPLATE",
"DEFAULT_MAX_TURNS",
"load_goal",
"save_goal",
"clear_goal",
"judge_goal",
]
+1393
View File
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
+165 -13
View File
@@ -289,7 +289,7 @@ def _has_any_provider_configured() -> bool:
env_file = get_env_path()
if env_file.exists():
try:
for line in env_file.read_text().splitlines():
for line in env_file.read_text(encoding="utf-8").splitlines():
line = line.strip()
if line.startswith("#") or "=" not in line:
continue
@@ -800,6 +800,8 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
title = db.get_session_title(target)
message_count = int(session.get("message_count") or 0)
if message_count == 0:
return # No real conversation — don't show resume info
input_tokens = int(session.get("input_tokens") or 0)
output_tokens = int(session.get("output_tokens") or 0)
cache_read_tokens = int(session.get("cache_read_tokens") or 0)
@@ -5041,6 +5043,13 @@ def cmd_slack(args):
return 1
def cmd_kanban(args):
"""Multi-profile collaboration board."""
from hermes_cli.kanban import kanban_command
return kanban_command(args)
def cmd_hooks(args):
"""Shell-hook inspection and management."""
from hermes_cli.hooks import hooks_command
@@ -5424,6 +5433,45 @@ def _find_stale_dashboard_pids() -> list[int]:
return dashboard_pids
def _print_curator_first_run_notice() -> None:
"""Print a short heads-up about the skill curator after `hermes update`.
Only fires when the curator is enabled AND has no recorded run yet, which
is exactly the window where the gateway ticker used to fire Curator
against a fresh skill library immediately after an update. We defer the
first real pass by one ``interval_hours``; this notice tells the user how
to preview or disable before then. Silent on steady state.
"""
try:
from agent import curator
except Exception:
return
try:
if not curator.is_enabled():
return
state = curator.load_state()
except Exception:
return
if state.get("last_run_at"):
# Curator has run before (real or already seeded) — no notice needed.
return
try:
hours = curator.get_interval_hours()
except Exception:
hours = 24 * 7
days = max(1, hours // 24)
print()
print(" Skill curator")
print(
f" Background skill maintenance is enabled. First pass is deferred "
f"~{days}d after installation; only agent-created skills are in "
f"scope and nothing is ever auto-deleted (archive is recoverable)."
)
print(" Preview now: hermes curator run --dry-run")
print(" Pause it: hermes curator pause")
print(" Docs: https://hermes-agent.nousresearch.com/docs/user-guide/features/curator")
def _kill_stale_dashboard_processes(
reason: str = "the running backend no longer matches the updated frontend",
) -> None:
@@ -5661,6 +5709,10 @@ def _update_via_zip(args):
print()
print("✓ Update complete!")
try:
_print_curator_first_run_notice()
except Exception as e:
logger.debug("Curator first-run notice failed: %s", e)
_kill_stale_dashboard_processes()
@@ -6666,6 +6718,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
if gateway_mode
else None
)
assume_yes = bool(getattr(args, "yes", False))
print("⚕ Updating Hermes Agent...")
print()
@@ -6785,8 +6838,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
else:
auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
prompt_for_restore = auto_stash_ref is not None and (
gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty())
prompt_for_restore = (
auto_stash_ref is not None
and not assume_yes
and (gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty()))
)
# Check if there are updates
@@ -7047,7 +7102,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
print(f" {len(missing_config)} new config option(s) available")
print()
if gateway_mode:
if assume_yes:
print(" --yes: auto-applying config migration (skipping API-key prompts).")
response = "y"
elif gateway_mode:
response = (
_gateway_prompt(
"Would you like to configure new options now? [Y/n]", "n"
@@ -7073,14 +7131,17 @@ def _cmd_update_impl(args, gateway_mode: bool):
if response in ("", "y", "yes"):
print()
# In gateway mode, run auto-migrations only (no input() prompts
# for API keys which would hang the detached process).
results = migrate_config(interactive=not gateway_mode, quiet=False)
# In gateway mode OR under --yes, run auto-migrations only (no
# input() prompts for API keys which would hang the detached
# process / defeat the point of --yes).
results = migrate_config(
interactive=not (gateway_mode or assume_yes), quiet=False
)
if results["env_added"] or results["config_added"]:
print()
print("✓ Configuration updated!")
if gateway_mode and missing_env:
if (gateway_mode or assume_yes) and missing_env:
print(" API keys require manual entry: hermes config migrate")
else:
print()
@@ -7091,6 +7152,15 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
print("✓ Update complete!")
# Curator first-run heads-up. Only prints when curator is enabled AND
# has never run — i.e. the window where the ticker would otherwise
# have fired against a fresh skill library. Kept silent on steady
# state so we don't nag.
try:
_print_curator_first_run_notice()
except Exception as e:
logger.debug("Curator first-run notice failed: %s", e)
# Repair RHEL-family root installs where /usr/local/bin isn't on PATH
# for non-login interactive shells. No-op on every other platform.
try:
@@ -7130,6 +7200,8 @@ def _cmd_update_impl(args, gateway_mode: bool):
supports_systemd_services,
_ensure_user_systemd_env,
find_gateway_pids,
find_profile_gateway_processes,
launch_detached_profile_gateway_restart,
_get_service_pids,
_graceful_restart_via_sigusr1,
)
@@ -7233,6 +7305,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
restarted_services = []
killed_pids = set()
relaunched_profiles = []
# --- Systemd services (Linux) ---
# Discover all hermes-gateway* units (default + profiles)
@@ -7422,7 +7495,33 @@ def _cmd_update_impl(args, gateway_mode: bool):
manual_pids = find_gateway_pids(
exclude_pids=service_pids, all_profiles=True
)
profile_processes = {
proc.pid: proc
for proc in find_profile_gateway_processes(exclude_pids=service_pids)
if proc.pid in manual_pids
}
for pid, proc in profile_processes.items():
if not launch_detached_profile_gateway_restart(proc.profile, pid):
continue
# Prefer a graceful SIGUSR1 drain so in-flight agent runs
# finish before the watcher respawns the gateway. If the
# gateway doesn't support SIGUSR1 or doesn't exit within
# the drain budget, fall back to SIGTERM — the watcher
# still sees the exit and relaunches either way.
drained = _graceful_restart_via_sigusr1(
pid, drain_timeout=_drain_budget,
)
if not drained:
try:
os.kill(pid, _signal.SIGTERM)
except (ProcessLookupError, PermissionError):
pass
killed_pids.add(pid)
relaunched_profiles.append(proc.profile)
for pid in manual_pids:
if pid in profile_processes:
continue
try:
os.kill(pid, _signal.SIGTERM)
killed_pids.add(pid)
@@ -7433,11 +7532,14 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
for svc in restarted_services:
print(f" ✓ Restarted {svc}")
if killed_pids:
print(f" → Stopped {len(killed_pids)} manual gateway process(es)")
if relaunched_profiles:
names = ", ".join(relaunched_profiles)
print(f" ✓ Restarting manual gateway profile(s): {names}")
unmapped_count = len(killed_pids) - len(relaunched_profiles)
if unmapped_count:
print(f" → Stopped {unmapped_count} manual gateway process(es)")
print(" Restart manually: hermes gateway run")
# Also restart for each profile if needed
if len(killed_pids) > 1:
if unmapped_count > 1:
print(
" (or: hermes -p <profile> gateway run for each profile)"
)
@@ -7446,6 +7548,42 @@ def _cmd_update_impl(args, gateway_mode: bool):
# No gateways were running — nothing to do
pass
# --- Post-restart survivor sweep -----------------------------
# Issue #17648: some gateways ignore SIGTERM (stuck drain,
# blocked I/O, PID dead but zombie). The detached profile
# watchers wait 120s for the old PID to exit — if it never
# does, no respawn happens and the user keeps hitting
# ImportError against a stale sys.modules. Give the
# graceful paths a brief window to complete, then SIGKILL
# any remaining pre-update PIDs so the watcher / service
# manager can relaunch with fresh code.
try:
_time.sleep(3.0)
_service_pids_after = _get_service_pids()
_surviving = find_gateway_pids(
exclude_pids=_service_pids_after, all_profiles=True,
)
# Scope to PIDs we already tried to kill during this
# update (killed_pids). Anything new is a gateway that
# started AFTER our restart attempt — respecting user
# intent, we don't kill those.
_stuck = [pid for pid in _surviving if pid in killed_pids]
if _stuck:
print()
print(
f"{len(_stuck)} gateway process(es) ignored SIGTERM — force-killing"
)
for pid in _stuck:
try:
os.kill(pid, _signal.SIGKILL)
except (ProcessLookupError, PermissionError):
pass
# Give the OS a beat to reap the processes so the
# watchers see them exit and respawn.
_time.sleep(1.5)
except Exception as _sweep_exc:
logger.debug("Post-restart survivor sweep failed: %s", _sweep_exc)
except Exception as e:
logger.debug("Gateway restart during update failed: %s", e)
@@ -7682,7 +7820,7 @@ def cmd_profile(args):
if clone_all:
print(f"Full copy from {source_label}.")
else:
print(f"Cloned config, .env, SOUL.md from {source_label}.")
print(f"Cloned config, .env, SOUL.md, and skills from {source_label}.")
# Auto-clone Honcho config for the new profile (only with --clone/--clone-all)
if clone or clone_all:
@@ -8640,6 +8778,13 @@ def main():
webhook_parser.set_defaults(func=cmd_webhook)
# =========================================================================
# kanban command — multi-profile collaboration board
# =========================================================================
from hermes_cli.kanban import build_parser as _build_kanban_parser
kanban_parser = _build_kanban_parser(subparsers)
kanban_parser.set_defaults(func=cmd_kanban)
# =========================================================================
# hooks command — shell-hook inspection and management
# =========================================================================
@@ -9847,6 +9992,13 @@ Examples:
default=False,
help="Force a pre-update backup for this run (off by default; overrides updates.pre_update_backup)",
)
update_parser.add_argument(
"--yes",
"-y",
action="store_true",
default=False,
help="Assume yes for interactive prompts (config migration, stash restore). API-key entry is skipped; run 'hermes config migrate' separately for those.",
)
update_parser.set_defaults(func=cmd_update)
# =========================================================================
+1 -1
View File
@@ -361,7 +361,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
existing_lines = []
if env_path.exists():
existing_lines = env_path.read_text().splitlines()
existing_lines = env_path.read_text(encoding="utf-8").splitlines()
updated_keys = set()
new_lines = []
+61 -16
View File
@@ -891,14 +891,19 @@ def switch_model(
if not validation.get("accepted"):
override = False
if user_providers:
for up in user_providers:
if isinstance(up, dict) and up.get("provider") == target_provider:
cfg_models = up.get("models", [])
if new_model in cfg_models or any(
m.get("name") == new_model for m in cfg_models if isinstance(m, dict)
):
# user_providers is a dict: {provider_slug: config_dict}
for slug, cfg in user_providers.items():
if slug == target_provider:
cfg_models = cfg.get("models", {})
# Direct membership works for dict (keys) and list (strings)
if new_model in cfg_models:
override = True
break
# Also accept if models is a list of dicts with 'name' field
if isinstance(cfg_models, list):
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
@@ -1052,6 +1057,45 @@ def list_authenticated_providers(
if normed:
_builtin_endpoints.add(normed)
def _has_fast_aws_sdk_signal() -> bool:
"""Return True when explicit AWS auth config is present.
This intentionally avoids botocore's full credential chain. Provider
picker/model-switch discovery can run for non-Bedrock providers, and
botocore may otherwise probe EC2 IMDS (169.254.169.254) on local
machines before returning no credentials.
"""
if os.environ.get("AWS_BEARER_TOKEN_BEDROCK", "").strip():
return True
if (
os.environ.get("AWS_ACCESS_KEY_ID", "").strip()
and os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()
):
return True
return any(
os.environ.get(name, "").strip()
for name in (
"AWS_PROFILE",
"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
"AWS_CONTAINER_CREDENTIALS_FULL_URI",
"AWS_WEB_IDENTITY_TOKEN_FILE",
)
)
def _has_aws_sdk_creds_for_listing(slug: str) -> bool:
"""Credential check for AWS SDK providers in non-runtime discovery."""
slug_norm = str(slug or "").strip().lower()
current_norm = str(current_provider or "").strip().lower()
if _has_fast_aws_sdk_signal():
return True
if slug_norm != current_norm:
return False
try:
from agent.bedrock_adapter import has_aws_credentials
return bool(has_aws_credentials())
except Exception:
return False
data = fetch_models_dev()
# Build curated model lists keyed by hermes provider ID
@@ -1179,7 +1223,9 @@ def list_authenticated_providers(
# Check if credentials exist
has_creds = False
if overlay.extra_env_vars:
if overlay.auth_type == "aws_sdk":
has_creds = _has_aws_sdk_creds_for_listing(hermes_slug)
elif overlay.extra_env_vars:
has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
# Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
if not has_creds and overlay.auth_type == "api_key":
@@ -1319,11 +1365,7 @@ def list_authenticated_providers(
# credentials come from the boto3 credential chain (env vars,
# ~/.aws/credentials, instance roles, etc.)
if not _cp_has_creds and _cp_config and getattr(_cp_config, "auth_type", "") == "aws_sdk":
try:
from agent.bedrock_adapter import has_aws_credentials
_cp_has_creds = has_aws_credentials()
except Exception:
pass
_cp_has_creds = _has_aws_sdk_creds_for_listing(_cp.slug)
if not _cp_has_creds:
continue
@@ -1412,14 +1454,17 @@ def list_authenticated_providers(
models_list = list(fb)
# Prefer the endpoint's live /models list when credentials are
# available. This keeps OpenAI-compatible relays (for example CRS)
# in sync when the server catalog changes without requiring the
# user to mirror every model into config.yaml.
# available, unless the provider explicitly opts out via
# discover_models: false (e.g. dedicated endpoints that expose
# the entire aggregator catalog via /models).
api_key = str(ep_cfg.get("api_key", "") or "").strip()
if not api_key:
key_env = str(ep_cfg.get("key_env", "") or "").strip()
api_key = os.environ.get(key_env, "").strip() if key_env else ""
if api_url and api_key:
discover = ep_cfg.get("discover_models", True)
if isinstance(discover, str):
discover = discover.lower() not in ("false", "no", "0")
if api_url and api_key and discover:
try:
from hermes_cli.models import fetch_api_models
live_models = fetch_api_models(api_key, api_url)
+2 -1
View File
@@ -40,6 +40,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openrouter/elephant-alpha", "free"),
("openrouter/owl-alpha", "free"),
("openai/gpt-5.5", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2.5-pro", ""),
@@ -773,7 +774,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway (200+ models, $5 free credit, no markup)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
@@ -803,6 +803,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("opencode-go", "OpenCode Go", "OpenCode Go (open models, $10/month subscription)"),
ProviderEntry("bedrock", "AWS Bedrock", "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
ProviderEntry("azure-foundry", "Azure Foundry", "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway"),
]
# Derived dicts — used throughout the codebase
+8
View File
@@ -141,6 +141,7 @@ def _browser_label(current_provider: str) -> str:
"browserbase": "Browserbase",
"browser-use": "Browser Use",
"firecrawl": "Firecrawl",
"tinyfish": "TinyFish",
"camofox": "Camofox",
"local": "Local browser",
}
@@ -169,6 +170,7 @@ def _resolve_browser_feature_state(
direct_browserbase: bool,
direct_browser_use: bool,
direct_firecrawl: bool,
direct_tinyfish: bool,
managed_browser_available: bool,
) -> tuple[str, bool, bool, bool]:
"""Resolve browser availability using the same precedence as runtime."""
@@ -196,6 +198,10 @@ def _resolve_browser_feature_state(
available = bool(browser_local_available and direct_firecrawl)
active = bool(browser_tool_enabled and available)
return current_provider, available, active, False
if current_provider == "tinyfish":
available = bool(browser_local_available and direct_tinyfish)
active = bool(browser_tool_enabled and available)
return current_provider, available, active, False
if current_provider == "camofox":
return current_provider, False, False, False
@@ -286,6 +292,7 @@ def get_nous_subscription_features(
direct_camofox = bool(get_env_value("CAMOFOX_URL"))
direct_browserbase = bool(get_env_value("BROWSERBASE_API_KEY") and get_env_value("BROWSERBASE_PROJECT_ID"))
direct_browser_use = bool(get_env_value("BROWSER_USE_API_KEY"))
direct_tinyfish = bool(get_env_value("TINYFISH_API_KEY"))
direct_modal = has_direct_modal_credentials()
# When use_gateway is set, suppress direct credentials for managed detection
@@ -363,6 +370,7 @@ def get_nous_subscription_features(
direct_browserbase=direct_browserbase,
direct_browser_use=direct_browser_use,
direct_firecrawl=direct_firecrawl,
direct_tinyfish=direct_tinyfish,
managed_browser_available=managed_browser_available,
)
+52
View File
@@ -33,12 +33,15 @@ so plugin-defined tools appear alongside the built-in tools.
from __future__ import annotations
import asyncio
import importlib
import importlib.metadata
import importlib.util
import inspect
import logging
import os
import sys
import threading
import types
from dataclasses import dataclass, field
from pathlib import Path
@@ -1226,6 +1229,55 @@ def get_plugin_command_handler(name: str) -> Optional[Callable]:
return entry["handler"] if entry else None
_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS = 30.0
def resolve_plugin_command_result(result: Any) -> Any:
"""Resolve a plugin command return value, awaiting async handlers when needed.
Sync CLI/TUI dispatch sites call plugin handlers from plain functions.
If a handler is async, await it directly when no loop is running; if
we're already inside an active loop, run it in a helper thread with its
own loop so the caller still gets a concrete result synchronously. The
threaded path is bounded by a 30s timeout so a hung async handler cannot
wedge the terminal indefinitely.
"""
if not inspect.isawaitable(result):
return result
try:
asyncio.get_running_loop()
except RuntimeError:
return asyncio.run(result)
outcome: Dict[str, Any] = {}
failure: Dict[str, BaseException] = {}
done = threading.Event()
def _runner() -> None:
try:
outcome["value"] = asyncio.run(result)
except BaseException as exc: # pragma: no cover - re-raised below
failure["exc"] = exc
finally:
done.set()
thread = threading.Thread(
target=_runner,
name="hermes-plugin-command-await",
daemon=True,
)
thread.start()
if not done.wait(timeout=_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS):
raise TimeoutError(
"Plugin command async handler did not complete within "
f"{_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS:.0f}s"
)
if "exc" in failure:
raise failure["exc"]
return outcome.get("value")
def get_plugin_commands() -> Dict[str, dict]:
"""Return the full plugin commands dict (name → {handler, description, plugin}).
+378 -113
View File
@@ -15,13 +15,18 @@ import shutil
import subprocess
import sys
from pathlib import Path
from typing import Optional
from typing import Any, Optional
from hermes_constants import get_hermes_home
from hermes_cli.config import cfg_get
logger = logging.getLogger(__name__)
class PluginOperationError(Exception):
"""Recoverable plugin install/update failure (CLI exits; HTTP maps to 4xx)."""
# Minimum manifest version this installer understands.
# Plugins may declare ``manifest_version: 1`` in plugin.yaml;
# future breaking changes to the manifest schema bump this.
@@ -150,6 +155,24 @@ def _copy_example_files(plugin_dir: Path, console) -> None:
)
def _missing_requires_env_names(manifest: dict) -> list[str]:
"""Return declared ``requires_env`` names that are unset in ``~/.hermes/.env``."""
requires_env = manifest.get("requires_env") or []
if not requires_env:
return []
from hermes_cli.config import get_env_value
env_specs: list[dict] = []
for entry in requires_env:
if isinstance(entry, str):
env_specs.append({"name": entry})
elif isinstance(entry, dict) and entry.get("name"):
env_specs.append(entry)
return [s["name"] for s in env_specs if s.get("name") and not get_env_value(s["name"])]
def _prompt_plugin_env_vars(manifest: dict, console) -> None:
"""Prompt for required environment variables declared in plugin.yaml.
@@ -283,6 +306,95 @@ def _require_installed_plugin(name: str, plugins_dir: Path, console) -> Path:
# ---------------------------------------------------------------------------
def _install_plugin_core(identifier: str, *, force: bool) -> tuple[Path, dict, str]:
"""Clone Git plugin into ``~/.hermes/plugins``.
Returns ``(target_dir, installed_manifest, canonical_name)``.
Raises ``PluginOperationError`` on failure.
"""
import tempfile
try:
git_url = _resolve_git_url(identifier)
except ValueError as e:
raise PluginOperationError(str(e)) from e
plugins_dir = _plugins_dir()
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError as e:
raise PluginOperationError(
"git is not installed or not in PATH.",
) from e
except subprocess.TimeoutExpired as e:
raise PluginOperationError(
"Git clone timed out after 60 seconds.",
) from e
if result.returncode != 0:
err = (result.stderr or result.stdout or "").strip()
raise PluginOperationError(f"Git clone failed:\n{err}")
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
raise PluginOperationError(str(e)) from e
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
raise PluginOperationError(
f"Plugin '{plugin_name}' has invalid manifest_version "
f"'{mv}' (expected an integer).",
) from None
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
raise PluginOperationError(
f"Plugin '{plugin_name}' requires manifest_version {mv}, "
f"but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}. "
f"Run {recommended_update_command()} to update Hermes.",
) from None
if target.exists():
if not force:
raise PluginOperationError(
f"Plugin '{plugin_name}' already exists. Use force reinstall "
f"or run `hermes plugins update {plugin_name}`.",
)
shutil.rmtree(target)
shutil.move(str(tmp_target), str(target))
has_yaml = (target / "plugin.yaml").exists() or (target / "plugin.yml").exists()
if not has_yaml and not (target / "__init__.py").exists():
logger.warning(
"%s has no plugin.yaml / __init__.py; may not be a valid plugin",
plugin_name,
)
from rich.console import Console
_copy_example_files(target, Console())
installed_manifest = _read_manifest(target)
installed_name = installed_manifest.get("name") or target.name
return target, installed_manifest, installed_name
def cmd_install(
identifier: str,
force: bool = False,
@@ -293,7 +405,6 @@ def cmd_install(
After install, prompt "Enable now? [y/N]" unless *enable* is provided
(True = auto-enable without prompting, False = install disabled).
"""
import tempfile
from rich.console import Console
console = Console()
@@ -304,114 +415,41 @@ def cmd_install(
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Warn about insecure / local URL schemes
if git_url.startswith(("http://", "file://")):
console.print(
"[yellow]Warning:[/yellow] Using insecure/local URL scheme. "
"Consider using https:// or git@ for production installs."
"Consider using https:// or git@ for production installs.",
)
plugins_dir = _plugins_dir()
console.print(f"[dim]Cloning {git_url}...[/dim]")
# Clone into a temp directory first so we can read plugin.yaml for the name
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
console.print(f"[dim]Cloning {git_url}...[/dim]")
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git clone timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(
f"[red]Error:[/red] Git clone failed:\n{result.stderr.strip()}"
)
sys.exit(1)
# Read manifest
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
# Sanitize plugin name against path traversal
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Check manifest_version compatibility
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' has invalid "
f"manifest_version '{mv}' (expected an integer)."
)
sys.exit(1)
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
)
sys.exit(1)
if target.exists():
if not force:
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' already exists at {target}.\n"
f"Use [bold]--force[/bold] to remove and reinstall, or "
f"[bold]hermes plugins update {plugin_name}[/bold] to pull latest."
)
sys.exit(1)
console.print(f"[dim] Removing existing {plugin_name}...[/dim]")
shutil.rmtree(target)
# Move from temp to final location
shutil.move(str(tmp_target), str(target))
# Validate it looks like a plugin
if not (target / "plugin.yaml").exists() and not (target / "__init__.py").exists():
if not (target / "plugin.yaml").exists() and not (target / "plugin.yml").exists() and not (
target / "__init__.py"
).exists():
console.print(
f"[yellow]Warning:[/yellow] {plugin_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin."
f"[yellow]Warning:[/yellow] {installed_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin.",
)
# Copy .example files to their real names (e.g. config.yaml.example → config.yaml)
_copy_example_files(target, console)
# Re-read manifest from installed location (for env var prompting)
installed_manifest = _read_manifest(target)
# Prompt for required environment variables before showing after-install docs
_prompt_plugin_env_vars(installed_manifest, console)
_display_after_install(target, identifier)
# Determine the canonical plugin name for enable-list bookkeeping.
installed_name = installed_manifest.get("name") or target.name
# Decide whether to enable: explicit flag > interactive prompt > default off
should_enable = enable
if should_enable is None:
# Interactive prompt unless stdin isn't a TTY (scripted install).
if sys.stdin.isatty() and sys.stdout.isatty():
try:
answer = input(
f" Enable '{installed_name}' now? [y/N]: "
f" Enable '{installed_name}' now? [y/N]: ",
).strip().lower()
should_enable = answer in ("y", "yes")
except (EOFError, KeyboardInterrupt):
@@ -427,12 +465,12 @@ def cmd_install(
_save_enabled_set(enabled)
_save_disabled_set(disabled)
console.print(
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled."
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled.",
)
else:
console.print(
f"[dim]Plugin installed but not enabled. "
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]"
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]",
)
console.print("[dim]Restart the gateway for the plugin to take effect:[/dim]")
@@ -462,36 +500,22 @@ def cmd_update(name: str) -> None:
console.print(f"[dim]Updating {name}...[/dim]")
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git pull timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(f"[red]Error:[/red] Git pull failed:\n{result.stderr.strip()}")
ok, output = _git_pull_plugin_dir(target)
if not ok:
console.print(f"[red]Error:[/red] {output}")
sys.exit(1)
# Copy any new .example files
_copy_example_files(target, console)
output = result.stdout.strip()
if "Already up to date" in output:
out = output.strip()
if "Already up to date" in out:
console.print(
f"[green]✓[/green] Plugin [bold]{name}[/bold] is already up to date."
)
else:
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] updated.")
console.print(f"[dim]{output}[/dim]")
console.print(f"[dim]{out}[/dim]")
def cmd_remove(name: str) -> None:
@@ -1244,6 +1268,247 @@ def _run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
print()
def dashboard_install_plugin(
identifier: str,
*,
force: bool,
enable: bool,
) -> dict[str, Any]:
"""Non-interactive install for the web dashboard. Returns a JSON-serializable dict."""
warnings: list[str] = []
try:
git_url = _resolve_git_url(identifier)
if git_url.startswith(("http://", "file://")):
warnings.append(
"Insecure URL scheme; prefer https:// or git@ for production installs.",
)
except ValueError:
pass
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as exc:
return {"ok": False, "error": str(exc)}
missing_env = _missing_requires_env_names(installed_manifest)
if enable:
en = _get_enabled_set()
dis = _get_disabled_set()
en.add(installed_name)
dis.discard(installed_name)
_save_enabled_set(en)
_save_disabled_set(dis)
hint: str | None = None
ap = target / "after-install.md"
if ap.exists():
hint = str(ap)
return {
"ok": True,
"plugin_name": installed_name,
"warnings": warnings,
"missing_env": missing_env,
"after_install_path": hint,
"enabled": enable,
}
def _get_plugin_toolset_key(name: str) -> Optional[str]:
"""Return the toolset key a plugin registers its tools under, or None.
Queries the live tool registry the plugin must already be loaded.
Falls back to reading ``provides_tools`` from plugin.yaml and looking
up the toolset from the registry for the first tool name found.
"""
try:
from tools.registry import registry
except Exception:
return None
# Check the plugin manager for tools this plugin registered
try:
from hermes_cli.plugins import discover_plugins, get_plugin_manager
discover_plugins() # idempotent — ensures plugins are loaded
manager = get_plugin_manager()
for _key, loaded in manager._plugins.items():
if loaded.manifest.name == name or _key == name:
for tool_name in loaded.tools_registered:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
break
except Exception:
pass
# Fallback: read provides_tools from manifest on disk and query registry
try:
from hermes_cli.plugins import get_bundled_plugins_dir
for base in (get_bundled_plugins_dir(), _plugins_dir()):
if not base.is_dir():
continue
candidate = base / name
if candidate.is_dir():
manifest = _read_manifest(candidate)
for tool_name in manifest.get("provides_tools") or []:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
except Exception:
pass
return None
def _toggle_plugin_toolset(name: str, *, enable: bool) -> None:
"""Add or remove a plugin's toolset from platform_toolsets for all platforms.
Only acts if the plugin actually provides tools (has a toolset key).
"""
toolset_key = _get_plugin_toolset_key(name)
if not toolset_key:
return
from hermes_cli.config import load_config, save_config
config = load_config()
platform_toolsets = config.get("platform_toolsets")
if not isinstance(platform_toolsets, dict):
platform_toolsets = {}
config["platform_toolsets"] = platform_toolsets
changed = False
for platform, ts_list in platform_toolsets.items():
if not isinstance(ts_list, list):
continue
if enable:
if toolset_key not in ts_list:
ts_list.append(toolset_key)
changed = True
else:
if toolset_key in ts_list:
ts_list.remove(toolset_key)
changed = True
# If enabling and no platforms have toolset lists yet, add to "cli" at minimum
if enable and not changed and not platform_toolsets:
platform_toolsets["cli"] = [toolset_key]
changed = True
if changed:
save_config(config)
def dashboard_set_agent_plugin_enabled(name: str, *, enabled: bool) -> dict[str, Any]:
"""Enable or disable a plugin in ``config.yaml`` (runtime allow/deny lists).
For plugins that provide tools (toolsets), also toggles the toolset in
``platform_toolsets`` so the agent actually sees the tools in sessions.
"""
if not _plugin_exists(name):
return {"ok": False, "error": f"Plugin '{name}' is not installed or bundled."}
en = _get_enabled_set()
dis = _get_disabled_set()
if enabled:
if name in en and name not in dis:
return {"ok": True, "name": name, "unchanged": True}
en.add(name)
dis.discard(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=True)
return {"ok": True, "name": name, "unchanged": False}
if name not in en and name in dis:
return {"ok": True, "name": name, "unchanged": True}
en.discard(name)
dis.add(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=False)
return {"ok": True, "name": name, "unchanged": False}
def _user_installed_plugin_dir(name: str) -> Optional[Path]:
"""Resolved path under ``~/.hermes/plugins/<name>`` if it exists."""
plugins_dir = _plugins_dir()
try:
target = _sanitize_plugin_name(name, plugins_dir)
except ValueError:
return None
return target if target.is_dir() else None
def dashboard_update_user_plugin(name: str) -> dict[str, Any]:
"""``git pull`` inside ``~/.hermes/plugins/<name>``."""
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {_plugins_dir()}.",
}
if not (target / ".git").exists():
return {
"ok": False,
"error": f"Plugin '{name}' is not a git checkout; cannot pull updates.",
}
ok, msg = _git_pull_plugin_dir(target)
if not ok:
return {"ok": False, "error": msg}
from rich.console import Console
_copy_example_files(target, Console())
unchanged = "Already up to date" in msg
return {"ok": True, "name": name, "output": msg, "unchanged": unchanged}
def _git_pull_plugin_dir(target: Path) -> tuple[bool, str]:
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
return False, "git is not installed or not in PATH."
except subprocess.TimeoutExpired:
return False, "Git pull timed out after 60 seconds."
if result.returncode != 0:
err = (result.stderr or "").strip() or result.stdout.strip()
return False, err or "git pull failed."
return True, result.stdout.strip()
def dashboard_remove_user_plugin(name: str) -> dict[str, Any]:
"""Delete a plugin tree under ``~/.hermes/plugins/`` only."""
plugins_dir = _plugins_dir()
for n, _ver, _d, src, _path in _discover_all_plugins():
if n == name and src == "bundled":
return {"ok": False, "error": "Bundled plugins cannot be removed from the dashboard."}
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {plugins_dir}.",
}
shutil.rmtree(target)
return {"ok": True, "name": name}
def plugins_command(args) -> None:
"""Dispatch hermes plugins subcommands."""
action = getattr(args, "plugins_action", None)
+11 -2
View File
@@ -11,7 +11,7 @@ zero migration needed.
Usage::
hermes profile create coder # fresh profile + bundled skills
hermes profile create coder --clone # also copy config, .env, SOUL.md
hermes profile create coder --clone # also copy config, .env, SOUL.md, skills
hermes profile create coder --clone-all # full copy of source profile
coder chat # use via wrapper alias
hermes -p coder chat # or via flag
@@ -411,7 +411,8 @@ def create_profile(
clone_all:
If True, do a full copytree of the source (all state).
clone_config:
If True, copy only config files (config.yaml, .env, SOUL.md).
If True, copy config files (config.yaml, .env, SOUL.md), installed
skills, and selected profile identity files from the source profile.
no_alias:
If True, skip wrapper script creation.
@@ -469,6 +470,14 @@ def create_profile(
if src.exists():
shutil.copy2(src, profile_dir / filename)
# Clone installed skills from the source profile. The dashboard's
# "clone from default" flow is expected to preserve both bundled
# and user-installed skills so the new profile immediately has the
# same agent capabilities as the source profile.
source_skills = source_dir / "skills"
if source_skills.is_dir():
shutil.copytree(source_skills, profile_dir / "skills", dirs_exist_ok=True)
# Clone memory and other subdirectory files
for relpath in _CLONE_SUBDIR_FILES:
src = source_dir / relpath
+11 -2
View File
@@ -358,11 +358,20 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
return None
if not requested_norm.startswith("custom:"):
try:
auth_mod.resolve_provider(requested_norm)
canonical = auth_mod.resolve_provider(requested_norm)
except AuthError:
pass
else:
return None
# A user-declared ``custom_providers`` entry whose name matches
# only an *alias* (``kimi`` → built-in ``kimi-coding``) is the
# user's intended target — alias rewriting would otherwise hijack
# the request. We only defer to the built-in when the raw name is
# the canonical provider itself (``nous``, ``openrouter``, …) so
# accidentally shadowing a canonical provider still resolves to
# the built-in. See tests/hermes_cli/test_runtime_provider_resolution.py
# ``test_named_custom_provider_does_not_shadow_builtin_provider``.
if (canonical or "").strip().lower() == requested_norm:
return None
config = load_config()
+26 -6
View File
@@ -384,7 +384,7 @@ def _print_setup_summary(config: dict, hermes_home):
else:
tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY/FIRECRAWL_API_URL, or TAVILY_API_KEY"))
# Browser tools (local Chromium, Camofox, Browserbase, Browser Use, or Firecrawl)
# Browser tools (local Chromium, Camofox, Browserbase, Browser Use, Firecrawl, or TinyFish)
browser_provider = subscription_features.browser.current_provider
if subscription_features.browser.managed_by_nous:
tool_status.append(("Browser Automation (Nous Browser Use)", True, None))
@@ -406,6 +406,10 @@ def _print_setup_summary(config: dict, hermes_home):
)
elif browser_provider == "Camofox":
missing_browser_hint = "CAMOFOX_URL"
elif browser_provider == "TinyFish":
missing_browser_hint = (
"npm install -g agent-browser and set TINYFISH_API_KEY"
)
elif browser_provider == "Local browser":
missing_browser_hint = "npm install -g agent-browser"
tool_status.append(
@@ -1190,6 +1194,13 @@ def _setup_tts_provider(config: dict):
"Falling back to Edge TTS."
)
selected = "edge"
if selected == "xai":
print()
voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")
if voice_id and voice_id.strip():
config.setdefault("tts", {}).setdefault("xai", {})["voice_id"] = voice_id.strip()
print_success(f"xAI voice_id set to: {voice_id.strip()}")
elif selected == "minimax":
existing = get_env_value("MINIMAX_API_KEY")
@@ -1643,7 +1654,11 @@ def setup_terminal_backend(config: dict):
def _apply_default_agent_settings(config: dict):
"""Apply recommended defaults for all agent settings without prompting."""
config.setdefault("agent", {})["max_turns"] = 90
save_env_value("HERMES_MAX_ITERATIONS", "90")
# config.yaml is the authoritative source for max_turns; the gateway
# bridges it into HERMES_MAX_ITERATIONS at startup. We no longer write
# to .env to avoid the dual-source inconsistency that caused the
# 60-vs-500 bug (stale .env entry silently shadowing config.yaml).
remove_env_value("HERMES_MAX_ITERATIONS")
config.setdefault("display", {})["tool_progress"] = "all"
@@ -1673,9 +1688,10 @@ def setup_agent_settings(config: dict):
print()
# ── Max Iterations ──
current_max = get_env_value("HERMES_MAX_ITERATIONS") or str(
cfg_get(config, "agent", "max_turns", default=90)
)
# config.yaml is authoritative; read from there. If a legacy .env
# entry is still around (from pre-PR#18413 setups), prefer the
# config value so we don't surface a stale number to the user.
current_max = str(cfg_get(config, "agent", "max_turns", default=90))
print_info("Maximum tool-calling iterations per conversation.")
print_info("Higher = more complex tasks, but costs more tokens.")
print_info(
@@ -1686,9 +1702,13 @@ def setup_agent_settings(config: dict):
try:
max_iter = int(max_iter_str)
if max_iter > 0:
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
# Write to config.yaml (authoritative) only. Also clean up any
# stale .env entry from earlier setup runs — the gateway's
# bridge in gateway/run.py now unconditionally derives
# HERMES_MAX_ITERATIONS from agent.max_turns at startup.
config.setdefault("agent", {})["max_turns"] = max_iter
config.pop("max_turns", None)
remove_env_value("HERMES_MAX_ITERATIONS")
print_success(f"Max iterations set to {max_iter}")
except ValueError:
print_warning("Invalid number, keeping current value")
+2 -1
View File
@@ -18,6 +18,7 @@ for reinstall when scopes/commands change.
from __future__ import annotations
import json
import os
import sys
from pathlib import Path
@@ -128,7 +129,7 @@ def slack_manifest_command(args) -> int:
target = Path(get_hermes_home()) / "slack-manifest.json"
except Exception:
target = Path.home() / ".hermes" / "slack-manifest.json"
target = Path(os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes")) / "slack-manifest.json"
else:
target = Path(write_target).expanduser()
target.parent.mkdir(parents=True, exist_ok=True)
+1
View File
@@ -125,6 +125,7 @@ def show_status(args):
keys = {
"OpenRouter": "OPENROUTER_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"NVIDIA": "NVIDIA_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
+32 -1
View File
@@ -379,6 +379,15 @@ TOOL_CATEGORIES = {
"browser_provider": "firecrawl",
"post_setup": "agent_browser",
},
{
"name": "TinyFish",
"tag": "Low latency browser with stealth & proxies",
"env_vars": [
{"key": "TINYFISH_API_KEY", "prompt": "TinyFish API key", "url": "https://agent.tinyfish.ai/api-keys"},
],
"browser_provider": "tinyfish",
"post_setup": "agent_browser",
},
{
"name": "Camofox",
"badge": "free · local",
@@ -1822,7 +1831,7 @@ def _reconfigure_tool(config: dict):
cat = TOOL_CATEGORIES.get(ts_key)
reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
if cat or reqs:
if _toolset_has_keys(ts_key, config):
if _toolset_has_keys(ts_key, config) or _toolset_enabled_for_reconfigure(ts_key, config):
configurable.append((ts_key, ts_label))
if not configurable:
@@ -1848,6 +1857,28 @@ def _reconfigure_tool(config: dict):
save_config(config)
def _toolset_enabled_for_reconfigure(ts_key: str, config: dict) -> bool:
"""Return True if a configurable toolset is enabled anywhere.
Reconfigure must include enabled-but-unconfigured categories so users can
finish provider/API-key setup without disabling and re-enabling the toolset.
"""
for platform in PLATFORMS:
if not _toolset_allowed_for_platform(ts_key, platform):
continue
try:
enabled = _get_platform_tools(
config,
platform,
include_default_mcp_servers=False,
)
except Exception:
continue
if ts_key in enabled:
return True
return False
def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
"""Reconfigure a tool category - provider selection + API key update."""
icon = cat.get("icon", "")
+540 -10
View File
@@ -345,6 +345,7 @@ _CATEGORY_MERGE: Dict[str, str] = {
"dashboard": "display",
"code_execution": "agent",
"prompt_caching": "agent",
"goals": "agent",
# Only `telegram.reactions` currently lives under telegram — fold it in
# with the other messaging-platform config (discord) so it isn't an
# orphan tab of one field.
@@ -2344,6 +2345,254 @@ async def delete_cron_job(job_id: str):
return {"ok": True}
# ---------------------------------------------------------------------------
# Profile management endpoints (minimal — list/create/rename/delete + SOUL.md)
# ---------------------------------------------------------------------------
class ProfileCreate(BaseModel):
name: str
clone_from_default: bool = False
class ProfileRename(BaseModel):
new_name: str
class ProfileSoulUpdate(BaseModel):
content: str
def _profile_attr(info, name: str, default: Any = None) -> Any:
try:
return getattr(info, name)
except Exception:
return default
def _profile_to_dict(info) -> Dict[str, Any]:
return {
"name": _profile_attr(info, "name", ""),
"path": str(_profile_attr(info, "path", "")),
"is_default": bool(_profile_attr(info, "is_default", False)),
"model": _profile_attr(info, "model"),
"provider": _profile_attr(info, "provider"),
"has_env": bool(_profile_attr(info, "has_env", False)),
"skill_count": int(_profile_attr(info, "skill_count", 0) or 0),
}
def _fallback_profile_dicts(profiles_mod) -> List[Dict[str, Any]]:
def _safe(callable_, default):
try:
return callable_()
except Exception:
return default
profiles: List[Dict[str, Any]] = []
default_home = profiles_mod._get_default_hermes_home()
if default_home.is_dir():
model, provider = _safe(lambda: profiles_mod._read_config_model(default_home), (None, None))
profiles.append({
"name": "default",
"path": str(default_home),
"is_default": True,
"model": model,
"provider": provider,
"has_env": (default_home / ".env").exists(),
"skill_count": _safe(lambda: profiles_mod._count_skills(default_home), 0),
})
profiles_root = profiles_mod._get_profiles_root()
if profiles_root.is_dir():
for entry in sorted(profiles_root.iterdir()):
if not entry.is_dir() or not profiles_mod._PROFILE_ID_RE.match(entry.name):
continue
model, provider = _safe(lambda entry=entry: profiles_mod._read_config_model(entry), (None, None))
profiles.append({
"name": entry.name,
"path": str(entry),
"is_default": False,
"model": model,
"provider": provider,
"has_env": (entry / ".env").exists(),
"skill_count": _safe(lambda entry=entry: profiles_mod._count_skills(entry), 0),
})
return profiles
def _resolve_profile_dir(name: str) -> Path:
"""Validate ``name`` and resolve to its directory or raise an HTTPException."""
from hermes_cli import profiles as profiles_mod
try:
profiles_mod.validate_profile_name(name)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
if not profiles_mod.profile_exists(name):
raise HTTPException(status_code=404, detail=f"Profile '{name}' does not exist.")
return profiles_mod.get_profile_dir(name)
def _profile_setup_command(name: str) -> str:
"""Return the shell command used to configure a profile in the CLI."""
_resolve_profile_dir(name)
return "hermes setup" if name == "default" else f"{name} setup"
@app.get("/api/profiles")
async def list_profiles_endpoint():
from hermes_cli import profiles as profiles_mod
try:
return {"profiles": [_profile_to_dict(p) for p in profiles_mod.list_profiles()]}
except Exception:
_log.exception("GET /api/profiles failed; falling back to profile directory scan")
return {"profiles": _fallback_profile_dicts(profiles_mod)}
@app.post("/api/profiles")
async def create_profile_endpoint(body: ProfileCreate):
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.create_profile(
name=body.name,
clone_from="default" if body.clone_from_default else None,
clone_config=body.clone_from_default,
)
# Match the CLI's profile-create flow: fresh named profiles get the
# bundled skills installed. When cloning from default, create_profile()
# has already copied the source profile's skills, including any
# user-installed skills.
if not body.clone_from_default:
profiles_mod.seed_profile_skills(path, quiet=True)
# Match the CLI's profile-create flow: named profiles should get a
# wrapper in ~/.local/bin when the alias is safe to create.
collision = profiles_mod.check_alias_collision(body.name)
if not collision:
profiles_mod.create_wrapper_script(body.name)
except (ValueError, FileExistsError, FileNotFoundError) as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("POST /api/profiles failed")
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "name": body.name, "path": str(path)}
@app.get("/api/profiles/{name}/setup-command")
async def get_profile_setup_command(name: str):
return {"command": _profile_setup_command(name)}
@app.post("/api/profiles/{name}/open-terminal")
async def open_profile_terminal_endpoint(name: str):
try:
command = _profile_setup_command(name)
if sys.platform.startswith("win"):
subprocess.Popen(["cmd.exe", "/c", "start", "", command])
elif sys.platform == "darwin":
escaped = command.replace("\\", "\\\\").replace('"', '\\"')
applescript = (
'tell application "Terminal"\n'
"activate\n"
f'do script "{escaped}"\n'
"end tell"
)
subprocess.Popen(["osascript", "-e", applescript])
else:
terminal_commands = [
("x-terminal-emulator", ["x-terminal-emulator", "-e", "sh", "-lc", command]),
("gnome-terminal", ["gnome-terminal", "--", "sh", "-lc", command]),
("konsole", ["konsole", "-e", "sh", "-lc", command]),
("xfce4-terminal", ["xfce4-terminal", "-e", f"sh -lc '{command}'"]),
("mate-terminal", ["mate-terminal", "-e", f"sh -lc '{command}'"]),
("lxterminal", ["lxterminal", "-e", f"sh -lc '{command}'"]),
("tilix", ["tilix", "-e", "sh", "-lc", command]),
("alacritty", ["alacritty", "-e", "sh", "-lc", command]),
("kitty", ["kitty", "sh", "-lc", command]),
("xterm", ["xterm", "-e", "sh", "-lc", command]),
]
for executable, popen_args in terminal_commands:
if subprocess.call(
["which", executable],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
) == 0:
subprocess.Popen(popen_args)
break
else:
raise HTTPException(
status_code=400,
detail="No supported terminal emulator found",
)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except HTTPException:
raise
except Exception as e:
_log.exception("POST /api/profiles/%s/open-terminal failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "command": command}
@app.patch("/api/profiles/{name}")
async def rename_profile_endpoint(name: str, body: ProfileRename):
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.rename_profile(name, body.new_name)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except (ValueError, FileExistsError) as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("PATCH /api/profiles/%s failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "name": body.new_name, "path": str(path)}
@app.delete("/api/profiles/{name}")
async def delete_profile_endpoint(name: str):
"""Delete a profile. The dashboard collects the user's confirmation in
its own dialog before this request, so we always pass ``yes=True`` to
skip the CLI's interactive prompt."""
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.delete_profile(name, yes=True)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("DELETE /api/profiles/%s failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "path": str(path)}
@app.get("/api/profiles/{name}/soul")
async def get_profile_soul(name: str):
soul_path = _resolve_profile_dir(name) / "SOUL.md"
if soul_path.exists():
try:
return {"content": soul_path.read_text(encoding="utf-8"), "exists": True}
except OSError as e:
raise HTTPException(status_code=500, detail=f"Could not read SOUL.md: {e}")
return {"content": "", "exists": False}
@app.put("/api/profiles/{name}/soul")
async def update_profile_soul(name: str, body: ProfileSoulUpdate):
soul_path = _resolve_profile_dir(name) / "SOUL.md"
try:
soul_path.write_text(body.content, encoding="utf-8")
except OSError as e:
_log.exception("PUT /api/profiles/%s/soul failed", name)
raise HTTPException(status_code=500, detail=f"Could not write SOUL.md: {e}")
return {"ok": True}
# ---------------------------------------------------------------------------
# Skills & Tools endpoints
# ---------------------------------------------------------------------------
@@ -2633,6 +2882,25 @@ _VALID_CHANNEL_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")
# loopback so tests don't need to rewrite request scope.
_LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})
def _is_public_bind() -> bool:
"""True when bound to all-interfaces (operator used --insecure)."""
return getattr(app.state, "bound_host", "") in ("0.0.0.0", "::")
def _ws_client_is_allowed(ws: "WebSocket") -> bool:
"""Check if the WebSocket client IP is acceptable.
Allows loopback always; allows any IP when bound to all-interfaces
(--insecure mode, guarded by session token auth).
"""
if _is_public_bind():
return True
client_host = ws.client.host if ws.client else ""
if not client_host:
return True
return client_host in _LOOPBACK_HOSTS
# Per-channel subscriber registry used by /api/pub (PTY-side gateway → dashboard)
# and /api/events (dashboard → browser sidebar). Keyed by an opaque channel id
# the chat tab generates on mount; entries auto-evict when the last subscriber
@@ -2723,8 +2991,7 @@ async def pty_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -2831,8 +3098,7 @@ async def gateway_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -2864,8 +3130,7 @@ async def pub_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -2894,8 +3159,7 @@ async def events_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3369,12 +3633,16 @@ def _get_dashboard_plugins(force_rescan: bool = False) -> list:
@app.get("/api/dashboard/plugins")
async def get_dashboard_plugins():
"""Return discovered dashboard plugins."""
"""Return discovered dashboard plugins (excludes user-hidden ones)."""
plugins = _get_dashboard_plugins()
# Strip internal fields before sending to frontend.
# Read user's hidden plugins list from config.
config = load_config()
hidden: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
# Strip internal fields before sending to frontend and filter out hidden.
return [
{k: v for k, v in p.items() if not k.startswith("_")}
for p in plugins
if p["name"] not in hidden
]
@@ -3385,6 +3653,268 @@ async def rescan_dashboard_plugins():
return {"ok": True, "count": len(plugins)}
class _AgentPluginInstallBody(BaseModel):
identifier: str
force: bool = False
enable: bool = True
def _strip_dashboard_manifest(p: Dict[str, Any]) -> Dict[str, Any]:
return {k: v for k, v in p.items() if not k.startswith("_")}
def _merged_plugins_hub() -> Dict[str, Any]:
"""Agent discovery + dashboard manifests + optional provider picker metadata."""
from hermes_cli.plugins_cmd import (
_discover_all_plugins,
_get_current_context_engine,
_get_current_memory_provider,
_discover_context_engines,
_discover_memory_providers,
_get_disabled_set,
_get_enabled_set,
_read_manifest as _read_plugin_manifest_at,
)
dashboard_list = _get_dashboard_plugins()
dash_by_name = {str(p["name"]): p for p in dashboard_list}
disabled_set = _get_disabled_set()
enabled_set = _get_enabled_set()
# Read user-hidden plugins from config for the user_hidden field.
config = load_config()
hidden_plugins: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
plugins_root_resolved = (get_hermes_home() / "plugins").resolve()
rows: List[Dict[str, Any]] = []
for name, version, description, source, dir_str in _discover_all_plugins():
if name in disabled_set:
runtime_status = "disabled"
elif name in enabled_set:
runtime_status = "enabled"
else:
runtime_status = "inactive"
dir_path = Path(dir_str)
dm = dash_by_name.get(name)
has_dash_manifest = dm is not None or (dir_path / "dashboard" / "manifest.json").exists()
under_user_tree = False
try:
dir_path.resolve().relative_to(plugins_root_resolved)
under_user_tree = True
except ValueError:
pass
can_remove_update = (
source in ("user", "git") and under_user_tree and Path(dir_str).is_dir()
)
# Check if this plugin provides tools that require auth
auth_required = False
auth_command = ""
manifest_data = _read_plugin_manifest_at(dir_path)
provides_tools = manifest_data.get("provides_tools") or []
if provides_tools:
try:
from tools.registry import registry
for tname in provides_tools:
entry = registry.get_entry(tname)
if entry and entry.check_fn and not entry.check_fn():
auth_required = True
auth_command = f"hermes auth {name}"
break
except Exception:
pass
rows.append({
"name": name,
"version": version or "",
"description": description or "",
"source": source,
"runtime_status": runtime_status,
"has_dashboard_manifest": has_dash_manifest,
"dashboard_manifest": _strip_dashboard_manifest(dm) if dm else None,
"path": dir_str,
"can_remove": can_remove_update,
"can_update_git": can_remove_update and (Path(dir_str) / ".git").exists(),
"auth_required": auth_required,
"auth_command": auth_command,
"user_hidden": name in hidden_plugins,
})
agent_names = {r["name"] for r in rows}
orphan_dashboard = [
_strip_dashboard_manifest(p)
for p in dashboard_list
if str(p["name"]) not in agent_names
]
memory_providers: List[Dict[str, str]] = []
try:
for n, desc in _discover_memory_providers():
memory_providers.append({"name": n, "description": desc})
except Exception:
memory_providers = []
context_engines: List[Dict[str, str]] = []
try:
for n, desc in _discover_context_engines():
context_engines.append({"name": n, "description": desc})
except Exception:
context_engines = []
return {
"plugins": rows,
"orphan_dashboard_plugins": orphan_dashboard,
"providers": {
"memory_provider": _get_current_memory_provider() or "",
"memory_options": memory_providers,
"context_engine": _get_current_context_engine(),
"context_options": context_engines,
},
}
@app.get("/api/dashboard/plugins/hub")
async def get_plugins_hub(request: Request):
"""Unified agent plugins + dashboard extension metadata (session protected)."""
_require_token(request)
try:
return _merged_plugins_hub()
except Exception as exc:
_log.warning("plugins/hub failed: %s", exc)
raise HTTPException(status_code=500, detail="Failed to build plugins hub.") from exc
@app.post("/api/dashboard/agent-plugins/install")
async def post_agent_plugin_install(request: Request, body: _AgentPluginInstallBody):
_require_token(request)
from hermes_cli.plugins_cmd import dashboard_install_plugin
result = dashboard_install_plugin(
body.identifier.strip(),
force=body.force,
enable=body.enable,
)
if not result.get("ok"):
raise HTTPException(
status_code=400,
detail=result.get("error") or "Install failed.",
)
_get_dashboard_plugins(force_rescan=True)
# Strip internal paths from the response
result.pop("after_install_path", None)
return result
def _validate_plugin_name(name: str) -> str:
"""Reject path-traversal attempts in plugin name URL parameters."""
if not name or "/" in name or "\\" in name or ".." in name:
raise HTTPException(status_code=400, detail="Invalid plugin name.")
return name
@app.post("/api/dashboard/agent-plugins/{name}/enable")
async def post_agent_plugin_enable(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
result = dashboard_set_agent_plugin_enabled(name, enabled=True)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Enable failed.")
return result
@app.post("/api/dashboard/agent-plugins/{name}/disable")
async def post_agent_plugin_disable(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
result = dashboard_set_agent_plugin_enabled(name, enabled=False)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Disable failed.")
return result
@app.post("/api/dashboard/agent-plugins/{name}/update")
async def post_agent_plugin_update(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_update_user_plugin
result = dashboard_update_user_plugin(name)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Update failed.")
_get_dashboard_plugins(force_rescan=True)
return result
@app.delete("/api/dashboard/agent-plugins/{name}")
async def delete_agent_plugin(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_remove_user_plugin
result = dashboard_remove_user_plugin(name)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Remove failed.")
_get_dashboard_plugins(force_rescan=True)
return result
class _PluginProvidersPutBody(BaseModel):
memory_provider: Optional[str] = None
context_engine: Optional[str] = None
@app.put("/api/dashboard/plugin-providers")
async def put_plugin_providers(request: Request, body: _PluginProvidersPutBody):
"""Persist memory provider / context engine selection (writes config.yaml)."""
_require_token(request)
from hermes_cli.plugins_cmd import (
_save_context_engine,
_save_memory_provider,
)
if body.memory_provider is not None:
_save_memory_provider(body.memory_provider)
if body.context_engine is not None:
_save_context_engine(body.context_engine)
return {"ok": True}
class _PluginVisibilityBody(BaseModel):
hidden: bool
@app.post("/api/dashboard/plugins/{name}/visibility")
async def post_plugin_visibility(request: Request, name: str, body: _PluginVisibilityBody):
"""Toggle a plugin's sidebar visibility (persists to config.yaml dashboard.hidden_plugins)."""
_require_token(request)
name = _validate_plugin_name(name)
config = load_config()
if "dashboard" not in config or not isinstance(config.get("dashboard"), dict):
config["dashboard"] = {}
hidden_list: list = config["dashboard"].get("hidden_plugins") or []
if not isinstance(hidden_list, list):
hidden_list = []
if body.hidden and name not in hidden_list:
hidden_list.append(name)
elif not body.hidden and name in hidden_list:
hidden_list.remove(name)
config["dashboard"]["hidden_plugins"] = hidden_list
save_config(config)
return {"ok": True, "name": name, "hidden": body.hidden}
@app.get("/dashboard-plugins/{plugin_name}/{file_path:path}")
async def serve_plugin_asset(plugin_name: str, file_path: str):
"""Serve static assets from a dashboard plugin directory.
+51 -1
View File
@@ -8,14 +8,64 @@ import os
from pathlib import Path
_profile_fallback_warned: bool = False
def get_hermes_home() -> Path:
"""Return the Hermes home directory (default: ~/.hermes).
Reads HERMES_HOME env var, falls back to ~/.hermes.
This is the single source of truth all other copies should import this.
When ``HERMES_HOME`` is unset but an ``active_profile`` file indicates
a non-default profile is active, logs a loud one-shot warning to
``errors.log`` so cross-profile data corruption is diagnosable instead
of silent. Behavior is unchanged otherwise we still return
``~/.hermes`` because raising here would brick 30+ module-level
callers that import this at load time. Subprocess spawners are
expected to propagate ``HERMES_HOME`` explicitly (see the systemd
template in ``hermes_cli/gateway.py`` and the kanban dispatcher in
``hermes_cli/kanban_db.py``). See https://github.com/NousResearch/hermes-agent/issues/18594.
"""
val = os.environ.get("HERMES_HOME", "").strip()
return Path(val) if val else Path.home() / ".hermes"
if val:
return Path(val)
# Guard: if a non-default profile is sticky-active, warn once that
# the fallback to the default profile is almost certainly wrong.
global _profile_fallback_warned
if not _profile_fallback_warned:
try:
# Inline the default-root resolution from get_default_hermes_root()
# to stay import-safe (this function is called from module scope
# in 30+ files; we cannot afford to trigger logging setup here).
active_path = (Path.home() / ".hermes" / "active_profile")
active = active_path.read_text().strip() if active_path.exists() else ""
except (UnicodeDecodeError, OSError):
active = ""
if active and active != "default":
_profile_fallback_warned = True
# Write directly to stderr. We intentionally do NOT route this
# through ``logging`` because (a) this function is called at
# module-import time from 30+ sites, often before logging is
# configured, and (b) root-logger propagation would double-emit
# on consoles where a StreamHandler is already attached.
import sys
msg = (
f"[HERMES_HOME fallback] HERMES_HOME is unset but active "
f"profile is {active!r}. Falling back to ~/.hermes, which "
f"is the DEFAULT profile — not {active!r}. Any data this "
f"process writes will land in the wrong profile. The "
f"subprocess spawner should pass HERMES_HOME explicitly "
f"(see issue #18594)."
)
try:
sys.stderr.write(msg + "\n")
sys.stderr.flush()
except Exception:
pass
return Path.home() / ".hermes"
def get_default_hermes_root() -> Path:
+199 -45
View File
@@ -514,7 +514,7 @@ class SessionDB:
# Session lifecycle
# =========================================================================
def create_session(
def _insert_session_row(
self,
session_id: str,
source: str,
@@ -523,8 +523,8 @@ class SessionDB:
system_prompt: str = None,
user_id: str = None,
parent_session_id: str = None,
) -> str:
"""Create a new session record. Returns the session_id."""
) -> None:
"""Shared INSERT OR IGNORE for session rows."""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
@@ -542,8 +542,11 @@ class SessionDB:
),
)
self._execute_write(_do)
return session_id
def create_session(self, session_id: str, source: str, **kwargs) -> str:
"""Create a new session record. Returns the session_id."""
self._insert_session_row(session_id, source, **kwargs)
return session_id
def end_session(self, session_id: str, end_reason: str) -> None:
"""Mark a session as ended.
@@ -679,21 +682,41 @@ class SessionDB:
session_id: str,
source: str = "unknown",
model: str = None,
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
**kwargs,
) -> str:
"""Ensure a session row exists (INSERT OR IGNORE). Accepts optional kwargs."""
self._insert_session_row(session_id, source, model=model, **kwargs)
return session_id
def prune_empty_ghost_sessions(self, sessions_dir: "Optional[Path]" = None) -> int:
"""Remove empty TUI ghost sessions (no messages, no title, >24hr old)."""
cutoff = time.time() - 86400 # Only sessions older than 24 hours
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._execute_write(_do)
rows = conn.execute("""
SELECT id FROM sessions
WHERE source = 'tui'
AND title IS NULL
AND ended_at IS NOT NULL
AND started_at < ?
AND NOT EXISTS (
SELECT 1 FROM messages WHERE messages.session_id = sessions.id
)
""", (cutoff,)).fetchall()
ids = [r[0] if isinstance(r, (tuple, list)) else r["id"] for r in rows]
if ids:
placeholders = ",".join("?" * len(ids))
conn.execute(
f"DELETE FROM sessions WHERE id IN ({placeholders})", ids
)
return ids
removed_ids = self._execute_write(_do) or []
# Clean up any on-disk session files (belt-and-suspenders)
if sessions_dir and removed_ids:
for sid in removed_ids:
self._remove_session_files(sessions_dir, sid)
return len(removed_ids)
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
@@ -933,6 +956,7 @@ class SessionDB:
offset: int = 0,
include_children: bool = False,
project_compression_tips: bool = True,
order_by_last_active: bool = False,
) -> List[Dict[str, Any]]:
"""List sessions with preview (first user message) and last active timestamp.
@@ -952,6 +976,14 @@ class SessionDB:
compressed continuations from being invisible to users while keeping
delegate subagents and branches hidden. Pass ``False`` to return the
raw root rows (useful for admin/debug UIs).
Pass ``order_by_last_active=True`` to sort by most-recent activity
instead of original conversation start time. For compression chains,
the "most-recent activity" is taken from the live tip (not the root),
so an old conversation that was compressed and continued recently
surfaces in the correct slot. Ordering is computed at SQL level via
a recursive CTE that walks compression-continuation edges, so LIMIT
and OFFSET still apply efficiently.
"""
where_clauses = []
params = []
@@ -979,25 +1011,80 @@ class SessionDB:
params.extend(exclude_sources)
where_sql = f"WHERE {' AND '.join(where_clauses)}" if where_clauses else ""
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{where_sql}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params.extend([limit, offset])
if order_by_last_active:
# Compute effective_last_active by walking each surfaced session's
# compression-continuation chain forward in SQL and taking the MAX
# timestamp across the chain. This lets us ORDER BY + LIMIT at SQL
# level instead of fetching every row and sorting in Python, while
# still surfacing old compression roots whose live tip is fresh.
#
# The CTE seeds from rows the outer WHERE admits (roots + branch
# children), then recursively joins forward through
# compression-continuation edges using the same criteria as
# get_compression_tip (parent.end_reason='compression' AND
# child.started_at >= parent.ended_at).
query = f"""
WITH RECURSIVE chain(root_id, cur_id) AS (
SELECT s.id, s.id FROM sessions s {where_sql}
UNION ALL
SELECT c.root_id, child.id
FROM chain c
JOIN sessions parent ON parent.id = c.cur_id
JOIN sessions child ON child.parent_session_id = c.cur_id
WHERE parent.end_reason = 'compression'
AND child.started_at >= parent.ended_at
),
chain_max AS (
SELECT
root_id,
MAX(COALESCE(
(SELECT MAX(m.timestamp) FROM messages m WHERE m.session_id = cur_id),
(SELECT started_at FROM sessions ss WHERE ss.id = cur_id)
)) AS effective_last_active
FROM chain
GROUP BY root_id
)
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active,
COALESCE(cm.effective_last_active, s.started_at) AS _effective_last_active
FROM sessions s
LEFT JOIN chain_max cm ON cm.root_id = s.id
{where_sql}
ORDER BY _effective_last_active DESC, s.started_at DESC, s.id DESC
LIMIT ? OFFSET ?
"""
# WHERE params apply twice (CTE seed + outer select).
params = params + params + [limit, offset]
else:
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{where_sql}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params.extend([limit, offset])
with self._lock:
cursor = self._conn.execute(query, params)
rows = cursor.fetchall()
@@ -1011,6 +1098,8 @@ class SessionDB:
s["preview"] = text + ("..." if len(raw) > 60 else "")
else:
s["preview"] = ""
# Drop the internal ordering column so callers see a clean dict.
s.pop("_effective_last_active", None)
sessions.append(s)
# Project compression roots forward to their tips. Each row whose
@@ -1088,6 +1177,48 @@ class SessionDB:
# Message storage
# =========================================================================
# Sentinel prefix used to distinguish JSON-encoded structured content
# (multimodal messages: lists of parts like text + image_url) from plain
# string content. The NUL byte is not legal in normal text, so this
# cannot collide with real user content.
_CONTENT_JSON_PREFIX = "\x00json:"
@classmethod
def _encode_content(cls, content: Any) -> Any:
"""Serialize structured (list/dict) message content for sqlite.
sqlite3 can only bind ``str``, ``bytes``, ``int``, ``float``, and ``None``
to query parameters. Multimodal messages have ``content`` as a list of
parts (``[{"type": "text", ...}, {"type": "image_url", ...}]``), which
raises ``ProgrammingError: Error binding parameter N: type 'list' is
not supported`` when bound directly.
Returns the value unchanged when it's already a safe scalar, or a
sentinel-prefixed JSON string for lists/dicts. Paired with
:meth:`_decode_content` on read.
"""
if content is None or isinstance(content, (str, bytes, int, float)):
return content
try:
return cls._CONTENT_JSON_PREFIX + json.dumps(content)
except (TypeError, ValueError):
# Last-resort fallback: stringify so persistence never fails.
return str(content)
@classmethod
def _decode_content(cls, content: Any) -> Any:
"""Reverse :meth:`_encode_content`; returns scalars unchanged."""
if isinstance(content, str) and content.startswith(cls._CONTENT_JSON_PREFIX):
try:
return json.loads(content[len(cls._CONTENT_JSON_PREFIX):])
except (json.JSONDecodeError, TypeError):
logger.warning(
"Failed to decode JSON-encoded message content; "
"returning raw string"
)
return content
return content
def append_message(
self,
session_id: str,
@@ -1124,6 +1255,9 @@ class SessionDB:
if codex_message_items else None
)
tool_calls_json = json.dumps(tool_calls) if tool_calls else None
# Multimodal content (list of parts) must be JSON-encoded: sqlite3
# cannot bind list/dict parameters directly.
stored_content = self._encode_content(content)
# Pre-compute tool call count
num_tool_calls = 0
@@ -1140,7 +1274,7 @@ class SessionDB:
(
session_id,
role,
content,
stored_content,
tool_call_id,
tool_calls_json,
tool_name,
@@ -1223,7 +1357,7 @@ class SessionDB:
(
session_id,
role,
msg.get("content"),
self._encode_content(msg.get("content")),
msg.get("tool_call_id"),
tool_calls_json,
msg.get("tool_name"),
@@ -1262,6 +1396,8 @@ class SessionDB:
result = []
for row in rows:
msg = dict(row)
if "content" in msg:
msg["content"] = self._decode_content(msg["content"])
if msg.get("tool_calls"):
try:
msg["tool_calls"] = json.loads(msg["tool_calls"])
@@ -1351,15 +1487,15 @@ class SessionDB:
placeholders = ",".join("?" for _ in session_ids)
rows = self._conn.execute(
"SELECT role, content, tool_call_id, tool_calls, tool_name, "
"reasoning, reasoning_content, reasoning_details, codex_reasoning_items, "
"codex_message_items "
"finish_reason, reasoning, reasoning_content, reasoning_details, "
"codex_reasoning_items, codex_message_items "
f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
tuple(session_ids),
).fetchall()
messages = []
for row in rows:
content = row["content"]
content = self._decode_content(row["content"])
if row["role"] in {"user", "assistant"} and isinstance(content, str):
content = sanitize_context(content).strip()
msg = {"role": row["role"], "content": content}
@@ -1377,6 +1513,8 @@ class SessionDB:
# that replay reasoning (OpenRouter, OpenAI, Nous) receive
# coherent multi-turn reasoning context.
if row["role"] == "assistant":
if row["finish_reason"]:
msg["finish_reason"] = row["finish_reason"]
if row["reasoning"]:
msg["reasoning"] = row["reasoning"]
if row["reasoning_content"] is not None:
@@ -1744,10 +1882,26 @@ class SessionDB:
)""",
(match["id"], match["id"]),
)
context_msgs = [
{"role": r["role"], "content": (r["content"] or "")[:200]}
for r in ctx_cursor.fetchall()
]
context_msgs = []
for r in ctx_cursor.fetchall():
raw = r["content"]
decoded = self._decode_content(raw)
# Multimodal context: render a compact text-only
# summary for search previews.
if isinstance(decoded, list):
text_parts = [
p.get("text", "") for p in decoded
if isinstance(p, dict) and p.get("type") == "text"
]
text = " ".join(t for t in text_parts if t).strip()
preview = text or "[multimodal content]"
elif isinstance(decoded, str):
preview = decoded
else:
preview = ""
context_msgs.append(
{"role": r["role"], "content": preview[:200]}
)
match["context"] = context_msgs
except Exception:
match["context"] = []
+7 -6
View File
@@ -356,12 +356,17 @@ def _compute_tool_definitions(
else:
if not quiet_mode:
print(f"⚠️ Unknown toolset: {toolset_name}")
elif disabled_toolsets:
else:
# Default: start with everything
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Always apply disabled toolsets as a subtraction step at the end.
# This ensures that even if a composite toolset (like hermes-cli)
# is enabled, any tools belonging to a disabled toolset are strictly
# stripped out. See issue #17309.
if disabled_toolsets:
for toolset_name in disabled_toolsets:
if validate_toolset(toolset_name):
resolved = resolve_toolset(toolset_name)
@@ -376,10 +381,6 @@ def _compute_tool_definitions(
else:
if not quiet_mode:
print(f"⚠️ Unknown toolset: {toolset_name}")
else:
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Plugin-registered tools are now resolved through the normal toolset
# path — validate_toolset() / resolve_toolset() / get_all_toolsets()
+1 -1
View File
@@ -163,7 +163,7 @@
for entry in "''${ENTRIES[@]}"; do
IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --rebuild --print-build-logs 2>&1)
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
echo " ok"
+1 -1
View File
@@ -4,7 +4,7 @@ let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-Chz+NW9NXqboXHOa6PKwf5bhAkkcFtKNhvKWwg2XSPc=";
hash = "sha256-a/HGI9OgVcTnZrMXA7xFMGnFoVxyHe95fulVz+WNYB0=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -2960,7 +2960,7 @@ class Migrator:
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Migrate OpenClaw user state into Hermes Agent.")
parser.add_argument("--source", default=str(Path.home() / ".openclaw"), help="OpenClaw home directory")
parser.add_argument("--target", default=str(Path.home() / ".hermes"), help="Hermes home directory")
parser.add_argument("--target", default=os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes"), help="Hermes home directory")
parser.add_argument(
"--workspace-target",
help="Optional workspace root where the workspace instructions file should be copied",
@@ -0,0 +1,217 @@
---
name: here.now
description: Publish static sites to {slug}.here.now and store private files in cloud Drives for agent-to-agent handoff.
version: 1.15.3
author: here.now
license: MIT
prerequisites:
commands: [curl, file, jq]
platforms: [macos, linux]
metadata:
hermes:
tags: [here.now, herenow, publish, deploy, hosting, static-site, web, share, URL, drive, storage]
homepage: https://here.now
requires_toolsets: [terminal]
---
# here.now
here.now lets agents publish websites and store private files in cloud Drives.
Use here.now for two jobs:
- **Sites**: publish websites and files at `{slug}.here.now`.
- **Drives**: store private agent files in cloud folders.
## Current docs
**Before answering questions about here.now capabilities, features, or workflows, read the current docs:**
→ **https://here.now/docs**
Read the docs:
- at the first here.now-related interaction in a conversation
- any time the user asks how to do something
- any time the user asks what is possible, supported, or recommended
- before telling the user a feature is unsupported
Topics that require current docs (do not rely on local skill text alone):
- Drives and Drive sharing
- custom domains
- payments and payment gating
- forking
- proxy routes and service variables
- handles and links
- limits and quotas
- SPA routing
- error handling and remediation
- feature availability
**If docs and live API behavior disagree, trust the live API behavior.**
If the docs fetch fails or times out, continue with the local skill and live API/script output. Prefer live API behavior for active operations.
## Requirements
- Required binaries: `curl`, `file`, `jq`
- Optional environment variable: `$HERENOW_API_KEY`
- Optional Drive token variable: `$HERENOW_DRIVE_TOKEN`
- Optional credentials file: `~/.herenow/credentials`
- Skill helper paths:
- `${HERMES_SKILL_DIR}/scripts/publish.sh` for publishing sites
- `${HERMES_SKILL_DIR}/scripts/drive.sh` for private Drive storage
## Create a site
```bash
PUBLISH="${HERMES_SKILL_DIR}/scripts/publish.sh"
bash "$PUBLISH" {file-or-dir} --client hermes
```
Outputs the live URL (e.g. `https://bright-canvas-a7k2.here.now/`).
Under the hood this is a three-step flow: create/update -> upload files -> finalize. A site is not live until finalize succeeds.
Without an API key this creates an **anonymous site** that expires in 24 hours.
With a saved API key, the site is permanent.
**File structure:** For HTML sites, place `index.html` at the root of the directory you publish, not inside a subdirectory. The directory's contents become the site root. For example, publish `my-site/` where `my-site/index.html` exists — don't publish a parent folder that contains `my-site/`.
You can also publish raw files without any HTML. Single files get a rich auto-viewer (images, PDF, video, audio). Multiple files get an auto-generated directory listing with folder navigation and an image gallery.
## Update an existing site
```bash
PUBLISH="${HERMES_SKILL_DIR}/scripts/publish.sh"
bash "$PUBLISH" {file-or-dir} --slug {slug} --client hermes
```
The script auto-loads the `claimToken` from `.herenow/state.json` when updating anonymous sites. Pass `--claim-token {token}` to override.
Authenticated updates require a saved API key.
## Use a Drive
Use a Drive when the user wants private cloud storage for agent files: documents, context, memory, plans, assets, media, research, code, and anything else that should persist without being published as a website.
Every signed-in account has a default Drive named `My Drive`.
```bash
DRIVE="${HERMES_SKILL_DIR}/scripts/drive.sh"
bash "$DRIVE" default
bash "$DRIVE" ls "My Drive"
bash "$DRIVE" put "My Drive" notes/today.md --from ./notes/today.md
bash "$DRIVE" cat "My Drive" notes/today.md
bash "$DRIVE" share "My Drive" --perms write --prefix notes/ --ttl 7d
```
Use scoped Drive tokens for agent-to-agent handoff. If you receive a `herenow_drive` share block, use its `token` as `Authorization: Bearer <token>` against `api_base`, respect `pathPrefix` when present, and preserve ETags on writes. A `pathPrefix` of `null` means full-Drive access. If the skill is available, prefer `drive.sh`; otherwise call the listed API operations directly.
## API key storage
The publish script reads the API key from these sources (first match wins):
1. `--api-key {key}` flag (CI/scripting only — avoid in interactive use)
2. `$HERENOW_API_KEY` environment variable
3. `~/.herenow/credentials` file (recommended for agents)
To store a key, write it to the credentials file:
```bash
mkdir -p ~/.herenow && echo "{API_KEY}" > ~/.herenow/credentials && chmod 600 ~/.herenow/credentials
```
**IMPORTANT**: After receiving an API key, save it immediately — run the command above yourself. Do not ask the user to run it manually. Avoid passing the key via CLI flags (e.g. `--api-key`) in interactive sessions; the credentials file is the preferred storage method.
Never commit credentials or local state files (`~/.herenow/credentials`, `.herenow/state.json`) to source control.
## Getting an API key
To upgrade from anonymous (24h) to permanent sites:
1. Ask the user for their email address.
2. Request a one-time sign-in code:
```bash
curl -sS https://here.now/api/auth/agent/request-code \
-H "content-type: application/json" \
-d '{"email": "user@example.com"}'
```
3. Tell the user: "Check your inbox for a sign-in code from here.now and paste it here."
4. Verify the code and get the API key:
```bash
curl -sS https://here.now/api/auth/agent/verify-code \
-H "content-type: application/json" \
-d '{"email":"user@example.com","code":"ABCD-2345"}'
```
5. Save the returned `apiKey` yourself (do not ask the user to do this):
```bash
mkdir -p ~/.herenow && echo "{API_KEY}" > ~/.herenow/credentials && chmod 600 ~/.herenow/credentials
```
## State file
After every site create/update, the script writes to `.herenow/state.json` in the working directory:
```json
{
"publishes": {
"bright-canvas-a7k2": {
"siteUrl": "https://bright-canvas-a7k2.here.now/",
"claimToken": "abc123",
"claimUrl": "https://here.now/claim?slug=bright-canvas-a7k2&token=abc123",
"expiresAt": "2026-02-18T01:00:00.000Z"
}
}
}
```
Before creating or updating sites, you may check this file to find prior slugs.
Treat `.herenow/state.json` as internal cache only.
Never present this local file path as a URL, and never use it as source of truth for auth mode, expiry, or claim URL.
## What to tell the user
For published sites:
- Always share the `siteUrl` from the current script run.
- Read and follow `publish_result.*` lines from script stderr to determine auth mode.
- When `publish_result.auth_mode=authenticated`: tell the user the site is **permanent** and saved to their account. No claim URL is needed.
- When `publish_result.auth_mode=anonymous`: tell the user the site **expires in 24 hours**. Share the claim URL (if `publish_result.claim_url` is non-empty and starts with `https://`) so they can keep it permanently. Warn that claim tokens are only returned once and cannot be recovered.
- Never tell the user to inspect `.herenow/state.json` for claim URLs or auth status.
For Drives:
- Do not describe Drive files as public URLs.
- Tell the user Drive contents are private unless shared with a scoped token.
- When sharing access with another agent, prefer a scoped token with a narrow `pathPrefix` and short TTL.
## publish.sh options
| Flag | Description |
| ---------------------- | -------------------------------------------- |
| `--slug {slug}` | Update an existing site instead of creating |
| `--claim-token {token}`| Override claim token for anonymous updates |
| `--title {text}` | Viewer title (non-HTML sites) |
| `--description {text}` | Viewer description |
| `--ttl {seconds}` | Set expiry (authenticated only) |
| `--client {name}` | Agent name for attribution (e.g. `hermes`) |
| `--base-url {url}` | API base URL (default: `https://here.now`) |
| `--allow-nonherenow-base-url` | Allow sending auth to non-default `--base-url` |
| `--api-key {key}` | API key override (prefer credentials file) |
| `--spa` | Enable SPA routing (serve index.html for unknown paths) |
| `--forkable` | Allow others to fork this site |
## Beyond publish.sh
For Drive operations, use `drive.sh` or the Drive API. For broader account and site management — delete, metadata, passwords, payments, domains, handles, links, variables, proxy routes, forking, duplication, and more — see the current docs:
→ **https://here.now/docs**
Full docs: https://here.now/docs
+406
View File
@@ -0,0 +1,406 @@
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="https://here.now"
CREDENTIALS_FILE="$HOME/.herenow/credentials"
API_KEY="${HERENOW_API_KEY:-}"
DRIVE_TOKEN="${HERENOW_DRIVE_TOKEN:-}"
ALLOW_NON_HERENOW_BASE_URL=0
MAX_FILE_BYTES=$((500 * 1024 * 1024))
usage() {
cat <<'USAGE'
Usage: drive.sh [global options] <command> [args]
Global options:
--api-key <key> Account API key (or $HERENOW_API_KEY / ~/.herenow/credentials)
--token <drv_live_...> Drive token (or $HERENOW_DRIVE_TOKEN)
--base-url <url> API base (default: https://here.now)
--allow-nonherenow-base-url
Commands:
create [name] [--default]
default
ls
ls <drive> [prefix]
cat <drive> <path>
put <drive> <path> --from <local-file>
import <drive> <prefix> --from <local-folder> [--dry-run]
export <drive> <prefix> --to <local-folder> [--dry-run]
rm <drive> <path> [--recursive --confirm <path>]
share <drive> --perms read|write [--prefix notes/] [--ttl 30d] [--label text] [--manage-tokens]
tokens <drive>
revoke <drive> <tokenId>
delete <drive> --confirm "<drive name>"
USAGE
exit 1
}
die() { echo "error: $1" >&2; exit 1; }
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
BUNDLED_JQ="${SKILL_DIR}/bin/jq"
if [[ -x "$BUNDLED_JQ" ]]; then
JQ_BIN="$BUNDLED_JQ"
elif command -v jq >/dev/null 2>&1; then
JQ_BIN="$(command -v jq)"
else
die "requires jq"
fi
for cmd in curl file; do
command -v "$cmd" >/dev/null 2>&1 || die "requires $cmd"
done
while [[ $# -gt 0 ]]; do
case "$1" in
--api-key) API_KEY="$2"; shift 2 ;;
--token) DRIVE_TOKEN="$2"; shift 2 ;;
--base-url) BASE_URL="$2"; shift 2 ;;
--allow-nonherenow-base-url) ALLOW_NON_HERENOW_BASE_URL=1; shift ;;
--help|-h) usage ;;
--*) die "unknown global option: $1" ;;
*) break ;;
esac
done
CMD="${1:-}"
[[ -n "$CMD" ]] || usage
shift || true
if [[ -z "$API_KEY" && -z "$DRIVE_TOKEN" && -f "$CREDENTIALS_FILE" ]]; then
API_KEY=$(tr -d '[:space:]' < "$CREDENTIALS_FILE")
fi
BASE_URL="${BASE_URL%/}"
if [[ "$BASE_URL" != "https://here.now" && "$ALLOW_NON_HERENOW_BASE_URL" -ne 1 ]]; then
if [[ -n "$API_KEY" || -n "$DRIVE_TOKEN" ]]; then
die "refusing to send credentials to non-default base URL; pass --allow-nonherenow-base-url to override"
fi
fi
auth_header=()
if [[ -n "$DRIVE_TOKEN" ]]; then
auth_header=(-H "authorization: Bearer $DRIVE_TOKEN")
elif [[ -n "$API_KEY" ]]; then
auth_header=(-H "authorization: Bearer $API_KEY")
else
die "missing credentials; set HERENOW_API_KEY, HERENOW_DRIVE_TOKEN, or ~/.herenow/credentials"
fi
compute_sha256() {
local f="$1"
if command -v sha256sum >/dev/null 2>&1; then
sha256sum "$f" | cut -d' ' -f1
else
shasum -a 256 "$f" | cut -d' ' -f1
fi
}
guess_content_type() {
local f="$1"
case "${f##*.}" in
html|htm) echo "text/html; charset=utf-8" ;;
css) echo "text/css; charset=utf-8" ;;
js|mjs) echo "text/javascript; charset=utf-8" ;;
json) echo "application/json; charset=utf-8" ;;
md|txt) echo "text/plain; charset=utf-8" ;;
svg) echo "image/svg+xml" ;;
png) echo "image/png" ;;
jpg|jpeg) echo "image/jpeg" ;;
gif) echo "image/gif" ;;
webp) echo "image/webp" ;;
pdf) echo "application/pdf" ;;
*) file --brief --mime-type "$f" 2>/dev/null || echo "application/octet-stream" ;;
esac
}
api_json() {
local method="$1"; shift
local url="$1"; shift
local body="${1:-}"
local tmp
tmp=$(mktemp)
local code
if [[ -n "$body" ]]; then
code=$(curl -sS -o "$tmp" -w "%{http_code}" -X "$method" "$url" "${auth_header[@]}" -H "content-type: application/json" -d "$body")
else
code=$(curl -sS -o "$tmp" -w "%{http_code}" -X "$method" "$url" "${auth_header[@]}")
fi
if [[ "$code" -lt 200 || "$code" -ge 300 ]]; then
local err
err=$("$JQ_BIN" -r '.error // empty' "$tmp" 2>/dev/null || true)
[[ -n "$err" ]] || err="$(cat "$tmp")"
rm -f "$tmp"
die "HTTP $code: $err"
fi
cat "$tmp"
rm -f "$tmp"
}
urlenc() {
"$JQ_BIN" -nr --arg v "$1" '$v|@uri'
}
urlenc_path() {
local path="$1"
local out=""
local part
IFS='/' read -r -a parts <<< "$path"
for part in "${parts[@]}"; do
[[ -n "$out" ]] && out="$out/"
out="$out$(urlenc "$part")"
done
echo "$out"
}
resolve_drive() {
local name="$1"
if [[ "$name" == drv_* ]]; then
echo "$name"
return
fi
if [[ -n "$DRIVE_TOKEN" ]]; then
die "drive tokens must reference drives by drv_ id; use account credentials to resolve drive names"
fi
if [[ "$name" == "default" || "$name" == "my-drive" || "$name" == "My Drive" ]]; then
api_json GET "$BASE_URL/api/v1/drives/default" | "$JQ_BIN" -r '.drive.id'
return
fi
local rows count
rows=$(api_json GET "$BASE_URL/api/v1/drives" | "$JQ_BIN" --arg n "$name" '[.drives[] | select(.name == $n)]')
count=$(echo "$rows" | "$JQ_BIN" 'length')
[[ "$count" -eq 1 ]] || die "drive name '$name' matched $count drives; use a drv_ id"
echo "$rows" | "$JQ_BIN" -r '.[0].id'
}
drive_head() {
local id="$1"
api_json GET "$BASE_URL/api/v1/drives/$id" | "$JQ_BIN" -r '.drive.headVersionId // .headVersionId // empty'
}
file_meta() {
local id="$1"
local path="$2"
local prefix
prefix=$(urlenc "$path")
api_json GET "$BASE_URL/api/v1/drives/$id/files?prefix=$prefix&limit=200" | "$JQ_BIN" -c --arg p "$path" '.files[]? | select(.path == $p)' | head -n 1
}
put_file() {
local drive="$1"; shift
local path="$1"; shift
local local_file=""
while [[ $# -gt 0 ]]; do
case "$1" in
--from) local_file="$2"; shift 2 ;;
*) die "unexpected put argument: $1" ;;
esac
done
[[ -f "$local_file" ]] || die "--from must be a file"
local id sz ct sha meta body upload upload_url upload_id http_code
id=$(resolve_drive "$drive")
sz=$(wc -c < "$local_file" | tr -d ' ')
[[ "$sz" -le "$MAX_FILE_BYTES" ]] || die "$path exceeds the $MAX_FILE_BYTES byte Drive file limit"
ct=$(guess_content_type "$local_file")
sha=$(compute_sha256 "$local_file")
meta=$(file_meta "$id" "$path" || true)
body=$("$JQ_BIN" -n --arg p "$path" --argjson s "$sz" --arg c "$ct" --arg sha "$sha" \
'{path:$p,size:$s,contentType:$c,sha256:$sha}')
if [[ -n "$meta" ]]; then
etag=$(echo "$meta" | "$JQ_BIN" -r '.etag')
body=$(echo "$body" | "$JQ_BIN" --arg e "$etag" '.ifMatch = $e')
else
body=$(echo "$body" | "$JQ_BIN" '.ifNoneMatch = "*"')
fi
upload=$(api_json POST "$BASE_URL/api/v1/drives/$id/files/uploads" "$body")
upload_url=$(echo "$upload" | "$JQ_BIN" -r '.uploadUrl')
upload_id=$(echo "$upload" | "$JQ_BIN" -r '.uploadId')
http_code=$(curl -sS -o /dev/null -w "%{http_code}" -X PUT "$upload_url" -H "Content-Type: $ct" --data-binary "@$local_file")
[[ "$http_code" -ge 200 && "$http_code" -lt 300 ]] || die "upload failed for $path (HTTP $http_code)"
api_json POST "$BASE_URL/api/v1/drives/$id/files/finalize" "$("$JQ_BIN" -n --arg u "$upload_id" '{uploadId:$u}')" | "$JQ_BIN" .
}
case "$CMD" in
create)
name=""
is_default="false"
while [[ $# -gt 0 ]]; do
case "$1" in
--default) is_default="true"; shift ;;
*) [[ -z "$name" ]] && name="$1" || die "unexpected argument: $1"; shift ;;
esac
done
body=$("$JQ_BIN" -n --arg n "$name" --argjson d "$is_default" '{isDefault:$d} + (if $n == "" then {} else {name:$n} end)')
api_json POST "$BASE_URL/api/v1/drives" "$body" | "$JQ_BIN" .
;;
default)
api_json GET "$BASE_URL/api/v1/drives/default" | "$JQ_BIN" .
;;
ls)
if [[ $# -eq 0 ]]; then
[[ -z "$DRIVE_TOKEN" ]] || die "drive tokens cannot list drives; pass a drv_ id"
api_json GET "$BASE_URL/api/v1/drives" | "$JQ_BIN" .
else
id=$(resolve_drive "$1")
prefix="${2:-}"
api_json GET "$BASE_URL/api/v1/drives/$id/files?prefix=$(urlenc "$prefix")" | "$JQ_BIN" .
fi
;;
cat)
[[ $# -eq 2 ]] || die "usage: drive.sh cat <drive> <path>"
id=$(resolve_drive "$1")
curl -fsS "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$2")" "${auth_header[@]}"
;;
put)
[[ $# -ge 2 ]] || die "usage: drive.sh put <drive> <path> --from <local-file>"
put_file "$@"
;;
import)
[[ $# -ge 2 ]] || die "usage: drive.sh import <drive> <prefix> --from <local-folder> [--dry-run]"
drive="$1"; prefix="${2%/}"; shift 2
from=""; dry=0
while [[ $# -gt 0 ]]; do
case "$1" in
--from) from="$2"; shift 2 ;;
--dry-run) dry=1; shift ;;
*) die "unexpected import argument: $1" ;;
esac
done
[[ -d "$from" ]] || die "--from must be a folder"
uploaded=0
skipped=0
failed=0
planned=0
while IFS= read -r -d '' f; do
rel="${f#$from/}"
[[ "$rel" == .git/* || "$rel" == node_modules/* || "$rel" == ".DS_Store" || "$rel" == */.DS_Store ]] && continue
planned=$((planned + 1))
sz=$(wc -c < "$f" | tr -d ' ')
if [[ "$sz" -gt "$MAX_FILE_BYTES" ]]; then
echo "skip oversized $f ($sz bytes > $MAX_FILE_BYTES)" >&2
skipped=$((skipped + 1))
continue
fi
dest="$rel"
[[ -n "$prefix" ]] && dest="$prefix/$rel"
if [[ "$dry" -eq 1 ]]; then
echo "upload $f -> $dest"
skipped=$((skipped + 1))
else
if (put_file "$drive" "$dest" --from "$f" >/dev/null); then
uploaded=$((uploaded + 1))
else
failed=$((failed + 1))
fi
fi
done < <(find "$from" -type f -print0 | sort -z)
echo "planned=$planned uploaded=$uploaded skipped=$skipped failed=$failed"
[[ "$failed" -eq 0 ]] || exit 1
;;
export)
[[ $# -ge 2 ]] || die "usage: drive.sh export <drive> <prefix> --to <local-folder> [--dry-run]"
id=$(resolve_drive "$1"); prefix="${2%/}"; shift 2
to=""; dry=0
while [[ $# -gt 0 ]]; do
case "$1" in
--to) to="$2"; shift 2 ;;
--dry-run) dry=1; shift ;;
*) die "unexpected export argument: $1" ;;
esac
done
[[ -n "$to" ]] || die "--to is required"
cursor=""
total=0
while true; do
url="$BASE_URL/api/v1/drives/$id/files?prefix=$(urlenc "$prefix")&limit=200"
[[ -n "$cursor" ]] && url="$url&cursor=$(urlenc "$cursor")"
files=$(api_json GET "$url")
while IFS= read -r p; do
[[ -n "$p" ]] || continue
rel="$p"
[[ -n "$prefix" ]] && rel="${p#$prefix/}"
out="$to/$rel"
if [[ "$dry" -eq 1 ]]; then
echo "download $p -> $out"
else
mkdir -p "$(dirname "$out")"
curl -fsS "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$p")" "${auth_header[@]}" -o "$out"
fi
total=$((total + 1))
done < <(echo "$files" | "$JQ_BIN" -r '.files[].path')
cursor=$(echo "$files" | "$JQ_BIN" -r '.nextCursor // empty')
[[ -n "$cursor" ]] || break
done
echo "files=$total"
;;
rm)
[[ $# -ge 2 ]] || die "usage: drive.sh rm <drive> <path> [--recursive --confirm <path>]"
id=$(resolve_drive "$1"); path="$2"; shift 2
recursive=0; confirm=""
while [[ $# -gt 0 ]]; do
case "$1" in
--recursive) recursive=1; shift ;;
--confirm) confirm="$2"; shift 2 ;;
*) die "unexpected rm argument: $1" ;;
esac
done
if [[ "$recursive" -eq 1 ]]; then
[[ "$confirm" == "$path" ]] || die "recursive delete requires --confirm '$path'"
head=$(drive_head "$id")
api_json DELETE "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$path")?recursive=true&baseVersionId=$(urlenc "$head")" | "$JQ_BIN" .
else
meta=$(file_meta "$id" "$path")
etag=$(echo "$meta" | "$JQ_BIN" -r '.etag')
curl -fsS -X DELETE "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$path")" "${auth_header[@]}" -H "If-Match: $etag" | "$JQ_BIN" .
fi
;;
share)
[[ $# -ge 1 ]] || die "usage: drive.sh share <drive> --perms read|write [--prefix notes/] [--ttl 30d] [--label text] [--manage-tokens]"
id=$(resolve_drive "$1"); shift
perms="write"; prefix=""; ttl=""; label=""; manage_tokens="false"
while [[ $# -gt 0 ]]; do
case "$1" in
--perms) perms="$2"; shift 2 ;;
--prefix) prefix="$2"; shift 2 ;;
--ttl) ttl="$2"; shift 2 ;;
--label) label="$2"; shift 2 ;;
--manage-tokens) manage_tokens="true"; shift ;;
*) die "unexpected share argument: $1" ;;
esac
done
body=$("$JQ_BIN" -n --arg p "$perms" --arg pp "$prefix" --arg ttl "$ttl" --arg label "$label" --argjson mt "$manage_tokens" \
'{perms:$p} + (if $mt then {manageTokens:true} else {} end) + (if $ttl == "" then {} else {ttl:$ttl} end) + (if $pp == "" then {} else {pathPrefix:$pp} end) + (if $label == "" then {} else {label:$label} end)')
api_json POST "$BASE_URL/api/v1/drives/$id/tokens" "$body" | "$JQ_BIN" -r '.shareBlock'
;;
tokens)
[[ $# -eq 1 ]] || die "usage: drive.sh tokens <drive>"
id=$(resolve_drive "$1")
api_json GET "$BASE_URL/api/v1/drives/$id/tokens" | "$JQ_BIN" .
;;
revoke)
[[ $# -eq 2 ]] || die "usage: drive.sh revoke <drive> <tokenId>"
id=$(resolve_drive "$1")
api_json DELETE "$BASE_URL/api/v1/drives/$id/tokens/$2" | "$JQ_BIN" .
;;
delete)
[[ $# -ge 1 ]] || die "usage: drive.sh delete <drive> --confirm <drive name>"
id=$(resolve_drive "$1"); shift
confirm=""
while [[ $# -gt 0 ]]; do
case "$1" in
--confirm) confirm="$2"; shift 2 ;;
*) die "unexpected delete argument: $1" ;;
esac
done
drive=$(api_json GET "$BASE_URL/api/v1/drives/$id")
name=$(echo "$drive" | "$JQ_BIN" -r '.drive.name')
[[ "$confirm" == "$name" ]] || die "delete requires --confirm '$name'"
api_json DELETE "$BASE_URL/api/v1/drives/$id" | "$JQ_BIN" .
;;
*)
die "unknown command: $CMD"
;;
esac
+445
View File
@@ -0,0 +1,445 @@
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="https://here.now"
CREDENTIALS_FILE="$HOME/.herenow/credentials"
API_KEY="${HERENOW_API_KEY:-}"
API_KEY_SOURCE="none"
if [[ -n "${HERENOW_API_KEY:-}" ]]; then
API_KEY_SOURCE="env"
fi
ALLOW_NON_HERENOW_BASE_URL=0
SLUG=""
CLAIM_TOKEN=""
TITLE=""
DESCRIPTION=""
TTL=""
CLIENT=""
TARGET=""
FORKABLE=""
SPA_MODE=""
FROM_DRIVE=""
DRIVE_VERSION=""
usage() {
cat <<'USAGE'
Usage: publish.sh <file-or-dir> [options]
Options:
--api-key <key> API key (or set $HERENOW_API_KEY)
--slug <slug> Update existing publish
--claim-token <token> Claim token for anonymous updates
--title <text> Viewer title
--description <text> Viewer description
--ttl <seconds> Expiry (authenticated only)
--client <name> Agent name for attribution (e.g. cursor, claude-code)
--forkable Allow others to fork this site
--spa Enable SPA routing
--from-drive <drv_...> Publish a Drive snapshot instead of local files
--version <dv_...> Drive version for --from-drive (default: current head)
--base-url <url> API base (default: https://here.now)
--allow-nonherenow-base-url
Allow auth requests to non-default API base URL
USAGE
exit 1
}
die() { echo "error: $1" >&2; exit 1; }
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
BUNDLED_JQ="${SKILL_DIR}/bin/jq"
if [[ -x "$BUNDLED_JQ" ]]; then
JQ_BIN="$BUNDLED_JQ"
elif command -v jq >/dev/null 2>&1; then
JQ_BIN="$(command -v jq)"
else
die "requires jq"
fi
for cmd in curl file; do
command -v "$cmd" >/dev/null 2>&1 || die "requires $cmd"
done
while [[ $# -gt 0 ]]; do
case "$1" in
--api-key) API_KEY="$2"; API_KEY_SOURCE="flag"; shift 2 ;;
--slug) SLUG="$2"; shift 2 ;;
--claim-token) CLAIM_TOKEN="$2"; shift 2 ;;
--title) TITLE="$2"; shift 2 ;;
--description) DESCRIPTION="$2"; shift 2 ;;
--ttl) TTL="$2"; shift 2 ;;
--client) CLIENT="$2"; shift 2 ;;
--base-url) BASE_URL="$2"; shift 2 ;;
--allow-nonherenow-base-url) ALLOW_NON_HERENOW_BASE_URL=1; shift ;;
--forkable) FORKABLE="true"; shift ;;
--spa) SPA_MODE="true"; shift ;;
--from-drive) FROM_DRIVE="$2"; shift 2 ;;
--version) DRIVE_VERSION="$2"; shift 2 ;;
--help|-h) usage ;;
-*) die "unknown option: $1" ;;
*) [[ -z "$TARGET" ]] && TARGET="$1" || die "unexpected argument: $1"; shift ;;
esac
done
if [[ -n "$FROM_DRIVE" ]]; then
[[ -z "$TARGET" ]] || die "--from-drive does not accept a local file-or-dir argument"
else
[[ -n "$TARGET" ]] || usage
[[ -e "$TARGET" ]] || die "path does not exist: $TARGET"
fi
# Load API key from credentials file if not provided via flag or env
if [[ -z "$API_KEY" && -f "$CREDENTIALS_FILE" ]]; then
API_KEY=$(cat "$CREDENTIALS_FILE" | tr -d '[:space:]')
[[ -n "$API_KEY" ]] && API_KEY_SOURCE="credentials"
fi
BASE_URL="${BASE_URL%/}"
STATE_DIR=".herenow"
STATE_FILE="$STATE_DIR/state.json"
# Safety guard: avoid accidentally sending bearer auth to arbitrary endpoints.
if [[ -n "$API_KEY" && "$BASE_URL" != "https://here.now" && "$ALLOW_NON_HERENOW_BASE_URL" -ne 1 ]]; then
die "refusing to send API key to non-default base URL; pass --allow-nonherenow-base-url to override"
fi
# Auto-load claim token from state file for anonymous updates
if [[ -n "$SLUG" && -z "$CLAIM_TOKEN" && -z "$API_KEY" && -f "$STATE_FILE" ]]; then
CLAIM_TOKEN=$("$JQ_BIN" -r --arg s "$SLUG" '.publishes[$s].claimToken // empty' "$STATE_FILE" 2>/dev/null || true)
fi
if [[ -n "$FROM_DRIVE" ]]; then
[[ -n "$API_KEY" ]] || die "--from-drive requires an account API key"
BODY=$("$JQ_BIN" -n --arg d "$FROM_DRIVE" '{driveId:$d}')
[[ -n "$DRIVE_VERSION" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" --arg v "$DRIVE_VERSION" '.versionId = $v')
[[ -n "$SLUG" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" --arg s "$SLUG" '.slug = $s')
if [[ -n "$TITLE" || -n "$DESCRIPTION" ]]; then
viewer="{}"
[[ -n "$TITLE" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg t "$TITLE" '.title = $t')
[[ -n "$DESCRIPTION" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg d "$DESCRIPTION" '.description = $d')
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson v "$viewer" '.viewer = $v')
fi
[[ "$FORKABLE" == "true" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" '.forkable = true')
[[ "$SPA_MODE" == "true" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" '.spaMode = true')
CLIENT_HEADER_VALUE="here-now-publish-sh"
if [[ -n "$CLIENT" ]]; then
normalized_client=$(echo "$CLIENT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9._-' '-')
normalized_client="${normalized_client#-}"
normalized_client="${normalized_client%-}"
if [[ -n "$normalized_client" ]]; then
CLIENT_HEADER_VALUE="${normalized_client}/publish-sh"
fi
fi
echo "publishing from Drive..." >&2
RESPONSE=$(curl -sS -X POST "$BASE_URL/api/v1/publish/from-drive" \
-H "authorization: Bearer $API_KEY" \
-H "x-herenow-client: $CLIENT_HEADER_VALUE" \
-H "content-type: application/json" \
-d "$BODY")
if echo "$RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$RESPONSE" | "$JQ_BIN" -r '.error')
die "$err"
fi
SITE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.siteUrl')
OUT_SLUG=$(echo "$RESPONSE" | "$JQ_BIN" -r '.slug')
CURRENT_VERSION=$(echo "$RESPONSE" | "$JQ_BIN" -r '.currentVersionId')
DRIVE_VERSION_OUT=$(echo "$RESPONSE" | "$JQ_BIN" -r '.driveVersionId')
echo "$SITE_URL"
echo "" >&2
echo "publish_result.site_url=$SITE_URL" >&2
echo "publish_result.slug=$OUT_SLUG" >&2
echo "publish_result.action=from_drive" >&2
echo "publish_result.auth_mode=authenticated" >&2
echo "publish_result.api_key_source=$API_KEY_SOURCE" >&2
echo "publish_result.persistence=permanent" >&2
echo "publish_result.drive_id=$FROM_DRIVE" >&2
echo "publish_result.drive_version_id=$DRIVE_VERSION_OUT" >&2
echo "publish_result.current_version_id=$CURRENT_VERSION" >&2
exit 0
fi
compute_sha256() {
local f="$1"
if command -v sha256sum >/dev/null 2>&1; then
sha256sum "$f" | cut -d' ' -f1
else
shasum -a 256 "$f" | cut -d' ' -f1
fi
}
guess_content_type() {
local f="$1"
case "${f##*.}" in
html|htm) echo "text/html; charset=utf-8" ;;
css) echo "text/css; charset=utf-8" ;;
js|mjs) echo "text/javascript; charset=utf-8" ;;
json) echo "application/json; charset=utf-8" ;;
md|txt) echo "text/plain; charset=utf-8" ;;
svg) echo "image/svg+xml" ;;
png) echo "image/png" ;;
jpg|jpeg) echo "image/jpeg" ;;
gif) echo "image/gif" ;;
webp) echo "image/webp" ;;
pdf) echo "application/pdf" ;;
mp4) echo "video/mp4" ;;
mov) echo "video/quicktime" ;;
mp3) echo "audio/mpeg" ;;
wav) echo "audio/wav" ;;
xml) echo "application/xml" ;;
woff2) echo "font/woff2" ;;
woff) echo "font/woff" ;;
ttf) echo "font/ttf" ;;
ico) echo "image/x-icon" ;;
*)
local detected
detected=$(file --brief --mime-type "$f" 2>/dev/null || echo "application/octet-stream")
echo "$detected"
;;
esac
}
# Build file manifest as JSON array
FILES_JSON="[]"
if [[ -f "$TARGET" ]]; then
sz=$(wc -c < "$TARGET" | tr -d ' ')
ct=$(guess_content_type "$TARGET")
bn=$(basename "$TARGET")
h=$(compute_sha256 "$TARGET")
FILES_JSON=$("$JQ_BIN" -n --arg p "$bn" --argjson s "$sz" --arg c "$ct" --arg h "$h" \
'[{"path":$p,"size":$s,"contentType":$c,"hash":$h}]')
FILE_MAP=$("$JQ_BIN" -n --arg p "$bn" --arg a "$(cd "$(dirname "$TARGET")" && pwd)/$(basename "$TARGET")" \
'{($p):$a}')
elif [[ -d "$TARGET" ]]; then
FILE_MAP="{}"
while IFS= read -r -d '' f; do
rel="${f#$TARGET/}"
[[ "$rel" == ".DS_Store" ]] && continue
[[ "$(basename "$rel")" == ".DS_Store" ]] && continue
[[ "$rel" == ".herenow/fork-meta.json" ]] && continue
sz=$(wc -c < "$f" | tr -d ' ')
ct=$(guess_content_type "$f")
h=$(compute_sha256 "$f")
abs=$(cd "$(dirname "$f")" && pwd)/$(basename "$f")
FILES_JSON=$(echo "$FILES_JSON" | "$JQ_BIN" --arg p "$rel" --argjson s "$sz" --arg c "$ct" --arg h "$h" \
'. + [{"path":$p,"size":$s,"contentType":$c,"hash":$h}]')
FILE_MAP=$(echo "$FILE_MAP" | "$JQ_BIN" --arg p "$rel" --arg a "$abs" '. + {($p):$a}')
done < <(find "$TARGET" -type f -print0 | sort -z)
else
die "not a file or directory: $TARGET"
fi
file_count=$(echo "$FILES_JSON" | "$JQ_BIN" 'length')
[[ "$file_count" -gt 0 ]] || die "no files found"
# Read fork-meta.json defaults if present and no explicit flags given
FORK_META=""
if [[ -d "$TARGET" ]]; then
FORK_META_PATH="$TARGET/.herenow/fork-meta.json"
if [[ -f "$FORK_META_PATH" ]]; then
FORK_META=$(cat "$FORK_META_PATH")
if [[ -z "$FORKABLE" ]]; then
FORKABLE=$("$JQ_BIN" -r '.forkable // empty' <<< "$FORK_META" 2>/dev/null || true)
fi
fi
fi
# Build request body
BODY=$(echo "$FILES_JSON" | "$JQ_BIN" '{files: .}')
if [[ -n "$TTL" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson t "$TTL" '.ttlSeconds = $t')
fi
if [[ -n "$TITLE" || -n "$DESCRIPTION" ]]; then
viewer="{}"
[[ -n "$TITLE" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg t "$TITLE" '.title = $t')
[[ -n "$DESCRIPTION" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg d "$DESCRIPTION" '.description = $d')
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson v "$viewer" '.viewer = $v')
fi
if [[ -n "$CLAIM_TOKEN" && -n "$SLUG" && -z "$API_KEY" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" --arg ct "$CLAIM_TOKEN" '.claimToken = $ct')
fi
if [[ "$FORKABLE" == "true" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" '.forkable = true')
fi
if [[ "$SPA_MODE" == "true" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" '.spaMode = true')
fi
# Determine endpoint and method
if [[ -n "$SLUG" ]]; then
URL="$BASE_URL/api/v1/publish/$SLUG"
METHOD="PUT"
else
URL="$BASE_URL/api/v1/publish"
METHOD="POST"
fi
# Build auth header
AUTH_ARGS=()
if [[ -n "$API_KEY" ]]; then
AUTH_ARGS=(-H "authorization: Bearer $API_KEY")
fi
AUTH_MODE="anonymous"
if [[ -n "$API_KEY" ]]; then
AUTH_MODE="authenticated"
fi
CLIENT_HEADER_VALUE="here-now-publish-sh"
if [[ -n "$CLIENT" ]]; then
normalized_client=$(echo "$CLIENT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9._-' '-')
normalized_client="${normalized_client#-}"
normalized_client="${normalized_client%-}"
if [[ -n "$normalized_client" ]]; then
CLIENT_HEADER_VALUE="${normalized_client}/publish-sh"
fi
fi
CLIENT_ARGS=(-H "x-herenow-client: $CLIENT_HEADER_VALUE")
# Step 1: Create/update publish
echo "creating publish ($file_count files)..." >&2
RESPONSE=$(curl -sS -X "$METHOD" "$URL" \
"${AUTH_ARGS[@]+"${AUTH_ARGS[@]}"}" \
"${CLIENT_ARGS[@]+"${CLIENT_ARGS[@]}"}" \
-H "content-type: application/json" \
-d "$BODY")
# Check for errors
if echo "$RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$RESPONSE" | "$JQ_BIN" -r '.error')
details=$(echo "$RESPONSE" | "$JQ_BIN" -r '.details // empty')
die "$err${details:+ ($details)}"
fi
OUT_SLUG=$(echo "$RESPONSE" | "$JQ_BIN" -r '.slug')
VERSION_ID=$(echo "$RESPONSE" | "$JQ_BIN" -r '.upload.versionId')
FINALIZE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.upload.finalizeUrl')
SITE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.siteUrl')
UPLOAD_COUNT=$(echo "$RESPONSE" | "$JQ_BIN" '.upload.uploads | length')
SKIPPED_COUNT=$(echo "$RESPONSE" | "$JQ_BIN" '.upload.skipped // [] | length')
[[ "$OUT_SLUG" != "null" ]] || die "unexpected response: $RESPONSE"
# Step 2: Upload files (skipped files are unchanged from previous version)
if [[ "$SKIPPED_COUNT" -gt 0 ]]; then
echo "uploading $UPLOAD_COUNT files ($SKIPPED_COUNT unchanged, skipped)..." >&2
else
echo "uploading $UPLOAD_COUNT files..." >&2
fi
upload_errors=0
for i in $(seq 0 $((UPLOAD_COUNT - 1))); do
upload_path=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].path")
upload_url=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].url")
upload_ct=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].headers[\"Content-Type\"] // empty")
if [[ -f "$TARGET" && ! -d "$TARGET" ]]; then
local_file="$TARGET"
else
local_file=$(echo "$FILE_MAP" | "$JQ_BIN" -r --arg p "$upload_path" '.[$p]')
fi
if [[ ! -f "$local_file" ]]; then
echo "warning: missing local file for $upload_path" >&2
upload_errors=$((upload_errors + 1))
continue
fi
ct_args=()
[[ -n "$upload_ct" ]] && ct_args=(-H "Content-Type: $upload_ct")
http_code=$(curl -sS -o /dev/null -w "%{http_code}" -X PUT "$upload_url" \
"${ct_args[@]+"${ct_args[@]}"}" \
--data-binary "@$local_file")
if [[ "$http_code" -lt 200 || "$http_code" -ge 300 ]]; then
echo "warning: upload failed for $upload_path (HTTP $http_code)" >&2
upload_errors=$((upload_errors + 1))
fi
done
[[ "$upload_errors" -eq 0 ]] || die "$upload_errors file(s) failed to upload"
# Step 3: Finalize
echo "finalizing..." >&2
FIN_RESPONSE=$(curl -sS -X POST "$FINALIZE_URL" \
"${AUTH_ARGS[@]+"${AUTH_ARGS[@]}"}" \
"${CLIENT_ARGS[@]+"${CLIENT_ARGS[@]}"}" \
-H "content-type: application/json" \
-d "{\"versionId\":\"$VERSION_ID\"}")
if echo "$FIN_RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$FIN_RESPONSE" | "$JQ_BIN" -r '.error')
die "finalize failed: $err"
fi
# Save state
mkdir -p "$STATE_DIR"
if [[ -f "$STATE_FILE" ]]; then
STATE=$(cat "$STATE_FILE")
else
STATE='{"publishes":{}}'
fi
entry=$("$JQ_BIN" -n --arg s "$SITE_URL" '{siteUrl: $s}')
RESPONSE_CLAIM_TOKEN=$(echo "$RESPONSE" | "$JQ_BIN" -r '.claimToken // empty')
RESPONSE_CLAIM_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.claimUrl // empty')
RESPONSE_EXPIRES=$(echo "$RESPONSE" | "$JQ_BIN" -r '.expiresAt // empty')
[[ -n "$RESPONSE_CLAIM_TOKEN" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_CLAIM_TOKEN" '.claimToken = $v')
[[ -n "$RESPONSE_CLAIM_URL" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_CLAIM_URL" '.claimUrl = $v')
[[ -n "$RESPONSE_EXPIRES" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_EXPIRES" '.expiresAt = $v')
STATE=$(echo "$STATE" | "$JQ_BIN" --arg slug "$OUT_SLUG" --argjson e "$entry" '.publishes[$slug] = $e')
echo "$STATE" | "$JQ_BIN" '.' > "$STATE_FILE"
# Output
echo "$SITE_URL"
PERSISTENCE="permanent"
if [[ "$AUTH_MODE" == "anonymous" ]]; then
PERSISTENCE="expires_24h"
elif [[ -n "$RESPONSE_EXPIRES" ]]; then
PERSISTENCE="expires_at"
fi
SAFE_CLAIM_URL=""
if [[ -n "$RESPONSE_CLAIM_URL" && "$RESPONSE_CLAIM_URL" == https://* ]]; then
SAFE_CLAIM_URL="$RESPONSE_CLAIM_URL"
fi
ACTION="create"
if [[ -n "$SLUG" ]]; then
ACTION="update"
fi
echo "" >&2
echo "publish_result.site_url=$SITE_URL" >&2
echo "publish_result.slug=$OUT_SLUG" >&2
echo "publish_result.action=$ACTION" >&2
echo "publish_result.auth_mode=$AUTH_MODE" >&2
echo "publish_result.api_key_source=$API_KEY_SOURCE" >&2
echo "publish_result.persistence=$PERSISTENCE" >&2
echo "publish_result.expires_at=$RESPONSE_EXPIRES" >&2
echo "publish_result.claim_url=$SAFE_CLAIM_URL" >&2
if [[ "$AUTH_MODE" == "authenticated" ]]; then
echo "authenticated publish (permanent, saved to your account)" >&2
else
echo "anonymous publish (expires in 24h)" >&2
if [[ -n "$SAFE_CLAIM_URL" ]]; then
echo "claim URL: $SAFE_CLAIM_URL" >&2
fi
if [[ -n "$RESPONSE_CLAIM_TOKEN" ]]; then
echo "claim token saved to $STATE_FILE" >&2
fi
fi
@@ -0,0 +1,372 @@
---
name: shopify
description: Shopify Admin & Storefront GraphQL APIs via curl. Products, orders, customers, inventory, metafields.
version: 1.0.0
author: community
license: MIT
prerequisites:
env_vars: [SHOPIFY_ACCESS_TOKEN, SHOPIFY_STORE_DOMAIN]
commands: [curl, jq]
required_environment_variables:
- name: SHOPIFY_ACCESS_TOKEN
prompt: Shopify Admin API access token (starts with shpat_)
help: "Shopify admin → Settings → Apps and sales channels → Develop apps → Create an app → API credentials. Token shown ONCE on install."
- name: SHOPIFY_STORE_DOMAIN
prompt: Your shop subdomain without protocol (e.g. my-store.myshopify.com)
help: "The permanent myshopify.com domain, not your custom domain."
- name: SHOPIFY_API_VERSION
prompt: Shopify API version (default 2026-01)
help: "Stable quarterly version. Override if you need an older one."
metadata:
hermes:
tags: [Shopify, E-commerce, Commerce, API, GraphQL]
related_skills: [airtable, xurl]
homepage: https://shopify.dev/docs/api/admin-graphql
---
# Shopify — Admin & Storefront GraphQL APIs
Work with Shopify stores directly through `curl`: list products, manage inventory, pull orders, update customers, read metafields. No SDK, no app framework — just the GraphQL endpoint and a custom-app access token.
The REST Admin API is legacy since 2024-04 and only receives security fixes. **Use GraphQL Admin** for all admin work. Use **Storefront GraphQL** for read-only customer-facing queries (products, collections, cart).
## Prerequisites
1. In Shopify admin: **Settings → Apps and sales channels → Develop apps → Create an app**.
2. Click **Configure Admin API scopes**, select what you need (examples below), save.
3. **Install app** → the Admin API access token appears ONCE. Copy it immediately — Shopify will never show it again. Tokens start with `shpat_`.
4. Save to `~/.hermes/.env`:
```
SHOPIFY_ACCESS_TOKEN=shpat_xxxxxxxxxxxxxxxxxxxx
SHOPIFY_STORE_DOMAIN=my-store.myshopify.com
SHOPIFY_API_VERSION=2026-01
```
> **Heads up:** As of January 1, 2026, new "legacy custom apps" created in the Shopify admin are gone. New setups should use the **Dev Dashboard** (`shopify.dev/docs/apps/build/dev-dashboard`). Existing admin-created apps keep working. If the user's shop has no existing custom app and it's after 2026-01-01, direct them to Dev Dashboard instead of the admin flow.
Common scopes by task:
- Products / collections: `read_products`, `write_products`
- Inventory: `read_inventory`, `write_inventory`, `read_locations`
- Orders: `read_orders`, `write_orders` (30 most recent without `read_all_orders`)
- Customers: `read_customers`, `write_customers`
- Draft orders: `read_draft_orders`, `write_draft_orders`
- Fulfillments: `read_fulfillments`, `write_fulfillments`
- Metafields / metaobjects: covered by the matching resource scopes
## API Basics
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/admin/api/$SHOPIFY_API_VERSION/graphql.json`
- **Auth header:** `X-Shopify-Access-Token: $SHOPIFY_ACCESS_TOKEN` (NOT `Authorization: Bearer`)
- **Method:** always `POST`, always `Content-Type: application/json`, body is `{"query": "...", "variables": {...}}`
- **HTTP 200 does not mean success.** GraphQL returns errors in a top-level `errors` array and per-field `userErrors`. Always check both.
- **IDs are GID strings:** `gid://shopify/Product/10079467700516`, `gid://shopify/Variant/...`, `gid://shopify/Order/...`. Pass these verbatim — don't strip the prefix.
- **Rate limit:** calculated via query cost (leaky bucket). Each response has `extensions.cost` with `requestedQueryCost`, `actualQueryCost`, `throttleStatus.{currentlyAvailable, maximumAvailable, restoreRate}`. Back off when `currentlyAvailable` drops below your next query's cost. Standard shops = 100 points bucket, 50/s restore; Plus = 1000/100.
Base curl pattern (reusable):
```bash
shop_gql() {
local query="$1"
local variables="${2:-{}}"
curl -sS -X POST \
"https://${SHOPIFY_STORE_DOMAIN}/admin/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
-H "Content-Type: application/json" \
-H "X-Shopify-Access-Token: ${SHOPIFY_ACCESS_TOKEN}" \
--data "$(jq -nc --arg q "$query" --argjson v "$variables" '{query: $q, variables: $v}')"
}
```
Pipe through `jq` for readable output. `-sS` keeps errors visible but hides the progress bar.
## Discovery
### Shop info + current API version
```bash
shop_gql '{ shop { name myshopifyDomain primaryDomain { url } currencyCode plan { displayName } } }' | jq
```
### List all supported API versions
```bash
shop_gql '{ publicApiVersions { handle supported } }' | jq '.data.publicApiVersions[] | select(.supported)'
```
## Products
### Search products (first 20 matching query)
```bash
shop_gql '
query($q: String!) {
products(first: 20, query: $q) {
edges { node { id title handle status totalInventory variants(first: 5) { edges { node { id sku price inventoryQuantity } } } } }
pageInfo { hasNextPage endCursor }
}
}' '{"q":"hoodie status:active"}' | jq
```
Query syntax supports `title:`, `sku:`, `vendor:`, `product_type:`, `status:active`, `tag:`, `created_at:>2025-01-01`. Full grammar: https://shopify.dev/docs/api/usage/search-syntax
### Paginate products (cursor)
```bash
shop_gql '
query($cursor: String) {
products(first: 100, after: $cursor) {
edges { cursor node { id handle } }
pageInfo { hasNextPage endCursor }
}
}' '{"cursor":null}'
# subsequent calls: pass the previous endCursor
```
### Get a product with variants + metafields
```bash
shop_gql '
query($id: ID!) {
product(id: $id) {
id title handle descriptionHtml tags status
variants(first: 20) { edges { node { id sku price compareAtPrice inventoryQuantity selectedOptions { name value } } } }
metafields(first: 20) { edges { node { namespace key type value } } }
}
}' '{"id":"gid://shopify/Product/10079467700516"}' | jq
```
### Create a product with one variant
```bash
shop_gql '
mutation($input: ProductCreateInput!) {
productCreate(product: $input) {
product { id handle }
userErrors { field message }
}
}' '{"input":{"title":"Test Hoodie","status":"DRAFT","vendor":"Hermes","productType":"Apparel","tags":["test"]}}'
```
Variants now have their own mutations in recent versions:
```bash
# Add variants after creating the product
shop_gql '
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
productVariantsBulkCreate(productId: $productId, variants: $variants) {
productVariants { id sku price }
userErrors { field message }
}
}' '{"productId":"gid://shopify/Product/...","variants":[{"optionValues":[{"optionName":"Size","name":"M"}],"price":"49.00","inventoryItem":{"sku":"HD-M","tracked":true}}]}'
```
### Update price / SKU
```bash
shop_gql '
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
productVariantsBulkUpdate(productId: $productId, variants: $variants) {
productVariants { id sku price }
userErrors { field message }
}
}' '{"productId":"gid://shopify/Product/...","variants":[{"id":"gid://shopify/ProductVariant/...","price":"55.00"}]}'
```
## Orders
### List recent orders (last 30 by default without `read_all_orders`)
```bash
shop_gql '
{
orders(first: 20, reverse: true, query: "financial_status:paid") {
edges { node {
id name createdAt displayFinancialStatus displayFulfillmentStatus
totalPriceSet { shopMoney { amount currencyCode } }
customer { id displayName email }
lineItems(first: 10) { edges { node { title quantity sku } } }
} }
}
}' | jq
```
Useful order query filters: `financial_status:paid|pending|refunded`, `fulfillment_status:unfulfilled|fulfilled`, `created_at:>2025-01-01`, `tag:gift`, `email:foo@example.com`.
### Fetch a single order with shipping address
```bash
shop_gql '
query($id: ID!) {
order(id: $id) {
id name email
shippingAddress { name address1 address2 city province country zip phone }
lineItems(first: 50) { edges { node { title quantity variant { sku } originalUnitPriceSet { shopMoney { amount currencyCode } } } } }
transactions { id kind status amountSet { shopMoney { amount currencyCode } } }
}
}' '{"id":"gid://shopify/Order/...."}' | jq
```
## Customers
```bash
# Search
shop_gql '
{
customers(first: 10, query: "email:*@example.com") {
edges { node { id email displayName numberOfOrders amountSpent { amount currencyCode } } }
}
}'
# Create
shop_gql '
mutation($input: CustomerInput!) {
customerCreate(input: $input) {
customer { id email }
userErrors { field message }
}
}' '{"input":{"email":"test@example.com","firstName":"Test","lastName":"User","tags":["api-created"]}}'
```
## Inventory
Inventory lives on **inventory items** tied to variants, quantities tracked per **location**.
```bash
# Get inventory for a variant across all locations
shop_gql '
query($id: ID!) {
productVariant(id: $id) {
id sku
inventoryItem {
id tracked
inventoryLevels(first: 10) {
edges { node { location { id name } quantities(names: ["available","on_hand","committed"]) { name quantity } } }
}
}
}
}' '{"id":"gid://shopify/ProductVariant/..."}'
```
Adjust stock (delta) — uses `inventoryAdjustQuantities`:
```bash
shop_gql '
mutation($input: InventoryAdjustQuantitiesInput!) {
inventoryAdjustQuantities(input: $input) {
inventoryAdjustmentGroup { reason changes { name delta } }
userErrors { field message }
}
}' '{
"input": {
"reason": "correction",
"name": "available",
"changes": [{"delta": 5, "inventoryItemId": "gid://shopify/InventoryItem/...", "locationId": "gid://shopify/Location/..."}]
}
}'
```
Set absolute stock (not delta) — `inventorySetQuantities`:
```bash
shop_gql '
mutation($input: InventorySetQuantitiesInput!) {
inventorySetQuantities(input: $input) {
inventoryAdjustmentGroup { id }
userErrors { field message }
}
}' '{"input":{"reason":"correction","name":"available","ignoreCompareQuantity":true,"quantities":[{"inventoryItemId":"gid://shopify/InventoryItem/...","locationId":"gid://shopify/Location/...","quantity":100}]}}'
```
## Metafields & Metaobjects
Metafields attach custom data to resources (products, customers, orders, shop).
```bash
# Read
shop_gql '
query($id: ID!) {
product(id: $id) {
metafields(first: 10, namespace: "custom") {
edges { node { key type value } }
}
}
}' '{"id":"gid://shopify/Product/..."}'
# Write (works for any owner type)
shop_gql '
mutation($metafields: [MetafieldsSetInput!]!) {
metafieldsSet(metafields: $metafields) {
metafields { id key namespace }
userErrors { field message code }
}
}' '{"metafields":[{"ownerId":"gid://shopify/Product/...","namespace":"custom","key":"care_instructions","type":"multi_line_text_field","value":"Wash cold. Tumble dry low."}]}'
```
## Storefront API (public read-only)
Different endpoint, different token, used for customer-facing apps/hydrogen-style headless setups. Headers differ:
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/api/$SHOPIFY_API_VERSION/graphql.json`
- **Auth header (public):** `X-Shopify-Storefront-Access-Token: <public token>` — embeddable in browser
- **Auth header (private):** `Shopify-Storefront-Private-Token: <private token>` — server-only
```bash
curl -sS -X POST \
"https://${SHOPIFY_STORE_DOMAIN}/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
-H "Content-Type: application/json" \
-H "X-Shopify-Storefront-Access-Token: ${SHOPIFY_STOREFRONT_TOKEN}" \
-d '{"query":"{ shop { name } products(first: 5) { edges { node { id title handle } } } }"}' | jq
```
## Bulk Operations
For dumps larger than rate limits allow (full product catalog, all orders for a year):
```bash
# 1. Start bulk query
shop_gql '
mutation {
bulkOperationRunQuery(query: """
{ products { edges { node { id title handle variants { edges { node { sku price } } } } } } }
""") {
bulkOperation { id status }
userErrors { field message }
}
}'
# 2. Poll status
shop_gql '{ currentBulkOperation { id status errorCode objectCount fileSize url partialDataUrl } }'
# 3. When status=COMPLETED, download the JSONL file
curl -sS "$URL" > products.jsonl
```
Each JSONL line is a node, and nested connections are emitted as separate lines with `__parentId`. Reassemble client-side if needed.
## Webhooks
Subscribe to events so you don't have to poll:
```bash
shop_gql '
mutation($topic: WebhookSubscriptionTopic!, $sub: WebhookSubscriptionInput!) {
webhookSubscriptionCreate(topic: $topic, webhookSubscription: $sub) {
webhookSubscription { id topic endpoint { __typename ... on WebhookHttpEndpoint { callbackUrl } } }
userErrors { field message }
}
}' '{"topic":"ORDERS_CREATE","sub":{"callbackUrl":"https://example.com/webhook","format":"JSON"}}'
```
Verify incoming webhook HMAC using the app's client secret (not the access token):
```bash
echo -n "$REQUEST_BODY" | openssl dgst -sha256 -hmac "$APP_SECRET" -binary | base64
# Compare to X-Shopify-Hmac-Sha256 header
```
## Pitfalls
- **REST endpoints still exist but are frozen.** Don't write new integrations against `/admin/api/.../products.json`. Use GraphQL.
- **Token format check.** Admin tokens start with `shpat_`. Storefront public tokens with `shpua_`. If you have one and the wrong header, every request returns 401 without a useful error body.
- **403 with a valid token = missing scope.** Shopify returns `{"errors":[{"message":"Access denied for ..."}]}`. Re-configure Admin API scopes on the app, then reinstall to regenerate the token.
- **`userErrors` is empty != success.** Also check `data.<mutation>.<resource>` is non-null. Some failures populate neither — inspect the whole response.
- **GID vs numeric ID.** Legacy REST gave numeric IDs; GraphQL wants full GID strings. To convert: `gid://shopify/Product/<numeric>`.
- **Rate limit surprise.** A single `products(first: 250)` with deep nesting can cost 1000+ points and throttle immediately on a standard-plan shop. Start narrow, read `extensions.cost`, adjust.
- **Pagination order.** `products(first: N, reverse: true)` sorts by `id DESC`, not `created_at`. Use `sortKey: CREATED_AT, reverse: true` for "newest first."
- **`read_all_orders` for historical data.** Without it, `orders(...)` silently caps at the 60-day window. You won't get an error, just fewer results than expected. For Shopify Plus merchants with many orders, request this scope via the app's protected-data settings.
- **Currencies are strings.** Amounts come back as `"49.00"` not `49.0`. Don't `jq tonumber` blindly if you care about zero-padding.
- **Multi-currency Money fields** have `shopMoney` (store's currency) AND `presentmentMoney` (customer's). Pick one consistently.
## Safety
Mutations in Shopify are real — they create products, charge refunds, cancel orders, ship fulfillments. Before running `productDelete`, `orderCancel`, `refundCreate`, or any bulk mutation: state clearly what the change is, on which shop, and confirm with the user. There is no staging clone of production data unless the user has a separate dev store.
@@ -12,6 +12,14 @@ import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
try:
from hermes_constants import get_hermes_home
except ImportError:
import os as _os
def get_hermes_home() -> Path: # type: ignore[misc]
val = (_os.environ.get("HERMES_HOME") or "").strip()
return Path(val) if val else Path.home() / ".hermes"
try:
from fastapi import APIRouter
except Exception: # Allows local unit tests without dashboard dependencies.
@@ -135,15 +143,15 @@ ACHIEVEMENTS: List[Dict[str, Any]] = [
def state_path() -> Path:
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "state.json"
return get_hermes_home() / "plugins" / "hermes-achievements" / "state.json"
def snapshot_path() -> Path:
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "scan_snapshot.json"
return get_hermes_home() / "plugins" / "hermes-achievements" / "scan_snapshot.json"
def checkpoint_path() -> Path:
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "scan_checkpoint.json"
return get_hermes_home() / "plugins" / "hermes-achievements" / "scan_checkpoint.json"
def load_state() -> Dict[str, Any]:
File diff suppressed because it is too large Load Diff
+760
View File
@@ -0,0 +1,760 @@
/*
* Hermes Kanban dashboard plugin styles.
*
* All colors reference theme CSS vars so the board reskins with the
* active dashboard theme. No hardcoded palette.
*/
.hermes-kanban {
width: 100%;
}
/* ---- Columns layout -------------------------------------------------- */
.hermes-kanban-columns {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
gap: 0.75rem;
align-items: start;
}
.hermes-kanban-column {
display: flex;
flex-direction: column;
background: color-mix(in srgb, var(--color-card) 85%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius);
padding: 0.5rem;
min-height: 200px;
max-height: calc(100vh - 220px);
transition: border-color 120ms ease, background-color 120ms ease;
}
.hermes-kanban-column--drop {
border-color: var(--color-ring);
background: color-mix(in srgb, var(--color-ring) 8%, var(--color-card));
}
.hermes-kanban-column-header {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.25rem 0.25rem 0.35rem;
font-weight: 600;
font-size: 0.85rem;
color: var(--color-foreground);
}
.hermes-kanban-column-label {
flex: 1;
letter-spacing: 0.01em;
}
.hermes-kanban-column-count {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
font-size: 0.75rem;
font-weight: 500;
}
.hermes-kanban-column-add {
appearance: none;
background: transparent;
border: 1px solid var(--color-border);
color: var(--color-foreground);
border-radius: var(--radius-sm, 0.25rem);
width: 22px;
height: 22px;
line-height: 1;
font-size: 1rem;
cursor: pointer;
}
.hermes-kanban-column-add:hover {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-column-sub {
padding: 0 0.25rem 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
border-bottom: 1px solid color-mix(in srgb, var(--color-border) 60%, transparent);
margin-bottom: 0.5rem;
}
.hermes-kanban-column-body {
display: flex;
flex-direction: column;
gap: 0.45rem;
overflow-y: auto;
padding-right: 0.1rem;
}
.hermes-kanban-empty {
padding: 1.5rem 0.5rem;
text-align: center;
font-size: 0.75rem;
color: var(--color-muted-foreground);
border: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
/* ---- Status dots ----------------------------------------------------- */
.hermes-kanban-dot {
display: inline-block;
width: 0.5rem;
height: 0.5rem;
border-radius: 999px;
background: var(--color-muted-foreground);
}
.hermes-kanban-dot-triage { background: #b47dd6; } /* lilac — fresh/unspecified */
.hermes-kanban-dot-todo { background: var(--color-muted-foreground); }
.hermes-kanban-dot-ready { background: #d4b348; } /* amber */
.hermes-kanban-dot-running { background: #3fb97d; } /* green */
.hermes-kanban-dot-blocked { background: var(--color-destructive, #d14a4a); }
.hermes-kanban-dot-done { background: #4a8cd1; } /* blue */
.hermes-kanban-dot-archived { background: var(--color-border); }
/* ---- Progress pill (N/M child tasks done) --------------------------- */
.hermes-kanban-progress {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.62rem;
padding: 0.05rem 0.35rem;
border-radius: 999px;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border: 1px solid color-mix(in srgb, var(--color-border) 80%, transparent);
color: var(--color-muted-foreground);
letter-spacing: 0.02em;
}
.hermes-kanban-progress--full {
background: color-mix(in srgb, #3fb97d 22%, transparent);
border-color: color-mix(in srgb, #3fb97d 45%, transparent);
color: var(--color-foreground);
}
/* ---- Lanes (per-profile sub-grouping inside Running) ---------------- */
.hermes-kanban-lane {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.25rem 0 0.35rem;
border-top: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
}
.hermes-kanban-lane:first-child {
border-top: 0;
padding-top: 0;
}
.hermes-kanban-lane-head {
display: flex;
align-items: center;
gap: 0.4rem;
font-size: 0.65rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
padding: 0 0.1rem;
}
.hermes-kanban-lane-name {
font-weight: 600;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-lane-count {
margin-left: auto;
font-variant-numeric: tabular-nums;
}
/* ---- Card ------------------------------------------------------------ */
.hermes-kanban-card {
cursor: grab;
transition: transform 100ms ease, box-shadow 100ms ease;
}
.hermes-kanban-card:hover {
box-shadow: 0 1px 0 0 var(--color-ring) inset, 0 0 0 1px var(--color-ring) inset;
}
.hermes-kanban-card:active {
cursor: grabbing;
transform: scale(0.995);
}
.hermes-kanban-card-content {
padding: 0.5rem 0.6rem !important;
display: flex;
flex-direction: column;
gap: 0.3rem;
}
.hermes-kanban-card-row {
display: flex;
align-items: center;
gap: 0.35rem;
flex-wrap: wrap;
}
.hermes-kanban-card-id {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.65rem;
color: var(--color-muted-foreground);
letter-spacing: 0.03em;
}
.hermes-kanban-card-title {
font-size: 0.85rem;
font-weight: 500;
line-height: 1.3;
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-card-meta {
font-size: 0.7rem;
color: var(--color-muted-foreground);
gap: 0.55rem;
}
.hermes-kanban-priority {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
background: color-mix(in srgb, var(--color-ring) 18%, transparent);
color: var(--color-foreground);
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, transparent);
}
.hermes-kanban-tag {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
}
.hermes-kanban-assignee {
font-weight: 500;
color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));
}
.hermes-kanban-unassigned {
font-style: italic;
}
.hermes-kanban-ago {
margin-left: auto;
}
/* ---- Inline create --------------------------------------------------- */
.hermes-kanban-inline-create {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.5rem;
margin-bottom: 0.5rem;
background: color-mix(in srgb, var(--color-card) 70%, transparent);
border: 1px dashed var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-inline-create > .flex.gap-2:last-child > button:first-of-type {
flex: 1;
min-width: 0;
}
/* ---- Drawer (task detail side panel) --------------------------------- */
.hermes-kanban-drawer-shade {
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.45);
z-index: 60;
display: flex;
justify-content: flex-end;
}
.hermes-kanban-drawer {
width: min(480px, 92vw);
height: 100vh;
background: var(--color-card);
border-left: 1px solid var(--color-border);
display: flex;
flex-direction: column;
box-shadow: -4px 0 18px rgba(0, 0, 0, 0.35);
animation: hermes-kanban-drawer-in 180ms ease-out;
}
@keyframes hermes-kanban-drawer-in {
from { transform: translateX(100%); opacity: 0.3; }
to { transform: translateX(0); opacity: 1; }
}
.hermes-kanban-drawer-head {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.6rem 0.8rem;
border-bottom: 1px solid var(--color-border);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-drawer-close {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 1.25rem;
line-height: 1;
cursor: pointer;
padding: 0 0.25rem;
}
.hermes-kanban-drawer-close:hover { color: var(--color-foreground); }
.hermes-kanban-drawer-body {
flex: 1;
overflow-y: auto;
padding: 0.9rem;
display: flex;
flex-direction: column;
gap: 0.85rem;
}
.hermes-kanban-drawer-title {
display: flex;
align-items: center;
gap: 0.5rem;
font-size: 1rem;
font-weight: 600;
}
.hermes-kanban-drawer-meta {
display: flex;
flex-direction: column;
gap: 0.15rem;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-meta-row {
display: flex;
gap: 0.5rem;
font-size: 0.72rem;
}
.hermes-kanban-meta-label {
width: 92px;
color: var(--color-muted-foreground);
}
.hermes-kanban-meta-value {
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-actions {
display: flex;
flex-wrap: wrap;
gap: 0.3rem;
}
.hermes-kanban-section {
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.hermes-kanban-section-head {
font-size: 0.72rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.07em;
color: var(--color-muted-foreground);
}
.hermes-kanban-pre {
margin: 0;
padding: 0.45rem 0.55rem;
white-space: pre-wrap;
word-break: break-word;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.72rem;
color: var(--color-foreground);
}
.hermes-kanban-comment {
border-left: 2px solid color-mix(in srgb, var(--color-ring) 35%, transparent);
padding-left: 0.5rem;
display: flex;
flex-direction: column;
gap: 0.2rem;
}
.hermes-kanban-comment-head {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
}
.hermes-kanban-comment-author {
font-weight: 600;
color: var(--color-foreground);
}
.hermes-kanban-comment-ago {
color: var(--color-muted-foreground);
}
.hermes-kanban-event {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-event-kind {
color: var(--color-foreground);
min-width: 6rem;
}
.hermes-kanban-event-payload {
color: var(--color-muted-foreground);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
max-width: 280px;
}
.hermes-kanban-drawer-comment-row {
display: flex;
gap: 0.4rem;
padding: 0.55rem 0.75rem;
border-top: 1px solid var(--color-border);
background: color-mix(in srgb, var(--color-card) 90%, transparent);
}
.hermes-kanban-count {
display: inline-flex;
gap: 0.2rem;
align-items: center;
}
/* ---- Selection chrome ----------------------------------------------- */
.hermes-kanban-card--selected :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-ring) inset,
0 0 0 1px var(--color-ring) inset;
background: color-mix(in srgb, var(--color-ring) 6%, var(--color-card));
}
.hermes-kanban-card-check {
width: 0.85rem;
height: 0.85rem;
margin: 0;
cursor: pointer;
accent-color: var(--color-ring);
}
/* ---- Bulk action bar ------------------------------------------------ */
.hermes-kanban-bulk {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.4rem 0.75rem;
background: color-mix(in srgb, var(--color-ring) 10%, var(--color-card));
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, var(--color-border));
border-radius: var(--radius-sm, 0.25rem);
flex-wrap: wrap;
}
.hermes-kanban-bulk-count {
font-weight: 600;
font-size: 0.75rem;
padding-right: 0.25rem;
}
.hermes-kanban-bulk > button,
.hermes-kanban-bulk-reassign > button {
height: 1.7rem !important;
padding: 0 0.5rem !important;
font-size: 0.7rem !important;
border: 1px solid var(--color-border);
cursor: pointer;
}
.hermes-kanban-bulk > button:hover:not(:disabled),
.hermes-kanban-bulk-reassign > button:hover:not(:disabled) {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-bulk-reassign {
display: flex;
align-items: center;
gap: 0.25rem;
padding-left: 0.5rem;
border-left: 1px solid color-mix(in srgb, var(--color-border) 70%, transparent);
}
/* ---- Dependency editor chips --------------------------------------- */
.hermes-kanban-deps-row {
display: flex;
align-items: center;
gap: 0.5rem;
margin-bottom: 0.4rem;
}
.hermes-kanban-deps-label {
font-size: 0.68rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
min-width: 4rem;
}
.hermes-kanban-deps-chips {
display: flex;
gap: 0.3rem;
flex-wrap: wrap;
flex: 1;
}
.hermes-kanban-deps-empty {
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-style: italic;
}
.hermes-kanban-dep-chip {
display: inline-flex;
align-items: center;
gap: 0.15rem;
padding: 0.1rem 0.35rem;
background: color-mix(in srgb, var(--color-foreground) 6%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.68rem;
color: var(--color-foreground);
}
.hermes-kanban-dep-chip-x {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
cursor: pointer;
font-size: 0.85rem;
line-height: 1;
padding: 0 0.15rem;
}
.hermes-kanban-dep-chip-x:hover { color: var(--color-destructive, #d14a4a); }
/* ---- Inline edit affordances --------------------------------------- */
.hermes-kanban-editable {
cursor: pointer;
border-bottom: 1px dotted color-mix(in srgb, var(--color-border) 80%, transparent);
}
.hermes-kanban-editable:hover {
color: var(--color-foreground);
border-bottom-color: var(--color-ring);
}
.hermes-kanban-drawer-title-text {
cursor: pointer;
}
.hermes-kanban-drawer-title-text:hover {
text-decoration: underline;
text-decoration-color: var(--color-ring);
text-decoration-style: dotted;
text-underline-offset: 3px;
}
.hermes-kanban-edit-row {
display: flex;
align-items: center;
gap: 0.35rem;
width: 100%;
}
.hermes-kanban-section-head-row {
display: flex;
align-items: center;
justify-content: space-between;
gap: 0.5rem;
}
.hermes-kanban-edit-link {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 0.7rem;
text-transform: uppercase;
letter-spacing: 0.05em;
cursor: pointer;
padding: 0;
}
.hermes-kanban-edit-link:hover { color: var(--color-ring); }
.hermes-kanban-textarea {
width: 100%;
min-height: 8rem;
background: var(--color-card);
color: var(--color-foreground);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
padding: 0.5rem 0.6rem;
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.8rem;
line-height: 1.5;
resize: vertical;
}
.hermes-kanban-textarea:focus {
outline: none;
border-color: var(--color-ring);
box-shadow: 0 0 0 2px color-mix(in srgb, var(--color-ring) 30%, transparent);
}
/* ---- Markdown rendering -------------------------------------------- */
.hermes-kanban-md {
font-size: 0.8rem;
line-height: 1.55;
color: var(--color-foreground);
}
.hermes-kanban-md p { margin: 0.25rem 0; }
.hermes-kanban-md h1,
.hermes-kanban-md h2,
.hermes-kanban-md h3,
.hermes-kanban-md h4 {
margin: 0.6rem 0 0.2rem;
line-height: 1.25;
}
.hermes-kanban-md h1 { font-size: 1.05rem; }
.hermes-kanban-md h2 { font-size: 0.95rem; }
.hermes-kanban-md h3 { font-size: 0.88rem; }
.hermes-kanban-md h4 { font-size: 0.82rem; }
.hermes-kanban-md ul {
margin: 0.25rem 0 0.25rem 1.1rem;
padding: 0;
}
.hermes-kanban-md li { margin: 0.1rem 0; }
.hermes-kanban-md a {
color: var(--color-ring);
text-decoration: underline;
}
.hermes-kanban-md code {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.75rem;
padding: 0.05rem 0.3rem;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border-radius: 3px;
}
.hermes-kanban-md-code {
margin: 0.35rem 0;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 5%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
overflow-x: auto;
}
.hermes-kanban-md-code code {
background: transparent;
padding: 0;
font-size: 0.75rem;
white-space: pre;
}
.hermes-kanban-md strong { font-weight: 600; }
/* ---- Touch-drag proxy ---------------------------------------------- */
.hermes-kanban-touch-proxy {
pointer-events: none;
opacity: 0.85;
box-shadow: 0 8px 20px rgba(0, 0, 0, 0.35);
transform: scale(1.02);
transition: none;
}
/* ---- Staleness tiers ------------------------------------------------ */
.hermes-kanban-card--stale-amber :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px #d4b34888 inset;
}
.hermes-kanban-card--stale-amber:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px #d4b348 inset;
}
.hermes-kanban-card--stale-red :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px var(--color-destructive, #d14a4a) inset,
0 0 8px color-mix(in srgb, var(--color-destructive, #d14a4a) 30%, transparent);
}
.hermes-kanban-card--stale-red:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-destructive, #d14a4a) inset,
0 0 10px color-mix(in srgb, var(--color-destructive, #d14a4a) 45%, transparent);
}
/* ---- Worker log pane ------------------------------------------------ */
.hermes-kanban-log {
max-height: 340px;
overflow: auto;
white-space: pre;
font-size: 0.7rem;
line-height: 1.45;
}
/* ---- Run history (per-attempt log in the drawer) ------------------- */
.hermes-kanban-run {
border-left: 2px solid var(--color-border);
padding: 0.35rem 0.5rem;
margin-bottom: 0.4rem;
background: color-mix(in srgb, var(--color-foreground) 3%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-run--active { border-left-color: #3fb97d; }
.hermes-kanban-run--completed { border-left-color: #4a8cd1; }
.hermes-kanban-run--ended { border-left-color: #6b7280; } /* generic fallback when outcome is unset */
.hermes-kanban-run--blocked { border-left-color: var(--color-destructive, #d14a4a); }
.hermes-kanban-run--crashed,
.hermes-kanban-run--timed_out,
.hermes-kanban-run--gave_up,
.hermes-kanban-run--spawn_failed {
border-left-color: var(--color-destructive, #d14a4a);
background: color-mix(in srgb, var(--color-destructive, #d14a4a) 6%, transparent);
}
.hermes-kanban-run--reclaimed { border-left-color: #d4b348; }
.hermes-kanban-run-head {
display: flex;
align-items: center;
gap: 0.6rem;
font-size: 0.7rem;
}
.hermes-kanban-run-outcome {
font-family: var(--font-mono, ui-monospace, monospace);
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--color-foreground);
}
.hermes-kanban-run-profile {
color: var(--color-muted-foreground);
}
.hermes-kanban-run-elapsed {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-ago {
margin-left: auto;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-summary {
font-size: 0.75rem;
padding: 0.2rem 0 0;
color: var(--color-foreground);
}
.hermes-kanban-run-error {
font-size: 0.7rem;
color: var(--color-destructive, #d14a4a);
padding: 0.15rem 0 0;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-run-meta {
display: block;
font-size: 0.65rem;
padding: 0.15rem 0 0;
color: var(--color-muted-foreground);
white-space: pre-wrap;
word-break: break-word;
font-family: var(--font-mono, ui-monospace, monospace);
}
+14
View File
@@ -0,0 +1,14 @@
{
"name": "kanban",
"label": "Kanban",
"description": "Multi-agent collaboration board — drag-drop cards across columns, read comment threads, see which profile is running what",
"icon": "Package",
"version": "1.0.0",
"tab": {
"path": "/kanban",
"position": "after:skills"
},
"entry": "dist/index.js",
"css": "dist/style.css",
"api": "plugin_api.py"
}
+845
View File
@@ -0,0 +1,845 @@
"""Kanban dashboard plugin — backend API routes.
Mounted at /api/plugins/kanban/ by the dashboard plugin system.
This layer is intentionally thin: every handler is a small wrapper around
``hermes_cli.kanban_db`` or a direct SQL query. Writes use the same code
paths the CLI and gateway ``/kanban`` command use, so the three surfaces
cannot drift.
Live updates arrive via the ``/events`` WebSocket, which tails the
append-only ``task_events`` table on a short poll interval (WAL mode lets
reads run alongside the dispatcher's IMMEDIATE write transactions).
Security note
-------------
The dashboard's HTTP auth middleware (``web_server.auth_middleware``)
explicitly skips ``/api/plugins/`` plugin routes are unauthenticated by
design because the dashboard binds to localhost by default. For the
WebSocket we still require the session token as a ``?token=`` query
parameter (browsers cannot set the ``Authorization`` header on an upgrade
request), matching the established pattern used by the in-browser PTY
bridge in ``hermes_cli/web_server.py``. If you run the dashboard with
``--host 0.0.0.0``, every plugin route kanban included becomes
reachable from the network. Don't do that on a shared host.
"""
from __future__ import annotations
import asyncio
import hmac
import json
import logging
import sqlite3
import time
from dataclasses import asdict
from typing import Any, Optional
from fastapi import APIRouter, HTTPException, Query, WebSocket, WebSocketDisconnect, status as http_status
from pydantic import BaseModel, Field
from hermes_cli import kanban_db
log = logging.getLogger(__name__)
router = APIRouter()
# ---------------------------------------------------------------------------
# Auth helper — WebSocket only (HTTP routes live behind the dashboard's
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------
def _check_ws_token(provided: Optional[str]) -> bool:
"""Constant-time compare against the dashboard session token.
Imported lazily so the plugin still loads in test contexts where the
dashboard web_server module isn't importable (e.g. the bare-FastAPI
test harness).
"""
if not provided:
return False
try:
from hermes_cli import web_server as _ws
except Exception:
# No dashboard context (tests). Accept so the tail loop is still
# testable; in production the dashboard module always imports
# cleanly because it's the caller.
return True
expected = getattr(_ws, "_SESSION_TOKEN", None)
if not expected:
return True
return hmac.compare_digest(str(provided), str(expected))
def _conn():
"""Open a kanban_db connection, creating the schema on first use.
Every handler that mutates the DB goes through this so the plugin
self-heals on a fresh install (no user-visible "no such table"
error if somebody hits POST /tasks before GET /board).
``init_db`` is idempotent.
"""
try:
kanban_db.init_db()
except Exception as exc:
log.warning("kanban init_db failed: %s", exc)
return kanban_db.connect()
# ---------------------------------------------------------------------------
# Serialization helpers
# ---------------------------------------------------------------------------
# Columns shown by the dashboard, in left-to-right order. "archived" is
# available via a filter toggle rather than a visible column.
BOARD_COLUMNS: list[str] = [
"triage", "todo", "ready", "running", "blocked", "done",
]
def _task_dict(task: kanban_db.Task) -> dict[str, Any]:
d = asdict(task)
# Add derived age metrics so the UI can colour stale cards without
# computing deltas client-side.
d["age"] = kanban_db.task_age(task)
# Keep body short on list endpoints; full body comes from /tasks/:id.
return d
def _event_dict(event: kanban_db.Event) -> dict[str, Any]:
return {
"id": event.id,
"task_id": event.task_id,
"kind": event.kind,
"payload": event.payload,
"created_at": event.created_at,
"run_id": event.run_id,
}
def _comment_dict(c: kanban_db.Comment) -> dict[str, Any]:
return {
"id": c.id,
"task_id": c.task_id,
"author": c.author,
"body": c.body,
"created_at": c.created_at,
}
def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
"""Serialise a Run for the drawer's Run history section."""
return {
"id": r.id,
"task_id": r.task_id,
"profile": r.profile,
"step_key": r.step_key,
"status": r.status,
"claim_lock": r.claim_lock,
"claim_expires": r.claim_expires,
"worker_pid": r.worker_pid,
"max_runtime_seconds": r.max_runtime_seconds,
"last_heartbeat_at": r.last_heartbeat_at,
"started_at": r.started_at,
"ended_at": r.ended_at,
"outcome": r.outcome,
"summary": r.summary,
"metadata": r.metadata,
"error": r.error,
}
def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
"""Return {'parents': [...], 'children': [...]} for a task."""
parents = [
r["parent_id"]
for r in conn.execute(
"SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
(task_id,),
)
]
children = [
r["child_id"]
for r in conn.execute(
"SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
(task_id,),
)
]
return {"parents": parents, "children": children}
# ---------------------------------------------------------------------------
# GET /board
# ---------------------------------------------------------------------------
@router.get("/board")
def get_board(
tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
include_archived: bool = Query(False),
):
"""Return the full board grouped by status column.
``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
install doesn't surface a "failed to load" error on the plugin tab.
"""
conn = _conn()
try:
tasks = kanban_db.list_tasks(
conn, tenant=tenant, include_archived=include_archived
)
# Pre-fetch link counts per task (cheap: one query).
link_counts: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT parent_id, child_id FROM task_links"
).fetchall():
link_counts.setdefault(row["parent_id"], {"parents": 0, "children": 0})[
"children"
] += 1
link_counts.setdefault(row["child_id"], {"parents": 0, "children": 0})[
"parents"
] += 1
# Comment + event counts (both cheap aggregates).
comment_counts: dict[str, int] = {
r["task_id"]: r["n"]
for r in conn.execute(
"SELECT task_id, COUNT(*) AS n FROM task_comments GROUP BY task_id"
)
}
# Progress rollup: for each parent, how many children are done / total.
# One pass over task_links joined with child status — cheaper than
# N per-task queries and the plugin uses it to render "N/M".
progress: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT l.parent_id AS pid, t.status AS cstatus "
"FROM task_links l JOIN tasks t ON t.id = l.child_id"
).fetchall():
p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
p["total"] += 1
if row["cstatus"] == "done":
p["done"] += 1
latest_event_id = conn.execute(
"SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
).fetchone()["m"]
columns: dict[str, list[dict]] = {c: [] for c in BOARD_COLUMNS}
if include_archived:
columns["archived"] = []
for t in tasks:
d = _task_dict(t)
d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
d["comment_count"] = comment_counts.get(t.id, 0)
d["progress"] = progress.get(t.id) # None when the task has no children
col = t.status if t.status in columns else "todo"
columns[col].append(d)
# Stable per-column ordering already applied by list_tasks
# (priority DESC, created_at ASC), keep as-is.
# List of known tenants for the UI filter dropdown.
tenants = [
r["tenant"]
for r in conn.execute(
"SELECT DISTINCT tenant FROM tasks WHERE tenant IS NOT NULL ORDER BY tenant"
)
]
# List of distinct assignees for the lane-by-profile sub-grouping.
assignees = [
r["assignee"]
for r in conn.execute(
"SELECT DISTINCT assignee FROM tasks WHERE assignee IS NOT NULL "
"AND status != 'archived' ORDER BY assignee"
)
]
return {
"columns": [
{"name": name, "tasks": columns[name]} for name in columns.keys()
],
"tenants": tenants,
"assignees": assignees,
"latest_event_id": int(latest_event_id),
"now": int(time.time()),
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# GET /tasks/:id
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}")
def get_task(task_id: str):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
return {
"task": _task_dict(task),
"comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
"events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
"links": _links_for(conn, task_id),
"runs": [_run_dict(r) for r in kanban_db.list_runs(conn, task_id)],
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# POST /tasks
# ---------------------------------------------------------------------------
class CreateTaskBody(BaseModel):
title: str
body: Optional[str] = None
assignee: Optional[str] = None
tenant: Optional[str] = None
priority: int = 0
workspace_kind: str = "scratch"
workspace_path: Optional[str] = None
parents: list[str] = Field(default_factory=list)
triage: bool = False
idempotency_key: Optional[str] = None
max_runtime_seconds: Optional[int] = None
skills: Optional[list[str]] = None
@router.post("/tasks")
def create_task(payload: CreateTaskBody):
conn = _conn()
try:
task_id = kanban_db.create_task(
conn,
title=payload.title,
body=payload.body,
assignee=payload.assignee,
created_by="dashboard",
workspace_kind=payload.workspace_kind,
workspace_path=payload.workspace_path,
tenant=payload.tenant,
priority=payload.priority,
parents=payload.parents,
triage=payload.triage,
idempotency_key=payload.idempotency_key,
max_runtime_seconds=payload.max_runtime_seconds,
skills=payload.skills,
)
task = kanban_db.get_task(conn, task_id)
body: dict[str, Any] = {"task": _task_dict(task) if task else None}
# Surface a dispatcher-presence warning so the UI can show a
# banner when a `ready` task would otherwise sit idle because no
# gateway is running (or dispatch_in_gateway=false). Only emit
# for ready+assigned tasks; triage/todo are expected to wait,
# and unassigned tasks can't be dispatched regardless.
if task and task.status == "ready" and task.assignee:
try:
from hermes_cli.kanban import _check_dispatcher_presence
running, message = _check_dispatcher_presence()
if not running and message:
body["warning"] = message
except Exception:
# Probe failure must never block the create itself.
pass
return body
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
# ---------------------------------------------------------------------------
# PATCH /tasks/:id (status / assignee / priority / title / body)
# ---------------------------------------------------------------------------
class UpdateTaskBody(BaseModel):
status: Optional[str] = None
assignee: Optional[str] = None
priority: Optional[int] = None
title: Optional[str] = None
body: Optional[str] = None
result: Optional[str] = None
block_reason: Optional[str] = None
# Structured handoff fields — forwarded to complete_task when status
# transitions to 'done'. Dashboard parity with ``hermes kanban
# complete --summary ... --metadata ...``.
summary: Optional[str] = None
metadata: Optional[dict] = None
@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
# --- assignee ----------------------------------------------------
if payload.assignee is not None:
try:
ok = kanban_db.assign_task(
conn, task_id, payload.assignee or None,
)
except RuntimeError as e:
raise HTTPException(status_code=409, detail=str(e))
if not ok:
raise HTTPException(status_code=404, detail="task not found")
# --- status -------------------------------------------------------
if payload.status is not None:
s = payload.status
ok = True
if s == "done":
ok = kanban_db.complete_task(
conn, task_id,
result=payload.result,
summary=payload.summary,
metadata=payload.metadata,
)
elif s == "blocked":
ok = kanban_db.block_task(conn, task_id, reason=payload.block_reason)
elif s == "ready":
# Re-open a blocked task, or just an explicit status set.
current = kanban_db.get_task(conn, task_id)
if current and current.status == "blocked":
ok = kanban_db.unblock_task(conn, task_id)
else:
# Direct status write for drag-drop (todo -> ready etc).
ok = _set_status_direct(conn, task_id, "ready")
elif s == "archived":
ok = kanban_db.archive_task(conn, task_id)
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, task_id, s)
else:
raise HTTPException(status_code=400, detail=f"unknown status: {s}")
if not ok:
raise HTTPException(
status_code=409,
detail=f"status transition to {s!r} not valid from current state",
)
# --- priority -----------------------------------------------------
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), task_id),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(task_id, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
# --- title / body -------------------------------------------------
if payload.title is not None or payload.body is not None:
with kanban_db.write_txn(conn):
sets, vals = [], []
if payload.title is not None:
if not payload.title.strip():
raise HTTPException(status_code=400, detail="title cannot be empty")
sets.append("title = ?")
vals.append(payload.title.strip())
if payload.body is not None:
sets.append("body = ?")
vals.append(payload.body)
vals.append(task_id)
conn.execute(
f"UPDATE tasks SET {', '.join(sets)} WHERE id = ?", vals,
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'edited', NULL, ?)",
(task_id, int(time.time())),
)
updated = kanban_db.get_task(conn, task_id)
return {"task": _task_dict(updated) if updated else None}
finally:
conn.close()
def _set_status_direct(
conn: sqlite3.Connection, task_id: str, new_status: str,
) -> bool:
"""Direct status write for drag-drop moves that aren't covered by the
structured complete/block/unblock/archive verbs (e.g. todo<->ready,
running<->ready). Appends a ``status`` event row for the live feed.
When this transitions OFF ``running`` to anything other than the
terminal verbs above (which own their own run closing), we close the
active run with outcome='reclaimed' so attempt history isn't
orphaned. ``running -> ready`` via drag-drop is the common case
(user yanking a stuck worker back to the queue).
"""
with kanban_db.write_txn(conn):
# Snapshot current state so we know whether to close a run.
prev = conn.execute(
"SELECT status, current_run_id FROM tasks WHERE id = ?",
(task_id,),
).fetchone()
if prev is None:
return False
was_running = prev["status"] == "running"
cur = conn.execute(
"UPDATE tasks SET status = ?, "
" claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
" claim_expires = CASE WHEN ? = 'running' THEN claim_expires ELSE NULL END, "
" worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
"WHERE id = ?",
(new_status, new_status, new_status, new_status, task_id),
)
if cur.rowcount != 1:
return False
run_id = None
if was_running and new_status != "running" and prev["current_run_id"]:
run_id = kanban_db._end_run(
conn, task_id,
outcome="reclaimed", status="reclaimed",
summary=f"status changed to {new_status} (dashboard/direct)",
)
conn.execute(
"INSERT INTO task_events (task_id, run_id, kind, payload, created_at) "
"VALUES (?, ?, 'status', ?, ?)",
(task_id, run_id, json.dumps({"status": new_status}), int(time.time())),
)
# If we re-opened something, children may have gone stale.
if new_status in ("done", "ready"):
kanban_db.recompute_ready(conn)
return True
# ---------------------------------------------------------------------------
# Comments
# ---------------------------------------------------------------------------
class CommentBody(BaseModel):
body: str
author: Optional[str] = "dashboard"
@router.post("/tasks/{task_id}/comments")
def add_comment(task_id: str, payload: CommentBody):
if not payload.body.strip():
raise HTTPException(status_code=400, detail="body is required")
conn = _conn()
try:
if kanban_db.get_task(conn, task_id) is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
kanban_db.add_comment(
conn, task_id, author=payload.author or "dashboard", body=payload.body,
)
return {"ok": True}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Links
# ---------------------------------------------------------------------------
class LinkBody(BaseModel):
parent_id: str
child_id: str
@router.post("/links")
def add_link(payload: LinkBody):
conn = _conn()
try:
kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
return {"ok": True}
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
@router.delete("/links")
def delete_link(parent_id: str = Query(...), child_id: str = Query(...)):
conn = _conn()
try:
ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
return {"ok": bool(ok)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Bulk actions (multi-select on the board)
# ---------------------------------------------------------------------------
class BulkTaskBody(BaseModel):
ids: list[str]
status: Optional[str] = None
assignee: Optional[str] = None # "" or None = unassign
priority: Optional[int] = None
archive: bool = False
@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody):
"""Apply the same patch to every id in ``payload.ids``.
This is an *independent* iteration per-task failures don't abort
siblings. Returns per-id outcome so the UI can surface partials.
"""
ids = [i for i in (payload.ids or []) if i]
if not ids:
raise HTTPException(status_code=400, detail="ids is required")
results: list[dict] = []
conn = _conn()
try:
for tid in ids:
entry: dict[str, Any] = {"id": tid, "ok": True}
try:
task = kanban_db.get_task(conn, tid)
if task is None:
entry.update(ok=False, error="not found")
results.append(entry)
continue
if payload.archive:
if not kanban_db.archive_task(conn, tid):
entry.update(ok=False, error="archive refused")
if payload.status is not None and not payload.archive:
s = payload.status
if s == "done":
ok = kanban_db.complete_task(conn, tid)
elif s == "blocked":
ok = kanban_db.block_task(conn, tid)
elif s == "ready":
cur = kanban_db.get_task(conn, tid)
if cur and cur.status == "blocked":
ok = kanban_db.unblock_task(conn, tid)
else:
ok = _set_status_direct(conn, tid, "ready")
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, tid, s)
else:
entry.update(ok=False, error=f"unknown status {s!r}")
results.append(entry)
continue
if not ok:
entry.update(ok=False, error=f"transition to {s!r} refused")
if payload.assignee is not None:
try:
if not kanban_db.assign_task(
conn, tid, payload.assignee or None,
):
entry.update(ok=False, error="assign refused")
except RuntimeError as e:
entry.update(ok=False, error=str(e))
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), tid),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(tid, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
except Exception as e: # defensive — one bad id shouldn't kill the batch
entry.update(ok=False, error=str(e))
results.append(entry)
return {"results": results}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------
@router.get("/config")
def get_config():
"""Return kanban dashboard preferences from ~/.hermes/config.yaml.
Reads the ``dashboard.kanban`` section if present; defaults otherwise.
Used by the UI to pre-select tenant filters, toggle markdown rendering,
or set column-width preferences without a round-trip per page load.
"""
try:
from hermes_cli.config import load_config
cfg = load_config() or {}
except Exception:
cfg = {}
dash_cfg = (cfg.get("dashboard") or {})
# dashboard.kanban may itself be a dict; fall back to {}.
k_cfg = dash_cfg.get("kanban") or {}
return {
"default_tenant": k_cfg.get("default_tenant") or "",
"lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
"include_archived_by_default": bool(k_cfg.get("include_archived_by_default", False)),
"render_markdown": bool(k_cfg.get("render_markdown", True)),
}
# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------
@router.get("/stats")
def get_stats():
"""Per-status + per-assignee counts + oldest-ready age.
Designed for the dashboard HUD and for router profiles that need to
answer "is this specialist overloaded?" without scanning the whole
board themselves.
"""
conn = _conn()
try:
return kanban_db.board_stats(conn)
finally:
conn.close()
@router.get("/assignees")
def get_assignees():
"""Known profiles + per-profile task counts.
Returns the union of ``~/.hermes/profiles/*`` on disk and every
distinct assignee currently used on the board. The dashboard uses
this to populate its assignee dropdown so a freshly-created profile
appears in the picker before it's been given any task.
"""
conn = _conn()
try:
return {"assignees": kanban_db.known_assignees(conn)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Worker log (read-only; file written by _default_spawn)
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}/log")
def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_000)):
"""Return the worker's stdout/stderr log.
``tail`` caps the response size (bytes) so the dashboard drawer
doesn't paginate megabytes into the browser. Returns 404 if the task
has never spawned. The on-disk log is rotated at 2 MiB per
``_rotate_worker_log`` a single ``.log.1`` is kept, no further
generations, so disk usage per task is bounded at ~4 MiB.
"""
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
finally:
conn.close()
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
content = kanban_db.read_worker_log(task_id, tail_bytes=tail)
log_path = kanban_db.worker_log_path(task_id)
size = log_path.stat().st_size if log_path.exists() else 0
return {
"task_id": task_id,
"path": str(log_path),
"exists": content is not None,
"size_bytes": size,
"content": content or "",
# Truncated when the on-disk file was larger than the tail cap.
"truncated": bool(tail and size > tail),
}
# ---------------------------------------------------------------------------
# Dispatch nudge (optional quick-path so the UI doesn't wait 60 s)
# ---------------------------------------------------------------------------
@router.post("/dispatch")
def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
conn = _conn()
try:
result = kanban_db.dispatch_once(
conn, dry_run=dry_run, max_spawn=max_n,
)
# DispatchResult is a dataclass.
try:
return asdict(result)
except TypeError:
return {"result": str(result)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------
# Poll interval for the event tail loop. SQLite WAL + 300 ms polling is
# the simplest and most robust approach; it adds a fraction of a percent
# of CPU and has no shared state to synchronize across workers.
_EVENT_POLL_SECONDS = 0.3
@router.websocket("/events")
async def stream_events(ws: WebSocket):
# Enforce the dashboard session token as a query param — browsers can't
# set Authorization on a WS upgrade. This matches how the PTY bridge
# authenticates in hermes_cli/web_server.py.
token = ws.query_params.get("token")
if not _check_ws_token(token):
await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
return
await ws.accept()
try:
since_raw = ws.query_params.get("since", "0")
try:
cursor = int(since_raw)
except ValueError:
cursor = 0
def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
conn = kanban_db.connect()
try:
rows = conn.execute(
"SELECT id, task_id, run_id, kind, payload, created_at "
"FROM task_events WHERE id > ? ORDER BY id ASC LIMIT 200",
(cursor_val,),
).fetchall()
out: list[dict] = []
new_cursor = cursor_val
for r in rows:
try:
payload = json.loads(r["payload"]) if r["payload"] else None
except Exception:
payload = None
out.append({
"id": r["id"],
"task_id": r["task_id"],
"run_id": r["run_id"],
"kind": r["kind"],
"payload": payload,
"created_at": r["created_at"],
})
new_cursor = r["id"]
return new_cursor, out
finally:
conn.close()
while True:
cursor, events = await asyncio.to_thread(_fetch_new, cursor)
if events:
await ws.send_json({"events": events, "cursor": cursor})
await asyncio.sleep(_EVENT_POLL_SECONDS)
except WebSocketDisconnect:
return
except Exception as exc: # defensive: never crash the dashboard worker
log.warning("Kanban event stream error: %s", exc)
try:
await ws.close()
except Exception:
pass
@@ -0,0 +1,32 @@
# DEPRECATED — the kanban dispatcher now runs inside the gateway by
# default (config key: kanban.dispatch_in_gateway, default true). To
# migrate:
#
# systemctl --user disable --now hermes-kanban-dispatcher.service
# # then make sure a gateway is running; e.g. a systemd user unit
# # for `hermes gateway start`. The gateway hosts the dispatcher.
#
# This unit is kept for users who truly cannot run the gateway (host
# policy forbids long-lived services, etc.). It now invokes the
# standalone dispatcher via the explicit --force flag, so nobody
# accidentally keeps two dispatchers racing against the same
# kanban.db. Running this unit AND a gateway with
# dispatch_in_gateway=true is NOT supported.
[Unit]
Description=Hermes Kanban dispatcher (DEPRECATED standalone daemon — prefer gateway-embedded dispatch)
Documentation=https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/env hermes kanban daemon --force --interval 60 --pidfile %t/hermes-kanban-dispatcher.pid
Restart=on-failure
RestartSec=5
# Log to the journal via stdout/stderr; the dispatcher also writes per-task
# worker output to $HERMES_HOME/kanban/logs/<task>.log.
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
+23 -12
View File
@@ -110,6 +110,17 @@ def _parse_context_tokens(host_val, root_val) -> int | None:
return None
def _parse_int_config(host_val, root_val, default: int) -> int:
"""Parse an integer config: host wins, then root, then default."""
for val in (host_val, root_val):
if val is not None:
try:
return int(val)
except (ValueError, TypeError):
pass
return default
def _parse_dialectic_depth(host_val, root_val) -> int:
"""Parse dialecticDepth: host wins, then root, then 1. Clamped to 1-3."""
for val in (host_val, root_val):
@@ -463,10 +474,10 @@ class HonchoClientConfig:
raw.get("dialecticDynamic"),
default=True,
),
dialectic_max_chars=int(
host_block.get("dialecticMaxChars")
or raw.get("dialecticMaxChars")
or 600
dialectic_max_chars=_parse_int_config(
host_block.get("dialecticMaxChars"),
raw.get("dialecticMaxChars"),
default=600,
),
dialectic_depth=_parse_dialectic_depth(
host_block.get("dialecticDepth"),
@@ -487,15 +498,15 @@ class HonchoClientConfig:
or raw.get("reasoningLevelCap")
or "high"
),
message_max_chars=int(
host_block.get("messageMaxChars")
or raw.get("messageMaxChars")
or 25000
message_max_chars=_parse_int_config(
host_block.get("messageMaxChars"),
raw.get("messageMaxChars"),
default=25000,
),
dialectic_max_input_chars=int(
host_block.get("dialecticMaxInputChars")
or raw.get("dialecticMaxInputChars")
or 10000
dialectic_max_input_chars=_parse_int_config(
host_block.get("dialecticMaxInputChars"),
raw.get("dialecticMaxInputChars"),
default=10000,
),
recall_mode=_normalize_recall_mode(
host_block.get("recallMode")
+9 -6
View File
@@ -160,11 +160,13 @@ class HonchoSessionManager:
Peers are lazy -- no API call until first use.
Observation settings are controlled per-session via SessionPeerConfig.
"""
if peer_id in self._peers_cache:
return self._peers_cache[peer_id]
with self._cache_lock:
if peer_id in self._peers_cache:
return self._peers_cache[peer_id]
peer = self.honcho.peer(peer_id)
self._peers_cache[peer_id] = peer
with self._cache_lock:
self._peers_cache[peer_id] = peer
return peer
def _get_or_create_honcho_session(
@@ -176,9 +178,10 @@ class HonchoSessionManager:
Returns:
Tuple of (honcho_session, existing_messages).
"""
if session_id in self._sessions_cache:
logger.debug("Honcho session '%s' retrieved from cache", session_id)
return self._sessions_cache[session_id], []
with self._cache_lock:
if session_id in self._sessions_cache:
logger.debug("Honcho session '%s' retrieved from cache", session_id)
return self._sessions_cache[session_id], []
session = self.honcho.session(session_id)
+3
View File
@@ -38,6 +38,7 @@ except ImportError:
try:
from microsoft_teams.apps import App, ActivityContext
from microsoft_teams.common.http.client import ClientOptions
from microsoft_teams.api import MessageActivity, ConversationReference
from microsoft_teams.api.activities.typing import TypingActivityInput
from microsoft_teams.api.activities.invoke.adaptive_card import AdaptiveCardInvokeActivity
@@ -57,6 +58,7 @@ try:
TEAMS_SDK_AVAILABLE = True
except ImportError:
TEAMS_SDK_AVAILABLE = False
ClientOptions = None # type: ignore[assignment,misc]
App = None # type: ignore[assignment,misc]
ActivityContext = None # type: ignore[assignment,misc]
MessageActivity = None # type: ignore[assignment,misc]
@@ -208,6 +210,7 @@ class TeamsAdapter(BasePlatformAdapter):
client_secret=self._client_secret,
tenant_id=self._tenant_id,
http_server_adapter=_AiohttpBridgeAdapter(aiohttp_app),
client=ClientOptions(headers={"User-Agent": "Hermes"}),
)
# Register message handler before initialize()
+1 -1
View File
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.11.0"
version = "0.12.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
+442 -172
View File
File diff suppressed because it is too large Load Diff
+10 -2
View File
@@ -35,10 +35,18 @@ import time
from pathlib import Path
from typing import Any
_PROJECT_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_PROJECT_ROOT))
try:
from hermes_constants import get_hermes_home
except ImportError:
def get_hermes_home() -> Path: # type: ignore[misc]
val = (os.environ.get("HERMES_HOME") or "").strip()
return Path(val) if val else Path.home() / ".hermes"
DEFAULT_TUI_DIR = Path(os.environ.get("HERMES_TUI_DIR", "/home/bb/hermes-agent/ui-tui"))
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(Path.home() / ".hermes" / "perf.log")))
DEFAULT_STATE_DB = Path.home() / ".hermes" / "state.db"
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(get_hermes_home() / "perf.log")))
DEFAULT_STATE_DB = get_hermes_home() / "state.db"
# Keystroke escape sequences. Matches what xterm/VT220 send when the
# terminal has bracketed-paste disabled and the key-repeat handler fires.
+48
View File
@@ -41,18 +41,26 @@ PYPROJECT_FILE = REPO_ROOT / "pyproject.toml"
AUTHOR_MAP = {
# teknium (multiple emails)
"teknium1@gmail.com": "teknium1",
"m@mobrienv.dev": "mikeyobrien",
"qiyin.zuo@pcitc.com": "qiyin-code",
"leone.parise@gmail.com": "leoneparise",
"teknium@nousresearch.com": "teknium1",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
"aludwin+gh@gmail.com": "adamludwin",
"2093036+exiao@users.noreply.github.com": "exiao",
"rylen.anil@gmail.com": "rylena",
"godnanijatin@gmail.com": "jatingodnani",
"14046872+tmimmanuel@users.noreply.github.com": "tmimmanuel",
"657290301@qq.com": "IMHaoyan",
"revar@users.noreply.github.com": "revaraver",
# Matrix parity salvage batch (April 2026)
"sr@samirusani": "samrusani",
"angelclaw@AngelMacBook.local": "angel12",
"charles@cryptoassetrecovery.com": "charles-brooks",
# DeepSeek v4 + Kimi thinking-mode reasoning_content salvage (April 2026)
"luwinyang@deepseek.com": "lsdsjy",
"season.saw@gmail.com": "season179",
"heathley@Heathley-MacBook-Air.local": "heathley",
"vlad19@gmail.com": "dandaka",
"adamrummer@gmail.com": "cyclingwithelephants",
@@ -73,6 +81,13 @@ AUTHOR_MAP = {
"thomasjhon6666@gmail.com": "ThomassJonax",
"focusflow.app.help@gmail.com": "yes999zc",
"rob@atlas.lan": "rmoen",
# Slack ephemeral slash-ack salvage (May 2026)
"probepark@users.noreply.github.com": "probepark",
# Slack batch salvage (May 2026)
"280484231+prive-fe-bot@users.noreply.github.com": "priveperfumes",
"amr@ghanem.sa": "amroessam",
"paperlantern.agent@gmail.com": "Hinotoi-agent",
"valda@underscore.jp": "valda",
"162235745+0z1-ghb@users.noreply.github.com": "0z1-ghb",
"yes999zc@163.com": "yes999zc",
"343873859@qq.com": "DrStrangerUJN",
@@ -89,6 +104,8 @@ AUTHOR_MAP = {
"130918800+devorun@users.noreply.github.com": "devorun",
"surat.s@itm.kmutnb.ac.th": "beesrsj2500",
"beesr@bee.localdomain": "beesrsj2500",
"mind-dragon@nous.research": "Mind-Dragon",
"juntingpublic@gmail.com": "JustinUssuri",
"mtf201013@gmail.com": "ma-pony",
"sonoyuncudmr@gmail.com": "Sonoyunchu",
"43525405+yatesjalex@users.noreply.github.com": "yatesjalex",
@@ -97,6 +114,8 @@ AUTHOR_MAP = {
"web3blind@users.noreply.github.com": "web3blind",
"julia@alexland.us": "alexg0bot",
"christian@scheid.tech": "scheidti",
# Moonshot schema anyOf+enum salvage (May 2026)
"git@local.invalid": "hendrixfreire",
"1060770+benjaminsehl@users.noreply.github.com": "benjaminsehl",
"nerijusn76@gmail.com": "Nerijusas",
"itonov@proton.me": "Ito-69",
@@ -109,6 +128,7 @@ AUTHOR_MAP = {
"foxion37@gmail.com": "foxion37",
"bloodcarter@gmail.com": "bloodcarter",
"scott@scotttrinh.com": "scotttrinh",
"quocanh261997@gmail.com": "quocanh261997",
# contributors (from noreply pattern)
"david.vv@icloud.com": "davidvv",
"wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
@@ -164,6 +184,7 @@ AUTHOR_MAP = {
"sir_even@icloud.com": "sirEven",
"36056348+sirEven@users.noreply.github.com": "sirEven",
"70424851+insecurejezza@users.noreply.github.com": "insecurejezza",
"jezzahehn@gmail.com": "JezzaHehn",
"254021826+dodo-reach@users.noreply.github.com": "dodo-reach",
"259807879+Bartok9@users.noreply.github.com": "Bartok9",
"270082434+crayfish-ai@users.noreply.github.com": "crayfish-ai",
@@ -289,6 +310,7 @@ AUTHOR_MAP = {
"154585401+LeonSGP43@users.noreply.github.com": "LeonSGP43",
"12250313+Kailigithub@users.noreply.github.com": "Kailigithub",
"mgparkprint@gmail.com": "vlwkaos",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"wangshengyang2004@163.com": "Wangshengyang2004",
@@ -327,6 +349,7 @@ AUTHOR_MAP = {
"stefan@dimagents.ai": "dimitrovi",
"hermes@noushq.ai": "benbarclay",
"chinmingcock@gmail.com": "ChimingLiu",
"allard.quek@singtel.com": "AllardQuek",
"openclaw@sparklab.ai": "openclaw",
"semihcvlk53@gmail.com": "Himess",
"erenkar950@gmail.com": "erenkarakus",
@@ -348,6 +371,10 @@ AUTHOR_MAP = {
"xowiekk@gmail.com": "Xowiek",
"1243352777@qq.com": "zons-zhaozhy",
"e.silacandmr@gmail.com": "Es1la",
"h3057183414@gmail.com": "CoreyNoDream",
"franksong2702@gmail.com": "franksong2702",
"673088860@qq.com": "ambition0802",
"beibei1988@proton.me": "beibi9966",
# ── bulk addition: 75 emails resolved via API, PR salvage bodies, noreply
# crossref, and GH contributor list matching (April 2026 audit) ──
"1115117931@qq.com": "aaronagent",
@@ -419,6 +446,8 @@ AUTHOR_MAP = {
"ogzerber@users.noreply.github.com": "ogzerber",
"cola-runner@users.noreply.github.com": "cola-runner",
"ygd58@users.noreply.github.com": "ygd58",
"45554392+warabe1122@users.noreply.github.com": "warabe1122",
"187001140+willy-scr@users.noreply.github.com": "willy-scr",
"vominh1919@users.noreply.github.com": "vominh1919",
"iamagenius00@users.noreply.github.com": "iamagenius00",
"9219265+cresslank@users.noreply.github.com": "cresslank",
@@ -443,6 +472,7 @@ AUTHOR_MAP = {
"taosiyuan163@153.com": "taosiyuan163",
"tesseracttars@gmail.com": "tesseracttars-creator",
"tianliangjay@gmail.com": "xingkongliang",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"unayung@gmail.com": "Unayung",
@@ -488,9 +518,11 @@ AUTHOR_MAP = {
"hubin_ll@qq.com": "LLQWQ",
"memosr_email@gmail.com": "memosr",
"jperlow@gmail.com": "perlowja",
"jasonpette1783@gmail.com": "web-dev0521",
"tangyuanjc@JCdeAIfenshendeMac-mini.local": "tangyuanjc",
"harryplusplus@gmail.com": "harryplusplus",
"anthhub@163.com": "anthhub",
"allard.quek@singtel.com": "AllardQuek",
"shenuu@gmail.com": "shenuu",
"xiayh17@gmail.com": "xiayh0107",
"zhujianxyz@gmail.com": "opriz",
@@ -626,6 +658,22 @@ AUTHOR_MAP = {
"164839249+Joseph19820124@users.noreply.github.com": "Joseph19820124",
"rugved@lmstudio.ai": "rugvedS07",
"44333070+Heltman@users.noreply.github.com": "Heltman",
# v0.12.0 additions
"ching@kachingappz.com": "ching-kaching",
"codezhujr@gmail.com": "Zjianru", # salvage chain: code by codez, PR #15749 author @Zjianru
"daimon@noreply.github.com": "Siddharth Balyan", # co-author only
"i@zkl2333.com": "zkl2333",
"isaachuang@Isaacs-MacBook-Pro.local": "isaachuangGMICLOUD",
"isaachuang@Mac.localdomain": "isaachuangGMICLOUD", # salvage of PR #11955 → #16663
"liyuan851277048@icloud.com": "Octopus", # co-author only
"me+github7604@versun.org": "Versun", # co-author only
"my.vesper.nine@gmail.com": "kevin-ho", # salvage: PR #15488 author @kevin-ho
"noreply@paperclip.ing": "Paperclip", # co-author only
"teknium@hermes-agent": "teknium1",
"web3blind@gmail.com": "web3blind",
"ztzheng@163.com": "chengoak", # PR #17467
"24110240104@m.fudan.edu.cn": "YuShu", # co-author only
"simantak@mac.local": "simantak-dabhade", # PR #6329
}
+152
View File
@@ -0,0 +1,152 @@
---
name: kanban-orchestrator
description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, orchestration, routing]
related_skills: [kanban-worker]
---
# Kanban Orchestrator — Decomposition Playbook
> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.
## When to use the board (vs. just doing the work)
Create Kanban tasks when any of these are true:
1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
2. **The work should survive a crash or restart.** Long-running, recurring, or important.
3. **The user might want to interject.** Human-in-the-loop at any step.
4. **Multiple subtasks can run in parallel.** Fan-out for speed.
5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
6. **The audit trail matters.** Board rows persist in SQLite forever.
If *none* of those apply — it's a small one-shot reasoning task — use `delegate_task` instead or answer the user directly.
## The anti-temptation rules
Your job description says "route, don't execute." The rules that enforce that:
- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly" — stop and create a task for the right specialist.
- **For any concrete task, create a Kanban task and assign it.** Every single time.
- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
- **Decompose, route, and summarize — that's the whole job.**
## The standard specialist roster (convention)
Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has — ask if you're unsure.
| Profile | Does | Typical workspace |
|---|---|---|
| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
| `backend-eng` | Writes server-side code | `worktree` |
| `frontend-eng` | Writes client-side code | `worktree` |
| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
| `pm` | Writes specs, acceptance criteria | `scratch` |
## Decomposition playbook
### Step 1 — Understand the goal
Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.
### Step 2 — Sketch the task graph
Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":
```
T1 researcher research: Postgres cost vs current
T2 researcher research: Postgres performance vs current
T3 analyst synthesize migration recommendation parents: T1, T2
T4 writer draft decision memo parents: T3
```
Show this to the user. Let them correct it before you create anything.
### Step 3 — Create tasks and link
```python
t1 = kanban_create(
title="research: Postgres cost vs current",
assignee="researcher",
body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
tenant=os.environ.get("HERMES_TENANT"),
)["task_id"]
t2 = kanban_create(
title="research: Postgres performance vs current",
assignee="researcher",
body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
)["task_id"]
t3 = kanban_create(
title="synthesize migration recommendation",
assignee="analyst",
body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
parents=[t1, t2],
)["task_id"]
t4 = kanban_create(
title="draft decision memo",
assignee="writer",
body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
parents=[t3],
)["task_id"]
```
`parents=[...]` gates promotion — children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination needed; the dispatcher and dependency engine handle it.
### Step 4 — Complete your own task
If you were spawned as a task yourself (e.g. `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:
```python
kanban_complete(
summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
metadata={
"task_graph": {
"T1": {"assignee": "researcher", "parents": []},
"T2": {"assignee": "researcher", "parents": []},
"T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
"T4": {"assignee": "writer", "parents": ["T3"]},
},
},
)
```
### Step 5 — Report back to the user
Tell them what you created in plain prose:
> I've queued 4 tasks:
> - **T1** (researcher): cost comparison
> - **T2** (researcher): performance comparison, in parallel with T1
> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
> - **T4** (writer): turns T3 into a CTO memo
>
> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.
## Common patterns
**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage's `parents=[previous_task]`. Reviewer blocks or completes; if reviewer blocks, the operator unblocks with feedback and respawns.
**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. Dispatcher serializes — translator processes them in priority order, accumulating experience in their own memory.
**Human-in-the-loop:** Any task can `kanban_block()` to wait for input. Dispatcher respawns after `/unblock`. The comment thread carries the full context.
## Pitfalls
**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task — don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.
**Argument order for links.** `kanban_link(parent_id=..., child_id=...)` — parent first. Mixing them up demotes the wrong task to `todo`.
**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.
**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
+134
View File
@@ -0,0 +1,134 @@
---
name: kanban-worker
description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
related_skills: [kanban-orchestrator]
---
# Kanban Worker — Pitfalls and Examples
> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker` — it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.
## Workspace handling
Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:
| Kind | What it is | How to work |
|---|---|---|
| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |
## Tenant isolation
If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:
- Good: `business-a: Acme is our biggest customer`
- Bad (leaks): `Acme is our biggest customer`
## Good summary + metadata shapes
The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:
**Coding task:**
```python
kanban_complete(
summary="shipped rate limiter — token bucket, keys on user_id with IP fallback, 14 tests pass",
metadata={
"changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
"tests_run": 14,
"tests_passed": 14,
"decisions": ["user_id primary, IP fallback for unauthenticated requests"],
},
)
```
**Research task:**
```python
kanban_complete(
summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, Tensorrt-LLM on memory efficiency",
metadata={
"sources_read": 12,
"recommendation": "vLLM",
"benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
},
)
```
**Review task:**
```python
kanban_complete(
summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
metadata={
"pr_number": 123,
"findings": [
{"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
{"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
],
"approved": False,
},
)
```
Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.
## Block reasons that get answered fast
Bad: `"stuck"` — the human has no context.
Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.
```python
kanban_comment(
task_id=os.environ["HERMES_KANBAN_TASK"],
body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
)
kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
```
The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.
## Heartbeats worth sending
Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.
Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Every few minutes max; skip entirely for tasks under ~2 minutes.
## Retry scenarios
If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:
- `outcome: "timed_out"` — the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
- `outcome: "crashed"` — OOM or segfault. Reduce memory footprint.
- `outcome: "spawn_failed"` + `error: "..."` — usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
- `outcome: "reclaimed"` + `summary: "task archived..."` — operator archived the task out from under the previous run; you probably shouldn't be running at all, check status carefully.
- `outcome: "blocked"` — a previous attempt blocked; the unblock comment should be in the thread by now.
## Do NOT
- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
- Create follow-up tasks assigned to yourself — assign to the right specialist.
- Complete a task you didn't actually finish. Block it instead.
## Pitfalls
**Task state can change between dispatch and your startup.** Between when the dispatcher claimed and when your process actually booted, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop — you shouldn't be running.
**Workspace may have stale artifacts.** Especially `dir:` and `worktree` workspaces can have files from previous runs. Read the comment thread — it usually explains why you're running again and what state the workspace is in.
**Don't rely on the CLI when the guidance is available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.
## CLI fallback (for scripting)
Every tool has a CLI equivalent for human operators and scripts:
- `kanban_show``hermes kanban show <id> --json`
- `kanban_complete``hermes kanban complete <id> --summary "..." --metadata '{...}'`
- `kanban_block``hermes kanban block <id> "reason"`
- `kanban_create``hermes kanban create "title" --assignee <profile> [--parent <id>]`
- etc.
Use the tools from inside an agent; the CLI exists for the human at the terminal.
+6 -5
View File
@@ -124,7 +124,7 @@ class TestMcpRegistrationE2E:
mock_conn.request_permission = AsyncMock()
acp_agent._conn = mock_conn
def mock_run_conversation(user_message, conversation_history=None, task_id=None):
def mock_run_conversation(user_message, conversation_history=None, task_id=None, **kwargs):
"""Simulate an agent turn that calls terminal, gets a result, then responds."""
agent = state.agent
@@ -178,9 +178,10 @@ class TestMcpRegistrationE2E:
complete_event = completions[0]
assert isinstance(complete_event, ToolCallProgress)
assert complete_event.status == "completed"
# rawOutput should contain the tool result string
assert complete_event.raw_output is not None
assert "hello" in str(complete_event.raw_output)
# Completion should contain human-readable output rather than forcing raw JSON panes.
assert complete_event.content
assert "hello" in complete_event.content[0].content.text
assert complete_event.raw_output is None
def test_patch_mode_tool_start_emits_diff_blocks_for_v4a_patch(self):
update = build_tool_start(
@@ -213,7 +214,7 @@ class TestMcpRegistrationE2E:
mock_conn.request_permission = AsyncMock()
acp_agent._conn = mock_conn
def mock_run(user_message, conversation_history=None, task_id=None):
def mock_run(user_message, conversation_history=None, task_id=None, **kwargs):
agent = state.agent
# Fire two tool calls
if agent.tool_progress_callback:
+187 -8
View File
@@ -27,7 +27,10 @@ from acp.schema import (
SetSessionModeResponse,
SessionInfo,
TextContentBlock,
ToolCallProgress,
ToolCallStart,
Usage,
UsageUpdate,
UserMessageChunk,
)
from acp_adapter.server import HermesACPAgent, HERMES_VERSION
@@ -200,6 +203,8 @@ class TestSessionOps:
"context",
"reset",
"compact",
"steer",
"queue",
"version",
]
model_cmd = next(
@@ -208,6 +213,46 @@ class TestSessionOps:
assert model_cmd.input is not None
assert model_cmd.input.root.hint == "model name to switch to"
def test_build_usage_update_for_zed_context_indicator(self, agent, mock_manager):
state = mock_manager.create_session(cwd="/tmp")
state.history = [{"role": "user", "content": "hello"}]
state.agent.context_compressor = MagicMock(context_length=100_000)
state.agent._cached_system_prompt = "system"
state.agent.tools = [{"type": "function", "function": {"name": "demo"}}]
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=25_000,
):
update = agent._build_usage_update(state)
assert isinstance(update, UsageUpdate)
assert update.session_update == "usage_update"
assert update.size == 100_000
assert update.used == 25_000
@pytest.mark.asyncio
async def test_send_usage_update_to_client(self, agent, mock_manager):
state = mock_manager.create_session(cwd="/tmp")
state.agent.context_compressor = MagicMock(context_length=100_000)
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=25_000,
):
await agent._send_usage_update(state)
mock_conn.session_update.assert_awaited_once()
call = mock_conn.session_update.await_args
assert call.kwargs["session_id"] == state.session_id
update = call.kwargs["update"]
assert isinstance(update, UsageUpdate)
assert update.size == 100_000
assert update.used == 25_000
@pytest.mark.asyncio
async def test_cancel_sets_event(self, agent):
resp = await agent.new_session(cwd=".")
@@ -238,11 +283,31 @@ class TestSessionOps:
{"role": "system", "content": "hidden system"},
{"role": "user", "content": "what controls the / slash commands?"},
{"role": "assistant", "content": "HermesACPAgent._ADVERTISED_COMMANDS controls them."},
{"role": "tool", "content": "tool output should not replay"},
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "call_search_1",
"type": "function",
"function": {
"name": "search_files",
"arguments": '{"pattern":"slash commands","path":"."}',
},
}
],
},
{
"role": "tool",
"tool_call_id": "call_search_1",
"content": '{"total_count":1,"matches":[{"path":"cli.py","line":42,"content":"slash commands"}]}',
},
]
mock_conn.session_update.reset_mock()
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
assert isinstance(resp, LoadSessionResponse)
calls = mock_conn.session_update.await_args_list
@@ -257,6 +322,21 @@ class TestSessionOps:
assert isinstance(replay_calls[1].kwargs["update"], AgentMessageChunk)
assert replay_calls[1].kwargs["update"].content.text.startswith("HermesACPAgent")
tool_updates = [
call.kwargs["update"]
for call in calls
if getattr(call.kwargs.get("update"), "session_update", None)
in {"tool_call", "tool_call_update"}
]
assert len(tool_updates) == 2
assert isinstance(tool_updates[0], ToolCallStart)
assert tool_updates[0].tool_call_id == "call_search_1"
assert tool_updates[0].title == "search: slash commands"
assert isinstance(tool_updates[1], ToolCallProgress)
assert tool_updates[1].tool_call_id == "call_search_1"
assert "Search results" in tool_updates[1].content[0].content.text
assert "cli.py:42" in tool_updates[1].content[0].content.text
@pytest.mark.asyncio
async def test_resume_session_replays_persisted_history_to_client(self, agent):
mock_conn = MagicMock(spec=acp.Client)
@@ -269,6 +349,8 @@ class TestSessionOps:
mock_conn.session_update.reset_mock()
resp = await agent.resume_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
assert isinstance(resp, ResumeSessionResponse)
updates = [call.kwargs["update"] for call in mock_conn.session_update.await_args_list]
@@ -278,6 +360,27 @@ class TestSessionOps:
for update in updates
)
@pytest.mark.asyncio
async def test_load_session_schedules_history_replay_after_response(self, agent):
"""Zed only attaches replayed updates after session/load has completed."""
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [{"role": "user", "content": "hello from history"}]
events = []
async def replay_after_response(_state):
events.append("replay")
with patch.object(agent, "_replay_session_history", side_effect=replay_after_response):
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
events.append("returned")
assert isinstance(resp, LoadSessionResponse)
assert events == ["returned"]
await asyncio.sleep(0)
await asyncio.sleep(0)
assert events == ["returned", "replay"]
@pytest.mark.asyncio
async def test_resume_session_creates_new_if_missing(self, agent):
resume_resp = await agent.resume_session(cwd="/tmp", session_id="nonexistent")
@@ -522,6 +625,11 @@ class TestPrompt:
assert isinstance(resp, PromptResponse)
assert resp.stop_reason == "end_turn"
state.agent.run_conversation.assert_called_once()
assert state.agent.tool_progress_callback is not None
assert state.agent.step_callback is not None
assert state.agent.stream_delta_callback is not None
assert state.agent.reasoning_callback is not None
assert state.agent.thinking_callback is None
@pytest.mark.asyncio
async def test_prompt_updates_history(self, agent):
@@ -565,12 +673,40 @@ class TestPrompt:
prompt = [TextContentBlock(type="text", text="help me")]
await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
# session_update should have been called with the final message
# session_update should include the final message (usage_update may follow it)
mock_conn.session_update.assert_called()
# Get the last call's update argument
last_call = mock_conn.session_update.call_args_list[-1]
update = last_call[1].get("update") or last_call[0][1]
assert update.session_update == "agent_message_chunk"
updates = [
call.kwargs.get("update") or call.args[1]
for call in mock_conn.session_update.call_args_list
]
assert any(update.session_update == "agent_message_chunk" for update in updates)
@pytest.mark.asyncio
async def test_prompt_does_not_duplicate_streamed_final_message(self, agent):
"""If ACP already streamed response chunks, final_response should not be sent again."""
new_resp = await agent.new_session(cwd=".")
state = agent.session_manager.get_session(new_resp.session_id)
def mock_run(*args, **kwargs):
state.agent.stream_delta_callback("streamed answer")
return {"final_response": "streamed answer", "messages": []}
state.agent.run_conversation = mock_run
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
prompt = [TextContentBlock(type="text", text="hello")]
await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
updates = [
call.kwargs.get("update") or call.args[1]
for call in mock_conn.session_update.call_args_list
]
agent_chunks = [update for update in updates if update.session_update == "agent_message_chunk"]
assert len(agent_chunks) == 1
assert agent_chunks[0].content.text == "streamed answer"
@pytest.mark.asyncio
async def test_prompt_auto_titles_session(self, agent):
@@ -708,6 +844,43 @@ class TestSlashCommands:
assert "2 messages" in result
assert "user: 1" in result
def test_context_shows_usage_and_compression_threshold(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
state.agent.context_compressor = MagicMock(
context_length=100_000,
threshold_tokens=80_000,
)
state.agent._cached_system_prompt = "system"
state.agent.tools = [{"type": "function", "function": {"name": "demo"}}]
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=25_000,
):
result = agent._handle_slash_command("/context", state)
assert "Context usage: ~25,000 / 100,000 tokens (25.0%)" in result
assert "Compression: ~55,000 tokens until threshold (~80,000, 80%)" in result
assert "Tip: run /compact" in result
def test_context_says_compression_due_when_past_threshold(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
state.agent.context_compressor = MagicMock(
context_length=100_000,
threshold_tokens=80_000,
)
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=82_000,
):
result = agent._handle_slash_command("/context", state)
assert "Context usage: ~82,000 / 100,000 tokens (82.0%)" in result
assert "Compression: due now (threshold ~80,000, 80%). Run /compact." in result
def test_reset_clears_history(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
@@ -730,6 +903,7 @@ class TestSlashCommands:
]
state.agent.compression_enabled = True
state.agent._cached_system_prompt = "system"
state.agent.tools = None
original_session_db = object()
state.agent._session_db = original_session_db
@@ -746,7 +920,7 @@ class TestSlashCommands:
with (
patch.object(agent.session_manager, "save_session") as mock_save,
patch(
"agent.model_metadata.estimate_messages_tokens_rough",
"agent.model_metadata.estimate_request_tokens_rough",
side_effect=[40, 12],
),
):
@@ -786,7 +960,12 @@ class TestSlashCommands:
resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
assert resp.stop_reason == "end_turn"
mock_conn.session_update.assert_called_once()
updates = [
call.kwargs.get("update") or call.args[1]
for call in mock_conn.session_update.call_args_list
]
assert any(update.session_update == "agent_message_chunk" for update in updates)
assert any(update.session_update == "usage_update" for update in updates)
@pytest.mark.asyncio
async def test_unknown_slash_falls_through_to_llm(self, agent, mock_manager):
+75
View File
@@ -8,6 +8,7 @@ from types import SimpleNamespace
import pytest
from unittest.mock import MagicMock, patch
from acp_adapter import session as acp_session
from acp_adapter.session import SessionManager, SessionState
from hermes_state import SessionDB
@@ -42,6 +43,27 @@ class TestCreateSession:
state = manager.create_session(cwd="/tmp/work")
assert calls == [(state.session_id, "/tmp/work")]
def test_register_task_cwd_translates_windows_drive_for_wsl_tools(self, monkeypatch):
captured = {}
def fake_register_task_env_overrides(task_id, overrides):
captured["task_id"] = task_id
captured["overrides"] = overrides
monkeypatch.setattr("hermes_constants._wsl_detected", True)
monkeypatch.setattr(
"tools.terminal_tool.register_task_env_overrides",
fake_register_task_env_overrides,
)
acp_session._register_task_cwd("session-1", r"E:\Projects\AI\paperclip")
assert captured == {
"task_id": "session-1",
"overrides": {"cwd": "/mnt/e/Projects/AI/paperclip"},
}
def test_session_ids_are_unique(self, manager):
s1 = manager.create_session()
s2 = manager.create_session()
@@ -56,6 +78,59 @@ class TestCreateSession:
assert manager.get_session("does-not-exist") is None
# ---------------------------------------------------------------------------
# WSL cwd translation
# ---------------------------------------------------------------------------
class TestWslCwdTranslation:
def test_translate_acp_cwd_converts_windows_drive_path_when_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd(r"E:\Projects\AI\paperclip") == "/mnt/e/Projects/AI/paperclip"
def test_translate_acp_cwd_handles_forward_slashes_when_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd("D:/work/project") == "/mnt/d/work/project"
def test_translate_acp_cwd_leaves_windows_drive_path_unchanged_off_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", False)
assert acp_session._translate_acp_cwd(r"E:\Projects\AI\paperclip") == r"E:\Projects\AI\paperclip"
def test_translate_acp_cwd_leaves_posix_path_unchanged_on_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd("/mnt/e/Projects/AI/paperclip") == "/mnt/e/Projects/AI/paperclip"
def test_create_session_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
state = manager.create_session(cwd=r"E:\Projects\AI\paperclip")
assert state.cwd == "/mnt/e/Projects/AI/paperclip"
def test_fork_session_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
original = manager.create_session(cwd="/tmp/base")
forked = manager.fork_session(original.session_id, cwd=r"D:\work\project")
assert forked is not None
assert forked.cwd == "/mnt/d/work/project"
def test_update_cwd_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
state = manager.create_session(cwd="/tmp/old")
updated = manager.update_cwd(state.session_id, cwd=r"C:\Users\foo\project")
assert updated is not None
assert updated.cwd == "/mnt/c/Users/foo/project"
# ---------------------------------------------------------------------------
# fork
# ---------------------------------------------------------------------------
+232 -5
View File
@@ -52,6 +52,12 @@ class TestToolKindMap:
def test_tool_kind_execute_code(self):
assert get_tool_kind("execute_code") == "execute"
def test_tool_kind_todo(self):
assert get_tool_kind("todo") == "other"
def test_tool_kind_skill_view(self):
assert get_tool_kind("skill_view") == "read"
def test_tool_kind_browser_navigate(self):
assert get_tool_kind("browser_navigate") == "fetch"
@@ -110,6 +116,25 @@ class TestBuildToolTitle:
title = build_tool_title("web_search", {"query": "python asyncio"})
assert "python asyncio" in title
def test_skill_view_title_includes_skill_name(self):
title = build_tool_title("skill_view", {"name": "github-pitfalls"})
assert title == "skill view (github-pitfalls)"
def test_skill_view_title_includes_linked_file(self):
title = build_tool_title("skill_view", {"name": "github-pitfalls", "file_path": "references/api.md"})
assert title == "skill view (github-pitfalls/references/api.md)"
def test_execute_code_title_includes_first_code_line(self):
title = build_tool_title("execute_code", {"code": "\nfrom hermes_tools import terminal\nprint('done')"})
assert title == "python: from hermes_tools import terminal"
def test_skill_manage_title_includes_action_and_target(self):
title = build_tool_title(
"skill_manage",
{"action": "patch", "name": "hermes-agent-operations", "file_path": "references/acp.md"},
)
assert title == "skill patch: hermes-agent-operations/references/acp.md"
def test_unknown_tool_uses_name(self):
title = build_tool_title("some_new_tool", {"foo": "bar"})
assert title == "some_new_tool"
@@ -164,15 +189,23 @@ class TestBuildToolStart:
assert "ls -la /tmp" in text
def test_build_tool_start_for_read_file(self):
"""read_file should include the path in content."""
"""read_file start should stay compact; completion carries file contents."""
args = {"path": "/etc/hosts", "offset": 1, "limit": 50}
result = build_tool_start("tc-3", "read_file", args)
assert isinstance(result, ToolCallStart)
assert result.kind == "read"
assert len(result.content) >= 1
content_item = result.content[0]
assert isinstance(content_item, ContentToolCallContent)
assert "/etc/hosts" in content_item.content.text
assert result.content is None
assert result.raw_input is None
def test_build_tool_start_for_web_extract_is_compact(self):
"""web_extract start should stay compact; title identifies URLs."""
args = {"urls": ["https://example.com/docs"]}
result = build_tool_start("tc-web-start", "web_extract", args)
assert isinstance(result, ToolCallStart)
assert result.title == "extract: https://example.com/docs"
assert result.kind == "fetch"
assert result.content is None
assert result.raw_input is None
def test_build_tool_start_for_search(self):
"""search_files should include pattern in content."""
@@ -181,6 +214,48 @@ class TestBuildToolStart:
assert isinstance(result, ToolCallStart)
assert result.kind == "search"
assert "TODO" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_todo_is_human_readable(self):
args = {"todos": [{"id": "one", "content": "Fix ACP rendering", "status": "in_progress"}]}
result = build_tool_start("tc-todo", "todo", args)
assert result.title == "todo (1 item)"
assert "Fix ACP rendering" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_skill_view_is_human_readable(self):
result = build_tool_start("tc-skill", "skill_view", {"name": "github-pitfalls"})
assert result.title == "skill view (github-pitfalls)"
assert "github-pitfalls" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_execute_code_shows_code_preview(self):
result = build_tool_start("tc-code", "execute_code", {"code": "print('hello')"})
assert result.kind == "execute"
assert result.title == "python: print('hello')"
assert "```python" in result.content[0].content.text
assert "print('hello')" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_skill_manage_patch_shows_diff(self):
result = build_tool_start(
"tc-skill-manage",
"skill_manage",
{
"action": "patch",
"name": "hermes-agent-operations",
"file_path": "references/acp.md",
"old_string": "old advice",
"new_string": "new advice",
},
)
assert result.kind == "edit"
assert result.title == "skill patch: hermes-agent-operations/references/acp.md"
assert isinstance(result.content[0], FileEditToolCallContent)
assert result.content[0].path == "skills/hermes-agent-operations/references/acp.md"
assert result.content[0].old_text == "old advice"
assert result.content[0].new_text == "new advice"
assert result.raw_input is None
def test_build_tool_start_generic_fallback(self):
"""Unknown tools should get a generic text representation."""
@@ -205,6 +280,158 @@ class TestBuildToolComplete:
content_item = result.content[0]
assert isinstance(content_item, ContentToolCallContent)
assert "total 42" in content_item.content.text
assert result.raw_output is None
def test_build_tool_complete_for_todo_is_checklist(self):
result = build_tool_complete(
"tc-todo",
"todo",
'{"todos":[{"id":"a","content":"Inspect ACP","status":"completed"},{"id":"b","content":"Patch renderers","status":"in_progress"}],"summary":{"total":2,"pending":0,"in_progress":1,"completed":1,"cancelled":0}}',
)
text = result.content[0].content.text
assert "✅ Inspect ACP" in text
assert "- 🔄 Patch renderers" in text
assert "**Progress:** 1 completed, 1 in progress, 0 pending" in text
assert result.raw_output is None
def test_build_tool_complete_for_skill_view_summarizes_content_without_raw_json(self):
result = build_tool_complete(
"tc-skill",
"skill_view",
'{"success":true,"name":"github-pitfalls","description":"GitHub gotchas","content":"# GitHub Pitfalls\\nUse gh carefully.","path":"github/github-pitfalls/SKILL.md"}',
)
text = result.content[0].content.text
assert "**Skill loaded**" in text
assert "`github-pitfalls`" in text
assert "GitHub gotchas" in text
assert "GitHub Pitfalls" in text
assert "Use gh carefully" not in text
assert "Full skill content is available to the agent" in text
assert result.raw_output is None
def test_build_tool_complete_for_execute_code_formats_output(self):
result = build_tool_complete("tc-code", "execute_code", '{"output":"hello\\n","exit_code":0}')
text = result.content[0].content.text
assert "Exit code: 0" in text
assert "hello" in text
assert result.raw_output is None
def test_build_tool_complete_for_skill_manage_summarizes_without_raw_json(self):
result = build_tool_complete(
"tc-skill-manage",
"skill_manage",
'{"success":true,"message":"Patched references/hermes-acp-zed-rendering.md in skill \'hermes-agent-operations\' (1 replacement)."}',
function_args={
"action": "patch",
"name": "hermes-agent-operations",
"file_path": "references/hermes-acp-zed-rendering.md",
},
)
text = result.content[0].content.text
assert "**✅ Skill updated**" in text
assert "`patch`" in text
assert "`hermes-agent-operations`" in text
assert "references/hermes-acp-zed-rendering.md" in text
assert "{\"success\"" not in text
assert result.raw_output is None
def test_build_tool_complete_for_read_file_formats_content(self):
result = build_tool_complete(
"tc-read",
"read_file",
'{"content":"1|hello\\n2|world","total_lines":2}',
function_args={"path":"README.md","offset":1,"limit":20},
)
text = result.content[0].content.text
assert "Read README.md" in text
assert "```\n1|hello\n2|world\n```" in text
assert result.raw_output is None
def test_build_tool_complete_for_search_files_formats_matches(self):
result = build_tool_complete(
"tc-search",
"search_files",
'{"total_count":2,"matches":[{"path":"README.md","line":3,"content":"TODO: fix this"},{"path":"src/app.py","line":9,"content":"needle"}],"truncated":true}\n\n[Hint: Results truncated. Use offset=12 to see more.]',
)
text = result.content[0].content.text
assert "Search results" in text
assert "Found 2 matches" in text
assert "README.md:3" in text
assert "TODO: fix this" in text
assert "Results truncated" in text
assert result.raw_output is None
def test_build_tool_complete_for_process_list_formats_table(self):
result = build_tool_complete(
"tc-process",
"process",
'{"processes":[{"session_id":"p1","status":"running","pid":123,"command":"npm run dev"}]}',
function_args={"action":"list"},
)
text = result.content[0].content.text
assert "Processes: 1" in text
assert "`p1`" in text
assert "npm run dev" in text
assert result.raw_output is None
def test_build_tool_complete_for_delegate_task_summarizes_children(self):
result = build_tool_complete(
"tc-delegate",
"delegate_task",
'{"results":[{"task_index":0,"status":"completed","summary":"Reviewed ACP rendering.","model":"gpt-5.5","duration_seconds":3.2,"tool_trace":[{"tool":"read_file"}]}],"total_duration_seconds":3.4}',
)
text = result.content[0].content.text
assert "Delegation results: 1 task" in text
assert "Reviewed ACP rendering" in text
assert "gpt-5.5" in text
assert "Tools: read_file" in text
assert result.raw_output is None
def test_build_tool_complete_for_session_search_recent(self):
result = build_tool_complete(
"tc-session",
"session_search",
'{"success":true,"mode":"recent","results":[{"session_id":"s1","title":"ACP work","last_active":"2026-05-02","message_count":12,"preview":"Polished tool rendering."}],"count":1}',
)
text = result.content[0].content.text
assert "Recent sessions" in text
assert "ACP work" in text
assert "Polished tool rendering" in text
assert result.raw_output is None
def test_build_tool_complete_for_memory_avoids_dumping_entries(self):
result = build_tool_complete(
"tc-memory",
"memory",
'{"success":true,"target":"user","entries":["private long memory"],"usage":"1% — 19/2000 chars","entry_count":1,"message":"Entry added."}',
function_args={"action":"add","target":"user","content":"User likes concise ACP rendering."},
)
text = result.content[0].content.text
assert "Memory add saved" in text
assert "User likes concise ACP rendering" in text
assert "private long memory" not in text
assert result.raw_output is None
def test_build_tool_complete_for_web_extract_success_stays_compact(self):
result = build_tool_complete(
"tc-web-extract",
"web_extract",
'{"results":[{"url":"https://example.com","title":"Example","content":"# Intro\\nThis is extracted content."}]}',
)
assert result.content is None
assert result.raw_output is None
def test_build_tool_complete_for_web_extract_error_shows_error(self):
result = build_tool_complete(
"tc-web-extract-error",
"web_extract",
'{"results":[{"url":"https://example.com","title":"Example","error":"timeout"}]}',
)
text = result.content[0].content.text
assert "Web extract failed" in text
assert "https://example.com" in text
assert "timeout" in text
assert result.raw_output is None
def test_build_tool_complete_truncates_large_output(self):
"""Very large outputs should be truncated."""
+150
View File
@@ -0,0 +1,150 @@
from types import SimpleNamespace
import pytest
from acp.schema import TextContentBlock
from acp_adapter.server import HermesACPAgent
from acp_adapter.session import SessionManager
class FakeAgent:
def __init__(self):
self.model = "fake-model"
self.provider = "fake-provider"
self.enabled_toolsets = ["hermes-acp"]
self.disabled_toolsets = []
self.tools = []
self.valid_tool_names = set()
self.steers = []
self.runs = []
def steer(self, text):
self.steers.append(text)
return True
def run_conversation(self, *, user_message, conversation_history, task_id, **kwargs):
self.runs.append(user_message)
messages = list(conversation_history or [])
messages.append({"role": "user", "content": user_message})
final = f"ran: {user_message}"
messages.append({"role": "assistant", "content": final})
return {"final_response": final, "messages": messages}
class CaptureConn:
def __init__(self):
self.updates = []
async def session_update(self, *args, **kwargs):
if kwargs:
self.updates.append((kwargs.get("session_id"), kwargs.get("update")))
else:
self.updates.append((args[0], args[1]))
async def request_permission(self, *args, **kwargs):
return SimpleNamespace(outcome="allow")
class NoopDb:
def get_session(self, *_args, **_kwargs):
return None
def create_session(self, *_args, **_kwargs):
return None
def update_session(self, *_args, **_kwargs):
return None
def make_agent_and_state():
fake = FakeAgent()
manager = SessionManager(agent_factory=lambda **kwargs: fake, db=NoopDb())
acp_agent = HermesACPAgent(session_manager=manager)
state = manager.create_session(cwd=".")
conn = CaptureConn()
acp_agent.on_connect(conn)
return acp_agent, state, fake, conn
@pytest.mark.asyncio
async def test_acp_steer_slash_command_injects_into_running_agent():
acp_agent, state, fake, _conn = make_agent_and_state()
state.is_running = True
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer prefer the simpler fix")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == ["prefer the simpler fix"]
assert fake.runs == []
@pytest.mark.asyncio
async def test_acp_steer_after_zed_interrupt_replays_interrupted_prompt_with_guidance():
acp_agent, state, fake, _conn = make_agent_and_state()
state.interrupted_prompt_text = "write hi to a text file"
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer write HELLO instead")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == []
assert fake.runs == [
"write hi to a text file\n\nUser correction/guidance after interrupt: write HELLO instead"
]
assert state.interrupted_prompt_text == ""
@pytest.mark.asyncio
async def test_acp_steer_on_idle_session_runs_as_regular_prompt():
# /steer on an idle session (no running turn, nothing to salvage) should
# run the steer payload as a normal user prompt — NOT silently append it
# to state.queued_prompts. Without this, users on Zed / other ACP clients
# see their /steer turn into "queued for the next turn" when they never
# typed /queue. Matches gateway/run.py ~L4898 idle-/steer behavior.
acp_agent, state, fake, _conn = make_agent_and_state()
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer summarize the README")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == []
assert fake.runs == ["summarize the README"]
assert state.queued_prompts == []
@pytest.mark.asyncio
async def test_acp_queue_slash_command_adds_next_turn_without_running_now():
acp_agent, state, fake, _conn = make_agent_and_state()
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/queue run the tests after this")],
)
assert response.stop_reason == "end_turn"
assert state.queued_prompts == ["run the tests after this"]
assert fake.runs == []
@pytest.mark.asyncio
async def test_acp_prompt_drains_queued_turns_after_current_run():
acp_agent, state, fake, conn = make_agent_and_state()
state.queued_prompts.append("then run tests")
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="make the change")],
)
assert response.stop_reason == "end_turn"
assert fake.runs == ["make the change", "then run tests"]
assert state.queued_prompts == []
agent_messages = [u for _sid, u in conn.updates if getattr(u, "session_update", None) == "agent_message_chunk"]
assert len(agent_messages) >= 2

Some files were not shown because too many files have changed in this diff Show More