Compare commits

...

160 Commits

Author SHA1 Message Date
Teknium 4a3eac5fe1 feat: add /recap slash command — summarize recent session activity
Inspired by Claude Code's /recap (v2.1.114, April 2026). Produces a
compact text summary of recent activity in the current session:
turn counts, tools used, files touched, last user ask, and last
assistant reply. Useful when juggling multiple sessions or
returning to a session after being away.

Implementation notes:
- Pure local computation from the in-memory conversation history /
  gateway transcript. No LLM call, no auxiliary model, no prompt-cache
  invalidation — a recap should be instant and free.
- Works unchanged on CLI and every gateway platform (Telegram,
  Discord, Slack, …) via a shared hermes_cli.session_recap.build_recap
  helper. Claude Code only ships this on the CLI.
- Tailored to hermes-agent's tool vocabulary: file-editing tools
  (patch, write_file, read_file, skill_manage, skill_view) surface
  touched paths; tool-call counts highlight which classes of work
  drove the session.
- Added to ACTIVE_SESSION_BYPASS_COMMANDS and the Level-2 early
  intercept in gateway/run.py so /recap works while an agent is
  running (read-only, safe).

Source: https://code.claude.com/docs/en/whats-new/2026-w17
2026-05-01 17:10:46 -07:00
kshitijk4poor f903ceece0 chore: add contributors to AUTHOR_MAP for Slack batch salvage
Adds email→username mappings for:
- priveperfumes (PR #18456)
- amroessam (PR #17798)
- Hinotoi-agent (PR #9361)
- valda (PR #14932)
2026-05-01 14:01:26 -07:00
Amr Essam d05a87e686 fix(gateway): clear slack assistant thread status 2026-05-01 14:01:26 -07:00
hinotoi-agent a147164d3c fix(slack): preserve per-user slash-command session isolation 2026-05-01 14:01:26 -07:00
nightq 5cdc39e29a fix(gateway): preserve case-sensitive chat IDs in DeliveryTarget.parse
Fixes NousResearch/hermes-agent#11768

Root cause: target.strip().lower() was lowercasing the entire target string,
corrupting case-sensitive chat IDs like Slack C123ABC and Matrix !RoomABC.

Fix: Only lowercase the platform prefix for case-insensitive matching;
preserve the original case for chat_id and thread_id values.
2026-05-01 14:01:26 -07:00
YAMAGUCHI Seiji 2b3923ff13 fix(gateway): coerce scalar free_response_channels to str before split
YAML loads a bare numeric value such as
    discord:
      free_response_channels: 1491973769726791812
as an int.  _discord_free_response_channels() / _slack_free_response_channels()
checked `isinstance(raw, list)` and `isinstance(raw, str)` in that order and
then fell through to `return set()`, so a single-channel config that happened
to be unquoted was silently dropped with no log line — the bot kept demanding
@mentions even though the channel was configured to free-response.

A multi-channel value like `1234567890,9876543210` does not trip this because
the comma forces YAML to parse it as a string.  Single-channel configs are
the only case that breaks, which is exactly the footgun that's hardest to
diagnose (the config "looks right" and the feature just doesn't activate).

Note that the old-schema env-var bridge at gateway/config.py:614+ already
runs `str(frc)` when forwarding to SLACK_/DISCORD_FREE_RESPONSE_CHANNELS,
so the env-var fallback worked.  The bug only surfaces on the
`config.extra["free_response_channels"]` path populated by the `platforms:`
bridge at gateway/config.py:576, which passes the raw YAML value through
unchanged.

Fix at the reader: treat any non-list value as a scalar, coerce with str(),
then apply the same CSV split semantics.  This keeps the public contract
stable (list or str-like continues to work identically) while accepting
the ints that the YAML loader is free to hand us.

Added tests for both Discord and Slack covering:
  - bare int value in config.extra
  - list of ints in config.extra
2026-05-01 14:01:26 -07:00
Prive FE Coder a717199bbf fix(slack): exclude reserved Slack commands from native slash manifest
Slack has built-in slash commands (e.g. /status, /me, /join) that apps
cannot register. When running `hermes slack manifest --write`, the
generated manifest included /status, causing Slack to reject the entire
manifest with a reserved-command error.

Add _SLACK_RESERVED_COMMANDS frozenset of all known Slack built-ins and
skip them in slack_native_slashes(). Affected commands remain reachable
via /hermes <command>.

Tests updated:
- New test_excludes_slack_reserved_commands validates no leaks
- test_includes_canonical_commands no longer asserts /status
- test_telegram_parity accounts for expected Slack-only exclusions
2026-05-01 14:01:26 -07:00
kshitijk4poor 8fcc160f6b fix(gateway/slack): review fixes — scope ephemeral to commands, user isolation
Self-review fixes for the slash ephemeral ack:

- Only stash response_url when text starts with '/' (gateway command).
  Free-form questions via '/hermes <question>' must produce public agent
  replies visible to the whole channel, not ephemeral.
- Use a ContextVar (_slash_user_id) to thread the invoking user's ID
  from _handle_slash_command through to send().  _pop_slash_context now
  matches the exact (channel_id, user_id) key when the ContextVar is
  set, preventing concurrent users on the same channel from stealing
  each other's ephemeral context.  ContextVars propagate to child
  asyncio.Tasks, so the value survives through handle_message →
  _process_message_background → _send_with_retry → send().
- Add truncate_message() in _send_slash_ephemeral to prevent silent
  failures on long responses (response_url has the same ~40k limit).
- Log send_private_notice failures at debug level instead of bare
  except/pass — aids diagnostics without spamming.
- Document app_mention dedup dependency on shared event ts.
- Add tests: free-form question must NOT stash context, concurrent
  users on the same channel get isolated contexts, non-slash send()
  path fallback behavior.
2026-05-01 13:33:06 -07:00
kshitijk4poor f34d298495 chore: add probepark to AUTHOR_MAP
Required for contributor_audit.py strict mode on the salvaged
PR #9340 commit.
2026-05-01 13:33:06 -07:00
probepark 0ab2d752ff feat(gateway): private notice delivery and Slack format_message fixes
Adds platform-level private notice delivery abstraction so operational
messages (e.g. sethome prompt) can be sent ephemerally on Slack when
configured with `slack.notice_delivery: private`.

Changes:
- gateway/config.py: _normalize_notice_delivery() + GatewayConfig.get_notice_delivery()
  with per-platform config bridging
- gateway/platforms/base.py: send_private_notice() default implementation
  (falls through to send())
- gateway/platforms/slack.py: send_private_notice() via chat_postEphemeral
- gateway/run.py: _deliver_platform_notice() helper replaces direct
  adapter.send() for the sethome notice, with private→public fallback
- gateway/platforms/slack.py: app_mention handler now forwards to
  _handle_slack_message (safe due to ts-based dedup) instead of no-op pass,
  fixing edge-case Slack configs where mentions arrive only as app_mention
- gateway/platforms/slack.py format_message: negative lookbehind prevents
  markdown images (![]()) from becoming broken Slack links; italic regex
  now requires non-whitespace boundaries so 'a * b * c' stays literal

Based on PR #9340 by @probepark.
2026-05-01 13:33:06 -07:00
kshitijk4poor 7cda0e5224 fix(gateway/slack): ephemeral ack and routing for slash commands
Slack slash commands (/q, /btw, /stop, /model, etc.) previously showed
no user-visible acknowledgement and posted command replies as public
channel messages.  This diverged from Discord, which uses ephemeral
deferred responses for slash commands.

Changes:
- handle_hermes_command now passes response_type='ephemeral' and a
  'Running /cmd…' text to ack(), giving the user immediate 'Only visible
  to you' feedback when they invoke any native slash command.
- _handle_slash_command stashes the Slack response_url from the command
  payload in a per-channel context dict before dispatching to
  handle_message.
- send() checks for a pending slash context and, when found, POSTs to
  the response_url with replace_original=true to swap the initial ack
  with the real command reply (e.g. 'Queued for the next turn.'),
  keeping it ephemeral.
- Stale slash contexts are garbage-collected on lookup (120s TTL).
- The response_url POST is non-fatal: if it fails, the user already saw
  the initial ack, and send() returns success=True.

Fixes #18182
2026-05-01 13:33:06 -07:00
Jeffrey Quesnelle 0b76d23d1a makes the Persistent Goals docs accessible in the docs nav (and llms.txt) (#18481) 2026-05-01 10:29:22 -07:00
Teknium f99676e315 fix(gateway): auto-restart when source files change out from under us (#17648) (#18409)
Long-running gateway processes that survive 'hermes update' keep
pre-update modules cached in sys.modules. When new tool files on
disk then try to 'from hermes_cli.config import cfg_get' (added in
PR #17304), the import resolves against the stale module object
and raises ImportError — hitting users on Matrix, Telegram, Feishu,
and other platforms.

Two defenses:

1. Gateway self-check (gateway/run.py). On __init__, snapshot the
   newest mtime across sentinel source files (hermes_cli/config.py,
   run_agent.py, gateway/run.py, etc.). On every inbound message,
   re-read those mtimes; if any is newer than boot time + 2s slack,
   request a graceful restart via the normal drain path and return
   a one-line ack to the user. Idempotent, works regardless of how
   the update happened (hermes update, manual git pull, installer).

2. Post-restart survivor sweep ('hermes update'). After the existing
   restart loop, sleep 3s, rescan for gateway PIDs we already tried
   to kill, and SIGKILL any survivors. The detached profile watchers
   and systemd then relaunch with fresh code instead of waiting out
   the 120s watcher timeout.

Closes #17648.
2026-05-01 09:50:08 -07:00
Teknium 77c0bc6b13 fix(curator): defer first run and add --dry-run preview (#18373) (#18389)
* fix(curator): defer first run and add --dry-run preview (#18373)

Curator was meant to run 7 days after install, not on the very first
gateway tick. On a fresh install (no .curator_state), should_run_now()
returned True immediately because last_run_at was None — so the gateway
cron ticker fired Curator against a fresh skill library moments after
'hermes update'. Combined with the binary 'agent-created' provenance
model (anything not bundled and not hub-installed), this consolidated
hand-authored user workflow skills without consent.

Changes:
- should_run_now(): first observation seeds last_run_at='now' and returns
  False. The next real pass fires one full interval_hours later (7 days
  by default), matching the original design intent.
- hermes curator run --dry-run: produces the same review report without
  applying automatic transitions OR permitting the LLM to call
  skill_manage / terminal mv. A DRY-RUN banner is prepended to the
  prompt and the caller skips apply_automatic_transitions. State is
  NOT advanced so a preview doesn't defer the next scheduled real pass.
- hermes update: prints a one-liner on fresh installs pointing at
  --dry-run, pause, and the docs. Silent on steady state.
- Docs: curator.md and cli-commands.md explain the deferred first-run
  behavior and warn that hand-written SKILL.md files share the
  'agent-created' bucket, with guidance to pin or preview before the
  first pass.

Tests:
- test_first_run_defers replaces the old 'first run always eligible'
  assertion — same fixture, inverted expectation.
- test_maybe_run_curator_defers_on_fresh_install covers the gateway tick
  path end-to-end.
- Three new dry-run tests cover state-advance suppression, prompt
  banner injection, and apply_automatic_transitions skipping.

Fixes #18373.

* feat(curator): pre-run backup + rollback (#18373)

Every real curator pass now snapshots ~/.hermes/skills/ into
~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz before calling
apply_automatic_transitions or the LLM review. If a run consolidates or
archives something the user didn't want touched, 'hermes curator
rollback' restores the tree in one command. Dry-run is skipped — no
mutation means no snapshot needed.

Changes:
- agent/curator_backup.py (new): tar.gz snapshot + safe rollback. The
  snapshot excludes .curator_backups/ (would recurse) and .hub/ (managed
  by the skills hub). Extract refuses absolute paths and .. components,
  and uses tarfile's filter='data' on Python 3.12+. Rollback takes a
  pre-rollback safety snapshot FIRST, stages the current tree into
  .rollback-staging-<ts>/ so the extract lands in an empty dir, and
  cleans the staging dir on success. A failed extract restores the
  staged contents.
- agent/curator.py: run_curator_review() calls curator_backup.
  snapshot_skills(reason='pre-curator-run') before apply_automatic_
  transitions. Best-effort — a failed snapshot logs at debug and the
  run continues (a transient disk issue shouldn't silently disable
  curator forever).
- hermes_cli/curator.py: new 'hermes curator backup' and 'hermes curator
  rollback' subcommands. rollback supports --list, --id <ts>, -y.
- hermes_cli/config.py: curator.backup.{enabled, keep} config block
  with sane defaults (enabled=true, keep=5).
- Docs: curator.md gets a 'Backups and rollback' section; cli-commands
  .md table gets the new rows.

Tests (new file tests/agent/test_curator_backup.py, 16 cases):
- snapshot creates tarball + manifest with correct counts
- snapshot excludes .curator_backups/ (recursion guard) and .hub/
- snapshot disabled via config returns None without creating anything
- snapshot uniquifies ids within the same second (-01 suffix)
- prune honors keep count, newest-first
- list_backups + _resolve_backup cover newest-default and unknown-id
- rollback restores a deleted skill with content intact
- rollback is itself undoable — safety snapshot shows up in list_backups
- rollback with no snapshots returns an error
- rollback refuses tarballs with absolute paths or .. components
- real curator runs take a 'pre-curator-run' snapshot; dry-runs do not

All curator tests: 210 passing locally.
2026-05-01 09:49:59 -07:00
Siddharth Balyan c5b4c48165 fix: lazy session creation — defer DB row until first message (#18370)
Prevents ghost sessions from accumulating in state.db when the TUI/web
dashboard is opened and closed without sending a message.

Changes:
- run_agent.py: Add _ensure_db_session() gate method, called at
  run_conversation() entry. Remove eager create_session() from __init__.
  Handle compression rotation flag correctly.
- tui_gateway/server.py: Remove eager db.create_session() in
  _start_agent_build(). Add post-first-message pending_title re-apply.
- hermes_state.py: Extract _insert_session_row() shared helper (DRY).
  Add prune_empty_ghost_sessions() for one-time migration.
- cli.py: One-time ghost session prune on startup. Fix _pending_title
  to call _ensure_db_session() before set_session_title().
- hermes_cli/main.py: Guard TUI exit summary on message_count > 0.
- tests: Update test_860_dedup to call _ensure_db_session() before
  direct _flush_messages_to_session_db() calls.

Closes: ghost session clutter in hermes sessions list and web dashboard.
2026-05-01 18:39:12 +05:30
Austin Pickett 20132435c0 Merge pull request #18117 from NousResearch/austin/fix/model-selector
feat(tui): overhaul /model picker to match hermes model with inline auth
2026-05-01 05:30:05 -07:00
Austin Pickett 5ad030d19d Merge pull request #18095 from NousResearch/austin/feat/plugins-page
feat(dashboard): Plugins page — manage, enable/disable, auth status
2026-05-01 05:29:24 -07:00
Austin Pickett 05c63259b5 Merge pull request #18358 from NousResearch/fix/kanban-buton
fix: kanban button
2026-05-01 04:49:06 -07:00
Austin Pickett a01c1f7305 fix: kanban button 2026-05-01 07:33:54 -04:00
Siddharth Balyan 75e1339d4c fix(telegram): send seed message after creating DM topics (#18334)
Telegram's client does not display empty forum topics in the chat's
topic list. After createForumTopic succeeds, send a short pin message
into the new topic so it becomes immediately visible to the user.

Only fires for newly created topics (no thread_id in config yet).
Failure to send the seed is non-fatal (debug-logged, topic still works).
2026-05-01 15:21:56 +05:30
Ben Barclay 0159f25fd0 Merge pull request #18281 from NousResearch/bb/fix-tui-docker-ink-v2
fix: prevent tui rebuilding assets
2026-05-01 18:43:40 +10:00
UgwujaGeorge b7ad3f478f fix(yuanbao): enforce owner identity check on group slash commands
The bot-owner identity check inside OwnerCommandMiddleware was commented
out and replaced with a hardcoded `is_owner = True`, so any group member
could trigger allowlisted privileged commands (/approve, /deny, /stop,
/reset, /retry, /undo, /new, /background, /bg, /btw, /queue, /q) by
sending the slash command without @-mentioning the bot. The most severe
case is /approve: a non-owner could approve a dangerous tool call the
bot was waiting on the owner to confirm.

Re-enable the documented identity check (push.from_account ==
push.bot_owner_id) so only the configured owner can issue these
commands.
2026-04-30 23:57:55 -07:00
Teknium a2a32688ca docs(website): add User Stories and Use Cases collage page (#18282)
Adds a new top-of-sidebar docs page at /docs/user-stories that is a
masonry-style collage of 99 real user stories sourced from X/Twitter,
GitHub issues/PRs, Reddit, Hacker News, YouTube, blogs (Medium, Substack,
dev.to), podcasts, LinkedIn, GitHub Gists, and Product Hunt.

Every tile links to the original post/issue/video/gist where someone
described a specific use case: personal assistants, dev workflows,
trading bots, research briefs, family WhatsApp agents, Kubernetes
deployments, legal-domain self-hosted setups, and more.

- docs/user-stories.mdx: MDX entry mounting the collage component
- src/components/UserStoriesCollage: React component with category +
  source filters, CSS-columns masonry layout, per-category accent colors
- src/data/userStories.json: source-of-truth dataset (force-added; the
  root .gitignore's unanchored 'data/' rule would otherwise swallow it,
  same reason skills.json is explicitly listed in website/.gitignore)
- sidebars.ts: link added at the top of the docs sidebar
2026-04-30 23:56:59 -07:00
Ben a49f4c617d fix: prevent tui rebuilding assets 2026-05-01 16:29:46 +10:00
web-dev0521 dfe512c58d fix(paths): route achievements plugin + profile-tui through HERMES_HOME
Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME
check, breaking Docker deployments and profile isolation (hermes -p):

- plugins/hermes-achievements/dashboard/plugin_api.py:
  state_path(), snapshot_path(), checkpoint_path() bare-literal paths
- scripts/profile-tui.py:
  DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME
- hermes_cli/slack_cli.py:
  except-Exception fallback for slack-manifest.json dump
- optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py:
  --target argparse default

Use get_hermes_home() (with an ImportError shim for the standalone
scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")'
where importing hermes_constants is impractical.

E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and
both profile-tui defaults route under /tmp/x.

Salvaged from #18068 (original scope was broader mechanical cleanup
claiming 23 callsites were buggy; most were already respecting
HERMES_HOME via os.environ.get(key, default) — only these 4 had no env
check at all). Credit: @web-dev0521.
2026-04-30 23:21:54 -07:00
Teknium c6eebfc25a docs: publish llms.txt and llms-full.txt for agent-friendly ingestion (#18276)
Two machine-readable entry points to the Hermes Agent docs:

  /llms.txt         curated index of every doc page, one link per page
                    with short descriptions. ~17 KB, safe to load into
                    an LLM context window.
  /llms-full.txt    every page under website/docs/ concatenated as markdown.
                    ~1.8 MB. For one-shot ingestion by coding agents and
                    RAG pipelines.

Both files are also served from /docs/llms.txt and /docs/llms-full.txt
(Docusaurus serves website/static/ under baseUrl=/docs/). Some agents and
IDE plugins probe the classic site-root path; the deploy workflow now copies
both files to _site root so either URL works.

Conforms to the emerging llmstxt.org spec: H1 project name, blockquote
summary, short install command, GitHub link, then curated sections
mirroring the docs-site navigation (Getting Started, Using Hermes,
Features, Messaging, Integrations, Guides, Developer Guide, Reference).

Generated by website/scripts/generate-llms-txt.py. Wired into prebuild.mjs
so every 'npm run build' and 'npm run start' refreshes the files alongside
the existing skills.json extraction. Both outputs are gitignored (same
precedent as src/data/skills.json).

Descriptions in llms.txt are pulled from each page's frontmatter, so they
stay current automatically. All ~80 section slugs are validated against
the filesystem at generation time; an invalid slug would fail the prebuild.
2026-04-30 23:17:14 -07:00
Teknium cf2b2d31ce docs: add Persistent Goals (/goal) feature page (#18275)
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR #18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
2026-04-30 23:16:54 -07:00
teknium1 2af8b8ff37 fix(moonshot): also strip nullable/enum after anyOf collapse
The anyOf collapse in _repair_schema returned early, skipping the
nullable-strip and enum-cleanup steps. When a schema had anyOf
[{enum: [..., null, '']}, {type: null}] alongside a parent-level
'nullable: true', collapsing to the single non-null branch produced a
merged node that still had both 'nullable' and the bad enum values —
Moonshot would still 400 on it.

Fix: fall through to Rules 1/3 when the collapse produces a single
merged node; only return early for the multi-branch case (pure
anyOf preservation) or when there was no null branch to remove.

Adds a test that locks in the combined-case expectation.
2026-04-30 23:14:31 -07:00
teknium1 9cb5baeacf chore(release): map hendrixfreire for moonshot salvage 2026-04-30 23:14:31 -07:00
Hendrix 9ca72a69a7 fix(moonshot): fill missing type before enum cleanup to handle anyOf branches without explicit type
When a schema node inside anyOf has enum values but no explicit 'type',
Rule 3 (enum cleanup) ran before _fill_missing_type, so node_type was
None and the enum was never cleaned. Moonshot then rejected the schema
with 'enum value (<nil>) does not match any type in [string]'.

Fix: reorder operations — fill missing type first, strip nullable,
then clean enum. This ensures enum cleanup always has a type to check.

Also fixes test expectation: empty string in enum is now correctly
stripped (Moonshot rejects it too).

Closes #16875
2026-04-30 23:14:31 -07:00
Teknium 77dd6d5469 chore(release): add mikeyobrien to AUTHOR_MAP 2026-04-30 23:13:34 -07:00
Mikey O'Brien 1be3b74cfb fix(gateway): honor MATRIX_HOME_ROOM in onboarding 2026-04-30 23:13:34 -07:00
Teknium 265bd59c1d feat: /goal — persistent cross-turn goals (Ralph loop) (#18262)
Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
2026-04-30 23:10:20 -07:00
Teknium 7c6c5619a7 docs(sidebar): collapse exploding skills tree to a single Skills node (#18259)
* docs(sidebar): collapse exploding skills tree to a single Skills node

The Skills sub-tree in the left sidebar expanded to 200+ entries
(22 bundled categories + 15 optional categories, every skill a page).
That's most of the nav on a first visit — docs for the actual product
get drowned in it.

Collapse the sidebar to:

  Skills
    godmode              (hand-written spotlight)
    google-workspace     (hand-written spotlight)
    Bundled catalog      (reference/skills-catalog — table of all bundled)
    Optional catalog     (reference/optional-skills-catalog — table of all optional)

Per-skill pages still generate and are still reachable at their URLs;
they're linked from the two catalog tables and from the Skills overview
page. They just don't appear in the left nav anymore.

sidebars.ts goes from 649 lines to 247. generate-skill-docs.py loses
the bundled/optional sidebar render helpers.

Also picks up incidental generator output drift on current main
(comfyui skill content refresh; 4 new skill pages for
devops-kanban-orchestrator, devops-kanban-worker,
productivity-here-now, productivity-shopify; two catalog refreshes).
These are what the generator produces on main today — keeping them
committed avoids the next docs build showing 'working tree dirty'.

* docs(sidebar): drop godmode and google-workspace spotlight pages

Keep the Skills sidebar node strictly principled: two catalog links,
nothing else. There was no rule for which skills got spotlight pages
and which got auto-generated pages — just that these two happened to
be hand-written first.

Both pages still build and are still reachable at
/docs/user-guide/skills/godmode and
/docs/user-guide/skills/google-workspace. They're linked from the
catalog tables and the Skills overview page.

Sidebar Skills node now:
  Skills
    ├── Bundled catalog
    └── Optional catalog
2026-04-30 23:08:22 -07:00
Teknium 50c046331d feat(update): add --yes/-y flag to skip interactive prompts (#18261)
hermes update had two interactive [Y/n] prompts with no bypass:
  1. Config migration (after new env/config options are added)
  2. Autostash restore (when uncommitted work was stashed before pull)

hermes uninstall already has --yes/-y; mirrors that.

Under --yes:
  - Config-migrate prompt → auto-yes, migrate_config(interactive=False)
    so new config fields are applied but API-key prompts are skipped
    (user runs 'hermes config migrate' later for those). Matches
    gateway-mode semantics.
  - Stash-restore prompt → auto-yes, git stash apply runs automatically.

Closes the 'can I hermes update -y, No ! Fix' gap reported by @murelux.
2026-04-30 23:06:32 -07:00
Teknium 4caad285a6 feat(gateway): auto-delete slash-command system notices after TTL (#18266)
Adds opt-in auto-deletion for slash-command reply messages like
"New session started!", "Restarting gateway…", "Stopped.", and
YOLO toggles.  After the TTL elapses the gateway calls the adapter's
delete_message; on platforms without a delete API (everything except
Telegram today) the TTL is silently ignored and the message stays.

Requested on Twitter by @charlesmcdowell — tool-call bubbles are useful
real-time, but system notices clutter the thread once the agent finishes.

Implementation:

- EphemeralReply(str) sentinel in gateway/platforms/base.py.  Subclasses
  str so existing 'X' in response / response.startswith(...) checks in
  tests and call sites keep working unchanged; isinstance() still
  distinguishes it for the send path.
- _process_message_background and both busy-session bypass paths
  (in base.py) call _unwrap_ephemeral() on the handler return, send
  the unwrapped text, and schedule a detached delete task when the
  TTL > 0 AND the adapter class overrides delete_message.
- display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG.
  Handler can pass ttl_seconds explicitly to override.
- Wrapped the highest-noise return sites: /new, /reset, /stop,
  /yolo on/off, /restart success + "already in progress".  Draining
  notices and /help output left as plain strings — those are
  informational and users want to read them.

Backward-compat: default TTL 0 → no scheduling, no behavior change
for existing users.  Platforms without delete_message silently no-op.
2026-04-30 23:05:48 -07:00
Teknium e2eb561e8e fix(curator): rewrite cron job skill refs after consolidation (#18253)
When the curator consolidates skill X into umbrella Y, any cron job
that listed X in its skills field would fail to load X at run time —
the scheduler logs a warning and skips it, so the scheduled job runs
without the instructions it was scheduled to follow.

cron.jobs.rewrite_skill_refs(consolidated, pruned) now updates jobs
in-place: consolidated names route to the umbrella target (dedup
when umbrella is already present), pruned names are dropped.
agent.curator._write_run_report calls it after classification,
best-effort so a cron-side failure never breaks the curator itself.

Results are recorded in run.json (counts.cron_jobs_rewritten + full
cron_rewrites payload), a separate cron_rewrites.json for convenience
when jobs were touched, and a section in REPORT.md.

Reported by @tombielecki.
2026-04-30 23:04:50 -07:00
IMHaoyan bfb704684e fix(deepseek): use non-empty reasoning_content placeholder for V4 Pro thinking mode
DeepSeek V4 Pro tightened thinking-mode validation and rejects empty-string
reasoning_content with HTTP 400:

    The reasoning content in the thinking mode must be passed back to the API.

run_agent.py injected "" at three fallback sites — the tool-call pad in
_build_assistant_message and both injection branches of
_copy_reasoning_content_for_api (cross-provider poison guard + unconditional
thinking pad). All three now emit " " (single space), which satisfies the
non-empty check on V4 Pro without leaking fabricated reasoning.

Also upgrades stale empty-string placeholders on replay: sessions persisted
before this change have reasoning_content="" pinned at creation time; when
the active provider enforces thinking-mode echo, the replay path now rewrites
"" -> " " so existing users don't 400 on their first V4 Pro turn after
updating. Non-thinking providers still round-trip "" verbatim.

Updates 9 existing assertions + adds 2 regression tests (stale-placeholder
upgrade, non-thinking verbatim preservation).

Refs #15250, #17400.
Closes #17341.
2026-04-30 23:04:23 -07:00
Teknium f0dc919f92 fix(compression): include system prompt + tool schemas in token estimates (#18265)
The user-visible /compress banner and the post-compression last_prompt_tokens
writeback both counted only the raw message transcript (chars/4). With a 15KB
system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks
like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of
request pressure — a 234x gap.

Two user-facing consequences:
- Banner shows 'Compressing … (~45 tokens)…' while compression is actually
  firing on 10K+ tokens of real pressure, confusing users about why
  compression triggered (reported by @codecovenant on X; #6217).
- Post-compression last_prompt_tokens writeback omits tool schemas, so the
  next should_compress() check compares real usage against a stale
  underestimate — compression triggers late, potentially past the model's
  context limit on small-context models (#14695).

Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough()
at every user-visible banner and at the post-compression writeback.
estimate_request_tokens_rough() already existed for exactly this purpose
and includes system prompt + tool schemas.

Touched call sites:
- run_agent.py: post-compression last_prompt_tokens writeback, post-tool
  call should_compress() fallback when provider usage is missing
- cli.py: /compress banner + summary
- gateway/run.py: gateway /compress banner + summary
- tui_gateway/server.py: TUI /compress status + summary
- acp_adapter/server.py: ACP /compact before/after

Left intentionally alone:
- Session-hygiene fallback and the 'no agent' /status path in gateway/run.py
  — no agent instance is in scope to query for system prompt/tools, and the
  existing 30-50% overestimate wobble on hygiene is safety-accepted.
- Verbose-mode 'Request size' logging — informational only, already counts
  system prompt via api_messages[0].

Also relabels the feedback line from 'Rough transcript estimate' to
'Approx request size' so the metric label matches what it actually measures.

Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217);
user report @codecovenant on X (2026-04-30).

Closes #14695
Closes #6217
2026-04-30 23:03:54 -07:00
Teknium 41fa1f1b5c fix(acp): run /steer as a regular prompt on idle sessions (#18258)
When a user types /steer <text> on an ACP session that isn't actively
running a turn (and there's no interrupted-prompt salvage available),
_cmd_steer silently appended to state.queued_prompts and replied
"No active turn — queued for the next turn". That looks identical to
/queue output even though the user never typed /queue — @EddyLeeKhane
reported this as "/steer never works, gets queued instead".

Rewrite the payload to a plain user prompt before the slash-intercept
fires, matching the gateway's idle-/steer fallthrough in
gateway/run.py ~L4898.
2026-04-30 22:45:14 -07:00
Teknium fc78e708ed fix(update): don't crash hermes update if skill config scan fails (#18257)
`hermes update` ran the config migration (11 → 17) successfully then
crashed at `agent/skill_utils.py:340` during the post-migration
skill-config prompt. User @FlockonUS reported this on Twitter.

Root cause: `get_missing_skill_config_vars` in hermes_cli/config.py
only guarded the import of `discover_all_skill_config_vars`, not the
call. Any runtime exception inside the skill scan (malformed SKILL.md,
unreadable external skill dir, etc.) propagated up through
`migrate_config` and aborted `hermes update` after the version bump.

Wrap the call in try/except so skill-config prompting — which is a
post-migration nicety — can never block the migration itself.
2026-04-30 22:44:41 -07:00
Henkey ec1443b9f1 fix(acp): normalize Windows cwd for WSL tool execution 2026-04-30 20:55:14 -07:00
Henkey 78886365c2 fix(acp): replay interrupted prompts for steer 2026-04-30 20:54:37 -07:00
Henkey e27b0b7651 feat(acp): add steer and queue slash commands 2026-04-30 20:54:37 -07:00
Teknium 8fa44b1724 fix(guardrails): preserve display _detect_tool_failure semantics
The initial guardrail PR consolidated failure classification by pointing
display._detect_tool_failure at the new classify_tool_failure helper,
which was strictly broader: it flagged any JSON result with
"success": false / "failed": true / non-empty "error", plus plain-text
"traceback" and "error:" prefixes. That would uptick the user-visible
[error] tag on tools that return {"success": false} as a benign signal
(memory fullness, todo state, etc.) and feed the failure-streak counter
at the same time.

Restore display._detect_tool_failure to its pre-PR semantics verbatim.
Tighten classify_tool_failure (the guardrail's internal safety-fallback
used only when callers don't pass failed=) to match _detect_tool_failure
exactly, so the two never disagree. Production callers in run_agent.py
already pass an explicit failed= derived from _detect_tool_failure, so
the guardrail counter is driven by the same signal the CLI shows.
2026-04-30 20:43:15 -07:00
Mind-Dragon 0704589ceb fix(agent): make tool loop guardrails warning-first 2026-04-30 20:43:15 -07:00
Mind-Dragon 58b89965c8 fix(agent): add tool-call loop guardrails 2026-04-30 20:43:15 -07:00
Austin Pickett c23c7c994b fix(tui): address remaining review feedback — ordering and digit shortcuts
- Emit providers in CANONICAL_PROVIDERS order (matching hermes model)
  with user-defined/custom providers appended after
- Remove digit quick-select (1-9,0) handler — inconsistent with
  absolute row numbering and already removed from hint text
- Remove unused windowOffset import
2026-04-30 23:41:19 -04:00
Oxidane-bot 8d7500d80d fix(gateway): snapshot callback generation after agent binds it, not before
_process_message_background snapshotted callback_generation from the
interrupt event at the TOP of the task — before the handler ran.
_hermes_run_generation is only set on the event by
GatewayRunner._bind_adapter_run_generation during
_handle_message_with_agent, which runs DURING the handler await. The
early snapshot always captured None, which then flowed into
pop_post_delivery_callback(..., generation=None) in the finally block.

In pop_post_delivery_callback, generation=None with a tuple-registered
entry (generation, callback) bypasses the ownership check — it pops and
fires the callback regardless of which run owns it. Result: a stale run
could fire a fresher run's post-delivery callback (e.g. a
background-review notification attributed to the wrong turn).

Fix: move the snapshot into the finally block, after the handler has
run and _hermes_run_generation has been bound to the current run.

Regression test added: simulates a stale handler at generation=1 and a
fresher callback registered at generation=2. Pre-fix: snapshot=None →
pop fires the generation=2 callback under generation=1's ownership
("newer" fires). Post-fix: snapshot=1 → pop skips the mismatched
entry, callback stays in the dict for the correct run to claim.

Verified: test FAILS on current main (captures "newer" in fired list),
PASSES with this fix.

Salvaged from PR #12565 (the callback-ownership portion only; the
/status totals portion was already fixed on main in 7abc9ce4d via #17158).

Co-authored-by: Oxidane-bot <1317078257maroon@gmail.com>
2026-04-30 20:41:18 -07:00
Teknium 27ec74c68a fix: coerce show_reasoning and guard_agent_created config bools
Widens #16528 to two sibling sites that had the same quoted-boolean
bug: a YAML string "false" (or "0", "no", "off") silently evaluated
truthy under bool() / if-check.

- gateway/run.py _load_show_reasoning: is_truthy_value wrap
- tools/skill_manager_tool.py _guard_agent_created_enabled: is_truthy_value wrap
- regression tests for both
2026-04-30 20:40:46 -07:00
johnncenae bb706c3f38 fix(gateway): coerce tool_progress_command as a real boolean 2026-04-30 20:40:46 -07:00
Teknium a94841eaa0 fix(state): include finish_reason in conversation replay
SELECT in get_messages_as_conversation() was missing finish_reason, so
assistant messages round-tripped through replay (including /branch copies)
silently dropped the provider's stop signal. Adds it to the SELECT, restores
it on assistant rows, and locks it in with a round-trip test.
2026-04-30 20:40:28 -07:00
simbam99 7ba1a2b3df fix(gateway): preserve assistant metadata when branching sessions 2026-04-30 20:40:28 -07:00
Yukipukii1 55366510e5 fix(auth): make provider config writes atomic 2026-04-30 20:39:41 -07:00
Teknium 787b5c5f93 chore(release): map Mind-Dragon and JustinUssuri emails for AUTHOR_MAP 2026-04-30 20:38:09 -07:00
Mind-Dragon ab6c629ccc fix(terminal): skip sudo prompt when local NOPASSWD sudo works
When running on a host with sudoers NOPASSWD configured for the current
user, interactive Hermes sessions were unnecessarily entering the
password prompt path before executing sudo commands. Outside Hermes,
`sudo -n true` exits 0 for that user.

Add `_sudo_nopasswd_works()` that probes `sudo -n true` and, when it
succeeds, lets `_transform_sudo_command()` return the command unchanged
with no stdin password. The probe:

- Is scoped to the `local` terminal backend only, so Docker/SSH/Modal
  and other remote backends do not inherit host sudo state.
- Re-probes every call (no process-lifetime cache) so an expired sudo
  timestamp cannot silently make a later command block waiting for a
  password that Hermes never prompts for.
- Is bypassed entirely when `SUDO_PASSWORD` is configured or a cached
  password already exists, preserving existing explicit-password flows.

Co-authored-by: Junting Wu <juntingpublic@gmail.com>
2026-04-30 20:38:09 -07:00
simbam99 ccfe6a47c3 fix(gateway): coerce StreamingConfig booleans and malformed numerics safely 2026-04-30 20:37:49 -07:00
hharry11 24130b7e53 fix(approval): harden YOLO mode env parsing against quoted-bool strings 2026-04-30 20:37:37 -07:00
hharry11 158eb32686 fix(gateway): preserve document type when merging queued events 2026-04-30 20:37:27 -07:00
sprmn24 adaee2c72c test(skill_utils): add regression tests for non-dict metadata in extract_skill_conditions
The fix for this bug (isinstance guard) was merged via commit 3ff9e010,
but test coverage was not included. Adding 4 tests:
- dict metadata with hermes keys (normal case)
- string metadata (bug case — previously caused AttributeError)
- None metadata
- missing metadata key
2026-04-30 20:37:15 -07:00
teknium1 e21898ea98 test(discord_tool): add regression test for per-token capability cache
Proves token A's detected capabilities do not leak to token B after the
fix in the preceding commit. Before the fix this test would have seen
both tokens return token A's cached value.
2026-04-30 20:37:12 -07:00
sprmn24 fa7b0b0a67 fix(discord_tool): key capability cache by token instead of single global
_capability_cache was a single module-level dict shared across all
tokens. If the bot token rotates or multiple tokens are used in one
process, capabilities detected for token A would be returned for
token B, causing wrong schema gating and incorrect runtime behavior.

Replace the single Optional cache with a Dict keyed by token so each
token gets its own isolated capability entry.
2026-04-30 20:37:12 -07:00
Teknium 82b5786721 test(browser_supervisor): cover cache-hit healthcheck on dead thread/loop
Pure unit tests for _SupervisorRegistry — no Chrome required. Verified
to fail when the fix is reverted, pass with it in place.
2026-04-30 20:33:33 -07:00
sprmn24 73a6b80317 fix(browser_supervisor): verify thread and loop health before returning cached supervisor
_SupervisorRegistry.get_or_start() returned an existing supervisor
whenever the cdp_url matched, without checking if the supervisor's
thread or event loop was still alive. A crashed supervisor would be
silently reused, causing missed dialog/frame updates.

Now checks both _thread.is_alive() and _loop.is_running() before
returning the cached instance. An unhealthy supervisor is torn down
and recreated, matching the existing URL-changed code path.
2026-04-30 20:33:33 -07:00
sprmn24 ec4cb16a29 fix(honcho): guard _peers_cache and _sessions_cache reads under _cache_lock
_get_peer() and _get_or_create_honcho_session() accessed _peers_cache
and _sessions_cache without holding _cache_lock, while other paths
in the same class use the lock consistently. Under concurrent tool
calls or prefetch threads, this can produce stale reads or lost
cache updates.

Wrap both unguarded cache read sites in _cache_lock. Network calls
(honcho.peer() and honcho.session()) remain outside the lock to
avoid holding it during I/O.
2026-04-30 20:31:42 -07:00
sprmn24 bea2562fc4 fix(honcho): replace raw int() config parsing with safe helper
Three int() calls in HonchoClient.from_global_config() parsed
dialecticMaxChars, messageMaxChars, and dialecticMaxInputChars
directly without guards. A malformed value in honcho.json would
raise ValueError and abort provider initialization entirely.

Add _parse_int_config() helper following the existing
_parse_context_tokens() pattern, and replace all three raw
int() calls with it.
2026-04-30 20:31:32 -07:00
Roy-oss1 b94cb8e2c4 feat(feishu): operator-configurable bot admission and mention policy
Add two operator-facing toggles for inbound Feishu admission, enabling
bot-to-bot scenarios such as A2A orchestration and inter-bot
notifications:

  FEISHU_ALLOW_BOTS=none|mentions|all   (default: none)
    Accept messages from other bots. `mentions` requires the peer
    bot to @-mention Hermes; `all` admits every peer-bot message.

  FEISHU_REQUIRE_MENTION=true|false     (default: true)
    Whether group messages must @-mention the bot. Override per-chat
    via `group_rules.<chat_id>.require_mention` in config.yaml.

Defaults preserve prior behavior. Self-echo protection is always on:
when the bot's identity is unresolved (auto-detection failed and
FEISHU_BOT_OPEN_ID unset), peer-bot messages are rejected fail-closed
to avoid feedback loops.

Admitted peer bots bypass the human-user allowlist
(FEISHU_ALLOWED_USERS) to match existing Discord behavior; humans
still need an explicit allowlist entry. yaml feishu.allow_bots is
bridged to the env var so the adapter and gateway auth layer share
one source of truth.

Resolving peer-bot display names requires the
application:bot.basic_info:read scope; without it, peers still route
but appear as their open_id.

Test: tests/gateway/test_feishu_bot_admission.py covers the admission
pipeline, group-policy bot-bypass, hydration, and event-dispatch
plumbing as a parametrized matrix.

Change-Id: I363cccb578c2a5c8b8bf0f0a890c01c89909e256
2026-04-30 20:30:31 -07:00
buray fa9fd26acb fix(gateway): re-inject topic-bound skill after /new or /reset
reset_session() creates a fresh SessionEntry with created_at == updated_at,
but get_or_create_session() bumps updated_at on the next inbound message,
causing _is_new_session in _handle_message_with_agent to evaluate False.
The topic/channel skill auto-load gate (group_topics, channel_skill_bindings)
silently skips the first message after a manual reset.

Add an is_fresh_reset flag on SessionEntry, set by reset_session() and
consumed once by the message handler. Kept distinct from was_auto_reset
because that flag also drives a 'session expired due to inactivity'
user-facing notice and a context-note prepend — both wrong for an
explicit /new or /reset.

Persisted through to_dict/from_dict so the flag survives gateway
restart between /reset and the next message.

Fixes #6508

Co-authored-by: warabe1122 <45554392+warabe1122@users.noreply.github.com>
Co-authored-by: willy-scr <187001140+willy-scr@users.noreply.github.com>
2026-04-30 20:29:19 -07:00
Jezza Hehn 7abc9ce4df fix(gateway): read /status token totals from SessionDB (#17158)
/status was reading session_entry.total_tokens from the in-memory
SessionStore (gateway/session.py), which the agent never writes to —
so the token count was always 0.

The agent already persists token deltas to the SQLite SessionDB
(run_agent.py:11497) for every platform with a session_id. Route
/status through that single source of truth instead of duplicating
token writes into a second store.

Fix:
- gateway/run.py: _handle_status_command now calls
  self._session_db.get_session(session_id) and sums the five token
  component columns (input/output/cache_read/cache_write/reasoning).
  Falls back to 0 when no SessionDB is configured or no row exists.
- Two new regression tests covering the populated-row and
  missing-row paths.

Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>
2026-04-30 20:28:50 -07:00
Teknium a178081468 fix(gateway): use _session_key_for_source for native image buffer write
Minor follow-up to the native-image-buffer isolation fix. The write site
in _prepare_inbound_message_text was calling build_session_key directly,
while every other call site in gateway/run.py uses the _session_key_for_source
helper — which consults session_store._generate_session_key first and falls
back to build_session_key. Keeping the write key and consume key on the
same helper prevents key drift if the session store ever overrides the
default keying behavior.
2026-04-30 20:26:35 -07:00
Yukipukii1 bdb7edd89e fix(gateway): isolate pending native image paths by session 2026-04-30 20:26:35 -07:00
sprmn24 5ed27c0f74 fix(tui_gateway): guard env var parsing against invalid values at import
_SLASH_WORKER_TIMEOUT_S and _pool used raw float()/int() on env vars
at module level. A non-numeric value (e.g. HERMES_TUI_SLASH_TIMEOUT_S=abc)
raises ValueError during import, preventing TUI gateway from starting
with no useful error message.

Wrap both parses in try/except with safe fallbacks:
- HERMES_TUI_SLASH_TIMEOUT_S: fallback to 45.0s
- HERMES_TUI_RPC_POOL_WORKERS: fallback to 4 workers
2026-04-30 20:26:23 -07:00
Teknium 531ac20408 fix(state): JSON-encode multimodal message content for sqlite
sqlite3 can only bind str/bytes/int/float/None to query parameters.
Multimodal message content is a list of parts (text + image_url), which
raised 'Error binding parameter 3: type list is not supported' in
append_message and replace_messages.

In the CLI/TUI this surfaced as a visible crash when users pasted
screenshots. In the gateway it was silently swallowed by a bare except
in append_to_transcript, causing multimodal turns to be lost from the
session transcript.

Fix at the DB layer: _encode_content wraps lists/dicts as
'\\x00json:' + json.dumps(...) on write, _decode_content unwraps on
read. Plain strings are untouched, so existing FTS search, previews,
and JSONL compat are unaffected. Paired decode in get_messages,
get_messages_as_conversation, and search_messages context previews.

Regression test covers: list content round-trip, dict content
round-trip, string content stored unchanged, replace_messages with
multimodal content.

Also included: aligned fix #17522 for TUI image attachment with
paths containing spaces (see previous commit).
2026-04-30 20:25:52 -07:00
Harry Riddle cc340c4a4d fix(tui): always call input.detect_drop for reliable image attachment
Remove frontend regex pre-check that truncated paths containing spaces,
quotes, or Windows drive letters. Backend _detect_file_drop correctly
handles these patterns. This fixes image attachment for common filenames
like "Screenshot 2026-04-29.png".

Add tests:
- test_input_detect_drop_path_with_spaces: attaches image with spaces in name
- test_input_detect_drop_path_with_spaces_and_remainder: remainder handling

Also restored missing  in test_rollback_restore_resolves_number_and_file_path.

Scope: tui, vision, tests
2026-04-30 20:25:52 -07:00
Teknium 19136dfc07 chore: map jatingodnani email in AUTHOR_MAP 2026-04-30 20:24:39 -07:00
Teknium 9a75743496 fix(gateway): apply agent.disabled_toolsets in gateway message loop
Widens the cherry-picked fix from @jatingodnani (#17343) to the
gateway path. On main, user_config.agent.disabled_toolsets was only
honored by _get_platform_tools' name-level subtraction — it did not
catch tools pulled in implicitly by a composite toolset (browser
includes web_search, hermes-* platforms include most tools).

Changes:
- gateway/run.py: resolve disabled_toolsets alongside enabled_toolsets
  and pass to AIAgent at both user-facing construction sites (normal
  message loop + single-turn cron-like path). Hygiene/compression
  agents (fixed enabled_toolsets=[memory]) are intentionally untouched.
- gateway/run.py: add (agent, disabled_toolsets) to
  _CACHE_BUSTING_CONFIG_KEYS so editing the list in config.yaml
  invalidates the cached AIAgent on the next message.
- cli.py: drop unused 'import platform' left over from PR #17343's
  import churn; restore 'import sys' used throughout the file.
- model_tools.py: drop unused 'import os, sys' added by PR #17343;
  fix comment reference from #15291 (unrelated OAuth issue) to #17309.

Co-authored-by: jatin godnani <godnanijatin@gmail.com>
2026-04-30 20:24:39 -07:00
jatin godnani e3624e00db fix: enforce strictly subtractive toolset filtration
Refactor tool resolution logic in model_tools.py to ensure that
disabled_toolsets are always subtracted at the end, preventing
composite toolsets (e.g. 'browser') from implicitly enabling tools
that should be hidden.

- Added 'disabled_toolsets' to DEFAULT_CONFIG in hermes_cli/config.py
- Updated HermesCLI in cli.py to load and propagate disabled toolsets to AIAgent
- Implemented robust two-phase resolution (additive then subtractive) in model_tools.py
2026-04-30 20:24:39 -07:00
Teknium 8e58265b60 chore(release): map allard.quek@singtel.com → AllardQuek (#18196) 2026-04-30 20:23:31 -07:00
Allard Quek ebe60abc4f fix(dashboard): separate theme identity from layout scale
Themes previously embedded layout-affecting values (baseSize, lineHeight,
density, letterSpacing) alongside visual identity properties, coupling
user ergonomic preferences to color theme selection.

This change establishes a clear separation of concerns:

- Themes own: palette, font family, border-radius, and font-coupled
  letterSpacing (e.g. Inter's -0.005em tracking)
- Layout scale (baseSize, lineHeight, density) is standardized via
  DEFAULT_TYPOGRAPHY and DEFAULT_LAYOUT — not overridden per theme

All themes now spread DEFAULT_TYPOGRAPHY and DEFAULT_LAYOUT as their
base, removing silent divergence and making future layout settings
(e.g. user-configurable density) trivially applicable across all themes
without per-theme special-casing.
2026-04-30 20:22:54 -07:00
Allard Quek 33d24095c4 fix(dashboard): normalize typography and layout across built-in themes
All built-in themes now spread DEFAULT_TYPOGRAPHY, removing independent
baseSize overrides and converging on 15px. All themes also use
density: comfortable, removing the compact/spacious divergence that
caused item-count shifts on fixed-height pages (e.g. Skills).

Two additional per-theme overrides are also normalized:

- rose: lineHeight: "1.7" removed — was paired with density: spacious
  for an airy feel; once density was normalised the elevated line-height
  became an orphaned artefact causing nav item height drift.

- cyberpunk: letterSpacing changed from "0.02em" to "0" — extra tracking
  on top of an already-wide monospace font caused text to wrap earlier
  than in other themes.

Switching themes is now a purely cosmetic change — color palette,
font family, border-radius, and typographic style differ; font size,
spacing, line-height, and letter-spacing do not.
2026-04-30 20:22:54 -07:00
Teknium 01cc701e54 docs + nit: busy_ack_enabled follow-ups
- Move the disabled-ack guard above the debounce so we don't stamp
  _busy_ack_ts[session_key] when no ack was actually sent. Harmless
  (never read when disabled) but cosmetically off.
- Document display.busy_ack_enabled in user-guide/messaging/index.md
  and HERMES_GATEWAY_BUSY_ACK_ENABLED in reference/environment-variables.md.
- Add JezzaHehn to scripts/release.py AUTHOR_MAP for contributor credit.

Follow-up to #17491 (Jezza Hehn).
2026-04-30 20:22:30 -07:00
Jezza Hehn 2b512cbca4 feat(gateway): add busy_ack_enabled config option to suppress ack messages
When a user sends a message while the gateway is busy processing,
an acknowledgment message is sent. This can be spammy for users
who send rapid messages.

Add display.busy_ack_enabled config option (default: true) to allow
users to suppress these busy-input acknowledgment messages.

Fixes #17457
2026-04-30 20:22:30 -07:00
Yukipukii1 25cbe3e1d6 fix(gateway): preserve thread routing for /update progress and prompts 2026-04-30 20:19:23 -07:00
Teknium f48ba47d1e chore(release): map allard.quek@singtel.com → AllardQuek 2026-04-30 20:19:14 -07:00
Allard Quek 226fd79c8e feat(dashboard): add interactive column sorting to analytics tables 2026-04-30 20:19:14 -07:00
Teknium 0ddc8aba68 fix(fallback): let custom_providers shadow built-in aliases
When a user defines `custom_providers: [{name: kimi, ...}]` and references
`provider: kimi` from fallback_model or the main config, the built-in alias
rewriting (`kimi` → `kimi-coding`) was hijacking the request before the
named-custom lookup ran.  `_get_named_custom_provider` also refused to
return a match when the raw name resolved to any built-in (including aliases),
so the custom endpoint was unreachable.

Fix at both layers of the resolution chain so every caller benefits, not
just `_try_activate_fallback`:

- hermes_cli/runtime_provider.py: narrow `_get_named_custom_provider`'s
  built-in-wins guard to canonical provider names only.  An alias like
  `kimi` that resolves to a different canonical (`kimi-coding`) no longer
  blocks the custom lookup; a canonical name like `nous` still does.

- agent/auxiliary_client.py: in `resolve_provider_client`, try the named-
  custom lookup with the original (pre-alias-normalization) name before the
  alias-normalized one, so aliased requests reach the user's custom entry.
  Also honour `explicit_base_url` and `explicit_api_key` in the API-key
  provider branch so callers that pass explicit hints (e.g. fallback
  activation) can override the registered defaults.

Tests added for:
- custom `kimi` shadowing built-in alias (regression for #15743)
- custom `nous` NOT shadowing canonical built-in (behaviour preserved)
- bare `kimi` without any custom entry still routing to built-in
- explicit base_url/api_key override on the API-key provider branch

Original PR #17827 by @Feranmi10 identified the same bug class and
implemented a narrower fix in `_try_activate_fallback`; this reshapes the
fix to live in the shared resolution layer so all callers benefit.

Fixes #15743
Co-authored-by: Feranmi10 <89228157+Feranmi10@users.noreply.github.com>
2026-04-30 20:18:44 -07:00
Yukipukii1 38875d00a7 fix(gateway): ensure platform configs honor home_channel env overrides 2026-04-30 20:18:33 -07:00
Teknium 5089c55e0b refactor(state): compute last_active ordering at SQL level via recursive CTE
Follow-up to the previous commit. Replace the post-fetch Python re-sort (which
required dropping LIMIT/OFFSET from SQL and scanning every session row) with a
recursive CTE that walks compression-continuation chains and computes
effective_last_active per root at SQL level. The outer query can then ORDER BY
+ LIMIT efficiently, and the Python projection loop no longer has to handle
ordering.

This preserves the correctness win (old compression roots whose live tip was
touched recently surface correctly) without the O(N) scan, which matters for
users with thousands of sessions.

Adds a regression test pinning the compression-tip case at limit=1 — the
stress case that any bounded-oversample shortcut would get wrong.

Co-authored-by: simbam99 <simbamax99@gmail.com>
2026-04-30 20:17:15 -07:00
simbam99 142b4bf3ce fix(session_search): order recent mode by last activity instead of start time
- order session_search recent-mode results by last activity instead of session start time
- add an opt-in `order_by_last_active` path to `SessionDB.list_sessions_rich`
- add regression coverage for both the database ordering and recent-mode call path
2026-04-30 20:17:15 -07:00
Austin Pickett c8e506c383 fix(tui): address code review feedback on model picker
- Reset keySaving on back() to prevent blocked key entry after Esc
- Show '(needs setup)' for non-API-key auth providers instead of
  generic '(no key)'
- Set is_current correctly for unauthenticated providers that happen
  to be the active session provider
- Guard model.save_key with is_managed() check — return error on
  managed installs where .env is read-only
2026-04-30 23:11:28 -04:00
Austin Pickett f4c761c6a0 feat(tui): add inline provider disconnect via 'd' keybind in /model picker
- New model.disconnect RPC method: clears API key env vars from .env
  and OAuth/credential pool state via clear_provider_auth()
- Press 'd' on an authenticated provider opens confirmation prompt
- y/Enter confirms disconnect, n/Esc cancels
- Provider flips to unauthenticated state in-place (re-selectable
  to re-auth by pressing Enter again)
2026-04-30 23:03:32 -04:00
Austin Pickett 26f7f68507 feat(tui): show all providers in /model picker with inline API key setup
- model.options now returns all canonical providers (not just
  authenticated), each with authenticated/auth_type/key_env fields
- New model.save_key RPC method: saves API key to .env, sets in
  process, returns refreshed provider with models
- Picker shows ● (authed) / ○ (no key) markers with dimmed styling
- Selecting an unauthenticated api_key provider opens inline masked
  key input — after save, transitions directly to model selection
- Non-api_key auth providers show guidance to run hermes model
- Row numbers now show absolute position in list
2026-04-30 23:03:32 -04:00
Austin Pickett 36fa8a4d28 fix(tui): show absolute position numbers in model picker
The model picker displayed row numbers 1-12 regardless of scroll
position, making it impossible to tell where you were in the list.
Now shows the actual item index (e.g. 5, 6, 7... when scrolled down).

Also removed '1-9,0 quick' from the hint text since digit shortcuts
still work relative to the visible window, which would be confusing
with absolute numbering.
2026-04-30 23:03:32 -04:00
Austin Pickett 443950e827 fix(tui): pass user_providers as dict to match CLI model-switch pipeline
The TUI's _apply_model_switch() was converting the config.yaml
`providers:` dict into a list of dicts before passing it to
switch_model(). This caused resolve_provider_full() →
resolve_user_provider() to fail, since that function expects a dict
and does `user_config.get(name)` to look up provider entries.

The result: user-defined providers (e.g. ollama) appeared in CLI's
/model picker but were invisible in the TUI.

Fix:
- tui_gateway/server.py: pass cfg.get('providers') directly (dict),
  matching what cli.py already does at line 5598.
- hermes_cli/model_switch.py: fix the validation-override block
  (line ~893) which iterated user_providers as a list — now correctly
  handles the dict format with support for both dict-keyed and
  list-format models arrays.
2026-04-30 23:03:32 -04:00
Teknium 96691268df fix(gateway): drain manual profile gateways via SIGUSR1 before respawn
The PR wired in a detached watcher that respawns manual profile gateways
after they exit.  Pair that with a SIGUSR1 graceful drain (same path
systemd/launchd use) so in-flight agent runs finish instead of getting
SIGTERM'd.  Fall back to SIGTERM if SIGUSR1 isn't wired or the gateway
doesn't exit within the drain budget — the watcher sees the exit and
relaunches either way.

Tested end-to-end against an orphaned gateway: graceful drain exits in
0.5s and the watcher fires the relaunch command.
2026-04-30 20:00:31 -07:00
Michael Nguyen 77fe7ab6b2 feat(gateway): restart manual profile gateways after update 2026-04-30 20:00:31 -07:00
Teknium 84324d06b8 chore(release): add quocanh261997 to AUTHOR_MAP 2026-04-30 20:00:31 -07:00
Teknium 8b7b074df9 test(context_compressor): regression test for PR #17025 tail-protection off-by-one
When len(messages) <= protect_tail_count and a token budget is set, the
previous formula min(protect_tail_count, len(result) - 1) under-protected
the tail by one, allowing the oldest message to be summarized.

The test fails on the buggy formula (pruned == 1) and passes on the fix
(pruned == 0, tool content preserved verbatim).
2026-04-30 20:00:01 -07:00
0z! b194617d00 fix(context_compressor): off-by-one in tail protection for short conversations 2026-04-30 20:00:01 -07:00
hharry11 2997ef9446 fix(api-server): use session-scoped task IDs for tool isolation 2026-04-30 19:59:38 -07:00
johnncenae a83d579d5b fix(telegram): enforce gateway auth for inline approval callbacks 2026-04-30 19:59:31 -07:00
johnncenae 9ae1fa9e39 fix(delegate): honor runtime default model during provider resolution 2026-04-30 19:58:55 -07:00
Stephen Schoettler b29b709a71 fix(agent): sanitize Codex tool-call history summaries 2026-04-30 19:58:46 -07:00
Teknium f43b126677 fix(gateway): atomic writes for sibling recovery/dedup state files
Widen PR #17842's atomic-write fix to two sibling sites that exhibit the
same 'partial JSON on interrupted write' class of bug:

- gateway/platforms/feishu.py: dedup state (_dedup_state_path)
- gateway/platforms/helpers.py: ParticipatedThreadTracker save

Both are small recovery/coordination files that get rewritten frequently and
break cross-restart dedup if left partial.
2026-04-30 19:58:16 -07:00
johnncenae 1ef9e88549 fix(gateway): write restart markers atomically and fix Windows lock collisions 2026-04-30 19:58:16 -07:00
teknium1 447a2bba3a fix(plugins): bound async plugin command await with 30s timeout
Follow-up to #17963. The threaded branch of resolve_plugin_command_result
previously called Event.wait() with no timeout — a hung async plugin
handler would wedge the terminal indefinitely. Cap the wait at 30s and
raise TimeoutError instead. Added a regression test covering the hung
handler path.
2026-04-30 19:56:18 -07:00
hharry11 ca9a61ae38 fix(plugins): await async handlers in CLI and TUI dispatch 2026-04-30 19:56:18 -07:00
johnncenae 79cffa9232 auth: coerce tls insecure flag safely instead of using Python truthiness 2026-04-30 19:55:48 -07:00
johnncenae 2bf73fbe2c fix(cli): coerce tls insecure flag safely in auth state 2026-04-30 19:55:48 -07:00
Teknium 7cbe943d2d feat(skills): add here.now as an optional skill
Moves the here-now skill under optional-skills/productivity/here-now/ so
it's discoverable via the Skills Hub but not installed by default, and
tightens the SKILL.md description to a single line to match sibling
optional-skill descriptions.

Install with:
  hermes skills install official/productivity/here-now

Closes #378
2026-04-30 19:48:15 -07:00
adamludwin 21cc9c8d32 Update here.now skill bundle
Made-with: Cursor
2026-04-30 19:48:15 -07:00
adamludwin f7dfd4ae36 feat(skills): add built-in here.now skill
Add the here.now productivity skill with a bundled publish runtime so Hermes can publish files and folders to live URLs. Keep the skill thin and docs-first while fixing script path resolution and upload failure handling.

Made-with: Cursor
2026-04-30 19:48:15 -07:00
Yukipukii1 2110a3a0c4 fix(tui): return JSON-RPC errors for invalid request shapes 2026-04-30 19:47:00 -07:00
Yukipukii1 5f3f456784 fix(approval): wake blocked gateway approvals on session cleanup 2026-04-30 19:46:27 -07:00
Feranmi10 f4ba97ad9a fix(status): add NVIDIA_API_KEY to hermes status API keys display
Closes #16082

The `hermes status` command listed provider API keys under the
◆ API Keys section but NVIDIA_API_KEY was absent. Users configured
with NVIDIA NIM had no way to verify their key was set from status
output. Add it alongside the other inference provider keys.
2026-04-30 19:46:06 -07:00
Yukipukii1 75483b6db1 fix(curator): preserve last_report_path in state 2026-04-30 19:45:59 -07:00
Mind-Dragon aab5bcc6ac test(model_switch): cover private user_providers override 2026-04-30 19:44:26 -07:00
Mind-Dragon 5ad8281885 fix(model_switch): correct user_providers override for private models
The switch_model override logic incorrectly iterated over user_providers
as if it were a list of dicts, but it's actually a dict mapping
provider_slug -> config. This meant private models defined in a provider's
`models:` section (e.g. nahcrof-dedicated with discover_models: false)
were never accepted when the API /models list didn't include them.

Fix: iterate over user_providers.items(), match by slug, and handle both
dict and list forms of the models config.
2026-04-30 19:44:26 -07:00
Aamir Jawaid 1e5a23fa64 docs(teams): use teams app get --install-link for Step 6
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 67f1198ba9 docs(teams): fix CLI install tag and Step 6 install flow
- Keep @preview tag for teams CLI
- Step 3: note client secret won't be shown again
- Step 6: use the Install in Teams link from teams app create output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid d5e72ae17f docs(teams): fix CLI install tag and Step 6 install flow
- Keep @preview tag for teams CLI
- Step 3: note client secret won't be shown again
- Step 6: just open the Install in Teams link from teams app create output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid a5d60f42ee docs(teams): fix CLI install tag and Step 6 install flow
- Keep @preview tag for teams CLI
- Step 3: note client secret won't be shown again
- Step 6: use the install link printed by teams app create
  instead of a separate CLI command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 09aba91766 docs(teams): note that tunnel port 3978 is the default, not fixed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid f59693c075 fix(teams): pipe TEAMS_PORT through docker-compose properly
Was hardcoded to 3978; use ${TEAMS_PORT:-3978} so a custom port
set in .env is actually passed into the container.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid c997830e1e docs(teams): fix port references and add TEAMS_ALLOW_ALL_USERS
- Replace hardcoded 3978 with configurable TEAMS_PORT references
- Fix incorrect docker-compose port mapping claim (uses network_mode: host)
- Add missing TEAMS_ALLOW_ALL_USERS to config reference table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 4a6fac36d8 docs(teams): fix group chat behavior — @mention required
Group chats require @mention just like channels, not respond-to-all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
Aamir Jawaid 624057fce6 feat(teams): set User-Agent to Hermes via 2.0.0 client option
microsoft-teams-apps 2.0.0 added the `client` option to AppOptions,
accepting a ClientOptions instance. Use it to set the User-Agent
header to "Hermes" on all outgoing HTTP requests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 19:43:32 -07:00
briandevans 97d6f25008 test(toolsets): include kanban in expected post-#17805 toolset assertions
The kanban PR (#17805, c86842546) added the `kanban` toolset and
`tools/kanban_tools.py`, but didn't update three pre-existing test
assertions that bake the full toolset/tool inventory:

* `tests/tools/test_registry.py::test_matches_previous_manual_builtin_tool_set`
  hard-codes the manual list of builtin tool modules. `tools.kanban_tools`
  was missing.
* `tests/test_tui_gateway_server.py::test_load_enabled_toolsets_rejects_disabled_mcp_env`
  and `test_load_enabled_toolsets_falls_back_when_tui_env_invalid` both
  expect `["memory"]` from `_load_enabled_toolsets()`. With kanban now
  auto-recovered by `_get_platform_tools` (its tools live in hermes-cli's
  universe but are not in CONFIGURABLE_TOOLSETS), the resolver returns
  `["kanban", "memory"]`.
* `tests/hermes_cli/test_tools_config.py::test_get_platform_tools_preserves_explicit_empty_selection`
  asserts `set()` for an explicit empty list. The recovery loop now also
  surfaces `kanban`. Reframed to assert the contract the test name
  describes — no CONFIGURABLE toolset gets re-enabled when the user
  explicitly saved an empty list — which stays correct as more
  non-configurable platform toolsets are added.

Verified the failures reproduce on clean origin/main (180a7036b) with
`.[all,dev]`-equivalent extras (fastapi, starlette, httpx, pytest-asyncio)
and that all four pass with this commit applied. CI on main itself is
currently red on these tests; this restores green for everyone's PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 19:43:03 -07:00
Chris Danis f61695ee73 fix(signal): skip contentless envelopes (profile key updates, empty messages)
Signal-cli sends dataMessage wrappers for profile key updates and other
metadata events that have no actual text content. These were reaching the
gateway as msg='' and triggering full agent turns for nothing.

Add early return in _handle_envelope() when both message field is empty/
missing/whitespace AND there are no attachments. Messages with media
attachments but no text still flow through.

- 12 lines added to gateway/platforms/signal.py
- 5 new tests in TestSignalContentlessEnvelope class
2026-04-30 19:42:59 -07:00
Teknium e2e6b6ff1a chore(models): move Vercel AI Gateway to bottom of provider picker (#18112)
It was sitting at position 4 of the `hermes model` list, ahead of Anthropic,
OpenAI, Xiaomi, and other first-class API providers. Move it to the end of
CANONICAL_PROVIDERS and drop the "(200+ models, $5 free credit, no markup)"
parenthetical so the entry just reads "Vercel AI Gateway".
2026-04-30 19:34:19 -07:00
Austin Pickett c73b799de7 feat(dashboard): add hide/show toggle for dashboard plugins in sidebar
- New config key: dashboard.hidden_plugins (list of plugin names)
- GET /api/dashboard/plugins now filters out hidden plugins from sidebar
- POST /api/dashboard/plugins/{name}/visibility toggles visibility
- Hub response includes user_hidden boolean per plugin row
- Eye/EyeOff toggle on plugin cards with dashboard manifests
- i18n: 'Show in sidebar' / 'Hide from sidebar' (en/zh)
2026-04-30 20:29:37 -04:00
Austin Pickett a52363231f refactor(plugins): move rescan button to page header, remove redundant title
Use usePageHeader().setEnd to place the rescan button in the shared
header bar. Remove the inline H2 title (already shown by the header)
and the wrapper div.
2026-04-30 20:29:37 -04:00
Austin Pickett 9550d0fd46 fix(plugins): show 'Plugins' in page header instead of 'Web UI'
Add /plugins route to resolve-page-title BUILTIN map.
2026-04-30 20:29:37 -04:00
Austin Pickett 7dc85495e0 style(plugins): make page full width 2026-04-30 20:29:37 -04:00
Austin Pickett 6549b0f2b7 fix(security): address CodeQL path-traversal and info-exposure findings
- Add _validate_plugin_name() guard on all {name} path param endpoints
  (rejects /, \, .. before reaching plugin logic)
- Strip after_install_path from install response (no internal paths to client)
- Update nix/tui.nix lockfile hash to match committed package-lock.json
2026-04-30 20:29:37 -04:00
Austin Pickett e2a4905606 feat(dashboard): add Plugins page with enable/disable, auth status, install/remove
- New PluginsPage.tsx: full plugin management UI (list, enable/disable,
  install from git, remove, git pull updates, provider picker)
- Backend: dashboard_set_agent_plugin_enabled now also toggles the
  plugin's toolset in platform_toolsets so enabling actually makes
  tools visible in agent sessions
- Backend: /api/dashboard/plugins/hub returns auth_required + auth_command
  per plugin (checks tool registry check_fn)
- Frontend: auth_required shown as Badge + CommandBlock with copy-able
  auth command
- Fix: Select overflow in providers card (min-w-0 grid cells, removed
  truncate/overflow-hidden that clipped dropdown)
- Refactor: _install_plugin_core extracted for non-interactive reuse,
  PluginOperationError for structured error handling
- i18n: en/zh/types updated with all new plugin page strings
2026-04-30 20:29:37 -04:00
Teknium e5dad4ac57 fix(agent): propagate ContextVars to concurrent tool worker threads (#18123)
Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages #16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes #16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
2026-04-30 16:26:26 -07:00
Teknium 180a7036bc feat(skills): add Shopify optional skill (Admin + Storefront GraphQL) (#18116)
Adds optional-skills/productivity/shopify — curl-based guide for the
Shopify Admin GraphQL API (products, orders, customers, inventory,
metafields, bulk operations, webhooks) and the Storefront GraphQL API.

- API version 2026-01 (current stable)
- Custom-app access tokens (shpat_...) with X-Shopify-Access-Token header
- Notes the 2026-01-01 deprecation of admin-created custom apps, points
  users at Dev Dashboard for new setups after that date
- Includes a reusable shop_gql() bash helper, cursor pagination,
  rate-limit cost inspection, GID conventions, userErrors check
- Safety section warns on destructive mutations (delete/refund/cancel)

Installs cleanly via: hermes skills install official/productivity/shopify
2026-04-30 15:58:44 -07:00
brooklyn! 8fed969618 Merge pull request #18113 from NousResearch/bb/tui-sgr-mouse-fragments
fix(tui): recover fragmented SGR mouse reports
2026-04-30 15:56:59 -07:00
Brooklyn Nicholson ded011c5a5 fix(tui): tighten SGR fragment matching 2026-04-30 17:50:49 -05:00
Brooklyn Nicholson 71b685aee0 fix(tui): recover fragmented SGR mouse reports 2026-04-30 17:43:21 -05:00
Teknium bbbce92651 feat(tui): render self-improvement review summaries in the transcript
The Ink TUI (\`hermes --tui\` + dashboard \`/chat\`) had no wiring for the
background self-improvement review. When the review fired and patched
a skill or saved a memory entry, the change landed but the user had
no visual indication it happened — only the CLI had a print surface
for the '💾 Self-improvement review: …' line.

Changes:

- tui_gateway/server.py: in _init_session, attach
  agent.background_review_callback to an _emit('review.summary',
  sid, {text}) closure. Wrapped in try/except so agents with locked
  attribute slots don't break session startup.
- ui-tui/src/app/createGatewayEventHandler.ts: handle 'review.summary'
  by routing ev.payload.text through sys(…), matching the existing
  'background.complete' pattern. Empty / whitespace payloads are
  ignored so the transcript never gets a blank system line.
- ui-tui/src/gatewayTypes.ts: extend the GatewayEvent discriminated
  union with { type: 'review.summary', payload?: { text?: string } }.

Gateway platforms (Telegram, Discord, Slack, …) already route the
review summary via background_review_callback → post-delivery queue
in gateway/run.py, so they pick up the new 'Self-improvement review:'
prefix from the companion run_agent change with no platform edits.

Tests:
- tests/tui_gateway/test_review_summary_callback.py (Python, 2 tests):
  _init_session attaches a callback that emits the right event; the
  callback path survives agents that can't accept the attribute.
- ui-tui/src/__tests__/createGatewayEventHandler.test.ts (vitest, 2
  new cases): review.summary events feed sys(...) with the full text;
  empty / missing payloads are no-ops.
- TypeScript type-check passes.
- tui_gateway suite: 64/64 pass.
2026-04-30 14:07:22 -07:00
Teknium 80a676658c fix(cli): surface self-improvement review summaries from bg thread
When the self-improvement background review fires after a turn, it runs
in a bg thread and emits a '  💾 <summary>' line to announce what it
saved to memory or skills. Two problems made this invisible to users
even when the review successfully modified a skill:

1. The print went through `_cprint` (prompt_toolkit's print_formatted_text)
   on a bg thread while the CLI's PromptSession was live. Direct
   print_formatted_text races with the input-area redraw and the line
   can land behind/above the prompt, scrolled off without the user
   seeing it.

2. The message said only '💾 Skill created.' / '💾 Memory updated'
   with no indication that the self-improvement loop was the one doing
   this. Users who did catch the line couldn't tell the background
   review from some other agent action.

Fixes:

- `_cprint` now detects when it's called from a non-app thread with a
  running prompt_toolkit Application, and routes through
  `run_in_terminal` via `loop.call_soon_threadsafe`. That pauses the
  input, prints the line above the prompt, and redraws — the normal
  prompt_toolkit contract for bg-thread output. Direct-print fallback
  preserved for the no-app / same-thread / import-error paths. Affects
  every bg-thread emission, not just the review summary (curator
  summaries and auxiliary failure prints benefit too).

- The summary now reads '  💾 Self-improvement review: <summary>' in
  both the CLI and the gateway `background_review_callback` path, so
  the origin is unambiguous.

Tests:
- New `tests/cli/test_cprint_bg_thread.py` covers all five routing
  branches (no app, app-not-running, cross-thread schedule, same-thread
  direct, app-loop-attribute-error, import-error).
- New case in `tests/run_agent/test_background_review.py` asserts the
  attributed prefix shows up in both `_safe_print` and
  `background_review_callback`.

Live E2E: exercised _cprint from a bg thread inside a real Application
event loop; confirmed get_app_or_none() sees the app, call_soon_threadsafe
schedules run_in_terminal, and the inner _pt_print runs.
2026-04-30 14:07:22 -07:00
Teknium c868425467 feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.

What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
  tasks, task_links, task_runs, task_comments, task_events,
  kanban_notify_subs tables. WAL mode, atomic claim via CAS,
  tenant-namespaced, skills JSON array per task, max-runtime timeouts,
  worker heartbeats, idempotency keys, circuit breaker on repeated
  spawn failures, crash detection via /proc/<pid>/status, run history
  preserved across attempts.
- Dispatcher — runs inside the gateway by default
  (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
  stale claims, promotes ready tasks, spawns `hermes -p <assignee>
  chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
  HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
  plus any per-task skills. Health telemetry warns on stuck ready
  queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
  (kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
  kanban_comment, kanban_create, kanban_link). Gated on
  HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
  sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
  injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
  UI: triage/todo/ready/running/blocked/done columns, drag-drop,
  inline create, task drawer with markdown, comments, run history,
  dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
  live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
  claim|comment|complete|block|unblock|archive|tail|dispatch|context|
  init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
  `/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
  kanban-orchestrator) — pattern library for good summary/metadata
  shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
  stored as JSON, threaded through to dispatcher argv as one
  `--skills X` pair per skill alongside the built-in kanban-worker.
  Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
  with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
  with 11 dashboard screenshots walking through four user stories
  (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
  dispatcher logic, circuit breaker, crash detection, max-runtime
  timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
  task skills round-trip + validation + dispatcher argv, tool surface
  (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
  + links + warnings), gateway-embedded dispatcher (config gate, env
  override, graceful shutdown), CLI deprecation stub, migration from
  legacy schemas.

Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
  task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
  via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
  in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
  env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
  `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
  Additive — no \_config_version bump needed.

Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
  NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
  worker_pid, last_heartbeat_at) so multi-attempt history is first-
  class from day one.

Closes #16102.

Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
ethernet 59c1a13f45 Merge pull request #15680 from NousResearch/fix/nix-package-lock
fix: let fixing nix pkgs command work without an initial build
2026-04-30 16:21:51 -04:00
Teknium 1d8068d71d feat(models): add openrouter/owl-alpha (free) to curated OpenRouter list (#18071) 2026-04-30 12:57:02 -07:00
Ari Lotter 9ac4a2e53e fix: let fixing nix pkgs command work without an initial build 2026-04-30 15:39:45 -04:00
Austin Pickett 6bc5d72271 Merge pull request #16419 from vincez-hms-coder/feat/dashboard-profiles-hms-coder
feat(dashboard): add profiles management page
2026-04-30 12:09:23 -07:00
ethernet b737af8226 Merge pull request #18047 from stephenschoettler/fix/acp-persist-user-message-test-mocks
test(acp): accept prompt persistence kwargs in MCP E2E mocks
2026-04-30 14:43:26 -04:00
Stephen Schoettler 699a9c11a9 test(acp): accept prompt persistence kwargs in mocks 2026-04-30 10:47:23 -07:00
VinceZ-Hms-Coder ca7f46beb5 Merge upstream/main and address Copilot review feedback
Merge resolved conflicts in web/src/{i18n/{en,zh,types}.ts,lib/api.ts}
by keeping both this branch's `profiles` additions and upstream's new
`models` page additions.

Copilot review feedback:
- Implement POST /api/profiles/{name}/open-terminal endpoint (already
  present); align Windows branch to `cmd.exe /c start "" <cmd>` so it
  matches the new test and spawns a fresh window instead of /k reusing
  the parent console.
- Move backslash escaping out of the macOS AppleScript f-string
  expression (Python <3.12 disallows backslashes inside f-string
  expression parts).
- Patch `_get_wrapper_dir` via monkeypatch in
  test_profiles_create_creates_wrapper_alias_when_safe so the test no
  longer writes to the real `~/.local/bin`.
- Extend test_dashboard_browser_safe_imports to scan `.ts` files in
  addition to `.tsx`.
- Switch upstream's new ModelsPage.tsx away from the `@nous-research/ui`
  root barrel onto per-component subpaths to satisfy the stricter scan.
- Fix NouiTypography `leading-1.4` -> `leading-[1.4]` so Tailwind
  actually emits the line-height for the `sm` variant.
- Guard ProfilesPage.openSoulEditor against out-of-order responses by
  tracking the latest requested profile via a ref.
- Replace ProfilesPage's hand-rolled setup command with a fetch to
  `/api/profiles/{name}/setup-command` so the copied command always
  matches what the backend would actually run (handles wrapper-alias
  collisions and reserved names correctly).
- Wire SOUL.md textarea label `htmlFor` -> textarea `id` so screen
  readers and clicking the label work as expected.
2026-04-30 06:43:22 -04:00
Austin Pickett cb0e2e2f36 Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-29 15:23:30 -04:00
vincez-hms-coder 4c0cc77e94 fix(dashboard): keep ui imports browser-safe after rebase 2026-04-29 01:47:13 -04:00
vincez-hms-coder 9b62c98170 chore(dashboard): restore package lock metadata 2026-04-29 01:43:21 -04:00
vincez-hms-coder 469e4df3c2 fix(profiles): preserve skills on dashboard profile creation 2026-04-29 01:42:51 -04:00
vincez-hms-coder ae11a31058 feat(profiles): add profile setup command endpoint and wrapper creation 2026-04-29 01:42:51 -04:00
vincez-hms-coder 3e200b64fb fix(profiles): update terminal command for copying based on profile name
Co-authored-by: Copilot <copilot@github.com>
2026-04-29 01:42:51 -04:00
vincez-hms-coder 1745cfc6d7 fix(dashboard): avoid node-only ui imports in browser 2026-04-29 01:42:50 -04:00
vincez-hms-coder 58c07867e3 fix(dashboard): keep profiles list resilient 2026-04-29 01:39:52 -04:00
vincez-hms-coder 4523965de9 feat(dashboard): add profiles management page
Copy profile dashboard changes onto a fresh branch under the vincez-hms-coder account.

Includes:
- Profiles dashboard route and sidebar entry
- Profile lifecycle REST endpoints
- SOUL.md read/write support
- i18n labels and helper text updates
- Targeted profile API tests

Test plan:
- pytest tests/hermes_cli/test_web_server.py -k profile -q
- cd web && npm run build
2026-04-29 01:39:51 -04:00
264 changed files with 39886 additions and 1982 deletions
+6
View File
@@ -9,6 +9,12 @@ node_modules
.venv
**/.venv
# Built artifacts that are regenerated inside the image. Excluded so local
# rebuilds on the developer's machine don't invalidate the npm-install layer
# that now depends on the full ui-tui/packages/hermes-ink/ tree being present.
ui-tui/dist/
ui-tui/packages/hermes-ink/dist/
# CI/CD
.github
+10
View File
@@ -76,6 +76,16 @@ jobs:
run: |
mkdir -p _site/docs
cp -r website/build/* _site/docs/
# llms.txt / llms-full.txt are also published at the site root
# (https://hermes-agent.nousresearch.com/llms.txt) because some
# agents and IDE plugins probe the classic root-level path rather
# than /docs/llms.txt. Same file, two URLs, one source of truth.
if [ -f website/build/llms.txt ]; then
cp website/build/llms.txt _site/llms.txt
fi
if [ -f website/build/llms-full.txt ]; then
cp website/build/llms-full.txt _site/llms-full.txt
fi
- name: Upload artifact
uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa # v3
+18 -8
View File
@@ -28,10 +28,26 @@ WORKDIR /opt/hermes
# ---------- Layer-cached dependency install ----------
# Copy only package manifests first so npm install + Playwright are cached
# unless the lockfiles themselves change.
#
# ui-tui/packages/hermes-ink/ is copied IN FULL (not just its manifests)
# because it is referenced as a `file:` workspace dependency from
# ui-tui/package.json. Copying the tree up front lets npm resolve the
# workspace to real content instead of stopping at a bare package.json.
COPY package.json package-lock.json ./
COPY web/package.json web/package-lock.json web/
COPY ui-tui/package.json ui-tui/package-lock.json ui-tui/
COPY ui-tui/packages/hermes-ink/package.json ui-tui/packages/hermes-ink/package-lock.json ui-tui/packages/hermes-ink/
COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
# `npm_config_install_links=false` forces npm to install `file:` deps as
# symlinks (the npm 10+ default) even on Debian's older bundled npm 9.x,
# which defaults to `install-links=true` and installs file deps as *copies*.
# The host-side package-lock.json is generated with a newer npm that uses
# symlinks, so an install-as-copy produces a hidden node_modules/.package-lock.json
# that permanently disagrees with the root lock on the @hermes/ink entry.
# That disagreement trips the TUI launcher's `_tui_need_npm_install()`
# check on every startup and triggers a runtime `npm install` that then
# fails with EACCES (node_modules/ is root-owned from build time).
ENV npm_config_install_links=false
RUN npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
@@ -45,13 +61,7 @@ COPY --chown=hermes:hermes . .
# Build browser dashboard and terminal UI assets.
RUN cd web && npm run build && \
cd ../ui-tui && npm run build && \
rm -rf node_modules/@hermes/ink && \
rm -rf packages/hermes-ink/node_modules && \
cp -R packages/hermes-ink node_modules/@hermes/ink && \
npm install --omit=dev --prefer-offline --no-audit --prefix node_modules/@hermes/ink && \
rm -rf node_modules/@hermes/ink/node_modules/react && \
node --input-type=module -e "await import('@hermes/ink')"
cd ../ui-tui && npm run build
# ---------- Permissions ----------
# Make install dir world-readable so any HERMES_UID can read it at runtime.
+136 -3
View File
@@ -164,6 +164,8 @@ class HermesACPAgent(acp.Agent):
"context": "Show conversation context info",
"reset": "Clear conversation history",
"compact": "Compress conversation context",
"steer": "Inject guidance into the currently running agent turn",
"queue": "Queue a prompt to run after the current turn finishes",
"version": "Show Hermes version",
}
@@ -193,6 +195,16 @@ class HermesACPAgent(acp.Agent):
"name": "compact",
"description": "Compress conversation context",
},
{
"name": "steer",
"description": "Inject guidance into the currently running agent turn",
"input_hint": "guidance for the active turn",
},
{
"name": "queue",
"description": "Queue a prompt to run after the current turn finishes",
"input_hint": "prompt to run next",
},
{
"name": "version",
"description": "Show Hermes version",
@@ -557,6 +569,9 @@ class HermesACPAgent(acp.Agent):
async def cancel(self, session_id: str, **kwargs: Any) -> None:
state = self.session_manager.get_session(session_id)
if state and state.cancel_event:
with state.runtime_lock:
if state.is_running and state.current_prompt_text:
state.interrupted_prompt_text = state.current_prompt_text
state.cancel_event.set()
try:
if getattr(state, "agent", None) and hasattr(state.agent, "interrupt"):
@@ -654,6 +669,39 @@ class HermesACPAgent(acp.Agent):
if not has_content:
return PromptResponse(stop_reason="end_turn")
# /steer on an idle session has no in-flight tool call to inject into.
# Rewrite it so the payload runs as a normal user prompt, matching the
# gateway's behavior (gateway/run.py ~L4898). Two sub-cases:
# 1. Zed-interrupt salvage — a prior prompt was cancelled by the
# client right before /steer arrived; replay it with the steer
# text attached as explicit correction/guidance so the user's
# in-flight work isn't lost.
# 2. Plain idle — no prior work to salvage; just run the steer
# payload as a regular prompt. Without this, _cmd_steer would
# silently append to state.queued_prompts and respond with
# "No active turn — queued for the next turn", which looks like
# /queue even though the user never typed /queue.
if isinstance(user_content, str) and user_text.startswith("/steer"):
steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
interrupted_prompt = ""
rewrite_idle = False
with state.runtime_lock:
if not state.is_running and steer_text:
if state.interrupted_prompt_text:
interrupted_prompt = state.interrupted_prompt_text
state.interrupted_prompt_text = ""
else:
rewrite_idle = True
if interrupted_prompt:
user_text = (
f"{interrupted_prompt}\n\n"
f"User correction/guidance after interrupt: {steer_text}"
)
user_content = user_text
elif rewrite_idle:
user_text = steer_text
user_content = steer_text
# Intercept slash commands — handle locally without calling the LLM.
# Slash commands are text-only; if the client included images/resources,
# send the whole multimodal prompt to the agent instead of treating it as
@@ -666,6 +714,24 @@ class HermesACPAgent(acp.Agent):
await self._conn.session_update(session_id, update)
return PromptResponse(stop_reason="end_turn")
# If Zed sends another regular prompt while the same ACP session is
# still running, queue it instead of racing two AIAgent loops against
# the same state.history. /steer and /queue are handled above and can
# land immediately.
with state.runtime_lock:
if state.is_running:
queued_text = user_text or "[Image attachment]"
state.queued_prompts.append(queued_text)
depth = len(state.queued_prompts)
if self._conn:
update = acp.update_agent_message_text(
f"Queued for the next turn. ({depth} queued)"
)
await self._conn.session_update(session_id, update)
return PromptResponse(stop_reason="end_turn")
state.is_running = True
state.current_prompt_text = user_text or "[Image attachment]"
logger.info("Prompt on session %s: %s", session_id, user_text[:100])
conn = self._conn
@@ -777,6 +843,9 @@ class HermesACPAgent(acp.Agent):
result = await loop.run_in_executor(_executor, ctx.run, _run_agent)
except Exception:
logger.exception("Executor error for session %s", session_id)
with state.runtime_lock:
state.is_running = False
state.current_prompt_text = ""
return PromptResponse(stop_reason="end_turn")
if result.get("messages"):
@@ -802,6 +871,28 @@ class HermesACPAgent(acp.Agent):
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
# Mark this turn idle before draining queued work so recursive prompt()
# calls can acquire the session. Queued turns are intentionally run as
# normal follow-up user prompts, preserving role alternation and history.
with state.runtime_lock:
state.is_running = False
state.current_prompt_text = ""
while True:
with state.runtime_lock:
if not state.queued_prompts:
break
next_prompt = state.queued_prompts.pop(0)
if conn:
await conn.session_update(
session_id,
acp.update_user_message_text(next_prompt),
)
await self.prompt(
prompt=[TextContentBlock(type="text", text=next_prompt)],
session_id=session_id,
)
usage = None
if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
usage = Usage(
@@ -879,6 +970,8 @@ class HermesACPAgent(acp.Agent):
"context": self._cmd_context,
"reset": self._cmd_reset,
"compact": self._cmd_compact,
"steer": self._cmd_steer,
"queue": self._cmd_queue,
"version": self._cmd_version,
}.get(cmd)
@@ -975,10 +1068,16 @@ class HermesACPAgent(acp.Agent):
if not hasattr(agent, "_compress_context"):
return "Context compression not available for this agent."
from agent.model_metadata import estimate_messages_tokens_rough
from agent.model_metadata import estimate_request_tokens_rough
original_count = len(state.history)
approx_tokens = estimate_messages_tokens_rough(state.history)
# Include system prompt + tool schemas so the figure reflects real
# request pressure, not a transcript-only underestimate (#6217).
_sys_prompt = getattr(agent, "_cached_system_prompt", "") or ""
_tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history, system_prompt=_sys_prompt, tools=_tools
)
original_session_db = getattr(agent, "_session_db", None)
try:
@@ -998,7 +1097,13 @@ class HermesACPAgent(acp.Agent):
self.session_manager.save_session(state.session_id)
new_count = len(state.history)
new_tokens = estimate_messages_tokens_rough(state.history)
_sys_prompt_after = getattr(agent, "_cached_system_prompt", "") or _sys_prompt
_tools_after = getattr(agent, "tools", None) or _tools
new_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=_sys_prompt_after,
tools=_tools_after,
)
return (
f"Context compressed: {original_count} -> {new_count} messages\n"
f"~{approx_tokens:,} -> ~{new_tokens:,} tokens"
@@ -1006,6 +1111,34 @@ class HermesACPAgent(acp.Agent):
except Exception as e:
return f"Compression failed: {e}"
def _cmd_steer(self, args: str, state: SessionState) -> str:
steer_text = args.strip()
if not steer_text:
return "Usage: /steer <guidance>"
if state.is_running and hasattr(state.agent, "steer"):
try:
if state.agent.steer(steer_text):
preview = steer_text[:80] + ("..." if len(steer_text) > 80 else "")
return f"⏩ Steer queued for the active turn: {preview}"
except Exception as exc:
logger.warning("ACP steer failed for session %s: %s", state.session_id, exc)
return f"⚠️ Steer failed: {exc}"
with state.runtime_lock:
state.queued_prompts.append(steer_text)
depth = len(state.queued_prompts)
return f"No active turn — queued for the next turn. ({depth} queued)"
def _cmd_queue(self, args: str, state: SessionState) -> str:
queued_text = args.strip()
if not queued_text:
return "Usage: /queue <prompt>"
with state.runtime_lock:
state.queued_prompts.append(queued_text)
depth = len(state.queued_prompts)
return f"Queued for the next turn. ({depth} queued)"
def _cmd_version(self, args: str, state: SessionState) -> str:
return f"Hermes Agent v{HERMES_VERSION}"
+46 -7
View File
@@ -26,6 +26,33 @@ from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
def _win_path_to_wsl(path: str) -> str | None:
"""Convert a Windows drive path to its WSL /mnt/<drive>/... equivalent."""
match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
if not match:
return None
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
return f"/mnt/{drive}/{tail}"
def _translate_acp_cwd(cwd: str) -> str:
"""Translate Windows ACP cwd values when Hermes itself is running in WSL.
Windows ACP clients can launch ``hermes acp`` inside WSL while still sending
editor workspaces as Windows drive paths such as ``E:\\Projects``. Store
and execute against the WSL mount path so agents, tools, and persisted ACP
sessions all agree on the usable workspace. Native Linux/macOS keeps the
original cwd unchanged.
"""
from hermes_constants import is_wsl
if not is_wsl():
return cwd
translated = _win_path_to_wsl(str(cwd))
return translated if translated is not None else cwd
def _normalize_cwd_for_compare(cwd: str | None) -> str:
raw = str(cwd or ".").strip()
if not raw:
@@ -34,11 +61,9 @@ def _normalize_cwd_for_compare(cwd: str | None) -> str:
# Normalize Windows drive paths into the equivalent WSL mount form so
# ACP history filters match the same workspace across Windows and WSL.
match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
if match:
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
expanded = f"/mnt/{drive}/{tail}"
translated = _win_path_to_wsl(expanded)
if translated is not None:
expanded = translated
elif re.match(r"^/mnt/[A-Za-z]/", expanded):
expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"
@@ -96,12 +121,18 @@ def _acp_stderr_print(*args, **kwargs) -> None:
def _register_task_cwd(task_id: str, cwd: str) -> None:
"""Bind a task/session id to the editor's working directory for tools."""
"""Bind a task/session id to the editor's working directory for tools.
Zed can launch Hermes from a Windows workspace while the ACP process runs
inside WSL. In that case ACP sends cwd as e.g. ``E:\\Projects\\POTI``;
local tools need the WSL mount equivalent or subprocess creation fails
before the command can run.
"""
if not task_id:
return
try:
from tools.terminal_tool import register_task_env_overrides
register_task_env_overrides(task_id, {"cwd": cwd})
register_task_env_overrides(task_id, {"cwd": _translate_acp_cwd(cwd)})
except Exception:
logger.debug("Failed to register ACP task cwd override", exc_info=True)
@@ -145,6 +176,11 @@ class SessionState:
model: str = ""
history: List[Dict[str, Any]] = field(default_factory=list)
cancel_event: Any = None # threading.Event
is_running: bool = False
queued_prompts: List[str] = field(default_factory=list)
runtime_lock: Any = field(default_factory=Lock)
current_prompt_text: str = ""
interrupted_prompt_text: str = ""
class SessionManager:
@@ -175,6 +211,7 @@ class SessionManager:
"""Create a new session with a unique ID and a fresh AIAgent."""
import threading
cwd = _translate_acp_cwd(cwd)
session_id = str(uuid.uuid4())
agent = self._make_agent(session_id=session_id, cwd=cwd)
state = SessionState(
@@ -217,6 +254,7 @@ class SessionManager:
"""Deep-copy a session's history into a new session."""
import threading
cwd = _translate_acp_cwd(cwd)
original = self.get_session(session_id) # checks DB too
if original is None:
return None
@@ -318,6 +356,7 @@ class SessionManager:
def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
"""Update the working directory for a session and its tool overrides."""
cwd = _translate_acp_cwd(cwd)
state = self.get_session(session_id) # checks DB too
if state is None:
return None
+29 -1
View File
@@ -1977,6 +1977,12 @@ def resolve_provider_client(
(client, resolved_model) or (None, None) if auth is unavailable.
"""
_validate_proxy_env_urls()
# Preserve the original provider name before alias normalization so a
# user-declared ``custom_providers`` entry whose name coincidentally
# matches a built-in alias (e.g. user names their custom provider "kimi"
# which aliases to "kimi-coding") is still reachable via the named-custom
# branch below.
original_provider = (provider or "").strip().lower()
# Normalise aliases
provider = _normalize_aux_provider(provider)
@@ -2163,7 +2169,18 @@ def resolve_provider_client(
# ── Named custom providers (config.yaml providers dict / custom_providers list) ───
try:
from hermes_cli.runtime_provider import _get_named_custom_provider
custom_entry = _get_named_custom_provider(provider)
# When the raw requested name is an alias (``kimi`` → ``kimi-coding``)
# and the user defined a ``custom_providers`` entry under that alias
# name, the custom entry is the intended target — the built-in alias
# rewriting would otherwise hijack the request. Only preferred when
# the raw name is an alias (not a canonical provider name) so custom
# entries that coincidentally match a canonical provider (e.g. ``nous``)
# still defer to the built-in per `_get_named_custom_provider`'s guard.
custom_entry = None
if original_provider and original_provider != provider:
custom_entry = _get_named_custom_provider(original_provider)
if custom_entry is None:
custom_entry = _get_named_custom_provider(provider)
if custom_entry:
custom_base = custom_entry.get("base_url", "").strip()
custom_key = custom_entry.get("api_key", "").strip()
@@ -2273,6 +2290,12 @@ def resolve_provider_client(
creds = resolve_api_key_provider_credentials(provider)
api_key = str(creds.get("api_key", "")).strip()
# Honour an explicit api_key override (e.g. from a fallback_model entry
# or a custom_providers entry) so callers that pass an explicit
# credential can authenticate against endpoints where no built-in
# credential is registered for this provider alias.
if explicit_api_key:
api_key = explicit_api_key.strip() or api_key
if not api_key:
tried_sources = list(pconfig.api_key_env_vars)
if provider == "copilot":
@@ -2284,6 +2307,11 @@ def resolve_provider_client(
raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
# Honour an explicit base_url override from the caller — used when a
# fallback_model entry (or custom_providers lookup) routes through a
# built-in provider name but targets a user-specified endpoint.
if explicit_base_url:
base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
final_model = _normalize_resolved_model(model or default_model, provider)
+3 -3
View File
@@ -538,7 +538,7 @@ class ContextCompressor(ContextEngine):
# Token-budget approach: walk backward accumulating tokens
accumulated = 0
boundary = len(result)
min_protect = min(protect_tail_count, len(result) - 1)
min_protect = min(protect_tail_count, len(result))
for i in range(len(result) - 1, -1, -1):
msg = result[i]
raw_content = msg.get("content") or ""
@@ -992,8 +992,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
def _get_tool_call_id(tc) -> str:
"""Extract the call ID from a tool_call entry (dict or SimpleNamespace)."""
if isinstance(tc, dict):
return tc.get("id", "")
return getattr(tc, "id", "") or ""
return tc.get("call_id", "") or tc.get("id", "") or ""
return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""
def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Fix orphaned tool_call / tool_result pairs after compression.
+188 -11
View File
@@ -55,6 +55,7 @@ def _default_state() -> Dict[str, Any]:
"last_run_at": None,
"last_run_duration_seconds": None,
"last_run_summary": None,
"last_report_path": None,
"paused": False,
"run_count": 0,
}
@@ -183,7 +184,16 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
Gates:
- curator.enabled == True
- not paused
- last_run_at missing, OR older than interval_hours
- last_run_at present AND older than interval_hours
First-run behavior: when there is no ``last_run_at`` (fresh install, or
install that predates the curator), we DO NOT run immediately. The
curator is designed to run after at least ``interval_hours`` (7 days by
default) of skill activity, not on the first background tick after
``hermes update``. On first observation we seed ``last_run_at`` to "now"
and defer the first real pass by one full interval. Users who want to
run it sooner can always invoke ``hermes curator run`` (with or without
``--dry-run``) explicitly — that path bypasses this gate.
The idle check (min_idle_hours) is applied at the call site where we know
whether an agent is actively running — here we only enforce the static
@@ -197,7 +207,21 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
state = load_state()
last = _parse_iso(state.get("last_run_at"))
if last is None:
return True
# Never run before. Seed state so we wait a full interval before the
# first real pass. Report-only; do not auto-mutate the library the
# very first time a gateway ticks after an update.
if now is None:
now = datetime.now(timezone.utc)
try:
state["last_run_at"] = now.isoformat()
state["last_run_summary"] = (
"deferred first run — curator seeded, will run after one "
"interval; use `hermes curator run --dry-run` to preview now"
)
save_state(state)
except Exception as e: # pragma: no cover — best-effort persistence
logger.debug("Failed to seed curator last_run_at: %s", e)
return False
if now is None:
now = datetime.now(timezone.utc)
@@ -258,6 +282,33 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
# Review prompt for the forked agent
# ---------------------------------------------------------------------------
CURATOR_DRY_RUN_BANNER = (
"═══════════════════════════════════════════════════════════════\n"
"DRY-RUN — REPORT ONLY. DO NOT MUTATE THE SKILL LIBRARY.\n"
"═══════════════════════════════════════════════════════════════\n"
"\n"
"This is a PREVIEW pass. Follow every instruction below EXCEPT:\n"
"\n"
" • DO NOT call skill_manage with action=patch, create, delete, "
"write_file, or remove_file.\n"
" • DO NOT call terminal to mv skill directories into .archive/.\n"
" • DO NOT call terminal to mv, cp, rm, or rewrite any file under "
"~/.hermes/skills/.\n"
" • skills_list and skill_view are FINE — read as much as you need.\n"
"\n"
"Your output IS the deliverable. Produce the exact same "
"human-readable summary and structured YAML block you would "
"produce on a live run — but describe the actions you WOULD take, "
"not actions you took. A downstream reviewer will read the report "
"and decide whether to approve a live run with "
"`hermes curator run` (no flag).\n"
"\n"
"If you accidentally take a mutating action, say so explicitly in "
"the summary so the reviewer can revert it.\n"
"═══════════════════════════════════════════════════════════════"
)
CURATOR_REVIEW_PROMPT = (
"You are running as Hermes' background skill CURATOR. This is an "
"UMBRELLA-BUILDING consolidation pass, not a passive audit and not a "
@@ -766,6 +817,39 @@ def _write_run_report(
consolidated = classification["consolidated"]
pruned = classification["pruned"]
# Rewrite cron job skill references. When the curator consolidates
# skill X into umbrella Y, any cron job that lists X fails to load
# it at run time — the scheduler skips it and the job runs without
# the instructions it was scheduled to follow. Rewriting the
# references in-place keeps scheduled jobs working across
# consolidation passes. Best-effort: never let a cron-module issue
# break the curator.
cron_rewrites: Dict[str, Any] = {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
try:
consolidated_map = {
e["name"]: e["into"]
for e in consolidated
if isinstance(e, dict) and e.get("name") and e.get("into")
}
pruned_names = [
e["name"] for e in pruned
if isinstance(e, dict) and e.get("name")
]
if consolidated_map or pruned_names:
from cron.jobs import rewrite_skill_refs as _rewrite_cron_refs
cron_rewrites = _rewrite_cron_refs(
consolidated=consolidated_map,
pruned=pruned_names,
)
except Exception as e:
logger.debug("Curator cron skill rewrite failed: %s", e, exc_info=True)
cron_rewrites = {
"rewrites": [],
"jobs_updated": 0,
"jobs_scanned": 0,
"error": str(e),
}
payload = {
"started_at": started_at.isoformat(),
"duration_seconds": round(elapsed_seconds, 2),
@@ -781,6 +865,7 @@ def _write_run_report(
"consolidated_this_run": len(consolidated),
"pruned_this_run": len(pruned),
"state_transitions": len(transitions),
"cron_jobs_rewritten": int(cron_rewrites.get("jobs_updated", 0)),
"tool_calls_total": sum(tc_counts.values()),
},
"tool_call_counts": tc_counts,
@@ -790,6 +875,7 @@ def _write_run_report(
"pruned_names": [p["name"] for p in pruned],
"added": added,
"state_transitions": transitions,
"cron_rewrites": cron_rewrites,
"llm_final": llm_meta.get("final", ""),
"llm_summary": llm_meta.get("summary", ""),
"llm_error": llm_meta.get("error"),
@@ -812,6 +898,17 @@ def _write_run_report(
except Exception as e:
logger.debug("Curator REPORT.md write failed: %s", e)
# cron_rewrites.json — only when at least one job was touched, to
# keep run dirs uncluttered for the common no-op case.
try:
if int(cron_rewrites.get("jobs_updated", 0)) > 0:
(run_dir / "cron_rewrites.json").write_text(
json.dumps(cron_rewrites, indent=2, ensure_ascii=False) + "\n",
encoding="utf-8",
)
except Exception as e:
logger.debug("Curator cron_rewrites.json write failed: %s", e)
return run_dir
@@ -942,6 +1039,39 @@ def _render_report_markdown(p: Dict[str, Any]) -> str:
lines.append(f"- `{t.get('name')}`: {t.get('from')}{t.get('to')}")
lines.append("")
# Cron job rewrites — show which scheduled jobs had their skill
# references updated so users can audit that the auto-rewrite did
# the right thing. Only present when at least one job changed.
cron_rw = p.get("cron_rewrites") or {}
cron_rewrites_list = cron_rw.get("rewrites") or []
if cron_rewrites_list:
lines.append(f"### Cron job skill references rewritten ({len(cron_rewrites_list)})\n")
lines.append(
"_Cron jobs that referenced a consolidated or pruned skill were "
"updated in-place so they keep loading the right instructions "
"on their next run. See `cron_rewrites.json` for the full record._\n"
)
SHOW = 25
for entry in cron_rewrites_list[:SHOW]:
job_name = entry.get("job_name") or entry.get("job_id") or "?"
before = entry.get("before") or []
after = entry.get("after") or []
mapped = entry.get("mapped") or {}
dropped = entry.get("dropped") or []
lines.append(
f"- `{job_name}`: `{', '.join(before)}` → `{', '.join(after) or '(none)'}`"
)
for old, new in mapped.items():
lines.append(f" - `{old}` → `{new}` (consolidated)")
for name in dropped:
lines.append(f" - `{name}` dropped (pruned)")
if len(cron_rewrites_list) > SHOW:
lines.append(
f"- … and {len(cron_rewrites_list) - SHOW} more "
"(see `cron_rewrites.json`)"
)
lines.append("")
# Full LLM final response
final = (p.get("llm_final") or "").strip()
if final:
@@ -992,6 +1122,7 @@ def _render_candidate_list() -> str:
def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
dry_run: bool = False,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
@@ -1004,9 +1135,43 @@ def run_curator_review(
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
and the LLM review pass is instructed to produce a report only — no
skill_manage mutations, no terminal archive moves. The REPORT.md still
gets written and ``state.last_report_path`` still records it so users
can read what the curator WOULD have done.
"""
start = datetime.now(timezone.utc)
counts = apply_automatic_transitions(now=start)
if dry_run:
# Count candidates without mutating state.
try:
report = skill_usage.agent_created_report()
counts = {
"checked": len(report),
"marked_stale": 0,
"archived": 0,
"reactivated": 0,
}
except Exception:
counts = {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
else:
# Pre-mutation snapshot — best-effort, never blocks the run. A
# failed snapshot logs at debug and continues (the alternative is
# that a transient disk issue silently disables curator forever,
# which is worse). Users who want to require snapshots can disable
# curator entirely until they can fix disk space.
try:
from agent import curator_backup
snap = curator_backup.snapshot_skills(reason="pre-curator-run")
if snap is not None and on_summary:
try:
on_summary(f"curator: snapshot created ({snap.name})")
except Exception:
pass
except Exception as e:
logger.debug("Curator pre-run snapshot failed: %s", e, exc_info=True)
counts = apply_automatic_transitions(now=start)
auto_summary_parts = []
if counts["marked_stale"]:
@@ -1018,11 +1183,16 @@ def run_curator_review(
auto_summary = ", ".join(auto_summary_parts) if auto_summary_parts else "no changes"
# Persist state before the LLM pass so a crash mid-review still records
# the run and doesn't immediately re-trigger.
# the run and doesn't immediately re-trigger. In dry-run we do NOT bump
# last_run_at or run_count — a preview shouldn't push the next scheduled
# real pass out. We still record a summary so `hermes curator status`
# shows that a preview ran.
state = load_state()
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
state["last_run_summary"] = f"auto: {auto_summary}"
if not dry_run:
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
prefix = "dry-run auto: " if dry_run else "auto: "
state["last_run_summary"] = f"{prefix}{auto_summary}"
save_state(state)
def _llm_pass():
@@ -1038,7 +1208,7 @@ def run_curator_review(
try:
candidate_list = _render_candidate_list()
if "No agent-created skills" in candidate_list:
final_summary = f"auto: {auto_summary}; llm: skipped (no candidates)"
final_summary = f"{prefix}{auto_summary}; llm: skipped (no candidates)"
llm_meta = {
"final": "",
"summary": "skipped (no candidates)",
@@ -1048,14 +1218,21 @@ def run_curator_review(
"error": None,
}
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
if dry_run:
prompt = (
f"{CURATOR_DRY_RUN_BANNER}\n\n"
f"{CURATOR_REVIEW_PROMPT}\n\n"
f"{candidate_list}"
)
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
llm_meta = _run_llm_review(prompt)
final_summary = (
f"auto: {auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
f"{prefix}{auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
)
except Exception as e:
logger.debug("Curator LLM pass failed: %s", e, exc_info=True)
final_summary = f"auto: {auto_summary}; llm: error ({e})"
final_summary = f"{prefix}{auto_summary}; llm: error ({e})"
llm_meta = {
"final": "",
"summary": f"error ({e})",
+440
View File
@@ -0,0 +1,440 @@
"""Curator snapshot + rollback.
A pre-run snapshot of ``~/.hermes/skills/`` (excluding ``.curator_backups/``
itself) is taken before any mutating curator pass. Snapshots are tar.gz
files under ``~/.hermes/skills/.curator_backups/<utc-iso>/`` with a
companion ``manifest.json`` describing the snapshot (reason, time, size,
counted skill files). Rollback picks a snapshot, moves the current
``skills/`` tree aside into another snapshot so even the rollback itself
is undoable, then extracts the chosen snapshot into place.
The snapshot does NOT include:
- ``.curator_backups/`` (would recurse)
- ``.hub/`` (hub-installed skills — managed by the hub, not us)
It DOES include:
- all SKILL.md files + their directories (``scripts/``, ``references/``,
``templates/``, ``assets/``)
- ``.usage.json`` (usage telemetry — needed to rehydrate state cleanly)
- ``.archive/`` (so rollback restores previously-archived skills too)
- ``.curator_state`` (so rolling back also restores the last-run-at
pointer — otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
"""
from __future__ import annotations
import json
import logging
import os
import re
import shutil
import tarfile
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
DEFAULT_KEEP = 5
# Entries under skills/ that should NEVER be rolled up into a snapshot.
# .hub/ is managed by the skills hub; rolling it back would break lockfile
# invariants. .curator_backups is the backup dir itself — recursion bomb.
_EXCLUDE_TOP_LEVEL = {".curator_backups", ".hub"}
# Snapshot id regex: UTC ISO with colons replaced by dashes so the filename
# is portable (Windows-safe). An optional ``-NN`` suffix handles two
# snapshots landing in the same wallclock second.
_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")
def _backups_dir() -> Path:
return get_hermes_home() / "skills" / ".curator_backups"
def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
now = datetime.now(timezone.utc)
# isoformat → "2026-05-01T13:05:42.123456+00:00"; strip subseconds and tz.
s = now.replace(microsecond=0).isoformat()
if s.endswith("+00:00"):
s = s[:-6]
return s.replace(":", "-") + "Z"
def _load_config() -> Dict[str, Any]:
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e:
logger.debug("Failed to load config for curator backup: %s", e)
return {}
if not isinstance(cfg, dict):
return {}
cur = cfg.get("curator") or {}
if not isinstance(cur, dict):
return {}
bk = cur.get("backup") or {}
return bk if isinstance(bk, dict) else {}
def is_enabled() -> bool:
"""Default ON — the whole point of the backup is safety by default."""
return bool(_load_config().get("enabled", True))
def get_keep() -> int:
cfg = _load_config()
try:
n = int(cfg.get("keep", DEFAULT_KEEP))
except (TypeError, ValueError):
n = DEFAULT_KEEP
return max(1, n)
# ---------------------------------------------------------------------------
# Snapshot
# ---------------------------------------------------------------------------
def _count_skill_files(base: Path) -> int:
try:
return sum(1 for _ in base.rglob("SKILL.md"))
except OSError:
return 0
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int) -> None:
manifest = {
"id": dest.name,
"reason": reason,
"created_at": datetime.now(timezone.utc).isoformat(),
"archive": archive_path.name,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
def snapshot_skills(reason: str = "manual") -> Optional[Path]:
"""Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
Returns the snapshot directory path, or ``None`` if the snapshot was
skipped (backup disabled, skills dir missing, or an IO error occurred —
in which case we log at debug and return None so the curator never
aborts a pass because of a backup failure).
"""
if not is_enabled():
logger.debug("Curator backup disabled by config; skipping snapshot")
return None
skills = _skills_dir()
if not skills.exists():
logger.debug("No ~/.hermes/skills/ directory — nothing to back up")
return None
backups = _backups_dir()
try:
backups.mkdir(parents=True, exist_ok=True)
except OSError as e:
logger.debug("Failed to create backups dir %s: %s", backups, e)
return None
# Uniquify: if a snapshot with the same second already exists (can
# happen if two curator runs fire in the same second), append a short
# counter. Avoids clobbering and avoids timestamp collisions.
base_id = _utc_id()
snap_id = base_id
counter = 1
while (backups / snap_id).exists():
snap_id = f"{base_id}-{counter:02d}"
counter += 1
dest = backups / snap_id
try:
dest.mkdir(parents=True, exist_ok=False)
except OSError as e:
logger.debug("Failed to create snapshot dir %s: %s", dest, e)
return None
archive = dest / "skills.tar.gz"
try:
# Stream into the tarball — no tempdir copy needed.
with tarfile.open(archive, "w:gz", compresslevel=6) as tf:
for entry in sorted(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
_write_manifest(dest, reason, archive, _count_skill_files(skills))
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
try:
shutil.rmtree(dest, ignore_errors=True)
except OSError:
pass
return None
_prune_old(keep=get_keep())
logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
return dest
def _prune_old(keep: int) -> List[str]:
"""Delete regular snapshots beyond the newest *keep*. Returns deleted
ids. Staging dirs (``.rollback-staging-*``) are implementation detail
and pruned independently on every call."""
backups = _backups_dir()
if not backups.exists():
return []
entries: List[Tuple[str, Path]] = []
stale_staging: List[Path] = []
for child in backups.iterdir():
if not child.is_dir():
continue
if child.name.startswith(".rollback-staging-"):
# Staging dirs are only supposed to exist briefly during a
# rollback. If we find one here (e.g. from a crashed rollback),
# clean it up opportunistically.
stale_staging.append(child)
continue
if _ID_RE.match(child.name):
entries.append((child.name, child))
# Newest first (lexicographic works because the id is UTC ISO).
entries.sort(key=lambda t: t[0], reverse=True)
deleted: List[str] = []
for _, path in entries[keep:]:
try:
shutil.rmtree(path)
deleted.append(path.name)
except OSError as e:
logger.debug("Failed to prune %s: %s", path, e)
for path in stale_staging:
try:
shutil.rmtree(path)
except OSError as e:
logger.debug("Failed to clean stale staging dir %s: %s", path, e)
return deleted
# ---------------------------------------------------------------------------
# List + rollback
# ---------------------------------------------------------------------------
def _read_manifest(snap_dir: Path) -> Dict[str, Any]:
mf = snap_dir / "manifest.json"
if not mf.exists():
return {}
try:
return json.loads(mf.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return {}
def list_backups() -> List[Dict[str, Any]]:
"""Return all restorable snapshots, newest first. Only entries with a
real ``skills.tar.gz`` tarball are listed — transient
``.rollback-staging-*`` directories created mid-rollback are
implementation detail and not shown."""
backups = _backups_dir()
if not backups.exists():
return []
out: List[Dict[str, Any]] = []
for child in sorted(backups.iterdir(), reverse=True):
if not child.is_dir():
continue
if not _ID_RE.match(child.name):
continue
if not (child / "skills.tar.gz").exists():
continue
mf = _read_manifest(child)
mf.setdefault("id", child.name)
mf.setdefault("path", str(child))
if "archive_bytes" not in mf:
arc = child / "skills.tar.gz"
try:
mf["archive_bytes"] = arc.stat().st_size
except OSError:
mf["archive_bytes"] = 0
out.append(mf)
return out
def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
"""Return the path of the requested backup, or the newest one if
*backup_id* is None. Returns None if no match."""
backups = _backups_dir()
if not backups.exists():
return None
if backup_id:
target = backups / backup_id
if (
target.is_dir()
and _ID_RE.match(backup_id)
and (target / "skills.tar.gz").exists()
):
return target
return None
candidates = [
c for c in sorted(backups.iterdir(), reverse=True)
if c.is_dir() and _ID_RE.match(c.name) and (c / "skills.tar.gz").exists()
]
return candidates[0] if candidates else None
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
Strategy:
1. Resolve the target snapshot (explicit id or newest regular).
2. Take a safety snapshot of the CURRENT skills tree under
``.curator_backups/pre-rollback-<ts>/`` so the rollback itself is
undoable.
3. Move all current top-level entries (except ``.curator_backups``
and ``.hub``) into a tempdir.
4. Extract the chosen snapshot into ``~/.hermes/skills/``.
5. On failure during 4, move the tempdir contents back (best-effort)
and return failure.
Returns ``(ok, message, snapshot_path)``.
"""
target = _resolve_backup(backup_id)
if target is None:
return (
False,
f"no matching backup found"
+ (f" for id '{backup_id}'" if backup_id else "")
+ " (use `hermes curator rollback --list` to see available snapshots)",
None,
)
archive = target / "skills.tar.gz"
if not archive.exists():
return (False, f"snapshot {target.name} has no skills.tar.gz — corrupted?", None)
skills = _skills_dir()
skills.mkdir(parents=True, exist_ok=True)
backups = _backups_dir()
backups.mkdir(parents=True, exist_ok=True)
# Step 2: safety snapshot of current state FIRST. If this fails we bail
# out before touching anything — otherwise a failed extract could leave
# the user with no skills.
try:
snapshot_skills(reason=f"pre-rollback to {target.name}")
except Exception as e:
return (False, f"pre-rollback safety snapshot failed: {e}", None)
# Additionally move current entries into an internal staging dir so
# the extract happens into an empty skills tree (predictable result).
# This dir is implementation detail — not listed as a restorable
# backup. The safety snapshot above is the user-facing undo handle.
staged = backups / f".rollback-staging-{_utc_id()}"
try:
staged.mkdir(parents=True, exist_ok=False)
except OSError as e:
return (False, f"failed to create staging dir: {e}", None)
moved: List[Tuple[Path, Path]] = []
try:
for entry in list(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
dest = staged / entry.name
shutil.move(str(entry), str(dest))
moved.append((entry, dest))
except OSError as e:
# Best-effort rollback of the move
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"failed to stage current skills: {e}", None)
# Step 4: extract the snapshot into skills/
try:
with tarfile.open(archive, "r:gz") as tf:
# Python 3.12+ supports filter='data' for safer extraction.
# Fall back to the unfiltered call for older interpreters but
# still reject absolute paths and .. components defensively.
for member in tf.getmembers():
name = member.name
if name.startswith("/") or ".." in Path(name).parts:
raise tarfile.TarError(
f"refusing to extract unsafe path: {name!r}"
)
try:
tf.extractall(str(skills), filter="data") # type: ignore[call-arg]
except TypeError:
# Python < 3.12 — no filter kwarg
tf.extractall(str(skills))
except (OSError, tarfile.TarError) as e:
# Best-effort recover: move staged contents back
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"snapshot extract failed (state restored): {e}", None)
# Extract succeeded — the staging dir has served its purpose. The
# user's undo handle is the safety snapshot tarball we took earlier.
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
logger.info("Curator rollback: restored from %s", target.name)
return (True, f"restored from snapshot {target.name}", target)
# ---------------------------------------------------------------------------
# Human-readable summary for CLI
# ---------------------------------------------------------------------------
def format_size(n: int) -> str:
for unit in ("B", "KB", "MB", "GB"):
if n < 1024 or unit == "GB":
return f"{n:.1f} {unit}" if unit != "B" else f"{n} B"
n /= 1024
return f"{n:.1f} GB"
def summarize_backups() -> str:
rows = list_backups()
if not rows:
return "No curator snapshots yet."
lines = [f"{'id':<24} {'reason':<40} {'skills':>6} {'size':>8}"]
lines.append("" * len(lines[0]))
for r in rows:
lines.append(
f"{r.get('id','?'):<24} "
f"{(r.get('reason','?') or '?')[:40]:<40} "
f"{r.get('skill_files', 0):>6} "
f"{format_size(int(r.get('archive_bytes', 0))):>8}"
)
return "\n".join(lines)
+5 -5
View File
@@ -20,25 +20,25 @@ def summarize_manual_compression(
headline = f"No changes from compression: {before_count} messages"
if after_tokens == before_tokens:
token_line = (
f"Rough transcript estimate: ~{before_tokens:,} tokens (unchanged)"
f"Approx request size: ~{before_tokens:,} tokens (unchanged)"
)
else:
token_line = (
f"Rough transcript estimate: ~{before_tokens:,}"
f"Approx request size: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
else:
headline = f"Compressed: {before_count}{after_count} messages"
token_line = (
f"Rough transcript estimate: ~{before_tokens:,}"
f"Approx request size: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
note = None
if not noop and after_count < before_count and after_tokens > before_tokens:
note = (
"Note: fewer messages can still raise this rough transcript estimate "
"when compression rewrites the transcript into denser summaries."
"Note: fewer messages can still raise this estimate when "
"compression rewrites the transcript into denser summaries."
)
return {
+45 -4
View File
@@ -81,15 +81,56 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
return repaired
# Rule 2: when anyOf is present, type belongs only on the children.
# Additionally, Moonshot rejects null-type branches inside anyOf
# (enum value (<nil>) does not match any type in [string]).
# Collapse the anyOf to the first non-null branch and infer its type.
if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
repaired.pop("type", None)
return repaired
non_null = [b for b in repaired["anyOf"]
if isinstance(b, dict) and b.get("type") != "null"]
if non_null and len(non_null) < len(repaired["anyOf"]):
# Drop the anyOf wrapper — keep only the non-null branch.
# If there's a single non-null branch, promote it and fall
# through to Rules 1/3 so nullable/enum cleanup still applies
# to the merged node.
if len(non_null) == 1:
merge = {k: v for k, v in repaired.items() if k != "anyOf"}
merge.update(non_null[0])
repaired = merge
else:
repaired["anyOf"] = non_null
return repaired
else:
# Nothing to collapse — parent type stripped, children already
# repaired by the recursive walk above.
return repaired
# Moonshot also rejects non-standard keywords like ``nullable`` on
# parameter schemas — strip it.
repaired.pop("nullable", None)
# Rule 1: property schemas without type need one. $ref nodes are exempt
# — their type comes from the referenced definition.
if "$ref" in repaired:
return repaired
return _fill_missing_type(repaired)
# Fill missing type BEFORE Rule 3 so enum cleanup can check the type.
if "$ref" not in repaired:
repaired = _fill_missing_type(repaired)
# Rule 3: Moonshot rejects null/empty-string values inside enum arrays
# when the parent type is a scalar (string, integer, etc.). The error:
# "enum value (<nil>) does not match any type in [string]"
# Strip null and empty-string from enum values, and if the enum becomes
# empty, drop it entirely.
if "enum" in repaired and isinstance(repaired["enum"], list):
node_type = repaired.get("type")
if node_type in ("string", "integer", "number", "boolean"):
cleaned = [v for v in repaired["enum"]
if v is not None and v != ""]
if cleaned:
repaired["enum"] = cleaned
else:
repaired.pop("enum")
return repaired
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
+58
View File
@@ -182,6 +182,64 @@ SKILLS_GUIDANCE = (
"Skills that aren't maintained become liabilities."
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
"they write directly to the shared SQLite DB and work regardless of terminal "
"backend (local/docker/modal/ssh).\n"
"\n"
"## Lifecycle\n"
"\n"
"1. **Orient.** Call `kanban_show()` first (no args — it defaults to your "
"task). The response includes title, body, parent-task handoffs (summary + "
"metadata), any prior attempts on this task if you're a retry, the full "
"comment thread, and a pre-formatted `worker_context` you can treat as "
"ground truth.\n"
"2. **Work inside the workspace.** `cd $HERMES_KANBAN_WORKSPACE` before "
"any file operations. The workspace is yours for this run. Don't modify "
"files outside it unless the task explicitly asks.\n"
"3. **Heartbeat on long operations.** Call `kanban_heartbeat(note=...)` "
"every few minutes during long subprocesses (training, encoding, crawling). "
"Skip heartbeats for short tasks.\n"
"4. **Block on genuine ambiguity.** If you need a human decision you cannot "
"infer (missing credentials, UX choice, paywalled source, peer output you "
"need first), call `kanban_block(reason=\"...\")` and stop. Don't guess. "
"The user will unblock with context and the dispatcher will respawn you.\n"
"5. **Complete with structured handoff.** Call `kanban_complete(summary=..., "
"metadata=...)`. `summary` is 13 human-readable sentences naming concrete "
"artifacts. `metadata` is machine-readable facts "
"(`{changed_files: [...], tests_run: N, decisions: [...]}`). Downstream "
"workers read both via their own `kanban_show`. Never put secrets / "
"tokens / raw PII in either field — run rows are durable forever.\n"
"6. **If follow-up work appears, create it; don't do it.** Use "
"`kanban_create(title=..., assignee=<right-profile>, parents=[your-task-id])` "
"to spawn a child task for the appropriate specialist profile instead of "
"scope-creeping into the next thing.\n"
"\n"
"## Orchestrator mode\n"
"\n"
"If your task is itself a decomposition task (e.g. a planner profile given "
"a high-level goal), use `kanban_create` to fan out into child tasks — one "
"per specialist, each with an explicit `assignee` and `parents=[...]` to "
"express dependencies. Then `kanban_complete` your own task with a summary "
"of the decomposition. Do NOT execute the work yourself; your job is "
"routing, not implementation.\n"
"\n"
"## Do NOT\n"
"\n"
"- Do not shell out to `hermes kanban <verb>` for board operations. Use "
"the `kanban_*` tools — they work across all terminal backends.\n"
"- Do not complete a task you didn't actually finish. Block it.\n"
"- Do not assign follow-up work to yourself. Assign it to the right "
"specialist profile.\n"
"- Do not call `delegate_task` as a board substitute. `delegate_task` is "
"for short reasoning subtasks inside your own run; board tasks are for "
"cross-agent handoffs that outlive one API loop."
)
TOOL_USE_ENFORCEMENT_GUIDANCE = (
"# Tool-use enforcement\n"
"You MUST use your tools to take action — do not describe what you would do "
+455
View File
@@ -0,0 +1,455 @@
"""Pure tool-call loop guardrail primitives.
The controller in this module is intentionally side-effect free: it tracks
per-turn tool-call observations and returns decisions. Runtime code owns whether
those decisions become warning guidance, synthetic tool results, or controlled
turn halts.
"""
from __future__ import annotations
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Mapping
from utils import safe_json_loads
IDEMPOTENT_TOOL_NAMES = frozenset(
{
"read_file",
"search_files",
"web_search",
"web_extract",
"session_search",
"browser_snapshot",
"browser_console",
"browser_get_images",
"mcp_filesystem_read_file",
"mcp_filesystem_read_text_file",
"mcp_filesystem_read_multiple_files",
"mcp_filesystem_list_directory",
"mcp_filesystem_list_directory_with_sizes",
"mcp_filesystem_directory_tree",
"mcp_filesystem_get_file_info",
"mcp_filesystem_search_files",
}
)
MUTATING_TOOL_NAMES = frozenset(
{
"terminal",
"execute_code",
"write_file",
"patch",
"todo",
"memory",
"skill_manage",
"browser_click",
"browser_type",
"browser_press",
"browser_scroll",
"browser_navigate",
"send_message",
"cronjob",
"delegate_task",
"process",
}
)
@dataclass(frozen=True)
class ToolCallGuardrailConfig:
"""Thresholds for per-turn tool-call loop detection.
Warnings are enabled by default and never prevent tool execution. Hard stops
are explicit opt-in so interactive CLI/TUI sessions get a gentle nudge unless
the user enables circuit-breaker behavior in config.yaml.
"""
warnings_enabled: bool = True
hard_stop_enabled: bool = False
exact_failure_warn_after: int = 2
exact_failure_block_after: int = 5
same_tool_failure_warn_after: int = 3
same_tool_failure_halt_after: int = 8
no_progress_warn_after: int = 2
no_progress_block_after: int = 5
idempotent_tools: frozenset[str] = field(default_factory=lambda: IDEMPOTENT_TOOL_NAMES)
mutating_tools: frozenset[str] = field(default_factory=lambda: MUTATING_TOOL_NAMES)
@classmethod
def from_mapping(cls, data: Mapping[str, Any] | None) -> "ToolCallGuardrailConfig":
"""Build config from the `tool_loop_guardrails` config.yaml section."""
if not isinstance(data, Mapping):
return cls()
warn_after = data.get("warn_after")
if not isinstance(warn_after, Mapping):
warn_after = {}
hard_stop_after = data.get("hard_stop_after")
if not isinstance(hard_stop_after, Mapping):
hard_stop_after = {}
defaults = cls()
return cls(
warnings_enabled=_as_bool(data.get("warnings_enabled"), defaults.warnings_enabled),
hard_stop_enabled=_as_bool(data.get("hard_stop_enabled"), defaults.hard_stop_enabled),
exact_failure_warn_after=_positive_int(
warn_after.get("exact_failure", data.get("exact_failure_warn_after")),
defaults.exact_failure_warn_after,
),
same_tool_failure_warn_after=_positive_int(
warn_after.get("same_tool_failure", data.get("same_tool_failure_warn_after")),
defaults.same_tool_failure_warn_after,
),
no_progress_warn_after=_positive_int(
warn_after.get("idempotent_no_progress", data.get("no_progress_warn_after")),
defaults.no_progress_warn_after,
),
exact_failure_block_after=_positive_int(
hard_stop_after.get("exact_failure", data.get("exact_failure_block_after")),
defaults.exact_failure_block_after,
),
same_tool_failure_halt_after=_positive_int(
hard_stop_after.get("same_tool_failure", data.get("same_tool_failure_halt_after")),
defaults.same_tool_failure_halt_after,
),
no_progress_block_after=_positive_int(
hard_stop_after.get("idempotent_no_progress", data.get("no_progress_block_after")),
defaults.no_progress_block_after,
),
)
@dataclass(frozen=True)
class ToolCallSignature:
"""Stable, non-reversible identity for a tool name plus canonical args."""
tool_name: str
args_hash: str
@classmethod
def from_call(cls, tool_name: str, args: Mapping[str, Any] | None) -> "ToolCallSignature":
canonical = canonical_tool_args(args or {})
return cls(tool_name=tool_name, args_hash=_sha256(canonical))
def to_metadata(self) -> dict[str, str]:
"""Return public metadata without raw argument values."""
return {"tool_name": self.tool_name, "args_hash": self.args_hash}
@dataclass(frozen=True)
class ToolGuardrailDecision:
"""Decision returned by the tool-call guardrail controller."""
action: str = "allow" # allow | warn | block | halt
code: str = "allow"
message: str = ""
tool_name: str = ""
count: int = 0
signature: ToolCallSignature | None = None
@property
def allows_execution(self) -> bool:
return self.action in {"allow", "warn"}
@property
def should_halt(self) -> bool:
return self.action in {"block", "halt"}
def to_metadata(self) -> dict[str, Any]:
data: dict[str, Any] = {
"action": self.action,
"code": self.code,
"message": self.message,
"tool_name": self.tool_name,
"count": self.count,
}
if self.signature is not None:
data["signature"] = self.signature.to_metadata()
return data
def canonical_tool_args(args: Mapping[str, Any]) -> str:
"""Return sorted compact JSON for parsed tool arguments."""
if not isinstance(args, Mapping):
raise TypeError(f"tool args must be a mapping, got {type(args).__name__}")
return json.dumps(
args,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]:
"""Safety-fallback classifier used only when callers don't pass ``failed``.
Mirrors ``agent.display._detect_tool_failure`` exactly so the guardrail
never disagrees with the CLI's user-visible ``[error]`` tag. Production
callers in ``run_agent.py`` always pass an explicit ``failed=`` derived
from ``_detect_tool_failure``; this function exists so standalone callers
(tests, tooling) still get consistent behavior.
"""
if result is None:
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
if isinstance(data, dict):
exit_code = data.get("exit_code")
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
return False, ""
if tool_name == "memory":
data = safe_json_loads(result)
if isinstance(data, dict):
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
lower = result[:500].lower()
if '"error"' in lower or '"failed"' in lower or result.startswith("Error"):
return True, " [error]"
return False, ""
class ToolCallGuardrailController:
"""Per-turn controller for repeated failed/non-progressing tool calls."""
def __init__(self, config: ToolCallGuardrailConfig | None = None):
self.config = config or ToolCallGuardrailConfig()
self.reset_for_turn()
def reset_for_turn(self) -> None:
self._exact_failure_counts: dict[ToolCallSignature, int] = {}
self._same_tool_failure_counts: dict[str, int] = {}
self._no_progress: dict[ToolCallSignature, tuple[str, int]] = {}
self._halt_decision: ToolGuardrailDecision | None = None
@property
def halt_decision(self) -> ToolGuardrailDecision | None:
return self._halt_decision
def before_call(self, tool_name: str, args: Mapping[str, Any] | None) -> ToolGuardrailDecision:
signature = ToolCallSignature.from_call(tool_name, _coerce_args(args))
if not self.config.hard_stop_enabled:
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
exact_count = self._exact_failure_counts.get(signature, 0)
if exact_count >= self.config.exact_failure_block_after:
decision = ToolGuardrailDecision(
action="block",
code="repeated_exact_failure_block",
message=(
f"Blocked {tool_name}: the same tool call failed {exact_count} "
"times with identical arguments. Stop retrying it unchanged; "
"change strategy or explain the blocker."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self._is_idempotent(tool_name):
record = self._no_progress.get(signature)
if record is not None:
_result_hash, repeat_count = record
if repeat_count >= self.config.no_progress_block_after:
decision = ToolGuardrailDecision(
action="block",
code="idempotent_no_progress_block",
message=(
f"Blocked {tool_name}: this read-only call returned the same "
f"result {repeat_count} times. Stop repeating it unchanged; "
"use the result already provided or try a different query."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
self._halt_decision = decision
return decision
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
def after_call(
self,
tool_name: str,
args: Mapping[str, Any] | None,
result: str | None,
*,
failed: bool | None = None,
) -> ToolGuardrailDecision:
args = _coerce_args(args)
signature = ToolCallSignature.from_call(tool_name, args)
if failed is None:
failed, _ = classify_tool_failure(tool_name, result)
if failed:
exact_count = self._exact_failure_counts.get(signature, 0) + 1
self._exact_failure_counts[signature] = exact_count
self._no_progress.pop(signature, None)
same_count = self._same_tool_failure_counts.get(tool_name, 0) + 1
self._same_tool_failure_counts[tool_name] = same_count
if self.config.hard_stop_enabled and same_count >= self.config.same_tool_failure_halt_after:
decision = ToolGuardrailDecision(
action="halt",
code="same_tool_failure_halt",
message=(
f"Stopped {tool_name}: it failed {same_count} times this turn. "
"Stop retrying the same failing tool path and choose a different approach."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self.config.warnings_enabled and exact_count >= self.config.exact_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="repeated_exact_failure_warning",
message=(
f"{tool_name} has failed {exact_count} times with identical arguments. "
"This looks like a loop; inspect the error and change strategy "
"instead of retrying it unchanged."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
if self.config.warnings_enabled and same_count >= self.config.same_tool_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="same_tool_failure_warning",
message=(
f"{tool_name} has failed {same_count} times this turn. "
"This looks like a loop; change approach before retrying."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=exact_count, signature=signature)
self._exact_failure_counts.pop(signature, None)
self._same_tool_failure_counts.pop(tool_name, None)
if not self._is_idempotent(tool_name):
self._no_progress.pop(signature, None)
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
result_hash = _result_hash(result)
previous = self._no_progress.get(signature)
repeat_count = 1
if previous is not None and previous[0] == result_hash:
repeat_count = previous[1] + 1
self._no_progress[signature] = (result_hash, repeat_count)
if self.config.warnings_enabled and repeat_count >= self.config.no_progress_warn_after:
return ToolGuardrailDecision(
action="warn",
code="idempotent_no_progress_warning",
message=(
f"{tool_name} returned the same result {repeat_count} times. "
"Use the result already provided or change the query instead of "
"repeating it unchanged."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=repeat_count, signature=signature)
def _is_idempotent(self, tool_name: str) -> bool:
if tool_name in self.config.mutating_tools:
return False
return tool_name in self.config.idempotent_tools
def toolguard_synthetic_result(decision: ToolGuardrailDecision) -> str:
"""Build a synthetic role=tool content string for a blocked tool call."""
return json.dumps(
{
"error": decision.message,
"guardrail": decision.to_metadata(),
},
ensure_ascii=False,
)
def append_toolguard_guidance(result: str, decision: ToolGuardrailDecision) -> str:
"""Append runtime guidance to the current tool result content."""
if decision.action not in {"warn", "halt"} or not decision.message:
return result
label = "Tool loop hard stop" if decision.action == "halt" else "Tool loop warning"
suffix = (
f"\n\n[{label}: "
f"{decision.code}; count={decision.count}; {decision.message}]"
)
return (result or "") + suffix
def _coerce_args(args: Mapping[str, Any] | None) -> Mapping[str, Any]:
return args if isinstance(args, Mapping) else {}
def _result_hash(result: str | None) -> str:
parsed = safe_json_loads(result or "")
if parsed is not None:
try:
canonical = json.dumps(
parsed,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
except TypeError:
canonical = str(parsed)
else:
canonical = result or ""
return _sha256(canonical)
def _as_bool(value: Any, default: bool) -> bool:
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, (int, float)):
return bool(value)
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in {"1", "true", "yes", "on", "enabled"}:
return True
if lowered in {"0", "false", "no", "off", "disabled"}:
return False
return default
def _positive_int(value: Any, default: int) -> int:
if value is None:
return default
try:
parsed = int(value)
except (TypeError, ValueError):
return default
return parsed if parsed >= 1 else default
def _sha256(value: str) -> str:
return hashlib.sha256(value.encode("utf-8")).hexdigest()
+19
View File
@@ -289,6 +289,25 @@ browser:
# after this period of no activity between agent loops (default: 120 = 2 minutes)
inactivity_timeout: 120
# =============================================================================
# Tool Loop Guardrails
# =============================================================================
# Soft warnings are enabled by default. They append guidance to repeated failed
# or non-progressing tool results but still let the tool execute. Hard stops are
# opt-in circuit breakers for autonomous/cron sessions where stopping a loop is
# preferable to spending the full iteration budget.
tool_loop_guardrails:
warnings_enabled: true
hard_stop_enabled: false
warn_after:
exact_failure: 2
same_tool_failure: 3
idempotent_no_progress: 2
hard_stop_after:
exact_failure: 5
same_tool_failure: 8
idempotent_no_progress: 5
# =============================================================================
# Context Compression (Auto-shrinks long conversations)
# =============================================================================
+354 -15
View File
@@ -15,7 +15,6 @@ Usage:
import logging
import os
import re
import shutil
import sys
import json
@@ -86,7 +85,7 @@ from hermes_cli.browser_connect import (
try_launch_chrome_debug,
)
from hermes_cli.env_loader import load_hermes_dotenv
from utils import base_url_host_matches
from utils import base_url_host_matches, is_truthy_value
_hermes_home = get_hermes_home()
_project_env = Path(__file__).parent / '.env'
@@ -600,6 +599,7 @@ def load_cli_config() -> Dict[str, Any]:
# Load configuration at module startup
CLI_CONFIG = load_cli_config()
# Initialize centralized logging early — agent.log + errors.log in ~/.hermes/logs/.
# This ensures CLI sessions produce a log trail even before AIAgent is instantiated.
try:
@@ -934,6 +934,20 @@ def _run_state_db_auto_maintenance(session_db) -> None:
try:
from hermes_cli.config import load_config as _load_full_config
from hermes_constants import get_hermes_home as _get_hermes_home
_hermes_home_maint = _get_hermes_home()
# One-time prune of empty TUI ghost sessions.
try:
if not session_db.get_meta("ghost_session_prune_v1"):
pruned = session_db.prune_empty_ghost_sessions(
sessions_dir=_hermes_home_maint / "sessions"
)
session_db.set_meta("ghost_session_prune_v1", "1")
if pruned:
logger.info("Pruned %d empty TUI ghost sessions", pruned)
except Exception as _prune_exc:
logger.debug("Ghost session prune skipped: %s", _prune_exc)
cfg = (_load_full_config().get("sessions") or {})
if not cfg.get("auto_prune", False):
return
@@ -941,7 +955,7 @@ def _run_state_db_auto_maintenance(session_db) -> None:
retention_days=int(cfg.get("retention_days", 90)),
min_interval_hours=int(cfg.get("min_interval_hours", 24)),
vacuum=bool(cfg.get("vacuum_after_prune", True)),
sessions_dir=_get_hermes_home() / "sessions",
sessions_dir=_hermes_home_maint / "sessions",
)
except Exception as exc:
logger.debug("state.db auto-maintenance skipped: %s", exc)
@@ -1240,8 +1254,73 @@ def _cprint(text: str):
Raw ANSI escapes written via print() are swallowed by patch_stdout's
StdoutProxy. Routing through print_formatted_text(ANSI(...)) lets
prompt_toolkit parse the escapes and render real colors.
When called from a background thread while a prompt_toolkit
``Application`` is running (the common case for the self-improvement
background review's ``💾 …`` summary, curator summaries, and other
bg-thread emissions), a direct ``_pt_print`` races with the input
area's redraw and the line can end up visually buried behind the
prompt. Route those cases through ``run_in_terminal`` via
``loop.call_soon_threadsafe``, which pauses the input area, prints
the line above it, and redraws the prompt cleanly.
"""
_pt_print(_PT_ANSI(text))
try:
from prompt_toolkit.application import get_app_or_none, run_in_terminal
except Exception:
_pt_print(_PT_ANSI(text))
return
app = None
try:
app = get_app_or_none()
except Exception:
app = None
# No active app, or we're already on the app's main thread: the
# direct prompt_toolkit print is safe and matches existing behavior
# (spinner frames, streamed tokens, tool activity prefixes, …).
if app is None or not getattr(app, "_is_running", False):
_pt_print(_PT_ANSI(text))
return
try:
loop = app.loop # type: ignore[attr-defined]
except Exception:
loop = None
if loop is None:
_pt_print(_PT_ANSI(text))
return
import asyncio as _asyncio
try:
current_loop = _asyncio.get_event_loop_policy().get_event_loop()
except Exception:
current_loop = None
# Same thread as the app's loop → safe to print directly.
if current_loop is loop and loop.is_running():
_pt_print(_PT_ANSI(text))
return
# Cross-thread emission: ask the app's event loop to schedule a
# ``run_in_terminal`` that wraps ``_pt_print``. This hides the
# prompt, prints, and redraws. Fire-and-forget — if scheduling
# fails we fall back to a direct print so the line isn't lost.
def _schedule():
try:
run_in_terminal(lambda: _pt_print(_PT_ANSI(text)))
except Exception:
try:
_pt_print(_PT_ANSI(text))
except Exception:
pass
try:
loop.call_soon_threadsafe(_schedule)
except Exception:
try:
_pt_print(_PT_ANSI(text))
except Exception:
pass
# ---------------------------------------------------------------------------
@@ -2053,6 +2132,8 @@ class HermesCLI:
# Parse and validate toolsets
self.enabled_toolsets = toolsets
self.disabled_toolsets = CLI_CONFIG["agent"].get("disabled_toolsets") or []
if toolsets and "all" not in toolsets and "*" not in toolsets:
# Validate each toolset — MCP server names are resolved via
# live registry aliases (registered during discover_mcp_tools),
@@ -3503,6 +3584,7 @@ class HermesCLI:
credential_pool=runtime.get("credential_pool"),
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
disabled_toolsets=self.disabled_toolsets,
verbose_logging=self.verbose,
quiet_mode=not self.verbose,
ephemeral_system_prompt=self.system_prompt if self.system_prompt else None,
@@ -3550,14 +3632,18 @@ class HermesCLI:
tuple(runtime.get("args") or ()),
)
if self._pending_title and self._session_db:
# Force-create DB row on /title intent, then apply title.
if self._pending_title and self._session_db and self.agent:
try:
self._session_db.set_session_title(self.session_id, self._pending_title)
_cprint(f" Session title applied: {self._pending_title}")
self._pending_title = None
self.agent._ensure_db_session()
if self.agent._session_db_created:
self._session_db.set_session_title(self.session_id, self._pending_title)
_cprint(f" Session title applied: {self._pending_title}")
self._pending_title = None
# else: row creation failed transiently — keep _pending_title for retry
except (ValueError, Exception) as e:
_cprint(f" Could not apply pending title: {e}")
self._pending_title = None
# Keep _pending_title so it can be retried after row creation succeeds
return True
except Exception as e:
ChatConsole().print(f"[bold red]Failed to initialize agent: {e}[/]")
@@ -4826,6 +4912,40 @@ class HermesCLI:
flush_tool_summary()
print()
def _handle_recap_command(self) -> None:
"""Show a compact recap of recent activity in this session.
Inspired by Claude Code's ``/recap`` (v2.1.114, April 2026) — useful
when running multiple sessions simultaneously and returning to one
after a while. Purely local; no LLM call, no token cost, no cache
invalidation.
"""
try:
from hermes_cli.session_recap import build_recap
except Exception as exc: # pragma: no cover - defensive
print(f" (recap unavailable: {exc})")
return
title = None
try:
if self._session_db and self.session_id:
row = self._session_db.get_session(self.session_id)
if row:
title = row.get("title") or None
except Exception:
title = None
text = build_recap(
self.conversation_history or [],
session_title=title,
session_id=self.session_id,
platform="cli",
)
print()
for line in text.splitlines():
print(line)
print()
def _notify_session_boundary(self, event_type: str) -> None:
"""Fire a session-boundary plugin hook (on_session_finalize or on_session_reset).
@@ -4885,6 +5005,7 @@ class HermesCLI:
if self._session_db:
try:
self.agent._session_db_created = False
self._session_db.create_session(
session_id=self.session_id,
source=os.environ.get("HERMES_SESSION_SOURCE", "cli"),
@@ -4894,6 +5015,7 @@ class HermesCLI:
"reasoning_config": self.reasoning_config,
},
)
self.agent._session_db_created = True
except Exception:
pass
# Notify memory providers that session_id rotated to a fresh
@@ -6087,6 +6209,27 @@ class HermesCLI:
except Exception as exc:
print(f"(._.) curator: {exc}")
def _handle_kanban_command(self, cmd: str):
"""Handle the /kanban command — delegate to the shared kanban CLI.
The string form passed here is the user's full ``/kanban ...``
including the leading slash; we strip it and hand the remainder
to ``kanban.run_slash`` which returns a single formatted string.
"""
from hermes_cli.kanban import run_slash
rest = cmd.strip()
if rest.startswith("/"):
rest = rest.lstrip("/")
if rest.startswith("kanban"):
rest = rest[len("kanban"):].lstrip()
try:
output = run_slash(rest)
except Exception as exc: # pragma: no cover - defensive
output = f"(._.) kanban error: {exc}"
if output:
print(output)
def _handle_skills_command(self, cmd: str):
"""Handle /skills slash command — delegates to hermes_cli.skills_hub."""
from hermes_cli.skills_hub import handle_skills_slash
@@ -6255,6 +6398,8 @@ class HermesCLI:
pass
elif canonical == "history":
self.show_history()
elif canonical == "recap":
self._handle_recap_command()
elif canonical == "title":
parts = cmd_original.split(maxsplit=1)
if len(parts) > 1:
@@ -6332,6 +6477,8 @@ class HermesCLI:
self._handle_cron_command(cmd_original)
elif canonical == "curator":
self._handle_curator_command(cmd_original)
elif canonical == "kanban":
self._handle_kanban_command(cmd_original)
elif canonical == "skills":
with self._busy_command(self._slow_command_status(cmd_original)):
self._handle_skills_command(cmd_original)
@@ -6449,6 +6596,8 @@ class HermesCLI:
# No active run — treat as a normal next-turn message.
self._pending_input.put(payload)
_cprint(f" No agent running; queued as next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "goal":
self._handle_goal_command(cmd_original)
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -6494,12 +6643,17 @@ class HermesCLI:
self._console_print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
# Check for plugin-registered slash commands
elif base_cmd.lstrip("/") in _get_plugin_cmd_handler_names():
from hermes_cli.plugins import get_plugin_command_handler
from hermes_cli.plugins import (
get_plugin_command_handler,
resolve_plugin_command_result,
)
plugin_handler = get_plugin_command_handler(base_cmd.lstrip("/"))
if plugin_handler:
user_args = cmd_original[len(base_cmd):].strip()
try:
result = plugin_handler(user_args)
result = resolve_plugin_command_result(
plugin_handler(user_args)
)
if result:
_cprint(str(result))
except Exception as e:
@@ -6924,6 +7078,166 @@ class HermesCLI:
print(" status Show current browser mode")
print()
# ────────────────────────────────────────────────────────────────
# /goal — persistent cross-turn goals (Ralph-style loop)
# ────────────────────────────────────────────────────────────────
def _get_goal_manager(self):
"""Return the GoalManager bound to the current session_id.
Cached on ``self._goal_manager`` and rebound lazily when
``session_id`` changes (e.g. after /new or a compression-driven
session split).
"""
try:
from hermes_cli.goals import GoalManager
from hermes_cli.config import load_config
except Exception as exc:
logging.debug("goal manager unavailable: %s", exc)
return None
sid = getattr(self, "session_id", None) or ""
if not sid:
return None
existing = getattr(self, "_goal_manager", None)
if existing is not None and getattr(existing, "session_id", None) == sid:
return existing
try:
cfg = load_config() or {}
goals_cfg = cfg.get("goals") or {}
max_turns = int(goals_cfg.get("max_turns", 20) or 20)
except Exception:
max_turns = 20
mgr = GoalManager(session_id=sid, default_max_turns=max_turns)
self._goal_manager = mgr
return mgr
def _handle_goal_command(self, cmd: str) -> None:
"""Dispatch /goal subcommands: set / status / pause / resume / clear."""
parts = (cmd or "").strip().split(None, 1)
arg = parts[1].strip() if len(parts) > 1 else ""
mgr = self._get_goal_manager()
if mgr is None:
_cprint(f" {_DIM}Goals unavailable (no active session).{_RST}")
return
lower = arg.lower()
# Bare /goal or /goal status → show current state
if not arg or lower == "status":
_cprint(f" {mgr.status_line()}")
return
if lower == "pause":
state = mgr.pause(reason="user-paused")
if state is None:
_cprint(f" {_DIM}No goal set.{_RST}")
else:
_cprint(f" ⏸ Goal paused: {state.goal}")
return
if lower == "resume":
state = mgr.resume()
if state is None:
_cprint(f" {_DIM}No goal to resume.{_RST}")
else:
_cprint(f" ▶ Goal resumed: {state.goal}")
_cprint(
f" {_DIM}Send any message (or press Enter on an empty prompt "
f"is a no-op; type 'continue' to kick it off).{_RST}"
)
return
if lower in ("clear", "stop", "done"):
had = mgr.has_goal()
mgr.clear()
if had:
_cprint(" ✓ Goal cleared.")
else:
_cprint(f" {_DIM}No active goal.{_RST}")
return
# Otherwise treat the arg as the goal text.
try:
state = mgr.set(arg)
except ValueError as exc:
_cprint(f" Invalid goal: {exc}")
return
_cprint(f" ⊙ Goal set ({state.max_turns}-turn budget): {state.goal}")
_cprint(
f" {_DIM}After each turn, a judge model will check if the goal is done. "
f"Hermes keeps working until it is, you pause/clear it, or the budget is "
f"exhausted. Use /goal status, /goal pause, /goal resume, /goal clear.{_RST}"
)
# Kick the loop off immediately so the user doesn't have to send a
# separate message after setting the goal.
try:
self._pending_input.put(state.goal)
except Exception:
pass
def _maybe_continue_goal_after_turn(self) -> None:
"""Hook run after every CLI turn. Judges + maybe re-queues.
Safe to call when no goal is set returns quickly.
Preemption is automatic: if a real user message is already in
``_pending_input`` we skip judging (the user's new input takes
priority and we'll re-judge after that turn). If judge says done,
mark it done and tell the user. If judge says continue and we're
under budget, push the continuation prompt onto the queue.
"""
mgr = self._get_goal_manager()
if mgr is None or not mgr.is_active():
return
# If a real user message is already queued, don't inject a
# continuation prompt on top — let the user's turn go first.
try:
if getattr(self, "_pending_input", None) is not None \
and not self._pending_input.empty():
return
except Exception:
pass
# Extract the agent's final response for this turn.
last_response = ""
try:
hist = self.conversation_history or []
for msg in reversed(hist):
if msg.get("role") == "assistant":
content = msg.get("content", "")
if isinstance(content, list):
# Multimodal content — flatten text parts.
parts = [
p.get("text", "")
for p in content
if isinstance(p, dict) and p.get("type") in ("text", "output_text")
]
last_response = "\n".join(t for t in parts if t)
else:
last_response = str(content or "")
break
except Exception:
last_response = ""
decision = mgr.evaluate_after_turn(last_response, user_initiated=True)
msg = decision.get("message") or ""
if msg:
_cprint(f" {msg}")
if decision.get("should_continue"):
prompt = decision.get("continuation_prompt")
if prompt:
try:
self._pending_input.put(prompt)
except Exception as exc:
logging.debug("goal continuation enqueue failed: %s", exc)
def _handle_skin_command(self, cmd: str):
"""Handle /skin [name] — show or change the display skin."""
try:
@@ -7050,7 +7364,7 @@ class HermesCLI:
import os
from hermes_cli.colors import Colors as _Colors
current = bool(os.environ.get("HERMES_YOLO_MODE"))
current = is_truthy_value(os.environ.get("HERMES_YOLO_MODE"))
if current:
os.environ.pop("HERMES_YOLO_MODE", None)
_cprint(
@@ -7247,10 +7561,20 @@ class HermesCLI:
original_count = len(self.conversation_history)
with self._busy_command("Compressing context..."):
try:
from agent.model_metadata import estimate_messages_tokens_rough
from agent.model_metadata import estimate_request_tokens_rough
from agent.manual_compression_feedback import summarize_manual_compression
original_history = list(self.conversation_history)
approx_tokens = estimate_messages_tokens_rough(original_history)
# Include system prompt + tool schemas in the estimate —
# a transcript-only number understates real request pressure
# and can even appear to grow after compression because a
# dense handoff summary replaces many short turns (#6217).
_sys_prompt = getattr(self.agent, "_cached_system_prompt", "") or ""
_tools = getattr(self.agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
original_history,
system_prompt=_sys_prompt,
tools=_tools,
)
if focus_topic:
print(f"🗜️ Compressing {original_count} messages (~{approx_tokens:,} tokens), "
f"focus: \"{focus_topic}\"...")
@@ -7282,7 +7606,11 @@ class HermesCLI:
):
self.session_id = self.agent.session_id
self._pending_title = None
new_tokens = estimate_messages_tokens_rough(self.conversation_history)
new_tokens = estimate_request_tokens_rough(
self.conversation_history,
system_prompt=_sys_prompt,
tools=_tools,
)
summary = summarize_manual_compression(
original_history,
self.conversation_history,
@@ -11248,6 +11576,17 @@ class HermesCLI:
app.invalidate() # Refresh status line
# Goal continuation: if a standing goal is active, ask
# the judge whether the turn satisfied it. If not, and
# there's no real user message already queued, push the
# continuation prompt back into _pending_input so the
# next loop iteration picks it up naturally (and any
# user input that arrives in between still preempts).
try:
self._maybe_continue_goal_after_turn()
except Exception as _goal_exc:
logging.debug("goal continuation hook failed: %s", _goal_exc)
# Continuous voice: auto-restart recording after agent responds.
# Dispatch to a daemon thread so play_beep (sd.wait) and
# AudioRecorder.start (lock acquire) never block process_loop —
+118
View File
@@ -882,3 +882,121 @@ def save_job_output(job_id: str, output: str):
raise
return output_file
# =============================================================================
# Skill reference rewriting (curator integration)
# =============================================================================
def rewrite_skill_refs(
consolidated: Optional[Dict[str, str]] = None,
pruned: Optional[List[str]] = None,
) -> Dict[str, Any]:
"""Rewrite cron job skill references after a curator consolidation pass.
When the curator consolidates a skill X into umbrella Y (or archives X
as pruned), any cron job that lists ``X`` in its ``skills`` field will
fail to load ``X`` at run time — the scheduler logs a warning and
skips the skill, so the job runs without the instructions it was
scheduled to follow. See cron/scheduler.py where ``skill_view`` is
called per skill name.
This function repairs cron jobs in-place:
- A skill listed in ``consolidated`` is replaced with its umbrella
target (the ``into`` value). If the umbrella is already in the
job's skill list, the stale name is dropped without duplication.
- A skill listed in ``pruned`` is dropped outright — there is no
forwarding target.
- Ordering and other skills in the list are preserved.
- The legacy ``skill`` field is realigned via ``_apply_skill_fields``.
Args:
consolidated: mapping of ``old_skill_name -> umbrella_skill_name``.
pruned: list of skill names that were archived with no forwarding
target.
Returns a report dict::
{
"rewrites": [
{
"job_id": ...,
"job_name": ...,
"before": [...],
"after": [...],
"mapped": {"old": "new", ...},
"dropped": ["old", ...],
},
...
],
"jobs_updated": N,
"jobs_scanned": M,
}
Best-effort: exceptions from loading/saving propagate to the caller so
tests can assert behaviour; the curator invocation site wraps this
call in a try/except so a failure here never breaks the curator.
"""
consolidated = dict(consolidated or {})
pruned_set = set(pruned or [])
# A skill listed in both wins as "consolidated" — it has a target,
# which is the more useful of the two outcomes.
pruned_set -= set(consolidated.keys())
if not consolidated and not pruned_set:
return {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
with _jobs_file_lock:
jobs = load_jobs()
rewrites: List[Dict[str, Any]] = []
changed = False
for job in jobs:
skills_before = _normalize_skill_list(job.get("skill"), job.get("skills"))
if not skills_before:
continue
mapped: Dict[str, str] = {}
dropped: List[str] = []
new_skills: List[str] = []
for name in skills_before:
if name in consolidated:
target = consolidated[name]
mapped[name] = target
if target and target not in new_skills:
new_skills.append(target)
elif name in pruned_set:
dropped.append(name)
else:
if name not in new_skills:
new_skills.append(name)
if not mapped and not dropped:
continue
job["skills"] = new_skills
job["skill"] = new_skills[0] if new_skills else None
changed = True
rewrites.append({
"job_id": job.get("id"),
"job_name": job.get("name") or job.get("id"),
"before": list(skills_before),
"after": list(new_skills),
"mapped": mapped,
"dropped": dropped,
})
if changed:
save_jobs(jobs)
logger.info(
"Curator rewrote skill references in %d cron job(s)", len(rewrites)
)
return {
"rewrites": rewrites,
"jobs_updated": len(rewrites),
"jobs_scanned": len(jobs),
}
+1 -1
View File
@@ -40,7 +40,7 @@ services:
# - TEAMS_CLIENT_SECRET=${TEAMS_CLIENT_SECRET}
# - TEAMS_TENANT_ID=${TEAMS_TENANT_ID}
# - TEAMS_ALLOWED_USERS=${TEAMS_ALLOWED_USERS}
# - TEAMS_PORT=3978
# - TEAMS_PORT=${TEAMS_PORT:-3978}
command: ["gateway", "run"]
dashboard:
Binary file not shown.
+64 -6
View File
@@ -36,6 +36,26 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return is_truthy_value(value, default=default)
def _coerce_float(value: Any, default: float) -> float:
"""Coerce numeric config values, falling back on malformed input."""
if value is None:
return default
try:
return float(value)
except (TypeError, ValueError):
return default
def _coerce_int(value: Any, default: int) -> int:
"""Coerce integer config values, falling back on malformed input."""
if value is None:
return default
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
"""Normalize unauthorized DM behavior to a supported value."""
if isinstance(value, str):
@@ -45,6 +65,15 @@ def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> st
return default
def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
"""Normalize notice delivery mode to a supported value."""
if isinstance(value, str):
normalized = value.strip().lower()
if normalized in {"public", "private"}:
return normalized
return default
# Module-level cache for bundled platform plugin names (lives outside the
# enum so it doesn't become an accidental enum member).
_Platform__bundled_plugin_names: Optional[set] = None
@@ -301,13 +330,13 @@ class StreamingConfig:
if not data:
return cls()
return cls(
enabled=data.get("enabled", False),
enabled=_coerce_bool(data.get("enabled"), False),
transport=data.get("transport", "edit"),
edit_interval=float(data.get("edit_interval", 1.0)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
edit_interval=_coerce_float(data.get("edit_interval"), 1.0),
buffer_threshold=_coerce_int(data.get("buffer_threshold"), 40),
cursor=data.get("cursor", ""),
fresh_final_after_seconds=float(
data.get("fresh_final_after_seconds", 60.0)
fresh_final_after_seconds=_coerce_float(
data.get("fresh_final_after_seconds"), 60.0
),
)
@@ -572,6 +601,17 @@ class GatewayConfig:
)
return self.unauthorized_dm_behavior
def get_notice_delivery(self, platform: Optional[Platform] = None) -> str:
"""Return the effective notice-delivery mode for a platform."""
if platform:
platform_cfg = self.platforms.get(platform)
if platform_cfg and "notice_delivery" in platform_cfg.extra:
return _normalize_notice_delivery(
platform_cfg.extra.get("notice_delivery"),
"public",
)
return "public"
def load_gateway_config() -> GatewayConfig:
"""
@@ -687,6 +727,11 @@ def load_gateway_config() -> GatewayConfig:
platform_cfg.get("unauthorized_dm_behavior"),
gw_data.get("unauthorized_dm_behavior", "pair"),
)
if "notice_delivery" in platform_cfg:
bridged["notice_delivery"] = _normalize_notice_delivery(
platform_cfg.get("notice_delivery"),
"public",
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "reply_in_thread" in platform_cfg:
@@ -900,6 +945,12 @@ def load_gateway_config() -> GatewayConfig:
if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()
# Feishu settings → env vars (env vars take precedence)
feishu_cfg = yaml_cfg.get("feishu", {})
if isinstance(feishu_cfg, dict):
if "allow_bots" in feishu_cfg and not os.getenv("FEISHU_ALLOW_BOTS"):
os.environ["FEISHU_ALLOW_BOTS"] = str(feishu_cfg["allow_bots"]).lower()
except Exception as e:
logger.warning(
"Failed to process config.yaml — falling back to .env / gateway.json values. "
@@ -1051,7 +1102,14 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if Platform.WHATSAPP not in config.platforms:
config.platforms[Platform.WHATSAPP] = PlatformConfig()
config.platforms[Platform.WHATSAPP].enabled = True
whatsapp_home = os.getenv("WHATSAPP_HOME_CHANNEL")
if whatsapp_home and Platform.WHATSAPP in config.platforms:
config.platforms[Platform.WHATSAPP].home_channel = HomeChannel(
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
)
# Slack
slack_token = os.getenv("SLACK_BOT_TOKEN")
if slack_token:
+9 -7
View File
@@ -53,9 +53,10 @@ class DeliveryTarget:
- "telegram" Telegram home channel
- "telegram:123456" specific Telegram chat
"""
target = target.strip().lower()
target_stripped = target.strip()
target_lower = target_stripped.lower()
if target == "origin":
if target_lower == "origin":
if origin:
return cls(
platform=origin.platform,
@@ -67,13 +68,14 @@ class DeliveryTarget:
# Fallback to local if no origin
return cls(platform=Platform.LOCAL, is_origin=True)
if target == "local":
if target_lower == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id or platform:chat_id:thread_id format
if ":" in target:
parts = target.split(":", 2)
platform_str = parts[0]
# Use the original case for chat_id/thread_id to preserve case-sensitive IDs
if ":" in target_stripped:
parts = target_stripped.split(":", 2)
platform_str = parts[0].lower() # Platform names are case-insensitive
chat_id = parts[1] if len(parts) > 1 else None
thread_id = parts[2] if len(parts) > 2 else None
try:
@@ -85,7 +87,7 @@ class DeliveryTarget:
# Just a platform name (use home channel)
try:
platform = Platform(target)
platform = Platform(target_lower)
return cls(platform=platform)
except ValueError:
# Unknown platform, treat as local
+4 -2
View File
@@ -2351,10 +2351,11 @@ class APIServerAdapter(BasePlatformAdapter):
)
if agent_ref is not None:
agent_ref[0] = agent
effective_task_id = session_id or str(uuid.uuid4())
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id="default",
task_id=effective_task_id,
)
usage = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2551,10 +2552,11 @@ class APIServerAdapter(BasePlatformAdapter):
)
self._active_run_agents[run_id] = agent
def _run_sync():
effective_task_id = session_id or run_id
r = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id="default",
task_id=effective_task_id,
)
u = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
+209 -13
View File
@@ -416,7 +416,7 @@ def is_host_excluded_by_no_proxy(hostname: str, no_proxy_value: str | None = Non
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple, Union
from enum import Enum
from pathlib import Path as _Path
@@ -981,7 +981,7 @@ def coerce_plaintext_gateway_command(event: "MessageEvent") -> None:
return
@dataclass
@dataclass
class SendResult:
"""Result of sending a message."""
success: bool
@@ -991,6 +991,45 @@ class SendResult:
retryable: bool = False # True for transient connection errors — base will retry automatically
class EphemeralReply(str):
"""System-notice reply that auto-deletes after a TTL.
Slash-command handlers in ``gateway/run.py`` can return this wrapper
instead of a plain string to request that the reply message be deleted
after ``ttl_seconds`` on platforms that support ``delete_message``.
Subclassing ``str`` keeps the wrapper transparent to anything that
treats handler return values as text (existing tests use ``in`` /
``startswith`` / equality; the ``_process_message_background`` pipeline
extracts attachments from the string content). ``isinstance(r,
EphemeralReply)`` still distinguishes ephemeral replies from plain
strings so the send path can schedule deletion.
Platforms that don't override :meth:`BasePlatformAdapter.delete_message`
silently ignore the TTL the message is sent normally and left in
place. When ``ttl_seconds`` is ``None``, the pipeline uses the
configured ``display.ephemeral_system_ttl`` default. A default of ``0``
disables auto-deletion globally, preserving prior behavior.
"""
ttl_seconds: Optional[int]
def __new__(cls, text: str, ttl_seconds: Optional[int] = None):
instance = super().__new__(cls, text)
instance.ttl_seconds = ttl_seconds
return instance
@property
def text(self) -> str:
"""Return the underlying text.
Provided for call sites that want an explicit string conversion,
though ``str(reply)`` and using ``reply`` directly where a string
is expected both work identically.
"""
return str.__str__(self)
def merge_pending_message_event(
pending_messages: Dict[str, MessageEvent],
session_key: str,
@@ -1034,6 +1073,11 @@ def merge_pending_message_event(
existing.text = event.text
if existing_is_photo or incoming_is_photo:
existing.message_type = MessageType.PHOTO
elif (
getattr(existing, "message_type", None) == MessageType.TEXT
and event.message_type != MessageType.TEXT
):
existing.message_type = event.message_type
return
if (
@@ -1068,8 +1112,10 @@ _RETRYABLE_ERROR_PATTERNS = (
)
# Type for message handlers
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
# Type for message handlers. Handlers may return a plain string (normal
# reply), an ``EphemeralReply`` to opt the reply into auto-deletion, or
# ``None`` when the response was already delivered (e.g. via streaming).
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[Union[str, "EphemeralReply"]]]]
def resolve_channel_prompt(
@@ -1454,6 +1500,64 @@ class BasePlatformAdapter(ABC):
"""
return False
def _get_ephemeral_system_ttl_default(self) -> int:
"""Read ``display.ephemeral_system_ttl`` from config.
Returns the TTL in seconds to use when an :class:`EphemeralReply`
does not specify one explicitly. ``0`` (the default) disables
auto-deletion. Non-fatal if config is unreadable.
"""
try:
from hermes_cli.config import load_config as _load_config
except Exception:
return 0
try:
cfg = _load_config()
except Exception:
return 0
display = cfg.get("display", {}) if isinstance(cfg, dict) else {}
if not isinstance(display, dict):
return 0
raw = display.get("ephemeral_system_ttl", 0)
try:
return int(raw)
except (TypeError, ValueError):
return 0
def _schedule_ephemeral_delete(
self,
chat_id: str,
message_id: str,
ttl_seconds: int,
) -> None:
"""Spawn a detached task that deletes ``message_id`` after ``ttl_seconds``.
Best-effort failures (gateway restart, permission denied, message
too old for Telegram's 48h window) are swallowed at debug level.
Does not block the caller.
"""
async def _run_delete() -> None:
try:
await asyncio.sleep(max(1, int(ttl_seconds)))
await self.delete_message(chat_id=chat_id, message_id=message_id)
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug(
"[%s] Ephemeral delete failed for %s/%s: %s",
self.name, chat_id, message_id, e,
)
coro = _run_delete()
try:
asyncio.create_task(coro)
except RuntimeError:
# No running loop (e.g. unit tests that never reach the async
# path). Close the coroutine cleanly so Python doesn't warn
# about it never being awaited, then drop silently.
coro.close()
async def send_slash_confirm(
self,
chat_id: str,
@@ -1489,6 +1593,26 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_private_notice(
self,
chat_id: str,
user_id: Optional[str],
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notice privately when the platform supports it.
The default implementation falls back to a normal send so callers can
use one code path across platforms.
"""
return await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""
Send a typing indicator.
@@ -2043,6 +2167,28 @@ class BasePlatformAdapter(ABC):
lowered = error.lower()
return "timed out" in lowered or "readtimeout" in lowered or "writetimeout" in lowered
def _unwrap_ephemeral(self, response: Any) -> Tuple[Optional[str], int]:
"""Unwrap a handler response into (text, ttl_seconds).
Accepts a plain string, ``None``, or an :class:`EphemeralReply`.
Returns ``(text, ttl)`` where ``ttl > 0`` means the caller should
schedule a deletion via :meth:`_schedule_ephemeral_delete` after
the send succeeds. ``ttl`` is forced to 0 when the adapter
doesn't override :meth:`delete_message` so non-supporting
platforms silently degrade to normal sends.
"""
if isinstance(response, EphemeralReply):
ttl = response.ttl_seconds
if ttl is None:
try:
ttl = int(self._get_ephemeral_system_ttl_default())
except Exception:
ttl = 0
if ttl and ttl > 0 and type(self).delete_message is BasePlatformAdapter.delete_message:
ttl = 0
return response.text, int(ttl or 0)
return response, 0
async def _send_with_retry(
self,
chat_id: str,
@@ -2350,13 +2496,20 @@ class BasePlatformAdapter(ABC):
release_guard=False,
discard_pending=False,
)
if response:
await self._send_with_retry(
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=response,
content=_text,
reply_to=event.message_id,
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
@@ -2436,13 +2589,20 @@ class BasePlatformAdapter(ABC):
try:
_thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
response = await self._message_handler(event)
if response:
await self._send_with_retry(
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=response,
content=_text,
reply_to=event.message_id,
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception as e:
logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
return
@@ -2516,7 +2676,6 @@ class BasePlatformAdapter(ABC):
# Fall back to a new Event only if the entry was removed externally.
interrupt_event = self._active_sessions.get(session_key) or asyncio.Event()
self._active_sessions[session_key] = interrupt_event
callback_generation = getattr(interrupt_event, "_hermes_run_generation", None)
# Start continuous typing indicator (refreshes every 2 seconds)
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
@@ -2549,7 +2708,16 @@ class BasePlatformAdapter(ABC):
# Call the handler (this can take a while with tool calls)
response = await self._message_handler(event)
# Slash-command handlers may return an EphemeralReply sentinel to
# request that their reply message auto-delete after a TTL (used
# for system notices like "✨ New session started!" that the user
# doesn't need to keep in the thread). Unwrap here so all the
# downstream extract_media / text-processing logic sees a plain
# string, and remember the TTL + platform capability so the
# post-send block can schedule the deletion.
response, _ephemeral_ttl = self._unwrap_ephemeral(response)
# Send response if any. A None/empty response is normal when
# streaming already delivered the text (already_sent=True) or
# when the message was queued behind an active agent. Log at
@@ -2638,6 +2806,21 @@ class BasePlatformAdapter(ABC):
)
_record_delivery(result)
# Schedule auto-deletion of system-notice replies.
# Detached so the handler returns immediately; errors
# (permission denied, message too old) are swallowed.
if (
_ephemeral_ttl
and _ephemeral_ttl > 0
and result.success
and result.message_id
):
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=result.message_id,
ttl_seconds=_ephemeral_ttl,
)
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@@ -2815,7 +2998,20 @@ class BasePlatformAdapter(ABC):
finally:
# Fire any one-shot post-delivery callback registered for this
# session (e.g. deferred background-review notifications).
_callback_generation = callback_generation
#
# Snapshot the callback generation HERE (after the agent has run),
# not at the top of this task. _hermes_run_generation is set on
# the interrupt event by GatewayRunner._bind_adapter_run_generation
# during _handle_message_with_agent — which happens DURING the
# self._message_handler(event) await above. Snapshotting earlier
# always captured None, which bypassed the generation-ownership
# check in pop_post_delivery_callback and let stale runs fire a
# fresher run's callbacks.
_callback_generation = getattr(
interrupt_event,
"_hermes_run_generation",
None,
)
if hasattr(self, "pop_post_delivery_callback"):
_post_cb = self.pop_post_delivery_callback(
session_key,
+13 -4
View File
@@ -2851,8 +2851,15 @@ class DiscordAdapter(BasePlatformAdapter):
raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# YAML parses a bare numeric value such as
# `free_response_channels: 1491973769726791812` as int, which was
# previously falling through the isinstance(str) branch and silently
# returning an empty set. str() here accepts whatever scalar the YAML
# loader hands us without changing existing string/CSV semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
return set()
def _thread_parent_channel(self, channel: Any) -> Any:
@@ -3078,6 +3085,7 @@ class DiscordAdapter(BasePlatformAdapter):
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an interactive button-based update prompt (Yes / No).
@@ -3087,9 +3095,10 @@ class DiscordAdapter(BasePlatformAdapter):
if not self._client or not DISCORD_AVAILABLE:
return SendResult(success=False, error="Not connected")
try:
channel = self._client.get_channel(int(chat_id))
target_id = metadata.get("thread_id") if metadata and metadata.get("thread_id") else chat_id
channel = self._client.get_channel(int(target_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
channel = await self._client.fetch_channel(int(target_id))
default_hint = f" (default: {default})" if default else ""
embed = discord.Embed(
+207 -51
View File
@@ -64,7 +64,7 @@ from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Optional, Sequence
from typing import Any, Dict, List, Literal, Optional, Sequence
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
@@ -141,6 +141,7 @@ from gateway.platforms.base import (
)
from gateway.status import acquire_scoped_lock, release_scoped_lock
from hermes_constants import get_hermes_home
from utils import atomic_json_write
logger = logging.getLogger(__name__)
@@ -387,6 +388,8 @@ class FeishuAdapterSettings:
admins: frozenset[str] = frozenset()
default_group_policy: str = ""
group_rules: Dict[str, FeishuGroupRule] = field(default_factory=dict)
allow_bots: str = "none" # "none" | "mentions" | "all"
require_mention: bool = True
@dataclass
@@ -396,6 +399,7 @@ class FeishuGroupRule:
policy: str # "open" | "allowlist" | "blacklist" | "admin_only" | "disabled"
allowlist: set[str] = field(default_factory=set)
blacklist: set[str] = field(default_factory=set)
require_mention: Optional[bool] = None # None = inherit global
@dataclass
@@ -405,6 +409,40 @@ class FeishuBatchState:
counts: Dict[str, int] = field(default_factory=dict)
# ---------------------------------------------------------------------------
# Admission: policy types
# ---------------------------------------------------------------------------
RejectReason = Literal[
"self_echo",
"self_ids_unknown",
"bots_disabled",
"bot_not_mentioned",
"group_policy_rejected",
]
def _is_bot_sender(sender: Any) -> bool:
# receive_v1 docs say {user, bot}; accept "app" defensively.
return getattr(sender, "sender_type", "") in ("bot", "app")
def _sender_identity(sender: Any) -> frozenset:
# Take any non-empty id variant — tenant sender_id_type decides which are populated.
sid = getattr(sender, "sender_id", None)
if sid is None:
return frozenset()
return frozenset(
v for v in (
getattr(sid, "open_id", None),
getattr(sid, "user_id", None),
getattr(sid, "union_id", None),
)
if v
)
# ---------------------------------------------------------------------------
# Markdown rendering helpers
# ---------------------------------------------------------------------------
@@ -1377,10 +1415,16 @@ class FeishuAdapter(BasePlatformAdapter):
for chat_id, rule_cfg in raw_group_rules.items():
if not isinstance(rule_cfg, dict):
continue
# Only override when the key is explicitly set — missing vs false
# must not collapse.
per_chat_require_mention: Optional[bool] = None
if "require_mention" in rule_cfg:
per_chat_require_mention = _to_boolean(rule_cfg.get("require_mention"))
group_rules[str(chat_id)] = FeishuGroupRule(
policy=str(rule_cfg.get("policy", "open")).strip().lower(),
allowlist=set(str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()),
blacklist=set(str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()),
require_mention=per_chat_require_mention,
)
# Bot-level admins
@@ -1390,6 +1434,16 @@ class FeishuAdapter(BasePlatformAdapter):
# Default group policy (for groups not in group_rules)
default_group_policy = str(extra.get("default_group_policy", "")).strip().lower()
# Env-only so adapter and gateway auth bypass share one source; yaml
# feishu.allow_bots is bridged to this env var at config load.
allow_bots = os.getenv("FEISHU_ALLOW_BOTS", "none").strip().lower()
if allow_bots not in ("none", "mentions", "all"):
logger.warning(
"[Feishu] Unknown allow_bots=%r, falling back to 'none'. Valid: none, mentions, all.",
allow_bots,
)
allow_bots = "none"
return FeishuAdapterSettings(
app_id=str(extra.get("app_id") or os.getenv("FEISHU_APP_ID", "")).strip(),
app_secret=str(extra.get("app_secret") or os.getenv("FEISHU_APP_SECRET", "")).strip(),
@@ -1446,6 +1500,10 @@ class FeishuAdapter(BasePlatformAdapter):
admins=admins,
default_group_policy=default_group_policy,
group_rules=group_rules,
allow_bots=allow_bots,
require_mention=_to_boolean(
extra.get("require_mention", os.getenv("FEISHU_REQUIRE_MENTION", "true"))
),
)
def _apply_settings(self, settings: FeishuAdapterSettings) -> None:
@@ -1476,6 +1534,8 @@ class FeishuAdapter(BasePlatformAdapter):
self._ws_reconnect_interval = settings.ws_reconnect_interval
self._ws_ping_interval = settings.ws_ping_interval
self._ws_ping_timeout = settings.ws_ping_timeout
self._allow_bots = settings.allow_bots
self._require_mention = settings.require_mention
def _build_event_handler(self) -> Any:
if EventDispatcherHandler is None:
@@ -2189,30 +2249,28 @@ class FeishuAdapter(BasePlatformAdapter):
event = getattr(data, "event", None)
message = getattr(event, "message", None)
sender = getattr(event, "sender", None)
sender_id = getattr(sender, "sender_id", None)
if not message or not sender_id:
logger.debug("[Feishu] Dropping malformed inbound event: missing message or sender_id")
if not message or not sender or not getattr(sender, "sender_id", None):
logger.debug("[Feishu] Dropping malformed inbound event: missing message/sender")
return
message_id = getattr(message, "message_id", None)
if not message_id or self._is_duplicate(message_id):
logger.debug("[Feishu] Dropping duplicate/missing message_id: %s", message_id)
return
if self._is_self_sent_bot_message(event):
logger.debug("[Feishu] Dropping self-sent bot event: %s", message_id)
reason = self._admit(sender, message)
if reason is not None:
logger.debug("[Feishu] dropping inbound event: %s", reason)
return
chat_type = getattr(message, "chat_type", "p2p")
chat_id = getattr(message, "chat_id", "") or ""
if chat_type != "p2p" and not self._should_accept_group_message(message, sender_id, chat_id):
logger.debug("[Feishu] Dropping group message that failed mention/policy gate: %s", message_id)
return
await self._process_inbound_message(
data=data,
message=message,
sender_id=sender_id,
sender_id=getattr(sender, "sender_id", None),
chat_type=chat_type,
message_id=message_id,
is_bot=_is_bot_sender(sender),
)
def _on_message_read_event(self, data: P2ImMessageMessageReadV1) -> None:
@@ -2389,10 +2447,11 @@ class FeishuAdapter(BasePlatformAdapter):
msg = items[0] if items else None
if not msg:
return
# GET im/v1/messages returns sender.id=app_id for bot messages —
# peer bots and us share sender_type="app" but differ on app_id.
sender = getattr(msg, "sender", None)
sender_type = str(getattr(sender, "sender_type", "") or "").lower()
if sender_type != "app":
return # only route reactions on our own bot messages
if str(getattr(sender, "id", "") or "") != self._app_id:
return # only route reactions on this bot's own messages
chat_id = str(getattr(msg, "chat_id", "") or "")
chat_type_raw = str(getattr(msg, "chat_type", "p2p") or "p2p")
if not chat_id:
@@ -2679,6 +2738,7 @@ class FeishuAdapter(BasePlatformAdapter):
sender_id: Any,
chat_type: str,
message_id: str,
is_bot: bool = False,
) -> None:
text, inbound_type, media_urls, media_types, mentions = await self._extract_message_content(message)
@@ -2704,19 +2764,27 @@ class FeishuAdapter(BasePlatformAdapter):
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
sender_primary = (
getattr(sender_id, "open_id", None)
or getattr(sender_id, "user_id", None)
or getattr(sender_id, "union_id", None)
or "<unknown>"
)
logger.info(
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s text=%r media=%d",
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s sender=%s:%s text=%r media=%d",
"dm" if chat_type == "p2p" else "group",
message_id,
inbound_type.value,
getattr(message, "chat_id", "") or "",
"bot" if is_bot else "user",
sender_primary,
text[:120],
len(media_urls),
)
chat_id = getattr(message, "chat_id", "") or ""
chat_info = await self.get_chat_info(chat_id)
sender_profile = await self._resolve_sender_profile(sender_id)
sender_profile = await self._resolve_sender_profile(sender_id, is_bot=is_bot)
source = self.build_source(
chat_id=chat_id,
chat_name=chat_info.get("name") or chat_id or "Feishu Chat",
@@ -2725,6 +2793,7 @@ class FeishuAdapter(BasePlatformAdapter):
user_name=sender_profile["user_name"],
thread_id=getattr(message, "thread_id", None) or None,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
normalized = MessageEvent(
text=text,
@@ -3447,7 +3516,12 @@ class FeishuAdapter(BasePlatformAdapter):
return "dm"
return "group"
async def _resolve_sender_profile(self, sender_id: Any) -> Dict[str, Optional[str]]:
async def _resolve_sender_profile(
self,
sender_id: Any,
*,
is_bot: bool = False,
) -> Dict[str, Optional[str]]:
"""Map Feishu's three-tier user IDs onto Hermes' SessionSource fields.
Preference order for the primary ``user_id`` field:
@@ -3464,7 +3538,11 @@ class FeishuAdapter(BasePlatformAdapter):
union_id = getattr(sender_id, "union_id", None) or None
# Prefer tenant-scoped user_id; fall back to app-scoped open_id.
primary_id = user_id or open_id
display_name = await self._resolve_sender_name_from_api(primary_id or union_id)
# bot/v3/bots/basic_batch only accepts open_id.
name_lookup_id = open_id if is_bot else (primary_id or union_id)
display_name = await self._resolve_sender_name_from_api(
name_lookup_id, is_bot=is_bot,
)
return {
"user_id": primary_id,
"user_name": display_name,
@@ -3484,11 +3562,14 @@ class FeishuAdapter(BasePlatformAdapter):
self._sender_name_cache.pop(sender_id, None)
return None
async def _resolve_sender_name_from_api(self, sender_id: Optional[str]) -> Optional[str]:
"""Fetch the sender's display name from the Feishu contact API with a 10-minute cache.
ID-type detection mirrors openclaw: ou_ open_id, on_ union_id, else user_id.
Failures are silently suppressed; the message pipeline must not block on name resolution.
async def _resolve_sender_name_from_api(
self,
sender_id: Optional[str],
*,
is_bot: bool = False,
) -> Optional[str]:
"""Bots divert to bot/basic_batch — contact API doesn't return bot names.
Failures are silent so the pipeline never blocks on name resolution.
"""
if not sender_id or not self._client:
return None
@@ -3498,7 +3579,16 @@ class FeishuAdapter(BasePlatformAdapter):
now = time.time()
cached_name = self._get_cached_sender_name(trimmed)
if cached_name is not None:
return cached_name
return cached_name or None # "" cached means "known nameless"
if is_bot:
names = await self._fetch_bot_names([trimmed])
if names is None:
return None
expire_at = now + _FEISHU_SENDER_NAME_TTL_SECONDS
for oid, name in names.items():
self._sender_name_cache[oid] = (name, expire_at)
hit = self._sender_name_cache.get(trimmed)
return (hit[0] or None) if hit else None
try:
from lark_oapi.api.contact.v3 import GetUserRequest # lazy import
if trimmed.startswith("ou_"):
@@ -3527,6 +3617,35 @@ class FeishuAdapter(BasePlatformAdapter):
logger.debug("[Feishu] Failed to resolve sender name for %s", sender_id, exc_info=True)
return None
async def _fetch_bot_names(self, bot_ids: List[str]) -> Optional[Dict[str, str]]:
if not self._client or not bot_ids:
return None
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/bots/basic_batch")
.queries([("bot_ids", oid) for oid in bot_ids])
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if not content:
return None
payload = json.loads(content)
if payload.get("code") != 0:
return None
bots = (payload.get("data") or {}).get("bots") or {}
return {
oid: str(info.get("name") or "").strip()
for oid, info in bots.items()
if oid
}
except Exception:
logger.debug("[Feishu] Failed to fetch bot names for %s", bot_ids, exc_info=True)
return None
async def _fetch_message_text(self, message_id: str) -> Optional[str]:
if not self._client or not message_id:
return None
@@ -3590,10 +3709,60 @@ class FeishuAdapter(BasePlatformAdapter):
logger.exception("[Feishu] Background inbound processing failed")
# =========================================================================
# Group policy and mention gating
# Inbound admission
# =========================================================================
def _allow_group_message(self, sender_id: Any, chat_id: str = "") -> bool:
def _admit(self, sender: Any, message: Any) -> Optional[RejectReason]:
sender_ids = _sender_identity(sender)
self_ids = frozenset(v for v in (self._bot_open_id, self._bot_user_id) if v)
is_bot = _is_bot_sender(sender)
is_group = getattr(message, "chat_type", "p2p") != "p2p"
chat_id = getattr(message, "chat_id", "") or ""
require_mention = is_group and self._require_mention_for(chat_id)
# Defensive only — Feishu doesn't echo our outbound back as inbound,
# and open_id is always populated on both sides.
if self_ids and sender_ids & self_ids:
return "self_echo"
if is_bot:
mode = self._allow_bots
if mode != "mentions" and mode != "all":
return "bots_disabled"
# Defensive: pre-hydration or malformed payloads.
if not self_ids or not sender_ids:
return "self_ids_unknown"
# Step 4 covers mention enforcement for groups when require_mention
# is on; check here only on paths step 4 won't reach.
if mode == "mentions" and not require_mention and not self._mentions_self(message):
return "bot_not_mentioned"
if not is_group:
return None
if not self._allow_group_message(
getattr(sender, "sender_id", None), chat_id, is_bot=is_bot,
):
return "group_policy_rejected"
if require_mention and not self._mentions_self(message):
return "group_policy_rejected"
return None
def _require_mention_for(self, chat_id: str) -> bool:
rule = self._group_rules.get(chat_id) if chat_id else None
if rule and rule.require_mention is not None:
return rule.require_mention
return self._require_mention
# --- Group policy ---------------------------------------------------------
def _allow_group_message(
self,
sender_id: Any,
chat_id: str = "",
*,
is_bot: bool = False,
) -> bool:
"""Per-group policy gate for non-DM traffic."""
sender_open_id = getattr(sender_id, "open_id", None)
sender_user_id = getattr(sender_id, "user_id", None)
@@ -3612,12 +3781,17 @@ class FeishuAdapter(BasePlatformAdapter):
allowlist = self._allowed_group_users
blacklist = set()
# Channel locks apply to everyone; allowlist/blacklist only gate humans
# (bots were already cleared upstream by FEISHU_ALLOW_BOTS).
if policy == "disabled":
return False
if policy == "open":
return True
if policy == "admin_only":
return False
if is_bot:
return True
if policy == "allowlist":
return bool(sender_ids and (sender_ids & allowlist))
if policy == "blacklist":
@@ -3625,17 +3799,16 @@ class FeishuAdapter(BasePlatformAdapter):
return bool(sender_ids and (sender_ids & self._allowed_group_users))
def _should_accept_group_message(self, message: Any, sender_id: Any, chat_id: str = "") -> bool:
"""Require an explicit @mention before group messages enter the agent."""
if not self._allow_group_message(sender_id, chat_id):
return False
# @_all is Feishu's @everyone placeholder — always route to the bot.
# --- Mention detection ----------------------------------------------------
def _mentions_self(self, message: Any) -> bool:
# @_all is Feishu's @everyone placeholder.
raw_content = getattr(message, "content", "") or ""
if "@_all" in raw_content:
return True
mentions = getattr(message, "mentions", None) or []
if mentions:
return self._message_mentions_bot(mentions)
if mentions and self._message_mentions_bot(mentions):
return True
normalized = normalize_feishu_message(
message_type=getattr(message, "message_type", "") or "",
raw_content=raw_content,
@@ -3644,23 +3817,6 @@ class FeishuAdapter(BasePlatformAdapter):
)
return self._post_mentions_bot(normalized.mentions)
def _is_self_sent_bot_message(self, event: Any) -> bool:
"""Return True only for Feishu events emitted by this Hermes bot."""
sender = getattr(event, "sender", None)
sender_type = str(getattr(sender, "sender_type", "") or "").strip().lower()
if sender_type not in {"bot", "app"}:
return False
sender_id = getattr(sender, "sender_id", None)
sender_open_id = str(getattr(sender_id, "open_id", "") or "").strip()
sender_user_id = str(getattr(sender_id, "user_id", "") or "").strip()
if self._bot_open_id and sender_open_id == self._bot_open_id:
return True
if self._bot_user_id and sender_user_id == self._bot_user_id:
return True
return False
def _message_mentions_bot(self, mentions: List[Any]) -> bool:
# IDs trump names: when both sides have open_id (or both user_id),
# match requires equal IDs. Name fallback only when either side
@@ -3804,7 +3960,7 @@ class FeishuAdapter(BasePlatformAdapter):
recent = self._seen_message_order[-self._dedup_cache_size:]
# Save as {msg_id: timestamp} so TTL filtering works across restarts.
payload = {"message_ids": {k: self._seen_message_ids[k] for k in recent if k in self._seen_message_ids}}
self._dedup_state_path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
atomic_json_write(self._dedup_state_path, payload, indent=None)
except OSError:
logger.warning("[Feishu] Failed to persist dedup state to %s", self._dedup_state_path, exc_info=True)
+3 -2
View File
@@ -13,6 +13,8 @@ import time
from pathlib import Path
from typing import TYPE_CHECKING, Dict
from utils import atomic_json_write
if TYPE_CHECKING:
from gateway.platforms.base import MessageEvent
@@ -237,12 +239,11 @@ class ThreadParticipationTracker:
def _save(self) -> None:
path = self._state_path()
path.parent.mkdir(parents=True, exist_ok=True)
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
path.write_text(json.dumps(thread_list), encoding="utf-8")
atomic_json_write(path, thread_list, indent=None)
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
+12
View File
@@ -534,6 +534,18 @@ class SignalAdapter(BasePlatformAdapter):
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Skip envelopes with no meaningful content (no text, no attachments).
# Catches profile key updates, empty messages, and other metadata-only
# envelopes that still carry a dataMessage wrapper but have nothing
# worth processing. See issue: signal-cli logs "Profile key update" +
# Hermes receives msg='' triggering a full agent turn for nothing.
if (not text or not text.strip()) and not media_urls:
logger.debug(
"Signal: skipping contentless envelope from %s (%d attachments)",
redact_phone(sender), len(media_urls) if media_urls else 0,
)
return
# Build session source
source = self.build_source(
chat_id=chat_id,
+221 -13
View File
@@ -9,6 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import contextvars
import json
import logging
import os
@@ -21,6 +22,7 @@ try:
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
import aiohttp
SLACK_AVAILABLE = True
except ImportError:
SLACK_AVAILABLE = False
@@ -50,6 +52,16 @@ from gateway.platforms.base import (
logger = logging.getLogger(__name__)
# ContextVar carrying the user_id of the slash-command invoker.
# Set in _handle_slash_command, read in send() to match the correct
# stashed response_url when multiple users issue commands on the same
# channel concurrently. ContextVars propagate to child asyncio.Tasks
# (Python 3.7+), so the value set in _handle_slash_command's task is
# visible in _process_message_background's child task.
_slash_user_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
"_slash_user_id", default=None,
)
@dataclass
class _ThreadContextCache:
@@ -310,6 +322,11 @@ class SlackAdapter(BasePlatformAdapter):
# Track active assistant thread status indicators so stop_typing can
# clear them (chat_id → thread_ts).
self._active_status_threads: Dict[str, str] = {}
# Slash-command contexts: stash response_url + user_id so send()
# can route the first reply ephemerally. Keyed by
# (channel_id, user_id) to avoid cross-user collisions.
# Each value: {"response_url": str, "ts": float}
self._slash_command_contexts: Dict[Tuple[str, str], Dict[str, Any]] = {}
def _describe_slack_api_error(self, response: Any, *, file_obj: Optional[Dict[str, Any]] = None) -> Optional[str]:
"""Convert Slack API auth/permission failures into actionable user-facing text."""
@@ -368,6 +385,103 @@ class SlackAdapter(BasePlatformAdapter):
)
return None
# ------------------------------------------------------------------
# Slash-command ephemeral helpers
# ------------------------------------------------------------------
_SLASH_CTX_TTL = 120.0 # seconds — response_url is valid for 30 min;
# we use a much shorter TTL to avoid routing unrelated messages
# as ephemeral if the command handler was slow or dropped.
def _pop_slash_context(
self, chat_id: str,
) -> Optional[Dict[str, Any]]:
"""Return and remove the slash-command context for *chat_id*, if fresh.
Contexts older than ``_SLASH_CTX_TTL`` seconds are silently discarded.
Uses the ``_slash_user_id`` ContextVar (set in ``_handle_slash_command``)
to match the exact ``(channel_id, user_id)`` key. This prevents a
concurrent slash command from a different user on the same channel from
stealing another user's ephemeral context. Falls back to a
channel-only scan when the ContextVar is unset (e.g. send() called
from a non-slash code path should not match anything).
"""
now = time.monotonic()
# Clean up stale entries on every lookup — dict is small.
stale_keys = [
k for k, v in self._slash_command_contexts.items()
if now - v["ts"] > self._SLASH_CTX_TTL
]
for k in stale_keys:
self._slash_command_contexts.pop(k, None)
# Precise match: (channel_id, user_id) from ContextVar.
uid = _slash_user_id.get()
if uid:
return self._slash_command_contexts.pop((chat_id, uid), None)
# Fallback: channel-only scan (only reachable when ContextVar is
# unset, i.e. send() called outside a slash-command async context).
match_key = None
for key in list(self._slash_command_contexts):
if key[0] == chat_id:
match_key = key
break
if match_key is None:
return None
return self._slash_command_contexts.pop(match_key)
async def _send_slash_ephemeral(
self,
ctx: Dict[str, Any],
content: str,
) -> "SendResult":
"""Replace the initial ephemeral ack via ``response_url``.
Slack's ``response_url`` accepts a POST with ``replace_original``
for up to 30 minutes after the slash command was invoked. This
lets us swap the "Running /cmd…" placeholder with the real reply,
and the message stays ephemeral ("Only visible to you").
Falls back to a simple ``True`` SendResult if the POST fails
the user already saw the initial ack, so a delivery failure here
is non-critical.
"""
formatted = self.format_message(content)
# Slack's response_url has the same ~40k char limit as chat_postMessage.
# Truncate to MAX_MESSAGE_LENGTH and use only the first chunk — the
# response_url replaces a single ephemeral ack, so multi-chunk isn't
# possible. Long responses are rare for command replies.
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
text = chunks[0] if chunks else formatted
payload = {
"response_type": "ephemeral",
"replace_original": True,
"text": text,
}
try:
async with aiohttp.ClientSession() as session:
async with session.post(
ctx["response_url"],
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=None)
body = await resp.text()
logger.warning(
"[Slack] response_url POST returned %s: %s",
resp.status,
body[:200],
)
except Exception as e:
logger.warning(
"[Slack] response_url POST failed: %s", e,
)
# Non-fatal — the user saw the initial ack already.
return SendResult(success=True, message_id=None)
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
if not SLACK_AVAILABLE:
@@ -446,12 +560,16 @@ class SlackAdapter(BasePlatformAdapter):
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
# Handle app_mention explicitly. In some Slack app configurations,
# channel mentions arrive only as app_mention events rather than the
# generic message event. Forward them into the normal message
# pipeline so @mentions reliably produce replies.
# NOTE: when Slack fires BOTH message and app_mention for the same
# @mention, they share the same event ts — the dedup in
# _handle_slack_message (MessageDeduplicator) suppresses the second.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
pass
await self._handle_slack_message(event)
# File lifecycle events can arrive around snippet uploads even when
# the actual user message is what we care about. Ack them so Slack
@@ -502,7 +620,11 @@ class SlackAdapter(BasePlatformAdapter):
@self._app.command(_slash_pattern)
async def handle_hermes_command(ack, command):
await ack()
slash = (command.get("command") or "").lstrip("/")
await ack(
response_type="ephemeral",
text=f"Running `/{slash}`…",
)
await self._handle_slash_command(command)
# Register Block Kit action handlers for approval buttons
@@ -574,6 +696,17 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Not connected")
try:
# Check for a pending slash-command context. When the user ran a
# native slash command (e.g. /q, /stop, /model), the initial ack
# already showed an ephemeral "Running /cmd…" message. If we have
# a stashed response_url for this channel, replace that ack with
# the actual command reply ephemerally instead of posting publicly.
slash_ctx = self._pop_slash_context(chat_id)
if slash_ctx:
return await self._send_slash_ephemeral(
slash_ctx, content,
)
# Convert standard markdown → Slack mrkdwn
formatted = self.format_message(content)
@@ -601,6 +734,10 @@ class SlackAdapter(BasePlatformAdapter):
last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)
# Clear Slack Assistant status as soon as the final message is posted.
if thread_ts:
await self.stop_typing(chat_id)
# Track the sent message ts so we can auto-respond to thread
# replies without requiring @mention.
sent_ts = last_result.get("ts") if last_result else None
@@ -624,6 +761,42 @@ class SlackAdapter(BasePlatformAdapter):
logger.error("[Slack] Send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def send_private_notice(
self,
chat_id: str,
user_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a Slack ephemeral message visible only to one user."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not chat_id or not user_id:
return SendResult(success=False, error="chat_id and user_id are required")
try:
formatted = self.format_message(content)
thread_ts = self._resolve_thread_ts(reply_to, metadata)
kwargs = {
"channel": chat_id,
"user": user_id,
"text": formatted,
"mrkdwn": True,
}
if thread_ts:
kwargs["thread_ts"] = thread_ts
result = await self._get_client(chat_id).chat_postEphemeral(**kwargs)
return SendResult(
success=True,
message_id=result.get("message_ts") or result.get("ts"),
raw_response=result,
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Ephemeral send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def edit_message(
self,
chat_id: str,
@@ -642,6 +815,8 @@ class SlackAdapter(BasePlatformAdapter):
ts=message_id,
text=formatted,
)
if finalize:
await self.stop_typing(chat_id)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
@@ -682,7 +857,7 @@ class SlackAdapter(BasePlatformAdapter):
# in an assistant-enabled context. Falls back to reactions.
logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)
async def stop_typing(self, chat_id: str) -> None:
async def stop_typing(self, chat_id: str, metadata=None) -> None:
"""Clear the assistant thread status indicator."""
if not self._app:
return
@@ -969,7 +1144,7 @@ class SlackAdapter(BasePlatformAdapter):
return _ph(f'<{url}|{label}>')
text = re.sub(
r'\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
r'(?<!!)\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
_convert_markdown_link,
text,
)
@@ -1016,9 +1191,11 @@ class SlackAdapter(BasePlatformAdapter):
)
# 10) Convert italic: _text_ stays as _text_ (already Slack italic)
# Single *text* → _text_ (Slack italic)
# Single *text* → _text_ (Slack italic), but only when the
# emphasized text touches non-whitespace on both sides so literal
# delimiters like "a * b * c" are preserved.
text = re.sub(
r'(?<!\*)\*([^*\n]+)\*(?!\*)',
r'(?<!\*)\*(\S(?:[^*\n]*?\S)?)\*(?!\*)',
lambda m: _ph(f'_{m.group(1)}_'),
text,
)
@@ -2524,9 +2701,14 @@ class SlackAdapter(BasePlatformAdapter):
# gateway command dispatcher by prepending the slash.
text = f"/{slash_name} {text}".strip()
# Slack slash commands can originate from DMs or shared channels.
# Preserve DM semantics only for DM channel IDs; shared channels must
# keep group semantics so different users do not collide into one
# session key.
is_dm = str(channel_id).startswith("D")
source = self.build_source(
chat_id=channel_id,
chat_type="dm", # Slash commands are always in DM-like context
chat_type="dm" if is_dm else "group",
user_id=user_id,
)
@@ -2537,7 +2719,26 @@ class SlackAdapter(BasePlatformAdapter):
raw_message=command,
)
await self.handle_message(event)
# Stash the Slack response_url so the first reply for this
# channel+user can be routed ephemerally (replaces the initial
# "Running /cmd…" ack shown by handle_hermes_command).
# Only stash for COMMAND events (text starts with "/") — free-form
# questions via "/hermes <question>" must produce public replies so
# the whole channel can see the agent's answer.
response_url = command.get("response_url", "")
if response_url and user_id and channel_id and text.startswith("/"):
self._slash_command_contexts[(channel_id, user_id)] = {
"response_url": response_url,
"ts": time.monotonic(),
}
# Set the ContextVar so send() can match the correct stashed
# response_url even when multiple users slash concurrently.
_slash_user_id_token = _slash_user_id.set(user_id or None)
try:
await self.handle_message(event)
finally:
_slash_user_id.reset(_slash_user_id_token)
def _has_active_session_for_thread(
self,
@@ -2698,6 +2899,13 @@ class SlackAdapter(BasePlatformAdapter):
raw = os.getenv("SLACK_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# A bare numeric YAML value (`free_response_channels: 1234567890`) is
# loaded as int and was previously falling through the isinstance(str)
# branch to return an empty set. str() here accepts whatever scalar
# the YAML loader hands us without changing existing string/CSV
# semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
return set()
+88 -7
View File
@@ -290,14 +290,53 @@ class TelegramAdapter(BasePlatformAdapter):
# and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
self._slash_confirm_state: Dict[str, str] = {}
@staticmethod
def _is_callback_user_authorized(user_id: str) -> bool:
def _is_callback_user_authorized(
self,
user_id: str,
*,
chat_id: Optional[str] = None,
chat_type: Optional[str] = None,
thread_id: Optional[str] = None,
user_name: Optional[str] = None,
) -> bool:
"""Return whether a Telegram inline-button caller may perform gated actions."""
normalized_user_id = str(user_id or "").strip()
if not normalized_user_id:
return False
runner = getattr(getattr(self, "_message_handler", None), "__self__", None)
auth_fn = getattr(runner, "_is_user_authorized", None)
if callable(auth_fn):
try:
from gateway.session import SessionSource
normalized_chat_type = str(chat_type or "dm").strip().lower() or "dm"
if normalized_chat_type == "private":
normalized_chat_type = "dm"
elif normalized_chat_type == "supergroup":
normalized_chat_type = "forum" if thread_id is not None else "group"
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id=str(chat_id or normalized_user_id),
chat_type=normalized_chat_type,
user_id=normalized_user_id,
user_name=str(user_name).strip() if user_name else None,
thread_id=str(thread_id) if thread_id is not None else None,
)
return bool(auth_fn(source))
except Exception:
logger.debug(
"[Telegram] Falling back to env-only callback auth for user %s",
normalized_user_id,
exc_info=True,
)
allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
if not allowed_csv:
return True
allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
return "*" in allowed_ids or user_id in allowed_ids
return "*" in allowed_ids or normalized_user_id in allowed_ids
@classmethod
def _metadata_thread_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[str]:
@@ -722,6 +761,20 @@ class TelegramAdapter(BasePlatformAdapter):
# Persist thread_id to config so we don't recreate on next restart
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
# Send a seed message so the topic is visible in Telegram's client.
# Empty topics are hidden by the client UI until they contain a message.
try:
await self._bot.send_message(
chat_id=int(chat_id),
message_thread_id=thread_id,
text=f"\U0001f4cc {topic_name}",
)
except Exception as seed_err:
logger.debug(
"[%s] Could not send seed message to topic '%s': %s",
self.name, topic_name, seed_err,
)
async def connect(self) -> bool:
"""Connect to Telegram via polling or webhook.
@@ -1321,6 +1374,7 @@ class TelegramAdapter(BasePlatformAdapter):
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an inline-keyboard update prompt (Yes / No buttons).
@@ -1338,11 +1392,14 @@ class TelegramAdapter(BasePlatformAdapter):
InlineKeyboardButton("✗ No", callback_data="update_prompt:n"),
]
])
thread_id = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_send(thread_id)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
message_thread_id=message_thread_id,
**self._link_preview_kwargs(),
)
return SendResult(success=True, message_id=str(msg.message_id))
@@ -1760,6 +1817,12 @@ class TelegramAdapter(BasePlatformAdapter):
if not query or not query.data:
return
data = query.data
query_message = getattr(query, "message", None)
query_chat_id = getattr(query_message, "chat_id", None)
query_chat = getattr(query_message, "chat", None)
query_chat_type = getattr(query_chat, "type", None)
query_thread_id = getattr(query_message, "message_thread_id", None)
query_user_name = getattr(query.from_user, "first_name", None)
# --- Model picker callbacks ---
if data.startswith(("mp:", "mm:", "mb", "mx", "mg:")):
@@ -1781,7 +1844,13 @@ class TelegramAdapter(BasePlatformAdapter):
# Only authorized users may click approval buttons.
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to approve commands.")
return
@@ -1831,8 +1900,14 @@ class TelegramAdapter(BasePlatformAdapter):
choice = parts[1] # once, always, cancel
confirm_id = parts[2]
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to answer this prompt.")
return
@@ -1891,7 +1966,13 @@ class TelegramAdapter(BasePlatformAdapter):
return
answer = data.split(":", 1)[1] # "y" or "n"
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to answer update prompts.")
return
await query.answer(text=f"Sent '{answer}' to the update process.")
+6 -4
View File
@@ -1896,10 +1896,12 @@ class OwnerCommandMiddleware(InboundMiddleware):
if cmd not in cls.ALLOWLIST:
return None, None, False
# Sender identity check: bot owner <-> push.from_account == push.bot_owner_id
# owner_id = (push or {}).get("bot_owner_id") or ""
# is_owner = bool(owner_id) and owner_id == from_account
is_owner = True
# Sender identity check: bot owner <-> push.from_account == push.bot_owner_id.
# The allowlisted commands (/approve, /deny, /stop, /reset, ...) are
# privileged — leaking them to non-owners lets any group member approve
# a dangerous tool call, kill the owner's task, or wipe session state.
owner_id = str((push or {}).get("bot_owner_id") or "").strip()
is_owner = bool(owner_id) and owner_id == from_account
return cmd, cmd_line, is_owner
async def handle(self, ctx: InboundContext, next_fn) -> None:
+1128 -80
View File
File diff suppressed because it is too large Load Diff
+12
View File
@@ -458,6 +458,15 @@ class SessionEntry:
was_auto_reset: bool = False
auto_reset_reason: Optional[str] = None # "idle" or "daily"
reset_had_activity: bool = False # whether the expired session had any messages
# Set by reset_session() when the user explicitly sends /new or /reset.
# Consumed once by _handle_message_with_agent to trigger topic/channel
# skill re-injection on the first message of the new session. We can't
# reuse was_auto_reset for this because that flag fires the "session
# expired due to inactivity" user-facing notice and a misleading
# context-note prepend — both wrong for an explicit manual reset.
# See issue #6508.
is_fresh_reset: bool = False
# Set by the background expiry watcher after it finalizes an expired
# session (invoking on_session_finalize hooks and evicting the cached
@@ -508,6 +517,7 @@ class SessionEntry:
if self.last_resume_marked_at
else None
),
"is_fresh_reset": self.is_fresh_reset,
}
if self.origin:
result["origin"] = self.origin.to_dict()
@@ -556,6 +566,7 @@ class SessionEntry:
resume_pending=data.get("resume_pending", False),
resume_reason=data.get("resume_reason"),
last_resume_marked_at=last_resume_marked_at,
is_fresh_reset=data.get("is_fresh_reset", False),
)
@@ -1132,6 +1143,7 @@ class SessionStore:
display_name=old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
is_fresh_reset=True,
)
self._entries[session_key] = new_entry
+8 -4
View File
@@ -21,6 +21,7 @@ from datetime import datetime, timezone
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Any, Optional
from utils import atomic_json_write
if sys.platform == "win32":
import msvcrt
@@ -34,6 +35,10 @@ _IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
_GATEWAY_LOCK_FILENAME = "gateway.lock"
_gateway_lock_handle = None
# Windows byte-range locks are mandatory for other readers. Lock a byte well
# past the JSON payload so runtime status / PID readers can still read the file
# while another process holds the mutual-exclusion lock.
_WINDOWS_LOCK_OFFSET = 1024 * 1024
def _get_pid_path() -> Path:
@@ -205,8 +210,7 @@ def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
def _write_json_file(path: Path, payload: dict[str, Any]) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(payload))
atomic_json_write(path, payload, indent=None, separators=(",", ":"))
def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
@@ -286,7 +290,7 @@ def _try_acquire_file_lock(handle) -> bool:
if handle.tell() == 0:
handle.write("\n")
handle.flush()
handle.seek(0)
handle.seek(_WINDOWS_LOCK_OFFSET)
msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
@@ -298,7 +302,7 @@ def _try_acquire_file_lock(handle) -> bool:
def _release_file_lock(handle) -> None:
try:
if _IS_WINDOWS:
handle.seek(0)
handle.seek(_WINDOWS_LOCK_OFFSET)
msvcrt.locking(handle.fileno(), msvcrt.LK_UNLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
+5 -5
View File
@@ -43,7 +43,7 @@ import yaml
from hermes_cli.config import get_hermes_home, get_config_path, read_raw_config
from hermes_constants import OPENROUTER_BASE_URL
from utils import atomic_replace
from utils import atomic_replace, atomic_yaml_write, is_truthy_value
logger = logging.getLogger(__name__)
@@ -2480,8 +2480,8 @@ def _resolve_verify(
tls_state = tls_state if isinstance(tls_state, dict) else {}
effective_insecure = (
bool(insecure) if insecure is not None
else bool(tls_state.get("insecure", False))
is_truthy_value(insecure, default=False) if insecure is not None
else is_truthy_value(tls_state.get("insecure", False), default=False)
)
effective_ca = (
ca_bundle
@@ -3653,7 +3653,7 @@ def _update_config_for_provider(
config["model"] = model_cfg
config_path.write_text(yaml.safe_dump(config, sort_keys=False))
atomic_yaml_write(config_path, config, sort_keys=False)
return config_path
@@ -3712,7 +3712,7 @@ def _reset_config_provider() -> Path:
model["provider"] = "auto"
if "base_url" in model:
model["base_url"] = OPENROUTER_BASE_URL
config_path.write_text(yaml.safe_dump(config, sort_keys=False))
atomic_yaml_write(config_path, config, sort_keys=False)
return config_path
+25 -1
View File
@@ -19,6 +19,8 @@ from collections.abc import Callable, Mapping
from dataclasses import dataclass
from typing import Any
from utils import is_truthy_value
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
@@ -66,6 +68,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True),
CommandDef("history", "Show conversation history", "Session",
cli_only=True),
CommandDef("recap", "Summarize recent activity in this session", "Session"),
CommandDef("save", "Save the current conversation", "Session",
cli_only=True),
CommandDef("retry", "Retry the last message (resend to agent)", "Session"),
@@ -93,6 +96,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
aliases=("q",), args_hint="<prompt>"),
CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
args_hint="<prompt>"),
CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
args_hint="[text | pause | resume | clear | status]"),
CommandDef("status", "Show session info", "Session"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
@@ -151,6 +156,11 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("curator", "Background skill maintenance (status, run, pin, archive)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("status", "run", "pause", "resume", "pin", "unpin", "restore")),
CommandDef("kanban", "Multi-profile collaboration board (tasks, links, comments)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("list", "ls", "show", "create", "assign", "link", "unlink",
"claim", "comment", "complete", "block", "unblock", "archive",
"tail", "dispatch", "context", "init", "gc")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
cli_only=True),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
@@ -310,6 +320,7 @@ ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
"new",
"profile",
"queue",
"recap",
"restart",
"status",
"steer",
@@ -366,7 +377,7 @@ def _resolve_config_gates() -> set[str]:
else:
val = None
break
if val:
if is_truthy_value(val, default=False):
result.add(cmd.name)
return result
@@ -829,6 +840,13 @@ def discord_skill_commands_by_category(
_SLACK_MAX_SLASH_COMMANDS = 50
_SLACK_NAME_LIMIT = 32
_SLACK_INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
_SLACK_RESERVED_COMMANDS = frozenset({
# Built-in Slack slash commands that cannot be registered by apps.
# https://slack.com/help/articles/201259356-Use-built-in-slash-commands
"me", "status", "away", "dnd", "shrug", "remind", "msg", "feed",
"who", "collapse", "expand", "leave", "join", "open", "search",
"topic", "mute", "pro", "shortcuts",
})
def _sanitize_slack_name(raw: str) -> str:
@@ -855,6 +873,10 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
documented form (e.g. ``/background``, ``/bg``, and ``/btw`` all work).
Plugin-registered slash commands are included too.
Commands whose sanitized name collides with a Slack built-in
(e.g. ``/status``, ``/me``, ``/join``) are silently skipped. Users
can still reach them via ``/hermes <command>``.
Results are clamped to Slack's 50-command limit with duplicate-name
avoidance. ``/hermes`` is always reserved as the first entry so the
legacy ``/hermes <subcommand>`` form keeps working for anything that
@@ -872,6 +894,8 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
slack_name = _sanitize_slack_name(name)
if not slack_name or slack_name in seen:
return
if slack_name in _SLACK_RESERVED_COMMANDS:
return
if len(entries) >= _SLACK_MAX_SLASH_COMMANDS:
return
# Slack description cap is 2000 chars; keep it short.
+81 -2
View File
@@ -457,6 +457,7 @@ DEFAULT_CONFIG = {
# remains available as a tool regardless of this setting — the routing
# only controls how inbound user images are presented.
"image_input_mode": "auto",
"disabled_toolsets": [],
},
"terminal": {
@@ -606,6 +607,24 @@ DEFAULT_CONFIG = {
"max_line_length": 2000,
},
# Tool loop guardrails nudge models when they repeat failed or
# non-progressing tool calls. Soft warnings are always-on by default;
# hard stops are opt-in so interactive CLI/TUI sessions keep flowing.
"tool_loop_guardrails": {
"warnings_enabled": True,
"hard_stop_enabled": False,
"warn_after": {
"exact_failure": 2,
"same_tool_failure": 3,
"idempotent_no_progress": 2,
},
"hard_stop_after": {
"exact_failure": 5,
"same_tool_failure": 8,
"idempotent_no_progress": 5,
},
},
"compression": {
"enabled": True,
"threshold": 0.50, # compress when context usage exceeds this ratio
@@ -756,6 +775,14 @@ DEFAULT_CONFIG = {
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_progress_overrides": {}, # DEPRECATED — use display.platforms instead
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
# Auto-delete system-notice replies (e.g. "✨ New session started!",
# "♻ Restarting gateway…", "⚡ Stopped…") after N seconds on platforms
# that support message deletion (currently Telegram; other platforms
# ignore and leave the message in place). Only affects slash-command
# replies wrapped with gateway.platforms.base.EphemeralReply — agent
# responses and content messages are never touched. Default 0
# (disabled) preserves prior behavior.
"ephemeral_system_ttl": 0,
"platforms": {}, # Per-platform display overrides: {"telegram": {"tool_progress": "all"}, "slack": {"tool_progress": "off"}}
# Gateway runtime-metadata footer appended to the FINAL message of a turn
# (disabled by default to keep replies minimal). When enabled, renders
@@ -925,7 +952,23 @@ DEFAULT_CONFIG = {
# injected at the start of every API call for few-shot priming.
# Never saved to sessions, logs, or trajectories.
"prefill_messages_file": "",
# Goals — persistent cross-turn goals (Ralph-style loop).
# After every turn, a lightweight judge call asks the auxiliary model
# whether the active /goal is satisfied by the assistant's last
# response. If not, Hermes feeds a continuation prompt back into the
# same session and keeps working until the goal is done, the turn
# budget is exhausted, or the user pauses/clears it. Judge failures
# fail OPEN (continue) so a flaky judge never wedges progress — the
# turn budget is the real backstop.
"goals": {
# Max continuation turns before Hermes auto-pauses the goal and
# asks the user to /goal resume. Protects against judge false
# negatives (goal actually done but judge says continue) and
# unbounded model spend on fuzzy / unachievable goals.
"max_turns": 20,
},
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
@@ -979,6 +1022,14 @@ DEFAULT_CONFIG = {
# Archive a skill (move to skills/.archive/) after this many days
# without use. Archived skills are recoverable — no auto-deletion.
"archive_after_days": 90,
# Pre-run backup: before every real curator pass (dry-run is
# skipped), snapshot ~/.hermes/skills/ into
# ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz so the
# user can roll back with `hermes curator rollback`.
"backup": {
"enabled": True,
"keep": 5, # retain last N regular snapshots
},
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -1104,6 +1155,24 @@ DEFAULT_CONFIG = {
"max_parallel_jobs": None,
},
# Kanban multi-agent coordination — controls the dispatcher loop that
# spawns workers for ready tasks. The dispatcher ticks every N seconds
# (default 60), reclaims stale claims, promotes dependency-satisfied
# todos to ready, and fires `hermes -p <assignee> chat -q ...` for
# each claimable ready task. One dispatcher per profile is sufficient;
# running more than one on the same kanban.db will race for claims.
"kanban": {
# Run the dispatcher inside the gateway process. On by default —
# the cost is ~300µs every `dispatch_interval_seconds` when idle,
# and gateway is the supervisor users already have. Set to false
# only if you run the dispatcher as a separate systemd unit or
# don't want the gateway to spawn workers.
"dispatch_in_gateway": True,
# Seconds between dispatcher ticks (idle or not). Lower = snappier
# pickup of newly-ready tasks; higher = less SQL pressure.
"dispatch_interval_seconds": 60,
},
# execute_code settings — controls the tool used for programmatic tool calls.
"code_execution": {
# Execution mode:
@@ -2400,7 +2469,17 @@ def get_missing_skill_config_vars() -> List[Dict[str, Any]]:
except Exception:
return []
all_vars = discover_all_skill_config_vars()
try:
all_vars = discover_all_skill_config_vars()
except Exception as e:
# A malformed SKILL.md, unreadable external skill dir, or similar
# should never break `hermes update`. Skill-config prompting is a
# post-migration nicety, not a blocker.
import logging
logging.getLogger(__name__).debug(
"discover_all_skill_config_vars failed: %s", e
)
return []
if not all_vars:
return []
+138 -7
View File
@@ -160,7 +160,11 @@ def _cmd_run(args) -> int:
print("curator: disabled via config; enable with `curator.enabled: true`")
return 1
print("curator: running review pass...")
dry = bool(getattr(args, "dry_run", False))
if dry:
print("curator: running DRY-RUN (report only, no mutations)...")
else:
print("curator: running review pass...")
def _on_summary(msg: str) -> None:
print(msg)
@@ -168,17 +172,29 @@ def _cmd_run(args) -> int:
result = curator.run_curator_review(
on_summary=_on_summary,
synchronous=bool(args.synchronous),
dry_run=dry,
)
auto = result.get("auto_transitions", {})
if auto:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if dry:
print(
f"auto (preview): {auto.get('checked', 0)} candidate skill(s) "
"— no transitions applied in dry-run"
)
else:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if not args.synchronous:
print("llm pass running in background — check `hermes curator status` later")
if dry:
print(
"dry-run: no changes applied. When the report lands, read it with "
"`hermes curator status` and run `hermes curator run` (no flag) to apply."
)
return 0
@@ -229,6 +245,86 @@ def _cmd_restore(args) -> int:
return 0 if ok else 1
def _cmd_backup(args) -> int:
"""Take a manual snapshot of the skills tree. Same mechanism as the
automatic pre-run snapshot, just user-initiated."""
from agent import curator_backup
if not curator_backup.is_enabled():
print(
"curator: backups are disabled via config "
"(`curator.backup.enabled: false`); re-enable to snapshot"
)
return 1
reason = getattr(args, "reason", None) or "manual"
snap = curator_backup.snapshot_skills(reason=reason)
if snap is None:
print("curator: snapshot failed — check logs (backup disabled or IO error)")
return 1
print(f"curator: snapshot created at ~/.hermes/skills/.curator_backups/{snap.name}")
return 0
def _cmd_rollback(args) -> int:
"""Restore the skills tree from a snapshot. Defaults to newest.
``--list`` prints available snapshots and exits. ``--id <stamp>`` picks
a specific one. Without ``-y``, prompts for confirmation. A safety
snapshot of the current tree is always taken first, so rollbacks are
themselves undoable.
"""
from agent import curator_backup
if getattr(args, "list", False):
print(curator_backup.summarize_backups())
return 0
backup_id = getattr(args, "backup_id", None)
target_path = curator_backup._resolve_backup(backup_id)
if target_path is None:
rows = curator_backup.list_backups()
if not rows:
print(
"curator: no snapshots exist yet. Take one with "
"`hermes curator backup` or wait for the next curator run."
)
else:
print(
f"curator: no snapshot matching "
f"{'id ' + repr(backup_id) if backup_id else 'your query'}."
)
print("Available:")
print(curator_backup.summarize_backups())
return 1
manifest = curator_backup._read_manifest(target_path)
print(f"Rollback target: {target_path.name}")
if manifest:
print(f" reason: {manifest.get('reason', '?')}")
print(f" created_at: {manifest.get('created_at', '?')}")
print(f" skill files: {manifest.get('skill_files', '?')}")
print(
"\nThis will replace the current ~/.hermes/skills/ tree (a safety "
"snapshot of the current state is taken first so this is undoable)."
)
if not getattr(args, "yes", False):
try:
ans = input("Proceed? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\ncancelled")
return 1
if ans not in ("y", "yes"):
print("cancelled")
return 1
ok, msg, _ = curator_backup.rollback(backup_id=target_path.name)
if ok:
print(f"curator: {msg}")
return 0
print(f"curator: rollback failed — {msg}")
return 1
# ---------------------------------------------------------------------------
# argparse wiring (called from hermes_cli.main)
# ---------------------------------------------------------------------------
@@ -250,6 +346,11 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
"--sync", "--synchronous", dest="synchronous", action="store_true",
help="Wait for the LLM review pass to finish (default: background thread)",
)
p_run.add_argument(
"--dry-run", dest="dry_run", action="store_true",
help="Report only — no state changes, no archives, no consolidation "
"(use this to preview what curator would do)",
)
p_run.set_defaults(func=_cmd_run)
p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
@@ -270,6 +371,36 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
p_restore.add_argument("skill", help="Skill name")
p_restore.set_defaults(func=_cmd_restore)
p_backup = subs.add_parser(
"backup",
help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
"(curator also does this automatically before every real run)",
)
p_backup.add_argument(
"--reason", default=None,
help="Free-text label stored in manifest.json (default: 'manual')",
)
p_backup.set_defaults(func=_cmd_backup)
p_rollback = subs.add_parser(
"rollback",
help="Restore ~/.hermes/skills/ from a curator snapshot "
"(defaults to the newest)",
)
p_rollback.add_argument(
"--list", action="store_true",
help="List available snapshots and exit without restoring",
)
p_rollback.add_argument(
"--id", dest="backup_id", default=None,
help="Snapshot id to restore (see `--list`); default: newest",
)
p_rollback.add_argument(
"-y", "--yes", action="store_true",
help="Skip confirmation prompt",
)
p_rollback.set_defaults(func=_cmd_rollback)
def cli_main(argv=None) -> int:
"""Standalone entry (also usable by hermes_cli.main fallthrough)."""
+86 -1
View File
@@ -10,6 +10,7 @@ import shutil
import signal
import subprocess
import sys
import textwrap
from dataclasses import dataclass
from pathlib import Path
@@ -59,6 +60,13 @@ class GatewayRuntimeSnapshot:
def has_process_service_mismatch(self) -> bool:
return self.service_installed and self.running and not self.service_running
@dataclass(frozen=True)
class ProfileGatewayProcess:
profile: str
path: Path
pid: int
def _get_service_pids() -> set:
"""Return PIDs currently managed by systemd or launchd gateway services.
@@ -371,6 +379,83 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
return pids
def find_profile_gateway_processes(
exclude_pids: set | None = None,
) -> list[ProfileGatewayProcess]:
"""Return running gateway PIDs mapped to Hermes profiles via PID files."""
_exclude = set(exclude_pids or set())
processes: list[ProfileGatewayProcess] = []
try:
from gateway.status import get_running_pid
from hermes_cli.profiles import list_profiles
except Exception:
return processes
seen: set[int] = set()
for profile in list_profiles():
try:
pid = get_running_pid(profile.path / "gateway.pid", cleanup_stale=False)
except Exception:
continue
if pid is None or pid <= 0 or pid in _exclude or pid in seen:
continue
seen.add(pid)
processes.append(ProfileGatewayProcess(profile=profile.name, path=profile.path, pid=pid))
return processes
def _gateway_run_args_for_profile(profile: str) -> list[str]:
args = [get_python_path(), "-m", "hermes_cli.main"]
if profile != "default":
args.extend(["--profile", profile])
args.extend(["gateway", "run", "--replace"])
return args
def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
"""Relaunch a manually-run profile gateway after its current PID exits."""
if old_pid <= 0:
return False
watcher = textwrap.dedent(
"""
import os
import subprocess
import sys
import time
pid = int(sys.argv[1])
cmd = sys.argv[2:]
deadline = time.monotonic() + 120
while time.monotonic() < deadline:
try:
os.kill(pid, 0)
except ProcessLookupError:
break
except PermissionError:
pass
time.sleep(0.2)
subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
"""
).strip()
try:
subprocess.Popen(
[sys.executable, "-c", watcher, str(old_pid), *_gateway_run_args_for_profile(profile)],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
except OSError:
return False
return True
def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
selected_system = _select_systemd_scope(system)
unit_exists = get_systemd_unit_path(system=selected_system).exists()
@@ -4377,4 +4462,4 @@ def _gateway_command_inner(args):
if not supports_systemd_services() and not is_macos():
print("Legacy unit migration only applies to systemd-based Linux hosts.")
return
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
+535
View File
@@ -0,0 +1,535 @@
"""Persistent session goals — the Ralph loop for Hermes.
A goal is a free-form user objective that stays active across turns. After
each turn completes, a small judge call asks an auxiliary model "is this
goal satisfied by the assistant's last response?". If not, Hermes feeds a
continuation prompt back into the same session and keeps working until the
goal is done, turn budget is exhausted, the user pauses/clears it, or the
user sends a new message (which takes priority and pauses the goal loop).
State is persisted in SessionDB's ``state_meta`` table keyed by
``goal:<session_id>`` so ``/resume`` picks it up.
Design notes / invariants:
- The continuation prompt is just a normal user message appended to the
session via ``run_conversation``. No system-prompt mutation, no toolset
swap prompt caching stays intact.
- Judge failures are fail-OPEN: ``continue``. A broken judge must not wedge
progress; the turn budget is the backstop.
- When a real user message arrives mid-loop it preempts the continuation
prompt and also pauses the goal loop for that turn (we still re-judge
after, so if the user's message happens to complete the goal the judge
will say ``done``).
- This module has zero hard dependency on ``cli.HermesCLI`` or the gateway
runner both wire the same ``GoalManager`` in.
Nothing in this module touches the agent's system prompt or toolset.
"""
from __future__ import annotations
import json
import logging
import re
import time
from dataclasses import dataclass, asdict
from typing import Any, Dict, Optional, Tuple
logger = logging.getLogger(__name__)
# ──────────────────────────────────────────────────────────────────────
# Constants & defaults
# ──────────────────────────────────────────────────────────────────────
DEFAULT_MAX_TURNS = 20
DEFAULT_JUDGE_TIMEOUT = 30.0
# Cap how much of the last response + recent messages we send to the judge.
_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
CONTINUATION_PROMPT_TEMPLATE = (
"[Continuing toward your standing goal]\n"
"Goal: {goal}\n\n"
"Continue working toward this goal. Take the next concrete step. "
"If you believe the goal is complete, state so explicitly and stop. "
"If you are blocked and need input from the user, say so clearly and stop."
)
JUDGE_SYSTEM_PROMPT = (
"You are a strict judge evaluating whether an autonomous agent has "
"achieved a user's stated goal. You receive the goal text and the "
"agent's most recent response. Your only job is to decide whether "
"the goal is fully satisfied based on that response.\n\n"
"A goal is DONE only when:\n"
"- The response explicitly confirms the goal was completed, OR\n"
"- The response clearly shows the final deliverable was produced, OR\n"
"- The response explains the goal is unachievable / blocked / needs "
"user input (treat this as DONE with reason describing the block).\n\n"
"Otherwise the goal is NOT done — CONTINUE.\n\n"
"Reply ONLY with a single JSON object on one line:\n"
'{\"done\": <true|false>, \"reason\": \"<one-sentence rationale>\"}'
)
JUDGE_USER_PROMPT_TEMPLATE = (
"Goal:\n{goal}\n\n"
"Agent's most recent response:\n{response}\n\n"
"Is the goal satisfied?"
)
# ──────────────────────────────────────────────────────────────────────
# Dataclass
# ──────────────────────────────────────────────────────────────────────
@dataclass
class GoalState:
"""Serializable goal state stored per session."""
goal: str
status: str = "active" # active | paused | done | cleared
turns_used: int = 0
max_turns: int = DEFAULT_MAX_TURNS
created_at: float = 0.0
last_turn_at: float = 0.0
last_verdict: Optional[str] = None # "done" | "continue" | "skipped"
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@classmethod
def from_json(cls, raw: str) -> "GoalState":
data = json.loads(raw)
return cls(
goal=data.get("goal", ""),
status=data.get("status", "active"),
turns_used=int(data.get("turns_used", 0) or 0),
max_turns=int(data.get("max_turns", DEFAULT_MAX_TURNS) or DEFAULT_MAX_TURNS),
created_at=float(data.get("created_at", 0.0) or 0.0),
last_turn_at=float(data.get("last_turn_at", 0.0) or 0.0),
last_verdict=data.get("last_verdict"),
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
)
# ──────────────────────────────────────────────────────────────────────
# Persistence (SessionDB state_meta)
# ──────────────────────────────────────────────────────────────────────
def _meta_key(session_id: str) -> str:
return f"goal:{session_id}"
_DB_CACHE: Dict[str, Any] = {}
def _get_session_db() -> Optional[Any]:
"""Return a SessionDB instance for the current HERMES_HOME.
SessionDB has no built-in singleton, but opening a new connection per
/goal call would thrash the file. We cache one instance per
``hermes_home`` path so profile switches still pick up the right DB.
Defensive against import/instantiation failures so tests and
non-standard launchers can still use the GoalManager.
"""
try:
from hermes_constants import get_hermes_home
from hermes_state import SessionDB
home = str(get_hermes_home())
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB bootstrap failed (%s)", exc)
return None
cached = _DB_CACHE.get(home)
if cached is not None:
return cached
try:
db = SessionDB()
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB() raised (%s)", exc)
return None
_DB_CACHE[home] = db
return db
def load_goal(session_id: str) -> Optional[GoalState]:
"""Load the goal for a session, or None if none exists."""
if not session_id:
return None
db = _get_session_db()
if db is None:
return None
try:
raw = db.get_meta(_meta_key(session_id))
except Exception as exc:
logger.debug("GoalManager: get_meta failed: %s", exc)
return None
if not raw:
return None
try:
return GoalState.from_json(raw)
except Exception as exc:
logger.warning("GoalManager: could not parse stored goal for %s: %s", session_id, exc)
return None
def save_goal(session_id: str, state: GoalState) -> None:
"""Persist a goal to SessionDB. No-op if DB unavailable."""
if not session_id:
return
db = _get_session_db()
if db is None:
return
try:
db.set_meta(_meta_key(session_id), state.to_json())
except Exception as exc:
logger.debug("GoalManager: set_meta failed: %s", exc)
def clear_goal(session_id: str) -> None:
"""Mark a goal cleared in the DB (preserved for audit, status=cleared)."""
state = load_goal(session_id)
if state is None:
return
state.status = "cleared"
save_goal(session_id, state)
# ──────────────────────────────────────────────────────────────────────
# Judge
# ──────────────────────────────────────────────────────────────────────
def _truncate(text: str, limit: int) -> str:
if not text:
return ""
if len(text) <= limit:
return text
return text[:limit] + "… [truncated]"
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
def _parse_judge_response(raw: str) -> Tuple[bool, str]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
Returns ``(done, reason)``.
"""
if not raw:
return False, "judge returned empty response"
text = raw.strip()
# Strip markdown code fences the model may wrap JSON in.
if text.startswith("```"):
text = text.strip("`")
# Peel off leading json/JSON/etc tag
nl = text.find("\n")
if nl != -1:
text = text[nl + 1:]
# First try: parse the whole blob.
data: Optional[Dict[str, Any]] = None
try:
data = json.loads(text)
except Exception:
# Second try: pull the first JSON object out.
match = _JSON_OBJECT_RE.search(text)
if match:
try:
data = json.loads(match.group(0))
except Exception:
data = None
if not isinstance(data, dict):
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
done_val = data.get("done")
if isinstance(done_val, str):
done = done_val.strip().lower() in ("true", "yes", "1", "done")
else:
done = bool(done_val)
reason = str(data.get("reason") or "").strip()
if not reason:
reason = "no reason provided"
return done, reason
def judge_goal(
goal: str,
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
) -> Tuple[str, str]:
"""Ask the auxiliary model whether the goal is satisfied.
Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
or ``"skipped"`` (when the judge couldn't be reached).
This is deliberately fail-open: any error returns ``("continue", "...")``
so a broken judge doesn't wedge progress — the turn budget is the
backstop.
"""
if not goal.strip():
return "skipped", "empty goal"
if not last_response.strip():
# No substantive reply this turn — almost certainly not done yet.
return "continue", "empty response (nothing to evaluate)"
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable"
try:
client, model = get_text_auxiliary_client("goal_judge")
except Exception as exc:
logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
return "continue", "auxiliary client unavailable"
if client is None or not model:
return "continue", "no auxiliary client configured"
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
)
try:
resp = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": JUDGE_SYSTEM_PROMPT},
{"role": "user", "content": prompt},
],
temperature=0,
max_tokens=200,
timeout=timeout,
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
return "continue", f"judge error: {type(exc).__name__}"
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
done, reason = _parse_judge_response(raw)
verdict = "done" if done else "continue"
logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
return verdict, reason
# ──────────────────────────────────────────────────────────────────────
# GoalManager — the orchestration surface CLI + gateway talk to
# ──────────────────────────────────────────────────────────────────────
class GoalManager:
"""Per-session goal state + continuation decisions.
The CLI and gateway each hold one ``GoalManager`` per live session.
Methods:
- ``set(goal)`` start a new standing goal.
- ``clear()`` remove the active goal.
- ``pause()`` / ``resume()`` explicit user controls.
- ``status()`` printable one-liner.
- ``evaluate_after_turn(last_response)`` call the judge, update state,
and return a decision dict the caller uses to drive the next turn.
- ``next_continuation_prompt()`` the canonical user-role message to
feed back into ``run_conversation``.
"""
def __init__(self, session_id: str, *, default_max_turns: int = DEFAULT_MAX_TURNS):
self.session_id = session_id
self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS)
self._state: Optional[GoalState] = load_goal(session_id)
# --- introspection ------------------------------------------------
@property
def state(self) -> Optional[GoalState]:
return self._state
def is_active(self) -> bool:
return self._state is not None and self._state.status == "active"
def has_goal(self) -> bool:
return self._state is not None and self._state.status in ("active", "paused")
def status_line(self) -> str:
s = self._state
if s is None or s.status in ("cleared",):
return "No active goal. Set one with /goal <text>."
turns = f"{s.turns_used}/{s.max_turns} turns"
if s.status == "active":
return f"⊙ Goal (active, {turns}): {s.goal}"
if s.status == "paused":
extra = f"{s.paused_reason}" if s.paused_reason else ""
return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
if s.status == "done":
return f"✓ Goal done ({turns}): {s.goal}"
return f"Goal ({s.status}, {turns}): {s.goal}"
# --- mutation -----------------------------------------------------
def set(self, goal: str, *, max_turns: Optional[int] = None) -> GoalState:
goal = (goal or "").strip()
if not goal:
raise ValueError("goal text is empty")
state = GoalState(
goal=goal,
status="active",
turns_used=0,
max_turns=int(max_turns) if max_turns else self.default_max_turns,
created_at=time.time(),
last_turn_at=0.0,
)
self._state = state
save_goal(self.session_id, state)
return state
def pause(self, reason: str = "user-paused") -> Optional[GoalState]:
if not self._state:
return None
self._state.status = "paused"
self._state.paused_reason = reason
save_goal(self.session_id, self._state)
return self._state
def resume(self, *, reset_budget: bool = True) -> Optional[GoalState]:
if not self._state:
return None
self._state.status = "active"
self._state.paused_reason = None
if reset_budget:
self._state.turns_used = 0
save_goal(self.session_id, self._state)
return self._state
def clear(self) -> None:
if self._state is None:
return
self._state.status = "cleared"
save_goal(self.session_id, self._state)
self._state = None
def mark_done(self, reason: str) -> None:
if not self._state:
return
self._state.status = "done"
self._state.last_verdict = "done"
self._state.last_reason = reason
save_goal(self.session_id, self._state)
# --- the main entry point called after every turn -----------------
def evaluate_after_turn(
self,
last_response: str,
*,
user_initiated: bool = True,
) -> Dict[str, Any]:
"""Run the judge and update state. Return a decision dict.
``user_initiated`` distinguishes a real user prompt (True) from a
continuation prompt we fed ourselves (False). Both increment
``turns_used`` because both consume model budget.
Decision keys:
- ``status``: current goal status after update
- ``should_continue``: bool caller should fire another turn
- ``continuation_prompt``: str or None
- ``verdict``: "done" | "continue" | "skipped" | "inactive"
- ``reason``: str
- ``message``: user-visible one-liner to print/send
"""
state = self._state
if state is None or state.status != "active":
return {
"status": state.status if state else None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
# Count the turn that just finished.
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
if verdict == "done":
state.status = "done"
save_goal(self.session_id, state)
return {
"status": "done",
"should_continue": False,
"continuation_prompt": None,
"verdict": "done",
"reason": reason,
"message": f"✓ Goal achieved: {reason}",
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
save_goal(self.session_id, state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
"Use /goal resume to keep going, or /goal clear to stop."
),
}
save_goal(self.session_id, state)
return {
"status": "active",
"should_continue": True,
"continuation_prompt": self.next_continuation_prompt(),
"verdict": "continue",
"reason": reason,
"message": (
f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}"
),
}
def next_continuation_prompt(self) -> Optional[str]:
if not self._state or self._state.status != "active":
return None
return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
__all__ = [
"GoalState",
"GoalManager",
"CONTINUATION_PROMPT_TEMPLATE",
"DEFAULT_MAX_TURNS",
"load_goal",
"save_goal",
"clear_goal",
"judge_goal",
]
+1393
View File
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
+164 -12
View File
@@ -800,6 +800,8 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
title = db.get_session_title(target)
message_count = int(session.get("message_count") or 0)
if message_count == 0:
return # No real conversation — don't show resume info
input_tokens = int(session.get("input_tokens") or 0)
output_tokens = int(session.get("output_tokens") or 0)
cache_read_tokens = int(session.get("cache_read_tokens") or 0)
@@ -5041,6 +5043,13 @@ def cmd_slack(args):
return 1
def cmd_kanban(args):
"""Multi-profile collaboration board."""
from hermes_cli.kanban import kanban_command
return kanban_command(args)
def cmd_hooks(args):
"""Shell-hook inspection and management."""
from hermes_cli.hooks import hooks_command
@@ -5424,6 +5433,45 @@ def _find_stale_dashboard_pids() -> list[int]:
return dashboard_pids
def _print_curator_first_run_notice() -> None:
"""Print a short heads-up about the skill curator after `hermes update`.
Only fires when the curator is enabled AND has no recorded run yet, which
is exactly the window where the gateway ticker used to fire Curator
against a fresh skill library immediately after an update. We defer the
first real pass by one ``interval_hours``; this notice tells the user how
to preview or disable before then. Silent on steady state.
"""
try:
from agent import curator
except Exception:
return
try:
if not curator.is_enabled():
return
state = curator.load_state()
except Exception:
return
if state.get("last_run_at"):
# Curator has run before (real or already seeded) — no notice needed.
return
try:
hours = curator.get_interval_hours()
except Exception:
hours = 24 * 7
days = max(1, hours // 24)
print()
print(" Skill curator")
print(
f" Background skill maintenance is enabled. First pass is deferred "
f"~{days}d after installation; only agent-created skills are in "
f"scope and nothing is ever auto-deleted (archive is recoverable)."
)
print(" Preview now: hermes curator run --dry-run")
print(" Pause it: hermes curator pause")
print(" Docs: https://hermes-agent.nousresearch.com/docs/user-guide/features/curator")
def _kill_stale_dashboard_processes(
reason: str = "the running backend no longer matches the updated frontend",
) -> None:
@@ -5661,6 +5709,10 @@ def _update_via_zip(args):
print()
print("✓ Update complete!")
try:
_print_curator_first_run_notice()
except Exception as e:
logger.debug("Curator first-run notice failed: %s", e)
_kill_stale_dashboard_processes()
@@ -6666,6 +6718,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
if gateway_mode
else None
)
assume_yes = bool(getattr(args, "yes", False))
print("⚕ Updating Hermes Agent...")
print()
@@ -6785,8 +6838,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
else:
auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
prompt_for_restore = auto_stash_ref is not None and (
gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty())
prompt_for_restore = (
auto_stash_ref is not None
and not assume_yes
and (gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty()))
)
# Check if there are updates
@@ -7047,7 +7102,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
print(f" {len(missing_config)} new config option(s) available")
print()
if gateway_mode:
if assume_yes:
print(" --yes: auto-applying config migration (skipping API-key prompts).")
response = "y"
elif gateway_mode:
response = (
_gateway_prompt(
"Would you like to configure new options now? [Y/n]", "n"
@@ -7073,14 +7131,17 @@ def _cmd_update_impl(args, gateway_mode: bool):
if response in ("", "y", "yes"):
print()
# In gateway mode, run auto-migrations only (no input() prompts
# for API keys which would hang the detached process).
results = migrate_config(interactive=not gateway_mode, quiet=False)
# In gateway mode OR under --yes, run auto-migrations only (no
# input() prompts for API keys which would hang the detached
# process / defeat the point of --yes).
results = migrate_config(
interactive=not (gateway_mode or assume_yes), quiet=False
)
if results["env_added"] or results["config_added"]:
print()
print("✓ Configuration updated!")
if gateway_mode and missing_env:
if (gateway_mode or assume_yes) and missing_env:
print(" API keys require manual entry: hermes config migrate")
else:
print()
@@ -7091,6 +7152,15 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
print("✓ Update complete!")
# Curator first-run heads-up. Only prints when curator is enabled AND
# has never run — i.e. the window where the ticker would otherwise
# have fired against a fresh skill library. Kept silent on steady
# state so we don't nag.
try:
_print_curator_first_run_notice()
except Exception as e:
logger.debug("Curator first-run notice failed: %s", e)
# Repair RHEL-family root installs where /usr/local/bin isn't on PATH
# for non-login interactive shells. No-op on every other platform.
try:
@@ -7130,6 +7200,8 @@ def _cmd_update_impl(args, gateway_mode: bool):
supports_systemd_services,
_ensure_user_systemd_env,
find_gateway_pids,
find_profile_gateway_processes,
launch_detached_profile_gateway_restart,
_get_service_pids,
_graceful_restart_via_sigusr1,
)
@@ -7233,6 +7305,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
restarted_services = []
killed_pids = set()
relaunched_profiles = []
# --- Systemd services (Linux) ---
# Discover all hermes-gateway* units (default + profiles)
@@ -7422,7 +7495,33 @@ def _cmd_update_impl(args, gateway_mode: bool):
manual_pids = find_gateway_pids(
exclude_pids=service_pids, all_profiles=True
)
profile_processes = {
proc.pid: proc
for proc in find_profile_gateway_processes(exclude_pids=service_pids)
if proc.pid in manual_pids
}
for pid, proc in profile_processes.items():
if not launch_detached_profile_gateway_restart(proc.profile, pid):
continue
# Prefer a graceful SIGUSR1 drain so in-flight agent runs
# finish before the watcher respawns the gateway. If the
# gateway doesn't support SIGUSR1 or doesn't exit within
# the drain budget, fall back to SIGTERM — the watcher
# still sees the exit and relaunches either way.
drained = _graceful_restart_via_sigusr1(
pid, drain_timeout=_drain_budget,
)
if not drained:
try:
os.kill(pid, _signal.SIGTERM)
except (ProcessLookupError, PermissionError):
pass
killed_pids.add(pid)
relaunched_profiles.append(proc.profile)
for pid in manual_pids:
if pid in profile_processes:
continue
try:
os.kill(pid, _signal.SIGTERM)
killed_pids.add(pid)
@@ -7433,11 +7532,14 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
for svc in restarted_services:
print(f" ✓ Restarted {svc}")
if killed_pids:
print(f" → Stopped {len(killed_pids)} manual gateway process(es)")
if relaunched_profiles:
names = ", ".join(relaunched_profiles)
print(f" ✓ Restarting manual gateway profile(s): {names}")
unmapped_count = len(killed_pids) - len(relaunched_profiles)
if unmapped_count:
print(f" → Stopped {unmapped_count} manual gateway process(es)")
print(" Restart manually: hermes gateway run")
# Also restart for each profile if needed
if len(killed_pids) > 1:
if unmapped_count > 1:
print(
" (or: hermes -p <profile> gateway run for each profile)"
)
@@ -7446,6 +7548,42 @@ def _cmd_update_impl(args, gateway_mode: bool):
# No gateways were running — nothing to do
pass
# --- Post-restart survivor sweep -----------------------------
# Issue #17648: some gateways ignore SIGTERM (stuck drain,
# blocked I/O, PID dead but zombie). The detached profile
# watchers wait 120s for the old PID to exit — if it never
# does, no respawn happens and the user keeps hitting
# ImportError against a stale sys.modules. Give the
# graceful paths a brief window to complete, then SIGKILL
# any remaining pre-update PIDs so the watcher / service
# manager can relaunch with fresh code.
try:
_time.sleep(3.0)
_service_pids_after = _get_service_pids()
_surviving = find_gateway_pids(
exclude_pids=_service_pids_after, all_profiles=True,
)
# Scope to PIDs we already tried to kill during this
# update (killed_pids). Anything new is a gateway that
# started AFTER our restart attempt — respecting user
# intent, we don't kill those.
_stuck = [pid for pid in _surviving if pid in killed_pids]
if _stuck:
print()
print(
f"{len(_stuck)} gateway process(es) ignored SIGTERM — force-killing"
)
for pid in _stuck:
try:
os.kill(pid, _signal.SIGKILL)
except (ProcessLookupError, PermissionError):
pass
# Give the OS a beat to reap the processes so the
# watchers see them exit and respawn.
_time.sleep(1.5)
except Exception as _sweep_exc:
logger.debug("Post-restart survivor sweep failed: %s", _sweep_exc)
except Exception as e:
logger.debug("Gateway restart during update failed: %s", e)
@@ -7682,7 +7820,7 @@ def cmd_profile(args):
if clone_all:
print(f"Full copy from {source_label}.")
else:
print(f"Cloned config, .env, SOUL.md from {source_label}.")
print(f"Cloned config, .env, SOUL.md, and skills from {source_label}.")
# Auto-clone Honcho config for the new profile (only with --clone/--clone-all)
if clone or clone_all:
@@ -8640,6 +8778,13 @@ def main():
webhook_parser.set_defaults(func=cmd_webhook)
# =========================================================================
# kanban command — multi-profile collaboration board
# =========================================================================
from hermes_cli.kanban import build_parser as _build_kanban_parser
kanban_parser = _build_kanban_parser(subparsers)
kanban_parser.set_defaults(func=cmd_kanban)
# =========================================================================
# hooks command — shell-hook inspection and management
# =========================================================================
@@ -9847,6 +9992,13 @@ Examples:
default=False,
help="Force a pre-update backup for this run (off by default; overrides updates.pre_update_backup)",
)
update_parser.add_argument(
"--yes",
"-y",
action="store_true",
default=False,
help="Assume yes for interactive prompts (config migration, stash restore). API-key entry is skipped; run 'hermes config migrate' separately for those.",
)
update_parser.set_defaults(func=cmd_update)
# =========================================================================
+18 -10
View File
@@ -891,14 +891,19 @@ def switch_model(
if not validation.get("accepted"):
override = False
if user_providers:
for up in user_providers:
if isinstance(up, dict) and up.get("provider") == target_provider:
cfg_models = up.get("models", [])
if new_model in cfg_models or any(
m.get("name") == new_model for m in cfg_models if isinstance(m, dict)
):
# user_providers is a dict: {provider_slug: config_dict}
for slug, cfg in user_providers.items():
if slug == target_provider:
cfg_models = cfg.get("models", {})
# Direct membership works for dict (keys) and list (strings)
if new_model in cfg_models:
override = True
break
# Also accept if models is a list of dicts with 'name' field
if isinstance(cfg_models, list):
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
@@ -1412,14 +1417,17 @@ def list_authenticated_providers(
models_list = list(fb)
# Prefer the endpoint's live /models list when credentials are
# available. This keeps OpenAI-compatible relays (for example CRS)
# in sync when the server catalog changes without requiring the
# user to mirror every model into config.yaml.
# available, unless the provider explicitly opts out via
# discover_models: false (e.g. dedicated endpoints that expose
# the entire aggregator catalog via /models).
api_key = str(ep_cfg.get("api_key", "") or "").strip()
if not api_key:
key_env = str(ep_cfg.get("key_env", "") or "").strip()
api_key = os.environ.get(key_env, "").strip() if key_env else ""
if api_url and api_key:
discover = ep_cfg.get("discover_models", True)
if isinstance(discover, str):
discover = discover.lower() not in ("false", "no", "0")
if api_url and api_key and discover:
try:
from hermes_cli.models import fetch_api_models
live_models = fetch_api_models(api_key, api_url)
+2 -1
View File
@@ -40,6 +40,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openrouter/elephant-alpha", "free"),
("openrouter/owl-alpha", "free"),
("openai/gpt-5.5", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2.5-pro", ""),
@@ -773,7 +774,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway (200+ models, $5 free credit, no markup)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
@@ -803,6 +803,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("opencode-go", "OpenCode Go", "OpenCode Go (open models, $10/month subscription)"),
ProviderEntry("bedrock", "AWS Bedrock", "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
ProviderEntry("azure-foundry", "Azure Foundry", "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway"),
]
# Derived dicts — used throughout the codebase
+52
View File
@@ -33,12 +33,15 @@ so plugin-defined tools appear alongside the built-in tools.
from __future__ import annotations
import asyncio
import importlib
import importlib.metadata
import importlib.util
import inspect
import logging
import os
import sys
import threading
import types
from dataclasses import dataclass, field
from pathlib import Path
@@ -1226,6 +1229,55 @@ def get_plugin_command_handler(name: str) -> Optional[Callable]:
return entry["handler"] if entry else None
_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS = 30.0
def resolve_plugin_command_result(result: Any) -> Any:
"""Resolve a plugin command return value, awaiting async handlers when needed.
Sync CLI/TUI dispatch sites call plugin handlers from plain functions.
If a handler is async, await it directly when no loop is running; if
we're already inside an active loop, run it in a helper thread with its
own loop so the caller still gets a concrete result synchronously. The
threaded path is bounded by a 30s timeout so a hung async handler cannot
wedge the terminal indefinitely.
"""
if not inspect.isawaitable(result):
return result
try:
asyncio.get_running_loop()
except RuntimeError:
return asyncio.run(result)
outcome: Dict[str, Any] = {}
failure: Dict[str, BaseException] = {}
done = threading.Event()
def _runner() -> None:
try:
outcome["value"] = asyncio.run(result)
except BaseException as exc: # pragma: no cover - re-raised below
failure["exc"] = exc
finally:
done.set()
thread = threading.Thread(
target=_runner,
name="hermes-plugin-command-await",
daemon=True,
)
thread.start()
if not done.wait(timeout=_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS):
raise TimeoutError(
"Plugin command async handler did not complete within "
f"{_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS:.0f}s"
)
if "exc" in failure:
raise failure["exc"]
return outcome.get("value")
def get_plugin_commands() -> Dict[str, dict]:
"""Return the full plugin commands dict (name → {handler, description, plugin}).
+378 -113
View File
@@ -15,13 +15,18 @@ import shutil
import subprocess
import sys
from pathlib import Path
from typing import Optional
from typing import Any, Optional
from hermes_constants import get_hermes_home
from hermes_cli.config import cfg_get
logger = logging.getLogger(__name__)
class PluginOperationError(Exception):
"""Recoverable plugin install/update failure (CLI exits; HTTP maps to 4xx)."""
# Minimum manifest version this installer understands.
# Plugins may declare ``manifest_version: 1`` in plugin.yaml;
# future breaking changes to the manifest schema bump this.
@@ -150,6 +155,24 @@ def _copy_example_files(plugin_dir: Path, console) -> None:
)
def _missing_requires_env_names(manifest: dict) -> list[str]:
"""Return declared ``requires_env`` names that are unset in ``~/.hermes/.env``."""
requires_env = manifest.get("requires_env") or []
if not requires_env:
return []
from hermes_cli.config import get_env_value
env_specs: list[dict] = []
for entry in requires_env:
if isinstance(entry, str):
env_specs.append({"name": entry})
elif isinstance(entry, dict) and entry.get("name"):
env_specs.append(entry)
return [s["name"] for s in env_specs if s.get("name") and not get_env_value(s["name"])]
def _prompt_plugin_env_vars(manifest: dict, console) -> None:
"""Prompt for required environment variables declared in plugin.yaml.
@@ -283,6 +306,95 @@ def _require_installed_plugin(name: str, plugins_dir: Path, console) -> Path:
# ---------------------------------------------------------------------------
def _install_plugin_core(identifier: str, *, force: bool) -> tuple[Path, dict, str]:
"""Clone Git plugin into ``~/.hermes/plugins``.
Returns ``(target_dir, installed_manifest, canonical_name)``.
Raises ``PluginOperationError`` on failure.
"""
import tempfile
try:
git_url = _resolve_git_url(identifier)
except ValueError as e:
raise PluginOperationError(str(e)) from e
plugins_dir = _plugins_dir()
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError as e:
raise PluginOperationError(
"git is not installed or not in PATH.",
) from e
except subprocess.TimeoutExpired as e:
raise PluginOperationError(
"Git clone timed out after 60 seconds.",
) from e
if result.returncode != 0:
err = (result.stderr or result.stdout or "").strip()
raise PluginOperationError(f"Git clone failed:\n{err}")
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
raise PluginOperationError(str(e)) from e
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
raise PluginOperationError(
f"Plugin '{plugin_name}' has invalid manifest_version "
f"'{mv}' (expected an integer).",
) from None
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
raise PluginOperationError(
f"Plugin '{plugin_name}' requires manifest_version {mv}, "
f"but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}. "
f"Run {recommended_update_command()} to update Hermes.",
) from None
if target.exists():
if not force:
raise PluginOperationError(
f"Plugin '{plugin_name}' already exists. Use force reinstall "
f"or run `hermes plugins update {plugin_name}`.",
)
shutil.rmtree(target)
shutil.move(str(tmp_target), str(target))
has_yaml = (target / "plugin.yaml").exists() or (target / "plugin.yml").exists()
if not has_yaml and not (target / "__init__.py").exists():
logger.warning(
"%s has no plugin.yaml / __init__.py; may not be a valid plugin",
plugin_name,
)
from rich.console import Console
_copy_example_files(target, Console())
installed_manifest = _read_manifest(target)
installed_name = installed_manifest.get("name") or target.name
return target, installed_manifest, installed_name
def cmd_install(
identifier: str,
force: bool = False,
@@ -293,7 +405,6 @@ def cmd_install(
After install, prompt "Enable now? [y/N]" unless *enable* is provided
(True = auto-enable without prompting, False = install disabled).
"""
import tempfile
from rich.console import Console
console = Console()
@@ -304,114 +415,41 @@ def cmd_install(
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Warn about insecure / local URL schemes
if git_url.startswith(("http://", "file://")):
console.print(
"[yellow]Warning:[/yellow] Using insecure/local URL scheme. "
"Consider using https:// or git@ for production installs."
"Consider using https:// or git@ for production installs.",
)
plugins_dir = _plugins_dir()
console.print(f"[dim]Cloning {git_url}...[/dim]")
# Clone into a temp directory first so we can read plugin.yaml for the name
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
console.print(f"[dim]Cloning {git_url}...[/dim]")
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git clone timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(
f"[red]Error:[/red] Git clone failed:\n{result.stderr.strip()}"
)
sys.exit(1)
# Read manifest
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
# Sanitize plugin name against path traversal
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Check manifest_version compatibility
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' has invalid "
f"manifest_version '{mv}' (expected an integer)."
)
sys.exit(1)
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
)
sys.exit(1)
if target.exists():
if not force:
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' already exists at {target}.\n"
f"Use [bold]--force[/bold] to remove and reinstall, or "
f"[bold]hermes plugins update {plugin_name}[/bold] to pull latest."
)
sys.exit(1)
console.print(f"[dim] Removing existing {plugin_name}...[/dim]")
shutil.rmtree(target)
# Move from temp to final location
shutil.move(str(tmp_target), str(target))
# Validate it looks like a plugin
if not (target / "plugin.yaml").exists() and not (target / "__init__.py").exists():
if not (target / "plugin.yaml").exists() and not (target / "plugin.yml").exists() and not (
target / "__init__.py"
).exists():
console.print(
f"[yellow]Warning:[/yellow] {plugin_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin."
f"[yellow]Warning:[/yellow] {installed_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin.",
)
# Copy .example files to their real names (e.g. config.yaml.example → config.yaml)
_copy_example_files(target, console)
# Re-read manifest from installed location (for env var prompting)
installed_manifest = _read_manifest(target)
# Prompt for required environment variables before showing after-install docs
_prompt_plugin_env_vars(installed_manifest, console)
_display_after_install(target, identifier)
# Determine the canonical plugin name for enable-list bookkeeping.
installed_name = installed_manifest.get("name") or target.name
# Decide whether to enable: explicit flag > interactive prompt > default off
should_enable = enable
if should_enable is None:
# Interactive prompt unless stdin isn't a TTY (scripted install).
if sys.stdin.isatty() and sys.stdout.isatty():
try:
answer = input(
f" Enable '{installed_name}' now? [y/N]: "
f" Enable '{installed_name}' now? [y/N]: ",
).strip().lower()
should_enable = answer in ("y", "yes")
except (EOFError, KeyboardInterrupt):
@@ -427,12 +465,12 @@ def cmd_install(
_save_enabled_set(enabled)
_save_disabled_set(disabled)
console.print(
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled."
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled.",
)
else:
console.print(
f"[dim]Plugin installed but not enabled. "
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]"
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]",
)
console.print("[dim]Restart the gateway for the plugin to take effect:[/dim]")
@@ -462,36 +500,22 @@ def cmd_update(name: str) -> None:
console.print(f"[dim]Updating {name}...[/dim]")
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git pull timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(f"[red]Error:[/red] Git pull failed:\n{result.stderr.strip()}")
ok, output = _git_pull_plugin_dir(target)
if not ok:
console.print(f"[red]Error:[/red] {output}")
sys.exit(1)
# Copy any new .example files
_copy_example_files(target, console)
output = result.stdout.strip()
if "Already up to date" in output:
out = output.strip()
if "Already up to date" in out:
console.print(
f"[green]✓[/green] Plugin [bold]{name}[/bold] is already up to date."
)
else:
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] updated.")
console.print(f"[dim]{output}[/dim]")
console.print(f"[dim]{out}[/dim]")
def cmd_remove(name: str) -> None:
@@ -1244,6 +1268,247 @@ def _run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
print()
def dashboard_install_plugin(
identifier: str,
*,
force: bool,
enable: bool,
) -> dict[str, Any]:
"""Non-interactive install for the web dashboard. Returns a JSON-serializable dict."""
warnings: list[str] = []
try:
git_url = _resolve_git_url(identifier)
if git_url.startswith(("http://", "file://")):
warnings.append(
"Insecure URL scheme; prefer https:// or git@ for production installs.",
)
except ValueError:
pass
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as exc:
return {"ok": False, "error": str(exc)}
missing_env = _missing_requires_env_names(installed_manifest)
if enable:
en = _get_enabled_set()
dis = _get_disabled_set()
en.add(installed_name)
dis.discard(installed_name)
_save_enabled_set(en)
_save_disabled_set(dis)
hint: str | None = None
ap = target / "after-install.md"
if ap.exists():
hint = str(ap)
return {
"ok": True,
"plugin_name": installed_name,
"warnings": warnings,
"missing_env": missing_env,
"after_install_path": hint,
"enabled": enable,
}
def _get_plugin_toolset_key(name: str) -> Optional[str]:
"""Return the toolset key a plugin registers its tools under, or None.
Queries the live tool registry the plugin must already be loaded.
Falls back to reading ``provides_tools`` from plugin.yaml and looking
up the toolset from the registry for the first tool name found.
"""
try:
from tools.registry import registry
except Exception:
return None
# Check the plugin manager for tools this plugin registered
try:
from hermes_cli.plugins import discover_plugins, get_plugin_manager
discover_plugins() # idempotent — ensures plugins are loaded
manager = get_plugin_manager()
for _key, loaded in manager._plugins.items():
if loaded.manifest.name == name or _key == name:
for tool_name in loaded.tools_registered:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
break
except Exception:
pass
# Fallback: read provides_tools from manifest on disk and query registry
try:
from hermes_cli.plugins import get_bundled_plugins_dir
for base in (get_bundled_plugins_dir(), _plugins_dir()):
if not base.is_dir():
continue
candidate = base / name
if candidate.is_dir():
manifest = _read_manifest(candidate)
for tool_name in manifest.get("provides_tools") or []:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
except Exception:
pass
return None
def _toggle_plugin_toolset(name: str, *, enable: bool) -> None:
"""Add or remove a plugin's toolset from platform_toolsets for all platforms.
Only acts if the plugin actually provides tools (has a toolset key).
"""
toolset_key = _get_plugin_toolset_key(name)
if not toolset_key:
return
from hermes_cli.config import load_config, save_config
config = load_config()
platform_toolsets = config.get("platform_toolsets")
if not isinstance(platform_toolsets, dict):
platform_toolsets = {}
config["platform_toolsets"] = platform_toolsets
changed = False
for platform, ts_list in platform_toolsets.items():
if not isinstance(ts_list, list):
continue
if enable:
if toolset_key not in ts_list:
ts_list.append(toolset_key)
changed = True
else:
if toolset_key in ts_list:
ts_list.remove(toolset_key)
changed = True
# If enabling and no platforms have toolset lists yet, add to "cli" at minimum
if enable and not changed and not platform_toolsets:
platform_toolsets["cli"] = [toolset_key]
changed = True
if changed:
save_config(config)
def dashboard_set_agent_plugin_enabled(name: str, *, enabled: bool) -> dict[str, Any]:
"""Enable or disable a plugin in ``config.yaml`` (runtime allow/deny lists).
For plugins that provide tools (toolsets), also toggles the toolset in
``platform_toolsets`` so the agent actually sees the tools in sessions.
"""
if not _plugin_exists(name):
return {"ok": False, "error": f"Plugin '{name}' is not installed or bundled."}
en = _get_enabled_set()
dis = _get_disabled_set()
if enabled:
if name in en and name not in dis:
return {"ok": True, "name": name, "unchanged": True}
en.add(name)
dis.discard(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=True)
return {"ok": True, "name": name, "unchanged": False}
if name not in en and name in dis:
return {"ok": True, "name": name, "unchanged": True}
en.discard(name)
dis.add(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=False)
return {"ok": True, "name": name, "unchanged": False}
def _user_installed_plugin_dir(name: str) -> Optional[Path]:
"""Resolved path under ``~/.hermes/plugins/<name>`` if it exists."""
plugins_dir = _plugins_dir()
try:
target = _sanitize_plugin_name(name, plugins_dir)
except ValueError:
return None
return target if target.is_dir() else None
def dashboard_update_user_plugin(name: str) -> dict[str, Any]:
"""``git pull`` inside ``~/.hermes/plugins/<name>``."""
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {_plugins_dir()}.",
}
if not (target / ".git").exists():
return {
"ok": False,
"error": f"Plugin '{name}' is not a git checkout; cannot pull updates.",
}
ok, msg = _git_pull_plugin_dir(target)
if not ok:
return {"ok": False, "error": msg}
from rich.console import Console
_copy_example_files(target, Console())
unchanged = "Already up to date" in msg
return {"ok": True, "name": name, "output": msg, "unchanged": unchanged}
def _git_pull_plugin_dir(target: Path) -> tuple[bool, str]:
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
return False, "git is not installed or not in PATH."
except subprocess.TimeoutExpired:
return False, "Git pull timed out after 60 seconds."
if result.returncode != 0:
err = (result.stderr or "").strip() or result.stdout.strip()
return False, err or "git pull failed."
return True, result.stdout.strip()
def dashboard_remove_user_plugin(name: str) -> dict[str, Any]:
"""Delete a plugin tree under ``~/.hermes/plugins/`` only."""
plugins_dir = _plugins_dir()
for n, _ver, _d, src, _path in _discover_all_plugins():
if n == name and src == "bundled":
return {"ok": False, "error": "Bundled plugins cannot be removed from the dashboard."}
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {plugins_dir}.",
}
shutil.rmtree(target)
return {"ok": True, "name": name}
def plugins_command(args) -> None:
"""Dispatch hermes plugins subcommands."""
action = getattr(args, "plugins_action", None)
+11 -2
View File
@@ -11,7 +11,7 @@ zero migration needed.
Usage::
hermes profile create coder # fresh profile + bundled skills
hermes profile create coder --clone # also copy config, .env, SOUL.md
hermes profile create coder --clone # also copy config, .env, SOUL.md, skills
hermes profile create coder --clone-all # full copy of source profile
coder chat # use via wrapper alias
hermes -p coder chat # or via flag
@@ -411,7 +411,8 @@ def create_profile(
clone_all:
If True, do a full copytree of the source (all state).
clone_config:
If True, copy only config files (config.yaml, .env, SOUL.md).
If True, copy config files (config.yaml, .env, SOUL.md), installed
skills, and selected profile identity files from the source profile.
no_alias:
If True, skip wrapper script creation.
@@ -469,6 +470,14 @@ def create_profile(
if src.exists():
shutil.copy2(src, profile_dir / filename)
# Clone installed skills from the source profile. The dashboard's
# "clone from default" flow is expected to preserve both bundled
# and user-installed skills so the new profile immediately has the
# same agent capabilities as the source profile.
source_skills = source_dir / "skills"
if source_skills.is_dir():
shutil.copytree(source_skills, profile_dir / "skills", dirs_exist_ok=True)
# Clone memory and other subdirectory files
for relpath in _CLONE_SUBDIR_FILES:
src = source_dir / relpath
+11 -2
View File
@@ -358,11 +358,20 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
return None
if not requested_norm.startswith("custom:"):
try:
auth_mod.resolve_provider(requested_norm)
canonical = auth_mod.resolve_provider(requested_norm)
except AuthError:
pass
else:
return None
# A user-declared ``custom_providers`` entry whose name matches
# only an *alias* (``kimi`` → built-in ``kimi-coding``) is the
# user's intended target — alias rewriting would otherwise hijack
# the request. We only defer to the built-in when the raw name is
# the canonical provider itself (``nous``, ``openrouter``, …) so
# accidentally shadowing a canonical provider still resolves to
# the built-in. See tests/hermes_cli/test_runtime_provider_resolution.py
# ``test_named_custom_provider_does_not_shadow_builtin_provider``.
if (canonical or "").strip().lower() == requested_norm:
return None
config = load_config()
+316
View File
@@ -0,0 +1,316 @@
"""Session recap — summarize what's happened in the current session.
Inspired by Claude Code's `/recap` command (v2.1.114, April 2026), which
shows a one-line summary of what happened while a terminal was unfocused
so users juggling multiple sessions can re-orient quickly.
Source: https://code.claude.com/docs/en/whats-new/2026-w17
Differences from Claude Code:
- Pure local computation from the in-memory conversation history. No
LLM call, no auxiliary model, no prompt-cache invalidation. A
recap should be instant and free.
- Works unchanged on CLI and every gateway platform (Telegram,
Discord, Slack, ) because both call into the same ``build_recap``
helper. Claude Code only shows this on the CLI.
- Tailored to hermes-agent's tool vocabulary (``terminal``, ``patch``,
``write_file``, ``delegate_task``, ``browser_*``, ``web_*``) the
recap surfaces which classes of work were most active.
"""
from __future__ import annotations
import os
from collections import Counter
from typing import Any, Iterable, List, Mapping, Optional, Sequence, Tuple
# How many recent user/assistant turns we consider "recent activity".
_RECENT_TURN_WINDOW = 20
# How many characters of the latest user prompt to show.
_PROMPT_PREVIEW_CHARS = 140
# How many characters of the latest assistant text to show.
_ASSISTANT_PREVIEW_CHARS = 200
# How many recently-touched files to list.
_MAX_FILES_LISTED = 5
# Tool names that identify a file-editing action and the argument key that
# holds the path.
_FILE_EDIT_TOOLS: Mapping[str, str] = {
"write_file": "path",
"patch": "path",
"read_file": "path",
"skill_manage": "file_path",
"skill_view": "file_path",
}
def _coerce_text(value: Any) -> str:
"""Flatten assistant/user ``content`` into a plain string.
Content can be a string or a list of content blocks (for multimodal
or reasoning models). We concatenate every text-like block and
ignore the rest.
"""
if value is None:
return ""
if isinstance(value, str):
return value
if isinstance(value, list):
parts: List[str] = []
for block in value:
if isinstance(block, str):
parts.append(block)
continue
if isinstance(block, Mapping):
text = block.get("text")
if isinstance(text, str) and text:
parts.append(text)
return "\n".join(parts)
return str(value)
def _tool_call_name_and_args(tool_call: Any) -> Tuple[str, Mapping[str, Any]]:
"""Extract ``(name, arguments_dict)`` from a tool_call entry.
``arguments`` may be a JSON string or a dict depending on provider.
Return an empty dict if it cannot be parsed.
"""
if not isinstance(tool_call, Mapping):
return "", {}
fn = tool_call.get("function") or {}
if not isinstance(fn, Mapping):
return "", {}
name = str(fn.get("name") or "") or ""
raw_args = fn.get("arguments")
if isinstance(raw_args, Mapping):
return name, raw_args
if isinstance(raw_args, str) and raw_args:
try:
import json
parsed = json.loads(raw_args)
if isinstance(parsed, Mapping):
return name, parsed
except Exception:
return name, {}
return name, {}
def _iter_assistant_tool_calls(
messages: Sequence[Mapping[str, Any]],
) -> Iterable[Tuple[str, Mapping[str, Any]]]:
for msg in messages:
if not isinstance(msg, Mapping):
continue
if msg.get("role") != "assistant":
continue
tool_calls = msg.get("tool_calls") or []
if not isinstance(tool_calls, list):
continue
for tc in tool_calls:
name, args = _tool_call_name_and_args(tc)
if name:
yield name, args
def _count_visible_turns(
messages: Sequence[Mapping[str, Any]],
) -> Tuple[int, int, int]:
"""Return ``(user_turn_count, assistant_turn_count, tool_message_count)``."""
users = assistants = tools = 0
for msg in messages:
if not isinstance(msg, Mapping):
continue
role = msg.get("role")
if role == "user":
users += 1
elif role == "assistant":
assistants += 1
elif role == "tool":
tools += 1
return users, assistants, tools
def _latest_user_prompt(
messages: Sequence[Mapping[str, Any]],
) -> Optional[str]:
for msg in reversed(messages):
if isinstance(msg, Mapping) and msg.get("role") == "user":
text = _coerce_text(msg.get("content")).strip()
if text:
return text
return None
def _latest_assistant_text(
messages: Sequence[Mapping[str, Any]],
) -> Optional[str]:
for msg in reversed(messages):
if not isinstance(msg, Mapping):
continue
if msg.get("role") != "assistant":
continue
text = _coerce_text(msg.get("content")).strip()
if text:
return text
return None
def _recent_window(
messages: Sequence[Mapping[str, Any]], window: int = _RECENT_TURN_WINDOW
) -> List[Mapping[str, Any]]:
"""Return the tail slice of ``messages`` covering at most ``window``
user+assistant turns (tool messages ride along inside the window).
Iterating from the end, we count user and assistant messages and
keep everything from the first message that falls within the window.
"""
count = 0
cut = 0
for i in range(len(messages) - 1, -1, -1):
msg = messages[i]
if isinstance(msg, Mapping) and msg.get("role") in ("user", "assistant"):
count += 1
if count >= window:
cut = i
break
else:
return list(messages)
return list(messages[cut:])
def _shortened_path(path: str) -> str:
"""Show a path relative to cwd when possible, otherwise with ~ expansion."""
if not path:
return path
try:
abs_path = os.path.abspath(os.path.expanduser(path))
cwd = os.getcwd()
if abs_path == cwd:
return "."
if abs_path.startswith(cwd + os.sep):
return abs_path[len(cwd) + 1 :]
home = os.path.expanduser("~")
if abs_path.startswith(home + os.sep):
return "~/" + abs_path[len(home) + 1 :]
return abs_path
except Exception:
return path
def _summarise_tool_activity(
tool_calls: Sequence[Tuple[str, Mapping[str, Any]]],
) -> Tuple[List[Tuple[str, int]], List[str]]:
"""Return ``(tool_counts_sorted, recently_edited_files)``.
``tool_counts_sorted`` is descending by count, keeping the full list
so callers can truncate for display. ``recently_edited_files`` lists
distinct paths (most recent first) from file-editing tools.
"""
counter: Counter[str] = Counter()
files_seen: List[str] = []
files_set: set[str] = set()
# Walk in reverse so "most recent first" drops out of order-preserved iteration.
for name, args in reversed(list(tool_calls)):
counter[name] += 1
arg_key = _FILE_EDIT_TOOLS.get(name)
if arg_key:
path = args.get(arg_key)
if isinstance(path, str) and path and path not in files_set:
files_set.add(path)
files_seen.append(_shortened_path(path))
# Restore "reverse of reverse" for correct counts; Counter ignores order
# so only files_seen needed the reversal. Fix ordering: currently
# files_seen is newest→oldest which is what we want for display.
tool_counts = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
return tool_counts, files_seen
def _truncate(text: str, limit: int) -> str:
text = " ".join(text.split()) # collapse newlines for a compact one-liner
if len(text) <= limit:
return text
return text[: limit - 1].rstrip() + ""
def build_recap(
messages: Sequence[Mapping[str, Any]],
*,
session_title: Optional[str] = None,
session_id: Optional[str] = None,
platform: Optional[str] = None,
) -> str:
"""Build a multi-line recap of recent activity.
Inputs:
messages: the full conversation history as a list of
chat-completion-style dicts (``role``, ``content``,
``tool_calls``, ).
session_title: optional human title (from SessionDB).
session_id: optional session id.
platform: optional hint (``"cli"``, ``"telegram"``, ). Does not
change behavior today but is accepted for forward compat.
The output is plain text designed to render well in both a terminal
(with 80-col wrapping) and a gateway message bubble.
"""
_ = platform # reserved for future use
lines: List[str] = []
header_bits: List[str] = ["Session recap"]
if session_title:
header_bits.append(f"{session_title}")
elif session_id:
header_bits.append(f"{session_id[:8]}")
lines.append(" ".join(header_bits))
if not messages:
lines.append(" (nothing to recap — no messages yet)")
return "\n".join(lines)
users, assistants, tool_msgs = _count_visible_turns(messages)
window = _recent_window(messages)
win_users, win_assistants, _ = _count_visible_turns(window)
scope = (
f"{win_users} user turn{'s' if win_users != 1 else ''} / "
f"{win_assistants} assistant repl{'ies' if win_assistants != 1 else 'y'}"
)
if (users, assistants) != (win_users, win_assistants):
scope += f" (of {users}/{assistants} total)"
lines.append(f" Recent: {scope}, {tool_msgs} tool result{'s' if tool_msgs != 1 else ''}")
tool_calls = list(_iter_assistant_tool_calls(window))
tool_counts, files = _summarise_tool_activity(tool_calls)
if tool_counts:
top = ", ".join(f"{name}×{count}" for name, count in tool_counts[:5])
extra = len(tool_counts) - 5
if extra > 0:
top += f" (+{extra} more)"
lines.append(f" Tools used: {top}")
if files:
shown = files[:_MAX_FILES_LISTED]
extra = len(files) - len(shown)
entry = ", ".join(shown)
if extra > 0:
entry += f" (+{extra} more)"
lines.append(f" Files touched: {entry}")
latest_user = _latest_user_prompt(window)
if latest_user:
lines.append(f" Last ask: {_truncate(latest_user, _PROMPT_PREVIEW_CHARS)}")
latest_reply = _latest_assistant_text(window)
if latest_reply:
lines.append(f" Last reply: {_truncate(latest_reply, _ASSISTANT_PREVIEW_CHARS)}")
if len(lines) == 2:
# Only the header + scope line — nothing substantive to show.
lines.append(" (no assistant activity yet in this window)")
return "\n".join(lines)
__all__ = ["build_recap"]
+2 -1
View File
@@ -18,6 +18,7 @@ for reinstall when scopes/commands change.
from __future__ import annotations
import json
import os
import sys
from pathlib import Path
@@ -128,7 +129,7 @@ def slack_manifest_command(args) -> int:
target = Path(get_hermes_home()) / "slack-manifest.json"
except Exception:
target = Path.home() / ".hermes" / "slack-manifest.json"
target = Path(os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes")) / "slack-manifest.json"
else:
target = Path(write_target).expanduser()
target.parent.mkdir(parents=True, exist_ok=True)
+1
View File
@@ -125,6 +125,7 @@ def show_status(args):
keys = {
"OpenRouter": "OPENROUTER_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"NVIDIA": "NVIDIA_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
+517 -2
View File
@@ -345,6 +345,7 @@ _CATEGORY_MERGE: Dict[str, str] = {
"dashboard": "display",
"code_execution": "agent",
"prompt_caching": "agent",
"goals": "agent",
# Only `telegram.reactions` currently lives under telegram — fold it in
# with the other messaging-platform config (discord) so it isn't an
# orphan tab of one field.
@@ -2344,6 +2345,254 @@ async def delete_cron_job(job_id: str):
return {"ok": True}
# ---------------------------------------------------------------------------
# Profile management endpoints (minimal — list/create/rename/delete + SOUL.md)
# ---------------------------------------------------------------------------
class ProfileCreate(BaseModel):
name: str
clone_from_default: bool = False
class ProfileRename(BaseModel):
new_name: str
class ProfileSoulUpdate(BaseModel):
content: str
def _profile_attr(info, name: str, default: Any = None) -> Any:
try:
return getattr(info, name)
except Exception:
return default
def _profile_to_dict(info) -> Dict[str, Any]:
return {
"name": _profile_attr(info, "name", ""),
"path": str(_profile_attr(info, "path", "")),
"is_default": bool(_profile_attr(info, "is_default", False)),
"model": _profile_attr(info, "model"),
"provider": _profile_attr(info, "provider"),
"has_env": bool(_profile_attr(info, "has_env", False)),
"skill_count": int(_profile_attr(info, "skill_count", 0) or 0),
}
def _fallback_profile_dicts(profiles_mod) -> List[Dict[str, Any]]:
def _safe(callable_, default):
try:
return callable_()
except Exception:
return default
profiles: List[Dict[str, Any]] = []
default_home = profiles_mod._get_default_hermes_home()
if default_home.is_dir():
model, provider = _safe(lambda: profiles_mod._read_config_model(default_home), (None, None))
profiles.append({
"name": "default",
"path": str(default_home),
"is_default": True,
"model": model,
"provider": provider,
"has_env": (default_home / ".env").exists(),
"skill_count": _safe(lambda: profiles_mod._count_skills(default_home), 0),
})
profiles_root = profiles_mod._get_profiles_root()
if profiles_root.is_dir():
for entry in sorted(profiles_root.iterdir()):
if not entry.is_dir() or not profiles_mod._PROFILE_ID_RE.match(entry.name):
continue
model, provider = _safe(lambda entry=entry: profiles_mod._read_config_model(entry), (None, None))
profiles.append({
"name": entry.name,
"path": str(entry),
"is_default": False,
"model": model,
"provider": provider,
"has_env": (entry / ".env").exists(),
"skill_count": _safe(lambda entry=entry: profiles_mod._count_skills(entry), 0),
})
return profiles
def _resolve_profile_dir(name: str) -> Path:
"""Validate ``name`` and resolve to its directory or raise an HTTPException."""
from hermes_cli import profiles as profiles_mod
try:
profiles_mod.validate_profile_name(name)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
if not profiles_mod.profile_exists(name):
raise HTTPException(status_code=404, detail=f"Profile '{name}' does not exist.")
return profiles_mod.get_profile_dir(name)
def _profile_setup_command(name: str) -> str:
"""Return the shell command used to configure a profile in the CLI."""
_resolve_profile_dir(name)
return "hermes setup" if name == "default" else f"{name} setup"
@app.get("/api/profiles")
async def list_profiles_endpoint():
from hermes_cli import profiles as profiles_mod
try:
return {"profiles": [_profile_to_dict(p) for p in profiles_mod.list_profiles()]}
except Exception:
_log.exception("GET /api/profiles failed; falling back to profile directory scan")
return {"profiles": _fallback_profile_dicts(profiles_mod)}
@app.post("/api/profiles")
async def create_profile_endpoint(body: ProfileCreate):
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.create_profile(
name=body.name,
clone_from="default" if body.clone_from_default else None,
clone_config=body.clone_from_default,
)
# Match the CLI's profile-create flow: fresh named profiles get the
# bundled skills installed. When cloning from default, create_profile()
# has already copied the source profile's skills, including any
# user-installed skills.
if not body.clone_from_default:
profiles_mod.seed_profile_skills(path, quiet=True)
# Match the CLI's profile-create flow: named profiles should get a
# wrapper in ~/.local/bin when the alias is safe to create.
collision = profiles_mod.check_alias_collision(body.name)
if not collision:
profiles_mod.create_wrapper_script(body.name)
except (ValueError, FileExistsError, FileNotFoundError) as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("POST /api/profiles failed")
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "name": body.name, "path": str(path)}
@app.get("/api/profiles/{name}/setup-command")
async def get_profile_setup_command(name: str):
return {"command": _profile_setup_command(name)}
@app.post("/api/profiles/{name}/open-terminal")
async def open_profile_terminal_endpoint(name: str):
try:
command = _profile_setup_command(name)
if sys.platform.startswith("win"):
subprocess.Popen(["cmd.exe", "/c", "start", "", command])
elif sys.platform == "darwin":
escaped = command.replace("\\", "\\\\").replace('"', '\\"')
applescript = (
'tell application "Terminal"\n'
"activate\n"
f'do script "{escaped}"\n'
"end tell"
)
subprocess.Popen(["osascript", "-e", applescript])
else:
terminal_commands = [
("x-terminal-emulator", ["x-terminal-emulator", "-e", "sh", "-lc", command]),
("gnome-terminal", ["gnome-terminal", "--", "sh", "-lc", command]),
("konsole", ["konsole", "-e", "sh", "-lc", command]),
("xfce4-terminal", ["xfce4-terminal", "-e", f"sh -lc '{command}'"]),
("mate-terminal", ["mate-terminal", "-e", f"sh -lc '{command}'"]),
("lxterminal", ["lxterminal", "-e", f"sh -lc '{command}'"]),
("tilix", ["tilix", "-e", "sh", "-lc", command]),
("alacritty", ["alacritty", "-e", "sh", "-lc", command]),
("kitty", ["kitty", "sh", "-lc", command]),
("xterm", ["xterm", "-e", "sh", "-lc", command]),
]
for executable, popen_args in terminal_commands:
if subprocess.call(
["which", executable],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
) == 0:
subprocess.Popen(popen_args)
break
else:
raise HTTPException(
status_code=400,
detail="No supported terminal emulator found",
)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except HTTPException:
raise
except Exception as e:
_log.exception("POST /api/profiles/%s/open-terminal failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "command": command}
@app.patch("/api/profiles/{name}")
async def rename_profile_endpoint(name: str, body: ProfileRename):
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.rename_profile(name, body.new_name)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except (ValueError, FileExistsError) as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("PATCH /api/profiles/%s failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "name": body.new_name, "path": str(path)}
@app.delete("/api/profiles/{name}")
async def delete_profile_endpoint(name: str):
"""Delete a profile. The dashboard collects the user's confirmation in
its own dialog before this request, so we always pass ``yes=True`` to
skip the CLI's interactive prompt."""
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.delete_profile(name, yes=True)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("DELETE /api/profiles/%s failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "path": str(path)}
@app.get("/api/profiles/{name}/soul")
async def get_profile_soul(name: str):
soul_path = _resolve_profile_dir(name) / "SOUL.md"
if soul_path.exists():
try:
return {"content": soul_path.read_text(encoding="utf-8"), "exists": True}
except OSError as e:
raise HTTPException(status_code=500, detail=f"Could not read SOUL.md: {e}")
return {"content": "", "exists": False}
@app.put("/api/profiles/{name}/soul")
async def update_profile_soul(name: str, body: ProfileSoulUpdate):
soul_path = _resolve_profile_dir(name) / "SOUL.md"
try:
soul_path.write_text(body.content, encoding="utf-8")
except OSError as e:
_log.exception("PUT /api/profiles/%s/soul failed", name)
raise HTTPException(status_code=500, detail=f"Could not write SOUL.md: {e}")
return {"ok": True}
# ---------------------------------------------------------------------------
# Skills & Tools endpoints
# ---------------------------------------------------------------------------
@@ -3369,12 +3618,16 @@ def _get_dashboard_plugins(force_rescan: bool = False) -> list:
@app.get("/api/dashboard/plugins")
async def get_dashboard_plugins():
"""Return discovered dashboard plugins."""
"""Return discovered dashboard plugins (excludes user-hidden ones)."""
plugins = _get_dashboard_plugins()
# Strip internal fields before sending to frontend.
# Read user's hidden plugins list from config.
config = load_config()
hidden: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
# Strip internal fields before sending to frontend and filter out hidden.
return [
{k: v for k, v in p.items() if not k.startswith("_")}
for p in plugins
if p["name"] not in hidden
]
@@ -3385,6 +3638,268 @@ async def rescan_dashboard_plugins():
return {"ok": True, "count": len(plugins)}
class _AgentPluginInstallBody(BaseModel):
identifier: str
force: bool = False
enable: bool = True
def _strip_dashboard_manifest(p: Dict[str, Any]) -> Dict[str, Any]:
return {k: v for k, v in p.items() if not k.startswith("_")}
def _merged_plugins_hub() -> Dict[str, Any]:
"""Agent discovery + dashboard manifests + optional provider picker metadata."""
from hermes_cli.plugins_cmd import (
_discover_all_plugins,
_get_current_context_engine,
_get_current_memory_provider,
_discover_context_engines,
_discover_memory_providers,
_get_disabled_set,
_get_enabled_set,
_read_manifest as _read_plugin_manifest_at,
)
dashboard_list = _get_dashboard_plugins()
dash_by_name = {str(p["name"]): p for p in dashboard_list}
disabled_set = _get_disabled_set()
enabled_set = _get_enabled_set()
# Read user-hidden plugins from config for the user_hidden field.
config = load_config()
hidden_plugins: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
plugins_root_resolved = (get_hermes_home() / "plugins").resolve()
rows: List[Dict[str, Any]] = []
for name, version, description, source, dir_str in _discover_all_plugins():
if name in disabled_set:
runtime_status = "disabled"
elif name in enabled_set:
runtime_status = "enabled"
else:
runtime_status = "inactive"
dir_path = Path(dir_str)
dm = dash_by_name.get(name)
has_dash_manifest = dm is not None or (dir_path / "dashboard" / "manifest.json").exists()
under_user_tree = False
try:
dir_path.resolve().relative_to(plugins_root_resolved)
under_user_tree = True
except ValueError:
pass
can_remove_update = (
source in ("user", "git") and under_user_tree and Path(dir_str).is_dir()
)
# Check if this plugin provides tools that require auth
auth_required = False
auth_command = ""
manifest_data = _read_plugin_manifest_at(dir_path)
provides_tools = manifest_data.get("provides_tools") or []
if provides_tools:
try:
from tools.registry import registry
for tname in provides_tools:
entry = registry.get_entry(tname)
if entry and entry.check_fn and not entry.check_fn():
auth_required = True
auth_command = f"hermes auth {name}"
break
except Exception:
pass
rows.append({
"name": name,
"version": version or "",
"description": description or "",
"source": source,
"runtime_status": runtime_status,
"has_dashboard_manifest": has_dash_manifest,
"dashboard_manifest": _strip_dashboard_manifest(dm) if dm else None,
"path": dir_str,
"can_remove": can_remove_update,
"can_update_git": can_remove_update and (Path(dir_str) / ".git").exists(),
"auth_required": auth_required,
"auth_command": auth_command,
"user_hidden": name in hidden_plugins,
})
agent_names = {r["name"] for r in rows}
orphan_dashboard = [
_strip_dashboard_manifest(p)
for p in dashboard_list
if str(p["name"]) not in agent_names
]
memory_providers: List[Dict[str, str]] = []
try:
for n, desc in _discover_memory_providers():
memory_providers.append({"name": n, "description": desc})
except Exception:
memory_providers = []
context_engines: List[Dict[str, str]] = []
try:
for n, desc in _discover_context_engines():
context_engines.append({"name": n, "description": desc})
except Exception:
context_engines = []
return {
"plugins": rows,
"orphan_dashboard_plugins": orphan_dashboard,
"providers": {
"memory_provider": _get_current_memory_provider() or "",
"memory_options": memory_providers,
"context_engine": _get_current_context_engine(),
"context_options": context_engines,
},
}
@app.get("/api/dashboard/plugins/hub")
async def get_plugins_hub(request: Request):
"""Unified agent plugins + dashboard extension metadata (session protected)."""
_require_token(request)
try:
return _merged_plugins_hub()
except Exception as exc:
_log.warning("plugins/hub failed: %s", exc)
raise HTTPException(status_code=500, detail="Failed to build plugins hub.") from exc
@app.post("/api/dashboard/agent-plugins/install")
async def post_agent_plugin_install(request: Request, body: _AgentPluginInstallBody):
_require_token(request)
from hermes_cli.plugins_cmd import dashboard_install_plugin
result = dashboard_install_plugin(
body.identifier.strip(),
force=body.force,
enable=body.enable,
)
if not result.get("ok"):
raise HTTPException(
status_code=400,
detail=result.get("error") or "Install failed.",
)
_get_dashboard_plugins(force_rescan=True)
# Strip internal paths from the response
result.pop("after_install_path", None)
return result
def _validate_plugin_name(name: str) -> str:
"""Reject path-traversal attempts in plugin name URL parameters."""
if not name or "/" in name or "\\" in name or ".." in name:
raise HTTPException(status_code=400, detail="Invalid plugin name.")
return name
@app.post("/api/dashboard/agent-plugins/{name}/enable")
async def post_agent_plugin_enable(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
result = dashboard_set_agent_plugin_enabled(name, enabled=True)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Enable failed.")
return result
@app.post("/api/dashboard/agent-plugins/{name}/disable")
async def post_agent_plugin_disable(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
result = dashboard_set_agent_plugin_enabled(name, enabled=False)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Disable failed.")
return result
@app.post("/api/dashboard/agent-plugins/{name}/update")
async def post_agent_plugin_update(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_update_user_plugin
result = dashboard_update_user_plugin(name)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Update failed.")
_get_dashboard_plugins(force_rescan=True)
return result
@app.delete("/api/dashboard/agent-plugins/{name}")
async def delete_agent_plugin(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_remove_user_plugin
result = dashboard_remove_user_plugin(name)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Remove failed.")
_get_dashboard_plugins(force_rescan=True)
return result
class _PluginProvidersPutBody(BaseModel):
memory_provider: Optional[str] = None
context_engine: Optional[str] = None
@app.put("/api/dashboard/plugin-providers")
async def put_plugin_providers(request: Request, body: _PluginProvidersPutBody):
"""Persist memory provider / context engine selection (writes config.yaml)."""
_require_token(request)
from hermes_cli.plugins_cmd import (
_save_context_engine,
_save_memory_provider,
)
if body.memory_provider is not None:
_save_memory_provider(body.memory_provider)
if body.context_engine is not None:
_save_context_engine(body.context_engine)
return {"ok": True}
class _PluginVisibilityBody(BaseModel):
hidden: bool
@app.post("/api/dashboard/plugins/{name}/visibility")
async def post_plugin_visibility(request: Request, name: str, body: _PluginVisibilityBody):
"""Toggle a plugin's sidebar visibility (persists to config.yaml dashboard.hidden_plugins)."""
_require_token(request)
name = _validate_plugin_name(name)
config = load_config()
if "dashboard" not in config or not isinstance(config.get("dashboard"), dict):
config["dashboard"] = {}
hidden_list: list = config["dashboard"].get("hidden_plugins") or []
if not isinstance(hidden_list, list):
hidden_list = []
if body.hidden and name not in hidden_list:
hidden_list.append(name)
elif not body.hidden and name in hidden_list:
hidden_list.remove(name)
config["dashboard"]["hidden_plugins"] = hidden_list
save_config(config)
return {"ok": True, "name": name, "hidden": body.hidden}
@app.get("/dashboard-plugins/{plugin_name}/{file_path:path}")
async def serve_plugin_asset(plugin_name: str, file_path: str):
"""Serve static assets from a dashboard plugin directory.
+199 -45
View File
@@ -514,7 +514,7 @@ class SessionDB:
# Session lifecycle
# =========================================================================
def create_session(
def _insert_session_row(
self,
session_id: str,
source: str,
@@ -523,8 +523,8 @@ class SessionDB:
system_prompt: str = None,
user_id: str = None,
parent_session_id: str = None,
) -> str:
"""Create a new session record. Returns the session_id."""
) -> None:
"""Shared INSERT OR IGNORE for session rows."""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
@@ -542,8 +542,11 @@ class SessionDB:
),
)
self._execute_write(_do)
return session_id
def create_session(self, session_id: str, source: str, **kwargs) -> str:
"""Create a new session record. Returns the session_id."""
self._insert_session_row(session_id, source, **kwargs)
return session_id
def end_session(self, session_id: str, end_reason: str) -> None:
"""Mark a session as ended.
@@ -679,21 +682,41 @@ class SessionDB:
session_id: str,
source: str = "unknown",
model: str = None,
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
**kwargs,
) -> str:
"""Ensure a session row exists (INSERT OR IGNORE). Accepts optional kwargs."""
self._insert_session_row(session_id, source, model=model, **kwargs)
return session_id
def prune_empty_ghost_sessions(self, sessions_dir: "Optional[Path]" = None) -> int:
"""Remove empty TUI ghost sessions (no messages, no title, >24hr old)."""
cutoff = time.time() - 86400 # Only sessions older than 24 hours
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._execute_write(_do)
rows = conn.execute("""
SELECT id FROM sessions
WHERE source = 'tui'
AND title IS NULL
AND ended_at IS NOT NULL
AND started_at < ?
AND NOT EXISTS (
SELECT 1 FROM messages WHERE messages.session_id = sessions.id
)
""", (cutoff,)).fetchall()
ids = [r[0] if isinstance(r, (tuple, list)) else r["id"] for r in rows]
if ids:
placeholders = ",".join("?" * len(ids))
conn.execute(
f"DELETE FROM sessions WHERE id IN ({placeholders})", ids
)
return ids
removed_ids = self._execute_write(_do) or []
# Clean up any on-disk session files (belt-and-suspenders)
if sessions_dir and removed_ids:
for sid in removed_ids:
self._remove_session_files(sessions_dir, sid)
return len(removed_ids)
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
@@ -933,6 +956,7 @@ class SessionDB:
offset: int = 0,
include_children: bool = False,
project_compression_tips: bool = True,
order_by_last_active: bool = False,
) -> List[Dict[str, Any]]:
"""List sessions with preview (first user message) and last active timestamp.
@@ -952,6 +976,14 @@ class SessionDB:
compressed continuations from being invisible to users while keeping
delegate subagents and branches hidden. Pass ``False`` to return the
raw root rows (useful for admin/debug UIs).
Pass ``order_by_last_active=True`` to sort by most-recent activity
instead of original conversation start time. For compression chains,
the "most-recent activity" is taken from the live tip (not the root),
so an old conversation that was compressed and continued recently
surfaces in the correct slot. Ordering is computed at SQL level via
a recursive CTE that walks compression-continuation edges, so LIMIT
and OFFSET still apply efficiently.
"""
where_clauses = []
params = []
@@ -979,25 +1011,80 @@ class SessionDB:
params.extend(exclude_sources)
where_sql = f"WHERE {' AND '.join(where_clauses)}" if where_clauses else ""
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{where_sql}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params.extend([limit, offset])
if order_by_last_active:
# Compute effective_last_active by walking each surfaced session's
# compression-continuation chain forward in SQL and taking the MAX
# timestamp across the chain. This lets us ORDER BY + LIMIT at SQL
# level instead of fetching every row and sorting in Python, while
# still surfacing old compression roots whose live tip is fresh.
#
# The CTE seeds from rows the outer WHERE admits (roots + branch
# children), then recursively joins forward through
# compression-continuation edges using the same criteria as
# get_compression_tip (parent.end_reason='compression' AND
# child.started_at >= parent.ended_at).
query = f"""
WITH RECURSIVE chain(root_id, cur_id) AS (
SELECT s.id, s.id FROM sessions s {where_sql}
UNION ALL
SELECT c.root_id, child.id
FROM chain c
JOIN sessions parent ON parent.id = c.cur_id
JOIN sessions child ON child.parent_session_id = c.cur_id
WHERE parent.end_reason = 'compression'
AND child.started_at >= parent.ended_at
),
chain_max AS (
SELECT
root_id,
MAX(COALESCE(
(SELECT MAX(m.timestamp) FROM messages m WHERE m.session_id = cur_id),
(SELECT started_at FROM sessions ss WHERE ss.id = cur_id)
)) AS effective_last_active
FROM chain
GROUP BY root_id
)
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active,
COALESCE(cm.effective_last_active, s.started_at) AS _effective_last_active
FROM sessions s
LEFT JOIN chain_max cm ON cm.root_id = s.id
{where_sql}
ORDER BY _effective_last_active DESC, s.started_at DESC, s.id DESC
LIMIT ? OFFSET ?
"""
# WHERE params apply twice (CTE seed + outer select).
params = params + params + [limit, offset]
else:
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{where_sql}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params.extend([limit, offset])
with self._lock:
cursor = self._conn.execute(query, params)
rows = cursor.fetchall()
@@ -1011,6 +1098,8 @@ class SessionDB:
s["preview"] = text + ("..." if len(raw) > 60 else "")
else:
s["preview"] = ""
# Drop the internal ordering column so callers see a clean dict.
s.pop("_effective_last_active", None)
sessions.append(s)
# Project compression roots forward to their tips. Each row whose
@@ -1088,6 +1177,48 @@ class SessionDB:
# Message storage
# =========================================================================
# Sentinel prefix used to distinguish JSON-encoded structured content
# (multimodal messages: lists of parts like text + image_url) from plain
# string content. The NUL byte is not legal in normal text, so this
# cannot collide with real user content.
_CONTENT_JSON_PREFIX = "\x00json:"
@classmethod
def _encode_content(cls, content: Any) -> Any:
"""Serialize structured (list/dict) message content for sqlite.
sqlite3 can only bind ``str``, ``bytes``, ``int``, ``float``, and ``None``
to query parameters. Multimodal messages have ``content`` as a list of
parts (``[{"type": "text", ...}, {"type": "image_url", ...}]``), which
raises ``ProgrammingError: Error binding parameter N: type 'list' is
not supported`` when bound directly.
Returns the value unchanged when it's already a safe scalar, or a
sentinel-prefixed JSON string for lists/dicts. Paired with
:meth:`_decode_content` on read.
"""
if content is None or isinstance(content, (str, bytes, int, float)):
return content
try:
return cls._CONTENT_JSON_PREFIX + json.dumps(content)
except (TypeError, ValueError):
# Last-resort fallback: stringify so persistence never fails.
return str(content)
@classmethod
def _decode_content(cls, content: Any) -> Any:
"""Reverse :meth:`_encode_content`; returns scalars unchanged."""
if isinstance(content, str) and content.startswith(cls._CONTENT_JSON_PREFIX):
try:
return json.loads(content[len(cls._CONTENT_JSON_PREFIX):])
except (json.JSONDecodeError, TypeError):
logger.warning(
"Failed to decode JSON-encoded message content; "
"returning raw string"
)
return content
return content
def append_message(
self,
session_id: str,
@@ -1124,6 +1255,9 @@ class SessionDB:
if codex_message_items else None
)
tool_calls_json = json.dumps(tool_calls) if tool_calls else None
# Multimodal content (list of parts) must be JSON-encoded: sqlite3
# cannot bind list/dict parameters directly.
stored_content = self._encode_content(content)
# Pre-compute tool call count
num_tool_calls = 0
@@ -1140,7 +1274,7 @@ class SessionDB:
(
session_id,
role,
content,
stored_content,
tool_call_id,
tool_calls_json,
tool_name,
@@ -1223,7 +1357,7 @@ class SessionDB:
(
session_id,
role,
msg.get("content"),
self._encode_content(msg.get("content")),
msg.get("tool_call_id"),
tool_calls_json,
msg.get("tool_name"),
@@ -1262,6 +1396,8 @@ class SessionDB:
result = []
for row in rows:
msg = dict(row)
if "content" in msg:
msg["content"] = self._decode_content(msg["content"])
if msg.get("tool_calls"):
try:
msg["tool_calls"] = json.loads(msg["tool_calls"])
@@ -1351,15 +1487,15 @@ class SessionDB:
placeholders = ",".join("?" for _ in session_ids)
rows = self._conn.execute(
"SELECT role, content, tool_call_id, tool_calls, tool_name, "
"reasoning, reasoning_content, reasoning_details, codex_reasoning_items, "
"codex_message_items "
"finish_reason, reasoning, reasoning_content, reasoning_details, "
"codex_reasoning_items, codex_message_items "
f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
tuple(session_ids),
).fetchall()
messages = []
for row in rows:
content = row["content"]
content = self._decode_content(row["content"])
if row["role"] in {"user", "assistant"} and isinstance(content, str):
content = sanitize_context(content).strip()
msg = {"role": row["role"], "content": content}
@@ -1377,6 +1513,8 @@ class SessionDB:
# that replay reasoning (OpenRouter, OpenAI, Nous) receive
# coherent multi-turn reasoning context.
if row["role"] == "assistant":
if row["finish_reason"]:
msg["finish_reason"] = row["finish_reason"]
if row["reasoning"]:
msg["reasoning"] = row["reasoning"]
if row["reasoning_content"] is not None:
@@ -1744,10 +1882,26 @@ class SessionDB:
)""",
(match["id"], match["id"]),
)
context_msgs = [
{"role": r["role"], "content": (r["content"] or "")[:200]}
for r in ctx_cursor.fetchall()
]
context_msgs = []
for r in ctx_cursor.fetchall():
raw = r["content"]
decoded = self._decode_content(raw)
# Multimodal context: render a compact text-only
# summary for search previews.
if isinstance(decoded, list):
text_parts = [
p.get("text", "") for p in decoded
if isinstance(p, dict) and p.get("type") == "text"
]
text = " ".join(t for t in text_parts if t).strip()
preview = text or "[multimodal content]"
elif isinstance(decoded, str):
preview = decoded
else:
preview = ""
context_msgs.append(
{"role": r["role"], "content": preview[:200]}
)
match["context"] = context_msgs
except Exception:
match["context"] = []
+7 -6
View File
@@ -356,12 +356,17 @@ def _compute_tool_definitions(
else:
if not quiet_mode:
print(f"⚠️ Unknown toolset: {toolset_name}")
elif disabled_toolsets:
else:
# Default: start with everything
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Always apply disabled toolsets as a subtraction step at the end.
# This ensures that even if a composite toolset (like hermes-cli)
# is enabled, any tools belonging to a disabled toolset are strictly
# stripped out. See issue #17309.
if disabled_toolsets:
for toolset_name in disabled_toolsets:
if validate_toolset(toolset_name):
resolved = resolve_toolset(toolset_name)
@@ -376,10 +381,6 @@ def _compute_tool_definitions(
else:
if not quiet_mode:
print(f"⚠️ Unknown toolset: {toolset_name}")
else:
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Plugin-registered tools are now resolved through the normal toolset
# path — validate_toolset() / resolve_toolset() / get_all_toolsets()
+1 -1
View File
@@ -163,7 +163,7 @@
for entry in "''${ENTRIES[@]}"; do
IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --rebuild --print-build-logs 2>&1)
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
echo " ok"
+1 -1
View File
@@ -4,7 +4,7 @@ let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-Chz+NW9NXqboXHOa6PKwf5bhAkkcFtKNhvKWwg2XSPc=";
hash = "sha256-a/HGI9OgVcTnZrMXA7xFMGnFoVxyHe95fulVz+WNYB0=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -2960,7 +2960,7 @@ class Migrator:
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Migrate OpenClaw user state into Hermes Agent.")
parser.add_argument("--source", default=str(Path.home() / ".openclaw"), help="OpenClaw home directory")
parser.add_argument("--target", default=str(Path.home() / ".hermes"), help="Hermes home directory")
parser.add_argument("--target", default=os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes"), help="Hermes home directory")
parser.add_argument(
"--workspace-target",
help="Optional workspace root where the workspace instructions file should be copied",
@@ -0,0 +1,217 @@
---
name: here.now
description: Publish static sites to {slug}.here.now and store private files in cloud Drives for agent-to-agent handoff.
version: 1.15.3
author: here.now
license: MIT
prerequisites:
commands: [curl, file, jq]
platforms: [macos, linux]
metadata:
hermes:
tags: [here.now, herenow, publish, deploy, hosting, static-site, web, share, URL, drive, storage]
homepage: https://here.now
requires_toolsets: [terminal]
---
# here.now
here.now lets agents publish websites and store private files in cloud Drives.
Use here.now for two jobs:
- **Sites**: publish websites and files at `{slug}.here.now`.
- **Drives**: store private agent files in cloud folders.
## Current docs
**Before answering questions about here.now capabilities, features, or workflows, read the current docs:**
→ **https://here.now/docs**
Read the docs:
- at the first here.now-related interaction in a conversation
- any time the user asks how to do something
- any time the user asks what is possible, supported, or recommended
- before telling the user a feature is unsupported
Topics that require current docs (do not rely on local skill text alone):
- Drives and Drive sharing
- custom domains
- payments and payment gating
- forking
- proxy routes and service variables
- handles and links
- limits and quotas
- SPA routing
- error handling and remediation
- feature availability
**If docs and live API behavior disagree, trust the live API behavior.**
If the docs fetch fails or times out, continue with the local skill and live API/script output. Prefer live API behavior for active operations.
## Requirements
- Required binaries: `curl`, `file`, `jq`
- Optional environment variable: `$HERENOW_API_KEY`
- Optional Drive token variable: `$HERENOW_DRIVE_TOKEN`
- Optional credentials file: `~/.herenow/credentials`
- Skill helper paths:
- `${HERMES_SKILL_DIR}/scripts/publish.sh` for publishing sites
- `${HERMES_SKILL_DIR}/scripts/drive.sh` for private Drive storage
## Create a site
```bash
PUBLISH="${HERMES_SKILL_DIR}/scripts/publish.sh"
bash "$PUBLISH" {file-or-dir} --client hermes
```
Outputs the live URL (e.g. `https://bright-canvas-a7k2.here.now/`).
Under the hood this is a three-step flow: create/update -> upload files -> finalize. A site is not live until finalize succeeds.
Without an API key this creates an **anonymous site** that expires in 24 hours.
With a saved API key, the site is permanent.
**File structure:** For HTML sites, place `index.html` at the root of the directory you publish, not inside a subdirectory. The directory's contents become the site root. For example, publish `my-site/` where `my-site/index.html` exists — don't publish a parent folder that contains `my-site/`.
You can also publish raw files without any HTML. Single files get a rich auto-viewer (images, PDF, video, audio). Multiple files get an auto-generated directory listing with folder navigation and an image gallery.
## Update an existing site
```bash
PUBLISH="${HERMES_SKILL_DIR}/scripts/publish.sh"
bash "$PUBLISH" {file-or-dir} --slug {slug} --client hermes
```
The script auto-loads the `claimToken` from `.herenow/state.json` when updating anonymous sites. Pass `--claim-token {token}` to override.
Authenticated updates require a saved API key.
## Use a Drive
Use a Drive when the user wants private cloud storage for agent files: documents, context, memory, plans, assets, media, research, code, and anything else that should persist without being published as a website.
Every signed-in account has a default Drive named `My Drive`.
```bash
DRIVE="${HERMES_SKILL_DIR}/scripts/drive.sh"
bash "$DRIVE" default
bash "$DRIVE" ls "My Drive"
bash "$DRIVE" put "My Drive" notes/today.md --from ./notes/today.md
bash "$DRIVE" cat "My Drive" notes/today.md
bash "$DRIVE" share "My Drive" --perms write --prefix notes/ --ttl 7d
```
Use scoped Drive tokens for agent-to-agent handoff. If you receive a `herenow_drive` share block, use its `token` as `Authorization: Bearer <token>` against `api_base`, respect `pathPrefix` when present, and preserve ETags on writes. A `pathPrefix` of `null` means full-Drive access. If the skill is available, prefer `drive.sh`; otherwise call the listed API operations directly.
## API key storage
The publish script reads the API key from these sources (first match wins):
1. `--api-key {key}` flag (CI/scripting only — avoid in interactive use)
2. `$HERENOW_API_KEY` environment variable
3. `~/.herenow/credentials` file (recommended for agents)
To store a key, write it to the credentials file:
```bash
mkdir -p ~/.herenow && echo "{API_KEY}" > ~/.herenow/credentials && chmod 600 ~/.herenow/credentials
```
**IMPORTANT**: After receiving an API key, save it immediately — run the command above yourself. Do not ask the user to run it manually. Avoid passing the key via CLI flags (e.g. `--api-key`) in interactive sessions; the credentials file is the preferred storage method.
Never commit credentials or local state files (`~/.herenow/credentials`, `.herenow/state.json`) to source control.
## Getting an API key
To upgrade from anonymous (24h) to permanent sites:
1. Ask the user for their email address.
2. Request a one-time sign-in code:
```bash
curl -sS https://here.now/api/auth/agent/request-code \
-H "content-type: application/json" \
-d '{"email": "user@example.com"}'
```
3. Tell the user: "Check your inbox for a sign-in code from here.now and paste it here."
4. Verify the code and get the API key:
```bash
curl -sS https://here.now/api/auth/agent/verify-code \
-H "content-type: application/json" \
-d '{"email":"user@example.com","code":"ABCD-2345"}'
```
5. Save the returned `apiKey` yourself (do not ask the user to do this):
```bash
mkdir -p ~/.herenow && echo "{API_KEY}" > ~/.herenow/credentials && chmod 600 ~/.herenow/credentials
```
## State file
After every site create/update, the script writes to `.herenow/state.json` in the working directory:
```json
{
"publishes": {
"bright-canvas-a7k2": {
"siteUrl": "https://bright-canvas-a7k2.here.now/",
"claimToken": "abc123",
"claimUrl": "https://here.now/claim?slug=bright-canvas-a7k2&token=abc123",
"expiresAt": "2026-02-18T01:00:00.000Z"
}
}
}
```
Before creating or updating sites, you may check this file to find prior slugs.
Treat `.herenow/state.json` as internal cache only.
Never present this local file path as a URL, and never use it as source of truth for auth mode, expiry, or claim URL.
## What to tell the user
For published sites:
- Always share the `siteUrl` from the current script run.
- Read and follow `publish_result.*` lines from script stderr to determine auth mode.
- When `publish_result.auth_mode=authenticated`: tell the user the site is **permanent** and saved to their account. No claim URL is needed.
- When `publish_result.auth_mode=anonymous`: tell the user the site **expires in 24 hours**. Share the claim URL (if `publish_result.claim_url` is non-empty and starts with `https://`) so they can keep it permanently. Warn that claim tokens are only returned once and cannot be recovered.
- Never tell the user to inspect `.herenow/state.json` for claim URLs or auth status.
For Drives:
- Do not describe Drive files as public URLs.
- Tell the user Drive contents are private unless shared with a scoped token.
- When sharing access with another agent, prefer a scoped token with a narrow `pathPrefix` and short TTL.
## publish.sh options
| Flag | Description |
| ---------------------- | -------------------------------------------- |
| `--slug {slug}` | Update an existing site instead of creating |
| `--claim-token {token}`| Override claim token for anonymous updates |
| `--title {text}` | Viewer title (non-HTML sites) |
| `--description {text}` | Viewer description |
| `--ttl {seconds}` | Set expiry (authenticated only) |
| `--client {name}` | Agent name for attribution (e.g. `hermes`) |
| `--base-url {url}` | API base URL (default: `https://here.now`) |
| `--allow-nonherenow-base-url` | Allow sending auth to non-default `--base-url` |
| `--api-key {key}` | API key override (prefer credentials file) |
| `--spa` | Enable SPA routing (serve index.html for unknown paths) |
| `--forkable` | Allow others to fork this site |
## Beyond publish.sh
For Drive operations, use `drive.sh` or the Drive API. For broader account and site management — delete, metadata, passwords, payments, domains, handles, links, variables, proxy routes, forking, duplication, and more — see the current docs:
→ **https://here.now/docs**
Full docs: https://here.now/docs
+406
View File
@@ -0,0 +1,406 @@
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="https://here.now"
CREDENTIALS_FILE="$HOME/.herenow/credentials"
API_KEY="${HERENOW_API_KEY:-}"
DRIVE_TOKEN="${HERENOW_DRIVE_TOKEN:-}"
ALLOW_NON_HERENOW_BASE_URL=0
MAX_FILE_BYTES=$((500 * 1024 * 1024))
usage() {
cat <<'USAGE'
Usage: drive.sh [global options] <command> [args]
Global options:
--api-key <key> Account API key (or $HERENOW_API_KEY / ~/.herenow/credentials)
--token <drv_live_...> Drive token (or $HERENOW_DRIVE_TOKEN)
--base-url <url> API base (default: https://here.now)
--allow-nonherenow-base-url
Commands:
create [name] [--default]
default
ls
ls <drive> [prefix]
cat <drive> <path>
put <drive> <path> --from <local-file>
import <drive> <prefix> --from <local-folder> [--dry-run]
export <drive> <prefix> --to <local-folder> [--dry-run]
rm <drive> <path> [--recursive --confirm <path>]
share <drive> --perms read|write [--prefix notes/] [--ttl 30d] [--label text] [--manage-tokens]
tokens <drive>
revoke <drive> <tokenId>
delete <drive> --confirm "<drive name>"
USAGE
exit 1
}
die() { echo "error: $1" >&2; exit 1; }
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
BUNDLED_JQ="${SKILL_DIR}/bin/jq"
if [[ -x "$BUNDLED_JQ" ]]; then
JQ_BIN="$BUNDLED_JQ"
elif command -v jq >/dev/null 2>&1; then
JQ_BIN="$(command -v jq)"
else
die "requires jq"
fi
for cmd in curl file; do
command -v "$cmd" >/dev/null 2>&1 || die "requires $cmd"
done
while [[ $# -gt 0 ]]; do
case "$1" in
--api-key) API_KEY="$2"; shift 2 ;;
--token) DRIVE_TOKEN="$2"; shift 2 ;;
--base-url) BASE_URL="$2"; shift 2 ;;
--allow-nonherenow-base-url) ALLOW_NON_HERENOW_BASE_URL=1; shift ;;
--help|-h) usage ;;
--*) die "unknown global option: $1" ;;
*) break ;;
esac
done
CMD="${1:-}"
[[ -n "$CMD" ]] || usage
shift || true
if [[ -z "$API_KEY" && -z "$DRIVE_TOKEN" && -f "$CREDENTIALS_FILE" ]]; then
API_KEY=$(tr -d '[:space:]' < "$CREDENTIALS_FILE")
fi
BASE_URL="${BASE_URL%/}"
if [[ "$BASE_URL" != "https://here.now" && "$ALLOW_NON_HERENOW_BASE_URL" -ne 1 ]]; then
if [[ -n "$API_KEY" || -n "$DRIVE_TOKEN" ]]; then
die "refusing to send credentials to non-default base URL; pass --allow-nonherenow-base-url to override"
fi
fi
auth_header=()
if [[ -n "$DRIVE_TOKEN" ]]; then
auth_header=(-H "authorization: Bearer $DRIVE_TOKEN")
elif [[ -n "$API_KEY" ]]; then
auth_header=(-H "authorization: Bearer $API_KEY")
else
die "missing credentials; set HERENOW_API_KEY, HERENOW_DRIVE_TOKEN, or ~/.herenow/credentials"
fi
compute_sha256() {
local f="$1"
if command -v sha256sum >/dev/null 2>&1; then
sha256sum "$f" | cut -d' ' -f1
else
shasum -a 256 "$f" | cut -d' ' -f1
fi
}
guess_content_type() {
local f="$1"
case "${f##*.}" in
html|htm) echo "text/html; charset=utf-8" ;;
css) echo "text/css; charset=utf-8" ;;
js|mjs) echo "text/javascript; charset=utf-8" ;;
json) echo "application/json; charset=utf-8" ;;
md|txt) echo "text/plain; charset=utf-8" ;;
svg) echo "image/svg+xml" ;;
png) echo "image/png" ;;
jpg|jpeg) echo "image/jpeg" ;;
gif) echo "image/gif" ;;
webp) echo "image/webp" ;;
pdf) echo "application/pdf" ;;
*) file --brief --mime-type "$f" 2>/dev/null || echo "application/octet-stream" ;;
esac
}
api_json() {
local method="$1"; shift
local url="$1"; shift
local body="${1:-}"
local tmp
tmp=$(mktemp)
local code
if [[ -n "$body" ]]; then
code=$(curl -sS -o "$tmp" -w "%{http_code}" -X "$method" "$url" "${auth_header[@]}" -H "content-type: application/json" -d "$body")
else
code=$(curl -sS -o "$tmp" -w "%{http_code}" -X "$method" "$url" "${auth_header[@]}")
fi
if [[ "$code" -lt 200 || "$code" -ge 300 ]]; then
local err
err=$("$JQ_BIN" -r '.error // empty' "$tmp" 2>/dev/null || true)
[[ -n "$err" ]] || err="$(cat "$tmp")"
rm -f "$tmp"
die "HTTP $code: $err"
fi
cat "$tmp"
rm -f "$tmp"
}
urlenc() {
"$JQ_BIN" -nr --arg v "$1" '$v|@uri'
}
urlenc_path() {
local path="$1"
local out=""
local part
IFS='/' read -r -a parts <<< "$path"
for part in "${parts[@]}"; do
[[ -n "$out" ]] && out="$out/"
out="$out$(urlenc "$part")"
done
echo "$out"
}
resolve_drive() {
local name="$1"
if [[ "$name" == drv_* ]]; then
echo "$name"
return
fi
if [[ -n "$DRIVE_TOKEN" ]]; then
die "drive tokens must reference drives by drv_ id; use account credentials to resolve drive names"
fi
if [[ "$name" == "default" || "$name" == "my-drive" || "$name" == "My Drive" ]]; then
api_json GET "$BASE_URL/api/v1/drives/default" | "$JQ_BIN" -r '.drive.id'
return
fi
local rows count
rows=$(api_json GET "$BASE_URL/api/v1/drives" | "$JQ_BIN" --arg n "$name" '[.drives[] | select(.name == $n)]')
count=$(echo "$rows" | "$JQ_BIN" 'length')
[[ "$count" -eq 1 ]] || die "drive name '$name' matched $count drives; use a drv_ id"
echo "$rows" | "$JQ_BIN" -r '.[0].id'
}
drive_head() {
local id="$1"
api_json GET "$BASE_URL/api/v1/drives/$id" | "$JQ_BIN" -r '.drive.headVersionId // .headVersionId // empty'
}
file_meta() {
local id="$1"
local path="$2"
local prefix
prefix=$(urlenc "$path")
api_json GET "$BASE_URL/api/v1/drives/$id/files?prefix=$prefix&limit=200" | "$JQ_BIN" -c --arg p "$path" '.files[]? | select(.path == $p)' | head -n 1
}
put_file() {
local drive="$1"; shift
local path="$1"; shift
local local_file=""
while [[ $# -gt 0 ]]; do
case "$1" in
--from) local_file="$2"; shift 2 ;;
*) die "unexpected put argument: $1" ;;
esac
done
[[ -f "$local_file" ]] || die "--from must be a file"
local id sz ct sha meta body upload upload_url upload_id http_code
id=$(resolve_drive "$drive")
sz=$(wc -c < "$local_file" | tr -d ' ')
[[ "$sz" -le "$MAX_FILE_BYTES" ]] || die "$path exceeds the $MAX_FILE_BYTES byte Drive file limit"
ct=$(guess_content_type "$local_file")
sha=$(compute_sha256 "$local_file")
meta=$(file_meta "$id" "$path" || true)
body=$("$JQ_BIN" -n --arg p "$path" --argjson s "$sz" --arg c "$ct" --arg sha "$sha" \
'{path:$p,size:$s,contentType:$c,sha256:$sha}')
if [[ -n "$meta" ]]; then
etag=$(echo "$meta" | "$JQ_BIN" -r '.etag')
body=$(echo "$body" | "$JQ_BIN" --arg e "$etag" '.ifMatch = $e')
else
body=$(echo "$body" | "$JQ_BIN" '.ifNoneMatch = "*"')
fi
upload=$(api_json POST "$BASE_URL/api/v1/drives/$id/files/uploads" "$body")
upload_url=$(echo "$upload" | "$JQ_BIN" -r '.uploadUrl')
upload_id=$(echo "$upload" | "$JQ_BIN" -r '.uploadId')
http_code=$(curl -sS -o /dev/null -w "%{http_code}" -X PUT "$upload_url" -H "Content-Type: $ct" --data-binary "@$local_file")
[[ "$http_code" -ge 200 && "$http_code" -lt 300 ]] || die "upload failed for $path (HTTP $http_code)"
api_json POST "$BASE_URL/api/v1/drives/$id/files/finalize" "$("$JQ_BIN" -n --arg u "$upload_id" '{uploadId:$u}')" | "$JQ_BIN" .
}
case "$CMD" in
create)
name=""
is_default="false"
while [[ $# -gt 0 ]]; do
case "$1" in
--default) is_default="true"; shift ;;
*) [[ -z "$name" ]] && name="$1" || die "unexpected argument: $1"; shift ;;
esac
done
body=$("$JQ_BIN" -n --arg n "$name" --argjson d "$is_default" '{isDefault:$d} + (if $n == "" then {} else {name:$n} end)')
api_json POST "$BASE_URL/api/v1/drives" "$body" | "$JQ_BIN" .
;;
default)
api_json GET "$BASE_URL/api/v1/drives/default" | "$JQ_BIN" .
;;
ls)
if [[ $# -eq 0 ]]; then
[[ -z "$DRIVE_TOKEN" ]] || die "drive tokens cannot list drives; pass a drv_ id"
api_json GET "$BASE_URL/api/v1/drives" | "$JQ_BIN" .
else
id=$(resolve_drive "$1")
prefix="${2:-}"
api_json GET "$BASE_URL/api/v1/drives/$id/files?prefix=$(urlenc "$prefix")" | "$JQ_BIN" .
fi
;;
cat)
[[ $# -eq 2 ]] || die "usage: drive.sh cat <drive> <path>"
id=$(resolve_drive "$1")
curl -fsS "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$2")" "${auth_header[@]}"
;;
put)
[[ $# -ge 2 ]] || die "usage: drive.sh put <drive> <path> --from <local-file>"
put_file "$@"
;;
import)
[[ $# -ge 2 ]] || die "usage: drive.sh import <drive> <prefix> --from <local-folder> [--dry-run]"
drive="$1"; prefix="${2%/}"; shift 2
from=""; dry=0
while [[ $# -gt 0 ]]; do
case "$1" in
--from) from="$2"; shift 2 ;;
--dry-run) dry=1; shift ;;
*) die "unexpected import argument: $1" ;;
esac
done
[[ -d "$from" ]] || die "--from must be a folder"
uploaded=0
skipped=0
failed=0
planned=0
while IFS= read -r -d '' f; do
rel="${f#$from/}"
[[ "$rel" == .git/* || "$rel" == node_modules/* || "$rel" == ".DS_Store" || "$rel" == */.DS_Store ]] && continue
planned=$((planned + 1))
sz=$(wc -c < "$f" | tr -d ' ')
if [[ "$sz" -gt "$MAX_FILE_BYTES" ]]; then
echo "skip oversized $f ($sz bytes > $MAX_FILE_BYTES)" >&2
skipped=$((skipped + 1))
continue
fi
dest="$rel"
[[ -n "$prefix" ]] && dest="$prefix/$rel"
if [[ "$dry" -eq 1 ]]; then
echo "upload $f -> $dest"
skipped=$((skipped + 1))
else
if (put_file "$drive" "$dest" --from "$f" >/dev/null); then
uploaded=$((uploaded + 1))
else
failed=$((failed + 1))
fi
fi
done < <(find "$from" -type f -print0 | sort -z)
echo "planned=$planned uploaded=$uploaded skipped=$skipped failed=$failed"
[[ "$failed" -eq 0 ]] || exit 1
;;
export)
[[ $# -ge 2 ]] || die "usage: drive.sh export <drive> <prefix> --to <local-folder> [--dry-run]"
id=$(resolve_drive "$1"); prefix="${2%/}"; shift 2
to=""; dry=0
while [[ $# -gt 0 ]]; do
case "$1" in
--to) to="$2"; shift 2 ;;
--dry-run) dry=1; shift ;;
*) die "unexpected export argument: $1" ;;
esac
done
[[ -n "$to" ]] || die "--to is required"
cursor=""
total=0
while true; do
url="$BASE_URL/api/v1/drives/$id/files?prefix=$(urlenc "$prefix")&limit=200"
[[ -n "$cursor" ]] && url="$url&cursor=$(urlenc "$cursor")"
files=$(api_json GET "$url")
while IFS= read -r p; do
[[ -n "$p" ]] || continue
rel="$p"
[[ -n "$prefix" ]] && rel="${p#$prefix/}"
out="$to/$rel"
if [[ "$dry" -eq 1 ]]; then
echo "download $p -> $out"
else
mkdir -p "$(dirname "$out")"
curl -fsS "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$p")" "${auth_header[@]}" -o "$out"
fi
total=$((total + 1))
done < <(echo "$files" | "$JQ_BIN" -r '.files[].path')
cursor=$(echo "$files" | "$JQ_BIN" -r '.nextCursor // empty')
[[ -n "$cursor" ]] || break
done
echo "files=$total"
;;
rm)
[[ $# -ge 2 ]] || die "usage: drive.sh rm <drive> <path> [--recursive --confirm <path>]"
id=$(resolve_drive "$1"); path="$2"; shift 2
recursive=0; confirm=""
while [[ $# -gt 0 ]]; do
case "$1" in
--recursive) recursive=1; shift ;;
--confirm) confirm="$2"; shift 2 ;;
*) die "unexpected rm argument: $1" ;;
esac
done
if [[ "$recursive" -eq 1 ]]; then
[[ "$confirm" == "$path" ]] || die "recursive delete requires --confirm '$path'"
head=$(drive_head "$id")
api_json DELETE "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$path")?recursive=true&baseVersionId=$(urlenc "$head")" | "$JQ_BIN" .
else
meta=$(file_meta "$id" "$path")
etag=$(echo "$meta" | "$JQ_BIN" -r '.etag')
curl -fsS -X DELETE "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$path")" "${auth_header[@]}" -H "If-Match: $etag" | "$JQ_BIN" .
fi
;;
share)
[[ $# -ge 1 ]] || die "usage: drive.sh share <drive> --perms read|write [--prefix notes/] [--ttl 30d] [--label text] [--manage-tokens]"
id=$(resolve_drive "$1"); shift
perms="write"; prefix=""; ttl=""; label=""; manage_tokens="false"
while [[ $# -gt 0 ]]; do
case "$1" in
--perms) perms="$2"; shift 2 ;;
--prefix) prefix="$2"; shift 2 ;;
--ttl) ttl="$2"; shift 2 ;;
--label) label="$2"; shift 2 ;;
--manage-tokens) manage_tokens="true"; shift ;;
*) die "unexpected share argument: $1" ;;
esac
done
body=$("$JQ_BIN" -n --arg p "$perms" --arg pp "$prefix" --arg ttl "$ttl" --arg label "$label" --argjson mt "$manage_tokens" \
'{perms:$p} + (if $mt then {manageTokens:true} else {} end) + (if $ttl == "" then {} else {ttl:$ttl} end) + (if $pp == "" then {} else {pathPrefix:$pp} end) + (if $label == "" then {} else {label:$label} end)')
api_json POST "$BASE_URL/api/v1/drives/$id/tokens" "$body" | "$JQ_BIN" -r '.shareBlock'
;;
tokens)
[[ $# -eq 1 ]] || die "usage: drive.sh tokens <drive>"
id=$(resolve_drive "$1")
api_json GET "$BASE_URL/api/v1/drives/$id/tokens" | "$JQ_BIN" .
;;
revoke)
[[ $# -eq 2 ]] || die "usage: drive.sh revoke <drive> <tokenId>"
id=$(resolve_drive "$1")
api_json DELETE "$BASE_URL/api/v1/drives/$id/tokens/$2" | "$JQ_BIN" .
;;
delete)
[[ $# -ge 1 ]] || die "usage: drive.sh delete <drive> --confirm <drive name>"
id=$(resolve_drive "$1"); shift
confirm=""
while [[ $# -gt 0 ]]; do
case "$1" in
--confirm) confirm="$2"; shift 2 ;;
*) die "unexpected delete argument: $1" ;;
esac
done
drive=$(api_json GET "$BASE_URL/api/v1/drives/$id")
name=$(echo "$drive" | "$JQ_BIN" -r '.drive.name')
[[ "$confirm" == "$name" ]] || die "delete requires --confirm '$name'"
api_json DELETE "$BASE_URL/api/v1/drives/$id" | "$JQ_BIN" .
;;
*)
die "unknown command: $CMD"
;;
esac
+445
View File
@@ -0,0 +1,445 @@
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="https://here.now"
CREDENTIALS_FILE="$HOME/.herenow/credentials"
API_KEY="${HERENOW_API_KEY:-}"
API_KEY_SOURCE="none"
if [[ -n "${HERENOW_API_KEY:-}" ]]; then
API_KEY_SOURCE="env"
fi
ALLOW_NON_HERENOW_BASE_URL=0
SLUG=""
CLAIM_TOKEN=""
TITLE=""
DESCRIPTION=""
TTL=""
CLIENT=""
TARGET=""
FORKABLE=""
SPA_MODE=""
FROM_DRIVE=""
DRIVE_VERSION=""
usage() {
cat <<'USAGE'
Usage: publish.sh <file-or-dir> [options]
Options:
--api-key <key> API key (or set $HERENOW_API_KEY)
--slug <slug> Update existing publish
--claim-token <token> Claim token for anonymous updates
--title <text> Viewer title
--description <text> Viewer description
--ttl <seconds> Expiry (authenticated only)
--client <name> Agent name for attribution (e.g. cursor, claude-code)
--forkable Allow others to fork this site
--spa Enable SPA routing
--from-drive <drv_...> Publish a Drive snapshot instead of local files
--version <dv_...> Drive version for --from-drive (default: current head)
--base-url <url> API base (default: https://here.now)
--allow-nonherenow-base-url
Allow auth requests to non-default API base URL
USAGE
exit 1
}
die() { echo "error: $1" >&2; exit 1; }
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
BUNDLED_JQ="${SKILL_DIR}/bin/jq"
if [[ -x "$BUNDLED_JQ" ]]; then
JQ_BIN="$BUNDLED_JQ"
elif command -v jq >/dev/null 2>&1; then
JQ_BIN="$(command -v jq)"
else
die "requires jq"
fi
for cmd in curl file; do
command -v "$cmd" >/dev/null 2>&1 || die "requires $cmd"
done
while [[ $# -gt 0 ]]; do
case "$1" in
--api-key) API_KEY="$2"; API_KEY_SOURCE="flag"; shift 2 ;;
--slug) SLUG="$2"; shift 2 ;;
--claim-token) CLAIM_TOKEN="$2"; shift 2 ;;
--title) TITLE="$2"; shift 2 ;;
--description) DESCRIPTION="$2"; shift 2 ;;
--ttl) TTL="$2"; shift 2 ;;
--client) CLIENT="$2"; shift 2 ;;
--base-url) BASE_URL="$2"; shift 2 ;;
--allow-nonherenow-base-url) ALLOW_NON_HERENOW_BASE_URL=1; shift ;;
--forkable) FORKABLE="true"; shift ;;
--spa) SPA_MODE="true"; shift ;;
--from-drive) FROM_DRIVE="$2"; shift 2 ;;
--version) DRIVE_VERSION="$2"; shift 2 ;;
--help|-h) usage ;;
-*) die "unknown option: $1" ;;
*) [[ -z "$TARGET" ]] && TARGET="$1" || die "unexpected argument: $1"; shift ;;
esac
done
if [[ -n "$FROM_DRIVE" ]]; then
[[ -z "$TARGET" ]] || die "--from-drive does not accept a local file-or-dir argument"
else
[[ -n "$TARGET" ]] || usage
[[ -e "$TARGET" ]] || die "path does not exist: $TARGET"
fi
# Load API key from credentials file if not provided via flag or env
if [[ -z "$API_KEY" && -f "$CREDENTIALS_FILE" ]]; then
API_KEY=$(cat "$CREDENTIALS_FILE" | tr -d '[:space:]')
[[ -n "$API_KEY" ]] && API_KEY_SOURCE="credentials"
fi
BASE_URL="${BASE_URL%/}"
STATE_DIR=".herenow"
STATE_FILE="$STATE_DIR/state.json"
# Safety guard: avoid accidentally sending bearer auth to arbitrary endpoints.
if [[ -n "$API_KEY" && "$BASE_URL" != "https://here.now" && "$ALLOW_NON_HERENOW_BASE_URL" -ne 1 ]]; then
die "refusing to send API key to non-default base URL; pass --allow-nonherenow-base-url to override"
fi
# Auto-load claim token from state file for anonymous updates
if [[ -n "$SLUG" && -z "$CLAIM_TOKEN" && -z "$API_KEY" && -f "$STATE_FILE" ]]; then
CLAIM_TOKEN=$("$JQ_BIN" -r --arg s "$SLUG" '.publishes[$s].claimToken // empty' "$STATE_FILE" 2>/dev/null || true)
fi
if [[ -n "$FROM_DRIVE" ]]; then
[[ -n "$API_KEY" ]] || die "--from-drive requires an account API key"
BODY=$("$JQ_BIN" -n --arg d "$FROM_DRIVE" '{driveId:$d}')
[[ -n "$DRIVE_VERSION" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" --arg v "$DRIVE_VERSION" '.versionId = $v')
[[ -n "$SLUG" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" --arg s "$SLUG" '.slug = $s')
if [[ -n "$TITLE" || -n "$DESCRIPTION" ]]; then
viewer="{}"
[[ -n "$TITLE" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg t "$TITLE" '.title = $t')
[[ -n "$DESCRIPTION" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg d "$DESCRIPTION" '.description = $d')
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson v "$viewer" '.viewer = $v')
fi
[[ "$FORKABLE" == "true" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" '.forkable = true')
[[ "$SPA_MODE" == "true" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" '.spaMode = true')
CLIENT_HEADER_VALUE="here-now-publish-sh"
if [[ -n "$CLIENT" ]]; then
normalized_client=$(echo "$CLIENT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9._-' '-')
normalized_client="${normalized_client#-}"
normalized_client="${normalized_client%-}"
if [[ -n "$normalized_client" ]]; then
CLIENT_HEADER_VALUE="${normalized_client}/publish-sh"
fi
fi
echo "publishing from Drive..." >&2
RESPONSE=$(curl -sS -X POST "$BASE_URL/api/v1/publish/from-drive" \
-H "authorization: Bearer $API_KEY" \
-H "x-herenow-client: $CLIENT_HEADER_VALUE" \
-H "content-type: application/json" \
-d "$BODY")
if echo "$RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$RESPONSE" | "$JQ_BIN" -r '.error')
die "$err"
fi
SITE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.siteUrl')
OUT_SLUG=$(echo "$RESPONSE" | "$JQ_BIN" -r '.slug')
CURRENT_VERSION=$(echo "$RESPONSE" | "$JQ_BIN" -r '.currentVersionId')
DRIVE_VERSION_OUT=$(echo "$RESPONSE" | "$JQ_BIN" -r '.driveVersionId')
echo "$SITE_URL"
echo "" >&2
echo "publish_result.site_url=$SITE_URL" >&2
echo "publish_result.slug=$OUT_SLUG" >&2
echo "publish_result.action=from_drive" >&2
echo "publish_result.auth_mode=authenticated" >&2
echo "publish_result.api_key_source=$API_KEY_SOURCE" >&2
echo "publish_result.persistence=permanent" >&2
echo "publish_result.drive_id=$FROM_DRIVE" >&2
echo "publish_result.drive_version_id=$DRIVE_VERSION_OUT" >&2
echo "publish_result.current_version_id=$CURRENT_VERSION" >&2
exit 0
fi
compute_sha256() {
local f="$1"
if command -v sha256sum >/dev/null 2>&1; then
sha256sum "$f" | cut -d' ' -f1
else
shasum -a 256 "$f" | cut -d' ' -f1
fi
}
guess_content_type() {
local f="$1"
case "${f##*.}" in
html|htm) echo "text/html; charset=utf-8" ;;
css) echo "text/css; charset=utf-8" ;;
js|mjs) echo "text/javascript; charset=utf-8" ;;
json) echo "application/json; charset=utf-8" ;;
md|txt) echo "text/plain; charset=utf-8" ;;
svg) echo "image/svg+xml" ;;
png) echo "image/png" ;;
jpg|jpeg) echo "image/jpeg" ;;
gif) echo "image/gif" ;;
webp) echo "image/webp" ;;
pdf) echo "application/pdf" ;;
mp4) echo "video/mp4" ;;
mov) echo "video/quicktime" ;;
mp3) echo "audio/mpeg" ;;
wav) echo "audio/wav" ;;
xml) echo "application/xml" ;;
woff2) echo "font/woff2" ;;
woff) echo "font/woff" ;;
ttf) echo "font/ttf" ;;
ico) echo "image/x-icon" ;;
*)
local detected
detected=$(file --brief --mime-type "$f" 2>/dev/null || echo "application/octet-stream")
echo "$detected"
;;
esac
}
# Build file manifest as JSON array
FILES_JSON="[]"
if [[ -f "$TARGET" ]]; then
sz=$(wc -c < "$TARGET" | tr -d ' ')
ct=$(guess_content_type "$TARGET")
bn=$(basename "$TARGET")
h=$(compute_sha256 "$TARGET")
FILES_JSON=$("$JQ_BIN" -n --arg p "$bn" --argjson s "$sz" --arg c "$ct" --arg h "$h" \
'[{"path":$p,"size":$s,"contentType":$c,"hash":$h}]')
FILE_MAP=$("$JQ_BIN" -n --arg p "$bn" --arg a "$(cd "$(dirname "$TARGET")" && pwd)/$(basename "$TARGET")" \
'{($p):$a}')
elif [[ -d "$TARGET" ]]; then
FILE_MAP="{}"
while IFS= read -r -d '' f; do
rel="${f#$TARGET/}"
[[ "$rel" == ".DS_Store" ]] && continue
[[ "$(basename "$rel")" == ".DS_Store" ]] && continue
[[ "$rel" == ".herenow/fork-meta.json" ]] && continue
sz=$(wc -c < "$f" | tr -d ' ')
ct=$(guess_content_type "$f")
h=$(compute_sha256 "$f")
abs=$(cd "$(dirname "$f")" && pwd)/$(basename "$f")
FILES_JSON=$(echo "$FILES_JSON" | "$JQ_BIN" --arg p "$rel" --argjson s "$sz" --arg c "$ct" --arg h "$h" \
'. + [{"path":$p,"size":$s,"contentType":$c,"hash":$h}]')
FILE_MAP=$(echo "$FILE_MAP" | "$JQ_BIN" --arg p "$rel" --arg a "$abs" '. + {($p):$a}')
done < <(find "$TARGET" -type f -print0 | sort -z)
else
die "not a file or directory: $TARGET"
fi
file_count=$(echo "$FILES_JSON" | "$JQ_BIN" 'length')
[[ "$file_count" -gt 0 ]] || die "no files found"
# Read fork-meta.json defaults if present and no explicit flags given
FORK_META=""
if [[ -d "$TARGET" ]]; then
FORK_META_PATH="$TARGET/.herenow/fork-meta.json"
if [[ -f "$FORK_META_PATH" ]]; then
FORK_META=$(cat "$FORK_META_PATH")
if [[ -z "$FORKABLE" ]]; then
FORKABLE=$("$JQ_BIN" -r '.forkable // empty' <<< "$FORK_META" 2>/dev/null || true)
fi
fi
fi
# Build request body
BODY=$(echo "$FILES_JSON" | "$JQ_BIN" '{files: .}')
if [[ -n "$TTL" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson t "$TTL" '.ttlSeconds = $t')
fi
if [[ -n "$TITLE" || -n "$DESCRIPTION" ]]; then
viewer="{}"
[[ -n "$TITLE" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg t "$TITLE" '.title = $t')
[[ -n "$DESCRIPTION" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg d "$DESCRIPTION" '.description = $d')
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson v "$viewer" '.viewer = $v')
fi
if [[ -n "$CLAIM_TOKEN" && -n "$SLUG" && -z "$API_KEY" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" --arg ct "$CLAIM_TOKEN" '.claimToken = $ct')
fi
if [[ "$FORKABLE" == "true" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" '.forkable = true')
fi
if [[ "$SPA_MODE" == "true" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" '.spaMode = true')
fi
# Determine endpoint and method
if [[ -n "$SLUG" ]]; then
URL="$BASE_URL/api/v1/publish/$SLUG"
METHOD="PUT"
else
URL="$BASE_URL/api/v1/publish"
METHOD="POST"
fi
# Build auth header
AUTH_ARGS=()
if [[ -n "$API_KEY" ]]; then
AUTH_ARGS=(-H "authorization: Bearer $API_KEY")
fi
AUTH_MODE="anonymous"
if [[ -n "$API_KEY" ]]; then
AUTH_MODE="authenticated"
fi
CLIENT_HEADER_VALUE="here-now-publish-sh"
if [[ -n "$CLIENT" ]]; then
normalized_client=$(echo "$CLIENT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9._-' '-')
normalized_client="${normalized_client#-}"
normalized_client="${normalized_client%-}"
if [[ -n "$normalized_client" ]]; then
CLIENT_HEADER_VALUE="${normalized_client}/publish-sh"
fi
fi
CLIENT_ARGS=(-H "x-herenow-client: $CLIENT_HEADER_VALUE")
# Step 1: Create/update publish
echo "creating publish ($file_count files)..." >&2
RESPONSE=$(curl -sS -X "$METHOD" "$URL" \
"${AUTH_ARGS[@]+"${AUTH_ARGS[@]}"}" \
"${CLIENT_ARGS[@]+"${CLIENT_ARGS[@]}"}" \
-H "content-type: application/json" \
-d "$BODY")
# Check for errors
if echo "$RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$RESPONSE" | "$JQ_BIN" -r '.error')
details=$(echo "$RESPONSE" | "$JQ_BIN" -r '.details // empty')
die "$err${details:+ ($details)}"
fi
OUT_SLUG=$(echo "$RESPONSE" | "$JQ_BIN" -r '.slug')
VERSION_ID=$(echo "$RESPONSE" | "$JQ_BIN" -r '.upload.versionId')
FINALIZE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.upload.finalizeUrl')
SITE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.siteUrl')
UPLOAD_COUNT=$(echo "$RESPONSE" | "$JQ_BIN" '.upload.uploads | length')
SKIPPED_COUNT=$(echo "$RESPONSE" | "$JQ_BIN" '.upload.skipped // [] | length')
[[ "$OUT_SLUG" != "null" ]] || die "unexpected response: $RESPONSE"
# Step 2: Upload files (skipped files are unchanged from previous version)
if [[ "$SKIPPED_COUNT" -gt 0 ]]; then
echo "uploading $UPLOAD_COUNT files ($SKIPPED_COUNT unchanged, skipped)..." >&2
else
echo "uploading $UPLOAD_COUNT files..." >&2
fi
upload_errors=0
for i in $(seq 0 $((UPLOAD_COUNT - 1))); do
upload_path=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].path")
upload_url=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].url")
upload_ct=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].headers[\"Content-Type\"] // empty")
if [[ -f "$TARGET" && ! -d "$TARGET" ]]; then
local_file="$TARGET"
else
local_file=$(echo "$FILE_MAP" | "$JQ_BIN" -r --arg p "$upload_path" '.[$p]')
fi
if [[ ! -f "$local_file" ]]; then
echo "warning: missing local file for $upload_path" >&2
upload_errors=$((upload_errors + 1))
continue
fi
ct_args=()
[[ -n "$upload_ct" ]] && ct_args=(-H "Content-Type: $upload_ct")
http_code=$(curl -sS -o /dev/null -w "%{http_code}" -X PUT "$upload_url" \
"${ct_args[@]+"${ct_args[@]}"}" \
--data-binary "@$local_file")
if [[ "$http_code" -lt 200 || "$http_code" -ge 300 ]]; then
echo "warning: upload failed for $upload_path (HTTP $http_code)" >&2
upload_errors=$((upload_errors + 1))
fi
done
[[ "$upload_errors" -eq 0 ]] || die "$upload_errors file(s) failed to upload"
# Step 3: Finalize
echo "finalizing..." >&2
FIN_RESPONSE=$(curl -sS -X POST "$FINALIZE_URL" \
"${AUTH_ARGS[@]+"${AUTH_ARGS[@]}"}" \
"${CLIENT_ARGS[@]+"${CLIENT_ARGS[@]}"}" \
-H "content-type: application/json" \
-d "{\"versionId\":\"$VERSION_ID\"}")
if echo "$FIN_RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$FIN_RESPONSE" | "$JQ_BIN" -r '.error')
die "finalize failed: $err"
fi
# Save state
mkdir -p "$STATE_DIR"
if [[ -f "$STATE_FILE" ]]; then
STATE=$(cat "$STATE_FILE")
else
STATE='{"publishes":{}}'
fi
entry=$("$JQ_BIN" -n --arg s "$SITE_URL" '{siteUrl: $s}')
RESPONSE_CLAIM_TOKEN=$(echo "$RESPONSE" | "$JQ_BIN" -r '.claimToken // empty')
RESPONSE_CLAIM_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.claimUrl // empty')
RESPONSE_EXPIRES=$(echo "$RESPONSE" | "$JQ_BIN" -r '.expiresAt // empty')
[[ -n "$RESPONSE_CLAIM_TOKEN" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_CLAIM_TOKEN" '.claimToken = $v')
[[ -n "$RESPONSE_CLAIM_URL" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_CLAIM_URL" '.claimUrl = $v')
[[ -n "$RESPONSE_EXPIRES" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_EXPIRES" '.expiresAt = $v')
STATE=$(echo "$STATE" | "$JQ_BIN" --arg slug "$OUT_SLUG" --argjson e "$entry" '.publishes[$slug] = $e')
echo "$STATE" | "$JQ_BIN" '.' > "$STATE_FILE"
# Output
echo "$SITE_URL"
PERSISTENCE="permanent"
if [[ "$AUTH_MODE" == "anonymous" ]]; then
PERSISTENCE="expires_24h"
elif [[ -n "$RESPONSE_EXPIRES" ]]; then
PERSISTENCE="expires_at"
fi
SAFE_CLAIM_URL=""
if [[ -n "$RESPONSE_CLAIM_URL" && "$RESPONSE_CLAIM_URL" == https://* ]]; then
SAFE_CLAIM_URL="$RESPONSE_CLAIM_URL"
fi
ACTION="create"
if [[ -n "$SLUG" ]]; then
ACTION="update"
fi
echo "" >&2
echo "publish_result.site_url=$SITE_URL" >&2
echo "publish_result.slug=$OUT_SLUG" >&2
echo "publish_result.action=$ACTION" >&2
echo "publish_result.auth_mode=$AUTH_MODE" >&2
echo "publish_result.api_key_source=$API_KEY_SOURCE" >&2
echo "publish_result.persistence=$PERSISTENCE" >&2
echo "publish_result.expires_at=$RESPONSE_EXPIRES" >&2
echo "publish_result.claim_url=$SAFE_CLAIM_URL" >&2
if [[ "$AUTH_MODE" == "authenticated" ]]; then
echo "authenticated publish (permanent, saved to your account)" >&2
else
echo "anonymous publish (expires in 24h)" >&2
if [[ -n "$SAFE_CLAIM_URL" ]]; then
echo "claim URL: $SAFE_CLAIM_URL" >&2
fi
if [[ -n "$RESPONSE_CLAIM_TOKEN" ]]; then
echo "claim token saved to $STATE_FILE" >&2
fi
fi
@@ -0,0 +1,372 @@
---
name: shopify
description: Shopify Admin & Storefront GraphQL APIs via curl. Products, orders, customers, inventory, metafields.
version: 1.0.0
author: community
license: MIT
prerequisites:
env_vars: [SHOPIFY_ACCESS_TOKEN, SHOPIFY_STORE_DOMAIN]
commands: [curl, jq]
required_environment_variables:
- name: SHOPIFY_ACCESS_TOKEN
prompt: Shopify Admin API access token (starts with shpat_)
help: "Shopify admin → Settings → Apps and sales channels → Develop apps → Create an app → API credentials. Token shown ONCE on install."
- name: SHOPIFY_STORE_DOMAIN
prompt: Your shop subdomain without protocol (e.g. my-store.myshopify.com)
help: "The permanent myshopify.com domain, not your custom domain."
- name: SHOPIFY_API_VERSION
prompt: Shopify API version (default 2026-01)
help: "Stable quarterly version. Override if you need an older one."
metadata:
hermes:
tags: [Shopify, E-commerce, Commerce, API, GraphQL]
related_skills: [airtable, xurl]
homepage: https://shopify.dev/docs/api/admin-graphql
---
# Shopify — Admin & Storefront GraphQL APIs
Work with Shopify stores directly through `curl`: list products, manage inventory, pull orders, update customers, read metafields. No SDK, no app framework — just the GraphQL endpoint and a custom-app access token.
The REST Admin API is legacy since 2024-04 and only receives security fixes. **Use GraphQL Admin** for all admin work. Use **Storefront GraphQL** for read-only customer-facing queries (products, collections, cart).
## Prerequisites
1. In Shopify admin: **Settings → Apps and sales channels → Develop apps → Create an app**.
2. Click **Configure Admin API scopes**, select what you need (examples below), save.
3. **Install app** → the Admin API access token appears ONCE. Copy it immediately — Shopify will never show it again. Tokens start with `shpat_`.
4. Save to `~/.hermes/.env`:
```
SHOPIFY_ACCESS_TOKEN=shpat_xxxxxxxxxxxxxxxxxxxx
SHOPIFY_STORE_DOMAIN=my-store.myshopify.com
SHOPIFY_API_VERSION=2026-01
```
> **Heads up:** As of January 1, 2026, new "legacy custom apps" created in the Shopify admin are gone. New setups should use the **Dev Dashboard** (`shopify.dev/docs/apps/build/dev-dashboard`). Existing admin-created apps keep working. If the user's shop has no existing custom app and it's after 2026-01-01, direct them to Dev Dashboard instead of the admin flow.
Common scopes by task:
- Products / collections: `read_products`, `write_products`
- Inventory: `read_inventory`, `write_inventory`, `read_locations`
- Orders: `read_orders`, `write_orders` (30 most recent without `read_all_orders`)
- Customers: `read_customers`, `write_customers`
- Draft orders: `read_draft_orders`, `write_draft_orders`
- Fulfillments: `read_fulfillments`, `write_fulfillments`
- Metafields / metaobjects: covered by the matching resource scopes
## API Basics
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/admin/api/$SHOPIFY_API_VERSION/graphql.json`
- **Auth header:** `X-Shopify-Access-Token: $SHOPIFY_ACCESS_TOKEN` (NOT `Authorization: Bearer`)
- **Method:** always `POST`, always `Content-Type: application/json`, body is `{"query": "...", "variables": {...}}`
- **HTTP 200 does not mean success.** GraphQL returns errors in a top-level `errors` array and per-field `userErrors`. Always check both.
- **IDs are GID strings:** `gid://shopify/Product/10079467700516`, `gid://shopify/Variant/...`, `gid://shopify/Order/...`. Pass these verbatim — don't strip the prefix.
- **Rate limit:** calculated via query cost (leaky bucket). Each response has `extensions.cost` with `requestedQueryCost`, `actualQueryCost`, `throttleStatus.{currentlyAvailable, maximumAvailable, restoreRate}`. Back off when `currentlyAvailable` drops below your next query's cost. Standard shops = 100 points bucket, 50/s restore; Plus = 1000/100.
Base curl pattern (reusable):
```bash
shop_gql() {
local query="$1"
local variables="${2:-{}}"
curl -sS -X POST \
"https://${SHOPIFY_STORE_DOMAIN}/admin/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
-H "Content-Type: application/json" \
-H "X-Shopify-Access-Token: ${SHOPIFY_ACCESS_TOKEN}" \
--data "$(jq -nc --arg q "$query" --argjson v "$variables" '{query: $q, variables: $v}')"
}
```
Pipe through `jq` for readable output. `-sS` keeps errors visible but hides the progress bar.
## Discovery
### Shop info + current API version
```bash
shop_gql '{ shop { name myshopifyDomain primaryDomain { url } currencyCode plan { displayName } } }' | jq
```
### List all supported API versions
```bash
shop_gql '{ publicApiVersions { handle supported } }' | jq '.data.publicApiVersions[] | select(.supported)'
```
## Products
### Search products (first 20 matching query)
```bash
shop_gql '
query($q: String!) {
products(first: 20, query: $q) {
edges { node { id title handle status totalInventory variants(first: 5) { edges { node { id sku price inventoryQuantity } } } } }
pageInfo { hasNextPage endCursor }
}
}' '{"q":"hoodie status:active"}' | jq
```
Query syntax supports `title:`, `sku:`, `vendor:`, `product_type:`, `status:active`, `tag:`, `created_at:>2025-01-01`. Full grammar: https://shopify.dev/docs/api/usage/search-syntax
### Paginate products (cursor)
```bash
shop_gql '
query($cursor: String) {
products(first: 100, after: $cursor) {
edges { cursor node { id handle } }
pageInfo { hasNextPage endCursor }
}
}' '{"cursor":null}'
# subsequent calls: pass the previous endCursor
```
### Get a product with variants + metafields
```bash
shop_gql '
query($id: ID!) {
product(id: $id) {
id title handle descriptionHtml tags status
variants(first: 20) { edges { node { id sku price compareAtPrice inventoryQuantity selectedOptions { name value } } } }
metafields(first: 20) { edges { node { namespace key type value } } }
}
}' '{"id":"gid://shopify/Product/10079467700516"}' | jq
```
### Create a product with one variant
```bash
shop_gql '
mutation($input: ProductCreateInput!) {
productCreate(product: $input) {
product { id handle }
userErrors { field message }
}
}' '{"input":{"title":"Test Hoodie","status":"DRAFT","vendor":"Hermes","productType":"Apparel","tags":["test"]}}'
```
Variants now have their own mutations in recent versions:
```bash
# Add variants after creating the product
shop_gql '
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
productVariantsBulkCreate(productId: $productId, variants: $variants) {
productVariants { id sku price }
userErrors { field message }
}
}' '{"productId":"gid://shopify/Product/...","variants":[{"optionValues":[{"optionName":"Size","name":"M"}],"price":"49.00","inventoryItem":{"sku":"HD-M","tracked":true}}]}'
```
### Update price / SKU
```bash
shop_gql '
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
productVariantsBulkUpdate(productId: $productId, variants: $variants) {
productVariants { id sku price }
userErrors { field message }
}
}' '{"productId":"gid://shopify/Product/...","variants":[{"id":"gid://shopify/ProductVariant/...","price":"55.00"}]}'
```
## Orders
### List recent orders (last 30 by default without `read_all_orders`)
```bash
shop_gql '
{
orders(first: 20, reverse: true, query: "financial_status:paid") {
edges { node {
id name createdAt displayFinancialStatus displayFulfillmentStatus
totalPriceSet { shopMoney { amount currencyCode } }
customer { id displayName email }
lineItems(first: 10) { edges { node { title quantity sku } } }
} }
}
}' | jq
```
Useful order query filters: `financial_status:paid|pending|refunded`, `fulfillment_status:unfulfilled|fulfilled`, `created_at:>2025-01-01`, `tag:gift`, `email:foo@example.com`.
### Fetch a single order with shipping address
```bash
shop_gql '
query($id: ID!) {
order(id: $id) {
id name email
shippingAddress { name address1 address2 city province country zip phone }
lineItems(first: 50) { edges { node { title quantity variant { sku } originalUnitPriceSet { shopMoney { amount currencyCode } } } } }
transactions { id kind status amountSet { shopMoney { amount currencyCode } } }
}
}' '{"id":"gid://shopify/Order/...."}' | jq
```
## Customers
```bash
# Search
shop_gql '
{
customers(first: 10, query: "email:*@example.com") {
edges { node { id email displayName numberOfOrders amountSpent { amount currencyCode } } }
}
}'
# Create
shop_gql '
mutation($input: CustomerInput!) {
customerCreate(input: $input) {
customer { id email }
userErrors { field message }
}
}' '{"input":{"email":"test@example.com","firstName":"Test","lastName":"User","tags":["api-created"]}}'
```
## Inventory
Inventory lives on **inventory items** tied to variants, quantities tracked per **location**.
```bash
# Get inventory for a variant across all locations
shop_gql '
query($id: ID!) {
productVariant(id: $id) {
id sku
inventoryItem {
id tracked
inventoryLevels(first: 10) {
edges { node { location { id name } quantities(names: ["available","on_hand","committed"]) { name quantity } } }
}
}
}
}' '{"id":"gid://shopify/ProductVariant/..."}'
```
Adjust stock (delta) — uses `inventoryAdjustQuantities`:
```bash
shop_gql '
mutation($input: InventoryAdjustQuantitiesInput!) {
inventoryAdjustQuantities(input: $input) {
inventoryAdjustmentGroup { reason changes { name delta } }
userErrors { field message }
}
}' '{
"input": {
"reason": "correction",
"name": "available",
"changes": [{"delta": 5, "inventoryItemId": "gid://shopify/InventoryItem/...", "locationId": "gid://shopify/Location/..."}]
}
}'
```
Set absolute stock (not delta) — `inventorySetQuantities`:
```bash
shop_gql '
mutation($input: InventorySetQuantitiesInput!) {
inventorySetQuantities(input: $input) {
inventoryAdjustmentGroup { id }
userErrors { field message }
}
}' '{"input":{"reason":"correction","name":"available","ignoreCompareQuantity":true,"quantities":[{"inventoryItemId":"gid://shopify/InventoryItem/...","locationId":"gid://shopify/Location/...","quantity":100}]}}'
```
## Metafields & Metaobjects
Metafields attach custom data to resources (products, customers, orders, shop).
```bash
# Read
shop_gql '
query($id: ID!) {
product(id: $id) {
metafields(first: 10, namespace: "custom") {
edges { node { key type value } }
}
}
}' '{"id":"gid://shopify/Product/..."}'
# Write (works for any owner type)
shop_gql '
mutation($metafields: [MetafieldsSetInput!]!) {
metafieldsSet(metafields: $metafields) {
metafields { id key namespace }
userErrors { field message code }
}
}' '{"metafields":[{"ownerId":"gid://shopify/Product/...","namespace":"custom","key":"care_instructions","type":"multi_line_text_field","value":"Wash cold. Tumble dry low."}]}'
```
## Storefront API (public read-only)
Different endpoint, different token, used for customer-facing apps/hydrogen-style headless setups. Headers differ:
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/api/$SHOPIFY_API_VERSION/graphql.json`
- **Auth header (public):** `X-Shopify-Storefront-Access-Token: <public token>` — embeddable in browser
- **Auth header (private):** `Shopify-Storefront-Private-Token: <private token>` — server-only
```bash
curl -sS -X POST \
"https://${SHOPIFY_STORE_DOMAIN}/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
-H "Content-Type: application/json" \
-H "X-Shopify-Storefront-Access-Token: ${SHOPIFY_STOREFRONT_TOKEN}" \
-d '{"query":"{ shop { name } products(first: 5) { edges { node { id title handle } } } }"}' | jq
```
## Bulk Operations
For dumps larger than rate limits allow (full product catalog, all orders for a year):
```bash
# 1. Start bulk query
shop_gql '
mutation {
bulkOperationRunQuery(query: """
{ products { edges { node { id title handle variants { edges { node { sku price } } } } } } }
""") {
bulkOperation { id status }
userErrors { field message }
}
}'
# 2. Poll status
shop_gql '{ currentBulkOperation { id status errorCode objectCount fileSize url partialDataUrl } }'
# 3. When status=COMPLETED, download the JSONL file
curl -sS "$URL" > products.jsonl
```
Each JSONL line is a node, and nested connections are emitted as separate lines with `__parentId`. Reassemble client-side if needed.
## Webhooks
Subscribe to events so you don't have to poll:
```bash
shop_gql '
mutation($topic: WebhookSubscriptionTopic!, $sub: WebhookSubscriptionInput!) {
webhookSubscriptionCreate(topic: $topic, webhookSubscription: $sub) {
webhookSubscription { id topic endpoint { __typename ... on WebhookHttpEndpoint { callbackUrl } } }
userErrors { field message }
}
}' '{"topic":"ORDERS_CREATE","sub":{"callbackUrl":"https://example.com/webhook","format":"JSON"}}'
```
Verify incoming webhook HMAC using the app's client secret (not the access token):
```bash
echo -n "$REQUEST_BODY" | openssl dgst -sha256 -hmac "$APP_SECRET" -binary | base64
# Compare to X-Shopify-Hmac-Sha256 header
```
## Pitfalls
- **REST endpoints still exist but are frozen.** Don't write new integrations against `/admin/api/.../products.json`. Use GraphQL.
- **Token format check.** Admin tokens start with `shpat_`. Storefront public tokens with `shpua_`. If you have one and the wrong header, every request returns 401 without a useful error body.
- **403 with a valid token = missing scope.** Shopify returns `{"errors":[{"message":"Access denied for ..."}]}`. Re-configure Admin API scopes on the app, then reinstall to regenerate the token.
- **`userErrors` is empty != success.** Also check `data.<mutation>.<resource>` is non-null. Some failures populate neither — inspect the whole response.
- **GID vs numeric ID.** Legacy REST gave numeric IDs; GraphQL wants full GID strings. To convert: `gid://shopify/Product/<numeric>`.
- **Rate limit surprise.** A single `products(first: 250)` with deep nesting can cost 1000+ points and throttle immediately on a standard-plan shop. Start narrow, read `extensions.cost`, adjust.
- **Pagination order.** `products(first: N, reverse: true)` sorts by `id DESC`, not `created_at`. Use `sortKey: CREATED_AT, reverse: true` for "newest first."
- **`read_all_orders` for historical data.** Without it, `orders(...)` silently caps at the 60-day window. You won't get an error, just fewer results than expected. For Shopify Plus merchants with many orders, request this scope via the app's protected-data settings.
- **Currencies are strings.** Amounts come back as `"49.00"` not `49.0`. Don't `jq tonumber` blindly if you care about zero-padding.
- **Multi-currency Money fields** have `shopMoney` (store's currency) AND `presentmentMoney` (customer's). Pick one consistently.
## Safety
Mutations in Shopify are real — they create products, charge refunds, cancel orders, ship fulfillments. Before running `productDelete`, `orderCancel`, `refundCreate`, or any bulk mutation: state clearly what the change is, on which shop, and confirm with the user. There is no staging clone of production data unless the user has a separate dev store.
@@ -12,6 +12,14 @@ import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
try:
from hermes_constants import get_hermes_home
except ImportError:
import os as _os
def get_hermes_home() -> Path: # type: ignore[misc]
val = (_os.environ.get("HERMES_HOME") or "").strip()
return Path(val) if val else Path.home() / ".hermes"
try:
from fastapi import APIRouter
except Exception: # Allows local unit tests without dashboard dependencies.
@@ -135,15 +143,15 @@ ACHIEVEMENTS: List[Dict[str, Any]] = [
def state_path() -> Path:
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "state.json"
return get_hermes_home() / "plugins" / "hermes-achievements" / "state.json"
def snapshot_path() -> Path:
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "scan_snapshot.json"
return get_hermes_home() / "plugins" / "hermes-achievements" / "scan_snapshot.json"
def checkpoint_path() -> Path:
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "scan_checkpoint.json"
return get_hermes_home() / "plugins" / "hermes-achievements" / "scan_checkpoint.json"
def load_state() -> Dict[str, Any]:
File diff suppressed because it is too large Load Diff
+760
View File
@@ -0,0 +1,760 @@
/*
* Hermes Kanban dashboard plugin styles.
*
* All colors reference theme CSS vars so the board reskins with the
* active dashboard theme. No hardcoded palette.
*/
.hermes-kanban {
width: 100%;
}
/* ---- Columns layout -------------------------------------------------- */
.hermes-kanban-columns {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
gap: 0.75rem;
align-items: start;
}
.hermes-kanban-column {
display: flex;
flex-direction: column;
background: color-mix(in srgb, var(--color-card) 85%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius);
padding: 0.5rem;
min-height: 200px;
max-height: calc(100vh - 220px);
transition: border-color 120ms ease, background-color 120ms ease;
}
.hermes-kanban-column--drop {
border-color: var(--color-ring);
background: color-mix(in srgb, var(--color-ring) 8%, var(--color-card));
}
.hermes-kanban-column-header {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.25rem 0.25rem 0.35rem;
font-weight: 600;
font-size: 0.85rem;
color: var(--color-foreground);
}
.hermes-kanban-column-label {
flex: 1;
letter-spacing: 0.01em;
}
.hermes-kanban-column-count {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
font-size: 0.75rem;
font-weight: 500;
}
.hermes-kanban-column-add {
appearance: none;
background: transparent;
border: 1px solid var(--color-border);
color: var(--color-foreground);
border-radius: var(--radius-sm, 0.25rem);
width: 22px;
height: 22px;
line-height: 1;
font-size: 1rem;
cursor: pointer;
}
.hermes-kanban-column-add:hover {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-column-sub {
padding: 0 0.25rem 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
border-bottom: 1px solid color-mix(in srgb, var(--color-border) 60%, transparent);
margin-bottom: 0.5rem;
}
.hermes-kanban-column-body {
display: flex;
flex-direction: column;
gap: 0.45rem;
overflow-y: auto;
padding-right: 0.1rem;
}
.hermes-kanban-empty {
padding: 1.5rem 0.5rem;
text-align: center;
font-size: 0.75rem;
color: var(--color-muted-foreground);
border: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
/* ---- Status dots ----------------------------------------------------- */
.hermes-kanban-dot {
display: inline-block;
width: 0.5rem;
height: 0.5rem;
border-radius: 999px;
background: var(--color-muted-foreground);
}
.hermes-kanban-dot-triage { background: #b47dd6; } /* lilac — fresh/unspecified */
.hermes-kanban-dot-todo { background: var(--color-muted-foreground); }
.hermes-kanban-dot-ready { background: #d4b348; } /* amber */
.hermes-kanban-dot-running { background: #3fb97d; } /* green */
.hermes-kanban-dot-blocked { background: var(--color-destructive, #d14a4a); }
.hermes-kanban-dot-done { background: #4a8cd1; } /* blue */
.hermes-kanban-dot-archived { background: var(--color-border); }
/* ---- Progress pill (N/M child tasks done) --------------------------- */
.hermes-kanban-progress {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.62rem;
padding: 0.05rem 0.35rem;
border-radius: 999px;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border: 1px solid color-mix(in srgb, var(--color-border) 80%, transparent);
color: var(--color-muted-foreground);
letter-spacing: 0.02em;
}
.hermes-kanban-progress--full {
background: color-mix(in srgb, #3fb97d 22%, transparent);
border-color: color-mix(in srgb, #3fb97d 45%, transparent);
color: var(--color-foreground);
}
/* ---- Lanes (per-profile sub-grouping inside Running) ---------------- */
.hermes-kanban-lane {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.25rem 0 0.35rem;
border-top: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
}
.hermes-kanban-lane:first-child {
border-top: 0;
padding-top: 0;
}
.hermes-kanban-lane-head {
display: flex;
align-items: center;
gap: 0.4rem;
font-size: 0.65rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
padding: 0 0.1rem;
}
.hermes-kanban-lane-name {
font-weight: 600;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-lane-count {
margin-left: auto;
font-variant-numeric: tabular-nums;
}
/* ---- Card ------------------------------------------------------------ */
.hermes-kanban-card {
cursor: grab;
transition: transform 100ms ease, box-shadow 100ms ease;
}
.hermes-kanban-card:hover {
box-shadow: 0 1px 0 0 var(--color-ring) inset, 0 0 0 1px var(--color-ring) inset;
}
.hermes-kanban-card:active {
cursor: grabbing;
transform: scale(0.995);
}
.hermes-kanban-card-content {
padding: 0.5rem 0.6rem !important;
display: flex;
flex-direction: column;
gap: 0.3rem;
}
.hermes-kanban-card-row {
display: flex;
align-items: center;
gap: 0.35rem;
flex-wrap: wrap;
}
.hermes-kanban-card-id {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.65rem;
color: var(--color-muted-foreground);
letter-spacing: 0.03em;
}
.hermes-kanban-card-title {
font-size: 0.85rem;
font-weight: 500;
line-height: 1.3;
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-card-meta {
font-size: 0.7rem;
color: var(--color-muted-foreground);
gap: 0.55rem;
}
.hermes-kanban-priority {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
background: color-mix(in srgb, var(--color-ring) 18%, transparent);
color: var(--color-foreground);
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, transparent);
}
.hermes-kanban-tag {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
}
.hermes-kanban-assignee {
font-weight: 500;
color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));
}
.hermes-kanban-unassigned {
font-style: italic;
}
.hermes-kanban-ago {
margin-left: auto;
}
/* ---- Inline create --------------------------------------------------- */
.hermes-kanban-inline-create {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.5rem;
margin-bottom: 0.5rem;
background: color-mix(in srgb, var(--color-card) 70%, transparent);
border: 1px dashed var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-inline-create > .flex.gap-2:last-child > button:first-of-type {
flex: 1;
min-width: 0;
}
/* ---- Drawer (task detail side panel) --------------------------------- */
.hermes-kanban-drawer-shade {
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.45);
z-index: 60;
display: flex;
justify-content: flex-end;
}
.hermes-kanban-drawer {
width: min(480px, 92vw);
height: 100vh;
background: var(--color-card);
border-left: 1px solid var(--color-border);
display: flex;
flex-direction: column;
box-shadow: -4px 0 18px rgba(0, 0, 0, 0.35);
animation: hermes-kanban-drawer-in 180ms ease-out;
}
@keyframes hermes-kanban-drawer-in {
from { transform: translateX(100%); opacity: 0.3; }
to { transform: translateX(0); opacity: 1; }
}
.hermes-kanban-drawer-head {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.6rem 0.8rem;
border-bottom: 1px solid var(--color-border);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-drawer-close {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 1.25rem;
line-height: 1;
cursor: pointer;
padding: 0 0.25rem;
}
.hermes-kanban-drawer-close:hover { color: var(--color-foreground); }
.hermes-kanban-drawer-body {
flex: 1;
overflow-y: auto;
padding: 0.9rem;
display: flex;
flex-direction: column;
gap: 0.85rem;
}
.hermes-kanban-drawer-title {
display: flex;
align-items: center;
gap: 0.5rem;
font-size: 1rem;
font-weight: 600;
}
.hermes-kanban-drawer-meta {
display: flex;
flex-direction: column;
gap: 0.15rem;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-meta-row {
display: flex;
gap: 0.5rem;
font-size: 0.72rem;
}
.hermes-kanban-meta-label {
width: 92px;
color: var(--color-muted-foreground);
}
.hermes-kanban-meta-value {
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-actions {
display: flex;
flex-wrap: wrap;
gap: 0.3rem;
}
.hermes-kanban-section {
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.hermes-kanban-section-head {
font-size: 0.72rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.07em;
color: var(--color-muted-foreground);
}
.hermes-kanban-pre {
margin: 0;
padding: 0.45rem 0.55rem;
white-space: pre-wrap;
word-break: break-word;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.72rem;
color: var(--color-foreground);
}
.hermes-kanban-comment {
border-left: 2px solid color-mix(in srgb, var(--color-ring) 35%, transparent);
padding-left: 0.5rem;
display: flex;
flex-direction: column;
gap: 0.2rem;
}
.hermes-kanban-comment-head {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
}
.hermes-kanban-comment-author {
font-weight: 600;
color: var(--color-foreground);
}
.hermes-kanban-comment-ago {
color: var(--color-muted-foreground);
}
.hermes-kanban-event {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-event-kind {
color: var(--color-foreground);
min-width: 6rem;
}
.hermes-kanban-event-payload {
color: var(--color-muted-foreground);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
max-width: 280px;
}
.hermes-kanban-drawer-comment-row {
display: flex;
gap: 0.4rem;
padding: 0.55rem 0.75rem;
border-top: 1px solid var(--color-border);
background: color-mix(in srgb, var(--color-card) 90%, transparent);
}
.hermes-kanban-count {
display: inline-flex;
gap: 0.2rem;
align-items: center;
}
/* ---- Selection chrome ----------------------------------------------- */
.hermes-kanban-card--selected :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-ring) inset,
0 0 0 1px var(--color-ring) inset;
background: color-mix(in srgb, var(--color-ring) 6%, var(--color-card));
}
.hermes-kanban-card-check {
width: 0.85rem;
height: 0.85rem;
margin: 0;
cursor: pointer;
accent-color: var(--color-ring);
}
/* ---- Bulk action bar ------------------------------------------------ */
.hermes-kanban-bulk {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.4rem 0.75rem;
background: color-mix(in srgb, var(--color-ring) 10%, var(--color-card));
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, var(--color-border));
border-radius: var(--radius-sm, 0.25rem);
flex-wrap: wrap;
}
.hermes-kanban-bulk-count {
font-weight: 600;
font-size: 0.75rem;
padding-right: 0.25rem;
}
.hermes-kanban-bulk > button,
.hermes-kanban-bulk-reassign > button {
height: 1.7rem !important;
padding: 0 0.5rem !important;
font-size: 0.7rem !important;
border: 1px solid var(--color-border);
cursor: pointer;
}
.hermes-kanban-bulk > button:hover:not(:disabled),
.hermes-kanban-bulk-reassign > button:hover:not(:disabled) {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-bulk-reassign {
display: flex;
align-items: center;
gap: 0.25rem;
padding-left: 0.5rem;
border-left: 1px solid color-mix(in srgb, var(--color-border) 70%, transparent);
}
/* ---- Dependency editor chips --------------------------------------- */
.hermes-kanban-deps-row {
display: flex;
align-items: center;
gap: 0.5rem;
margin-bottom: 0.4rem;
}
.hermes-kanban-deps-label {
font-size: 0.68rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
min-width: 4rem;
}
.hermes-kanban-deps-chips {
display: flex;
gap: 0.3rem;
flex-wrap: wrap;
flex: 1;
}
.hermes-kanban-deps-empty {
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-style: italic;
}
.hermes-kanban-dep-chip {
display: inline-flex;
align-items: center;
gap: 0.15rem;
padding: 0.1rem 0.35rem;
background: color-mix(in srgb, var(--color-foreground) 6%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.68rem;
color: var(--color-foreground);
}
.hermes-kanban-dep-chip-x {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
cursor: pointer;
font-size: 0.85rem;
line-height: 1;
padding: 0 0.15rem;
}
.hermes-kanban-dep-chip-x:hover { color: var(--color-destructive, #d14a4a); }
/* ---- Inline edit affordances --------------------------------------- */
.hermes-kanban-editable {
cursor: pointer;
border-bottom: 1px dotted color-mix(in srgb, var(--color-border) 80%, transparent);
}
.hermes-kanban-editable:hover {
color: var(--color-foreground);
border-bottom-color: var(--color-ring);
}
.hermes-kanban-drawer-title-text {
cursor: pointer;
}
.hermes-kanban-drawer-title-text:hover {
text-decoration: underline;
text-decoration-color: var(--color-ring);
text-decoration-style: dotted;
text-underline-offset: 3px;
}
.hermes-kanban-edit-row {
display: flex;
align-items: center;
gap: 0.35rem;
width: 100%;
}
.hermes-kanban-section-head-row {
display: flex;
align-items: center;
justify-content: space-between;
gap: 0.5rem;
}
.hermes-kanban-edit-link {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 0.7rem;
text-transform: uppercase;
letter-spacing: 0.05em;
cursor: pointer;
padding: 0;
}
.hermes-kanban-edit-link:hover { color: var(--color-ring); }
.hermes-kanban-textarea {
width: 100%;
min-height: 8rem;
background: var(--color-card);
color: var(--color-foreground);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
padding: 0.5rem 0.6rem;
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.8rem;
line-height: 1.5;
resize: vertical;
}
.hermes-kanban-textarea:focus {
outline: none;
border-color: var(--color-ring);
box-shadow: 0 0 0 2px color-mix(in srgb, var(--color-ring) 30%, transparent);
}
/* ---- Markdown rendering -------------------------------------------- */
.hermes-kanban-md {
font-size: 0.8rem;
line-height: 1.55;
color: var(--color-foreground);
}
.hermes-kanban-md p { margin: 0.25rem 0; }
.hermes-kanban-md h1,
.hermes-kanban-md h2,
.hermes-kanban-md h3,
.hermes-kanban-md h4 {
margin: 0.6rem 0 0.2rem;
line-height: 1.25;
}
.hermes-kanban-md h1 { font-size: 1.05rem; }
.hermes-kanban-md h2 { font-size: 0.95rem; }
.hermes-kanban-md h3 { font-size: 0.88rem; }
.hermes-kanban-md h4 { font-size: 0.82rem; }
.hermes-kanban-md ul {
margin: 0.25rem 0 0.25rem 1.1rem;
padding: 0;
}
.hermes-kanban-md li { margin: 0.1rem 0; }
.hermes-kanban-md a {
color: var(--color-ring);
text-decoration: underline;
}
.hermes-kanban-md code {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.75rem;
padding: 0.05rem 0.3rem;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border-radius: 3px;
}
.hermes-kanban-md-code {
margin: 0.35rem 0;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 5%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
overflow-x: auto;
}
.hermes-kanban-md-code code {
background: transparent;
padding: 0;
font-size: 0.75rem;
white-space: pre;
}
.hermes-kanban-md strong { font-weight: 600; }
/* ---- Touch-drag proxy ---------------------------------------------- */
.hermes-kanban-touch-proxy {
pointer-events: none;
opacity: 0.85;
box-shadow: 0 8px 20px rgba(0, 0, 0, 0.35);
transform: scale(1.02);
transition: none;
}
/* ---- Staleness tiers ------------------------------------------------ */
.hermes-kanban-card--stale-amber :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px #d4b34888 inset;
}
.hermes-kanban-card--stale-amber:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px #d4b348 inset;
}
.hermes-kanban-card--stale-red :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px var(--color-destructive, #d14a4a) inset,
0 0 8px color-mix(in srgb, var(--color-destructive, #d14a4a) 30%, transparent);
}
.hermes-kanban-card--stale-red:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-destructive, #d14a4a) inset,
0 0 10px color-mix(in srgb, var(--color-destructive, #d14a4a) 45%, transparent);
}
/* ---- Worker log pane ------------------------------------------------ */
.hermes-kanban-log {
max-height: 340px;
overflow: auto;
white-space: pre;
font-size: 0.7rem;
line-height: 1.45;
}
/* ---- Run history (per-attempt log in the drawer) ------------------- */
.hermes-kanban-run {
border-left: 2px solid var(--color-border);
padding: 0.35rem 0.5rem;
margin-bottom: 0.4rem;
background: color-mix(in srgb, var(--color-foreground) 3%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-run--active { border-left-color: #3fb97d; }
.hermes-kanban-run--completed { border-left-color: #4a8cd1; }
.hermes-kanban-run--ended { border-left-color: #6b7280; } /* generic fallback when outcome is unset */
.hermes-kanban-run--blocked { border-left-color: var(--color-destructive, #d14a4a); }
.hermes-kanban-run--crashed,
.hermes-kanban-run--timed_out,
.hermes-kanban-run--gave_up,
.hermes-kanban-run--spawn_failed {
border-left-color: var(--color-destructive, #d14a4a);
background: color-mix(in srgb, var(--color-destructive, #d14a4a) 6%, transparent);
}
.hermes-kanban-run--reclaimed { border-left-color: #d4b348; }
.hermes-kanban-run-head {
display: flex;
align-items: center;
gap: 0.6rem;
font-size: 0.7rem;
}
.hermes-kanban-run-outcome {
font-family: var(--font-mono, ui-monospace, monospace);
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--color-foreground);
}
.hermes-kanban-run-profile {
color: var(--color-muted-foreground);
}
.hermes-kanban-run-elapsed {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-ago {
margin-left: auto;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-summary {
font-size: 0.75rem;
padding: 0.2rem 0 0;
color: var(--color-foreground);
}
.hermes-kanban-run-error {
font-size: 0.7rem;
color: var(--color-destructive, #d14a4a);
padding: 0.15rem 0 0;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-run-meta {
display: block;
font-size: 0.65rem;
padding: 0.15rem 0 0;
color: var(--color-muted-foreground);
white-space: pre-wrap;
word-break: break-word;
font-family: var(--font-mono, ui-monospace, monospace);
}
+14
View File
@@ -0,0 +1,14 @@
{
"name": "kanban",
"label": "Kanban",
"description": "Multi-agent collaboration board — drag-drop cards across columns, read comment threads, see which profile is running what",
"icon": "Package",
"version": "1.0.0",
"tab": {
"path": "/kanban",
"position": "after:skills"
},
"entry": "dist/index.js",
"css": "dist/style.css",
"api": "plugin_api.py"
}
+845
View File
@@ -0,0 +1,845 @@
"""Kanban dashboard plugin — backend API routes.
Mounted at /api/plugins/kanban/ by the dashboard plugin system.
This layer is intentionally thin: every handler is a small wrapper around
``hermes_cli.kanban_db`` or a direct SQL query. Writes use the same code
paths the CLI and gateway ``/kanban`` command use, so the three surfaces
cannot drift.
Live updates arrive via the ``/events`` WebSocket, which tails the
append-only ``task_events`` table on a short poll interval (WAL mode lets
reads run alongside the dispatcher's IMMEDIATE write transactions).
Security note
-------------
The dashboard's HTTP auth middleware (``web_server.auth_middleware``)
explicitly skips ``/api/plugins/`` plugin routes are unauthenticated by
design because the dashboard binds to localhost by default. For the
WebSocket we still require the session token as a ``?token=`` query
parameter (browsers cannot set the ``Authorization`` header on an upgrade
request), matching the established pattern used by the in-browser PTY
bridge in ``hermes_cli/web_server.py``. If you run the dashboard with
``--host 0.0.0.0``, every plugin route kanban included becomes
reachable from the network. Don't do that on a shared host.
"""
from __future__ import annotations
import asyncio
import hmac
import json
import logging
import sqlite3
import time
from dataclasses import asdict
from typing import Any, Optional
from fastapi import APIRouter, HTTPException, Query, WebSocket, WebSocketDisconnect, status as http_status
from pydantic import BaseModel, Field
from hermes_cli import kanban_db
log = logging.getLogger(__name__)
router = APIRouter()
# ---------------------------------------------------------------------------
# Auth helper — WebSocket only (HTTP routes live behind the dashboard's
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------
def _check_ws_token(provided: Optional[str]) -> bool:
"""Constant-time compare against the dashboard session token.
Imported lazily so the plugin still loads in test contexts where the
dashboard web_server module isn't importable (e.g. the bare-FastAPI
test harness).
"""
if not provided:
return False
try:
from hermes_cli import web_server as _ws
except Exception:
# No dashboard context (tests). Accept so the tail loop is still
# testable; in production the dashboard module always imports
# cleanly because it's the caller.
return True
expected = getattr(_ws, "_SESSION_TOKEN", None)
if not expected:
return True
return hmac.compare_digest(str(provided), str(expected))
def _conn():
"""Open a kanban_db connection, creating the schema on first use.
Every handler that mutates the DB goes through this so the plugin
self-heals on a fresh install (no user-visible "no such table"
error if somebody hits POST /tasks before GET /board).
``init_db`` is idempotent.
"""
try:
kanban_db.init_db()
except Exception as exc:
log.warning("kanban init_db failed: %s", exc)
return kanban_db.connect()
# ---------------------------------------------------------------------------
# Serialization helpers
# ---------------------------------------------------------------------------
# Columns shown by the dashboard, in left-to-right order. "archived" is
# available via a filter toggle rather than a visible column.
BOARD_COLUMNS: list[str] = [
"triage", "todo", "ready", "running", "blocked", "done",
]
def _task_dict(task: kanban_db.Task) -> dict[str, Any]:
d = asdict(task)
# Add derived age metrics so the UI can colour stale cards without
# computing deltas client-side.
d["age"] = kanban_db.task_age(task)
# Keep body short on list endpoints; full body comes from /tasks/:id.
return d
def _event_dict(event: kanban_db.Event) -> dict[str, Any]:
return {
"id": event.id,
"task_id": event.task_id,
"kind": event.kind,
"payload": event.payload,
"created_at": event.created_at,
"run_id": event.run_id,
}
def _comment_dict(c: kanban_db.Comment) -> dict[str, Any]:
return {
"id": c.id,
"task_id": c.task_id,
"author": c.author,
"body": c.body,
"created_at": c.created_at,
}
def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
"""Serialise a Run for the drawer's Run history section."""
return {
"id": r.id,
"task_id": r.task_id,
"profile": r.profile,
"step_key": r.step_key,
"status": r.status,
"claim_lock": r.claim_lock,
"claim_expires": r.claim_expires,
"worker_pid": r.worker_pid,
"max_runtime_seconds": r.max_runtime_seconds,
"last_heartbeat_at": r.last_heartbeat_at,
"started_at": r.started_at,
"ended_at": r.ended_at,
"outcome": r.outcome,
"summary": r.summary,
"metadata": r.metadata,
"error": r.error,
}
def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
"""Return {'parents': [...], 'children': [...]} for a task."""
parents = [
r["parent_id"]
for r in conn.execute(
"SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
(task_id,),
)
]
children = [
r["child_id"]
for r in conn.execute(
"SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
(task_id,),
)
]
return {"parents": parents, "children": children}
# ---------------------------------------------------------------------------
# GET /board
# ---------------------------------------------------------------------------
@router.get("/board")
def get_board(
tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
include_archived: bool = Query(False),
):
"""Return the full board grouped by status column.
``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
install doesn't surface a "failed to load" error on the plugin tab.
"""
conn = _conn()
try:
tasks = kanban_db.list_tasks(
conn, tenant=tenant, include_archived=include_archived
)
# Pre-fetch link counts per task (cheap: one query).
link_counts: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT parent_id, child_id FROM task_links"
).fetchall():
link_counts.setdefault(row["parent_id"], {"parents": 0, "children": 0})[
"children"
] += 1
link_counts.setdefault(row["child_id"], {"parents": 0, "children": 0})[
"parents"
] += 1
# Comment + event counts (both cheap aggregates).
comment_counts: dict[str, int] = {
r["task_id"]: r["n"]
for r in conn.execute(
"SELECT task_id, COUNT(*) AS n FROM task_comments GROUP BY task_id"
)
}
# Progress rollup: for each parent, how many children are done / total.
# One pass over task_links joined with child status — cheaper than
# N per-task queries and the plugin uses it to render "N/M".
progress: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT l.parent_id AS pid, t.status AS cstatus "
"FROM task_links l JOIN tasks t ON t.id = l.child_id"
).fetchall():
p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
p["total"] += 1
if row["cstatus"] == "done":
p["done"] += 1
latest_event_id = conn.execute(
"SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
).fetchone()["m"]
columns: dict[str, list[dict]] = {c: [] for c in BOARD_COLUMNS}
if include_archived:
columns["archived"] = []
for t in tasks:
d = _task_dict(t)
d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
d["comment_count"] = comment_counts.get(t.id, 0)
d["progress"] = progress.get(t.id) # None when the task has no children
col = t.status if t.status in columns else "todo"
columns[col].append(d)
# Stable per-column ordering already applied by list_tasks
# (priority DESC, created_at ASC), keep as-is.
# List of known tenants for the UI filter dropdown.
tenants = [
r["tenant"]
for r in conn.execute(
"SELECT DISTINCT tenant FROM tasks WHERE tenant IS NOT NULL ORDER BY tenant"
)
]
# List of distinct assignees for the lane-by-profile sub-grouping.
assignees = [
r["assignee"]
for r in conn.execute(
"SELECT DISTINCT assignee FROM tasks WHERE assignee IS NOT NULL "
"AND status != 'archived' ORDER BY assignee"
)
]
return {
"columns": [
{"name": name, "tasks": columns[name]} for name in columns.keys()
],
"tenants": tenants,
"assignees": assignees,
"latest_event_id": int(latest_event_id),
"now": int(time.time()),
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# GET /tasks/:id
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}")
def get_task(task_id: str):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
return {
"task": _task_dict(task),
"comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
"events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
"links": _links_for(conn, task_id),
"runs": [_run_dict(r) for r in kanban_db.list_runs(conn, task_id)],
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# POST /tasks
# ---------------------------------------------------------------------------
class CreateTaskBody(BaseModel):
title: str
body: Optional[str] = None
assignee: Optional[str] = None
tenant: Optional[str] = None
priority: int = 0
workspace_kind: str = "scratch"
workspace_path: Optional[str] = None
parents: list[str] = Field(default_factory=list)
triage: bool = False
idempotency_key: Optional[str] = None
max_runtime_seconds: Optional[int] = None
skills: Optional[list[str]] = None
@router.post("/tasks")
def create_task(payload: CreateTaskBody):
conn = _conn()
try:
task_id = kanban_db.create_task(
conn,
title=payload.title,
body=payload.body,
assignee=payload.assignee,
created_by="dashboard",
workspace_kind=payload.workspace_kind,
workspace_path=payload.workspace_path,
tenant=payload.tenant,
priority=payload.priority,
parents=payload.parents,
triage=payload.triage,
idempotency_key=payload.idempotency_key,
max_runtime_seconds=payload.max_runtime_seconds,
skills=payload.skills,
)
task = kanban_db.get_task(conn, task_id)
body: dict[str, Any] = {"task": _task_dict(task) if task else None}
# Surface a dispatcher-presence warning so the UI can show a
# banner when a `ready` task would otherwise sit idle because no
# gateway is running (or dispatch_in_gateway=false). Only emit
# for ready+assigned tasks; triage/todo are expected to wait,
# and unassigned tasks can't be dispatched regardless.
if task and task.status == "ready" and task.assignee:
try:
from hermes_cli.kanban import _check_dispatcher_presence
running, message = _check_dispatcher_presence()
if not running and message:
body["warning"] = message
except Exception:
# Probe failure must never block the create itself.
pass
return body
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
# ---------------------------------------------------------------------------
# PATCH /tasks/:id (status / assignee / priority / title / body)
# ---------------------------------------------------------------------------
class UpdateTaskBody(BaseModel):
status: Optional[str] = None
assignee: Optional[str] = None
priority: Optional[int] = None
title: Optional[str] = None
body: Optional[str] = None
result: Optional[str] = None
block_reason: Optional[str] = None
# Structured handoff fields — forwarded to complete_task when status
# transitions to 'done'. Dashboard parity with ``hermes kanban
# complete --summary ... --metadata ...``.
summary: Optional[str] = None
metadata: Optional[dict] = None
@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
# --- assignee ----------------------------------------------------
if payload.assignee is not None:
try:
ok = kanban_db.assign_task(
conn, task_id, payload.assignee or None,
)
except RuntimeError as e:
raise HTTPException(status_code=409, detail=str(e))
if not ok:
raise HTTPException(status_code=404, detail="task not found")
# --- status -------------------------------------------------------
if payload.status is not None:
s = payload.status
ok = True
if s == "done":
ok = kanban_db.complete_task(
conn, task_id,
result=payload.result,
summary=payload.summary,
metadata=payload.metadata,
)
elif s == "blocked":
ok = kanban_db.block_task(conn, task_id, reason=payload.block_reason)
elif s == "ready":
# Re-open a blocked task, or just an explicit status set.
current = kanban_db.get_task(conn, task_id)
if current and current.status == "blocked":
ok = kanban_db.unblock_task(conn, task_id)
else:
# Direct status write for drag-drop (todo -> ready etc).
ok = _set_status_direct(conn, task_id, "ready")
elif s == "archived":
ok = kanban_db.archive_task(conn, task_id)
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, task_id, s)
else:
raise HTTPException(status_code=400, detail=f"unknown status: {s}")
if not ok:
raise HTTPException(
status_code=409,
detail=f"status transition to {s!r} not valid from current state",
)
# --- priority -----------------------------------------------------
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), task_id),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(task_id, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
# --- title / body -------------------------------------------------
if payload.title is not None or payload.body is not None:
with kanban_db.write_txn(conn):
sets, vals = [], []
if payload.title is not None:
if not payload.title.strip():
raise HTTPException(status_code=400, detail="title cannot be empty")
sets.append("title = ?")
vals.append(payload.title.strip())
if payload.body is not None:
sets.append("body = ?")
vals.append(payload.body)
vals.append(task_id)
conn.execute(
f"UPDATE tasks SET {', '.join(sets)} WHERE id = ?", vals,
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'edited', NULL, ?)",
(task_id, int(time.time())),
)
updated = kanban_db.get_task(conn, task_id)
return {"task": _task_dict(updated) if updated else None}
finally:
conn.close()
def _set_status_direct(
conn: sqlite3.Connection, task_id: str, new_status: str,
) -> bool:
"""Direct status write for drag-drop moves that aren't covered by the
structured complete/block/unblock/archive verbs (e.g. todo<->ready,
running<->ready). Appends a ``status`` event row for the live feed.
When this transitions OFF ``running`` to anything other than the
terminal verbs above (which own their own run closing), we close the
active run with outcome='reclaimed' so attempt history isn't
orphaned. ``running -> ready`` via drag-drop is the common case
(user yanking a stuck worker back to the queue).
"""
with kanban_db.write_txn(conn):
# Snapshot current state so we know whether to close a run.
prev = conn.execute(
"SELECT status, current_run_id FROM tasks WHERE id = ?",
(task_id,),
).fetchone()
if prev is None:
return False
was_running = prev["status"] == "running"
cur = conn.execute(
"UPDATE tasks SET status = ?, "
" claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
" claim_expires = CASE WHEN ? = 'running' THEN claim_expires ELSE NULL END, "
" worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
"WHERE id = ?",
(new_status, new_status, new_status, new_status, task_id),
)
if cur.rowcount != 1:
return False
run_id = None
if was_running and new_status != "running" and prev["current_run_id"]:
run_id = kanban_db._end_run(
conn, task_id,
outcome="reclaimed", status="reclaimed",
summary=f"status changed to {new_status} (dashboard/direct)",
)
conn.execute(
"INSERT INTO task_events (task_id, run_id, kind, payload, created_at) "
"VALUES (?, ?, 'status', ?, ?)",
(task_id, run_id, json.dumps({"status": new_status}), int(time.time())),
)
# If we re-opened something, children may have gone stale.
if new_status in ("done", "ready"):
kanban_db.recompute_ready(conn)
return True
# ---------------------------------------------------------------------------
# Comments
# ---------------------------------------------------------------------------
class CommentBody(BaseModel):
body: str
author: Optional[str] = "dashboard"
@router.post("/tasks/{task_id}/comments")
def add_comment(task_id: str, payload: CommentBody):
if not payload.body.strip():
raise HTTPException(status_code=400, detail="body is required")
conn = _conn()
try:
if kanban_db.get_task(conn, task_id) is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
kanban_db.add_comment(
conn, task_id, author=payload.author or "dashboard", body=payload.body,
)
return {"ok": True}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Links
# ---------------------------------------------------------------------------
class LinkBody(BaseModel):
parent_id: str
child_id: str
@router.post("/links")
def add_link(payload: LinkBody):
conn = _conn()
try:
kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
return {"ok": True}
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
@router.delete("/links")
def delete_link(parent_id: str = Query(...), child_id: str = Query(...)):
conn = _conn()
try:
ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
return {"ok": bool(ok)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Bulk actions (multi-select on the board)
# ---------------------------------------------------------------------------
class BulkTaskBody(BaseModel):
ids: list[str]
status: Optional[str] = None
assignee: Optional[str] = None # "" or None = unassign
priority: Optional[int] = None
archive: bool = False
@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody):
"""Apply the same patch to every id in ``payload.ids``.
This is an *independent* iteration per-task failures don't abort
siblings. Returns per-id outcome so the UI can surface partials.
"""
ids = [i for i in (payload.ids or []) if i]
if not ids:
raise HTTPException(status_code=400, detail="ids is required")
results: list[dict] = []
conn = _conn()
try:
for tid in ids:
entry: dict[str, Any] = {"id": tid, "ok": True}
try:
task = kanban_db.get_task(conn, tid)
if task is None:
entry.update(ok=False, error="not found")
results.append(entry)
continue
if payload.archive:
if not kanban_db.archive_task(conn, tid):
entry.update(ok=False, error="archive refused")
if payload.status is not None and not payload.archive:
s = payload.status
if s == "done":
ok = kanban_db.complete_task(conn, tid)
elif s == "blocked":
ok = kanban_db.block_task(conn, tid)
elif s == "ready":
cur = kanban_db.get_task(conn, tid)
if cur and cur.status == "blocked":
ok = kanban_db.unblock_task(conn, tid)
else:
ok = _set_status_direct(conn, tid, "ready")
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, tid, s)
else:
entry.update(ok=False, error=f"unknown status {s!r}")
results.append(entry)
continue
if not ok:
entry.update(ok=False, error=f"transition to {s!r} refused")
if payload.assignee is not None:
try:
if not kanban_db.assign_task(
conn, tid, payload.assignee or None,
):
entry.update(ok=False, error="assign refused")
except RuntimeError as e:
entry.update(ok=False, error=str(e))
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), tid),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(tid, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
except Exception as e: # defensive — one bad id shouldn't kill the batch
entry.update(ok=False, error=str(e))
results.append(entry)
return {"results": results}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------
@router.get("/config")
def get_config():
"""Return kanban dashboard preferences from ~/.hermes/config.yaml.
Reads the ``dashboard.kanban`` section if present; defaults otherwise.
Used by the UI to pre-select tenant filters, toggle markdown rendering,
or set column-width preferences without a round-trip per page load.
"""
try:
from hermes_cli.config import load_config
cfg = load_config() or {}
except Exception:
cfg = {}
dash_cfg = (cfg.get("dashboard") or {})
# dashboard.kanban may itself be a dict; fall back to {}.
k_cfg = dash_cfg.get("kanban") or {}
return {
"default_tenant": k_cfg.get("default_tenant") or "",
"lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
"include_archived_by_default": bool(k_cfg.get("include_archived_by_default", False)),
"render_markdown": bool(k_cfg.get("render_markdown", True)),
}
# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------
@router.get("/stats")
def get_stats():
"""Per-status + per-assignee counts + oldest-ready age.
Designed for the dashboard HUD and for router profiles that need to
answer "is this specialist overloaded?" without scanning the whole
board themselves.
"""
conn = _conn()
try:
return kanban_db.board_stats(conn)
finally:
conn.close()
@router.get("/assignees")
def get_assignees():
"""Known profiles + per-profile task counts.
Returns the union of ``~/.hermes/profiles/*`` on disk and every
distinct assignee currently used on the board. The dashboard uses
this to populate its assignee dropdown so a freshly-created profile
appears in the picker before it's been given any task.
"""
conn = _conn()
try:
return {"assignees": kanban_db.known_assignees(conn)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Worker log (read-only; file written by _default_spawn)
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}/log")
def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_000)):
"""Return the worker's stdout/stderr log.
``tail`` caps the response size (bytes) so the dashboard drawer
doesn't paginate megabytes into the browser. Returns 404 if the task
has never spawned. The on-disk log is rotated at 2 MiB per
``_rotate_worker_log`` a single ``.log.1`` is kept, no further
generations, so disk usage per task is bounded at ~4 MiB.
"""
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
finally:
conn.close()
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
content = kanban_db.read_worker_log(task_id, tail_bytes=tail)
log_path = kanban_db.worker_log_path(task_id)
size = log_path.stat().st_size if log_path.exists() else 0
return {
"task_id": task_id,
"path": str(log_path),
"exists": content is not None,
"size_bytes": size,
"content": content or "",
# Truncated when the on-disk file was larger than the tail cap.
"truncated": bool(tail and size > tail),
}
# ---------------------------------------------------------------------------
# Dispatch nudge (optional quick-path so the UI doesn't wait 60 s)
# ---------------------------------------------------------------------------
@router.post("/dispatch")
def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
conn = _conn()
try:
result = kanban_db.dispatch_once(
conn, dry_run=dry_run, max_spawn=max_n,
)
# DispatchResult is a dataclass.
try:
return asdict(result)
except TypeError:
return {"result": str(result)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------
# Poll interval for the event tail loop. SQLite WAL + 300 ms polling is
# the simplest and most robust approach; it adds a fraction of a percent
# of CPU and has no shared state to synchronize across workers.
_EVENT_POLL_SECONDS = 0.3
@router.websocket("/events")
async def stream_events(ws: WebSocket):
# Enforce the dashboard session token as a query param — browsers can't
# set Authorization on a WS upgrade. This matches how the PTY bridge
# authenticates in hermes_cli/web_server.py.
token = ws.query_params.get("token")
if not _check_ws_token(token):
await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
return
await ws.accept()
try:
since_raw = ws.query_params.get("since", "0")
try:
cursor = int(since_raw)
except ValueError:
cursor = 0
def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
conn = kanban_db.connect()
try:
rows = conn.execute(
"SELECT id, task_id, run_id, kind, payload, created_at "
"FROM task_events WHERE id > ? ORDER BY id ASC LIMIT 200",
(cursor_val,),
).fetchall()
out: list[dict] = []
new_cursor = cursor_val
for r in rows:
try:
payload = json.loads(r["payload"]) if r["payload"] else None
except Exception:
payload = None
out.append({
"id": r["id"],
"task_id": r["task_id"],
"run_id": r["run_id"],
"kind": r["kind"],
"payload": payload,
"created_at": r["created_at"],
})
new_cursor = r["id"]
return new_cursor, out
finally:
conn.close()
while True:
cursor, events = await asyncio.to_thread(_fetch_new, cursor)
if events:
await ws.send_json({"events": events, "cursor": cursor})
await asyncio.sleep(_EVENT_POLL_SECONDS)
except WebSocketDisconnect:
return
except Exception as exc: # defensive: never crash the dashboard worker
log.warning("Kanban event stream error: %s", exc)
try:
await ws.close()
except Exception:
pass
@@ -0,0 +1,32 @@
# DEPRECATED — the kanban dispatcher now runs inside the gateway by
# default (config key: kanban.dispatch_in_gateway, default true). To
# migrate:
#
# systemctl --user disable --now hermes-kanban-dispatcher.service
# # then make sure a gateway is running; e.g. a systemd user unit
# # for `hermes gateway start`. The gateway hosts the dispatcher.
#
# This unit is kept for users who truly cannot run the gateway (host
# policy forbids long-lived services, etc.). It now invokes the
# standalone dispatcher via the explicit --force flag, so nobody
# accidentally keeps two dispatchers racing against the same
# kanban.db. Running this unit AND a gateway with
# dispatch_in_gateway=true is NOT supported.
[Unit]
Description=Hermes Kanban dispatcher (DEPRECATED standalone daemon — prefer gateway-embedded dispatch)
Documentation=https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/env hermes kanban daemon --force --interval 60 --pidfile %t/hermes-kanban-dispatcher.pid
Restart=on-failure
RestartSec=5
# Log to the journal via stdout/stderr; the dispatcher also writes per-task
# worker output to $HERMES_HOME/kanban/logs/<task>.log.
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
+23 -12
View File
@@ -110,6 +110,17 @@ def _parse_context_tokens(host_val, root_val) -> int | None:
return None
def _parse_int_config(host_val, root_val, default: int) -> int:
"""Parse an integer config: host wins, then root, then default."""
for val in (host_val, root_val):
if val is not None:
try:
return int(val)
except (ValueError, TypeError):
pass
return default
def _parse_dialectic_depth(host_val, root_val) -> int:
"""Parse dialecticDepth: host wins, then root, then 1. Clamped to 1-3."""
for val in (host_val, root_val):
@@ -463,10 +474,10 @@ class HonchoClientConfig:
raw.get("dialecticDynamic"),
default=True,
),
dialectic_max_chars=int(
host_block.get("dialecticMaxChars")
or raw.get("dialecticMaxChars")
or 600
dialectic_max_chars=_parse_int_config(
host_block.get("dialecticMaxChars"),
raw.get("dialecticMaxChars"),
default=600,
),
dialectic_depth=_parse_dialectic_depth(
host_block.get("dialecticDepth"),
@@ -487,15 +498,15 @@ class HonchoClientConfig:
or raw.get("reasoningLevelCap")
or "high"
),
message_max_chars=int(
host_block.get("messageMaxChars")
or raw.get("messageMaxChars")
or 25000
message_max_chars=_parse_int_config(
host_block.get("messageMaxChars"),
raw.get("messageMaxChars"),
default=25000,
),
dialectic_max_input_chars=int(
host_block.get("dialecticMaxInputChars")
or raw.get("dialecticMaxInputChars")
or 10000
dialectic_max_input_chars=_parse_int_config(
host_block.get("dialecticMaxInputChars"),
raw.get("dialecticMaxInputChars"),
default=10000,
),
recall_mode=_normalize_recall_mode(
host_block.get("recallMode")
+9 -6
View File
@@ -160,11 +160,13 @@ class HonchoSessionManager:
Peers are lazy -- no API call until first use.
Observation settings are controlled per-session via SessionPeerConfig.
"""
if peer_id in self._peers_cache:
return self._peers_cache[peer_id]
with self._cache_lock:
if peer_id in self._peers_cache:
return self._peers_cache[peer_id]
peer = self.honcho.peer(peer_id)
self._peers_cache[peer_id] = peer
with self._cache_lock:
self._peers_cache[peer_id] = peer
return peer
def _get_or_create_honcho_session(
@@ -176,9 +178,10 @@ class HonchoSessionManager:
Returns:
Tuple of (honcho_session, existing_messages).
"""
if session_id in self._sessions_cache:
logger.debug("Honcho session '%s' retrieved from cache", session_id)
return self._sessions_cache[session_id], []
with self._cache_lock:
if session_id in self._sessions_cache:
logger.debug("Honcho session '%s' retrieved from cache", session_id)
return self._sessions_cache[session_id], []
session = self.honcho.session(session_id)
+3
View File
@@ -38,6 +38,7 @@ except ImportError:
try:
from microsoft_teams.apps import App, ActivityContext
from microsoft_teams.common.http.client import ClientOptions
from microsoft_teams.api import MessageActivity, ConversationReference
from microsoft_teams.api.activities.typing import TypingActivityInput
from microsoft_teams.api.activities.invoke.adaptive_card import AdaptiveCardInvokeActivity
@@ -57,6 +58,7 @@ try:
TEAMS_SDK_AVAILABLE = True
except ImportError:
TEAMS_SDK_AVAILABLE = False
ClientOptions = None # type: ignore[assignment,misc]
App = None # type: ignore[assignment,misc]
ActivityContext = None # type: ignore[assignment,misc]
MessageActivity = None # type: ignore[assignment,misc]
@@ -208,6 +210,7 @@ class TeamsAdapter(BasePlatformAdapter):
client_secret=self._client_secret,
tenant_id=self._tenant_id,
http_server_adapter=_AiohttpBridgeAdapter(aiohttp_app),
client=ClientOptions(headers={"User-Agent": "Hermes"}),
)
# Register message handler before initialize()
+332 -136
View File
@@ -23,6 +23,7 @@ Usage:
import asyncio
import base64
import concurrent.futures
import contextvars
import copy
import hashlib
import json
@@ -133,6 +134,7 @@ from agent.prompt_builder import (
DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS,
MEMORY_GUIDANCE, SESSION_SEARCH_GUIDANCE, SKILLS_GUIDANCE,
HERMES_AGENT_HELP_GUIDANCE,
KANBAN_GUIDANCE,
build_nous_subscription_prompt,
)
from agent.model_metadata import (
@@ -160,6 +162,13 @@ from agent.display import (
_detect_tool_failure,
get_tool_emoji as _get_tool_emoji,
)
from agent.tool_guardrails import (
ToolCallGuardrailConfig,
ToolCallGuardrailController,
ToolGuardrailDecision,
append_toolguard_guidance,
toolguard_synthetic_result,
)
from agent.trajectory import (
convert_scratchpad_to_think, has_incomplete_scratchpad,
save_trajectory as _save_trajectory_to_file,
@@ -1148,6 +1157,8 @@ class AIAgent:
# Tool execution state — allows _vprint during tool execution
# even when stream consumers are registered (no tokens streaming then)
self._executing_tools = False
self._tool_guardrails = ToolCallGuardrailController()
self._tool_guardrail_halt_decision: ToolGuardrailDecision | None = None
# Interrupt mechanism for breaking out of tool loops
self._interrupt_requested = False
@@ -1621,30 +1632,12 @@ class AIAgent:
self._session_db = session_db
self._parent_session_id = parent_session_id
self._last_flushed_db_idx = 0 # tracks DB-write cursor to prevent duplicate writes
if self._session_db:
try:
self._session_db.create_session(
session_id=self.session_id,
source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
model=self.model,
model_config={
"max_iterations": self.max_iterations,
"reasoning_config": reasoning_config,
"max_tokens": max_tokens,
},
user_id=None,
parent_session_id=self._parent_session_id,
)
except Exception as e:
# Transient SQLite lock contention (e.g. CLI and gateway writing
# concurrently) must NOT permanently disable session_search for
# this agent. Keep _session_db alive — subsequent message
# flushes and session_search calls will still work once the
# lock clears. The session row may be missing from the index
# for this run, but that is recoverable (flushes upsert rows).
logger.warning(
"Session DB create_session failed (session_search still available): %s", e
)
self._session_db_created = False # DB row deferred to run_conversation()
self._session_init_model_config = {
"max_iterations": self.max_iterations,
"reasoning_config": reasoning_config,
"max_tokens": max_tokens,
}
# In-memory todo list for task planning (one per agent/session)
from tools.todo_tool import TodoStore
@@ -1656,6 +1649,14 @@ class AIAgent:
_agent_cfg = _load_agent_config()
except Exception:
_agent_cfg = {}
try:
self._tool_guardrails = ToolCallGuardrailController(
ToolCallGuardrailConfig.from_mapping(
_agent_cfg.get("tool_loop_guardrails", {})
)
)
except Exception as _tlg_err:
logger.warning("Tool loop guardrail config ignored: %s", _tlg_err)
# Cache only the derived auxiliary compression context override that is
# needed later by the startup feasibility check. Avoid exposing a
# broad pseudo-public config object on the agent instance.
@@ -2151,6 +2152,28 @@ class AIAgent:
"is_anthropic_oauth": self._is_anthropic_oauth,
})
def _ensure_db_session(self) -> None:
"""Create session DB row on first use. Disables _session_db on failure."""
if self._session_db_created or not self._session_db:
return
try:
self._session_db.create_session(
session_id=self.session_id,
source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
model=self.model,
model_config=self._session_init_model_config,
system_prompt=self._cached_system_prompt,
user_id=None,
parent_session_id=self._parent_session_id,
)
self._session_db_created = True
except Exception as e:
# Transient failure (e.g. SQLite lock). Keep _session_db alive —
# _session_db_created stays False so next run_conversation() retries.
logger.warning(
"Session DB creation failed (will retry next turn): %s", e
)
def reset_session_state(self):
"""Reset all session-scoped token counters to 0 for a fresh session.
@@ -3592,11 +3615,15 @@ class AIAgent:
if actions:
summary = " · ".join(dict.fromkeys(actions))
self._safe_print(f" 💾 {summary}")
self._safe_print(
f" 💾 Self-improvement review: {summary}"
)
_bg_cb = self.background_review_callback
if _bg_cb:
try:
_bg_cb(f"💾 {summary}")
_bg_cb(
f"💾 Self-improvement review: {summary}"
)
except Exception:
pass
@@ -3696,14 +3723,9 @@ class AIAgent:
return
self._apply_persist_user_message_override(messages)
try:
# If create_session() failed at startup (e.g. transient lock), the
# session row may not exist yet. ensure_session() uses INSERT OR
# IGNORE so it is a no-op when the row is already there.
self._session_db.ensure_session(
self.session_id,
source=self.platform or "cli",
model=self.model,
)
# Retry row creation if the earlier attempt failed transiently.
if not self._session_db_created:
self._ensure_db_session()
start_idx = len(conversation_history) if conversation_history else 0
flush_from = max(start_idx, self._last_flushed_db_idx)
for msg in messages[flush_from:]:
@@ -4823,6 +4845,12 @@ class AIAgent:
tool_guidance.append(SESSION_SEARCH_GUIDANCE)
if "skill_manage" in self.valid_tool_names:
tool_guidance.append(SKILLS_GUIDANCE)
# Kanban worker/orchestrator lifecycle — only present when the
# dispatcher spawned this process (kanban_show check_fn gates on
# HERMES_KANBAN_TASK env var). Normal chat sessions never see
# this block.
if "kanban_show" in self.valid_tool_names:
tool_guidance.append(KANBAN_GUIDANCE)
if tool_guidance:
prompt_parts.append(" ".join(tool_guidance))
@@ -4970,8 +4998,8 @@ class AIAgent:
def _get_tool_call_id_static(tc) -> str:
"""Extract call ID from a tool_call entry (dict or object)."""
if isinstance(tc, dict):
return tc.get("id", "") or ""
return getattr(tc, "id", "") or ""
return tc.get("call_id", "") or tc.get("id", "") or ""
return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""
_VALID_API_ROLES = frozenset({"system", "user", "assistant", "tool", "function", "developer"})
@@ -8574,9 +8602,13 @@ class AIAgent:
# message. Without it, replaying the persisted message causes
# HTTP 400 ("The reasoning_content in the thinking mode must
# be passed back to the API"). Include streamed reasoning
# text when captured; otherwise pad with empty string.
# Refs #15250, #17400.
msg["reasoning_content"] = reasoning_text or ""
# text when captured; otherwise pad with a single space —
# DeepSeek V4 Pro tightened validation and rejects empty
# string ("The reasoning content in the thinking mode must
# be passed back to the API"). A space satisfies non-empty
# checks everywhere without leaking fabricated reasoning.
# Refs #15250, #17400, #17341.
msg["reasoning_content"] = reasoning_text or " "
# Additive fallback (refs #16844, #16884). Streaming-only providers
# (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims)
@@ -8731,11 +8763,20 @@ class AIAgent:
return
# 1. Explicit reasoning_content already set — preserve it verbatim
# (includes DeepSeek/Kimi's own empty-string placeholder written at
# creation time, and any valid reasoning content from the same provider).
# (includes DeepSeek/Kimi's own space-placeholder written at creation
# time, and any valid reasoning content from the same provider).
#
# Exception: sessions persisted BEFORE #17341 have empty-string
# placeholders pinned at creation time. DeepSeek V4 Pro rejects
# those with HTTP 400. When the active provider enforces the
# thinking-mode echo, upgrade "" → " " on replay so stale history
# doesn't 400 the user on the next turn.
existing = source_msg.get("reasoning_content")
if isinstance(existing, str):
api_msg["reasoning_content"] = existing
if existing == "" and self._needs_thinking_reasoning_pad():
api_msg["reasoning_content"] = " "
else:
api_msg["reasoning_content"] = existing
return
needs_thinking_pad = self._needs_thinking_reasoning_pad()
@@ -8747,8 +8788,10 @@ class AIAgent:
# pins reasoning_content at creation time for tool-call turns, so the
# shape (reasoning set, reasoning_content absent, tool_calls present)
# is unreachable from same-provider DeepSeek history after this fix.
# Inject "" to satisfy the API without leaking another provider's
# chain of thought to DeepSeek/Kimi.
# Inject a single space to satisfy the API without leaking another
# provider's chain of thought to DeepSeek/Kimi. Space (not "")
# because DeepSeek V4 Pro rejects empty-string reasoning_content
# in thinking mode (refs #17341).
normalized_reasoning = source_msg.get("reasoning")
if (
needs_thinking_pad
@@ -8756,7 +8799,7 @@ class AIAgent:
and isinstance(normalized_reasoning, str)
and normalized_reasoning
):
api_msg["reasoning_content"] = ""
api_msg["reasoning_content"] = " "
return
# 3. Healthy session: promote 'reasoning' field to 'reasoning_content'
@@ -8769,12 +8812,15 @@ class AIAgent:
return
# 4. DeepSeek / Kimi thinking mode: all assistant messages need
# reasoning_content. Inject "" to satisfy the provider's requirement
# when no explicit reasoning content is present. Covers both
# tool-call turns (already-poisoned history with no reasoning at all)
# and plain text turns.
# reasoning_content. Inject a single space to satisfy the provider's
# requirement when no explicit reasoning content is present. Covers
# both tool-call turns (already-poisoned history with no reasoning
# at all) and plain text turns. Space (not "") because DeepSeek V4
# Pro tightened validation and rejects empty string with HTTP 400
# ("The reasoning content in the thinking mode must be passed back
# to the API"). Refs #17341.
if needs_thinking_pad:
api_msg["reasoning_content"] = ""
api_msg["reasoning_content"] = " "
return
# 5. reasoning_content was present but not a string (e.g. None after
@@ -9009,12 +9055,15 @@ class AIAgent:
self.session_id = f"{datetime.now().strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:6]}"
# Update session_log_file to point to the new session's JSON file
self.session_log_file = self.logs_dir / f"session_{self.session_id}.json"
self._session_db_created = False
self._session_db.create_session(
session_id=self.session_id,
source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
model=self.model,
model_config=self._session_init_model_config,
parent_session_id=old_session_id,
)
self._session_db_created = True
# Auto-number the title for the continuation session
if old_title:
try:
@@ -9072,9 +9121,14 @@ class AIAgent:
# Update token estimate after compaction so pressure calculations
# use the post-compression count, not the stale pre-compression one.
_compressed_est = (
estimate_tokens_rough(new_system_prompt)
+ estimate_messages_tokens_rough(compressed)
# Use estimate_request_tokens_rough() so tool schemas are included —
# with 50+ tools enabled, schemas alone can add 20-30K tokens, and
# omitting them delays the next compression cycle far past the
# configured threshold (issue #14695).
_compressed_est = estimate_request_tokens_rough(
compressed,
system_prompt=new_system_prompt or "",
tools=self.tools or None,
)
self.context_compressor.last_prompt_tokens = _compressed_est
self.context_compressor.last_completion_tokens = 0
@@ -9095,6 +9149,44 @@ class AIAgent:
)
return compressed, new_system_prompt
def _set_tool_guardrail_halt(self, decision: ToolGuardrailDecision) -> None:
"""Record the first guardrail decision that should stop this turn."""
if decision.should_halt and self._tool_guardrail_halt_decision is None:
self._tool_guardrail_halt_decision = decision
def _toolguard_controlled_halt_response(self, decision: ToolGuardrailDecision) -> str:
tool = decision.tool_name or "a tool"
return (
f"I stopped retrying {tool} because it hit the tool-call guardrail "
f"({decision.code}) after {decision.count} repeated non-progressing "
"attempts. The last tool result explains the blocker; the next step is "
"to change strategy instead of repeating the same call."
)
def _append_guardrail_observation(
self,
tool_name: str,
function_args: dict,
function_result: str,
*,
failed: bool,
) -> str:
decision = self._tool_guardrails.after_call(
tool_name,
function_args,
function_result,
failed=failed,
)
if decision.action in {"warn", "halt"}:
function_result = append_toolguard_guidance(function_result, decision)
if decision.should_halt:
self._set_tool_guardrail_halt(decision)
return function_result
def _guardrail_block_result(self, decision: ToolGuardrailDecision) -> str:
self._set_tool_guardrail_halt(decision)
return toolguard_synthetic_result(decision)
def _execute_tool_calls(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
"""Execute tool calls from the assistant message and append results to messages.
@@ -9138,7 +9230,8 @@ class AIAgent:
)
def _invoke_tool(self, function_name: str, function_args: dict, effective_task_id: str,
tool_call_id: Optional[str] = None, messages: list = None) -> str:
tool_call_id: Optional[str] = None, messages: list = None,
pre_tool_block_checked: bool = False) -> str:
"""Invoke a single tool and return the result string. No display logic.
Handles both agent-level tools (todo, memory, etc.) and registry-dispatched
@@ -9147,13 +9240,14 @@ class AIAgent:
"""
# Check plugin hooks for a block directive before executing anything.
block_message: Optional[str] = None
try:
from hermes_cli.plugins import get_pre_tool_call_block_message
block_message = get_pre_tool_call_block_message(
function_name, function_args, task_id=effective_task_id or "",
)
except Exception:
pass
if not pre_tool_block_checked:
try:
from hermes_cli.plugins import get_pre_tool_call_block_message
block_message = get_pre_tool_call_block_message(
function_name, function_args, task_id=effective_task_id or "",
)
except Exception:
pass
if block_message is not None:
return json.dumps({"error": block_message}, ensure_ascii=False)
@@ -9305,13 +9399,31 @@ class AIAgent:
except Exception:
pass
parsed_calls.append((tool_call, function_name, function_args))
block_result = None
blocked_by_guardrail = False
try:
from hermes_cli.plugins import get_pre_tool_call_block_message
block_message = get_pre_tool_call_block_message(
function_name, function_args, task_id=effective_task_id or "",
)
except Exception:
block_message = None
if block_message is not None:
block_result = json.dumps({"error": block_message}, ensure_ascii=False)
else:
guardrail_decision = self._tool_guardrails.before_call(function_name, function_args)
if not guardrail_decision.allows_execution:
block_result = self._guardrail_block_result(guardrail_decision)
blocked_by_guardrail = True
parsed_calls.append((tool_call, function_name, function_args, block_result, blocked_by_guardrail))
# ── Logging / callbacks ──────────────────────────────────────────
tool_names_str = ", ".join(name for _, name, _ in parsed_calls)
tool_names_str = ", ".join(name for _, name, _, _, _ in parsed_calls)
if not self.quiet_mode:
print(f" ⚡ Concurrent: {num_tools} tool calls — {tool_names_str}")
for i, (tc, name, args) in enumerate(parsed_calls, 1):
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls, 1):
args_str = json.dumps(args, ensure_ascii=False)
if self.verbose_logging:
print(f" 📞 Tool {i}: {name}({list(args.keys())})")
@@ -9320,7 +9432,9 @@ class AIAgent:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
for tc, name, args in parsed_calls:
for tc, name, args, block_result, blocked_by_guardrail in parsed_calls:
if block_result is not None:
continue
if self.tool_progress_callback:
try:
preview = _build_tool_preview(name, args)
@@ -9328,7 +9442,9 @@ class AIAgent:
except Exception as cb_err:
logging.debug(f"Tool progress callback error: {cb_err}")
for tc, name, args in parsed_calls:
for tc, name, args, block_result, blocked_by_guardrail in parsed_calls:
if block_result is not None:
continue
if self.tool_start_callback:
try:
self.tool_start_callback(tc.id, name, args)
@@ -9336,8 +9452,11 @@ class AIAgent:
logging.debug(f"Tool start callback error: {cb_err}")
# ── Concurrent execution ─────────────────────────────────────────
# Each slot holds (function_name, function_args, function_result, duration, error_flag)
# Each slot holds (function_name, function_args, function_result, duration, error_flag, blocked_flag)
results = [None] * num_tools
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls):
if block_result is not None:
results[i] = (name, args, block_result, 0.0, True, True)
# Touch activity before launching workers so the gateway knows
# we're executing tools (not stuck).
@@ -9392,7 +9511,14 @@ class AIAgent:
pass
start = time.time()
try:
result = self._invoke_tool(function_name, function_args, effective_task_id, tool_call.id, messages=messages)
result = self._invoke_tool(
function_name,
function_args,
effective_task_id,
tool_call.id,
messages=messages,
pre_tool_block_checked=True,
)
except Exception as tool_error:
result = f"Error executing tool '{function_name}': {tool_error}"
logger.error("_invoke_tool raised for %s: %s", function_name, tool_error, exc_info=True)
@@ -9402,7 +9528,7 @@ class AIAgent:
logger.info("tool %s failed (%.2fs): %s", function_name, duration, result[:200])
else:
logger.info("tool %s completed (%.2fs, %d chars)", function_name, duration, len(result))
results[index] = (function_name, function_args, result, duration, is_error)
results[index] = (function_name, function_args, result, duration, is_error, False)
# Tear down worker-tid tracking. Clear any interrupt bit we may
# have set so the next task scheduled onto this recycled tid
# starts with a clean slate.
@@ -9428,59 +9554,67 @@ class AIAgent:
spinner.start()
try:
max_workers = min(num_tools, _MAX_TOOL_WORKERS)
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = []
for i, (tc, name, args) in enumerate(parsed_calls):
f = executor.submit(_run_tool, i, tc, name, args)
futures.append(f)
runnable_calls = [
(i, tc, name, args)
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls)
if block_result is None
]
futures = []
if runnable_calls:
max_workers = min(len(runnable_calls), _MAX_TOOL_WORKERS)
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
for i, tc, name, args in runnable_calls:
# Propagate ContextVars (e.g. _approval_session_key); mirrors asyncio.to_thread.
ctx = contextvars.copy_context()
f = executor.submit(ctx.run, _run_tool, i, tc, name, args)
futures.append(f)
# Wait for all to complete with periodic heartbeats so the
# gateway's inactivity monitor doesn't kill us during long
# concurrent tool batches. Also check for user interrupts
# so we don't block indefinitely when the user sends /stop
# or a new message during concurrent tool execution.
_conc_start = time.time()
_interrupt_logged = False
while True:
done, not_done = concurrent.futures.wait(
futures, timeout=5.0,
)
if not not_done:
break
# Check for interrupt — the per-thread interrupt signal
# already causes individual tools (terminal, execute_code)
# to abort, but tools without interrupt checks (web_search,
# read_file) will run to completion. Cancel any futures
# that haven't started yet so we don't block on them.
if self._interrupt_requested:
if not _interrupt_logged:
_interrupt_logged = True
self._vprint(
f"{self.log_prefix}⚡ Interrupt: cancelling "
f"{len(not_done)} pending concurrent tool(s)",
force=True,
)
for f in not_done:
f.cancel()
# Give already-running tools a moment to notice the
# per-thread interrupt signal and exit gracefully.
concurrent.futures.wait(not_done, timeout=3.0)
break
_conc_elapsed = int(time.time() - _conc_start)
# Heartbeat every ~30s (6 × 5s poll intervals)
if _conc_elapsed > 0 and _conc_elapsed % 30 < 6:
_still_running = [
parsed_calls[futures.index(f)][1]
for f in not_done
if f in futures
]
self._touch_activity(
f"concurrent tools running ({_conc_elapsed}s, "
f"{len(not_done)} remaining: {', '.join(_still_running[:3])})"
# Wait for all to complete with periodic heartbeats so the
# gateway's inactivity monitor doesn't kill us during long
# concurrent tool batches. Also check for user interrupts
# so we don't block indefinitely when the user sends /stop
# or a new message during concurrent tool execution.
_conc_start = time.time()
_interrupt_logged = False
while True:
done, not_done = concurrent.futures.wait(
futures, timeout=5.0,
)
if not not_done:
break
# Check for interrupt — the per-thread interrupt signal
# already causes individual tools (terminal, execute_code)
# to abort, but tools without interrupt checks (web_search,
# read_file) will run to completion. Cancel any futures
# that haven't started yet so we don't block on them.
if self._interrupt_requested:
if not _interrupt_logged:
_interrupt_logged = True
self._vprint(
f"{self.log_prefix}⚡ Interrupt: cancelling "
f"{len(not_done)} pending concurrent tool(s)",
force=True,
)
for f in not_done:
f.cancel()
# Give already-running tools a moment to notice the
# per-thread interrupt signal and exit gracefully.
concurrent.futures.wait(not_done, timeout=3.0)
break
_conc_elapsed = int(time.time() - _conc_start)
# Heartbeat every ~30s (6 × 5s poll intervals)
if _conc_elapsed > 0 and _conc_elapsed % 30 < 6:
_still_running = [
parsed_calls[futures.index(f)][1]
for f in not_done
if f in futures
]
self._touch_activity(
f"concurrent tools running ({_conc_elapsed}s, "
f"{len(not_done)} remaining: {', '.join(_still_running[:3])})"
)
finally:
if spinner:
# Build a summary message for the spinner stop
@@ -9489,8 +9623,9 @@ class AIAgent:
spinner.stop(f"{completed}/{num_tools} tools completed in {total_dur:.1f}s total")
# ── Post-execution: display per-tool results ─────────────────────
for i, (tc, name, args) in enumerate(parsed_calls):
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls):
r = results[i]
blocked = False
if r is None:
# Tool was cancelled (interrupt) or thread didn't return
if self._interrupt_requested:
@@ -9499,13 +9634,21 @@ class AIAgent:
function_result = f"Error executing tool '{name}': thread did not return a result"
tool_duration = 0.0
else:
function_name, function_args, function_result, tool_duration, is_error = r
function_name, function_args, function_result, tool_duration, is_error, blocked = r
if not blocked:
function_result = self._append_guardrail_observation(
function_name,
function_args,
function_result,
failed=is_error,
)
if is_error:
result_preview = function_result[:200] if len(function_result) > 200 else function_result
logger.warning("Tool %s returned error (%.2fs): %s", function_name, tool_duration, result_preview)
if self.tool_progress_callback:
if not blocked and self.tool_progress_callback:
try:
self.tool_progress_callback(
"tool.completed", function_name, None, None,
@@ -9533,7 +9676,7 @@ class AIAgent:
self._current_tool = None
self._touch_activity(f"tool completed: {name} ({tool_duration:.1f}s)")
if self.tool_complete_callback:
if not blocked and self.tool_complete_callback:
try:
self.tool_complete_callback(tc.id, name, args, function_result)
except Exception as cb_err:
@@ -9615,9 +9758,17 @@ class AIAgent:
except Exception:
pass
if _block_msg is not None:
# Tool blocked by plugin policy — skip counter resets.
# Execution is handled below in the tool dispatch chain.
_guardrail_block_decision: ToolGuardrailDecision | None = None
if _block_msg is None:
guardrail_decision = self._tool_guardrails.before_call(function_name, function_args)
if not guardrail_decision.allows_execution:
_guardrail_block_decision = guardrail_decision
_execution_blocked = _block_msg is not None or _guardrail_block_decision is not None
if _execution_blocked:
# Tool blocked by plugin or guardrail policy — skip counters,
# callbacks, checkpointing, activity mutation, and real execution.
pass
else:
# Reset nudge counters when the relevant tool is actually used
@@ -9635,35 +9786,35 @@ class AIAgent:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
if _block_msg is None:
if not _execution_blocked:
self._current_tool = function_name
self._touch_activity(f"executing tool: {function_name}")
# Set activity callback for long-running tool execution (terminal
# commands, etc.) so the gateway's inactivity monitor doesn't kill
# the agent while a command is running.
if _block_msg is None:
if not _execution_blocked:
try:
from tools.environments.base import set_activity_callback
set_activity_callback(self._touch_activity)
except Exception:
pass
if _block_msg is None and self.tool_progress_callback:
if not _execution_blocked and self.tool_progress_callback:
try:
preview = _build_tool_preview(function_name, function_args)
self.tool_progress_callback("tool.started", function_name, preview, function_args)
except Exception as cb_err:
logging.debug(f"Tool progress callback error: {cb_err}")
if _block_msg is None and self.tool_start_callback:
if not _execution_blocked and self.tool_start_callback:
try:
self.tool_start_callback(tool_call.id, function_name, function_args)
except Exception as cb_err:
logging.debug(f"Tool start callback error: {cb_err}")
# Checkpoint: snapshot working dir before file-mutating tools
if _block_msg is None and function_name in ("write_file", "patch") and self._checkpoint_mgr.enabled:
if not _execution_blocked and function_name in ("write_file", "patch") and self._checkpoint_mgr.enabled:
try:
file_path = function_args.get("path", "")
if file_path:
@@ -9675,7 +9826,7 @@ class AIAgent:
pass # never block tool execution
# Checkpoint before destructive terminal commands
if _block_msg is None and function_name == "terminal" and self._checkpoint_mgr.enabled:
if not _execution_blocked and function_name == "terminal" and self._checkpoint_mgr.enabled:
try:
cmd = function_args.get("command", "")
if _is_destructive_command(cmd):
@@ -9692,6 +9843,11 @@ class AIAgent:
# Tool blocked by plugin policy — return error without executing.
function_result = json.dumps({"error": _block_msg}, ensure_ascii=False)
tool_duration = 0.0
elif _guardrail_block_decision is not None:
# Tool blocked by tool-loop guardrail — synthesize exactly one
# tool result for the original tool_call_id without executing.
function_result = self._guardrail_block_result(_guardrail_block_decision)
tool_duration = 0.0
elif function_name == "todo":
from tools.todo_tool import todo_tool as _todo_tool
function_result = _todo_tool(
@@ -9875,12 +10031,22 @@ class AIAgent:
# Log tool errors to the persistent error log so [error] tags
# in the UI always have a corresponding detailed entry on disk.
_is_error_result, _ = _detect_tool_failure(function_name, function_result)
if not _execution_blocked:
function_result = self._append_guardrail_observation(
function_name,
function_args,
function_result,
failed=_is_error_result,
)
result_preview = function_result if self.verbose_logging else (
function_result[:200] if len(function_result) > 200 else function_result
)
if _is_error_result:
logger.warning("Tool %s returned error (%.2fs): %s", function_name, tool_duration, result_preview)
else:
logger.info("tool %s completed (%.2fs, %d chars)", function_name, tool_duration, len(function_result))
if self.tool_progress_callback:
if not _execution_blocked and self.tool_progress_callback:
try:
self.tool_progress_callback(
"tool.completed", function_name, None, None,
@@ -9896,7 +10062,7 @@ class AIAgent:
logging.debug(f"Tool {function_name} completed in {tool_duration:.2f}s")
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
if self.tool_complete_callback:
if not _execution_blocked and self.tool_complete_callback:
try:
self.tool_complete_callback(tool_call.id, function_name, function_args, function_result)
except Exception as cb_err:
@@ -9999,6 +10165,13 @@ class AIAgent:
for idx, pfm in enumerate(self.prefill_messages):
api_messages.insert(sys_offset + idx, pfm.copy())
# Same safety net as the main loop: repair tool-call/result
# pairing before asking for a final summary. Compression and
# session resume can leave a tool result whose parent assistant
# tool_call was summarized away; Responses API rejects that as
# "No tool call found for function call output".
api_messages = self._sanitize_api_messages(api_messages)
# Same safety net as the main loop: drop thinking-only assistant
# turns so Anthropic-family providers don't 400 the summary call.
api_messages = self._drop_thinking_only_and_merge_users(api_messages)
@@ -10180,6 +10353,8 @@ class AIAgent:
# Installed once, transparent when streams are healthy, prevents crash on write.
_install_safe_stdio()
self._ensure_db_session()
# Tag all log records on this thread with the session ID so
# ``hermes logs --session <id>`` can filter a single conversation.
from hermes_logging import set_session_context
@@ -10223,6 +10398,8 @@ class AIAgent:
self._last_content_tools_all_housekeeping = False
self._mute_post_response = False
self._unicode_sanitization_passes = 0
self._tool_guardrails.reset_for_turn()
self._tool_guardrail_halt_decision = None
# Pre-turn connection health check: detect and clean up dead TCP
# connections left over from provider outages or dropped streams.
@@ -13020,6 +13197,16 @@ class AIAgent:
self._execute_tool_calls(assistant_message, messages, effective_task_id, api_call_count)
if self._tool_guardrail_halt_decision is not None:
decision = self._tool_guardrail_halt_decision
_turn_exit_reason = "guardrail_halt"
final_response = self._toolguard_controlled_halt_response(decision)
self._emit_status(
f"⚠️ Tool guardrail halted {decision.tool_name}: {decision.code}"
)
messages.append({"role": "assistant", "content": final_response})
break
# Reset per-turn retry counters after successful tool
# execution so a single truncation doesn't poison the
# entire conversation.
@@ -13063,7 +13250,13 @@ class AIAgent:
# causing premature compression. (#12026)
_real_tokens = _compressor.last_prompt_tokens
else:
_real_tokens = estimate_messages_tokens_rough(messages)
# Include tool schemas — with 50+ tools enabled
# these add 20-30K tokens the messages-only
# estimate misses, which can skip compression
# past the configured threshold (#14695).
_real_tokens = estimate_request_tokens_rough(
messages, tools=self.tools or None
)
if self.compression_enabled and _compressor.should_compress(_real_tokens):
self._safe_print(" ⟳ compacting context…")
@@ -13546,6 +13739,7 @@ class AIAgent:
"messages": messages,
"api_calls": api_call_count,
"completed": completed,
"turn_exit_reason": _turn_exit_reason,
"partial": False, # True only when stopped due to invalid tool calls
"interrupted": interrupted,
"response_previewed": getattr(self, "_response_was_previewed", False),
@@ -13565,6 +13759,8 @@ class AIAgent:
"cost_status": self.session_cost_status,
"cost_source": self.session_cost_source,
}
if self._tool_guardrail_halt_decision is not None:
result["guardrail"] = self._tool_guardrail_halt_decision.to_metadata()
# If a /steer landed after the final assistant turn (no more tool
# batches to drain into), hand it back to the caller so it can be
# delivered as the next user turn instead of being silently lost.
+10 -2
View File
@@ -35,10 +35,18 @@ import time
from pathlib import Path
from typing import Any
_PROJECT_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_PROJECT_ROOT))
try:
from hermes_constants import get_hermes_home
except ImportError:
def get_hermes_home() -> Path: # type: ignore[misc]
val = (os.environ.get("HERMES_HOME") or "").strip()
return Path(val) if val else Path.home() / ".hermes"
DEFAULT_TUI_DIR = Path(os.environ.get("HERMES_TUI_DIR", "/home/bb/hermes-agent/ui-tui"))
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(Path.home() / ".hermes" / "perf.log")))
DEFAULT_STATE_DB = Path.home() / ".hermes" / "state.db"
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(get_hermes_home() / "perf.log")))
DEFAULT_STATE_DB = get_hermes_home() / "state.db"
# Keystroke escape sequences. Matches what xterm/VT220 send when the
# terminal has bracketed-paste disabled and the key-repeat handler fires.
+24
View File
@@ -41,13 +41,17 @@ PYPROJECT_FILE = REPO_ROOT / "pyproject.toml"
AUTHOR_MAP = {
# teknium (multiple emails)
"teknium1@gmail.com": "teknium1",
"m@mobrienv.dev": "mikeyobrien",
"qiyin.zuo@pcitc.com": "qiyin-code",
"leone.parise@gmail.com": "leoneparise",
"teknium@nousresearch.com": "teknium1",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"aludwin+gh@gmail.com": "adamludwin",
"2093036+exiao@users.noreply.github.com": "exiao",
"rylen.anil@gmail.com": "rylena",
"godnanijatin@gmail.com": "jatingodnani",
"14046872+tmimmanuel@users.noreply.github.com": "tmimmanuel",
"657290301@qq.com": "IMHaoyan",
"revar@users.noreply.github.com": "revaraver",
# Matrix parity salvage batch (April 2026)
"sr@samirusani": "samrusani",
@@ -76,6 +80,13 @@ AUTHOR_MAP = {
"thomasjhon6666@gmail.com": "ThomassJonax",
"focusflow.app.help@gmail.com": "yes999zc",
"rob@atlas.lan": "rmoen",
# Slack ephemeral slash-ack salvage (May 2026)
"probepark@users.noreply.github.com": "probepark",
# Slack batch salvage (May 2026)
"280484231+prive-fe-bot@users.noreply.github.com": "priveperfumes",
"amr@ghanem.sa": "amroessam",
"paperlantern.agent@gmail.com": "Hinotoi-agent",
"valda@underscore.jp": "valda",
"162235745+0z1-ghb@users.noreply.github.com": "0z1-ghb",
"yes999zc@163.com": "yes999zc",
"343873859@qq.com": "DrStrangerUJN",
@@ -92,6 +103,8 @@ AUTHOR_MAP = {
"130918800+devorun@users.noreply.github.com": "devorun",
"surat.s@itm.kmutnb.ac.th": "beesrsj2500",
"beesr@bee.localdomain": "beesrsj2500",
"mind-dragon@nous.research": "Mind-Dragon",
"juntingpublic@gmail.com": "JustinUssuri",
"mtf201013@gmail.com": "ma-pony",
"sonoyuncudmr@gmail.com": "Sonoyunchu",
"43525405+yatesjalex@users.noreply.github.com": "yatesjalex",
@@ -100,6 +113,8 @@ AUTHOR_MAP = {
"web3blind@users.noreply.github.com": "web3blind",
"julia@alexland.us": "alexg0bot",
"christian@scheid.tech": "scheidti",
# Moonshot schema anyOf+enum salvage (May 2026)
"git@local.invalid": "hendrixfreire",
"1060770+benjaminsehl@users.noreply.github.com": "benjaminsehl",
"nerijusn76@gmail.com": "Nerijusas",
"itonov@proton.me": "Ito-69",
@@ -112,6 +127,7 @@ AUTHOR_MAP = {
"foxion37@gmail.com": "foxion37",
"bloodcarter@gmail.com": "bloodcarter",
"scott@scotttrinh.com": "scotttrinh",
"quocanh261997@gmail.com": "quocanh261997",
# contributors (from noreply pattern)
"david.vv@icloud.com": "davidvv",
"wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
@@ -167,6 +183,7 @@ AUTHOR_MAP = {
"sir_even@icloud.com": "sirEven",
"36056348+sirEven@users.noreply.github.com": "sirEven",
"70424851+insecurejezza@users.noreply.github.com": "insecurejezza",
"jezzahehn@gmail.com": "JezzaHehn",
"254021826+dodo-reach@users.noreply.github.com": "dodo-reach",
"259807879+Bartok9@users.noreply.github.com": "Bartok9",
"270082434+crayfish-ai@users.noreply.github.com": "crayfish-ai",
@@ -292,6 +309,7 @@ AUTHOR_MAP = {
"154585401+LeonSGP43@users.noreply.github.com": "LeonSGP43",
"12250313+Kailigithub@users.noreply.github.com": "Kailigithub",
"mgparkprint@gmail.com": "vlwkaos",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"wangshengyang2004@163.com": "Wangshengyang2004",
@@ -330,6 +348,7 @@ AUTHOR_MAP = {
"stefan@dimagents.ai": "dimitrovi",
"hermes@noushq.ai": "benbarclay",
"chinmingcock@gmail.com": "ChimingLiu",
"allard.quek@singtel.com": "AllardQuek",
"openclaw@sparklab.ai": "openclaw",
"semihcvlk53@gmail.com": "Himess",
"erenkar950@gmail.com": "erenkarakus",
@@ -422,6 +441,8 @@ AUTHOR_MAP = {
"ogzerber@users.noreply.github.com": "ogzerber",
"cola-runner@users.noreply.github.com": "cola-runner",
"ygd58@users.noreply.github.com": "ygd58",
"45554392+warabe1122@users.noreply.github.com": "warabe1122",
"187001140+willy-scr@users.noreply.github.com": "willy-scr",
"vominh1919@users.noreply.github.com": "vominh1919",
"iamagenius00@users.noreply.github.com": "iamagenius00",
"9219265+cresslank@users.noreply.github.com": "cresslank",
@@ -446,6 +467,7 @@ AUTHOR_MAP = {
"taosiyuan163@153.com": "taosiyuan163",
"tesseracttars@gmail.com": "tesseracttars-creator",
"tianliangjay@gmail.com": "xingkongliang",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"unayung@gmail.com": "Unayung",
@@ -491,9 +513,11 @@ AUTHOR_MAP = {
"hubin_ll@qq.com": "LLQWQ",
"memosr_email@gmail.com": "memosr",
"jperlow@gmail.com": "perlowja",
"jasonpette1783@gmail.com": "web-dev0521",
"tangyuanjc@JCdeAIfenshendeMac-mini.local": "tangyuanjc",
"harryplusplus@gmail.com": "harryplusplus",
"anthhub@163.com": "anthhub",
"allard.quek@singtel.com": "AllardQuek",
"shenuu@gmail.com": "shenuu",
"xiayh17@gmail.com": "xiayh0107",
"zhujianxyz@gmail.com": "opriz",
+152
View File
@@ -0,0 +1,152 @@
---
name: kanban-orchestrator
description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, orchestration, routing]
related_skills: [kanban-worker]
---
# Kanban Orchestrator — Decomposition Playbook
> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.
## When to use the board (vs. just doing the work)
Create Kanban tasks when any of these are true:
1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
2. **The work should survive a crash or restart.** Long-running, recurring, or important.
3. **The user might want to interject.** Human-in-the-loop at any step.
4. **Multiple subtasks can run in parallel.** Fan-out for speed.
5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
6. **The audit trail matters.** Board rows persist in SQLite forever.
If *none* of those apply — it's a small one-shot reasoning task — use `delegate_task` instead or answer the user directly.
## The anti-temptation rules
Your job description says "route, don't execute." The rules that enforce that:
- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly" — stop and create a task for the right specialist.
- **For any concrete task, create a Kanban task and assign it.** Every single time.
- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
- **Decompose, route, and summarize — that's the whole job.**
## The standard specialist roster (convention)
Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has — ask if you're unsure.
| Profile | Does | Typical workspace |
|---|---|---|
| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
| `backend-eng` | Writes server-side code | `worktree` |
| `frontend-eng` | Writes client-side code | `worktree` |
| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
| `pm` | Writes specs, acceptance criteria | `scratch` |
## Decomposition playbook
### Step 1 — Understand the goal
Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.
### Step 2 — Sketch the task graph
Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":
```
T1 researcher research: Postgres cost vs current
T2 researcher research: Postgres performance vs current
T3 analyst synthesize migration recommendation parents: T1, T2
T4 writer draft decision memo parents: T3
```
Show this to the user. Let them correct it before you create anything.
### Step 3 — Create tasks and link
```python
t1 = kanban_create(
title="research: Postgres cost vs current",
assignee="researcher",
body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
tenant=os.environ.get("HERMES_TENANT"),
)["task_id"]
t2 = kanban_create(
title="research: Postgres performance vs current",
assignee="researcher",
body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
)["task_id"]
t3 = kanban_create(
title="synthesize migration recommendation",
assignee="analyst",
body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
parents=[t1, t2],
)["task_id"]
t4 = kanban_create(
title="draft decision memo",
assignee="writer",
body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
parents=[t3],
)["task_id"]
```
`parents=[...]` gates promotion — children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination needed; the dispatcher and dependency engine handle it.
### Step 4 — Complete your own task
If you were spawned as a task yourself (e.g. `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:
```python
kanban_complete(
summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
metadata={
"task_graph": {
"T1": {"assignee": "researcher", "parents": []},
"T2": {"assignee": "researcher", "parents": []},
"T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
"T4": {"assignee": "writer", "parents": ["T3"]},
},
},
)
```
### Step 5 — Report back to the user
Tell them what you created in plain prose:
> I've queued 4 tasks:
> - **T1** (researcher): cost comparison
> - **T2** (researcher): performance comparison, in parallel with T1
> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
> - **T4** (writer): turns T3 into a CTO memo
>
> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.
## Common patterns
**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage's `parents=[previous_task]`. Reviewer blocks or completes; if reviewer blocks, the operator unblocks with feedback and respawns.
**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. Dispatcher serializes — translator processes them in priority order, accumulating experience in their own memory.
**Human-in-the-loop:** Any task can `kanban_block()` to wait for input. Dispatcher respawns after `/unblock`. The comment thread carries the full context.
## Pitfalls
**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task — don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.
**Argument order for links.** `kanban_link(parent_id=..., child_id=...)` — parent first. Mixing them up demotes the wrong task to `todo`.
**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.
**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
+134
View File
@@ -0,0 +1,134 @@
---
name: kanban-worker
description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
related_skills: [kanban-orchestrator]
---
# Kanban Worker — Pitfalls and Examples
> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker` — it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.
## Workspace handling
Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:
| Kind | What it is | How to work |
|---|---|---|
| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |
## Tenant isolation
If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:
- Good: `business-a: Acme is our biggest customer`
- Bad (leaks): `Acme is our biggest customer`
## Good summary + metadata shapes
The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:
**Coding task:**
```python
kanban_complete(
summary="shipped rate limiter — token bucket, keys on user_id with IP fallback, 14 tests pass",
metadata={
"changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
"tests_run": 14,
"tests_passed": 14,
"decisions": ["user_id primary, IP fallback for unauthenticated requests"],
},
)
```
**Research task:**
```python
kanban_complete(
summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, Tensorrt-LLM on memory efficiency",
metadata={
"sources_read": 12,
"recommendation": "vLLM",
"benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
},
)
```
**Review task:**
```python
kanban_complete(
summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
metadata={
"pr_number": 123,
"findings": [
{"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
{"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
],
"approved": False,
},
)
```
Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.
## Block reasons that get answered fast
Bad: `"stuck"` — the human has no context.
Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.
```python
kanban_comment(
task_id=os.environ["HERMES_KANBAN_TASK"],
body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
)
kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
```
The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.
## Heartbeats worth sending
Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.
Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Every few minutes max; skip entirely for tasks under ~2 minutes.
## Retry scenarios
If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:
- `outcome: "timed_out"` — the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
- `outcome: "crashed"` — OOM or segfault. Reduce memory footprint.
- `outcome: "spawn_failed"` + `error: "..."` — usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
- `outcome: "reclaimed"` + `summary: "task archived..."` — operator archived the task out from under the previous run; you probably shouldn't be running at all, check status carefully.
- `outcome: "blocked"` — a previous attempt blocked; the unblock comment should be in the thread by now.
## Do NOT
- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
- Create follow-up tasks assigned to yourself — assign to the right specialist.
- Complete a task you didn't actually finish. Block it instead.
## Pitfalls
**Task state can change between dispatch and your startup.** Between when the dispatcher claimed and when your process actually booted, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop — you shouldn't be running.
**Workspace may have stale artifacts.** Especially `dir:` and `worktree` workspaces can have files from previous runs. Read the comment thread — it usually explains why you're running again and what state the workspace is in.
**Don't rely on the CLI when the guidance is available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.
## CLI fallback (for scripting)
Every tool has a CLI equivalent for human operators and scripts:
- `kanban_show``hermes kanban show <id> --json`
- `kanban_complete``hermes kanban complete <id> --summary "..." --metadata '{...}'`
- `kanban_block``hermes kanban block <id> "reason"`
- `kanban_create``hermes kanban create "title" --assignee <profile> [--parent <id>]`
- etc.
Use the tools from inside an agent; the CLI exists for the human at the terminal.
+2 -2
View File
@@ -124,7 +124,7 @@ class TestMcpRegistrationE2E:
mock_conn.request_permission = AsyncMock()
acp_agent._conn = mock_conn
def mock_run_conversation(user_message, conversation_history=None, task_id=None):
def mock_run_conversation(user_message, conversation_history=None, task_id=None, **kwargs):
"""Simulate an agent turn that calls terminal, gets a result, then responds."""
agent = state.agent
@@ -213,7 +213,7 @@ class TestMcpRegistrationE2E:
mock_conn.request_permission = AsyncMock()
acp_agent._conn = mock_conn
def mock_run(user_message, conversation_history=None, task_id=None):
def mock_run(user_message, conversation_history=None, task_id=None, **kwargs):
agent = state.agent
# Fire two tool calls
if agent.tool_progress_callback:
+2 -1
View File
@@ -730,6 +730,7 @@ class TestSlashCommands:
]
state.agent.compression_enabled = True
state.agent._cached_system_prompt = "system"
state.agent.tools = None
original_session_db = object()
state.agent._session_db = original_session_db
@@ -746,7 +747,7 @@ class TestSlashCommands:
with (
patch.object(agent.session_manager, "save_session") as mock_save,
patch(
"agent.model_metadata.estimate_messages_tokens_rough",
"agent.model_metadata.estimate_request_tokens_rough",
side_effect=[40, 12],
),
):
+75
View File
@@ -8,6 +8,7 @@ from types import SimpleNamespace
import pytest
from unittest.mock import MagicMock, patch
from acp_adapter import session as acp_session
from acp_adapter.session import SessionManager, SessionState
from hermes_state import SessionDB
@@ -42,6 +43,27 @@ class TestCreateSession:
state = manager.create_session(cwd="/tmp/work")
assert calls == [(state.session_id, "/tmp/work")]
def test_register_task_cwd_translates_windows_drive_for_wsl_tools(self, monkeypatch):
captured = {}
def fake_register_task_env_overrides(task_id, overrides):
captured["task_id"] = task_id
captured["overrides"] = overrides
monkeypatch.setattr("hermes_constants._wsl_detected", True)
monkeypatch.setattr(
"tools.terminal_tool.register_task_env_overrides",
fake_register_task_env_overrides,
)
acp_session._register_task_cwd("session-1", r"E:\Projects\AI\paperclip")
assert captured == {
"task_id": "session-1",
"overrides": {"cwd": "/mnt/e/Projects/AI/paperclip"},
}
def test_session_ids_are_unique(self, manager):
s1 = manager.create_session()
s2 = manager.create_session()
@@ -56,6 +78,59 @@ class TestCreateSession:
assert manager.get_session("does-not-exist") is None
# ---------------------------------------------------------------------------
# WSL cwd translation
# ---------------------------------------------------------------------------
class TestWslCwdTranslation:
def test_translate_acp_cwd_converts_windows_drive_path_when_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd(r"E:\Projects\AI\paperclip") == "/mnt/e/Projects/AI/paperclip"
def test_translate_acp_cwd_handles_forward_slashes_when_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd("D:/work/project") == "/mnt/d/work/project"
def test_translate_acp_cwd_leaves_windows_drive_path_unchanged_off_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", False)
assert acp_session._translate_acp_cwd(r"E:\Projects\AI\paperclip") == r"E:\Projects\AI\paperclip"
def test_translate_acp_cwd_leaves_posix_path_unchanged_on_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd("/mnt/e/Projects/AI/paperclip") == "/mnt/e/Projects/AI/paperclip"
def test_create_session_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
state = manager.create_session(cwd=r"E:\Projects\AI\paperclip")
assert state.cwd == "/mnt/e/Projects/AI/paperclip"
def test_fork_session_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
original = manager.create_session(cwd="/tmp/base")
forked = manager.fork_session(original.session_id, cwd=r"D:\work\project")
assert forked is not None
assert forked.cwd == "/mnt/d/work/project"
def test_update_cwd_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
state = manager.create_session(cwd="/tmp/old")
updated = manager.update_cwd(state.session_id, cwd=r"C:\Users\foo\project")
assert updated is not None
assert updated.cwd == "/mnt/c/Users/foo/project"
# ---------------------------------------------------------------------------
# fork
# ---------------------------------------------------------------------------
+150
View File
@@ -0,0 +1,150 @@
from types import SimpleNamespace
import pytest
from acp.schema import TextContentBlock
from acp_adapter.server import HermesACPAgent
from acp_adapter.session import SessionManager
class FakeAgent:
def __init__(self):
self.model = "fake-model"
self.provider = "fake-provider"
self.enabled_toolsets = ["hermes-acp"]
self.disabled_toolsets = []
self.tools = []
self.valid_tool_names = set()
self.steers = []
self.runs = []
def steer(self, text):
self.steers.append(text)
return True
def run_conversation(self, *, user_message, conversation_history, task_id, **kwargs):
self.runs.append(user_message)
messages = list(conversation_history or [])
messages.append({"role": "user", "content": user_message})
final = f"ran: {user_message}"
messages.append({"role": "assistant", "content": final})
return {"final_response": final, "messages": messages}
class CaptureConn:
def __init__(self):
self.updates = []
async def session_update(self, *args, **kwargs):
if kwargs:
self.updates.append((kwargs.get("session_id"), kwargs.get("update")))
else:
self.updates.append((args[0], args[1]))
async def request_permission(self, *args, **kwargs):
return SimpleNamespace(outcome="allow")
class NoopDb:
def get_session(self, *_args, **_kwargs):
return None
def create_session(self, *_args, **_kwargs):
return None
def update_session(self, *_args, **_kwargs):
return None
def make_agent_and_state():
fake = FakeAgent()
manager = SessionManager(agent_factory=lambda **kwargs: fake, db=NoopDb())
acp_agent = HermesACPAgent(session_manager=manager)
state = manager.create_session(cwd=".")
conn = CaptureConn()
acp_agent.on_connect(conn)
return acp_agent, state, fake, conn
@pytest.mark.asyncio
async def test_acp_steer_slash_command_injects_into_running_agent():
acp_agent, state, fake, _conn = make_agent_and_state()
state.is_running = True
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer prefer the simpler fix")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == ["prefer the simpler fix"]
assert fake.runs == []
@pytest.mark.asyncio
async def test_acp_steer_after_zed_interrupt_replays_interrupted_prompt_with_guidance():
acp_agent, state, fake, _conn = make_agent_and_state()
state.interrupted_prompt_text = "write hi to a text file"
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer write HELLO instead")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == []
assert fake.runs == [
"write hi to a text file\n\nUser correction/guidance after interrupt: write HELLO instead"
]
assert state.interrupted_prompt_text == ""
@pytest.mark.asyncio
async def test_acp_steer_on_idle_session_runs_as_regular_prompt():
# /steer on an idle session (no running turn, nothing to salvage) should
# run the steer payload as a normal user prompt — NOT silently append it
# to state.queued_prompts. Without this, users on Zed / other ACP clients
# see their /steer turn into "queued for the next turn" when they never
# typed /queue. Matches gateway/run.py ~L4898 idle-/steer behavior.
acp_agent, state, fake, _conn = make_agent_and_state()
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer summarize the README")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == []
assert fake.runs == ["summarize the README"]
assert state.queued_prompts == []
@pytest.mark.asyncio
async def test_acp_queue_slash_command_adds_next_turn_without_running_now():
acp_agent, state, fake, _conn = make_agent_and_state()
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/queue run the tests after this")],
)
assert response.stop_reason == "end_turn"
assert state.queued_prompts == ["run the tests after this"]
assert fake.runs == []
@pytest.mark.asyncio
async def test_acp_prompt_drains_queued_turns_after_current_run():
acp_agent, state, fake, conn = make_agent_and_state()
state.queued_prompts.append("then run tests")
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="make the change")],
)
assert response.stop_reason == "end_turn"
assert fake.runs == ["make the change", "then run tests"]
assert state.queued_prompts == []
agent_messages = [u for _sid, u in conn.updates if getattr(u, "session_update", None) == "agent_message_chunk"]
assert len(agent_messages) >= 2
@@ -427,3 +427,68 @@ class TestProvidersDictApiModeAnthropicMessages:
assert isinstance(sync_client, OpenAI)
async_client, _ = resolve_provider_client("localchat", async_mode=True)
assert isinstance(async_client, AsyncOpenAI)
class TestCustomProviderAliasCollision:
"""A user-declared custom_providers entry whose name matches a built-in
*alias* (not a canonical provider) must win over the built-in.
Regression guard for #15743: users who defined fallback_model pointing at
a custom_providers entry named ``kimi`` were having requests routed to
the built-in kimi-coding endpoint because ``_normalize_aux_provider``
rewrote ``kimi`` ``kimi-coding`` before the named-custom lookup.
"""
def test_custom_named_kimi_wins_over_builtin_alias(self, tmp_path):
_write_config(tmp_path, {
"model": {"provider": "openrouter", "default": "anthropic/claude-sonnet-4.6"},
"custom_providers": [
{
"name": "kimi",
"base_url": "https://my-custom-kimi.example.com/v1",
"api_key": "my-kimi-key",
"models": {"my-kimi-model": {"context_length": 200000}},
},
],
})
from agent.auxiliary_client import resolve_provider_client
from openai import OpenAI
client, model = resolve_provider_client("kimi", model="my-kimi-model", raw_codex=True)
assert isinstance(client, OpenAI)
assert "my-custom-kimi.example.com" in str(client.base_url)
assert client.api_key == "my-kimi-key"
assert model == "my-kimi-model"
def test_bare_kimi_without_custom_still_routes_to_builtin(self, tmp_path, monkeypatch):
"""Regression guard: bare 'kimi' with no custom entry must still
reach the built-in kimi-coding provider."""
_write_config(tmp_path, {
"model": {"provider": "openrouter", "default": "anthropic/claude-sonnet-4.6"},
})
monkeypatch.setenv("KIMI_API_KEY", "builtin-kimi-key")
from agent.auxiliary_client import resolve_provider_client
client, _ = resolve_provider_client("kimi", model="kimi-k2-0905-preview", raw_codex=True)
assert client is not None
base_url = str(client.base_url)
# Built-in kimi-coding points at api.moonshot.ai
assert "moonshot" in base_url or "kimi" in base_url, f"unexpected base_url {base_url!r}"
def test_explicit_overrides_applied_on_api_key_branch(self, tmp_path, monkeypatch):
"""Explicit base_url/api_key from the caller must override the
registered provider's defaults on the API-key branch. Used by
_try_activate_fallback to route a fallback through a built-in
provider name but targeting a user-supplied endpoint."""
_write_config(tmp_path, {
"model": {"provider": "openrouter", "default": "anthropic/claude-sonnet-4.6"},
})
monkeypatch.setenv("KIMI_API_KEY", "builtin-kimi-key")
from agent.auxiliary_client import resolve_provider_client
from openai import OpenAI
client, _ = resolve_provider_client(
"kimi-coding", model="kimi-k2", raw_codex=True,
explicit_base_url="https://override.example.com",
explicit_api_key="override-key",
)
assert isinstance(client, OpenAI)
assert "override.example.com" in str(client.base_url)
assert client.api_key == "override-key"
+52
View File
@@ -640,6 +640,30 @@ class TestCompressWithClient:
for tc in msg["tool_calls"]:
assert tc["id"] in answered_ids
def test_sanitizer_matches_responses_call_id_when_id_differs(self, compressor):
msgs = [
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "fc_123",
"call_id": "call_123",
"response_item_id": "fc_123",
"type": "function",
"function": {"name": "search_files", "arguments": "{}"},
}
],
},
{"role": "tool", "tool_call_id": "call_123", "content": "result"},
]
sanitized = compressor._sanitize_tool_pairs(msgs)
assert [m.get("tool_call_id") for m in sanitized if m.get("role") == "tool"] == [
"call_123"
]
def test_summary_role_avoids_consecutive_user_messages(self):
"""Summary role should alternate with the last head message to avoid consecutive same-role messages."""
mock_client = MagicMock()
@@ -1119,6 +1143,34 @@ class TestTokenBudgetTailProtection:
# At least one old tool result should have been pruned
assert pruned >= 1
def test_prune_short_conv_protects_entire_tail(self, budget_compressor):
"""Regression guard for PR #17025.
When ``len(messages) <= protect_tail_count`` and a token budget is
also set, every message must be protected. The previous code used
``min(protect_tail_count, len(result) - 1)`` which capped the floor
one below the full length, leaving the oldest message eligible for
pruning.
"""
c = budget_compressor
# 4 messages, protect_tail_count=4 -- nothing should be pruned.
# Oldest message is a large tool result; on the buggy path it falls
# outside the protected window and gets summarized.
messages = [
{"role": "tool", "content": "x" * 5000, "tool_call_id": "c0"},
{"role": "assistant", "content": "ack"},
{"role": "user", "content": "recent"},
{"role": "assistant", "content": "reply"},
]
result, pruned = c._prune_old_tool_results(
messages,
protect_tail_count=4,
protect_tail_tokens=1_000_000, # budget large enough to protect all
)
assert pruned == 0
# Tool result at index 0 must be preserved verbatim
assert result[0]["content"] == "x" * 5000
def test_prune_without_token_budget_uses_message_count(self, budget_compressor):
"""Without protect_tail_tokens, falls back to message-count behavior."""
c = budget_compressor
+119 -2
View File
@@ -86,9 +86,22 @@ def test_curator_config_overrides(curator_env, monkeypatch):
# should_run_now
# ---------------------------------------------------------------------------
def test_first_run_always_eligible(curator_env):
def test_first_run_defers(curator_env):
"""The FIRST observation of the curator (fresh install, no state file)
must NOT trigger an immediate run. The curator is designed to run after
a full ``interval_hours`` of skill activity, not on the first background
tick after installation. Fixes #18373.
"""
c = curator_env["curator"]
assert c.should_run_now() is True
# No state file — should defer and seed last_run_at.
assert c.should_run_now() is False
state = c.load_state()
assert state.get("last_run_at") is not None, (
"first observation should seed last_run_at so the interval clock "
"starts ticking instead of firing immediately next tick"
)
# A second immediate call still returns False (seeded, not yet stale).
assert c.should_run_now() is False
def test_recent_run_blocks(curator_env):
@@ -265,6 +278,77 @@ def test_run_review_records_state(curator_env):
assert state["last_run_summary"] is not None
def test_dry_run_does_not_advance_state(curator_env, monkeypatch):
"""Dry-run previews must not bump last_run_at or run_count. A preview
shouldn't defer the next scheduled real pass or look like a real run in
`hermes curator status`. Fixes #18373.
"""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Stub the LLM so the test doesn't need a provider.
monkeypatch.setattr(
c, "_run_llm_review",
lambda prompt: {
"final": "", "summary": "dry preview", "model": "", "provider": "",
"tool_calls": [], "error": None,
},
)
c.run_curator_review(synchronous=True, dry_run=True)
state = c.load_state()
assert state.get("last_run_at") is None, "dry-run must not seed last_run_at"
assert state.get("run_count", 0) == 0, "dry-run must not bump run_count"
assert "dry-run" in (state.get("last_run_summary") or ""), (
"dry-run summary should be labeled so status output is unambiguous"
)
def test_dry_run_injects_report_only_banner(curator_env, monkeypatch):
"""The dry-run prompt must carry a banner instructing the LLM not to
call any mutating tool. This is defense in depth the caller also
skips automatic transitions but the LLM prompt is the only guard
against the model calling skill_manage directly."""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
captured = {}
def _stub(prompt):
captured["prompt"] = prompt
return {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None}
monkeypatch.setattr(c, "_run_llm_review", _stub)
c.run_curator_review(synchronous=True, dry_run=True)
assert "DRY-RUN" in captured["prompt"]
assert "DO NOT" in captured["prompt"]
def test_dry_run_skips_automatic_transitions(curator_env, monkeypatch):
"""Dry-run must not call apply_automatic_transitions — the auto pass
archives skills deterministically, and a preview must not touch the
filesystem."""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
called = {"n": 0}
def _explode(*_a, **_kw):
called["n"] += 1
return {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
monkeypatch.setattr(c, "apply_automatic_transitions", _explode)
monkeypatch.setattr(
c, "_run_llm_review",
lambda p: {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None},
)
c.run_curator_review(synchronous=True, dry_run=True)
assert called["n"] == 0, "dry-run must skip apply_automatic_transitions"
def test_run_review_synchronous_invokes_llm_stub(curator_env, monkeypatch):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
@@ -327,12 +411,32 @@ def test_maybe_run_curator_runs_when_eligible(curator_env, monkeypatch):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Seed last_run_at far in the past so the interval gate opens — the
# "no state" path intentionally defers the first run now (#18373).
long_ago = datetime.now(timezone.utc) - timedelta(hours=c.get_interval_hours() * 2)
c.save_state({"last_run_at": long_ago.isoformat(), "paused": False})
# Force idle over threshold
result = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result is not None
assert "started_at" in result
def test_maybe_run_curator_defers_on_fresh_install(curator_env):
"""Fresh install (no curator state file) must NOT fire the curator on
the first gateway tick. The first observation seeds last_run_at and
returns None. Fixes #18373."""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Infinite idle — the only thing that should block the run is the new
# deferred-first-run gate.
result = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result is None
# And the next tick still defers (we seeded last_run_at to "now").
result2 = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result2 is None
def test_maybe_run_curator_swallows_exceptions(curator_env, monkeypatch):
c = curator_env["curator"]
@@ -363,6 +467,19 @@ def test_state_atomic_write_no_tmp_leftovers(curator_env):
assert not p.name.startswith(".curator_state_"), f"tmp leftover: {p.name}"
def test_state_preserves_last_report_path(curator_env):
c = curator_env["curator"]
c.save_state({
"last_run_at": "2026-04-30T12:00:00+00:00",
"last_run_summary": "ok",
"last_report_path": "/tmp/curator-report",
"paused": False,
"run_count": 1,
})
state = c.load_state()
assert state["last_report_path"] == "/tmp/curator-report"
def test_curator_review_prompt_has_invariants():
"""Core invariants must be in the review prompt text."""
from agent.curator import CURATOR_REVIEW_PROMPT
+316
View File
@@ -0,0 +1,316 @@
"""Tests for agent/curator_backup.py — snapshot + rollback of the skills tree."""
from __future__ import annotations
import importlib
import json
import os
import sys
import tarfile
import tempfile
from pathlib import Path
import pytest
@pytest.fixture
def backup_env(monkeypatch, tmp_path):
"""Isolate HERMES_HOME + reload modules so every test starts clean."""
home = tmp_path / ".hermes"
home.mkdir()
(home / "skills").mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
# Reload so get_hermes_home picks up the env var fresh.
import hermes_constants
importlib.reload(hermes_constants)
from agent import curator_backup
importlib.reload(curator_backup)
return {"home": home, "skills": home / "skills", "cb": curator_backup}
def _write_skill(skills_dir: Path, name: str, body: str = "body") -> Path:
d = skills_dir / name
d.mkdir(parents=True, exist_ok=True)
(d / "SKILL.md").write_text(
f"---\nname: {name}\ndescription: t\nversion: 1.0\n---\n\n{body}\n",
encoding="utf-8",
)
return d
# ---------------------------------------------------------------------------
# snapshot_skills
# ---------------------------------------------------------------------------
def test_snapshot_creates_tarball_and_manifest(backup_env):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
_write_skill(backup_env["skills"], "beta")
snap = cb.snapshot_skills(reason="test")
assert snap is not None, "snapshot should succeed with a populated skills dir"
assert (snap / "skills.tar.gz").exists()
manifest = json.loads((snap / "manifest.json").read_text())
assert manifest["reason"] == "test"
assert manifest["skill_files"] == 2
assert manifest["archive_bytes"] > 0
def test_snapshot_excludes_backups_dir_itself(backup_env):
"""The backup must NOT contain .curator_backups/ — that would recurse
with every subsequent snapshot and balloon disk usage."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
snap1 = cb.snapshot_skills(reason="first")
assert snap1 is not None
snap2 = cb.snapshot_skills(reason="second")
assert snap2 is not None
with tarfile.open(snap2 / "skills.tar.gz") as tf:
names = tf.getnames()
assert not any(n.startswith(".curator_backups") for n in names), (
"second snapshot must not contain the first snapshot recursively"
)
def test_snapshot_excludes_hub_dir(backup_env):
""".hub/ is managed by the skills hub. Rolling it back would break
lockfile invariants, so the snapshot omits it entirely."""
cb = backup_env["cb"]
hub = backup_env["skills"] / ".hub"
hub.mkdir()
(hub / "lock.json").write_text("{}")
_write_skill(backup_env["skills"], "alpha")
snap = cb.snapshot_skills(reason="t")
assert snap is not None
with tarfile.open(snap / "skills.tar.gz") as tf:
names = tf.getnames()
assert not any(n.startswith(".hub") for n in names)
def test_snapshot_disabled_returns_none(backup_env, monkeypatch):
cb = backup_env["cb"]
monkeypatch.setattr(cb, "is_enabled", lambda: False)
_write_skill(backup_env["skills"], "alpha")
assert cb.snapshot_skills() is None
# And no backup dir should have been created
assert not (backup_env["skills"] / ".curator_backups").exists()
def test_snapshot_uniquifies_when_same_second(backup_env, monkeypatch):
"""Two snapshots in the same wallclock second must not clobber each
other. The module appends a counter to the second snapshot's id."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
frozen = "2026-05-01T12-00-00Z"
monkeypatch.setattr(cb, "_utc_id", lambda now=None: frozen)
s1 = cb.snapshot_skills(reason="a")
s2 = cb.snapshot_skills(reason="b")
assert s1 is not None and s2 is not None
assert s1.name == frozen
assert s2.name == f"{frozen}-01"
def test_snapshot_prunes_to_keep_count(backup_env, monkeypatch):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
monkeypatch.setattr(cb, "get_keep", lambda: 3)
# Create 5 snapshots with monotonically increasing fake ids
ids = [f"2026-05-0{i}T00-00-00Z" for i in range(1, 6)]
for i, fid in enumerate(ids):
monkeypatch.setattr(cb, "_utc_id", lambda now=None, _f=fid: _f)
cb.snapshot_skills(reason=f"n{i}")
remaining = sorted(p.name for p in (backup_env["skills"] / ".curator_backups").iterdir())
# Newest 3 kept (lex order == date order for this id format)
assert remaining == ids[2:], f"expected newest 3, got {remaining}"
# ---------------------------------------------------------------------------
# list_backups / _resolve_backup
# ---------------------------------------------------------------------------
def test_list_backups_empty(backup_env):
cb = backup_env["cb"]
assert cb.list_backups() == []
def test_list_backups_returns_manifest_data(backup_env):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
cb.snapshot_skills(reason="m1")
rows = cb.list_backups()
assert len(rows) == 1
assert rows[0]["reason"] == "m1"
assert rows[0]["skill_files"] == 1
def test_resolve_backup_newest_when_no_id(backup_env, monkeypatch):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
ids = ["2026-05-01T00-00-00Z", "2026-05-02T00-00-00Z"]
for fid in ids:
monkeypatch.setattr(cb, "_utc_id", lambda now=None, _f=fid: _f)
cb.snapshot_skills()
resolved = cb._resolve_backup(None)
assert resolved is not None
assert resolved.name == "2026-05-02T00-00-00Z", (
"resolve(None) must return newest regular snapshot"
)
def test_resolve_backup_unknown_id_returns_none(backup_env):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
cb.snapshot_skills()
assert cb._resolve_backup("not-an-id") is None
# ---------------------------------------------------------------------------
# rollback
# ---------------------------------------------------------------------------
def test_rollback_restores_deleted_skill(backup_env):
"""The whole point of this feature: user loses a skill, rollback
brings it back."""
cb = backup_env["cb"]
skills = backup_env["skills"]
user_skill = _write_skill(skills, "my-personal-workflow", body="important content")
cb.snapshot_skills(reason="pre-simulated-curator")
# Simulate curator archiving it out of existence
import shutil as _sh
_sh.rmtree(user_skill)
assert not user_skill.exists()
ok, msg, _ = cb.rollback()
assert ok, f"rollback failed: {msg}"
assert user_skill.exists(), "my-personal-workflow should be restored"
assert "important content" in (user_skill / "SKILL.md").read_text()
def test_rollback_is_itself_undoable(backup_env):
"""A rollback creates its own safety snapshot before replacing the
tree, so the user can undo a mistaken rollback. The safety snapshot
is a real tarball with reason='pre-rollback to <id>' it's
listed by list_backups() just like any other snapshot and can be
restored the same way."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "v1")
cb.snapshot_skills(reason="snapshot-of-v1")
# Overwrite with a new skill state
import shutil as _sh
_sh.rmtree(skills / "v1")
_write_skill(skills, "v2")
ok, _, _ = cb.rollback()
assert ok
assert (skills / "v1").exists()
# list_backups should show a safety snapshot tagged "pre-rollback to <target-id>"
rows = cb.list_backups()
pre_rollback_entries = [r for r in rows if "pre-rollback" in (r.get("reason") or "")]
assert len(pre_rollback_entries) >= 1, (
f"expected a pre-rollback safety snapshot in list_backups(), got: "
f"{[(r.get('id'), r.get('reason')) for r in rows]}"
)
# And the transient staging dir must be gone (it's implementation detail)
backups_dir = skills / ".curator_backups"
staging_dirs = [p for p in backups_dir.iterdir() if p.name.startswith(".rollback-staging-")]
assert staging_dirs == [], (
f"staging dir should be cleaned up on success, got: {staging_dirs}"
)
def test_rollback_no_snapshots_returns_error(backup_env):
cb = backup_env["cb"]
ok, msg, _ = cb.rollback()
assert not ok
assert "no matching backup" in msg.lower() or "no snapshot" in msg.lower()
def test_rollback_rejects_unsafe_tarball(backup_env, monkeypatch):
"""Tarballs with absolute paths or .. components must be refused even
if someone crafts a malicious snapshot. Defense in depth normal
curator snapshots never produce these."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "alpha")
cb.snapshot_skills(reason="legit")
# Hand-craft a malicious tarball replacing the legit one
rows = cb.list_backups()
snap_dir = Path(rows[0]["path"])
mal = snap_dir / "skills.tar.gz"
mal.unlink()
with tarfile.open(mal, "w:gz") as tf:
evil = tempfile.NamedTemporaryFile(delete=False, suffix=".md")
evil.write(b"evil")
evil.close()
tf.add(evil.name, arcname="../../etc/evil.md")
os.unlink(evil.name)
ok, msg, _ = cb.rollback()
assert not ok
assert "unsafe" in msg.lower() or "refus" in msg.lower() or "extract" in msg.lower()
# ---------------------------------------------------------------------------
# Integration with run_curator_review
# ---------------------------------------------------------------------------
def test_real_run_takes_pre_snapshot(backup_env, monkeypatch):
"""A real (non-dry) curator pass must snapshot the tree before calling
apply_automatic_transitions. This is the safety net #18373 asked for."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "alpha")
# Reload curator module against the freshly-env'd hermes_constants
from agent import curator
importlib.reload(curator)
# Stub out LLM review and auto transitions — we only care about the
# snapshot side-effect.
monkeypatch.setattr(
curator, "_run_llm_review",
lambda p: {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None},
)
monkeypatch.setattr(
curator, "apply_automatic_transitions",
lambda now=None: {"checked": 1, "marked_stale": 0, "archived": 0, "reactivated": 0},
)
curator.run_curator_review(synchronous=True)
# Pre-run snapshot should exist
rows = cb.list_backups()
assert any(r.get("reason") == "pre-curator-run" for r in rows), (
f"expected a pre-curator-run snapshot, got {[r.get('reason') for r in rows]}"
)
def test_dry_run_skips_snapshot(backup_env, monkeypatch):
"""Dry-run previews must not spend disk on a snapshot — they don't
mutate anything, so there's nothing to back up."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "alpha")
from agent import curator
importlib.reload(curator)
monkeypatch.setattr(
curator, "_run_llm_review",
lambda p: {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None},
)
curator.run_curator_review(synchronous=True, dry_run=True)
rows = cb.list_backups()
assert not any(r.get("reason") == "pre-curator-run" for r in rows), (
"dry-run must not create a pre-run snapshot"
)
+164
View File
@@ -270,3 +270,167 @@ def test_state_transitions_captured_in_report(curator_env):
assert "State transitions" in md
assert "getting-old" in md
assert "active → stale" in md
# ---------------------------------------------------------------------------
# Cron job skill reference rewriting (curator ↔ cron integration)
# ---------------------------------------------------------------------------
#
# When the curator consolidates skill X into umbrella Y during a run, any
# cron job that listed X in its ``skills`` field would fail to load X at
# run time — the scheduler logs a warning and skips it, so the scheduled
# job runs without the instructions it was scheduled to follow. These
# tests verify that _write_run_report calls into cron.jobs to repair
# those references and records what it did in both run.json and
# cron_rewrites.json.
@pytest.fixture
def curator_env_with_cron(curator_env, monkeypatch):
"""Extend curator_env with an initialized + repointed cron.jobs module."""
home = curator_env["home"]
(home / "cron").mkdir(exist_ok=True)
(home / "cron" / "output").mkdir(exist_ok=True)
import importlib
import cron.jobs as jobs_mod
importlib.reload(jobs_mod)
monkeypatch.setattr(jobs_mod, "HERMES_DIR", home)
monkeypatch.setattr(jobs_mod, "CRON_DIR", home / "cron")
monkeypatch.setattr(jobs_mod, "JOBS_FILE", home / "cron" / "jobs.json")
monkeypatch.setattr(jobs_mod, "OUTPUT_DIR", home / "cron" / "output")
return {**curator_env, "jobs": jobs_mod}
def test_curator_rewrites_cron_skills_when_skill_consolidated(curator_env_with_cron):
"""A skill consolidated into an umbrella should be rewritten in any
cron job's skills list; the rewrite should be visible in run.json
and cron_rewrites.json."""
curator = curator_env_with_cron["curator"]
jobs = curator_env_with_cron["jobs"]
# Create a cron job that depends on a soon-to-be-consolidated skill
job = jobs.create_job(
prompt="",
schedule="every 1h",
skills=["foo"],
name="foo-watcher",
)
# Simulate a curator pass that consolidated `foo` → `foo-umbrella`
before = [{"name": "foo", "state": "active", "pinned": False}]
after = [{"name": "foo-umbrella", "state": "active", "pinned": False}]
run_dir = curator._write_run_report(
started_at=datetime.now(timezone.utc),
elapsed_seconds=3.0,
auto_counts={"checked": 1, "marked_stale": 0, "archived": 0, "reactivated": 0},
auto_summary="no changes",
before_report=before,
before_names={"foo"},
after_report=after,
llm_meta=_make_llm_meta(
final="Consolidated foo into foo-umbrella.",
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "write_file",
"name": "foo-umbrella",
"file_path": "references/foo.md",
"file_content": "from foo",
}),
},
],
),
)
# Cron job is rewritten on disk
loaded = jobs.get_job(job["id"])
assert loaded["skills"] == ["foo-umbrella"]
assert loaded["skill"] == "foo-umbrella"
# Rewrite is recorded in run.json
payload = json.loads((run_dir / "run.json").read_text())
assert payload["cron_rewrites"]["jobs_updated"] == 1
assert payload["counts"]["cron_jobs_rewritten"] == 1
rewrites = payload["cron_rewrites"]["rewrites"]
assert len(rewrites) == 1
assert rewrites[0]["mapped"] == {"foo": "foo-umbrella"}
# Separate cron_rewrites.json is written for convenience
cron_file = run_dir / "cron_rewrites.json"
assert cron_file.exists()
detail = json.loads(cron_file.read_text())
assert detail["jobs_updated"] == 1
# Markdown surfaces the change
md = (run_dir / "REPORT.md").read_text()
assert "Cron job skill references rewritten" in md
assert "foo-watcher" in md
assert "foo-umbrella" in md
def test_curator_drops_pruned_skill_from_cron_job(curator_env_with_cron):
"""A pruned (no-umbrella) skill should be dropped from the cron
job's skill list entirely — there's no forwarding target."""
curator = curator_env_with_cron["curator"]
jobs = curator_env_with_cron["jobs"]
job = jobs.create_job(
prompt="",
schedule="every 1h",
skills=["keep", "stale-one"],
)
before = [{"name": "stale-one", "state": "active", "pinned": False}]
after: list = [] # stale-one was archived with no target
run_dir = curator._write_run_report(
started_at=datetime.now(timezone.utc),
elapsed_seconds=1.0,
auto_counts={"checked": 1, "marked_stale": 0, "archived": 1, "reactivated": 0},
auto_summary="1 archived",
before_report=before,
before_names={"stale-one"},
after_report=after,
llm_meta=_make_llm_meta(), # no tool calls → classifier marks it pruned
)
loaded = jobs.get_job(job["id"])
assert loaded["skills"] == ["keep"]
payload = json.loads((run_dir / "run.json").read_text())
assert payload["cron_rewrites"]["jobs_updated"] == 1
rewrites = payload["cron_rewrites"]["rewrites"]
assert rewrites[0]["dropped"] == ["stale-one"]
def test_curator_report_has_no_cron_section_when_nothing_changes(curator_env_with_cron):
"""When the curator run doesn't touch any skills, cron jobs are
untouched and cron_rewrites.json is not even written."""
curator = curator_env_with_cron["curator"]
jobs = curator_env_with_cron["jobs"]
jobs.create_job(prompt="", schedule="every 1h", skills=["foo"])
run_dir = curator._write_run_report(
started_at=datetime.now(timezone.utc),
elapsed_seconds=1.0,
auto_counts={"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0},
auto_summary="no changes",
before_report=[{"name": "foo", "state": "active", "pinned": False}],
before_names={"foo"},
after_report=[{"name": "foo", "state": "active", "pinned": False}],
llm_meta=_make_llm_meta(),
)
# No rewrites → no separate file, no section in md
assert not (run_dir / "cron_rewrites.json").exists()
md = (run_dir / "REPORT.md").read_text()
assert "Cron job skill references rewritten" not in md
payload = json.loads((run_dir / "run.json").read_text())
assert payload["cron_rewrites"]["jobs_updated"] == 0
assert payload["counts"]["cron_jobs_rewritten"] == 0
+158 -13
View File
@@ -115,9 +115,15 @@ class TestMissingTypeFilled:
class TestAnyOfParentType:
"""Rule 2: type must not appear at the anyOf parent level."""
"""Rule 2: type must not appear at the anyOf parent level.
def test_parent_type_stripped_when_anyof_present(self):
When an anyOf contains a null-type branch, Moonshot rejects it.
The sanitizer collapses the anyOf: single non-null branch is promoted,
multiple non-null branches have null removed from the list.
"""
def test_anyof_null_branch_collapsed_to_single_type(self):
"""anyOf [string, null] → plain string (anyOf removed)."""
params = {
"type": "object",
"properties": {
@@ -132,25 +138,46 @@ class TestAnyOfParentType:
}
out = sanitize_moonshot_tool_parameters(params)
from_format = out["properties"]["from_format"]
assert "type" not in from_format
assert "anyOf" in from_format
# null branch removed, anyOf collapsed to the single non-null type
assert "anyOf" not in from_format
assert from_format["type"] == "string"
def test_anyof_children_missing_type_get_filled(self):
def test_anyof_multiple_non_null_preserved(self):
"""anyOf [string, integer] (no null) → kept as-is with parent type stripped."""
params = {
"type": "object",
"properties": {
"value": {
"mode": {
"anyOf": [
{"type": "string"},
{"description": "A typeless option"},
{"type": "integer"},
],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
children = out["properties"]["value"]["anyOf"]
assert children[0]["type"] == "string"
assert "type" in children[1]
mode = out["properties"]["mode"]
assert "anyOf" in mode
assert "type" not in mode # parent type stripped
def test_anyof_enum_with_null_collapsed(self):
"""anyOf [{enum: [...], type: string}, {type: null}] → enum + type only."""
params = {
"type": "object",
"properties": {
"db_type": {
"anyOf": [
{"enum": ["mysql", "postgresql", ""]},
{"type": "null"},
],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "anyOf" not in db_type
assert db_type["type"] == "string"
assert db_type["enum"] == ["mysql", "postgresql"] # "" stripped by enum cleanup
class TestTopLevelGuarantees:
@@ -226,7 +253,7 @@ class TestRealWorldMCPShape:
"""End-to-end: a realistic MCP-style schema that used to 400 on Moonshot."""
def test_combined_rewrites(self):
# Shape: missing type on a property, anyOf with parent type, array
# Shape: missing type on a property, anyOf with parent type + null, array
# items without type — all in one tool.
params = {
"type": "object",
@@ -248,7 +275,125 @@ class TestRealWorldMCPShape:
}
out = sanitize_moonshot_tool_parameters(params)
assert out["properties"]["query"]["type"] == "string"
assert "type" not in out["properties"]["filter"]
assert out["properties"]["filter"]["anyOf"][0]["type"] == "string"
# anyOf with null collapsed to plain type
assert "anyOf" not in out["properties"]["filter"]
assert out["properties"]["filter"]["type"] == "string"
assert out["properties"]["tags"]["items"]["type"] == "string"
assert out["required"] == ["query"]
class TestEnumNullStripping:
"""Rule 3: Moonshot rejects null/empty-string inside enum arrays."""
def test_enum_null_value_stripped(self):
"""enum containing Python None must have it removed for Moonshot."""
params = {
"type": "object",
"properties": {
"db_type": {
"type": "string",
"enum": ["mysql", "postgresql", None],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert None not in db_type["enum"]
assert "mysql" in db_type["enum"]
assert "postgresql" in db_type["enum"]
def test_enum_empty_string_stripped(self):
"""enum containing empty string '' must have it removed for Moonshot."""
params = {
"type": "object",
"properties": {
"db_type": {
"type": "string",
"enum": ["mysql", "postgresql", ""],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "" not in db_type["enum"]
assert db_type["enum"] == ["mysql", "postgresql"]
def test_enum_all_null_becomes_no_enum(self):
"""enum that only had null/empty values is dropped entirely."""
params = {
"type": "object",
"properties": {
"val": {
"type": "string",
"enum": [None, ""],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
assert "enum" not in out["properties"]["val"]
def test_dataslayer_db_type_after_mcp_normalize(self):
"""Real-world: dataslayer db_type anyOf+enum after MCP normalization."""
# This is the exact shape after _normalize_mcp_input_schema runs:
# anyOf collapsed, but enum still has null + empty string
params = {
"type": "object",
"properties": {
"datasource": {"type": "string"},
"db_type": {
"enum": ["mysql", "mariadb", "postgresql", "sqlserver", "oracle", "", None],
"type": "string",
"nullable": True,
"default": None,
},
},
"required": ["datasource"],
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "nullable" not in db_type, "nullable keyword must be stripped"
assert None not in db_type["enum"]
assert "" not in db_type["enum"]
assert db_type["enum"] == ["mysql", "mariadb", "postgresql", "sqlserver", "oracle"]
assert db_type["type"] == "string"
def test_enum_on_object_type_not_stripped(self):
"""enum on non-scalar types (object) should NOT be touched."""
params = {
"type": "object",
"properties": {
"config": {
"type": "object",
"properties": {},
"enum": [{}, None],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
# object-typed enum should pass through unchanged
assert "enum" in out["properties"]["config"]
def test_anyof_collapse_still_runs_nullable_and_enum_cleanup(self):
"""After anyOf collapses to a single non-null branch, the merged
node must still have ``nullable`` stripped and null/empty-string
values removed from enum not skipped by the early anyOf return.
"""
params = {
"type": "object",
"properties": {
"db_type": {
"anyOf": [
{"enum": ["mysql", "postgresql", "", None]},
{"type": "null"},
],
"nullable": True,
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "anyOf" not in db_type
assert "nullable" not in db_type, "nullable must be stripped after anyOf collapse"
assert db_type["type"] == "string"
assert db_type["enum"] == ["mysql", "postgresql"], \
"null/empty enum values must be stripped after anyOf collapse"
+58
View File
@@ -0,0 +1,58 @@
"""Tests for agent/skill_utils.py — extract_skill_conditions metadata handling."""
from agent.skill_utils import extract_skill_conditions
def test_metadata_as_dict_with_hermes():
"""Normal case: metadata is a dict containing hermes keys."""
frontmatter = {
"metadata": {
"hermes": {
"fallback_for_toolsets": ["toolset_a"],
"requires_toolsets": ["toolset_b"],
"fallback_for_tools": ["tool_x"],
"requires_tools": ["tool_y"],
}
}
}
result = extract_skill_conditions(frontmatter)
assert result["fallback_for_toolsets"] == ["toolset_a"]
assert result["requires_toolsets"] == ["toolset_b"]
assert result["fallback_for_tools"] == ["tool_x"]
assert result["requires_tools"] == ["tool_y"]
def test_metadata_as_string_does_not_crash():
"""Bug case: metadata is a non-dict truthy value (e.g. a YAML string)."""
frontmatter = {"metadata": "some text"}
result = extract_skill_conditions(frontmatter)
assert result == {
"fallback_for_toolsets": [],
"requires_toolsets": [],
"fallback_for_tools": [],
"requires_tools": [],
}
def test_metadata_as_none():
"""metadata key is present but set to null/None."""
frontmatter = {"metadata": None}
result = extract_skill_conditions(frontmatter)
assert result == {
"fallback_for_toolsets": [],
"requires_toolsets": [],
"fallback_for_tools": [],
"requires_tools": [],
}
def test_metadata_missing_entirely():
"""metadata key is absent from frontmatter."""
frontmatter = {"name": "my-skill", "description": "Does stuff."}
result = extract_skill_conditions(frontmatter)
assert result == {
"fallback_for_toolsets": [],
"requires_toolsets": [],
"fallback_for_tools": [],
"requires_tools": [],
}
+238
View File
@@ -0,0 +1,238 @@
"""Pure tool-call guardrail primitive tests."""
import json
from agent.tool_guardrails import (
ToolCallGuardrailConfig,
ToolCallGuardrailController,
ToolCallSignature,
canonical_tool_args,
)
def test_tool_call_signature_hashes_canonical_nested_unicode_args_without_exposing_raw_args():
args_a = {
"z": [{"β": "", "a": 1}],
"a": {"y": 2, "x": "secret-token-value"},
}
args_b = {
"a": {"x": "secret-token-value", "y": 2},
"z": [{"a": 1, "β": ""}],
}
assert canonical_tool_args(args_a) == canonical_tool_args(args_b)
sig_a = ToolCallSignature.from_call("web_search", args_a)
sig_b = ToolCallSignature.from_call("web_search", args_b)
assert sig_a == sig_b
assert len(sig_a.args_hash) == 64
metadata = sig_a.to_metadata()
assert metadata == {"tool_name": "web_search", "args_hash": sig_a.args_hash}
assert "secret-token-value" not in json.dumps(metadata)
assert "" not in json.dumps(metadata)
def test_default_config_is_soft_warning_only_with_hard_stop_disabled():
cfg = ToolCallGuardrailConfig()
assert cfg.warnings_enabled is True
assert cfg.hard_stop_enabled is False
assert cfg.exact_failure_warn_after == 2
assert cfg.same_tool_failure_warn_after == 3
assert cfg.no_progress_warn_after == 2
assert cfg.exact_failure_block_after == 5
assert cfg.same_tool_failure_halt_after == 8
assert cfg.no_progress_block_after == 5
def test_config_parses_nested_warn_and_hard_stop_thresholds():
cfg = ToolCallGuardrailConfig.from_mapping(
{
"warnings_enabled": False,
"hard_stop_enabled": True,
"warn_after": {
"exact_failure": 3,
"same_tool_failure": 4,
"idempotent_no_progress": 5,
},
"hard_stop_after": {
"exact_failure": 6,
"same_tool_failure": 7,
"idempotent_no_progress": 8,
},
}
)
assert cfg.warnings_enabled is False
assert cfg.hard_stop_enabled is True
assert cfg.exact_failure_warn_after == 3
assert cfg.same_tool_failure_warn_after == 4
assert cfg.no_progress_warn_after == 5
assert cfg.exact_failure_block_after == 6
assert cfg.same_tool_failure_halt_after == 7
assert cfg.no_progress_block_after == 8
def test_default_repeated_identical_failed_call_warns_without_blocking():
controller = ToolCallGuardrailController()
args = {"query": "same"}
decisions = []
for _ in range(5):
assert controller.before_call("web_search", args).action == "allow"
decisions.append(
controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
)
assert decisions[0].action == "allow"
assert [d.action for d in decisions[1:]] == ["warn", "warn", "warn", "warn"]
assert {d.code for d in decisions[1:]} == {"repeated_exact_failure_warning"}
assert controller.before_call("web_search", args).action == "allow"
assert controller.halt_decision is None
def test_hard_stop_enabled_blocks_repeated_exact_failure_before_next_execution():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(
hard_stop_enabled=True,
exact_failure_warn_after=2,
exact_failure_block_after=2,
same_tool_failure_halt_after=99,
)
)
args = {"query": "same"}
assert controller.before_call("web_search", args).action == "allow"
first = controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
assert first.action == "allow"
assert controller.before_call("web_search", args).action == "allow"
second = controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
assert second.action == "warn"
assert second.code == "repeated_exact_failure_warning"
blocked = controller.before_call("web_search", args)
assert blocked.action == "block"
assert blocked.code == "repeated_exact_failure_block"
assert blocked.count == 2
def test_success_resets_exact_signature_failure_streak():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(hard_stop_enabled=True, exact_failure_block_after=2, same_tool_failure_halt_after=99)
)
args = {"query": "same"}
controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
controller.after_call("web_search", args, '{"ok":true}', failed=False)
assert controller.before_call("web_search", args).action == "allow"
controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
assert controller.before_call("web_search", args).action == "allow"
def test_same_tool_varying_args_warns_by_default_without_halting():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(same_tool_failure_warn_after=2, same_tool_failure_halt_after=3)
)
first = controller.after_call("terminal", {"command": "cmd-1"}, '{"exit_code":1}', failed=True)
second = controller.after_call("terminal", {"command": "cmd-2"}, '{"exit_code":1}', failed=True)
third = controller.after_call("terminal", {"command": "cmd-3"}, '{"exit_code":1}', failed=True)
fourth = controller.after_call("terminal", {"command": "cmd-4"}, '{"exit_code":1}', failed=True)
assert first.action == "allow"
assert [second.action, third.action, fourth.action] == ["warn", "warn", "warn"]
assert {second.code, third.code, fourth.code} == {"same_tool_failure_warning"}
assert controller.halt_decision is None
def test_hard_stop_enabled_halts_same_tool_varying_args_failure_streak():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(
hard_stop_enabled=True,
exact_failure_block_after=99,
same_tool_failure_warn_after=2,
same_tool_failure_halt_after=3,
)
)
first = controller.after_call("terminal", {"command": "cmd-1"}, '{"exit_code":1}', failed=True)
assert first.action == "allow"
second = controller.after_call("terminal", {"command": "cmd-2"}, '{"exit_code":1}', failed=True)
assert second.action == "warn"
assert second.code == "same_tool_failure_warning"
third = controller.after_call("terminal", {"command": "cmd-3"}, '{"exit_code":1}', failed=True)
assert third.action == "halt"
assert third.code == "same_tool_failure_halt"
assert third.count == 3
def test_idempotent_no_progress_repeated_result_warns_without_blocking_by_default():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(no_progress_warn_after=2, no_progress_block_after=2)
)
args = {"path": "/tmp/same.txt"}
result = "same file contents"
for _ in range(4):
assert controller.before_call("read_file", args).action == "allow"
decision = controller.after_call("read_file", args, result, failed=False)
assert decision.action == "warn"
assert decision.code == "idempotent_no_progress_warning"
assert controller.before_call("read_file", args).action == "allow"
assert controller.halt_decision is None
def test_hard_stop_enabled_blocks_idempotent_no_progress_future_repeat():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(
hard_stop_enabled=True,
no_progress_warn_after=2,
no_progress_block_after=2,
)
)
args = {"path": "/tmp/same.txt"}
result = "same file contents"
assert controller.before_call("read_file", args).action == "allow"
assert controller.after_call("read_file", args, result, failed=False).action == "allow"
assert controller.before_call("read_file", args).action == "allow"
warn = controller.after_call("read_file", args, result, failed=False)
assert warn.action == "warn"
assert warn.code == "idempotent_no_progress_warning"
blocked = controller.before_call("read_file", args)
assert blocked.action == "block"
assert blocked.code == "idempotent_no_progress_block"
def test_mutating_or_unknown_tools_are_not_blocked_for_repeated_identical_success_output_by_default():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(no_progress_warn_after=2, no_progress_block_after=2)
)
for _ in range(3):
assert controller.before_call("write_file", {"path": "/tmp/x", "content": "x"}).action == "allow"
assert controller.after_call("write_file", {"path": "/tmp/x", "content": "x"}, "ok", failed=False).action == "allow"
assert controller.before_call("custom_tool", {"x": 1}).action == "allow"
assert controller.after_call("custom_tool", {"x": 1}, "ok", failed=False).action == "allow"
def test_reset_for_turn_clears_bounded_guardrail_state():
controller = ToolCallGuardrailController(
ToolCallGuardrailConfig(hard_stop_enabled=True, exact_failure_block_after=2, no_progress_block_after=2)
)
controller.after_call("web_search", {"query": "same"}, '{"error":"boom"}', failed=True)
controller.after_call("web_search", {"query": "same"}, '{"error":"boom"}', failed=True)
controller.after_call("read_file", {"path": "/tmp/x"}, "same", failed=False)
controller.after_call("read_file", {"path": "/tmp/x"}, "same", failed=False)
assert controller.before_call("web_search", {"query": "same"}).action == "block"
assert controller.before_call("read_file", {"path": "/tmp/x"}).action == "block"
controller.reset_for_turn()
assert controller.before_call("web_search", {"query": "same"}).action == "allow"
assert controller.before_call("read_file", {"path": "/tmp/x"}).action == "allow"
+206
View File
@@ -0,0 +1,206 @@
"""Tests for cli._cprint's bg-thread cooperation with prompt_toolkit.
Background: when a prompt_toolkit Application is running, a bg thread that
calls ``_pt_print`` directly can race with the input-area redraw and the
printed line can end up visually buried behind the prompt. ``_cprint`` now
routes cross-thread prints through ``run_in_terminal`` via
``loop.call_soon_threadsafe`` so the self-improvement background review's
``💾 Self-improvement review: `` summary actually surfaces to the user.
These tests verify the routing logic without spinning up a real PT app.
"""
from __future__ import annotations
import sys
import types
from types import SimpleNamespace
import cli
def test_cprint_no_app_direct_print(monkeypatch):
"""No active app → direct _pt_print, no run_in_terminal involvement."""
calls = []
monkeypatch.setattr(cli, "_pt_print", lambda x: calls.append(("pt_print", x)))
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: ("ANSI", t))
# Patch the prompt_toolkit import the function performs internally.
fake_pt_app = types.ModuleType("prompt_toolkit.application")
fake_pt_app.get_app_or_none = lambda: None
fake_pt_app.run_in_terminal = lambda *a, **kw: calls.append(("run_in_terminal",))
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
cli._cprint("hello")
assert calls == [("pt_print", ("ANSI", "hello"))]
def test_cprint_app_not_running_direct_print(monkeypatch):
"""App exists but not running (e.g. teardown) → direct print."""
calls = []
monkeypatch.setattr(cli, "_pt_print", lambda x: calls.append(("pt_print", x)))
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
fake_app = SimpleNamespace(_is_running=False, loop=None)
fake_pt_app = types.ModuleType("prompt_toolkit.application")
fake_pt_app.get_app_or_none = lambda: fake_app
fake_pt_app.run_in_terminal = lambda *a, **kw: calls.append(("run_in_terminal",))
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
cli._cprint("x")
assert calls == [("pt_print", "x")]
def test_cprint_bg_thread_schedules_on_app_loop(monkeypatch):
"""App running + different thread → schedules via call_soon_threadsafe."""
scheduled = []
direct_prints = []
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
class FakeLoop:
def is_running(self):
return True
def call_soon_threadsafe(self, cb, *args):
scheduled.append(cb)
fake_loop = FakeLoop()
# Install a fake "current loop" that is NOT the app's loop, so the
# cross-thread branch is taken.
fake_current_loop = SimpleNamespace(is_running=lambda: True)
fake_asyncio = types.ModuleType("asyncio")
class _Policy:
def get_event_loop(self):
return fake_current_loop
fake_asyncio.get_event_loop_policy = lambda: _Policy()
monkeypatch.setitem(sys.modules, "asyncio", fake_asyncio)
fake_app = SimpleNamespace(_is_running=True, loop=fake_loop)
fake_pt_app = types.ModuleType("prompt_toolkit.application")
fake_pt_app.get_app_or_none = lambda: fake_app
run_in_terminal_calls = []
def _fake_run_in_terminal(func, **kw):
run_in_terminal_calls.append(func)
# Simulate run_in_terminal actually calling func (as the real PT
# impl would once the app loop tick picks it up).
func()
return None
fake_pt_app.run_in_terminal = _fake_run_in_terminal
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
cli._cprint("💾 Self-improvement review: Skill updated")
# call_soon_threadsafe must have been called with a scheduling cb.
assert len(scheduled) == 1
# Invoking the scheduled callback should hit run_in_terminal.
scheduled[0]()
assert len(run_in_terminal_calls) == 1
# And run_in_terminal's inner func should have emitted a pt_print.
assert direct_prints == ["💾 Self-improvement review: Skill updated"]
def test_cprint_same_thread_as_app_loop_direct_print(monkeypatch):
"""App running on same thread → direct print (no scheduling)."""
direct_prints = []
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
class FakeLoop:
def is_running(self):
return True
def call_soon_threadsafe(self, cb, *args):
raise AssertionError(
"call_soon_threadsafe must not be used on the app's own thread"
)
fake_loop = FakeLoop()
fake_asyncio = types.ModuleType("asyncio")
class _Policy:
def get_event_loop(self):
return fake_loop # same as app loop
fake_asyncio.get_event_loop_policy = lambda: _Policy()
monkeypatch.setitem(sys.modules, "asyncio", fake_asyncio)
fake_app = SimpleNamespace(_is_running=True, loop=fake_loop)
fake_pt_app = types.ModuleType("prompt_toolkit.application")
fake_pt_app.get_app_or_none = lambda: fake_app
fake_pt_app.run_in_terminal = lambda *a, **kw: None
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
cli._cprint("x")
assert direct_prints == ["x"]
def test_cprint_swallows_app_loop_attr_error(monkeypatch):
"""Loop missing on app → fall back to direct print, no crash."""
direct_prints = []
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
class WeirdApp:
_is_running = True
@property
def loop(self):
raise RuntimeError("no loop for you")
fake_pt_app = types.ModuleType("prompt_toolkit.application")
fake_pt_app.get_app_or_none = lambda: WeirdApp()
fake_pt_app.run_in_terminal = lambda *a, **kw: None
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
cli._cprint("fallback")
assert direct_prints == ["fallback"]
def test_cprint_swallows_prompt_toolkit_import_error(monkeypatch):
"""If prompt_toolkit.application itself fails to import, fall back."""
direct_prints = []
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
# Drop cached prompt_toolkit.application AND install a meta-path finder
# that raises ImportError on re-import.
monkeypatch.delitem(sys.modules, "prompt_toolkit.application", raising=False)
class _BlockFinder:
def find_module(self, name, path=None):
if name == "prompt_toolkit.application":
return self
return None
def load_module(self, name):
raise ImportError("blocked for test")
def find_spec(self, name, path=None, target=None):
if name == "prompt_toolkit.application":
# Returning a bogus spec that will fail on load works too,
# but raising here keeps the test simple.
raise ImportError("blocked for test")
return None
blocker = _BlockFinder()
sys.meta_path.insert(0, blocker)
try:
cli._cprint("fallback2")
finally:
sys.meta_path.remove(blocker)
assert direct_prints == ["fallback2"]
+12 -8
View File
@@ -21,20 +21,21 @@ def test_manual_compress_reports_noop_without_success_banner(capsys):
shell.agent = MagicMock()
shell.agent.compression_enabled = True
shell.agent._cached_system_prompt = ""
shell.agent.tools = None
shell.agent.session_id = shell.session_id # no-op compression: no split
shell.agent._compress_context.return_value = (list(history), "")
def _estimate(messages):
def _estimate(messages, **_kwargs):
assert messages == history
return 100
with patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate):
with patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate):
shell._manual_compress()
output = capsys.readouterr().out
assert "No changes from compression" in output
assert "✅ Compressed" not in output
assert "Rough transcript estimate: ~100 tokens (unchanged)" in output
assert "Approx request size: ~100 tokens (unchanged)" in output
def test_manual_compress_explains_when_token_estimate_rises(capsys):
@@ -49,22 +50,23 @@ def test_manual_compress_explains_when_token_estimate_rises(capsys):
shell.agent = MagicMock()
shell.agent.compression_enabled = True
shell.agent._cached_system_prompt = ""
shell.agent.tools = None
shell.agent.session_id = shell.session_id # no-op: no split
shell.agent._compress_context.return_value = (compressed, "")
def _estimate(messages):
def _estimate(messages, **_kwargs):
if messages == history:
return 100
if messages == compressed:
return 120
raise AssertionError(f"unexpected transcript: {messages!r}")
with patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate):
with patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate):
shell._manual_compress()
output = capsys.readouterr().out
assert "✅ Compressed: 4 → 3 messages" in output
assert "Rough transcript estimate: ~100 → ~120 tokens" in output
assert "Approx request size: ~100 → ~120 tokens" in output
assert "denser summaries" in output
@@ -89,6 +91,7 @@ def test_manual_compress_syncs_session_id_after_split():
shell.agent = MagicMock()
shell.agent.compression_enabled = True
shell.agent._cached_system_prompt = ""
shell.agent.tools = None
# Simulate _compress_context mutating agent.session_id as a side effect.
def _fake_compress(*args, **kwargs):
shell.agent.session_id = new_child_id
@@ -97,7 +100,7 @@ def test_manual_compress_syncs_session_id_after_split():
shell.agent.session_id = old_id # starts in sync
shell._pending_title = "stale title"
with patch("agent.model_metadata.estimate_messages_tokens_rough", return_value=100):
with patch("agent.model_metadata.estimate_request_tokens_rough", return_value=100):
shell._manual_compress()
# CLI session_id must now point at the continuation child, not the parent.
@@ -118,11 +121,12 @@ def test_manual_compress_no_sync_when_session_id_unchanged():
shell.agent = MagicMock()
shell.agent.compression_enabled = True
shell.agent._cached_system_prompt = ""
shell.agent.tools = None
shell.agent.session_id = shell.session_id
shell.agent._compress_context.return_value = (list(history), "")
shell._pending_title = "keep me"
with patch("agent.model_metadata.estimate_messages_tokens_rough", return_value=100):
with patch("agent.model_metadata.estimate_request_tokens_rough", return_value=100):
shell._manual_compress()
# No split → pending title untouched.
+289
View File
@@ -0,0 +1,289 @@
"""Tests for cron.jobs.rewrite_skill_refs — the curator integration that
keeps scheduled cron jobs pointing at the right skill names after a
consolidation / pruning pass.
Bug this fixes: when the curator consolidates skill X into umbrella Y,
any cron job whose ``skills`` list contains X would silently fail to
load X at run time (the scheduler logs a warning and skips it), so the
job runs without the instructions it was scheduled to follow.
"""
from __future__ import annotations
import sys
from pathlib import Path
import pytest
# Ensure project root is importable
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
@pytest.fixture
def cron_env(tmp_path, monkeypatch):
"""Isolated cron environment with temp HERMES_HOME."""
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "cron").mkdir()
(hermes_home / "cron" / "output").mkdir()
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
import cron.jobs as jobs_mod
monkeypatch.setattr(jobs_mod, "HERMES_DIR", hermes_home)
monkeypatch.setattr(jobs_mod, "CRON_DIR", hermes_home / "cron")
monkeypatch.setattr(jobs_mod, "JOBS_FILE", hermes_home / "cron" / "jobs.json")
monkeypatch.setattr(jobs_mod, "OUTPUT_DIR", hermes_home / "cron" / "output")
return hermes_home
class TestRewriteSkillRefsNoop:
"""No jobs, no rewrites, no map — every combination of empty inputs."""
def test_empty_map_and_no_jobs(self, cron_env):
from cron.jobs import rewrite_skill_refs
report = rewrite_skill_refs(consolidated={}, pruned=[])
assert report == {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
def test_jobs_exist_but_map_empty(self, cron_env):
from cron.jobs import create_job, rewrite_skill_refs
create_job(prompt="", schedule="every 1h", skills=["foo"])
report = rewrite_skill_refs(consolidated={}, pruned=[])
assert report["jobs_updated"] == 0
# Early return: we don't even scan when there's nothing to apply.
assert report["jobs_scanned"] == 0
def test_jobs_exist_but_no_match(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(prompt="", schedule="every 1h", skills=["foo"])
report = rewrite_skill_refs(
consolidated={"unrelated": "umbrella"},
pruned=["other"],
)
assert report["jobs_updated"] == 0
assert report["jobs_scanned"] == 1
# Job untouched
loaded = get_job(job["id"])
assert loaded["skills"] == ["foo"]
class TestRewriteSkillRefsConsolidation:
"""Consolidated skills should be replaced with their umbrella target."""
def test_single_skill_replaced(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(prompt="", schedule="every 1h", skills=["legacy-skill"])
report = rewrite_skill_refs(
consolidated={"legacy-skill": "umbrella-skill"},
pruned=[],
)
assert report["jobs_updated"] == 1
loaded = get_job(job["id"])
assert loaded["skills"] == ["umbrella-skill"]
# Legacy ``skill`` field realigned
assert loaded["skill"] == "umbrella-skill"
def test_multiple_skills_one_consolidated(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(
prompt="",
schedule="every 1h",
skills=["keep-a", "legacy", "keep-b"],
)
rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])
loaded = get_job(job["id"])
# Ordering preserved, legacy replaced in-place
assert loaded["skills"] == ["keep-a", "umbrella", "keep-b"]
def test_umbrella_already_in_list_dedupes(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
# Job already loads the umbrella AND the legacy sub-skill
job = create_job(
prompt="",
schedule="every 1h",
skills=["umbrella", "legacy"],
)
rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])
loaded = get_job(job["id"])
# No duplicate — the umbrella stays exactly once
assert loaded["skills"] == ["umbrella"]
def test_rewrite_report_records_mapping(self, cron_env):
from cron.jobs import create_job, rewrite_skill_refs
job = create_job(
prompt="",
schedule="every 1h",
skills=["a", "b"],
name="my-job",
)
report = rewrite_skill_refs(
consolidated={"a": "umbrella-a", "b": "umbrella-b"},
pruned=[],
)
assert len(report["rewrites"]) == 1
entry = report["rewrites"][0]
assert entry["job_id"] == job["id"]
assert entry["job_name"] == "my-job"
assert entry["before"] == ["a", "b"]
assert entry["after"] == ["umbrella-a", "umbrella-b"]
assert entry["mapped"] == {"a": "umbrella-a", "b": "umbrella-b"}
assert entry["dropped"] == []
class TestRewriteSkillRefsPruning:
"""Pruned skills should be dropped outright (no forwarding target)."""
def test_pruned_skill_dropped(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(
prompt="",
schedule="every 1h",
skills=["keep", "stale"],
)
report = rewrite_skill_refs(consolidated={}, pruned=["stale"])
assert report["jobs_updated"] == 1
loaded = get_job(job["id"])
assert loaded["skills"] == ["keep"]
assert loaded["skill"] == "keep"
def test_all_skills_pruned_leaves_empty_list(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(prompt="", schedule="every 1h", skills=["gone"])
rewrite_skill_refs(consolidated={}, pruned=["gone"])
loaded = get_job(job["id"])
assert loaded["skills"] == []
assert loaded["skill"] is None
def test_pruned_report_records_drops(self, cron_env):
from cron.jobs import create_job, rewrite_skill_refs
create_job(prompt="", schedule="every 1h", skills=["keep", "stale"])
report = rewrite_skill_refs(consolidated={}, pruned=["stale"])
entry = report["rewrites"][0]
assert entry["dropped"] == ["stale"]
assert entry["mapped"] == {}
class TestRewriteSkillRefsMixed:
"""Consolidation + pruning in the same pass."""
def test_mixed_consolidation_and_pruning(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(
prompt="",
schedule="every 1h",
skills=["keep", "legacy", "stale"],
)
rewrite_skill_refs(
consolidated={"legacy": "umbrella"},
pruned=["stale"],
)
loaded = get_job(job["id"])
assert loaded["skills"] == ["keep", "umbrella"]
def test_skill_in_both_maps_wins_as_consolidated(self, cron_env):
"""Defensive: if a skill appears in both lists (shouldn't happen
in practice), prefer consolidation it has a forwarding target,
which is the more useful outcome."""
from cron.jobs import create_job, get_job, rewrite_skill_refs
job = create_job(prompt="", schedule="every 1h", skills=["ambiguous"])
rewrite_skill_refs(
consolidated={"ambiguous": "umbrella"},
pruned=["ambiguous"],
)
loaded = get_job(job["id"])
assert loaded["skills"] == ["umbrella"]
class TestRewriteSkillRefsMultipleJobs:
"""Multiple jobs, some affected, some not."""
def test_only_affected_jobs_reported(self, cron_env):
from cron.jobs import create_job, get_job, rewrite_skill_refs
j1 = create_job(prompt="", schedule="every 1h", skills=["legacy"])
j2 = create_job(prompt="", schedule="every 1h", skills=["untouched"])
j3 = create_job(prompt="", schedule="every 1h", skills=[])
report = rewrite_skill_refs(
consolidated={"legacy": "umbrella"},
pruned=[],
)
assert report["jobs_updated"] == 1
assert report["jobs_scanned"] == 3
assert len(report["rewrites"]) == 1
assert report["rewrites"][0]["job_id"] == j1["id"]
# Untouched jobs stay put
assert get_job(j2["id"])["skills"] == ["untouched"]
assert get_job(j3["id"])["skills"] == []
def test_legacy_skill_field_also_rewritten(self, cron_env):
"""Old jobs may have the legacy single-skill ``skill`` field
set instead of ``skills``. Both paths should be rewritten."""
from cron.jobs import create_job, get_job, rewrite_skill_refs
# Create via the legacy ``skill`` argument
job = create_job(
prompt="",
schedule="every 1h",
skill="legacy",
)
rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])
loaded = get_job(job["id"])
assert loaded["skills"] == ["umbrella"]
assert loaded["skill"] == "umbrella"
class TestRewriteSkillRefsPersistence:
"""Rewrites persist to disk and survive a reload."""
def test_changes_persist_across_reload(self, cron_env):
import json
from cron.jobs import create_job, rewrite_skill_refs, JOBS_FILE
create_job(prompt="", schedule="every 1h", skills=["legacy"])
rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])
# Read raw file contents
data = json.loads(JOBS_FILE.read_text())
assert data["jobs"][0]["skills"] == ["umbrella"]
assert data["jobs"][0]["skill"] == "umbrella"
def test_noop_does_not_rewrite_file(self, cron_env):
from cron.jobs import create_job, rewrite_skill_refs, JOBS_FILE
create_job(prompt="", schedule="every 1h", skills=["keep"])
mtime_before = JOBS_FILE.stat().st_mtime_ns
# Nothing in the map matches
report = rewrite_skill_refs(
consolidated={"unrelated": "umbrella"},
pruned=["other"],
)
assert report["jobs_updated"] == 0
# File untouched — no pointless disk write
assert JOBS_FILE.stat().st_mtime_ns == mtime_before
+65
View File
@@ -0,0 +1,65 @@
"""Shared fixtures for Feishu adapter tests (admission, group policy, dispatch)."""
from __future__ import annotations
import threading
from types import SimpleNamespace
from typing import Any, Optional
def make_sender(sender_type: str = "user", open_id: str = "ou_human",
user_id: Optional[str] = None, union_id: Optional[str] = None) -> Any:
return SimpleNamespace(
sender_type=sender_type,
sender_id=SimpleNamespace(open_id=open_id, user_id=user_id, union_id=union_id),
)
def make_message(message_id: str = "om_xxx", chat_type: str = "p2p",
chat_id: str = "oc_1", mentions: Optional[list] = None) -> Any:
return SimpleNamespace(
message_id=message_id,
chat_type=chat_type,
chat_id=chat_id,
mentions=mentions,
content="",
message_type="text",
)
def make_adapter_skeleton(
*,
bot_open_id: str = "ou_me",
bot_user_id: str = "",
allow_bots: str = "none",
require_mention: bool = True,
group_policy: str = "allowlist",
) -> Any:
from gateway.platforms.feishu import FeishuAdapter
adapter = object.__new__(FeishuAdapter)
adapter._bot_open_id = bot_open_id
adapter._bot_user_id = bot_user_id
adapter._bot_name = ""
adapter._app_id = ""
adapter._admins = set()
adapter._group_rules = {}
adapter._group_policy = group_policy
adapter._default_group_policy = group_policy
adapter._allowed_group_users = frozenset()
adapter._allow_bots = allow_bots
adapter._require_mention = require_mention
return adapter
def install_dedup_state(adapter: Any, seen: Optional[dict] = None) -> None:
adapter._seen_message_ids = dict(seen) if seen else {}
adapter._seen_message_order = list((seen or {}).keys())
adapter._dedup_cache_size = 100
adapter._dedup_lock = threading.Lock()
adapter._dedup_state_path = None
adapter._persist_seen_message_ids = lambda: None
def stub_mention(adapter: Any, mentions_self: bool) -> None:
adapter._mentions_self = lambda _message: mentions_self
+30
View File
@@ -332,6 +332,36 @@ def auth_adapter():
return _make_adapter(api_key="sk-secret")
# ---------------------------------------------------------------------------
# Adapter internals
# ---------------------------------------------------------------------------
class TestAgentExecution:
@pytest.mark.asyncio
async def test_run_agent_uses_session_id_as_task_id(self, adapter):
mock_agent = MagicMock()
mock_agent.run_conversation.return_value = {"final_response": "ok"}
mock_agent.session_prompt_tokens = 1
mock_agent.session_completion_tokens = 2
mock_agent.session_total_tokens = 3
with patch.object(adapter, "_create_agent", return_value=mock_agent):
result, usage = await adapter._run_agent(
user_message="hello",
conversation_history=[],
session_id="session-123",
)
assert result == {"final_response": "ok"}
assert usage == {"input_tokens": 1, "output_tokens": 2, "total_tokens": 3}
mock_agent.run_conversation.assert_called_once_with(
user_message="hello",
conversation_history=[],
task_id="session-123",
)
# ---------------------------------------------------------------------------
# /health endpoint
# ---------------------------------------------------------------------------
+1 -4
View File
@@ -253,10 +253,7 @@ class TestRunStatus:
await asyncio.sleep(0.05)
mock_agent.run_conversation.assert_called_once()
# task_id stays "default" so the Runs API shares one sandbox
# container with CLI/gateway; session_id is surfaced in status
# for external UIs to correlate runs with their own session IDs.
assert mock_agent.run_conversation.call_args.kwargs["task_id"] == "default"
assert mock_agent.run_conversation.call_args.kwargs["task_id"] == "space-session"
assert status["session_id"] == "space-session"
@pytest.mark.asyncio
@@ -173,6 +173,23 @@ class TestBlockingGatewayApproval:
assert e1.event.is_set()
assert e2.event.is_set()
def test_clear_session_denies_and_signals_all_entries(self):
"""clear_session must wake blocked entries during boundary cleanup."""
from tools.approval import clear_session, _ApprovalEntry, _gateway_queues
session_key = "test-boundary-cleanup"
e1 = _ApprovalEntry({"command": "cmd1"})
e2 = _ApprovalEntry({"command": "cmd2"})
_gateway_queues[session_key] = [e1, e2]
clear_session(session_key)
assert e1.event.is_set()
assert e2.event.is_set()
assert e1.result == "deny"
assert e2.result == "deny"
assert session_key not in _gateway_queues
# ------------------------------------------------------------------
# /approve command
+18 -10
View File
@@ -64,11 +64,13 @@ async def test_compress_command_reports_noop_without_success_banner():
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
agent_instance._cached_system_prompt = ""
agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (list(history), "")
def _estimate(messages):
def _estimate(messages, **_kwargs):
assert messages == history
return 100
@@ -76,13 +78,13 @@ async def test_compress_command_reports_noop_without_success_banner():
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "test-key"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
assert "No changes from compression" in result
assert "Compressed:" not in result
assert "Rough transcript estimate: ~100 tokens (unchanged)" in result
assert "Approx request size: ~100 tokens (unchanged)" in result
agent_instance.shutdown_memory_provider.assert_called_once()
agent_instance.close.assert_called_once()
@@ -99,11 +101,13 @@ async def test_compress_command_explains_when_token_estimate_rises():
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
agent_instance._cached_system_prompt = ""
agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
def _estimate(messages):
def _estimate(messages, **_kwargs):
if messages == history:
return 100
if messages == compressed:
@@ -114,12 +118,12 @@ async def test_compress_command_explains_when_token_estimate_rises():
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "test-key"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
assert "Compressed: 4 → 3 messages" in result
assert "Rough transcript estimate: ~100 → ~120 tokens" in result
assert "Approx request size: ~100 → ~120 tokens" in result
assert "denser summaries" in result
agent_instance.shutdown_memory_provider.assert_called_once()
agent_instance.close.assert_called_once()
@@ -143,6 +147,8 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
agent_instance._cached_system_prompt = ""
agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
# Simulate summary-generation failure: fallback flag set, dropped count
# populated, error string captured.
@@ -154,7 +160,7 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
def _estimate(messages):
def _estimate(messages, **_kwargs):
if messages == history:
return 100
if messages == compressed:
@@ -165,7 +171,7 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "***"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
@@ -200,6 +206,8 @@ async def test_compress_command_surfaces_aux_model_failure_even_when_recovered()
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
agent_instance._cached_system_prompt = ""
agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
# Fallback placeholder was NOT used — recovery succeeded.
agent_instance.context_compressor._last_summary_fallback_used = False
@@ -215,7 +223,7 @@ async def test_compress_command_surfaces_aux_model_failure_even_when_recovered()
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
def _estimate(messages):
def _estimate(messages, **_kwargs):
if messages == history:
return 100
if messages == compressed:
@@ -226,7 +234,7 @@ async def test_compress_command_surfaces_aux_model_failure_even_when_recovered()
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "***"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
+96
View File
@@ -9,6 +9,7 @@ from gateway.config import (
Platform,
PlatformConfig,
SessionResetPolicy,
StreamingConfig,
_apply_env_overrides,
load_gateway_config,
)
@@ -149,6 +150,24 @@ class TestSessionResetPolicy:
assert restored.notify is False
class TestStreamingConfig:
def test_from_dict_coerces_quoted_false_enabled(self):
restored = StreamingConfig.from_dict({"enabled": "false"})
assert restored.enabled is False
def test_from_dict_malformed_numeric_values_fall_back_to_defaults(self):
restored = StreamingConfig.from_dict(
{
"edit_interval": "oops",
"buffer_threshold": "oops",
"fresh_final_after_seconds": "oops",
}
)
assert restored.edit_interval == 1.0
assert restored.buffer_threshold == 40
assert restored.fresh_final_after_seconds == 60.0
class TestGatewayConfigRoundtrip:
def test_full_roundtrip(self):
config = GatewayConfig(
@@ -194,6 +213,26 @@ class TestGatewayConfigRoundtrip:
restored = GatewayConfig.from_dict({"always_log_local": "false"})
assert restored.always_log_local is False
def test_get_notice_delivery_defaults_to_public(self):
config = GatewayConfig(
platforms={Platform.SLACK: PlatformConfig(enabled=True, token="***")}
)
assert config.get_notice_delivery(Platform.SLACK) == "public"
def test_get_notice_delivery_honors_platform_override(self):
config = GatewayConfig(
platforms={
Platform.SLACK: PlatformConfig(
enabled=True,
token="***",
extra={"notice_delivery": "private"},
),
}
)
assert config.get_notice_delivery(Platform.SLACK) == "private"
class TestLoadGatewayConfig:
def test_bridges_quick_commands_from_config_yaml(self, tmp_path, monkeypatch):
@@ -360,6 +399,38 @@ class TestLoadGatewayConfig:
"C01ABC": "Code review mode",
}
def test_bridges_feishu_allow_bots_from_config_yaml_to_env(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"feishu:\n allow_bots: mentions\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("FEISHU_ALLOW_BOTS", raising=False)
load_gateway_config()
assert os.environ.get("FEISHU_ALLOW_BOTS") == "mentions"
def test_feishu_allow_bots_env_takes_precedence_over_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"feishu:\n allow_bots: all\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setenv("FEISHU_ALLOW_BOTS", "none")
load_gateway_config()
assert os.environ.get("FEISHU_ALLOW_BOTS") == "none"
def test_invalid_quick_commands_in_config_yaml_are_ignored(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -406,6 +477,22 @@ class TestLoadGatewayConfig:
assert config.platforms[Platform.TELEGRAM].extra["disable_link_previews"] is True
def test_bridges_notice_delivery_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"slack:\n"
" notice_delivery: private\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.get_notice_delivery(Platform.SLACK) == "private"
def test_bridges_telegram_proxy_url_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -455,6 +542,15 @@ class TestHomeChannelEnvOverrides:
{"SLACK_HOME_CHANNEL": "C123", "SLACK_HOME_CHANNEL_NAME": "Ops"},
("C123", "Ops"),
),
(
Platform.WHATSAPP,
PlatformConfig(enabled=True),
{
"WHATSAPP_HOME_CHANNEL": "1234567890@lid",
"WHATSAPP_HOME_CHANNEL_NAME": "Owner DM",
},
("1234567890@lid", "Owner DM"),
),
(
Platform.SIGNAL,
PlatformConfig(
+58
View File
@@ -65,4 +65,62 @@ class TestTargetToStringRoundtrip:
assert reparsed.chat_id == "999"
class TestCaseSensitiveChatIdParsing:
"""Test that chat IDs preserve their original case (issue #11768)."""
def test_slack_uppercase_chat_id_preserved(self):
"""Slack channel IDs like C123ABC should preserve case."""
target = DeliveryTarget.parse("slack:C123ABC")
assert target.platform == Platform.SLACK
assert target.chat_id == "C123ABC" # Should NOT be lowercased to c123abc
assert target.is_explicit is True
def test_slack_chat_id_with_thread_preserved(self):
"""Slack channel:thread IDs should preserve case."""
target = DeliveryTarget.parse("slack:C123ABC:thread123")
assert target.platform == Platform.SLACK
assert target.chat_id == "C123ABC"
assert target.thread_id == "thread123"
def test_matrix_room_id_preserved(self):
"""Matrix room IDs like !RoomABC:example.org should preserve case.
Note: Matrix room IDs contain colons (e.g., !RoomABC:example.org).
Due to the platform:chat_id:thread_id format, these are parsed as
chat_id=!RoomABC and thread_id=example.org. This is a known limitation
of the current format. The fix preserves case but doesn't change the
parsing structure.
"""
target = DeliveryTarget.parse("matrix:!RoomABC:example.org")
assert target.platform == Platform.MATRIX
# The room ID is split at the first colon after the platform prefix
# This is a format limitation - the case is preserved but the structure is split
assert target.chat_id == "!RoomABC"
assert target.thread_id == "example.org"
def test_mixed_case_chat_id_roundtrip(self):
"""Mixed-case chat IDs should survive parse-to_string roundtrip."""
original = "telegram:ChatId123ABC"
target = DeliveryTarget.parse(original)
s = target.to_string()
reparsed = DeliveryTarget.parse(s)
assert reparsed.chat_id == "ChatId123ABC"
class TestPlatformNameCaseInsensitivity:
"""Test that platform names are case-insensitive."""
def test_uppercase_platform_name(self):
"""Platform names should be case-insensitive."""
target = DeliveryTarget.parse("TELEGRAM:12345")
assert target.platform == Platform.TELEGRAM
assert target.chat_id == "12345"
def test_mixed_case_platform_name(self):
"""Mixed-case platform names should work."""
target = DeliveryTarget.parse("TeleGram:12345")
assert target.platform == Platform.TELEGRAM
assert target.chat_id == "12345"
@@ -220,6 +220,26 @@ async def test_discord_free_response_channel_can_come_from_config_extra(adapter,
assert event.text == "allowed from config"
def test_discord_free_response_channels_bare_int(adapter, monkeypatch):
# YAML `discord.free_response_channels: 1491973769726791812` (single bare
# integer) is loaded as an int and previously fell through the
# isinstance(str) branch in _discord_free_response_channels, silently
# returning an empty set. Scalar → str coercion makes single-channel
# config work without having to quote the ID in YAML.
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
adapter.config.extra["free_response_channels"] = 1491973769726791812
assert adapter._discord_free_response_channels() == {"1491973769726791812"}
def test_discord_free_response_channels_int_list(adapter, monkeypatch):
# YAML list form with bare numeric entries — each element should be coerced.
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
adapter.config.extra["free_response_channels"] = [1491973769726791812, 99999]
assert adapter._discord_free_response_channels() == {"1491973769726791812", "99999"}
@pytest.mark.asyncio
async def test_discord_forum_parent_in_free_response_list_allows_forum_thread(adapter, monkeypatch):
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
+336
View File
@@ -0,0 +1,336 @@
"""Tests for EphemeralReply — system-notice auto-delete in gateway adapters.
Slash-command handlers in ``gateway/run.py`` can return an
``EphemeralReply`` wrapper to request auto-deletion of the reply message
after a TTL. The base adapter unwraps the sentinel before sending and
schedules a detached delete task when the platform supports
``delete_message``.
Covered:
1. ``_unwrap_ephemeral`` returns text + ttl for EphemeralReply, and
passes plain strings through unchanged.
2. TTL is zeroed on platforms that don't override ``delete_message``
(silent degrade message stays in place).
3. TTL is honored on platforms that DO override ``delete_message``.
4. ``_schedule_ephemeral_delete`` invokes ``delete_message`` after the
configured delay with the correct chat_id / message_id.
5. ``_process_message_background`` sends the unwrapped text (not the
sentinel object) and schedules deletion when appropriate.
6. The two busy-session bypass paths also unwrap + schedule.
"""
import asyncio
from unittest.mock import AsyncMock, patch
import pytest
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
EphemeralReply,
MessageEvent,
MessageType,
SendResult,
)
from gateway.session import SessionSource
class _NoDeleteAdapter(BasePlatformAdapter):
"""Adapter that does NOT override delete_message (silent degrade)."""
async def connect(self):
pass
async def disconnect(self):
pass
async def send(self, chat_id, content="", **kwargs):
return SendResult(success=True, message_id="m-1")
async def get_chat_info(self, chat_id):
return {}
class _DeleteCapableAdapter(BasePlatformAdapter):
"""Adapter that overrides delete_message (TTL honored)."""
def __init__(self, *a, **kw):
super().__init__(*a, **kw)
self.deleted: list[tuple[str, str]] = []
async def connect(self):
pass
async def disconnect(self):
pass
async def send(self, chat_id, content="", **kwargs):
return SendResult(success=True, message_id="m-2")
async def get_chat_info(self, chat_id):
return {}
async def delete_message(self, chat_id: str, message_id: str) -> bool:
self.deleted.append((chat_id, message_id))
return True
def _no_delete_adapter():
return _NoDeleteAdapter(
PlatformConfig(enabled=True, token="t"), Platform.TELEGRAM
)
def _delete_adapter():
return _DeleteCapableAdapter(
PlatformConfig(enabled=True, token="t"), Platform.TELEGRAM
)
def _make_event(text="/stop", chat_id="42"):
return MessageEvent(
text=text,
message_id="msg-1",
source=SessionSource(
platform=Platform.TELEGRAM,
chat_id=chat_id,
user_id="u-1",
),
message_type=MessageType.TEXT,
)
# ---------------------------------------------------------------------------
# _unwrap_ephemeral
# ---------------------------------------------------------------------------
def test_unwrap_plain_string_is_passthrough():
adapter = _delete_adapter()
text, ttl = adapter._unwrap_ephemeral("hello")
assert text == "hello"
assert ttl == 0
def test_unwrap_none_is_passthrough():
adapter = _delete_adapter()
text, ttl = adapter._unwrap_ephemeral(None)
assert text is None
assert ttl == 0
def test_unwrap_ephemeral_explicit_ttl_on_capable_adapter():
adapter = _delete_adapter()
text, ttl = adapter._unwrap_ephemeral(EphemeralReply("bye", ttl_seconds=60))
assert text == "bye"
assert ttl == 60
def test_unwrap_ephemeral_zeros_ttl_on_incapable_adapter():
"""Platforms without delete_message should silently degrade to normal send."""
adapter = _no_delete_adapter()
text, ttl = adapter._unwrap_ephemeral(EphemeralReply("bye", ttl_seconds=60))
assert text == "bye"
assert ttl == 0 # forced to 0 — message will stay in place
def test_unwrap_ephemeral_default_ttl_from_config():
adapter = _delete_adapter()
with patch.object(adapter, "_get_ephemeral_system_ttl_default", return_value=120):
text, ttl = adapter._unwrap_ephemeral(EphemeralReply("bye"))
assert text == "bye"
assert ttl == 120
def test_unwrap_ephemeral_default_ttl_zero_disables():
"""Config default of 0 (the shipped default) means the feature is off."""
adapter = _delete_adapter()
with patch.object(adapter, "_get_ephemeral_system_ttl_default", return_value=0):
text, ttl = adapter._unwrap_ephemeral(EphemeralReply("bye"))
assert text == "bye"
assert ttl == 0
def test_unwrap_ephemeral_handles_unreadable_config():
adapter = _delete_adapter()
with patch.object(
adapter,
"_get_ephemeral_system_ttl_default",
side_effect=RuntimeError("boom"),
):
text, ttl = adapter._unwrap_ephemeral(EphemeralReply("bye"))
# Fall back to 0 rather than crashing the handler pipeline.
assert text == "bye"
assert ttl == 0
# ---------------------------------------------------------------------------
# _schedule_ephemeral_delete
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_schedule_ephemeral_delete_calls_delete_after_ttl():
adapter = _delete_adapter()
# Use a very short TTL to keep the test fast — the implementation
# floors sleeps at 1s via ``max(1, int(ttl_seconds))``. Patch asyncio.sleep
# inside the module under test; the test body uses the real one for
# scheduler pumping.
import gateway.platforms.base as base_module
sleeps: list[float] = []
_real_sleep = base_module.asyncio.sleep
async def _fake_sleep(duration):
sleeps.append(duration)
# Yield control so the rest of the task body can run.
await _real_sleep(0)
with patch.object(base_module.asyncio, "sleep", _fake_sleep):
adapter._schedule_ephemeral_delete(
chat_id="42", message_id="m-2", ttl_seconds=5
)
# Let the spawned task run.
for _ in range(5):
await _real_sleep(0)
# Only the ttl sleep shows up — the test pump uses the real sleep.
assert 5 in sleeps
assert adapter.deleted == [("42", "m-2")]
@pytest.mark.asyncio
async def test_schedule_ephemeral_delete_swallows_errors():
adapter = _delete_adapter()
async def _boom(*a, **kw):
raise RuntimeError("permission denied")
adapter.delete_message = _boom # type: ignore[assignment]
with patch("gateway.platforms.base.asyncio.sleep", AsyncMock()):
adapter._schedule_ephemeral_delete(
chat_id="42", message_id="m-2", ttl_seconds=1
)
# No exception should propagate even though delete_message raised.
for _ in range(5):
await asyncio.sleep(0)
def test_schedule_ephemeral_delete_outside_event_loop_is_noop():
"""No running loop → no crash, silently drops the request."""
adapter = _delete_adapter()
# No pytest.mark.asyncio → no loop. Must not raise.
adapter._schedule_ephemeral_delete(
chat_id="42", message_id="m-2", ttl_seconds=1
)
assert adapter.deleted == []
# ---------------------------------------------------------------------------
# _process_message_background unwraps EphemeralReply before send
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_process_message_unwraps_ephemeral_before_send():
"""The adapter must send the wrapper's .text, never the wrapper object."""
adapter = _delete_adapter()
adapter._send_with_retry = AsyncMock(
return_value=SendResult(success=True, message_id="sent-1")
)
async def _handler(evt):
return EphemeralReply("⚡ Stopped.", ttl_seconds=5)
adapter.set_message_handler(_handler)
sleeps: list[float] = []
async def _fake_sleep(duration):
sleeps.append(duration)
event = _make_event()
session_key = "agent:main:telegram:private:42"
with patch("gateway.platforms.base.asyncio.sleep", _fake_sleep), patch.object(
adapter, "_keep_typing", new=AsyncMock()
):
await adapter._process_message_background(event, session_key)
# Pump until the detached delete task completes.
for _ in range(10):
await asyncio.sleep(0)
# Sent text is the unwrapped string, NOT repr(EphemeralReply(...))
adapter._send_with_retry.assert_called_once()
sent_text = adapter._send_with_retry.call_args.kwargs["content"]
assert sent_text == "⚡ Stopped."
# Auto-delete scheduled using the returned message_id
assert ("42", "sent-1") in adapter.deleted
@pytest.mark.asyncio
async def test_process_message_incapable_platform_does_not_schedule_delete():
adapter = _no_delete_adapter()
adapter._send_with_retry = AsyncMock(
return_value=SendResult(success=True, message_id="sent-1")
)
async def _handler(evt):
return EphemeralReply("⚡ Stopped.", ttl_seconds=5)
adapter.set_message_handler(_handler)
# Spy on delete_message to confirm it is NOT invoked.
delete_calls: list = []
async def _spy_delete(chat_id, message_id):
delete_calls.append((chat_id, message_id))
return False
adapter.delete_message = _spy_delete # type: ignore[assignment]
event = _make_event()
session_key = "agent:main:telegram:private:42"
with patch("gateway.platforms.base.asyncio.sleep", AsyncMock()), patch.object(
adapter, "_keep_typing", new=AsyncMock()
):
await adapter._process_message_background(event, session_key)
for _ in range(10):
await asyncio.sleep(0)
# Send happened with the unwrapped text...
adapter._send_with_retry.assert_called_once()
assert adapter._send_with_retry.call_args.kwargs["content"] == "⚡ Stopped."
# ...but delete was never scheduled because the capability check skipped
# the schedule call (TTL was zeroed in _unwrap_ephemeral).
# Note: the capability gate on _unwrap_ephemeral checks for
# ``type(adapter).delete_message is BasePlatformAdapter.delete_message``.
# Monkeypatching the instance does NOT change the class, so this test
# verifies the gate uses the class method to detect capability.
assert delete_calls == []
@pytest.mark.asyncio
async def test_process_message_plain_string_behaves_unchanged():
adapter = _delete_adapter()
adapter._send_with_retry = AsyncMock(
return_value=SendResult(success=True, message_id="sent-1")
)
async def _handler(evt):
return "plain reply"
adapter.set_message_handler(_handler)
event = _make_event()
session_key = "agent:main:telegram:private:42"
with patch("gateway.platforms.base.asyncio.sleep", AsyncMock()), patch.object(
adapter, "_keep_typing", new=AsyncMock()
):
await adapter._process_message_background(event, session_key)
for _ in range(5):
await asyncio.sleep(0)
adapter._send_with_retry.assert_called_once()
assert adapter._send_with_retry.call_args.kwargs["content"] == "plain reply"
assert adapter.deleted == [] # no auto-delete for plain replies
+258 -115
View File
@@ -8,6 +8,7 @@ import time
import unittest
from pathlib import Path
from types import SimpleNamespace
from typing import Dict
from unittest.mock import AsyncMock, Mock, patch
from gateway.platforms.base import ProcessingOutcome
@@ -557,6 +558,16 @@ class TestAdapterModule(unittest.TestCase):
self.assertEqual(fake_client._ping_interval, 4)
def _admits_group(adapter, message, sender_id, chat_id=""):
"""Group-path shim: run a message through ``_admit`` and return a bool."""
sender = SimpleNamespace(sender_type="user", sender_id=sender_id)
if not hasattr(message, "chat_type"):
message.chat_type = "group"
if chat_id:
message.chat_id = chat_id
return adapter._admit(sender, message) is None
class TestAdapterBehavior(unittest.TestCase):
@patch.dict(os.environ, {}, clear=True)
def test_build_event_handler_registers_reaction_and_card_processors(self):
@@ -689,6 +700,67 @@ class TestAdapterBehavior(unittest.TestCase):
adapter._on_reaction_event("im.message.reaction.created_v1", data)
run_threadsafe.assert_called_once()
def _build_reaction_adapter(self, *, msg_sender_id: str):
"""Build a FeishuAdapter wired up to return a single GET-message result."""
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
adapter._app_id = "cli_self_app"
adapter._bot_open_id = "ou_self_bot"
adapter._bot_user_id = "u_self_bot"
msg = SimpleNamespace(
sender=SimpleNamespace(sender_type="app", id=msg_sender_id, id_type="app_id"),
chat_id="oc_chat",
chat_type="group",
)
response = SimpleNamespace(success=lambda: True, data=SimpleNamespace(items=[msg]))
adapter._client = SimpleNamespace(
im=SimpleNamespace(
v1=SimpleNamespace(message=SimpleNamespace(get=Mock(return_value=response)))
)
)
adapter._build_get_message_request = Mock(return_value=object())
adapter._handle_message_with_guards = AsyncMock()
adapter._resolve_sender_profile = AsyncMock(
return_value={"user_id": "u_human", "user_name": "Human", "user_id_alt": None}
)
adapter.get_chat_info = AsyncMock(return_value={"name": "Test Chat"})
return adapter
@patch.dict(os.environ, {}, clear=True)
def test_reaction_on_peer_bot_message_is_not_routed(self):
# GET im/v1/messages sender for bot messages carries id=app_id; a peer
# bot's message has a different app_id than ours, so it must be dropped.
adapter = self._build_reaction_adapter(msg_sender_id="cli_peer_app")
event = SimpleNamespace(
message_id="om_peer_msg",
user_id=SimpleNamespace(open_id="ou_human", user_id=None, union_id=None),
reaction_type=SimpleNamespace(emoji_type="THUMBSUP"),
)
data = SimpleNamespace(event=event)
asyncio.run(
adapter._handle_reaction_event("im.message.reaction.created_v1", data)
)
adapter._handle_message_with_guards.assert_not_awaited()
@patch.dict(os.environ, {}, clear=True)
def test_reaction_on_our_own_bot_message_is_routed(self):
adapter = self._build_reaction_adapter(msg_sender_id="cli_self_app")
event = SimpleNamespace(
message_id="om_self_msg",
user_id=SimpleNamespace(open_id="ou_human", user_id=None, union_id=None),
reaction_type=SimpleNamespace(emoji_type="THUMBSUP"),
)
data = SimpleNamespace(event=event)
asyncio.run(
adapter._handle_reaction_event("im.message.reaction.created_v1", data)
)
adapter._handle_message_with_guards.assert_awaited_once()
@patch.dict(os.environ, {"FEISHU_GROUP_POLICY": "open"}, clear=True)
def test_group_message_requires_mentions_even_when_policy_open(self):
from gateway.config import PlatformConfig
@@ -697,10 +769,10 @@ class TestAdapterBehavior(unittest.TestCase):
adapter = FeishuAdapter(PlatformConfig())
message = SimpleNamespace(mentions=[])
sender_id = SimpleNamespace(open_id="ou_any", user_id=None)
self.assertFalse(adapter._should_accept_group_message(message, sender_id, ""))
self.assertFalse(_admits_group(adapter, message, sender_id, ""))
message_with_mention = SimpleNamespace(mentions=[SimpleNamespace(key="@_user_1")])
self.assertFalse(adapter._should_accept_group_message(message_with_mention, sender_id, ""))
self.assertFalse(_admits_group(adapter, message_with_mention, sender_id, ""))
@patch.dict(os.environ, {"FEISHU_GROUP_POLICY": "open"}, clear=True)
def test_group_message_with_other_user_mention_is_rejected_when_bot_identity_unknown(self):
@@ -714,59 +786,10 @@ class TestAdapterBehavior(unittest.TestCase):
id=SimpleNamespace(open_id="ou_other", user_id="u_other"),
)
self.assertFalse(adapter._should_accept_group_message(SimpleNamespace(mentions=[other_mention]), sender_id, ""))
@patch.dict(
os.environ,
{
"FEISHU_BOT_OPEN_ID": "ou_hermes",
"FEISHU_BOT_USER_ID": "u_hermes",
},
clear=True,
)
def test_other_bot_sender_is_not_treated_as_self_sent_message(self):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
event = SimpleNamespace(
sender=SimpleNamespace(
sender_type="bot",
sender_id=SimpleNamespace(open_id="ou_other_bot", user_id="u_other_bot"),
)
self.assertFalse(
_admits_group(adapter, SimpleNamespace(mentions=[other_mention]), sender_id, "")
)
self.assertFalse(adapter._is_self_sent_bot_message(event))
@patch.dict(
os.environ,
{
"FEISHU_BOT_OPEN_ID": "ou_hermes",
"FEISHU_BOT_USER_ID": "u_hermes",
},
clear=True,
)
def test_self_bot_sender_is_treated_as_self_sent_message(self):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
by_open_id = SimpleNamespace(
sender=SimpleNamespace(
sender_type="bot",
sender_id=SimpleNamespace(open_id="ou_hermes", user_id="u_other"),
)
)
by_user_id = SimpleNamespace(
sender=SimpleNamespace(
sender_type="app",
sender_id=SimpleNamespace(open_id="ou_other", user_id="u_hermes"),
)
)
self.assertTrue(adapter._is_self_sent_bot_message(by_open_id))
self.assertTrue(adapter._is_self_sent_bot_message(by_user_id))
@patch.dict(
os.environ,
{
@@ -792,14 +815,14 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
mentioned,
SimpleNamespace(open_id="ou_allowed", user_id=None),
"",
)
)
self.assertFalse(
adapter._should_accept_group_message(
_admits_group(adapter,
mentioned,
SimpleNamespace(open_id="ou_blocked", user_id=None),
"",
@@ -828,14 +851,14 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_alice", user_id=None),
"oc_chat_a",
)
)
self.assertFalse(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_charlie", user_id=None),
"oc_chat_a",
@@ -864,14 +887,14 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_alice", user_id=None),
"oc_chat_b",
)
)
self.assertFalse(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_blocked", user_id=None),
"oc_chat_b",
@@ -900,14 +923,14 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_admin", user_id=None),
"oc_chat_c",
)
)
self.assertFalse(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_regular", user_id=None),
"oc_chat_c",
@@ -936,14 +959,14 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_admin", user_id=None),
"oc_chat_d",
)
)
self.assertFalse(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_regular", user_id=None),
"oc_chat_d",
@@ -973,7 +996,7 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_admin", user_id=None),
"oc_chat_e",
@@ -997,7 +1020,7 @@ class TestAdapterBehavior(unittest.TestCase):
)
self.assertTrue(
adapter._should_accept_group_message(
_admits_group(adapter,
message,
SimpleNamespace(open_id="ou_anyone", user_id=None),
"oc_chat_unknown",
@@ -1022,8 +1045,12 @@ class TestAdapterBehavior(unittest.TestCase):
id=SimpleNamespace(open_id="ou_other", user_id="u_other"),
)
self.assertTrue(adapter._should_accept_group_message(SimpleNamespace(mentions=[bot_mention]), sender_id, ""))
self.assertFalse(adapter._should_accept_group_message(SimpleNamespace(mentions=[other_mention]), sender_id, ""))
self.assertTrue(
_admits_group(adapter, SimpleNamespace(mentions=[bot_mention]), sender_id, "")
)
self.assertFalse(
_admits_group(adapter, SimpleNamespace(mentions=[other_mention]), sender_id, "")
)
@patch.dict(os.environ, {"FEISHU_GROUP_POLICY": "open"}, clear=True)
def test_group_message_matches_bot_name_when_only_name_available(self):
@@ -1048,8 +1075,12 @@ class TestAdapterBehavior(unittest.TestCase):
id=SimpleNamespace(open_id=None, user_id=None),
)
self.assertTrue(adapter._should_accept_group_message(SimpleNamespace(mentions=[name_only_mention]), sender_id, ""))
self.assertFalse(adapter._should_accept_group_message(SimpleNamespace(mentions=[different_mention]), sender_id, ""))
self.assertTrue(
_admits_group(adapter, SimpleNamespace(mentions=[name_only_mention]), sender_id, "")
)
self.assertFalse(
_admits_group(adapter, SimpleNamespace(mentions=[different_mention]), sender_id, "")
)
# Case 2: bot's open_id IS known — a same-name human with different
# open_id must NOT admit (IDs override names).
@@ -1066,8 +1097,17 @@ class TestAdapterBehavior(unittest.TestCase):
id=SimpleNamespace(open_id="ou_bot", user_id=None),
)
self.assertFalse(adapter2._should_accept_group_message(SimpleNamespace(mentions=[same_name_other_id_mention]), sender_id, ""))
self.assertTrue(adapter2._should_accept_group_message(SimpleNamespace(mentions=[bot_mention]), sender_id, ""))
self.assertFalse(
_admits_group(
adapter2,
SimpleNamespace(mentions=[same_name_other_id_mention]),
sender_id,
"",
)
)
self.assertTrue(
_admits_group(adapter2, SimpleNamespace(mentions=[bot_mention]), sender_id, "")
)
@patch.dict(os.environ, {}, clear=True)
def test_extract_post_message_as_text(self):
@@ -1411,6 +1451,7 @@ class TestAdapterBehavior(unittest.TestCase):
data=SimpleNamespace(event=SimpleNamespace(message=message)),
message=message,
sender_id=SimpleNamespace(open_id="ou_user", user_id=None, union_id=None),
is_bot=False,
chat_type="p2p",
message_id="om_command",
)
@@ -1522,13 +1563,14 @@ class TestAdapterBehavior(unittest.TestCase):
user_id="u_user",
union_id="on_union",
)
data = SimpleNamespace(event=SimpleNamespace(message=message, sender=SimpleNamespace(sender_id=sender_id)))
sender = SimpleNamespace(sender_type="user", sender_id=sender_id)
data = SimpleNamespace(event=SimpleNamespace(message=message, sender=sender))
asyncio.run(
adapter._process_inbound_message(
data=data,
message=message,
sender_id=sender_id,
sender_id=sender.sender_id,
chat_type="p2p",
message_id="om_text",
)
@@ -1761,13 +1803,14 @@ class TestAdapterBehavior(unittest.TestCase):
message_id="om_group_text",
)
sender_id = SimpleNamespace(open_id="ou_user", user_id=None, union_id=None)
sender = SimpleNamespace(sender_type="user", sender_id=sender_id)
data = SimpleNamespace(event=SimpleNamespace(message=message))
asyncio.run(
adapter._process_inbound_message(
data=data,
message=message,
sender_id=sender_id,
sender_id=sender.sender_id,
chat_type="group",
message_id="om_group_text",
)
@@ -1805,6 +1848,7 @@ class TestAdapterBehavior(unittest.TestCase):
data=SimpleNamespace(event=SimpleNamespace(message=message)),
message=message,
sender_id=SimpleNamespace(open_id="ou_user", user_id=None, union_id=None),
is_bot=False,
chat_type="p2p",
message_id="om_reply",
)
@@ -2667,11 +2711,12 @@ class TestAdapterBehavior(unittest.TestCase):
@unittest.skipUnless(_HAS_LARK_OAPI, "lark-oapi not installed")
class TestHydrateBotIdentity(unittest.TestCase):
"""Hydration of bot identity via /open-apis/bot/v3/info and application info.
"""Hydration of bot identity via ``/open-apis/bot/v3/info``.
Covers the manual-setup path where FEISHU_BOT_OPEN_ID / FEISHU_BOT_USER_ID
are not configured. Hydration must populate _bot_open_id so that
_is_self_sent_bot_message() can filter the adapter's own outbound echoes.
Covers the manual-setup path where ``FEISHU_BOT_OPEN_ID`` /
``FEISHU_BOT_NAME`` are not configured hydration populates them so
self-echo protection and group @mention gating both have something to
match against.
"""
def _make_adapter(self):
@@ -2700,11 +2745,6 @@ class TestHydrateBotIdentity(unittest.TestCase):
self.assertEqual(adapter._bot_open_id, "ou_hermes_hydrated")
self.assertEqual(adapter._bot_name, "Hermes Bot")
# Application-info fallback must NOT run when bot_name is already set.
self.assertFalse(
adapter._client.application.v6.application.get.called
if hasattr(adapter._client, "application") else False
)
@patch.dict(
os.environ,
@@ -2721,7 +2761,6 @@ class TestHydrateBotIdentity(unittest.TestCase):
asyncio.run(adapter._hydrate_bot_identity())
# Neither probe should run — both fields are already populated.
adapter._client.request.assert_not_called()
self.assertEqual(adapter._bot_open_id, "ou_env")
self.assertEqual(adapter._bot_name, "Env Hermes")
@@ -2766,33 +2805,6 @@ class TestHydrateBotIdentity(unittest.TestCase):
self.assertEqual(adapter._bot_open_id, "")
self.assertEqual(adapter._bot_name, "Fallback Bot")
@patch.dict(os.environ, {}, clear=True)
def test_hydrated_open_id_enables_self_send_filter(self):
"""E2E: after hydration, _is_self_sent_bot_message() rejects adapter's own id."""
adapter = self._make_adapter()
adapter._client = Mock()
payload = json.dumps(
{"code": 0, "bot": {"bot_name": "Hermes", "open_id": "ou_hermes"}}
).encode("utf-8")
adapter._client.request = Mock(return_value=SimpleNamespace(raw=SimpleNamespace(content=payload)))
asyncio.run(adapter._hydrate_bot_identity())
self_event = SimpleNamespace(
sender=SimpleNamespace(
sender_type="bot",
sender_id=SimpleNamespace(open_id="ou_hermes", user_id=""),
)
)
peer_event = SimpleNamespace(
sender=SimpleNamespace(
sender_type="bot",
sender_id=SimpleNamespace(open_id="ou_peer_bot", user_id=""),
)
)
self.assertTrue(adapter._is_self_sent_bot_message(self_event))
self.assertFalse(adapter._is_self_sent_bot_message(peer_event))
@unittest.skipUnless(_HAS_LARK_OAPI, "lark-oapi not installed")
class TestPendingInboundQueue(unittest.TestCase):
@@ -3137,7 +3149,7 @@ class TestGroupMentionAtAll(unittest.TestCase):
mentions=[],
)
sender_id = SimpleNamespace(open_id="ou_any", user_id=None)
self.assertTrue(adapter._should_accept_group_message(message, sender_id, ""))
self.assertTrue(_admits_group(adapter, message, sender_id, ""))
@patch.dict(os.environ, {"FEISHU_GROUP_POLICY": "allowlist", "FEISHU_ALLOWED_USERS": "ou_allowed"}, clear=True)
def test_at_all_still_requires_policy_gate(self):
@@ -3149,15 +3161,15 @@ class TestGroupMentionAtAll(unittest.TestCase):
message = SimpleNamespace(content='{"text":"@_all attention"}', mentions=[])
# Non-allowlisted user — should be blocked even with @_all.
blocked_sender = SimpleNamespace(open_id="ou_blocked", user_id=None)
self.assertFalse(adapter._should_accept_group_message(message, blocked_sender, ""))
self.assertFalse(_admits_group(adapter, message, blocked_sender, ""))
# Allowlisted user — should pass.
allowed_sender = SimpleNamespace(open_id="ou_allowed", user_id=None)
self.assertTrue(adapter._should_accept_group_message(message, allowed_sender, ""))
self.assertTrue(_admits_group(adapter, message, allowed_sender, ""))
@unittest.skipUnless(_HAS_LARK_OAPI, "lark-oapi not installed")
class TestSenderNameResolution(unittest.TestCase):
"""Tests for _resolve_sender_name_from_api."""
"""Tests for _resolve_sender_name_from_api (contact API + cache)."""
@patch.dict(os.environ, {}, clear=True)
def test_returns_none_when_client_is_none(self):
@@ -3261,6 +3273,137 @@ class TestSenderNameResolution(unittest.TestCase):
self.assertIsNone(result)
@unittest.skipUnless(_HAS_LARK_OAPI, "lark-oapi not installed")
class TestBotNameResolution(unittest.TestCase):
"""Tests for the bot branch of _resolve_sender_name_from_api (basic_batch API + shared cache)."""
@staticmethod
def _batch_payload(bots: Dict[str, str]):
import json as _json
body = {
oid: {"bot_id": oid, "name": name, "i18n_names": {"en_us": name}}
for oid, name in bots.items()
}
return _json.dumps({"code": 0, "msg": "", "data": {"bots": body, "failed_bots": {}}}).encode()
def _build_adapter_with_bots(self, bots: Dict[str, str]):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
calls = []
def _fake_request(request):
calls.append(request)
return SimpleNamespace(raw=SimpleNamespace(content=self._batch_payload(bots)))
adapter._client = SimpleNamespace(request=_fake_request)
return adapter, calls
@patch.dict(os.environ, {}, clear=True)
def test_returns_cached_bot_name_without_api_call(self):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
adapter._sender_name_cache["ou_peer"] = ("Peer Bot", time.time() + 600)
adapter._client = SimpleNamespace(
request=lambda _r: (_ for _ in ()).throw(RuntimeError("should not fetch"))
)
result = asyncio.run(adapter._resolve_sender_name_from_api("ou_peer", is_bot=True))
self.assertEqual(result, "Peer Bot")
@patch.dict(os.environ, {}, clear=True)
def test_fetches_and_caches_bot_name(self):
adapter, calls = self._build_adapter_with_bots({"ou_peer": "Peer Bot"})
async def _direct(func, *args, **kwargs):
return func(*args, **kwargs)
with patch("gateway.platforms.feishu.asyncio.to_thread", side_effect=_direct):
result = asyncio.run(adapter._resolve_sender_name_from_api("ou_peer", is_bot=True))
self.assertEqual(result, "Peer Bot")
self.assertEqual(adapter._sender_name_cache["ou_peer"][0], "Peer Bot")
self.assertEqual(len(calls), 1)
self.assertIn("/open-apis/bot/v3/bots/basic_batch", calls[0].uri)
# Feishu expects repeated ?bot_ids= params, not comma-joined.
self.assertEqual(calls[0].queries, [("bot_ids", "ou_peer")])
@patch.dict(os.environ, {}, clear=True)
def test_api_failure_returns_none_and_does_not_poison_cache(self):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
def _broken_request(_req):
raise RuntimeError("API down")
adapter._client = SimpleNamespace(request=_broken_request)
async def _direct(func, *args, **kwargs):
return func(*args, **kwargs)
with patch("gateway.platforms.feishu.asyncio.to_thread", side_effect=_direct):
result = asyncio.run(adapter._resolve_sender_name_from_api("ou_peer", is_bot=True))
self.assertIsNone(result)
self.assertNotIn("ou_peer", adapter._sender_name_cache)
@patch.dict(os.environ, {}, clear=True)
def test_bot_absent_from_response_is_not_cached(self):
"""Bot not in ``data.bots`` (e.g. landed in ``failed_bots``) → no
cache entry, next lookup re-fetches."""
adapter, _ = self._build_adapter_with_bots({"ou_other": "Other Bot"})
async def _direct(func, *args, **kwargs):
return func(*args, **kwargs)
with patch("gateway.platforms.feishu.asyncio.to_thread", side_effect=_direct):
result = asyncio.run(adapter._resolve_sender_name_from_api("ou_ghost", is_bot=True))
self.assertIsNone(result)
self.assertNotIn("ou_ghost", adapter._sender_name_cache)
@patch.dict(os.environ, {}, clear=True)
def test_empty_name_in_response_is_negative_cached(self):
"""API returns name="" → cache "" so repeat lookups short-circuit."""
adapter, calls = self._build_adapter_with_bots({"ou_nameless": ""})
async def _direct(func, *args, **kwargs):
return func(*args, **kwargs)
with patch("gateway.platforms.feishu.asyncio.to_thread", side_effect=_direct):
first = asyncio.run(adapter._resolve_sender_name_from_api("ou_nameless", is_bot=True))
second = asyncio.run(adapter._resolve_sender_name_from_api("ou_nameless", is_bot=True))
self.assertIsNone(first)
self.assertIsNone(second)
self.assertEqual(adapter._sender_name_cache["ou_nameless"][0], "")
self.assertEqual(len(calls), 1)
@patch.dict(os.environ, {}, clear=True)
def test_non_zero_code_returns_none(self):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
error_payload = b'{"code":99991663,"msg":"permission denied"}'
adapter._client = SimpleNamespace(
request=lambda _r: SimpleNamespace(raw=SimpleNamespace(content=error_payload))
)
async def _direct(func, *args, **kwargs):
return func(*args, **kwargs)
with patch("gateway.platforms.feishu.asyncio.to_thread", side_effect=_direct):
result = asyncio.run(adapter._resolve_sender_name_from_api("ou_peer", is_bot=True))
self.assertIsNone(result)
self.assertNotIn("ou_peer", adapter._sender_name_cache)
@unittest.skipUnless(_HAS_LARK_OAPI, "lark-oapi not installed")
class TestProcessingReactions(unittest.TestCase):
"""Typing on start → removed on SUCCESS, swapped for CrossMark on FAILURE,
+745
View File
@@ -0,0 +1,745 @@
"""Adapter-layer tests for Feishu bot-sender admission (``FeishuAdapter._admit``)."""
from __future__ import annotations
from types import SimpleNamespace
from typing import Any
import pytest
from tests.gateway.feishu_helpers import (
install_dedup_state,
make_adapter_skeleton,
make_message,
make_sender,
stub_mention,
)
# --- FeishuAdapterSettings wiring ------------------------------------------
@pytest.mark.parametrize(
"env_value, expected",
[
("none", "none"),
("mentions", "mentions"),
("all", "all"),
(" Mentions ", "mentions"),
],
)
def test_feishu_load_settings_populates_allow_bots(monkeypatch, env_value, expected):
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
monkeypatch.setenv("FEISHU_ALLOW_BOTS", env_value)
settings = FeishuAdapter._load_settings(extra={})
assert settings.allow_bots == expected
def test_feishu_load_settings_allow_bots_defaults_to_none(monkeypatch):
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
monkeypatch.delenv("FEISHU_ALLOW_BOTS", raising=False)
settings = FeishuAdapter._load_settings(extra={})
assert settings.allow_bots == "none"
def test_feishu_load_settings_ignores_extra_allow_bots(monkeypatch):
# extra is ignored — env is single source of truth (yaml is bridged to env).
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
monkeypatch.delenv("FEISHU_ALLOW_BOTS", raising=False)
settings = FeishuAdapter._load_settings(extra={"allow_bots": "all"})
assert settings.allow_bots == "none"
def test_feishu_load_settings_falls_back_to_env_when_extra_missing(monkeypatch):
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
monkeypatch.setenv("FEISHU_ALLOW_BOTS", "mentions")
settings = FeishuAdapter._load_settings(extra={})
assert settings.allow_bots == "mentions"
def test_feishu_load_settings_warns_on_unknown_allow_bots(monkeypatch, caplog):
import logging
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
monkeypatch.setenv("FEISHU_ALLOW_BOTS", "menton") # typo
with caplog.at_level(logging.WARNING, logger="gateway.platforms.feishu"):
settings = FeishuAdapter._load_settings(extra={})
assert settings.allow_bots == "none"
assert any("allow_bots" in r.message and "menton" in r.message for r in caplog.records)
@pytest.mark.parametrize(
"env_value, extra, expected",
[
(None, {}, True),
("false", {}, False),
("true", {}, True),
("true", {"require_mention": False}, False),
],
)
def test_feishu_load_settings_require_mention(monkeypatch, env_value, extra, expected):
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
if env_value is None:
monkeypatch.delenv("FEISHU_REQUIRE_MENTION", raising=False)
else:
monkeypatch.setenv("FEISHU_REQUIRE_MENTION", env_value)
settings = FeishuAdapter._load_settings(extra=extra)
assert settings.require_mention is expected
def test_feishu_load_settings_parses_per_group_require_mention(monkeypatch):
from gateway.platforms.feishu import FeishuAdapter
monkeypatch.setenv("FEISHU_APP_ID", "cli_test")
monkeypatch.setenv("FEISHU_APP_SECRET", "secret_test")
settings = FeishuAdapter._load_settings(extra={
"group_rules": {
"oc_free": {"policy": "open", "require_mention": False},
"oc_strict": {"policy": "open", "require_mention": True},
"oc_inherit": {"policy": "open"},
},
})
assert settings.group_rules["oc_free"].require_mention is False
assert settings.group_rules["oc_strict"].require_mention is True
assert settings.group_rules["oc_inherit"].require_mention is None
# --- Module-level helpers --------------------------------------------------
def test_sender_identity_collects_every_non_empty_id_variant():
from gateway.platforms.feishu import _sender_identity
sender = SimpleNamespace(
sender_id=SimpleNamespace(open_id="ou_x", user_id="", union_id="un_x"),
)
assert _sender_identity(sender) == frozenset({"ou_x", "un_x"})
def test_sender_identity_handles_missing_sender_id():
from gateway.platforms.feishu import _sender_identity
assert _sender_identity(SimpleNamespace()) == frozenset()
@pytest.mark.parametrize("sender_type", ["bot", "app"])
def test_is_bot_sender_treats_bot_and_app_as_bot_origin(sender_type):
from gateway.platforms.feishu import _is_bot_sender
assert _is_bot_sender(SimpleNamespace(sender_type=sender_type)) is True
@pytest.mark.parametrize("sender_type", ["user", "", None])
def test_is_bot_sender_rejects_non_bot_origin(sender_type):
from gateway.platforms.feishu import _is_bot_sender
assert _is_bot_sender(SimpleNamespace(sender_type=sender_type)) is False
# --- _admit pipeline matrix ------------------------------------------------
#
# Covers the four-step admission pipeline (self_echo → bot_policy →
# DM bypass → group_policy + mention) as a single result-only matrix.
# Each row pins one decision in the pipeline; tests asserting call-count
# semantics live below in their own functions.
def _admit_case(
*,
adapter: dict | None = None,
sender: dict | None = None,
message: dict | None = None,
mentions_self: bool | None = None,
expected: str | None = None,
):
return {
"adapter": adapter or {},
"sender": sender or {},
"message": message or {},
"mentions_self": mentions_self,
"expected": expected,
}
_ADMIT_CASES = [
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_me", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": "ou_me"},
expected="self_echo",
),
id="self_echo:open_id_under_all_mode",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "", "bot_user_id": "u_me", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": None, "user_id": "u_me"},
expected="self_echo",
),
id="self_echo:user_id_only",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_me", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": "ou_me", "user_id": "u_me", "union_id": "un_me"},
expected="self_echo",
),
id="self_echo:mixed_ids",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "bot_user_id": "u_self", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": None, "user_id": "u_self"},
expected="self_echo",
),
id="self_echo:user_id_when_bot_user_id_set",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "none"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
expected="bots_disabled",
),
id="bots_disabled:mode_none",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": ""},
sender={"sender_type": "bot", "open_id": "ou_peer"},
expected="bots_disabled",
),
id="bots_disabled:mode_empty",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "loose"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
expected="bots_disabled",
),
id="bots_disabled:mode_unknown_value",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "", "allow_bots": "none"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
expected="bots_disabled",
),
id="bots_disabled:wins_over_self_ids_unknown",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
expected="self_ids_unknown",
),
id="self_ids_unknown:bot_sender_no_self_ids",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "", "allow_bots": "all"},
sender={"sender_type": "app", "open_id": "ou_peer"},
expected="self_ids_unknown",
),
id="self_ids_unknown:app_sender_no_self_ids",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "all"},
sender={"sender_type": "app", "open_id": None},
expected="self_ids_unknown",
),
id="self_ids_unknown:no_sender_ids",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "mentions"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
mentions_self=False,
expected="bot_not_mentioned",
),
id="mentions_mode:not_mentioned_dm",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "mentions"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
mentions_self=True,
expected=None,
),
id="mentions_mode:mentioned_dm",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
mentions_self=False,
expected=None,
),
id="all_mode:not_mentioned_dm",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "ou_self", "allow_bots": "all"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
mentions_self=True,
expected=None,
),
id="all_mode:mentioned_dm",
),
pytest.param(
_admit_case(
adapter={"bot_open_id": "", "allow_bots": "none"},
sender={"sender_type": "user", "open_id": "ou_human"},
expected=None,
),
id="human:dm_admitted_regardless_of_allow_bots",
),
pytest.param(
_admit_case(
adapter={"allow_bots": "all"},
sender={"sender_type": "user", "open_id": "ou_human"},
message={"message_id": "om_ok", "chat_type": "p2p"},
expected=None,
),
id="human:p2p_admitted",
),
pytest.param(
_admit_case(
adapter={
"bot_open_id": "ou_self",
"require_mention": False,
"group_policy": "open",
},
sender={"sender_type": "user", "open_id": "ou_human"},
message={"chat_type": "group"},
mentions_self=False,
expected=None,
),
id="require_mention_false:group_human_no_mention_admitted",
),
pytest.param(
_admit_case(
adapter={
"bot_open_id": "ou_self",
"allow_bots": "all",
"require_mention": False,
"group_policy": "open",
},
sender={"sender_type": "bot", "open_id": "ou_peer"},
message={"chat_type": "group"},
mentions_self=False,
expected=None,
),
id="require_mention_false:group_bot_all_mode_admitted",
),
pytest.param(
_admit_case(
adapter={
"bot_open_id": "ou_self",
"allow_bots": "mentions",
"require_mention": False,
"group_policy": "open",
},
sender={"sender_type": "bot", "open_id": "ou_peer"},
message={"chat_type": "group"},
mentions_self=False,
expected="bot_not_mentioned",
),
id="require_mention_false:group_bot_mentions_mode_still_gated",
),
]
@pytest.mark.parametrize("case", _ADMIT_CASES)
def test_admit_pipeline(case):
adapter = make_adapter_skeleton(**case["adapter"])
if case["mentions_self"] is not None:
stub_mention(adapter, case["mentions_self"])
sender = make_sender(**case["sender"])
message = make_message(**case["message"])
assert adapter._admit(sender, message) == case["expected"]
# --- Mention call-count semantics ------------------------------------------
def test_admit_skips_mention_check_under_all_mode():
# Tripwire: under allow_bots=all the mention path must not be probed.
adapter = make_adapter_skeleton(bot_open_id="ou_self", allow_bots="all")
calls = 0
def _tripwire(_message):
nonlocal calls
calls += 1
return False
adapter._mentions_self = _tripwire
sender = make_sender(sender_type="bot", open_id="ou_peer")
assert adapter._admit(sender, make_message()) is None
assert calls == 0
def test_admit_group_mention_checked_once_per_call():
# Stage 2 (mentions mode) and stage 4 (group require_mention) must not
# double-evaluate _mentions_self for the same admit call.
adapter = make_adapter_skeleton(
bot_open_id="ou_self", allow_bots="mentions", require_mention=True,
group_policy="open",
)
calls = 0
def _counting(_message):
nonlocal calls
calls += 1
return True
adapter._mentions_self = _counting
sender = make_sender(sender_type="bot", open_id="ou_peer")
assert adapter._admit(sender, make_message(chat_type="group")) is None
assert calls == 1
# --- Per-group require_mention override ------------------------------------
def test_admit_per_group_require_mention_overrides_global():
from gateway.platforms.feishu import FeishuGroupRule
adapter = make_adapter_skeleton(
bot_open_id="ou_self", require_mention=True, group_policy="open",
)
adapter._group_rules = {
"oc_free": FeishuGroupRule(policy="open", require_mention=False),
}
stub_mention(adapter, False)
sender = make_sender(sender_type="user", open_id="ou_human")
assert adapter._admit(sender, make_message(chat_id="oc_free", chat_type="group")) is None
assert (
adapter._admit(sender, make_message(chat_id="oc_other", chat_type="group"))
== "group_policy_rejected"
)
# --- Hydration -------------------------------------------------------------
def test_hydrate_bot_identity_populates_self_ids_from_bot_v3_info(monkeypatch):
import asyncio
from gateway.platforms.feishu import FeishuAdapter
adapter = object.__new__(FeishuAdapter)
adapter._bot_open_id = ""
adapter._bot_user_id = ""
adapter._bot_name = ""
adapter._allow_bots = "all"
captured = {}
def _fake_request(request):
captured["uri"] = getattr(request, "uri", None)
captured["http_method"] = getattr(request, "http_method", None)
return SimpleNamespace(raw=SimpleNamespace(
content=b'{"code":0,"bot":{"app_name":"Hermes","open_id":"ou_hydrated"}}'
))
adapter._client = SimpleNamespace(request=_fake_request)
asyncio.run(adapter._hydrate_bot_identity())
assert captured["uri"] == "/open-apis/bot/v3/info"
assert str(captured["http_method"]).endswith("GET")
assert adapter._bot_open_id == "ou_hydrated"
assert adapter._bot_name == "Hermes"
# /bot/v3/info doesn't surface user_id, so _bot_user_id stays empty.
assert adapter._bot_user_id == ""
def test_resolve_sender_profile_uses_open_id_for_bot_name_lookup():
import asyncio
from gateway.platforms.feishu import FeishuAdapter
adapter = object.__new__(FeishuAdapter)
adapter._client = object()
adapter._sender_name_cache = {}
seen_ids = []
async def _fake_fetch_bot_names(bot_ids):
seen_ids.extend(bot_ids)
return {"ou_peer": "Peer Bot"}
adapter._fetch_bot_names = _fake_fetch_bot_names
profile = asyncio.run(
adapter._resolve_sender_profile(
SimpleNamespace(open_id="ou_peer", user_id="u_peer", union_id="on_peer"),
is_bot=True,
)
)
assert seen_ids == ["ou_peer"]
assert profile["user_id"] == "u_peer"
assert profile["user_name"] == "Peer Bot"
# --- _allow_group_message matrix -------------------------------------------
#
# Bot-bypass semantics: admitted bots skip allowlist/blacklist (parallel
# human-scope filters), but channel-level locks (disabled, admin_only) and
# admin short-circuits still apply.
def _group_case(
*,
adapter: dict | None = None,
admins: set | None = None,
group_rules: dict | None = None,
sender: dict | None = None,
chat_id: str = "oc_1",
is_bot: bool = False,
expected: bool = False,
):
return {
"adapter": adapter or {},
"admins": admins or set(),
"group_rules": group_rules or {},
"sender": sender or {},
"chat_id": chat_id,
"is_bot": is_bot,
"expected": expected,
}
def _group_rule(policy: str, **kwargs):
from gateway.platforms.feishu import FeishuGroupRule
return FeishuGroupRule(policy=policy, **kwargs)
_GROUP_CASES = [
pytest.param(
_group_case(
sender={"sender_type": "bot", "open_id": "ou_peer"},
is_bot=True,
expected=True,
),
id="bot:bypasses_default_allowlist",
),
pytest.param(
_group_case(
sender={"sender_type": "user", "open_id": "ou_stranger"},
is_bot=False,
expected=False,
),
id="human:gated_by_default_allowlist",
),
pytest.param(
_group_case(
admins={"ou_peer"},
sender={"sender_type": "bot", "open_id": "ou_peer"},
is_bot=True,
expected=True,
),
id="bot:admin_short_circuit",
),
pytest.param(
_group_case(
admins={"u_admin"},
sender={"sender_type": "user", "open_id": None, "user_id": "u_admin"},
is_bot=False,
expected=True,
),
id="human:admin_via_user_id",
),
pytest.param(
_group_case(
sender={"sender_type": "bot", "open_id": "ou_peer"},
is_bot=True,
expected=True,
),
id="bot:allowlist_skipped",
),
pytest.param(
_group_case(
sender={"sender_type": "app", "open_id": "ou_peer"},
is_bot=True,
expected=True,
),
id="app:allowlist_skipped",
),
]
# Channel-lock cases need group_rules construction; keep them in a separate
# parametrize so we can use _group_rule() (FeishuGroupRule import).
_GROUP_RULE_CASES = [
pytest.param(
"disabled", "bot", False,
id="bot:disabled_policy_blocks_even_with_bypass",
),
pytest.param(
"disabled", "app", False,
id="app:disabled_policy_blocks_even_with_bypass",
),
pytest.param(
"admin_only", "bot", False,
id="bot:admin_only_policy_blocks_non_admin",
),
pytest.param(
"admin_only", "app", False,
id="app:admin_only_policy_blocks_non_admin",
),
]
@pytest.mark.parametrize("case", _GROUP_CASES)
def test_allow_group_message_matrix(case):
adapter = make_adapter_skeleton(**case["adapter"])
adapter._admins = case["admins"]
adapter._group_rules = case["group_rules"]
sender = make_sender(**case["sender"])
assert adapter._allow_group_message(
sender_id=sender.sender_id,
chat_id=case["chat_id"],
is_bot=case["is_bot"],
) is case["expected"]
@pytest.mark.parametrize("policy, sender_type, expected", _GROUP_RULE_CASES)
def test_allow_group_message_channel_locks_apply_to_bots(policy, sender_type, expected):
adapter = make_adapter_skeleton()
adapter._group_rules = {"oc_locked": _group_rule(policy)}
sender = make_sender(sender_type=sender_type, open_id="ou_peer")
assert adapter._allow_group_message(
sender_id=sender.sender_id,
chat_id="oc_locked",
is_bot=True,
) is expected
@pytest.mark.parametrize("sender_type", ["bot", "app"])
def test_allow_group_message_blacklist_is_human_scope_only(sender_type):
# blacklist is parallel to allowlist (human-scope); admitted bots bypass
# it. To block a specific bot, gate upstream via FEISHU_ALLOW_BOTS.
adapter = make_adapter_skeleton()
adapter._group_rules = {
"oc_1": _group_rule("blacklist", blacklist={"ou_peer"})
}
sender = make_sender(sender_type=sender_type, open_id="ou_peer")
assert adapter._allow_group_message(
sender_id=sender.sender_id,
chat_id="oc_1",
is_bot=True,
) is True
# --- Realistic payload smoke -----------------------------------------------
def test_admit_accepts_realistic_bot_at_bot_group_event():
# Locks in the real im.message.receive_v1 payload shape under mode=mentions.
adapter = make_adapter_skeleton(bot_open_id="ou_self", allow_bots="mentions")
mention = SimpleNamespace(
key="@_user_1",
id=SimpleNamespace(union_id="on_mentionUnion", user_id="", open_id="ou_self"),
name="Hermes",
mentioned_type="bot",
tenant_key="tenant_ab",
)
message = SimpleNamespace(
message_id="om_realistic_bot_at_bot",
chat_id="oc_real",
chat_type="group",
message_type="text",
content='{"text":"@_user_1 hello"}',
mentions=[mention],
)
sender = SimpleNamespace(
sender_type="bot",
sender_id=SimpleNamespace(union_id="on_peerUnion", user_id="u_peer", open_id="ou_peer_bot"),
tenant_key="tenant_ab",
)
assert adapter._admit(sender, message) is None
# --- Event-dispatch plumbing -----------------------------------------------
def test_handle_message_event_data_drops_bot_sender_by_default():
import asyncio
adapter = make_adapter_skeleton()
install_dedup_state(adapter)
processed = []
async def _fake_process_inbound_message(**kwargs):
processed.append(kwargs)
adapter._process_inbound_message = _fake_process_inbound_message
data = SimpleNamespace(
event=SimpleNamespace(
sender=make_sender(sender_type="bot", open_id="ou_peer"),
message=make_message(message_id="om_bot_default", chat_type="p2p"),
)
)
asyncio.run(adapter._handle_message_event_data(data))
assert processed == []
def test_handle_message_event_data_forwards_sender_when_admitted():
import asyncio
adapter = make_adapter_skeleton(allow_bots="all")
install_dedup_state(adapter)
captured = {}
async def _fake_process_inbound_message(**kwargs):
captured.update(kwargs)
adapter._process_inbound_message = _fake_process_inbound_message
sender = make_sender(sender_type="bot", open_id="ou_peer")
data = SimpleNamespace(
event=SimpleNamespace(
sender=sender,
message=make_message(message_id="om_bot_ok", chat_type="p2p"),
)
)
asyncio.run(adapter._handle_message_event_data(data))
assert captured.get("sender_id") is sender.sender_id
assert captured.get("is_bot") is True
assert captured.get("message_id") == "om_bot_ok"

Some files were not shown because too many files have changed in this diff Show More