Compare commits

31 Commits

Author SHA1 Message Date
kshitijk4poor 25d9fc8094 fix(flush_memories): always deduct headroom + resolve flush aux model + trim defence
Three fixes for flush_memories / compression context window overflow:

1. ALWAYS deduct headroom before comparing aux_context vs threshold.
   #15631 only deducted inside 'if aux_context < threshold' — which
   never fires in the common same-model case (threshold = context × 0.50
   means aux_context > threshold always). Now headroom is computed
   unconditionally and effective_limit = aux_context - headroom is
   compared against threshold.

2. Also resolve the flush_memories auxiliary model in the feasibility check.
   If the user configures a separate auxiliary.flush_memories provider,
   the flush model's smaller context went unchecked.

3. Defence-in-depth trimming in flush_memories() for CLI /new and
   gateway resets that bypass preflight compression entirely.
2026-04-25 19:53:54 +05:30
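A minimal sketch of fix 1 in `25d9fc8094`, under stated assumptions: the function names, the `MINIMUM_CONTEXT_LENGTH` value, and the example numbers are hypothetical and are not taken from `_check_compression_model_feasibility`. Only the 12K static allowance and the "always deduct, then compare" rule come from the commit messages. It illustrates how a conditional deduction can miss the same-model case.

```python
# Illustrative only: names and constants are assumptions, not the project's code.
MINIMUM_CONTEXT_LENGTH = 8_000   # assumed floor; the real value is not given in the log
STATIC_HEADROOM = 12_000         # system prompt + flush-instruction allowance (#15631)


def old_threshold(threshold: int, aux_context: int, tool_tokens: int) -> int:
    """#15631 behaviour: headroom is deducted only when aux_context < threshold."""
    if aux_context < threshold:
        headroom = tool_tokens + STATIC_HEADROOM
        return max(aux_context - headroom, MINIMUM_CONTEXT_LENGTH)
    return threshold


def new_threshold(threshold: int, aux_context: int, tool_tokens: int) -> int:
    """Fixed behaviour: always compare threshold against aux_context - headroom."""
    headroom = tool_tokens + STATIC_HEADROOM
    effective_limit = max(aux_context - headroom, MINIMUM_CONTEXT_LENGTH)
    return min(threshold, effective_limit)


# Same model for main and aux (200K context), threshold at 85%, 28K of tool schemas.
before = old_threshold(170_000, 200_000, 28_000)  # unchanged: 170K messages + 40K overhead overflows
after = new_threshold(170_000, 200_000, 28_000)   # lowered so the full request fits
```

With these example numbers the old check leaves the threshold at 170K, while the unconditional version lowers it to 160K.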
kshitijk4poor d635e2df3f fix(compression): pass provider to context length resolver in feasibility check
_check_compression_model_feasibility calls get_model_context_length
without provider=, so Codex OAuth users get 1,050,000 (from models.dev
for 'openai') instead of the actual 272,000 limit. This happens because
_infer_provider_from_url maps chatgpt.com → 'openai' (not 'openai-codex'),
skipping the Codex-specific resolution branch entirely.

Result: compression threshold set at 85% of 1.05M = 892K — conversations
never trigger compression, the context grows unbounded, and when gateway
hygiene eventually forces compression, the Codex endpoint drops the
oversized streaming request ('peer closed connection without sending
complete message body').

Fix: forward self.provider to get_model_context_length so provider-
specific resolution branches (Codex OAuth 272K, Copilot live /models,
Nous suffix-match) fire correctly.

Reported by user on GPT 5.5 via Codex OAuth Pro (paste.rs/vsra3).
2026-04-25 07:09:47 -07:00
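The effect of forwarding the provider in `d635e2df3f` can be sketched as follows. The lookup tables, function names, and the 128K default are hypothetical. Only the two context sizes, the `chatgpt.com` to `'openai'` mapping, and the 85% ratio come from the commit message.

```python
from typing import Optional

# Hypothetical lookup tables; only the two numbers come from the commit message.
GENERIC_CONTEXT = {"openai": 1_050_000}
PROVIDER_SPECIFIC_CONTEXT = {"openai-codex": 272_000}


def infer_provider_from_url(base_url: str) -> str:
    # Mirrors the mapping described above: chatgpt.com resolves to plain 'openai'.
    if "chatgpt.com" in base_url or "openai.com" in base_url:
        return "openai"
    return "unknown"


def get_context_length(base_url: str, provider: Optional[str] = None) -> int:
    # An explicit provider wins; URL inference is only a fallback.
    if provider in PROVIDER_SPECIFIC_CONTEXT:
        return PROVIDER_SPECIFIC_CONTEXT[provider]
    inferred = provider or infer_provider_from_url(base_url)
    return GENERIC_CONTEXT.get(inferred, 128_000)  # 128K default is an assumption


url = "https://chatgpt.com/backend-api/codex"
without_provider = get_context_length(url)                        # generic 'openai' figure
with_provider = get_context_length(url, provider="openai-codex")  # Codex-specific limit
threshold_without = without_provider * 85 // 100                  # the ~892K from the message
```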
Teknium cf2fabc40f docs(dashboard): document page-scoped plugin slots (#15662)
Follow-up to PR #15658. The feature PR introduced page-scoped slots
(<page>:top / <page>:bottom inside every built-in page) but only
touched the Shell slots catalogue. Adds proper narrative coverage so
plugin authors find the feature.

Changes
- extending-the-dashboard.md:
  - Frontmatter description + intro bullet now mention page-scoped slots
  - New TOC entry "Augmenting built-in pages (page-scoped slots)"
  - New dedicated subsection after "Replacing built-in pages"
    explaining the heavy-vs-light tradeoff, listing the pages that
    expose slots, and showing a worked manifest + IIFE example with
    tab.hidden: true
  - Cross-link from the tab.override section pointing readers to the
    lighter augmentation option
- web-dashboard.md:
  - Bullet mentioning "page-scoped slots (inject widgets into
    built-in pages without overriding them)"

Validation
- TOC anchor "#augmenting-built-in-pages-page-scoped-slots" matches
  the generated heading slug
- Code fences balanced (64, even)
- Pre-existing docusaurus build errors (skills.json, api-server.md
  link) reproduce on bare main -- not introduced here
2026-04-25 06:59:24 -07:00
Teknium af22421e87 feat(dashboard): page-scoped plugin slots for built-in pages (#15658)
* fix(terminal): three-layer defense against watch_patterns notification spam

Background processes that stack notify_on_complete=True with watch_patterns
can flood the user with duplicate, delayed notifications — matches deliver
asynchronously via the completion queue and continue arriving minutes after
the process has exited. The docstring warning against this (PR #12113) has
proven insufficient; agents still misuse the combination.

Three layered defenses, each sufficient on its own:

1. Mutual exclusion (terminal_tool.py): When both flags are set on a
   background process, drop watch_patterns with a warning. notify_on_complete
   wins because 'let me know when it's done' is the more useful signal and
   fires exactly once. Extracted as _resolve_notification_flag_conflict() so
   the rule is testable in isolation.

2. Suppress-after-exit (process_registry.py): _check_watch_patterns() now
   bails the moment session.exited is True. Post-exit chunks (buffered reads
   draining after the process is gone) no longer produce notifications. This
   is the fix flagged as future work in session 20260418_020302_79881c.

3. Global circuit breaker (process_registry.py): Per-session rate limits don't
   catch the sibling-flood case — N concurrent processes can each stay under
   8/10s and still collectively spam. New WATCH_GLOBAL_MAX_PER_WINDOW=15 cap
   trips a 30-second cooldown across ALL sessions, emits a single
   watch_overflow_tripped event, silently counts dropped events, and emits a
   watch_overflow_released summary when the cooldown ends.

Also updates the tool schema + docstring to document the new behavior.

Tests: 8 new tests covering all three fixes (suppress-after-exit x2,
mutual-exclusion resolver x4, global breaker trip/cooldown/release x2).
All 60 tests across test_watch_patterns.py, test_notify_on_complete.py,
test_terminal_tool.py pass.

Real-world trigger: self-inflicted in session 20260425_051924 — three
concurrent hermes-sweeper review subprocesses each set watch_patterns=
['failed validation', 'errored'] AND notify_on_complete=True, then iterated
over multiple items, producing enough matches per process to defeat the
per-session cap while staying under the global cap that didn't yet exist.

* fix(terminal): aggressive 1-per-15s watch_patterns rate limit + strike-3 promotion

Per Teknium's direction, the watch_patterns rate limit is now much more
aggressive and self-healing.

## New rule — per session

- HARD cap: 1 watch-match notification per 15 seconds per process.
- Any match arriving inside the cooldown window is dropped and counts as
  ONE strike for that window (many drops in the same window still = 1 strike).
- After 3 consecutive strike windows, watch_patterns is permanently disabled
  for the session and the session is auto-promoted to notify_on_complete
  semantics — exactly one notification when the process actually exits.
- A cooldown window that expires with zero drops resets the consecutive
  strike counter — healthy cadence is forgiven.

## Schema + docstring rewritten

The tool schema description now gives the model explicit guidance:
- notify_on_complete is 'the right choice for almost every long-running task'
- watch_patterns is for RARE one-shot signals on LONG-LIVED processes
- Do NOT use watch_patterns with loops/batch jobs — error patterns fire every
  iteration and will hit the strike limit fast
- Mutual exclusion is stated on both parameter descriptions
- 1/15s cooldown and 3-strike promotion are stated in the watch_patterns
  description so the model sees the contract every turn

## Removed

- WATCH_MAX_PER_WINDOW (8/10s) and WATCH_OVERLOAD_KILL_SECONDS (45) — the
  new 1/15s limit subsumes both; keeping them would double-count.
- _watch_window_hits / _watch_window_start / _watch_overload_since fields
  on ProcessSession. Replaced by _watch_last_emit_at / _watch_cooldown_until
  / _watch_strike_candidate / _watch_consecutive_strikes.

## Kept

- Global circuit breaker across all sessions (15/10s → 30s cooldown) as a
  secondary safety net for concurrent siblings. Still valuable when 20
  short-lived processes each fire once — none individually violates the
  per-session limit.
- Suppress-after-exit guard.
- Mutual exclusion resolver at the tool entry point.

## Tests

- 6 new tests in TestPerSessionRateLimit covering: first match delivers,
  second in cooldown suppressed, multi-drop = single strike, 3 strikes
  disables + promotes, clean window resets counter, suppressed count
  carried to next emit.
- Global circuit breaker tests rewritten to use fresh sessions instead of
  hacking removed per-window fields.
- 50/50 watch_patterns + notify_on_complete tests pass.
- 60/60 including test_terminal_tool.py pass.

* feat(dashboard): page-scoped plugin slots for built-in pages

Dashboard plugins can now inject components into specific built-in
pages (Sessions, Analytics, Logs, Cron, Skills, Config, Env, Docs,
Chat) without overriding the whole route.

Previously, plugins could only:
  1. Add new tabs (tab.path)
  2. Replace whole built-in pages (tab.override)
  3. Inject into global shell slots (header-*, footer-*, pre-main, ...)

None of those let a plugin add a banner, card, or widget to an
existing page. The new <page>:top / <page>:bottom slots close that
gap, reusing the existing registerSlot() API.

Changes
- web/src/plugins/slots.ts: 18 new KNOWN_SLOT_NAMES entries
  (sessions:top, sessions:bottom, analytics:top, ..., chat:bottom),
  grouped under "Shell-wide" vs "Page-scoped" in the docblock
- web/src/pages/*: each built-in page now renders
    <PluginSlot name="<page>:top" />
  as the first child of its outer wrapper and
    <PluginSlot name="<page>:bottom" />
  as the last child -- zero visual cost when no plugin registers
- plugins/example-dashboard: registers a demo banner into
  sessions:top via registerSlot(), with matching slots entry in
  the manifest -- so freshly-setup users can see what page-scoped
  slots look like without writing any plugin code
- website/docs: new "Page-scoped slots" table in the plugin
  authoring guide, with a worked example
- tests/hermes_cli/test_web_server.py: round-trip test for
  colon-bearing slot names (sessions:top, analytics:bottom, ...)

Validation
- npm run build: clean (tsc -b + vite build, 2761 modules)
- scripts/run_tests.sh tests/hermes_cli/test_web_server.py::TestDashboardPluginManifestExtensions: 5/5 pass
2026-04-25 06:55:35 -07:00
Teknium 97d54f0e4d fix(terminal): three-layer defense against watch_patterns notification spam (#15642)
* fix(terminal): three-layer defense against watch_patterns notification spam

Background processes that stack notify_on_complete=True with watch_patterns
can flood the user with duplicate, delayed notifications — matches deliver
asynchronously via the completion queue and continue arriving minutes after
the process has exited. The docstring warning against this (PR #12113) has
proven insufficient; agents still misuse the combination.

Three layered defenses, each sufficient on its own:

1. Mutual exclusion (terminal_tool.py): When both flags are set on a
   background process, drop watch_patterns with a warning. notify_on_complete
   wins because 'let me know when it's done' is the more useful signal and
   fires exactly once. Extracted as _resolve_notification_flag_conflict() so
   the rule is testable in isolation.

2. Suppress-after-exit (process_registry.py): _check_watch_patterns() now
   bails the moment session.exited is True. Post-exit chunks (buffered reads
   draining after the process is gone) no longer produce notifications. This
   is the fix flagged as future work in session 20260418_020302_79881c.

3. Global circuit breaker (process_registry.py): Per-session rate limits don't
   catch the sibling-flood case — N concurrent processes can each stay under
   8/10s and still collectively spam. New WATCH_GLOBAL_MAX_PER_WINDOW=15 cap
   trips a 30-second cooldown across ALL sessions, emits a single
   watch_overflow_tripped event, silently counts dropped events, and emits a
   watch_overflow_released summary when the cooldown ends.

Also updates the tool schema + docstring to document the new behavior.

Tests: 8 new tests covering all three fixes (suppress-after-exit x2,
mutual-exclusion resolver x4, global breaker trip/cooldown/release x2).
All 60 tests across test_watch_patterns.py, test_notify_on_complete.py,
test_terminal_tool.py pass.

Real-world trigger: self-inflicted in session 20260425_051924 — three
concurrent hermes-sweeper review subprocesses each set watch_patterns=
['failed validation', 'errored'] AND notify_on_complete=True, then iterated
over multiple items, producing enough matches per process to defeat the
per-session cap while staying under the global cap that didn't yet exist.

* fix(terminal): aggressive 1-per-15s watch_patterns rate limit + strike-3 promotion

Per Teknium's direction, the watch_patterns rate limit is now much more
aggressive and self-healing.

## New rule — per session

- HARD cap: 1 watch-match notification per 15 seconds per process.
- Any match arriving inside the cooldown window is dropped and counts as
  ONE strike for that window (many drops in the same window still = 1 strike).
- After 3 consecutive strike windows, watch_patterns is permanently disabled
  for the session and the session is auto-promoted to notify_on_complete
  semantics — exactly one notification when the process actually exits.
- A cooldown window that expires with zero drops resets the consecutive
  strike counter — healthy cadence is forgiven.

## Schema + docstring rewritten

The tool schema description now gives the model explicit guidance:
- notify_on_complete is 'the right choice for almost every long-running task'
- watch_patterns is for RARE one-shot signals on LONG-LIVED processes
- Do NOT use watch_patterns with loops/batch jobs — error patterns fire every
  iteration and will hit the strike limit fast
- Mutual exclusion is stated on both parameter descriptions
- 1/15s cooldown and 3-strike promotion are stated in the watch_patterns
  description so the model sees the contract every turn

## Removed

- WATCH_MAX_PER_WINDOW (8/10s) and WATCH_OVERLOAD_KILL_SECONDS (45) — the
  new 1/15s limit subsumes both; keeping them would double-count.
- _watch_window_hits / _watch_window_start / _watch_overload_since fields
  on ProcessSession. Replaced by _watch_last_emit_at / _watch_cooldown_until
  / _watch_strike_candidate / _watch_consecutive_strikes.

## Kept

- Global circuit breaker across all sessions (15/10s → 30s cooldown) as a
  secondary safety net for concurrent siblings. Still valuable when 20
  short-lived processes each fire once — none individually violates the
  per-session limit.
- Suppress-after-exit guard.
- Mutual exclusion resolver at the tool entry point.

## Tests

- 6 new tests in TestPerSessionRateLimit covering: first match delivers,
  second in cooldown suppressed, multi-drop = single strike, 3 strikes
  disables + promotes, clean window resets counter, suppressed count
  carried to next emit.
- Global circuit breaker tests rewritten to use fresh sessions instead of
  hacking removed per-window fields.
- 50/50 watch_patterns + notify_on_complete tests pass.
- 60/60 including test_terminal_tool.py pass.
2026-04-25 06:41:58 -07:00
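The per-session rule from `97d54f0e4d` (1 notification per 15 s, one strike per window with drops, promotion after 3 consecutive strike windows) might look roughly like this. This is a sketch under assumptions: the class and attribute names are hypothetical and do not match the `_watch_*` fields on `ProcessSession`, and counting the strike at the first drop of a window is one possible reading of the rule.

```python
from dataclasses import dataclass

COOLDOWN_SECONDS = 15.0
MAX_CONSECUTIVE_STRIKES = 3


@dataclass
class WatchLimiter:
    cooldown_until: float = 0.0
    window_had_drop: bool = False
    consecutive_strikes: int = 0
    disabled: bool = False  # True means: promoted to notify-on-exit only

    def on_match(self, now: float) -> bool:
        """Return True if this match should produce a notification."""
        if self.disabled:
            return False
        if now < self.cooldown_until:
            # Inside the cooldown: drop. Many drops in one window are one strike.
            if not self.window_had_drop:
                self.window_had_drop = True
                self.consecutive_strikes += 1
                if self.consecutive_strikes >= MAX_CONSECUTIVE_STRIKES:
                    self.disabled = True
            return False
        # Cooldown expired: a clean previous window forgives earlier strikes.
        if not self.window_had_drop:
            self.consecutive_strikes = 0
        self.window_had_drop = False
        self.cooldown_until = now + COOLDOWN_SECONDS
        return True
```

Passing `now` in explicitly, rather than reading the clock inside the method, keeps the rule testable without sleeping.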
Teknium 6e561ffa6d fix(update): poll is-active instead of one-shot sleep(3) after gateway restart (#15639)
The auto-restart path in `hermes update` verifies systemd unit health with
`time.sleep(3)` + a single `systemctl is-active` call.  The unit's
Stopped -> Started transition after a graceful SIGUSR1 exit (or a hard
restart) is not always complete inside that 3s window, so the verify
races and reports 'drained but didn't relaunch' even though systemd is
about to bring the unit back up a fraction of a second later.  Users
then see a spurious warning, a redundant fallback `systemctl restart`
fires, and adapters (Discord, WhatsApp) get restarted twice.

Replace the three sleep+oneshot sites with a small `_wait_for_service_active()`
closure that polls `is-active` every 0.5s for up to 10s.  Behaviour
is unchanged when the unit is healthy or truly dead — only the race
window around a clean restart is now handled correctly.

Tests: tests/hermes_cli/test_update_gateway_restart.py (41/41).
2026-04-25 06:11:22 -07:00
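A rough sketch of the polling approach in `6e561ffa6d`, assuming a hypothetical standalone helper rather than the closure used in `hermes update`. The health check is injected so the loop can be exercised without systemd. The `systemd_is_active` helper relies on `systemctl is-active --quiet`, which exits 0 only when the unit is active.

```python
import subprocess
import time
from typing import Callable


def systemd_is_active(unit: str) -> bool:
    # `systemctl is-active --quiet UNIT` exits 0 only when the unit is active.
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def wait_for_service_active(
    is_active: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_active() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if is_active():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A caller would pass something like `lambda: systemd_is_active("hermes-gateway.service")`, where the unit name is a placeholder.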
Teknium ac05daa189 fix(tools): dedupe bundled plugin toolsets with built-in entries (#15634)
`hermes tools` → "reconfigure existing" listed Spotify twice because
the Apr 24 refactor that moved Spotify into plugins/spotify/ (PR #15174)
left the entry in CONFIGURABLE_TOOLSETS. _get_effective_configurable_toolsets()
unconditionally appended get_plugin_toolsets() on top, so the same
'spotify' key showed up from both sources.

Dedupe by key — built-in CONFIGURABLE_TOOLSETS entry wins (it has the
nicer label and description). Also guards against future bundled plugins
that share a toolset key with a built-in.
2026-04-25 05:53:08 -07:00
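The dedupe-by-key rule in `ac05daa189` can be sketched like this. The `(key, label, description)` tuple shape and the function name are assumptions for illustration, not the real structure of `CONFIGURABLE_TOOLSETS`.

```python
from typing import List, Tuple

Toolset = Tuple[str, str, str]  # (key, label, description); shape is assumed


def merge_toolsets(builtin: List[Toolset], plugin: List[Toolset]) -> List[Toolset]:
    """Append plugin toolsets, skipping any whose key a built-in already uses."""
    seen = {key for key, _, _ in builtin}
    merged = list(builtin)
    for entry in plugin:
        if entry[0] not in seen:
            seen.add(entry[0])
            merged.append(entry)
    return merged


builtin = [("spotify", "Spotify", "Control playback")]
plugin = [("spotify", "spotify", "plugin copy"), ("weather", "Weather", "Forecasts")]
merged = merge_toolsets(builtin, plugin)
```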
Teknium 3c1c65e754 fix(auxiliary): generalize unsupported-parameter detector and harden max_tokens retry (#15633)
Generalize the temperature-specific 400 retry that shipped in PR #15621 so
the same reactive strategy covers any provider that rejects an arbitrary
request parameter, not just temperature.

- agent/auxiliary_client.py:
  * New _is_unsupported_parameter_error(exc, param): matches the same six
    phrasings the old temperature detector did plus 'unrecognized parameter'
    and 'invalid parameter', against any named param.
  * _is_unsupported_temperature_error is now a thin back-compat wrapper so
    existing imports and tests keep working.
  * The max_tokens → max_completion_tokens retry branch in call_llm and
    async_call_llm now (a) gates on 'max_tokens is not None' so we do not
    pop a key that was never set and silently substitute a None value on
    the retry, and (b) also matches the generic helper in addition to the
    legacy 'max_tokens' / 'unsupported_parameter' substring checks — picking
    up phrasings like 'Unknown parameter: max_tokens' that previously slipped
    through.

- tests/agent/test_unsupported_parameter_retry.py: 18 new tests covering
  the generic detector across params, the back-compat wrapper, and the two
  hardenings to the max_tokens retry branch (None gate + generic phrasing).

Credit: retry-generalization pattern from @nicholasrae's PR #15416. That PR
also proposed the reactive temperature retry which landed independently via
PR #15621 + #15623 (co-authored with @BlueBirdBack). This commit salvages
the remaining hardening ideas onto current main.
2026-04-25 05:50:34 -07:00
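A sketch of the two ideas in `3c1c65e754`: a detector that works for any named parameter, and a `max_tokens` retry gated on the value not being `None`. The function names and the phrase list are illustrative assumptions. The commit message names only some of the phrasings, and the real code in `agent/auxiliary_client.py` may differ.

```python
from typing import Any, Callable, Dict

# Illustrative subset of phrasings; the real detector's full list is not in the log.
_PHRASES = (
    "unsupported parameter",
    "unknown parameter",
    "unrecognized parameter",
    "invalid parameter",
)


def is_unsupported_parameter_error(exc: Exception, param: str) -> bool:
    """True if the error text names `param` and uses a 'rejected parameter' phrasing."""
    msg = str(exc).lower()
    return param.lower() in msg and any(p in msg for p in _PHRASES)


def call_with_max_tokens_retry(send: Callable[..., Any], kwargs: Dict[str, Any]) -> Any:
    """Call send(**kwargs); retry once with max_completion_tokens if max_tokens is rejected."""
    try:
        return send(**kwargs)
    except Exception as exc:
        max_tokens = kwargs.get("max_tokens")
        # None gate: never substitute a None value on the retry.
        if max_tokens is None or not is_unsupported_parameter_error(exc, "max_tokens"):
            raise
        retry = {k: v for k, v in kwargs.items() if k != "max_tokens"}
        retry["max_completion_tokens"] = max_tokens
        return send(**retry)
```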
Teknium f92006ce1c fix(compression): reserve system+tools headroom when aux binds threshold (#15631)
When the auxiliary compression model's context is smaller than the main
model's compression threshold, _check_compression_model_feasibility
auto-lowers the session threshold. Previously it set:

    new_threshold = aux_context

This let the raw message list grow to exactly aux_context tokens. But
compression and flush_memories actually send system_prompt + tool_schemas
+ messages to the aux model. With 50+ tools that overhead is 25-30K
tokens, so the full request overflowed aux with HTTP 400.

Subtract a headroom estimate from aux_context before setting the new
threshold: the actual tool-schema token count (from
estimate_request_tokens_rough) plus a 12K allowance for the system
prompt (not yet built at __init__ time) and flush-instruction overhead.
Clamp to MINIMUM_CONTEXT_LENGTH so the session still starts even with
an unusually heavy tool schema.

This fixes the 'flush_memories overflow on busy toolsets' path that
Teknium flagged — where main and aux can be nominally the same model
but still 400 because the threshold left no room for the request
overhead. Same fix also protects the normal compression summarisation
request on the same binding aux.

Tests: two new regression tests cover the headroom reservation and the
MINIMUM_CONTEXT_LENGTH floor. Two existing tests updated for the new
(lower) threshold values now that empty-tools still produces a 12K
static headroom deduction.
2026-04-25 05:41:56 -07:00
Teknium b35d692f45 chore(release): map ash@users.noreply.github.com to ash 2026-04-25 05:27:17 -07:00
Ash Rowan Vale 🌿 facea84559 fix(auxiliary): retry without temperature when any provider rejects it
Universal reactive fix for 'HTTP 400: Unsupported parameter: temperature'
across all providers/models — not just Codex Responses.

The same backend can accept temperature for some models and reject it for
others (e.g. gpt-5.4 accepts but gpt-5.5 rejects on the same OpenAI
endpoint; similar patterns on Copilot, OpenRouter reasoning routes, and
Anthropic Opus 4.7+ via OAI-compat). An allow/deny-list by model name does
not scale.

call_llm / async_call_llm now detect the concrete 'unsupported parameter:
temperature' 400 and transparently retry once without temperature. Kimi's
server-managed omission and Opus 4.7+'s proactive strip stay in place —
this is the safety net for everything else.

Changes:
- agent/auxiliary_client.py: add _is_unsupported_temperature_error helper;
  wire into both sync and async call_llm paths before the existing
  max_tokens/payment/auth retry ladder
- tests/agent/test_unsupported_temperature_retry.py: 19 tests covering
  detector phrasings, sync + async retry, no-retry-without-temperature,
  and non-temperature 400s not triggering the retry

Builds on PR #15620 (codex_responses fallback) which stripped temperature
up front for that one api_mode. This PR closes the gap for every other
provider/model combo via reactive retry.

Credit: retry approach and detector originate from @BlueBirdBack's PR #15578.

Co-authored-by: BlueBirdBack <BlueBirdBack@users.noreply.github.com>
2026-04-25 05:27:17 -07:00
Teknium f67a61dc93 fix(flush_memories): strip temperature from codex_responses fallback (#15620)
The memory-flush fallback for api_mode='codex_responses' was unconditionally
adding `temperature` to codex_kwargs before calling _run_codex_stream. The
Responses API does not accept temperature on any supported backend:

- chatgpt.com/backend-api/codex rejects it outright
- api.openai.com + gpt-5/o-series reasoning models reject it
- Copilot Responses rejects it on reasoning models

The CodexAuxiliaryClient adapter and the codex_responses transport both
correctly omit temperature — the flush fallback was the only path putting
it back. On errors from the primary aux path (e.g. expired OAuth token),
users saw `⚠ Auxiliary memory flush failed: HTTP 400: Unsupported parameter:
temperature`.

Reported by Garik [NOUS] on GPT-5.5 via Codex OAuth Pro.
2026-04-25 05:01:25 -07:00
Teknium 6ed37e0f42 feat(tools): make discord/discord_admin opt-in, Discord-only
Both discord (read/participate) and discord_admin (server admin) are now
configurable via `hermes tools` with default-OFF. Previously the core
discord tool (fetch_messages, search_members, create_thread) auto-loaded
on every Discord install with DISCORD_BOT_TOKEN set — 19 tools the user
never opted into.

Adds a platform-scoping mechanism (_TOOLSET_PLATFORM_RESTRICTIONS) so
the discord toolsets only show up in the Discord platform's checklist,
not on CLI/Telegram/Slack/etc. Applied at four gates:
  - _prompt_toolset_checklist: checklist filter
  - _get_platform_tools: resolution filter (both branches)
  - _save_platform_tools: save-time filter (covers 'Configure all
    platforms' and hand-edited config.yaml)
  - tools_disable_enable_command: rejects `hermes tools enable discord`
    on non-Discord platforms with a clear error

build_session_context_prompt now injects the Discord IDs block only
when both conditions hold: the discord/discord_admin toolset is
enabled AND DISCORD_BOT_TOKEN is set. Toolset alone isn't enough —
the tool's check_fn gates on the token at registry time, so opting
in without a token yields no tools and the IDs block would lie.
Otherwise keep the stale-API disclaimer.
2026-04-25 04:51:11 -07:00
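The platform scoping and the two-condition gate described in `6ed37e0f42` can be sketched as follows. The shape of `_TOOLSET_PLATFORM_RESTRICTIONS` (toolset name mapped to a set of allowed platforms) and the helper names are assumptions. The log names the mechanism but does not show its structure.

```python
from typing import Dict, Iterable, List, Mapping, Set

# Assumed shape: toolset name -> platforms on which it may appear.
_TOOLSET_PLATFORM_RESTRICTIONS: Dict[str, Set[str]] = {
    "discord": {"discord"},
    "discord_admin": {"discord"},
}


def toolset_allowed(toolset: str, platform: str) -> bool:
    """Unrestricted toolsets are allowed everywhere; restricted ones only where listed."""
    allowed = _TOOLSET_PLATFORM_RESTRICTIONS.get(toolset)
    return allowed is None or platform in allowed


def filter_toolsets(toolsets: Iterable[str], platform: str) -> List[str]:
    return [t for t in toolsets if toolset_allowed(t, platform)]


def should_inject_discord_ids(enabled: Iterable[str], env: Mapping[str, str]) -> bool:
    """Both conditions must hold: a discord toolset is enabled AND the token is set."""
    has_toolset = bool({"discord", "discord_admin"} & set(enabled))
    return has_toolset and bool(env.get("DISCORD_BOT_TOKEN"))
```

Applying the same predicate at every gate (checklist, resolution, save, enable command) would keep the four code paths from drifting apart.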
alt-glitch 591deeb928 feat(session): inject Discord IDs block when discord tool is loaded
When DISCORD_BOT_TOKEN is set — meaning the discord tool actually
loads — emit a dedicated IDs block in the session context prompt so
the agent can call ``fetch_messages``, ``pin_message``, etc. with
real identifiers instead of probing.

Previously only ``thread_id`` was exposed as a raw ID (via the
``description`` string).  The agent in a Discord thread had to guess
that the thread ID doubles as a channel ID for the REST API (it
does), and it had no way to reference the parent channel, the guild,
or the triggering message at all.

The block adapts to context:

  - Thread:     guild / parent channel / thread / message
  - Channel:    guild / channel / message
  - (DM has no guild/channel IDs worth listing; only message)

Discord isn't in _PII_SAFE_PLATFORMS, so IDs ship unredacted.
2026-04-25 04:51:11 -07:00
alt-glitch 5ae07e7b5c fix(session): gate stale "no Discord APIs" note on DISCORD_BOT_TOKEN
The Discord platform note in the session context prompt claimed the
agent has no server-management APIs — pre-dating the discord tool.
With a bot token configured the agent actually has fetch_messages,
search_members, create_thread, and optionally the discord_admin tool;
telling the model otherwise causes it to refuse or apologise for
calls it is fully able to make.

Gate the disclaimer on DISCORD_BOT_TOKEN being unset, matching the
tool's own ``check_fn``.  Without a token the note still appears and
remains accurate; with a token the model is no longer gaslit into
refusing valid tool calls.
2026-04-25 04:51:11 -07:00
alt-glitch 47b02e961c feat(discord): populate guild_id, parent_chat_id, message_id on SessionSource
Discord knows all four identifiers for every inbound message — guild,
channel (or thread), parent channel when in a thread, and the
triggering message.  Pass them into ``SessionSource`` via the new
``build_source()`` kwargs so downstream code (context-prompt builder,
delivery, logging) can use them without re-resolving from discord.py
objects.

For auto-threaded messages, remember the original channel as the
parent before swapping ``chat_id`` to the freshly created thread.

Behavioural: still a no-op — nothing consumes these fields yet.
2026-04-25 04:51:11 -07:00
alt-glitch 0702231dd8 feat(session): add guild_id/parent_chat_id/message_id to SessionSource
Groundwork for injecting raw platform identifiers into the agent's
system prompt.  Currently only `thread_id` is exposed as a raw ID —
callers in a Discord thread had to guess `channel_id == thread_id`
(which happens to work because threads are channels in Discord's REST
API) and had no way to reference the parent channel, guild, or the
triggering message.

Adds three optional fields:

- `guild_id` — Discord guild / Slack workspace / Matrix server scope
- `parent_chat_id` — parent channel when chat_id refers to a thread
- `message_id` — ID of the triggering message (pin/reply/react)

Extends `BasePlatformAdapter.build_source()` to accept + forward them
and teaches `to_dict`/`from_dict` to serialize them.  Behaviourally a
no-op: nothing reads the fields yet and they default to None.
2026-04-25 04:51:11 -07:00
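A simplified stand-in for the `SessionSource` change in `0702231dd8`, assuming a dataclass with only a handful of fields. The real class and its `to_dict`/`from_dict` are not shown in this log, so everything beyond the three new field names is hypothetical.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class SessionSource:
    # Simplified stand-in: the real class has more fields than shown here.
    platform: str
    chat_id: str
    thread_id: Optional[str] = None
    guild_id: Optional[str] = None         # guild / workspace / server scope
    parent_chat_id: Optional[str] = None   # parent channel when chat_id is a thread
    message_id: Optional[str] = None       # ID of the triggering message

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
        # Ignore unknown keys and let missing optional keys default to None,
        # so records written before the new fields existed still load.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```

Defaulting the new fields to `None` is what keeps the change a behavioural no-op for existing callers.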
alt-glitch db09477b77 feat(feishu): wire feishu doc/drive tools into hermes-feishu composite
The feishu_doc and feishu_drive tools were registered in the tool
registry but never added to the hermes-feishu composite toolset.
The pipeline fix from the prior commit now recovers them automatically
once they are in the composite.
2026-04-25 04:50:14 -07:00
alt-glitch 81987f0350 feat(discord): split discord_server into discord + discord_admin tools
Split the monolithic discord_server tool (14 actions) into two:

- discord: core actions (fetch_messages, search_members, create_thread)
  that are useful for the agent's normal operation. Auto-enabled on
  the discord platform via the pipeline fix.

- discord_admin: server management actions (list channels/roles, pins,
  role assignment) that require explicit opt-in via hermes tools.
  Added to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS.
2026-04-25 04:50:14 -07:00
alt-glitch 9830905dab fix(tools): recover non-configurable toolsets from composite resolution
The reverse-mapping loop in _get_platform_tools only checked
CONFIGURABLE_TOOLSETS, silently dropping platform-specific toolsets
like discord and feishu_doc whose tools were in the composite but
had no configurable key. Add a second pass over TOOLSETS that picks
up unclaimed toolsets whose tools are present in the resolved
composite.
2026-04-25 04:50:14 -07:00
Teknium 0d548d1db9 fix(cron): wire context_from through the update action
The tool schema promised 'On update, pass an empty array to clear' but the
update branch ignored the context_from kwarg entirely — users could set
the field at create time and never modify or clear it afterward.

- tools/cronjob_tools.py: handle context_from in the update branch the
  same way script/enabled_toolsets/workdir are handled: normalize str/list
  to refs, validate each referenced job exists (same check the create
  branch does), store as list-or-None to match create_job()'s shape.
  Empty string or empty list clears the field.
- tests/cron/test_cron_context_from.py: 6 new tests covering add/change/
  clear (both shapes)/bad-ref/preserve-across-unrelated-update.
2026-04-25 04:49:28 -07:00
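The normalization described in `0d548d1db9` (accept a string or a list, validate references, store list-or-None, treat empty input as "clear") could be sketched like this. The function name, signature, and error message are assumptions, not the code in `tools/cronjob_tools.py`.

```python
from typing import Iterable, List, Optional, Union


def normalize_context_from(
    value: Union[str, List[str]], known_job_ids: Iterable[str]
) -> Optional[List[str]]:
    """Return a list of job refs, or None when the caller passed an empty value."""
    refs = [value] if isinstance(value, str) else list(value)
    refs = [r.strip() for r in refs if r and r.strip()]
    if not refs:
        return None  # empty string or empty list clears the field
    known = set(known_job_ids)
    missing = [r for r in refs if r not in known]
    if missing:
        raise ValueError(f"context_from references unknown job(s): {', '.join(missing)}")
    return refs
```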
MorAlekss eb92222811 fix(cron): silent skip when context_from job has no output yet 2026-04-25 04:49:28 -07:00
MorAlekss e4a91ccb76 test(cron): add PermissionError coverage for context_from 2026-04-25 04:49:28 -07:00
MorAlekss 5ac5365923 feat(cron): add context_from field for cron job output chaining 2026-04-25 04:49:28 -07:00
Teknium f433197f23 feat(installer): FHS layout for root installs on Linux (#15608)
Root installs on Linux now put the code at /usr/local/lib/hermes-agent and
the hermes command at /usr/local/bin/hermes.  HERMES_HOME (~/.hermes) stays
state-only.  Matches Claude Code / Codex CLI / OpenClaw, keeps Docker
bind-mounted /root/ volumes lean, and puts the command on every shell's
default PATH without touching shell RC files.

- Non-root users and macOS root: unchanged
- Existing root installs at $HERMES_HOME/hermes-agent: preserved in-place
  (detected via .git dir) — no auto-migration, no breakage
- Explicit --dir / $HERMES_INSTALL_DIR: always wins, never overridden
- Termux: unchanged (package manager manages /data/data/...)

Requested by @souly9999 (Discord). Our own Dockerfile already uses this
split (code at /opt/hermes, data at /opt/data volume); the user-install
path now matches.
2026-04-25 04:49:16 -07:00
Teknium df485628ce chore(release): map Readon's git email to GitHub login 2026-04-25 04:49:07 -07:00
Yindong 9fde22d233 fix: stop a model change made via /model from being reset 2026-04-25 04:49:07 -07:00
alt-glitch 9d7b64b5dd fix(tools): normalize numeric entries and clear stale no_mcp in _save_platform_tools
YAML parses bare numeric toolset names (e.g. 12306:) as int, causing
TypeError in sorted() since the read path normalizes to str but the
save path did not.

The no_mcp sentinel was preserved in existing entries even when the
user re-enabled MCP servers, causing MCP to stay silently disabled.
2026-04-25 04:49:02 -07:00
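The `TypeError` fixed in `9d7b64b5dd` is easy to reproduce: `sorted()` cannot compare `int` and `str`. The sketch below assumes a hypothetical helper and an `mcp_enabled` flag. How the real `_save_platform_tools` decides that MCP was re-enabled is not shown in the log.

```python
from typing import Iterable, List, Union


def normalize_platform_tools(
    entries: Iterable[Union[str, int]], mcp_enabled: bool
) -> List[str]:
    """Coerce every entry to str before sorting; drop a stale no_mcp sentinel."""
    names = {str(e) for e in entries}
    if mcp_enabled:
        names.discard("no_mcp")
    return sorted(names)


raw = ["web", 12306, "no_mcp"]  # YAML parsed the bare key `12306:` as an int
try:
    sorted(raw)
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True    # int and str are not orderable against each other

saved = normalize_platform_tools(raw, mcp_enabled=True)
```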
vominh1919 5401a0080d fix: recalculate token budgets on model switch in ContextCompressor
update_model() recalculated threshold_tokens but left tail_token_budget
and max_summary_tokens at their __init__ values. When switching from a
200K model to 32K, the tail budget stayed at ~20K tokens (62% of 32K)
instead of the intended ~10%.

Adds budget recalculation in update_model() and 2 regression tests.
2026-04-25 15:07:56 +05:30
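The bug in `5401a0080d` comes down to deriving some budgets once in `__init__` and never again. A minimal sketch, assuming percentages of the context length: the 10% tail figure follows the "intended ~10%" remark, while the threshold and summary percentages and the class name are hypothetical.

```python
class CompressorBudgets:
    THRESHOLD_PERCENT = 50  # assumed for illustration
    TAIL_PERCENT = 10       # 'intended ~10%' per the commit message
    SUMMARY_PERCENT = 5     # assumed; not stated in the log

    def __init__(self, context_length: int) -> None:
        self.update_model(context_length)

    def update_model(self, context_length: int) -> None:
        # Recompute every derived budget, not just the threshold.
        self.context_length = context_length
        self.threshold_tokens = context_length * self.THRESHOLD_PERCENT // 100
        self.tail_token_budget = context_length * self.TAIL_PERCENT // 100
        self.max_summary_tokens = context_length * self.SUMMARY_PERCENT // 100


budgets = CompressorBudgets(200_000)
tail_at_200k = budgets.tail_token_budget
budgets.update_model(32_000)
tail_at_32k = budgets.tail_token_budget
```

Routing `__init__` through `update_model` is one way to make sure the two code paths cannot diverge again.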
Teknium e5647d7863 docs: consolidate dashboard themes and plugins into Extending the Dashboard (#15530)
The web-dashboard.md and dashboard-plugins.md pages had overlapping,
partial coverage of the theme and plugin systems. Themes were split
across two pages; the plugin docs had a minimal manifest reference but
no step-by-step guide, no slot catalog, and no theme+plugin demo.

New: user-guide/features/extending-the-dashboard.md — single navigable
reference for all three extension layers (themes, UI plugins, backend
plugins). Includes:

- Theme quick-start + full schema (palette, typography, layout, layout
  variants, assets, componentStyles, colorOverrides, customCSS)
- Plugin quick-start + full schema (manifest, SDK, slots, tab.override,
  tab.hidden, backend routes, custom CSS)
- 10-slot shell catalog with locations
- Plugin discovery + load lifecycle
- Combined theme+plugin walkthrough (Strike Freedom cockpit demo)
- API reference + troubleshooting

web-dashboard.md: trimmed to core tool docs (pages, REST API, CORS,
development). Theme/plugin content now points to the new page with a
built-in themes summary table.

dashboard-plugins.md: deleted (merged into extending-the-dashboard.md).

sidebars.ts: swap 'dashboard-plugins' → 'extending-the-dashboard' under
the Management group.

No user-facing behavior change; docs-only.
2026-04-24 23:26:51 -07:00
Teknium 023b1bff11 fix(delegate): resolve subagent approval prompts without deadlocking parent TUI (#15491)
Subagents run inside a ThreadPoolExecutor. The CLI's interactive approval
callback lives in tools/terminal_tool.py's threading.local(), which worker
threads do not inherit. When a subagent hits a dangerous-command guard,
prompt_dangerous_approval() falls back to input() from the worker thread,
deadlocking against the parent's prompt_toolkit TUI that owns stdin.

Fix: install a non-interactive callback into every subagent worker thread
via ThreadPoolExecutor(initializer=set_approval_callback, initargs=(cb,)).
The callback is config-gated by delegation.subagent_auto_approve:

  false (default) -> _subagent_auto_deny (safe; matches leaf tool blocklist)
  true            -> _subagent_auto_approve (opt-in YOLO for cron/batch)

Both emit a logger.warning audit line. Gateway sessions are unaffected
because they resolve approvals via tools/approval.py's per-session queue,
not through these TLS callbacks. Diagnosis credit: @MorAlekss (#14685).

- hermes_cli/config.py: DEFAULT_CONFIG.delegation.subagent_auto_approve: False
- cli-config.yaml.example: documented, commented (default)
- tools/delegate_tool.py: _subagent_auto_deny, _subagent_auto_approve,
  _get_subagent_approval_callback, wired into the child timeout executor
- tests/tools/test_delegate.py: 7 tests covering defaults, truthy coercion,
  and TLS scoping in the worker thread
2026-04-24 22:37:22 -07:00
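The thread-local behaviour this commit relies on can be shown in isolation. The sketch below assumes a simplified callback signature and hypothetical return values (`"deny"`, `"once"`, and a `"would-block-on-stdin"` marker); only `ThreadPoolExecutor(initializer=..., initargs=...)` and `threading.local()` are real standard-library APIs, and the helper names merely echo those in the commit message.

```python
import logging
import threading
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger(__name__)
_tls = threading.local()  # each thread sees its own attributes


def set_approval_callback(cb):
    _tls.approval_callback = cb


def auto_deny(command):
    logger.warning("subagent auto-denied: %s", command)
    return "deny"


def auto_approve(command):
    logger.warning("subagent auto-approved once: %s", command)
    return "once"


def pick_callback(config):
    # Assumed config shape: {"delegation": {"subagent_auto_approve": bool}}
    enabled = bool(config.get("delegation", {}).get("subagent_auto_approve", False))
    return auto_approve if enabled else auto_deny


def run_in_worker(command):
    cb = getattr(_tls, "approval_callback", None)
    if cb is None:
        # Stand-in for the input() fallback that deadlocks against the TUI.
        return "would-block-on-stdin"
    return cb(command)


def resolve(config, command):
    cb = pick_callback(config)
    # The initializer runs once in every worker thread before it takes work.
    with ThreadPoolExecutor(
        max_workers=1, initializer=set_approval_callback, initargs=(cb,)
    ) as pool:
        return pool.submit(run_in_worker, command).result()
```

A callback set only in the parent thread is invisible to the worker, which is why the initializer is needed rather than a single call before the pool is created.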
52 changed files with 4202 additions and 1068 deletions
+105 -3
@@ -1349,6 +1349,49 @@ def _is_auth_error(exc: Exception) -> bool:
return "error code: 401" in err_lower or "authenticationerror" in type(exc).__name__.lower()
def _is_unsupported_parameter_error(exc: Exception, param: str) -> bool:
"""Detect provider 400s for an unsupported request parameter.
Different OpenAI-compatible endpoints phrase the same class of error a few
ways: ``Unsupported parameter: X``, ``unsupported_parameter`` with a
``param`` field, ``X is not supported``, ``unknown parameter: X``,
``unrecognized request argument: X``. We match on both the parameter
name and a generic "unsupported/unknown/unrecognized parameter" marker so
call sites can reactively retry without the offending key instead of
surfacing a noisy auxiliary failure.
Generalizes the temperature-specific detector that originally shipped
with PR #15621 so the same retry strategy can cover ``max_tokens``,
``seed``, ``top_p``, and any future quirk. Credit @nicholasrae (PR #15416)
for the generalization pattern.
"""
param_lower = (param or "").lower()
if not param_lower:
return False
err_lower = str(exc).lower()
if param_lower not in err_lower:
return False
return any(marker in err_lower for marker in (
"unsupported parameter",
"unsupported_parameter",
"not supported",
"does not support",
"unknown parameter",
"unrecognized request argument",
"unrecognized parameter",
"invalid parameter",
))
def _is_unsupported_temperature_error(exc: Exception) -> bool:
"""Back-compat wrapper: detect API errors where the model rejects ``temperature``.
Delegates to :func:`_is_unsupported_parameter_error`; kept as a separate
public symbol because existing tests and call sites import it by name.
"""
return _is_unsupported_parameter_error(exc, "temperature")
def _evict_cached_clients(provider: str) -> None:
"""Drop cached auxiliary clients for a provider so fresh creds are used."""
normalized = _normalize_aux_provider(provider)
@@ -2952,13 +2995,45 @@ def call_llm(
if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
# Handle max_tokens vs max_completion_tokens retry, then payment fallback.
# Handle unsupported temperature, max_tokens vs max_completion_tokens retry,
# then payment fallback.
try:
return _validate_llm_response(
client.chat.completions.create(**kwargs), task)
except Exception as first_err:
if "temperature" in kwargs and _is_unsupported_temperature_error(first_err):
retry_kwargs = dict(kwargs)
retry_kwargs.pop("temperature", None)
logger.info(
"Auxiliary %s: provider rejected temperature; retrying once without it",
task or "call",
)
try:
return _validate_llm_response(
client.chat.completions.create(**retry_kwargs), task)
except Exception as retry_err:
retry_err_str = str(retry_err)
# If retry still fails, fall through to the max_tokens /
# payment / auth chains below using the temperature-stripped
# kwargs. Re-raise only if the retry hit something those
# chains won't handle.
if not (
_is_payment_error(retry_err)
or _is_connection_error(retry_err)
or _is_auth_error(retry_err)
or "max_tokens" in retry_err_str
or "unsupported_parameter" in retry_err_str
):
raise
first_err = retry_err
kwargs = retry_kwargs
err_str = str(first_err)
if "max_tokens" in err_str or "unsupported_parameter" in err_str:
if max_tokens is not None and (
"max_tokens" in err_str
or "unsupported_parameter" in err_str
or _is_unsupported_parameter_error(first_err, "max_tokens")
):
kwargs.pop("max_tokens", None)
kwargs["max_completion_tokens"] = max_tokens
try:
@@ -3221,8 +3296,35 @@ async def async_call_llm(
return _validate_llm_response(
await client.chat.completions.create(**kwargs), task)
except Exception as first_err:
if "temperature" in kwargs and _is_unsupported_temperature_error(first_err):
retry_kwargs = dict(kwargs)
retry_kwargs.pop("temperature", None)
logger.info(
"Auxiliary %s (async): provider rejected temperature; retrying once without it",
task or "call",
)
try:
return _validate_llm_response(
await client.chat.completions.create(**retry_kwargs), task)
except Exception as retry_err:
retry_err_str = str(retry_err)
if not (
_is_payment_error(retry_err)
or _is_connection_error(retry_err)
or _is_auth_error(retry_err)
or "max_tokens" in retry_err_str
or "unsupported_parameter" in retry_err_str
):
raise
first_err = retry_err
kwargs = retry_kwargs
err_str = str(first_err)
if "max_tokens" in err_str or "unsupported_parameter" in err_str:
if max_tokens is not None and (
"max_tokens" in err_str
or "unsupported_parameter" in err_str
or _is_unsupported_parameter_error(first_err, "max_tokens")
):
kwargs.pop("max_tokens", None)
kwargs["max_completion_tokens"] = max_tokens
try:
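The retry-once strategy in the two hunks above can be reduced to a short, self-contained sketch. It is deliberately simpler than the diff: it uses a shortened marker list, handles only `temperature`, and re-raises anything else instead of falling through to the `max_tokens`, payment, and auth chains. `create_with_fallback` and `fake_create` are hypothetical names; `create` stands in for `client.chat.completions.create`.

```python
UNSUPPORTED_MARKERS = (
    "unsupported parameter",
    "unsupported_parameter",
    "not supported",
    "unknown parameter",
)


def is_unsupported_parameter_error(exc, param):
    """True when the error text names the parameter and an 'unsupported' marker."""
    text = str(exc).lower()
    name = (param or "").lower()
    return bool(name) and name in text and any(m in text for m in UNSUPPORTED_MARKERS)


def create_with_fallback(create, **kwargs):
    try:
        return create(**kwargs)
    except Exception as err:
        if "temperature" in kwargs and is_unsupported_parameter_error(err, "temperature"):
            retry_kwargs = dict(kwargs)      # leave the caller's kwargs untouched
            retry_kwargs.pop("temperature")
            return create(**retry_kwargs)    # retry exactly once
        raise


def fake_create(**kwargs):
    # Simulated provider that rejects temperature outright.
    if "temperature" in kwargs:
        raise ValueError("Unsupported parameter: 'temperature' is not supported with this model.")
    return {"ok": True, "sent": sorted(kwargs)}


result = create_with_fallback(fake_create, model="demo-model", temperature=0.2)
```

Requiring both the parameter name and a marker keeps unrelated 400s, such as a rate-limit message that happens to mention temperature, from triggering the retry.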
+7
@@ -318,6 +318,13 @@ class ContextCompressor(ContextEngine):
int(context_length * self.threshold_percent),
MINIMUM_CONTEXT_LENGTH,
)
# Recalculate token budgets for the new context length so the
# compressor stays calibrated after a model switch (e.g. 200K → 32K).
target_tokens = int(self.threshold_tokens * self.summary_target_ratio)
self.tail_token_budget = target_tokens
self.max_summary_tokens = min(
int(context_length * 0.05), _SUMMARY_TOKENS_CEILING,
)
def __init__(
self,
+4
@@ -796,6 +796,10 @@ delegation:
# Raise to 2 to allow workers to spawn their own subagents.
# Requires role="orchestrator" on intermediate agents.
# orchestrator_enabled: true # Kill switch for role="orchestrator" children (default: true).
# subagent_auto_approve: false # When a subagent hits a dangerous-command approval prompt, auto-deny (default: false)
# or auto-approve "once" (true) instead of blocking on stdin.
# The parent TUI owns stdin, so blocking would deadlock; non-interactive resolution is required.
# Both choices emit a logger.warning audit line. Flip to true only for cron/batch pipelines.
# inherit_mcp_toolsets: true # When explicit child toolsets are narrowed, also keep the parent's MCP toolsets (default: true). Set false for strict intersection.
# model: "google/gemini-3-flash-preview" # Override model for subagents (empty = inherit parent)
# provider: "openrouter" # Override provider for subagents (empty = inherit parent)
+8 -1
@@ -3176,7 +3176,14 @@ class HermesCLI:
# the configured model (e.g. "qwen3.6-plus"), causing 400 errors.
runtime_model = runtime.get("model")
if runtime_model and isinstance(runtime_model, str):
self.model = runtime_model
# Only use runtime model if: model is unset, or model equals provider name
should_use_runtime_model = (
not self.model or # No model configured yet
self.model == self.provider or # Model is the provider slug
self.model == runtime.get("name") # Model matches provider display name
)
if should_use_runtime_model:
self.model = runtime_model
# If model is still empty (e.g. user ran `hermes auth add openai-codex`
# without `hermes model`), fall back to the provider's first catalog
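The guard added in this hunk amounts to a small decision function. The sketch below restates it outside the class, assuming `runtime` is a plain dict with optional `model` and `name` keys as the hunk suggests; `choose_model` is a hypothetical name.

```python
def choose_model(current_model, provider, runtime):
    """Return the model to use after resolving the provider runtime."""
    runtime_model = runtime.get("model")
    if not (runtime_model and isinstance(runtime_model, str)):
        return current_model
    # Only treat the current value as replaceable when it is a placeholder.
    is_placeholder = (
        not current_model                         # nothing configured yet
        or current_model == provider              # model holds the provider slug
        or current_model == runtime.get("name")   # model holds the display name
    )
    return runtime_model if is_placeholder else current_model
```

With this shape, a model picked explicitly through `/model` survives the runtime refresh, while placeholder values are still replaced by the runtime's model.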
+14 -1
@@ -16,7 +16,7 @@ import uuid
from datetime import datetime, timedelta
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Optional, Dict, List, Any
from typing import Optional, Dict, List, Any, Union
logger = logging.getLogger(__name__)
@@ -417,6 +417,7 @@ def create_job(
provider: Optional[str] = None,
base_url: Optional[str] = None,
script: Optional[str] = None,
context_from: Optional[Union[str, List[str]]] = None,
enabled_toolsets: Optional[List[str]] = None,
workdir: Optional[str] = None,
) -> Dict[str, Any]:
@@ -438,6 +439,9 @@ def create_job(
script: Optional path to a Python script whose stdout is injected into the
prompt each run. The script runs before the agent turn, and its output
is prepended as context. Useful for data collection / change detection.
context_from: Optional job ID (or list of job IDs) whose most recent output
is injected into the prompt as context before each run.
Useful for chaining cron jobs: job A finds data, job B processes it.
enabled_toolsets: Optional list of toolset names to restrict the agent to.
When set, only tools from these toolsets are loaded, reducing
token overhead. When omitted, all default tools are loaded.
@@ -481,6 +485,14 @@ def create_job(
normalized_toolsets = normalized_toolsets or None
normalized_workdir = _normalize_workdir(workdir)
# Normalize context_from: accept str or list of str, store as list or None
if isinstance(context_from, str):
context_from = [context_from.strip()] if context_from.strip() else None
elif isinstance(context_from, list):
context_from = [str(j).strip() for j in context_from if str(j).strip()] or None
else:
context_from = None
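The normalization above can be exercised on its own. This is a minimal sketch with a hypothetical helper name, `normalize_context_from`, that mirrors the branch logic: a string becomes a one-item list, a list is stripped of blank entries, and anything else becomes `None`.

```python
from typing import List, Optional, Union


def normalize_context_from(
    value: Optional[Union[str, List[str]]],
) -> Optional[List[str]]:
    """Return a list of non-empty job IDs, or None when nothing usable was given."""
    if isinstance(value, str):
        value = [value]
    elif not isinstance(value, list):
        return None
    cleaned = [str(item).strip() for item in value if str(item).strip()]
    return cleaned or None
```

Storing either a list or `None` means later code only has to handle one shape, although the prompt builder in this diff still accepts a bare string defensively.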
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
job = {
"id": job_id,
@@ -492,6 +504,7 @@ def create_job(
"provider": normalized_provider,
"base_url": normalized_base_url,
"script": normalized_script,
"context_from": context_from,
"schedule": parsed_schedule,
"schedule_display": parsed_schedule.get("display", schedule),
"repeat": {
+41
@@ -671,6 +671,47 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
f"{prompt}"
)
# Inject output from referenced cron jobs as context.
context_from = job.get("context_from")
if context_from:
from cron.jobs import OUTPUT_DIR
if isinstance(context_from, str):
context_from = [context_from]
for source_job_id in context_from:
# Guard against path traversal — valid job IDs are 12-char hex strings
if not source_job_id or not all(c in "0123456789abcdef" for c in source_job_id):
logger.warning("context_from: skipping invalid job_id %r", source_job_id)
continue
try:
job_output_dir = OUTPUT_DIR / source_job_id
if not job_output_dir.exists():
continue # silent skip — no output yet
output_files = sorted(
job_output_dir.glob("*.md"),
key=lambda f: f.stat().st_mtime,
reverse=True,
)
if not output_files:
continue # silent skip — no output yet
latest_output = output_files[0].read_text(encoding="utf-8").strip()
# Truncate to 8K characters to avoid prompt bloat
_MAX_CONTEXT_CHARS = 8000
if len(latest_output) > _MAX_CONTEXT_CHARS:
latest_output = latest_output[:_MAX_CONTEXT_CHARS] + "\n\n[... output truncated ...]"
if latest_output:
prompt = (
f"## Output from job '{source_job_id}'\n"
"The following is the most recent output from a preceding "
"cron job. Use it as context for your analysis.\n\n"
f"```\n{latest_output}\n```\n\n"
f"{prompt}"
)
else:
continue # silent skip — empty output
except (OSError, PermissionError) as e:
logger.warning("context_from: failed to read output for job %r: %s", source_job_id, e)
# silent skip — do not pollute the prompt with error messages
# Always prepend cron execution guidance so the agent knows how
# delivery works and can suppress delivery when appropriate.
cron_hint = (
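The lookup performed for each `context_from` entry can be sketched as one function: validate the ID, pick the newest `*.md` file, and truncate. Two things here are assumptions rather than the diff's behaviour: the sketch enforces the 12-character length with a regex (the hunk's check only tests the character set), and `latest_job_output` with its `output_dir` parameter is a hypothetical name.

```python
import re
from pathlib import Path
from typing import Optional

MAX_CONTEXT_CHARS = 8000
# Assumption: job IDs are exactly 12 lowercase hex characters.
JOB_ID_RE = re.compile(r"[0-9a-f]{12}")


def latest_job_output(output_dir: Path, job_id: str) -> Optional[str]:
    """Return the newest output text for job_id, truncated, or None if unavailable."""
    if not JOB_ID_RE.fullmatch(job_id or ""):
        return None  # rejects "../etc", empty strings, and anything with a slash
    job_dir = output_dir / job_id
    if not job_dir.is_dir():
        return None  # no output yet
    files = sorted(
        job_dir.glob("*.md"), key=lambda f: f.stat().st_mtime, reverse=True
    )
    if not files:
        return None
    text = files[0].read_text(encoding="utf-8").strip()
    if len(text) > MAX_CONTEXT_CHARS:
        text = text[:MAX_CONTEXT_CHARS] + "\n\n[... output truncated ...]"
    return text or None
```

Because a valid ID can contain only hex characters, the joined path cannot leave `output_dir`, so no separate path resolution check is needed in this sketch.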
+6
@@ -2543,6 +2543,9 @@ class BasePlatformAdapter(ABC):
user_id_alt: Optional[str] = None,
chat_id_alt: Optional[str] = None,
is_bot: bool = False,
guild_id: Optional[str] = None,
parent_chat_id: Optional[str] = None,
message_id: Optional[str] = None,
) -> SessionSource:
"""Helper to build a SessionSource for this platform."""
# Normalize empty topic to None
@@ -2560,6 +2563,9 @@ class BasePlatformAdapter(ABC):
user_id_alt=user_id_alt,
chat_id_alt=chat_id_alt,
is_bot=is_bot,
guild_id=str(guild_id) if guild_id else None,
parent_chat_id=str(parent_chat_id) if parent_chat_id else None,
message_id=str(message_id) if message_id else None,
)
@abstractmethod
+4
@@ -3261,6 +3261,7 @@ class DiscordAdapter(BasePlatformAdapter):
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
thread = await self._auto_create_thread(message)
if thread:
parent_channel_id = str(message.channel.id)
is_thread = True
thread_id = str(thread.id)
auto_threaded_channel = thread
@@ -3320,6 +3321,9 @@ class DiscordAdapter(BasePlatformAdapter):
thread_id=thread_id,
chat_topic=chat_topic,
is_bot=getattr(message.author, "bot", False),
guild_id=str(message.guild.id) if message.guild else None,
parent_chat_id=parent_channel_id,
message_id=str(message.id),
)
# Build media URLs -- download image attachments to local cache so the
+65 -9
@@ -87,6 +87,9 @@ class SessionSource:
user_id_alt: Optional[str] = None # Platform-specific stable alt ID (Signal UUID, Feishu union_id)
chat_id_alt: Optional[str] = None # Signal group internal ID
is_bot: bool = False # True when the message author is a bot/webhook (Discord)
guild_id: Optional[str] = None # Discord guild / Slack workspace / Matrix server scope
parent_chat_id: Optional[str] = None # Parent channel when chat_id refers to a thread
message_id: Optional[str] = None # ID of the triggering message (for pin/reply/react)
@property
def description(self) -> str:
@@ -124,8 +127,14 @@ class SessionSource:
d["user_id_alt"] = self.user_id_alt
if self.chat_id_alt:
d["chat_id_alt"] = self.chat_id_alt
if self.guild_id:
d["guild_id"] = self.guild_id
if self.parent_chat_id:
d["parent_chat_id"] = self.parent_chat_id
if self.message_id:
d["message_id"] = self.message_id
return d
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
return cls(
@@ -139,6 +148,9 @@ class SessionSource:
chat_topic=data.get("chat_topic"),
user_id_alt=data.get("user_id_alt"),
chat_id_alt=data.get("chat_id_alt"),
guild_id=data.get("guild_id"),
parent_chat_id=data.get("parent_chat_id"),
message_id=data.get("message_id"),
)
@@ -190,6 +202,31 @@ that requires raw IDs). Discord is excluded because mentions use ``<@user_id>``
and the LLM needs the real ID to tag users."""
def _discord_tools_loaded() -> bool:
"""True iff the agent will actually have Discord tools this session.
Two conditions must hold:
1. The `discord` or `discord_admin` toolset is enabled for the
Discord platform via `hermes tools` (opt-in, default OFF).
2. `DISCORD_BOT_TOKEN` is set. The tool's `check_fn` gates on it
at registry time, so the toolset being enabled in config is not
enough if the token isn't configured.
Returns False (the safe default, which keeps the stale-API disclaimer) on any
error so a bad config can't silently promise tools the agent lacks.
"""
if not (os.environ.get("DISCORD_BOT_TOKEN") or "").strip():
return False
try:
from hermes_cli.config import load_config
from hermes_cli.tools_config import _get_platform_tools
cfg = load_config()
enabled = _get_platform_tools(cfg, "discord", include_default_mcp_servers=False)
return "discord" in enabled or "discord_admin" in enabled
except Exception:
return False
def build_session_context_prompt(
context: SessionContext,
*,
@@ -277,14 +314,33 @@ def build_session_context_prompt(
"that you can only read messages sent directly to you and respond."
)
elif context.source.platform == Platform.DISCORD:
lines.append("")
lines.append(
"**Platform notes:** You are running inside Discord. "
"You do NOT have access to Discord-specific APIs — you cannot search "
"channel history, pin messages, manage roles, or list server members. "
"Do not promise to perform these actions. If the user asks, explain "
"that you can only read messages sent directly to you and respond."
)
# Inject the Discord IDs block only when the agent actually has
# Discord tools loaded this session — i.e. the user opted into
# `discord` / `discord_admin` via `hermes tools` AND the bot
# token is configured. Otherwise keep the stale-API disclaimer
# honest so we never promise tools the agent lacks.
if _discord_tools_loaded():
src = context.source
id_lines = ["", "**Discord IDs (for the `discord` / `discord_admin` tools):**"]
if src.guild_id:
id_lines.append(f" - Guild: `{src.guild_id}`")
if src.thread_id and src.parent_chat_id:
id_lines.append(f" - Parent channel: `{src.parent_chat_id}`")
id_lines.append(f" - Thread: `{src.thread_id}` (use as `channel_id` for fetch_messages etc.)")
else:
id_lines.append(f" - Channel: `{src.chat_id}`")
if src.message_id:
id_lines.append(f" - Triggering message: `{src.message_id}`")
lines.extend(id_lines)
else:
lines.append("")
lines.append(
"**Platform notes:** You are running inside Discord. "
"You do NOT have access to Discord-specific APIs — you cannot search "
"channel history, pin messages, manage roles, or list server members. "
"Do not promise to perform these actions. If the user asks, explain "
"that you can only read messages sent directly to you and respond."
)
elif context.source.platform == Platform.BLUEBUBBLES:
lines.append("")
lines.append(
+10 -1
@@ -783,6 +783,15 @@ DEFAULT_CONFIG = {
# warning log if out of range.
"max_spawn_depth": 1, # depth cap (1 = flat [default], 2 = orchestrator→leaf, 3 = three-level)
"orchestrator_enabled": True, # kill switch for role="orchestrator"
# When a subagent hits a dangerous-command approval prompt, the parent's
# prompt_toolkit TUI owns stdin — a thread-local input() call from the
# subagent worker would deadlock the parent UI. To avoid the deadlock,
# subagent threads ALWAYS resolve approvals non-interactively:
# false (default) → auto-deny with a logger.warning audit line (safe)
# true → auto-approve "once" with a logger.warning audit line
# Flip to true only if you trust delegated work to run dangerous cmds
# without human review (cron pipelines, batch automation, etc.).
"subagent_auto_approve": False,
},
# Ephemeral prefill messages file — JSON list of {role, content} dicts
@@ -839,7 +848,7 @@ DEFAULT_CONFIG = {
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
"channel_prompts": {}, # Per-channel ephemeral system prompts (forum parents apply to child threads)
# discord_server tool: restrict which actions the agent may call.
# discord / discord_admin tools: restrict which actions the agent may call.
# Default (empty) = all actions allowed (subject to bot privileged intents).
# Accepts comma-separated string ("list_guilds,list_channels,fetch_messages")
# or YAML list. Unknown names are dropped with a warning at load time.
+39 -24
@@ -6046,6 +6046,31 @@ def _cmd_update_impl(args, gateway_mode: bool):
)
import signal as _signal
def _wait_for_service_active(
scope_cmd_: list, svc_name_: str, timeout: float = 10.0,
) -> bool:
"""Poll ``systemctl is-active`` until the unit reports active.
systemd's Stopped -> Started transition after a graceful exit
(or a hard restart) is not instantaneous; a one-shot check
races that window and falsely reports the unit as down.
Poll every 0.5s up to ``timeout`` seconds before giving up.
"""
deadline = _time.monotonic() + max(timeout, 0.5)
while True:
try:
_verify = subprocess.run(
scope_cmd_ + ["is-active", svc_name_],
capture_output=True, text=True, timeout=5,
)
if _verify.stdout.strip() == "active":
return True
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
if _time.monotonic() >= deadline:
return False
_time.sleep(0.5)
# Drain budget for graceful SIGUSR1 restarts. The gateway drains
# for up to ``agent.restart_drain_timeout`` (default 60s) before
# exiting with code 75; we wait slightly longer so the drain
@@ -6152,14 +6177,14 @@ def _cmd_update_impl(args, gateway_mode: bool):
if _graceful_ok:
# Gateway exited 75; systemd should relaunch
# via Restart=on-failure. Verify the new
# process came up.
_time.sleep(3)
verify = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True, text=True, timeout=5,
)
if verify.stdout.strip() == "active":
# via Restart=on-failure. Poll is-active for
# up to ~10s because the unit's Stopped ->
# Started transition can take a few seconds
# after the old PID exits, and a one-shot
# check races that window.
if _wait_for_service_active(
scope_cmd, svc_name, timeout=10.0,
):
restarted_services.append(svc_name)
continue
# Process exited but wasn't respawned (older
@@ -6185,14 +6210,9 @@ def _cmd_update_impl(args, gateway_mode: bool):
# Verify the service actually survived the
# restart. systemctl restart returns 0 even
# if the new process crashes immediately.
_time.sleep(3)
verify = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True,
text=True,
timeout=5,
)
if verify.stdout.strip() == "active":
if _wait_for_service_active(
scope_cmd, svc_name, timeout=10.0,
):
restarted_services.append(svc_name)
else:
# Retry once — transient startup failures
@@ -6207,14 +6227,9 @@ def _cmd_update_impl(args, gateway_mode: bool):
text=True,
timeout=15,
)
_time.sleep(3)
verify2 = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True,
text=True,
timeout=5,
)
if verify2.stdout.strip() == "active":
if _wait_for_service_active(
scope_cmd, svc_name, timeout=10.0,
):
restarted_services.append(svc_name)
print(f"{svc_name} recovered on retry")
else:
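The polling pattern that replaces the fixed `sleep(3)` can be written generically. The sketch below is an assumption-labelled variant: `wait_until` and its `check` callable are hypothetical, standing in for the `systemctl is-active` subprocess call, and it swallows `OSError` where the diff catches `FileNotFoundError` and `subprocess.TimeoutExpired`.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Call check() repeatedly until it returns True or the deadline passes."""
    # monotonic() is immune to wall-clock adjustments during the wait.
    deadline = time.monotonic() + max(timeout, interval)
    while True:
        try:
            if check():
                return True
        except OSError:
            pass  # treat a failed probe as "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Checking before sleeping means a unit that is already active returns immediately, so the common case is faster than the old fixed delay while slow restarts get the full timeout.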
+123 -12
@@ -68,25 +68,58 @@ CONFIGURABLE_TOOLSETS = [
("rl", "🧪 RL Training", "Tinker-Atropos training tools"),
("homeassistant", "🏠 Home Assistant", "smart home device control"),
("spotify", "🎵 Spotify", "playback, search, playlists, library"),
("discord", "💬 Discord (read/participate)", "fetch messages, search members, create thread"),
("discord_admin", "🛡️ Discord Server Admin", "list channels/roles, pin, assign roles"),
]
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify"}
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
# absent from this map is available on every platform (current behaviour).
#
# Use this for tools whose APIs only make sense on one platform (Discord
# server admin, Slack workspace admin, etc.). Keeps every other platform's
# checklist from filling up with irrelevant toggles.
_TOOLSET_PLATFORM_RESTRICTIONS: Dict[str, Set[str]] = {
"discord": {"discord"},
"discord_admin": {"discord"},
}
def _toolset_allowed_for_platform(ts_key: str, platform: str) -> bool:
"""Return True if ``ts_key`` is configurable on ``platform``.
Toolsets without a restriction entry are allowed everywhere (the default).
"""
allowed = _TOOLSET_PLATFORM_RESTRICTIONS.get(ts_key)
return allowed is None or platform in allowed
def _get_effective_configurable_toolsets():
"""Return CONFIGURABLE_TOOLSETS + any plugin-provided toolsets.
Plugin toolsets are appended at the end so they appear after the
built-in toolsets in the TUI checklist.
built-in toolsets in the TUI checklist. A plugin whose toolset key
already appears in ``CONFIGURABLE_TOOLSETS`` is skipped: bundled
plugins (e.g. ``plugins/spotify``) share their toolset key with the
built-in entry, and we want the built-in label/description to win.
Without the dedupe, ``hermes tools`` "reconfigure existing" would
list the same toolset twice.
"""
result = list(CONFIGURABLE_TOOLSETS)
seen = {ts_key for ts_key, _, _ in result}
try:
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
discover_plugins() # idempotent — ensures plugins are loaded
result.extend(get_plugin_toolsets())
for entry in get_plugin_toolsets():
if entry[0] in seen:
continue
seen.add(entry[0])
result.append(entry)
except Exception:
pass
return result
@@ -591,7 +624,7 @@ def _get_platform_tools(
include_default_mcp_servers: bool = True,
) -> Set[str]:
"""Resolve which individual toolset names are enabled for a platform."""
from toolsets import resolve_toolset
from toolsets import resolve_toolset, TOOLSETS
platform_toolsets = config.get("platform_toolsets") or {}
toolset_names = platform_toolsets.get(platform)
@@ -605,6 +638,8 @@ def _get_platform_tools(
toolset_names = [str(ts) for ts in toolset_names]
configurable_keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
plugin_ts_keys = _get_plugin_toolset_keys()
platform_default_keys = {p["default_toolset"] for p in PLATFORMS.values()}
# If the saved list contains any configurable keys directly, the user
# has explicitly configured this platform — use direct membership.
@@ -614,7 +649,10 @@ def _get_platform_tools(
has_explicit_config = any(ts in configurable_keys for ts in toolset_names)
if has_explicit_config:
enabled_toolsets = {ts for ts in toolset_names if ts in configurable_keys}
enabled_toolsets = {
ts for ts in toolset_names
if ts in configurable_keys and _toolset_allowed_for_platform(ts, platform)
}
else:
# No explicit config — fall back to resolving composite toolset names
# (e.g. "hermes-cli") to individual tool names and reverse-mapping.
@@ -624,14 +662,52 @@ def _get_platform_tools(
enabled_toolsets = set()
for ts_key, _, _ in CONFIGURABLE_TOOLSETS:
if not _toolset_allowed_for_platform(ts_key, platform):
continue
ts_tools = set(resolve_toolset(ts_key))
if ts_tools and ts_tools.issubset(all_tool_names):
enabled_toolsets.add(ts_key)
default_off = set(_DEFAULT_OFF_TOOLSETS)
if platform in default_off:
# Legacy safety: if the platform's own name matches a default-off
# toolset (e.g. `homeassistant` platform + `homeassistant` toolset),
# keep that toolset enabled on first install. Skip this dodge for
# platform-restricted toolsets — those are always opt-in even on
# their own platform (e.g. `discord` + `discord` should stay OFF).
if platform in default_off and platform not in _TOOLSET_PLATFORM_RESTRICTIONS:
default_off.remove(platform)
enabled_toolsets -= default_off
# Recover non-configurable platform toolsets (e.g. discord, feishu_doc,
# feishu_drive). These are part of the platform's default composite but
# absent from CONFIGURABLE_TOOLSETS, so they can't appear in the TUI
# checklist or in a user-saved config. Must run in BOTH branches —
# otherwise saving via `hermes tools` (which flips has_explicit_config
# to True) silently drops them.
platform_tool_universe = set(resolve_toolset(PLATFORMS[platform]["default_toolset"]))
configurable_tool_universe = set()
for ck in configurable_keys:
configurable_tool_universe.update(resolve_toolset(ck))
claimed = set()
for ts_key in enabled_toolsets:
claimed.update(resolve_toolset(ts_key))
skip = configurable_keys | plugin_ts_keys | platform_default_keys
skip |= {k for k in TOOLSETS if k.startswith("hermes-")}
skip |= set(_DEFAULT_OFF_TOOLSETS) - {platform}
for ts_key, ts_def in TOOLSETS.items():
if ts_key in skip:
continue
if ts_def.get("includes"):
continue
ts_tools = set(resolve_toolset(ts_key))
if not ts_tools or not ts_tools.issubset(platform_tool_universe):
continue
if ts_tools.issubset(configurable_tool_universe):
continue
if not ts_tools.issubset(claimed):
enabled_toolsets.add(ts_key)
claimed.update(ts_tools)
# Plugin toolsets: enabled by default unless explicitly disabled, or
# unless the toolset is in _DEFAULT_OFF_TOOLSETS (e.g. spotify —
# shipped as a bundled plugin but user must opt in via `hermes tools`
@@ -639,7 +715,6 @@ def _get_platform_tools(
# A plugin toolset is "known" for a platform once `hermes tools`
# has been saved for that platform (tracked via known_plugin_toolsets).
# Unknown plugins default to enabled; known-but-absent = disabled.
plugin_ts_keys = _get_plugin_toolset_keys()
if plugin_ts_keys:
known_map = config.get("known_plugin_toolsets", {})
known_for_platform = set(known_map.get(platform, []))
@@ -657,7 +732,6 @@ def _get_platform_tools(
# Preserve any explicit non-configurable toolset entries (for example,
# custom toolsets or MCP server names saved in platform_toolsets).
platform_default_keys = {p["default_toolset"] for p in PLATFORMS.values()}
explicit_passthrough = {
ts
for ts in toolset_names
@@ -703,6 +777,14 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
"""
config.setdefault("platform_toolsets", {})
# Drop platform-scoped toolsets that don't apply here. Prevents the
# "Configure all platforms" checklist (or a hand-edited config.yaml)
# from turning on, say, the `discord` toolset for Telegram.
enabled_toolset_keys = {
ts for ts in enabled_toolset_keys
if _toolset_allowed_for_platform(ts, platform)
}
# Get the set of all configurable toolset keys (built-in + plugin)
configurable_keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
plugin_keys = _get_plugin_toolset_keys()
@@ -717,6 +799,7 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
existing_toolsets = config.get("platform_toolsets", {}).get(platform, [])
if not isinstance(existing_toolsets, list):
existing_toolsets = []
existing_toolsets = [str(ts) for ts in existing_toolsets]
# Preserve any entries that are NOT configurable toolsets and NOT platform
# defaults (i.e. only MCP server names should be preserved)
@@ -724,6 +807,11 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
entry for entry in existing_toolsets
if entry not in configurable_keys and entry not in platform_default_keys
}
# Opening `hermes tools` is the user's opt-in to reconfigure tools, so treat
# saving from the picker as consent to clear the "no_mcp" sentinel. The
# picker has no checkbox for no_mcp, so without this users who once set it
# by hand could never re-enable MCP servers through the UI.
preserved_entries.discard("no_mcp")
# Merge preserved entries with new enabled toolsets
config["platform_toolsets"][platform] = sorted(enabled_toolset_keys | preserved_entries)
@@ -831,7 +919,7 @@ def _estimate_tool_tokens() -> Dict[str, int]:
return _tool_token_cache
def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str]:
def _prompt_toolset_checklist(platform_label: str, enabled: Set[str], platform: str = "cli") -> Set[str]:
"""Multi-select checklist of toolsets. Returns set of selected toolset keys."""
from hermes_cli.curses_ui import curses_checklist
from toolsets import resolve_toolset
@@ -839,7 +927,12 @@ def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str
# Pre-compute per-tool token counts (cached after first call).
tool_tokens = _estimate_tool_tokens()
effective = _get_effective_configurable_toolsets()
effective_all = _get_effective_configurable_toolsets()
# Drop platform-scoped toolsets that don't apply to this platform.
effective = [
(k, l, d) for (k, l, d) in effective_all
if _toolset_allowed_for_platform(k, platform)
]
labels = []
for ts_key, ts_label, ts_desc in effective:
@@ -1753,7 +1846,7 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
checklist_preselected = current_enabled - _DEFAULT_OFF_TOOLSETS
# Show checklist
new_enabled = _prompt_toolset_checklist(pinfo["label"], checklist_preselected)
new_enabled = _prompt_toolset_checklist(pinfo["label"], checklist_preselected, pkey)
added = new_enabled - current_enabled
removed = current_enabled - new_enabled
@@ -2109,7 +2202,11 @@ def _apply_mcp_change(config: dict, targets: List[str], action: str) -> Set[str]
def _print_tools_list(enabled_toolsets: set, mcp_servers: dict, platform: str = "cli"):
"""Print a summary of enabled/disabled toolsets and MCP tool filters."""
effective = _get_effective_configurable_toolsets()
effective_all = _get_effective_configurable_toolsets()
effective = [
(k, l, d) for (k, l, d) in effective_all
if _toolset_allowed_for_platform(k, platform)
]
builtin_keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
print(f"Built-in toolsets ({platform}):")
@@ -2175,6 +2272,20 @@ def tools_disable_enable_command(args):
_print_error(f"Unknown toolset '{name}'")
toolset_targets = [t for t in toolset_targets if t in valid_toolsets]
# Reject platform-scoped toolsets on platforms that don't allow them.
restricted_targets = [
t for t in toolset_targets
if not _toolset_allowed_for_platform(t, platform)
]
if restricted_targets:
for name in restricted_targets:
allowed = sorted(_TOOLSET_PLATFORM_RESTRICTIONS.get(name) or set())
_print_error(
f"Toolset '{name}' is not available on platform '{platform}' "
f"(only: {', '.join(allowed)})"
)
toolset_targets = [t for t in toolset_targets if t not in restricted_targets]
if toolset_targets:
_apply_toolset_change(config, platform, toolset_targets, action)
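The definitions of `_toolset_allowed_for_platform` and `_TOOLSET_PLATFORM_RESTRICTIONS` fall outside this hunk. Judging only from how they are called here, the check might look roughly like the sketch below. The table contents and the helper names `RESTRICTIONS`, `toolset_allowed_for_platform`, and `split_targets` are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical restriction table: toolset key -> platforms allowed to use it.
# A toolset with no entry is assumed to be allowed everywhere.
RESTRICTIONS: Dict[str, Set[str]] = {"discord_admin": {"discord"}}


def toolset_allowed_for_platform(name: str, platform: str) -> bool:
    """Return True when the toolset is unrestricted or lists this platform."""
    allowed = RESTRICTIONS.get(name)
    return not allowed or platform in allowed


def split_targets(targets: Iterable[str], platform: str) -> Tuple[List[str], List[str]]:
    """Partition requested toolsets into (accepted, rejected) for a platform."""
    accepted: List[str] = []
    rejected: List[str] = []
    for target in targets:
        if toolset_allowed_for_platform(target, platform):
            accepted.append(target)
        else:
            rejected.append(target)
    return accepted, rejected
```

Rejected names would then be reported to the user, and only the accepted ones passed on to the change step.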
+1 -15
@@ -53,7 +53,7 @@ try:
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, HTMLResponse, JSONResponse
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel, field_validator
from pydantic import BaseModel
except ImportError:
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
@@ -425,20 +425,6 @@ class EnvVarUpdate(BaseModel):
key: str
value: str
@field_validator("key")
@classmethod
def key_must_be_nonempty(cls, v: str) -> str:
if not v.strip():
raise ValueError("key must not be empty")
return v
@field_validator("value")
@classmethod
def value_must_be_nonempty(cls, v: str) -> str:
if not v.strip():
raise ValueError("value must not be empty; use DELETE /api/env to remove a key")
return v
class EnvVarDelete(BaseModel):
key: str
+27 -23
@@ -288,30 +288,34 @@ def get_tool_definitions(
filtered_tools[i] = {"type": "function", "function": dynamic_schema}
break
# Rebuild discord_server schema based on the bot's privileged intents
# (detected from GET /applications/@me) and the user's action allowlist
# in config. Hides actions the bot's intents don't support so the
# model never attempts them, and annotates fetch_messages when the
# Rebuild discord / discord_admin schemas based on the bot's privileged
# intents (detected from GET /applications/@me) and the user's action
# allowlist in config. Hides actions the bot's intents don't support so
# the model never attempts them, and annotates fetch_messages when the
# MESSAGE_CONTENT intent is missing.
if "discord_server" in available_tool_names:
try:
from tools.discord_tool import get_dynamic_schema
dynamic = get_dynamic_schema()
except Exception: # pragma: no cover — defensive, fall back to static
dynamic = None
if dynamic is None:
# Tool filtered out entirely (empty allowlist or detection disabled
# the only remaining actions). Drop it from the schema list.
filtered_tools = [
t for t in filtered_tools
if t.get("function", {}).get("name") != "discord_server"
]
available_tool_names.discard("discord_server")
else:
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == "discord_server":
filtered_tools[i] = {"type": "function", "function": dynamic}
break
_discord_schema_fns = {
"discord": "get_dynamic_schema_core",
"discord_admin": "get_dynamic_schema_admin",
}
for discord_tool_name in _discord_schema_fns:
if discord_tool_name in available_tool_names:
try:
from tools import discord_tool as _dt
schema_fn = getattr(_dt, _discord_schema_fns[discord_tool_name])
dynamic = schema_fn()
except Exception:
dynamic = None
if dynamic is None:
filtered_tools = [
t for t in filtered_tools
if t.get("function", {}).get("name") != discord_tool_name
]
available_tool_names.discard(discord_tool_name)
else:
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == discord_tool_name:
filtered_tools[i] = {"type": "function", "function": dynamic}
break
# Strip web tool cross-references from browser_navigate description when
# web_search / web_extract are not available. The static schema says
+25
@@ -91,4 +91,29 @@
// Register this plugin — the dashboard picks it up automatically.
window.__HERMES_PLUGINS__.register("example", ExamplePage);
// ─────────────────────────────────────────────────────────────────────
// Page-scoped slot demo: inject a small banner at the top of /sessions.
//
// Built-in pages expose named slots (<page>:top, <page>:bottom) that
// plugins can populate without overriding the whole route. The
// manifest lists the slots we use in its `slots` array so the shell
// knows to render <PluginSlot name="sessions:top" /> there.
// ─────────────────────────────────────────────────────────────────────
function SessionsTopBanner() {
return React.createElement(Card, {
className: "border-dashed",
},
React.createElement(CardContent, { className: "flex items-center gap-3 py-2" },
React.createElement(Badge, { variant: "outline" }, "Example"),
React.createElement("span", {
className: "text-xs text-muted-foreground",
}, "This banner was injected into the Sessions page by the example plugin via the ",
React.createElement("code", { className: "font-courier" }, "sessions:top"),
" slot."),
),
);
}
window.__HERMES_PLUGINS__.registerSlot("example", "sessions:top", SessionsTopBanner);
})();
@@ -8,6 +8,7 @@
"path": "/example",
"position": "after:skills"
},
"slots": ["sessions:top"],
"entry": "dist/index.js",
"api": "plugin_api.py"
}
+116 -11
@@ -2399,8 +2399,37 @@ class AIAgent:
base_url=aux_base_url,
api_key=aux_api_key,
config_context_length=getattr(self, "_aux_compression_context_length_config", None),
provider=getattr(self, "provider", ""),
)
# Also resolve the flush_memories auxiliary model — it may differ
# from the compression model when the user configures separate
# auxiliary.flush_memories.provider/model, or when the fallback
# chain lands on a different provider. flush_memories runs with
# the FULL pre-compression conversation, so its model's context
# must also be respected.
try:
flush_client, flush_model = get_text_auxiliary_client(
"flush_memories",
main_runtime=self._current_main_runtime(),
)
if flush_client and flush_model:
_flush_ctx = get_model_context_length(
flush_model,
base_url=str(getattr(flush_client, "base_url", "") or ""),
api_key=str(getattr(flush_client, "api_key", "") or ""),
provider=getattr(self, "provider", ""),
)
if _flush_ctx and _flush_ctx < aux_context:
logger.info(
"flush_memories model %s context (%d) < compression "
"model %s context (%d) — using the smaller value",
flush_model, _flush_ctx, aux_model, aux_context,
)
aux_context = _flush_ctx
except Exception:
pass # Non-fatal — fall through with compression model's context
# Hard floor: the auxiliary compression model must have at least
# MINIMUM_CONTEXT_LENGTH (64K) tokens of context. The main model
# is already required to meet this floor (checked earlier in
@@ -2420,13 +2449,25 @@ class AIAgent:
)
threshold = self.context_compressor.threshold_tokens
if aux_context < threshold:
# Auto-correct: lower the live session threshold so
# compression actually works this session. The hard floor
# above guarantees aux_context >= MINIMUM_CONTEXT_LENGTH,
# so the new threshold is always >= 64K.
# Headroom: the threshold budgets RAW MESSAGES only, but the
# request that auxiliary callers (compression summariser and
# flush_memories) actually send also includes the system prompt and
# every tool schema. We must ensure threshold + headroom <= aux_context
# or the first compression/flush request will overflow.
#
# This applies even when aux_context > threshold (the common
# same-model case after a155b4a1) — e.g. 128K context, 85%
# threshold = 108K, 20K overhead → 108K + 20K = 128K exactly
# at the limit, and any token-estimate variance causes a 400.
from agent.model_metadata import estimate_request_tokens_rough
tool_overhead = estimate_request_tokens_rough([], tools=self.tools)
headroom = tool_overhead + 12_000
effective_limit = max(aux_context - headroom, MINIMUM_CONTEXT_LENGTH)
if effective_limit < threshold:
old_threshold = threshold
new_threshold = aux_context
new_threshold = effective_limit
self.context_compressor.threshold_tokens = new_threshold
# Keep threshold_percent in sync so future main-model
# context_length changes (update_model) re-derive from a
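The headroom arithmetic in this hunk can be checked with a small worked example. The sketch below follows the same steps under stated assumptions: `MINIMUM_CONTEXT_LENGTH` is taken to be 64,000 (the source only says "64K"), the tool overhead is passed in as a plain number instead of being estimated from real schemas, and `adjusted_threshold` is a hypothetical name.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # assumption: the source only states "64K"
FIXED_HEADROOM = 12_000          # constant added to the tool overhead in the hunk


def adjusted_threshold(aux_context: int, threshold: int, tool_overhead: int) -> int:
    """Return the compression threshold after reserving request headroom."""
    headroom = tool_overhead + FIXED_HEADROOM
    # Never let the usable limit fall below the hard floor.
    effective_limit = max(aux_context - headroom, MINIMUM_CONTEXT_LENGTH)
    # Lower the threshold only when it would otherwise exceed the usable limit.
    return effective_limit if effective_limit < threshold else threshold


# 128K context, 85% threshold, 8K of tool schemas: the threshold is pulled down.
tight = adjusted_threshold(128_000, 108_800, 8_000)
# 200K context, 50% threshold: plenty of room, so the threshold is unchanged.
roomy = adjusted_threshold(200_000, 100_000, 8_000)
```

Note that when the floor applies, the returned threshold plus headroom can still exceed `aux_context`; the sketch reproduces that property of the hunk rather than correcting it.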
@@ -7975,6 +8016,67 @@ class AIAgent:
messages.pop() # remove flush msg
return
# ── Defence-in-depth: trim messages to fit auxiliary context ──
#
# _check_compression_model_feasibility already lowers the
# compression threshold so conversations *triggered by preflight
# compression* should fit. But flush_memories is also called
# from CLI /new and gateway session resets — paths that bypass
# the preflight check entirely. Trim here as a safety net.
try:
from agent.auxiliary_client import get_text_auxiliary_client
from agent.model_metadata import (
get_model_context_length,
estimate_messages_tokens_rough,
)
_fc, _fm = get_text_auxiliary_client(
"flush_memories",
main_runtime=self._current_main_runtime(),
)
_fctx = 0
if _fc and _fm:
_fctx = get_model_context_length(
_fm,
base_url=str(getattr(_fc, "base_url", "") or ""),
api_key=str(getattr(_fc, "api_key", "") or ""),
provider=getattr(self, "provider", ""),
)
if not _fctx:
_fctx = getattr(
getattr(self, "context_compressor", None),
"context_length", 0,
)
if _fctx:
_budget = _fctx - 5120 - 500 # output + tool schema
if _budget > 0:
_est = estimate_messages_tokens_rough(api_messages)
if _est > _budget:
_sys = []
_conv = api_messages
if api_messages and api_messages[0].get("role") == "system":
_sys = [api_messages[0]]
_conv = api_messages[1:]
_rem = _budget - estimate_messages_tokens_rough(_sys)
_kept: list = []
_acc = 0
for _m in reversed(_conv):
_mt = estimate_messages_tokens_rough([_m])
if _acc + _mt > _rem:
break
_kept.append(_m)
_acc += _mt
_kept.reverse()
if len(_kept) < 3 and len(_conv) >= 3:
_kept = _conv[-3:]
api_messages = _sys + _kept
logger.info(
"flush_memories: trimmed %d→%d msgs to fit "
"%d-token aux context",
len(_sys) + len(_conv), len(api_messages), _fctx,
)
except Exception as _te:
logger.debug("flush_memories: context trim failed: %s", _te)
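The trimming loop above keeps the system message, then walks the conversation from newest to oldest until the budget is spent. A standalone sketch of that idea follows. It is not the agent's code: `rough_tokens` is a stand-in estimator that assumes roughly four characters per token, and `trim_to_budget` is a hypothetical name.

```python
from typing import Dict, List

Message = Dict[str, str]


def rough_tokens(messages: List[Message]) -> int:
    """Stand-in estimator (assumption): about four characters per token."""
    return sum(len(m.get("content", "")) for m in messages) // 4


def trim_to_budget(messages: List[Message], budget: int, min_keep: int = 3) -> List[Message]:
    """Keep the system message plus the newest messages that fit the budget."""
    if rough_tokens(messages) <= budget:
        return messages
    system: List[Message] = []
    conversation = messages
    if messages and messages[0].get("role") == "system":
        system, conversation = [messages[0]], messages[1:]
    remaining = budget - rough_tokens(system)
    kept: List[Message] = []
    used = 0
    # Walk backwards so the most recent turns are kept first.
    for message in reversed(conversation):
        cost = rough_tokens([message])
        if used + cost > remaining:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    # As in the hunk, always keep the last few turns even if they overshoot.
    if len(kept) < min_keep and len(conversation) >= min_keep:
        kept = conversation[-min_keep:]
    return system + kept
```

The final fallback means the result can exceed the budget when the last three messages are large, so a caller would still need to handle an overflow error from the provider.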
# Use auxiliary client for the flush call when available --
# it's cheaper and avoids Codex Responses API incompatibility.
from agent.auxiliary_client import (
@@ -8010,17 +8112,20 @@ class AIAgent:
response = None
if not _aux_available and self.api_mode == "codex_responses":
# No auxiliary client -- use the Codex Responses path directly
# No auxiliary client -- use the Codex Responses path directly.
# The Responses API does not accept `temperature` on any
# supported backend (chatgpt.com/backend-api/codex rejects it
# outright; api.openai.com + gpt-5/o-series reasoning models
# and Copilot Responses reject it on reasoning models). The
# transport intentionally never sets it — strip any leftover
# here so the flush fallback matches the main-loop behavior.
codex_kwargs = self._build_api_kwargs(api_messages)
_ct_flush = self._get_transport()
if _ct_flush is not None:
codex_kwargs["tools"] = _ct_flush.convert_tools([memory_tool_def])
elif not codex_kwargs.get("tools"):
codex_kwargs["tools"] = [memory_tool_def]
if _flush_temperature is not None:
codex_kwargs["temperature"] = _flush_temperature
else:
codex_kwargs.pop("temperature", None)
codex_kwargs.pop("temperature", None)
if "max_output_tokens" in codex_kwargs:
codex_kwargs["max_output_tokens"] = 5120
response = self._run_codex_stream(codex_kwargs)
+99 -7
@@ -29,10 +29,25 @@ BOLD='\033[1m'
REPO_URL_SSH="git@github.com:NousResearch/hermes-agent.git"
REPO_URL_HTTPS="https://github.com/NousResearch/hermes-agent.git"
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
INSTALL_DIR="${HERMES_INSTALL_DIR:-$HERMES_HOME/hermes-agent}"
# INSTALL_DIR is resolved AFTER arg parsing and OS detection so we can pick an
# FHS-style layout for root installs. Track whether the user gave us an
# explicit directory — if so we never override it.
if [ -n "${HERMES_INSTALL_DIR:-}" ]; then
INSTALL_DIR="$HERMES_INSTALL_DIR"
INSTALL_DIR_EXPLICIT=true
else
INSTALL_DIR=""
INSTALL_DIR_EXPLICIT=false
fi
PYTHON_VERSION="3.11"
NODE_VERSION="22"
# FHS-style root install layout (set by resolve_install_layout when applicable):
# code at /usr/local/lib/hermes-agent, command at /usr/local/bin/hermes,
# data still at /root/.hermes (HERMES_HOME). Matches Claude Code / Codex CLI
# and keeps Docker bind-mounted /root/ volumes lean.
ROOT_FHS_LAYOUT=false
# Options
USE_VENV=true
RUN_SETUP=true
@@ -64,6 +79,7 @@ while [[ $# -gt 0 ]]; do
;;
--dir)
INSTALL_DIR="$2"
INSTALL_DIR_EXPLICIT=true
shift 2
;;
--hermes-home)
@@ -79,9 +95,20 @@ while [[ $# -gt 0 ]]; do
echo " --no-venv Don't create virtual environment"
echo " --skip-setup Skip interactive setup wizard"
echo " --branch NAME Git branch to install (default: main)"
echo " --dir PATH Installation directory (default: ~/.hermes/hermes-agent)"
echo " --dir PATH Installation directory"
echo " default (non-root): ~/.hermes/hermes-agent"
echo " default (root, Linux): /usr/local/lib/hermes-agent"
echo " --hermes-home PATH Data directory (default: ~/.hermes, or \$HERMES_HOME)"
echo " -h, --help Show this help"
echo ""
echo "Notes:"
echo " When running as root on Linux, Hermes installs the code under"
echo " /usr/local/lib/hermes-agent and links the command into"
echo " /usr/local/bin/hermes (FHS layout — matches Claude Code / Codex CLI)."
echo " Data, config, sessions, and logs still live in \$HERMES_HOME"
echo " (default /root/.hermes). This keeps Docker bind-mounted volumes"
echo " small and ensures the command is on PATH for all shells."
echo " Existing installs at \$HERMES_HOME/hermes-agent are preserved in-place."
exit 0
;;
*)
@@ -163,9 +190,60 @@ is_termux() {
[ -n "${TERMUX_VERSION:-}" ] || [[ "${PREFIX:-}" == *"com.termux/files/usr"* ]]
}
# Decide where the repo checkout + venv live, and where the `hermes` command
# symlink goes. Called after detect_os so $OS/$DISTRO are known.
#
# Defaults:
# - Non-root, any OS: INSTALL_DIR = $HERMES_HOME/hermes-agent
# command link in $HOME/.local/bin
# - Termux (any uid): INSTALL_DIR = $HERMES_HOME/hermes-agent
# command link in $PREFIX/bin (already on PATH)
# - Root on Linux (new): INSTALL_DIR = /usr/local/lib/hermes-agent
# command link in /usr/local/bin
# (unless a legacy install already exists at
# $HERMES_HOME/hermes-agent — then preserve it)
#
# Always no-op when the user set --dir or $HERMES_INSTALL_DIR.
resolve_install_layout() {
if [ "$INSTALL_DIR_EXPLICIT" = true ]; then
log_info "Install directory: $INSTALL_DIR (explicit)"
return 0
fi
# Termux: package manager manages /data/data/..., keep code in HERMES_HOME.
if is_termux; then
INSTALL_DIR="$HERMES_HOME/hermes-agent"
return 0
fi
# Root on Linux: prefer FHS layout unless a legacy install already exists.
# macOS root installs keep the legacy layout because /usr/local/ on macOS
# is Homebrew territory and we don't want to fight that.
if [ "$OS" = "linux" ] && [ "$(id -u)" -eq 0 ]; then
if [ -d "$HERMES_HOME/hermes-agent/.git" ]; then
INSTALL_DIR="$HERMES_HOME/hermes-agent"
log_info "Existing install detected at $INSTALL_DIR — keeping legacy layout"
log_info " (new root installs use /usr/local/lib/hermes-agent)"
return 0
fi
INSTALL_DIR="/usr/local/lib/hermes-agent"
ROOT_FHS_LAYOUT=true
log_info "Root install on Linux — using FHS layout"
log_info " Code: $INSTALL_DIR"
log_info " Command: /usr/local/bin/hermes"
log_info " Data: $HERMES_HOME (unchanged)"
return 0
fi
# Default: non-root, non-Termux → legacy user-scoped layout.
INSTALL_DIR="$HERMES_HOME/hermes-agent"
}
get_command_link_dir() {
if is_termux && [ -n "${PREFIX:-}" ]; then
echo "$PREFIX/bin"
elif [ "$ROOT_FHS_LAYOUT" = true ]; then
echo "/usr/local/bin"
else
echo "$HOME/.local/bin"
fi
@@ -174,6 +252,8 @@ get_command_link_dir() {
get_command_link_display_dir() {
if is_termux && [ -n "${PREFIX:-}" ]; then
echo '$PREFIX/bin'
elif [ "$ROOT_FHS_LAYOUT" = true ]; then
echo '/usr/local/bin'
else
echo '~/.local/bin'
fi
@@ -975,6 +1055,14 @@ setup_path() {
return 0
fi
# FHS layout: /usr/local/bin is on PATH for every standard shell, nothing to inject.
if [ "$ROOT_FHS_LAYOUT" = true ]; then
export PATH="$command_link_dir:$PATH"
log_info "/usr/local/bin is already on PATH for all shells"
log_success "hermes command ready"
return 0
fi
# Check if ~/.local/bin is on PATH; if not, add it to shell config.
# Detect the user's actual login shell (not the shell running this script,
# which is always bash when piped from curl).
@@ -1339,12 +1427,12 @@ print_success() {
echo ""
# Show file locations
echo -e "${CYAN}${BOLD}📁 Your files (all in ~/.hermes/):${NC}"
echo -e "${CYAN}${BOLD}📁 Your files:${NC}"
echo ""
echo -e " ${YELLOW}Config:${NC} ~/.hermes/config.yaml"
echo -e " ${YELLOW}API Keys:${NC} ~/.hermes/.env"
echo -e " ${YELLOW}Data:${NC} ~/.hermes/cron/, sessions/, logs/"
echo -e " ${YELLOW}Code:${NC} ~/.hermes/hermes-agent/"
echo -e " ${YELLOW}Config:${NC} $HERMES_HOME/config.yaml"
echo -e " ${YELLOW}API Keys:${NC} $HERMES_HOME/.env"
echo -e " ${YELLOW}Data:${NC} $HERMES_HOME/cron/, sessions/, logs/"
echo -e " ${YELLOW}Code:${NC} $INSTALL_DIR"
echo ""
echo -e "${CYAN}─────────────────────────────────────────────────────────${NC}"
@@ -1364,6 +1452,9 @@ print_success() {
if [ "$DISTRO" = "termux" ]; then
echo -e "${YELLOW}⚡ 'hermes' was linked into $(get_command_link_display_dir), which is already on PATH in Termux.${NC}"
echo ""
elif [ "$ROOT_FHS_LAYOUT" = true ]; then
echo -e "${YELLOW}⚡ 'hermes' was linked into /usr/local/bin and is ready to use — no shell reload needed.${NC}"
echo ""
else
echo -e "${YELLOW}⚡ Reload your shell to use 'hermes' command:${NC}"
echo ""
@@ -1415,6 +1506,7 @@ main() {
print_banner
detect_os
resolve_install_layout
install_uv
check_python
check_git
+2
@@ -92,6 +92,7 @@ AUTHOR_MAP = {
"104278804+Sertug17@users.noreply.github.com": "Sertug17",
"112503481+caentzminger@users.noreply.github.com": "caentzminger",
"258577966+voidborne-d@users.noreply.github.com": "voidborne-d",
"xydarcher@uestc.edu.cn": "Readon",
"sir_even@icloud.com": "sirEven",
"36056348+sirEven@users.noreply.github.com": "sirEven",
"70424851+insecurejezza@users.noreply.github.com": "insecurejezza",
@@ -504,6 +505,7 @@ AUTHOR_MAP = {
"screenmachine@gmail.com": "teknium1",
"chenzeshi@live.com": "chen1749144759",
"mor.aleksandr@yahoo.com": "MorAlekss",
"ash@users.noreply.github.com": "ash",
}
+26
@@ -847,6 +847,32 @@ class TestTokenBudgetTailProtection:
assert isinstance(pruned, int)
class TestUpdateModelBudgets:
"""Regression: update_model() must recalculate token budgets."""
def test_tail_budget_recalculated(self):
"""tail_token_budget must change after switching to a different context length."""
from unittest.mock import patch
with patch("agent.context_compressor.get_model_context_length", return_value=200_000):
comp = ContextCompressor("model-a", threshold_percent=0.50, quiet_mode=True)
old_tail = comp.tail_token_budget
old_max_summary = comp.max_summary_tokens
comp.update_model("model-b", context_length=32_000)
assert comp.tail_token_budget != old_tail, "tail_token_budget should change"
assert comp.tail_token_budget < old_tail, "smaller context → smaller budget"
assert comp.max_summary_tokens != old_max_summary, "max_summary_tokens should change"
def test_budgets_proportional(self):
"""Budgets should be proportional to context_length after update."""
from unittest.mock import patch
with patch("agent.context_compressor.get_model_context_length", return_value=100_000):
comp = ContextCompressor("model-a", threshold_percent=0.50, quiet_mode=True)
comp.update_model("model-b", context_length=10_000)
assert comp.tail_token_budget == int(comp.threshold_tokens * comp.summary_target_ratio)
assert comp.max_summary_tokens == min(int(10_000 * 0.05), 4000)
class TestTruncateToolCallArgsJson:
"""Regression tests for #11762.
@@ -0,0 +1,201 @@
"""Regression tests for the generic unsupported-parameter detector in
``agent.auxiliary_client``.
The original temperature-specific detector (PR #15621) was generalized so the
same reactive-retry strategy covers any provider that rejects an arbitrary
request parameter (``max_tokens``, ``seed``, ``top_p``, future quirks), not
just ``temperature``. Credit @nicholasrae (PR #15416) for the generalization
pattern.
These tests lock in:
* ``_is_unsupported_parameter_error(exc, param)`` across common phrasings
* the back-compat wrapper ``_is_unsupported_temperature_error`` still works
* the max_tokens retry branch no longer pops a key that was never set
(``max_tokens is None`` gate)
* the max_tokens retry branch matches via the generic helper on top of the
legacy ``"max_tokens"`` / ``"unsupported_parameter"`` substring checks
"""
from unittest.mock import patch, MagicMock, AsyncMock
import pytest
from agent.auxiliary_client import (
call_llm,
async_call_llm,
_is_unsupported_parameter_error,
_is_unsupported_temperature_error,
)
class TestIsUnsupportedParameterError:
"""The generic detector must match real provider phrasings for any param."""
@pytest.mark.parametrize("param,message", [
# temperature phrasings (regression coverage via the generic API)
("temperature", "HTTP 400: Unsupported parameter: temperature"),
("temperature", "Error code: 400 - {'error': {'code': 'unsupported_parameter', 'param': 'temperature'}}"),
("temperature", "this model does not support temperature"),
# max_tokens phrasings
("max_tokens", "HTTP 400: Unsupported parameter: max_tokens"),
("max_tokens", "Unknown parameter: max_tokens — use max_completion_tokens"),
("max_tokens", "Invalid parameter: max_tokens is not supported"),
# arbitrary future params
("seed", "HTTP 400: unrecognized parameter: seed"),
("top_p", "Error: top_p is not supported for this model"),
])
def test_matches_real_provider_messages(self, param, message):
assert _is_unsupported_parameter_error(RuntimeError(message), param) is True
@pytest.mark.parametrize("param,message", [
# Param not mentioned at all
("temperature", "HTTP 400: max_tokens is too large"),
# Param mentioned but not flagged as unsupported
("temperature", "temperature must be between 0 and 2"),
# Totally unrelated 400
("max_tokens", "Rate limit exceeded"),
# Connection-level errors
("temperature", "Connection reset by peer"),
])
def test_does_not_match_unrelated_errors(self, param, message):
assert _is_unsupported_parameter_error(RuntimeError(message), param) is False
def test_empty_param_returns_false(self):
assert _is_unsupported_parameter_error(
RuntimeError("HTTP 400: Unsupported parameter: temperature"), ""
) is False
def test_temperature_wrapper_delegates_to_generic(self):
"""Back-compat: ``_is_unsupported_temperature_error`` still routes through."""
msg = "HTTP 400: Unsupported parameter: temperature"
assert _is_unsupported_temperature_error(RuntimeError(msg)) is True
# And the unrelated-case still holds
assert _is_unsupported_temperature_error(
RuntimeError("max_tokens is too large")) is False
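The implementation of `_is_unsupported_parameter_error` is not part of this diff. As a rough illustration of what the parametrized cases above require, one possible detector is sketched below. The marker list and the name `looks_like_unsupported_parameter` are assumptions, chosen only so that the phrasings listed in these tests would be classified the same way.

```python
# Hypothetical phrasings that flag a parameter as rejected (assumption).
_MARKERS = (
    "unsupported parameter",
    "unsupported_parameter",
    "unknown parameter",
    "unrecognized parameter",
    "invalid parameter",
    "not supported",
    "does not support",
)


def looks_like_unsupported_parameter(exc: Exception, param: str) -> bool:
    """Return True when the error text names `param` and marks it as rejected."""
    if not param:
        # Guard first: an empty string is a substring of every message.
        return False
    message = str(exc).lower()
    if param.lower() not in message:
        return False
    return any(marker in message for marker in _MARKERS)
```

A value-range complaint such as "temperature must be between 0 and 2" names the parameter but contains none of the markers, so it is not treated as an unsupported-parameter error.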
def _dummy_response():
"""Sentinel — real code calls ``_validate_llm_response`` which we patch out."""
return {"ok": True}
class TestMaxTokensRetryHardening:
"""The max_tokens retry branch now (a) gates on ``max_tokens is not None``
and (b) also matches the generic phrasings via the helper.
"""
def test_sync_max_tokens_retry_skipped_when_max_tokens_is_none(self):
"""No max_tokens kwarg → must not pop/retry even if the error mentions it.
Before the hardening, ``kwargs.pop("max_tokens", None)`` was safe but
``kwargs["max_completion_tokens"] = max_tokens`` would set a None
value and hit the provider again. The gate skips the whole branch.
"""
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
err = RuntimeError("HTTP 400: Unsupported parameter: max_tokens")
client.chat.completions.create.side_effect = err
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
):
with pytest.raises(RuntimeError):
call_llm(
task="session_search",
messages=[{"role": "user", "content": "hi"}],
temperature=0.3,
# max_tokens omitted on purpose
)
# Only the initial attempt — no retry because the gate blocked it
assert client.chat.completions.create.call_count == 1
def test_sync_max_tokens_retry_matches_generic_phrasing(self):
"""A 400 saying "Unknown parameter: max_tokens" (not the legacy
substring ``"max_tokens"`` bare + no ``unsupported_parameter`` token)
now triggers the retry via the generic helper.
"""
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
err = RuntimeError("Unknown parameter: max_tokens")
response = _dummy_response()
client.chat.completions.create.side_effect = [err, response]
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
):
result = call_llm(
task="session_search",
messages=[{"role": "user", "content": "hi"}],
temperature=0.3,
max_tokens=512,
)
assert result is response
assert client.chat.completions.create.call_count == 2
second_call = client.chat.completions.create.call_args_list[1]
assert "max_tokens" not in second_call.kwargs
assert second_call.kwargs["max_completion_tokens"] == 512
@pytest.mark.asyncio
async def test_async_max_tokens_retry_skipped_when_max_tokens_is_none(self):
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
err = RuntimeError("HTTP 400: Unsupported parameter: max_tokens")
client.chat.completions.create = AsyncMock(side_effect=err)
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
):
with pytest.raises(RuntimeError):
await async_call_llm(
task="session_search",
messages=[{"role": "user", "content": "hi"}],
temperature=0.3,
)
assert client.chat.completions.create.call_count == 1
@pytest.mark.asyncio
async def test_async_max_tokens_retry_matches_generic_phrasing(self):
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
err = RuntimeError("Unknown parameter: max_tokens")
response = _dummy_response()
client.chat.completions.create = AsyncMock(side_effect=[err, response])
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
):
result = await async_call_llm(
task="session_search",
messages=[{"role": "user", "content": "hi"}],
temperature=0.3,
max_tokens=512,
)
assert result is response
assert client.chat.completions.create.await_count == 2
second_call = client.chat.completions.create.call_args_list[1]
assert "max_tokens" not in second_call.kwargs
assert second_call.kwargs["max_completion_tokens"] == 512
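The behaviour these tests lock in (retry once with `max_completion_tokens`, and skip the retry entirely when no `max_tokens` was supplied) can be summarised in a short sketch. This is not the code in `agent.auxiliary_client`: `call_with_max_tokens_fallback` is a hypothetical wrapper, and the error match is deliberately simplified.

```python
from typing import Any, Callable, Optional


def call_with_max_tokens_fallback(
    create: Callable[..., Any],
    *,
    max_tokens: Optional[int] = None,
    **kwargs: Any,
) -> Any:
    """Call `create`, retrying once with max_completion_tokens if needed."""
    if max_tokens is not None:
        kwargs["max_tokens"] = max_tokens
    try:
        return create(**kwargs)
    except Exception as exc:
        message = str(exc).lower()
        # Simplified match (assumption): the error names max_tokens as a parameter.
        rejected = "max_tokens" in message and "parameter" in message
        # Gate: with no max_tokens sent there is nothing to rename, so re-raise.
        if max_tokens is None or not rejected:
            raise
        kwargs.pop("max_tokens", None)
        kwargs["max_completion_tokens"] = max_tokens
        return create(**kwargs)
```

Keeping the gate ahead of the rename avoids sending `max_completion_tokens=None` on a second attempt, which is the failure mode described in the first test's docstring.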
@@ -0,0 +1,237 @@
"""Regression tests for the universal "unsupported temperature" retry in
``agent.auxiliary_client``.
Auxiliary callers (``flush_memories``, context compression, session search,
web extract summarisation, etc.) hardcode ``temperature=0.3`` for historical
reasons. Several provider/model combinations reject ``temperature`` with a
400:
* OpenAI Responses (gpt-5/o-series reasoning models)
* Copilot Responses (reasoning models)
* OpenRouter reasoning models (gpt-5.5, some anthropic via OAI-compat)
* Anthropic Opus 4.7+ via OpenAI-compat endpoints
* Kimi/Moonshot (server-managed)
``_fixed_temperature_for_model`` catches Kimi up front, and
``build_chat_completion_kwargs`` drops temperature for Anthropic Opus 4.7+,
but the same backend can accept ``temperature`` for some models and reject
it for others (for example gpt-5.4 accepts but gpt-5.5 rejects on the same
endpoint). An allow/deny-list is not maintainable across providers.
The universal fix is reactive: when a call returns an
``Unsupported parameter: temperature`` 400, retry once without temperature.
These tests lock in that behaviour for both sync and async paths.
"""
from unittest.mock import patch, MagicMock, AsyncMock
import pytest
from agent.auxiliary_client import (
call_llm,
async_call_llm,
_is_unsupported_temperature_error,
)
class TestIsUnsupportedTemperatureError:
"""The detector must match the phrasings providers actually return."""
@pytest.mark.parametrize("message", [
# OpenAI / Codex Responses
"HTTP 400: Unsupported parameter: temperature",
"Error code: 400 - {'error': {'message': \"Unsupported parameter: 'temperature'\"}}",
# Copilot / OpenAI error-code form
"Error code: 400 - {'error': {'code': 'unsupported_parameter', 'param': 'temperature'}}",
# OpenRouter-style
"Provider returned error: temperature is not supported for this model",
"this model does not support temperature",
# Anthropic-style via OAI-compat
"temperature: unknown parameter",
# Some gateways
"unrecognized request argument supplied: temperature",
])
def test_matches_real_provider_messages(self, message):
assert _is_unsupported_temperature_error(RuntimeError(message)) is True
@pytest.mark.parametrize("message", [
# Unrelated 400s must NOT trigger a silent-retry
"HTTP 400: Invalid value: 'tool'. Supported values are: 'assistant'...",
"max_tokens is too large for this model",
"Rate limit exceeded",
"Connection reset by peer",
# Temperature value error is a different class of problem
"temperature must be between 0 and 2",
])
def test_does_not_match_unrelated_errors(self, message):
assert _is_unsupported_temperature_error(RuntimeError(message)) is False
def _dummy_response():
# The real code calls _validate_llm_response which inspects
# response.choices[0].message. The tests here patch that out, so
# any sentinel object is fine.
return {"ok": True}
class TestCallLlmUnsupportedTemperatureRetry:
"""``call_llm`` retries once without temperature and returns on success."""
def _setup(self, first_exc):
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
client.chat.completions.create.side_effect = [first_exc, _dummy_response()]
return client
@pytest.mark.parametrize("error_message", [
"HTTP 400: Unsupported parameter: temperature",
"Error code: 400 - {'error': {'code': 'unsupported_parameter', 'param': 'temperature'}}",
"Provider error: this model does not support temperature",
])
def test_retries_once_without_temperature(self, error_message):
client = self._setup(RuntimeError(error_message))
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
):
result = call_llm(
task="flush_memories",
messages=[{"role": "user", "content": "remember this"}],
temperature=0.3,
max_tokens=500,
)
assert result == {"ok": True}
assert client.chat.completions.create.call_count == 2
first_kwargs = client.chat.completions.create.call_args_list[0].kwargs
retry_kwargs = client.chat.completions.create.call_args_list[1].kwargs
assert first_kwargs["temperature"] == 0.3
assert "temperature" not in retry_kwargs
# other kwargs preserved
assert retry_kwargs["max_tokens"] == 500
def test_non_temperature_400_does_not_retry_as_temperature(self):
"""Unrelated 400s (e.g. bad tool role) must not silently drop temp."""
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
non_temp_err = RuntimeError(
"HTTP 400: Invalid value: 'tool'. Supported values are: 'assistant'..."
)
client.chat.completions.create.side_effect = non_temp_err
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
patch("agent.auxiliary_client._try_payment_fallback",
return_value=None),
):
with pytest.raises(RuntimeError, match="Invalid value"):
call_llm(
task="flush_memories",
messages=[{"role": "user", "content": "x"}],
temperature=0.3,
max_tokens=500,
)
# Should NOT have retried (non-temperature 400 doesn't match)
assert client.chat.completions.create.call_count == 1
def test_no_retry_when_temperature_not_in_kwargs(self):
"""If caller didn't send temperature, don't invent a temperature-retry."""
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
# Provider complains about temperature even though we didn't send it.
# (Pathological but possible with misleading error text.) The guard
# ``"temperature" in kwargs`` must prevent an unnecessary retry.
err = RuntimeError("HTTP 400: Unsupported parameter: temperature")
client.chat.completions.create.side_effect = err
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
patch("agent.auxiliary_client._try_payment_fallback",
return_value=None),
):
with pytest.raises(RuntimeError):
call_llm(
task="flush_memories",
messages=[{"role": "user", "content": "x"}],
temperature=None, # explicit: no temperature sent
max_tokens=500,
)
assert client.chat.completions.create.call_count == 1
class TestAsyncCallLlmUnsupportedTemperatureRetry:
"""``async_call_llm`` mirror of the sync retry semantics."""
@pytest.mark.asyncio
async def test_async_retries_once_without_temperature(self):
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
client.chat.completions.create = AsyncMock(side_effect=[
RuntimeError("HTTP 400: Unsupported parameter: temperature"),
_dummy_response(),
])
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
):
result = await async_call_llm(
task="session_search",
messages=[{"role": "user", "content": "query"}],
temperature=0.3,
max_tokens=500,
)
assert result == {"ok": True}
assert client.chat.completions.create.await_count == 2
first_kwargs = client.chat.completions.create.call_args_list[0].kwargs
retry_kwargs = client.chat.completions.create.call_args_list[1].kwargs
assert first_kwargs["temperature"] == 0.3
assert "temperature" not in retry_kwargs
assert retry_kwargs["max_tokens"] == 500
@pytest.mark.asyncio
async def test_async_non_temperature_400_does_not_retry(self):
client = MagicMock()
client.base_url = "https://api.openai.com/v1"
client.chat.completions.create = AsyncMock(
side_effect=RuntimeError("HTTP 400: Invalid value: 'tool'"),
)
with (
patch("agent.auxiliary_client._resolve_task_provider_model",
return_value=("openai-codex", "gpt-5.5", None, None, None)),
patch("agent.auxiliary_client._get_cached_client",
return_value=(client, "gpt-5.5")),
patch("agent.auxiliary_client._validate_llm_response",
side_effect=lambda resp, _task: resp),
patch("agent.auxiliary_client._try_payment_fallback",
return_value=None),
):
with pytest.raises(RuntimeError, match="Invalid value"):
await async_call_llm(
task="session_search",
messages=[{"role": "user", "content": "x"}],
temperature=0.3,
max_tokens=500,
)
assert client.chat.completions.create.await_count == 1
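The retry behaviour these tests pin down can be sketched as follows. This is a minimal illustration, not the code in `agent.auxiliary_client`: the names `looks_like_unsupported_temperature` and `create_with_temperature_retry`, and the hint list, are assumptions chosen to agree with the parametrized messages above.

```python
# Hypothetical sketch of "retry once without temperature".
# The hint list is an assumption inferred from the test messages.
_UNSUPPORTED_HINTS = (
    "unsupported parameter",
    "unsupported_parameter",
    "unknown parameter",
    "unrecognized request argument",
    "does not support",
)


def looks_like_unsupported_temperature(exc: Exception) -> bool:
    """True when the error says the parameter itself is rejected."""
    text = str(exc).lower()
    return "temperature" in text and any(h in text for h in _UNSUPPORTED_HINTS)


def create_with_temperature_retry(create, **kwargs):
    """Call ``create`` and retry once without temperature if it is rejected."""
    try:
        return create(**kwargs)
    except Exception as exc:
        # Only retry when we actually sent temperature and the error is
        # about the parameter being unsupported (not about its value).
        if "temperature" not in kwargs or not looks_like_unsupported_temperature(exc):
            raise
        retry_kwargs = {k: v for k, v in kwargs.items() if k != "temperature"}
        return create(**retry_kwargs)
```

A value error such as "temperature must be between 0 and 2" contains none of the hints, so it is re-raised rather than retried, which is the distinction the negative test cases check.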
+390
@@ -0,0 +1,390 @@
"""Tests for cron job context_from feature (issue #5439 Option C)."""
import sys
from pathlib import Path
import pytest
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
@pytest.fixture
def cron_env(tmp_path, monkeypatch):
"""Isolated cron environment with temp HERMES_HOME."""
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "cron").mkdir()
(hermes_home / "cron" / "output").mkdir()
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
import cron.jobs as jobs_mod
monkeypatch.setattr(jobs_mod, "HERMES_DIR", hermes_home)
monkeypatch.setattr(jobs_mod, "CRON_DIR", hermes_home / "cron")
monkeypatch.setattr(jobs_mod, "JOBS_FILE", hermes_home / "cron" / "jobs.json")
monkeypatch.setattr(jobs_mod, "OUTPUT_DIR", hermes_home / "cron" / "output")
return hermes_home
class TestJobContextFromField:
"""Test that context_from is stored and retrieved correctly."""
def test_create_job_with_context_from_string(self, cron_env):
from cron.jobs import create_job, get_job
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(
prompt="Summarize findings",
schedule="every 2h",
context_from=job_a["id"],
)
assert job_b["context_from"] == [job_a["id"]]
loaded = get_job(job_b["id"])
assert loaded["context_from"] == [job_a["id"]]
def test_create_job_with_context_from_list(self, cron_env):
from cron.jobs import create_job, get_job
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(prompt="Find weather", schedule="every 1h")
job_c = create_job(
prompt="Summarize everything",
schedule="every 2h",
context_from=[job_a["id"], job_b["id"]],
)
assert job_c["context_from"] == [job_a["id"], job_b["id"]]
def test_create_job_without_context_from(self, cron_env):
from cron.jobs import create_job
job = create_job(prompt="Hello", schedule="every 1h")
assert job.get("context_from") is None
def test_context_from_empty_string_normalized_to_none(self, cron_env):
from cron.jobs import create_job
job = create_job(prompt="Hello", schedule="every 1h", context_from="")
assert job.get("context_from") is None
def test_context_from_empty_list_normalized_to_none(self, cron_env):
from cron.jobs import create_job
job = create_job(prompt="Hello", schedule="every 1h", context_from=[])
assert job.get("context_from") is None
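The normalization these tests expect (a single ID becomes a one-element list, empty values become `None`) might look like the sketch below. The helper name `normalize_context_from` is hypothetical; the real logic lives somewhere inside `create_job` and may differ.

```python
from typing import List, Optional, Union


def normalize_context_from(
    value: Union[None, str, List[str]],
) -> Optional[List[str]]:
    """Coerce context_from to a non-empty list of job IDs, or None."""
    if value is None:
        return None
    if isinstance(value, str):
        value = [value]
    # Drop blank entries so "" and [] both collapse to None.
    cleaned = [str(v).strip() for v in value if str(v).strip()]
    return cleaned or None
```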
class TestBuildJobPromptContextFrom:
"""Test that _build_job_prompt() injects context from referenced jobs."""
def test_injects_latest_output(self, cron_env):
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
job_a = create_job(prompt="Find news", schedule="every 1h")
# Write an output file for job_a
output_dir = OUTPUT_DIR / job_a["id"]
output_dir.mkdir(parents=True, exist_ok=True)
(output_dir / "2026-04-22_10-00-00.md").write_text(
"Today's top story: AI is everywhere.", encoding="utf-8"
)
job_b = create_job(
prompt="Summarize the news",
schedule="every 2h",
context_from=job_a["id"],
)
prompt = _build_job_prompt(job_b)
assert "Today's top story: AI is everywhere." in prompt
assert f"Output from job '{job_a['id']}'" in prompt
def test_uses_most_recent_output(self, cron_env):
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
import time
job_a = create_job(prompt="Find news", schedule="every 1h")
output_dir = OUTPUT_DIR / job_a["id"]
output_dir.mkdir(parents=True, exist_ok=True)
old_file = output_dir / "2026-04-22_08-00-00.md"
old_file.write_text("Old output", encoding="utf-8")
time.sleep(0.01)
new_file = output_dir / "2026-04-22_10-00-00.md"
new_file.write_text("New output", encoding="utf-8")
job_b = create_job(
prompt="Summarize", schedule="every 2h", context_from=job_a["id"]
)
prompt = _build_job_prompt(job_b)
assert "New output" in prompt
assert "Old output" not in prompt
def test_graceful_when_no_output_yet(self, cron_env):
from cron.jobs import create_job
from cron.scheduler import _build_job_prompt
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(
prompt="Summarize", schedule="every 2h", context_from=job_a["id"]
)
# job_a never ran — output dir does not exist
# expect silent skip: no placeholder injected, base prompt intact
prompt = _build_job_prompt(job_b)
assert "no output" not in prompt.lower()
assert "not found" not in prompt.lower()
assert "Summarize" in prompt
def test_injects_multiple_context_jobs(self, cron_env):
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(prompt="Find weather", schedule="every 1h")
for job, content in [(job_a, "News: AI boom"), (job_b, "Weather: Sunny")]:
out_dir = OUTPUT_DIR / job["id"]
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "2026-04-22_10-00-00.md").write_text(content, encoding="utf-8")
job_c = create_job(
prompt="Daily briefing",
schedule="every 2h",
context_from=[job_a["id"], job_b["id"]],
)
prompt = _build_job_prompt(job_c)
assert "News: AI boom" in prompt
assert "Weather: Sunny" in prompt
def test_context_injected_before_prompt(self, cron_env):
"""Context should appear before the job's own prompt."""
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
job_a = create_job(prompt="Find data", schedule="every 1h")
out_dir = OUTPUT_DIR / job_a["id"]
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "2026-04-22_10-00-00.md").write_text("Context data", encoding="utf-8")
job_b = create_job(
prompt="Process the data above",
schedule="every 2h",
context_from=job_a["id"],
)
prompt = _build_job_prompt(job_b)
context_pos = prompt.find("Context data")
prompt_pos = prompt.find("Process the data above")
assert context_pos < prompt_pos
def test_output_truncated_at_8k_chars(self, cron_env):
"""Output longer than 8000 chars should be truncated."""
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
job_a = create_job(prompt="Find data", schedule="every 1h")
out_dir = OUTPUT_DIR / job_a["id"]
out_dir.mkdir(parents=True, exist_ok=True)
big_output = "x" * 10000
(out_dir / "2026-04-22_10-00-00.md").write_text(big_output, encoding="utf-8")
job_b = create_job(
prompt="Process", schedule="every 2h", context_from=job_a["id"]
)
prompt = _build_job_prompt(job_b)
assert "truncated" in prompt
assert "x" * 10000 not in prompt
def test_graceful_when_file_deleted_between_listing_and_reading(self, cron_env):
"""Job should not crash if output file is deleted mid-read."""
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
from unittest.mock import patch
job_a = create_job(prompt="Find data", schedule="every 1h")
out_dir = OUTPUT_DIR / job_a["id"]
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "2026-04-22_10-00-00.md").write_text("Some output", encoding="utf-8")
job_b = create_job(
prompt="Process", schedule="every 2h", context_from=job_a["id"]
)
# Simulate file deleted between glob() and read_text()
original_read = Path.read_text
def mock_read_text(self, *args, **kwargs):
if self.suffix == ".md":
raise FileNotFoundError("file deleted mid-read")
return original_read(self, *args, **kwargs)
with patch.object(Path, "read_text", mock_read_text):
prompt = _build_job_prompt(job_b)
# Job should not crash, prompt should still contain the base prompt
assert "Process" in prompt
def test_graceful_when_permission_error(self, cron_env):
"""Job should not crash if output directory is not readable."""
from cron.jobs import create_job, OUTPUT_DIR
from cron.scheduler import _build_job_prompt
from unittest.mock import patch
job_a = create_job(prompt="Find data", schedule="every 1h")
out_dir = OUTPUT_DIR / job_a["id"]
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "2026-04-22_10-00-00.md").write_text("Some output", encoding="utf-8")
job_b = create_job(
prompt="Process", schedule="every 2h", context_from=job_a["id"]
)
# Simulate permission error on read
original_read = Path.read_text
def mock_read_text(self, *args, **kwargs):
if self.suffix == ".md":
raise PermissionError("permission denied")
return original_read(self, *args, **kwargs)
with patch.object(Path, "read_text", mock_read_text):
prompt = _build_job_prompt(job_b)
# Job should not crash, prompt should still contain the base prompt
assert "Process" in prompt
def test_invalid_job_id_skipped(self, cron_env):
"""context_from with path traversal job_id should be skipped."""
from cron.jobs import create_job
from cron.scheduler import _build_job_prompt
job = create_job(prompt="Process", schedule="every 2h")
# Manually inject invalid context_from (simulating tampered jobs.json)
job["context_from"] = ["../../../etc/passwd"]
prompt = _build_job_prompt(job)
# Should not crash and should not inject anything malicious
assert "Process" in prompt
assert "etc/passwd" not in prompt
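Taken together, the tests in this class describe a prompt builder that reads the newest output per referenced job, truncates it, and skips anything unreadable or suspicious. The sketch below is one way to satisfy them under stated assumptions: `build_prompt`, `latest_output`, the job-ID regex, the truncation marker text, and picking the newest file by filename are all illustrative choices, not the actual `_build_job_prompt` implementation.

```python
import re
from pathlib import Path
from typing import List, Optional

MAX_CONTEXT_CHARS = 8000  # the 8K limit named in the truncation test
_SAFE_JOB_ID = re.compile(r"^[A-Za-z0-9_-]+$")  # assumed ID format


def latest_output(output_root: Path, job_id: str) -> Optional[str]:
    """Return the newest output for job_id, or None if unavailable."""
    if not _SAFE_JOB_ID.match(job_id):
        return None  # rejects path-traversal IDs such as "../../etc/passwd"
    job_dir = output_root / job_id
    if not job_dir.is_dir():
        return None  # job never ran: skip silently
    try:
        # Timestamped filenames sort chronologically (assumption).
        files = sorted(job_dir.glob("*.md"), key=lambda p: p.name)
        if not files:
            return None
        text = files[-1].read_text(encoding="utf-8")
    except OSError:
        return None  # covers FileNotFoundError and PermissionError
    if len(text) > MAX_CONTEXT_CHARS:
        text = text[:MAX_CONTEXT_CHARS] + "\n[... output truncated ...]"
    return text


def build_prompt(base_prompt: str, context_from: Optional[List[str]],
                 output_root: Path) -> str:
    """Place each referenced job's output before the job's own prompt."""
    sections = []
    for job_id in context_from or []:
        text = latest_output(output_root, job_id)
        if text is not None:
            sections.append(f"Output from job '{job_id}':\n{text}")
    sections.append(base_prompt)
    return "\n\n".join(sections)
```

Catching `OSError` rather than individual subclasses keeps the deleted-file and permission-denied cases on the same code path.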
class TestUpdateContextFrom:
"""Verify the cronjob tool's `update` action wires context_from through.
Without this, the create-path stores the field but users can never modify
or clear it via the tool (schema promises "pass an empty array to clear").
"""
def test_update_adds_context_from_to_existing_job(self, cron_env):
from cron.jobs import create_job, get_job
from tools.cronjob_tools import cronjob
import json
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(prompt="Summarize", schedule="every 2h")
assert job_b.get("context_from") is None
result = json.loads(cronjob(
action="update",
job_id=job_b["id"],
context_from=job_a["id"],
))
assert result["success"] is True
reloaded = get_job(job_b["id"])
assert reloaded["context_from"] == [job_a["id"]]
def test_update_changes_context_from_reference(self, cron_env):
from cron.jobs import create_job, get_job
from tools.cronjob_tools import cronjob
import json
job_a = create_job(prompt="Find news", schedule="every 1h")
job_a2 = create_job(prompt="Find weather", schedule="every 1h")
job_b = create_job(
prompt="Summarize", schedule="every 2h", context_from=job_a["id"],
)
assert job_b["context_from"] == [job_a["id"]]
result = json.loads(cronjob(
action="update",
job_id=job_b["id"],
context_from=[job_a2["id"]],
))
assert result["success"] is True
assert get_job(job_b["id"])["context_from"] == [job_a2["id"]]
def test_update_clears_context_from_with_empty_list(self, cron_env):
from cron.jobs import create_job, get_job
from tools.cronjob_tools import cronjob
import json
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(
prompt="Summarize", schedule="every 2h", context_from=job_a["id"],
)
assert get_job(job_b["id"])["context_from"] == [job_a["id"]]
result = json.loads(cronjob(
action="update",
job_id=job_b["id"],
context_from=[],
))
assert result["success"] is True
assert get_job(job_b["id"])["context_from"] is None
def test_update_clears_context_from_with_empty_string(self, cron_env):
from cron.jobs import create_job, get_job
from tools.cronjob_tools import cronjob
import json
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(
prompt="Summarize", schedule="every 2h", context_from=job_a["id"],
)
result = json.loads(cronjob(
action="update",
job_id=job_b["id"],
context_from="",
))
assert result["success"] is True
assert get_job(job_b["id"])["context_from"] is None
def test_update_rejects_unknown_job_reference(self, cron_env):
from cron.jobs import create_job
from tools.cronjob_tools import cronjob
import json
job_b = create_job(prompt="Summarize", schedule="every 2h")
result = json.loads(cronjob(
action="update",
job_id=job_b["id"],
context_from=["deadbeef0000"],
))
assert result["success"] is False
assert "not found" in result["error"]
def test_update_preserves_context_from_when_not_passed(self, cron_env):
"""Updating other fields must not clobber context_from."""
from cron.jobs import create_job, get_job
from tools.cronjob_tools import cronjob
import json
job_a = create_job(prompt="Find news", schedule="every 1h")
job_b = create_job(
prompt="Summarize", schedule="every 2h", context_from=job_a["id"],
)
# Update an unrelated field
result = json.loads(cronjob(
action="update",
job_id=job_b["id"],
prompt="Summarize v2",
))
assert result["success"] is True
reloaded = get_job(job_b["id"])
assert reloaded["prompt"] == "Summarize v2"
assert reloaded["context_from"] == [job_a["id"]]
+186
@@ -601,3 +601,189 @@ class TestImagegenModelPicker:
_configure_imagegen_model("fal", config)
assert isinstance(config["image_gen"], dict)
assert config["image_gen"]["model"] == "fal-ai/flux-2/klein/9b"
def test_save_platform_tools_normalizes_numeric_entries():
"""YAML may parse bare numeric toolset names as int. They should be
normalized to str so they survive the save round-trip.
"""
config = {
"platform_toolsets": {
"cli": ["web", "terminal", 12306, "custom-mcp"]
}
}
with patch("hermes_cli.tools_config.save_config"):
_save_platform_tools(config, "cli", {"web", "browser"})
saved = config["platform_toolsets"]["cli"]
assert "12306" in saved
assert 12306 not in saved
def test_save_platform_tools_clears_no_mcp_sentinel():
"""`hermes tools` has no UI for no_mcp, so saving from the picker clears
the sentinel unconditionally; otherwise a user who once set no_mcp by
hand could never re-enable MCP servers through the UI.
"""
config = {
"platform_toolsets": {
"cli": ["web", "terminal", "no_mcp"]
}
}
with patch("hermes_cli.tools_config.save_config"):
_save_platform_tools(config, "cli", {"web", "browser"})
saved = config["platform_toolsets"]["cli"]
assert "no_mcp" not in saved
def test_save_platform_tools_preserves_mcp_server_names():
"""Non-sentinel passthrough entries (MCP server names) must still survive
the save; we only clear `no_mcp`, not every non-configurable entry.
"""
config = {
"platform_toolsets": {
"cli": ["web", "terminal", "custom-mcp", "another-mcp"]
}
}
with patch("hermes_cli.tools_config.save_config"):
_save_platform_tools(config, "cli", {"web", "browser"})
saved = config["platform_toolsets"]["cli"]
assert "custom-mcp" in saved
assert "another-mcp" in saved
def test_get_platform_tools_recovers_non_configurable_toolsets_from_composite():
"""Non-configurable toolsets whose tools are in the composite but not in
CONFIGURABLE_TOOLSETS should still appear in the result.
"""
from toolsets import TOOLSETS
from hermes_cli.tools_config import PLATFORMS
from unittest.mock import patch as mock_patch
fake_toolsets = dict(TOOLSETS)
fake_toolsets["_test_platform_tool"] = {
"description": "test",
"tools": ["_test_special_tool"],
"includes": [],
}
fake_toolsets["hermes-_test_platform"] = {
"description": "test composite",
"tools": ["web_search", "web_extract", "terminal", "process", "_test_special_tool"],
"includes": [],
}
test_platforms = {
"_test_platform": {"label": "Test", "default_toolset": "hermes-_test_platform"},
}
with mock_patch("hermes_cli.tools_config.PLATFORMS", {**PLATFORMS, **test_platforms}):
with mock_patch("toolsets.TOOLSETS", fake_toolsets):
enabled = _get_platform_tools({}, "_test_platform")
assert "_test_platform_tool" in enabled
assert "web" in enabled
assert "terminal" in enabled
def test_get_platform_tools_second_pass_skips_fully_claimed_toolsets():
"""Toolsets whose tools are fully covered by configurable keys should NOT
be added by the second pass (prevents 'search', 'hermes-acp' noise).
"""
enabled = _get_platform_tools({}, "cli")
assert "search" not in enabled
def test_get_platform_tools_discord_both_off_by_default():
"""Both `discord` and `discord_admin` are opt-in via `hermes tools`,
even on the Discord platform itself. Users shouldn't auto-inherit 19
extra tools just because DISCORD_BOT_TOKEN is set."""
enabled = _get_platform_tools({}, "discord")
assert "discord" not in enabled
assert "discord_admin" not in enabled
def test_discord_toolsets_in_configurable_toolsets():
keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
assert "discord" in keys
assert "discord_admin" in keys
def test_discord_toolsets_in_default_off():
assert "discord" in _DEFAULT_OFF_TOOLSETS
assert "discord_admin" in _DEFAULT_OFF_TOOLSETS
def test_discord_toolsets_not_available_on_other_platforms():
"""Platform-scoping: discord / discord_admin should not appear on CLI,
Telegram, etc., not even as an opt-in."""
from hermes_cli.tools_config import _toolset_allowed_for_platform
for plat in ["cli", "telegram", "slack", "whatsapp", "signal"]:
assert not _toolset_allowed_for_platform("discord", plat), (
f"`discord` toolset leaked onto {plat}"
)
assert not _toolset_allowed_for_platform("discord_admin", plat), (
f"`discord_admin` toolset leaked onto {plat}"
)
assert _toolset_allowed_for_platform("discord", "discord")
assert _toolset_allowed_for_platform("discord_admin", "discord")
def test_discord_toolsets_user_enabled_are_honored():
"""When the user opts in via `hermes tools`, the toolset appears."""
config = {"platform_toolsets": {"discord": ["web", "terminal", "discord"]}}
enabled = _get_platform_tools(config, "discord")
assert "discord" in enabled
assert "discord_admin" not in enabled
def test_save_platform_tools_strips_restricted_toolsets():
"""Hand-edited or all-platforms checklist with `discord` selected for
Telegram must be stripped at save time."""
from hermes_cli.tools_config import _save_platform_tools
config = {}
_save_platform_tools(config, "telegram", {"web", "terminal", "discord", "discord_admin"})
saved = config["platform_toolsets"]["telegram"]
assert "discord" not in saved
assert "discord_admin" not in saved
assert "web" in saved
assert "terminal" in saved
def test_get_platform_tools_feishu_includes_doc_and_drive():
enabled = _get_platform_tools({}, "feishu")
assert "feishu_doc" in enabled
assert "feishu_drive" in enabled
def test_get_platform_tools_feishu_tools_not_on_other_platforms():
for plat in ["cli", "telegram", "discord"]:
enabled = _get_platform_tools({}, plat)
assert "feishu_doc" not in enabled, f"feishu_doc leaked onto {plat}"
assert "feishu_drive" not in enabled, f"feishu_drive leaked onto {plat}"
def test_get_effective_configurable_toolsets_dedupes_bundled_plugins():
"""Bundled plugins (plugins/spotify) share their toolset key with the
built-in CONFIGURABLE_TOOLSETS entry. The effective list must not list
them twice; otherwise `hermes tools` "reconfigure existing" shows
the same toolset in two consecutive rows.
"""
from hermes_cli.tools_config import _get_effective_configurable_toolsets
all_ts = _get_effective_configurable_toolsets()
keys = [ts_key for ts_key, _, _ in all_ts]
assert len(keys) == len(set(keys)), (
f"duplicate toolset keys in effective list: "
f"{[k for k in keys if keys.count(k) > 1]}"
)
# Spotify specifically — the bug that motivated the dedupe.
spotify_rows = [t for t in all_ts if t[0] == "spotify"]
assert len(spotify_rows) == 1, spotify_rows
# Built-in label wins over the plugin label.
assert spotify_rows[0][1] == "🎵 Spotify"
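The save-time rules exercised by the `_save_platform_tools` tests (stringify numeric entries, drop the `no_mcp` sentinel, keep MCP server names, strip platform-restricted toolsets) can be summarised in one small function. Everything here is a hypothetical stand-in: the constant contents and the name `merge_platform_toolsets` are assumptions for illustration, and the real function also mutates the config and writes it to disk.

```python
from typing import Iterable, List

# Illustrative stand-ins for the real tables in hermes_cli.tools_config.
CONFIGURABLE = {"web", "browser", "terminal", "discord", "discord_admin"}
PLATFORM_RESTRICTED = {"discord": {"discord"}, "discord_admin": {"discord"}}
NO_MCP = "no_mcp"


def merge_platform_toolsets(existing: Iterable, selected: Iterable[str],
                            platform: str) -> List[str]:
    """Combine the picker selection with passthrough entries."""
    # YAML may have parsed bare numeric names as int; normalize to str.
    existing_str = [str(e) for e in existing]
    # Passthrough: anything the picker cannot toggle, except the sentinel.
    passthrough = [e for e in existing_str
                   if e not in CONFIGURABLE and e != NO_MCP]
    # Drop selections that are restricted to some other platform.
    allowed = {ts for ts in selected
               if platform in PLATFORM_RESTRICTED.get(ts, {platform})}
    return sorted(allowed) + passthrough
```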
+39 -31
@@ -1678,6 +1678,45 @@ class TestDashboardPluginManifestExtensions:
entry = next(p for p in plugins if p["name"] == "mixed-slots")
assert entry["slots"] == ["sidebar", "header-right"]
def test_page_scoped_slots_preserved(self, tmp_path, monkeypatch):
"""Page-scoped slot names (e.g. ``sessions:top``) round-trip through
the manifest loader untouched. The backend has no allowlist (the
frontend ``<PluginSlot name="...">`` placements decide what actually
renders), but the loader must not mangle colons in slot names."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
self._write_plugin(tmp_path, "page-slots", {
"name": "page-slots",
"label": "Page Slots",
"tab": {"path": "/page-slots", "hidden": True},
"slots": [
"sessions:top",
"analytics:bottom",
"logs:top",
"skills:bottom",
"config:top",
"env:bottom",
"docs:top",
"cron:bottom",
"chat:top",
],
"entry": "dist/index.js",
})
from hermes_cli import web_server
web_server._dashboard_plugins_cache = None
plugins = web_server._get_dashboard_plugins(force_rescan=True)
entry = next(p for p in plugins if p["name"] == "page-slots")
assert entry["slots"] == [
"sessions:top",
"analytics:bottom",
"logs:top",
"skills:bottom",
"config:top",
"env:bottom",
"docs:top",
"cron:bottom",
"chat:top",
]
# ---------------------------------------------------------------------------
# /api/pty WebSocket — terminal bridge for the dashboard "Chat" tab.
@@ -1925,34 +1964,3 @@ class TestPtyWebSocket:
):
pass
assert exc.value.code == 4400
class TestEnvVarUpdateValidation:
"""PUT /api/env must reject empty values to prevent .env key destruction."""
def test_rejects_empty_value(self):
from hermes_cli.web_server import EnvVarUpdate
import pydantic
with pytest.raises(pydantic.ValidationError):
EnvVarUpdate(key="SOME_KEY", value="")
def test_rejects_whitespace_only_value(self):
from hermes_cli.web_server import EnvVarUpdate
import pydantic
with pytest.raises(pydantic.ValidationError):
EnvVarUpdate(key="SOME_KEY", value=" ")
def test_accepts_nonempty_value(self):
from hermes_cli.web_server import EnvVarUpdate
update = EnvVarUpdate(key="SOME_KEY", value="sk-abc123")
assert update.value == "sk-abc123"
def test_rejects_empty_key(self):
from hermes_cli.web_server import EnvVarUpdate
import pydantic
with pytest.raises(pydantic.ValidationError):
EnvVarUpdate(key="", value="some-value")
+123 -30
@@ -41,6 +41,9 @@ def _make_agent(
agent.tool_progress_callback = None
agent._compression_warning = None
agent._aux_compression_context_length_config = None
# Tools feed into the headroom calculation in _check_compression_model_feasibility.
# Tests that want to assert specific threshold values can override this.
agent.tools = []
compressor = MagicMock(spec=ContextCompressor)
compressor.context_length = main_context
@@ -82,8 +85,9 @@ def test_auto_corrects_threshold_when_aux_context_below_threshold(mock_get_clien
assert "threshold:" in messages[0]
# Warning stored for gateway replay
assert agent._compression_warning is not None
# Threshold on the live compressor was actually lowered
assert agent.context_compressor.threshold_tokens == 80_000
# Threshold on the live compressor was actually lowered, accounting for
# the request-overhead headroom (empty tools list → ~12K headroom only).
assert agent.context_compressor.threshold_tokens == 68_000
@patch("agent.model_metadata.get_model_context_length", return_value=32_768)
@@ -147,15 +151,14 @@ def test_feasibility_check_passes_live_main_runtime():
agent._emit_status = lambda msg: None
agent._check_compression_model_feasibility()
mock_get_client.assert_called_once_with(
"compression",
main_runtime={
"model": "gpt-5.4",
"provider": "openai-codex",
# Called for both compression + flush_memories; verify compression call present
assert any(
c == (("compression",), {"main_runtime": {
"model": "gpt-5.4", "provider": "openai-codex",
"base_url": "https://chatgpt.com/backend-api/codex",
"api_key": "codex-token",
"api_mode": "codex_responses",
},
"api_key": "codex-token", "api_mode": "codex_responses",
}})
for c in mock_get_client.call_args_list
)
@@ -175,11 +178,12 @@ def test_feasibility_check_passes_config_context_length(mock_get_client, mock_ct
agent._emit_status = lambda msg: None
agent._check_compression_model_feasibility()
mock_ctx_len.assert_called_once_with(
"custom/big-model",
base_url="http://custom-endpoint:8080/v1",
api_key="sk-custom",
config_context_length=1_000_000,
# First call is the compression model
assert mock_ctx_len.call_args_list[0] == (
("custom/big-model",),
{"base_url": "http://custom-endpoint:8080/v1",
"api_key": "sk-custom", "config_context_length": 1_000_000,
"provider": "openrouter"},
)
@@ -197,11 +201,11 @@ def test_feasibility_check_ignores_invalid_context_length(mock_get_client, mock_
agent._emit_status = lambda msg: None
agent._check_compression_model_feasibility()
mock_ctx_len.assert_called_once_with(
"custom/model",
base_url="http://custom:8080/v1",
api_key="sk-test",
config_context_length=None,
assert mock_ctx_len.call_args_list[0] == (
("custom/model",),
{"base_url": "http://custom:8080/v1",
"api_key": "sk-test", "config_context_length": None,
"provider": "openrouter"},
)
@@ -249,12 +253,10 @@ def test_init_feasibility_check_uses_aux_context_override_from_config():
)
assert agent._aux_compression_context_length_config == 1_000_000
mock_ctx_len.assert_called_once_with(
"custom/big-model",
base_url="http://custom-endpoint:8080/v1",
api_key="sk-custom",
config_context_length=1_000_000,
)
c0 = mock_ctx_len.call_args_list[0]
assert c0.args == ("custom/big-model",)
assert c0.kwargs["base_url"] == "http://custom-endpoint:8080/v1"
assert c0.kwargs["config_context_length"] == 1_000_000
@patch("agent.auxiliary_client.get_text_auxiliary_client")
@@ -304,8 +306,10 @@ def test_exception_does_not_crash(mock_get_client):
@patch("agent.model_metadata.get_model_context_length", return_value=100_000)
@patch("agent.auxiliary_client.get_text_auxiliary_client")
def test_exact_threshold_boundary_no_warning(mock_get_client, mock_ctx_len):
"""No warning when aux context exactly equals the threshold."""
def test_exact_threshold_boundary_triggers_headroom_correction(mock_get_client, mock_ctx_len):
"""When aux context exactly equals the threshold, headroom deduction
still fires: flush_memories adds system prompt + tool schema on top
of the conversation messages, so threshold must be lowered."""
agent = _make_agent(main_context=200_000, threshold_percent=0.50)
mock_client = MagicMock()
mock_client.base_url = "https://openrouter.ai/api/v1"
@@ -317,7 +321,10 @@ def test_exact_threshold_boundary_no_warning(mock_get_client, mock_ctx_len):
agent._check_compression_model_feasibility()
assert len(messages) == 0
# 100K - headroom < 100K → auto-corrects
assert len(messages) == 1
assert "Auto-lowered" in messages[0]
assert agent.context_compressor.threshold_tokens < 100_000
@patch("agent.model_metadata.get_model_context_length", return_value=99_999)
@@ -339,7 +346,93 @@ def test_just_below_threshold_auto_corrects(mock_get_client, mock_ctx_len):
assert len(messages) == 1
assert "small-model" in messages[0]
assert "Auto-lowered" in messages[0]
assert agent.context_compressor.threshold_tokens == 99_999
assert agent.context_compressor.threshold_tokens == 87_999
# ── Headroom for system prompt + tool schemas ────────────────────────
@patch("agent.model_metadata.get_model_context_length", return_value=128_000)
@patch("agent.auxiliary_client.get_text_auxiliary_client")
def test_auto_lowered_threshold_reserves_headroom_for_tools_and_system(mock_get_client, mock_ctx_len):
"""When aux context binds the threshold, new_threshold must leave room
for the system prompt and tool schemas that auxiliary callers
(compression summariser, flush_memories) prepend to the message list.
Without headroom, a full-budget message window + ~25K system/tool
overhead overflows the aux model with HTTP 400. Regression guard for
the flush_memories-on-busy-toolset overflow path.
"""
# Main context 200K, threshold 70% = 140K. Aux pins at 128K (below
# threshold → triggers auto-correct).
agent = _make_agent(main_context=200_000, threshold_percent=0.70)
# Build a realistic tool schema load.
agent.tools = [
{
"type": "function",
"function": {
"name": f"tool_{i}",
"description": "x" * 200,
"parameters": {"type": "object", "properties": {"arg": {"type": "string", "description": "y" * 120}}},
},
}
for i in range(50)
]
mock_client = MagicMock()
mock_client.base_url = "https://openrouter.ai/api/v1"
mock_client.api_key = "sk-aux"
mock_get_client.return_value = (mock_client, "model-with-128k")
agent._emit_status = lambda msg: None
agent._check_compression_model_feasibility()
new_threshold = agent.context_compressor.threshold_tokens
# Must have strictly reserved headroom: new_threshold < aux_context.
assert new_threshold < 128_000, (
f"threshold {new_threshold} did not reserve headroom below aux=128,000 "
f"— system prompt + tools would overflow the aux model"
)
# Must respect the 64K hard floor.
from agent.model_metadata import MINIMUM_CONTEXT_LENGTH
assert new_threshold >= MINIMUM_CONTEXT_LENGTH
@patch("agent.model_metadata.get_model_context_length", return_value=80_000)
@patch("agent.auxiliary_client.get_text_auxiliary_client")
def test_headroom_floors_at_minimum_context(mock_get_client, mock_ctx_len):
"""If headroom subtraction would push below 64K floor, clamp to 64K
rather than refusing the session; the aux model is still workable for a
smaller message window.
"""
# Aux at 80K, with enough tools to push headroom > 16K → naive subtract
# would land at < 64K. The max(..., MINIMUM_CONTEXT_LENGTH) clamp must
# keep the session running.
agent = _make_agent(main_context=200_000, threshold_percent=0.50)
agent.tools = [
{
"type": "function",
"function": {
"name": f"tool_{i}",
"description": "z" * 2_000, # fat descriptions
"parameters": {},
},
}
for i in range(30)
]
mock_client = MagicMock()
mock_client.base_url = "https://openrouter.ai/api/v1"
mock_client.api_key = "sk-aux"
mock_get_client.return_value = (mock_client, "small-aux-model")
agent._emit_status = lambda msg: None
agent._check_compression_model_feasibility()
from agent.model_metadata import MINIMUM_CONTEXT_LENGTH
assert agent.context_compressor.threshold_tokens == MINIMUM_CONTEXT_LENGTH
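The two headroom tests above pin down a small piece of arithmetic. A minimal sketch of it follows, assuming the headroom is already known as a token count. The function name `lowered_threshold` and the headroom values are hypothetical; only the 64K floor and the `max(..., MINIMUM_CONTEXT_LENGTH)` clamp come from the tests themselves.

```python
# Hypothetical stand-in for agent.model_metadata.MINIMUM_CONTEXT_LENGTH
# (the tests describe it as a 64K hard floor).
MINIMUM_CONTEXT_LENGTH = 64_000


def lowered_threshold(threshold: int, aux_context: int, headroom: int) -> int:
    """Return the compression threshold after reserving headroom.

    headroom is the estimated token cost of the system prompt and tool
    schemas that auxiliary callers prepend to the message list.
    """
    effective_limit = aux_context - headroom
    if effective_limit >= threshold:
        # The aux model can take a full-threshold window plus overhead.
        return threshold
    # Otherwise lower the threshold, but never below the hard floor.
    return max(effective_limit, MINIMUM_CONTEXT_LENGTH)
```

With a 128K aux model and an assumed 12K of overhead, a 140K threshold drops to 116K; with an 80K aux model and 20K of overhead, the naive result of 60K is clamped to 64K.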
# ── Two-phase: __init__ + run_conversation replay ───────────────────
@@ -327,3 +327,72 @@ class TestFlushMemoriesCodexFallback:
mock_stream.assert_called_once()
mock_memory.assert_called_once()
assert mock_memory.call_args.kwargs["content"] == "Codex flush test"
@pytest.mark.parametrize(
"provider,base_url",
[
# chatgpt.com/backend-api/codex — rejects temperature unconditionally
("openai-codex", "https://chatgpt.com/backend-api/codex"),
# Native OpenAI Responses — rejects temperature on gpt-5/o-series reasoning models
("openai", "https://api.openai.com/v1"),
# Copilot Responses — rejects temperature on reasoning models
("copilot", "https://api.githubcopilot.com"),
],
)
def test_codex_fallback_never_sends_temperature(self, monkeypatch, provider, base_url):
"""Regression for the ``⚠ Auxiliary memory flush failed: HTTP 400:
Unsupported parameter: temperature`` error.
The codex_responses fallback must strip temperature before calling
_run_codex_stream; the Responses API does not accept it on any
supported backend, matching the transport's behavior."""
agent = _make_agent(monkeypatch, api_mode="codex_responses", provider=provider)
agent.base_url = base_url
codex_response = SimpleNamespace(
output=[
SimpleNamespace(
type="function_call",
call_id="call_1",
name="memory",
arguments=json.dumps({
"action": "add",
"target": "notes",
"content": "no-temp test",
}),
),
],
usage=SimpleNamespace(input_tokens=50, output_tokens=10, total_tokens=60),
status="completed",
model="gpt-5.5",
)
with patch("agent.auxiliary_client.call_llm", side_effect=RuntimeError("no provider")), \
patch.object(agent, "_run_codex_stream", return_value=codex_response) as mock_stream, \
patch.object(agent, "_build_api_kwargs") as mock_build, \
patch("tools.memory_tool.memory_tool", return_value="Saved."):
# Simulate a transport that (correctly) never includes temperature,
# but also verify we strip any stray temperature the fallback used
# to inject before the fix.
mock_build.return_value = {
"model": "gpt-5.5",
"instructions": "test",
"input": [],
"tools": [],
"max_output_tokens": 4096,
# Intentionally poison the dict to prove we pop it:
"temperature": 0.3,
}
messages = [
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi"},
{"role": "user", "content": "Save this"},
]
agent.flush_memories(messages)
mock_stream.assert_called_once()
sent_kwargs = mock_stream.call_args.args[0]
assert "temperature" not in sent_kwargs, (
f"codex_responses fallback must strip temperature before calling "
f"_run_codex_stream, got: {sent_kwargs.get('temperature')!r}"
)
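The behaviour this regression test asserts can be sketched in isolation as follows. This is an illustration only: `strip_unsupported_params` is a hypothetical helper, not a function from the codebase, and the real fallback may pop the key in place instead of copying.

```python
def strip_unsupported_params(api_kwargs: dict, unsupported=("temperature",)) -> dict:
    """Return a copy of api_kwargs without parameters the backend rejects."""
    cleaned = dict(api_kwargs)  # shallow copy so the caller's dict is untouched
    for key in unsupported:
        cleaned.pop(key, None)  # pop with a default never raises KeyError
    return cleaned


poisoned = {
    "model": "gpt-5.5",
    "input": [],
    "max_output_tokens": 4096,
    "temperature": 0.3,  # the stray value the test injects on purpose
}
sent_kwargs = strip_unsupported_params(poisoned)
```

Popping with a default keeps the helper safe for transports that never set `temperature` in the first place, which is the other case the test simulates.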
@@ -0,0 +1,219 @@
"""Tests for flush_memories context-overflow prevention.
1. _check_compression_model_feasibility now also resolves the
flush_memories auxiliary model and uses min(compression, flush) as the
effective aux context.
2. Headroom is always deducted before comparing aux_context vs threshold
(not only when aux_context < threshold).
3. flush_memories() trims oversized conversations before the LLM call as
defence-in-depth for paths that bypass preflight compression.
"""
import sys
import types
from types import SimpleNamespace
from unittest.mock import patch, MagicMock
sys.modules.setdefault("fire", types.SimpleNamespace(Fire=lambda *a, **k: None))
sys.modules.setdefault("firecrawl", types.SimpleNamespace(Firecrawl=object))
sys.modules.setdefault("fal_client", types.SimpleNamespace())
import run_agent
# ── Helpers ──────────────────────────────────────────────────────────────
class _FakeOpenAI:
def __init__(self, **kw):
self.api_key = kw.get("api_key", "test")
self.base_url = kw.get("base_url", "http://test")
def close(self):
pass
def _make_agent(monkeypatch, **kw):
monkeypatch.setattr(run_agent, "get_tool_definitions", lambda **k: [
{"type": "function", "function": {
"name": "memory", "description": "m",
"parameters": {"type": "object", "properties": {
"action": {"type": "string"},
"target": {"type": "string"},
"content": {"type": "string"},
}},
}},
])
monkeypatch.setattr(run_agent, "check_toolset_requirements", lambda: {})
monkeypatch.setattr(run_agent, "OpenAI", _FakeOpenAI)
agent = run_agent.AIAgent(
api_key="test-key", base_url="https://test.example.com/v1",
provider=kw.get("provider", "openrouter"),
api_mode=kw.get("api_mode", "chat_completions"),
max_iterations=4, quiet_mode=True,
skip_context_files=True, skip_memory=True,
)
agent._memory_store = MagicMock()
agent._memory_flush_min_turns = 1
agent._user_turn_count = 5
return agent
def _make_msgs(n, chars=400):
return [{"role": "user" if i % 2 == 0 else "assistant",
"content": f"M{i}: " + "x" * max(0, chars - 6)}
for i in range(n)]
def _noop_response():
return SimpleNamespace(
choices=[SimpleNamespace(
finish_reason="stop",
message=SimpleNamespace(content="Nothing.", tool_calls=None),
)],
usage=SimpleNamespace(prompt_tokens=50, completion_tokens=10, total_tokens=60),
)
# ── Feasibility: flush model + always-deduct headroom ────────────────────
class TestFeasibilityFixes:
def test_smaller_flush_model_lowers_effective_context(self, monkeypatch):
"""flush_memories model with smaller context drives the threshold."""
agent = _make_agent(monkeypatch)
agent.context_compressor.context_length = 200_000
agent.context_compressor.threshold_tokens = 100_000
fc = SimpleNamespace(base_url="http://test", api_key="k")
def _aux(task, **kw):
if task == "compression":
return fc, "big-model"
return fc, "small-flush-model"
def _ctx(model, **kw):
return 200_000 if model == "big-model" else 80_000
with patch("agent.auxiliary_client.get_text_auxiliary_client", side_effect=_aux), \
patch("agent.model_metadata.get_model_context_length", side_effect=_ctx):
agent._check_compression_model_feasibility()
assert agent.context_compressor.threshold_tokens < 100_000
def test_same_model_overhead_still_triggers_correction(self, monkeypatch):
"""The primary bug: aux == main model, aux_context > threshold, but
threshold + overhead > aux_context. Headroom must fire even when
aux_context >= threshold."""
agent = _make_agent(monkeypatch)
agent.context_compressor.context_length = 128_000
agent.context_compressor.threshold_tokens = 120_000
fc = SimpleNamespace(base_url="http://test", api_key="k")
with patch("agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fc, "same-model")), \
patch("agent.model_metadata.get_model_context_length",
return_value=128_000):
agent._check_compression_model_feasibility()
# 128K - headroom (~12.1K) ≈ 115.9K < 120K → threshold lowered
assert agent.context_compressor.threshold_tokens < 120_000
def test_flush_resolution_failure_is_non_fatal(self, monkeypatch):
"""If flush model resolution raises, check proceeds with compression model."""
agent = _make_agent(monkeypatch)
agent.context_compressor.context_length = 200_000
agent.context_compressor.threshold_tokens = 100_000
fc = SimpleNamespace(base_url="http://test", api_key="k")
n = [0]
def _aux(task, **kw):
n[0] += 1
if task == "flush_memories":
raise RuntimeError("boom")
return fc, "model"
with patch("agent.auxiliary_client.get_text_auxiliary_client", side_effect=_aux), \
patch("agent.model_metadata.get_model_context_length", return_value=200_000):
agent._check_compression_model_feasibility()
assert n[0] == 2 # both tasks attempted
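The three feasibility cases above can be summarised in a short sketch, assuming the pre-fix check behaved as the commit message describes (headroom only considered when `aux_context < threshold`). All function names here are hypothetical, and `None` stands in for a flush model that failed to resolve.

```python
from typing import Optional


def effective_aux_context(compression_ctx: int, flush_ctx: Optional[int]) -> int:
    """Use the smaller of the two aux contexts; ignore an unresolved flush model."""
    if flush_ctx is None:  # flush resolution failed: non-fatal, fall back
        return compression_ctx
    return min(compression_ctx, flush_ctx)


def needs_correction_before_fix(threshold: int, aux_context: int) -> bool:
    # Pre-fix gate: never fires when aux and main are the same model,
    # because the threshold is a fraction of that same context.
    return aux_context < threshold


def needs_correction_after_fix(threshold: int, aux_context: int, headroom: int) -> bool:
    # Post-fix gate: headroom is deducted unconditionally before comparing.
    return aux_context - headroom < threshold
```

For the same-model case in the test (128K context, 120K threshold, roughly 12.1K of overhead), the old gate stays silent while the new one triggers a correction.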
# ── flush_memories trimming ──────────────────────────────────────────────
class TestFlushMemoriesTrimming:
def test_oversized_conversation_trimmed(self, monkeypatch):
agent = _make_agent(monkeypatch)
agent._cached_system_prompt = "System."
messages = _make_msgs(200, chars=500)
fc = SimpleNamespace(base_url="http://test", api_key="k")
with patch("agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fc, "small")), \
patch("agent.model_metadata.get_model_context_length",
return_value=8_000), \
patch("agent.auxiliary_client.call_llm",
return_value=_noop_response()) as mock:
agent.flush_memories(messages)
sent = mock.call_args.kwargs.get("messages", [])
assert len(sent) < 100
def test_small_conversation_untouched(self, monkeypatch):
agent = _make_agent(monkeypatch)
agent._cached_system_prompt = "System."
messages = [
{"role": "user", "content": "Hi"},
{"role": "assistant", "content": "Hey"},
{"role": "user", "content": "Save"},
]
fc = SimpleNamespace(base_url="http://test", api_key="k")
with patch("agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fc, "big")), \
patch("agent.model_metadata.get_model_context_length",
return_value=200_000), \
patch("agent.auxiliary_client.call_llm",
return_value=_noop_response()) as mock:
agent.flush_memories(messages)
sent = mock.call_args.kwargs.get("messages", [])
assert len(sent) == 5 # sys + 3 conv + flush
def test_trim_failure_does_not_block_flush(self, monkeypatch):
agent = _make_agent(monkeypatch)
messages = _make_msgs(10, chars=100)
with patch("agent.auxiliary_client.get_text_auxiliary_client",
side_effect=RuntimeError("no provider")), \
patch("agent.auxiliary_client.call_llm",
return_value=_noop_response()) as mock:
agent.flush_memories(messages)
assert mock.called
def test_sentinel_cleaned_after_trim(self, monkeypatch):
agent = _make_agent(monkeypatch)
messages = [
{"role": "user", "content": "Hi"},
{"role": "assistant", "content": "Hey"},
{"role": "user", "content": "Save"},
]
n = len(messages)
fc = SimpleNamespace(base_url="http://test", api_key="k")
with patch("agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fc, "m")), \
patch("agent.model_metadata.get_model_context_length",
return_value=128_000), \
patch("agent.auxiliary_client.call_llm",
return_value=_noop_response()):
agent.flush_memories(messages)
assert len(messages) == n
assert not any(m.get("_flush_sentinel") for m in messages)
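One way to picture the defence-in-depth trimming these tests exercise is a newest-first walk that stops once a token budget is spent. The sketch below is an assumption-laden illustration: `trim_to_budget`, the four-characters-per-token estimate, and the budget value are all hypothetical, and the actual `flush_memories()` trimming may differ.

```python
def trim_to_budget(messages, budget_tokens, chars_per_token=4):
    """Keep the most recent messages whose estimated cost fits the budget.

    Returns a new list; the caller's list is not mutated. At least one
    message is always kept so the flush has something to work with.
    """
    kept = []
    used = 0
    for msg in reversed(messages):  # walk newest to oldest
        cost = len(str(msg.get("content", ""))) // chars_per_token + 1
        if kept and used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


big = [{"role": "user" if i % 2 == 0 else "assistant", "content": "x" * 500}
       for i in range(200)]
trimmed = trim_to_budget(big, budget_tokens=2_000)
```

Returning a new list rather than editing in place matches the last test above, which checks that the caller's `messages` keeps its original length after the flush.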
+2 -2
@@ -200,8 +200,8 @@ class TestToolsetConsistency:
def test_hermes_platforms_share_core_tools(self):
"""All hermes-* platform toolsets share the same core tools.
Platform-specific additions (e.g. ``discord_server`` on
hermes-discord, gated on DISCORD_BOT_TOKEN) are allowed on top;
Platform-specific additions (e.g. ``discord`` / ``discord_admin``
on hermes-discord, gated on DISCORD_BOT_TOKEN) are allowed on top;
the invariant is that the core set is identical across platforms.
"""
platforms = ["hermes-cli", "hermes-telegram", "hermes-discord", "hermes-whatsapp", "hermes-slack", "hermes-signal", "hermes-homeassistant"]
+98
@@ -2128,5 +2128,103 @@ class TestOrchestratorEndToEnd(unittest.TestCase):
self.assertFalse(built_agents[2]["is_orchestrator_prompt"])
class TestSubagentApprovalCallback(unittest.TestCase):
"""Subagent worker threads must have a non-interactive approval callback
installed so dangerous-command prompts don't fall back to input() and
deadlock the parent's prompt_toolkit TUI.
Governed by delegation.subagent_auto_approve:
false (default) → _subagent_auto_deny
true → _subagent_auto_approve
"""
def test_auto_deny_returns_deny(self):
from tools.delegate_tool import _subagent_auto_deny
self.assertEqual(
_subagent_auto_deny("rm -rf /tmp/x", "dangerous"),
"deny",
)
def test_auto_approve_returns_once(self):
from tools.delegate_tool import _subagent_auto_approve
self.assertEqual(
_subagent_auto_approve("rm -rf /tmp/x", "dangerous"),
"once",
)
@patch("tools.delegate_tool._load_config", return_value={})
def test_getter_defaults_to_deny(self, _mock_cfg):
from tools.delegate_tool import (
_get_subagent_approval_callback,
_subagent_auto_deny,
)
self.assertIs(_get_subagent_approval_callback(), _subagent_auto_deny)
@patch(
"tools.delegate_tool._load_config",
return_value={"subagent_auto_approve": False},
)
def test_getter_explicit_false_is_deny(self, _mock_cfg):
from tools.delegate_tool import (
_get_subagent_approval_callback,
_subagent_auto_deny,
)
self.assertIs(_get_subagent_approval_callback(), _subagent_auto_deny)
@patch(
"tools.delegate_tool._load_config",
return_value={"subagent_auto_approve": True},
)
def test_getter_true_is_approve(self, _mock_cfg):
from tools.delegate_tool import (
_get_subagent_approval_callback,
_subagent_auto_approve,
)
self.assertIs(_get_subagent_approval_callback(), _subagent_auto_approve)
@patch(
"tools.delegate_tool._load_config",
return_value={"subagent_auto_approve": "yes"},
)
def test_getter_truthy_string_is_approve(self, _mock_cfg):
"""is_truthy_value accepts 'yes'/'1'/'true' as truthy."""
from tools.delegate_tool import (
_get_subagent_approval_callback,
_subagent_auto_approve,
)
self.assertIs(_get_subagent_approval_callback(), _subagent_auto_approve)
def test_executor_initializer_installs_callback_in_worker(self):
"""The initializer sets the callback on the worker thread's TLS,
not the parent's — verifies the fix actually scopes to workers.
"""
from concurrent.futures import ThreadPoolExecutor
from tools.terminal_tool import (
set_approval_callback as _set_cb,
_get_approval_callback,
)
from tools.delegate_tool import _subagent_auto_deny
# Parent thread has no callback.
_set_cb(None)
self.assertIsNone(_get_approval_callback())
seen = []
def worker():
seen.append(_get_approval_callback())
with ThreadPoolExecutor(
max_workers=1,
initializer=_set_cb,
initargs=(_subagent_auto_deny,),
) as executor:
executor.submit(worker).result()
self.assertEqual(seen, [_subagent_auto_deny])
# Parent's callback slot is still empty (TLS isolates threads).
self.assertIsNone(_get_approval_callback())
if __name__ == "__main__":
unittest.main()
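The initializer test above relies on thread-local storage keeping the worker's callback separate from the parent's. A self-contained sketch of that pattern follows, assuming the callback slot is a `threading.local` attribute. The names below are simplified stand-ins, not the real `tools.terminal_tool` or `tools.delegate_tool` functions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_tls = threading.local()  # one independent attribute namespace per thread


def set_approval_callback(callback):
    _tls.callback = callback


def get_approval_callback():
    return getattr(_tls, "callback", None)


def auto_deny(command, description):
    """Non-interactive callback: never prompts, always refuses."""
    return "deny"


def callback_seen_in_worker():
    # The initializer runs once in each worker thread before any task,
    # so the callback lands in the worker's slot, not the caller's.
    with ThreadPoolExecutor(
        max_workers=1,
        initializer=set_approval_callback,
        initargs=(auto_deny,),
    ) as executor:
        return executor.submit(get_approval_callback).result()
```

Because the slot is per-thread, the parent can keep its interactive prompt (or no callback at all) while every subagent worker gets a callback that cannot block on `input()`.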
+183 -97
@@ -11,6 +11,8 @@ import pytest
from tools.discord_tool import (
DiscordAPIError,
_ACTIONS,
_ADMIN_ACTIONS,
_CORE_ACTIONS,
_available_actions,
_build_schema,
_channel_type_name,
@@ -21,8 +23,11 @@ from tools.discord_tool import (
_load_allowed_actions_config,
_reset_capability_cache,
check_discord_tool_requirements,
discord_server,
discord_admin_handler,
discord_core,
get_dynamic_schema,
get_dynamic_schema_admin,
get_dynamic_schema_core,
)
@@ -147,32 +152,32 @@ class TestDiscordRequest:
class TestDiscordServerValidation:
def test_no_token(self, monkeypatch):
monkeypatch.delenv("DISCORD_BOT_TOKEN", raising=False)
result = json.loads(discord_server(action="list_guilds"))
result = json.loads(discord_admin_handler(action="list_guilds"))
assert "error" in result
assert "DISCORD_BOT_TOKEN" in result["error"]
def test_unknown_action(self, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
result = json.loads(discord_server(action="bad_action"))
result = json.loads(discord_core(action="bad_action"))
assert "error" in result
assert "Unknown action" in result["error"]
assert "available_actions" in result
def test_missing_required_guild_id(self, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
result = json.loads(discord_server(action="list_channels"))
result = json.loads(discord_admin_handler(action="list_channels"))
assert "error" in result
assert "guild_id" in result["error"]
def test_missing_required_channel_id(self, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
result = json.loads(discord_server(action="fetch_messages"))
result = json.loads(discord_core(action="fetch_messages"))
assert "error" in result
assert "channel_id" in result["error"]
def test_missing_multiple_params(self, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
result = json.loads(discord_server(action="add_role"))
result = json.loads(discord_admin_handler(action="add_role"))
assert "error" in result
assert "guild_id" in result["error"]
assert "user_id" in result["error"]
@@ -191,7 +196,7 @@ class TestListGuilds:
{"id": "111", "name": "Test Server", "icon": "abc", "owner": True, "permissions": "123"},
{"id": "222", "name": "Other Server", "icon": None, "owner": False, "permissions": "456"},
]
result = json.loads(discord_server(action="list_guilds"))
result = json.loads(discord_admin_handler(action="list_guilds"))
assert result["count"] == 2
assert result["guilds"][0]["name"] == "Test Server"
assert result["guilds"][1]["id"] == "222"
@@ -219,7 +224,7 @@ class TestServerInfo:
"premium_subscription_count": 5,
"verification_level": 1,
}
result = json.loads(discord_server(action="server_info", guild_id="111"))
result = json.loads(discord_admin_handler(action="server_info", guild_id="111"))
assert result["name"] == "My Server"
assert result["member_count"] == 42
assert result["online_count"] == 10
@@ -242,7 +247,7 @@ class TestListChannels:
{"id": "12", "name": "voice", "type": 2, "position": 1, "parent_id": "10", "topic": None, "nsfw": False},
{"id": "13", "name": "no-category", "type": 0, "position": 0, "parent_id": None, "topic": None, "nsfw": False},
]
result = json.loads(discord_server(action="list_channels", guild_id="111"))
result = json.loads(discord_admin_handler(action="list_channels", guild_id="111"))
assert result["total_channels"] == 3 # excludes the category itself
groups = result["channel_groups"]
# Uncategorized first
@@ -257,7 +262,7 @@ class TestListChannels:
def test_empty_guild(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = []
result = json.loads(discord_server(action="list_channels", guild_id="111"))
result = json.loads(discord_admin_handler(action="list_channels", guild_id="111"))
assert result["total_channels"] == 0
@@ -274,7 +279,7 @@ class TestChannelInfo:
"topic": "Welcome!", "nsfw": False, "position": 0,
"parent_id": "10", "rate_limit_per_user": 0, "last_message_id": "999",
}
result = json.loads(discord_server(action="channel_info", channel_id="11"))
result = json.loads(discord_admin_handler(action="channel_info", channel_id="11"))
assert result["name"] == "general"
assert result["type"] == "text"
assert result["guild_id"] == "111"
@@ -293,7 +298,7 @@ class TestListRoles:
{"id": "2", "name": "Admin", "position": 2, "color": 16711680, "mentionable": True, "managed": False, "hoist": True},
{"id": "3", "name": "Mod", "position": 1, "color": 255, "mentionable": True, "managed": False, "hoist": True},
]
result = json.loads(discord_server(action="list_roles", guild_id="111"))
result = json.loads(discord_admin_handler(action="list_roles", guild_id="111"))
assert result["count"] == 3
# Should be sorted by position descending
assert result["roles"][0]["name"] == "Admin"
@@ -317,7 +322,7 @@ class TestMemberInfo:
"joined_at": "2024-01-01T00:00:00Z",
"premium_since": None,
}
result = json.loads(discord_server(action="member_info", guild_id="111", user_id="42"))
result = json.loads(discord_admin_handler(action="member_info", guild_id="111", user_id="42"))
assert result["username"] == "testuser"
assert result["nickname"] == "Testy"
assert result["roles"] == ["2", "3"]
@@ -334,7 +339,7 @@ class TestSearchMembers:
mock_req.return_value = [
{"user": {"id": "42", "username": "testuser", "global_name": "Test", "bot": False}, "nick": None, "roles": []},
]
result = json.loads(discord_server(action="search_members", guild_id="111", query="test"))
result = json.loads(discord_core(action="search_members", guild_id="111", query="test"))
assert result["count"] == 1
assert result["members"][0]["username"] == "testuser"
mock_req.assert_called_once_with(
@@ -346,7 +351,7 @@ class TestSearchMembers:
def test_search_members_limit_capped(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = []
discord_server(action="search_members", guild_id="111", query="x", limit=200)
discord_core(action="search_members", guild_id="111", query="x", limit=200)
call_params = mock_req.call_args[1]["params"]
assert call_params["limit"] == "100" # Capped at 100
@@ -370,7 +375,7 @@ class TestFetchMessages:
"pinned": False,
},
]
result = json.loads(discord_server(action="fetch_messages", channel_id="11"))
result = json.loads(discord_core(action="fetch_messages", channel_id="11"))
assert result["count"] == 1
assert result["messages"][0]["content"] == "Hello world"
assert result["messages"][0]["author"]["username"] == "user1"
@@ -379,7 +384,7 @@ class TestFetchMessages:
def test_fetch_messages_with_pagination(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = []
discord_server(action="fetch_messages", channel_id="11", before="999", limit=10)
discord_core(action="fetch_messages", channel_id="11", before="999", limit=10)
call_params = mock_req.call_args[1]["params"]
assert call_params["before"] == "999"
assert call_params["limit"] == "10"
@@ -396,7 +401,7 @@ class TestListPins:
mock_req.return_value = [
{"id": "500", "content": "Important announcement", "author": {"username": "admin"}, "timestamp": "2024-01-01T00:00:00Z"},
]
result = json.loads(discord_server(action="list_pins", channel_id="11"))
result = json.loads(discord_admin_handler(action="list_pins", channel_id="11"))
assert result["count"] == 1
assert result["pinned_messages"][0]["content"] == "Important announcement"
@@ -410,7 +415,7 @@ class TestPinUnpin:
def test_pin_message(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = None # 204
result = json.loads(discord_server(action="pin_message", channel_id="11", message_id="500"))
result = json.loads(discord_admin_handler(action="pin_message", channel_id="11", message_id="500"))
assert result["success"] is True
mock_req.assert_called_once_with("PUT", "/channels/11/pins/500", "test-token")
@@ -418,7 +423,7 @@ class TestPinUnpin:
def test_unpin_message(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = None
result = json.loads(discord_server(action="unpin_message", channel_id="11", message_id="500"))
result = json.loads(discord_admin_handler(action="unpin_message", channel_id="11", message_id="500"))
assert result["success"] is True
@@ -431,7 +436,7 @@ class TestCreateThread:
def test_create_standalone_thread(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = {"id": "800", "name": "New Thread"}
result = json.loads(discord_server(action="create_thread", channel_id="11", name="New Thread"))
result = json.loads(discord_core(action="create_thread", channel_id="11", name="New Thread"))
assert result["success"] is True
assert result["thread_id"] == "800"
# Verify the API call
@@ -444,7 +449,7 @@ class TestCreateThread:
def test_create_thread_from_message(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = {"id": "801", "name": "Discussion"}
result = json.loads(discord_server(
result = json.loads(discord_core(
action="create_thread", channel_id="11", name="Discussion", message_id="1001",
))
assert result["success"] is True
@@ -463,7 +468,7 @@ class TestRoleManagement:
def test_add_role(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = None
result = json.loads(discord_server(
result = json.loads(discord_admin_handler(
action="add_role", guild_id="111", user_id="42", role_id="2",
))
assert result["success"] is True
@@ -475,7 +480,7 @@ class TestRoleManagement:
def test_remove_role(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.return_value = None
result = json.loads(discord_server(
result = json.loads(discord_admin_handler(
action="remove_role", guild_id="111", user_id="42", role_id="2",
))
assert result["success"] is True
@@ -490,15 +495,23 @@ class TestErrorHandling:
def test_api_error_handled(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.side_effect = DiscordAPIError(403, '{"message": "Missing Access"}')
result = json.loads(discord_server(action="list_guilds"))
result = json.loads(discord_admin_handler(action="list_guilds"))
assert "error" in result
assert "403" in result["error"]
@patch("tools.discord_tool._discord_request")
def test_unexpected_error_handled(self, mock_req, monkeypatch):
def test_unexpected_error_handled_admin(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.side_effect = RuntimeError("something broke")
result = json.loads(discord_server(action="list_guilds"))
result = json.loads(discord_admin_handler(action="list_guilds"))
assert "error" in result
assert "something broke" in result["error"]
@patch("tools.discord_tool._discord_request")
def test_unexpected_error_handled_core(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "test-token")
mock_req.side_effect = RuntimeError("something broke")
result = json.loads(discord_core(action="fetch_messages", channel_id="11"))
assert "error" in result
assert "something broke" in result["error"]
@@ -508,79 +521,109 @@ class TestErrorHandling:
# ---------------------------------------------------------------------------
class TestRegistration:
def test_tool_registered(self):
def test_core_tool_registered(self):
from tools.registry import registry
entry = registry._tools.get("discord_server")
entry = registry._tools.get("discord")
assert entry is not None
assert entry.schema["name"] == "discord_server"
assert entry.schema["name"] == "discord"
assert entry.toolset == "discord"
assert entry.check_fn is not None
assert entry.requires_env == ["DISCORD_BOT_TOKEN"]
def test_schema_actions(self):
"""Static schema should list all actions (the model_tools post-processing
narrows this per-session; static registration is the superset)."""
def test_admin_tool_registered(self):
from tools.registry import registry
entry = registry._tools["discord_server"]
actions = entry.schema["parameters"]["properties"]["action"]["enum"]
expected = [
"list_guilds", "server_info", "list_channels", "channel_info",
"list_roles", "member_info", "search_members", "fetch_messages",
"list_pins", "pin_message", "unpin_message", "create_thread",
"add_role", "remove_role",
]
assert set(actions) == set(expected)
assert set(_ACTIONS.keys()) == set(expected)
entry = registry._tools.get("discord_admin")
assert entry is not None
assert entry.schema["name"] == "discord_admin"
assert entry.toolset == "discord_admin"
assert entry.check_fn is not None
assert entry.requires_env == ["DISCORD_BOT_TOKEN"]
def test_core_schema_actions(self):
"""Core static schema should list only core actions."""
from tools.registry import registry
entry = registry._tools["discord"]
actions = set(entry.schema["parameters"]["properties"]["action"]["enum"])
assert actions == {"fetch_messages", "search_members", "create_thread"}
def test_admin_schema_actions(self):
"""Admin static schema should list only admin actions."""
from tools.registry import registry
entry = registry._tools["discord_admin"]
actions = set(entry.schema["parameters"]["properties"]["action"]["enum"])
expected_admin = set(_ACTIONS.keys()) - {"fetch_messages", "search_members", "create_thread"}
assert actions == expected_admin
def test_all_actions_covered(self):
"""Core + admin actions should cover all known actions."""
assert set(_CORE_ACTIONS.keys()) | set(_ADMIN_ACTIONS.keys()) == set(_ACTIONS.keys())
assert set(_CORE_ACTIONS.keys()) & set(_ADMIN_ACTIONS.keys()) == set()
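The coverage test above asserts that the core and admin actions partition the full action set. The sketch below restates that check with plain sets, using the action names listed in these tests. The real `_ACTIONS`, `_CORE_ACTIONS` and `_ADMIN_ACTIONS` are dicts whose values are not shown here, and `is_partition` is a hypothetical helper.

```python
ALL_ACTIONS = {
    "list_guilds", "server_info", "list_channels", "channel_info",
    "list_roles", "member_info", "search_members", "fetch_messages",
    "list_pins", "pin_message", "unpin_message", "create_thread",
    "add_role", "remove_role",
}
CORE_ACTIONS = {"fetch_messages", "search_members", "create_thread"}
ADMIN_ACTIONS = ALL_ACTIONS - CORE_ACTIONS


def is_partition(whole, *parts):
    """True when the parts cover the whole set and no element repeats."""
    union = set().union(*parts)
    # If the sizes add up to the union's size, the parts are pairwise disjoint.
    return union == set(whole) and sum(len(p) for p in parts) == len(union)
```

Checking both coverage and disjointness guards against an action being silently dropped from both tools or exposed through both at once.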
def test_schema_parameter_bounds(self):
from tools.registry import registry
entry = registry._tools["discord_server"]
entry = registry._tools["discord"]
props = entry.schema["parameters"]["properties"]
assert props["limit"]["minimum"] == 1
assert props["limit"]["maximum"] == 100
assert props["auto_archive_duration"]["enum"] == [60, 1440, 4320, 10080]
def test_schema_description_is_action_manifest(self):
"""The top-level description should include the action manifest
(one-line signatures per action) so the model can find required
params without re-reading every parameter description."""
def test_core_schema_description(self):
"""Core schema description should mention core actions."""
from tools.registry import registry
entry = registry._tools["discord_server"]
entry = registry._tools["discord"]
desc = entry.schema["description"]
# Spot-check a few entries
assert "list_guilds()" in desc
assert "fetch_messages(channel_id)" in desc
assert "search_members(guild_id, query)" in desc
assert "create_thread(channel_id, name)" in desc
# Admin actions should NOT be in core description
assert "list_guilds()" not in desc
assert "add_role(" not in desc
def test_admin_schema_description(self):
"""Admin schema description should mention admin actions."""
from tools.registry import registry
entry = registry._tools["discord_admin"]
desc = entry.schema["description"]
assert "list_guilds()" in desc
assert "add_role(guild_id, user_id, role_id)" in desc
# Core actions should NOT be in admin description
assert "fetch_messages(" not in desc
assert "create_thread(" not in desc
def test_handler_callable(self):
from tools.registry import registry
entry = registry._tools["discord_server"]
entry = registry._tools["discord"]
assert callable(entry.handler)
entry_admin = registry._tools["discord_admin"]
assert callable(entry_admin.handler)
# ---------------------------------------------------------------------------
# Toolset: discord_server only in hermes-discord
# Toolset: discord / discord_admin only in hermes-discord
# ---------------------------------------------------------------------------
class TestToolsetInclusion:
def test_discord_server_in_hermes_discord_toolset(self):
def test_discord_tools_in_hermes_discord_toolset(self):
from toolsets import TOOLSETS
assert "discord_server" in TOOLSETS["hermes-discord"]["tools"]
assert "discord" in TOOLSETS["hermes-discord"]["tools"]
assert "discord_admin" in TOOLSETS["hermes-discord"]["tools"]
def test_discord_server_not_in_core_tools(self):
def test_discord_tools_not_in_core_tools(self):
from toolsets import _HERMES_CORE_TOOLS
assert "discord_server" not in _HERMES_CORE_TOOLS
assert "discord" not in _HERMES_CORE_TOOLS
assert "discord_admin" not in _HERMES_CORE_TOOLS
def test_discord_server_not_in_other_toolsets(self):
def test_discord_tools_not_in_other_toolsets(self):
from toolsets import TOOLSETS
for name, ts in TOOLSETS.items():
if name == "hermes-discord":
if name in ("hermes-discord", "hermes-gateway", "discord", "discord_admin"):
continue
# The gateway toolset might include it if it unions all platform tools
if name == "hermes-gateway":
continue
assert "discord_server" not in ts.get("tools", []), (
f"discord_server should not be in toolset '{name}'"
tools = ts.get("tools", [])
assert "discord" not in tools or name == "discord", (
f"discord tool should not be in toolset '{name}'"
)
assert "discord_admin" not in tools or name == "discord_admin", (
f"discord_admin tool should not be in toolset '{name}'"
)
@@ -798,40 +841,69 @@ class TestDynamicSchema:
@patch("tools.discord_tool._discord_request")
def test_no_token_returns_none(self, mock_req, monkeypatch):
monkeypatch.delenv("DISCORD_BOT_TOKEN", raising=False)
assert get_dynamic_schema() is None
assert get_dynamic_schema_core() is None
assert get_dynamic_schema_admin() is None
mock_req.assert_not_called()
@patch("tools.discord_tool._discord_request")
def test_full_intents_full_schema(self, mock_req, monkeypatch):
def test_full_intents_core_schema(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": ""}},
)
mock_req.return_value = {"flags": (1 << 14) | (1 << 18)}
schema = get_dynamic_schema()
actions = schema["parameters"]["properties"]["action"]["enum"]
assert set(actions) == set(_ACTIONS.keys())
# No content warning
schema = get_dynamic_schema_core()
actions = set(schema["parameters"]["properties"]["action"]["enum"])
assert actions == set(_CORE_ACTIONS.keys())
assert schema["name"] == "discord"
@patch("tools.discord_tool._discord_request")
def test_full_intents_admin_schema(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": ""}},
)
mock_req.return_value = {"flags": (1 << 14) | (1 << 18)}
schema = get_dynamic_schema_admin()
actions = set(schema["parameters"]["properties"]["action"]["enum"])
assert actions == set(_ADMIN_ACTIONS.keys())
assert schema["name"] == "discord_admin"
# No content warning when MESSAGE_CONTENT is enabled
assert "MESSAGE_CONTENT" not in schema["description"]
@patch("tools.discord_tool._discord_request")
def test_no_members_intent_removes_member_actions_from_schema(
def test_no_members_intent_removes_member_actions_from_admin_schema(
self, mock_req, monkeypatch,
):
"""member_info is an admin action; it should be hidden when
GUILD_MEMBERS intent is missing."""
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": ""}},
)
mock_req.return_value = {"flags": 1 << 18} # only MESSAGE_CONTENT
schema = get_dynamic_schema()
schema = get_dynamic_schema_admin()
actions = schema["parameters"]["properties"]["action"]["enum"]
assert "member_info" not in actions
assert "member_info" not in schema["description"]
@patch("tools.discord_tool._discord_request")
def test_no_members_intent_hides_search_members_from_core(
self, mock_req, monkeypatch,
):
"""search_members is a core action gated by GUILD_MEMBERS intent."""
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": ""}},
)
mock_req.return_value = {"flags": 1 << 18} # only MESSAGE_CONTENT
schema = get_dynamic_schema_core()
actions = schema["parameters"]["properties"]["action"]["enum"]
assert "search_members" not in actions
assert "member_info" not in actions
# Manifest description should also not advertise them
assert "search_members" not in schema["description"]
assert "member_info" not in schema["description"]
@patch("tools.discord_tool._discord_request")
def test_no_message_content_adds_warning_note(self, mock_req, monkeypatch):
@@ -841,41 +913,53 @@ class TestDynamicSchema:
lambda: {"discord": {"server_actions": ""}},
)
mock_req.return_value = {"flags": 1 << 14} # only GUILD_MEMBERS
schema = get_dynamic_schema()
schema = get_dynamic_schema_core()
assert "MESSAGE_CONTENT" in schema["description"]
# But fetch_messages is still available
actions = schema["parameters"]["properties"]["action"]["enum"]
assert "fetch_messages" in actions
@patch("tools.discord_tool._discord_request")
def test_config_allowlist_narrows_schema(self, mock_req, monkeypatch):
def test_config_allowlist_narrows_admin_schema(self, mock_req, monkeypatch):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": "list_guilds,list_channels"}},
)
mock_req.return_value = {"flags": (1 << 14) | (1 << 18)}
schema = get_dynamic_schema()
schema = get_dynamic_schema_admin()
actions = schema["parameters"]["properties"]["action"]["enum"]
assert actions == ["list_guilds", "list_channels"]
# Manifest description should only show allowed ones (check for
# the signature marker, which is specific to manifest lines)
assert "list_guilds()" in schema["description"]
assert "add_role(" not in schema["description"]
assert "create_thread(" not in schema["description"]
@patch("tools.discord_tool._discord_request")
def test_empty_allowlist_with_valid_values_hides_tool(self, mock_req, monkeypatch):
def test_empty_allowlist_with_valid_values_hides_tools(self, mock_req, monkeypatch):
"""If the allowlist resolves to zero valid actions (e.g. all names
were typos), get_dynamic_schema returns None so the tool is dropped
entirely rather than showing an empty enum."""
were typos), get_dynamic_schema returns None so the tool is dropped."""
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": "typo_one,typo_two"}},
)
mock_req.return_value = {"flags": (1 << 14) | (1 << 18)}
assert get_dynamic_schema() is None
assert get_dynamic_schema_core() is None
assert get_dynamic_schema_admin() is None
@patch("tools.discord_tool._discord_request")
def test_backward_compat_wrapper(self, mock_req, monkeypatch):
"""get_dynamic_schema() should delegate to get_dynamic_schema_core()."""
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": ""}},
)
mock_req.return_value = {"flags": (1 << 14) | (1 << 18)}
schema = get_dynamic_schema()
assert schema is not None
assert schema["name"] == "discord"
actions = set(schema["parameters"]["properties"]["action"]["enum"])
assert actions == set(_CORE_ACTIONS.keys())
# ---------------------------------------------------------------------------
@@ -890,7 +974,7 @@ class TestRuntimeAllowlistEnforcement:
"hermes_cli.config.load_config",
lambda: {"discord": {"server_actions": "list_guilds"}},
)
result = json.loads(discord_server(action="add_role", guild_id="1", user_id="2", role_id="3"))
result = json.loads(discord_admin_handler(action="add_role", guild_id="1", user_id="2", role_id="3"))
assert "error" in result
assert "disabled by config" in result["error"]
mock_req.assert_not_called()
@@ -903,7 +987,7 @@ class TestRuntimeAllowlistEnforcement:
lambda: {"discord": {"server_actions": "list_guilds"}},
)
mock_req.return_value = []
result = json.loads(discord_server(action="list_guilds"))
result = json.loads(discord_admin_handler(action="list_guilds"))
assert "guilds" in result
@@ -930,7 +1014,7 @@ class Test403Enrichment:
lambda: {"discord": {"server_actions": ""}},
)
mock_req.side_effect = DiscordAPIError(403, '{"message":"Missing Permissions"}')
result = json.loads(discord_server(
result = json.loads(discord_admin_handler(
action="add_role", guild_id="1", user_id="2", role_id="3",
))
assert "error" in result
@@ -944,7 +1028,7 @@ class Test403Enrichment:
lambda: {"discord": {"server_actions": ""}},
)
mock_req.side_effect = DiscordAPIError(500, "server error")
result = json.loads(discord_server(action="list_guilds"))
result = json.loads(discord_admin_handler(action="list_guilds"))
assert "500" in result["error"]
assert "MANAGE_ROLES" not in result["error"]
@@ -961,10 +1045,10 @@ class TestModelToolsIntegration:
_reset_capability_cache()
@patch("tools.discord_tool._discord_request")
def test_discord_server_schema_rebuilt_by_get_tool_definitions(
def test_discord_admin_schema_rebuilt_by_get_tool_definitions(
self, mock_req, monkeypatch,
):
"""When model_tools.get_tool_definitions runs with discord_server
"""When model_tools.get_tool_definitions runs with discord_admin
available, it should replace the static schema with the dynamic one."""
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
monkeypatch.setattr(
@@ -976,16 +1060,16 @@ class TestModelToolsIntegration:
from model_tools import get_tool_definitions
tools = get_tool_definitions(enabled_toolsets=["hermes-discord"], quiet_mode=True)
discord_tool = next(
(t for t in tools if t.get("function", {}).get("name") == "discord_server"),
discord_admin_tool = next(
(t for t in tools if t.get("function", {}).get("name") == "discord_admin"),
None,
)
assert discord_tool is not None, "discord_server should be in the schema"
actions = discord_tool["function"]["parameters"]["properties"]["action"]["enum"]
assert discord_admin_tool is not None, "discord_admin should be in the schema"
actions = discord_admin_tool["function"]["parameters"]["properties"]["action"]["enum"]
assert actions == ["list_guilds", "server_info"]
@patch("tools.discord_tool._discord_request")
def test_discord_server_dropped_when_allowlist_empties_it(
def test_discord_tools_dropped_when_allowlist_empties_them(
self, mock_req, monkeypatch,
):
monkeypatch.setenv("DISCORD_BOT_TOKEN", "tok")
@@ -998,4 +1082,6 @@ class TestModelToolsIntegration:
from model_tools import get_tool_definitions
tools = get_tool_definitions(enabled_toolsets=["hermes-discord"], quiet_mode=True)
names = [t.get("function", {}).get("name") for t in tools]
assert "discord" not in names
assert "discord_admin" not in names
assert "discord_server" not in names
+245 -77
@@ -19,9 +19,11 @@ from unittest.mock import patch
from tools.process_registry import (
ProcessRegistry,
ProcessSession,
WATCH_MAX_PER_WINDOW,
WATCH_WINDOW_SECONDS,
WATCH_OVERLOAD_KILL_SECONDS,
WATCH_MIN_INTERVAL_SECONDS,
WATCH_STRIKE_LIMIT,
WATCH_GLOBAL_MAX_PER_WINDOW,
WATCH_GLOBAL_WINDOW_SECONDS,
WATCH_GLOBAL_COOLDOWN_SECONDS,
)
@@ -129,10 +131,15 @@ class TestCheckWatchPatterns:
assert registry.completion_queue.empty()
def test_hit_counter_increments(self, registry):
"""Each delivered notification increments _watch_hits."""
"""Each delivered notification increments _watch_hits.
With the 1-per-15s rate limit, the cooldown must be reset between calls.
"""
session = _make_session(watch_patterns=["X"])
registry._check_watch_patterns(session, "X\n")
assert session._watch_hits == 1
# Reset cooldown so the second match gets delivered.
session._watch_cooldown_until = 0.0
registry._check_watch_patterns(session, "X\n")
assert session._watch_hits == 2
@@ -148,100 +155,114 @@ class TestCheckWatchPatterns:
# =========================================================================
# Rate limiting
# Per-session rate limiting: 1 notification per 15s, 3 strikes → disable
# =========================================================================
class TestRateLimiting:
def test_within_window_limit(self, registry):
"""Notifications within the rate limit all get delivered."""
class TestPerSessionRateLimit:
def test_first_match_delivers(self, registry):
"""A fresh session with no prior cooldown delivers the first match."""
session = _make_session(watch_patterns=["E"])
for i in range(WATCH_MAX_PER_WINDOW):
registry._check_watch_patterns(session, f"E {i}\n")
assert registry.completion_queue.qsize() == WATCH_MAX_PER_WINDOW
registry._check_watch_patterns(session, "E first\n")
assert registry.completion_queue.qsize() == 1
evt = registry.completion_queue.get_nowait()
assert evt["type"] == "watch_match"
assert session._watch_hits == 1
# Cooldown is now armed.
assert session._watch_cooldown_until > 0
def test_exceeds_window_limit(self, registry):
"""Notifications beyond the rate limit are suppressed."""
def test_second_match_within_cooldown_is_suppressed(self, registry):
"""A second match inside the 15s cooldown is dropped and counted."""
session = _make_session(watch_patterns=["E"])
for i in range(WATCH_MAX_PER_WINDOW + 5):
registry._check_watch_patterns(session, f"E {i}\n")
# Only WATCH_MAX_PER_WINDOW should be in the queue
assert registry.completion_queue.qsize() == WATCH_MAX_PER_WINDOW
assert session._watch_suppressed == 5
def test_window_resets(self, registry):
"""After the window expires, notifications can flow again."""
session = _make_session(watch_patterns=["E"])
# Fill the window
for i in range(WATCH_MAX_PER_WINDOW):
registry._check_watch_patterns(session, f"E {i}\n")
# One more should be suppressed
registry._check_watch_patterns(session, "E extra\n")
registry._check_watch_patterns(session, "E first\n")
assert registry.completion_queue.qsize() == 1
# Immediately trigger another match — well inside cooldown.
registry._check_watch_patterns(session, "E second\n")
# Still only one notification.
assert registry.completion_queue.qsize() == 1
assert session._watch_suppressed == 1
assert session._watch_consecutive_strikes == 1
# Fast-forward past window
session._watch_window_start = time.time() - WATCH_WINDOW_SECONDS - 1
registry._check_watch_patterns(session, "E after reset\n")
# Should deliver now (window reset)
assert registry.completion_queue.qsize() == WATCH_MAX_PER_WINDOW + 1
def test_suppressed_count_in_next_delivery(self, registry):
"""Suppressed count is reported in the next successful delivery."""
def test_many_drops_inside_window_count_as_ONE_strike(self, registry):
"""Multiple suppressions inside the same cooldown window = 1 strike."""
session = _make_session(watch_patterns=["E"])
for i in range(WATCH_MAX_PER_WINDOW):
registry._check_watch_patterns(session, f"E {i}\n")
# Suppress 3 more
for i in range(3):
registry._check_watch_patterns(session, f"E suppressed {i}\n")
assert session._watch_suppressed == 3
registry._check_watch_patterns(session, "E\n")
for _ in range(10):
registry._check_watch_patterns(session, "E\n")
assert session._watch_consecutive_strikes == 1
assert session._watch_suppressed == 10
# Fast-forward past window to allow delivery
session._watch_window_start = time.time() - WATCH_WINDOW_SECONDS - 1
registry._check_watch_patterns(session, "E back\n")
# Drain to the last event
last_evt = None
while not registry.completion_queue.empty():
last_evt = registry.completion_queue.get_nowait()
assert last_evt["suppressed"] == 3
assert session._watch_suppressed == 0 # reset after delivery
# =========================================================================
# Overload kill switch
# =========================================================================
class TestOverloadKillSwitch:
def test_sustained_overload_disables(self, registry):
"""Sustained overload beyond threshold permanently disables watching."""
def test_three_strikes_disables_watch_and_promotes_to_notify(self, registry):
"""Three consecutive strike windows → watch_disabled + notify_on_complete."""
session = _make_session(watch_patterns=["E"])
# Fill the window to trigger rate limit
for i in range(WATCH_MAX_PER_WINDOW):
registry._check_watch_patterns(session, f"E {i}\n")
session.notify_on_complete = False
# Simulate sustained overload: set overload_since to past threshold
session._watch_overload_since = time.time() - WATCH_OVERLOAD_KILL_SECONDS - 1
# Force another suppressed hit
registry._check_watch_patterns(session, "E overload\n")
registry._check_watch_patterns(session, "E overload2\n")
for strike in range(WATCH_STRIKE_LIMIT):
# Emit → arms cooldown.
registry._check_watch_patterns(session, f"E emit {strike}\n")
# Attempt while inside cooldown → one strike, dropped.
registry._check_watch_patterns(session, f"E drop {strike}\n")
# Fast-forward past the cooldown for the NEXT iteration, BUT leave
# the strike candidate set so the cooldown-expiry branch sees
# "this was a strike window" and doesn't reset the counter.
session._watch_cooldown_until = time.time() - 0.01
# After WATCH_STRIKE_LIMIT strikes, the next attempt should find
# the session disabled.
assert session._watch_disabled is True
# Should have a watch_disabled event in the queue
assert session.notify_on_complete is True
# One watch_disabled summary event should be in the queue.
disabled_evts = []
matches = 0
while not registry.completion_queue.empty():
evt = registry.completion_queue.get_nowait()
if evt.get("type") == "watch_disabled":
disabled_evts.append(evt)
elif evt.get("type") == "watch_match":
matches += 1
assert len(disabled_evts) == 1
assert "too many matches" in disabled_evts[0]["message"]
assert "notify_on_complete" in disabled_evts[0]["message"]
# We should have had exactly WATCH_STRIKE_LIMIT emissions before disable.
assert matches == WATCH_STRIKE_LIMIT
def test_overload_resets_on_delivery(self, registry):
"""Overload timer resets when a notification gets through."""
def test_clean_window_resets_strike_counter(self, registry):
"""A cooldown that expires with zero drops resets the consecutive counter."""
session = _make_session(watch_patterns=["E"])
# Start overload tracking
session._watch_overload_since = time.time() - 10
# But window allows delivery → overload should reset
registry._check_watch_patterns(session, "E ok\n")
assert session._watch_overload_since == 0.0
assert session._watch_disabled is False
# Emit + drop inside window → 1 strike.
registry._check_watch_patterns(session, "E emit\n")
registry._check_watch_patterns(session, "E drop\n")
assert session._watch_consecutive_strikes == 1
# Fast-forward past the cooldown. The drop above left strike_candidate
# set, so it is cleared below to simulate a cooldown window that saw
# no drops. On the next emission the cooldown-expiry branch then sees
# a clean window and should reset the consecutive-strike counter.
session._watch_cooldown_until = time.time() - 0.01
# Clear strike candidate to simulate "this cooldown had no drops".
session._watch_strike_candidate = False
registry._check_watch_patterns(session, "E clean\n")
assert session._watch_consecutive_strikes == 0
def test_suppressed_count_in_next_delivery(self, registry):
"""Suppressed count from a strike window is reported in the next emit."""
session = _make_session(watch_patterns=["E"])
registry._check_watch_patterns(session, "E emit\n")
for _ in range(4):
registry._check_watch_patterns(session, "E drop\n")
assert session._watch_suppressed == 4
# Fast-forward past cooldown.
session._watch_cooldown_until = time.time() - 0.01
# Drain the queue so we can inspect the next emission.
while not registry.completion_queue.empty():
registry.completion_queue.get_nowait()
registry._check_watch_patterns(session, "E back\n")
evt = registry.completion_queue.get_nowait()
assert evt["type"] == "watch_match"
assert evt["suppressed"] == 4
assert session._watch_suppressed == 0 # reset after delivery
# =========================================================================
@@ -321,3 +342,150 @@ class TestCodeExecutionBlocked:
def test_watch_patterns_blocked(self):
from tools.code_execution_tool import _TERMINAL_BLOCKED_PARAMS
assert "watch_patterns" in _TERMINAL_BLOCKED_PARAMS
# =========================================================================
# Suppress-after-exit (anti-spam fix)
# =========================================================================
class TestSuppressAfterExit:
def test_match_dropped_once_session_exited(self, registry):
"""watch_patterns notifications stop the moment session.exited is set."""
session = _make_session(watch_patterns=["ERROR"])
# Mark the process as exited BEFORE the late chunk arrives.
session.exited = True
registry._check_watch_patterns(session, "ERROR: late buffer\n")
assert registry.completion_queue.empty()
assert session._watch_hits == 0
def test_match_still_delivered_while_session_running(self, registry):
"""Sanity: while the process is still running, matches still deliver."""
session = _make_session(watch_patterns=["ERROR"])
session.exited = False
registry._check_watch_patterns(session, "ERROR: oh no\n")
assert not registry.completion_queue.empty()
evt = registry.completion_queue.get_nowait()
assert evt["type"] == "watch_match"
# =========================================================================
# Mutual exclusion: notify_on_complete wins over watch_patterns
# =========================================================================
class TestMutualExclusion:
def test_resolver_drops_watch_when_notify_set(self):
"""Both flags set → watch_patterns dropped with a note."""
from tools.terminal_tool import _resolve_notification_flag_conflict
resolved, note = _resolve_notification_flag_conflict(
notify_on_complete=True,
watch_patterns=["ERROR", "DONE"],
background=True,
)
assert resolved is None
assert "notify_on_complete" in note
assert "duplicate notifications" in note
def test_resolver_keeps_watch_when_notify_off(self):
"""notify_on_complete=False → watch_patterns kept intact."""
from tools.terminal_tool import _resolve_notification_flag_conflict
resolved, note = _resolve_notification_flag_conflict(
notify_on_complete=False,
watch_patterns=["ERROR"],
background=True,
)
assert resolved == ["ERROR"]
assert note == ""
def test_resolver_keeps_notify_when_no_watch(self):
"""Only notify_on_complete set → no conflict."""
from tools.terminal_tool import _resolve_notification_flag_conflict
resolved, note = _resolve_notification_flag_conflict(
notify_on_complete=True,
watch_patterns=None,
background=True,
)
assert resolved is None
assert note == ""
def test_resolver_inert_when_not_background(self):
"""Without background=True, the whole thing is a no-op."""
from tools.terminal_tool import _resolve_notification_flag_conflict
resolved, note = _resolve_notification_flag_conflict(
notify_on_complete=True,
watch_patterns=["ERROR"],
background=False,
)
assert resolved == ["ERROR"]
assert note == ""
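The four resolver tests above pin down a small decision table. A minimal sketch of a function that would satisfy them follows; the name resolve_conflict_sketch and the exact note wording are assumptions for illustration, not the actual body of _resolve_notification_flag_conflict in tools/terminal_tool.py.

```python
from typing import List, Optional, Tuple


def resolve_conflict_sketch(
    notify_on_complete: bool,
    watch_patterns: Optional[List[str]],
    background: bool,
) -> Tuple[Optional[List[str]], str]:
    """Hypothetical resolver: return (watch_patterns_to_use, note)."""
    # Only a background process with BOTH flags set is a conflict.
    if background and notify_on_complete and watch_patterns:
        note = (
            "watch_patterns ignored because notify_on_complete is set; "
            "using both would produce duplicate notifications."
        )
        return None, note
    # Every other combination passes watch_patterns through untouched.
    return watch_patterns, ""


both = resolve_conflict_sketch(True, ["ERROR", "DONE"], True)
watch_only = resolve_conflict_sketch(False, ["ERROR"], True)
notify_only = resolve_conflict_sketch(True, None, True)
foreground = resolve_conflict_sketch(True, ["ERROR"], False)
```

Returning the note alongside the resolved value lets the caller surface the dropped flag to the model instead of silently discarding it.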
# =========================================================================
# Global circuit breaker (cross-session overflow blocker)
# =========================================================================
class TestGlobalCircuitBreaker:
def test_trips_after_global_threshold(self, registry):
"""When >N matches fire across sessions in the window, breaker trips."""
sessions = [
_make_session(sid=f"proc_s{i}", watch_patterns=["E"])
for i in range(WATCH_GLOBAL_MAX_PER_WINDOW + 3)
]
# Each session fires exactly one match — individually well under the
# per-session cap. But collectively they should trip the global cap.
for s in sessions:
registry._check_watch_patterns(s, "E hit\n")
# Drain the queue and count event types.
watch_matches = 0
overflow_tripped = 0
while not registry.completion_queue.empty():
evt = registry.completion_queue.get_nowait()
if evt.get("type") == "watch_match":
watch_matches += 1
elif evt.get("type") == "watch_overflow_tripped":
overflow_tripped += 1
assert watch_matches == WATCH_GLOBAL_MAX_PER_WINDOW
assert overflow_tripped == 1
assert registry._global_watch_tripped_until > 0
def test_cooldown_suppresses_and_then_releases(self, registry):
"""After trip, further events are suppressed; cooldown expiry emits release."""
# Spawn enough fresh sessions to trip the global breaker.
sessions = [
_make_session(sid=f"proc_t{i}", watch_patterns=["E"])
for i in range(WATCH_GLOBAL_MAX_PER_WINDOW + 1)
]
for s in sessions:
registry._check_watch_patterns(s, "E hit\n")
assert registry._global_watch_tripped_until > 0
# Further matches from BRAND-NEW sessions during cooldown are dropped.
q_size_before = registry.completion_queue.qsize()
extra1 = _make_session(sid="proc_extra1", watch_patterns=["E"])
extra2 = _make_session(sid="proc_extra2", watch_patterns=["E"])
registry._check_watch_patterns(extra1, "E hit\n")
registry._check_watch_patterns(extra2, "E hit\n")
assert registry.completion_queue.qsize() == q_size_before # no new events
assert registry._global_watch_suppressed_during_trip >= 2
# Simulate cooldown expiry.
registry._global_watch_tripped_until = time.time() - 1
# Next call admits AND emits the release summary.
released_session = _make_session(sid="proc_after", watch_patterns=["E"])
registry._check_watch_patterns(released_session, "E hit\n")
released = False
admitted = False
while not registry.completion_queue.empty():
evt = registry.completion_queue.get_nowait()
if evt.get("type") == "watch_overflow_released":
released = True
assert evt["suppressed"] >= 2
elif evt.get("type") == "watch_match":
admitted = True
assert released
assert admitted
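The per-session tests above imply a small state machine: one emission arms a cooldown, any number of drops inside that cooldown count as one strike, and a cooldown that ends without drops clears the strike counter. The sketch below is inferred from those tests only. WatchState, on_match, and the constant values (15 seconds, 3 strikes, taken from the section comment) are assumptions, and the real ProcessRegistry._check_watch_patterns may differ.

```python
from dataclasses import dataclass

MIN_INTERVAL_SECONDS = 15.0  # assumed from "1 notification per 15s"
STRIKE_LIMIT = 3             # assumed from "3 strikes"


@dataclass
class WatchState:
    cooldown_until: float = 0.0
    suppressed: int = 0
    consecutive_strikes: int = 0
    strike_candidate: bool = False
    disabled: bool = False


def on_match(state: WatchState, now: float) -> str:
    """Classify one pattern match at time `now` (hypothetical helper)."""
    if state.disabled:
        return "ignored"
    if now < state.cooldown_until:
        # Inside the cooldown: drop, and count at most one strike per window.
        state.suppressed += 1
        if not state.strike_candidate:
            state.strike_candidate = True
            state.consecutive_strikes += 1
            if state.consecutive_strikes >= STRIKE_LIMIT:
                state.disabled = True
                return "disabled"
        return "dropped"
    # Cooldown expired: a window without drops resets the strike counter.
    if not state.strike_candidate:
        state.consecutive_strikes = 0
    state.strike_candidate = False
    state.suppressed = 0
    state.cooldown_until = now + MIN_INTERVAL_SECONDS
    return "emit"


noisy = WatchState()
outcomes = [on_match(noisy, t) for t in (0, 1, 2, 20, 21, 40, 41)]

quiet = WatchState()
for t in (0, 1, 20, 40):
    on_match(quiet, t)
```

Passing `now` explicitly keeps the sketch deterministic; the tests above get the same effect by rewriting session._watch_cooldown_until directly.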
+47 -1
@@ -11,7 +11,7 @@ import os
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Union
from hermes_constants import display_hermes_home
@@ -238,6 +238,7 @@ def cronjob(
base_url: Optional[str] = None,
reason: Optional[str] = None,
script: Optional[str] = None,
context_from: Optional[Union[str, List[str]]] = None,
enabled_toolsets: Optional[List[str]] = None,
workdir: Optional[str] = None,
task_id: str = None,
@@ -265,6 +266,18 @@ def cronjob(
if script_error:
return tool_error(script_error, success=False)
# Validate context_from references existing jobs
if context_from:
from cron.jobs import get_job as _get_job
refs = [context_from] if isinstance(context_from, str) else context_from
for ref_id in refs:
if not _get_job(ref_id):
return tool_error(
f"context_from job '{ref_id}' not found. "
"Use cronjob(action='list') to see available jobs.",
success=False,
)
job = create_job(
prompt=prompt or "",
schedule=schedule,
@@ -277,6 +290,7 @@ def cronjob(
provider=_normalize_optional_job_value(provider),
base_url=_normalize_optional_job_value(base_url, strip_trailing_slash=True),
script=_normalize_optional_job_value(script),
context_from=context_from,
enabled_toolsets=enabled_toolsets or None,
workdir=_normalize_optional_job_value(workdir),
)
@@ -368,6 +382,24 @@ def cronjob(
if script_error:
return tool_error(script_error, success=False)
updates["script"] = _normalize_optional_job_value(script) if script else None
if context_from is not None:
# Empty string / empty list clears the field; otherwise validate
# each referenced job exists before storing. Normalized to a list
# (or None) to match the shape stored by create_job().
if isinstance(context_from, str):
refs = [context_from.strip()] if context_from.strip() else []
else:
refs = [str(j).strip() for j in context_from if str(j).strip()]
if refs:
from cron.jobs import get_job as _get_job
for ref_id in refs:
if not _get_job(ref_id):
return tool_error(
f"context_from job '{ref_id}' not found. "
"Use cronjob(action='list') to see available jobs.",
success=False,
)
updates["context_from"] = refs or None
if enabled_toolsets is not None:
updates["enabled_toolsets"] = enabled_toolsets or None
if workdir is not None:
@@ -473,6 +505,19 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
"type": "string",
"description": f"Optional path to a Python script that runs before each cron job execution. Its stdout is injected into the prompt as context. Use for data collection and change detection. Relative paths resolve under {display_hermes_home()}/scripts/. On update, pass empty string to clear."
},
"context_from": {
"type": "array",
"items": {"type": "string"},
"description": (
"Optional job ID or list of job IDs whose most recent completed output is "
"injected into the prompt as context before each run. "
"Use this to chain cron jobs: job A collects data, job B processes it. "
"Each entry must be a valid job ID (from cronjob action='list'). "
"Note: injects the most recent completed output — does not wait for "
"upstream jobs running in the same tick. "
"On update, pass an empty array to clear."
),
},
"enabled_toolsets": {
"type": "array",
"items": {"type": "string"},
@@ -526,6 +571,7 @@ registry.register(
base_url=args.get("base_url"),
reason=args.get("reason"),
script=args.get("script"),
context_from=args.get("context_from"),
enabled_toolsets=args.get("enabled_toolsets"),
workdir=args.get("workdir"),
task_id=kw.get("task_id"),
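The update branch above accepts context_from as either a single job ID or a list and stores a list or None. A minimal sketch of that normalization as a standalone function follows; normalize_context_from and first_missing are hypothetical helper names, and known_ids stands in for the lookup done through cron.jobs.get_job.

```python
from typing import Iterable, List, Optional, Union


def normalize_context_from(
    context_from: Optional[Union[str, List[str]]],
) -> Optional[List[str]]:
    """Hypothetical helper: coerce context_from to a list of IDs, or None."""
    if context_from is None:
        return None
    if isinstance(context_from, str):
        refs = [context_from.strip()] if context_from.strip() else []
    else:
        refs = [str(j).strip() for j in context_from if str(j).strip()]
    # An empty string or empty list clears the field.
    return refs or None


def first_missing(refs: Optional[List[str]], known_ids: Iterable[str]) -> Optional[str]:
    """Return the first referenced job ID that is not known, else None."""
    known = set(known_ids)
    for ref_id in refs or []:
        if ref_id not in known:
            return ref_id
    return None


single = normalize_context_from("job_a")
mixed = normalize_context_from([" job_a ", "", "job_b"])
cleared = normalize_context_from("")
missing = first_missing(mixed, ["job_a"])
```

Note that in the diff, None on update means "field not supplied", so the caller would skip the assignment entirely rather than store the None returned here.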
+68 -1
@@ -33,6 +33,7 @@ from typing import Any, Dict, List, Optional
from toolsets import TOOLSETS
from tools import file_state
from tools.terminal_tool import set_approval_callback as _set_subagent_approval_cb
from utils import base_url_hostname, is_truthy_value
@@ -47,6 +48,64 @@ DELEGATE_BLOCKED_TOOLS = frozenset(
]
)
# ---------------------------------------------------------------------------
# Subagent approval callbacks
# ---------------------------------------------------------------------------
# Subagents run inside a ThreadPoolExecutor worker. The CLI's interactive
# approval callback is stored in tools/terminal_tool.py's threading.local(),
# so worker threads do NOT inherit it. Without a callback,
# prompt_dangerous_approval() falls back to input() from the worker thread,
# which deadlocks against the parent's prompt_toolkit TUI that owns stdin.
#
# Fix: install a non-interactive callback into every subagent worker thread
# via ThreadPoolExecutor(initializer=_set_subagent_approval_cb, initargs=(cb,)).
# The callback is chosen by the `delegation.subagent_auto_approve` config:
# false (default) → _subagent_auto_deny (safe; matches leaf tool blocklist)
# true → _subagent_auto_approve (opt-in YOLO for cron/batch)
# Both emit a logger.warning for audit; gateway sessions are unaffected
# because they resolve approvals via tools/approval.py's per-session queue,
# not through these TLS callbacks.
def _subagent_auto_deny(command: str, description: str, **kwargs) -> str:
"""Auto-deny dangerous commands in subagent threads (safe default).
Returns 'deny' so the subagent sees a refusal it can recover from, and
never calls input() (which would deadlock the parent TUI).
"""
logger.warning(
"Subagent auto-denied dangerous command: %s (%s). "
"Set delegation.subagent_auto_approve: true to allow.",
command, description,
)
return "deny"
def _subagent_auto_approve(command: str, description: str, **kwargs) -> str:
"""Auto-approve dangerous commands in subagent threads (opt-in YOLO).
Only installed when delegation.subagent_auto_approve=true. Returns 'once'
so the subagent proceeds without blocking the parent UI.
"""
logger.warning(
"Subagent auto-approved dangerous command: %s (%s)",
command, description,
)
return "once"
def _get_subagent_approval_callback():
"""Return the callback to install into subagent worker threads.
Config key: delegation.subagent_auto_approve (bool, default False).
Reads via the same _load_config() path as the rest of delegate_task so
priority is config.yaml > (no env override for this knob) > default.
"""
cfg = _load_config()
val = cfg.get("subagent_auto_approve", False)
if is_truthy_value(val):
return _subagent_auto_approve
return _subagent_auto_deny
# Build a description fragment listing toolsets available for subagents.
# Excludes toolsets where ALL tools are blocked, composite/platform toolsets
# (hermes-* prefixed), and scenario toolsets.
@@ -1344,7 +1403,15 @@ def _run_single_child(
# Run child with a hard timeout to prevent indefinite blocking
# when the child's API call or tool-level HTTP request hangs.
child_timeout = _get_child_timeout()
_timeout_executor = ThreadPoolExecutor(max_workers=1)
_timeout_executor = ThreadPoolExecutor(
max_workers=1,
# Install a non-interactive approval callback in the worker thread
# so dangerous-command prompts from the subagent don't fall back to
# input() and deadlock the parent's prompt_toolkit TUI.
# Callback (deny vs approve) is governed by delegation.subagent_auto_approve.
initializer=_set_subagent_approval_cb,
initargs=(_get_subagent_approval_callback(),),
)
# Capture the worker thread so the timeout diagnostic can dump its
# Python stack (see #14726 — 0-API-call hangs are opaque without it).
_worker_thread_holder: Dict[str, Optional[threading.Thread]] = {"t": None}
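The comment block in this file explains that a callback stored in threading.local() is invisible to ThreadPoolExecutor workers unless an initializer installs it. The following self-contained sketch demonstrates that behaviour with the standard library only. The names _tls, set_approval_callback, current_callback, and auto_deny are stand-ins for illustration and are not the functions in tools/terminal_tool.py.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the thread-local storage described above.
_tls = threading.local()


def set_approval_callback(cb):
    """Store the callback for the current thread only."""
    _tls.callback = cb


def current_callback():
    """Return this thread's callback, or None if none was installed."""
    return getattr(_tls, "callback", None)


def auto_deny(command, description, **kwargs):
    """Non-interactive callback: refuse without touching stdin."""
    return "deny"


# The main thread installs its own (nominally interactive) callback.
set_approval_callback(lambda command, description, **kw: "once")

# A plain worker thread does not inherit it.
with ThreadPoolExecutor(max_workers=1) as pool:
    inherited = pool.submit(current_callback).result()

# With initializer/initargs the callback is installed inside the worker.
with ThreadPoolExecutor(
    max_workers=1,
    initializer=set_approval_callback,
    initargs=(auto_deny,),
) as pool:
    installed = pool.submit(current_callback).result()
    verdict = pool.submit(
        lambda: current_callback()("rm -rf /tmp/x", "recursive delete")
    ).result()
```

The initializer runs once per worker thread as it starts, so the pattern also covers pools with more than one worker.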
+111 -63
@@ -473,6 +473,12 @@ _ACTIONS = {
"remove_role": _remove_role,
}
_CORE_ACTION_NAMES = frozenset({"fetch_messages", "search_members", "create_thread"})
_ADMIN_ACTION_NAMES = frozenset(_ACTIONS.keys()) - _CORE_ACTION_NAMES
_CORE_ACTIONS = {k: v for k, v in _ACTIONS.items() if k in _CORE_ACTION_NAMES}
_ADMIN_ACTIONS = {k: v for k, v in _ACTIONS.items() if k in _ADMIN_ACTION_NAMES}
# Single-source-of-truth manifest: action → (signature, one-line description).
# Consumed by :func:`_build_schema` so the schema's top-level description
# always matches the registered action set.
@@ -531,7 +537,7 @@ def _load_allowed_actions_config() -> Optional[List[str]]:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as exc:
logger.debug("discord_server: could not load config (%s); allowing all actions.", exc)
logger.debug("discord: could not load config (%s); allowing all actions.", exc)
return None
raw = (cfg.get("discord") or {}).get("server_actions")
@@ -586,12 +592,16 @@ def _available_actions(
def _build_schema(
actions: List[str],
caps: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Build the tool schema for the given filtered action list."""
tool_name: str = "discord",
) -> Optional[Dict[str, Any]]:
"""Build the tool schema for the given filtered action list.
Returns ``None`` when *actions* is empty; callers should drop the
tool from registration in that case.
"""
caps = caps or {}
if not actions:
# Tool shouldn't be registered when empty, but guard anyway.
actions = list(_ACTIONS.keys())
return None
# Action manifest lines (action-first, parameter-scoped).
manifest_lines = [
@@ -602,24 +612,36 @@ def _build_schema(
manifest_block = "\n".join(manifest_lines)
content_note = ""
if caps.get("detected") and caps.get("has_message_content") is False:
affected_actions = {"fetch_messages", "list_pins"} & set(actions)
if affected_actions and caps.get("detected") and caps.get("has_message_content") is False:
names = " and ".join(sorted(affected_actions))
content_note = (
"\n\nNOTE: Bot does NOT have the MESSAGE_CONTENT privileged intent. "
"fetch_messages and list_pins will return message metadata (author, "
f"\n\nNOTE: Bot does NOT have the MESSAGE_CONTENT privileged intent. "
f"{names} will return message metadata (author, "
"timestamps, attachments, reactions, pin state) but `content` will be "
"empty for messages not sent as a direct mention to the bot or in DMs. "
"Enable the intent in the Discord Developer Portal to see all content."
)
description = (
"Query and manage a Discord server via the REST API.\n\n"
"Available actions:\n"
f"{manifest_block}\n\n"
"Call list_guilds first to discover guild_ids, then list_channels for "
"channel_ids. Runtime errors will tell you if the bot lacks a specific "
"per-guild permission (e.g. MANAGE_ROLES for add_role)."
f"{content_note}"
)
if tool_name == "discord_admin":
description = (
"Manage a Discord server via the REST API.\n\n"
"Available actions:\n"
f"{manifest_block}\n\n"
"Call list_guilds first to discover guild_ids, then list_channels for "
"channel_ids. Runtime errors will tell you if the bot lacks a specific "
"per-guild permission (e.g. MANAGE_ROLES for add_role)."
f"{content_note}"
)
else:
description = (
"Read and participate in a Discord server.\n\n"
"Available actions:\n"
f"{manifest_block}\n\n"
"Use the channel_id from the current conversation context. "
"Use search_members to look up user IDs by name prefix."
f"{content_note}"
)
properties: Dict[str, Any] = {
"action": {
@@ -676,7 +698,7 @@ def _build_schema(
}
return {
"name": "discord_server",
"name": tool_name,
"description": description,
"parameters": {
"type": "object",
@@ -686,28 +708,33 @@ def _build_schema(
}
def get_dynamic_schema() -> Optional[Dict[str, Any]]:
"""Return a schema filtered by current intents + config allowlist.
Called by ``model_tools.get_tool_definitions`` as a post-processing
step so the schema the model sees always reflects reality. Returns
``None`` when no actions are available (tool should be removed from
the schema list entirely).
"""
def _get_dynamic_schema(
action_subset: Dict[str, Any],
tool_name: str,
) -> Optional[Dict[str, Any]]:
"""Build a dynamic schema for *action_subset* filtered by intents + config."""
token = _get_bot_token()
if not token:
return None
caps = _detect_capabilities(token)
allowlist = _load_allowed_actions_config()
actions = _available_actions(caps, allowlist)
actions = [a for a in _available_actions(caps, allowlist) if a in action_subset]
if not actions:
logger.warning(
"discord_server: config allowlist/intents left zero available actions; "
"hiding tool from this session."
)
return None
return _build_schema(actions, caps)
return _build_schema(actions, caps, tool_name=tool_name)
def get_dynamic_schema_core() -> Optional[Dict[str, Any]]:
return _get_dynamic_schema(_CORE_ACTIONS, "discord")
def get_dynamic_schema_admin() -> Optional[Dict[str, Any]]:
return _get_dynamic_schema(_ADMIN_ACTIONS, "discord_admin")
def get_dynamic_schema() -> Optional[Dict[str, Any]]:
"""Backward-compat wrapper — returns core schema."""
return get_dynamic_schema_core()
# ---------------------------------------------------------------------------
@@ -774,11 +801,13 @@ def check_discord_tool_requirements() -> bool:
# ---------------------------------------------------------------------------
# Main handler
# Handlers
# ---------------------------------------------------------------------------
def discord_server(
def _run_discord_action(
action: str,
valid_actions: Dict[str, Any],
tool_label: str,
guild_id: str = "",
channel_id: str = "",
user_id: str = "",
@@ -790,18 +819,17 @@ def discord_server(
before: str = "",
after: str = "",
auto_archive_duration: int = 1440,
task_id: str = None,
) -> str:
"""Execute a Discord server action."""
"""Shared handler logic for both discord tools."""
token = _get_bot_token()
if not token:
return json.dumps({"error": "DISCORD_BOT_TOKEN not configured."})
action_fn = _ACTIONS.get(action)
action_fn = valid_actions.get(action)
if not action_fn:
return json.dumps({
"error": f"Unknown action: {action}",
"available_actions": list(_ACTIONS.keys()),
"available_actions": list(valid_actions.keys()),
})
# Config-level allowlist gate (defense in depth — schema already filtered,
@@ -848,44 +876,64 @@ def discord_server(
auto_archive_duration=auto_archive_duration,
)
except DiscordAPIError as e:
logger.warning("Discord API error in action '%s': %s", action, e)
logger.warning("Discord API error in %s action '%s': %s", tool_label, action, e)
if e.status == 403:
return json.dumps({"error": _enrich_403(action, e.body)})
return json.dumps({"error": str(e)})
except Exception as e:
logger.exception("Unexpected error in discord_server action '%s'", action)
logger.exception("Unexpected error in %s action '%s'", tool_label, action)
return json.dumps({"error": f"Unexpected error: {e}"})
def discord_core(action: str, **kwargs) -> str:
"""Execute a core Discord action (fetch_messages, search_members, create_thread)."""
return _run_discord_action(action, _CORE_ACTIONS, "discord", **kwargs)
def discord_admin_handler(action: str, **kwargs) -> str:
"""Execute a Discord admin action (server management)."""
return _run_discord_action(action, _ADMIN_ACTIONS, "discord_admin", **kwargs)
# ---------------------------------------------------------------------------
# Tool registration
# ---------------------------------------------------------------------------
# Register with the full unfiltered schema. ``model_tools.get_tool_definitions``
# rebuilds this per-session via ``get_dynamic_schema`` so the model only ever
# sees intent-available, config-allowed actions. The static registration is a
# safe baseline for tools that inspect the registry directly.
_STATIC_SCHEMA = _build_schema(list(_ACTIONS.keys()), caps={"detected": False})
_HANDLER_DEFAULTS = {
"action": "", "guild_id": "", "channel_id": "", "user_id": "",
"role_id": "", "message_id": "", "query": "", "name": "",
"limit": 50, "before": "", "after": "", "auto_archive_duration": 1440,
}
def _make_handler(handler_fn):
"""Create a registry-compatible handler lambda for a discord handler."""
return lambda args, **kw: handler_fn(
**{k: args.get(k, v) for k, v in _HANDLER_DEFAULTS.items()},
)
_STATIC_CORE_SCHEMA = _build_schema(
list(_CORE_ACTIONS.keys()), caps={"detected": False}, tool_name="discord",
)
_STATIC_ADMIN_SCHEMA = _build_schema(
list(_ADMIN_ACTIONS.keys()), caps={"detected": False}, tool_name="discord_admin",
)
registry.register(
name="discord_server",
name="discord",
toolset="discord",
schema=_STATIC_SCHEMA,
handler=lambda args, **kw: discord_server(
action=args.get("action", ""),
guild_id=args.get("guild_id", ""),
channel_id=args.get("channel_id", ""),
user_id=args.get("user_id", ""),
role_id=args.get("role_id", ""),
message_id=args.get("message_id", ""),
query=args.get("query", ""),
name=args.get("name", ""),
limit=args.get("limit", 50),
before=args.get("before", ""),
after=args.get("after", ""),
auto_archive_duration=args.get("auto_archive_duration", 1440),
task_id=kw.get("task_id"),
),
schema=_STATIC_CORE_SCHEMA,
handler=_make_handler(discord_core),
check_fn=check_discord_tool_requirements,
requires_env=["DISCORD_BOT_TOKEN"],
)
registry.register(
name="discord_admin",
toolset="discord_admin",
schema=_STATIC_ADMIN_SCHEMA,
handler=_make_handler(discord_admin_handler),
check_fn=check_discord_tool_requirements,
requires_env=["DISCORD_BOT_TOKEN"],
)
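The split into a core tool and an admin tool comes down to partitioning one action table by a fixed name set and building a schema per subset, returning `None` when filtering leaves nothing. A small sketch under stated assumptions: the action table, the `allowlist` parameter, and the trimmed schema shape below are illustrative and are not the module's real definitions.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

# Hypothetical action table; the real handlers call the Discord REST API.
ACTIONS: Dict[str, Callable[..., str]] = {
    "fetch_messages": lambda **kw: "messages",
    "search_members": lambda **kw: "members",
    "create_thread": lambda **kw: "thread",
    "list_guilds": lambda **kw: "guilds",
    "add_role": lambda **kw: "role added",
}

CORE_NAMES = frozenset({"fetch_messages", "search_members", "create_thread"})
ADMIN_NAMES = frozenset(ACTIONS) - CORE_NAMES

CORE_ACTIONS = {k: v for k, v in ACTIONS.items() if k in CORE_NAMES}
ADMIN_ACTIONS = {k: v for k, v in ACTIONS.items() if k in ADMIN_NAMES}


def build_schema(actions: List[str], tool_name: str) -> Optional[Dict[str, Any]]:
    """Return a trimmed tool schema, or None when no action survives filtering."""
    if not actions:
        return None
    return {
        "name": tool_name,
        "parameters": {
            "type": "object",
            "properties": {"action": {"type": "string", "enum": list(actions)}},
            "required": ["action"],
        },
    }


def dynamic_schema(
    subset: Dict[str, Any],
    tool_name: str,
    allowlist: Optional[Iterable[str]] = None,
) -> Optional[Dict[str, Any]]:
    """Keep only the subset's actions that the (optional) allowlist permits."""
    allowed = None if allowlist is None else set(allowlist)
    actions = [a for a in subset if allowed is None or a in allowed]
    return build_schema(actions, tool_name)
```

With an allowlist of only `fetch_messages`, the admin schema comes back as `None` while the core schema still lists that one action, which is the behaviour the `Optional` return type is meant to expose.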
+200 -50
@@ -58,10 +58,20 @@ MAX_OUTPUT_CHARS = 200_000 # 200KB rolling output buffer
FINISHED_TTL_SECONDS = 1800 # Keep finished processes for 30 minutes
MAX_PROCESSES = 64 # Max concurrent tracked processes (LRU pruning)
# Watch pattern rate limiting
WATCH_MAX_PER_WINDOW = 8 # Max notifications delivered per window
WATCH_WINDOW_SECONDS = 10 # Rolling window length
WATCH_OVERLOAD_KILL_SECONDS = 45 # Sustained overload duration before disabling watch
# Watch pattern rate limiting — PER SESSION.
# Hard rule: at most ONE watch-match notification every WATCH_MIN_INTERVAL_SECONDS.
# Any match arriving inside that cooldown window is dropped and counted as a strike.
# After WATCH_STRIKE_LIMIT consecutive strike windows, watch_patterns for that
# session is permanently disabled and the session falls back to notify_on_complete
# semantics (one notification when the process actually exits).
WATCH_MIN_INTERVAL_SECONDS = 15 # Minimum spacing between consecutive watch matches
WATCH_STRIKE_LIMIT = 3 # Strikes in a row → disable watch + promote to notify_on_complete
# Global circuit breaker — across all sessions. Secondary safety net so concurrent
# siblings can't collectively flood the user even when each is under its own cap.
WATCH_GLOBAL_MAX_PER_WINDOW = 15
WATCH_GLOBAL_WINDOW_SECONDS = 10
WATCH_GLOBAL_COOLDOWN_SECONDS = 30
def format_uptime_short(seconds: int) -> str:
@@ -105,10 +115,18 @@ class ProcessSession:
watch_patterns: List[str] = field(default_factory=list)
_watch_hits: int = field(default=0, repr=False) # total matches delivered
_watch_suppressed: int = field(default=0, repr=False) # matches dropped by rate limit
_watch_overload_since: float = field(default=0.0, repr=False) # when sustained overload began
_watch_disabled: bool = field(default=False, repr=False) # permanently killed by overload
_watch_window_hits: int = field(default=0, repr=False) # hits in current rate window
_watch_window_start: float = field(default=0.0, repr=False)
_watch_disabled: bool = field(default=False, repr=False) # permanently killed after strike limit
# Per-session rate limit state: at most one match every WATCH_MIN_INTERVAL_SECONDS.
# When an emission happens, _watch_cooldown_until is set to now + interval and
# _watch_strike_candidate is cleared. The first match to arrive before that
# deadline sets _watch_strike_candidate and counts as one strike (regardless of
# how many matches were dropped in between; a strike is a window, not a match).
# After WATCH_STRIKE_LIMIT strikes in a row, watch_patterns is disabled and the
# session promotes to notify_on_complete.
_watch_last_emit_at: float = field(default=0.0, repr=False)
_watch_cooldown_until: float = field(default=0.0, repr=False)
_watch_strike_candidate: bool = field(default=False, repr=False)
_watch_consecutive_strikes: int = field(default=0, repr=False)
_lock: threading.Lock = field(default_factory=threading.Lock)
_reader_thread: Optional[threading.Thread] = field(default=None, repr=False)
_pty: Any = field(default=None, repr=False) # ptyprocess handle (when use_pty=True)
@@ -151,6 +169,15 @@ class ProcessRegistry:
# via wait/poll/log. Drain loops skip notifications for these.
self._completion_consumed: set = set()
# Global watch-match circuit breaker — across all sessions.
# Prevents sibling processes from collectively flooding the user even
# when each stays under its own per-session cap.
self._global_watch_lock = threading.Lock()
self._global_watch_window_start: float = 0.0
self._global_watch_window_hits: int = 0
self._global_watch_tripped_until: float = 0.0
self._global_watch_suppressed_during_trip: int = 0
@staticmethod
def _clean_shell_noise(text: str) -> str:
"""Strip shell startup warnings from the beginning of output."""
@@ -163,12 +190,23 @@ class ProcessRegistry:
"""Scan new output for watch patterns and queue notifications.
Called from reader threads with new_text being the freshly-read chunk.
Rate-limited: max WATCH_MAX_PER_WINDOW notifications per WATCH_WINDOW_SECONDS.
If sustained overload exceeds WATCH_OVERLOAD_KILL_SECONDS, watching is
disabled permanently for this process.
Per-session rate limit: at most ONE watch-match notification per
WATCH_MIN_INTERVAL_SECONDS. Any match arriving inside the cooldown
window is dropped and counts as ONE strike for that window. After
WATCH_STRIKE_LIMIT consecutive strike windows, watch_patterns is
disabled for this session and the session is promoted to
notify_on_complete semantics: one notification when the process
actually exits, no more mid-process spam.
"""
if not session.watch_patterns or session._watch_disabled:
return
# Suppress-after-exit: once the reader loop has declared the process
# exited, any late chunk we still see is post-exit noise. Dropping these
# prevents the "stale notifications delivered minutes after the process
# ended" spam when completion_queue consumers run async.
if session.exited:
return
# Scan new text line-by-line for pattern matches
matched_lines = []
@@ -185,55 +223,80 @@ class ProcessRegistry:
return
now = time.time()
should_disable = False
with session._lock:
# Reset window if it's expired
if now - session._watch_window_start >= WATCH_WINDOW_SECONDS:
session._watch_window_hits = 0
session._watch_window_start = now
# Check rate limit
if session._watch_window_hits >= WATCH_MAX_PER_WINDOW:
# Case 1: still inside the cooldown from the last emission.
# Count this as a strike for the current window (only once per window)
# and drop the event. If we've hit the strike limit, disable watch
# and promote to notify_on_complete.
if session._watch_cooldown_until and now < session._watch_cooldown_until:
session._watch_suppressed += len(matched_lines)
if not session._watch_strike_candidate:
# First drop in this window — count one strike.
session._watch_strike_candidate = True
session._watch_consecutive_strikes += 1
if session._watch_consecutive_strikes >= WATCH_STRIKE_LIMIT:
session._watch_disabled = True
# Promote to notify_on_complete so the agent still gets
# exactly one notification when the process actually ends.
session.notify_on_complete = True
should_disable = True
return_early = True
else:
# Case 2: cooldown has expired.
# Decide whether this window was a "clean" one (no drops) or a
# strike window. If no strike candidate was set during the prior
# cooldown, reset the consecutive-strike counter — we're back to
# healthy emission cadence.
if (
session._watch_cooldown_until
and not session._watch_strike_candidate
):
session._watch_consecutive_strikes = 0
session._watch_strike_candidate = False
# Track sustained overload for kill switch
if session._watch_overload_since == 0.0:
session._watch_overload_since = now
elif now - session._watch_overload_since > WATCH_OVERLOAD_KILL_SECONDS:
session._watch_disabled = True
self.completion_queue.put({
"session_id": session.id,
"session_key": session.session_key,
"command": session.command,
"type": "watch_disabled",
"suppressed": session._watch_suppressed,
"platform": session.watcher_platform,
"chat_id": session.watcher_chat_id,
"user_id": session.watcher_user_id,
"user_name": session.watcher_user_name,
"thread_id": session.watcher_thread_id,
"message": (
f"Watch patterns disabled for process {session.id}: "
f"too many matches ({session._watch_suppressed} suppressed). "
f"Use process(action='poll') to check output manually."
),
})
return
# Emit the notification and start a new cooldown window.
session._watch_last_emit_at = now
session._watch_cooldown_until = now + WATCH_MIN_INTERVAL_SECONDS
session._watch_hits += 1
suppressed = session._watch_suppressed
session._watch_suppressed = 0
return_early = False
# Under the rate limit — deliver notification
session._watch_window_hits += 1
session._watch_hits += 1
# Clear overload tracker since we got a delivery through
session._watch_overload_since = 0.0
# Include suppressed count if any events were dropped
suppressed = session._watch_suppressed
session._watch_suppressed = 0
if return_early:
if should_disable:
# Emit exactly one "watch disabled, falling back to notify_on_complete"
# summary event so the agent/user sees why things went quiet.
self.completion_queue.put({
"session_id": session.id,
"session_key": session.session_key,
"command": session.command,
"type": "watch_disabled",
"suppressed": session._watch_suppressed,
"platform": session.watcher_platform,
"chat_id": session.watcher_chat_id,
"user_id": session.watcher_user_id,
"user_name": session.watcher_user_name,
"thread_id": session.watcher_thread_id,
"message": (
f"Watch patterns disabled for process {session.id}: "
f"{WATCH_STRIKE_LIMIT} consecutive rate-limit windows triggered "
f"(min spacing {WATCH_MIN_INTERVAL_SECONDS}s). "
f"Falling back to notify_on_complete semantics; you'll get "
f"exactly one notification when the process exits."
),
})
return
# Trim matched output to a reasonable size
output = "\n".join(matched_lines[:20])
if len(output) > 2000:
output = output[:2000] + "\n...(truncated)"
# Global circuit breaker — across all sessions (secondary safety net).
if not self._global_watch_admit(now):
return
self.completion_queue.put({
"session_id": session.id,
"session_key": session.session_key,
@@ -249,6 +312,93 @@ class ProcessRegistry:
"thread_id": session.watcher_thread_id,
})
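The per-session rule in the method above (one emission per cooldown, one strike per cooldown window that saw a drop, disable after a run of strikes) can be isolated as a small state machine. The sketch below is a simplified model, not the registry code: `WatchLimiter` and `offer` are hypothetical names, time is passed in explicitly, and locking and queue delivery are left out.

```python
from dataclasses import dataclass

MIN_INTERVAL = 15.0  # mirrors WATCH_MIN_INTERVAL_SECONDS
STRIKE_LIMIT = 3     # mirrors WATCH_STRIKE_LIMIT


@dataclass
class WatchLimiter:
    cooldown_until: float = 0.0
    strike_candidate: bool = False
    consecutive_strikes: int = 0
    suppressed: int = 0
    disabled: bool = False

    def offer(self, now: float) -> str:
        """Classify a match seen at `now` as 'emit', 'drop', or 'disable'."""
        if self.disabled:
            return "drop"
        if self.cooldown_until and now < self.cooldown_until:
            # Inside the cooldown: drop, and count at most one strike per window.
            self.suppressed += 1
            if not self.strike_candidate:
                self.strike_candidate = True
                self.consecutive_strikes += 1
                if self.consecutive_strikes >= STRIKE_LIMIT:
                    self.disabled = True
                    return "disable"
            return "drop"
        # Cooldown expired: a window with no drops resets the strike run.
        if self.cooldown_until and not self.strike_candidate:
            self.consecutive_strikes = 0
        self.strike_candidate = False
        self.cooldown_until = now + MIN_INTERVAL
        self.suppressed = 0
        return "emit"
```

Feeding matches at 0, 1, 2, 16, 17, 32 and 33 seconds produces three emissions, one strike in each of the three cooldown windows, and a `disable` on the third strike; a window that passes without any dropped match resets the run to zero.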
def _global_watch_admit(self, now: float) -> bool:
"""Return True if this watch_match event is allowed through the global breaker.
Semantics:
- If we're currently in a cooldown period, drop the event and count it.
- Otherwise, slide the rolling window and check the global cap.
- If the cap is exceeded, trip the breaker for WATCH_GLOBAL_COOLDOWN_SECONDS
and emit ONE summary event so the agent/user sees "N notifications were
suppressed" instead of getting them individually.
- When the cooldown ends, emit a release summary and reset counters.
"""
with self._global_watch_lock:
# Handle cooldown expiry first so we can emit the release summary.
if self._global_watch_tripped_until and now >= self._global_watch_tripped_until:
suppressed = self._global_watch_suppressed_during_trip
self._global_watch_tripped_until = 0.0
self._global_watch_suppressed_during_trip = 0
self._global_watch_window_start = now
self._global_watch_window_hits = 0
if suppressed > 0:
# Queue a summary event outside the lock (below).
release_msg = {
"session_id": "",
"session_key": "",
"command": "",
"type": "watch_overflow_released",
"suppressed": suppressed,
"message": (
f"Watch-pattern notifications resumed. "
f"{suppressed} match event(s) were suppressed during the flood."
),
"platform": "",
"chat_id": "",
"user_id": "",
"user_name": "",
"thread_id": "",
}
else:
release_msg = None
else:
release_msg = None
# Still in cooldown — drop and count.
if self._global_watch_tripped_until and now < self._global_watch_tripped_until:
self._global_watch_suppressed_during_trip += 1
admit = False
trip_now = None
else:
# Slide the window.
if now - self._global_watch_window_start >= WATCH_GLOBAL_WINDOW_SECONDS:
self._global_watch_window_start = now
self._global_watch_window_hits = 0
if self._global_watch_window_hits >= WATCH_GLOBAL_MAX_PER_WINDOW:
# Trip the breaker.
self._global_watch_tripped_until = now + WATCH_GLOBAL_COOLDOWN_SECONDS
self._global_watch_suppressed_during_trip += 1
trip_now = now
admit = False
else:
self._global_watch_window_hits += 1
trip_now = None
admit = True
# Queue summary events outside the lock.
if release_msg is not None:
self.completion_queue.put(release_msg)
if trip_now is not None:
self.completion_queue.put({
"session_id": "",
"session_key": "",
"command": "",
"type": "watch_overflow_tripped",
"message": (
f"Watch-pattern overflow: >{WATCH_GLOBAL_MAX_PER_WINDOW} "
f"notifications in {WATCH_GLOBAL_WINDOW_SECONDS}s across all processes. "
f"Suppressing further watch_match events for "
f"{WATCH_GLOBAL_COOLDOWN_SECONDS}s."
),
"platform": "",
"chat_id": "",
"user_id": "",
"user_name": "",
"thread_id": "",
})
return admit
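The global breaker's admit logic reduces to a counting window plus a trip deadline. The following is a reduced sketch with a hypothetical `GlobalBreaker` class: it keeps the window, trip, and reset behaviour described in the docstring but omits the two summary events that the real method queues, and it takes `now` as an argument so it can be exercised without sleeping.

```python
import threading


class GlobalBreaker:
    def __init__(self, max_per_window: int = 15, window: float = 10.0,
                 cooldown: float = 30.0) -> None:
        self.max_per_window = max_per_window
        self.window = window
        self.cooldown = cooldown
        self._lock = threading.Lock()
        self._window_start = 0.0
        self._window_hits = 0
        self._tripped_until = 0.0
        self.suppressed = 0

    def admit(self, now: float) -> bool:
        """Return True if an event at `now` may pass the breaker."""
        with self._lock:
            # Cooldown finished: clear the trip and start a fresh window.
            if self._tripped_until and now >= self._tripped_until:
                self._tripped_until = 0.0
                self.suppressed = 0
                self._window_start = now
                self._window_hits = 0
            # Still tripped: drop and count.
            if self._tripped_until and now < self._tripped_until:
                self.suppressed += 1
                return False
            # Start a new counting window once the old one has aged out.
            if now - self._window_start >= self.window:
                self._window_start = now
                self._window_hits = 0
            if self._window_hits >= self.max_per_window:
                # Cap exceeded: trip the breaker for the cooldown period.
                self._tripped_until = now + self.cooldown
                self.suppressed += 1
                return False
            self._window_hits += 1
            return True
```

Note that the window here restarts from the first event after expiry rather than sliding continuously, which matches the reset-on-expiry check in the method above.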
@staticmethod
def _is_host_pid_alive(pid: Optional[int]) -> bool:
"""Best-effort liveness check for host-visible PIDs."""
+47 -4
@@ -1388,6 +1388,33 @@ def _foreground_background_guidance(command: str) -> str | None:
return None
def _resolve_notification_flag_conflict(
*,
notify_on_complete: bool,
watch_patterns,
background: bool,
) -> tuple:
"""Decide what to do when both notify_on_complete and watch_patterns are set.
These flags produce duplicate, delayed notifications when combined: one
notification per watch-pattern match AND one on process exit, with async
delivery that can spam the user long after the process ends. When both are
set, we drop watch_patterns in favor of notify_on_complete (the more useful
"let me know when it's done" signal) and return a human-readable note.
Returns:
(watch_patterns_to_use, conflict_note). conflict_note is "" when there
is no conflict.
"""
if background and notify_on_complete and watch_patterns:
note = (
"watch_patterns ignored because notify_on_complete=True; "
"these two flags produce duplicate notifications when combined"
)
return None, note
return watch_patterns, ""
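To make the precedence rule concrete, here is a standalone copy of the resolver with the leading underscore dropped and type hints added (both are presentational assumptions), followed by the cases it distinguishes.

```python
from typing import List, Optional, Tuple


def resolve_notification_flag_conflict(
    *,
    notify_on_complete: bool,
    watch_patterns: Optional[List[str]],
    background: bool,
) -> Tuple[Optional[List[str]], str]:
    """Drop watch_patterns when notify_on_complete is also set on a background run."""
    if background and notify_on_complete and watch_patterns:
        note = (
            "watch_patterns ignored because notify_on_complete=True; "
            "these two flags produce duplicate notifications when combined"
        )
        return None, note
    return watch_patterns, ""


# Both flags on a background process: patterns are dropped and a note is returned.
patterns, note = resolve_notification_flag_conflict(
    notify_on_complete=True, watch_patterns=["ERROR"], background=True
)

# Patterns alone pass through unchanged with an empty note.
kept, no_note = resolve_notification_flag_conflict(
    notify_on_complete=False, watch_patterns=["listening on port"], background=True
)
```

A foreground call never triggers the conflict, since the function only intervenes when `background` is true.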
def terminal_tool(
command: str,
background: bool = False,
@@ -1410,8 +1437,8 @@ def terminal_tool(
force: If True, skip dangerous command check (use after user confirms)
workdir: Working directory for this command (optional, uses session cwd if not set)
pty: If True, use pseudo-terminal for interactive CLI tools (local backend only)
notify_on_complete: If True and background=True, auto-notify the agent when the process exits
watch_patterns: List of strings to watch for in background output; fires a notification on first match per pattern. Use ONLY for mid-process signals (errors, readiness markers) that appear before exit. For end-of-run markers use notify_on_complete instead; stacking both produces duplicate, delayed notifications.
notify_on_complete: If True and background=True, you'll be notified exactly once when the process exits. The right choice for almost every long task. MUTUALLY EXCLUSIVE with watch_patterns.
watch_patterns: List of strings to watch for in background output. HARD rate limit: 1 notification per 15s per process. After 3 strike windows in a row, watch_patterns is disabled and the session is auto-promoted to notify_on_complete. Use ONLY for rare, one-shot mid-process signals on long-lived processes (server readiness, migration-done markers). NEVER use in loops/batch jobs; error patterns there will hit the strike limit and get disabled. MUTUALLY EXCLUSIVE with notify_on_complete; set one, not both.
Returns:
str: JSON string with output, exit_code, and error fields
@@ -1701,6 +1728,22 @@ def terminal_tool(
proc_session.watcher_user_name = _gw_user_name
proc_session.watcher_thread_id = _gw_thread_id
# Mutual exclusion: if both notify_on_complete and watch_patterns
# are set, drop watch_patterns. The combination produces duplicate
# notifications (one per match + one on exit) that deliver
# asynchronously and can spam the user long after the process ends.
# notify_on_complete is the more useful signal for "let me know
# when the task finishes"; watch_patterns should be reserved for
# standalone mid-process signals on long-lived processes.
watch_patterns, conflict_note = _resolve_notification_flag_conflict(
notify_on_complete=bool(notify_on_complete),
watch_patterns=watch_patterns,
background=bool(background),
)
if conflict_note:
logger.warning("background proc %s: %s", proc_session.id, conflict_note)
result_data["watch_patterns_ignored"] = conflict_note
# Mark for agent notification on completion
if notify_on_complete and background:
proc_session.notify_on_complete = True
@@ -2039,13 +2082,13 @@ TERMINAL_SCHEMA = {
},
"notify_on_complete": {
"type": "boolean",
"description": "When true (and background=true), you'll be automatically notified when the process finishes — no polling needed. Use this for tasks that take a while (tests, builds, deployments) so you can keep working on other things in the meantime.",
"description": "When true (and background=true), you'll be automatically notified exactly once when the process finishes. **This is the right choice for almost every long-running task** — tests, builds, deployments, multi-item batch jobs, anything that takes over a minute and has a defined end. Use this and keep working on other things; the system notifies you on exit. MUTUALLY EXCLUSIVE with watch_patterns — when both are set, watch_patterns is dropped.",
"default": False
},
"watch_patterns": {
"type": "array",
"items": {"type": "string"},
"description": "Strings to watch for in background process output. Fires a notification the first time each pattern matches a line of output. **Use ONLY for mid-process signals** you want to react to before the process exits — errors, readiness markers, intermediate step markers (e.g. [\"ERROR\", \"Traceback\", \"listening on port\"]). Do NOT use for end-of-run markers (summary headers, 'DONE', 'PASS' printed right before exit) — use `notify_on_complete` for that instead. Stacking end-of-run patterns on top of `notify_on_complete` produces duplicate, delayed notifications that arrive after you've already moved on, since delivery is asynchronous and continues after the process exits."
"description": "Strings to watch for in background process output. HARD RATE LIMIT: at most 1 notification per 15 seconds per process — matches arriving inside the cooldown are dropped. After 3 consecutive 15-second windows with dropped matches, watch_patterns is automatically disabled for that process and promoted to notify_on_complete behavior (one notification on exit, no more mid-process spam). USE ONLY for truly rare, one-shot mid-process signals on LONG-LIVED processes that will never exit on their own — e.g. ['Application startup complete'] on a server so you know when to hit its endpoint, or ['migration done'] on a daemon. DO NOT use for: (1) end-of-run markers like 'DONE'/'PASS' — use notify_on_complete instead; (2) error patterns like 'ERROR'/'Traceback' in loops or multi-item batch jobs — they fire on every iteration and you'll hit the strike limit fast; (3) anything you'd ever combine with notify_on_complete. When in doubt, choose notify_on_complete. MUTUALLY EXCLUSIVE with notify_on_complete — set one, not both."
}
},
"required": ["command"]
+21 -3
@@ -202,6 +202,18 @@ TOOLSETS = {
"includes": []
},
"discord": {
"description": "Discord read and participate tools (fetch messages, search members, create threads)",
"tools": ["discord"],
"includes": [],
},
"discord_admin": {
"description": "Discord server management (list channels/roles, pin messages, assign roles)",
"tools": ["discord_admin"],
"includes": [],
},
"feishu_doc": {
"description": "Read Feishu/Lark document content",
"tools": ["feishu_doc_read"],
@@ -326,8 +338,8 @@ TOOLSETS = {
"hermes-discord": {
"description": "Discord bot toolset - full access (terminal has safety checks via dangerous command approval)",
"tools": _HERMES_CORE_TOOLS + [
# Discord server introspection & management (gated on DISCORD_BOT_TOKEN via check_fn)
"discord_server",
"discord",
"discord_admin",
],
"includes": []
},
@@ -388,7 +400,13 @@ TOOLSETS = {
"hermes-feishu": {
"description": "Feishu/Lark bot toolset - enterprise messaging via Feishu/Lark (full access)",
"tools": _HERMES_CORE_TOOLS,
"tools": _HERMES_CORE_TOOLS + [
"feishu_doc_read",
"feishu_drive_list_comments",
"feishu_drive_list_comment_replies",
"feishu_drive_reply_comment",
"feishu_drive_add_comment",
],
"includes": []
},
+3
@@ -15,6 +15,7 @@ import { Badge } from "@/components/ui/badge";
import { Button } from "@/components/ui/button";
import { usePageHeader } from "@/contexts/usePageHeader";
import { useI18n } from "@/i18n";
import { PluginSlot } from "@/plugins";
const PERIODS = [
{ label: "7d", days: 7 },
@@ -350,6 +351,7 @@ export default function AnalyticsPage() {
return (
<div className="flex flex-col gap-6">
<PluginSlot name="analytics:top" />
{loading && !data && (
<div className="flex items-center justify-center py-24">
<div className="h-6 w-6 animate-spin rounded-full border-2 border-primary border-t-transparent" />
@@ -409,6 +411,7 @@ export default function AnalyticsPage() {
</CardContent>
</Card>
)}
<PluginSlot name="analytics:bottom" />
</div>
);
}
+3
@@ -32,6 +32,7 @@ import { useSearchParams } from "react-router-dom";
import { ChatSidebar } from "@/components/ChatSidebar";
import { usePageHeader } from "@/contexts/usePageHeader";
import { useI18n } from "@/i18n";
import { PluginSlot } from "@/plugins";
function buildWsUrl(
token: string,
@@ -670,6 +671,7 @@ export default function ChatPage() {
return (
<div className="flex min-h-0 flex-1 flex-col gap-2 normal-case">
<PluginSlot name="chat:top" />
{mobileModelToolsPortal}
{banner && (
@@ -732,6 +734,7 @@ export default function ChatPage() {
</div>
)}
</div>
<PluginSlot name="chat:bottom" />
</div>
);
}
+3
@@ -39,6 +39,7 @@ import { Input } from "@/components/ui/input";
import { Badge } from "@/components/ui/badge";
import { useI18n } from "@/i18n";
import { usePageHeader } from "@/contexts/usePageHeader";
import { PluginSlot } from "@/plugins";
/* ------------------------------------------------------------------ */
/* Helpers */
@@ -313,6 +314,7 @@ export default function ConfigPage() {
return (
<div className="flex flex-col gap-4">
<PluginSlot name="config:top" />
<Toast toast={toast} />
{/* ═══════════════ Header Bar ═══════════════ */}
@@ -505,6 +507,7 @@ export default function ConfigPage() {
</div>
</div>
)}
<PluginSlot name="config:bottom" />
</div>
);
}
+3
@@ -14,6 +14,7 @@ import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { Select, SelectOption } from "@/components/ui/select";
import { useI18n } from "@/i18n";
import { PluginSlot } from "@/plugins";
function formatTime(iso?: string | null): string {
if (!iso) return "—";
@@ -149,6 +150,7 @@ export default function CronPage() {
return (
<div className="flex flex-col gap-6">
<PluginSlot name="cron:top" />
<Toast toast={toast} />
<DeleteConfirmDialog
@@ -346,6 +348,7 @@ export default function CronPage() {
</Card>
))}
</div>
<PluginSlot name="cron:bottom" />
</div>
);
}
+3
@@ -4,6 +4,7 @@ import { useI18n } from "@/i18n";
import { usePageHeader } from "@/contexts/usePageHeader";
import { buttonVariants } from "@/components/ui/button";
import { cn } from "@/lib/utils";
import { PluginSlot } from "@/plugins";
export const HERMES_DOCS_URL = "https://hermes-agent.nousresearch.com/docs/";
@@ -38,6 +39,7 @@ export default function DocsPage() {
"pt-1 sm:pt-2",
)}
>
<PluginSlot name="docs:top" />
<iframe
title={t.app.nav.documentation}
src={HERMES_DOCS_URL}
@@ -49,6 +51,7 @@ export default function DocsPage() {
sandbox="allow-scripts allow-same-origin allow-popups allow-forms"
referrerPolicy="no-referrer-when-downgrade"
/>
<PluginSlot name="docs:bottom" />
</div>
);
}
+3
@@ -27,6 +27,7 @@ import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { useI18n } from "@/i18n";
import { PluginSlot } from "@/plugins";
/* ------------------------------------------------------------------ */
/* Provider grouping */
@@ -511,6 +512,7 @@ export default function EnvPage() {
return (
<div className="flex flex-col gap-6">
<PluginSlot name="env:top" />
<Toast toast={toast} />
<DeleteConfirmDialog
@@ -610,6 +612,7 @@ export default function EnvPage() {
</Card>
);
})}
<PluginSlot name="env:bottom" />
</div>
);
}
+3
View File
@@ -9,6 +9,7 @@ import { Label } from "@/components/ui/label";
import { FilterGroup, Segmented } from "@/components/ui/segmented";
import { useI18n } from "@/i18n";
import { usePageHeader } from "@/contexts/usePageHeader";
import { PluginSlot } from "@/plugins";
const FILES = ["agent", "errors", "gateway"] as const;
const LEVELS = ["ALL", "DEBUG", "INFO", "WARNING", "ERROR"] as const;
@@ -141,6 +142,7 @@ export default function LogsPage() {
return (
<div className="flex flex-col gap-4">
<PluginSlot name="logs:top" />
{/* ═══════════════ Filter toolbar ═══════════════ */}
<div
role="toolbar"
@@ -215,6 +217,7 @@ export default function LogsPage() {
</div>
</CardContent>
</Card>
<PluginSlot name="logs:bottom" />
</div>
);
}
+3
View File
@@ -46,6 +46,7 @@ import { useSystemActions } from "@/contexts/useSystemActions";
import { useToast } from "@/hooks/useToast";
import { useI18n } from "@/i18n";
import { usePageHeader } from "@/contexts/usePageHeader";
import { PluginSlot } from "@/plugins";
import { isDashboardEmbeddedChatEnabled } from "@/lib/dashboard-flags";
const SOURCE_CONFIG: Record<string, { icon: typeof Terminal; color: string }> =
@@ -612,6 +613,7 @@ export default function SessionsPage() {
return (
<div className="flex flex-col gap-4">
<PluginSlot name="sessions:top" />
<Toast toast={toast} />
<DeleteConfirmDialog
@@ -834,6 +836,7 @@ export default function SessionsPage() {
)}
</>
)}
<PluginSlot name="sessions:bottom" />
</div>
);
}
+3
View File
@@ -25,6 +25,7 @@ import { Input } from "@/components/ui/input";
import { Switch } from "@/components/ui/switch";
import { useI18n } from "@/i18n";
import { usePageHeader } from "@/contexts/usePageHeader";
import { PluginSlot } from "@/plugins";
/* ------------------------------------------------------------------ */
/* Types & helpers */
@@ -251,6 +252,7 @@ export default function SkillsPage() {
return (
<div className="flex flex-col gap-4">
<PluginSlot name="skills:top" />
<Toast toast={toast} />
{/* ═══════════════ Filter panel + Content ═══════════════ */}
@@ -509,6 +511,7 @@ export default function SkillsPage() {
)}
</div>
</div>
<PluginSlot name="skills:bottom" />
</div>
);
}
+43
View File
@@ -18,6 +18,7 @@ import React, { Fragment, useEffect, useState } from "react";
/** Slot locations the built-in shell renders. Plugins declaring any of
* these in their manifest's `slots` field get wired in automatically.
*
* Shell-wide slots:
* - `backdrop` rendered inside `<Backdrop />`, above the noise layer
* - `header-left` injected before the Hermes brand in the top bar
* - `header-right` injected before the theme/language switchers
@@ -31,8 +32,31 @@ import React, { Fragment, useEffect, useState } from "react";
* - `overlay` fixed-position layer above everything else;
* useful for chrome (scanlines, vignettes) the
* theme's customCSS can't achieve alone
*
* Page-scoped slots (rendered inside a specific built-in page; use these
* to inject widgets, cards, or toolbars into existing pages without
* overriding the whole route):
* - `sessions:top` top of /sessions page (above session list)
* - `sessions:bottom` bottom of /sessions page
* - `analytics:top` top of /analytics page
* - `analytics:bottom` bottom of /analytics page
* - `logs:top` top of /logs page (above filter toolbar)
* - `logs:bottom` bottom of /logs page (below log viewer)
* - `cron:top` top of /cron page
* - `cron:bottom` bottom of /cron page
* - `skills:top` top of /skills page
* - `skills:bottom` bottom of /skills page
* - `config:top` top of /config page
* - `config:bottom` bottom of /config page
* - `env:top` top of /env (Keys) page
* - `env:bottom` bottom of /env (Keys) page
* - `docs:top` top of /docs page (above the docs iframe)
* - `docs:bottom` bottom of /docs page
* - `chat:top` top of /chat page (above the composer, when embedded chat is on)
* - `chat:bottom` bottom of /chat page
*/
export const KNOWN_SLOT_NAMES = [
// Shell-wide
"backdrop",
"header-left",
"header-right",
@@ -43,6 +67,25 @@ export const KNOWN_SLOT_NAMES = [
"footer-left",
"footer-right",
"overlay",
// Page-scoped
"sessions:top",
"sessions:bottom",
"analytics:top",
"analytics:bottom",
"logs:top",
"logs:bottom",
"cron:top",
"cron:bottom",
"skills:top",
"skills:bottom",
"config:top",
"config:bottom",
"env:top",
"env:bottom",
"docs:top",
"docs:bottom",
"chat:top",
"chat:bottom",
] as const;
export type KnownSlotName = (typeof KNOWN_SLOT_NAMES)[number];
@@ -1,336 +0,0 @@
---
sidebar_position: 16
title: "Dashboard Plugins"
description: "Build custom tabs and extensions for the Hermes web dashboard"
---
# Dashboard Plugins
Dashboard plugins let you add custom tabs to the web dashboard. A plugin can display its own UI, call the Hermes API, and optionally register backend endpoints — all without touching the dashboard source code.
## Quick Start
Create a plugin directory with a manifest and a JS file:
```bash
mkdir -p ~/.hermes/plugins/my-plugin/dashboard/dist
```
**manifest.json:**
```json
{
"name": "my-plugin",
"label": "My Plugin",
"icon": "Sparkles",
"version": "1.0.0",
"tab": {
"path": "/my-plugin",
"position": "after:skills"
},
"entry": "dist/index.js"
}
```
**dist/index.js:**
```javascript
(function () {
var SDK = window.__HERMES_PLUGIN_SDK__;
var React = SDK.React;
var Card = SDK.components.Card;
var CardHeader = SDK.components.CardHeader;
var CardTitle = SDK.components.CardTitle;
var CardContent = SDK.components.CardContent;
function MyPage() {
return React.createElement(Card, null,
React.createElement(CardHeader, null,
React.createElement(CardTitle, null, "My Plugin")
),
React.createElement(CardContent, null,
React.createElement("p", { className: "text-sm text-muted-foreground" },
"Hello from my custom dashboard tab!"
)
)
);
}
window.__HERMES_PLUGINS__.register("my-plugin", MyPage);
})();
```
Refresh the dashboard — your tab appears in the navigation bar.
## Plugin Structure
Plugins live inside the standard `~/.hermes/plugins/` directory. The dashboard extension is a `dashboard/` subfolder:
```
~/.hermes/plugins/my-plugin/
plugin.yaml # optional — existing CLI/gateway plugin manifest
__init__.py # optional — existing CLI/gateway hooks
dashboard/ # dashboard extension
manifest.json # required — tab config, icon, entry point
dist/
index.js # required — pre-built JS bundle
style.css # optional — custom CSS
plugin_api.py # optional — backend API routes
```
A single plugin can extend both the CLI/gateway (via `plugin.yaml` + `__init__.py`) and the dashboard (via `dashboard/`) from one directory.
## Manifest Reference
The `manifest.json` file describes your plugin to the dashboard:
```json
{
"name": "my-plugin",
"label": "My Plugin",
"description": "What this plugin does",
"icon": "Sparkles",
"version": "1.0.0",
"tab": {
"path": "/my-plugin",
"position": "after:skills"
},
"entry": "dist/index.js",
"css": "dist/style.css",
"api": "plugin_api.py"
}
```
| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Unique plugin identifier (lowercase, hyphens ok) |
| `label` | Yes | Display name shown in the nav tab |
| `description` | No | Short description |
| `icon` | No | Lucide icon name (default: `Puzzle`) |
| `version` | No | Semver version string |
| `tab.path` | Yes | URL path for the tab (e.g. `/my-plugin`) |
| `tab.position` | No | Where to insert the tab: `end` (default), `after:<tab>`, `before:<tab>` |
| `entry` | Yes | Path to the JS bundle relative to `dashboard/` |
| `css` | No | Path to a CSS file to inject |
| `api` | No | Path to a Python file with FastAPI routes |
### Tab Position
The `position` field controls where your tab appears in the navigation:
- `"end"` — after all built-in tabs (default)
- `"after:skills"` — after the Skills tab
- `"before:config"` — before the Config tab
- `"after:cron"` — after the Cron tab
The value after the colon is the path segment of the target tab (without the leading slash).
### Available Icons
Plugins can use any of these Lucide icon names:
`Activity`, `BarChart3`, `Clock`, `Code`, `Database`, `Eye`, `FileText`, `Globe`, `Heart`, `KeyRound`, `MessageSquare`, `Package`, `Puzzle`, `Settings`, `Shield`, `Sparkles`, `Star`, `Terminal`, `Wrench`, `Zap`
Unrecognized icon names fall back to `Puzzle`.
## Plugin SDK
Plugins don't bundle React or UI components — they use the SDK exposed on `window.__HERMES_PLUGIN_SDK__`. This avoids version conflicts and keeps plugin bundles tiny.
### SDK Contents
```javascript
var SDK = window.__HERMES_PLUGIN_SDK__;
// React
SDK.React // React instance
SDK.hooks.useState // React hooks
SDK.hooks.useEffect
SDK.hooks.useCallback
SDK.hooks.useMemo
SDK.hooks.useRef
SDK.hooks.useContext
SDK.hooks.createContext
// API
SDK.api // Hermes API client (getStatus, getSessions, etc.)
SDK.fetchJSON // Raw fetch for custom endpoints — handles auth automatically
// UI Components (shadcn/ui style)
SDK.components.Card
SDK.components.CardHeader
SDK.components.CardTitle
SDK.components.CardContent
SDK.components.Badge
SDK.components.Button
SDK.components.Input
SDK.components.Label
SDK.components.Select
SDK.components.SelectOption
SDK.components.Separator
SDK.components.Tabs
SDK.components.TabsList
SDK.components.TabsTrigger
// Utilities
SDK.utils.cn // Tailwind class merger (clsx + twMerge)
SDK.utils.timeAgo // "5m ago" from unix timestamp
SDK.utils.isoTimeAgo // "5m ago" from ISO string
// Hooks
SDK.useI18n // i18n translations
SDK.useTheme // Current theme info
```
### Using SDK.fetchJSON
For calling your plugin's backend API endpoints:
```javascript
SDK.fetchJSON("/api/plugins/my-plugin/data")
.then(function (result) {
console.log(result);
})
.catch(function (err) {
console.error("API call failed:", err);
});
```
`fetchJSON` automatically injects the session auth token, handles errors, and parses JSON.
### Using Existing API Methods
The `SDK.api` object has methods for all built-in Hermes endpoints:
```javascript
// Fetch agent status
SDK.api.getStatus().then(function (status) {
console.log("Version:", status.version);
});
// List sessions
SDK.api.getSessions(10).then(function (resp) {
console.log("Sessions:", resp.sessions.length);
});
```
## Backend API Routes
Plugins can register FastAPI routes by setting the `api` field in the manifest. Create a Python file that exports a `router`:
```python
# plugin_api.py
from fastapi import APIRouter
router = APIRouter()
@router.get("/data")
async def get_data():
return {"items": ["one", "two", "three"]}
@router.post("/action")
async def do_action(body: dict):
return {"ok": True, "received": body}
```
Routes are mounted at `/api/plugins/<name>/`, so the above becomes:
- `GET /api/plugins/my-plugin/data`
- `POST /api/plugins/my-plugin/action`
Plugin API routes bypass session token authentication since the dashboard server only binds to localhost.
### Accessing Hermes Internals
Backend routes can import from the hermes-agent codebase:
```python
from fastapi import APIRouter
from hermes_state import SessionDB
from hermes_cli.config import load_config
router = APIRouter()
@router.get("/session-count")
async def session_count():
db = SessionDB()
try:
count = len(db.list_sessions(limit=9999))
return {"count": count}
finally:
db.close()
```
## Custom CSS
If your plugin needs custom styles, add a CSS file and reference it in the manifest:
```json
{
"css": "dist/style.css"
}
```
The CSS file is injected as a `<link>` tag when the plugin loads. Use specific class names to avoid conflicts with the dashboard's existing styles.
```css
/* dist/style.css */
.my-plugin-chart {
border: 1px solid var(--color-border);
background: var(--color-card);
padding: 1rem;
}
```
You can use the dashboard's CSS custom properties (e.g. `--color-border`, `--color-foreground`) to match the active theme.
## Plugin Loading Flow
1. Dashboard loads — `main.tsx` exposes the SDK on `window.__HERMES_PLUGIN_SDK__`
2. `App.tsx` calls `usePlugins()` which fetches `GET /api/dashboard/plugins`
3. For each plugin: CSS `<link>` injected (if declared), JS `<script>` loaded
4. Plugin JS calls `window.__HERMES_PLUGINS__.register(name, Component)`
5. Dashboard adds the tab to navigation and mounts the component as a route
Plugins have up to 2 seconds to register after their script loads. If a plugin fails to load, the dashboard continues without it.
## Plugin Discovery
The dashboard scans these directories for `dashboard/manifest.json`:
1. **User plugins:** `~/.hermes/plugins/<name>/dashboard/manifest.json`
2. **Bundled plugins:** `<repo>/plugins/<name>/dashboard/manifest.json`
3. **Project plugins:** `./.hermes/plugins/<name>/dashboard/manifest.json` (only when `HERMES_ENABLE_PROJECT_PLUGINS` is set)
User plugins take precedence — if the same plugin name exists in multiple sources, the user version wins.
To force re-scanning after adding a new plugin without restarting the server:
```bash
curl http://127.0.0.1:9119/api/dashboard/plugins/rescan
```
## Plugin API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/dashboard/plugins` | GET | List discovered plugins |
| `/api/dashboard/plugins/rescan` | GET | Force re-scan for new plugins |
| `/dashboard-plugins/<name>/<path>` | GET | Serve plugin static assets |
| `/api/plugins/<name>/*` | * | Plugin-registered API routes |
## Example Plugin
The repository includes an example plugin at `plugins/example-dashboard/` that demonstrates:
- Using SDK components (Card, Badge, Button)
- Calling a backend API route
- Registering via `window.__HERMES_PLUGINS__.register()`
To try it, run `hermes dashboard` — the "Example" tab appears after Skills.
## Tips
- **No build step required** — write plain JavaScript IIFEs. If you prefer JSX, use any bundler (esbuild, Vite, webpack) targeting IIFE output with React as an external.
- **Keep bundles small** — React and all UI components are provided by the SDK. Your bundle should only contain your plugin logic.
- **Use theme variables** — reference `var(--color-*)` in CSS to automatically match whatever theme the user has selected.
- **Test locally** — run `hermes dashboard --no-open` and use browser dev tools to verify your plugin loads and registers correctly.
@@ -0,0 +1,904 @@
---
sidebar_position: 17
title: "Extending the Dashboard"
description: "Build themes and plugins for the Hermes web dashboard — palettes, typography, layouts, custom tabs, shell slots, page-scoped slots, and backend API routes"
---
# Extending the Dashboard
The Hermes web dashboard (`hermes dashboard`) is built to be reskinned and extended without forking the codebase. Three layers are exposed:
1. **Themes** — YAML files that repaint the dashboard's palette, typography, layout, and per-component chrome. Drop a file in `~/.hermes/dashboard-themes/`; it appears in the theme switcher.
2. **UI plugins** — a directory with `manifest.json` + a JavaScript bundle that registers a tab, replaces a built-in page, augments one via page-scoped slots, or injects components into named shell slots.
3. **Backend plugins** — a Python file inside that plugin directory that exposes a FastAPI `router`; routes are mounted under `/api/plugins/<name>/` and called from the plugin's UI.
All three are **drop-in at runtime**: no repo clone, no `npm run build`, no patching the dashboard source. This page is the canonical reference for all three.
If you just want to use the dashboard, see [Web Dashboard](./web-dashboard). If you want to reskin the terminal CLI (not the web dashboard), see [Skins & Themes](./skins) — the CLI skin system is unrelated to dashboard themes.
:::note How the pieces compose
Themes and plugins are independent but synergistic. A theme can stand alone (just a YAML file). A plugin can stand alone (just a tab). Together they let you build a complete visual reskin with custom HUDs — the bundled `strike-freedom-cockpit` demo does exactly that. See [Combined theme + plugin demo](#combined-theme--plugin-demo).
:::
---
## Table of contents
- [Themes](#themes)
- [Quick start — your first theme](#quick-start--your-first-theme)
- [Palette, typography, layout](#palette-typography-layout)
- [Layout variants](#layout-variants)
- [Theme assets (images as CSS vars)](#theme-assets-images-as-css-vars)
- [Component chrome overrides](#component-chrome-overrides)
- [Color overrides](#color-overrides)
- [Raw `customCSS`](#raw-customcss)
- [Built-in themes](#built-in-themes)
- [Full theme YAML reference](#full-theme-yaml-reference)
- [Plugins](#plugins)
- [Quick start — your first plugin](#quick-start--your-first-plugin)
- [Directory layout](#directory-layout)
- [Manifest reference](#manifest-reference)
- [The Plugin SDK](#the-plugin-sdk)
- [Shell slots](#shell-slots)
- [Replacing built-in pages (`tab.override`)](#replacing-built-in-pages-taboverride)
- [Augmenting built-in pages (page-scoped slots)](#augmenting-built-in-pages-page-scoped-slots)
- [Slot-only plugins (`tab.hidden`)](#slot-only-plugins-tabhidden)
- [Backend API routes](#backend-api-routes)
- [Custom CSS per plugin](#custom-css-per-plugin)
- [Plugin discovery & reload](#plugin-discovery--reload)
- [Combined theme + plugin demo](#combined-theme--plugin-demo)
- [API reference](#api-reference)
- [Troubleshooting](#troubleshooting)
---
## Themes
Themes are YAML files stored in `~/.hermes/dashboard-themes/`. The file name doesn't matter (the theme's `name:` field is what the system uses), but convention is `<name>.yaml`. Every field is optional — missing keys fall back to the built-in `default` theme, so a theme can be as small as one color.
### Quick start — your first theme
```bash
mkdir -p ~/.hermes/dashboard-themes
```
```yaml
# ~/.hermes/dashboard-themes/neon.yaml
name: neon
label: Neon
description: Pure magenta on black
palette:
background: "#000000"
midground: "#ff00ff"
```
Refresh the dashboard. Click the palette icon in the header and pick **Neon**. The background goes black, text and accents go magenta, and every derived color (card, border, muted, ring, etc.) is recomputed from those two colors via `color-mix()` in CSS. The omitted `foreground` layer falls back to the built-in `default` theme.
That's the whole onboarding: one file, two colors. Everything below is optional refinement.
### Palette, typography, layout
These three blocks are the heart of a theme. Each is independent — override one, leave the others.
#### Palette (3-layer)
The palette is a triplet of color layers plus a warm-glow vignette color and a noise-grain multiplier. The dashboard's design-system cascade derives every shadcn-compatible token (card, popover, muted, border, primary, destructive, ring, etc.) from this triplet via CSS `color-mix()`. Overriding three colors cascades into the whole UI.
| Key | Description |
|-----|-------------|
| `palette.background` | Deepest canvas color — typically near-black. Drives the page background and card fill. |
| `palette.midground` | Primary text and accent. Most UI chrome reads this (foreground text, button outlines, focus rings). |
| `palette.foreground` | Top-layer highlight. The default theme sets this to white at alpha 0 (invisible); themes that want a bright accent on top can raise its alpha. |
| `palette.warmGlow` | `rgba(...)` string used as the vignette color by `<Backdrop />`. |
| `palette.noiseOpacity` | 0–1.2 multiplier on the grain overlay. Lower = softer, higher = grittier. |
Each layer accepts either `{hex: "#RRGGBB", alpha: 0.0–1.0}` or a bare hex string (alpha defaults to 1.0).
```yaml
palette:
background:
hex: "#05091a"
alpha: 1.0
midground: "#d8f0ff" # bare hex, alpha = 1.0
foreground:
hex: "#ffffff"
alpha: 0 # invisible top layer
warmGlow: "rgba(255, 199, 55, 0.24)"
noiseOpacity: 0.7
```
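As a rough illustration of the two accepted layer shapes, the sketch below normalizes either form to a `{hex, alpha}` object. `normalizeLayer` is a hypothetical helper written for this page, not a dashboard function, and treating a missing `alpha` in the object form as 1.0 is an assumption.

```javascript
// Hypothetical helper (not part of the dashboard): normalizes a palette
// layer written as a bare hex string or as a {hex, alpha} object.
function normalizeLayer(layer) {
  if (typeof layer === "string") {
    // Bare hex string: alpha defaults to 1.0, as described above.
    return { hex: layer, alpha: 1.0 };
  }
  return {
    hex: layer.hex,
    // Assumption: an object without `alpha` is treated as fully opaque.
    alpha: layer.alpha === undefined ? 1.0 : layer.alpha,
  };
}

console.log(normalizeLayer("#d8f0ff"));
console.log(normalizeLayer({ hex: "#ffffff", alpha: 0 }));
```

Checking for `undefined` rather than using `||` keeps an explicit `alpha: 0` intact, which matters for the invisible top layer.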
#### Typography
| Key | Type | Description |
|-----|------|-------------|
| `fontSans` | string | CSS font-family stack for body copy (applied to `html`, `body`). |
| `fontMono` | string | CSS font-family stack for code blocks, `<code>`, `.font-mono` utilities. |
| `fontDisplay` | string | Optional heading/display stack. Falls back to `fontSans`. |
| `fontUrl` | string | Optional external stylesheet URL. Injected as `<link rel="stylesheet">` in `<head>` on theme switch. Same URL is never injected twice. Works with Google Fonts, Bunny Fonts, self-hosted `@font-face` sheets — anything linkable. |
| `baseSize` | string | Root font size — controls the rem scale. E.g. `"14px"`, `"16px"`. |
| `lineHeight` | string | Default line-height. E.g. `"1.5"`, `"1.65"`. |
| `letterSpacing` | string | Default letter-spacing. E.g. `"0"`, `"0.01em"`, `"-0.01em"`. |
```yaml
typography:
fontSans: '"Orbitron", "Eurostile", "Impact", sans-serif'
fontMono: '"Share Tech Mono", ui-monospace, monospace'
fontDisplay: '"Orbitron", "Eurostile", sans-serif'
fontUrl: "https://fonts.googleapis.com/css2?family=Orbitron:wght@400;500;600;700&family=Share+Tech+Mono&display=swap"
baseSize: "14px"
lineHeight: "1.5"
letterSpacing: "0.04em"
```
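The "same URL is never injected twice" behaviour of `fontUrl` can be pictured as a small de-duplicating loader. This is only a sketch of the idea: `createFontLoader` and the `injectLink` callback are hypothetical names, and the real dashboard may track injected URLs differently.

```javascript
// Hypothetical sketch: remembers which stylesheet URLs were already injected.
// `injectLink` is a caller-supplied function that adds the <link> tag.
function createFontLoader(injectLink) {
  const seen = new Set();
  return function loadFont(url) {
    if (!url || seen.has(url)) {
      return false; // nothing to do: no fontUrl, or already injected
    }
    seen.add(url);
    injectLink(url);
    return true;
  };
}

// Example with a stand-in injector that just records the URLs.
const injected = [];
const loadFont = createFontLoader((url) => injected.push(url));
loadFont("https://fonts.example/a.css");
loadFont("https://fonts.example/a.css"); // ignored, same URL
console.log(injected.length);
```

In a browser, `injectLink` would create a `<link rel="stylesheet">` element and append it to `document.head`.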
#### Layout
| Key | Values | Description |
|-----|--------|-------------|
| `radius` | any CSS length (`"0"`, `"0.25rem"`, `"0.5rem"`, `"1rem"`, ...) | Corner-radius token. Maps to `--radius` and cascades into `--radius-sm/md/lg/xl` — every rounded element shifts together. |
| `density` | `compact` \| `comfortable` \| `spacious` | Spacing multiplier applied as the `--spacing-mul` CSS var. `compact = 0.85×`, `comfortable = 1.0×` (default), `spacious = 1.2×`. Scales Tailwind's base spacing, so padding, gap, and space-between utilities all shift proportionally. |
```yaml
layout:
radius: "0"
density: compact
```
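To make the density mapping concrete, here is a minimal sketch that turns a `density` value into the `--spacing-mul` declaration using the multipliers from the table. `spacingVars` is a hypothetical helper, and falling back to `comfortable` for an unrecognized value is an assumption.

```javascript
// Multipliers taken from the table above.
const DENSITY_MULTIPLIERS = { compact: 0.85, comfortable: 1.0, spacious: 1.2 };

// Hypothetical helper: builds the CSS variable for a given density.
function spacingVars(density) {
  const known = Object.prototype.hasOwnProperty.call(DENSITY_MULTIPLIERS, density);
  // Assumption: unknown or missing values behave like the default.
  const mul = known ? DENSITY_MULTIPLIERS[density] : DENSITY_MULTIPLIERS.comfortable;
  return { "--spacing-mul": String(mul) };
}

console.log(spacingVars("compact"));
```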
### Layout variants
`layoutVariant` picks the overall shell layout. Defaults to `"standard"` when absent.
| Variant | Behaviour |
|---------|-----------|
| `standard` | Single column, 1600px max-width (default). |
| `cockpit` | Left sidebar rail (260px) + main content. Populated by plugins via the `sidebar` slot — see [Shell slots](#shell-slots). Without a plugin the rail shows a placeholder. |
| `tiled` | Drops the max-width clamp so pages can use the full viewport width. |
```yaml
layoutVariant: cockpit
```
The current variant is exposed as `document.documentElement.dataset.layoutVariant`, so raw CSS in `customCSS` can target it via `:root[data-layout-variant="cockpit"] ...`.
### Theme assets (images as CSS vars)
Ship artwork URLs with a theme. Each named slot becomes a CSS var (`--theme-asset-<name>`) that the built-in shell and any plugin can read. The `bg` slot is automatically wired into the backdrop; other slots are plugin-facing.
```yaml
assets:
bg: "https://example.com/hero-bg.jpg" # auto-wired into <Backdrop />
hero: "/my-images/strike-freedom.png" # for plugin sidebars
crest: "/my-images/crest.svg" # for header-left plugins
logo: "/my-images/logo.png"
sidebar: "/my-images/rail.png"
header: "/my-images/header-art.png"
custom:
scanLines: "/my-images/scanlines.png" # → --theme-asset-custom-scanLines
```
Values accept:
- Bare URLs — wrapped in `url(...)` automatically.
- Pre-wrapped `url(...)`, `linear-gradient(...)`, `radial-gradient(...)` expressions — used as-is.
- `"none"` — explicit opt-out.
Every asset is also emitted as `--theme-asset-<name>-raw` (the unwrapped URL), in case a plugin needs to pass it to `<img src>` instead of `background-image`.
Plugins read these with plain CSS or JS:
```javascript
// In a plugin slot
const hero = getComputedStyle(document.documentElement)
.getPropertyValue("--theme-asset-hero").trim();
```
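The wrapping rules for asset values can be sketched as a small function that returns both CSS vars for one asset. `assetVars` is a hypothetical name, and two details are assumptions: that bare URLs are wrapped with double quotes, and that the `-raw` var simply carries the value as written.

```javascript
// Hypothetical sketch of the asset value rules described above.
function assetVars(name, value) {
  const trimmed = value.trim();
  // Pre-wrapped CSS expressions and "none" are used as-is.
  const isCssExpression = /^(url|linear-gradient|radial-gradient)\(/.test(trimmed);
  const wrapped =
    trimmed === "none" || isCssExpression ? trimmed : `url("${trimmed}")`;
  return {
    [`--theme-asset-${name}`]: wrapped,
    // Assumption: the raw var holds the value exactly as written in the YAML.
    [`--theme-asset-${name}-raw`]: trimmed,
  };
}

console.log(assetVars("hero", "/my-images/strike-freedom.png"));
console.log(assetVars("custom-scanLines", "none"));
```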
### Component chrome overrides
`componentStyles` restyles individual shell components without writing CSS selectors. Each bucket's entries become CSS vars (`--component-<bucket>-<kebab-property>`) that the shell's shared components read. So `card:` overrides apply to every `<Card>`, `header:` to the app bar, etc.
```yaml
componentStyles:
card:
clipPath: "polygon(12px 0, 100% 0, 100% calc(100% - 12px), calc(100% - 12px) 100%, 0 100%, 0 12px)"
background: "linear-gradient(180deg, rgba(10, 22, 52, 0.85), rgba(5, 9, 26, 0.92))"
boxShadow: "inset 0 0 0 1px rgba(64, 200, 255, 0.28)"
header:
background: "linear-gradient(180deg, rgba(16, 32, 72, 0.95), rgba(5, 9, 26, 0.9))"
tab:
clipPath: "polygon(6px 0, 100% 0, calc(100% - 6px) 100%, 0 100%)"
sidebar: {}
backdrop: {}
footer: {}
progress: {}
badge: {}
page: {}
```
Supported buckets: `card`, `header`, `footer`, `sidebar`, `tab`, `progress`, `badge`, `backdrop`, `page`.
Property names use camelCase (`clipPath`) and are emitted as kebab (`clip-path`). Values are plain CSS strings — anything CSS accepts (`clip-path`, `border-image`, `background`, `box-shadow`, `animation`, ...).
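The naming rule can be sketched in a few lines. `componentVars` and `toKebab` are hypothetical helpers written to illustrate the `--component-<bucket>-<kebab-property>` pattern; the dashboard's own implementation may differ.

```javascript
// Hypothetical helper: camelCase -> kebab-case (clipPath -> clip-path).
function toKebab(name) {
  return name.replace(/[A-Z]/g, (ch) => "-" + ch.toLowerCase());
}

// Hypothetical helper: flattens a componentStyles object into CSS vars.
function componentVars(componentStyles) {
  const vars = {};
  for (const [bucket, props] of Object.entries(componentStyles)) {
    for (const [prop, value] of Object.entries(props)) {
      vars[`--component-${bucket}-${toKebab(prop)}`] = value;
    }
  }
  return vars;
}

console.log(componentVars({
  card: { boxShadow: "inset 0 0 0 1px rgba(64, 200, 255, 0.28)" },
  sidebar: {},
}));
```

Empty buckets such as `sidebar: {}` produce no vars in this sketch.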
### Color overrides
Most themes won't need this — the 3-layer palette derives every shadcn token. Use `colorOverrides` when you want a specific accent the derivation won't produce (a softer destructive red for a pastel theme, a specific success green for a brand).
```yaml
colorOverrides:
primary: "#ffce3a"
primaryForeground: "#05091a"
accent: "#3fd3ff"
ring: "#3fd3ff"
destructive: "#ff3a5e"
border: "rgba(64, 200, 255, 0.28)"
```
Supported keys: `card`, `cardForeground`, `popover`, `popoverForeground`, `primary`, `primaryForeground`, `secondary`, `secondaryForeground`, `muted`, `mutedForeground`, `accent`, `accentForeground`, `destructive`, `destructiveForeground`, `success`, `warning`, `border`, `input`, `ring`.
Each key maps 1:1 to the `--color-<kebab>` CSS var (e.g. `primaryForeground` → `--color-primary-foreground`). Any key set here wins over the palette cascade for the active theme only — switching to another theme clears the overrides.
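A minimal sketch of that key-to-variable mapping, restricted to the supported keys listed above. `colorOverrideVars` is a hypothetical helper, and silently skipping unsupported keys is an assumption about how unknown entries are handled.

```javascript
// Supported keys, copied from the list above.
const SUPPORTED_COLOR_KEYS = [
  "card", "cardForeground", "popover", "popoverForeground",
  "primary", "primaryForeground", "secondary", "secondaryForeground",
  "muted", "mutedForeground", "accent", "accentForeground",
  "destructive", "destructiveForeground", "success", "warning",
  "border", "input", "ring",
];

// Hypothetical helper: maps colorOverrides keys to --color-<kebab> vars.
function colorOverrideVars(overrides) {
  const vars = {};
  for (const [key, value] of Object.entries(overrides)) {
    // Assumption: keys outside the supported list are ignored.
    if (!SUPPORTED_COLOR_KEYS.includes(key)) continue;
    const kebab = key.replace(/[A-Z]/g, (ch) => "-" + ch.toLowerCase());
    vars[`--color-${kebab}`] = value;
  }
  return vars;
}

console.log(colorOverrideVars({ primaryForeground: "#05091a", ring: "#3fd3ff" }));
```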
### Raw `customCSS`
For selector-level chrome that `componentStyles` can't express — pseudo-elements, animations, media queries, theme-scoped overrides — drop raw CSS into `customCSS`:
```yaml
customCSS: |
/* Scanline overlay — only visible when cockpit variant is active. */
:root[data-layout-variant="cockpit"] body::before {
content: "";
position: fixed;
inset: 0;
pointer-events: none;
z-index: 100;
background: repeating-linear-gradient(to bottom,
transparent 0px, transparent 2px,
rgba(64, 200, 255, 0.035) 3px, rgba(64, 200, 255, 0.035) 4px);
mix-blend-mode: screen;
}
```
The CSS is injected as a single scoped `<style data-hermes-theme-css>` tag on theme apply and cleaned up on theme switch. **Capped at 32 KiB per theme.**
### Built-in themes
Each built-in ships its own palette, typography, and layout — switching produces visible changes beyond color alone.
| Theme | Palette | Typography | Layout |
|-------|---------|------------|--------|
| **Hermes Teal** (`default`) | Dark teal + cream | System stack, 15px | 0.5rem radius, comfortable |
| **Midnight** (`midnight`) | Deep blue-violet | Inter + JetBrains Mono, 14px | 0.75rem radius, comfortable |
| **Ember** (`ember`) | Warm crimson + bronze | Spectral (serif) + IBM Plex Mono, 15px | 0.25rem radius, comfortable |
| **Mono** (`mono`) | Grayscale | IBM Plex Sans + IBM Plex Mono, 13px | 0 radius, compact |
| **Cyberpunk** (`cyberpunk`) | Neon green on black | Share Tech Mono everywhere, 14px | 0 radius, compact |
| **Rosé** (`rose`) | Pink + ivory | Fraunces (serif) + DM Mono, 16px | 1rem radius, spacious |
Themes that reference Google Fonts (all except Hermes Teal) load the stylesheet on demand — the first time you switch to them a `<link>` tag is injected into `<head>`.
### Full theme YAML reference
Every knob in one file — copy and trim what you don't need:
```yaml
# ~/.hermes/dashboard-themes/ocean.yaml
name: ocean
label: Ocean Deep
description: Deep sea blues with coral accents
# 3-layer palette (accepts {hex, alpha} or bare hex)
palette:
background:
hex: "#0a1628"
alpha: 1.0
midground:
hex: "#a8d0ff"
alpha: 1.0
foreground:
hex: "#ffffff"
alpha: 0.0
warmGlow: "rgba(255, 107, 107, 0.35)"
noiseOpacity: 0.7
typography:
fontSans: "Poppins, system-ui, sans-serif"
fontMono: "Fira Code, ui-monospace, monospace"
fontDisplay: "Poppins, system-ui, sans-serif" # optional
fontUrl: "https://fonts.googleapis.com/css2?family=Poppins:wght@400;500;600&family=Fira+Code:wght@400;500&display=swap"
baseSize: "15px"
lineHeight: "1.6"
letterSpacing: "-0.003em"
layout:
radius: "0.75rem"
density: comfortable
layoutVariant: standard # standard | cockpit | tiled
assets:
bg: "https://example.com/ocean-bg.jpg"
hero: "/my-images/kraken.png"
crest: "/my-images/anchor.svg"
logo: "/my-images/logo.png"
custom:
pattern: "/my-images/waves.svg"
componentStyles:
card:
boxShadow: "inset 0 0 0 1px rgba(168, 208, 255, 0.18)"
header:
background: "linear-gradient(180deg, rgba(10, 22, 40, 0.95), rgba(5, 9, 26, 0.9))"
colorOverrides:
destructive: "#ff6b6b"
ring: "#ff6b6b"
customCSS: |
/* Any additional selector-level tweaks */
```
Refresh the dashboard after creating the file. Switch themes live from the header bar — click the palette icon. Selection persists to `config.yaml` under `dashboard.theme` and is restored on reload.
---
## Plugins
A dashboard plugin is a directory with a `manifest.json`, a pre-built JS bundle, and optionally a CSS file and a Python file with FastAPI routes. Plugins live next to other Hermes plugins in `~/.hermes/plugins/<name>/` — the dashboard extension is a `dashboard/` subfolder inside that plugin directory, so one plugin can extend both the CLI/gateway and the dashboard from a single install.
Plugins don't bundle React or UI components. They use the **Plugin SDK** exposed on `window.__HERMES_PLUGIN_SDK__`. This keeps plugin bundles tiny (typically a few KB) and avoids version conflicts.
### Quick start — your first plugin
Create the directory structure:
```bash
mkdir -p ~/.hermes/plugins/my-plugin/dashboard/dist
```
Write the manifest to `~/.hermes/plugins/my-plugin/dashboard/manifest.json`:
```json
{
"name": "my-plugin",
"label": "My Plugin",
"icon": "Sparkles",
"version": "1.0.0",
"tab": {
"path": "/my-plugin",
"position": "after:skills"
},
"entry": "dist/index.js"
}
```
Write the JS bundle (a plain IIFE — no build step needed):
```javascript
// ~/.hermes/plugins/my-plugin/dashboard/dist/index.js
(function () {
"use strict";
const SDK = window.__HERMES_PLUGIN_SDK__;
const { React } = SDK;
const { Card, CardHeader, CardTitle, CardContent } = SDK.components;
function MyPage() {
return React.createElement(Card, null,
React.createElement(CardHeader, null,
React.createElement(CardTitle, null, "My Plugin"),
),
React.createElement(CardContent, null,
React.createElement("p", { className: "text-sm text-muted-foreground" },
"Hello from my custom dashboard tab.",
),
),
);
}
window.__HERMES_PLUGINS__.register("my-plugin", MyPage);
})();
```
Refresh the dashboard — your tab appears in the nav bar, after **Skills**.
:::tip Skip React.createElement
If you prefer JSX, use any bundler (esbuild, Vite, rollup) with React as an external and IIFE output. The only hard requirement is that the final file is a single JS file loadable via `<script>`. React is never bundled; it comes from `SDK.React`.
:::
### Directory layout
```
~/.hermes/plugins/my-plugin/
├── plugin.yaml # optional — existing CLI/gateway plugin manifest
├── __init__.py # optional — existing CLI/gateway hooks
└── dashboard/ # dashboard extension
├── manifest.json # required — tab config, icon, entry point
├── dist/
│ ├── index.js # required — pre-built JS bundle (IIFE)
│ └── style.css # optional — custom CSS
└── plugin_api.py # optional — backend API routes (FastAPI)
```
A single plugin directory can carry three orthogonal extensions:
- `plugin.yaml` + `__init__.py` — CLI/gateway plugin ([see plugins page](./plugins)).
- `dashboard/manifest.json` + `dashboard/dist/index.js` — dashboard UI plugin.
- `dashboard/plugin_api.py` — dashboard backend routes.
None of them are required; include only the layers you need.
### Manifest reference
```json
{
"name": "my-plugin",
"label": "My Plugin",
"description": "What this plugin does",
"icon": "Sparkles",
"version": "1.0.0",
"tab": {
"path": "/my-plugin",
"position": "after:skills",
"override": "/",
"hidden": false
},
"slots": ["sidebar", "header-left"],
"entry": "dist/index.js",
"css": "dist/style.css",
"api": "plugin_api.py"
}
```
| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Unique plugin identifier. Lowercase, hyphens ok. Used in URLs and registration. |
| `label` | Yes | Display name shown in the nav tab. |
| `description` | No | Short description (shown in dashboard admin surfaces). |
| `icon` | No | Lucide icon name. Defaults to `Puzzle`. Unknown names fall back to `Puzzle`. |
| `version` | No | Semver string. Defaults to `0.0.0`. |
| `tab.path` | Yes | URL path for the tab (e.g. `/my-plugin`). |
| `tab.position` | No | Where to insert the tab. `"end"` (default), `"after:<path>"`, or `"before:<path>"` — value after the colon is the **path segment** of the target tab (no leading slash). Examples: `"after:skills"`, `"before:config"`. |
| `tab.override` | No | Set to a built-in route path (`"/"`, `"/sessions"`, `"/config"`, ...) to **replace** that page instead of adding a new tab. See [Replacing built-in pages](#replacing-built-in-pages-taboverride). |
| `tab.hidden` | No | When true, register the component and any slots without adding a tab to the nav. Used by slot-only plugins. See [Slot-only plugins](#slot-only-plugins-tabhidden). |
| `slots` | No | Named shell slots this plugin populates. **Documentation aid only** — actual registration happens from the JS bundle via `registerSlot()`. Listing slots here makes discovery surfaces more informative. |
| `entry` | Yes | Path to the JS bundle relative to `dashboard/`. Defaults to `dist/index.js`. |
| `css` | No | Path to a CSS file to inject as a `<link>` tag. |
| `api` | No | Path to a Python file with FastAPI routes. Mounted at `/api/plugins/<name>/`. |
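The defaults in the table can be summarised as a small normalisation step. The following is a minimal Python sketch, not the dashboard's actual loader: `normalise_manifest` is a hypothetical helper, the `tab.hidden` default of `False` is an assumption, and because `entry` is marked required but also has a documented default, this sketch fills it in instead of rejecting the manifest.

```python
from copy import deepcopy


def normalise_manifest(raw: dict) -> dict:
    """Return a copy of `raw` with the documented defaults filled in.

    Hypothetical helper for illustration; raises ValueError when a field
    the table marks as required (and that has no default) is missing.
    """
    manifest = deepcopy(raw)
    tab = manifest.setdefault("tab", {})

    # `entry` has a documented default, so fill it in rather than fail.
    manifest.setdefault("entry", "dist/index.js")

    missing = [key for key in ("name", "label") if not manifest.get(key)]
    if not tab.get("path"):
        missing.append("tab.path")
    if missing:
        raise ValueError("manifest is missing required fields: " + ", ".join(missing))

    manifest.setdefault("icon", "Puzzle")      # documented fallback icon
    manifest.setdefault("version", "0.0.0")    # documented default version
    manifest.setdefault("slots", [])
    tab.setdefault("position", "end")          # documented default position
    tab.setdefault("hidden", False)            # assumption: hidden is off by default
    return manifest
```

Running a manifest through a check like this before copying it into `~/.hermes/plugins/` can catch a missing `label` or `tab.path` earlier than a silent absence from the nav bar would.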
#### Available icons
Plugins use Lucide icon names. The dashboard maps these by name — unknown names silently fall back to `Puzzle`.
Currently mapped: `Activity`, `BarChart3`, `Clock`, `Code`, `Database`, `Eye`, `FileText`, `Globe`, `Heart`, `KeyRound`, `MessageSquare`, `Package`, `Puzzle`, `Settings`, `Shield`, `Sparkles`, `Star`, `Terminal`, `Wrench`, `Zap`.
Need a different icon? Open a PR to `web/src/App.tsx`'s `ICON_MAP` — pure additive change.
### The Plugin SDK
Everything a plugin needs is on `window.__HERMES_PLUGIN_SDK__`. Plugins should never import React directly.
```javascript
const SDK = window.__HERMES_PLUGIN_SDK__;
// React + hooks
SDK.React // the React instance
SDK.hooks.useState
SDK.hooks.useEffect
SDK.hooks.useCallback
SDK.hooks.useMemo
SDK.hooks.useRef
SDK.hooks.useContext
SDK.hooks.createContext
// UI components (shadcn/ui primitives)
SDK.components.Card
SDK.components.CardHeader
SDK.components.CardTitle
SDK.components.CardContent
SDK.components.Badge
SDK.components.Button
SDK.components.Input
SDK.components.Label
SDK.components.Select
SDK.components.SelectOption
SDK.components.Separator
SDK.components.Tabs
SDK.components.TabsList
SDK.components.TabsTrigger
SDK.components.PluginSlot // render a named slot (useful for nested plugin UIs)
// Hermes API client + raw fetcher
SDK.api // typed client — getStatus, getSessions, getConfig, ...
SDK.fetchJSON // raw fetch for custom endpoints (plugin-registered routes)
// Utilities
SDK.utils.cn // Tailwind class merger (clsx + twMerge)
SDK.utils.timeAgo // "5m ago" from unix timestamp
SDK.utils.isoTimeAgo // "5m ago" from ISO string
// Hooks
SDK.useI18n // i18n hook for multi-language plugins
```
#### Calling your plugin's backend
```javascript
SDK.fetchJSON("/api/plugins/my-plugin/data")
.then((data) => console.log(data))
.catch((err) => console.error("API call failed:", err));
```
`fetchJSON` injects the session auth token, surfaces errors as thrown exceptions, and parses JSON automatically.
#### Calling built-in Hermes endpoints
```javascript
// Agent status
SDK.api.getStatus().then((s) => console.log("Version:", s.version));
// Recent sessions
SDK.api.getSessions(10).then((resp) => console.log(resp.sessions.length));
```
See [Web Dashboard → REST API](./web-dashboard#rest-api) for the full list.
### Shell slots
Slots let a plugin inject components into named locations of the app shell — the cockpit sidebar, the header, the footer, an overlay layer — without claiming a whole tab. Multiple plugins can populate the same slot; they render stacked in registration order.
Register from inside the plugin bundle:
```javascript
window.__HERMES_PLUGINS__.registerSlot("my-plugin", "sidebar", MySidebar);
window.__HERMES_PLUGINS__.registerSlot("my-plugin", "header-left", MyCrest);
```
#### Slot catalogue
**Shell-wide slots** (render anywhere in the app chrome):
| Slot | Location |
|------|----------|
| `backdrop` | Inside the `<Backdrop />` layer stack, above the noise layer. |
| `header-left` | Before the Hermes brand in the top bar. |
| `header-right` | Before the theme/language switchers in the top bar. |
| `header-banner` | Full-width strip below the nav. |
| `sidebar` | Cockpit sidebar rail — **only rendered when `layoutVariant === "cockpit"`**. |
| `pre-main` | Above the route outlet (inside `<main>`). |
| `post-main` | Below the route outlet (inside `<main>`). |
| `footer-left` | Footer cell content (replaces default). |
| `footer-right` | Footer cell content (replaces default). |
| `overlay` | Fixed-position layer above everything else. Useful for chrome (scanlines, vignettes) `customCSS` can't achieve alone. |
**Page-scoped slots** (render only on the named built-in page — use these to inject widgets, cards, or toolbars into an existing page without overriding the whole route):
| Slot | Where it renders |
|------|------------------|
| `sessions:top` / `sessions:bottom` | Top / bottom of the `/sessions` page. |
| `analytics:top` / `analytics:bottom` | Top / bottom of the `/analytics` page. |
| `logs:top` / `logs:bottom` | Top (above filter toolbar) / bottom (below log viewer) of `/logs`. |
| `cron:top` / `cron:bottom` | Top / bottom of the `/cron` page. |
| `skills:top` / `skills:bottom` | Top / bottom of the `/skills` page. |
| `config:top` / `config:bottom` | Top / bottom of the `/config` page. |
| `env:top` / `env:bottom` | Top / bottom of the `/env` (Keys) page. |
| `docs:top` / `docs:bottom` | Top (above the iframe) / bottom of `/docs`. |
| `chat:top` / `chat:bottom` | Top / bottom of `/chat` (only active when embedded chat is enabled). |
Example — add a banner card to the top of the Sessions page:
```javascript
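// Runs inside your plugin's IIFE; pull React and the components from the SDK.
const SDK = window.__HERMES_PLUGIN_SDK__;
const { React } = SDK;
const { Card, CardContent } = SDK.components;
function PinnedSessionsBanner() {
  return React.createElement(Card, null,
    React.createElement(CardContent, { className: "py-2 text-xs" },
      "Pinned note injected by my-plugin"),
  );
}
window.__HERMES_PLUGINS__.registerSlot("my-plugin", "sessions:top", PinnedSessionsBanner);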
```
Combine page-scoped slots with `tab.hidden: true` if your plugin only augments existing pages and doesn't need a nav tab of its own.
The shell only renders `<PluginSlot name="..." />` for the slots above. Additional names are accepted by the registry for nested plugin UIs — a plugin can expose its own slots via `SDK.components.PluginSlot`.
#### Re-registration and HMR
If the same `(plugin, slot)` pair is registered twice, the later call replaces the earlier one — this matches how React HMR expects plugin re-mounts to behave.
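The two registry rules (stack in registration order, replace on re-registration) can be modelled in a few lines. This is a Python model of the semantics only, not the dashboard's TypeScript registry; `SlotRegistry` and its method names are hypothetical, and keeping a re-registered component at its original position in the stack is an assumption.

```python
from collections import defaultdict


class SlotRegistry:
    """Toy model of slot registration semantics (illustrative only)."""

    def __init__(self):
        # slot name -> {plugin name: component}; dicts keep insertion order,
        # which gives "stacked in registration order" for free.
        self._slots = defaultdict(dict)

    def register_slot(self, plugin, slot, component):
        # Assigning to an existing key replaces the component but keeps the
        # plugin's original position in the stack (an assumption here).
        self._slots[slot][plugin] = component

    def components_for(self, slot):
        # .get avoids creating empty entries for slots nobody registered.
        return list(self._slots.get(slot, {}).values())


registry = SlotRegistry()
registry.register_slot("plugin-a", "sidebar", "A v1")
registry.register_slot("plugin-b", "sidebar", "B v1")
registry.register_slot("plugin-a", "sidebar", "A v2")  # hot reload of plugin-a
stack = registry.components_for("sidebar")
```

After the third call the sidebar still holds two entries: the second registration from `plugin-a` replaced its first instead of stacking a duplicate.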
### Replacing built-in pages (`tab.override`)
Setting `tab.override` to a built-in route path makes the plugin's component replace that page instead of adding a new tab. Useful when a theme wants a custom home page (`/`) but wants to keep the rest of the dashboard intact.
```json
{
"name": "my-home",
"label": "Home",
"tab": {
"path": "/my-home",
"override": "/",
"position": "end"
},
"entry": "dist/index.js"
}
```
With `override` set:
- The original page component at `/` is removed from the router.
- Your plugin renders at `/` instead.
- No nav tab is added for `tab.path` (the override is the point).
Only one plugin can override a given path. If two plugins claim the same override, the first wins and the second is ignored with a dev-mode warning.
If you only need to add a card or toolbar to an existing page without taking it over, use [page-scoped slots](#augmenting-built-in-pages-page-scoped-slots) instead.
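The "first override wins" rule described above can be sketched as follows. This is an illustrative Python model, assuming manifests are processed in load order; `resolve_overrides` and the warning text are hypothetical and do not mirror the dashboard's actual messages.

```python
def resolve_overrides(manifests):
    """Map each overridden built-in path to the first plugin that claims it.

    Returns (owners, warnings). Later claims on an already-owned path are
    ignored and reported, mirroring the dev-mode warning described above.
    """
    owners = {}
    warnings = []
    for manifest in manifests:
        path = manifest.get("tab", {}).get("override")
        if not path:
            continue  # plugin adds a normal tab or is slot-only
        if path in owners:
            warnings.append(
                f"{manifest['name']}: override of {path!r} ignored, "
                f"already claimed by {owners[path]}"
            )
            continue
        owners[path] = manifest["name"]
    return owners, warnings


owners, warnings = resolve_overrides([
    {"name": "my-home", "tab": {"path": "/my-home", "override": "/"}},
    {"name": "other-home", "tab": {"path": "/other", "override": "/"}},
    {"name": "plain-tab", "tab": {"path": "/plain"}},
])
```

Here `my-home` ends up owning `/`, and `other-home` produces one warning and no route.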
### Augmenting built-in pages (page-scoped slots)
Full replacement via `tab.override` is heavy — your plugin now owns the entire page, including any future updates we ship to it. Most of the time you just want to add a banner, card, or toolbar to an existing page. That's what **page-scoped slots** are for.
Every built-in page exposes `<page>:top` and `<page>:bottom` slots rendered at the top and bottom of its content area. Your plugin populates one by calling `registerSlot()` — the built-in page keeps working normally, and your component renders alongside it.
Available slots: `sessions:*`, `analytics:*`, `logs:*`, `cron:*`, `skills:*`, `config:*`, `env:*`, `docs:*`, `chat:*` (each with `:top` and `:bottom`). See the full catalogue in [Shell slots → Slot catalogue](#slot-catalogue).
Minimal example — pin a banner to the top of the Sessions page:
```json
// ~/.hermes/plugins/session-notes/dashboard/manifest.json
{
"name": "session-notes",
"label": "Session Notes",
"tab": { "path": "/session-notes", "hidden": true },
"slots": ["sessions:top"],
"entry": "dist/index.js"
}
```
```javascript
// ~/.hermes/plugins/session-notes/dashboard/dist/index.js
(function () {
  const SDK = window.__HERMES_PLUGIN_SDK__;
  const { React } = SDK;
  const { Card, CardContent } = SDK.components;
  function Banner() {
    return React.createElement(Card, null,
      React.createElement(CardContent, { className: "py-2 text-xs" },
        "Remember to label important sessions before archiving."),
    );
  }
  // Placeholder for the hidden tab.
  window.__HERMES_PLUGINS__.register("session-notes", function () { return null; });
  // The real work.
  window.__HERMES_PLUGINS__.registerSlot("session-notes", "sessions:top", Banner);
})();
```
Key points:
- `tab.hidden: true` keeps the plugin out of the navigation; it contributes no tab of its own, only the placeholder component registered for direct URL visits.
- The `slots` manifest field is documentation only. The actual binding happens in the JS bundle via `registerSlot()`.
- Multiple plugins can claim the same page-scoped slot. They render stacked in registration order.
- Zero footprint when no plugin registers: the built-in page renders exactly as before.
The bundled `example-dashboard` plugin ships a live demo that injects a banner into `sessions:top` — install it to see the pattern end-to-end.
### Slot-only plugins (`tab.hidden`)
When `tab.hidden: true`, the plugin registers its component (for direct URL visits) and any slots, but never adds a tab to the navigation. Used by plugins that only exist to inject into slots — a header crest, a sidebar HUD, an overlay.
```json
{
"name": "header-crest",
"label": "Header Crest",
"tab": {
"path": "/header-crest",
"position": "end",
"hidden": true
},
"slots": ["header-left"],
"entry": "dist/index.js"
}
```
The bundle still calls `register()` with a placeholder component (good practice in case someone hits the URL directly) and then `registerSlot()` to do the real work.
### Backend API routes
Plugins can register FastAPI routes by setting `api` in the manifest. Create the file and export a `router`:
```python
# ~/.hermes/plugins/my-plugin/dashboard/plugin_api.py
from fastapi import APIRouter

router = APIRouter()

@router.get("/data")
async def get_data():
    return {"items": ["one", "two", "three"]}

@router.post("/action")
async def do_action(body: dict):
    return {"ok": True, "received": body}
```
Routes are mounted under `/api/plugins/<name>/`, so the above becomes:
- `GET /api/plugins/my-plugin/data`
- `POST /api/plugins/my-plugin/action`
Plugin API routes bypass session-token authentication since the dashboard server binds to localhost by default. **Don't expose the dashboard on a public interface with `--host 0.0.0.0` if you run untrusted plugins** — their routes become reachable too.
#### Accessing Hermes internals
Backend routes run inside the dashboard process, so they can import from the hermes-agent codebase directly:
```python
from fastapi import APIRouter
from hermes_state import SessionDB
from hermes_cli.config import load_config

router = APIRouter()

@router.get("/session-count")
async def session_count():
    db = SessionDB()
    try:
        count = len(db.list_sessions(limit=9999))
        return {"count": count}
    finally:
        db.close()

@router.get("/config-snapshot")
async def config_snapshot():
    cfg = load_config()
    return {"model": cfg.get("model", {})}
```
### Custom CSS per plugin
If your plugin needs styles beyond Tailwind classes and inline `style=`, add a CSS file and reference it in the manifest:
```json
{
"css": "dist/style.css"
}
```
The file is injected as a `<link>` tag on plugin load. Use specific class names to avoid conflicts with the dashboard's styles, and reference the dashboard's CSS vars to stay theme-aware:
```css
/* dist/style.css */
.my-plugin-chart {
border: 1px solid var(--color-border);
background: var(--color-card);
color: var(--color-card-foreground);
padding: 1rem;
}
.my-plugin-chart:hover {
border-color: var(--color-ring);
}
```
The dashboard exposes every shadcn token as `--color-*` plus theme extras (`--theme-asset-*`, `--component-<bucket>-*`, `--radius`, `--spacing-mul`). Reference those and your plugin automatically reskins with the active theme.
### Plugin discovery & reload
The dashboard scans the following directories for `dashboard/manifest.json`, in priority order:
| Priority | Directory | Source label |
|----------|-----------|--------------|
| 1 (wins on conflict) | `~/.hermes/plugins/<name>/dashboard/` | `user` |
| 2 | `<repo>/plugins/memory/<name>/dashboard/` | `bundled` |
| 2 | `<repo>/plugins/<name>/dashboard/` | `bundled` |
| 3 | `./.hermes/plugins/<name>/dashboard/` | `project` — only when `HERMES_ENABLE_PROJECT_PLUGINS` is set |
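The priority table can be read as "scan the roots in order and keep the first manifest seen for each plugin name". The sketch below illustrates that reading with the standard library only. `discover_plugins` and `default_roots` are hypothetical names, not the dashboard's functions; treating any non-empty `HERMES_ENABLE_PROJECT_PLUGINS` value as "set" and falling back to the directory name when `name` is absent are both assumptions.

```python
import json
import os
from pathlib import Path


def discover_plugins(roots):
    """`roots` is an ordered list of (directory, source_label), highest priority first."""
    found = {}
    for root, source in roots:
        root = Path(root)
        if not root.is_dir():
            continue
        for manifest_path in sorted(root.glob("*/dashboard/manifest.json")):
            try:
                manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
            except (OSError, json.JSONDecodeError):
                continue  # skip unreadable or malformed manifests
            if not isinstance(manifest, dict):
                continue
            # Fall back to the plugin directory name (assumption).
            name = manifest.get("name") or manifest_path.parent.parent.name
            # setdefault keeps the first, highest-priority hit per name.
            found.setdefault(name, {"source": source, "manifest": manifest})
    return found


def default_roots(repo_root, home=None, cwd=None):
    """Build the root list from the table above."""
    home = Path(home) if home else Path.home()
    cwd = Path(cwd) if cwd else Path.cwd()
    roots = [
        (home / ".hermes" / "plugins", "user"),
        (Path(repo_root) / "plugins" / "memory", "bundled"),
        (Path(repo_root) / "plugins", "bundled"),
    ]
    if os.environ.get("HERMES_ENABLE_PROJECT_PLUGINS"):
        roots.append((cwd / ".hermes" / "plugins", "project"))
    return roots
```

With this model, a user copy of a plugin in `~/.hermes/plugins/` shadows a bundled plugin of the same name, which matches the "wins on conflict" note in the table.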
Discovery results are cached per dashboard process. After adding a new plugin, either:
```bash
# Force a rescan without restart
curl http://127.0.0.1:9119/api/dashboard/plugins/rescan
```
…or restart `hermes dashboard`.
#### Plugin load lifecycle
1. Dashboard loads. `main.tsx` exposes the SDK on `window.__HERMES_PLUGIN_SDK__` and the registry on `window.__HERMES_PLUGINS__`.
2. `App.tsx` calls `usePlugins()` → fetches `GET /api/dashboard/plugins`.
3. For each manifest: CSS `<link>` is injected (if declared), then a `<script>` tag loads the JS bundle.
4. The plugin's IIFE runs and calls `window.__HERMES_PLUGINS__.register(name, Component)` — and optionally `.registerSlot(name, slot, Component)` for each slot.
5. The dashboard resolves the registered component against the manifest, adds the tab to navigation (unless `hidden`), and mounts the component as a route.
Plugins have up to **2 seconds** after their script loads to call `register()`. After that the dashboard stops waiting and finishes initial render. If a plugin later registers, it still appears — the nav is reactive.
If a plugin's script fails to load (404, syntax error, exception during IIFE), the dashboard logs a warning to the browser console and continues without it.
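Step 5 places the tab according to `tab.position` from the manifest. A rough Python sketch of that placement logic follows. `insert_tab` is a hypothetical helper, the nav is modelled as a plain list of route paths, and appending to the end when the target tab is not found is an assumption, not documented behaviour.

```python
def insert_tab(nav, tab_path, position="end"):
    """Return a new nav list with `tab_path` placed according to `position`.

    `position` is "end", "after:<segment>" or "before:<segment>", where
    <segment> is the target tab's path without the leading slash.
    """
    nav = list(nav)  # never mutate the caller's list
    mode, _, target = position.partition(":")
    if mode in ("after", "before") and target:
        target_path = "/" + target.lstrip("/")
        if target_path in nav:
            index = nav.index(target_path)
            nav.insert(index + 1 if mode == "after" else index, tab_path)
            return nav
    # "end", an unknown mode, or a missing target: append (assumption).
    nav.append(tab_path)
    return nav


builtin_nav = ["/sessions", "/skills", "/config"]
with_plugin = insert_tab(builtin_nav, "/my-plugin", "after:skills")
```

In this model `"after:skills"` and `"before:config"` produce the same ordering, because `/config` directly follows `/skills` in the example nav.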
---
## Combined theme + plugin demo
The repo ships `plugins/strike-freedom-cockpit/` as a complete reskin demo. It pairs a theme YAML with a slot-only plugin to produce a cockpit-style HUD without forking the dashboard.
**What it demonstrates:**
- A full theme using palette, typography, `fontUrl`, `layoutVariant: cockpit`, `assets`, `componentStyles` (notched card corners, gradient backgrounds), `colorOverrides`, and `customCSS` (scanline overlay).
- A slot-only plugin (`tab.hidden: true`) that registers into three slots:
- `sidebar` — an MS-STATUS panel with live telemetry bars driven by `SDK.api.getStatus()`.
- `header-left` — a faction crest that reads `--theme-asset-crest` from the active theme.
- `footer-right` — a custom tagline replacing the default org line.
- The plugin reads theme-supplied artwork via CSS vars, so swapping themes changes the hero/crest without plugin code changes.
**Install:**
```bash
# Theme
cp plugins/strike-freedom-cockpit/theme/strike-freedom.yaml \
~/.hermes/dashboard-themes/
# Plugin
cp -r plugins/strike-freedom-cockpit ~/.hermes/plugins/
```
Open the dashboard, pick **Strike Freedom** from the theme switcher. The cockpit sidebar appears, the crest shows in the header, the tagline replaces the footer. Switch back to **Hermes Teal** and the plugin remains installed but invisible (the `sidebar` slot only renders under the `cockpit` layout variant).
Read the plugin source (`plugins/strike-freedom-cockpit/dashboard/dist/index.js`) to see how it reads CSS vars, guards against older dashboards without slot support, and registers three slots from one bundle.
---
## API reference
### Theme endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/dashboard/themes` | GET | List available themes + active name. Built-ins return `{name, label, description}`; user themes also include a `definition` field with the full normalised theme object. |
| `/api/dashboard/theme` | PUT | Set active theme. Body: `{"name": "midnight"}`. Persists to `config.yaml` under `dashboard.theme`. |
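For scripting a theme switch outside the browser, the PUT call can be built with the Python standard library. This is a sketch under assumptions: the base URL is the one used by the `curl` examples in this guide, the response is assumed to be JSON, and if the endpoint requires the session token that `fetchJSON` injects, this unauthenticated request will be rejected. `build_set_theme_request` and `set_theme` are illustrative names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9119"  # same address as the curl examples


def build_set_theme_request(name, base_url=BASE_URL):
    """Build (but do not send) the PUT request that selects a theme."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/dashboard/theme",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_theme(name, base_url=BASE_URL):
    """Send the request; assumes the dashboard is running and replies with JSON."""
    request = build_set_theme_request(name, base_url)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect without a running dashboard.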
### Plugin endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/dashboard/plugins` | GET | List discovered plugins (with manifests, minus internal fields). |
| `/api/dashboard/plugins/rescan` | GET | Force re-scan the plugin directories without restarting. |
| `/dashboard-plugins/<name>/<path>` | GET | Serve static assets from a plugin's `dashboard/` directory. Path traversal is blocked. |
| `/api/plugins/<name>/*` | * | Plugin-registered backend routes. |
### SDK on `window`
| Global | Type | Provider |
|--------|------|----------|
| `window.__HERMES_PLUGIN_SDK__` | object | `registry.ts` — React, hooks, UI components, API client, utils. |
| `window.__HERMES_PLUGINS__.register(name, Component)` | function | Register a plugin's main component. |
| `window.__HERMES_PLUGINS__.registerSlot(name, slot, Component)` | function | Register into a named shell slot. |
---
## Troubleshooting
**My theme doesn't appear in the picker.**
Check that the file is in `~/.hermes/dashboard-themes/` and ends in `.yaml` or `.yml`. Refresh the page. Run `curl http://127.0.0.1:9119/api/dashboard/themes` — your theme should be in the response. If the YAML has a parse error, the dashboard logs to `errors.log` under `~/.hermes/logs/`.
**My plugin's tab doesn't show up.**
1. Check the manifest is at `~/.hermes/plugins/<name>/dashboard/manifest.json` (note the `dashboard/` subdirectory).
2. `curl http://127.0.0.1:9119/api/dashboard/plugins/rescan` to force re-discovery.
3. Open browser dev tools → Network — confirm `manifest.json`, `index.js`, and any CSS loaded without 404s.
4. Open browser dev tools → Console — look for errors during the IIFE or `window.__HERMES_PLUGINS__ is undefined` (indicates the SDK didn't initialize, usually a React render crash earlier).
5. Verify your bundle calls `window.__HERMES_PLUGINS__.register(...)` with the **same name** as `manifest.json:name`.
**Slot-registered components don't render.**
The `sidebar` slot only renders when the active theme has `layoutVariant: cockpit`. Other shell-wide slots always render, and page-scoped slots render only on their own page. If a slot you registered into stays empty, add a `console.log` next to your `registerSlot()` call to confirm the plugin bundle ran at all.
**Plugin backend routes return 404.**
1. Confirm the manifest has `"api": "plugin_api.py"` pointing to an existing file inside `dashboard/`.
2. Restart `hermes dashboard` — plugin API routes are mounted once at startup, **not** on rescan.
3. Check that `plugin_api.py` exports a module-level `router = APIRouter()`. Other export names are not picked up.
4. Tail `~/.hermes/logs/errors.log` for `Failed to load plugin <name> API routes` — import errors are logged there.
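Steps 1, 3 and 4 can be checked offline before restarting the dashboard. The helper below is a hypothetical diagnostic, not part of Hermes: it imports the file in isolation and reports whether a module-level `router` exists. Note that it executes the plugin module, so only run it on code you trust.

```python
import importlib.util
from pathlib import Path


def check_plugin_api(path):
    """Return a list of problems found in a plugin_api.py file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    spec = importlib.util.spec_from_file_location("_plugin_api_check", path)
    if spec is None or spec.loader is None:
        return [f"{path} is not an importable Python file"]
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)  # runs the module's top-level code
    except Exception as exc:
        return [f"import failed: {exc!r}"]
    if not hasattr(module, "router"):
        return ["no module-level `router` found"]
    return []
```

An empty list means the file imports cleanly and exposes `router`; it does not verify that `router` is actually an `APIRouter` instance.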
**Theme change drops my color overrides.**
`colorOverrides` are scoped to the active theme and cleared on theme switch — that's by design. If you want overrides that persist, put them in your theme's YAML, not in the live switcher.
**Theme customCSS gets truncated.**
The `customCSS` block is capped at 32 KiB per theme. Split large stylesheets across multiple themes, or switch to a plugin that injects a full stylesheet via its `css` field (no size cap).
**I want to ship a plugin on PyPI.**
Dashboard plugins are installed by directory layout, not by pip entry point. The cleanest distribution path today is a git repo the user clones into `~/.hermes/plugins/`. A pip-based installer for dashboard plugins is not currently wired up.
+18 -265
@@ -321,274 +321,27 @@ The frontend is built with React 19, TypeScript, Tailwind CSS v4, and shadcn/ui-
When you run `hermes update`, the web frontend is automatically rebuilt if `npm` is available. This keeps the dashboard in sync with code updates. If `npm` isn't installed, the update skips the frontend build and `hermes dashboard` will build it on first launch.
## Themes
## Themes & plugins
Themes control the dashboard's visual presentation across three layers:
The dashboard ships with six built-in themes and can be extended with user-defined themes, plugin tabs, and backend API routes — all drop-in, no repo clone needed.
- **Palette** — colors (background, text, accents, warm glow, noise)
- **Typography** — font families, base size, line height, letter spacing
- **Layout** — corner radius and density (spacing multiplier)
**Switch themes live** from the header bar — click the palette icon next to the language switcher. Selection persists to `config.yaml` under `dashboard.theme` and is restored on page load.
Switch themes live from the header bar — click the palette icon next to the language switcher. Selection persists to `config.yaml` under `dashboard.theme` and is restored on page load.
Built-in themes:
### Built-in themes
| Theme | Character |
|-------|-----------|
| **Hermes Teal** (`default`) | Dark teal + cream, system fonts, comfortable spacing |
| **Midnight** (`midnight`) | Deep blue-violet, Inter + JetBrains Mono |
| **Ember** (`ember`) | Warm crimson + bronze, Spectral serif + IBM Plex Mono |
| **Mono** (`mono`) | Grayscale, IBM Plex, compact |
| **Cyberpunk** (`cyberpunk`) | Neon green on black, Share Tech Mono |
| **Rosé** (`rose`) | Pink + ivory, Fraunces serif, spacious |
Each built-in ships its own palette, typography, and layout — switching produces visible changes beyond color alone.
To build your own theme, add a plugin tab, inject into shell slots, or expose plugin-specific REST endpoints, see **[Extending the Dashboard](./extending-the-dashboard)** — the complete guide covers:
| Theme | Palette | Typography | Layout |
|-------|---------|------------|--------|
| **Hermes Teal** (`default`) | Dark teal + cream | System stack, 15px | 0.5rem radius, comfortable |
| **Midnight** (`midnight`) | Deep blue-violet | Inter + JetBrains Mono, 14px | 0.75rem radius, comfortable |
| **Ember** (`ember`) | Warm crimson / bronze | Spectral (serif) + IBM Plex Mono, 15px | 0.25rem radius, comfortable |
| **Mono** (`mono`) | Grayscale | IBM Plex Sans + IBM Plex Mono, 13px | 0 radius, compact |
| **Cyberpunk** (`cyberpunk`) | Neon green on black | Share Tech Mono everywhere, 14px | 0 radius, compact |
| **Rosé** (`rose`) | Pink and ivory | Fraunces (serif) + DM Mono, 16px | 1rem radius, spacious |
Themes that reference Google Fonts (everything except Hermes Teal) load the stylesheet on demand — the first time you switch to them, a `<link>` tag is injected into `<head>`.
### Custom themes
Drop a YAML file in `~/.hermes/dashboard-themes/` and it appears in the picker automatically. The file can be as minimal as a name plus the fields you want to override — every missing field inherits a sane default.
Minimal example (colors only, bare hex shorthand):
```yaml
# ~/.hermes/dashboard-themes/neon.yaml
name: neon
label: Neon
description: Pure magenta on black
colors:
  background: "#000000"
  midground: "#ff00ff"
```
Full example (every knob):
```yaml
# ~/.hermes/dashboard-themes/ocean.yaml
name: ocean
label: Ocean Deep
description: Deep sea blues with coral accents
palette:
  background:
    hex: "#0a1628"
    alpha: 1.0
  midground:
    hex: "#a8d0ff"
    alpha: 1.0
  foreground:
    hex: "#ffffff"
    alpha: 0.0
  warmGlow: "rgba(255, 107, 107, 0.35)"
  noiseOpacity: 0.7
typography:
  fontSans: "Poppins, system-ui, sans-serif"
  fontMono: "Fira Code, ui-monospace, monospace"
  fontDisplay: "Poppins, system-ui, sans-serif" # optional, falls back to fontSans
  fontUrl: "https://fonts.googleapis.com/css2?family=Poppins:wght@400;500;600&family=Fira+Code:wght@400;500&display=swap"
  baseSize: "15px"
  lineHeight: "1.6"
  letterSpacing: "-0.003em"
layout:
  radius: "0.75rem" # 0 | 0.25rem | 0.5rem | 0.75rem | 1rem | any length
  density: comfortable # compact | comfortable | spacious
# Optional — pin individual shadcn tokens that would otherwise derive from
# the palette. Any key listed here wins over the palette cascade.
colorOverrides:
  destructive: "#ff6b6b"
  ring: "#ff6b6b"
```
Refresh the dashboard after creating the file.
### Palette model
The palette is a 3-layer triplet — **background**, **midground**, **foreground** — plus a warm-glow rgba() string and a noise-opacity multiplier. Every shadcn token (card, muted, border, primary, popover, etc.) is derived from this triplet via CSS `color-mix()` in the dashboard's stylesheet, so overriding three colors cascades into the whole UI.
- `background` — deepest canvas color (typically near-black). The page background and card fill come from this.
- `midground` — primary text and accent. Most UI chrome reads this.
- `foreground` — top-layer highlight. In the default theme this is white at alpha 0 (invisible); themes that want a bright accent on top can raise its alpha.
- `warmGlow` — rgba() vignette color used by the ambient backdrop.
- `noiseOpacity` — 0–1.2 multiplier on the grain overlay. Lower = softer, higher = grittier.
Each layer accepts `{hex, alpha}` or a bare hex string (alpha defaults to 1.0).
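The two accepted layer shapes can be reduced to one canonical form. A minimal Python sketch, assuming a hypothetical `normalise_layer` helper and assuming that the object form also defaults `alpha` to 1.0 when it is omitted (the text above only states that default for the bare-hex form):

```python
def normalise_layer(value):
    """Turn a bare hex string or a {hex, alpha} mapping into {hex, alpha}."""
    if isinstance(value, str):
        # Bare hex shorthand: alpha defaults to 1.0.
        return {"hex": value, "alpha": 1.0}
    if isinstance(value, dict) and "hex" in value:
        # Assumption: a missing alpha in the object form also means 1.0.
        return {"hex": value["hex"], "alpha": float(value.get("alpha", 1.0))}
    raise ValueError(f"unsupported palette layer: {value!r}")


background = normalise_layer("#000000")
foreground = normalise_layer({"hex": "#ffffff", "alpha": 0})
```

With this reading, the minimal "bare hex" theme and the full `{hex, alpha}` theme describe the same data, differing only in how much is spelled out.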
### Typography model
| Key | Type | Description |
|-----|------|-------------|
| `fontSans` | string | CSS font-family stack for body copy (applied to `html`, `body`) |
| `fontMono` | string | CSS font-family stack for code blocks, `<code>`, `.font-mono` utilities, dense readouts |
| `fontDisplay` | string | Optional heading/display font stack. Falls back to `fontSans` |
| `fontUrl` | string | Optional external stylesheet URL. Injected as `<link rel="stylesheet">` in `<head>` on theme switch. Same URL is never injected twice. Works with Google Fonts, Bunny Fonts, self-hosted `@font-face` sheets, anything you can link |
| `baseSize` | string | Root font size — controls the rem scale for the whole dashboard. Example: `"14px"`, `"16px"` |
| `lineHeight` | string | Default line-height, e.g. `"1.5"`, `"1.65"` |
| `letterSpacing` | string | Default letter-spacing, e.g. `"0"`, `"0.01em"`, `"-0.01em"` |
### Layout model
| Key | Values | Description |
|-----|--------|-------------|
| `radius` | any CSS length | Corner-radius token. Cascades into `--radius-sm/md/lg/xl` so every rounded element shifts together. |
| `density` | `compact` \| `comfortable` \| `spacious` | Spacing multiplier. Compact = 0.85×, comfortable = 1.0× (default), spacious = 1.2×. Scales Tailwind's base spacing, so padding, gap, and space-between utilities all shift proportionally. |
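As a worked example of the density multipliers in the table, the sketch below scales a base spacing value. The function names are hypothetical, and falling back to 1.0 for an unknown density is an assumption. The arithmetic uses Tailwind's `p-4`, which is 1rem at the default scale.

```python
DENSITY_MULTIPLIERS = {"compact": 0.85, "comfortable": 1.0, "spacious": 1.2}


def spacing_multiplier(density="comfortable"):
    # Unknown values fall back to the default multiplier (assumption).
    return DENSITY_MULTIPLIERS.get(density, 1.0)


def scaled_rem(base_rem, density):
    """Scale a spacing value given in rem by the density multiplier."""
    return round(base_rem * spacing_multiplier(density), 4)


# p-4 is 1rem by default in Tailwind.
compact_p4 = scaled_rem(1.0, "compact")
spacious_p4 = scaled_rem(1.0, "spacious")
```

So a card padded with `p-4` would measure 0.85rem under `compact` and 1.2rem under `spacious`, with gaps and margins shifting by the same factor.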
### Color overrides (optional)
Most themes won't need this — the 3-layer palette derives every shadcn token. But if you want a specific accent that the derivation won't produce (a softer destructive red for a pastel theme, a specific success green for a brand), pin individual tokens here.
Supported keys: `card`, `cardForeground`, `popover`, `popoverForeground`, `primary`, `primaryForeground`, `secondary`, `secondaryForeground`, `muted`, `mutedForeground`, `accent`, `accentForeground`, `destructive`, `destructiveForeground`, `success`, `warning`, `border`, `input`, `ring`.
Any key set here overrides the derived value for the active theme only — switching to another theme clears the overrides.
### Layout variants
`layoutVariant` selects the overall shell layout. Defaults to `standard`.
| Variant | Behaviour |
|---------|-----------|
| `standard` | Single column, 1600px max-width (default) |
| `cockpit` | Left sidebar rail (260px) + main content. Populated by plugins via the `sidebar` slot |
| `tiled` | Drops the max-width clamp so pages can use the full viewport |
```yaml
layoutVariant: cockpit
```
The current variant is exposed as `document.documentElement.dataset.layoutVariant` so custom CSS can target it via `:root[data-layout-variant="cockpit"]`.
### Theme assets
Ship artwork URLs with a theme. Each named slot becomes a CSS var (`--theme-asset-<name>`) that plugins and the built-in shell read; the `bg` slot is automatically wired into the backdrop.
```yaml
assets:
  bg: "https://example.com/hero-bg.jpg" # full-viewport background
  hero: "/my-images/strike-freedom.png" # for plugin sidebars
  crest: "/my-images/crest.svg" # for header slot plugins
  logo: "/my-images/logo.png"
  sidebar: "/my-images/rail.png"
  header: "/my-images/header-art.png"
  custom:
    scanLines: "/my-images/scanlines.png" # → --theme-asset-custom-scanLines
```
Values accept bare URLs (wrapped in `url(...)` automatically), pre-wrapped `url(...)`/`linear-gradient(...)`/`radial-gradient(...)` expressions, and `none`.
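The mapping from the `assets` block to CSS variables can be sketched like this. It is an illustration of the rules stated above, not the dashboard's implementation: the helper names are hypothetical, and quoting the URL inside `url("...")` is an assumption about the exact output format.

```python
PASSTHROUGH_PREFIXES = ("url(", "linear-gradient(", "radial-gradient(")


def asset_css_value(value):
    """Wrap bare URLs in url(...); pass pre-wrapped values and `none` through."""
    value = value.strip()
    if value == "none" or value.startswith(PASSTHROUGH_PREFIXES):
        return value
    return f'url("{value}")'


def assets_to_css_vars(assets):
    """Map a theme's `assets` block to --theme-asset-* CSS variables."""
    css_vars = {}
    for name, value in assets.items():
        if name == "custom":
            # Nested custom entries get a `custom-` infix in the var name.
            for custom_name, custom_value in value.items():
                css_vars[f"--theme-asset-custom-{custom_name}"] = asset_css_value(custom_value)
        else:
            css_vars[f"--theme-asset-{name}"] = asset_css_value(value)
    return css_vars


css_vars = assets_to_css_vars({
    "bg": "https://example.com/hero-bg.jpg",
    "crest": "url(/my-images/crest.svg)",
    "logo": "none",
    "custom": {"scanLines": "/my-images/scanlines.png"},
})
```

A plugin can then consume a variable such as `--theme-asset-crest` directly as a `background-image` value without knowing which theme supplied it.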
### Component chrome overrides
Themes can restyle individual shell components without writing CSS selectors via the `componentStyles` block. Each bucket's entries become CSS vars (`--component-<bucket>-<kebab-property>`) that the shell's shared components read — so `card:` overrides apply to every `<Card>`, `header:` to the app bar, etc.
```yaml
componentStyles:
  card:
    clipPath: "polygon(12px 0, 100% 0, 100% calc(100% - 12px), calc(100% - 12px) 100%, 0 100%, 0 12px)"
    background: "linear-gradient(180deg, rgba(10, 22, 52, 0.85), rgba(5, 9, 26, 0.92))"
    boxShadow: "inset 0 0 0 1px rgba(64, 200, 255, 0.28)"
  header:
    background: "linear-gradient(180deg, rgba(16, 32, 72, 0.95), rgba(5, 9, 26, 0.9))"
  tab:
    clipPath: "polygon(6px 0, 100% 0, calc(100% - 6px) 100%, 0 100%)"
  sidebar: {...}
  backdrop: {...}
  footer: {...}
  progress: {...}
  badge: {...}
  page: {...}
```
Supported buckets: `card`, `header`, `footer`, `sidebar`, `tab`, `progress`, `badge`, `backdrop`, `page`. Property names use camelCase (`clipPath`) and are emitted as kebab (`clip-path`). Values are plain CSS strings — anything CSS accepts (`clip-path`, `border-image`, `background`, `box-shadow`, animations, etc.).
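The naming rule `--component-<bucket>-<kebab-property>` amounts to a camelCase-to-kebab conversion per property. A small Python sketch of that rule, with hypothetical helper names and no validation of bucket names:

```python
import re


def camel_to_kebab(name):
    """clipPath -> clip-path, boxShadow -> box-shadow."""
    # Insert a hyphen before every capital that is not the first character.
    return re.sub(r"(?<!^)([A-Z])", r"-\1", name).lower()


def component_styles_to_css_vars(component_styles):
    """Flatten a `componentStyles` block into CSS custom properties."""
    css_vars = {}
    for bucket, props in component_styles.items():
        for prop, value in props.items():
            css_vars[f"--component-{bucket}-{camel_to_kebab(prop)}"] = value
    return css_vars


component_vars = component_styles_to_css_vars({
    "card": {
        "clipPath": "polygon(12px 0, 100% 0, 100% 100%, 0 100%)",
        "boxShadow": "inset 0 0 0 1px rgba(64, 200, 255, 0.28)",
    },
    "header": {"background": "rgba(16, 32, 72, 0.95)"},
})
```

Under this rule the `card.clipPath` entry surfaces as `--component-card-clip-path`, which is the variable custom CSS or a plugin stylesheet would reference.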
### Custom CSS
For selector-level chrome that doesn't fit `componentStyles` — pseudo-elements, animations, media queries, theme-scoped overrides — drop raw CSS into the `customCSS` field:
```yaml
customCSS: |
  :root[data-layout-variant="cockpit"] body::before {
    content: "";
    position: fixed;
    inset: 0;
    pointer-events: none;
    z-index: 100;
    background: repeating-linear-gradient(to bottom,
      transparent 0px, transparent 2px,
      rgba(64, 200, 255, 0.035) 3px, rgba(64, 200, 255, 0.035) 4px);
    mix-blend-mode: screen;
  }
```
The CSS is injected as a single scoped `<style data-hermes-theme-css>` tag on theme apply and cleaned up on theme switch. Capped at 32 KiB per theme.
## Dashboard plugins
Plugins live in `~/.hermes/plugins/<name>/dashboard/` (user) or repo `plugins/<name>/dashboard/` (bundled). Each ships a `manifest.json` plus a plain JS bundle that uses the plugin SDK exposed on `window.__HERMES_PLUGIN_SDK__`.
### Manifest
```json
{
"name": "my-plugin",
"label": "My Plugin",
"icon": "Sparkles",
"version": "1.0.0",
"tab": {
"path": "/my-plugin",
"position": "after:skills",
"override": "/",
"hidden": false
},
"slots": ["sidebar", "header-left"],
"entry": "dist/index.js",
"css": "dist/index.css",
"api": "api.py"
}
```
| Field | Description |
|-------|-------------|
| `tab.path` | Route path the plugin component renders at |
| `tab.position` | `end`, `after:<tab>`, or `before:<tab>` |
| `tab.override` | When set to a built-in path (`/`, `/sessions`, etc.), this plugin replaces that page instead of adding a new tab |
| `tab.hidden` | When true, register component + slots but skip the nav entry. Used by slot-only plugins |
| `slots` | Shell slots this plugin populates (documentation aid; actual registration happens from the JS bundle) |
### Shell slots
Plugins inject components into named shell locations by calling `window.__HERMES_PLUGINS__.registerSlot(pluginName, slotName, Component)`. Multiple plugins can populate the same slot — they render stacked in registration order.
| Slot | Location |
|------|----------|
| `backdrop` | Inside the backdrop layer stack |
| `header-left` | Before the Hermes brand in the top bar |
| `header-right` | Before the theme/language switchers |
| `header-banner` | Full-width strip below the nav |
| `sidebar` | Cockpit sidebar rail (only rendered when `layoutVariant === "cockpit"`) |
| `pre-main` | Above the route outlet |
| `post-main` | Below the route outlet |
| `footer-left` / `footer-right` | Footer cell content (replaces default) |
| `overlay` | Fixed-position layer above everything else |
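A slot-only plugin bundle might register a sidebar component roughly as follows. This is a minimal sketch: the `registerSlot(pluginName, slotName, Component)` call and the two `window` globals come from the text above, while the function name `registerSidebarStatus`, the component, and its content are hypothetical. The host object is passed in as a parameter so the same code can be exercised outside a browser.

```javascript
// Hypothetical registration function for a slot-only plugin.
// `host` is expected to look like `window` in the dashboard.
function registerSidebarStatus(host) {
  const { React } = host.__HERMES_PLUGIN_SDK__;

  // Plain createElement calls, since the bundle is plain JS without JSX.
  function SidebarStatus() {
    return React.createElement("div", null, "All systems nominal");
  }

  // The sidebar slot only renders when layoutVariant === "cockpit".
  host.__HERMES_PLUGINS__.registerSlot("my-plugin", "sidebar", SidebarStatus);
  return SidebarStatus;
}

// In the dashboard the globals exist on window; elsewhere this is skipped.
if (typeof window !== "undefined" && window.__HERMES_PLUGINS__) {
  registerSidebarStatus(window);
}
```

Because several plugins can fill the same slot and render in registration order, a plugin that needs to appear first would have to be loaded first.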
### Plugin SDK
Exposed on `window.__HERMES_PLUGIN_SDK__`:
- `React` + `hooks` (useState, useEffect, useCallback, useMemo, useRef, useContext, createContext)
- `components` — Card, Badge, Button, Input, Label, Select, Separator, Tabs, **PluginSlot**
- `api` — Hermes API client, plus raw `fetchJSON`
- `utils` — `cn()`, `timeAgo()`, `isoTimeAgo()`
- `useI18n` — i18n hook for multi-language plugins
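A component built purely from SDK pieces could look like the sketch below. It assumes the list above maps onto properties named `React`, `hooks`, and `components` on the SDK object, that `Card`, `Badge`, and `Button` accept children, and that `Button` accepts an `onClick` prop. None of those prop details are confirmed here, and `makeClickCard` is a hypothetical name.

```javascript
// Hypothetical factory: builds a component from an SDK object shaped like
// window.__HERMES_PLUGIN_SDK__ (property names are assumptions).
function makeClickCard(sdk) {
  const { React, hooks, components } = sdk;
  const { Card, Badge, Button } = components;

  return function ClickCard() {
    const [count, setCount] = hooks.useState(0);
    return React.createElement(
      Card,
      null,
      React.createElement(Badge, null, `Clicked ${count} times`),
      // Assumption: Button forwards onClick like a native button.
      React.createElement(Button, { onClick: () => setCount(count + 1) }, "Click me")
    );
  };
}

// In the dashboard: const ClickCard = makeClickCard(window.__HERMES_PLUGIN_SDK__);
```

Taking the SDK as an argument instead of bundling React avoids shipping a second React copy in the plugin bundle.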
### Demo: Strike Freedom Cockpit
`plugins/strike-freedom-cockpit/` ships a complete skin demo showing every extension point — cockpit layout variant, theme-supplied hero/crest assets, notched card corners via `componentStyles`, scanlines via `customCSS`, and a slot-only plugin that populates the sidebar, header, and footer. Copy the theme YAML into `~/.hermes/dashboard-themes/` and the plugin directory into `~/.hermes/plugins/` to try it.
### Theme API
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/dashboard/themes` | GET | List available themes + active name. Built-ins return `{name, label, description}`; user themes also include a `definition` field with the full normalised theme object. |
| `/api/dashboard/theme` | PUT | Set active theme. Body: `{"name": "midnight"}` |
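The two endpoints can be called with `fetch`, as in the sketch below. The paths, the method, and the `{"name": ...}` body come from the table above. The function names, the `Content-Type` header, the same-origin base URL, and the absence of any auth handling are assumptions.

```javascript
// Hypothetical helper: build the PUT request described in the table.
function buildSetThemeRequest(name, baseUrl = "") {
  return {
    url: `${baseUrl}/api/dashboard/theme`,
    init: {
      method: "PUT",
      headers: { "Content-Type": "application/json" }, // assumption
      body: JSON.stringify({ name }),
    },
  };
}

// Hypothetical helper: list themes plus the active theme name.
async function listThemes(fetchImpl = fetch, baseUrl = "") {
  const res = await fetchImpl(`${baseUrl}/api/dashboard/themes`);
  if (!res.ok) throw new Error(`Listing themes failed: HTTP ${res.status}`);
  return res.json();
}

// Hypothetical helper: set the active theme by name.
async function setActiveTheme(name, fetchImpl = fetch, baseUrl = "") {
  const { url, init } = buildSetThemeRequest(name, baseUrl);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`Setting theme failed: HTTP ${res.status}`);
  return res;
}

// Usage from a page served by the dashboard:
//   await setActiveTheme("midnight");
```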
- Theme YAML schema — palette, typography, layout, assets, componentStyles, colorOverrides, customCSS
- Layout variants — `standard`, `cockpit`, `tiled`
- Plugin manifest, SDK, shell slots, page-scoped slots (inject widgets into built-in pages without overriding them), backend FastAPI routes
- A full combined theme-plus-plugin walkthrough (Strike Freedom cockpit demo)
- Discovery, reload, and troubleshooting
+1 -1
@@ -81,7 +81,7 @@ const sidebars: SidebarsConfig = {
label: 'Management',
items: [
'user-guide/features/web-dashboard',
- 'user-guide/features/dashboard-plugins',
+ 'user-guide/features/extending-the-dashboard',
],
},
{