Compare commits

..

396 Commits

Author SHA1 Message Date
alt-glitch 7efd91d4b4 feat(session): inject Discord IDs block when discord tool is loaded
When DISCORD_BOT_TOKEN is set — meaning the discord tool actually
loads — emit a dedicated IDs block in the session context prompt so
the agent can call ``fetch_messages``, ``pin_message``, etc. with
real identifiers instead of probing.

Currently only ``thread_id`` was exposed as a raw ID (via the
``description`` string).  The agent in a Discord thread had to guess
that the thread ID doubles as a channel ID for the REST API (it
does), and it had no way to reference the parent channel, the guild,
or the triggering message at all.

The block adapts to context:

  - Thread:     guild / parent channel / thread / message
  - Channel:    guild / channel / message
  - (DM has no guild/channel IDs worth listing; only message)

Discord isn't in _PII_SAFE_PLATFORMS, so IDs ship unredacted.
2026-04-25 05:43:23 +05:30
alt-glitch 0aa1269e56 fix(session): gate stale "no Discord APIs" note on DISCORD_BOT_TOKEN
The Discord platform note in the session context prompt claimed the
agent has no server-management APIs — pre-dating the discord tool.
With a bot token configured the agent actually has fetch_messages,
search_members, create_thread, and optionally the discord_admin tool;
telling the model otherwise causes it to refuse or apologise for
calls it is fully able to make.

Gate the disclaimer on DISCORD_BOT_TOKEN being unset, matching the
tool's own ``check_fn``.  Without a token the note still appears and
remains accurate; with a token the model is no longer gaslit into
refusing valid tool calls.
2026-04-25 05:43:23 +05:30
alt-glitch 3c29834354 feat(discord): populate guild_id, parent_chat_id, message_id on SessionSource
Discord knows all four identifiers for every inbound message — guild,
channel (or thread), parent channel when in a thread, and the
triggering message.  Pass them into ``SessionSource`` via the new
``build_source()`` kwargs so downstream code (context-prompt builder,
delivery, logging) can use them without re-resolving from discord.py
objects.

For auto-threaded messages, remember the original channel as the
parent before swapping ``chat_id`` to the freshly created thread.

Behavioural: still a no-op — nothing consumes these fields yet.
2026-04-25 05:43:23 +05:30
alt-glitch 0eb85906b0 feat(session): add guild_id/parent_chat_id/message_id to SessionSource
Groundwork for injecting raw platform identifiers into the agent's
system prompt.  Currently only `thread_id` is exposed as a raw ID —
callers in a Discord thread had to guess `channel_id == thread_id`
(which happens to work because threads are channels in Discord's REST
API) and had no way to reference the parent channel, guild, or the
triggering message.

Adds three optional fields:

- `guild_id` — Discord guild / Slack workspace / Matrix server scope
- `parent_chat_id` — parent channel when chat_id refers to a thread
- `message_id` — ID of the triggering message (pin/reply/react)

Extends `BasePlatformAdapter.build_source()` to accept + forward them
and teaches `to_dict`/`from_dict` to serialize them.  Behaviourally a
no-op: nothing reads the fields yet and they default to None.
2026-04-25 05:43:23 +05:30
alt-glitch ff9b0528a2 fix(tools): normalize numeric entries and clear stale no_mcp in _save_platform_tools
YAML parses bare numeric toolset names (e.g. 12306:) as int, causing
TypeError in sorted() since the read path normalizes to str but the
save path did not.

The no_mcp sentinel was preserved in existing entries even when the
user re-enabled MCP servers, causing MCP to stay silently disabled.
2026-04-25 05:43:23 +05:30
alt-glitch 8feaa7cd1b feat(feishu): wire feishu doc/drive tools into hermes-feishu composite
The feishu_doc and feishu_drive tools were registered in the tool
registry but never added to the hermes-feishu composite toolset.
The pipeline fix from the prior commit now recovers them automatically
once they are in the composite.
2026-04-25 05:43:23 +05:30
alt-glitch 57a2b97ae8 feat(discord): split discord_server into discord + discord_admin tools
Split the monolithic discord_server tool (14 actions) into two:

- discord: core actions (fetch_messages, search_members, create_thread)
  that are useful for the agent's normal operation. Auto-enabled on
  the discord platform via the pipeline fix.

- discord_admin: server management actions (list channels/roles, pins,
  role assignment) that require explicit opt-in via hermes tools.
  Added to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS.
2026-04-25 05:43:23 +05:30
alt-glitch bd9afb027a fix(tools): recover non-configurable toolsets from composite resolution
The reverse-mapping loop in _get_platform_tools only checked
CONFIGURABLE_TOOLSETS, silently dropping platform-specific toolsets
like discord and feishu_doc whose tools were in the composite but
had no configurable key. Add a second pass over TOOLSETS that picks
up unclaimed toolsets whose tools are present in the resolved
composite.
2026-04-25 05:43:23 +05:30
Teknium 6051fba9dc feat(banner): hyperlink startup banner title to latest GitHub release (#14945)
Wrap the existing version label in the welcome-banner panel title
('Hermes Agent v… · upstream … · local …') with an OSC-8 terminal
hyperlink pointing at the latest git tag's GitHub release page
(https://github.com/NousResearch/hermes-agent/releases/tag/<tag>).

Clickable in modern terminals (iTerm2, WezTerm, Windows Terminal,
GNOME Terminal, Kitty, etc.); degrades to plain text on terminals
without OSC-8 support. No new line added to the banner.

New get_latest_release_tag() helper runs 'git describe --tags
--abbrev=0' in the Hermes checkout (3s timeout, per-process cache,
silent fallback for non-git/pip installs and forks without tags).
2026-04-23 23:28:34 -07:00
Teknium 2acc8783d1 fix(errors): classify OpenRouter privacy-guardrail 404s distinctly (#14943)
OpenRouter returns a 404 with the specific message

  'No endpoints available matching your guardrail restrictions and data
   policy. Configure: https://openrouter.ai/settings/privacy'

when a user's account-level privacy setting excludes the only endpoint
serving a model (e.g. DeepSeek V4 Pro, which today is hosted only by
DeepSeek's own endpoint that may log inputs).

Before this change we classified it as model_not_found, which was
misleading (the model exists) and triggered provider fallback (useless —
the same account setting applies to every OpenRouter call).

Now it classifies as a new FailoverReason.provider_policy_blocked with
retryable=False, should_fallback=False.  The error body already contains
the fix URL, so the user still gets actionable guidance.
2026-04-23 23:26:29 -07:00
brooklyn! acdcb167fb fix(tui): harden terminal dimming and multiplexer copy (#14906)
- disable ANSI dim on VTE terminals by default so dark-background reasoning and accents stay readable
- suppress local multiplexer OSC52 echo while preserving remote passthrough and add regression coverage
2026-04-23 22:46:28 -07:00
Teknium 51f4c9827f fix(context): resolve real Codex OAuth context windows (272k, not 1M) (#14935)
On ChatGPT Codex OAuth every gpt-5.x slug actually caps at 272,000 tokens,
but Hermes was resolving gpt-5.5 / gpt-5.4 to 1,050,000 (from models.dev)
because openai-codex aliases to the openai entry there. At 1.05M the
compressor never fires and requests hard-fail with 'context window
exceeded' around the real 272k boundary.

Verified live against chatgpt.com/backend-api/codex/models:
  gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.3-codex, gpt-5.2-codex,
  gpt-5.2, gpt-5.1-codex-max → context_window = 272000

Changes:
- agent/model_metadata.py:
  * _fetch_codex_oauth_context_lengths() — probe the Codex /models
    endpoint with the OAuth bearer token and read context_window per
    slug (1h in-memory TTL).
  * _resolve_codex_oauth_context_length() — prefer the live probe,
    fall back to hardcoded _CODEX_OAUTH_CONTEXT_FALLBACK (all 272k).
  * Wire into get_model_context_length() when provider=='openai-codex',
    running BEFORE the models.dev lookup (which returns 1.05M). Result
    persists via save_context_length() so subsequent lookups skip the
    probe entirely.
  * Fixed the now-wrong comment on the DEFAULT_CONTEXT_LENGTHS gpt-5.5
    entry (400k was never right for Codex; it's the catch-all for
    providers we can't probe live).

Tests (4 new in TestCodexOAuthContextLength):
- fallback table used when no token is available (no models.dev leakage)
- live probe overrides the fallback
- probe failure (non-200) falls back to hardcoded 272k
- non-codex providers (openrouter, direct openai) unaffected

Non-codex context resolution is unchanged — the Codex branch only fires
when provider=='openai-codex'.
2026-04-23 22:39:47 -07:00
Teknium 2e78a2b6b2 feat(models): add deepseek-v4-pro and deepseek-v4-flash (#14934)
- OpenRouter: deepseek/deepseek-v4-pro, deepseek/deepseek-v4-flash
- Nous Portal (fallback list): same two slugs
- Native DeepSeek provider: bare deepseek-v4-pro, deepseek-v4-flash
  alongside existing deepseek-chat/deepseek-reasoner

Context length resolves via existing 'deepseek' substring entry (128K)
in DEFAULT_CONTEXT_LENGTHS.
2026-04-23 22:35:04 -07:00
Teknium 5a1c599412 feat(browser): CDP supervisor — dialog detection + response + cross-origin iframe eval (#14540)
* docs: browser CDP supervisor design (for upcoming PR)

Design doc ahead of implementation — dialog + iframe detection/interaction
via a persistent CDP supervisor. Covers backend capability matrix (verified
live 2026-04-23), architecture, lifecycle, policy, agent surface, PR split,
non-goals, and test plan.

Supersedes #12550.

No code changes in this commit.

* feat(browser): add persistent CDP supervisor for dialog + frame detection

Single persistent CDP WebSocket per Hermes task_id that subscribes to
Page/Runtime/Target events and maintains thread-safe state for pending
dialogs, frame tree, and console errors.

Supervisor lives in its own daemon thread running an asyncio loop;
external callers use sync API (snapshot(), respond_to_dialog()) that
bridges onto the loop.

Auto-attaches to OOPIF child targets via Target.setAutoAttach{flatten:true}
and enables Page+Runtime on each so iframe-origin dialogs surface through
the same supervisor.

Dialog policies: must_respond (default, 300s safety timeout),
auto_dismiss, auto_accept.

Frame tree capped at 30 entries + OOPIF depth 2 to keep snapshot
payloads bounded on ad-heavy pages.

E2E verified against real Chrome via smoke test — detects + responds
to main-frame alerts, iframe-contentWindow alerts, preserves frame
tree, graceful no-dialog error path, clean shutdown.

No agent-facing tool wiring in this commit (comes next).

* feat(browser): add browser_dialog tool wired to CDP supervisor

Agent-facing response-only tool. Schema:
  action: 'accept' | 'dismiss' (required)
  prompt_text: response for prompt() dialogs (optional)
  dialog_id: disambiguate when multiple dialogs queued (optional)

Handler:
  SUPERVISOR_REGISTRY.get(task_id).respond_to_dialog(...)

check_fn shares _browser_cdp_check with browser_cdp so both surface and
hide together. When no supervisor is attached (Camofox, default
Playwright, or no browser session started yet), tool is hidden; if
somehow invoked it returns a clear error pointing the agent to
browser_navigate / /browser connect.

Registered in _HERMES_CORE_TOOLS and the browser / hermes-acp /
hermes-api-server toolsets alongside browser_cdp.

* feat(browser): wire CDP supervisor into session lifecycle + browser_snapshot

Supervisor lifecycle:
  * _get_session_info lazy-starts the supervisor after a session row is
    materialized — covers every backend code path (Browserbase, cdp_url
    override, /browser connect, future providers) with one hook.
  * cleanup_browser(task_id) stops the supervisor for that task first
    (before the backend tears down CDP).
  * cleanup_all_browsers() calls SUPERVISOR_REGISTRY.stop_all().
  * /browser connect eagerly starts the supervisor for task 'default'
    so the first snapshot already shows pending_dialogs.
  * /browser disconnect stops the supervisor.

CDP URL resolution for the supervisor:
  1. BROWSER_CDP_URL / browser.cdp_url override.
  2. Fallback: session_info['cdp_url'] from cloud providers (Browserbase).

browser_snapshot merges supervisor state (pending_dialogs + frame_tree)
into its JSON output when a supervisor is active — the agent reads
pending_dialogs from the snapshot it already requests, then calls
browser_dialog to respond. No extra tool surface.

Config defaults:
  * browser.dialog_policy: 'must_respond' (new)
  * browser.dialog_timeout_s: 300 (new)
No version bump — new keys deep-merge into existing browser section.

Deadlock fix in supervisor event dispatch:
  * _on_dialog_opening and _on_target_attached used to await CDP calls
    while the reader was still processing an event — but only the reader
    can set the response Future, so the call timed out.
  * Both now fire asyncio.create_task(...) so the reader stays pumping.
  * auto_dismiss/auto_accept now actually close the dialog immediately.

Tests (tests/tools/test_browser_supervisor.py, 11 tests, real Chrome):
  * supervisor start/snapshot
  * main-frame alert detection + dismiss
  * iframe.contentWindow alert
  * prompt() with prompt_text reply
  * respond with no pending dialog -> clean error
  * auto_dismiss clears on event
  * registry idempotency
  * registry stop -> snapshot reports inactive
  * browser_dialog tool no-supervisor error
  * browser_dialog invalid action
  * browser_dialog end-to-end via tool handler

xdist-safe: chrome_cdp fixture uses a per-worker port.
Skipped when google-chrome/chromium isn't installed.

* docs(browser): document browser_dialog tool + CDP supervisor

- user-guide/features/browser.md: new browser_dialog section with
  workflow, availability gate, and dialog_policy table
- reference/tools-reference.md: row for browser_dialog, tool count
  bumped 53 -> 54, browser tools count 11 -> 12
- reference/toolsets-reference.md: browser_dialog added to browser
  toolset row with note on pending_dialogs / frame_tree snapshot fields

Full design doc lives at
developer-guide/browser-supervisor.md (committed earlier).

* fix(browser): reconnect loop + recent_dialogs for Browserbase visibility

Found via Browserbase E2E test that revealed two production-critical issues:

1. **Supervisor WebSocket drops when other clients disconnect.** Browserbase's
   CDP proxy tears down our long-lived WebSocket whenever a short-lived
   client (e.g. agent-browser CLI's per-command CDP connection) disconnects.
   Fixed with a reconnecting _run loop that re-attaches with exponential
   backoff on drops. _page_session_id and _child_sessions are reset on each
   reconnect; pending_dialogs and frames are preserved across reconnects.

2. **Browserbase auto-dismisses dialogs server-side within ~10ms.** Their
   Playwright-based CDP proxy dismisses alert/confirm/prompt before our
   Page.handleJavaScriptDialog call can respond. So pending_dialogs is
   empty by the time the agent reads a snapshot on Browserbase.

   Added a recent_dialogs ring buffer (capacity 20) that retains a
   DialogRecord for every dialog that opened, with a closed_by tag:
     * 'agent'       — agent called browser_dialog
     * 'auto_policy' — local auto_dismiss/auto_accept fired
     * 'watchdog'    — must_respond timeout auto-dismissed (300s default)
     * 'remote'      — browser/backend closed it on us (Browserbase)

   Agents on Browserbase now see the dialog history with closed_by='remote'
   so they at least know a dialog fired, even though they couldn't respond.

3. **Page.javascriptDialogClosed matching bug.** The event doesn't include a
   'message' field (CDP spec has only 'result' and 'userInput') but our
   _on_dialog_closed was matching on message. Fixed to match by session_id
   + oldest-first, with a safety assumption that only one dialog is in
   flight per session (the JS thread is blocked while a dialog is up).

Docs + tests updated:
  * browser.md: new availability matrix showing the three backends and
    which mode (pending / recent / response) each supports
  * developer-guide/browser-supervisor.md: three-field snapshot schema
    with closed_by semantics
  * test_browser_supervisor.py: +test_recent_dialogs_ring_buffer (12/12
    passing against real Chrome)

E2E verified both backends:
  * Local Chrome via /browser connect: detect + respond full workflow
    (smoke_supervisor.py all 7 scenarios pass)
  * Browserbase: detect via recent_dialogs with closed_by='remote'
    (smoke_supervisor_browserbase_v2.py passes)

Camofox remains out of scope (REST-only, no CDP) — tracked for
upstream PR 3.

* feat(browser): XHR bridge for dialog response on Browserbase (FIXED)

Browserbase's CDP proxy auto-dismisses native JS dialogs within ~10ms, so
Page.handleJavaScriptDialog calls lose the race. Solution: bypass native
dialogs entirely.

The supervisor now injects Page.addScriptToEvaluateOnNewDocument with a
JavaScript override for window.alert/confirm/prompt. Those overrides
perform a synchronous XMLHttpRequest to a magic host
('hermes-dialog-bridge.invalid'). We intercept those XHRs via Fetch.enable
with a requestStage=Request pattern.

Flow when a page calls alert('hi'):
  1. window.alert override intercepts, builds XHR GET to
     http://hermes-dialog-bridge.invalid/?kind=alert&message=hi
  2. Sync XHR blocks the page's JS thread (mirrors real dialog semantics)
  3. Fetch.requestPaused fires on our WebSocket; supervisor surfaces
     it as a pending dialog with bridge_request_id set
  4. Agent reads pending_dialogs from browser_snapshot, calls browser_dialog
  5. Supervisor calls Fetch.fulfillRequest with JSON body:
     {accept: true|false, prompt_text: '...', dialog_id: 'd-N'}
  6. The injected script parses the body, returns the appropriate value
     from the override (undefined for alert, bool for confirm, string|null
     for prompt)

This works identically on Browserbase AND local Chrome — no native dialog
ever fires, so Browserbase's auto-dismiss has nothing to race. Dialog
policies (must_respond / auto_dismiss / auto_accept) all still work.

Bridge is installed on every attached session (main page + OOPIF child
sessions) so iframe dialogs are captured too.

Native-dialog path kept as a fallback for backends that don't auto-dismiss
(so a page that somehow bypasses our override — e.g. iframes that load
after Fetch.enable but before the init-script runs — still gets observed
via Page.javascriptDialogOpening).

E2E VERIFIED:
  * Local Chrome: 13/13 pytest tests green (12 original + new
    test_bridge_captures_prompt_and_returns_reply_text that asserts
    window.__ret === 'AGENT-SUPPLIED-REPLY' after agent responds)
  * Browserbase: smoke_bb_bridge_v2.py runs 4/4 PASS:
    - alert('BB-ALERT-MSG') dismiss → page.alert_ret = undefined ✓
    - prompt('BB-PROMPT-MSG', 'default-xyz') accept with 'AGENT-REPLY'
      → page.prompt_ret === 'AGENT-REPLY' ✓
    - confirm('BB-CONFIRM-MSG') accept → page.confirm_ret === true ✓
    - confirm('BB-CONFIRM-MSG') dismiss → page.confirm_ret === false ✓

Docs updated in browser.md and developer-guide/browser-supervisor.md —
availability matrix now shows Browserbase at full parity with local
Chrome for both detection and response.

* feat(browser): cross-origin iframe interaction via browser_cdp(frame_id=...)

Adds iframe interaction to the CDP supervisor PR (was queued as PR 2).

Design: browser_cdp gets an optional frame_id parameter. When set, the
tool looks up the frame in the supervisor's frame_tree, grabs its child
cdp_session_id (OOPIF session), and dispatches the CDP call through the
supervisor's already-connected WebSocket via run_coroutine_threadsafe.

Why not stateless: on Browserbase, each fresh browser_cdp WebSocket
must re-negotiate against a signed connectUrl. The session info carries
a specific URL that can expire while the supervisor's long-lived
connection stays valid. Routing via the supervisor sidesteps this.

Agent workflow:
  1. browser_snapshot → frame_tree.children[] shows OOPIFs with is_oopif=true
  2. browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF frame_id>,
                 params={'expression': 'document.title', 'returnByValue': True})
  3. Supervisor dispatches the call on the OOPIF's child session

Supervisor state fixes needed along the way:
  * _on_frame_detached now skips reason='swap' (frame migrating processes)
  * _on_frame_detached also skips when the frame is an OOPIF with a live
    child session — Browserbase fires spurious remove events when a
    same-origin iframe gets promoted to OOPIF
  * _on_target_detached clears cdp_session_id but KEEPS the frame record
    so the agent still sees the OOPIF in frame_tree during transient
    session flaps

E2E VERIFIED on Browserbase (smoke_bb_iframe_agent_path.py):
  browser_cdp(method='Runtime.evaluate',
              params={'expression': 'document.title', 'returnByValue': True},
              frame_id=<OOPIF>)
  → {'success': True, 'result': {'value': 'Example Domain'}}

  The iframe is <iframe src='https://example.com/'> inside a top-level
  data: URL page on a real Browserbase session. The agent Runtime.evaluates
  INSIDE the cross-origin iframe and gets example.com's title back.

Tests (tests/tools/test_browser_supervisor.py — 16 pass total):
  * test_browser_cdp_frame_id_routes_via_supervisor — injects fake OOPIF,
    verifies routing via supervisor, Runtime.evaluate returns 1+1=2
  * test_browser_cdp_frame_id_missing_supervisor — clean error when no
    supervisor attached
  * test_browser_cdp_frame_id_not_in_frame_tree — clean error on bad
    frame_id

Docs (browser.md and developer-guide/browser-supervisor.md) updated with
the iframe workflow, availability matrix now shows OOPIF eval as shipped
for local Chrome + Browserbase.

* test(browser): real-OOPIF E2E verified manually + chrome_cdp uses --site-per-process

When asked 'did you test the iframe stuff' I had only done a mocked
pytest (fake injected OOPIF) plus a Browserbase E2E. Closed the
local-Chrome real-OOPIF gap by writing /tmp/dialog-iframe-test/
smoke_local_oopif.py:

  * 2 http servers on different hostnames (localhost:18905 + 127.0.0.1:18906)
  * Chrome with --site-per-process so the cross-origin iframe becomes a
    real OOPIF in its own process
  * Navigate, find OOPIF in supervisor.frame_tree, call
    browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF>) which routes
    through the supervisor's child session
  * Asserts iframe document.title === 'INNER-FRAME-XYZ' (from the
    inner page, retrieved via OOPIF eval)

PASSED on 2026-04-23.

Tried to embed this as a pytest but hit an asyncio version quirk between
venv (3.11) and the system python (3.13) — Page.navigate hangs in the
pytest harness but works in standalone. Left a self-documenting skip
test that points to the smoke script + describes the verification.

chrome_cdp fixture now passes --site-per-process so future iframe tests
can rely on OOPIF behavior.

Result: 16 pass + 1 documented-skip = 17 tests in
tests/tools/test_browser_supervisor.py.

* docs(browser): add dialog_policy + dialog_timeout_s to configuration.md, fix tool count

Pre-merge docs audit revealed two gaps:

1. user-guide/configuration.md browser config example was missing the
   two new dialog_* knobs. Added with a short table explaining
   must_respond / auto_dismiss / auto_accept semantics and a link to
   the feature page for the full workflow.

2. reference/tools-reference.md header said '54 built-in tools' — real
   count on main is 54, this branch adds browser_dialog so it's 55.
   Fixed the header.  (browser count was already correctly bumped
   11 -> 12 in the earlier docs commit.)

No code changes.
2026-04-23 22:23:37 -07:00
Teknium 0f6eabb890 docs(website): dedicated page per bundled + optional skill (#14929)
Generates a full dedicated Docusaurus page for every one of the 132 skills
(73 bundled + 59 optional) under website/docs/user-guide/skills/{bundled,optional}/<category>/.
Each page carries the skill's description, metadata (version, author, license,
dependencies, platform gating, tags, related skills cross-linked to their own
pages), and the complete SKILL.md body that Hermes loads at runtime.

Previously the two catalog pages just listed skills with a one-line blurb and
no way to see what the skill actually did — users had to go read the source
repo. Now every skill has a browsable, searchable, cross-linked reference in
the docs.

- website/scripts/generate-skill-docs.py — generator that reads skills/ and
  optional-skills/, writes per-skill pages, regenerates both catalog indexes,
  and rewrites the Skills section of sidebars.ts. Handles MDX escaping
  (outside fenced code blocks: curly braces, unsafe HTML-ish tags) and
  rewrites relative references/*.md links to point at the GitHub source.
- website/docs/reference/skills-catalog.md — regenerated; each row links to
  the new dedicated page.
- website/docs/reference/optional-skills-catalog.md — same.
- website/sidebars.ts — Skills section now has Bundled / Optional subtrees
  with one nested category per skill folder.
- .github/workflows/{docs-site-checks,deploy-site}.yml — run the generator
  before docusaurus build so CI stays in sync with the source SKILL.md files.

Build verified locally with `npx docusaurus build`. Only remaining warnings
are pre-existing broken link/anchor issues in unrelated pages.
2026-04-23 22:22:11 -07:00
Teknium eb93f88e1d chore(release): add MattMaximo to AUTHOR_MAP for PR #10450 salvage 2026-04-23 22:01:24 -07:00
Matt Maximo 3ccda2aa05 fix(mcp): seed protocol header before HTTP initialize 2026-04-23 22:01:24 -07:00
Teknium 983bbe2d40 feat(skills): add design-md skill for Google's DESIGN.md spec (#14876)
* feat(config): make tool output truncation limits configurable

Port from anomalyco/opencode#23770: expose a new `tool_output` config
section so users can tune the hardcoded truncation caps that apply to
terminal output and read_file pagination.

Three knobs under `tool_output`:
- max_bytes (default 50_000) — terminal stdout/stderr cap
- max_lines (default 2000) — read_file pagination cap
- max_line_length (default 2000) — per-line cap in line-numbered view

All three keep their existing hardcoded values as defaults, so behaviour
is unchanged when the section is absent. Power users on big-context
models can raise them; small-context local models can lower them.

Implementation:
- New `tools/tool_output_limits.py` reads the section with defensive
  fallback (missing/invalid values → defaults, never raises).
- `tools/terminal_tool.py` MAX_OUTPUT_CHARS now comes from
  get_max_bytes().
- `tools/file_operations.py` normalize_read_pagination() and
  _add_line_numbers() now pull the limits at call time.
- `hermes_cli/config.py` DEFAULT_CONFIG gains the `tool_output` section
  so `hermes setup` writes defaults into fresh configs.
- Docs page `user-guide/configuration.md` gains a "Tool Output
  Truncation Limits" section with large-context and small-context
  example configs.

Tests (18 new in tests/tools/test_tool_output_limits.py):
- Default resolution with missing / malformed / non-dict config.
- Full and partial user overrides.
- Coercion of bad values (None, negative, wrong type, str int).
- Shortcut accessors delegate correctly.
- DEFAULT_CONFIG exposes the section with the right defaults.
- Integration: normalize_read_pagination clamps to the configured
  max_lines.

* feat(skills): add design-md skill for Google's DESIGN.md spec

Built-in skill under skills/creative/ that teaches the agent to author,
lint, diff, and export DESIGN.md files — Google's open-source
(Apache-2.0) format for describing a visual identity to coding agents.

Covers:
- YAML front matter + markdown body anatomy
- Full token schema (colors, typography, rounded, spacing, components)
- Canonical section order + duplicate-heading rejection
- Component property whitelist + variants-as-siblings pattern
- CLI workflow via 'npx @google/design.md' (lint/diff/export/spec)
- Lint rule reference including WCAG contrast checks
- Common YAML pitfalls (quoted hex, negative dimensions, dotted refs)
- Starter template at templates/starter.md

Package verified live on npm (@google/design.md@0.1.1).
2026-04-23 21:51:19 -07:00
Teknium 379b2273d9 fix(mcp): route stdio subprocess stderr to log file, not user TTY (#14901)
MCP stdio servers' stderr was being dumped directly onto the user's
terminal during hermes launch. Servers like FastMCP-based ones print a
large ASCII banner at startup; slack-mcp-server emits JSON logs; etc.
With prompt_toolkit / Rich rendering the TUI concurrently, these
unsolicited writes corrupt the terminal state — hanging the session
~80% of the time for one user with Google Ads Tools + slack-mcp
configured, forcing Ctrl+C and restart loops.

Root cause: `stdio_client(server_params)` in tools/mcp_tool.py was
called without `errlog=`, and the SDK's default is `sys.stderr` —
i.e. the real parent-process stderr, which is the TTY.

Fix: open a shared, append-mode log at $HERMES_HOME/logs/mcp-stderr.log
(created once per process, line-buffered, real fd required by asyncio's
subprocess machinery) and pass it as `errlog` to every stdio_client.
Each server's spawn writes a timestamped header so the shared log stays
readable when multiple servers are running. Falls back to /dev/null if
the log file cannot be opened.

Verified by E2E spawning a subprocess with the log fd as its stderr:
banner lines land in the log file, nothing reaches the calling TTY.
2026-04-23 21:50:25 -07:00
ethernet 7db2703b33 Merge pull request #14895 from NousResearch/tui-resume
fix(tui): keep FloatingOverlays visible when input is blocked
2026-04-24 01:44:50 -03:00
Ari Lotter 7c59e1a871 fix(tui): keep FloatingOverlays visible when input is blocked
FloatingOverlays (SessionPicker, ModelPicker, SkillsHub, pager,
completions) was nested inside the !isBlocked guard in ComposerPane.
When any overlay opened, isBlocked became true, which removed the
entire composer box from the tree — including the overlay that was
trying to render. This made /resume with no args appear to do nothing
(the input line vanished and no picker appeared).

Since 99d859ce (feat: refactor by splitting up app and doing proper
state), isBlocked gated only the text input lines so that
approval/clarify prompts and pickers rendered above a hidden composer.

The regression happened in 408fc893 (fix(tui): tighten composer — status
sits directly above input, overlays anchor to input) when
FloatingOverlays was moved into the input row for anchoring but
accidentally kept inside the !isBlocked guard.

so here, we render FloatingOverlays outside the !isBlocked guard inside
the same position:relative Box, so overlays
stay visible even when text input is hidden. Only the actual input
buffer lines and TextInput are gated now.

Fixes: /resume, /history, /logs, /model, /skills, and completion
dropdowns when blocked overlays are active.
2026-04-23 23:44:52 -04:00
brooklyn! 6fdbf2f2d7 Merge pull request #14820 from NousResearch/bb/tui-at-fuzzy-match
fix(tui): @<name> fuzzy-matches filenames across the repo
2026-04-23 19:40:43 -05:00
Brooklyn Nicholson 0a679cb7ad fix(tui): restore voice/panic handlers + scope fuzzy paths to cwd
Two fixes on top of the fuzzy-@ branch:

(1) Rebase artefact: re-apply only the fuzzy additions on top of
    fresh `tui_gateway/server.py`. The earlier commit was cut from a
    base 58 commits behind main and clobbered ~170 lines of
    voice.toggle / voice.record handlers and the gateway crash hooks
    (`_panic_hook`, `_thread_panic_hook`). Reset server.py to
    origin/main and re-add only:
      - `_FUZZY_*` constants + `_list_repo_files` + `_fuzzy_basename_rank`
      - the new fuzzy branch in the `complete.path` handler

(2) Path scoping (Copilot review): `git ls-files` returns repo-root-
    relative paths, but completions need to resolve under the gateway's
    cwd. When hermes is launched from a subdirectory, the previous
    code surfaced `@file:apps/web/src/foo.tsx` even though the agent
    would resolve that relative to `apps/web/` and miss. Fix:
      - `git -C root rev-parse --show-toplevel` to get repo top
      - `git -C top ls-files …` for the listing
      - `os.path.relpath(top + p, root)` per result, dropping anything
        starting with `../` so the picker stays scoped to cwd-and-below
        (matches Cmd-P workspace semantics)
    `apps/web/src/foo.tsx` ends up as `@file:src/foo.tsx` from inside
    `apps/web/`, and sibling subtrees + parent-of-cwd files don't leak.

New test `test_fuzzy_paths_relative_to_cwd_inside_subdir` builds a
3-package mono-repo, runs from `apps/web/`, and verifies completion
paths are subtree-relative + outside-of-cwd files don't appear.

Copilot review threads addressed: #3134675504 (path scoping),
#3134675532 (`voice.toggle` regression), #3134675541 (`voice.record`
regression — both were stale-base artefacts, not behavioural changes).
2026-04-23 19:38:33 -05:00
Brooklyn Nicholson 41b4d69167 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/tui-at-fuzzy-match 2026-04-23 19:35:18 -05:00
brooklyn! 3f343cf7cf Merge pull request #14822 from NousResearch/bb/tui-inline-diff-segment-anchor
fix(tui): anchor inline_diff to the segment where the edit happened
2026-04-23 19:32:21 -05:00
Brooklyn Nicholson 4ae5b58cb1 fix(tui): restore voice handlers + address copilot review
Rebase-artefact cleanup on this branch:

- Restore `voice.status` and `voice.transcript` cases in
  createGatewayEventHandler plus the `voice` / `submission` /
  `composer.setInput` ctx destructuring. They were added to main in
  the 58-commit gap that this branch was originally cut behind;
  dropping them was unintentional.
- Rebase the test ctx shape to match main (voice.* fakes,
  submission.submitRef, composer.setInput) and apply the same
  segment-anchor test rewrites on top.
- Drop the `#14XXX` placeholder from the tool.complete comment;
  replace with a plain-English rationale.
- Rewrite the broken mid-word "pushInlineDiff- Segment" in
  turnController's dedupe comment to refer to
  pushInlineDiffSegment and `kind: 'diff'` plainly.
- Collapse the filter predicate in recordMessageComplete from a
  4-line if/return into one boolean expression — same semantics,
  reads left-to-right as a single predicate.

Copilot review threads resolved: #3134668789, #3134668805,
#3134668822.
2026-04-23 19:22:41 -05:00
Brooklyn Nicholson 2258a181f0 fix(tui): give inline_diff segments blank-line breathing room
Visual polish on top of the segment-anchor change: diff blocks were
butting up against the narration around them. Tag diff-only segments
with `kind: 'diff'` (extended on Msg) and give them `marginTop={1}` +
`marginBottom={1}` in MessageLine, matching the spacing we already
use for user messages. Also swaps the regex-based `diffSegmentBody`
check for an explicit `kind === 'diff'` guard so the dedupe path is
clearer.
2026-04-23 19:11:59 -05:00
Brooklyn Nicholson 11b2942f16 fix(tui): anchor inline_diff to the segment where the edit happened
Revisits #13729. That PR buffered each `tool.complete`'s inline_diff
and merged them into the final assistant message body as a fenced
```diff block. The merge-at-end placement reads as "the agent wrote
this after the summary", even when the edit fired mid-turn — which
is both misleading and (per blitz feedback) feels like noise tacked
onto the end of every task.

Segment-anchored placement instead:

- On tool.complete with inline_diff, `pushInlineDiffSegment` calls
  `flushStreamingSegment` first (so any in-progress narration lands
  as its own segment), then pushes the ```diff block as its own
  segment into segmentMessages. The diff is now anchored BETWEEN the
  narration that preceded the edit and whatever the agent streams
  afterwards, which is where the edit actually happened.
- `recordMessageComplete` no longer merges buffered diffs. The only
  remaining dedupe is "drop diff-only segments whose body the final
  assistant text narrates verbatim (or whose diff fence the final
  text already contains)" — same tradeoff as before, kept so an
  agent that narrates its own diff doesn't render two stacked copies.
- Drops `pendingInlineDiffs` and `queueInlineDiff` — buffer + end-
  merge machinery is gone; segmentMessages is now the only source
  of truth.

Side benefit: Ctrl+C interrupt (`interruptTurn`) iterates
segmentMessages, so diff segments are now preserved in the
transcript when the user cancels after an edit. Previously the
pending buffer was silently dropped on interrupt.

Reported by Teknium during blitz usage: "no diffs are ever at the
end because it didn't make this file edit after the final message".
2026-04-23 19:02:44 -05:00
Brooklyn Nicholson b08cbc7a79 fix(tui): @<name> fuzzy-matches filenames across the repo
Typing `@appChrome` in the composer should surface
`ui-tui/src/components/appChrome.tsx` without requiring the user to
first type the full directory path — matches the Cmd-P behaviour
users expect from modern editors.

The gateway's `complete.path` handler was doing a plain
`os.listdir(".")` + `startswith` prefix match, so basenames only
resolved inside the current working directory. This reworks it to:

- enumerate repo files via `git ls-files -z --cached --others
  --exclude-standard` (fast, honours `.gitignore`); fall back to a
  bounded `os.walk` that skips common vendor / build dirs when the
  working dir isn't a git repo. Results cached per-root with a 5s
  TTL so rapid keystrokes don't respawn git processes.
- rank basenames with a 5-tier scorer: exact → prefix → camelCase
  / word-boundary → substring → subsequence. Shorter basenames win
  ties; shorter rel paths break basename-length ties.
- only take the fuzzy branch when the query is bare (no `/`), is a
  context reference (`@...`), and isn't `@folder:` — path-ish
  queries and folder tags fall through to the existing
  directory-listing path so explicit navigation intent is
  preserved.

Completion rows now carry `display = basename`,
`meta = directory`, so the picker renders
`appChrome.tsx  ui-tui/src/components` on one row (basename bold,
directory dim) — the meta column was previously "dir" / "" and is
a more useful signal for fuzzy hits.

Reported by Ben Barclay during the TUI v2 blitz test.
2026-04-23 19:01:27 -05:00
ethernet c95c6bdb7c Merge pull request #14818 from NousResearch/ink-perf
perf(ink): cache text measurements across yoga flex re-passes
2026-04-23 20:58:54 -03:00
Ari Lotter bd929ea514 perf(ink): cache text measurements across yoga flex re-passes
Adds a per-ink-text measurement cache keyed by width|widthMode to avoid
re-squashing and re-wrapping the same text when yoga calls measureFunc
multiple times per frame with different widths during flex layout re-pass.
2026-04-23 19:45:10 -04:00
Teknium 6a20e187dd test,chore: cover stringified array/object coercion + AUTHOR_MAP entry
Follow-up to the cherry-picked coercion commit: adds 9 regression tests
covering array/object parsing, invalid-JSON passthrough, wrong-shape
preservation, and the issue #3947 gmail-mcp scenario end-to-end.  Adds
dan@danlynn.com -> danklynn to scripts/release.py AUTHOR_MAP so the
salvage PR's contributor attribution doesn't break CI.
2026-04-23 16:38:38 -07:00
Dan Lynn 9ff21437a0 fix(mcp): coerce stringified arrays/objects in tool args
When a tool schema declares `type: array` or `type: object` and the model
emits the value as a JSON string (common with complex oneOf discriminated
unions), the MCP server rejects it with -32602 "expected array, received
string".  Extend `_coerce_value` to attempt `json.loads` for these types
and replace the string with the parsed value before dispatch.

Root cause confirmed via live testing: `add_reminders.reminders` uses a
oneOf discriminated union (relative/absolute/location) that triggers model
output drift.  Sending a real array passes validation; sending a string
reproduces the exact error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 16:38:38 -07:00
0xbyt4 44a0cbe525 fix(tui): voice mode starts OFF each launch (CLI parity)
The voice.toggle handler was persisting display.voice_enabled /
display.voice_tts to config.yaml, so a TUI session that ever turned
voice on would re-open with it already on (and the mic badge lit) on
every subsequent launch.  cli.py treats voice strictly as runtime
state: _voice_mode = False at __init__, only /voice on flips it, and
nothing writes it back to disk.

Drop the _write_config_key calls in voice.toggle on/off/tts and the
config.yaml fallback in _voice_mode_enabled / _voice_tts_enabled.
State is now env-var-only (HERMES_VOICE / HERMES_VOICE_TTS), scoped to
the live gateway subprocess — the next launch starts clean.
2026-04-23 16:18:15 -07:00
0xbyt4 2af0848f3c fix(tui): ignore SIGPIPE so stderr back-pressure can't kill the gateway
Crash-log stack trace (tui_gateway_crash.log) from the user's session
pinned the regression: SIGPIPE arrived while main thread was blocked on
for-raw-in-sys.stdin — i.e., a background thread (debug print to stderr,
most likely from HERMES_VOICE_DEBUG=1) wrote to a pipe whose buffer the
TUI hadn't drained yet, and SIG_DFL promptly killed the process.

Two fixes that together restore CLI parity:

- entry.py: SIGPIPE → SIG_IGN instead of the _log_signal handler that
  then exited. With SIG_IGN, Python raises BrokenPipeError on the
  offending write, which write_json already handles with a clean exit
  via _log_exit. SIGTERM / SIGHUP still route through _log_signal so
  real termination signals remain diagnosable.

- hermes_cli/voice.py:_debug: wrap the stderr print in a BrokenPipeError
  / OSError try/except. This runs from daemon threads (silence callback,
  TTS playback, beep), so a broken stderr must not escape and ride up
  into the main event loop.

Verified by spawning the gateway subprocess locally:
  voice.toggle status → 200 OK, process stays alive, clean exit on
  stdin close logs "reason=stdin EOF" instead of a silent reap.
2026-04-23 16:18:15 -07:00
0xbyt4 7baf370d3d chore(tui): capture signal-triggered gateway exits in crash log
SIG_DFL for SIGPIPE means the kernel reaps the gateway subprocess the
instant a background thread (TTS playback, silence callback, voice
status emitter) writes to a stdout the TUI stopped reading — before
the Python interpreter can run excepthook, threading.excepthook,
atexit, or the entry.py post-loop _log_exit.

Replace the three SIG_DFL / SIG_IGN bindings with a _log_signal
handler that:

- records which signal (SIGPIPE / SIGTERM / SIGHUP) fired and when;
- dumps the main-thread stack at signal delivery AND every live
  thread's stack via sys._current_frames — the background-thread
  write that provoked SIGPIPE is almost always visible here;
- writes everything to ~/.hermes/logs/tui_gateway_crash.log and prints
  a [gateway-signal] breadcrumb to stderr so the TUI Activity surfaces
  it as well.

SIGINT stays ignored (TUI handles Ctrl+C for the user).
2026-04-23 16:18:15 -07:00
0xbyt4 eeda18a9b7 chore(tui): record gateway exit reason in crash log
Gateway exits weren't reaching the panic hook because entry.py calls
sys.exit(0) on broken stdout — clean termination, no exception.  That
left "gateway exited" in the TUI with zero forensic trail when pipe
breaks happened mid-turn.

Entry.py now tags each exit path — startup-write failure, parse-error-
response write failure, per-method response write failure, stdin EOF —
with a one-line entry in ~/.hermes/logs/tui_gateway_crash.log and a
gateway.stderr breadcrumb.  Includes the JSON-RPC method name on the
dispatch path, which is the only way to tell "died right after handling
voice.toggle on" from "died emitting the second message.complete".
2026-04-23 16:18:15 -07:00
0xbyt4 3a9598337f chore(tui): dump gateway crash traces to ~/.hermes/logs/tui_gateway_crash.log
When the gateway subprocess raises an unhandled exception during a
voice-mode turn, nothing survives: stdout is the JSON-RPC pipe, stderr
flushes but the process is already exiting, and no log file catches
Python's default traceback print.  The user is left with an
undiagnosable "gateway exited" banner.

Install:

- sys.excepthook → write full traceback to tui_gateway_crash.log +
  echo the first line to stderr (which the TUI pumps into
  Activity as a gateway.stderr event).  Chains to the default hook so
  the process still terminates.
- threading.excepthook → same, tagged with the thread name so it's
  clear when the crash came from a daemon thread (beep playback, TTS,
  silence callback, etc.).
- Turn-dispatcher except block now also appends a traceback to the
  crash log before emitting the user-visible error event — str(e)
  alone was too terse to identify where in the voice pipeline the
  failure happened.

Zero behavioural change on the happy path; purely forensics.
2026-04-23 16:18:15 -07:00
0xbyt4 98418afd5d fix(tui): break TTS→STT feedback loop + colorize REC badge
TTS feedback loop (hermes_cli/voice.py)

The VAD loop kept the microphone live while speak_text played the
agent's reply over the speakers, so the reply itself was picked up,
transcribed, and submitted — the agent then replied to its own echo
("Ha, looks like we're in a loop").

Ported cli.py:_voice_tts_done synchronisation:

- _tts_playing: threading.Event (initially set = "not playing").
- speak_text cancels the active recorder before opening the speakers,
  clears _tts_playing, and on exit waits 300 ms before re-starting the
  recorder — long enough for the OS audio device to settle so afplay
  and sounddevice don't race for it.
- _continuous_on_silence now waits on _tts_playing (up to 60 s) before
  re-arming the mic with another 300 ms gap, mirroring
  cli.py:10619-10621.  If the user flips voice off during the wait the
  loop exits cleanly instead of fighting for the device.

Without both halves the loop races: if the silence callback fires
before TTS starts it re-arms immediately; if TTS is already playing
the pause-and-resume path catches it.

Red REC badge (ui-tui appChrome + useMainApp)

Classic CLI (cli.py:_get_voice_status_fragments) renders "● REC" in
red and "◉ STT" in amber.  TUI was showing a dim "REC" with no dot,
making it hard to spot at a glance.  voiceLabel now emits the same
glyphs and appChrome colours them via t.color.error / t.color.warn,
falling back to dim for the idle label.
2026-04-23 16:18:15 -07:00
0xbyt4 42ff785771 fix(tui): voice TTS speak-back + transcript-key bug + auto-submit
Three issues surfaced during end-to-end testing of the CLI-parity voice
loop and are fixed together because they all blocked "speak → agent
responds → TTS reads it back" from working at all:

1. Wrong result key (hermes_cli/voice.py)

   transcribe_recording() returns {"success": bool, "transcript": str},
   matching cli.py:_voice_stop_and_transcribe. The wrapper was reading
   result.get("text"), which is None, so every successful Groq / local
   STT response was thrown away and the 3-strikes halt fired after
   three silent-looking cycles. Fixed by reading "transcript" and also
   honouring "success" like the CLI does. Updated the loop simulation
   tests to return the correct shape.

2. TTS speak-back was missing (tui_gateway/server.py + hermes_cli/voice.py)

   The TUI had a voice.toggle "tts" subcommand but nothing downstream
   actually read the flag — agent replies never spoke. Mirrored
   cli.py:8747-8754's dispatch: on message.complete with status ==
   "complete", if _voice_tts_enabled() is true, spawn a daemon thread
   running speak_text(response). Rewrote speak_text as a full port of
   cli.py:_voice_speak_response — same markdown-strip regex pipeline
   (code blocks, links, bold/italic, inline code, headers, list bullets,
   horizontal rules, excessive newlines), same 4000-char cap, same
   explicit mp3 output path, same MP3-over-OGG playback choice (afplay
   misbehaves on OGG), same cleanup of both extensions. Keeps TUI TTS
   audible output byte-for-byte identical to the classic CLI.

3. Auto-submit swallowed on non-empty composer (createGatewayEventHandler.ts)

   The voice.transcript handler branched on prev input via a setInput
   updater and fired submitRef.current inside the updater when prev was
   empty. React strict mode double-invokes state updaters, which would
   queue the submit twice; and when the composer had any content the
   transcript was merely appended — the agent never saw it. CLI
   _pending_input.put(transcript) unconditionally feeds the transcript
   as the next turn, so match that: always clear the composer and
   setTimeout(() => submitRef.current(text), 0) outside any updater.
   Side effect can't run twice this way, and a half-typed draft on the
   rare occasion is a fair trade vs. silently dropping the turn.

Also added peak_rms to the rec.stop debug line so "recording too quiet"
is diagnosable at a glance when HERMES_VOICE_DEBUG=1.
2026-04-23 16:18:15 -07:00
0xbyt4 04c489b587 feat(tui): match CLI's voice slash + VAD-continuous recording model
The TUI had drifted from the CLI's voice model in two ways:

- /voice on was lighting up the microphone immediately and Ctrl+B was
  interpreted as a mode toggle.  The CLI separates the two: /voice on
  just flips the umbrella bit, recording only starts once the user
  presses Ctrl+B, which also sets _voice_continuous so the VAD loop
  auto-restarts until the user presses Ctrl+B again or three silent
  cycles pass.
- /voice tts was missing entirely, so users couldn't turn agent reply
  speech on/off from inside the TUI.

This commit brings the TUI to parity.

Python

- hermes_cli/voice.py: continuous-mode API (start_continuous,
  stop_continuous, is_continuous_active) layered on the existing PTT
  wrappers. The silence callback transcribes, fires on_transcript,
  tracks consecutive no-speech cycles, and auto-restarts — mirroring
  cli.py:_voice_stop_and_transcribe + _restart_recording.
- tui_gateway/server.py:
  - voice.toggle now supports on / off / tts / status.  The umbrella
    bit lives in HERMES_VOICE + display.voice_enabled; tts lives in
    HERMES_VOICE_TTS + display.voice_tts.  /voice off also tears down
    any active continuous loop so a toggle-off really releases the
    microphone.
  - voice.record start/stop now drives start_continuous/stop_continuous.
    start is refused with a clear error when the mode is off, matching
    cli.py:handle_voice_record's early return on `not _voice_mode`.
  - New voice.transcript / voice.status events emit through
    _voice_emit (remembers the sid that last enabled the mode so
    events land in the right session).

TypeScript

- gatewayTypes.ts: voice.status + voice.transcript event
  discriminants; VoiceToggleResponse gains tts; VoiceRecordResponse
  gains status for the new "started/stopped" responses.
- interfaces.ts: GatewayEventHandlerContext gains composer.setInput +
  submission.submitRef + voice.{setRecording, setProcessing,
  setVoiceEnabled}; InputHandlerContext.voice gains enabled +
  setVoiceEnabled for the mode-aware Ctrl+B handler.
- createGatewayEventHandler.ts: voice.status drives REC/STT badges;
  voice.transcript auto-submits when the composer is empty (CLI
  _pending_input.put parity) and appends when a draft is in flight.
  no_speech_limit flips voice off + sys line.
- useInputHandlers.ts: Ctrl+B now calls voice.record (start/stop),
  not voice.toggle, and nudges the user with a sys line when the
  mode is off instead of silently flipping it on.
- useMainApp.ts: wires the new event-handler context fields.
- slash/commands/session.ts: /voice handles on / off / tts / status
  with CLI-matching output ("voice: mode on · tts off").

Backward compat preserved for voice.record (was always PTT shape;
gateway still honours start/stop with mode-gating added).
2026-04-23 16:18:15 -07:00
0xbyt4 0bb460b070 fix(tui): add missing hermes_cli.voice wrapper for gateway RPC
tui_gateway/server.py:3486/3491/3509 imports start_recording,
stop_and_transcribe, and speak_text from hermes_cli.voice, but the
module never existed (not in git history — never shipped, never
deleted). Every voice.record / voice.tts RPC call hit the ImportError
branch and the TUI surfaced it as "voice module not available — install
audio dependencies" even on boxes with sounddevice / faster-whisper /
numpy installed.

Adds a thin wrapper on top of tools.voice_mode (recording +
transcription) and tools.tts_tool (text-to-speech):

- start_recording() — idempotent; stores the active AudioRecorder in a
  module-global guarded by a Lock so repeat Ctrl+B presses don't fight
  over the mic.
- stop_and_transcribe() — returns None for no-op / no-speech /
  Whisper-hallucination cases so the TUI's existing "no speech detected"
  path keeps working unchanged.
- speak_text(text) — lazily imports tts_tool (optional provider SDKs
  stay unloaded until the first /voice tts call), parses the tool's
  JSON result, and plays the audio via play_audio_file.

Paired with the Ctrl+B keybinding fix in the prior commit, the TUI
voice pipeline now works end-to-end for the first time.
2026-04-23 16:18:15 -07:00
0xbyt4 3504bd401b fix(tui): route Ctrl+B to voice toggle, not composer input
When the user runs /voice and then presses Ctrl+B in the TUI, three
handlers collaborate to consume the chord and none of them dispatch
voice.record:

- isAction() is platform-aware — on macOS it requires Cmd (meta/super),
  so Ctrl+B fails the match in useInputHandlers and never triggers
  voiceStart/voiceStop.
- TextInput's Ctrl+B pass-through list doesn't include 'b', so the
  keystroke falls through to the wordMod backward-word branch on Linux
  and to the printable-char insertion branch on macOS — the latter is
  exactly what timmie reported ("enters a b into the tui").
- /voice emits "voice: on" with no hint, so the user has no way to
  know Ctrl+B is the recording toggle.

Introduces isVoiceToggleKey(key, ch) in lib/platform.ts that matches
raw Ctrl+B on every platform (mirrors tips.py and config.yaml's
voice.record_key default) and additionally accepts Cmd+B on macOS so
existing muscle memory keeps working. Wires it into useInputHandlers,
adds Ctrl+B to TextInput's pass-through list so the global handler
actually receives the chord, and appends "press Ctrl+B to record" to
the /voice on message.

Empirically verified with hermes --tui: Ctrl+B no longer leaks 'b'
into the composer and now dispatches the voice.record RPC (the
downstream ImportError for hermes_cli.voice is a separate upstream
bug — follow-up patch).
2026-04-23 16:18:15 -07:00
Teknium 50d97edbe1 feat(delegation): bump default child_timeout_seconds to 600s (#14809)
The 300s default was too tight for high-reasoning models on non-trivial
delegated tasks — e.g. gpt-5.5 xhigh reviewing 12 files would burn >5min
on reasoning tokens before issuing its first tool call, tripping the
hard wall-clock timeout with 0 api_calls logged.

- tools/delegate_tool.py: DEFAULT_CHILD_TIMEOUT 300 -> 600
- hermes_cli/config.py: surface delegation.child_timeout_seconds in
  DEFAULT_CONFIG so it's discoverable (previously the key was read by
  _get_child_timeout() but absent from the default config schema)

Users can still override via config.yaml delegation.child_timeout_seconds
or DELEGATION_CHILD_TIMEOUT_SECONDS env var (floor 30s, no ceiling).
2026-04-23 16:14:55 -07:00
Teknium e26c4f0e34 fix(kimi,mcp): Moonshot schema sanitizer + MCP schema robustness (#14805)
Fixes a broader class of 'tools.function.parameters is not a valid
moonshot flavored json schema' errors on Nous / OpenRouter aggregators
routing to moonshotai/kimi-k2.6 with MCP tools loaded.

## Moonshot sanitizer (agent/moonshot_schema.py, new)

Model-name-routed (not base-URL-routed) so Nous / OpenRouter users are
covered alongside api.moonshot.ai.  Applied in
ChatCompletionsTransport.build_kwargs when is_moonshot_model(model).

Two repairs:
1. Fill missing 'type' on every property / items / anyOf-child schema
   node (structural walk — only schema-position dicts are touched, not
   container maps like properties/$defs).
2. Strip 'type' at anyOf parents; Moonshot rejects it.

## MCP normalizer hardened (tools/mcp_tool.py)

Draft-07 $ref rewrite from PR #14802 now also does:
- coerce missing / null 'type' on object-shaped nodes (salvages #4897)
- prune 'required' arrays to names that exist in 'properties'
  (salvages #4651; Gemini 400s on dangling required)
- apply recursively, not just top-level

These repairs are provider-agnostic so the same MCP schema is valid on
OpenAI, Anthropic, Gemini, and Moonshot in one pass.

## Crash fix: safe getattr for Tool.inputSchema

_convert_mcp_schema now uses getattr(t, 'inputSchema', None) so MCP
servers whose Tool objects omit the attribute entirely no longer abort
registration (salvages #3882).

## Validation

- tests/agent/test_moonshot_schema.py: 27 new tests (model detection,
  missing-type fill, anyOf-parent strip, non-mutation, real-world MCP
  shape)
- tests/tools/test_mcp_tool.py: 7 new tests (missing / null type,
  required pruning, nested repair, safe getattr)
- tests/agent/transports/test_chat_completions.py: 2 new integration
  tests (Moonshot route sanitizes, non-Moonshot route doesn't)
- Targeted suite: 49 passed
- E2E via execute_code with a realistic MCP tool carrying all three
  Moonshot rejection modes + dangling required + draft-07 refs:
  sanitizer produces a schema valid on Moonshot and Gemini
2026-04-23 16:11:57 -07:00
helix4u 24f139e16a fix(mcp): rewrite definitions refs to in input schemas 2026-04-23 15:56:57 -07:00
Teknium ef5eaf8d87 feat(cron): honor hermes tools config for the cron platform (#14798)
Cron now resolves its toolset from the same per-platform config the
gateway uses — `_get_platform_tools(cfg, 'cron')` — instead of blindly
loading every default toolset.  Existing cron jobs without a per-job
override automatically lose `moa`, `homeassistant`, and `rl` (the
`_DEFAULT_OFF_TOOLSETS` set), which stops the "surprise $4.63
mixture_of_agents run" class of bug (Norbert, Discord).

Precedence inside `run_job`:
  1. per-job `enabled_toolsets` (PR #14767 / #6130) — wins if set
  2. `_get_platform_tools(cfg, 'cron')` — new, the blanket gate
  3. `None` fallback (legacy) — only on resolver exception

Changes:
- hermes_cli/platforms.py: register 'cron' with default_toolset
  'hermes-cron'
- toolsets.py: add 'hermes-cron' toolset (mirrors 'hermes-cli';
  `_get_platform_tools` then filters via `_DEFAULT_OFF_TOOLSETS`)
- cron/scheduler.py: add `_resolve_cron_enabled_toolsets(job, cfg)`,
  call it at the `AIAgent(...)` kwargs site
- tests/cron/test_scheduler.py: replace the 'None when not set' test
  (outdated contract) with an invariant ('moa not in default cron
  toolset') + new per-job-wins precedence test
- tests/hermes_cli/test_tools_config.py: mark 'cron' as non-messaging
  in the gateway-toolset-coverage test
2026-04-23 15:48:50 -07:00
Teknium bf196a3fc0 chore: release v0.11.0 (2026.4.23) (#14791)
The Interface release — new Ink-based TUI, pluggable transport architecture,
native AWS Bedrock, five new inference paths (NVIDIA NIM, Arcee, Step Plan,
Gemini CLI OAuth, ai-gateway), GPT-5.5 via Codex OAuth, QQBot (17th platform),
expanded plugin surface, dashboard plugin system + live theme switching, /steer
mid-run nudges, shell hooks, webhook direct-delivery, smarter delegation, and
auxiliary models config UI.

Also folds in the v0.10.0 deferred batch (v0.10.0 shipped only the Nous Tool
Gateway). 1,556 commits · 761 PRs · 290 contributors since v0.9.0.
2026-04-23 15:31:59 -07:00
Teknium f593c367be feat(dashboard): reskin extension points for themes and plugins (#14776)
Themes and plugins can now pull off arbitrary dashboard reskins (cockpit
HUD, retro terminal, etc.) without touching core code.

Themes gain four new fields:
- layoutVariant: standard | cockpit | tiled — shell layout selector
- assets: {bg, hero, logo, crest, sidebar, header, custom: {...}} —
  artwork URLs exposed as --theme-asset-* CSS vars
- customCSS: raw CSS injected as a scoped <style> tag on theme apply
  (32 KiB cap, cleaned up on theme switch)
- componentStyles: per-component CSS-var overrides (clipPath,
  borderImage, background, boxShadow, ...) for card/header/sidebar/
  backdrop/tab/progress/badge/footer/page

Plugin manifests gain three new fields:
- tab.override: replaces a built-in route instead of adding a tab
- tab.hidden: register component + slots without adding a nav entry
- slots: declares shell slots the plugin populates

10 named shell slots: backdrop, header-left/right/banner, sidebar,
pre-main, post-main, footer-left/right, overlay. Plugins register via
window.__HERMES_PLUGINS__.registerSlot(name, slot, Component). A
<PluginSlot> React helper is exported on the plugin SDK.

Ships a full demo at plugins/strike-freedom-cockpit/ — theme YAML +
slot-only plugin that reproduces a Gundam cockpit dashboard: MS-STATUS
sidebar with live telemetry, COMPASS crest in header, notched card
corners via componentStyles, scanline overlay via customCSS, gold/cyan
palette, Orbitron typography.

Validation:
- 15 new tests in test_web_server.py covering every extended field
- tests/hermes_cli/: 2615 passed (3 pre-existing unrelated failures)
- tsc -b --noEmit: clean
- vite build: 418 kB bundle, ~2 kB delta for slots/theme extensions

Co-authored-by: Teknium <p@nousresearch.com>
2026-04-23 15:31:01 -07:00
Teknium 470389e6a3 chore(release): map say8hi author for #6130 salvage 2026-04-23 15:16:18 -07:00
say8hi 18d5ba8676 test(cron): add tests for enabled_toolsets in create_job and run_job 2026-04-23 15:16:18 -07:00
say8hi 8b79acb8de feat(cron): expose enabled_toolsets in cronjob tool and create_job() 2026-04-23 15:16:18 -07:00
say8hi 0086fd894d feat(cron): support enabled_toolsets per job to reduce token overhead 2026-04-23 15:16:18 -07:00
Teknium 5e67b38437 chore(release): map devorun author + convert MoA defaults test to invariant
- AUTHOR_MAP entry for 130918800+devorun for #6636 attribution
- test_moa_defaults: was a change-detector tied to the exact frontier
  model list — flips red every OpenRouter churn. Rewritten as an
  invariant (non-empty, valid vendor/model slugs).
2026-04-23 15:14:11 -07:00
Devorun 1df35a93b2 Fix (mixture_of_agents): replace deprecated Gemini model and forward max_tokens to OpenRouter (#6621) 2026-04-23 15:14:11 -07:00
teknium1 9599271180 fix(xai-image): drop unreachable editing code path
The agent-facing image_generate tool only passes prompt + aspect_ratio to
provider.generate() (see tools/image_generation_tool.py:953). The editing
block (reference_images / edit_image kwargs) could never fire from the
tool surface, and the xAI edits endpoint is /images/edits with a
different payload shape anyway — not /images/generations as submitted.

- Remove reference_images / edit_image kwargs handling from generate()
- Remove matching test_with_reference_images case
- Update docstring + plugin.yaml description to text-to-image only
- Surface resolution in the success extras

Follow-up to PR #14547. Tests: 18/18 pass.
2026-04-23 15:13:34 -07:00
Julien Talbot a5e4a86ebe feat(xai): add xAI image generation provider (grok-imagine-image)
Add xAI as a plugin-based image generation backend using grok-imagine-image.
Follows the existing ImageGenProvider ABC pattern used by OpenAI and FAL.

Changes:
- plugins/image_gen/xai/__init__.py: xAI provider implementation
  - Uses xAI /images/generations endpoint
  - Supports text-to-image and image editing with reference images
  - Multiple aspect ratios (1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3)
  - Multiple resolutions (1K, 2K)
  - Base64 output saved to cache
  - Config via config.yaml image_gen.xai section
- plugins/image_gen/xai/plugin.yaml: plugin metadata
- tests/plugins/image_gen/test_xai_provider.py: 19 unit tests
  - Provider class (name, display_name, is_available, list_models, setup_schema)
  - Config (default model, resolution, custom model)
  - Generate (missing key, success b64/url, API error, timeout, empty response, reference images, auth header)
  - Registration

Requires XAI_API_KEY in ~/.hermes/.env.
To use: set image_gen.provider: xai in config.yaml.
2026-04-23 15:13:34 -07:00
Teknium d42b6a2edd docs(agents): refresh AGENTS.md — fix stale facts, expand plugins/skills sections (#14763)
Fixes several outright-wrong facts and gaps vs current main:

- venv activation: .venv is preferred, venv is fallback (per run_tests.sh)
- AIAgent default model is "" (empty, resolved from config), not hardcoded opus
- Test suite is ~15k tests / ~700 files, not ~3000
- tools/mcp_tool.py is 2.6k LOC, not 1050
- Remove stale "currently 5" config_version note; the real bump-trigger rule
  is migration-only, not every new key
- Remove MESSAGING_CWD as the messaging cwd — it's been removed in favor of
  terminal.cwd in config.yaml (gateway bridges to TERMINAL_CWD env var)
- .env is secrets-only; non-secret settings belong in config.yaml
- simple_term_menu pitfall: existing sites are legacy fallback, rule is
  no new usage

Incomplete/missing sections filled in:

- Gateway platforms list updated to reflect actual adapters (matrix,
  mattermost, email, sms, dingtalk, wecom, weixin, feishu, bluebubbles,
  webhook, api_server, etc.)
- New 'Plugins' section covering general plugins, memory-provider plugins,
  and dashboard/context-engine/image-gen plugin directories — including
  the May 2026 rule that plugins must not touch core files
- New 'Skills' section covering skills/ vs optional-skills/ split and
  SKILL.md frontmatter fields
- Logs section pointing at ~/.hermes/logs/ and 'hermes logs' CLI
- Prompt-cache policy now explicitly mentions --now / deferred slash-command
  invalidation pattern
- Two new pitfalls: gateway two-guard dispatch rule, squash-merge-from-stale
  branch silent revert, don't-wire-dead-code rule

Tree layout trimmed to load-bearing entry points — per-file subtrees were
~70% stale so replaced with directory-level notes pointing readers at the
filesystem as the source of truth.
2026-04-23 15:13:13 -07:00
Teknium d001814e3f chore(release): map rohithsaimidigudla@gmail.com -> whitehatjr1001 2026-04-23 15:12:42 -07:00
whitehatjr1001 9d147f7fde fix(gateway): enhance message handling during agent tasks with queue mode support 2026-04-23 15:12:42 -07:00
Teknium 692ae6dd07 docs(readme): fix stale RL submodule instructions, skills table row, test runner (#14758)
- Drop broken tinker-atropos submodule instructions: no .gitmodules exists,
  tinker-atropos/ is empty, and atroposlib + tinker are regular pip deps in
  pyproject.toml pulled in by .[all,dev]. Replace with a one-line note.
- CLI vs Messaging table: /skills is cli_only=True in COMMAND_REGISTRY, so
  remove it from the messaging column. /<skill-name> still works there.
- Point contributors at scripts/run_tests.sh (the canonical runner enforcing
  CI-parity env) instead of bare pytest.
2026-04-23 15:12:04 -07:00
Teknium b61ac8964b fix(gateway/discord): read permission attrs from AppCommand, canonicalize contexts
Follow-up to Magaav's safe sync policy. Two gaps in the canonicalizer
caused false diffs or silent drift:

1. discord.py's AppCommand.to_dict() omits nsfw, dm_permission, and
   default_member_permissions — those live only on attributes. The
   canonicalizer was reading them via payload.get() and getting defaults
   (False/True/None), while the desired side from Command.to_dict(tree)
   had the real values. Any command using non-default permissions
   false-diffed on every startup. Pull them from the AppCommand
   attributes via _existing_command_to_payload().

2. contexts and integration_types weren't canonicalized at all, so
   drift in either was silently ignored. Added both to
   _canonicalize_app_command_payload (sorted for stable compare).

Also normalized default_member_permissions to str-or-None since the
server emits strings but discord.py stores ints locally.

Added regression tests for both gaps.
2026-04-23 15:11:56 -07:00
Magaav a1ff6b45ea fix(gateway/discord): add safe startup slash sync policy
Replaces blind tree.sync() on every Discord reconnect with a diff-based
reconcile. In safe mode (default), fetch existing global commands,
compare desired vs existing payloads, skip unchanged, PATCH changed,
recreate when non-patchable metadata differs, POST missing, and delete
stale commands one-by-one. Keeps 'bulk' for legacy behavior and 'off'
to skip startup sync entirely.

Fixes restart-heavy workflows that burn Discord's command write budget
and can surface 429s when iterating on native slash commands.

Env var: DISCORD_COMMAND_SYNC_POLICY (safe|bulk|off), default 'safe'.

Co-authored-by: Codex <codex@openai.invalid>
2026-04-23 15:11:56 -07:00
Yukipukii1 4a0c02b7dc fix(file_tools): resolve bookkeeping paths against live terminal cwd 2026-04-23 15:11:52 -07:00
Teknium 83859b4da0 chore(release): map jefferson@heimdallstrategy.com -> Mind-Dragon 2026-04-23 15:11:47 -07:00
Jefferson 67c8f837fc fix(mcp): per-process PID isolation prevents cross-session crash on restart
- _stdio_pids: set → Dict[int,str] tracks pid→server_name
- SIGTERM-first with 2s grace before SIGKILL escalation
- hasattr guard for SIGKILL on platforms without it
- Updated tests for dict-based tracking and 3-phase kill sequence
2026-04-23 15:11:47 -07:00
MaxsolcuCrypto c7d023937c Update CONTRIBUTING.md 2026-04-23 15:08:41 -07:00
sprmn24 78d1e252fa fix(web_server): guard GATEWAY_HEALTH_TIMEOUT against invalid env values
float(os.getenv(...)) at module level raises ValueError on any
non-numeric value, crashing the web server at import before it starts.

Wrap in try/except with a warning log and fallback to 3.0s.
2026-04-23 15:07:25 -07:00
hharry11 d0821b0573 fix(gateway): only clear locks belonging to the replaced process 2026-04-23 15:07:06 -07:00
Teknium a0d8dd7ba3 chore(release): map eumael.mkt@gmail.com -> maelrx
For release-notes attribution of PR #9170 (MiniMax context preservation).
2026-04-23 14:06:37 -07:00
maelrx e020f46bec fix(agent): preserve MiniMax context length on delta-only overflow 2026-04-23 14:06:37 -07:00
helix4u a884f6d5d8 fix(skills): follow symlinked category dirs consistently 2026-04-23 14:05:47 -07:00
Teknium b848ce2c79 test: cover absolute paths in project env/config approval regex
The original regex only matched relative paths (./foo/.env or bare
.env), so the exact command from the bug report —
`cp /opt/data/.env.local /opt/data/.env` — did not trigger approval.
Broaden the leading-path prefix to accept an absolute leading slash
alongside ./ and ../, and add regressions for the bug-report command
and its redirection variant.
2026-04-23 14:05:36 -07:00
helix4u 1dfcda4e3c fix(approval): guard env and config overwrites 2026-04-23 14:05:36 -07:00
helix4u 1cc0bdd5f3 fix(dashboard): avoid auth header collision with reverse proxies 2026-04-23 14:05:23 -07:00
sgaofen 07046096d9 fix(agent): clarify exhausted OpenRouter auxiliary credentials 2026-04-23 14:04:31 -07:00
Teknium 97b9b3d6a6 fix(gateway): drain-aware hermes update + faster still-working pings (#14736)
cmd_update no longer SIGKILLs in-flight agent runs, and users get
'still working' status every 3 min instead of 10. Two long-standing
sources of '@user — agent gives up mid-task' reports on Telegram and
other gateways.

Drain-aware update:
- New helper hermes_cli.gateway._graceful_restart_via_sigusr1(pid,
  drain_timeout) sends SIGUSR1 to the gateway and polls os.kill(pid,
  0) until the process exits or the budget expires.
- cmd_update's systemd loop now reads MainPID via 'systemctl show
  --property=MainPID --value' and tries the graceful path first. The
  gateway's existing SIGUSR1 handler -> request_restart(via_service=
  True) -> drain -> exit(75) is wired in gateway/run.py and is
  respawned by systemd's Restart=on-failure (and the explicit
  RestartForceExitStatus=75 on newer units).
- Falls back to 'systemctl restart' when MainPID is unknown, the
  drain budget elapses, or the unit doesn't respawn after exit (older
  units missing Restart=on-failure). Old install behavior preserved.
- Drain budget = max(restart_drain_timeout, 30s) + 15s margin so the
  drain loop in run_agent + final exit have room before fallback
  fires. Composes with #14728's tool-subprocess reaping.

Notification interval:
- agent.gateway_notify_interval default 600 -> 180.
- HERMES_AGENT_NOTIFY_INTERVAL env-var fallback in gateway/run.py
  matched.
- 9-minute weak-model spinning runs now ping at 3 min and 6 min
  instead of 27 seconds before completion, removing the 'is the bot
  dead?' reflex that drives gateway-restart cycles.

Tests:
- Two new tests in tests/hermes_cli/test_update_gateway_restart.py:
  one asserts SIGUSR1 is sent and 'systemctl restart' is NOT called
  when MainPID is known and the helper succeeds; one asserts the
  fallback fires when the helper returns False.
- E2E: spawned detached bash processes confirm the helper returns
  True on SIGUSR1-handling exit (~0.5s) and False on SIGUSR1-ignoring
  processes (timeout). Verified non-existent PID and pid=0 edge cases.
- 41/41 in test_update_gateway_restart.py (was 39, +2 new).
- 154/154 in shutdown-related suites including #14728's new tests.

Reported by @GeoffWellman and @ANT_1515 on X.
2026-04-23 14:01:57 -07:00
Teknium 165b2e481a feat(agent): make API retry count configurable via agent.api_max_retries (#14730)
Closes #11616.

The agent's API retry loop hardcoded max_retries = 3, so users with
fallback providers on flaky primaries burned through ~3 × provider
timeout (e.g. 3 × 180s = 9 minutes) before their fallback chain got a
chance to kick in.

Expose a new config key:

    agent:
      api_max_retries: 3  # default unchanged

Set it to 1 for fast failover when you have fallback providers, or
raise it if you prefer longer tolerance on a single provider. Values
< 1 are clamped to 1 (single attempt, no retry); non-integer values
fall back to the default.

This wraps the Hermes-level retry loop only — the OpenAI SDK's own
low-level retries (max_retries=2 default) still run beneath this for
transient network errors.

Changes:
- hermes_cli/config.py: add agent.api_max_retries default 3 with comment.
- run_agent.py: read self._api_max_retries in AIAgent.__init__; replace
  hardcoded max_retries = 3 in the retry loop with self._api_max_retries.
- cli-config.yaml.example: documented example entry.
- hermes_cli/tips.py: discoverable tip line.
- tests/run_agent/test_api_max_retries_config.py: 4 tests covering
  default, override, clamp-to-one, and invalid-value fallback.
2026-04-23 13:59:32 -07:00
Teknium 327b57da91 fix(gateway): kill tool subprocesses before adapter disconnect on drain timeout (#14728)
Closes #8202.

Root cause: stop() reclaimed tool-call bash/sleep children only at the
very end of the shutdown sequence — after a 60s drain, 5s interrupt
grace, and per-adapter disconnect. Under systemd (TimeoutStopSec bounded
by drain_timeout), that meant the cgroup SIGKILL escalation fired first,
and systemd reaped the bash/sleep children instead of us.

Fix:
- Extract tool-subprocess cleanup into a local helper
  _kill_tool_subprocesses() in _stop_impl().
- Invoke it eagerly right after _interrupt_running_agents() on the
  drain-timeout path, before adapter disconnect.
- Keep the existing catch-all call at the end for the graceful path
  and defense in depth against mid-teardown respawns.
- Bump generated systemd unit TimeoutStopSec to drain_timeout + 30s
  so cleanup + disconnect + DB close has headroom above the drain
  budget, matching the 'subprocess timeout > TimeoutStopSec + margin'
  rule from the skill.

Tests:
- New: test_gateway_stop_kills_tool_subprocesses_before_adapter_disconnect_on_timeout
  asserts kill_all() runs before disconnect() when drain times out.
- New: test_gateway_stop_kills_tool_subprocesses_on_graceful_path
  guards that the final catch-all still fires when drain succeeds
  (regression guard against accidental removal during refactor).
- Updated: existing systemd unit generator tests expect TimeoutStopSec=90
  (= 60s drain + 30s headroom) with explanatory comment.
2026-04-23 13:59:29 -07:00
Teknium 64e6165686 fix(delegate): remove model-facing max_iterations override; config is authoritative (#14732)
Previously delegate_task exposed 'max_iterations' in its JSON schema and used
`max_iterations or default_max_iter` — so a model guessing conservatively (or
copy-pasting a docstring hint like 'Only set lower for simple tasks') could
silently shrink a subagent's budget below the user's configured
delegation.max_iterations. One such call this session capped a deep forensic
audit at 40 iterations while the user's config was set to 250.

Changes:
- Drop 'max_iterations' from DELEGATE_TASK_SCHEMA['parameters']['properties'].
  Models can no longer emit it.
- In delegate_task(): ignore any caller-supplied max_iterations, always use
  delegation.max_iterations from config. Log at debug if a stale schema or
  internal caller still passes one through.
- Keep the Python kwarg on the function signature for internal callers
  (_build_child_agent tests pass it through the plumbing layer).
- Update test_schema_valid to assert the param is now absent (intentional
  contract change, not a change-detector).
2026-04-23 13:56:26 -07:00
Teknium b5333abc30 fix(auth): refuse to touch real auth.json during pytest; delete sandbox-escaping test (#14729)
A test in tests/agent/test_credential_pool.py
(test_try_refresh_current_updates_only_current_entry) monkeypatched
refresh_codex_oauth_pure() to return the literal fixture strings
'access-new'/'refresh-new', then executed the real production code path
in agent/credential_pool.py::try_refresh_current which calls
_sync_device_code_entry_to_auth_store → _save_provider_state → writes
to `providers.openai-codex.tokens`. That writer resolves the target via
get_hermes_home()/auth.json. If the test ran with HERMES_HOME unset (direct
pytest invocation, IDE runner bypassing conftest discovery, or any other
sandbox escape), it would overwrite the real user's auth store with the
fixture strings.

Observed in the wild: Teknium's ~/.hermes/auth.json providers.openai-codex.tokens
held 'access-new'/'refresh-new' for five days. His CLI kept working because
the credential_pool entries still held real JWTs, but `hermes model`'s live
discovery path (which reads via resolve_codex_runtime_credentials →
_read_codex_tokens → providers.tokens) was silently 401-ing.

Fixes:
- Delete test_try_refresh_current_updates_only_current_entry. It was the
  only test that exercised a writer hitting providers.openai-codex.tokens
  with literal stub tokens. The entry-level rotation behavior it asserted
  is still covered by test_mark_exhausted_and_rotate_persists_status above.
- Add a seat belt in hermes_cli.auth._auth_file_path(): if PYTEST_CURRENT_TEST
  is set AND the resolved path equals the real ~/.hermes/auth.json, raise
  with a clear message. In production (no PYTEST_CURRENT_TEST), a single
  dict lookup. Any future test that forgets to monkeypatch HERMES_HOME
  fails loudly instead of corrupting the user's credentials.

Validation:
- production (no PYTEST_CURRENT_TEST): returns real path, unchanged behavior
- pytest + HERMES_HOME unset (points at real home): raises with message
- pytest + HERMES_HOME=/tmp/...: returns tmp path, tests pass normally
2026-04-23 13:50:21 -07:00
Teknium 255ba5bf26 feat(dashboard): expand themes to fonts, layout, density (#14725)
Dashboard themes now control typography and layout, not just colors.
Each built-in theme picks its own fonts, base size, radius, and density
so switching produces visible changes beyond hue.

Schema additions (per theme):

- typography — fontSans, fontMono, fontDisplay, fontUrl, baseSize,
  lineHeight, letterSpacing. fontUrl is injected as <link> on switch
  so Google/Bunny/self-hosted stylesheets all work.
- layout — radius (any CSS length) and density
  (compact | comfortable | spacious, multiplies Tailwind spacing).
- colorOverrides (optional) — pin individual shadcn tokens that would
  otherwise derive from the palette.

Built-in themes are now distinct beyond palette:

- default  — system stack, 15px, 0.5rem radius, comfortable
- midnight — Inter + JetBrains Mono, 14px, 0.75rem, comfortable
- ember    — Spectral (serif) + IBM Plex Mono, 15px, 0.25rem
- mono     — IBM Plex Sans + Mono, 13px, 0 radius, compact
- cyberpunk— Share Tech Mono everywhere, 14px, 0 radius, compact
- rose     — Fraunces (serif) + DM Mono, 16px, 1rem, spacious

Also fixes two bugs:

1. Custom user themes silently fell back to default. ThemeProvider
   only applied BUILTIN_THEMES[name], so YAML files in
   ~/.hermes/dashboard-themes/ showed in the picker but did nothing.
   Server now ships the full normalised definition; client applies it.
2. Docs documented a 21-token flat colors schema that never matched
   the code (applyPalette reads a 3-layer palette). Rewrote the
   Themes section against the actual shape.

Implementation:

- web/src/themes/types.ts: extend DashboardTheme with typography,
  layout, colorOverrides; ThemeListEntry carries optional definition.
- web/src/themes/presets.ts: 6 built-ins with distinct typography+layout.
- web/src/themes/context.tsx: applyTheme() writes palette+typography+
  layout+overrides as CSS vars, injects fontUrl stylesheet, fixes the
  fallback-to-default bug via resolveTheme(name).
- web/src/index.css: html/body/code read the new theme-font vars;
  --radius-sm/md/lg/xl derive from --theme-radius; --spacing scales
  with --theme-spacing-mul so Tailwind utilities shift with density.
- hermes_cli/web_server.py: _normalise_theme_definition() parses loose
  YAML (bare hex strings, partial blocks) into the canonical wire
  shape; /api/dashboard/themes ships full definitions for user themes.
- tests/hermes_cli/test_web_server.py: 16 new tests covering the
  normaliser and discovery (rejection cases, clamping, defaults).
- website/docs/user-guide/features/web-dashboard.md: rewrite Themes
  section with real schema, per-model tables, full YAML example.
2026-04-23 13:49:51 -07:00
Teknium 8f5fee3e3e feat(codex): add gpt-5.5 and wire live model discovery into picker (#14720)
OpenAI launched GPT-5.5 on Codex today (Apr 23 2026). Adds it to the static
catalog and pipes the user's OAuth access token into the openai-codex path of
provider_model_ids() so /model mid-session and the gateway picker hit the
live ChatGPT codex/models endpoint — new models appear for each user
according to what ChatGPT actually lists for their account, without a Hermes
release.

Verified live: 'gpt-5.5' returns priority 0 (featured) from the endpoint,
400k context per OpenAI's launch article. 'hermes chat --provider
openai-codex --model gpt-5.5' completes end-to-end.

Changes:
- hermes_cli/codex_models.py: add gpt-5.5 to DEFAULT_CODEX_MODELS + forward-compat
- agent/model_metadata.py: 400k context length entry
- hermes_cli/models.py: resolve codex OAuth token before calling
  get_codex_model_ids() in provider_model_ids('openai-codex')
2026-04-23 13:32:43 -07:00
brooklyn! b6ca3c28dc Merge pull request #14640 from NousResearch/bb/fix-tui-glyph-ghosting
fix(ui-tui): heal post-resize alt-screen drift
2026-04-23 14:41:05 -05:00
Brooklyn Nicholson 882278520b chore: uptick 2026-04-23 14:37:27 -05:00
Brooklyn Nicholson 9bf6e1cd6e refactor(ui-tui): clean touched resize and sticky prompt paths
Trim comment noise, remove redundant typing, normalize sticky prompt viewport args to top→bottom order, and reuse one sticky viewport helper instead of duplicating the math.
2026-04-23 14:37:00 -05:00
Brooklyn Nicholson 9a885fba31 fix(ui-tui): hide stale sticky prompt when newer prompt is visible
Sticky prompt selection only considered the top edge of the viewport, so it could keep showing an older user prompt even when a newer one was already visible lower down. Suppress sticky output whenever a user message is visible in the viewport and cover it with a regression test.
2026-04-23 14:32:29 -05:00
Brooklyn Nicholson aa47812edf fix(ui-tui): clear sticky prompt when follow snaps to bottom
Renderer-driven follow-to-bottom was restoring the viewport to the tail without notifying ScrollBox subscribers, so StickyPromptTracker could stay stale-visible. Notify on render-time scroll/sticky changes and treat near-bottom as bottom for prompt hiding.
2026-04-23 14:19:32 -05:00
Brooklyn Nicholson c8ff70fe03 perf(ui-tui): freeze offscreen live tail during scroll
When the viewport is away from the bottom, keep the last visible progress snapshot instead of rebuilding the streaming/thinking subtree on every turn-store update. This cuts scroll-time churn while preserving live updates near the tail and on turn completion.
2026-04-23 13:16:18 -05:00
kshitijk4poor f5af6520d0 fix: add extra_content property to ToolCall for Gemini thought_signature (#14488)
Commit 43de1ca8 removed the _nr_to_assistant_message shim in favor of
duck-typed properties on the ToolCall dataclass. However, the
extra_content property (which carries the Gemini thought_signature) was
omitted from the ToolCall definition. This caused _build_assistant_message
to silently drop the signature via getattr(tc, 'extra_content', None)
returning None, leading to HTTP 400 errors on subsequent turns for all
Gemini 3 thinking models.

Add the extra_content property to ToolCall (matching the existing
call_id and response_item_id pattern) so the thought_signature round-trips
correctly through the transport → agent loop → API replay path.

Credit to @celttechie for identifying the root cause and providing the fix.

Closes #14488
2026-04-23 23:45:07 +05:30
Brooklyn Nicholson 1e445b2547 fix(ui-tui): heal post-resize alt-screen drift
Broaden the settle repaint from xterm.js-only to all alt-screen terminals. Ink upstream and ConPTY/xterm reports point to resize/reflow desync as a general stale-cell class, not a host-specific quirk.
2026-04-23 13:10:52 -05:00
Brooklyn Nicholson f28f07e98e test(ui-tui): drop dead terminalReally from drift repro
Copilot flagged the variable as unused. LogUpdate.render only sees prev/next, so a simulated "physical terminal" has no hook in the public API. Kept the narrative in the comment and tightened the assertion to demonstrate the test's actual invariant: identical prev/next emits no heal patches.
2026-04-23 13:03:06 -05:00
Brooklyn Nicholson 7c4dd7d660 refactor(ui-tui): collapse xterm.js resize settle dance
Replace 28-line guard + nested queueMicrotask + pendingResizeRender flag-reuse with a named canAltScreenRepaint predicate and a single flat paint. setTimeout already drained the burst coalescer; the nested defer and flag dance were paranoia.
2026-04-23 12:49:49 -05:00
kshitijk4poor e91be4d7dc fix: resolve_alias prefers highest version + merges static catalog
Three bugs fixed in model alias resolution:

1. resolve_alias() returned the FIRST catalog match with no version
   preference. '/model mimo' picked mimo-v2-omni (index 0 in dict)
   instead of mimo-v2.5-pro. Now collects all prefix matches, sorts
   by version descending with pro/max ranked above bare names, and
   returns the highest.

2. models.dev registry missing newly added models (e.g. v2.5 for
   native xiaomi). resolve_alias() now merges static _PROVIDER_MODELS
   entries into the catalog so models resolve immediately without
   waiting for models.dev to sync.

3. hermes model picker showed only models.dev results (3 xiaomi models),
   hiding curated entries (5 total). The picker now merges curated
   models into the models.dev list so all models appear.

Also fixes a trailing-dot float parsing edge case in _model_sort_key
where '5.4.' failed float() and multi-dot versions like '5.4.1'
weren't parsed correctly.
2026-04-23 23:18:33 +05:30
Brooklyn Nicholson 60d1edc38a fix(ui-tui): keep bottom statusbar in composer layout
Render the bottom status bar inside the composer pane so aggressive resize + streaming churn cannot cull the input row via sibling overlap.
2026-04-23 12:44:56 -05:00
Brooklyn Nicholson 3e01de0b09 fix(ui-tui): preserve composer after resize-burst healing
- run the xterm.js settle-heal pass through a full render commit instead of diff-only scheduleRender
- guard against overlapping resize renders and clear settle timers on unmount
2026-04-23 12:40:39 -05:00
Brooklyn Nicholson f7e86577bc fix(ui-tui): heal xterm.js resize-burst render drift 2026-04-23 12:21:09 -05:00
Brooklyn Nicholson 2e75460066 test(ui-tui): add log-update diff contract tests
- steady-state diff skips unchanged rows
- width change emits clearTerminal before repaint
- drift repro: prev.screen desync from terminal leaves orphaned cells no code path can reach
2026-04-23 12:08:23 -05:00
kshitij 82a0ed1afb feat: add Xiaomi MiMo v2.5-pro and v2.5 model support (#14635)
## Merged

Adds MiMo v2.5-pro and v2.5 support to Xiaomi native provider, OpenCode Go, and setup wizard.

### Changes
- Context lengths: added v2.5-pro (1M) and v2.5 (1M), corrected existing MiMo entries to exact values (262144)
- Provider lists: xiaomi, opencode-go, setup wizard
- Vision: upgraded from mimo-v2-omni to mimo-v2.5 (omnimodal)
- Config description updated for XIAOMI_API_KEY
- Tests updated for new vision model preference

### Verification
- 4322 tests passed, 0 new regressions
- Live API tested on Xiaomi portal: basic, reasoning, tool calling, multi-tool, file ops, system prompt, vision — all pass
- Self-review found and fixed 2 issues (redundant vision check, stale HuggingFace context length)
2026-04-23 10:06:25 -07:00
Brooklyn Nicholson 071bdb5a3f Revert "fix(ui-tui): force full xterm.js alt-screen repaints"
This reverts commit bc9518f660.
2026-04-23 11:55:09 -05:00
Brooklyn Nicholson bc9518f660 fix(ui-tui): force full xterm.js alt-screen repaints
- force full alt-screen damage in xterm.js hosts to avoid stale glyph artifacts
- skip incremental scroll optimization there and repaint from a cleared screen atomically
2026-04-23 11:44:27 -05:00
Teknium ce089169d5 feat(skills-guard): gate agent-created scanner on config.skills.guard_agent_created (default off)
Replaces the blanket 'always allow' change from the previous commit with
an opt-in config flag so users who want belt-and-suspenders security can
still get the keyword scan on skill_manage output.

## Default behavior (flag off)
skill_manage(action='create'|'edit'|'patch') no longer runs the keyword
scanner. The agent can write skills that mention risky keywords in prose
(documenting what reviewers should watch for, describing cache-bust
semantics in a PR-review skill, referencing AGENTS.md, etc.) without
getting blocked.

Rationale: the agent can already execute the same code paths via
terminal() with no gate, so the scan adds friction without meaningful
security against a compromised or malicious agent.

## Opt-in behavior (flag on)
Set skills.guard_agent_created: true in config.yaml to get the original
behavior back. Scanner runs on every skill_manage write; dangerous
verdicts surface as a tool error the agent can react to (retry without
the flagged content).

## External hub installs unaffected
trusted/community sources (hermes skills install) always get scanned
regardless of this flag. The gate is specifically for skill_manage,
which only agents call.

## Changes
- hermes_cli/config.py: add skills.guard_agent_created: False to DEFAULT_CONFIG
- tools/skill_manager_tool.py: _guard_agent_created_enabled() reads the flag;
  _security_scan_skill() short-circuits to None when the flag is off
- tools/skills_guard.py: restore INSTALL_POLICY['agent-created'] =
  ('allow', 'allow', 'ask') so the scan remains strict when it does run
- tests/tools/test_skills_guard.py: restore original ask/force tests
- tests/tools/test_skill_manager_tool.py: new TestSecurityScanGate class
  covering both flag states + config error handling

## Validation
- tests/tools/test_skills_guard.py + test_skill_manager_tool.py: 115/115 pass
- E2E: flagged-keyword skill creates with default config, blocks with flag on
2026-04-23 06:20:47 -07:00
Teknium e3c0084140 fix(skills-guard): allow agent-created dangerous verdicts without confirmation
The security scanner is meant to protect against hostile external skills
pulled from GitHub via hermes skills install — trusted/community policies
block or ask on dangerous verdicts accordingly. But agent-created skills
(from skill_manage) run in the same process as the agent that wrote them.
The agent can already execute the same code paths via terminal() with no
gate, so the ask-on-dangerous policy adds friction without meaningful
security.

Concrete trigger: an agent writing a PR-review skill that describes
cache-busting or persistence semantics in prose gets blocked because
those words appear in the patterns list. The skill isn't actually doing
anything dangerous — it's just documenting what reviewers should watch
for in other PRs.

Change: agent-created dangerous verdict maps to 'allow' instead of 'ask'.
External hub installs (trusted/community) keep their stricter policies
intact. Tests updated: renamed test_dangerous_agent_created_asks →
test_dangerous_agent_created_allowed; renamed force-override test and
updated assertion since force is now a no-op for agent-created (the allow
branch returns first).
2026-04-23 05:18:44 -07:00
Teknium 5651a73331 fix(gateway): guard-match the finally-block _active_sessions delete
Before this, _process_message_background's finally did an unconditional
'del self._active_sessions[session_key]' — even if a /stop/ /new
command had already swapped in its own command_guard via
_dispatch_active_session_command and cancelled us.  The old task's
unwind would clobber the newer guard, opening a race for follow-ups.

Replace with _release_session_guard(session_key, guard=interrupt_event)
so the delete only fires when the guard we captured is still the one
installed.  The sibling _session_tasks pop already had equivalent
ownership matching via asyncio.current_task() identity; this closes the
asymmetry.

Adds two direct regressions in test_session_split_brain_11016:
- stale guard reference must not clobber a newer guard by identity
- guard=None default still releases unconditionally (for callers that
  don't have a captured guard to match against)

Refs #11016
2026-04-23 05:15:52 -07:00
Teknium 81d925f2a5 chore(release): map dyxushuai and etcircle in AUTHOR_MAP
Personal gmail and noreply pattern for the contributors whose commits
are preserved on the salvage PR for issue #11016.
2026-04-23 05:15:52 -07:00
Teknium ec02d905c9 test(gateway): regressions for issue #11016 split-brain session locks
Covers all three layers of the salvaged fix:

1. Adapter-side cancellation: /stop, /new, /reset cancel the in-flight
   adapter task, release the guard, and let follow-up messages through;
   /new keeps the guard installed until the runner response lands, then
   drains the queued follow-up in order.

2. Adapter-side self-heal: a split-brain guard (done owner task, lock
   still live) is healed on the next inbound message and the user gets
   a reply instead of being trapped in infinite busy acks.  A guard
   with no recorded owner task is NOT auto-healed (protects fixtures
   that install guards directly).

3. Runner-side generation guard: stale async runs whose generation was
   bumped by /stop or /new cannot clear a newer run's _running_agents
   slot on the way out.

11 tests, all green.

Refs #11016
2026-04-23 05:15:52 -07:00
etcircle b7bdf32d4e fix(gateway): guard session slot ownership after stop/reset
Closes the runner-side half of the split-brain described in issue #11016
by wiring the existing _session_run_generation counter through the
session-slot promotion and release paths.

Without this, an older async run could still:
  - promote itself from sentinel to real agent after /stop or /new
    invalidated its run generation
  - clear _running_agents on the way out, deleting a newer run's slot

Both races leave _running_agents desynced from what the user actually
has in flight, which is half of what shows up as 'No active task to
stop' followed by late 'Interrupting current task...' acks.

Changes:
- track_agent() in _run_agent now calls _is_session_run_current() before
  writing the real agent into _running_agents[session_key]; if /stop or
  /new bumped the generation while the agent was spinning up, the slot
  is left alone (the newer run owns it).
- _release_running_agent_state() gained an optional run_generation
  keyword.  When provided, it only clears the slot if the generation is
  still current.  The final cleanup at the tail of _run_agent passes the
  run's generation so an old unwind can't blow away a newer run's state.
- Returns bool so callers can tell when a release was blocked.

All the existing call sites that do NOT pass run_generation behave
exactly as before — this is a strict additive guard.

Refs #11016
2026-04-23 05:15:52 -07:00
dyxushuai d72985b7ce fix(gateway): serialize reset command handoff and heal stale session locks
Closes the adapter-side half of the split-brain described in issue #11016
where _active_sessions stays live but nothing is processing, trapping the
chat in repeated 'Interrupting current task...' while /stop reports no
active task.

Changes on BasePlatformAdapter:
- Add _session_tasks: Dict[str, asyncio.Task] mapping session -> owner task
  so session-terminating commands can cancel the right task and old task
  finally blocks can't clobber a newer task's guard.
- Add _release_session_guard(guard=...) that only releases if the guard
  Event still matches, preventing races where /stop or /new swaps in a
  temporary guard while the old task unwinds.
- Add _session_task_is_stale() and _heal_stale_session_lock() for
  on-entry self-heal: when handle_message() sees an _active_sessions
  entry whose RECORDED owner task is done/cancelled, clear it and fall
  through to normal dispatch.  No owner task recorded = not stale (some
  tests install guards directly and shouldn't be auto-healed).
- Add cancel_session_processing() as the explicit adapter-side cancel
  API so /stop/ /new/ /reset can cleanly tear down in-flight work.
- Route /stop, /new, /reset through _dispatch_active_session_command():
    1. install a temporary command guard so follow-ups stay queued
    2. let the runner process the command
    3. cancel the old adapter task AFTER the runner response is ready
    4. release the command guard and drain the latest pending follow-up
- _start_session_processing() replaces the inline create_task + guard
  setup in handle_message() so guard + owner-task entry land atomically.
- cancel_background_tasks() also clears _session_tasks.

Combined, this means:
- /stop / /new / /reset actually cancel stuck work instead of leaving
  adapter state desynced from runner state.
- A dead session lock self-heals on the next inbound message rather than
  persisting until gateway restart.
- Follow-up messages after /new are processed in order, after the reset
  command's runner response lands.

Refs #11016
2026-04-23 05:15:52 -07:00
Teknium 5a26938aa5 fix(terminal): auto-source ~/.profile and ~/.bash_profile so n/nvm PATH survives (#14534)
The environment-snapshot login shell was auto-sourcing only ~/.bashrc when
building the PATH snapshot. On Debian/Ubuntu the default ~/.bashrc starts
with a non-interactive short-circuit:

    case $- in *i*) ;; *) return;; esac

Sourcing it from a non-interactive shell returns before any PATH export
below that guard runs. Node version managers like n and nvm append their
PATH line under that guard, so Hermes was capturing a PATH without
~/n/bin — and the terminal tool saw 'node: command not found' even when
node was on the user's interactive shell PATH.

Expand the auto-source list (when auto_source_bashrc is on) to:

    ~/.profile → ~/.bash_profile → ~/.bashrc

~/.profile and ~/.bash_profile have no interactivity guard — installers
that write their PATH there (n's n-install, nvm's curl installer on most
setups) take effect. ~/.bashrc still runs last to preserve behaviour for
users who put PATH logic there without the guard.

Added two tests covering the new behaviour plus an E2E test that spins up
a real LocalEnvironment with a guard-prefixed ~/.bashrc and a ~/.profile
PATH export, and verifies the captured snapshot PATH contains the profile
entry.
2026-04-23 05:15:37 -07:00
Teknium d45c738a52 fix(gateway): preflight user D-Bus before systemctl --user start (#14531)
On fresh RHEL/Debian SSH sessions without linger, `systemctl --user
start hermes-gateway` fails with 'Failed to connect to bus: No medium
found' because /run/user/$UID/bus doesn't exist. Setup previously
showed a raw CalledProcessError and continued claiming success, so the
gateway never actually started.

systemd_start() and systemd_restart() now call _preflight_user_systemd()
for the user scope first:
- Bus socket already there → no-op (desktop / linger-enabled servers)
- Linger off → try loginctl enable-linger (works when polkit permits,
  needs sudo otherwise), wait for socket
- Still unreachable → raise UserSystemdUnavailableError with a clean
  remediation message pointing to sudo loginctl + hermes gateway run
  as the foreground fallback

Setup's start/restart handlers and gateway_command() catch the new
exception and render the multi-line guidance instead of a traceback.
2026-04-23 05:09:38 -07:00
Teknium d50be05b1c chore(release): map j0sephz in AUTHOR_MAP 2026-04-23 05:09:08 -07:00
Teknium 24e8a6e701 feat(skills_sync): surface collision with reset-hint
When a newly-bundled skill's name collides with a pre-existing user
skill, sync silently kept the user's copy. Users never learned that
a bundled version shipped by that name.

Now (on non-quiet sync only) print:

  ⚠ <name>: bundled version shipped but you already have a local
    skill by this name — yours was kept. Run `hermes skills reset
    <name>` to replace it with the bundled version.

No behavior change to manifest writes or to the kept user copy —
purely additive warning on the existing collision-skip path.
2026-04-23 05:09:08 -07:00
j0sephz 3a97fb3d47 fix(skills_sync): don't poison manifest on new-skill collision
When a new bundled skill's name collided with a pre-existing user skill
(from hub, custom, or leftover), sync_skills() recorded the bundled hash
in the manifest even though the on-disk copy was unrelated to bundled.
On the next sync, user_hash != origin_hash (bundled_hash) marked the
skill as "user-modified" permanently, blocking all bundled updates for
that skill until the user ran `hermes skills reset`.

Fix: only baseline the manifest entry when the user's on-disk copy is
byte-identical to bundled (safe to track — this is the reset re-sync or
coincidentally-identical install case). Otherwise skip the manifest
write entirely: the on-disk skill is unrelated to bundled and shouldn't
be tracked as if it were.

This preserves reset_bundled_skill()'s re-baseline flow (its post-delete
sync still writes to the manifest when user copy matches bundled) while
fixing the poisoning scenario for genuinely unrelated collisions.

Adds two tests following the existing test_failed_copy_does_not_poison_manifest
pattern: one verifying the manifest stays clean after a collision with
differing content, one verifying no false user_modified flag on resync.
2026-04-23 05:09:08 -07:00
Siddharth Balyan 91d6ea07c8 chore(dev): add ruff linter to dev deps and configure in pyproject.toml (#14527)
Adds ruff (fast Python linter from Astral) as a dev dependency and sets
up initial config with all files excluded — ruff is entirely disabled
for now, this just lands the config for slow rollout enabling it
module-by-module in follow-up PRs.
2026-04-23 17:20:18 +05:30
Siddharth Balyan fdcb3e9a4b chore(dev): add ty type checker to dev deps and configure in pyproject.toml (#14525)
Adds ty (Red Knot) as a dev dependency and sets up initial configuration
with all files excluded — to be incrementally enabled per-module.
2026-04-23 17:15:57 +05:30
Teknium 627abbb1ea chore(release): map davidvv in AUTHOR_MAP 2026-04-23 03:10:30 -07:00
David VV 39fcf1d127 fix(model_switch): group custom_providers by endpoint in /model picker (#9210)
Multiple custom_providers entries sharing the same base_url + api_key
are now grouped into a single picker row. A local Ollama host with
per-model display names ("Ollama — GLM 5.1", "Ollama — Qwen3-coder",
"Ollama — Kimi K2", "Ollama — MiniMax M2.7") previously produced four
near-duplicate picker rows that differed only by suffix; now it appears
as one "Ollama" row with four models.

Key changes:
- Grouping key changed from slug-by-name to (base_url, api_key). Names
  frequently differ per model while the endpoint stays the same.
- When the grouped endpoint matches current_base_url, the row's slug is
  set to current_provider so picker-driven switches route through the
  live credential pipeline (no re-resolution needed).
- Per-model suffix is stripped from the display name ("Ollama — X" →
  "Ollama") via em-dash / " - " separators.
- Two groups with different api_keys at the same base_url (or otherwise
  colliding on cleaned name) are disambiguated with a numeric suffix
  (custom:openai, custom:openai-2) so both stay visible.
- current_base_url parameter plumbed through both gateway call sites.

Existing #8216, #11499, #13509 regressions covered (dict/list shapes
of models:, section-3/section-4 dedup, normalized list-format entries).

Salvaged from @davidvv's PR #9210 — the underlying code had diverged
~1400 commits since that PR was opened, so this is a reconstruction of
the same approach on current main rather than a clean cherry-pick.
Authorship preserved via --author on this commit.

Closes #9210
2026-04-23 03:10:30 -07:00
Teknium 6172f95944 chore(release): map GuyCui in AUTHOR_MAP 2026-04-23 03:10:04 -07:00
GuyCui b24d239ce1 Update permissions for config.yaml
Fix config.yaml permission drift on startup
2026-04-23 03:10:04 -07:00
Teknium cd9cd1b159 chore(release): map MikeFac in AUTHOR_MAP 2026-04-23 03:08:53 -07:00
MikeFac 78e213710c fix: guard against None tirith path in security scanner
When _resolve_tirith_path() returns None (e.g. install failed on
unsupported platform or all resolution paths exhausted), the function
passed None directly to subprocess.run(), causing a TypeError instead
of respecting the fail_open config.

Add a None check before the subprocess call that allows or blocks
according to the configured fail_open policy, matching the existing
error handling behavior for OSError and TimeoutExpired.
2026-04-23 03:08:53 -07:00
Teknium 4f4fd21149 chore(release): map vivganes in AUTHOR_MAP 2026-04-23 03:07:06 -07:00
Vivek Ganesan 7ca2f70055 fix(docs): Add links to Atropos and wandb in user guide
fix #7724

The user guide has mention of atropos and wandb but no links.  This PR adds links so that users dont have to search for them.
2026-04-23 03:07:06 -07:00
Teknium dab36d9511 chore(release): map phpoh in AUTHOR_MAP 2026-04-23 03:05:49 -07:00
phpoh 4c02e4597e fix(status): catch OSError in os.kill(pid, 0) for Windows compatibility
On Windows, os.kill(nonexistent_pid, 0) raises OSError with WinError 87
("The parameter is incorrect") instead of ProcessLookupError. Without
catching OSError, the acquire_scoped_lock() and get_running_pid() paths
crash on any invalid PID check — preventing gateway startup on Windows
whenever a stale PID file survives from a prior run.

Adapted @phpoh's fix in #12490 onto current main. The main file was
refactored in the interim (get_running_pid now iterates over
(primary_record, fallback_record) with a per-iteration try/except),
so the OSError catch is added as a new except clause after
PermissionError (which is a subclass of OSError, so order matters:
PermissionError must match first).

Co-authored-by: phpoh <1352808998@qq.com>
2026-04-23 03:05:49 -07:00
Aslaaen 51c1d2de16 fix(profiles): stage profile imports to prevent directory clobbering 2026-04-23 03:02:34 -07:00
Teknium 08cb345e24 chore(release): map Lind3ey in AUTHOR_MAP 2026-04-23 03:02:09 -07:00
Lind3ey 9dba75bc38 fix(feishu): issue where streaming edits in Feishu show extra leading newlines 2026-04-23 03:02:09 -07:00
Teknium 8f50f2834a chore(release): add Wysie to AUTHOR_MAP 2026-04-23 03:01:18 -07:00
Wysie be99feff1f fix(image-gen): force-refresh plugin providers in long-lived sessions 2026-04-23 03:01:18 -07:00
Teknium 911f57ad97 chore(release): map TaroballzChen in AUTHOR_MAP 2026-04-23 02:37:15 -07:00
TaroballzChen 5d09474348 fix(tools): enforce ACP transport overrides in delegate_task child agents
When override_acp_command was passed to _build_child_agent, it failed to
override effective_provider to 'copilot-acp' and effective_api_mode to
'chat_completions'. This caused the child AIAgent to inherit the parent's
native API configuration (e.g. Anthropic) and attempt real HTTP requests
using the parent's API key, leading to HTTP 401 errors and completely
bypassing the ACP subprocess.

Ensure that if an ACP command override is provided, the child agent
correctly routes through CopilotACPClient.

Refs #2653
2026-04-23 02:37:15 -07:00
Teknium 33773ed5c6 chore(release): map DrStrangerUJN in AUTHOR_MAP 2026-04-23 02:37:07 -07:00
drstrangerujn a5b0c7e2ec fix(config): preserve list-format models in custom_providers normalize
_normalize_custom_provider_entry silently drops the models field when it's
a list. Hand-edited configs (and the shape used by older Hermes versions)
still write models as a plain list of ids, so after the normalize pass the
entry reaches list_authenticated_providers() with no models and /model
shows the provider with (0) models — even though the underlying picker
code handles lists fine.

Convert list-format models into the empty-value dict shape the rest of
the pipeline already expects. Dict-format entries keep passing through
unchanged.

Repro (before the fix):

    custom_providers:
    - name: acme
      base_url: https://api.example.com/v1
      models: [foo, bar, baz]

/model shows "acme (0)"; bypassing normalize in list_authenticated_providers
returns three models, confirming the drop happens in normalize.

Adds four unit tests covering list→dict conversion, dict pass-through,
filtering of empty/non-string entries, and the empty-list case.
2026-04-23 02:37:07 -07:00
Teknium c80cc8557e chore(release): map RyanLee-Dev in AUTHOR_MAP 2026-04-23 02:35:13 -07:00
yuanhe 1df0c812c4 feat(skills): add MiniMax-AI/cli as default skill tap
Adds MiniMax-AI/cli to the default taps list so the mmx-cli skill
is discoverable and installable out of the box via /skills browse
and /skills install. The skill definition lives upstream at
github.com/MiniMax-AI/cli/skill/SKILL.md, keeping updates decoupled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-23 02:35:13 -07:00
Teknium b5ec6e8df7 chore(release): map sharziki in AUTHOR_MAP 2026-04-23 02:34:11 -07:00
sharziki d7452af257 fix(pairing): handle null user_name in pairing list display
When user_name is stored as None (e.g. Telegram users without a
display name), dict.get('user_name', '') returns None because the
key exists — the default is only used for missing keys. This causes
a TypeError when the format specifier :<20 is applied to None.

Use `or ''` to coerce None to an empty string.

Fixes #7392

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-23 02:34:11 -07:00
Teknium 48923e5a3d chore(release): map azhengbot in AUTHOR_MAP 2026-04-23 02:32:56 -07:00
azhengbot f77da7de42 Rename _api_call_with_interrupt to _interruptible_api_call 2026-04-23 02:32:56 -07:00
azhengbot 36adcebe6c Rename API call function to _interruptible_api_call 2026-04-23 02:32:56 -07:00
kshitijk4poor 43de1ca8c2 refactor: remove _nr_to_assistant_message shim + fix flush_memories guard
NormalizedResponse and ToolCall now have backward-compat properties
so the agent loop can read them directly without the shim:

  ToolCall: .type, .function (returns self), .call_id, .response_item_id
  NormalizedResponse: .reasoning_content, .reasoning_details,
                      .codex_reasoning_items

This eliminates the 35-line shim and its 4 call sites in run_agent.py.

Also changes flush_memories guard from hasattr(response, 'choices')
to self.api_mode in ('chat_completions', 'bedrock_converse') so it
works with raw boto3 dicts too.

WS1 items 3+4 of Cycle 2 (#14418).
2026-04-23 02:30:05 -07:00
kshitijk4poor f4612785a4 refactor: collapse normalize_anthropic_response to return NormalizedResponse directly
3-layer chain (transport → v2 → v1) was collapsed to 2-layer in PR 7.
This collapses the remaining 2-layer (transport → v1 → NR mapping in
transport) to 1-layer: v1 now returns NormalizedResponse directly.

Before: adapter returns (SimpleNamespace, finish_reason) tuple,
  transport unpacks and maps to NormalizedResponse (22 lines).
After: adapter returns NormalizedResponse, transport is a
  1-line passthrough.

Also updates ToolCall construction — adapter now creates ToolCall
dataclass directly instead of SimpleNamespace(id, type, function).

WS1 item 1 of Cycle 2 (#14418).
2026-04-23 02:30:05 -07:00
kshitijk4poor 738d0900fd refactor: migrate auxiliary_client Anthropic path to use transport
Replace direct normalize_anthropic_response() call in
_AnthropicCompletionsAdapter.create() with
AnthropicTransport.normalize_response() via get_transport().

Before: auxiliary_client called adapter v1 directly, bypassing
the transport layer entirely.

After: auxiliary_client → get_transport('anthropic_messages') →
transport.normalize_response() → adapter v1 → NormalizedResponse.

The adapter v1 function (normalize_anthropic_response) now has
zero callers outside agent/anthropic_adapter.py and the transport.
This unblocks collapsing v1 to return NormalizedResponse directly
in a follow-up (the remaining 2-layer chain becomes 1-layer).

WS1 item 2 of Cycle 2 (#14418).
2026-04-23 02:30:05 -07:00
Teknium 1c532278ae chore(release): map lvnilesh in AUTHOR_MAP 2026-04-23 02:30:00 -07:00
Nilesh 22afa066f8 fix(cron): guard against non-dict result from run_conversation
When run_conversation returns a non-dict value (e.g. an int under
error conditions), the subsequent result.get("final_response", "")
raises an opaque "'int' object has no attribute 'get'" AttributeError.

Add a type guard that converts this into a clear RuntimeError, which
is properly caught by the outer except Exception handler that marks
the job as failed and delivers the error message.

Fixes NousResearch/hermes-agent#9433

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 02:30:00 -07:00
Teknium 5e76c650bb chore(release): map yzx9 in AUTHOR_MAP 2026-04-23 02:06:16 -07:00
Zexin Yuan 15efb410d0 fix(nix): make working directory writable 2026-04-23 02:06:16 -07:00
Teknium e8cba18f77 chore(release): map wenhao7 in AUTHOR_MAP 2026-04-23 02:04:45 -07:00
wenhao7 48dc8ef1d1 docs(cron): clarify default model/provider setup for scheduled jobs
Added a note about configuring default model and provider before creating cron jobs.
2026-04-23 02:04:45 -07:00
wenhao7 156b358320 docs(cron): explain runtime resolution for null model/provider
Clarify job storage behavior regarding model and provider fields.
2026-04-23 02:04:45 -07:00
Teknium fa47cbd456 chore(release): map minorgod in AUTHOR_MAP 2026-04-23 02:02:49 -07:00
Brett Brewer 92e4bbc201 Update Docker guide with terminal command
Add alternative instructions for opening an interactive Hermes cli chat session in a running Docker container.
2026-04-23 02:02:49 -07:00
Teknium 85cc12e2bd chore(release): map roytian1217 in AUTHOR_MAP 2026-04-23 02:00:56 -07:00
roytian1217 8b1ff55f53 fix(wecom): strip @mention prefix in group chats for slash command recognition
In WeCom group chats, messages sent as "@BotName /command" arrive with
the @mention prefix intact. This causes is_command() to return False
since the text does not start with "/".

Strip the leading @mention in group messages before creating the
MessageEvent, mirroring the existing behavior in the Telegram adapter.
2026-04-23 02:00:56 -07:00
Teknium 77f99c4ff4 chore(release): map zhouxiaoya12 in AUTHOR_MAP 2026-04-23 01:59:20 -07:00
zhzouxiaoya12 3d90292eda fix: normalize provider in list_provider_models to support aliases 2026-04-23 01:59:20 -07:00
Julien Talbot d8cc85dcdc review(stt-xai): address cetej's nits
- Replace hardcoded 'fr' default with DEFAULT_LOCAL_STT_LANGUAGE ('en')
  — removes locale leak, matches other providers
- Drop redundant default=True on is_truthy_value (dict .get already defaults)
- Update auto-detect comment to include 'xai' in the chain
- Fix docstring: 21 languages (match PR body + actual xAI API)
- Update test_sends_language_and_format to set HERMES_LOCAL_STT_LANGUAGE=fr
  explicitly, since default is no longer 'fr'

All 18 xAI STT tests pass locally.
2026-04-23 01:57:33 -07:00
Julien Talbot 18b29b124a test(stt): add unit tests for xAI Grok STT provider
Covers:
- _transcribe_xai: no key, successful transcription, whitespace stripping,
  API error (HTTP 400), empty transcript, permission error, network error,
  language/format params sent, custom base_url, diarize config
- _get_provider xAI: key set, no key, auto-detect after mistral,
  mistral preferred over xai, no key returns none
- transcribe_audio xAI dispatch: dispatch, default model (grok-stt),
  model override
2026-04-23 01:57:33 -07:00
Julien Talbot a6ffa994cd feat(stt): add xAI Grok STT provider
Add xAI as a sixth STT provider using the POST /v1/stt endpoint.

Features:
- Multipart/form-data upload to api.x.ai/v1/stt
- Inverse Text Normalization (ITN) via format=true (default)
- Optional diarization via config (stt.xai.diarize)
- Language configuration (default: fr, overridable via config or env)
- Custom base_url support (XAI_STT_BASE_URL env or stt.xai.base_url)
- Full provider integration: explicit config + auto-detect fallback chain
- Consistent error handling matching existing provider patterns

Config (config.yaml):
  stt:
    provider: xai
    xai:
      language: fr
      format: true
      diarize: false
      base_url: https://api.x.ai/v1   # optional override

Auto-detect priority: local > groq > openai > mistral > xai > none
2026-04-23 01:57:33 -07:00
helix4u bace220d29 fix(image-gen): persist plugin provider on reconfigure 2026-04-23 01:56:09 -07:00
Siddharth Balyan d1ce358646 feat(agent): add PLATFORM_HINTS for matrix, mattermost, and feishu (#14428)
* feat(agent): add PLATFORM_HINTS for matrix, mattermost, and feishu

These platform adapters fully support media delivery (send_image,
send_document, send_voice, send_video) but were missing from
PLATFORM_HINTS, leaving agents unaware of their platform context,
markdown rendering, and MEDIA: tag support.

Salvaged from PR #7370 by Rutimka — wecom excluded since main already
has a more detailed version.

Co-Authored-By: Marco Rutsch <marco@rutimka.de>

* test: add missing Markdown assertion for feishu platform hint

---------

Co-authored-by: Marco Rutsch <marco@rutimka.de>
2026-04-23 12:50:22 +05:30
Teknium 88b6eb9ad1 chore(release): map Nan93 in AUTHOR_MAP 2026-04-22 21:30:32 -07:00
Nan93 2f48c58b85 fix: normalize iOS unicode dashes in slash command args
iOS auto-corrects -- to — (em dash) and - to – (en dash), causing
commands like /model glm-4.7 —provider zai to fail with
'Model names cannot contain spaces'. Normalize at get_command_args().
2026-04-22 21:30:32 -07:00
Teknium e25c319fa3 chore(release): map hsy5571616 in AUTHOR_MAP 2026-04-22 21:29:49 -07:00
saitsuki 9357db2844 docs: fix fallback behavior description — it is per-turn, not per-session
The documentation claimed fallback activates 'at most once per session',
but the actual implementation restores the primary model at the start of
every run_conversation() call via _restore_primary_runtime().

Relevant source: run_agent.py lines 1666-1694 (snapshot), 6454-6517
(restore), 8681-8684 (called each turn).

Updated the One-Shot info box and the summary table to accurately
describe the per-turn restoration behavior.
2026-04-22 21:29:49 -07:00
Teknium 400b5235b8 chore(release): map isaachuangGMICLOUD in AUTHOR_MAP 2026-04-22 21:29:00 -07:00
isaachuangGMICLOUD 73533fc728 docs: add GMI Cloud to compatible providers list 2026-04-22 21:29:00 -07:00
Teknium 74520392f2 chore(release): map WadydX in AUTHOR_MAP 2026-04-22 21:28:13 -07:00
WadydX dcb8c5c67a docs(contributing): align Node requirement in repo + docs site 2026-04-22 21:28:13 -07:00
WadydX 2c53a3344d docs(contributing): align Node prerequisite with package engines 2026-04-22 21:28:13 -07:00
Teknium 7f1c1aa4d9 chore(release): map mikewaters in AUTHOR_MAP 2026-04-22 21:27:32 -07:00
Mike Waters ed5f16323f Update Git requirement to include git-lfs extension 2026-04-22 21:27:32 -07:00
Mike Waters d6d9f10629 Update Git requirement to include git-lfs extension 2026-04-22 21:27:32 -07:00
Teknium fa8f0c6fae chore(release): map xinpengdr in AUTHOR_MAP 2026-04-22 21:18:28 -07:00
xinpengdr 5eefdd9c02 fix: skip non-API-key auth providers in env-var credential detection
In list_authenticated_providers(), providers like qwen-oauth that use
OAuth authentication were incorrectly flagged as authenticated because
the env-var check fell back to models.dev provider env vars (e.g.
DASHSCOPE_API_KEY for alibaba). Any user with an alibaba API key would
see a ghost qwen-oauth entry in /model picker with 0 models listed.

Fix: skip providers whose auth_type is not api_key in the env-var
detection section (step 1). OAuth/external-process providers are
properly handled in step 2 (HERMES_OVERLAYS) which checks the auth store.
2026-04-22 21:18:28 -07:00
Teknium 268a4aa1c1 chore(release): map fatinghenji in AUTHOR_MAP 2026-04-22 21:17:37 -07:00
VantHoff 99af222ecf fix(tirith): detect Android/Termux as Linux ABI-compatible
In _detect_target(), platform.system() returns "Android" on Termux,
not "Linux". Without this change tirith's auto-installer skips
Android even though the Linux GNU binaries are ABI-compatible.
2026-04-22 21:17:37 -07:00
Teknium f347315e07 chore(release): map lmoncany in AUTHOR_MAP 2026-04-22 21:17:00 -07:00
Loic Moncany b80b400141 fix(mcp): respect ssl_verify config for StreamableHTTP servers
When an MCP server config has ssl_verify: false (e.g. local dev with
a self-signed cert), the setting was read from config.yaml but never
passed to the httpx client, causing CERTIFICATE_VERIFY_FAILED errors
and silent connection failures.

Fix: read ssl_verify from config and pass it as the 'verify' kwarg to
both code paths:
- New API (mcp >= 1.24.0): httpx.AsyncClient(verify=ssl_verify)
- Legacy API (mcp < 1.24.0): streamablehttp_client(..., verify=ssl_verify)

Fixes local dev setups using ServBay, LocalWP, MAMP, or any stack with
a self-signed TLS certificate.
2026-04-22 21:17:00 -07:00
Teknium bf039a9268 chore(release): map fengtianyu88 in AUTHOR_MAP 2026-04-22 21:16:16 -07:00
fengtianyu88 ec7e92082d fix(qqbot): add backoff upper-bound check for QQCloseError reconnect path
The QQCloseError (non-4008) reconnect path in _listen_loop was
missing the MAX_RECONNECT_ATTEMPTS upper-bound check that exists
in both the Exception handler (line 546) and the 4008 rate-limit
handler (line 486). Without this check, if _reconnect() fails
permanently for any non-4008 close code, backoff_idx grows
indefinitely and the bot retries forever at 60-second intervals
instead of giving up cleanly.

Fix: add the same guard after backoff_idx += 1 in the general
QQCloseError branch, consistent with the existing Exception path.
2026-04-22 21:16:16 -07:00
Teknium a4877faf96 chore(release): map Llugaes in AUTHOR_MAP 2026-04-22 21:15:28 -07:00
Llugaes 85caa5d447 fix(docker): exclude runtime data/ from build context
The Dockerfile declares VOLUME /opt/data and the published
docker-compose flow bind-mounts ./data:/opt/data for runtime
state. Because .dockerignore did not list data/, any file the
container writes under /opt/data leaks back into the build
context on the next `docker compose build`.

This becomes a hard failure when the container writes a
dangling symlink there — e.g. PulseAudio's XDG runtime entry
(data/.config/pulse/<host>-runtime -> /tmp/pulse-*) whose
target only exists inside the container. Docker's tar packer
cannot resolve the broken symlink on the host and aborts
context load with `invalid file request`.

Excluding data/ keeps build context clean, shrinks the context
tarball (logs/, sessions/, memories/ no longer shipped), and
matches the intent already expressed in .gitignore.
2026-04-22 21:15:28 -07:00
Teknium eda5ae5a5e feat(image_gen): add openai-codex plugin (gpt-image-2 via Codex OAuth) (#14317)
New built-in image_gen backend at plugins/image_gen/openai-codex/ that
exposes the same gpt-image-2 low/medium/high tier catalog as the
existing 'openai' plugin, but routes generation through the ChatGPT/
Codex Responses image_generation tool path. Available whenever the user
has Codex OAuth signed in; no OPENAI_API_KEY required.

The two plugins are independent — users select between them via
'hermes tools' → Image Generation, and image_gen.provider in
config.yaml. The existing 'openai' (API-key) plugin is unchanged.

Reuses _read_codex_access_token() and _codex_cloudflare_headers() from
agent.auxiliary_client so token expiry / cred-pool / Cloudflare
originator handling stays in one place.

Inspired by #14047 by @Hygaard, but re-implemented as a separate
plugin instead of an in-place fork of the openai plugin.

Closes #11195
2026-04-22 20:43:21 -07:00
Teknium 563ed0e61f chore(release): map fuleinist in AUTHOR_MAP 2026-04-22 20:03:39 -07:00
fuleinist e371af1df2 Add config option to disable Discord slash commands
Add discord.slash_commands config option (default: true) to allow
users to disable Discord slash command registration when running
alongside other bots that use the same command names.

When set to false in config.yaml:
  discord:
    slash_commands: false

The _register_slash_commands() call is skipped while text-based
parsing of /commands continues to work normally.

Fixes #4881
2026-04-22 20:03:39 -07:00
Teknium ee54e20c29 chore(release): map zhang9w0v5 in AUTHOR_MAP 2026-04-22 20:02:46 -07:00
多米 82fbd4771a Update .gitignore
Filter out .DS_Store (Desktop Services Store)
2026-04-22 20:02:46 -07:00
Teknium 30ad507a0f chore(release): map christopherwoodall in AUTHOR_MAP 2026-04-22 20:02:01 -07:00
Chris dce2b0dfa8 Add exclude-newer option for UV tool in pyproject.toml 2026-04-22 20:02:01 -07:00
Teknium f9487ee831 chore(release): map 10ishq in AUTHOR_MAP 2026-04-22 20:00:29 -07:00
10ishq e038677ef6 docs: add Exa web search backend setup guide and details
Adds an Exa-specific setup note next to the Parallel search-modes line
documenting EXA_API_KEY, category filtering (company, research paper,
news, people, personal site, pdf), and domain/date filters.

Reapplied onto current main from @10ishq's PR #6697 — the original branch
was too far behind main to cherry-pick directly (touched 1,456 unrelated
files from deleted/renamed paths).

Co-authored-by: 10ishq <tanishq@exa.ai>
2026-04-22 20:00:29 -07:00
Teknium effcbc8a6b chore(release): map huangke19 in AUTHOR_MAP 2026-04-22 19:59:11 -07:00
huangke 6209e85e7d feat: support document/archive extensions in MEDIA: tag extraction
Add epub, pdf, zip, rar, 7z, docx, xlsx, pptx, txt, csv, apk, ipa to
the MEDIA: path regex in extract_media(). These file types were already
routed to send_document() in the delivery loop (base.py:1705), but the
extraction regex only matched media extensions (audio/video/image),
causing document paths to fall through to the generic \S+ branch which
could fail silently in some cases. This explicit list ensures reliable
matching and delivery for all common document formats.
2026-04-22 19:59:11 -07:00
Teknium a2a8092e90 feat(cli): add --ignore-user-config and --ignore-rules flags
Port from openai/codex#18646.

Adds two flags to 'hermes chat' that fully isolate a run from user-level
configuration and rules:

* --ignore-user-config: skip ~/.hermes/config.yaml and fall back to
  built-in defaults. Credentials in .env are still loaded so the agent
  can actually call a provider.
* --ignore-rules: skip auto-injection of AGENTS.md, SOUL.md,
  .cursorrules, and persistent memory (maps to AIAgent(skip_context_files=True,
  skip_memory=True)).

Primary use cases:
- Reproducible CI runs that should not pick up developer-local config
- Third-party integrations (e.g. Chronicle in Codex) that bring their
  own config and don't want user preferences leaking in
- Bug-report reproduction without the reporter's personal overrides
- Debugging: bisect 'was it my config?' vs 'real bug' in one command

Both flags are registered on the parent parser AND the 'chat' subparser
(with argparse.SUPPRESS on the subparser to avoid overwriting the parent
value when the flag is placed before the subcommand, matching the
existing --yolo/--worktree/--pass-session-id pattern).

Env vars HERMES_IGNORE_USER_CONFIG=1 and HERMES_IGNORE_RULES=1 are set
by cmd_chat BEFORE 'from cli import main' runs, which is critical
because cli.py evaluates CLI_CONFIG = load_cli_config() at module import
time. The cli.py / hermes_cli.config.load_cli_config() function checks
the env var and skips ~/.hermes/config.yaml when set.

Tests: 11 new tests in tests/hermes_cli/test_ignore_user_config_flags.py
covering the env gate, constructor wiring, cmd_chat simulation, and
argparse flag registration. All pass; existing hermes_cli + cli suites
unaffected (3005 pass, 2 pre-existing unrelated failures).
2026-04-22 19:58:42 -07:00
Teknium 520b8d9002 chore(release): map A-afflatus in AUTHOR_MAP 2026-04-22 18:44:45 -07:00
A-afflatus 9c5c8268c6 fix(skills): remove invalid llm-wiki related skill
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-22 18:44:45 -07:00
Teknium 463fbf1418 chore(release): map iborazzi in AUTHOR_MAP 2026-04-22 18:44:07 -07:00
iborazzi f41031af3a fix: increase max_tokens for GLM 5.1 reasoning headroom 2026-04-22 18:44:07 -07:00
Teknium c78a188ddd refactor: invalidate transport cache when api_mode auto-upgrades to codex_responses
Follow-up for #13862 — the post-init api_mode upgrade at __init__ (direct OpenAI /
gpt-5-requires-responses path) runs AFTER the eager transport warm. Clear the cache
so the stale chat_completions entry is evicted.

Cosmetic: correctness was already fine since _get_transport() keys by current
api_mode, but this avoids leaving unused cache state behind.
2026-04-22 18:34:25 -07:00
kshitijk4poor d30ee2e545 refactor: unify transport dispatch + collapse normalize shims
Consolidate 4 per-transport lazy singleton helpers (_get_anthropic_transport,
_get_codex_transport, _get_chat_completions_transport, _get_bedrock_transport)
into one generic _get_transport(api_mode) with a shared dict cache.

Collapse the 65-line main normalize block (3 api_mode branches, each with
its own SimpleNamespace shim) into 7 lines: one _get_transport() call +
one _nr_to_assistant_message() shared shim. The shim extracts provider_data
fields (codex_reasoning_items, reasoning_details, call_id, response_item_id)
into the SimpleNamespace shape downstream code expects.

Wire chat_completions and bedrock_converse normalize through their transports
for the first time — these were previously falling into the raw
response.choices[0].message else branch.

Remove 8 dead codex adapter imports that have zero callers after PRs 1-6.

Transport lifecycle improvements:
- Eagerly warm transport cache at __init__ (surfaces import errors early)
- Invalidate transport cache on api_mode change (switch_model, fallback
  activation, fallback restore, transport recovery) — prevents stale
  transport after mid-session provider switch

run_agent.py: -32 net lines (11,988 -> 11,956).

PR 7 of the provider transport refactor.
2026-04-22 18:34:25 -07:00
Teknium 36730b90c4 fix(gateway): also clear session-scoped approval state on /new
Follow-up to the /resume and /branch cleanup in the previous commit:
/new is a conversation-boundary operation too, so session-scoped
dangerous-command approvals and /yolo state must not survive it.

Adds a scoped unit test for _clear_session_boundary_security_state that
also covers the /new path (which calls the same helper).
2026-04-22 18:26:59 -07:00
Es1la 050aabe2d4 fix(gateway): reset approval and yolo state on session boundary 2026-04-22 18:26:59 -07:00
Teknium 64c38cc4d0 chore(release): map shushuzn in AUTHOR_MAP 2026-04-22 18:17:37 -07:00
shushuzn fa2dbd1bb5 fix: use utf-8 encoding when reading .env file in load_env()
On Windows, Path.open() defaults to the system ANSI code page (cp1252).
If the .env file contains UTF-8 characters, decoding fails with
'gbk codec can't decode byte 0x94'. Specify encoding='utf-8'
explicitly to ensure consistent behavior across platforms.
2026-04-22 18:17:37 -07:00
Teknium 6ad2fab8cf chore(release): map Dev-Mriganka in AUTHOR_MAP 2026-04-22 18:16:49 -07:00
Dev-Mriganka a14fb3ab1a fix(cli): guard fallback_model list format in save_config_value
When a user manually sets fallback_model as a YAML list instead of a
dict, save_config_value() crashes with:

  AttributeError: 'list' object has no attribute 'get'

at the fb.get('provider') call on hermes_cli/config.py.

The fix adds isinstance(fb, dict) so list-format values are treated as
unconfigured — the fallback_model comment block is appended to guide
correct usage — instead of crashing.

Fixes #4091

Co-authored-by: [AI-assisted — Claude Sonnet 4.6 via Milo/Hermes]
2026-04-22 18:16:49 -07:00
Teknium 2c26a80848 chore(release): map projectadmin-dev in AUTHOR_MAP 2026-04-22 18:16:08 -07:00
projectadmin-dev d67d12b5df Update whatsapp-bridge package-lock.json 2026-04-22 18:16:08 -07:00
Teknium 86510477f3 chore(release): map NIDNASSER-Abdelmajid in AUTHOR_MAP 2026-04-22 18:15:27 -07:00
Abdelmajid NIDNASSER ce4214ec94 Normalize claw workspace paths for Windows 2026-04-22 18:15:27 -07:00
Teknium 50387d718e chore(release): map haimu0x in AUTHOR_MAP 2026-04-22 18:14:49 -07:00
haimu0x aa75d0a90b fix(web): remove duplicate skill count in dashboard badge (#12372)
skillCount i18n already embeds {count}; the badge also prefixed activeSkills.length, showing duplicated numbers.
2026-04-22 18:14:49 -07:00
Teknium 159061836e chore(release): map @akhater's Azure VM commit email in AUTHOR_MAP
Commits in PRs #13346 and #13349 were authored as
Cos_Admin@PTG-COS.lodluvup4uaudnm3ycd14giyug.xx.internal.cloudapp.net
(Azure VM default hostname-based identity). Mapping to akhater so
check-attribution passes and release notes credit correctly.
2026-04-22 18:13:14 -07:00
Ubuntu d70f0f1dc0 fix(docker): allow entrypoint to pass-through non-hermes commands
Commit 8254b820 ("--init for zombie reaping + sleep infinity for
idle-based lifetime") made the Docker terminal backend launch
sandbox containers with `sleep infinity` as the command, so the
lifetime is controlled by an external idle reaper instead of a
fixed timeout.

But `docker/entrypoint.sh` unconditionally wraps its args with
`hermes`:

    exec hermes "$@"

Result: `hermes sleep infinity` → argparse rejects `sleep` as a
subcommand and the container exits immediately with code 2:

    hermes: error: argument command: invalid choice: 'sleep'
        (choose from chat, model, gateway, setup, ...)

Every sandbox container launched by the docker backend dies at
startup, breaking terminal/file tool execution end-to-end.

Fix: dispatch at the tail of the entrypoint. If the first arg is
an executable on PATH (sleep, bash, sh, etc.) run it raw; otherwise
preserve the legacy `hermes <subcommand>` wrapping behavior. Both
invocation styles below keep working:

    docker run <image>                 -> hermes (interactive)
    docker run <image> chat -q "hi"    -> hermes chat -q "hi"
    docker run <image> sleep infinity  -> sleep infinity
    docker run <image> bash            -> bash

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:13:14 -07:00
Ubuntu a3014a4481 fix(docker): add SETUID/SETGID caps so gosu drop in entrypoint succeeds
The Docker terminal backend runs containers with `--cap-drop ALL`
and re-adds only DAC_OVERRIDE, CHOWN, FOWNER. Since commit fee0e0d3
("run as non-root user, use virtualenv") the image entrypoint drops
from root to the `hermes` user via `gosu`, which requires CAP_SETUID
and CAP_SETGID. Without them every sandbox container exits
immediately with:

    Dropping root privileges
    error: failed switching to 'hermes': operation not permitted

Breaking every terminal/file tool invocation in `terminal.backend: docker`
mode.

Fix: add SETUID and SETGID to the cap-add list. The `no-new-privileges`
security-opt is kept, so gosu still cannot escalate back to root after
the one-way drop — the hardening posture is preserved.

Reproduction
------------
With any image whose ENTRYPOINT calls `gosu <user>`, the container
exits immediately under the pre-fix cap set. Post-fix, the drop
succeeds and the container proceeds normally.

    docker run --rm \
        --cap-drop ALL \
        --cap-add DAC_OVERRIDE --cap-add CHOWN --cap-add FOWNER \
        --security-opt no-new-privileges \
        --entrypoint /usr/local/bin/gosu \
        hermes-claude:latest hermes id
    # -> error: failed switching to 'hermes': operation not permitted

    # Same command with SETUID+SETGID added:
    # -> uid=10000(hermes) gid=10000(hermes) groups=10000(hermes)

Tests
-----
Added `test_security_args_include_setuid_setgid_for_gosu_drop` that
asserts both caps are present and the overall hardening posture
(cap-drop ALL + no-new-privileges) is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:13:14 -07:00
Teknium c345ec9a63 fix(display): strip standalone tool-call XML tags from visible text
Port from openclaw/openclaw#67318. Some open models (notably Gemma
variants served via OpenRouter) emit tool calls as XML blocks inside
assistant content instead of via the structured tool_calls field:

  <function name="read_file"><parameter name="path">/tmp/x</parameter></function>
  <tool_call>{"name":"x"}</tool_call>
  <function_calls>[{...}]</function_calls>

Left unstripped, this raw XML leaked to gateway users (Discord, Telegram,
Matrix, Feishu, Signal, WhatsApp, etc.) and the CLI, since hermes-agent's
existing reasoning-tag stripper handled only <think>/<thinking>/<thought>
variants.

Extend _strip_think_blocks (run_agent.py) and _strip_reasoning_tags
(cli.py) to cover:
  * <tool_call>, <tool_calls>, <tool_result>
  * <function_call>, <function_calls>
  * <function name="..."> ... </function> (Gemma-style)

The <function> variant is boundary-gated (only strips when the tag sits
at start-of-line or after sentence punctuation AND carries a name="..."
attribute) so prose mentions like 'Use <function> declarations in JS'
are preserved. Dangling <function name="..."> with no close is
intentionally left visible — matches OpenClaw's asymmetry so a truncated
streaming tail still reaches the user.

Tests: 9 new cases in TestStripThinkBlocks (run_agent) + 9 in new file
tests/run_agent/test_strip_reasoning_tags_cli.py. Covers Qwen-style
<tool_call>, Gemma-style <function name="...">, multi-line payloads,
prose preservation, stray close tags, dangling open tags, and mixed
reasoning+tool_call content.

Note: this port covers the post-streaming final-text path, which is what
gateway adapters and CLI display consume. Extending the per-delta stream
filter in gateway/stream_consumer.py to hide these tags live as they
stream is a separate follow-up; for now users may see raw XML briefly
during a stream before the final cleaned text replaces it.

Refs: openclaw/openclaw#67318
2026-04-22 18:12:42 -07:00
brooklyn! 64b61cc24b Merge pull request #11887 from liftaris/fix/tui-provider-resolution
fix(tui): resolve runtime provider in _make_agent
2026-04-22 20:11:21 -05:00
brooklyn! e47537e99d Merge pull request #14135 from helix4u/fix/tui-state-db-optional
fix(tui): degrade gracefully when state.db init fails
2026-04-22 20:11:07 -05:00
Teknium 9bd1518425 fix(feishu): correct identity model docs and prefer tenant-scoped user_id
Feishu's open_id is app-scoped (same user gets different open_ids per
bot app), not a canonical identity. Functionally correct for single-bot
mode but semantically misleading.

- Add comprehensive Feishu identity model documentation to module docstring
- Prefer user_id (tenant-scoped) over open_id (app-scoped) in
  _resolve_sender_profile when both are available
- Document bot_open_id usage for @mention matching
- Update user_id_alt comment in SessionSource to be platform-generic

Ref: closes analysis from PR #8388 (closed as over-scoped)
2026-04-22 18:06:22 -07:00
Teknium c9c6182839 fix(anthropic): guard max_tokens against non-positive values
Port from openclaw/openclaw#66664. The build_anthropic_kwargs call site
used 'max_tokens or _get_anthropic_max_output(model)', which correctly
falls back when max_tokens is 0 or None (falsy) but lets negative ints
(-1, -500), fractional floats (0.5, 8192.7), NaN, and infinity leak
through to the Anthropic API. Anthropic rejects these with HTTP 400
('max_tokens: must be greater than or equal to 1'), turning a local
config error into a surprise mid-conversation failure.

Add two resolver helpers matching OpenClaw's:
  _resolve_positive_anthropic_max_tokens — returns int(value) only if
    value is a finite positive number; excludes bools, strings, NaN,
    infinity, sub-one positives (floor to 0).
  _resolve_anthropic_messages_max_tokens — prefers a positive requested
    value, else falls back to the model's output ceiling; raises
    ValueError only if no positive budget can be resolved.

The context-window clamp at the call site (max_tokens > context_length)
is preserved unchanged — it handles oversized values; the new resolver
handles non-positive values. These concerns are now cleanly separated.

Tests: 17 new cases covering positive/zero/negative ints, fractional
floats (both >1 and <1), NaN, infinity, booleans, strings, None, and
integration via build_anthropic_kwargs.

Refs: openclaw/openclaw#66664
2026-04-22 18:04:47 -07:00
Teknium 8152de2a84 chore(release): map sicnuyudidi in AUTHOR_MAP 2026-04-22 17:57:13 -07:00
sicnuyudidi c03858733d fix: pass correct arguments in summary model fallback retry
_generate_summary() takes (turns_to_summarize, focus_topic) but the
summary model fallback path passed (messages, summary_budget) — where
'messages' is not even in scope, causing a NameError.

Fix the recursive call to pass the correct variables so the fallback
to the main model actually works when the summary model is unavailable.

Fixes: #10721
2026-04-22 17:57:13 -07:00
Teknium 08089738d8 chore(release): map li0near in AUTHOR_MAP 2026-04-22 17:56:14 -07:00
li0near 82cce3d26c fix: add base_url_env_var to Anthropic ProviderConfig
The Anthropic provider entry in PROVIDER_REGISTRY is the only standard
API-key provider missing a base_url_env_var. This causes the credential
pool to hardcode base_url to https://api.anthropic.com, ignoring
ANTHROPIC_BASE_URL from the environment.

When using a proxy (e.g. LiteLLM, custom gateway), subagent delegation
fails with 401 because:
1. _seed_from_env() creates pool entries with the hardcoded base_url
2. On error recovery, _swap_credential() overwrites the child agent's
   proxy URL with the pool entry's api.anthropic.com
3. The proxy API key is sent to real Anthropic → authentication_error

Adding base_url_env_var="ANTHROPIC_BASE_URL" aligns Anthropic with the
20+ other providers that already have this field set (alibaba, gemini,
deepseek, xai, etc.).
2026-04-22 17:56:14 -07:00
Teknium e5114298f0 chore(release): map WuTianyi123 in AUTHOR_MAP 2026-04-22 17:55:23 -07:00
WuTianyi123 4c1362884d fix(local): respect configured cwd in init_session()
LocalEnvironment._run_bash() spawned subprocess.Popen without a cwd
argument, so init_session()'s pwd -P ran in the gateway process's
startup directory and overwrote self.cwd. Pass cwd=self.cwd so the
initial snapshot captures the user-configured working directory.

Tested:
- pytest tests/ -q (255 env-related tests passed)
- Full suite: 13,537 passed; 70 pre-existing failures unrelated to local env
2026-04-22 17:55:23 -07:00
Teknium 9ea2d96d73 chore(release): map ms-alan in AUTHOR_MAP 2026-04-22 17:54:23 -07:00
ms-alan 8db5517b4c fix: add /opt/data/.local/bin to PATH in Docker image (Closes #13739)
Running 'hermes profile create' inside the container creates wrappers at
/opt/data/.local/bin but that directory isn't on PATH by default.
Add ENV PATH so wrappers are discoverable without touching shell configs.
2026-04-22 17:54:23 -07:00
Teknium 54db933667 chore(release): map longsizhuo in AUTHOR_MAP 2026-04-22 17:53:45 -07:00
Siz Long 846b9758d8 Remove Discussions link from README
Removed Discussions link from README
2026-04-22 17:53:45 -07:00
Teknium 142202910e chore(release): map ycbai in AUTHOR_MAP 2026-04-22 17:45:56 -07:00
ycbai db86ed1990 fix(terminal): forward docker_forward_env and docker_env to container_config
The container_config builder in terminal_tool.py was missing
docker_forward_env and docker_env keys, causing config.yaml's
docker_forward_env setting to be silently ignored. Environment
variables listed in docker_forward_env were never injected into
Docker containers.
This fix adds both keys to the container_config dict so they are
properly passed to _create_environment().
2026-04-22 17:45:56 -07:00
Teknium 7d8b2eee63 fix(delegate): default inherit_mcp_toolsets=true, drop version bump
Follow-up on helix4u's PR #14211:
- Flip default to true: narrowing toolsets=['web','browser'] expresses
  'I want these extras', not 'silently strip MCP'. Parent MCP tools
  (registered at runtime) should survive narrowing by default.
- Drop _config_version bump (22->23); additive nested key under
  delegation.* is handled by _deep_merge, no migration needed.
- Update tests to reflect new default behavior.
2026-04-22 17:45:48 -07:00
helix4u 3e96c87f37 fix(delegate): make MCP toolset inheritance configurable 2026-04-22 17:45:48 -07:00
Teknium 98e1396b15 chore(release): map yudaiyan in AUTHOR_MAP 2026-04-22 17:45:17 -07:00
yudaiyan 96b0f37001 fix: separate browser_cdp into its own toolset
browser_cdp_tool.py registers before browser_tool.py (alphabetical
import order), so its stricter check_fn (requires CDP endpoint) becomes
the toolset-level check for all 11 browser tools. This causes
'hermes doctor' to report the entire browser toolset as unavailable
even when agent-browser is correctly installed.

Move browser_cdp to toolset='browser-cdp' so it is evaluated
independently. browser_navigate et al. only need agent-browser;
browser_cdp additionally requires a reachable CDP endpoint.
2026-04-22 17:45:17 -07:00
Teknium d74eaef5f9 fix(error_classifier): retry mid-stream SSL/TLS alert errors as transport
Mid-stream SSL alerts (bad_record_mac, tls_alert_internal_error, handshake
failures) previously fell through the classifier pipeline to the 'unknown'
bucket because:

  - ssl.SSLError type names weren't in _TRANSPORT_ERROR_TYPES (the
    isinstance(OSError) catch picks up some but not all SDK-wrapped forms)
  - the message-pattern list had no SSL alert substrings

The 'unknown' bucket is still retryable, but: (a) logs tell the user
'unknown' instead of identifying the cause, (b) it bypasses the
transport-specific backoff/fallback logic, and (c) if the SSL error
happens on a large session with a generic 'connection closed' wrapper,
the existing disconnect-on-large-session heuristic would incorrectly
trigger context compression — expensive, and never fixes a transport
hiccup.

Changes:
  - Add ssl.SSLError and its subclass type names to _TRANSPORT_ERROR_TYPES
  - New _SSL_TRANSIENT_PATTERNS list (separate from _SERVER_DISCONNECT_PATTERNS
    so SSL alerts route to timeout, not context_overflow+compress)
  - New step 5 in the classifier pipeline: SSL pattern check runs BEFORE
    the disconnect check to pre-empt the large-session-compress path

Patterns cover both space-separated ('ssl alert', 'bad record mac')
and underscore-separated ('ERR_SSL_SSL/TLS_ALERT_BAD_RECORD_MAC')
forms.  This is load-bearing because OpenSSL 3.x changed the error-code
separator from underscore to slash (e.g. SSLV3_ALERT_BAD_RECORD_MAC →
SSL/TLS_ALERT_BAD_RECORD_MAC) and will likely churn again — matching on
stable alert reason substrings survives future format changes.

Tests (8 new):
  - BAD_RECORD_MAC in Python ssl.c format
  - OpenSSL 3.x underscore format
  - TLSV1_ALERT_INTERNAL_ERROR
  - ssl handshake failure
  - [SSL: ...] prefix fallback
  - Real ssl.SSLError instance
  - REGRESSION GUARD: SSL on large session does NOT compress
  - REGRESSION GUARD: plain disconnect on large session STILL compresses
2026-04-22 17:44:50 -07:00
Teknium b2593c8d4e chore(release): map brianclemens in AUTHOR_MAP 2026-04-22 17:44:40 -07:00
brianclemens 4009f2edd9 feat(docker): add docker-cli to Docker image 2026-04-22 17:44:40 -07:00
Teknium c0100dde35 chore(release): map Somme4096 in AUTHOR_MAP 2026-04-22 17:43:59 -07:00
Somme4096 5fbb69989d fix(docker): add openssh-client for SSH terminal backend 2026-04-22 17:43:59 -07:00
Teknium 6f629a0462 chore(release): map xandersbell in AUTHOR_MAP 2026-04-22 17:43:30 -07:00
Anders Bell 02aba4a728 fix(skills): follow symlinks in iter_skill_index_files
os.walk() by default does not follow symlinks, causing skills
linked via symlinks to be invisible to the skill discovery system.
Add followlinks=True so that symlinked skill directories are scanned.
2026-04-22 17:43:30 -07:00
Teknium b9463e32c6 fix(usage): read top-level Anthropic cache fields from OAI-compatible proxies
Port from cline/cline#10266.

When OpenAI-compatible proxies (OpenRouter, Vercel AI Gateway, Cline)
route Claude models, they sometimes surface the Anthropic-native cache
counters (`cache_read_input_tokens`, `cache_creation_input_tokens`) at
the top level of the `usage` object instead of nesting them inside
`prompt_tokens_details`. Our chat-completions branch of
`normalize_usage()` only read the nested `prompt_tokens_details` fields,
so those responses:

- reported `cache_write_tokens = 0` even when the model actually did a
  prompt-cache write,
- reported only some of the cache-read tokens when the proxy exposed them
  top-level only,
- overstated `input_tokens` by the missed cache-write amount, which in
  turn made cost estimation and the status-bar cache-hit percentage wrong
  for Claude traffic going through these gateways.

Now the chat-completions branch tries the OpenAI-standard
`prompt_tokens_details` first and falls back to the top-level
Anthropic-shape fields only if the nested values are absent/zero. The
Anthropic and Codex Responses branches are unchanged.

Regression guards added for three shapes: top-level write + nested read,
top-level-only, and both-present (nested wins).
2026-04-22 17:40:49 -07:00
Teknium 75221db967 chore(release): map vrinek in AUTHOR_MAP 2026-04-22 17:37:12 -07:00
Konstantinos Karachalios 435d86ce36 fix: use builtin cd in command wrapper to bypass shell aliases
Version managers like frum (Ruby), rvm, nvm, and others commonly alias
cd to a wrapper function that runs additional logic after directory
changes. When Hermes captures the shell environment into a session
snapshot, these aliases are preserved. If the wrapper function fails
in the subprocess context (e.g. frum not on PATH), every cd fails,
causing all terminal commands to exit with code 126.

Using builtin cd bypasses any aliases or functions, ensuring the
directory change always uses the real bash builtin regardless of
what version managers are installed.
2026-04-22 17:37:12 -07:00
Teknium 3e95963bde chore(release): map niyoh120 in AUTHOR_MAP 2026-04-22 17:36:33 -07:00
niyoh 3445530dbf feat(web): support TAVILY_BASE_URL env var for custom proxy endpoints
Make Tavily client respect a TAVILY_BASE_URL environment variable,
defaulting to https://api.tavily.com for backward compatibility.
Consistent with FIRECRAWL_API_URL pattern already used in this module.
2026-04-22 17:36:33 -07:00
Teknium ea83cd91e4 chore(release): map wujhsu in AUTHOR_MAP 2026-04-22 17:35:55 -07:00
wujhsu 276ef49c96 fix(provider): recognize open.bigmodel.cn as Zhipu/ZAI provider
Zhipu AI (智谱) serves both international users via api.z.ai and
China-based users via open.bigmodel.cn. The domestic endpoint was not
mapped in _URL_TO_PROVIDER, causing Hermes to treat it as an unknown
custom endpoint and fall back to the default 128K context length
instead of resolving the correct 200K+ context via models.dev or the
hardcoded GLM defaults.

This affects users of both the standard API
(https://open.bigmodel.cn/api/paas/v4) and the Coding Plan
(https://open.bigmodel.cn/api/coding/paas/v4).
2026-04-22 17:35:55 -07:00
Teknium 0dace06db7 chore(release): map Tianworld in AUTHOR_MAP 2026-04-22 17:34:29 -07:00
Tianworld 953f8fa943 fix(scripts): read gateway_voice_mode.json as UTF-8
json.loads after read_text() used locale default on Windows; UTF-8 state file could mis-parse.

Made-with: Cursor
2026-04-22 17:34:29 -07:00
Teknium 0187de1f67 chore(release): map hxp-plus in AUTHOR_MAP 2026-04-22 17:34:05 -07:00
Xiping Hu c0df4a0a7f fix(email): accept **kwargs in send_document to handle metadata param 2026-04-22 17:34:05 -07:00
Teknium 9eb543cafe feat(/model): merge models.dev entries for lesser-loved providers (#14221)
New and newer models from models.dev now surface automatically in
/model (both hermes model CLI and the gateway Telegram/Discord picker)
for a curated set of secondary providers — no Hermes release required
when the registry publishes a new model.

Primary user-visible fix: on OpenCode Go, typing '/model mimo-v2.5-pro'
no longer silently fuzzy-corrects to 'mimo-v2-pro'. The exact match
against the merged models.dev catalog wins.

Scope (opt-in frozenset _MODELS_DEV_PREFERRED in hermes_cli/models.py):
  opencode-go, opencode-zen, deepseek, kilocode, fireworks, mistral,
  togetherai, cohere, perplexity, groq, nvidia, huggingface, zai,
  gemini, google.

Explicitly NOT merged:
  - openrouter and nous (never): curated list is already a hand-picked
    subset / Portal is source of truth.
  - xai, xiaomi, minimax, minimax-cn, kimi-coding, kimi-coding-cn,
    alibaba, qwen-oauth (per-project decision to keep curated-only).
  - providers with dedicated live-endpoint paths (copilot, anthropic,
    ai-gateway, ollama-cloud, custom, stepfun, openai-codex) — those
    paths already handle freshness themselves.

Changes:
  - hermes_cli/models.py: add _MODELS_DEV_PREFERRED + _merge_with_models_dev
    helper. provider_model_ids() branches on the set at its curated-fallback
    return. Merge is models.dev-first, curated-only extras appended,
    case-insensitive dedup, graceful fallback when models.dev is offline.
  - hermes_cli/model_switch.py: list_authenticated_providers() calls the
    same merge in both its code paths (PROVIDER_TO_MODELS_DEV loop +
    HERMES_OVERLAYS loop). Picker AND validation-fallback both see
    fresh entries.
  - tests/hermes_cli/test_models_dev_preferred_merge.py (new): 13 tests —
    merge-helper unit tests (empty/raise/order/dedup), opencode-go/zen
    behavior, openrouter+nous explicitly guarded from merge.
  - tests/hermes_cli/test_opencode_go_in_model_list.py: converted from
    snapshot-style assertion to a behavior-based floor check, so it
    doesn't break when models.dev publishes additional opencode-go
    entries.

Addresses a report from @pfanis via Telegram: newer Xiaomi variants
on OpenCode Go weren't appearing in the /model picker, and /model
was silently routing requests for new variants to older ones.
2026-04-22 17:33:42 -07:00
Teknium ea0e4c267d chore(release): map jaffarkeikei in AUTHOR_MAP 2026-04-22 17:27:18 -07:00
Jaffar Keikei c47d4eda13 fix(tools): restrict RPC socket permissions to owner-only
The code execution sandbox creates a Unix domain socket in /tmp with
default permissions, allowing any local user to connect and execute
tool calls. Restrict to 0o600 after bind.

Closes #6230
2026-04-22 17:27:18 -07:00
Teknium 80108104cf chore(release): map anna-oake in AUTHOR_MAP 2026-04-22 17:25:30 -07:00
Anna Oake e826cc42ef fix(nix): use stdenv.hostPlatform.system instead of system
system has been deprecated for a while and emits a deprecation warning when evaluated
2026-04-22 17:25:30 -07:00
Teknium e710bb1f7f chore(release): map cgarwood82 in AUTHOR_MAP 2026-04-22 17:25:04 -07:00
Clifford Garwood 27621ef836 feat: add ctx_size to context length keys for Lemonade server support
- Adds 'ctx_size' field to _CONTEXT_LENGTH_KEYS tuple
- Enables hermes agent to correctly detect context size from custom LLMs
  running on Lemonade server that use this field name instead of the
  standard keys (max_seq_len, n_ctx_train, n_ctx)
2026-04-22 17:25:04 -07:00
Teknium 12f9f10f0f chore(release): map houko in AUTHOR_MAP 2026-04-22 17:24:15 -07:00
Evan e67eb7ff4b fix(gateway): add hermes-gateway script pattern to PID detection
The _looks_like_gateway_process function was missing the
hermes-gateway script pattern, causing dashboard to report gateway
as not running even when the process was active.

Patterns now cover all entry points:
- hermes_cli.main gateway
- hermes_cli/main.py gateway
- hermes gateway
- hermes-gateway (new)
- gateway/run.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 17:24:15 -07:00
Teknium dad53205ea chore(release): map simon-gtcl in AUTHOR_MAP 2026-04-22 17:23:41 -07:00
simon-gtcl 10063e730c [verified] docs: fix broken env var example in contributing guide 2026-04-22 17:23:41 -07:00
Teknium 402d048eb6 fix(gateway): also unlink stale PID + lock files on cleanup
Follow-up for salvaged PR #14179.

`_cleanup_invalid_pid_path` previously called `remove_pid_file()` for the
default PID path, but that helper defensively refuses to delete a PID file
whose pid field differs from `os.getpid()` (to protect --replace handoffs).
Every realistic stale-PID scenario is exactly that case: a crashed/Ctrl+C'd
gateway left behind a PID file owned by a now-dead foreign PID.

Once `get_running_pid()` has confirmed the runtime lock is inactive, the
on-disk metadata is known to belong to a dead process, so we can force-unlink
both the PID file and the sibling `gateway.lock` directly instead of going
through the defensive helper.

Also adds a regression test with a dead foreign PID that would have failed
against the previous cleanup logic.
2026-04-22 16:33:46 -07:00
helix4u b52123eb15 fix(gateway): recover stale pid and planned restart state 2026-04-22 16:33:46 -07:00
kshitijk4poor 284e084bcc perf(browser): upgrade agent-browser 0.13 -> 0.26, wire daemon idle timeout
Upgrades agent-browser from 0.13.0 to 0.26.0, picking up 13 releases of
daemon reliability fixes:

- Daemon hang on Linux from waitpid(-1) race in SIGCHLD handler (#1098)
- Chrome killed after ~10s idle due to PR_SET_PDEATHSIG thread tracking (#1157)
- Orphaned Chrome processes via process-group kill on shutdown (#1137)
- Stale daemon after upgrade via .version sidecar and auto-restart (#1134)
- Idle timeout not firing (sleep future recreated each loop) (#1110)
- Navigation hanging on lifecycle events that never fire (#1059, #1092)
- CDP attach hang on Chrome 144+ (#1133)
- Windows daemon TCP bind with Hyper-V port conflicts (#1041)
- Shadow DOM traversal in accessibility tree snapshots
- doctor command for user self-diagnosis

Also wires AGENT_BROWSER_IDLE_TIMEOUT_MS into the browser subprocess
environment so the daemon self-terminates after our configured inactivity
timeout (default 300s). This is the daemon-side counterpart to the
Python-side inactivity reaper — the daemon kills itself and its Chrome
children when no commands arrive, preventing orphan accumulation even
when the Python process dies without running atexit handlers.

Addresses #7343 (daemon socket hangs, shadow DOM) and #13793 (orphan
accumulation from force-killed sessions).
2026-04-22 16:33:36 -07:00
Teknium 3c54ceb3ca chore(release): add AUTHOR_MAP entry for Feranmi10 2026-04-22 16:33:25 -07:00
Feranmi 66d2d7090e fix(model_metadata): add gemma-4 and gemma4 context length entries
Fixes #12976

The generic "gemma": 8192 fallback was incorrectly matching gemma4:31b-cloud
before the more specific Gemma 4 entries could match, causing Hermes to assign
only 8K context instead of 262K. Added "gemma-4" and "gemma4" entries before
the fallback to correctly handle Gemma 4 model naming conventions.
2026-04-22 16:33:25 -07:00
Teknium 51ca575994 feat(gateway): expose plugin slash commands natively on all platforms + decision-capable command hook
Plugin slash commands now surface as first-class commands in every gateway
enumerator — Discord native slash picker, Telegram BotCommand menu, Slack
/hermes subcommand map — without a separate per-platform plugin API.

The existing 'command:<name>' gateway hook gains a decision protocol via
HookRegistry.emit_collect(): handlers that return a dict with
{'decision': 'deny'|'handled'|'rewrite'|'allow'} can intercept slash
command dispatch before core handling runs, unifying what would otherwise
have been a parallel 'pre_gateway_command' hook surface.

Changes:

- gateway/hooks.py: add HookRegistry.emit_collect() that fires the same
  handler set as emit() but collects non-None return values. Backward
  compatible — fire-and-forget telemetry hooks still work via emit().
- hermes_cli/plugins.py: add optional 'args_hint' param to
  register_command() so plugins can opt into argument-aware native UI
  registration (Discord arg picker, future platforms).
- hermes_cli/commands.py: add _iter_plugin_command_entries() helper and
  merge plugin commands into telegram_bot_commands() and
  slack_subcommand_map(). New is_gateway_known_command() recognizes both
  built-in and plugin commands so the gateway hook fires for either.
- gateway/platforms/discord.py: extract _build_auto_slash_command helper
  from the COMMAND_REGISTRY auto-register loop and reuse it for
  plugin-registered commands. Built-in name conflicts are skipped.
- gateway/run.py: before normal slash dispatch, call emit_collect on
  command:<canonical> and honor deny/handled/rewrite/allow decisions.
  Hook now fires for plugin commands too.
- scripts/release.py: AUTHOR_MAP entry for @Magaav.
- Tests: emit_collect semantics, plugin command surfacing per platform,
  decision protocol (deny/handled/rewrite/allow + non-dict tolerance),
  Discord plugin auto-registration + conflict skipping, is_gateway_known_command.

Salvaged from #14131 (@Magaav). Original PR added a parallel
'pre_gateway_command' hook and a platform-keyed plugin command
registry; this re-implementation reuses the existing 'command:<name>'
hook and treats plugin commands as platform-agnostic so the same
capability reaches Telegram and Slack without new API surface.

Co-authored-by: Magaav <73175452+Magaav@users.noreply.github.com>
2026-04-22 16:23:21 -07:00
Teknium c96a548bde feat(models): add xiaomi/mimo-v2.5-pro and mimo-v2.5 to openrouter + nous (#14184)
Replace xiaomi/mimo-v2-pro with xiaomi/mimo-v2.5-pro and xiaomi/mimo-v2.5
in the OpenRouter fallback catalog and the nous provider model list.
Add matching DEFAULT_CONTEXT_LENGTHS entries (1M tokens each).
2026-04-22 16:12:39 -07:00
brooklyn! a1d57292af Merge pull request #14145 from NousResearch/bb/tui-polish
fix(tui): input wrap, shift-tab yolo, statusline, clean boot
2026-04-22 16:48:37 -05:00
Brooklyn Nicholson 83efea661f fix(tui): address copilot round 3 on #14145
- appLayout.tsx: restore the 1-row placeholder when `showStickyPrompt`
  is false. Dropping it saved a row but the composer height shifted by
  one as the prompt appeared/disappeared, jumping the input vertically
  on scroll.
- useInputHandlers: gateway.rpc (from useMainApp) already catches errors
  with its own sys() message and resolves to null. The previous `.catch`
  was dead code and on RPC failures the user saw both 'error: ...' (from
  rpc) and 'failed to toggle yolo'. Drop the catch and gate 'failed to
  toggle yolo' on a non-null response so null (= rpc already spoke)
  stays silent.
2026-04-22 16:48:03 -05:00
Yukipukii1 1e8254e599 fix(agent): guard context compressor against structured message content 2026-04-22 14:46:51 -07:00
Teknium 2e5ddf9d2e chore(release): add AUTHOR_MAP entry for ismell0992-afk 2026-04-22 14:46:10 -07:00
ismell0992-afk 6513138f26 fix(agent): recognize Tailscale CGNAT (100.64.0.0/10) as local for Ollama timeouts
`is_local_endpoint()` leaned on `ipaddress.is_private`, which classifies
RFC-1918 ranges and link-local as private but deliberately excludes the
RFC 6598 CGNAT block (100.64.0.0/10) — the range Tailscale uses for its
mesh IPs. As a result, Ollama reached over Tailscale (e.g.
`http://100.77.243.5:11434`) was treated as remote and missed the
automatic stream-read / stale-stream timeout bumps, so cold model load
plus long prefill would trip the 300 s watchdog before the first token.

Add a module-level `_TAILSCALE_CGNAT = ipaddress.IPv4Network("100.64.0.0/10")`
(built once) and extend `is_local_endpoint()` to match the block both
via the parsed-`IPv4Address` path and the existing bare-string fallback
(for symmetry with the 10/172/192 checks). Also hoist the previously
function-local `import ipaddress` to module scope now that it's used by
the constant.

Extend `TestIsLocalEndpoint` with a CGNAT positive set (lower bound,
representative host, MagicDNS anchor, upper bound) and a near-miss
negative set (just below 100.64.0.0, just above 100.127.255.255, well
outside the block, and first-octet-wrong).
2026-04-22 14:46:10 -07:00
Yukipukii1 44a16c5d9d guard terminal_tool import-time env parsing 2026-04-22 14:45:50 -07:00
Roy-oss1 e86acad8f1 feat(feishu): preserve @mention context on inbound messages
Resolve Feishu @_user_N / @_all placeholders into display names plus a
structured [Mentioned: Name (open_id=...), ...] hint so agents can both
reason about who was mentioned and call Feishu OpenAPI tools with stable
open_ids. Strip bot self-mentions only at message edges (leading
unconditionally, trailing only before whitespace/terminal punctuation)
so commands parse cleanly while mid-text references are preserved.
Covers both plain-text and rich-post payloads.

Also fixes a pre-existing hydration bug: Client.request no longer accepts
the 'method' kwarg on lark-oapi 1.5.3, so bot identity silently failed
to hydrate and self-filtering never worked. Migrate to the
BaseRequest.builder() pattern and accept the 'app_name' field the API
actually returns. Tighten identity matching precedence so open_id is
authoritative when present on both sides.
2026-04-22 14:44:07 -07:00
LeonSGP43 4ac1c959b2 fix(agent): resolve fallback provider key_env secrets 2026-04-22 14:42:48 -07:00
Aslaaen 76c454914a fix(core): ensure non-blocking executor shutdown on async timeout 2026-04-22 14:42:32 -07:00
kshitijk4poor d6ed35d047 feat(security): add global toggle to allow private/internal URL resolution
Adds security.allow_private_urls / HERMES_ALLOW_PRIVATE_URLS toggle so
users on OpenWrt routers, TUN-mode proxies (Clash/Mihomo/Sing-box),
corporate split-tunnel VPNs, and Tailscale networks — where DNS resolves
public domains to 198.18.0.0/15 or 100.64.0.0/10 — can use web_extract,
browser, vision URL fetching, and gateway media downloads.

Single toggle in tools/url_safety.py; all 23 is_safe_url() call sites
inherit automatically. Cached for process lifetime.

Cloud metadata endpoints stay ALWAYS blocked regardless of the toggle:
169.254.169.254 (AWS/GCP/Azure/DO/Oracle), 169.254.170.2 (AWS ECS task
IAM creds), 169.254.169.253 (Azure IMDS wire server), 100.100.100.200
(Alibaba), fd00:ec2::254 (AWS IPv6), the entire 169.254.0.0/16
link-local range, and the metadata.google.internal / metadata.goog
hostnames (checked pre-DNS so they can't be bypassed on networks where
those names resolve to local IPs).

Supersedes #3779 (narrower HERMES_ALLOW_RFC2544 for the same class of
users).

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-04-22 14:38:59 -07:00
Dylan Socolobsky ea9ddecc72 fix(tui): route Ctrl+K and Ctrl+W through macOS readline fallback
Makes Ctrl+K and Ctrl+W work in hermes --tui mode in macOS
2026-04-22 14:38:17 -07:00
Brooklyn Nicholson 4107538da8 style(debug): add missing blank line between LogSnapshot and helpers
Copilot on #14145 flagged PEP 8 / Black convention — two blank lines
between top-level class and next top-level function.
2026-04-22 16:34:05 -05:00
Brooklyn Nicholson 103c71ac36 refactor(tui): /clean pass on tui-polish — data tables, tighter title
- normalizeStatusBar: replace Set + early-returns + cast with a single
  alias lookup table. Handles legacy `false`, trims/lowercases strings,
  maps `on` → `top` in one pass. One expression, no `as` hacks.
- Tab title block: drop the narrative comment, fold
  blockedOnInput/titleStatus/cwdTag/terminalTitle into inline expressions
  inside useTerminalTitle. Avoids shadowing the outer `cwd`.
- tui_gateway statusbar set branch: read `display` once instead of
  `cfg0.get("display")` twice.
2026-04-22 16:32:48 -05:00
Brooklyn Nicholson 8410ac05a9 fix(tui): tab title shows cwd + waiting-for-input marker
Previously the terminal tab title was `{/✓} {model} — Hermes` which
only distinguished busy vs idle. Users juggling multiple Hermes tabs had
no way to tell which one was waiting on them for approval/clarify/sudo/
secret, and no cue for which workspace the tab was attached to.

- 3-state marker: `⚠` when an overlay prompt is open, `` busy, `✓` idle.
- Append `· {shortCwd}` (28-char budget, $HOME → ~) so the tab surfaces
  the workspace directly.
- Drop the `— Hermes` suffix — the marker already signals what this is,
  and tab titles are tight.
2026-04-22 16:27:44 -05:00
bobashopcashier b49a1b71a7 fix(agent): accept empty content with stop_reason=end_turn as valid anthropic response
Anthropic's API can legitimately return content=[] with stop_reason="end_turn"
when the model has nothing more to add after a turn that already delivered the
user-facing text alongside a trivial tool call (e.g. memory write). The transport
validator was treating that as an invalid response, triggering 3 retries that
each returned the same valid-but-empty response, then failing the run with
"Invalid API response after 3 retries."

The downstream normalizer already handles empty content correctly (empty loop
over response.content, content=None, finish_reason="stop"), so the only fix
needed is at the validator boundary.

Tests:
- Empty content + stop_reason="end_turn" → valid (the fix)
- Empty content + stop_reason="tool_use" → still invalid (regression guard)
- Empty content without stop_reason → still invalid (existing behavior preserved)
2026-04-22 14:26:23 -07:00
Brooklyn Nicholson e0d698cfb3 fix(tui): yolo toggle only reports on/off for strict '0'/'1' values
Copilot on #14145 flagged that the shift+tab yolo handler treated any
non-null RPC result as valid, so a response shape like {value: undefined}
or {value: 'weird'} would incorrectly echo 'yolo off'. Now only '1' and
'0' map to on/off; anything else (including missing value) surfaces as
'failed to toggle yolo', matching the null/catch branches.
2026-04-22 15:51:11 -05:00
Teknium ea67e49574 fix(streaming): silent retry when stream dies mid tool-call (#14151)
When the streaming connection dropped AFTER user-visible text was
delivered but a tool call was in flight, we stubbed the turn with a
'⚠ Stream stalled mid tool-call; Ask me to retry' warning — costing
an iteration and breaking the flow.  Users report this happening
increasingly often on long SSE streams through flaky provider routes.

Fix: in the existing inner stream-retry loop, relax the
deltas_were_sent short-circuit.  If a tool call was in flight
(partial_tool_names populated) AND the error is a transient connection
error (timeout, RemoteProtocolError, SSE 'connection lost', etc.),
silently retry instead of bailing out.  Fire a brief 'Connection
dropped mid tool-call; reconnecting…' marker so the user understands
the preamble is about to be re-streamed.

Researched how Claude Code (tombstone + non-streaming fallback),
OpenCode (blind Effect.retry wrapping whole stream), and Clawdbot
(4-way gate: stopReason==error + output==0 + !hadPotentialSideEffects)
handle this.  Chose the narrow Clawdbot-style gate: retry only when
(a) a tool call was actually in flight (otherwise the existing
stub-with-recovered-text is correct for pure-text stalls) and
(b) the error is transient.  Side-effect safety is automatic — no
tool has been dispatched within this single API call yet.

UX trade-off: user sees preamble text twice on retry (OpenCode-style).
Strictly better than a lost action with a 'retry manually' message.
If retries exhaust, falls through to the existing stub-with-warning
path so the user isn't left with zero signal.

Tests: 3 new tests in TestSilentRetryMidToolCall covering
(1) silent retry recovers tool call; (2) exhausted retries fall back
to stub; (3) text-only stalls don't trigger retry.  30/30 pass.
2026-04-22 13:47:33 -07:00
Brooklyn Nicholson b641639e42 fix(debug): distinguish empty-log from missing-log in report placeholder
Copilot on #14138 flagged that the share report says '(file not found)'
when the log exists but is empty (either because the primary is empty
and no .1 rotation exists, or in the rare race where the file is
truncated between _resolve_log_path() and stat()).

- Split _primary_log_path() out of _resolve_log_path so both can share
  the LOG_FILES/home math without duplication.
- _capture_log_snapshot now reports '(file empty)' when the primary
  path exists on disk with zero bytes, and keeps '(file not found)'
  for the truly-missing case.

Tests: rename test_returns_none_for_empty → test_empty_primary_reports_file_empty
with the new assertion, plus a race-path test that monkeypatches
_resolve_log_path to exercise the size==0 branch directly.
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 3ef6992edf fix(tui): drop main-screen banner flash, widen alt-screen clear on entry
- entry.tsx no longer writes bootBanner() to the main screen before the
  alt-screen enters. The <Banner> renders inside the alt screen via the
  seeded intro row, so nothing is lost — just the flash that preceded it.
  Fixes the torn first frame reported on Alacritty (blitz row 5 #17) and
  shaves the 'starting agent' hang perception (row 5 #1) since the UI
  paints straight into the steady-state view
- AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so
  strict emulators start from a pristine grid; named constants replace
  the inline sequences for clarity
- bootBanner.ts deleted — dead code
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 6fb98f343a fix(tui): address copilot review on #14103
- normalizeStatusBar: trim/lowercase + 'on' → 'top' alias so user-edited
  YAML variants (Top, " bottom ", on) coerce correctly
- shift-tab yolo: no-op with sys note when no live session; success-gated
  echo and catch fallback so RPC failures don't report as 'yolo off'
- tui_gateway config.set/get statusbar: isinstance(display, dict) guards
  mirroring the compact branch so a malformed display scalar in config.yaml
  can't raise

Tests: +1 vitest for trim/case/on, +2 pytest for non-dict display survival.
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 48f2ac3352 refactor(tui): /clean pass on blitz closeout — trim comments, flatten logic
- normalizeStatusBar collapses to one ternary expression
- /statusbar slash hoists the toggle value and flattens the branch tree
- shift-tab yolo comment reduced to one line
- cursorLayout/offsetFromPosition lose paragraph-length comments
- appLayout collapses the three {!overlay.agents && …} into one fragment
- StatusRule drops redundant flexShrink={0} (Yoga default)
- server.py uses a walrus + frozenset and trims the compat helper

Net -43 LoC. 237 vitest + 46 pytest green, layouts unchanged.
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 1e8cfa9092 fix(tui): idle good-vibes heart no longer blanks the input's last cell
The heart was rendered as a literal space when inactive. Because it's
absolutely positioned at right:0 inside the composer row, that blank
still overpainted the rightmost input cell. On wrapped 2-line drafts,
editing near the boundary made the final visible character appear to
jump in/out as it crossed the overpainted column.

When inactive, render nothing; only mount the heart while it's actually
animating.
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 88993a468f fix(tui): input wrap width mismatch — last letter no longer flickers
The 'columns' prop passed to TextInput was cols - pw, but the actual
render width is cols - pw - 2 (NoSelect's paddingX={1} on each side
subtracts two cols from the composer area). cursorLayout thought it
had two extra cols, so wrap-ansi wrapped at render col N while the
declared cursor sat at col N+2 on the same row. The render and the
declared cursor disagreed right at the wrap boundary — the last
letter of a sentence spanning two lines flickered in/out as each
keystroke flipped which cell the cursor claimed.

Also polish the /help hotkeys panel — the !cmd / {!cmd} placeholders
read as literal commands to type, so show them with angle-bracket
syntax and a concrete example (blitz row 5 sub-item 4).
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson a7cc903bf5 fix(tui): breathing room above the composer cluster, status tight to input
Previous revision added marginTop={1} to the input which stacked as a
phantom gap BETWEEN status and input. The breathing row should sit
ABOVE the status-in-top cluster, not inside it.

- StatusRulePane at="top" now carries its own marginTop={1} so it
  always has a one-row gap above (separating it from transcript or,
  when queue is present, from the last queue item)
- Input Box marginTop flips: 0 in top mode (status is the separator),
  1 in bottom/off mode (input itself caps the composer cluster)
- Net: status and input are tight together in 'top'; input and status
  are tight together at the bottom in 'bottom'; one-row breathing room
  above whichever element sits on top of the cluster
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 408fc893e9 fix(tui): tighten composer — status sits directly above input, overlays anchor to input
Three bugs rolled together, all in the composer area:

- StatusRule was measuring as 2 rows in Yoga due to a quirk with the
  complex nested <Text wrap="truncate-end"> content. Lock the outer box
  to height={1} so 'top' mode actually abuts the input instead of
  leaving a phantom blank row between them
- FloatingOverlays (slash completions, /model picker, /resume, /skills
  browser, pager) was anchored to the status box. In 'bottom' mode the
  status box moved away, so overlays vanished. Move the overlays into
  the input row (which is position:relative) so they always pop up
  above the input regardless of status position
- Drop the <Text> </Text> fallback in the sticky-prompt slot (only
  render a row when there's an actual sticky prompt to show) and
  collapse the now-unused Box column wrapping the input. Saves two
  rows of dead vertical space in the default layout
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson ea32364c96 fix(tui): /statusbar top = inline above input, not row 0 of the screen
'top' and 'bottom' are positions relative to the input row, not the alt
screen viewport:

- top (default) → inline above the input, where the bar originally lived
  (what 'on' used to mean)
- bottom → below the input, pinned to the last row
- off → hidden

Drops the literal top-of-screen placement; 'on' is kept as a backward-
compat alias that resolves to 'top' at both the config layer
(normalizeStatusBar, _coerce_statusbar) and the slash command.
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson d55a17bd82 refactor(tui): statusbar as 4-mode position (on|off|bottom|top)
Default is back to 'on' (inline, above the input) — bottom was too far
from the input and felt disconnected. Users who want it pinned can
opt in explicitly.

- UiState.statusBar: boolean → 'on' | 'off' | 'bottom' | 'top'
- /statusbar [on|off|bottom|top|toggle]; no-arg still binary-toggles
  between off and on (preserves muscle memory)
- appLayout renders StatusRulePane in three slots (inline inside
  ComposerPane for 'on', above transcript row for 'top', after
  ComposerPane for 'bottom'); only the slot matching ui.statusBar
  actually mounts
- drop the input's marginBottom when 'bottom' so the rule sits tight
  against the input instead of floating a row below
- useConfigSync.normalizeStatusBar coerces legacy bool (true→on,
  false→off) and unknown shapes to 'on' for forward-compat reads
- tui_gateway: split compact from statusbar config handlers; persist
  string enum with _coerce_statusbar helper for legacy bool configs
2026-04-22 15:27:54 -05:00
Brooklyn Nicholson 7027ce42ef fix(tui): blitz closeout — input wrap parity, shift-tab yolo, bottom statusline
- input wrap: add <Text wrap="wrap-char"> mode that drives wrap-ansi with
  wordWrap:false, and align cursorLayout/offsetFromPosition to that same
  boundary (w=cols, trailing-cell overflow). Word-wrap's whitespace
  reshuffle was causing the cursor to jump a word left/right on each
  keystroke near the right edge — blitz row 9
- shift-tab: toggle per-session yolo without submitting a turn (mirrors
  Claude Code's in-place dangerously-approve); slash /yolo still works
  for discoverability — blitz row 5 sub-item 11
- statusline: lift StatusRule out of ComposerPane to a new StatusRulePane
  anchored at the bottom of AppLayout, below the input — blitz row 5
  sub-item 12
2026-04-22 15:27:54 -05:00
Teknium 88564ad8bc fix(skins): don't inherit status_bar_* into light-mode skins
The salvaged status-bar skin keys were seeded on the default skin, but
_build_skin_config merges default.colors into every skin — so daylight
and warm-lightmode silently inherited silver status_bar_text (#C0C0C0)
on their light backgrounds, rendering as low-contrast gray on gray.

Drop the seven status_bar_{text,strong,dim,good,warn,bad,critical}
entries from the default skin's colors and let get_prompt_toolkit_style
_overrides fall back to banner_text / banner_title / banner_dim /
ui_ok / ui_warn / ui_error. Dark skins keep their explicit overrides
and render identically; light skins now inherit their own dark banner
colors for readable status-bar text.
2026-04-22 13:20:02 -07:00
kshitij 81a504a4a0 fix: align status bar skin tests with upstream main
Drop rebased test assumptions about theme-mode helpers removed on main and keep the status bar skin integration aligned with the current skin engine model.
2026-04-22 13:20:02 -07:00
kshitij c323217188 fix: make CLI status bar skin-aware
Route prompt_toolkit status bar colors through the skin engine so /skin updates the status bar alongside the rest of the interactive TUI.

Add regression coverage for the new status bar style override keys and CLI style composition.
2026-04-22 13:20:02 -07:00
helix4u 5dead0f2a0 fix(tui): degrade gracefully when state.db init fails 2026-04-22 13:49:33 -06:00
kshitijk4poor de849c410d refactor(debug): remove dead _read_log_tail/_read_full_log wrappers
These thin wrappers around _capture_log_snapshot had zero production
callers after the snapshot refactor — run_debug_share uses snapshots
directly and collect_debug_report captures internally.  The wrappers
also caused a performance regression: _read_log_tail read up to 512KB
and built full_text just to return tail_text.

Remove both wrappers and migrate TestReadFullLog → TestCaptureLogSnapshot
to test _capture_log_snapshot directly.  Same coverage, tests the real
API instead of dead indirection.
2026-04-22 11:59:39 -07:00
kshitijk4poor 8dc936f10e chore: add taosiyuan163 to AUTHOR_MAP, add truncation boundary tests
Add missing AUTHOR_MAP entry for taosiyuan163 whose truncation boundary
fix was adapted into _capture_log_snapshot().

Add regression tests proving: line-boundary truncation keeps the full
first line, mid-line truncation correctly drops the partial fragment.
2026-04-22 11:59:39 -07:00
Junass1 61d0a99c11 fix(debug): sweep expired pending pastes on slash debug paths 2026-04-22 11:59:39 -07:00
kshitijk4poor 921133cfa5 fix(debug): preserve full line at truncation boundary and cap memory
Adapt the byte-boundary-safe truncation fix from PR #14040 by
taosiyuan163 into the new _capture_log_snapshot() code path: when
the truncation cut lands exactly on a line boundary, keep the first
retained line instead of unconditionally dropping it.

Also add a 2x max_bytes safety cap to the backward-reading loop to
prevent unbounded memory consumption when log files contain very long
lines (e.g. JSON blobs) with few newlines.

Based on #14040 by @taosiyuan163.
2026-04-22 11:59:39 -07:00
helix4u fc3862bdd6 fix(debug): snapshot logs once for debug share 2026-04-22 11:59:39 -07:00
Kaio ec374c0599 Merge branch 'main' into fix/tui-provider-resolution 2026-04-22 11:47:49 -07:00
brooklyn! bc5da42b2c Merge pull request #14045 from NousResearch/bb/subagent-observability
feat(tui): subagent spawn observability overlay
2026-04-22 12:21:25 -05:00
Brooklyn Nicholson 5b0741e986 refactor(tui): consolidate agents overlay — share duration/root helpers via lib
Pull duplicated rules into ui-tui/src/lib/subagentTree so the live overlay,
disk snapshot label, and diff pane all speak one dialect:

- export fmtDuration(seconds) — was a private helper in subagentTree;
  agentsOverlay's local secLabel/fmtDur/fmtElapsedLabel now wrap the same
  core (with UI-only empty-string policy).
- export topLevelSubagents(items) — matches buildSubagentTree's orphan
  semantics (no parent OR parent not in snapshot). Replaces three hand-
  rolled copies across createGatewayEventHandler (disk label), agentsOverlay
  DiffPane, and prior inline filters.

Also collapse agentsOverlay boilerplate:
- replace IIFE title + inner `delta` helper with straight expressions;
- introduce module-level diffMetricLine for replay-diff rows;
- tighten OverlayScrollbar (single thumbColor expression, vBar/thumbBody).

Adds unit coverage for the new exports (fmtDuration + topLevelSubagents).
No behaviour change; 221 tests pass.
2026-04-22 12:10:21 -05:00
Brooklyn Nicholson 9e1f606f7f fix: scroll in agents detail view 2026-04-22 12:03:14 -05:00
Brooklyn Nicholson 7eae504d15 fix(tui): address Copilot round-2 on #14045
- delegate_task: use shared tool_error() for the paused-spawn early return
  so the error envelope matches the rest of the tool.
- Disk snapshot label: treat orphaned nodes (parentId missing from the
  snapshot) as top-level, matching buildSubagentTree / summarizeLabel.
2026-04-22 11:54:19 -05:00
Brooklyn Nicholson eda400d8a5 chore: uptick 2026-04-22 11:32:17 -05:00
Brooklyn Nicholson 82197a87dc style(tui): breathing room around status glyphs in agents overlay
- List rows: pad the status dot with space before (heat-marker gap or
  matching 2-space filler) and after (3 spaces to goal) so `●` / `○` /
  `✓` / `■` / `✗` don't read glued to the heat bar or the goal text.
- Gantt rows: bump id→bar separator from 1 to 2 spaces; widen the id
  gutter from 4 to 5 cols and re-align the ruler lead to match.
2026-04-22 11:01:22 -05:00
Brooklyn Nicholson dee51c1607 fix(tui): address Copilot review on #14045
Four real issues Copilot flagged:

1. delegate_tool: `_build_child_agent` never passed `toolsets` to the
   progress callback, so the event payload's `toolsets` field (wired
   through every layer) was always empty and the overlay's toolsets
   row never populated.  Thread `child_toolsets` through.

2. event handler: the race-protection on subagent.spawn_requested /
   subagent.start only preserved `completed`, so a late-arriving queued
   event could clobber `failed` / `interrupted` too.  Preserve any
   terminal status (`completed | failed | interrupted`).

3. SpawnHud: comment claimed concurrency was approximated by "widest
   level in the tree" but code used `totals.activeCount` (total across
   all parents).  `max_concurrent_children` is a per-parent cap, so
   activeCount over-warns for multi-orchestrator runs.  Switch to
   `max(widthByDepth(tree))`; the label now reads `W/cap+extra` where
   W is the widest level (drives the ratio) and `+extra` is the rest.

4. spawn_tree.list: comment said "peek header without parsing full list"
   but the code json.loads()'d every snapshot.  Adds a per-session
   `_index.jsonl` sidecar written on save; list() reads only the index
   (with a full-scan fallback for pre-index sessions).  O(1) per
   snapshot now vs O(file-size).
2026-04-22 10:56:32 -05:00
kshitijk4poor 5e8262da26 chore: add rnijhara to AUTHOR_MAP 2026-04-22 08:49:24 -07:00
kshitijk4poor 1f216ecbb4 feat(gateway/slack): add SLACK_REACTIONS env toggle for reaction lifecycle
Adds _reactions_enabled() gating to match Discord (DISCORD_REACTIONS) and
Telegram (TELEGRAM_REACTIONS) pattern. Defaults to true to preserve existing
behavior. Gates at three levels:
- _handle_slack_message: skips _reacting_message_ids registration
- on_processing_start: early return
- on_processing_complete: early return

Also adds config.yaml bridge (slack.reactions) and two new tests.
2026-04-22 08:49:24 -07:00
Roopak Nijhara 70a33708e7 fix(gateway/slack): align reaction lifecycle with Discord/Telegram pattern
Slack reactions were placed around handle_message(), which returns
immediately after spawning a background task. This caused the 👀 swap to happen before any real work began.

Fix: implement on_processing_start / on_processing_complete callbacks
(matching Discord/Telegram) so reactions bracket actual _message_handler
work driven by the base class.

Also fixes missing stop_typing() for Slack's assistant thread status
indicator, which left 'is thinking...' stuck in the UI after processing
completed.

- Add _reacting_message_ids set for DM/@mention-only gating
- Add _active_status_threads dict for stop_typing lookup
- Update test_reactions_in_message_flow for new callback pattern
- Add test_reactions_failure_outcome and test_reactions_skipped_for_non_dm_non_mention
2026-04-22 08:49:24 -07:00
Brooklyn Nicholson f06adcc1ae chore(tui): drop unreachable return + prettier pass
- createGatewayEventHandler: remove dead `return` after a block that
  always returns (tool.complete case).  The inner block exits via
  both branches so the outer statement was never reachable.  Was
  pre-existing on main; fixed here because it was the only thing
  blocking `npm run fix` on this branch.
- agentsOverlay + ops: prettier reformatting.

`npm run fix` / `npm run type-check` / `npm test` all clean.
2026-04-22 10:43:59 -05:00
Brooklyn Nicholson 06ebe34b40 fix(tui): repair useInput handler in agents overlay
The Write tool that wrote the cleaned overlay split the `if` keyword
across two lines in 9 places (`    i\nf (cond) {`), which silently
passed one typecheck run but actually left the handler as broken
JS — every keystroke threw.  Input froze in the /agents overlay
(j/k/arrows/q/etc. all no-ops) while the 500ms now-tick kept
rendering, so the UI looked "frozen but the timeline moves".

Reflows the handler as-intended with no behaviour change.
2026-04-22 10:41:13 -05:00
Brooklyn Nicholson 7785654ad5 feat(tui): subagent spawn observability overlay
Adds a live + post-hoc audit surface for recursive delegate_task fan-out.
None of cc/oc/oclaw tackle nested subagent trees inside an Ink overlay;
this ships a view-switched dashboard that handles arbitrary depth + width.

Python
- delegate_tool: every subagent event now carries subagent_id, parent_id,
  depth, model, tool_count; subagent.complete also ships input/output/
  reasoning tokens, cost, api_calls, files_read/files_written, and a
  tail of tool-call outputs
- delegate_tool: new subagent.spawn_requested event + _active_subagents
  registry so the overlay can kill a branch by id and pause new spawns
- tui_gateway: new RPCs delegation.status, delegation.pause,
  subagent.interrupt, spawn_tree.save/list/load (disk under
  \$HERMES_HOME/spawn-trees/<session>/<ts>.json)

TUI
- /agents overlay: full-width list mode (gantt strip + row picker) and
  Enter-to-drill full-width scrollable detail mode; inverse+amber
  selection, heat-coloured branch markers, wall-clock gantt with tick
  ruler, per-branch rollups
- Detail pane: collapsible accordions (Budget, Files, Tool calls, Output,
  Progress, Summary); open-state persists across agents + mode switches
  via a shared atom
- /replay [N|last|list|load <path>] for in-memory + disk history;
  /replay-diff <a> <b> for side-by-side tree comparison
- Status-bar SpawnHud warns as depth/concurrency approaches caps;
  overlay auto-follows the just-finished turn onto history[1]
- Theme: bump DARK dim #B8860B → #CC9B1F for readable secondary text
  globally; keep LIGHT untouched

Tests: +29 new subagentTree unit tests; 215/215 passing.
2026-04-22 10:38:17 -05:00
kshitijk4poor 04e039f687 fix: Kimi /coding thinking block survival + empty reasoning_content + block ordering
Follow-up to the cherry-picked PR #13897 fix. Three issues found:

1. CRITICAL: The thinking block synthesised from reasoning_content was
   immediately stripped by the third-party signature management code
   (Kimi is classified as _is_third_party_anthropic_endpoint). Added a
   Kimi-specific carve-out that preserves unsigned thinking blocks while
   still stripping Anthropic-signed blocks Kimi can't validate.

2. Empty-string reasoning_content was silently dropped because the
   truthiness check ('if reasoning_content and ...') evaluates to False
   for ''. Changed to 'isinstance(reasoning_content, str)' so the
   tier-3 fallback from _copy_reasoning_content_for_api (which injects
   '' for Kimi tool-call messages with no reasoning) actually produces
   a thinking block.

3. The thinking block was appended AFTER tool_use blocks. Anthropic
   protocol requires thinking -> text -> tool_use ordering. Changed to
   blocks.insert(0, ...) to prepend.
2026-04-22 08:21:23 -07:00
Jerome 97a536057d chore(release): add hiddenpuppy to AUTHOR_MAP
Map tsuijinglei@gmail.com → hiddenpuppy.
2026-04-22 08:21:23 -07:00
Jerome 2efb0eea21 fix(anthropic_adapter): preserve reasoning_content on assistant tool-call messages for Kimi /coding
Fixes NousResearch/hermes-agent#13848

Kimi's /coding endpoint speaks the Anthropic Messages protocol but has its
own thinking semantics: when thinking is enabled, Kimi validates message
history and requires every prior assistant tool-call message to carry
OpenAI-style reasoning_content.

The Anthropic path never populated that field, and
convert_messages_to_anthropic strips all Anthropic thinking blocks on
third-party endpoints — so the request failed with HTTP 400:
  "thinking is enabled but reasoning_content is missing in assistant
tool call message at index N"

Now, when an assistant message contains tool_calls and a
reasoning_content string, we append a {"type": "thinking", ...} block
to the Anthropic content so Kimi can validate the history.  This only
affects assistant messages with tool_calls + reasoning_content; plain
text assistant messages are unchanged.
2026-04-22 08:21:23 -07:00
Teknium 77e04a29d5 fix(error_classifier): don't classify generic 404 as model_not_found (#14013)
The 404 branch in _classify_by_status had dead code: the generic
fallback below the _MODEL_NOT_FOUND_PATTERNS check returned the
exact same classification (model_not_found + should_fallback=True),
so every 404 — regardless of message — was treated as a missing model.

This bites local-endpoint users (llama.cpp, Ollama, vLLM) whose 404s
usually mean a wrong endpoint path, proxy routing glitch, or transient
backend issue — not a missing model. Claiming 'model not found' misleads
the next turn and silently falls back to another provider when the real
problem was a URL typo the user should see.

Fix: only classify 404 as model_not_found when the message actually
matches _MODEL_NOT_FOUND_PATTERNS ("invalid model", "model not found",
etc.). Otherwise fall through as unknown (retryable) so the real error
surfaces in the retry loop.

Test updated to match the new behavior. 103 error_classifier tests pass.
2026-04-22 06:11:47 -07:00
Yukipukii1 40619b393f tools: normalize file tool pagination bounds 2026-04-22 06:11:41 -07:00
Teknium 3e652f75b2 fix(plugins+nous): auto-coerce memory plugins; actionable Nous 401 diagnostic (#14005)
* fix(plugins): auto-coerce user-installed memory plugins to kind=exclusive

User-installed memory provider plugins at $HERMES_HOME/plugins/<name>/
were being dispatched to the general PluginManager, which has no
register_memory_provider method on PluginContext. Every startup logged:

  Failed to load plugin 'mempalace': 'PluginContext' object has no
  attribute 'register_memory_provider'

Bundled memory providers were already skipped via skip_names={memory,
context_engine} in discover_and_load, but user-installed ones weren't.

Fix: _parse_manifest now scans the plugin's __init__.py source for
'register_memory_provider' or 'MemoryProvider' (same heuristic as
plugins/memory/__init__.py:_is_memory_provider_dir) and auto-coerces
kind to 'exclusive' when the manifest didn't declare one explicitly.
This routes the plugin to plugins/memory discovery instead of the
general loader.

The escape hatch: if a manifest explicitly declares kind: standalone,
the heuristic doesn't override it.

Reported by Uncle HODL on Discord.

* fix(nous): actionable CLI message when Nous 401 refresh fails

Mirrors the Anthropic 401 diagnostic pattern. When Nous returns 401
and the credential refresh (_try_refresh_nous_client_credentials)
also fails, the user used to see only the raw APIError. Now prints:

  🔐 Nous 401 — Portal authentication failed.
     Response: <truncated body>
     Most likely: Portal OAuth expired, account out of credits, or
                  agent key revoked.
     Troubleshooting:
       • Re-authenticate: hermes login --provider nous
       • Check credits / billing: https://portal.nousresearch.com
       • Verify stored credentials: $HERMES_HOME/auth.json
       • Switch providers temporarily: /model <model> --provider openrouter

Addresses the common 'my hermes model hangs' pattern where the user's
Portal OAuth expired and the CLI gave no hint about the next step.
2026-04-22 05:54:11 -07:00
kshitijk4poor 5fb143169b feat(dashboard): track real API call count per session
Adds schema v7 'api_call_count' column. run_agent.py increments it by 1
per LLM API call, web_server analytics SQL aggregates it, frontend uses
the real counter instead of summing sessions.

The 'API Calls' card on the analytics dashboard previously displayed
COUNT(*) from the sessions table — the number of conversations, not
LLM requests. Each session makes 10-90 API calls through the tool loop,
so the reported number was ~30x lower than real.

Salvaged from PR #10140 (@kshitijk4poor). The cache-token accuracy
portions of the original PR were deferred — per-provider analytics is
the better path there, since cache_write_tokens and actual_cost_usd
are only reliably available from a subset of providers (Anthropic
native, Codex Responses, OpenRouter with usage.include).

Tests:
- schema_version v7 assertion
- migration v2 -> v7 adds api_call_count column with default 0
- update_token_counts increments api_call_count by provided delta
- absolute=True sets api_call_count directly
- /api/analytics/usage exposes total_api_calls in totals
2026-04-22 05:51:58 -07:00
teknium1 be11a75eae chore(release): map hharry11 email to GitHub handle 2026-04-22 05:51:44 -07:00
hharry11 83cb9a03ee fix(cli): ensure project .env is sanitized before loading 2026-04-22 05:51:44 -07:00
WideLee cf55c738e7 refactor(qqbot): migrate qr onboard flow to sync + consolidate into onboard.py
- Replace async create_bind_task/poll_bind_result with synchronous
  httpx.Client equivalents, eliminating manual event loop management
- Move _render_qr and full qr_register() entry-point into onboard.py,
  mirroring the Feishu onboarding pattern
- Remove _qqbot_render_qr and _qqbot_qr_flow from gateway.py (~90 lines);
  call site becomes a single qr_register() import
- Fix potential segfault: previous code called loop.close() in the EXPIRED
  branch and again in the finally block (double-close crashed under uvloop)
2026-04-22 05:50:21 -07:00
Teknium ba7e8b0df9 chore(release): map Abner email to Abnertheforeman 2026-04-22 05:27:10 -07:00
Abner b66644f0ec feat(hindsight): richer session-scoped retain metadata
- Add configurable retain_tags / retain_source / retain_user_prefix /
  retain_assistant_prefix knobs for native Hindsight.
- Thread gateway session identity (user_name, chat_id, chat_name,
  chat_type, thread_id) through AIAgent and MemoryManager into
  MemoryProvider.initialize kwargs so providers can scope and tag
  retained memories.
- Hindsight attaches the new identity fields as retain metadata,
  merges per-call tool tags with configured default tags, and uses
  the configurable transcript labels for auto-retained turns.

Co-authored-by: Abner <abner.the.foreman@agentmail.to>
2026-04-22 05:27:10 -07:00
Teknium b8663813b6 feat(state): auto-prune old sessions + VACUUM state.db at startup (#13861)
* feat(state): auto-prune old sessions + VACUUM state.db at startup

state.db accumulates every session, message, and FTS5 index entry forever.
A heavy user (gateway + cron) reported 384MB with 982 sessions / 68K messages
causing slowdown; manual 'hermes sessions prune --older-than 7' + VACUUM
brought it to 43MB. The prune command and VACUUM are not wired to run
automatically anywhere — sessions grew unbounded until users noticed.

Changes:
- hermes_state.py: new state_meta key/value table, vacuum() method, and
  maybe_auto_prune_and_vacuum() — idempotent via last-run timestamp in
  state_meta so it only actually executes once per min_interval_hours
  across all Hermes processes for a given HERMES_HOME. Never raises.
- hermes_cli/config.py: new 'sessions:' block in DEFAULT_CONFIG
  (auto_prune=True, retention_days=90, vacuum_after_prune=True,
  min_interval_hours=24). Added to _KNOWN_ROOT_KEYS.
- cli.py: call maintenance once at HermesCLI init (shared helper
  _run_state_db_auto_maintenance reads config and delegates to DB).
- gateway/run.py: call maintenance once at GatewayRunner init.
- Docs: user-guide/sessions.md rewrites 'Automatic Cleanup' section.

Why VACUUM matters: SQLite does NOT shrink the file on DELETE — freed
pages get reused on next INSERT. Without VACUUM, a delete-heavy DB stays
bloated forever. VACUUM only runs when the prune actually removed rows,
so tight DBs don't pay the I/O cost.

Tests: 10 new tests in tests/test_hermes_state.py covering state_meta,
vacuum, idempotency, interval skipping, VACUUM-only-when-needed,
corrupt-marker recovery. All 246 existing state/config/gateway tests
still pass.

Verified E2E with real imports + isolated HERMES_HOME: DEFAULT_CONFIG
exposes the new block, load_config() returns it for fresh installs,
first call prunes+vacuums, second call within min_interval_hours skips,
and the state_meta marker persists across connection close/reopen.

* sessions.auto_prune defaults to false (opt-in)

Session history powers session_search recall across past conversations,
so silently pruning on startup could surprise users. Ship the machinery
disabled and let users opt in when they notice state.db is hurting
performance.

- DEFAULT_CONFIG.sessions.auto_prune: True → False
- Call-site fallbacks in cli.py and gateway/run.py match the new default
  (so unmigrated configs still see off)
- Docs: flip 'Enable in config.yaml' framing + tip explains the tradeoff
2026-04-22 05:21:49 -07:00
Teknium b43524ecab fix(wecom): visible poll progress + clearer no-bot-info failure + docstring note
Follow-ups on top of salvaged #13923 (@keifergu):
- Print QR poll dot every 3s instead of every 18s so "Fetching
  configuration results..." doesn't look hung.
- On "status=success but no bot_info" from the WeCom query endpoint,
  log the full payload at WARNING and tell the user we're falling
  back to manual entry (was previously a single opaque line).
- Document in the qr_scan_for_bot_info() docstring that the
  work.weixin.qq.com/ai/qc/* endpoints are the admin-console web-UI
  flow, not the public developer API, and may change without notice.

Also add keifergu@tencent.com to scripts/release.py AUTHOR_MAP so
release notes attribute the feature correctly.
2026-04-22 05:15:32 -07:00
keifergu 3f60a907e1 docs(wecom): document QR scan-to-create setup flow 2026-04-22 05:15:32 -07:00
keifergu 8bcd77a9c2 feat(wecom): add QR scan flow and interactive setup wizard for bot credentials 2026-04-22 05:15:32 -07:00
Teknium d166716c65 feat(optional-skills): add page-agent skill under new web-development category (#13976)
Adds an optional skill that walks users through installing and using
alibaba/page-agent — a pure-JS in-page GUI agent that web developers
embed into their own webapps so end users can drive the UI with
natural language.

Three install paths: CDN demo (30s, no install), npm install into an
existing app with provider config table (Qwen/OpenAI/Ollama/OpenRouter),
and clone-from-source for dev/contributor workflow.

Clear use-case framing up front (embed AI copilot in SaaS/admin/B2B,
modernize legacy UIs, accessibility via natural language) and an
explicit NOT-for list that points users wanting server-side browser
automation back to Hermes' built-in browser tool.

Live-verified: repo builds on Node 22.22 + npm 10.9, dev:demo serves
at localhost:5174, API surface (new PageAgent{...}, panel.show(),
execute(task)) matches what the skill documents. Also verified
discovery end-to-end via OptionalSkillSource with isolated
HERMES_HOME — search/inspect/fetch all resolve
official/web-development/page-agent correctly.

New category directory: optional-skills/web-development/ with a
DESCRIPTION.md explaining the distinction from Hermes' own browser
automation (outside-in vs inside-out).
2026-04-22 04:54:26 -07:00
helix4u a7d78d3bfd fix: preserve reasoning_content on Kimi replay 2026-04-22 04:31:59 -07:00
kshitijk4poor 30ec12970b fix(packaging): include agent.* sub-packages in pyproject.toml
The transport refactor (PRs #13862 ff.) added agent/transports/ as a
sub-package but the setuptools packages.find include list only had
"agent" (top-level files), not "agent.*" (sub-packages).

pip install / Nix builds therefore ship run_agent.py (which now imports
from agent.transports on every API call) but omit the transports
directory entirely, causing:

  ModuleNotFoundError: No module named 'agent.transports'

on every LLM call for packaged installs.

Adds "agent.*" to match the existing pattern used by tools, gateway,
tui_gateway, and plugins.
2026-04-22 03:35:37 -07:00
hengm3467 c6b1ef4e58 feat: add Step Plan provider support (salvage #6005)
Adds a first-class 'stepfun' API-key provider surfaced as Step Plan:

- Support Step Plan setup for both International and China regions
- Discover Step Plan models live from /step_plan/v1/models, with a
  small coding-focused fallback catalog when discovery is unavailable
- Thread StepFun through provider metadata, setup persistence, status
  and doctor output, auxiliary routing, and model normalization
- Add tests for provider resolution, model validation, metadata
  mapping, and StepFun region/model persistence

Based on #6005 by @hengm3467.

Co-authored-by: hengm3467 <100685635+hengm3467@users.noreply.github.com>
2026-04-22 02:59:58 -07:00
Teknium ff9752410a feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)
* feat(plugins): pluggable image_gen backends + OpenAI provider

Adds a ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:

- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
  Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
  incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
  `image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
  `plugin.yaml` of their own).

Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.

FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.

- 41 unit tests (scanner recursion, kind parsing, gate logic,
  registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
  `_handle_image_generate` routes to OpenAI when configured

* fix(image_gen/openai): don't send response_format to gpt-image-*

The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.

* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog

gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.

Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.

* feat(image_gen/openai): expose gpt-image-2 as three quality tiers

Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:

  gpt-image-2-low     ~15s   fast iteration
  gpt-image-2-medium  ~40s   default
  gpt-image-2-high    ~2min  highest fidelity

Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.

Config:
  image_gen.openai.model: gpt-image-2-high
  # or
  image_gen.model: gpt-image-2-low
  # or env var for scripts/tests
  OPENAI_IMAGE_MODEL=gpt-image-2-medium

Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.

* feat(tools_config): plugin image_gen providers inject themselves into picker

'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.

Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
  (name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
  every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
  Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
  writes image_gen.provider and routes to the plugin's list_models()
  catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
  FAL key when any plugin provider reports is_available().

FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.

Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.

397 tests pass across plugins/, tools_config, registry, and picker.

* fix(image_gen): close final gaps for plugin-backend parity with FAL

Two small places that still hardcoded FAL:

- hermes_cli/setup.py status line: an OpenAI-only setup showed
  'Image Generation: missing FAL_KEY'. Now probes plugin providers
  and reports '(OpenAI)' when one is_available() — or falls back to
  'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.

- image_generate tool schema description: said 'using FAL.ai, default
  FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
  user-configured' — and notes the 'image' field can be a URL or an
  absolute path, which the gateway delivers either way via
  extract_local_files().
2026-04-21 21:30:10 -07:00
Teknium d1acf17773 feat(models): add minimax/minimax-m2.5:free to OpenRouter catalog (#13836)
Surfaces the free variant alongside the paid minimax-m2.5 entry in
both the OPENROUTER_MODELS fallback snapshot and the nous/openrouter
provider model list.
2026-04-21 21:27:40 -07:00
Teknium 410f33a728 fix(kimi): don't send Anthropic thinking to api.kimi.com/coding (#13826)
Kimi's /coding endpoint speaks the Anthropic Messages protocol but has
its own thinking semantics: when thinking.enabled is sent, Kimi validates
the history and requires every prior assistant tool-call message to carry
OpenAI-style reasoning_content. The Anthropic path never populates that
field, and convert_messages_to_anthropic strips Anthropic thinking blocks
on third-party endpoints — so after one tool-calling turn the next request
fails with:

  HTTP 400: thinking is enabled but reasoning_content is missing in
  assistant tool call message at index N

Kimi on chat_completions handles thinking via extra_body in
ChatCompletionsTransport (#13503). On the Anthropic route, drop the
parameter entirely and let Kimi drive reasoning server-side.

build_anthropic_kwargs now gates the reasoning_config -> thinking block
on not _is_kimi_coding_endpoint(base_url).

Tests: 8 new parametric tests cover /coding, /coding/v1, /coding/anthropic,
/coding/ (trailing slash), explicit disabled, other third-party endpoints
still getting thinking (MiniMax), native Anthropic unaffected, and the
non-/coding Kimi root route.
2026-04-21 21:19:14 -07:00
Teknium 7b79e0f4c9 chore(models): drop 3 models from nous portal recommended list (#13822)
Remove nvidia/nemotron-3-super-120b-a12b:free, arcee-ai/trinity-large-preview:free,
and openrouter/elephant-alpha from _PROVIDER_MODELS['nous']. The paid nemotron and
arcee-thinking variants remain.
2026-04-21 21:10:20 -07:00
kshitijk4poor 57411fca24 feat: add BedrockTransport + wire all Bedrock transport paths
Fourth and final transport — completes the transport layer with all four
api_modes covered.  Wraps agent/bedrock_adapter.py behind the ProviderTransport
ABC, handles both raw boto3 dicts and already-normalized SimpleNamespace.

Wires all transport methods to production paths in run_agent.py:
- build_kwargs: _build_api_kwargs bedrock branch
- validate_response: response validation, new bedrock_converse branch
- finish_reason: new bedrock_converse branch in finish_reason extraction

Based on PR #13467 by @kshitijk4poor, with one adjustment: the main normalize
loop does NOT add a bedrock_converse branch to invoke normalize_response on
the already-normalized response.  Bedrock's normalize_converse_response runs
at the dispatch site (run_agent.py:5189), so the response already has the
OpenAI-compatible .choices[0].message shape by the time the main loop sees
it.  Falling through to the chat_completions else branch is correct and
sidesteps a redundant NormalizedResponse rebuild.

Transport coverage — complete:
| api_mode           | Transport                | build_kwargs | normalize | validate |
|--------------------|--------------------------|:------------:|:---------:|:--------:|
| anthropic_messages | AnthropicTransport       |             |          |         |
| codex_responses    | ResponsesApiTransport    |             |          |         |
| chat_completions   | ChatCompletionsTransport |             |          |         |
| bedrock_converse   | BedrockTransport         |             |          |         |

17 new BedrockTransport tests pass.  117 transport tests total pass.
160 bedrock/converse tests across tests/agent/ pass.  Full tests/run_agent/
targeted suite passes (885/885 + 15 skipped; the 1 remaining failure is the
pre-existing test_concurrent_interrupt flake on origin/main).
2026-04-21 20:58:37 -07:00
Brooklyn Nicholson 572e27c93f fix(tui): demote gateway log-noise from Activity to info tone
Restore the old-CLI contract where only complete failures tint Activity
red. Everything else is still visible for debugging but no longer
commandeers attention.

- gateway.stderr: always tone='info' (drops the ERRLIKE_RE regex)
- gateway.protocol_error: both pushes demoted to 'info'
- commands.catalog cold-start failure: demoted to 'info'
- approval.request: no longer duplicates the overlay into Activity

Kept as 'error': terminal `error` event, gateway.start_timeout,
gateway-exited, explicit status.update kinds.
2026-04-21 20:57:40 -07:00
Brooklyn Nicholson 76ad697dcb fix(tui): don't force-open Activity on every error
Reverts the auto-expand-on-new-error effect added in 93b47d96. The
effect overrode the user's chosen detailsMode and visually interrupted
every turn. Red/yellow chevron tint remains as the passive signal —
click to read, just like Thinking and Tool calls.
2026-04-21 20:57:40 -07:00
kshitijk4poor 83d86ce344 feat: add ChatCompletionsTransport + wire all default paths
Third concrete transport — handles the default 'chat_completions' api_mode used
by ~16 OpenAI-compatible providers (OpenRouter, Nous, NVIDIA, Qwen, Ollama,
DeepSeek, xAI, Kimi, custom, etc.). Wires build_kwargs + validate_response to
production paths.

Based on PR #13447 by @kshitijk4poor, with fixes:
- Preserve tool_call.extra_content (Gemini thought_signature) via
  ToolCall.provider_data — the original shim stripped it, causing 400 errors
  on multi-turn Gemini 3 thinking requests.
- Preserve reasoning_content distinctly from reasoning (DeepSeek/Moonshot) so
  the thinking-prefill retry check (_has_structured) still triggers.
- Port Kimi/Moonshot quirks (32000 max_tokens, top-level reasoning_effort,
  extra_body.thinking) that landed on main after the original PR was opened.
- Keep _qwen_prepare_chat_messages_inplace alive and call it through the
  transport when sanitization already deepcopied (avoids a second deepcopy).
- Skip the back-compat SimpleNamespace shim in the main normalize loop — for
  chat_completions, response.choices[0].message is already the right shape
  with .content/.tool_calls/.reasoning/.reasoning_content/.reasoning_details
  and per-tool-call .extra_content from the OpenAI SDK.

run_agent.py: -239 lines in _build_api_kwargs default branch extracted to the
transport. build_kwargs now owns: codex-field sanitization, Qwen portal prep,
developer role swap, provider preferences, max_tokens resolution (ephemeral >
user > NVIDIA 16384 > Qwen 65536 > Kimi 32000 > anthropic_max_output), Kimi
reasoning_effort + extra_body.thinking, OpenRouter/Nous/GitHub reasoning,
Nous product attribution tags, Ollama num_ctx, custom-provider think=false,
Qwen vl_high_resolution_images, request_overrides.

39 new transport tests (8 build_kwargs, 5 Kimi, 4 validate, 4 normalize
including extra_content regression, 3 cache stats, 3 basic). Tests/run_agent/
targeted suite passes (885/885 + 15 skipped; the 1 remaining failure is the
test_concurrent_interrupt flake present on origin/main).
2026-04-21 20:50:02 -07:00
emozilla 29693f9d8e feat(aux): use Portal /api/nous/recommended-models for auxiliary models
Wire the auxiliary client (compaction, vision, session search, web extract)
to the Nous Portal's curated recommended-models endpoint when running on
Nous Portal, with a TTL-cached fetch that mirrors how we pull /models for
pricing.

hermes_cli/models.py
  - fetch_nous_recommended_models(portal_base_url, force_refresh=False)
    10-minute TTL cache, keyed per portal URL (staging vs prod don't
    collide).  Public endpoint, no auth required.  Returns {} on any
    failure so callers always get a dict.
  - get_nous_recommended_aux_model(vision, free_tier=None, ...)
    Tier-aware pick from the payload:
      - Paid tier → paidRecommended{Vision,Compaction}Model, falling back
        to freeRecommended* when the paid field is null (common during
        staged rollouts of new paid models).
      - Free tier → freeRecommended* only, never leaks paid models.
    When free_tier is None, auto-detects via the existing
    check_nous_free_tier() helper (already cached 3 min against
    /api/oauth/account).  Detection errors default to paid so we never
    silently downgrade a paying user.

agent/auxiliary_client.py — _try_nous()
  - Replaces the hardcoded xiaomi/mimo free-tier branch with a single call
    to get_nous_recommended_aux_model(vision=vision).
  - Falls back to _NOUS_MODEL (google/gemini-3-flash-preview) when the
    Portal is unreachable or returns a null recommendation.
  - The Portal is now the source of truth for aux model selection; the
    xiaomi allowlist we used to carry is effectively dead.

Tests (15 new)
  - tests/hermes_cli/test_models.py::TestNousRecommendedModels
    Fetch caching, per-portal keying, network failure, force_refresh;
    paid-prefers-paid, paid-falls-to-free, free-never-leaks-paid,
    auto-detect, detection-error → paid default, null/blank modelName
    handling.
  - tests/agent/test_auxiliary_client.py::TestNousAuxiliaryRefresh
    _try_nous honors Portal recommendation for text + vision, falls
    back to google/gemini-3-flash-preview on None or exception.

Behavior won't visibly change today — both tier recommendations currently
point at google/gemini-3-flash-preview — but the moment the Portal ships
a better paid recommendation, subscribers pick it up within 10 minutes
without a Hermes release.
2026-04-21 20:35:16 -07:00
emozilla c22f4a76de remove Nous Portal free-model allowlist
Drop _NOUS_ALLOWED_FREE_MODELS + filter_nous_free_models and its two call
sites. Whatever Nous Portal prices as free now shows up in the picker as-is
— no local allowlist gatekeeping. Free-tier partitioning (paid vs free in
the menu) still runs via partition_nous_models_by_tier.
2026-04-21 20:35:16 -07:00
Kongxi dd8ab40556 fix(delegation): add hard timeout and stale detection for subagent execution (#13770)
- Wrap child.run_conversation() in a ThreadPoolExecutor with configurable
  timeout (delegation.child_timeout_seconds, default 300s) to prevent
  indefinite blocking when a subagent's API call or tool HTTP request hangs.

- Add heartbeat stale detection: if a child's api_call_count doesn't
  advance for 5 consecutive heartbeat cycles (~2.5 min), stop touching
  the parent's activity timestamp so the gateway inactivity timeout
  can fire as a last resort.

- Add 'timeout' as a new exit_reason/status alongside the existing
  completed/max_iterations/interrupted states.

- Use shutdown(wait=False) on the timeout executor to avoid the
  ThreadPoolExecutor.__exit__ deadlock when a child is stuck on
  blocking I/O.

Closes #13768
2026-04-21 20:20:16 -07:00
kshitijk4poor c832ebd67c feat: add ResponsesApiTransport + wire all Codex transport paths
Add ResponsesApiTransport wrapping codex_responses_adapter.py behind the
ProviderTransport ABC. Auto-registered via _discover_transports().

Wire ALL Codex transport methods to production paths in run_agent.py:
- build_kwargs: main _build_api_kwargs codex branch (50 lines extracted)
- normalize_response: main loop + flush + summary + retry (4 sites)
- convert_tools: memory flush tool override
- convert_messages: called internally via build_kwargs
- validate_response: response validation gate
- preflight_kwargs: request sanitization (2 sites)

Remove 7 dead legacy wrappers from AIAgent (_responses_tools,
_chat_messages_to_responses_input, _normalize_codex_response,
_preflight_codex_api_kwargs, _preflight_codex_input_items,
_extract_responses_message_text, _extract_responses_reasoning_text).
Keep 3 ID manipulation methods still used by _build_assistant_message.

Update 18 test call sites across 3 test files to call adapter functions
directly instead of through deleted AIAgent wrappers.

24 new tests. 343 codex/responses/transport tests pass (0 failures).

PR 4 of the provider transport refactor.
2026-04-21 19:48:56 -07:00
Teknium 09dd5eb6a5 chore(release): map xiaoqiang243 personal email in AUTHOR_MAP 2026-04-21 19:48:39 -07:00
Teknium b2ba351380 fix(kimi): reconcile sk-kimi- routing with Anthropic SDK URL semantics
Follow-ups after salvaging xiaoqiang243's kimi-for-coding patches:

- KIMI_CODE_BASE_URL: drop trailing /v1 (was /coding/v1).
  The /coding endpoint speaks Anthropic Messages, and the Anthropic SDK
  appends /v1/messages internally. /coding/v1 + SDK suffix produced
  /coding/v1/v1/messages (a 404). /coding + SDK suffix now yields
  /coding/v1/messages correctly.
- kimi-coding ProviderConfig: keep legacy default api.moonshot.ai/v1 so
  non-sk-kimi- moonshot keys still authenticate. sk-kimi- keys are
  already redirected to api.kimi.com/coding via _resolve_kimi_base_url.
- doctor.py: update Kimi UA to claude-code/0.1.0 (was KimiCLI/1.30.0)
  and rewrite /coding base URLs to /coding/v1 for the /models health
  check (Anthropic surface has no /models).
- test_kimi_env_vars: accept KIMI_CODING_API_KEY as a secondary env var.

E2E verified:
  sk-kimi-<key>  → https://api.kimi.com/coding/v1/messages (Anthropic)
  sk-<legacy>    → https://api.moonshot.ai/v1/chat/completions (OpenAI)
  UA: claude-code/0.1.0, x-api-key: <sk-kimi-*>
2026-04-21 19:48:39 -07:00
王强 6caf8bd994 fix: Enhance Kimi Coding API mode detection and User-Agent 2026-04-21 19:48:39 -07:00
王强 2a026eb762 fix: Update Kimi Coding API endpoint and User-Agent 2026-04-21 19:48:39 -07:00
王强 46d680125e fix(kimi-coding): set anthropic_messages api_mode for /coding endpoint 2026-04-21 19:48:39 -07:00
王强 bad5471409 fix(kimi-coding): add KIMI_CODING_API_KEY fallback + api_mode detection for /coding endpoint 2026-04-21 19:48:39 -07:00
王强 fd403854b9 fix: auto-detect anthropic_messages mode for Kimi /coding/v1 endpoints 2026-04-21 19:48:39 -07:00
王强 de181dfd22 fix: add User-Agent claude-code/0.1.0 for Kimi /coding endpoint
- Add _is_kimi_coding_endpoint() to detect Kimi coding API
- Place Kimi check BEFORE _requires_bearer_auth to ensure User-Agent header is set
- Without this header, Kimi returns 403 on /coding/v1/messages
- Fixes kimi-2.5, kimi-for-coding, kimi-k2.6-code-preview all returning 403
2026-04-21 19:48:39 -07:00
Teknium 84449d9afe fix(prompt): tell CLI agents not to emit MEDIA:/path tags (#13766)
The CLI has no attachment channel — MEDIA:<path> tags are only
intercepted on messaging gateway platforms (Telegram, Discord,
Slack, WhatsApp, Signal, BlueBubbles, email, etc.). On the CLI
they render as literal text, which is confusing for users.

The CLI platform hint was the one PLATFORM_HINTS entry that said
nothing about file delivery, so models trained on the messaging
hints would default to MEDIA: tags on the CLI too. Tool schemas
(browser_tool, tts_tool, etc.) also recommend MEDIA: generically.

Extend the CLI hint to explicitly discourage MEDIA: tags and tell
the agent to reference files by plain absolute path instead.

Add a regression test asserting the CLI hint carries negative
guidance about MEDIA: while messaging hints keep positive guidance.
2026-04-21 19:36:05 -07:00
Teknium 0a1e85dd0d fix(skills/baoyu-comic): absolute curl paths + clarify-timeout handling (#13775)
* fix(skills/baoyu-comic): require absolute paths for curl -o downloads

When downloading generated images across several batches of image_generate
calls, relying on persistent-shell CWD is unsafe. The terminal tool's shell
can rotate (TERMINAL_LIFETIME_SECONDS expiry, a failed cd that leaves the
shell somewhere else), and 'curl -fsSL <url> -o relative.png' then silently
writes to the wrong directory with no error.

Update the skill's Step 7 Download step to require absolute -o paths (or
workdir= on the terminal tool) and add a matching pitfall entry referencing
the Apr 2026 incident where pages 06-09 of a 10-page comic landed at the
repo root instead of comic/<slug>/. The agent then spent several turns
claiming the files existed where they didn't.

* fix(skills/baoyu-comic): handle clarify timeouts correctly in Step 2

A clarify timeout returning 'Use your best judgement to make the choice
and proceed' is NOT user consent to default the entire Step 2 questionnaire.
It is a per-question default only. Add guidance at both instruction sites
(SKILL.md User Questions section, references/workflow.md Step 2 header)
telling the agent to:

1. Continue asking the remaining questions in the sequence after a
   timeout — each question is an independent consent point.
2. Surface every defaulted choice in the next user-visible message
   so the user can correct it when they return. An unreported default
   is indistinguishable from never having asked.

Reported live Apr 2026: agent asked style question via clarify, got a
timeout response, and silently defaulted style + narrative focus +
audience + review flags in one pass. User only learned style had
defaulted to 'ohmsha' after the comic was fully generated.
2026-04-21 19:35:42 -07:00
brooklyn! 1dfbfcfe74 Merge pull request #13729 from NousResearch/bb/tui-diff-inline-sequence
fix(tui): tool inline_diff renders inline with the active turn
2026-04-21 21:13:50 -05:00
Teknium 964b444107 fix(website): run skill extraction automatically on npm run build/start (#13747)
website/src/pages/skills/index.tsx imports ../../data/skills.json, but
that file is git-ignored and generated at build time by
website/scripts/extract-skills.py. CI workflows (deploy-site.yml,
docs-site-checks.yml) run the script explicitly before 'npm run build',
so production and PR checks always work — but 'npm run build' on a
contributor's machine fails with:

  Module not found: Can't resolve '../../data/skills.json'

because the extraction step was never wired into the npm scripts.

Adds a prebuild/prestart hook that runs extract-skills.py automatically.
If python3 or pyyaml aren't installed locally, writes an empty
skills.json instead of hard-failing — the Skills Hub page renders with
an empty state, the rest of the site builds normally, and CI (which
always has the deps) still generates the full catalog for production.
2026-04-21 18:02:04 -07:00
Teknium bf73ced4f5 docs: document delegation width + depth knobs (#13745)
Fills the three gaps left by the orchestrator/width-depth salvage:

- configuration.md §Delegation: max_concurrent_children, max_spawn_depth,
  orchestrator_enabled are now in the canonical config.yaml reference
  with a paragraph covering defaults, clamping, role-degradation, and
  the 3x3x3=27-leaf cost scaling.
- environment-variables.md: adds DELEGATION_MAX_CONCURRENT_CHILDREN to
  the Agent Behavior table.
- features/delegation.md: corrects stale 'default 5, cap 8' wording
  (that was from the original PR; the salvage landed on default 3 with
  no ceiling and a tool error on excess instead of truncation).
2026-04-21 17:54:39 -07:00
Jim Liu 宝玉 83a7a005aa fix(skills): clarify baoyu-comic character sheet role
Page prompts are written in Step 5 from the text descriptions in
characters/characters.md — the PNG sheet generated in Step 7.1
cannot be used to write them. Reposition the PNG as a human-facing
review artifact (and reference for later regenerations / manual
edits), and drop the confusing "Character sheet | Strategy" tables
since the embedding rule is uniform.
2026-04-21 17:50:04 -07:00
Jim Liu 宝玉 fe025425cb fix(skills): address baoyu-comic PR review
- Remove PDF merge feature and scripts/ directory (no pdf-lib dep)
- Correct image_generate docs: prompt-only, returns URL; add
  curl download step after every call
- Downgrade reference images to text-based trait extraction
  (style/palette/scene); character sheet is agent-facing reference
- Unify source file naming on source-{slug}.md across SKILL.md
  and workflow.md
2026-04-21 17:50:04 -07:00
Jim Liu 宝玉 a8beba82d0 refactor(skills): adapt baoyu-comic for Hermes
Port the upstream baoyu-comic skill to Hermes' tool ecosystem, matching
the earlier baoyu-infographic adaptation:

- metadata namespace openclaw -> hermes (+ tags, homepage)
- drop EXTEND.md preferences system (references/config/ removed,
  workflow Step 1.1 removed)
- user prompts via clarify (one question at a time) instead of
  AskUserQuestion batches
- image generation via image_generate instead of baoyu-imagine, with
  aspect-ratio mapping to landscape/portrait/square
- Windows/PowerShell/WSL shell snippets dropped
- file I/O referenced via Hermes write_file/read_file tools
- CLI-style --flags converted to natural-language options and
  user-intent cues (skill matching has no slash command trigger)

Add PORT_NOTES.md documenting the adaptations and a sync procedure.
Art-style/tone/layout reference files are preserved verbatim from
upstream v1.56.1.
2026-04-21 17:50:04 -07:00
Jim Liu 宝玉 be7dcf3628 feat(skills): add baoyu-comic skill 2026-04-21 17:50:04 -07:00
Teknium 8f167e8791 fix(tts): use per-provider input-character caps instead of global 4000 (#13743)
A single global MAX_TEXT_LENGTH = 4000 truncated every TTS provider at
4000 chars, causing long inputs to be silently chopped even though the
underlying APIs allow much more:

  - OpenAI:     4096
  - xAI:        15000
  - MiniMax:    10000
  - ElevenLabs: 5000 / 10000 / 30000 / 40000 (model-aware)
  - Gemini:     ~5000
  - Edge:       ~5000

The schema description also told the model 'Keep under 4000 characters',
which encouraged the agent to self-chunk long briefs into multiple TTS
calls (producing 3 separate audio files instead of one).

New behavior:
  - PROVIDER_MAX_TEXT_LENGTH table + ELEVENLABS_MODEL_MAX_TEXT_LENGTH
    encode the documented per-provider limits.
  - _resolve_max_text_length(provider, cfg) resolves:
      1. tts.<provider>.max_text_length user override
      2. ElevenLabs model_id lookup
      3. provider default
      4. 4000 fallback
  - text_to_speech_tool() and stream_tts_to_speaker() both call the
    resolver; old MAX_TEXT_LENGTH alias kept for back-compat.
  - Schema description no longer hardcodes 4000.

Tests: 27 new unit + E2E tests; all 53 existing TTS tests and 253
voice-command/voice-cli tests still pass.
2026-04-21 17:49:39 -07:00
Brooklyn Nicholson a8eb13e828 fix(tui): dedupe inline diffs, strip CLI review-diff header
After the prior inline-diff fix, the gateway still prepends a literal
"  ┊ review diff" line to inline_diff (it's terminal chrome written by
`_emit_inline_diff`). Wrapping that in a ```diff fence left that header
inside the code block. The agent also often narrates its own edit in a
second fenced diff, so the assistant message ended up stacking two
diff blocks for the same change.

- Strip the leading "┊ review diff" header from queued inline diffs
  before fencing.
- Skip appending the fenced diff entirely when the assistant already
  wrote its own ```diff (or ```patch) fence.

Keeps the single-surface diff UX even when the agent is chatty.
2026-04-21 19:21:00 -05:00
Brooklyn Nicholson e684afa151 fix(tui): keep review-diff tool rows terse
When tool.complete already carries inline_diff, the assistant message owns the full diff block. Suppress the tool-row summary/detail in that case so the turn shows one detailed diff surface instead of a rich diff plus a duplicated tool-detail payload.
2026-04-21 19:13:15 -05:00
Brooklyn Nicholson 9654c9fb10 fix(tui): dedupe inline_diff when assistant already echoes it
Avoid duplicate diff rendering in #13729 flow. We now skip queued inline diffs that are already present in final assistant text and dedupe repeated queued diffs by exact content.
2026-04-21 19:06:49 -05:00
Brooklyn Nicholson 31b3b09ea4 fix(tui): render inline diffs inside assistant completion
Follow-up for #13729: segment-level system artifacts still looked detached in real flow.\n\nInstead of appending inline_diff as a standalone segment/system row, queue sanitized diffs during tool.complete and append them as a fenced diff block to the assistant completion text on message.complete. This keeps the diff in the same message flow as the assistant response.
2026-04-21 19:02:53 -05:00
brooklyn! 1e5daa4ece Merge pull request #13728 from NousResearch/bb/tui-history-local
fix(tui): /history shows the TUI's own transcript, scrollable
2026-04-21 18:59:31 -05:00
brooklyn! 90fca3c7e0 Merge pull request #13724 from NousResearch/bb/tui-resume-all-sources
fix(tui): /resume picker shows telegram/discord/etc sessions
2026-04-21 18:59:12 -05:00
brooklyn! e2feccf7c6 Merge pull request #13726 from NousResearch/bb/tui-multiline-up-arrow
fix(tui): up-arrow inside a multi-line buffer moves cursor, not history
2026-04-21 18:58:56 -05:00
Brooklyn Nicholson 35cc66df62 fix(tui): arrow history fallback when no line exists
Follow-up on multiline arrow behavior: Up/Down now fall back to queue/history whenever there is no logical line above/below the caret (not only at absolute start/end character positions). This makes Up from the end of the top line cycle history, matching expected readline-ish behavior.
2026-04-21 18:55:57 -05:00
Brooklyn Nicholson bd046220b3 fix(tui): narrow /resume sources to human adapters
Follow-up on #13724: showing literally every source was too noisy.\n\n now fetches a wider window (, larger limit) and then filters to a curated allowlist of human-facing sources (tui/cli plus chat adapters like telegram/discord/slack/whatsapp/etc). This keeps row #7 fixed (telegram sessions visible in /resume) without surfacing internal source kinds such as tool/acp.
2026-04-21 18:52:26 -05:00
Brooklyn Nicholson bddf0cd61e fix(tui): keep inline diffs below tool rows and strip ANSI
Follow-up on #13729 from blitz screenshot feedback.\n\n- When tool.complete carried inline_diff but no buffered assistant text existed, pending tool rows were still in streamPendingTools, so diff rendered above the tool row section. appendSegmentMessage now emits pending tool rows as a trail segment before appending the diff artifact.\n- Strip ANSI color escapes from inline_diff payloads so we don't render loud red/green terminal palettes in the transcript.
2026-04-21 18:50:42 -05:00
Brooklyn Nicholson 95fd023eeb fix(tui): only cycle history at input boundaries on arrows
Follow-up on #13726 from blitz feedback: Up/Down history cycling should only trigger when the caret is at the start/end boundary (or the input is empty).\n\nPreviously useInputHandlers intercepted arrows whenever inputBuf was empty, which still stole Up/Down from normal multiline editing. textInput now publishes caret position through inputSelectionStore even with no active selection, and useInputHandlers gates history/queue cycling on those boundaries.
2026-04-21 18:48:35 -05:00
Teknium 9c9d9b7ddf feat(delegate): cross-agent file state coordination for concurrent subagents (#13718)
* feat(models): hide OpenRouter models that don't advertise tool support

Port from Kilo-Org/kilocode#9068.

hermes-agent is tool-calling-first — every provider path assumes the
model can invoke tools. Models whose OpenRouter supported_parameters
doesn't include 'tools' (e.g. image-only or completion-only models)
cannot be driven by the agent loop and fail at the first tool call.

Filter them out of fetch_openrouter_models() so they never appear in
the model picker (`hermes model`, setup wizard, /model slash command).

Permissive when the field is missing — OpenRouter-compatible gateways
(Nous Portal, private mirrors, older snapshots) don't always populate
supported_parameters. Treat missing as 'unknown → allow' rather than
silently emptying the picker on those gateways. Only hide models
whose supported_parameters is an explicit list that omits tools.

Tests cover: tools present → kept, tools absent → dropped, field
missing → kept, malformed non-list → kept, non-dict item → kept,
empty list → dropped.

* feat(delegate): cross-agent file state coordination for concurrent subagents

Prevents mangled edits when concurrent subagents touch the same file
(same process, same filesystem — the mangle scenario from #11215).

Three layers, all opt-out via HERMES_DISABLE_FILE_STATE_GUARD=1:

1. FileStateRegistry (tools/file_state.py) — process-wide singleton
   tracking per-agent read stamps and the last writer globally.
   check_stale() names the sibling subagent in the warning when a
   non-owning agent wrote after this agent's last read.

2. Per-path threading.Lock wrapped around the read-modify-write
   region in write_file_tool and patch_tool. Concurrent siblings on
   the same path serialize; different paths stay fully parallel.
   V4A multi-file patches lock in sorted path order (deadlock-free).

3. Delegate-completion reminder in tools/delegate_tool.py: after a
   subagent returns, writes_since(parent, child_start, parent_reads)
   appends '[NOTE: subagent modified files the parent previously
   read — re-read before editing: ...]' to entry.summary when the
   child touched anything the parent had already seen.

Complements (does not replace) the existing path-overlap check in
run_agent._should_parallelize_tool_batch — batch check prevents
same-file parallel dispatch within one agent's turn (cheap prevention,
zero API cost), registry catches cross-subagent and cross-turn
staleness at write time (detection).

Behavior is warning-only, not hard-failing — matches existing project
style. Errors surface naturally: sibling writes often invalidate the
old_string in patch operations, which already errors cleanly.

Tests: tests/tools/test_file_state_registry.py — 16 tests covering
registry state transitions, per-path locking, per-path-not-global
locking, writes_since filtering, kill switch, and end-to-end
integration through the real read_file/write_file/patch handlers.
2026-04-21 16:41:26 -07:00
Brooklyn Nicholson dff1c8fcf1 fix(tui): tool inline_diff renders inline with the active turn
Reported during TUI v2 blitz retest: code-review diffs from tool.complete
appeared at the top of the current interaction thread, out of sequence
with the agent's messages and tool rows below them.

Root cause — `sys(inline_diff)` appends to `historyItems`, which sits
above the `StreamingAssistant` pane that renders the active turn.
Until the turn closed, the diff visually floated above everything
else happening in the same turn.

Route the diff through `turnController.appendSegmentMessage` instead
so it flushes any pending streaming text first, then lands in the
segment stream beside assistant output and tool calls.  On
`message.complete` the segment list is committed to history in emit
order (diff → final text), matching what the gateway sent.

Adds a regression test that exercises tool.complete → message.complete
with an inline_diff payload and asserts both the streaming and final
placement.
2026-04-21 18:35:59 -05:00
Brooklyn Nicholson 723a9cfb1e fix(tui): /history shows the TUI's own transcript, scrollable
Reported during TUI v2 blitz retest: `/history` in the TUI only shows
prompts from non-TUI Hermes runs and can't scroll the window.  Root
cause is the slash-worker subprocess: it's a detached HermesCLI that
never sees the TUI's turns, so its `conversation_history` starts empty
and `show_history` surfaces whatever was persisted from earlier CLI
sessions — not what the user just did inside the TUI.

Intercept `/history` as a local slash command so it dumps
`ctx.local.getHistoryItems()` — the TUI's own transcript — routed
through the pager (which scrolls after #13591).  Accepts an optional
preview-length argument (default 400 chars per message).

Adds createSlashHandler coverage.
2026-04-21 18:33:27 -05:00
Brooklyn Nicholson d30f6ac44e fix(tui): up-arrow inside a multi-line buffer moves cursor, not history
Reported during TUI v2 blitz retest: typing a multi-line message with
shift-Enter and then pressing Up to edit an earlier line swapped the
whole buffer for the previous history entry instead of moving the
cursor up a line.  Down then restored the draft → the buffer appeared
to "flip" between the draft and a prior prompt.

`useInputHandlers` cycles history on Up/Down, but textInput only
checked `inputBuf.length` — that only counts lines committed with a
trailing backslash, not shift-Enter newlines inside `input` itself.

Fix: detect logical lines inside the input string and move the cursor
one line up/down preserving column offset (clamp to line end when the
destination is shorter, standard editor behavior).  Only fall through
to history cycling when the cursor is already on the first line (Up)
or last line (Down).

Adds unit coverage for the new `lineNav` helper.
2026-04-21 18:31:35 -05:00
Brooklyn Nicholson 0dfb7b8a0d fix(tui): /resume picker shows telegram/discord/etc sessions
Reported during TUI v2 blitz retest: /resume modal only surfaced tui/cli
rows, even though `hermes --tui --resume <id>` with a pasted telegram
session id works fine.  The handler double-fetched with explicit
`source="tui"` and `source="cli"` filters and dropped everything else on
the floor.

Drop the filter — list_sessions_rich(source=None) already excludes
child sessions (subagents, compression continuations) via its default,
and users want to resume messenger sessions from inside the TUI.

Adds gateway regression coverage.
2026-04-21 18:28:40 -05:00
brooklyn! 35a4b093d8 Merge pull request #13719 from NousResearch/bb/tui-markdown-cleanup
refactor(tui): clean markdown.tsx per KISS/DRY
2026-04-21 18:13:18 -05:00
brooklyn! 5504ee8de8 Merge pull request #13715 from NousResearch/bb/tui-markdown-tilde-subscript
fix(tui): don't swallow Kimi/Qwen ~! ~? kaomoji as subscript spans
2026-04-21 18:12:59 -05:00
Brooklyn Nicholson b97b4c4981 refactor(tui): clean markdown.tsx per KISS/DRY
- Drop the outer no-op capture group from INLINE_RE and restructure the
  source as an ordered list of patterns-with-index-comments so each
  alternative is individually greppable. Shift group indices in MdInline
  down by one accordingly.
- Inline single-use helpers (parseFence, isFenceClose, isMarkdownFence,
  trimBareUrl) and intermediate variables (path, lang, raw, prefix, body,
  depth, task body, setext match, etc.).
- Hoist block-level regexes used inside MdImpl (FENCE_CLOSE_RE, SETEXT_RE,
  BULLET_RE, TASK_RE, NUMBERED_RE, QUOTE_RE) to top-level consts so
  they're compiled once instead of per-line.
- Collapse the duplicate compact-vs-normal blank-line branches into one
  if/!compact gap call.
- Move Fence and MdProps types to the bottom per house style.
- Shorten splitTableRow → splitRow and use optional chaining in a few
  match sites.

No behavior change; 162/162 tests pass. Net -22 LoC.
2026-04-21 18:11:12 -05:00
Brooklyn Nicholson 43eb1153e9 fix(tui): don't swallow Kimi/Qwen ~! ~? kaomoji as subscript spans
The inline markdown regex had `~([^~\s][^~]*?)~` for Pandoc-style subscript
(H~2~O, CO~2~). On models that decorate prose with kaomoji like `thing ~!`
and `cool ~?` — Kimi especially — the opener `~!` paired with the next
stray `~` on the line and dim-formatted everything between them with a
leading `_` character, mangling markdown output.

Tighten the pattern to short alphanumeric-only content (`~[A-Za-z0-9]{1,8}~`)
since real subscript never contains punctuation, spaces, or long runs.
Same tightening applied to stripInlineMarkup so width measurement stays
consistent. Classic CLI was unaffected because it renders these literally.
2026-04-21 17:34:48 -05:00
Kaio 9ed6eb0cca fix(tui): resolve runtime provider in _make_agent (#11884)
_make_agent() was not calling resolve_runtime_provider(), so bare-slug
models (e.g. 'claude-opus-4-6' with provider: anthropic) left provider,
base_url, and api_key empty in AIAgent — causing HTTP 404 at
api.anthropic.com.

Now mirrors cli.py: calls resolve_runtime_provider(requested=None) and
forwards all 7 resolved fields to AIAgent.

Adds regression test.
2026-04-18 22:01:07 -07:00
513 changed files with 84603 additions and 7646 deletions
+3
View File
@@ -14,3 +14,6 @@ node_modules
.env
*.md
# Runtime data (bind-mounted at /opt/data; must not leak into build context)
data/
+3
View File
@@ -53,6 +53,9 @@ jobs:
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Regenerate per-skill docs pages + catalogs
run: python3 website/scripts/generate-skill-docs.py
- name: Build skills index (if not already present)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+3
View File
@@ -36,6 +36,9 @@ jobs:
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Regenerate per-skill docs pages + catalogs
run: python3 website/scripts/generate-skill-docs.py
- name: Lint docs diagrams
run: npm run lint:diagrams
working-directory: website
+1
View File
@@ -1,3 +1,4 @@
.DS_Store
/venv/
/_pycache/
*.pyc*
+211 -77
View File
@@ -5,78 +5,61 @@ Instructions for AI coding assistants and developers working on the hermes-agent
## Development Environment
```bash
source venv/bin/activate # ALWAYS activate before running Python
# Prefer .venv; fall back to venv if that's what your checkout has.
source .venv/bin/activate # or: source venv/bin/activate
```
`scripts/run_tests.sh` probes `.venv` first, then `venv`, then
`$HOME/.hermes/hermes-agent/venv` (for worktrees that share a venv with the
main checkout).
## Project Structure
File counts shift constantly — don't treat the tree below as exhaustive.
The canonical source is the filesystem. The notes call out the load-bearing
entry points you'll actually edit.
```
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop
├── run_agent.py # AIAgent class — core conversation loop (~12k LOC)
├── model_tools.py # Tool orchestration, discover_builtin_tools(), handle_function_call()
├── toolsets.py # Toolset definitions, _HERMES_CORE_TOOLS list
├── cli.py # HermesCLI class — interactive CLI orchestrator
├── cli.py # HermesCLI class — interactive CLI orchestrator (~11k LOC)
├── hermes_state.py # SessionDB — SQLite session store (FTS5 search)
├── agent/ # Agent internals
│ ├── prompt_builder.py # System prompt assembly
│ ├── context_compressor.py # Auto context compression
│ ├── prompt_caching.py # Anthropic prompt caching
│ ├── auxiliary_client.py # Auxiliary LLM client (vision, summarization)
│ ├── model_metadata.py # Model context lengths, token estimation
│ ├── models_dev.py # models.dev registry integration (provider-aware context)
│ ├── display.py # KawaiiSpinner, tool preview formatting
│ ├── skill_commands.py # Skill slash commands (shared CLI/gateway)
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI subcommands and setup
│ ├── main.py # Entry point — all `hermes` subcommands
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # Slash command definitions + SlashCommandCompleter
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
│ ├── setup.py # Interactive setup wizard
│ ├── skin_engine.py # Skin/theme engine — CLI visual customization
│ ├── skills_config.py # `hermes skills` — enable/disable skills per platform
│ ├── tools_config.py # `hermes tools` — enable/disable tools per platform
│ ├── skills_hub.py # `/skills` slash command (search, browse, install)
│ ├── models.py # Model catalog, provider model lists
│ ├── model_switch.py # Shared /model switch pipeline (CLI + gateway)
│ └── auth.py # Provider credential resolution
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection
│ ├── terminal_tool.py # Terminal orchestration
│ ├── process_registry.py # Background process management
│ ├── file_tools.py # File read/write/search/patch
│ ├── web_tools.py # Web search/extract (Parallel + Firecrawl)
│ ├── browser_tool.py # Browserbase browser automation
│ ├── code_execution_tool.py # execute_code sandbox
│ ├── delegate_tool.py # Subagent delegation
│ ├── mcp_tool.py # MCP client (~1050 lines)
├── hermes_constants.py # get_hermes_home(), display_hermes_home() — profile-aware paths
├── hermes_logging.py # setup_logging() — agent.log / errors.log / gateway.log (profile-aware)
├── batch_runner.py # Parallel batch processing
├── agent/ # Agent internals (provider adapters, memory, caching, compression, etc.)
├── hermes_cli/ # CLI subcommands, setup wizard, plugins loader, skin engine
├── tools/ # Tool implementations — auto-discovered via tools/registry.py
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
├── gateway/ # Messaging platform gateway
│ ├── run.py # Main loop, slash commands, message dispatch
├── session.py # SessionStore — conversation persistence
└── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal, qqbot
├── gateway/ # Messaging gateway — run.py + session.py + platforms/
│ ├── platforms/ # Adapter per platform (telegram, discord, slack, whatsapp,
│ # homeassistant, signal, matrix, mattermost, email, sms,
│ # dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
│ │ # webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ └── builtin_hooks/ # Always-registered gateway hooks (boot-md, ...)
├── plugins/ # Plugin system (see "Plugins" section below)
│ ├── memory/ # Memory-provider plugins (honcho, mem0, supermemory, ...)
│ ├── context_engine/ # Context-engine plugins
│ └── <others>/ # Dashboard, image-gen, disk-cleanup, examples, ...
├── optional-skills/ # Heavier/niche skills shipped but NOT active by default
├── skills/ # Built-in skills bundled with the repo
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
── src/entry.tsx # TTY gate + render()
│ ├── src/app.tsx # Main state machine and UI
│ ├── src/gatewayClient.ts # Child process + JSON-RPC bridge
│ ├── src/app/ # Decomposed app logic (event handler, slash handler, stores, hooks)
│ ├── src/components/ # Ink components (branding, markdown, prompts, pickers, etc.)
│ ├── src/hooks/ # useCompletion, useInputHistory, useQueue, useVirtualHistory
│ └── src/lib/ # Pure helpers (history, osc52, text, rpc, messages)
── src/ # entry.tsx, app.tsx, gatewayClient.ts + app/components/hooks/lib
├── tui_gateway/ # Python JSON-RPC backend for the TUI
│ ├── entry.py # stdio entrypoint
│ ├── server.py # RPC handlers and session logic
│ ├── render.py # Optional rich/ANSI bridge
│ └── slash_worker.py # Persistent HermesCLI subprocess for slash commands
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains integration)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── cron/ # Scheduler jobs.py, scheduler.py
├── environments/ # RL training environments (Atropos)
├── tests/ # Pytest suite (~3000 tests)
── batch_runner.py # Parallel batch processing
├── scripts/ # run_tests.sh, release.py, auxiliary scripts
── website/ # Docusaurus docs site
└── tests/ # Pytest suite (~15k tests across ~700 files as of Apr 2026)
```
**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys)
**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys only).
**Logs:** `~/.hermes/logs/``agent.log` (INFO+), `errors.log` (WARNING+),
`gateway.log` when running the gateway. Profile-aware via `get_hermes_home()`.
Browse with `hermes logs [--follow] [--level ...] [--session ...]`.
## File Dependency Chain
@@ -94,20 +77,30 @@ run_agent.py, cli.py, batch_runner.py, environments/
## AIAgent Class (run_agent.py)
The real `AIAgent.__init__` takes ~60 parameters (credentials, routing, callbacks,
session context, budget, credential pool, etc.). The signature below is the
minimum subset you'll usually touch — read `run_agent.py` for the full list.
```python
class AIAgent:
def __init__(self,
model: str = "anthropic/claude-opus-4.6",
max_iterations: int = 90,
base_url: str = None,
api_key: str = None,
provider: str = None,
api_mode: str = None, # "chat_completions" | "codex_responses" | ...
model: str = "", # empty → resolved from config/provider later
max_iterations: int = 90, # tool-calling iterations (shared with subagents)
enabled_toolsets: list = None,
disabled_toolsets: list = None,
quiet_mode: bool = False,
save_trajectories: bool = False,
platform: str = None, # "cli", "telegram", etc.
platform: str = None, # "cli", "telegram", etc.
session_id: str = None,
skip_context_files: bool = False,
skip_memory: bool = False,
# ... plus provider, api_mode, callbacks, routing params
credential_pool=None,
# ... plus callbacks, thread/user/chat IDs, iteration_budget, fallback_model,
# checkpoints config, prefill_messages, service_tier, reasoning_config, etc.
): ...
def chat(self, message: str) -> str:
@@ -120,10 +113,13 @@ class AIAgent:
### Agent Loop
The core loop is inside `run_conversation()` — entirely synchronous:
The core loop is inside `run_conversation()` — entirely synchronous, with
interrupt checks, budget tracking, and a one-turn grace call:
```python
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
while (api_call_count < self.max_iterations and self.iteration_budget.remaining > 0) \
or self._budget_grace_call:
if self._interrupt_requested: break
response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
if response.tool_calls:
for tool_call in response.tool_calls:
@@ -134,7 +130,8 @@ while api_call_count < self.max_iterations and self.iteration_budget.remaining >
return response.content
```
Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Reasoning content is stored in `assistant_msg["reasoning"]`.
Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`.
Reasoning content is stored in `assistant_msg["reasoning"]`.
---
@@ -280,7 +277,7 @@ The registry handles schema collection, dispatch, availability checking, and err
**State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `tools/todo_tool.py` for the pattern.
---
@@ -288,9 +285,13 @@ The registry handles schema collection, dispatch, availability checking, and err
### config.yaml options:
1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
2. Bump `_config_version` (currently 5) to trigger migration for existing users
2. Bump `_config_version` (check the current value at the top of `DEFAULT_CONFIG`)
ONLY if you need to actively migrate/transform existing user config
(renaming keys, changing structure). Adding a new key to an existing
section is handled automatically by the deep-merge and does NOT require
a version bump.
### .env variables:
### .env variables (SECRETS ONLY — API keys, tokens, passwords):
1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
```python
"NEW_API_KEY": {
@@ -302,13 +303,29 @@ The registry handles schema collection, dispatch, availability checking, and err
},
```
### Config loaders (two separate systems):
Non-secret settings (timeouts, thresholds, feature flags, paths, display
preferences) belong in `config.yaml`, not `.env`. If internal code needs an
env var mirror for backward compatibility, bridge it from `config.yaml` to
the env var in code (see `gateway_timeout`, `terminal.cwd``TERMINAL_CWD`).
### Config loaders (three paths — know which one you're in):
| Loader | Used by | Location |
|--------|---------|----------|
| `load_cli_config()` | CLI mode | `cli.py` |
| `load_config()` | `hermes tools`, `hermes setup` | `hermes_cli/config.py` |
| Direct YAML load | Gateway | `gateway/run.py` |
| `load_cli_config()` | CLI mode | `cli.py` — merges CLI-specific defaults + user YAML |
| `load_config()` | `hermes tools`, `hermes setup`, most CLI subcommands | `hermes_cli/config.py` — merges `DEFAULT_CONFIG` + user YAML |
| Direct YAML load | Gateway runtime | `gateway/run.py` + `gateway/config.py` — reads user YAML raw |
If you add a new key and the CLI sees it but the gateway doesn't (or vice
versa), you're on the wrong loader. Check `DEFAULT_CONFIG` coverage.
### Working directory:
- **CLI** — uses the process's current directory (`os.getcwd()`).
- **Messaging** — uses `terminal.cwd` from `config.yaml`. The gateway bridges this
to the `TERMINAL_CWD` env var for child tools. **`MESSAGING_CWD` has been
removed** — the config loader prints a deprecation warning if it's set in
`.env`. Same for `TERMINAL_CWD` in `.env`; the canonical setting is
`terminal.cwd` in `config.yaml`.
---
@@ -401,7 +418,95 @@ Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
---
## Plugins
Hermes has two plugin surfaces. Both live under `plugins/` in the repo so
repo-shipped plugins can be discovered alongside user-installed ones in
`~/.hermes/plugins/` and pip-installed entry points.
### General plugins (`hermes_cli/plugins.py` + `plugins/<name>/`)
`PluginManager` discovers plugins from `~/.hermes/plugins/`, `./.hermes/plugins/`,
and pip entry points. Each plugin exposes a `register(ctx)` function that
can:
- Register Python-callback lifecycle hooks:
`pre_tool_call`, `post_tool_call`, `pre_llm_call`, `post_llm_call`,
`on_session_start`, `on_session_end`
- Register new tools via `ctx.register_tool(...)`
- Register CLI subcommands via `ctx.register_cli_command(...)` — the
plugin's argparse tree is wired into `hermes` at startup so
`hermes <pluginname> <subcmd>` works with no change to `main.py`
Hooks are invoked from `model_tools.py` (pre/post tool) and `run_agent.py`
(lifecycle). **Discovery timing pitfall:** `discover_plugins()` only runs
as a side effect of importing `model_tools.py`. Code paths that read plugin
state without importing `model_tools.py` first must call `discover_plugins()`
explicitly (it's idempotent).
### Memory-provider plugins (`plugins/memory/<name>/`)
Separate discovery system for pluggable memory backends. Current built-in
providers include **honcho, mem0, supermemory, byterover, hindsight,
holographic, openviking, retaindb**.
Each provider implements the `MemoryProvider` ABC (see `agent/memory_provider.py`)
and is orchestrated by `agent/memory_manager.py`. Lifecycle hooks include
`sync_turn(turn_messages)`, `prefetch(query)`, `shutdown()`, and optional
`post_setup(hermes_home, config)` for setup-wizard integration.
**CLI commands via `plugins/memory/<name>/cli.py`:** if a memory plugin
defines `register_cli(subparser)`, `discover_plugin_cli_commands()` finds
it at argparse setup time and wires it into `hermes <plugin>`. The
framework only exposes CLI commands for the **currently active** memory
provider (read from `memory.provider` in config.yaml), so disabled
providers don't clutter `hermes --help`.
**Rule (Teknium, May 2026):** plugins MUST NOT modify core files
(`run_agent.py`, `cli.py`, `gateway/run.py`, `hermes_cli/main.py`, etc.).
If a plugin needs a capability the framework doesn't expose, expand the
generic plugin surface (new hook, new ctx method) — never hardcode
plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
honcho argparse from `main.py` for exactly this reason.
### Dashboard / context-engine / image-gen plugin directories
`plugins/context_engine/`, `plugins/image_gen/`, `plugins/example-dashboard/`,
etc. follow the same pattern (ABC + orchestrator + per-plugin directory).
Context engines plug into `agent/context_engine.py`; image-gen providers
into `agent/image_gen_provider.py`.
---
## Skills
Two parallel surfaces:
- **`skills/`** — built-in skills shipped and loadable by default.
Organized by category directories (e.g. `skills/github/`, `skills/mlops/`).
- **`optional-skills/`** — heavier or niche skills shipped with the repo but
NOT active by default. Installed explicitly via
`hermes skills install official/<category>/<skill>`. Adapter lives in
`tools/skills_hub.py` (`OptionalSkillSource`). Categories include
`autonomous-ai-agents`, `blockchain`, `communication`, `creative`,
`devops`, `email`, `health`, `mcp`, `migration`, `mlops`, `productivity`,
`research`, `security`, `web-development`.
When reviewing skill PRs, check which directory they target — heavy-dep or
niche skills belong in `optional-skills/`.
### SKILL.md frontmatter
Standard fields: `name`, `description`, `version`, `platforms`
(OS-gating list: `[macos]`, `[linux, macos]`, ...),
`metadata.hermes.tags`, `metadata.hermes.category`,
`metadata.hermes.config` (config.yaml settings the skill needs — stored
under `skills.config.<key>`, prompted during setup, injected at load time).
---
## Important Policies
### Prompt Caching Must Not Break
Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
@@ -411,9 +516,10 @@ Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT i
Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.
### Working Directory Behavior
- **CLI**: Uses current directory (`.``os.getcwd()`)
- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)
Slash commands that mutate system-prompt state (skills, tools, memory, etc.)
must be **cache-aware**: default to deferred invalidation (change takes
effect next session), with an opt-in `--now` flag for immediate
invalidation. See `/skills install --now` for the canonical pattern.
### Background Process Notifications (Gateway)
@@ -435,7 +541,7 @@ Hermes supports **profiles** — multiple fully isolated instances, each with it
`HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).
The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
`HERMES_HOME` before any module imports. All 119+ references to `get_hermes_home()`
`HERMES_HOME` before any module imports. All `get_hermes_home()` references
automatically scope to the active profile.
### Rules for profile-safe code
@@ -492,8 +598,12 @@ Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_her
for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.
### DO NOT use `simple_term_menu` for interactive menus
Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.
### DO NOT introduce new `simple_term_menu` usage
Existing call sites in `hermes_cli/main.py` remain for legacy fallback only;
the preferred UI is curses (stdlib) because `simple_term_menu` has
ghost-duplication rendering bugs in tmux/iTerm2 with arrow keys. New
interactive menus must use `hermes_cli/curses_ui.py` — see
`hermes_cli/tools_config.py` for the canonical pattern.
### DO NOT use `\033[K` (ANSI erase-to-EOL) in spinner/display code
Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-padding: `f"\r{line}{' ' * pad}"`.
@@ -504,6 +614,30 @@ Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-p
### DO NOT hardcode cross-tool references in schema descriptions
Tool schema descriptions must not mention tools from other toolsets by name (e.g., `browser_navigate` saying "prefer web_search"). Those tools may be unavailable (missing API keys, disabled toolset), causing the model to hallucinate calls to non-existent tools. If a cross-reference is needed, add it dynamically in `get_tool_definitions()` in `model_tools.py` — see the `browser_navigate` / `execute_code` post-processing blocks for the pattern.
### The gateway has TWO message guards — both must bypass approval/control commands
When an agent is running, messages pass through two sequential guards:
(1) **base adapter** (`gateway/platforms/base.py`) queues messages in
`_pending_messages` when `session_key in self._active_sessions`, and
(2) **gateway runner** (`gateway/run.py`) intercepts `/stop`, `/new`,
`/queue`, `/status`, `/approve`, `/deny` before they reach
`running_agent.interrupt()`. Any new command that must reach the runner
while the agent is blocked (e.g. approval prompts) MUST bypass BOTH
guards and be dispatched inline, not via `_process_message_background()`
(which races session lifecycle).
### Squash merges from stale branches silently revert recent fixes
Before squash-merging a PR, ensure the branch is up to date with `main`
(`git fetch origin main && git reset --hard origin/main` in the worktree,
then re-apply the PR's commits). A stale branch's version of an unrelated
file will silently overwrite recent fixes on main when squashed. Verify
with `git diff HEAD~1..HEAD` after merging — unexpected deletions are a
red flag.
### Don't wire in dead code without E2E validation
Unused code that was never shipped was dead for a reason. Before wiring an
unused module into a live code path, E2E test the real resolution chain
with actual imports (not mocks) against a temp `HERMES_HOME`.
### Tests must not write to `~/.hermes/`
The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
@@ -559,7 +693,7 @@ If you can't use the wrapper (e.g. on Windows or inside an IDE that shells
pytest directly), at minimum activate the venv and pass `-n 4`:
```bash
source venv/bin/activate
source .venv/bin/activate # or: source venv/bin/activate
python -m pytest tests/ -q -n 4
```
+6 -6
View File
@@ -9,7 +9,7 @@ Thank you for contributing to Hermes Agent! This guide covers everything you nee
We value contributions in this order:
1. **Bug fixes** — crashes, incorrect behavior, data loss. Always top priority.
2. **Cross-platform compatibility** Windows, macOS, different Linux distros, different terminal emulators. We want Hermes to work everywhere.
2. **Cross-platform compatibility** — macOS, different Linux distros, and WSL2 on Windows. We want Hermes to work everywhere.
3. **Security hardening** — shell injection, prompt injection, path traversal, privilege escalation. See [Security](#security-considerations).
4. **Performance and robustness** — retry logic, error handling, graceful degradation.
5. **New skills** — but only broadly useful ones. See [Should it be a Skill or a Tool?](#should-it-be-a-skill-or-a-tool)
@@ -55,10 +55,10 @@ If your skill is specialized, community-contributed, or niche, it's better suite
| Requirement | Notes |
|-------------|-------|
| **Git** | With `--recurse-submodules` support |
| **Git** | With `--recurse-submodules` support, and the `git-lfs` extension installed |
| **Python 3.11+** | uv will install it if missing |
| **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) |
| **Node.js 18+** | Optional — needed for browser tools and WhatsApp bridge |
| **Node.js 20+** | Optional — needed for browser tools and WhatsApp bridge (matches root `package.json` engines) |
### Clone and install
@@ -88,7 +88,7 @@ cp cli-config.yaml.example ~/.hermes/config.yaml
touch ~/.hermes/.env
# Add at minimum an LLM provider key:
echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env
echo "OPENROUTER_API_KEY=***" >> ~/.hermes/.env
```
### Run
@@ -515,7 +515,7 @@ See `hermes_cli/skin_engine.py` for the full schema and existing skins as exampl
## Cross-Platform Compatibility
Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:
Hermes runs on Linux, macOS, and WSL2 on Windows. When writing code that touches the OS:
### Critical rules
@@ -597,7 +597,7 @@ refactor/description # Code restructuring
1. **Run tests**: `pytest tests/ -v`
2. **Test manually**: Run `hermes` and exercise the code path you changed
3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider Windows and macOS
3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider macOS, Linux, and WSL2
4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.
### PR description
+2 -1
View File
@@ -12,7 +12,7 @@ ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright
# Install system dependencies in one layer, clear APT cache
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git && \
build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git openssh-client docker-cli && \
rm -rf /var/lib/apt/lists/*
# Non-root user for runtime; UID can be overridden via HERMES_UID at runtime
@@ -50,5 +50,6 @@ RUN uv venv && \
# ---------- Runtime ----------
ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
ENV HERMES_HOME=/opt/data
ENV PATH="/opt/data/.local/bin:${PATH}"
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]
+3 -8
View File
@@ -76,7 +76,7 @@ Hermes has two entry points: start the terminal UI with `hermes`, or run the gat
| Set a personality | `/personality [name]` | `/personality [name]` |
| Retry or undo the last turn | `/retry`, `/undo` | `/retry`, `/undo` |
| Compress context / check usage | `/compress`, `/usage`, `/insights [--days N]` | `/compress`, `/usage`, `/insights [days]` |
| Browse skills | `/skills` or `/<skill-name>` | `/skills` or `/<skill-name>` |
| Browse skills | `/skills` or `/<skill-name>` | `/<skill-name>` |
| Interrupt current work | `Ctrl+C` or send a new message | `/stop` or send a new message |
| Platform-specific status | `/platforms` | `/status`, `/sethome` |
@@ -157,14 +157,10 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
uv pip install -e ".[all,dev]"
python -m pytest tests/ -q
scripts/run_tests.sh
```
> **RL Training (optional):** To work on the RL/Tinker-Atropos integration:
> ```bash
> git submodule update --init tinker-atropos
> uv pip install -e "./tinker-atropos"
> ```
> **RL Training (optional):** The RL/Atropos integration (`environments/`) ships via the `atroposlib` and `tinker` dependencies pulled in by `.[all,dev]` — no submodule setup required.
---
@@ -173,7 +169,6 @@ python -m pytest tests/ -q
- 💬 [Discord](https://discord.gg/NousResearch)
- 📚 [Skills Hub](https://agentskills.io)
- 🐛 [Issues](https://github.com/NousResearch/hermes-agent/issues)
- 💡 [Discussions](https://github.com/NousResearch/hermes-agent/discussions)
- 🔌 [HermesClaw](https://github.com/AaronWong1999/hermesclaw) — Community WeChat bridge: Run Hermes Agent and OpenClaw on the same WeChat account.
---
+453
View File
@@ -0,0 +1,453 @@
# Hermes Agent v0.11.0 (v2026.4.23)
**Release Date:** April 23, 2026
**Since v0.9.0:** 1,556 commits · 761 merged PRs · 1,314 files changed · 224,174 insertions · 29 community contributors (290 including co-authors)
> The Interface release — a full React/Ink rewrite of the interactive CLI, a pluggable transport architecture underneath every provider, native AWS Bedrock support, five new inference paths, a 17th messaging platform (QQBot), a dramatically expanded plugin surface, and GPT-5.5 via Codex OAuth.
This release also folds in all the highlights deferred from v0.10.0 (which shipped only the Nous Tool Gateway) — so it covers roughly two weeks of work across the whole stack.
---
## ✨ Highlights
- **New Ink-based TUI** — `hermes --tui` is now a full React/Ink rewrite of the interactive CLI, with a Python JSON-RPC backend (`tui_gateway`). Sticky composer, live streaming with OSC-52 clipboard support, stable picker keys, status bar with per-turn stopwatch and git branch, `/clear` confirm, light-theme preset, and a subagent spawn observability overlay. ~310 commits to `ui-tui/` + `tui_gateway/`. (@OutThisLife + Teknium)
- **Transport ABC + Native AWS Bedrock** — Format conversion and HTTP transport were extracted from `run_agent.py` into a pluggable `agent/transports/` layer. `AnthropicTransport`, `ChatCompletionsTransport`, `ResponsesApiTransport`, and `BedrockTransport` each own their own format conversion and API shape. Native AWS Bedrock support via the Converse API ships on top of the new abstraction. ([#10549](https://github.com/NousResearch/hermes-agent/pull/10549), [#13347](https://github.com/NousResearch/hermes-agent/pull/13347), [#13366](https://github.com/NousResearch/hermes-agent/pull/13366), [#13430](https://github.com/NousResearch/hermes-agent/pull/13430), [#13805](https://github.com/NousResearch/hermes-agent/pull/13805), [#13814](https://github.com/NousResearch/hermes-agent/pull/13814) — @kshitijk4poor + Teknium)
- **Five new inference paths** — Native NVIDIA NIM ([#11774](https://github.com/NousResearch/hermes-agent/pull/11774)), Arcee AI ([#9276](https://github.com/NousResearch/hermes-agent/pull/9276)), Step Plan ([#13893](https://github.com/NousResearch/hermes-agent/pull/13893)), Google Gemini CLI OAuth ([#11270](https://github.com/NousResearch/hermes-agent/pull/11270)), and Vercel ai-gateway with pricing + dynamic discovery ([#13223](https://github.com/NousResearch/hermes-agent/pull/13223) — @jerilynzheng). Plus Gemini routed through the native AI Studio API for better performance ([#12674](https://github.com/NousResearch/hermes-agent/pull/12674)).
- **GPT-5.5 over Codex OAuth** — OpenAI's new GPT-5.5 reasoning model is now available through your ChatGPT Codex OAuth, with live model discovery wired into the model picker so new OpenAI releases show up without catalog updates. ([#14720](https://github.com/NousResearch/hermes-agent/pull/14720))
- **QQBot — 17th supported platform** — Native QQBot adapter via QQ Official API v2, with QR scan-to-configure setup wizard, streaming cursor, emoji reactions, and DM/group policy gating that matches WeCom/Weixin parity. ([#9364](https://github.com/NousResearch/hermes-agent/pull/9364), [#11831](https://github.com/NousResearch/hermes-agent/pull/11831))
- **Plugin surface expanded** — Plugins can now register slash commands (`register_command`), dispatch tools directly (`dispatch_tool`), block tool execution from hooks (`pre_tool_call` can veto), rewrite tool results (`transform_tool_result`), transform terminal output (`transform_terminal_output`), ship image_gen backends, and add custom dashboard tabs. The bundled disk-cleanup plugin is opt-in by default as a reference implementation. ([#9377](https://github.com/NousResearch/hermes-agent/pull/9377), [#10626](https://github.com/NousResearch/hermes-agent/pull/10626), [#10763](https://github.com/NousResearch/hermes-agent/pull/10763), [#10951](https://github.com/NousResearch/hermes-agent/pull/10951), [#12929](https://github.com/NousResearch/hermes-agent/pull/12929), [#12944](https://github.com/NousResearch/hermes-agent/pull/12944), [#12972](https://github.com/NousResearch/hermes-agent/pull/12972), [#13799](https://github.com/NousResearch/hermes-agent/pull/13799), [#14175](https://github.com/NousResearch/hermes-agent/pull/14175))
- **`/steer` — mid-run agent nudges** — `/steer <prompt>` injects a note that the running agent sees after its next tool call, without interrupting the turn or breaking prompt cache. For when you want to course-correct an agent in-flight. ([#12116](https://github.com/NousResearch/hermes-agent/pull/12116))
- **Shell hooks** — Wire any shell script as a Hermes lifecycle hook (pre_tool_call, post_tool_call, on_session_start, etc.) without writing a Python plugin. ([#13296](https://github.com/NousResearch/hermes-agent/pull/13296))
- **Webhook direct-delivery mode** — Webhook subscriptions can now forward payloads straight to a platform chat without going through the agent — zero-LLM push notifications for alerting, uptime checks, and event streams. ([#12473](https://github.com/NousResearch/hermes-agent/pull/12473))
- **Smarter delegation** — Subagents now have an explicit `orchestrator` role that can spawn their own workers, with configurable `max_spawn_depth` (default flat). Concurrent sibling subagents share filesystem state through a file-coordination layer so they don't clobber each other's edits. ([#13691](https://github.com/NousResearch/hermes-agent/pull/13691), [#13718](https://github.com/NousResearch/hermes-agent/pull/13718))
- **Auxiliary models — configurable UI + main-model-first** — `hermes model` has a dedicated "Configure auxiliary models" screen for per-task overrides (compression, vision, session_search, title_generation). `auto` routing now defaults to the main model for side tasks across all users (previously aggregator users were silently routed to a cheap provider-side default). ([#11891](https://github.com/NousResearch/hermes-agent/pull/11891), [#11900](https://github.com/NousResearch/hermes-agent/pull/11900))
- **Dashboard plugin system + live theme switching** — The web dashboard is now extensible. Third-party plugins can add custom tabs, widgets, and views without forking. Paired with a live-switching theme system — themes now control colors, fonts, layout, and density — so users can hot-swap the dashboard look without a reload. Same theming discipline the CLI has, now on the web. ([#10951](https://github.com/NousResearch/hermes-agent/pull/10951), [#10687](https://github.com/NousResearch/hermes-agent/pull/10687), [#14725](https://github.com/NousResearch/hermes-agent/pull/14725))
- **Dashboard polish** — i18n (English + Chinese), react-router sidebar layout, mobile-responsive, Vercel deployment, real per-session API call tracking, and one-click update + gateway restart buttons. ([#9228](https://github.com/NousResearch/hermes-agent/pull/9228), [#9370](https://github.com/NousResearch/hermes-agent/pull/9370), [#9453](https://github.com/NousResearch/hermes-agent/pull/9453), [#10686](https://github.com/NousResearch/hermes-agent/pull/10686), [#13526](https://github.com/NousResearch/hermes-agent/pull/13526), [#14004](https://github.com/NousResearch/hermes-agent/pull/14004) — @austinpickett + @DeployFaith + Teknium)
---
## 🏗️ Core Agent & Architecture
### Transport Layer (NEW)
- **Transport ABC** abstracts format conversion and HTTP transport from `run_agent.py` into `agent/transports/` ([#13347](https://github.com/NousResearch/hermes-agent/pull/13347))
- **AnthropicTransport** — Anthropic Messages API path ([#13366](https://github.com/NousResearch/hermes-agent/pull/13366), @kshitijk4poor)
- **ChatCompletionsTransport** — default path for OpenAI-compatible providers ([#13805](https://github.com/NousResearch/hermes-agent/pull/13805))
- **ResponsesApiTransport** — OpenAI Responses API + Codex build_kwargs wiring ([#13430](https://github.com/NousResearch/hermes-agent/pull/13430), @kshitijk4poor)
- **BedrockTransport** — AWS Bedrock Converse API transport ([#13814](https://github.com/NousResearch/hermes-agent/pull/13814))
### Provider & Model Support
- **Native AWS Bedrock provider** via Converse API ([#10549](https://github.com/NousResearch/hermes-agent/pull/10549))
- **NVIDIA NIM native provider** (salvage of #11703) ([#11774](https://github.com/NousResearch/hermes-agent/pull/11774))
- **Arcee AI direct provider** ([#9276](https://github.com/NousResearch/hermes-agent/pull/9276))
- **Step Plan provider** (salvage #6005) ([#13893](https://github.com/NousResearch/hermes-agent/pull/13893), @kshitijk4poor)
- **Google Gemini CLI OAuth** inference provider ([#11270](https://github.com/NousResearch/hermes-agent/pull/11270))
- **Vercel ai-gateway** with pricing, attribution, and dynamic discovery ([#13223](https://github.com/NousResearch/hermes-agent/pull/13223), @jerilynzheng)
- **GPT-5.5 over Codex OAuth** with live model discovery in the picker ([#14720](https://github.com/NousResearch/hermes-agent/pull/14720))
- **Gemini routed through native AI Studio API** ([#12674](https://github.com/NousResearch/hermes-agent/pull/12674))
- **xAI Grok upgraded to Responses API** ([#10783](https://github.com/NousResearch/hermes-agent/pull/10783))
- **Ollama improvements** — Cloud provider support, GLM continuation, `think=false` control, surrogate sanitization, `/v1` hint ([#10782](https://github.com/NousResearch/hermes-agent/pull/10782))
- **Kimi K2.6** across OpenRouter, Nous Portal, native Kimi, and HuggingFace ([#13148](https://github.com/NousResearch/hermes-agent/pull/13148), [#13152](https://github.com/NousResearch/hermes-agent/pull/13152), [#13169](https://github.com/NousResearch/hermes-agent/pull/13169))
- **Kimi K2.5** promoted to first position in all model suggestion lists ([#11745](https://github.com/NousResearch/hermes-agent/pull/11745), @kshitijk4poor)
- **Xiaomi MiMo v2.5-pro + v2.5** on OpenRouter, Nous Portal, and native ([#14184](https://github.com/NousResearch/hermes-agent/pull/14184), [#14635](https://github.com/NousResearch/hermes-agent/pull/14635), @kshitijk4poor)
- **GLM-5V-Turbo** for coding plan ([#9907](https://github.com/NousResearch/hermes-agent/pull/9907))
- **Claude Opus 4.7** in Nous Portal catalog ([#11398](https://github.com/NousResearch/hermes-agent/pull/11398))
- **OpenRouter elephant-alpha** in curated lists ([#9378](https://github.com/NousResearch/hermes-agent/pull/9378))
- **OpenCode-Go** — Kimi K2.6 and Qwen3.5/3.6 Plus in curated catalog ([#13429](https://github.com/NousResearch/hermes-agent/pull/13429))
- **minimax/minimax-m2.5:free** in OpenRouter catalog ([#13836](https://github.com/NousResearch/hermes-agent/pull/13836))
- **`/model` merges models.dev entries** for lesser-loved providers ([#14221](https://github.com/NousResearch/hermes-agent/pull/14221))
- **Per-provider + per-model `request_timeout_seconds`** config ([#12652](https://github.com/NousResearch/hermes-agent/pull/12652))
- **Configurable API retry count** via `agent.api_max_retries` ([#14730](https://github.com/NousResearch/hermes-agent/pull/14730))
- **ctx_size context length key** for Lemonade server (salvage #8536) ([#14215](https://github.com/NousResearch/hermes-agent/pull/14215))
- **Custom provider display name prompt** ([#9420](https://github.com/NousResearch/hermes-agent/pull/9420))
- **Recommendation badges** on tool provider selection ([#9929](https://github.com/NousResearch/hermes-agent/pull/9929))
- Fix: correct GPT-5 family context lengths in fallback defaults ([#9309](https://github.com/NousResearch/hermes-agent/pull/9309))
- Fix: clamp `minimal` reasoning effort to `low` on Responses API ([#9429](https://github.com/NousResearch/hermes-agent/pull/9429))
- Fix: strip reasoning item IDs from Responses API input when `store=False` ([#10217](https://github.com/NousResearch/hermes-agent/pull/10217))
- Fix: OpenViking correct account default + commit session on `/new` and compress ([#10463](https://github.com/NousResearch/hermes-agent/pull/10463))
- Fix: Kimi `/coding` thinking block survival + empty reasoning_content + block ordering (multiple PRs)
- Fix: don't send Anthropic thinking to api.kimi.com/coding ([#13826](https://github.com/NousResearch/hermes-agent/pull/13826))
- Fix: send `max_tokens`, `reasoning_effort`, and `thinking` for Kimi/Moonshot
- Fix: stream reasoning content through OpenAI-compatible providers that emit it
### Agent Loop & Conversation
- **`/steer <prompt>`** — mid-run agent nudges after next tool call ([#12116](https://github.com/NousResearch/hermes-agent/pull/12116))
- **Orchestrator role + configurable spawn depth** for `delegate_task` (default flat) ([#13691](https://github.com/NousResearch/hermes-agent/pull/13691))
- **Cross-agent file state coordination** for concurrent subagents ([#13718](https://github.com/NousResearch/hermes-agent/pull/13718))
- **Compressor smart collapse, dedup, anti-thrashing**, template upgrade, hardening ([#10088](https://github.com/NousResearch/hermes-agent/pull/10088))
- **Compression summaries respect the conversation's language** ([#12556](https://github.com/NousResearch/hermes-agent/pull/12556))
- **Compression model falls back to main model** on permanent 503/404 ([#10093](https://github.com/NousResearch/hermes-agent/pull/10093))
- **Auto-continue interrupted agent work** after gateway restart ([#9934](https://github.com/NousResearch/hermes-agent/pull/9934))
- **Activity heartbeats** prevent false gateway inactivity timeouts ([#10501](https://github.com/NousResearch/hermes-agent/pull/10501))
- **Auxiliary models UI** — dedicated screen for per-task overrides ([#11891](https://github.com/NousResearch/hermes-agent/pull/11891))
- **Auxiliary auto routing defaults to main model** for all users ([#11900](https://github.com/NousResearch/hermes-agent/pull/11900))
- **PLATFORM_HINTS for Matrix, Mattermost, Feishu** ([#14428](https://github.com/NousResearch/hermes-agent/pull/14428), @alt-glitch)
- Fix: reset retry counters after compression; stop poisoning conversation history ([#10055](https://github.com/NousResearch/hermes-agent/pull/10055))
- Fix: break compression-exhaustion infinite loop and auto-reset session ([#10063](https://github.com/NousResearch/hermes-agent/pull/10063))
- Fix: stale agent timeout, uv venv detection, empty response after tools ([#10065](https://github.com/NousResearch/hermes-agent/pull/10065))
- Fix: prevent premature loop exit when weak models return empty after substantive tool calls ([#10472](https://github.com/NousResearch/hermes-agent/pull/10472))
- Fix: preserve pre-start terminal interrupts ([#10504](https://github.com/NousResearch/hermes-agent/pull/10504))
- Fix: improve interrupt responsiveness during concurrent tool execution ([#10935](https://github.com/NousResearch/hermes-agent/pull/10935))
- Fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt ([#10940](https://github.com/NousResearch/hermes-agent/pull/10940))
- Fix: `/stop` no longer resets the session ([#9224](https://github.com/NousResearch/hermes-agent/pull/9224))
- Fix: honor interrupts during MCP tool waits ([#9382](https://github.com/NousResearch/hermes-agent/pull/9382), @helix4u)
- Fix: break stuck session resume loops after repeated restarts ([#9941](https://github.com/NousResearch/hermes-agent/pull/9941))
- Fix: empty response nudge crash + placeholder leak to cron targets ([#11021](https://github.com/NousResearch/hermes-agent/pull/11021))
- Fix: streaming cursor sanitization to prevent message truncation (multiple PRs)
- Fix: resolve `context_length` for plugin context engines ([#9238](https://github.com/NousResearch/hermes-agent/pull/9238))
### Session & Memory
- **Auto-prune old sessions + VACUUM state.db** at startup ([#13861](https://github.com/NousResearch/hermes-agent/pull/13861))
- **Honcho overhaul** — context injection, 5-tool surface, cost safety, session isolation ([#10619](https://github.com/NousResearch/hermes-agent/pull/10619))
- **Hindsight richer session-scoped retain metadata** (salvage of #6290) ([#13987](https://github.com/NousResearch/hermes-agent/pull/13987))
- Fix: deduplicate memory provider tools to prevent 400 on strict providers ([#10511](https://github.com/NousResearch/hermes-agent/pull/10511))
- Fix: discover user-installed memory providers from `$HERMES_HOME/plugins/` ([#10529](https://github.com/NousResearch/hermes-agent/pull/10529))
- Fix: add `on_memory_write` bridge to sequential tool execution path ([#10507](https://github.com/NousResearch/hermes-agent/pull/10507))
- Fix: preserve `session_id` across `previous_response_id` chains in `/v1/responses` ([#10059](https://github.com/NousResearch/hermes-agent/pull/10059))
---
## 🖥️ New Ink-based TUI
A full React/Ink rewrite of the interactive CLI — invoked via `hermes --tui` or `HERMES_TUI=1`. Shipped across ~310 commits to `ui-tui/` and `tui_gateway/`.
### TUI Foundations
- New TUI based on Ink + Python JSON-RPC backend
- Prettier + ESLint + vitest tooling for `ui-tui/`
- Entry split between `src/entry.tsx` (TTY gate) and `src/app.tsx` (state machine)
- Persistent `_SlashWorker` subprocess for slash command dispatch
### UX & Features
- **Stable picker keys, /clear confirm, light-theme preset** ([#12312](https://github.com/NousResearch/hermes-agent/pull/12312), @OutThisLife)
- **Git branch in status bar** cwd label ([#12305](https://github.com/NousResearch/hermes-agent/pull/12305), @OutThisLife)
- **Per-turn elapsed stopwatch in FaceTicker + done-in sys line** ([#13105](https://github.com/NousResearch/hermes-agent/pull/13105), @OutThisLife)
- **Subagent spawn observability overlay** ([#14045](https://github.com/NousResearch/hermes-agent/pull/14045), @OutThisLife)
- **Per-prompt elapsed stopwatch in status bar** ([#12948](https://github.com/NousResearch/hermes-agent/pull/12948))
- Sticky composer that freezes during scroll
- OSC-52 clipboard support for copy across SSH sessions
- Virtualized history rendering for performance
- Slash command autocomplete via `complete.slash` RPC
- Path autocomplete via `complete.path` RPC
- Dozens of resize/ghosting/sticky-prompt fixes landed through the week
### Structural Refactors
- Decomposed `app.tsx` into `app/event-handler`, `app/slash-handler`, `app/stores`, `app/hooks` ([#14640](https://github.com/NousResearch/hermes-agent/pull/14640) and surrounding)
- Component split: `branding.tsx`, `markdown.tsx`, `prompts.tsx`, `sessionPicker.tsx`, `messageLine.tsx`, `thinking.tsx`, `maskedPrompt.tsx`
- Hook split: `useCompletion`, `useInputHistory`, `useQueue`, `useVirtualHistory`
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **QQBot (17th platform)** — QQ Official API v2 adapter with QR setup, streaming, package split ([#9364](https://github.com/NousResearch/hermes-agent/pull/9364), [#11831](https://github.com/NousResearch/hermes-agent/pull/11831))
### Telegram
- **Dedicated `TELEGRAM_PROXY` env var + config.yaml proxy support** (closes #9414, #6530, #9074, #7786) ([#10681](https://github.com/NousResearch/hermes-agent/pull/10681))
- **`ignored_threads` config** for Telegram groups ([#9530](https://github.com/NousResearch/hermes-agent/pull/9530))
- **Config option to disable link previews** (closes #8728) ([#10610](https://github.com/NousResearch/hermes-agent/pull/10610))
- **Auto-wrap markdown tables** in code blocks ([#11794](https://github.com/NousResearch/hermes-agent/pull/11794))
- Fix: prevent duplicate replies when stream task is cancelled ([#9319](https://github.com/NousResearch/hermes-agent/pull/9319))
- Fix: prevent streaming cursor (▉) from appearing as standalone messages ([#9538](https://github.com/NousResearch/hermes-agent/pull/9538))
- Fix: retry transient tool sends + cold-boot budget ([#10947](https://github.com/NousResearch/hermes-agent/pull/10947))
- Fix: Markdown special char escaping in `send_exec_approval`
- Fix: parentheses in URLs during MarkdownV2 link conversion
- Fix: Unicode dash normalization in model switch (closes iOS smart-punctuation issue)
- Many platform hint / streaming / session-key fixes
### Discord
- **Forum channel support** (salvage of #10145 + media + polish) ([#11920](https://github.com/NousResearch/hermes-agent/pull/11920))
- **`DISCORD_ALLOWED_ROLES`** for role-based access control ([#11608](https://github.com/NousResearch/hermes-agent/pull/11608))
- **Config option to disable slash commands** (salvage #13130) ([#14315](https://github.com/NousResearch/hermes-agent/pull/14315))
- **Native `send_animation`** for inline GIF playback ([#10283](https://github.com/NousResearch/hermes-agent/pull/10283))
- **`send_message` Discord media attachments** ([#10246](https://github.com/NousResearch/hermes-agent/pull/10246))
- **`/skill` command group** with category subcommands ([#9909](https://github.com/NousResearch/hermes-agent/pull/9909))
- **Extract reply text from message references** ([#9781](https://github.com/NousResearch/hermes-agent/pull/9781))
### Feishu
- **Intelligent reply on document comments** with 3-tier access control ([#11898](https://github.com/NousResearch/hermes-agent/pull/11898))
- **Show processing state via reactions** on user messages ([#12927](https://github.com/NousResearch/hermes-agent/pull/12927))
- **Preserve @mention context for agent consumption** (salvage #13874) ([#14167](https://github.com/NousResearch/hermes-agent/pull/14167))
### DingTalk
- **`require_mention` + `allowed_users` gating** (parity with Slack/Telegram/Discord) ([#11564](https://github.com/NousResearch/hermes-agent/pull/11564))
- **QR-code device-flow authorization** for setup wizard ([#11574](https://github.com/NousResearch/hermes-agent/pull/11574))
- **AI Cards streaming, emoji reactions, and media handling** (salvage of #10985) ([#11910](https://github.com/NousResearch/hermes-agent/pull/11910))
### WhatsApp
- **`send_voice`** — native audio message delivery ([#13002](https://github.com/NousResearch/hermes-agent/pull/13002))
- **`dm_policy` and `group_policy`** parity with WeCom/Weixin/QQ adapters ([#13151](https://github.com/NousResearch/hermes-agent/pull/13151))
### WeCom / Weixin
- **WeCom QR-scan bot creation + interactive setup wizard** (salvage #13923) ([#13961](https://github.com/NousResearch/hermes-agent/pull/13961))
### Signal
- **Media delivery support** via `send_message` ([#13178](https://github.com/NousResearch/hermes-agent/pull/13178))
### Slack
- **Per-thread sessions for DMs by default** ([#10987](https://github.com/NousResearch/hermes-agent/pull/10987))
### BlueBubbles (iMessage)
- Group chat session separation, webhook registration & auth fixes ([#9806](https://github.com/NousResearch/hermes-agent/pull/9806))
### Gateway Core
- **Gateway proxy mode** — forward messages to a remote API server ([#9787](https://github.com/NousResearch/hermes-agent/pull/9787))
- **Per-channel ephemeral prompts** (Discord, Telegram, Slack, Mattermost) ([#10564](https://github.com/NousResearch/hermes-agent/pull/10564))
- **Surface plugin slash commands** natively on all platforms + decision-capable command hook ([#14175](https://github.com/NousResearch/hermes-agent/pull/14175))
- **Support document/archive extensions in MEDIA: tag extraction** (salvage #8255) ([#14307](https://github.com/NousResearch/hermes-agent/pull/14307))
- **Recognize `.pdf` in MEDIA: tag extraction** ([#13683](https://github.com/NousResearch/hermes-agent/pull/13683))
- **`--all` flag for `gateway start` and `restart`** ([#10043](https://github.com/NousResearch/hermes-agent/pull/10043))
- **Notify active sessions on gateway shutdown** + update health check ([#9850](https://github.com/NousResearch/hermes-agent/pull/9850))
- **Block agent from self-destructing the gateway** via terminal (closes #6666) ([#9895](https://github.com/NousResearch/hermes-agent/pull/9895))
- Fix: suppress duplicate replies on interrupt and streaming flood control ([#10235](https://github.com/NousResearch/hermes-agent/pull/10235))
- Fix: close temporary agents after one-off tasks ([#11028](https://github.com/NousResearch/hermes-agent/pull/11028), @kshitijk4poor)
- Fix: busy-session ack when user messages during active agent run ([#10068](https://github.com/NousResearch/hermes-agent/pull/10068))
- Fix: route watch-pattern notifications to the originating session ([#10460](https://github.com/NousResearch/hermes-agent/pull/10460))
- Fix: preserve notify context in executor threads ([#10921](https://github.com/NousResearch/hermes-agent/pull/10921), @kshitijk4poor)
- Fix: avoid duplicate replies after interrupted long tasks ([#11018](https://github.com/NousResearch/hermes-agent/pull/11018))
- Fix: unlink stale PID + lock files on cleanup
- Fix: force-unlink stale PID file after `--replace` takeover
---
## 🔧 Tool System
### Plugin Surface (major expansion)
- **`register_command()`** — plugins can now add slash commands ([#10626](https://github.com/NousResearch/hermes-agent/pull/10626))
- **`dispatch_tool()`** — plugins can invoke tools from their code ([#10763](https://github.com/NousResearch/hermes-agent/pull/10763))
- **`pre_tool_call` blocking** — plugins can veto tool execution ([#9377](https://github.com/NousResearch/hermes-agent/pull/9377))
- **`transform_tool_result`** — plugins rewrite tool results generically ([#12972](https://github.com/NousResearch/hermes-agent/pull/12972))
- **`transform_terminal_output`** — plugins rewrite terminal tool output ([#12929](https://github.com/NousResearch/hermes-agent/pull/12929))
- **Namespaced skill registration** for plugin skill bundles ([#9786](https://github.com/NousResearch/hermes-agent/pull/9786))
- **Opt-in-by-default + bundled disk-cleanup plugin** (salvage #12212) ([#12944](https://github.com/NousResearch/hermes-agent/pull/12944))
- **Pluggable `image_gen` backends + OpenAI provider** ([#13799](https://github.com/NousResearch/hermes-agent/pull/13799))
- **`openai-codex` image_gen plugin** (gpt-image-2 via Codex OAuth) ([#14317](https://github.com/NousResearch/hermes-agent/pull/14317))
- **Shell hooks** — wire shell scripts as hook callbacks ([#13296](https://github.com/NousResearch/hermes-agent/pull/13296))
### Browser
- **`browser_cdp` raw DevTools Protocol passthrough** ([#12369](https://github.com/NousResearch/hermes-agent/pull/12369))
- Camofox hardening + connection stability across the window
### Execute Code
- **Project/strict execution modes** (default: project) ([#11971](https://github.com/NousResearch/hermes-agent/pull/11971))
### Image Generation
- **Multi-model FAL support** with picker in `hermes tools` ([#11265](https://github.com/NousResearch/hermes-agent/pull/11265))
- **Recraft V3 → V4 Pro, Nano Banana → Pro upgrades** ([#11406](https://github.com/NousResearch/hermes-agent/pull/11406))
- **GPT Image 2** in FAL catalog ([#13677](https://github.com/NousResearch/hermes-agent/pull/13677))
- **xAI image generation provider** (grok-imagine-image) ([#14765](https://github.com/NousResearch/hermes-agent/pull/14765))
### TTS / STT / Voice
- **Google Gemini TTS provider** ([#11229](https://github.com/NousResearch/hermes-agent/pull/11229))
- **xAI Grok STT provider** ([#14473](https://github.com/NousResearch/hermes-agent/pull/14473))
- **xAI TTS** (shipped with Responses API upgrade) ([#10783](https://github.com/NousResearch/hermes-agent/pull/10783))
- **KittenTTS local provider** (salvage of #2109) ([#13395](https://github.com/NousResearch/hermes-agent/pull/13395))
- **CLI record beep toggle** ([#13247](https://github.com/NousResearch/hermes-agent/pull/13247), @helix4u)
### Webhook / Cron
- **Webhook direct-delivery mode** — zero-LLM push notifications ([#12473](https://github.com/NousResearch/hermes-agent/pull/12473))
- **Cron `wakeAgent` gate** — scripts can skip the agent entirely ([#12373](https://github.com/NousResearch/hermes-agent/pull/12373))
- **Cron per-job `enabled_toolsets`** — cap token overhead + cost per job ([#14767](https://github.com/NousResearch/hermes-agent/pull/14767))
### Delegate
- **Orchestrator role** + configurable spawn depth (default flat) ([#13691](https://github.com/NousResearch/hermes-agent/pull/13691))
- **Cross-agent file state coordination** ([#13718](https://github.com/NousResearch/hermes-agent/pull/13718))
### File / Patch
- **`patch` — "did you mean?" feedback** when patch fails to match ([#13435](https://github.com/NousResearch/hermes-agent/pull/13435))
### API Server
- **Stream `/v1/responses` SSE tool events** (salvage #9779) ([#10049](https://github.com/NousResearch/hermes-agent/pull/10049))
- **Inline image inputs** on `/v1/chat/completions` and `/v1/responses` ([#12969](https://github.com/NousResearch/hermes-agent/pull/12969))
### Docker / Podman
- **Entry-level Podman support** — `find_docker()` + rootless entrypoint ([#10066](https://github.com/NousResearch/hermes-agent/pull/10066))
- **Add docker-cli to Docker image** (salvage #10096) ([#14232](https://github.com/NousResearch/hermes-agent/pull/14232))
- **File-sync back to host on teardown** (salvage of #8189 + hardening) ([#11291](https://github.com/NousResearch/hermes-agent/pull/11291))
### MCP
- 12 MCP improvements across the window (status, timeout handling, tool-call forwarding, etc.)
---
## 🧩 Skills Ecosystem
### Skill System
- **Namespaced skill registration** for plugin bundles ([#9786](https://github.com/NousResearch/hermes-agent/pull/9786))
- **`hermes skills reset`** to un-stick bundled skills ([#11468](https://github.com/NousResearch/hermes-agent/pull/11468))
- **Skills guard opt-in** — `config.skills.guard_agent_created` (default off) ([#14557](https://github.com/NousResearch/hermes-agent/pull/14557))
- **Bundled skill scripts runnable out of the box** ([#13384](https://github.com/NousResearch/hermes-agent/pull/13384))
- **`xitter` replaced with `xurl`** — the official X API CLI ([#12303](https://github.com/NousResearch/hermes-agent/pull/12303))
- **MiniMax-AI/cli as default skill tap** (salvage #7501) ([#14493](https://github.com/NousResearch/hermes-agent/pull/14493))
- **Fuzzy `@` file completions + mtime sorting** ([#9467](https://github.com/NousResearch/hermes-agent/pull/9467))
### New Skills
- **concept-diagrams** (salvage of #11045, @v1k22) ([#11363](https://github.com/NousResearch/hermes-agent/pull/11363))
- **architecture-diagram** (Cocoon AI port) ([#9906](https://github.com/NousResearch/hermes-agent/pull/9906))
- **pixel-art** with hardware palettes and video animation ([#12663](https://github.com/NousResearch/hermes-agent/pull/12663), [#12725](https://github.com/NousResearch/hermes-agent/pull/12725))
- **baoyu-comic** ([#13257](https://github.com/NousResearch/hermes-agent/pull/13257), @JimLiu)
- **baoyu-infographic** — 21 layouts × 21 styles (salvage #9901) ([#12254](https://github.com/NousResearch/hermes-agent/pull/12254))
- **page-agent** — embed Alibaba's in-page GUI agent in your webapp ([#13976](https://github.com/NousResearch/hermes-agent/pull/13976))
- **fitness-nutrition** optional skill + optional env var support ([#9355](https://github.com/NousResearch/hermes-agent/pull/9355))
- **drug-discovery** — ChEMBL, PubChem, OpenFDA, ADMET ([#9443](https://github.com/NousResearch/hermes-agent/pull/9443))
- **touchdesigner-mcp** (salvage of #10081) ([#12298](https://github.com/NousResearch/hermes-agent/pull/12298))
- **adversarial-ux-test** optional skill (salvage of #2494, @omnissiah-comelse) ([#13425](https://github.com/NousResearch/hermes-agent/pull/13425))
- **maps** — added `guest_house`, `camp_site`, and dual-key bakery lookup ([#13398](https://github.com/NousResearch/hermes-agent/pull/13398))
- **llm-wiki** — port provenance markers, source hashing, and quality signals ([#13700](https://github.com/NousResearch/hermes-agent/pull/13700))
---
## 📊 Web Dashboard
- **i18n (English + Chinese) language switcher** ([#9453](https://github.com/NousResearch/hermes-agent/pull/9453))
- **Live-switching theme system** ([#10687](https://github.com/NousResearch/hermes-agent/pull/10687))
- **Dashboard plugin system** — extend the web UI with custom tabs ([#10951](https://github.com/NousResearch/hermes-agent/pull/10951))
- **react-router, sidebar layout, sticky header, dropdown component** ([#9370](https://github.com/NousResearch/hermes-agent/pull/9370), @austinpickett)
- **Responsive for mobile** ([#9228](https://github.com/NousResearch/hermes-agent/pull/9228), @DeployFaith)
- **Vercel deployment** ([#10686](https://github.com/NousResearch/hermes-agent/pull/10686), [#11061](https://github.com/NousResearch/hermes-agent/pull/11061), @austinpickett)
- **Context window config support** ([#9357](https://github.com/NousResearch/hermes-agent/pull/9357))
- **HTTP health probe for cross-container gateway detection** ([#9894](https://github.com/NousResearch/hermes-agent/pull/9894))
- **Update + restart gateway buttons** ([#13526](https://github.com/NousResearch/hermes-agent/pull/13526), @austinpickett)
- **Real API call count per session** (salvages #10140) ([#14004](https://github.com/NousResearch/hermes-agent/pull/14004))
---
## 🖱️ CLI & User Experience
- **Dynamic shell completion for bash, zsh, and fish** ([#9785](https://github.com/NousResearch/hermes-agent/pull/9785))
- **Light-mode skins + skin-aware completion menus** ([#9461](https://github.com/NousResearch/hermes-agent/pull/9461))
- **Numbered keyboard shortcuts** on approval and clarify prompts ([#13416](https://github.com/NousResearch/hermes-agent/pull/13416))
- **Markdown stripping, compact multiline previews, external editor** ([#12934](https://github.com/NousResearch/hermes-agent/pull/12934))
- **`--ignore-user-config` and `--ignore-rules` flags** (port codex#18646) ([#14277](https://github.com/NousResearch/hermes-agent/pull/14277))
- **Account limits section in `/usage`** ([#13428](https://github.com/NousResearch/hermes-agent/pull/13428))
- **Doctor: Command Installation check** for `hermes` bin symlink ([#10112](https://github.com/NousResearch/hermes-agent/pull/10112))
- **ESC cancels secret/sudo prompts**, clearer skip messaging ([#9902](https://github.com/NousResearch/hermes-agent/pull/9902))
- Fix: agent-facing text uses `display_hermes_home()` instead of hardcoded `~/.hermes` ([#10285](https://github.com/NousResearch/hermes-agent/pull/10285))
- Fix: enforce `config.yaml` as sole CWD source + deprecate `.env` CWD vars + add `hermes memory reset` ([#11029](https://github.com/NousResearch/hermes-agent/pull/11029))
---
## 🔒 Security & Reliability
- **Global toggle to allow private/internal URL resolution** ([#14166](https://github.com/NousResearch/hermes-agent/pull/14166))
- **Block agent from self-destructing the gateway** via terminal (closes #6666) ([#9895](https://github.com/NousResearch/hermes-agent/pull/9895))
- **Telegram callback authorization** on update prompts ([#10536](https://github.com/NousResearch/hermes-agent/pull/10536))
- **SECURITY.md** added ([#10532](https://github.com/NousResearch/hermes-agent/pull/10532), @I3eg1nner)
- **Warn about legacy hermes.service units** during `hermes update` ([#11918](https://github.com/NousResearch/hermes-agent/pull/11918))
- **Complete ASCII-locale UnicodeEncodeError recovery** for `api_messages`/`reasoning_content` (closes #6843) ([#10537](https://github.com/NousResearch/hermes-agent/pull/10537))
- **Prevent stale `os.environ` leak** after `clear_session_vars` ([#10527](https://github.com/NousResearch/hermes-agent/pull/10527))
- **Prevent agent hang when backgrounding processes** via terminal tool ([#10584](https://github.com/NousResearch/hermes-agent/pull/10584))
- Many smaller session-resume, interrupt, streaming, and memory-race fixes throughout the window
---
## 🐛 Notable Bug Fixes
The `fix:` category in this window covers 482 PRs. Highlights:
- Streaming cursor artifacts filtered from Matrix, Telegram, WhatsApp, Discord (multiple PRs)
- `<think>` and `<thought>` blocks filtered from gateway stream consumers ([#9408](https://github.com/NousResearch/hermes-agent/pull/9408))
- Gateway display.streaming root-config override regression ([#9799](https://github.com/NousResearch/hermes-agent/pull/9799))
- Context `session_search` coerces limit to int (prevents TypeError) ([#10522](https://github.com/NousResearch/hermes-agent/pull/10522))
- Memory tool stays available when `fcntl` is unavailable (Windows) ([#9783](https://github.com/NousResearch/hermes-agent/pull/9783))
- Trajectory compressor credentials load from `HERMES_HOME/.env` ([#9632](https://github.com/NousResearch/hermes-agent/pull/9632), @Dusk1e)
- `@_context_completions` no longer crashes on `@` mention ([#9683](https://github.com/NousResearch/hermes-agent/pull/9683), @kshitijk4poor)
- Group session `user_id` no longer treated as `thread_id` in shutdown notifications ([#10546](https://github.com/NousResearch/hermes-agent/pull/10546))
- Telegram `platform_hint` — markdown is supported (closes #8261) ([#10612](https://github.com/NousResearch/hermes-agent/pull/10612))
- Doctor checks for Kimi China credentials fixed
- Streaming: don't suppress final response when commentary message is sent ([#10540](https://github.com/NousResearch/hermes-agent/pull/10540))
- Rapid Telegram follow-ups no longer get cut off
---
## 🧪 Testing & CI
- **Contributor attribution CI check** on PRs ([#9376](https://github.com/NousResearch/hermes-agent/pull/9376))
- Hermetic test parity (`scripts/run_tests.sh`) held across this window
- Test count stabilized post-Transport refactor; CI matrix held green through the transport rollout
---
## 📚 Documentation
- Atropos + wandb links in user guide
- ACP / VS Code / Zed / JetBrains integration docs refresh
- Webhook subscription docs updated for direct-delivery mode
- Plugin author guide expanded for new hooks (`register_command`, `dispatch_tool`, `transform_tool_result`)
- Transport layer developer guide added
- Website removed Discussions link from README
---
## 👥 Contributors
### Core
- **@teknium1** (Teknium)
### Top Community Contributors (by merged PR count)
- **@kshitijk4poor** — 49 PRs · Transport refactor (AnthropicTransport, ResponsesApiTransport), Step Plan provider, Xiaomi MiMo v2.5 support, numerous gateway fixes, promoted Kimi K2.5, @ mention crash fix
- **@OutThisLife** (Brooklyn) — 31 PRs · TUI polish, git branch in status bar, per-turn stopwatch, stable picker keys, `/clear` confirm, light-theme preset, subagent spawn observability overlay
- **@helix4u** — 11 PRs · Voice CLI record beep, MCP tool interrupt handling, assorted stability fixes
- **@austinpickett** — 8 PRs · Dashboard react-router + sidebar + sticky header + dropdown, Vercel deployment, update + restart buttons
- **@alt-glitch** — 8 PRs · PLATFORM_HINTS for Matrix/Mattermost/Feishu, Matrix fixes
- **@ethernet8023** — 3 PRs
- **@benbarclay** — 3 PRs
- **@Aslaaen** — 2 PRs
### Also contributing
@jerilynzheng (ai-gateway pricing), @JimLiu (baoyu-comic skill), @Dusk1e (trajectory compressor credentials), @DeployFaith (mobile-responsive dashboard), @LeonSGP43, @v1k22 (concept-diagrams), @omnissiah-comelse (adversarial-ux-test), @coekfung (Telegram MarkdownV2 expandable blockquotes), @liftaris (TUI provider resolution), @arihantsethia (skill analytics dashboard), @topcheer + @xing8star (QQBot foundation), @kovyrin, @I3eg1nner (SECURITY.md), @PeterBerthelsen, @lengxii, @priveperfumes, @sjz-ks, @cuyua9, @Disaster-Terminator, @leozeli, @LehaoLin, @trevthefoolish, @loongfay, @MrNiceRicee, @WideLee, @bluefishs, @malaiwah, @bobashopcashier, @dsocolobsky, @iamagenius00, @IAvecilla, @aniruddhaadak80, @Es1la, @asheriif, @walli, @jquesnelle (original Tool Gateway work).
### All Contributors (alphabetical)
@0xyg3n, @10ishq, @A-afflatus, @Abnertheforeman, @admin28980, @adybag14-cyber, @akhater, @alexzhu0,
@AllardQuek, @alt-glitch, @aniruddhaadak80, @anna-oake, @anniesurla, @anthhub, @areu01or00, @arihantsethia,
@arthurbr11, @asheriif, @Aslaaen, @Asunfly, @austinpickett, @AviArora02-commits, @AxDSan, @azhengbot, @Bartok9,
@benbarclay, @bennytimz, @bernylinville, @bingo906, @binhnt92, @bkadish, @bluefishs, @bobashopcashier,
@brantzh6, @BrennerSpear, @brianclemens, @briandevans, @brooklynnicholson, @bugkill3r, @buray, @burtenshaw,
@cdanis, @cgarwood82, @ChimingLiu, @chongweiliu, @christopherwoodall, @coekfung, @cola-runner, @corazzione,
@counterposition, @cresslank, @cuyua9, @cypres0099, @danieldoderlein, @davetist, @davidvv, @DeployFaith,
@Dev-Mriganka, @devorun, @dieutx, @Disaster-Terminator, @dodo-reach, @draix, @DrStrangerUJN, @dsocolobsky,
@Dusk1e, @dyxushuai, @elkimek, @elmatadorgh, @emozilla, @entropidelic, @Erosika, @erosika, @Es1la, @etcircle,
@etherman-os, @ethernet8023, @fancydirty, @farion1231, @fatinghenji, @Fatty911, @fengtianyu88, @Feranmi10,
@flobo3, @francip, @fuleinist, @g-guthrie, @GenKoKo, @gianfrancopiana, @gnanam1990, @GuyCui, @haileymarshall,
@haimu0x, @handsdiff, @hansnow, @hedgeho9X, @helix4u, @hengm3467, @HenkDz, @heykb, @hharry11, @HiddenPuppy,
@honghua, @houko, @houziershi, @hsy5571616, @huangke19, @hxp-plus, @Hypn0sis, @I3eg1nner, @iacker,
@iamagenius00, @IAvecilla, @iborazzi, @Ifkellx, @ifrederico, @imink, @isaachuangGMICLOUD, @ismell0992-afk,
@j0sephz, @Jaaneek, @jackjin1997, @JackTheGit, @jaffarkeikei, @jerilynzheng, @JiaDe-Wu, @Jiawen-lee, @JimLiu,
@jinzheng8115, @jneeee, @jplew, @jquesnelle, @Julientalbot, @Junass1, @jvcl, @kagura-agent, @keifergu,
@kevinskysunny, @keyuyuan, @konsisumer, @kovyrin, @kshitijk4poor, @leeyang1990, @LehaoLin, @lengxii,
@LeonSGP43, @leozeli, @li0near, @liftaris, @Lind3ey, @Linux2010, @liujinkun2025, @LLQWQ, @Llugaes, @lmoncany,
@longsizhuo, @lrawnsley, @Lubrsy706, @lumenradley, @luyao618, @lvnilesh, @LVT382009, @m0n5t3r, @Magaav,
@MagicRay1217, @malaiwah, @manuelschipper, @Marvae, @MassiveMassimo, @mavrickdeveloper, @maxchernin, @memosr,
@meng93, @mengjian-github, @MestreY0d4-Uninter, @Mibayy, @MikeFac, @mikewaters, @milkoor, @minorgod,
@MrNiceRicee, @ms-alan, @mvanhorn, @n-WN, @N0nb0at, @Nan93, @NIDNASSER-Abdelmajid, @nish3451, @niyoh120,
@nocoo, @nosleepcassette, @NousResearch, @ogzerber, @omnissiah-comelse, @Only-Code-A, @opriz, @OwenYWT, @pedh,
@pefontana, @PeterBerthelsen, @phpoh, @pinion05, @plgonzalezrx8, @pradeep7127, @priveperfumes,
@projectadmin-dev, @PStarH, @rnijhara, @Roy-oss1, @roytian1217, @RucchiZ, @Ruzzgar, @RyanLee-Dev, @Salt-555,
@Sanjays2402, @sgaofen, @sharziki, @shenuu, @shin4, @SHL0MS, @shushuzn, @sicnuyudidi, @simon-gtcl,
@simon-marcus, @sirEven, @Sisyphus, @sjz-ks, @snreynolds, @Societus, @Somme4096, @sontianye, @sprmn24,
@StefanIsMe, @stephenschoettler, @Swift42, @taeng0204, @taeuk178, @tannerfokkens-maker, @TaroballzChen,
@ten-ltw, @teyrebaz33, @Tianworld, @topcheer, @Tranquil-Flow, @trevthefoolish, @TroyMitchell911, @UNLINEARITY,
@v1k22, @vivganes, @vominh1919, @vrinek, @VTRiot, @WadydX, @walli, @wenhao7, @WhiteWorld, @WideLee, @wujhsu,
@WuTianyi123, @Wysie, @xandersbell, @xiaoqiang243, @xiayh0107, @xinpengdr, @Xowiek, @ycbai, @yeyitech, @ygd58,
@youngDoo, @yudaiyan, @Yukipukii1, @yule975, @yyq4193, @yzx9, @ZaynJarvis, @zhang9w0v5, @zhanggttry,
@zhangxicen, @zhongyueming1121, @zhouxiaoya12, @zons-zhaozhy
Also: @maelrx, @Marco Rutsch, @MaxsolcuCrypto, @Mind-Dragon, @Paul Bergeron, @say8hi, @whitehatjr1001.
---
**Full Changelog**: [v2026.4.13...v2026.4.23](https://github.com/NousResearch/hermes-agent/compare/v2026.4.13...v2026.4.23)
+142 -111
View File
@@ -17,7 +17,6 @@ import os
from pathlib import Path
from hermes_constants import get_hermes_home
from types import SimpleNamespace
from typing import Any, Dict, List, Optional, Tuple
from utils import normalize_proxy_env_vars
@@ -117,6 +116,63 @@ def _get_anthropic_max_output(model: str) -> int:
return best_val
def _resolve_positive_anthropic_max_tokens(value) -> Optional[int]:
"""Return ``value`` floored to a positive int, or ``None`` if it is not a
finite positive number. Ported from openclaw/openclaw#66664.
Anthropic's Messages API rejects ``max_tokens`` values that are 0,
negative, non-integer, or non-finite with HTTP 400. Python's ``or``
idiom (``max_tokens or fallback``) correctly catches ``0`` but lets
negative ints and fractional floats (``-1``, ``0.5``) through to the
API, producing a user-visible failure instead of a local error.
"""
# Booleans are a subclass of int — exclude explicitly so ``True`` doesn't
# silently become 1 and ``False`` doesn't become 0.
if isinstance(value, bool):
return None
if not isinstance(value, (int, float)):
return None
try:
import math
if not math.isfinite(value):
return None
except Exception:
return None
floored = int(value) # truncates toward zero for floats
return floored if floored > 0 else None
def _resolve_anthropic_messages_max_tokens(
requested,
model: str,
context_length: Optional[int] = None,
) -> int:
"""Resolve the ``max_tokens`` budget for an Anthropic Messages call.
Prefers ``requested`` when it is a positive finite number; otherwise
falls back to the model's output ceiling. Raises ``ValueError`` if no
positive budget can be resolved (should not happen with current model
table defaults, but guards against a future regression where
``_get_anthropic_max_output`` could return ``0``).
Separately, callers apply a context-window clamp — this resolver does
not, to keep the positive-value contract independent of endpoint
specifics.
Ported from openclaw/openclaw#66664 (resolveAnthropicMessagesMaxTokens).
"""
resolved = _resolve_positive_anthropic_max_tokens(requested)
if resolved is not None:
return resolved
fallback = _get_anthropic_max_output(model)
if fallback > 0:
return fallback
raise ValueError(
f"Anthropic Messages adapter requires a positive max_tokens value for "
f"model {model!r}; got {requested!r} and no model default resolved."
)
def _supports_adaptive_thinking(model: str) -> bool:
"""Return True for Claude 4.6+ models that support adaptive thinking."""
return any(v in model for v in _ADAPTIVE_THINKING_SUBSTRINGS)
@@ -266,6 +322,14 @@ def _is_third_party_anthropic_endpoint(base_url: str | None) -> bool:
return True # Any other endpoint is a third-party proxy
def _is_kimi_coding_endpoint(base_url: str | None) -> bool:
"""Return True for Kimi's /coding endpoint that requires claude-code UA."""
normalized = _normalize_base_url_text(base_url)
if not normalized:
return False
return normalized.rstrip("/").lower().startswith("https://api.kimi.com/coding")
def _requires_bearer_auth(base_url: str | None) -> bool:
"""Return True for Anthropic-compatible providers that require Bearer auth.
@@ -323,9 +387,18 @@ def build_anthropic_client(api_key: str, base_url: str = None, timeout: float =
kwargs["base_url"] = normalized_base_url
common_betas = _common_betas_for_base_url(normalized_base_url)
if _requires_bearer_auth(normalized_base_url):
if _is_kimi_coding_endpoint(base_url):
# Kimi's /coding endpoint requires User-Agent: claude-code/0.1.0
# to be recognized as a valid Coding Agent. Without it, returns 403.
# Check this BEFORE _requires_bearer_auth since both match api.kimi.com/coding.
kwargs["api_key"] = api_key
kwargs["default_headers"] = {
"User-Agent": "claude-code/0.1.0",
**( {"anthropic-beta": ",".join(common_betas)} if common_betas else {} )
}
elif _requires_bearer_auth(normalized_base_url):
# Some Anthropic-compatible providers (e.g. MiniMax) expect the API key in
# Authorization: Bearer even for regular API keys. Route those endpoints
# Authorization: Bearer *** for regular API keys. Route those endpoints
# through auth_token so the SDK sends Bearer auth instead of x-api-key.
# Check this before OAuth token shape detection because MiniMax secrets do
# not use Anthropic's sk-ant-api prefix and would otherwise be misread as
@@ -1066,6 +1139,31 @@ def convert_messages_to_anthropic(
"name": fn.get("name", ""),
"input": parsed_args,
})
# Kimi's /coding endpoint (Anthropic protocol) requires assistant
# tool-call messages to carry reasoning_content when thinking is
# enabled server-side. Preserve it as a thinking block so Kimi
# can validate the message history. See hermes-agent#13848.
#
# Accept empty string "" — _copy_reasoning_content_for_api()
# injects "" as a tier-3 fallback for Kimi tool-call messages
# that had no reasoning. Kimi requires the field to exist, even
# if empty.
#
# Prepend (not append): Anthropic protocol requires thinking
# blocks before text and tool_use blocks.
#
# Guard: only add when reasoning_details didn't already contribute
# thinking blocks. On native Anthropic, reasoning_details produces
# signed thinking blocks — adding another unsigned one from
# reasoning_content would create a duplicate (same text) that gets
# downgraded to a spurious text block on the last assistant message.
reasoning_content = m.get("reasoning_content")
_already_has_thinking = any(
isinstance(b, dict) and b.get("type") in ("thinking", "redacted_thinking")
for b in blocks
)
if isinstance(reasoning_content, str) and not _already_has_thinking:
blocks.insert(0, {"type": "thinking", "thinking": reasoning_content})
# Anthropic rejects empty assistant content
effective = blocks or content
if not effective or effective == "":
@@ -1221,6 +1319,7 @@ def convert_messages_to_anthropic(
# cache markers can interfere with signature validation.
_THINKING_TYPES = frozenset(("thinking", "redacted_thinking"))
_is_third_party = _is_third_party_anthropic_endpoint(base_url)
_is_kimi = _is_kimi_coding_endpoint(base_url)
last_assistant_idx = None
for i in range(len(result) - 1, -1, -1):
@@ -1232,7 +1331,25 @@ def convert_messages_to_anthropic(
if m.get("role") != "assistant" or not isinstance(m.get("content"), list):
continue
if _is_third_party or idx != last_assistant_idx:
if _is_kimi:
# Kimi's /coding endpoint enables thinking server-side and
# requires unsigned thinking blocks on replayed assistant
# tool-call messages. Strip signed Anthropic blocks (Kimi
# can't validate signatures) but preserve the unsigned ones
# we synthesised from reasoning_content above.
new_content = []
for b in m["content"]:
if not isinstance(b, dict) or b.get("type") not in _THINKING_TYPES:
new_content.append(b)
continue
if b.get("signature") or b.get("data"):
# Anthropic-signed block — Kimi can't validate, strip
continue
# Unsigned thinking (synthesised from reasoning_content) —
# keep it: Kimi needs it for message-history validation.
new_content.append(b)
m["content"] = new_content or [{"type": "text", "text": "(empty)"}]
elif _is_third_party or idx != last_assistant_idx:
# Third-party endpoint: strip ALL thinking blocks from every
# assistant message — signatures are Anthropic-proprietary.
# Direct Anthropic: strip from non-latest assistant messages only.
@@ -1330,7 +1447,12 @@ def build_anthropic_kwargs(
model = normalize_model_name(model, preserve_dots=preserve_dots)
# effective_max_tokens = output cap for this call (≠ total context window)
effective_max_tokens = max_tokens or _get_anthropic_max_output(model)
# Use the resolver helper so non-positive values (negative ints,
# fractional floats, NaN, non-numeric) fail locally with a clear error
# rather than 400-ing at the Anthropic API. See openclaw/openclaw#66664.
effective_max_tokens = _resolve_anthropic_messages_max_tokens(
max_tokens, model, context_length=context_length
)
# Clamp output cap to fit inside the total context window.
# Only matters for small custom endpoints where context_length < native
@@ -1409,11 +1531,25 @@ def build_anthropic_kwargs(
# MiniMax Anthropic-compat endpoints support thinking (manual mode only,
# not adaptive). Haiku does NOT support extended thinking — skip entirely.
#
# Kimi's /coding endpoint speaks the Anthropic Messages protocol but has
# its own thinking semantics: when ``thinking.enabled`` is sent, Kimi
# validates the message history and requires every prior assistant
# tool-call message to carry OpenAI-style ``reasoning_content``. The
# Anthropic path never populates that field, and
# ``convert_messages_to_anthropic`` strips all Anthropic thinking blocks
# on third-party endpoints — so the request fails with HTTP 400
# "thinking is enabled but reasoning_content is missing in assistant
# tool call message at index N". Kimi's reasoning is driven server-side
# on the /coding route, so skip Anthropic's thinking parameter entirely
# for that host. (Kimi on chat_completions enables thinking via
# extra_body in the ChatCompletionsTransport — see #13503.)
#
# On 4.7+ the `thinking.display` field defaults to "omitted", which
# silently hides reasoning text that Hermes surfaces in its CLI. We
# request "summarized" so the reasoning blocks stay populated — matching
# 4.6 behavior and preserving the activity-feed UX during long tool runs.
if reasoning_config and isinstance(reasoning_config, dict):
_is_kimi_coding = _is_kimi_coding_endpoint(base_url)
if reasoning_config and isinstance(reasoning_config, dict) and not _is_kimi_coding:
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower():
effort = str(reasoning_config.get("effort", "medium")).lower()
budget = THINKING_BUDGET.get(effort, 8000)
@@ -1462,109 +1598,4 @@ def build_anthropic_kwargs(
return kwargs
def normalize_anthropic_response(
response,
strip_tool_prefix: bool = False,
) -> Tuple[SimpleNamespace, str]:
"""Normalize Anthropic response to match the shape expected by AIAgent.
Returns (assistant_message, finish_reason) where assistant_message has
.content, .tool_calls, and .reasoning attributes.
When *strip_tool_prefix* is True, removes the ``mcp_`` prefix that was
added to tool names for OAuth Claude Code compatibility.
"""
text_parts = []
reasoning_parts = []
reasoning_details = []
tool_calls = []
for block in response.content:
if block.type == "text":
text_parts.append(block.text)
elif block.type == "thinking":
reasoning_parts.append(block.thinking)
block_dict = _to_plain_data(block)
if isinstance(block_dict, dict):
reasoning_details.append(block_dict)
elif block.type == "tool_use":
name = block.name
if strip_tool_prefix and name.startswith(_MCP_TOOL_PREFIX):
name = name[len(_MCP_TOOL_PREFIX):]
tool_calls.append(
SimpleNamespace(
id=block.id,
type="function",
function=SimpleNamespace(
name=name,
arguments=json.dumps(block.input),
),
)
)
# Map Anthropic stop_reason to OpenAI finish_reason.
# Newer stop reasons added in Claude 4.5+ / 4.7:
# - refusal: the model declined to answer (cyber safeguards, CSAM, etc.)
# - model_context_window_exceeded: hit context limit (not max_tokens)
# Both need distinct handling upstream — a refusal should surface to the
# user with a clear message, and a context-window overflow should trigger
# compression/truncation rather than be treated as normal end-of-turn.
stop_reason_map = {
"end_turn": "stop",
"tool_use": "tool_calls",
"max_tokens": "length",
"stop_sequence": "stop",
"refusal": "content_filter",
"model_context_window_exceeded": "length",
}
finish_reason = stop_reason_map.get(response.stop_reason, "stop")
return (
SimpleNamespace(
content="\n".join(text_parts) if text_parts else None,
tool_calls=tool_calls or None,
reasoning="\n\n".join(reasoning_parts) if reasoning_parts else None,
reasoning_content=None,
reasoning_details=reasoning_details or None,
),
finish_reason,
)
def normalize_anthropic_response_v2(
response,
strip_tool_prefix: bool = False,
) -> "NormalizedResponse":
"""Normalize Anthropic response to NormalizedResponse.
Wraps the existing normalize_anthropic_response() and maps its output
to the shared transport types. This allows incremental migration —
one call site at a time — without changing the original function.
"""
from agent.transports.types import NormalizedResponse, build_tool_call
assistant_msg, finish_reason = normalize_anthropic_response(response, strip_tool_prefix)
tool_calls = None
if assistant_msg.tool_calls:
tool_calls = [
build_tool_call(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
)
for tc in assistant_msg.tool_calls
]
provider_data = {}
if getattr(assistant_msg, "reasoning_details", None):
provider_data["reasoning_details"] = assistant_msg.reasoning_details
return NormalizedResponse(
content=assistant_msg.content,
tool_calls=tool_calls,
finish_reason=finish_reason,
reasoning=getattr(assistant_msg, "reasoning", None),
usage=None, # Anthropic usage is on the raw response, not the normaliser
provider_data=provider_data or None,
)
+67 -27
View File
@@ -134,6 +134,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
"stepfun": "step-3.5-flash",
"kimi-coding-cn": "kimi-k2-turbo-preview",
"minimax": "MiniMax-M2.7",
"minimax-cn": "MiniMax-M2.7",
@@ -150,7 +151,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
# differs from their main chat model, map it here. The vision auto-detect
# "exotic provider" branch checks this before falling back to the main model.
_PROVIDER_VISION_MODELS: Dict[str, str] = {
"xiaomi": "mimo-v2-omni",
"xiaomi": "mimo-v2.5",
"zai": "glm-5v-turbo",
}
@@ -182,8 +183,6 @@ auxiliary_is_nous: bool = False
# Default auxiliary models per provider
_OPENROUTER_MODEL = "google/gemini-3-flash-preview"
_NOUS_MODEL = "google/gemini-3-flash-preview"
_NOUS_FREE_TIER_VISION_MODEL = "xiaomi/mimo-v2-omni"
_NOUS_FREE_TIER_AUX_MODEL = "xiaomi/mimo-v2-pro"
_NOUS_DEFAULT_BASE_URL = "https://inference-api.nousresearch.com/v1"
_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
_AUTH_JSON_PATH = get_hermes_home() / "auth.json"
@@ -574,7 +573,8 @@ class _AnthropicCompletionsAdapter:
self._is_oauth = is_oauth
def create(self, **kwargs) -> Any:
from agent.anthropic_adapter import build_anthropic_kwargs, normalize_anthropic_response
from agent.anthropic_adapter import build_anthropic_kwargs
from agent.transports import get_transport
messages = kwargs.get("messages", [])
model = kwargs.get("model", self._model)
@@ -611,7 +611,19 @@ class _AnthropicCompletionsAdapter:
anthropic_kwargs["temperature"] = temperature
response = self._client.messages.create(**anthropic_kwargs)
assistant_message, finish_reason = normalize_anthropic_response(response)
_transport = get_transport("anthropic_messages")
_nr = _transport.normalize_response(
response, strip_tool_prefix=self._is_oauth
)
# ToolCall already duck-types as OpenAI shape (.type, .function.name,
# .function.arguments) via properties, so no wrapping needed.
assistant_message = SimpleNamespace(
content=_nr.content,
tool_calls=_nr.tool_calls,
reasoning=_nr.reasoning,
)
finish_reason = _nr.finish_reason
usage = None
if hasattr(response, "usage") and response.usage:
@@ -845,7 +857,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
return GeminiNativeClient(api_key=api_key, base_url=base_url), model
extra = {}
if base_url_host_matches(base_url, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
extra["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
@@ -871,7 +883,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
return GeminiNativeClient(api_key=api_key, base_url=base_url), model
extra = {}
if base_url_host_matches(base_url, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
extra["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
@@ -904,6 +916,19 @@ def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
def _describe_openrouter_unavailable() -> str:
"""Return a more precise OpenRouter auth failure reason for logs."""
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
if entry is None:
return "OpenRouter credential pool has no usable entries (credentials may be exhausted)"
if not _pool_runtime_api_key(entry):
return "OpenRouter credential pool entry is missing a runtime API key"
if not str(os.getenv("OPENROUTER_API_KEY") or "").strip():
return "OPENROUTER_API_KEY not set"
return "no usable OpenRouter credentials found"
def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
# Check cross-session rate limit guard before attempting Nous —
# if another session already recorded a 429, skip Nous entirely
@@ -927,22 +952,35 @@ def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
global auxiliary_is_nous
auxiliary_is_nous = True
logger.debug("Auxiliary client: Nous Portal")
if nous.get("source") == "pool":
model = "gemini-3-flash"
else:
model = _NOUS_MODEL
# Free-tier users can't use paid auxiliary models — use the free
# models instead: mimo-v2-omni for vision, mimo-v2-pro for text tasks.
# Paid accounts keep their tier-appropriate models: gemini-3-flash-preview
# for both text and vision tasks.
# Ask the Portal which model it currently recommends for this task type.
# The /api/nous/recommended-models endpoint is the authoritative source:
# it distinguishes paid vs free tier recommendations, and get_nous_recommended_aux_model
# auto-detects the caller's tier via check_nous_free_tier(). Fall back to
# _NOUS_MODEL (google/gemini-3-flash-preview) when the Portal is unreachable
# or returns a null recommendation for this task type.
model = _NOUS_MODEL
try:
from hermes_cli.models import check_nous_free_tier
if check_nous_free_tier():
model = _NOUS_FREE_TIER_VISION_MODEL if vision else _NOUS_FREE_TIER_AUX_MODEL
logger.debug("Free-tier Nous account — using %s for auxiliary/%s",
model, "vision" if vision else "text")
except Exception:
pass
from hermes_cli.models import get_nous_recommended_aux_model
recommended = get_nous_recommended_aux_model(vision=vision)
if recommended:
model = recommended
logger.debug(
"Auxiliary/%s: using Portal-recommended model %s",
"vision" if vision else "text", model,
)
else:
logger.debug(
"Auxiliary/%s: no Portal recommendation, falling back to %s",
"vision" if vision else "text", model,
)
except Exception as exc:
logger.debug(
"Auxiliary/%s: recommended-models lookup failed (%s); "
"falling back to %s",
"vision" if vision else "text", exc, model,
)
if runtime is not None:
api_key, base_url = runtime
else:
@@ -1487,7 +1525,7 @@ def _to_async_client(sync_client, model: str):
async_kwargs["default_headers"] = copilot_default_headers()
elif base_url_host_matches(sync_base_url, "api.kimi.com"):
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
async_kwargs["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
return AsyncOpenAI(**async_kwargs), model
@@ -1602,8 +1640,10 @@ def resolve_provider_client(
if provider == "openrouter":
client, default = _try_openrouter()
if client is None:
logger.warning("resolve_provider_client: openrouter requested "
"but OPENROUTER_API_KEY not set")
logger.warning(
"resolve_provider_client: openrouter requested but %s",
_describe_openrouter_unavailable(),
)
return None, None
final_model = _normalize_resolved_model(model or default, provider)
return (_to_async_client(client, final_model) if async_mode
@@ -1674,7 +1714,7 @@ def resolve_provider_client(
)
extra = {}
if base_url_host_matches(custom_base, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
extra["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
elif base_url_host_matches(custom_base, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
@@ -1781,7 +1821,7 @@ def resolve_provider_client(
# Provider-specific headers
headers = {}
if base_url_host_matches(base_url, "api.kimi.com"):
headers["User-Agent"] = "KimiCLI/1.30.0"
headers["User-Agent"] = "claude-code/0.1.0"
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
+54 -7
View File
@@ -64,6 +64,47 @@ _CHARS_PER_TOKEN = 4
_SUMMARY_FAILURE_COOLDOWN_SECONDS = 600
def _content_text_for_contains(content: Any) -> str:
"""Return a best-effort text view of message content.
Used only for substring checks when we need to know whether we've already
appended a note to a message. Keeps multimodal lists intact elsewhere.
"""
if content is None:
return ""
if isinstance(content, str):
return content
if isinstance(content, list):
parts: list[str] = []
for item in content:
if isinstance(item, str):
parts.append(item)
elif isinstance(item, dict):
text = item.get("text")
if isinstance(text, str):
parts.append(text)
return "\n".join(part for part in parts if part)
return str(content)
def _append_text_to_content(content: Any, text: str, *, prepend: bool = False) -> Any:
"""Append or prepend plain text to message content safely.
Compression sometimes needs to add a note or merge a summary into an
existing message. Message content may be plain text or a multimodal list of
blocks, so direct string concatenation is not always safe.
"""
if content is None:
return text
if isinstance(content, str):
return text + content if prepend else content + text
if isinstance(content, list):
text_block = {"type": "text", "text": text}
return [text_block, *content] if prepend else [*content, text_block]
rendered = str(content)
return text + rendered if prepend else rendered + text
def _truncate_tool_call_args_json(args: str, head_chars: int = 200) -> str:
"""Shrink long string values inside a tool-call arguments JSON blob while
preserving JSON validity.
@@ -807,7 +848,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
)
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0 # no cooldown
return self._generate_summary(turns_to_summarize) # retry immediately
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic) # retry immediately
# Transient errors (timeout, rate limit, network) — shorter cooldown
_transient_cooldown = 60
@@ -1144,10 +1185,13 @@ The user has requested that this compaction PRIORITISE preserving all informatio
for i in range(compress_start):
msg = messages[i].copy()
if i == 0 and msg.get("role") == "system":
existing = msg.get("content") or ""
existing = msg.get("content")
_compression_note = "[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work.]"
if _compression_note not in existing:
msg["content"] = existing + "\n\n" + _compression_note
if _compression_note not in _content_text_for_contains(existing):
msg["content"] = _append_text_to_content(
existing,
"\n\n" + _compression_note if isinstance(existing, str) and existing else _compression_note,
)
compressed.append(msg)
# If LLM summary failed, insert a static fallback so the model
@@ -1191,12 +1235,15 @@ The user has requested that this compaction PRIORITISE preserving all informatio
for i in range(compress_end, n_messages):
msg = messages[i].copy()
if _merge_summary_into_tail and i == compress_end:
original = msg.get("content") or ""
msg["content"] = (
merged_prefix = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---\n\n"
+ original
)
msg["content"] = _append_text_to_content(
msg.get("content"),
merged_prefix,
prepend=True,
)
_merge_summary_into_tail = False
compressed.append(msg)
+124 -10
View File
@@ -45,6 +45,7 @@ class FailoverReason(enum.Enum):
# Model
model_not_found = "model_not_found" # 404 or invalid model — fallback to different model
provider_policy_blocked = "provider_policy_blocked" # Aggregator (e.g. OpenRouter) blocked the only endpoint due to account data/privacy policy
# Request format
format_error = "format_error" # 400 bad request — abort or strip + retry
@@ -194,6 +195,29 @@ _MODEL_NOT_FOUND_PATTERNS = [
"unsupported model",
]
# OpenRouter aggregator policy-block patterns.
#
# When a user's OpenRouter account privacy setting (or a per-request
# `provider.data_collection: deny` preference) excludes the only endpoint
# serving a model, OpenRouter returns 404 with a *specific* message that is
# distinct from "model not found":
#
# "No endpoints available matching your guardrail restrictions and
# data policy. Configure: https://openrouter.ai/settings/privacy"
#
# We classify this as `provider_policy_blocked` rather than
# `model_not_found` because:
# - The model *exists* — model_not_found is misleading in logs
# - Provider fallback won't help: the account-level setting applies to
# every call on the same OpenRouter account
# - The error body already contains the fix URL, so the user gets
# actionable guidance without us rewriting the message
_PROVIDER_POLICY_BLOCKED_PATTERNS = [
"no endpoints available matching your guardrail",
"no endpoints available matching your data policy",
"no endpoints found matching your data policy",
]
# Auth patterns (non-status-code signals)
_AUTH_PATTERNS = [
"invalid api key",
@@ -220,12 +244,25 @@ _TRANSPORT_ERROR_TYPES = frozenset({
"ConnectionAbortedError", "BrokenPipeError",
"TimeoutError", "ReadError",
"ServerDisconnectedError",
# SSL/TLS transport errors — transient mid-stream handshake/record
# failures that should retry rather than surface as a stalled session.
# ssl.SSLError subclasses OSError (caught by isinstance) but we list
# the type names here so provider-wrapped SSL errors (e.g. when the
# SDK re-raises without preserving the exception chain) still classify
# as transport rather than falling through to the unknown bucket.
"SSLError", "SSLZeroReturnError", "SSLWantReadError",
"SSLWantWriteError", "SSLEOFError", "SSLSyscallError",
# OpenAI SDK errors (not subclasses of Python builtins)
"APIConnectionError",
"APITimeoutError",
})
# Server disconnect patterns (no status code, but transport-level)
# Server disconnect patterns (no status code, but transport-level).
# These are the "ambiguous" patterns — a plain connection close could be
# transient transport hiccup OR server-side context overflow rejection
# (common when the API gateway disconnects instead of returning an HTTP
# error for oversized requests). A large session + one of these patterns
# triggers the context-overflow-with-compression recovery path.
_SERVER_DISCONNECT_PATTERNS = [
"server disconnected",
"peer closed connection",
@@ -236,6 +273,40 @@ _SERVER_DISCONNECT_PATTERNS = [
"incomplete chunked read",
]
# SSL/TLS transient failure patterns — intentionally distinct from
# _SERVER_DISCONNECT_PATTERNS above.
#
# An SSL alert mid-stream is almost always a transport-layer hiccup
# (flaky network, mid-session TLS renegotiation failure, load balancer
# dropping the connection) — NOT a server-side context overflow signal.
# So we want the retry path but NOT the compression path; lumping these
# into _SERVER_DISCONNECT_PATTERNS would trigger unnecessary (and
# expensive) context compression on any large-session SSL hiccup.
#
# The OpenSSL library constructs error codes by prepending a format string
# to the uppercased alert reason; OpenSSL 3.x changed the separator
# (e.g. `SSLV3_ALERT_BAD_RECORD_MAC` → `SSL/TLS_ALERT_BAD_RECORD_MAC`),
# which silently stopped matching anything explicit. Matching on the
# stable substrings (`bad record mac`, `ssl alert`, `tls alert`, etc.)
# survives future OpenSSL format churn without code changes.
_SSL_TRANSIENT_PATTERNS = [
# Space-separated (human-readable form, Python ssl module, most SDKs)
"bad record mac",
"ssl alert",
"tls alert",
"ssl handshake failure",
"tlsv1 alert",
"sslv3 alert",
# Underscore-separated (OpenSSL error code tokens, e.g.
# `ERR_SSL_SSL/TLS_ALERT_BAD_RECORD_MAC`, `SSLV3_ALERT_BAD_RECORD_MAC`)
"bad_record_mac",
"ssl_alert",
"tls_alert",
"tls_alert_internal_error",
# Python ssl module prefix, e.g. "[SSL: BAD_RECORD_MAC]"
"[ssl:",
]
# ── Classification pipeline ─────────────────────────────────────────────
@@ -255,9 +326,10 @@ def classify_api_error(
2. HTTP status code + message-aware refinement
3. Error code classification (from body)
4. Message pattern matching (billing vs rate_limit vs context vs auth)
5. Transport error heuristics
5. SSL/TLS transient alert patterns → retry as timeout
6. Server disconnect + large session → context overflow
7. Fallback: unknown (retryable with backoff)
7. Transport error heuristics
8. Fallback: unknown (retryable with backoff)
Args:
error: The exception from the API call.
@@ -388,7 +460,18 @@ def classify_api_error(
if classified is not None:
return classified
# ── 5. Server disconnect + large session → context overflow ─────
# ── 5. SSL/TLS transient errors → retry as timeout (not compression) ──
# SSL alerts mid-stream are transport hiccups, not server-side context
# overflow signals. Classify before the disconnect check so a large
# session doesn't incorrectly trigger context compression when the real
# cause is a flaky TLS handshake. Also matches when the error is
# wrapped in a generic exception whose message string carries the SSL
# alert text but the type isn't ssl.SSLError (happens with some SDKs
# that re-raise without chaining).
if any(p in error_msg for p in _SSL_TRANSIENT_PATTERNS):
return _result(FailoverReason.timeout, retryable=True)
# ── 6. Server disconnect + large session → context overflow ─────
# Must come BEFORE generic transport error catch — a disconnect on
# a large session is more likely context overflow than a transient
# transport hiccup. Without this ordering, RemoteProtocolError
@@ -405,12 +488,12 @@ def classify_api_error(
)
return _result(FailoverReason.timeout, retryable=True)
# ── 6. Transport / timeout heuristics ───────────────────────────
# ── 7. Transport / timeout heuristics ───────────────────────────
if error_type in _TRANSPORT_ERROR_TYPES or isinstance(error, (TimeoutError, ConnectionError, OSError)):
return _result(FailoverReason.timeout, retryable=True)
# ── 7. Fallback: unknown ────────────────────────────────────────
# ── 8. Fallback: unknown ────────────────────────────────────────
return _result(FailoverReason.unknown, retryable=True)
@@ -464,17 +547,33 @@ def _classify_by_status(
return _classify_402(error_msg, result_fn)
if status_code == 404:
# OpenRouter policy-block 404 — distinct from "model not found".
# The model exists; the user's account privacy setting excludes the
# only endpoint serving it. Falling back to another provider won't
# help (same account setting applies). The error body already
# contains the fix URL, so just surface it.
if any(p in error_msg for p in _PROVIDER_POLICY_BLOCKED_PATTERNS):
return result_fn(
FailoverReason.provider_policy_blocked,
retryable=False,
should_fallback=False,
)
if any(p in error_msg for p in _MODEL_NOT_FOUND_PATTERNS):
return result_fn(
FailoverReason.model_not_found,
retryable=False,
should_fallback=True,
)
# Generic 404 — could be model or endpoint
# Generic 404 with no "model not found" signal — could be a wrong
# endpoint path (common with local llama.cpp / Ollama / vLLM when
# the URL is slightly misconfigured), a proxy routing glitch, or
# a transient backend issue. Classifying these as model_not_found
# silently falls back to a different provider and tells the model
# the model is missing, which is wrong and wastes a turn. Treat
# as unknown so the retry loop surfaces the real error instead.
return result_fn(
FailoverReason.model_not_found,
retryable=False,
should_fallback=True,
FailoverReason.unknown,
retryable=True,
)
if status_code == 413:
@@ -576,6 +675,12 @@ def _classify_400(
)
# Some providers return model-not-found as 400 instead of 404 (e.g. OpenRouter).
if any(p in error_msg for p in _PROVIDER_POLICY_BLOCKED_PATTERNS):
return result_fn(
FailoverReason.provider_policy_blocked,
retryable=False,
should_fallback=False,
)
if any(p in error_msg for p in _MODEL_NOT_FOUND_PATTERNS):
return result_fn(
FailoverReason.model_not_found,
@@ -748,6 +853,15 @@ def _classify_by_message(
should_fallback=True,
)
# Provider policy-block (aggregator-side guardrail) — check before
# model_not_found so we don't mis-label as a missing model.
if any(p in error_msg for p in _PROVIDER_POLICY_BLOCKED_PATTERNS):
return result_fn(
FailoverReason.provider_policy_blocked,
retryable=False,
should_fallback=False,
)
# Model not found patterns
if any(p in error_msg for p in _MODEL_NOT_FOUND_PATTERNS):
return result_fn(
+242
View File
@@ -0,0 +1,242 @@
"""
Image Generation Provider ABC
=============================
Defines the pluggable-backend interface for image generation. Providers register
instances via ``PluginContext.register_image_gen_provider()``; the active one
(selected via ``image_gen.provider`` in ``config.yaml``) services every
``image_generate`` tool call.
Providers live in ``<repo>/plugins/image_gen/<name>/`` (built-in, auto-loaded
as ``kind: backend``) or ``~/.hermes/plugins/image_gen/<name>/`` (user, opt-in
via ``plugins.enabled``).
Response shape
--------------
All providers return a dict that :func:`success_response` / :func:`error_response`
produce. The tool wrapper JSON-serializes it. Keys:
success bool
image str | None URL or absolute file path
model str provider-specific model identifier
prompt str echoed prompt
aspect_ratio str "landscape" | "square" | "portrait"
provider str provider name (for diagnostics)
error str only when success=False
error_type str only when success=False
"""
from __future__ import annotations
import abc
import base64
import datetime
import logging
import uuid
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
VALID_ASPECT_RATIOS: Tuple[str, ...] = ("landscape", "square", "portrait")
DEFAULT_ASPECT_RATIO = "landscape"
# ---------------------------------------------------------------------------
# ABC
# ---------------------------------------------------------------------------
class ImageGenProvider(abc.ABC):
"""Abstract base class for an image generation backend.
Subclasses must implement :meth:`generate`. Everything else has sane
defaults — override only what your provider needs.
"""
@property
@abc.abstractmethod
def name(self) -> str:
"""Stable short identifier used in ``image_gen.provider`` config.
Lowercase, no spaces. Examples: ``fal``, ``openai``, ``replicate``.
"""
@property
def display_name(self) -> str:
"""Human-readable label shown in ``hermes tools``. Defaults to ``name.title()``."""
return self.name.title()
def is_available(self) -> bool:
"""Return True when this provider can service calls.
Typically checks for a required API key. Default: True
(providers with no external dependencies are always available).
"""
return True
def list_models(self) -> List[Dict[str, Any]]:
"""Return catalog entries for ``hermes tools`` model picker.
Each entry::
{
"id": "gpt-image-1.5", # required
"display": "GPT Image 1.5", # optional; defaults to id
"speed": "~10s", # optional
"strengths": "...", # optional
"price": "$...", # optional
}
Default: empty list (provider has no user-selectable models).
"""
return []
def get_setup_schema(self) -> Dict[str, Any]:
"""Return provider metadata for the ``hermes tools`` picker.
Used by ``tools_config.py`` to inject this provider as a row in
the Image Generation provider list. Shape::
{
"name": "OpenAI", # picker label
"badge": "paid", # optional short tag
"tag": "One-line description...", # optional subtitle
"env_vars": [ # keys to prompt for
{"key": "OPENAI_API_KEY",
"prompt": "OpenAI API key",
"url": "https://platform.openai.com/api-keys"},
],
}
Default: minimal entry derived from ``display_name``. Override to
expose API key prompts and custom badges.
"""
return {
"name": self.display_name,
"badge": "",
"tag": "",
"env_vars": [],
}
def default_model(self) -> Optional[str]:
"""Return the default model id, or None if not applicable."""
models = self.list_models()
if models:
return models[0].get("id")
return None
@abc.abstractmethod
def generate(
self,
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
**kwargs: Any,
) -> Dict[str, Any]:
"""Generate an image.
Implementations should return the dict from :func:`success_response`
or :func:`error_response`. ``kwargs`` may contain forward-compat
parameters future versions of the schema will expose — implementations
should ignore unknown keys.
"""
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def resolve_aspect_ratio(value: Optional[str]) -> str:
"""Clamp an aspect_ratio value to the valid set, defaulting to landscape.
Invalid values are coerced rather than rejected so the tool surface is
forgiving of agent mistakes.
"""
if not isinstance(value, str):
return DEFAULT_ASPECT_RATIO
v = value.strip().lower()
if v in VALID_ASPECT_RATIOS:
return v
return DEFAULT_ASPECT_RATIO
def _images_cache_dir() -> Path:
"""Return ``$HERMES_HOME/cache/images/``, creating parents as needed."""
from hermes_constants import get_hermes_home
path = get_hermes_home() / "cache" / "images"
path.mkdir(parents=True, exist_ok=True)
return path
def save_b64_image(
b64_data: str,
*,
prefix: str = "image",
extension: str = "png",
) -> Path:
"""Decode base64 image data and write it under ``$HERMES_HOME/cache/images/``.
Returns the absolute :class:`Path` to the saved file.
Filename format: ``<prefix>_<YYYYMMDD_HHMMSS>_<short-uuid>.<ext>``.
"""
raw = base64.b64decode(b64_data)
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
short = uuid.uuid4().hex[:8]
path = _images_cache_dir() / f"{prefix}_{ts}_{short}.{extension}"
path.write_bytes(raw)
return path
def success_response(
*,
image: str,
model: str,
prompt: str,
aspect_ratio: str,
provider: str,
extra: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Build a uniform success response dict.
``image`` may be an HTTP URL or an absolute filesystem path (for b64
providers like OpenAI). Callers that need to pass through additional
backend-specific fields can supply ``extra``.
"""
payload: Dict[str, Any] = {
"success": True,
"image": image,
"model": model,
"prompt": prompt,
"aspect_ratio": aspect_ratio,
"provider": provider,
}
if extra:
for k, v in extra.items():
payload.setdefault(k, v)
return payload
def error_response(
*,
error: str,
error_type: str = "provider_error",
provider: str = "",
model: str = "",
prompt: str = "",
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
) -> Dict[str, Any]:
"""Build a uniform error response dict."""
return {
"success": False,
"image": None,
"error": error,
"error_type": error_type,
"model": model,
"prompt": prompt,
"aspect_ratio": aspect_ratio,
"provider": provider,
}
+120
View File
@@ -0,0 +1,120 @@
"""
Image Generation Provider Registry
==================================
Central map of registered providers. Populated by plugins at import-time via
``PluginContext.register_image_gen_provider()``; consumed by the
``image_generate`` tool to dispatch each call to the active backend.
Active selection
----------------
The active provider is chosen by ``image_gen.provider`` in ``config.yaml``.
If unset, :func:`get_active_provider` applies fallback logic:
1. If exactly one provider is registered, use it.
2. Otherwise if a provider named ``fal`` is registered, use it (legacy
default — matches pre-plugin behavior).
3. Otherwise return ``None`` (the tool surfaces a helpful error pointing
the user at ``hermes tools``).
"""
from __future__ import annotations
import logging
import threading
from typing import Dict, List, Optional
from agent.image_gen_provider import ImageGenProvider
logger = logging.getLogger(__name__)
_providers: Dict[str, ImageGenProvider] = {}
_lock = threading.Lock()
def register_provider(provider: ImageGenProvider) -> None:
"""Register an image generation provider.
Re-registration (same ``name``) overwrites the previous entry and logs
a debug message — this makes hot-reload scenarios (tests, dev loops)
behave predictably.
"""
if not isinstance(provider, ImageGenProvider):
raise TypeError(
f"register_provider() expects an ImageGenProvider instance, "
f"got {type(provider).__name__}"
)
name = provider.name
if not isinstance(name, str) or not name.strip():
raise ValueError("Image gen provider .name must be a non-empty string")
with _lock:
existing = _providers.get(name)
_providers[name] = provider
if existing is not None:
logger.debug("Image gen provider '%s' re-registered (was %r)", name, type(existing).__name__)
else:
logger.debug("Registered image gen provider '%s' (%s)", name, type(provider).__name__)
def list_providers() -> List[ImageGenProvider]:
"""Return all registered providers, sorted by name."""
with _lock:
items = list(_providers.values())
return sorted(items, key=lambda p: p.name)
def get_provider(name: str) -> Optional[ImageGenProvider]:
"""Return the provider registered under *name*, or None."""
if not isinstance(name, str):
return None
with _lock:
return _providers.get(name.strip())
def get_active_provider() -> Optional[ImageGenProvider]:
"""Resolve the currently-active provider.
Reads ``image_gen.provider`` from config.yaml; falls back per the
module docstring.
"""
configured: Optional[str] = None
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("image_gen") if isinstance(cfg, dict) else None
if isinstance(section, dict):
raw = section.get("provider")
if isinstance(raw, str) and raw.strip():
configured = raw.strip()
except Exception as exc:
logger.debug("Could not read image_gen.provider from config: %s", exc)
with _lock:
snapshot = dict(_providers)
if configured:
provider = snapshot.get(configured)
if provider is not None:
return provider
logger.debug(
"image_gen.provider='%s' configured but not registered; falling back",
configured,
)
# Fallback: single-provider case
if len(snapshot) == 1:
return next(iter(snapshot.values()))
# Fallback: prefer legacy FAL for backward compat
if "fal" in snapshot:
return snapshot["fal"]
return None
def _reset_for_tests() -> None:
"""Clear the registry. **Test-only.**"""
with _lock:
_providers.clear()
+161 -10
View File
@@ -4,6 +4,7 @@ Pure utility functions with no AIAgent dependency. Used by ContextCompressor
and run_agent.py for pre-flight context checks.
"""
import ipaddress
import logging
import re
import time
@@ -25,7 +26,7 @@ logger = logging.getLogger(__name__)
# are preserved so the full model name reaches cache lookups and server queries.
_PROVIDER_PREFIXES: frozenset[str] = frozenset({
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "minimax", "minimax-cn", "anthropic", "deepseek",
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-cn", "anthropic", "deepseek",
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
"qwen-oauth",
"xiaomi",
@@ -36,7 +37,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"glm", "z-ai", "z.ai", "zhipu", "github", "github-copilot",
"github-models", "kimi", "moonshot", "kimi-cn", "moonshot-cn", "claude", "deep-seek",
"ollama",
"opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
"stepfun", "opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
"mimo", "xiaomi-mimo",
"arcee-ai", "arceeai",
"xai", "x-ai", "x.ai", "grok",
@@ -51,6 +52,13 @@ _OLLAMA_TAG_PATTERN = re.compile(
)
# Tailscale's CGNAT range (RFC 6598). `ipaddress.is_private` excludes this
# block, so without an explicit check Ollama reached over Tailscale (e.g.
# `http://100.77.243.5:11434`) wouldn't be treated as local and its stream
# read / stale timeouts wouldn't get auto-bumped. Built once at import time.
_TAILSCALE_CGNAT = ipaddress.IPv4Network("100.64.0.0/10")
def _strip_provider_prefix(model: str) -> str:
"""Strip a recognised provider prefix from a model string.
@@ -115,6 +123,10 @@ DEFAULT_CONTEXT_LENGTHS = {
"claude": 200000,
# OpenAI — GPT-5 family (most have 400k; specific overrides first)
# Source: https://developers.openai.com/api/docs/models
# GPT-5.5 (launched Apr 23 2026). 400k is the fallback for providers we
# can't probe live. ChatGPT Codex OAuth actually caps lower (272k as of
# Apr 2026) and is resolved via _resolve_codex_oauth_context_length().
"gpt-5.5": 400000,
"gpt-5.4-nano": 400000, # 400k (not 1.05M like full 5.4)
"gpt-5.4-mini": 400000, # 400k (not 1.05M like full 5.4)
"gpt-5.4": 1050000, # GPT-5.4, GPT-5.4 Pro (1.05M context)
@@ -125,6 +137,8 @@ DEFAULT_CONTEXT_LENGTHS = {
# Google
"gemini": 1048576,
# Gemma (open models served via AI Studio)
"gemma-4": 256000, # Gemma 4 family
"gemma4": 256000, # Ollama-style naming (e.g. gemma4:31b-cloud)
"gemma-4-31b": 256000,
"gemma-3": 131072,
"gemma": 8192, # fallback for older gemma models
@@ -173,10 +187,12 @@ DEFAULT_CONTEXT_LENGTHS = {
"moonshotai/Kimi-K2.6": 262144,
"moonshotai/Kimi-K2-Thinking": 262144,
"MiniMaxAI/MiniMax-M2.5": 204800,
"XiaomiMiMo/MiMo-V2-Flash": 256000,
"mimo-v2-pro": 1000000,
"mimo-v2-omni": 256000,
"mimo-v2-flash": 256000,
"XiaomiMiMo/MiMo-V2-Flash": 262144,
"mimo-v2-pro": 1048576,
"mimo-v2.5-pro": 1048576,
"mimo-v2.5": 1048576,
"mimo-v2-omni": 262144,
"mimo-v2-flash": 262144,
"zai-org/GLM-5": 202752,
}
@@ -191,6 +207,7 @@ _CONTEXT_LENGTH_KEYS = (
"max_seq_len",
"n_ctx_train",
"n_ctx",
"ctx_size",
)
_MAX_COMPLETION_KEYS = (
@@ -234,9 +251,12 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"chatgpt.com": "openai",
"api.anthropic.com": "anthropic",
"api.z.ai": "zai",
"open.bigmodel.cn": "zai",
"api.moonshot.ai": "kimi-coding",
"api.moonshot.cn": "kimi-coding-cn",
"api.kimi.com": "kimi-coding",
"api.stepfun.ai": "stepfun",
"api.stepfun.com": "stepfun",
"api.arcee.ai": "arcee",
"api.minimax": "minimax",
"dashscope.aliyuncs.com": "alibaba",
@@ -281,7 +301,15 @@ def _is_known_provider_base_url(base_url: str) -> bool:
def is_local_endpoint(base_url: str) -> bool:
"""Return True if base_url points to a local machine (localhost / RFC-1918 / WSL)."""
"""Return True if base_url points to a local machine.
Recognises loopback (``localhost``, ``127.0.0.0/8``, ``::1``),
container-internal DNS names (``host.docker.internal`` et al.),
RFC-1918 private ranges (``10/8``, ``172.16/12``, ``192.168/16``),
link-local, and Tailscale CGNAT (``100.64.0.0/10``). Tailscale CGNAT
is included so remote-but-trusted Ollama boxes reached over a
Tailscale mesh get the same timeout auto-bumps as localhost Ollama.
"""
normalized = _normalize_base_url(base_url)
if not normalized:
return False
@@ -296,14 +324,17 @@ def is_local_endpoint(base_url: str) -> bool:
# Docker / Podman / Lima internal DNS names (e.g. host.docker.internal)
if any(host.endswith(suffix) for suffix in _CONTAINER_LOCAL_SUFFIXES):
return True
# RFC-1918 private ranges and link-local
import ipaddress
# RFC-1918 private ranges, link-local, and Tailscale CGNAT
try:
addr = ipaddress.ip_address(host)
return addr.is_private or addr.is_loopback or addr.is_link_local
if addr.is_private or addr.is_loopback or addr.is_link_local:
return True
if isinstance(addr, ipaddress.IPv4Address) and addr in _TAILSCALE_CGNAT:
return True
except ValueError:
pass
# Bare IP that looks like a private range (e.g. 172.26.x.x for WSL)
# or Tailscale CGNAT (100.64.x.x100.127.x.x).
parts = host.split(".")
if len(parts) == 4:
try:
@@ -314,6 +345,8 @@ def is_local_endpoint(base_url: str) -> bool:
return True
if first == 192 and second == 168:
return True
if first == 100 and 64 <= second <= 127:
return True
except ValueError:
pass
return False
@@ -973,6 +1006,115 @@ def _query_anthropic_context_length(model: str, base_url: str, api_key: str) ->
return None
# Known ChatGPT Codex OAuth context windows (observed via live
# chatgpt.com/backend-api/codex/models probe, Apr 2026). These are the
# `context_window` values, which are what Codex actually enforces — the
# direct OpenAI API has larger limits for the same slugs, but Codex OAuth
# caps lower (e.g. gpt-5.5 is 1.05M on the API, 272K on Codex).
#
# Used as a fallback when the live probe fails (no token, network error).
# Longest keys first so substring match picks the most specific entry.
_CODEX_OAUTH_CONTEXT_FALLBACK: Dict[str, int] = {
"gpt-5.1-codex-max": 272_000,
"gpt-5.1-codex-mini": 272_000,
"gpt-5.3-codex": 272_000,
"gpt-5.2-codex": 272_000,
"gpt-5.4-mini": 272_000,
"gpt-5.5": 272_000,
"gpt-5.4": 272_000,
"gpt-5.2": 272_000,
"gpt-5": 272_000,
}
_codex_oauth_context_cache: Dict[str, int] = {}
_codex_oauth_context_cache_time: float = 0.0
_CODEX_OAUTH_CONTEXT_CACHE_TTL = 3600 # 1 hour
def _fetch_codex_oauth_context_lengths(access_token: str) -> Dict[str, int]:
"""Probe the ChatGPT Codex /models endpoint for per-slug context windows.
Codex OAuth imposes its own context limits that differ from the direct
OpenAI API (e.g. gpt-5.5 is 1.05M on the API, 272K on Codex). The
`context_window` field in each model entry is the authoritative source.
Returns a ``{slug: context_window}`` dict. Empty on failure.
"""
global _codex_oauth_context_cache, _codex_oauth_context_cache_time
now = time.time()
if (
_codex_oauth_context_cache
and now - _codex_oauth_context_cache_time < _CODEX_OAUTH_CONTEXT_CACHE_TTL
):
return _codex_oauth_context_cache
try:
resp = requests.get(
"https://chatgpt.com/backend-api/codex/models?client_version=1.0.0",
headers={"Authorization": f"Bearer {access_token}"},
timeout=10,
)
if resp.status_code != 200:
logger.debug(
"Codex /models probe returned HTTP %s; falling back to hardcoded defaults",
resp.status_code,
)
return {}
data = resp.json()
except Exception as exc:
logger.debug("Codex /models probe failed: %s", exc)
return {}
entries = data.get("models", []) if isinstance(data, dict) else []
result: Dict[str, int] = {}
for item in entries:
if not isinstance(item, dict):
continue
slug = item.get("slug")
ctx = item.get("context_window")
if isinstance(slug, str) and isinstance(ctx, int) and ctx > 0:
result[slug.strip()] = ctx
if result:
_codex_oauth_context_cache = result
_codex_oauth_context_cache_time = now
return result
def _resolve_codex_oauth_context_length(
model: str, access_token: str = ""
) -> Optional[int]:
"""Resolve a Codex OAuth model's real context window.
Prefers a live probe of chatgpt.com/backend-api/codex/models (when we
have a bearer token), then falls back to ``_CODEX_OAUTH_CONTEXT_FALLBACK``.
"""
model_bare = _strip_provider_prefix(model).strip()
if not model_bare:
return None
if access_token:
live = _fetch_codex_oauth_context_lengths(access_token)
if model_bare in live:
return live[model_bare]
# Case-insensitive match in case casing drifts
model_lower = model_bare.lower()
for slug, ctx in live.items():
if slug.lower() == model_lower:
return ctx
# Fallback: longest-key-first substring match over hardcoded defaults.
model_lower = model_bare.lower()
for slug, ctx in sorted(
_CODEX_OAUTH_CONTEXT_FALLBACK.items(), key=lambda x: len(x[0]), reverse=True
):
if slug in model_lower:
return ctx
return None
def _resolve_nous_context_length(model: str) -> Optional[int]:
"""Resolve Nous Portal model context length via OpenRouter metadata.
@@ -1117,6 +1259,15 @@ def get_model_context_length(
ctx = _resolve_nous_context_length(model)
if ctx:
return ctx
if effective_provider == "openai-codex":
# Codex OAuth enforces lower context limits than the direct OpenAI
# API for the same slug (e.g. gpt-5.5 is 1.05M on the API but 272K
# on Codex). Authoritative source is Codex's own /models endpoint.
codex_ctx = _resolve_codex_oauth_context_length(model, access_token=api_key or "")
if codex_ctx:
if base_url:
save_context_length(model, base_url, codex_ctx)
return codex_ctx
if effective_provider:
from agent.models_dev import lookup_models_dev_context
ctx = lookup_models_dev_context(effective_provider, model)
+4
View File
@@ -146,6 +146,7 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"openai-codex": "openai",
"zai": "zai",
"kimi-coding": "kimi-for-coding",
"stepfun": "stepfun",
"kimi-coding-cn": "kimi-for-coding",
"minimax": "minimax",
"minimax-cn": "minimax-cn",
@@ -417,6 +418,9 @@ def list_provider_models(provider: str) -> List[str]:
Returns an empty list if the provider is unknown or has no data.
"""
from hermes_cli.models import normalize_provider
provider = normalize_provider(provider) or provider
models = _get_provider_models(provider)
if models is None:
return []
+190
View File
@@ -0,0 +1,190 @@
"""Helpers for translating OpenAI-style tool schemas to Moonshot's schema subset.
Moonshot (Kimi) accepts a stricter subset of JSON Schema than standard OpenAI
tool calling. Requests that violate it fail with HTTP 400:
tools.function.parameters is not a valid moonshot flavored json schema,
details: <...>
Known rejection modes documented at
https://forum.moonshot.ai/t/tool-calling-specification-violation-on-moonshot-api/102
and MoonshotAI/kimi-cli#1595:
1. Every property schema must carry a ``type``. Standard JSON Schema allows
type to be omitted (the value is then unconstrained); Moonshot refuses.
2. When ``anyOf`` is used, ``type`` must be on the ``anyOf`` children, not
the parent. Presence of both causes "type should be defined in anyOf
items instead of the parent schema".
The ``#/definitions/...`` → ``#/$defs/...`` rewrite for draft-07 refs is
handled separately in ``tools/mcp_tool._normalize_mcp_input_schema`` so it
applies at MCP registration time for all providers.
"""
from __future__ import annotations
import copy
from typing import Any, Dict, List
# Keys whose values are maps of name → schema (not schemas themselves).
# When we recurse, we walk the values of these maps as schemas, but we do
# NOT apply the missing-type repair to the map itself.
_SCHEMA_MAP_KEYS = frozenset({"properties", "patternProperties", "$defs", "definitions"})
# Keys whose values are lists of schemas.
_SCHEMA_LIST_KEYS = frozenset({"anyOf", "oneOf", "allOf", "prefixItems"})
# Keys whose values are a single nested schema.
_SCHEMA_NODE_KEYS = frozenset({"items", "contains", "not", "additionalProperties", "propertyNames"})
def _repair_schema(node: Any, is_schema: bool = True) -> Any:
"""Recursively apply Moonshot repairs to a schema node.
``is_schema=True`` means this dict is a JSON Schema node and gets the
missing-type + anyOf-parent repairs applied. ``is_schema=False`` means
it's a container map (e.g. the value of ``properties``) and we only
recurse into its values.
"""
if isinstance(node, list):
# Lists only show up under schema-list keys (anyOf/oneOf/allOf), so
# every element is itself a schema.
return [_repair_schema(item, is_schema=True) for item in node]
if not isinstance(node, dict):
return node
# Walk the dict, deciding per-key whether recursion is into a schema
# node, a container map, or a scalar.
repaired: Dict[str, Any] = {}
for key, value in node.items():
if key in _SCHEMA_MAP_KEYS and isinstance(value, dict):
# Map of name → schema. Don't treat the map itself as a schema
# (it has no type / properties of its own), but each value is.
repaired[key] = {
sub_key: _repair_schema(sub_val, is_schema=True)
for sub_key, sub_val in value.items()
}
elif key in _SCHEMA_LIST_KEYS and isinstance(value, list):
repaired[key] = [_repair_schema(v, is_schema=True) for v in value]
elif key in _SCHEMA_NODE_KEYS:
# items / not / additionalProperties: single nested schema.
# additionalProperties can also be a bool — leave those alone.
if isinstance(value, dict):
repaired[key] = _repair_schema(value, is_schema=True)
else:
repaired[key] = value
else:
# Scalars (description, title, format, enum values, etc.) pass through.
repaired[key] = value
if not is_schema:
return repaired
# Rule 2: when anyOf is present, type belongs only on the children.
if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
repaired.pop("type", None)
return repaired
# Rule 1: property schemas without type need one. $ref nodes are exempt
# — their type comes from the referenced definition.
if "$ref" in repaired:
return repaired
return _fill_missing_type(repaired)
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
"""Infer a reasonable ``type`` if this schema node has none."""
if "type" in node and node["type"] not in (None, ""):
return node
# Heuristic: presence of ``properties`` → object, ``items`` → array, ``enum``
# → type of first enum value, else fall back to ``string`` (safest scalar).
if "properties" in node or "required" in node or "additionalProperties" in node:
inferred = "object"
elif "items" in node or "prefixItems" in node:
inferred = "array"
elif "enum" in node and isinstance(node["enum"], list) and node["enum"]:
sample = node["enum"][0]
if isinstance(sample, bool):
inferred = "boolean"
elif isinstance(sample, int):
inferred = "integer"
elif isinstance(sample, float):
inferred = "number"
else:
inferred = "string"
else:
inferred = "string"
return {**node, "type": inferred}
def sanitize_moonshot_tool_parameters(parameters: Any) -> Dict[str, Any]:
"""Normalize tool parameters to a Moonshot-compatible object schema.
Returns a deep-copied schema with the two flavored-JSON-Schema repairs
applied. Input is not mutated.
"""
if not isinstance(parameters, dict):
return {"type": "object", "properties": {}}
repaired = _repair_schema(copy.deepcopy(parameters), is_schema=True)
if not isinstance(repaired, dict):
return {"type": "object", "properties": {}}
# Top-level must be an object schema
if repaired.get("type") != "object":
repaired["type"] = "object"
if "properties" not in repaired:
repaired["properties"] = {}
return repaired
def sanitize_moonshot_tools(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Apply ``sanitize_moonshot_tool_parameters`` to every tool's parameters."""
if not tools:
return tools
sanitized: List[Dict[str, Any]] = []
any_change = False
for tool in tools:
if not isinstance(tool, dict):
sanitized.append(tool)
continue
fn = tool.get("function")
if not isinstance(fn, dict):
sanitized.append(tool)
continue
params = fn.get("parameters")
repaired = sanitize_moonshot_tool_parameters(params)
if repaired is not params:
any_change = True
new_fn = {**fn, "parameters": repaired}
sanitized.append({**tool, "function": new_fn})
else:
sanitized.append(tool)
return sanitized if any_change else tools
def is_moonshot_model(model: str | None) -> bool:
"""True for any Kimi / Moonshot model slug, regardless of aggregator prefix.
Matches bare names (``kimi-k2.6``, ``moonshotai/Kimi-K2.6``) and aggregator-
prefixed slugs (``nous/moonshotai/kimi-k2.6``, ``openrouter/moonshotai/...``).
Detection by model name covers Nous / OpenRouter / other aggregators that
route to Moonshot's inference, where the base URL is the aggregator's, not
``api.moonshot.ai``.
"""
if not model:
return False
bare = model.strip().lower()
# Last path segment (covers aggregator-prefixed slugs)
tail = bare.rsplit("/", 1)[-1]
if tail.startswith("kimi-") or tail == "kimi":
return True
# Vendor-prefixed forms commonly used on aggregators
if "moonshot" in bare or "/kimi" in bare or bare.startswith("kimi"):
return True
return False
+33 -1
View File
@@ -350,7 +350,13 @@ PLATFORM_HINTS = {
),
"cli": (
"You are a CLI AI Agent. Try not to use markdown but simple text "
"renderable inside a terminal."
"renderable inside a terminal. "
"File delivery: there is no attachment channel — the user reads your "
"response directly in their terminal. Do NOT emit MEDIA:/path tags "
"(those are only intercepted on messaging platforms like Telegram, "
"Discord, Slack, etc.; on the CLI they render as literal text). "
"When referring to a file you created or changed, just state its "
"absolute path in plain text; the user can open it from there."
),
"sms": (
"You are communicating via SMS. Keep responses concise and use plain text "
@@ -364,6 +370,32 @@ PLATFORM_HINTS = {
"MEDIA:/absolute/path/to/file in your response. Images (.jpg, .png, "
".heic) appear as photos and other files arrive as attachments."
),
"mattermost": (
"You are in a Mattermost workspace communicating with your user. "
"Mattermost renders standard Markdown — headings, bold, italic, code "
"blocks, and tables all work. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.jpg, .png, .webp) are uploaded as photo "
"attachments, audio and video as file attachments. "
"Image URLs in markdown format ![alt](url) are rendered as inline previews automatically."
),
"matrix": (
"You are in a Matrix room communicating with your user. "
"Matrix renders Markdown — bold, italic, code blocks, and links work; "
"the adapter converts your Markdown to HTML for rich display. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.jpg, .png, .webp) are sent as inline photos, "
"audio (.ogg, .mp3) as voice/audio messages, video (.mp4) inline, "
"and other files as downloadable attachments."
),
"feishu": (
"You are in a Feishu (Lark) workspace communicating with your user. "
"Feishu renders Markdown in messages — bold, italic, code blocks, and "
"links are supported. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.jpg, .png, .webp) are uploaded and displayed "
"inline, audio files as voice messages, and other files as attachments."
),
"weixin": (
"You are on Weixin/WeChat. Markdown formatting is supported, so you may use it when "
"it improves readability, but keep the message compact and chat-friendly. You can send media files natively: "
+2 -2
View File
@@ -345,7 +345,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
from agent.skill_utils import get_external_skills_dirs
from agent.skill_utils import get_external_skills_dirs, iter_skill_index_files
disabled = _get_disabled_skill_names()
seen_names: set = set()
@@ -356,7 +356,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
dirs_to_scan.extend(get_external_skills_dirs())
for scan_dir in dirs_to_scan:
for skill_md in scan_dir.rglob("SKILL.md"):
for skill_md in iter_skill_index_files(scan_dir, "SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
try:
+1 -1
View File
@@ -435,7 +435,7 @@ def iter_skill_index_files(skills_dir: Path, filename: str):
Excludes ``.git``, ``.github``, ``.hub`` directories.
"""
matches = []
for root, dirs, files in os.walk(skills_dir):
for root, dirs, files in os.walk(skills_dir, followlinks=True):
dirs[:] = [d for d in dirs if d not in EXCLUDED_SKILL_DIRS]
if filename in files:
matches.append(Path(root) / filename)
+1 -1
View File
@@ -38,7 +38,7 @@ def generate_title(user_message: str, assistant_response: str, timeout: float =
response = call_llm(
task="title_generation",
messages=messages,
max_tokens=30,
max_tokens=500,
temperature=0.3,
timeout=timeout,
)
+12
View File
@@ -37,3 +37,15 @@ def _discover_transports() -> None:
import agent.transports.anthropic # noqa: F401
except ImportError:
pass
try:
import agent.transports.codex # noqa: F401
except ImportError:
pass
try:
import agent.transports.chat_completions # noqa: F401
except ImportError:
pass
try:
import agent.transports.bedrock # noqa: F401
except ImportError:
pass
+54 -6
View File
@@ -78,23 +78,71 @@ class AnthropicTransport(ProviderTransport):
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize Anthropic response to NormalizedResponse.
kwargs:
strip_tool_prefix: bool strip 'mcp_mcp_' prefixes from tool names.
Parses content blocks (text, thinking, tool_use), maps stop_reason
to OpenAI finish_reason, and collects reasoning_details in provider_data.
"""
from agent.anthropic_adapter import normalize_anthropic_response_v2
import json
from agent.anthropic_adapter import _to_plain_data
from agent.transports.types import ToolCall
strip_tool_prefix = kwargs.get("strip_tool_prefix", False)
return normalize_anthropic_response_v2(response, strip_tool_prefix=strip_tool_prefix)
_MCP_PREFIX = "mcp_"
text_parts = []
reasoning_parts = []
reasoning_details = []
tool_calls = []
for block in response.content:
if block.type == "text":
text_parts.append(block.text)
elif block.type == "thinking":
reasoning_parts.append(block.thinking)
block_dict = _to_plain_data(block)
if isinstance(block_dict, dict):
reasoning_details.append(block_dict)
elif block.type == "tool_use":
name = block.name
if strip_tool_prefix and name.startswith(_MCP_PREFIX):
name = name[len(_MCP_PREFIX):]
tool_calls.append(
ToolCall(
id=block.id,
name=name,
arguments=json.dumps(block.input),
)
)
finish_reason = self._STOP_REASON_MAP.get(response.stop_reason, "stop")
provider_data = {}
if reasoning_details:
provider_data["reasoning_details"] = reasoning_details
return NormalizedResponse(
content="\n".join(text_parts) if text_parts else None,
tool_calls=tool_calls or None,
finish_reason=finish_reason,
reasoning="\n\n".join(reasoning_parts) if reasoning_parts else None,
usage=None,
provider_data=provider_data or None,
)
def validate_response(self, response: Any) -> bool:
"""Check Anthropic response structure is valid."""
"""Check Anthropic response structure is valid.
An empty content list is legitimate when ``stop_reason == "end_turn"``
the model's canonical way of signalling "nothing more to add" after
a tool turn that already delivered the user-facing text. Treating it
as invalid falsely retries a completed response.
"""
if response is None:
return False
content_blocks = getattr(response, "content", None)
if not isinstance(content_blocks, list):
return False
if not content_blocks:
return False
return getattr(response, "stop_reason", None) == "end_turn"
return True
def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:
+154
View File
@@ -0,0 +1,154 @@
"""AWS Bedrock Converse API transport.
Delegates to the existing adapter functions in agent/bedrock_adapter.py.
Bedrock uses its own boto3 client (not the OpenAI SDK), so the transport
owns format conversion and normalization, while client construction and
boto3 calls stay on AIAgent.
"""
from typing import Any, Dict, List, Optional
from agent.transports.base import ProviderTransport
from agent.transports.types import NormalizedResponse, ToolCall, Usage
class BedrockTransport(ProviderTransport):
"""Transport for api_mode='bedrock_converse'."""
@property
def api_mode(self) -> str:
return "bedrock_converse"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> Any:
"""Convert OpenAI messages to Bedrock Converse format."""
from agent.bedrock_adapter import convert_messages_to_converse
return convert_messages_to_converse(messages)
def convert_tools(self, tools: List[Dict[str, Any]]) -> Any:
"""Convert OpenAI tool schemas to Bedrock Converse toolConfig."""
from agent.bedrock_adapter import convert_tools_to_converse
return convert_tools_to_converse(tools)
def build_kwargs(
self,
model: str,
messages: List[Dict[str, Any]],
tools: Optional[List[Dict[str, Any]]] = None,
**params,
) -> Dict[str, Any]:
"""Build Bedrock converse() kwargs.
Calls convert_messages and convert_tools internally.
params:
max_tokens: int output token limit (default 4096)
temperature: float | None
guardrail_config: dict | None Bedrock guardrails
region: str AWS region (default 'us-east-1')
"""
from agent.bedrock_adapter import build_converse_kwargs
region = params.get("region", "us-east-1")
guardrail = params.get("guardrail_config")
kwargs = build_converse_kwargs(
model=model,
messages=messages,
tools=tools,
max_tokens=params.get("max_tokens", 4096),
temperature=params.get("temperature"),
guardrail_config=guardrail,
)
# Sentinel keys for dispatch — agent pops these before the boto3 call
kwargs["__bedrock_converse__"] = True
kwargs["__bedrock_region__"] = region
return kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize Bedrock response to NormalizedResponse.
Handles two shapes:
1. Raw boto3 dict (from direct converse() calls)
2. Already-normalized SimpleNamespace with .choices (from dispatch site)
"""
from agent.bedrock_adapter import normalize_converse_response
# Normalize to OpenAI-compatible SimpleNamespace
if hasattr(response, "choices") and response.choices:
# Already normalized at dispatch site
ns = response
else:
# Raw boto3 dict
ns = normalize_converse_response(response)
choice = ns.choices[0]
msg = choice.message
finish_reason = choice.finish_reason or "stop"
tool_calls = None
if msg.tool_calls:
tool_calls = [
ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
)
for tc in msg.tool_calls
]
usage = None
if hasattr(ns, "usage") and ns.usage:
u = ns.usage
usage = Usage(
prompt_tokens=getattr(u, "prompt_tokens", 0) or 0,
completion_tokens=getattr(u, "completion_tokens", 0) or 0,
total_tokens=getattr(u, "total_tokens", 0) or 0,
)
reasoning = getattr(msg, "reasoning", None) or getattr(msg, "reasoning_content", None)
return NormalizedResponse(
content=msg.content,
tool_calls=tool_calls,
finish_reason=finish_reason,
reasoning=reasoning,
usage=usage,
)
def validate_response(self, response: Any) -> bool:
"""Check Bedrock response structure.
After normalize_converse_response, the response has OpenAI-compatible
.choices same check as chat_completions.
"""
if response is None:
return False
# Raw Bedrock dict response — check for 'output' key
if isinstance(response, dict):
return "output" in response
# Already-normalized SimpleNamespace
if hasattr(response, "choices"):
return bool(response.choices)
return False
def map_finish_reason(self, raw_reason: str) -> str:
"""Map Bedrock stop reason to OpenAI finish_reason.
The adapter already does this mapping inside normalize_converse_response,
so this is only used for direct access to raw responses.
"""
_MAP = {
"end_turn": "stop",
"tool_use": "tool_calls",
"max_tokens": "length",
"stop_sequence": "stop",
"guardrail_intervened": "content_filter",
"content_filtered": "content_filter",
}
return _MAP.get(raw_reason, "stop")
# Auto-register on import
from agent.transports import register_transport # noqa: E402
register_transport("bedrock_converse", BedrockTransport)
+393
View File
@@ -0,0 +1,393 @@
"""OpenAI Chat Completions transport.
Handles the default api_mode ('chat_completions') used by ~16 OpenAI-compatible
providers (OpenRouter, Nous, NVIDIA, Qwen, Ollama, DeepSeek, xAI, Kimi, etc.).
Messages and tools are already in OpenAI format convert_messages and
convert_tools are near-identity. The complexity lives in build_kwargs
which has provider-specific conditionals for max_tokens defaults,
reasoning configuration, temperature handling, and extra_body assembly.
"""
import copy
from typing import Any, Dict, List, Optional
from agent.moonshot_schema import is_moonshot_model, sanitize_moonshot_tools
from agent.prompt_builder import DEVELOPER_ROLE_MODELS
from agent.transports.base import ProviderTransport
from agent.transports.types import NormalizedResponse, ToolCall, Usage
class ChatCompletionsTransport(ProviderTransport):
"""Transport for api_mode='chat_completions'.
The default path for OpenAI-compatible providers.
"""
@property
def api_mode(self) -> str:
return "chat_completions"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> List[Dict[str, Any]]:
"""Messages are already in OpenAI format — sanitize Codex leaks only.
Strips Codex Responses API fields (``codex_reasoning_items`` on the
message, ``call_id``/``response_item_id`` on tool_calls) that strict
chat-completions providers reject with 400/422.
"""
needs_sanitize = False
for msg in messages:
if not isinstance(msg, dict):
continue
if "codex_reasoning_items" in msg:
needs_sanitize = True
break
tool_calls = msg.get("tool_calls")
if isinstance(tool_calls, list):
for tc in tool_calls:
if isinstance(tc, dict) and ("call_id" in tc or "response_item_id" in tc):
needs_sanitize = True
break
if needs_sanitize:
break
if not needs_sanitize:
return messages
sanitized = copy.deepcopy(messages)
for msg in sanitized:
if not isinstance(msg, dict):
continue
msg.pop("codex_reasoning_items", None)
tool_calls = msg.get("tool_calls")
if isinstance(tool_calls, list):
for tc in tool_calls:
if isinstance(tc, dict):
tc.pop("call_id", None)
tc.pop("response_item_id", None)
return sanitized
def convert_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Tools are already in OpenAI format — identity."""
return tools
def build_kwargs(
self,
model: str,
messages: List[Dict[str, Any]],
tools: Optional[List[Dict[str, Any]]] = None,
**params,
) -> Dict[str, Any]:
"""Build chat.completions.create() kwargs.
This is the most complex transport method it handles ~16 providers
via params rather than subclasses.
params:
timeout: float API call timeout
max_tokens: int | None user-configured max tokens
ephemeral_max_output_tokens: int | None one-shot override (error recovery)
max_tokens_param_fn: callable returns {max_tokens: N} or {max_completion_tokens: N}
reasoning_config: dict | None
request_overrides: dict | None
session_id: str | None
qwen_session_metadata: dict | None {sessionId, promptId} precomputed
model_lower: str lowercase model name for pattern matching
# Provider detection flags (all optional, default False)
is_openrouter: bool
is_nous: bool
is_qwen_portal: bool
is_github_models: bool
is_nvidia_nim: bool
is_kimi: bool
is_custom_provider: bool
ollama_num_ctx: int | None
# Provider routing
provider_preferences: dict | None
# Qwen-specific
qwen_prepare_fn: callable | None runs AFTER codex sanitization
qwen_prepare_inplace_fn: callable | None in-place variant for deepcopied lists
# Temperature
fixed_temperature: Any from _fixed_temperature_for_model()
omit_temperature: bool
# Reasoning
supports_reasoning: bool
github_reasoning_extra: dict | None
# Claude on OpenRouter/Nous max output
anthropic_max_output: int | None
# Extra
extra_body_additions: dict | None pre-built extra_body entries
"""
# Codex sanitization: drop reasoning_items / call_id / response_item_id
sanitized = self.convert_messages(messages)
# Qwen portal prep AFTER codex sanitization. If sanitize already
# deepcopied, reuse that copy via the in-place variant to avoid a
# second deepcopy.
is_qwen = params.get("is_qwen_portal", False)
if is_qwen:
qwen_prep = params.get("qwen_prepare_fn")
qwen_prep_inplace = params.get("qwen_prepare_inplace_fn")
if sanitized is messages:
if qwen_prep is not None:
sanitized = qwen_prep(sanitized)
else:
# Already deepcopied — transform in place
if qwen_prep_inplace is not None:
qwen_prep_inplace(sanitized)
elif qwen_prep is not None:
sanitized = qwen_prep(sanitized)
# Developer role swap for GPT-5/Codex models
model_lower = params.get("model_lower", (model or "").lower())
if (
sanitized
and isinstance(sanitized[0], dict)
and sanitized[0].get("role") == "system"
and any(p in model_lower for p in DEVELOPER_ROLE_MODELS)
):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: Dict[str, Any] = {
"model": model,
"messages": sanitized,
}
timeout = params.get("timeout")
if timeout is not None:
api_kwargs["timeout"] = timeout
# Temperature
fixed_temp = params.get("fixed_temperature")
omit_temp = params.get("omit_temperature", False)
if omit_temp:
api_kwargs.pop("temperature", None)
elif fixed_temp is not None:
api_kwargs["temperature"] = fixed_temp
# Qwen metadata (caller precomputes {sessionId, promptId})
qwen_meta = params.get("qwen_session_metadata")
if qwen_meta and is_qwen:
api_kwargs["metadata"] = qwen_meta
# Tools
if tools:
# Moonshot/Kimi uses a stricter flavored JSON Schema. Rewriting
# tool parameters here keeps aggregator routes (Nous, OpenRouter,
# etc.) compatible, in addition to direct moonshot.ai endpoints.
if is_moonshot_model(model):
tools = sanitize_moonshot_tools(tools)
api_kwargs["tools"] = tools
# max_tokens resolution — priority: ephemeral > user > provider default
max_tokens_fn = params.get("max_tokens_param_fn")
ephemeral = params.get("ephemeral_max_output_tokens")
max_tokens = params.get("max_tokens")
anthropic_max_out = params.get("anthropic_max_output")
is_nvidia_nim = params.get("is_nvidia_nim", False)
is_kimi = params.get("is_kimi", False)
reasoning_config = params.get("reasoning_config")
if ephemeral is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(ephemeral))
elif max_tokens is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(max_tokens))
elif is_nvidia_nim and max_tokens_fn:
api_kwargs.update(max_tokens_fn(16384))
elif is_qwen and max_tokens_fn:
api_kwargs.update(max_tokens_fn(65536))
elif is_kimi and max_tokens_fn:
# Kimi/Moonshot: 32000 matches Kimi CLI's default
api_kwargs.update(max_tokens_fn(32000))
elif anthropic_max_out is not None:
api_kwargs["max_tokens"] = anthropic_max_out
# Kimi: top-level reasoning_effort (unless thinking disabled)
if is_kimi:
_kimi_thinking_off = bool(
reasoning_config
and isinstance(reasoning_config, dict)
and reasoning_config.get("enabled") is False
)
if not _kimi_thinking_off:
_kimi_effort = "medium"
if reasoning_config and isinstance(reasoning_config, dict):
_e = (reasoning_config.get("effort") or "").strip().lower()
if _e in ("low", "medium", "high"):
_kimi_effort = _e
api_kwargs["reasoning_effort"] = _kimi_effort
# extra_body assembly
extra_body: Dict[str, Any] = {}
is_openrouter = params.get("is_openrouter", False)
is_nous = params.get("is_nous", False)
is_github_models = params.get("is_github_models", False)
provider_prefs = params.get("provider_preferences")
if provider_prefs and is_openrouter:
extra_body["provider"] = provider_prefs
# Kimi extra_body.thinking
if is_kimi:
_kimi_thinking_enabled = True
if reasoning_config and isinstance(reasoning_config, dict):
if reasoning_config.get("enabled") is False:
_kimi_thinking_enabled = False
extra_body["thinking"] = {
"type": "enabled" if _kimi_thinking_enabled else "disabled",
}
# Reasoning
if params.get("supports_reasoning", False):
if is_github_models:
gh_reasoning = params.get("github_reasoning_extra")
if gh_reasoning is not None:
extra_body["reasoning"] = gh_reasoning
else:
if reasoning_config is not None:
rc = dict(reasoning_config)
if is_nous and rc.get("enabled") is False:
pass # omit for Nous when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
if is_nous:
extra_body["tags"] = ["product=hermes-agent"]
# Ollama num_ctx
ollama_ctx = params.get("ollama_num_ctx")
if ollama_ctx:
options = extra_body.get("options", {})
options["num_ctx"] = ollama_ctx
extra_body["options"] = options
# Ollama/custom think=false
if params.get("is_custom_provider", False):
if reasoning_config and isinstance(reasoning_config, dict):
_effort = (reasoning_config.get("effort") or "").strip().lower()
_enabled = reasoning_config.get("enabled", True)
if _effort == "none" or _enabled is False:
extra_body["think"] = False
if is_qwen:
extra_body["vl_high_resolution_images"] = True
# Merge any pre-built extra_body additions
additions = params.get("extra_body_additions")
if additions:
extra_body.update(additions)
if extra_body:
api_kwargs["extra_body"] = extra_body
# Request overrides last (service_tier etc.)
overrides = params.get("request_overrides")
if overrides:
api_kwargs.update(overrides)
return api_kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize OpenAI ChatCompletion to NormalizedResponse.
For chat_completions, this is near-identity the response is already
in OpenAI format. extra_content on tool_calls (Gemini thought_signature)
is preserved via ToolCall.provider_data. reasoning_details (OpenRouter
unified format) and reasoning_content (DeepSeek/Moonshot) are also
preserved for downstream replay.
"""
choice = response.choices[0]
msg = choice.message
finish_reason = choice.finish_reason or "stop"
tool_calls = None
if msg.tool_calls:
tool_calls = []
for tc in msg.tool_calls:
# Preserve provider-specific extras on the tool call.
# Gemini 3 thinking models attach extra_content with
# thought_signature — without replay on the next turn the API
# rejects the request with 400.
tc_provider_data: Dict[str, Any] = {}
extra = getattr(tc, "extra_content", None)
if extra is None and hasattr(tc, "model_extra"):
extra = (tc.model_extra or {}).get("extra_content")
if extra is not None:
if hasattr(extra, "model_dump"):
try:
extra = extra.model_dump()
except Exception:
pass
tc_provider_data["extra_content"] = extra
tool_calls.append(ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
))
usage = None
if hasattr(response, "usage") and response.usage:
u = response.usage
usage = Usage(
prompt_tokens=getattr(u, "prompt_tokens", 0) or 0,
completion_tokens=getattr(u, "completion_tokens", 0) or 0,
total_tokens=getattr(u, "total_tokens", 0) or 0,
)
# Preserve reasoning fields separately. DeepSeek/Moonshot use
# ``reasoning_content``; others use ``reasoning``. Downstream code
# (_extract_reasoning, thinking-prefill retry) reads both distinctly,
# so keep them apart in provider_data rather than merging.
reasoning = getattr(msg, "reasoning", None)
reasoning_content = getattr(msg, "reasoning_content", None)
provider_data: Dict[str, Any] = {}
if reasoning_content:
provider_data["reasoning_content"] = reasoning_content
rd = getattr(msg, "reasoning_details", None)
if rd:
provider_data["reasoning_details"] = rd
return NormalizedResponse(
content=msg.content,
tool_calls=tool_calls,
finish_reason=finish_reason,
reasoning=reasoning,
usage=usage,
provider_data=provider_data or None,
)
def validate_response(self, response: Any) -> bool:
"""Check that response has valid choices."""
if response is None:
return False
if not hasattr(response, "choices") or response.choices is None:
return False
if not response.choices:
return False
return True
def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:
"""Extract OpenRouter/OpenAI cache stats from prompt_tokens_details."""
usage = getattr(response, "usage", None)
if usage is None:
return None
details = getattr(usage, "prompt_tokens_details", None)
if details is None:
return None
cached = getattr(details, "cached_tokens", 0) or 0
written = getattr(details, "cache_write_tokens", 0) or 0
if cached or written:
return {"cached_tokens": cached, "creation_tokens": written}
return None
# Auto-register on import
from agent.transports import register_transport # noqa: E402
register_transport("chat_completions", ChatCompletionsTransport)
+217
View File
@@ -0,0 +1,217 @@
"""OpenAI Responses API (Codex) transport.
Delegates to the existing adapter functions in agent/codex_responses_adapter.py.
This transport owns format conversion and normalization NOT client lifecycle,
streaming, or the _run_codex_stream() call path.
"""
from typing import Any, Dict, List, Optional
from agent.transports.base import ProviderTransport
from agent.transports.types import NormalizedResponse, ToolCall, Usage
class ResponsesApiTransport(ProviderTransport):
"""Transport for api_mode='codex_responses'.
Wraps the functions extracted into codex_responses_adapter.py (PR 1).
"""
@property
def api_mode(self) -> str:
return "codex_responses"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> Any:
"""Convert OpenAI chat messages to Responses API input items."""
from agent.codex_responses_adapter import _chat_messages_to_responses_input
return _chat_messages_to_responses_input(messages)
def convert_tools(self, tools: List[Dict[str, Any]]) -> Any:
"""Convert OpenAI tool schemas to Responses API function definitions."""
from agent.codex_responses_adapter import _responses_tools
return _responses_tools(tools)
def build_kwargs(
self,
model: str,
messages: List[Dict[str, Any]],
tools: Optional[List[Dict[str, Any]]] = None,
**params,
) -> Dict[str, Any]:
"""Build Responses API kwargs.
Calls convert_messages and convert_tools internally.
params:
instructions: str system prompt (extracted from messages[0] if not given)
reasoning_config: dict | None {effort, enabled}
session_id: str | None used for prompt_cache_key + xAI conv header
max_tokens: int | None max_output_tokens
request_overrides: dict | None extra kwargs merged in
provider: str | None provider name for backend-specific logic
base_url: str | None endpoint URL
base_url_hostname: str | None hostname for backend detection
is_github_responses: bool Copilot/GitHub models backend
is_codex_backend: bool chatgpt.com/backend-api/codex
is_xai_responses: bool xAI/Grok backend
github_reasoning_extra: dict | None Copilot reasoning params
"""
from agent.codex_responses_adapter import (
_chat_messages_to_responses_input,
_responses_tools,
)
from run_agent import DEFAULT_AGENT_IDENTITY
instructions = params.get("instructions", "")
payload_messages = messages
if not instructions:
if messages and messages[0].get("role") == "system":
instructions = str(messages[0].get("content") or "").strip()
payload_messages = messages[1:]
if not instructions:
instructions = DEFAULT_AGENT_IDENTITY
is_github_responses = params.get("is_github_responses", False)
is_codex_backend = params.get("is_codex_backend", False)
is_xai_responses = params.get("is_xai_responses", False)
# Resolve reasoning effort
reasoning_effort = "medium"
reasoning_enabled = True
reasoning_config = params.get("reasoning_config")
if reasoning_config and isinstance(reasoning_config, dict):
if reasoning_config.get("enabled") is False:
reasoning_enabled = False
elif reasoning_config.get("effort"):
reasoning_effort = reasoning_config["effort"]
_effort_clamp = {"minimal": "low"}
reasoning_effort = _effort_clamp.get(reasoning_effort, reasoning_effort)
kwargs = {
"model": model,
"instructions": instructions,
"input": _chat_messages_to_responses_input(payload_messages),
"tools": _responses_tools(tools),
"tool_choice": "auto",
"parallel_tool_calls": True,
"store": False,
}
session_id = params.get("session_id")
if not is_github_responses and session_id:
kwargs["prompt_cache_key"] = session_id
if reasoning_enabled and is_xai_responses:
kwargs["include"] = ["reasoning.encrypted_content"]
elif reasoning_enabled:
if is_github_responses:
github_reasoning = params.get("github_reasoning_extra")
if github_reasoning is not None:
kwargs["reasoning"] = github_reasoning
else:
kwargs["reasoning"] = {"effort": reasoning_effort, "summary": "auto"}
kwargs["include"] = ["reasoning.encrypted_content"]
elif not is_github_responses and not is_xai_responses:
kwargs["include"] = []
request_overrides = params.get("request_overrides")
if request_overrides:
kwargs.update(request_overrides)
max_tokens = params.get("max_tokens")
if max_tokens is not None and not is_codex_backend:
kwargs["max_output_tokens"] = max_tokens
if is_xai_responses and session_id:
kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
return kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize Codex Responses API response to NormalizedResponse."""
from agent.codex_responses_adapter import (
_normalize_codex_response,
_extract_responses_message_text,
_extract_responses_reasoning_text,
)
# _normalize_codex_response returns (SimpleNamespace, finish_reason_str)
msg, finish_reason = _normalize_codex_response(response)
tool_calls = None
if msg and msg.tool_calls:
tool_calls = []
for tc in msg.tool_calls:
provider_data = {}
if hasattr(tc, "call_id") and tc.call_id:
provider_data["call_id"] = tc.call_id
if hasattr(tc, "response_item_id") and tc.response_item_id:
provider_data["response_item_id"] = tc.response_item_id
tool_calls.append(ToolCall(
id=tc.id if hasattr(tc, "id") else (tc.function.name if hasattr(tc, "function") else None),
name=tc.function.name if hasattr(tc, "function") else getattr(tc, "name", ""),
arguments=tc.function.arguments if hasattr(tc, "function") else getattr(tc, "arguments", "{}"),
provider_data=provider_data or None,
))
# Extract reasoning items for provider_data
provider_data = {}
if msg and hasattr(msg, "codex_reasoning_items") and msg.codex_reasoning_items:
provider_data["codex_reasoning_items"] = msg.codex_reasoning_items
if msg and hasattr(msg, "reasoning_details") and msg.reasoning_details:
provider_data["reasoning_details"] = msg.reasoning_details
return NormalizedResponse(
content=msg.content if msg else None,
tool_calls=tool_calls,
finish_reason=finish_reason or "stop",
reasoning=msg.reasoning if msg and hasattr(msg, "reasoning") else None,
usage=None, # Codex usage is extracted separately in normalize_usage()
provider_data=provider_data or None,
)
def validate_response(self, response: Any) -> bool:
"""Check Codex Responses API response has valid output structure.
Returns True only if response.output is a non-empty list.
Does NOT check output_text fallback the caller handles that
with diagnostic logging for stream backfill recovery.
"""
if response is None:
return False
output = getattr(response, "output", None)
if not isinstance(output, list) or not output:
return False
return True
def preflight_kwargs(self, api_kwargs: Any, *, allow_stream: bool = False) -> dict:
"""Validate and sanitize Codex API kwargs before the call.
Normalizes input items, strips unsupported fields, validates structure.
"""
from agent.codex_responses_adapter import _preflight_codex_api_kwargs
return _preflight_codex_api_kwargs(api_kwargs, allow_stream=allow_stream)
def map_finish_reason(self, raw_reason: str) -> str:
"""Map Codex response.status to OpenAI finish_reason.
Codex uses response.status ('completed', 'incomplete') +
response.incomplete_details.reason for granular mapping.
This method handles the simple status string; the caller
should check incomplete_details separately for 'max_output_tokens'.
"""
_MAP = {
"completed": "stop",
"incomplete": "length",
"failed": "stop",
"cancelled": "stop",
}
return _MAP.get(raw_reason, "stop")
# Auto-register on import
from agent.transports import register_transport # noqa: E402
register_transport("codex_responses", ResponsesApiTransport)
+56
View File
@@ -37,6 +37,44 @@ class ToolCall:
arguments: str # JSON string
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The agent loop reads tc.function.name / tc.function.arguments
# throughout run_agent.py (45+ sites). These properties let
# NormalizedResponse pass through without the _nr_to_assistant_message
# shim, while keeping ToolCall's canonical fields flat.
@property
def type(self) -> str:
return "function"
@property
def function(self) -> "ToolCall":
"""Return self so tc.function.name / tc.function.arguments work."""
return self
@property
def call_id(self) -> Optional[str]:
"""Codex call_id from provider_data, accessed via getattr by _build_assistant_message."""
return (self.provider_data or {}).get("call_id")
@property
def response_item_id(self) -> Optional[str]:
"""Codex response_item_id from provider_data."""
return (self.provider_data or {}).get("response_item_id")
@property
def extra_content(self) -> Optional[Dict[str, Any]]:
"""Gemini extra_content (thought_signature) from provider_data.
Gemini 3 thinking models attach ``extra_content`` with a
``thought_signature`` to each tool call. This signature must be
replayed on subsequent API calls without it the API rejects the
request with HTTP 400. The chat_completions transport stores this
in ``provider_data["extra_content"]``; this property exposes it so
``_build_assistant_message`` can ``getattr(tc, "extra_content")``
uniformly.
"""
return (self.provider_data or {}).get("extra_content")
@dataclass
class Usage:
@@ -70,6 +108,24 @@ class NormalizedResponse:
usage: Optional[Usage] = None
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The shim _nr_to_assistant_message() mapped these from provider_data.
# These properties let NormalizedResponse pass through directly.
@property
def reasoning_content(self) -> Optional[str]:
pd = self.provider_data or {}
return pd.get("reasoning_content")
@property
def reasoning_details(self):
pd = self.provider_data or {}
return pd.get("reasoning_details")
@property
def codex_reasoning_items(self):
pd = self.provider_data or {}
return pd.get("codex_reasoning_items")
# ---------------------------------------------------------------------------
# Factory helpers
+12
View File
@@ -533,10 +533,22 @@ def normalize_usage(
prompt_total = _to_int(getattr(response_usage, "prompt_tokens", 0))
output_tokens = _to_int(getattr(response_usage, "completion_tokens", 0))
details = getattr(response_usage, "prompt_tokens_details", None)
# Primary: OpenAI-style prompt_tokens_details. Fallback: Anthropic-style
# top-level fields that some OpenAI-compatible proxies (OpenRouter, Vercel
# AI Gateway, Cline) expose when routing Claude models — without this
# fallback, cache writes are undercounted as 0 and cache reads can be
# missed when the proxy only surfaces them at the top level.
# Port of cline/cline#10266.
cache_read_tokens = _to_int(getattr(details, "cached_tokens", 0) if details else 0)
if not cache_read_tokens:
cache_read_tokens = _to_int(getattr(response_usage, "cache_read_input_tokens", 0))
cache_write_tokens = _to_int(
getattr(details, "cache_write_tokens", 0) if details else 0
)
if not cache_write_tokens:
cache_write_tokens = _to_int(
getattr(response_usage, "cache_creation_input_tokens", 0)
)
input_tokens = max(0, prompt_total - cache_read_tokens - cache_write_tokens)
reasoning_tokens = 0
+8
View File
@@ -507,6 +507,13 @@ agent:
# finish, then interrupts anything still running after this timeout.
# 0 = no drain, interrupt immediately.
# restart_drain_timeout: 60
# Max app-level retry attempts for API errors (connection drops, provider
# timeouts, 5xx, etc.) before the agent surfaces the failure. Lower this
# to 1 if you use fallback providers and want fast failover on flaky
# primaries (default 3). The OpenAI SDK does its own low-level retries
# underneath this wrapper — this is the Hermes-level loop.
# api_max_retries: 3
# Enable verbose logging
verbose: false
@@ -776,6 +783,7 @@ delegation:
# max_concurrent_children: 3 # Max parallel child agents (default: 3)
# max_spawn_depth: 1 # Tree depth cap (1-3, default: 1 = flat). Raise to 2 or 3 to allow orchestrator children to spawn their own workers.
# orchestrator_enabled: true # Kill switch for role="orchestrator" children (default: true).
# inherit_mcp_toolsets: true # When explicit child toolsets are narrowed, also keep the parent's MCP toolsets (default: true). Set false for strict intersection.
# model: "google/gemini-3-flash-preview" # Override model for subagents (empty = inherit parent)
# provider: "openrouter" # Override provider for subagents (empty = inherit parent)
# # Resolves full credentials (base_url, api_key) automatically.
+94 -3
View File
@@ -108,6 +108,11 @@ def _strip_reasoning_tags(text: str) -> str:
``<thought>`` (Gemma 4). Must stay in sync with
``run_agent.py::_strip_think_blocks`` and the stream consumer's
``_OPEN_THINK_TAGS`` / ``_CLOSE_THINK_TAGS`` tuples.
Also strips tool-call XML blocks some open models leak into visible
content (``<tool_call>``, ``<function_calls>``, Gemma-style
``<function name=""></function>``). Ported from
openclaw/openclaw#67318.
"""
cleaned = text
for tag in _REASONING_TAGS:
@@ -132,6 +137,31 @@ def _strip_reasoning_tags(text: str) -> str:
cleaned,
flags=re.IGNORECASE,
)
# Tool-call XML blocks (openclaw/openclaw#67318).
for tc_tag in ("tool_call", "tool_calls", "tool_result",
"function_call", "function_calls"):
cleaned = re.sub(
rf"<{tc_tag}\b[^>]*>.*?</{tc_tag}>\s*",
"",
cleaned,
flags=re.DOTALL | re.IGNORECASE,
)
# <function name="..."> — boundary + attribute gated to avoid prose FPs.
cleaned = re.sub(
r'(?:(?<=^)|(?<=[\n\r.!?:]))[ \t]*'
r'<function\b[^>]*\bname\s*=[^>]*>'
r'(?:(?:(?!</function>).)*)</function>\s*',
'',
cleaned,
flags=re.DOTALL | re.IGNORECASE,
)
# Stray tool-call close tags.
cleaned = re.sub(
r'</(?:tool_call|tool_calls|tool_result|function_call|function_calls|function)>\s*',
'',
cleaned,
flags=re.IGNORECASE,
)
return cleaned.strip()
@@ -275,13 +305,23 @@ def load_cli_config() -> Dict[str, Any]:
Environment variables take precedence over config file values.
Returns default values if no config file exists.
If HERMES_IGNORE_USER_CONFIG=1 is set (via ``hermes chat --ignore-user-config``),
the user config at ``~/.hermes/config.yaml`` is skipped entirely and only the
built-in defaults plus the project-level ``cli-config.yaml`` (if any) are used.
Credentials in ``.env`` are still loaded this flag only suppresses
behavioral/config settings.
"""
# Check user config first ({HERMES_HOME}/config.yaml)
user_config_path = _hermes_home / 'config.yaml'
project_config_path = Path(__file__).parent / 'cli-config.yaml'
# --ignore-user-config: force-skip the user config.yaml (still honor project
# config as a fallback so defaults stay sensible).
ignore_user_config = os.environ.get("HERMES_IGNORE_USER_CONFIG") == "1"
# Use user config if it exists, otherwise project config
if user_config_path.exists():
if user_config_path.exists() and not ignore_user_config:
config_path = user_config_path
else:
config_path = project_config_path
@@ -914,6 +954,32 @@ def _cleanup_worktree(info: Dict[str, str] = None) -> None:
print(f"\033[32m✓ Worktree cleaned up: {wt_path}\033[0m")
def _run_state_db_auto_maintenance(session_db) -> None:
"""Call ``SessionDB.maybe_auto_prune_and_vacuum`` using current config.
Reads the ``sessions:`` section from config.yaml via
:func:`hermes_cli.config.load_config` (the authoritative loader that
deep-merges DEFAULT_CONFIG, so unmigrated configs still get default
values). Honours ``auto_prune`` / ``retention_days`` /
``vacuum_after_prune`` / ``min_interval_hours``, and delegates to the
DB. Never raises maintenance must never block interactive startup.
"""
if session_db is None:
return
try:
from hermes_cli.config import load_config as _load_full_config
cfg = (_load_full_config().get("sessions") or {})
if not cfg.get("auto_prune", False):
return
session_db.maybe_auto_prune_and_vacuum(
retention_days=int(cfg.get("retention_days", 90)),
min_interval_hours=int(cfg.get("min_interval_hours", 24)),
vacuum=bool(cfg.get("vacuum_after_prune", True)),
)
except Exception as exc:
logger.debug("state.db auto-maintenance skipped: %s", exc)
def _prune_stale_worktrees(repo_root: str, max_age_hours: int = 24) -> None:
"""Remove stale worktrees and orphaned branches on startup.
@@ -1746,6 +1812,7 @@ class HermesCLI:
resume: str = None,
checkpoints: bool = False,
pass_session_id: bool = False,
ignore_rules: bool = False,
):
"""
Initialize the Hermes CLI.
@@ -1899,6 +1966,11 @@ class HermesCLI:
self.checkpoints_enabled = checkpoints or cp_cfg.get("enabled", False)
self.checkpoint_max_snapshots = cp_cfg.get("max_snapshots", 50)
self.pass_session_id = pass_session_id
# --ignore-rules: honor either the constructor flag or the env var set
# by `hermes chat --ignore-rules` in hermes_cli/main.py. When true we
# pass skip_context_files=True and skip_memory=True to AIAgent so
# AGENTS.md/SOUL.md/.cursorrules and persistent memory are not loaded.
self.ignore_rules = ignore_rules or os.environ.get("HERMES_IGNORE_RULES") == "1"
# Ephemeral system prompt: env var takes precedence, then config
self.system_prompt = (
@@ -1961,7 +2033,13 @@ class HermesCLI:
self._session_db = SessionDB()
except Exception as e:
logger.warning("Failed to initialize SessionDB — session will NOT be indexed for search: %s", e)
# Opportunistic state.db maintenance — runs at most once per
# min_interval_hours, tracked via state_meta in state.db itself so
# it's shared across all Hermes processes for this HERMES_HOME.
# Never blocks startup on failure.
_run_state_db_auto_maintenance(self._session_db)
# Deferred title: stored in memory until the session is created in the DB
self._pending_title: Optional[str] = None
@@ -3250,6 +3328,8 @@ class HermesCLI:
checkpoints_enabled=self.checkpoints_enabled,
checkpoint_max_snapshots=self.checkpoint_max_snapshots,
pass_session_id=self.pass_session_id,
skip_context_files=self.ignore_rules,
skip_memory=self.ignore_rules,
tool_progress_callback=self._on_tool_progress,
tool_start_callback=self._on_tool_start if self._inline_diffs_enabled else None,
tool_complete_callback=self._on_tool_complete if self._inline_diffs_enabled else None,
@@ -6605,6 +6685,13 @@ class HermesCLI:
print(f" ⚠ Port {_port} is not reachable at {cdp_url}")
os.environ["BROWSER_CDP_URL"] = cdp_url
# Eagerly start the CDP supervisor so pending_dialogs + frame_tree
# show up in the next browser_snapshot. No-op if already started.
try:
from tools.browser_tool import _ensure_cdp_supervisor # type: ignore[import-not-found]
_ensure_cdp_supervisor("default")
except Exception:
pass
print()
print("🌐 Browser connected to live Chrome via CDP")
print(f" Endpoint: {cdp_url}")
@@ -6626,7 +6713,8 @@ class HermesCLI:
if current:
os.environ.pop("BROWSER_CDP_URL", None)
try:
from tools.browser_tool import cleanup_all_browsers
from tools.browser_tool import cleanup_all_browsers, _stop_cdp_supervisor
_stop_cdp_supervisor("default")
cleanup_all_browsers()
except Exception:
pass
@@ -10754,6 +10842,8 @@ def main(
w: bool = False,
checkpoints: bool = False,
pass_session_id: bool = False,
ignore_user_config: bool = False,
ignore_rules: bool = False,
):
"""
Hermes Agent CLI - Interactive AI Assistant
@@ -10863,6 +10953,7 @@ def main(
resume=resume,
checkpoints=checkpoints,
pass_session_id=pass_session_id,
ignore_rules=ignore_rules,
)
if parsed_skills:
+7
View File
@@ -384,6 +384,7 @@ def create_job(
provider: Optional[str] = None,
base_url: Optional[str] = None,
script: Optional[str] = None,
enabled_toolsets: Optional[List[str]] = None,
) -> Dict[str, Any]:
"""
Create a new cron job.
@@ -403,6 +404,9 @@ def create_job(
script: Optional path to a Python script whose stdout is injected into the
prompt each run. The script runs before the agent turn, and its output
is prepended as context. Useful for data collection / change detection.
enabled_toolsets: Optional list of toolset names to restrict the agent to.
When set, only tools from these toolsets are loaded, reducing
token overhead. When omitted, all default tools are loaded.
Returns:
The created job dict
@@ -433,6 +437,8 @@ def create_job(
normalized_base_url = normalized_base_url or None
normalized_script = str(script).strip() if isinstance(script, str) else None
normalized_script = normalized_script or None
normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
normalized_toolsets = normalized_toolsets or None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
job = {
@@ -464,6 +470,7 @@ def create_job(
# Delivery configuration
"deliver": deliver,
"origin": origin, # Tracks where job was created for "origin" delivery
"enabled_toolsets": normalized_toolsets,
}
jobs = load_jobs()
+38
View File
@@ -40,6 +40,37 @@ from hermes_time import now as _hermes_now
logger = logging.getLogger(__name__)
def _resolve_cron_enabled_toolsets(job: dict, cfg: dict) -> list[str] | None:
"""Resolve the toolset list for a cron job.
Precedence:
1. Per-job ``enabled_toolsets`` (set via ``cronjob`` tool on create/update).
Keeps the agent's job-scoped toolset override intact — #6130.
2. Per-platform ``hermes tools`` config for the ``cron`` platform.
Mirrors gateway behavior (``_get_platform_tools(cfg, platform_key)``)
so users can gate cron toolsets globally without recreating every job.
3. ``None`` on any lookup failure AIAgent loads the full default set
(legacy behavior before this change, preserved as the safety net).
_DEFAULT_OFF_TOOLSETS ({moa, homeassistant, rl}) are removed by
``_get_platform_tools`` for unconfigured platforms, so fresh installs
get cron WITHOUT ``moa`` by default (issue reported by Norbert
surprise $4.63 run).
"""
per_job = job.get("enabled_toolsets")
if per_job:
return per_job
try:
from hermes_cli.tools_config import _get_platform_tools # lazy: avoid heavy import at cron module load
return sorted(_get_platform_tools(cfg or {}, "cron"))
except Exception as exc:
logger.warning(
"Cron toolset resolution failed, falling back to full default toolset: %s",
exc,
)
return None
# Valid delivery platforms — used to validate user-supplied platform names
# in cron delivery targets, preventing env var enumeration via crafted names.
_KNOWN_DELIVERY_PLATFORMS = frozenset({
@@ -886,6 +917,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
providers_ignored=pr.get("ignore"),
providers_order=pr.get("order"),
provider_sort=pr.get("sort"),
enabled_toolsets=_resolve_cron_enabled_toolsets(job, _cfg),
disabled_toolsets=["cronjob", "messaging", "clarify"],
quiet_mode=True,
skip_context_files=True, # Don't inject SOUL.md/AGENTS.md from scheduler cwd
@@ -972,6 +1004,12 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
f"— last activity: {_last_desc}"
)
# Guard against non-dict returns from run_conversation under error conditions
if not isinstance(result, dict):
raise RuntimeError(
f"agent.run_conversation returned {type(result).__name__} instead of dict: {result!r}"
)
final_response = result.get("final_response", "") or ""
# Strip leaked placeholder text that upstream may inject on empty completions.
if final_response.strip() == "(No response generated)":
+22
View File
@@ -58,6 +58,13 @@ if [ ! -f "$HERMES_HOME/config.yaml" ]; then
cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
fi
# Ensure the main config file remains accessible to the hermes runtime user
# even if it was edited on the host after initial ownership setup.
if [ -f "$HERMES_HOME/config.yaml" ]; then
chown hermes:hermes "$HERMES_HOME/config.yaml"
chmod 640 "$HERMES_HOME/config.yaml"
fi
# SOUL.md
if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
@@ -68,4 +75,19 @@ if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
# Final exec: two supported invocation patterns.
#
# docker run <image> -> exec `hermes` with no args (legacy default)
# docker run <image> chat -q "..." -> exec `hermes chat -q "..."` (legacy wrap)
# docker run <image> sleep infinity -> exec `sleep infinity` directly
# docker run <image> bash -> exec `bash` directly
#
# If the first positional arg resolves to an executable on PATH, we assume the
# caller wants to run it directly (needed by the launcher which runs long-lived
# `sleep infinity` sandbox containers — see tools/environments/docker.py).
# Otherwise we treat the args as a hermes subcommand and wrap with `hermes`,
# preserving the documented `docker run <image> <subcommand>` behavior.
if [ $# -gt 0 ] && command -v "$1" >/dev/null 2>&1; then
exec "$@"
fi
exec hermes "$@"
+2
View File
@@ -616,6 +616,8 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["SLACK_FREE_RESPONSE_CHANNELS"] = str(frc)
if "reactions" in slack_cfg and not os.getenv("SLACK_REACTIONS"):
os.environ["SLACK_REACTIONS"] = str(slack_cfg["reactions"]).lower()
# Discord settings → env vars (env vars take precedence)
discord_cfg = yaml_cfg.get("discord", {})
+44 -11
View File
@@ -135,9 +135,22 @@ class HookRegistry:
except Exception as e:
print(f"[hooks] Error loading hook {hook_dir.name}: {e}", flush=True)
def _resolve_handlers(self, event_type: str) -> List[Callable]:
"""Return all handlers that should fire for ``event_type``.
Exact matches fire first, followed by wildcard matches (e.g.
``command:*`` matches ``command:reset``).
"""
handlers = list(self._handlers.get(event_type, []))
if ":" in event_type:
base = event_type.split(":")[0]
wildcard_key = f"{base}:*"
handlers.extend(self._handlers.get(wildcard_key, []))
return handlers
async def emit(self, event_type: str, context: Optional[Dict[str, Any]] = None) -> None:
"""
Fire all handlers registered for an event.
Fire all handlers registered for an event, discarding return values.
Supports wildcard matching: handlers registered for "command:*" will
fire for any "command:..." event. Handlers registered for a base type
@@ -151,16 +164,7 @@ class HookRegistry:
if context is None:
context = {}
# Collect handlers: exact match + wildcard match
handlers = list(self._handlers.get(event_type, []))
# Check for wildcard patterns (e.g., "command:*" matches "command:reset")
if ":" in event_type:
base = event_type.split(":")[0]
wildcard_key = f"{base}:*"
handlers.extend(self._handlers.get(wildcard_key, []))
for fn in handlers:
for fn in self._resolve_handlers(event_type):
try:
result = fn(event_type, context)
# Support both sync and async handlers
@@ -168,3 +172,32 @@ class HookRegistry:
await result
except Exception as e:
print(f"[hooks] Error in handler for '{event_type}': {e}", flush=True)
async def emit_collect(
self,
event_type: str,
context: Optional[Dict[str, Any]] = None,
) -> List[Any]:
"""Fire handlers and return their non-None return values in order.
Like :meth:`emit` but captures each handler's return value. Used for
decision-style hooks (e.g. ``command:<name>`` policies that want to
allow/deny/rewrite the command before normal dispatch).
Exceptions from individual handlers are logged but do not abort the
remaining handlers.
"""
if context is None:
context = {}
results: List[Any] = []
for fn in self._resolve_handlers(event_type):
try:
result = fn(event_type, context)
if asyncio.iscoroutine(result):
result = await result
if result is not None:
results.append(result)
except Exception as e:
print(f"[hooks] Error in handler for '{event_type}': {e}", flush=True)
return results
+276 -21
View File
@@ -752,7 +752,10 @@ class MessageEvent:
if not self.is_command():
return self.text
parts = self.text.split(maxsplit=1)
return parts[1] if len(parts) > 1 else ""
args = parts[1] if len(parts) > 1 else ""
# iOS auto-corrects -- to — (em dash) and - to (en dash)
args = args.replace("\u2014\u2014", "--").replace("\u2014", "--").replace("\u2013", "-")
return args
@dataclass
@@ -897,10 +900,16 @@ class BasePlatformAdapter(ABC):
self._fatal_error_retryable = True
self._fatal_error_handler: Optional[Callable[["BasePlatformAdapter"], Awaitable[None] | None]] = None
# Track active message handlers per session for interrupt support
# Key: session_key (e.g., chat_id), Value: (event, asyncio.Event for interrupt)
# Track active message handlers per session for interrupt support.
# _active_sessions stores the per-session interrupt Event; _session_tasks
# maps session → the specific Task currently processing it so that
# session-terminating commands (/stop, /new, /reset) can cancel the
# right task and release the adapter-level guard deterministically.
# Without the owner-task map, an old task's finally block could delete
# a newer task's guard, leaving stale busy state.
self._active_sessions: Dict[str, asyncio.Event] = {}
self._pending_messages: Dict[str, MessageEvent] = {}
self._session_tasks: Dict[str, asyncio.Task] = {}
# Background message-processing tasks spawned by handle_message().
# Gateway shutdown cancels these so an old gateway instance doesn't keep
# working on a task after --replace or manual restarts.
@@ -1343,7 +1352,7 @@ class BasePlatformAdapter(ABC):
# Extract MEDIA:<path> tags, allowing optional whitespace after the colon
# and quoted/backticked paths for LLM-formatted outputs.
media_pattern = re.compile(
r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|pdf)(?=[\s`"',;:)\]}]|$)|\S+)[`"']?'''
r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa)(?=[\s`"',;:)\]}]|$)|\S+)[`"']?'''
)
for match in media_pattern.finditer(content):
path = match.group("path").strip()
@@ -1677,6 +1686,222 @@ class BasePlatformAdapter(ABC):
return f"{existing_text}\n\n{new_text}".strip()
return existing_text
# ------------------------------------------------------------------
# Session task + guard ownership helpers
# ------------------------------------------------------------------
# These were introduced together with the _session_tasks owner map to
# make session lifecycle reconciliation deterministic across (a) the
# normal completion path, (b) /stop/ /new/ /reset bypass commands,
# and (c) stale-lock self-heal on the next inbound message.
def _release_session_guard(
self,
session_key: str,
*,
guard: Optional[asyncio.Event] = None,
) -> None:
"""Release the adapter-level guard for a session.
When ``guard`` is provided, only release the entry if it still points
at that exact Event. This lets reset-like commands swap in a temporary
guard while the old processing task unwinds, without having the old
task's cleanup accidentally clear the replacement guard.
"""
current_guard = self._active_sessions.get(session_key)
if current_guard is None:
return
if guard is not None and current_guard is not guard:
return
del self._active_sessions[session_key]
def _session_task_is_stale(self, session_key: str) -> bool:
"""Return True if the owner task for ``session_key`` is done/cancelled.
A lock is "stale" when the adapter still has ``_active_sessions[key]``
AND a known owner task in ``_session_tasks`` that has already exited.
When there is no owner task at all, that usually means the guard was
installed by some path other than handle_message() (tests sometimes
install guards directly) don't treat that as stale. The on-entry
self-heal only needs to handle the production split-brain case where
an owner task was recorded, then exited without clearing its guard.
"""
task = self._session_tasks.get(session_key)
if task is None:
return False
done = getattr(task, "done", None)
return bool(done and done())
def _heal_stale_session_lock(self, session_key: str) -> bool:
"""Clear a stale session lock if the owner task is already gone.
Returns True if a stale lock was healed. Returns False if there is
no lock, or the owner task is still alive (the normal busy case).
This is the on-entry safety net sidbin's issue #11016 analysis calls
for: without it, a split-brain adapter still thinks the session is
active, but nothing is actually processing traps the chat in
infinite "Interrupting current task..." until the gateway is
restarted.
"""
if session_key not in self._active_sessions:
return False
if not self._session_task_is_stale(session_key):
return False
logger.warning(
"[%s] Healing stale session lock for %s (owner task is done/absent)",
self.name,
session_key,
)
self._active_sessions.pop(session_key, None)
self._pending_messages.pop(session_key, None)
self._session_tasks.pop(session_key, None)
return True
def _start_session_processing(
self,
event: MessageEvent,
session_key: str,
*,
interrupt_event: Optional[asyncio.Event] = None,
) -> bool:
"""Spawn a background processing task under the given session guard.
Returns True on success. If the runtime stubs ``create_task`` with a
non-Task sentinel (some tests do this), the guard is rolled back and
False is returned so the caller isn't left holding a half-installed
session lock.
"""
guard = interrupt_event or asyncio.Event()
self._active_sessions[session_key] = guard
task = asyncio.create_task(self._process_message_background(event, session_key))
self._session_tasks[session_key] = task
try:
self._background_tasks.add(task)
except TypeError:
# Tests stub create_task() with lightweight sentinels that are not
# hashable and do not support lifecycle callbacks.
self._session_tasks.pop(session_key, None)
self._release_session_guard(session_key, guard=guard)
return False
if hasattr(task, "add_done_callback"):
task.add_done_callback(self._background_tasks.discard)
task.add_done_callback(self._expected_cancelled_tasks.discard)
return True
async def cancel_session_processing(
self,
session_key: str,
*,
release_guard: bool = True,
discard_pending: bool = True,
) -> None:
"""Cancel in-flight processing for a single session.
``release_guard=False`` keeps the adapter-level session guard in place
so reset-like commands can finish atomically before follow-up messages
are allowed to start a fresh background task.
"""
task = self._session_tasks.pop(session_key, None)
if task is not None and not task.done():
logger.debug(
"[%s] Cancelling active processing for session %s",
self.name,
session_key,
)
self._expected_cancelled_tasks.add(task)
task.cancel()
try:
await task
except asyncio.CancelledError:
pass
except Exception:
logger.debug(
"[%s] Session cancellation raised while unwinding %s",
self.name,
session_key,
exc_info=True,
)
if discard_pending:
self._pending_messages.pop(session_key, None)
if release_guard:
self._release_session_guard(session_key)
async def _drain_pending_after_session_command(
self,
session_key: str,
command_guard: asyncio.Event,
) -> None:
"""Resume the latest queued follow-up once a session command completes.
Called at the tail of /stop, /new, and /reset dispatch. Releases the
command-scoped guard, then if a follow-up message landed while the
command was running spawns a fresh processing task for it.
"""
pending_event = self._pending_messages.pop(session_key, None)
self._release_session_guard(session_key, guard=command_guard)
if pending_event is None:
return
self._start_session_processing(pending_event, session_key)
async def _dispatch_active_session_command(
self,
event: MessageEvent,
session_key: str,
cmd: str,
) -> None:
"""Dispatch a reset-like bypass command while preserving guard ordering.
/stop, /new, and /reset must:
1. Keep the session guard installed while the runner processes the
command (so a racing follow-up message stays queued, not
dispatched as a second parallel run).
2. Cancel the old in-flight adapter task only AFTER the runner has
finished handling the command (so the runner sees consistent
state and its response is sent in order).
3. Release the command-scoped guard and drain the latest queued
follow-up exactly once, after 1 and 2 complete.
"""
logger.debug(
"[%s] Command '/%s' bypassing active-session guard for %s",
self.name,
cmd,
session_key,
)
current_guard = self._active_sessions.get(session_key)
command_guard = asyncio.Event()
self._active_sessions[session_key] = command_guard
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
try:
response = await self._message_handler(event)
# Old adapter task (if any) is cancelled AFTER the runner has
# fully handled the command — keeps ordering deterministic.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
if response:
await self._send_with_retry(
chat_id=event.source.chat_id,
content=response,
reply_to=event.message_id,
metadata=thread_meta,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
if self._active_sessions.get(session_key) is command_guard:
if session_key in self._session_tasks and current_guard is not None:
self._active_sessions[session_key] = current_guard
else:
self._release_session_guard(session_key, guard=command_guard)
raise
await self._drain_pending_after_session_command(session_key, command_guard)
async def handle_message(self, event: MessageEvent) -> None:
"""
Process an incoming message.
@@ -1693,7 +1918,15 @@ class BasePlatformAdapter(ABC):
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
)
# On-entry self-heal: if the adapter still has an _active_sessions
# entry for this key but the owner task has already exited (done or
# cancelled), the lock is stale. Clear it and fall through to
# normal dispatch so the user isn't trapped behind a dead guard —
# this is the split-brain tail described in issue #11016.
if session_key in self._active_sessions:
self._heal_stale_session_lock(session_key)
# Check if there's already an active handler for this session
if session_key in self._active_sessions:
# Certain commands must bypass the active-session guard and be
@@ -1710,6 +1943,23 @@ class BasePlatformAdapter(ABC):
from hermes_cli.commands import should_bypass_active_session
if should_bypass_active_session(cmd):
# /stop, /new, /reset must cancel the in-flight adapter task
# and preserve ordering of queued follow-ups. Route those
# through the dedicated handoff path that serializes
# cancellation + runner response + pending drain.
if cmd in ("stop", "new", "reset"):
try:
await self._dispatch_active_session_command(event, session_key, cmd)
except Exception as e:
logger.error(
"[%s] Command '/%s' dispatch failed: %s",
self.name, cmd, e, exc_info=True,
)
return
# Other bypass commands (/approve, /deny, /status,
# /background, /restart) just need direct dispatch — they
# don't cancel the running task.
logger.debug(
"[%s] Command '/%s' bypassing active-session guard for %s",
self.name, cmd, session_key,
@@ -1755,19 +2005,9 @@ class BasePlatformAdapter(ABC):
# starts would also pass the _active_sessions check and spawn a
# duplicate task. (grammY sequentialize / aiogram EventIsolation
# pattern — set the guard synchronously, not inside the task.)
self._active_sessions[session_key] = asyncio.Event()
# Spawn background task to process this message
task = asyncio.create_task(self._process_message_background(event, session_key))
try:
self._background_tasks.add(task)
except TypeError:
# Some tests stub create_task() with lightweight sentinels that are not
# hashable and do not support lifecycle callbacks.
return
if hasattr(task, "add_done_callback"):
task.add_done_callback(self._background_tasks.discard)
task.add_done_callback(self._expected_cancelled_tasks.discard)
# _start_session_processing installs the guard AND the owner-task
# mapping atomically so stale-lock detection works.
self._start_session_processing(event, session_key)
@staticmethod
def _get_human_delay() -> float:
@@ -2127,6 +2367,9 @@ class BasePlatformAdapter(ABC):
drain_task = asyncio.create_task(
self._process_message_background(late_pending, session_key)
)
# Hand ownership of the session to the drain task so stale-lock
# detection keeps working while it runs.
self._session_tasks[session_key] = drain_task
try:
self._background_tasks.add(drain_task)
drain_task.add_done_callback(self._background_tasks.discard)
@@ -2136,9 +2379,14 @@ class BasePlatformAdapter(ABC):
# Leave _active_sessions[session_key] populated — the drain
# task's own lifecycle will clean it up.
else:
# Clean up session tracking
if session_key in self._active_sessions:
del self._active_sessions[session_key]
# Clean up session tracking. Guard-match both deletes so a
# reset-like command that already swapped in its own
# command_guard (and cancelled us) can't be accidentally
# cleared by our unwind. The command owns the session now.
current_task = asyncio.current_task()
if current_task is not None and self._session_tasks.get(session_key) is current_task:
del self._session_tasks[session_key]
self._release_session_guard(session_key, guard=interrupt_event)
async def cancel_background_tasks(self) -> None:
"""Cancel any in-flight background message-processing tasks.
@@ -2168,6 +2416,7 @@ class BasePlatformAdapter(ABC):
# will be in self._background_tasks now. Re-check.
self._background_tasks.clear()
self._expected_cancelled_tasks.clear()
self._session_tasks.clear()
self._pending_messages.clear()
self._active_sessions.clear()
@@ -2191,6 +2440,9 @@ class BasePlatformAdapter(ABC):
user_id_alt: Optional[str] = None,
chat_id_alt: Optional[str] = None,
is_bot: bool = False,
guild_id: Optional[str] = None,
parent_chat_id: Optional[str] = None,
message_id: Optional[str] = None,
) -> SessionSource:
"""Helper to build a SessionSource for this platform."""
# Normalize empty topic to None
@@ -2208,6 +2460,9 @@ class BasePlatformAdapter(ABC):
user_id_alt=user_id_alt,
chat_id_alt=chat_id_alt,
is_bot=is_bot,
guild_id=str(guild_id) if guild_id else None,
parent_chat_id=str(parent_chat_id) if parent_chat_id else None,
message_id=str(message_id) if message_id else None,
)
@abstractmethod
+272 -36
View File
@@ -23,6 +23,7 @@ from typing import Callable, Dict, Optional, Any
logger = logging.getLogger(__name__)
VALID_THREAD_AUTO_ARCHIVE_MINUTES = {60, 1440, 4320, 10080}
_DISCORD_COMMAND_SYNC_POLICIES = {"safe", "bulk", "off"}
try:
import discord
@@ -527,6 +528,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Reply threading mode: "off" (no replies), "first" (reply on first
# chunk only, default), "all" (reply-reference on every chunk).
self._reply_to_mode: str = getattr(config, 'reply_to_mode', 'first') or 'first'
self._slash_commands: bool = self.config.extra.get("slash_commands", True)
async def connect(self) -> bool:
"""Connect to Discord and start receiving events."""
@@ -744,7 +746,8 @@ class DiscordAdapter(BasePlatformAdapter):
)
# Register slash commands
self._register_slash_commands()
if self._slash_commands:
self._register_slash_commands()
# Start the bot in background
self._bot_task = asyncio.create_task(self._client.start(self.config.token))
@@ -800,8 +803,27 @@ class DiscordAdapter(BasePlatformAdapter):
if not self._client:
return
try:
synced = await asyncio.wait_for(self._client.tree.sync(), timeout=30)
logger.info("[%s] Synced %d slash command(s)", self.name, len(synced))
sync_policy = self._get_discord_command_sync_policy()
if sync_policy == "off":
logger.info("[%s] Skipping Discord slash command sync (policy=off)", self.name)
return
if sync_policy == "bulk":
synced = await asyncio.wait_for(self._client.tree.sync(), timeout=30)
logger.info("[%s] Synced %d slash command(s) via bulk tree sync", self.name, len(synced))
return
summary = await asyncio.wait_for(self._safe_sync_slash_commands(), timeout=30)
logger.info(
"[%s] Safely reconciled %d slash command(s): unchanged=%d updated=%d recreated=%d created=%d deleted=%d",
self.name,
summary["total"],
summary["unchanged"],
summary["updated"],
summary["recreated"],
summary["created"],
summary["deleted"],
)
except asyncio.TimeoutError:
logger.warning("[%s] Slash command sync timed out after 30s", self.name)
except asyncio.CancelledError:
@@ -809,6 +831,183 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", self.name, e, exc_info=True)
def _get_discord_command_sync_policy(self) -> str:
raw = str(os.getenv("DISCORD_COMMAND_SYNC_POLICY", "safe") or "").strip().lower()
if raw in _DISCORD_COMMAND_SYNC_POLICIES:
return raw
if raw:
logger.warning(
"[%s] Invalid DISCORD_COMMAND_SYNC_POLICY=%r; falling back to 'safe'",
self.name,
raw,
)
return "safe"
def _canonicalize_app_command_payload(self, payload: Dict[str, Any]) -> Dict[str, Any]:
"""Reduce command payloads to the semantic fields Hermes manages."""
contexts = payload.get("contexts")
integration_types = payload.get("integration_types")
return {
"type": int(payload.get("type", 1) or 1),
"name": str(payload.get("name", "") or ""),
"description": str(payload.get("description", "") or ""),
"default_member_permissions": self._normalize_permissions(
payload.get("default_member_permissions")
),
"dm_permission": bool(payload.get("dm_permission", True)),
"nsfw": bool(payload.get("nsfw", False)),
"contexts": sorted(int(c) for c in contexts) if contexts else None,
"integration_types": (
sorted(int(i) for i in integration_types) if integration_types else None
),
"options": [
self._canonicalize_app_command_option(item)
for item in payload.get("options", []) or []
if isinstance(item, dict)
],
}
@staticmethod
def _normalize_permissions(value: Any) -> Optional[str]:
"""Discord emits default_member_permissions as str server-side but discord.py
sets it as int locally. Normalize to str-or-None so the comparison is stable."""
if value is None:
return None
return str(value)
def _existing_command_to_payload(self, command: Any) -> Dict[str, Any]:
"""Build a canonical-ready dict from an AppCommand.
discord.py's AppCommand.to_dict() does NOT include nsfw,
dm_permission, or default_member_permissions (they live only on the
attributes). Pull them from the attributes so the canonicalizer sees
the real server-side values instead of defaults otherwise any
command using non-default permissions would diff on every startup.
"""
payload = dict(command.to_dict())
nsfw = getattr(command, "nsfw", None)
if nsfw is not None:
payload["nsfw"] = bool(nsfw)
guild_only = getattr(command, "guild_only", None)
if guild_only is not None:
payload["dm_permission"] = not bool(guild_only)
default_permissions = getattr(command, "default_member_permissions", None)
if default_permissions is not None:
payload["default_member_permissions"] = getattr(
default_permissions, "value", default_permissions
)
return payload
def _canonicalize_app_command_option(self, payload: Dict[str, Any]) -> Dict[str, Any]:
return {
"type": int(payload.get("type", 0) or 0),
"name": str(payload.get("name", "") or ""),
"description": str(payload.get("description", "") or ""),
"required": bool(payload.get("required", False)),
"autocomplete": bool(payload.get("autocomplete", False)),
"choices": [
{
"name": str(choice.get("name", "") or ""),
"value": choice.get("value"),
}
for choice in payload.get("choices", []) or []
if isinstance(choice, dict)
],
"channel_types": list(payload.get("channel_types", []) or []),
"min_value": payload.get("min_value"),
"max_value": payload.get("max_value"),
"min_length": payload.get("min_length"),
"max_length": payload.get("max_length"),
"options": [
self._canonicalize_app_command_option(item)
for item in payload.get("options", []) or []
if isinstance(item, dict)
],
}
def _patchable_app_command_payload(self, payload: Dict[str, Any]) -> Dict[str, Any]:
"""Fields supported by discord.py's edit_global_command route."""
canonical = self._canonicalize_app_command_payload(payload)
return {
"name": canonical["name"],
"description": canonical["description"],
"options": canonical["options"],
}
async def _safe_sync_slash_commands(self) -> Dict[str, int]:
"""Diff existing global commands and only mutate the commands that changed."""
if not self._client:
return {
"total": 0,
"unchanged": 0,
"updated": 0,
"recreated": 0,
"created": 0,
"deleted": 0,
}
tree = self._client.tree
app_id = getattr(self._client, "application_id", None) or getattr(getattr(self._client, "user", None), "id", None)
if not app_id:
raise RuntimeError("Discord application ID is unavailable for slash command sync")
desired_payloads = [command.to_dict(tree) for command in tree.get_commands()]
desired_by_key = {
(int(payload.get("type", 1) or 1), str(payload.get("name", "") or "").lower()): payload
for payload in desired_payloads
}
existing_commands = await tree.fetch_commands()
existing_by_key = {
(
int(getattr(getattr(command, "type", None), "value", getattr(command, "type", 1)) or 1),
str(command.name or "").lower(),
): command
for command in existing_commands
}
unchanged = 0
updated = 0
recreated = 0
created = 0
deleted = 0
http = self._client.http
for key, desired in desired_by_key.items():
current = existing_by_key.pop(key, None)
if current is None:
await http.upsert_global_command(app_id, desired)
created += 1
continue
current_existing_payload = self._existing_command_to_payload(current)
current_payload = self._canonicalize_app_command_payload(current_existing_payload)
desired_payload = self._canonicalize_app_command_payload(desired)
if current_payload == desired_payload:
unchanged += 1
continue
if self._patchable_app_command_payload(current_existing_payload) == self._patchable_app_command_payload(desired):
await http.delete_global_command(app_id, current.id)
await http.upsert_global_command(app_id, desired)
recreated += 1
continue
await http.edit_global_command(app_id, current.id, desired)
updated += 1
for current in existing_by_key.values():
await http.delete_global_command(app_id, current.id)
deleted += 1
return {
"total": len(desired_payloads),
"unchanged": unchanged,
"updated": updated,
"recreated": recreated,
"created": created,
"deleted": deleted,
}
async def _add_reaction(self, message: Any, emoji: str) -> bool:
"""Add an emoji reaction to a Discord message."""
if not message or not hasattr(message, "add_reaction"):
@@ -2129,10 +2328,42 @@ class DiscordAdapter(BasePlatformAdapter):
# This ensures new commands added to COMMAND_REGISTRY in
# hermes_cli/commands.py automatically appear as Discord slash
# commands without needing a manual entry here.
def _build_auto_slash_command(_name: str, _description: str, _args_hint: str = ""):
"""Build a discord.app_commands.Command that proxies to _run_simple_slash."""
discord_name = _name.lower()[:32]
desc = (_description or f"Run /{_name}")[:100]
has_args = bool(_args_hint)
if has_args:
def _make_args_handler(__name: str, __hint: str):
@discord.app_commands.describe(args=f"Arguments: {__hint}"[:100])
async def _handler(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(
interaction, f"/{__name} {args}".strip()
)
_handler.__name__ = f"auto_slash_{__name.replace('-', '_')}"
return _handler
handler = _make_args_handler(_name, _args_hint)
else:
def _make_simple_handler(__name: str):
async def _handler(interaction: discord.Interaction):
await self._run_simple_slash(interaction, f"/{__name}")
_handler.__name__ = f"auto_slash_{__name.replace('-', '_')}"
return _handler
handler = _make_simple_handler(_name)
return discord.app_commands.Command(
name=discord_name,
description=desc,
callback=handler,
)
already_registered: set[str] = set()
try:
from hermes_cli.commands import COMMAND_REGISTRY, _is_gateway_available, _resolve_config_gates
already_registered = set()
try:
already_registered = {cmd.name for cmd in tree.get_commands()}
except Exception:
@@ -2147,38 +2378,10 @@ class DiscordAdapter(BasePlatformAdapter):
discord_name = cmd_def.name.lower()[:32]
if discord_name in already_registered:
continue
# Skip aliases that overlap with already-registered names
# (aliases for explicitly registered commands are handled above).
desc = (cmd_def.description or f"Run /{cmd_def.name}")[:100]
has_args = bool(cmd_def.args_hint)
if has_args:
# Command takes optional arguments — create handler with
# an optional ``args`` string parameter.
def _make_args_handler(_name: str, _hint: str):
@discord.app_commands.describe(args=f"Arguments: {_hint}"[:100])
async def _handler(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(
interaction, f"/{_name} {args}".strip()
)
_handler.__name__ = f"auto_slash_{_name.replace('-', '_')}"
return _handler
handler = _make_args_handler(cmd_def.name, cmd_def.args_hint)
else:
# Parameterless command.
def _make_simple_handler(_name: str):
async def _handler(interaction: discord.Interaction):
await self._run_simple_slash(interaction, f"/{_name}")
_handler.__name__ = f"auto_slash_{_name.replace('-', '_')}"
return _handler
handler = _make_simple_handler(cmd_def.name)
auto_cmd = discord.app_commands.Command(
name=discord_name,
description=desc,
callback=handler,
auto_cmd = _build_auto_slash_command(
cmd_def.name,
cmd_def.description,
cmd_def.args_hint,
)
try:
tree.add_command(auto_cmd)
@@ -2195,6 +2398,35 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e:
logger.warning("Discord auto-register from COMMAND_REGISTRY failed: %s", e)
# ── Plugin-registered slash commands ──
# Plugins register via PluginContext.register_command(); we mirror
# those into Discord's native slash picker so users get the same
# autocomplete UX as for built-in commands. No per-platform plugin
# API needed — plugin commands are platform-agnostic.
try:
from hermes_cli.commands import _iter_plugin_command_entries
for plugin_name, plugin_desc, plugin_args_hint in _iter_plugin_command_entries():
discord_name = plugin_name.lower()[:32]
if discord_name in already_registered:
continue
auto_cmd = _build_auto_slash_command(
plugin_name,
plugin_desc,
plugin_args_hint,
)
try:
tree.add_command(auto_cmd)
already_registered.add(discord_name)
except Exception:
# Silently skip commands that fail registration (e.g.
# name conflict with a subcommand group).
pass
except Exception as e:
logger.warning(
"Discord auto-register from plugin commands failed: %s", e
)
# Register skills under a single /skill command group with category
# subcommand groups. This uses 1 top-level slot instead of N,
# supporting up to 25 categories × 25 skills = 625 skills.
@@ -3024,6 +3256,7 @@ class DiscordAdapter(BasePlatformAdapter):
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
thread = await self._auto_create_thread(message)
if thread:
parent_channel_id = str(message.channel.id)
is_thread = True
thread_id = str(thread.id)
auto_threaded_channel = thread
@@ -3083,6 +3316,9 @@ class DiscordAdapter(BasePlatformAdapter):
thread_id=thread_id,
chat_topic=chat_topic,
is_bot=getattr(message.author, "bot", False),
guild_id=str(message.guild.id) if message.guild else None,
parent_chat_id=parent_channel_id,
message_id=str(message.id),
)
# Build media URLs -- download image attachments to local cache so the
+1
View File
@@ -545,6 +545,7 @@ class EmailAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a file as an email attachment."""
try:
+349 -80
View File
@@ -14,6 +14,35 @@ Supports:
- Interactive card button-click events routed as synthetic COMMAND events
- Webhook anomaly tracking (matches openclaw createWebhookAnomalyTracker)
- Verification token validation as second auth layer (matches openclaw)
Feishu identity model
---------------------
Feishu uses three user-ID tiers (official docs:
https://open.feishu.cn/document/home/user-identity-introduction/introduction):
open_id (ou_xxx) **App-scoped**. The same person gets a different
open_id under each Feishu app. Always available in
event payloads without extra permissions.
user_id (u_xxx) **Tenant-scoped**. Stable within a company but
requires the ``contact:user.employee_id:readonly``
scope. May not be present.
union_id (on_xxx) **Developer-scoped**. Same across all apps owned by
one developer/ISV. Best cross-app stable ID.
For bots specifically:
app_id The application's canonical credential identifier.
bot open_id Returned by ``/bot/v3/info``. This is the bot's own
open_id *within its app context* and is what Feishu
puts in ``mentions[].id.open_id`` when someone
@-mentions the bot. Used for mention gating only.
In single-bot mode (what Hermes currently supports), open_id works as a
de-facto unique user identifier since there is only one app context.
Session-key participant isolation prefers ``union_id`` (via user_id_alt)
over ``open_id`` (via user_id) so that sessions stay stable if the same
user is seen through different apps in the future.
"""
from __future__ import annotations
@@ -35,7 +64,7 @@ from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Sequence
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
@@ -73,7 +102,9 @@ try:
UpdateMessageRequest,
UpdateMessageRequestBody,
)
from lark_oapi.core import AccessTokenType, HttpMethod
from lark_oapi.core.const import FEISHU_DOMAIN, LARK_DOMAIN
from lark_oapi.core.model import BaseRequest
from lark_oapi.event.callback.model.p2_card_action_trigger import (
CallBackCard,
P2CardActionTriggerResponse,
@@ -234,6 +265,8 @@ FALLBACK_ATTACHMENT_TEXT = "[Attachment]"
_PREFERRED_LOCALES = ("zh_cn", "en_us")
_MARKDOWN_SPECIAL_CHARS_RE = re.compile(r"([\\`*_{}\[\]()#+\-!|>~])")
_MENTION_PLACEHOLDER_RE = re.compile(r"@_user_\d+")
_MENTION_BOUNDARY_CHARS = frozenset(" \t\n\r.,;:!?、,。;:!?()[]{}<>\"'`")
_TRAILING_TERMINAL_PUNCT = frozenset(" \t\n\r.!?。!?")
_WHITESPACE_RE = re.compile(r"\s+")
_SUPPORTED_CARD_TEXT_KEYS = (
"title",
@@ -277,12 +310,36 @@ class FeishuPostMediaRef:
resource_type: str = "file"
@dataclass(frozen=True)
class FeishuMentionRef:
name: str = ""
open_id: str = ""
is_all: bool = False
is_self: bool = False
@dataclass(frozen=True)
class _FeishuBotIdentity:
open_id: str = ""
user_id: str = ""
name: str = ""
def matches(self, *, open_id: str, user_id: str, name: str) -> bool:
# Precedence: open_id > user_id > name. IDs are authoritative when both
# sides have them; the next tier is only considered when either side
# lacks the current one.
if open_id and self.open_id:
return open_id == self.open_id
if user_id and self.user_id:
return user_id == self.user_id
return bool(self.name) and name == self.name
@dataclass(frozen=True)
class FeishuPostParseResult:
text_content: str
image_keys: List[str] = field(default_factory=list)
media_refs: List[FeishuPostMediaRef] = field(default_factory=list)
mentioned_ids: List[str] = field(default_factory=list)
@dataclass(frozen=True)
@@ -292,14 +349,14 @@ class FeishuNormalizedMessage:
preferred_message_type: str = "text"
image_keys: List[str] = field(default_factory=list)
media_refs: List[FeishuPostMediaRef] = field(default_factory=list)
mentioned_ids: List[str] = field(default_factory=list)
mentions: List[FeishuMentionRef] = field(default_factory=list)
relation_kind: str = "plain"
metadata: Dict[str, Any] = field(default_factory=dict)
@dataclass(frozen=True)
class FeishuAdapterSettings:
app_id: str
app_id: str # Canonical bot/app identifier (credential, not from event payloads)
app_secret: str
domain_name: str
connection_mode: str
@@ -307,7 +364,11 @@ class FeishuAdapterSettings:
verification_token: str
group_policy: str
allowed_group_users: frozenset[str]
# Bot's own open_id (app-scoped) — returned by /bot/v3/info. Used only for
# @mention matching: Feishu puts this value in mentions[].id.open_id when
# a user @-mentions the bot in a group chat.
bot_open_id: str
# Bot's user_id (tenant-scoped) — optional, used as fallback mention match.
bot_user_id: str
bot_name: str
dedup_cache_size: int
@@ -505,14 +566,17 @@ def _build_markdown_post_rows(content: str) -> List[List[Dict[str, str]]]:
return rows or [[{"tag": "md", "text": content}]]
def parse_feishu_post_payload(payload: Any) -> FeishuPostParseResult:
def parse_feishu_post_payload(
payload: Any,
*,
mentions_map: Optional[Dict[str, FeishuMentionRef]] = None,
) -> FeishuPostParseResult:
resolved = _resolve_post_payload(payload)
if not resolved:
return FeishuPostParseResult(text_content=FALLBACK_POST_TEXT)
image_keys: List[str] = []
media_refs: List[FeishuPostMediaRef] = []
mentioned_ids: List[str] = []
parts: List[str] = []
title = _normalize_feishu_text(str(resolved.get("title", "")).strip())
@@ -523,7 +587,10 @@ def parse_feishu_post_payload(payload: Any) -> FeishuPostParseResult:
if not isinstance(row, list):
continue
row_text = _normalize_feishu_text(
"".join(_render_post_element(item, image_keys, media_refs, mentioned_ids) for item in row)
"".join(
_render_post_element(item, image_keys, media_refs, mentions_map)
for item in row
)
)
if row_text:
parts.append(row_text)
@@ -532,7 +599,6 @@ def parse_feishu_post_payload(payload: Any) -> FeishuPostParseResult:
text_content="\n".join(parts).strip() or FALLBACK_POST_TEXT,
image_keys=image_keys,
media_refs=media_refs,
mentioned_ids=mentioned_ids,
)
@@ -584,7 +650,7 @@ def _render_post_element(
element: Any,
image_keys: List[str],
media_refs: List[FeishuPostMediaRef],
mentioned_ids: List[str],
mentions_map: Optional[Dict[str, FeishuMentionRef]] = None,
) -> str:
if isinstance(element, str):
return element
@@ -602,19 +668,21 @@ def _render_post_element(
escaped_label = _escape_markdown_text(label)
return f"[{escaped_label}]({href})" if href else escaped_label
if tag == "at":
mentioned_id = (
str(element.get("open_id", "")).strip()
or str(element.get("user_id", "")).strip()
)
if mentioned_id and mentioned_id not in mentioned_ids:
mentioned_ids.append(mentioned_id)
display_name = (
str(element.get("user_name", "")).strip()
or str(element.get("name", "")).strip()
or str(element.get("text", "")).strip()
or mentioned_id
)
return f"@{_escape_markdown_text(display_name)}" if display_name else "@"
# Post <at>.user_id is a placeholder ("@_user_N" or "@_all"); look up
# the real ref in mentions_map for the display name.
placeholder = str(element.get("user_id", "")).strip()
if placeholder == "@_all":
# Feishu SDK sometimes omits @_all from the top-level mentions
# payload; record it here so the caller's mention list stays complete.
if mentions_map is not None and "@_all" not in mentions_map:
mentions_map["@_all"] = FeishuMentionRef(is_all=True)
return "@all"
ref = (mentions_map or {}).get(placeholder)
if ref is not None:
display_name = ref.name or ref.open_id or "user"
else:
display_name = str(element.get("user_name", "")).strip() or "user"
return f"@{_escape_markdown_text(display_name)}"
if tag in {"img", "image"}:
image_key = str(element.get("image_key", "")).strip()
if image_key and image_key not in image_keys:
@@ -652,8 +720,7 @@ def _render_post_element(
nested_parts: List[str] = []
for key in ("text", "title", "content", "children", "elements"):
value = element.get(key)
extracted = _render_nested_post(value, image_keys, media_refs, mentioned_ids)
extracted = _render_nested_post(element.get(key), image_keys, media_refs, mentions_map)
if extracted:
nested_parts.append(extracted)
return " ".join(part for part in nested_parts if part)
@@ -663,7 +730,7 @@ def _render_nested_post(
value: Any,
image_keys: List[str],
media_refs: List[FeishuPostMediaRef],
mentioned_ids: List[str],
mentions_map: Optional[Dict[str, FeishuMentionRef]] = None,
) -> str:
if isinstance(value, str):
return _escape_markdown_text(value)
@@ -671,17 +738,17 @@ def _render_nested_post(
return " ".join(
part
for item in value
for part in [_render_nested_post(item, image_keys, media_refs, mentioned_ids)]
for part in [_render_nested_post(item, image_keys, media_refs, mentions_map)]
if part
)
if isinstance(value, dict):
direct = _render_post_element(value, image_keys, media_refs, mentioned_ids)
direct = _render_post_element(value, image_keys, media_refs, mentions_map)
if direct:
return direct
return " ".join(
part
for item in value.values()
for part in [_render_nested_post(item, image_keys, media_refs, mentioned_ids)]
for part in [_render_nested_post(item, image_keys, media_refs, mentions_map)]
if part
)
return ""
@@ -692,31 +759,48 @@ def _render_nested_post(
# ---------------------------------------------------------------------------
def normalize_feishu_message(*, message_type: str, raw_content: str) -> FeishuNormalizedMessage:
def normalize_feishu_message(
*,
message_type: str,
raw_content: str,
mentions: Optional[Sequence[Any]] = None,
bot: _FeishuBotIdentity = _FeishuBotIdentity(),
) -> FeishuNormalizedMessage:
normalized_type = str(message_type or "").strip().lower()
payload = _load_feishu_payload(raw_content)
mentions_map = _build_mentions_map(mentions, bot)
if normalized_type == "text":
text = str(payload.get("text", "") or "")
# Feishu SDK sometimes omits @_all from the mentions payload even when
# the text literal contains it (confirmed via im.v1.message.get).
if "@_all" in text and "@_all" not in mentions_map:
mentions_map["@_all"] = FeishuMentionRef(is_all=True)
return FeishuNormalizedMessage(
raw_type=normalized_type,
text_content=_normalize_feishu_text(str(payload.get("text", "") or "")),
text_content=_normalize_feishu_text(text, mentions_map),
mentions=list(mentions_map.values()),
)
if normalized_type == "post":
parsed_post = parse_feishu_post_payload(payload)
# The walker writes back to mentions_map if it encounters
# <at user_id="@_all">, so reading .values() after parsing is enough.
parsed_post = parse_feishu_post_payload(payload, mentions_map=mentions_map)
return FeishuNormalizedMessage(
raw_type=normalized_type,
text_content=parsed_post.text_content,
image_keys=list(parsed_post.image_keys),
media_refs=list(parsed_post.media_refs),
mentioned_ids=list(parsed_post.mentioned_ids),
mentions=list(mentions_map.values()),
relation_kind="post",
)
mention_refs = list(mentions_map.values())
if normalized_type == "image":
image_key = str(payload.get("image_key", "") or "").strip()
alt_text = _normalize_feishu_text(
str(payload.get("text", "") or "")
or str(payload.get("alt", "") or "")
or FALLBACK_IMAGE_TEXT
or FALLBACK_IMAGE_TEXT,
mentions_map,
)
return FeishuNormalizedMessage(
raw_type=normalized_type,
@@ -724,6 +808,7 @@ def normalize_feishu_message(*, message_type: str, raw_content: str) -> FeishuNo
preferred_message_type="photo",
image_keys=[image_key] if image_key else [],
relation_kind="image",
mentions=mention_refs,
)
if normalized_type in {"file", "audio", "media"}:
media_ref = _build_media_ref_from_payload(payload, resource_type=normalized_type)
@@ -735,6 +820,7 @@ def normalize_feishu_message(*, message_type: str, raw_content: str) -> FeishuNo
media_refs=[media_ref] if media_ref.file_key else [],
relation_kind=normalized_type,
metadata={"placeholder_text": placeholder},
mentions=mention_refs,
)
if normalized_type == "merge_forward":
return _normalize_merge_forward_message(payload)
@@ -1009,8 +1095,20 @@ def _first_non_empty_text(*values: Any) -> str:
# ---------------------------------------------------------------------------
def _normalize_feishu_text(text: str) -> str:
cleaned = _MENTION_PLACEHOLDER_RE.sub(" ", text or "")
def _normalize_feishu_text(
text: str,
mentions_map: Optional[Dict[str, FeishuMentionRef]] = None,
) -> str:
def _sub(match: "re.Match[str]") -> str:
key = match.group(0)
ref = (mentions_map or {}).get(key)
if ref is None:
return " "
name = ref.name or ref.open_id or "user"
return f"@{name}"
cleaned = _MENTION_PLACEHOLDER_RE.sub(_sub, text or "")
cleaned = cleaned.replace("@_all", "@all")
cleaned = cleaned.replace("\r\n", "\n").replace("\r", "\n")
cleaned = "\n".join(_WHITESPACE_RE.sub(" ", line).strip() for line in cleaned.split("\n"))
cleaned = "\n".join(line for line in cleaned.split("\n") if line)
@@ -1029,6 +1127,117 @@ def _unique_lines(lines: List[str]) -> List[str]:
return unique
# ---------------------------------------------------------------------------
# Mention helpers
# ---------------------------------------------------------------------------
def _extract_mention_ids(mention: Any) -> tuple[str, str]:
# Returns (open_id, user_id). im.v1.message.get hands back id as a string
# plus id_type discriminator; event payloads hand back a nested UserId
# object carrying both fields.
mention_id = getattr(mention, "id", None)
if isinstance(mention_id, str):
id_type = str(getattr(mention, "id_type", "") or "").lower()
if id_type == "open_id":
return mention_id, ""
if id_type == "user_id":
return "", mention_id
return "", ""
if mention_id is None:
return "", ""
return (
str(getattr(mention_id, "open_id", "") or ""),
str(getattr(mention_id, "user_id", "") or ""),
)
def _build_mentions_map(
mentions: Optional[Sequence[Any]],
bot: _FeishuBotIdentity,
) -> Dict[str, FeishuMentionRef]:
result: Dict[str, FeishuMentionRef] = {}
for mention in mentions or []:
key = str(getattr(mention, "key", "") or "")
if not key:
continue
if key == "@_all":
result[key] = FeishuMentionRef(is_all=True)
continue
open_id, user_id = _extract_mention_ids(mention)
name = str(getattr(mention, "name", "") or "").strip()
result[key] = FeishuMentionRef(
name=name,
open_id=open_id,
is_self=bot.matches(open_id=open_id, user_id=user_id, name=name),
)
return result
def _build_mention_hint(mentions: Sequence[FeishuMentionRef]) -> str:
parts: List[str] = []
seen: set = set()
for ref in mentions:
if ref.is_self:
continue
signature = (ref.is_all, ref.open_id, ref.name)
if signature in seen:
continue
seen.add(signature)
if ref.is_all:
parts.append("@all")
elif ref.open_id:
parts.append(f"{ref.name or 'unknown'} (open_id={ref.open_id})")
else:
parts.append(ref.name or "unknown")
return f"[Mentioned: {', '.join(parts)}]" if parts else ""
def _strip_edge_self_mentions(
text: str,
mentions: Sequence[FeishuMentionRef],
) -> str:
# Leading: strip consecutive self-mentions unconditionally.
# Trailing: strip only when followed by whitespace/terminal punct, so
# mid-sentence references ("don't @Bot again") stay intact.
# Leading word-boundary prevents @Al from eating @Alice.
if not text:
return text
self_names = [
f"@{ref.name or ref.open_id or 'user'}"
for ref in mentions
if ref.is_self
]
if not self_names:
return text
remaining = text.lstrip()
while True:
for nm in self_names:
if not remaining.startswith(nm):
continue
after = remaining[len(nm):]
if after and after[0] not in _MENTION_BOUNDARY_CHARS:
continue
remaining = after.lstrip()
break
else:
break
while True:
i = len(remaining)
while i > 0 and remaining[i - 1] in _TRAILING_TERMINAL_PUNCT:
i -= 1
body = remaining[:i]
tail = remaining[i:]
for nm in self_names:
if body.endswith(nm):
remaining = body[: -len(nm)].rstrip() + tail
break
else:
return remaining
def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
"""Run the official Lark WS client in its own thread-local event loop."""
import lark_oapi.ws.client as ws_client_module
@@ -1491,6 +1700,7 @@ class FeishuAdapter(BasePlatformAdapter):
if not self._client:
return SendResult(success=False, error="Not connected")
content = self.format_message(content)
try:
msg_type, payload = self._build_outbound_payload(content)
body = self._build_update_message_body(msg_type=msg_type, content=payload)
@@ -2470,13 +2680,22 @@ class FeishuAdapter(BasePlatformAdapter):
chat_type: str,
message_id: str,
) -> None:
text, inbound_type, media_urls, media_types = await self._extract_message_content(message)
text, inbound_type, media_urls, media_types, mentions = await self._extract_message_content(message)
if inbound_type == MessageType.TEXT:
text = _strip_edge_self_mentions(text, mentions)
if text.startswith("/"):
inbound_type = MessageType.COMMAND
# Guard runs post-strip so a pure "@Bot" message (stripped to "") is dropped.
if inbound_type == MessageType.TEXT and not text and not media_urls:
logger.debug("[Feishu] Ignoring unsupported or empty message type: %s", getattr(message, "message_type", ""))
logger.debug("[Feishu] Ignoring empty text message id=%s", message_id)
return
if inbound_type == MessageType.TEXT and text.startswith("/"):
inbound_type = MessageType.COMMAND
if inbound_type != MessageType.COMMAND:
hint = _build_mention_hint(mentions)
if hint:
text = f"{hint}\n\n{text}" if text else hint
reply_to_message_id = (
getattr(message, "parent_id", None)
@@ -2935,14 +3154,20 @@ class FeishuAdapter(BasePlatformAdapter):
# Message content extraction and resource download
# =========================================================================
async def _extract_message_content(self, message: Any) -> tuple[str, MessageType, List[str], List[str]]:
"""Extract text and cached media from a normalized Feishu message."""
async def _extract_message_content(
self, message: Any
) -> tuple[str, MessageType, List[str], List[str], List[FeishuMentionRef]]:
raw_content = getattr(message, "content", "") or ""
raw_type = getattr(message, "message_type", "") or ""
message_id = str(getattr(message, "message_id", "") or "")
logger.info("[Feishu] Received raw message type=%s message_id=%s", raw_type, message_id)
normalized = normalize_feishu_message(message_type=raw_type, raw_content=raw_content)
normalized = normalize_feishu_message(
message_type=raw_type,
raw_content=raw_content,
mentions=getattr(message, "mentions", None),
bot=self._bot_identity(),
)
media_urls, media_types = await self._download_feishu_message_resources(
message_id=message_id,
normalized=normalized,
@@ -2959,7 +3184,7 @@ class FeishuAdapter(BasePlatformAdapter):
if injected:
text = injected
return text, inbound_type, media_urls, media_types
return text, inbound_type, media_urls, media_types, list(normalized.mentions)
async def _download_feishu_message_resources(
self,
@@ -3223,10 +3448,22 @@ class FeishuAdapter(BasePlatformAdapter):
return "group"
async def _resolve_sender_profile(self, sender_id: Any) -> Dict[str, Optional[str]]:
"""Map Feishu's three-tier user IDs onto Hermes' SessionSource fields.
Preference order for the primary ``user_id`` field:
1. user_id (tenant-scoped, most stable requires permission scope)
2. open_id (app-scoped, always available different per bot app)
``user_id_alt`` carries the union_id (developer-scoped, stable across
all apps by the same developer). Session-key generation prefers
user_id_alt when present, so participant isolation stays stable even
if the primary ID is the app-scoped open_id.
"""
open_id = getattr(sender_id, "open_id", None) or None
user_id = getattr(sender_id, "user_id", None) or None
union_id = getattr(sender_id, "union_id", None) or None
primary_id = open_id or user_id
# Prefer tenant-scoped user_id; fall back to app-scoped open_id.
primary_id = user_id or open_id
display_name = await self._resolve_sender_name_from_api(primary_id or union_id)
return {
"user_id": primary_id,
@@ -3308,15 +3545,31 @@ class FeishuAdapter(BasePlatformAdapter):
body = getattr(parent, "body", None)
msg_type = getattr(parent, "msg_type", "") or ""
raw_content = getattr(body, "content", "") or ""
text = self._extract_text_from_raw_content(msg_type=msg_type, raw_content=raw_content)
parent_mentions = getattr(parent, "mentions", None) if parent else None
text = self._extract_text_from_raw_content(
msg_type=msg_type,
raw_content=raw_content,
mentions=parent_mentions,
)
self._message_text_cache[message_id] = text
return text
except Exception:
logger.warning("[Feishu] Failed to fetch parent message %s", message_id, exc_info=True)
return None
def _extract_text_from_raw_content(self, *, msg_type: str, raw_content: str) -> Optional[str]:
normalized = normalize_feishu_message(message_type=msg_type, raw_content=raw_content)
def _extract_text_from_raw_content(
self,
*,
msg_type: str,
raw_content: str,
mentions: Optional[Sequence[Any]] = None,
) -> Optional[str]:
normalized = normalize_feishu_message(
message_type=msg_type,
raw_content=raw_content,
mentions=mentions,
bot=self._bot_identity(),
)
if normalized.text_content:
return normalized.text_content
placeholder = normalized.metadata.get("placeholder_text") if isinstance(normalized.metadata, dict) else None
@@ -3386,10 +3639,10 @@ class FeishuAdapter(BasePlatformAdapter):
normalized = normalize_feishu_message(
message_type=getattr(message, "message_type", "") or "",
raw_content=raw_content,
mentions=getattr(message, "mentions", None),
bot=self._bot_identity(),
)
if normalized.mentioned_ids:
return self._post_mentions_bot(normalized.mentioned_ids)
return False
return self._post_mentions_bot(normalized.mentions)
def _is_self_sent_bot_message(self, event: Any) -> bool:
"""Return True only for Feishu events emitted by this Hermes bot."""
@@ -3409,30 +3662,37 @@ class FeishuAdapter(BasePlatformAdapter):
return False
def _message_mentions_bot(self, mentions: List[Any]) -> bool:
"""Check whether any mention targets the configured or inferred bot identity."""
# IDs trump names: when both sides have open_id (or both user_id),
# match requires equal IDs. Name fallback only when either side
# lacks an ID.
for mention in mentions:
mention_id = getattr(mention, "id", None)
mention_open_id = getattr(mention_id, "open_id", None)
mention_user_id = getattr(mention_id, "user_id", None)
mention_open_id = (getattr(mention_id, "open_id", None) or "").strip()
mention_user_id = (getattr(mention_id, "user_id", None) or "").strip()
mention_name = (getattr(mention, "name", None) or "").strip()
if self._bot_open_id and mention_open_id == self._bot_open_id:
return True
if self._bot_user_id and mention_user_id == self._bot_user_id:
return True
if mention_open_id and self._bot_open_id:
if mention_open_id == self._bot_open_id:
return True
continue # IDs differ — not the bot; skip name fallback.
if mention_user_id and self._bot_user_id:
if mention_user_id == self._bot_user_id:
return True
continue
if self._bot_name and mention_name == self._bot_name:
return True
return False
def _post_mentions_bot(self, mentioned_ids: List[str]) -> bool:
if not mentioned_ids:
return False
if self._bot_open_id and self._bot_open_id in mentioned_ids:
return True
if self._bot_user_id and self._bot_user_id in mentioned_ids:
return True
return False
def _post_mentions_bot(self, mentions: List[FeishuMentionRef]) -> bool:
return any(m.is_self for m in mentions)
def _bot_identity(self) -> _FeishuBotIdentity:
return _FeishuBotIdentity(
open_id=self._bot_open_id,
user_id=self._bot_user_id,
name=self._bot_name,
)
async def _hydrate_bot_identity(self) -> None:
"""Best-effort discovery of bot identity for precise group mention gating
@@ -3457,14 +3717,15 @@ class FeishuAdapter(BasePlatformAdapter):
# uses via probe_bot().
if not self._bot_open_id or not self._bot_name:
try:
resp = await asyncio.to_thread(
self._client.request,
method="GET",
url="/open-apis/bot/v3/info",
body=None,
raw_response=True,
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/info")
.token_types({AccessTokenType.TENANT})
.build()
)
content = getattr(resp, "content", None)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if content:
payload = json.loads(content)
parsed = _parse_bot_response(payload) or {}
@@ -4212,6 +4473,9 @@ def probe_bot(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
Uses lark_oapi SDK when available, falls back to raw HTTP otherwise.
Returns {"bot_name": ..., "bot_open_id": ...} on success, None on failure.
Note: ``bot_open_id`` here is the bot's app-scoped open_id — the same ID
that Feishu puts in @mention payloads. It is NOT the app_id.
"""
if FEISHU_AVAILABLE:
return _probe_bot_sdk(app_id, app_secret, domain)
@@ -4232,12 +4496,12 @@ def _build_onboard_client(app_id: str, app_secret: str, domain: str) -> Any:
def _parse_bot_response(data: dict) -> Optional[dict]:
"""Extract bot_name and bot_open_id from a /bot/v3/info response."""
# /bot/v3/info returns bot.app_name; legacy paths used bot_name — accept both.
if data.get("code") != 0:
return None
bot = data.get("bot") or data.get("data", {}).get("bot") or {}
return {
"bot_name": bot.get("bot_name"),
"bot_name": bot.get("app_name") or bot.get("bot_name"),
"bot_open_id": bot.get("open_id"),
}
@@ -4246,13 +4510,18 @@ def _probe_bot_sdk(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Probe bot info using lark_oapi SDK."""
try:
client = _build_onboard_client(app_id, app_secret, domain)
resp = client.request(
method="GET",
url="/open-apis/bot/v3/info",
body=None,
raw_response=True,
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/info")
.token_types({AccessTokenType.TENANT})
.build()
)
return _parse_bot_response(json.loads(resp.content))
resp = client.request(req)
content = getattr(getattr(resp, "raw", None), "content", None)
if content is None:
return None
return _parse_bot_response(json.loads(content))
except Exception as exc:
logger.debug("[Feishu onboard] SDK probe failed: %s", exc)
return None
+2 -4
View File
@@ -26,9 +26,8 @@ from .adapter import ( # noqa: F401
# -- Onboard (QR-code scan-to-configure) -----------------------------------
from .onboard import ( # noqa: F401
BindStatus,
create_bind_task,
poll_bind_result,
build_connect_url,
qr_register,
)
from .crypto import decrypt_secret, generate_bind_key # noqa: F401
@@ -44,9 +43,8 @@ __all__ = [
"_ssrf_redirect_guard",
# onboard
"BindStatus",
"create_bind_task",
"poll_bind_result",
"build_connect_url",
"qr_register",
# crypto
"decrypt_secret",
"generate_bind_key",
+3
View File
@@ -535,6 +535,9 @@ class QQAdapter(BasePlatformAdapter):
quick_disconnect_count = 0
else:
backoff_idx += 1
if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
logger.error("[%s] Max reconnect attempts reached (QQCloseError)", self._log_tag)
return
except Exception as exc:
if not self._running:
+117 -21
View File
@@ -1,6 +1,10 @@
"""
QQBot scan-to-configure (QR code onboard) module.
Mirrors the Feishu onboarding pattern: synchronous HTTP + a single public
entry-point ``qr_register()`` that handles the full flow (create task
display QR code poll decrypt credentials).
Calls the ``q.qq.com`` ``create_bind_task`` / ``poll_bind_result`` APIs to
generate a QR-code URL and poll for scan completion. On success the caller
receives the bot's *app_id*, *client_secret* (decrypted locally), and the
@@ -12,18 +16,20 @@ Reference: https://bot.q.qq.com/wiki/develop/api-v2/
from __future__ import annotations
import logging
import time
from enum import IntEnum
from typing import Tuple
from typing import Optional, Tuple
from urllib.parse import quote
from .constants import (
ONBOARD_API_TIMEOUT,
ONBOARD_CREATE_PATH,
ONBOARD_POLL_INTERVAL,
ONBOARD_POLL_PATH,
PORTAL_HOST,
QR_URL_TEMPLATE,
)
from .crypto import generate_bind_key
from .crypto import decrypt_secret, generate_bind_key
from .utils import get_api_headers
logger = logging.getLogger(__name__)
@@ -35,7 +41,7 @@ logger = logging.getLogger(__name__)
class BindStatus(IntEnum):
"""Status codes returned by ``poll_bind_result``."""
"""Status codes returned by ``_poll_bind_result``."""
NONE = 0
PENDING = 1
@@ -44,18 +50,40 @@ class BindStatus(IntEnum):
# ---------------------------------------------------------------------------
# Public API
# QR rendering
# ---------------------------------------------------------------------------
try:
import qrcode as _qrcode_mod
except (ImportError, TypeError):
_qrcode_mod = None # type: ignore[assignment]
def _render_qr(url: str) -> bool:
"""Try to render a QR code in the terminal. Returns True if successful."""
if _qrcode_mod is None:
return False
try:
qr = _qrcode_mod.QRCode(
error_correction=_qrcode_mod.constants.ERROR_CORRECT_M,
border=2,
)
qr.add_data(url)
qr.make(fit=True)
qr.print_ascii(invert=True)
return True
except Exception:
return False
# ---------------------------------------------------------------------------
# Synchronous HTTP helpers (mirrors Feishu _post_registration pattern)
# ---------------------------------------------------------------------------
async def create_bind_task(
timeout: float = ONBOARD_API_TIMEOUT,
) -> Tuple[str, str]:
def _create_bind_task(timeout: float = ONBOARD_API_TIMEOUT) -> Tuple[str, str]:
"""Create a bind task and return *(task_id, aes_key_base64)*.
The AES key is generated locally and sent to the server so it can
encrypt the bot credentials before returning them.
Raises:
RuntimeError: If the API returns a non-zero ``retcode``.
"""
@@ -64,8 +92,8 @@ async def create_bind_task(
url = f"https://{PORTAL_HOST}{ONBOARD_CREATE_PATH}"
key = generate_bind_key()
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
resp = await client.post(url, json={"key": key}, headers=get_api_headers())
with httpx.Client(timeout=timeout, follow_redirects=True) as client:
resp = client.post(url, json={"key": key}, headers=get_api_headers())
resp.raise_for_status()
data = resp.json()
@@ -80,7 +108,7 @@ async def create_bind_task(
return task_id, key
async def poll_bind_result(
def _poll_bind_result(
task_id: str,
timeout: float = ONBOARD_API_TIMEOUT,
) -> Tuple[BindStatus, str, str, str]:
@@ -89,12 +117,6 @@ async def poll_bind_result(
Returns:
A 4-tuple of ``(status, bot_appid, bot_encrypt_secret, user_openid)``.
* ``bot_encrypt_secret`` is AES-256-GCM encrypted decrypt it with
:func:`~gateway.platforms.qqbot.crypto.decrypt_secret` using the
key from :func:`create_bind_task`.
* ``user_openid`` is the OpenID of the person who scanned the code
(available when ``status == COMPLETED``).
Raises:
RuntimeError: If the API returns a non-zero ``retcode``.
"""
@@ -102,8 +124,8 @@ async def poll_bind_result(
url = f"https://{PORTAL_HOST}{ONBOARD_POLL_PATH}"
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
resp = await client.post(url, json={"task_id": task_id}, headers=get_api_headers())
with httpx.Client(timeout=timeout, follow_redirects=True) as client:
resp = client.post(url, json={"task_id": task_id}, headers=get_api_headers())
resp.raise_for_status()
data = resp.json()
@@ -122,3 +144,77 @@ async def poll_bind_result(
def build_connect_url(task_id: str) -> str:
"""Build the QR-code target URL for a given *task_id*."""
return QR_URL_TEMPLATE.format(task_id=quote(task_id))
# ---------------------------------------------------------------------------
# Public entry-point
# ---------------------------------------------------------------------------
_MAX_REFRESHES = 3
def qr_register(timeout_seconds: int = 600) -> Optional[dict]:
"""Run the QQBot scan-to-configure QR registration flow.
Mirrors ``feishu.qr_register()``: handles create display poll
decrypt in one call. Unexpected errors propagate to the caller.
:returns:
``{"app_id": ..., "client_secret": ..., "user_openid": ...}`` on
success, or ``None`` on failure / expiry / cancellation.
"""
deadline = time.monotonic() + timeout_seconds
for refresh_count in range(_MAX_REFRESHES + 1):
# ── Create bind task ──
try:
task_id, aes_key = _create_bind_task()
except Exception as exc:
logger.warning("[QQBot onboard] Failed to create bind task: %s", exc)
return None
url = build_connect_url(task_id)
# ── Display QR code + URL ──
print()
if _render_qr(url):
print(f" Scan the QR code above, or open this URL directly:\n {url}")
else:
print(f" Open this URL in QQ on your phone:\n {url}")
print(" Tip: pip install qrcode to display a scannable QR code here")
print()
# ── Poll loop ──
while time.monotonic() < deadline:
try:
status, app_id, encrypted_secret, user_openid = _poll_bind_result(task_id)
except Exception:
time.sleep(ONBOARD_POLL_INTERVAL)
continue
if status == BindStatus.COMPLETED:
client_secret = decrypt_secret(encrypted_secret, aes_key)
print()
print(f" QR scan complete! (App ID: {app_id})")
if user_openid:
print(f" Scanner's OpenID: {user_openid}")
return {
"app_id": app_id,
"client_secret": client_secret,
"user_openid": user_openid,
}
if status == BindStatus.EXPIRED:
if refresh_count >= _MAX_REFRESHES:
logger.warning("[QQBot onboard] QR code expired %d times — giving up", _MAX_REFRESHES)
return None
print(f"\n QR code expired, refreshing... ({refresh_count + 1}/{_MAX_REFRESHES})")
break # next for-loop iteration creates a new task
time.sleep(ONBOARD_POLL_INTERVAL)
else:
# deadline reached without completing
logger.warning("[QQBot onboard] Poll timed out after %ds", timeout_seconds)
return None
return None
+57 -7
View File
@@ -38,6 +38,7 @@ from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
ProcessingOutcome,
SendResult,
SUPPORTED_DOCUMENT_TYPES,
safe_url_for_log,
@@ -113,6 +114,11 @@ class SlackAdapter(BasePlatformAdapter):
# Cache for _fetch_thread_context results: cache_key → _ThreadContextCache
self._thread_context_cache: Dict[str, _ThreadContextCache] = {}
self._THREAD_CACHE_TTL = 60.0
# Track message IDs that should get reaction lifecycle (DMs / @mentions).
self._reacting_message_ids: set = set()
# Track active assistant thread status indicators so stop_typing can
# clear them (chat_id → thread_ts).
self._active_status_threads: Dict[str, str] = {}
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
@@ -362,6 +368,7 @@ class SlackAdapter(BasePlatformAdapter):
if not thread_ts:
return # Can only set status in a thread context
self._active_status_threads[chat_id] = thread_ts
try:
await self._get_client(chat_id).assistant_threads_setStatus(
channel_id=chat_id,
@@ -373,6 +380,22 @@ class SlackAdapter(BasePlatformAdapter):
# in an assistant-enabled context. Falls back to reactions.
logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)
async def stop_typing(self, chat_id: str) -> None:
"""Clear the assistant thread status indicator."""
if not self._app:
return
thread_ts = self._active_status_threads.pop(chat_id, None)
if not thread_ts:
return
try:
await self._get_client(chat_id).assistant_threads_setStatus(
channel_id=chat_id,
thread_ts=thread_ts,
status="",
)
except Exception as e:
logger.debug("[Slack] assistant.threads.setStatus clear failed: %s", e)
def _dm_top_level_threads_as_sessions(self) -> bool:
"""Whether top-level Slack DMs get per-message session threads.
@@ -584,6 +607,38 @@ class SlackAdapter(BasePlatformAdapter):
logger.debug("[Slack] reactions.remove failed (%s): %s", emoji, e)
return False
def _reactions_enabled(self) -> bool:
"""Check if message reactions are enabled via config/env."""
return os.getenv("SLACK_REACTIONS", "true").lower() not in ("false", "0", "no")
async def on_processing_start(self, event: MessageEvent) -> None:
"""Add an in-progress reaction when message processing begins."""
if not self._reactions_enabled():
return
ts = getattr(event, "message_id", None)
if not ts or ts not in self._reacting_message_ids:
return
channel_id = getattr(event.source, "chat_id", None)
if channel_id:
await self._add_reaction(channel_id, ts, "eyes")
async def on_processing_complete(self, event: MessageEvent, outcome: ProcessingOutcome) -> None:
"""Swap the in-progress reaction for a final success/failure reaction."""
if not self._reactions_enabled():
return
ts = getattr(event, "message_id", None)
if not ts or ts not in self._reacting_message_ids:
return
self._reacting_message_ids.discard(ts)
channel_id = getattr(event.source, "chat_id", None)
if not channel_id:
return
await self._remove_reaction(channel_id, ts, "eyes")
if outcome == ProcessingOutcome.SUCCESS:
await self._add_reaction(channel_id, ts, "white_check_mark")
elif outcome == ProcessingOutcome.FAILURE:
await self._add_reaction(channel_id, ts, "x")
# ----- User identity resolution -----
async def _resolve_user_name(self, user_id: str, chat_id: str = "") -> str:
@@ -1213,17 +1268,12 @@ class SlackAdapter(BasePlatformAdapter):
# Only react when bot is directly addressed (DM or @mention).
# In listen-all channels (require_mention=false), reacting to every
# casual message would be noisy.
_should_react = is_dm or is_mentioned
_should_react = (is_dm or is_mentioned) and self._reactions_enabled()
if _should_react:
await self._add_reaction(channel_id, ts, "eyes")
self._reacting_message_ids.add(ts)
await self.handle_message(msg_event)
if _should_react:
await self._remove_reaction(channel_id, ts, "eyes")
await self._add_reaction(channel_id, ts, "white_check_mark")
# ----- Approval button support (Block Kit) -----
async def send_exec_approval(
+136
View File
@@ -508,6 +508,11 @@ class WeComAdapter(BasePlatformAdapter):
self._remember_chat_req_id(chat_id, self._payload_req_id(payload))
text, reply_text = self._extract_text(body)
# Strip leading @mention in group chats so slash commands like
# "@BotName /approve" are correctly recognized as "/approve".
# Mirrors what the Telegram adapter does (re.sub @botname).
if is_group and text:
text = re.sub(r"^@\S+\s*", "", text).strip()
media_urls, media_types = await self._extract_media(body)
message_type = self._derive_message_type(body, text, media_types)
has_reply_context = bool(reply_text and (text or media_urls))
@@ -1464,3 +1469,134 @@ class WeComAdapter(BasePlatformAdapter):
"name": chat_id,
"type": "group" if chat_id and chat_id.lower().startswith("group") else "dm",
}
# ------------------------------------------------------------------
# QR code scan flow for obtaining bot credentials
# ------------------------------------------------------------------
_QR_GENERATE_URL = "https://work.weixin.qq.com/ai/qc/generate"
_QR_QUERY_URL = "https://work.weixin.qq.com/ai/qc/query_result"
_QR_CODE_PAGE = "https://work.weixin.qq.com/ai/qc/gen?source=hermes&scode="
_QR_POLL_INTERVAL = 3 # seconds
_QR_POLL_TIMEOUT = 300 # 5 minutes
def qr_scan_for_bot_info(
*,
timeout_seconds: int = _QR_POLL_TIMEOUT,
) -> Optional[Dict[str, str]]:
"""Run the WeCom QR scan flow to obtain bot_id and secret.
Fetches a QR code from WeCom, renders it in the terminal, and polls
until the user scans it or the timeout expires.
Returns ``{"bot_id": ..., "secret": ...}`` on success, ``None`` on
failure or timeout.
Note: the ``work.weixin.qq.com/ai/qc/{generate,query_result}`` endpoints
used here are not part of WeCom's public developer API — they back the
admin-console web UI's bot-creation flow and may change without notice.
The same pattern is used by the feishu/dingtalk QR setup wizards.
"""
try:
import urllib.request
import urllib.parse
except ImportError: # pragma: no cover
logger.error("urllib is required for WeCom QR scan")
return None
generate_url = f"{_QR_GENERATE_URL}?source=hermes"
# ── Step 1: Fetch QR code ──
print(" Connecting to WeCom...", end="", flush=True)
try:
req = urllib.request.Request(generate_url, headers={"User-Agent": "HermesAgent/1.0"})
with urllib.request.urlopen(req, timeout=15) as resp:
raw = json.loads(resp.read().decode("utf-8"))
except Exception as exc:
logger.error("WeCom QR: failed to fetch QR code: %s", exc)
print(f" failed: {exc}")
return None
data = raw.get("data") or {}
scode = str(data.get("scode") or "").strip()
auth_url = str(data.get("auth_url") or "").strip()
if not scode or not auth_url:
logger.error("WeCom QR: unexpected response format: %s", raw)
print(" failed: unexpected response format")
return None
print(" done.")
# ── Step 2: Render QR code in terminal ──
print()
qr_rendered = False
try:
import qrcode as _qrcode
qr = _qrcode.QRCode()
qr.add_data(auth_url)
qr.make(fit=True)
qr.print_ascii(invert=True)
qr_rendered = True
except ImportError:
pass
except Exception:
pass
page_url = f"{_QR_CODE_PAGE}{urllib.parse.quote(scode)}"
if qr_rendered:
print(f"\n Scan the QR code above, or open this URL directly:\n {page_url}")
else:
print(f" Open this URL in WeCom on your phone:\n\n {page_url}\n")
print(" Tip: pip install qrcode to display a scannable QR code here next time")
print()
print(" Fetching configuration results...", end="", flush=True)
# ── Step 3: Poll for result ──
import time
deadline = time.time() + timeout_seconds
query_url = f"{_QR_QUERY_URL}?scode={urllib.parse.quote(scode)}"
poll_count = 0
while time.time() < deadline:
try:
req = urllib.request.Request(query_url, headers={"User-Agent": "HermesAgent/1.0"})
with urllib.request.urlopen(req, timeout=10) as resp:
result = json.loads(resp.read().decode("utf-8"))
except Exception as exc:
logger.debug("WeCom QR poll error: %s", exc)
time.sleep(_QR_POLL_INTERVAL)
continue
poll_count += 1
# Print a dot on every poll so progress is visible within 3s.
print(".", end="", flush=True)
result_data = result.get("data") or {}
status = str(result_data.get("status") or "").lower()
if status == "success":
print() # newline after "Fetching configuration results..." dots
bot_info = result_data.get("bot_info") or {}
bot_id = str(bot_info.get("botid") or bot_info.get("bot_id") or "").strip()
secret = str(bot_info.get("secret") or "").strip()
if bot_id and secret:
return {"bot_id": bot_id, "secret": secret}
logger.warning(
"WeCom QR: scan reported success but bot_info missing or incomplete: %s",
result_data,
)
print(
" QR scan reported success but no bot credentials were returned.\n"
" This usually means the bot was not actually created on the WeCom side.\n"
" Falling back to manual credential entry."
)
return None
time.sleep(_QR_POLL_INTERVAL)
print() # newline after dots
print(f" QR scan timed out ({timeout_seconds // 60} minutes). Please try again.")
return None
+267 -60
View File
@@ -710,7 +710,26 @@ class GatewayRunner:
self._session_db = SessionDB()
except Exception as e:
logger.debug("SQLite session store not available: %s", e)
# Opportunistic state.db maintenance: prune ended sessions older
# than sessions.retention_days + optional VACUUM. Tracks last-run
# in state_meta so it only actually executes once per
# sessions.min_interval_hours. Gateway is long-lived so blocking
# a few seconds once per day is acceptable; failures are logged
# but never raised.
if self._session_db is not None:
try:
from hermes_cli.config import load_config as _load_full_config
_sess_cfg = (_load_full_config().get("sessions") or {})
if _sess_cfg.get("auto_prune", False):
self._session_db.maybe_auto_prune_and_vacuum(
retention_days=int(_sess_cfg.get("retention_days", 90)),
min_interval_hours=int(_sess_cfg.get("min_interval_hours", 24)),
vacuum=bool(_sess_cfg.get("vacuum_after_prune", True)),
)
except Exception as exc:
logger.debug("state.db auto-maintenance skipped: %s", exc)
# DM pairing store for code-based user authorization
from gateway.pairing import PairingStore
self.pairing_store = PairingStore()
@@ -1532,27 +1551,23 @@ class GatewayRunner:
)
return True
# --- Normal busy case (agent actively running a task) ---
# The user sent a message while the agent is working. Interrupt the
# agent immediately so it stops the current tool-calling loop and
# processes the new message. The pending message is stored in the
# adapter so the base adapter picks it up once the interrupted run
# returns. A brief ack tells the user what's happening (debounced
# to avoid spam when they fire multiple messages quickly).
# Normal busy case (agent actively running a task)
adapter = self.adapters.get(event.source.platform)
if not adapter:
return False # let default path handle it
# Store the message so it's processed as the next turn after the
# interrupt causes the current run to exit.
# current run finishes (or is interrupted).
from gateway.platforms.base import merge_pending_message_event
merge_pending_message_event(adapter._pending_messages, session_key, event)
# Interrupt the running agent — this aborts in-flight tool calls and
# causes the agent loop to exit at the next check point.
is_queue_mode = self._busy_input_mode == "queue"
# If not in queue mode, interrupt the running agent immediately.
# This aborts in-flight tool calls and causes the agent loop to exit
# at the next check point.
running_agent = self._running_agents.get(session_key)
if running_agent and running_agent is not _AGENT_PENDING_SENTINEL:
if not is_queue_mode and running_agent and running_agent is not _AGENT_PENDING_SENTINEL:
try:
running_agent.interrupt(event.text)
except Exception:
@@ -1564,7 +1579,7 @@ class GatewayRunner:
now = time.time()
last_ack = self._busy_ack_ts.get(session_key, 0)
if now - last_ack < _BUSY_ACK_COOLDOWN:
return True # interrupt sent, ack already delivered recently
return True # interrupt sent (if not queue), ack already delivered recently
self._busy_ack_ts[session_key] = now
@@ -1589,10 +1604,16 @@ class GatewayRunner:
pass
status_detail = f" ({', '.join(status_parts)})" if status_parts else ""
message = (
f"⚡ Interrupting current task{status_detail}. "
f"I'll respond to your message shortly."
)
if is_queue_mode:
message = (
f"⏳ Queued for the next turn{status_detail}. "
f"I'll respond once the current task finishes."
)
else:
message = (
f"⚡ Interrupting current task{status_detail}. "
f"I'll respond to your message shortly."
)
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
try:
@@ -2541,6 +2562,40 @@ class GatewayRunner:
return
async def _stop_impl() -> None:
def _kill_tool_subprocesses(phase: str) -> None:
"""Kill tool subprocesses + tear down terminal envs + browsers.
Called twice in the shutdown path: once eagerly after a
drain timeout forces agent interrupt (so we reclaim bash/
sleep children before systemd TimeoutStopSec escalates to
SIGKILL on the cgroup #8202), and once as a final
catch-all at the end of _stop_impl() for the graceful
path or anything respawned mid-teardown.
All steps are best-effort; exceptions are swallowed so
one subsystem's failure doesn't block the rest.
"""
try:
from tools.process_registry import process_registry
_killed = process_registry.kill_all()
if _killed:
logger.info(
"Shutdown (%s): killed %d tool subprocess(es)",
phase, _killed,
)
except Exception as _e:
logger.debug("process_registry.kill_all (%s) error: %s", phase, _e)
try:
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
except Exception as _e:
logger.debug("cleanup_all_environments (%s) error: %s", phase, _e)
try:
from tools.browser_tool import cleanup_all_browsers
cleanup_all_browsers()
except Exception as _e:
logger.debug("cleanup_all_browsers (%s) error: %s", phase, _e)
logger.info(
"Stopping gateway%s...",
" for restart" if self._restart_requested else "",
@@ -2602,6 +2657,16 @@ class GatewayRunner:
self._update_runtime_status("draining")
await asyncio.sleep(0.1)
# Kill lingering tool subprocesses NOW, before we spend more
# budget on adapter disconnect / session DB close. Under
# systemd (TimeoutStopSec bounded by drain_timeout+headroom),
# deferring this to the end of stop() risks systemd escalating
# to SIGKILL on the cgroup first — at which point bash/sleep
# children left behind by an interrupted terminal tool get
# killed by systemd instead of us (issue #8202). The final
# catch-all cleanup below still runs for the graceful path.
_kill_tool_subprocesses("post-interrupt")
if self._restart_requested and self._restart_detached:
try:
await self._launch_detached_restart_command()
@@ -2637,22 +2702,13 @@ class GatewayRunner:
self._shutdown_event.set()
# Global cleanup: kill any remaining tool subprocesses not tied
# to a specific agent (catch-all for zombie prevention).
try:
from tools.process_registry import process_registry
process_registry.kill_all()
except Exception:
pass
try:
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
except Exception:
pass
try:
from tools.browser_tool import cleanup_all_browsers
cleanup_all_browsers()
except Exception:
pass
# to a specific agent (catch-all for zombie prevention). On the
# drain-timeout path we already did this earlier after agent
# interrupt — this second call catches (a) the graceful path
# where drain succeeded without interrupt, and (b) anything
# that got respawned between the earlier call and adapter
# disconnect (defense in depth; safe to call repeatedly).
_kill_tool_subprocesses("final-cleanup")
# Close SQLite session DBs so the WAL write lock is released.
# Without this, --replace and similar restart flows leave the
@@ -2668,8 +2724,9 @@ class GatewayRunner:
except Exception as _e:
logger.debug("SessionDB close error: %s", _e)
from gateway.status import remove_pid_file
from gateway.status import remove_pid_file, release_gateway_runtime_lock
remove_pid_file()
release_gateway_runtime_lock()
# Write a clean-shutdown marker so the next startup knows this
# wasn't a crash. suspend_recently_active() only needs to run
@@ -3466,23 +3523,73 @@ class GatewayRunner:
# Check for commands
command = event.get_command()
# Emit command:* hook for any recognized slash command.
# GATEWAY_KNOWN_COMMANDS is derived from the central COMMAND_REGISTRY
# in hermes_cli/commands.py — no hardcoded set to maintain here.
from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS, resolve_command as _resolve_cmd
if command and command in GATEWAY_KNOWN_COMMANDS:
await self.hooks.emit(f"command:{command}", {
"platform": source.platform.value if source.platform else "",
"user_id": source.user_id,
"command": command,
"args": event.get_command_args().strip(),
})
# Resolve aliases to canonical name so dispatch only checks canonicals.
from hermes_cli.commands import (
GATEWAY_KNOWN_COMMANDS,
is_gateway_known_command,
resolve_command as _resolve_cmd,
)
# Resolve aliases to canonical name so dispatch and hook names
# don't depend on the exact alias the user typed.
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
# Fire the ``command:<canonical>`` hook for any recognized slash
# command — built-in OR plugin-registered. Handlers can return a
# dict with ``{"decision": "deny" | "handled" | "rewrite", ...}``
# to intercept dispatch before core handling runs. This replaces
# the previous fire-and-forget emit(): return values are now
# honored, but handlers that return nothing behave exactly as
# before (telemetry-style hooks keep working).
if command and is_gateway_known_command(canonical):
raw_args = event.get_command_args().strip()
hook_ctx = {
"platform": source.platform.value if source.platform else "",
"user_id": source.user_id,
"command": canonical,
"raw_command": command,
"args": raw_args,
"raw_args": raw_args,
}
try:
hook_results = await self.hooks.emit_collect(
f"command:{canonical}", hook_ctx
)
except Exception as _hook_err:
logger.debug(
"command:%s hook dispatch failed (non-fatal): %s",
canonical, _hook_err,
)
hook_results = []
for hook_result in hook_results:
if not isinstance(hook_result, dict):
continue
decision = str(hook_result.get("decision", "")).strip().lower()
if not decision or decision == "allow":
continue
if decision == "deny":
message = hook_result.get("message")
if isinstance(message, str) and message:
return message
return f"Command `/{command}` was blocked by a hook."
if decision == "handled":
message = hook_result.get("message")
return message if isinstance(message, str) and message else None
if decision == "rewrite":
new_command = str(
hook_result.get("command_name", "")
).strip().lstrip("/")
if not new_command:
continue
new_args = str(hook_result.get("raw_args", "")).strip()
event.text = f"/{new_command} {new_args}".strip()
command = event.get_command()
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
break
if canonical == "new":
return await self._handle_reset_command(event)
@@ -4901,6 +5008,11 @@ class GatewayRunner:
# the configured default instead of the previously switched model.
self._session_model_overrides.pop(session_key, None)
# Clear session-scoped dangerous-command approvals and /yolo state.
# /new is a conversation-boundary operation — approval state from the
# previous conversation must not survive the reset.
self._clear_session_boundary_security_state(session_key)
# Fire plugin on_session_finalize hook (session boundary)
try:
from hermes_cli.plugins import invoke_hook as _invoke_hook
@@ -5409,6 +5521,7 @@ class GatewayRunner:
try:
providers = list_authenticated_providers(
current_provider=current_provider,
current_base_url=current_base_url,
user_providers=user_provs,
custom_providers=custom_provs,
max_models=50,
@@ -5520,6 +5633,7 @@ class GatewayRunner:
try:
providers = list_authenticated_providers(
current_provider=current_provider,
current_base_url=current_base_url,
user_providers=user_provs,
custom_providers=custom_provs,
max_models=5,
@@ -6456,6 +6570,11 @@ class GatewayRunner:
session_id=task_id,
platform=platform_key,
user_id=source.user_id,
user_name=source.user_name,
chat_id=source.chat_id,
chat_name=source.chat_name,
chat_type=source.chat_type,
thread_id=source.thread_id,
session_db=self._session_db,
fallback_model=self._fallback_model,
)
@@ -7142,6 +7261,7 @@ class GatewayRunner:
new_entry = self.session_store.switch_session(session_key, target_id)
if not new_entry:
return "Failed to switch session."
self._clear_session_boundary_security_state(session_key)
# Get the title for confirmation
title = self._session_db.get_session_title(target_id) or name
@@ -7216,6 +7336,7 @@ class GatewayRunner:
tool_calls=msg.get("tool_calls"),
tool_call_id=msg.get("tool_call_id"),
reasoning=msg.get("reasoning"),
reasoning_content=msg.get("reasoning_content"),
)
except Exception:
pass # Best-effort copy
@@ -7230,6 +7351,7 @@ class GatewayRunner:
new_entry = self.session_store.switch_session(session_key, new_session_id)
if not new_entry:
return "Branch created but failed to switch to it."
self._clear_session_boundary_security_state(session_key)
# Evict any cached agent for this session
self._evict_cached_agent(session_key)
@@ -7620,13 +7742,14 @@ class GatewayRunner:
from hermes_cli.debug import (
_capture_dump, collect_debug_report,
upload_to_pastebin, _schedule_auto_delete,
_GATEWAY_PRIVACY_NOTICE,
_GATEWAY_PRIVACY_NOTICE, _best_effort_sweep_expired_pastes,
)
loop = asyncio.get_running_loop()
# Run blocking I/O (dump capture, log reads, uploads) in a thread.
def _collect_and_upload():
_best_effort_sweep_expired_pastes()
dump_text = _capture_dump()
report = collect_debug_report(log_lines=200, dump_text=dump_text)
@@ -8579,7 +8702,12 @@ class GatewayRunner:
override = self._session_model_overrides.get(session_key)
return override is not None and override.get("model") == agent_model
def _release_running_agent_state(self, session_key: str) -> None:
def _release_running_agent_state(
self,
session_key: str,
*,
run_generation: Optional[int] = None,
) -> bool:
"""Pop ALL per-running-agent state entries for ``session_key``.
Replaces ad-hoc ``del self._running_agents[key]`` calls scattered
@@ -8595,13 +8723,48 @@ class GatewayRunner:
across turns (``_session_model_overrides``, ``_voice_mode``,
``_pending_approvals``, ``_update_prompt_pending``) is NOT
touched here those have their own lifecycles.
When ``run_generation`` is provided, only clear the slot if that
generation is still current for the session. This prevents an
older async run whose generation was bumped by /stop or /new from
clobbering a newer run's state during its own unwind. Returns
True when the slot was cleared, False when an ownership guard
blocked it.
"""
if not session_key:
return
return False
if run_generation is not None and not self._is_session_run_current(
session_key, run_generation
):
return False
self._running_agents.pop(session_key, None)
self._running_agents_ts.pop(session_key, None)
if hasattr(self, "_busy_ack_ts"):
self._busy_ack_ts.pop(session_key, None)
return True
def _clear_session_boundary_security_state(self, session_key: str) -> None:
"""Clear approval state that must not survive a real conversation switch."""
if not session_key:
return
pending_approvals = getattr(self, "_pending_approvals", None)
if isinstance(pending_approvals, dict):
pending_approvals.pop(session_key, None)
try:
from tools.approval import clear_session as _clear_approval_session
except Exception:
return
try:
_clear_approval_session(session_key)
except Exception as e:
logger.debug(
"Failed to clear approval state for session boundary %s: %s",
session_key,
e,
)
def _begin_session_run_generation(self, session_key: str) -> int:
"""Claim a fresh run generation token for ``session_key``.
@@ -9698,6 +9861,11 @@ class GatewayRunner:
session_id=session_id,
platform=platform_key,
user_id=source.user_id,
user_name=source.user_name,
chat_id=source.chat_id,
chat_name=source.chat_name,
chat_type=source.chat_type,
thread_id=source.thread_id,
gateway_session_key=session_key,
session_db=self._session_db,
fallback_model=self._fallback_model,
@@ -10135,10 +10303,24 @@ class GatewayRunner:
# Wait for agent to be created
while agent_holder[0] is None:
await asyncio.sleep(0.05)
if session_key:
self._running_agents[session_key] = agent_holder[0]
if self._draining:
self._update_runtime_status("draining")
if not session_key:
return
# Only promote the sentinel to the real agent if this run is still
# current. If /stop or /new bumped the generation while we were
# spinning up, leave the newer run's slot alone — we'll be
# discarded by the stale-result check in _handle_message_with_agent.
if run_generation is not None and not self._is_session_run_current(
session_key, run_generation
):
logger.info(
"Skipping stale agent promotion for %s — generation %s is no longer current",
(session_key or "")[:20],
run_generation,
)
return
self._running_agents[session_key] = agent_holder[0]
if self._draining:
self._update_runtime_status("draining")
tracking_task = asyncio.create_task(track_agent())
@@ -10193,9 +10375,9 @@ class GatewayRunner:
# Periodic "still working" notifications for long-running tasks.
# Fires every N seconds so the user knows the agent hasn't died.
# Config: agent.gateway_notify_interval in config.yaml, or
# HERMES_AGENT_NOTIFY_INTERVAL env var. Default 600s (10 min).
# HERMES_AGENT_NOTIFY_INTERVAL env var. Default 180s (3 min).
# 0 = disable notifications.
_NOTIFY_INTERVAL_RAW = float(os.getenv("HERMES_AGENT_NOTIFY_INTERVAL", 600))
_NOTIFY_INTERVAL_RAW = float(os.getenv("HERMES_AGENT_NOTIFY_INTERVAL", 180))
_NOTIFY_INTERVAL = _NOTIFY_INTERVAL_RAW if _NOTIFY_INTERVAL_RAW > 0 else None
_notify_start = time.time()
@@ -10644,7 +10826,14 @@ class GatewayRunner:
# Clean up tracking
tracking_task.cancel()
if session_key:
self._release_running_agent_state(session_key)
# Only release the slot if this run's generation still owns
# it. A /stop or /new that bumped the generation while we
# were unwinding has already installed its own state; this
# guard prevents an old run from clobbering it on the way
# out.
self._release_running_agent_state(
session_key, run_generation=run_generation
)
if self._draining:
self._update_runtime_status("draining")
@@ -10764,10 +10953,18 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
# The PID file is scoped to HERMES_HOME, so future multi-profile
# setups (each profile using a distinct HERMES_HOME) will naturally
# allow concurrent instances without tripping this guard.
from gateway.status import get_running_pid, remove_pid_file, terminate_pid
from gateway.status import (
acquire_gateway_runtime_lock,
get_running_pid,
get_process_start_time,
release_gateway_runtime_lock,
remove_pid_file,
terminate_pid,
)
existing_pid = get_running_pid()
if existing_pid is not None and existing_pid != os.getpid():
if replace:
existing_start_time = get_process_start_time(existing_pid)
logger.info(
"Replacing existing gateway instance (PID %d) with --replace.",
existing_pid,
@@ -10836,7 +11033,10 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
# leaving stale lock files that block the new gateway from starting.
try:
from gateway.status import release_all_scoped_locks
_released = release_all_scoped_locks()
_released = release_all_scoped_locks(
owner_pid=existing_pid,
owner_start_time=existing_start_time,
)
if _released:
logger.info("Released %d stale scoped lock(s) from old gateway.", _released)
except Exception:
@@ -10977,14 +11177,21 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
"Exiting to avoid double-running.", _current_pid
)
return False
if not acquire_gateway_runtime_lock():
logger.error(
"Gateway runtime lock is already held by another instance. Exiting."
)
return False
try:
write_pid_file()
except FileExistsError:
release_gateway_runtime_lock()
logger.error(
"PID file race lost to another gateway instance. Exiting."
)
return False
atexit.register(remove_pid_file)
atexit.register(release_gateway_runtime_lock)
# Start the gateway
success = await runner.start()
+47 -10
View File
@@ -80,9 +80,12 @@ class SessionSource:
user_name: Optional[str] = None
thread_id: Optional[str] = None # For forum topics, Discord threads, etc.
chat_topic: Optional[str] = None # Channel topic/description (Discord, Slack)
user_id_alt: Optional[str] = None # Signal UUID (alternative to phone number)
user_id_alt: Optional[str] = None # Platform-specific stable alt ID (Signal UUID, Feishu union_id)
chat_id_alt: Optional[str] = None # Signal group internal ID
is_bot: bool = False # True when the message author is a bot/webhook (Discord)
guild_id: Optional[str] = None # Discord guild / Slack workspace / Matrix server scope
parent_chat_id: Optional[str] = None # Parent channel when chat_id refers to a thread
message_id: Optional[str] = None # ID of the triggering message (for pin/reply/react)
@property
def description(self) -> str:
@@ -120,8 +123,14 @@ class SessionSource:
d["user_id_alt"] = self.user_id_alt
if self.chat_id_alt:
d["chat_id_alt"] = self.chat_id_alt
if self.guild_id:
d["guild_id"] = self.guild_id
if self.parent_chat_id:
d["parent_chat_id"] = self.parent_chat_id
if self.message_id:
d["message_id"] = self.message_id
return d
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
return cls(
@@ -135,6 +144,9 @@ class SessionSource:
chat_topic=data.get("chat_topic"),
user_id_alt=data.get("user_id_alt"),
chat_id_alt=data.get("chat_id_alt"),
guild_id=data.get("guild_id"),
parent_chat_id=data.get("parent_chat_id"),
message_id=data.get("message_id"),
)
@@ -273,14 +285,34 @@ def build_session_context_prompt(
"that you can only read messages sent directly to you and respond."
)
elif context.source.platform == Platform.DISCORD:
lines.append("")
lines.append(
"**Platform notes:** You are running inside Discord. "
"You do NOT have access to Discord-specific APIs — you cannot search "
"channel history, pin messages, manage roles, or list server members. "
"Do not promise to perform these actions. If the user asks, explain "
"that you can only read messages sent directly to you and respond."
)
# The discord tool self-gates on DISCORD_BOT_TOKEN at registry
# check time. Match that condition so the prompt stays honest:
# with a token the agent has fetch_messages/search_members/
# create_thread (and optionally discord_admin) and should know
# the IDs it can call them with; without one it really is
# limited to reading/replying via the gateway.
if (os.environ.get("DISCORD_BOT_TOKEN") or "").strip():
src = context.source
id_lines = ["", "**Discord IDs (for the `discord` / `discord_admin` tools):**"]
if src.guild_id:
id_lines.append(f" - Guild: `{src.guild_id}`")
if src.thread_id and src.parent_chat_id:
id_lines.append(f" - Parent channel: `{src.parent_chat_id}`")
id_lines.append(f" - Thread: `{src.thread_id}` (use as `channel_id` for fetch_messages etc.)")
else:
id_lines.append(f" - Channel: `{src.chat_id}`")
if src.message_id:
id_lines.append(f" - Triggering message: `{src.message_id}`")
lines.extend(id_lines)
else:
lines.append("")
lines.append(
"**Platform notes:** You are running inside Discord. "
"You do NOT have access to Discord-specific APIs — you cannot search "
"channel history, pin messages, manage roles, or list server members. "
"Do not promise to perform these actions. If the user asks, explain "
"that you can only read messages sent directly to you and respond."
)
# Connected platforms
platforms_list = ["local (files on this machine)"]
@@ -1147,6 +1179,10 @@ class SessionStore:
tool_name=message.get("tool_name"),
tool_calls=message.get("tool_calls"),
tool_call_id=message.get("tool_call_id"),
reasoning=message.get("reasoning") if message.get("role") == "assistant" else None,
reasoning_content=message.get("reasoning_content") if message.get("role") == "assistant" else None,
reasoning_details=message.get("reasoning_details") if message.get("role") == "assistant" else None,
codex_reasoning_items=message.get("codex_reasoning_items") if message.get("role") == "assistant" else None,
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
@@ -1176,6 +1212,7 @@ class SessionStore:
tool_calls=msg.get("tool_calls"),
tool_call_id=msg.get("tool_call_id"),
reasoning=msg.get("reasoning") if role == "assistant" else None,
reasoning_content=msg.get("reasoning_content") if role == "assistant" else None,
reasoning_details=msg.get("reasoning_details") if role == "assistant" else None,
codex_reasoning_items=msg.get("codex_reasoning_items") if role == "assistant" else None,
)
+208 -30
View File
@@ -22,11 +22,18 @@ from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Any, Optional
if sys.platform == "win32":
import msvcrt
else:
import fcntl
_GATEWAY_KIND = "hermes-gateway"
_RUNTIME_STATUS_FILE = "gateway_state.json"
_LOCKS_DIRNAME = "gateway-locks"
_IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
_GATEWAY_LOCK_FILENAME = "gateway.lock"
_gateway_lock_handle = None
def _get_pid_path() -> Path:
@@ -35,6 +42,14 @@ def _get_pid_path() -> Path:
return home / "gateway.pid"
def _get_gateway_lock_path(pid_path: Optional[Path] = None) -> Path:
"""Return the path to the runtime gateway lock file."""
if pid_path is not None:
return pid_path.with_name(_GATEWAY_LOCK_FILENAME)
home = get_hermes_home()
return home / _GATEWAY_LOCK_FILENAME
def _get_runtime_status_path() -> Path:
"""Return the persisted runtime health/status file path."""
return _get_pid_path().with_name(_RUNTIME_STATUS_FILE)
@@ -98,6 +113,11 @@ def _get_process_start_time(pid: int) -> Optional[int]:
return None
def get_process_start_time(pid: int) -> Optional[int]:
"""Public wrapper for retrieving a process start time when available."""
return _get_process_start_time(pid)
def _read_process_cmdline(pid: int) -> Optional[str]:
"""Return the process command line as a space-separated string."""
cmdline_path = Path(f"/proc/{pid}/cmdline")
@@ -121,6 +141,7 @@ def _looks_like_gateway_process(pid: int) -> bool:
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
"hermes gateway",
"hermes-gateway",
"gateway/run.py",
)
return any(pattern in cmdline for pattern in patterns)
@@ -212,16 +233,135 @@ def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
return None
def _read_gateway_lock_record(lock_path: Optional[Path] = None) -> Optional[dict[str, Any]]:
return _read_pid_record(lock_path or _get_gateway_lock_path())
def _pid_from_record(record: Optional[dict[str, Any]]) -> Optional[int]:
if not record:
return None
try:
return int(record["pid"])
except (KeyError, TypeError, ValueError):
return None
def _cleanup_invalid_pid_path(pid_path: Path, *, cleanup_stale: bool) -> None:
"""Delete a stale gateway PID file (and its sibling lock metadata).
Called from ``get_running_pid()`` after the runtime lock has already been
confirmed inactive, so the on-disk metadata is known to belong to a dead
process. Unlike ``remove_pid_file()`` (which defensively refuses to delete
a PID file whose ``pid`` field differs from ``os.getpid()`` to protect
``--replace`` handoffs), this path force-unlinks both files so the next
startup sees a clean slate.
"""
if not cleanup_stale:
return
try:
if pid_path == _get_pid_path():
remove_pid_file()
else:
pid_path.unlink(missing_ok=True)
pid_path.unlink(missing_ok=True)
except Exception:
pass
try:
_get_gateway_lock_path(pid_path).unlink(missing_ok=True)
except Exception:
pass
def _write_gateway_lock_record(handle) -> None:
handle.seek(0)
handle.truncate()
json.dump(_build_pid_record(), handle)
handle.flush()
try:
os.fsync(handle.fileno())
except OSError:
pass
def _try_acquire_file_lock(handle) -> bool:
try:
if _IS_WINDOWS:
handle.seek(0, os.SEEK_END)
if handle.tell() == 0:
handle.write("\n")
handle.flush()
handle.seek(0)
msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
return True
except (BlockingIOError, OSError):
return False
def _release_file_lock(handle) -> None:
try:
if _IS_WINDOWS:
handle.seek(0)
msvcrt.locking(handle.fileno(), msvcrt.LK_UNLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
except OSError:
pass
def acquire_gateway_runtime_lock() -> bool:
"""Claim the cross-process runtime lock for the gateway.
Unlike the PID file, the lock is owned by the live process itself. If the
process dies abruptly, the OS releases the lock automatically.
"""
global _gateway_lock_handle
if _gateway_lock_handle is not None:
return True
path = _get_gateway_lock_path()
path.parent.mkdir(parents=True, exist_ok=True)
handle = open(path, "a+", encoding="utf-8")
if not _try_acquire_file_lock(handle):
handle.close()
return False
_write_gateway_lock_record(handle)
_gateway_lock_handle = handle
return True
def release_gateway_runtime_lock() -> None:
"""Release the gateway runtime lock when owned by this process."""
global _gateway_lock_handle
handle = _gateway_lock_handle
if handle is None:
return
_gateway_lock_handle = None
_release_file_lock(handle)
try:
handle.close()
except OSError:
pass
def is_gateway_runtime_lock_active(lock_path: Optional[Path] = None) -> bool:
"""Return True when some process currently owns the gateway runtime lock."""
global _gateway_lock_handle
resolved_lock_path = lock_path or _get_gateway_lock_path()
if _gateway_lock_handle is not None and resolved_lock_path == _get_gateway_lock_path():
return True
if not resolved_lock_path.exists():
return False
handle = open(resolved_lock_path, "a+", encoding="utf-8")
try:
if _try_acquire_file_lock(handle):
_release_file_lock(handle)
return False
return True
finally:
try:
handle.close()
except OSError:
pass
def write_pid_file() -> None:
@@ -361,7 +501,8 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
if not stale:
try:
os.kill(existing_pid, 0)
except (ProcessLookupError, PermissionError):
except (ProcessLookupError, PermissionError, OSError):
# Windows raises OSError with WinError 87 for invalid pid check
stale = True
else:
current_start = _get_process_start_time(existing_pid)
@@ -426,17 +567,43 @@ def release_scoped_lock(scope: str, identity: str) -> None:
pass
def release_all_scoped_locks() -> int:
"""Remove all scoped lock files in the lock directory.
def release_all_scoped_locks(
*,
owner_pid: Optional[int] = None,
owner_start_time: Optional[int] = None,
) -> int:
"""Remove scoped lock files in the lock directory.
Called during --replace to clean up stale locks left by stopped/killed
gateway processes that did not release their locks gracefully.
gateway processes that did not release their locks gracefully. When an
``owner_pid`` is provided, only lock records belonging to that gateway
process are removed. ``owner_start_time`` further narrows the match to
protect against PID reuse.
When no owner is provided, preserves the legacy behavior and removes every
scoped lock file in the directory.
Returns the number of lock files removed.
"""
lock_dir = _get_lock_dir()
removed = 0
if lock_dir.exists():
for lock_file in lock_dir.glob("*.lock"):
if owner_pid is not None:
record = _read_json_file(lock_file)
if not isinstance(record, dict):
continue
try:
record_pid = int(record.get("pid"))
except (TypeError, ValueError):
continue
if record_pid != owner_pid:
continue
if (
owner_start_time is not None
and record.get("start_time") != owner_start_time
):
continue
try:
lock_file.unlink(missing_ok=True)
removed += 1
@@ -583,35 +750,46 @@ def get_running_pid(
Cleans up stale PID files automatically.
"""
resolved_pid_path = pid_path or _get_pid_path()
record = _read_pid_record(resolved_pid_path)
if not record:
resolved_lock_path = _get_gateway_lock_path(resolved_pid_path)
lock_active = is_gateway_runtime_lock_active(resolved_lock_path)
if not lock_active:
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
try:
pid = int(record["pid"])
except (KeyError, TypeError, ValueError):
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
primary_record = _read_pid_record(resolved_pid_path)
fallback_record = _read_gateway_lock_record(resolved_lock_path)
try:
os.kill(pid, 0) # signal 0 = existence check, no actual signal sent
except (ProcessLookupError, PermissionError):
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
for record in (primary_record, fallback_record):
pid = _pid_from_record(record)
if pid is None:
continue
recorded_start = record.get("start_time")
current_start = _get_process_start_time(pid)
if recorded_start is not None and current_start is not None and current_start != recorded_start:
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
try:
os.kill(pid, 0) # signal 0 = existence check, no actual signal sent
except ProcessLookupError:
continue
except PermissionError:
# The process exists but belongs to another user/service scope.
# With the runtime lock still held, prefer keeping it visible
# rather than deleting the PID file as "stale".
if _record_looks_like_gateway(record):
return pid
continue
except OSError:
# Windows raises OSError with WinError 87 for an invalid pid
# (process is definitely gone). Treat as "process doesn't exist".
continue
if not _looks_like_gateway_process(pid):
if not _record_looks_like_gateway(record):
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
recorded_start = record.get("start_time")
current_start = _get_process_start_time(pid)
if recorded_start is not None and current_start is not None and current_start != recorded_start:
continue
return pid
if _looks_like_gateway_process(pid) or _record_looks_like_gateway(record):
return pid
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
def is_gateway_running(
+2 -2
View File
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.10.0"
__release_date__ = "2026.4.16"
__version__ = "0.11.0"
__release_date__ = "2026.4.23"
+45 -7
View File
@@ -72,6 +72,8 @@ DEFAULT_QWEN_BASE_URL = "https://portal.qwen.ai/v1"
DEFAULT_GITHUB_MODELS_BASE_URL = "https://api.githubcopilot.com"
DEFAULT_COPILOT_ACP_BASE_URL = "acp://copilot"
DEFAULT_OLLAMA_CLOUD_BASE_URL = "https://ollama.com/v1"
STEPFUN_STEP_PLAN_INTL_BASE_URL = "https://api.stepfun.ai/step_plan/v1"
STEPFUN_STEP_PLAN_CN_BASE_URL = "https://api.stepfun.com/step_plan/v1"
CODEX_OAUTH_CLIENT_ID = "app_EMoamEEZ73f0CkXaXp7hrann"
CODEX_OAUTH_TOKEN_URL = "https://auth.openai.com/oauth/token"
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120
@@ -168,8 +170,11 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
id="kimi-coding",
name="Kimi / Moonshot",
auth_type="api_key",
# Legacy platform.moonshot.ai keys use this endpoint (OpenAI-compat).
# sk-kimi- (Kimi Code) keys are auto-redirected to api.kimi.com/coding
# by _resolve_kimi_base_url() below.
inference_base_url="https://api.moonshot.ai/v1",
api_key_env_vars=("KIMI_API_KEY",),
api_key_env_vars=("KIMI_API_KEY", "KIMI_CODING_API_KEY"),
base_url_env_var="KIMI_BASE_URL",
),
"kimi-coding-cn": ProviderConfig(
@@ -179,6 +184,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
inference_base_url="https://api.moonshot.cn/v1",
api_key_env_vars=("KIMI_CN_API_KEY",),
),
"stepfun": ProviderConfig(
id="stepfun",
name="StepFun Step Plan",
auth_type="api_key",
inference_base_url=STEPFUN_STEP_PLAN_INTL_BASE_URL,
api_key_env_vars=("STEPFUN_API_KEY",),
base_url_env_var="STEPFUN_BASE_URL",
),
"arcee": ProviderConfig(
id="arcee",
name="Arcee AI",
@@ -201,6 +214,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="api_key",
inference_base_url="https://api.anthropic.com",
api_key_env_vars=("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
base_url_env_var="ANTHROPIC_BASE_URL",
),
"alibaba": ProviderConfig(
id="alibaba",
@@ -340,10 +354,16 @@ def get_anthropic_key() -> str:
# =============================================================================
# Kimi Code (kimi.com/code) issues keys prefixed "sk-kimi-" that only work
# on api.kimi.com/coding/v1. Legacy keys from platform.moonshot.ai work on
# api.moonshot.ai/v1 (the default). Auto-detect when user hasn't set
# on api.kimi.com/coding. Legacy keys from platform.moonshot.ai work on
# api.moonshot.ai/v1 (the old default). Auto-detect when user hasn't set
# KIMI_BASE_URL explicitly.
KIMI_CODE_BASE_URL = "https://api.kimi.com/coding/v1"
#
# Note: the base URL intentionally has NO /v1 suffix. The /coding endpoint
# speaks the Anthropic Messages protocol, and the anthropic SDK appends
# "/v1/messages" internally — so "/coding" + SDK suffix → "/coding/v1/messages"
# (the correct target). Using "/coding/v1" here would produce
# "/coding/v1/v1/messages" (a 404).
KIMI_CODE_BASE_URL = "https://api.kimi.com/coding"
def _resolve_kimi_base_url(api_key: str, default_url: str, env_override: str) -> str:
@@ -599,7 +619,25 @@ def _oauth_trace(event: str, *, sequence_id: Optional[str] = None, **fields: Any
# =============================================================================
def _auth_file_path() -> Path:
return get_hermes_home() / "auth.json"
path = get_hermes_home() / "auth.json"
# Seat belt: if pytest is running and HERMES_HOME resolves to the real
# user's auth store, refuse rather than silently corrupt it. This catches
# tests that forgot to monkeypatch HERMES_HOME, tests invoked without the
# hermetic conftest, or sandbox escapes via threads/subprocesses. In
# production (no PYTEST_CURRENT_TEST) this is a single dict lookup.
if os.environ.get("PYTEST_CURRENT_TEST"):
real_home_auth = (Path.home() / ".hermes" / "auth.json").resolve(strict=False)
try:
resolved = path.resolve(strict=False)
except Exception:
resolved = path
if resolved == real_home_auth:
raise RuntimeError(
f"Refusing to touch real user auth store during test run: {path}. "
"Set HERMES_HOME to a tmp_path in your test fixture, or run "
"via scripts/run_tests.sh for hermetic CI-parity env."
)
return path
def _auth_lock_path() -> Path:
@@ -983,6 +1021,7 @@ def resolve_provider(
"x-ai": "xai", "x.ai": "xai", "grok": "xai",
"kimi": "kimi-coding", "kimi-for-coding": "kimi-coding", "moonshot": "kimi-coding",
"kimi-cn": "kimi-coding-cn", "moonshot-cn": "kimi-coding-cn",
"step": "stepfun", "stepfun-coding-plan": "stepfun",
"arcee-ai": "arcee", "arceeai": "arcee",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
"claude": "anthropic", "claude-code": "anthropic",
@@ -3375,7 +3414,7 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
)
from hermes_cli.models import (
_PROVIDER_MODELS, get_pricing_for_provider, filter_nous_free_models,
_PROVIDER_MODELS, get_pricing_for_provider,
check_nous_free_tier, partition_nous_models_by_tier,
)
model_ids = _PROVIDER_MODELS.get("nous", [])
@@ -3384,7 +3423,6 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
unavailable_models: list = []
if model_ids:
pricing = get_pricing_for_provider("nous")
model_ids = filter_nous_free_models(model_ids, pricing)
free_tier = check_nous_free_tier()
if free_tier:
model_ids, unavailable_models = partition_nous_models_by_tier(
+54 -1
View File
@@ -238,6 +238,52 @@ def get_git_banner_state(repo_dir: Optional[Path] = None) -> Optional[dict]:
return {"upstream": upstream, "local": local, "ahead": max(ahead, 0)}
_RELEASE_URL_BASE = "https://github.com/NousResearch/hermes-agent/releases/tag"
_latest_release_cache: Optional[tuple] = None # (tag, url) once resolved
def get_latest_release_tag(repo_dir: Optional[Path] = None) -> Optional[tuple]:
"""Return ``(tag, release_url)`` for the latest git tag, or None.
Local-only runs ``git describe --tags --abbrev=0`` against the
Hermes checkout. Cached per-process. Release URL always points at the
canonical NousResearch/hermes-agent repo (forks don't get a link).
"""
global _latest_release_cache
if _latest_release_cache is not None:
return _latest_release_cache or None
repo_dir = repo_dir or _resolve_repo_dir()
if repo_dir is None:
_latest_release_cache = () # falsy sentinel — skip future lookups
return None
try:
result = subprocess.run(
["git", "describe", "--tags", "--abbrev=0"],
capture_output=True,
text=True,
timeout=3,
cwd=str(repo_dir),
)
except Exception:
_latest_release_cache = ()
return None
if result.returncode != 0:
_latest_release_cache = ()
return None
tag = (result.stdout or "").strip()
if not tag:
_latest_release_cache = ()
return None
url = f"{_RELEASE_URL_BASE}/{tag}"
_latest_release_cache = (tag, url)
return _latest_release_cache
def format_banner_version_label() -> str:
"""Return the version label shown in the startup banner title."""
base = f"Hermes Agent v{VERSION} ({RELEASE_DATE})"
@@ -519,9 +565,16 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
agent_name = _skin_branding("agent_name", "Hermes Agent")
title_color = _skin_color("banner_title", "#FFD700")
border_color = _skin_color("banner_border", "#CD7F32")
version_label = format_banner_version_label()
release_info = get_latest_release_tag()
if release_info:
_tag, _url = release_info
title_markup = f"[bold {title_color}][link={_url}]{version_label}[/link][/]"
else:
title_markup = f"[bold {title_color}]{version_label}[/]"
outer_panel = Panel(
layout_table,
title=f"[bold {title_color}]{format_banner_version_label()}[/]",
title=title_markup,
border_style=border_color,
padding=(0, 2),
)
+1 -1
View File
@@ -249,7 +249,7 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
state_path = child / state_name
if state_path.exists():
kind = "directory" if state_path.is_dir() else "file"
rel = state_path.relative_to(source_dir)
rel = state_path.relative_to(source_dir).as_posix()
findings.append((state_path, f"Workspace {kind}: {rel}"))
return findings
+2
View File
@@ -12,6 +12,7 @@ import os
logger = logging.getLogger(__name__)
DEFAULT_CODEX_MODELS: List[str] = [
"gpt-5.5",
"gpt-5.4-mini",
"gpt-5.4",
"gpt-5.3-codex",
@@ -21,6 +22,7 @@ DEFAULT_CODEX_MODELS: List[str] = [
]
_FORWARD_COMPAT_TEMPLATE_MODELS: List[tuple[str, tuple[str, ...]]] = [
("gpt-5.5", ("gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex")),
("gpt-5.4-mini", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.4", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.3-codex", ("gpt-5.2-codex",)),
+65
View File
@@ -260,6 +260,26 @@ GATEWAY_KNOWN_COMMANDS: frozenset[str] = frozenset(
)
def is_gateway_known_command(name: str | None) -> bool:
"""Return True if ``name`` resolves to a gateway-dispatchable slash command.
This covers both built-in commands (``GATEWAY_KNOWN_COMMANDS`` derived
from ``COMMAND_REGISTRY``) and plugin-registered commands, which are
looked up lazily so importing this module never forces plugin
discovery. Gateway code uses this to decide whether to emit
``command:<name>`` hooks plugin commands get the same lifecycle
events as built-ins.
"""
if not name:
return False
if name in GATEWAY_KNOWN_COMMANDS:
return True
for plugin_name, _description, _args_hint in _iter_plugin_command_entries():
if plugin_name == name:
return True
return False
# Commands with explicit Level-2 running-agent handlers in gateway/run.py.
# Listed here for introspection / tests; semantically a subset of
# "all resolvable commands" — which is the real bypass set (see
@@ -371,12 +391,47 @@ def gateway_help_lines() -> list[str]:
return lines
def _iter_plugin_command_entries() -> list[tuple[str, str, str]]:
"""Yield (name, description, args_hint) tuples for all plugin slash commands.
Plugin commands are registered via
:func:`hermes_cli.plugins.PluginContext.register_command`. They behave
like ``CommandDef`` entries for gateway surfacing: they appear in the
Telegram command menu, in Slack's ``/hermes`` subcommand mapping, and
(via :func:`gateway.platforms.discord._register_slash_commands`) in
Discord's native slash command picker.
Lookup is lazy so importing this module never forces plugin discovery
(which can trigger filesystem scans and environment-dependent
behavior).
"""
try:
from hermes_cli.plugins import get_plugin_commands
except Exception:
return []
try:
commands = get_plugin_commands() or {}
except Exception:
return []
entries: list[tuple[str, str, str]] = []
for name, meta in commands.items():
if not isinstance(name, str) or not isinstance(meta, dict):
continue
description = str(meta.get("description") or f"Run /{name}")
args_hint = str(meta.get("args_hint") or "").strip()
entries.append((name, description, args_hint))
return entries
def telegram_bot_commands() -> list[tuple[str, str]]:
"""Return (command_name, description) pairs for Telegram setMyCommands.
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
Plugin-registered slash commands are included so plugins get native
autocomplete in Telegram without touching core code.
"""
overrides = _resolve_config_gates()
result: list[tuple[str, str]] = []
@@ -386,6 +441,10 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
tg_name = _sanitize_telegram_name(cmd.name)
if tg_name:
result.append((tg_name, cmd.description))
for name, description, _args_hint in _iter_plugin_command_entries():
tg_name = _sanitize_telegram_name(name)
if tg_name:
result.append((tg_name, description))
return result
@@ -750,6 +809,9 @@ def slack_subcommand_map() -> dict[str, str]:
Maps both canonical names and aliases so /hermes bg do stuff works
the same as /hermes background do stuff.
Plugin-registered slash commands are included so ``/hermes <plugin-cmd>``
routes through the plugin handler.
"""
overrides = _resolve_config_gates()
mapping: dict[str, str] = {}
@@ -759,6 +821,9 @@ def slack_subcommand_map() -> dict[str, str]:
mapping[cmd.name] = f"/{cmd.name}"
for alias in cmd.aliases:
mapping[alias] = f"/{alias}"
for name, _description, _args_hint in _iter_plugin_command_entries():
if name not in mapping:
mapping[name] = f"/{name}"
return mapping
+136 -12
View File
@@ -361,6 +361,15 @@ DEFAULT_CONFIG = {
# to finish, then interrupts any remaining runs after the timeout.
# 0 = no drain, interrupt immediately.
"restart_drain_timeout": 60,
# Max app-level retry attempts for API errors (connection drops,
# provider timeouts, 5xx, etc.) before the agent surfaces the
# failure. The OpenAI SDK already does its own low-level retries
# (max_retries=2 default) for transient network errors; this is
# the Hermes-level retry loop that wraps the whole call. Lower
# this to 1 if you use fallback providers and want fast failover
# on flaky primaries; raise it if you prefer to tolerate longer
# provider hiccups on a single provider.
"api_max_retries": 3,
"service_tier": "",
# Tool-use enforcement: injects system prompt guidance that tells the
# model to actually call tools instead of describing intended actions.
@@ -375,7 +384,11 @@ DEFAULT_CONFIG = {
# Periodic "still working" notification interval (seconds).
# Sends a status message every N seconds so the user knows the
# agent hasn't died during long tasks. 0 = disable notifications.
"gateway_notify_interval": 600,
# Lower values mean faster feedback on slow tasks but more chat
# noise; 180s is a compromise that catches spinning weak-model runs
# (60+ tool iterations with tiny output) before users assume the
# bot is dead and /restart.
"gateway_notify_interval": 180,
},
"terminal": {
@@ -394,17 +407,23 @@ DEFAULT_CONFIG = {
# (bash doesn't source bashrc in non-interactive login mode) or
# zsh-specific files like ``~/.zshrc`` / ``~/.zprofile``.
# Paths support ``~`` / ``${VAR}``. Missing files are silently
# skipped. When empty, Hermes auto-appends ``~/.bashrc`` if the
# skipped. When empty, Hermes auto-sources ``~/.profile``,
# ``~/.bash_profile``, and ``~/.bashrc`` (in that order) if the
# snapshot shell is bash (this is the ``auto_source_bashrc``
# behaviour — disable with that key if you want strict login-only
# semantics).
"shell_init_files": [],
# When true (default), Hermes sources ``~/.bashrc`` in the login
# shell used to build the environment snapshot. This captures
# PATH additions, shell functions, and aliases defined in the
# user's bashrc — which a plain ``bash -l -c`` would otherwise
# miss because bash skips bashrc in non-interactive login mode.
# Turn this off if you have a bashrc that misbehaves when sourced
# When true (default), Hermes sources the user's shell rc files
# (``~/.profile``, ``~/.bash_profile``, ``~/.bashrc``) in the
# login shell used to build the environment snapshot. This
# captures PATH additions, shell functions, and aliases — which a
# plain ``bash -l -c`` would otherwise miss because bash skips
# bashrc in non-interactive login mode, and because a default
# Debian/Ubuntu ``~/.bashrc`` short-circuits on non-interactive
# sources. ``~/.profile`` and ``~/.bash_profile`` are tried first
# because ``n`` / ``nvm`` / ``asdf`` installers typically write
# their PATH exports there without an interactivity guard. Turn
# this off if your rc files misbehave when sourced
# non-interactively (e.g. one that hard-exits on TTY checks).
"auto_source_bashrc": True,
"docker_image": "nikolaik/python-nodejs:python3.11-nodejs20",
@@ -447,6 +466,12 @@ DEFAULT_CONFIG = {
"record_sessions": False, # Auto-record browser sessions as WebM videos
"allow_private_urls": False, # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
"cdp_url": "", # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
# CDP supervisor — dialog + frame detection via a persistent WebSocket.
# Active only when a CDP-capable backend is attached (Browserbase or
# local Chrome via /browser connect). See
# website/docs/developer-guide/browser-supervisor.md.
"dialog_policy": "must_respond", # must_respond | auto_dismiss | auto_accept
"dialog_timeout_s": 300, # Safety auto-dismiss after N seconds under must_respond
"camofox": {
# When true, Hermes sends a stable profile-scoped userId to Camofox
# so the server maps it to a persistent Firefox profile automatically.
@@ -467,7 +492,27 @@ DEFAULT_CONFIG = {
# exceed this are rejected with guidance to use offset+limit.
# 100K chars ≈ 2535K tokens across typical tokenisers.
"file_read_max_chars": 100_000,
# Tool-output truncation thresholds. When terminal output or a
# single read_file page exceeds these limits, Hermes truncates the
# payload sent to the model (keeping head + tail for terminal,
# enforcing pagination for read_file). Tuning these trades context
# footprint against how much raw output the model can see in one
# shot. Ported from anomalyco/opencode PR #23770.
#
# - max_bytes: terminal_tool output cap, in chars
# (default 50_000 ≈ 12-15K tokens).
# - max_lines: read_file pagination cap — the maximum `limit`
# a single read_file call can request before
# being clamped (default 2000).
# - max_line_length: per-line cap applied when read_file emits a
# line-numbered view (default 2000 chars).
"tool_output": {
"max_bytes": 50_000,
"max_lines": 2000,
"max_line_length": 2000,
},
"compression": {
"enabled": True,
"threshold": 0.50, # compress when context usage exceeds this ratio
@@ -613,6 +658,10 @@ DEFAULT_CONFIG = {
},
# Text-to-speech configuration
# Each provider supports an optional `max_text_length:` override for the
# per-request input-character cap. Omit it to use the provider's documented
# limit (OpenAI 4096, xAI 15000, MiniMax 10000, ElevenLabs 5k-40k model-aware,
# Gemini 5000, Edge 5000, Mistral 4000, NeuTTS/KittenTTS 2000).
"tts": {
"provider": "edge", # "edge" (free) | "elevenlabs" (premium) | "openai" | "xai" | "minimax" | "mistral" | "neutts" (local)
"edge": {
@@ -708,8 +757,18 @@ DEFAULT_CONFIG = {
"provider": "", # e.g. "openrouter" (empty = inherit parent provider + credentials)
"base_url": "", # direct OpenAI-compatible endpoint for subagents
"api_key": "", # API key for delegation.base_url (falls back to OPENAI_API_KEY)
# When delegate_task narrows child toolsets explicitly, preserve any
# MCP toolsets the parent already has enabled. On by default so
# narrowing (e.g. toolsets=["web","browser"]) expresses "I want these
# extras" without silently stripping MCP tools the parent already has.
# Set to false for strict intersection.
"inherit_mcp_toolsets": True,
"max_iterations": 50, # per-subagent iteration cap (each subagent gets its own budget,
# independent of the parent's max_iterations)
"child_timeout_seconds": 600, # wall-clock timeout for each child agent (floor 30s,
# no ceiling). High-reasoning models on large tasks
# (e.g. gpt-5.5 xhigh, opus-4.6) need generous budgets;
# raise if children time out before producing output.
"reasoning_effort": "", # reasoning effort for subagents: "xhigh", "high", "medium",
# "low", "minimal", "none" (empty = inherit parent's level)
"max_concurrent_children": 3, # max parallel children per batch; floor of 1 enforced, no ceiling
@@ -744,6 +803,17 @@ DEFAULT_CONFIG = {
"inline_shell": False,
# Timeout (seconds) for each !`cmd` snippet when inline_shell is on.
"inline_shell_timeout": 10,
# Run the keyword/pattern security scanner on skills the agent
# writes via skill_manage (create/edit/patch). Off by default
# because the agent can already execute the same code paths via
# terminal() with no gate, so the scan adds friction (blocks
# skills that mention risky keywords in prose) without meaningful
# security. Turn on if you want the belt-and-suspenders — a
# dangerous verdict will then surface as a tool error to the
# agent, which can retry with the flagged content removed.
# External hub installs (trusted/community sources) are always
# scanned regardless of this setting.
"guard_agent_created": False,
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -763,7 +833,7 @@ DEFAULT_CONFIG = {
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
"channel_prompts": {}, # Per-channel ephemeral system prompts (forum parents apply to child threads)
# discord_server tool: restrict which actions the agent may call.
# discord / discord_admin tools: restrict which actions the agent may call.
# Default (empty) = all actions allowed (subject to bot privileged intents).
# Accepts comma-separated string ("list_guilds,list_channels,fetch_messages")
# or YAML list. Unknown names are dropped with a warning at load time.
@@ -836,6 +906,7 @@ DEFAULT_CONFIG = {
# Pre-exec security scanning via tirith
"security": {
"allow_private_urls": False, # Allow requests to private/internal IPs (for OpenWrt, proxies, VPNs)
"redact_secrets": True,
"tirith_enabled": True,
"tirith_path": "tirith",
@@ -889,6 +960,34 @@ DEFAULT_CONFIG = {
"force_ipv4": False,
},
# Session storage — controls automatic cleanup of ~/.hermes/state.db.
# state.db accumulates every session, message, tool call, and FTS5 index
# entry forever. Without auto-pruning, a heavy user (gateway + cron)
# reports 384MB+ databases with 68K+ messages, which slows down FTS5
# inserts, /resume listing, and insights queries.
"sessions": {
# When true, prune ended sessions older than retention_days once
# per (roughly) min_interval_hours at CLI/gateway/cron startup.
# Only touches ended sessions — active sessions are always preserved.
# Default false: session history is valuable for search recall, and
# silently deleting it could surprise users. Opt in explicitly.
"auto_prune": False,
# How many days of ended-session history to keep. Matches the
# default of ``hermes sessions prune``.
"retention_days": 90,
# VACUUM after a prune that actually deleted rows. SQLite does not
# reclaim disk space on DELETE — freed pages are just reused on
# subsequent INSERTs — so without VACUUM the file stays bloated
# even after pruning. VACUUM blocks writes for a few seconds per
# 100MB, so it only runs at startup, and only when prune deleted
# ≥1 session.
"vacuum_after_prune": True,
# Minimum hours between auto-maintenance runs (avoids repeating
# the sweep on every CLI invocation). Tracked via state_meta in
# state.db itself, so it's shared across all processes.
"min_interval_hours": 24,
},
# Config schema version - bump this when adding new required fields
"_config_version": 22,
}
@@ -1046,6 +1145,22 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"STEPFUN_API_KEY": {
"description": "StepFun Step Plan API key",
"prompt": "StepFun Step Plan API key",
"url": "https://platform.stepfun.com/",
"password": True,
"category": "provider",
"advanced": True,
},
"STEPFUN_BASE_URL": {
"description": "StepFun Step Plan base URL override",
"prompt": "StepFun Step Plan base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"ARCEEAI_API_KEY": {
"description": "Arcee AI API key",
"prompt": "Arcee AI API key",
@@ -1219,7 +1334,7 @@ OPTIONAL_ENV_VARS = {
"advanced": True,
},
"XIAOMI_API_KEY": {
"description": "Xiaomi MiMo API key for MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash)",
"description": "Xiaomi MiMo API key for MiMo models (mimo-v2.5-pro, mimo-v2.5, mimo-v2-pro, mimo-v2-omni, mimo-v2-flash)",
"prompt": "Xiaomi MiMo API Key",
"url": "https://platform.xiaomimimo.com",
"password": True,
@@ -2000,6 +2115,14 @@ def _normalize_custom_provider_entry(
models = entry.get("models")
if isinstance(models, dict) and models:
normalized["models"] = models
elif isinstance(models, list) and models:
# Hand-edited configs (and older Hermes versions) write ``models`` as
# a plain list of model ids. Preserve them by converting to the dict
# shape downstream code expects; otherwise normalize silently drops
# the list and /model shows the provider with (0) models.
normalized["models"] = {
str(m): {} for m in models if isinstance(m, str) and m.strip()
}
context_length = entry.get("context_length")
if isinstance(context_length, int) and context_length > 0:
@@ -2098,6 +2221,7 @@ _KNOWN_ROOT_KEYS = {
"fallback_providers", "credential_pool_strategies", "toolsets",
"agent", "terminal", "display", "compression", "delegation",
"auxiliary", "custom_providers", "context", "memory", "gateway",
"sessions",
}
# Valid fields inside a custom_providers list entry
@@ -3113,7 +3237,7 @@ def save_config(config: Dict[str, Any]):
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
if not fb or not (fb.get("provider") and fb.get("model")):
if not fb or not isinstance(fb, dict) or not (fb.get("provider") and fb.get("model")):
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
+128 -48
View File
@@ -13,6 +13,7 @@ import time
import urllib.error
import urllib.parse
import urllib.request
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
@@ -147,6 +148,14 @@ def _sweep_expired_pastes(now: Optional[float] = None) -> tuple[int, int]:
return (deleted, len(remaining))
def _best_effort_sweep_expired_pastes() -> None:
"""Attempt pending-paste cleanup without letting /debug fail offline."""
try:
_sweep_expired_pastes()
except Exception:
pass
# ---------------------------------------------------------------------------
# Privacy / delete helpers
# ---------------------------------------------------------------------------
@@ -314,72 +323,128 @@ def upload_to_pastebin(content: str, expiry_days: int = 7) -> str:
# Log file reading
# ---------------------------------------------------------------------------
def _resolve_log_path(log_name: str) -> Optional[Path]:
"""Find the log file for *log_name*, falling back to the .1 rotation.
Returns the path if found, or None.
"""
@dataclass
class LogSnapshot:
"""Single-read snapshot of a log file used by debug-share."""
path: Optional[Path]
tail_text: str
full_text: Optional[str]
def _primary_log_path(log_name: str) -> Optional[Path]:
"""Where *log_name* would live if present. Doesn't check existence."""
from hermes_cli.logs import LOG_FILES
filename = LOG_FILES.get(log_name)
if not filename:
return (get_hermes_home() / "logs" / filename) if filename else None
def _resolve_log_path(log_name: str) -> Optional[Path]:
"""Find the log file for *log_name*, falling back to the .1 rotation.
Returns the first non-empty candidate (primary, then .1), or None.
Callers distinguish 'empty primary' from 'truly missing' via
:func:`_primary_log_path`.
"""
primary = _primary_log_path(log_name)
if primary is None:
return None
log_dir = get_hermes_home() / "logs"
primary = log_dir / filename
if primary.exists() and primary.stat().st_size > 0:
return primary
# Fall back to the most recent rotated file (.1).
rotated = log_dir / f"{filename}.1"
rotated = primary.parent / f"{primary.name}.1"
if rotated.exists() and rotated.stat().st_size > 0:
return rotated
return None
def _read_log_tail(log_name: str, num_lines: int) -> str:
"""Read the last *num_lines* from a log file, or return a placeholder."""
from hermes_cli.logs import _read_last_n_lines
def _capture_log_snapshot(
log_name: str,
*,
tail_lines: int,
max_bytes: int = _MAX_LOG_BYTES,
) -> LogSnapshot:
"""Capture a log once and derive summary/full-log views from it.
log_path = _resolve_log_path(log_name)
if log_path is None:
return "(file not found)"
try:
lines = _read_last_n_lines(log_path, num_lines)
return "".join(lines).rstrip("\n")
except Exception as exc:
return f"(error reading: {exc})"
def _read_full_log(log_name: str, max_bytes: int = _MAX_LOG_BYTES) -> Optional[str]:
"""Read a log file for standalone upload.
Returns the file content (last *max_bytes* if truncated), or None if the
file doesn't exist or is empty.
The report tail and standalone log upload must come from the same file
snapshot. Otherwise a rotation/truncate between reads can make the report
look newer than the uploaded ``agent.log`` paste.
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
return None
primary = _primary_log_path(log_name)
tail = "(file empty)" if primary and primary.exists() else "(file not found)"
return LogSnapshot(path=None, tail_text=tail, full_text=None)
try:
size = log_path.stat().st_size
if size == 0:
return None
# race: file was truncated between _resolve_log_path and stat
return LogSnapshot(path=log_path, tail_text="(file empty)", full_text=None)
if size <= max_bytes:
return log_path.read_text(encoding="utf-8", errors="replace")
# File is larger than max_bytes — read the tail.
with open(log_path, "rb") as f:
f.seek(size - max_bytes)
# Skip partial line at the seek point.
f.readline()
content = f.read().decode("utf-8", errors="replace")
return f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{content}"
except Exception:
return None
if size <= max_bytes:
raw = f.read()
truncated = False
else:
# Read from the end until we have enough bytes for the
# standalone upload and enough newline context to render the
# summary tail from the same snapshot.
chunk_size = 8192
pos = size
chunks: list[bytes] = []
total = 0
newline_count = 0
while pos > 0 and (total < max_bytes or newline_count <= tail_lines + 1) and total < max_bytes * 2:
read_size = min(chunk_size, pos)
pos -= read_size
f.seek(pos)
chunk = f.read(read_size)
chunks.insert(0, chunk)
total += len(chunk)
newline_count += chunk.count(b"\n")
chunk_size = min(chunk_size * 2, 65536)
raw = b"".join(chunks)
truncated = pos > 0
full_raw = raw
if truncated and len(full_raw) > max_bytes:
cut = len(full_raw) - max_bytes
# Check whether the cut lands exactly on a line boundary. If the
# byte just before the cut position is a newline the first retained
# byte starts a complete line and we should keep it. Only drop a
# partial first line when we're genuinely mid-line.
on_boundary = cut > 0 and full_raw[cut - 1 : cut] == b"\n"
full_raw = full_raw[cut:]
if not on_boundary and b"\n" in full_raw:
full_raw = full_raw.split(b"\n", 1)[1]
all_text = raw.decode("utf-8", errors="replace")
tail_text = "".join(all_text.splitlines(keepends=True)[-tail_lines:]).rstrip("\n")
full_text = full_raw.decode("utf-8", errors="replace")
if truncated:
full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"
return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
except Exception as exc:
return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)
def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once."""
errors_lines = min(log_lines, 100)
return {
"agent": _capture_log_snapshot("agent", tail_lines=log_lines),
"errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
"gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
}
# ---------------------------------------------------------------------------
@@ -405,7 +470,12 @@ def _capture_dump() -> str:
return capture.getvalue()
def collect_debug_report(*, log_lines: int = 200, dump_text: str = "") -> str:
def collect_debug_report(
*,
log_lines: int = 200,
dump_text: str = "",
log_snapshots: Optional[dict[str, LogSnapshot]] = None,
) -> str:
"""Build the summary debug report: system dump + log tails.
Parameters
@@ -424,19 +494,22 @@ def collect_debug_report(*, log_lines: int = 200, dump_text: str = "") -> str:
dump_text = _capture_dump()
buf.write(dump_text)
if log_snapshots is None:
log_snapshots = _capture_default_log_snapshots(log_lines)
# ── Recent log tails (summary only) ──────────────────────────────────
buf.write("\n\n")
buf.write(f"--- agent.log (last {log_lines} lines) ---\n")
buf.write(_read_log_tail("agent", log_lines))
buf.write(log_snapshots["agent"].tail_text)
buf.write("\n\n")
errors_lines = min(log_lines, 100)
buf.write(f"--- errors.log (last {errors_lines} lines) ---\n")
buf.write(_read_log_tail("errors", errors_lines))
buf.write(log_snapshots["errors"].tail_text)
buf.write("\n\n")
buf.write(f"--- gateway.log (last {errors_lines} lines) ---\n")
buf.write(_read_log_tail("gateway", errors_lines))
buf.write(log_snapshots["gateway"].tail_text)
buf.write("\n")
return buf.getvalue()
@@ -448,6 +521,8 @@ def collect_debug_report(*, log_lines: int = 200, dump_text: str = "") -> str:
def run_debug_share(args):
"""Collect debug report + full logs, upload each, print URLs."""
_best_effort_sweep_expired_pastes()
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
@@ -459,10 +534,15 @@ def run_debug_share(args):
# Capture dump once — prepended to every paste for context.
dump_text = _capture_dump()
log_snapshots = _capture_default_log_snapshots(log_lines)
report = collect_debug_report(log_lines=log_lines, dump_text=dump_text)
agent_log = _read_full_log("agent")
gateway_log = _read_full_log("gateway")
report = collect_debug_report(
log_lines=log_lines,
dump_text=dump_text,
log_snapshots=log_snapshots,
)
agent_log = log_snapshots["agent"].full_text
gateway_log = log_snapshots["gateway"].full_text
# Prepend dump header to each full log so every paste is self-contained.
if agent_log:
+9 -4
View File
@@ -912,6 +912,7 @@ def run_doctor(args):
_apikey_providers = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
@@ -943,18 +944,22 @@ def run_doctor(args):
try:
import httpx
_base = os.getenv(_base_env, "") if _base_env else ""
# Auto-detect Kimi Code keys (sk-kimi-) → api.kimi.com
# Auto-detect Kimi Code keys (sk-kimi-) → api.kimi.com/coding/v1
# (OpenAI-compat surface, which exposes /models for health check).
if not _base and _key.startswith("sk-kimi-"):
_base = "https://api.kimi.com/coding/v1"
# Anthropic-compat endpoints (/anthropic) don't support /models.
# Rewrite to the OpenAI-compat /v1 surface for health checks.
# Anthropic-compat endpoints (/anthropic, api.kimi.com/coding
# with no /v1) don't support /models. Rewrite to the OpenAI-compat
# /v1 surface for health checks.
if _base and _base.rstrip("/").endswith("/anthropic"):
from agent.auxiliary_client import _to_openai_base_url
_base = _to_openai_base_url(_base)
if base_url_host_matches(_base, "api.kimi.com") and _base.rstrip("/").endswith("/coding"):
_base = _base.rstrip("/") + "/v1"
_url = (_base.rstrip("/") + "/models") if _base else _default_url
_headers = {"Authorization": f"Bearer {_key}"}
if base_url_host_matches(_base, "api.kimi.com"):
_headers["User-Agent"] = "KimiCLI/1.30.0"
_headers["User-Agent"] = "claude-code/0.1.0"
_resp = httpx.get(
_url,
headers=_headers,
+2
View File
@@ -160,6 +160,8 @@ def load_hermes_dotenv(
# Fix corrupted .env files before python-dotenv parses them (#8908).
if user_env.exists():
_sanitize_env_file_if_needed(user_env)
if project_env_path and project_env_path.exists():
_sanitize_env_file_if_needed(project_env_path)
if user_env.exists():
_load_dotenv_with_fallback(user_env, override=True)
+545 -149
View File
@@ -175,6 +175,60 @@ def _request_gateway_self_restart(pid: int) -> bool:
return True
def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
"""Send SIGUSR1 to a gateway PID and wait for it to exit gracefully.
SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
seconds), then exits with code 75. Both systemd (``Restart=on-failure``
+ ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
= false``) relaunch the process after the graceful exit.
This is the drain-aware alternative to ``systemctl restart`` / ``SIGTERM``,
which SIGKILL in-flight agents after a short timeout.
Args:
pid: Gateway process PID (systemd MainPID, launchd PID, or bare
process PID).
drain_timeout: Seconds to wait for the process to exit after sending
SIGUSR1. Should be slightly larger than the gateway's
``agent.restart_drain_timeout`` to allow the drain loop to
finish cleanly.
Returns:
True if the PID was signalled and exited within the timeout.
False if SIGUSR1 couldn't be sent or the process didn't exit in
time (caller should fall back to a harder restart path).
"""
if not hasattr(signal, "SIGUSR1"):
return False
if pid <= 0:
return False
try:
os.kill(pid, signal.SIGUSR1)
except ProcessLookupError:
# Already gone — nothing to drain.
return True
except (PermissionError, OSError):
return False
import time as _time
deadline = _time.monotonic() + max(drain_timeout, 1.0)
while _time.monotonic() < deadline:
try:
os.kill(pid, 0) # signal 0 — probe liveness
except ProcessLookupError:
return True
except PermissionError:
# Process still exists but we can't signal it. Treat as alive
# so the caller falls back.
pass
_time.sleep(0.5)
# Drain didn't finish in time.
return False
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
@@ -333,6 +387,147 @@ def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
return selected_system, result.stdout.strip() == "active"
def _read_systemd_unit_properties(
system: bool = False,
properties: tuple[str, ...] = (
"ActiveState",
"SubState",
"Result",
"ExecMainStatus",
),
) -> dict[str, str]:
"""Return selected ``systemctl show`` properties for the gateway unit."""
selected_system = _select_systemd_scope(system)
try:
result = _run_systemctl(
[
"show",
get_service_name(),
"--no-pager",
"--property",
",".join(properties),
],
system=selected_system,
capture_output=True,
text=True,
timeout=10,
)
except (RuntimeError, subprocess.TimeoutExpired, OSError):
return {}
if result.returncode != 0:
return {}
parsed: dict[str, str] = {}
for line in result.stdout.splitlines():
if "=" not in line:
continue
key, value = line.split("=", 1)
parsed[key] = value.strip()
return parsed
def _wait_for_systemd_service_restart(
*,
system: bool = False,
previous_pid: int | None = None,
timeout: float = 60.0,
) -> bool:
"""Wait for the gateway service to become active after a restart handoff."""
import time
svc = get_service_name()
scope_label = _service_scope_label(system).capitalize()
deadline = time.time() + timeout
while time.time() < deadline:
props = _read_systemd_unit_properties(system=system)
active_state = props.get("ActiveState", "")
sub_state = props.get("SubState", "")
new_pid = None
try:
from gateway.status import get_running_pid
new_pid = get_running_pid()
except Exception:
new_pid = None
if active_state == "active":
if new_pid and (previous_pid is None or new_pid != previous_pid):
print(f"{scope_label} service restarted (PID {new_pid})")
return True
if previous_pid is None:
print(f"{scope_label} service restarted")
return True
if active_state == "activating" and sub_state == "auto-restart":
time.sleep(1)
continue
time.sleep(2)
print(
f"{scope_label} service did not become active within {int(timeout)}s.\n"
f" Check status: {'sudo ' if system else ''}hermes gateway status\n"
f" Check logs: journalctl {'--user ' if not system else ''}-u {svc} -l --since '2 min ago'"
)
return False
def _recover_pending_systemd_restart(system: bool = False, previous_pid: int | None = None) -> bool:
"""Recover a planned service restart that is stuck in systemd state."""
props = _read_systemd_unit_properties(system=system)
if not props:
return False
try:
from gateway.status import read_runtime_status
except Exception:
return False
runtime_state = read_runtime_status() or {}
if not runtime_state.get("restart_requested"):
return False
active_state = props.get("ActiveState", "")
sub_state = props.get("SubState", "")
exec_main_status = props.get("ExecMainStatus", "")
result = props.get("Result", "")
if active_state == "activating" and sub_state == "auto-restart":
print("⏳ Service restart already pending — waiting for systemd relaunch...")
return _wait_for_systemd_service_restart(
system=system,
previous_pid=previous_pid,
)
if active_state == "failed" and (
exec_main_status == str(GATEWAY_SERVICE_RESTART_EXIT_CODE)
or result == "exit-code"
):
svc = get_service_name()
scope_label = _service_scope_label(system).capitalize()
print(f"↻ Clearing failed state for pending {scope_label.lower()} service restart...")
_run_systemctl(
["reset-failed", svc],
system=system,
check=False,
timeout=30,
)
_run_systemctl(
["start", svc],
system=system,
check=False,
timeout=90,
)
return _wait_for_systemd_service_restart(
system=system,
previous_pid=previous_pid,
)
return False
def _probe_launchd_service_running() -> bool:
if not get_launchd_plist_path().exists():
return False
@@ -470,7 +665,8 @@ def stop_profile_gateway() -> bool:
except (ProcessLookupError, PermissionError):
break
remove_pid_file()
if get_running_pid() is None:
remove_pid_file()
return True
@@ -619,6 +815,21 @@ def get_systemd_unit_path(system: bool = False) -> Path:
return Path.home() / ".config" / "systemd" / "user" / f"{name}.service"
class UserSystemdUnavailableError(RuntimeError):
"""Raised when ``systemctl --user`` cannot reach the user D-Bus session.
Typically hit on fresh RHEL/Debian SSH sessions where linger is disabled
and no user@.service is running, so ``/run/user/$UID/bus`` never exists.
Carries a user-facing remediation message in ``args[0]``.
"""
def _user_dbus_socket_path() -> Path:
"""Return the expected per-user D-Bus socket path (regardless of existence)."""
xdg = os.environ.get("XDG_RUNTIME_DIR") or f"/run/user/{os.getuid()}"
return Path(xdg) / "bus"
def _ensure_user_systemd_env() -> None:
"""Ensure DBUS_SESSION_BUS_ADDRESS and XDG_RUNTIME_DIR are set for systemctl --user.
@@ -641,6 +852,126 @@ def _ensure_user_systemd_env() -> None:
os.environ["DBUS_SESSION_BUS_ADDRESS"] = f"unix:path={bus_path}"
def _wait_for_user_dbus_socket(timeout: float = 3.0) -> bool:
"""Poll for the user D-Bus socket to appear, up to ``timeout`` seconds.
Linger-enabled user@.service can take a second or two to spawn the socket
after ``loginctl enable-linger`` runs. Returns True once the socket exists.
"""
import time
deadline = time.monotonic() + timeout
while time.monotonic() < deadline:
if _user_dbus_socket_path().exists():
_ensure_user_systemd_env()
return True
time.sleep(0.2)
return _user_dbus_socket_path().exists()
def _preflight_user_systemd(*, auto_enable_linger: bool = True) -> None:
"""Ensure ``systemctl --user`` will reach the user D-Bus session bus.
No-op when the bus socket is already there (the common case on desktops
and linger-enabled servers). On fresh SSH sessions where the socket is
missing:
* If linger is already enabled, wait briefly for user@.service to spawn
the socket.
* If linger is disabled and ``auto_enable_linger`` is True, try
``loginctl enable-linger $USER`` (works as non-root when polkit permits
it, otherwise needs sudo).
* If the socket is still missing afterwards, raise
:class:`UserSystemdUnavailableError` with a precise remediation message.
Callers should treat the exception as a terminal condition for user-scope
systemd operations and surface the message to the user.
"""
_ensure_user_systemd_env()
bus_path = _user_dbus_socket_path()
if bus_path.exists():
return
import getpass
username = getpass.getuser()
linger_enabled, linger_detail = get_systemd_linger_status()
if linger_enabled is True:
if _wait_for_user_dbus_socket(timeout=3.0):
return
# Linger is on but socket still missing — unusual; fall through to error.
_raise_user_systemd_unavailable(
username,
reason="User D-Bus socket is missing even though linger is enabled.",
fix_hint=(
f" systemctl start user@{os.getuid()}.service\n"
" (may require sudo; try again after the command succeeds)"
),
)
if auto_enable_linger and shutil.which("loginctl"):
try:
result = subprocess.run(
["loginctl", "enable-linger", username],
capture_output=True,
text=True,
check=False,
timeout=30,
)
except Exception as exc:
_raise_user_systemd_unavailable(
username,
reason=f"loginctl enable-linger failed ({exc}).",
fix_hint=f" sudo loginctl enable-linger {username}",
)
else:
if result.returncode == 0:
if _wait_for_user_dbus_socket(timeout=5.0):
print(f"✓ Enabled linger for {username} — user D-Bus now available")
return
# enable-linger succeeded but the socket never appeared.
_raise_user_systemd_unavailable(
username,
reason="Linger was enabled, but the user D-Bus socket did not appear.",
fix_hint=(
" Log out and log back in, then re-run the command.\n"
f" Or reboot and run: systemctl --user start {get_service_name()}"
),
)
detail = (result.stderr or result.stdout or f"exit {result.returncode}").strip()
_raise_user_systemd_unavailable(
username,
reason=f"loginctl enable-linger was denied: {detail}",
fix_hint=f" sudo loginctl enable-linger {username}",
)
_raise_user_systemd_unavailable(
username,
reason=(
"User D-Bus session is not available "
f"({linger_detail or 'linger disabled'})."
),
fix_hint=f" sudo loginctl enable-linger {username}",
)
def _raise_user_systemd_unavailable(username: str, *, reason: str, fix_hint: str) -> None:
"""Build a user-facing error message and raise UserSystemdUnavailableError."""
msg = (
f"{reason}\n"
" systemctl --user cannot reach the user D-Bus session in this shell.\n"
"\n"
" To fix:\n"
f"{fix_hint}\n"
"\n"
" Alternative: run the gateway in the foreground (stays up until\n"
" you exit / close the terminal):\n"
" hermes gateway run"
)
raise UserSystemdUnavailableError(msg)
def _systemctl_cmd(system: bool = False) -> list[str]:
if not system:
_ensure_user_systemd_env()
@@ -1192,7 +1523,14 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
path_entries.append(resolved_node_dir)
common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
restart_timeout = max(60, int(_get_restart_drain_timeout() or 0))
# systemd's TimeoutStopSec must exceed the gateway's drain_timeout so
# there's budget left for post-interrupt cleanup (tool subprocess kill,
# adapter disconnect, session DB close) before systemd escalates to
# SIGKILL on the cgroup — otherwise bash/sleep tool-call children left
# by a force-interrupted agent get reaped by systemd instead of us
# (#8202). 30s of headroom covers the worst case we've observed.
_drain_timeout = int(_get_restart_drain_timeout() or 0)
restart_timeout = max(60, _drain_timeout) + 30
if system:
username, group_name, home_dir = _system_service_identity(run_as_user)
@@ -1481,6 +1819,11 @@ def systemd_start(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("start")
else:
# Fail fast with actionable guidance if the user D-Bus session is not
# reachable (common on fresh RHEL/Debian SSH sessions without linger).
# Raises UserSystemdUnavailableError with a remediation message.
_preflight_user_systemd()
refresh_systemd_unit_if_needed(system=system)
_run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -1500,19 +1843,16 @@ def systemd_restart(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("restart")
else:
_preflight_user_systemd()
refresh_systemd_unit_if_needed(system=system)
from gateway.status import get_running_pid
pid = get_running_pid()
if pid is not None and _request_gateway_self_restart(pid):
# SIGUSR1 sent — the gateway will drain active agents, exit with
# code 75, and systemd will restart it after RestartSec (30s).
# Wait for the old process to die and the new one to become active
# so the CLI doesn't return while the service is still restarting.
import time
scope_label = _service_scope_label(system).capitalize()
svc = get_service_name()
scope_cmd = _systemctl_cmd(system)
# Phase 1: wait for old process to exit (drain + shutdown)
print(f"{scope_label} service draining active work...")
@@ -1526,48 +1866,41 @@ def systemd_restart(system: bool = False):
else:
print(f"⚠ Old process (PID {pid}) still alive after 90s")
# Phase 2: wait for systemd to start the new process
print(f"⏳ Waiting for {svc} to restart...")
deadline = time.time() + 60
while time.time() < deadline:
try:
result = subprocess.run(
scope_cmd + ["is-active", svc],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip() == "active":
# Verify it's a NEW process, not the old one somehow
new_pid = get_running_pid()
if new_pid and new_pid != pid:
print(f"{scope_label} service restarted (PID {new_pid})")
return
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
time.sleep(2)
# Timed out — check final state
try:
result = subprocess.run(
scope_cmd + ["is-active", svc],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip() == "active":
print(f"{scope_label} service restarted")
return
except Exception:
pass
print(
f"{scope_label} service did not become active within 60s.\n"
f" Check status: {'sudo ' if system else ''}hermes gateway status\n"
f" Check logs: journalctl {'--user ' if not system else ''}-u {svc} --since '2 min ago'"
# The gateway exits with code 75 for a planned service restart.
# systemd can sit in the RestartSec window or even wedge itself into a
# failed/rate-limited state if the operator asks for another restart in
# the middle of that handoff. Clear any stale failed state and kick the
# unit immediately so `hermes gateway restart` behaves idempotently.
_run_systemctl(
["reset-failed", svc],
system=system,
check=False,
timeout=30,
)
_run_systemctl(
["start", svc],
system=system,
check=False,
timeout=90,
)
_wait_for_systemd_service_restart(system=system, previous_pid=pid)
return
if _recover_pending_systemd_restart(system=system, previous_pid=pid):
return
_run_systemctl(
["reset-failed", get_service_name()],
system=system,
check=False,
timeout=30,
)
_run_systemctl(["reload-or-restart", get_service_name()], system=system, check=True, timeout=90)
print(f"{_service_scope_label(system).capitalize()} service restarted")
def systemd_status(deep: bool = False, system: bool = False):
def systemd_status(deep: bool = False, system: bool = False, full: bool = False):
system = _select_systemd_scope(system)
unit_path = get_systemd_unit_path(system=system)
scope_flag = " --system" if system else ""
@@ -1590,8 +1923,12 @@ def systemd_status(deep: bool = False, system: bool = False):
print(f" Run: {'sudo ' if system else ''}hermes gateway restart{scope_flag} # auto-refreshes the unit")
print()
status_cmd = ["status", get_service_name(), "--no-pager"]
if full:
status_cmd.append("-l")
_run_systemctl(
["status", get_service_name(), "--no-pager"],
status_cmd,
system=system,
capture_output=False,
timeout=10,
@@ -1624,6 +1961,19 @@ def systemd_status(deep: bool = False, system: bool = False):
for line in runtime_lines:
print(f" {line}")
unit_props = _read_systemd_unit_properties(system=system)
active_state = unit_props.get("ActiveState", "")
sub_state = unit_props.get("SubState", "")
exec_main_status = unit_props.get("ExecMainStatus", "")
result_code = unit_props.get("Result", "")
if active_state == "activating" and sub_state == "auto-restart":
print(" ⏳ Restart pending: systemd is waiting to relaunch the gateway")
elif active_state == "failed" and exec_main_status == str(GATEWAY_SERVICE_RESTART_EXIT_CODE):
print(" ⚠ Planned restart is stuck in systemd failed state (exit 75)")
print(f" Run: systemctl {'--user ' if not system else ''}reset-failed {get_service_name()} && {'sudo ' if system else ''}hermes gateway start{scope_flag}")
elif active_state == "failed" and result_code:
print(f" ⚠ Systemd unit result: {result_code}")
if system:
print("✓ System service starts at boot without requiring systemd linger")
elif deep:
@@ -1639,7 +1989,10 @@ def systemd_status(deep: bool = False, system: bool = False):
if deep:
print()
print("Recent logs:")
subprocess.run(_journalctl_cmd(system) + ["-u", get_service_name(), "-n", "20", "--no-pager"], timeout=10)
log_cmd = _journalctl_cmd(system) + ["-u", get_service_name(), "-n", "20", "--no-pager"]
if full:
log_cmd.append("-l")
subprocess.run(log_cmd, timeout=10)
# =============================================================================
@@ -2639,9 +2992,120 @@ def _setup_dingtalk():
def _setup_wecom():
"""Configure WeCom (Enterprise WeChat) via the standard platform setup."""
wecom_platform = next(p for p in _PLATFORMS if p["key"] == "wecom")
_setup_standard_platform(wecom_platform)
"""Interactive setup for WeCom — scan QR code or manual credential input."""
print()
print(color(" ─── 💬 WeCom (Enterprise WeChat) Setup ───", Colors.CYAN))
existing_bot_id = get_env_value("WECOM_BOT_ID")
existing_secret = get_env_value("WECOM_SECRET")
if existing_bot_id and existing_secret:
print()
print_success("WeCom is already configured.")
if not prompt_yes_no(" Reconfigure WeCom?", False):
return
# ── Choose setup method ──
print()
method_choices = [
"Scan QR code to obtain Bot ID and Secret automatically (recommended)",
"Enter existing Bot ID and Secret manually",
]
method_idx = prompt_choice(" How would you like to set up WeCom?", method_choices, 0)
bot_id = None
secret = None
if method_idx == 0:
# ── QR scan flow ──
try:
from gateway.platforms.wecom import qr_scan_for_bot_info
except Exception as exc:
print_error(f" WeCom QR scan import failed: {exc}")
qr_scan_for_bot_info = None
if qr_scan_for_bot_info is not None:
try:
credentials = qr_scan_for_bot_info()
except KeyboardInterrupt:
print()
print_warning(" WeCom setup cancelled.")
return
except Exception as exc:
print_warning(f" QR scan failed: {exc}")
credentials = None
if credentials:
bot_id = credentials.get("bot_id", "")
secret = credentials.get("secret", "")
print_success(" ✔ QR scan successful! Bot ID and Secret obtained.")
if not bot_id or not secret:
print_info(" QR scan did not complete. Continuing with manual input.")
bot_id = None
secret = None
# ── Manual credential input ──
if not bot_id or not secret:
print()
print_info(" 1. Go to WeCom Application → Workspace → Smart Robot -> Create smart robots")
print_info(" 2. Select API Mode")
print_info(" 3. Copy the Bot ID and Secret from the bot's credentials info")
print_info(" 4. The bot connects via WebSocket — no public endpoint needed")
print()
bot_id = prompt(" Bot ID", password=False)
if not bot_id:
print_warning(" Skipped — WeCom won't work without a Bot ID.")
return
secret = prompt(" Secret", password=True)
if not secret:
print_warning(" Skipped — WeCom won't work without a Secret.")
return
# ── Save core credentials ──
save_env_value("WECOM_BOT_ID", bot_id)
save_env_value("WECOM_SECRET", secret)
# ── Allowed users (deny-by-default security) ──
print()
print_info(" The gateway DENIES all users by default for security.")
print_info(" Enter user IDs to create an allowlist, or leave empty.")
allowed = prompt(" Allowed user IDs (comma-separated, or empty)", password=False)
if allowed:
cleaned = allowed.replace(" ", "")
save_env_value("WECOM_ALLOWED_USERS", cleaned)
print_success(" Saved — only these users can interact with the bot.")
else:
print()
access_choices = [
"Enable open access (anyone can message the bot)",
"Use DM pairing (unknown users request access, you approve with 'hermes pairing approve')",
"Disable direct messages",
"Skip for now (bot will deny all users until configured)",
]
access_idx = prompt_choice(" How should unauthorized users be handled?", access_choices, 1)
if access_idx == 0:
save_env_value("WECOM_DM_POLICY", "open")
save_env_value("GATEWAY_ALLOW_ALL_USERS", "true")
print_warning(" Open access enabled — anyone can use your bot!")
elif access_idx == 1:
save_env_value("WECOM_DM_POLICY", "pairing")
print_success(" DM pairing mode — users will receive a code to request access.")
print_info(" Approve with: hermes pairing approve <platform> <code>")
elif access_idx == 2:
save_env_value("WECOM_DM_POLICY", "disabled")
print_warning(" Direct messages disabled.")
else:
print_info(" Skipped — configure later with 'hermes gateway setup'")
# ── Home channel (optional) ──
print()
print_info(" Chat ID for scheduled results and notifications.")
home = prompt(" Home chat ID (optional, for cron/notifications)", password=False)
if home:
save_env_value("WECOM_HOME_CHANNEL", home)
print_success(f" Home channel set to {home}")
print()
print_success("💬 WeCom configured!")
def _is_service_installed() -> bool:
@@ -3021,7 +3485,8 @@ def _setup_qqbot():
if method_idx == 0:
# ── QR scan-to-configure ──
try:
credentials = _qqbot_qr_flow()
from gateway.platforms.qqbot import qr_register
credentials = qr_register()
except KeyboardInterrupt:
print()
print_warning(" QQ Bot setup cancelled.")
@@ -3103,106 +3568,6 @@ def _setup_qqbot():
print_info(f" App ID: {credentials['app_id']}")
def _qqbot_render_qr(url: str) -> bool:
"""Try to render a QR code in the terminal. Returns True if successful."""
try:
import qrcode as _qr
qr = _qr.QRCode(border=1,error_correction=_qr.constants.ERROR_CORRECT_L)
qr.add_data(url)
qr.make(fit=True)
qr.print_ascii(invert=True)
return True
except Exception:
return False
def _qqbot_qr_flow():
"""Run the QR-code scan-to-configure flow.
Returns a dict with app_id, client_secret, user_openid on success,
or None on failure/cancel.
"""
try:
from gateway.platforms.qqbot import (
create_bind_task, poll_bind_result, build_connect_url,
decrypt_secret, BindStatus,
)
from gateway.platforms.qqbot.constants import ONBOARD_POLL_INTERVAL
except Exception as exc:
print_error(f" QQBot onboard import failed: {exc}")
return None
import asyncio
import time
MAX_REFRESHES = 3
refresh_count = 0
while refresh_count <= MAX_REFRESHES:
loop = asyncio.new_event_loop()
# ── Create bind task ──
try:
task_id, aes_key = loop.run_until_complete(create_bind_task())
except Exception as e:
print_warning(f" Failed to create bind task: {e}")
loop.close()
return None
url = build_connect_url(task_id)
# ── Display QR code + URL ──
print()
if _qqbot_render_qr(url):
print(f" Scan the QR code above, or open this URL directly:\n {url}")
else:
print(f" Open this URL in QQ on your phone:\n {url}")
print_info(" Tip: pip install qrcode to show a scannable QR code here")
# ── Poll loop (silent — keep QR visible at bottom) ──
try:
while True:
try:
status, app_id, encrypted_secret, user_openid = loop.run_until_complete(
poll_bind_result(task_id)
)
except Exception:
time.sleep(ONBOARD_POLL_INTERVAL)
continue
if status == BindStatus.COMPLETED:
client_secret = decrypt_secret(encrypted_secret, aes_key)
print()
print_success(f" QR scan complete! (App ID: {app_id})")
if user_openid:
print_info(f" Scanner's OpenID: {user_openid}")
return {
"app_id": app_id,
"client_secret": client_secret,
"user_openid": user_openid,
}
if status == BindStatus.EXPIRED:
refresh_count += 1
if refresh_count > MAX_REFRESHES:
print()
print_warning(f" QR code expired {MAX_REFRESHES} times — giving up.")
return None
print()
print_warning(f" QR code expired, refreshing... ({refresh_count}/{MAX_REFRESHES})")
loop.close()
break # outer while creates a new task
time.sleep(ONBOARD_POLL_INTERVAL)
except KeyboardInterrupt:
loop.close()
raise
finally:
loop.close()
return None
def _setup_signal():
"""Interactive setup for Signal messenger."""
import shutil
@@ -3354,6 +3719,10 @@ def gateway_setup():
systemd_start()
elif is_macos():
launchd_start()
except UserSystemdUnavailableError as e:
print_error(" Failed to start — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except subprocess.CalledProcessError as e:
print_error(f" Failed to start: {e}")
else:
@@ -3390,6 +3759,8 @@ def gateway_setup():
_setup_feishu()
elif platform["key"] == "qqbot":
_setup_qqbot()
elif platform["key"] == "wecom":
_setup_wecom()
else:
_setup_standard_platform(platform)
@@ -3416,6 +3787,10 @@ def gateway_setup():
else:
stop_profile_gateway()
print_info("Start manually: hermes gateway")
except UserSystemdUnavailableError as e:
print_error(" Restart failed — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except subprocess.CalledProcessError as e:
print_error(f" Restart failed: {e}")
elif service_installed:
@@ -3425,6 +3800,10 @@ def gateway_setup():
systemd_start()
elif is_macos():
launchd_start()
except UserSystemdUnavailableError as e:
print_error(" Start failed — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except subprocess.CalledProcessError as e:
print_error(f" Start failed: {e}")
else:
@@ -3448,6 +3827,10 @@ def gateway_setup():
systemd_start(system=installed_scope == "system")
else:
launchd_start()
except UserSystemdUnavailableError as e:
print_error(" Start failed — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except subprocess.CalledProcessError as e:
print_error(f" Start failed: {e}")
except subprocess.CalledProcessError as e:
@@ -3485,6 +3868,18 @@ def gateway_setup():
def gateway_command(args):
"""Handle gateway subcommands."""
try:
return _gateway_command_inner(args)
except UserSystemdUnavailableError as e:
# Clean, actionable message instead of a traceback when the user D-Bus
# session is unreachable (fresh SSH shell, no linger, container, etc.).
print_error("User systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
sys.exit(1)
def _gateway_command_inner(args):
subcmd = getattr(args, 'gateway_command', None)
# Default to run if no subcommand
@@ -3748,12 +4143,13 @@ def gateway_command(args):
elif subcmd == "status":
deep = getattr(args, 'deep', False)
full = getattr(args, 'full', False)
system = getattr(args, 'system', False)
snapshot = get_gateway_runtime_snapshot(system=system)
# Check for service first
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
systemd_status(deep, system=system)
systemd_status(deep, system=system, full=full)
_print_gateway_process_mismatch(snapshot)
elif is_macos() and get_launchd_plist_path().exists():
launchd_status(deep)
+329 -52
View File
@@ -1131,6 +1131,20 @@ def cmd_chat(args):
if getattr(args, "yolo", False):
os.environ["HERMES_YOLO_MODE"] = "1"
# --ignore-user-config: make load_cli_config() / load_config() skip the
# user's ~/.hermes/config.yaml and return built-in defaults. Set BEFORE
# importing cli (which runs `CLI_CONFIG = load_cli_config()` at module
# import time). Credentials in .env are still loaded — this flag only
# ignores behavioral/config settings.
if getattr(args, "ignore_user_config", False):
os.environ["HERMES_IGNORE_USER_CONFIG"] = "1"
# --ignore-rules: skip auto-injection of AGENTS.md/SOUL.md/.cursorrules
# (rules), memory entries, and any preloaded skills coming from user config.
# Maps to AIAgent(skip_context_files=True, skip_memory=True).
if getattr(args, "ignore_rules", False):
os.environ["HERMES_IGNORE_RULES"] = "1"
# --source: tag session source for filtering (e.g. 'tool' for third-party integrations)
if getattr(args, "source", None):
os.environ["HERMES_SESSION_SOURCE"] = args.source
@@ -1159,6 +1173,8 @@ def cmd_chat(args):
"checkpoints": getattr(args, "checkpoints", False),
"pass_session_id": getattr(args, "pass_session_id", False),
"max_turns": getattr(args, "max_turns", None),
"ignore_rules": getattr(args, "ignore_rules", False),
"ignore_user_config": getattr(args, "ignore_user_config", False),
}
# Filter out None values
kwargs = {k: v for k, v in kwargs.items() if v is not None}
@@ -1566,6 +1582,8 @@ def select_provider_and_model(args=None):
_model_flow_anthropic(config, current_model)
elif selected_provider == "kimi-coding":
_model_flow_kimi(config, current_model)
elif selected_provider == "stepfun":
_model_flow_stepfun(config, current_model)
elif selected_provider == "bedrock":
_model_flow_bedrock(config, current_model)
elif selected_provider in (
@@ -2165,7 +2183,6 @@ def _model_flow_nous(config, current_model="", args=None):
from hermes_cli.models import (
_PROVIDER_MODELS,
get_pricing_for_provider,
filter_nous_free_models,
check_nous_free_tier,
partition_nous_models_by_tier,
)
@@ -2208,10 +2225,8 @@ def _model_flow_nous(config, current_model="", args=None):
# Check if user is on free tier
free_tier = check_nous_free_tier()
# For both tiers: apply the allowlist filter first (removes non-allowlisted
# free models and allowlist models that aren't actually free).
# Then for free users: partition remaining models into selectable/unavailable.
model_ids = filter_nous_free_models(model_ids, pricing)
# For free users: partition models into selectable/unavailable based on
# whether they are free per the Portal-reported pricing.
unavailable_models: list[str] = []
if free_tier:
model_ids, unavailable_models = partition_nous_models_by_tier(
@@ -3465,6 +3480,140 @@ def _model_flow_kimi(config, current_model=""):
print("No change.")
def _infer_stepfun_region(base_url: str) -> str:
"""Infer the current StepFun region from the configured endpoint."""
normalized = (base_url or "").strip().lower()
if "api.stepfun.com" in normalized:
return "china"
return "international"
def _stepfun_base_url_for_region(region: str) -> str:
from hermes_cli.auth import (
STEPFUN_STEP_PLAN_CN_BASE_URL,
STEPFUN_STEP_PLAN_INTL_BASE_URL,
)
return (
STEPFUN_STEP_PLAN_CN_BASE_URL
if region == "china"
else STEPFUN_STEP_PLAN_INTL_BASE_URL
)
def _model_flow_stepfun(config, current_model=""):
"""StepFun Step Plan flow with region-specific endpoints."""
from hermes_cli.auth import (
PROVIDER_REGISTRY,
_prompt_model_selection,
_save_model_choice,
deactivate_provider,
)
from hermes_cli.config import get_env_value, save_env_value, load_config, save_config
from hermes_cli.models import fetch_api_models
provider_id = "stepfun"
pconfig = PROVIDER_REGISTRY[provider_id]
key_env = pconfig.api_key_env_vars[0] if pconfig.api_key_env_vars else ""
base_url_env = pconfig.base_url_env_var or ""
existing_key = ""
for ev in pconfig.api_key_env_vars:
existing_key = get_env_value(ev) or os.getenv(ev, "")
if existing_key:
break
if not existing_key:
print(f"No {pconfig.name} API key configured.")
if key_env:
try:
import getpass
new_key = getpass.getpass(f"{key_env} (or Enter to cancel): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if not new_key:
print("Cancelled.")
return
save_env_value(key_env, new_key)
existing_key = new_key
print("API key saved.")
print()
else:
print(f" {pconfig.name} API key: {existing_key[:8]}... ✓")
print()
current_base = ""
if base_url_env:
current_base = get_env_value(base_url_env) or os.getenv(base_url_env, "")
if not current_base:
model_cfg = config.get("model")
if isinstance(model_cfg, dict):
current_base = str(model_cfg.get("base_url") or "").strip()
current_region = _infer_stepfun_region(current_base or pconfig.inference_base_url)
region_choices = [
("international", f"International ({_stepfun_base_url_for_region('international')})"),
("china", f"China ({_stepfun_base_url_for_region('china')})"),
]
ordered_regions = []
for region_key, label in region_choices:
if region_key == current_region:
ordered_regions.insert(0, (region_key, f"{label} ← currently active"))
else:
ordered_regions.append((region_key, label))
ordered_regions.append(("cancel", "Cancel"))
region_idx = _prompt_provider_choice([label for _, label in ordered_regions])
if region_idx is None or ordered_regions[region_idx][0] == "cancel":
print("No change.")
return
selected_region = ordered_regions[region_idx][0]
effective_base = _stepfun_base_url_for_region(selected_region)
if base_url_env:
save_env_value(base_url_env, effective_base)
live_models = fetch_api_models(existing_key, effective_base)
if live_models:
model_list = live_models
print(f" Found {len(model_list)} model(s) from {pconfig.name} API")
else:
model_list = _PROVIDER_MODELS.get(provider_id, [])
if model_list:
print(
f" Could not auto-detect models from {pconfig.name} API — "
"showing Step Plan fallback catalog."
)
if model_list:
selected = _prompt_model_selection(model_list, current_model=current_model)
else:
try:
selected = input("Model name: ").strip()
except (KeyboardInterrupt, EOFError):
selected = None
if selected:
_save_model_choice(selected)
cfg = load_config()
model = cfg.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = provider_id
model["base_url"] = effective_base
model.pop("api_mode", None)
save_config(cfg)
deactivate_provider()
config["model"] = dict(model)
print(f"Default model set to: {selected} (via {pconfig.name})")
else:
print("No change.")
def _model_flow_bedrock_api_key(config, region, current_model=""):
"""Bedrock API Key mode — uses the OpenAI-compatible bedrock-mantle endpoint.
@@ -3835,7 +3984,18 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
pass
if mdev_models:
model_list = mdev_models
# Merge models.dev with curated list so newly added models
# (not yet in models.dev) still appear in the picker.
if curated:
seen = {m.lower() for m in mdev_models}
merged = list(mdev_models)
for m in curated:
if m.lower() not in seen:
merged.append(m)
seen.add(m.lower())
model_list = merged
else:
model_list = mdev_models
print(f" Found {len(model_list)} model(s) from models.dev registry")
elif curated and len(curated) >= 8:
# Curated list is substantial — use it directly, skip live probe
@@ -5704,12 +5864,15 @@ def _cmd_update_impl(args, gateway_mode: bool):
# Write exit code *before* the gateway restart attempt.
# When running as ``hermes update --gateway`` (spawned by the gateway's
# /update command), this process lives inside the gateway's systemd
# cgroup. ``systemctl restart hermes-gateway`` kills everything in the
# cgroup (KillMode=mixed → SIGKILL to remaining processes), including
# us and the wrapping bash shell. The shell never reaches its
# ``printf $status > .update_exit_code`` epilogue, so the exit-code
# marker file is never created. The new gateway's update watcher then
# polls for 30 minutes and sends a spurious timeout message.
# cgroup. A graceful SIGUSR1 restart keeps the drain loop alive long
# enough for the exit-code marker to be written below, but the
# fallback ``systemctl restart`` path (see below) kills everything in
# the cgroup (KillMode=mixed → SIGKILL to remaining processes),
# including us and the wrapping bash shell. The shell never reaches
# its ``printf $status > .update_exit_code`` epilogue, so the
# exit-code marker file would never be created. The new gateway's
# update watcher would then poll for 30 minutes and send a spurious
# timeout message.
#
# Writing the marker here — after git pull + pip install succeed but
# before we attempt the restart — ensures the new gateway sees it
@@ -5731,9 +5894,37 @@ def _cmd_update_impl(args, gateway_mode: bool):
_ensure_user_systemd_env,
find_gateway_pids,
_get_service_pids,
_graceful_restart_via_sigusr1,
)
import signal as _signal
# Drain budget for graceful SIGUSR1 restarts. The gateway drains
# for up to ``agent.restart_drain_timeout`` (default 60s) before
# exiting with code 75; we wait slightly longer so the drain
# completes before we fall back to a hard restart. On older
# systemd units without SIGUSR1 wiring this wait just times out
# and we fall back to ``systemctl restart`` (the old behaviour).
try:
from hermes_constants import (
DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT as _DEFAULT_DRAIN,
)
except Exception:
_DEFAULT_DRAIN = 60.0
_cfg_drain = None
try:
from hermes_cli.config import load_config
_cfg_agent = (load_config().get("agent") or {})
_cfg_drain = _cfg_agent.get("restart_drain_timeout")
except Exception:
pass
try:
_drain_budget = float(_cfg_drain) if _cfg_drain is not None else float(_DEFAULT_DRAIN)
except (TypeError, ValueError):
_drain_budget = float(_DEFAULT_DRAIN)
# Add a 15s margin so the drain loop + final exit finish before
# we escalate to ``systemctl restart`` / SIGTERM.
_drain_budget = max(_drain_budget, 30.0) + 15.0
restarted_services = []
killed_pids = set()
@@ -5780,59 +5971,114 @@ def _cmd_update_impl(args, gateway_mode: bool):
text=True,
timeout=5,
)
if check.stdout.strip() == "active":
restart = subprocess.run(
scope_cmd + ["restart", svc_name],
if check.stdout.strip() != "active":
continue
# Prefer a graceful SIGUSR1 restart so in-flight
# agent runs drain instead of being SIGKILLed.
# The gateway's SIGUSR1 handler calls
# request_restart(via_service=True) → drain →
# exit(75); systemd's Restart=on-failure (and
# RestartForceExitStatus=75) respawns the unit.
_main_pid = 0
try:
_show = subprocess.run(
scope_cmd + [
"show", svc_name,
"--property=MainPID", "--value",
],
capture_output=True, text=True, timeout=5,
)
_main_pid = int((_show.stdout or "").strip() or 0)
except (ValueError, subprocess.TimeoutExpired, FileNotFoundError):
_main_pid = 0
_graceful_ok = False
if _main_pid > 0:
print(
f"{svc_name}: draining (up to {int(_drain_budget)}s)..."
)
_graceful_ok = _graceful_restart_via_sigusr1(
_main_pid, drain_timeout=_drain_budget,
)
if _graceful_ok:
# Gateway exited 75; systemd should relaunch
# via Restart=on-failure. Verify the new
# process came up.
_time.sleep(3)
verify = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True, text=True, timeout=5,
)
if verify.stdout.strip() == "active":
restarted_services.append(svc_name)
continue
# Process exited but wasn't respawned (older
# unit without Restart=on-failure or
# RestartForceExitStatus=75). Fall through
# to systemctl start/restart.
print(
f"{svc_name} drained but didn't relaunch — forcing restart"
)
# Fallback: blunt systemctl restart. This is
# what the old code always did; we get here only
# when the graceful path failed (unit missing
# SIGUSR1 wiring, drain exceeded the budget,
# restart-policy mismatch).
restart = subprocess.run(
scope_cmd + ["restart", svc_name],
capture_output=True,
text=True,
timeout=15,
)
if restart.returncode == 0:
# Verify the service actually survived the
# restart. systemctl restart returns 0 even
# if the new process crashes immediately.
_time.sleep(3)
verify = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True,
text=True,
timeout=15,
timeout=5,
)
if restart.returncode == 0:
# Verify the service actually survived the
# restart. systemctl restart returns 0 even
# if the new process crashes immediately.
if verify.stdout.strip() == "active":
restarted_services.append(svc_name)
else:
# Retry once — transient startup failures
# (stale module cache, import race) often
# resolve on the second attempt.
print(
f"{svc_name} died after restart, retrying..."
)
retry = subprocess.run(
scope_cmd + ["restart", svc_name],
capture_output=True,
text=True,
timeout=15,
)
_time.sleep(3)
verify = subprocess.run(
verify2 = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True,
text=True,
timeout=5,
)
if verify.stdout.strip() == "active":
if verify2.stdout.strip() == "active":
restarted_services.append(svc_name)
print(f"{svc_name} recovered on retry")
else:
# Retry once — transient startup failures
# (stale module cache, import race) often
# resolve on the second attempt.
print(
f" {svc_name} died after restart, retrying..."
f" {svc_name} failed to stay running after restart.\n"
f" Check logs: journalctl --user -u {svc_name} --since '2 min ago'\n"
f" Restart manually: systemctl {'--user ' if scope == 'user' else ''}restart {svc_name}"
)
retry = subprocess.run(
scope_cmd + ["restart", svc_name],
capture_output=True,
text=True,
timeout=15,
)
_time.sleep(3)
verify2 = subprocess.run(
scope_cmd + ["is-active", svc_name],
capture_output=True,
text=True,
timeout=5,
)
if verify2.stdout.strip() == "active":
restarted_services.append(svc_name)
print(f"{svc_name} recovered on retry")
else:
print(
f"{svc_name} failed to stay running after restart.\n"
f" Check logs: journalctl --user -u {svc_name} --since '2 min ago'\n"
f" Restart manually: systemctl {'--user ' if scope == 'user' else ''}restart {svc_name}"
)
else:
print(
f" ⚠ Failed to restart {svc_name}: {restart.stderr.strip()}"
)
else:
print(
f" ⚠ Failed to restart {svc_name}: {restart.stderr.strip()}"
)
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
@@ -6473,6 +6719,18 @@ For more help on a command:
default=False,
help="Include the session ID in the agent's system prompt",
)
parser.add_argument(
"--ignore-user-config",
action="store_true",
default=False,
help="Ignore ~/.hermes/config.yaml and fall back to built-in defaults (credentials in .env are still loaded)",
)
parser.add_argument(
"--ignore-rules",
action="store_true",
default=False,
help="Skip auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills",
)
parser.add_argument(
"--tui",
action="store_true",
@@ -6533,6 +6791,7 @@ For more help on a command:
"zai",
"kimi-coding",
"kimi-coding-cn",
"stepfun",
"minimax",
"minimax-cn",
"kilocode",
@@ -6611,6 +6870,18 @@ For more help on a command:
default=argparse.SUPPRESS,
help="Include the session ID in the agent's system prompt",
)
chat_parser.add_argument(
"--ignore-user-config",
action="store_true",
default=argparse.SUPPRESS,
help="Ignore ~/.hermes/config.yaml and fall back to built-in defaults (credentials in .env are still loaded). Useful for isolated CI runs, reproduction, and third-party integrations.",
)
chat_parser.add_argument(
"--ignore-rules",
action="store_true",
default=argparse.SUPPRESS,
help="Skip auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills. Combine with --ignore-user-config for a fully isolated run.",
)
chat_parser.add_argument(
"--source",
default=None,
@@ -6754,6 +7025,12 @@ For more help on a command:
# gateway status
gateway_status = gateway_subparsers.add_parser("status", help="Show gateway status")
gateway_status.add_argument("--deep", action="store_true", help="Deep status check")
gateway_status.add_argument(
"-l",
"--full",
action="store_true",
help="Show full, untruncated service/log output where supported",
)
gateway_status.add_argument(
"--system",
action="store_true",
+237 -49
View File
@@ -143,7 +143,7 @@ MODEL_ALIASES: dict[str, ModelIdentity] = {
# Z.AI / GLM
"glm": ModelIdentity("z-ai", "glm"),
# StepFun
# Step Plan (StepFun)
"step": ModelIdentity("stepfun", "step"),
# Xiaomi
@@ -304,6 +304,113 @@ def parse_model_flags(raw_args: str) -> tuple[str, str, bool]:
# Alias resolution
# ---------------------------------------------------------------------------
def _model_sort_key(model_id: str, prefix: str) -> tuple:
"""Sort key for model version preference.
Extracts version numbers after the family prefix and returns a sort key
that prefers higher versions. Suffix tokens (``pro``, ``omni``, etc.)
are used as tiebreakers, with common quality indicators ranked.
Examples (with prefix ``"mimo"``)::
mimo-v2.5-pro (-2.5, 0, 'pro') # highest version wins
mimo-v2.5 (-2.5, 1, '') # no suffix = lower than pro
mimo-v2-pro (-2.0, 0, 'pro')
mimo-v2-omni (-2.0, 1, 'omni')
mimo-v2-flash (-2.0, 1, 'flash')
"""
# Strip the prefix (and optional "/" separator for aggregator slugs)
rest = model_id[len(prefix):]
if rest.startswith("/"):
rest = rest[1:]
rest = rest.lstrip("-").strip()
# Parse version and suffix from the remainder.
# "v2.5-pro" → version [2.5], suffix "pro"
# "-omni" → version [], suffix "omni"
# State machine: start → in_version → between → in_suffix
nums: list[float] = []
suffix_buf = ""
state = "start"
num_buf = ""
for ch in rest:
if state == "start":
if ch in "vV":
state = "in_version"
elif ch.isdigit():
state = "in_version"
num_buf += ch
elif ch in "-_.":
pass # skip separators before any content
else:
state = "in_suffix"
suffix_buf += ch
elif state == "in_version":
if ch.isdigit():
num_buf += ch
elif ch == ".":
if "." in num_buf:
# Second dot — flush current number, start new component
try:
nums.append(float(num_buf.rstrip(".")))
except ValueError:
pass
num_buf = ""
else:
num_buf += ch
elif ch in "-_.":
if num_buf:
try:
nums.append(float(num_buf.rstrip(".")))
except ValueError:
pass
num_buf = ""
state = "between"
else:
if num_buf:
try:
nums.append(float(num_buf.rstrip(".")))
except ValueError:
pass
num_buf = ""
state = "in_suffix"
suffix_buf += ch
elif state == "between":
if ch.isdigit():
state = "in_version"
num_buf = ch
elif ch in "vV":
state = "in_version"
elif ch in "-_.":
pass
else:
state = "in_suffix"
suffix_buf += ch
elif state == "in_suffix":
suffix_buf += ch
# Flush remaining buffer (strip trailing dots — "5.4." → "5.4")
if num_buf and state == "in_version":
try:
nums.append(float(num_buf.rstrip(".")))
except ValueError:
pass
suffix = suffix_buf.lower().strip("-_.")
suffix = suffix.strip()
# Negate versions so higher → sorts first
version_key = tuple(-n for n in nums)
# Suffix quality ranking: pro/max > (no suffix) > omni/flash/mini/lite
# Lower number = preferred
_SUFFIX_RANK = {"pro": 0, "max": 0, "plus": 0, "turbo": 0}
suffix_rank = _SUFFIX_RANK.get(suffix, 1)
return version_key + (suffix_rank, suffix)
def resolve_alias(
raw_input: str,
current_provider: str,
@@ -311,9 +418,9 @@ def resolve_alias(
"""Resolve a short alias against the current provider's catalog.
Looks up *raw_input* in :data:`MODEL_ALIASES`, then searches the
current provider's models.dev catalog for the first model whose ID
starts with ``vendor/family`` (or just ``family`` for non-aggregator
providers).
current provider's models.dev catalog for the model whose ID starts
with ``vendor/family`` (or just ``family`` for non-aggregator
providers) and has the **highest version**.
Returns:
``(provider, resolved_model_id, alias_name)`` if a match is
@@ -341,28 +448,44 @@ def resolve_alias(
vendor, family = identity
# Search the provider's catalog from models.dev
# Build catalog from models.dev, then merge in static _PROVIDER_MODELS
# entries that models.dev may be missing (e.g. newly added models not
# yet synced to the registry).
catalog = list_provider_models(current_provider)
if not catalog:
return None
try:
from hermes_cli.models import _PROVIDER_MODELS
static = _PROVIDER_MODELS.get(current_provider, [])
if static:
seen = {m.lower() for m in catalog}
for m in static:
if m.lower() not in seen:
catalog.append(m)
except Exception:
pass
# For aggregators, models are vendor/model-name format
aggregator = is_aggregator(current_provider)
for model_id in catalog:
mid_lower = model_id.lower()
if aggregator:
# Match vendor/family prefix -- e.g. "anthropic/claude-sonnet"
prefix = f"{vendor}/{family}".lower()
if mid_lower.startswith(prefix):
return (current_provider, model_id, key)
else:
# Non-aggregator: bare names -- e.g. "claude-sonnet-4-6"
family_lower = family.lower()
if mid_lower.startswith(family_lower):
return (current_provider, model_id, key)
if aggregator:
prefix = f"{vendor}/{family}".lower()
matches = [
mid for mid in catalog
if mid.lower().startswith(prefix)
]
else:
family_lower = family.lower()
matches = [
mid for mid in catalog
if mid.lower().startswith(family_lower)
]
return None
if not matches:
return None
# Sort by version descending — prefer the latest/highest version
prefix_for_sort = f"{vendor}/{family}" if aggregator else family
matches.sort(key=lambda m: _model_sort_key(m, prefix_for_sort))
return (current_provider, matches[0], key)
def get_authenticated_provider_slugs(
@@ -678,6 +801,7 @@ def switch_model(
_da = DIRECT_ALIASES.get(resolved_alias)
if _da is not None and _da.base_url:
base_url = _da.base_url
api_mode = "" # clear so determine_api_mode re-detects from URL
if not api_key:
api_key = "no-key-required"
@@ -781,6 +905,7 @@ def switch_model(
def list_authenticated_providers(
current_provider: str = "",
current_base_url: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
max_models: int = 8,
@@ -809,7 +934,10 @@ def list_authenticated_providers(
get_provider_info as _mdev_pinfo,
)
from hermes_cli.auth import PROVIDER_REGISTRY
from hermes_cli.models import OPENROUTER_MODELS, _PROVIDER_MODELS
from hermes_cli.models import (
OPENROUTER_MODELS, _PROVIDER_MODELS,
_MODELS_DEV_PREFERRED, _merge_with_models_dev,
)
results: List[dict] = []
seen_slugs: set = set() # lowercase-normalized to catch case variants (#9545)
@@ -843,6 +971,10 @@ def list_authenticated_providers(
# source of truth. models.dev can have wrong mappings (e.g.
# minimax-cn → MINIMAX_API_KEY instead of MINIMAX_CN_API_KEY).
pconfig = PROVIDER_REGISTRY.get(hermes_id)
# Skip non-API-key auth providers here — they are handled in
# section 2 (HERMES_OVERLAYS) with proper auth store checking.
if pconfig and pconfig.auth_type != "api_key":
continue
if pconfig and pconfig.api_key_env_vars:
env_vars = list(pconfig.api_key_env_vars)
else:
@@ -855,8 +987,13 @@ def list_authenticated_providers(
if not has_creds:
continue
# Use curated list, falling back to models.dev if no curated list
# Use curated list, falling back to models.dev if no curated list.
# For preferred providers, merge models.dev entries into the curated
# catalog so newly released models (e.g. mimo-v2.5-pro on opencode-go)
# show up in the picker without requiring a Hermes release.
model_ids = curated.get(hermes_id, [])
if hermes_id in _MODELS_DEV_PREFERRED:
model_ids = _merge_with_models_dev(hermes_id, model_ids)
total = len(model_ids)
top = model_ids[:max_models]
@@ -960,6 +1097,9 @@ def list_authenticated_providers(
# Use curated list — look up by Hermes slug, fall back to overlay key
model_ids = curated.get(hermes_slug, []) or curated.get(pid, [])
# Merge with models.dev for preferred providers (same rationale as above).
if hermes_slug in _MODELS_DEV_PREFERRED:
model_ids = _merge_with_models_dev(hermes_slug, model_ids)
total = len(model_ids)
top = model_ids[:max_models]
@@ -1105,66 +1245,113 @@ def list_authenticated_providers(
# --- 4. Saved custom providers from config ---
# Each ``custom_providers`` entry represents one model under a named
# provider. Entries sharing the same provider name are grouped into a
# single picker row so that e.g. four Ollama Cloud entries
# (qwen3-coder, glm-5.1, kimi-k2, minimax-m2.7) appear as one
# "Ollama Cloud" row with four models inside instead of four
# duplicate "Ollama Cloud" rows. Entries with distinct provider names
# still produce separate rows (e.g. Ollama Cloud vs Moonshot).
# provider. Entries sharing the same endpoint (``base_url`` + ``api_key``)
# are grouped into a single picker row, so e.g. four Ollama entries
# pointing at ``http://localhost:11434/v1`` with per-model display names
# ("Ollama — GLM 5.1", "Ollama — Qwen3-coder", ...) appear as one
# "Ollama" row with four models inside instead of four near-duplicates
# that differ only by suffix. Entries with distinct endpoints still
# produce separate rows.
#
# When the grouped endpoint matches ``current_base_url`` the group's
# slug becomes ``current_provider`` so that selecting a model from the
# picker flows back through the runtime provider that already holds
# valid credentials — no re-resolution needed.
if custom_providers and isinstance(custom_providers, list):
from collections import OrderedDict
groups: "OrderedDict[str, dict]" = OrderedDict()
# Key by (base_url, api_key) instead of slug: names frequently
# differ per model ("Ollama — X") while the endpoint stays the
# same. Slug-based grouping left them as separate rows.
groups: "OrderedDict[tuple, dict]" = OrderedDict()
for entry in custom_providers:
if not isinstance(entry, dict):
continue
display_name = (entry.get("name") or "").strip()
raw_name = (entry.get("name") or "").strip()
api_url = (
entry.get("base_url", "")
or entry.get("url", "")
or entry.get("api", "")
or ""
).strip()
if not display_name or not api_url:
).strip().rstrip("/")
if not raw_name or not api_url:
continue
api_key = (entry.get("api_key") or "").strip()
slug = custom_provider_slug(display_name)
if slug not in groups:
groups[slug] = {
group_key = (api_url, api_key)
if group_key not in groups:
# Strip per-model suffix so "Ollama — GLM 5.1" becomes
# "Ollama" for the grouped row. Em dash is the convention
# Hermes's own writer uses; a hyphen variant is accepted
# for hand-edited configs.
display_name = raw_name
for sep in ("", " - "):
if sep in display_name:
display_name = display_name.split(sep)[0].strip()
break
if not display_name:
display_name = raw_name
# If this endpoint matches the currently active one, use
# ``current_provider`` as the slug so picker-driven switches
# route through the live credential pipeline.
if (
current_base_url
and api_url == current_base_url.strip().rstrip("/")
):
slug = current_provider or custom_provider_slug(display_name)
else:
slug = custom_provider_slug(display_name)
groups[group_key] = {
"slug": slug,
"name": display_name,
"api_url": api_url,
"models": [],
}
# The singular ``model:`` field only holds the currently
# active model. Hermes's own writer (main.py::_save_custom_provider)
# stores every configured model as a dict under ``models:``;
# downstream readers (agent/models_dev.py, gateway/run.py,
# run_agent.py, hermes_cli/config.py) already consume that dict.
# The /model picker previously ignored it, so multi-model
# custom providers appeared to have only the active model.
default_model = (entry.get("model") or "").strip()
if default_model and default_model not in groups[slug]["models"]:
groups[slug]["models"].append(default_model)
if default_model and default_model not in groups[group_key]["models"]:
groups[group_key]["models"].append(default_model)
cfg_models = entry.get("models", {})
if isinstance(cfg_models, dict):
for m in cfg_models:
if m and m not in groups[slug]["models"]:
groups[slug]["models"].append(m)
if m and m not in groups[group_key]["models"]:
groups[group_key]["models"].append(m)
elif isinstance(cfg_models, list):
for m in cfg_models:
if m and m not in groups[slug]["models"]:
groups[slug]["models"].append(m)
if m and m not in groups[group_key]["models"]:
groups[group_key]["models"].append(m)
for slug, grp in groups.items():
if slug.lower() in seen_slugs:
_section4_emitted_slugs: set = set()
for grp in groups.values():
slug = grp["slug"]
# If the slug is already claimed by a built-in / overlay /
# user-provider row (sections 1-3), skip this custom group
# to avoid shadowing a real provider.
if slug.lower() in seen_slugs and slug.lower() not in _section4_emitted_slugs:
continue
# If a prior section-4 group already used this slug (two custom
# endpoints with the same cleaned name — e.g. two OpenAI-
# compatible gateways named identically with different keys),
# append a counter so both rows stay visible in the picker.
if slug.lower() in _section4_emitted_slugs:
base_slug = slug
n = 2
while f"{base_slug}-{n}".lower() in seen_slugs:
n += 1
slug = f"{base_slug}-{n}"
grp["slug"] = slug
# Skip if section 3 already emitted this endpoint under its
# ``providers:`` dict key — matches on (display_name, base_url),
# the tuple section 4 groups by. Prevents two picker rows
# labelled identically when callers pass both ``user_providers``
# and a compatibility-merged ``custom_providers`` list.
# ``providers:`` dict key — matches on (display_name, base_url).
# Prevents two picker rows labelled identically when callers
# pass both ``user_providers`` and a compatibility-merged
# ``custom_providers`` list.
_pair_key = (
str(grp["name"]).strip().lower(),
str(grp["api_url"]).strip().rstrip("/").lower(),
@@ -1182,6 +1369,7 @@ def list_authenticated_providers(
"api_url": grp["api_url"],
})
seen_slugs.add(slug.lower())
_section4_emitted_slugs.add(slug.lower())
# Sort: current provider first, then by model count descending
results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))
+284 -50
View File
@@ -33,6 +33,8 @@ COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("moonshotai/kimi-k2.6", "recommended"),
("deepseek/deepseek-v4-pro", ""),
("deepseek/deepseek-v4-flash", ""),
("anthropic/claude-opus-4.7", ""),
("anthropic/claude-opus-4.6", ""),
("anthropic/claude-sonnet-4.6", ""),
@@ -42,7 +44,8 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openrouter/elephant-alpha", "free"),
("openai/gpt-5.4", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2-pro", ""),
("xiaomi/mimo-v2.5-pro", ""),
("xiaomi/mimo-v2.5", ""),
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-image-preview", ""),
("google/gemini-3-flash-preview", ""),
@@ -53,6 +56,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("stepfun/step-3.5-flash", ""),
("minimax/minimax-m2.7", ""),
("minimax/minimax-m2.5", ""),
("minimax/minimax-m2.5:free", "free"),
("z-ai/glm-5.1", ""),
("z-ai/glm-5v-turbo", ""),
("z-ai/glm-5-turbo", ""),
@@ -107,7 +111,10 @@ def _codex_curated_models() -> list[str]:
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"moonshotai/kimi-k2.6",
"xiaomi/mimo-v2-pro",
"deepseek/deepseek-v4-pro",
"deepseek/deepseek-v4-flash",
"xiaomi/mimo-v2.5-pro",
"xiaomi/mimo-v2.5",
"anthropic/claude-opus-4.7",
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
@@ -125,17 +132,15 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"stepfun/step-3.5-flash",
"minimax/minimax-m2.7",
"minimax/minimax-m2.5",
"minimax/minimax-m2.5:free",
"z-ai/glm-5.1",
"z-ai/glm-5v-turbo",
"z-ai/glm-5-turbo",
"x-ai/grok-4.20-beta",
"nvidia/nemotron-3-super-120b-a12b",
"nvidia/nemotron-3-super-120b-a12b:free",
"arcee-ai/trinity-large-preview:free",
"arcee-ai/trinity-large-thinking",
"openai/gpt-5.4-pro",
"openai/gpt-5.4-nano",
"openrouter/elephant-alpha",
],
"openai-codex": _codex_curated_models(),
"copilot-acp": [
@@ -211,6 +216,10 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"stepfun": [
"step-3.5-flash",
"step-3.5-flash-2603",
],
"moonshot": [
"kimi-k2.6",
"kimi-k2.5",
@@ -241,10 +250,14 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"claude-haiku-4-5-20251001",
],
"deepseek": [
"deepseek-v4-pro",
"deepseek-v4-flash",
"deepseek-chat",
"deepseek-reasoner",
],
"xiaomi": [
"mimo-v2.5-pro",
"mimo-v2.5",
"mimo-v2-pro",
"mimo-v2-omni",
"mimo-v2-flash",
@@ -296,6 +309,8 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"kimi-k2.5",
"glm-5.1",
"glm-5",
"mimo-v2.5-pro",
"mimo-v2.5",
"mimo-v2-pro",
"mimo-v2-omni",
"minimax-m2.7",
@@ -362,17 +377,11 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
_PROVIDER_MODELS["ai-gateway"] = [mid for mid, _ in VERCEL_AI_GATEWAY_MODELS]
# ---------------------------------------------------------------------------
# Nous Portal free-model filtering
# Nous Portal free-model helper
# ---------------------------------------------------------------------------
# Models that are ALLOWED to appear when priced as free on Nous Portal.
# Any other free model is hidden — prevents promotional/temporary free models
# from cluttering the selection when users are paying subscribers.
# Models in this list are ALSO filtered out if they are NOT free (i.e. they
# should only appear in the menu when they are genuinely free).
_NOUS_ALLOWED_FREE_MODELS: frozenset[str] = frozenset({
"xiaomi/mimo-v2-pro",
"xiaomi/mimo-v2-omni",
})
# The Nous Portal models endpoint is the source of truth for which models
# are currently offered (free or paid). We trust whatever it returns and
# surface it to users as-is — no local allowlist filtering.
def _is_model_free(model_id: str, pricing: dict[str, dict[str, str]]) -> bool:
@@ -386,35 +395,6 @@ def _is_model_free(model_id: str, pricing: dict[str, dict[str, str]]) -> bool:
return False
def filter_nous_free_models(
model_ids: list[str],
pricing: dict[str, dict[str, str]],
) -> list[str]:
"""Filter the Nous Portal model list according to free-model policy.
Rules:
Paid models that are NOT in the allowlist keep (normal case).
Free models that are NOT in the allowlist drop.
Allowlist models that ARE free keep.
Allowlist models that are NOT free drop.
"""
if not pricing:
return model_ids # no pricing data — can't filter, show everything
result: list[str] = []
for mid in model_ids:
free = _is_model_free(mid, pricing)
if mid in _NOUS_ALLOWED_FREE_MODELS:
# Allowlist model: only show when it's actually free
if free:
result.append(mid)
else:
# Regular model: keep only when it's NOT free
if not free:
result.append(mid)
return result
# ---------------------------------------------------------------------------
# Nous Portal account tier detection
# ---------------------------------------------------------------------------
@@ -478,8 +458,7 @@ def partition_nous_models_by_tier(
) -> tuple[list[str], list[str]]:
"""Split Nous models into (selectable, unavailable) based on user tier.
For paid-tier users: all models are selectable, none unavailable
(free-model filtering is handled separately by ``filter_nous_free_models``).
For paid-tier users: all models are selectable, none unavailable.
For free-tier users: only free models are selectable; paid models
are returned as unavailable (shown grayed out in the menu).
@@ -549,6 +528,157 @@ def check_nous_free_tier() -> bool:
return False # default to paid on error — don't block users
# ---------------------------------------------------------------------------
# Nous Portal recommended models
#
# The Portal publishes a curated list of suggested models (separated into
# paid and free tiers) plus dedicated recommendations for compaction (text
# summarisation / auxiliary) and vision tasks. We fetch it once per process
# with a TTL cache so callers can ask "what's the best aux model right now?"
# without hitting the network on every lookup.
#
# Shape of the response (fields we care about):
# {
# "paidRecommendedModels": [ {modelName, ...}, ... ],
# "freeRecommendedModels": [ {modelName, ...}, ... ],
# "paidRecommendedCompactionModel": {modelName, ...} | null,
# "paidRecommendedVisionModel": {modelName, ...} | null,
# "freeRecommendedCompactionModel": {modelName, ...} | null,
# "freeRecommendedVisionModel": {modelName, ...} | null,
# }
# ---------------------------------------------------------------------------
NOUS_RECOMMENDED_MODELS_PATH = "/api/nous/recommended-models"
_NOUS_RECOMMENDED_CACHE_TTL: int = 600 # seconds (10 minutes)
# (result_dict, timestamp) keyed by portal_base_url so staging vs prod don't collide.
_nous_recommended_cache: dict[str, tuple[dict[str, Any], float]] = {}
def fetch_nous_recommended_models(
portal_base_url: str = "",
timeout: float = 5.0,
*,
force_refresh: bool = False,
) -> dict[str, Any]:
"""Fetch the Nous Portal's curated recommended-models payload.
Hits ``<portal>/api/nous/recommended-models``. The endpoint is public
no auth is required. Results are cached per portal URL for
``_NOUS_RECOMMENDED_CACHE_TTL`` seconds; pass ``force_refresh=True`` to
bypass the cache.
Returns the parsed JSON dict on success, or ``{}`` on any failure
(network, parse, non-2xx). Callers must treat missing/null fields as
"no recommendation" and fall back to their own default.
"""
base = (portal_base_url or "https://portal.nousresearch.com").rstrip("/")
now = time.monotonic()
cached = _nous_recommended_cache.get(base)
if not force_refresh and cached is not None:
payload, cached_at = cached
if now - cached_at < _NOUS_RECOMMENDED_CACHE_TTL:
return payload
url = f"{base}{NOUS_RECOMMENDED_MODELS_PATH}"
try:
req = urllib.request.Request(
url,
headers={"Accept": "application/json"},
)
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
if not isinstance(data, dict):
data = {}
except Exception:
data = {}
_nous_recommended_cache[base] = (data, now)
return data
def _resolve_nous_portal_url() -> str:
"""Best-effort lookup of the Portal base URL the user is authed against."""
try:
from hermes_cli.auth import (
DEFAULT_NOUS_PORTAL_URL,
get_provider_auth_state,
)
state = get_provider_auth_state("nous") or {}
portal = str(state.get("portal_base_url") or "").strip()
if portal:
return portal.rstrip("/")
return str(DEFAULT_NOUS_PORTAL_URL).rstrip("/")
except Exception:
return "https://portal.nousresearch.com"
def _extract_model_name(entry: Any) -> Optional[str]:
"""Pull the ``modelName`` field from a recommended-model entry, else None."""
if not isinstance(entry, dict):
return None
model_name = entry.get("modelName")
if isinstance(model_name, str) and model_name.strip():
return model_name.strip()
return None
def get_nous_recommended_aux_model(
*,
vision: bool = False,
free_tier: Optional[bool] = None,
portal_base_url: str = "",
force_refresh: bool = False,
) -> Optional[str]:
"""Return the Portal's recommended model name for an auxiliary task.
Picks the best field from the Portal's recommended-models payload:
* ``vision=True`` ``paidRecommendedVisionModel`` (paid tier) or
``freeRecommendedVisionModel`` (free tier)
* ``vision=False`` ``paidRecommendedCompactionModel`` or
``freeRecommendedCompactionModel``
When ``free_tier`` is ``None`` (default) the user's tier is auto-detected
via :func:`check_nous_free_tier`. Pass an explicit bool to bypass the
detection useful for tests or when the caller already knows the tier.
For paid-tier users we prefer the paid recommendation but gracefully fall
back to the free recommendation if the Portal returned ``null`` for the
paid field (common during the staged rollout of new paid models).
Returns ``None`` when every candidate is missing, null, or the fetch
fails callers should fall back to their own default (currently
``google/gemini-3-flash-preview``).
"""
base = portal_base_url or _resolve_nous_portal_url()
payload = fetch_nous_recommended_models(base, force_refresh=force_refresh)
if not payload:
return None
if free_tier is None:
try:
free_tier = check_nous_free_tier()
except Exception:
# On any detection error, assume paid — paid users see both fields
# anyway so this is a safe default that maximises model quality.
free_tier = False
if vision:
paid_key, free_key = "paidRecommendedVisionModel", "freeRecommendedVisionModel"
else:
paid_key, free_key = "paidRecommendedCompactionModel", "freeRecommendedCompactionModel"
# Preference order:
# free tier → free only
# paid tier → paid, then free (if paid field is null)
candidates = [free_key] if free_tier else [paid_key, free_key]
for key in candidates:
name = _extract_model_name(payload.get(key))
if name:
return name
return None
# ---------------------------------------------------------------------------
# Canonical provider list — single source of truth for provider identity.
# Every code path that lists, displays, or iterates providers derives from
@@ -572,7 +702,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway (200+ models, $5 free credit, no markup)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
ProviderEntry("nvidia", "NVIDIA NIM", "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
ProviderEntry("copilot", "GitHub Copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
@@ -585,6 +715,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("zai", "Z.AI / GLM", "Z.AI / GLM (Zhipu AI direct API)"),
ProviderEntry("kimi-coding", "Kimi / Kimi Coding Plan", "Kimi Coding Plan (api.kimi.com) & Moonshot API"),
ProviderEntry("kimi-coding-cn", "Kimi / Moonshot (China)", "Kimi / Moonshot China (Moonshot CN direct API)"),
ProviderEntry("stepfun", "StepFun Step Plan", "StepFun Step Plan (agent/coding models via Step Plan API)"),
ProviderEntry("minimax", "MiniMax", "MiniMax (global direct API)"),
ProviderEntry("minimax-cn", "MiniMax (China)", "MiniMax China (domestic direct API)"),
ProviderEntry("alibaba", "Alibaba Cloud (DashScope)","Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
@@ -619,6 +750,8 @@ _PROVIDER_ALIASES = {
"moonshot": "kimi-coding",
"kimi-cn": "kimi-coding-cn",
"moonshot-cn": "kimi-coding-cn",
"step": "stepfun",
"stepfun-coding-plan": "stepfun",
"arcee-ai": "arcee",
"arceeai": "arcee",
"minimax-china": "minimax-cn",
@@ -1466,11 +1599,84 @@ def _resolve_copilot_catalog_api_key() -> str:
return ""
# Providers where models.dev is treated as authoritative: curated static
# lists are kept only as an offline fallback and to capture custom additions
# the registry doesn't publish yet. Adding a provider here causes its
# curated list to be merged with fresh models.dev entries (fresh first, any
# curated-only names appended) for both the CLI and the gateway /model picker.
#
# DELIBERATELY EXCLUDED:
# - "openrouter": curated list is already a hand-picked agentic subset of
# OpenRouter's 400+ catalog. Blindly merging would dump everything.
# - "nous": curated list and Portal /models endpoint are the source of
# truth for the subscription tier.
# Also excluded: providers that already have dedicated live-endpoint
# branches below (copilot, anthropic, ai-gateway, ollama-cloud, custom,
# stepfun, openai-codex) — those paths handle freshness themselves.
_MODELS_DEV_PREFERRED: frozenset[str] = frozenset({
"opencode-go",
"opencode-zen",
"deepseek",
"kilocode",
"fireworks",
"mistral",
"togetherai",
"cohere",
"perplexity",
"groq",
"nvidia",
"huggingface",
"zai",
"gemini",
"google",
})
def _merge_with_models_dev(provider: str, curated: list[str]) -> list[str]:
"""Merge curated list with fresh models.dev entries for a preferred provider.
Returns models.dev entries first (in models.dev order), then any
curated-only entries appended. Preserves case for curated fallbacks
(e.g. ``MiniMax-M2.7``) while trusting models.dev for newer variants.
If models.dev is unreachable or returns nothing, the curated list is
returned unchanged this is the offline/CI fallback path.
"""
try:
from agent.models_dev import list_agentic_models
mdev = list_agentic_models(provider)
except Exception:
mdev = []
if not mdev:
return list(curated)
# Case-insensitive dedup while preserving order and curated casing.
seen_lower: set[str] = set()
merged: list[str] = []
for mid in mdev:
key = str(mid).lower()
if key in seen_lower:
continue
seen_lower.add(key)
merged.append(mid)
for mid in curated:
key = str(mid).lower()
if key in seen_lower:
continue
seen_lower.add(key)
merged.append(mid)
return merged
def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False) -> list[str]:
"""Return the best known model catalog for a provider.
Tries live API endpoints for providers that support them (Codex, Nous),
falling back to static lists.
falling back to static lists. For providers in ``_MODELS_DEV_PREFERRED``
(opencode-go/zen, xiaomi, deepseek, smaller inference providers, etc.),
models.dev entries are merged on top of curated so new models released
on the platform appear in ``/model`` without a Hermes release.
"""
normalized = normalize_provider(provider)
if normalized == "openrouter":
@@ -1478,7 +1684,19 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
if normalized == "openai-codex":
from hermes_cli.codex_models import get_codex_model_ids
return get_codex_model_ids()
# Pass the live OAuth access token so the picker matches whatever
# ChatGPT lists for this account right now (new models appear without
# a Hermes release). Falls back to the hardcoded catalog if no token
# or the endpoint is unreachable.
access_token = None
try:
from hermes_cli.auth import resolve_codex_runtime_credentials
creds = resolve_codex_runtime_credentials(refresh_if_expiring=True)
access_token = creds.get("api_key")
except Exception:
access_token = None
return get_codex_model_ids(access_token=access_token)
if normalized in {"copilot", "copilot-acp"}:
try:
live = _fetch_github_models(_resolve_copilot_catalog_api_key())
@@ -1499,6 +1717,19 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
return live
except Exception:
pass
if normalized == "stepfun":
try:
from hermes_cli.auth import resolve_api_key_provider_credentials
creds = resolve_api_key_provider_credentials("stepfun")
api_key = str(creds.get("api_key") or "").strip()
base_url = str(creds.get("base_url") or "").strip()
if api_key and base_url:
live = fetch_api_models(api_key, base_url)
if live:
return live
except Exception:
pass
if normalized == "anthropic":
live = _fetch_anthropic_models()
if live:
@@ -1523,7 +1754,10 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
live = fetch_api_models(api_key, base_url)
if live:
return live
return list(_PROVIDER_MODELS.get(normalized, []))
curated_static = list(_PROVIDER_MODELS.get(normalized, []))
if normalized in _MODELS_DEV_PREFERRED:
return _merge_with_models_dev(normalized, curated_static)
return curated_static
def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
+3 -3
View File
@@ -44,7 +44,7 @@ def _cmd_list(store):
for p in pending:
print(
f" {p['platform']:<12} {p['code']:<10} {p['user_id']:<20} "
f"{p.get('user_name', ''):<20} {p['age_minutes']}m ago"
f"{(p.get('user_name') or ''):<20} {p['age_minutes']}m ago"
)
else:
print("\n No pending pairing requests.")
@@ -54,7 +54,7 @@ def _cmd_list(store):
print(f" {'Platform':<12} {'User ID':<20} {'Name':<20}")
print(f" {'--------':<12} {'-------':<20} {'----':<20}")
for a in approved:
print(f" {a['platform']:<12} {a['user_id']:<20} {a.get('user_name', ''):<20}")
print(f" {a['platform']:<12} {a['user_id']:<20} {(a.get('user_name') or ''):<20}")
else:
print("\n No approved users.")
@@ -69,7 +69,7 @@ def _cmd_approve(store, platform: str, code: str):
result = store.approve_code(platform, code)
if result:
uid = result["user_id"]
name = result.get("user_name", "")
name = result.get("user_name") or ""
display = f"{name} ({uid})" if name else uid
print(f"\n Approved! User {display} on {platform} can now use the bot~")
print(" They'll be recognized automatically on their next message.\n")
+1
View File
@@ -38,6 +38,7 @@ PLATFORMS: OrderedDict[str, PlatformInfo] = OrderedDict([
("qqbot", PlatformInfo(label="💬 QQBot", default_toolset="hermes-qqbot")),
("webhook", PlatformInfo(label="🔗 Webhook", default_toolset="hermes-webhook")),
("api_server", PlatformInfo(label="🌐 API Server", default_toolset="hermes-api-server")),
("cron", PlatformInfo(label="⏰ Cron", default_toolset="hermes-cron")),
])
+290 -62
View File
@@ -133,6 +133,9 @@ def _get_enabled_plugins() -> Optional[set]:
# Data classes
# ---------------------------------------------------------------------------
_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive"}
@dataclass
class PluginManifest:
"""Parsed representation of a plugin.yaml manifest."""
@@ -146,6 +149,23 @@ class PluginManifest:
provides_hooks: List[str] = field(default_factory=list)
source: str = "" # "user", "project", or "entrypoint"
path: Optional[str] = None
# Plugin kind — see plugins.py module docstring for semantics.
# ``standalone`` (default): hooks/tools of its own; opt-in via
# ``plugins.enabled``.
# ``backend``: pluggable backend for an existing core tool (e.g.
# image_gen). Built-in (bundled) backends auto-load;
# user-installed still gated by ``plugins.enabled``.
# ``exclusive``: category with exactly one active provider (memory).
# Selection via ``<category>.provider`` config key; the
# category's own discovery system handles loading and the
# general scanner skips these.
kind: str = "standalone"
# Registry key — path-derived, used by ``plugins.enabled``/``disabled``
# lookups and by ``hermes plugins list``. For a flat plugin at
# ``plugins/disk-cleanup/`` the key is ``disk-cleanup``; for a nested
# category plugin at ``plugins/image_gen/openai/`` the key is
# ``image_gen/openai``. When empty, falls back to ``name``.
key: str = ""
@dataclass
@@ -263,6 +283,7 @@ class PluginContext:
name: str,
handler: Callable,
description: str = "",
args_hint: str = "",
) -> None:
"""Register a slash command (e.g. ``/lcm``) available in CLI and gateway sessions.
@@ -273,6 +294,13 @@ class PluginContext:
terminal commands), this registers in-session slash commands that users
invoke during a conversation.
``args_hint`` is an optional short string (e.g. ``"<file>"`` or
``"dias:7 formato:json"``) used by gateway adapters to surface the
command with an argument field for example Discord's native slash
command picker. Plugin commands without ``args_hint`` register as
parameterless in Discord and still accept trailing text when invoked
as free-form chat.
Names conflicting with built-in commands are rejected with a warning.
"""
clean = name.lower().strip().lstrip("/").replace(" ", "-")
@@ -300,6 +328,7 @@ class PluginContext:
"handler": handler,
"description": description or "Plugin command",
"plugin": self.manifest.name,
"args_hint": (args_hint or "").strip(),
}
logger.debug("Plugin %s registered command: /%s", self.manifest.name, clean)
@@ -366,6 +395,33 @@ class PluginContext:
self.manifest.name, engine.name,
)
# -- image gen provider registration ------------------------------------
def register_image_gen_provider(self, provider) -> None:
"""Register an image generation backend.
``provider`` must be an instance of
:class:`agent.image_gen_provider.ImageGenProvider`. The
``provider.name`` attribute is what ``image_gen.provider`` in
``config.yaml`` matches against when routing ``image_generate``
tool calls.
"""
from agent.image_gen_provider import ImageGenProvider
from agent.image_gen_registry import register_provider
if not isinstance(provider, ImageGenProvider):
logger.warning(
"Plugin '%s' tried to register an image_gen provider that does "
"not inherit from ImageGenProvider. Ignoring.",
self.manifest.name,
)
return
register_provider(provider)
logger.info(
"Plugin '%s' registered image_gen provider: %s",
self.manifest.name, provider.name,
)
# -- hook registration --------------------------------------------------
def register_hook(self, hook_name: str, callback: Callable) -> None:
@@ -456,20 +512,38 @@ class PluginManager:
# Public
# -----------------------------------------------------------------------
def discover_and_load(self) -> None:
"""Scan all plugin sources and load each plugin found."""
if self._discovered:
def discover_and_load(self, force: bool = False) -> None:
"""Scan all plugin sources and load each plugin found.
When ``force`` is true, clear cached discovery state first so config
changes or newly-added bundled backends become visible in long-lived
sessions without requiring a full agent restart.
"""
if self._discovered and not force:
return
if force:
self._plugins.clear()
self._hooks.clear()
self._plugin_tool_names.clear()
self._cli_commands.clear()
self._plugin_commands.clear()
self._plugin_skills.clear()
self._context_engine = None
self._discovered = True
manifests: List[PluginManifest] = []
# 1. Bundled plugins (<repo>/plugins/<name>/)
# Repo-shipped generic plugins live next to hermes_cli/. Memory and
# context_engine subdirs are handled by their own discovery paths, so
# skip those names here. Bundled plugins are discovered (so they
# show up in `hermes plugins`) but only loaded when added to
# `plugins.enabled` in config.yaml — opt-in like any other plugin.
#
# Repo-shipped plugins live next to hermes_cli/. Two layouts are
# supported (see ``_scan_directory`` for details):
#
# - flat: ``plugins/disk-cleanup/plugin.yaml`` (standalone)
# - category: ``plugins/image_gen/openai/plugin.yaml`` (backend)
#
# ``memory/`` and ``context_engine/`` are skipped at the top level —
# they have their own discovery systems. Porting those to the
# category-namespace ``kind: exclusive`` model is a future PR.
repo_plugins = Path(__file__).resolve().parent.parent / "plugins"
manifests.extend(
self._scan_directory(
@@ -492,36 +566,69 @@ class PluginManager:
manifests.extend(self._scan_entry_points())
# Load each manifest (skip user-disabled plugins).
# Later sources override earlier ones on name collision — user plugins
# take precedence over bundled, project plugins take precedence over
# user. Dedup here so we only load the final winner.
# Later sources override earlier ones on key collision — user
# plugins take precedence over bundled, project plugins take
# precedence over user. Dedup here so we only load the final
# winner. Keys are path-derived (``image_gen/openai``,
# ``disk-cleanup``) so ``tts/openai`` and ``image_gen/openai``
# don't collide even when both manifests say ``name: openai``.
disabled = _get_disabled_plugins()
enabled = _get_enabled_plugins() # None = opt-in default (nothing enabled)
winners: Dict[str, PluginManifest] = {}
for manifest in manifests:
winners[manifest.name] = manifest
winners[manifest.key or manifest.name] = manifest
for manifest in winners.values():
# Explicit disable always wins.
if manifest.name in disabled:
lookup_key = manifest.key or manifest.name
# Explicit disable always wins (matches on key or on legacy
# bare name for back-compat with existing user configs).
if lookup_key in disabled or manifest.name in disabled:
loaded = LoadedPlugin(manifest=manifest, enabled=False)
loaded.error = "disabled via config"
self._plugins[manifest.name] = loaded
logger.debug("Skipping disabled plugin '%s'", manifest.name)
self._plugins[lookup_key] = loaded
logger.debug("Skipping disabled plugin '%s'", lookup_key)
continue
# Opt-in gate: plugins must be in the enabled allow-list.
# If the allow-list is missing (None), treat as "nothing enabled"
# — users have to explicitly enable plugins to load them.
# Memory and context_engine providers are excluded from this gate
# since they have their own single-select config (memory.provider
# / context.engine), not the enabled list.
if enabled is None or manifest.name not in enabled:
# Exclusive plugins (memory providers) have their own
# discovery/activation path. The general loader records the
# manifest for introspection but does not load the module.
if manifest.kind == "exclusive":
loaded = LoadedPlugin(manifest=manifest, enabled=False)
loaded.error = "not enabled in config (run `hermes plugins enable {}` to activate)".format(
manifest.name
loaded.error = (
"exclusive plugin — activate via <category>.provider config"
)
self._plugins[manifest.name] = loaded
self._plugins[lookup_key] = loaded
logger.debug(
"Skipping '%s' (not in plugins.enabled)", manifest.name
"Skipping '%s' (exclusive, handled by category discovery)",
lookup_key,
)
continue
# Built-in backends auto-load — they ship with hermes and must
# just work. Selection among them (e.g. which image_gen backend
# services calls) is driven by ``<category>.provider`` config,
# enforced by the tool wrapper.
if manifest.kind == "backend" and manifest.source == "bundled":
self._load_plugin(manifest)
continue
# Everything else (standalone, user-installed backends,
# entry-point plugins) is opt-in via plugins.enabled.
# Accept both the path-derived key and the legacy bare name
# so existing configs keep working.
is_enabled = (
enabled is not None
and (lookup_key in enabled or manifest.name in enabled)
)
if not is_enabled:
loaded = LoadedPlugin(manifest=manifest, enabled=False)
loaded.error = (
"not enabled in config (run `hermes plugins enable {}` to activate)"
.format(lookup_key)
)
self._plugins[lookup_key] = loaded
logger.debug(
"Skipping '%s' (not in plugins.enabled)", lookup_key
)
continue
self._load_plugin(manifest)
@@ -545,9 +652,37 @@ class PluginManager:
) -> List[PluginManifest]:
"""Read ``plugin.yaml`` manifests from subdirectories of *path*.
*skip_names* is an optional allow-list of names to ignore (used
for the bundled scan to exclude ``memory`` / ``context_engine``
subdirs that have their own discovery path).
Supports two layouts, mixed freely:
* **Flat** ``<root>/<plugin-name>/plugin.yaml``. Key is
``<plugin-name>`` (e.g. ``disk-cleanup``).
* **Category** ``<root>/<category>/<plugin-name>/plugin.yaml``,
where the ``<category>`` directory itself has no ``plugin.yaml``.
Key is ``<category>/<plugin-name>`` (e.g. ``image_gen/openai``).
Depth is capped at two segments.
*skip_names* is an optional allow-list of names to ignore at the
top level (kept for back-compat; the current call sites no longer
pass it now that categories are first-class).
"""
return self._scan_directory_level(
path, source, skip_names=skip_names, prefix="", depth=0
)
def _scan_directory_level(
self,
path: Path,
source: str,
*,
skip_names: Optional[Set[str]],
prefix: str,
depth: int,
) -> List[PluginManifest]:
"""Recursive implementation of :meth:`_scan_directory`.
``prefix`` is the category path already accumulated ("" at root,
"image_gen" one level in). ``depth`` is the recursion depth; we
cap at 2 so ``<root>/a/b/c/`` is ignored.
"""
manifests: List[PluginManifest] = []
if not path.is_dir():
@@ -556,37 +691,112 @@ class PluginManager:
for child in sorted(path.iterdir()):
if not child.is_dir():
continue
if skip_names and child.name in skip_names:
if depth == 0 and skip_names and child.name in skip_names:
continue
manifest_file = child / "plugin.yaml"
if not manifest_file.exists():
manifest_file = child / "plugin.yml"
if not manifest_file.exists():
logger.debug("Skipping %s (no plugin.yaml)", child)
if manifest_file.exists():
manifest = self._parse_manifest(
manifest_file, child, source, prefix
)
if manifest is not None:
manifests.append(manifest)
continue
try:
if yaml is None:
logger.warning("PyYAML not installed cannot load %s", manifest_file)
continue
data = yaml.safe_load(manifest_file.read_text()) or {}
manifest = PluginManifest(
name=data.get("name", child.name),
version=str(data.get("version", "")),
description=data.get("description", ""),
author=data.get("author", ""),
requires_env=data.get("requires_env", []),
provides_tools=data.get("provides_tools", []),
provides_hooks=data.get("provides_hooks", []),
source=source,
path=str(child),
# No manifest at this level. If we're still within the depth
# cap, treat this directory as a category namespace and recurse
# one level in looking for children with manifests.
if depth >= 1:
logger.debug("Skipping %s (no plugin.yaml, depth cap reached)", child)
continue
sub_prefix = f"{prefix}/{child.name}" if prefix else child.name
manifests.extend(
self._scan_directory_level(
child,
source,
skip_names=None,
prefix=sub_prefix,
depth=depth + 1,
)
manifests.append(manifest)
except Exception as exc:
logger.warning("Failed to parse %s: %s", manifest_file, exc)
)
return manifests
def _parse_manifest(
self,
manifest_file: Path,
plugin_dir: Path,
source: str,
prefix: str,
) -> Optional[PluginManifest]:
"""Parse a single ``plugin.yaml`` into a :class:`PluginManifest`.
Returns ``None`` on parse failure (logs a warning).
"""
try:
if yaml is None:
logger.warning("PyYAML not installed cannot load %s", manifest_file)
return None
data = yaml.safe_load(manifest_file.read_text()) or {}
name = data.get("name", plugin_dir.name)
key = f"{prefix}/{plugin_dir.name}" if prefix else name
raw_kind = data.get("kind", "standalone")
if not isinstance(raw_kind, str):
raw_kind = "standalone"
kind = raw_kind.strip().lower()
if kind not in _VALID_PLUGIN_KINDS:
logger.warning(
"Plugin %s: unknown kind '%s' (valid: %s); treating as 'standalone'",
key, raw_kind, ", ".join(sorted(_VALID_PLUGIN_KINDS)),
)
kind = "standalone"
# Auto-coerce user-installed memory providers to kind="exclusive"
# so they're routed to plugins/memory discovery instead of being
# loaded by the general PluginManager (which has no
# register_memory_provider on PluginContext). Mirrors the
# heuristic in plugins/memory/__init__.py:_is_memory_provider_dir.
# Bundled memory providers are already skipped via skip_names.
if kind == "standalone" and "kind" not in data:
init_file = plugin_dir / "__init__.py"
if init_file.exists():
try:
source_text = init_file.read_text(errors="replace")[:8192]
if (
"register_memory_provider" in source_text
or "MemoryProvider" in source_text
):
kind = "exclusive"
logger.debug(
"Plugin %s: detected memory provider, "
"treating as kind='exclusive'",
key,
)
except Exception:
pass
return PluginManifest(
name=name,
version=str(data.get("version", "")),
description=data.get("description", ""),
author=data.get("author", ""),
requires_env=data.get("requires_env", []),
provides_tools=data.get("provides_tools", []),
provides_hooks=data.get("provides_hooks", []),
source=source,
path=str(plugin_dir),
kind=kind,
key=key,
)
except Exception as exc:
logger.warning("Failed to parse %s: %s", manifest_file, exc)
return None
# -----------------------------------------------------------------------
# Entry-point scanning
# -----------------------------------------------------------------------
@@ -609,6 +819,7 @@ class PluginManager:
name=ep.name,
source="entrypoint",
path=ep.value,
key=ep.name,
)
manifests.append(manifest)
except Exception as exc:
@@ -670,10 +881,16 @@ class PluginManager:
loaded.error = str(exc)
logger.warning("Failed to load plugin '%s': %s", manifest.name, exc)
self._plugins[manifest.name] = loaded
self._plugins[manifest.key or manifest.name] = loaded
def _load_directory_module(self, manifest: PluginManifest) -> types.ModuleType:
"""Import a directory-based plugin as ``hermes_plugins.<name>``."""
"""Import a directory-based plugin as ``hermes_plugins.<slug>``.
The module slug is derived from ``manifest.key`` so category-namespaced
plugins (``image_gen/openai``) import as
``hermes_plugins.image_gen__openai`` without colliding with any
future ``tts/openai``.
"""
plugin_dir = Path(manifest.path) # type: ignore[arg-type]
init_file = plugin_dir / "__init__.py"
if not init_file.exists():
@@ -686,7 +903,9 @@ class PluginManager:
ns_pkg.__package__ = _NS_PARENT
sys.modules[_NS_PARENT] = ns_pkg
module_name = f"{_NS_PARENT}.{manifest.name.replace('-', '_')}"
key = manifest.key or manifest.name
slug = key.replace("/", "__").replace("-", "_")
module_name = f"{_NS_PARENT}.{slug}"
spec = importlib.util.spec_from_file_location(
module_name,
init_file,
@@ -767,10 +986,12 @@ class PluginManager:
def list_plugins(self) -> List[Dict[str, Any]]:
"""Return a list of info dicts for all discovered plugins."""
result: List[Dict[str, Any]] = []
for name, loaded in sorted(self._plugins.items()):
for key, loaded in sorted(self._plugins.items()):
result.append(
{
"name": name,
"name": loaded.manifest.name,
"key": loaded.manifest.key or loaded.manifest.name,
"kind": loaded.manifest.kind,
"version": loaded.manifest.version,
"description": loaded.manifest.description,
"source": loaded.manifest.source,
@@ -821,9 +1042,13 @@ def get_plugin_manager() -> PluginManager:
return _plugin_manager
def discover_plugins() -> None:
"""Discover and load all plugins (idempotent)."""
get_plugin_manager().discover_and_load()
def discover_plugins(force: bool = False) -> None:
"""Discover and load all plugins.
Default behavior is idempotent. Pass ``force=True`` to rescan plugin
manifests and reload state in the current process.
"""
get_plugin_manager().discover_and_load(force=force)
def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
@@ -874,10 +1099,13 @@ def get_pre_tool_call_block_message(
return None
def _ensure_plugins_discovered() -> PluginManager:
"""Return the global manager after running idempotent plugin discovery."""
def _ensure_plugins_discovered(force: bool = False) -> PluginManager:
"""Return the global manager after ensuring plugin discovery has run.
Pass ``force=True`` to rescan in the current process.
"""
manager = get_plugin_manager()
manager.discover_and_load()
manager.discover_and_load(force=force)
return manager
+41 -15
View File
@@ -863,19 +863,15 @@ def _safe_extract_profile_archive(archive: Path, destination: Path) -> None:
pass
def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
"""Import a profile from a tar.gz archive.
def _inspect_profile_archive_roots(archive: Path) -> set[str]:
"""Return the archive's top-level directory names.
If *name* is not given, infers it from the archive's top-level directory.
Returns the imported profile directory.
Profile imports expect exactly one root directory. Inspecting the archive
before extraction lets us stage the import safely instead of mutating a
live profile tree first and reconciling names later.
"""
import tarfile
archive = Path(archive_path)
if not archive.exists():
raise FileNotFoundError(f"Archive not found: {archive}")
# Peek at the archive to find the top-level directory name
with tarfile.open(archive, "r:gz") as tf:
top_dirs = {
parts[0]
@@ -889,13 +885,33 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
for member in tf.getmembers()
if member.isdir()
}
return top_dirs
inferred_name = name or (top_dirs.pop() if len(top_dirs) == 1 else None)
def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
"""Import a profile from a tar.gz archive.
If *name* is not given, infers it from the archive's top-level directory.
Returns the imported profile directory.
"""
import tempfile
archive = Path(archive_path)
if not archive.exists():
raise FileNotFoundError(f"Archive not found: {archive}")
top_dirs = _inspect_profile_archive_roots(archive)
archive_root = top_dirs.pop() if len(top_dirs) == 1 else None
inferred_name = name or archive_root
if not inferred_name:
raise ValueError(
"Cannot determine profile name from archive. "
"Specify it explicitly: hermes profile import <archive> --name <name>"
)
if archive_root is None:
raise ValueError(
"Profile archive must contain exactly one top-level directory."
)
# Archives exported from the default profile have "default/" as top-level
# dir. Importing as "default" would target ~/.hermes itself — disallow
@@ -914,12 +930,22 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
profiles_root = _get_profiles_root()
profiles_root.mkdir(parents=True, exist_ok=True)
_safe_extract_profile_archive(archive, profiles_root)
with tempfile.TemporaryDirectory(prefix="hermes_profile_import_") as tmpdir:
staging_root = Path(tmpdir)
_safe_extract_profile_archive(archive, staging_root)
# If the archive extracted under a different name, rename
extracted = profiles_root / (top_dirs.pop() if top_dirs else inferred_name)
if extracted != profile_dir and extracted.exists():
extracted.rename(profile_dir)
extracted = staging_root / archive_root
if not extracted.is_dir():
raise ValueError(
f"Profile archive root is missing or invalid: {archive_root}"
)
final_source = extracted
if archive_root != inferred_name:
final_source = staging_root / inferred_name
extracted.rename(final_source)
shutil.move(str(final_source), str(profile_dir))
return profile_dir
+23
View File
@@ -94,6 +94,12 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
transport="openai_chat",
base_url_env_var="KIMI_BASE_URL",
),
"stepfun": HermesOverlay(
transport="openai_chat",
extra_env_vars=("STEPFUN_API_KEY",),
base_url_override="https://api.stepfun.ai/step_plan/v1",
base_url_env_var="STEPFUN_BASE_URL",
),
"minimax": HermesOverlay(
transport="anthropic_messages",
base_url_env_var="MINIMAX_BASE_URL",
@@ -210,6 +216,10 @@ ALIASES: Dict[str, str] = {
"kimi-coding-cn": "kimi-for-coding",
"moonshot": "kimi-for-coding",
# stepfun
"step": "stepfun",
"stepfun-coding-plan": "stepfun",
# minimax-cn
"minimax-china": "minimax-cn",
"minimax_cn": "minimax-cn",
@@ -294,6 +304,7 @@ _LABEL_OVERRIDES: Dict[str, str] = {
"nous": "Nous Portal",
"openai-codex": "OpenAI Codex",
"copilot-acp": "GitHub Copilot ACP",
"stepfun": "StepFun Step Plan",
"xiaomi": "Xiaomi MiMo",
"local": "Local endpoint",
"bedrock": "AWS Bedrock",
@@ -427,6 +438,16 @@ def determine_api_mode(provider: str, base_url: str = "") -> str:
"""
pdef = get_provider(provider)
if pdef is not None:
# Even for known providers, check URL heuristics for special endpoints
# (e.g. kimi /coding endpoint needs anthropic_messages even on 'custom')
if base_url:
url_lower = base_url.rstrip("/").lower()
if "api.kimi.com/coding" in url_lower:
return "anthropic_messages"
if url_lower.endswith("/anthropic") or "api.anthropic.com" in url_lower:
return "anthropic_messages"
if "api.openai.com" in url_lower:
return "codex_responses"
return TRANSPORT_TO_API_MODE.get(pdef.transport, "chat_completions")
# Direct provider checks for providers not in HERMES_OVERLAYS
@@ -439,6 +460,8 @@ def determine_api_mode(provider: str, base_url: str = "") -> str:
hostname = base_url_hostname(base_url)
if url_lower.endswith("/anthropic") or hostname == "api.anthropic.com":
return "anthropic_messages"
if hostname == "api.kimi.com" and "/coding" in url_lower:
return "anthropic_messages"
if hostname == "api.openai.com":
return "codex_responses"
if hostname.startswith("bedrock-runtime.") and base_url_host_matches(base_url, "amazonaws.com"):
+9 -2
View File
@@ -46,6 +46,9 @@ def _detect_api_mode_for_url(base_url: str) -> Optional[str]:
protocol under a ``/anthropic`` suffix treat those as
``anthropic_messages`` transport instead of the default
``chat_completions``.
- Kimi Code's ``api.kimi.com/coding`` endpoint also speaks the
Anthropic Messages protocol (the /coding route accepts Claude
Code's native request shape).
"""
normalized = (base_url or "").strip().lower().rstrip("/")
hostname = base_url_hostname(base_url)
@@ -55,6 +58,8 @@ def _detect_api_mode_for_url(base_url: str) -> Optional[str]:
return "codex_responses"
if normalized.endswith("/anthropic"):
return "anthropic_messages"
if hostname == "api.kimi.com" and "/coding" in normalized:
return "anthropic_messages"
return None
@@ -205,7 +210,8 @@ def _resolve_runtime_from_pool_entry(
api_mode = opencode_model_api_mode(provider, model_cfg.get("default", ""))
else:
# Auto-detect Anthropic-compatible endpoints (/anthropic suffix,
# api.openai.com → codex_responses, api.x.ai → codex_responses).
# Kimi /coding, api.openai.com → codex_responses, api.x.ai →
# codex_responses).
detected = _detect_api_mode_for_url(base_url)
if detected:
api_mode = detected
@@ -660,7 +666,8 @@ def _resolve_explicit_runtime(
if configured_mode:
api_mode = configured_mode
else:
# Auto-detect Anthropic-compatible endpoints (/anthropic suffix).
# Auto-detect from URL (Anthropic /anthropic suffix,
# api.openai.com → Responses, Kimi /coding, etc.).
detected = _detect_api_mode_for_url(base_url)
if detected:
api_mode = detected
+41 -3
View File
@@ -96,13 +96,14 @@ _DEFAULT_PROVIDER_MODELS = {
"zai": ["glm-5.1", "glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
"kimi-coding": ["kimi-k2.6", "kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
"kimi-coding-cn": ["kimi-k2.6", "kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
"stepfun": ["step-3.5-flash", "step-3.5-flash-2603"],
"arcee": ["trinity-large-thinking", "trinity-large-preview", "trinity-mini"],
"minimax": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1", "MiniMax-M2"],
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1", "MiniMax-M2"],
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
"opencode-zen": ["gpt-5.4", "gpt-5.3-codex", "claude-sonnet-4-6", "gemini-3-flash", "glm-5", "kimi-k2.5", "minimax-m2.7"],
"opencode-go": ["kimi-k2.6", "kimi-k2.5", "glm-5.1", "glm-5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.5", "minimax-m2.7", "qwen3.6-plus", "qwen3.5-plus"],
"opencode-go": ["kimi-k2.6", "kimi-k2.5", "glm-5.1", "glm-5", "mimo-v2.5-pro", "mimo-v2.5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.7", "minimax-m2.5", "qwen3.6-plus", "qwen3.5-plus"],
"huggingface": [
"Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
"Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
@@ -408,13 +409,36 @@ def _print_setup_summary(config: dict, hermes_home):
("Browser Automation", False, missing_browser_hint)
)
# FAL (image generation)
# Image generation — FAL (direct or via Nous), or any plugin-registered
# provider (OpenAI, etc.)
if subscription_features.image_gen.managed_by_nous:
tool_status.append(("Image Generation (Nous subscription)", True, None))
elif subscription_features.image_gen.available:
tool_status.append(("Image Generation", True, None))
else:
tool_status.append(("Image Generation", False, "FAL_KEY"))
# Fall back to probing plugin-registered providers so OpenAI-only
# setups don't show as "missing FAL_KEY".
_img_backend = None
try:
from agent.image_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
for _p in list_providers():
if _p.name == "fal":
continue
try:
if _p.is_available():
_img_backend = _p.display_name
break
except Exception:
continue
except Exception:
pass
if _img_backend:
tool_status.append((f"Image Generation ({_img_backend})", True, None))
else:
tool_status.append(("Image Generation", False, "FAL_KEY or OPENAI_API_KEY"))
# TTS — show configured provider
tts_provider = config.get("tts", {}).get("provider", "edge")
@@ -781,6 +805,7 @@ def setup_model_provider(config: dict, *, quick: bool = False):
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"kimi-coding-cn": "Kimi / Moonshot (China)",
"stepfun": "StepFun Step Plan",
"minimax": "MiniMax",
"minimax-cn": "MiniMax CN",
"anthropic": "Anthropic",
@@ -2309,6 +2334,7 @@ def setup_gateway(config: dict):
launchd_install,
launchd_start,
launchd_restart,
UserSystemdUnavailableError,
)
service_installed = _is_service_installed()
@@ -2332,6 +2358,10 @@ def setup_gateway(config: dict):
systemd_restart()
elif _is_macos:
launchd_restart()
except UserSystemdUnavailableError as e:
print_error(" Restart failed — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except Exception as e:
print_error(f" Restart failed: {e}")
elif service_installed:
@@ -2341,6 +2371,10 @@ def setup_gateway(config: dict):
systemd_start()
elif _is_macos:
launchd_start()
except UserSystemdUnavailableError as e:
print_error(" Start failed — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except Exception as e:
print_error(f" Start failed: {e}")
elif supports_service_manager:
@@ -2364,6 +2398,10 @@ def setup_gateway(config: dict):
systemd_start(system=installed_scope == "system")
elif _is_macos:
launchd_start()
except UserSystemdUnavailableError as e:
print_error(" Start failed — user systemd not reachable:")
for line in str(e).splitlines():
print(f" {line}")
except Exception as e:
print_error(f" Start failed: {e}")
except Exception as e:
+71 -7
View File
@@ -30,6 +30,14 @@ All fields are optional. Missing values inherit from the ``default`` skin.
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Input area horizontal rule
response_border: "#FFD700" # Response box border (ANSI)
status_bar_bg: "#1a1a2e" # Status bar background
status_bar_text: "#C0C0C0" # Status bar default text
status_bar_strong: "#FFD700" # Status bar highlighted text
status_bar_dim: "#8B8682" # Status bar separators/muted text
status_bar_good: "#8FBC8F" # Healthy context usage
status_bar_warn: "#FFD700" # Warning context usage
status_bar_bad: "#FF8C00" # High context usage
status_bar_critical: "#FF6B6B" # Critical context usage
session_label: "#DAA520" # Session label color
session_border: "#8B8682" # Session ID dim color
status_bar_bg: "#1a1a2e" # TUI status/usage bar background
@@ -170,6 +178,7 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#FFF8DC",
"input_rule": "#CD7F32",
"response_border": "#FFD700",
"status_bar_bg": "#1a1a2e",
"session_label": "#DAA520",
"session_border": "#8B8682",
},
@@ -203,6 +212,14 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#F1E6CF",
"input_rule": "#9F1C1C",
"response_border": "#C7A96B",
"status_bar_bg": "#2A1212",
"status_bar_text": "#F1E6CF",
"status_bar_strong": "#C7A96B",
"status_bar_dim": "#6E584B",
"status_bar_good": "#7BC96F",
"status_bar_warn": "#C7A96B",
"status_bar_bad": "#DD4A3A",
"status_bar_critical": "#EF5350",
"session_label": "#C7A96B",
"session_border": "#6E584B",
},
@@ -267,6 +284,14 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#c9d1d9",
"input_rule": "#444444",
"response_border": "#aaaaaa",
"status_bar_bg": "#1F1F1F",
"status_bar_text": "#C9D1D9",
"status_bar_strong": "#E6EDF3",
"status_bar_dim": "#777777",
"status_bar_good": "#B5B5B5",
"status_bar_warn": "#AAAAAA",
"status_bar_bad": "#D0D0D0",
"status_bar_critical": "#F0F0F0",
"session_label": "#888888",
"session_border": "#555555",
},
@@ -298,6 +323,14 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#c9d1d9",
"input_rule": "#4169e1",
"response_border": "#7eb8f6",
"status_bar_bg": "#151C2F",
"status_bar_text": "#C9D1D9",
"status_bar_strong": "#7EB8F6",
"status_bar_dim": "#4B5563",
"status_bar_good": "#63D0A6",
"status_bar_warn": "#E6A855",
"status_bar_bad": "#F7A072",
"status_bar_critical": "#FF7A7A",
"session_label": "#7eb8f6",
"session_border": "#4b5563",
},
@@ -403,6 +436,14 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#EAF7FF",
"input_rule": "#2A6FB9",
"response_border": "#5DB8F5",
"status_bar_bg": "#0F2440",
"status_bar_text": "#EAF7FF",
"status_bar_strong": "#A9DFFF",
"status_bar_dim": "#496884",
"status_bar_good": "#6ED7B0",
"status_bar_warn": "#5DB8F5",
"status_bar_bad": "#2A6FB9",
"status_bar_critical": "#D94F4F",
"session_label": "#A9DFFF",
"session_border": "#496884",
},
@@ -467,6 +508,14 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#F5F5F5",
"input_rule": "#656565",
"response_border": "#B7B7B7",
"status_bar_bg": "#202020",
"status_bar_text": "#D3D3D3",
"status_bar_strong": "#F5F5F5",
"status_bar_dim": "#656565",
"status_bar_good": "#B7B7B7",
"status_bar_warn": "#D3D3D3",
"status_bar_bad": "#E7E7E7",
"status_bar_critical": "#F5F5F5",
"session_label": "#919191",
"session_border": "#656565",
},
@@ -532,6 +581,14 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"prompt": "#FFF0D4",
"input_rule": "#C75B1D",
"response_border": "#F29C38",
"status_bar_bg": "#2B160E",
"status_bar_text": "#FFF0D4",
"status_bar_strong": "#FFD39A",
"status_bar_dim": "#6C4724",
"status_bar_good": "#6BCB77",
"status_bar_warn": "#F29C38",
"status_bar_bad": "#E2832B",
"status_bar_critical": "#EF5350",
"session_label": "#FFD39A",
"session_border": "#6C4724",
},
@@ -770,6 +827,13 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
warn = skin.get_color("ui_warn", "#FF8C00")
error = skin.get_color("ui_error", "#FF6B6B")
status_bg = skin.get_color("status_bar_bg", "#1a1a2e")
status_text = skin.get_color("status_bar_text", text)
status_strong = skin.get_color("status_bar_strong", title)
status_dim = skin.get_color("status_bar_dim", dim)
status_good = skin.get_color("status_bar_good", skin.get_color("ui_ok", "#8FBC8F"))
status_warn = skin.get_color("status_bar_warn", warn)
status_bad = skin.get_color("status_bar_bad", skin.get_color("banner_accent", warn))
status_critical = skin.get_color("status_bar_critical", error)
voice_bg = skin.get_color("voice_status_bg", status_bg)
menu_bg = skin.get_color("completion_menu_bg", "#1a1a2e")
menu_current_bg = skin.get_color("completion_menu_current_bg", "#333355")
@@ -782,13 +846,13 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
"prompt": prompt,
"prompt-working": f"{dim} italic",
"hint": f"{dim} italic",
"status-bar": f"bg:{status_bg} {text}",
"status-bar-strong": f"bg:{status_bg} {title} bold",
"status-bar-dim": f"bg:{status_bg} {dim}",
"status-bar-good": f"bg:{status_bg} {skin.get_color('ui_ok', '#8FBC8F')} bold",
"status-bar-warn": f"bg:{status_bg} {warn} bold",
"status-bar-bad": f"bg:{status_bg} {skin.get_color('banner_accent', warn)} bold",
"status-bar-critical": f"bg:{status_bg} {error} bold",
"status-bar": f"bg:{status_bg} {status_text}",
"status-bar-strong": f"bg:{status_bg} {status_strong} bold",
"status-bar-dim": f"bg:{status_bg} {status_dim}",
"status-bar-good": f"bg:{status_bg} {status_good} bold",
"status-bar-warn": f"bg:{status_bg} {status_warn} bold",
"status-bar-bad": f"bg:{status_bg} {status_bad} bold",
"status-bar-critical": f"bg:{status_bg} {status_critical} bold",
"input-rule": input_rule,
"image-badge": f"{label} bold",
"completion-menu": f"bg:{menu_bg} {text}",
+2
View File
@@ -122,6 +122,7 @@ def show_status(args):
"OpenAI": "OPENAI_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
"MiniMax": "MINIMAX_API_KEY",
"MiniMax-CN": "MINIMAX_CN_API_KEY",
"Firecrawl": "FIRECRAWL_API_KEY",
@@ -252,6 +253,7 @@ def show_status(args):
apikey_providers = {
"Z.AI / GLM": ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
"Kimi / Moonshot": ("KIMI_API_KEY",),
"StepFun Step Plan": ("STEPFUN_API_KEY",),
"MiniMax": ("MINIMAX_API_KEY",),
"MiniMax (China)": ("MINIMAX_CN_API_KEY",),
}
+1
View File
@@ -289,6 +289,7 @@ TIPS = [
"When a provider returns HTTP 402 (payment required), the auxiliary client auto-falls back to the next one.",
"agent.tool_use_enforcement steers models that describe actions instead of calling tools — auto for GPT/Codex.",
"agent.restart_drain_timeout (default 60s) lets running agents finish before a gateway restart takes effect.",
"agent.api_max_retries (default 3) controls how many times the agent retries a failed API call before surfacing the error — lower it for fast fallback.",
"The gateway caches AIAgent instances per session — destroying this cache breaks Anthropic prompt caching.",
"Any website can expose skills via /.well-known/skills/index.json — the skills hub discovers them automatically.",
"The skills audit log at ~/.hermes/skills/.hub/audit.log tracks every install and removal operation.",
+276 -5
View File
@@ -67,12 +67,13 @@ CONFIGURABLE_TOOLSETS = [
("messaging", "📨 Cross-Platform Messaging", "send_message"),
("rl", "🧪 RL Training", "Tinker-Atropos training tools"),
("homeassistant", "🏠 Home Assistant", "smart home device control"),
("discord_admin", "🛡️ Discord Server Admin", "list channels/roles, pin, assign roles"),
]
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl"}
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "discord_admin"}
def _get_effective_configurable_toolsets():
@@ -549,7 +550,7 @@ def _get_platform_tools(
include_default_mcp_servers: bool = True,
) -> Set[str]:
"""Resolve which individual toolset names are enabled for a platform."""
from toolsets import resolve_toolset
from toolsets import resolve_toolset, TOOLSETS
platform_toolsets = config.get("platform_toolsets") or {}
toolset_names = platform_toolsets.get(platform)
@@ -563,6 +564,8 @@ def _get_platform_tools(
toolset_names = [str(ts) for ts in toolset_names]
configurable_keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
plugin_ts_keys = _get_plugin_toolset_keys()
platform_default_keys = {p["default_toolset"] for p in PLATFORMS.values()}
# If the saved list contains any configurable keys directly, the user
# has explicitly configured this platform — use direct membership.
@@ -585,16 +588,46 @@ def _get_platform_tools(
ts_tools = set(resolve_toolset(ts_key))
if ts_tools and ts_tools.issubset(all_tool_names):
enabled_toolsets.add(ts_key)
default_off = set(_DEFAULT_OFF_TOOLSETS)
if platform in default_off:
default_off.remove(platform)
enabled_toolsets -= default_off
# Recover non-configurable platform toolsets (e.g. discord, feishu_doc,
# feishu_drive). These are part of the platform's default composite but
# absent from CONFIGURABLE_TOOLSETS, so they can't appear in the TUI
# checklist or in a user-saved config. Must run in BOTH branches —
# otherwise saving via `hermes tools` (which flips has_explicit_config
# to True) silently drops them.
platform_tool_universe = set(resolve_toolset(PLATFORMS[platform]["default_toolset"]))
configurable_tool_universe = set()
for ck in configurable_keys:
configurable_tool_universe.update(resolve_toolset(ck))
claimed = set()
for ts_key in enabled_toolsets:
claimed.update(resolve_toolset(ts_key))
skip = configurable_keys | plugin_ts_keys | platform_default_keys
skip |= {k for k in TOOLSETS if k.startswith("hermes-")}
skip |= set(_DEFAULT_OFF_TOOLSETS) - {platform}
for ts_key, ts_def in TOOLSETS.items():
if ts_key in skip:
continue
if ts_def.get("includes"):
continue
ts_tools = set(resolve_toolset(ts_key))
if not ts_tools or not ts_tools.issubset(platform_tool_universe):
continue
if ts_tools.issubset(configurable_tool_universe):
continue
if not ts_tools.issubset(claimed):
enabled_toolsets.add(ts_key)
claimed.update(ts_tools)
# Plugin toolsets: enabled by default unless explicitly disabled.
# A plugin toolset is "known" for a platform once `hermes tools`
# has been saved for that platform (tracked via known_plugin_toolsets).
# Unknown plugins default to enabled; known-but-absent = disabled.
plugin_ts_keys = _get_plugin_toolset_keys()
if plugin_ts_keys:
known_map = config.get("known_plugin_toolsets", {})
known_for_platform = set(known_map.get(platform, []))
@@ -609,7 +642,6 @@ def _get_platform_tools(
# Preserve any explicit non-configurable toolset entries (for example,
# custom toolsets or MCP server names saved in platform_toolsets).
platform_default_keys = {p["default_toolset"] for p in PLATFORMS.values()}
explicit_passthrough = {
ts
for ts in toolset_names
@@ -669,6 +701,7 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
existing_toolsets = config.get("platform_toolsets", {}).get(platform, [])
if not isinstance(existing_toolsets, list):
existing_toolsets = []
existing_toolsets = [str(ts) for ts in existing_toolsets]
# Preserve any entries that are NOT configurable toolsets and NOT platform
# defaults (i.e. only MCP server names should be preserved)
@@ -676,6 +709,8 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
entry for entry in existing_toolsets
if entry not in configurable_keys and entry not in platform_default_keys
}
if "no_mcp" not in enabled_toolset_keys:
preserved_entries.discard("no_mcp")
# Merge preserved entries with new enabled toolsets
config["platform_toolsets"][platform] = sorted(enabled_toolset_keys | preserved_entries)
@@ -847,6 +882,51 @@ def _configure_toolset(ts_key: str, config: dict):
_configure_simple_requirements(ts_key)
def _plugin_image_gen_providers() -> list[dict]:
"""Build picker-row dicts from plugin-registered image gen providers.
Each returned dict looks like a regular ``TOOL_CATEGORIES`` provider
row but carries an ``image_gen_plugin_name`` marker so downstream
code (config writing, model picker) knows to route through the
plugin registry instead of the in-tree FAL backend.
FAL is skipped it's already exposed by the hardcoded
``TOOL_CATEGORIES["image_gen"]`` entries. When FAL gets ported to
a plugin in a follow-up PR, the hardcoded entries go away and this
function surfaces it alongside OpenAI automatically.
"""
try:
from agent.image_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
providers = list_providers()
except Exception:
return []
rows: list[dict] = []
for provider in providers:
if getattr(provider, "name", None) == "fal":
# FAL has its own hardcoded rows today.
continue
try:
schema = provider.get_setup_schema()
except Exception:
continue
if not isinstance(schema, dict):
continue
rows.append(
{
"name": schema.get("name", provider.display_name),
"badge": schema.get("badge", ""),
"tag": schema.get("tag", ""),
"env_vars": schema.get("env_vars", []),
"image_gen_plugin_name": provider.name,
}
)
return rows
def _visible_providers(cat: dict, config: dict) -> list[dict]:
"""Return provider entries visible for the current auth/config state."""
features = get_nous_subscription_features(config)
@@ -857,6 +937,12 @@ def _visible_providers(cat: dict, config: dict) -> list[dict]:
if provider.get("requires_nous_auth") and not features.nous_auth_present:
continue
visible.append(provider)
# Inject plugin-registered image_gen backends (OpenAI today, more
# later) so the picker lists them alongside FAL / Nous Subscription.
if cat.get("name") == "Image Generation":
visible.extend(_plugin_image_gen_providers())
return visible
@@ -876,7 +962,24 @@ def _toolset_needs_configuration_prompt(ts_key: str, config: dict) -> bool:
browser_cfg = config.get("browser", {})
return not isinstance(browser_cfg, dict) or "cloud_provider" not in browser_cfg
if ts_key == "image_gen":
return not fal_key_is_configured()
# Satisfied when the in-tree FAL backend is configured OR any
# plugin-registered image gen provider is available.
if fal_key_is_configured():
return False
try:
from agent.image_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
for provider in list_providers():
try:
if provider.is_available():
return False
except Exception:
continue
except Exception:
pass
return True
return not _toolset_has_keys(ts_key, config)
@@ -951,6 +1054,11 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
def _is_provider_active(provider: dict, config: dict) -> bool:
"""Check if a provider entry matches the currently active config."""
plugin_name = provider.get("image_gen_plugin_name")
if plugin_name:
image_cfg = config.get("image_gen", {})
return isinstance(image_cfg, dict) and image_cfg.get("provider") == plugin_name
managed_feature = provider.get("managed_nous_feature")
if managed_feature:
features = get_nous_subscription_features(config)
@@ -958,6 +1066,13 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
if feature is None:
return False
if managed_feature == "image_gen":
image_cfg = config.get("image_gen", {})
if isinstance(image_cfg, dict):
configured_provider = image_cfg.get("provider")
if configured_provider not in (None, "", "fal"):
return False
if image_cfg.get("use_gateway") is False:
return False
return feature.managed_by_nous
if provider.get("tts_provider"):
return (
@@ -980,6 +1095,16 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
if provider.get("web_backend"):
current = config.get("web", {}).get("backend")
return current == provider["web_backend"]
if provider.get("imagegen_backend"):
image_cfg = config.get("image_gen", {})
if not isinstance(image_cfg, dict):
return False
configured_provider = image_cfg.get("provider")
return (
provider["imagegen_backend"] == "fal"
and configured_provider in (None, "", "fal")
and not image_cfg.get("use_gateway")
)
return False
@@ -1095,6 +1220,100 @@ def _configure_imagegen_model(backend_name: str, config: dict) -> None:
_print_success(f" Model set to: {chosen}")
def _plugin_image_gen_catalog(plugin_name: str):
"""Return ``(catalog_dict, default_model_id)`` for a plugin provider.
``catalog_dict`` is shaped like the legacy ``FAL_MODELS`` table
``{model_id: {"display", "speed", "strengths", "price", ...}}``
so the existing picker code paths work without change. Returns
``({}, None)`` if the provider isn't registered or has no models.
"""
try:
from agent.image_gen_registry import get_provider
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
provider = get_provider(plugin_name)
except Exception:
return {}, None
if provider is None:
return {}, None
try:
models = provider.list_models() or []
default = provider.default_model()
except Exception:
return {}, None
catalog = {m["id"]: m for m in models if isinstance(m, dict) and "id" in m}
return catalog, default
def _configure_imagegen_model_for_plugin(plugin_name: str, config: dict) -> None:
"""Prompt the user to pick a model for a plugin-registered backend.
Writes selection to ``image_gen.model``. Mirrors
:func:`_configure_imagegen_model` but sources its catalog from the
plugin registry instead of :data:`IMAGEGEN_BACKENDS`.
"""
catalog, default_model = _plugin_image_gen_catalog(plugin_name)
if not catalog:
return
cur_cfg = config.setdefault("image_gen", {})
if not isinstance(cur_cfg, dict):
cur_cfg = {}
config["image_gen"] = cur_cfg
current_model = cur_cfg.get("model") or default_model
if current_model not in catalog:
current_model = default_model
model_ids = list(catalog.keys())
ordered = [current_model] + [m for m in model_ids if m != current_model]
widths = {
"model": max(len(m) for m in model_ids),
"speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
"strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
}
print()
header = (
f" {'Model':<{widths['model']}} "
f"{'Speed':<{widths['speed']}} "
f"{'Strengths':<{widths['strengths']}} "
f"Price"
)
print(color(header, Colors.CYAN))
rows = []
for mid in ordered:
row = _format_imagegen_model_row(mid, catalog[mid], widths)
if mid == current_model:
row += " ← currently in use"
rows.append(row)
idx = _prompt_choice(
f" Choose {plugin_name} model:",
rows,
default=0,
)
chosen = ordered[idx]
cur_cfg["model"] = chosen
_print_success(f" Model set to: {chosen}")
def _select_plugin_image_gen_provider(plugin_name: str, config: dict) -> None:
"""Persist a plugin-backed image generation provider selection."""
img_cfg = config.setdefault("image_gen", {})
if not isinstance(img_cfg, dict):
img_cfg = {}
config["image_gen"] = img_cfg
img_cfg["provider"] = plugin_name
img_cfg["use_gateway"] = False
_print_success(f" image_gen.provider set to: {plugin_name}")
_configure_imagegen_model_for_plugin(plugin_name, config)
def _configure_provider(provider: dict, config: dict):
"""Configure a single provider - prompt for API keys and set config."""
env_vars = provider.get("env_vars", [])
@@ -1151,10 +1370,22 @@ def _configure_provider(provider: dict, config: dict):
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
# Plugin-registered image_gen provider: write image_gen.provider
# and route model selection to the plugin's own catalog.
plugin_name = provider.get("image_gen_plugin_name")
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
# Imagegen backends prompt for model selection after backend pick.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
# In-tree FAL is the only non-plugin backend today. Keep
# image_gen.provider clear so the dispatch shim falls through
# to the legacy FAL path.
img_cfg = config.setdefault("image_gen", {})
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
img_cfg["provider"] = "fal"
return
# Prompt for each required env var
@@ -1189,10 +1420,17 @@ def _configure_provider(provider: dict, config: dict):
if all_configured:
_print_success(f" {provider['name']} configured!")
plugin_name = provider.get("image_gen_plugin_name")
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
# Imagegen backends prompt for model selection after env vars are in.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
img_cfg = config.setdefault("image_gen", {})
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
img_cfg["provider"] = "fal"
def _configure_simple_requirements(ts_key: str):
@@ -1358,16 +1596,39 @@ def _reconfigure_provider(provider: dict, config: dict):
config.setdefault("web", {})["backend"] = provider["web_backend"]
_print_success(f" Web backend set to: {provider['web_backend']}")
if managed_feature and managed_feature not in ("web", "tts", "browser"):
section = config.setdefault(managed_feature, {})
if not isinstance(section, dict):
section = {}
config[managed_feature] = section
section["use_gateway"] = True
elif not managed_feature:
for cat_key, cat in TOOL_CATEGORIES.items():
if provider in cat.get("providers", []):
section = config.get(cat_key)
if isinstance(section, dict) and section.get("use_gateway"):
section["use_gateway"] = False
break
if not env_vars:
if provider.get("post_setup"):
_run_post_setup(provider["post_setup"])
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
plugin_name = provider.get("image_gen_plugin_name")
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
if backend == "fal":
img_cfg = config.setdefault("image_gen", {})
if isinstance(img_cfg, dict):
img_cfg["provider"] = "fal"
img_cfg["use_gateway"] = False
return
for var in env_vars:
@@ -1386,9 +1647,19 @@ def _reconfigure_provider(provider: dict, config: dict):
_print_info(" Kept current")
# Imagegen backends prompt for model selection on reconfig too.
plugin_name = provider.get("image_gen_plugin_name")
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
if backend == "fal":
img_cfg = config.setdefault("image_gen", {})
if isinstance(img_cfg, dict):
img_cfg["provider"] = "fal"
img_cfg["use_gateway"] = False
def _reconfigure_simple_requirements(ts_key: str):
+548
View File
@@ -0,0 +1,548 @@
"""Process-wide voice recording + TTS API for the TUI gateway.
Wraps ``tools.voice_mode`` (recording/transcription) and ``tools.tts_tool``
(text-to-speech) behind idempotent, stateful entry points that the gateway's
``voice.record``, ``voice.toggle``, and ``voice.tts`` JSON-RPC handlers can
call from a dedicated thread. The gateway imports this module lazily so that
missing optional audio deps (sounddevice, faster-whisper, numpy) surface as
an ``ImportError`` at call time, not at startup.
Two usage modes are exposed:
* **Push-to-talk** (``start_recording`` / ``stop_and_transcribe``) single
manually-bounded capture used when the caller drives the start/stop pair
explicitly.
* **Continuous (VAD)** (``start_continuous`` / ``stop_continuous``) mirrors
the classic CLI voice mode: recording auto-stops on silence, transcribes,
hands the result to a callback, and then auto-restarts for the next turn.
Three consecutive no-speech cycles stop the loop and fire
``on_silent_limit`` so the UI can turn the mode off.
"""
from __future__ import annotations
import logging
import os
import sys
import threading
from typing import Any, Callable, Optional
from tools.voice_mode import (
create_audio_recorder,
is_whisper_hallucination,
play_audio_file,
transcribe_recording,
)
logger = logging.getLogger(__name__)
def _debug(msg: str) -> None:
"""Emit a debug breadcrumb when HERMES_VOICE_DEBUG=1.
Goes to stderr so the TUI gateway wraps it as a gateway.stderr event,
which createGatewayEventHandler shows as an Activity line exactly
what we need to diagnose "why didn't the loop auto-restart?" in the
user's real terminal without shipping a separate debug RPC.
Any OSError / BrokenPipeError is swallowed because this fires from
background threads (silence callback, TTS daemon, beep) where a
broken stderr pipe must not kill the whole gateway the main
command pipe (stdin+stdout) is what actually matters.
"""
if os.environ.get("HERMES_VOICE_DEBUG", "").strip() != "1":
return
try:
print(f"[voice] {msg}", file=sys.stderr, flush=True)
except (BrokenPipeError, OSError):
pass
def _beeps_enabled() -> bool:
"""CLI parity: voice.beep_enabled in config.yaml (default True)."""
try:
from hermes_cli.config import load_config
voice_cfg = load_config().get("voice", {})
if isinstance(voice_cfg, dict):
return bool(voice_cfg.get("beep_enabled", True))
except Exception:
pass
return True
def _play_beep(frequency: int, count: int = 1) -> None:
"""Audible cue matching cli.py's record/stop beeps.
880 Hz single-beep on start (cli.py:_voice_start_recording line 7532),
660 Hz double-beep on stop (cli.py:_voice_stop_and_transcribe line 7585).
Best-effort sounddevice failures are silently swallowed so the
voice loop never breaks because a speaker was unavailable.
"""
if not _beeps_enabled():
return
try:
from tools.voice_mode import play_beep
play_beep(frequency=frequency, count=count)
except Exception as e:
_debug(f"beep {frequency}Hz failed: {e}")
# ── Push-to-talk state ───────────────────────────────────────────────
_recorder = None
_recorder_lock = threading.Lock()
# ── Continuous (VAD) state ───────────────────────────────────────────
_continuous_lock = threading.Lock()
_continuous_active = False
_continuous_recorder: Any = None
# ── TTS-vs-STT feedback guard ────────────────────────────────────────
# When TTS plays the agent reply over the speakers, the live microphone
# picks it up and transcribes the agent's own voice as user input — an
# infinite loop the agent happily joins ("Ha, looks like we're in a loop").
# This Event mirrors cli.py:_voice_tts_done: cleared while speak_text is
# playing, set while silent. _continuous_on_silence waits on it before
# re-arming the recorder, and speak_text itself cancels any live capture
# before starting playback so the tail of the previous utterance doesn't
# leak into the mic.
_tts_playing = threading.Event()
_tts_playing.set() # initially "not playing"
_continuous_on_transcript: Optional[Callable[[str], None]] = None
_continuous_on_status: Optional[Callable[[str], None]] = None
_continuous_on_silent_limit: Optional[Callable[[], None]] = None
_continuous_no_speech_count = 0
_CONTINUOUS_NO_SPEECH_LIMIT = 3
# ── Push-to-talk API ─────────────────────────────────────────────────
def start_recording() -> None:
"""Begin capturing from the default input device (push-to-talk).
Idempotent calling again while a recording is in progress is a no-op.
"""
global _recorder
with _recorder_lock:
if _recorder is not None and getattr(_recorder, "is_recording", False):
return
rec = create_audio_recorder()
rec.start()
_recorder = rec
def stop_and_transcribe() -> Optional[str]:
"""Stop the active push-to-talk recording, transcribe, return text.
Returns ``None`` when no recording is active, when the microphone
captured no speech, or when Whisper returned a known hallucination.
"""
global _recorder
with _recorder_lock:
rec = _recorder
_recorder = None
if rec is None:
return None
wav_path = rec.stop()
if not wav_path:
return None
try:
result = transcribe_recording(wav_path)
except Exception as e:
logger.warning("voice transcription failed: %s", e)
return None
finally:
try:
if os.path.isfile(wav_path):
os.unlink(wav_path)
except Exception:
pass
# transcribe_recording returns {"success": bool, "transcript": str, ...}
# — matches cli.py:_voice_stop_and_transcribe's result.get("transcript").
if not result.get("success"):
return None
text = (result.get("transcript") or "").strip()
if not text or is_whisper_hallucination(text):
return None
return text
# ── Continuous (VAD) API ─────────────────────────────────────────────
def start_continuous(
on_transcript: Callable[[str], None],
on_status: Optional[Callable[[str], None]] = None,
on_silent_limit: Optional[Callable[[], None]] = None,
silence_threshold: int = 200,
silence_duration: float = 3.0,
) -> None:
"""Start a VAD-driven continuous recording loop.
The loop calls ``on_transcript(text)`` each time speech is detected and
transcribed successfully, then auto-restarts. After
``_CONTINUOUS_NO_SPEECH_LIMIT`` consecutive silent cycles (no speech
picked up at all) the loop stops itself and calls ``on_silent_limit``
so the UI can reflect "voice off". Idempotent calling while already
active is a no-op.
``on_status`` is called with ``"listening"`` / ``"transcribing"`` /
``"idle"`` so the UI can show a live indicator.
"""
global _continuous_active, _continuous_recorder
global _continuous_on_transcript, _continuous_on_status, _continuous_on_silent_limit
global _continuous_no_speech_count
with _continuous_lock:
if _continuous_active:
_debug("start_continuous: already active — no-op")
return
_continuous_active = True
_continuous_on_transcript = on_transcript
_continuous_on_status = on_status
_continuous_on_silent_limit = on_silent_limit
_continuous_no_speech_count = 0
if _continuous_recorder is None:
_continuous_recorder = create_audio_recorder()
_continuous_recorder._silence_threshold = silence_threshold
_continuous_recorder._silence_duration = silence_duration
rec = _continuous_recorder
_debug(
f"start_continuous: begin (threshold={silence_threshold}, duration={silence_duration}s)"
)
# CLI parity: single 880 Hz beep *before* opening the stream — placing
# the beep after stream.start() on macOS triggers a CoreAudio conflict
# (cli.py:7528 comment).
_play_beep(frequency=880, count=1)
try:
rec.start(on_silence_stop=_continuous_on_silence)
except Exception as e:
logger.error("failed to start continuous recording: %s", e)
_debug(f"start_continuous: rec.start raised {type(e).__name__}: {e}")
with _continuous_lock:
_continuous_active = False
raise
if on_status:
try:
on_status("listening")
except Exception:
pass
def stop_continuous() -> None:
"""Stop the active continuous loop and release the microphone.
Idempotent calling while not active is a no-op. Any in-flight
transcription completes but its result is discarded (the callback
checks ``_continuous_active`` before firing).
"""
global _continuous_active, _continuous_on_transcript
global _continuous_on_status, _continuous_on_silent_limit
global _continuous_recorder, _continuous_no_speech_count
with _continuous_lock:
if not _continuous_active:
return
_continuous_active = False
rec = _continuous_recorder
on_status = _continuous_on_status
_continuous_on_transcript = None
_continuous_on_status = None
_continuous_on_silent_limit = None
_continuous_no_speech_count = 0
if rec is not None:
try:
# cancel() (not stop()) discards buffered frames — the loop
# is over, we don't want to transcribe a half-captured turn.
rec.cancel()
except Exception as e:
logger.warning("failed to cancel recorder: %s", e)
# Audible "recording stopped" cue (CLI parity: same 660 Hz × 2 the
# silence-auto-stop path plays).
_play_beep(frequency=660, count=2)
if on_status:
try:
on_status("idle")
except Exception:
pass
def is_continuous_active() -> bool:
"""Whether a continuous voice loop is currently running."""
with _continuous_lock:
return _continuous_active
def _continuous_on_silence() -> None:
"""AudioRecorder silence callback — runs in a daemon thread.
Stops the current capture, transcribes, delivers the text via
``on_transcript``, and if the loop is still active starts the
next capture. Three consecutive silent cycles end the loop.
"""
global _continuous_active, _continuous_no_speech_count
_debug("_continuous_on_silence: fired")
with _continuous_lock:
if not _continuous_active:
_debug("_continuous_on_silence: loop inactive — abort")
return
rec = _continuous_recorder
on_transcript = _continuous_on_transcript
on_status = _continuous_on_status
on_silent_limit = _continuous_on_silent_limit
if rec is None:
_debug("_continuous_on_silence: no recorder — abort")
return
if on_status:
try:
on_status("transcribing")
except Exception:
pass
wav_path = rec.stop()
# Peak RMS is the critical diagnostic when stop() returns None despite
# the VAD firing — tells us at a glance whether the mic was too quiet
# for SILENCE_RMS_THRESHOLD (200) or the VAD + peak checks disagree.
peak_rms = getattr(rec, "_peak_rms", -1)
_debug(
f"_continuous_on_silence: rec.stop -> {wav_path!r} (peak_rms={peak_rms})"
)
# CLI parity: double 660 Hz beep after the stream stops (safe from the
# CoreAudio conflict that blocks pre-start beeps).
_play_beep(frequency=660, count=2)
transcript: Optional[str] = None
if wav_path:
try:
result = transcribe_recording(wav_path)
# transcribe_recording returns {"success": bool, "transcript": str,
# "error": str?} — NOT {"text": str}. Using the wrong key silently
# produced empty transcripts even when Groq/local STT returned fine,
# which masqueraded as "not hearing the user" to the caller.
success = bool(result.get("success"))
text = (result.get("transcript") or "").strip()
err = result.get("error")
_debug(
f"_continuous_on_silence: transcribe -> success={success} "
f"text={text!r} err={err!r}"
)
if success and text and not is_whisper_hallucination(text):
transcript = text
except Exception as e:
logger.warning("continuous transcription failed: %s", e)
_debug(f"_continuous_on_silence: transcribe raised {type(e).__name__}: {e}")
finally:
try:
if os.path.isfile(wav_path):
os.unlink(wav_path)
except Exception:
pass
with _continuous_lock:
if not _continuous_active:
# User stopped us while we were transcribing — discard.
_debug("_continuous_on_silence: stopped during transcribe — no restart")
return
if transcript:
_continuous_no_speech_count = 0
else:
_continuous_no_speech_count += 1
should_halt = _continuous_no_speech_count >= _CONTINUOUS_NO_SPEECH_LIMIT
no_speech = _continuous_no_speech_count
if transcript and on_transcript:
try:
on_transcript(transcript)
except Exception as e:
logger.warning("on_transcript callback raised: %s", e)
if should_halt:
_debug(f"_continuous_on_silence: {no_speech} silent cycles — halting")
with _continuous_lock:
_continuous_active = False
_continuous_no_speech_count = 0
if on_silent_limit:
try:
on_silent_limit()
except Exception:
pass
try:
rec.cancel()
except Exception:
pass
if on_status:
try:
on_status("idle")
except Exception:
pass
return
# CLI parity (cli.py:10619-10621): wait for any in-flight TTS to
# finish before re-arming the mic, then leave a small gap to avoid
# catching the tail of the speaker output. Without this the voice
# loop becomes a feedback loop — the agent's spoken reply lands
# back in the mic and gets re-submitted.
if not _tts_playing.is_set():
_debug("_continuous_on_silence: waiting for TTS to finish")
_tts_playing.wait(timeout=60)
import time as _time
_time.sleep(0.3)
# User may have stopped the loop during the wait.
with _continuous_lock:
if not _continuous_active:
_debug("_continuous_on_silence: stopped while waiting for TTS")
return
# Restart for the next turn.
_debug(f"_continuous_on_silence: restarting loop (no_speech={no_speech})")
_play_beep(frequency=880, count=1)
try:
rec.start(on_silence_stop=_continuous_on_silence)
except Exception as e:
logger.error("failed to restart continuous recording: %s", e)
_debug(f"_continuous_on_silence: restart raised {type(e).__name__}: {e}")
with _continuous_lock:
_continuous_active = False
return
if on_status:
try:
on_status("listening")
except Exception:
pass
# ── TTS API ──────────────────────────────────────────────────────────
def speak_text(text: str) -> None:
"""Synthesize ``text`` with the configured TTS provider and play it.
Mirrors cli.py:_voice_speak_response exactly same markdown strip
pipeline, same 4000-char cap, same explicit mp3 output path, same
MP3-over-OGG playback choice (afplay misbehaves on OGG), same cleanup
of both extensions. Keeping these in sync means a voice-mode TTS
session in the TUI sounds identical to one in the classic CLI.
While playback is in flight the module-level _tts_playing Event is
cleared so the continuous-recording loop knows to wait before
re-arming the mic (otherwise the agent's spoken reply feedback-loops
through the microphone and the agent ends up replying to itself).
"""
if not text or not text.strip():
return
import re
import tempfile
import time
# Cancel any live capture before we open the speakers — otherwise the
# last ~200ms of the user's turn tail + the first syllables of our TTS
# both end up in the next recording window. The continuous loop will
# re-arm itself after _tts_playing flips back (see _continuous_on_silence).
paused_recording = False
with _continuous_lock:
if (
_continuous_active
and _continuous_recorder is not None
and getattr(_continuous_recorder, "is_recording", False)
):
try:
_continuous_recorder.cancel()
paused_recording = True
except Exception as e:
logger.warning("failed to pause recorder for TTS: %s", e)
_tts_playing.clear()
_debug(f"speak_text: TTS begin (paused_recording={paused_recording})")
try:
from tools.tts_tool import text_to_speech_tool
tts_text = text[:4000] if len(text) > 4000 else text
tts_text = re.sub(r'```[\s\S]*?```', ' ', tts_text) # fenced code blocks
tts_text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', tts_text) # [text](url) → text
tts_text = re.sub(r'https?://\S+', '', tts_text) # bare URLs
tts_text = re.sub(r'\*\*(.+?)\*\*', r'\1', tts_text) # bold
tts_text = re.sub(r'\*(.+?)\*', r'\1', tts_text) # italic
tts_text = re.sub(r'`(.+?)`', r'\1', tts_text) # inline code
tts_text = re.sub(r'^#+\s*', '', tts_text, flags=re.MULTILINE) # headers
tts_text = re.sub(r'^\s*[-*]\s+', '', tts_text, flags=re.MULTILINE) # list bullets
tts_text = re.sub(r'---+', '', tts_text) # horizontal rules
tts_text = re.sub(r'\n{3,}', '\n\n', tts_text) # excess newlines
tts_text = tts_text.strip()
if not tts_text:
return
# MP3 output path, pre-chosen so we can play the MP3 directly even
# when text_to_speech_tool auto-converts to OGG for messaging
# platforms. afplay's OGG support is flaky, MP3 always works.
os.makedirs(os.path.join(tempfile.gettempdir(), "hermes_voice"), exist_ok=True)
mp3_path = os.path.join(
tempfile.gettempdir(),
"hermes_voice",
f"tts_{time.strftime('%Y%m%d_%H%M%S')}.mp3",
)
_debug(f"speak_text: synthesizing {len(tts_text)} chars -> {mp3_path}")
text_to_speech_tool(text=tts_text, output_path=mp3_path)
if os.path.isfile(mp3_path) and os.path.getsize(mp3_path) > 0:
_debug(f"speak_text: playing {mp3_path} ({os.path.getsize(mp3_path)} bytes)")
play_audio_file(mp3_path)
try:
os.unlink(mp3_path)
ogg_path = mp3_path.rsplit(".", 1)[0] + ".ogg"
if os.path.isfile(ogg_path):
os.unlink(ogg_path)
except OSError:
pass
else:
_debug(f"speak_text: TTS tool produced no audio at {mp3_path}")
except Exception as e:
logger.warning("Voice TTS playback failed: %s", e)
_debug(f"speak_text raised {type(e).__name__}: {e}")
finally:
_tts_playing.set()
_debug("speak_text: TTS done")
# Re-arm the mic so the user can answer without pressing Ctrl+B.
# Small delay lets the OS flush speaker output and afplay fully
# release the audio device before sounddevice re-opens the input.
if paused_recording:
time.sleep(0.3)
with _continuous_lock:
if _continuous_active and _continuous_recorder is not None:
try:
_continuous_recorder.start(
on_silence_stop=_continuous_on_silence
)
_debug("speak_text: recording resumed after TTS")
except Exception as e:
logger.warning(
"failed to resume recorder after TTS: %s", e
)
+298 -24
View File
@@ -71,6 +71,7 @@ app = FastAPI(title="Hermes Agent", version=__version__)
# Injected into the SPA HTML so only the legitimate web UI can use it.
# ---------------------------------------------------------------------------
_SESSION_TOKEN = secrets.token_urlsafe(32)
_SESSION_HEADER_NAME = "X-Hermes-Session-Token"
# Simple rate limiter for the reveal endpoint
_reveal_timestamps: List[float] = []
@@ -104,14 +105,29 @@ _PUBLIC_API_PATHS: frozenset = frozenset({
})
def _require_token(request: Request) -> None:
"""Validate the ephemeral session token. Raises 401 on mismatch.
def _has_valid_session_token(request: Request) -> bool:
"""True if the request carries a valid dashboard session token.
Uses ``hmac.compare_digest`` to prevent timing side-channels.
The dedicated session header avoids collisions with reverse proxies that
already use ``Authorization`` (for example Caddy ``basic_auth``). We still
accept the legacy Bearer path for backward compatibility with older
dashboard bundles.
"""
session_header = request.headers.get(_SESSION_HEADER_NAME, "")
if session_header and hmac.compare_digest(
session_header.encode(),
_SESSION_TOKEN.encode(),
):
return True
auth = request.headers.get("authorization", "")
expected = f"Bearer {_SESSION_TOKEN}"
if not hmac.compare_digest(auth.encode(), expected.encode()):
return hmac.compare_digest(auth.encode(), expected.encode())
def _require_token(request: Request) -> None:
"""Validate the ephemeral session token. Raises 401 on mismatch."""
if not _has_valid_session_token(request):
raise HTTPException(status_code=401, detail="Unauthorized")
@@ -205,9 +221,7 @@ async def auth_middleware(request: Request, call_next):
"""Require the session token on all /api/ routes except the public list."""
path = request.url.path
if path.startswith("/api/") and path not in _PUBLIC_API_PATHS and not path.startswith("/api/plugins/"):
auth = request.headers.get("authorization", "")
expected = f"Bearer {_SESSION_TOKEN}"
if not hmac.compare_digest(auth.encode(), expected.encode()):
if not _has_valid_session_token(request):
return JSONResponse(
status_code=401,
content={"detail": "Unauthorized"},
@@ -417,7 +431,14 @@ class EnvVarReveal(BaseModel):
_GATEWAY_HEALTH_URL = os.getenv("GATEWAY_HEALTH_URL")
_GATEWAY_HEALTH_TIMEOUT = float(os.getenv("GATEWAY_HEALTH_TIMEOUT", "3"))
try:
_GATEWAY_HEALTH_TIMEOUT = float(os.getenv("GATEWAY_HEALTH_TIMEOUT", "3"))
except (ValueError, TypeError):
_log.warning(
"Invalid GATEWAY_HEALTH_TIMEOUT value %r — using default 3.0s",
os.getenv("GATEWAY_HEALTH_TIMEOUT"),
)
_GATEWAY_HEALTH_TIMEOUT = 3.0
def _probe_gateway_health() -> tuple[bool, dict | None]:
@@ -2189,7 +2210,8 @@ async def get_usage_analytics(days: int = 30):
SUM(reasoning_tokens) as reasoning_tokens,
COALESCE(SUM(estimated_cost_usd), 0) as estimated_cost,
COALESCE(SUM(actual_cost_usd), 0) as actual_cost,
COUNT(*) as sessions
COUNT(*) as sessions,
SUM(COALESCE(api_call_count, 0)) as api_calls
FROM sessions WHERE started_at > ?
GROUP BY day ORDER BY day
""", (cutoff,))
@@ -2200,7 +2222,8 @@ async def get_usage_analytics(days: int = 30):
SUM(input_tokens) as input_tokens,
SUM(output_tokens) as output_tokens,
COALESCE(SUM(estimated_cost_usd), 0) as estimated_cost,
COUNT(*) as sessions
COUNT(*) as sessions,
SUM(COALESCE(api_call_count, 0)) as api_calls
FROM sessions WHERE started_at > ? AND model IS NOT NULL
GROUP BY model ORDER BY SUM(input_tokens) + SUM(output_tokens) DESC
""", (cutoff,))
@@ -2213,7 +2236,8 @@ async def get_usage_analytics(days: int = 30):
SUM(reasoning_tokens) as total_reasoning,
COALESCE(SUM(estimated_cost_usd), 0) as total_estimated_cost,
COALESCE(SUM(actual_cost_usd), 0) as total_actual_cost,
COUNT(*) as total_sessions
COUNT(*) as total_sessions,
SUM(COALESCE(api_call_count, 0)) as total_api_calls
FROM sessions WHERE started_at > ?
""", (cutoff,))
totals = dict(cur3.fetchone())
@@ -2301,8 +2325,227 @@ _BUILTIN_DASHBOARD_THEMES = [
]
def _parse_theme_layer(value: Any, default_hex: str, default_alpha: float = 1.0) -> Optional[Dict[str, Any]]:
"""Normalise a theme layer spec from YAML into `{hex, alpha}` form.
Accepts shorthand (a bare hex string) or full dict form. Returns
``None`` on garbage input so the caller can fall back to a built-in
default rather than blowing up.
"""
if value is None:
return {"hex": default_hex, "alpha": default_alpha}
if isinstance(value, str):
return {"hex": value, "alpha": default_alpha}
if isinstance(value, dict):
hex_val = value.get("hex", default_hex)
alpha_val = value.get("alpha", default_alpha)
if not isinstance(hex_val, str):
return None
try:
alpha_f = float(alpha_val)
except (TypeError, ValueError):
alpha_f = default_alpha
return {"hex": hex_val, "alpha": max(0.0, min(1.0, alpha_f))}
return None
_THEME_DEFAULT_TYPOGRAPHY: Dict[str, str] = {
"fontSans": 'system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif',
"fontMono": 'ui-monospace, "SF Mono", "Cascadia Mono", Menlo, Consolas, monospace',
"baseSize": "15px",
"lineHeight": "1.55",
"letterSpacing": "0",
}
_THEME_DEFAULT_LAYOUT: Dict[str, str] = {
"radius": "0.5rem",
"density": "comfortable",
}
_THEME_OVERRIDE_KEYS = {
"card", "cardForeground", "popover", "popoverForeground",
"primary", "primaryForeground", "secondary", "secondaryForeground",
"muted", "mutedForeground", "accent", "accentForeground",
"destructive", "destructiveForeground", "success", "warning",
"border", "input", "ring",
}
# Well-known named asset slots themes can populate. Any other keys under
# ``assets.custom`` are exposed as ``--theme-asset-custom-<key>`` CSS vars
# for plugin/shell use.
_THEME_NAMED_ASSET_KEYS = {"bg", "hero", "logo", "crest", "sidebar", "header"}
# Component-style buckets themes can override. The value under each bucket
# is a mapping from camelCase property name to CSS string; each pair emits
# ``--component-<bucket>-<kebab-property>`` on :root. The frontend's shell
# components (Card, App header, Backdrop, etc.) consume these vars so themes
# can restyle chrome (clip-path, border-image, segmented progress, etc.)
# without shipping their own CSS.
_THEME_COMPONENT_BUCKETS = {
"card", "header", "footer", "sidebar", "tab",
"progress", "badge", "backdrop", "page",
}
_THEME_LAYOUT_VARIANTS = {"standard", "cockpit", "tiled"}
# Cap on customCSS length so a malformed/oversized theme YAML can't blow up
# the response payload or the <style> tag. 32 KiB is plenty for every
# practical reskin (the Strike Freedom demo is ~2 KiB).
_THEME_CUSTOM_CSS_MAX = 32 * 1024
def _normalise_theme_definition(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""Normalise a user theme YAML into the wire format `ThemeProvider`
expects. Returns ``None`` if the theme is unusable.
Accepts both the full schema (palette/typography/layout) and a loose
form with bare hex strings, so hand-written YAMLs stay friendly.
"""
if not isinstance(data, dict):
return None
name = data.get("name")
if not isinstance(name, str) or not name.strip():
return None
# Palette
palette_src = data.get("palette", {}) if isinstance(data.get("palette"), dict) else {}
# Allow top-level `colors.background` as a shorthand too.
colors_src = data.get("colors", {}) if isinstance(data.get("colors"), dict) else {}
def _layer(key: str, default_hex: str, default_alpha: float = 1.0) -> Dict[str, Any]:
spec = palette_src.get(key, colors_src.get(key))
parsed = _parse_theme_layer(spec, default_hex, default_alpha)
return parsed if parsed is not None else {"hex": default_hex, "alpha": default_alpha}
palette = {
"background": _layer("background", "#041c1c", 1.0),
"midground": _layer("midground", "#ffe6cb", 1.0),
"foreground": _layer("foreground", "#ffffff", 0.0),
"warmGlow": palette_src.get("warmGlow") or data.get("warmGlow") or "rgba(255, 189, 56, 0.35)",
"noiseOpacity": 1.0,
}
raw_noise = palette_src.get("noiseOpacity", data.get("noiseOpacity"))
try:
palette["noiseOpacity"] = float(raw_noise) if raw_noise is not None else 1.0
except (TypeError, ValueError):
palette["noiseOpacity"] = 1.0
# Typography
typo_src = data.get("typography", {}) if isinstance(data.get("typography"), dict) else {}
typography = dict(_THEME_DEFAULT_TYPOGRAPHY)
for key in ("fontSans", "fontMono", "fontDisplay", "fontUrl", "baseSize", "lineHeight", "letterSpacing"):
val = typo_src.get(key)
if isinstance(val, str) and val.strip():
typography[key] = val
# Layout
layout_src = data.get("layout", {}) if isinstance(data.get("layout"), dict) else {}
layout = dict(_THEME_DEFAULT_LAYOUT)
radius = layout_src.get("radius")
if isinstance(radius, str) and radius.strip():
layout["radius"] = radius
density = layout_src.get("density")
if isinstance(density, str) and density in ("compact", "comfortable", "spacious"):
layout["density"] = density
# Color overrides — keep only valid keys with string values.
overrides_src = data.get("colorOverrides", {})
color_overrides: Dict[str, str] = {}
if isinstance(overrides_src, dict):
for key, val in overrides_src.items():
if key in _THEME_OVERRIDE_KEYS and isinstance(val, str) and val.strip():
color_overrides[key] = val
# Assets — named slots + arbitrary user-defined keys. Values must be
# strings (URLs or CSS ``url(...)``/``linear-gradient(...)`` expressions).
# We don't fetch remote assets here; the frontend just injects them as
# CSS vars. Empty values are dropped so a theme can explicitly clear a
# slot by setting ``hero: ""``.
assets_out: Dict[str, Any] = {}
assets_src = data.get("assets", {}) if isinstance(data.get("assets"), dict) else {}
for key in _THEME_NAMED_ASSET_KEYS:
val = assets_src.get(key)
if isinstance(val, str) and val.strip():
assets_out[key] = val
custom_assets_src = assets_src.get("custom")
if isinstance(custom_assets_src, dict):
custom_assets: Dict[str, str] = {}
for key, val in custom_assets_src.items():
if (
isinstance(key, str)
and key.replace("-", "").replace("_", "").isalnum()
and isinstance(val, str)
and val.strip()
):
custom_assets[key] = val
if custom_assets:
assets_out["custom"] = custom_assets
# Custom CSS — raw CSS text the frontend injects as a scoped <style>
# tag on theme apply. Clipped to _THEME_CUSTOM_CSS_MAX to keep the
# payload bounded. We intentionally do NOT parse/sanitise the CSS
# here — the dashboard is localhost-only and themes are user-authored
# YAML in ~/.hermes/, same trust level as the config file itself.
custom_css_val = data.get("customCSS")
custom_css: Optional[str] = None
if isinstance(custom_css_val, str) and custom_css_val.strip():
custom_css = custom_css_val[:_THEME_CUSTOM_CSS_MAX]
# Component style overrides — per-bucket dicts of camelCase CSS
# property -> CSS string. The frontend converts these into CSS vars
# that shell components (Card, App header, Backdrop) consume.
component_styles_src = data.get("componentStyles", {})
component_styles: Dict[str, Dict[str, str]] = {}
if isinstance(component_styles_src, dict):
for bucket, props in component_styles_src.items():
if bucket not in _THEME_COMPONENT_BUCKETS or not isinstance(props, dict):
continue
clean: Dict[str, str] = {}
for prop, value in props.items():
if (
isinstance(prop, str)
and prop.replace("-", "").replace("_", "").isalnum()
and isinstance(value, (str, int, float))
and str(value).strip()
):
clean[prop] = str(value)
if clean:
component_styles[bucket] = clean
layout_variant_src = data.get("layoutVariant")
layout_variant = (
layout_variant_src
if isinstance(layout_variant_src, str) and layout_variant_src in _THEME_LAYOUT_VARIANTS
else "standard"
)
result: Dict[str, Any] = {
"name": name,
"label": data.get("label") or name,
"description": data.get("description", ""),
"palette": palette,
"typography": typography,
"layout": layout,
"layoutVariant": layout_variant,
}
if color_overrides:
result["colorOverrides"] = color_overrides
if assets_out:
result["assets"] = assets_out
if custom_css is not None:
result["customCSS"] = custom_css
if component_styles:
result["componentStyles"] = component_styles
return result
def _discover_user_themes() -> list:
"""Scan ~/.hermes/dashboard-themes/*.yaml for user-created themes."""
"""Scan ~/.hermes/dashboard-themes/*.yaml for user-created themes.
Returns a list of fully-normalised theme definitions ready to ship
to the frontend, so the client can apply them without a secondary
round-trip or a built-in stub.
"""
themes_dir = get_hermes_home() / "dashboard-themes"
if not themes_dir.is_dir():
return []
@@ -2310,33 +2553,42 @@ def _discover_user_themes() -> list:
for f in sorted(themes_dir.glob("*.yaml")):
try:
data = yaml.safe_load(f.read_text(encoding="utf-8"))
if isinstance(data, dict) and data.get("name"):
result.append({
"name": data["name"],
"label": data.get("label", data["name"]),
"description": data.get("description", ""),
})
except Exception:
continue
normalised = _normalise_theme_definition(data)
if normalised is not None:
result.append(normalised)
return result
@app.get("/api/dashboard/themes")
async def get_dashboard_themes():
"""Return available themes and the currently active one."""
"""Return available themes and the currently active one.
Built-in entries ship name/label/description only (the frontend owns
their full definitions in `web/src/themes/presets.ts`). User themes
from `~/.hermes/dashboard-themes/*.yaml` ship with their full
normalised definition under `definition`, so the client can apply
them without a stub.
"""
config = load_config()
active = config.get("dashboard", {}).get("theme", "default")
user_themes = _discover_user_themes()
# Merge built-in + user, user themes override built-in by name.
seen = set()
themes = []
for t in _BUILTIN_DASHBOARD_THEMES:
seen.add(t["name"])
themes.append(t)
for t in user_themes:
if t["name"] not in seen:
themes.append(t)
seen.add(t["name"])
if t["name"] in seen:
continue
themes.append({
"name": t["name"],
"label": t["label"],
"description": t["description"],
"definition": t,
})
seen.add(t["name"])
return {"themes": themes, "active": active}
@@ -2393,13 +2645,35 @@ def _discover_dashboard_plugins() -> list:
if name in seen_names:
continue
seen_names.add(name)
# Tab options: ``path`` + ``position`` for a new tab, optional
# ``override`` to replace a built-in route, and ``hidden`` to
# register the plugin component/slots without adding a tab
# (useful for slot-only plugins like a header-crest injector).
raw_tab = data.get("tab", {}) if isinstance(data.get("tab"), dict) else {}
tab_info = {
"path": raw_tab.get("path", f"/{name}"),
"position": raw_tab.get("position", "end"),
}
override_path = raw_tab.get("override")
if isinstance(override_path, str) and override_path.startswith("/"):
tab_info["override"] = override_path
if bool(raw_tab.get("hidden")):
tab_info["hidden"] = True
# Slots: list of named slot locations this plugin populates.
# The frontend exposes ``registerSlot(pluginName, slotName, Component)``
# on window; plugins with non-empty slots call it from their JS bundle.
slots_src = data.get("slots")
slots: List[str] = []
if isinstance(slots_src, list):
slots = [s for s in slots_src if isinstance(s, str) and s]
plugins.append({
"name": name,
"label": data.get("label", name),
"description": data.get("description", ""),
"icon": data.get("icon", "Puzzle"),
"version": data.get("version", "0.0.0"),
"tab": data.get("tab", {"path": f"/{name}", "position": "end"}),
"tab": tab_info,
"slots": slots,
"entry": data.get("entry", "dist/index.js"),
"css": data.get("css"),
"has_api": bool(data.get("api")),
+154 -6
View File
@@ -31,7 +31,7 @@ T = TypeVar("T")
DEFAULT_DB_PATH = get_hermes_home() / "state.db"
SCHEMA_VERSION = 6
SCHEMA_VERSION = 8
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS schema_version (
@@ -65,6 +65,7 @@ CREATE TABLE IF NOT EXISTS sessions (
cost_source TEXT,
pricing_version TEXT,
title TEXT,
api_call_count INTEGER DEFAULT 0,
FOREIGN KEY (parent_session_id) REFERENCES sessions(id)
);
@@ -80,10 +81,16 @@ CREATE TABLE IF NOT EXISTS messages (
token_count INTEGER,
finish_reason TEXT,
reasoning TEXT,
reasoning_content TEXT,
reasoning_details TEXT,
codex_reasoning_items TEXT
);
CREATE TABLE IF NOT EXISTS state_meta (
key TEXT PRIMARY KEY,
value TEXT
);
CREATE INDEX IF NOT EXISTS idx_sessions_source ON sessions(source);
CREATE INDEX IF NOT EXISTS idx_sessions_parent ON sessions(parent_session_id);
CREATE INDEX IF NOT EXISTS idx_sessions_started ON sessions(started_at DESC);
@@ -329,6 +336,26 @@ class SessionDB:
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 6")
if current_version < 7:
# v7: preserve provider-native reasoning_content separately from
# normalized reasoning text. Kimi/Moonshot replay can require
# this field on assistant tool-call messages when thinking is on.
try:
cursor.execute('ALTER TABLE messages ADD COLUMN "reasoning_content" TEXT')
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 7")
if current_version < 8:
# v8: add api_call_count column to sessions — tracks the number
# of individual LLM API calls made within a session (as opposed
# to the session count itself).
try:
cursor.execute(
'ALTER TABLE sessions ADD COLUMN "api_call_count" INTEGER DEFAULT 0'
)
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 8")
# Unique title index — always ensure it exists (safe to run after migrations
# since the title column is guaranteed to exist at this point)
@@ -435,6 +462,7 @@ class SessionDB:
billing_provider: Optional[str] = None,
billing_base_url: Optional[str] = None,
billing_mode: Optional[str] = None,
api_call_count: int = 0,
absolute: bool = False,
) -> None:
"""Update token counters and backfill model if not already set.
@@ -464,7 +492,8 @@ class SessionDB:
billing_provider = COALESCE(billing_provider, ?),
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
model = COALESCE(model, ?),
api_call_count = ?
WHERE id = ?"""
else:
sql = """UPDATE sessions SET
@@ -484,7 +513,8 @@ class SessionDB:
billing_provider = COALESCE(billing_provider, ?),
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
model = COALESCE(model, ?),
api_call_count = COALESCE(api_call_count, 0) + ?
WHERE id = ?"""
params = (
input_tokens,
@@ -502,6 +532,7 @@ class SessionDB:
billing_base_url,
billing_mode,
model,
api_call_count,
session_id,
)
def _do(conn):
@@ -922,6 +953,7 @@ class SessionDB:
token_count: int = None,
finish_reason: str = None,
reasoning: str = None,
reasoning_content: str = None,
reasoning_details: Any = None,
codex_reasoning_items: Any = None,
) -> int:
@@ -951,8 +983,8 @@ class SessionDB:
cursor = conn.execute(
"""INSERT INTO messages (session_id, role, content, tool_call_id,
tool_calls, tool_name, timestamp, token_count, finish_reason,
reasoning, reasoning_details, codex_reasoning_items)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
reasoning, reasoning_content, reasoning_details, codex_reasoning_items)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
(
session_id,
role,
@@ -964,6 +996,7 @@ class SessionDB:
token_count,
finish_reason,
reasoning,
reasoning_content,
reasoning_details_json,
codex_items_json,
),
@@ -1014,7 +1047,7 @@ class SessionDB:
with self._lock:
cursor = self._conn.execute(
"SELECT role, content, tool_call_id, tool_calls, tool_name, "
"reasoning, reasoning_details, codex_reasoning_items "
"reasoning, reasoning_content, reasoning_details, codex_reasoning_items "
"FROM messages WHERE session_id = ? ORDER BY timestamp, id",
(session_id,),
)
@@ -1038,6 +1071,8 @@ class SessionDB:
if row["role"] == "assistant":
if row["reasoning"]:
msg["reasoning"] = row["reasoning"]
if row["reasoning_content"] is not None:
msg["reasoning_content"] = row["reasoning_content"]
if row["reasoning_details"]:
try:
msg["reasoning_details"] = json.loads(row["reasoning_details"])
@@ -1441,3 +1476,116 @@ class SessionDB:
return len(session_ids)
return self._execute_write(_do)
# ── Meta key/value (for scheduler bookkeeping) ──
def get_meta(self, key: str) -> Optional[str]:
"""Read a value from the state_meta key/value store."""
with self._lock:
row = self._conn.execute(
"SELECT value FROM state_meta WHERE key = ?", (key,)
).fetchone()
if row is None:
return None
return row["value"] if isinstance(row, sqlite3.Row) else row[0]
def set_meta(self, key: str, value: str) -> None:
"""Write a value to the state_meta key/value store."""
def _do(conn):
conn.execute(
"INSERT INTO state_meta (key, value) VALUES (?, ?) "
"ON CONFLICT(key) DO UPDATE SET value = excluded.value",
(key, value),
)
self._execute_write(_do)
# ── Space reclamation ──
def vacuum(self) -> None:
"""Run VACUUM to reclaim disk space after large deletes.
SQLite does not shrink the database file when rows are deleted
freed pages just get reused on the next insert. After a prune that
removed hundreds of sessions, the file stays bloated unless we
explicitly VACUUM.
VACUUM rewrites the entire DB, so it's expensive (seconds per
100MB) and cannot run inside a transaction. It also acquires an
exclusive lock, so callers must ensure no other writers are
active. Safe to call at startup before the gateway/CLI starts
serving traffic.
"""
# VACUUM cannot be executed inside a transaction.
with self._lock:
# Best-effort WAL checkpoint first, then VACUUM.
try:
self._conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
except Exception:
pass
self._conn.execute("VACUUM")
def maybe_auto_prune_and_vacuum(
self,
retention_days: int = 90,
min_interval_hours: int = 24,
vacuum: bool = True,
) -> Dict[str, Any]:
"""Idempotent auto-maintenance: prune old sessions + optional VACUUM.
Records the last run timestamp in state_meta so subsequent calls
within ``min_interval_hours`` no-op. Designed to be called once at
startup from long-lived entrypoints (CLI, gateway, cron scheduler).
Never raises. On any failure, logs a warning and returns a dict
with ``"error"`` set.
Returns a dict with keys:
- ``"skipped"`` (bool) true if within min_interval_hours of last run
- ``"pruned"`` (int) number of sessions deleted
- ``"vacuumed"`` (bool) true if VACUUM ran
- ``"error"`` (str, optional) present only on failure
"""
result: Dict[str, Any] = {"skipped": False, "pruned": 0, "vacuumed": False}
try:
# Skip if another process/call did maintenance recently.
last_raw = self.get_meta("last_auto_prune")
now = time.time()
if last_raw:
try:
last_ts = float(last_raw)
if now - last_ts < min_interval_hours * 3600:
result["skipped"] = True
return result
except (TypeError, ValueError):
pass # corrupt meta; treat as no prior run
pruned = self.prune_sessions(older_than_days=retention_days)
result["pruned"] = pruned
# Only VACUUM if we actually freed rows — VACUUM on a tight DB
# is wasted I/O. Threshold keeps small DBs from paying the cost.
if vacuum and pruned > 0:
try:
self.vacuum()
result["vacuumed"] = True
except Exception as exc:
logger.warning("state.db VACUUM failed: %s", exc)
# Record the attempt even if pruned == 0, so we don't retry
# every startup within the min_interval_hours window.
self.set_meta("last_auto_prune", str(now))
if pruned > 0:
logger.info(
"state.db auto-maintenance: pruned %d session(s) older than %d days%s",
pruned,
retention_days,
" + VACUUM" if result["vacuumed"] else "",
)
except Exception as exc:
# Maintenance must never block startup. Log and return error marker.
logger.warning("state.db auto-maintenance failed: %s", exc)
result["error"] = str(exc)
return result
+60 -25
View File
@@ -108,9 +108,15 @@ def _run_async(coro):
if loop and loop.is_running():
# Inside an async context (gateway, RL env) — run in a fresh thread.
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, coro)
pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
future = pool.submit(asyncio.run, coro)
try:
return future.result(timeout=300)
except concurrent.futures.TimeoutError:
future.cancel()
raise
finally:
pool.shutdown(wait=False, cancel_futures=True)
# If we're on a worker thread (e.g., parallel tool execution in
# delegate_task), use a per-thread persistent loop. This avoids
@@ -282,30 +288,34 @@ def get_tool_definitions(
filtered_tools[i] = {"type": "function", "function": dynamic_schema}
break
# Rebuild discord_server schema based on the bot's privileged intents
# (detected from GET /applications/@me) and the user's action allowlist
# in config. Hides actions the bot's intents don't support so the
# model never attempts them, and annotates fetch_messages when the
# Rebuild discord / discord_admin schemas based on the bot's privileged
# intents (detected from GET /applications/@me) and the user's action
# allowlist in config. Hides actions the bot's intents don't support so
# the model never attempts them, and annotates fetch_messages when the
# MESSAGE_CONTENT intent is missing.
if "discord_server" in available_tool_names:
try:
from tools.discord_tool import get_dynamic_schema
dynamic = get_dynamic_schema()
except Exception: # pragma: no cover — defensive, fall back to static
dynamic = None
if dynamic is None:
# Tool filtered out entirely (empty allowlist or detection disabled
# the only remaining actions). Drop it from the schema list.
filtered_tools = [
t for t in filtered_tools
if t.get("function", {}).get("name") != "discord_server"
]
available_tool_names.discard("discord_server")
else:
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == "discord_server":
filtered_tools[i] = {"type": "function", "function": dynamic}
break
_discord_schema_fns = {
"discord": "get_dynamic_schema_core",
"discord_admin": "get_dynamic_schema_admin",
}
for discord_tool_name in _discord_schema_fns:
if discord_tool_name in available_tool_names:
try:
from tools import discord_tool as _dt
schema_fn = getattr(_dt, _discord_schema_fns[discord_tool_name])
dynamic = schema_fn()
except Exception:
dynamic = None
if dynamic is None:
filtered_tools = [
t for t in filtered_tools
if t.get("function", {}).get("name") != discord_tool_name
]
available_tool_names.discard(discord_tool_name)
else:
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == discord_tool_name:
filtered_tools[i] = {"type": "function", "function": dynamic}
break
# Strip web tool cross-references from browser_navigate description when
# web_search / web_extract are not available. The static schema says
@@ -412,6 +422,31 @@ def _coerce_value(value: str, expected_type):
return _coerce_number(value, integer_only=(expected_type == "integer"))
if expected_type == "boolean":
return _coerce_boolean(value)
if expected_type == "array":
return _coerce_json(value, list)
if expected_type == "object":
return _coerce_json(value, dict)
return value
def _coerce_json(value: str, expected_python_type: type):
"""Parse *value* as JSON when the schema expects an array or object.
Handles model output drift where a complex oneOf/discriminated-union schema
causes the LLM to emit the array/object as a JSON string instead of a native
structure. Returns the original string if parsing fails or yields the wrong
Python type.
"""
try:
parsed = json.loads(value)
except (ValueError, TypeError):
return value
if isinstance(parsed, expected_python_type):
logger.debug(
"coerce_tool_args: coerced string to %s via json.loads",
expected_python_type.__name__,
)
return parsed
return value
+5 -2
View File
@@ -28,7 +28,7 @@
let
cfg = config.services.hermes-agent;
hermes-agent = inputs.self.packages.${pkgs.system}.default;
hermes-agent = inputs.self.packages.${pkgs.stdenv.hostPlatform.system}.default;
# Deep-merge config type (from 0xrsydn/nix-hermes-agent)
deepConfigType = lib.types.mkOptionType {
@@ -777,7 +777,10 @@ HERMES_NIX_ENV_EOF
NoNewPrivileges = true;
ProtectSystem = "strict";
ProtectHome = false;
ReadWritePaths = [ cfg.stateDir ];
ReadWritePaths = [
cfg.stateDir
cfg.workingDirectory
];
PrivateTmp = true;
};
@@ -0,0 +1,5 @@
# Web Development
Optional skills for client-side web development workflows — embedding agents, copilots, and AI-native UX patterns into user-facing web apps.
These are distinct from Hermes' own browser automation (Browserbase, Camofox), which operate *on* websites from outside. Web-development skills here help users build *into* their own websites.
@@ -0,0 +1,189 @@
---
name: page-agent
description: Embed alibaba/page-agent into your own web application — a pure-JavaScript in-page GUI agent that ships as a single <script> tag or npm package and lets end-users of your site drive the UI with natural language ("click login, fill username as John"). No Python, no headless browser, no extension required. Use this skill when the user is a web developer who wants to add an AI copilot to their SaaS / admin panel / B2B tool, make a legacy web app accessible via natural language, or evaluate page-agent against a local (Ollama) or cloud (Qwen / OpenAI / OpenRouter) LLM. NOT for server-side browser automation — point those users to Hermes' built-in browser tool instead.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [web, javascript, agent, browser, gui, alibaba, embed, copilot, saas]
category: web-development
---
# page-agent
alibaba/page-agent (https://github.com/alibaba/page-agent, 17k+ stars, MIT) is an in-page GUI agent written in TypeScript. It lives inside a webpage, reads the DOM as text (no screenshots, no multi-modal LLM), and executes natural-language instructions like "click the login button, then fill username as John" against the current page. Pure client-side — the host site just includes a script and passes an OpenAI-compatible LLM endpoint.
## When to use this skill
Load this skill when a user wants to:
- **Ship an AI copilot inside their own web app** (SaaS, admin panel, B2B tool, ERP, CRM) — "users on my dashboard should be able to type 'create invoice for Acme Corp and email it' instead of clicking through five screens"
- **Modernize a legacy web app** without rewriting the frontend — page-agent drops on top of existing DOM
- **Add accessibility via natural language** — voice / screen-reader users drive the UI by describing what they want
- **Demo or evaluate page-agent** against a local (Ollama) or hosted (Qwen, OpenAI, OpenRouter) LLM
- **Build interactive training / product demos** — let an AI walk a user through "how to submit an expense report" live in the real UI
## When NOT to use this skill
- User wants **Hermes itself to drive a browser** → use Hermes' built-in browser tool (Browserbase / Camofox). page-agent is the *opposite* direction.
- User wants **cross-tab automation without embedding** → use Playwright, browser-use, or the page-agent Chrome extension
- User needs **visual grounding / screenshots** → page-agent is text-DOM only; use a multimodal browser agent instead
## Prerequisites
- Node 22.13+ or 24+, npm 10+ (docs claim 11+ but 10.9 works fine)
- An OpenAI-compatible LLM endpoint: Qwen (DashScope), OpenAI, Ollama, OpenRouter, or anything speaking `/v1/chat/completions`
- Browser with devtools (for debugging)
## Path 1 — 30-second demo via CDN (no install)
Fastest way to see it work. Uses alibaba's free testing LLM proxy — **for evaluation only**, subject to their terms.
Add to any HTML page (or paste into the devtools console as a bookmarklet):
```html
<script src="https://cdn.jsdelivr.net/npm/page-agent@1.8.0/dist/iife/page-agent.demo.js" crossorigin="true"></script>
```
A panel appears. Type an instruction. Done.
Bookmarklet form (drop into bookmarks bar, click on any page):
```javascript
javascript:(function(){var s=document.createElement('script');s.src='https://cdn.jsdelivr.net/npm/page-agent@1.8.0/dist/iife/page-agent.demo.js';document.head.appendChild(s);})();
```
## Path 2 — npm install into your own web app (production use)
Inside an existing web project (React / Vue / Svelte / plain):
```bash
npm install page-agent
```
Wire it up with your own LLM endpoint — **never ship the demo CDN to real users**:
```javascript
import { PageAgent } from 'page-agent'
const agent = new PageAgent({
model: 'qwen3.5-plus',
baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
apiKey: process.env.LLM_API_KEY, // never hardcode
language: 'en-US',
})
// Show the panel for end users:
agent.panel.show()
// Or drive it programmatically:
await agent.execute('Click submit button, then fill username as John')
```
Provider examples (any OpenAI-compatible endpoint works):
| Provider | `baseURL` | `model` |
|----------|-----------|---------|
| Qwen / DashScope | `https://dashscope.aliyuncs.com/compatible-mode/v1` | `qwen3.5-plus` |
| OpenAI | `https://api.openai.com/v1` | `gpt-4o-mini` |
| Ollama (local) | `http://localhost:11434/v1` | `qwen3:14b` |
| OpenRouter | `https://openrouter.ai/api/v1` | `anthropic/claude-sonnet-4.6` |
**Key config fields** (passed to `new PageAgent({...})`):
- `model`, `baseURL`, `apiKey` — LLM connection
- `language` — UI language (`en-US`, `zh-CN`, etc.)
- Allowlist and data-masking hooks exist for locking down what the agent can touch — see https://alibaba.github.io/page-agent/ for the full option list
**Security.** Don't put your `apiKey` in client-side code for a real deployment — proxy LLM calls through your backend and point `baseURL` at your proxy. The demo CDN exists because alibaba runs that proxy for evaluation.
## Path 3 — clone the source repo (contributing, or hacking on it)
Use this when the user wants to modify page-agent itself, test it against arbitrary sites via a local IIFE bundle, or develop the browser extension.
```bash
git clone https://github.com/alibaba/page-agent.git
cd page-agent
npm ci # exact lockfile install (or `npm i` to allow updates)
```
Create `.env` in the repo root with an LLM endpoint. Example:
```
LLM_MODEL_NAME=gpt-4o-mini
LLM_API_KEY=sk-...
LLM_BASE_URL=https://api.openai.com/v1
```
Ollama flavor:
```
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=NA
LLM_MODEL_NAME=qwen3:14b
```
Common commands:
```bash
npm start # docs/website dev server
npm run build # build every package
npm run dev:demo # serve IIFE bundle at http://localhost:5174/page-agent.demo.js
npm run dev:ext # develop the browser extension (WXT + React)
npm run build:ext # build the extension
```
**Test on any website** using the local IIFE bundle. Add this bookmarklet:
```javascript
javascript:(function(){var s=document.createElement('script');s.src=`http://localhost:5174/page-agent.demo.js?t=${Math.random()}`;s.onload=()=>console.log('PageAgent ready!');document.head.appendChild(s);})();
```
Then: `npm run dev:demo`, click the bookmarklet on any page, and the local build injects. Auto-rebuilds on save.
**Warning:** your `.env` `LLM_API_KEY` is inlined into the IIFE bundle during dev builds. Don't share the bundle. Don't commit it. Don't paste the URL into Slack. (Verified: grepping the public dev bundle returns the literal values from `.env`.)
## Repo layout (Path 3)
Monorepo with npm workspaces. Key packages:
| Package | Path | Purpose |
|---------|------|---------|
| `page-agent` | `packages/page-agent/` | Main entry with UI panel |
| `@page-agent/core` | `packages/core/` | Core agent logic, no UI |
| `@page-agent/mcp` | `packages/mcp/` | MCP server (beta) |
| — | `packages/llms/` | LLM client |
| — | `packages/page-controller/` | DOM ops + visual feedback |
| — | `packages/ui/` | Panel + i18n |
| — | `packages/extension/` | Chrome/Firefox extension |
| — | `packages/website/` | Docs + landing site |
## Verifying it works
After Path 1 or Path 2:
1. Open the page in a browser with devtools open
2. You should see a floating panel. If not, check the console for errors (most common: CORS on the LLM endpoint, wrong `baseURL`, or a bad API key)
3. Type a simple instruction matching something visible on the page ("click the Login link")
4. Watch the Network tab — you should see a request to your `baseURL`
After Path 3:
1. `npm run dev:demo` prints `Accepting connections at http://localhost:5174`
2. `curl -I http://localhost:5174/page-agent.demo.js` returns `HTTP/1.1 200 OK` with `Content-Type: application/javascript`
3. Click the bookmarklet on any site; panel appears
## Pitfalls
- **Demo CDN in production** — don't. It's rate-limited, uses alibaba's free proxy, and their terms forbid production use.
- **API key exposure** — any key passed to `new PageAgent({apiKey: ...})` ships in your JS bundle. Always proxy through your own backend for real deployments.
- **Non-OpenAI-compatible endpoints** fail silently or with cryptic errors. If your provider needs native Anthropic/Gemini formatting, use an OpenAI-compatibility proxy (LiteLLM, OpenRouter) in front.
- **CSP blocks** — sites with strict Content-Security-Policy may refuse to load the CDN script or disallow inline eval. In that case, self-host from your origin.
- **Restart dev server** after editing `.env` in Path 3 — Vite only reads env at startup.
- **Node version** — the repo declares `^22.13.0 || >=24`. Node 20 will fail `npm ci` with engine errors.
- **npm 10 vs 11** — docs say npm 11+; npm 10.9 actually works fine.
## Reference
- Repo: https://github.com/alibaba/page-agent
- Docs: https://alibaba.github.io/page-agent/
- License: MIT (built on browser-use's DOM processing internals, Copyright 2024 Gregor Zunic)
+4 -3347
View File
File diff suppressed because it is too large Load Diff
+2 -2
View File
@@ -16,8 +16,8 @@
},
"homepage": "https://github.com/NousResearch/Hermes-Agent#readme",
"dependencies": {
"agent-browser": "^0.13.0",
"@askjo/camofox-browser": "^1.5.2"
"@askjo/camofox-browser": "^1.5.2",
"agent-browser": "^0.26.0"
},
"overrides": {
"lodash": "4.18.1"
+378
View File
@@ -0,0 +1,378 @@
"""OpenAI image generation backend — ChatGPT/Codex OAuth variant.
Identical model catalog and tier semantics to the ``openai`` image-gen plugin
(``gpt-image-2`` at low/medium/high quality), but routes the request through
the Codex Responses API ``image_generation`` tool instead of the
``images.generate`` REST endpoint. This lets users who are already
authenticated with Codex/ChatGPT generate images without configuring a
separate ``OPENAI_API_KEY``.
Selection precedence for the tier (first hit wins):
1. ``OPENAI_IMAGE_MODEL`` env var (escape hatch for scripts / tests)
2. ``image_gen.openai-codex.model`` in ``config.yaml``
3. ``image_gen.model`` in ``config.yaml`` (when it's one of our tier IDs)
4. :data:`DEFAULT_MODEL` ``gpt-image-2-medium``
Output is saved as PNG under ``$HERMES_HOME/cache/images/``.
"""
from __future__ import annotations
import logging
from typing import Any, Dict, List, Optional, Tuple
from agent.image_gen_provider import (
DEFAULT_ASPECT_RATIO,
ImageGenProvider,
error_response,
resolve_aspect_ratio,
save_b64_image,
success_response,
)
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Model catalog — mirrors the ``openai`` plugin so the picker UX is identical.
# ---------------------------------------------------------------------------
API_MODEL = "gpt-image-2"
_MODELS: Dict[str, Dict[str, Any]] = {
"gpt-image-2-low": {
"display": "GPT Image 2 (Low)",
"speed": "~15s",
"strengths": "Fast iteration, lowest cost",
"quality": "low",
},
"gpt-image-2-medium": {
"display": "GPT Image 2 (Medium)",
"speed": "~40s",
"strengths": "Balanced — default",
"quality": "medium",
},
"gpt-image-2-high": {
"display": "GPT Image 2 (High)",
"speed": "~2min",
"strengths": "Highest fidelity, strongest prompt adherence",
"quality": "high",
},
}
DEFAULT_MODEL = "gpt-image-2-medium"
_SIZES = {
"landscape": "1536x1024",
"square": "1024x1024",
"portrait": "1024x1536",
}
# Codex Responses surface used for the request. The chat model itself is only
# the host that calls the ``image_generation`` tool; the actual image work is
# done by ``API_MODEL``.
_CODEX_CHAT_MODEL = "gpt-5.4"
_CODEX_BASE_URL = "https://chatgpt.com/backend-api/codex"
_CODEX_INSTRUCTIONS = (
"You are an assistant that must fulfill image generation requests by "
"using the image_generation tool when provided."
)
# ---------------------------------------------------------------------------
# Config + auth helpers
# ---------------------------------------------------------------------------
def _load_image_gen_config() -> Dict[str, Any]:
"""Read ``image_gen`` from config.yaml (returns {} on any failure)."""
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("image_gen") if isinstance(cfg, dict) else None
return section if isinstance(section, dict) else {}
except Exception as exc:
logger.debug("Could not load image_gen config: %s", exc)
return {}
def _resolve_model() -> Tuple[str, Dict[str, Any]]:
"""Decide which tier to use and return ``(model_id, meta)``."""
import os
env_override = os.environ.get("OPENAI_IMAGE_MODEL")
if env_override and env_override in _MODELS:
return env_override, _MODELS[env_override]
cfg = _load_image_gen_config()
sub = cfg.get("openai-codex") if isinstance(cfg.get("openai-codex"), dict) else {}
candidate: Optional[str] = None
if isinstance(sub, dict):
value = sub.get("model")
if isinstance(value, str) and value in _MODELS:
candidate = value
if candidate is None:
top = cfg.get("model")
if isinstance(top, str) and top in _MODELS:
candidate = top
if candidate is not None:
return candidate, _MODELS[candidate]
return DEFAULT_MODEL, _MODELS[DEFAULT_MODEL]
def _read_codex_access_token() -> Optional[str]:
"""Return a usable Codex OAuth token, or None.
Delegates to the canonical reader in ``agent.auxiliary_client`` so token
expiry, credential pool selection, and JWT decoding stay in one place.
"""
try:
from agent.auxiliary_client import _read_codex_access_token as _reader
token = _reader()
if isinstance(token, str) and token.strip():
return token.strip()
return None
except Exception as exc:
logger.debug("Could not resolve Codex access token: %s", exc)
return None
def _build_codex_client():
"""Return an OpenAI client pointed at the ChatGPT/Codex backend, or None."""
token = _read_codex_access_token()
if not token:
return None
try:
import openai
from agent.auxiliary_client import _codex_cloudflare_headers
return openai.OpenAI(
api_key=token,
base_url=_CODEX_BASE_URL,
default_headers=_codex_cloudflare_headers(token),
)
except Exception as exc:
logger.debug("Could not build Codex image client: %s", exc)
return None
def _collect_image_b64(client: Any, *, prompt: str, size: str, quality: str) -> Optional[str]:
"""Stream a Codex Responses image_generation call and return the b64 image."""
image_b64: Optional[str] = None
with client.responses.stream(
model=_CODEX_CHAT_MODEL,
store=False,
instructions=_CODEX_INSTRUCTIONS,
input=[{
"type": "message",
"role": "user",
"content": [{"type": "input_text", "text": prompt}],
}],
tools=[{
"type": "image_generation",
"model": API_MODEL,
"size": size,
"quality": quality,
"output_format": "png",
"background": "opaque",
"partial_images": 1,
}],
tool_choice={
"type": "allowed_tools",
"mode": "required",
"tools": [{"type": "image_generation"}],
},
) as stream:
for event in stream:
event_type = getattr(event, "type", "")
if event_type == "response.output_item.done":
item = getattr(event, "item", None)
if getattr(item, "type", None) == "image_generation_call":
result = getattr(item, "result", None)
if isinstance(result, str) and result:
image_b64 = result
elif event_type == "response.image_generation_call.partial_image":
partial = getattr(event, "partial_image_b64", None)
if isinstance(partial, str) and partial:
image_b64 = partial
final = stream.get_final_response()
# Final-response sweep covers the case where the stream finished before
# we observed the ``output_item.done`` event for the image call.
for item in getattr(final, "output", None) or []:
if getattr(item, "type", None) == "image_generation_call":
result = getattr(item, "result", None)
if isinstance(result, str) and result:
image_b64 = result
return image_b64
# ---------------------------------------------------------------------------
# Provider
# ---------------------------------------------------------------------------
class OpenAICodexImageGenProvider(ImageGenProvider):
"""gpt-image-2 routed through ChatGPT/Codex OAuth instead of an API key."""
@property
def name(self) -> str:
return "openai-codex"
@property
def display_name(self) -> str:
return "OpenAI (Codex auth)"
def is_available(self) -> bool:
if not _read_codex_access_token():
return False
try:
import openai # noqa: F401
except ImportError:
return False
return True
def list_models(self) -> List[Dict[str, Any]]:
return [
{
"id": model_id,
"display": meta["display"],
"speed": meta["speed"],
"strengths": meta["strengths"],
"price": "varies",
}
for model_id, meta in _MODELS.items()
]
def default_model(self) -> Optional[str]:
return DEFAULT_MODEL
def get_setup_schema(self) -> Dict[str, Any]:
return {
"name": "OpenAI (Codex auth)",
"badge": "free",
"tag": "gpt-image-2 via ChatGPT/Codex OAuth — no API key required",
"env_vars": [],
"post_setup_hint": (
"Sign in with `hermes auth codex` (or `hermes setup` → Codex) "
"if you haven't already. No API key needed."
),
}
def generate(
self,
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
**kwargs: Any,
) -> Dict[str, Any]:
prompt = (prompt or "").strip()
aspect = resolve_aspect_ratio(aspect_ratio)
if not prompt:
return error_response(
error="Prompt is required and must be a non-empty string",
error_type="invalid_argument",
provider="openai-codex",
aspect_ratio=aspect,
)
if not _read_codex_access_token():
return error_response(
error=(
"No Codex/ChatGPT OAuth credentials available. Run "
"`hermes auth codex` (or `hermes setup` → Codex) to sign in."
),
error_type="auth_required",
provider="openai-codex",
aspect_ratio=aspect,
)
try:
import openai # noqa: F401
except ImportError:
return error_response(
error="openai Python package not installed (pip install openai)",
error_type="missing_dependency",
provider="openai-codex",
aspect_ratio=aspect,
)
tier_id, meta = _resolve_model()
size = _SIZES.get(aspect, _SIZES["square"])
client = _build_codex_client()
if client is None:
return error_response(
error="Could not initialize Codex image client",
error_type="auth_required",
provider="openai-codex",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
try:
b64 = _collect_image_b64(
client,
prompt=prompt,
size=size,
quality=meta["quality"],
)
except Exception as exc:
logger.debug("Codex image generation failed", exc_info=True)
return error_response(
error=f"OpenAI image generation via Codex auth failed: {exc}",
error_type="api_error",
provider="openai-codex",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
if not b64:
return error_response(
error="Codex response contained no image_generation_call result",
error_type="empty_response",
provider="openai-codex",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
try:
saved_path = save_b64_image(b64, prefix=f"openai_codex_{tier_id}")
except Exception as exc:
return error_response(
error=f"Could not save image to cache: {exc}",
error_type="io_error",
provider="openai-codex",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
return success_response(
image=str(saved_path),
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
provider="openai-codex",
extra={"size": size, "quality": meta["quality"]},
)
# ---------------------------------------------------------------------------
# Plugin entry point
# ---------------------------------------------------------------------------
def register(ctx) -> None:
"""Plugin entry point — register the Codex-backed image-gen provider."""
ctx.register_image_gen_provider(OpenAICodexImageGenProvider())
@@ -0,0 +1,5 @@
name: openai-codex
version: 1.0.0
description: "OpenAI image generation backed by ChatGPT/Codex OAuth (gpt-image-2 via the Responses image_generation tool). Saves generated images to $HERMES_HOME/cache/images/."
author: NousResearch
kind: backend
+303
View File
@@ -0,0 +1,303 @@
"""OpenAI image generation backend.
Exposes OpenAI's ``gpt-image-2`` model at three quality tiers as an
:class:`ImageGenProvider` implementation. The tiers are implemented as
three virtual model IDs so the ``hermes tools`` model picker and the
``image_gen.model`` config key behave like any other multi-model backend:
gpt-image-2-low ~15s fastest, good for iteration
gpt-image-2-medium ~40s default balanced
gpt-image-2-high ~2min slowest, highest fidelity
All three hit the same underlying API model (``gpt-image-2``) with a
different ``quality`` parameter. Output is base64 JSON saved under
``$HERMES_HOME/cache/images/``.
Selection precedence (first hit wins):
1. ``OPENAI_IMAGE_MODEL`` env var (escape hatch for scripts / tests)
2. ``image_gen.openai.model`` in ``config.yaml``
3. ``image_gen.model`` in ``config.yaml`` (when it's one of our tier IDs)
4. :data:`DEFAULT_MODEL` ``gpt-image-2-medium``
"""
from __future__ import annotations
import logging
import os
from typing import Any, Dict, List, Optional, Tuple
from agent.image_gen_provider import (
DEFAULT_ASPECT_RATIO,
ImageGenProvider,
error_response,
resolve_aspect_ratio,
save_b64_image,
success_response,
)
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Model catalog
# ---------------------------------------------------------------------------
#
# All three IDs resolve to the same underlying API model with a different
# ``quality`` setting. ``api_model`` is what gets sent to OpenAI;
# ``quality`` is the knob that changes generation time and output fidelity.
API_MODEL = "gpt-image-2"
_MODELS: Dict[str, Dict[str, Any]] = {
"gpt-image-2-low": {
"display": "GPT Image 2 (Low)",
"speed": "~15s",
"strengths": "Fast iteration, lowest cost",
"quality": "low",
},
"gpt-image-2-medium": {
"display": "GPT Image 2 (Medium)",
"speed": "~40s",
"strengths": "Balanced — default",
"quality": "medium",
},
"gpt-image-2-high": {
"display": "GPT Image 2 (High)",
"speed": "~2min",
"strengths": "Highest fidelity, strongest prompt adherence",
"quality": "high",
},
}
DEFAULT_MODEL = "gpt-image-2-medium"
_SIZES = {
"landscape": "1536x1024",
"square": "1024x1024",
"portrait": "1024x1536",
}
def _load_openai_config() -> Dict[str, Any]:
"""Read ``image_gen`` from config.yaml (returns {} on any failure)."""
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("image_gen") if isinstance(cfg, dict) else None
return section if isinstance(section, dict) else {}
except Exception as exc:
logger.debug("Could not load image_gen config: %s", exc)
return {}
def _resolve_model() -> Tuple[str, Dict[str, Any]]:
"""Decide which tier to use and return ``(model_id, meta)``."""
env_override = os.environ.get("OPENAI_IMAGE_MODEL")
if env_override and env_override in _MODELS:
return env_override, _MODELS[env_override]
cfg = _load_openai_config()
openai_cfg = cfg.get("openai") if isinstance(cfg.get("openai"), dict) else {}
candidate: Optional[str] = None
if isinstance(openai_cfg, dict):
value = openai_cfg.get("model")
if isinstance(value, str) and value in _MODELS:
candidate = value
if candidate is None:
top = cfg.get("model")
if isinstance(top, str) and top in _MODELS:
candidate = top
if candidate is not None:
return candidate, _MODELS[candidate]
return DEFAULT_MODEL, _MODELS[DEFAULT_MODEL]
# ---------------------------------------------------------------------------
# Provider
# ---------------------------------------------------------------------------
class OpenAIImageGenProvider(ImageGenProvider):
"""OpenAI ``images.generate`` backend — gpt-image-2 at low/medium/high."""
@property
def name(self) -> str:
return "openai"
@property
def display_name(self) -> str:
return "OpenAI"
def is_available(self) -> bool:
if not os.environ.get("OPENAI_API_KEY"):
return False
try:
import openai # noqa: F401
except ImportError:
return False
return True
def list_models(self) -> List[Dict[str, Any]]:
return [
{
"id": model_id,
"display": meta["display"],
"speed": meta["speed"],
"strengths": meta["strengths"],
"price": "varies",
}
for model_id, meta in _MODELS.items()
]
def default_model(self) -> Optional[str]:
return DEFAULT_MODEL
def get_setup_schema(self) -> Dict[str, Any]:
return {
"name": "OpenAI",
"badge": "paid",
"tag": "gpt-image-2 at low/medium/high quality tiers",
"env_vars": [
{
"key": "OPENAI_API_KEY",
"prompt": "OpenAI API key",
"url": "https://platform.openai.com/api-keys",
},
],
}
def generate(
self,
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
**kwargs: Any,
) -> Dict[str, Any]:
prompt = (prompt or "").strip()
aspect = resolve_aspect_ratio(aspect_ratio)
if not prompt:
return error_response(
error="Prompt is required and must be a non-empty string",
error_type="invalid_argument",
provider="openai",
aspect_ratio=aspect,
)
if not os.environ.get("OPENAI_API_KEY"):
return error_response(
error=(
"OPENAI_API_KEY not set. Run `hermes tools` → Image "
"Generation → OpenAI to configure, or `hermes setup` "
"to add the key."
),
error_type="auth_required",
provider="openai",
aspect_ratio=aspect,
)
try:
import openai
except ImportError:
return error_response(
error="openai Python package not installed (pip install openai)",
error_type="missing_dependency",
provider="openai",
aspect_ratio=aspect,
)
tier_id, meta = _resolve_model()
size = _SIZES.get(aspect, _SIZES["square"])
# gpt-image-2 returns b64_json unconditionally and REJECTS
# ``response_format`` as an unknown parameter. Don't send it.
payload: Dict[str, Any] = {
"model": API_MODEL,
"prompt": prompt,
"size": size,
"n": 1,
"quality": meta["quality"],
}
try:
client = openai.OpenAI()
response = client.images.generate(**payload)
except Exception as exc:
logger.debug("OpenAI image generation failed", exc_info=True)
return error_response(
error=f"OpenAI image generation failed: {exc}",
error_type="api_error",
provider="openai",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
data = getattr(response, "data", None) or []
if not data:
return error_response(
error="OpenAI returned no image data",
error_type="empty_response",
provider="openai",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
first = data[0]
b64 = getattr(first, "b64_json", None)
url = getattr(first, "url", None)
revised_prompt = getattr(first, "revised_prompt", None)
if b64:
try:
saved_path = save_b64_image(b64, prefix=f"openai_{tier_id}")
except Exception as exc:
return error_response(
error=f"Could not save image to cache: {exc}",
error_type="io_error",
provider="openai",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
image_ref = str(saved_path)
elif url:
# Defensive — gpt-image-2 returns b64 today, but fall back
# gracefully if the API ever changes.
image_ref = url
else:
return error_response(
error="OpenAI response contained neither b64_json nor URL",
error_type="empty_response",
provider="openai",
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
)
extra: Dict[str, Any] = {"size": size, "quality": meta["quality"]}
if revised_prompt:
extra["revised_prompt"] = revised_prompt
return success_response(
image=image_ref,
model=tier_id,
prompt=prompt,
aspect_ratio=aspect,
provider="openai",
extra=extra,
)
# ---------------------------------------------------------------------------
# Plugin entry point
# ---------------------------------------------------------------------------
def register(ctx) -> None:
"""Plugin entry point — wire ``OpenAIImageGenProvider`` into the registry."""
ctx.register_image_gen_provider(OpenAIImageGenProvider())
+7
View File
@@ -0,0 +1,7 @@
name: openai
version: 1.0.0
description: "OpenAI image generation backend (gpt-image-2). Saves generated images to $HERMES_HOME/cache/images/."
author: NousResearch
kind: backend
requires_env:
- OPENAI_API_KEY
+313
View File
@@ -0,0 +1,313 @@
"""xAI image generation backend.
Exposes xAI's ``grok-imagine-image`` model as an
:class:`ImageGenProvider` implementation.
Features:
- Text-to-image generation
- Multiple aspect ratios (1:1, 16:9, 9:16, etc.)
- Multiple resolutions (1K, 2K)
- Base64 output saved to cache
Selection precedence (first hit wins):
1. ``XAI_IMAGE_MODEL`` env var
2. ``image_gen.xai.model`` in ``config.yaml``
3. :data:`DEFAULT_MODEL`
"""
from __future__ import annotations
import logging
import os
from typing import Any, Dict, List, Optional, Tuple
import requests
from agent.image_gen_provider import (
DEFAULT_ASPECT_RATIO,
ImageGenProvider,
error_response,
resolve_aspect_ratio,
save_b64_image,
success_response,
)
from tools.xai_http import hermes_xai_user_agent
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Model catalog
# ---------------------------------------------------------------------------
API_MODEL = "grok-imagine-image"
_MODELS: Dict[str, Dict[str, Any]] = {
"grok-imagine-image": {
"display": "Grok Imagine Image",
"speed": "~5-10s",
"strengths": "Fast, high-quality",
},
}
DEFAULT_MODEL = "grok-imagine-image"
# xAI aspect ratios (more options than FAL/OpenAI)
_XAI_ASPECT_RATIOS = {
"landscape": "16:9",
"square": "1:1",
"portrait": "9:16",
"4:3": "4:3",
"3:4": "3:4",
"3:2": "3:2",
"2:3": "2:3",
}
# xAI resolutions
_XAI_RESOLUTIONS = {
"1k": "1024",
"2k": "2048",
}
DEFAULT_RESOLUTION = "1k"
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
def _load_xai_config() -> Dict[str, Any]:
"""Read ``image_gen.xai`` from config.yaml."""
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("image_gen") if isinstance(cfg, dict) else None
xai_section = section.get("xai") if isinstance(section, dict) else None
return xai_section if isinstance(xai_section, dict) else {}
except Exception as exc:
logger.debug("Could not load image_gen.xai config: %s", exc)
return {}
def _resolve_model() -> Tuple[str, Dict[str, Any]]:
"""Decide which model to use and return ``(model_id, meta)``."""
env_override = os.environ.get("XAI_IMAGE_MODEL")
if env_override and env_override in _MODELS:
return env_override, _MODELS[env_override]
cfg = _load_xai_config()
candidate = cfg.get("model") if isinstance(cfg.get("model"), str) else None
if candidate and candidate in _MODELS:
return candidate, _MODELS[candidate]
return DEFAULT_MODEL, _MODELS[DEFAULT_MODEL]
def _resolve_resolution() -> str:
"""Get configured resolution."""
cfg = _load_xai_config()
res = cfg.get("resolution") if isinstance(cfg.get("resolution"), str) else None
if res and res in _XAI_RESOLUTIONS:
return res
return DEFAULT_RESOLUTION
# ---------------------------------------------------------------------------
# Provider
# ---------------------------------------------------------------------------
class XAIImageGenProvider(ImageGenProvider):
"""xAI ``grok-imagine-image`` backend."""
@property
def name(self) -> str:
return "xai"
@property
def display_name(self) -> str:
return "xAI (Grok)"
def is_available(self) -> bool:
return bool(os.getenv("XAI_API_KEY"))
def list_models(self) -> List[Dict[str, Any]]:
return [
{
"id": model_id,
"display": meta.get("display", model_id),
"speed": meta.get("speed", ""),
"strengths": meta.get("strengths", ""),
}
for model_id, meta in _MODELS.items()
]
def get_setup_schema(self) -> Dict[str, Any]:
return {
"name": "xAI (Grok)",
"badge": "paid",
"tag": "Native xAI image generation via grok-imagine-image",
"env_vars": [
{
"key": "XAI_API_KEY",
"prompt": "xAI API key",
"url": "https://console.x.ai/",
},
],
}
def generate(
self,
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
**kwargs: Any,
) -> Dict[str, Any]:
"""Generate an image using xAI's grok-imagine-image."""
api_key = os.getenv("XAI_API_KEY", "").strip()
if not api_key:
return error_response(
error="XAI_API_KEY not set. Get one at https://console.x.ai/",
error_type="missing_api_key",
provider="xai",
aspect_ratio=aspect_ratio,
)
model_id, meta = _resolve_model()
aspect = resolve_aspect_ratio(aspect_ratio)
xai_ar = _XAI_ASPECT_RATIOS.get(aspect, "1:1")
resolution = _resolve_resolution()
xai_res = _XAI_RESOLUTIONS.get(resolution, "1024")
payload: Dict[str, Any] = {
"model": API_MODEL,
"prompt": prompt,
"aspect_ratio": xai_ar,
"resolution": xai_res,
}
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
"User-Agent": hermes_xai_user_agent(),
}
base_url = (os.getenv("XAI_BASE_URL") or "https://api.x.ai/v1").strip().rstrip("/")
try:
response = requests.post(
f"{base_url}/images/generations",
headers=headers,
json=payload,
timeout=120,
)
response.raise_for_status()
except requests.HTTPError as exc:
status = exc.response.status_code if exc.response else 0
try:
err_msg = exc.response.json().get("error", {}).get("message", exc.response.text[:300])
except Exception:
err_msg = exc.response.text[:300] if exc.response else str(exc)
logger.error("xAI image gen failed (%d): %s", status, err_msg)
return error_response(
error=f"xAI image generation failed ({status}): {err_msg}",
error_type="api_error",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
except requests.Timeout:
return error_response(
error="xAI image generation timed out (120s)",
error_type="timeout",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
except requests.ConnectionError as exc:
return error_response(
error=f"xAI connection error: {exc}",
error_type="connection_error",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
try:
result = response.json()
except Exception as exc:
return error_response(
error=f"xAI returned invalid JSON: {exc}",
error_type="invalid_response",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
# Parse response — xAI returns data[0].b64_json or data[0].url
data = result.get("data", [])
if not data:
return error_response(
error="xAI returned no image data",
error_type="empty_response",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
first = data[0]
b64 = first.get("b64_json")
url = first.get("url")
if b64:
try:
saved_path = save_b64_image(b64, prefix=f"xai_{model_id}")
except Exception as exc:
return error_response(
error=f"Could not save image to cache: {exc}",
error_type="io_error",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
image_ref = str(saved_path)
elif url:
image_ref = url
else:
return error_response(
error="xAI response contained neither b64_json nor URL",
error_type="empty_response",
provider="xai",
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
)
extra: Dict[str, Any] = {
"resolution": xai_res,
}
return success_response(
image=image_ref,
model=model_id,
prompt=prompt,
aspect_ratio=aspect,
provider="xai",
extra=extra,
)
# ---------------------------------------------------------------------------
# Plugin registration
# ---------------------------------------------------------------------------
def register(ctx: Any) -> None:
"""Register this provider with the image gen registry."""
ctx.register_image_gen_provider(XAIImageGenProvider())
+7
View File
@@ -0,0 +1,7 @@
name: xai
version: 1.0.0
description: "xAI image generation backend (grok-imagine-image). Text-to-image."
author: Julien Talbot
kind: backend
requires_env:
- XAI_API_KEY
+5 -2
View File
@@ -84,7 +84,10 @@ Config file: `~/.hermes/hindsight/config.json`
| `retain_async` | `true` | Process retain asynchronously on the Hindsight server |
| `retain_every_n_turns` | `1` | Retain every N turns (1 = every turn) |
| `retain_context` | `conversation between Hermes Agent and the User` | Context label for retained memories |
| `tags` | — | Tags applied when storing memories |
| `retain_tags` | — | Default tags applied to retained memories; merged with per-call tool tags |
| `retain_source` | — | Optional `metadata.source` attached to retained memories |
| `retain_user_prefix` | `User` | Label used before user turns in auto-retained transcripts |
| `retain_assistant_prefix` | `Assistant` | Label used before assistant turns in auto-retained transcripts |
### Integration
@@ -113,7 +116,7 @@ Available in `hybrid` and `tools` memory modes:
| Tool | Description |
|------|-------------|
| `hindsight_retain` | Store information with auto entity extraction |
| `hindsight_retain` | Store information with auto entity extraction; supports optional per-call `tags` |
| `hindsight_recall` | Multi-strategy search (semantic + entity graph) |
| `hindsight_reflect` | Cross-memory synthesis (LLM-powered) |
+198 -37
View File
@@ -6,11 +6,15 @@ retrieval. Supports cloud (API key) and local modes.
Original PR #1811 by benfrank241, adapted to MemoryProvider ABC.
Config via environment variables:
HINDSIGHT_API_KEY API key for Hindsight Cloud
HINDSIGHT_BANK_ID memory bank identifier (default: hermes)
HINDSIGHT_BUDGET recall budget: low/mid/high (default: mid)
HINDSIGHT_API_URL API endpoint
HINDSIGHT_MODE cloud or local (default: cloud)
HINDSIGHT_API_KEY API key for Hindsight Cloud
HINDSIGHT_BANK_ID memory bank identifier (default: hermes)
HINDSIGHT_BUDGET recall budget: low/mid/high (default: mid)
HINDSIGHT_API_URL API endpoint
HINDSIGHT_MODE cloud or local (default: cloud)
HINDSIGHT_RETAIN_TAGS comma-separated tags attached to retained memories
HINDSIGHT_RETAIN_SOURCE metadata source value attached to retained memories
HINDSIGHT_RETAIN_USER_PREFIX label used before user turns in retained transcripts
HINDSIGHT_RETAIN_ASSISTANT_PREFIX label used before assistant turns in retained transcripts
Or via $HERMES_HOME/hindsight/config.json (profile-scoped), falling back to
~/.hindsight/config.json (legacy, shared) for backward compatibility.
@@ -24,7 +28,7 @@ import logging
import os
import threading
from hermes_constants import get_hermes_home
from datetime import datetime, timezone
from typing import Any, Dict, List
from agent.memory_provider import MemoryProvider
@@ -99,6 +103,11 @@ RETAIN_SCHEMA = {
"properties": {
"content": {"type": "string", "description": "The information to store."},
"context": {"type": "string", "description": "Short label (e.g. 'user preference', 'project decision')."},
"tags": {
"type": "array",
"items": {"type": "string"},
"description": "Optional per-call tags to merge with configured default retain tags.",
},
},
"required": ["content"],
},
@@ -168,6 +177,10 @@ def _load_config() -> dict:
return {
"mode": os.environ.get("HINDSIGHT_MODE", "cloud"),
"apiKey": os.environ.get("HINDSIGHT_API_KEY", ""),
"retain_tags": os.environ.get("HINDSIGHT_RETAIN_TAGS", ""),
"retain_source": os.environ.get("HINDSIGHT_RETAIN_SOURCE", ""),
"retain_user_prefix": os.environ.get("HINDSIGHT_RETAIN_USER_PREFIX", "User"),
"retain_assistant_prefix": os.environ.get("HINDSIGHT_RETAIN_ASSISTANT_PREFIX", "Assistant"),
"banks": {
"hermes": {
"bankId": os.environ.get("HINDSIGHT_BANK_ID", "hermes"),
@@ -178,6 +191,48 @@ def _load_config() -> dict:
}
def _normalize_retain_tags(value: Any) -> List[str]:
"""Normalize tag config/tool values to a deduplicated list of strings."""
if value is None:
return []
raw_items: list[Any]
if isinstance(value, list):
raw_items = value
elif isinstance(value, str):
text = value.strip()
if not text:
return []
if text.startswith("["):
try:
parsed = json.loads(text)
except Exception:
parsed = None
if isinstance(parsed, list):
raw_items = parsed
else:
raw_items = text.split(",")
else:
raw_items = text.split(",")
else:
raw_items = [value]
normalized = []
seen = set()
for item in raw_items:
tag = str(item).strip()
if not tag or tag in seen:
continue
seen.add(tag)
normalized.append(tag)
return normalized
def _utc_timestamp() -> str:
"""Return current UTC timestamp in ISO-8601 with milliseconds and Z suffix."""
return datetime.now(timezone.utc).isoformat(timespec="milliseconds").replace("+00:00", "Z")
# ---------------------------------------------------------------------------
# MemoryProvider implementation
# ---------------------------------------------------------------------------
@@ -195,6 +250,19 @@ class HindsightMemoryProvider(MemoryProvider):
self._llm_base_url = ""
self._memory_mode = "hybrid" # "context", "tools", or "hybrid"
self._prefetch_method = "recall" # "recall" or "reflect"
self._retain_tags: List[str] = []
self._retain_source = ""
self._retain_user_prefix = "User"
self._retain_assistant_prefix = "Assistant"
self._platform = ""
self._user_id = ""
self._user_name = ""
self._chat_id = ""
self._chat_name = ""
self._chat_type = ""
self._thread_id = ""
self._agent_identity = ""
self._turn_index = 0
self._client = None
self._prefetch_result = ""
self._prefetch_lock = threading.Lock()
@@ -210,6 +278,7 @@ class HindsightMemoryProvider(MemoryProvider):
# Retain controls
self._auto_retain = True
self._retain_every_n_turns = 1
self._retain_async = True
self._retain_context = "conversation between Hermes Agent and the User"
self._turn_counter = 0
self._session_turns: list[str] = [] # accumulates ALL turns for the session
@@ -224,7 +293,6 @@ class HindsightMemoryProvider(MemoryProvider):
# Bank
self._bank_mission = ""
self._bank_retain_mission: str | None = None
self._retain_async = True
@property
def name(self) -> str:
@@ -423,7 +491,10 @@ class HindsightMemoryProvider(MemoryProvider):
{"key": "recall_budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]},
{"key": "memory_mode", "description": "Memory integration mode", "default": "hybrid", "choices": ["hybrid", "context", "tools"]},
{"key": "recall_prefetch_method", "description": "Auto-recall method", "default": "recall", "choices": ["recall", "reflect"]},
{"key": "tags", "description": "Tags applied when storing memories (comma-separated)", "default": ""},
{"key": "retain_tags", "description": "Default tags applied to retained memories (comma-separated)", "default": ""},
{"key": "retain_source", "description": "Metadata source value attached to retained memories", "default": ""},
{"key": "retain_user_prefix", "description": "Label used before user turns in retained transcripts", "default": "User"},
{"key": "retain_assistant_prefix", "description": "Label used before assistant turns in retained transcripts", "default": "Assistant"},
{"key": "recall_tags", "description": "Tags to filter when searching memories (comma-separated)", "default": ""},
{"key": "recall_tags_match", "description": "Tag matching mode for recall", "default": "any", "choices": ["any", "all", "any_strict", "all_strict"]},
{"key": "auto_recall", "description": "Automatically recall memories before each turn", "default": True},
@@ -467,7 +538,7 @@ class HindsightMemoryProvider(MemoryProvider):
return self._client
def initialize(self, session_id: str, **kwargs) -> None:
self._session_id = session_id
self._session_id = str(session_id or "").strip()
# Check client version and auto-upgrade if needed
try:
@@ -496,6 +567,16 @@ class HindsightMemoryProvider(MemoryProvider):
pass # packaging not available or other issue — proceed anyway
self._config = _load_config()
self._platform = str(kwargs.get("platform") or "").strip()
self._user_id = str(kwargs.get("user_id") or "").strip()
self._user_name = str(kwargs.get("user_name") or "").strip()
self._chat_id = str(kwargs.get("chat_id") or "").strip()
self._chat_name = str(kwargs.get("chat_name") or "").strip()
self._chat_type = str(kwargs.get("chat_type") or "").strip()
self._thread_id = str(kwargs.get("thread_id") or "").strip()
self._agent_identity = str(kwargs.get("agent_identity") or "").strip()
self._turn_index = 0
self._session_turns = []
self._mode = self._config.get("mode", "cloud")
# "local" is a legacy alias for "local_embedded"
if self._mode == "local":
@@ -513,7 +594,7 @@ class HindsightMemoryProvider(MemoryProvider):
memory_mode = self._config.get("memory_mode", "hybrid")
self._memory_mode = memory_mode if memory_mode in ("context", "tools", "hybrid") else "hybrid"
prefetch_method = self._config.get("recall_prefetch_method", "recall")
prefetch_method = self._config.get("recall_prefetch_method") or self._config.get("prefetch_method", "recall")
self._prefetch_method = prefetch_method if prefetch_method in ("recall", "reflect") else "recall"
# Bank options
@@ -521,9 +602,22 @@ class HindsightMemoryProvider(MemoryProvider):
self._bank_retain_mission = self._config.get("bank_retain_mission") or None
# Tags
self._tags = self._config.get("tags") or None
self._retain_tags = _normalize_retain_tags(
self._config.get("retain_tags")
or os.environ.get("HINDSIGHT_RETAIN_TAGS", "")
)
self._tags = self._retain_tags or None
self._recall_tags = self._config.get("recall_tags") or None
self._recall_tags_match = self._config.get("recall_tags_match", "any")
self._retain_source = str(
self._config.get("retain_source") or os.environ.get("HINDSIGHT_RETAIN_SOURCE", "")
).strip()
self._retain_user_prefix = str(
self._config.get("retain_user_prefix") or os.environ.get("HINDSIGHT_RETAIN_USER_PREFIX", "User")
).strip() or "User"
self._retain_assistant_prefix = str(
self._config.get("retain_assistant_prefix") or os.environ.get("HINDSIGHT_RETAIN_ASSISTANT_PREFIX", "Assistant")
).strip() or "Assistant"
# Retain controls
self._auto_retain = self._config.get("auto_retain", True)
@@ -547,11 +641,9 @@ class HindsightMemoryProvider(MemoryProvider):
logger.info("Hindsight initialized: mode=%s, api_url=%s, bank=%s, budget=%s, memory_mode=%s, prefetch_method=%s, client=%s",
self._mode, self._api_url, self._bank_id, self._budget, self._memory_mode, self._prefetch_method, _client_version)
logger.debug("Hindsight config: auto_retain=%s, auto_recall=%s, retain_every_n=%d, "
"retain_async=%s, retain_context=%s, "
"recall_max_tokens=%d, recall_max_input_chars=%d, tags=%s, recall_tags=%s",
"retain_async=%s, retain_context=%s, recall_max_tokens=%d, recall_max_input_chars=%d, tags=%s, recall_tags=%s",
self._auto_retain, self._auto_recall, self._retain_every_n_turns,
self._retain_async, self._retain_context,
self._recall_max_tokens, self._recall_max_input_chars,
self._retain_async, self._retain_context, self._recall_max_tokens, self._recall_max_input_chars,
self._tags, self._recall_tags)
# For local mode, start the embedded daemon in the background so it
@@ -712,6 +804,78 @@ class HindsightMemoryProvider(MemoryProvider):
self._prefetch_thread = threading.Thread(target=_run, daemon=True, name="hindsight-prefetch")
self._prefetch_thread.start()
def _build_turn_messages(self, user_content: str, assistant_content: str) -> List[Dict[str, str]]:
now = datetime.now(timezone.utc).isoformat()
return [
{
"role": "user",
"content": f"{self._retain_user_prefix}: {user_content}",
"timestamp": now,
},
{
"role": "assistant",
"content": f"{self._retain_assistant_prefix}: {assistant_content}",
"timestamp": now,
},
]
def _build_metadata(self, *, message_count: int, turn_index: int) -> Dict[str, str]:
metadata: Dict[str, str] = {
"retained_at": _utc_timestamp(),
"message_count": str(message_count),
"turn_index": str(turn_index),
}
if self._retain_source:
metadata["source"] = self._retain_source
if self._session_id:
metadata["session_id"] = self._session_id
if self._platform:
metadata["platform"] = self._platform
if self._user_id:
metadata["user_id"] = self._user_id
if self._user_name:
metadata["user_name"] = self._user_name
if self._chat_id:
metadata["chat_id"] = self._chat_id
if self._chat_name:
metadata["chat_name"] = self._chat_name
if self._chat_type:
metadata["chat_type"] = self._chat_type
if self._thread_id:
metadata["thread_id"] = self._thread_id
if self._agent_identity:
metadata["agent_identity"] = self._agent_identity
return metadata
def _build_retain_kwargs(
self,
content: str,
*,
context: str | None = None,
document_id: str | None = None,
metadata: Dict[str, str] | None = None,
tags: List[str] | None = None,
retain_async: bool | None = None,
) -> Dict[str, Any]:
kwargs: Dict[str, Any] = {
"bank_id": self._bank_id,
"content": content,
"metadata": metadata or self._build_metadata(message_count=1, turn_index=self._turn_index),
}
if context is not None:
kwargs["context"] = context
if document_id:
kwargs["document_id"] = document_id
if retain_async is not None:
kwargs["retain_async"] = retain_async
merged_tags = _normalize_retain_tags(self._retain_tags)
for tag in _normalize_retain_tags(tags):
if tag not in merged_tags:
merged_tags.append(tag)
if merged_tags:
kwargs["tags"] = merged_tags
return kwargs
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
"""Retain conversation turn in background (non-blocking).
@@ -721,19 +885,14 @@ class HindsightMemoryProvider(MemoryProvider):
logger.debug("sync_turn: skipped (auto_retain disabled)")
return
from datetime import datetime, timezone
now = datetime.now(timezone.utc).isoformat()
if session_id:
self._session_id = str(session_id).strip()
messages = [
{"role": "user", "content": user_content, "timestamp": now},
{"role": "assistant", "content": assistant_content, "timestamp": now},
]
turn = json.dumps(messages)
turn = json.dumps(self._build_turn_messages(user_content, assistant_content))
self._session_turns.append(turn)
self._turn_counter += 1
self._turn_index = self._turn_counter
# Only retain every N turns
if self._turn_counter % self._retain_every_n_turns != 0:
logger.debug("sync_turn: buffered turn %d (will retain at turn %d)",
self._turn_counter, self._turn_counter + (self._retain_every_n_turns - self._turn_counter % self._retain_every_n_turns))
@@ -741,19 +900,21 @@ class HindsightMemoryProvider(MemoryProvider):
logger.debug("sync_turn: retaining %d turns, total session content %d chars",
len(self._session_turns), sum(len(t) for t in self._session_turns))
# Send the ENTIRE session as a single JSON array (document_id deduplicates).
# Each element in _session_turns is a JSON string of that turn's messages.
content = "[" + ",".join(self._session_turns) + "]"
def _sync():
try:
client = self._get_client()
item: dict = {
"content": content,
"context": self._retain_context,
}
if self._tags:
item["tags"] = self._tags
item = self._build_retain_kwargs(
content,
context=self._retain_context,
metadata=self._build_metadata(
message_count=len(self._session_turns) * 2,
turn_index=self._turn_index,
),
)
item.pop("bank_id", None)
item.pop("retain_async", None)
logger.debug("Hindsight retain: bank=%s, doc=%s, async=%s, content_len=%d, num_turns=%d",
self._bank_id, self._session_id, self._retain_async, len(content), len(self._session_turns))
_run_sync(client.aretain_batch(
@@ -789,11 +950,11 @@ class HindsightMemoryProvider(MemoryProvider):
return tool_error("Missing required parameter: content")
context = args.get("context")
try:
retain_kwargs: dict = {
"bank_id": self._bank_id, "content": content, "context": context,
}
if self._tags:
retain_kwargs["tags"] = self._tags
retain_kwargs = self._build_retain_kwargs(
content,
context=context,
tags=args.get("tags"),
)
logger.debug("Tool hindsight_retain: bank=%s, content_len=%d, context=%s",
self._bank_id, len(content), context)
_run_sync(client.aretain(**retain_kwargs))
+70
View File
@@ -0,0 +1,70 @@
# Strike Freedom Cockpit — dashboard skin demo
Demonstrates how the dashboard skin+plugin system can be used to build a
fully custom cockpit-style reskin without touching the core dashboard.
Two pieces:
- `theme/strike-freedom.yaml` — a dashboard theme YAML that paints the
palette, typography, layout variant (`cockpit`), component chrome
(notched card corners, scanlines, accent colors), and declares asset
slots (`hero`, `crest`, `bg`).
- `dashboard/` — a plugin that populates the `sidebar`, `header-left`,
and `footer-right` slots reserved by the cockpit layout. The sidebar
renders an MS-STATUS panel with segmented telemetry bars driven by
real agent status; the header-left injects a COMPASS crest; the
footer-right replaces the default org tagline.
## Install
1. **Theme** — copy the theme YAML into your Hermes home:
```
cp theme/strike-freedom.yaml ~/.hermes/dashboard-themes/
```
2. **Plugin** — the `dashboard/` directory gets auto-discovered because
it lives under `plugins/` in the repo. On a user install, copy the
whole plugin directory into `~/.hermes/plugins/`:
```
cp -r . ~/.hermes/plugins/strike-freedom-cockpit
```
3. Restart the web UI (or `GET /api/dashboard/plugins/rescan`), open it,
pick **Strike Freedom** from the theme switcher.
## Customising the artwork
The sidebar plugin reads `--theme-asset-hero` and `--theme-asset-crest`
from the active theme. Drop your own URLs into the theme YAML:
```yaml
assets:
hero: "/my-images/strike-freedom.png"
crest: "/my-images/compass-crest.svg"
bg: "/my-images/cosmic-era-bg.jpg"
```
The plugin reads those at render time — no plugin code changes needed
to swap artwork across themes.
## What this demo proves
The dashboard skin+plugin system supports (ref: `web/src/themes/types.ts`,
`web/src/plugins/slots.ts`):
- Palette, typography, font URLs, density, radius — already present
- **Asset URLs exposed as CSS vars** (bg / hero / crest / logo /
sidebar / header + arbitrary `custom.*`)
- **Raw `customCSS` blocks** injected as scoped `<style>` tags
- **Per-component style overrides** (card / header / sidebar / backdrop /
tab / progress / footer / badge / page) via CSS vars
- **`layoutVariant`** — `standard`, `cockpit`, or `tiled`
- **Plugin slots** — 10 named shell slots plugins can inject into
(`backdrop`, `header-left/right/banner`, `sidebar`, `pre-main`,
`post-main`, `footer-left/right`, `overlay`)
- **Route overrides** — plugins can replace a built-in page entirely
(`tab.override: "/"`) instead of just adding a tab
- **Hidden plugins** — slot-only plugins that never show in the nav
(`tab.hidden: true`) — as used here
+309
View File
@@ -0,0 +1,309 @@
/**
* Strike Freedom Cockpit dashboard plugin demo.
*
* A slot-only plugin (manifest sets tab.hidden: true) that populates
* three shell slots when the user has the ``strike-freedom`` theme
* selected (or any theme that picks layoutVariant: cockpit):
*
* - sidebar MS-STATUS panel: ENERGY / SHIELD / POWER bars,
* ZGMF-X20A identity line, pilot block, hero
* render (from --theme-asset-hero when the theme
* provides one).
* - header-left COMPASS faction crest (uses --theme-asset-crest
* if provided, falls back to a geometric SVG).
* - footer-right COSMIC ERA tagline that replaces the default
* footer org line.
*
* The plugin demonstrates every extension point added alongside the
* slot system: registerSlot, tab.hidden, reading theme asset CSS vars
* from plugin code, and rendering above the built-in route content.
*/
(function () {
"use strict";
const SDK = window.__HERMES_PLUGIN_SDK__;
const PLUGINS = window.__HERMES_PLUGINS__;
if (!SDK || !PLUGINS || !PLUGINS.registerSlot) {
// Old dashboard bundle without slot support — bail silently rather
// than breaking the page.
return;
}
const { React } = SDK;
const { useState, useEffect } = SDK.hooks;
const { api } = SDK;
// ---------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------
/** Read a CSS custom property from :root. Empty string when unset. */
function cssVar(name) {
if (typeof document === "undefined") return "";
return getComputedStyle(document.documentElement).getPropertyValue(name).trim();
}
/** Segmented chip progress bar — 10 cells filled proportionally to value. */
function TelemetryBar(props) {
const { label, value, color } = props;
const cells = [];
for (let i = 0; i < 10; i++) {
const filled = Math.round(value / 10) > i;
cells.push(
React.createElement("span", {
key: i,
style: {
flex: 1,
height: 8,
background: filled ? color : "rgba(255,255,255,0.06)",
transition: "background 200ms",
clipPath: "polygon(2px 0, 100% 0, calc(100% - 2px) 100%, 0 100%)",
},
}),
);
}
return React.createElement(
"div",
{ style: { display: "flex", flexDirection: "column", gap: 4 } },
React.createElement(
"div",
{
style: {
display: "flex",
justifyContent: "space-between",
fontSize: "0.65rem",
letterSpacing: "0.12em",
opacity: 0.75,
},
},
React.createElement("span", null, label),
React.createElement("span", { style: { color, fontWeight: 700 } }, value + "%"),
),
React.createElement(
"div",
{ style: { display: "flex", gap: 2 } },
cells,
),
);
}
// ---------------------------------------------------------------------
// Sidebar: MS-STATUS panel
// ---------------------------------------------------------------------
function SidebarSlot() {
// Pull live-ish numbers from the status API so the plugin isn't just
// a static decoration. Fall back to full bars if the API is slow /
// unavailable.
const [status, setStatus] = useState(null);
useEffect(function () {
let cancel = false;
api.getStatus()
.then(function (s) { if (!cancel) setStatus(s); })
.catch(function () {});
return function () { cancel = true; };
}, []);
// Map real status signals to HUD telemetry. Energy/shield/power
// aren't literal concepts on a software agent, so we read them from
// adjacent signals: active sessions, gateway connected-platforms,
// and agent-online health.
const energy = status && status.gateway_online ? 92 : 18;
const shield = status && status.connected_platforms
? Math.min(100, 40 + (status.connected_platforms.length * 15))
: 70;
const power = status && status.active_sessions
? Math.min(100, 55 + (status.active_sessions.length * 10))
: 87;
const hero = cssVar("--theme-asset-hero");
return React.createElement(
"div",
{
style: {
padding: "1rem 0.75rem",
display: "flex",
flexDirection: "column",
gap: "1rem",
fontFamily: "var(--theme-font-display, sans-serif)",
letterSpacing: "0.08em",
textTransform: "uppercase",
fontSize: "0.65rem",
},
},
// Header line
React.createElement(
"div",
{
style: {
borderBottom: "1px solid rgba(64,200,255,0.3)",
paddingBottom: 8,
display: "flex",
flexDirection: "column",
gap: 2,
},
},
React.createElement("span", { style: { opacity: 0.6 } }, "ms status"),
React.createElement("span", { style: { fontWeight: 700, fontSize: "0.85rem" } }, "zgmf-x20a"),
React.createElement("span", { style: { opacity: 0.6, fontSize: "0.6rem" } }, "strike freedom"),
),
// Hero slot — only renders when the theme provides one.
hero
? React.createElement("div", {
style: {
width: "100%",
aspectRatio: "3 / 4",
backgroundImage: hero,
backgroundSize: "contain",
backgroundPosition: "center",
backgroundRepeat: "no-repeat",
opacity: 0.85,
},
"aria-hidden": true,
})
: React.createElement("div", {
style: {
width: "100%",
aspectRatio: "3 / 4",
border: "1px dashed rgba(64,200,255,0.25)",
display: "flex",
alignItems: "center",
justifyContent: "center",
fontSize: "0.55rem",
opacity: 0.4,
},
}, "hero slot — set assets.hero in theme"),
// Pilot block
React.createElement(
"div",
{
style: {
borderTop: "1px solid rgba(64,200,255,0.18)",
borderBottom: "1px solid rgba(64,200,255,0.18)",
padding: "8px 0",
display: "flex",
flexDirection: "column",
gap: 2,
},
},
React.createElement("span", { style: { opacity: 0.5, fontSize: "0.55rem" } }, "pilot"),
React.createElement("span", { style: { fontWeight: 700 } }, "hermes agent"),
React.createElement("span", { style: { opacity: 0.5, fontSize: "0.55rem" } }, "compass"),
),
// Telemetry bars
React.createElement(TelemetryBar, { label: "energy", value: energy, color: "#ffce3a" }),
React.createElement(TelemetryBar, { label: "shield", value: shield, color: "#3fd3ff" }),
React.createElement(TelemetryBar, { label: "power", value: power, color: "#ff3a5e" }),
// System online
React.createElement(
"div",
{
style: {
marginTop: 4,
padding: "6px 8px",
border: "1px solid rgba(74,222,128,0.4)",
color: "#4ade80",
textAlign: "center",
fontWeight: 700,
fontSize: "0.6rem",
},
},
status && status.gateway_online ? "system online" : "system offline",
),
);
}
// ---------------------------------------------------------------------
// Header-left: COMPASS crest
// ---------------------------------------------------------------------
function HeaderCrestSlot() {
const crest = cssVar("--theme-asset-crest");
const inner = crest
? React.createElement("div", {
style: {
width: 28,
height: 28,
backgroundImage: crest,
backgroundSize: "contain",
backgroundPosition: "center",
backgroundRepeat: "no-repeat",
},
"aria-hidden": true,
})
: React.createElement(
"svg",
{
width: 28,
height: 28,
viewBox: "0 0 28 28",
fill: "none",
stroke: "currentColor",
strokeWidth: 1.5,
"aria-hidden": true,
},
React.createElement("path", { d: "M14 2 L26 14 L14 26 L2 14 Z" }),
React.createElement("path", { d: "M14 8 L20 14 L14 20 L8 14 Z" }),
React.createElement("circle", { cx: 14, cy: 14, r: 2, fill: "currentColor" }),
);
return React.createElement(
"div",
{
style: {
display: "flex",
alignItems: "center",
paddingLeft: 12,
paddingRight: 8,
color: "var(--color-accent, #3fd3ff)",
},
},
inner,
);
}
// ---------------------------------------------------------------------
// Footer-right: COSMIC ERA tagline
// ---------------------------------------------------------------------
function FooterTaglineSlot() {
return React.createElement(
"span",
{
style: {
fontFamily: "var(--theme-font-display, sans-serif)",
fontSize: "0.6rem",
letterSpacing: "0.18em",
textTransform: "uppercase",
opacity: 0.75,
mixBlendMode: "plus-lighter",
},
},
"compass hermes systems / cosmic era 71",
);
}
// ---------------------------------------------------------------------
// Hidden tab placeholder — tab.hidden=true means this never renders in
// the nav, but we still register something sensible in case someone
// manually navigates to /strike-freedom-cockpit (e.g. via a bookmark).
// ---------------------------------------------------------------------
function HiddenPage() {
return React.createElement(
"div",
{ style: { padding: "2rem", opacity: 0.6, fontSize: "0.8rem" } },
"Strike Freedom cockpit is a slot-only plugin — it populates the sidebar, header, and footer instead of showing a tab page.",
);
}
// ---------------------------------------------------------------------
// Registration
// ---------------------------------------------------------------------
const NAME = "strike-freedom-cockpit";
PLUGINS.register(NAME, HiddenPage);
PLUGINS.registerSlot(NAME, "sidebar", SidebarSlot);
PLUGINS.registerSlot(NAME, "header-left", HeaderCrestSlot);
PLUGINS.registerSlot(NAME, "footer-right", FooterTaglineSlot);
})();
@@ -0,0 +1,14 @@
{
"name": "strike-freedom-cockpit",
"label": "Strike Freedom Cockpit",
"description": "MS-STATUS sidebar + header crest for the Strike Freedom theme",
"icon": "Shield",
"version": "1.0.0",
"tab": {
"path": "/strike-freedom-cockpit",
"position": "end",
"hidden": true
},
"slots": ["sidebar", "header-left", "footer-right"],
"entry": "dist/index.js"
}
@@ -0,0 +1,126 @@
# Strike Freedom — Hermes dashboard theme demo
#
# Copy this file to ~/.hermes/dashboard-themes/strike-freedom.yaml and
# restart the web UI (or hit `/api/dashboard/plugins/rescan`). Pair with
# the `strike-freedom-cockpit` plugin (plugins/strike-freedom-cockpit/)
# for the full cockpit experience — this theme paints the palette,
# chrome, and layout; the plugin supplies the MS-STATUS sidebar + header
# crest that the cockpit layout variant reserves space for.
#
# Demonstrates every theme extension point added alongside the plugin
# slot system: palette, typography, layoutVariant, assets, customCSS,
# componentStyles, colorOverrides.
name: strike-freedom
label: "Strike Freedom"
description: "Cockpit HUD — deep navy + cyan + gold accents"
# ------- palette (3-layer) -------
palette:
background: "#05091a"
midground: "#d8f0ff"
foreground:
hex: "#ffffff"
alpha: 0
warmGlow: "rgba(255, 199, 55, 0.24)"
noiseOpacity: 0.7
# ------- typography -------
typography:
fontSans: '"Orbitron", "Eurostile", "Bank Gothic", "Impact", sans-serif'
fontMono: '"Share Tech Mono", "JetBrains Mono", ui-monospace, monospace'
fontDisplay: '"Orbitron", "Eurostile", "Impact", sans-serif'
fontUrl: "https://fonts.googleapis.com/css2?family=Orbitron:wght@400;500;600;700;800&family=Share+Tech+Mono&display=swap"
baseSize: "14px"
lineHeight: "1.5"
letterSpacing: "0.04em"
# ------- layout -------
layout:
radius: "0"
density: "compact"
# ``cockpit`` reserves a 260px left rail that the shell renders when the
# user is on this theme. A paired plugin populates the rail via the
# ``sidebar`` slot; with no plugin the rail shows a placeholder.
layoutVariant: cockpit
# ------- assets -------
# Use any URL (https, data:, /dashboard-plugins/...) or a pre-wrapped
# ``url(...)``/``linear-gradient(...)`` expression. The shell exposes
# each as a CSS var so plugins can read the same imagery.
assets:
bg: "linear-gradient(140deg, #05091a 0%, #0a1530 55%, #102048 100%)"
# Plugin reads --theme-asset-hero / --theme-asset-crest to populate
# its sidebar hero render + header crest. Replace these URLs with your
# own artwork (copy files into ~/.hermes/dashboard-themes/assets/ and
# reference them as /dashboard-themes-assets/strike-freedom/hero.png
# once that static route is wired up — for now use inline data URLs or
# remote URLs).
hero: ""
crest: ""
# ------- component chrome -------
# Each bucket's props become CSS vars (--component-<bucket>-<kebab>) that
# built-in shell components (Card, header, sidebar, backdrop) consume.
componentStyles:
card:
# Notched corners on the top-left + bottom-right — classic mecha UI.
clipPath: "polygon(12px 0, 100% 0, 100% calc(100% - 12px), calc(100% - 12px) 100%, 0 100%, 0 12px)"
background: "linear-gradient(180deg, rgba(10, 22, 52, 0.85) 0%, rgba(5, 9, 26, 0.92) 100%)"
boxShadow: "inset 0 0 0 1px rgba(64, 200, 255, 0.28), 0 0 18px -6px rgba(64, 200, 255, 0.4)"
header:
background: "linear-gradient(180deg, rgba(16, 32, 72, 0.95) 0%, rgba(5, 9, 26, 0.9) 100%)"
sidebar:
background: "linear-gradient(180deg, rgba(8, 18, 42, 0.88) 0%, rgba(5, 9, 26, 0.85) 100%)"
tab:
clipPath: "polygon(6px 0, 100% 0, calc(100% - 6px) 100%, 0 100%)"
backdrop:
backgroundSize: "cover"
backgroundPosition: "center"
fillerOpacity: "1"
fillerBlendMode: "normal"
# ------- color overrides -------
colorOverrides:
primary: "#ffce3a"
primaryForeground: "#05091a"
accent: "#3fd3ff"
accentForeground: "#05091a"
ring: "#3fd3ff"
success: "#4ade80"
warning: "#ffce3a"
destructive: "#ff3a5e"
border: "rgba(64, 200, 255, 0.28)"
# ------- customCSS -------
# Raw CSS injected as a scoped <style> tag on theme apply. Use this for
# selector-level tweaks componentStyles can't express (pseudo-elements,
# animations, media queries). Bounded to 32 KiB per theme.
customCSS: |
/* Scanline overlay — subtle, only when theme is active. */
:root[data-layout-variant="cockpit"] body::before {
content: "";
position: fixed;
inset: 0;
pointer-events: none;
z-index: 100;
background: repeating-linear-gradient(
to bottom,
transparent 0px,
transparent 2px,
rgba(64, 200, 255, 0.035) 3px,
rgba(64, 200, 255, 0.035) 4px
);
mix-blend-mode: screen;
}
/* Chevron pips on card corners. */
[data-layout-variant="cockpit"] .border-border::before,
[data-layout-variant="cockpit"] .border-border::after {
content: "";
position: absolute;
width: 8px;
height: 8px;
border: 1px solid rgba(64, 200, 255, 0.55);
pointer-events: none;
}
+28 -3
View File
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.10.0"
version = "0.11.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
@@ -39,7 +39,7 @@ dependencies = [
[project.optional-dependencies]
modal = ["modal>=1.0.0,<2"]
daytona = ["daytona>=0.148.0,<1"]
dev = ["debugpy>=1.8.0,<2", "pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2"]
dev = ["debugpy>=1.8.0,<2", "pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2", "ty>=0.0.1a29,<0.0.22", "ruff"]
messaging = ["python-telegram-bot[webhooks]>=22.6,<23", "discord.py[voice]>=2.7.1,<3", "aiohttp>=3.13.3,<4", "slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4", "qrcode>=7.0,<8"]
cron = ["croniter>=6.0.0,<7"]
slack = ["slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4"]
@@ -126,7 +126,7 @@ py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajector
hermes_cli = ["web_dist/**/*"]
[tool.setuptools.packages.find]
include = ["agent", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "plugins", "plugins.*"]
include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "plugins", "plugins.*"]
[tool.pytest.ini_options]
testpaths = ["tests"]
@@ -134,3 +134,28 @@ markers = [
"integration: marks tests requiring external services (API keys, Modal, etc.)",
]
addopts = "-m 'not integration' -n auto"
[tool.ty.environment]
python-version = "3.13"
[tool.ty.rules]
unknown-argument = "warn"
redundant-cast = "ignore"
[tool.ty.src]
exclude = ["**"]
[[tool.ty.overrides]]
include = ["**"]
[tool.ty.overrides.rules]
unresolved-import = "ignore"
invalid-method-override = "ignore"
invalid-assignment = "ignore"
not-iterable = "ignore"
[tool.ruff]
exclude = ["*"]
[tool.uv]
exclude-newer = "7 days"
+538 -514
View File
File diff suppressed because it is too large Load Diff
+1 -1
View File
@@ -265,7 +265,7 @@ def check_config(groq_key, eleven_key):
if voice_mode_path.exists():
try:
import json
modes = json.loads(voice_mode_path.read_text())
modes = json.loads(voice_mode_path.read_text(encoding="utf-8"))
off_count = sum(1 for v in modes.values() if v == "off")
all_count = sum(1 for v in modes.values() if v == "all")
check("Voice mode state", True, f"{all_count} on, {off_count} off, {len(modes)} total")
+112
View File
@@ -43,17 +43,27 @@ AUTHOR_MAP = {
"teknium1@gmail.com": "teknium1",
"teknium@nousresearch.com": "teknium1",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"343873859@qq.com": "DrStrangerUJN",
"jefferson@heimdallstrategy.com": "Mind-Dragon",
"130918800+devorun@users.noreply.github.com": "devorun",
"maks.mir@yahoo.com": "say8hi",
# contributors (from noreply pattern)
"david.vv@icloud.com": "davidvv",
"wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
"snreynolds2506@gmail.com": "snreynolds",
"35742124+0xbyt4@users.noreply.github.com": "0xbyt4",
"71184274+MassiveMassimo@users.noreply.github.com": "MassiveMassimo",
"massivemassimo@users.noreply.github.com": "MassiveMassimo",
"82637225+kshitijk4poor@users.noreply.github.com": "kshitijk4poor",
"keifergu@tencent.com": "keifergu",
"kshitijk4poor@users.noreply.github.com": "kshitijk4poor",
"abner.the.foreman@agentmail.to": "Abnertheforeman",
"harryykyle1@gmail.com": "hharry11",
"kshitijk4poor@gmail.com": "kshitijk4poor",
"16443023+stablegenius49@users.noreply.github.com": "stablegenius49",
"185121704+stablegenius49@users.noreply.github.com": "stablegenius49",
"101283333+batuhankocyigit@users.noreply.github.com": "batuhankocyigit",
"255305877+ismell0992-afk@users.noreply.github.com": "ismell0992-afk",
"valdi.jorge@gmail.com": "jvcl",
"francip@gmail.com": "francip",
"omni@comelse.com": "omnissiah-comelse",
@@ -91,6 +101,9 @@ AUTHOR_MAP = {
"135070653+sgaofen@users.noreply.github.com": "sgaofen",
"nocoo@users.noreply.github.com": "nocoo",
"30841158+n-WN@users.noreply.github.com": "n-WN",
"tsuijinglei@gmail.com": "hiddenpuppy",
"jerome@clawwork.ai": "HiddenPuppy",
"wysie@users.noreply.github.com": "Wysie",
"leoyuan0099@gmail.com": "keyuyuan",
"bxzt2006@163.com": "Only-Code-A",
"i@troy-y.org": "TroyMitchell911",
@@ -98,8 +111,12 @@ AUTHOR_MAP = {
"hansnow@users.noreply.github.com": "hansnow",
"134848055+UNLINEARITY@users.noreply.github.com": "UNLINEARITY",
"ben.burtenshaw@gmail.com": "burtenshaw",
"roopaknijhara@gmail.com": "rnijhara",
"josephzcan@gmail.com": "j0sephz",
# contributors (manual mapping from git names)
"ahmedsherif95@gmail.com": "asheriif",
"dyxushuai@gmail.com": "dyxushuai",
"33860762+etcircle@users.noreply.github.com": "etcircle",
"liujinkun@bytedance.com": "liujinkun2025",
"dmayhem93@gmail.com": "dmahan93",
"fr@tecompanytea.com": "ifrederico",
@@ -133,6 +150,7 @@ AUTHOR_MAP = {
"331214+counterposition@users.noreply.github.com": "counterposition",
"blspear@gmail.com": "BrennerSpear",
"akhater@gmail.com": "akhater",
"Cos_Admin@PTG-COS.lodluvup4uaudnm3ycd14giyug.xx.internal.cloudapp.net": "akhater",
"239876380+handsdiff@users.noreply.github.com": "handsdiff",
"hesapacicam112@gmail.com": "etherman-os",
"mark.ramsell@rivermounts.com": "mark-ramsell",
@@ -149,7 +167,10 @@ AUTHOR_MAP = {
"socrates1024@gmail.com": "socrates1024",
"seanalt555@gmail.com": "Salt-555",
"satelerd@gmail.com": "satelerd",
"dan@danlynn.com": "danklynn",
"mattmaximo@hotmail.com": "MattMaximo",
"numman.ali@gmail.com": "nummanali",
"rohithsaimidigudla@gmail.com": "whitehatjr1001",
"0xNyk@users.noreply.github.com": "0xNyk",
"0xnykcd@googlemail.com": "0xNyk",
"buraysandro9@gmail.com": "buray",
@@ -174,6 +195,7 @@ AUTHOR_MAP = {
"adavyasharma@gmail.com": "adavyas",
"acaayush1111@gmail.com": "aayushchaudhary",
"jason@outland.art": "jasonoutland",
"73175452+Magaav@users.noreply.github.com": "Magaav",
"mrflu1918@proton.me": "SPANISHFLU",
"morganemoss@gmai.com": "mormio",
"kopjop926@gmail.com": "cesareth",
@@ -278,6 +300,7 @@ AUTHOR_MAP = {
"srhtsrht17@gmail.com": "Sertug17",
"stephenschoettler@gmail.com": "stephenschoettler",
"tanishq231003@gmail.com": "yyovil",
"taosiyuan163@153.com": "taosiyuan163",
"tesseracttars@gmail.com": "tesseracttars-creator",
"tianliangjay@gmail.com": "xingkongliang",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
@@ -333,6 +356,95 @@ AUTHOR_MAP = {
"asslaenn5@gmail.com": "Aslaaen",
"shalompmc0505@naver.com": "pinion05",
"105142614+VTRiot@users.noreply.github.com": "VTRiot",
"vivien000812@gmail.com": "iamagenius00",
"89228157+Feranmi10@users.noreply.github.com": "Feranmi10",
"simon@gtcl.us": "simon-gtcl",
"suzukaze.haduki@gmail.com": "houko",
"cliff@cigii.com": "cgarwood82",
"anna@oa.ke": "anna-oake",
"jaffarkeikei@gmail.com": "jaffarkeikei",
"hxp@hxp.plus": "hxp-plus",
"3580442280@qq.com": "Tianworld",
"wujianxu91@gmail.com": "wujhsu",
"zhrh120@gmail.com": "niyoh120",
"vrinek@hey.com": "vrinek",
"268198004+xandersbell@users.noreply.github.com": "xandersbell",
"somme4096@gmail.com": "Somme4096",
"brian@tiuxo.com": "brianclemens",
"25944632+yudaiyan@users.noreply.github.com": "yudaiyan",
"chayton@sina.com": "ycbai",
"longsizhuo@gmail.com": "longsizhuo",
"chenb19870707@gmail.com": "ms-alan",
"276886827+WuTianyi123@users.noreply.github.com": "WuTianyi123",
"22549957+li0near@users.noreply.github.com": "li0near",
"23434080+sicnuyudidi@users.noreply.github.com": "sicnuyudidi",
"haimu0x0@proton.me": "haimu0x",
"abdelmajidnidnasser1@gmail.com": "NIDNASSER-Abdelmajid",
"projectadmin@wit.id": "projectadmin-dev",
"mrigankamondal10@gmail.com": "Dev-Mriganka",
"132275809+shushuzn@users.noreply.github.com": "shushuzn",
"ibrahimozsarac@gmail.com": "iborazzi",
"130149563+A-afflatus@users.noreply.github.com": "A-afflatus",
"huangkwell@163.com": "huangke19",
"tanishq@exa.ai": "10ishq",
"363708+christopherwoodall@users.noreply.github.com": "christopherwoodall",
"zhang9w0v5@qq.com": "zhang9w0v5",
"fuleinist@outlook.com": "fuleinist",
"43494187+Llugaes@users.noreply.github.com": "Llugaes",
"fengtianyu88@users.noreply.github.com": "fengtianyu88",
"l.moncany@gmail.com": "lmoncany",
"fatinghenji@users.noreply.github.com": "fatinghenji",
"xin.peng.dr@gmail.com": "xinpengdr",
"mike@mikewaters.net": "mikewaters",
"65117428+WadydX@users.noreply.github.com": "WadydX",
"216480837+isaachuangGMICLOUD@users.noreply.github.com": "isaachuangGMICLOUD",
"nukuom976228@gmail.com": "hsy5571616",
"11462216+Nan93@users.noreply.github.com": "Nan93",
"l973401489@126.com": "zhouxiaoya12",
"373119611@qq.com": "roytian1217",
"brett@brettbrewer.com": "minorgod",
"67779267+wenhao7@users.noreply.github.com": "wenhao7",
"git@yzx9.xyz": "yzx9",
"nilesh@cloudgeni.us": "lvnilesh",
"63502660+azhengbot@users.noreply.github.com": "azhengbot",
"sharvil.saxena@gmail.com": "sharziki",
"yuanhe@minimaxi.com": "RyanLee-Dev",
"curtis992250@gmail.com": "TaroballzChen",
"92638503+Lind3ey@users.noreply.github.com": "Lind3ey",
"1352808998@qq.com": "phpoh",
"caliberoviv@gmail.com": "vivganes",
"michaelfackerell@gmail.com": "MikeFac",
"18024642@qq.com": "GuyCui",
"eumael.mkt@gmail.com": "maelrx",
# v0.11.0 additions
"benbarclay@gmail.com": "benbarclay",
"lijiawen@umich.edu": "Jiawen-lee",
"oleksiy@kovyrin.net": "kovyrin",
"kovyrin.claw@gmail.com": "kovyrin",
"kaiobarb@gmail.com": "liftaris",
"me@arihantsethia.com": "arihantsethia",
"zhuofengwang2003@gmail.com": "coekfung",
"teknium@noreply.github.com": "teknium1",
"2114364329@qq.com": "cuyua9",
"2557058999@qq.com": "Disaster-Terminator",
"cine.dreamer.one@gmail.com": "LeonSGP43",
"leozeli@qq.com": "leozeli",
"linlehao@cuhk.edu.cn": "LehaoLin",
"liutong@isacas.ac.cn": "I3eg1nner",
"peterberthelsen@Peters-MacBook-Air.local": "PeterBerthelsen",
"root@debian.debian": "lengxii",
"roque@priveperfumeshn.com": "priveperfumes",
"shijianzhi@shijianzhideMacBook-Pro.local": "sjz-ks",
"topcheer@me.com": "topcheer",
"walli@tencent.com": "walli",
"zhuofengwang@tencent.com": "Zhuofeng-Wang",
# no-github-match — keep as display names
"clio-agent@sisyphuslabs.ai": "Sisyphus",
"marco@rutimka.de": "Marco Rutsch",
"paul@gamma.app": "Paul Bergeron",
"zhangxicen@example.com": "zhangxicen",
"codex@openai.invalid": "teknium1",
"screenmachine@gmail.com": "teknium1",
}
+1 -1
View File
@@ -8,7 +8,7 @@
"name": "hermes-whatsapp-bridge",
"version": "1.0.0",
"dependencies": {
"@whiskeysockets/baileys": "WhiskeySockets/Baileys#fix/abprops-abt-fetch",
"@whiskeysockets/baileys": "WhiskeySockets/Baileys#01047debd81beb20da7b7779b08edcb06aa03770",
"express": "^4.21.0",
"pino": "^9.0.0",
"qrcode-terminal": "^0.12.0"
+77
View File
@@ -0,0 +1,77 @@
# Port Notes — baoyu-comic
Ported from [JimLiu/baoyu-skills](https://github.com/JimLiu/baoyu-skills) v1.56.1.
## Changes from upstream
### SKILL.md adaptations
| Change | Upstream | Hermes |
|--------|----------|--------|
| Metadata namespace | `openclaw` | `hermes` (with `tags` + `homepage`) |
| Trigger | Slash commands / CLI flags | Natural language skill matching |
| User config | EXTEND.md file (project/user/XDG paths) | Removed — not part of Hermes infra |
| User prompts | `AskUserQuestion` (batched) | `clarify` tool (one question at a time) |
| Image generation | baoyu-imagine (Bun/TypeScript, supports `--ref`) | `image_generate`**prompt-only**, returns a URL; no reference image input; agent must download the URL to the output directory |
| PDF assembly | `scripts/merge-to-pdf.ts` (Bun + `pdf-lib`) | Removed — the PDF merge step is out of scope for this port; pages are delivered as PNGs only |
| Platform support | Linux/macOS/Windows/WSL/PowerShell | Linux/macOS only |
| File operations | Generic instructions | Hermes file tools (`write_file`, `read_file`) |
### Structural removals
- **`references/config/` directory** (removed entirely):
- `first-time-setup.md` — blocking first-time setup flow for EXTEND.md
- `preferences-schema.md` — EXTEND.md YAML schema
- `watermark-guide.md` — watermark config (tied to EXTEND.md)
- **`scripts/` directory** (removed entirely): upstream's `merge-to-pdf.ts` depended on `pdf-lib`, which is not declared anywhere in the Hermes repo. Rather than add a new dependency, the port drops PDF assembly and delivers per-page PNGs.
- **Workflow Step 8 (Merge to PDF)** removed from `workflow.md`; Step 9 (Completion report) renumbered to Step 8.
- **Workflow Step 1.1** — "Load Preferences (EXTEND.md)" section removed from `workflow.md`; steps 1.2/1.3 renumbered to 1.1/1.2.
- **Generic "User Input Tools" and "Image Generation Tools" preambles** — SKILL.md no longer lists fallback rules for multiple possible tools; it references `clarify` and `image_generate` directly.
### Image generation strategy changes
`image_generate`'s schema accepts only `prompt` and `aspect_ratio` (`landscape` | `portrait` | `square`). Upstream's reference-image flow (`--ref characters.png` for character consistency, plus user-supplied refs for style/palette/scene) does not map to this tool, so the workflow was restructured:
- **Character sheet PNG** is still generated for multi-page comics, but it is repositioned as a **human-facing review artifact** (for visual verification) and a reference for later regenerations / manual prompt edits. Page prompts themselves are built from the **text descriptions** in `characters/characters.md` (embedded inline during Step 5). `image_generate` never sees the PNG as a visual input.
- **User-supplied reference images** are reduced to `style` / `palette` / `scene` trait extraction — traits are embedded in the prompt body; the image files themselves are kept only for provenance under `refs/`.
- **Page prompts** now mandate that character descriptions are embedded inline (copied from `characters/characters.md`) — this is the only mechanism left to enforce cross-page character consistency.
- **Download step** — after every `image_generate` call, the returned URL is fetched to disk (e.g., `curl -fsSL "<url>" -o <target>.png`) and verified before the workflow advances.
### SKILL.md reductions
- CLI option columns (`--art`, `--tone`, `--layout`, `--aspect`, `--lang`, `--ref`, `--storyboard-only`, `--prompts-only`, `--images-only`, `--regenerate`) converted to plain-English option descriptions.
- Preset files (`presets/*.md`) and `ohmsha-guide.md`: `` `--style X` `` / `` `--art X --tone Y` `` shorthand rewritten to `art=X, tone=Y` + natural-language references.
- `partial-workflows.md`: per-skill slash command invocations rewritten as user-intent cues; PDF-related outputs removed.
- `auto-selection.md`: priority order dropped the EXTEND.md tier.
- `analysis-framework.md`: language-priority comment updated (user option → conversation → source).
### File naming convention
Source content pasted by the user is saved as `source-{slug}.md`, where `{slug}` is the kebab-case topic slug used for the output directory. Backups follow the same pattern with a `-backup-YYYYMMDD-HHMMSS` suffix. SKILL.md and `workflow.md` now agree on this single convention.
### What was preserved verbatim
- All 6 art-style definitions (`references/art-styles/`)
- All 7 tone definitions (`references/tones/`)
- All 7 layout definitions (`references/layouts/`)
- Core templates: `character-template.md`, `storyboard-template.md`, `base-prompt.md`
- Preset bodies (only the first few intro lines adapted; special rules unchanged)
- Author, version, homepage attribution
## Syncing with upstream
To pull upstream updates:
```bash
# Compare versions
curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic/SKILL.md | head -5
# Look for the version: line
# Diff a reference file
diff <(curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic/references/art-styles/manga.md) \
references/art-styles/manga.md
```
Art-style, tone, and layout reference files can usually be overwritten directly (they're upstream-verbatim). `SKILL.md`, `references/workflow.md`, `references/partial-workflows.md`, `references/auto-selection.md`, `references/analysis-framework.md`, `references/ohmsha-guide.md`, and `references/presets/*.md` must be manually merged since they contain Hermes-specific adaptations.
If upstream adds a Hermes-compatible PDF merge step (no extra npm deps), restore `scripts/` and reintroduce Step 8 in `workflow.md`.

Some files were not shown because too many files have changed in this diff Show More