Compare commits

1 Commits

Author SHA1 Message Date
Brooklyn Nicholson d049d88dd7 fix(docker): run bundled TUI without npm install 2026-04-30 12:36:02 -05:00
275 changed files with 2068 additions and 41175 deletions
-6
@@ -9,12 +9,6 @@ node_modules
 .venv
 **/.venv
-# Built artifacts that are regenerated inside the image. Excluded so local
-# rebuilds on the developer's machine don't invalidate the npm-install layer
-# that now depends on the full ui-tui/packages/hermes-ink/ tree being present.
-ui-tui/dist/
-ui-tui/packages/hermes-ink/dist/
 # CI/CD
 .github
-10
@@ -76,16 +76,6 @@ jobs:
 run: |
 mkdir -p _site/docs
 cp -r website/build/* _site/docs/
-# llms.txt / llms-full.txt are also published at the site root
-# (https://hermes-agent.nousresearch.com/llms.txt) because some
-# agents and IDE plugins probe the classic root-level path rather
-# than /docs/llms.txt. Same file, two URLs, one source of truth.
-if [ -f website/build/llms.txt ]; then
-cp website/build/llms.txt _site/llms.txt
-fi
-if [ -f website/build/llms-full.txt ]; then
-cp website/build/llms-full.txt _site/llms-full.txt
-fi
 - name: Upload artifact
 uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa # v3
+9 -18
@@ -28,26 +28,10 @@ WORKDIR /opt/hermes
 # ---------- Layer-cached dependency install ----------
 # Copy only package manifests first so npm install + Playwright are cached
 # unless the lockfiles themselves change.
-#
-# ui-tui/packages/hermes-ink/ is copied IN FULL (not just its manifests)
-# because it is referenced as a `file:` workspace dependency from
-# ui-tui/package.json. Copying the tree up front lets npm resolve the
-# workspace to real content instead of stopping at a bare package.json.
 COPY package.json package-lock.json ./
 COPY web/package.json web/package-lock.json web/
 COPY ui-tui/package.json ui-tui/package-lock.json ui-tui/
-COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
-# `npm_config_install_links=false` forces npm to install `file:` deps as
-# symlinks (the npm 10+ default) even on Debian's older bundled npm 9.x,
-# which defaults to `install-links=true` and installs file deps as *copies*.
-# The host-side package-lock.json is generated with a newer npm that uses
-# symlinks, so an install-as-copy produces a hidden node_modules/.package-lock.json
-# that permanently disagrees with the root lock on the @hermes/ink entry.
-# That disagreement trips the TUI launcher's `_tui_need_npm_install()`
-# check on every startup and triggers a runtime `npm install` that then
-# fails with EACCES (node_modules/ is root-owned from build time).
-ENV npm_config_install_links=false
+COPY ui-tui/packages/hermes-ink/package.json ui-tui/packages/hermes-ink/package-lock.json ui-tui/packages/hermes-ink/
 RUN npm install --prefer-offline --no-audit && \
 npx playwright install --with-deps chromium --only-shell && \
@@ -61,7 +45,14 @@ COPY --chown=hermes:hermes . .
 # Build browser dashboard and terminal UI assets.
 RUN cd web && npm run build && \
-cd ../ui-tui && npm run build
+cd ../ui-tui && npm run build && \
+rm -rf node_modules/@hermes/ink && \
+rm -rf packages/hermes-ink/node_modules && \
+cp -R packages/hermes-ink node_modules/@hermes/ink && \
+npm install --omit=dev --prefer-offline --no-audit --prefix node_modules/@hermes/ink && \
+rm -rf node_modules/@hermes/ink/node_modules/react && \
+node --input-type=module -e "await import('@hermes/ink')" && \
+touch .hermes-prebuilt-tui
 # ---------- Permissions ----------
 # Make install dir world-readable so any HERMES_UID can read it at runtime.
-505
@@ -1,505 +0,0 @@
# Hermes Agent v0.12.0 (v2026.4.30)
**Release Date:** April 30, 2026
**Since v0.11.0:** 1,096 commits · 550 merged PRs · 1,270 files changed · 217,776 insertions · 213 community contributors (including co-authors)
> The Curator release — Hermes Agent now maintains itself. An autonomous background Curator grades, prunes, and consolidates your skill library on its own schedule. The self-improvement loop that reviews what to save got a substantial upgrade. Four new inference providers, an 18th messaging platform, a 19th via Teams plugin, native Spotify + Google Meet integrations, ComfyUI and TouchDesigner-MCP moved from optional to bundled-by-default, and a ~57% cut to visible TUI cold start.
---
## ✨ Highlights
- **Autonomous Curator** — `hermes curator` runs as a background agent on the gateway's cron ticker (7-day cycle default). It grades your skill library, consolidates related skills, prunes dead ones, and writes per-run reports to `logs/curator/run.json` + `REPORT.md`. Archived skills are classified consolidated-vs-pruned via model + heuristic. Defense-in-depth gates protect bundled/hub skills from mutation. Unified under `auxiliary.curator` — pick the curator's model in `hermes model`, manage it from the dashboard. `hermes curator status` ranks skills by usage (most-used / least-used). ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277), [#17307](https://github.com/NousResearch/hermes-agent/pull/17307), [#17941](https://github.com/NousResearch/hermes-agent/pull/17941), [#17868](https://github.com/NousResearch/hermes-agent/pull/17868), [#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Self-improvement loop — substantially upgraded** — The background review fork (the core of Hermes' self-improvement: after each turn it decides what memories/skills to save or update) is now class-first (rubric-based rather than free-form), active-update biased (prefers the skill the agent just loaded), handles `references/`/`templates/` sub-files, and properly inherits the parent's live runtime (provider, model, credentials actually propagate). Restricted to memory + skills toolsets so it can't sprawl. Memory providers shut down cleanly. Prior-turn tool messages excluded from the summary so the fork sees a clean context. ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026), [#17213](https://github.com/NousResearch/hermes-agent/pull/17213), [#16099](https://github.com/NousResearch/hermes-agent/pull/16099), [#16569](https://github.com/NousResearch/hermes-agent/pull/16569), [#16204](https://github.com/NousResearch/hermes-agent/pull/16204), [#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
- **Skill integrations — major expansion** — **ComfyUI v5** with official CLI + REST + hardware-gated local install, moved from optional to **built-in by default** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734)). **TouchDesigner-MCP** bundled by default, expanded with GLSL, post-FX, audio, geometry, and 9 new reference docs ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753), [#16624](https://github.com/NousResearch/hermes-agent/pull/16624), [#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @kshitijk4poor + @SHL0MS). **Humanizer** skill ports a text-cleaner that strips AI-isms ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787)). **claude-design** HTML artifact skill + design-md (Google DESIGN.md spec) + airtable salvage + `skill_manage` edits in `external_dirs` + direct-URL skill install + `/reload-skills` slash command. ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358), [#14876](https://github.com/NousResearch/hermes-agent/pull/14876), [#16291](https://github.com/NousResearch/hermes-agent/pull/16291), [#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#16323](https://github.com/NousResearch/hermes-agent/pull/16323), [#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **LM Studio — first-class provider** — upgraded from a custom-endpoint alias to a full-blown native provider: dedicated auth, `hermes doctor` checks, reasoning transport, live `/models` listing. (Salvage of @kshitijk4poor's #17061.) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **Four more new inference providers** — **GMI Cloud** (first-class, salvage of #11955 by @isaachuangGMICLOUD), **Azure AI Foundry** with auto-detection, **MiniMax OAuth** with PKCE browser flow (salvage #15203), **Tencent Tokenhub** (salvage of #16860). ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663), [#15845](https://github.com/NousResearch/hermes-agent/pull/15845), [#17524](https://github.com/NousResearch/hermes-agent/pull/17524), [#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
- **Pluggable gateway platforms + Microsoft Teams** — the gateway is now a plugin host. Drop-in messaging adapters live outside the core, and Microsoft Teams is the first plugin-shipped platform. (Salvage of #17664.) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751), [#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Tencent 元宝 (Yuanbao) — 18th messaging platform** — native gateway adapter with text + media delivery. ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424))
- **Spotify — native tools + bundled skill + wizard** — 7 tools (play, search, queue, playlists, devices) behind PKCE OAuth, interactive setup wizard, bundled skill, surfacing in `hermes tools`, cron usage documented. ([#15121](https://github.com/NousResearch/hermes-agent/pull/15121), [#15130](https://github.com/NousResearch/hermes-agent/pull/15130), [#15154](https://github.com/NousResearch/hermes-agent/pull/15154), [#15180](https://github.com/NousResearch/hermes-agent/pull/15180))
- **Google Meet plugin** — join calls, transcribe, speak, follow up. Realtime OpenAI transport + Node bot server, full pipeline bundled as a plugin. ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364))
- **`hermes -z` one-shot mode + `hermes update --check`** — non-interactive `hermes -z <prompt>` with `--model`/`--provider`/`HERMES_INFERENCE_MODEL`. `hermes update --check` preflight. Opt-in pre-update HERMES_HOME backup. ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702), [#15704](https://github.com/NousResearch/hermes-agent/pull/15704), [#15841](https://github.com/NousResearch/hermes-agent/pull/15841), [#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Models dashboard tab + in-browser model config** — rich per-model analytics, switch main + auxiliary models from the dashboard. ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745), [#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Remote model catalog manifest** — OpenRouter + Nous Portal model catalogs are now pulled from a remote manifest so new models show up without a release. ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- **Native multimodal image routing** — images now route based on the model's actual vision capability rather than provider defaults. ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Gateway media parity** — native multi-image sending across Telegram, Discord, Slack, Mattermost, Email, and Signal; centralized audio routing with FLAC support + Telegram document fallback. ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909), [#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **TUI catches up to (and past) the classic CLI** — LaTeX rendering (@austinpickett), `/reload` .env hot-reload, pluggable busy-indicator styles (@OutThisLife, #13610), opt-in auto-resume of last session, expanded light-terminal auto-detection, session delete from `/resume` picker with `d`, modified mouse-wheel line scroll, and a `/mouse` toggle that kills ConPTY's phantom mouse injection (@kevin-ho). ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175), [#17286](https://github.com/NousResearch/hermes-agent/pull/17286), [#17150](https://github.com/NousResearch/hermes-agent/pull/17150), [#17130](https://github.com/NousResearch/hermes-agent/pull/17130), [#17113](https://github.com/NousResearch/hermes-agent/pull/17113), [#17668](https://github.com/NousResearch/hermes-agent/pull/17668), [#17669](https://github.com/NousResearch/hermes-agent/pull/17669), [#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- **Observability + achievements plugins** — bundled Langfuse observability plugin (salvage #16845) + bundled hermes-achievements plugin that scans full session history. ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917), [#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **TTS provider registry + Piper local TTS** — pluggable `tts.providers.<name>` registry; Piper ships as a native local TTS provider. (Closes #8508.) ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843), [#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Vercel Sandbox backend** — Vercel sandboxes as an execute_code/terminal backend (@kshitijk4poor). ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Secret redaction off by default** — default flipped to off. Prevents the long-standing patch-corruption incidents where fake secret-shaped substrings mangled tool outputs. Opt in via `redaction.enabled: true` when you need it. ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **Cold-start performance** — visible TUI cold start cut **~57%** via lazy agent init (@OutThisLife), lazy imports of OpenAI / Anthropic / Firecrawl / account_usage, mtime-cached `load_config()`, memoized `get_tool_definitions()` with TTL-cached `check_fn` results, precompiled dangerous-command patterns. ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190), [#17046](https://github.com/NousResearch/hermes-agent/pull/17046), [#17041](https://github.com/NousResearch/hermes-agent/pull/17041), [#17098](https://github.com/NousResearch/hermes-agent/pull/17098), [#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Configurable prompt cache TTL** — `prompt_caching.cache_ttl` (5m default, 1h opt-in — cost savings for bursty sessions that keep cache warm). Salvage of #12659. ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
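
The mtime-cached `load_config()` mentioned under cold-start performance can be pictured roughly as follows. This is a minimal sketch, not the Hermes implementation: the function name `load_config_cached`, the module-level `_cache` dict, and the use of JSON (chosen only to stay within the standard library) are all assumptions made for illustration.

```python
import json
import os

# Hypothetical cache: path -> (mtime in nanoseconds, parsed config).
_cache = {}


def load_config_cached(path):
    """Re-parse the config file only when its modification time changes."""
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        # File unchanged since the last parse: reuse the parsed object.
        return hit[1]
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    _cache[path] = (mtime, config)
    return config
```

Repeated calls cost one `stat()` instead of a full parse, which is the kind of saving that matters on a startup path. A caller that mutates the returned object would also mutate the cached copy, so a real loader might return a copy instead.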
---
## 🧠 Autonomous Curator & Self-Improvement Loop
### Curator — autonomous skill maintenance
- **`hermes curator` as a background agent** — runs on the gateway's cron ticker, 7-day cycle by default, umbrella-first prompt, inherits parent config, unbounded iterations ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277) — issue #7816)
- **Per-run reports** — `logs/curator/run.json` + `REPORT.md` per cycle ([#17307](https://github.com/NousResearch/hermes-agent/pull/17307))
- **Consolidated vs pruned classification** — archived skills split with model + heuristic ([#17941](https://github.com/NousResearch/hermes-agent/pull/17941))
- **`hermes curator status`** — ranks skills by usage, shows most-used and least-used ([#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Unified under `auxiliary.curator`** — pick the model in `hermes model`, configure from the dashboard ([#17868](https://github.com/NousResearch/hermes-agent/pull/17868))
- **Documentation** — dedicated curator feature page on the docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- Fix: seed defaults on update, create `logs/curator/` directory, defer fire import ([#17927](https://github.com/NousResearch/hermes-agent/pull/17927))
- Fix: scan nested archive subdirs in `restore_skill` (@0xDevNinja) ([#17951](https://github.com/NousResearch/hermes-agent/pull/17951))
- Fix: use actual skill activity in curator status (@y0shua1ee) ([#17953](https://github.com/NousResearch/hermes-agent/pull/17953))
- Fix: `skill_manage` refuses writes on pinned skills; pinning now blocks curator writes ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562), [#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- Fix: `bump_use()` wired into skill invocation + preload + skill_view (salvage #17782) ([#17932](https://github.com/NousResearch/hermes-agent/pull/17932))
### Self-improvement loop (background review fork)
- **Class-first skill-review prompt** — rubric-based grading rather than free-form "should this update" ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026))
- **Active-update bias** — prefers updating skills the agent just loaded, handles `references/` + `templates/` sub-files ([#17213](https://github.com/NousResearch/hermes-agent/pull/17213))
- **Fork inherits parent's live runtime** — provider, model, credentials actually propagate now ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Scoped toolsets** — review fork restricted to memory + skills (no shell, no web) ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Clean shutdown** — background review memory providers exit properly (salvage #15289) ([#16204](https://github.com/NousResearch/hermes-agent/pull/16204))
- **Clean context** — prior-history tool messages excluded from review summary (salvage #14967) ([#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
---
## 🧩 Skills Ecosystem
### Skill integrations — newly bundled or promoted
- **ComfyUI v5** — official CLI + REST + hardware-gated local install; **moved from optional to built-in** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734), [#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **TouchDesigner-MCP** — **bundled by default** ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753) — @kshitijk4poor), expanded with GLSL, post-FX, audio, geometry references ([#16624](https://github.com/NousResearch/hermes-agent/pull/16624)), 9 new reference docs ([#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @SHL0MS)
- **Humanizer** — strips AI-isms from text ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787))
- **claude-design** — HTML artifact skill with disambiguation from other design skills ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358))
- **design-md** — Google's DESIGN.md spec skill ([#14876](https://github.com/NousResearch/hermes-agent/pull/14876))
- **airtable** — salvaged skill + skill API keys wired into `.env` (#15838) ([#16291](https://github.com/NousResearch/hermes-agent/pull/16291))
- **pretext** — creative browser demos with @chenglou/pretext ([#17259](https://github.com/NousResearch/hermes-agent/pull/17259))
- **spike** + **sketch** — throwaway experiments + HTML mockups, adapted from gsd-build ([#17421](https://github.com/NousResearch/hermes-agent/pull/17421))
### Skills UX
- **Install skills from a direct HTTP(S) URL** — `hermes skills install <url>` ([#16323](https://github.com/NousResearch/hermes-agent/pull/16323))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **`hermes skills list`** shows enabled/disabled status ([#16129](https://github.com/NousResearch/hermes-agent/pull/16129))
- **`skill_manage` refuses writes on pinned skills** ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562))
- **`skill_manage` edits external_dirs skills in place** (salvage #9966) ([#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#17289](https://github.com/NousResearch/hermes-agent/pull/17289))
- Fix: inline-shell rendering in `skill_view` ([#15376](https://github.com/NousResearch/hermes-agent/pull/15376))
- Fix: exclude `.archive/` from skill index walk (salvage #17639) ([#17931](https://github.com/NousResearch/hermes-agent/pull/17931))
- Fix: dedicated docs page per bundled + optional skill ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929))
- Fix: `google-workspace` shared HERMES_HOME helper + ship deps as optional extra ([#15405](https://github.com/NousResearch/hermes-agent/pull/15405))
- Fix: auto-wrap ASCII-art code blocks in generated skill pages ([#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
- Point agent at `hermes-agent` skill + docs site for Hermes questions ([#16535](https://github.com/NousResearch/hermes-agent/pull/16535))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
#### New providers
- **GMI Cloud** — first-class API-key provider on par with Arcee/Kilocode/Xiaomi (salvage of #11955 by @isaachuangGMICLOUD) ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663))
- **Azure AI Foundry** — auto-detection, full wiring ([#15845](https://github.com/NousResearch/hermes-agent/pull/15845))
- **LM Studio** — upgraded from custom-endpoint alias to first-class provider: dedicated auth, doctor checks, reasoning transport, live `/models` (salvage of #17061 by @kshitijk4poor) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **MiniMax OAuth** — PKCE browser flow with full OAuth integration (salvage #15203) ([#17524](https://github.com/NousResearch/hermes-agent/pull/17524))
- **Tencent Tokenhub** — new provider (salvage of #16860) ([#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
#### Model catalog
- **Remote model catalog manifest** — OpenRouter + Nous Portal catalogs pulled from remote manifest so new models show up without a release ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- `openai/gpt-5.5` and `gpt-5.5-pro` added to OpenRouter + Nous Portal ([#15343](https://github.com/NousResearch/hermes-agent/pull/15343))
- `deepseek-v4-pro` and `deepseek-v4-flash` added ([#14934](https://github.com/NousResearch/hermes-agent/pull/14934))
- `qwen3.6-plus` added to Alibaba-supported models ([#16896](https://github.com/NousResearch/hermes-agent/pull/16896))
- Gemini free-tier keys blocked at setup with 429 guidance surfacing ([#15100](https://github.com/NousResearch/hermes-agent/pull/15100))
#### Model configuration
- **Configurable `prompt_caching.cache_ttl`** — 5m default, 1h opt-in (salvage #12659) ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
- `/fast` whitelist broadened to all OpenAI + Anthropic models ([#16883](https://github.com/NousResearch/hermes-agent/pull/16883))
- `auxiliary.extra_body.reasoning` translates into Codex Responses API ([#17004](https://github.com/NousResearch/hermes-agent/pull/17004))
- `hermes fallback` command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
### Agent Loop & Conversation
- **Native multimodal image routing** — based on model vision capability, not provider defaults ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Delegate `child_timeout_seconds` default bumped to 600s** ([#14809](https://github.com/NousResearch/hermes-agent/pull/14809))
- **Diagnostic dump when subagent times out with 0 API calls** ([#15105](https://github.com/NousResearch/hermes-agent/pull/15105))
- **Gateway busts cached agent on compression/context_length config edits** ([#17008](https://github.com/NousResearch/hermes-agent/pull/17008))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- `/reload-mcp` awareness — rebuild cached agents + prompt-cache cost confirmation ([#17729](https://github.com/NousResearch/hermes-agent/pull/17729))
- Fix: repair CamelCase + `_tool` suffix tool-call emissions ([#15124](https://github.com/NousResearch/hermes-agent/pull/15124))
- Fix: retry on `json.JSONDecodeError` instead of treating as local validation error ([#15107](https://github.com/NousResearch/hermes-agent/pull/15107))
- Fix: handle unescaped control chars in `tool_call.arguments` ([#15356](https://github.com/NousResearch/hermes-agent/pull/15356))
- Fix: ordering fix in `_copy_reasoning_content_for_api` — cross-provider reasoning isolation (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749))
- Fix: inject empty `reasoning_content` for DeepSeek/Kimi `tool_calls` unconditionally (@Zjianru) ([#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- Fix: persist streamed `reasoning_content` on assistant turns (#16844) ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- Fix: cancel coroutine on timeout so worker thread exits; full traceback on tool failure ([#17428](https://github.com/NousResearch/hermes-agent/pull/17428))
- Fix: isolate `get_tool_definitions` quiet_mode cache + dedup LCM injection (#17335) ([#17889](https://github.com/NousResearch/hermes-agent/pull/17889))
- Fix: serialize concurrent `hermes_tools` RPC calls from `execute_code` (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- Fix: rename `[SYSTEM:` to `[IMPORTANT:` in all user-injected markers (dodges Azure content filter) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
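
One way to tolerate unescaped control characters in `tool_call.arguments`, as the fix above describes, is Python's lenient JSON mode. The sketch below is an assumption about the general approach, not the code from the linked PR; `parse_tool_arguments` is a hypothetical helper name.

```python
import json


def parse_tool_arguments(raw):
    """Parse tool-call arguments, tolerating raw control characters in strings."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # strict=False lets literal control characters (for example a raw
        # newline or tab) appear inside JSON string values. Input that is
        # malformed in other ways still raises JSONDecodeError.
        return json.loads(raw, strict=False)
```

Trying the strict parse first keeps well-formed input on the normal path, and the lenient retry only accepts the one class of defect being targeted.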
### Compression
- **Retry summary on main model for unknown errors before giving up** ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774))
- **Notify users when configured aux model fails even if main-model fallback recovers** ([#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- `/compress` wrapped in `_busy_command` to block input during compression ([#15388](https://github.com/NousResearch/hermes-agent/pull/15388))
- Fix: reserve system + tools headroom when aux binds threshold ([#15631](https://github.com/NousResearch/hermes-agent/pull/15631))
- Fix: use text-char sum for multimodal token estimation in `_find_tail_cut_by_tokens` ([#16369](https://github.com/NousResearch/hermes-agent/pull/16369))
### Session, Memory & State
- **Trigram FTS5 index for CJK search, replace LIKE fallback** (@alt-glitch) ([#16651](https://github.com/NousResearch/hermes-agent/pull/16651))
- **Index `tool_name` + `tool_calls` in FTS5, with repair + migration** (salvages #16866) ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Checkpoints: auto-prune orphan and stale shadow repos at startup** ([#16303](https://github.com/NousResearch/hermes-agent/pull/16303))
- **Memory providers notified on mid-process session_id rotation** (#6672) ([#17409](https://github.com/NousResearch/hermes-agent/pull/17409))
- Fix: quote underscored terms in FTS5 query sanitization ([#16915](https://github.com/NousResearch/hermes-agent/pull/16915))
- Fix: resolve viking_read 500/412 on file URIs + pseudo-summary URIs (salvage #5886) ([#17869](https://github.com/NousResearch/hermes-agent/pull/17869))
- Fix: skip external-provider sync on interrupted turns ([#15395](https://github.com/NousResearch/hermes-agent/pull/15395))
- Fix: close embedded Hindsight async client cleanly (salvage #14605) ([#16209](https://github.com/NousResearch/hermes-agent/pull/16209))
- Fix: pass session transcript to `shutdown_memory_provider` on gateway + CLI (#15165) ([#16571](https://github.com/NousResearch/hermes-agent/pull/16571))
- Fix: write-origin metadata seam ([#15346](https://github.com/NousResearch/hermes-agent/pull/15346))
- Fix: preserve symlinks during atomic file writes ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- Refactor: remove `flush_memories` entirely ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
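
The "preserve symlinks during atomic file writes" fix addresses a common pitfall: replacing a path that is a symlink swaps out the link itself rather than the file it points to. The sketch below shows one generic way to avoid that, assuming a hypothetical helper named `atomic_write_text`. It is not taken from the Hermes codebase.

```python
import os
import tempfile


def atomic_write_text(path, text):
    """Atomically replace a file's contents without clobbering a symlink at `path`."""
    # Resolve symlinks first so the final rename lands on the real file
    # and the link itself stays in place.
    target = os.path.realpath(path)
    directory = os.path.dirname(target) or "."
    # The temp file lives in the target's directory so os.replace stays on
    # one filesystem, which is what makes the rename atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

Without the `realpath` step, `os.replace(tmp, path)` would turn the symlink into a regular file and leave the original target stale.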
### Auxiliary models
- Fix: surface auxiliary failures in UI (previously silent) ([#15324](https://github.com/NousResearch/hermes-agent/pull/15324))
- Fix: surface title-gen auxiliary failures instead of silently dropping ([#16371](https://github.com/NousResearch/hermes-agent/pull/16371))
- Fix: generalize unsupported-parameter detector and harden `max_tokens` retry ([#15633](https://github.com/NousResearch/hermes-agent/pull/15633))
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **Microsoft Teams (19th platform)** — shipped as a plugin, plus an xdist collision guard ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Yuanbao (Tencent 元宝, 18th platform)** — native adapter with text + media delivery ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424), [#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
### Pluggable Gateway Platforms
- **Drop-in messaging adapters** — the gateway is now a plugin host for platforms (salvage of #17664) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
### Telegram
- **Chat allowlists for groups and forums** (@web3blind) ([#15027](https://github.com/NousResearch/hermes-agent/pull/15027))
- **Send fresh finals for stale preview streams** (port openclaw#72038) ([#16261](https://github.com/NousResearch/hermes-agent/pull/16261))
- **Render markdown tables as row-group bullets + prompt hint** ([#16997](https://github.com/NousResearch/hermes-agent/pull/16997))
- Document fallback in centralized audio routing ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Discord
- **Opt-in toolsets + ID injection + tool split + Feishu wiring** (salvage #15457, #15458) ([#15610](https://github.com/NousResearch/hermes-agent/pull/15610), [#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
- Fix: coerce `limit` parameter to int before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
### Slack
- **Register every gateway command as a native slash (Discord/Telegram parity)** ([#16164](https://github.com/NousResearch/hermes-agent/pull/16164))
- **`strict_mention` config** — prevents thread auto-engagement ([#16193](https://github.com/NousResearch/hermes-agent/pull/16193))
- **`channel_skill_bindings`** — bind specific skills to specific Slack channels ([#16283](https://github.com/NousResearch/hermes-agent/pull/16283))
### Signal
- **Native formatting** — markdown → bodyRanges, reply quotes, reactions ([#17417](https://github.com/NousResearch/hermes-agent/pull/17417))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Feishu / Mattermost / Email / Signal
- All participate in **native multi-image sending** ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Gateway Core
- **Centralized audio routing + FLAC support + Telegram doc fallback** ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **Native multi-image sending** across Telegram, Discord, Slack, Mattermost, Email, Signal ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
- **Make hygiene hard message limit configurable** ([#17000](https://github.com/NousResearch/hermes-agent/pull/17000))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- **`pre_gateway_dispatch` hook** — plugins can intercept before dispatch ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` / `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- Fix: timeouts — guard `load_config()` call against runtime exceptions ([#16318](https://github.com/NousResearch/hermes-agent/pull/16318))
- Fix: support passing handler tools via registry ([#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
---
## 🔧 Tool System
### Plugin-first architecture
- **Pluggable gateway platforms** — platforms can ship as plugins ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
- **Microsoft Teams as first plugin-shipped platform** ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **`pre_gateway_dispatch` hook** ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` + `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- **`duration_ms` on `post_tool_call`** (inspired by Claude Code 2.1.119) ([#15429](https://github.com/NousResearch/hermes-agent/pull/15429))
- **Bundled plugins**: Spotify ([#15174](https://github.com/NousResearch/hermes-agent/pull/15174)), Google Meet ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364)), Langfuse observability ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917)), hermes-achievements ([#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **Page-scoped plugin slots for built-in dashboard pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
### Browser
- **CDP supervisor** — dialog detection + response + cross-origin iframe eval ([#14540](https://github.com/NousResearch/hermes-agent/pull/14540))
- **Auto-spawn local Chromium for LAN/localhost URLs** when cloud provider is configured ([#16136](https://github.com/NousResearch/hermes-agent/pull/16136))
### Execute code / Terminal
- **Vercel Sandbox backend** for `execute_code` / terminal (@kshitijk4poor) ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Collapse subagent `task_id`s to shared container** ([#16177](https://github.com/NousResearch/hermes-agent/pull/16177))
- **Docker: run container as host user** to avoid root-owned bind mounts (@benbarclay) ([#17305](https://github.com/NousResearch/hermes-agent/pull/17305))
- Fix: safely quote `~/` subpaths in wrapped `cd` commands ([#15394](https://github.com/NousResearch/hermes-agent/pull/15394))
- Fix: close file descriptor in `LocalEnvironment._update_cwd` ([#17300](https://github.com/NousResearch/hermes-agent/pull/17300))
- Fix: SSH — prevent tar from overwriting remote home dir permissions ([#17898](https://github.com/NousResearch/hermes-agent/pull/17898), [#17867](https://github.com/NousResearch/hermes-agent/pull/17867))
### Image generation
- See Provider section for updates; no new image providers this window.
### TTS / Voice
- **Pluggable TTS provider registry** under `tts.providers.<name>` ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843))
- **Piper** as native local TTS provider (closes #8508) ([#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Voice mode CLI parity in the TUI** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
- Fix: vision — use HERMES_HOME-based cache dir instead of cwd ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
### Cron
- **Honor `hermes tools` config for the cron platform** ([#14798](https://github.com/NousResearch/hermes-agent/pull/14798))
- **Per-job `workdir`** — project-aware cron runs ([#15110](https://github.com/NousResearch/hermes-agent/pull/15110))
- **`context_from` field** — chain cron job outputs ([#15606](https://github.com/NousResearch/hermes-agent/pull/15606))
- Fix: promote `croniter` to a core dependency ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
### Web search
- **Expose `limit` for `web_search`** ([#16934](https://github.com/NousResearch/hermes-agent/pull/16934))
### Maps
- Fix: include seconds in timezone UTC offset output ([#16300](https://github.com/NousResearch/hermes-agent/pull/16300))
### Approvals
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
- Perf: precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
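
The precompilation change can be pictured with a minimal sketch. The pattern lists and the `classify_command` helper below are hypothetical stand-ins, not the project's actual `DANGEROUS_PATTERNS` / `HARDLINE_PATTERNS` contents. The point is only that `re.compile` runs once at import time instead of on every approval check.

```python
import re

# Hypothetical example patterns; the real lists live in the Hermes codebase.
_HARDLINE_SOURCES = [r"\brm\s+-rf\s+/(\s|$)", r"\bmkfs(\.\w+)?\b"]
_DANGEROUS_SOURCES = [r"\brm\s+-r", r"\bgit\s+push\s+.*--force\b"]

# Compile once at import time so each check only pays for matching.
HARDLINE = [re.compile(p) for p in _HARDLINE_SOURCES]
DANGEROUS = [re.compile(p) for p in _DANGEROUS_SOURCES]


def classify_command(command: str) -> str:
    """Return 'blocked', 'needs_approval', or 'ok' for a shell command."""
    # Hardline patterns are checked first: they can never be approved.
    if any(p.search(command) for p in HARDLINE):
        return "blocked"
    if any(p.search(command) for p in DANGEROUS):
        return "needs_approval"
    return "ok"
```

Checking the hardline list before the dangerous list means an unrecoverable command is rejected outright and never reaches an approval prompt.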
### ACP
- **Advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
### API Server
- **POST `/v1/runs/{run_id}/stop`** (salvage of #15656) ([#15842](https://github.com/NousResearch/hermes-agent/pull/15842))
- **Expose run status for external UIs** (#17085) ([#17458](https://github.com/NousResearch/hermes-agent/pull/17458))
### Nix
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
- Fix: use `--rebuild` in fix-lockfiles to bypass cached FOD store paths ([#15444](https://github.com/NousResearch/hermes-agent/pull/15444))
- Fix: `extraPackages` now actually works via per-user profile ([#17047](https://github.com/NousResearch/hermes-agent/pull/17047))
- Fix: refresh web/ npm-deps hash to unblock main builds ([#17174](https://github.com/NousResearch/hermes-agent/pull/17174))
- Fix: replace magic-nix-cache with Cachix ([#17928](https://github.com/NousResearch/hermes-agent/pull/17928))
---
## 🖥️ TUI
### New features
- **LaTeX rendering** (@austinpickett) ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175))
- **`/reload` .env hot-reload** — ported from the classic CLI ([#17286](https://github.com/NousResearch/hermes-agent/pull/17286))
- **Pluggable busy-indicator styles** (@OutThisLife, #13610) ([#17150](https://github.com/NousResearch/hermes-agent/pull/17150))
- **Opt-in auto-resume of the most recent session** (@OutThisLife) ([#17130](https://github.com/NousResearch/hermes-agent/pull/17130))
- **Expanded light-terminal auto-detection** — `HERMES_TUI_THEME` + background hex (@OutThisLife) ([#17113](https://github.com/NousResearch/hermes-agent/pull/17113))
- **Delete sessions from `/resume` picker with `d`** (@OutThisLife) ([#17668](https://github.com/NousResearch/hermes-agent/pull/17668))
- **Line-by-line scroll on modified mouse wheel** (@OutThisLife) ([#17669](https://github.com/NousResearch/hermes-agent/pull/17669))
- **Delete queued message while editing with ctrl-x / cancel with esc** (@OutThisLife) ([#16707](https://github.com/NousResearch/hermes-agent/pull/16707))
- **Per-section visibility for the details accordion** (@OutThisLife) ([#14968](https://github.com/NousResearch/hermes-agent/pull/14968))
- **Voice mode CLI parity** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
- **Contextual first-touch hints ported to TUI** — `/busy`, `/verbose` ([#16054](https://github.com/NousResearch/hermes-agent/pull/16054))
- **Mini help menu on `?` in the input field** (@ethernet8023) ([#18043](https://github.com/NousResearch/hermes-agent/pull/18043))
### Fixes
- Fix: proactive mouse disable on ConPTY + `/mouse` toggle command (@kevin-ho, WSL2 ghost-mouse fix) ([#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- Fix: restore skills search RPC ([#15870](https://github.com/NousResearch/hermes-agent/pull/15870))
- Perf: cache text measurements across yoga flex re-passes ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
- Perf: stabilize long-session scrolling ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
- Perf: lazily seed virtual history heights ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
- Perf: cut visible cold start ~57% with lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
---
## 🖱️ CLI & User Experience
### New commands
- **`hermes -z <prompt>`** — non-interactive one-shot mode ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702))
- **`hermes -z` with `--model` / `--provider` / `HERMES_INFERENCE_MODEL`** ([#15704](https://github.com/NousResearch/hermes-agent/pull/15704))
- **`hermes update --check`** preflight flag ([#15841](https://github.com/NousResearch/hermes-agent/pull/15841))
- **`hermes fallback`** command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
- **`/busy`** slash command for busy input mode ([#15382](https://github.com/NousResearch/hermes-agent/pull/15382))
- **`/busy` input mode 'steer'** as a third option ([#16279](https://github.com/NousResearch/hermes-agent/pull/16279))
- **`/btw` as alias for `/background`** ([#16053](https://github.com/NousResearch/hermes-agent/pull/16053))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **Surface `/queue`, `/bg`, `/steer` in agent-running placeholder** ([#16118](https://github.com/NousResearch/hermes-agent/pull/16118))
### Setup / onboarding
- **Auto-reconfigure on existing installs** ([#15879](https://github.com/NousResearch/hermes-agent/pull/15879))
- **Contextual first-touch hints for `/busy` and `/verbose`** ([#16046](https://github.com/NousResearch/hermes-agent/pull/16046))
- **Cost-saving tips from the April 30 tip-of-the-day** ([#17841](https://github.com/NousResearch/hermes-agent/pull/17841))
- **Hyperlink startup banner title to the latest GitHub Release** ([#14945](https://github.com/NousResearch/hermes-agent/pull/14945))
### Update / backup
- **Snapshot pairing data before `git pull`** ([#16383](https://github.com/NousResearch/hermes-agent/pull/16383))
- **Auto-backup HERMES_HOME before `hermes update`** (opt-in, off by default) ([#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Exclude `checkpoints/` from backups** ([#16572](https://github.com/NousResearch/hermes-agent/pull/16572))
- **Exclude SQLite WAL/SHM/journal sidecars from backups** ([#16576](https://github.com/NousResearch/hermes-agent/pull/16576))
- **Installer FHS layout for root installs on Linux** ([#15608](https://github.com/NousResearch/hermes-agent/pull/15608))
- Fix: kill stale dashboards instead of warning ([#17832](https://github.com/NousResearch/hermes-agent/pull/17832))
- Fix: show correct update status on nix-built hermes ([#17550](https://github.com/NousResearch/hermes-agent/pull/17550))
### Slash-command housekeeping
- Refactor: drop `/provider`, `/plan` handler, and clean up slash registry ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
- Refactor: drop `persist_session` plumbing + fix broken `/btw` mid-turn bypass ([#16075](https://github.com/NousResearch/hermes-agent/pull/16075))
### OpenClaw migration (for folks coming from OpenClaw)
- **Hardened OpenClaw import** — plan-first apply, redaction, pre-migration backup ([#16911](https://github.com/NousResearch/hermes-agent/pull/16911))
- Fix: case-preserving brand rewrite + one-time `~/.openclaw` residue banner ([#16327](https://github.com/NousResearch/hermes-agent/pull/16327))
- Fix: resolve `openclaw` workspace files from `agents.defaults.workspace` ([#16879](https://github.com/NousResearch/hermes-agent/pull/16879))
- Fix: resolve model aliases against real OpenClaw catalog schema (salvage #16778) ([#16977](https://github.com/NousResearch/hermes-agent/pull/16977))
---
## 📊 Web Dashboard
- **Models tab** — rich per-model analytics ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745))
- **Configure main + auxiliary models from the Models page** ([#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Dashboard Chat tab — xterm.js + JSON-RPC sidecar** (supersedes #12710 + #13379, @OutThisLife) ([#14890](https://github.com/NousResearch/hermes-agent/pull/14890))
- **Dashboard layout refresh** (@austinpickett) ([#14899](https://github.com/NousResearch/hermes-agent/pull/14899))
- **`--stop` and `--status` flags** on the dashboard CLI ([#17840](https://github.com/NousResearch/hermes-agent/pull/17840))
- **Page-scoped plugin slots for built-in pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- Fix: replace all buttons with design-system buttons ([#17007](https://github.com/NousResearch/hermes-agent/pull/17007))
---
## ⚡ Performance
- **TUI visible cold start cut ~57%** via lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
- **Lazy-import OpenAI, Anthropic, Firecrawl, account_usage** ([#17046](https://github.com/NousResearch/hermes-agent/pull/17046))
- **mtime-cache `load_config()` and `read_raw_config()`** ([#17041](https://github.com/NousResearch/hermes-agent/pull/17041))
- **Memoize `get_tool_definitions()` + TTL-cache `check_fn` results** ([#17098](https://github.com/NousResearch/hermes-agent/pull/17098))
- **Precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS** ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Cache Ink text measurements across yoga flex re-passes** ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
- **Stabilize long-session scrolling** ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
- **Lazily seed virtual history heights** ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
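
As an illustration of the mtime-cache idea, here is a minimal sketch. It assumes a JSON file and a hypothetical `load_config_cached` helper; the real `load_config()` may parse a different format and key its cache differently.

```python
import json
import os
from typing import Any, Dict, Tuple

# Cache keyed by path: (mtime_ns, size, parsed config).
_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}


def load_config_cached(path: str) -> Dict[str, Any]:
    """Re-parse the file only when its mtime or size changes."""
    st = os.stat(path)
    key = (st.st_mtime_ns, st.st_size)
    hit = _CACHE.get(path)
    if hit is not None and hit[:2] == key:
        return hit[2]  # unchanged on disk: reuse the parsed result
    with open(path, "r", encoding="utf-8") as fh:
        data = json.load(fh)
    _CACHE[path] = (key[0], key[1], data)
    return data
```

A cache hit returns the same dict object, so callers in this sketch should treat the result as read-only or copy it before mutating.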
---
## 🔒 Security & Reliability
- **Secret redaction off by default** — stops corrupting patches / API payloads with fake-key substitutions. Opt in via `redaction.enabled: true` ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **`[SYSTEM:` → `[IMPORTANT:`** in all user-injected markers (Azure content filter dodge) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
- **Canonical `mask_secret` helper; fix status.py DIM drift** ([#17207](https://github.com/NousResearch/hermes-agent/pull/17207))
- **Sweep expired paste.rs uploads on a real timer** ([#16431](https://github.com/NousResearch/hermes-agent/pull/16431))
- **Preserve symlinks during atomic file writes** ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- **Probe `/dev/tty` by opening it, not bare existence** ([#17024](https://github.com/NousResearch/hermes-agent/pull/17024))
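
To make the symlink-preserving atomic write concrete, here is a minimal sketch of one common approach, assuming a hypothetical `atomic_write_text` helper. It is not taken from the Hermes source: it resolves the link first, then writes a temporary file next to the real target and swaps it in.

```python
import os
import tempfile


def atomic_write_text(path: str, text: str) -> None:
    """Atomically replace the file at `path`, writing through symlinks."""
    # Resolve the link first; os.replace() on the link path itself would
    # swap the symlink for a regular file.
    target = os.path.realpath(path)
    directory = os.path.dirname(target) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, target)  # atomic when tmp and target share a filesystem
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Creating the temporary file in the target's own directory keeps both paths on one filesystem, which is what lets `os.replace` act as an atomic rename.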
---
## 🐛 Notable Bug Fixes
This window includes 360 `fix:` PRs. Selected highlights from across the stack:
- **Background review fork inherits parent's live runtime** — provider/model/creds now propagate correctly ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Hindsight configurable `HINDSIGHT_TIMEOUT` env var** ([#15077](https://github.com/NousResearch/hermes-agent/pull/15077))
- **Tools: normalize numeric entries + clear stale `no_mcp` in `_save_platform_tools`** ([#15607](https://github.com/NousResearch/hermes-agent/pull/15607))
- **MCP: rewrite `definitions` refs to `$defs` in input schemas** — closes provider-side 400s
- **Azure content filter compatibility** — renamed `[SYSTEM:` markers so Azure's content filter stops flagging them ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
- **Vision cache uses HERMES_HOME instead of cwd** ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
- **FTS5 search** — tool_name + tool_calls indexing with repair + migration ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Streaming reasoning persists on assistant turns** ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- **execute_code concurrent RPC serialization** (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- **Background reviewer scoped to memory + skills toolsets** — no more accidental web/shell escapes ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Compression recovery** — retry on main before giving up; notify user when aux fails ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774), [#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- **`croniter` promoted to a core dependency** ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
- **Discord tool `limit` parameter coerced to int** before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
- **Yuanbao messaging platform entrance fix** ([#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
- **ACP advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
- **DeepSeek / Kimi reasoning content isolation** across cross-provider histories (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749), [#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- **Preserve reasoning_content replay on DeepSeek v4 + Kimi/Moonshot thinking** ([#18045](https://github.com/NousResearch/hermes-agent/pull/18045))
The vast majority of the 360 fixes landed in the streaming/compression/tool-calling paths across all providers — DeepSeek, Kimi, Moonshot, GLM, Qwen, MiniMax, Gemini, Anthropic, OpenAI — alongside TUI polish (resize, scroll, sticky-prompt) and gateway platform-specific edge cases.
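
The MCP schema fix listed above can be sketched roughly as follows. The helper names are hypothetical and this is not the project's implementation; it only shows the general shape of rewriting `#/definitions/...` references to `#/$defs/...` and moving a top-level `definitions` block.

```python
from typing import Any, Dict

_OLD = "#/definitions/"
_NEW = "#/$defs/"


def _rewrite_refs(node: Any) -> Any:
    """Return a copy of `node` with every string `$ref` pointing at `$defs`."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str) and value.startswith(_OLD):
                out[key] = _NEW + value[len(_OLD):]
            else:
                out[key] = _rewrite_refs(value)
        return out
    if isinstance(node, list):
        return [_rewrite_refs(item) for item in node]
    return node


def normalize_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite refs and move a top-level `definitions` block to `$defs`."""
    out = _rewrite_refs(schema)
    # Only the top-level keyword is renamed, so a property that happens
    # to be called "definitions" is left untouched.
    if "definitions" in out and "$defs" not in out:
        out["$defs"] = out.pop("definitions")
    return out
```

The function builds a new dict and does not mutate its input, so the original tool schema stays available for logging or comparison.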
---
## 🧪 Testing & CI
- Hermetic test parity (`scripts/run_tests.sh`) held across this window
- **Microsoft Teams xdist collision guard** — prevents worker collisions when Teams platform tests run in parallel ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- Chore: remove unused imports and dead locals (ruff F401, F841) ([#17010](https://github.com/NousResearch/hermes-agent/pull/17010))
---
## 📚 Documentation
- **Curator feature page** added to docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- **Document pin also blocking `skill_manage` writes** ([#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- **Direct-URL skill install documented** across features, reference, guide, and `hermes-agent` skill ([#16355](https://github.com/NousResearch/hermes-agent/pull/16355))
- **Hooks tutorial — build a BOOT.md startup checklist** (replaces the removed built-in hook) ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202))
- **ComfyUI docs: ask local vs cloud FIRST before hardware check** ([#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **Obliteratus skill: link YouTube video guide in SKILL.md** ([#15808](https://github.com/NousResearch/hermes-agent/pull/15808))
- Per-skill docs pages generated for bundled + optional skills; ASCII art code blocks auto-wrapped ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929), [#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
---
## ⚖️ Removed / Reverted
- **Kanban multi-profile collaboration board** — landed in #16081 and was reverted in [#16098](https://github.com/NousResearch/hermes-agent/pull/16098) while the design is reworked
- **computer-use cua-driver** — three preparatory PRs landed and were then reverted in [#16927](https://github.com/NousResearch/hermes-agent/pull/16927)
- **BOOT.md built-in hook** removed ([#17093](https://github.com/NousResearch/hermes-agent/pull/17093)); the hooks tutorial ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202)) shows how to build the same workflow yourself with a shell hook
- **`/provider` + `/plan` slash commands dropped** ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
- **`flush_memories` removed entirely** ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
---
## 👥 Contributors
### Core
- **@teknium1** (Teknium)
### Top Community Contributors (by merged PR count since v0.11.0)
- **@OutThisLife** (Brooklyn) — 52 PRs · TUI — light-terminal detection + pluggable busy styles + auto-resume + session-delete from /resume + mouse-wheel scrolling + xterm.js dashboard Chat tab + cold-start cut + accordion polish
- **@kshitijk4poor** — 12 PRs · LM Studio first-class provider (salvage), Vercel Sandbox backend, GMI Cloud salvage, bundled-by-default touchdesigner-mcp, many tool-call / reasoning fixes
- **@helix4u** — 10 PRs · MCP schema robustness, assorted stability fixes
- **@alt-glitch** — 8 PRs · trigram FTS5 CJK search, declarative Nix plugin install, matrix/feishu hints and fixes
- **@ethernet8023** — 4 PRs
- **@austinpickett** — 4 PRs · LaTeX rendering in TUI, dashboard layout refresh
- **@benbarclay** — 3 PRs · Docker run-as-host-user so bind mounts don't get root-owned
- **@vominh1919** — 2 PRs
- **@stephenschoettler** — 2 PRs
- **@kevin-ho** — ConPTY mouse-injection fix (#15488)
- **@Zjianru** — cross-provider reasoning_content isolation + DeepSeek/Kimi empty-reasoning injection (#15749, #15762)
- **@web3blind** — Telegram chat allowlists for groups and forums (#15027)
- **@SHL0MS** — 9 new TouchDesigner-MCP reference docs (#16768)
- **@0xDevNinja** — curator `restore_skill` nested-archive fix (#17951)
- **@y0shua1ee** — curator `use` activity fix (#17953)
### Also contributing
Salvaged or co-authored work from **@isaachuangGMICLOUD** (GMI Cloud), earlier upstream PRs from the original author of each salvage chain, and a long tail of one-shot fixes, documentation nudges, and skill contributions from the community.
### All Contributors (alphabetical, excluding @teknium1)
@0xbyt4, @0xharryriddle, @0xDevNinja, @0z1-ghb, @5park1e, @A-FdL-Prog, @aj-nt, @akhater, @alblez, @alexg0bot,
@alexzhu0, @AllardQuek, @alt-glitch, @amanning3390, @amanuel2, @AndreKurait, @andrewhosf, @Andy283, @andyylin,
@angel12, @AntAISecurityLab, @ash, @austinpickett, @badgerbees, @BadTechBandit, @Bartok9, @beenherebefore,
@beesrsj2500, @BeliefanX, @benbarclay, @benjaminsehl, @BlackishGreen33, @bloodcarter, @BlueBirdBack,
@briandevans, @brooklynnicholson, @bsgdigital, @buray, @bwjoke, @camaragon, @cdanis, @cgarwood82,
@charles-brooks, @chen1749144759, @chengoak, @ching-kaching, @Contentment003111, @crayfish-ai, @CruxExperts,
@cyclingwithelephants, @dandaka, @danklynn, @ddupont808, @dhabibi, @difujia, @dimitrovi, @dlkakbs,
@dontcallmejames, @EKKOLearnAI, @emozilla, @ericnicolaides, @Erosika, @ethernet8023, @exiao, @Feranmi10,
@flobo3, @foxion37, @georgeglessner, @georgex8001, @ghostmfr, @H-Ali13381, @HangGlidersRule, @harryplusplus,
@haru398801, @heathley, @hejuntt1014, @hekaru-agent, @helix4u, @Heltman, @HenkDz, @heyitsaamir, @hharry11,
@hhhonzik, @hhuang91, @HiddenPuppy, @htsh, @iamagenius00, @in-liberty420, @innocarpe, @irispillars, @iRonin,
@isaachuangGMICLOUD, @Ito-69, @j3ffffff, @jackjin1997, @jakubkrcmar, @Jason2031, @JayGwod, @jerome-benoit,
@johnncenae, @Kailigithub, @keiravoss94, @kevin-ho, @knockyai, @konsisumer, @kshitijk4poor, @kunlabs, @l0hde,
@Leihb, @leoneparise, @LeonSGP43, @liizfq, @liuhao1024, @loongzhao, @lsdsjy, @luyao618, @ma-pony, @Magaav,
@MagicRay1217, @math0r-be, @MattMaximo, @maxims-oss, @MaxyMoos, @maymuneth, @mcndjxlefnd, @memosr,
@MestreY0d4-Uninter, @mewwts, @Mirac1eSky, @MorAlekss, @mrhwick, @mrunmayee17, @mssteuer, @Nanako0129,
@nazirulhafiy, @Nerijusas, @Nicecsh, @nicoloboschi, @nightq, @ningfangbin, @octo-patch, @Octopus,
@OutThisLife, @Paperclip, @pein892, @perlowja, @prasadus92, @qike-ms, @qiyin-code, @Readon, @ReginaldasR,
@revaraver, @rfilgueiras, @rmoen, @romanornr, @rugvedS07, @rylena, @samrusani, @Sanjays2402, @sasha-id,
@Satoshi-agi, @scheidti, @scotttrinh, @season179, @SeeYangZhi, @sgaofen, @shamork, @shannonsands, @SHL0MS,
@simbam99, @Societus, @socrates1024, @Sonoyunchu, @sprmn24, @stephenschoettler, @tangyuanjc, @TechPrototyper,
@tekgnosis-net, @ThomassJonax, @tmimmanuel, @tochukwuada, @Tosko4, @Tranquil-Flow, @twozle, @txbxxx,
@UgwujaGeorge, @Versun, @vlwkaos, @voidborne-d, @vominh1919, @Wang-tianhao, @Wangshengyang2004, @web3blind,
@westers, @Wysie, @xandersbell, @xiahu88988, @XieNBi, @xinbenlv, @xnbi, @y0shua1ee, @yatesjalex, @yes999zc,
@yeyitech, @Yoimex, @YueLich, @Yukipukii1, @zhiyanliu, @zicochaos, @Zjianru, @zkl2333, @zons-zhaozhy,
@ztexydt-cqh.
Also: @Siddharth Balyan, @YuShu.
---
**Full Changelog**: [v2026.4.23...v2026.4.30](https://github.com/NousResearch/hermes-agent/compare/v2026.4.23...v2026.4.30)
+3 -136
@@ -164,8 +164,6 @@ class HermesACPAgent(acp.Agent):
"context": "Show conversation context info",
"reset": "Clear conversation history",
"compact": "Compress conversation context",
-"steer": "Inject guidance into the currently running agent turn",
-"queue": "Queue a prompt to run after the current turn finishes",
"version": "Show Hermes version",
}
@@ -195,16 +193,6 @@ class HermesACPAgent(acp.Agent):
"name": "compact",
"description": "Compress conversation context",
},
-{
-"name": "steer",
-"description": "Inject guidance into the currently running agent turn",
-"input_hint": "guidance for the active turn",
-},
-{
-"name": "queue",
-"description": "Queue a prompt to run after the current turn finishes",
-"input_hint": "prompt to run next",
-},
{
"name": "version",
"description": "Show Hermes version",
@@ -569,9 +557,6 @@ class HermesACPAgent(acp.Agent):
async def cancel(self, session_id: str, **kwargs: Any) -> None:
state = self.session_manager.get_session(session_id)
if state and state.cancel_event:
-with state.runtime_lock:
-if state.is_running and state.current_prompt_text:
-state.interrupted_prompt_text = state.current_prompt_text
state.cancel_event.set()
try:
if getattr(state, "agent", None) and hasattr(state.agent, "interrupt"):
@@ -669,39 +654,6 @@ class HermesACPAgent(acp.Agent):
if not has_content:
return PromptResponse(stop_reason="end_turn")
-# /steer on an idle session has no in-flight tool call to inject into.
-# Rewrite it so the payload runs as a normal user prompt, matching the
-# gateway's behavior (gateway/run.py ~L4898). Two sub-cases:
-# 1. Zed-interrupt salvage — a prior prompt was cancelled by the
-# client right before /steer arrived; replay it with the steer
-# text attached as explicit correction/guidance so the user's
-# in-flight work isn't lost.
-# 2. Plain idle — no prior work to salvage; just run the steer
-# payload as a regular prompt. Without this, _cmd_steer would
-# silently append to state.queued_prompts and respond with
-# "No active turn — queued for the next turn", which looks like
-# /queue even though the user never typed /queue.
-if isinstance(user_content, str) and user_text.startswith("/steer"):
-steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
-interrupted_prompt = ""
-rewrite_idle = False
-with state.runtime_lock:
-if not state.is_running and steer_text:
-if state.interrupted_prompt_text:
-interrupted_prompt = state.interrupted_prompt_text
-state.interrupted_prompt_text = ""
-else:
-rewrite_idle = True
-if interrupted_prompt:
-user_text = (
-f"{interrupted_prompt}\n\n"
-f"User correction/guidance after interrupt: {steer_text}"
-)
-user_content = user_text
-elif rewrite_idle:
-user_text = steer_text
-user_content = steer_text
# Intercept slash commands — handle locally without calling the LLM.
# Slash commands are text-only; if the client included images/resources,
# send the whole multimodal prompt to the agent instead of treating it as
@@ -714,24 +666,6 @@ class HermesACPAgent(acp.Agent):
await self._conn.session_update(session_id, update)
return PromptResponse(stop_reason="end_turn")
-# If Zed sends another regular prompt while the same ACP session is
-# still running, queue it instead of racing two AIAgent loops against
-# the same state.history. /steer and /queue are handled above and can
-# land immediately.
-with state.runtime_lock:
-if state.is_running:
-queued_text = user_text or "[Image attachment]"
-state.queued_prompts.append(queued_text)
-depth = len(state.queued_prompts)
-if self._conn:
-update = acp.update_agent_message_text(
-f"Queued for the next turn. ({depth} queued)"
-)
-await self._conn.session_update(session_id, update)
-return PromptResponse(stop_reason="end_turn")
-state.is_running = True
-state.current_prompt_text = user_text or "[Image attachment]"
logger.info("Prompt on session %s: %s", session_id, user_text[:100])
conn = self._conn
@@ -843,9 +777,6 @@ class HermesACPAgent(acp.Agent):
result = await loop.run_in_executor(_executor, ctx.run, _run_agent)
except Exception:
logger.exception("Executor error for session %s", session_id)
-with state.runtime_lock:
-state.is_running = False
-state.current_prompt_text = ""
return PromptResponse(stop_reason="end_turn")
if result.get("messages"):
@@ -871,28 +802,6 @@ class HermesACPAgent(acp.Agent):
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
-# Mark this turn idle before draining queued work so recursive prompt()
-# calls can acquire the session. Queued turns are intentionally run as
-# normal follow-up user prompts, preserving role alternation and history.
-with state.runtime_lock:
-state.is_running = False
-state.current_prompt_text = ""
-while True:
-with state.runtime_lock:
-if not state.queued_prompts:
-break
-next_prompt = state.queued_prompts.pop(0)
-if conn:
-await conn.session_update(
-session_id,
-acp.update_user_message_text(next_prompt),
-)
-await self.prompt(
-prompt=[TextContentBlock(type="text", text=next_prompt)],
-session_id=session_id,
-)
usage = None
if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
usage = Usage(
@@ -970,8 +879,6 @@ class HermesACPAgent(acp.Agent):
"context": self._cmd_context,
"reset": self._cmd_reset,
"compact": self._cmd_compact,
-"steer": self._cmd_steer,
-"queue": self._cmd_queue,
"version": self._cmd_version,
}.get(cmd)
@@ -1068,16 +975,10 @@ class HermesACPAgent(acp.Agent):
if not hasattr(agent, "_compress_context"):
return "Context compression not available for this agent."
-from agent.model_metadata import estimate_request_tokens_rough
+from agent.model_metadata import estimate_messages_tokens_rough
original_count = len(state.history)
-# Include system prompt + tool schemas so the figure reflects real
-# request pressure, not a transcript-only underestimate (#6217).
-_sys_prompt = getattr(agent, "_cached_system_prompt", "") or ""
-_tools = getattr(agent, "tools", None) or None
-approx_tokens = estimate_request_tokens_rough(
-state.history, system_prompt=_sys_prompt, tools=_tools
-)
+approx_tokens = estimate_messages_tokens_rough(state.history)
original_session_db = getattr(agent, "_session_db", None)
try:
@@ -1097,13 +998,7 @@ class HermesACPAgent(acp.Agent):
self.session_manager.save_session(state.session_id)
new_count = len(state.history)
-_sys_prompt_after = getattr(agent, "_cached_system_prompt", "") or _sys_prompt
-_tools_after = getattr(agent, "tools", None) or _tools
-new_tokens = estimate_request_tokens_rough(
-state.history,
-system_prompt=_sys_prompt_after,
-tools=_tools_after,
-)
+new_tokens = estimate_messages_tokens_rough(state.history)
return (
f"Context compressed: {original_count} -> {new_count} messages\n"
f"~{approx_tokens:,} -> ~{new_tokens:,} tokens"
@@ -1111,34 +1006,6 @@ class HermesACPAgent(acp.Agent):
except Exception as e:
return f"Compression failed: {e}"
-def _cmd_steer(self, args: str, state: SessionState) -> str:
-steer_text = args.strip()
-if not steer_text:
-return "Usage: /steer <guidance>"
-if state.is_running and hasattr(state.agent, "steer"):
-try:
-if state.agent.steer(steer_text):
-preview = steer_text[:80] + ("..." if len(steer_text) > 80 else "")
-return f"⏩ Steer queued for the active turn: {preview}"
-except Exception as exc:
-logger.warning("ACP steer failed for session %s: %s", state.session_id, exc)
-return f"⚠️ Steer failed: {exc}"
-with state.runtime_lock:
-state.queued_prompts.append(steer_text)
-depth = len(state.queued_prompts)
-return f"No active turn — queued for the next turn. ({depth} queued)"
-def _cmd_queue(self, args: str, state: SessionState) -> str:
-queued_text = args.strip()
-if not queued_text:
-return "Usage: /queue <prompt>"
-with state.runtime_lock:
-state.queued_prompts.append(queued_text)
-depth = len(state.queued_prompts)
-return f"Queued for the next turn. ({depth} queued)"
def _cmd_version(self, args: str, state: SessionState) -> str:
return f"Hermes Agent v{HERMES_VERSION}"
+7 -46
@@ -26,33 +26,6 @@ from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
-def _win_path_to_wsl(path: str) -> str | None:
-"""Convert a Windows drive path to its WSL /mnt/<drive>/... equivalent."""
-match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
-if not match:
-return None
-drive = match.group(1).lower()
-tail = match.group(2).replace("\\", "/")
-return f"/mnt/{drive}/{tail}"
-def _translate_acp_cwd(cwd: str) -> str:
-"""Translate Windows ACP cwd values when Hermes itself is running in WSL.
-Windows ACP clients can launch ``hermes acp`` inside WSL while still sending
-editor workspaces as Windows drive paths such as ``E:\\Projects``. Store
-and execute against the WSL mount path so agents, tools, and persisted ACP
-sessions all agree on the usable workspace. Native Linux/macOS keeps the
-original cwd unchanged.
-"""
-from hermes_constants import is_wsl
-if not is_wsl():
-return cwd
-translated = _win_path_to_wsl(str(cwd))
-return translated if translated is not None else cwd
def _normalize_cwd_for_compare(cwd: str | None) -> str:
raw = str(cwd or ".").strip()
if not raw:
@@ -61,9 +34,11 @@ def _normalize_cwd_for_compare(cwd: str | None) -> str:
# Normalize Windows drive paths into the equivalent WSL mount form so
# ACP history filters match the same workspace across Windows and WSL.
translated = _win_path_to_wsl(expanded)
if translated is not None:
expanded = translated
match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
if match:
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
expanded = f"/mnt/{drive}/{tail}"
elif re.match(r"^/mnt/[A-Za-z]/", expanded):
expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"
@@ -121,18 +96,12 @@ def _acp_stderr_print(*args, **kwargs) -> None:
def _register_task_cwd(task_id: str, cwd: str) -> None:
"""Bind a task/session id to the editor's working directory for tools.
Zed can launch Hermes from a Windows workspace while the ACP process runs
inside WSL. In that case ACP sends cwd as e.g. ``E:\\Projects\\POTI``;
local tools need the WSL mount equivalent or subprocess creation fails
before the command can run.
"""
"""Bind a task/session id to the editor's working directory for tools."""
if not task_id:
return
try:
from tools.terminal_tool import register_task_env_overrides
register_task_env_overrides(task_id, {"cwd": _translate_acp_cwd(cwd)})
register_task_env_overrides(task_id, {"cwd": cwd})
except Exception:
logger.debug("Failed to register ACP task cwd override", exc_info=True)
@@ -176,11 +145,6 @@ class SessionState:
model: str = ""
history: List[Dict[str, Any]] = field(default_factory=list)
cancel_event: Any = None # threading.Event
is_running: bool = False
queued_prompts: List[str] = field(default_factory=list)
runtime_lock: Any = field(default_factory=Lock)
current_prompt_text: str = ""
interrupted_prompt_text: str = ""
class SessionManager:
@@ -211,7 +175,6 @@ class SessionManager:
"""Create a new session with a unique ID and a fresh AIAgent."""
import threading
cwd = _translate_acp_cwd(cwd)
session_id = str(uuid.uuid4())
agent = self._make_agent(session_id=session_id, cwd=cwd)
state = SessionState(
@@ -254,7 +217,6 @@ class SessionManager:
"""Deep-copy a session's history into a new session."""
import threading
cwd = _translate_acp_cwd(cwd)
original = self.get_session(session_id) # checks DB too
if original is None:
return None
@@ -356,7 +318,6 @@ class SessionManager:
def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
"""Update the working directory for a session and its tool overrides."""
cwd = _translate_acp_cwd(cwd)
state = self.get_session(session_id) # checks DB too
if state is None:
return None
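The hunks above revolve around translating a Windows drive path into its WSL mount form. A minimal standalone sketch of that translation follows. The names `win_path_to_wsl` and `translate_cwd` and the `running_in_wsl` flag are illustrative stand-ins (the diff uses `_win_path_to_wsl`, `_translate_acp_cwd`, and `is_wsl()` from `hermes_constants`); only the regex and the string handling are taken from the diff.

```python
import re
from typing import Optional

# Same pattern as in the diff: a drive letter, a colon, then a slash or backslash.
_DRIVE_RE = re.compile(r"^([A-Za-z]):[\\/](.*)$")


def win_path_to_wsl(path: str) -> Optional[str]:
    """Return the /mnt/<drive>/... form of a Windows drive path, else None."""
    match = _DRIVE_RE.match(path)
    if not match:
        return None
    drive = match.group(1).lower()
    tail = match.group(2).replace("\\", "/")
    return f"/mnt/{drive}/{tail}"


def translate_cwd(cwd: str, running_in_wsl: bool) -> str:
    """Translate only when running inside WSL; otherwise keep cwd unchanged.

    `running_in_wsl` is a hypothetical parameter standing in for the
    is_wsl() check used in the diff.
    """
    if not running_in_wsl:
        return cwd
    translated = win_path_to_wsl(cwd)
    return translated if translated is not None else cwd
```

Paths that do not look like a drive path (for example an existing POSIX path) pass through untouched, which is why the helper returns `None` rather than raising.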
+1 -29
View File
@@ -1977,12 +1977,6 @@ def resolve_provider_client(
(client, resolved_model) or (None, None) if auth is unavailable.
"""
_validate_proxy_env_urls()
# Preserve the original provider name before alias normalization so a
# user-declared ``custom_providers`` entry whose name coincidentally
# matches a built-in alias (e.g. user names their custom provider "kimi"
# which aliases to "kimi-coding") is still reachable via the named-custom
# branch below.
original_provider = (provider or "").strip().lower()
# Normalise aliases
provider = _normalize_aux_provider(provider)
@@ -2169,18 +2163,7 @@ def resolve_provider_client(
# ── Named custom providers (config.yaml providers dict / custom_providers list) ───
try:
from hermes_cli.runtime_provider import _get_named_custom_provider
# When the raw requested name is an alias (``kimi`` → ``kimi-coding``)
# and the user defined a ``custom_providers`` entry under that alias
# name, the custom entry is the intended target — the built-in alias
# rewriting would otherwise hijack the request. Only preferred when
# the raw name is an alias (not a canonical provider name) so custom
# entries that coincidentally match a canonical provider (e.g. ``nous``)
# still defer to the built-in per `_get_named_custom_provider`'s guard.
custom_entry = None
if original_provider and original_provider != provider:
custom_entry = _get_named_custom_provider(original_provider)
if custom_entry is None:
custom_entry = _get_named_custom_provider(provider)
custom_entry = _get_named_custom_provider(provider)
if custom_entry:
custom_base = custom_entry.get("base_url", "").strip()
custom_key = custom_entry.get("api_key", "").strip()
@@ -2290,12 +2273,6 @@ def resolve_provider_client(
creds = resolve_api_key_provider_credentials(provider)
api_key = str(creds.get("api_key", "")).strip()
# Honour an explicit api_key override (e.g. from a fallback_model entry
# or a custom_providers entry) so callers that pass an explicit
# credential can authenticate against endpoints where no built-in
# credential is registered for this provider alias.
if explicit_api_key:
api_key = explicit_api_key.strip() or api_key
if not api_key:
tried_sources = list(pconfig.api_key_env_vars)
if provider == "copilot":
@@ -2307,11 +2284,6 @@ def resolve_provider_client(
raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
# Honour an explicit base_url override from the caller — used when a
# fallback_model entry (or custom_providers lookup) routes through a
# built-in provider name but targets a user-specified endpoint.
if explicit_base_url:
base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
final_model = _normalize_resolved_model(model or default_model, provider)
+3 -3
View File
@@ -538,7 +538,7 @@ class ContextCompressor(ContextEngine):
# Token-budget approach: walk backward accumulating tokens
accumulated = 0
boundary = len(result)
min_protect = min(protect_tail_count, len(result))
min_protect = min(protect_tail_count, len(result) - 1)
for i in range(len(result) - 1, -1, -1):
msg = result[i]
raw_content = msg.get("content") or ""
@@ -992,8 +992,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
def _get_tool_call_id(tc) -> str:
"""Extract the call ID from a tool_call entry (dict or SimpleNamespace)."""
if isinstance(tc, dict):
return tc.get("call_id", "") or tc.get("id", "") or ""
return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""
return tc.get("id", "")
return getattr(tc, "id", "") or ""
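The hunk above shows two variants of the ID lookup: one reads only `id`, the other prefers `call_id` and falls back to `id`. The sketch below reproduces the longer variant as a free function so the fallback order can be tried in isolation. The name `get_tool_call_id` is illustrative, and this is not a claim about which variant the commit keeps.

```python
from types import SimpleNamespace
from typing import Any


def get_tool_call_id(tc: Any) -> str:
    """Extract a call ID from a dict or an attribute-style object.

    Prefers "call_id", then "id"; always returns a string.
    """
    if isinstance(tc, dict):
        return tc.get("call_id", "") or tc.get("id", "") or ""
    return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""


# Usage: both shapes resolve to the same kind of value.
dict_style = get_tool_call_id({"id": "call_1"})
object_style = get_tool_call_id(SimpleNamespace(call_id="fc_2", id="call_2"))
```

The trailing `or ""` matters when a key is present but set to `None`: the function still returns an empty string instead of `None`.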
def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Fix orphaned tool_call / tool_result pairs after compression.
+11 -188
View File
@@ -55,7 +55,6 @@ def _default_state() -> Dict[str, Any]:
"last_run_at": None,
"last_run_duration_seconds": None,
"last_run_summary": None,
"last_report_path": None,
"paused": False,
"run_count": 0,
}
@@ -184,16 +183,7 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
Gates:
- curator.enabled == True
- not paused
- last_run_at present AND older than interval_hours
First-run behavior: when there is no ``last_run_at`` (fresh install, or
install that predates the curator), we DO NOT run immediately. The
curator is designed to run after at least ``interval_hours`` (7 days by
default) of skill activity, not on the first background tick after
``hermes update``. On first observation we seed ``last_run_at`` to "now"
and defer the first real pass by one full interval. Users who want to
run it sooner can always invoke ``hermes curator run`` (with or without
``--dry-run``) explicitly — that path bypasses this gate.
- last_run_at missing, OR older than interval_hours
The idle check (min_idle_hours) is applied at the call site where we know
whether an agent is actively running — here we only enforce the static
@@ -207,21 +197,7 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
state = load_state()
last = _parse_iso(state.get("last_run_at"))
if last is None:
# Never run before. Seed state so we wait a full interval before the
# first real pass. Report-only; do not auto-mutate the library the
# very first time a gateway ticks after an update.
if now is None:
now = datetime.now(timezone.utc)
try:
state["last_run_at"] = now.isoformat()
state["last_run_summary"] = (
"deferred first run — curator seeded, will run after one "
"interval; use `hermes curator run --dry-run` to preview now"
)
save_state(state)
except Exception as e: # pragma: no cover — best-effort persistence
logger.debug("Failed to seed curator last_run_at: %s", e)
return False
return True
if now is None:
now = datetime.now(timezone.utc)
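The two first-run behaviours described in this hunk (seed `last_run_at` and defer, versus run immediately when it is missing) can be compared with a small pure function. This is a sketch under assumptions: `should_run`, the in-memory `state` dict, and the `defer_first_run` switch are hypothetical, and the real function also checks `curator.enabled` and persists state through `load_state()` / `save_state()`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def should_run(
    state: Dict[str, Any],
    interval_hours: float,
    now: Optional[datetime] = None,
    defer_first_run: bool = True,
) -> bool:
    """Decide whether a scheduled pass is due, based on last_run_at."""
    if now is None:
        now = datetime.now(timezone.utc)
    if state.get("paused"):
        return False
    raw = state.get("last_run_at")
    if raw is None:
        if defer_first_run:
            # Seed the timestamp so the first real pass waits one interval.
            state["last_run_at"] = now.isoformat()
            return False
        # Alternative behaviour: a missing timestamp means "run now".
        return True
    last = datetime.fromisoformat(raw)
    return now - last > timedelta(hours=interval_hours)
```

With deferral on, the first call returns `False` and records a timestamp; a later call more than `interval_hours` afterwards returns `True`.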
@@ -282,33 +258,6 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
# Review prompt for the forked agent
# ---------------------------------------------------------------------------
CURATOR_DRY_RUN_BANNER = (
"═══════════════════════════════════════════════════════════════\n"
"DRY-RUN — REPORT ONLY. DO NOT MUTATE THE SKILL LIBRARY.\n"
"═══════════════════════════════════════════════════════════════\n"
"\n"
"This is a PREVIEW pass. Follow every instruction below EXCEPT:\n"
"\n"
" • DO NOT call skill_manage with action=patch, create, delete, "
"write_file, or remove_file.\n"
" • DO NOT call terminal to mv skill directories into .archive/.\n"
" • DO NOT call terminal to mv, cp, rm, or rewrite any file under "
"~/.hermes/skills/.\n"
" • skills_list and skill_view are FINE — read as much as you need.\n"
"\n"
"Your output IS the deliverable. Produce the exact same "
"human-readable summary and structured YAML block you would "
"produce on a live run — but describe the actions you WOULD take, "
"not actions you took. A downstream reviewer will read the report "
"and decide whether to approve a live run with "
"`hermes curator run` (no flag).\n"
"\n"
"If you accidentally take a mutating action, say so explicitly in "
"the summary so the reviewer can revert it.\n"
"═══════════════════════════════════════════════════════════════"
)
CURATOR_REVIEW_PROMPT = (
"You are running as Hermes' background skill CURATOR. This is an "
"UMBRELLA-BUILDING consolidation pass, not a passive audit and not a "
@@ -817,39 +766,6 @@ def _write_run_report(
consolidated = classification["consolidated"]
pruned = classification["pruned"]
# Rewrite cron job skill references. When the curator consolidates
# skill X into umbrella Y, any cron job that lists X fails to load
# it at run time — the scheduler skips it and the job runs without
# the instructions it was scheduled to follow. Rewriting the
# references in-place keeps scheduled jobs working across
# consolidation passes. Best-effort: never let a cron-module issue
# break the curator.
cron_rewrites: Dict[str, Any] = {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
try:
consolidated_map = {
e["name"]: e["into"]
for e in consolidated
if isinstance(e, dict) and e.get("name") and e.get("into")
}
pruned_names = [
e["name"] for e in pruned
if isinstance(e, dict) and e.get("name")
]
if consolidated_map or pruned_names:
from cron.jobs import rewrite_skill_refs as _rewrite_cron_refs
cron_rewrites = _rewrite_cron_refs(
consolidated=consolidated_map,
pruned=pruned_names,
)
except Exception as e:
logger.debug("Curator cron skill rewrite failed: %s", e, exc_info=True)
cron_rewrites = {
"rewrites": [],
"jobs_updated": 0,
"jobs_scanned": 0,
"error": str(e),
}
payload = {
"started_at": started_at.isoformat(),
"duration_seconds": round(elapsed_seconds, 2),
@@ -865,7 +781,6 @@ def _write_run_report(
"consolidated_this_run": len(consolidated),
"pruned_this_run": len(pruned),
"state_transitions": len(transitions),
"cron_jobs_rewritten": int(cron_rewrites.get("jobs_updated", 0)),
"tool_calls_total": sum(tc_counts.values()),
},
"tool_call_counts": tc_counts,
@@ -875,7 +790,6 @@ def _write_run_report(
"pruned_names": [p["name"] for p in pruned],
"added": added,
"state_transitions": transitions,
"cron_rewrites": cron_rewrites,
"llm_final": llm_meta.get("final", ""),
"llm_summary": llm_meta.get("summary", ""),
"llm_error": llm_meta.get("error"),
@@ -898,17 +812,6 @@ def _write_run_report(
except Exception as e:
logger.debug("Curator REPORT.md write failed: %s", e)
# cron_rewrites.json — only when at least one job was touched, to
# keep run dirs uncluttered for the common no-op case.
try:
if int(cron_rewrites.get("jobs_updated", 0)) > 0:
(run_dir / "cron_rewrites.json").write_text(
json.dumps(cron_rewrites, indent=2, ensure_ascii=False) + "\n",
encoding="utf-8",
)
except Exception as e:
logger.debug("Curator cron_rewrites.json write failed: %s", e)
return run_dir
@@ -1039,39 +942,6 @@ def _render_report_markdown(p: Dict[str, Any]) -> str:
lines.append(f"- `{t.get('name')}`: {t.get('from')} → {t.get('to')}")
lines.append("")
# Cron job rewrites — show which scheduled jobs had their skill
# references updated so users can audit that the auto-rewrite did
# the right thing. Only present when at least one job changed.
cron_rw = p.get("cron_rewrites") or {}
cron_rewrites_list = cron_rw.get("rewrites") or []
if cron_rewrites_list:
lines.append(f"### Cron job skill references rewritten ({len(cron_rewrites_list)})\n")
lines.append(
"_Cron jobs that referenced a consolidated or pruned skill were "
"updated in-place so they keep loading the right instructions "
"on their next run. See `cron_rewrites.json` for the full record._\n"
)
SHOW = 25
for entry in cron_rewrites_list[:SHOW]:
job_name = entry.get("job_name") or entry.get("job_id") or "?"
before = entry.get("before") or []
after = entry.get("after") or []
mapped = entry.get("mapped") or {}
dropped = entry.get("dropped") or []
lines.append(
f"- `{job_name}`: `{', '.join(before)}` → `{', '.join(after) or '(none)'}`"
)
for old, new in mapped.items():
lines.append(f" - `{old}` → `{new}` (consolidated)")
for name in dropped:
lines.append(f" - `{name}` dropped (pruned)")
if len(cron_rewrites_list) > SHOW:
lines.append(
f"- … and {len(cron_rewrites_list) - SHOW} more "
"(see `cron_rewrites.json`)"
)
lines.append("")
# Full LLM final response
final = (p.get("llm_final") or "").strip()
if final:
@@ -1122,7 +992,6 @@ def _render_candidate_list() -> str:
def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
dry_run: bool = False,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
@@ -1135,43 +1004,9 @@ def run_curator_review(
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
and the LLM review pass is instructed to produce a report only — no
skill_manage mutations, no terminal archive moves. The REPORT.md still
gets written and ``state.last_report_path`` still records it so users
can read what the curator WOULD have done.
"""
start = datetime.now(timezone.utc)
if dry_run:
# Count candidates without mutating state.
try:
report = skill_usage.agent_created_report()
counts = {
"checked": len(report),
"marked_stale": 0,
"archived": 0,
"reactivated": 0,
}
except Exception:
counts = {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
else:
# Pre-mutation snapshot — best-effort, never blocks the run. A
# failed snapshot logs at debug and continues (the alternative is
# that a transient disk issue silently disables curator forever,
# which is worse). Users who want to require snapshots can disable
# curator entirely until they can fix disk space.
try:
from agent import curator_backup
snap = curator_backup.snapshot_skills(reason="pre-curator-run")
if snap is not None and on_summary:
try:
on_summary(f"curator: snapshot created ({snap.name})")
except Exception:
pass
except Exception as e:
logger.debug("Curator pre-run snapshot failed: %s", e, exc_info=True)
counts = apply_automatic_transitions(now=start)
counts = apply_automatic_transitions(now=start)
auto_summary_parts = []
if counts["marked_stale"]:
@@ -1183,16 +1018,11 @@ def run_curator_review(
auto_summary = ", ".join(auto_summary_parts) if auto_summary_parts else "no changes"
# Persist state before the LLM pass so a crash mid-review still records
# the run and doesn't immediately re-trigger. In dry-run we do NOT bump
# last_run_at or run_count — a preview shouldn't push the next scheduled
# real pass out. We still record a summary so `hermes curator status`
# shows that a preview ran.
# the run and doesn't immediately re-trigger.
state = load_state()
if not dry_run:
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
prefix = "dry-run auto: " if dry_run else "auto: "
state["last_run_summary"] = f"{prefix}{auto_summary}"
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
state["last_run_summary"] = f"auto: {auto_summary}"
save_state(state)
def _llm_pass():
@@ -1208,7 +1038,7 @@ def run_curator_review(
try:
candidate_list = _render_candidate_list()
if "No agent-created skills" in candidate_list:
final_summary = f"{prefix}{auto_summary}; llm: skipped (no candidates)"
final_summary = f"auto: {auto_summary}; llm: skipped (no candidates)"
llm_meta = {
"final": "",
"summary": "skipped (no candidates)",
@@ -1218,21 +1048,14 @@ def run_curator_review(
"error": None,
}
else:
if dry_run:
prompt = (
f"{CURATOR_DRY_RUN_BANNER}\n\n"
f"{CURATOR_REVIEW_PROMPT}\n\n"
f"{candidate_list}"
)
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
llm_meta = _run_llm_review(prompt)
final_summary = (
f"{prefix}{auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
f"auto: {auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
)
except Exception as e:
logger.debug("Curator LLM pass failed: %s", e, exc_info=True)
final_summary = f"{prefix}{auto_summary}; llm: error ({e})"
final_summary = f"auto: {auto_summary}; llm: error ({e})"
llm_meta = {
"final": "",
"summary": f"error ({e})",
-440
View File
@@ -1,440 +0,0 @@
"""Curator snapshot + rollback.
A pre-run snapshot of ``~/.hermes/skills/`` (excluding ``.curator_backups/``
itself) is taken before any mutating curator pass. Snapshots are tar.gz
files under ``~/.hermes/skills/.curator_backups/<utc-iso>/`` with a
companion ``manifest.json`` describing the snapshot (reason, time, size,
counted skill files). Rollback picks a snapshot, moves the current
``skills/`` tree aside into another snapshot so even the rollback itself
is undoable, then extracts the chosen snapshot into place.
The snapshot does NOT include:
- ``.curator_backups/`` (would recurse)
- ``.hub/`` (hub-installed skills — managed by the hub, not us)
It DOES include:
- all SKILL.md files + their directories (``scripts/``, ``references/``,
``templates/``, ``assets/``)
- ``.usage.json`` (usage telemetry — needed to rehydrate state cleanly)
- ``.archive/`` (so rollback restores previously-archived skills too)
- ``.curator_state`` (so rolling back also restores the last-run-at
pointer — otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
"""
from __future__ import annotations
import json
import logging
import os
import re
import shutil
import tarfile
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
DEFAULT_KEEP = 5
# Entries under skills/ that should NEVER be rolled up into a snapshot.
# .hub/ is managed by the skills hub; rolling it back would break lockfile
# invariants. .curator_backups is the backup dir itself — recursion bomb.
_EXCLUDE_TOP_LEVEL = {".curator_backups", ".hub"}
# Snapshot id regex: UTC ISO with colons replaced by dashes so the filename
# is portable (Windows-safe). An optional ``-NN`` suffix handles two
# snapshots landing in the same wallclock second.
_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")
def _backups_dir() -> Path:
return get_hermes_home() / "skills" / ".curator_backups"
def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
now = datetime.now(timezone.utc)
# isoformat → "2026-05-01T13:05:42.123456+00:00"; strip subseconds and tz.
s = now.replace(microsecond=0).isoformat()
if s.endswith("+00:00"):
s = s[:-6]
return s.replace(":", "-") + "Z"
def _load_config() -> Dict[str, Any]:
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e:
logger.debug("Failed to load config for curator backup: %s", e)
return {}
if not isinstance(cfg, dict):
return {}
cur = cfg.get("curator") or {}
if not isinstance(cur, dict):
return {}
bk = cur.get("backup") or {}
return bk if isinstance(bk, dict) else {}
def is_enabled() -> bool:
"""Default ON — the whole point of the backup is safety by default."""
return bool(_load_config().get("enabled", True))
def get_keep() -> int:
cfg = _load_config()
try:
n = int(cfg.get("keep", DEFAULT_KEEP))
except (TypeError, ValueError):
n = DEFAULT_KEEP
return max(1, n)
# ---------------------------------------------------------------------------
# Snapshot
# ---------------------------------------------------------------------------
def _count_skill_files(base: Path) -> int:
try:
return sum(1 for _ in base.rglob("SKILL.md"))
except OSError:
return 0
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int) -> None:
manifest = {
"id": dest.name,
"reason": reason,
"created_at": datetime.now(timezone.utc).isoformat(),
"archive": archive_path.name,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
def snapshot_skills(reason: str = "manual") -> Optional[Path]:
"""Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
Returns the snapshot directory path, or ``None`` if the snapshot was
skipped (backup disabled, skills dir missing, or an IO error occurred —
in which case we log at debug and return None so the curator never
aborts a pass because of a backup failure).
"""
if not is_enabled():
logger.debug("Curator backup disabled by config; skipping snapshot")
return None
skills = _skills_dir()
if not skills.exists():
logger.debug("No ~/.hermes/skills/ directory — nothing to back up")
return None
backups = _backups_dir()
try:
backups.mkdir(parents=True, exist_ok=True)
except OSError as e:
logger.debug("Failed to create backups dir %s: %s", backups, e)
return None
# Uniquify: if a snapshot with the same second already exists (can
# happen if two curator runs fire in the same second), append a short
# counter. Avoids clobbering and avoids timestamp collisions.
base_id = _utc_id()
snap_id = base_id
counter = 1
while (backups / snap_id).exists():
snap_id = f"{base_id}-{counter:02d}"
counter += 1
dest = backups / snap_id
try:
dest.mkdir(parents=True, exist_ok=False)
except OSError as e:
logger.debug("Failed to create snapshot dir %s: %s", dest, e)
return None
archive = dest / "skills.tar.gz"
try:
# Stream into the tarball — no tempdir copy needed.
with tarfile.open(archive, "w:gz", compresslevel=6) as tf:
for entry in sorted(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
_write_manifest(dest, reason, archive, _count_skill_files(skills))
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
try:
shutil.rmtree(dest, ignore_errors=True)
except OSError:
pass
return None
_prune_old(keep=get_keep())
logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
return dest
def _prune_old(keep: int) -> List[str]:
"""Delete regular snapshots beyond the newest *keep*. Returns deleted
ids. Staging dirs (``.rollback-staging-*``) are implementation detail
and pruned independently on every call."""
backups = _backups_dir()
if not backups.exists():
return []
entries: List[Tuple[str, Path]] = []
stale_staging: List[Path] = []
for child in backups.iterdir():
if not child.is_dir():
continue
if child.name.startswith(".rollback-staging-"):
# Staging dirs are only supposed to exist briefly during a
# rollback. If we find one here (e.g. from a crashed rollback),
# clean it up opportunistically.
stale_staging.append(child)
continue
if _ID_RE.match(child.name):
entries.append((child.name, child))
# Newest first (lexicographic works because the id is UTC ISO).
entries.sort(key=lambda t: t[0], reverse=True)
deleted: List[str] = []
for _, path in entries[keep:]:
try:
shutil.rmtree(path)
deleted.append(path.name)
except OSError as e:
logger.debug("Failed to prune %s: %s", path, e)
for path in stale_staging:
try:
shutil.rmtree(path)
except OSError as e:
logger.debug("Failed to clean stale staging dir %s: %s", path, e)
return deleted
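Pruning relies on the snapshot id sorting lexicographically in time order. The sketch below rebuilds the id format and the keep/drop split on plain strings so the ordering can be checked without touching the filesystem. The names `utc_id` and `split_keep` are illustrative; the regex and the timestamp formatting follow the diff.

```python
import re
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Tuple

# Same shape as _ID_RE in the diff: UTC timestamp, optional -NN suffix.
ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")


def utc_id(now: Optional[datetime] = None) -> str:
    """Filesystem-safe UTC timestamp such as 2026-05-01T13-05-42Z."""
    if now is None:
        now = datetime.now(timezone.utc)
    s = now.replace(microsecond=0).isoformat()
    if s.endswith("+00:00"):
        s = s[:-6]
    return s.replace(":", "-") + "Z"


def split_keep(ids: Iterable[str], keep: int) -> Tuple[List[str], List[str]]:
    """Return (kept, dropped) ids, newest first, ignoring non-snapshot names."""
    valid = sorted((i for i in ids if ID_RE.match(i)), reverse=True)
    return valid[:keep], valid[keep:]
```

A same-second id with a `-01` suffix sorts after the unsuffixed id, so it counts as the newer of the two when the list is reversed.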
# ---------------------------------------------------------------------------
# List + rollback
# ---------------------------------------------------------------------------
def _read_manifest(snap_dir: Path) -> Dict[str, Any]:
mf = snap_dir / "manifest.json"
if not mf.exists():
return {}
try:
return json.loads(mf.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return {}
def list_backups() -> List[Dict[str, Any]]:
"""Return all restorable snapshots, newest first. Only entries with a
real ``skills.tar.gz`` tarball are listed — transient
``.rollback-staging-*`` directories created mid-rollback are
implementation detail and not shown."""
backups = _backups_dir()
if not backups.exists():
return []
out: List[Dict[str, Any]] = []
for child in sorted(backups.iterdir(), reverse=True):
if not child.is_dir():
continue
if not _ID_RE.match(child.name):
continue
if not (child / "skills.tar.gz").exists():
continue
mf = _read_manifest(child)
mf.setdefault("id", child.name)
mf.setdefault("path", str(child))
if "archive_bytes" not in mf:
arc = child / "skills.tar.gz"
try:
mf["archive_bytes"] = arc.stat().st_size
except OSError:
mf["archive_bytes"] = 0
out.append(mf)
return out
def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
"""Return the path of the requested backup, or the newest one if
*backup_id* is None. Returns None if no match."""
backups = _backups_dir()
if not backups.exists():
return None
if backup_id:
target = backups / backup_id
if (
target.is_dir()
and _ID_RE.match(backup_id)
and (target / "skills.tar.gz").exists()
):
return target
return None
candidates = [
c for c in sorted(backups.iterdir(), reverse=True)
if c.is_dir() and _ID_RE.match(c.name) and (c / "skills.tar.gz").exists()
]
return candidates[0] if candidates else None
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
Strategy:
1. Resolve the target snapshot (explicit id or newest regular).
2. Take a safety snapshot of the CURRENT skills tree under
``.curator_backups/pre-rollback-<ts>/`` so the rollback itself is
undoable.
3. Move all current top-level entries (except ``.curator_backups``
and ``.hub``) into a tempdir.
4. Extract the chosen snapshot into ``~/.hermes/skills/``.
5. On failure during 4, move the tempdir contents back (best-effort)
and return failure.
Returns ``(ok, message, snapshot_path)``.
"""
target = _resolve_backup(backup_id)
if target is None:
return (
False,
f"no matching backup found"
+ (f" for id '{backup_id}'" if backup_id else "")
+ " (use `hermes curator rollback --list` to see available snapshots)",
None,
)
archive = target / "skills.tar.gz"
if not archive.exists():
return (False, f"snapshot {target.name} has no skills.tar.gz — corrupted?", None)
skills = _skills_dir()
skills.mkdir(parents=True, exist_ok=True)
backups = _backups_dir()
backups.mkdir(parents=True, exist_ok=True)
# Step 2: safety snapshot of current state FIRST. If this fails we bail
# out before touching anything — otherwise a failed extract could leave
# the user with no skills.
try:
snapshot_skills(reason=f"pre-rollback to {target.name}")
except Exception as e:
return (False, f"pre-rollback safety snapshot failed: {e}", None)
# Additionally move current entries into an internal staging dir so
# the extract happens into an empty skills tree (predictable result).
# This dir is implementation detail — not listed as a restorable
# backup. The safety snapshot above is the user-facing undo handle.
staged = backups / f".rollback-staging-{_utc_id()}"
try:
staged.mkdir(parents=True, exist_ok=False)
except OSError as e:
return (False, f"failed to create staging dir: {e}", None)
moved: List[Tuple[Path, Path]] = []
try:
for entry in list(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
dest = staged / entry.name
shutil.move(str(entry), str(dest))
moved.append((entry, dest))
except OSError as e:
# Best-effort rollback of the move
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"failed to stage current skills: {e}", None)
# Step 4: extract the snapshot into skills/
try:
with tarfile.open(archive, "r:gz") as tf:
# Python 3.12+ supports filter='data' for safer extraction.
# Fall back to the unfiltered call for older interpreters but
# still reject absolute paths and .. components defensively.
for member in tf.getmembers():
name = member.name
if name.startswith("/") or ".." in Path(name).parts:
raise tarfile.TarError(
f"refusing to extract unsafe path: {name!r}"
)
try:
tf.extractall(str(skills), filter="data") # type: ignore[call-arg]
except TypeError:
# Python < 3.12 — no filter kwarg
tf.extractall(str(skills))
except (OSError, tarfile.TarError) as e:
# Best-effort recover: move staged contents back
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"snapshot extract failed (state restored): {e}", None)
# Extract succeeded — the staging dir has served its purpose. The
# user's undo handle is the safety snapshot tarball we took earlier.
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
logger.info("Curator rollback: restored from %s", target.name)
return (True, f"restored from snapshot {target.name}", target)
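The extraction step above rejects archive members with absolute paths or `..` components before calling `extractall`. The sketch below isolates that check and runs it against small in-memory archives. `assert_safe_members`, `build_tar`, and `is_safe` are hypothetical helpers written for this example; only the member test itself mirrors the diff.

```python
import io
import tarfile
from pathlib import Path
from typing import Iterable


def assert_safe_members(tf: tarfile.TarFile) -> None:
    """Raise TarError if any member would escape the extraction root."""
    for member in tf.getmembers():
        name = member.name
        if name.startswith("/") or ".." in Path(name).parts:
            raise tarfile.TarError(f"refusing to extract unsafe path: {name!r}")


def build_tar(names: Iterable[str]) -> bytes:
    """Build a gzip-compressed tar in memory with one tiny file per name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name in names:
            data = b"x"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def is_safe(archive_bytes: bytes) -> bool:
    """Return True when every member passes the path check."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tf:
        try:
            assert_safe_members(tf)
        except tarfile.TarError:
            return False
    return True
```

Running the check before extraction means a bad archive is refused without writing any files, which keeps the staged copy of the current tree available for recovery.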
# ---------------------------------------------------------------------------
# Human-readable summary for CLI
# ---------------------------------------------------------------------------
def format_size(n: int) -> str:
for unit in ("B", "KB", "MB", "GB"):
if n < 1024 or unit == "GB":
return f"{n:.1f} {unit}" if unit != "B" else f"{n} B"
n /= 1024
return f"{n:.1f} GB"
def summarize_backups() -> str:
rows = list_backups()
if not rows:
return "No curator snapshots yet."
lines = [f"{'id':<24} {'reason':<40} {'skills':>6} {'size':>8}"]
lines.append("─" * len(lines[0]))
for r in rows:
lines.append(
f"{r.get('id','?'):<24} "
f"{(r.get('reason','?') or '?')[:40]:<40} "
f"{r.get('skill_files', 0):>6} "
f"{format_size(int(r.get('archive_bytes', 0))):>8}"
)
return "\n".join(lines)
+5 -5
View File
@@ -20,25 +20,25 @@ def summarize_manual_compression(
headline = f"No changes from compression: {before_count} messages"
if after_tokens == before_tokens:
token_line = (
f"Approx request size: ~{before_tokens:,} tokens (unchanged)"
f"Rough transcript estimate: ~{before_tokens:,} tokens (unchanged)"
)
else:
token_line = (
f"Approx request size: ~{before_tokens:,}"
f"Rough transcript estimate: ~{before_tokens:,}"
f" → ~{after_tokens:,} tokens"
)
else:
headline = f"Compressed: {before_count} → {after_count} messages"
token_line = (
f"Approx request size: ~{before_tokens:,}"
f"Rough transcript estimate: ~{before_tokens:,}"
f" → ~{after_tokens:,} tokens"
)
note = None
if not noop and after_count < before_count and after_tokens > before_tokens:
note = (
"Note: fewer messages can still raise this estimate when "
"compression rewrites the transcript into denser summaries."
"Note: fewer messages can still raise this rough transcript estimate "
"when compression rewrites the transcript into denser summaries."
)
return {
+4 -45
@@ -81,56 +81,15 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
return repaired
# Rule 2: when anyOf is present, type belongs only on the children.
# Additionally, Moonshot rejects null-type branches inside anyOf
# (enum value (<nil>) does not match any type in [string]).
# Collapse the anyOf to the first non-null branch and infer its type.
if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
repaired.pop("type", None)
non_null = [b for b in repaired["anyOf"]
if isinstance(b, dict) and b.get("type") != "null"]
if non_null and len(non_null) < len(repaired["anyOf"]):
# Drop the anyOf wrapper — keep only the non-null branch.
# If there's a single non-null branch, promote it and fall
# through to Rules 1/3 so nullable/enum cleanup still applies
# to the merged node.
if len(non_null) == 1:
merge = {k: v for k, v in repaired.items() if k != "anyOf"}
merge.update(non_null[0])
repaired = merge
else:
repaired["anyOf"] = non_null
return repaired
else:
# Nothing to collapse — parent type stripped, children already
# repaired by the recursive walk above.
return repaired
# Moonshot also rejects non-standard keywords like ``nullable`` on
# parameter schemas — strip it.
repaired.pop("nullable", None)
return repaired
# Rule 1: property schemas without type need one. $ref nodes are exempt
# — their type comes from the referenced definition.
# Fill missing type BEFORE Rule 3 so enum cleanup can check the type.
if "$ref" not in repaired:
repaired = _fill_missing_type(repaired)
# Rule 3: Moonshot rejects null/empty-string values inside enum arrays
# when the parent type is a scalar (string, integer, etc.). The error:
# "enum value (<nil>) does not match any type in [string]"
# Strip null and empty-string from enum values, and if the enum becomes
# empty, drop it entirely.
if "enum" in repaired and isinstance(repaired["enum"], list):
node_type = repaired.get("type")
if node_type in ("string", "integer", "number", "boolean"):
cleaned = [v for v in repaired["enum"]
if v is not None and v != ""]
if cleaned:
repaired["enum"] = cleaned
else:
repaired.pop("enum")
return repaired
if "$ref" in repaired:
return repaired
return _fill_missing_type(repaired)
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
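Reviewer sketch of the `anyOf` collapse and the enum cleanup described by Rules 2 and 3 in this hunk. It is a standalone approximation, not the repository's `_repair_schema`: both helper names are hypothetical, and the recursive walk and `_fill_missing_type` are left out.

```python
from typing import Any, Dict


def collapse_nullable_anyof(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Drop null branches from anyOf; promote a single surviving branch."""
    repaired = dict(schema)
    branches = repaired.get("anyOf")
    if not isinstance(branches, list):
        return repaired
    repaired.pop("type", None)  # with anyOf present, type belongs on the children
    non_null = [b for b in branches
                if isinstance(b, dict) and b.get("type") != "null"]
    if non_null and len(non_null) < len(branches):
        if len(non_null) == 1:
            merged = {k: v for k, v in repaired.items() if k != "anyOf"}
            merged.update(non_null[0])
            return merged
        repaired["anyOf"] = non_null
    return repaired


def clean_scalar_enum(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Strip None and empty-string enum values on scalar types; drop empty enums."""
    repaired = dict(schema)
    if isinstance(repaired.get("enum"), list) and repaired.get("type") in (
        "string", "integer", "number", "boolean"
    ):
        cleaned = [v for v in repaired["enum"] if v is not None and v != ""]
        if cleaned:
            repaired["enum"] = cleaned
        else:
            repaired.pop("enum")
    return repaired
```

For example, `{"description": "x", "anyOf": [{"type": "string"}, {"type": "null"}]}` collapses to `{"description": "x", "type": "string"}`.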
-58
@@ -182,64 +182,6 @@ SKILLS_GUIDANCE = (
"Skills that aren't maintained become liabilities."
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
"they write directly to the shared SQLite DB and work regardless of terminal "
"backend (local/docker/modal/ssh).\n"
"\n"
"## Lifecycle\n"
"\n"
"1. **Orient.** Call `kanban_show()` first (no args — it defaults to your "
"task). The response includes title, body, parent-task handoffs (summary + "
"metadata), any prior attempts on this task if you're a retry, the full "
"comment thread, and a pre-formatted `worker_context` you can treat as "
"ground truth.\n"
"2. **Work inside the workspace.** `cd $HERMES_KANBAN_WORKSPACE` before "
"any file operations. The workspace is yours for this run. Don't modify "
"files outside it unless the task explicitly asks.\n"
"3. **Heartbeat on long operations.** Call `kanban_heartbeat(note=...)` "
"every few minutes during long subprocesses (training, encoding, crawling). "
"Skip heartbeats for short tasks.\n"
"4. **Block on genuine ambiguity.** If you need a human decision you cannot "
"infer (missing credentials, UX choice, paywalled source, peer output you "
"need first), call `kanban_block(reason=\"...\")` and stop. Don't guess. "
"The user will unblock with context and the dispatcher will respawn you.\n"
"5. **Complete with structured handoff.** Call `kanban_complete(summary=..., "
"metadata=...)`. `summary` is 13 human-readable sentences naming concrete "
"artifacts. `metadata` is machine-readable facts "
"(`{changed_files: [...], tests_run: N, decisions: [...]}`). Downstream "
"workers read both via their own `kanban_show`. Never put secrets / "
"tokens / raw PII in either field — run rows are durable forever.\n"
"6. **If follow-up work appears, create it; don't do it.** Use "
"`kanban_create(title=..., assignee=<right-profile>, parents=[your-task-id])` "
"to spawn a child task for the appropriate specialist profile instead of "
"scope-creeping into the next thing.\n"
"\n"
"## Orchestrator mode\n"
"\n"
"If your task is itself a decomposition task (e.g. a planner profile given "
"a high-level goal), use `kanban_create` to fan out into child tasks — one "
"per specialist, each with an explicit `assignee` and `parents=[...]` to "
"express dependencies. Then `kanban_complete` your own task with a summary "
"of the decomposition. Do NOT execute the work yourself; your job is "
"routing, not implementation.\n"
"\n"
"## Do NOT\n"
"\n"
"- Do not shell out to `hermes kanban <verb>` for board operations. Use "
"the `kanban_*` tools — they work across all terminal backends.\n"
"- Do not complete a task you didn't actually finish. Block it.\n"
"- Do not assign follow-up work to yourself. Assign it to the right "
"specialist profile.\n"
"- Do not call `delegate_task` as a board substitute. `delegate_task` is "
"for short reasoning subtasks inside your own run; board tasks are for "
"cross-agent handoffs that outlive one API loop."
)
TOOL_USE_ENFORCEMENT_GUIDANCE = (
"# Tool-use enforcement\n"
"You MUST use your tools to take action — do not describe what you would do "
-455
@@ -1,455 +0,0 @@
"""Pure tool-call loop guardrail primitives.
The controller in this module is intentionally side-effect free: it tracks
per-turn tool-call observations and returns decisions. Runtime code owns whether
those decisions become warning guidance, synthetic tool results, or controlled
turn halts.
"""
from __future__ import annotations
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Mapping
from utils import safe_json_loads
IDEMPOTENT_TOOL_NAMES = frozenset(
{
"read_file",
"search_files",
"web_search",
"web_extract",
"session_search",
"browser_snapshot",
"browser_console",
"browser_get_images",
"mcp_filesystem_read_file",
"mcp_filesystem_read_text_file",
"mcp_filesystem_read_multiple_files",
"mcp_filesystem_list_directory",
"mcp_filesystem_list_directory_with_sizes",
"mcp_filesystem_directory_tree",
"mcp_filesystem_get_file_info",
"mcp_filesystem_search_files",
}
)
MUTATING_TOOL_NAMES = frozenset(
{
"terminal",
"execute_code",
"write_file",
"patch",
"todo",
"memory",
"skill_manage",
"browser_click",
"browser_type",
"browser_press",
"browser_scroll",
"browser_navigate",
"send_message",
"cronjob",
"delegate_task",
"process",
}
)
@dataclass(frozen=True)
class ToolCallGuardrailConfig:
"""Thresholds for per-turn tool-call loop detection.
Warnings are enabled by default and never prevent tool execution. Hard stops
are explicit opt-in so interactive CLI/TUI sessions get a gentle nudge unless
the user enables circuit-breaker behavior in config.yaml.
"""
warnings_enabled: bool = True
hard_stop_enabled: bool = False
exact_failure_warn_after: int = 2
exact_failure_block_after: int = 5
same_tool_failure_warn_after: int = 3
same_tool_failure_halt_after: int = 8
no_progress_warn_after: int = 2
no_progress_block_after: int = 5
idempotent_tools: frozenset[str] = field(default_factory=lambda: IDEMPOTENT_TOOL_NAMES)
mutating_tools: frozenset[str] = field(default_factory=lambda: MUTATING_TOOL_NAMES)
@classmethod
def from_mapping(cls, data: Mapping[str, Any] | None) -> "ToolCallGuardrailConfig":
"""Build config from the `tool_loop_guardrails` config.yaml section."""
if not isinstance(data, Mapping):
return cls()
warn_after = data.get("warn_after")
if not isinstance(warn_after, Mapping):
warn_after = {}
hard_stop_after = data.get("hard_stop_after")
if not isinstance(hard_stop_after, Mapping):
hard_stop_after = {}
defaults = cls()
return cls(
warnings_enabled=_as_bool(data.get("warnings_enabled"), defaults.warnings_enabled),
hard_stop_enabled=_as_bool(data.get("hard_stop_enabled"), defaults.hard_stop_enabled),
exact_failure_warn_after=_positive_int(
warn_after.get("exact_failure", data.get("exact_failure_warn_after")),
defaults.exact_failure_warn_after,
),
same_tool_failure_warn_after=_positive_int(
warn_after.get("same_tool_failure", data.get("same_tool_failure_warn_after")),
defaults.same_tool_failure_warn_after,
),
no_progress_warn_after=_positive_int(
warn_after.get("idempotent_no_progress", data.get("no_progress_warn_after")),
defaults.no_progress_warn_after,
),
exact_failure_block_after=_positive_int(
hard_stop_after.get("exact_failure", data.get("exact_failure_block_after")),
defaults.exact_failure_block_after,
),
same_tool_failure_halt_after=_positive_int(
hard_stop_after.get("same_tool_failure", data.get("same_tool_failure_halt_after")),
defaults.same_tool_failure_halt_after,
),
no_progress_block_after=_positive_int(
hard_stop_after.get("idempotent_no_progress", data.get("no_progress_block_after")),
defaults.no_progress_block_after,
),
)
@dataclass(frozen=True)
class ToolCallSignature:
"""Stable, non-reversible identity for a tool name plus canonical args."""
tool_name: str
args_hash: str
@classmethod
def from_call(cls, tool_name: str, args: Mapping[str, Any] | None) -> "ToolCallSignature":
canonical = canonical_tool_args(args or {})
return cls(tool_name=tool_name, args_hash=_sha256(canonical))
def to_metadata(self) -> dict[str, str]:
"""Return public metadata without raw argument values."""
return {"tool_name": self.tool_name, "args_hash": self.args_hash}
@dataclass(frozen=True)
class ToolGuardrailDecision:
"""Decision returned by the tool-call guardrail controller."""
action: str = "allow" # allow | warn | block | halt
code: str = "allow"
message: str = ""
tool_name: str = ""
count: int = 0
signature: ToolCallSignature | None = None
@property
def allows_execution(self) -> bool:
return self.action in {"allow", "warn"}
@property
def should_halt(self) -> bool:
return self.action in {"block", "halt"}
def to_metadata(self) -> dict[str, Any]:
data: dict[str, Any] = {
"action": self.action,
"code": self.code,
"message": self.message,
"tool_name": self.tool_name,
"count": self.count,
}
if self.signature is not None:
data["signature"] = self.signature.to_metadata()
return data
def canonical_tool_args(args: Mapping[str, Any]) -> str:
"""Return sorted compact JSON for parsed tool arguments."""
if not isinstance(args, Mapping):
raise TypeError(f"tool args must be a mapping, got {type(args).__name__}")
return json.dumps(
args,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
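Reviewer sketch of why the canonical JSON form matters for loop detection: two calls with the same arguments in a different key order should hash to the same signature. `call_signature` is a hypothetical helper that combines the `canonical_tool_args` settings above with SHA-256, roughly what `ToolCallSignature.from_call` appears to do.

```python
import hashlib
import json
from typing import Any, Mapping, Optional, Tuple


def call_signature(tool_name: str, args: Optional[Mapping[str, Any]]) -> Tuple[str, str]:
    """Return (tool_name, sha256 of sorted compact JSON args)."""
    canonical = json.dumps(
        dict(args or {}),
        ensure_ascii=False,
        sort_keys=True,          # key order must not change the identity
        separators=(",", ":"),   # compact form, no whitespace variance
        default=str,             # non-JSON values fall back to their str()
    )
    return tool_name, hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = call_signature("read_file", {"path": "a.txt", "limit": 5})
b = call_signature("read_file", {"limit": 5, "path": "a.txt"})
c = call_signature("read_file", {"path": "b.txt", "limit": 5})
```

Here `a == b` while `a != c`, and only the hash (not the raw argument values) needs to be stored or logged.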
def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]:
"""Safety-fallback classifier used only when callers don't pass ``failed``.
Mirrors ``agent.display._detect_tool_failure`` exactly so the guardrail
never disagrees with the CLI's user-visible ``[error]`` tag. Production
callers in ``run_agent.py`` always pass an explicit ``failed=`` derived
from ``_detect_tool_failure``; this function exists so standalone callers
(tests, tooling) still get consistent behavior.
"""
if result is None:
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
if isinstance(data, dict):
exit_code = data.get("exit_code")
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
return False, ""
if tool_name == "memory":
data = safe_json_loads(result)
if isinstance(data, dict):
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
lower = result[:500].lower()
if '"error"' in lower or '"failed"' in lower or result.startswith("Error"):
return True, " [error]"
return False, ""
class ToolCallGuardrailController:
"""Per-turn controller for repeated failed/non-progressing tool calls."""
def __init__(self, config: ToolCallGuardrailConfig | None = None):
self.config = config or ToolCallGuardrailConfig()
self.reset_for_turn()
def reset_for_turn(self) -> None:
self._exact_failure_counts: dict[ToolCallSignature, int] = {}
self._same_tool_failure_counts: dict[str, int] = {}
self._no_progress: dict[ToolCallSignature, tuple[str, int]] = {}
self._halt_decision: ToolGuardrailDecision | None = None
@property
def halt_decision(self) -> ToolGuardrailDecision | None:
return self._halt_decision
def before_call(self, tool_name: str, args: Mapping[str, Any] | None) -> ToolGuardrailDecision:
signature = ToolCallSignature.from_call(tool_name, _coerce_args(args))
if not self.config.hard_stop_enabled:
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
exact_count = self._exact_failure_counts.get(signature, 0)
if exact_count >= self.config.exact_failure_block_after:
decision = ToolGuardrailDecision(
action="block",
code="repeated_exact_failure_block",
message=(
f"Blocked {tool_name}: the same tool call failed {exact_count} "
"times with identical arguments. Stop retrying it unchanged; "
"change strategy or explain the blocker."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self._is_idempotent(tool_name):
record = self._no_progress.get(signature)
if record is not None:
_result_hash, repeat_count = record
if repeat_count >= self.config.no_progress_block_after:
decision = ToolGuardrailDecision(
action="block",
code="idempotent_no_progress_block",
message=(
f"Blocked {tool_name}: this read-only call returned the same "
f"result {repeat_count} times. Stop repeating it unchanged; "
"use the result already provided or try a different query."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
self._halt_decision = decision
return decision
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
def after_call(
self,
tool_name: str,
args: Mapping[str, Any] | None,
result: str | None,
*,
failed: bool | None = None,
) -> ToolGuardrailDecision:
args = _coerce_args(args)
signature = ToolCallSignature.from_call(tool_name, args)
if failed is None:
failed, _ = classify_tool_failure(tool_name, result)
if failed:
exact_count = self._exact_failure_counts.get(signature, 0) + 1
self._exact_failure_counts[signature] = exact_count
self._no_progress.pop(signature, None)
same_count = self._same_tool_failure_counts.get(tool_name, 0) + 1
self._same_tool_failure_counts[tool_name] = same_count
if self.config.hard_stop_enabled and same_count >= self.config.same_tool_failure_halt_after:
decision = ToolGuardrailDecision(
action="halt",
code="same_tool_failure_halt",
message=(
f"Stopped {tool_name}: it failed {same_count} times this turn. "
"Stop retrying the same failing tool path and choose a different approach."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self.config.warnings_enabled and exact_count >= self.config.exact_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="repeated_exact_failure_warning",
message=(
f"{tool_name} has failed {exact_count} times with identical arguments. "
"This looks like a loop; inspect the error and change strategy "
"instead of retrying it unchanged."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
if self.config.warnings_enabled and same_count >= self.config.same_tool_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="same_tool_failure_warning",
message=(
f"{tool_name} has failed {same_count} times this turn. "
"This looks like a loop; change approach before retrying."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=exact_count, signature=signature)
self._exact_failure_counts.pop(signature, None)
self._same_tool_failure_counts.pop(tool_name, None)
if not self._is_idempotent(tool_name):
self._no_progress.pop(signature, None)
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
result_hash = _result_hash(result)
previous = self._no_progress.get(signature)
repeat_count = 1
if previous is not None and previous[0] == result_hash:
repeat_count = previous[1] + 1
self._no_progress[signature] = (result_hash, repeat_count)
if self.config.warnings_enabled and repeat_count >= self.config.no_progress_warn_after:
return ToolGuardrailDecision(
action="warn",
code="idempotent_no_progress_warning",
message=(
f"{tool_name} returned the same result {repeat_count} times. "
"Use the result already provided or change the query instead of "
"repeating it unchanged."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=repeat_count, signature=signature)
def _is_idempotent(self, tool_name: str) -> bool:
if tool_name in self.config.mutating_tools:
return False
return tool_name in self.config.idempotent_tools
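Reviewer sketch of the no-progress bookkeeping in `after_call`, reduced to its core: a repeat counter per signature that resets whenever the result changes. `NoProgressTracker` is a hypothetical stand-in for illustration; it hashes the raw result string and skips the JSON canonicalization that `_result_hash` performs.

```python
import hashlib
from typing import Dict, Tuple


class NoProgressTracker:
    """Count consecutive identical results per call signature."""

    def __init__(self, warn_after: int = 2) -> None:
        self.warn_after = warn_after
        self._seen: Dict[str, Tuple[str, int]] = {}

    def observe(self, signature: str, result: str) -> Tuple[int, bool]:
        """Record one result; return (repeat_count, should_warn)."""
        result_hash = hashlib.sha256(result.encode("utf-8")).hexdigest()
        previous = self._seen.get(signature)
        repeat = 1
        if previous is not None and previous[0] == result_hash:
            repeat = previous[1] + 1
        self._seen[signature] = (result_hash, repeat)
        return repeat, repeat >= self.warn_after


tracker = NoProgressTracker(warn_after=2)
first = tracker.observe("read_file:abc", "same output")
second = tracker.observe("read_file:abc", "same output")
changed = tracker.observe("read_file:abc", "new output")
```

A changed result resets the count to 1, so only back-to-back identical results count toward the warning threshold.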
def toolguard_synthetic_result(decision: ToolGuardrailDecision) -> str:
"""Build a synthetic role=tool content string for a blocked tool call."""
return json.dumps(
{
"error": decision.message,
"guardrail": decision.to_metadata(),
},
ensure_ascii=False,
)
def append_toolguard_guidance(result: str, decision: ToolGuardrailDecision) -> str:
"""Append runtime guidance to the current tool result content."""
if decision.action not in {"warn", "halt"} or not decision.message:
return result
label = "Tool loop hard stop" if decision.action == "halt" else "Tool loop warning"
suffix = (
f"\n\n[{label}: "
f"{decision.code}; count={decision.count}; {decision.message}]"
)
return (result or "") + suffix
def _coerce_args(args: Mapping[str, Any] | None) -> Mapping[str, Any]:
return args if isinstance(args, Mapping) else {}
def _result_hash(result: str | None) -> str:
parsed = safe_json_loads(result or "")
if parsed is not None:
try:
canonical = json.dumps(
parsed,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
except TypeError:
canonical = str(parsed)
else:
canonical = result or ""
return _sha256(canonical)
def _as_bool(value: Any, default: bool) -> bool:
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, (int, float)):
return bool(value)
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in {"1", "true", "yes", "on", "enabled"}:
return True
if lowered in {"0", "false", "no", "off", "disabled"}:
return False
return default
def _positive_int(value: Any, default: int) -> int:
if value is None:
return default
try:
parsed = int(value)
except (TypeError, ValueError):
return default
return parsed if parsed >= 1 else default
def _sha256(value: str) -> str:
return hashlib.sha256(value.encode("utf-8")).hexdigest()
+1 -5
@@ -477,13 +477,9 @@ class ChatCompletionsTransport(ProviderTransport):
# so keep them apart in provider_data rather than merging.
reasoning = getattr(msg, "reasoning", None)
reasoning_content = getattr(msg, "reasoning_content", None)
if reasoning_content is None and hasattr(msg, "model_extra"):
model_extra = getattr(msg, "model_extra", None) or {}
if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
reasoning_content = model_extra["reasoning_content"]
provider_data: Dict[str, Any] = {}
if reasoning_content is not None:
if reasoning_content:
provider_data["reasoning_content"] = reasoning_content
rd = getattr(msg, "reasoning_details", None)
if rd:
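Reviewer sketch contrasting the two conditions that appear in this hunk, `is not None` versus plain truthiness, together with the `model_extra` fallback. The `keep_empty` flag and the function name are hypothetical and exist only to show both behaviors side by side; `SimpleNamespace` stands in for the provider's message object.

```python
from types import SimpleNamespace
from typing import Any, Dict


def extract_reasoning_content(msg: Any, *, keep_empty: bool) -> Dict[str, Any]:
    """Collect reasoning_content, optionally keeping empty strings."""
    reasoning_content = getattr(msg, "reasoning_content", None)
    if reasoning_content is None:
        # Fallback: some clients park unknown fields in `model_extra`.
        model_extra = getattr(msg, "model_extra", None) or {}
        if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
            reasoning_content = model_extra["reasoning_content"]
    provider_data: Dict[str, Any] = {}
    keep = reasoning_content is not None if keep_empty else bool(reasoning_content)
    if keep:
        provider_data["reasoning_content"] = reasoning_content
    return provider_data


empty = SimpleNamespace(reasoning_content="")
extra = SimpleNamespace(reasoning_content=None, model_extra={"reasoning_content": "r"})
```

The practical difference is the empty string: `is not None` preserves it in `provider_data`, while the truthiness check drops it.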
-19
@@ -289,25 +289,6 @@ browser:
# after this period of no activity between agent loops (default: 120 = 2 minutes)
inactivity_timeout: 120
# =============================================================================
# Tool Loop Guardrails
# =============================================================================
# Soft warnings are enabled by default. They append guidance to repeated failed
# or non-progressing tool results but still let the tool execute. Hard stops are
# opt-in circuit breakers for autonomous/cron sessions where stopping a loop is
# preferable to spending the full iteration budget.
tool_loop_guardrails:
warnings_enabled: true
hard_stop_enabled: false
warn_after:
exact_failure: 2
same_tool_failure: 3
idempotent_no_progress: 2
hard_stop_after:
exact_failure: 5
same_tool_failure: 8
idempotent_no_progress: 5
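Reviewer sketch of how a section like the one above could be read into thresholds, following the nested-then-flat lookup used by `ToolCallGuardrailConfig.from_mapping`. The helper names are hypothetical, and the dict literal stands in for the parsed YAML.

```python
from typing import Any, Mapping


def positive_int(value: Any, default: int) -> int:
    """Parse value as an int >= 1, else return the default."""
    if value is None:
        return default
    try:
        parsed = int(value)
    except (TypeError, ValueError):
        return default
    return parsed if parsed >= 1 else default


def warn_threshold(section: Mapping[str, Any], key: str, flat_key: str, default: int) -> int:
    """Prefer warn_after.<key>, fall back to the flat key, then the default."""
    nested = section.get("warn_after")
    if not isinstance(nested, Mapping):
        nested = {}
    return positive_int(nested.get(key, section.get(flat_key)), default)


section = {
    "warn_after": {"exact_failure": 3, "idempotent_no_progress": 0},
    "same_tool_failure_warn_after": "4",
}
```

Invalid or non-positive values fall back to the default rather than raising, so a typo in config.yaml does not disable the guardrail.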
# =============================================================================
# Context Compression (Auto-shrinks long conversations)
# =============================================================================
+15 -354
@@ -15,6 +15,7 @@ Usage:
import logging
import os
import re
import shutil
import sys
import json
@@ -85,7 +86,7 @@ from hermes_cli.browser_connect import (
try_launch_chrome_debug,
)
from hermes_cli.env_loader import load_hermes_dotenv
from utils import base_url_host_matches, is_truthy_value
from utils import base_url_host_matches
_hermes_home = get_hermes_home()
_project_env = Path(__file__).parent / '.env'
@@ -599,7 +600,6 @@ def load_cli_config() -> Dict[str, Any]:
# Load configuration at module startup
CLI_CONFIG = load_cli_config()
# Initialize centralized logging early — agent.log + errors.log in ~/.hermes/logs/.
# This ensures CLI sessions produce a log trail even before AIAgent is instantiated.
try:
@@ -934,20 +934,6 @@ def _run_state_db_auto_maintenance(session_db) -> None:
try:
from hermes_cli.config import load_config as _load_full_config
from hermes_constants import get_hermes_home as _get_hermes_home
_hermes_home_maint = _get_hermes_home()
# One-time prune of empty TUI ghost sessions.
try:
if not session_db.get_meta("ghost_session_prune_v1"):
pruned = session_db.prune_empty_ghost_sessions(
sessions_dir=_hermes_home_maint / "sessions"
)
session_db.set_meta("ghost_session_prune_v1", "1")
if pruned:
logger.info("Pruned %d empty TUI ghost sessions", pruned)
except Exception as _prune_exc:
logger.debug("Ghost session prune skipped: %s", _prune_exc)
cfg = (_load_full_config().get("sessions") or {})
if not cfg.get("auto_prune", False):
return
@@ -955,7 +941,7 @@ def _run_state_db_auto_maintenance(session_db) -> None:
retention_days=int(cfg.get("retention_days", 90)),
min_interval_hours=int(cfg.get("min_interval_hours", 24)),
vacuum=bool(cfg.get("vacuum_after_prune", True)),
sessions_dir=_hermes_home_maint / "sessions",
sessions_dir=_get_hermes_home() / "sessions",
)
except Exception as exc:
logger.debug("state.db auto-maintenance skipped: %s", exc)
@@ -1254,73 +1240,8 @@ def _cprint(text: str):
Raw ANSI escapes written via print() are swallowed by patch_stdout's
StdoutProxy. Routing through print_formatted_text(ANSI(...)) lets
prompt_toolkit parse the escapes and render real colors.
When called from a background thread while a prompt_toolkit
``Application`` is running (the common case for the self-improvement
background review's ``💾 …`` summary, curator summaries, and other
bg-thread emissions), a direct ``_pt_print`` races with the input
area's redraw and the line can end up visually buried behind the
prompt. Route those cases through ``run_in_terminal`` via
``loop.call_soon_threadsafe``, which pauses the input area, prints
the line above it, and redraws the prompt cleanly.
"""
try:
from prompt_toolkit.application import get_app_or_none, run_in_terminal
except Exception:
_pt_print(_PT_ANSI(text))
return
app = None
try:
app = get_app_or_none()
except Exception:
app = None
# No active app, or we're already on the app's main thread: the
# direct prompt_toolkit print is safe and matches existing behavior
# (spinner frames, streamed tokens, tool activity prefixes, …).
if app is None or not getattr(app, "_is_running", False):
_pt_print(_PT_ANSI(text))
return
try:
loop = app.loop # type: ignore[attr-defined]
except Exception:
loop = None
if loop is None:
_pt_print(_PT_ANSI(text))
return
import asyncio as _asyncio
try:
current_loop = _asyncio.get_event_loop_policy().get_event_loop()
except Exception:
current_loop = None
# Same thread as the app's loop → safe to print directly.
if current_loop is loop and loop.is_running():
_pt_print(_PT_ANSI(text))
return
# Cross-thread emission: ask the app's event loop to schedule a
# ``run_in_terminal`` that wraps ``_pt_print``. This hides the
# prompt, prints, and redraws. Fire-and-forget — if scheduling
# fails we fall back to a direct print so the line isn't lost.
def _schedule():
try:
run_in_terminal(lambda: _pt_print(_PT_ANSI(text)))
except Exception:
try:
_pt_print(_PT_ANSI(text))
except Exception:
pass
try:
loop.call_soon_threadsafe(_schedule)
except Exception:
try:
_pt_print(_PT_ANSI(text))
except Exception:
pass
_pt_print(_PT_ANSI(text))
# ---------------------------------------------------------------------------
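Reviewer sketch of the cross-thread hand-off that the longer `_cprint` docstring in this hunk describes: a background thread asks the event loop to run a callback on the loop's own thread with `call_soon_threadsafe`. It uses plain `asyncio` and a list as the output sink; the prompt_toolkit `run_in_terminal` step is deliberately left out, and all names here are illustrative.

```python
import asyncio
import threading
from typing import List, Tuple


async def main() -> List[Tuple[str, bool]]:
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()
    printed: List[Tuple[str, bool]] = []
    done = asyncio.Event()

    def on_loop_thread(text: str) -> None:
        # Runs on the loop's thread, so touching loop-owned state is safe here.
        printed.append((text, threading.get_ident() == loop_thread))
        done.set()

    def background_emit() -> None:
        # Runs on a worker thread: only schedule, never print directly.
        loop.call_soon_threadsafe(on_loop_thread, "saved")

    worker = threading.Thread(target=background_emit)
    worker.start()
    await asyncio.wait_for(done.wait(), timeout=5)
    worker.join()
    return printed


result = asyncio.run(main())
```

The recorded flag confirms the callback ran on the loop's thread rather than on the worker that requested it.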
@@ -2132,8 +2053,6 @@ class HermesCLI:
# Parse and validate toolsets
self.enabled_toolsets = toolsets
self.disabled_toolsets = CLI_CONFIG["agent"].get("disabled_toolsets") or []
if toolsets and "all" not in toolsets and "*" not in toolsets:
# Validate each toolset — MCP server names are resolved via
# live registry aliases (registered during discover_mcp_tools),
@@ -3584,7 +3503,6 @@ class HermesCLI:
credential_pool=runtime.get("credential_pool"),
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
disabled_toolsets=self.disabled_toolsets,
verbose_logging=self.verbose,
quiet_mode=not self.verbose,
ephemeral_system_prompt=self.system_prompt if self.system_prompt else None,
@@ -3632,18 +3550,14 @@ class HermesCLI:
tuple(runtime.get("args") or ()),
)
# Force-create DB row on /title intent, then apply title.
if self._pending_title and self._session_db and self.agent:
if self._pending_title and self._session_db:
try:
self.agent._ensure_db_session()
if self.agent._session_db_created:
self._session_db.set_session_title(self.session_id, self._pending_title)
_cprint(f" Session title applied: {self._pending_title}")
self._pending_title = None
# else: row creation failed transiently — keep _pending_title for retry
self._session_db.set_session_title(self.session_id, self._pending_title)
_cprint(f" Session title applied: {self._pending_title}")
self._pending_title = None
except (ValueError, Exception) as e:
_cprint(f" Could not apply pending title: {e}")
# Keep _pending_title so it can be retried after row creation succeeds
self._pending_title = None
return True
except Exception as e:
ChatConsole().print(f"[bold red]Failed to initialize agent: {e}[/]")
@@ -4912,40 +4826,6 @@ class HermesCLI:
flush_tool_summary()
print()
def _handle_recap_command(self) -> None:
"""Show a compact recap of recent activity in this session.
Inspired by Claude Code's ``/recap`` (v2.1.114, April 2026) — useful
when running multiple sessions simultaneously and returning to one
after a while. Purely local; no LLM call, no token cost, no cache
invalidation.
"""
try:
from hermes_cli.session_recap import build_recap
except Exception as exc: # pragma: no cover - defensive
print(f" (recap unavailable: {exc})")
return
title = None
try:
if self._session_db and self.session_id:
row = self._session_db.get_session(self.session_id)
if row:
title = row.get("title") or None
except Exception:
title = None
text = build_recap(
self.conversation_history or [],
session_title=title,
session_id=self.session_id,
platform="cli",
)
print()
for line in text.splitlines():
print(line)
print()
def _notify_session_boundary(self, event_type: str) -> None:
"""Fire a session-boundary plugin hook (on_session_finalize or on_session_reset).
@@ -5005,7 +4885,6 @@ class HermesCLI:
if self._session_db:
try:
self.agent._session_db_created = False
self._session_db.create_session(
session_id=self.session_id,
source=os.environ.get("HERMES_SESSION_SOURCE", "cli"),
@@ -5015,7 +4894,6 @@ class HermesCLI:
"reasoning_config": self.reasoning_config,
},
)
self.agent._session_db_created = True
except Exception:
pass
# Notify memory providers that session_id rotated to a fresh
@@ -6209,27 +6087,6 @@ class HermesCLI:
except Exception as exc:
print(f"(._.) curator: {exc}")
def _handle_kanban_command(self, cmd: str):
"""Handle the /kanban command — delegate to the shared kanban CLI.
The string form passed here is the user's full ``/kanban ...``
including the leading slash; we strip it and hand the remainder
to ``kanban.run_slash`` which returns a single formatted string.
"""
from hermes_cli.kanban import run_slash
rest = cmd.strip()
if rest.startswith("/"):
rest = rest.lstrip("/")
if rest.startswith("kanban"):
rest = rest[len("kanban"):].lstrip()
try:
output = run_slash(rest)
except Exception as exc: # pragma: no cover - defensive
output = f"(._.) kanban error: {exc}"
if output:
print(output)
def _handle_skills_command(self, cmd: str):
"""Handle /skills slash command — delegates to hermes_cli.skills_hub."""
from hermes_cli.skills_hub import handle_skills_slash
@@ -6398,8 +6255,6 @@ class HermesCLI:
pass
elif canonical == "history":
self.show_history()
elif canonical == "recap":
self._handle_recap_command()
elif canonical == "title":
parts = cmd_original.split(maxsplit=1)
if len(parts) > 1:
@@ -6477,8 +6332,6 @@ class HermesCLI:
self._handle_cron_command(cmd_original)
elif canonical == "curator":
self._handle_curator_command(cmd_original)
elif canonical == "kanban":
self._handle_kanban_command(cmd_original)
elif canonical == "skills":
with self._busy_command(self._slow_command_status(cmd_original)):
self._handle_skills_command(cmd_original)
@@ -6596,8 +6449,6 @@ class HermesCLI:
# No active run — treat as a normal next-turn message.
self._pending_input.put(payload)
_cprint(f" No agent running; queued as next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "goal":
self._handle_goal_command(cmd_original)
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -6643,17 +6494,12 @@ class HermesCLI:
self._console_print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
# Check for plugin-registered slash commands
elif base_cmd.lstrip("/") in _get_plugin_cmd_handler_names():
from hermes_cli.plugins import (
get_plugin_command_handler,
resolve_plugin_command_result,
)
from hermes_cli.plugins import get_plugin_command_handler
plugin_handler = get_plugin_command_handler(base_cmd.lstrip("/"))
if plugin_handler:
user_args = cmd_original[len(base_cmd):].strip()
try:
result = resolve_plugin_command_result(
plugin_handler(user_args)
)
result = plugin_handler(user_args)
if result:
_cprint(str(result))
except Exception as e:
@@ -7078,166 +6924,6 @@ class HermesCLI:
print(" status Show current browser mode")
print()
# ────────────────────────────────────────────────────────────────
# /goal — persistent cross-turn goals (Ralph-style loop)
# ────────────────────────────────────────────────────────────────
def _get_goal_manager(self):
"""Return the GoalManager bound to the current session_id.
Cached on ``self._goal_manager`` and rebound lazily when
``session_id`` changes (e.g. after /new or a compression-driven
session split).
"""
try:
from hermes_cli.goals import GoalManager
from hermes_cli.config import load_config
except Exception as exc:
logging.debug("goal manager unavailable: %s", exc)
return None
sid = getattr(self, "session_id", None) or ""
if not sid:
return None
existing = getattr(self, "_goal_manager", None)
if existing is not None and getattr(existing, "session_id", None) == sid:
return existing
try:
cfg = load_config() or {}
goals_cfg = cfg.get("goals") or {}
max_turns = int(goals_cfg.get("max_turns", 20) or 20)
except Exception:
max_turns = 20
mgr = GoalManager(session_id=sid, default_max_turns=max_turns)
self._goal_manager = mgr
return mgr
def _handle_goal_command(self, cmd: str) -> None:
"""Dispatch /goal subcommands: set / status / pause / resume / clear."""
parts = (cmd or "").strip().split(None, 1)
arg = parts[1].strip() if len(parts) > 1 else ""
mgr = self._get_goal_manager()
if mgr is None:
_cprint(f" {_DIM}Goals unavailable (no active session).{_RST}")
return
lower = arg.lower()
# Bare /goal or /goal status → show current state
if not arg or lower == "status":
_cprint(f" {mgr.status_line()}")
return
if lower == "pause":
state = mgr.pause(reason="user-paused")
if state is None:
_cprint(f" {_DIM}No goal set.{_RST}")
else:
_cprint(f" ⏸ Goal paused: {state.goal}")
return
if lower == "resume":
state = mgr.resume()
if state is None:
_cprint(f" {_DIM}No goal to resume.{_RST}")
else:
_cprint(f" ▶ Goal resumed: {state.goal}")
_cprint(
f" {_DIM}Send any message (or press Enter on an empty prompt "
f"is a no-op; type 'continue' to kick it off).{_RST}"
)
return
if lower in ("clear", "stop", "done"):
had = mgr.has_goal()
mgr.clear()
if had:
_cprint(" ✓ Goal cleared.")
else:
_cprint(f" {_DIM}No active goal.{_RST}")
return
# Otherwise treat the arg as the goal text.
try:
state = mgr.set(arg)
except ValueError as exc:
_cprint(f" Invalid goal: {exc}")
return
_cprint(f" ⊙ Goal set ({state.max_turns}-turn budget): {state.goal}")
_cprint(
f" {_DIM}After each turn, a judge model will check if the goal is done. "
f"Hermes keeps working until it is, you pause/clear it, or the budget is "
f"exhausted. Use /goal status, /goal pause, /goal resume, /goal clear.{_RST}"
)
# Kick the loop off immediately so the user doesn't have to send a
# separate message after setting the goal.
try:
self._pending_input.put(state.goal)
except Exception:
pass
def _maybe_continue_goal_after_turn(self) -> None:
"""Hook run after every CLI turn. Judges + maybe re-queues.
Safe to call when no goal is set; returns quickly.
Preemption is automatic: if a real user message is already in
``_pending_input`` we skip judging (the user's new input takes
priority and we'll re-judge after that turn). If judge says done,
mark it done and tell the user. If judge says continue and we're
under budget, push the continuation prompt onto the queue.
"""
mgr = self._get_goal_manager()
if mgr is None or not mgr.is_active():
return
# If a real user message is already queued, don't inject a
# continuation prompt on top — let the user's turn go first.
try:
if getattr(self, "_pending_input", None) is not None \
and not self._pending_input.empty():
return
except Exception:
pass
# Extract the agent's final response for this turn.
last_response = ""
try:
hist = self.conversation_history or []
for msg in reversed(hist):
if msg.get("role") == "assistant":
content = msg.get("content", "")
if isinstance(content, list):
# Multimodal content — flatten text parts.
parts = [
p.get("text", "")
for p in content
if isinstance(p, dict) and p.get("type") in ("text", "output_text")
]
last_response = "\n".join(t for t in parts if t)
else:
last_response = str(content or "")
break
except Exception:
last_response = ""
decision = mgr.evaluate_after_turn(last_response, user_initiated=True)
msg = decision.get("message") or ""
if msg:
_cprint(f" {msg}")
if decision.get("should_continue"):
prompt = decision.get("continuation_prompt")
if prompt:
try:
self._pending_input.put(prompt)
except Exception as exc:
logging.debug("goal continuation enqueue failed: %s", exc)
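The "extract the agent's final response" step in `_maybe_continue_goal_after_turn` can be read in isolation. The following is a minimal sketch of that step only, assuming a history made of plain dicts with `role` and `content` keys; the function name `last_assistant_text` is hypothetical and does not appear in the codebase.

```python
from typing import Any, Dict, List


def last_assistant_text(history: List[Dict[str, Any]]) -> str:
    """Return the text of the most recent assistant message, or ""."""
    for msg in reversed(history):
        if msg.get("role") != "assistant":
            continue
        content = msg.get("content", "")
        if isinstance(content, list):
            # Multimodal content: keep only the text-bearing parts.
            parts = [
                p.get("text", "")
                for p in content
                if isinstance(p, dict) and p.get("type") in ("text", "output_text")
            ]
            return "\n".join(t for t in parts if t)
        return str(content or "")
    return ""


history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": [
        {"type": "text", "text": "step 1"},
        {"type": "image", "url": "x"},
        {"type": "output_text", "text": "step 2"},
    ]},
    {"role": "user", "content": "next"},
]
print(last_assistant_text(history))
```

Only the newest assistant message is inspected, so a later user message does not hide it, and non-text parts are dropped rather than stringified.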
def _handle_skin_command(self, cmd: str):
"""Handle /skin [name] — show or change the display skin."""
try:
@@ -7364,7 +7050,7 @@ class HermesCLI:
import os
from hermes_cli.colors import Colors as _Colors
current = is_truthy_value(os.environ.get("HERMES_YOLO_MODE"))
current = bool(os.environ.get("HERMES_YOLO_MODE"))
if current:
os.environ.pop("HERMES_YOLO_MODE", None)
_cprint(
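The two variants of the `current = ...` line in this hunk can disagree for environment values such as `"0"` or `"false"`, because any non-empty string is truthy under `bool()`. A small sketch of the difference follows; `is_truthy` is a hypothetical stand-in written for illustration, and the project's real `is_truthy_value` may accept a different set of spellings.

```python
def is_truthy(value, default=False):
    """Hypothetical stand-in for a string-aware truthiness check."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"1", "true", "yes", "on"}


# Compare plain bool() with the string-aware check for typical env values.
for raw in (None, "", "0", "false", "1", "true"):
    print(repr(raw), bool(raw), is_truthy(raw))
```

With `bool()`, setting the variable to `"0"` still counts as enabled; a string-aware check treats it as disabled.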
@@ -7561,20 +7247,10 @@ class HermesCLI:
original_count = len(self.conversation_history)
with self._busy_command("Compressing context..."):
try:
from agent.model_metadata import estimate_request_tokens_rough
from agent.model_metadata import estimate_messages_tokens_rough
from agent.manual_compression_feedback import summarize_manual_compression
original_history = list(self.conversation_history)
# Include system prompt + tool schemas in the estimate —
# a transcript-only number understates real request pressure
# and can even appear to grow after compression because a
# dense handoff summary replaces many short turns (#6217).
_sys_prompt = getattr(self.agent, "_cached_system_prompt", "") or ""
_tools = getattr(self.agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
original_history,
system_prompt=_sys_prompt,
tools=_tools,
)
approx_tokens = estimate_messages_tokens_rough(original_history)
if focus_topic:
print(f"🗜️ Compressing {original_count} messages (~{approx_tokens:,} tokens), "
f"focus: \"{focus_topic}\"...")
@@ -7606,11 +7282,7 @@ class HermesCLI:
):
self.session_id = self.agent.session_id
self._pending_title = None
new_tokens = estimate_request_tokens_rough(
self.conversation_history,
system_prompt=_sys_prompt,
tools=_tools,
)
new_tokens = estimate_messages_tokens_rough(self.conversation_history)
summary = summarize_manual_compression(
original_history,
self.conversation_history,
@@ -11576,17 +11248,6 @@ class HermesCLI:
app.invalidate() # Refresh status line
# Goal continuation: if a standing goal is active, ask
# the judge whether the turn satisfied it. If not, and
# there's no real user message already queued, push the
# continuation prompt back into _pending_input so the
# next loop iteration picks it up naturally (and any
# user input that arrives in between still preempts).
try:
self._maybe_continue_goal_after_turn()
except Exception as _goal_exc:
logging.debug("goal continuation hook failed: %s", _goal_exc)
# Continuous voice: auto-restart recording after agent responds.
# Dispatch to a daemon thread so play_beep (sd.wait) and
# AudioRecorder.start (lock acquire) never block process_loop —
-118
View File
@@ -882,121 +882,3 @@ def save_job_output(job_id: str, output: str):
raise
return output_file
# =============================================================================
# Skill reference rewriting (curator integration)
# =============================================================================
def rewrite_skill_refs(
consolidated: Optional[Dict[str, str]] = None,
pruned: Optional[List[str]] = None,
) -> Dict[str, Any]:
"""Rewrite cron job skill references after a curator consolidation pass.
When the curator consolidates a skill X into umbrella Y (or archives X
as pruned), any cron job that lists ``X`` in its ``skills`` field will
fail to load ``X`` at run time — the scheduler logs a warning and
skips the skill, so the job runs without the instructions it was
scheduled to follow. See cron/scheduler.py where ``skill_view`` is
called per skill name.
This function repairs cron jobs in-place:
- A skill listed in ``consolidated`` is replaced with its umbrella
target (the ``into`` value). If the umbrella is already in the
job's skill list, the stale name is dropped without duplication.
- A skill listed in ``pruned`` is dropped outright — there is no
forwarding target.
- Ordering and other skills in the list are preserved.
- The legacy ``skill`` field is realigned via ``_apply_skill_fields``.
Args:
consolidated: mapping of ``old_skill_name -> umbrella_skill_name``.
pruned: list of skill names that were archived with no forwarding
target.
Returns a report dict::
{
"rewrites": [
{
"job_id": ...,
"job_name": ...,
"before": [...],
"after": [...],
"mapped": {"old": "new", ...},
"dropped": ["old", ...],
},
...
],
"jobs_updated": N,
"jobs_scanned": M,
}
Best-effort: exceptions from loading/saving propagate to the caller so
tests can assert behaviour; the curator invocation site wraps this
call in a try/except so a failure here never breaks the curator.
"""
consolidated = dict(consolidated or {})
pruned_set = set(pruned or [])
# A skill listed in both wins as "consolidated" — it has a target,
# which is the more useful of the two outcomes.
pruned_set -= set(consolidated.keys())
if not consolidated and not pruned_set:
return {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
with _jobs_file_lock:
jobs = load_jobs()
rewrites: List[Dict[str, Any]] = []
changed = False
for job in jobs:
skills_before = _normalize_skill_list(job.get("skill"), job.get("skills"))
if not skills_before:
continue
mapped: Dict[str, str] = {}
dropped: List[str] = []
new_skills: List[str] = []
for name in skills_before:
if name in consolidated:
target = consolidated[name]
mapped[name] = target
if target and target not in new_skills:
new_skills.append(target)
elif name in pruned_set:
dropped.append(name)
else:
if name not in new_skills:
new_skills.append(name)
if not mapped and not dropped:
continue
job["skills"] = new_skills
job["skill"] = new_skills[0] if new_skills else None
changed = True
rewrites.append({
"job_id": job.get("id"),
"job_name": job.get("name") or job.get("id"),
"before": list(skills_before),
"after": list(new_skills),
"mapped": mapped,
"dropped": dropped,
})
if changed:
save_jobs(jobs)
logger.info(
"Curator rewrote skill references in %d cron job(s)", len(rewrites)
)
return {
"rewrites": rewrites,
"jobs_updated": len(rewrites),
"jobs_scanned": len(jobs),
}
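The per-job remapping inside `rewrite_skill_refs` can be sketched without the job store or the file lock. This is an illustrative extraction, assuming skills are plain strings; the helper name `remap_skills` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def remap_skills(
    skills: Iterable[str],
    consolidated: Dict[str, str],
    pruned: Iterable[str],
) -> Tuple[List[str], Dict[str, str], List[str]]:
    """Return (new_skills, mapped, dropped) for one job's skill list."""
    # A name present in both inputs is treated as consolidated.
    pruned_set = set(pruned) - set(consolidated)
    new_skills: List[str] = []
    mapped: Dict[str, str] = {}
    dropped: List[str] = []
    for name in skills:
        if name in consolidated:
            target = consolidated[name]
            mapped[name] = target
            if target and target not in new_skills:
                new_skills.append(target)
        elif name in pruned_set:
            dropped.append(name)
        elif name not in new_skills:
            new_skills.append(name)
    return new_skills, mapped, dropped


after, mapped, dropped = remap_skills(
    ["git-basics", "release", "old-notes", "git"],
    consolidated={"git-basics": "git"},
    pruned=["old-notes"],
)
print(after, mapped, dropped)
```

The umbrella skill takes the position of the first stale name that maps to it, and a later explicit occurrence of the umbrella is not duplicated.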
+1 -1
View File
@@ -40,7 +40,7 @@ services:
# - TEAMS_CLIENT_SECRET=${TEAMS_CLIENT_SECRET}
# - TEAMS_TENANT_ID=${TEAMS_TENANT_ID}
# - TEAMS_ALLOWED_USERS=${TEAMS_ALLOWED_USERS}
# - TEAMS_PORT=${TEAMS_PORT:-3978}
# - TEAMS_PORT=3978
command: ["gateway", "run"]
dashboard:
Binary file not shown.
+6 -64
View File
@@ -36,26 +36,6 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return is_truthy_value(value, default=default)
def _coerce_float(value: Any, default: float) -> float:
"""Coerce numeric config values, falling back on malformed input."""
if value is None:
return default
try:
return float(value)
except (TypeError, ValueError):
return default
def _coerce_int(value: Any, default: int) -> int:
"""Coerce integer config values, falling back on malformed input."""
if value is None:
return default
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
"""Normalize unauthorized DM behavior to a supported value."""
if isinstance(value, str):
@@ -65,15 +45,6 @@ def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> st
return default
def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
"""Normalize notice delivery mode to a supported value."""
if isinstance(value, str):
normalized = value.strip().lower()
if normalized in {"public", "private"}:
return normalized
return default
# Module-level cache for bundled platform plugin names (lives outside the
# enum so it doesn't become an accidental enum member).
_Platform__bundled_plugin_names: Optional[set] = None
@@ -330,13 +301,13 @@ class StreamingConfig:
if not data:
return cls()
return cls(
enabled=_coerce_bool(data.get("enabled"), False),
enabled=data.get("enabled", False),
transport=data.get("transport", "edit"),
edit_interval=_coerce_float(data.get("edit_interval"), 1.0),
buffer_threshold=_coerce_int(data.get("buffer_threshold"), 40),
edit_interval=float(data.get("edit_interval", 1.0)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
cursor=data.get("cursor", ""),
fresh_final_after_seconds=_coerce_float(
data.get("fresh_final_after_seconds"), 60.0
fresh_final_after_seconds=float(
data.get("fresh_final_after_seconds", 60.0)
),
)
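The practical difference between the `_coerce_*` helpers and the direct `float(...)` / `int(...)` casts shows up with malformed or explicit-null config values. A minimal sketch, with the helper reproduced under a hypothetical public name and a made-up config dict:

```python
from typing import Any


def coerce_float(value: Any, default: float) -> float:
    """Return float(value), or the default for None or malformed input."""
    if value is None:
        return default
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


# Hypothetical config: one malformed value, one explicit null.
data = {"edit_interval": "fast", "buffer_threshold": None}

print(coerce_float(data.get("edit_interval"), 1.0))

# The direct casts raise instead of falling back. Note that
# data.get(key, default) returns None when the key exists with a null value.
for key, cast, default in (("edit_interval", float, 1.0), ("buffer_threshold", int, 40)):
    try:
        cast(data.get(key, default))
    except (TypeError, ValueError) as exc:
        print(f"{key}: direct cast failed: {exc}")
```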
@@ -601,17 +572,6 @@ class GatewayConfig:
)
return self.unauthorized_dm_behavior
def get_notice_delivery(self, platform: Optional[Platform] = None) -> str:
"""Return the effective notice-delivery mode for a platform."""
if platform:
platform_cfg = self.platforms.get(platform)
if platform_cfg and "notice_delivery" in platform_cfg.extra:
return _normalize_notice_delivery(
platform_cfg.extra.get("notice_delivery"),
"public",
)
return "public"
def load_gateway_config() -> GatewayConfig:
"""
@@ -727,11 +687,6 @@ def load_gateway_config() -> GatewayConfig:
platform_cfg.get("unauthorized_dm_behavior"),
gw_data.get("unauthorized_dm_behavior", "pair"),
)
if "notice_delivery" in platform_cfg:
bridged["notice_delivery"] = _normalize_notice_delivery(
platform_cfg.get("notice_delivery"),
"public",
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "reply_in_thread" in platform_cfg:
@@ -945,12 +900,6 @@ def load_gateway_config() -> GatewayConfig:
if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()
# Feishu settings → env vars (env vars take precedence)
feishu_cfg = yaml_cfg.get("feishu", {})
if isinstance(feishu_cfg, dict):
if "allow_bots" in feishu_cfg and not os.getenv("FEISHU_ALLOW_BOTS"):
os.environ["FEISHU_ALLOW_BOTS"] = str(feishu_cfg["allow_bots"]).lower()
except Exception as e:
logger.warning(
"Failed to process config.yaml — falling back to .env / gateway.json values. "
@@ -1102,14 +1051,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if Platform.WHATSAPP not in config.platforms:
config.platforms[Platform.WHATSAPP] = PlatformConfig()
config.platforms[Platform.WHATSAPP].enabled = True
whatsapp_home = os.getenv("WHATSAPP_HOME_CHANNEL")
if whatsapp_home and Platform.WHATSAPP in config.platforms:
config.platforms[Platform.WHATSAPP].home_channel = HomeChannel(
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
)
# Slack
slack_token = os.getenv("SLACK_BOT_TOKEN")
if slack_token:
+7 -9
View File
@@ -53,10 +53,9 @@ class DeliveryTarget:
- "telegram" Telegram home channel
- "telegram:123456" specific Telegram chat
"""
target_stripped = target.strip()
target_lower = target_stripped.lower()
target = target.strip().lower()
if target_lower == "origin":
if target == "origin":
if origin:
return cls(
platform=origin.platform,
@@ -68,14 +67,13 @@ class DeliveryTarget:
# Fallback to local if no origin
return cls(platform=Platform.LOCAL, is_origin=True)
if target_lower == "local":
if target == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id or platform:chat_id:thread_id format
# Use the original case for chat_id/thread_id to preserve case-sensitive IDs
if ":" in target_stripped:
parts = target_stripped.split(":", 2)
platform_str = parts[0].lower() # Platform names are case-insensitive
if ":" in target:
parts = target.split(":", 2)
platform_str = parts[0]
chat_id = parts[1] if len(parts) > 1 else None
thread_id = parts[2] if len(parts) > 2 else None
try:
@@ -87,7 +85,7 @@ class DeliveryTarget:
# Just a platform name (use home channel)
try:
platform = Platform(target_lower)
platform = Platform(target)
return cls(platform=platform)
except ValueError:
# Unknown platform, treat as local
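The case-handling difference between the two parsing variants in this file can be shown with a small stand-alone function. This is a sketch only: `split_target` is a hypothetical name, it returns a plain tuple instead of a `DeliveryTarget`, and it follows the variant that lowercases the platform name while leaving the IDs untouched.

```python
from typing import Optional, Tuple


def split_target(target: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split 'platform[:chat_id[:thread_id]]', lowercasing only the platform."""
    stripped = target.strip()
    if ":" not in stripped:
        return stripped.lower(), None, None
    parts = stripped.split(":", 2)
    platform = parts[0].lower()  # platform names are case-insensitive
    chat_id = parts[1] if len(parts) > 1 else None
    thread_id = parts[2] if len(parts) > 2 else None
    return platform, chat_id, thread_id


print(split_target(" Slack:C0AbCdEf:1700.55 "))
print(split_target("TELEGRAM"))
```

Lowercasing the whole string before splitting would instead turn `C0AbCdEf` into `c0abcdef`, which matters on platforms whose chat IDs are case-sensitive.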
+2 -4
View File
@@ -2351,11 +2351,10 @@ class APIServerAdapter(BasePlatformAdapter):
)
if agent_ref is not None:
agent_ref[0] = agent
effective_task_id = session_id or str(uuid.uuid4())
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id=effective_task_id,
task_id="default",
)
usage = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2552,11 +2551,10 @@ class APIServerAdapter(BasePlatformAdapter):
)
self._active_run_agents[run_id] = agent
def _run_sync():
effective_task_id = session_id or run_id
r = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id=effective_task_id,
task_id="default",
)
u = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
+13 -209
View File
@@ -416,7 +416,7 @@ def is_host_excluded_by_no_proxy(hostname: str, no_proxy_value: str | None = Non
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple, Union
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple
from enum import Enum
from pathlib import Path as _Path
@@ -981,7 +981,7 @@ def coerce_plaintext_gateway_command(event: "MessageEvent") -> None:
return
@dataclass
@dataclass
class SendResult:
"""Result of sending a message."""
success: bool
@@ -991,45 +991,6 @@ class SendResult:
retryable: bool = False # True for transient connection errors — base will retry automatically
class EphemeralReply(str):
"""System-notice reply that auto-deletes after a TTL.
Slash-command handlers in ``gateway/run.py`` can return this wrapper
instead of a plain string to request that the reply message be deleted
after ``ttl_seconds`` on platforms that support ``delete_message``.
Subclassing ``str`` keeps the wrapper transparent to anything that
treats handler return values as text (existing tests use ``in`` /
``startswith`` / equality; the ``_process_message_background`` pipeline
extracts attachments from the string content). ``isinstance(r,
EphemeralReply)`` still distinguishes ephemeral replies from plain
strings so the send path can schedule deletion.
Platforms that don't override :meth:`BasePlatformAdapter.delete_message`
silently ignore the TTL: the message is sent normally and left in
place. When ``ttl_seconds`` is ``None``, the pipeline uses the
configured ``display.ephemeral_system_ttl`` default. A default of ``0``
disables auto-deletion globally, preserving prior behavior.
"""
ttl_seconds: Optional[int]
def __new__(cls, text: str, ttl_seconds: Optional[int] = None):
instance = super().__new__(cls, text)
instance.ttl_seconds = ttl_seconds
return instance
@property
def text(self) -> str:
"""Return the underlying text.
Provided for call sites that want an explicit string conversion,
though ``str(reply)`` and using ``reply`` directly where a string
is expected both work identically.
"""
return str.__str__(self)
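The docstring's claim that subclassing `str` keeps the wrapper transparent can be checked directly. The sketch below reproduces a trimmed copy of the class (without the `text` property) purely for illustration; the sample message and TTL are made up.

```python
from typing import Optional


class EphemeralReply(str):
    """Trimmed copy of the wrapper: a str that also carries a TTL."""

    ttl_seconds: Optional[int]

    def __new__(cls, text: str, ttl_seconds: Optional[int] = None):
        instance = super().__new__(cls, text)
        instance.ttl_seconds = ttl_seconds
        return instance


reply = EphemeralReply("New session started!", ttl_seconds=30)

# Behaves like ordinary text for existing callers...
print(reply == "New session started!", reply.startswith("New"), "session" in reply)
# ...while the send path can still tell it apart and read the TTL.
print(isinstance(reply, EphemeralReply), reply.ttl_seconds)
```

Because `str` is immutable, the TTL has to be attached in `__new__` rather than `__init__`; operations such as slicing or concatenation return a plain `str` and lose the attribute.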
def merge_pending_message_event(
pending_messages: Dict[str, MessageEvent],
session_key: str,
@@ -1073,11 +1034,6 @@ def merge_pending_message_event(
existing.text = event.text
if existing_is_photo or incoming_is_photo:
existing.message_type = MessageType.PHOTO
elif (
getattr(existing, "message_type", None) == MessageType.TEXT
and event.message_type != MessageType.TEXT
):
existing.message_type = event.message_type
return
if (
@@ -1112,10 +1068,8 @@ _RETRYABLE_ERROR_PATTERNS = (
)
# Type for message handlers. Handlers may return a plain string (normal
# reply), an ``EphemeralReply`` to opt the reply into auto-deletion, or
# ``None`` when the response was already delivered (e.g. via streaming).
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[Union[str, "EphemeralReply"]]]]
# Type for message handlers
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
def resolve_channel_prompt(
@@ -1500,64 +1454,6 @@ class BasePlatformAdapter(ABC):
"""
return False
def _get_ephemeral_system_ttl_default(self) -> int:
"""Read ``display.ephemeral_system_ttl`` from config.
Returns the TTL in seconds to use when an :class:`EphemeralReply`
does not specify one explicitly. ``0`` (the default) disables
auto-deletion. Non-fatal if config is unreadable.
"""
try:
from hermes_cli.config import load_config as _load_config
except Exception:
return 0
try:
cfg = _load_config()
except Exception:
return 0
display = cfg.get("display", {}) if isinstance(cfg, dict) else {}
if not isinstance(display, dict):
return 0
raw = display.get("ephemeral_system_ttl", 0)
try:
return int(raw)
except (TypeError, ValueError):
return 0
def _schedule_ephemeral_delete(
self,
chat_id: str,
message_id: str,
ttl_seconds: int,
) -> None:
"""Spawn a detached task that deletes ``message_id`` after ``ttl_seconds``.
Best-effort: failures (gateway restart, permission denied, message
too old for Telegram's 48h window) are swallowed at debug level.
Does not block the caller.
"""
async def _run_delete() -> None:
try:
await asyncio.sleep(max(1, int(ttl_seconds)))
await self.delete_message(chat_id=chat_id, message_id=message_id)
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug(
"[%s] Ephemeral delete failed for %s/%s: %s",
self.name, chat_id, message_id, e,
)
coro = _run_delete()
try:
asyncio.create_task(coro)
except RuntimeError:
# No running loop (e.g. unit tests that never reach the async
# path). Close the coroutine cleanly so Python doesn't warn
# about it never being awaited, then drop silently.
coro.close()
async def send_slash_confirm(
self,
chat_id: str,
@@ -1593,26 +1489,6 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_private_notice(
self,
chat_id: str,
user_id: Optional[str],
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notice privately when the platform supports it.
The default implementation falls back to a normal send so callers can
use one code path across platforms.
"""
return await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""
Send a typing indicator.
@@ -2167,28 +2043,6 @@ class BasePlatformAdapter(ABC):
lowered = error.lower()
return "timed out" in lowered or "readtimeout" in lowered or "writetimeout" in lowered
def _unwrap_ephemeral(self, response: Any) -> Tuple[Optional[str], int]:
"""Unwrap a handler response into (text, ttl_seconds).
Accepts a plain string, ``None``, or an :class:`EphemeralReply`.
Returns ``(text, ttl)`` where ``ttl > 0`` means the caller should
schedule a deletion via :meth:`_schedule_ephemeral_delete` after
the send succeeds. ``ttl`` is forced to 0 when the adapter
doesn't override :meth:`delete_message` so non-supporting
platforms silently degrade to normal sends.
"""
if isinstance(response, EphemeralReply):
ttl = response.ttl_seconds
if ttl is None:
try:
ttl = int(self._get_ephemeral_system_ttl_default())
except Exception:
ttl = 0
if ttl and ttl > 0 and type(self).delete_message is BasePlatformAdapter.delete_message:
ttl = 0
return response.text, int(ttl or 0)
return response, 0
async def _send_with_retry(
self,
chat_id: str,
@@ -2496,20 +2350,13 @@ class BasePlatformAdapter(ABC):
release_guard=False,
discard_pending=False,
)
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
if response:
await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
content=response,
reply_to=event.message_id,
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
@@ -2589,20 +2436,13 @@ class BasePlatformAdapter(ABC):
try:
_thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
response = await self._message_handler(event)
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
if response:
await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
content=response,
reply_to=event.message_id,
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception as e:
logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
return
@@ -2676,6 +2516,7 @@ class BasePlatformAdapter(ABC):
# Fall back to a new Event only if the entry was removed externally.
interrupt_event = self._active_sessions.get(session_key) or asyncio.Event()
self._active_sessions[session_key] = interrupt_event
callback_generation = getattr(interrupt_event, "_hermes_run_generation", None)
# Start continuous typing indicator (refreshes every 2 seconds)
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
@@ -2708,16 +2549,7 @@ class BasePlatformAdapter(ABC):
# Call the handler (this can take a while with tool calls)
response = await self._message_handler(event)
# Slash-command handlers may return an EphemeralReply sentinel to
# request that their reply message auto-delete after a TTL (used
# for system notices like "✨ New session started!" that the user
# doesn't need to keep in the thread). Unwrap here so all the
# downstream extract_media / text-processing logic sees a plain
# string, and remember the TTL + platform capability so the
# post-send block can schedule the deletion.
response, _ephemeral_ttl = self._unwrap_ephemeral(response)
# Send response if any. A None/empty response is normal when
# streaming already delivered the text (already_sent=True) or
# when the message was queued behind an active agent. Log at
@@ -2806,21 +2638,6 @@ class BasePlatformAdapter(ABC):
)
_record_delivery(result)
# Schedule auto-deletion of system-notice replies.
# Detached so the handler returns immediately; errors
# (permission denied, message too old) are swallowed.
if (
_ephemeral_ttl
and _ephemeral_ttl > 0
and result.success
and result.message_id
):
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=result.message_id,
ttl_seconds=_ephemeral_ttl,
)
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@@ -2998,20 +2815,7 @@ class BasePlatformAdapter(ABC):
finally:
# Fire any one-shot post-delivery callback registered for this
# session (e.g. deferred background-review notifications).
#
# Snapshot the callback generation HERE (after the agent has run),
# not at the top of this task. _hermes_run_generation is set on
# the interrupt event by GatewayRunner._bind_adapter_run_generation
# during _handle_message_with_agent — which happens DURING the
# self._message_handler(event) await above. Snapshotting earlier
# always captured None, which bypassed the generation-ownership
# check in pop_post_delivery_callback and let stale runs fire a
# fresher run's callbacks.
_callback_generation = getattr(
interrupt_event,
"_hermes_run_generation",
None,
)
_callback_generation = callback_generation
if hasattr(self, "pop_post_delivery_callback"):
_post_cb = self.pop_post_delivery_callback(
session_key,
+4 -13
View File
@@ -2851,15 +2851,8 @@ class DiscordAdapter(BasePlatformAdapter):
raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# YAML parses a bare numeric value such as
# `free_response_channels: 1491973769726791812` as int, which was
# previously falling through the isinstance(str) branch and silently
# returning an empty set. str() here accepts whatever scalar the YAML
# loader hands us without changing existing string/CSV semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
return set()
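The scalar-coercion comment in this hunk describes a bare numeric YAML value arriving as `int`. A stand-alone sketch of the two behaviours follows; both function names are hypothetical, and the environment-variable lookup is omitted.

```python
from typing import Any, Set


def parse_ids_coercing(raw: Any) -> Set[str]:
    """Variant that stringifies any scalar before splitting on commas."""
    if isinstance(raw, list):
        return {str(part).strip() for part in raw if str(part).strip()}
    s = str(raw).strip() if raw is not None else ""
    if s:
        return {part.strip() for part in s.split(",") if part.strip()}
    return set()


def parse_ids_str_only(raw: Any) -> Set[str]:
    """Variant that only accepts lists and strings."""
    if isinstance(raw, list):
        return {str(part).strip() for part in raw if str(part).strip()}
    if isinstance(raw, str) and raw.strip():
        return {part.strip() for part in raw.split(",") if part.strip()}
    return set()


channel_id = 1491973769726791812  # what a YAML loader yields for a bare number
print(parse_ids_coercing(channel_id))
print(parse_ids_str_only(channel_id))
```

Quoting the ID in the YAML file makes both variants behave the same.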
def _thread_parent_channel(self, channel: Any) -> Any:
@@ -3085,7 +3078,6 @@ class DiscordAdapter(BasePlatformAdapter):
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an interactive button-based update prompt (Yes / No).
@@ -3095,10 +3087,9 @@ class DiscordAdapter(BasePlatformAdapter):
if not self._client or not DISCORD_AVAILABLE:
return SendResult(success=False, error="Not connected")
try:
target_id = metadata.get("thread_id") if metadata and metadata.get("thread_id") else chat_id
channel = self._client.get_channel(int(target_id))
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(target_id))
channel = await self._client.fetch_channel(int(chat_id))
default_hint = f" (default: {default})" if default else ""
embed = discord.Embed(
+51 -207
View File
@@ -64,7 +64,7 @@ from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Literal, Optional, Sequence
from typing import Any, Dict, List, Optional, Sequence
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
@@ -141,7 +141,6 @@ from gateway.platforms.base import (
)
from gateway.status import acquire_scoped_lock, release_scoped_lock
from hermes_constants import get_hermes_home
from utils import atomic_json_write
logger = logging.getLogger(__name__)
@@ -388,8 +387,6 @@ class FeishuAdapterSettings:
admins: frozenset[str] = frozenset()
default_group_policy: str = ""
group_rules: Dict[str, FeishuGroupRule] = field(default_factory=dict)
allow_bots: str = "none" # "none" | "mentions" | "all"
require_mention: bool = True
@dataclass
@@ -399,7 +396,6 @@ class FeishuGroupRule:
policy: str # "open" | "allowlist" | "blacklist" | "admin_only" | "disabled"
allowlist: set[str] = field(default_factory=set)
blacklist: set[str] = field(default_factory=set)
require_mention: Optional[bool] = None # None = inherit global
@dataclass
@@ -409,40 +405,6 @@ class FeishuBatchState:
counts: Dict[str, int] = field(default_factory=dict)
# ---------------------------------------------------------------------------
# Admission: policy types
# ---------------------------------------------------------------------------
RejectReason = Literal[
"self_echo",
"self_ids_unknown",
"bots_disabled",
"bot_not_mentioned",
"group_policy_rejected",
]
def _is_bot_sender(sender: Any) -> bool:
# receive_v1 docs say {user, bot}; accept "app" defensively.
return getattr(sender, "sender_type", "") in ("bot", "app")
def _sender_identity(sender: Any) -> frozenset:
# Take any non-empty id variant — tenant sender_id_type decides which are populated.
sid = getattr(sender, "sender_id", None)
if sid is None:
return frozenset()
return frozenset(
v for v in (
getattr(sid, "open_id", None),
getattr(sid, "user_id", None),
getattr(sid, "union_id", None),
)
if v
)
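The `_sender_identity` helper collects whichever ID variants are populated. A minimal sketch of the same logic, using `types.SimpleNamespace` objects as stand-ins for the Feishu SDK's sender structures (the ID values are made up):

```python
from types import SimpleNamespace
from typing import Any


def sender_identity(sender: Any) -> frozenset:
    """Return every non-empty ID variant found on sender.sender_id."""
    sid = getattr(sender, "sender_id", None)
    if sid is None:
        return frozenset()
    return frozenset(
        v
        for v in (
            getattr(sid, "open_id", None),
            getattr(sid, "user_id", None),
            getattr(sid, "union_id", None),
        )
        if v
    )


sender = SimpleNamespace(
    sender_id=SimpleNamespace(open_id="ou_1", user_id=None, union_id="on_9")
)
print(sorted(sender_identity(sender)))
```

Returning a set lets a caller test for overlap with the bot's own known IDs regardless of which ID type the tenant populates.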
# ---------------------------------------------------------------------------
# Markdown rendering helpers
# ---------------------------------------------------------------------------
@@ -1415,16 +1377,10 @@ class FeishuAdapter(BasePlatformAdapter):
for chat_id, rule_cfg in raw_group_rules.items():
if not isinstance(rule_cfg, dict):
continue
# Only override when the key is explicitly set — missing vs false
# must not collapse.
per_chat_require_mention: Optional[bool] = None
if "require_mention" in rule_cfg:
per_chat_require_mention = _to_boolean(rule_cfg.get("require_mention"))
group_rules[str(chat_id)] = FeishuGroupRule(
policy=str(rule_cfg.get("policy", "open")).strip().lower(),
allowlist=set(str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()),
blacklist=set(str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()),
require_mention=per_chat_require_mention,
)
# Bot-level admins
@@ -1434,16 +1390,6 @@ class FeishuAdapter(BasePlatformAdapter):
# Default group policy (for groups not in group_rules)
default_group_policy = str(extra.get("default_group_policy", "")).strip().lower()
# Env-only so adapter and gateway auth bypass share one source; yaml
# feishu.allow_bots is bridged to this env var at config load.
allow_bots = os.getenv("FEISHU_ALLOW_BOTS", "none").strip().lower()
if allow_bots not in ("none", "mentions", "all"):
logger.warning(
"[Feishu] Unknown allow_bots=%r, falling back to 'none'. Valid: none, mentions, all.",
allow_bots,
)
allow_bots = "none"
return FeishuAdapterSettings(
app_id=str(extra.get("app_id") or os.getenv("FEISHU_APP_ID", "")).strip(),
app_secret=str(extra.get("app_secret") or os.getenv("FEISHU_APP_SECRET", "")).strip(),
@@ -1500,10 +1446,6 @@ class FeishuAdapter(BasePlatformAdapter):
admins=admins,
default_group_policy=default_group_policy,
group_rules=group_rules,
allow_bots=allow_bots,
require_mention=_to_boolean(
extra.get("require_mention", os.getenv("FEISHU_REQUIRE_MENTION", "true"))
),
)
def _apply_settings(self, settings: FeishuAdapterSettings) -> None:
@@ -1534,8 +1476,6 @@ class FeishuAdapter(BasePlatformAdapter):
self._ws_reconnect_interval = settings.ws_reconnect_interval
self._ws_ping_interval = settings.ws_ping_interval
self._ws_ping_timeout = settings.ws_ping_timeout
self._allow_bots = settings.allow_bots
self._require_mention = settings.require_mention
def _build_event_handler(self) -> Any:
if EventDispatcherHandler is None:
@@ -2249,28 +2189,30 @@ class FeishuAdapter(BasePlatformAdapter):
event = getattr(data, "event", None)
message = getattr(event, "message", None)
sender = getattr(event, "sender", None)
if not message or not sender or not getattr(sender, "sender_id", None):
logger.debug("[Feishu] Dropping malformed inbound event: missing message/sender")
sender_id = getattr(sender, "sender_id", None)
if not message or not sender_id:
logger.debug("[Feishu] Dropping malformed inbound event: missing message or sender_id")
return
message_id = getattr(message, "message_id", None)
if not message_id or self._is_duplicate(message_id):
logger.debug("[Feishu] Dropping duplicate/missing message_id: %s", message_id)
return
reason = self._admit(sender, message)
if reason is not None:
logger.debug("[Feishu] dropping inbound event: %s", reason)
if self._is_self_sent_bot_message(event):
logger.debug("[Feishu] Dropping self-sent bot event: %s", message_id)
return
chat_type = getattr(message, "chat_type", "p2p")
chat_id = getattr(message, "chat_id", "") or ""
if chat_type != "p2p" and not self._should_accept_group_message(message, sender_id, chat_id):
logger.debug("[Feishu] Dropping group message that failed mention/policy gate: %s", message_id)
return
await self._process_inbound_message(
data=data,
message=message,
sender_id=getattr(sender, "sender_id", None),
sender_id=sender_id,
chat_type=chat_type,
message_id=message_id,
is_bot=_is_bot_sender(sender),
)
def _on_message_read_event(self, data: P2ImMessageMessageReadV1) -> None:
@@ -2447,11 +2389,10 @@ class FeishuAdapter(BasePlatformAdapter):
msg = items[0] if items else None
if not msg:
return
# GET im/v1/messages returns sender.id=app_id for bot messages —
# peer bots and us share sender_type="app" but differ on app_id.
sender = getattr(msg, "sender", None)
if str(getattr(sender, "id", "") or "") != self._app_id:
return # only route reactions on this bot's own messages
sender_type = str(getattr(sender, "sender_type", "") or "").lower()
if sender_type != "app":
return # only route reactions on our own bot messages
chat_id = str(getattr(msg, "chat_id", "") or "")
chat_type_raw = str(getattr(msg, "chat_type", "p2p") or "p2p")
if not chat_id:
@@ -2738,7 +2679,6 @@ class FeishuAdapter(BasePlatformAdapter):
sender_id: Any,
chat_type: str,
message_id: str,
is_bot: bool = False,
) -> None:
text, inbound_type, media_urls, media_types, mentions = await self._extract_message_content(message)
@@ -2764,27 +2704,19 @@ class FeishuAdapter(BasePlatformAdapter):
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
sender_primary = (
getattr(sender_id, "open_id", None)
or getattr(sender_id, "user_id", None)
or getattr(sender_id, "union_id", None)
or "<unknown>"
)
logger.info(
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s sender=%s:%s text=%r media=%d",
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s text=%r media=%d",
"dm" if chat_type == "p2p" else "group",
message_id,
inbound_type.value,
getattr(message, "chat_id", "") or "",
"bot" if is_bot else "user",
sender_primary,
text[:120],
len(media_urls),
)
chat_id = getattr(message, "chat_id", "") or ""
chat_info = await self.get_chat_info(chat_id)
sender_profile = await self._resolve_sender_profile(sender_id, is_bot=is_bot)
sender_profile = await self._resolve_sender_profile(sender_id)
source = self.build_source(
chat_id=chat_id,
chat_name=chat_info.get("name") or chat_id or "Feishu Chat",
@@ -2793,7 +2725,6 @@ class FeishuAdapter(BasePlatformAdapter):
user_name=sender_profile["user_name"],
thread_id=getattr(message, "thread_id", None) or None,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
normalized = MessageEvent(
text=text,
@@ -3516,12 +3447,7 @@ class FeishuAdapter(BasePlatformAdapter):
return "dm"
return "group"
async def _resolve_sender_profile(
self,
sender_id: Any,
*,
is_bot: bool = False,
) -> Dict[str, Optional[str]]:
async def _resolve_sender_profile(self, sender_id: Any) -> Dict[str, Optional[str]]:
"""Map Feishu's three-tier user IDs onto Hermes' SessionSource fields.
Preference order for the primary ``user_id`` field:
@@ -3538,11 +3464,7 @@ class FeishuAdapter(BasePlatformAdapter):
union_id = getattr(sender_id, "union_id", None) or None
# Prefer tenant-scoped user_id; fall back to app-scoped open_id.
primary_id = user_id or open_id
# bot/v3/bots/basic_batch only accepts open_id.
name_lookup_id = open_id if is_bot else (primary_id or union_id)
display_name = await self._resolve_sender_name_from_api(
name_lookup_id, is_bot=is_bot,
)
display_name = await self._resolve_sender_name_from_api(primary_id or union_id)
return {
"user_id": primary_id,
"user_name": display_name,
@@ -3562,14 +3484,11 @@ class FeishuAdapter(BasePlatformAdapter):
self._sender_name_cache.pop(sender_id, None)
return None
async def _resolve_sender_name_from_api(
self,
sender_id: Optional[str],
*,
is_bot: bool = False,
) -> Optional[str]:
"""Bots divert to bot/basic_batch — contact API doesn't return bot names.
Failures are silent so the pipeline never blocks on name resolution.
async def _resolve_sender_name_from_api(self, sender_id: Optional[str]) -> Optional[str]:
"""Fetch the sender's display name from the Feishu contact API with a 10-minute cache.
ID-type detection mirrors openclaw: an ``ou_`` prefix means open_id, an ``on_`` prefix means union_id, and anything else is treated as user_id.
Failures are silently suppressed; the message pipeline must not block on name resolution.
"""
if not sender_id or not self._client:
return None
@@ -3579,16 +3498,7 @@ class FeishuAdapter(BasePlatformAdapter):
now = time.time()
cached_name = self._get_cached_sender_name(trimmed)
if cached_name is not None:
return cached_name or None # "" cached means "known nameless"
if is_bot:
names = await self._fetch_bot_names([trimmed])
if names is None:
return None
expire_at = now + _FEISHU_SENDER_NAME_TTL_SECONDS
for oid, name in names.items():
self._sender_name_cache[oid] = (name, expire_at)
hit = self._sender_name_cache.get(trimmed)
return (hit[0] or None) if hit else None
return cached_name
try:
from lark_oapi.api.contact.v3 import GetUserRequest # lazy import
if trimmed.startswith("ou_"):
@@ -3617,35 +3527,6 @@ class FeishuAdapter(BasePlatformAdapter):
logger.debug("[Feishu] Failed to resolve sender name for %s", sender_id, exc_info=True)
return None
async def _fetch_bot_names(self, bot_ids: List[str]) -> Optional[Dict[str, str]]:
if not self._client or not bot_ids:
return None
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/bots/basic_batch")
.queries([("bot_ids", oid) for oid in bot_ids])
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if not content:
return None
payload = json.loads(content)
if payload.get("code") != 0:
return None
bots = (payload.get("data") or {}).get("bots") or {}
return {
oid: str(info.get("name") or "").strip()
for oid, info in bots.items()
if oid
}
except Exception:
logger.debug("[Feishu] Failed to fetch bot names for %s", bot_ids, exc_info=True)
return None
async def _fetch_message_text(self, message_id: str) -> Optional[str]:
if not self._client or not message_id:
return None
@@ -3709,60 +3590,10 @@ class FeishuAdapter(BasePlatformAdapter):
logger.exception("[Feishu] Background inbound processing failed")
# =========================================================================
# Inbound admission
# Group policy and mention gating
# =========================================================================
def _admit(self, sender: Any, message: Any) -> Optional[RejectReason]:
sender_ids = _sender_identity(sender)
self_ids = frozenset(v for v in (self._bot_open_id, self._bot_user_id) if v)
is_bot = _is_bot_sender(sender)
is_group = getattr(message, "chat_type", "p2p") != "p2p"
chat_id = getattr(message, "chat_id", "") or ""
require_mention = is_group and self._require_mention_for(chat_id)
# Defensive only — Feishu doesn't echo our outbound back as inbound,
# and open_id is always populated on both sides.
if self_ids and sender_ids & self_ids:
return "self_echo"
if is_bot:
mode = self._allow_bots
if mode != "mentions" and mode != "all":
return "bots_disabled"
# Defensive: pre-hydration or malformed payloads.
if not self_ids or not sender_ids:
return "self_ids_unknown"
# Step 4 covers mention enforcement for groups when require_mention
# is on; check here only on paths step 4 won't reach.
if mode == "mentions" and not require_mention and not self._mentions_self(message):
return "bot_not_mentioned"
if not is_group:
return None
if not self._allow_group_message(
getattr(sender, "sender_id", None), chat_id, is_bot=is_bot,
):
return "group_policy_rejected"
if require_mention and not self._mentions_self(message):
return "group_policy_rejected"
return None
def _require_mention_for(self, chat_id: str) -> bool:
rule = self._group_rules.get(chat_id) if chat_id else None
if rule and rule.require_mention is not None:
return rule.require_mention
return self._require_mention
# --- Group policy ---------------------------------------------------------
def _allow_group_message(
self,
sender_id: Any,
chat_id: str = "",
*,
is_bot: bool = False,
) -> bool:
def _allow_group_message(self, sender_id: Any, chat_id: str = "") -> bool:
"""Per-group policy gate for non-DM traffic."""
sender_open_id = getattr(sender_id, "open_id", None)
sender_user_id = getattr(sender_id, "user_id", None)
@@ -3781,17 +3612,12 @@ class FeishuAdapter(BasePlatformAdapter):
allowlist = self._allowed_group_users
blacklist = set()
# Channel locks apply to everyone; allowlist/blacklist only gate humans
# (bots were already cleared upstream by FEISHU_ALLOW_BOTS).
if policy == "disabled":
return False
if policy == "open":
return True
if policy == "admin_only":
return False
if is_bot:
return True
if policy == "allowlist":
return bool(sender_ids and (sender_ids & allowlist))
if policy == "blacklist":
@@ -3799,16 +3625,17 @@ class FeishuAdapter(BasePlatformAdapter):
return bool(sender_ids and (sender_ids & self._allowed_group_users))
# --- Mention detection ----------------------------------------------------
def _mentions_self(self, message: Any) -> bool:
# @_all is Feishu's @everyone placeholder.
def _should_accept_group_message(self, message: Any, sender_id: Any, chat_id: str = "") -> bool:
"""Require an explicit @mention before group messages enter the agent."""
if not self._allow_group_message(sender_id, chat_id):
return False
# @_all is Feishu's @everyone placeholder — always route to the bot.
raw_content = getattr(message, "content", "") or ""
if "@_all" in raw_content:
return True
mentions = getattr(message, "mentions", None) or []
if mentions and self._message_mentions_bot(mentions):
return True
if mentions:
return self._message_mentions_bot(mentions)
normalized = normalize_feishu_message(
message_type=getattr(message, "message_type", "") or "",
raw_content=raw_content,
@@ -3817,6 +3644,23 @@ class FeishuAdapter(BasePlatformAdapter):
)
return self._post_mentions_bot(normalized.mentions)
def _is_self_sent_bot_message(self, event: Any) -> bool:
"""Return True only for Feishu events emitted by this Hermes bot."""
sender = getattr(event, "sender", None)
sender_type = str(getattr(sender, "sender_type", "") or "").strip().lower()
if sender_type not in {"bot", "app"}:
return False
sender_id = getattr(sender, "sender_id", None)
sender_open_id = str(getattr(sender_id, "open_id", "") or "").strip()
sender_user_id = str(getattr(sender_id, "user_id", "") or "").strip()
if self._bot_open_id and sender_open_id == self._bot_open_id:
return True
if self._bot_user_id and sender_user_id == self._bot_user_id:
return True
return False
def _message_mentions_bot(self, mentions: List[Any]) -> bool:
# IDs trump names: when both sides have open_id (or both user_id),
# match requires equal IDs. Name fallback only when either side
@@ -3960,7 +3804,7 @@ class FeishuAdapter(BasePlatformAdapter):
recent = self._seen_message_order[-self._dedup_cache_size:]
# Save as {msg_id: timestamp} so TTL filtering works across restarts.
payload = {"message_ids": {k: self._seen_message_ids[k] for k in recent if k in self._seen_message_ids}}
atomic_json_write(self._dedup_state_path, payload, indent=None)
self._dedup_state_path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
except OSError:
logger.warning("[Feishu] Failed to persist dedup state to %s", self._dedup_state_path, exc_info=True)
+2 -3
@@ -13,8 +13,6 @@ import time
from pathlib import Path
from typing import TYPE_CHECKING, Dict
from utils import atomic_json_write
if TYPE_CHECKING:
from gateway.platforms.base import MessageEvent
@@ -239,11 +237,12 @@ class ThreadParticipationTracker:
def _save(self) -> None:
path = self._state_path()
path.parent.mkdir(parents=True, exist_ok=True)
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
atomic_json_write(path, thread_list, indent=None)
path.write_text(json.dumps(thread_list), encoding="utf-8")
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
-12
View File
@@ -534,18 +534,6 @@ class SignalAdapter(BasePlatformAdapter):
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Skip envelopes with no meaningful content (no text, no attachments).
# Catches profile key updates, empty messages, and other metadata-only
# envelopes that still carry a dataMessage wrapper but have nothing
# worth processing. See issue: signal-cli logs "Profile key update" +
# Hermes receives msg='' triggering a full agent turn for nothing.
if (not text or not text.strip()) and not media_urls:
logger.debug(
"Signal: skipping contentless envelope from %s (%d attachments)",
redact_phone(sender), len(media_urls) if media_urls else 0,
)
return
# Build session source
source = self.build_source(
chat_id=chat_id,
+13 -221
View File
@@ -9,7 +9,6 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import contextvars
import json
import logging
import os
@@ -22,7 +21,6 @@ try:
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
import aiohttp
SLACK_AVAILABLE = True
except ImportError:
SLACK_AVAILABLE = False
@@ -52,16 +50,6 @@ from gateway.platforms.base import (
logger = logging.getLogger(__name__)
# ContextVar carrying the user_id of the slash-command invoker.
# Set in _handle_slash_command, read in send() to match the correct
# stashed response_url when multiple users issue commands on the same
# channel concurrently. ContextVars propagate to child asyncio.Tasks
# (Python 3.7+), so the value set in _handle_slash_command's task is
# visible in _process_message_background's child task.
_slash_user_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
"_slash_user_id", default=None,
)
@dataclass
class _ThreadContextCache:
@@ -322,11 +310,6 @@ class SlackAdapter(BasePlatformAdapter):
# Track active assistant thread status indicators so stop_typing can
# clear them (chat_id → thread_ts).
self._active_status_threads: Dict[str, str] = {}
# Slash-command contexts: stash response_url + user_id so send()
# can route the first reply ephemerally. Keyed by
# (channel_id, user_id) to avoid cross-user collisions.
# Each value: {"response_url": str, "ts": float}
self._slash_command_contexts: Dict[Tuple[str, str], Dict[str, Any]] = {}
def _describe_slack_api_error(self, response: Any, *, file_obj: Optional[Dict[str, Any]] = None) -> Optional[str]:
"""Convert Slack API auth/permission failures into actionable user-facing text."""
@@ -385,103 +368,6 @@ class SlackAdapter(BasePlatformAdapter):
)
return None
# ------------------------------------------------------------------
# Slash-command ephemeral helpers
# ------------------------------------------------------------------
_SLASH_CTX_TTL = 120.0 # seconds — response_url is valid for 30 min;
# we use a much shorter TTL to avoid routing unrelated messages
# as ephemeral if the command handler was slow or dropped.
def _pop_slash_context(
self, chat_id: str,
) -> Optional[Dict[str, Any]]:
"""Return and remove the slash-command context for *chat_id*, if fresh.
Contexts older than ``_SLASH_CTX_TTL`` seconds are silently discarded.
Uses the ``_slash_user_id`` ContextVar (set in ``_handle_slash_command``)
to match the exact ``(channel_id, user_id)`` key. This prevents a
concurrent slash command from a different user on the same channel from
stealing another user's ephemeral context. Falls back to a
channel-only scan when the ContextVar is unset, i.e. when send() is
called outside a slash-command async context.
"""
now = time.monotonic()
# Clean up stale entries on every lookup — dict is small.
stale_keys = [
k for k, v in self._slash_command_contexts.items()
if now - v["ts"] > self._SLASH_CTX_TTL
]
for k in stale_keys:
self._slash_command_contexts.pop(k, None)
# Precise match: (channel_id, user_id) from ContextVar.
uid = _slash_user_id.get()
if uid:
return self._slash_command_contexts.pop((chat_id, uid), None)
# Fallback: channel-only scan (only reachable when ContextVar is
# unset, i.e. send() called outside a slash-command async context).
match_key = None
for key in list(self._slash_command_contexts):
if key[0] == chat_id:
match_key = key
break
if match_key is None:
return None
return self._slash_command_contexts.pop(match_key)
async def _send_slash_ephemeral(
self,
ctx: Dict[str, Any],
content: str,
) -> "SendResult":
"""Replace the initial ephemeral ack via ``response_url``.
Slack's ``response_url`` accepts a POST with ``replace_original``
for up to 30 minutes after the slash command was invoked. This
lets us swap the "Running /cmd…" placeholder with the real reply,
and the message stays ephemeral ("Only visible to you").
Falls back to a successful ``SendResult`` if the POST fails: the user
has already seen the initial ack, so a delivery failure here is
non-critical.
"""
formatted = self.format_message(content)
# Slack's response_url has the same ~40k char limit as chat_postMessage.
# Truncate to MAX_MESSAGE_LENGTH and use only the first chunk — the
# response_url replaces a single ephemeral ack, so multi-chunk isn't
# possible. Long responses are rare for command replies.
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
text = chunks[0] if chunks else formatted
payload = {
"response_type": "ephemeral",
"replace_original": True,
"text": text,
}
try:
async with aiohttp.ClientSession() as session:
async with session.post(
ctx["response_url"],
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=None)
body = await resp.text()
logger.warning(
"[Slack] response_url POST returned %s: %s",
resp.status,
body[:200],
)
except Exception as e:
logger.warning(
"[Slack] response_url POST failed: %s", e,
)
# Non-fatal — the user saw the initial ack already.
return SendResult(success=True, message_id=None)
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
if not SLACK_AVAILABLE:
@@ -560,16 +446,12 @@ class SlackAdapter(BasePlatformAdapter):
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Handle app_mention explicitly. In some Slack app configurations,
# channel mentions arrive only as app_mention events rather than the
# generic message event. Forward them into the normal message
# pipeline so @mentions reliably produce replies.
# NOTE: when Slack fires BOTH message and app_mention for the same
# @mention, they share the same event ts — the dedup in
# _handle_slack_message (MessageDeduplicator) suppresses the second.
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
await self._handle_slack_message(event)
pass
# File lifecycle events can arrive around snippet uploads even when
# the actual user message is what we care about. Ack them so Slack
@@ -620,11 +502,7 @@ class SlackAdapter(BasePlatformAdapter):
@self._app.command(_slash_pattern)
async def handle_hermes_command(ack, command):
slash = (command.get("command") or "").lstrip("/")
await ack(
response_type="ephemeral",
text=f"Running `/{slash}`…",
)
await ack()
await self._handle_slash_command(command)
# Register Block Kit action handlers for approval buttons
@@ -696,17 +574,6 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Not connected")
try:
# Check for a pending slash-command context. When the user ran a
# native slash command (e.g. /q, /stop, /model), the initial ack
# already showed an ephemeral "Running /cmd…" message. If we have
# a stashed response_url for this channel, replace that ack with
# the actual command reply ephemerally instead of posting publicly.
slash_ctx = self._pop_slash_context(chat_id)
if slash_ctx:
return await self._send_slash_ephemeral(
slash_ctx, content,
)
# Convert standard markdown → Slack mrkdwn
formatted = self.format_message(content)
@@ -734,10 +601,6 @@ class SlackAdapter(BasePlatformAdapter):
last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)
# Clear Slack Assistant status as soon as the final message is posted.
if thread_ts:
await self.stop_typing(chat_id)
# Track the sent message ts so we can auto-respond to thread
# replies without requiring @mention.
sent_ts = last_result.get("ts") if last_result else None
@@ -761,42 +624,6 @@ class SlackAdapter(BasePlatformAdapter):
logger.error("[Slack] Send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def send_private_notice(
self,
chat_id: str,
user_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a Slack ephemeral message visible only to one user."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not chat_id or not user_id:
return SendResult(success=False, error="chat_id and user_id are required")
try:
formatted = self.format_message(content)
thread_ts = self._resolve_thread_ts(reply_to, metadata)
kwargs = {
"channel": chat_id,
"user": user_id,
"text": formatted,
"mrkdwn": True,
}
if thread_ts:
kwargs["thread_ts"] = thread_ts
result = await self._get_client(chat_id).chat_postEphemeral(**kwargs)
return SendResult(
success=True,
message_id=result.get("message_ts") or result.get("ts"),
raw_response=result,
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Ephemeral send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def edit_message(
self,
chat_id: str,
@@ -815,8 +642,6 @@ class SlackAdapter(BasePlatformAdapter):
ts=message_id,
text=formatted,
)
if finalize:
await self.stop_typing(chat_id)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
@@ -857,7 +682,7 @@ class SlackAdapter(BasePlatformAdapter):
# in an assistant-enabled context. Falls back to reactions.
logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)
async def stop_typing(self, chat_id: str, metadata=None) -> None:
async def stop_typing(self, chat_id: str) -> None:
"""Clear the assistant thread status indicator."""
if not self._app:
return
@@ -1144,7 +969,7 @@ class SlackAdapter(BasePlatformAdapter):
return _ph(f'<{url}|{label}>')
text = re.sub(
r'(?<!!)\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
r'\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
_convert_markdown_link,
text,
)
@@ -1191,11 +1016,9 @@ class SlackAdapter(BasePlatformAdapter):
)
# 10) Convert italic: _text_ stays as _text_ (already Slack italic)
# Single *text* → _text_ (Slack italic), but only when the
# emphasized text touches non-whitespace on both sides so literal
# delimiters like "a * b * c" are preserved.
# Single *text* → _text_ (Slack italic)
text = re.sub(
r'(?<!\*)\*(\S(?:[^*\n]*?\S)?)\*(?!\*)',
r'(?<!\*)\*([^*\n]+)\*(?!\*)',
lambda m: _ph(f'_{m.group(1)}_'),
text,
)
@@ -2701,14 +2524,9 @@ class SlackAdapter(BasePlatformAdapter):
# gateway command dispatcher by prepending the slash.
text = f"/{slash_name} {text}".strip()
# Slack slash commands can originate from DMs or shared channels.
# Preserve DM semantics only for DM channel IDs; shared channels must
# keep group semantics so different users do not collide into one
# session key.
is_dm = str(channel_id).startswith("D")
source = self.build_source(
chat_id=channel_id,
chat_type="dm" if is_dm else "group",
chat_type="dm", # Slash commands are always in DM-like context
user_id=user_id,
)
@@ -2719,26 +2537,7 @@ class SlackAdapter(BasePlatformAdapter):
raw_message=command,
)
# Stash the Slack response_url so the first reply for this
# channel+user can be routed ephemerally (replaces the initial
# "Running /cmd…" ack shown by handle_hermes_command).
# Only stash for COMMAND events (text starts with "/") — free-form
# questions via "/hermes <question>" must produce public replies so
# the whole channel can see the agent's answer.
response_url = command.get("response_url", "")
if response_url and user_id and channel_id and text.startswith("/"):
self._slash_command_contexts[(channel_id, user_id)] = {
"response_url": response_url,
"ts": time.monotonic(),
}
# Set the ContextVar so send() can match the correct stashed
# response_url even when multiple users slash concurrently.
_slash_user_id_token = _slash_user_id.set(user_id or None)
try:
await self.handle_message(event)
finally:
_slash_user_id.reset(_slash_user_id_token)
await self.handle_message(event)
def _has_active_session_for_thread(
self,
@@ -2899,13 +2698,6 @@ class SlackAdapter(BasePlatformAdapter):
raw = os.getenv("SLACK_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
# Coerce non-list scalars (str/int/float) to str before splitting.
# A bare numeric YAML value (`free_response_channels: 1234567890`) is
# loaded as int and was previously falling through the isinstance(str)
# branch to return an empty set. str() here accepts whatever scalar
# the YAML loader hands us without changing existing string/CSV
# semantics.
s = str(raw).strip() if raw is not None else ""
if s:
return {part.strip() for part in s.split(",") if part.strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
return set()
+7 -88
View File
@@ -290,53 +290,14 @@ class TelegramAdapter(BasePlatformAdapter):
# and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
self._slash_confirm_state: Dict[str, str] = {}
def _is_callback_user_authorized(
self,
user_id: str,
*,
chat_id: Optional[str] = None,
chat_type: Optional[str] = None,
thread_id: Optional[str] = None,
user_name: Optional[str] = None,
) -> bool:
@staticmethod
def _is_callback_user_authorized(user_id: str) -> bool:
"""Return whether a Telegram inline-button caller may perform gated actions."""
normalized_user_id = str(user_id or "").strip()
if not normalized_user_id:
return False
runner = getattr(getattr(self, "_message_handler", None), "__self__", None)
auth_fn = getattr(runner, "_is_user_authorized", None)
if callable(auth_fn):
try:
from gateway.session import SessionSource
normalized_chat_type = str(chat_type or "dm").strip().lower() or "dm"
if normalized_chat_type == "private":
normalized_chat_type = "dm"
elif normalized_chat_type == "supergroup":
normalized_chat_type = "forum" if thread_id is not None else "group"
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id=str(chat_id or normalized_user_id),
chat_type=normalized_chat_type,
user_id=normalized_user_id,
user_name=str(user_name).strip() if user_name else None,
thread_id=str(thread_id) if thread_id is not None else None,
)
return bool(auth_fn(source))
except Exception:
logger.debug(
"[Telegram] Falling back to env-only callback auth for user %s",
normalized_user_id,
exc_info=True,
)
allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
if not allowed_csv:
return True
allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
return "*" in allowed_ids or normalized_user_id in allowed_ids
return "*" in allowed_ids or user_id in allowed_ids
@classmethod
def _metadata_thread_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[str]:
@@ -761,20 +722,6 @@ class TelegramAdapter(BasePlatformAdapter):
# Persist thread_id to config so we don't recreate on next restart
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
# Send a seed message so the topic is visible in Telegram's client.
# Empty topics are hidden by the client UI until they contain a message.
try:
await self._bot.send_message(
chat_id=int(chat_id),
message_thread_id=thread_id,
text=f"\U0001f4cc {topic_name}",
)
except Exception as seed_err:
logger.debug(
"[%s] Could not send seed message to topic '%s': %s",
self.name, topic_name, seed_err,
)
async def connect(self) -> bool:
"""Connect to Telegram via polling or webhook.
@@ -1374,7 +1321,6 @@ class TelegramAdapter(BasePlatformAdapter):
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an inline-keyboard update prompt (Yes / No buttons).
@@ -1392,14 +1338,11 @@ class TelegramAdapter(BasePlatformAdapter):
InlineKeyboardButton("✗ No", callback_data="update_prompt:n"),
]
])
thread_id = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_send(thread_id)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
message_thread_id=message_thread_id,
**self._link_preview_kwargs(),
)
return SendResult(success=True, message_id=str(msg.message_id))
@@ -1817,12 +1760,6 @@ class TelegramAdapter(BasePlatformAdapter):
if not query or not query.data:
return
data = query.data
query_message = getattr(query, "message", None)
query_chat_id = getattr(query_message, "chat_id", None)
query_chat = getattr(query_message, "chat", None)
query_chat_type = getattr(query_chat, "type", None)
query_thread_id = getattr(query_message, "message_thread_id", None)
query_user_name = getattr(query.from_user, "first_name", None)
# --- Model picker callbacks ---
if data.startswith(("mp:", "mm:", "mb", "mx", "mg:")):
@@ -1844,13 +1781,7 @@ class TelegramAdapter(BasePlatformAdapter):
# Only authorized users may click approval buttons.
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to approve commands.")
return
@@ -1900,14 +1831,8 @@ class TelegramAdapter(BasePlatformAdapter):
choice = parts[1] # once, always, cancel
confirm_id = parts[2]
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to answer this prompt.")
return
@@ -1966,13 +1891,7 @@ class TelegramAdapter(BasePlatformAdapter):
return
answer = data.split(":", 1)[1] # "y" or "n"
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to answer update prompts.")
return
await query.answer(text=f"Sent '{answer}' to the update process.")
+4 -6
View File
@@ -1896,12 +1896,10 @@ class OwnerCommandMiddleware(InboundMiddleware):
if cmd not in cls.ALLOWLIST:
return None, None, False
# Sender identity check: bot owner <-> push.from_account == push.bot_owner_id.
# The allowlisted commands (/approve, /deny, /stop, /reset, ...) are
# privileged — leaking them to non-owners lets any group member approve
# a dangerous tool call, kill the owner's task, or wipe session state.
owner_id = str((push or {}).get("bot_owner_id") or "").strip()
is_owner = bool(owner_id) and owner_id == from_account
# Sender identity check: bot owner <-> push.from_account == push.bot_owner_id
# owner_id = (push or {}).get("bot_owner_id") or ""
# is_owner = bool(owner_id) and owner_id == from_account
is_owner = True
return cmd, cmd_line, is_owner
async def handle(self, ctx: InboundContext, next_fn) -> None:
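Reviewer note: the hunk above drops the comparison of the sender against `push.bot_owner_id` and sets `is_owner = True` unconditionally. For reference, here is a minimal sketch of the comparison as a standalone helper. It assumes `push` is a dict-like payload and `from_account` is a string; the helper name `is_bot_owner` is hypothetical and does not appear in the diff.

```python
from typing import Any, Mapping, Optional


def is_bot_owner(push: Optional[Mapping[str, Any]], from_account: str) -> bool:
    """Return True only when the sender matches a non-empty bot_owner_id.

    Hypothetical helper; mirrors the comparison shown in the removed lines.
    """
    owner_id = str((push or {}).get("bot_owner_id") or "").strip()
    # An empty owner id must never match, even against an empty sender.
    return bool(owner_id) and owner_id == from_account
```

The `bool(owner_id)` guard is the important part: without it, a payload with no owner id and a sender with an empty account string would compare equal.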
+80 -1128
View File
File diff suppressed because it is too large
-12
View File
@@ -458,15 +458,6 @@ class SessionEntry:
was_auto_reset: bool = False
auto_reset_reason: Optional[str] = None # "idle" or "daily"
reset_had_activity: bool = False # whether the expired session had any messages
# Set by reset_session() when the user explicitly sends /new or /reset.
# Consumed once by _handle_message_with_agent to trigger topic/channel
# skill re-injection on the first message of the new session. We can't
# reuse was_auto_reset for this because that flag fires the "session
# expired due to inactivity" user-facing notice and a misleading
# context-note prepend — both wrong for an explicit manual reset.
# See issue #6508.
is_fresh_reset: bool = False
# Set by the background expiry watcher after it finalizes an expired
# session (invoking on_session_finalize hooks and evicting the cached
@@ -517,7 +508,6 @@ class SessionEntry:
if self.last_resume_marked_at
else None
),
"is_fresh_reset": self.is_fresh_reset,
}
if self.origin:
result["origin"] = self.origin.to_dict()
@@ -566,7 +556,6 @@ class SessionEntry:
resume_pending=data.get("resume_pending", False),
resume_reason=data.get("resume_reason"),
last_resume_marked_at=last_resume_marked_at,
is_fresh_reset=data.get("is_fresh_reset", False),
)
@@ -1143,7 +1132,6 @@ class SessionStore:
display_name=old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
is_fresh_reset=True,
)
self._entries[session_key] = new_entry
+4 -8
View File
@@ -21,7 +21,6 @@ from datetime import datetime, timezone
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Any, Optional
from utils import atomic_json_write
if sys.platform == "win32":
import msvcrt
@@ -35,10 +34,6 @@ _IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
_GATEWAY_LOCK_FILENAME = "gateway.lock"
_gateway_lock_handle = None
# Windows byte-range locks are mandatory for other readers. Lock a byte well
# past the JSON payload so runtime status / PID readers can still read the file
# while another process holds the mutual-exclusion lock.
_WINDOWS_LOCK_OFFSET = 1024 * 1024
def _get_pid_path() -> Path:
@@ -210,7 +205,8 @@ def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
def _write_json_file(path: Path, payload: dict[str, Any]) -> None:
atomic_json_write(path, payload, indent=None, separators=(",", ":"))
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(payload))
def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
@@ -290,7 +286,7 @@ def _try_acquire_file_lock(handle) -> bool:
if handle.tell() == 0:
handle.write("\n")
handle.flush()
handle.seek(_WINDOWS_LOCK_OFFSET)
handle.seek(0)
msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
@@ -302,7 +298,7 @@ def _try_acquire_file_lock(handle) -> bool:
def _release_file_lock(handle) -> None:
try:
if _IS_WINDOWS:
handle.seek(_WINDOWS_LOCK_OFFSET)
handle.seek(0)
msvcrt.locking(handle.fileno(), msvcrt.LK_UNLCK, 1)
else:
fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
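Reviewer note: the removed comment in this file explains that Windows byte-range locks block other readers, which is why the lock byte was placed well past the JSON payload. The sketch below illustrates that idea in isolation. The function names `try_lock` and `unlock` and the constant `LOCK_OFFSET` are illustrative, not the module's own; the offset value mirrors the removed `_WINDOWS_LOCK_OFFSET`.

```python
import sys

_IS_WINDOWS = sys.platform == "win32"
if _IS_WINDOWS:
    import msvcrt
else:
    import fcntl

# Assumed value, mirroring the removed constant: far past any JSON payload.
LOCK_OFFSET = 1024 * 1024


def try_lock(handle) -> bool:
    """Try to take a non-blocking exclusive lock; return False if it is held."""
    try:
        if _IS_WINDOWS:
            # Lock a single byte at LOCK_OFFSET so the payload bytes at the
            # start of the file are outside the locked range.
            handle.seek(LOCK_OFFSET)
            msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
        else:
            fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False


def unlock(handle) -> None:
    """Release the lock taken by try_lock."""
    if _IS_WINDOWS:
        handle.seek(LOCK_OFFSET)
        msvcrt.locking(handle.fileno(), msvcrt.LK_UNLCK, 1)
    else:
        fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
```

With `handle.seek(0)`, as in the replacement lines, the locked byte on Windows is the first byte of the file, so it overlaps whatever a status or PID reader would read from the start.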
+2 -2
View File
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.12.0"
__release_date__ = "2026.4.30"
__version__ = "0.11.0"
__release_date__ = "2026.4.23"
+5 -5
View File
@@ -43,7 +43,7 @@ import yaml
from hermes_cli.config import get_hermes_home, get_config_path, read_raw_config
from hermes_constants import OPENROUTER_BASE_URL
from utils import atomic_replace, atomic_yaml_write, is_truthy_value
from utils import atomic_replace
logger = logging.getLogger(__name__)
@@ -2480,8 +2480,8 @@ def _resolve_verify(
tls_state = tls_state if isinstance(tls_state, dict) else {}
effective_insecure = (
is_truthy_value(insecure, default=False) if insecure is not None
else is_truthy_value(tls_state.get("insecure", False), default=False)
bool(insecure) if insecure is not None
else bool(tls_state.get("insecure", False))
)
effective_ca = (
ca_bundle
@@ -3653,7 +3653,7 @@ def _update_config_for_provider(
config["model"] = model_cfg
atomic_yaml_write(config_path, config, sort_keys=False)
config_path.write_text(yaml.safe_dump(config, sort_keys=False))
return config_path
@@ -3712,7 +3712,7 @@ def _reset_config_provider() -> Path:
model["provider"] = "auto"
if "base_url" in model:
model["base_url"] = OPENROUTER_BASE_URL
atomic_yaml_write(config_path, config, sort_keys=False)
config_path.write_text(yaml.safe_dump(config, sort_keys=False))
return config_path
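Reviewer note: the hunks above swap `atomic_yaml_write(...)` for a direct `config_path.write_text(...)`. The implementation of `atomic_yaml_write` is not part of this diff, so the following is only a sketch of the usual write-to-temp-then-rename pattern, using JSON to stay within the standard library. The name `atomic_write_json` is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_write_json(path: Path, payload: Any) -> None:
    """Write JSON to a temp file in the same directory, then rename over path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step when both paths are on the
        # same filesystem, so readers see either the old or the new content.
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A plain `write_text` truncates the target before writing, so a crash or a concurrent reader in that window can observe an empty or partial config file. The temp-and-rename pattern avoids that window.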
+1 -25
View File
@@ -19,8 +19,6 @@ from collections.abc import Callable, Mapping
from dataclasses import dataclass
from typing import Any
from utils import is_truthy_value
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
@@ -68,7 +66,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True),
CommandDef("history", "Show conversation history", "Session",
cli_only=True),
CommandDef("recap", "Summarize recent activity in this session", "Session"),
CommandDef("save", "Save the current conversation", "Session",
cli_only=True),
CommandDef("retry", "Retry the last message (resend to agent)", "Session"),
@@ -96,8 +93,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
aliases=("q",), args_hint="<prompt>"),
CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
args_hint="<prompt>"),
CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
args_hint="[text | pause | resume | clear | status]"),
CommandDef("status", "Show session info", "Session"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
@@ -156,11 +151,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("curator", "Background skill maintenance (status, run, pin, archive)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("status", "run", "pause", "resume", "pin", "unpin", "restore")),
CommandDef("kanban", "Multi-profile collaboration board (tasks, links, comments)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("list", "ls", "show", "create", "assign", "link", "unlink",
"claim", "comment", "complete", "block", "unblock", "archive",
"tail", "dispatch", "context", "init", "gc")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
cli_only=True),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
@@ -320,7 +310,6 @@ ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
"new",
"profile",
"queue",
"recap",
"restart",
"status",
"steer",
@@ -377,7 +366,7 @@ def _resolve_config_gates() -> set[str]:
else:
val = None
break
if is_truthy_value(val, default=False):
if val:
result.add(cmd.name)
return result
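Reviewer note: this hunk, like the `_resolve_verify` hunk earlier, replaces `is_truthy_value(val, default=False)` with a plain truthiness test. The two can disagree for string values, because any non-empty string is truthy in Python. The real `utils.is_truthy_value` is not shown in this diff; `parse_truthy` below is a hypothetical stand-in, and its accepted spellings are an assumption.

```python
from typing import Any

# Assumed set of spellings treated as "true"; the real helper may differ.
_TRUE_STRINGS = {"1", "true", "yes", "on"}


def parse_truthy(value: Any, default: bool = False) -> bool:
    """Interpret config-style values, treating strings like "false" as False."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower() in _TRUE_STRINGS
    return bool(value)
```

If a gate is set via an environment variable or a quoted YAML scalar such as `"false"`, `if val:` would enable it, while a string-aware parser would not.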
@@ -840,13 +829,6 @@ def discord_skill_commands_by_category(
_SLACK_MAX_SLASH_COMMANDS = 50
_SLACK_NAME_LIMIT = 32
_SLACK_INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
_SLACK_RESERVED_COMMANDS = frozenset({
# Built-in Slack slash commands that cannot be registered by apps.
# https://slack.com/help/articles/201259356-Use-built-in-slash-commands
"me", "status", "away", "dnd", "shrug", "remind", "msg", "feed",
"who", "collapse", "expand", "leave", "join", "open", "search",
"topic", "mute", "pro", "shortcuts",
})
def _sanitize_slack_name(raw: str) -> str:
@@ -873,10 +855,6 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
documented form (e.g. ``/background``, ``/bg``, and ``/btw`` all work).
Plugin-registered slash commands are included too.
Commands whose sanitized name collides with a Slack built-in
(e.g. ``/status``, ``/me``, ``/join``) are silently skipped. Users
can still reach them via ``/hermes <command>``.
Results are clamped to Slack's 50-command limit with duplicate-name
avoidance. ``/hermes`` is always reserved as the first entry so the
legacy ``/hermes <subcommand>`` form keeps working for anything that
@@ -894,8 +872,6 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
slack_name = _sanitize_slack_name(name)
if not slack_name or slack_name in seen:
return
if slack_name in _SLACK_RESERVED_COMMANDS:
return
if len(entries) >= _SLACK_MAX_SLASH_COMMANDS:
return
# Slack description cap is 2000 chars; keep it short.
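Reviewer note: the hunks above remove `_SLACK_RESERVED_COMMANDS` and the check that skipped names colliding with Slack built-ins. The sketch below shows the filtering order (sanitize, dedupe, skip reserved, clamp) in isolation. The body of `_sanitize_slack_name` is not visible in this diff, so `sanitize` here is an assumed approximation, and `RESERVED` is only a small subset of the removed set.

```python
import re
from typing import Iterable, List

MAX_COMMANDS = 50
NAME_LIMIT = 32
INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
# Subset of the removed reserved set, for illustration only.
RESERVED = frozenset({"me", "status", "join"})


def sanitize(raw: str) -> str:
    # Assumed behavior: lowercase, drop invalid characters, clamp the length.
    return INVALID_CHARS.sub("", raw.lower())[:NAME_LIMIT]


def registrable_names(names: Iterable[str]) -> List[str]:
    """Return sanitized names that are unique, non-reserved, and within the cap."""
    seen = set()
    out: List[str] = []
    for name in names:
        slack_name = sanitize(name)
        if not slack_name or slack_name in seen or slack_name in RESERVED:
            continue
        if len(out) >= MAX_COMMANDS:
            break
        seen.add(slack_name)
        out.append(slack_name)
    return out
```

Without the reserved-name check, a command such as `status` would be submitted for registration even though, per the removed comment, Slack does not let apps register its built-in slash commands.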
+2 -81
View File
@@ -457,7 +457,6 @@ DEFAULT_CONFIG = {
# remains available as a tool regardless of this setting — the routing
# only controls how inbound user images are presented.
"image_input_mode": "auto",
"disabled_toolsets": [],
},
"terminal": {
@@ -607,24 +606,6 @@ DEFAULT_CONFIG = {
"max_line_length": 2000,
},
# Tool loop guardrails nudge models when they repeat failed or
# non-progressing tool calls. Soft warnings are always-on by default;
# hard stops are opt-in so interactive CLI/TUI sessions keep flowing.
"tool_loop_guardrails": {
"warnings_enabled": True,
"hard_stop_enabled": False,
"warn_after": {
"exact_failure": 2,
"same_tool_failure": 3,
"idempotent_no_progress": 2,
},
"hard_stop_after": {
"exact_failure": 5,
"same_tool_failure": 8,
"idempotent_no_progress": 5,
},
},
"compression": {
"enabled": True,
"threshold": 0.50, # compress when context usage exceeds this ratio
@@ -775,14 +756,6 @@ DEFAULT_CONFIG = {
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_progress_overrides": {}, # DEPRECATED — use display.platforms instead
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
# Auto-delete system-notice replies (e.g. "✨ New session started!",
# "♻ Restarting gateway…", "⚡ Stopped…") after N seconds on platforms
# that support message deletion (currently Telegram; other platforms
# ignore and leave the message in place). Only affects slash-command
# replies wrapped with gateway.platforms.base.EphemeralReply — agent
# responses and content messages are never touched. Default 0
# (disabled) preserves prior behavior.
"ephemeral_system_ttl": 0,
"platforms": {}, # Per-platform display overrides: {"telegram": {"tool_progress": "all"}, "slack": {"tool_progress": "off"}}
# Gateway runtime-metadata footer appended to the FINAL message of a turn
# (disabled by default to keep replies minimal). When enabled, renders
@@ -952,23 +925,7 @@ DEFAULT_CONFIG = {
# injected at the start of every API call for few-shot priming.
# Never saved to sessions, logs, or trajectories.
"prefill_messages_file": "",
# Goals — persistent cross-turn goals (Ralph-style loop).
# After every turn, a lightweight judge call asks the auxiliary model
# whether the active /goal is satisfied by the assistant's last
# response. If not, Hermes feeds a continuation prompt back into the
# same session and keeps working until the goal is done, the turn
# budget is exhausted, or the user pauses/clears it. Judge failures
# fail OPEN (continue) so a flaky judge never wedges progress — the
# turn budget is the real backstop.
"goals": {
# Max continuation turns before Hermes auto-pauses the goal and
# asks the user to /goal resume. Protects against judge false
# negatives (goal actually done but judge says continue) and
# unbounded model spend on fuzzy / unachievable goals.
"max_turns": 20,
},
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
@@ -1022,14 +979,6 @@ DEFAULT_CONFIG = {
# Archive a skill (move to skills/.archive/) after this many days
# without use. Archived skills are recoverable — no auto-deletion.
"archive_after_days": 90,
# Pre-run backup: before every real curator pass (dry-run is
# skipped), snapshot ~/.hermes/skills/ into
# ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz so the
# user can roll back with `hermes curator rollback`.
"backup": {
"enabled": True,
"keep": 5, # retain last N regular snapshots
},
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -1155,24 +1104,6 @@ DEFAULT_CONFIG = {
"max_parallel_jobs": None,
},
# Kanban multi-agent coordination — controls the dispatcher loop that
# spawns workers for ready tasks. The dispatcher ticks every N seconds
# (default 60), reclaims stale claims, promotes dependency-satisfied
# todos to ready, and fires `hermes -p <assignee> chat -q ...` for
# each claimable ready task. One dispatcher per profile is sufficient;
# running more than one on the same kanban.db will race for claims.
"kanban": {
# Run the dispatcher inside the gateway process. On by default —
# the cost is ~300µs every `dispatch_interval_seconds` when idle,
# and gateway is the supervisor users already have. Set to false
# only if you run the dispatcher as a separate systemd unit or
# don't want the gateway to spawn workers.
"dispatch_in_gateway": True,
# Seconds between dispatcher ticks (idle or not). Lower = snappier
# pickup of newly-ready tasks; higher = less SQL pressure.
"dispatch_interval_seconds": 60,
},
# execute_code settings — controls the tool used for programmatic tool calls.
"code_execution": {
# Execution mode:
@@ -2469,17 +2400,7 @@ def get_missing_skill_config_vars() -> List[Dict[str, Any]]:
except Exception:
return []
try:
all_vars = discover_all_skill_config_vars()
except Exception as e:
# A malformed SKILL.md, unreadable external skill dir, or similar
# should never break `hermes update`. Skill-config prompting is a
# post-migration nicety, not a blocker.
import logging
logging.getLogger(__name__).debug(
"discover_all_skill_config_vars failed: %s", e
)
return []
all_vars = discover_all_skill_config_vars()
if not all_vars:
return []
+7 -181
View File
@@ -108,49 +108,6 @@ def _cmd_status(args) -> int:
f"last_activity={last}"
)
# Show top 5 most-active and least-active skills by activity_count
# (use + view + patch). This is a different signal from
# least-recently-active: activity_count reflects frequency,
# last_activity_at reflects recency. A skill touched 30 times a year
# ago is high-frequency but stale; a skill touched once yesterday is
# recent but low-frequency. Both can matter.
active_all = by_state.get("active", [])
if active_all:
most_active = sorted(
active_all,
key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
reverse=True,
)[:5]
if most_active and (most_active[0].get("activity_count") or 0) > 0:
print("\nmost active (top 5):")
for r in most_active:
last = _fmt_ts(r.get("last_activity_at"))
print(
f" {r['name']:40s} "
f"activity={r.get('activity_count', 0):3d} "
f"use={r.get('use_count', 0):3d} "
f"view={r.get('view_count', 0):3d} "
f"patches={r.get('patch_count', 0):3d} "
f"last_activity={last}"
)
least_active = sorted(
active_all,
key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
)[:5]
if least_active:
print("\nleast active (top 5):")
for r in least_active:
last = _fmt_ts(r.get("last_activity_at"))
print(
f" {r['name']:40s} "
f"activity={r.get('activity_count', 0):3d} "
f"use={r.get('use_count', 0):3d} "
f"view={r.get('view_count', 0):3d} "
f"patches={r.get('patch_count', 0):3d} "
f"last_activity={last}"
)
return 0
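Reviewer note: the removed block ranks active skills by `activity_count`, breaking ties on `last_activity_at`. A compact sketch of that ordering follows. It assumes each record is a dict and that `last_activity_at` is an ISO-style string that sorts chronologically; the helper names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def activity_key(row: Row) -> Tuple[int, str]:
    # Missing counts sort as 0 and missing timestamps as the empty string.
    return (row.get("activity_count") or 0, row.get("last_activity_at") or "")


def most_active(rows: List[Row], n: int = 5) -> List[Row]:
    return sorted(rows, key=activity_key, reverse=True)[:n]


def least_active(rows: List[Row], n: int = 5) -> List[Row]:
    return sorted(rows, key=activity_key)[:n]
```

Frequency is the primary key and recency only breaks ties, which matches the removed comment's point that the two signals differ.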
@@ -160,11 +117,7 @@ def _cmd_run(args) -> int:
print("curator: disabled via config; enable with `curator.enabled: true`")
return 1
dry = bool(getattr(args, "dry_run", False))
if dry:
print("curator: running DRY-RUN (report only, no mutations)...")
else:
print("curator: running review pass...")
print("curator: running review pass...")
def _on_summary(msg: str) -> None:
print(msg)
@@ -172,29 +125,17 @@ def _cmd_run(args) -> int:
result = curator.run_curator_review(
on_summary=_on_summary,
synchronous=bool(args.synchronous),
dry_run=dry,
)
auto = result.get("auto_transitions", {})
if auto:
if dry:
print(
f"auto (preview): {auto.get('checked', 0)} candidate skill(s) "
"— no transitions applied in dry-run"
)
else:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if not args.synchronous:
print("llm pass running in background — check `hermes curator status` later")
if dry:
print(
"dry-run: no changes applied. When the report lands, read it with "
"`hermes curator status` and run `hermes curator run` (no flag) to apply."
)
return 0
@@ -245,86 +186,6 @@ def _cmd_restore(args) -> int:
return 0 if ok else 1
def _cmd_backup(args) -> int:
"""Take a manual snapshot of the skills tree. Same mechanism as the
automatic pre-run snapshot, just user-initiated."""
from agent import curator_backup
if not curator_backup.is_enabled():
print(
"curator: backups are disabled via config "
"(`curator.backup.enabled: false`); re-enable to snapshot"
)
return 1
reason = getattr(args, "reason", None) or "manual"
snap = curator_backup.snapshot_skills(reason=reason)
if snap is None:
print("curator: snapshot failed — check logs (backup disabled or IO error)")
return 1
print(f"curator: snapshot created at ~/.hermes/skills/.curator_backups/{snap.name}")
return 0
def _cmd_rollback(args) -> int:
"""Restore the skills tree from a snapshot. Defaults to newest.
``--list`` prints available snapshots and exits. ``--id <stamp>`` picks
a specific one. Without ``-y``, prompts for confirmation. A safety
snapshot of the current tree is always taken first, so rollbacks are
themselves undoable.
"""
from agent import curator_backup
if getattr(args, "list", False):
print(curator_backup.summarize_backups())
return 0
backup_id = getattr(args, "backup_id", None)
target_path = curator_backup._resolve_backup(backup_id)
if target_path is None:
rows = curator_backup.list_backups()
if not rows:
print(
"curator: no snapshots exist yet. Take one with "
"`hermes curator backup` or wait for the next curator run."
)
else:
print(
f"curator: no snapshot matching "
f"{'id ' + repr(backup_id) if backup_id else 'your query'}."
)
print("Available:")
print(curator_backup.summarize_backups())
return 1
manifest = curator_backup._read_manifest(target_path)
print(f"Rollback target: {target_path.name}")
if manifest:
print(f" reason: {manifest.get('reason', '?')}")
print(f" created_at: {manifest.get('created_at', '?')}")
print(f" skill files: {manifest.get('skill_files', '?')}")
print(
"\nThis will replace the current ~/.hermes/skills/ tree (a safety "
"snapshot of the current state is taken first so this is undoable)."
)
if not getattr(args, "yes", False):
try:
ans = input("Proceed? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\ncancelled")
return 1
if ans not in ("y", "yes"):
print("cancelled")
return 1
ok, msg, _ = curator_backup.rollback(backup_id=target_path.name)
if ok:
print(f"curator: {msg}")
return 0
print(f"curator: rollback failed — {msg}")
return 1
# ---------------------------------------------------------------------------
# argparse wiring (called from hermes_cli.main)
# ---------------------------------------------------------------------------
@@ -346,11 +207,6 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
"--sync", "--synchronous", dest="synchronous", action="store_true",
help="Wait for the LLM review pass to finish (default: background thread)",
)
p_run.add_argument(
"--dry-run", dest="dry_run", action="store_true",
help="Report only — no state changes, no archives, no consolidation "
"(use this to preview what curator would do)",
)
p_run.set_defaults(func=_cmd_run)
p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
@@ -371,36 +227,6 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
p_restore.add_argument("skill", help="Skill name")
p_restore.set_defaults(func=_cmd_restore)
p_backup = subs.add_parser(
"backup",
help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
"(curator also does this automatically before every real run)",
)
p_backup.add_argument(
"--reason", default=None,
help="Free-text label stored in manifest.json (default: 'manual')",
)
p_backup.set_defaults(func=_cmd_backup)
p_rollback = subs.add_parser(
"rollback",
help="Restore ~/.hermes/skills/ from a curator snapshot "
"(defaults to the newest)",
)
p_rollback.add_argument(
"--list", action="store_true",
help="List available snapshots and exit without restoring",
)
p_rollback.add_argument(
"--id", dest="backup_id", default=None,
help="Snapshot id to restore (see `--list`); default: newest",
)
p_rollback.add_argument(
"-y", "--yes", action="store_true",
help="Skip confirmation prompt",
)
p_rollback.set_defaults(func=_cmd_rollback)
def cli_main(argv=None) -> int:
"""Standalone entry (also usable by hermes_cli.main fallthrough)."""
+1 -86
View File
@@ -10,7 +10,6 @@ import shutil
import signal
import subprocess
import sys
import textwrap
from dataclasses import dataclass
from pathlib import Path
@@ -60,13 +59,6 @@ class GatewayRuntimeSnapshot:
def has_process_service_mismatch(self) -> bool:
return self.service_installed and self.running and not self.service_running
@dataclass(frozen=True)
class ProfileGatewayProcess:
profile: str
path: Path
pid: int
def _get_service_pids() -> set:
"""Return PIDs currently managed by systemd or launchd gateway services.
@@ -379,83 +371,6 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
return pids
def find_profile_gateway_processes(
exclude_pids: set | None = None,
) -> list[ProfileGatewayProcess]:
"""Return running gateway PIDs mapped to Hermes profiles via PID files."""
_exclude = set(exclude_pids or set())
processes: list[ProfileGatewayProcess] = []
try:
from gateway.status import get_running_pid
from hermes_cli.profiles import list_profiles
except Exception:
return processes
seen: set[int] = set()
for profile in list_profiles():
try:
pid = get_running_pid(profile.path / "gateway.pid", cleanup_stale=False)
except Exception:
continue
if pid is None or pid <= 0 or pid in _exclude or pid in seen:
continue
seen.add(pid)
processes.append(ProfileGatewayProcess(profile=profile.name, path=profile.path, pid=pid))
return processes
def _gateway_run_args_for_profile(profile: str) -> list[str]:
args = [get_python_path(), "-m", "hermes_cli.main"]
if profile != "default":
args.extend(["--profile", profile])
args.extend(["gateway", "run", "--replace"])
return args
def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
"""Relaunch a manually-run profile gateway after its current PID exits."""
if old_pid <= 0:
return False
watcher = textwrap.dedent(
"""
import os
import subprocess
import sys
import time
pid = int(sys.argv[1])
cmd = sys.argv[2:]
deadline = time.monotonic() + 120
while time.monotonic() < deadline:
try:
os.kill(pid, 0)
except ProcessLookupError:
break
except PermissionError:
pass
time.sleep(0.2)
subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
"""
).strip()
try:
subprocess.Popen(
[sys.executable, "-c", watcher, str(old_pid), *_gateway_run_args_for_profile(profile)],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
except OSError:
return False
return True
def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
selected_system = _select_systemd_scope(system)
unit_exists = get_systemd_unit_path(system=selected_system).exists()
@@ -4462,4 +4377,4 @@ def _gateway_command_inner(args):
if not supports_systemd_services() and not is_macos():
print("Legacy unit migration only applies to systemd-based Linux hosts.")
return
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
-535
View File
@@ -1,535 +0,0 @@
"""Persistent session goals — the Ralph loop for Hermes.
A goal is a free-form user objective that stays active across turns. After
each turn completes, a small judge call asks an auxiliary model "is this
goal satisfied by the assistant's last response?". If not, Hermes feeds a
continuation prompt back into the same session and keeps working until the
goal is done, the turn budget is exhausted, the user pauses/clears it, or the
user sends a new message (which takes priority and pauses the goal loop).
State is persisted in SessionDB's ``state_meta`` table keyed by
``goal:<session_id>`` so ``/resume`` picks it up.
Design notes / invariants:
- The continuation prompt is just a normal user message appended to the
session via ``run_conversation``. No system-prompt mutation, no toolset
swap, so prompt caching stays intact.
- Judge failures are fail-OPEN: ``continue``. A broken judge must not wedge
progress; the turn budget is the backstop.
- When a real user message arrives mid-loop it preempts the continuation
prompt and also pauses the goal loop for that turn (we still re-judge
after, so if the user's message happens to complete the goal the judge
will say ``done``).
- This module has zero hard dependency on ``cli.HermesCLI`` or the gateway
runner; both wire the same ``GoalManager`` in.
Nothing in this module touches the agent's system prompt or toolset.
"""
from __future__ import annotations
import json
import logging
import re
import time
from dataclasses import dataclass, asdict
from typing import Any, Dict, Optional, Tuple
logger = logging.getLogger(__name__)
# ──────────────────────────────────────────────────────────────────────
# Constants & defaults
# ──────────────────────────────────────────────────────────────────────
DEFAULT_MAX_TURNS = 20
DEFAULT_JUDGE_TIMEOUT = 30.0
# Cap how much of the last response + recent messages we send to the judge.
_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
CONTINUATION_PROMPT_TEMPLATE = (
"[Continuing toward your standing goal]\n"
"Goal: {goal}\n\n"
"Continue working toward this goal. Take the next concrete step. "
"If you believe the goal is complete, state so explicitly and stop. "
"If you are blocked and need input from the user, say so clearly and stop."
)
JUDGE_SYSTEM_PROMPT = (
"You are a strict judge evaluating whether an autonomous agent has "
"achieved a user's stated goal. You receive the goal text and the "
"agent's most recent response. Your only job is to decide whether "
"the goal is fully satisfied based on that response.\n\n"
"A goal is DONE only when:\n"
"- The response explicitly confirms the goal was completed, OR\n"
"- The response clearly shows the final deliverable was produced, OR\n"
"- The response explains the goal is unachievable / blocked / needs "
"user input (treat this as DONE with reason describing the block).\n\n"
"Otherwise the goal is NOT done — CONTINUE.\n\n"
"Reply ONLY with a single JSON object on one line:\n"
'{\"done\": <true|false>, \"reason\": \"<one-sentence rationale>\"}'
)
JUDGE_USER_PROMPT_TEMPLATE = (
"Goal:\n{goal}\n\n"
"Agent's most recent response:\n{response}\n\n"
"Is the goal satisfied?"
)
# ──────────────────────────────────────────────────────────────────────
# Dataclass
# ──────────────────────────────────────────────────────────────────────
@dataclass
class GoalState:
"""Serializable goal state stored per session."""
goal: str
status: str = "active" # active | paused | done | cleared
turns_used: int = 0
max_turns: int = DEFAULT_MAX_TURNS
created_at: float = 0.0
last_turn_at: float = 0.0
last_verdict: Optional[str] = None # "done" | "continue" | "skipped"
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@classmethod
def from_json(cls, raw: str) -> "GoalState":
data = json.loads(raw)
return cls(
goal=data.get("goal", ""),
status=data.get("status", "active"),
turns_used=int(data.get("turns_used", 0) or 0),
max_turns=int(data.get("max_turns", DEFAULT_MAX_TURNS) or DEFAULT_MAX_TURNS),
created_at=float(data.get("created_at", 0.0) or 0.0),
last_turn_at=float(data.get("last_turn_at", 0.0) or 0.0),
last_verdict=data.get("last_verdict"),
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
)
# ──────────────────────────────────────────────────────────────────────
# Persistence (SessionDB state_meta)
# ──────────────────────────────────────────────────────────────────────
def _meta_key(session_id: str) -> str:
return f"goal:{session_id}"
_DB_CACHE: Dict[str, Any] = {}
def _get_session_db() -> Optional[Any]:
"""Return a SessionDB instance for the current HERMES_HOME.
SessionDB has no built-in singleton, but opening a new connection per
/goal call would thrash the file. We cache one instance per
``hermes_home`` path so profile switches still pick up the right DB.
Defensive against import/instantiation failures so tests and
non-standard launchers can still use the GoalManager.
"""
try:
from hermes_constants import get_hermes_home
from hermes_state import SessionDB
home = str(get_hermes_home())
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB bootstrap failed (%s)", exc)
return None
cached = _DB_CACHE.get(home)
if cached is not None:
return cached
try:
db = SessionDB()
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB() raised (%s)", exc)
return None
_DB_CACHE[home] = db
return db
def load_goal(session_id: str) -> Optional[GoalState]:
"""Load the goal for a session, or None if none exists."""
if not session_id:
return None
db = _get_session_db()
if db is None:
return None
try:
raw = db.get_meta(_meta_key(session_id))
except Exception as exc:
logger.debug("GoalManager: get_meta failed: %s", exc)
return None
if not raw:
return None
try:
return GoalState.from_json(raw)
except Exception as exc:
logger.warning("GoalManager: could not parse stored goal for %s: %s", session_id, exc)
return None
def save_goal(session_id: str, state: GoalState) -> None:
"""Persist a goal to SessionDB. No-op if DB unavailable."""
if not session_id:
return
db = _get_session_db()
if db is None:
return
try:
db.set_meta(_meta_key(session_id), state.to_json())
except Exception as exc:
logger.debug("GoalManager: set_meta failed: %s", exc)
def clear_goal(session_id: str) -> None:
"""Mark a goal cleared in the DB (preserved for audit, status=cleared)."""
state = load_goal(session_id)
if state is None:
return
state.status = "cleared"
save_goal(session_id, state)
# ──────────────────────────────────────────────────────────────────────
# Judge
# ──────────────────────────────────────────────────────────────────────
def _truncate(text: str, limit: int) -> str:
if not text:
return ""
if len(text) <= limit:
return text
return text[:limit] + "… [truncated]"
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
def _parse_judge_response(raw: str) -> Tuple[bool, str]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
Returns ``(done, reason)``.
"""
if not raw:
return False, "judge returned empty response"
text = raw.strip()
# Strip markdown code fences the model may wrap JSON in.
if text.startswith("```"):
text = text.strip("`")
# Peel off leading json/JSON/etc tag
nl = text.find("\n")
if nl != -1:
text = text[nl + 1:]
# First try: parse the whole blob.
data: Optional[Dict[str, Any]] = None
try:
data = json.loads(text)
except Exception:
# Second try: pull the first JSON object out.
match = _JSON_OBJECT_RE.search(text)
if match:
try:
data = json.loads(match.group(0))
except Exception:
data = None
if not isinstance(data, dict):
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
done_val = data.get("done")
if isinstance(done_val, str):
done = done_val.strip().lower() in ("true", "yes", "1", "done")
else:
done = bool(done_val)
reason = str(data.get("reason") or "").strip()
if not reason:
reason = "no reason provided"
return done, reason
def judge_goal(
goal: str,
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
) -> Tuple[str, str]:
"""Ask the auxiliary model whether the goal is satisfied.
Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
or ``"skipped"`` (only when the goal text is empty).
This is deliberately fail-open: any error, including an unreachable
judge, returns ``("continue", "...")`` so a broken judge doesn't wedge
progress; the turn budget is the backstop.
"""
if not goal.strip():
return "skipped", "empty goal"
if not last_response.strip():
# No substantive reply this turn — almost certainly not done yet.
return "continue", "empty response (nothing to evaluate)"
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable"
try:
client, model = get_text_auxiliary_client("goal_judge")
except Exception as exc:
logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
return "continue", "auxiliary client unavailable"
if client is None or not model:
return "continue", "no auxiliary client configured"
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
)
try:
resp = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": JUDGE_SYSTEM_PROMPT},
{"role": "user", "content": prompt},
],
temperature=0,
max_tokens=200,
timeout=timeout,
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
return "continue", f"judge error: {type(exc).__name__}"
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
done, reason = _parse_judge_response(raw)
verdict = "done" if done else "continue"
logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
return verdict, reason
# ──────────────────────────────────────────────────────────────────────
# GoalManager — the orchestration surface CLI + gateway talk to
# ──────────────────────────────────────────────────────────────────────
class GoalManager:
"""Per-session goal state + continuation decisions.
The CLI and gateway each hold one ``GoalManager`` per live session.
Methods:
- ``set(goal)``: start a new standing goal.
- ``clear()``: remove the active goal.
- ``pause()`` / ``resume()``: explicit user controls.
- ``status_line()``: printable one-liner.
- ``evaluate_after_turn(last_response)``: call the judge, update state,
and return a decision dict the caller uses to drive the next turn.
- ``next_continuation_prompt()``: the canonical user-role message to
feed back into ``run_conversation``.
"""
def __init__(self, session_id: str, *, default_max_turns: int = DEFAULT_MAX_TURNS):
self.session_id = session_id
self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS)
self._state: Optional[GoalState] = load_goal(session_id)
# --- introspection ------------------------------------------------
@property
def state(self) -> Optional[GoalState]:
return self._state
def is_active(self) -> bool:
return self._state is not None and self._state.status == "active"
def has_goal(self) -> bool:
return self._state is not None and self._state.status in ("active", "paused")
def status_line(self) -> str:
s = self._state
if s is None or s.status in ("cleared",):
return "No active goal. Set one with /goal <text>."
turns = f"{s.turns_used}/{s.max_turns} turns"
if s.status == "active":
return f"⊙ Goal (active, {turns}): {s.goal}"
if s.status == "paused":
extra = f", {s.paused_reason}" if s.paused_reason else ""
return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
if s.status == "done":
return f"✓ Goal done ({turns}): {s.goal}"
return f"Goal ({s.status}, {turns}): {s.goal}"
# --- mutation -----------------------------------------------------
def set(self, goal: str, *, max_turns: Optional[int] = None) -> GoalState:
goal = (goal or "").strip()
if not goal:
raise ValueError("goal text is empty")
state = GoalState(
goal=goal,
status="active",
turns_used=0,
max_turns=int(max_turns) if max_turns else self.default_max_turns,
created_at=time.time(),
last_turn_at=0.0,
)
self._state = state
save_goal(self.session_id, state)
return state
def pause(self, reason: str = "user-paused") -> Optional[GoalState]:
if not self._state:
return None
self._state.status = "paused"
self._state.paused_reason = reason
save_goal(self.session_id, self._state)
return self._state
def resume(self, *, reset_budget: bool = True) -> Optional[GoalState]:
if not self._state:
return None
self._state.status = "active"
self._state.paused_reason = None
if reset_budget:
self._state.turns_used = 0
save_goal(self.session_id, self._state)
return self._state
def clear(self) -> None:
if self._state is None:
return
self._state.status = "cleared"
save_goal(self.session_id, self._state)
self._state = None
def mark_done(self, reason: str) -> None:
if not self._state:
return
self._state.status = "done"
self._state.last_verdict = "done"
self._state.last_reason = reason
save_goal(self.session_id, self._state)
# --- the main entry point called after every turn -----------------
def evaluate_after_turn(
self,
last_response: str,
*,
user_initiated: bool = True,
) -> Dict[str, Any]:
"""Run the judge and update state. Return a decision dict.
``user_initiated`` distinguishes a real user prompt (True) from a
continuation prompt we fed ourselves (False). Both increment
``turns_used`` because both consume model budget.
Decision keys:
- ``status``: current goal status after update
- ``should_continue``: bool, whether the caller should fire another turn
- ``continuation_prompt``: str or None
- ``verdict``: "done" | "continue" | "skipped" | "inactive"
- ``reason``: str
- ``message``: user-visible one-liner to print/send
"""
state = self._state
if state is None or state.status != "active":
return {
"status": state.status if state else None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
# Count the turn that just finished.
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
if verdict == "done":
state.status = "done"
save_goal(self.session_id, state)
return {
"status": "done",
"should_continue": False,
"continuation_prompt": None,
"verdict": "done",
"reason": reason,
"message": f"✓ Goal achieved: {reason}",
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
save_goal(self.session_id, state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
"Use /goal resume to keep going, or /goal clear to stop."
),
}
save_goal(self.session_id, state)
return {
"status": "active",
"should_continue": True,
"continuation_prompt": self.next_continuation_prompt(),
"verdict": "continue",
"reason": reason,
"message": (
f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}"
),
}
def next_continuation_prompt(self) -> Optional[str]:
if not self._state or self._state.status != "active":
return None
return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
__all__ = [
"GoalState",
"GoalManager",
"CONTINUATION_PROMPT_TEMPLATE",
"DEFAULT_MAX_TURNS",
"load_goal",
"save_goal",
"clear_goal",
"judge_goal",
]
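The decision dict returned by `evaluate_after_turn` is meant to drive a loop in the caller. A minimal sketch of such a loop follows, assuming a stand-in manager: `StubGoalManager`, `drive`, and `run_turn` are hypothetical names used only for illustration (the real `GoalManager` consults an auxiliary judge model and persists state). Only the decision keys `should_continue`, `continuation_prompt`, `verdict`, and `message` are taken from the code above.

```python
from typing import Any, Callable, Dict, List, Optional


class StubGoalManager:
    """Hypothetical stand-in: reports "done" once a reply contains DONE."""

    def __init__(self, goal: str, max_turns: int = 3) -> None:
        self.goal = goal
        self.max_turns = max_turns
        self.turns_used = 0

    def evaluate_after_turn(self, last_response: str) -> Dict[str, Any]:
        # Every finished turn counts against the budget.
        self.turns_used += 1
        if "DONE" in last_response:
            return {"should_continue": False, "continuation_prompt": None,
                    "verdict": "done", "message": "goal achieved"}
        if self.turns_used >= self.max_turns:
            return {"should_continue": False, "continuation_prompt": None,
                    "verdict": "continue", "message": "turn budget exhausted"}
        return {"should_continue": True,
                "continuation_prompt": f"Keep working toward: {self.goal}",
                "verdict": "continue", "message": "continuing"}


def drive(manager: Any, run_turn: Callable[[str], str], first_prompt: str) -> List[str]:
    """Run turns until the manager stops asking for a continuation."""
    messages: List[str] = []
    prompt: Optional[str] = first_prompt
    while prompt is not None:
        reply = run_turn(prompt)
        decision = manager.evaluate_after_turn(reply)
        messages.append(decision["message"])
        # Feed the continuation prompt back in only when asked to continue.
        prompt = decision["continuation_prompt"] if decision["should_continue"] else None
    return messages


replies = iter(["working", "still working", "DONE"])
log = drive(StubGoalManager("ship it", max_turns=5), lambda _p: next(replies), "start")
```

The loop stops on either a `done` verdict or an exhausted budget, which mirrors the two early returns in `evaluate_after_turn`.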
-1393
View File
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
+27 -164
View File
@@ -800,8 +800,6 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
title = db.get_session_title(target)
message_count = int(session.get("message_count") or 0)
if message_count == 0:
return # No real conversation — don't show resume info
input_tokens = int(session.get("input_tokens") or 0)
output_tokens = int(session.get("output_tokens") or 0)
cache_read_tokens = int(session.get("cache_read_tokens") or 0)
@@ -838,6 +836,16 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert"})
_TUI_PREBUILT_MARKER = ".hermes-prebuilt-tui"
def _tui_prebuilt_ready(root: Path) -> bool:
return (
(root / _TUI_PREBUILT_MARKER).is_file()
and (root / "dist" / "entry.js").is_file()
and (root / "node_modules" / "@hermes" / "ink" / "package.json").is_file()
and (root / "packages" / "hermes-ink" / "dist" / "ink-bundle.js").is_file()
)
def _tui_need_npm_install(root: Path) -> bool:
@@ -860,6 +868,9 @@ def _tui_need_npm_install(root: Path) -> bool:
we'd rather not force a reinstall for them. Falls back to mtime
comparison if either lockfile is unparseable.
"""
if _tui_prebuilt_ready(root):
return False
ink = root / "node_modules" / "@hermes" / "ink" / "package.json"
if not ink.is_file():
return True
@@ -913,6 +924,8 @@ def _find_bundled_tui(tui_dir: Path) -> Optional[Path]:
def _tui_build_needed(tui_dir: Path) -> bool:
if _tui_prebuilt_ready(tui_dir):
return False
if _hermes_ink_bundle_stale(tui_dir):
return True
entry = tui_dir / "dist" / "entry.js"
@@ -5043,13 +5056,6 @@ def cmd_slack(args):
return 1
def cmd_kanban(args):
"""Multi-profile collaboration board."""
from hermes_cli.kanban import kanban_command
return kanban_command(args)
def cmd_hooks(args):
"""Shell-hook inspection and management."""
from hermes_cli.hooks import hooks_command
@@ -5433,45 +5439,6 @@ def _find_stale_dashboard_pids() -> list[int]:
return dashboard_pids
def _print_curator_first_run_notice() -> None:
"""Print a short heads-up about the skill curator after `hermes update`.
Only fires when the curator is enabled AND has no recorded run yet, which
is exactly the window where the gateway ticker used to fire Curator
against a fresh skill library immediately after an update. We defer the
first real pass by one ``interval_hours``; this notice tells the user how
to preview or disable before then. Silent on steady state.
"""
try:
from agent import curator
except Exception:
return
try:
if not curator.is_enabled():
return
state = curator.load_state()
except Exception:
return
if state.get("last_run_at"):
# Curator has run before (real or already seeded) — no notice needed.
return
try:
hours = curator.get_interval_hours()
except Exception:
hours = 24 * 7
days = max(1, hours // 24)
print()
print(" Skill curator")
print(
f" Background skill maintenance is enabled. First pass is deferred "
f"~{days}d after installation; only agent-created skills are in "
f"scope and nothing is ever auto-deleted (archive is recoverable)."
)
print(" Preview now: hermes curator run --dry-run")
print(" Pause it: hermes curator pause")
print(" Docs: https://hermes-agent.nousresearch.com/docs/user-guide/features/curator")
def _kill_stale_dashboard_processes(
reason: str = "the running backend no longer matches the updated frontend",
) -> None:
@@ -5709,10 +5676,6 @@ def _update_via_zip(args):
print()
print("✓ Update complete!")
try:
_print_curator_first_run_notice()
except Exception as e:
logger.debug("Curator first-run notice failed: %s", e)
_kill_stale_dashboard_processes()
@@ -6718,7 +6681,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
if gateway_mode
else None
)
assume_yes = bool(getattr(args, "yes", False))
print("⚕ Updating Hermes Agent...")
print()
@@ -6838,10 +6800,8 @@ def _cmd_update_impl(args, gateway_mode: bool):
else:
auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
prompt_for_restore = (
auto_stash_ref is not None
and not assume_yes
and (gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty()))
prompt_for_restore = auto_stash_ref is not None and (
gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty())
)
# Check if there are updates
@@ -7102,10 +7062,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
print(f" {len(missing_config)} new config option(s) available")
print()
if assume_yes:
print(" --yes: auto-applying config migration (skipping API-key prompts).")
response = "y"
elif gateway_mode:
if gateway_mode:
response = (
_gateway_prompt(
"Would you like to configure new options now? [Y/n]", "n"
@@ -7131,17 +7088,14 @@ def _cmd_update_impl(args, gateway_mode: bool):
if response in ("", "y", "yes"):
print()
# In gateway mode OR under --yes, run auto-migrations only (no
# input() prompts for API keys which would hang the detached
# process / defeat the point of --yes).
results = migrate_config(
interactive=not (gateway_mode or assume_yes), quiet=False
)
# In gateway mode, run auto-migrations only (no input() prompts
# for API keys which would hang the detached process).
results = migrate_config(interactive=not gateway_mode, quiet=False)
if results["env_added"] or results["config_added"]:
print()
print("✓ Configuration updated!")
if (gateway_mode or assume_yes) and missing_env:
if gateway_mode and missing_env:
print(" API keys require manual entry: hermes config migrate")
else:
print()
@@ -7152,15 +7106,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
print("✓ Update complete!")
# Curator first-run heads-up. Only prints when curator is enabled AND
# has never run — i.e. the window where the ticker would otherwise
# have fired against a fresh skill library. Kept silent on steady
# state so we don't nag.
try:
_print_curator_first_run_notice()
except Exception as e:
logger.debug("Curator first-run notice failed: %s", e)
# Repair RHEL-family root installs where /usr/local/bin isn't on PATH
# for non-login interactive shells. No-op on every other platform.
try:
@@ -7200,8 +7145,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
supports_systemd_services,
_ensure_user_systemd_env,
find_gateway_pids,
find_profile_gateway_processes,
launch_detached_profile_gateway_restart,
_get_service_pids,
_graceful_restart_via_sigusr1,
)
@@ -7305,7 +7248,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
restarted_services = []
killed_pids = set()
relaunched_profiles = []
# --- Systemd services (Linux) ---
# Discover all hermes-gateway* units (default + profiles)
@@ -7495,33 +7437,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
manual_pids = find_gateway_pids(
exclude_pids=service_pids, all_profiles=True
)
profile_processes = {
proc.pid: proc
for proc in find_profile_gateway_processes(exclude_pids=service_pids)
if proc.pid in manual_pids
}
for pid, proc in profile_processes.items():
if not launch_detached_profile_gateway_restart(proc.profile, pid):
continue
# Prefer a graceful SIGUSR1 drain so in-flight agent runs
# finish before the watcher respawns the gateway. If the
# gateway doesn't support SIGUSR1 or doesn't exit within
# the drain budget, fall back to SIGTERM — the watcher
# still sees the exit and relaunches either way.
drained = _graceful_restart_via_sigusr1(
pid, drain_timeout=_drain_budget,
)
if not drained:
try:
os.kill(pid, _signal.SIGTERM)
except (ProcessLookupError, PermissionError):
pass
killed_pids.add(pid)
relaunched_profiles.append(proc.profile)
for pid in manual_pids:
if pid in profile_processes:
continue
try:
os.kill(pid, _signal.SIGTERM)
killed_pids.add(pid)
@@ -7532,14 +7448,11 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
for svc in restarted_services:
print(f" ✓ Restarted {svc}")
if relaunched_profiles:
names = ", ".join(relaunched_profiles)
print(f" ✓ Restarting manual gateway profile(s): {names}")
unmapped_count = len(killed_pids) - len(relaunched_profiles)
if unmapped_count:
print(f" → Stopped {unmapped_count} manual gateway process(es)")
if killed_pids:
print(f" → Stopped {len(killed_pids)} manual gateway process(es)")
print(" Restart manually: hermes gateway run")
if unmapped_count > 1:
# Also restart for each profile if needed
if len(killed_pids) > 1:
print(
" (or: hermes -p <profile> gateway run for each profile)"
)
@@ -7548,42 +7461,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
# No gateways were running — nothing to do
pass
# --- Post-restart survivor sweep -----------------------------
# Issue #17648: some gateways ignore SIGTERM (stuck drain,
# blocked I/O, PID dead but zombie). The detached profile
# watchers wait 120s for the old PID to exit — if it never
# does, no respawn happens and the user keeps hitting
# ImportError against a stale sys.modules. Give the
# graceful paths a brief window to complete, then SIGKILL
# any remaining pre-update PIDs so the watcher / service
# manager can relaunch with fresh code.
try:
_time.sleep(3.0)
_service_pids_after = _get_service_pids()
_surviving = find_gateway_pids(
exclude_pids=_service_pids_after, all_profiles=True,
)
# Scope to PIDs we already tried to kill during this
# update (killed_pids). Anything new is a gateway that
# started AFTER our restart attempt — respecting user
# intent, we don't kill those.
_stuck = [pid for pid in _surviving if pid in killed_pids]
if _stuck:
print()
print(
f"{len(_stuck)} gateway process(es) ignored SIGTERM — force-killing"
)
for pid in _stuck:
try:
os.kill(pid, _signal.SIGKILL)
except (ProcessLookupError, PermissionError):
pass
# Give the OS a beat to reap the processes so the
# watchers see them exit and respawn.
_time.sleep(1.5)
except Exception as _sweep_exc:
logger.debug("Post-restart survivor sweep failed: %s", _sweep_exc)
except Exception as e:
logger.debug("Gateway restart during update failed: %s", e)
@@ -7820,7 +7697,7 @@ def cmd_profile(args):
if clone_all:
print(f"Full copy from {source_label}.")
else:
print(f"Cloned config, .env, SOUL.md, and skills from {source_label}.")
print(f"Cloned config, .env, SOUL.md from {source_label}.")
# Auto-clone Honcho config for the new profile (only with --clone/--clone-all)
if clone or clone_all:
@@ -8778,13 +8655,6 @@ def main():
webhook_parser.set_defaults(func=cmd_webhook)
# =========================================================================
# kanban command — multi-profile collaboration board
# =========================================================================
from hermes_cli.kanban import build_parser as _build_kanban_parser
kanban_parser = _build_kanban_parser(subparsers)
kanban_parser.set_defaults(func=cmd_kanban)
# =========================================================================
# hooks command — shell-hook inspection and management
# =========================================================================
@@ -9992,13 +9862,6 @@ Examples:
default=False,
help="Force a pre-update backup for this run (off by default; overrides updates.pre_update_backup)",
)
update_parser.add_argument(
"--yes",
"-y",
action="store_true",
default=False,
help="Assume yes for interactive prompts (config migration, stash restore). API-key entry is skipped; run 'hermes config migrate' separately for those.",
)
update_parser.set_defaults(func=cmd_update)
# =========================================================================
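The `_tui_prebuilt_ready` check in this file only short-circuits the npm install and build steps when the marker and all three artifacts are present. The sketch below reproduces that all-or-nothing behavior against temporary directories. The four relative paths are copied from the hunk; `prebuilt_ready` and `make_tree` are hypothetical helper names, not part of Hermes.

```python
import tempfile
from pathlib import Path
from typing import Optional

MARKER = ".hermes-prebuilt-tui"  # marker file name as it appears in the diff

# All four files must exist for the prebuilt TUI to count as ready.
REQUIRED = (
    Path(MARKER),
    Path("dist") / "entry.js",
    Path("node_modules") / "@hermes" / "ink" / "package.json",
    Path("packages") / "hermes-ink" / "dist" / "ink-bundle.js",
)


def prebuilt_ready(root: Path) -> bool:
    """True only if every required file is present under root."""
    return all((root / rel).is_file() for rel in REQUIRED)


def make_tree(root: Path, skip: Optional[Path] = None) -> None:
    """Create empty placeholder files, optionally leaving one out."""
    for rel in REQUIRED:
        if rel == skip:
            continue
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")


with tempfile.TemporaryDirectory() as full, tempfile.TemporaryDirectory() as partial:
    make_tree(Path(full))
    make_tree(Path(partial), skip=Path(MARKER))  # artifacts present, marker missing
    full_ok = prebuilt_ready(Path(full))
    partial_ok = prebuilt_ready(Path(partial))
```

Requiring the marker in addition to the artifacts means a developer checkout that happens to contain a stale `dist/` still goes through the normal staleness checks.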
+10 -18
View File
@@ -891,19 +891,14 @@ def switch_model(
if not validation.get("accepted"):
override = False
if user_providers:
# user_providers is a dict: {provider_slug: config_dict}
for slug, cfg in user_providers.items():
if slug == target_provider:
cfg_models = cfg.get("models", {})
# Direct membership works for dict (keys) and list (strings)
if new_model in cfg_models:
for up in user_providers:
if isinstance(up, dict) and up.get("provider") == target_provider:
cfg_models = up.get("models", [])
if new_model in cfg_models or any(
m.get("name") == new_model for m in cfg_models if isinstance(m, dict)
):
override = True
break
# Also accept if models is a list of dicts with 'name' field
if isinstance(cfg_models, list):
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
@@ -1417,17 +1412,14 @@ def list_authenticated_providers(
models_list = list(fb)
# Prefer the endpoint's live /models list when credentials are
# available, unless the provider explicitly opts out via
# discover_models: false (e.g. dedicated endpoints that expose
# the entire aggregator catalog via /models).
# available. This keeps OpenAI-compatible relays (for example CRS)
# in sync when the server catalog changes without requiring the
# user to mirror every model into config.yaml.
api_key = str(ep_cfg.get("api_key", "") or "").strip()
if not api_key:
key_env = str(ep_cfg.get("key_env", "") or "").strip()
api_key = os.environ.get(key_env, "").strip() if key_env else ""
discover = ep_cfg.get("discover_models", True)
if isinstance(discover, str):
discover = discover.lower() not in ("false", "no", "0")
if api_url and api_key and discover:
if api_url and api_key:
try:
from hermes_cli.models import fetch_api_models
live_models = fetch_api_models(api_key, api_url)
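The `switch_model` hunk in this file accepts an otherwise unrecognized model when the user has listed it under the target provider. A minimal sketch of that membership test follows, assuming the list-of-dicts shape of `user_providers` that appears in one side of the hunk. `model_is_configured` is a hypothetical helper name, and the sample provider data is invented for illustration.

```python
from typing import Any, List


def model_is_configured(user_providers: List[Any], target_provider: str, new_model: str) -> bool:
    """Check whether new_model is listed under target_provider.

    Each model entry may be a plain string or a dict with a "name" field.
    """
    for up in user_providers:
        if not isinstance(up, dict) or up.get("provider") != target_provider:
            continue
        cfg_models = up.get("models") or []
        # Plain-string entries match by direct membership.
        if new_model in cfg_models:
            return True
        # Dict entries match on their "name" field.
        if any(isinstance(m, dict) and m.get("name") == new_model for m in cfg_models):
            return True
    return False


providers = [
    {"provider": "relay", "models": ["alpha", {"name": "beta"}]},
    "not-a-dict",  # malformed entries are skipped rather than raising
]
```

Skipping non-dict entries keeps a malformed config from turning a model switch into a crash.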
+1 -2
View File
@@ -40,7 +40,6 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openrouter/elephant-alpha", "free"),
("openrouter/owl-alpha", "free"),
("openai/gpt-5.5", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2.5-pro", ""),
@@ -774,6 +773,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway (200+ models, $5 free credit, no markup)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
@@ -803,7 +803,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("opencode-go", "OpenCode Go", "OpenCode Go (open models, $10/month subscription)"),
ProviderEntry("bedrock", "AWS Bedrock", "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
ProviderEntry("azure-foundry", "Azure Foundry", "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway"),
]
# Derived dicts — used throughout the codebase
-52
View File
@@ -33,15 +33,12 @@ so plugin-defined tools appear alongside the built-in tools.
from __future__ import annotations
import asyncio
import importlib
import importlib.metadata
import importlib.util
import inspect
import logging
import os
import sys
import threading
import types
from dataclasses import dataclass, field
from pathlib import Path
@@ -1229,55 +1226,6 @@ def get_plugin_command_handler(name: str) -> Optional[Callable]:
return entry["handler"] if entry else None
_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS = 30.0
def resolve_plugin_command_result(result: Any) -> Any:
"""Resolve a plugin command return value, awaiting async handlers when needed.
Sync CLI/TUI dispatch sites call plugin handlers from plain functions.
If a handler is async, await it directly when no loop is running; if
we're already inside an active loop, run it in a helper thread with its
own loop so the caller still gets a concrete result synchronously. The
threaded path is bounded by a 30s timeout so a hung async handler cannot
wedge the terminal indefinitely.
"""
if not inspect.isawaitable(result):
return result
try:
asyncio.get_running_loop()
except RuntimeError:
return asyncio.run(result)
outcome: Dict[str, Any] = {}
failure: Dict[str, BaseException] = {}
done = threading.Event()
def _runner() -> None:
try:
outcome["value"] = asyncio.run(result)
except BaseException as exc: # pragma: no cover - re-raised below
failure["exc"] = exc
finally:
done.set()
thread = threading.Thread(
target=_runner,
name="hermes-plugin-command-await",
daemon=True,
)
thread.start()
if not done.wait(timeout=_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS):
raise TimeoutError(
"Plugin command async handler did not complete within "
f"{_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS:.0f}s"
)
if "exc" in failure:
raise failure["exc"]
return outcome.get("value")
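The helper above resolves an async handler's result from synchronous code in two ways, depending on whether an event loop is already running. A simplified sketch of the same idea follows, using `concurrent.futures` instead of a hand-rolled thread and event; `resolve_result` and `handler` are hypothetical names. Like the helper in the hunk, this sketch assumes the awaitable is a coroutine, and blocking inside a running loop stalls that loop until the worker finishes.

```python
import asyncio
import concurrent.futures
import inspect
from typing import Any


def resolve_result(result: Any, timeout: float = 30.0) -> Any:
    """Return result as-is, or run it to completion if it is a coroutine."""
    if not inspect.isawaitable(result):
        return result
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: run the coroutine directly.
        return asyncio.run(result)
    # A loop is already running here, so run the coroutine on a fresh loop
    # in a worker thread and wait for it with a bounded timeout.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(asyncio.run, result).result(timeout=timeout)
    finally:
        pool.shutdown(wait=False)


async def handler() -> str:
    await asyncio.sleep(0)
    return "ok"


async def caller() -> str:
    # Called from inside a running loop, so the threaded path is taken.
    return resolve_result(handler())


plain = resolve_result("sync value")
awaited = resolve_result(handler())
nested = asyncio.run(caller())
```

The timeout bounds how long the caller waits, not how long the worker runs; a hung handler would still occupy its thread.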
def get_plugin_commands() -> Dict[str, dict]:
"""Return the full plugin commands dict (name → {handler, description, plugin}).
+113 -378
View File
@@ -15,18 +15,13 @@ import shutil
import subprocess
import sys
from pathlib import Path
from typing import Any, Optional
from typing import Optional
from hermes_constants import get_hermes_home
from hermes_cli.config import cfg_get
logger = logging.getLogger(__name__)
class PluginOperationError(Exception):
"""Recoverable plugin install/update failure (CLI exits; HTTP maps to 4xx)."""
# Minimum manifest version this installer understands.
# Plugins may declare ``manifest_version: 1`` in plugin.yaml;
# future breaking changes to the manifest schema bump this.
@@ -155,24 +150,6 @@ def _copy_example_files(plugin_dir: Path, console) -> None:
)
def _missing_requires_env_names(manifest: dict) -> list[str]:
"""Return declared ``requires_env`` names that are unset in ``~/.hermes/.env``."""
requires_env = manifest.get("requires_env") or []
if not requires_env:
return []
from hermes_cli.config import get_env_value
env_specs: list[dict] = []
for entry in requires_env:
if isinstance(entry, str):
env_specs.append({"name": entry})
elif isinstance(entry, dict) and entry.get("name"):
env_specs.append(entry)
return [s["name"] for s in env_specs if s.get("name") and not get_env_value(s["name"])]
def _prompt_plugin_env_vars(manifest: dict, console) -> None:
"""Prompt for required environment variables declared in plugin.yaml.
@@ -306,95 +283,6 @@ def _require_installed_plugin(name: str, plugins_dir: Path, console) -> Path:
# ---------------------------------------------------------------------------
def _install_plugin_core(identifier: str, *, force: bool) -> tuple[Path, dict, str]:
"""Clone Git plugin into ``~/.hermes/plugins``.
Returns ``(target_dir, installed_manifest, canonical_name)``.
Raises ``PluginOperationError`` on failure.
"""
import tempfile
try:
git_url = _resolve_git_url(identifier)
except ValueError as e:
raise PluginOperationError(str(e)) from e
plugins_dir = _plugins_dir()
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError as e:
raise PluginOperationError(
"git is not installed or not in PATH.",
) from e
except subprocess.TimeoutExpired as e:
raise PluginOperationError(
"Git clone timed out after 60 seconds.",
) from e
if result.returncode != 0:
err = (result.stderr or result.stdout or "").strip()
raise PluginOperationError(f"Git clone failed:\n{err}")
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
raise PluginOperationError(str(e)) from e
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
raise PluginOperationError(
f"Plugin '{plugin_name}' has invalid manifest_version "
f"'{mv}' (expected an integer).",
) from None
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
raise PluginOperationError(
f"Plugin '{plugin_name}' requires manifest_version {mv}, "
f"but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}. "
f"Run {recommended_update_command()} to update Hermes.",
) from None
if target.exists():
if not force:
raise PluginOperationError(
f"Plugin '{plugin_name}' already exists. Use force reinstall "
f"or run `hermes plugins update {plugin_name}`.",
)
shutil.rmtree(target)
shutil.move(str(tmp_target), str(target))
has_yaml = (target / "plugin.yaml").exists() or (target / "plugin.yml").exists()
if not has_yaml and not (target / "__init__.py").exists():
logger.warning(
"%s has no plugin.yaml / __init__.py; may not be a valid plugin",
plugin_name,
)
from rich.console import Console
_copy_example_files(target, Console())
installed_manifest = _read_manifest(target)
installed_name = installed_manifest.get("name") or target.name
return target, installed_manifest, installed_name
def cmd_install(
identifier: str,
force: bool = False,
@@ -405,6 +293,7 @@ def cmd_install(
After install, prompt "Enable now? [y/N]" unless *enable* is provided
(True = auto-enable without prompting, False = install disabled).
"""
import tempfile
from rich.console import Console
console = Console()
@@ -415,41 +304,114 @@ def cmd_install(
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Warn about insecure / local URL schemes
if git_url.startswith(("http://", "file://")):
console.print(
"[yellow]Warning:[/yellow] Using insecure/local URL scheme. "
"Consider using https:// or git@ for production installs.",
"Consider using https:// or git@ for production installs."
)
console.print(f"[dim]Cloning {git_url}...[/dim]")
plugins_dir = _plugins_dir()
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Clone into a temp directory first so we can read plugin.yaml for the name
with tempfile.TemporaryDirectory() as tmp:
tmp_target = Path(tmp) / "plugin"
console.print(f"[dim]Cloning {git_url}...[/dim]")
if not (target / "plugin.yaml").exists() and not (target / "plugin.yml").exists() and not (
target / "__init__.py"
).exists():
try:
result = subprocess.run(
["git", "clone", "--depth", "1", git_url, str(tmp_target)],
capture_output=True,
text=True,
timeout=60,
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git clone timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(
f"[red]Error:[/red] Git clone failed:\n{result.stderr.strip()}"
)
sys.exit(1)
# Read manifest
manifest = _read_manifest(tmp_target)
plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
# Sanitize plugin name against path traversal
try:
target = _sanitize_plugin_name(plugin_name, plugins_dir)
except ValueError as e:
console.print(f"[red]Error:[/red] {e}")
sys.exit(1)
# Check manifest_version compatibility
mv = manifest.get("manifest_version")
if mv is not None:
try:
mv_int = int(mv)
except (ValueError, TypeError):
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' has invalid "
f"manifest_version '{mv}' (expected an integer)."
)
sys.exit(1)
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
)
sys.exit(1)
if target.exists():
if not force:
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' already exists at {target}.\n"
f"Use [bold]--force[/bold] to remove and reinstall, or "
f"[bold]hermes plugins update {plugin_name}[/bold] to pull latest."
)
sys.exit(1)
console.print(f"[dim] Removing existing {plugin_name}...[/dim]")
shutil.rmtree(target)
# Move from temp to final location
shutil.move(str(tmp_target), str(target))
# Validate it looks like a plugin
if not (target / "plugin.yaml").exists() and not (target / "__init__.py").exists():
console.print(
f"[yellow]Warning:[/yellow] {installed_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin.",
f"[yellow]Warning:[/yellow] {plugin_name} doesn't contain plugin.yaml "
f"or __init__.py. It may not be a valid Hermes plugin."
)
# Copy .example files to their real names (e.g. config.yaml.example → config.yaml)
_copy_example_files(target, console)
# Re-read manifest from installed location (for env var prompting)
installed_manifest = _read_manifest(target)
# Prompt for required environment variables before showing after-install docs
_prompt_plugin_env_vars(installed_manifest, console)
_display_after_install(target, identifier)
# Determine the canonical plugin name for enable-list bookkeeping.
installed_name = installed_manifest.get("name") or target.name
# Decide whether to enable: explicit flag > interactive prompt > default off
should_enable = enable
if should_enable is None:
# Interactive prompt unless stdin isn't a TTY (scripted install).
if sys.stdin.isatty() and sys.stdout.isatty():
try:
answer = input(
f" Enable '{installed_name}' now? [y/N]: ",
f" Enable '{installed_name}' now? [y/N]: "
).strip().lower()
should_enable = answer in ("y", "yes")
except (EOFError, KeyboardInterrupt):
@@ -465,12 +427,12 @@ def cmd_install(
_save_enabled_set(enabled)
_save_disabled_set(disabled)
console.print(
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled.",
f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled."
)
else:
console.print(
f"[dim]Plugin installed but not enabled. "
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]",
f"Run `hermes plugins enable {installed_name}` to activate.[/dim]"
)
console.print("[dim]Restart the gateway for the plugin to take effect:[/dim]")
@@ -500,22 +462,36 @@ def cmd_update(name: str) -> None:
console.print(f"[dim]Updating {name}...[/dim]")
ok, output = _git_pull_plugin_dir(target)
if not ok:
console.print(f"[red]Error:[/red] {output}")
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
console.print("[red]Error:[/red] git is not installed or not in PATH.")
sys.exit(1)
except subprocess.TimeoutExpired:
console.print("[red]Error:[/red] Git pull timed out after 60 seconds.")
sys.exit(1)
if result.returncode != 0:
console.print(f"[red]Error:[/red] Git pull failed:\n{result.stderr.strip()}")
sys.exit(1)
# Copy any new .example files
_copy_example_files(target, console)
out = output.strip()
if "Already up to date" in out:
output = result.stdout.strip()
if "Already up to date" in output:
console.print(
f"[green]✓[/green] Plugin [bold]{name}[/bold] is already up to date."
)
else:
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] updated.")
console.print(f"[dim]{out}[/dim]")
console.print(f"[dim]{output}[/dim]")
def cmd_remove(name: str) -> None:
@@ -1268,247 +1244,6 @@ def _run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
print()
def dashboard_install_plugin(
identifier: str,
*,
force: bool,
enable: bool,
) -> dict[str, Any]:
"""Non-interactive install for the web dashboard. Returns a JSON-serializable dict."""
warnings: list[str] = []
try:
git_url = _resolve_git_url(identifier)
if git_url.startswith(("http://", "file://")):
warnings.append(
"Insecure URL scheme; prefer https:// or git@ for production installs.",
)
except ValueError:
pass
try:
target, installed_manifest, installed_name = _install_plugin_core(
identifier,
force=force,
)
except PluginOperationError as exc:
return {"ok": False, "error": str(exc)}
missing_env = _missing_requires_env_names(installed_manifest)
if enable:
en = _get_enabled_set()
dis = _get_disabled_set()
en.add(installed_name)
dis.discard(installed_name)
_save_enabled_set(en)
_save_disabled_set(dis)
hint: str | None = None
ap = target / "after-install.md"
if ap.exists():
hint = str(ap)
return {
"ok": True,
"plugin_name": installed_name,
"warnings": warnings,
"missing_env": missing_env,
"after_install_path": hint,
"enabled": enable,
}
def _get_plugin_toolset_key(name: str) -> Optional[str]:
"""Return the toolset key a plugin registers its tools under, or None.
Queries the live tool registry; the plugin must already be loaded.
Falls back to reading ``provides_tools`` from plugin.yaml and looking
up the toolset from the registry for the first tool name found.
"""
try:
from tools.registry import registry
except Exception:
return None
# Check the plugin manager for tools this plugin registered
try:
from hermes_cli.plugins import discover_plugins, get_plugin_manager
discover_plugins() # idempotent — ensures plugins are loaded
manager = get_plugin_manager()
for _key, loaded in manager._plugins.items():
if loaded.manifest.name == name or _key == name:
for tool_name in loaded.tools_registered:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
break
except Exception:
pass
# Fallback: read provides_tools from manifest on disk and query registry
try:
from hermes_cli.plugins import get_bundled_plugins_dir
for base in (get_bundled_plugins_dir(), _plugins_dir()):
if not base.is_dir():
continue
candidate = base / name
if candidate.is_dir():
manifest = _read_manifest(candidate)
for tool_name in manifest.get("provides_tools") or []:
entry = registry.get_entry(tool_name)
if entry and entry.toolset:
return entry.toolset
except Exception:
pass
return None
def _toggle_plugin_toolset(name: str, *, enable: bool) -> None:
"""Add or remove a plugin's toolset from platform_toolsets for all platforms.
Only acts if the plugin actually provides tools (has a toolset key).
"""
toolset_key = _get_plugin_toolset_key(name)
if not toolset_key:
return
from hermes_cli.config import load_config, save_config
config = load_config()
platform_toolsets = config.get("platform_toolsets")
if not isinstance(platform_toolsets, dict):
platform_toolsets = {}
config["platform_toolsets"] = platform_toolsets
changed = False
for platform, ts_list in platform_toolsets.items():
if not isinstance(ts_list, list):
continue
if enable:
if toolset_key not in ts_list:
ts_list.append(toolset_key)
changed = True
else:
if toolset_key in ts_list:
ts_list.remove(toolset_key)
changed = True
# If enabling and no platforms have toolset lists yet, add to "cli" at minimum
if enable and not changed and not platform_toolsets:
platform_toolsets["cli"] = [toolset_key]
changed = True
if changed:
save_config(config)
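The toggle logic in `_toggle_plugin_toolset` can be tried out without touching `config.yaml`. What follows is a minimal sketch, assuming a plain in-memory dict stands in for `load_config()` / `save_config()`. The helper name `toggle_toolset` and the toolset key `my_plugin_tools` are hypothetical and not part of the codebase.

```python
from typing import Dict, List


def toggle_toolset(
    platform_toolsets: Dict[str, List[str]], toolset_key: str, *, enable: bool
) -> bool:
    """Add or remove toolset_key on every platform list; return True if anything changed."""
    changed = False
    for ts_list in platform_toolsets.values():
        if not isinstance(ts_list, list):
            continue  # skip malformed entries, mirroring the code above
        if enable and toolset_key not in ts_list:
            ts_list.append(toolset_key)
            changed = True
        elif not enable and toolset_key in ts_list:
            ts_list.remove(toolset_key)
            changed = True
    # With no platforms configured at all, enabling seeds the "cli" platform.
    if enable and not changed and not platform_toolsets:
        platform_toolsets["cli"] = [toolset_key]
        changed = True
    return changed


config = {"cli": ["core"], "telegram": ["core"]}
first = toggle_toolset(config, "my_plugin_tools", enable=True)
second = toggle_toolset(config, "my_plugin_tools", enable=True)  # no-op the second time

empty: Dict[str, List[str]] = {}
seeded = toggle_toolset(empty, "my_plugin_tools", enable=True)
```

Returning a `changed` flag is what lets the caller skip the config write when the toolset was already in the requested state.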
def dashboard_set_agent_plugin_enabled(name: str, *, enabled: bool) -> dict[str, Any]:
"""Enable or disable a plugin in ``config.yaml`` (runtime allow/deny lists).
For plugins that provide tools (toolsets), also toggles the toolset in
``platform_toolsets`` so the agent actually sees the tools in sessions.
"""
if not _plugin_exists(name):
return {"ok": False, "error": f"Plugin '{name}' is not installed or bundled."}
en = _get_enabled_set()
dis = _get_disabled_set()
if enabled:
if name in en and name not in dis:
return {"ok": True, "name": name, "unchanged": True}
en.add(name)
dis.discard(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=True)
return {"ok": True, "name": name, "unchanged": False}
if name not in en and name in dis:
return {"ok": True, "name": name, "unchanged": True}
en.discard(name)
dis.add(name)
_save_enabled_set(en)
_save_disabled_set(dis)
_toggle_plugin_toolset(name, enable=False)
return {"ok": True, "name": name, "unchanged": False}
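The allow/deny bookkeeping in `dashboard_set_agent_plugin_enabled` reduces to moving a name between two sets and reporting whether anything moved. A minimal sketch of that idea, assuming in-memory sets instead of the persisted lists; `set_plugin_state` and the plugin name `demo` are hypothetical names used only for illustration.

```python
from typing import Set


def set_plugin_state(enabled: Set[str], disabled: Set[str], name: str, *, on: bool) -> bool:
    """Move name into the requested set; return True when the state was already correct."""
    if on:
        if name in enabled and name not in disabled:
            return True  # unchanged
        enabled.add(name)
        disabled.discard(name)
        return False
    if name not in enabled and name in disabled:
        return True  # unchanged
    enabled.discard(name)
    disabled.add(name)
    return False


enabled: Set[str] = set()
disabled: Set[str] = set()
unchanged_first = set_plugin_state(enabled, disabled, "demo", on=True)
unchanged_again = set_plugin_state(enabled, disabled, "demo", on=True)
unchanged_off = set_plugin_state(enabled, disabled, "demo", on=False)
```

A plugin that appears in neither set counts as "changed" in both directions, because the function always normalizes it into exactly one of the two sets.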
def _user_installed_plugin_dir(name: str) -> Optional[Path]:
"""Resolved path under ``~/.hermes/plugins/<name>`` if it exists."""
plugins_dir = _plugins_dir()
try:
target = _sanitize_plugin_name(name, plugins_dir)
except ValueError:
return None
return target if target.is_dir() else None
def dashboard_update_user_plugin(name: str) -> dict[str, Any]:
"""``git pull`` inside ``~/.hermes/plugins/<name>``."""
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {_plugins_dir()}.",
}
if not (target / ".git").exists():
return {
"ok": False,
"error": f"Plugin '{name}' is not a git checkout; cannot pull updates.",
}
ok, msg = _git_pull_plugin_dir(target)
if not ok:
return {"ok": False, "error": msg}
from rich.console import Console
_copy_example_files(target, Console())
unchanged = "Already up to date" in msg
return {"ok": True, "name": name, "output": msg, "unchanged": unchanged}
def _git_pull_plugin_dir(target: Path) -> tuple[bool, str]:
try:
result = subprocess.run(
["git", "pull", "--ff-only"],
capture_output=True,
text=True,
timeout=60,
cwd=str(target),
)
except FileNotFoundError:
return False, "git is not installed or not in PATH."
except subprocess.TimeoutExpired:
return False, "Git pull timed out after 60 seconds."
if result.returncode != 0:
err = (result.stderr or "").strip() or result.stdout.strip()
return False, err or "git pull failed."
return True, result.stdout.strip()
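`_git_pull_plugin_dir` turns every failure mode of a subprocess call into an `(ok, message)` tuple, so both the CLI and the dashboard can decide how to report it. The sketch below shows the same pattern for an arbitrary command. It is an illustration under assumptions rather than the project's helper: `run_tool` is a hypothetical name, and the Python interpreter is used as the child process so the example runs without git.

```python
import subprocess
import sys
from typing import List, Tuple


def run_tool(cmd: List[str], *, timeout: float = 60) -> Tuple[bool, str]:
    """Run cmd and return (ok, message) instead of raising or exiting."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return False, f"{cmd[0]} is not installed or not in PATH."
    except subprocess.TimeoutExpired:
        return False, f"{cmd[0]} timed out after {timeout} seconds."
    if result.returncode != 0:
        # Prefer stderr, fall back to stdout, then to a generic message.
        err = (result.stderr or "").strip() or result.stdout.strip()
        return False, err or f"{cmd[0]} failed."
    return True, result.stdout.strip()


ok_result = run_tool([sys.executable, "-c", "print('Already up to date.')"])
missing_result = run_tool(["hermes-sketch-no-such-binary"])
failed_result = run_tool([sys.executable, "-c", "import sys; sys.exit('boom')"])
```

Keeping `sys.exit` and console printing out of the helper is what allows a web endpoint to reuse it and return the message as JSON.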
def dashboard_remove_user_plugin(name: str) -> dict[str, Any]:
"""Delete a plugin tree under ``~/.hermes/plugins/`` only."""
plugins_dir = _plugins_dir()
for n, _ver, _d, src, _path in _discover_all_plugins():
if n == name and src == "bundled":
return {"ok": False, "error": "Bundled plugins cannot be removed from the dashboard."}
target = _user_installed_plugin_dir(name)
if target is None:
return {
"ok": False,
"error": f"Plugin '{name}' was not found under {plugins_dir}.",
}
shutil.rmtree(target)
return {"ok": True, "name": name}
def plugins_command(args) -> None:
"""Dispatch hermes plugins subcommands."""
action = getattr(args, "plugins_action", None)
+2 -11
View File
@@ -11,7 +11,7 @@ zero migration needed.
Usage::
hermes profile create coder # fresh profile + bundled skills
hermes profile create coder --clone # also copy config, .env, SOUL.md, skills
hermes profile create coder --clone # also copy config, .env, SOUL.md
hermes profile create coder --clone-all # full copy of source profile
coder chat # use via wrapper alias
hermes -p coder chat # or via flag
@@ -411,8 +411,7 @@ def create_profile(
clone_all:
If True, do a full copytree of the source (all state).
clone_config:
If True, copy config files (config.yaml, .env, SOUL.md), installed
skills, and selected profile identity files from the source profile.
If True, copy only config files (config.yaml, .env, SOUL.md).
no_alias:
If True, skip wrapper script creation.
@@ -470,14 +469,6 @@ def create_profile(
if src.exists():
shutil.copy2(src, profile_dir / filename)
# Clone installed skills from the source profile. The dashboard's
# "clone from default" flow is expected to preserve both bundled
# and user-installed skills so the new profile immediately has the
# same agent capabilities as the source profile.
source_skills = source_dir / "skills"
if source_skills.is_dir():
shutil.copytree(source_skills, profile_dir / "skills", dirs_exist_ok=True)
# Clone memory and other subdirectory files
for relpath in _CLONE_SUBDIR_FILES:
src = source_dir / relpath
+2 -11
View File
@@ -358,20 +358,11 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
return None
if not requested_norm.startswith("custom:"):
try:
canonical = auth_mod.resolve_provider(requested_norm)
auth_mod.resolve_provider(requested_norm)
except AuthError:
pass
else:
# A user-declared ``custom_providers`` entry whose name matches
# only an *alias* (``kimi`` → built-in ``kimi-coding``) is the
# user's intended target — alias rewriting would otherwise hijack
# the request. We only defer to the built-in when the raw name is
# the canonical provider itself (``nous``, ``openrouter``, …) so
# accidentally shadowing a canonical provider still resolves to
# the built-in. See tests/hermes_cli/test_runtime_provider_resolution.py
# ``test_named_custom_provider_does_not_shadow_builtin_provider``.
if (canonical or "").strip().lower() == requested_norm:
return None
return None
config = load_config()
-316
View File
@@ -1,316 +0,0 @@
"""Session recap — summarize what's happened in the current session.
Inspired by Claude Code's `/recap` command (v2.1.114, April 2026), which
shows a one-line summary of what happened while a terminal was unfocused
so users juggling multiple sessions can re-orient quickly.
Source: https://code.claude.com/docs/en/whats-new/2026-w17
Differences from Claude Code:
- Pure local computation from the in-memory conversation history. No
LLM call, no auxiliary model, no prompt-cache invalidation. A
recap should be instant and free.
- Works unchanged on CLI and every gateway platform (Telegram,
Discord, Slack, …) because both call into the same ``build_recap``
helper. Claude Code only shows this on the CLI.
- Tailored to hermes-agent's tool vocabulary (``terminal``, ``patch``,
``write_file``, ``delegate_task``, ``browser_*``, ``web_*``): the
recap surfaces which classes of work were most active.
"""
from __future__ import annotations
import os
from collections import Counter
from typing import Any, Iterable, List, Mapping, Optional, Sequence, Tuple
# How many recent user/assistant turns we consider "recent activity".
_RECENT_TURN_WINDOW = 20
# How many characters of the latest user prompt to show.
_PROMPT_PREVIEW_CHARS = 140
# How many characters of the latest assistant text to show.
_ASSISTANT_PREVIEW_CHARS = 200
# How many recently-touched files to list.
_MAX_FILES_LISTED = 5
# Tool names that identify a file-editing action and the argument key that
# holds the path.
_FILE_EDIT_TOOLS: Mapping[str, str] = {
"write_file": "path",
"patch": "path",
"read_file": "path",
"skill_manage": "file_path",
"skill_view": "file_path",
}
def _coerce_text(value: Any) -> str:
"""Flatten assistant/user ``content`` into a plain string.
Content can be a string or a list of content blocks (for multimodal
or reasoning models). We concatenate every text-like block and
ignore the rest.
"""
if value is None:
return ""
if isinstance(value, str):
return value
if isinstance(value, list):
parts: List[str] = []
for block in value:
if isinstance(block, str):
parts.append(block)
continue
if isinstance(block, Mapping):
text = block.get("text")
if isinstance(text, str) and text:
parts.append(text)
return "\n".join(parts)
return str(value)
def _tool_call_name_and_args(tool_call: Any) -> Tuple[str, Mapping[str, Any]]:
"""Extract ``(name, arguments_dict)`` from a tool_call entry.
``arguments`` may be a JSON string or a dict depending on provider.
Return an empty dict if it cannot be parsed.
"""
if not isinstance(tool_call, Mapping):
return "", {}
fn = tool_call.get("function") or {}
if not isinstance(fn, Mapping):
return "", {}
name = str(fn.get("name") or "")
raw_args = fn.get("arguments")
if isinstance(raw_args, Mapping):
return name, raw_args
if isinstance(raw_args, str) and raw_args:
try:
import json
parsed = json.loads(raw_args)
if isinstance(parsed, Mapping):
return name, parsed
except Exception:
return name, {}
return name, {}
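Providers differ in whether `arguments` arrives as a JSON string or as an already-parsed dict, which is why the extractor accepts both. A compact sketch of that normalization, assuming the chat-completion `{"function": {"name": ..., "arguments": ...}}` shape described above; `name_and_args` is a hypothetical stand-in, and the sample tool calls are made up for illustration.

```python
import json
from typing import Any, Mapping, Tuple


def name_and_args(tool_call: Mapping[str, Any]) -> Tuple[str, Mapping[str, Any]]:
    """Return (name, arguments) with arguments always as a mapping."""
    fn = tool_call.get("function") or {}
    name = str(fn.get("name") or "")
    raw = fn.get("arguments")
    if isinstance(raw, Mapping):
        return name, raw
    if isinstance(raw, str) and raw:
        try:
            parsed = json.loads(raw)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            return name, {}
        if isinstance(parsed, Mapping):
            return name, parsed
    return name, {}


from_string = name_and_args({"function": {"name": "patch", "arguments": '{"path": "a.py"}'}})
from_dict = name_and_args({"function": {"name": "patch", "arguments": {"path": "a.py"}}})
from_garbage = name_and_args({"function": {"name": "patch", "arguments": "{not json"}})
from_array = name_and_args({"function": {"name": "patch", "arguments": "[1, 2]"}})
```

Malformed or non-object arguments degrade to an empty dict, so the tool still counts toward the activity summary even when its path cannot be read.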
def _iter_assistant_tool_calls(
messages: Sequence[Mapping[str, Any]],
) -> Iterable[Tuple[str, Mapping[str, Any]]]:
for msg in messages:
if not isinstance(msg, Mapping):
continue
if msg.get("role") != "assistant":
continue
tool_calls = msg.get("tool_calls") or []
if not isinstance(tool_calls, list):
continue
for tc in tool_calls:
name, args = _tool_call_name_and_args(tc)
if name:
yield name, args
def _count_visible_turns(
messages: Sequence[Mapping[str, Any]],
) -> Tuple[int, int, int]:
"""Return ``(user_turn_count, assistant_turn_count, tool_message_count)``."""
users = assistants = tools = 0
for msg in messages:
if not isinstance(msg, Mapping):
continue
role = msg.get("role")
if role == "user":
users += 1
elif role == "assistant":
assistants += 1
elif role == "tool":
tools += 1
return users, assistants, tools
def _latest_user_prompt(
messages: Sequence[Mapping[str, Any]],
) -> Optional[str]:
for msg in reversed(messages):
if isinstance(msg, Mapping) and msg.get("role") == "user":
text = _coerce_text(msg.get("content")).strip()
if text:
return text
return None
def _latest_assistant_text(
messages: Sequence[Mapping[str, Any]],
) -> Optional[str]:
for msg in reversed(messages):
if not isinstance(msg, Mapping):
continue
if msg.get("role") != "assistant":
continue
text = _coerce_text(msg.get("content")).strip()
if text:
return text
return None
def _recent_window(
messages: Sequence[Mapping[str, Any]], window: int = _RECENT_TURN_WINDOW
) -> List[Mapping[str, Any]]:
"""Return the tail slice of ``messages`` covering at most ``window``
user+assistant turns (tool messages ride along inside the window).
Iterating from the end, we count user and assistant messages and
keep everything from the first message that falls within the window.
"""
count = 0
cut = 0
for i in range(len(messages) - 1, -1, -1):
msg = messages[i]
if isinstance(msg, Mapping) and msg.get("role") in ("user", "assistant"):
count += 1
if count >= window:
cut = i
break
else:
return list(messages)
return list(messages[cut:])
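The windowing rule is easiest to see on a small history. The sketch below re-implements the backward scan in simplified form (it assumes every message is a dict, so the `isinstance` guard is dropped); `recent_window` and the sample messages are illustrative, not taken from the module.

```python
from typing import Any, List, Mapping, Sequence


def recent_window(messages: Sequence[Mapping[str, Any]], window: int) -> List[Mapping[str, Any]]:
    """Keep the tail that contains at most `window` user/assistant messages."""
    count = 0
    for i in range(len(messages) - 1, -1, -1):
        if messages[i].get("role") in ("user", "assistant"):
            count += 1
            if count >= window:
                return list(messages[i:])  # cut at the window-th turn from the end
    return list(messages)  # fewer turns than the window: keep everything


history = [
    {"role": "user", "content": "fix the test"},
    {"role": "assistant", "content": ""},
    {"role": "tool", "content": "ok"},
    {"role": "assistant", "content": "done"},
    {"role": "user", "content": "thanks"},
]
roles_2 = [m["role"] for m in recent_window(history, 2)]
roles_3 = [m["role"] for m in recent_window(history, 3)]
roles_all = [m["role"] for m in recent_window(history, 99)]
```

Tool messages are never counted, but they are kept whenever they sit between two turns that fall inside the window, as the `window=3` case shows.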
def _shortened_path(path: str) -> str:
"""Show a path relative to cwd when possible, otherwise with ~ expansion."""
if not path:
return path
try:
abs_path = os.path.abspath(os.path.expanduser(path))
cwd = os.getcwd()
if abs_path == cwd:
return "."
if abs_path.startswith(cwd + os.sep):
return abs_path[len(cwd) + 1 :]
home = os.path.expanduser("~")
if abs_path.startswith(home + os.sep):
return "~/" + abs_path[len(home) + 1 :]
return abs_path
except Exception:
return path
def _summarise_tool_activity(
tool_calls: Sequence[Tuple[str, Mapping[str, Any]]],
) -> Tuple[List[Tuple[str, int]], List[str]]:
"""Return ``(tool_counts_sorted, recently_edited_files)``.
``tool_counts_sorted`` is descending by count, keeping the full list
so callers can truncate for display. ``recently_edited_files`` lists
distinct paths (most recent first) from file-editing tools.
"""
counter: Counter[str] = Counter()
files_seen: List[str] = []
files_set: set[str] = set()
# Walk in reverse so "most recent first" drops out of order-preserved iteration.
for name, args in reversed(list(tool_calls)):
counter[name] += 1
arg_key = _FILE_EDIT_TOOLS.get(name)
if arg_key:
path = args.get(arg_key)
if isinstance(path, str) and path and path not in files_set:
files_set.add(path)
files_seen.append(_shortened_path(path))
# Counter is order-independent, so the reversed walk does not affect the
# counts. It only determines the order of files_seen, which ends up
# newest→oldest, the order we want for display.
tool_counts = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
return tool_counts, files_seen
def _truncate(text: str, limit: int) -> str:
text = " ".join(text.split()) # collapse newlines for a compact one-liner
if len(text) <= limit:
return text
return text[: limit - 1].rstrip() + "…"
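The truncation helper collapses whitespace first and then reserves one character of the limit for a marker. A minimal sketch of that behavior, assuming the marker is a single `…` character; `truncate` here is a standalone copy for illustration and the sample strings are made up.

```python
def truncate(text: str, limit: int) -> str:
    """Collapse whitespace, then clip to `limit` characters including the marker."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    # Slice to limit - 1 so the marker fits; rstrip avoids a space before it.
    return text[: limit - 1].rstrip() + "…"


short = truncate("run  the\ntests", 40)
clipped = truncate("abcdefghij", 5)
spaced = truncate("abc defghij", 5)
```

Because trailing whitespace is stripped before the marker is appended, the result can be shorter than `limit`, as in the last case.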
def build_recap(
messages: Sequence[Mapping[str, Any]],
*,
session_title: Optional[str] = None,
session_id: Optional[str] = None,
platform: Optional[str] = None,
) -> str:
"""Build a multi-line recap of recent activity.
Inputs:
messages: the full conversation history as a list of
chat-completion-style dicts (``role``, ``content``,
``tool_calls``, …).
session_title: optional human title (from SessionDB).
session_id: optional session id.
platform: optional hint (``"cli"``, ``"telegram"``, …). Does not
change behavior today but is accepted for forward compat.
The output is plain text designed to render well in both a terminal
(with 80-col wrapping) and a gateway message bubble.
"""
_ = platform # reserved for future use
lines: List[str] = []
header_bits: List[str] = ["Session recap"]
if session_title:
header_bits.append(f"{session_title}")
elif session_id:
header_bits.append(f"{session_id[:8]}")
lines.append(" ".join(header_bits))
if not messages:
lines.append(" (nothing to recap — no messages yet)")
return "\n".join(lines)
users, assistants, tool_msgs = _count_visible_turns(messages)
window = _recent_window(messages)
win_users, win_assistants, _ = _count_visible_turns(window)
scope = (
f"{win_users} user turn{'s' if win_users != 1 else ''} / "
f"{win_assistants} assistant repl{'ies' if win_assistants != 1 else 'y'}"
)
if (users, assistants) != (win_users, win_assistants):
scope += f" (of {users}/{assistants} total)"
lines.append(f" Recent: {scope}, {tool_msgs} tool result{'s' if tool_msgs != 1 else ''}")
tool_calls = list(_iter_assistant_tool_calls(window))
tool_counts, files = _summarise_tool_activity(tool_calls)
if tool_counts:
top = ", ".join(f"{name}×{count}" for name, count in tool_counts[:5])
extra = len(tool_counts) - 5
if extra > 0:
top += f" (+{extra} more)"
lines.append(f" Tools used: {top}")
if files:
shown = files[:_MAX_FILES_LISTED]
extra = len(files) - len(shown)
entry = ", ".join(shown)
if extra > 0:
entry += f" (+{extra} more)"
lines.append(f" Files touched: {entry}")
latest_user = _latest_user_prompt(window)
if latest_user:
lines.append(f" Last ask: {_truncate(latest_user, _PROMPT_PREVIEW_CHARS)}")
latest_reply = _latest_assistant_text(window)
if latest_reply:
lines.append(f" Last reply: {_truncate(latest_reply, _ASSISTANT_PREVIEW_CHARS)}")
if len(lines) == 2:
# Only the header + scope line — nothing substantive to show.
lines.append(" (no assistant activity yet in this window)")
return "\n".join(lines)
__all__ = ["build_recap"]
+1 -2
View File
@@ -18,7 +18,6 @@ for reinstall when scopes/commands change.
from __future__ import annotations
import json
import os
import sys
from pathlib import Path
@@ -129,7 +128,7 @@ def slack_manifest_command(args) -> int:
target = Path(get_hermes_home()) / "slack-manifest.json"
except Exception:
target = Path(os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes")) / "slack-manifest.json"
target = Path.home() / ".hermes" / "slack-manifest.json"
else:
target = Path(write_target).expanduser()
target.parent.mkdir(parents=True, exist_ok=True)
-1
View File
@@ -125,7 +125,6 @@ def show_status(args):
keys = {
"OpenRouter": "OPENROUTER_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"NVIDIA": "NVIDIA_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
+2 -517
View File
@@ -345,7 +345,6 @@ _CATEGORY_MERGE: Dict[str, str] = {
"dashboard": "display",
"code_execution": "agent",
"prompt_caching": "agent",
"goals": "agent",
# Only `telegram.reactions` currently lives under telegram — fold it in
# with the other messaging-platform config (discord) so it isn't an
# orphan tab of one field.
@@ -2345,254 +2344,6 @@ async def delete_cron_job(job_id: str):
return {"ok": True}
# ---------------------------------------------------------------------------
# Profile management endpoints (minimal — list/create/rename/delete + SOUL.md)
# ---------------------------------------------------------------------------
class ProfileCreate(BaseModel):
name: str
clone_from_default: bool = False
class ProfileRename(BaseModel):
new_name: str
class ProfileSoulUpdate(BaseModel):
content: str
def _profile_attr(info, name: str, default: Any = None) -> Any:
try:
return getattr(info, name)
except Exception:
return default
def _profile_to_dict(info) -> Dict[str, Any]:
return {
"name": _profile_attr(info, "name", ""),
"path": str(_profile_attr(info, "path", "")),
"is_default": bool(_profile_attr(info, "is_default", False)),
"model": _profile_attr(info, "model"),
"provider": _profile_attr(info, "provider"),
"has_env": bool(_profile_attr(info, "has_env", False)),
"skill_count": int(_profile_attr(info, "skill_count", 0) or 0),
}
def _fallback_profile_dicts(profiles_mod) -> List[Dict[str, Any]]:
def _safe(callable_, default):
try:
return callable_()
except Exception:
return default
profiles: List[Dict[str, Any]] = []
default_home = profiles_mod._get_default_hermes_home()
if default_home.is_dir():
model, provider = _safe(lambda: profiles_mod._read_config_model(default_home), (None, None))
profiles.append({
"name": "default",
"path": str(default_home),
"is_default": True,
"model": model,
"provider": provider,
"has_env": (default_home / ".env").exists(),
"skill_count": _safe(lambda: profiles_mod._count_skills(default_home), 0),
})
profiles_root = profiles_mod._get_profiles_root()
if profiles_root.is_dir():
for entry in sorted(profiles_root.iterdir()):
if not entry.is_dir() or not profiles_mod._PROFILE_ID_RE.match(entry.name):
continue
model, provider = _safe(lambda entry=entry: profiles_mod._read_config_model(entry), (None, None))
profiles.append({
"name": entry.name,
"path": str(entry),
"is_default": False,
"model": model,
"provider": provider,
"has_env": (entry / ".env").exists(),
"skill_count": _safe(lambda entry=entry: profiles_mod._count_skills(entry), 0),
})
return profiles
def _resolve_profile_dir(name: str) -> Path:
"""Validate ``name`` and resolve to its directory or raise an HTTPException."""
from hermes_cli import profiles as profiles_mod
try:
profiles_mod.validate_profile_name(name)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
if not profiles_mod.profile_exists(name):
raise HTTPException(status_code=404, detail=f"Profile '{name}' does not exist.")
return profiles_mod.get_profile_dir(name)
def _profile_setup_command(name: str) -> str:
"""Return the shell command used to configure a profile in the CLI."""
_resolve_profile_dir(name)
return "hermes setup" if name == "default" else f"{name} setup"
@app.get("/api/profiles")
async def list_profiles_endpoint():
from hermes_cli import profiles as profiles_mod
try:
return {"profiles": [_profile_to_dict(p) for p in profiles_mod.list_profiles()]}
except Exception:
_log.exception("GET /api/profiles failed; falling back to profile directory scan")
return {"profiles": _fallback_profile_dicts(profiles_mod)}
@app.post("/api/profiles")
async def create_profile_endpoint(body: ProfileCreate):
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.create_profile(
name=body.name,
clone_from="default" if body.clone_from_default else None,
clone_config=body.clone_from_default,
)
# Match the CLI's profile-create flow: fresh named profiles get the
# bundled skills installed. When cloning from default, create_profile()
# has already copied the source profile's skills, including any
# user-installed skills.
if not body.clone_from_default:
profiles_mod.seed_profile_skills(path, quiet=True)
# Match the CLI's profile-create flow: named profiles should get a
# wrapper in ~/.local/bin when the alias is safe to create.
collision = profiles_mod.check_alias_collision(body.name)
if not collision:
profiles_mod.create_wrapper_script(body.name)
except (ValueError, FileExistsError, FileNotFoundError) as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("POST /api/profiles failed")
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "name": body.name, "path": str(path)}
@app.get("/api/profiles/{name}/setup-command")
async def get_profile_setup_command(name: str):
return {"command": _profile_setup_command(name)}
@app.post("/api/profiles/{name}/open-terminal")
async def open_profile_terminal_endpoint(name: str):
try:
command = _profile_setup_command(name)
if sys.platform.startswith("win"):
subprocess.Popen(["cmd.exe", "/c", "start", "", command])
elif sys.platform == "darwin":
escaped = command.replace("\\", "\\\\").replace('"', '\\"')
applescript = (
'tell application "Terminal"\n'
"activate\n"
f'do script "{escaped}"\n'
"end tell"
)
subprocess.Popen(["osascript", "-e", applescript])
else:
terminal_commands = [
("x-terminal-emulator", ["x-terminal-emulator", "-e", "sh", "-lc", command]),
("gnome-terminal", ["gnome-terminal", "--", "sh", "-lc", command]),
("konsole", ["konsole", "-e", "sh", "-lc", command]),
("xfce4-terminal", ["xfce4-terminal", "-e", f"sh -lc '{command}'"]),
("mate-terminal", ["mate-terminal", "-e", f"sh -lc '{command}'"]),
("lxterminal", ["lxterminal", "-e", f"sh -lc '{command}'"]),
("tilix", ["tilix", "-e", "sh", "-lc", command]),
("alacritty", ["alacritty", "-e", "sh", "-lc", command]),
("kitty", ["kitty", "sh", "-lc", command]),
("xterm", ["xterm", "-e", "sh", "-lc", command]),
]
for executable, popen_args in terminal_commands:
if subprocess.call(
["which", executable],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
) == 0:
subprocess.Popen(popen_args)
break
else:
raise HTTPException(
status_code=400,
detail="No supported terminal emulator found",
)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except HTTPException:
raise
except Exception as e:
_log.exception("POST /api/profiles/%s/open-terminal failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "command": command}
@app.patch("/api/profiles/{name}")
async def rename_profile_endpoint(name: str, body: ProfileRename):
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.rename_profile(name, body.new_name)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except (ValueError, FileExistsError) as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("PATCH /api/profiles/%s failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "name": body.new_name, "path": str(path)}
@app.delete("/api/profiles/{name}")
async def delete_profile_endpoint(name: str):
"""Delete a profile. The dashboard collects the user's confirmation in
its own dialog before this request, so we always pass ``yes=True`` to
skip the CLI's interactive prompt."""
from hermes_cli import profiles as profiles_mod
try:
path = profiles_mod.delete_profile(name, yes=True)
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
_log.exception("DELETE /api/profiles/%s failed", name)
raise HTTPException(status_code=500, detail=str(e))
return {"ok": True, "path": str(path)}
@app.get("/api/profiles/{name}/soul")
async def get_profile_soul(name: str):
soul_path = _resolve_profile_dir(name) / "SOUL.md"
if soul_path.exists():
try:
return {"content": soul_path.read_text(encoding="utf-8"), "exists": True}
except OSError as e:
raise HTTPException(status_code=500, detail=f"Could not read SOUL.md: {e}")
return {"content": "", "exists": False}
@app.put("/api/profiles/{name}/soul")
async def update_profile_soul(name: str, body: ProfileSoulUpdate):
soul_path = _resolve_profile_dir(name) / "SOUL.md"
try:
soul_path.write_text(body.content, encoding="utf-8")
except OSError as e:
_log.exception("PUT /api/profiles/%s/soul failed", name)
raise HTTPException(status_code=500, detail=f"Could not write SOUL.md: {e}")
return {"ok": True}
# ---------------------------------------------------------------------------
# Skills & Tools endpoints
# ---------------------------------------------------------------------------
@@ -3618,16 +3369,12 @@ def _get_dashboard_plugins(force_rescan: bool = False) -> list:
@app.get("/api/dashboard/plugins")
async def get_dashboard_plugins():
"""Return discovered dashboard plugins (excludes user-hidden ones)."""
"""Return discovered dashboard plugins."""
plugins = _get_dashboard_plugins()
# Read user's hidden plugins list from config.
config = load_config()
hidden: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
# Strip internal fields before sending to frontend and filter out hidden.
# Strip internal fields before sending to frontend.
return [
{k: v for k, v in p.items() if not k.startswith("_")}
for p in plugins
if p["name"] not in hidden
]
@@ -3638,268 +3385,6 @@ async def rescan_dashboard_plugins():
return {"ok": True, "count": len(plugins)}
class _AgentPluginInstallBody(BaseModel):
identifier: str
force: bool = False
enable: bool = True
def _strip_dashboard_manifest(p: Dict[str, Any]) -> Dict[str, Any]:
return {k: v for k, v in p.items() if not k.startswith("_")}
def _merged_plugins_hub() -> Dict[str, Any]:
"""Agent discovery + dashboard manifests + optional provider picker metadata."""
from hermes_cli.plugins_cmd import (
_discover_all_plugins,
_get_current_context_engine,
_get_current_memory_provider,
_discover_context_engines,
_discover_memory_providers,
_get_disabled_set,
_get_enabled_set,
_read_manifest as _read_plugin_manifest_at,
)
dashboard_list = _get_dashboard_plugins()
dash_by_name = {str(p["name"]): p for p in dashboard_list}
disabled_set = _get_disabled_set()
enabled_set = _get_enabled_set()
# Read user-hidden plugins from config for the user_hidden field.
config = load_config()
hidden_plugins: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
plugins_root_resolved = (get_hermes_home() / "plugins").resolve()
rows: List[Dict[str, Any]] = []
for name, version, description, source, dir_str in _discover_all_plugins():
if name in disabled_set:
runtime_status = "disabled"
elif name in enabled_set:
runtime_status = "enabled"
else:
runtime_status = "inactive"
dir_path = Path(dir_str)
dm = dash_by_name.get(name)
has_dash_manifest = dm is not None or (dir_path / "dashboard" / "manifest.json").exists()
under_user_tree = False
try:
dir_path.resolve().relative_to(plugins_root_resolved)
under_user_tree = True
except ValueError:
pass
can_remove_update = (
source in ("user", "git") and under_user_tree and Path(dir_str).is_dir()
)
# Check if this plugin provides tools that require auth
auth_required = False
auth_command = ""
manifest_data = _read_plugin_manifest_at(dir_path)
provides_tools = manifest_data.get("provides_tools") or []
if provides_tools:
try:
from tools.registry import registry
for tname in provides_tools:
entry = registry.get_entry(tname)
if entry and entry.check_fn and not entry.check_fn():
auth_required = True
auth_command = f"hermes auth {name}"
break
except Exception:
pass
rows.append({
"name": name,
"version": version or "",
"description": description or "",
"source": source,
"runtime_status": runtime_status,
"has_dashboard_manifest": has_dash_manifest,
"dashboard_manifest": _strip_dashboard_manifest(dm) if dm else None,
"path": dir_str,
"can_remove": can_remove_update,
"can_update_git": can_remove_update and (Path(dir_str) / ".git").exists(),
"auth_required": auth_required,
"auth_command": auth_command,
"user_hidden": name in hidden_plugins,
})
agent_names = {r["name"] for r in rows}
orphan_dashboard = [
_strip_dashboard_manifest(p)
for p in dashboard_list
if str(p["name"]) not in agent_names
]
memory_providers: List[Dict[str, str]] = []
try:
for n, desc in _discover_memory_providers():
memory_providers.append({"name": n, "description": desc})
except Exception:
memory_providers = []
context_engines: List[Dict[str, str]] = []
try:
for n, desc in _discover_context_engines():
context_engines.append({"name": n, "description": desc})
except Exception:
context_engines = []
return {
"plugins": rows,
"orphan_dashboard_plugins": orphan_dashboard,
"providers": {
"memory_provider": _get_current_memory_provider() or "",
"memory_options": memory_providers,
"context_engine": _get_current_context_engine(),
"context_options": context_engines,
},
}
@app.get("/api/dashboard/plugins/hub")
async def get_plugins_hub(request: Request):
"""Unified agent plugins + dashboard extension metadata (session protected)."""
_require_token(request)
try:
return _merged_plugins_hub()
except Exception as exc:
_log.warning("plugins/hub failed: %s", exc)
raise HTTPException(status_code=500, detail="Failed to build plugins hub.") from exc
@app.post("/api/dashboard/agent-plugins/install")
async def post_agent_plugin_install(request: Request, body: _AgentPluginInstallBody):
_require_token(request)
from hermes_cli.plugins_cmd import dashboard_install_plugin
result = dashboard_install_plugin(
body.identifier.strip(),
force=body.force,
enable=body.enable,
)
if not result.get("ok"):
raise HTTPException(
status_code=400,
detail=result.get("error") or "Install failed.",
)
_get_dashboard_plugins(force_rescan=True)
# Strip internal paths from the response
result.pop("after_install_path", None)
return result
def _validate_plugin_name(name: str) -> str:
"""Reject path-traversal attempts in plugin name URL parameters."""
if not name or "/" in name or "\\" in name or ".." in name:
raise HTTPException(status_code=400, detail="Invalid plugin name.")
return name
@app.post("/api/dashboard/agent-plugins/{name}/enable")
async def post_agent_plugin_enable(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
result = dashboard_set_agent_plugin_enabled(name, enabled=True)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Enable failed.")
return result
@app.post("/api/dashboard/agent-plugins/{name}/disable")
async def post_agent_plugin_disable(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
result = dashboard_set_agent_plugin_enabled(name, enabled=False)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Disable failed.")
return result
@app.post("/api/dashboard/agent-plugins/{name}/update")
async def post_agent_plugin_update(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_update_user_plugin
result = dashboard_update_user_plugin(name)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Update failed.")
_get_dashboard_plugins(force_rescan=True)
return result
@app.delete("/api/dashboard/agent-plugins/{name}")
async def delete_agent_plugin(request: Request, name: str):
_require_token(request)
name = _validate_plugin_name(name)
from hermes_cli.plugins_cmd import dashboard_remove_user_plugin
result = dashboard_remove_user_plugin(name)
if not result.get("ok"):
raise HTTPException(status_code=400, detail=result.get("error") or "Remove failed.")
_get_dashboard_plugins(force_rescan=True)
return result
class _PluginProvidersPutBody(BaseModel):
memory_provider: Optional[str] = None
context_engine: Optional[str] = None
@app.put("/api/dashboard/plugin-providers")
async def put_plugin_providers(request: Request, body: _PluginProvidersPutBody):
"""Persist memory provider / context engine selection (writes config.yaml)."""
_require_token(request)
from hermes_cli.plugins_cmd import (
_save_context_engine,
_save_memory_provider,
)
if body.memory_provider is not None:
_save_memory_provider(body.memory_provider)
if body.context_engine is not None:
_save_context_engine(body.context_engine)
return {"ok": True}
class _PluginVisibilityBody(BaseModel):
hidden: bool
@app.post("/api/dashboard/plugins/{name}/visibility")
async def post_plugin_visibility(request: Request, name: str, body: _PluginVisibilityBody):
"""Toggle a plugin's sidebar visibility (persists to config.yaml dashboard.hidden_plugins)."""
_require_token(request)
name = _validate_plugin_name(name)
config = load_config()
if "dashboard" not in config or not isinstance(config.get("dashboard"), dict):
config["dashboard"] = {}
hidden_list: list = config["dashboard"].get("hidden_plugins") or []
if not isinstance(hidden_list, list):
hidden_list = []
if body.hidden and name not in hidden_list:
hidden_list.append(name)
elif not body.hidden and name in hidden_list:
hidden_list.remove(name)
config["dashboard"]["hidden_plugins"] = hidden_list
save_config(config)
return {"ok": True, "name": name, "hidden": body.hidden}
@app.get("/dashboard-plugins/{plugin_name}/{file_path:path}")
async def serve_plugin_asset(plugin_name: str, file_path: str):
"""Serve static assets from a dashboard plugin directory.
+45 -199
@@ -514,7 +514,7 @@ class SessionDB:
# Session lifecycle
# =========================================================================
def _insert_session_row(
def create_session(
self,
session_id: str,
source: str,
@@ -523,8 +523,8 @@ class SessionDB:
system_prompt: str = None,
user_id: str = None,
parent_session_id: str = None,
) -> None:
"""Shared INSERT OR IGNORE for session rows."""
) -> str:
"""Create a new session record. Returns the session_id."""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
@@ -542,11 +542,8 @@ class SessionDB:
),
)
self._execute_write(_do)
def create_session(self, session_id: str, source: str, **kwargs) -> str:
"""Create a new session record. Returns the session_id."""
self._insert_session_row(session_id, source, **kwargs)
return session_id
def end_session(self, session_id: str, end_reason: str) -> None:
"""Mark a session as ended.
@@ -682,41 +679,21 @@ class SessionDB:
session_id: str,
source: str = "unknown",
model: str = None,
**kwargs,
) -> str:
"""Ensure a session row exists (INSERT OR IGNORE). Accepts optional kwargs."""
self._insert_session_row(session_id, source, model=model, **kwargs)
return session_id
def prune_empty_ghost_sessions(self, sessions_dir: "Optional[Path]" = None) -> int:
"""Remove empty TUI ghost sessions (no messages, no title, >24hr old)."""
cutoff = time.time() - 86400 # Only sessions older than 24 hours
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
def _do(conn):
rows = conn.execute("""
SELECT id FROM sessions
WHERE source = 'tui'
AND title IS NULL
AND ended_at IS NOT NULL
AND started_at < ?
AND NOT EXISTS (
SELECT 1 FROM messages WHERE messages.session_id = sessions.id
)
""", (cutoff,)).fetchall()
ids = [r[0] if isinstance(r, (tuple, list)) else r["id"] for r in rows]
if ids:
placeholders = ",".join("?" * len(ids))
conn.execute(
f"DELETE FROM sessions WHERE id IN ({placeholders})", ids
)
return ids
removed_ids = self._execute_write(_do) or []
# Clean up any on-disk session files (belt-and-suspenders)
if sessions_dir and removed_ids:
for sid in removed_ids:
self._remove_session_files(sessions_dir, sid)
return len(removed_ids)
conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._execute_write(_do)
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
@@ -956,7 +933,6 @@ class SessionDB:
offset: int = 0,
include_children: bool = False,
project_compression_tips: bool = True,
order_by_last_active: bool = False,
) -> List[Dict[str, Any]]:
"""List sessions with preview (first user message) and last active timestamp.
@@ -976,14 +952,6 @@ class SessionDB:
compressed continuations from being invisible to users while keeping
delegate subagents and branches hidden. Pass ``False`` to return the
raw root rows (useful for admin/debug UIs).
Pass ``order_by_last_active=True`` to sort by most-recent activity
instead of original conversation start time. For compression chains,
the "most-recent activity" is taken from the live tip (not the root),
so an old conversation that was compressed and continued recently
surfaces in the correct slot. Ordering is computed at SQL level via
a recursive CTE that walks compression-continuation edges, so LIMIT
and OFFSET still apply efficiently.
"""
where_clauses = []
params = []
@@ -1011,80 +979,25 @@ class SessionDB:
params.extend(exclude_sources)
where_sql = f"WHERE {' AND '.join(where_clauses)}" if where_clauses else ""
if order_by_last_active:
# Compute effective_last_active by walking each surfaced session's
# compression-continuation chain forward in SQL and taking the MAX
# timestamp across the chain. This lets us ORDER BY + LIMIT at SQL
# level instead of fetching every row and sorting in Python, while
# still surfacing old compression roots whose live tip is fresh.
#
# The CTE seeds from rows the outer WHERE admits (roots + branch
# children), then recursively joins forward through
# compression-continuation edges using the same criteria as
# get_compression_tip (parent.end_reason='compression' AND
# child.started_at >= parent.ended_at).
query = f"""
WITH RECURSIVE chain(root_id, cur_id) AS (
SELECT s.id, s.id FROM sessions s {where_sql}
UNION ALL
SELECT c.root_id, child.id
FROM chain c
JOIN sessions parent ON parent.id = c.cur_id
JOIN sessions child ON child.parent_session_id = c.cur_id
WHERE parent.end_reason = 'compression'
AND child.started_at >= parent.ended_at
),
chain_max AS (
SELECT
root_id,
MAX(COALESCE(
(SELECT MAX(m.timestamp) FROM messages m WHERE m.session_id = cur_id),
(SELECT started_at FROM sessions ss WHERE ss.id = cur_id)
)) AS effective_last_active
FROM chain
GROUP BY root_id
)
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active,
COALESCE(cm.effective_last_active, s.started_at) AS _effective_last_active
FROM sessions s
LEFT JOIN chain_max cm ON cm.root_id = s.id
{where_sql}
ORDER BY _effective_last_active DESC, s.started_at DESC, s.id DESC
LIMIT ? OFFSET ?
"""
# WHERE params apply twice (CTE seed + outer select).
params = params + params + [limit, offset]
else:
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{where_sql}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params.extend([limit, offset])
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{where_sql}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params.extend([limit, offset])
with self._lock:
cursor = self._conn.execute(query, params)
rows = cursor.fetchall()
@@ -1098,8 +1011,6 @@ class SessionDB:
s["preview"] = text + ("..." if len(raw) > 60 else "")
else:
s["preview"] = ""
# Drop the internal ordering column so callers see a clean dict.
s.pop("_effective_last_active", None)
sessions.append(s)
# Project compression roots forward to their tips. Each row whose
@@ -1177,48 +1088,6 @@ class SessionDB:
# Message storage
# =========================================================================
# Sentinel prefix used to distinguish JSON-encoded structured content
# (multimodal messages: lists of parts like text + image_url) from plain
# string content. The NUL byte is not legal in normal text, so this
# cannot collide with real user content.
_CONTENT_JSON_PREFIX = "\x00json:"
@classmethod
def _encode_content(cls, content: Any) -> Any:
"""Serialize structured (list/dict) message content for sqlite.
sqlite3 can only bind ``str``, ``bytes``, ``int``, ``float``, and ``None``
to query parameters. Multimodal messages have ``content`` as a list of
parts (``[{"type": "text", ...}, {"type": "image_url", ...}]``), which
raises ``ProgrammingError: Error binding parameter N: type 'list' is
not supported`` when bound directly.
Returns the value unchanged when it's already a safe scalar, or a
sentinel-prefixed JSON string for lists/dicts. Paired with
:meth:`_decode_content` on read.
"""
if content is None or isinstance(content, (str, bytes, int, float)):
return content
try:
return cls._CONTENT_JSON_PREFIX + json.dumps(content)
except (TypeError, ValueError):
# Last-resort fallback: stringify so persistence never fails.
return str(content)
@classmethod
def _decode_content(cls, content: Any) -> Any:
"""Reverse :meth:`_encode_content`; returns scalars unchanged."""
if isinstance(content, str) and content.startswith(cls._CONTENT_JSON_PREFIX):
try:
return json.loads(content[len(cls._CONTENT_JSON_PREFIX):])
except (json.JSONDecodeError, TypeError):
logger.warning(
"Failed to decode JSON-encoded message content; "
"returning raw string"
)
return content
return content
def append_message(
self,
session_id: str,
@@ -1255,9 +1124,6 @@ class SessionDB:
if codex_message_items else None
)
tool_calls_json = json.dumps(tool_calls) if tool_calls else None
# Multimodal content (list of parts) must be JSON-encoded: sqlite3
# cannot bind list/dict parameters directly.
stored_content = self._encode_content(content)
# Pre-compute tool call count
num_tool_calls = 0
@@ -1274,7 +1140,7 @@ class SessionDB:
(
session_id,
role,
stored_content,
content,
tool_call_id,
tool_calls_json,
tool_name,
@@ -1357,7 +1223,7 @@ class SessionDB:
(
session_id,
role,
self._encode_content(msg.get("content")),
msg.get("content"),
msg.get("tool_call_id"),
tool_calls_json,
msg.get("tool_name"),
@@ -1396,8 +1262,6 @@ class SessionDB:
result = []
for row in rows:
msg = dict(row)
if "content" in msg:
msg["content"] = self._decode_content(msg["content"])
if msg.get("tool_calls"):
try:
msg["tool_calls"] = json.loads(msg["tool_calls"])
@@ -1487,15 +1351,15 @@ class SessionDB:
placeholders = ",".join("?" for _ in session_ids)
rows = self._conn.execute(
"SELECT role, content, tool_call_id, tool_calls, tool_name, "
"finish_reason, reasoning, reasoning_content, reasoning_details, "
"codex_reasoning_items, codex_message_items "
"reasoning, reasoning_content, reasoning_details, codex_reasoning_items, "
"codex_message_items "
f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
tuple(session_ids),
).fetchall()
messages = []
for row in rows:
content = self._decode_content(row["content"])
content = row["content"]
if row["role"] in {"user", "assistant"} and isinstance(content, str):
content = sanitize_context(content).strip()
msg = {"role": row["role"], "content": content}
@@ -1513,8 +1377,6 @@ class SessionDB:
# that replay reasoning (OpenRouter, OpenAI, Nous) receive
# coherent multi-turn reasoning context.
if row["role"] == "assistant":
if row["finish_reason"]:
msg["finish_reason"] = row["finish_reason"]
if row["reasoning"]:
msg["reasoning"] = row["reasoning"]
if row["reasoning_content"] is not None:
@@ -1882,26 +1744,10 @@ class SessionDB:
)""",
(match["id"], match["id"]),
)
context_msgs = []
for r in ctx_cursor.fetchall():
raw = r["content"]
decoded = self._decode_content(raw)
# Multimodal context: render a compact text-only
# summary for search previews.
if isinstance(decoded, list):
text_parts = [
p.get("text", "") for p in decoded
if isinstance(p, dict) and p.get("type") == "text"
]
text = " ".join(t for t in text_parts if t).strip()
preview = text or "[multimodal content]"
elif isinstance(decoded, str):
preview = decoded
else:
preview = ""
context_msgs.append(
{"role": r["role"], "content": preview[:200]}
)
context_msgs = [
{"role": r["role"], "content": (r["content"] or "")[:200]}
for r in ctx_cursor.fetchall()
]
match["context"] = context_msgs
except Exception:
match["context"] = []
+6 -7
@@ -356,17 +356,12 @@ def _compute_tool_definitions(
else:
if not quiet_mode:
print(f"⚠️ Unknown toolset: {toolset_name}")
else:
# Default: start with everything
elif disabled_toolsets:
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Always apply disabled toolsets as a subtraction step at the end.
# This ensures that even if a composite toolset (like hermes-cli)
# is enabled, any tools belonging to a disabled toolset are strictly
# stripped out. See issue #17309.
if disabled_toolsets:
for toolset_name in disabled_toolsets:
if validate_toolset(toolset_name):
resolved = resolve_toolset(toolset_name)
@@ -381,6 +376,10 @@ def _compute_tool_definitions(
else:
if not quiet_mode:
print(f"⚠️ Unknown toolset: {toolset_name}")
else:
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Plugin-registered tools are now resolved through the normal toolset
# path — validate_toolset() / resolve_toolset() / get_all_toolsets()
+1 -1
@@ -163,7 +163,7 @@
for entry in "''${ENTRIES[@]}"; do
IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --rebuild --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
echo " ok"
+1 -1
@@ -4,7 +4,7 @@ let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-a/HGI9OgVcTnZrMXA7xFMGnFoVxyHe95fulVz+WNYB0=";
hash = "sha256-Chz+NW9NXqboXHOa6PKwf5bhAkkcFtKNhvKWwg2XSPc=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -2960,7 +2960,7 @@ class Migrator:
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Migrate OpenClaw user state into Hermes Agent.")
parser.add_argument("--source", default=str(Path.home() / ".openclaw"), help="OpenClaw home directory")
parser.add_argument("--target", default=os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes"), help="Hermes home directory")
parser.add_argument("--target", default=str(Path.home() / ".hermes"), help="Hermes home directory")
parser.add_argument(
"--workspace-target",
help="Optional workspace root where the workspace instructions file should be copied",
@@ -1,217 +0,0 @@
---
name: here.now
description: Publish static sites to {slug}.here.now and store private files in cloud Drives for agent-to-agent handoff.
version: 1.15.3
author: here.now
license: MIT
prerequisites:
commands: [curl, file, jq]
platforms: [macos, linux]
metadata:
hermes:
tags: [here.now, herenow, publish, deploy, hosting, static-site, web, share, URL, drive, storage]
homepage: https://here.now
requires_toolsets: [terminal]
---
# here.now
here.now lets agents publish websites and store private files in cloud Drives.
Use here.now for two jobs:
- **Sites**: publish websites and files at `{slug}.here.now`.
- **Drives**: store private agent files in cloud folders.
## Current docs
**Before answering questions about here.now capabilities, features, or workflows, read the current docs:**
→ **https://here.now/docs**
Read the docs:
- at the first here.now-related interaction in a conversation
- any time the user asks how to do something
- any time the user asks what is possible, supported, or recommended
- before telling the user a feature is unsupported
Topics that require current docs (do not rely on local skill text alone):
- Drives and Drive sharing
- custom domains
- payments and payment gating
- forking
- proxy routes and service variables
- handles and links
- limits and quotas
- SPA routing
- error handling and remediation
- feature availability
**If docs and live API behavior disagree, trust the live API behavior.**
If the docs fetch fails or times out, continue with the local skill and live API/script output. Prefer live API behavior for active operations.
## Requirements
- Required binaries: `curl`, `file`, `jq`
- Optional environment variable: `$HERENOW_API_KEY`
- Optional Drive token variable: `$HERENOW_DRIVE_TOKEN`
- Optional credentials file: `~/.herenow/credentials`
- Skill helper paths:
- `${HERMES_SKILL_DIR}/scripts/publish.sh` for publishing sites
- `${HERMES_SKILL_DIR}/scripts/drive.sh` for private Drive storage
## Create a site
```bash
PUBLISH="${HERMES_SKILL_DIR}/scripts/publish.sh"
bash "$PUBLISH" {file-or-dir} --client hermes
```
Outputs the live URL (e.g. `https://bright-canvas-a7k2.here.now/`).
Under the hood this is a three-step flow: create/update -> upload files -> finalize. A site is not live until finalize succeeds.
Without an API key this creates an **anonymous site** that expires in 24 hours.
With a saved API key, the site is permanent.
**File structure:** For HTML sites, place `index.html` at the root of the directory you publish, not inside a subdirectory. The directory's contents become the site root. For example, publish `my-site/` where `my-site/index.html` exists — don't publish a parent folder that contains `my-site/`.
You can also publish raw files without any HTML. Single files get a rich auto-viewer (images, PDF, video, audio). Multiple files get an auto-generated directory listing with folder navigation and an image gallery.
## Update an existing site
```bash
PUBLISH="${HERMES_SKILL_DIR}/scripts/publish.sh"
bash "$PUBLISH" {file-or-dir} --slug {slug} --client hermes
```
The script auto-loads the `claimToken` from `.herenow/state.json` when updating anonymous sites. Pass `--claim-token {token}` to override.
Authenticated updates require a saved API key.
## Use a Drive
Use a Drive when the user wants private cloud storage for agent files: documents, context, memory, plans, assets, media, research, code, and anything else that should persist without being published as a website.
Every signed-in account has a default Drive named `My Drive`.
```bash
DRIVE="${HERMES_SKILL_DIR}/scripts/drive.sh"
bash "$DRIVE" default
bash "$DRIVE" ls "My Drive"
bash "$DRIVE" put "My Drive" notes/today.md --from ./notes/today.md
bash "$DRIVE" cat "My Drive" notes/today.md
bash "$DRIVE" share "My Drive" --perms write --prefix notes/ --ttl 7d
```
Use scoped Drive tokens for agent-to-agent handoff. If you receive a `herenow_drive` share block, use its `token` as `Authorization: Bearer <token>` against `api_base`, respect `pathPrefix` when present, and preserve ETags on writes. A `pathPrefix` of `null` means full-Drive access. If the skill is available, prefer `drive.sh`; otherwise call the listed API operations directly.
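As a rough illustration of the handoff rules above, the following Python sketch builds the bearer header from a share block and checks a path against `pathPrefix`. The helper names `auth_headers` and `path_allowed` are hypothetical. Treating the prefix as a plain string-prefix match is an assumption, not documented behavior, so the live API remains the authority on what a token may access.

```python
from typing import Dict


def auth_headers(share: dict) -> Dict[str, str]:
    """Build the Authorization header from a share block's token.

    Hypothetical helper: only the "Authorization: Bearer <token>" form
    comes from the skill text.
    """
    return {"Authorization": f"Bearer {share['token']}"}


def path_allowed(share: dict, path: str) -> bool:
    """Return True if `path` falls inside the share's scope.

    Assumption: scope is a simple string-prefix match on the Drive path.
    A pathPrefix of None (JSON null) means full-Drive access.
    """
    prefix = share.get("pathPrefix")
    if prefix is None:
        return True
    return path.startswith(prefix)
```

A local check like this only avoids requests that are obviously out of scope; the server still decides whether a request is permitted.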
## API key storage
The publish script reads the API key from these sources (first match wins):
1. `--api-key {key}` flag (CI/scripting only — avoid in interactive use)
2. `$HERENOW_API_KEY` environment variable
3. `~/.herenow/credentials` file (recommended for agents)
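The lookup order above could be sketched in Python roughly as follows. This is illustrative only: `resolve_api_key` and its parameters are hypothetical names, not part of `publish.sh`. Only the three sources and their precedence come from the list above, and stripping whitespace from the credentials file mirrors what `drive.sh` does.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    flag_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the first API key found, or None (anonymous mode).

    Hypothetical helper mirroring the documented precedence:
    --api-key flag, then $HERENOW_API_KEY, then the credentials file.
    """
    env = os.environ if env is None else env
    if credentials_path is None:
        credentials_path = Path.home() / ".herenow" / "credentials"

    # 1. An explicit --api-key flag wins.
    if flag_value:
        return flag_value

    # 2. The $HERENOW_API_KEY environment variable.
    env_key = env.get("HERENOW_API_KEY", "")
    if env_key:
        return env_key

    # 3. The credentials file, with all whitespace removed.
    if credentials_path.is_file():
        key = "".join(credentials_path.read_text(encoding="utf-8").split())
        return key or None

    return None
```

Returning `None` rather than raising reflects that publishing without a key is a supported (anonymous) mode.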
To store a key, write it to the credentials file:
```bash
mkdir -p ~/.herenow && echo "{API_KEY}" > ~/.herenow/credentials && chmod 600 ~/.herenow/credentials
```
**IMPORTANT**: After receiving an API key, save it immediately — run the command above yourself. Do not ask the user to run it manually. Avoid passing the key via CLI flags (e.g. `--api-key`) in interactive sessions; the credentials file is the preferred storage method.
Never commit credentials or local state files (`~/.herenow/credentials`, `.herenow/state.json`) to source control.
## Getting an API key
To upgrade from anonymous (24h) to permanent sites:
1. Ask the user for their email address.
2. Request a one-time sign-in code:
```bash
curl -sS https://here.now/api/auth/agent/request-code \
-H "content-type: application/json" \
-d '{"email": "user@example.com"}'
```
3. Tell the user: "Check your inbox for a sign-in code from here.now and paste it here."
4. Verify the code and get the API key:
```bash
curl -sS https://here.now/api/auth/agent/verify-code \
-H "content-type: application/json" \
-d '{"email":"user@example.com","code":"ABCD-2345"}'
```
5. Save the returned `apiKey` yourself (do not ask the user to do this):
```bash
mkdir -p ~/.herenow && echo "{API_KEY}" > ~/.herenow/credentials && chmod 600 ~/.herenow/credentials
```
## State file
After every site create/update, the script writes to `.herenow/state.json` in the working directory:
```json
{
"publishes": {
"bright-canvas-a7k2": {
"siteUrl": "https://bright-canvas-a7k2.here.now/",
"claimToken": "abc123",
"claimUrl": "https://here.now/claim?slug=bright-canvas-a7k2&token=abc123",
"expiresAt": "2026-02-18T01:00:00.000Z"
}
}
}
```
Before creating or updating sites, you may check this file to find prior slugs.
Treat `.herenow/state.json` as internal cache only.
Never present this local file path as a URL, and never use it as source of truth for auth mode, expiry, or claim URL.
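A minimal sketch of looking up prior slugs, assuming the `publishes` layout shown in the JSON example above. The function name `prior_slugs` is hypothetical. It deliberately returns slugs only, since the file is a cache and should not be used for auth mode, expiry, or claim URLs.

```python
import json
from pathlib import Path
from typing import List


def prior_slugs(state_path: Path = Path(".herenow/state.json")) -> List[str]:
    """Return slugs recorded under "publishes", or [] if unavailable.

    Hypothetical helper: a missing or malformed file is treated as
    "no prior sites" rather than as an error.
    """
    try:
        data = json.loads(state_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON.
        return []
    publishes = data.get("publishes") if isinstance(data, dict) else None
    if not isinstance(publishes, dict):
        return []
    return list(publishes)
```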
## What to tell the user
For published sites:
- Always share the `siteUrl` from the current script run.
- Read and follow `publish_result.*` lines from script stderr to determine auth mode.
- When `publish_result.auth_mode=authenticated`: tell the user the site is **permanent** and saved to their account. No claim URL is needed.
- When `publish_result.auth_mode=anonymous`: tell the user the site **expires in 24 hours**. Share the claim URL (if `publish_result.claim_url` is non-empty and starts with `https://`) so they can keep it permanently. Warn that claim tokens are only returned once and cannot be recovered.
- Never tell the user to inspect `.herenow/state.json` for claim URLs or auth status.
For Drives:
- Do not describe Drive files as public URLs.
- Tell the user Drive contents are private unless shared with a scoped token.
- When sharing access with another agent, prefer a scoped token with a narrow `pathPrefix` and short TTL.
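The published-site rules above can be sketched as a small parser over the script's stderr. This assumes each result line has the form `publish_result.<key>=<value>`, which is inferred from the `publish_result.auth_mode=authenticated` example rather than from a documented format. The function names and message wording are hypothetical.

```python
from typing import Dict

_PREFIX = "publish_result."


def parse_publish_result(stderr_text: str) -> Dict[str, str]:
    """Collect publish_result.* key/value pairs from stderr text.

    Assumption: one "publish_result.<key>=<value>" pair per line.
    """
    result: Dict[str, str] = {}
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line.startswith(_PREFIX):
            continue
        # Split on the first "=" only; claim URLs contain "=" themselves.
        key, sep, value = line[len(_PREFIX):].partition("=")
        if sep:
            result[key] = value
    return result


def user_message(site_url: str, result: Dict[str, str]) -> str:
    """Compose what to tell the user, following the rules above."""
    mode = result.get("auth_mode")
    if mode == "authenticated":
        return f"{site_url} is permanent and saved to your account."
    if mode == "anonymous":
        message = f"{site_url} expires in 24 hours."
        claim_url = result.get("claim_url", "")
        if claim_url.startswith("https://"):
            message += (
                f" Claim it to keep it permanently: {claim_url}"
                " (the claim token is only returned once and cannot be recovered)."
            )
        return message
    # Unknown mode: share the URL without making claims about expiry.
    return site_url
```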
## publish.sh options
| Flag | Description |
| ---------------------- | -------------------------------------------- |
| `--slug {slug}` | Update an existing site instead of creating |
| `--claim-token {token}`| Override claim token for anonymous updates |
| `--title {text}` | Viewer title (non-HTML sites) |
| `--description {text}` | Viewer description |
| `--ttl {seconds}` | Set expiry (authenticated only) |
| `--client {name}` | Agent name for attribution (e.g. `hermes`) |
| `--base-url {url}` | API base URL (default: `https://here.now`) |
| `--allow-nonherenow-base-url` | Allow sending auth to non-default `--base-url` |
| `--api-key {key}` | API key override (prefer credentials file) |
| `--spa` | Enable SPA routing (serve index.html for unknown paths) |
| `--forkable` | Allow others to fork this site |
## Beyond publish.sh
For Drive operations, use `drive.sh` or the Drive API. For broader account and site management — delete, metadata, passwords, payments, domains, handles, links, variables, proxy routes, forking, duplication, and more — see the current docs:
→ **https://here.now/docs**
Full docs: https://here.now/docs
@@ -1,406 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="https://here.now"
CREDENTIALS_FILE="$HOME/.herenow/credentials"
API_KEY="${HERENOW_API_KEY:-}"
DRIVE_TOKEN="${HERENOW_DRIVE_TOKEN:-}"
ALLOW_NON_HERENOW_BASE_URL=0
MAX_FILE_BYTES=$((500 * 1024 * 1024))
usage() {
cat <<'USAGE'
Usage: drive.sh [global options] <command> [args]
Global options:
--api-key <key> Account API key (or $HERENOW_API_KEY / ~/.herenow/credentials)
--token <drv_live_...> Drive token (or $HERENOW_DRIVE_TOKEN)
--base-url <url> API base (default: https://here.now)
--allow-nonherenow-base-url
Commands:
create [name] [--default]
default
ls
ls <drive> [prefix]
cat <drive> <path>
put <drive> <path> --from <local-file>
import <drive> <prefix> --from <local-folder> [--dry-run]
export <drive> <prefix> --to <local-folder> [--dry-run]
rm <drive> <path> [--recursive --confirm <path>]
share <drive> --perms read|write [--prefix notes/] [--ttl 30d] [--label text] [--manage-tokens]
tokens <drive>
revoke <drive> <tokenId>
delete <drive> --confirm "<drive name>"
USAGE
exit 1
}
die() { echo "error: $1" >&2; exit 1; }
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
BUNDLED_JQ="${SKILL_DIR}/bin/jq"
if [[ -x "$BUNDLED_JQ" ]]; then
JQ_BIN="$BUNDLED_JQ"
elif command -v jq >/dev/null 2>&1; then
JQ_BIN="$(command -v jq)"
else
die "requires jq"
fi
for cmd in curl file; do
command -v "$cmd" >/dev/null 2>&1 || die "requires $cmd"
done
while [[ $# -gt 0 ]]; do
case "$1" in
--api-key) API_KEY="$2"; shift 2 ;;
--token) DRIVE_TOKEN="$2"; shift 2 ;;
--base-url) BASE_URL="$2"; shift 2 ;;
--allow-nonherenow-base-url) ALLOW_NON_HERENOW_BASE_URL=1; shift ;;
--help|-h) usage ;;
--*) die "unknown global option: $1" ;;
*) break ;;
esac
done
CMD="${1:-}"
[[ -n "$CMD" ]] || usage
shift || true
if [[ -z "$API_KEY" && -z "$DRIVE_TOKEN" && -f "$CREDENTIALS_FILE" ]]; then
API_KEY=$(tr -d '[:space:]' < "$CREDENTIALS_FILE")
fi
BASE_URL="${BASE_URL%/}"
if [[ "$BASE_URL" != "https://here.now" && "$ALLOW_NON_HERENOW_BASE_URL" -ne 1 ]]; then
if [[ -n "$API_KEY" || -n "$DRIVE_TOKEN" ]]; then
die "refusing to send credentials to non-default base URL; pass --allow-nonherenow-base-url to override"
fi
fi
auth_header=()
if [[ -n "$DRIVE_TOKEN" ]]; then
auth_header=(-H "authorization: Bearer $DRIVE_TOKEN")
elif [[ -n "$API_KEY" ]]; then
auth_header=(-H "authorization: Bearer $API_KEY")
else
die "missing credentials; set HERENOW_API_KEY, HERENOW_DRIVE_TOKEN, or ~/.herenow/credentials"
fi
compute_sha256() {
local f="$1"
if command -v sha256sum >/dev/null 2>&1; then
sha256sum "$f" | cut -d' ' -f1
else
shasum -a 256 "$f" | cut -d' ' -f1
fi
}
guess_content_type() {
local f="$1"
case "${f##*.}" in
html|htm) echo "text/html; charset=utf-8" ;;
css) echo "text/css; charset=utf-8" ;;
js|mjs) echo "text/javascript; charset=utf-8" ;;
json) echo "application/json; charset=utf-8" ;;
md|txt) echo "text/plain; charset=utf-8" ;;
svg) echo "image/svg+xml" ;;
png) echo "image/png" ;;
jpg|jpeg) echo "image/jpeg" ;;
gif) echo "image/gif" ;;
webp) echo "image/webp" ;;
pdf) echo "application/pdf" ;;
*) file --brief --mime-type "$f" 2>/dev/null || echo "application/octet-stream" ;;
esac
}
api_json() {
local method="$1"; shift
local url="$1"; shift
local body="${1:-}"
local tmp
tmp=$(mktemp)
local code
if [[ -n "$body" ]]; then
code=$(curl -sS -o "$tmp" -w "%{http_code}" -X "$method" "$url" "${auth_header[@]}" -H "content-type: application/json" -d "$body")
else
code=$(curl -sS -o "$tmp" -w "%{http_code}" -X "$method" "$url" "${auth_header[@]}")
fi
if [[ "$code" -lt 200 || "$code" -ge 300 ]]; then
local err
err=$("$JQ_BIN" -r '.error // empty' "$tmp" 2>/dev/null || true)
[[ -n "$err" ]] || err="$(cat "$tmp")"
rm -f "$tmp"
die "HTTP $code: $err"
fi
cat "$tmp"
rm -f "$tmp"
}
urlenc() {
"$JQ_BIN" -nr --arg v "$1" '$v|@uri'
}
urlenc_path() {
local path="$1"
local out=""
local part parts
IFS='/' read -r -a parts <<< "$path"
for part in "${parts[@]}"; do
[[ -n "$out" ]] && out="$out/"
out="$out$(urlenc "$part")"
done
echo "$out"
}
resolve_drive() {
local name="$1"
if [[ "$name" == drv_* ]]; then
echo "$name"
return
fi
if [[ -n "$DRIVE_TOKEN" ]]; then
die "drive tokens must reference drives by drv_ id; use account credentials to resolve drive names"
fi
if [[ "$name" == "default" || "$name" == "my-drive" || "$name" == "My Drive" ]]; then
api_json GET "$BASE_URL/api/v1/drives/default" | "$JQ_BIN" -r '.drive.id'
return
fi
local rows count
rows=$(api_json GET "$BASE_URL/api/v1/drives" | "$JQ_BIN" --arg n "$name" '[.drives[] | select(.name == $n)]')
count=$(echo "$rows" | "$JQ_BIN" 'length')
[[ "$count" -eq 1 ]] || die "drive name '$name' matched $count drives; use a drv_ id"
echo "$rows" | "$JQ_BIN" -r '.[0].id'
}
drive_head() {
local id="$1"
api_json GET "$BASE_URL/api/v1/drives/$id" | "$JQ_BIN" -r '.drive.headVersionId // .headVersionId // empty'
}
file_meta() {
local id="$1"
local path="$2"
local prefix
prefix=$(urlenc "$path")
api_json GET "$BASE_URL/api/v1/drives/$id/files?prefix=$prefix&limit=200" | "$JQ_BIN" -c --arg p "$path" '.files[]? | select(.path == $p)' | head -n 1
}
put_file() {
local drive="$1"; shift
local path="$1"; shift
local local_file=""
while [[ $# -gt 0 ]]; do
case "$1" in
--from) local_file="$2"; shift 2 ;;
*) die "unexpected put argument: $1" ;;
esac
done
[[ -f "$local_file" ]] || die "--from must be a file"
local id sz ct sha meta body upload upload_url upload_id http_code etag
id=$(resolve_drive "$drive")
sz=$(wc -c < "$local_file" | tr -d ' ')
[[ "$sz" -le "$MAX_FILE_BYTES" ]] || die "$path exceeds the $MAX_FILE_BYTES byte Drive file limit"
ct=$(guess_content_type "$local_file")
sha=$(compute_sha256 "$local_file")
meta=$(file_meta "$id" "$path" || true)
body=$("$JQ_BIN" -n --arg p "$path" --argjson s "$sz" --arg c "$ct" --arg sha "$sha" \
'{path:$p,size:$s,contentType:$c,sha256:$sha}')
if [[ -n "$meta" ]]; then
etag=$(echo "$meta" | "$JQ_BIN" -r '.etag')
body=$(echo "$body" | "$JQ_BIN" --arg e "$etag" '.ifMatch = $e')
else
body=$(echo "$body" | "$JQ_BIN" '.ifNoneMatch = "*"')
fi
upload=$(api_json POST "$BASE_URL/api/v1/drives/$id/files/uploads" "$body")
upload_url=$(echo "$upload" | "$JQ_BIN" -r '.uploadUrl')
upload_id=$(echo "$upload" | "$JQ_BIN" -r '.uploadId')
http_code=$(curl -sS -o /dev/null -w "%{http_code}" -X PUT "$upload_url" -H "Content-Type: $ct" --data-binary "@$local_file")
[[ "$http_code" -ge 200 && "$http_code" -lt 300 ]] || die "upload failed for $path (HTTP $http_code)"
api_json POST "$BASE_URL/api/v1/drives/$id/files/finalize" "$("$JQ_BIN" -n --arg u "$upload_id" '{uploadId:$u}')" | "$JQ_BIN" .
}
case "$CMD" in
create)
name=""
is_default="false"
while [[ $# -gt 0 ]]; do
case "$1" in
--default) is_default="true"; shift ;;
*) [[ -z "$name" ]] && name="$1" || die "unexpected argument: $1"; shift ;;
esac
done
body=$("$JQ_BIN" -n --arg n "$name" --argjson d "$is_default" '{isDefault:$d} + (if $n == "" then {} else {name:$n} end)')
api_json POST "$BASE_URL/api/v1/drives" "$body" | "$JQ_BIN" .
;;
default)
api_json GET "$BASE_URL/api/v1/drives/default" | "$JQ_BIN" .
;;
ls)
if [[ $# -eq 0 ]]; then
[[ -z "$DRIVE_TOKEN" ]] || die "drive tokens cannot list drives; pass a drv_ id"
api_json GET "$BASE_URL/api/v1/drives" | "$JQ_BIN" .
else
id=$(resolve_drive "$1")
prefix="${2:-}"
api_json GET "$BASE_URL/api/v1/drives/$id/files?prefix=$(urlenc "$prefix")" | "$JQ_BIN" .
fi
;;
cat)
[[ $# -eq 2 ]] || die "usage: drive.sh cat <drive> <path>"
id=$(resolve_drive "$1")
curl -fsS "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$2")" "${auth_header[@]}"
;;
put)
[[ $# -ge 2 ]] || die "usage: drive.sh put <drive> <path> --from <local-file>"
put_file "$@"
;;
import)
[[ $# -ge 2 ]] || die "usage: drive.sh import <drive> <prefix> --from <local-folder> [--dry-run]"
drive="$1"; prefix="${2%/}"; shift 2
from=""; dry=0
while [[ $# -gt 0 ]]; do
case "$1" in
--from) from="$2"; shift 2 ;;
--dry-run) dry=1; shift ;;
*) die "unexpected import argument: $1" ;;
esac
done
[[ -d "$from" ]] || die "--from must be a folder"
uploaded=0
skipped=0
failed=0
planned=0
while IFS= read -r -d '' f; do
# Quote the prefix so glob characters in $from are literal, and tolerate a trailing slash.
rel="${f#"${from%/}"/}"
[[ "$rel" == .git/* || "$rel" == node_modules/* || "$rel" == ".DS_Store" || "$rel" == */.DS_Store ]] && continue
planned=$((planned + 1))
sz=$(wc -c < "$f" | tr -d ' ')
if [[ "$sz" -gt "$MAX_FILE_BYTES" ]]; then
echo "skip oversized $f ($sz bytes > $MAX_FILE_BYTES)" >&2
skipped=$((skipped + 1))
continue
fi
dest="$rel"
[[ -n "$prefix" ]] && dest="$prefix/$rel"
if [[ "$dry" -eq 1 ]]; then
echo "upload $f -> $dest"
skipped=$((skipped + 1))
else
if (put_file "$drive" "$dest" --from "$f" >/dev/null); then
uploaded=$((uploaded + 1))
else
failed=$((failed + 1))
fi
fi
done < <(find "$from" -type f -print0 | sort -z)
echo "planned=$planned uploaded=$uploaded skipped=$skipped failed=$failed"
[[ "$failed" -eq 0 ]] || exit 1
;;
export)
[[ $# -ge 2 ]] || die "usage: drive.sh export <drive> <prefix> --to <local-folder> [--dry-run]"
id=$(resolve_drive "$1"); prefix="${2%/}"; shift 2
to=""; dry=0
while [[ $# -gt 0 ]]; do
case "$1" in
--to) to="$2"; shift 2 ;;
--dry-run) dry=1; shift ;;
*) die "unexpected export argument: $1" ;;
esac
done
[[ -n "$to" ]] || die "--to is required"
cursor=""
total=0
while true; do
url="$BASE_URL/api/v1/drives/$id/files?prefix=$(urlenc "$prefix")&limit=200"
[[ -n "$cursor" ]] && url="$url&cursor=$(urlenc "$cursor")"
files=$(api_json GET "$url")
while IFS= read -r p; do
[[ -n "$p" ]] || continue
rel="$p"
[[ -n "$prefix" ]] && rel="${p#$prefix/}"
out="$to/$rel"
if [[ "$dry" -eq 1 ]]; then
echo "download $p -> $out"
else
mkdir -p "$(dirname "$out")"
curl -fsS "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$p")" "${auth_header[@]}" -o "$out"
fi
total=$((total + 1))
done < <(echo "$files" | "$JQ_BIN" -r '.files[].path')
cursor=$(echo "$files" | "$JQ_BIN" -r '.nextCursor // empty')
[[ -n "$cursor" ]] || break
done
echo "files=$total"
;;
rm)
[[ $# -ge 2 ]] || die "usage: drive.sh rm <drive> <path> [--recursive --confirm <path>]"
id=$(resolve_drive "$1"); path="$2"; shift 2
recursive=0; confirm=""
while [[ $# -gt 0 ]]; do
case "$1" in
--recursive) recursive=1; shift ;;
--confirm) confirm="$2"; shift 2 ;;
*) die "unexpected rm argument: $1" ;;
esac
done
if [[ "$recursive" -eq 1 ]]; then
[[ "$confirm" == "$path" ]] || die "recursive delete requires --confirm '$path'"
head=$(drive_head "$id")
api_json DELETE "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$path")?recursive=true&baseVersionId=$(urlenc "$head")" | "$JQ_BIN" .
else
meta=$(file_meta "$id" "$path")
[[ -n "$meta" ]] || die "file not found: $path"
etag=$(echo "$meta" | "$JQ_BIN" -r '.etag')
curl -fsS -X DELETE "$BASE_URL/api/v1/drives/$id/files/$(urlenc_path "$path")" "${auth_header[@]}" -H "If-Match: $etag" | "$JQ_BIN" .
fi
;;
share)
[[ $# -ge 1 ]] || die "usage: drive.sh share <drive> --perms read|write [--prefix notes/] [--ttl 30d] [--label text] [--manage-tokens]"
id=$(resolve_drive "$1"); shift
perms="write"; prefix=""; ttl=""; label=""; manage_tokens="false"
while [[ $# -gt 0 ]]; do
case "$1" in
--perms) perms="$2"; shift 2 ;;
--prefix) prefix="$2"; shift 2 ;;
--ttl) ttl="$2"; shift 2 ;;
--label) label="$2"; shift 2 ;;
--manage-tokens) manage_tokens="true"; shift ;;
*) die "unexpected share argument: $1" ;;
esac
done
body=$("$JQ_BIN" -n --arg p "$perms" --arg pp "$prefix" --arg ttl "$ttl" --arg label "$label" --argjson mt "$manage_tokens" \
'{perms:$p} + (if $mt then {manageTokens:true} else {} end) + (if $ttl == "" then {} else {ttl:$ttl} end) + (if $pp == "" then {} else {pathPrefix:$pp} end) + (if $label == "" then {} else {label:$label} end)')
api_json POST "$BASE_URL/api/v1/drives/$id/tokens" "$body" | "$JQ_BIN" -r '.shareBlock'
;;
tokens)
[[ $# -eq 1 ]] || die "usage: drive.sh tokens <drive>"
id=$(resolve_drive "$1")
api_json GET "$BASE_URL/api/v1/drives/$id/tokens" | "$JQ_BIN" .
;;
revoke)
[[ $# -eq 2 ]] || die "usage: drive.sh revoke <drive> <tokenId>"
id=$(resolve_drive "$1")
api_json DELETE "$BASE_URL/api/v1/drives/$id/tokens/$2" | "$JQ_BIN" .
;;
delete)
[[ $# -ge 1 ]] || die "usage: drive.sh delete <drive> --confirm <drive name>"
id=$(resolve_drive "$1"); shift
confirm=""
while [[ $# -gt 0 ]]; do
case "$1" in
--confirm) confirm="$2"; shift 2 ;;
*) die "unexpected delete argument: $1" ;;
esac
done
drive=$(api_json GET "$BASE_URL/api/v1/drives/$id")
name=$(echo "$drive" | "$JQ_BIN" -r '.drive.name')
[[ "$confirm" == "$name" ]] || die "delete requires --confirm '$name'"
api_json DELETE "$BASE_URL/api/v1/drives/$id" | "$JQ_BIN" .
;;
*)
die "unknown command: $CMD"
;;
esac
@@ -1,445 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="https://here.now"
CREDENTIALS_FILE="$HOME/.herenow/credentials"
API_KEY="${HERENOW_API_KEY:-}"
API_KEY_SOURCE="none"
if [[ -n "${HERENOW_API_KEY:-}" ]]; then
API_KEY_SOURCE="env"
fi
ALLOW_NON_HERENOW_BASE_URL=0
SLUG=""
CLAIM_TOKEN=""
TITLE=""
DESCRIPTION=""
TTL=""
CLIENT=""
TARGET=""
FORKABLE=""
SPA_MODE=""
FROM_DRIVE=""
DRIVE_VERSION=""
usage() {
cat <<'USAGE'
Usage: publish.sh <file-or-dir> [options]
Options:
--api-key <key> API key (or set $HERENOW_API_KEY)
--slug <slug> Update existing publish
--claim-token <token> Claim token for anonymous updates
--title <text> Viewer title
--description <text> Viewer description
--ttl <seconds> Expiry (authenticated only)
--client <name> Agent name for attribution (e.g. cursor, claude-code)
--forkable Allow others to fork this site
--spa Enable SPA routing
--from-drive <drv_...> Publish a Drive snapshot instead of local files
--version <dv_...> Drive version for --from-drive (default: current head)
--base-url <url> API base (default: https://here.now)
--allow-nonherenow-base-url
Allow auth requests to non-default API base URL
USAGE
exit 1
}
die() { echo "error: $1" >&2; exit 1; }
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
BUNDLED_JQ="${SKILL_DIR}/bin/jq"
if [[ -x "$BUNDLED_JQ" ]]; then
JQ_BIN="$BUNDLED_JQ"
elif command -v jq >/dev/null 2>&1; then
JQ_BIN="$(command -v jq)"
else
die "requires jq"
fi
for cmd in curl file; do
command -v "$cmd" >/dev/null 2>&1 || die "requires $cmd"
done
while [[ $# -gt 0 ]]; do
case "$1" in
--api-key) API_KEY="$2"; API_KEY_SOURCE="flag"; shift 2 ;;
--slug) SLUG="$2"; shift 2 ;;
--claim-token) CLAIM_TOKEN="$2"; shift 2 ;;
--title) TITLE="$2"; shift 2 ;;
--description) DESCRIPTION="$2"; shift 2 ;;
--ttl) TTL="$2"; shift 2 ;;
--client) CLIENT="$2"; shift 2 ;;
--base-url) BASE_URL="$2"; shift 2 ;;
--allow-nonherenow-base-url) ALLOW_NON_HERENOW_BASE_URL=1; shift ;;
--forkable) FORKABLE="true"; shift ;;
--spa) SPA_MODE="true"; shift ;;
--from-drive) FROM_DRIVE="$2"; shift 2 ;;
--version) DRIVE_VERSION="$2"; shift 2 ;;
--help|-h) usage ;;
-*) die "unknown option: $1" ;;
*) [[ -z "$TARGET" ]] && TARGET="$1" || die "unexpected argument: $1"; shift ;;
esac
done
if [[ -n "$FROM_DRIVE" ]]; then
[[ -z "$TARGET" ]] || die "--from-drive does not accept a local file-or-dir argument"
else
[[ -n "$TARGET" ]] || usage
[[ -e "$TARGET" ]] || die "path does not exist: $TARGET"
fi
# Load API key from credentials file if not provided via flag or env
if [[ -z "$API_KEY" && -f "$CREDENTIALS_FILE" ]]; then
API_KEY=$(tr -d '[:space:]' < "$CREDENTIALS_FILE")
[[ -n "$API_KEY" ]] && API_KEY_SOURCE="credentials"
fi
BASE_URL="${BASE_URL%/}"
STATE_DIR=".herenow"
STATE_FILE="$STATE_DIR/state.json"
# Safety guard: avoid accidentally sending bearer auth to arbitrary endpoints.
if [[ -n "$API_KEY" && "$BASE_URL" != "https://here.now" && "$ALLOW_NON_HERENOW_BASE_URL" -ne 1 ]]; then
die "refusing to send API key to non-default base URL; pass --allow-nonherenow-base-url to override"
fi
# Auto-load claim token from state file for anonymous updates
if [[ -n "$SLUG" && -z "$CLAIM_TOKEN" && -z "$API_KEY" && -f "$STATE_FILE" ]]; then
CLAIM_TOKEN=$("$JQ_BIN" -r --arg s "$SLUG" '.publishes[$s].claimToken // empty' "$STATE_FILE" 2>/dev/null || true)
fi
if [[ -n "$FROM_DRIVE" ]]; then
[[ -n "$API_KEY" ]] || die "--from-drive requires an account API key"
BODY=$("$JQ_BIN" -n --arg d "$FROM_DRIVE" '{driveId:$d}')
[[ -n "$DRIVE_VERSION" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" --arg v "$DRIVE_VERSION" '.versionId = $v')
[[ -n "$SLUG" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" --arg s "$SLUG" '.slug = $s')
if [[ -n "$TITLE" || -n "$DESCRIPTION" ]]; then
viewer="{}"
[[ -n "$TITLE" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg t "$TITLE" '.title = $t')
[[ -n "$DESCRIPTION" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg d "$DESCRIPTION" '.description = $d')
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson v "$viewer" '.viewer = $v')
fi
[[ "$FORKABLE" == "true" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" '.forkable = true')
[[ "$SPA_MODE" == "true" ]] && BODY=$(echo "$BODY" | "$JQ_BIN" '.spaMode = true')
CLIENT_HEADER_VALUE="here-now-publish-sh"
if [[ -n "$CLIENT" ]]; then
normalized_client=$(echo "$CLIENT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9._-' '-')
normalized_client="${normalized_client#-}"
normalized_client="${normalized_client%-}"
if [[ -n "$normalized_client" ]]; then
CLIENT_HEADER_VALUE="${normalized_client}/publish-sh"
fi
fi
echo "publishing from Drive..." >&2
RESPONSE=$(curl -sS -X POST "$BASE_URL/api/v1/publish/from-drive" \
-H "authorization: Bearer $API_KEY" \
-H "x-herenow-client: $CLIENT_HEADER_VALUE" \
-H "content-type: application/json" \
-d "$BODY")
if echo "$RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$RESPONSE" | "$JQ_BIN" -r '.error')
die "$err"
fi
SITE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.siteUrl')
OUT_SLUG=$(echo "$RESPONSE" | "$JQ_BIN" -r '.slug')
CURRENT_VERSION=$(echo "$RESPONSE" | "$JQ_BIN" -r '.currentVersionId')
DRIVE_VERSION_OUT=$(echo "$RESPONSE" | "$JQ_BIN" -r '.driveVersionId')
echo "$SITE_URL"
echo "" >&2
echo "publish_result.site_url=$SITE_URL" >&2
echo "publish_result.slug=$OUT_SLUG" >&2
echo "publish_result.action=from_drive" >&2
echo "publish_result.auth_mode=authenticated" >&2
echo "publish_result.api_key_source=$API_KEY_SOURCE" >&2
echo "publish_result.persistence=permanent" >&2
echo "publish_result.drive_id=$FROM_DRIVE" >&2
echo "publish_result.drive_version_id=$DRIVE_VERSION_OUT" >&2
echo "publish_result.current_version_id=$CURRENT_VERSION" >&2
exit 0
fi
compute_sha256() {
local f="$1"
if command -v sha256sum >/dev/null 2>&1; then
sha256sum "$f" | cut -d' ' -f1
else
shasum -a 256 "$f" | cut -d' ' -f1
fi
}
guess_content_type() {
local f="$1"
case "${f##*.}" in
html|htm) echo "text/html; charset=utf-8" ;;
css) echo "text/css; charset=utf-8" ;;
js|mjs) echo "text/javascript; charset=utf-8" ;;
json) echo "application/json; charset=utf-8" ;;
md|txt) echo "text/plain; charset=utf-8" ;;
svg) echo "image/svg+xml" ;;
png) echo "image/png" ;;
jpg|jpeg) echo "image/jpeg" ;;
gif) echo "image/gif" ;;
webp) echo "image/webp" ;;
pdf) echo "application/pdf" ;;
mp4) echo "video/mp4" ;;
mov) echo "video/quicktime" ;;
mp3) echo "audio/mpeg" ;;
wav) echo "audio/wav" ;;
xml) echo "application/xml" ;;
woff2) echo "font/woff2" ;;
woff) echo "font/woff" ;;
ttf) echo "font/ttf" ;;
ico) echo "image/x-icon" ;;
*)
local detected
detected=$(file --brief --mime-type "$f" 2>/dev/null || echo "application/octet-stream")
echo "$detected"
;;
esac
}
# Build file manifest as JSON array
FILES_JSON="[]"
if [[ -f "$TARGET" ]]; then
sz=$(wc -c < "$TARGET" | tr -d ' ')
ct=$(guess_content_type "$TARGET")
bn=$(basename "$TARGET")
h=$(compute_sha256 "$TARGET")
FILES_JSON=$("$JQ_BIN" -n --arg p "$bn" --argjson s "$sz" --arg c "$ct" --arg h "$h" \
'[{"path":$p,"size":$s,"contentType":$c,"hash":$h}]')
FILE_MAP=$("$JQ_BIN" -n --arg p "$bn" --arg a "$(cd "$(dirname "$TARGET")" && pwd)/$(basename "$TARGET")" \
'{($p):$a}')
elif [[ -d "$TARGET" ]]; then
FILE_MAP="{}"
while IFS= read -r -d '' f; do
# Quote the prefix so glob characters in $TARGET are literal, and tolerate a trailing slash.
rel="${f#"${TARGET%/}"/}"
[[ "$rel" == ".DS_Store" ]] && continue
[[ "$(basename "$rel")" == ".DS_Store" ]] && continue
[[ "$rel" == ".herenow/fork-meta.json" ]] && continue
sz=$(wc -c < "$f" | tr -d ' ')
ct=$(guess_content_type "$f")
h=$(compute_sha256 "$f")
abs=$(cd "$(dirname "$f")" && pwd)/$(basename "$f")
FILES_JSON=$(echo "$FILES_JSON" | "$JQ_BIN" --arg p "$rel" --argjson s "$sz" --arg c "$ct" --arg h "$h" \
'. + [{"path":$p,"size":$s,"contentType":$c,"hash":$h}]')
FILE_MAP=$(echo "$FILE_MAP" | "$JQ_BIN" --arg p "$rel" --arg a "$abs" '. + {($p):$a}')
done < <(find "$TARGET" -type f -print0 | sort -z)
else
die "not a file or directory: $TARGET"
fi
file_count=$(echo "$FILES_JSON" | "$JQ_BIN" 'length')
[[ "$file_count" -gt 0 ]] || die "no files found"
# Read fork-meta.json defaults if present and no explicit flags given
FORK_META=""
if [[ -d "$TARGET" ]]; then
FORK_META_PATH="$TARGET/.herenow/fork-meta.json"
if [[ -f "$FORK_META_PATH" ]]; then
FORK_META=$(cat "$FORK_META_PATH")
if [[ -z "$FORKABLE" ]]; then
FORKABLE=$("$JQ_BIN" -r '.forkable // empty' <<< "$FORK_META" 2>/dev/null || true)
fi
fi
fi
# Build request body
BODY=$(echo "$FILES_JSON" | "$JQ_BIN" '{files: .}')
if [[ -n "$TTL" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson t "$TTL" '.ttlSeconds = $t')
fi
if [[ -n "$TITLE" || -n "$DESCRIPTION" ]]; then
viewer="{}"
[[ -n "$TITLE" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg t "$TITLE" '.title = $t')
[[ -n "$DESCRIPTION" ]] && viewer=$(echo "$viewer" | "$JQ_BIN" --arg d "$DESCRIPTION" '.description = $d')
BODY=$(echo "$BODY" | "$JQ_BIN" --argjson v "$viewer" '.viewer = $v')
fi
if [[ -n "$CLAIM_TOKEN" && -n "$SLUG" && -z "$API_KEY" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" --arg ct "$CLAIM_TOKEN" '.claimToken = $ct')
fi
if [[ "$FORKABLE" == "true" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" '.forkable = true')
fi
if [[ "$SPA_MODE" == "true" ]]; then
BODY=$(echo "$BODY" | "$JQ_BIN" '.spaMode = true')
fi
# Determine endpoint and method
if [[ -n "$SLUG" ]]; then
URL="$BASE_URL/api/v1/publish/$SLUG"
METHOD="PUT"
else
URL="$BASE_URL/api/v1/publish"
METHOD="POST"
fi
# Build auth header
AUTH_ARGS=()
if [[ -n "$API_KEY" ]]; then
AUTH_ARGS=(-H "authorization: Bearer $API_KEY")
fi
AUTH_MODE="anonymous"
if [[ -n "$API_KEY" ]]; then
AUTH_MODE="authenticated"
fi
CLIENT_HEADER_VALUE="here-now-publish-sh"
if [[ -n "$CLIENT" ]]; then
normalized_client=$(echo "$CLIENT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9._-' '-')
normalized_client="${normalized_client#-}"
normalized_client="${normalized_client%-}"
if [[ -n "$normalized_client" ]]; then
CLIENT_HEADER_VALUE="${normalized_client}/publish-sh"
fi
fi
CLIENT_ARGS=(-H "x-herenow-client: $CLIENT_HEADER_VALUE")
# Step 1: Create/update publish
echo "creating publish ($file_count files)..." >&2
RESPONSE=$(curl -sS -X "$METHOD" "$URL" \
"${AUTH_ARGS[@]+"${AUTH_ARGS[@]}"}" \
"${CLIENT_ARGS[@]+"${CLIENT_ARGS[@]}"}" \
-H "content-type: application/json" \
-d "$BODY")
# Check for errors
if echo "$RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$RESPONSE" | "$JQ_BIN" -r '.error')
details=$(echo "$RESPONSE" | "$JQ_BIN" -r '.details // empty')
die "$err${details:+ ($details)}"
fi
OUT_SLUG=$(echo "$RESPONSE" | "$JQ_BIN" -r '.slug')
VERSION_ID=$(echo "$RESPONSE" | "$JQ_BIN" -r '.upload.versionId')
FINALIZE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.upload.finalizeUrl')
SITE_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.siteUrl')
UPLOAD_COUNT=$(echo "$RESPONSE" | "$JQ_BIN" '.upload.uploads | length')
SKIPPED_COUNT=$(echo "$RESPONSE" | "$JQ_BIN" '.upload.skipped // [] | length')
[[ "$OUT_SLUG" != "null" ]] || die "unexpected response: $RESPONSE"
# Step 2: Upload files (skipped files are unchanged from previous version)
if [[ "$SKIPPED_COUNT" -gt 0 ]]; then
echo "uploading $UPLOAD_COUNT files ($SKIPPED_COUNT unchanged, skipped)..." >&2
else
echo "uploading $UPLOAD_COUNT files..." >&2
fi
upload_errors=0
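# C-style loop: `seq 0 -1` counts down on BSD/macOS, which would run the body when there is nothing to upload.
for ((i = 0; i < UPLOAD_COUNT; i++)); do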
upload_path=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].path")
upload_url=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].url")
upload_ct=$(echo "$RESPONSE" | "$JQ_BIN" -r ".upload.uploads[$i].headers[\"Content-Type\"] // empty")
if [[ -f "$TARGET" && ! -d "$TARGET" ]]; then
local_file="$TARGET"
else
local_file=$(echo "$FILE_MAP" | "$JQ_BIN" -r --arg p "$upload_path" '.[$p]')
fi
if [[ ! -f "$local_file" ]]; then
echo "warning: missing local file for $upload_path" >&2
upload_errors=$((upload_errors + 1))
continue
fi
ct_args=()
[[ -n "$upload_ct" ]] && ct_args=(-H "Content-Type: $upload_ct")
http_code=$(curl -sS -o /dev/null -w "%{http_code}" -X PUT "$upload_url" \
"${ct_args[@]+"${ct_args[@]}"}" \
--data-binary "@$local_file")
if [[ "$http_code" -lt 200 || "$http_code" -ge 300 ]]; then
echo "warning: upload failed for $upload_path (HTTP $http_code)" >&2
upload_errors=$((upload_errors + 1))
fi
done
[[ "$upload_errors" -eq 0 ]] || die "$upload_errors file(s) failed to upload"
# Step 3: Finalize
echo "finalizing..." >&2
FIN_RESPONSE=$(curl -sS -X POST "$FINALIZE_URL" \
"${AUTH_ARGS[@]+"${AUTH_ARGS[@]}"}" \
"${CLIENT_ARGS[@]+"${CLIENT_ARGS[@]}"}" \
-H "content-type: application/json" \
-d "$("$JQ_BIN" -n --arg v "$VERSION_ID" '{versionId: $v}')")
if echo "$FIN_RESPONSE" | "$JQ_BIN" -e '.error' >/dev/null 2>&1; then
err=$(echo "$FIN_RESPONSE" | "$JQ_BIN" -r '.error')
die "finalize failed: $err"
fi
# Save state
mkdir -p "$STATE_DIR"
if [[ -f "$STATE_FILE" ]]; then
STATE=$(cat "$STATE_FILE")
else
STATE='{"publishes":{}}'
fi
entry=$("$JQ_BIN" -n --arg s "$SITE_URL" '{siteUrl: $s}')
RESPONSE_CLAIM_TOKEN=$(echo "$RESPONSE" | "$JQ_BIN" -r '.claimToken // empty')
RESPONSE_CLAIM_URL=$(echo "$RESPONSE" | "$JQ_BIN" -r '.claimUrl // empty')
RESPONSE_EXPIRES=$(echo "$RESPONSE" | "$JQ_BIN" -r '.expiresAt // empty')
[[ -n "$RESPONSE_CLAIM_TOKEN" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_CLAIM_TOKEN" '.claimToken = $v')
[[ -n "$RESPONSE_CLAIM_URL" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_CLAIM_URL" '.claimUrl = $v')
[[ -n "$RESPONSE_EXPIRES" ]] && entry=$(echo "$entry" | "$JQ_BIN" --arg v "$RESPONSE_EXPIRES" '.expiresAt = $v')
STATE=$(echo "$STATE" | "$JQ_BIN" --arg slug "$OUT_SLUG" --argjson e "$entry" '.publishes[$slug] = $e')
echo "$STATE" | "$JQ_BIN" '.' > "$STATE_FILE"
# Output
echo "$SITE_URL"
PERSISTENCE="permanent"
if [[ "$AUTH_MODE" == "anonymous" ]]; then
PERSISTENCE="expires_24h"
elif [[ -n "$RESPONSE_EXPIRES" ]]; then
PERSISTENCE="expires_at"
fi
SAFE_CLAIM_URL=""
if [[ -n "$RESPONSE_CLAIM_URL" && "$RESPONSE_CLAIM_URL" == https://* ]]; then
SAFE_CLAIM_URL="$RESPONSE_CLAIM_URL"
fi
ACTION="create"
if [[ -n "$SLUG" ]]; then
ACTION="update"
fi
echo "" >&2
echo "publish_result.site_url=$SITE_URL" >&2
echo "publish_result.slug=$OUT_SLUG" >&2
echo "publish_result.action=$ACTION" >&2
echo "publish_result.auth_mode=$AUTH_MODE" >&2
echo "publish_result.api_key_source=$API_KEY_SOURCE" >&2
echo "publish_result.persistence=$PERSISTENCE" >&2
echo "publish_result.expires_at=$RESPONSE_EXPIRES" >&2
echo "publish_result.claim_url=$SAFE_CLAIM_URL" >&2
if [[ "$AUTH_MODE" == "authenticated" ]]; then
echo "authenticated publish (permanent, saved to your account)" >&2
else
echo "anonymous publish (expires in 24h)" >&2
if [[ -n "$SAFE_CLAIM_URL" ]]; then
echo "claim URL: $SAFE_CLAIM_URL" >&2
fi
if [[ -n "$RESPONSE_CLAIM_TOKEN" ]]; then
echo "claim token saved to $STATE_FILE" >&2
fi
fi
@@ -1,372 +0,0 @@
---
name: shopify
description: Shopify Admin & Storefront GraphQL APIs via curl. Products, orders, customers, inventory, metafields.
version: 1.0.0
author: community
license: MIT
prerequisites:
env_vars: [SHOPIFY_ACCESS_TOKEN, SHOPIFY_STORE_DOMAIN]
commands: [curl, jq]
required_environment_variables:
- name: SHOPIFY_ACCESS_TOKEN
prompt: Shopify Admin API access token (starts with shpat_)
help: "Shopify admin → Settings → Apps and sales channels → Develop apps → Create an app → API credentials. Token shown ONCE on install."
- name: SHOPIFY_STORE_DOMAIN
prompt: Your shop domain without protocol (e.g. my-store.myshopify.com)
help: "The permanent myshopify.com domain, not your custom domain."
- name: SHOPIFY_API_VERSION
prompt: Shopify API version (default 2026-01)
help: "Stable quarterly version. Override if you need an older one."
metadata:
hermes:
tags: [Shopify, E-commerce, Commerce, API, GraphQL]
related_skills: [airtable, xurl]
homepage: https://shopify.dev/docs/api/admin-graphql
---
# Shopify — Admin & Storefront GraphQL APIs
Work with Shopify stores directly through `curl`: list products, manage inventory, pull orders, update customers, read metafields. No SDK, no app framework — just the GraphQL endpoint and a custom-app access token.
The REST Admin API has been legacy since October 1, 2024 and only receives security fixes. **Use GraphQL Admin** for all admin work. Use **Storefront GraphQL** for read-only customer-facing queries (products, collections, cart).
## Prerequisites
1. In Shopify admin: **Settings → Apps and sales channels → Develop apps → Create an app**.
2. Click **Configure Admin API scopes**, select what you need (examples below), save.
3. **Install app** → the Admin API access token appears ONCE. Copy it immediately — Shopify will never show it again. Tokens start with `shpat_`.
4. Save to `~/.hermes/.env`:
```
SHOPIFY_ACCESS_TOKEN=shpat_xxxxxxxxxxxxxxxxxxxx
SHOPIFY_STORE_DOMAIN=my-store.myshopify.com
SHOPIFY_API_VERSION=2026-01
```
> **Heads up:** As of January 1, 2026, new "legacy custom apps" can no longer be created in the Shopify admin. New setups should use the **Dev Dashboard** (`shopify.dev/docs/apps/build/dev-dashboard`). Existing admin-created apps keep working. If the user's shop has no existing custom app and it's after 2026-01-01, direct them to the Dev Dashboard instead of the admin flow.
Common scopes by task:
- Products / collections: `read_products`, `write_products`
- Inventory: `read_inventory`, `write_inventory`, `read_locations`
- Orders: `read_orders`, `write_orders` (only orders from the last 60 days without `read_all_orders`)
- Customers: `read_customers`, `write_customers`
- Draft orders: `read_draft_orders`, `write_draft_orders`
- Fulfillments: `read_fulfillments`, `write_fulfillments`
- Metafields / metaobjects: covered by the matching resource scopes
## API Basics
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/admin/api/$SHOPIFY_API_VERSION/graphql.json`
- **Auth header:** `X-Shopify-Access-Token: $SHOPIFY_ACCESS_TOKEN` (NOT `Authorization: Bearer`)
- **Method:** always `POST`, always `Content-Type: application/json`, body is `{"query": "...", "variables": {...}}`
- **HTTP 200 does not mean success.** GraphQL returns errors in a top-level `errors` array and per-field `userErrors`. Always check both.
- **IDs are GID strings:** `gid://shopify/Product/10079467700516`, `gid://shopify/ProductVariant/...`, `gid://shopify/Order/...`. Pass these verbatim; don't strip the prefix.
- **Rate limit:** calculated via query cost (leaky bucket). Each response has `extensions.cost` with `requestedQueryCost`, `actualQueryCost`, `throttleStatus.{currentlyAvailable, maximumAvailable, restoreRate}`. Back off when `currentlyAvailable` drops below your next query's cost. Standard shops = 100 points bucket, 50/s restore; Plus = 1000/100.
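The back-off rule in the last bullet can be sketched as a small calculation. This is an illustration only: the helper name `throttle_wait_seconds` is hypothetical, and it assumes you have already pulled `requestedQueryCost`, `currentlyAvailable`, and `restoreRate` out of `extensions.cost`:

```bash
# Sketch: whole seconds to wait until a query of the given cost fits in the bucket.
# Args: <query cost> <currentlyAvailable> <restoreRate>
throttle_wait_seconds() {
  # Values may arrive as decimals (e.g. 50.0); drop the fraction for integer math.
  local needed="${1%%.*}" available="${2%%.*}" restore_rate="${3%%.*}"
  if (( restore_rate <= 0 )); then
    echo "restoreRate must be positive" >&2
    return 1
  fi
  if (( available >= needed )); then
    echo 0
    return 0
  fi
  # Ceiling division: a partial second still needs a full second of sleep.
  echo $(( (needed - available + restore_rate - 1) / restore_rate ))
}
```

A caller could then run `sleep "$(throttle_wait_seconds 120 20 50)"` before retrying. Rounding up is a deliberate choice here, since waking early only earns another throttled response.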
Base curl pattern (reusable):
```bash
shop_gql() {
local query="$1"
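# Default to an empty variables object. (Writing "${2:-{}}" would append a stray "}" whenever $2 is set.)
local variables="${2:-}"
[[ -n "$variables" ]] || variables='{}'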
curl -sS -X POST \
"https://${SHOPIFY_STORE_DOMAIN}/admin/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
-H "Content-Type: application/json" \
-H "X-Shopify-Access-Token: ${SHOPIFY_ACCESS_TOKEN}" \
--data "$(jq -nc --arg q "$query" --argjson v "$variables" '{query: $q, variables: $v}')"
}
```
Pipe through `jq` for readable output. `-sS` keeps errors visible but hides the progress bar.
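Since HTTP 200 does not imply success, it can help to route every response through a check for both error channels. The following is a sketch, not part of the Shopify API: `gql_check` is a hypothetical helper that assumes `jq` is on `PATH` and that `userErrors` entries carry a `message` field, as in the mutations in this document.

```bash
# Sketch: read a GraphQL response on stdin. If it is clean, echo it to stdout;
# otherwise print each problem on stderr and return 1 (2 if the input is not JSON).
gql_check() {
  local response problems
  response="$(cat)"
  problems="$(jq -r '
    [ (.errors // [])[] | "error: \(.message)" ]
    + [ (.data // {}) | .. | objects | (.userErrors // [])[] | "userError: \(.message)" ]
    | .[]' <<< "$response")" || return 2
  if [[ -n "$problems" ]]; then
    printf '%s\n' "$problems" >&2
    return 1
  fi
  printf '%s\n' "$response"
}
```

Usage would look like `shop_gql '{ shop { name } }' | gql_check | jq`. The recursive `..` walk is a convenience so the same helper works for any mutation payload without naming the mutation field.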
## Discovery
### Shop info
```bash
shop_gql '{ shop { name myshopifyDomain primaryDomain { url } currencyCode plan { displayName } } }' | jq
```
### List all supported API versions
```bash
shop_gql '{ publicApiVersions { handle supported } }' | jq '.data.publicApiVersions[] | select(.supported)'
```
## Products
### Search products (first 20 matching query)
```bash
shop_gql '
query($q: String!) {
products(first: 20, query: $q) {
edges { node { id title handle status totalInventory variants(first: 5) { edges { node { id sku price inventoryQuantity } } } } }
pageInfo { hasNextPage endCursor }
}
}' '{"q":"hoodie status:active"}' | jq
```
Query syntax supports `title:`, `sku:`, `vendor:`, `product_type:`, `status:active`, `tag:`, `created_at:>2025-01-01`. Full grammar: https://shopify.dev/docs/api/usage/search-syntax
### Paginate products (cursor)
```bash
shop_gql '
query($cursor: String) {
products(first: 100, after: $cursor) {
edges { cursor node { id handle } }
pageInfo { hasNextPage endCursor }
}
}' '{"cursor":null}'
# subsequent calls: pass the previous endCursor
```
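The "pass the previous endCursor" step can be wrapped in a loop. This is a sketch under stated assumptions: it relies on the `shop_gql` function from API Basics being defined in the current shell and on `jq` being installed, and the name `all_product_ids` is hypothetical.

```bash
# Sketch: print every product ID, following endCursor until hasNextPage is false.
# Assumes shop_gql (defined earlier in this document) and jq are available.
all_product_ids() {
  local query='query($cursor: String) { products(first: 100, after: $cursor) { edges { node { id } } pageInfo { hasNextPage endCursor } } }'
  local cursor="" vars page
  while :; do
    if [[ -z "$cursor" ]]; then
      vars='{"cursor":null}'            # first page: no cursor yet
    else
      vars="$(jq -nc --arg c "$cursor" '{cursor: $c}')"
    fi
    page="$(shop_gql "$query" "$vars")" || return 1
    # Fails (and stops the loop) if the response has no product data.
    jq -r '.data.products.edges[].node.id' <<< "$page" || return 1
    [[ "$(jq -r '.data.products.pageInfo.hasNextPage' <<< "$page")" == "true" ]] || break
    cursor="$(jq -r '.data.products.pageInfo.endCursor' <<< "$page")"
  done
}
```

On large catalogs each page spends query cost, so a real loop would likely also watch `extensions.cost.throttleStatus` between pages.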
### Get a product with variants + metafields
```bash
shop_gql '
query($id: ID!) {
product(id: $id) {
id title handle descriptionHtml tags status
variants(first: 20) { edges { node { id sku price compareAtPrice inventoryQuantity selectedOptions { name value } } } }
metafields(first: 20) { edges { node { namespace key type value } } }
}
}' '{"id":"gid://shopify/Product/10079467700516"}' | jq
```
### Create a product with one variant
```bash
shop_gql '
mutation($input: ProductCreateInput!) {
productCreate(product: $input) {
product { id handle }
userErrors { field message }
}
}' '{"input":{"title":"Test Hoodie","status":"DRAFT","vendor":"Hermes","productType":"Apparel","tags":["test"]}}'
```
Variants now have their own mutations in recent versions:
```bash
# Add variants after creating the product
shop_gql '
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
productVariantsBulkCreate(productId: $productId, variants: $variants) {
productVariants { id sku price }
userErrors { field message }
}
}' '{"productId":"gid://shopify/Product/...","variants":[{"optionValues":[{"optionName":"Size","name":"M"}],"price":"49.00","inventoryItem":{"sku":"HD-M","tracked":true}}]}'
```
### Update price / SKU
```bash
shop_gql '
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
productVariantsBulkUpdate(productId: $productId, variants: $variants) {
productVariants { id sku price }
userErrors { field message }
}
}' '{"productId":"gid://shopify/Product/...","variants":[{"id":"gid://shopify/ProductVariant/...","price":"55.00"}]}'
```
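Each mutation above selects `userErrors`, and an empty list on its own does not prove the write succeeded. A hedged sketch of a response check follows; `unwrap_mutation` is a hypothetical helper name, and the sample responses are invented to match the shape of the `productCreate` selection.

```python
from typing import Any


def unwrap_mutation(response: dict, mutation: str, resource: str) -> Any:
    """Return data.<mutation>.<resource>, or raise with whatever error detail exists."""
    if response.get("errors"):
        raise RuntimeError(f"top-level errors: {response['errors']}")
    payload = (response.get("data") or {}).get(mutation) or {}
    user_errors = payload.get("userErrors") or []
    if user_errors:
        raise RuntimeError(f"userErrors: {user_errors}")
    result = payload.get(resource)
    if result is None:
        # Neither error list was populated, so surface the whole response.
        raise RuntimeError(f"{mutation}.{resource} is null; full response: {response}")
    return result


ok = {"data": {"productCreate": {"product": {"id": "gid://shopify/Product/1", "handle": "test-hoodie"}, "userErrors": []}}}
product = unwrap_mutation(ok, "productCreate", "product")

silent_failure = {"data": {"productCreate": {"product": None, "userErrors": []}}}
try:
    unwrap_mutation(silent_failure, "productCreate", "product")
    raised = False
except RuntimeError:
    raised = True
```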
## Orders
### List recent orders (last 60 days by default without `read_all_orders`)
```bash
shop_gql '
{
orders(first: 20, reverse: true, query: "financial_status:paid") {
edges { node {
id name createdAt displayFinancialStatus displayFulfillmentStatus
totalPriceSet { shopMoney { amount currencyCode } }
customer { id displayName email }
lineItems(first: 10) { edges { node { title quantity sku } } }
} }
}
}' | jq
```
Useful order query filters: `financial_status:paid|pending|refunded`, `fulfillment_status:unfulfilled|fulfilled`, `created_at:>2025-01-01`, `tag:gift`, `email:foo@example.com`.
### Fetch a single order with shipping address
```bash
shop_gql '
query($id: ID!) {
order(id: $id) {
id name email
shippingAddress { name address1 address2 city province country zip phone }
lineItems(first: 50) { edges { node { title quantity variant { sku } originalUnitPriceSet { shopMoney { amount currencyCode } } } } }
transactions { id kind status amountSet { shopMoney { amount currencyCode } } }
}
}' '{"id":"gid://shopify/Order/...."}' | jq
```
## Customers
```bash
# Search
shop_gql '
{
customers(first: 10, query: "email:*@example.com") {
edges { node { id email displayName numberOfOrders amountSpent { amount currencyCode } } }
}
}'
# Create
shop_gql '
mutation($input: CustomerInput!) {
customerCreate(input: $input) {
customer { id email }
userErrors { field message }
}
}' '{"input":{"email":"test@example.com","firstName":"Test","lastName":"User","tags":["api-created"]}}'
```
## Inventory
Inventory lives on **inventory items** tied to variants; quantities are tracked per **location**.
```bash
# Get inventory for a variant across all locations
shop_gql '
query($id: ID!) {
productVariant(id: $id) {
id sku
inventoryItem {
id tracked
inventoryLevels(first: 10) {
edges { node { location { id name } quantities(names: ["available","on_hand","committed"]) { name quantity } } }
}
}
}
}' '{"id":"gid://shopify/ProductVariant/..."}'
```
Adjust stock (delta) — uses `inventoryAdjustQuantities`:
```bash
shop_gql '
mutation($input: InventoryAdjustQuantitiesInput!) {
inventoryAdjustQuantities(input: $input) {
inventoryAdjustmentGroup { reason changes { name delta } }
userErrors { field message }
}
}' '{
"input": {
"reason": "correction",
"name": "available",
"changes": [{"delta": 5, "inventoryItemId": "gid://shopify/InventoryItem/...", "locationId": "gid://shopify/Location/..."}]
}
}'
```
Set absolute stock (not delta) — `inventorySetQuantities`:
```bash
shop_gql '
mutation($input: InventorySetQuantitiesInput!) {
inventorySetQuantities(input: $input) {
inventoryAdjustmentGroup { id }
userErrors { field message }
}
}' '{"input":{"reason":"correction","name":"available","ignoreCompareQuantity":true,"quantities":[{"inventoryItemId":"gid://shopify/InventoryItem/...","locationId":"gid://shopify/Location/...","quantity":100}]}}'
```
## Metafields & Metaobjects
Metafields attach custom data to resources (products, customers, orders, shop).
```bash
# Read
shop_gql '
query($id: ID!) {
product(id: $id) {
metafields(first: 10, namespace: "custom") {
edges { node { key type value } }
}
}
}' '{"id":"gid://shopify/Product/..."}'
# Write (works for any owner type)
shop_gql '
mutation($metafields: [MetafieldsSetInput!]!) {
metafieldsSet(metafields: $metafields) {
metafields { id key namespace }
userErrors { field message code }
}
}' '{"metafields":[{"ownerId":"gid://shopify/Product/...","namespace":"custom","key":"care_instructions","type":"multi_line_text_field","value":"Wash cold. Tumble dry low."}]}'
```
## Storefront API (public read-only)
The Storefront API uses a different endpoint and a different token from the Admin API. It is meant for customer-facing apps and Hydrogen-style headless setups. The headers differ too:
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/api/$SHOPIFY_API_VERSION/graphql.json`
- **Auth header (public):** `X-Shopify-Storefront-Access-Token: <public token>` — embeddable in browser
- **Auth header (private):** `Shopify-Storefront-Private-Token: <private token>` — server-only
```bash
curl -sS -X POST \
"https://${SHOPIFY_STORE_DOMAIN}/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
-H "Content-Type: application/json" \
-H "X-Shopify-Storefront-Access-Token: ${SHOPIFY_STOREFRONT_TOKEN}" \
-d '{"query":"{ shop { name } products(first: 5) { edges { node { id title handle } } } }"}' | jq
```
## Bulk Operations
For dumps larger than rate limits allow (full product catalog, all orders for a year):
```bash
# 1. Start bulk query
shop_gql '
mutation {
bulkOperationRunQuery(query: """
{ products { edges { node { id title handle variants { edges { node { sku price } } } } } } }
""") {
bulkOperation { id status }
userErrors { field message }
}
}'
# 2. Poll status
shop_gql '{ currentBulkOperation { id status errorCode objectCount fileSize url partialDataUrl } }'
# 3. When status=COMPLETED, download the JSONL file (URL = the `url` field from step 2)
curl -sS "$URL" > products.jsonl
```
Each JSONL line is a node, and nested connections are emitted as separate lines with `__parentId`. Reassemble client-side if needed.
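Client-side reassembly could be sketched as below. This assumes a single level of nesting and that each parent line appears before its children in the file; the `children` key and the sample lines are illustrative choices, not part of the export format.

```python
import json
from typing import Iterable


def reassemble(lines: Iterable[str]) -> dict:
    """Group child rows under their parent using `__parentId`."""
    parents: dict = {}
    for raw in lines:
        if not raw.strip():
            continue  # tolerate blank lines
        row = json.loads(raw)
        parent_id = row.pop("__parentId", None)
        if parent_id is None:
            # Top-level node: index it by its own id.
            row["children"] = []
            parents[row["id"]] = row
        else:
            # Nested connection row: attach to the parent seen earlier.
            parents[parent_id]["children"].append(row)
    return parents


sample = [
    '{"id":"gid://shopify/Product/1","title":"Hoodie","handle":"hoodie"}',
    '{"sku":"HD-M","price":"49.00","__parentId":"gid://shopify/Product/1"}',
    '{"sku":"HD-L","price":"49.00","__parentId":"gid://shopify/Product/1"}',
]
catalog = reassemble(sample)
```

For a real export, pass the open file object instead of `sample`, so the file is streamed line by line rather than loaded whole.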
## Webhooks
Subscribe to events so you don't have to poll:
```bash
shop_gql '
mutation($topic: WebhookSubscriptionTopic!, $sub: WebhookSubscriptionInput!) {
webhookSubscriptionCreate(topic: $topic, webhookSubscription: $sub) {
webhookSubscription { id topic endpoint { __typename ... on WebhookHttpEndpoint { callbackUrl } } }
userErrors { field message }
}
}' '{"topic":"ORDERS_CREATE","sub":{"callbackUrl":"https://example.com/webhook","format":"JSON"}}'
```
Verify incoming webhook HMAC using the app's client secret (not the access token):
```bash
printf '%s' "$REQUEST_BODY" | openssl dgst -sha256 -hmac "$APP_SECRET" -binary | base64
# Compare to X-Shopify-Hmac-Sha256 header
```
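The same check can be done server-side with a constant-time comparison. This is a sketch using only the Python standard library; the function name, the secret, and the request body are placeholders, and the raw request bytes must be hashed before any JSON parsing.

```python
import base64
import hashlib
import hmac


def webhook_is_authentic(raw_body: bytes, header_hmac: str, app_secret: str) -> bool:
    """Compare the base64 HMAC-SHA256 of the raw body with the header value."""
    digest = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, header_hmac.encode("utf-8"))


# Placeholder values for illustration only.
secret = "example-client-secret"
body = b'{"id":1}'
good_header = base64.b64encode(
    hmac.new(secret.encode("utf-8"), body, hashlib.sha256).digest()
).decode("ascii")

accepted = webhook_is_authentic(body, good_header, secret)
rejected = webhook_is_authentic(body + b" ", good_header, secret)
```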
## Pitfalls
- **REST endpoints still exist but are frozen.** Don't write new integrations against `/admin/api/.../products.json`. Use GraphQL.
- **Token format check.** Admin tokens start with `shpat_`; Storefront public tokens start with `shpua_`. If you send a token with the wrong header, every request returns 401 without a useful error body.
- **403 with a valid token = missing scope.** Shopify returns `{"errors":[{"message":"Access denied for ..."}]}`. Re-configure Admin API scopes on the app, then reinstall to regenerate the token.
- **`userErrors` is empty != success.** Also check `data.<mutation>.<resource>` is non-null. Some failures populate neither — inspect the whole response.
- **GID vs numeric ID.** Legacy REST gave numeric IDs; GraphQL wants full GID strings. To convert: `gid://shopify/Product/<numeric>`.
- **Rate limit surprise.** A single `products(first: 250)` with deep nesting can cost 1000+ points and throttle immediately on a standard-plan shop. Start narrow, read `extensions.cost`, adjust.
- **Pagination order.** `products(first: N, reverse: true)` sorts by `id DESC`, not `created_at`. Use `sortKey: CREATED_AT, reverse: true` for "newest first."
- **`read_all_orders` for historical data.** Without it, `orders(...)` silently caps at the 60-day window. You won't get an error, just fewer results than expected. For Shopify Plus merchants with many orders, request this scope via the app's protected-data settings.
- **Money amounts are strings.** Amounts come back as `"49.00"`, not `49.0`. Don't `jq tonumber` blindly if you care about zero-padding.
- **Multi-currency Money fields** have `shopMoney` (store's currency) AND `presentmentMoney` (customer's). Pick one consistently.
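Two of the pitfalls above, GID conversion and string-typed amounts, can be illustrated with a short sketch. The helper names are hypothetical; `Decimal` is used because it keeps the trailing zeros that a float would drop.

```python
from decimal import Decimal
from typing import Tuple

GID_PREFIX = "gid://shopify/"


def to_gid(resource: str, numeric_id: int) -> str:
    """Build a GraphQL global ID from a legacy numeric ID."""
    return f"{GID_PREFIX}{resource}/{numeric_id}"


def from_gid(gid: str) -> Tuple[str, int]:
    """Split a GID back into (resource, numeric ID)."""
    if not gid.startswith(GID_PREFIX):
        raise ValueError(f"not a Shopify GID: {gid!r}")
    resource, _, tail = gid[len(GID_PREFIX):].partition("/")
    # Ignore any query-string suffix after the numeric part.
    return resource, int(tail.split("?", 1)[0])


product_gid = to_gid("Product", 10079467700516)
resource, numeric_id = from_gid(product_gid)

# Decimal preserves the zero-padding of the amount string.
line_total = Decimal("49.00") * 2
```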
## Safety
Mutations in Shopify are real: they create products, issue refunds, cancel orders, and ship fulfillments. Before running `productDelete`, `orderCancel`, `refundCreate`, or any bulk mutation, state clearly what the change is and which shop it affects, then confirm with the user. There is no staging clone of production data unless the user has a separate dev store.
@@ -12,14 +12,6 @@ import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
try:
from hermes_constants import get_hermes_home
except ImportError:
import os as _os
def get_hermes_home() -> Path: # type: ignore[misc]
val = (_os.environ.get("HERMES_HOME") or "").strip()
return Path(val) if val else Path.home() / ".hermes"
try:
from fastapi import APIRouter
except Exception: # Allows local unit tests without dashboard dependencies.
@@ -143,15 +135,15 @@ ACHIEVEMENTS: List[Dict[str, Any]] = [
def state_path() -> Path:
return get_hermes_home() / "plugins" / "hermes-achievements" / "state.json"
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "state.json"
def snapshot_path() -> Path:
return get_hermes_home() / "plugins" / "hermes-achievements" / "scan_snapshot.json"
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "scan_snapshot.json"
def checkpoint_path() -> Path:
return get_hermes_home() / "plugins" / "hermes-achievements" / "scan_checkpoint.json"
return Path.home() / ".hermes" / "plugins" / "hermes-achievements" / "scan_checkpoint.json"
def load_state() -> Dict[str, Any]:
File diff suppressed because it is too large Load Diff
-760
View File
@@ -1,760 +0,0 @@
/*
* Hermes Kanban dashboard plugin styles.
*
* All colors reference theme CSS vars so the board reskins with the
* active dashboard theme. No hardcoded palette.
*/
.hermes-kanban {
width: 100%;
}
/* ---- Columns layout -------------------------------------------------- */
.hermes-kanban-columns {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
gap: 0.75rem;
align-items: start;
}
.hermes-kanban-column {
display: flex;
flex-direction: column;
background: color-mix(in srgb, var(--color-card) 85%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius);
padding: 0.5rem;
min-height: 200px;
max-height: calc(100vh - 220px);
transition: border-color 120ms ease, background-color 120ms ease;
}
.hermes-kanban-column--drop {
border-color: var(--color-ring);
background: color-mix(in srgb, var(--color-ring) 8%, var(--color-card));
}
.hermes-kanban-column-header {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.25rem 0.25rem 0.35rem;
font-weight: 600;
font-size: 0.85rem;
color: var(--color-foreground);
}
.hermes-kanban-column-label {
flex: 1;
letter-spacing: 0.01em;
}
.hermes-kanban-column-count {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
font-size: 0.75rem;
font-weight: 500;
}
.hermes-kanban-column-add {
appearance: none;
background: transparent;
border: 1px solid var(--color-border);
color: var(--color-foreground);
border-radius: var(--radius-sm, 0.25rem);
width: 22px;
height: 22px;
line-height: 1;
font-size: 1rem;
cursor: pointer;
}
.hermes-kanban-column-add:hover {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-column-sub {
padding: 0 0.25rem 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
border-bottom: 1px solid color-mix(in srgb, var(--color-border) 60%, transparent);
margin-bottom: 0.5rem;
}
.hermes-kanban-column-body {
display: flex;
flex-direction: column;
gap: 0.45rem;
overflow-y: auto;
padding-right: 0.1rem;
}
.hermes-kanban-empty {
padding: 1.5rem 0.5rem;
text-align: center;
font-size: 0.75rem;
color: var(--color-muted-foreground);
border: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
/* ---- Status dots ----------------------------------------------------- */
.hermes-kanban-dot {
display: inline-block;
width: 0.5rem;
height: 0.5rem;
border-radius: 999px;
background: var(--color-muted-foreground);
}
.hermes-kanban-dot-triage { background: #b47dd6; } /* lilac — fresh/unspecified */
.hermes-kanban-dot-todo { background: var(--color-muted-foreground); }
.hermes-kanban-dot-ready { background: #d4b348; } /* amber */
.hermes-kanban-dot-running { background: #3fb97d; } /* green */
.hermes-kanban-dot-blocked { background: var(--color-destructive, #d14a4a); }
.hermes-kanban-dot-done { background: #4a8cd1; } /* blue */
.hermes-kanban-dot-archived { background: var(--color-border); }
/* ---- Progress pill (N/M child tasks done) --------------------------- */
.hermes-kanban-progress {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.62rem;
padding: 0.05rem 0.35rem;
border-radius: 999px;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border: 1px solid color-mix(in srgb, var(--color-border) 80%, transparent);
color: var(--color-muted-foreground);
letter-spacing: 0.02em;
}
.hermes-kanban-progress--full {
background: color-mix(in srgb, #3fb97d 22%, transparent);
border-color: color-mix(in srgb, #3fb97d 45%, transparent);
color: var(--color-foreground);
}
/* ---- Lanes (per-profile sub-grouping inside Running) ---------------- */
.hermes-kanban-lane {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.25rem 0 0.35rem;
border-top: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
}
.hermes-kanban-lane:first-child {
border-top: 0;
padding-top: 0;
}
.hermes-kanban-lane-head {
display: flex;
align-items: center;
gap: 0.4rem;
font-size: 0.65rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
padding: 0 0.1rem;
}
.hermes-kanban-lane-name {
font-weight: 600;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-lane-count {
margin-left: auto;
font-variant-numeric: tabular-nums;
}
/* ---- Card ------------------------------------------------------------ */
.hermes-kanban-card {
cursor: grab;
transition: transform 100ms ease, box-shadow 100ms ease;
}
.hermes-kanban-card:hover {
box-shadow: 0 1px 0 0 var(--color-ring) inset, 0 0 0 1px var(--color-ring) inset;
}
.hermes-kanban-card:active {
cursor: grabbing;
transform: scale(0.995);
}
.hermes-kanban-card-content {
padding: 0.5rem 0.6rem !important;
display: flex;
flex-direction: column;
gap: 0.3rem;
}
.hermes-kanban-card-row {
display: flex;
align-items: center;
gap: 0.35rem;
flex-wrap: wrap;
}
.hermes-kanban-card-id {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.65rem;
color: var(--color-muted-foreground);
letter-spacing: 0.03em;
}
.hermes-kanban-card-title {
font-size: 0.85rem;
font-weight: 500;
line-height: 1.3;
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-card-meta {
font-size: 0.7rem;
color: var(--color-muted-foreground);
gap: 0.55rem;
}
.hermes-kanban-priority {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
background: color-mix(in srgb, var(--color-ring) 18%, transparent);
color: var(--color-foreground);
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, transparent);
}
.hermes-kanban-tag {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
}
.hermes-kanban-assignee {
font-weight: 500;
color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));
}
.hermes-kanban-unassigned {
font-style: italic;
}
.hermes-kanban-ago {
margin-left: auto;
}
/* ---- Inline create --------------------------------------------------- */
.hermes-kanban-inline-create {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.5rem;
margin-bottom: 0.5rem;
background: color-mix(in srgb, var(--color-card) 70%, transparent);
border: 1px dashed var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-inline-create > .flex.gap-2:last-child > button:first-of-type {
flex: 1;
min-width: 0;
}
/* ---- Drawer (task detail side panel) --------------------------------- */
.hermes-kanban-drawer-shade {
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.45);
z-index: 60;
display: flex;
justify-content: flex-end;
}
.hermes-kanban-drawer {
width: min(480px, 92vw);
height: 100vh;
background: var(--color-card);
border-left: 1px solid var(--color-border);
display: flex;
flex-direction: column;
box-shadow: -4px 0 18px rgba(0, 0, 0, 0.35);
animation: hermes-kanban-drawer-in 180ms ease-out;
}
@keyframes hermes-kanban-drawer-in {
from { transform: translateX(100%); opacity: 0.3; }
to { transform: translateX(0); opacity: 1; }
}
.hermes-kanban-drawer-head {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.6rem 0.8rem;
border-bottom: 1px solid var(--color-border);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-drawer-close {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 1.25rem;
line-height: 1;
cursor: pointer;
padding: 0 0.25rem;
}
.hermes-kanban-drawer-close:hover { color: var(--color-foreground); }
.hermes-kanban-drawer-body {
flex: 1;
overflow-y: auto;
padding: 0.9rem;
display: flex;
flex-direction: column;
gap: 0.85rem;
}
.hermes-kanban-drawer-title {
display: flex;
align-items: center;
gap: 0.5rem;
font-size: 1rem;
font-weight: 600;
}
.hermes-kanban-drawer-meta {
display: flex;
flex-direction: column;
gap: 0.15rem;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-meta-row {
display: flex;
gap: 0.5rem;
font-size: 0.72rem;
}
.hermes-kanban-meta-label {
width: 92px;
color: var(--color-muted-foreground);
}
.hermes-kanban-meta-value {
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-actions {
display: flex;
flex-wrap: wrap;
gap: 0.3rem;
}
.hermes-kanban-section {
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.hermes-kanban-section-head {
font-size: 0.72rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.07em;
color: var(--color-muted-foreground);
}
.hermes-kanban-pre {
margin: 0;
padding: 0.45rem 0.55rem;
white-space: pre-wrap;
word-break: break-word;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.72rem;
color: var(--color-foreground);
}
.hermes-kanban-comment {
border-left: 2px solid color-mix(in srgb, var(--color-ring) 35%, transparent);
padding-left: 0.5rem;
display: flex;
flex-direction: column;
gap: 0.2rem;
}
.hermes-kanban-comment-head {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
}
.hermes-kanban-comment-author {
font-weight: 600;
color: var(--color-foreground);
}
.hermes-kanban-comment-ago {
color: var(--color-muted-foreground);
}
.hermes-kanban-event {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-event-kind {
color: var(--color-foreground);
min-width: 6rem;
}
.hermes-kanban-event-payload {
color: var(--color-muted-foreground);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
max-width: 280px;
}
.hermes-kanban-drawer-comment-row {
display: flex;
gap: 0.4rem;
padding: 0.55rem 0.75rem;
border-top: 1px solid var(--color-border);
background: color-mix(in srgb, var(--color-card) 90%, transparent);
}
.hermes-kanban-count {
display: inline-flex;
gap: 0.2rem;
align-items: center;
}
/* ---- Selection chrome ----------------------------------------------- */
.hermes-kanban-card--selected :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-ring) inset,
0 0 0 1px var(--color-ring) inset;
background: color-mix(in srgb, var(--color-ring) 6%, var(--color-card));
}
.hermes-kanban-card-check {
width: 0.85rem;
height: 0.85rem;
margin: 0;
cursor: pointer;
accent-color: var(--color-ring);
}
/* ---- Bulk action bar ------------------------------------------------ */
.hermes-kanban-bulk {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.4rem 0.75rem;
background: color-mix(in srgb, var(--color-ring) 10%, var(--color-card));
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, var(--color-border));
border-radius: var(--radius-sm, 0.25rem);
flex-wrap: wrap;
}
.hermes-kanban-bulk-count {
font-weight: 600;
font-size: 0.75rem;
padding-right: 0.25rem;
}
.hermes-kanban-bulk > button,
.hermes-kanban-bulk-reassign > button {
height: 1.7rem !important;
padding: 0 0.5rem !important;
font-size: 0.7rem !important;
border: 1px solid var(--color-border);
cursor: pointer;
}
.hermes-kanban-bulk > button:hover:not(:disabled),
.hermes-kanban-bulk-reassign > button:hover:not(:disabled) {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-bulk-reassign {
display: flex;
align-items: center;
gap: 0.25rem;
padding-left: 0.5rem;
border-left: 1px solid color-mix(in srgb, var(--color-border) 70%, transparent);
}
/* ---- Dependency editor chips --------------------------------------- */
.hermes-kanban-deps-row {
display: flex;
align-items: center;
gap: 0.5rem;
margin-bottom: 0.4rem;
}
.hermes-kanban-deps-label {
font-size: 0.68rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
min-width: 4rem;
}
.hermes-kanban-deps-chips {
display: flex;
gap: 0.3rem;
flex-wrap: wrap;
flex: 1;
}
.hermes-kanban-deps-empty {
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-style: italic;
}
.hermes-kanban-dep-chip {
display: inline-flex;
align-items: center;
gap: 0.15rem;
padding: 0.1rem 0.35rem;
background: color-mix(in srgb, var(--color-foreground) 6%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.68rem;
color: var(--color-foreground);
}
.hermes-kanban-dep-chip-x {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
cursor: pointer;
font-size: 0.85rem;
line-height: 1;
padding: 0 0.15rem;
}
.hermes-kanban-dep-chip-x:hover { color: var(--color-destructive, #d14a4a); }
/* ---- Inline edit affordances --------------------------------------- */
.hermes-kanban-editable {
cursor: pointer;
border-bottom: 1px dotted color-mix(in srgb, var(--color-border) 80%, transparent);
}
.hermes-kanban-editable:hover {
color: var(--color-foreground);
border-bottom-color: var(--color-ring);
}
.hermes-kanban-drawer-title-text {
cursor: pointer;
}
.hermes-kanban-drawer-title-text:hover {
text-decoration: underline;
text-decoration-color: var(--color-ring);
text-decoration-style: dotted;
text-underline-offset: 3px;
}
.hermes-kanban-edit-row {
display: flex;
align-items: center;
gap: 0.35rem;
width: 100%;
}
.hermes-kanban-section-head-row {
display: flex;
align-items: center;
justify-content: space-between;
gap: 0.5rem;
}
.hermes-kanban-edit-link {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 0.7rem;
text-transform: uppercase;
letter-spacing: 0.05em;
cursor: pointer;
padding: 0;
}
.hermes-kanban-edit-link:hover { color: var(--color-ring); }
.hermes-kanban-textarea {
width: 100%;
min-height: 8rem;
background: var(--color-card);
color: var(--color-foreground);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
padding: 0.5rem 0.6rem;
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.8rem;
line-height: 1.5;
resize: vertical;
}
.hermes-kanban-textarea:focus {
outline: none;
border-color: var(--color-ring);
box-shadow: 0 0 0 2px color-mix(in srgb, var(--color-ring) 30%, transparent);
}
/* ---- Markdown rendering -------------------------------------------- */
.hermes-kanban-md {
font-size: 0.8rem;
line-height: 1.55;
color: var(--color-foreground);
}
.hermes-kanban-md p { margin: 0.25rem 0; }
.hermes-kanban-md h1,
.hermes-kanban-md h2,
.hermes-kanban-md h3,
.hermes-kanban-md h4 {
margin: 0.6rem 0 0.2rem;
line-height: 1.25;
}
.hermes-kanban-md h1 { font-size: 1.05rem; }
.hermes-kanban-md h2 { font-size: 0.95rem; }
.hermes-kanban-md h3 { font-size: 0.88rem; }
.hermes-kanban-md h4 { font-size: 0.82rem; }
.hermes-kanban-md ul {
margin: 0.25rem 0 0.25rem 1.1rem;
padding: 0;
}
.hermes-kanban-md li { margin: 0.1rem 0; }
.hermes-kanban-md a {
color: var(--color-ring);
text-decoration: underline;
}
.hermes-kanban-md code {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.75rem;
padding: 0.05rem 0.3rem;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border-radius: 3px;
}
.hermes-kanban-md-code {
margin: 0.35rem 0;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 5%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
overflow-x: auto;
}
.hermes-kanban-md-code code {
background: transparent;
padding: 0;
font-size: 0.75rem;
white-space: pre;
}
.hermes-kanban-md strong { font-weight: 600; }
/* ---- Touch-drag proxy ---------------------------------------------- */
.hermes-kanban-touch-proxy {
pointer-events: none;
opacity: 0.85;
box-shadow: 0 8px 20px rgba(0, 0, 0, 0.35);
transform: scale(1.02);
transition: none;
}
/* ---- Staleness tiers ------------------------------------------------ */
.hermes-kanban-card--stale-amber :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px #d4b34888 inset;
}
.hermes-kanban-card--stale-amber:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px #d4b348 inset;
}
.hermes-kanban-card--stale-red :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px var(--color-destructive, #d14a4a) inset,
0 0 8px color-mix(in srgb, var(--color-destructive, #d14a4a) 30%, transparent);
}
.hermes-kanban-card--stale-red:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-destructive, #d14a4a) inset,
0 0 10px color-mix(in srgb, var(--color-destructive, #d14a4a) 45%, transparent);
}
/* ---- Worker log pane ------------------------------------------------ */
.hermes-kanban-log {
max-height: 340px;
overflow: auto;
white-space: pre;
font-size: 0.7rem;
line-height: 1.45;
}
/* ---- Run history (per-attempt log in the drawer) ------------------- */
.hermes-kanban-run {
border-left: 2px solid var(--color-border);
padding: 0.35rem 0.5rem;
margin-bottom: 0.4rem;
background: color-mix(in srgb, var(--color-foreground) 3%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-run--active { border-left-color: #3fb97d; }
.hermes-kanban-run--completed { border-left-color: #4a8cd1; }
.hermes-kanban-run--ended { border-left-color: #6b7280; } /* generic fallback when outcome is unset */
.hermes-kanban-run--blocked { border-left-color: var(--color-destructive, #d14a4a); }
.hermes-kanban-run--crashed,
.hermes-kanban-run--timed_out,
.hermes-kanban-run--gave_up,
.hermes-kanban-run--spawn_failed {
border-left-color: var(--color-destructive, #d14a4a);
background: color-mix(in srgb, var(--color-destructive, #d14a4a) 6%, transparent);
}
.hermes-kanban-run--reclaimed { border-left-color: #d4b348; }
.hermes-kanban-run-head {
display: flex;
align-items: center;
gap: 0.6rem;
font-size: 0.7rem;
}
.hermes-kanban-run-outcome {
font-family: var(--font-mono, ui-monospace, monospace);
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--color-foreground);
}
.hermes-kanban-run-profile {
color: var(--color-muted-foreground);
}
.hermes-kanban-run-elapsed {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-ago {
margin-left: auto;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-summary {
font-size: 0.75rem;
padding: 0.2rem 0 0;
color: var(--color-foreground);
}
.hermes-kanban-run-error {
font-size: 0.7rem;
color: var(--color-destructive, #d14a4a);
padding: 0.15rem 0 0;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-run-meta {
display: block;
font-size: 0.65rem;
padding: 0.15rem 0 0;
color: var(--color-muted-foreground);
white-space: pre-wrap;
word-break: break-word;
font-family: var(--font-mono, ui-monospace, monospace);
}
-14
View File
@@ -1,14 +0,0 @@
{
"name": "kanban",
"label": "Kanban",
"description": "Multi-agent collaboration board — drag-drop cards across columns, read comment threads, see which profile is running what",
"icon": "Package",
"version": "1.0.0",
"tab": {
"path": "/kanban",
"position": "after:skills"
},
"entry": "dist/index.js",
"css": "dist/style.css",
"api": "plugin_api.py"
}
-845
View File
@@ -1,845 +0,0 @@
"""Kanban dashboard plugin — backend API routes.
Mounted at /api/plugins/kanban/ by the dashboard plugin system.
This layer is intentionally thin: every handler is a small wrapper around
``hermes_cli.kanban_db`` or a direct SQL query. Writes use the same code
paths the CLI and gateway ``/kanban`` command use, so the three surfaces
cannot drift.
Live updates arrive via the ``/events`` WebSocket, which tails the
append-only ``task_events`` table on a short poll interval (WAL mode lets
reads run alongside the dispatcher's IMMEDIATE write transactions).
Security note
-------------
The dashboard's HTTP auth middleware (``web_server.auth_middleware``)
explicitly skips ``/api/plugins/``; plugin routes are unauthenticated by
design because the dashboard binds to localhost by default. For the
WebSocket we still require the session token as a ``?token=`` query
parameter (browsers cannot set the ``Authorization`` header on an upgrade
request), matching the established pattern used by the in-browser PTY
bridge in ``hermes_cli/web_server.py``. If you run the dashboard with
``--host 0.0.0.0``, every plugin route (kanban included) becomes
reachable from the network. Don't do that on a shared host.
"""
from __future__ import annotations
import asyncio
import hmac
import json
import logging
import sqlite3
import time
from dataclasses import asdict
from typing import Any, Optional
from fastapi import APIRouter, HTTPException, Query, WebSocket, WebSocketDisconnect, status as http_status
from pydantic import BaseModel, Field
from hermes_cli import kanban_db
log = logging.getLogger(__name__)
router = APIRouter()
# ---------------------------------------------------------------------------
# Auth helper — WebSocket only (HTTP routes live behind the dashboard's
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------
def _check_ws_token(provided: Optional[str]) -> bool:
"""Constant-time compare against the dashboard session token.
Imported lazily so the plugin still loads in test contexts where the
dashboard web_server module isn't importable (e.g. the bare-FastAPI
test harness).
"""
if not provided:
return False
try:
from hermes_cli import web_server as _ws
except Exception:
# No dashboard context (tests). Accept so the tail loop is still
# testable; in production the dashboard module always imports
# cleanly because it's the caller.
return True
expected = getattr(_ws, "_SESSION_TOKEN", None)
if not expected:
return True
return hmac.compare_digest(str(provided), str(expected))
def _conn():
"""Open a kanban_db connection, creating the schema on first use.
Every handler that mutates the DB goes through this so the plugin
self-heals on a fresh install (no user-visible "no such table"
error if somebody hits POST /tasks before GET /board).
``init_db`` is idempotent.
"""
try:
kanban_db.init_db()
except Exception as exc:
log.warning("kanban init_db failed: %s", exc)
return kanban_db.connect()
# ---------------------------------------------------------------------------
# Serialization helpers
# ---------------------------------------------------------------------------
# Columns shown by the dashboard, in left-to-right order. "archived" is
# available via a filter toggle rather than a visible column.
BOARD_COLUMNS: list[str] = [
"triage", "todo", "ready", "running", "blocked", "done",
]
def _task_dict(task: kanban_db.Task) -> dict[str, Any]:
d = asdict(task)
# Add derived age metrics so the UI can colour stale cards without
# computing deltas client-side.
d["age"] = kanban_db.task_age(task)
# Keep body short on list endpoints; full body comes from /tasks/:id.
return d
def _event_dict(event: kanban_db.Event) -> dict[str, Any]:
return {
"id": event.id,
"task_id": event.task_id,
"kind": event.kind,
"payload": event.payload,
"created_at": event.created_at,
"run_id": event.run_id,
}
def _comment_dict(c: kanban_db.Comment) -> dict[str, Any]:
return {
"id": c.id,
"task_id": c.task_id,
"author": c.author,
"body": c.body,
"created_at": c.created_at,
}
def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
"""Serialise a Run for the drawer's Run history section."""
return {
"id": r.id,
"task_id": r.task_id,
"profile": r.profile,
"step_key": r.step_key,
"status": r.status,
"claim_lock": r.claim_lock,
"claim_expires": r.claim_expires,
"worker_pid": r.worker_pid,
"max_runtime_seconds": r.max_runtime_seconds,
"last_heartbeat_at": r.last_heartbeat_at,
"started_at": r.started_at,
"ended_at": r.ended_at,
"outcome": r.outcome,
"summary": r.summary,
"metadata": r.metadata,
"error": r.error,
}
def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
"""Return {'parents': [...], 'children': [...]} for a task."""
parents = [
r["parent_id"]
for r in conn.execute(
"SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
(task_id,),
)
]
children = [
r["child_id"]
for r in conn.execute(
"SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
(task_id,),
)
]
return {"parents": parents, "children": children}
# ---------------------------------------------------------------------------
# GET /board
# ---------------------------------------------------------------------------
@router.get("/board")
def get_board(
tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
include_archived: bool = Query(False),
):
"""Return the full board grouped by status column.
``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
install doesn't surface a "failed to load" error on the plugin tab.
"""
conn = _conn()
try:
tasks = kanban_db.list_tasks(
conn, tenant=tenant, include_archived=include_archived
)
# Pre-fetch link counts per task (cheap: one query).
link_counts: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT parent_id, child_id FROM task_links"
).fetchall():
link_counts.setdefault(row["parent_id"], {"parents": 0, "children": 0})[
"children"
] += 1
link_counts.setdefault(row["child_id"], {"parents": 0, "children": 0})[
"parents"
] += 1
# Comment counts per task (a cheap aggregate).
comment_counts: dict[str, int] = {
r["task_id"]: r["n"]
for r in conn.execute(
"SELECT task_id, COUNT(*) AS n FROM task_comments GROUP BY task_id"
)
}
# Progress rollup: for each parent, how many children are done / total.
# One pass over task_links joined with child status — cheaper than
# N per-task queries and the plugin uses it to render "N/M".
progress: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT l.parent_id AS pid, t.status AS cstatus "
"FROM task_links l JOIN tasks t ON t.id = l.child_id"
).fetchall():
p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
p["total"] += 1
if row["cstatus"] == "done":
p["done"] += 1
latest_event_id = conn.execute(
"SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
).fetchone()["m"]
columns: dict[str, list[dict]] = {c: [] for c in BOARD_COLUMNS}
if include_archived:
columns["archived"] = []
for t in tasks:
d = _task_dict(t)
d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
d["comment_count"] = comment_counts.get(t.id, 0)
d["progress"] = progress.get(t.id) # None when the task has no children
col = t.status if t.status in columns else "todo"
columns[col].append(d)
# Stable per-column ordering already applied by list_tasks
# (priority DESC, created_at ASC), keep as-is.
# List of known tenants for the UI filter dropdown.
tenants = [
r["tenant"]
for r in conn.execute(
"SELECT DISTINCT tenant FROM tasks WHERE tenant IS NOT NULL ORDER BY tenant"
)
]
# List of distinct assignees for the lane-by-profile sub-grouping.
assignees = [
r["assignee"]
for r in conn.execute(
"SELECT DISTINCT assignee FROM tasks WHERE assignee IS NOT NULL "
"AND status != 'archived' ORDER BY assignee"
)
]
return {
"columns": [
{"name": name, "tasks": columns[name]} for name in columns.keys()
],
"tenants": tenants,
"assignees": assignees,
"latest_event_id": int(latest_event_id),
"now": int(time.time()),
}
finally:
conn.close()
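The progress rollup in `get_board` can be tried in isolation. The sketch below reuses the same joined query against an in-memory database; the two-table schema is a reduced, illustrative stand-in and not the real `kanban.db` layout, and `child_progress` is a hypothetical name.

```python
import sqlite3


def child_progress(conn: sqlite3.Connection) -> dict[str, dict[str, int]]:
    """Return {parent_id: {"done": n, "total": m}} from a single joined query."""
    progress: dict[str, dict[str, int]] = {}
    rows = conn.execute(
        "SELECT l.parent_id AS pid, t.status AS cstatus "
        "FROM task_links l JOIN tasks t ON t.id = l.child_id"
    )
    for row in rows:
        p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
        p["total"] += 1
        if row["cstatus"] == "done":
            p["done"] += 1
    return progress


# Illustrative in-memory schema; the real tables carry more columns.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE task_links (parent_id TEXT, child_id TEXT);
    INSERT INTO tasks VALUES ('p', 'todo'), ('a', 'done'), ('b', 'running');
    INSERT INTO task_links VALUES ('p', 'a'), ('p', 'b');
    """
)
result = child_progress(conn)
```

Tasks without children never appear in the mapping, which is why the handler uses `progress.get(t.id)` and lets the value be `None`.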
# ---------------------------------------------------------------------------
# GET /tasks/:id
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}")
def get_task(task_id: str):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
return {
"task": _task_dict(task),
"comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
"events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
"links": _links_for(conn, task_id),
"runs": [_run_dict(r) for r in kanban_db.list_runs(conn, task_id)],
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# POST /tasks
# ---------------------------------------------------------------------------
class CreateTaskBody(BaseModel):
title: str
body: Optional[str] = None
assignee: Optional[str] = None
tenant: Optional[str] = None
priority: int = 0
workspace_kind: str = "scratch"
workspace_path: Optional[str] = None
parents: list[str] = Field(default_factory=list)
triage: bool = False
idempotency_key: Optional[str] = None
max_runtime_seconds: Optional[int] = None
skills: Optional[list[str]] = None
@router.post("/tasks")
def create_task(payload: CreateTaskBody):
conn = _conn()
try:
task_id = kanban_db.create_task(
conn,
title=payload.title,
body=payload.body,
assignee=payload.assignee,
created_by="dashboard",
workspace_kind=payload.workspace_kind,
workspace_path=payload.workspace_path,
tenant=payload.tenant,
priority=payload.priority,
parents=payload.parents,
triage=payload.triage,
idempotency_key=payload.idempotency_key,
max_runtime_seconds=payload.max_runtime_seconds,
skills=payload.skills,
)
task = kanban_db.get_task(conn, task_id)
body: dict[str, Any] = {"task": _task_dict(task) if task else None}
# Surface a dispatcher-presence warning so the UI can show a
# banner when a `ready` task would otherwise sit idle because no
# gateway is running (or dispatch_in_gateway=false). Only emit
# for ready+assigned tasks; triage/todo are expected to wait,
# and unassigned tasks can't be dispatched regardless.
if task and task.status == "ready" and task.assignee:
try:
from hermes_cli.kanban import _check_dispatcher_presence
running, message = _check_dispatcher_presence()
if not running and message:
body["warning"] = message
except Exception:
# Probe failure must never block the create itself.
pass
return body
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
# ---------------------------------------------------------------------------
# PATCH /tasks/:id (status / assignee / priority / title / body)
# ---------------------------------------------------------------------------
class UpdateTaskBody(BaseModel):
status: Optional[str] = None
assignee: Optional[str] = None
priority: Optional[int] = None
title: Optional[str] = None
body: Optional[str] = None
result: Optional[str] = None
block_reason: Optional[str] = None
# Structured handoff fields — forwarded to complete_task when status
# transitions to 'done'. Dashboard parity with ``hermes kanban
# complete --summary ... --metadata ...``.
summary: Optional[str] = None
metadata: Optional[dict] = None
@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
# --- assignee ----------------------------------------------------
if payload.assignee is not None:
try:
ok = kanban_db.assign_task(
conn, task_id, payload.assignee or None,
)
except RuntimeError as e:
raise HTTPException(status_code=409, detail=str(e))
if not ok:
raise HTTPException(status_code=404, detail="task not found")
# --- status -------------------------------------------------------
if payload.status is not None:
s = payload.status
ok = True
if s == "done":
ok = kanban_db.complete_task(
conn, task_id,
result=payload.result,
summary=payload.summary,
metadata=payload.metadata,
)
elif s == "blocked":
ok = kanban_db.block_task(conn, task_id, reason=payload.block_reason)
elif s == "ready":
# Re-open a blocked task, or just an explicit status set.
current = kanban_db.get_task(conn, task_id)
if current and current.status == "blocked":
ok = kanban_db.unblock_task(conn, task_id)
else:
# Direct status write for drag-drop (todo -> ready etc).
ok = _set_status_direct(conn, task_id, "ready")
elif s == "archived":
ok = kanban_db.archive_task(conn, task_id)
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, task_id, s)
else:
raise HTTPException(status_code=400, detail=f"unknown status: {s}")
if not ok:
raise HTTPException(
status_code=409,
detail=f"status transition to {s!r} not valid from current state",
)
# --- priority -----------------------------------------------------
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), task_id),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(task_id, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
# --- title / body -------------------------------------------------
if payload.title is not None or payload.body is not None:
with kanban_db.write_txn(conn):
sets, vals = [], []
if payload.title is not None:
if not payload.title.strip():
raise HTTPException(status_code=400, detail="title cannot be empty")
sets.append("title = ?")
vals.append(payload.title.strip())
if payload.body is not None:
sets.append("body = ?")
vals.append(payload.body)
vals.append(task_id)
conn.execute(
f"UPDATE tasks SET {', '.join(sets)} WHERE id = ?", vals,
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'edited', NULL, ?)",
(task_id, int(time.time())),
)
updated = kanban_db.get_task(conn, task_id)
return {"task": _task_dict(updated) if updated else None}
finally:
conn.close()
def _set_status_direct(
conn: sqlite3.Connection, task_id: str, new_status: str,
) -> bool:
"""Direct status write for drag-drop moves that aren't covered by the
structured complete/block/unblock/archive verbs (e.g. todo<->ready,
running<->ready). Appends a ``status`` event row for the live feed.
When this transitions OFF ``running`` to anything other than the
terminal verbs above (which own their own run closing), we close the
active run with outcome='reclaimed' so attempt history isn't
orphaned. ``running -> ready`` via drag-drop is the common case
(user yanking a stuck worker back to the queue).
"""
with kanban_db.write_txn(conn):
# Snapshot current state so we know whether to close a run.
prev = conn.execute(
"SELECT status, current_run_id FROM tasks WHERE id = ?",
(task_id,),
).fetchone()
if prev is None:
return False
was_running = prev["status"] == "running"
cur = conn.execute(
"UPDATE tasks SET status = ?, "
" claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
" claim_expires = CASE WHEN ? = 'running' THEN claim_expires ELSE NULL END, "
" worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
"WHERE id = ?",
(new_status, new_status, new_status, new_status, task_id),
)
if cur.rowcount != 1:
return False
run_id = None
if was_running and new_status != "running" and prev["current_run_id"]:
run_id = kanban_db._end_run(
conn, task_id,
outcome="reclaimed", status="reclaimed",
summary=f"status changed to {new_status} (dashboard/direct)",
)
conn.execute(
"INSERT INTO task_events (task_id, run_id, kind, payload, created_at) "
"VALUES (?, ?, 'status', ?, ?)",
(task_id, run_id, json.dumps({"status": new_status}), int(time.time())),
)
# If we re-opened something, children may have gone stale.
if new_status in ("done", "ready"):
kanban_db.recompute_ready(conn)
return True
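The `CASE WHEN` clauses in `_set_status_direct` clear the claim columns whenever the target status is not `running`. A reduced sketch of that behaviour, assuming a cut-down `tasks` table and a hypothetical `set_status` helper (no run closing or event row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, "
    "claim_lock TEXT, worker_pid INTEGER)"
)
conn.execute("INSERT INTO tasks VALUES ('t1', 'running', 'lock-1', 4242)")


def set_status(conn: sqlite3.Connection, task_id: str, new_status: str) -> bool:
    """Hypothetical helper: keep claim fields only when the target is 'running'."""
    cur = conn.execute(
        "UPDATE tasks SET status = ?, "
        "  claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
        "  worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
        "WHERE id = ?",
        (new_status, new_status, new_status, task_id),
    )
    # rowcount is 0 when the id does not exist, so the caller can report failure.
    return cur.rowcount == 1


moved = set_status(conn, "t1", "ready")
row = conn.execute("SELECT * FROM tasks WHERE id = 't1'").fetchone()
```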
# ---------------------------------------------------------------------------
# Comments
# ---------------------------------------------------------------------------
class CommentBody(BaseModel):
body: str
author: Optional[str] = "dashboard"
@router.post("/tasks/{task_id}/comments")
def add_comment(task_id: str, payload: CommentBody):
if not payload.body.strip():
raise HTTPException(status_code=400, detail="body is required")
conn = _conn()
try:
if kanban_db.get_task(conn, task_id) is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
kanban_db.add_comment(
conn, task_id, author=payload.author or "dashboard", body=payload.body,
)
return {"ok": True}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Links
# ---------------------------------------------------------------------------
class LinkBody(BaseModel):
parent_id: str
child_id: str
@router.post("/links")
def add_link(payload: LinkBody):
conn = _conn()
try:
kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
return {"ok": True}
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
@router.delete("/links")
def delete_link(parent_id: str = Query(...), child_id: str = Query(...)):
conn = _conn()
try:
ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
return {"ok": bool(ok)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Bulk actions (multi-select on the board)
# ---------------------------------------------------------------------------
class BulkTaskBody(BaseModel):
ids: list[str]
status: Optional[str] = None
assignee: Optional[str] = None # "" or None = unassign
priority: Optional[int] = None
archive: bool = False
@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody):
"""Apply the same patch to every id in ``payload.ids``.
This is an *independent* iteration: per-task failures don't abort
siblings. Returns per-id outcome so the UI can surface partials.
"""
ids = [i for i in (payload.ids or []) if i]
if not ids:
raise HTTPException(status_code=400, detail="ids is required")
results: list[dict] = []
conn = _conn()
try:
for tid in ids:
entry: dict[str, Any] = {"id": tid, "ok": True}
try:
task = kanban_db.get_task(conn, tid)
if task is None:
entry.update(ok=False, error="not found")
results.append(entry)
continue
if payload.archive:
if not kanban_db.archive_task(conn, tid):
entry.update(ok=False, error="archive refused")
if payload.status is not None and not payload.archive:
s = payload.status
if s == "done":
ok = kanban_db.complete_task(conn, tid)
elif s == "blocked":
ok = kanban_db.block_task(conn, tid)
elif s == "ready":
cur = kanban_db.get_task(conn, tid)
if cur and cur.status == "blocked":
ok = kanban_db.unblock_task(conn, tid)
else:
ok = _set_status_direct(conn, tid, "ready")
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, tid, s)
else:
entry.update(ok=False, error=f"unknown status {s!r}")
results.append(entry)
continue
if not ok:
entry.update(ok=False, error=f"transition to {s!r} refused")
if payload.assignee is not None:
try:
if not kanban_db.assign_task(
conn, tid, payload.assignee or None,
):
entry.update(ok=False, error="assign refused")
except RuntimeError as e:
entry.update(ok=False, error=str(e))
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), tid),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(tid, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
except Exception as e: # defensive — one bad id shouldn't kill the batch
entry.update(ok=False, error=str(e))
results.append(entry)
return {"results": results}
finally:
conn.close()
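The per-id outcome pattern used by `bulk_update` can be sketched without a database. `apply_each` and `demo_action` are hypothetical names for illustration; the real handler applies several kinds of patch per task.

```python
from typing import Any, Callable


def apply_each(ids: list[str], action: Callable[[str], bool]) -> list[dict[str, Any]]:
    """Run ``action`` per id; record failures instead of aborting the batch."""
    results: list[dict[str, Any]] = []
    for tid in ids:
        entry: dict[str, Any] = {"id": tid, "ok": True}
        try:
            if not action(tid):
                entry.update(ok=False, error="refused")
        except Exception as exc:  # one bad id should not stop the others
            entry.update(ok=False, error=str(exc))
        results.append(entry)
    return results


def demo_action(tid: str) -> bool:
    """Hypothetical action: 'boom' raises, 'no' is refused, anything else succeeds."""
    if tid == "boom":
        raise RuntimeError("db locked")
    return tid != "no"


results = apply_each(["a", "boom", "no"], demo_action)
```

Because every id gets an entry, a caller can show partial success rather than treating the whole batch as failed.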
# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------
@router.get("/config")
def get_config():
"""Return kanban dashboard preferences from ~/.hermes/config.yaml.
Reads the ``dashboard.kanban`` section if present; defaults otherwise.
Used by the UI to pre-select tenant filters, toggle markdown rendering,
or set column-width preferences without a round-trip per page load.
"""
try:
from hermes_cli.config import load_config
cfg = load_config() or {}
except Exception:
cfg = {}
dash_cfg = (cfg.get("dashboard") or {})
# dashboard.kanban may itself be a dict; fall back to {}.
k_cfg = dash_cfg.get("kanban") or {}
return {
"default_tenant": k_cfg.get("default_tenant") or "",
"lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
"include_archived_by_default": bool(k_cfg.get("include_archived_by_default", False)),
"render_markdown": bool(k_cfg.get("render_markdown", True)),
}
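The `or {}` fallbacks in `get_config` matter when a YAML key exists but is null. A small sketch of the same lookup as a pure function, with `kanban_prefs` as a hypothetical name and only three of the keys shown:

```python
from typing import Any, Optional


def kanban_prefs(cfg: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: read dashboard.kanban.* with safe fallbacks."""
    # ``or {}`` also covers keys that are present but set to null.
    dash_cfg = (cfg or {}).get("dashboard") or {}
    k_cfg = dash_cfg.get("kanban") or {}
    return {
        "default_tenant": k_cfg.get("default_tenant") or "",
        "lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
        "render_markdown": bool(k_cfg.get("render_markdown", True)),
    }


empty = kanban_prefs(None)
null_section = kanban_prefs({"dashboard": {"kanban": None}})
custom = kanban_prefs(
    {"dashboard": {"kanban": {"default_tenant": "acme", "lane_by_profile": False}}}
)
```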
# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------
@router.get("/stats")
def get_stats():
"""Per-status + per-assignee counts + oldest-ready age.
Designed for the dashboard HUD and for router profiles that need to
answer "is this specialist overloaded?" without scanning the whole
board themselves.
"""
conn = _conn()
try:
return kanban_db.board_stats(conn)
finally:
conn.close()
@router.get("/assignees")
def get_assignees():
"""Known profiles + per-profile task counts.
Returns the union of ``~/.hermes/profiles/*`` on disk and every
distinct assignee currently used on the board. The dashboard uses
this to populate its assignee dropdown so a freshly-created profile
appears in the picker before it's been given any task.
"""
conn = _conn()
try:
return {"assignees": kanban_db.known_assignees(conn)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Worker log (read-only; file written by _default_spawn)
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}/log")
def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_000)):
"""Return the worker's stdout/stderr log.
``tail`` caps the response size (bytes) so the dashboard drawer
doesn't pull megabytes into the browser. Returns 404 if the task
does not exist. The on-disk log is rotated at 2 MiB per
``_rotate_worker_log``; a single ``.log.1`` is kept, no further
generations, so disk usage per task is bounded at ~4 MiB.
"""
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
finally:
conn.close()
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
content = kanban_db.read_worker_log(task_id, tail_bytes=tail)
log_path = kanban_db.worker_log_path(task_id)
size = log_path.stat().st_size if log_path.exists() else 0
return {
"task_id": task_id,
"path": str(log_path),
"exists": content is not None,
"size_bytes": size,
"content": content or "",
# Truncated when the on-disk file was larger than the tail cap.
"truncated": bool(tail and size > tail),
}
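How a byte-capped tail read can work is sketched below. This is not the implementation of `kanban_db.read_worker_log`, which is not shown in this diff; `read_tail` is a hypothetical stand-in that follows the same contract (`None` when the file is absent, at most `tail_bytes` otherwise).

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_tail(path: Path, tail_bytes: Optional[int] = None) -> Optional[str]:
    """Hypothetical reader: return the last ``tail_bytes`` of a file, or None if absent."""
    if not path.exists():
        return None
    size = path.stat().st_size
    with path.open("rb") as fh:
        if tail_bytes is not None and size > tail_bytes:
            fh.seek(size - tail_bytes)
        data = fh.read()
    # A tail can start mid-character, so decode leniently.
    return data.decode("utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "task.log"
    log_path.write_bytes(b"0123456789")
    tail = read_tail(log_path, tail_bytes=4)
    whole = read_tail(log_path)
    missing = read_tail(Path(tmp) / "absent.log")
```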
# ---------------------------------------------------------------------------
# Dispatch nudge (optional quick-path so the UI doesn't wait 60 s)
# ---------------------------------------------------------------------------
@router.post("/dispatch")
def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
conn = _conn()
try:
result = kanban_db.dispatch_once(
conn, dry_run=dry_run, max_spawn=max_n,
)
# DispatchResult is a dataclass.
try:
return asdict(result)
except TypeError:
return {"result": str(result)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------
# Poll interval for the event tail loop. SQLite WAL + 300 ms polling is
# the simplest and most robust approach; it adds a fraction of a percent
# of CPU and has no shared state to synchronize across workers.
_EVENT_POLL_SECONDS = 0.3
@router.websocket("/events")
async def stream_events(ws: WebSocket):
# Enforce the dashboard session token as a query param — browsers can't
# set Authorization on a WS upgrade. This matches how the PTY bridge
# authenticates in hermes_cli/web_server.py.
token = ws.query_params.get("token")
if not _check_ws_token(token):
await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
return
await ws.accept()
try:
since_raw = ws.query_params.get("since", "0")
try:
cursor = int(since_raw)
except ValueError:
cursor = 0
def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
conn = kanban_db.connect()
try:
rows = conn.execute(
"SELECT id, task_id, run_id, kind, payload, created_at "
"FROM task_events WHERE id > ? ORDER BY id ASC LIMIT 200",
(cursor_val,),
).fetchall()
out: list[dict] = []
new_cursor = cursor_val
for r in rows:
try:
payload = json.loads(r["payload"]) if r["payload"] else None
except Exception:
payload = None
out.append({
"id": r["id"],
"task_id": r["task_id"],
"run_id": r["run_id"],
"kind": r["kind"],
"payload": payload,
"created_at": r["created_at"],
})
new_cursor = r["id"]
return new_cursor, out
finally:
conn.close()
while True:
cursor, events = await asyncio.to_thread(_fetch_new, cursor)
if events:
await ws.send_json({"events": events, "cursor": cursor})
await asyncio.sleep(_EVENT_POLL_SECONDS)
except WebSocketDisconnect:
return
except Exception as exc: # defensive: never crash the dashboard worker
log.warning("Kanban event stream error: %s", exc)
try:
await ws.close()
except Exception:
pass
@@ -1,32 +0,0 @@
# DEPRECATED — the kanban dispatcher now runs inside the gateway by
# default (config key: kanban.dispatch_in_gateway, default true). To
# migrate:
#
# systemctl --user disable --now hermes-kanban-dispatcher.service
# # then make sure a gateway is running; e.g. a systemd user unit
# # for `hermes gateway start`. The gateway hosts the dispatcher.
#
# This unit is kept for users who truly cannot run the gateway (host
# policy forbids long-lived services, etc.). It now invokes the
# standalone dispatcher via the explicit --force flag, so nobody
# accidentally keeps two dispatchers racing against the same
# kanban.db. Running this unit AND a gateway with
# dispatch_in_gateway=true is NOT supported.
[Unit]
Description=Hermes Kanban dispatcher (DEPRECATED standalone daemon — prefer gateway-embedded dispatch)
Documentation=https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/env hermes kanban daemon --force --interval 60 --pidfile %t/hermes-kanban-dispatcher.pid
Restart=on-failure
RestartSec=5
# Log to the journal via stdout/stderr; the dispatcher also writes per-task
# worker output to $HERMES_HOME/kanban/logs/<task>.log.
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
+12 -23
@@ -110,17 +110,6 @@ def _parse_context_tokens(host_val, root_val) -> int | None:
return None
def _parse_int_config(host_val, root_val, default: int) -> int:
"""Parse an integer config: host wins, then root, then default."""
for val in (host_val, root_val):
if val is not None:
try:
return int(val)
except (ValueError, TypeError):
pass
return default
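The helper above and the `int(... or ... or default)` form used at the call sites in this file's diff are not interchangeable on edge values. A sketch of the difference, with `parse_int_config` copied from the helper and `or_chain` as a hypothetical stand-in for the inline form:

```python
def parse_int_config(host_val, root_val, default: int) -> int:
    """Host wins, then root, then default; unparseable values are skipped."""
    for val in (host_val, root_val):
        if val is not None:
            try:
                return int(val)
            except (ValueError, TypeError):
                pass
    return default


def or_chain(host_val, root_val, default: int) -> int:
    """Hypothetical stand-in for ``int(host or root or default)``."""
    return int(host_val or root_val or default)


# An explicit 0 is falsy, so the ``or`` chain skips it.
zero_helper = parse_int_config(0, None, 600)
zero_chain = or_chain(0, None, 600)

# A non-numeric host value falls through in the helper but raises in the chain.
bad_helper = parse_int_config("abc", "50", 600)
try:
    bad_chain = or_chain("abc", "50", 600)
except ValueError:
    bad_chain = "ValueError"
```

Whether an explicit `0` should be honoured for these settings is a design decision the diff does not state; the sketch only shows that the two forms disagree.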
def _parse_dialectic_depth(host_val, root_val) -> int:
"""Parse dialecticDepth: host wins, then root, then 1. Clamped to 1-3."""
for val in (host_val, root_val):
@@ -474,10 +463,10 @@ class HonchoClientConfig:
raw.get("dialecticDynamic"),
default=True,
),
dialectic_max_chars=_parse_int_config(
host_block.get("dialecticMaxChars"),
raw.get("dialecticMaxChars"),
default=600,
dialectic_max_chars=int(
host_block.get("dialecticMaxChars")
or raw.get("dialecticMaxChars")
or 600
),
dialectic_depth=_parse_dialectic_depth(
host_block.get("dialecticDepth"),
@@ -498,15 +487,15 @@ class HonchoClientConfig:
or raw.get("reasoningLevelCap")
or "high"
),
message_max_chars=_parse_int_config(
host_block.get("messageMaxChars"),
raw.get("messageMaxChars"),
default=25000,
message_max_chars=int(
host_block.get("messageMaxChars")
or raw.get("messageMaxChars")
or 25000
),
dialectic_max_input_chars=_parse_int_config(
host_block.get("dialecticMaxInputChars"),
raw.get("dialecticMaxInputChars"),
default=10000,
dialectic_max_input_chars=int(
host_block.get("dialecticMaxInputChars")
or raw.get("dialecticMaxInputChars")
or 10000
),
recall_mode=_normalize_recall_mode(
host_block.get("recallMode")
+6 -9
@@ -160,13 +160,11 @@ class HonchoSessionManager:
Peers are lazy -- no API call until first use.
Observation settings are controlled per-session via SessionPeerConfig.
"""
with self._cache_lock:
if peer_id in self._peers_cache:
return self._peers_cache[peer_id]
if peer_id in self._peers_cache:
return self._peers_cache[peer_id]
peer = self.honcho.peer(peer_id)
with self._cache_lock:
self._peers_cache[peer_id] = peer
self._peers_cache[peer_id] = peer
return peer
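The locked and unlocked cache lookups in this hunk differ only under concurrent callers. A minimal sketch of the locked variant, assuming a hypothetical `PeerCache` class and factory callable; the `setdefault` on store is a small variation on the plain assignment shown above, not the source's behaviour.

```python
import threading
from typing import Callable, Dict


class PeerCache:
    """Hypothetical cache mirroring the locked lookup/store pattern."""

    def __init__(self, factory: Callable[[str], object]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._cache: Dict[str, object] = {}

    def get_or_create(self, key: str) -> object:
        with self._lock:
            if key in self._cache:
                return self._cache[key]
        # Build outside the lock so a slow factory does not block other keys.
        value = self._factory(key)
        with self._lock:
            # setdefault keeps the first stored value if two threads raced here.
            return self._cache.setdefault(key, value)


cache = PeerCache(lambda key: object())
first = cache.get_or_create("alice")
second = cache.get_or_create("alice")
other = cache.get_or_create("bob")
```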
def _get_or_create_honcho_session(
@@ -178,10 +176,9 @@ class HonchoSessionManager:
Returns:
Tuple of (honcho_session, existing_messages).
"""
with self._cache_lock:
if session_id in self._sessions_cache:
logger.debug("Honcho session '%s' retrieved from cache", session_id)
return self._sessions_cache[session_id], []
if session_id in self._sessions_cache:
logger.debug("Honcho session '%s' retrieved from cache", session_id)
return self._sessions_cache[session_id], []
session = self.honcho.session(session_id)
-3
@@ -38,7 +38,6 @@ except ImportError:
try:
from microsoft_teams.apps import App, ActivityContext
from microsoft_teams.common.http.client import ClientOptions
from microsoft_teams.api import MessageActivity, ConversationReference
from microsoft_teams.api.activities.typing import TypingActivityInput
from microsoft_teams.api.activities.invoke.adaptive_card import AdaptiveCardInvokeActivity
@@ -58,7 +57,6 @@ try:
TEAMS_SDK_AVAILABLE = True
except ImportError:
TEAMS_SDK_AVAILABLE = False
ClientOptions = None # type: ignore[assignment,misc]
App = None # type: ignore[assignment,misc]
ActivityContext = None # type: ignore[assignment,misc]
MessageActivity = None # type: ignore[assignment,misc]
@@ -210,7 +208,6 @@ class TeamsAdapter(BasePlatformAdapter):
client_secret=self._client_secret,
tenant_id=self._tenant_id,
http_server_adapter=_AiohttpBridgeAdapter(aiohttp_app),
client=ClientOptions(headers={"User-Agent": "Hermes"}),
)
# Register message handler before initialize()
+1 -1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.12.0"
version = "0.11.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
+154 -366
@@ -23,7 +23,6 @@ Usage:
import asyncio
import base64
import concurrent.futures
import contextvars
import copy
import hashlib
import json
@@ -134,7 +133,6 @@ from agent.prompt_builder import (
DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS,
MEMORY_GUIDANCE, SESSION_SEARCH_GUIDANCE, SKILLS_GUIDANCE,
HERMES_AGENT_HELP_GUIDANCE,
KANBAN_GUIDANCE,
build_nous_subscription_prompt,
)
from agent.model_metadata import (
@@ -162,13 +160,6 @@ from agent.display import (
_detect_tool_failure,
get_tool_emoji as _get_tool_emoji,
)
from agent.tool_guardrails import (
ToolCallGuardrailConfig,
ToolCallGuardrailController,
ToolGuardrailDecision,
append_toolguard_guidance,
toolguard_synthetic_result,
)
from agent.trajectory import (
convert_scratchpad_to_think, has_incomplete_scratchpad,
save_trajectory as _save_trajectory_to_file,
@@ -1157,8 +1148,6 @@ class AIAgent:
# Tool execution state — allows _vprint during tool execution
# even when stream consumers are registered (no tokens streaming then)
self._executing_tools = False
self._tool_guardrails = ToolCallGuardrailController()
self._tool_guardrail_halt_decision: ToolGuardrailDecision | None = None
# Interrupt mechanism for breaking out of tool loops
self._interrupt_requested = False
@@ -1632,12 +1621,30 @@ class AIAgent:
self._session_db = session_db
self._parent_session_id = parent_session_id
self._last_flushed_db_idx = 0 # tracks DB-write cursor to prevent duplicate writes
self._session_db_created = False # DB row deferred to run_conversation()
self._session_init_model_config = {
"max_iterations": self.max_iterations,
"reasoning_config": reasoning_config,
"max_tokens": max_tokens,
}
if self._session_db:
try:
self._session_db.create_session(
session_id=self.session_id,
source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
model=self.model,
model_config={
"max_iterations": self.max_iterations,
"reasoning_config": reasoning_config,
"max_tokens": max_tokens,
},
user_id=None,
parent_session_id=self._parent_session_id,
)
except Exception as e:
# Transient SQLite lock contention (e.g. CLI and gateway writing
# concurrently) must NOT permanently disable session_search for
# this agent. Keep _session_db alive — subsequent message
# flushes and session_search calls will still work once the
# lock clears. The session row may be missing from the index
# for this run, but that is recoverable (flushes upsert rows).
logger.warning(
"Session DB create_session failed (session_search still available): %s", e
)
# In-memory todo list for task planning (one per agent/session)
from tools.todo_tool import TodoStore
@@ -1649,14 +1656,6 @@ class AIAgent:
_agent_cfg = _load_agent_config()
except Exception:
_agent_cfg = {}
try:
self._tool_guardrails = ToolCallGuardrailController(
ToolCallGuardrailConfig.from_mapping(
_agent_cfg.get("tool_loop_guardrails", {})
)
)
except Exception as _tlg_err:
logger.warning("Tool loop guardrail config ignored: %s", _tlg_err)
# Cache only the derived auxiliary compression context override that is
# needed later by the startup feasibility check. Avoid exposing a
# broad pseudo-public config object on the agent instance.
@@ -2152,28 +2151,6 @@ class AIAgent:
"is_anthropic_oauth": self._is_anthropic_oauth,
})
def _ensure_db_session(self) -> None:
"""Create session DB row on first use. Disables _session_db on failure."""
if self._session_db_created or not self._session_db:
return
try:
self._session_db.create_session(
session_id=self.session_id,
source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
model=self.model,
model_config=self._session_init_model_config,
system_prompt=self._cached_system_prompt,
user_id=None,
parent_session_id=self._parent_session_id,
)
self._session_db_created = True
except Exception as e:
# Transient failure (e.g. SQLite lock). Keep _session_db alive —
# _session_db_created stays False so next run_conversation() retries.
logger.warning(
"Session DB creation failed (will retry next turn): %s", e
)
def reset_session_state(self):
"""Reset all session-scoped token counters to 0 for a fresh session.
@@ -3615,15 +3592,11 @@ class AIAgent:
if actions:
summary = " · ".join(dict.fromkeys(actions))
self._safe_print(
f" 💾 Self-improvement review: {summary}"
)
self._safe_print(f" 💾 {summary}")
_bg_cb = self.background_review_callback
if _bg_cb:
try:
_bg_cb(
f"💾 Self-improvement review: {summary}"
)
_bg_cb(f"💾 {summary}")
except Exception:
pass
@@ -3723,9 +3696,14 @@ class AIAgent:
return
self._apply_persist_user_message_override(messages)
try:
# Retry row creation if the earlier attempt failed transiently.
if not self._session_db_created:
self._ensure_db_session()
# If create_session() failed at startup (e.g. transient lock), the
# session row may not exist yet. ensure_session() uses INSERT OR
# IGNORE so it is a no-op when the row is already there.
self._session_db.ensure_session(
self.session_id,
source=self.platform or "cli",
model=self.model,
)
start_idx = len(conversation_history) if conversation_history else 0
flush_from = max(start_idx, self._last_flushed_db_idx)
for msg in messages[flush_from:]:
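Reviewer note: the `ensure_session()` comment in this hunk relies on `INSERT OR IGNORE` being a no-op when the row already exists. A minimal sketch of that idempotency, assuming a hypothetical three-column `sessions` table (the real schema and the real `ensure_session()` signature are not shown in this diff):

```python
import sqlite3


def ensure_session(conn, session_id, source="cli", model=None):
    """Insert a session row only if none exists yet (hypothetical schema)."""
    # INSERT OR IGNORE skips the insert when the PRIMARY KEY already exists,
    # so calling this on every flush is safe.
    conn.execute(
        "INSERT OR IGNORE INTO sessions (id, source, model) VALUES (?, ?, ?)",
        (session_id, source, model),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, source TEXT, model TEXT)")

ensure_session(conn, "s1", source="cli", model="model-a")
# Second call with the same id leaves the existing row untouched.
ensure_session(conn, "s1", source="gateway", model="model-b")

rows = conn.execute("SELECT id, source, model FROM sessions").fetchall()
```

Because the second call is ignored rather than merged, any fields that differ (here `source` and `model`) keep their first-written values.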
@@ -4845,12 +4823,6 @@ class AIAgent:
tool_guidance.append(SESSION_SEARCH_GUIDANCE)
if "skill_manage" in self.valid_tool_names:
tool_guidance.append(SKILLS_GUIDANCE)
# Kanban worker/orchestrator lifecycle — only present when the
# dispatcher spawned this process (kanban_show check_fn gates on
# HERMES_KANBAN_TASK env var). Normal chat sessions never see
# this block.
if "kanban_show" in self.valid_tool_names:
tool_guidance.append(KANBAN_GUIDANCE)
if tool_guidance:
prompt_parts.append(" ".join(tool_guidance))
@@ -4998,8 +4970,8 @@ class AIAgent:
def _get_tool_call_id_static(tc) -> str:
"""Extract call ID from a tool_call entry (dict or object)."""
if isinstance(tc, dict):
return tc.get("call_id", "") or tc.get("id", "") or ""
return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""
return tc.get("id", "") or ""
return getattr(tc, "id", "") or ""
_VALID_API_ROLES = frozenset({"system", "user", "assistant", "tool", "function", "developer"})
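Reviewer note: the shorter variant of `_get_tool_call_id_static` in this hunk reads only `id`, from either a dict or an attribute-style object. A standalone sketch of that behaviour (the function name `get_tool_call_id` and the sample inputs are illustrative, not taken from the codebase):

```python
from types import SimpleNamespace


def get_tool_call_id(tc) -> str:
    """Return the call ID from a dict or object entry, or "" when absent."""
    if isinstance(tc, dict):
        return tc.get("id", "") or ""
    # `or ""` also normalizes an explicit None to an empty string.
    return getattr(tc, "id", "") or ""


dict_id = get_tool_call_id({"id": "call_1"})
obj_id = get_tool_call_id(SimpleNamespace(id="call_2"))
missing_id = get_tool_call_id(SimpleNamespace(id=None))
```

Entries that carry only a `call_id` field resolve to an empty string under this variant, which is the behavioural difference from the longer variant in the same hunk.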
@@ -8529,7 +8501,6 @@ class AIAgent:
Handles reasoning extraction, reasoning_details, and optional tool_calls
so both the tool-call path and the final-response path share one builder.
"""
assistant_tool_calls = getattr(assistant_message, "tool_calls", None)
reasoning_text = self._extract_reasoning(assistant_message)
_from_structured = bool(reasoning_text)
@@ -8589,26 +8560,16 @@ class AIAgent:
"finish_reason": finish_reason,
}
raw_reasoning_content = getattr(assistant_message, "reasoning_content", None)
if raw_reasoning_content is None and hasattr(assistant_message, "model_extra"):
model_extra = getattr(assistant_message, "model_extra", None) or {}
if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
raw_reasoning_content = model_extra["reasoning_content"]
if raw_reasoning_content is not None:
msg["reasoning_content"] = _sanitize_surrogates(raw_reasoning_content)
elif assistant_tool_calls and self._needs_thinking_reasoning_pad():
# DeepSeek v4 thinking mode and Kimi / Moonshot thinking mode
# both require reasoning_content on every assistant tool-call
# message. Without it, replaying the persisted message causes
# HTTP 400 ("The reasoning_content in the thinking mode must
# be passed back to the API"). Include streamed reasoning
# text when captured; otherwise pad with a single space —
# DeepSeek V4 Pro tightened validation and rejects empty
# string ("The reasoning content in the thinking mode must
# be passed back to the API"). A space satisfies non-empty
# checks everywhere without leaking fabricated reasoning.
# Refs #15250, #17400, #17341.
msg["reasoning_content"] = reasoning_text or " "
if hasattr(assistant_message, "reasoning_content"):
raw_reasoning_content = getattr(assistant_message, "reasoning_content", None)
if raw_reasoning_content is not None:
msg["reasoning_content"] = _sanitize_surrogates(raw_reasoning_content)
elif msg.get("tool_calls") and self._needs_deepseek_tool_reasoning():
# DeepSeek thinking mode requires reasoning_content on every
# assistant tool-call message. Without it, replaying the
# persisted message causes HTTP 400. Include empty string
# as a defensive compatibility fallback (refs #15250).
msg["reasoning_content"] = ""
# Additive fallback (refs #16844, #16884). Streaming-only providers
# (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims)
@@ -8665,9 +8626,9 @@ class AIAgent:
if codex_message_items:
msg["codex_message_items"] = codex_message_items
if assistant_tool_calls:
if assistant_message.tool_calls:
tool_calls = []
for tool_call in assistant_tool_calls:
for tool_call in assistant_message.tool_calls:
raw_id = getattr(tool_call, "id", None)
call_id = getattr(tool_call, "call_id", None)
if not isinstance(call_id, str) or not call_id.strip():
@@ -8716,18 +8677,6 @@ class AIAgent:
return msg
def _needs_thinking_reasoning_pad(self) -> bool:
"""Return True when the active provider enforces reasoning_content echo-back.
DeepSeek v4 thinking and Kimi / Moonshot thinking both reject replays
of assistant tool-call messages that omit ``reasoning_content`` (refs
#15250, #17400).
"""
return (
self._needs_deepseek_tool_reasoning()
or self._needs_kimi_tool_reasoning()
)
def _needs_kimi_tool_reasoning(self) -> bool:
"""Return True when the current provider is Kimi / Moonshot thinking mode.
@@ -8763,35 +8712,27 @@ class AIAgent:
return
# 1. Explicit reasoning_content already set — preserve it verbatim
# (includes DeepSeek/Kimi's own space-placeholder written at creation
# time, and any valid reasoning content from the same provider).
#
# Exception: sessions persisted BEFORE #17341 have empty-string
# placeholders pinned at creation time. DeepSeek V4 Pro rejects
# those with HTTP 400. When the active provider enforces the
# thinking-mode echo, upgrade "" → " " on replay so stale history
# doesn't 400 the user on the next turn.
# (includes DeepSeek/Kimi's own empty-string placeholder written at
# creation time, and any valid reasoning content from the same provider).
existing = source_msg.get("reasoning_content")
if isinstance(existing, str):
if existing == "" and self._needs_thinking_reasoning_pad():
api_msg["reasoning_content"] = " "
else:
api_msg["reasoning_content"] = existing
api_msg["reasoning_content"] = existing
return
needs_thinking_pad = self._needs_thinking_reasoning_pad()
needs_thinking_pad = (
self._needs_kimi_tool_reasoning()
or self._needs_deepseek_tool_reasoning()
)
# 2. Cross-provider poisoned history (#15748): on DeepSeek/Kimi,
# if the source turn has tool_calls AND a 'reasoning' field but no
# 'reasoning_content' key, the 'reasoning' text was written by a
# prior provider (e.g. MiniMax) — DeepSeek's own _build_assistant_message
# pins reasoning_content at creation time for tool-call turns, so the
# shape (reasoning set, reasoning_content absent, tool_calls present)
# is unreachable from same-provider DeepSeek history after this fix.
# Inject a single space to satisfy the API without leaking another
# provider's chain of thought to DeepSeek/Kimi. Space (not "")
# because DeepSeek V4 Pro rejects empty-string reasoning_content
# in thinking mode (refs #17341).
# always pins reasoning_content="" at creation time for tool-call turns,
# so the shape (reasoning set, reasoning_content absent, tool_calls
# present) is unreachable from same-provider DeepSeek history. Inject
# "" to satisfy the API without leaking another provider's chain of
# thought to DeepSeek/Kimi.
normalized_reasoning = source_msg.get("reasoning")
if (
needs_thinking_pad
@@ -8799,28 +8740,25 @@ class AIAgent:
and isinstance(normalized_reasoning, str)
and normalized_reasoning
):
api_msg["reasoning_content"] = " "
api_msg["reasoning_content"] = ""
return
# 3. Healthy session: promote 'reasoning' field to 'reasoning_content'
# for providers that use the internal 'reasoning' key.
# This must happen before the unconditional empty-string fallback so
# genuine reasoning content is not overwritten (#15812 regression in
# PR #15478).
# This must happen BEFORE the DeepSeek/Kimi tool-call check so that
# genuine reasoning content is not overwritten by the empty-string
# fallback (#15812 regression in PR #15478).
if isinstance(normalized_reasoning, str) and normalized_reasoning:
api_msg["reasoning_content"] = normalized_reasoning
return
# 4. DeepSeek / Kimi thinking mode: all assistant messages need
# reasoning_content. Inject a single space to satisfy the provider's
# requirement when no explicit reasoning content is present. Covers
# both tool-call turns (already-poisoned history with no reasoning
# at all) and plain text turns. Space (not "") because DeepSeek V4
# Pro tightened validation and rejects empty string with HTTP 400
# ("The reasoning content in the thinking mode must be passed back
# to the API"). Refs #17341.
# reasoning_content. Inject "" to satisfy the provider's requirement
# when no explicit reasoning content is present. Covers both
# tool-call turns (already-poisoned history with no reasoning at all)
# and plain text turns.
if needs_thinking_pad:
api_msg["reasoning_content"] = " "
api_msg["reasoning_content"] = ""
return
# 5. reasoning_content was present but not a string (e.g. None after
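Reviewer note: the numbered comments in this hunk describe an ordered fallback for choosing `reasoning_content` on replay. The sketch below restates steps 1 to 4 as a pure function. It is an approximation under stated assumptions: the function name, the `pad` parameter, and the exact `tool_calls` check in step 2 are hypothetical (the diff elides part of that condition), and the pad value is left configurable because the hunk contains both an empty-string and a single-space variant.

```python
from typing import Optional


def resolve_reasoning_content(
    source_msg: dict, needs_thinking_pad: bool, pad: str = ""
) -> Optional[str]:
    """Pick the reasoning_content to send for one assistant message."""
    # 1. Explicit string already stored: preserve it verbatim.
    existing = source_msg.get("reasoning_content")
    if isinstance(existing, str):
        return existing

    reasoning = source_msg.get("reasoning")
    has_reasoning = isinstance(reasoning, str) and bool(reasoning)

    # 2. Cross-provider history: a tool-call turn with 'reasoning' but no
    #    'reasoning_content' gets the pad, so another provider's chain of
    #    thought is not forwarded. (Assumed condition, see lead-in.)
    if needs_thinking_pad and source_msg.get("tool_calls") and has_reasoning:
        return pad

    # 3. Healthy session: promote 'reasoning' to 'reasoning_content'.
    if has_reasoning:
        return reasoning

    # 4. Thinking-mode providers need the field on every assistant message.
    if needs_thinking_pad:
        return pad

    # Nothing to attach.
    return None


kept = resolve_reasoning_content({"reasoning_content": "x"}, True)
padded = resolve_reasoning_content({"tool_calls": [{}], "reasoning": "cot"}, True)
promoted = resolve_reasoning_content({"reasoning": "cot"}, False)
```

The ordering matters: step 3 must run before step 4, otherwise genuine reasoning text on a thinking-mode provider would be replaced by the pad.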
@@ -9055,15 +8993,12 @@ class AIAgent:
self.session_id = f"{datetime.now().strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:6]}"
# Update session_log_file to point to the new session's JSON file
self.session_log_file = self.logs_dir / f"session_{self.session_id}.json"
self._session_db_created = False
self._session_db.create_session(
session_id=self.session_id,
source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
model=self.model,
model_config=self._session_init_model_config,
parent_session_id=old_session_id,
)
self._session_db_created = True
# Auto-number the title for the continuation session
if old_title:
try:
@@ -9121,14 +9056,9 @@ class AIAgent:
# Update token estimate after compaction so pressure calculations
# use the post-compression count, not the stale pre-compression one.
# Use estimate_request_tokens_rough() so tool schemas are included —
# with 50+ tools enabled, schemas alone can add 20-30K tokens, and
# omitting them delays the next compression cycle far past the
# configured threshold (issue #14695).
_compressed_est = estimate_request_tokens_rough(
compressed,
system_prompt=new_system_prompt or "",
tools=self.tools or None,
_compressed_est = (
estimate_tokens_rough(new_system_prompt)
+ estimate_messages_tokens_rough(compressed)
)
self.context_compressor.last_prompt_tokens = _compressed_est
self.context_compressor.last_completion_tokens = 0
@@ -9149,44 +9079,6 @@ class AIAgent:
)
return compressed, new_system_prompt
def _set_tool_guardrail_halt(self, decision: ToolGuardrailDecision) -> None:
"""Record the first guardrail decision that should stop this turn."""
if decision.should_halt and self._tool_guardrail_halt_decision is None:
self._tool_guardrail_halt_decision = decision
def _toolguard_controlled_halt_response(self, decision: ToolGuardrailDecision) -> str:
tool = decision.tool_name or "a tool"
return (
f"I stopped retrying {tool} because it hit the tool-call guardrail "
f"({decision.code}) after {decision.count} repeated non-progressing "
"attempts. The last tool result explains the blocker; the next step is "
"to change strategy instead of repeating the same call."
)
def _append_guardrail_observation(
self,
tool_name: str,
function_args: dict,
function_result: str,
*,
failed: bool,
) -> str:
decision = self._tool_guardrails.after_call(
tool_name,
function_args,
function_result,
failed=failed,
)
if decision.action in {"warn", "halt"}:
function_result = append_toolguard_guidance(function_result, decision)
if decision.should_halt:
self._set_tool_guardrail_halt(decision)
return function_result
def _guardrail_block_result(self, decision: ToolGuardrailDecision) -> str:
self._set_tool_guardrail_halt(decision)
return toolguard_synthetic_result(decision)
def _execute_tool_calls(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
"""Execute tool calls from the assistant message and append results to messages.
@@ -9230,8 +9122,7 @@ class AIAgent:
)
def _invoke_tool(self, function_name: str, function_args: dict, effective_task_id: str,
tool_call_id: Optional[str] = None, messages: list = None,
pre_tool_block_checked: bool = False) -> str:
tool_call_id: Optional[str] = None, messages: list = None) -> str:
"""Invoke a single tool and return the result string. No display logic.
Handles both agent-level tools (todo, memory, etc.) and registry-dispatched
@@ -9240,14 +9131,13 @@ class AIAgent:
"""
# Check plugin hooks for a block directive before executing anything.
block_message: Optional[str] = None
if not pre_tool_block_checked:
try:
from hermes_cli.plugins import get_pre_tool_call_block_message
block_message = get_pre_tool_call_block_message(
function_name, function_args, task_id=effective_task_id or "",
)
except Exception:
pass
try:
from hermes_cli.plugins import get_pre_tool_call_block_message
block_message = get_pre_tool_call_block_message(
function_name, function_args, task_id=effective_task_id or "",
)
except Exception:
pass
if block_message is not None:
return json.dumps({"error": block_message}, ensure_ascii=False)
@@ -9399,31 +9289,13 @@ class AIAgent:
except Exception:
pass
block_result = None
blocked_by_guardrail = False
try:
from hermes_cli.plugins import get_pre_tool_call_block_message
block_message = get_pre_tool_call_block_message(
function_name, function_args, task_id=effective_task_id or "",
)
except Exception:
block_message = None
if block_message is not None:
block_result = json.dumps({"error": block_message}, ensure_ascii=False)
else:
guardrail_decision = self._tool_guardrails.before_call(function_name, function_args)
if not guardrail_decision.allows_execution:
block_result = self._guardrail_block_result(guardrail_decision)
blocked_by_guardrail = True
parsed_calls.append((tool_call, function_name, function_args, block_result, blocked_by_guardrail))
parsed_calls.append((tool_call, function_name, function_args))
# ── Logging / callbacks ──────────────────────────────────────────
tool_names_str = ", ".join(name for _, name, _, _, _ in parsed_calls)
tool_names_str = ", ".join(name for _, name, _ in parsed_calls)
if not self.quiet_mode:
print(f" ⚡ Concurrent: {num_tools} tool calls — {tool_names_str}")
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls, 1):
for i, (tc, name, args) in enumerate(parsed_calls, 1):
args_str = json.dumps(args, ensure_ascii=False)
if self.verbose_logging:
print(f" 📞 Tool {i}: {name}({list(args.keys())})")
@@ -9432,9 +9304,7 @@ class AIAgent:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
for tc, name, args, block_result, blocked_by_guardrail in parsed_calls:
if block_result is not None:
continue
for tc, name, args in parsed_calls:
if self.tool_progress_callback:
try:
preview = _build_tool_preview(name, args)
@@ -9442,9 +9312,7 @@ class AIAgent:
except Exception as cb_err:
logging.debug(f"Tool progress callback error: {cb_err}")
for tc, name, args, block_result, blocked_by_guardrail in parsed_calls:
if block_result is not None:
continue
for tc, name, args in parsed_calls:
if self.tool_start_callback:
try:
self.tool_start_callback(tc.id, name, args)
@@ -9452,11 +9320,8 @@ class AIAgent:
logging.debug(f"Tool start callback error: {cb_err}")
# ── Concurrent execution ─────────────────────────────────────────
# Each slot holds (function_name, function_args, function_result, duration, error_flag, blocked_flag)
# Each slot holds (function_name, function_args, function_result, duration, error_flag)
results = [None] * num_tools
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls):
if block_result is not None:
results[i] = (name, args, block_result, 0.0, True, True)
# Touch activity before launching workers so the gateway knows
# we're executing tools (not stuck).
@@ -9511,14 +9376,7 @@ class AIAgent:
pass
start = time.time()
try:
result = self._invoke_tool(
function_name,
function_args,
effective_task_id,
tool_call.id,
messages=messages,
pre_tool_block_checked=True,
)
result = self._invoke_tool(function_name, function_args, effective_task_id, tool_call.id, messages=messages)
except Exception as tool_error:
result = f"Error executing tool '{function_name}': {tool_error}"
logger.error("_invoke_tool raised for %s: %s", function_name, tool_error, exc_info=True)
@@ -9528,7 +9386,7 @@ class AIAgent:
logger.info("tool %s failed (%.2fs): %s", function_name, duration, result[:200])
else:
logger.info("tool %s completed (%.2fs, %d chars)", function_name, duration, len(result))
results[index] = (function_name, function_args, result, duration, is_error, False)
results[index] = (function_name, function_args, result, duration, is_error)
# Tear down worker-tid tracking. Clear any interrupt bit we may
# have set so the next task scheduled onto this recycled tid
# starts with a clean slate.
@@ -9554,67 +9412,59 @@ class AIAgent:
spinner.start()
try:
runnable_calls = [
(i, tc, name, args)
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls)
if block_result is None
]
futures = []
if runnable_calls:
max_workers = min(len(runnable_calls), _MAX_TOOL_WORKERS)
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
for i, tc, name, args in runnable_calls:
# Propagate ContextVars (e.g. _approval_session_key); mirrors asyncio.to_thread.
ctx = contextvars.copy_context()
f = executor.submit(ctx.run, _run_tool, i, tc, name, args)
futures.append(f)
max_workers = min(num_tools, _MAX_TOOL_WORKERS)
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = []
for i, (tc, name, args) in enumerate(parsed_calls):
f = executor.submit(_run_tool, i, tc, name, args)
futures.append(f)
# Wait for all to complete with periodic heartbeats so the
# gateway's inactivity monitor doesn't kill us during long
# concurrent tool batches. Also check for user interrupts
# so we don't block indefinitely when the user sends /stop
# or a new message during concurrent tool execution.
_conc_start = time.time()
_interrupt_logged = False
while True:
done, not_done = concurrent.futures.wait(
futures, timeout=5.0,
)
if not not_done:
break
# Wait for all to complete with periodic heartbeats so the
# gateway's inactivity monitor doesn't kill us during long
# concurrent tool batches. Also check for user interrupts
# so we don't block indefinitely when the user sends /stop
# or a new message during concurrent tool execution.
_conc_start = time.time()
_interrupt_logged = False
while True:
done, not_done = concurrent.futures.wait(
futures, timeout=5.0,
)
if not not_done:
break
# Check for interrupt — the per-thread interrupt signal
# already causes individual tools (terminal, execute_code)
# to abort, but tools without interrupt checks (web_search,
# read_file) will run to completion. Cancel any futures
# that haven't started yet so we don't block on them.
if self._interrupt_requested:
if not _interrupt_logged:
_interrupt_logged = True
self._vprint(
f"{self.log_prefix}⚡ Interrupt: cancelling "
f"{len(not_done)} pending concurrent tool(s)",
force=True,
)
for f in not_done:
f.cancel()
# Give already-running tools a moment to notice the
# per-thread interrupt signal and exit gracefully.
concurrent.futures.wait(not_done, timeout=3.0)
break
_conc_elapsed = int(time.time() - _conc_start)
# Heartbeat every ~30s (6 × 5s poll intervals)
if _conc_elapsed > 0 and _conc_elapsed % 30 < 6:
_still_running = [
parsed_calls[futures.index(f)][1]
for f in not_done
if f in futures
]
self._touch_activity(
f"concurrent tools running ({_conc_elapsed}s, "
f"{len(not_done)} remaining: {', '.join(_still_running[:3])})"
# Check for interrupt — the per-thread interrupt signal
# already causes individual tools (terminal, execute_code)
# to abort, but tools without interrupt checks (web_search,
# read_file) will run to completion. Cancel any futures
# that haven't started yet so we don't block on them.
if self._interrupt_requested:
if not _interrupt_logged:
_interrupt_logged = True
self._vprint(
f"{self.log_prefix}⚡ Interrupt: cancelling "
f"{len(not_done)} pending concurrent tool(s)",
force=True,
)
for f in not_done:
f.cancel()
# Give already-running tools a moment to notice the
# per-thread interrupt signal and exit gracefully.
concurrent.futures.wait(not_done, timeout=3.0)
break
_conc_elapsed = int(time.time() - _conc_start)
# Heartbeat every ~30s (6 × 5s poll intervals)
if _conc_elapsed > 0 and _conc_elapsed % 30 < 6:
_still_running = [
parsed_calls[futures.index(f)][1]
for f in not_done
if f in futures
]
self._touch_activity(
f"concurrent tools running ({_conc_elapsed}s, "
f"{len(not_done)} remaining: {', '.join(_still_running[:3])})"
)
finally:
if spinner:
# Build a summary message for the spinner stop
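Reviewer note: the wait loop in this hunk polls `concurrent.futures.wait` with a timeout so it can emit heartbeats and react to interrupts while a batch runs. A reduced sketch of that pattern follows; `run_batch`, `should_interrupt`, and `on_heartbeat` are hypothetical names, and the sketch emits a heartbeat on every poll rather than roughly every 30 seconds.

```python
import concurrent.futures
import time


def run_batch(tasks, max_workers=4, poll_seconds=0.05,
              should_interrupt=lambda: False, on_heartbeat=None):
    """Run zero-argument callables concurrently, polling for completion."""
    if not tasks:
        return []
    results = [None] * len(tasks)

    def _run(index, fn):
        # Each worker writes into its own slot, so result order is stable.
        results[index] = fn()

    workers = min(len(tasks), max_workers)
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(_run, i, fn) for i, fn in enumerate(tasks)]
        while True:
            done, not_done = concurrent.futures.wait(futures, timeout=poll_seconds)
            if not not_done:
                break
            if should_interrupt():
                # cancel() only affects futures that have not started yet.
                for f in not_done:
                    f.cancel()
                break
            if on_heartbeat:
                on_heartbeat(len(not_done))
    return results


results = run_batch([lambda: 1, lambda: 2, lambda: 3])

beats = []
slow = run_batch([lambda: time.sleep(0.2) or "done"], on_heartbeat=beats.append)
```

Slots for cancelled or unfinished tasks stay `None`, which matches how the surrounding code treats a missing result as "cancelled or did not return".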
@@ -9623,9 +9473,8 @@ class AIAgent:
spinner.stop(f"{completed}/{num_tools} tools completed in {total_dur:.1f}s total")
# ── Post-execution: display per-tool results ─────────────────────
for i, (tc, name, args, block_result, blocked_by_guardrail) in enumerate(parsed_calls):
for i, (tc, name, args) in enumerate(parsed_calls):
r = results[i]
blocked = False
if r is None:
# Tool was cancelled (interrupt) or thread didn't return
if self._interrupt_requested:
@@ -9634,21 +9483,13 @@ class AIAgent:
function_result = f"Error executing tool '{name}': thread did not return a result"
tool_duration = 0.0
else:
function_name, function_args, function_result, tool_duration, is_error, blocked = r
if not blocked:
function_result = self._append_guardrail_observation(
function_name,
function_args,
function_result,
failed=is_error,
)
function_name, function_args, function_result, tool_duration, is_error = r
if is_error:
result_preview = function_result[:200] if len(function_result) > 200 else function_result
logger.warning("Tool %s returned error (%.2fs): %s", function_name, tool_duration, result_preview)
if not blocked and self.tool_progress_callback:
if self.tool_progress_callback:
try:
self.tool_progress_callback(
"tool.completed", function_name, None, None,
@@ -9676,7 +9517,7 @@ class AIAgent:
self._current_tool = None
self._touch_activity(f"tool completed: {name} ({tool_duration:.1f}s)")
if not blocked and self.tool_complete_callback:
if self.tool_complete_callback:
try:
self.tool_complete_callback(tc.id, name, args, function_result)
except Exception as cb_err:
@@ -9758,17 +9599,9 @@ class AIAgent:
except Exception:
pass
_guardrail_block_decision: ToolGuardrailDecision | None = None
if _block_msg is None:
guardrail_decision = self._tool_guardrails.before_call(function_name, function_args)
if not guardrail_decision.allows_execution:
_guardrail_block_decision = guardrail_decision
_execution_blocked = _block_msg is not None or _guardrail_block_decision is not None
if _execution_blocked:
# Tool blocked by plugin or guardrail policy — skip counters,
# callbacks, checkpointing, activity mutation, and real execution.
if _block_msg is not None:
# Tool blocked by plugin policy — skip counter resets.
# Execution is handled below in the tool dispatch chain.
pass
else:
# Reset nudge counters when the relevant tool is actually used
@@ -9786,35 +9619,35 @@ class AIAgent:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
if not _execution_blocked:
if _block_msg is None:
self._current_tool = function_name
self._touch_activity(f"executing tool: {function_name}")
# Set activity callback for long-running tool execution (terminal
# commands, etc.) so the gateway's inactivity monitor doesn't kill
# the agent while a command is running.
if not _execution_blocked:
if _block_msg is None:
try:
from tools.environments.base import set_activity_callback
set_activity_callback(self._touch_activity)
except Exception:
pass
if not _execution_blocked and self.tool_progress_callback:
if _block_msg is None and self.tool_progress_callback:
try:
preview = _build_tool_preview(function_name, function_args)
self.tool_progress_callback("tool.started", function_name, preview, function_args)
except Exception as cb_err:
logging.debug(f"Tool progress callback error: {cb_err}")
if not _execution_blocked and self.tool_start_callback:
if _block_msg is None and self.tool_start_callback:
try:
self.tool_start_callback(tool_call.id, function_name, function_args)
except Exception as cb_err:
logging.debug(f"Tool start callback error: {cb_err}")
# Checkpoint: snapshot working dir before file-mutating tools
if not _execution_blocked and function_name in ("write_file", "patch") and self._checkpoint_mgr.enabled:
if _block_msg is None and function_name in ("write_file", "patch") and self._checkpoint_mgr.enabled:
try:
file_path = function_args.get("path", "")
if file_path:
@@ -9826,7 +9659,7 @@ class AIAgent:
pass # never block tool execution
# Checkpoint before destructive terminal commands
if not _execution_blocked and function_name == "terminal" and self._checkpoint_mgr.enabled:
if _block_msg is None and function_name == "terminal" and self._checkpoint_mgr.enabled:
try:
cmd = function_args.get("command", "")
if _is_destructive_command(cmd):
@@ -9843,11 +9676,6 @@ class AIAgent:
# Tool blocked by plugin policy — return error without executing.
function_result = json.dumps({"error": _block_msg}, ensure_ascii=False)
tool_duration = 0.0
elif _guardrail_block_decision is not None:
# Tool blocked by tool-loop guardrail — synthesize exactly one
# tool result for the original tool_call_id without executing.
function_result = self._guardrail_block_result(_guardrail_block_decision)
tool_duration = 0.0
elif function_name == "todo":
from tools.todo_tool import todo_tool as _todo_tool
function_result = _todo_tool(
@@ -10031,22 +9859,12 @@ class AIAgent:
# Log tool errors to the persistent error log so [error] tags
# in the UI always have a corresponding detailed entry on disk.
_is_error_result, _ = _detect_tool_failure(function_name, function_result)
if not _execution_blocked:
function_result = self._append_guardrail_observation(
function_name,
function_args,
function_result,
failed=_is_error_result,
)
result_preview = function_result if self.verbose_logging else (
function_result[:200] if len(function_result) > 200 else function_result
)
if _is_error_result:
logger.warning("Tool %s returned error (%.2fs): %s", function_name, tool_duration, result_preview)
else:
logger.info("tool %s completed (%.2fs, %d chars)", function_name, tool_duration, len(function_result))
if not _execution_blocked and self.tool_progress_callback:
if self.tool_progress_callback:
try:
self.tool_progress_callback(
"tool.completed", function_name, None, None,
@@ -10062,7 +9880,7 @@ class AIAgent:
logging.debug(f"Tool {function_name} completed in {tool_duration:.2f}s")
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
if not _execution_blocked and self.tool_complete_callback:
if self.tool_complete_callback:
try:
self.tool_complete_callback(tool_call.id, function_name, function_args, function_result)
except Exception as cb_err:
@@ -10165,13 +9983,6 @@ class AIAgent:
for idx, pfm in enumerate(self.prefill_messages):
api_messages.insert(sys_offset + idx, pfm.copy())
# Same safety net as the main loop: repair tool-call/result
# pairing before asking for a final summary. Compression and
# session resume can leave a tool result whose parent assistant
# tool_call was summarized away; Responses API rejects that as
# "No tool call found for function call output".
api_messages = self._sanitize_api_messages(api_messages)
# Same safety net as the main loop: drop thinking-only assistant
# turns so Anthropic-family providers don't 400 the summary call.
api_messages = self._drop_thinking_only_and_merge_users(api_messages)
@@ -10353,8 +10164,6 @@ class AIAgent:
# Installed once, transparent when streams are healthy, prevents crash on write.
_install_safe_stdio()
self._ensure_db_session()
# Tag all log records on this thread with the session ID so
# ``hermes logs --session <id>`` can filter a single conversation.
from hermes_logging import set_session_context
@@ -10398,8 +10207,6 @@ class AIAgent:
self._last_content_tools_all_housekeeping = False
self._mute_post_response = False
self._unicode_sanitization_passes = 0
self._tool_guardrails.reset_for_turn()
self._tool_guardrail_halt_decision = None
# Pre-turn connection health check: detect and clean up dead TCP
# connections left over from provider outages or dropped streams.
@@ -13197,16 +13004,6 @@ class AIAgent:
self._execute_tool_calls(assistant_message, messages, effective_task_id, api_call_count)
if self._tool_guardrail_halt_decision is not None:
decision = self._tool_guardrail_halt_decision
_turn_exit_reason = "guardrail_halt"
final_response = self._toolguard_controlled_halt_response(decision)
self._emit_status(
f"⚠️ Tool guardrail halted {decision.tool_name}: {decision.code}"
)
messages.append({"role": "assistant", "content": final_response})
break
# Reset per-turn retry counters after successful tool
# execution so a single truncation doesn't poison the
# entire conversation.
@@ -13250,13 +13047,7 @@ class AIAgent:
# causing premature compression. (#12026)
_real_tokens = _compressor.last_prompt_tokens
else:
# Include tool schemas — with 50+ tools enabled
# these add 20-30K tokens the messages-only
# estimate misses, which can skip compression
# past the configured threshold (#14695).
_real_tokens = estimate_request_tokens_rough(
messages, tools=self.tools or None
)
_real_tokens = estimate_messages_tokens_rough(messages)
if self.compression_enabled and _compressor.should_compress(_real_tokens):
self._safe_print(" ⟳ compacting context…")
@@ -13739,7 +13530,6 @@ class AIAgent:
"messages": messages,
"api_calls": api_call_count,
"completed": completed,
"turn_exit_reason": _turn_exit_reason,
"partial": False, # True only when stopped due to invalid tool calls
"interrupted": interrupted,
"response_previewed": getattr(self, "_response_was_previewed", False),
@@ -13759,8 +13549,6 @@ class AIAgent:
"cost_status": self.session_cost_status,
"cost_source": self.session_cost_source,
}
if self._tool_guardrail_halt_decision is not None:
result["guardrail"] = self._tool_guardrail_halt_decision.to_metadata()
# If a /steer landed after the final assistant turn (no more tool
# batches to drain into), hand it back to the caller so it can be
# delivered as the next user turn instead of being silently lost.
+2 -10
View File

@@ -35,18 +35,10 @@ import time
from pathlib import Path
from typing import Any
_PROJECT_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_PROJECT_ROOT))
try:
from hermes_constants import get_hermes_home
except ImportError:
def get_hermes_home() -> Path: # type: ignore[misc]
val = (os.environ.get("HERMES_HOME") or "").strip()
return Path(val) if val else Path.home() / ".hermes"
DEFAULT_TUI_DIR = Path(os.environ.get("HERMES_TUI_DIR", "/home/bb/hermes-agent/ui-tui"))
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(get_hermes_home() / "perf.log")))
DEFAULT_STATE_DB = get_hermes_home() / "state.db"
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(Path.home() / ".hermes" / "perf.log")))
DEFAULT_STATE_DB = Path.home() / ".hermes" / "state.db"
# Keystroke escape sequences. Matches what xterm/VT220 send when the
# terminal has bracketed-paste disabled and the key-repeat handler fires.
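Reviewer note: one variant in this hunk resolves the Hermes home directory from `HERMES_HOME` with a fallback to `~/.hermes`, while the other hard-codes `Path.home() / ".hermes"`. A standalone sketch of the environment-based lookup, with an injectable `env` mapping added purely for illustration (that parameter is not in the diff):

```python
import os
from pathlib import Path


def get_hermes_home(env=None) -> Path:
    """Return HERMES_HOME if set and non-blank, else ~/.hermes."""
    env = os.environ if env is None else env
    # Whitespace-only values are treated as unset.
    val = (env.get("HERMES_HOME") or "").strip()
    return Path(val) if val else Path.home() / ".hermes"


custom = get_hermes_home({"HERMES_HOME": " /tmp/hermes-test "})
default = get_hermes_home({})
perf_log = default / "perf.log"
```

With the hard-coded variant, a user who relocates their data via `HERMES_HOME` would have the perf log and `state.db` defaults point at a different directory than the rest of the tooling.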
-42
View File
@@ -41,25 +41,18 @@ PYPROJECT_FILE = REPO_ROOT / "pyproject.toml"
AUTHOR_MAP = {
# teknium (multiple emails)
"teknium1@gmail.com": "teknium1",
"m@mobrienv.dev": "mikeyobrien",
"qiyin.zuo@pcitc.com": "qiyin-code",
"leone.parise@gmail.com": "leoneparise",
"teknium@nousresearch.com": "teknium1",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"aludwin+gh@gmail.com": "adamludwin",
"2093036+exiao@users.noreply.github.com": "exiao",
"rylen.anil@gmail.com": "rylena",
"godnanijatin@gmail.com": "jatingodnani",
"14046872+tmimmanuel@users.noreply.github.com": "tmimmanuel",
"657290301@qq.com": "IMHaoyan",
"revar@users.noreply.github.com": "revaraver",
# Matrix parity salvage batch (April 2026)
"sr@samirusani": "samrusani",
"angelclaw@AngelMacBook.local": "angel12",
"charles@cryptoassetrecovery.com": "charles-brooks",
# DeepSeek v4 + Kimi thinking-mode reasoning_content salvage (April 2026)
"luwinyang@deepseek.com": "lsdsjy",
"season.saw@gmail.com": "season179",
"heathley@Heathley-MacBook-Air.local": "heathley",
"vlad19@gmail.com": "dandaka",
"adamrummer@gmail.com": "cyclingwithelephants",
@@ -80,13 +73,6 @@ AUTHOR_MAP = {
"thomasjhon6666@gmail.com": "ThomassJonax",
"focusflow.app.help@gmail.com": "yes999zc",
"rob@atlas.lan": "rmoen",
# Slack ephemeral slash-ack salvage (May 2026)
"probepark@users.noreply.github.com": "probepark",
# Slack batch salvage (May 2026)
"280484231+prive-fe-bot@users.noreply.github.com": "priveperfumes",
"amr@ghanem.sa": "amroessam",
"paperlantern.agent@gmail.com": "Hinotoi-agent",
"valda@underscore.jp": "valda",
"162235745+0z1-ghb@users.noreply.github.com": "0z1-ghb",
"yes999zc@163.com": "yes999zc",
"343873859@qq.com": "DrStrangerUJN",
@@ -103,8 +89,6 @@ AUTHOR_MAP = {
"130918800+devorun@users.noreply.github.com": "devorun",
"surat.s@itm.kmutnb.ac.th": "beesrsj2500",
"beesr@bee.localdomain": "beesrsj2500",
"mind-dragon@nous.research": "Mind-Dragon",
"juntingpublic@gmail.com": "JustinUssuri",
"mtf201013@gmail.com": "ma-pony",
"sonoyuncudmr@gmail.com": "Sonoyunchu",
"43525405+yatesjalex@users.noreply.github.com": "yatesjalex",
@@ -113,8 +97,6 @@ AUTHOR_MAP = {
"web3blind@users.noreply.github.com": "web3blind",
"julia@alexland.us": "alexg0bot",
"christian@scheid.tech": "scheidti",
# Moonshot schema anyOf+enum salvage (May 2026)
"git@local.invalid": "hendrixfreire",
"1060770+benjaminsehl@users.noreply.github.com": "benjaminsehl",
"nerijusn76@gmail.com": "Nerijusas",
"itonov@proton.me": "Ito-69",
@@ -127,7 +109,6 @@ AUTHOR_MAP = {
"foxion37@gmail.com": "foxion37",
"bloodcarter@gmail.com": "bloodcarter",
"scott@scotttrinh.com": "scotttrinh",
"quocanh261997@gmail.com": "quocanh261997",
# contributors (from noreply pattern)
"david.vv@icloud.com": "davidvv",
"wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
@@ -183,7 +164,6 @@ AUTHOR_MAP = {
"sir_even@icloud.com": "sirEven",
"36056348+sirEven@users.noreply.github.com": "sirEven",
"70424851+insecurejezza@users.noreply.github.com": "insecurejezza",
"jezzahehn@gmail.com": "JezzaHehn",
"254021826+dodo-reach@users.noreply.github.com": "dodo-reach",
"259807879+Bartok9@users.noreply.github.com": "Bartok9",
"270082434+crayfish-ai@users.noreply.github.com": "crayfish-ai",
@@ -309,7 +289,6 @@ AUTHOR_MAP = {
"154585401+LeonSGP43@users.noreply.github.com": "LeonSGP43",
"12250313+Kailigithub@users.noreply.github.com": "Kailigithub",
"mgparkprint@gmail.com": "vlwkaos",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"wangshengyang2004@163.com": "Wangshengyang2004",
@@ -348,7 +327,6 @@ AUTHOR_MAP = {
"stefan@dimagents.ai": "dimitrovi",
"hermes@noushq.ai": "benbarclay",
"chinmingcock@gmail.com": "ChimingLiu",
"allard.quek@singtel.com": "AllardQuek",
"openclaw@sparklab.ai": "openclaw",
"semihcvlk53@gmail.com": "Himess",
"erenkar950@gmail.com": "erenkarakus",
@@ -441,8 +419,6 @@ AUTHOR_MAP = {
"ogzerber@users.noreply.github.com": "ogzerber",
"cola-runner@users.noreply.github.com": "cola-runner",
"ygd58@users.noreply.github.com": "ygd58",
"45554392+warabe1122@users.noreply.github.com": "warabe1122",
"187001140+willy-scr@users.noreply.github.com": "willy-scr",
"vominh1919@users.noreply.github.com": "vominh1919",
"iamagenius00@users.noreply.github.com": "iamagenius00",
"9219265+cresslank@users.noreply.github.com": "cresslank",
@@ -467,7 +443,6 @@ AUTHOR_MAP = {
"taosiyuan163@153.com": "taosiyuan163",
"tesseracttars@gmail.com": "tesseracttars-creator",
"tianliangjay@gmail.com": "xingkongliang",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"unayung@gmail.com": "Unayung",
@@ -513,11 +488,9 @@ AUTHOR_MAP = {
"hubin_ll@qq.com": "LLQWQ",
"memosr_email@gmail.com": "memosr",
"jperlow@gmail.com": "perlowja",
"jasonpette1783@gmail.com": "web-dev0521",
"tangyuanjc@JCdeAIfenshendeMac-mini.local": "tangyuanjc",
"harryplusplus@gmail.com": "harryplusplus",
"anthhub@163.com": "anthhub",
"allard.quek@singtel.com": "AllardQuek",
"shenuu@gmail.com": "shenuu",
"xiayh17@gmail.com": "xiayh0107",
"zhujianxyz@gmail.com": "opriz",
@@ -653,21 +626,6 @@ AUTHOR_MAP = {
"164839249+Joseph19820124@users.noreply.github.com": "Joseph19820124",
"rugved@lmstudio.ai": "rugvedS07",
"44333070+Heltman@users.noreply.github.com": "Heltman",
# v0.12.0 additions
"ching@kachingappz.com": "ching-kaching",
"codezhujr@gmail.com": "Zjianru", # salvage chain: code by codez, PR #15749 author @Zjianru
"daimon@noreply.github.com": "Siddharth Balyan", # co-author only
"i@zkl2333.com": "zkl2333",
"isaachuang@Isaacs-MacBook-Pro.local": "isaachuangGMICLOUD",
"isaachuang@Mac.localdomain": "isaachuangGMICLOUD", # salvage of PR #11955 → #16663
"liyuan851277048@icloud.com": "Octopus", # co-author only
"me+github7604@versun.org": "Versun", # co-author only
"my.vesper.nine@gmail.com": "kevin-ho", # salvage: PR #15488 author @kevin-ho
"noreply@paperclip.ing": "Paperclip", # co-author only
"teknium@hermes-agent": "teknium1",
"web3blind@gmail.com": "web3blind",
"ztzheng@163.com": "chengoak", # PR #17467
"24110240104@m.fudan.edu.cn": "YuShu", # co-author only
}
-152
View File
@@ -1,152 +0,0 @@
---
name: kanban-orchestrator
description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, orchestration, routing]
related_skills: [kanban-worker]
---
# Kanban Orchestrator — Decomposition Playbook
> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.
## When to use the board (vs. just doing the work)
Create Kanban tasks when any of these are true:
1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
2. **The work should survive a crash or restart.** Long-running, recurring, or important.
3. **The user might want to interject.** Human-in-the-loop at any step.
4. **Multiple subtasks can run in parallel.** Fan-out for speed.
5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
6. **The audit trail matters.** Board rows persist in SQLite forever.
If *none* of those apply — it's a small one-shot reasoning task — use `delegate_task` instead or answer the user directly.
## The anti-temptation rules
Your job description says "route, don't execute." The rules that enforce that:
- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly" — stop and create a task for the right specialist.
- **For any concrete task, create a Kanban task and assign it.** Every single time.
- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
- **Decompose, route, and summarize — that's the whole job.**
## The standard specialist roster (convention)
Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has — ask if you're unsure.
| Profile | Does | Typical workspace |
|---|---|---|
| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
| `backend-eng` | Writes server-side code | `worktree` |
| `frontend-eng` | Writes client-side code | `worktree` |
| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
| `pm` | Writes specs, acceptance criteria | `scratch` |
## Decomposition playbook
### Step 1 — Understand the goal
Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.
### Step 2 — Sketch the task graph
Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":
```
T1 researcher research: Postgres cost vs current
T2 researcher research: Postgres performance vs current
T3 analyst synthesize migration recommendation parents: T1, T2
T4 writer draft decision memo parents: T3
```
Show this to the user. Let them correct it before you create anything.
### Step 3 — Create tasks and link
```python
import os

# Reuse the caller's tenant on every task so children stay in one namespace.
tenant = os.environ.get("HERMES_TENANT")

t1 = kanban_create(
    title="research: Postgres cost vs current",
    assignee="researcher",
    body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
    tenant=tenant,
)["task_id"]
t2 = kanban_create(
    title="research: Postgres performance vs current",
    assignee="researcher",
    body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
    tenant=tenant,
)["task_id"]
t3 = kanban_create(
    title="synthesize migration recommendation",
    assignee="analyst",
    body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
    parents=[t1, t2],
    tenant=tenant,
)["task_id"]
t4 = kanban_create(
    title="draft decision memo",
    assignee="writer",
    body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
    parents=[t3],
    tenant=tenant,
)["task_id"]
```
`parents=[...]` gates promotion — children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination needed; the dispatcher and dependency engine handle it.
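The gating rule can be pictured with a small in-memory model. This is only a sketch of the promotion logic as described above, not the dispatcher's actual code: `promote_ready`, the `status` dict, and the `parents` dict are hypothetical names invented for illustration.

```python
from typing import Dict, List


def promote_ready(status: Dict[str, str], parents: Dict[str, List[str]]) -> List[str]:
    """Move every `todo` task whose parents are all `done` to `ready`.

    Hypothetical model of the dependency engine: decisions are made against
    the state at the start of the pass, so a child promoted in this pass
    does not unlock its own children until a later pass.
    """
    promoted = []
    for task_id, state in status.items():
        if state != "todo":
            continue
        # A task with no parents has nothing to wait for.
        if all(status.get(p) == "done" for p in parents.get(task_id, [])):
            promoted.append(task_id)
    for task_id in promoted:
        status[task_id] = "ready"
    return promoted


# T3 waits on both researchers; T4 waits on T3.
status = {"T1": "done", "T2": "running", "T3": "todo", "T4": "todo"}
parents = {"T3": ["T1", "T2"], "T4": ["T3"]}

first_pass = promote_ready(status, parents)   # T2 is still running
status["T2"] = "done"
second_pass = promote_ready(status, parents)  # now T3 can move
```

In this model T4 stays in `todo` after the second pass, because T3 has only reached `ready`, not `done`.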
### Step 4 — Complete your own task
If you were spawned as a task yourself (e.g. `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:
```python
kanban_complete(
summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
metadata={
"task_graph": {
"T1": {"assignee": "researcher", "parents": []},
"T2": {"assignee": "researcher", "parents": []},
"T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
"T4": {"assignee": "writer", "parents": ["T3"]},
},
},
)
```
### Step 5 — Report back to the user
Tell them what you created in plain prose:
> I've queued 4 tasks:
> - **T1** (researcher): cost comparison
> - **T2** (researcher): performance comparison, in parallel with T1
> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
> - **T4** (writer): turns T3 into a CTO memo
>
> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.
## Common patterns
**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage's `parents=[previous_task]`. Reviewer blocks or completes; if reviewer blocks, the operator unblocks with feedback and respawns.
**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. Dispatcher serializes — translator processes them in priority order, accumulating experience in their own memory.
**Human-in-the-loop:** Any task can `kanban_block()` to wait for input. Dispatcher respawns after `/unblock`. The comment thread carries the full context.
## Pitfalls
**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task — don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.
**Argument order for links.** `kanban_link(parent_id=..., child_id=...)` — parent first. Mixing them up demotes the wrong task to `todo`.
**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.
**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
-134
View File
@@ -1,134 +0,0 @@
---
name: kanban-worker
description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
related_skills: [kanban-orchestrator]
---
# Kanban Worker — Pitfalls and Examples
> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker` — it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.
## Workspace handling
Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:
| Kind | What it is | How to work |
|---|---|---|
| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |
## Tenant isolation
If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:
- Good: `business-a: Acme is our biggest customer`
- Bad (leaks): `Acme is our biggest customer`
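A minimal sketch of that prefixing convention, assuming memory entries are plain strings. `tenant_prefixed` is a hypothetical helper, not part of Hermes; only the `HERMES_TENANT` variable and the `tenant: text` shape come from the guidance above.

```python
import os
from typing import Optional


def tenant_prefixed(entry: str, tenant: Optional[str] = None) -> str:
    """Return `entry` prefixed with the tenant name, if a tenant is set.

    Hypothetical helper: falls back to $HERMES_TENANT when no tenant is
    passed, and leaves already-prefixed entries untouched.
    """
    if tenant is None:
        tenant = os.environ.get("HERMES_TENANT", "")
    tenant = tenant.strip()
    if not tenant:
        # No tenant namespace: store the entry as-is.
        return entry
    prefix = f"{tenant}: "
    return entry if entry.startswith(prefix) else prefix + entry
```

Making the helper idempotent means a retry that rewrites the same memory entry does not stack prefixes.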
## Good summary + metadata shapes
The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:
**Coding task:**
```python
kanban_complete(
summary="shipped rate limiter — token bucket, keys on user_id with IP fallback, 14 tests pass",
metadata={
"changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
"tests_run": 14,
"tests_passed": 14,
"decisions": ["user_id primary, IP fallback for unauthenticated requests"],
},
)
```
**Research task:**
```python
kanban_complete(
summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, TensorRT-LLM on memory efficiency",
metadata={
"sources_read": 12,
"recommendation": "vLLM",
"benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
},
)
```
**Review task:**
```python
kanban_complete(
summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
metadata={
"pr_number": 123,
"findings": [
{"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
{"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
],
"approved": False,
},
)
```
Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.
## Block reasons that get answered fast
Bad: `"stuck"` — the human has no context.
Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.
```python
import os

# Long-form context goes in the comment thread...
kanban_comment(
    task_id=os.environ["HERMES_KANBAN_TASK"],
    body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
)
# ...and the block reason stays a single, answerable question.
kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
```
The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.
## Heartbeats worth sending
Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.
Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Every few minutes max; skip entirely for tasks under ~2 minutes.
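One way to keep to those rules is to throttle on the worker side. The sketch below assumes nothing about the real heartbeat tool: `HeartbeatThrottle` is a hypothetical wrapper, `send` stands in for whatever call actually emits the heartbeat, and the 120-second default is an arbitrary reading of "every few minutes max".

```python
import time
from typing import Callable, Optional


class HeartbeatThrottle:
    """Drop empty notes and notes that arrive too soon after the last one."""

    def __init__(
        self,
        send: Callable[[str], None],
        min_interval_s: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send                # hypothetical: the real heartbeat call
        self._min_interval_s = min_interval_s
        self._clock = clock              # injectable so the logic is testable
        self._last_sent: Optional[float] = None

    def maybe_send(self, note: str) -> bool:
        """Send `note` if it is non-empty and the interval has elapsed."""
        note = note.strip()
        if not note:
            return False
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._min_interval_s:
            return False
        self._send(note)
        self._last_sent = now
        return True
```

The throttle only limits frequency; whether a note names real progress is still up to the caller.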
## Retry scenarios
If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:
- `outcome: "timed_out"` — the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
- `outcome: "crashed"` — OOM or segfault. Reduce memory footprint.
- `outcome: "spawn_failed"` + `error: "..."` — usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
- `outcome: "reclaimed"` + `summary: "task archived..."` — operator archived the task out from under the previous run; you probably shouldn't be running at all, check status carefully.
- `outcome: "blocked"` — a previous attempt blocked; the unblock comment should be in the thread by now.
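The diagnostics above amount to a lookup on the most recent closed run. The following is a rough sketch, assuming `kanban_show` returns `runs` as a list of dicts with an `outcome` key once a run has closed; that shape, `triage_prior_runs`, and the returned action labels are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Suggested next step per prior outcome, paraphrasing the list above.
NEXT_STEP = {
    "timed_out": "chunk_or_shorten",
    "crashed": "reduce_memory",
    "spawn_failed": "block_and_ask_human",
    "reclaimed": "recheck_task_status",
    "blocked": "read_unblock_comment",
}


def triage_prior_runs(runs: List[Dict[str, Any]]) -> str:
    """Pick a next step from the latest closed run, or 'fresh' if none."""
    closed = [r for r in runs if r.get("outcome")]
    if not closed:
        return "fresh"
    outcome = closed[-1]["outcome"]
    # Unknown outcomes are surfaced instead of guessed at.
    return NEXT_STEP.get(outcome, "inspect_manually")
```

Only the latest closed run drives the decision here; earlier runs are still worth reading for paths that have already failed.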
## Do NOT
- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
- Create follow-up tasks assigned to yourself — assign to the right specialist.
- Complete a task you didn't actually finish. Block it instead.
## Pitfalls
**Task state can change between dispatch and your startup.** Between when the dispatcher claimed and when your process actually booted, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop — you shouldn't be running.
**Workspace may have stale artifacts.** Especially `dir:` and `worktree` workspaces can have files from previous runs. Read the comment thread — it usually explains why you're running again and what state the workspace is in.
**Don't rely on the CLI when the guidance is available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.
## CLI fallback (for scripting)
Every tool has a CLI equivalent for human operators and scripts:
- `kanban_show` → `hermes kanban show <id> --json`
- `kanban_complete` → `hermes kanban complete <id> --summary "..." --metadata '{...}'`
- `kanban_block` → `hermes kanban block <id> "reason"`
- `kanban_create` → `hermes kanban create "title" --assignee <profile> [--parent <id>]`
- etc.
Use the tools from inside an agent; the CLI exists for the human at the terminal.
+2 -2
View File
@@ -124,7 +124,7 @@ class TestMcpRegistrationE2E:
mock_conn.request_permission = AsyncMock()
acp_agent._conn = mock_conn
def mock_run_conversation(user_message, conversation_history=None, task_id=None, **kwargs):
def mock_run_conversation(user_message, conversation_history=None, task_id=None):
"""Simulate an agent turn that calls terminal, gets a result, then responds."""
agent = state.agent
@@ -213,7 +213,7 @@ class TestMcpRegistrationE2E:
mock_conn.request_permission = AsyncMock()
acp_agent._conn = mock_conn
def mock_run(user_message, conversation_history=None, task_id=None, **kwargs):
def mock_run(user_message, conversation_history=None, task_id=None):
agent = state.agent
# Fire two tool calls
if agent.tool_progress_callback:
+1 -2
View File
@@ -730,7 +730,6 @@ class TestSlashCommands:
]
state.agent.compression_enabled = True
state.agent._cached_system_prompt = "system"
state.agent.tools = None
original_session_db = object()
state.agent._session_db = original_session_db
@@ -747,7 +746,7 @@ class TestSlashCommands:
with (
patch.object(agent.session_manager, "save_session") as mock_save,
patch(
"agent.model_metadata.estimate_request_tokens_rough",
"agent.model_metadata.estimate_messages_tokens_rough",
side_effect=[40, 12],
),
):
-75
View File
@@ -8,7 +8,6 @@ from types import SimpleNamespace
import pytest
from unittest.mock import MagicMock, patch
from acp_adapter import session as acp_session
from acp_adapter.session import SessionManager, SessionState
from hermes_state import SessionDB
@@ -43,27 +42,6 @@ class TestCreateSession:
state = manager.create_session(cwd="/tmp/work")
assert calls == [(state.session_id, "/tmp/work")]
def test_register_task_cwd_translates_windows_drive_for_wsl_tools(self, monkeypatch):
captured = {}
def fake_register_task_env_overrides(task_id, overrides):
captured["task_id"] = task_id
captured["overrides"] = overrides
monkeypatch.setattr("hermes_constants._wsl_detected", True)
monkeypatch.setattr(
"tools.terminal_tool.register_task_env_overrides",
fake_register_task_env_overrides,
)
acp_session._register_task_cwd("session-1", r"E:\Projects\AI\paperclip")
assert captured == {
"task_id": "session-1",
"overrides": {"cwd": "/mnt/e/Projects/AI/paperclip"},
}
def test_session_ids_are_unique(self, manager):
s1 = manager.create_session()
s2 = manager.create_session()
@@ -78,59 +56,6 @@ class TestCreateSession:
assert manager.get_session("does-not-exist") is None
# ---------------------------------------------------------------------------
# WSL cwd translation
# ---------------------------------------------------------------------------
class TestWslCwdTranslation:
def test_translate_acp_cwd_converts_windows_drive_path_when_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd(r"E:\Projects\AI\paperclip") == "/mnt/e/Projects/AI/paperclip"
def test_translate_acp_cwd_handles_forward_slashes_when_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd("D:/work/project") == "/mnt/d/work/project"
def test_translate_acp_cwd_leaves_windows_drive_path_unchanged_off_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", False)
assert acp_session._translate_acp_cwd(r"E:\Projects\AI\paperclip") == r"E:\Projects\AI\paperclip"
def test_translate_acp_cwd_leaves_posix_path_unchanged_on_wsl(self, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
assert acp_session._translate_acp_cwd("/mnt/e/Projects/AI/paperclip") == "/mnt/e/Projects/AI/paperclip"
def test_create_session_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
state = manager.create_session(cwd=r"E:\Projects\AI\paperclip")
assert state.cwd == "/mnt/e/Projects/AI/paperclip"
def test_fork_session_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
original = manager.create_session(cwd="/tmp/base")
forked = manager.fork_session(original.session_id, cwd=r"D:\work\project")
assert forked is not None
assert forked.cwd == "/mnt/d/work/project"
def test_update_cwd_stores_translated_cwd_on_wsl(self, manager, monkeypatch):
monkeypatch.setattr("hermes_constants._wsl_detected", True)
state = manager.create_session(cwd="/tmp/old")
updated = manager.update_cwd(state.session_id, cwd=r"C:\Users\foo\project")
assert updated is not None
assert updated.cwd == "/mnt/c/Users/foo/project"
# ---------------------------------------------------------------------------
# fork
# ---------------------------------------------------------------------------
-150
View File
@@ -1,150 +0,0 @@
from types import SimpleNamespace
import pytest
from acp.schema import TextContentBlock
from acp_adapter.server import HermesACPAgent
from acp_adapter.session import SessionManager
class FakeAgent:
def __init__(self):
self.model = "fake-model"
self.provider = "fake-provider"
self.enabled_toolsets = ["hermes-acp"]
self.disabled_toolsets = []
self.tools = []
self.valid_tool_names = set()
self.steers = []
self.runs = []
def steer(self, text):
self.steers.append(text)
return True
def run_conversation(self, *, user_message, conversation_history, task_id, **kwargs):
self.runs.append(user_message)
messages = list(conversation_history or [])
messages.append({"role": "user", "content": user_message})
final = f"ran: {user_message}"
messages.append({"role": "assistant", "content": final})
return {"final_response": final, "messages": messages}
class CaptureConn:
def __init__(self):
self.updates = []
async def session_update(self, *args, **kwargs):
if kwargs:
self.updates.append((kwargs.get("session_id"), kwargs.get("update")))
else:
self.updates.append((args[0], args[1]))
async def request_permission(self, *args, **kwargs):
return SimpleNamespace(outcome="allow")
class NoopDb:
def get_session(self, *_args, **_kwargs):
return None
def create_session(self, *_args, **_kwargs):
return None
def update_session(self, *_args, **_kwargs):
return None
def make_agent_and_state():
fake = FakeAgent()
manager = SessionManager(agent_factory=lambda **kwargs: fake, db=NoopDb())
acp_agent = HermesACPAgent(session_manager=manager)
state = manager.create_session(cwd=".")
conn = CaptureConn()
acp_agent.on_connect(conn)
return acp_agent, state, fake, conn
@pytest.mark.asyncio
async def test_acp_steer_slash_command_injects_into_running_agent():
acp_agent, state, fake, _conn = make_agent_and_state()
state.is_running = True
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer prefer the simpler fix")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == ["prefer the simpler fix"]
assert fake.runs == []
@pytest.mark.asyncio
async def test_acp_steer_after_zed_interrupt_replays_interrupted_prompt_with_guidance():
acp_agent, state, fake, _conn = make_agent_and_state()
state.interrupted_prompt_text = "write hi to a text file"
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer write HELLO instead")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == []
assert fake.runs == [
"write hi to a text file\n\nUser correction/guidance after interrupt: write HELLO instead"
]
assert state.interrupted_prompt_text == ""
@pytest.mark.asyncio
async def test_acp_steer_on_idle_session_runs_as_regular_prompt():
# /steer on an idle session (no running turn, nothing to salvage) should
# run the steer payload as a normal user prompt — NOT silently append it
# to state.queued_prompts. Without this, users on Zed / other ACP clients
# see their /steer turn into "queued for the next turn" when they never
# typed /queue. Matches gateway/run.py ~L4898 idle-/steer behavior.
acp_agent, state, fake, _conn = make_agent_and_state()
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/steer summarize the README")],
)
assert response.stop_reason == "end_turn"
assert fake.steers == []
assert fake.runs == ["summarize the README"]
assert state.queued_prompts == []
@pytest.mark.asyncio
async def test_acp_queue_slash_command_adds_next_turn_without_running_now():
acp_agent, state, fake, _conn = make_agent_and_state()
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="/queue run the tests after this")],
)
assert response.stop_reason == "end_turn"
assert state.queued_prompts == ["run the tests after this"]
assert fake.runs == []
@pytest.mark.asyncio
async def test_acp_prompt_drains_queued_turns_after_current_run():
acp_agent, state, fake, conn = make_agent_and_state()
state.queued_prompts.append("then run tests")
response = await acp_agent.prompt(
session_id=state.session_id,
prompt=[TextContentBlock(type="text", text="make the change")],
)
assert response.stop_reason == "end_turn"
assert fake.runs == ["make the change", "then run tests"]
assert state.queued_prompts == []
agent_messages = [u for _sid, u in conn.updates if getattr(u, "session_update", None) == "agent_message_chunk"]
assert len(agent_messages) >= 2
@@ -427,68 +427,3 @@ class TestProvidersDictApiModeAnthropicMessages:
assert isinstance(sync_client, OpenAI)
async_client, _ = resolve_provider_client("localchat", async_mode=True)
assert isinstance(async_client, AsyncOpenAI)
class TestCustomProviderAliasCollision:
"""A user-declared custom_providers entry whose name matches a built-in
*alias* (not a canonical provider) must win over the built-in.
Regression guard for #15743: users who defined fallback_model pointing at
a custom_providers entry named ``kimi`` were having requests routed to
the built-in kimi-coding endpoint because ``_normalize_aux_provider``
rewrote ``kimi`` to ``kimi-coding`` before the named-custom lookup.
"""
def test_custom_named_kimi_wins_over_builtin_alias(self, tmp_path):
_write_config(tmp_path, {
"model": {"provider": "openrouter", "default": "anthropic/claude-sonnet-4.6"},
"custom_providers": [
{
"name": "kimi",
"base_url": "https://my-custom-kimi.example.com/v1",
"api_key": "my-kimi-key",
"models": {"my-kimi-model": {"context_length": 200000}},
},
],
})
from agent.auxiliary_client import resolve_provider_client
from openai import OpenAI
client, model = resolve_provider_client("kimi", model="my-kimi-model", raw_codex=True)
assert isinstance(client, OpenAI)
assert "my-custom-kimi.example.com" in str(client.base_url)
assert client.api_key == "my-kimi-key"
assert model == "my-kimi-model"
def test_bare_kimi_without_custom_still_routes_to_builtin(self, tmp_path, monkeypatch):
"""Regression guard: bare 'kimi' with no custom entry must still
reach the built-in kimi-coding provider."""
_write_config(tmp_path, {
"model": {"provider": "openrouter", "default": "anthropic/claude-sonnet-4.6"},
})
monkeypatch.setenv("KIMI_API_KEY", "builtin-kimi-key")
from agent.auxiliary_client import resolve_provider_client
client, _ = resolve_provider_client("kimi", model="kimi-k2-0905-preview", raw_codex=True)
assert client is not None
base_url = str(client.base_url)
# Built-in kimi-coding points at api.moonshot.ai
assert "moonshot" in base_url or "kimi" in base_url, f"unexpected base_url {base_url!r}"
def test_explicit_overrides_applied_on_api_key_branch(self, tmp_path, monkeypatch):
"""Explicit base_url/api_key from the caller must override the
registered provider's defaults on the API-key branch. Used by
_try_activate_fallback to route a fallback through a built-in
provider name but targeting a user-supplied endpoint."""
_write_config(tmp_path, {
"model": {"provider": "openrouter", "default": "anthropic/claude-sonnet-4.6"},
})
monkeypatch.setenv("KIMI_API_KEY", "builtin-kimi-key")
from agent.auxiliary_client import resolve_provider_client
from openai import OpenAI
client, _ = resolve_provider_client(
"kimi-coding", model="kimi-k2", raw_codex=True,
explicit_base_url="https://override.example.com",
explicit_api_key="override-key",
)
assert isinstance(client, OpenAI)
assert "override.example.com" in str(client.base_url)
assert client.api_key == "override-key"
-52
@@ -640,30 +640,6 @@ class TestCompressWithClient:
for tc in msg["tool_calls"]:
assert tc["id"] in answered_ids
def test_sanitizer_matches_responses_call_id_when_id_differs(self, compressor):
msgs = [
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "fc_123",
"call_id": "call_123",
"response_item_id": "fc_123",
"type": "function",
"function": {"name": "search_files", "arguments": "{}"},
}
],
},
{"role": "tool", "tool_call_id": "call_123", "content": "result"},
]
sanitized = compressor._sanitize_tool_pairs(msgs)
assert [m.get("tool_call_id") for m in sanitized if m.get("role") == "tool"] == [
"call_123"
]
def test_summary_role_avoids_consecutive_user_messages(self):
"""Summary role should alternate with the last head message to avoid consecutive same-role messages."""
mock_client = MagicMock()
@@ -1143,34 +1119,6 @@ class TestTokenBudgetTailProtection:
# At least one old tool result should have been pruned
assert pruned >= 1
def test_prune_short_conv_protects_entire_tail(self, budget_compressor):
"""Regression guard for PR #17025.
When ``len(messages) <= protect_tail_count`` and a token budget is
also set, every message must be protected. The previous code used
``min(protect_tail_count, len(result) - 1)`` which capped the floor
one below the full length, leaving the oldest message eligible for
pruning.
"""
c = budget_compressor
# 4 messages, protect_tail_count=4 -- nothing should be pruned.
# Oldest message is a large tool result; on the buggy path it falls
# outside the protected window and gets summarized.
messages = [
{"role": "tool", "content": "x" * 5000, "tool_call_id": "c0"},
{"role": "assistant", "content": "ack"},
{"role": "user", "content": "recent"},
{"role": "assistant", "content": "reply"},
]
result, pruned = c._prune_old_tool_results(
messages,
protect_tail_count=4,
protect_tail_tokens=1_000_000, # budget large enough to protect all
)
assert pruned == 0
# Tool result at index 0 must be preserved verbatim
assert result[0]["content"] == "x" * 5000
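The off-by-one described in the docstring above can be shown in isolation. This is a minimal sketch, not the compressor's actual code: the helper names are hypothetical, and only the `min(protect_tail_count, len(result) - 1)` expression comes from the docstring. The "fixed" cap is an assumption about what the fix looks like.

```python
def protected_count_buggy(n_messages: int, protect_tail_count: int) -> int:
    """Tail size under the old cap quoted in the docstring (one short of full length)."""
    return min(protect_tail_count, n_messages - 1)


def protected_count_fixed(n_messages: int, protect_tail_count: int) -> int:
    """Assumed fix: cap at the full length so short conversations are fully protected."""
    return min(protect_tail_count, n_messages)


def prunable_indices(n_messages: int, protected: int) -> list[int]:
    """Indices that fall outside the protected tail window."""
    return list(range(0, n_messages - protected))
```

With four messages and `protect_tail_count=4`, the old cap leaves index 0 prunable while the assumed fix leaves nothing prunable. For longer conversations both caps agree.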
def test_prune_without_token_budget_uses_message_count(self, budget_compressor):
"""Without protect_tail_tokens, falls back to message-count behavior."""
c = budget_compressor
+2 -119
@@ -86,22 +86,9 @@ def test_curator_config_overrides(curator_env, monkeypatch):
# should_run_now
# ---------------------------------------------------------------------------
def test_first_run_defers(curator_env):
"""The FIRST observation of the curator (fresh install, no state file)
must NOT trigger an immediate run. The curator is designed to run after
a full ``interval_hours`` of skill activity, not on the first background
tick after installation. Fixes #18373.
"""
def test_first_run_always_eligible(curator_env):
c = curator_env["curator"]
# No state file — should defer and seed last_run_at.
assert c.should_run_now() is False
state = c.load_state()
assert state.get("last_run_at") is not None, (
"first observation should seed last_run_at so the interval clock "
"starts ticking instead of firing immediately next tick"
)
# A second immediate call still returns False (seeded, not yet stale).
assert c.should_run_now() is False
assert c.should_run_now() is True
def test_recent_run_blocks(curator_env):
@@ -278,77 +265,6 @@ def test_run_review_records_state(curator_env):
assert state["last_run_summary"] is not None
def test_dry_run_does_not_advance_state(curator_env, monkeypatch):
"""Dry-run previews must not bump last_run_at or run_count. A preview
shouldn't defer the next scheduled real pass or look like a real run in
`hermes curator status`. Fixes #18373.
"""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Stub the LLM so the test doesn't need a provider.
monkeypatch.setattr(
c, "_run_llm_review",
lambda prompt: {
"final": "", "summary": "dry preview", "model": "", "provider": "",
"tool_calls": [], "error": None,
},
)
c.run_curator_review(synchronous=True, dry_run=True)
state = c.load_state()
assert state.get("last_run_at") is None, "dry-run must not seed last_run_at"
assert state.get("run_count", 0) == 0, "dry-run must not bump run_count"
assert "dry-run" in (state.get("last_run_summary") or ""), (
"dry-run summary should be labeled so status output is unambiguous"
)
def test_dry_run_injects_report_only_banner(curator_env, monkeypatch):
"""The dry-run prompt must carry a banner instructing the LLM not to
call any mutating tool. This is defense in depth: the caller also
skips automatic transitions, but the LLM prompt is the only guard
against the model calling skill_manage directly."""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
captured = {}
def _stub(prompt):
captured["prompt"] = prompt
return {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None}
monkeypatch.setattr(c, "_run_llm_review", _stub)
c.run_curator_review(synchronous=True, dry_run=True)
assert "DRY-RUN" in captured["prompt"]
assert "DO NOT" in captured["prompt"]
def test_dry_run_skips_automatic_transitions(curator_env, monkeypatch):
"""Dry-run must not call apply_automatic_transitions — the auto pass
archives skills deterministically, and a preview must not touch the
filesystem."""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
called = {"n": 0}
def _explode(*_a, **_kw):
called["n"] += 1
return {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
monkeypatch.setattr(c, "apply_automatic_transitions", _explode)
monkeypatch.setattr(
c, "_run_llm_review",
lambda p: {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None},
)
c.run_curator_review(synchronous=True, dry_run=True)
assert called["n"] == 0, "dry-run must skip apply_automatic_transitions"
def test_run_review_synchronous_invokes_llm_stub(curator_env, monkeypatch):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
@@ -411,32 +327,12 @@ def test_maybe_run_curator_runs_when_eligible(curator_env, monkeypatch):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Seed last_run_at far in the past so the interval gate opens — the
# "no state" path intentionally defers the first run now (#18373).
long_ago = datetime.now(timezone.utc) - timedelta(hours=c.get_interval_hours() * 2)
c.save_state({"last_run_at": long_ago.isoformat(), "paused": False})
# Force idle over threshold
result = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result is not None
assert "started_at" in result
def test_maybe_run_curator_defers_on_fresh_install(curator_env):
"""Fresh install (no curator state file) must NOT fire the curator on
the first gateway tick. The first observation seeds last_run_at and
returns None. Fixes #18373."""
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Infinite idle — the only thing that should block the run is the new
# deferred-first-run gate.
result = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result is None
# And the next tick still defers (we seeded last_run_at to "now").
result2 = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result2 is None
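The deferred-first-run gate exercised by the tests above can be sketched as follows. This is an illustration under assumptions, not the curator's implementation: `should_run_now_sketch` is a hypothetical name, state is modelled as a plain dict, and the clock is passed in explicitly so the behaviour is deterministic.

```python
from datetime import datetime, timedelta, timezone


def should_run_now_sketch(state: dict, interval_hours: float, now: datetime) -> bool:
    """Defer on first observation, then gate on the elapsed interval."""
    last = state.get("last_run_at")
    if last is None:
        # First observation: seed the clock instead of firing immediately.
        state["last_run_at"] = now.isoformat()
        return False
    elapsed = now - datetime.fromisoformat(last)
    return elapsed >= timedelta(hours=interval_hours)
```

Seeding `last_run_at` on the first call is what makes a second immediate call return False as well: the interval clock has started but has not yet elapsed.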
def test_maybe_run_curator_swallows_exceptions(curator_env, monkeypatch):
c = curator_env["curator"]
@@ -467,19 +363,6 @@ def test_state_atomic_write_no_tmp_leftovers(curator_env):
assert not p.name.startswith(".curator_state_"), f"tmp leftover: {p.name}"
def test_state_preserves_last_report_path(curator_env):
c = curator_env["curator"]
c.save_state({
"last_run_at": "2026-04-30T12:00:00+00:00",
"last_run_summary": "ok",
"last_report_path": "/tmp/curator-report",
"paused": False,
"run_count": 1,
})
state = c.load_state()
assert state["last_report_path"] == "/tmp/curator-report"
def test_curator_review_prompt_has_invariants():
"""Core invariants must be in the review prompt text."""
from agent.curator import CURATOR_REVIEW_PROMPT
-316
@@ -1,316 +0,0 @@
"""Tests for agent/curator_backup.py — snapshot + rollback of the skills tree."""
from __future__ import annotations
import importlib
import json
import os
import sys
import tarfile
import tempfile
from pathlib import Path
import pytest
@pytest.fixture
def backup_env(monkeypatch, tmp_path):
"""Isolate HERMES_HOME + reload modules so every test starts clean."""
home = tmp_path / ".hermes"
home.mkdir()
(home / "skills").mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
# Reload so get_hermes_home picks up the env var fresh.
import hermes_constants
importlib.reload(hermes_constants)
from agent import curator_backup
importlib.reload(curator_backup)
return {"home": home, "skills": home / "skills", "cb": curator_backup}
def _write_skill(skills_dir: Path, name: str, body: str = "body") -> Path:
d = skills_dir / name
d.mkdir(parents=True, exist_ok=True)
(d / "SKILL.md").write_text(
f"---\nname: {name}\ndescription: t\nversion: 1.0\n---\n\n{body}\n",
encoding="utf-8",
)
return d
# ---------------------------------------------------------------------------
# snapshot_skills
# ---------------------------------------------------------------------------
def test_snapshot_creates_tarball_and_manifest(backup_env):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
_write_skill(backup_env["skills"], "beta")
snap = cb.snapshot_skills(reason="test")
assert snap is not None, "snapshot should succeed with a populated skills dir"
assert (snap / "skills.tar.gz").exists()
manifest = json.loads((snap / "manifest.json").read_text())
assert manifest["reason"] == "test"
assert manifest["skill_files"] == 2
assert manifest["archive_bytes"] > 0
def test_snapshot_excludes_backups_dir_itself(backup_env):
"""The backup must NOT contain .curator_backups/ — that would recurse
with every subsequent snapshot and balloon disk usage."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
snap1 = cb.snapshot_skills(reason="first")
assert snap1 is not None
snap2 = cb.snapshot_skills(reason="second")
assert snap2 is not None
with tarfile.open(snap2 / "skills.tar.gz") as tf:
names = tf.getnames()
assert not any(n.startswith(".curator_backups") for n in names), (
"second snapshot must not contain the first snapshot recursively"
)
def test_snapshot_excludes_hub_dir(backup_env):
""".hub/ is managed by the skills hub. Rolling it back would break
lockfile invariants, so the snapshot omits it entirely."""
cb = backup_env["cb"]
hub = backup_env["skills"] / ".hub"
hub.mkdir()
(hub / "lock.json").write_text("{}")
_write_skill(backup_env["skills"], "alpha")
snap = cb.snapshot_skills(reason="t")
assert snap is not None
with tarfile.open(snap / "skills.tar.gz") as tf:
names = tf.getnames()
assert not any(n.startswith(".hub") for n in names)
def test_snapshot_disabled_returns_none(backup_env, monkeypatch):
cb = backup_env["cb"]
monkeypatch.setattr(cb, "is_enabled", lambda: False)
_write_skill(backup_env["skills"], "alpha")
assert cb.snapshot_skills() is None
# And no backup dir should have been created
assert not (backup_env["skills"] / ".curator_backups").exists()
def test_snapshot_uniquifies_when_same_second(backup_env, monkeypatch):
"""Two snapshots in the same wallclock second must not clobber each
other. The module appends a counter to the second snapshot's id."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
frozen = "2026-05-01T12-00-00Z"
monkeypatch.setattr(cb, "_utc_id", lambda now=None: frozen)
s1 = cb.snapshot_skills(reason="a")
s2 = cb.snapshot_skills(reason="b")
assert s1 is not None and s2 is not None
assert s1.name == frozen
assert s2.name == f"{frozen}-01"
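The counter suffix checked above (`-01` on the second snapshot taken in the same second) could be produced along these lines. This is a hedged sketch: `unique_snapshot_name` is a hypothetical helper, and representing the existing snapshots as a set of names is an assumption. The real module presumably inspects the backups directory.

```python
def unique_snapshot_name(base_id: str, existing: set[str]) -> str:
    """Return base_id, or base_id with a two-digit counter if it is already taken."""
    if base_id not in existing:
        return base_id
    counter = 1
    while f"{base_id}-{counter:02d}" in existing:
        counter += 1
    return f"{base_id}-{counter:02d}"
```

A zero-padded counter keeps the names sortable in lexical order for the first 99 collisions, which matches the lexical-order assumption the pruning test relies on.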
def test_snapshot_prunes_to_keep_count(backup_env, monkeypatch):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
monkeypatch.setattr(cb, "get_keep", lambda: 3)
# Create 5 snapshots with monotonically increasing fake ids
ids = [f"2026-05-0{i}T00-00-00Z" for i in range(1, 6)]
for i, fid in enumerate(ids):
monkeypatch.setattr(cb, "_utc_id", lambda now=None, _f=fid: _f)
cb.snapshot_skills(reason=f"n{i}")
remaining = sorted(p.name for p in (backup_env["skills"] / ".curator_backups").iterdir())
# Newest 3 kept (lex order == date order for this id format)
assert remaining == ids[2:], f"expected newest 3, got {remaining}"
# ---------------------------------------------------------------------------
# list_backups / _resolve_backup
# ---------------------------------------------------------------------------
def test_list_backups_empty(backup_env):
cb = backup_env["cb"]
assert cb.list_backups() == []
def test_list_backups_returns_manifest_data(backup_env):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
cb.snapshot_skills(reason="m1")
rows = cb.list_backups()
assert len(rows) == 1
assert rows[0]["reason"] == "m1"
assert rows[0]["skill_files"] == 1
def test_resolve_backup_newest_when_no_id(backup_env, monkeypatch):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
ids = ["2026-05-01T00-00-00Z", "2026-05-02T00-00-00Z"]
for fid in ids:
monkeypatch.setattr(cb, "_utc_id", lambda now=None, _f=fid: _f)
cb.snapshot_skills()
resolved = cb._resolve_backup(None)
assert resolved is not None
assert resolved.name == "2026-05-02T00-00-00Z", (
"resolve(None) must return newest regular snapshot"
)
def test_resolve_backup_unknown_id_returns_none(backup_env):
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
cb.snapshot_skills()
assert cb._resolve_backup("not-an-id") is None
# ---------------------------------------------------------------------------
# rollback
# ---------------------------------------------------------------------------
def test_rollback_restores_deleted_skill(backup_env):
"""The whole point of this feature: user loses a skill, rollback
brings it back."""
cb = backup_env["cb"]
skills = backup_env["skills"]
user_skill = _write_skill(skills, "my-personal-workflow", body="important content")
cb.snapshot_skills(reason="pre-simulated-curator")
# Simulate curator archiving it out of existence
import shutil as _sh
_sh.rmtree(user_skill)
assert not user_skill.exists()
ok, msg, _ = cb.rollback()
assert ok, f"rollback failed: {msg}"
assert user_skill.exists(), "my-personal-workflow should be restored"
assert "important content" in (user_skill / "SKILL.md").read_text()
def test_rollback_is_itself_undoable(backup_env):
"""A rollback creates its own safety snapshot before replacing the
tree, so the user can undo a mistaken rollback. The safety snapshot
is a real tarball with reason='pre-rollback to <id>'; it's
listed by list_backups() just like any other snapshot and can be
restored the same way."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "v1")
cb.snapshot_skills(reason="snapshot-of-v1")
# Overwrite with a new skill state
import shutil as _sh
_sh.rmtree(skills / "v1")
_write_skill(skills, "v2")
ok, _, _ = cb.rollback()
assert ok
assert (skills / "v1").exists()
# list_backups should show a safety snapshot tagged "pre-rollback to <target-id>"
rows = cb.list_backups()
pre_rollback_entries = [r for r in rows if "pre-rollback" in (r.get("reason") or "")]
assert len(pre_rollback_entries) >= 1, (
f"expected a pre-rollback safety snapshot in list_backups(), got: "
f"{[(r.get('id'), r.get('reason')) for r in rows]}"
)
# And the transient staging dir must be gone (it's implementation detail)
backups_dir = skills / ".curator_backups"
staging_dirs = [p for p in backups_dir.iterdir() if p.name.startswith(".rollback-staging-")]
assert staging_dirs == [], (
f"staging dir should be cleaned up on success, got: {staging_dirs}"
)
def test_rollback_no_snapshots_returns_error(backup_env):
cb = backup_env["cb"]
ok, msg, _ = cb.rollback()
assert not ok
assert "no matching backup" in msg.lower() or "no snapshot" in msg.lower()
def test_rollback_rejects_unsafe_tarball(backup_env, monkeypatch):
"""Tarballs with absolute paths or .. components must be refused even
if someone crafts a malicious snapshot. Defense in depth: normal
curator snapshots never produce these."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "alpha")
cb.snapshot_skills(reason="legit")
# Hand-craft a malicious tarball replacing the legit one
rows = cb.list_backups()
snap_dir = Path(rows[0]["path"])
mal = snap_dir / "skills.tar.gz"
mal.unlink()
with tarfile.open(mal, "w:gz") as tf:
evil = tempfile.NamedTemporaryFile(delete=False, suffix=".md")
evil.write(b"evil")
evil.close()
tf.add(evil.name, arcname="../../etc/evil.md")
os.unlink(evil.name)
ok, msg, _ = cb.rollback()
assert not ok
assert "unsafe" in msg.lower() or "refus" in msg.lower() or "extract" in msg.lower()
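The path check that the test above expects can be sketched as a pure function over archive member names. This is a minimal illustration, assuming rollback validates every member name before extraction. `is_safe_member_name` is a hypothetical name, and the sketch does not cover symlinks or device files, which a real implementation would also need to consider.

```python
from pathlib import PurePosixPath


def is_safe_member_name(name: str) -> bool:
    """Accept only relative paths that contain no '..' component."""
    path = PurePosixPath(name)
    if path.is_absolute():
        return False
    return ".." not in path.parts
```

Under this sketch, the `../../etc/evil.md` member crafted in the test is rejected, while an ordinary `alpha/SKILL.md` entry passes.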
# ---------------------------------------------------------------------------
# Integration with run_curator_review
# ---------------------------------------------------------------------------
def test_real_run_takes_pre_snapshot(backup_env, monkeypatch):
"""A real (non-dry) curator pass must snapshot the tree before calling
apply_automatic_transitions. This is the safety net #18373 asked for."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "alpha")
# Reload curator module against the freshly-env'd hermes_constants
from agent import curator
importlib.reload(curator)
# Stub out LLM review and auto transitions — we only care about the
# snapshot side-effect.
monkeypatch.setattr(
curator, "_run_llm_review",
lambda p: {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None},
)
monkeypatch.setattr(
curator, "apply_automatic_transitions",
lambda now=None: {"checked": 1, "marked_stale": 0, "archived": 0, "reactivated": 0},
)
curator.run_curator_review(synchronous=True)
# Pre-run snapshot should exist
rows = cb.list_backups()
assert any(r.get("reason") == "pre-curator-run" for r in rows), (
f"expected a pre-curator-run snapshot, got {[r.get('reason') for r in rows]}"
)
def test_dry_run_skips_snapshot(backup_env, monkeypatch):
"""Dry-run previews must not spend disk on a snapshot — they don't
mutate anything, so there's nothing to back up."""
cb = backup_env["cb"]
skills = backup_env["skills"]
_write_skill(skills, "alpha")
from agent import curator
importlib.reload(curator)
monkeypatch.setattr(
curator, "_run_llm_review",
lambda p: {"final": "", "summary": "s", "model": "", "provider": "",
"tool_calls": [], "error": None},
)
curator.run_curator_review(synchronous=True, dry_run=True)
rows = cb.list_backups()
assert not any(r.get("reason") == "pre-curator-run" for r in rows), (
"dry-run must not create a pre-run snapshot"
)
-164
@@ -270,167 +270,3 @@ def test_state_transitions_captured_in_report(curator_env):
assert "State transitions" in md
assert "getting-old" in md
assert "active → stale" in md
# ---------------------------------------------------------------------------
# Cron job skill reference rewriting (curator ↔ cron integration)
# ---------------------------------------------------------------------------
#
# When the curator consolidates skill X into umbrella Y during a run, any
# cron job that listed X in its ``skills`` field would fail to load X at
# run time — the scheduler logs a warning and skips it, so the scheduled
# job runs without the instructions it was scheduled to follow. These
# tests verify that _write_run_report calls into cron.jobs to repair
# those references and records what it did in both run.json and
# cron_rewrites.json.
@pytest.fixture
def curator_env_with_cron(curator_env, monkeypatch):
"""Extend curator_env with an initialized + repointed cron.jobs module."""
home = curator_env["home"]
(home / "cron").mkdir(exist_ok=True)
(home / "cron" / "output").mkdir(exist_ok=True)
import importlib
import cron.jobs as jobs_mod
importlib.reload(jobs_mod)
monkeypatch.setattr(jobs_mod, "HERMES_DIR", home)
monkeypatch.setattr(jobs_mod, "CRON_DIR", home / "cron")
monkeypatch.setattr(jobs_mod, "JOBS_FILE", home / "cron" / "jobs.json")
monkeypatch.setattr(jobs_mod, "OUTPUT_DIR", home / "cron" / "output")
return {**curator_env, "jobs": jobs_mod}
def test_curator_rewrites_cron_skills_when_skill_consolidated(curator_env_with_cron):
"""A skill consolidated into an umbrella should be rewritten in any
cron job's skills list; the rewrite should be visible in run.json
and cron_rewrites.json."""
curator = curator_env_with_cron["curator"]
jobs = curator_env_with_cron["jobs"]
# Create a cron job that depends on a soon-to-be-consolidated skill
job = jobs.create_job(
prompt="",
schedule="every 1h",
skills=["foo"],
name="foo-watcher",
)
# Simulate a curator pass that consolidated `foo` → `foo-umbrella`
before = [{"name": "foo", "state": "active", "pinned": False}]
after = [{"name": "foo-umbrella", "state": "active", "pinned": False}]
run_dir = curator._write_run_report(
started_at=datetime.now(timezone.utc),
elapsed_seconds=3.0,
auto_counts={"checked": 1, "marked_stale": 0, "archived": 0, "reactivated": 0},
auto_summary="no changes",
before_report=before,
before_names={"foo"},
after_report=after,
llm_meta=_make_llm_meta(
final="Consolidated foo into foo-umbrella.",
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "write_file",
"name": "foo-umbrella",
"file_path": "references/foo.md",
"file_content": "from foo",
}),
},
],
),
)
# Cron job is rewritten on disk
loaded = jobs.get_job(job["id"])
assert loaded["skills"] == ["foo-umbrella"]
assert loaded["skill"] == "foo-umbrella"
# Rewrite is recorded in run.json
payload = json.loads((run_dir / "run.json").read_text())
assert payload["cron_rewrites"]["jobs_updated"] == 1
assert payload["counts"]["cron_jobs_rewritten"] == 1
rewrites = payload["cron_rewrites"]["rewrites"]
assert len(rewrites) == 1
assert rewrites[0]["mapped"] == {"foo": "foo-umbrella"}
# Separate cron_rewrites.json is written for convenience
cron_file = run_dir / "cron_rewrites.json"
assert cron_file.exists()
detail = json.loads(cron_file.read_text())
assert detail["jobs_updated"] == 1
# Markdown surfaces the change
md = (run_dir / "REPORT.md").read_text()
assert "Cron job skill references rewritten" in md
assert "foo-watcher" in md
assert "foo-umbrella" in md
def test_curator_drops_pruned_skill_from_cron_job(curator_env_with_cron):
"""A pruned (no-umbrella) skill should be dropped from the cron
job's skill list entirely — there's no forwarding target."""
curator = curator_env_with_cron["curator"]
jobs = curator_env_with_cron["jobs"]
job = jobs.create_job(
prompt="",
schedule="every 1h",
skills=["keep", "stale-one"],
)
before = [{"name": "stale-one", "state": "active", "pinned": False}]
after: list = [] # stale-one was archived with no target
run_dir = curator._write_run_report(
started_at=datetime.now(timezone.utc),
elapsed_seconds=1.0,
auto_counts={"checked": 1, "marked_stale": 0, "archived": 1, "reactivated": 0},
auto_summary="1 archived",
before_report=before,
before_names={"stale-one"},
after_report=after,
llm_meta=_make_llm_meta(), # no tool calls → classifier marks it pruned
)
loaded = jobs.get_job(job["id"])
assert loaded["skills"] == ["keep"]
payload = json.loads((run_dir / "run.json").read_text())
assert payload["cron_rewrites"]["jobs_updated"] == 1
rewrites = payload["cron_rewrites"]["rewrites"]
assert rewrites[0]["dropped"] == ["stale-one"]
def test_curator_report_has_no_cron_section_when_nothing_changes(curator_env_with_cron):
"""When the curator run doesn't touch any skills, cron jobs are
untouched and cron_rewrites.json is not even written."""
curator = curator_env_with_cron["curator"]
jobs = curator_env_with_cron["jobs"]
jobs.create_job(prompt="", schedule="every 1h", skills=["foo"])
run_dir = curator._write_run_report(
started_at=datetime.now(timezone.utc),
elapsed_seconds=1.0,
auto_counts={"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0},
auto_summary="no changes",
before_report=[{"name": "foo", "state": "active", "pinned": False}],
before_names={"foo"},
after_report=[{"name": "foo", "state": "active", "pinned": False}],
llm_meta=_make_llm_meta(),
)
# No rewrites → no separate file, no section in md
assert not (run_dir / "cron_rewrites.json").exists()
md = (run_dir / "REPORT.md").read_text()
assert "Cron job skill references rewritten" not in md
payload = json.loads((run_dir / "run.json").read_text())
assert payload["cron_rewrites"]["jobs_updated"] == 0
assert payload["counts"]["cron_jobs_rewritten"] == 0
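The rewrite behaviour these three tests describe (consolidated skills are remapped, pruned skills are dropped, untouched skills are kept) can be sketched for a single job's skills list. This is an illustration only: `rewrite_skills` is a hypothetical helper, and passing the consolidation mapping and pruned set in as arguments is an assumption about how the classifier output is shaped.

```python
def rewrite_skills(skills, consolidated, pruned):
    """Return (new_skills, mapped, dropped) for one cron job's skills list."""
    new_skills, mapped, dropped = [], {}, []
    for name in skills:
        if name in consolidated:
            target = consolidated[name]
            mapped[name] = target
            if target not in new_skills:  # avoid duplicate umbrella entries
                new_skills.append(target)
        elif name in pruned:
            dropped.append(name)  # no forwarding target, so remove it
        elif name not in new_skills:
            new_skills.append(name)
    return new_skills, mapped, dropped
```

The `mapped` and `dropped` values mirror the keys the tests read from each entry in `rewrites`. An empty `mapped` and `dropped` would correspond to the case where no `cron_rewrites.json` is written.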
+13 -158
@@ -115,15 +115,9 @@ class TestMissingTypeFilled:
class TestAnyOfParentType:
"""Rule 2: type must not appear at the anyOf parent level.
"""Rule 2: type must not appear at the anyOf parent level."""
When an anyOf contains a null-type branch, Moonshot rejects it.
The sanitizer collapses the anyOf: single non-null branch is promoted,
multiple non-null branches have null removed from the list.
"""
def test_anyof_null_branch_collapsed_to_single_type(self):
"""anyOf [string, null] → plain string (anyOf removed)."""
def test_parent_type_stripped_when_anyof_present(self):
params = {
"type": "object",
"properties": {
@@ -138,46 +132,25 @@ class TestAnyOfParentType:
}
out = sanitize_moonshot_tool_parameters(params)
from_format = out["properties"]["from_format"]
# null branch removed, anyOf collapsed to the single non-null type
assert "anyOf" not in from_format
assert from_format["type"] == "string"
assert "type" not in from_format
assert "anyOf" in from_format
def test_anyof_multiple_non_null_preserved(self):
"""anyOf [string, integer] (no null) → kept as-is with parent type stripped."""
def test_anyof_children_missing_type_get_filled(self):
params = {
"type": "object",
"properties": {
"mode": {
"value": {
"anyOf": [
{"type": "string"},
{"type": "integer"},
{"description": "A typeless option"},
],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
mode = out["properties"]["mode"]
assert "anyOf" in mode
assert "type" not in mode # parent type stripped
def test_anyof_enum_with_null_collapsed(self):
"""anyOf [{enum: [...], type: string}, {type: null}] → enum + type only."""
params = {
"type": "object",
"properties": {
"db_type": {
"anyOf": [
{"enum": ["mysql", "postgresql", ""]},
{"type": "null"},
],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "anyOf" not in db_type
assert db_type["type"] == "string"
assert db_type["enum"] == ["mysql", "postgresql"] # "" stripped by enum cleanup
children = out["properties"]["value"]["anyOf"]
assert children[0]["type"] == "string"
assert "type" in children[1]
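The anyOf collapse described in the longer `TestAnyOfParentType` docstring (a single non-null branch is promoted; with several non-null branches only the null branch is removed) can be sketched as below. This is a hedged illustration, not `sanitize_moonshot_tool_parameters` itself: `collapse_nullable_anyof` is a hypothetical name, and the sketch does not infer a missing `type` or strip the parent `type`, both of which the tests attribute to other rules.

```python
def collapse_nullable_anyof(node: dict) -> dict:
    """Remove a {"type": "null"} branch from anyOf, promoting a lone survivor."""
    branches = node.get("anyOf")
    if not isinstance(branches, list):
        return node
    non_null = [
        b for b in branches
        if not (isinstance(b, dict) and b.get("type") == "null")
    ]
    if not non_null or len(non_null) == len(branches):
        return node  # nothing to collapse
    merged = {k: v for k, v in node.items() if k != "anyOf"}
    if len(non_null) == 1:
        merged.update(non_null[0])  # promote the single non-null branch
    else:
        merged["anyOf"] = non_null  # keep anyOf, minus the null branch
    return merged
```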
class TestTopLevelGuarantees:
@@ -253,7 +226,7 @@ class TestRealWorldMCPShape:
"""End-to-end: a realistic MCP-style schema that used to 400 on Moonshot."""
def test_combined_rewrites(self):
# Shape: missing type on a property, anyOf with parent type + null, array
# Shape: missing type on a property, anyOf with parent type, array
# items without type — all in one tool.
params = {
"type": "object",
@@ -275,125 +248,7 @@ class TestRealWorldMCPShape:
}
out = sanitize_moonshot_tool_parameters(params)
assert out["properties"]["query"]["type"] == "string"
# anyOf with null collapsed to plain type
assert "anyOf" not in out["properties"]["filter"]
assert out["properties"]["filter"]["type"] == "string"
assert "type" not in out["properties"]["filter"]
assert out["properties"]["filter"]["anyOf"][0]["type"] == "string"
assert out["properties"]["tags"]["items"]["type"] == "string"
assert out["required"] == ["query"]
class TestEnumNullStripping:
"""Rule 3: Moonshot rejects null/empty-string inside enum arrays."""
def test_enum_null_value_stripped(self):
"""enum containing Python None must have it removed for Moonshot."""
params = {
"type": "object",
"properties": {
"db_type": {
"type": "string",
"enum": ["mysql", "postgresql", None],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert None not in db_type["enum"]
assert "mysql" in db_type["enum"]
assert "postgresql" in db_type["enum"]
def test_enum_empty_string_stripped(self):
"""enum containing empty string '' must have it removed for Moonshot."""
params = {
"type": "object",
"properties": {
"db_type": {
"type": "string",
"enum": ["mysql", "postgresql", ""],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "" not in db_type["enum"]
assert db_type["enum"] == ["mysql", "postgresql"]
def test_enum_all_null_becomes_no_enum(self):
"""enum that only had null/empty values is dropped entirely."""
params = {
"type": "object",
"properties": {
"val": {
"type": "string",
"enum": [None, ""],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
assert "enum" not in out["properties"]["val"]
def test_dataslayer_db_type_after_mcp_normalize(self):
"""Real-world: dataslayer db_type anyOf+enum after MCP normalization."""
# This is the exact shape after _normalize_mcp_input_schema runs:
# anyOf collapsed, but enum still has null + empty string
params = {
"type": "object",
"properties": {
"datasource": {"type": "string"},
"db_type": {
"enum": ["mysql", "mariadb", "postgresql", "sqlserver", "oracle", "", None],
"type": "string",
"nullable": True,
"default": None,
},
},
"required": ["datasource"],
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "nullable" not in db_type, "nullable keyword must be stripped"
assert None not in db_type["enum"]
assert "" not in db_type["enum"]
assert db_type["enum"] == ["mysql", "mariadb", "postgresql", "sqlserver", "oracle"]
assert db_type["type"] == "string"
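The enum cleanup these tests exercise can be sketched on a single schema node. This is an assumption-laden illustration: `strip_null_enum_values` is a hypothetical name, and skipping nodes whose `type` is `"object"` is only one plausible reading of the "non-scalar types should not be touched" test.

```python
def strip_null_enum_values(node: dict) -> dict:
    """Drop None and '' from enum, drop an emptied enum, and remove 'nullable'."""
    out = dict(node)
    enum = out.get("enum")
    if isinstance(enum, list) and out.get("type") != "object":
        cleaned = [v for v in enum if v is not None and v != ""]
        if cleaned:
            out["enum"] = cleaned
        else:
            out.pop("enum")  # nothing meaningful left to constrain
    out.pop("nullable", None)
    return out
```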
def test_enum_on_object_type_not_stripped(self):
"""enum on non-scalar types (object) should NOT be touched."""
params = {
"type": "object",
"properties": {
"config": {
"type": "object",
"properties": {},
"enum": [{}, None],
},
},
}
out = sanitize_moonshot_tool_parameters(params)
# object-typed enum should pass through unchanged
assert "enum" in out["properties"]["config"]
def test_anyof_collapse_still_runs_nullable_and_enum_cleanup(self):
"""After anyOf collapses to a single non-null branch, the merged
node must still have ``nullable`` stripped and null/empty-string
values removed from enum, not skipped by the early anyOf return.
"""
params = {
"type": "object",
"properties": {
"db_type": {
"anyOf": [
{"enum": ["mysql", "postgresql", "", None]},
{"type": "null"},
],
"nullable": True,
},
},
}
out = sanitize_moonshot_tool_parameters(params)
db_type = out["properties"]["db_type"]
assert "anyOf" not in db_type
assert "nullable" not in db_type, "nullable must be stripped after anyOf collapse"
assert db_type["type"] == "string"
assert db_type["enum"] == ["mysql", "postgresql"], \
"null/empty enum values must be stripped after anyOf collapse"
-58
@@ -1,58 +0,0 @@
"""Tests for agent/skill_utils.py — extract_skill_conditions metadata handling."""

from agent.skill_utils import extract_skill_conditions


def test_metadata_as_dict_with_hermes():
    """Normal case: metadata is a dict containing hermes keys."""
    frontmatter = {
        "metadata": {
            "hermes": {
                "fallback_for_toolsets": ["toolset_a"],
                "requires_toolsets": ["toolset_b"],
                "fallback_for_tools": ["tool_x"],
                "requires_tools": ["tool_y"],
            }
        }
    }
    result = extract_skill_conditions(frontmatter)
    assert result["fallback_for_toolsets"] == ["toolset_a"]
    assert result["requires_toolsets"] == ["toolset_b"]
    assert result["fallback_for_tools"] == ["tool_x"]
    assert result["requires_tools"] == ["tool_y"]


def test_metadata_as_string_does_not_crash():
    """Bug case: metadata is a non-dict truthy value (e.g. a YAML string)."""
    frontmatter = {"metadata": "some text"}
    result = extract_skill_conditions(frontmatter)
    assert result == {
        "fallback_for_toolsets": [],
        "requires_toolsets": [],
        "fallback_for_tools": [],
        "requires_tools": [],
    }


def test_metadata_as_none():
    """metadata key is present but set to null/None."""
    frontmatter = {"metadata": None}
    result = extract_skill_conditions(frontmatter)
    assert result == {
        "fallback_for_toolsets": [],
        "requires_toolsets": [],
        "fallback_for_tools": [],
        "requires_tools": [],
    }


def test_metadata_missing_entirely():
    """metadata key is absent from frontmatter."""
    frontmatter = {"name": "my-skill", "description": "Does stuff."}
    result = extract_skill_conditions(frontmatter)
    assert result == {
        "fallback_for_toolsets": [],
        "requires_toolsets": [],
        "fallback_for_tools": [],
        "requires_tools": [],
    }
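
The four cases above pin down a defensive shape for the extractor: any `metadata` value that is not a dict is treated as if it were empty. The following is a minimal sketch of that behavior, not the deleted `agent/skill_utils.py` code. The name `extract_skill_conditions_sketch` and the `_CONDITION_KEYS` tuple are hypothetical, and the handling of a non-dict `hermes` value is an assumption the tests do not cover.

```python
from typing import Any, Dict, List

# The four keys the tests above expect in every result.
_CONDITION_KEYS = (
    "fallback_for_toolsets",
    "requires_toolsets",
    "fallback_for_tools",
    "requires_tools",
)


def extract_skill_conditions_sketch(frontmatter: Dict[str, Any]) -> Dict[str, List[str]]:
    """Hypothetical stand-in: return the four condition lists, defaulting to []."""
    metadata = frontmatter.get("metadata")
    if not isinstance(metadata, dict):
        # Covers a missing key, None, and truthy non-dict values such as a YAML string.
        metadata = {}
    hermes = metadata.get("hermes")
    if not isinstance(hermes, dict):
        # Assumption: a malformed "hermes" block is ignored rather than raising.
        hermes = {}
    return {key: list(hermes.get(key) or []) for key in _CONDITION_KEYS}
```

The `isinstance` check is what separates this from a plain `frontmatter.get("metadata") or {}`, which would still fail on `"some text"` because a string has no `.get`.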
-238
@@ -1,238 +0,0 @@
"""Pure tool-call guardrail primitive tests."""

import json

from agent.tool_guardrails import (
    ToolCallGuardrailConfig,
    ToolCallGuardrailController,
    ToolCallSignature,
    canonical_tool_args,
)


def test_tool_call_signature_hashes_canonical_nested_unicode_args_without_exposing_raw_args():
    args_a = {
        "z": [{"β": "", "a": 1}],
        "a": {"y": 2, "x": "secret-token-value"},
    }
    args_b = {
        "a": {"x": "secret-token-value", "y": 2},
        "z": [{"a": 1, "β": ""}],
    }

    assert canonical_tool_args(args_a) == canonical_tool_args(args_b)
    sig_a = ToolCallSignature.from_call("web_search", args_a)
    sig_b = ToolCallSignature.from_call("web_search", args_b)

    assert sig_a == sig_b
    assert len(sig_a.args_hash) == 64
    metadata = sig_a.to_metadata()
    assert metadata == {"tool_name": "web_search", "args_hash": sig_a.args_hash}
    assert "secret-token-value" not in json.dumps(metadata)
    assert "" not in json.dumps(metadata)
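
The test above relies on two properties: key order must not affect the canonical form, and the signature must carry only a digest of the arguments, never the raw values. A minimal sketch of one way to get both, assuming sorted-key JSON as the canonical form and SHA-256 as the digest (which matches the 64-character hash length asserted above). The helper names `canonical_args_sketch` and `args_hash_sketch` are hypothetical; the real `canonical_tool_args` and `ToolCallSignature` internals are not shown in this diff.

```python
import hashlib
import json
from typing import Any, Dict


def canonical_args_sketch(args: Dict[str, Any]) -> str:
    """Serialize args so that dict key order does not matter (assumed scheme)."""
    # sort_keys applies to nested dicts too; list order is preserved as-is.
    return json.dumps(args, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def args_hash_sketch(args: Dict[str, Any]) -> str:
    """Return a 64-character hex digest that does not expose the raw values."""
    canonical = canonical_args_sketch(args)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing only the digest lets the guardrail compare calls for equality while keeping secrets out of logs and metadata.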


def test_default_config_is_soft_warning_only_with_hard_stop_disabled():
    cfg = ToolCallGuardrailConfig()

    assert cfg.warnings_enabled is True
    assert cfg.hard_stop_enabled is False
    assert cfg.exact_failure_warn_after == 2
    assert cfg.same_tool_failure_warn_after == 3
    assert cfg.no_progress_warn_after == 2
    assert cfg.exact_failure_block_after == 5
    assert cfg.same_tool_failure_halt_after == 8
    assert cfg.no_progress_block_after == 5


def test_config_parses_nested_warn_and_hard_stop_thresholds():
    cfg = ToolCallGuardrailConfig.from_mapping(
        {
            "warnings_enabled": False,
            "hard_stop_enabled": True,
            "warn_after": {
                "exact_failure": 3,
                "same_tool_failure": 4,
                "idempotent_no_progress": 5,
            },
            "hard_stop_after": {
                "exact_failure": 6,
                "same_tool_failure": 7,
                "idempotent_no_progress": 8,
            },
        }
    )

    assert cfg.warnings_enabled is False
    assert cfg.hard_stop_enabled is True
    assert cfg.exact_failure_warn_after == 3
    assert cfg.same_tool_failure_warn_after == 4
    assert cfg.no_progress_warn_after == 5
    assert cfg.exact_failure_block_after == 6
    assert cfg.same_tool_failure_halt_after == 7
    assert cfg.no_progress_block_after == 8


def test_default_repeated_identical_failed_call_warns_without_blocking():
    controller = ToolCallGuardrailController()
    args = {"query": "same"}

    decisions = []
    for _ in range(5):
        assert controller.before_call("web_search", args).action == "allow"
        decisions.append(
            controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
        )

    assert decisions[0].action == "allow"
    assert [d.action for d in decisions[1:]] == ["warn", "warn", "warn", "warn"]
    assert {d.code for d in decisions[1:]} == {"repeated_exact_failure_warning"}
    assert controller.before_call("web_search", args).action == "allow"
    assert controller.halt_decision is None


def test_hard_stop_enabled_blocks_repeated_exact_failure_before_next_execution():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(
            hard_stop_enabled=True,
            exact_failure_warn_after=2,
            exact_failure_block_after=2,
            same_tool_failure_halt_after=99,
        )
    )
    args = {"query": "same"}

    assert controller.before_call("web_search", args).action == "allow"
    first = controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
    assert first.action == "allow"

    assert controller.before_call("web_search", args).action == "allow"
    second = controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
    assert second.action == "warn"
    assert second.code == "repeated_exact_failure_warning"

    blocked = controller.before_call("web_search", args)
    assert blocked.action == "block"
    assert blocked.code == "repeated_exact_failure_block"
    assert blocked.count == 2


def test_success_resets_exact_signature_failure_streak():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(hard_stop_enabled=True, exact_failure_block_after=2, same_tool_failure_halt_after=99)
    )
    args = {"query": "same"}

    controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
    controller.after_call("web_search", args, '{"ok":true}', failed=False)

    assert controller.before_call("web_search", args).action == "allow"
    controller.after_call("web_search", args, '{"error":"boom"}', failed=True)
    assert controller.before_call("web_search", args).action == "allow"


def test_same_tool_varying_args_warns_by_default_without_halting():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(same_tool_failure_warn_after=2, same_tool_failure_halt_after=3)
    )

    first = controller.after_call("terminal", {"command": "cmd-1"}, '{"exit_code":1}', failed=True)
    second = controller.after_call("terminal", {"command": "cmd-2"}, '{"exit_code":1}', failed=True)
    third = controller.after_call("terminal", {"command": "cmd-3"}, '{"exit_code":1}', failed=True)
    fourth = controller.after_call("terminal", {"command": "cmd-4"}, '{"exit_code":1}', failed=True)

    assert first.action == "allow"
    assert [second.action, third.action, fourth.action] == ["warn", "warn", "warn"]
    assert {second.code, third.code, fourth.code} == {"same_tool_failure_warning"}
    assert controller.halt_decision is None


def test_hard_stop_enabled_halts_same_tool_varying_args_failure_streak():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(
            hard_stop_enabled=True,
            exact_failure_block_after=99,
            same_tool_failure_warn_after=2,
            same_tool_failure_halt_after=3,
        )
    )

    first = controller.after_call("terminal", {"command": "cmd-1"}, '{"exit_code":1}', failed=True)
    assert first.action == "allow"
    second = controller.after_call("terminal", {"command": "cmd-2"}, '{"exit_code":1}', failed=True)
    assert second.action == "warn"
    assert second.code == "same_tool_failure_warning"
    third = controller.after_call("terminal", {"command": "cmd-3"}, '{"exit_code":1}', failed=True)
    assert third.action == "halt"
    assert third.code == "same_tool_failure_halt"
    assert third.count == 3


def test_idempotent_no_progress_repeated_result_warns_without_blocking_by_default():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(no_progress_warn_after=2, no_progress_block_after=2)
    )
    args = {"path": "/tmp/same.txt"}
    result = "same file contents"

    for _ in range(4):
        assert controller.before_call("read_file", args).action == "allow"
        decision = controller.after_call("read_file", args, result, failed=False)

    assert decision.action == "warn"
    assert decision.code == "idempotent_no_progress_warning"
    assert controller.before_call("read_file", args).action == "allow"
    assert controller.halt_decision is None


def test_hard_stop_enabled_blocks_idempotent_no_progress_future_repeat():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(
            hard_stop_enabled=True,
            no_progress_warn_after=2,
            no_progress_block_after=2,
        )
    )
    args = {"path": "/tmp/same.txt"}
    result = "same file contents"

    assert controller.before_call("read_file", args).action == "allow"
    assert controller.after_call("read_file", args, result, failed=False).action == "allow"
    assert controller.before_call("read_file", args).action == "allow"
    warn = controller.after_call("read_file", args, result, failed=False)
    assert warn.action == "warn"
    assert warn.code == "idempotent_no_progress_warning"

    blocked = controller.before_call("read_file", args)
    assert blocked.action == "block"
    assert blocked.code == "idempotent_no_progress_block"


def test_mutating_or_unknown_tools_are_not_blocked_for_repeated_identical_success_output_by_default():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(no_progress_warn_after=2, no_progress_block_after=2)
    )

    for _ in range(3):
        assert controller.before_call("write_file", {"path": "/tmp/x", "content": "x"}).action == "allow"
        assert controller.after_call("write_file", {"path": "/tmp/x", "content": "x"}, "ok", failed=False).action == "allow"
        assert controller.before_call("custom_tool", {"x": 1}).action == "allow"
        assert controller.after_call("custom_tool", {"x": 1}, "ok", failed=False).action == "allow"


def test_reset_for_turn_clears_bounded_guardrail_state():
    controller = ToolCallGuardrailController(
        ToolCallGuardrailConfig(hard_stop_enabled=True, exact_failure_block_after=2, no_progress_block_after=2)
    )
    controller.after_call("web_search", {"query": "same"}, '{"error":"boom"}', failed=True)
    controller.after_call("web_search", {"query": "same"}, '{"error":"boom"}', failed=True)
    controller.after_call("read_file", {"path": "/tmp/x"}, "same", failed=False)
    controller.after_call("read_file", {"path": "/tmp/x"}, "same", failed=False)

    assert controller.before_call("web_search", {"query": "same"}).action == "block"
    assert controller.before_call("read_file", {"path": "/tmp/x"}).action == "block"

    controller.reset_for_turn()

    assert controller.before_call("web_search", {"query": "same"}).action == "allow"
    assert controller.before_call("read_file", {"path": "/tmp/x"}).action == "allow"
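
The exact-failure tests above describe a small state machine: a per-signature streak that grows on failure, resets on success, produces a warning once it reaches `warn_after`, and blocks the next call only when hard stops are enabled. The sketch below illustrates just that one rule under simplifying assumptions. `ExactFailureStreakSketch` is a hypothetical class that returns plain action strings instead of decision objects and omits the same-tool and no-progress rules; it is not the removed `ToolCallGuardrailController`.

```python
import json
from typing import Any, Dict, Tuple


class ExactFailureStreakSketch:
    """Hypothetical, simplified model of the repeated-exact-failure rule."""

    def __init__(self, warn_after: int = 2, block_after: int = 5,
                 hard_stop_enabled: bool = False) -> None:
        self.warn_after = warn_after
        self.block_after = block_after
        self.hard_stop_enabled = hard_stop_enabled
        self._streaks: Dict[Tuple[str, str], int] = {}

    @staticmethod
    def _key(tool: str, args: Dict[str, Any]) -> Tuple[str, str]:
        # Sorted-key JSON makes the key independent of dict ordering.
        return tool, json.dumps(args, sort_keys=True)

    def before_call(self, tool: str, args: Dict[str, Any]) -> str:
        streak = self._streaks.get(self._key(tool, args), 0)
        if self.hard_stop_enabled and streak >= self.block_after:
            return "block"
        return "allow"

    def after_call(self, tool: str, args: Dict[str, Any], failed: bool) -> str:
        key = self._key(tool, args)
        if not failed:
            self._streaks.pop(key, None)  # a success resets the streak
            return "allow"
        self._streaks[key] = self._streaks.get(key, 0) + 1
        return "warn" if self._streaks[key] >= self.warn_after else "allow"

    def reset_for_turn(self) -> None:
        self._streaks.clear()
```

Checking the block condition in `before_call` rather than `after_call` mirrors the tests: the failing call itself still only warns, and the block lands on the next attempt.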
@@ -620,41 +620,6 @@ class TestChatCompletionsNormalize:
        assert nr.reasoning == "summary text"
        assert nr.provider_data == {"reasoning_content": "detailed scratchpad"}

    def test_empty_reasoning_content_preserved(self, transport):
        """DeepSeek can require an explicit empty reasoning_content replay field."""
        r = SimpleNamespace(
            choices=[SimpleNamespace(
                message=SimpleNamespace(
                    content=None,
                    tool_calls=None,
                    reasoning=None,
                    reasoning_content="",
                ),
                finish_reason="stop",
            )],
            usage=None,
        )
        nr = transport.normalize_response(r)
        assert nr.provider_data == {"reasoning_content": ""}
        assert nr.reasoning_content == ""

    def test_reasoning_content_preserved_from_model_extra(self, transport):
        """OpenAI SDK can expose provider-specific DeepSeek fields via model_extra."""
        r = SimpleNamespace(
            choices=[SimpleNamespace(
                message=SimpleNamespace(
                    content=None,
                    tool_calls=None,
                    reasoning=None,
                    model_extra={"reasoning_content": "model-extra scratchpad"},
                ),
                finish_reason="stop",
            )],
            usage=None,
        )
        nr = transport.normalize_response(r)
        assert nr.provider_data == {"reasoning_content": "model-extra scratchpad"}


class TestChatCompletionsCacheStats:
-206
View File
@@ -1,206 +0,0 @@
"""Tests for cli._cprint's bg-thread cooperation with prompt_toolkit.
Background: when a prompt_toolkit Application is running, a bg thread that
calls ``_pt_print`` directly can race with the input-area redraw and the
printed line can end up visually buried behind the prompt. ``_cprint`` now
routes cross-thread prints through ``run_in_terminal`` via
``loop.call_soon_threadsafe`` so the self-improvement background review's
``💾 Self-improvement review: `` summary actually surfaces to the user.
These tests verify the routing logic without spinning up a real PT app.
"""

from __future__ import annotations

import sys
import types
from types import SimpleNamespace

import cli


def test_cprint_no_app_direct_print(monkeypatch):
    """No active app → direct _pt_print, no run_in_terminal involvement."""
    calls = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: calls.append(("pt_print", x)))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: ("ANSI", t))

    # Patch the prompt_toolkit import the function performs internally.
    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: None
    fake_pt_app.run_in_terminal = lambda *a, **kw: calls.append(("run_in_terminal",))
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("hello")

    assert calls == [("pt_print", ("ANSI", "hello"))]


def test_cprint_app_not_running_direct_print(monkeypatch):
    """App exists but not running (e.g. teardown) → direct print."""
    calls = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: calls.append(("pt_print", x)))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)

    fake_app = SimpleNamespace(_is_running=False, loop=None)
    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: fake_app
    fake_pt_app.run_in_terminal = lambda *a, **kw: calls.append(("run_in_terminal",))
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("x")

    assert calls == [("pt_print", "x")]


def test_cprint_bg_thread_schedules_on_app_loop(monkeypatch):
    """App running + different thread → schedules via call_soon_threadsafe."""
    scheduled = []
    direct_prints = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)

    class FakeLoop:
        def is_running(self):
            return True

        def call_soon_threadsafe(self, cb, *args):
            scheduled.append(cb)

    fake_loop = FakeLoop()

    # Install a fake "current loop" that is NOT the app's loop, so the
    # cross-thread branch is taken.
    fake_current_loop = SimpleNamespace(is_running=lambda: True)
    fake_asyncio = types.ModuleType("asyncio")

    class _Policy:
        def get_event_loop(self):
            return fake_current_loop

    fake_asyncio.get_event_loop_policy = lambda: _Policy()
    monkeypatch.setitem(sys.modules, "asyncio", fake_asyncio)

    fake_app = SimpleNamespace(_is_running=True, loop=fake_loop)
    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: fake_app
    run_in_terminal_calls = []

    def _fake_run_in_terminal(func, **kw):
        run_in_terminal_calls.append(func)
        # Simulate run_in_terminal actually calling func (as the real PT
        # impl would once the app loop tick picks it up).
        func()
        return None

    fake_pt_app.run_in_terminal = _fake_run_in_terminal
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("💾 Self-improvement review: Skill updated")

    # call_soon_threadsafe must have been called with a scheduling cb.
    assert len(scheduled) == 1
    # Invoking the scheduled callback should hit run_in_terminal.
    scheduled[0]()
    assert len(run_in_terminal_calls) == 1
    # And run_in_terminal's inner func should have emitted a pt_print.
    assert direct_prints == ["💾 Self-improvement review: Skill updated"]


def test_cprint_same_thread_as_app_loop_direct_print(monkeypatch):
    """App running on same thread → direct print (no scheduling)."""
    direct_prints = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)

    class FakeLoop:
        def is_running(self):
            return True

        def call_soon_threadsafe(self, cb, *args):
            raise AssertionError(
                "call_soon_threadsafe must not be used on the app's own thread"
            )

    fake_loop = FakeLoop()
    fake_asyncio = types.ModuleType("asyncio")

    class _Policy:
        def get_event_loop(self):
            return fake_loop  # same as app loop

    fake_asyncio.get_event_loop_policy = lambda: _Policy()
    monkeypatch.setitem(sys.modules, "asyncio", fake_asyncio)

    fake_app = SimpleNamespace(_is_running=True, loop=fake_loop)
    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: fake_app
    fake_pt_app.run_in_terminal = lambda *a, **kw: None
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("x")

    assert direct_prints == ["x"]


def test_cprint_swallows_app_loop_attr_error(monkeypatch):
    """Loop missing on app → fall back to direct print, no crash."""
    direct_prints = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)

    class WeirdApp:
        _is_running = True

        @property
        def loop(self):
            raise RuntimeError("no loop for you")

    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: WeirdApp()
    fake_pt_app.run_in_terminal = lambda *a, **kw: None
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("fallback")

    assert direct_prints == ["fallback"]


def test_cprint_swallows_prompt_toolkit_import_error(monkeypatch):
    """If prompt_toolkit.application itself fails to import, fall back."""
    direct_prints = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)

    # Drop cached prompt_toolkit.application AND install a meta-path finder
    # that raises ImportError on re-import.
    monkeypatch.delitem(sys.modules, "prompt_toolkit.application", raising=False)

    class _BlockFinder:
        def find_module(self, name, path=None):
            if name == "prompt_toolkit.application":
                return self
            return None

        def load_module(self, name):
            raise ImportError("blocked for test")

        def find_spec(self, name, path=None, target=None):
            if name == "prompt_toolkit.application":
                # Returning a bogus spec that will fail on load works too,
                # but raising here keeps the test simple.
                raise ImportError("blocked for test")
            return None

    blocker = _BlockFinder()
    sys.meta_path.insert(0, blocker)
    try:
        cli._cprint("fallback2")
    finally:
        sys.meta_path.remove(blocker)

    assert direct_prints == ["fallback2"]
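
The routing these tests exercise comes down to one decision: print directly when no loop is running or when already on the loop's thread, otherwise hand the print to the loop with `call_soon_threadsafe`. Below is a minimal sketch of that decision using only `asyncio`. It leaves out prompt_toolkit and `run_in_terminal`, and `emit_on_loop_thread` and `demo` are hypothetical names, so it shows the threading pattern and is not a copy of `cli._cprint`. It assumes Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
import threading
from typing import Callable, List, Optional, Tuple


def emit_on_loop_thread(
    loop: Optional[asyncio.AbstractEventLoop],
    emit: Callable[[str], None],
    text: str,
) -> None:
    """Call emit(text) directly, or schedule it on `loop` from another thread."""
    try:
        current = asyncio.get_running_loop()
    except RuntimeError:
        current = None  # no event loop is running in the calling thread
    if loop is None or not loop.is_running() or current is loop:
        emit(text)  # safe: nothing to race with, or already on the loop thread
        return
    loop.call_soon_threadsafe(emit, text)  # cross-thread: let the loop run it


def demo() -> List[Tuple[str, bool]]:
    """Record each emitted text and whether it ran on the loop's thread."""
    seen: List[Tuple[str, bool]] = []

    async def main() -> None:
        loop = asyncio.get_running_loop()
        loop_thread = threading.get_ident()

        def emit(text: str) -> None:
            seen.append((text, threading.get_ident() == loop_thread))

        # Worker thread: the call is scheduled onto the loop, not run in place.
        await asyncio.to_thread(emit_on_loop_thread, loop, emit, "from worker")
        await asyncio.sleep(0)  # give any pending callback a chance to run
        # Loop thread: the call runs immediately.
        emit_on_loop_thread(loop, emit, "from loop thread")

    asyncio.run(main())
    return seen
```

In both cases `emit` ends up running on the loop's thread, which is the property that keeps background output from racing with the prompt redraw.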
+8 -12
@@ -21,21 +21,20 @@ def test_manual_compress_reports_noop_without_success_banner(capsys):
    shell.agent = MagicMock()
    shell.agent.compression_enabled = True
    shell.agent._cached_system_prompt = ""
    shell.agent.tools = None
    shell.agent.session_id = shell.session_id  # no-op compression: no split
    shell.agent._compress_context.return_value = (list(history), "")

    def _estimate(messages, **_kwargs):
    def _estimate(messages):
        assert messages == history
        return 100

    with patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate):
    with patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate):
        shell._manual_compress()

    output = capsys.readouterr().out
    assert "No changes from compression" in output
    assert "✅ Compressed" not in output
    assert "Approx request size: ~100 tokens (unchanged)" in output
    assert "Rough transcript estimate: ~100 tokens (unchanged)" in output


def test_manual_compress_explains_when_token_estimate_rises(capsys):
@@ -50,23 +49,22 @@ def test_manual_compress_explains_when_token_estimate_rises(capsys):
    shell.agent = MagicMock()
    shell.agent.compression_enabled = True
    shell.agent._cached_system_prompt = ""
    shell.agent.tools = None
    shell.agent.session_id = shell.session_id  # no-op: no split
    shell.agent._compress_context.return_value = (compressed, "")

    def _estimate(messages, **_kwargs):
    def _estimate(messages):
        if messages == history:
            return 100
        if messages == compressed:
            return 120
        raise AssertionError(f"unexpected transcript: {messages!r}")

    with patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate):
    with patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate):
        shell._manual_compress()

    output = capsys.readouterr().out
    assert "✅ Compressed: 4 → 3 messages" in output
    assert "Approx request size: ~100 → ~120 tokens" in output
    assert "Rough transcript estimate: ~100 → ~120 tokens" in output
    assert "denser summaries" in output
@@ -91,7 +89,6 @@ def test_manual_compress_syncs_session_id_after_split():
    shell.agent = MagicMock()
    shell.agent.compression_enabled = True
    shell.agent._cached_system_prompt = ""
    shell.agent.tools = None
    # Simulate _compress_context mutating agent.session_id as a side effect.
    def _fake_compress(*args, **kwargs):
        shell.agent.session_id = new_child_id
@@ -100,7 +97,7 @@ def test_manual_compress_syncs_session_id_after_split():
    shell.agent.session_id = old_id  # starts in sync
    shell._pending_title = "stale title"

    with patch("agent.model_metadata.estimate_request_tokens_rough", return_value=100):
    with patch("agent.model_metadata.estimate_messages_tokens_rough", return_value=100):
        shell._manual_compress()

    # CLI session_id must now point at the continuation child, not the parent.
@@ -121,12 +118,11 @@ def test_manual_compress_no_sync_when_session_id_unchanged():
    shell.agent = MagicMock()
    shell.agent.compression_enabled = True
    shell.agent._cached_system_prompt = ""
    shell.agent.tools = None
    shell.agent.session_id = shell.session_id
    shell.agent._compress_context.return_value = (list(history), "")
    shell._pending_title = "keep me"

    with patch("agent.model_metadata.estimate_request_tokens_rough", return_value=100):
    with patch("agent.model_metadata.estimate_messages_tokens_rough", return_value=100):
        shell._manual_compress()

    # No split → pending title untouched.
-289
@@ -1,289 +0,0 @@
"""Tests for cron.jobs.rewrite_skill_refs — the curator integration that
keeps scheduled cron jobs pointing at the right skill names after a
consolidation / pruning pass.
Bug this fixes: when the curator consolidates skill X into umbrella Y,
any cron job whose ``skills`` list contains X would silently fail to
load X at run time (the scheduler logs a warning and skips it), so the
job runs without the instructions it was scheduled to follow.
"""

from __future__ import annotations

import sys
from pathlib import Path

import pytest

# Ensure project root is importable
sys.path.insert(0, str(Path(__file__).parent.parent.parent))


@pytest.fixture
def cron_env(tmp_path, monkeypatch):
    """Isolated cron environment with temp HERMES_HOME."""
    hermes_home = tmp_path / ".hermes"
    hermes_home.mkdir()
    (hermes_home / "cron").mkdir()
    (hermes_home / "cron" / "output").mkdir()
    monkeypatch.setenv("HERMES_HOME", str(hermes_home))

    import cron.jobs as jobs_mod
    monkeypatch.setattr(jobs_mod, "HERMES_DIR", hermes_home)
    monkeypatch.setattr(jobs_mod, "CRON_DIR", hermes_home / "cron")
    monkeypatch.setattr(jobs_mod, "JOBS_FILE", hermes_home / "cron" / "jobs.json")
    monkeypatch.setattr(jobs_mod, "OUTPUT_DIR", hermes_home / "cron" / "output")

    return hermes_home


class TestRewriteSkillRefsNoop:
    """No jobs, no rewrites, no map: every combination of empty inputs."""

    def test_empty_map_and_no_jobs(self, cron_env):
        from cron.jobs import rewrite_skill_refs

        report = rewrite_skill_refs(consolidated={}, pruned=[])
        assert report == {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}

    def test_jobs_exist_but_map_empty(self, cron_env):
        from cron.jobs import create_job, rewrite_skill_refs

        create_job(prompt="", schedule="every 1h", skills=["foo"])
        report = rewrite_skill_refs(consolidated={}, pruned=[])
        assert report["jobs_updated"] == 0
        # Early return: we don't even scan when there's nothing to apply.
        assert report["jobs_scanned"] == 0

    def test_jobs_exist_but_no_match(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(prompt="", schedule="every 1h", skills=["foo"])
        report = rewrite_skill_refs(
            consolidated={"unrelated": "umbrella"},
            pruned=["other"],
        )
        assert report["jobs_updated"] == 0
        assert report["jobs_scanned"] == 1

        # Job untouched
        loaded = get_job(job["id"])
        assert loaded["skills"] == ["foo"]


class TestRewriteSkillRefsConsolidation:
    """Consolidated skills should be replaced with their umbrella target."""

    def test_single_skill_replaced(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(prompt="", schedule="every 1h", skills=["legacy-skill"])
        report = rewrite_skill_refs(
            consolidated={"legacy-skill": "umbrella-skill"},
            pruned=[],
        )
        assert report["jobs_updated"] == 1

        loaded = get_job(job["id"])
        assert loaded["skills"] == ["umbrella-skill"]
        # Legacy ``skill`` field realigned
        assert loaded["skill"] == "umbrella-skill"

    def test_multiple_skills_one_consolidated(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(
            prompt="",
            schedule="every 1h",
            skills=["keep-a", "legacy", "keep-b"],
        )
        rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])

        loaded = get_job(job["id"])
        # Ordering preserved, legacy replaced in-place
        assert loaded["skills"] == ["keep-a", "umbrella", "keep-b"]

    def test_umbrella_already_in_list_dedupes(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        # Job already loads the umbrella AND the legacy sub-skill
        job = create_job(
            prompt="",
            schedule="every 1h",
            skills=["umbrella", "legacy"],
        )
        rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])

        loaded = get_job(job["id"])
        # No duplicate: the umbrella stays exactly once
        assert loaded["skills"] == ["umbrella"]

    def test_rewrite_report_records_mapping(self, cron_env):
        from cron.jobs import create_job, rewrite_skill_refs

        job = create_job(
            prompt="",
            schedule="every 1h",
            skills=["a", "b"],
            name="my-job",
        )
        report = rewrite_skill_refs(
            consolidated={"a": "umbrella-a", "b": "umbrella-b"},
            pruned=[],
        )
        assert len(report["rewrites"]) == 1
        entry = report["rewrites"][0]
        assert entry["job_id"] == job["id"]
        assert entry["job_name"] == "my-job"
        assert entry["before"] == ["a", "b"]
        assert entry["after"] == ["umbrella-a", "umbrella-b"]
        assert entry["mapped"] == {"a": "umbrella-a", "b": "umbrella-b"}
        assert entry["dropped"] == []


class TestRewriteSkillRefsPruning:
    """Pruned skills should be dropped outright (no forwarding target)."""

    def test_pruned_skill_dropped(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(
            prompt="",
            schedule="every 1h",
            skills=["keep", "stale"],
        )
        report = rewrite_skill_refs(consolidated={}, pruned=["stale"])
        assert report["jobs_updated"] == 1

        loaded = get_job(job["id"])
        assert loaded["skills"] == ["keep"]
        assert loaded["skill"] == "keep"

    def test_all_skills_pruned_leaves_empty_list(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(prompt="", schedule="every 1h", skills=["gone"])
        rewrite_skill_refs(consolidated={}, pruned=["gone"])

        loaded = get_job(job["id"])
        assert loaded["skills"] == []
        assert loaded["skill"] is None

    def test_pruned_report_records_drops(self, cron_env):
        from cron.jobs import create_job, rewrite_skill_refs

        create_job(prompt="", schedule="every 1h", skills=["keep", "stale"])
        report = rewrite_skill_refs(consolidated={}, pruned=["stale"])

        entry = report["rewrites"][0]
        assert entry["dropped"] == ["stale"]
        assert entry["mapped"] == {}


class TestRewriteSkillRefsMixed:
    """Consolidation + pruning in the same pass."""

    def test_mixed_consolidation_and_pruning(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(
            prompt="",
            schedule="every 1h",
            skills=["keep", "legacy", "stale"],
        )
        rewrite_skill_refs(
            consolidated={"legacy": "umbrella"},
            pruned=["stale"],
        )

        loaded = get_job(job["id"])
        assert loaded["skills"] == ["keep", "umbrella"]

    def test_skill_in_both_maps_wins_as_consolidated(self, cron_env):
        """Defensive: if a skill appears in both lists (shouldn't happen
        in practice), prefer consolidation: it has a forwarding target,
        which is the more useful outcome."""
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        job = create_job(prompt="", schedule="every 1h", skills=["ambiguous"])
        rewrite_skill_refs(
            consolidated={"ambiguous": "umbrella"},
            pruned=["ambiguous"],
        )

        loaded = get_job(job["id"])
        assert loaded["skills"] == ["umbrella"]


class TestRewriteSkillRefsMultipleJobs:
    """Multiple jobs, some affected, some not."""

    def test_only_affected_jobs_reported(self, cron_env):
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        j1 = create_job(prompt="", schedule="every 1h", skills=["legacy"])
        j2 = create_job(prompt="", schedule="every 1h", skills=["untouched"])
        j3 = create_job(prompt="", schedule="every 1h", skills=[])

        report = rewrite_skill_refs(
            consolidated={"legacy": "umbrella"},
            pruned=[],
        )
        assert report["jobs_updated"] == 1
        assert report["jobs_scanned"] == 3
        assert len(report["rewrites"]) == 1
        assert report["rewrites"][0]["job_id"] == j1["id"]

        # Untouched jobs stay put
        assert get_job(j2["id"])["skills"] == ["untouched"]
        assert get_job(j3["id"])["skills"] == []

    def test_legacy_skill_field_also_rewritten(self, cron_env):
        """Old jobs may have the legacy single-skill ``skill`` field
        set instead of ``skills``. Both paths should be rewritten."""
        from cron.jobs import create_job, get_job, rewrite_skill_refs

        # Create via the legacy ``skill`` argument
        job = create_job(
            prompt="",
            schedule="every 1h",
            skill="legacy",
        )
        rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])

        loaded = get_job(job["id"])
        assert loaded["skills"] == ["umbrella"]
        assert loaded["skill"] == "umbrella"


class TestRewriteSkillRefsPersistence:
    """Rewrites persist to disk and survive a reload."""

    def test_changes_persist_across_reload(self, cron_env):
        import json
        from cron.jobs import create_job, rewrite_skill_refs, JOBS_FILE

        create_job(prompt="", schedule="every 1h", skills=["legacy"])
        rewrite_skill_refs(consolidated={"legacy": "umbrella"}, pruned=[])

        # Read raw file contents
        data = json.loads(JOBS_FILE.read_text())
        assert data["jobs"][0]["skills"] == ["umbrella"]
        assert data["jobs"][0]["skill"] == "umbrella"

    def test_noop_does_not_rewrite_file(self, cron_env):
        from cron.jobs import create_job, rewrite_skill_refs, JOBS_FILE

        create_job(prompt="", schedule="every 1h", skills=["keep"])
        mtime_before = JOBS_FILE.stat().st_mtime_ns

        # Nothing in the map matches
        report = rewrite_skill_refs(
            consolidated={"unrelated": "umbrella"},
            pruned=["other"],
        )
        assert report["jobs_updated"] == 0
        # File untouched: no pointless disk write
        assert JOBS_FILE.stat().st_mtime_ns == mtime_before
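
Taken together, the classes above specify the per-job rewrite rule: consolidated names map to their umbrella, pruned names are dropped, consolidation wins when a name appears in both inputs, order is preserved, and duplicates collapse. A minimal sketch of that rule for a single skills list follows. `rewrite_skills_sketch` is a hypothetical helper written for illustration; it does no file I/O and is not the removed `cron.jobs.rewrite_skill_refs`.

```python
from typing import Dict, Iterable, List, Tuple


def rewrite_skills_sketch(
    skills: Iterable[str],
    consolidated: Dict[str, str],
    pruned: Iterable[str],
) -> Tuple[List[str], Dict[str, str], List[str]]:
    """Return (after, mapped, dropped) for one job's skills list."""
    pruned_set = set(pruned)
    after: List[str] = []
    mapped: Dict[str, str] = {}
    dropped: List[str] = []
    for name in skills:
        if name in consolidated:
            # Checked first, so consolidation wins over pruning.
            target = consolidated[name]
            mapped[name] = target
        elif name in pruned_set:
            dropped.append(name)
            continue
        else:
            target = name
        if target not in after:
            # Keep first occurrence only; preserves the original ordering.
            after.append(target)
    return after, mapped, dropped
```

A caller could compare `after` with the original list to decide whether the job needs saving, which is one way to get the no-op behavior checked by `test_noop_does_not_rewrite_file`.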
-65
@@ -1,65 +0,0 @@
"""Shared fixtures for Feishu adapter tests (admission, group policy, dispatch)."""

from __future__ import annotations

import threading
from types import SimpleNamespace
from typing import Any, Optional


def make_sender(sender_type: str = "user", open_id: str = "ou_human",
                user_id: Optional[str] = None, union_id: Optional[str] = None) -> Any:
    return SimpleNamespace(
        sender_type=sender_type,
        sender_id=SimpleNamespace(open_id=open_id, user_id=user_id, union_id=union_id),
    )


def make_message(message_id: str = "om_xxx", chat_type: str = "p2p",
                 chat_id: str = "oc_1", mentions: Optional[list] = None) -> Any:
    return SimpleNamespace(
        message_id=message_id,
        chat_type=chat_type,
        chat_id=chat_id,
        mentions=mentions,
        content="",
        message_type="text",
    )


def make_adapter_skeleton(
    *,
    bot_open_id: str = "ou_me",
    bot_user_id: str = "",
    allow_bots: str = "none",
    require_mention: bool = True,
    group_policy: str = "allowlist",
) -> Any:
    from gateway.platforms.feishu import FeishuAdapter

    adapter = object.__new__(FeishuAdapter)
    adapter._bot_open_id = bot_open_id
    adapter._bot_user_id = bot_user_id
    adapter._bot_name = ""
    adapter._app_id = ""
    adapter._admins = set()
    adapter._group_rules = {}
    adapter._group_policy = group_policy
    adapter._default_group_policy = group_policy
    adapter._allowed_group_users = frozenset()
    adapter._allow_bots = allow_bots
    adapter._require_mention = require_mention
    return adapter


def install_dedup_state(adapter: Any, seen: Optional[dict] = None) -> None:
    adapter._seen_message_ids = dict(seen) if seen else {}
    adapter._seen_message_order = list((seen or {}).keys())
    adapter._dedup_cache_size = 100
    adapter._dedup_lock = threading.Lock()
    adapter._dedup_state_path = None
    adapter._persist_seen_message_ids = lambda: None


def stub_mention(adapter: Any, mentions_self: bool) -> None:
    adapter._mentions_self = lambda _message: mentions_self
-30
@@ -332,36 +332,6 @@ def auth_adapter():
    return _make_adapter(api_key="sk-secret")


# ---------------------------------------------------------------------------
# Adapter internals
# ---------------------------------------------------------------------------


class TestAgentExecution:
    @pytest.mark.asyncio
    async def test_run_agent_uses_session_id_as_task_id(self, adapter):
        mock_agent = MagicMock()
        mock_agent.run_conversation.return_value = {"final_response": "ok"}
        mock_agent.session_prompt_tokens = 1
        mock_agent.session_completion_tokens = 2
        mock_agent.session_total_tokens = 3

        with patch.object(adapter, "_create_agent", return_value=mock_agent):
            result, usage = await adapter._run_agent(
                user_message="hello",
                conversation_history=[],
                session_id="session-123",
            )

        assert result == {"final_response": "ok"}
        assert usage == {"input_tokens": 1, "output_tokens": 2, "total_tokens": 3}
        mock_agent.run_conversation.assert_called_once_with(
            user_message="hello",
            conversation_history=[],
            task_id="session-123",
        )


# ---------------------------------------------------------------------------
# /health endpoint
# ---------------------------------------------------------------------------
+4 -1
View File
@@ -253,7 +253,10 @@ class TestRunStatus:
await asyncio.sleep(0.05)
mock_agent.run_conversation.assert_called_once()
-assert mock_agent.run_conversation.call_args.kwargs["task_id"] == "space-session"
+# task_id stays "default" so the Runs API shares one sandbox
+# container with CLI/gateway; session_id is surfaced in status
+# for external UIs to correlate runs with their own session IDs.
+assert mock_agent.run_conversation.call_args.kwargs["task_id"] == "default"
assert status["session_id"] == "space-session"
@pytest.mark.asyncio
@@ -173,23 +173,6 @@ class TestBlockingGatewayApproval:
assert e1.event.is_set()
assert e2.event.is_set()
def test_clear_session_denies_and_signals_all_entries(self):
"""clear_session must wake blocked entries during boundary cleanup."""
from tools.approval import clear_session, _ApprovalEntry, _gateway_queues
session_key = "test-boundary-cleanup"
e1 = _ApprovalEntry({"command": "cmd1"})
e2 = _ApprovalEntry({"command": "cmd2"})
_gateway_queues[session_key] = [e1, e2]
clear_session(session_key)
assert e1.event.is_set()
assert e2.event.is_set()
assert e1.result == "deny"
assert e2.result == "deny"
assert session_key not in _gateway_queues
# ------------------------------------------------------------------
# /approve command
+10 -18
View File
@@ -64,13 +64,11 @@ async def test_compress_command_reports_noop_without_success_banner():
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
-agent_instance._cached_system_prompt = ""
-agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (list(history), "")
-def _estimate(messages, **_kwargs):
+def _estimate(messages):
assert messages == history
return 100
@@ -78,13 +76,13 @@ async def test_compress_command_reports_noop_without_success_banner():
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "test-key"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
-patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
+patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
assert "No changes from compression" in result
assert "Compressed:" not in result
-assert "Approx request size: ~100 tokens (unchanged)" in result
+assert "Rough transcript estimate: ~100 tokens (unchanged)" in result
agent_instance.shutdown_memory_provider.assert_called_once()
agent_instance.close.assert_called_once()
@@ -101,13 +99,11 @@ async def test_compress_command_explains_when_token_estimate_rises():
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
-agent_instance._cached_system_prompt = ""
-agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
-def _estimate(messages, **_kwargs):
+def _estimate(messages):
if messages == history:
return 100
if messages == compressed:
@@ -118,12 +114,12 @@ async def test_compress_command_explains_when_token_estimate_rises():
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "test-key"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
-patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
+patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
assert "Compressed: 4 → 3 messages" in result
-assert "Approx request size: ~100 → ~120 tokens" in result
+assert "Rough transcript estimate: ~100 → ~120 tokens" in result
assert "denser summaries" in result
agent_instance.shutdown_memory_provider.assert_called_once()
agent_instance.close.assert_called_once()
@@ -147,8 +143,6 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
-agent_instance._cached_system_prompt = ""
-agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
# Simulate summary-generation failure: fallback flag set, dropped count
# populated, error string captured.
@@ -160,7 +154,7 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
-def _estimate(messages, **_kwargs):
+def _estimate(messages):
if messages == history:
return 100
if messages == compressed:
@@ -171,7 +165,7 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "***"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
-patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
+patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
@@ -206,8 +200,6 @@ async def test_compress_command_surfaces_aux_model_failure_even_when_recovered()
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
-agent_instance._cached_system_prompt = ""
-agent_instance.tools = None
agent_instance.context_compressor.has_content_to_compress.return_value = True
# Fallback placeholder was NOT used — recovery succeeded.
agent_instance.context_compressor._last_summary_fallback_used = False
@@ -223,7 +215,7 @@ async def test_compress_command_surfaces_aux_model_failure_even_when_recovered()
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
-def _estimate(messages, **_kwargs):
+def _estimate(messages):
if messages == history:
return 100
if messages == compressed:
@@ -234,7 +226,7 @@ async def test_compress_command_surfaces_aux_model_failure_even_when_recovered()
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "***"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
-patch("agent.model_metadata.estimate_request_tokens_rough", side_effect=_estimate),
+patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
-96
View File
@@ -9,7 +9,6 @@ from gateway.config import (
Platform,
PlatformConfig,
SessionResetPolicy,
StreamingConfig,
_apply_env_overrides,
load_gateway_config,
)
@@ -150,24 +149,6 @@ class TestSessionResetPolicy:
assert restored.notify is False
class TestStreamingConfig:
def test_from_dict_coerces_quoted_false_enabled(self):
restored = StreamingConfig.from_dict({"enabled": "false"})
assert restored.enabled is False
def test_from_dict_malformed_numeric_values_fall_back_to_defaults(self):
restored = StreamingConfig.from_dict(
{
"edit_interval": "oops",
"buffer_threshold": "oops",
"fresh_final_after_seconds": "oops",
}
)
assert restored.edit_interval == 1.0
assert restored.buffer_threshold == 40
assert restored.fresh_final_after_seconds == 60.0
class TestGatewayConfigRoundtrip:
def test_full_roundtrip(self):
config = GatewayConfig(
@@ -213,26 +194,6 @@ class TestGatewayConfigRoundtrip:
restored = GatewayConfig.from_dict({"always_log_local": "false"})
assert restored.always_log_local is False
def test_get_notice_delivery_defaults_to_public(self):
config = GatewayConfig(
platforms={Platform.SLACK: PlatformConfig(enabled=True, token="***")}
)
assert config.get_notice_delivery(Platform.SLACK) == "public"
def test_get_notice_delivery_honors_platform_override(self):
config = GatewayConfig(
platforms={
Platform.SLACK: PlatformConfig(
enabled=True,
token="***",
extra={"notice_delivery": "private"},
),
}
)
assert config.get_notice_delivery(Platform.SLACK) == "private"
class TestLoadGatewayConfig:
def test_bridges_quick_commands_from_config_yaml(self, tmp_path, monkeypatch):
@@ -399,38 +360,6 @@ class TestLoadGatewayConfig:
"C01ABC": "Code review mode",
}
def test_bridges_feishu_allow_bots_from_config_yaml_to_env(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"feishu:\n allow_bots: mentions\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("FEISHU_ALLOW_BOTS", raising=False)
load_gateway_config()
assert os.environ.get("FEISHU_ALLOW_BOTS") == "mentions"
def test_feishu_allow_bots_env_takes_precedence_over_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"feishu:\n allow_bots: all\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setenv("FEISHU_ALLOW_BOTS", "none")
load_gateway_config()
assert os.environ.get("FEISHU_ALLOW_BOTS") == "none"
def test_invalid_quick_commands_in_config_yaml_are_ignored(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -477,22 +406,6 @@ class TestLoadGatewayConfig:
assert config.platforms[Platform.TELEGRAM].extra["disable_link_previews"] is True
def test_bridges_notice_delivery_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"slack:\n"
" notice_delivery: private\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.get_notice_delivery(Platform.SLACK) == "private"
def test_bridges_telegram_proxy_url_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -542,15 +455,6 @@ class TestHomeChannelEnvOverrides:
{"SLACK_HOME_CHANNEL": "C123", "SLACK_HOME_CHANNEL_NAME": "Ops"},
("C123", "Ops"),
),
(
Platform.WHATSAPP,
PlatformConfig(enabled=True),
{
"WHATSAPP_HOME_CHANNEL": "1234567890@lid",
"WHATSAPP_HOME_CHANNEL_NAME": "Owner DM",
},
("1234567890@lid", "Owner DM"),
),
(
Platform.SIGNAL,
PlatformConfig(

Some files were not shown because too many files have changed in this diff