Compare commits
31 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 9a85c99d58 | | |
| | e5dad4ac57 | | |
| | 180a7036bc | | |
| | 8fed969618 | | |
| | ded011c5a5 | | |
| | 71b685aee0 | | |
| | bbbce92651 | | |
| | 80a676658c | | |
| | c868425467 | | |
| | 59c1a13f45 | | |
| | 1d8068d71d | | |
| | 9ac4a2e53e | | |
| | 6bc5d72271 | | |
| | b737af8226 | | |
| | 73bf3ab1b2 | | |
| | 76edc40ab0 | | |
| | b9b9ee3e6c | | |
| | 8fbc9d7d78 | | |
| | 699a9c11a9 | | |
| | d60a9917d3 | | |
| | 7c07422202 | | |
| | ca7f46beb5 | | |
| | cb0e2e2f36 | | |
| | 4c0cc77e94 | | |
| | 9b62c98170 | | |
| | 469e4df3c2 | | |
| | ae11a31058 | | |
| | 3e200b64fb | | |
| | 1745cfc6d7 | | |
| | 58c07867e3 | | |
| | 4523965de9 | | |
@@ -0,0 +1,505 @@
# Hermes Agent v0.12.0 (v2026.4.30)
**Release Date:** April 30, 2026
**Since v0.11.0:** 1,096 commits · 550 merged PRs · 1,270 files changed · 217,776 insertions · 213 community contributors (including co-authors)
> The Curator release — Hermes Agent now maintains itself. An autonomous background Curator grades, prunes, and consolidates your skill library on its own schedule. The self-improvement loop that reviews what to save got a substantial upgrade. Also in this release: four new inference providers, an 18th messaging platform, a 19th via the Teams plugin, native Spotify + Google Meet integrations, ComfyUI and TouchDesigner-MCP moved from optional to bundled-by-default, and a ~57% cut to visible TUI cold start.
---
## ✨ Highlights
- **Autonomous Curator** — `hermes curator` runs as a background agent on the gateway's cron ticker (7-day cycle default). It grades your skill library, consolidates related skills, prunes dead ones, and writes per-run reports to `logs/curator/run.json` + `REPORT.md`. Archived skills are classified consolidated-vs-pruned via model + heuristic. Defense-in-depth gates protect bundled/hub skills from mutation. Unified under `auxiliary.curator` — pick the curator's model in `hermes model`, manage it from the dashboard. `hermes curator status` ranks skills by usage (most-used / least-used). ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277), [#17307](https://github.com/NousResearch/hermes-agent/pull/17307), [#17941](https://github.com/NousResearch/hermes-agent/pull/17941), [#17868](https://github.com/NousResearch/hermes-agent/pull/17868), [#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Self-improvement loop — substantially upgraded** — The background review fork (the core of Hermes' self-improvement: after each turn it decides what memories/skills to save or update) is now class-first (rubric-based rather than free-form), active-update biased (prefers the skill the agent just loaded), handles `references/`/`templates/` sub-files, and properly inherits the parent's live runtime (provider, model, credentials actually propagate). Restricted to memory + skills toolsets so it can't sprawl. Memory providers shut down cleanly. Prior-turn tool messages excluded from the summary so the fork sees a clean context. ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026), [#17213](https://github.com/NousResearch/hermes-agent/pull/17213), [#16099](https://github.com/NousResearch/hermes-agent/pull/16099), [#16569](https://github.com/NousResearch/hermes-agent/pull/16569), [#16204](https://github.com/NousResearch/hermes-agent/pull/16204), [#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
- **Skill integrations — major expansion** — **ComfyUI v5** with official CLI + REST + hardware-gated local install, moved from optional to **built-in by default** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734)). **TouchDesigner-MCP** bundled by default, expanded with GLSL, post-FX, audio, geometry, and 9 new reference docs ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753), [#16624](https://github.com/NousResearch/hermes-agent/pull/16624), [#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @kshitijk4poor + @SHL0MS). **Humanizer** skill ports a text-cleaner that strips AI-isms ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787)). **claude-design** HTML artifact skill + design-md (Google DESIGN.md spec) + airtable salvage + `skill_manage` edits in `external_dirs` + direct-URL skill install + `/reload-skills` slash command. ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358), [#14876](https://github.com/NousResearch/hermes-agent/pull/14876), [#16291](https://github.com/NousResearch/hermes-agent/pull/16291), [#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#16323](https://github.com/NousResearch/hermes-agent/pull/16323), [#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **LM Studio — first-class provider** — upgraded from a custom-endpoint alias to a full-blown native provider: dedicated auth, `hermes doctor` checks, reasoning transport, live `/models` listing. (Salvage of @kshitijk4poor's #17061.) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **Four more inference providers** — **GMI Cloud** (first-class, salvage of #11955 — @isaachuangGMICLOUD), **Azure AI Foundry** with auto-detection, **MiniMax OAuth** with PKCE browser flow (salvage of #15203), **Tencent Tokenhub** (salvage of #16860). ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663), [#15845](https://github.com/NousResearch/hermes-agent/pull/15845), [#17524](https://github.com/NousResearch/hermes-agent/pull/17524), [#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
- **Pluggable gateway platforms + Microsoft Teams** — the gateway is now a plugin host. Drop-in messaging adapters live outside the core, and Microsoft Teams is the first plugin-shipped platform. (Salvage of #17664.) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751), [#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Tencent 元宝 (Yuanbao) — 18th messaging platform** — native gateway adapter with text + media delivery. ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424))
- **Spotify — native tools + bundled skill + wizard** — 7 tools (play, search, queue, playlists, devices) behind PKCE OAuth, interactive setup wizard, bundled skill, surfacing in `hermes tools`, cron usage documented. ([#15121](https://github.com/NousResearch/hermes-agent/pull/15121), [#15130](https://github.com/NousResearch/hermes-agent/pull/15130), [#15154](https://github.com/NousResearch/hermes-agent/pull/15154), [#15180](https://github.com/NousResearch/hermes-agent/pull/15180))
- **Google Meet plugin** — join calls, transcribe, speak, follow up. Realtime OpenAI transport + Node bot server, full pipeline bundled as a plugin. ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364))
- **`hermes -z` one-shot mode + `hermes update --check`** — non-interactive `hermes -z <prompt>` with `--model`/`--provider`/`HERMES_INFERENCE_MODEL`. `hermes update --check` preflight. Opt-in pre-update HERMES_HOME backup. ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702), [#15704](https://github.com/NousResearch/hermes-agent/pull/15704), [#15841](https://github.com/NousResearch/hermes-agent/pull/15841), [#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Models dashboard tab + in-browser model config** — rich per-model analytics, switch main + auxiliary models from the dashboard. ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745), [#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Remote model catalog manifest** — OpenRouter + Nous Portal model catalogs are now pulled from a remote manifest so new models show up without a release. ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- **Native multimodal image routing** — images now route based on the model's actual vision capability rather than provider defaults. ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Gateway media parity** — native multi-image sending across Telegram, Discord, Slack, Mattermost, Email, and Signal; centralized audio routing with FLAC support + Telegram document fallback. ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909), [#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **TUI catches up to (and past) the classic CLI** — LaTeX rendering (@austinpickett), `/reload` .env hot-reload, pluggable busy-indicator styles (@OutThisLife, #13610), opt-in auto-resume of last session, expanded light-terminal auto-detection, session delete from `/resume` picker with `d`, modified mouse-wheel line scroll, and a `/mouse` toggle that kills ConPTY's phantom mouse injection (@kevin-ho). ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175), [#17286](https://github.com/NousResearch/hermes-agent/pull/17286), [#17150](https://github.com/NousResearch/hermes-agent/pull/17150), [#17130](https://github.com/NousResearch/hermes-agent/pull/17130), [#17113](https://github.com/NousResearch/hermes-agent/pull/17113), [#17668](https://github.com/NousResearch/hermes-agent/pull/17668), [#17669](https://github.com/NousResearch/hermes-agent/pull/17669), [#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- **Observability + achievements plugins** — bundled Langfuse observability plugin (salvage #16845) + bundled hermes-achievements plugin that scans full session history. ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917), [#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **TTS provider registry + Piper local TTS** — pluggable `tts.providers.<name>` registry; Piper ships as a native local TTS provider. (Closes #8508.) ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843), [#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Vercel Sandbox backend** — Vercel sandboxes as an execute_code/terminal backend (@kshitijk4poor). ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Secret redaction off by default** — the default is now off. This prevents the long-standing patch-corruption incidents in which secret-shaped substrings that were not real secrets got redacted, mangling tool outputs. Opt in via `redaction.enabled: true` when you need it. ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **Cold-start performance** — visible TUI cold start cut **~57%** via lazy agent init (@OutThisLife), lazy imports of OpenAI / Anthropic / Firecrawl / account_usage, mtime-cached `load_config()`, memoized `get_tool_definitions()` with TTL-cached `check_fn` results, precompiled dangerous-command patterns. ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190), [#17046](https://github.com/NousResearch/hermes-agent/pull/17046), [#17041](https://github.com/NousResearch/hermes-agent/pull/17041), [#17098](https://github.com/NousResearch/hermes-agent/pull/17098), [#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Configurable prompt cache TTL** — `prompt_caching.cache_ttl` (5m default, 1h opt-in — cost savings for bursty sessions that keep cache warm). Salvage of #12659. ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
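
The mtime-cached `load_config()` mentioned under cold-start performance can be pictured with a small sketch. This is only an illustration under stated assumptions: the helper name `load_config_cached`, the module-level cache, and the JSON file format are all hypothetical, and the real Hermes implementation may look quite different.

```python
import json
import os

# Hypothetical cache: path -> (mtime in nanoseconds, parsed config).
_config_cache = {}


def load_config_cached(path):
    """Parse the config file only when its modification time has changed."""
    mtime_ns = os.stat(path).st_mtime_ns
    cached = _config_cache.get(path)
    if cached is not None and cached[0] == mtime_ns:
        # File is unchanged since the last parse, so reuse the parsed object.
        return cached[1]
    with open(path, "r", encoding="utf-8") as fh:
        config = json.load(fh)  # JSON is an assumption made for this sketch
    _config_cache[path] = (mtime_ns, config)
    return config
```

The trade-off in this kind of cache is that a single `os.stat` call replaces a full parse on every access, at the cost of missing edits that do not change the file's modification time.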
---
## 🧠 Autonomous Curator & Self-Improvement Loop
### Curator — autonomous skill maintenance
- **`hermes curator` as a background agent** — runs on the gateway's cron ticker, 7-day cycle by default, umbrella-first prompt, inherits parent config, unbounded iterations ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277) — issue #7816)
- **Per-run reports** — `logs/curator/run.json` + `REPORT.md` per cycle ([#17307](https://github.com/NousResearch/hermes-agent/pull/17307))
- **Consolidated vs pruned classification** — archived skills split with model + heuristic ([#17941](https://github.com/NousResearch/hermes-agent/pull/17941))
- **`hermes curator status`** — ranks skills by usage, shows most-used and least-used ([#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Unified under `auxiliary.curator`** — pick the model in `hermes model`, configure from the dashboard ([#17868](https://github.com/NousResearch/hermes-agent/pull/17868))
- **Documentation** — dedicated curator feature page on the docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- Fix: seed defaults on update, create `logs/curator/` directory, defer fire import ([#17927](https://github.com/NousResearch/hermes-agent/pull/17927))
- Fix: scan nested archive subdirs in `restore_skill` (@0xDevNinja) ([#17951](https://github.com/NousResearch/hermes-agent/pull/17951))
- Fix: use actual skill activity in curator status (@y0shua1ee) ([#17953](https://github.com/NousResearch/hermes-agent/pull/17953))
- Fix: `skill_manage` refuses writes on pinned skills; pinning now blocks curator writes ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562), [#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- Fix: `bump_use()` wired into skill invocation + preload + skill_view (salvage #17782) ([#17932](https://github.com/NousResearch/hermes-agent/pull/17932))
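
To make the usage ranking behind `hermes curator status` more concrete, here is a rough sketch of counting skill uses and listing the most-used and least-used skills. The `UsageTracker` class, its in-memory counter, and the `rank` method are hypothetical; only the `bump_use` name comes from the notes above, and its real signature and storage are not shown there.

```python
from collections import Counter


class UsageTracker:
    """Hypothetical in-memory stand-in for the curator's usage bookkeeping."""

    def __init__(self):
        self.counts = Counter()

    def bump_use(self, skill_name):
        # Record one use of the skill (invocation, preload, or view).
        self.counts[skill_name] += 1

    def rank(self, top_n=3):
        """Return (most_used, least_used) lists of (skill, count) pairs."""
        most_used = self.counts.most_common(top_n)
        # Sort ascending by count, then by name to keep ties deterministic.
        least_used = sorted(self.counts.items(), key=lambda kv: (kv[1], kv[0]))[:top_n]
        return most_used, least_used
```

A real curator would presumably persist these counts between runs; the sketch keeps them in memory to stay self-contained.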
### Self-improvement loop (background review fork)
- **Class-first skill-review prompt** — rubric-based grading rather than free-form "should this update" ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026))
- **Active-update bias** — prefers updating skills the agent just loaded, handles `references/` + `templates/` sub-files ([#17213](https://github.com/NousResearch/hermes-agent/pull/17213))
- **Fork inherits parent's live runtime** — provider, model, credentials actually propagate now ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Scoped toolsets** — review fork restricted to memory + skills (no shell, no web) ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Clean shutdown** — background review memory providers exit properly (salvage #15289) ([#16204](https://github.com/NousResearch/hermes-agent/pull/16204))
- **Clean context** — prior-history tool messages excluded from review summary (salvage #14967) ([#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
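
The scoped-toolset and clean-context items above amount to two filters applied before the review fork runs. The sketch below shows one plausible shape for them. The tool and message dictionaries, the `toolset` key, and both function names are assumptions made for illustration, not the actual Hermes data model.

```python
# Toolsets the review fork may use, per the notes above (memory + skills only).
REVIEW_TOOLSETS = frozenset({"memory", "skills"})


def scope_tools(tools, allowed=REVIEW_TOOLSETS):
    """Keep only tools whose (hypothetical) 'toolset' field is allowed."""
    return [tool for tool in tools if tool.get("toolset") in allowed]


def drop_tool_messages(history):
    """Remove prior tool-result messages so the fork sees a clean context."""
    return [msg for msg in history if msg.get("role") != "tool"]
```

Filtering up front, rather than rejecting calls at execution time, means the fork never sees the shell or web tools in the first place.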
---
## 🧩 Skills Ecosystem
### Skill integrations — newly bundled or promoted
- **ComfyUI v5** — official CLI + REST + hardware-gated local install; **moved from optional to built-in** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734), [#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **TouchDesigner-MCP** — **bundled by default** ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753) — @kshitijk4poor), expanded with GLSL, post-FX, audio, geometry references ([#16624](https://github.com/NousResearch/hermes-agent/pull/16624)), 9 new reference docs ([#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @SHL0MS)
- **Humanizer** — strips AI-isms from text ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787))
- **claude-design** — HTML artifact skill with disambiguation from other design skills ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358))
- **design-md** — Google's DESIGN.md spec skill ([#14876](https://github.com/NousResearch/hermes-agent/pull/14876))
- **airtable** — salvaged skill + skill API keys wired into `.env` (#15838) ([#16291](https://github.com/NousResearch/hermes-agent/pull/16291))
- **pretext** — creative browser demos with @chenglou/pretext ([#17259](https://github.com/NousResearch/hermes-agent/pull/17259))
- **spike** + **sketch** — throwaway experiments + HTML mockups, adapted from gsd-build ([#17421](https://github.com/NousResearch/hermes-agent/pull/17421))
### Skills UX
- **Install skills from a direct HTTP(S) URL** — `hermes skills install <url>` ([#16323](https://github.com/NousResearch/hermes-agent/pull/16323))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **`hermes skills list`** shows enabled/disabled status ([#16129](https://github.com/NousResearch/hermes-agent/pull/16129))
- **`skill_manage` refuses writes on pinned skills** ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562))
- **`skill_manage` edits external_dirs skills in place** (salvage #9966) ([#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#17289](https://github.com/NousResearch/hermes-agent/pull/17289))
- Fix: inline-shell rendering in `skill_view` ([#15376](https://github.com/NousResearch/hermes-agent/pull/15376))
- Fix: exclude `.archive/` from skill index walk (salvage #17639) ([#17931](https://github.com/NousResearch/hermes-agent/pull/17931))
- Fix: dedicated docs page per bundled + optional skill ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929))
- Fix: `google-workspace` shared HERMES_HOME helper + ship deps as optional extra ([#15405](https://github.com/NousResearch/hermes-agent/pull/15405))
- Fix: auto-wrap ASCII-art code blocks in generated skill pages ([#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
- Point agent at `hermes-agent` skill + docs site for Hermes questions ([#16535](https://github.com/NousResearch/hermes-agent/pull/16535))
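
The `.archive/` exclusion in the skill index walk can be illustrated with the standard `os.walk` pruning idiom. This is a sketch, not the project's code: the function name `iter_skill_files` and the `SKILL.md` marker file are assumptions.

```python
import os


def iter_skill_files(root, skip_dirs=(".archive",), marker="SKILL.md"):
    """Yield marker files under root, never descending into skipped dirs."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Mutating dirnames in place tells os.walk not to visit those subtrees.
        dirnames[:] = [name for name in dirnames if name not in skip_dirs]
        if marker in filenames:
            yield os.path.join(dirpath, marker)
```

Pruning `dirnames` in place is cheaper than filtering results afterwards, because archived subtrees are never traversed at all.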
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
#### New providers
- **GMI Cloud** — first-class API-key provider on par with Arcee/Kilocode/Xiaomi (salvage of #11955 — @isaachuangGMICLOUD) ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663))
- **Azure AI Foundry** — auto-detection, full wiring ([#15845](https://github.com/NousResearch/hermes-agent/pull/15845))
- **LM Studio** — upgraded from custom-endpoint alias to first-class provider: dedicated auth, doctor checks, reasoning transport, live `/models` (salvage of #17061 — @kshitijk4poor) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **MiniMax OAuth** — PKCE browser flow with full OAuth integration (salvage #15203) ([#17524](https://github.com/NousResearch/hermes-agent/pull/17524))
- **Tencent Tokenhub** — new provider (salvage of #16860) ([#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
#### Model catalog
- **Remote model catalog manifest** — OpenRouter + Nous Portal catalogs pulled from remote manifest so new models show up without a release ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- `openai/gpt-5.5` and `gpt-5.5-pro` added to OpenRouter + Nous Portal ([#15343](https://github.com/NousResearch/hermes-agent/pull/15343))
- `deepseek-v4-pro` and `deepseek-v4-flash` added ([#14934](https://github.com/NousResearch/hermes-agent/pull/14934))
- `qwen3.6-plus` added to Alibaba-supported models ([#16896](https://github.com/NousResearch/hermes-agent/pull/16896))
- Gemini free-tier keys blocked at setup with 429 guidance surfacing ([#15100](https://github.com/NousResearch/hermes-agent/pull/15100))
#### Model configuration
- **Configurable `prompt_caching.cache_ttl`** — 5m default, 1h opt-in (salvage #12659) ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
- `/fast` whitelist broadened to all OpenAI + Anthropic models ([#16883](https://github.com/NousResearch/hermes-agent/pull/16883))
- `auxiliary.extra_body.reasoning` translates into Codex Responses API ([#17004](https://github.com/NousResearch/hermes-agent/pull/17004))
- `hermes fallback` command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
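
The `prompt_caching.cache_ttl` setting above is described with the values 5m (default) and 1h (opt-in). As a hedged illustration of how such duration strings might be turned into seconds, consider the helper below. The `parse_ttl` function and the exact accepted syntax are assumptions; the notes do not say how Hermes validates this value.

```python
import re

# Seconds per unit suffix accepted by this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_ttl(value):
    """Convert a string such as '5m' or '1h' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported TTL value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For example, `parse_ttl("5m")` gives 300 seconds and `parse_ttl("1h")` gives 3600.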
### Agent Loop & Conversation
- **Native multimodal image routing** — based on model vision capability, not provider defaults ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Delegate `child_timeout_seconds` default bumped to 600s** ([#14809](https://github.com/NousResearch/hermes-agent/pull/14809))
- **Diagnostic dump when subagent times out with 0 API calls** ([#15105](https://github.com/NousResearch/hermes-agent/pull/15105))
- **Gateway busts cached agent on compression/context_length config edits** ([#17008](https://github.com/NousResearch/hermes-agent/pull/17008))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- `/reload-mcp` awareness — rebuild cached agents + prompt-cache cost confirmation ([#17729](https://github.com/NousResearch/hermes-agent/pull/17729))
- Fix: repair CamelCase + `_tool` suffix tool-call emissions ([#15124](https://github.com/NousResearch/hermes-agent/pull/15124))
- Fix: retry on `json.JSONDecodeError` instead of treating as local validation error ([#15107](https://github.com/NousResearch/hermes-agent/pull/15107))
- Fix: handle unescaped control chars in `tool_call.arguments` ([#15356](https://github.com/NousResearch/hermes-agent/pull/15356))
- Fix: ordering in `_copy_reasoning_content_for_api` — cross-provider reasoning isolation (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749))
- Fix: inject empty `reasoning_content` for DeepSeek/Kimi `tool_calls` unconditionally (@Zjianru) ([#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- Fix: persist streamed `reasoning_content` on assistant turns (#16844) ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- Fix: cancel coroutine on timeout so worker thread exits; full traceback on tool failure ([#17428](https://github.com/NousResearch/hermes-agent/pull/17428))
- Fix: isolate `get_tool_definitions` quiet_mode cache + dedup LCM injection (#17335) ([#17889](https://github.com/NousResearch/hermes-agent/pull/17889))
- Fix: serialize concurrent `hermes_tools` RPC calls from `execute_code` (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- Fix: rename `[SYSTEM:` → `[IMPORTANT:` in all user-injected markers (dodges Azure content filter) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
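
Two of the fixes above, repairing CamelCase or `_tool`-suffixed tool names and tolerating unescaped control characters in `tool_call.arguments`, lend themselves to a short sketch. The function names and the matching strategy here are assumptions for illustration only. The one library fact relied on is that Python's `json.loads(..., strict=False)` accepts raw control characters inside strings.

```python
import json
import re


def normalize_tool_name(name, known_tools):
    """Map a malformed tool name onto a registered one, or return None."""
    if name in known_tools:
        return name
    # CamelCase -> snake_case: insert "_" before every non-leading capital.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()
    candidates = [snake]
    if snake.endswith("_tool"):
        candidates.append(snake[: -len("_tool")])
    for candidate in candidates:
        if candidate in known_tools:
            return candidate
    return None


def parse_tool_arguments(raw):
    """Parse JSON arguments, allowing literal newlines or tabs in strings."""
    return json.loads(raw, strict=False)
```

The CamelCase regex is deliberately naive and would split acronyms such as `HTTPGet` letter by letter, so a production version would likely need more care.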
### Compression
- **Retry summary on main model for unknown errors before giving up** ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774))
- **Notify users when configured aux model fails even if main-model fallback recovers** ([#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- `/compress` wrapped in `_busy_command` to block input during compression ([#15388](https://github.com/NousResearch/hermes-agent/pull/15388))
- Fix: reserve system + tools headroom when aux binds threshold ([#15631](https://github.com/NousResearch/hermes-agent/pull/15631))
- Fix: use text-char sum for multimodal token estimation in `_find_tail_cut_by_tokens` ([#16369](https://github.com/NousResearch/hermes-agent/pull/16369))
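
The multimodal token-estimation fix above can be read as "count only the text parts of a message". A minimal sketch of that idea follows, assuming OpenAI-style content parts and a rough four-characters-per-token heuristic. Both are assumptions: the actual `_find_tail_cut_by_tokens` logic is not shown in these notes.

```python
def estimate_tokens(message, chars_per_token=4):
    """Estimate tokens from text characters only, ignoring image payloads."""
    content = message.get("content") or ""
    if isinstance(content, str):
        text_chars = len(content)
    else:
        # Multimodal content: sum only the parts that carry text.
        text_chars = sum(
            len(part.get("text", ""))
            for part in content
            if part.get("type") == "text"
        )
    return text_chars // chars_per_token
```

Without this distinction, a base64-encoded image embedded in the content list could inflate the estimate by thousands of tokens.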
### Session, Memory & State
- **Trigram FTS5 index for CJK search, replace LIKE fallback** (@alt-glitch) ([#16651](https://github.com/NousResearch/hermes-agent/pull/16651))
- **Index `tool_name` + `tool_calls` in FTS5, with repair + migration** (salvages #16866) ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Checkpoints: auto-prune orphan and stale shadow repos at startup** ([#16303](https://github.com/NousResearch/hermes-agent/pull/16303))
- **Memory providers notified on mid-process session_id rotation** (#6672) ([#17409](https://github.com/NousResearch/hermes-agent/pull/17409))
- Fix: quote underscored terms in FTS5 query sanitization ([#16915](https://github.com/NousResearch/hermes-agent/pull/16915))
- Fix: resolve viking_read 500/412 on file URIs + pseudo-summary URIs (salvage #5886) ([#17869](https://github.com/NousResearch/hermes-agent/pull/17869))
- Fix: skip external-provider sync on interrupted turns ([#15395](https://github.com/NousResearch/hermes-agent/pull/15395))
- Fix: close embedded Hindsight async client cleanly (salvage #14605) ([#16209](https://github.com/NousResearch/hermes-agent/pull/16209))
- Fix: pass session transcript to `shutdown_memory_provider` on gateway + CLI (#15165) ([#16571](https://github.com/NousResearch/hermes-agent/pull/16571))
- Fix: write-origin metadata seam ([#15346](https://github.com/NousResearch/hermes-agent/pull/15346))
- Fix: preserve symlinks during atomic file writes ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- Refactor: remove `flush_memories` entirely ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
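
The "preserve symlinks during atomic file writes" fix above addresses a common pitfall: renaming a temp file over a symlink replaces the link itself rather than its target. One way to avoid that, sketched here with a hypothetical `atomic_write` helper, is to resolve the link before writing. This is an illustration of the general technique, not the project's implementation.

```python
import os
import tempfile


def atomic_write(path, data):
    """Atomically write text to path, keeping path a symlink if it is one."""
    # Resolve symlinks so the final rename lands on the real file.
    target = os.path.realpath(path)
    directory = os.path.dirname(target) or "."
    # The temp file lives in the target's directory so os.replace stays
    # on one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

After `atomic_write(link, "new")`, the link still points at the original file and that file holds the new content.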
### Auxiliary models
- Fix: surface auxiliary failures in UI (previously silent) ([#15324](https://github.com/NousResearch/hermes-agent/pull/15324))
- Fix: surface title-gen auxiliary failures instead of silently dropping ([#16371](https://github.com/NousResearch/hermes-agent/pull/16371))
- Fix: generalize unsupported-parameter detector and harden `max_tokens` retry ([#15633](https://github.com/NousResearch/hermes-agent/pull/15633))
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **Microsoft Teams (19th platform)** — shipped as a plugin, plus an xdist collision guard ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Yuanbao (Tencent 元宝, 18th platform)** — native adapter with text + media delivery ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424), [#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
### Pluggable Gateway Platforms
- **Drop-in messaging adapters** — the gateway is now a plugin host for platforms (salvage of #17664) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
### Telegram
- **Chat allowlists for groups and forums** (@web3blind) ([#15027](https://github.com/NousResearch/hermes-agent/pull/15027))
- **Send fresh finals for stale preview streams** (port openclaw#72038) ([#16261](https://github.com/NousResearch/hermes-agent/pull/16261))
- **Render markdown tables as row-group bullets + prompt hint** ([#16997](https://github.com/NousResearch/hermes-agent/pull/16997))
- Document fallback in centralized audio routing ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
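
Rendering markdown tables as row-group bullets, as listed above for Telegram, could look roughly like the following. The exact output format Hermes produces is not given in these notes, so the layout here (one top-level bullet per row, nested bullets for the remaining columns) and the `table_to_bullets` name are assumptions.

```python
def table_to_bullets(table_md):
    """Convert a simple pipe table into nested bullets, one group per row."""
    lines = [ln.strip() for ln in table_md.strip().splitlines() if ln.strip()]
    rows = [[cell.strip() for cell in ln.strip("|").split("|")] for ln in lines]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    out = []
    for row in body:
        # First column becomes the group heading for this row.
        out.append(f"- {header[0]}: {row[0]}")
        for name, cell in zip(header[1:], row[1:]):
            out.append(f"  - {name}: {cell}")
    return "\n".join(out)
```

This sketch handles only well-formed tables with a header and separator row; escaped pipes and empty trailing cells would need extra handling.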
### Discord
- **Opt-in toolsets + ID injection + tool split + Feishu wiring** (salvage #15457, #15458) ([#15610](https://github.com/NousResearch/hermes-agent/pull/15610), [#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
- Fix: coerce `limit` parameter to int before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
### Slack
- **Register every gateway command as a native slash (Discord/Telegram parity)** ([#16164](https://github.com/NousResearch/hermes-agent/pull/16164))
- **`strict_mention` config** — prevents thread auto-engagement ([#16193](https://github.com/NousResearch/hermes-agent/pull/16193))
- **`channel_skill_bindings`** — bind specific skills to specific Slack channels ([#16283](https://github.com/NousResearch/hermes-agent/pull/16283))
### Signal
- **Native formatting** — markdown → bodyRanges, reply quotes, reactions ([#17417](https://github.com/NousResearch/hermes-agent/pull/17417))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Feishu / Mattermost / Email / Signal
- All participate in **native multi-image sending** ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Gateway Core
- **Centralized audio routing + FLAC support + Telegram doc fallback** ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **Native multi-image sending** across Telegram, Discord, Slack, Mattermost, Email, Signal ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
- **Make hygiene hard message limit configurable** ([#17000](https://github.com/NousResearch/hermes-agent/pull/17000))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- **`pre_gateway_dispatch` hook** — plugins can intercept before dispatch ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` / `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- Fix: timeouts — guard `load_config()` call against runtime exceptions ([#16318](https://github.com/NousResearch/hermes-agent/pull/16318))
- Fix: support passing handler tools via registry ([#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
---
## 🔧 Tool System
### Plugin-first architecture
- **Pluggable gateway platforms** — platforms can ship as plugins ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
- **Microsoft Teams as first plugin-shipped platform** ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **`pre_gateway_dispatch` hook** ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` + `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- **`duration_ms` on `post_tool_call`** (inspired by Claude Code 2.1.119) ([#15429](https://github.com/NousResearch/hermes-agent/pull/15429))
- **Bundled plugins**: Spotify ([#15174](https://github.com/NousResearch/hermes-agent/pull/15174)), Google Meet ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364)), Langfuse observability ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917)), hermes-achievements ([#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **Page-scoped plugin slots for built-in dashboard pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
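
To show how a `duration_ms` value might reach a `post_tool_call` hook, here is a small sketch of a hook registry that times each tool call. Only the hook name and the `duration_ms` field come from the notes above. The `HookRegistry` class, `call_tool` wrapper, and payload keys are hypothetical and do not describe the actual Hermes plugin API.

```python
import time


class HookRegistry:
    """Hypothetical registry mapping event names to callbacks."""

    def __init__(self):
        self._hooks = {}

    def register(self, event, callback):
        self._hooks.setdefault(event, []).append(callback)

    def emit(self, event, **payload):
        for callback in self._hooks.get(event, []):
            callback(**payload)


def call_tool(registry, tool_name, fn, **kwargs):
    """Run a tool and report its wall-clock duration to post_tool_call hooks."""
    start = time.perf_counter()
    try:
        return fn(**kwargs)
    finally:
        # Emitted even when the tool raises, so failures are timed too.
        duration_ms = (time.perf_counter() - start) * 1000.0
        registry.emit("post_tool_call", tool_name=tool_name, duration_ms=duration_ms)
```

A plugin could then register a callback for `post_tool_call` and log or aggregate the reported durations.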
### Browser
- **CDP supervisor** — dialog detection + response + cross-origin iframe eval ([#14540](https://github.com/NousResearch/hermes-agent/pull/14540))
- **Auto-spawn local Chromium for LAN/localhost URLs** when cloud provider is configured ([#16136](https://github.com/NousResearch/hermes-agent/pull/16136))
### Execute code / Terminal
- **Vercel Sandbox backend** for `execute_code` / terminal (@kshitijk4poor) ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
|
||||
- **Collapse subagent `task_id`s to shared container** ([#16177](https://github.com/NousResearch/hermes-agent/pull/16177))
|
||||
- **Docker: run container as host user** to avoid root-owned bind mounts (@benbarclay) ([#17305](https://github.com/NousResearch/hermes-agent/pull/17305))
|
||||
- Fix: safely quote `~/` subpaths in wrapped `cd` commands ([#15394](https://github.com/NousResearch/hermes-agent/pull/15394))
|
||||
- Fix: close file descriptor in `LocalEnvironment._update_cwd` ([#17300](https://github.com/NousResearch/hermes-agent/pull/17300))
|
||||
- Fix: SSH — prevent tar from overwriting remote home dir permissions ([#17898](https://github.com/NousResearch/hermes-agent/pull/17898), [#17867](https://github.com/NousResearch/hermes-agent/pull/17867))
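
The `~/` quoting fix above can be illustrated with a small sketch. This is not the code from #15394; the helper names `quote_cd_target` and `wrap_cd` are hypothetical. It only shows the general problem: quoting the whole path stops the shell from expanding `~`, so the `~/` prefix is left bare and only the remainder is quoted.

```python
import shlex


def quote_cd_target(path: str) -> str:
    """Quote a `cd` target for a POSIX shell while keeping a leading ~/ expandable.

    Hypothetical helper for illustration; not the Hermes implementation.
    """
    if path == "~":
        return "~"
    if path.startswith("~/"):
        rest = path[2:]
        # Quoting the whole string would make ~ a literal character,
        # so only the part after ~/ goes through shlex.quote.
        return "~/" + shlex.quote(rest) if rest else "~/"
    return shlex.quote(path)


def wrap_cd(path: str, command: str) -> str:
    """Build `cd <target> && <command>` with a safely quoted target."""
    return f"cd {quote_cd_target(path)} && {command}"
```

For example, `wrap_cd("~/my dir", "ls")` keeps the tilde outside the quotes, so the shell still expands it to the home directory.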
|
||||
|
||||
### Image generation
|
||||
- See Provider section for updates; no new image providers this window.
|
||||
|
||||
### TTS / Voice
|
||||
- **Pluggable TTS provider registry** under `tts.providers.<name>` ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843))
|
||||
- **Piper** as native local TTS provider (closes #8508) ([#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
|
||||
- **Voice mode CLI parity in the TUI** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
|
||||
- Fix: vision — use HERMES_HOME-based cache dir instead of cwd ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
|
||||
|
||||
### Cron
|
||||
- **Honor `hermes tools` config for the cron platform** ([#14798](https://github.com/NousResearch/hermes-agent/pull/14798))
|
||||
- **Per-job `workdir`** — project-aware cron runs ([#15110](https://github.com/NousResearch/hermes-agent/pull/15110))
|
||||
- **`context_from` field** — chain cron job outputs ([#15606](https://github.com/NousResearch/hermes-agent/pull/15606))
|
||||
- Fix: promote `croniter` to a core dependency ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
|
||||
|
||||
### Web search
|
||||
- **Expose `limit` for `web_search`** ([#16934](https://github.com/NousResearch/hermes-agent/pull/16934))
|
||||
|
||||
### Maps
|
||||
- Fix: include seconds in timezone UTC offset output ([#16300](https://github.com/NousResearch/hermes-agent/pull/16300))
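
A minimal sketch of an offset formatter that does not drop seconds, assuming the offset is available as a `timedelta`. The function name `format_utc_offset` and the `UTC+HH:MM[:SS]` output shape are assumptions for illustration; the release notes do not show the actual output format.

```python
from datetime import timedelta


def format_utc_offset(offset: timedelta) -> str:
    """Render an offset as UTC+HH:MM, adding :SS when the seconds are non-zero.

    Hypothetical helper; the real tool's format may differ.
    """
    total = int(offset.total_seconds())
    sign = "-" if total < 0 else "+"
    hours, remainder = divmod(abs(total), 3600)
    minutes, seconds = divmod(remainder, 60)
    text = f"UTC{sign}{hours:02d}:{minutes:02d}"
    if seconds:
        # Offsets with a seconds component would otherwise be silently truncated.
        text += f":{seconds:02d}"
    return text
```

Whether seconds should always be printed or only when non-zero is a design choice; this sketch prints them only when they carry information.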
|
||||
|
||||
### Approvals
|
||||
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
|
||||
- Perf: precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
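
The two approval entries above combine naturally: a hardline blocklist is a set of regular expressions, and compiling them once at import time avoids recompiling on every check. The sketch below assumes that shape. The patterns shown are hypothetical examples, not the contents of the real `HARDLINE_PATTERNS`, and `is_hardline_blocked` is an illustrative name.

```python
import re

# Hypothetical patterns for illustration only; compiled once at import time.
HARDLINE_PATTERNS = [
    re.compile(pattern)
    for pattern in (
        r"\brm\s+-\w*r\w*\s+/(\s|$)",   # recursive rm aimed at the filesystem root
        r"\bmkfs(\.\w+)?\b",            # formatting a filesystem
        r"\bdd\b.*\bof=/dev/(sd|nvme|hd)",  # raw writes to a block device
    )
]


def is_hardline_blocked(command: str) -> bool:
    """Return True if the command matches any precompiled hardline pattern."""
    return any(pattern.search(command) for pattern in HARDLINE_PATTERNS)
```

A hardline match would be rejected outright rather than sent through the normal approval prompt; how Hermes surfaces that to the user is not shown here.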
|
||||
|
||||
### ACP
|
||||
- **Advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
|
||||
|
||||
### API Server
|
||||
- **POST `/v1/runs/{run_id}/stop`** (salvage of #15656) ([#15842](https://github.com/NousResearch/hermes-agent/pull/15842))
|
||||
- **Expose run status for external UIs** (#17085) ([#17458](https://github.com/NousResearch/hermes-agent/pull/17458))
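
As a rough sketch of how an external UI might call the stop endpoint listed above: only the path `/v1/runs/{run_id}/stop` and the POST method come from the release notes. The base URL, the empty request body, and the helper name `build_stop_request` are assumptions, and any authentication the server requires is omitted.

```python
import urllib.request
from urllib.parse import quote


def build_stop_request(base_url: str, run_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to stop a run.

    Hypothetical helper; base_url and the empty body are assumptions.
    """
    url = f"{base_url.rstrip('/')}/v1/runs/{quote(run_id, safe='')}/stop"
    return urllib.request.Request(url, data=b"", method="POST")
```

Sending it would be `urllib.request.urlopen(build_stop_request(base_url, run_id))` against a running API server.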
|
||||
|
||||
### Nix
|
||||
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
|
||||
- Fix: use `--rebuild` in fix-lockfiles to bypass cached FOD store paths ([#15444](https://github.com/NousResearch/hermes-agent/pull/15444))
|
||||
- Fix: `extraPackages` now actually works via per-user profile ([#17047](https://github.com/NousResearch/hermes-agent/pull/17047))
|
||||
- Fix: refresh web/ npm-deps hash to unblock main builds ([#17174](https://github.com/NousResearch/hermes-agent/pull/17174))
|
||||
- Fix: replace magic-nix-cache with Cachix ([#17928](https://github.com/NousResearch/hermes-agent/pull/17928))
|
||||
|
||||
---
|
||||
|
||||
## 🖥️ TUI
|
||||
|
||||
### New features
|
||||
- **LaTeX rendering** (@austinpickett) ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175))
|
||||
- **`/reload` .env hot-reload** — ported from the classic CLI ([#17286](https://github.com/NousResearch/hermes-agent/pull/17286))
|
||||
- **Pluggable busy-indicator styles** (@OutThisLife, #13610) ([#17150](https://github.com/NousResearch/hermes-agent/pull/17150))
|
||||
- **Opt-in auto-resume of the most recent session** (@OutThisLife) ([#17130](https://github.com/NousResearch/hermes-agent/pull/17130))
|
||||
- **Expanded light-terminal auto-detection** — `HERMES_TUI_THEME` + background hex (@OutThisLife) ([#17113](https://github.com/NousResearch/hermes-agent/pull/17113))
|
||||
- **Delete sessions from `/resume` picker with `d`** (@OutThisLife) ([#17668](https://github.com/NousResearch/hermes-agent/pull/17668))
|
||||
- **Line-by-line scroll on modified mouse wheel** (@OutThisLife) ([#17669](https://github.com/NousResearch/hermes-agent/pull/17669))
|
||||
- **Delete queued message while editing with ctrl-x / cancel with esc** (@OutThisLife) ([#16707](https://github.com/NousResearch/hermes-agent/pull/16707))
|
||||
- **Per-section visibility for the details accordion** (@OutThisLife) ([#14968](https://github.com/NousResearch/hermes-agent/pull/14968))
|
||||
- **Voice mode CLI parity** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
|
||||
- **Contextual first-touch hints ported to TUI** — `/busy`, `/verbose` ([#16054](https://github.com/NousResearch/hermes-agent/pull/16054))
|
||||
- **Mini help menu on `?` in the input field** (@ethernet8023) ([#18043](https://github.com/NousResearch/hermes-agent/pull/18043))
|
||||
|
||||
### Fixes
|
||||
- Fix: proactive mouse disable on ConPTY + `/mouse` toggle command (@kevin-ho, WSL2 ghost-mouse fix) ([#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
|
||||
- Fix: restore skills search RPC ([#15870](https://github.com/NousResearch/hermes-agent/pull/15870))
|
||||
- Perf: cache text measurements across yoga flex re-passes ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
|
||||
- Perf: stabilize long-session scrolling ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
|
||||
- Perf: lazily seed virtual history heights ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
|
||||
- Perf: cut visible cold start ~57% with lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
|
||||
|
||||
---
|
||||
|
||||
## 🖱️ CLI & User Experience
|
||||
|
||||
### New commands
|
||||
- **`hermes -z <prompt>`** — non-interactive one-shot mode ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702))
|
||||
- **`hermes -z` with `--model` / `--provider` / `HERMES_INFERENCE_MODEL`** ([#15704](https://github.com/NousResearch/hermes-agent/pull/15704))
|
||||
- **`hermes update --check`** preflight flag ([#15841](https://github.com/NousResearch/hermes-agent/pull/15841))
|
||||
- **`hermes fallback`** command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
|
||||
- **`/busy`** slash command for busy input mode ([#15382](https://github.com/NousResearch/hermes-agent/pull/15382))
|
||||
- **`/busy` input mode 'steer'** as a third option ([#16279](https://github.com/NousResearch/hermes-agent/pull/16279))
|
||||
- **`/btw` as alias for `/background`** ([#16053](https://github.com/NousResearch/hermes-agent/pull/16053))
|
||||
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
|
||||
- **Surface `/queue`, `/bg`, `/steer` in agent-running placeholder** ([#16118](https://github.com/NousResearch/hermes-agent/pull/16118))
|
||||
|
||||
### Setup / onboarding
|
||||
- **Auto-reconfigure on existing installs** ([#15879](https://github.com/NousResearch/hermes-agent/pull/15879))
|
||||
- **Contextual first-touch hints for `/busy` and `/verbose`** ([#16046](https://github.com/NousResearch/hermes-agent/pull/16046))
|
||||
- **Cost-saving tips from the April 30 tip-of-the-day** ([#17841](https://github.com/NousResearch/hermes-agent/pull/17841))
|
||||
- **Hyperlink startup banner title to the latest GitHub Release** ([#14945](https://github.com/NousResearch/hermes-agent/pull/14945))
|
||||
|
||||
### Update / backup
|
||||
- **Snapshot pairing data before `git pull`** ([#16383](https://github.com/NousResearch/hermes-agent/pull/16383))
|
||||
- **Auto-backup HERMES_HOME before `hermes update`** (opt-in, off by default) ([#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
|
||||
- **Exclude `checkpoints/` from backups** ([#16572](https://github.com/NousResearch/hermes-agent/pull/16572))
|
||||
- **Exclude SQLite WAL/SHM/journal sidecars from backups** ([#16576](https://github.com/NousResearch/hermes-agent/pull/16576))
|
||||
- **Installer FHS layout for root installs on Linux** ([#15608](https://github.com/NousResearch/hermes-agent/pull/15608))
|
||||
- Fix: kill stale dashboards instead of warning ([#17832](https://github.com/NousResearch/hermes-agent/pull/17832))
|
||||
- Fix: show correct update status on nix-built hermes ([#17550](https://github.com/NousResearch/hermes-agent/pull/17550))
|
||||
|
||||
### Slash-command housekeeping
|
||||
- Refactor: drop `/provider`, `/plan` handler, and clean up slash registry ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
|
||||
- Refactor: drop `persist_session` plumbing + fix broken `/btw` mid-turn bypass ([#16075](https://github.com/NousResearch/hermes-agent/pull/16075))
|
||||
|
||||
### OpenClaw migration (for folks coming from OpenClaw)
|
||||
- **Hardened OpenClaw import** — plan-first apply, redaction, pre-migration backup ([#16911](https://github.com/NousResearch/hermes-agent/pull/16911))
|
||||
- Fix: case-preserving brand rewrite + one-time `~/.openclaw` residue banner ([#16327](https://github.com/NousResearch/hermes-agent/pull/16327))
|
||||
- Fix: resolve `openclaw` workspace files from `agents.defaults.workspace` ([#16879](https://github.com/NousResearch/hermes-agent/pull/16879))
|
||||
- Fix: resolve model aliases against real OpenClaw catalog schema (salvage #16778) ([#16977](https://github.com/NousResearch/hermes-agent/pull/16977))
|
||||
|
||||
---
|
||||
|
||||
## 📊 Web Dashboard
|
||||
|
||||
- **Models tab** — rich per-model analytics ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745))
|
||||
- **Configure main + auxiliary models from the Models page** ([#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
|
||||
- **Dashboard Chat tab — xterm.js + JSON-RPC sidecar** (supersedes #12710 + #13379, @OutThisLife) ([#14890](https://github.com/NousResearch/hermes-agent/pull/14890))
|
||||
- **Dashboard layout refresh** (@austinpickett) ([#14899](https://github.com/NousResearch/hermes-agent/pull/14899))
|
||||
- **`--stop` and `--status` flags** on the dashboard CLI ([#17840](https://github.com/NousResearch/hermes-agent/pull/17840))
|
||||
- **Page-scoped plugin slots for built-in pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
|
||||
- Fix: replace all buttons with design system buttons ([#17007](https://github.com/NousResearch/hermes-agent/pull/17007))
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Performance
|
||||
|
||||
- **TUI visible cold start cut ~57%** via lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
|
||||
- **Lazy-import OpenAI, Anthropic, Firecrawl, account_usage** ([#17046](https://github.com/NousResearch/hermes-agent/pull/17046))
|
||||
- **mtime-cache `load_config()` and `read_raw_config()`** ([#17041](https://github.com/NousResearch/hermes-agent/pull/17041))
|
||||
- **Memoize `get_tool_definitions()` + TTL-cache `check_fn` results** ([#17098](https://github.com/NousResearch/hermes-agent/pull/17098))
|
||||
- **Precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS** ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
|
||||
- **Cache Ink text measurements across yoga flex re-passes** ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
|
||||
- **Stabilize long-session scrolling** ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
|
||||
- **Lazily seed virtual history heights** ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
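
The mtime-cache entry above follows a common pattern: re-read and re-parse a config file only when its modification time (or size) changes. The sketch below shows that pattern under stated assumptions. `load_config_cached` is a hypothetical name, JSON is used only to keep the example self-contained, and the real `load_config()` may key its cache differently.

```python
import json
import os

# path -> ((mtime_ns, size), parsed config)
_CONFIG_CACHE = {}


def load_config_cached(path: str, parse=json.loads):
    """Return the parsed config, re-reading the file only when it changed on disk."""
    stat = os.stat(path)
    key = (stat.st_mtime_ns, stat.st_size)
    cached = _CONFIG_CACHE.get(path)
    if cached is not None and cached[0] == key:
        return cached[1]  # unchanged on disk: skip the read and the parse
    with open(path, "r", encoding="utf-8") as handle:
        parsed = parse(handle.read())
    _CONFIG_CACHE[path] = (key, parsed)
    return parsed
```

Callers share the cached object, so code that mutates the returned config should copy it first.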
|
||||
|
||||
---
|
||||
|
||||
## 🔒 Security & Reliability
|
||||
|
||||
- **Secret redaction off by default** — stops corrupting patches / API payloads with fake-key substitutions. Opt in via `redaction.enabled: true` ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
|
||||
- **`[SYSTEM:` → `[IMPORTANT:`** in all user-injected markers (Azure content filter dodge) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
|
||||
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
|
||||
- **Canonical `mask_secret` helper; fix status.py DIM drift** ([#17207](https://github.com/NousResearch/hermes-agent/pull/17207))
|
||||
- **Sweep expired paste.rs uploads on a real timer** ([#16431](https://github.com/NousResearch/hermes-agent/pull/16431))
|
||||
- **Preserve symlinks during atomic file writes** ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
|
||||
- **Probe `/dev/tty` by opening it, not bare existence** ([#17024](https://github.com/NousResearch/hermes-agent/pull/17024))
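
The `/dev/tty` entry above reflects a general pitfall: the device node can exist even when the process has no controlling terminal (for example under cron or in some containers), in which case opening it fails. A minimal sketch of probing by opening, with the hypothetical helper name `tty_available`:

```python
import os


def tty_available(path: str = "/dev/tty") -> bool:
    """Return True only if the terminal device can actually be opened.

    os.path.exists() is not enough: the node may exist while open() fails
    because the process has no controlling terminal.
    """
    flags = os.O_RDWR | getattr(os, "O_NOCTTY", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        return False
    os.close(fd)
    return True
```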
|
||||
|
||||
---
|
||||
|
||||
## 🐛 Notable Bug Fixes
|
||||
|
||||
This window includes 360 `fix:` PRs. Selected highlights from across the stack:
|
||||
|
||||
- **Background review fork inherits parent's live runtime** — provider/model/creds now propagate correctly ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
|
||||
- **Hindsight configurable `HINDSIGHT_TIMEOUT` env var** ([#15077](https://github.com/NousResearch/hermes-agent/pull/15077))
|
||||
- **Tools: normalize numeric entries + clear stale `no_mcp` in `_save_platform_tools`** ([#15607](https://github.com/NousResearch/hermes-agent/pull/15607))
|
||||
- **MCP: rewrite `definitions` refs to `$defs` in input schemas** — closes provider-side 400s
|
||||
- **Azure content filter compatibility** — renamed `[SYSTEM:` markers so Azure's content filter stops flagging them ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
|
||||
- **Vision cache uses HERMES_HOME instead of cwd** ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
|
||||
- **FTS5 search** — tool_name + tool_calls indexing with repair + migration ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
|
||||
- **Streaming reasoning persists on assistant turns** ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
|
||||
- **execute_code concurrent RPC serialization** (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
|
||||
- **Background reviewer scoped to memory + skills toolsets** — no more accidental web/shell escapes ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
|
||||
- **Compression recovery** — retry on main before giving up; notify user when aux fails ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774), [#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
|
||||
- **`croniter` promoted to a core dependency** ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
|
||||
- **Discord tool `limit` parameter coerced to int** before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
|
||||
- **Yuanbao messaging platform entrance fix** ([#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
|
||||
- **ACP advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
|
||||
- **DeepSeek / Kimi reasoning content isolation** across cross-provider histories (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749), [#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
|
||||
- **Preserve reasoning_content replay on DeepSeek v4 + Kimi/Moonshot thinking** ([#18045](https://github.com/NousResearch/hermes-agent/pull/18045))
|
||||
|
||||
The vast majority of the 360 fixes landed in the streaming/compression/tool-calling paths across all providers — DeepSeek, Kimi, Moonshot, GLM, Qwen, MiniMax, Gemini, Anthropic, OpenAI — alongside TUI polish (resize, scroll, sticky-prompt) and gateway platform-specific edge cases.
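
For the MCP entry in the list above, here is a sketch of what rewriting `definitions` references to `$defs` can look like. JSON Schema renamed the `definitions` keyword to `$defs` in draft 2019-09, and some providers reject the older form. The function names are hypothetical, and moving the top-level `definitions` block under `$defs` is an assumption about what the fix does beyond rewriting `$ref` strings.

```python
OLD_PREFIX = "#/definitions/"
NEW_PREFIX = "#/$defs/"


def _rewrite_refs(node):
    """Recursively copy a schema, pointing "$ref" strings at #/$defs/ instead."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str) and value.startswith(OLD_PREFIX):
                result[key] = NEW_PREFIX + value[len(OLD_PREFIX):]
            else:
                result[key] = _rewrite_refs(value)
        return result
    if isinstance(node, list):
        return [_rewrite_refs(item) for item in node]
    return node


def normalize_input_schema(schema: dict) -> dict:
    """Return a copy of the schema that uses $defs (assumed behaviour, for illustration)."""
    result = _rewrite_refs(schema)
    if "definitions" in result:
        # Only the top-level keyword is moved; a property that happens to be
        # named "definitions" deeper in the schema is left alone.
        merged = dict(result.pop("definitions"))
        merged.update(result.get("$defs", {}))
        result["$defs"] = merged
    return result
```

The input schema is not mutated, which matters when the same tool definition is sent to several providers.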
|
||||
|
||||
---
|
||||
|
||||
## 🧪 Testing & CI
|
||||
|
||||
- Hermetic test parity (`scripts/run_tests.sh`) held across this window
|
||||
- **Microsoft Teams xdist collision guard** — prevents worker collisions when Teams platform tests run in parallel ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
|
||||
- Chore: remove unused imports and dead locals (ruff F401, F841) ([#17010](https://github.com/NousResearch/hermes-agent/pull/17010))
|
||||
|
||||
---
|
||||
|
||||
## 📚 Documentation
|
||||
|
||||
- **Curator feature page** added to docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
|
||||
- **Document pin also blocking `skill_manage` writes** ([#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
|
||||
- **Direct-URL skill install documented** across features, reference, guide, and `hermes-agent` skill ([#16355](https://github.com/NousResearch/hermes-agent/pull/16355))
|
||||
- **Hooks tutorial — build a BOOT.md startup checklist** (replaces the removed built-in hook) ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202))
|
||||
- **ComfyUI docs: ask local vs cloud FIRST before hardware check** ([#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
|
||||
- **Obliteratus skill: link YouTube video guide in SKILL.md** ([#15808](https://github.com/NousResearch/hermes-agent/pull/15808))
|
||||
- Per-skill docs pages generated for bundled + optional skills; ASCII art code blocks auto-wrapped ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929), [#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
|
||||
|
||||
---
|
||||
|
||||
## ⚖️ Removed / Reverted
|
||||
|
||||
- **Kanban multi-profile collaboration board** — landed in #16081, reverted in [#16098](https://github.com/NousResearch/hermes-agent/pull/16098) while the design is reworked
- **computer-use cua-driver** — 3 preparatory PRs landed, then were reverted in [#16927](https://github.com/NousResearch/hermes-agent/pull/16927)
|
||||
- **BOOT.md built-in hook** removed ([#17093](https://github.com/NousResearch/hermes-agent/pull/17093)); the hooks tutorial ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202)) shows how to build the same workflow yourself with a shell hook
|
||||
- **`/provider` + `/plan` slash commands dropped** ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
|
||||
- **`flush_memories` removed entirely** ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
|
||||
|
||||
---
|
||||
|
||||
## 👥 Contributors
|
||||
|
||||
### Core
|
||||
- **@teknium1** (Teknium)
|
||||
|
||||
### Top Community Contributors (by merged PR count since v0.11.0)
|
||||
|
||||
- **@OutThisLife** (Brooklyn) — 52 PRs · TUI — light-terminal detection + pluggable busy styles + auto-resume + session-delete from /resume + mouse-wheel scrolling + xterm.js dashboard Chat tab + cold-start cut + accordion polish
|
||||
- **@kshitijk4poor** — 12 PRs · LM Studio first-class provider (salvage), Vercel Sandbox backend, GMI Cloud salvage, bundled-by-default touchdesigner-mcp, many tool-call / reasoning fixes
|
||||
- **@helix4u** — 10 PRs · MCP schema robustness, assorted stability fixes
|
||||
- **@alt-glitch** — 8 PRs · trigram FTS5 CJK search, declarative Nix plugin install, matrix/feishu hints and fixes
|
||||
- **@ethernet8023** — 4 PRs
|
||||
- **@austinpickett** — 4 PRs · LaTeX rendering in TUI, dashboard layout refresh
|
||||
- **@benbarclay** — 3 PRs · Docker run-as-host-user so bind mounts don't get root-owned
|
||||
- **@vominh1919** — 2 PRs
|
||||
- **@stephenschoettler** — 2 PRs
|
||||
- **@kevin-ho** — ConPTY mouse-injection fix (#15488)
|
||||
- **@Zjianru** — cross-provider reasoning_content isolation + DeepSeek/Kimi empty-reasoning injection (#15749, #15762)
|
||||
- **@web3blind** — Telegram chat allowlists for groups and forums (#15027)
|
||||
- **@SHL0MS** — 9 new TouchDesigner-MCP reference docs (#16768)
|
||||
- **@0xDevNinja** — curator `restore_skill` nested-archive fix (#17951)
|
||||
- **@y0shua1ee** — curator `use` activity fix (#17953)
|
||||
|
||||
### Also contributing
|
||||
Salvaged or co-authored work from **@isaachuangGMICLOUD** (GMI Cloud), earlier upstream PRs from the original author of each salvage chain, and a long tail of one-shot fixes, documentation nudges, and skill contributions from the community.
|
||||
|
||||
### All Contributors (alphabetical, excluding @teknium1)
|
||||
|
||||
@0xbyt4, @0xharryriddle, @0xDevNinja, @0z1-ghb, @5park1e, @A-FdL-Prog, @aj-nt, @akhater, @alblez, @alexg0bot,
@alexzhu0, @AllardQuek, @alt-glitch, @amanning3390, @amanuel2, @AndreKurait, @andrewhosf, @Andy283, @andyylin,
@angel12, @AntAISecurityLab, @ash, @austinpickett, @badgerbees, @BadTechBandit, @Bartok9, @beenherebefore,
@beesrsj2500, @BeliefanX, @benbarclay, @benjaminsehl, @BlackishGreen33, @bloodcarter, @BlueBirdBack,
@briandevans, @brooklynnicholson, @bsgdigital, @buray, @bwjoke, @camaragon, @cdanis, @cgarwood82,
@charles-brooks, @chen1749144759, @chengoak, @ching-kaching, @Contentment003111, @crayfish-ai, @CruxExperts,
@cyclingwithelephants, @dandaka, @danklynn, @ddupont808, @dhabibi, @difujia, @dimitrovi, @dlkakbs,
@dontcallmejames, @EKKOLearnAI, @emozilla, @ericnicolaides, @Erosika, @ethernet8023, @exiao, @Feranmi10,
@flobo3, @foxion37, @georgeglessner, @georgex8001, @ghostmfr, @H-Ali13381, @HangGlidersRule, @harryplusplus,
@haru398801, @heathley, @hejuntt1014, @hekaru-agent, @helix4u, @Heltman, @HenkDz, @heyitsaamir, @hharry11,
@hhhonzik, @hhuang91, @HiddenPuppy, @htsh, @iamagenius00, @in-liberty420, @innocarpe, @irispillars, @iRonin,
@isaachuangGMICLOUD, @Ito-69, @j3ffffff, @jackjin1997, @jakubkrcmar, @Jason2031, @JayGwod, @jerome-benoit,
@johnncenae, @Kailigithub, @keiravoss94, @kevin-ho, @knockyai, @konsisumer, @kshitijk4poor, @kunlabs, @l0hde,
@Leihb, @leoneparise, @LeonSGP43, @liizfq, @liuhao1024, @loongzhao, @lsdsjy, @luyao618, @ma-pony, @Magaav,
@MagicRay1217, @math0r-be, @MattMaximo, @maxims-oss, @MaxyMoos, @maymuneth, @mcndjxlefnd, @memosr,
@MestreY0d4-Uninter, @mewwts, @Mirac1eSky, @MorAlekss, @mrhwick, @mrunmayee17, @mssteuer, @Nanako0129,
@nazirulhafiy, @Nerijusas, @Nicecsh, @nicoloboschi, @nightq, @ningfangbin, @octo-patch, @Octopus,
@OutThisLife, @Paperclip, @pein892, @perlowja, @prasadus92, @qike-ms, @qiyin-code, @Readon, @ReginaldasR,
@revaraver, @rfilgueiras, @rmoen, @romanornr, @rugvedS07, @rylena, @samrusani, @Sanjays2402, @sasha-id,
@Satoshi-agi, @scheidti, @scotttrinh, @season179, @SeeYangZhi, @sgaofen, @shamork, @shannonsands, @SHL0MS,
@simbam99, @Societus, @socrates1024, @Sonoyunchu, @sprmn24, @stephenschoettler, @tangyuanjc, @TechPrototyper,
@tekgnosis-net, @ThomassJonax, @tmimmanuel, @tochukwuada, @Tosko4, @Tranquil-Flow, @twozle, @txbxxx,
@UgwujaGeorge, @Versun, @vlwkaos, @voidborne-d, @vominh1919, @Wang-tianhao, @Wangshengyang2004, @web3blind,
@westers, @Wysie, @xandersbell, @xiahu88988, @XieNBi, @xinbenlv, @xnbi, @y0shua1ee, @yatesjalex, @yes999zc,
@yeyitech, @Yoimex, @YueLich, @Yukipukii1, @zhiyanliu, @zicochaos, @Zjianru, @zkl2333, @zons-zhaozhy,
@ztexydt-cqh.
|
||||
|
||||
Also: @Siddharth Balyan, @YuShu.
|
||||
|
||||
---
|
||||
|
||||
**Full Changelog**: [v2026.4.23...v2026.4.30](https://github.com/NousResearch/hermes-agent/compare/v2026.4.23...v2026.4.30)
|
||||
@@ -182,6 +182,64 @@ SKILLS_GUIDANCE = (
|
||||
"Skills that aren't maintained become liabilities."
|
||||
)
|
||||
|
||||
KANBAN_GUIDANCE = (
|
||||
"# You are a Kanban worker\n"
|
||||
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
|
||||
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
|
||||
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
|
||||
"The `kanban_*` tools in your schema are your primary coordination surface — "
|
||||
"they write directly to the shared SQLite DB and work regardless of terminal "
|
||||
"backend (local/docker/modal/ssh).\n"
|
||||
"\n"
|
||||
"## Lifecycle\n"
|
||||
"\n"
|
||||
"1. **Orient.** Call `kanban_show()` first (no args — it defaults to your "
|
||||
"task). The response includes title, body, parent-task handoffs (summary + "
|
||||
"metadata), any prior attempts on this task if you're a retry, the full "
|
||||
"comment thread, and a pre-formatted `worker_context` you can treat as "
|
||||
"ground truth.\n"
|
||||
"2. **Work inside the workspace.** `cd $HERMES_KANBAN_WORKSPACE` before "
|
||||
"any file operations. The workspace is yours for this run. Don't modify "
|
||||
"files outside it unless the task explicitly asks.\n"
|
||||
"3. **Heartbeat on long operations.** Call `kanban_heartbeat(note=...)` "
|
||||
"every few minutes during long subprocesses (training, encoding, crawling). "
|
||||
"Skip heartbeats for short tasks.\n"
|
||||
"4. **Block on genuine ambiguity.** If you need a human decision you cannot "
|
||||
"infer (missing credentials, UX choice, paywalled source, peer output you "
|
||||
"need first), call `kanban_block(reason=\"...\")` and stop. Don't guess. "
|
||||
"The user will unblock with context and the dispatcher will respawn you.\n"
|
||||
"5. **Complete with structured handoff.** Call `kanban_complete(summary=..., "
|
||||
"metadata=...)`. `summary` is 1–3 human-readable sentences naming concrete "
|
||||
"artifacts. `metadata` is machine-readable facts "
|
||||
"(`{changed_files: [...], tests_run: N, decisions: [...]}`). Downstream "
|
||||
"workers read both via their own `kanban_show`. Never put secrets / "
|
||||
"tokens / raw PII in either field — run rows are durable forever.\n"
|
||||
"6. **If follow-up work appears, create it; don't do it.** Use "
|
||||
"`kanban_create(title=..., assignee=<right-profile>, parents=[your-task-id])` "
|
||||
"to spawn a child task for the appropriate specialist profile instead of "
|
||||
"scope-creeping into the next thing.\n"
|
||||
"\n"
|
||||
"## Orchestrator mode\n"
|
||||
"\n"
|
||||
"If your task is itself a decomposition task (e.g. a planner profile given "
|
||||
"a high-level goal), use `kanban_create` to fan out into child tasks — one "
|
||||
"per specialist, each with an explicit `assignee` and `parents=[...]` to "
|
||||
"express dependencies. Then `kanban_complete` your own task with a summary "
|
||||
"of the decomposition. Do NOT execute the work yourself; your job is "
|
||||
"routing, not implementation.\n"
|
||||
"\n"
|
||||
"## Do NOT\n"
|
||||
"\n"
|
||||
"- Do not shell out to `hermes kanban <verb>` for board operations. Use "
|
||||
"the `kanban_*` tools — they work across all terminal backends.\n"
|
||||
"- Do not complete a task you didn't actually finish. Block it.\n"
|
||||
"- Do not assign follow-up work to yourself. Assign it to the right "
|
||||
"specialist profile.\n"
|
||||
"- Do not call `delegate_task` as a board substitute. `delegate_task` is "
|
||||
"for short reasoning subtasks inside your own run; board tasks are for "
|
||||
"cross-agent handoffs that outlive one API loop."
|
||||
)
|
||||
|
||||
TOOL_USE_ENFORCEMENT_GUIDANCE = (
|
||||
"# Tool-use enforcement\n"
|
||||
"You MUST use your tools to take action — do not describe what you would do "
|
||||
|
||||
@@ -477,9 +477,13 @@ class ChatCompletionsTransport(ProviderTransport):
|
||||
         # so keep them apart in provider_data rather than merging.
         reasoning = getattr(msg, "reasoning", None)
         reasoning_content = getattr(msg, "reasoning_content", None)
+        if reasoning_content is None and hasattr(msg, "model_extra"):
+            model_extra = getattr(msg, "model_extra", None) or {}
+            if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
+                reasoning_content = model_extra["reasoning_content"]

         provider_data: Dict[str, Any] = {}
-        if reasoning_content:
+        if reasoning_content is not None:
             provider_data["reasoning_content"] = reasoning_content
         rd = getattr(msg, "reasoning_details", None)
         if rd:
|
||||
|
||||
@@ -1240,8 +1240,73 @@ def _cprint(text: str):
|
||||
     Raw ANSI escapes written via print() are swallowed by patch_stdout's
     StdoutProxy. Routing through print_formatted_text(ANSI(...)) lets
     prompt_toolkit parse the escapes and render real colors.
+
+    When called from a background thread while a prompt_toolkit
+    ``Application`` is running (the common case for the self-improvement
+    background review's ``💾 …`` summary, curator summaries, and other
+    bg-thread emissions), a direct ``_pt_print`` races with the input
+    area's redraw and the line can end up visually buried behind the
+    prompt. Route those cases through ``run_in_terminal`` via
+    ``loop.call_soon_threadsafe``, which pauses the input area, prints
+    the line above it, and redraws the prompt cleanly.
     """
-    _pt_print(_PT_ANSI(text))
+    try:
+        from prompt_toolkit.application import get_app_or_none, run_in_terminal
+    except Exception:
+        _pt_print(_PT_ANSI(text))
+        return
+
+    app = None
+    try:
+        app = get_app_or_none()
+    except Exception:
+        app = None
+
+    # No active app, or we're already on the app's main thread: the
+    # direct prompt_toolkit print is safe and matches existing behavior
+    # (spinner frames, streamed tokens, tool activity prefixes, …).
+    if app is None or not getattr(app, "_is_running", False):
+        _pt_print(_PT_ANSI(text))
+        return
+
+    try:
+        loop = app.loop  # type: ignore[attr-defined]
+    except Exception:
+        loop = None
+    if loop is None:
+        _pt_print(_PT_ANSI(text))
+        return
+
+    import asyncio as _asyncio
+    try:
+        current_loop = _asyncio.get_event_loop_policy().get_event_loop()
+    except Exception:
+        current_loop = None
+    # Same thread as the app's loop → safe to print directly.
+    if current_loop is loop and loop.is_running():
+        _pt_print(_PT_ANSI(text))
+        return
+
+    # Cross-thread emission: ask the app's event loop to schedule a
+    # ``run_in_terminal`` that wraps ``_pt_print``. This hides the
+    # prompt, prints, and redraws. Fire-and-forget — if scheduling
+    # fails we fall back to a direct print so the line isn't lost.
+    def _schedule():
+        try:
+            run_in_terminal(lambda: _pt_print(_PT_ANSI(text)))
+        except Exception:
+            try:
+                _pt_print(_PT_ANSI(text))
+            except Exception:
+                pass
+
+    try:
+        loop.call_soon_threadsafe(_schedule)
+    except Exception:
+        try:
+            _pt_print(_PT_ANSI(text))
+        except Exception:
+            pass


 # ---------------------------------------------------------------------------
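
The cross-thread hand-off described in the `_cprint` docstring above can be demonstrated without prompt_toolkit. The sketch below uses only `asyncio` and `threading`: a worker thread asks the event loop to run a callback on the loop's own thread via `loop.call_soon_threadsafe`, and falls back to a direct call if scheduling fails. The names `emit_from_thread`, `sink`, and `run_demo` are hypothetical, and `sink` stands in for the `run_in_terminal` wrapper around the real print.

```python
import asyncio
import threading


def emit_from_thread(loop, sink, text):
    """Ask `loop` to call sink(text) on the loop's thread; print directly on failure."""
    try:
        loop.call_soon_threadsafe(sink, text)
    except RuntimeError:
        # The loop is closed: fall back so the line is not lost.
        sink(text)


async def _main():
    loop = asyncio.get_running_loop()
    seen = []
    done = asyncio.Event()

    def sink(text):
        # Record which thread actually performed the "print".
        seen.append((text, threading.get_ident()))
        done.set()

    worker = threading.Thread(target=emit_from_thread, args=(loop, sink, "saved"))
    worker.start()
    await asyncio.wait_for(done.wait(), timeout=5)
    worker.join()
    return seen, threading.get_ident()


def run_demo():
    """Return the recorded emissions and the event-loop thread's id."""
    return asyncio.run(_main())
```

The callback runs on the event-loop thread even though the worker thread requested it, which is the property that lets the UI pause and redraw the prompt around the printed line.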
|
||||
@@ -6087,6 +6152,27 @@ class HermesCLI:
|
||||
         except Exception as exc:
             print(f"(._.) curator: {exc}")

+    def _handle_kanban_command(self, cmd: str):
+        """Handle the /kanban command — delegate to the shared kanban CLI.
+
+        The string form passed here is the user's full ``/kanban ...``
+        including the leading slash; we strip it and hand the remainder
+        to ``kanban.run_slash`` which returns a single formatted string.
+        """
+        from hermes_cli.kanban import run_slash
+
+        rest = cmd.strip()
+        if rest.startswith("/"):
+            rest = rest.lstrip("/")
+        if rest.startswith("kanban"):
+            rest = rest[len("kanban"):].lstrip()
+        try:
+            output = run_slash(rest)
+        except Exception as exc:  # pragma: no cover - defensive
+            output = f"(._.) kanban error: {exc}"
+        if output:
+            print(output)
+
     def _handle_skills_command(self, cmd: str):
         """Handle /skills slash command — delegates to hermes_cli.skills_hub."""
         from hermes_cli.skills_hub import handle_skills_slash
|
||||
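The prefix handling in `_handle_kanban_command` can be read as a small pure function. The sketch below assumes a hypothetical helper name, `strip_slash_command`; it reproduces only the string logic and does not call `run_slash`.

```python
def strip_slash_command(text: str, name: str = "kanban") -> str:
    """Return the arguments that follow an optional leading "/<name>"."""
    rest = text.strip()
    if rest.startswith("/"):
        rest = rest.lstrip("/")
    if rest.startswith(name):
        rest = rest[len(name):].lstrip()
    return rest


print(strip_slash_command("/kanban list --status ready"))
print(repr(strip_slash_command("kanban")))
```

The prefix test is a plain `startswith`, so no word boundary is required after the command name. An input such as `/kanbanlist` is therefore reduced to `list` as well.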
@@ -6332,6 +6418,8 @@ class HermesCLI:
|
||||
self._handle_cron_command(cmd_original)
|
||||
elif canonical == "curator":
|
||||
self._handle_curator_command(cmd_original)
|
||||
elif canonical == "kanban":
|
||||
self._handle_kanban_command(cmd_original)
|
||||
elif canonical == "skills":
|
||||
with self._busy_command(self._slow_command_status(cmd_original)):
|
||||
self._handle_skills_command(cmd_original)
|
||||
|
||||
Binary file not shown.
+493
@@ -2732,6 +2732,17 @@ class GatewayRunner:
|
||||
# Start background session expiry watcher to finalize expired sessions
|
||||
asyncio.create_task(self._session_expiry_watcher())
|
||||
|
||||
# Start background kanban notifier — delivers `completed`, `blocked`,
|
||||
# `gave_up`, `crashed`, and `timed_out` events to gateway subscribers
|
||||
# so human-in-the-loop workflows hear back without polling.
|
||||
asyncio.create_task(self._kanban_notifier_watcher())
|
||||
|
||||
# Start background kanban dispatcher — spawns workers for ready
|
||||
# tasks. Gated by `kanban.dispatch_in_gateway` (default True).
|
||||
# When false, users run `hermes kanban daemon` externally or
|
||||
# simply don't use kanban; this loop becomes a no-op.
|
||||
asyncio.create_task(self._kanban_dispatcher_watcher())
|
||||
|
||||
# Start background reconnection watcher for platforms that failed at startup
|
||||
if self._failed_platforms:
|
||||
logger.info(
|
||||
@@ -2907,6 +2918,399 @@ class GatewayRunner:
|
||||
break
|
||||
await asyncio.sleep(1)
|
||||
|
||||
async def _kanban_notifier_watcher(self, interval: float = 5.0) -> None:
|
||||
"""Poll ``kanban_notify_subs`` and deliver terminal events to users.
|
||||
|
||||
For each subscription row, fetches ``task_events`` newer than the
|
||||
stored cursor with kind in the terminal set (``completed``,
|
||||
``blocked``, ``gave_up``, ``crashed``, ``timed_out``). Sends one
|
||||
message per new event to ``(platform, chat_id, thread_id)``,
|
||||
then advances the cursor. When a task reaches a terminal state
|
||||
(status ``done`` / ``archived``) or the last delivered event is of a terminal kind, the subscription is removed.
|
||||
|
||||
Runs in the gateway event loop; all SQLite work is pushed to a
|
||||
thread via ``asyncio.to_thread`` so the loop never blocks on the
|
||||
WAL lock. Failures in one tick don't stop subsequent ticks.
|
||||
"""
|
||||
from gateway.config import Platform as _Platform
|
||||
try:
|
||||
from hermes_cli import kanban_db as _kb
|
||||
except Exception:
|
||||
logger.warning("kanban notifier: kanban_db not importable; notifier disabled")
|
||||
return
|
||||
|
||||
TERMINAL_KINDS = ("completed", "blocked", "gave_up", "crashed", "timed_out")
|
||||
# Terminal event kinds trigger automatic unsubscription — the task
|
||||
# is done, blocked, or in a retry-needed state that the human
|
||||
# shouldn't keep being pinged about in a stale chat. Previously we only
|
||||
# unsubbed when task.status in ('done', 'archived'), which left
|
||||
# subscriptions on 'blocked' / 'gave_up' / 'crashed' / 'timed_out'
|
||||
# tasks stranded forever.
|
||||
TERMINAL_EVENT_KINDS = TERMINAL_KINDS
|
||||
# Per-subscription send-failure counter. Adapter.send raising
|
||||
# means the chat is dead (deleted, bot kicked, etc.) — after N
|
||||
# consecutive send failures the sub is dropped so we don't spin
|
||||
# against a dead chat every 5 seconds forever.
|
||||
MAX_SEND_FAILURES = 3
|
||||
sub_fail_counts: dict[tuple, int] = getattr(
|
||||
self, "_kanban_sub_fail_counts", {}
|
||||
)
|
||||
self._kanban_sub_fail_counts = sub_fail_counts
|
||||
|
||||
# Initial delay so the gateway can finish wiring adapters.
|
||||
await asyncio.sleep(5)
|
||||
|
||||
while self._running:
|
||||
try:
|
||||
def _collect():
|
||||
conn = _kb.connect()
|
||||
try:
|
||||
_kb.init_db() # idempotent; handles first-run
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
subs = _kb.list_notify_subs(conn)
|
||||
deliveries: list[dict] = []
|
||||
for sub in subs:
|
||||
cursor, events = _kb.unseen_events_for_sub(
|
||||
conn,
|
||||
task_id=sub["task_id"],
|
||||
platform=sub["platform"],
|
||||
chat_id=sub["chat_id"],
|
||||
thread_id=sub.get("thread_id") or "",
|
||||
kinds=TERMINAL_KINDS,
|
||||
)
|
||||
if not events:
|
||||
continue
|
||||
task = _kb.get_task(conn, sub["task_id"])
|
||||
deliveries.append({
|
||||
"sub": sub,
|
||||
"cursor": cursor,
|
||||
"events": events,
|
||||
"task": task,
|
||||
})
|
||||
return deliveries
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
deliveries = await asyncio.to_thread(_collect)
|
||||
for d in deliveries:
|
||||
sub = d["sub"]
|
||||
task = d["task"]
|
||||
platform_str = (sub["platform"] or "").lower()
|
||||
try:
|
||||
plat = _Platform(platform_str)
|
||||
except ValueError:
|
||||
# Unknown platform string; skip and advance cursor so
|
||||
# we don't replay forever.
|
||||
await asyncio.to_thread(
|
||||
self._kanban_advance, sub, d["cursor"],
|
||||
)
|
||||
continue
|
||||
adapter = self.adapters.get(plat)
|
||||
if adapter is None:
|
||||
continue # platform not currently connected
|
||||
title = (task.title if task else sub["task_id"])[:120]
|
||||
for ev in d["events"]:
|
||||
kind = ev.kind
|
||||
# Identity prefix: attribute terminal pings to the
|
||||
# worker that did the work. Makes fleets (where one
|
||||
# chat subscribes to many tasks) legible at a glance.
|
||||
who = (task.assignee if task and task.assignee else None)
|
||||
tag = f"@{who} " if who else ""
|
||||
if kind == "completed":
|
||||
# Prefer the run's summary (the worker's
|
||||
# intentional human-facing handoff, carried
|
||||
# in the event payload), then fall back to
|
||||
# task.result for legacy rows written before
|
||||
# runs shipped.
|
||||
handoff = ""
|
||||
payload_summary = None
|
||||
if ev.payload and ev.payload.get("summary"):
|
||||
payload_summary = str(ev.payload["summary"])
|
||||
if payload_summary:
|
||||
h = payload_summary.strip().splitlines()[0][:200]
|
||||
handoff = f"\n{h}"
|
||||
elif task and task.result:
|
||||
r = task.result.strip().splitlines()[0][:160]
|
||||
handoff = f"\n{r}"
|
||||
msg = (
|
||||
f"✔ {tag}Kanban {sub['task_id']} done"
|
||||
f" — {title}{handoff}"
|
||||
)
|
||||
elif kind == "blocked":
|
||||
reason = ""
|
||||
if ev.payload and ev.payload.get("reason"):
|
||||
reason = f": {str(ev.payload['reason'])[:160]}"
|
||||
msg = f"⏸ {tag}Kanban {sub['task_id']} blocked{reason}"
|
||||
elif kind == "gave_up":
|
||||
err = ""
|
||||
if ev.payload and ev.payload.get("error"):
|
||||
err = f"\n{str(ev.payload['error'])[:200]}"
|
||||
msg = (
|
||||
f"✖ {tag}Kanban {sub['task_id']} gave up "
|
||||
f"after repeated spawn failures{err}"
|
||||
)
|
||||
elif kind == "crashed":
|
||||
msg = (
|
||||
f"✖ {tag}Kanban {sub['task_id']} worker crashed "
|
||||
f"(pid gone); dispatcher will retry"
|
||||
)
|
||||
elif kind == "timed_out":
|
||||
limit = 0
|
||||
if ev.payload and ev.payload.get("limit_seconds"):
|
||||
limit = int(ev.payload["limit_seconds"])
|
||||
msg = (
|
||||
f"⏱ {tag}Kanban {sub['task_id']} timed out "
|
||||
f"(max_runtime={limit}s); will retry"
|
||||
)
|
||||
else:
|
||||
continue
|
||||
metadata: dict[str, Any] = {}
|
||||
if sub.get("thread_id"):
|
||||
metadata["thread_id"] = sub["thread_id"]
|
||||
sub_key = (
|
||||
sub["task_id"], sub["platform"],
|
||||
sub["chat_id"], sub.get("thread_id") or "",
|
||||
)
|
||||
try:
|
||||
await adapter.send(
|
||||
sub["chat_id"], msg, metadata=metadata,
|
||||
)
|
||||
# Reset the failure counter on success.
|
||||
sub_fail_counts.pop(sub_key, None)
|
||||
except Exception as exc:
|
||||
fails = sub_fail_counts.get(sub_key, 0) + 1
|
||||
sub_fail_counts[sub_key] = fails
|
||||
logger.warning(
|
||||
"kanban notifier: send failed for %s on %s "
|
||||
"(attempt %d/%d): %s",
|
||||
sub["task_id"], platform_str, fails,
|
||||
MAX_SEND_FAILURES, exc,
|
||||
)
|
||||
if fails >= MAX_SEND_FAILURES:
|
||||
logger.warning(
|
||||
"kanban notifier: dropping subscription "
|
||||
"%s on %s after %d consecutive send failures",
|
||||
sub["task_id"], platform_str, fails,
|
||||
)
|
||||
await asyncio.to_thread(self._kanban_unsub, sub)
|
||||
sub_fail_counts.pop(sub_key, None)
|
||||
# Don't advance cursor on send failure — retry next tick.
|
||||
break
|
||||
else:
|
||||
# All events delivered; advance cursor + maybe unsub.
|
||||
await asyncio.to_thread(
|
||||
self._kanban_advance, sub, d["cursor"],
|
||||
)
|
||||
# Unsubscribe when the LAST delivered event is a
|
||||
# terminal kind (the task hit a "no further updates"
|
||||
# state), not just on task.status in {done, archived}.
|
||||
# Covers blocked / gave_up / crashed / timed_out which
|
||||
# used to leak subs forever.
|
||||
last_kind = d["events"][-1].kind if d["events"] else None
|
||||
task_terminal = task and task.status in ("done", "archived")
|
||||
event_terminal = last_kind in TERMINAL_EVENT_KINDS
|
||||
if task_terminal or event_terminal:
|
||||
await asyncio.to_thread(
|
||||
self._kanban_unsub, sub,
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("kanban notifier tick failed: %s", exc)
|
||||
# Sleep with cancellation checks.
|
||||
for _ in range(int(max(1, interval))):
|
||||
if not self._running:
|
||||
return
|
||||
await asyncio.sleep(1)
|
||||
|
||||
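The delivery loop above uses `for ... else`: the cursor advances only when every event in the batch was sent, and a per-subscription counter drops the subscription after repeated failures. The following is a stripped-down sketch of that control flow. The function `deliver` and its string return values are illustrative assumptions, not part of the gateway.

```python
def deliver(events, send, fail_counts, key, max_failures=3):
    """Try to send every event; report what the caller should do next.

    Returns "advance" when all events were sent, "retry" after a failed
    send, and "drop" once `max_failures` consecutive failures are reached.
    """
    outcome = "retry"
    for ev in events:
        try:
            send(ev)
            fail_counts.pop(key, None)  # success resets the counter
        except Exception:
            fails = fail_counts.get(key, 0) + 1
            fail_counts[key] = fails
            if fails >= max_failures:
                fail_counts.pop(key, None)
                outcome = "drop"
            break  # leave the cursor where it is
    else:
        # Runs only when the loop finished without `break`.
        outcome = "advance"
    return outcome
```

Because the cursor moves only after a whole batch succeeds, a failure partway through would presumably cause the earlier events of that batch to be sent again on the next tick.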
def _kanban_advance(self, sub: dict, cursor: int) -> None:
|
||||
"""Sync helper: advance a subscription's cursor. Runs in to_thread."""
|
||||
from hermes_cli import kanban_db as _kb
|
||||
conn = _kb.connect()
|
||||
try:
|
||||
_kb.advance_notify_cursor(
|
||||
conn,
|
||||
task_id=sub["task_id"],
|
||||
platform=sub["platform"],
|
||||
chat_id=sub["chat_id"],
|
||||
thread_id=sub.get("thread_id") or "",
|
||||
new_cursor=cursor,
|
||||
)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
def _kanban_unsub(self, sub: dict) -> None:
|
||||
from hermes_cli import kanban_db as _kb
|
||||
conn = _kb.connect()
|
||||
try:
|
||||
_kb.remove_notify_sub(
|
||||
conn,
|
||||
task_id=sub["task_id"],
|
||||
platform=sub["platform"],
|
||||
chat_id=sub["chat_id"],
|
||||
thread_id=sub.get("thread_id") or "",
|
||||
)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
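Both helpers above open a fresh connection, do one piece of work, and close it, and they are always called through `asyncio.to_thread`. A self-contained sketch of that pattern with the standard `sqlite3` module is shown below. The table layout and the function names are assumptions made for the example and do not reflect the real `kanban_db` schema.

```python
import asyncio
import sqlite3


def _count_ready(db_path: str) -> int:
    """Blocking helper: open, query, close, all inside one worker thread."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema, created here only so the sketch is runnable.
        conn.execute("CREATE TABLE IF NOT EXISTS tasks (id TEXT, status TEXT)")
        row = conn.execute(
            "SELECT COUNT(*) FROM tasks WHERE status = 'ready'"
        ).fetchone()
        return row[0]
    finally:
        conn.close()


async def count_ready(db_path: str) -> int:
    # The event loop stays free while the worker thread waits on SQLite.
    return await asyncio.to_thread(_count_ready, db_path)


if __name__ == "__main__":
    print(asyncio.run(count_ready(":memory:")))
```

Creating and closing the connection inside the helper keeps each connection on a single thread, which avoids `sqlite3`'s default same-thread check. `asyncio.to_thread` requires Python 3.9 or later.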
async def _kanban_dispatcher_watcher(self) -> None:
|
||||
"""Embedded kanban dispatcher — one tick every `dispatch_interval_seconds`.
|
||||
|
||||
Gated by `kanban.dispatch_in_gateway` in config.yaml (default True).
|
||||
When true, the gateway hosts the single dispatcher for this profile:
|
||||
no separate `hermes kanban daemon` process needed. When false, the
|
||||
loop exits immediately and an external daemon is expected.
|
||||
|
||||
Each tick calls :func:`kanban_db.dispatch_once` inside
|
||||
``asyncio.to_thread`` so the SQLite WAL lock never blocks the
|
||||
event loop. Failures in one tick don't stop subsequent ticks —
|
||||
same pattern as `_kanban_notifier_watcher`.
|
||||
|
||||
Shutdown: the loop checks ``self._running`` between ticks; gateway
|
||||
stop() flips it to False and cancels pending tasks, and the
|
||||
in-flight ``to_thread`` returns on its own after the current
|
||||
``dispatch_once`` call finishes (typically <1ms on an idle board).
|
||||
"""
|
||||
# Read config once at boot. If the user flips the flag later, they
|
||||
# restart the gateway; same pattern as every other background
|
||||
# watcher here. Honours HERMES_KANBAN_DISPATCH_IN_GATEWAY env var
|
||||
# as an escape hatch (false-y value disables without editing YAML).
|
||||
try:
|
||||
from hermes_cli.config import load_config as _load_config
|
||||
except Exception:
|
||||
logger.warning("kanban dispatcher: config loader unavailable; disabled")
|
||||
return
|
||||
env_override = os.environ.get("HERMES_KANBAN_DISPATCH_IN_GATEWAY", "").strip().lower()
|
||||
if env_override in ("0", "false", "no", "off"):
|
||||
logger.info("kanban dispatcher: disabled via HERMES_KANBAN_DISPATCH_IN_GATEWAY env")
|
||||
return
|
||||
|
||||
try:
|
||||
cfg = _load_config()
|
||||
except Exception as exc:
|
||||
logger.warning("kanban dispatcher: cannot load config (%s); disabled", exc)
|
||||
return
|
||||
kanban_cfg = cfg.get("kanban", {}) if isinstance(cfg, dict) else {}
|
||||
if not kanban_cfg.get("dispatch_in_gateway", True):
|
||||
logger.info(
|
||||
"kanban dispatcher: disabled via config kanban.dispatch_in_gateway=false"
|
||||
)
|
||||
return
|
||||
|
||||
try:
|
||||
from hermes_cli import kanban_db as _kb
|
||||
except Exception:
|
||||
logger.warning("kanban dispatcher: kanban_db not importable; dispatcher disabled")
|
||||
return
|
||||
|
||||
interval = float(kanban_cfg.get("dispatch_interval_seconds", 60) or 60)
|
||||
if interval < 1.0:
|
||||
interval = 1.0 # sanity floor — tighter than this is a footgun
|
||||
|
||||
# Initial delay so the gateway finishes wiring adapters before the
|
||||
# dispatcher spawns workers (those workers may hit gateway notify
|
||||
# subscriptions etc.). Matches the notifier watcher's delay.
|
||||
await asyncio.sleep(5)
|
||||
|
||||
# Health telemetry mirrored from `_cmd_daemon`: warn when ready
|
||||
# queue is non-empty but spawns are 0 for N consecutive ticks —
|
||||
# usually means broken PATH, missing venv, or credential loss.
|
||||
HEALTH_WINDOW = 6
|
||||
bad_ticks = 0
|
||||
last_warn_at = 0
|
||||
|
||||
def _tick_once() -> "Optional[object]":
|
||||
"""Run one dispatch_once; return result or None on error.
|
||||
|
||||
Runs in a worker thread via `asyncio.to_thread`."""
|
||||
conn = None
|
||||
try:
|
||||
conn = _kb.connect()
|
||||
try:
|
||||
_kb.init_db() # idempotent, handles first-run
|
||||
except Exception:
|
||||
pass
|
||||
return _kb.dispatch_once(conn)
|
||||
except Exception:
|
||||
logger.exception("kanban dispatcher: tick failed")
|
||||
return None
|
||||
finally:
|
||||
if conn is not None:
|
||||
try:
|
||||
conn.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def _ready_nonempty() -> bool:
|
||||
"""Cheap probe: is there at least one ready+assigned+unclaimed task?"""
|
||||
conn = None
|
||||
try:
|
||||
conn = _kb.connect()
|
||||
row = conn.execute(
|
||||
"SELECT 1 FROM tasks "
|
||||
"WHERE status = 'ready' AND assignee IS NOT NULL "
|
||||
" AND claim_lock IS NULL LIMIT 1"
|
||||
).fetchone()
|
||||
return row is not None
|
||||
except Exception:
|
||||
return False
|
||||
finally:
|
||||
if conn is not None:
|
||||
try:
|
||||
conn.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
logger.info(
|
||||
"kanban dispatcher: embedded in gateway (interval=%.1fs)", interval
|
||||
)
|
||||
while self._running:
|
||||
try:
|
||||
res = await asyncio.to_thread(_tick_once)
|
||||
if res is not None and getattr(res, "spawned", None):
|
||||
# Quiet by default — only log when something actually
|
||||
# happened, so an idle gateway stays silent.
|
||||
logger.info(
|
||||
"kanban dispatcher: tick spawned=%d reclaimed=%d "
|
||||
"crashed=%d timed_out=%d promoted=%d auto_blocked=%d",
|
||||
len(res.spawned),
|
||||
res.reclaimed,
|
||||
len(res.crashed) if hasattr(res.crashed, "__len__") else 0,
|
||||
len(res.timed_out) if hasattr(res.timed_out, "__len__") else 0,
|
||||
res.promoted,
|
||||
len(res.auto_blocked) if hasattr(res.auto_blocked, "__len__") else 0,
|
||||
)
|
||||
# Health telemetry
|
||||
ready_pending = await asyncio.to_thread(_ready_nonempty)
|
||||
spawned_any = bool(res and getattr(res, "spawned", None))
|
||||
if ready_pending and not spawned_any:
|
||||
bad_ticks += 1
|
||||
else:
|
||||
bad_ticks = 0
|
||||
if bad_ticks >= HEALTH_WINDOW:
|
||||
now = int(time.time())
|
||||
if now - last_warn_at >= 300:
|
||||
logger.warning(
|
||||
"kanban dispatcher stuck: ready queue non-empty for "
|
||||
"%d consecutive ticks but 0 workers spawned. Check "
|
||||
"profile health (venv, PATH, credentials) and "
|
||||
"`hermes kanban list --status ready`.",
|
||||
bad_ticks,
|
||||
)
|
||||
last_warn_at = now
|
||||
except asyncio.CancelledError:
|
||||
logger.debug("kanban dispatcher: cancelled")
|
||||
raise
|
||||
except Exception:
|
||||
logger.exception("kanban dispatcher: unexpected watcher error")
|
||||
|
||||
# Sleep in 1s slices so shutdown is snappy — otherwise a stop()
|
||||
# waits up to `interval` seconds for the current sleep to finish.
|
||||
slept = 0.0
|
||||
while slept < interval and self._running:
|
||||
await asyncio.sleep(min(1.0, interval - slept))
|
||||
slept += 1.0
|
||||
|
||||
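The health telemetry in the dispatcher loop combines a consecutive-tick counter with a rate limit on warnings. A minimal sketch of that bookkeeping as a standalone class follows. `StuckDetector` and `observe` are hypothetical names; the defaults copy the `HEALTH_WINDOW = 6` and 300-second gap seen in the diff.

```python
class StuckDetector:
    """Flags a dispatcher that sees ready work but spawns nothing."""

    def __init__(self, window: int = 6, min_gap: int = 300):
        self.window = window        # consecutive bad ticks before warning
        self.min_gap = min_gap      # minimum seconds between warnings
        self.bad_ticks = 0
        self.last_warn_at = 0

    def observe(self, ready_pending: bool, spawned_any: bool, now: int) -> bool:
        """Record one tick; return True when a warning should be logged."""
        if ready_pending and not spawned_any:
            self.bad_ticks += 1
        else:
            self.bad_ticks = 0
        if self.bad_ticks >= self.window and now - self.last_warn_at >= self.min_gap:
            self.last_warn_at = now
            return True
        return False
```

Passing `now` in explicitly, rather than calling `time.time()` inside, is a choice made here only to keep the sketch easy to test.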
async def _platform_reconnect_watcher(self) -> None:
|
||||
"""Background task that periodically retries connecting failed platforms.
|
||||
|
||||
@@ -4168,6 +4572,14 @@ class GatewayRunner:
|
||||
if _cmd_def_inner and _cmd_def_inner.name == "background":
|
||||
return await self._handle_background_command(event)
|
||||
|
||||
# /kanban must bypass the guard. It writes to a profile-agnostic
|
||||
# DB (kanban.db), not to the running agent's state. In fact
|
||||
# /kanban unblock is often the only way to free a worker that
|
||||
# has blocked waiting for a peer — letting that be dispatched
|
||||
# mid-run is the whole point of the board.
|
||||
if _cmd_def_inner and _cmd_def_inner.name == "kanban":
|
||||
return await self._handle_kanban_command(event)
|
||||
|
||||
# Session-level toggles that are safe to run mid-agent —
|
||||
# /yolo can unblock a pending approval prompt, /verbose cycles
|
||||
# the tool-progress display mode for the ongoing stream.
|
||||
@@ -4415,6 +4827,9 @@ class GatewayRunner:
|
||||
if canonical == "personality":
|
||||
return await self._handle_personality_command(event)
|
||||
|
||||
if canonical == "kanban":
|
||||
return await self._handle_kanban_command(event)
|
||||
|
||||
if canonical == "retry":
|
||||
return await self._handle_retry_command(event)
|
||||
|
||||
@@ -6031,6 +6446,84 @@ class GatewayRunner:
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
async def _handle_kanban_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /kanban — delegate to the shared kanban CLI.
|
||||
|
||||
Run the potentially-blocking DB work in a thread pool so the
|
||||
gateway event loop stays responsive. Read operations (list,
|
||||
show, context, tail) are permitted while an agent is running;
|
||||
mutations are allowed too because the board is profile-agnostic
|
||||
and does not touch the running agent's state.
|
||||
|
||||
For ``/kanban create`` invocations we also auto-subscribe the
|
||||
originating gateway source (platform + chat + thread) to the new
|
||||
task's terminal events, so the user hears back when the worker
|
||||
completes / blocks / auto-blocks / crashes without having to poll.
|
||||
"""
|
||||
import asyncio
|
||||
import re
|
||||
from hermes_cli.kanban import run_slash
|
||||
|
||||
text = (event.text or "").strip()
|
||||
# Strip the leading "/kanban" (with or without slash), leaving args.
|
||||
if text.startswith("/"):
|
||||
text = text.lstrip("/")
|
||||
if text.startswith("kanban"):
|
||||
text = text[len("kanban"):].lstrip()
|
||||
|
||||
is_create = text.split(None, 1)[:1] == ["create"]
|
||||
|
||||
try:
|
||||
output = await asyncio.to_thread(run_slash, text)
|
||||
except Exception as exc: # pragma: no cover - defensive
|
||||
return f"⚠ kanban error: {exc}"
|
||||
|
||||
# Auto-subscribe on create. Parse the task id from the CLI's standard
|
||||
# success line ("Created t_abcd (ready, assignee=...)"). If the user
|
||||
# passed --json we don't subscribe; they're clearly scripting and
|
||||
# can call /kanban notify-subscribe explicitly.
|
||||
if is_create and output:
|
||||
m = re.search(r"Created\s+(t_[0-9a-f]+)\b", output)
|
||||
if m:
|
||||
task_id = m.group(1)
|
||||
try:
|
||||
source = event.source
|
||||
platform = getattr(source, "platform", None)
|
||||
platform_str = (
|
||||
platform.value if hasattr(platform, "value") else str(platform or "")
|
||||
).lower()
|
||||
chat_id = str(getattr(source, "chat_id", "") or "")
|
||||
thread_id = str(getattr(source, "thread_id", "") or "")
|
||||
user_id = str(getattr(source, "user_id", "") or "") or None
|
||||
if platform_str and chat_id:
|
||||
def _sub():
|
||||
from hermes_cli import kanban_db as _kb
|
||||
conn = _kb.connect()
|
||||
try:
|
||||
_kb.add_notify_sub(
|
||||
conn, task_id=task_id,
|
||||
platform=platform_str, chat_id=chat_id,
|
||||
thread_id=thread_id or None,
|
||||
user_id=user_id,
|
||||
)
|
||||
finally:
|
||||
conn.close()
|
||||
await asyncio.to_thread(_sub)
|
||||
output = (
|
||||
output.rstrip()
|
||||
+ f"\n(subscribed — you'll be notified when {task_id} "
|
||||
f"completes or blocks)"
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("kanban create auto-subscribe failed: %s", exc)
|
||||
|
||||
# Gateway messages have practical length caps; truncate long
|
||||
# listings to keep the UX reasonable.
|
||||
if len(output) > 3800:
|
||||
output = output[:3800] + "\n… (truncated; use `hermes kanban …` in your terminal for full output)"
|
||||
return output or "(no output)"
|
||||
|
||||
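The auto-subscribe step depends on two small pieces of text handling: detecting a `create` invocation and pulling the task id out of the CLI's success line. Here is a sketch of just those two pieces, using the same expression and regex as the handler. The helper names `is_create` and `created_task_id` are assumptions for illustration.

```python
import re
from typing import Optional

_CREATED_RE = re.compile(r"Created\s+(t_[0-9a-f]+)\b")


def is_create(args: str) -> bool:
    """True when the first whitespace-separated token is exactly "create"."""
    return args.split(None, 1)[:1] == ["create"]


def created_task_id(output: str) -> Optional[str]:
    """Extract the task id from a "Created t_... (...)" line, if present."""
    m = _CREATED_RE.search(output)
    return m.group(1) if m else None


print(is_create("create 'fix login' --assignee coder"))
print(created_task_id("Created t_ab12 (ready, assignee=coder)"))
```

Output that does not contain the plain success line, such as JSON output, simply fails to match, so no subscription is attempted in that case.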
async def _handle_status_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /status command."""
|
||||
source = event.source
|
||||
|
||||
@@ -11,5 +11,5 @@ Provides subcommands for:
|
||||
- hermes cron - Manage cron jobs
|
||||
"""
|
||||
|
||||
__version__ = "0.11.0"
|
||||
__release_date__ = "2026.4.23"
|
||||
__version__ = "0.12.0"
|
||||
__release_date__ = "2026.4.30"
|
||||
|
||||
@@ -151,6 +151,11 @@ COMMAND_REGISTRY: list[CommandDef] = [
|
||||
CommandDef("curator", "Background skill maintenance (status, run, pin, archive)",
|
||||
"Tools & Skills", args_hint="[subcommand]",
|
||||
subcommands=("status", "run", "pause", "resume", "pin", "unpin", "restore")),
|
||||
CommandDef("kanban", "Multi-profile collaboration board (tasks, links, comments)",
|
||||
"Tools & Skills", args_hint="[subcommand]",
|
||||
subcommands=("list", "ls", "show", "create", "assign", "link", "unlink",
|
||||
"claim", "comment", "complete", "block", "unblock", "archive",
|
||||
"tail", "dispatch", "context", "init", "gc")),
|
||||
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
|
||||
cli_only=True),
|
||||
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
|
||||
|
||||
@@ -1104,6 +1104,24 @@ DEFAULT_CONFIG = {
|
||||
"max_parallel_jobs": None,
|
||||
},
|
||||
|
||||
# Kanban multi-agent coordination — controls the dispatcher loop that
|
||||
# spawns workers for ready tasks. The dispatcher ticks every N seconds
|
||||
# (default 60), reclaims stale claims, promotes dependency-satisfied
|
||||
# todos to ready, and fires `hermes -p <assignee> chat -q ...` for
|
||||
# each claimable ready task. One dispatcher per profile is sufficient;
|
||||
# running more than one on the same kanban.db will race for claims.
|
||||
"kanban": {
|
||||
# Run the dispatcher inside the gateway process. On by default —
|
||||
# the cost is ~300µs every `dispatch_interval_seconds` when idle,
|
||||
# and gateway is the supervisor users already have. Set to false
|
||||
# only if you run the dispatcher as a separate systemd unit or
|
||||
# don't want the gateway to spawn workers.
|
||||
"dispatch_in_gateway": True,
|
||||
# Seconds between dispatcher ticks (idle or not). Lower = snappier
|
||||
# pickup of newly-ready tasks; higher = less SQL pressure.
|
||||
"dispatch_interval_seconds": 60,
|
||||
},
|
||||
|
||||
# execute_code settings — controls the tool used for programmatic tool calls.
|
||||
"code_execution": {
|
||||
# Execution mode:
|
||||
|
||||
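The two `kanban` keys above are read by the gateway's dispatcher watcher shown earlier in this diff, together with the `HERMES_KANBAN_DISPATCH_IN_GATEWAY` environment override. The precedence can be sketched as a pure function. `dispatcher_interval` is a hypothetical name, and the `env` parameter is added here only so the logic can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def dispatcher_interval(cfg, env: Optional[Mapping[str, str]] = None) -> Optional[float]:
    """Return the tick interval in seconds, or None when dispatch is disabled."""
    env = os.environ if env is None else env
    override = env.get("HERMES_KANBAN_DISPATCH_IN_GATEWAY", "").strip().lower()
    if override in ("0", "false", "no", "off"):
        return None  # env escape hatch wins over the YAML config
    kanban_cfg = cfg.get("kanban", {}) if isinstance(cfg, dict) else {}
    if not kanban_cfg.get("dispatch_in_gateway", True):
        return None
    interval = float(kanban_cfg.get("dispatch_interval_seconds", 60) or 60)
    return max(interval, 1.0)  # sanity floor of one second


print(dispatcher_interval({"kanban": {"dispatch_interval_seconds": 15}}, env={}))
```

Note that the environment variable can only disable the dispatcher. A truthy value does not re-enable it when the config sets `dispatch_in_gateway` to false.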
@@ -108,6 +108,49 @@ def _cmd_status(args) -> int:
|
||||
f"last_activity={last}"
|
||||
)
|
||||
|
||||
# Show top 5 most-active and least-active skills by activity_count
|
||||
# (use + view + patch). This is a different signal from
|
||||
# least-recently-active: activity_count reflects frequency,
|
||||
# last_activity_at reflects recency. A skill touched 30 times a year
|
||||
# ago is high-frequency but stale; a skill touched once yesterday is
|
||||
# recent but low-frequency. Both can matter.
|
||||
active_all = by_state.get("active", [])
|
||||
if active_all:
|
||||
most_active = sorted(
|
||||
active_all,
|
||||
key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
|
||||
reverse=True,
|
||||
)[:5]
|
||||
if most_active and (most_active[0].get("activity_count") or 0) > 0:
|
||||
print("\nmost active (top 5):")
|
||||
for r in most_active:
|
||||
last = _fmt_ts(r.get("last_activity_at"))
|
||||
print(
|
||||
f" {r['name']:40s} "
|
||||
f"activity={r.get('activity_count', 0):3d} "
|
||||
f"use={r.get('use_count', 0):3d} "
|
||||
f"view={r.get('view_count', 0):3d} "
|
||||
f"patches={r.get('patch_count', 0):3d} "
|
||||
f"last_activity={last}"
|
||||
)
|
||||
|
||||
least_active = sorted(
|
||||
active_all,
|
||||
key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
|
||||
)[:5]
|
||||
if least_active:
|
||||
print("\nleast active (top 5):")
|
||||
for r in least_active:
|
||||
last = _fmt_ts(r.get("last_activity_at"))
|
||||
print(
|
||||
f" {r['name']:40s} "
|
||||
f"activity={r.get('activity_count', 0):3d} "
|
||||
f"use={r.get('use_count', 0):3d} "
|
||||
f"view={r.get('view_count', 0):3d} "
|
||||
f"patches={r.get('patch_count', 0):3d} "
|
||||
f"last_activity={last}"
|
||||
)
|
||||
|
||||
return 0
|
||||
|
||||
|
||||
|
||||
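The ranking above sorts on a `(activity_count, last_activity_at)` tuple, so frequency decides first and recency breaks ties. Here is a small sketch of that ordering on made-up rows. The function name `rank_skills` and the sample data are assumptions, and timestamps are taken to be ISO-8601 strings so that string comparison matches chronological order.

```python
def rank_skills(rows, n=5):
    """Return (most_active, least_active) lists of at most n rows each."""
    def key(r):
        # Missing counts sort as 0, missing timestamps as the empty string.
        return (r.get("activity_count") or 0, r.get("last_activity_at") or "")

    most = sorted(rows, key=key, reverse=True)[:n]
    least = sorted(rows, key=key)[:n]
    return most, least


rows = [
    {"name": "a", "activity_count": 30, "last_activity_at": "2025-01-01T00:00:00"},
    {"name": "b", "activity_count": 1, "last_activity_at": "2026-04-29T00:00:00"},
    {"name": "c", "activity_count": None, "last_activity_at": None},
    {"name": "d", "activity_count": 30, "last_activity_at": "2026-01-01T00:00:00"},
]
most, least = rank_skills(rows, n=2)
print([r["name"] for r in most], [r["name"] for r in least])
```

With fewer than `2 * n` rows the two lists overlap, which may be worth keeping in mind when reading the status output for a small skill library.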
File diff suppressed because it is too large
File diff suppressed because it is too large
+15
-1
@@ -5041,6 +5041,13 @@ def cmd_slack(args):
|
||||
return 1
|
||||
|
||||
|
||||
def cmd_kanban(args):
|
||||
"""Multi-profile collaboration board."""
|
||||
from hermes_cli.kanban import kanban_command
|
||||
|
||||
return kanban_command(args)
|
||||
|
||||
|
||||
def cmd_hooks(args):
|
||||
"""Shell-hook inspection and management."""
|
||||
from hermes_cli.hooks import hooks_command
|
||||
@@ -7682,7 +7689,7 @@ def cmd_profile(args):
|
||||
if clone_all:
|
||||
print(f"Full copy from {source_label}.")
|
||||
else:
|
||||
print(f"Cloned config, .env, SOUL.md from {source_label}.")
|
||||
print(f"Cloned config, .env, SOUL.md, and skills from {source_label}.")
|
||||
|
||||
# Auto-clone Honcho config for the new profile (only with --clone/--clone-all)
|
||||
if clone or clone_all:
|
||||
@@ -8640,6 +8647,13 @@ def main():
|
||||
|
||||
webhook_parser.set_defaults(func=cmd_webhook)
|
||||
|
||||
# =========================================================================
|
||||
# kanban command — multi-profile collaboration board
|
||||
# =========================================================================
|
||||
from hermes_cli.kanban import build_parser as _build_kanban_parser
|
||||
kanban_parser = _build_kanban_parser(subparsers)
|
||||
kanban_parser.set_defaults(func=cmd_kanban)
|
||||
|
||||
# =========================================================================
|
||||
# hooks command — shell-hook inspection and management
|
||||
# =========================================================================
|
||||
|
||||
@@ -40,6 +40,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
|
||||
("anthropic/claude-sonnet-4.5", ""),
|
||||
("anthropic/claude-haiku-4.5", ""),
|
||||
("openrouter/elephant-alpha", "free"),
|
||||
("openrouter/owl-alpha", "free"),
|
||||
("openai/gpt-5.5", ""),
|
||||
("openai/gpt-5.4-mini", ""),
|
||||
("xiaomi/mimo-v2.5-pro", ""),
|
||||
|
||||
+11
-2
@@ -11,7 +11,7 @@ zero migration needed.
|
||||
Usage::
|
||||
|
||||
hermes profile create coder # fresh profile + bundled skills
|
||||
hermes profile create coder --clone # also copy config, .env, SOUL.md
|
||||
hermes profile create coder --clone # also copy config, .env, SOUL.md, skills
|
||||
hermes profile create coder --clone-all # full copy of source profile
|
||||
coder chat # use via wrapper alias
|
||||
hermes -p coder chat # or via flag
|
||||
@@ -411,7 +411,8 @@ def create_profile(
|
||||
clone_all:
|
||||
If True, do a full copytree of the source (all state).
|
||||
clone_config:
|
||||
If True, copy only config files (config.yaml, .env, SOUL.md).
|
||||
If True, copy config files (config.yaml, .env, SOUL.md), installed
|
||||
skills, and selected profile identity files from the source profile.
|
||||
no_alias:
|
||||
If True, skip wrapper script creation.
|
||||
|
||||
@@ -469,6 +470,14 @@ def create_profile(
|
||||
if src.exists():
|
||||
shutil.copy2(src, profile_dir / filename)
|
||||
|
||||
# Clone installed skills from the source profile. The dashboard's
|
||||
# "clone from default" flow is expected to preserve both bundled
|
||||
# and user-installed skills so the new profile immediately has the
|
||||
# same agent capabilities as the source profile.
|
||||
source_skills = source_dir / "skills"
|
||||
if source_skills.is_dir():
|
||||
shutil.copytree(source_skills, profile_dir / "skills", dirs_exist_ok=True)
|
||||
|
||||
# Clone memory and other subdirectory files
|
||||
for relpath in _CLONE_SUBDIR_FILES:
|
||||
src = source_dir / relpath
|
||||
|
||||
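The skills clone above depends on `shutil.copytree(..., dirs_exist_ok=True)`, which merges into an existing target directory instead of raising. Below is a runnable sketch of that behavior in a temporary directory. `clone_skills` and `demo` are hypothetical names, and the file layout is invented for the example.

```python
import shutil
import tempfile
from pathlib import Path


def clone_skills(source_dir: Path, profile_dir: Path) -> bool:
    """Copy source_dir/skills into profile_dir/skills, merging if it exists."""
    source_skills = source_dir / "skills"
    if not source_skills.is_dir():
        return False
    shutil.copytree(source_skills, profile_dir / "skills", dirs_exist_ok=True)
    return True


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source, profile = Path(tmp) / "default", Path(tmp) / "coder"
        (source / "skills" / "my-skill").mkdir(parents=True)
        (source / "skills" / "my-skill" / "SKILL.md").write_text("user skill")
        # The target already has a skills/ directory with other content.
        (profile / "skills" / "bundled").mkdir(parents=True)
        (profile / "skills" / "bundled" / "SKILL.md").write_text("bundled")

        cloned = clone_skills(source, profile)
        found = sorted(
            p.relative_to(profile).as_posix()
            for p in (profile / "skills").rglob("SKILL.md")
        )
        return cloned, found
```

Files that exist in both trees under the same relative path are overwritten by the source copy. The `dirs_exist_ok` parameter requires Python 3.8 or later.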
@@ -2344,6 +2344,254 @@ async def delete_cron_job(job_id: str):
|
||||
return {"ok": True}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Profile management endpoints (minimal — list/create/rename/delete + SOUL.md)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class ProfileCreate(BaseModel):
|
||||
name: str
|
||||
clone_from_default: bool = False
|
||||
|
||||
|
||||
class ProfileRename(BaseModel):
|
||||
new_name: str
|
||||
|
||||
|
||||
class ProfileSoulUpdate(BaseModel):
|
||||
content: str
|
||||
|
||||
|
||||
def _profile_attr(info, name: str, default: Any = None) -> Any:
|
||||
try:
|
||||
return getattr(info, name)
|
||||
except Exception:
|
||||
return default
|
||||
|
||||
|
||||
def _profile_to_dict(info) -> Dict[str, Any]:
|
||||
return {
|
||||
"name": _profile_attr(info, "name", ""),
|
||||
"path": str(_profile_attr(info, "path", "")),
|
||||
"is_default": bool(_profile_attr(info, "is_default", False)),
|
||||
"model": _profile_attr(info, "model"),
|
||||
"provider": _profile_attr(info, "provider"),
|
||||
"has_env": bool(_profile_attr(info, "has_env", False)),
|
||||
"skill_count": int(_profile_attr(info, "skill_count", 0) or 0),
|
||||
}
|
||||
|
||||
|
||||
def _fallback_profile_dicts(profiles_mod) -> List[Dict[str, Any]]:
|
||||
def _safe(callable_, default):
|
||||
try:
|
||||
return callable_()
|
||||
except Exception:
|
||||
return default
|
||||
|
||||
profiles: List[Dict[str, Any]] = []
|
||||
default_home = profiles_mod._get_default_hermes_home()
|
||||
if default_home.is_dir():
|
||||
model, provider = _safe(lambda: profiles_mod._read_config_model(default_home), (None, None))
|
||||
profiles.append({
|
||||
"name": "default",
|
||||
"path": str(default_home),
|
||||
"is_default": True,
|
||||
"model": model,
|
||||
"provider": provider,
|
||||
"has_env": (default_home / ".env").exists(),
|
||||
"skill_count": _safe(lambda: profiles_mod._count_skills(default_home), 0),
|
||||
})
|
||||
|
||||
profiles_root = profiles_mod._get_profiles_root()
|
||||
if profiles_root.is_dir():
|
||||
for entry in sorted(profiles_root.iterdir()):
|
||||
if not entry.is_dir() or not profiles_mod._PROFILE_ID_RE.match(entry.name):
|
||||
continue
|
||||
model, provider = _safe(lambda entry=entry: profiles_mod._read_config_model(entry), (None, None))
|
||||
profiles.append({
|
||||
"name": entry.name,
|
||||
"path": str(entry),
|
||||
"is_default": False,
|
||||
"model": model,
|
||||
"provider": provider,
|
||||
"has_env": (entry / ".env").exists(),
|
||||
"skill_count": _safe(lambda entry=entry: profiles_mod._count_skills(entry), 0),
|
||||
})
|
||||
|
||||
return profiles
|
||||
|
||||
|
||||
def _resolve_profile_dir(name: str) -> Path:
|
||||
"""Validate ``name`` and resolve to its directory or raise an HTTPException."""
|
||||
from hermes_cli import profiles as profiles_mod
|
||||
try:
|
||||
profiles_mod.validate_profile_name(name)
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
if not profiles_mod.profile_exists(name):
|
||||
raise HTTPException(status_code=404, detail=f"Profile '{name}' does not exist.")
|
||||
return profiles_mod.get_profile_dir(name)
|
||||
|
||||
|
||||
def _profile_setup_command(name: str) -> str:
|
||||
"""Return the shell command used to configure a profile in the CLI."""
|
||||
_resolve_profile_dir(name)
|
||||
return "hermes setup" if name == "default" else f"{name} setup"
|
||||
|
||||
|
||||
@app.get("/api/profiles")
|
||||
async def list_profiles_endpoint():
|
||||
from hermes_cli import profiles as profiles_mod
|
||||
try:
|
||||
return {"profiles": [_profile_to_dict(p) for p in profiles_mod.list_profiles()]}
|
||||
except Exception:
|
||||
_log.exception("GET /api/profiles failed; falling back to profile directory scan")
|
||||
return {"profiles": _fallback_profile_dicts(profiles_mod)}
|
||||
|
||||
|
||||
@app.post("/api/profiles")
|
||||
async def create_profile_endpoint(body: ProfileCreate):
|
||||
from hermes_cli import profiles as profiles_mod
|
||||
try:
|
||||
path = profiles_mod.create_profile(
|
||||
name=body.name,
|
||||
clone_from="default" if body.clone_from_default else None,
|
||||
clone_config=body.clone_from_default,
|
||||
)
|
||||
# Match the CLI's profile-create flow: fresh named profiles get the
|
||||
# bundled skills installed. When cloning from default, create_profile()
|
||||
# has already copied the source profile's skills, including any
|
||||
# user-installed skills.
|
||||
if not body.clone_from_default:
|
||||
profiles_mod.seed_profile_skills(path, quiet=True)
|
||||
|
||||
# Match the CLI's profile-create flow: named profiles should get a
|
||||
# wrapper in ~/.local/bin when the alias is safe to create.
|
||||
collision = profiles_mod.check_alias_collision(body.name)
|
||||
if not collision:
|
||||
profiles_mod.create_wrapper_script(body.name)
|
||||
except (ValueError, FileExistsError, FileNotFoundError) as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
except Exception as e:
|
||||
_log.exception("POST /api/profiles failed")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
return {"ok": True, "name": body.name, "path": str(path)}
|
||||
|
||||
|
||||
@app.get("/api/profiles/{name}/setup-command")
|
||||
async def get_profile_setup_command(name: str):
|
||||
return {"command": _profile_setup_command(name)}
|
||||
|
||||
|
||||
@app.post("/api/profiles/{name}/open-terminal")
|
||||
async def open_profile_terminal_endpoint(name: str):
|
||||
try:
|
||||
command = _profile_setup_command(name)
|
||||
|
||||
if sys.platform.startswith("win"):
|
||||
subprocess.Popen(["cmd.exe", "/c", "start", "", command])
|
||||
elif sys.platform == "darwin":
|
||||
escaped = command.replace("\\", "\\\\").replace('"', '\\"')
|
||||
applescript = (
|
||||
'tell application "Terminal"\n'
|
||||
"activate\n"
|
||||
f'do script "{escaped}"\n'
|
||||
"end tell"
|
||||
)
|
||||
subprocess.Popen(["osascript", "-e", applescript])
|
||||
else:
|
||||
terminal_commands = [
|
||||
("x-terminal-emulator", ["x-terminal-emulator", "-e", "sh", "-lc", command]),
|
||||
("gnome-terminal", ["gnome-terminal", "--", "sh", "-lc", command]),
|
||||
("konsole", ["konsole", "-e", "sh", "-lc", command]),
|
||||
("xfce4-terminal", ["xfce4-terminal", "-e", f"sh -lc '{command}'"]),
|
||||
("mate-terminal", ["mate-terminal", "-e", f"sh -lc '{command}'"]),
|
||||
("lxterminal", ["lxterminal", "-e", f"sh -lc '{command}'"]),
|
||||
("tilix", ["tilix", "-e", "sh", "-lc", command]),
|
||||
("alacritty", ["alacritty", "-e", "sh", "-lc", command]),
|
||||
("kitty", ["kitty", "sh", "-lc", command]),
|
||||
("xterm", ["xterm", "-e", "sh", "-lc", command]),
|
||||
]
|
||||
for executable, popen_args in terminal_commands:
|
||||
if subprocess.call(
|
||||
["which", executable],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
) == 0:
|
||||
subprocess.Popen(popen_args)
|
||||
break
|
||||
else:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="No supported terminal emulator found",
|
||||
)
|
||||
except FileNotFoundError as e:
|
||||
raise HTTPException(status_code=404, detail=str(e))
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
_log.exception("POST /api/profiles/%s/open-terminal failed", name)
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
return {"ok": True, "command": command}
|
||||
|
||||
|
||||
@app.patch("/api/profiles/{name}")
|
||||
async def rename_profile_endpoint(name: str, body: ProfileRename):
|
||||
from hermes_cli import profiles as profiles_mod
|
||||
try:
|
||||
path = profiles_mod.rename_profile(name, body.new_name)
|
||||
except FileNotFoundError as e:
|
||||
raise HTTPException(status_code=404, detail=str(e))
|
||||
except (ValueError, FileExistsError) as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
except Exception as e:
|
||||
_log.exception("PATCH /api/profiles/%s failed", name)
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
return {"ok": True, "name": body.new_name, "path": str(path)}
|
||||
|
||||
|
||||
@app.delete("/api/profiles/{name}")
|
||||
async def delete_profile_endpoint(name: str):
|
||||
"""Delete a profile. The dashboard collects the user's confirmation in
|
||||
its own dialog before this request, so we always pass ``yes=True`` to
|
||||
skip the CLI's interactive prompt."""
|
||||
from hermes_cli import profiles as profiles_mod
|
||||
try:
|
||||
path = profiles_mod.delete_profile(name, yes=True)
|
||||
except FileNotFoundError as e:
|
||||
raise HTTPException(status_code=404, detail=str(e))
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
except Exception as e:
|
||||
_log.exception("DELETE /api/profiles/%s failed", name)
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
return {"ok": True, "path": str(path)}
|
||||
|
||||
|
||||
@app.get("/api/profiles/{name}/soul")
|
||||
async def get_profile_soul(name: str):
|
||||
soul_path = _resolve_profile_dir(name) / "SOUL.md"
|
||||
if soul_path.exists():
|
||||
try:
|
||||
return {"content": soul_path.read_text(encoding="utf-8"), "exists": True}
|
||||
except OSError as e:
|
||||
raise HTTPException(status_code=500, detail=f"Could not read SOUL.md: {e}")
|
||||
return {"content": "", "exists": False}
|
||||
|
||||
|
||||
@app.put("/api/profiles/{name}/soul")
|
||||
async def update_profile_soul(name: str, body: ProfileSoulUpdate):
|
||||
soul_path = _resolve_profile_dir(name) / "SOUL.md"
|
||||
try:
|
||||
soul_path.write_text(body.content, encoding="utf-8")
|
||||
except OSError as e:
|
||||
_log.exception("PUT /api/profiles/%s/soul failed", name)
|
||||
raise HTTPException(status_code=500, detail=f"Could not write SOUL.md: {e}")
|
||||
return {"ok": True}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Skills & Tools endpoints
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
+1
-1
@@ -163,7 +163,7 @@
|
||||
for entry in "''${ENTRIES[@]}"; do
|
||||
IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
|
||||
echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
|
||||
-OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --rebuild --print-build-logs 2>&1)
+OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
|
||||
STATUS=$?
|
||||
if [ "$STATUS" -eq 0 ]; then
|
||||
echo " ok"
|
||||
|
||||
@@ -0,0 +1,372 @@
|
||||
---
|
||||
name: shopify
|
||||
description: Shopify Admin & Storefront GraphQL APIs via curl. Products, orders, customers, inventory, metafields.
|
||||
version: 1.0.0
|
||||
author: community
|
||||
license: MIT
|
||||
prerequisites:
|
||||
env_vars: [SHOPIFY_ACCESS_TOKEN, SHOPIFY_STORE_DOMAIN]
|
||||
commands: [curl, jq]
|
||||
required_environment_variables:
|
||||
- name: SHOPIFY_ACCESS_TOKEN
|
||||
prompt: Shopify Admin API access token (starts with shpat_)
|
||||
help: "Shopify admin → Settings → Apps and sales channels → Develop apps → Create an app → API credentials. Token shown ONCE on install."
|
||||
- name: SHOPIFY_STORE_DOMAIN
|
||||
prompt: Your shop's myshopify.com domain, without protocol (e.g. my-store.myshopify.com)
|
||||
help: "The permanent myshopify.com domain, not your custom domain."
|
||||
- name: SHOPIFY_API_VERSION
|
||||
prompt: Shopify API version (default 2026-01)
|
||||
help: "Stable quarterly version. Override if you need an older one."
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [Shopify, E-commerce, Commerce, API, GraphQL]
|
||||
related_skills: [airtable, xurl]
|
||||
homepage: https://shopify.dev/docs/api/admin-graphql
|
||||
---
|
||||
|
||||
# Shopify — Admin & Storefront GraphQL APIs
|
||||
|
||||
Work with Shopify stores directly through `curl`: list products, manage inventory, pull orders, update customers, read metafields. No SDK, no app framework — just the GraphQL endpoint and a custom-app access token.
|
||||
|
||||
The REST Admin API has been legacy since October 1, 2024 and only receives security fixes. **Use GraphQL Admin** for all admin work. Use **Storefront GraphQL** for read-only customer-facing queries (products, collections, cart).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
1. In Shopify admin: **Settings → Apps and sales channels → Develop apps → Create an app**.
|
||||
2. Click **Configure Admin API scopes**, select what you need (examples below), save.
|
||||
3. **Install app** → the Admin API access token appears ONCE. Copy it immediately — Shopify will never show it again. Tokens start with `shpat_`.
|
||||
4. Save to `~/.hermes/.env`:
|
||||
```
|
||||
SHOPIFY_ACCESS_TOKEN=shpat_xxxxxxxxxxxxxxxxxxxx
|
||||
SHOPIFY_STORE_DOMAIN=my-store.myshopify.com
|
||||
SHOPIFY_API_VERSION=2026-01
|
||||
```
|
||||
|
||||
> **Heads up:** As of January 1, 2026, new "legacy custom apps" can no longer be created in the Shopify admin. New setups should use the **Dev Dashboard** (`shopify.dev/docs/apps/build/dev-dashboard`). Existing admin-created apps keep working. If the user's shop has no existing custom app, direct them to the Dev Dashboard instead of the admin flow described above.
|
||||
|
||||
Common scopes by task:
|
||||
- Products / collections: `read_products`, `write_products`
|
||||
- Inventory: `read_inventory`, `write_inventory`, `read_locations`
|
||||
- Orders: `read_orders`, `write_orders` (only the last 60 days of orders without `read_all_orders`)
|
||||
- Customers: `read_customers`, `write_customers`
|
||||
- Draft orders: `read_draft_orders`, `write_draft_orders`
|
||||
- Fulfillments: `read_fulfillments`, `write_fulfillments`
|
||||
- Metafields / metaobjects: covered by the matching resource scopes
|
||||
|
||||
## API Basics
|
||||
|
||||
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/admin/api/$SHOPIFY_API_VERSION/graphql.json`
|
||||
- **Auth header:** `X-Shopify-Access-Token: $SHOPIFY_ACCESS_TOKEN` (NOT `Authorization: Bearer`)
|
||||
- **Method:** always `POST`, always `Content-Type: application/json`, body is `{"query": "...", "variables": {...}}`
|
||||
- **HTTP 200 does not mean success.** GraphQL returns errors in a top-level `errors` array and per-field `userErrors`. Always check both.
|
||||
- **IDs are GID strings:** `gid://shopify/Product/10079467700516`, `gid://shopify/ProductVariant/...`, `gid://shopify/Order/...`. Pass these verbatim; don't strip the prefix.
|
||||
- **Rate limit:** calculated via query cost (leaky bucket). Each response has `extensions.cost` with `requestedQueryCost`, `actualQueryCost`, `throttleStatus.{currentlyAvailable, maximumAvailable, restoreRate}`. Back off when `currentlyAvailable` drops below your next query's cost. Standard shops = 100 points bucket, 50/s restore; Plus = 1000/100.
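
The back-off rule in the rate-limit bullet can be sketched as a small helper. This is a minimal illustration rather than part of the skill: `throttle_wait_seconds` is a hypothetical name, and it assumes you pass in `currentlyAvailable`, your own estimate of the next query's cost, and `restoreRate`, all read from `extensions.cost.throttleStatus` in a previous response.

```bash
# Hypothetical helper (an assumption, not part of the Shopify API or this skill).
# Prints how many whole seconds to wait before a query of the given cost
# fits in the bucket.
#   $1 = throttleStatus.currentlyAvailable
#   $2 = estimated cost of the next query
#   $3 = throttleStatus.restoreRate (points per second)
throttle_wait_seconds() {
  # Drop any fractional part (e.g. "50.0") so shell integer math works.
  local available="${1%%.*}" cost="${2%%.*}" restore_rate="${3%%.*}"
  if [ "$available" -ge "$cost" ]; then
    echo 0
    return
  fi
  local deficit=$(( cost - available ))
  # Round up: a partially restored bucket is still too small.
  echo $(( (deficit + restore_rate - 1) / restore_rate ))
}
```

A caller could then run `sleep "$(throttle_wait_seconds "$available" "$next_cost" "$restore_rate")"` before sending the next request.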
|
||||
|
||||
Base curl pattern (reusable):
|
||||
|
||||
```bash
shop_gql() {
  local query="$1"
  # Default to an empty JSON object. Note: "${2:-{}}" would append a stray "}"
  # whenever $2 is set, so the default is applied in a separate step.
  local variables="${2:-}"
  [ -n "$variables" ] || variables='{}'
  curl -sS -X POST \
    "https://${SHOPIFY_STORE_DOMAIN}/admin/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
    -H "Content-Type: application/json" \
    -H "X-Shopify-Access-Token: ${SHOPIFY_ACCESS_TOKEN}" \
    --data "$(jq -nc --arg q "$query" --argjson v "$variables" '{query: $q, variables: $v}')"
}
```
|
||||
|
||||
Pipe through `jq` for readable output. `-sS` keeps errors visible but hides the progress bar.
|
||||
|
||||
## Discovery
|
||||
|
||||
### Shop info + current API version
|
||||
```bash
|
||||
shop_gql '{ shop { name myshopifyDomain primaryDomain { url } currencyCode plan { displayName } } }' | jq
|
||||
```
|
||||
|
||||
### List all supported API versions
|
||||
```bash
|
||||
shop_gql '{ publicApiVersions { handle supported } }' | jq '.data.publicApiVersions[] | select(.supported)'
|
||||
```
|
||||
|
||||
## Products
|
||||
|
||||
### Search products (first 20 matching query)
|
||||
```bash
|
||||
shop_gql '
|
||||
query($q: String!) {
|
||||
products(first: 20, query: $q) {
|
||||
edges { node { id title handle status totalInventory variants(first: 5) { edges { node { id sku price inventoryQuantity } } } } }
|
||||
pageInfo { hasNextPage endCursor }
|
||||
}
|
||||
}' '{"q":"hoodie status:active"}' | jq
|
||||
```
|
||||
|
||||
Query syntax supports `title:`, `sku:`, `vendor:`, `product_type:`, `status:active`, `tag:`, `created_at:>2025-01-01`. Full grammar: https://shopify.dev/docs/api/usage/search-syntax
|
||||
|
||||
### Paginate products (cursor)
|
||||
```bash
|
||||
shop_gql '
|
||||
query($cursor: String) {
|
||||
products(first: 100, after: $cursor) {
|
||||
edges { cursor node { id handle } }
|
||||
pageInfo { hasNextPage endCursor }
|
||||
}
|
||||
}' '{"cursor":null}'
|
||||
# subsequent calls: pass the previous endCursor
|
||||
```
|
||||
|
||||
### Get a product with variants + metafields
|
||||
```bash
|
||||
shop_gql '
|
||||
query($id: ID!) {
|
||||
product(id: $id) {
|
||||
id title handle descriptionHtml tags status
|
||||
variants(first: 20) { edges { node { id sku price compareAtPrice inventoryQuantity selectedOptions { name value } } } }
|
||||
metafields(first: 20) { edges { node { namespace key type value } } }
|
||||
}
|
||||
}' '{"id":"gid://shopify/Product/10079467700516"}' | jq
|
||||
```
|
||||
|
||||
### Create a product with one variant
|
||||
```bash
|
||||
shop_gql '
|
||||
mutation($input: ProductCreateInput!) {
|
||||
productCreate(product: $input) {
|
||||
product { id handle }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{"input":{"title":"Test Hoodie","status":"DRAFT","vendor":"Hermes","productType":"Apparel","tags":["test"]}}'
|
||||
```
|
||||
|
||||
Variants now have their own mutations in recent versions:
|
||||
|
||||
```bash
|
||||
# Add variants after creating the product
|
||||
shop_gql '
|
||||
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
|
||||
productVariantsBulkCreate(productId: $productId, variants: $variants) {
|
||||
productVariants { id sku price }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{"productId":"gid://shopify/Product/...","variants":[{"optionValues":[{"optionName":"Size","name":"M"}],"price":"49.00","inventoryItem":{"sku":"HD-M","tracked":true}}]}'
|
||||
```
|
||||
|
||||
### Update price / SKU
|
||||
```bash
|
||||
shop_gql '
|
||||
mutation($productId: ID!, $variants: [ProductVariantsBulkInput!]!) {
|
||||
productVariantsBulkUpdate(productId: $productId, variants: $variants) {
|
||||
productVariants { id sku price }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{"productId":"gid://shopify/Product/...","variants":[{"id":"gid://shopify/ProductVariant/...","price":"55.00"}]}'
|
||||
```
|
||||
|
||||
## Orders
|
||||
|
||||
### List recent orders (last 60 days only without `read_all_orders`)
|
||||
```bash
|
||||
shop_gql '
|
||||
{
|
||||
orders(first: 20, reverse: true, query: "financial_status:paid") {
|
||||
edges { node {
|
||||
id name createdAt displayFinancialStatus displayFulfillmentStatus
|
||||
totalPriceSet { shopMoney { amount currencyCode } }
|
||||
customer { id displayName email }
|
||||
lineItems(first: 10) { edges { node { title quantity sku } } }
|
||||
} }
|
||||
}
|
||||
}' | jq
|
||||
```
|
||||
|
||||
Useful order query filters: `financial_status:paid|pending|refunded`, `fulfillment_status:unfulfilled|fulfilled`, `created_at:>2025-01-01`, `tag:gift`, `email:foo@example.com`.
|
||||
|
||||
### Fetch a single order with shipping address
|
||||
```bash
|
||||
shop_gql '
|
||||
query($id: ID!) {
|
||||
order(id: $id) {
|
||||
id name email
|
||||
shippingAddress { name address1 address2 city province country zip phone }
|
||||
lineItems(first: 50) { edges { node { title quantity variant { sku } originalUnitPriceSet { shopMoney { amount currencyCode } } } } }
|
||||
transactions { id kind status amountSet { shopMoney { amount currencyCode } } }
|
||||
}
|
||||
}' '{"id":"gid://shopify/Order/...."}' | jq
|
||||
```
|
||||
|
||||
## Customers
|
||||
|
||||
```bash
|
||||
# Search
|
||||
shop_gql '
|
||||
{
|
||||
customers(first: 10, query: "email:*@example.com") {
|
||||
edges { node { id email displayName numberOfOrders amountSpent { amount currencyCode } } }
|
||||
}
|
||||
}'
|
||||
|
||||
# Create
|
||||
shop_gql '
|
||||
mutation($input: CustomerInput!) {
|
||||
customerCreate(input: $input) {
|
||||
customer { id email }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{"input":{"email":"test@example.com","firstName":"Test","lastName":"User","tags":["api-created"]}}'
|
||||
```
|
||||
|
||||
## Inventory
|
||||
|
||||
Inventory lives on **inventory items** tied to variants; quantities are tracked per **location**.
|
||||
|
||||
```bash
|
||||
# Get inventory for a variant across all locations
|
||||
shop_gql '
|
||||
query($id: ID!) {
|
||||
productVariant(id: $id) {
|
||||
id sku
|
||||
inventoryItem {
|
||||
id tracked
|
||||
inventoryLevels(first: 10) {
|
||||
edges { node { location { id name } quantities(names: ["available","on_hand","committed"]) { name quantity } } }
|
||||
}
|
||||
}
|
||||
}
|
||||
}' '{"id":"gid://shopify/ProductVariant/..."}'
|
||||
```
|
||||
|
||||
Adjust stock (delta) — uses `inventoryAdjustQuantities`:
|
||||
|
||||
```bash
|
||||
shop_gql '
|
||||
mutation($input: InventoryAdjustQuantitiesInput!) {
|
||||
inventoryAdjustQuantities(input: $input) {
|
||||
inventoryAdjustmentGroup { reason changes { name delta } }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{
|
||||
"input": {
|
||||
"reason": "correction",
|
||||
"name": "available",
|
||||
"changes": [{"delta": 5, "inventoryItemId": "gid://shopify/InventoryItem/...", "locationId": "gid://shopify/Location/..."}]
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
Set absolute stock (not delta) — `inventorySetQuantities`:
|
||||
|
||||
```bash
|
||||
shop_gql '
|
||||
mutation($input: InventorySetQuantitiesInput!) {
|
||||
inventorySetQuantities(input: $input) {
|
||||
inventoryAdjustmentGroup { id }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{"input":{"reason":"correction","name":"available","ignoreCompareQuantity":true,"quantities":[{"inventoryItemId":"gid://shopify/InventoryItem/...","locationId":"gid://shopify/Location/...","quantity":100}]}}'
|
||||
```
|
||||
|
||||
## Metafields & Metaobjects
|
||||
|
||||
Metafields attach custom data to resources (products, customers, orders, shop).
|
||||
|
||||
```bash
|
||||
# Read
|
||||
shop_gql '
|
||||
query($id: ID!) {
|
||||
product(id: $id) {
|
||||
metafields(first: 10, namespace: "custom") {
|
||||
edges { node { key type value } }
|
||||
}
|
||||
}
|
||||
}' '{"id":"gid://shopify/Product/..."}'
|
||||
|
||||
# Write (works for any owner type)
|
||||
shop_gql '
|
||||
mutation($metafields: [MetafieldsSetInput!]!) {
|
||||
metafieldsSet(metafields: $metafields) {
|
||||
metafields { id key namespace }
|
||||
userErrors { field message code }
|
||||
}
|
||||
}' '{"metafields":[{"ownerId":"gid://shopify/Product/...","namespace":"custom","key":"care_instructions","type":"multi_line_text_field","value":"Wash cold. Tumble dry low."}]}'
|
||||
```
|
||||
|
||||
## Storefront API (public read-only)
|
||||
|
||||
The Storefront API uses a different endpoint and a different token, and is meant for customer-facing apps and Hydrogen-style headless setups. The headers differ:
|
||||
|
||||
- **Endpoint:** `https://$SHOPIFY_STORE_DOMAIN/api/$SHOPIFY_API_VERSION/graphql.json`
|
||||
- **Auth header (public):** `X-Shopify-Storefront-Access-Token: <public token>` — embeddable in browser
|
||||
- **Auth header (private):** `Shopify-Storefront-Private-Token: <private token>` — server-only
|
||||
|
||||
```bash
|
||||
curl -sS -X POST \
|
||||
"https://${SHOPIFY_STORE_DOMAIN}/api/${SHOPIFY_API_VERSION:-2026-01}/graphql.json" \
|
||||
-H "Content-Type: application/json" \
|
||||
-H "X-Shopify-Storefront-Access-Token: ${SHOPIFY_STOREFRONT_TOKEN}" \
|
||||
-d '{"query":"{ shop { name } products(first: 5) { edges { node { id title handle } } } }"}' | jq
|
||||
```
|
||||
|
||||
## Bulk Operations
|
||||
|
||||
For dumps larger than rate limits allow (full product catalog, all orders for a year):
|
||||
|
||||
```bash
|
||||
# 1. Start bulk query
|
||||
shop_gql '
|
||||
mutation {
|
||||
bulkOperationRunQuery(query: """
|
||||
{ products { edges { node { id title handle variants { edges { node { sku price } } } } } } }
|
||||
""") {
|
||||
bulkOperation { id status }
|
||||
userErrors { field message }
|
||||
}
|
||||
}'
|
||||
|
||||
# 2. Poll status
|
||||
shop_gql '{ currentBulkOperation { id status errorCode objectCount fileSize url partialDataUrl } }'
|
||||
|
||||
# 3. When status=COMPLETED, download the JSONL file
|
||||
curl -sS "$URL" > products.jsonl
|
||||
```
|
||||
|
||||
Each JSONL line is a node, and nested connections are emitted as separate lines with `__parentId`. Reassemble client-side if needed.
|
||||
|
||||
## Webhooks
|
||||
|
||||
Subscribe to events so you don't have to poll:
|
||||
|
||||
```bash
|
||||
shop_gql '
|
||||
mutation($topic: WebhookSubscriptionTopic!, $sub: WebhookSubscriptionInput!) {
|
||||
webhookSubscriptionCreate(topic: $topic, webhookSubscription: $sub) {
|
||||
webhookSubscription { id topic endpoint { __typename ... on WebhookHttpEndpoint { callbackUrl } } }
|
||||
userErrors { field message }
|
||||
}
|
||||
}' '{"topic":"ORDERS_CREATE","sub":{"callbackUrl":"https://example.com/webhook","format":"JSON"}}'
|
||||
```
|
||||
|
||||
Verify incoming webhook HMAC using the app's client secret (not the access token):
|
||||
|
||||
```bash
|
||||
printf '%s' "$REQUEST_BODY" | openssl dgst -sha256 -hmac "$APP_SECRET" -binary | base64
|
||||
# Compare to X-Shopify-Hmac-Sha256 header
|
||||
```
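
The comparison step can be wrapped in a function so a webhook handler script can branch on the result. This is a hedged sketch: `verify_shopify_hmac` is a hypothetical name, and it assumes the raw request body, the app's client secret, and the `X-Shopify-Hmac-Sha256` header value are already available as strings.

```bash
# Hypothetical helper (an assumption, not provided by Shopify or this skill).
# Returns 0 when the digest computed from the body matches the header value.
#   $1 = raw request body, exactly as received
#   $2 = app client secret
#   $3 = value of the X-Shopify-Hmac-Sha256 header
verify_shopify_hmac() {
  local body="$1" secret="$2" header_hmac="$3"
  local computed
  # printf '%s' avoids the trailing newline and option parsing of echo.
  computed=$(printf '%s' "$body" | openssl dgst -sha256 -hmac "$secret" -binary | base64)
  [ "$computed" = "$header_hmac" ]
}
```

Usage: `verify_shopify_hmac "$REQUEST_BODY" "$APP_SECRET" "$HEADER_HMAC" || echo "rejecting webhook" >&2`. The shell string comparison is not constant-time, so treat this as a debugging aid rather than a hardened check.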
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- **REST endpoints still exist but are frozen.** Don't write new integrations against `/admin/api/.../products.json`. Use GraphQL.
|
||||
- **Token format check.** Admin tokens start with `shpat_`. Storefront public tokens with `shpua_`. If you send a token with the wrong header (for example, an Admin token in the Storefront header), every request returns 401 without a useful error body.
|
||||
- **403 with a valid token = missing scope.** Shopify returns `{"errors":[{"message":"Access denied for ..."}]}`. Re-configure Admin API scopes on the app, then reinstall to regenerate the token.
|
||||
- **`userErrors` is empty != success.** Also check `data.<mutation>.<resource>` is non-null. Some failures populate neither — inspect the whole response.
|
||||
- **GID vs numeric ID.** Legacy REST gave numeric IDs; GraphQL wants full GID strings. To convert: `gid://shopify/Product/<numeric>`.
|
||||
- **Rate limit surprise.** A single `products(first: 250)` with deep nesting can cost 1000+ points and throttle immediately on a standard-plan shop. Start narrow, read `extensions.cost`, adjust.
|
||||
- **Pagination order.** `products(first: N, reverse: true)` sorts by `id DESC`, not `created_at`. Use `sortKey: CREATED_AT, reverse: true` for "newest first."
|
||||
- **`read_all_orders` for historical data.** Without it, `orders(...)` silently caps at the 60-day window. You won't get an error, just fewer results than expected. For Shopify Plus merchants with many orders, request this scope via the app's protected-data settings.
|
||||
- **Money amounts are strings.** Amounts come back as `"49.00"` not `49.0`. Don't `jq tonumber` blindly if you care about zero-padding.
|
||||
- **Multi-currency Money fields** have `shopMoney` (store's currency) AND `presentmentMoney` (customer's). Pick one consistently.
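
The GID conversion from the pitfalls above can be sketched as two small helpers. Both names (`to_gid`, `gid_to_numeric`) are hypothetical and exist only for illustration; the only assumption about Shopify is the `gid://shopify/<Type>/<numeric>` shape already shown in this document.

```bash
# Hypothetical helpers (assumptions, not part of the skill).

# Build a GID from a resource type and a legacy numeric ID.
#   to_gid Product 10079467700516
to_gid() {
  printf 'gid://shopify/%s/%s\n' "$1" "$2"
}

# Extract the numeric ID from a GID, dropping any "?query" suffix.
#   gid_to_numeric gid://shopify/Product/10079467700516
gid_to_numeric() {
  local tail="${1##*/}"
  printf '%s\n' "${tail%%\?*}"
}
```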
|
||||
|
||||
## Safety
|
||||
|
||||
Mutations in Shopify are real: they create products, issue refunds, cancel orders, and ship fulfillments. Before running `productDelete`, `orderCancel`, `refundCreate`, or any bulk mutation, state clearly what the change is and on which shop, then confirm with the user. There is no staging clone of production data unless the user has a separate dev store.
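
One way to make the confirmation step explicit in a script is a guard function that prints the change and the target shop, then requires a typed "yes". This is only a sketch: `confirm_mutation` is a hypothetical name, and it assumes the answer arrives on standard input.

```bash
# Hypothetical guard (an assumption, not part of the skill).
# Describes the pending change on stderr and returns 0 only on an explicit "yes".
#   $1 = human-readable description of the mutation
#   $2 = shop domain (defaults to SHOPIFY_STORE_DOMAIN)
confirm_mutation() {
  local description="$1"
  local shop="${2:-${SHOPIFY_STORE_DOMAIN:-unknown shop}}"
  printf 'About to run: %s\nShop: %s\nType "yes" to continue: ' "$description" "$shop" >&2
  local answer=""
  # A failed read (for example, closed stdin) leaves the answer empty, which counts as "no".
  read -r answer || true
  [ "$answer" = "yes" ]
}
```

Usage: `confirm_mutation "orderCancel on gid://shopify/Order/..." && shop_gql "$MUTATION" "$VARIABLES"`, where `$MUTATION` and `$VARIABLES` are placeholders for the caller's own query and variables.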
|
||||
+1601
File diff suppressed because it is too large
+752
@@ -0,0 +1,752 @@
|
||||
/*
|
||||
* Hermes Kanban — dashboard plugin styles.
|
||||
*
|
||||
* Colors reference theme CSS vars so the board reskins with the
* active dashboard theme; only the status dots and the "full"
* progress pill use fixed accent colors.
|
||||
*/
|
||||
|
||||
.hermes-kanban {
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
/* ---- Columns layout -------------------------------------------------- */
|
||||
|
||||
.hermes-kanban-columns {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
|
||||
gap: 0.75rem;
|
||||
align-items: start;
|
||||
}
|
||||
|
||||
.hermes-kanban-column {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
background: color-mix(in srgb, var(--color-card) 85%, transparent);
|
||||
border: 1px solid var(--color-border);
|
||||
border-radius: var(--radius);
|
||||
padding: 0.5rem;
|
||||
min-height: 200px;
|
||||
max-height: calc(100vh - 220px);
|
||||
transition: border-color 120ms ease, background-color 120ms ease;
|
||||
}
|
||||
|
||||
.hermes-kanban-column--drop {
|
||||
border-color: var(--color-ring);
|
||||
background: color-mix(in srgb, var(--color-ring) 8%, var(--color-card));
|
||||
}
|
||||
|
||||
.hermes-kanban-column-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
padding: 0.25rem 0.25rem 0.35rem;
|
||||
font-weight: 600;
|
||||
font-size: 0.85rem;
|
||||
color: var(--color-foreground);
|
||||
}
|
||||
|
||||
.hermes-kanban-column-label {
|
||||
flex: 1;
|
||||
letter-spacing: 0.01em;
|
||||
}
|
||||
|
||||
.hermes-kanban-column-count {
|
||||
font-variant-numeric: tabular-nums;
|
||||
color: var(--color-muted-foreground);
|
||||
font-size: 0.75rem;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.hermes-kanban-column-add {
|
||||
appearance: none;
|
||||
background: transparent;
|
||||
border: 1px solid var(--color-border);
|
||||
color: var(--color-foreground);
|
||||
border-radius: var(--radius-sm, 0.25rem);
|
||||
width: 22px;
|
||||
height: 22px;
|
||||
line-height: 1;
|
||||
font-size: 1rem;
|
||||
cursor: pointer;
|
||||
}
|
||||
.hermes-kanban-column-add:hover {
|
||||
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
|
||||
}
|
||||
|
||||
.hermes-kanban-column-sub {
|
||||
padding: 0 0.25rem 0.5rem;
|
||||
font-size: 0.7rem;
|
||||
color: var(--color-muted-foreground);
|
||||
border-bottom: 1px solid color-mix(in srgb, var(--color-border) 60%, transparent);
|
||||
margin-bottom: 0.5rem;
|
||||
}
|
||||
|
||||
.hermes-kanban-column-body {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.45rem;
|
||||
overflow-y: auto;
|
||||
padding-right: 0.1rem;
|
||||
}
|
||||
|
||||
.hermes-kanban-empty {
|
||||
padding: 1.5rem 0.5rem;
|
||||
text-align: center;
|
||||
font-size: 0.75rem;
|
||||
color: var(--color-muted-foreground);
|
||||
border: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
|
||||
border-radius: var(--radius-sm, 0.25rem);
|
||||
}
|
||||
|
||||
/* ---- Status dots ----------------------------------------------------- */
|
||||
|
||||
.hermes-kanban-dot {
|
||||
display: inline-block;
|
||||
width: 0.5rem;
|
||||
height: 0.5rem;
|
||||
border-radius: 999px;
|
||||
background: var(--color-muted-foreground);
|
||||
}
|
||||
.hermes-kanban-dot-triage { background: #b47dd6; } /* lilac — fresh/unspecified */
|
||||
.hermes-kanban-dot-todo { background: var(--color-muted-foreground); }
|
||||
.hermes-kanban-dot-ready { background: #d4b348; } /* amber */
|
||||
.hermes-kanban-dot-running { background: #3fb97d; } /* green */
|
||||
.hermes-kanban-dot-blocked { background: var(--color-destructive, #d14a4a); }
|
||||
.hermes-kanban-dot-done { background: #4a8cd1; } /* blue */
|
||||
.hermes-kanban-dot-archived { background: var(--color-border); }
|
||||
|
||||
/* ---- Progress pill (N/M child tasks done) --------------------------- */
|
||||
|
||||
.hermes-kanban-progress {
|
||||
font-family: var(--font-mono, ui-monospace, monospace);
|
||||
font-size: 0.62rem;
|
||||
padding: 0.05rem 0.35rem;
|
||||
border-radius: 999px;
|
||||
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
|
||||
border: 1px solid color-mix(in srgb, var(--color-border) 80%, transparent);
|
||||
color: var(--color-muted-foreground);
|
||||
letter-spacing: 0.02em;
|
||||
}
|
||||
.hermes-kanban-progress--full {
|
||||
background: color-mix(in srgb, #3fb97d 22%, transparent);
|
||||
border-color: color-mix(in srgb, #3fb97d 45%, transparent);
|
||||
color: var(--color-foreground);
|
||||
}
|
||||
|
||||
/* ---- Lanes (per-profile sub-grouping inside Running) ---------------- */
|
||||
|
||||
.hermes-kanban-lane {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.35rem;
|
||||
padding: 0.25rem 0 0.35rem;
|
||||
border-top: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
|
||||
}
|
||||
.hermes-kanban-lane:first-child {
|
||||
border-top: 0;
|
||||
padding-top: 0;
|
||||
}
|
||||
.hermes-kanban-lane-head {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.4rem;
|
||||
font-size: 0.65rem;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.08em;
|
||||
color: var(--color-muted-foreground);
|
||||
padding: 0 0.1rem;
|
||||
}
|
||||
.hermes-kanban-lane-name {
|
||||
font-weight: 600;
|
||||
font-family: var(--font-mono, ui-monospace, monospace);
|
||||
}
|
||||
.hermes-kanban-lane-count {
|
||||
margin-left: auto;
|
||||
font-variant-numeric: tabular-nums;
|
||||
}
|
||||
|
||||
/* ---- Card ------------------------------------------------------------ */

.hermes-kanban-card {
  cursor: grab;
  transition: transform 100ms ease, box-shadow 100ms ease;
}
.hermes-kanban-card:hover {
  box-shadow: 0 1px 0 0 var(--color-ring) inset, 0 0 0 1px var(--color-ring) inset;
}
.hermes-kanban-card:active {
  cursor: grabbing;
  transform: scale(0.995);
}

.hermes-kanban-card-content {
  padding: 0.5rem 0.6rem !important;
  display: flex;
  flex-direction: column;
  gap: 0.3rem;
}

.hermes-kanban-card-row {
  display: flex;
  align-items: center;
  gap: 0.35rem;
  flex-wrap: wrap;
}

.hermes-kanban-card-id {
  font-family: var(--font-mono, ui-monospace, monospace);
  font-size: 0.65rem;
  color: var(--color-muted-foreground);
  letter-spacing: 0.03em;
}

.hermes-kanban-card-title {
  font-size: 0.85rem;
  font-weight: 500;
  line-height: 1.3;
  color: var(--color-foreground);
  word-break: break-word;
}

.hermes-kanban-card-meta {
  font-size: 0.7rem;
  color: var(--color-muted-foreground);
  gap: 0.55rem;
}

.hermes-kanban-priority {
  font-size: 0.6rem !important;
  padding: 0.05rem 0.3rem !important;
  background: color-mix(in srgb, var(--color-ring) 18%, transparent);
  color: var(--color-foreground);
  border: 1px solid color-mix(in srgb, var(--color-ring) 40%, transparent);
}

.hermes-kanban-tag {
  font-size: 0.6rem !important;
  padding: 0.05rem 0.3rem !important;
}

.hermes-kanban-assignee {
  font-weight: 500;
  color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));
}
.hermes-kanban-unassigned {
  font-style: italic;
}
.hermes-kanban-ago {
  margin-left: auto;
}

/* ---- Inline create --------------------------------------------------- */

.hermes-kanban-inline-create {
  display: flex;
  flex-direction: column;
  gap: 0.35rem;
  padding: 0.5rem;
  margin-bottom: 0.5rem;
  background: color-mix(in srgb, var(--color-card) 70%, transparent);
  border: 1px dashed var(--color-border);
  border-radius: var(--radius-sm, 0.25rem);
}

/* ---- Drawer (task detail side panel) --------------------------------- */

.hermes-kanban-drawer-shade {
  position: fixed;
  inset: 0;
  background: rgba(0, 0, 0, 0.45);
  z-index: 60;
  display: flex;
  justify-content: flex-end;
}

.hermes-kanban-drawer {
  width: min(480px, 92vw);
  height: 100vh;
  background: var(--color-card);
  border-left: 1px solid var(--color-border);
  display: flex;
  flex-direction: column;
  box-shadow: -4px 0 18px rgba(0, 0, 0, 0.35);
  animation: hermes-kanban-drawer-in 180ms ease-out;
}

@keyframes hermes-kanban-drawer-in {
  from { transform: translateX(100%); opacity: 0.3; }
  to { transform: translateX(0); opacity: 1; }
}

.hermes-kanban-drawer-head {
  display: flex;
  align-items: center;
  justify-content: space-between;
  padding: 0.6rem 0.8rem;
  border-bottom: 1px solid var(--color-border);
  font-family: var(--font-mono, ui-monospace, monospace);
}

.hermes-kanban-drawer-close {
  appearance: none;
  background: transparent;
  border: 0;
  color: var(--color-muted-foreground);
  font-size: 1.25rem;
  line-height: 1;
  cursor: pointer;
  padding: 0 0.25rem;
}
.hermes-kanban-drawer-close:hover { color: var(--color-foreground); }

.hermes-kanban-drawer-body {
  flex: 1;
  overflow-y: auto;
  padding: 0.9rem;
  display: flex;
  flex-direction: column;
  gap: 0.85rem;
}

.hermes-kanban-drawer-title {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  font-size: 1rem;
  font-weight: 600;
}

.hermes-kanban-drawer-meta {
  display: flex;
  flex-direction: column;
  gap: 0.15rem;
  padding: 0.5rem 0.6rem;
  background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
  border: 1px solid var(--color-border);
  border-radius: var(--radius-sm, 0.25rem);
}

.hermes-kanban-meta-row {
  display: flex;
  gap: 0.5rem;
  font-size: 0.72rem;
}
.hermes-kanban-meta-label {
  width: 92px;
  color: var(--color-muted-foreground);
}
.hermes-kanban-meta-value {
  color: var(--color-foreground);
  word-break: break-word;
}

.hermes-kanban-actions {
  display: flex;
  flex-wrap: wrap;
  gap: 0.3rem;
}

.hermes-kanban-section {
  display: flex;
  flex-direction: column;
  gap: 0.35rem;
}

.hermes-kanban-section-head {
  font-size: 0.72rem;
  font-weight: 600;
  text-transform: uppercase;
  letter-spacing: 0.07em;
  color: var(--color-muted-foreground);
}

.hermes-kanban-pre {
  margin: 0;
  padding: 0.45rem 0.55rem;
  white-space: pre-wrap;
  word-break: break-word;
  background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
  border: 1px solid var(--color-border);
  border-radius: var(--radius-sm, 0.25rem);
  font-family: var(--font-mono, ui-monospace, monospace);
  font-size: 0.72rem;
  color: var(--color-foreground);
}

.hermes-kanban-comment {
  border-left: 2px solid color-mix(in srgb, var(--color-ring) 35%, transparent);
  padding-left: 0.5rem;
  display: flex;
  flex-direction: column;
  gap: 0.2rem;
}

.hermes-kanban-comment-head {
  display: flex;
  gap: 0.5rem;
  font-size: 0.7rem;
}
.hermes-kanban-comment-author {
  font-weight: 600;
  color: var(--color-foreground);
}
.hermes-kanban-comment-ago {
  color: var(--color-muted-foreground);
}

.hermes-kanban-event {
  display: flex;
  gap: 0.5rem;
  font-size: 0.7rem;
  color: var(--color-muted-foreground);
  font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-event-kind {
  color: var(--color-foreground);
  min-width: 6rem;
}
.hermes-kanban-event-payload {
  color: var(--color-muted-foreground);
  overflow: hidden;
  text-overflow: ellipsis;
  white-space: nowrap;
  max-width: 280px;
}

.hermes-kanban-drawer-comment-row {
  display: flex;
  gap: 0.4rem;
  padding: 0.55rem 0.75rem;
  border-top: 1px solid var(--color-border);
  background: color-mix(in srgb, var(--color-card) 90%, transparent);
}

.hermes-kanban-count {
  display: inline-flex;
  gap: 0.2rem;
  align-items: center;
}

/* ---- Selection chrome ----------------------------------------------- */

.hermes-kanban-card--selected :where(.hermes-kanban-card-content) {
  box-shadow: 0 0 0 2px var(--color-ring) inset,
              0 0 0 1px var(--color-ring) inset;
  background: color-mix(in srgb, var(--color-ring) 6%, var(--color-card));
}

.hermes-kanban-card-check {
  width: 0.85rem;
  height: 0.85rem;
  margin: 0;
  cursor: pointer;
  accent-color: var(--color-ring);
}

/* ---- Bulk action bar ------------------------------------------------ */

.hermes-kanban-bulk {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  padding: 0.4rem 0.75rem;
  background: color-mix(in srgb, var(--color-ring) 10%, var(--color-card));
  border: 1px solid color-mix(in srgb, var(--color-ring) 40%, var(--color-border));
  border-radius: var(--radius-sm, 0.25rem);
  flex-wrap: wrap;
}
.hermes-kanban-bulk-count {
  font-weight: 600;
  font-size: 0.75rem;
  padding-right: 0.25rem;
}
.hermes-kanban-bulk-btn {
  height: 1.7rem !important;
  padding: 0 0.5rem !important;
  font-size: 0.7rem !important;
  border: 1px solid var(--color-border);
  cursor: pointer;
}
.hermes-kanban-bulk-btn:hover {
  background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-bulk-reassign {
  display: flex;
  align-items: center;
  gap: 0.25rem;
  padding-left: 0.5rem;
  border-left: 1px solid color-mix(in srgb, var(--color-border) 70%, transparent);
}

/* ---- Dependency editor chips --------------------------------------- */

.hermes-kanban-deps-row {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  margin-bottom: 0.4rem;
}
.hermes-kanban-deps-label {
  font-size: 0.68rem;
  text-transform: uppercase;
  letter-spacing: 0.08em;
  color: var(--color-muted-foreground);
  min-width: 4rem;
}
.hermes-kanban-deps-chips {
  display: flex;
  gap: 0.3rem;
  flex-wrap: wrap;
  flex: 1;
}
.hermes-kanban-deps-empty {
  font-size: 0.7rem;
  color: var(--color-muted-foreground);
  font-style: italic;
}
.hermes-kanban-dep-chip {
  display: inline-flex;
  align-items: center;
  gap: 0.15rem;
  padding: 0.1rem 0.35rem;
  background: color-mix(in srgb, var(--color-foreground) 6%, transparent);
  border: 1px solid var(--color-border);
  border-radius: var(--radius-sm, 0.25rem);
  font-family: var(--font-mono, ui-monospace, monospace);
  font-size: 0.68rem;
  color: var(--color-foreground);
}
.hermes-kanban-dep-chip-x {
  appearance: none;
  background: transparent;
  border: 0;
  color: var(--color-muted-foreground);
  cursor: pointer;
  font-size: 0.85rem;
  line-height: 1;
  padding: 0 0.15rem;
}
.hermes-kanban-dep-chip-x:hover { color: var(--color-destructive, #d14a4a); }

/* ---- Inline edit affordances --------------------------------------- */

.hermes-kanban-editable {
  cursor: pointer;
  border-bottom: 1px dotted color-mix(in srgb, var(--color-border) 80%, transparent);
}
.hermes-kanban-editable:hover {
  color: var(--color-foreground);
  border-bottom-color: var(--color-ring);
}

.hermes-kanban-drawer-title-text {
  cursor: pointer;
}
.hermes-kanban-drawer-title-text:hover {
  text-decoration: underline;
  text-decoration-color: var(--color-ring);
  text-decoration-style: dotted;
  text-underline-offset: 3px;
}

.hermes-kanban-edit-row {
  display: flex;
  align-items: center;
  gap: 0.35rem;
  width: 100%;
}

.hermes-kanban-section-head-row {
  display: flex;
  align-items: center;
  justify-content: space-between;
  gap: 0.5rem;
}
.hermes-kanban-edit-link {
  appearance: none;
  background: transparent;
  border: 0;
  color: var(--color-muted-foreground);
  font-size: 0.7rem;
  text-transform: uppercase;
  letter-spacing: 0.05em;
  cursor: pointer;
  padding: 0;
}
.hermes-kanban-edit-link:hover { color: var(--color-ring); }

.hermes-kanban-textarea {
  width: 100%;
  min-height: 8rem;
  background: var(--color-card);
  color: var(--color-foreground);
  border: 1px solid var(--color-border);
  border-radius: var(--radius-sm, 0.25rem);
  padding: 0.5rem 0.6rem;
  font-family: var(--font-mono, ui-monospace, monospace);
  font-size: 0.8rem;
  line-height: 1.5;
  resize: vertical;
}
.hermes-kanban-textarea:focus {
  outline: none;
  border-color: var(--color-ring);
  box-shadow: 0 0 0 2px color-mix(in srgb, var(--color-ring) 30%, transparent);
}

/* ---- Markdown rendering -------------------------------------------- */

.hermes-kanban-md {
  font-size: 0.8rem;
  line-height: 1.55;
  color: var(--color-foreground);
}
.hermes-kanban-md p { margin: 0.25rem 0; }
.hermes-kanban-md h1,
.hermes-kanban-md h2,
.hermes-kanban-md h3,
.hermes-kanban-md h4 {
  margin: 0.6rem 0 0.2rem;
  line-height: 1.25;
}
.hermes-kanban-md h1 { font-size: 1.05rem; }
.hermes-kanban-md h2 { font-size: 0.95rem; }
.hermes-kanban-md h3 { font-size: 0.88rem; }
.hermes-kanban-md h4 { font-size: 0.82rem; }
.hermes-kanban-md ul {
  margin: 0.25rem 0 0.25rem 1.1rem;
  padding: 0;
}
.hermes-kanban-md li { margin: 0.1rem 0; }
.hermes-kanban-md a {
  color: var(--color-ring);
  text-decoration: underline;
}
.hermes-kanban-md code {
  font-family: var(--font-mono, ui-monospace, monospace);
  font-size: 0.75rem;
  padding: 0.05rem 0.3rem;
  background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
  border-radius: 3px;
}
.hermes-kanban-md-code {
  margin: 0.35rem 0;
  padding: 0.5rem 0.6rem;
  background: color-mix(in srgb, var(--color-foreground) 5%, transparent);
  border: 1px solid var(--color-border);
  border-radius: var(--radius-sm, 0.25rem);
  overflow-x: auto;
}
.hermes-kanban-md-code code {
  background: transparent;
  padding: 0;
  font-size: 0.75rem;
  white-space: pre;
}
.hermes-kanban-md strong { font-weight: 600; }

/* ---- Touch-drag proxy ---------------------------------------------- */

.hermes-kanban-touch-proxy {
  pointer-events: none;
  opacity: 0.85;
  box-shadow: 0 8px 20px rgba(0, 0, 0, 0.35);
  transform: scale(1.02);
  transition: none;
}


/* ---- Staleness tiers ------------------------------------------------ */

.hermes-kanban-card--stale-amber :where(.hermes-kanban-card-content) {
  box-shadow: 0 0 0 1px #d4b34888 inset;
}
.hermes-kanban-card--stale-amber:hover :where(.hermes-kanban-card-content) {
  box-shadow: 0 0 0 2px #d4b348 inset;
}
.hermes-kanban-card--stale-red :where(.hermes-kanban-card-content) {
  box-shadow: 0 0 0 1px var(--color-destructive, #d14a4a) inset,
              0 0 8px color-mix(in srgb, var(--color-destructive, #d14a4a) 30%, transparent);
}
.hermes-kanban-card--stale-red:hover :where(.hermes-kanban-card-content) {
  box-shadow: 0 0 0 2px var(--color-destructive, #d14a4a) inset,
              0 0 10px color-mix(in srgb, var(--color-destructive, #d14a4a) 45%, transparent);
}

/* ---- Worker log pane ------------------------------------------------ */

.hermes-kanban-log {
  max-height: 340px;
  overflow: auto;
  white-space: pre;
  font-size: 0.7rem;
  line-height: 1.45;
}


/* ---- Run history (per-attempt log in the drawer) ------------------- */

.hermes-kanban-run {
  border-left: 2px solid var(--color-border);
  padding: 0.35rem 0.5rem;
  margin-bottom: 0.4rem;
  background: color-mix(in srgb, var(--color-foreground) 3%, transparent);
  border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-run--active { border-left-color: #3fb97d; }
.hermes-kanban-run--completed { border-left-color: #4a8cd1; }
.hermes-kanban-run--ended { border-left-color: #6b7280; } /* generic fallback when outcome is unset */
.hermes-kanban-run--blocked { border-left-color: var(--color-destructive, #d14a4a); }
.hermes-kanban-run--crashed,
.hermes-kanban-run--timed_out,
.hermes-kanban-run--gave_up,
.hermes-kanban-run--spawn_failed {
  border-left-color: var(--color-destructive, #d14a4a);
  background: color-mix(in srgb, var(--color-destructive, #d14a4a) 6%, transparent);
}
.hermes-kanban-run--reclaimed { border-left-color: #d4b348; }

.hermes-kanban-run-head {
  display: flex;
  align-items: center;
  gap: 0.6rem;
  font-size: 0.7rem;
}
.hermes-kanban-run-outcome {
  font-family: var(--font-mono, ui-monospace, monospace);
  font-weight: 600;
  text-transform: uppercase;
  letter-spacing: 0.05em;
  color: var(--color-foreground);
}
.hermes-kanban-run-profile {
  color: var(--color-muted-foreground);
}
.hermes-kanban-run-elapsed {
  font-variant-numeric: tabular-nums;
  color: var(--color-muted-foreground);
}
.hermes-kanban-run-ago {
  margin-left: auto;
  color: var(--color-muted-foreground);
}
.hermes-kanban-run-summary {
  font-size: 0.75rem;
  padding: 0.2rem 0 0;
  color: var(--color-foreground);
}
.hermes-kanban-run-error {
  font-size: 0.7rem;
  color: var(--color-destructive, #d14a4a);
  padding: 0.15rem 0 0;
  font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-run-meta {
  display: block;
  font-size: 0.65rem;
  padding: 0.15rem 0 0;
  color: var(--color-muted-foreground);
  white-space: pre-wrap;
  word-break: break-word;
  font-family: var(--font-mono, ui-monospace, monospace);
}
@@ -0,0 +1,14 @@
{
  "name": "kanban",
  "label": "Kanban",
  "description": "Multi-agent collaboration board — drag-drop cards across columns, read comment threads, see which profile is running what",
  "icon": "Package",
  "version": "1.0.0",
  "tab": {
    "path": "/kanban",
    "position": "after:skills"
  },
  "entry": "dist/index.js",
  "css": "dist/style.css",
  "api": "plugin_api.py"
}
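The manifest above is plain JSON, so a plugin host could sanity-check it before mounting the tab. The following is a minimal sketch under stated assumptions, not the dashboard's actual loader: `REQUIRED_KEYS`, `validate_manifest`, and `tab_anchor` are hypothetical names, and the `after:<tab>` position format is inferred only from the `"after:skills"` value shown.

```python
import json
from typing import Optional

# Hypothetical: the keys this sketch treats as mandatory. The real loader may differ.
REQUIRED_KEYS = {"name", "label", "version", "tab", "entry"}


def validate_manifest(raw: str) -> dict:
    """Parse a manifest string and check the fields this sketch relies on."""
    manifest = json.loads(raw)
    missing = REQUIRED_KEYS - manifest.keys()
    if missing:
        raise ValueError(f"manifest missing keys: {sorted(missing)}")
    if not manifest["tab"].get("path", "").startswith("/"):
        raise ValueError("tab.path must start with '/'")
    return manifest


def tab_anchor(manifest: dict) -> Optional[str]:
    """Return the tab named in an 'after:<tab>' position, or None."""
    position = manifest["tab"].get("position", "")
    prefix = "after:"
    return position[len(prefix):] if position.startswith(prefix) else None
```

Failing early on a malformed manifest keeps a broken plugin from surfacing as a blank tab later.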
@@ -0,0 +1,845 @@
"""Kanban dashboard plugin — backend API routes.

Mounted at /api/plugins/kanban/ by the dashboard plugin system.

This layer is intentionally thin: every handler is a small wrapper around
``hermes_cli.kanban_db`` or a direct SQL query. Writes use the same code
paths the CLI and gateway ``/kanban`` command use, so the three surfaces
cannot drift.

Live updates arrive via the ``/events`` WebSocket, which tails the
append-only ``task_events`` table on a short poll interval (WAL mode lets
reads run alongside the dispatcher's IMMEDIATE write transactions).

Security note
-------------
The dashboard's HTTP auth middleware (``web_server.auth_middleware``)
explicitly skips ``/api/plugins/`` — plugin routes are unauthenticated by
design because the dashboard binds to localhost by default. For the
WebSocket we still require the session token as a ``?token=`` query
parameter (browsers cannot set the ``Authorization`` header on an upgrade
request), matching the established pattern used by the in-browser PTY
bridge in ``hermes_cli/web_server.py``. If you run the dashboard with
``--host 0.0.0.0``, every plugin route — kanban included — becomes
reachable from the network. Don't do that on a shared host.
"""

from __future__ import annotations

import asyncio
import hmac
import json
import logging
import sqlite3
import time
from dataclasses import asdict
from typing import Any, Optional

from fastapi import APIRouter, HTTPException, Query, WebSocket, WebSocketDisconnect, status as http_status
from pydantic import BaseModel, Field

from hermes_cli import kanban_db

log = logging.getLogger(__name__)

router = APIRouter()


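The module docstring says the `/events` WebSocket tails the append-only `task_events` table on a short poll interval. One way such a tail could work is a cursor query on the auto-incrementing event id, sketched below. This is an illustration, not the plugin's handler: `fetch_events_since` and `open_demo_db` are hypothetical helpers, and the table schema is an assumption based on the columns the module reads and writes.

```python
import sqlite3


def fetch_events_since(conn: sqlite3.Connection, last_id: int, limit: int = 200):
    """Return (events, new_cursor) for rows with id greater than last_id."""
    rows = conn.execute(
        "SELECT id, task_id, kind, payload, created_at "
        "FROM task_events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, limit),
    ).fetchall()
    events = [dict(row) for row in rows]
    # Advance the cursor only when something new arrived.
    new_cursor = events[-1]["id"] if events else last_id
    return events, new_cursor


def open_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for kanban.db; the schema here is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE task_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, task_id TEXT, kind TEXT, "
        "payload TEXT, created_at INTEGER)"
    )
    return conn
```

A poll loop would call `fetch_events_since` with the cursor it last saw, send any new events to the client, and sleep briefly before the next call. Because the table is append-only, a strictly increasing id is enough to avoid resending events.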
# ---------------------------------------------------------------------------
# Auth helper — WebSocket only (HTTP routes live behind the dashboard's
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------

def _check_ws_token(provided: Optional[str]) -> bool:
    """Constant-time compare against the dashboard session token.

    Imported lazily so the plugin still loads in test contexts where the
    dashboard web_server module isn't importable (e.g. the bare-FastAPI
    test harness).
    """
    if not provided:
        return False
    try:
        from hermes_cli import web_server as _ws
    except Exception:
        # No dashboard context (tests). Accept so the tail loop is still
        # testable; in production the dashboard module always imports
        # cleanly because it's the caller.
        return True
    expected = getattr(_ws, "_SESSION_TOKEN", None)
    if not expected:
        return True
    return hmac.compare_digest(str(provided), str(expected))


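The core of the check above is a constant-time comparison. The sketch below isolates that step so it can be exercised without the dashboard. `token_matches` is a hypothetical helper, and unlike `_check_ws_token` it fails closed when no expected token is configured; that is a choice made for this sketch, not the plugin's behavior.

```python
import hmac
import secrets
from typing import Optional


def token_matches(provided: Optional[str], expected: Optional[str]) -> bool:
    """Compare two tokens without short-circuiting on the first mismatch."""
    if not provided or not expected:
        return False
    # Encode to bytes so non-ASCII input cannot raise TypeError in compare_digest.
    return hmac.compare_digest(provided.encode(), expected.encode())


# Illustrative session token; the real dashboard generates its own.
session_token = secrets.token_urlsafe(32)
```

A plain `==` on strings can return as soon as the first differing character is found, which is why `hmac.compare_digest` is the usual choice for secrets.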
def _conn():
    """Open a kanban_db connection, creating the schema on first use.

    Every handler that mutates the DB goes through this so the plugin
    self-heals on a fresh install (no user-visible "no such table"
    error if somebody hits POST /tasks before GET /board).
    ``init_db`` is idempotent.
    """
    try:
        kanban_db.init_db()
    except Exception as exc:
        log.warning("kanban init_db failed: %s", exc)
    return kanban_db.connect()


# ---------------------------------------------------------------------------
# Serialization helpers
# ---------------------------------------------------------------------------

# Columns shown by the dashboard, in left-to-right order. "archived" is
# available via a filter toggle rather than a visible column.
BOARD_COLUMNS: list[str] = [
    "triage", "todo", "ready", "running", "blocked", "done",
]


def _task_dict(task: kanban_db.Task) -> dict[str, Any]:
    d = asdict(task)
    # Add derived age metrics so the UI can colour stale cards without
    # computing deltas client-side.
    d["age"] = kanban_db.task_age(task)
    # Keep body short on list endpoints; full body comes from /tasks/:id.
    return d


def _event_dict(event: kanban_db.Event) -> dict[str, Any]:
    return {
        "id": event.id,
        "task_id": event.task_id,
        "kind": event.kind,
        "payload": event.payload,
        "created_at": event.created_at,
        "run_id": event.run_id,
    }


def _comment_dict(c: kanban_db.Comment) -> dict[str, Any]:
    return {
        "id": c.id,
        "task_id": c.task_id,
        "author": c.author,
        "body": c.body,
        "created_at": c.created_at,
    }


def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
    """Serialise a Run for the drawer's Run history section."""
    return {
        "id": r.id,
        "task_id": r.task_id,
        "profile": r.profile,
        "step_key": r.step_key,
        "status": r.status,
        "claim_lock": r.claim_lock,
        "claim_expires": r.claim_expires,
        "worker_pid": r.worker_pid,
        "max_runtime_seconds": r.max_runtime_seconds,
        "last_heartbeat_at": r.last_heartbeat_at,
        "started_at": r.started_at,
        "ended_at": r.ended_at,
        "outcome": r.outcome,
        "summary": r.summary,
        "metadata": r.metadata,
        "error": r.error,
    }


def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
    """Return {'parents': [...], 'children': [...]} for a task."""
    parents = [
        r["parent_id"]
        for r in conn.execute(
            "SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
            (task_id,),
        )
    ]
    children = [
        r["child_id"]
        for r in conn.execute(
            "SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
            (task_id,),
        )
    ]
    return {"parents": parents, "children": children}


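The two queries in `_links_for` can be tried against an in-memory SQLite database. In this sketch, `links_for` and `demo_connection` are hypothetical stand-ins: the two-column `task_links` schema is inferred from the queries above, and `row_factory = sqlite3.Row` is assumed because the rows are accessed by column name.

```python
from __future__ import annotations

import sqlite3


def links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
    """Standalone copy of the parent/child lookup for experimentation."""
    parents = [
        r["parent_id"]
        for r in conn.execute(
            "SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
            (task_id,),
        )
    ]
    children = [
        r["child_id"]
        for r in conn.execute(
            "SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
            (task_id,),
        )
    ]
    return {"parents": parents, "children": children}


def demo_connection() -> sqlite3.Connection:
    """Build a tiny dependency graph: t1 -> t2, t1 -> t3, t2 -> t3."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # needed for access by column name
    conn.execute("CREATE TABLE task_links (parent_id TEXT, child_id TEXT)")
    conn.executemany(
        "INSERT INTO task_links VALUES (?, ?)",
        [("t1", "t2"), ("t1", "t3"), ("t2", "t3")],
    )
    return conn
```

With this graph, `links_for(demo_connection(), "t2")` reports `t1` as the only parent and `t3` as the only child.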
# ---------------------------------------------------------------------------
# GET /board
# ---------------------------------------------------------------------------

@router.get("/board")
def get_board(
    tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
    include_archived: bool = Query(False),
):
    """Return the full board grouped by status column.

    ``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
    install doesn't surface a "failed to load" error on the plugin tab.
    """
    conn = _conn()
    try:
        tasks = kanban_db.list_tasks(
            conn, tenant=tenant, include_archived=include_archived
        )
        # Pre-fetch link counts per task (cheap: one query).
        link_counts: dict[str, dict[str, int]] = {}
        for row in conn.execute(
            "SELECT parent_id, child_id FROM task_links"
        ).fetchall():
            link_counts.setdefault(row["parent_id"], {"parents": 0, "children": 0})[
                "children"
            ] += 1
            link_counts.setdefault(row["child_id"], {"parents": 0, "children": 0})[
                "parents"
            ] += 1

        # Comment counts per task (one cheap aggregate query).
        comment_counts: dict[str, int] = {
            r["task_id"]: r["n"]
            for r in conn.execute(
                "SELECT task_id, COUNT(*) AS n FROM task_comments GROUP BY task_id"
            )
        }

        # Progress rollup: for each parent, how many children are done / total.
        # One pass over task_links joined with child status — cheaper than
        # N per-task queries and the plugin uses it to render "N/M".
        progress: dict[str, dict[str, int]] = {}
        for row in conn.execute(
            "SELECT l.parent_id AS pid, t.status AS cstatus "
            "FROM task_links l JOIN tasks t ON t.id = l.child_id"
        ).fetchall():
            p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
            p["total"] += 1
            if row["cstatus"] == "done":
                p["done"] += 1

        latest_event_id = conn.execute(
            "SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
        ).fetchone()["m"]

        columns: dict[str, list[dict]] = {c: [] for c in BOARD_COLUMNS}
        if include_archived:
            columns["archived"] = []

        for t in tasks:
            d = _task_dict(t)
            d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
            d["comment_count"] = comment_counts.get(t.id, 0)
            d["progress"] = progress.get(t.id)  # None when the task has no children
            col = t.status if t.status in columns else "todo"
            columns[col].append(d)

        # Stable per-column ordering already applied by list_tasks
        # (priority DESC, created_at ASC), keep as-is.

        # List of known tenants for the UI filter dropdown.
        tenants = [
            r["tenant"]
            for r in conn.execute(
                "SELECT DISTINCT tenant FROM tasks WHERE tenant IS NOT NULL ORDER BY tenant"
            )
        ]
        # List of distinct assignees for the lane-by-profile sub-grouping.
        assignees = [
            r["assignee"]
            for r in conn.execute(
                "SELECT DISTINCT assignee FROM tasks WHERE assignee IS NOT NULL "
                "AND status != 'archived' ORDER BY assignee"
            )
        ]

        return {
            "columns": [
                {"name": name, "tasks": columns[name]} for name in columns.keys()
            ],
            "tenants": tenants,
            "assignees": assignees,
            "latest_event_id": int(latest_event_id),
            "now": int(time.time()),
        }
    finally:
        conn.close()


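The progress rollup in `get_board` is a single pass over `(parent_id, child_status)` pairs. The sketch below reproduces that aggregation on plain tuples so the "N/M" logic can be checked without a database. `rollup_progress` and `progress_label` are hypothetical names, and the label format is an assumption based on the "N/M" mentioned in the handler's comment.

```python
from __future__ import annotations

from typing import Iterable, Optional


def rollup_progress(rows: Iterable[tuple[str, str]]) -> dict[str, dict[str, int]]:
    """Count done and total children per parent from (parent_id, child_status) rows."""
    progress: dict[str, dict[str, int]] = {}
    for parent_id, child_status in rows:
        entry = progress.setdefault(parent_id, {"done": 0, "total": 0})
        entry["total"] += 1
        if child_status == "done":
            entry["done"] += 1
    return progress


def progress_label(entry: Optional[dict[str, int]]) -> str:
    """Render 'done/total', or an empty string for tasks without children."""
    return f"{entry['done']}/{entry['total']}" if entry else ""
```

Tasks without children never appear in the result, which matches the handler returning `None` for `progress` in that case.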
# ---------------------------------------------------------------------------
# GET /tasks/:id
# ---------------------------------------------------------------------------

@router.get("/tasks/{task_id}")
def get_task(task_id: str):
    conn = _conn()
    try:
        task = kanban_db.get_task(conn, task_id)
        if task is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")
        return {
            "task": _task_dict(task),
            "comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
            "events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
            "links": _links_for(conn, task_id),
            "runs": [_run_dict(r) for r in kanban_db.list_runs(conn, task_id)],
        }
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# POST /tasks
# ---------------------------------------------------------------------------

class CreateTaskBody(BaseModel):
    title: str
    body: Optional[str] = None
    assignee: Optional[str] = None
    tenant: Optional[str] = None
    priority: int = 0
    workspace_kind: str = "scratch"
    workspace_path: Optional[str] = None
    parents: list[str] = Field(default_factory=list)
    triage: bool = False
    idempotency_key: Optional[str] = None
    max_runtime_seconds: Optional[int] = None
    skills: Optional[list[str]] = None


@router.post("/tasks")
def create_task(payload: CreateTaskBody):
    conn = _conn()
    try:
        task_id = kanban_db.create_task(
            conn,
            title=payload.title,
            body=payload.body,
            assignee=payload.assignee,
            created_by="dashboard",
            workspace_kind=payload.workspace_kind,
            workspace_path=payload.workspace_path,
            tenant=payload.tenant,
            priority=payload.priority,
            parents=payload.parents,
            triage=payload.triage,
            idempotency_key=payload.idempotency_key,
            max_runtime_seconds=payload.max_runtime_seconds,
            skills=payload.skills,
        )
        task = kanban_db.get_task(conn, task_id)
        body: dict[str, Any] = {"task": _task_dict(task) if task else None}
        # Surface a dispatcher-presence warning so the UI can show a
        # banner when a `ready` task would otherwise sit idle because no
        # gateway is running (or dispatch_in_gateway=false). Only emit
        # for ready+assigned tasks; triage/todo are expected to wait,
        # and unassigned tasks can't be dispatched regardless.
        if task and task.status == "ready" and task.assignee:
            try:
                from hermes_cli.kanban import _check_dispatcher_presence
                running, message = _check_dispatcher_presence()
                if not running and message:
                    body["warning"] = message
            except Exception:
                # Probe failure must never block the create itself.
                pass
        return body
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# PATCH /tasks/:id  (status / assignee / priority / title / body)
# ---------------------------------------------------------------------------

class UpdateTaskBody(BaseModel):
    status: Optional[str] = None
    assignee: Optional[str] = None
    priority: Optional[int] = None
    title: Optional[str] = None
    body: Optional[str] = None
    result: Optional[str] = None
    block_reason: Optional[str] = None
    # Structured handoff fields — forwarded to complete_task when status
    # transitions to 'done'. Dashboard parity with ``hermes kanban
    # complete --summary ... --metadata ...``.
    summary: Optional[str] = None
    metadata: Optional[dict] = None


@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody):
    conn = _conn()
    try:
        task = kanban_db.get_task(conn, task_id)
        if task is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")

        # --- assignee ----------------------------------------------------
        if payload.assignee is not None:
            try:
                ok = kanban_db.assign_task(
                    conn, task_id, payload.assignee or None,
                )
            except RuntimeError as e:
                raise HTTPException(status_code=409, detail=str(e))
            if not ok:
                raise HTTPException(status_code=404, detail="task not found")

        # --- status -------------------------------------------------------
        if payload.status is not None:
            s = payload.status
            ok = True
            if s == "done":
                ok = kanban_db.complete_task(
                    conn, task_id,
                    result=payload.result,
                    summary=payload.summary,
                    metadata=payload.metadata,
                )
            elif s == "blocked":
                ok = kanban_db.block_task(conn, task_id, reason=payload.block_reason)
            elif s == "ready":
                # Re-open a blocked task, or just an explicit status set.
                current = kanban_db.get_task(conn, task_id)
                if current and current.status == "blocked":
                    ok = kanban_db.unblock_task(conn, task_id)
                else:
                    # Direct status write for drag-drop (todo -> ready etc).
                    ok = _set_status_direct(conn, task_id, "ready")
            elif s == "archived":
                ok = kanban_db.archive_task(conn, task_id)
            elif s in ("todo", "running", "triage"):
                ok = _set_status_direct(conn, task_id, s)
            else:
                raise HTTPException(status_code=400, detail=f"unknown status: {s}")
            if not ok:
                raise HTTPException(
                    status_code=409,
                    detail=f"status transition to {s!r} not valid from current state",
                )

        # --- priority -----------------------------------------------------
        if payload.priority is not None:
            with kanban_db.write_txn(conn):
                conn.execute(
                    "UPDATE tasks SET priority = ? WHERE id = ?",
                    (int(payload.priority), task_id),
                )
                conn.execute(
                    "INSERT INTO task_events (task_id, kind, payload, created_at) "
                    "VALUES (?, 'reprioritized', ?, ?)",
                    (task_id, json.dumps({"priority": int(payload.priority)}),
|
||||
int(time.time())),
|
||||
)
|
||||
|
||||
# --- title / body -------------------------------------------------
|
||||
if payload.title is not None or payload.body is not None:
|
||||
with kanban_db.write_txn(conn):
|
||||
sets, vals = [], []
|
||||
if payload.title is not None:
|
||||
if not payload.title.strip():
|
||||
raise HTTPException(status_code=400, detail="title cannot be empty")
|
||||
sets.append("title = ?")
|
||||
vals.append(payload.title.strip())
|
||||
if payload.body is not None:
|
||||
sets.append("body = ?")
|
||||
vals.append(payload.body)
|
||||
vals.append(task_id)
|
||||
conn.execute(
|
||||
f"UPDATE tasks SET {', '.join(sets)} WHERE id = ?", vals,
|
||||
)
|
||||
conn.execute(
|
||||
"INSERT INTO task_events (task_id, kind, payload, created_at) "
|
||||
"VALUES (?, 'edited', NULL, ?)",
|
||||
(task_id, int(time.time())),
|
||||
)
|
||||
|
||||
updated = kanban_db.get_task(conn, task_id)
|
||||
return {"task": _task_dict(updated) if updated else None}
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
def _set_status_direct(
|
||||
conn: sqlite3.Connection, task_id: str, new_status: str,
|
||||
) -> bool:
|
||||
"""Direct status write for drag-drop moves that aren't covered by the
|
||||
structured complete/block/unblock/archive verbs (e.g. todo<->ready,
|
||||
running<->ready). Appends a ``status`` event row for the live feed.
|
||||
|
||||
When this transitions OFF ``running`` to anything other than the
|
||||
terminal verbs above (which own their own run closing), we close the
|
||||
active run with outcome='reclaimed' so attempt history isn't
|
||||
orphaned. ``running -> ready`` via drag-drop is the common case
|
||||
(user yanking a stuck worker back to the queue).
|
||||
"""
|
||||
with kanban_db.write_txn(conn):
|
||||
# Snapshot current state so we know whether to close a run.
|
||||
prev = conn.execute(
|
||||
"SELECT status, current_run_id FROM tasks WHERE id = ?",
|
||||
(task_id,),
|
||||
).fetchone()
|
||||
if prev is None:
|
||||
return False
|
||||
was_running = prev["status"] == "running"
|
||||
|
||||
cur = conn.execute(
|
||||
"UPDATE tasks SET status = ?, "
|
||||
" claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
|
||||
" claim_expires = CASE WHEN ? = 'running' THEN claim_expires ELSE NULL END, "
|
||||
" worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
|
||||
"WHERE id = ?",
|
||||
(new_status, new_status, new_status, new_status, task_id),
|
||||
)
|
||||
if cur.rowcount != 1:
|
||||
return False
|
||||
run_id = None
|
||||
if was_running and new_status != "running" and prev["current_run_id"]:
|
||||
run_id = kanban_db._end_run(
|
||||
conn, task_id,
|
||||
outcome="reclaimed", status="reclaimed",
|
||||
summary=f"status changed to {new_status} (dashboard/direct)",
|
||||
)
|
||||
conn.execute(
|
||||
"INSERT INTO task_events (task_id, run_id, kind, payload, created_at) "
|
||||
"VALUES (?, ?, 'status', ?, ?)",
|
||||
(task_id, run_id, json.dumps({"status": new_status}), int(time.time())),
|
||||
)
|
||||
# If we re-opened something, children may have gone stale.
|
||||
if new_status in ("done", "ready"):
|
||||
kanban_db.recompute_ready(conn)
|
||||
return True
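
The conditional clearing used by `_set_status_direct` can be tried in isolation. The sketch below is a minimal illustration, not the plugin's code: the cut-down `tasks` schema and the `set_status` helper are hypothetical, and only the `CASE WHEN ? = 'running'` pattern is taken from the function above.

```python
import sqlite3


def set_status(conn: sqlite3.Connection, task_id: str, new_status: str) -> bool:
    """Write a status; keep claim fields only if the task stays 'running'."""
    cur = conn.execute(
        "UPDATE tasks SET status = ?, "
        " claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
        " worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
        "WHERE id = ?",
        (new_status, new_status, new_status, task_id),
    )
    # rowcount is 1 only when the id matched an existing row.
    return cur.rowcount == 1


# Hypothetical, stripped-down schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, "
    "claim_lock TEXT, worker_pid INTEGER)"
)
conn.execute("INSERT INTO tasks VALUES ('t1', 'running', 'lock-a', 4242)")

moved = set_status(conn, "t1", "ready")
row = conn.execute(
    "SELECT status, claim_lock, worker_pid FROM tasks WHERE id = 't1'"
).fetchone()
missing = set_status(conn, "no-such-task", "ready")
```

Binding the new status once per `CASE` keeps the write to a single statement, so the claim fields cannot be left behind by a partially applied update.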


# ---------------------------------------------------------------------------
# Comments
# ---------------------------------------------------------------------------

class CommentBody(BaseModel):
    body: str
    author: Optional[str] = "dashboard"


@router.post("/tasks/{task_id}/comments")
def add_comment(task_id: str, payload: CommentBody):
    if not payload.body.strip():
        raise HTTPException(status_code=400, detail="body is required")
    conn = _conn()
    try:
        if kanban_db.get_task(conn, task_id) is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")
        kanban_db.add_comment(
            conn, task_id, author=payload.author or "dashboard", body=payload.body,
        )
        return {"ok": True}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Links
# ---------------------------------------------------------------------------

class LinkBody(BaseModel):
    parent_id: str
    child_id: str


@router.post("/links")
def add_link(payload: LinkBody):
    conn = _conn()
    try:
        kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
        return {"ok": True}
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    finally:
        conn.close()


@router.delete("/links")
def delete_link(parent_id: str = Query(...), child_id: str = Query(...)):
    conn = _conn()
    try:
        ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
        return {"ok": bool(ok)}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Bulk actions (multi-select on the board)
# ---------------------------------------------------------------------------

class BulkTaskBody(BaseModel):
    ids: list[str]
    status: Optional[str] = None
    assignee: Optional[str] = None  # "" or None = unassign
    priority: Optional[int] = None
    archive: bool = False


@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody):
    """Apply the same patch to every id in ``payload.ids``.

    This is an *independent* iteration — per-task failures don't abort
    siblings. Returns per-id outcome so the UI can surface partials.
    """
    ids = [i for i in (payload.ids or []) if i]
    if not ids:
        raise HTTPException(status_code=400, detail="ids is required")
    results: list[dict] = []
    conn = _conn()
    try:
        for tid in ids:
            entry: dict[str, Any] = {"id": tid, "ok": True}
            try:
                task = kanban_db.get_task(conn, tid)
                if task is None:
                    entry.update(ok=False, error="not found")
                    results.append(entry)
                    continue
                if payload.archive:
                    if not kanban_db.archive_task(conn, tid):
                        entry.update(ok=False, error="archive refused")
                if payload.status is not None and not payload.archive:
                    s = payload.status
                    if s == "done":
                        ok = kanban_db.complete_task(conn, tid)
                    elif s == "blocked":
                        ok = kanban_db.block_task(conn, tid)
                    elif s == "ready":
                        cur = kanban_db.get_task(conn, tid)
                        if cur and cur.status == "blocked":
                            ok = kanban_db.unblock_task(conn, tid)
                        else:
                            ok = _set_status_direct(conn, tid, "ready")
                    elif s in ("todo", "running", "triage"):
                        ok = _set_status_direct(conn, tid, s)
                    else:
                        entry.update(ok=False, error=f"unknown status {s!r}")
                        results.append(entry)
                        continue
                    if not ok:
                        entry.update(ok=False, error=f"transition to {s!r} refused")
                if payload.assignee is not None:
                    try:
                        if not kanban_db.assign_task(
                            conn, tid, payload.assignee or None,
                        ):
                            entry.update(ok=False, error="assign refused")
                    except RuntimeError as e:
                        entry.update(ok=False, error=str(e))
                if payload.priority is not None:
                    with kanban_db.write_txn(conn):
                        conn.execute(
                            "UPDATE tasks SET priority = ? WHERE id = ?",
                            (int(payload.priority), tid),
                        )
                        conn.execute(
                            "INSERT INTO task_events (task_id, kind, payload, created_at) "
                            "VALUES (?, 'reprioritized', ?, ?)",
                            (tid, json.dumps({"priority": int(payload.priority)}),
                             int(time.time())),
                        )
            except Exception as e:  # defensive — one bad id shouldn't kill the batch
                entry.update(ok=False, error=str(e))
            results.append(entry)
        return {"results": results}
    finally:
        conn.close()
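
Because `bulk_update` reports one entry per id instead of failing the whole request, a caller has to separate successes from failures itself. A minimal sketch, assuming only the `{"results": [...]}` shape with `id`, `ok`, and an optional `error` key returned above; the `summarize_bulk` helper and the sample payload are hypothetical.

```python
def summarize_bulk(results: list[dict]) -> dict:
    """Split per-id outcomes into succeeded ids and a failed-id -> error map."""
    succeeded = [r["id"] for r in results if r.get("ok")]
    failed = {
        r["id"]: r.get("error", "unknown error")
        for r in results
        if not r.get("ok")
    }
    return {"succeeded": succeeded, "failed": failed}


# Hypothetical response body, shaped like the handler's return value.
sample = {
    "results": [
        {"id": "t1", "ok": True},
        {"id": "t2", "ok": False, "error": "not found"},
        {"id": "t3", "ok": False, "error": "transition to 'done' refused"},
    ]
}

summary = summarize_bulk(sample["results"])
```

A UI could use the `failed` map to mark only the affected cards and leave the rest of the selection untouched.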


# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------

@router.get("/config")
def get_config():
    """Return kanban dashboard preferences from ~/.hermes/config.yaml.

    Reads the ``dashboard.kanban`` section if present; defaults otherwise.
    Used by the UI to pre-select tenant filters, toggle markdown rendering,
    or set column-width preferences without a round-trip per page load.
    """
    try:
        from hermes_cli.config import load_config
        cfg = load_config() or {}
    except Exception:
        cfg = {}
    dash_cfg = (cfg.get("dashboard") or {})
    # dashboard.kanban may itself be a dict; fall back to {}.
    k_cfg = dash_cfg.get("kanban") or {}
    return {
        "default_tenant": k_cfg.get("default_tenant") or "",
        "lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
        "include_archived_by_default": bool(k_cfg.get("include_archived_by_default", False)),
        "render_markdown": bool(k_cfg.get("render_markdown", True)),
    }


# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------

@router.get("/stats")
def get_stats():
    """Per-status + per-assignee counts + oldest-ready age.

    Designed for the dashboard HUD and for router profiles that need to
    answer "is this specialist overloaded?" without scanning the whole
    board themselves.
    """
    conn = _conn()
    try:
        return kanban_db.board_stats(conn)
    finally:
        conn.close()


@router.get("/assignees")
def get_assignees():
    """Known profiles + per-profile task counts.

    Returns the union of ``~/.hermes/profiles/*`` on disk and every
    distinct assignee currently used on the board. The dashboard uses
    this to populate its assignee dropdown so a freshly-created profile
    appears in the picker before it's been given any task.
    """
    conn = _conn()
    try:
        return {"assignees": kanban_db.known_assignees(conn)}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Worker log (read-only; file written by _default_spawn)
# ---------------------------------------------------------------------------

@router.get("/tasks/{task_id}/log")
def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_000)):
    """Return the worker's stdout/stderr log.

    ``tail`` caps the response size (bytes) so the dashboard drawer
    doesn't paginate megabytes into the browser. Returns 404 if the task
    has never spawned. The on-disk log is rotated at 2 MiB per
    ``_rotate_worker_log`` — a single ``.log.1`` is kept, no further
    generations, so disk usage per task is bounded at ~4 MiB.
    """
    conn = _conn()
    try:
        task = kanban_db.get_task(conn, task_id)
    finally:
        conn.close()
    if task is None:
        raise HTTPException(status_code=404, detail=f"task {task_id} not found")
    content = kanban_db.read_worker_log(task_id, tail_bytes=tail)
    log_path = kanban_db.worker_log_path(task_id)
    size = log_path.stat().st_size if log_path.exists() else 0
    return {
        "task_id": task_id,
        "path": str(log_path),
        "exists": content is not None,
        "size_bytes": size,
        "content": content or "",
        # Truncated when the on-disk file was larger than the tail cap.
        "truncated": bool(tail and size > tail),
    }
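
`read_worker_log` itself is not shown in this diff. As a rough sketch of how a byte-capped tail and the `truncated` flag could fit together, assuming the helper seeks to the last `tail_bytes` of the file: the `read_tail` function and the temporary log file below are hypothetical, not the project's implementation.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_tail(path: Path, tail_bytes: Optional[int] = None) -> Optional[str]:
    """Return the whole file, or only its last ``tail_bytes`` bytes."""
    if not path.exists():
        return None  # mirrors "exists": content is not None
    size = path.stat().st_size
    with path.open("rb") as fh:
        if tail_bytes and size > tail_bytes:
            fh.seek(size - tail_bytes)
        data = fh.read()
    # A byte cut can land mid-character, so decode leniently.
    return data.decode("utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "task.log"
    log_path.write_bytes(b"line one\nline two\n")  # 18 bytes in total
    tail_cap = 9
    content = read_tail(log_path, tail_bytes=tail_cap)
    size = log_path.stat().st_size
    truncated = bool(tail_cap and size > tail_cap)
    missing = read_tail(Path(tmp) / "absent.log")
```

The `truncated` expression is the same comparison the handler uses: the flag is set only when a cap was requested and the file on disk is larger than it.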


# ---------------------------------------------------------------------------
# Dispatch nudge (optional quick-path so the UI doesn't wait 60 s)
# ---------------------------------------------------------------------------

@router.post("/dispatch")
def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
    conn = _conn()
    try:
        result = kanban_db.dispatch_once(
            conn, dry_run=dry_run, max_spawn=max_n,
        )
        # DispatchResult is a dataclass.
        try:
            return asdict(result)
        except TypeError:
            return {"result": str(result)}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------

# Poll interval for the event tail loop. SQLite WAL + 300 ms polling is
# the simplest and most robust approach; it adds a fraction of a percent
# of CPU and has no shared state to synchronize across workers.
_EVENT_POLL_SECONDS = 0.3


@router.websocket("/events")
async def stream_events(ws: WebSocket):
    # Enforce the dashboard session token as a query param — browsers can't
    # set Authorization on a WS upgrade. This matches how the PTY bridge
    # authenticates in hermes_cli/web_server.py.
    token = ws.query_params.get("token")
    if not _check_ws_token(token):
        await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
        return
    await ws.accept()
    try:
        since_raw = ws.query_params.get("since", "0")
        try:
            cursor = int(since_raw)
        except ValueError:
            cursor = 0

        def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
            conn = kanban_db.connect()
            try:
                rows = conn.execute(
                    "SELECT id, task_id, run_id, kind, payload, created_at "
                    "FROM task_events WHERE id > ? ORDER BY id ASC LIMIT 200",
                    (cursor_val,),
                ).fetchall()
                out: list[dict] = []
                new_cursor = cursor_val
                for r in rows:
                    try:
                        payload = json.loads(r["payload"]) if r["payload"] else None
                    except Exception:
                        payload = None
                    out.append({
                        "id": r["id"],
                        "task_id": r["task_id"],
                        "run_id": r["run_id"],
                        "kind": r["kind"],
                        "payload": payload,
                        "created_at": r["created_at"],
                    })
                    new_cursor = r["id"]
                return new_cursor, out
            finally:
                conn.close()

        while True:
            cursor, events = await asyncio.to_thread(_fetch_new, cursor)
            if events:
                await ws.send_json({"events": events, "cursor": cursor})
            await asyncio.sleep(_EVENT_POLL_SECONDS)
    except WebSocketDisconnect:
        return
    except Exception as exc:  # defensive: never crash the dashboard worker
        log.warning("Kanban event stream error: %s", exc)
        try:
            await ws.close()
        except Exception:
            pass
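
The event stream relies on a monotonically increasing row id as its cursor: each poll asks for rows with `id > cursor` and advances the cursor to the last id it saw. The sketch below reproduces only that polling step against an in-memory SQLite database. The reduced `task_events` schema and the standalone `fetch_new` function are assumptions for illustration; the real handler also returns `task_id`, `run_id`, and `created_at`.

```python
import json
import sqlite3


def fetch_new(conn: sqlite3.Connection, cursor_val: int, limit: int = 200):
    """Return (new_cursor, events) for rows with id greater than cursor_val."""
    rows = conn.execute(
        "SELECT id, kind, payload FROM task_events "
        "WHERE id > ? ORDER BY id ASC LIMIT ?",
        (cursor_val, limit),
    ).fetchall()
    events = []
    new_cursor = cursor_val
    for event_id, kind, payload in rows:
        events.append({
            "id": event_id,
            "kind": kind,
            "payload": json.loads(payload) if payload else None,
        })
        new_cursor = event_id  # rows are ordered, so the last id wins
    return new_cursor, events


# Hypothetical, reduced schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_events ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, kind TEXT, payload TEXT)"
)
conn.execute(
    "INSERT INTO task_events (kind, payload) VALUES ('status', ?)",
    (json.dumps({"status": "ready"}),),
)
conn.execute("INSERT INTO task_events (kind, payload) VALUES ('edited', NULL)")

cursor, first_batch = fetch_new(conn, 0)
cursor_again, second_batch = fetch_new(conn, cursor)
```

Because the cursor only moves when rows are returned, a client that reconnects with `since=<last cursor>` picks up where it stopped without receiving duplicates.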
|
||||
@@ -0,0 +1,32 @@
|
||||
# DEPRECATED — the kanban dispatcher now runs inside the gateway by
# default (config key: kanban.dispatch_in_gateway, default true). To
# migrate:
#
# systemctl --user disable --now hermes-kanban-dispatcher.service
# # then make sure a gateway is running; e.g. a systemd user unit
# # for `hermes gateway start`. The gateway hosts the dispatcher.
#
# This unit is kept for users who truly cannot run the gateway (host
# policy forbids long-lived services, etc.). It now invokes the
# standalone dispatcher via the explicit --force flag, so nobody
# accidentally keeps two dispatchers racing against the same
# kanban.db. Running this unit AND a gateway with
# dispatch_in_gateway=true is NOT supported.

[Unit]
Description=Hermes Kanban dispatcher (DEPRECATED standalone daemon — prefer gateway-embedded dispatch)
Documentation=https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/env hermes kanban daemon --force --interval 60 --pidfile %t/hermes-kanban-dispatcher.pid
Restart=on-failure
RestartSec=5
# Log to the journal via stdout/stderr; the dispatcher also writes per-task
# worker output to $HERMES_HOME/kanban/logs/<task>.log.
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=default.target
|
||||
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "hermes-agent"
|
||||
-version = "0.11.0"
+version = "0.12.0"
|
||||
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
|
||||
+57
-27
@@ -23,6 +23,7 @@ Usage:
|
||||
import asyncio
|
||||
import base64
|
||||
import concurrent.futures
|
||||
import contextvars
|
||||
import copy
|
||||
import hashlib
|
||||
import json
|
||||
@@ -133,6 +134,7 @@ from agent.prompt_builder import (
|
||||
DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS,
|
||||
MEMORY_GUIDANCE, SESSION_SEARCH_GUIDANCE, SKILLS_GUIDANCE,
|
||||
HERMES_AGENT_HELP_GUIDANCE,
|
||||
KANBAN_GUIDANCE,
|
||||
build_nous_subscription_prompt,
|
||||
)
|
||||
from agent.model_metadata import (
|
||||
@@ -3592,11 +3594,15 @@ class AIAgent:
|
||||
|
||||
if actions:
|
||||
summary = " · ".join(dict.fromkeys(actions))
|
||||
self._safe_print(f" 💾 {summary}")
|
||||
self._safe_print(
|
||||
f" 💾 Self-improvement review: {summary}"
|
||||
)
|
||||
_bg_cb = self.background_review_callback
|
||||
if _bg_cb:
|
||||
try:
|
||||
_bg_cb(f"💾 {summary}")
|
||||
_bg_cb(
|
||||
f"💾 Self-improvement review: {summary}"
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -4823,6 +4829,12 @@ class AIAgent:
|
||||
tool_guidance.append(SESSION_SEARCH_GUIDANCE)
|
||||
if "skill_manage" in self.valid_tool_names:
|
||||
tool_guidance.append(SKILLS_GUIDANCE)
|
||||
# Kanban worker/orchestrator lifecycle — only present when the
|
||||
# dispatcher spawned this process (kanban_show check_fn gates on
|
||||
# HERMES_KANBAN_TASK env var). Normal chat sessions never see
|
||||
# this block.
|
||||
if "kanban_show" in self.valid_tool_names:
|
||||
tool_guidance.append(KANBAN_GUIDANCE)
|
||||
if tool_guidance:
|
||||
prompt_parts.append(" ".join(tool_guidance))
|
||||
|
||||
@@ -8501,6 +8513,7 @@ class AIAgent:
|
||||
Handles reasoning extraction, reasoning_details, and optional tool_calls
|
||||
so both the tool-call path and the final-response path share one builder.
|
||||
"""
|
||||
assistant_tool_calls = getattr(assistant_message, "tool_calls", None)
|
||||
reasoning_text = self._extract_reasoning(assistant_message)
|
||||
_from_structured = bool(reasoning_text)
|
||||
|
||||
@@ -8560,16 +8573,22 @@ class AIAgent:
|
||||
"finish_reason": finish_reason,
|
||||
}
|
||||
|
||||
if hasattr(assistant_message, "reasoning_content"):
|
||||
raw_reasoning_content = getattr(assistant_message, "reasoning_content", None)
|
||||
if raw_reasoning_content is not None:
|
||||
msg["reasoning_content"] = _sanitize_surrogates(raw_reasoning_content)
|
||||
elif msg.get("tool_calls") and self._needs_deepseek_tool_reasoning():
|
||||
# DeepSeek thinking mode requires reasoning_content on every
|
||||
# assistant tool-call message. Without it, replaying the
|
||||
# persisted message causes HTTP 400. Include empty string
|
||||
# as a defensive compatibility fallback (refs #15250).
|
||||
msg["reasoning_content"] = ""
|
||||
raw_reasoning_content = getattr(assistant_message, "reasoning_content", None)
|
||||
if raw_reasoning_content is None and hasattr(assistant_message, "model_extra"):
|
||||
model_extra = getattr(assistant_message, "model_extra", None) or {}
|
||||
if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
|
||||
raw_reasoning_content = model_extra["reasoning_content"]
|
||||
if raw_reasoning_content is not None:
|
||||
msg["reasoning_content"] = _sanitize_surrogates(raw_reasoning_content)
|
||||
elif assistant_tool_calls and self._needs_thinking_reasoning_pad():
|
||||
# DeepSeek v4 thinking mode and Kimi / Moonshot thinking mode
|
||||
# both require reasoning_content on every assistant tool-call
|
||||
# message. Without it, replaying the persisted message causes
|
||||
# HTTP 400 ("The reasoning_content in the thinking mode must
|
||||
# be passed back to the API"). Include streamed reasoning
|
||||
# text when captured; otherwise pad with empty string.
|
||||
# Refs #15250, #17400.
|
||||
msg["reasoning_content"] = reasoning_text or ""
|
||||
|
||||
# Additive fallback (refs #16844, #16884). Streaming-only providers
|
||||
# (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims)
|
||||
@@ -8626,9 +8645,9 @@ class AIAgent:
|
||||
if codex_message_items:
|
||||
msg["codex_message_items"] = codex_message_items
|
||||
|
||||
if assistant_message.tool_calls:
|
||||
if assistant_tool_calls:
|
||||
tool_calls = []
|
||||
for tool_call in assistant_message.tool_calls:
|
||||
for tool_call in assistant_tool_calls:
|
||||
raw_id = getattr(tool_call, "id", None)
|
||||
call_id = getattr(tool_call, "call_id", None)
|
||||
if not isinstance(call_id, str) or not call_id.strip():
|
||||
@@ -8677,6 +8696,18 @@ class AIAgent:
|
||||
|
||||
return msg
|
||||
|
||||
def _needs_thinking_reasoning_pad(self) -> bool:
|
||||
"""Return True when the active provider enforces reasoning_content echo-back.
|
||||
|
||||
DeepSeek v4 thinking and Kimi / Moonshot thinking both reject replays
|
||||
of assistant tool-call messages that omit ``reasoning_content`` (refs
|
||||
#15250, #17400).
|
||||
"""
|
||||
return (
|
||||
self._needs_deepseek_tool_reasoning()
|
||||
or self._needs_kimi_tool_reasoning()
|
||||
)
|
||||
|
||||
def _needs_kimi_tool_reasoning(self) -> bool:
|
||||
"""Return True when the current provider is Kimi / Moonshot thinking mode.
|
||||
|
||||
@@ -8719,20 +8750,17 @@ class AIAgent:
|
||||
api_msg["reasoning_content"] = existing
|
||||
return
|
||||
|
||||
needs_thinking_pad = (
|
||||
self._needs_kimi_tool_reasoning()
|
||||
or self._needs_deepseek_tool_reasoning()
|
||||
)
|
||||
needs_thinking_pad = self._needs_thinking_reasoning_pad()
|
||||
|
||||
# 2. Cross-provider poisoned history (#15748): on DeepSeek/Kimi,
|
||||
# if the source turn has tool_calls AND a 'reasoning' field but no
|
||||
# 'reasoning_content' key, the 'reasoning' text was written by a
|
||||
# prior provider (e.g. MiniMax) — DeepSeek's own _build_assistant_message
|
||||
# always pins reasoning_content="" at creation time for tool-call turns,
|
||||
# so the shape (reasoning set, reasoning_content absent, tool_calls
|
||||
# present) is unreachable from same-provider DeepSeek history. Inject
|
||||
# "" to satisfy the API without leaking another provider's chain of
|
||||
# thought to DeepSeek/Kimi.
|
||||
# pins reasoning_content at creation time for tool-call turns, so the
|
||||
# shape (reasoning set, reasoning_content absent, tool_calls present)
|
||||
# is unreachable from same-provider DeepSeek history after this fix.
|
||||
# Inject "" to satisfy the API without leaking another provider's
|
||||
# chain of thought to DeepSeek/Kimi.
|
||||
normalized_reasoning = source_msg.get("reasoning")
|
||||
if (
|
||||
needs_thinking_pad
|
||||
@@ -8745,9 +8773,9 @@ class AIAgent:
|
||||
|
||||
# 3. Healthy session: promote 'reasoning' field to 'reasoning_content'
|
||||
# for providers that use the internal 'reasoning' key.
|
||||
# This must happen BEFORE the DeepSeek/Kimi tool-call check so that
|
||||
# genuine reasoning content is not overwritten by the empty-string
|
||||
# fallback (#15812 regression in PR #15478).
|
||||
# This must happen before the unconditional empty-string fallback so
|
||||
# genuine reasoning content is not overwritten (#15812 regression in
|
||||
# PR #15478).
|
||||
if isinstance(normalized_reasoning, str) and normalized_reasoning:
|
||||
api_msg["reasoning_content"] = normalized_reasoning
|
||||
return
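
The hunks above settle on an order of precedence for `reasoning_content` when an assistant message is built: an explicit attribute first, then `model_extra`, then a pad for providers that require the field on tool-call turns. The sketch below restates that order as a standalone function. It is an illustration under assumptions, not the agent's method: `pick_reasoning_content` and its parameters are hypothetical, the surrogate sanitizing step is left out, and a `None` result stands for "omit the key".

```python
from types import SimpleNamespace
from typing import Any, Optional


def pick_reasoning_content(
    message: Any,
    has_tool_calls: bool,
    needs_pad: bool,
    reasoning_text: Optional[str],
) -> Optional[str]:
    """Choose the reasoning_content value to persist, or None to omit it."""
    raw = getattr(message, "reasoning_content", None)
    if raw is None:
        # Some SDK objects only expose the field through model_extra.
        extra = getattr(message, "model_extra", None) or {}
        if isinstance(extra, dict) and "reasoning_content" in extra:
            raw = extra["reasoning_content"]
    if raw is not None:
        return raw
    if has_tool_calls and needs_pad:
        # Prefer captured reasoning text; otherwise pad with "".
        return reasoning_text or ""
    return None


explicit = pick_reasoning_content(
    SimpleNamespace(reasoning_content="thinking"), True, True, None
)
from_extra = pick_reasoning_content(
    SimpleNamespace(reasoning_content=None,
                    model_extra={"reasoning_content": "from extra"}),
    True, True, None,
)
padded = pick_reasoning_content(SimpleNamespace(), True, True, None)
streamed = pick_reasoning_content(SimpleNamespace(), True, True, "streamed")
omitted = pick_reasoning_content(SimpleNamespace(), False, True, "streamed")
```

Padding only when tool calls are present keeps ordinary final responses free of an empty field that no provider asked for.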
|
||||
@@ -9416,7 +9444,9 @@ class AIAgent:
|
||||
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
|
||||
futures = []
|
||||
for i, (tc, name, args) in enumerate(parsed_calls):
|
||||
f = executor.submit(_run_tool, i, tc, name, args)
|
||||
# Propagate ContextVars (e.g. _approval_session_key); mirrors asyncio.to_thread.
|
||||
ctx = contextvars.copy_context()
|
||||
f = executor.submit(ctx.run, _run_tool, i, tc, name, args)
|
||||
futures.append(f)
|
||||
|
||||
# Wait for all to complete with periodic heartbeats so the
|
||||
|
||||
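
The last hunk wraps each submitted tool call in `ctx.run` so that `ContextVar` values set by the caller are visible inside the worker thread. A minimal sketch of that technique using only the standard library; the `session_key` variable and `read_key` function are hypothetical stand-ins for names such as `_approval_session_key`.

```python
import concurrent.futures
import contextvars

# Hypothetical context variable standing in for per-session state.
session_key = contextvars.ContextVar("session_key", default="unset")


def read_key() -> str:
    """Return whatever value the current context holds."""
    return session_key.get()


session_key.set("abc123")

with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    # Snapshot the caller's context and run the function inside it.
    ctx = contextvars.copy_context()
    propagated = executor.submit(ctx.run, read_key).result()
```

Taking a fresh `copy_context()` per submitted task, as the diff does inside its loop, matters because a single `Context` object cannot be entered by two threads at once; `Context.run` raises `RuntimeError` if the context is already entered.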
@@ -53,6 +53,9 @@ AUTHOR_MAP = {
|
||||
"sr@samirusani": "samrusani",
|
||||
"angelclaw@AngelMacBook.local": "angel12",
|
||||
"charles@cryptoassetrecovery.com": "charles-brooks",
|
||||
# DeepSeek v4 + Kimi thinking-mode reasoning_content salvage (April 2026)
|
||||
"luwinyang@deepseek.com": "lsdsjy",
|
||||
"season.saw@gmail.com": "season179",
|
||||
"heathley@Heathley-MacBook-Air.local": "heathley",
|
||||
"vlad19@gmail.com": "dandaka",
|
||||
"adamrummer@gmail.com": "cyclingwithelephants",
|
||||
@@ -626,6 +629,21 @@ AUTHOR_MAP = {
|
||||
"164839249+Joseph19820124@users.noreply.github.com": "Joseph19820124",
|
||||
"rugved@lmstudio.ai": "rugvedS07",
|
||||
"44333070+Heltman@users.noreply.github.com": "Heltman",
|
||||
# v0.12.0 additions
|
||||
"ching@kachingappz.com": "ching-kaching",
|
||||
"codezhujr@gmail.com": "Zjianru", # salvage chain: code by codez, PR #15749 author @Zjianru
|
||||
"daimon@noreply.github.com": "Siddharth Balyan", # co-author only
|
||||
"i@zkl2333.com": "zkl2333",
|
||||
"isaachuang@Isaacs-MacBook-Pro.local": "isaachuangGMICLOUD",
|
||||
"isaachuang@Mac.localdomain": "isaachuangGMICLOUD", # salvage of PR #11955 → #16663
|
||||
"liyuan851277048@icloud.com": "Octopus", # co-author only
|
||||
"me+github7604@versun.org": "Versun", # co-author only
|
||||
"my.vesper.nine@gmail.com": "kevin-ho", # salvage: PR #15488 author @kevin-ho
|
||||
"noreply@paperclip.ing": "Paperclip", # co-author only
|
||||
"teknium@hermes-agent": "teknium1",
|
||||
"web3blind@gmail.com": "web3blind",
|
||||
"ztzheng@163.com": "chengoak", # PR #17467
|
||||
"24110240104@m.fudan.edu.cn": "YuShu", # co-author only
|
||||
}
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,152 @@
|
||||
---
name: kanban-orchestrator
description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
version: 2.0.0
metadata:
  hermes:
    tags: [kanban, multi-agent, orchestration, routing]
    related_skills: [kanban-worker]
---

# Kanban Orchestrator — Decomposition Playbook

> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.

## When to use the board (vs. just doing the work)

Create Kanban tasks when any of these are true:

1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
2. **The work should survive a crash or restart.** Long-running, recurring, or important.
3. **The user might want to interject.** Human-in-the-loop at any step.
4. **Multiple subtasks can run in parallel.** Fan-out for speed.
5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
6. **The audit trail matters.** Board rows persist in SQLite forever.

If *none* of those apply — it's a small one-shot reasoning task — use `delegate_task` instead or answer the user directly.

## The anti-temptation rules

Your job description says "route, don't execute." The rules that enforce that:

- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly" — stop and create a task for the right specialist.
- **For any concrete task, create a Kanban task and assign it.** Every single time.
- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
- **Decompose, route, and summarize — that's the whole job.**

## The standard specialist roster (convention)

Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has — ask if you're unsure.

| Profile | Does | Typical workspace |
|---|---|---|
| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
| `backend-eng` | Writes server-side code | `worktree` |
| `frontend-eng` | Writes client-side code | `worktree` |
| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
| `pm` | Writes specs, acceptance criteria | `scratch` |
|
||||
|
||||
## Decomposition playbook
|
||||
|
||||
### Step 1 — Understand the goal
|
||||
|
||||
Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.
|
||||
|
||||
### Step 2 — Sketch the task graph
|
||||
|
||||
Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":
|
||||
|
||||
```
T1  researcher  research: Postgres cost vs current
T2  researcher  research: Postgres performance vs current
T3  analyst     synthesize migration recommendation    parents: T1, T2
T4  writer      draft decision memo                    parents: T3
```
|
||||
|
||||
Show this to the user. Let them correct it before you create anything.
|
||||
|
||||
### Step 3 — Create tasks and link
|
||||
|
||||
```python
import os

# Pass the tenant on every call so child tasks stay in the same namespace.
tenant = os.environ.get("HERMES_TENANT")

t1 = kanban_create(
    title="research: Postgres cost vs current",
    assignee="researcher",
    body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
    tenant=tenant,
)["task_id"]

t2 = kanban_create(
    title="research: Postgres performance vs current",
    assignee="researcher",
    body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
    tenant=tenant,
)["task_id"]

# T3 waits for both research tasks.
t3 = kanban_create(
    title="synthesize migration recommendation",
    assignee="analyst",
    body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
    parents=[t1, t2],
    tenant=tenant,
)["task_id"]

# T4 waits for the analyst's recommendation.
t4 = kanban_create(
    title="draft decision memo",
    assignee="writer",
    body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
    parents=[t3],
    tenant=tenant,
)["task_id"]
```
|
||||
|
||||
`parents=[...]` gates promotion — children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination needed; the dispatcher and dependency engine handle it.
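The promotion rule can be pictured with a toy in-memory model. This is only a sketch of the behavior described above, not the Hermes dependency engine: the `Task` and `Board` classes and their methods are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    parents: list[str] = field(default_factory=list)
    status: str = "todo"


class Board:
    """Toy model of parent-gated promotion (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.tasks: dict[str, Task] = {}

    def create(self, task_id: str, parents: list[str] | None = None) -> Task:
        task = Task(task_id, list(parents or []))
        self.tasks[task_id] = task
        self._promote()
        return task

    def complete(self, task_id: str) -> None:
        self.tasks[task_id].status = "done"
        self._promote()

    def _promote(self) -> None:
        # A task in `todo` becomes `ready` once every parent is `done`.
        # A task with no parents is promoted immediately.
        for task in self.tasks.values():
            if task.status == "todo" and all(
                self.tasks[p].status == "done" for p in task.parents
            ):
                task.status = "ready"


board = Board()
board.create("T1")
board.create("T2")
board.create("T3", parents=["T1", "T2"])

board.complete("T1")
status_after_one_parent = board.tasks["T3"].status

board.complete("T2")
status_after_both_parents = board.tasks["T3"].status
```

In this model T3 stays in `todo` while only one parent is done and moves to `ready` once both are, which mirrors the gating described for `parents=[...]`.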
|
||||
|
||||
### Step 4 — Complete your own task
|
||||
|
||||
If you were spawned as a task yourself (e.g. `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:
|
||||
|
||||
```python
kanban_complete(
    summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
    metadata={
        "task_graph": {
            "T1": {"assignee": "researcher", "parents": []},
            "T2": {"assignee": "researcher", "parents": []},
            "T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
            "T4": {"assignee": "writer", "parents": ["T3"]},
        },
    },
)
```
|
||||
|
||||
### Step 5 — Report back to the user
|
||||
|
||||
Tell them what you created in plain prose:

> I've queued 4 tasks:
> - **T1** (researcher): cost comparison
> - **T2** (researcher): performance comparison, in parallel with T1
> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
> - **T4** (writer): turns T3 into a CTO memo
>
> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.
|
||||
|
||||
## Common patterns
|
||||
|
||||
**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
|
||||
|
||||
**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage's `parents=[previous_task]`. Reviewer blocks or completes; if reviewer blocks, the operator unblocks with feedback and respawns.
|
||||
|
||||
**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. Dispatcher serializes — translator processes them in priority order, accumulating experience in their own memory.
|
||||
|
||||
**Human-in-the-loop:** Any task can `kanban_block()` to wait for input. Dispatcher respawns after `/unblock`. The comment thread carries the full context.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task — don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.
|
||||
|
||||
**Argument order for links.** `kanban_link(parent_id=..., child_id=...)` — parent first. Mixing them up demotes the wrong task to `todo`.
|
||||
|
||||
**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.
|
||||
|
||||
**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
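One way to avoid forgetting the tenant is to wrap the create call once. The following is a minimal sketch under stated assumptions: `with_tenant` is a hypothetical helper, not part of Hermes, and `fake_kanban_create` stands in for the real tool.

```python
from __future__ import annotations

import os
from typing import Any, Callable, Mapping


def with_tenant(
    create: Callable[..., Any],
    env: Mapping[str, str] | None = None,
) -> Callable[..., Any]:
    """Wrap a create function so every call carries the caller's tenant."""
    source = os.environ if env is None else env

    def create_in_tenant(**kwargs: Any) -> Any:
        tenant = source.get("HERMES_TENANT")
        # Fill the tenant in only when it is set and the caller did not choose one.
        if tenant and "tenant" not in kwargs:
            kwargs["tenant"] = tenant
        return create(**kwargs)

    return create_in_tenant


# Stand-in for the real tool so the sketch runs on its own.
calls: list[dict] = []


def fake_kanban_create(**kwargs: Any) -> dict:
    calls.append(kwargs)
    return {"task_id": f"t_{len(calls)}"}


create = with_tenant(fake_kanban_create, env={"HERMES_TENANT": "business-a"})
create(title="research: Postgres cost vs current", assignee="researcher")
```

When `HERMES_TENANT` is unset the wrapper passes the call through unchanged, so the same code works in single-tenant setups.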
|
||||
@@ -0,0 +1,134 @@
|
||||
---
|
||||
name: kanban-worker
|
||||
description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
|
||||
version: 2.0.0
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
|
||||
related_skills: [kanban-orchestrator]
|
||||
---
|
||||
|
||||
# Kanban Worker — Pitfalls and Examples
|
||||
|
||||
> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker` — it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.
|
||||
|
||||
## Workspace handling
|
||||
|
||||
Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:
|
||||
|
||||
| Kind | What it is | How to work |
|---|---|---|
| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |
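The three workspace kinds can be told apart from the spec string alone. Here is a minimal sketch of such a parser, assuming the accepted forms are exactly `scratch`, `worktree`, and `dir:<absolute path>`, with an empty spec defaulting to `scratch`. The function name `parse_workspace` and the choice of `ValueError` are assumptions for illustration, not the Hermes implementation.

```python
from __future__ import annotations

import os


def parse_workspace(spec: str) -> tuple[str, str | None]:
    """Split a workspace spec into (kind, path). Hypothetical helper."""
    if not spec:
        # Assumed default when no workspace is given.
        return ("scratch", None)
    if spec in ("scratch", "worktree"):
        return (spec, None)
    if spec.startswith("dir:"):
        path = os.path.expanduser(spec[len("dir:"):])
        if not path:
            raise ValueError("dir: requires a path")
        if not os.path.isabs(path):
            # Mirrors the rule that relative paths are rejected.
            raise ValueError(f"dir: path must be absolute, got {path!r}")
        return ("dir", path)
    raise ValueError(f"unknown workspace kind: {spec!r}")


kind, path = parse_workspace("dir:/tmp/work")
```

Expanding `~` before the absolute-path check lets a spec such as `dir:~/vault` resolve to a concrete directory instead of being rejected as relative.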
|
||||
|
||||
## Tenant isolation
|
||||
|
||||
If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:
|
||||
|
||||
- Good: `business-a: Acme is our biggest customer`
- Bad (leaks): `Acme is our biggest customer`
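The prefix convention is easy to apply mechanically. The helpers below are a hedged sketch: `tenant_scoped` and `visible_to` are hypothetical names, and the `"<tenant>: "` prefix format is taken from the good example above.

```python
from __future__ import annotations

import os


def tenant_scoped(entry: str, tenant: str | None = None) -> str:
    """Prefix a memory entry with the tenant, unless it is already prefixed."""
    if tenant is None:
        tenant = os.environ.get("HERMES_TENANT")
    if not tenant:
        # No tenant namespace: leave the entry unchanged.
        return entry
    prefix = f"{tenant}: "
    return entry if entry.startswith(prefix) else prefix + entry


def visible_to(entries: list[str], tenant: str) -> list[str]:
    """Keep only the entries that belong to the given tenant."""
    prefix = f"{tenant}: "
    return [e for e in entries if e.startswith(prefix)]


scoped = tenant_scoped("Acme is our biggest customer", tenant="business-a")
```

Filtering on the same prefix when reading keeps one tenant's notes out of another tenant's context.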
|
||||
|
||||
## Good summary + metadata shapes
|
||||
|
||||
The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:
|
||||
|
||||
**Coding task:**
|
||||
```python
kanban_complete(
    summary="shipped rate limiter: token bucket, keys on user_id with IP fallback, 14 tests pass",
    metadata={
        "changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
        "tests_run": 14,
        "tests_passed": 14,
        "decisions": ["user_id primary, IP fallback for unauthenticated requests"],
    },
)
```
|
||||
|
||||
**Research task:**
|
||||
```python
kanban_complete(
    summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, TensorRT-LLM on memory efficiency",
    metadata={
        "sources_read": 12,
        "recommendation": "vLLM",
        "benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
    },
)
```
|
||||
|
||||
**Review task:**
|
||||
```python
kanban_complete(
    summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
    metadata={
        "pr_number": 123,
        "findings": [
            {"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
            {"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
        ],
        "approved": False,
    },
)
```
|
||||
|
||||
Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.
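To illustrate why structured `metadata` matters, here is a sketch of a downstream consumer that reads the review-task shape shown above without touching the prose summary. The function names and the choice of which severities count as blocking are assumptions for illustration.

```python
from __future__ import annotations

# Assumption: these severities gate approval; adjust to your own convention.
BLOCKING_SEVERITIES = {"critical", "high"}


def blocking_findings(metadata: dict) -> list[dict]:
    """Return the findings whose severity should block a merge."""
    return [
        f for f in metadata.get("findings", [])
        if f.get("severity") in BLOCKING_SEVERITIES
    ]


def summarize_review(metadata: dict) -> str:
    """Produce a one-line status from review metadata alone."""
    findings = blocking_findings(metadata)
    if metadata.get("approved") and not findings:
        return "approved"
    if not findings:
        return "not approved, no blocking findings listed"
    files = sorted({f["file"] for f in findings if "file" in f})
    return f"changes requested: {len(findings)} blocking finding(s) in {', '.join(files)}"


review_metadata = {
    "pr_number": 123,
    "findings": [
        {"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
        {"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
    ],
    "approved": False,
}
status = summarize_review(review_metadata)
```

Because the findings are a list of dicts with stable keys, an aggregator can count, filter, or route them without parsing free text.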
|
||||
|
||||
## Block reasons that get answered fast
|
||||
|
||||
Bad: `"stuck"` — the human has no context.
|
||||
|
||||
Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.
|
||||
|
||||
```python
import os

# Long context goes in a comment; the block reason stays one sentence.
kanban_comment(
    task_id=os.environ["HERMES_KANBAN_TASK"],
    body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
)
kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
```
|
||||
|
||||
The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.
|
||||
|
||||
## Heartbeats worth sending
|
||||
|
||||
Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.
|
||||
|
||||
Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Send one every few minutes at most, and skip them entirely for tasks under ~2 minutes.
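The cadence advice can be enforced with a small throttle. This is a sketch under stated assumptions: the `HeartbeatThrottle` class is hypothetical, the 180-second default is one reading of "every few minutes", and `send` stands for whatever callable actually delivers the heartbeat.

```python
from __future__ import annotations

import time
from typing import Callable


class HeartbeatThrottle:
    """Drop empty notes and heartbeats that arrive too soon after the last one."""

    def __init__(
        self,
        send: Callable[[str], None],
        min_interval_s: float = 180.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_sent: float | None = None

    def beat(self, note: str) -> bool:
        """Send the note if allowed; return True when it was actually sent."""
        if not note.strip():
            return False  # empty notes carry no progress information
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._min_interval_s:
            return False  # too soon since the previous heartbeat
        self._send(note)
        self._last_sent = now
        return True


sent: list[str] = []
fake_now = [0.0]
throttle = HeartbeatThrottle(sent.append, min_interval_s=180.0, clock=lambda: fake_now[0])

first = throttle.beat("scanned 0.4M/2.4M rows")
fake_now[0] = 60.0
second = throttle.beat("scanned 0.5M/2.4M rows")
fake_now[0] = 200.0
third = throttle.beat("scanned 1.2M/2.4M rows")
```

Injecting the clock keeps the throttle testable without sleeping; a long-running loop would call `beat` on every iteration and let the throttle decide.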
|
||||
|
||||
## Retry scenarios
|
||||
|
||||
If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:
|
||||
|
||||
- `outcome: "timed_out"`: the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
- `outcome: "crashed"`: OOM or segfault. Reduce memory footprint.
- `outcome: "spawn_failed"` + `error: "..."`: usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
- `outcome: "reclaimed"` + `summary: "task archived..."`: operator archived the task out from under the previous run; you probably shouldn't be running at all, so check status carefully.
- `outcome: "blocked"`: a previous attempt blocked; the unblock comment should be in the thread by now.
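The diagnostics above amount to a lookup on the most recent closed run. The sketch below assumes `runs` is a list of dicts ordered oldest to newest, each with an `outcome` key once the run has closed; that shape, the `next_step` name, and the advice strings are assumptions for illustration.

```python
from __future__ import annotations

ADVICE = {
    "timed_out": "chunk or shorten the work",
    "crashed": "reduce memory footprint",
    "spawn_failed": "block and ask the human about profile config",
    "reclaimed": "re-check task status before doing anything",
    "blocked": "read the unblock comment in the thread",
}


def next_step(runs: list[dict]) -> str:
    """Suggest how to approach this attempt based on prior closed runs."""
    closed = [r for r in runs if r.get("outcome")]
    if not closed:
        return "first attempt: proceed normally"
    outcome = closed[-1]["outcome"]
    # Unknown outcomes get a conservative default instead of a guess.
    return ADVICE.get(
        outcome, f"unknown outcome {outcome!r}: read the comment thread before acting"
    )


suggestion = next_step([
    {"outcome": "crashed", "error": "OOM"},
    {"outcome": "timed_out"},
])
```

Only the latest closed run drives the suggestion here; a fuller version might also look for repeated failures of the same kind.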
|
||||
|
||||
## Do NOT
|
||||
|
||||
- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
- Create follow-up tasks assigned to yourself; assign them to the right specialist instead.
- Complete a task you didn't actually finish. Block it instead.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
**Task state can change between dispatch and your startup.** Between when the dispatcher claimed and when your process actually booted, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop — you shouldn't be running.
|
||||
|
||||
**Workspace may have stale artifacts.** Especially `dir:` and `worktree` workspaces can have files from previous runs. Read the comment thread — it usually explains why you're running again and what state the workspace is in.
|
||||
|
||||
**Don't rely on the CLI when the tools are available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). Running `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.
|
||||
|
||||
## CLI fallback (for scripting)
|
||||
|
||||
Every tool has a CLI equivalent for human operators and scripts:
|
||||
- `kanban_show` ↔ `hermes kanban show <id> --json`
- `kanban_complete` ↔ `hermes kanban complete <id> --summary "..." --metadata '{...}'`
- `kanban_block` ↔ `hermes kanban block <id> "reason"`
- `kanban_create` ↔ `hermes kanban create "title" --assignee <profile> [--parent <id>]`
- etc.
|
||||
|
||||
Use the tools from inside an agent; the CLI exists for the human at the terminal.
|
||||
@@ -124,7 +124,7 @@ class TestMcpRegistrationE2E:
|
||||
         mock_conn.request_permission = AsyncMock()
         acp_agent._conn = mock_conn
 
-        def mock_run_conversation(user_message, conversation_history=None, task_id=None):
+        def mock_run_conversation(user_message, conversation_history=None, task_id=None, **kwargs):
             """Simulate an agent turn that calls terminal, gets a result, then responds."""
             agent = state.agent
|
||||
@@ -213,7 +213,7 @@ class TestMcpRegistrationE2E:
|
||||
         mock_conn.request_permission = AsyncMock()
         acp_agent._conn = mock_conn
 
-        def mock_run(user_message, conversation_history=None, task_id=None):
+        def mock_run(user_message, conversation_history=None, task_id=None, **kwargs):
             agent = state.agent
             # Fire two tool calls
             if agent.tool_progress_callback:
|
||||
@@ -620,6 +620,41 @@ class TestChatCompletionsNormalize:
|
||||
assert nr.reasoning == "summary text"
|
||||
assert nr.provider_data == {"reasoning_content": "detailed scratchpad"}
|
||||
|
||||
def test_empty_reasoning_content_preserved(self, transport):
|
||||
"""DeepSeek can require an explicit empty reasoning_content replay field."""
|
||||
r = SimpleNamespace(
|
||||
choices=[SimpleNamespace(
|
||||
message=SimpleNamespace(
|
||||
content=None,
|
||||
tool_calls=None,
|
||||
reasoning=None,
|
||||
reasoning_content="",
|
||||
),
|
||||
finish_reason="stop",
|
||||
)],
|
||||
usage=None,
|
||||
)
|
||||
nr = transport.normalize_response(r)
|
||||
assert nr.provider_data == {"reasoning_content": ""}
|
||||
assert nr.reasoning_content == ""
|
||||
|
||||
def test_reasoning_content_preserved_from_model_extra(self, transport):
|
||||
"""OpenAI SDK can expose provider-specific DeepSeek fields via model_extra."""
|
||||
r = SimpleNamespace(
|
||||
choices=[SimpleNamespace(
|
||||
message=SimpleNamespace(
|
||||
content=None,
|
||||
tool_calls=None,
|
||||
reasoning=None,
|
||||
model_extra={"reasoning_content": "model-extra scratchpad"},
|
||||
),
|
||||
finish_reason="stop",
|
||||
)],
|
||||
usage=None,
|
||||
)
|
||||
nr = transport.normalize_response(r)
|
||||
assert nr.provider_data == {"reasoning_content": "model-extra scratchpad"}
|
||||
|
||||
|
||||
class TestChatCompletionsCacheStats:
|
||||
|
||||
|
||||
@@ -0,0 +1,206 @@
|
||||
"""Tests for cli._cprint's bg-thread cooperation with prompt_toolkit.
|
||||
|
||||
Background: when a prompt_toolkit Application is running, a bg thread that
|
||||
calls ``_pt_print`` directly can race with the input-area redraw and the
|
||||
printed line can end up visually buried behind the prompt. ``_cprint`` now
|
||||
routes cross-thread prints through ``run_in_terminal`` via
|
||||
``loop.call_soon_threadsafe`` so the self-improvement background review's
|
||||
``💾 Self-improvement review: …`` summary actually surfaces to the user.
|
||||
|
||||
These tests verify the routing logic without spinning up a real PT app.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
import types
|
||||
from types import SimpleNamespace
|
||||
|
||||
import cli
|
||||
|
||||
|
||||
def test_cprint_no_app_direct_print(monkeypatch):
    """No active app → direct _pt_print, no run_in_terminal involvement."""
    calls = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: calls.append(("pt_print", x)))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: ("ANSI", t))

    # Patch the prompt_toolkit import the function performs internally.
    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: None
    fake_pt_app.run_in_terminal = lambda *a, **kw: calls.append(("run_in_terminal",))
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("hello")

    assert calls == [("pt_print", ("ANSI", "hello"))]
|
||||
|
||||
|
||||
def test_cprint_app_not_running_direct_print(monkeypatch):
    """App exists but not running (e.g. teardown) → direct print."""
    calls = []
    monkeypatch.setattr(cli, "_pt_print", lambda x: calls.append(("pt_print", x)))
    monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)

    fake_app = SimpleNamespace(_is_running=False, loop=None)
    fake_pt_app = types.ModuleType("prompt_toolkit.application")
    fake_pt_app.get_app_or_none = lambda: fake_app
    fake_pt_app.run_in_terminal = lambda *a, **kw: calls.append(("run_in_terminal",))
    monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)

    cli._cprint("x")

    assert calls == [("pt_print", "x")]
|
||||
|
||||
|
||||
def test_cprint_bg_thread_schedules_on_app_loop(monkeypatch):
|
||||
"""App running + different thread → schedules via call_soon_threadsafe."""
|
||||
scheduled = []
|
||||
direct_prints = []
|
||||
|
||||
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
|
||||
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
|
||||
|
||||
class FakeLoop:
|
||||
def is_running(self):
|
||||
return True
|
||||
|
||||
def call_soon_threadsafe(self, cb, *args):
|
||||
scheduled.append(cb)
|
||||
|
||||
fake_loop = FakeLoop()
|
||||
|
||||
# Install a fake "current loop" that is NOT the app's loop, so the
|
||||
# cross-thread branch is taken.
|
||||
fake_current_loop = SimpleNamespace(is_running=lambda: True)
|
||||
fake_asyncio = types.ModuleType("asyncio")
|
||||
|
||||
class _Policy:
|
||||
def get_event_loop(self):
|
||||
return fake_current_loop
|
||||
|
||||
fake_asyncio.get_event_loop_policy = lambda: _Policy()
|
||||
monkeypatch.setitem(sys.modules, "asyncio", fake_asyncio)
|
||||
|
||||
fake_app = SimpleNamespace(_is_running=True, loop=fake_loop)
|
||||
fake_pt_app = types.ModuleType("prompt_toolkit.application")
|
||||
fake_pt_app.get_app_or_none = lambda: fake_app
|
||||
|
||||
run_in_terminal_calls = []
|
||||
|
||||
def _fake_run_in_terminal(func, **kw):
|
||||
run_in_terminal_calls.append(func)
|
||||
# Simulate run_in_terminal actually calling func (as the real PT
|
||||
# impl would once the app loop tick picks it up).
|
||||
func()
|
||||
return None
|
||||
|
||||
fake_pt_app.run_in_terminal = _fake_run_in_terminal
|
||||
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
|
||||
|
||||
cli._cprint("💾 Self-improvement review: Skill updated")
|
||||
|
||||
# call_soon_threadsafe must have been called with a scheduling cb.
|
||||
assert len(scheduled) == 1
|
||||
|
||||
# Invoking the scheduled callback should hit run_in_terminal.
|
||||
scheduled[0]()
|
||||
assert len(run_in_terminal_calls) == 1
|
||||
|
||||
# And run_in_terminal's inner func should have emitted a pt_print.
|
||||
assert direct_prints == ["💾 Self-improvement review: Skill updated"]
|
||||
|
||||
|
||||
def test_cprint_same_thread_as_app_loop_direct_print(monkeypatch):
|
||||
"""App running on same thread → direct print (no scheduling)."""
|
||||
direct_prints = []
|
||||
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
|
||||
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
|
||||
|
||||
class FakeLoop:
|
||||
def is_running(self):
|
||||
return True
|
||||
|
||||
def call_soon_threadsafe(self, cb, *args):
|
||||
raise AssertionError(
|
||||
"call_soon_threadsafe must not be used on the app's own thread"
|
||||
)
|
||||
|
||||
fake_loop = FakeLoop()
|
||||
fake_asyncio = types.ModuleType("asyncio")
|
||||
|
||||
class _Policy:
|
||||
def get_event_loop(self):
|
||||
return fake_loop # same as app loop
|
||||
|
||||
fake_asyncio.get_event_loop_policy = lambda: _Policy()
|
||||
monkeypatch.setitem(sys.modules, "asyncio", fake_asyncio)
|
||||
|
||||
fake_app = SimpleNamespace(_is_running=True, loop=fake_loop)
|
||||
fake_pt_app = types.ModuleType("prompt_toolkit.application")
|
||||
fake_pt_app.get_app_or_none = lambda: fake_app
|
||||
fake_pt_app.run_in_terminal = lambda *a, **kw: None
|
||||
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
|
||||
|
||||
cli._cprint("x")
|
||||
|
||||
assert direct_prints == ["x"]
|
||||
|
||||
|
||||
def test_cprint_swallows_app_loop_attr_error(monkeypatch):
|
||||
"""Loop missing on app → fall back to direct print, no crash."""
|
||||
direct_prints = []
|
||||
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
|
||||
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
|
||||
|
||||
class WeirdApp:
|
||||
_is_running = True
|
||||
|
||||
@property
|
||||
def loop(self):
|
||||
raise RuntimeError("no loop for you")
|
||||
|
||||
fake_pt_app = types.ModuleType("prompt_toolkit.application")
|
||||
fake_pt_app.get_app_or_none = lambda: WeirdApp()
|
||||
fake_pt_app.run_in_terminal = lambda *a, **kw: None
|
||||
monkeypatch.setitem(sys.modules, "prompt_toolkit.application", fake_pt_app)
|
||||
|
||||
cli._cprint("fallback")
|
||||
|
||||
assert direct_prints == ["fallback"]
|
||||
|
||||
|
||||
def test_cprint_swallows_prompt_toolkit_import_error(monkeypatch):
|
||||
"""If prompt_toolkit.application itself fails to import, fall back."""
|
||||
direct_prints = []
|
||||
monkeypatch.setattr(cli, "_pt_print", lambda x: direct_prints.append(x))
|
||||
monkeypatch.setattr(cli, "_PT_ANSI", lambda t: t)
|
||||
|
||||
# Drop cached prompt_toolkit.application AND install a meta-path finder
|
||||
# that raises ImportError on re-import.
|
||||
monkeypatch.delitem(sys.modules, "prompt_toolkit.application", raising=False)
|
||||
|
||||
class _BlockFinder:
|
||||
def find_module(self, name, path=None):
|
||||
if name == "prompt_toolkit.application":
|
||||
return self
|
||||
return None
|
||||
|
||||
def load_module(self, name):
|
||||
raise ImportError("blocked for test")
|
||||
|
||||
def find_spec(self, name, path=None, target=None):
|
||||
if name == "prompt_toolkit.application":
|
||||
# Returning a bogus spec that will fail on load works too,
|
||||
# but raising here keeps the test simple.
|
||||
raise ImportError("blocked for test")
|
||||
return None
|
||||
|
||||
blocker = _BlockFinder()
|
||||
sys.meta_path.insert(0, blocker)
|
||||
try:
|
||||
cli._cprint("fallback2")
|
||||
finally:
|
||||
sys.meta_path.remove(blocker)
|
||||
|
||||
assert direct_prints == ["fallback2"]
|
||||
@@ -1,7 +1,21 @@
|
||||
"""Tests for the curator CLI status renderer."""
|
||||
"""Tests for `hermes curator status` output.
|
||||
|
||||
Covers:
|
||||
- y0shualee's "least recently active" semantic (view/patch/use all count as activity).
|
||||
- The most-used / least-used rankings by activity_count so users can see which
|
||||
skills actually get exercised.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import io
|
||||
from argparse import Namespace
|
||||
from contextlib import redirect_stdout
|
||||
from pathlib import Path
|
||||
from types import SimpleNamespace
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
def test_status_uses_last_activity_not_only_last_used(monkeypatch, capsys):
|
||||
import agent.curator as curator_state
|
||||
@@ -41,3 +55,115 @@ def test_status_uses_last_activity_not_only_last_used(monkeypatch, capsys):
|
||||
assert "activity= 4" in out
|
||||
assert "last_activity=never" not in out
|
||||
assert "last_used=never" not in out
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def curator_status_env(tmp_path, monkeypatch):
|
||||
"""Isolated HERMES_HOME with real agent-created skills on disk."""
|
||||
home = tmp_path / ".hermes"
|
||||
skills = home / "skills"
|
||||
skills.mkdir(parents=True)
|
||||
(home / "logs").mkdir()
|
||||
monkeypatch.setenv("HERMES_HOME", str(home))
|
||||
monkeypatch.setattr(Path, "home", lambda: tmp_path)
|
||||
|
||||
import importlib
|
||||
import hermes_constants
|
||||
importlib.reload(hermes_constants)
|
||||
from tools import skill_usage
|
||||
importlib.reload(skill_usage)
|
||||
from agent import curator
|
||||
importlib.reload(curator)
|
||||
from hermes_cli import curator as curator_cli
|
||||
importlib.reload(curator_cli)
|
||||
|
||||
def _write_skill(name: str) -> None:
|
||||
d = skills / name
|
||||
d.mkdir()
|
||||
(d / "SKILL.md").write_text(
|
||||
"---\n"
|
||||
f"name: {name}\n"
|
||||
"description: test\n"
|
||||
"version: 1.0.0\n"
|
||||
"metadata:\n"
|
||||
" hermes:\n"
|
||||
" agent_created: true\n"
|
||||
"---\n"
|
||||
f"# {name}\n"
|
||||
)
|
||||
|
||||
return {
|
||||
"home": home,
|
||||
"skills": skills,
|
||||
"make_skill": _write_skill,
|
||||
"skill_usage": skill_usage,
|
||||
"curator_cli": curator_cli,
|
||||
}
|
||||
|
||||
|
||||
def _capture_status(curator_cli) -> str:
    buf = io.StringIO()
    with redirect_stdout(buf):
        rc = curator_cli._cmd_status(Namespace())
    assert rc == 0
    return buf.getvalue()
|
||||
|
||||
|
||||
def test_status_shows_most_and_least_used_sections(curator_status_env):
|
||||
env = curator_status_env
|
||||
env["make_skill"]("top-dog")
|
||||
env["make_skill"]("middling")
|
||||
env["make_skill"]("never-used")
|
||||
|
||||
# Bump use_count differentially. All three counters (use/view/patch) feed
|
||||
# into activity_count, so bumping use alone is enough to make activity
|
||||
# diverge between skills.
|
||||
for _ in range(10):
|
||||
env["skill_usage"].bump_use("top-dog")
|
||||
for _ in range(2):
|
||||
env["skill_usage"].bump_use("middling")
|
||||
|
||||
out = _capture_status(env["curator_cli"])
|
||||
|
||||
# Both new sections present
|
||||
assert "most active (top 5):" in out
|
||||
assert "least active (top 5):" in out
|
||||
# y0shualee's section preserved
|
||||
assert "least recently active (top 5):" in out
|
||||
|
||||
# most-active lists top-dog FIRST (highest activity_count)
|
||||
most_section = out.split("most active (top 5):")[1].split("\n\n")[0]
|
||||
top_line = most_section.strip().split("\n")[0]
|
||||
assert "top-dog" in top_line
|
||||
assert "activity= 10" in top_line
|
||||
|
||||
# least-active lists never-used FIRST (activity=0)
|
||||
least_section = out.split("least active (top 5):")[1].split("\n\n")[0]
|
||||
bottom_line = least_section.strip().split("\n")[0]
|
||||
assert "never-used" in bottom_line
|
||||
assert "activity= 0" in bottom_line
|
||||
|
||||
|
||||
def test_status_hides_most_active_when_all_zero(curator_status_env):
|
||||
"""If no skills have any activity, skip the most-active block — it's noise.
|
||||
Least-active still shows so the user sees their catalog."""
|
||||
env = curator_status_env
|
||||
env["make_skill"]("a")
|
||||
env["make_skill"]("b")
|
||||
# No bumps.
|
||||
|
||||
out = _capture_status(env["curator_cli"])
|
||||
|
||||
# most-active section is hidden because the top is 0
|
||||
assert "most active (top 5):" not in out
|
||||
# least-active still renders — it's part of the catalog overview
|
||||
assert "least active (top 5):" in out
|
||||
|
||||
|
||||
def test_status_no_skills_produces_clean_empty_output(curator_status_env):
|
||||
env = curator_status_env
|
||||
out = _capture_status(env["curator_cli"])
|
||||
assert "no agent-created skills" in out
|
||||
# None of the ranking sections render
|
||||
assert "most active" not in out
|
||||
assert "least active" not in out
|
||||
|
||||
@@ -0,0 +1,16 @@
|
||||
"""Static dashboard tests for browser-safe @nous-research/ui imports."""
|
||||
from pathlib import Path


WEB_SRC = Path(__file__).resolve().parents[2] / "web" / "src"


def test_dashboard_does_not_import_nous_ui_root_barrel():
    offenders = []
    for ext in ("*.tsx", "*.ts"):
        for path in WEB_SRC.rglob(ext):
            content = path.read_text(encoding="utf-8")
            if 'from "@nous-research/ui"' in content or "from '@nous-research/ui'" in content:
                offenders.append(str(path.relative_to(WEB_SRC)))

    assert offenders == []
|
||||
@@ -0,0 +1,11 @@
|
||||
"""Static dashboard tests for the Profiles navigation copy."""
|
||||
from pathlib import Path


def test_profiles_nav_label_uses_short_multi_agents_copy():
    en_i18n = Path(__file__).resolve().parents[2] / "web" / "src" / "i18n" / "en.ts"

    content = en_i18n.read_text(encoding="utf-8")

    assert 'profiles: "profiles : multi agents"' in content
    assert "Profiles: Running Multiple Agents" not in content
|
||||
@@ -0,0 +1,210 @@
|
||||
"""Tests for the kanban CLI surface (hermes_cli.kanban)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
from hermes_cli import kanban as kc
|
||||
from hermes_cli import kanban_db as kb
|
||||
|
||||
|
||||
@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    kb.init_db()
    return home
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Workspace flag parsing
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@pytest.mark.parametrize(
    "value,expected",
    [
        ("scratch", ("scratch", None)),
        ("worktree", ("worktree", None)),
        ("dir:/tmp/work", ("dir", "/tmp/work")),
    ],
)
def test_parse_workspace_flag_valid(value, expected):
    assert kc._parse_workspace_flag(value) == expected


def test_parse_workspace_flag_expands_user():
    kind, path = kc._parse_workspace_flag("dir:~/vault")
    assert kind == "dir"
    assert path.endswith("/vault")
    assert not path.startswith("~")


@pytest.mark.parametrize("bad", ["cloud", "dir:", "", "worktree:/x"])
def test_parse_workspace_flag_rejects(bad):
    if not bad:
        # Empty -> defaults; not an error.
        assert kc._parse_workspace_flag(bad) == ("scratch", None)
        return
    with pytest.raises(argparse.ArgumentTypeError):
        kc._parse_workspace_flag(bad)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
# run_slash smoke tests (end-to-end via the same entry both CLI and gateway use)
# ---------------------------------------------------------------------------

def test_run_slash_no_args_shows_usage(kanban_home):
    out = kc.run_slash("")
    assert "kanban" in out.lower()
    assert "create" in out.lower() or "subcommand" in out.lower() or "action" in out.lower()


def test_run_slash_create_and_list(kanban_home):
    out = kc.run_slash("create 'ship feature' --assignee alice")
    assert "Created" in out
    out = kc.run_slash("list")
    assert "ship feature" in out
    assert "alice" in out


def test_run_slash_create_with_parent_and_cascade(kanban_home):
    # Parent then child via --parent
    out1 = kc.run_slash("create 'parent' --assignee alice")
    # Extract the "t_xxxx" id from "Created t_xxxx (ready, ...)"
    import re
    m = re.search(r"(t_[a-f0-9]+)", out1)
    assert m
    p = m.group(1)
    out2 = kc.run_slash(f"create 'child' --assignee bob --parent {p}")
    assert "todo" in out2  # child starts as todo

    # Complete parent; list should promote child to ready
    kc.run_slash(f"complete {p}")
    # Explicit filter: child should now be ready (was todo before complete).
    ready_list = kc.run_slash("list --status ready")
    assert "child" in ready_list


def test_run_slash_show_includes_comments(kanban_home):
    out = kc.run_slash("create 'x'")
    import re
    tid = re.search(r"(t_[a-f0-9]+)", out).group(1)
    kc.run_slash(f"comment {tid} 'source is paywalled'")
    show = kc.run_slash(f"show {tid}")
    assert "source is paywalled" in show


def test_run_slash_block_unblock_cycle(kanban_home):
    out = kc.run_slash("create 'x' --assignee alice")
    import re
    tid = re.search(r"(t_[a-f0-9]+)", out).group(1)
    # Claim first so block() finds it running
    kc.run_slash(f"claim {tid}")
    assert "Blocked" in kc.run_slash(f"block {tid} 'need decision'")
    assert "Unblocked" in kc.run_slash(f"unblock {tid}")


def test_run_slash_json_output(kanban_home):
    out = kc.run_slash("create 'jsontask' --assignee alice --json")
    payload = json.loads(out)
    assert payload["title"] == "jsontask"
    assert payload["assignee"] == "alice"
    assert payload["status"] == "ready"


def test_run_slash_dispatch_dry_run_counts(kanban_home):
    kc.run_slash("create 'a' --assignee alice")
    kc.run_slash("create 'b' --assignee bob")
    out = kc.run_slash("dispatch --dry-run")
    assert "Spawned:" in out


def test_run_slash_context_output_format(kanban_home):
    out = kc.run_slash("create 'tech spec' --assignee alice --body 'write an RFC'")
    import re
    tid = re.search(r"(t_[a-f0-9]+)", out).group(1)
    kc.run_slash(f"comment {tid} 'remember to include performance section'")
    ctx = kc.run_slash(f"context {tid}")
    assert "tech spec" in ctx
    assert "write an RFC" in ctx
    assert "performance section" in ctx


def test_run_slash_tenant_filter(kanban_home):
    kc.run_slash("create 'biz-a task' --tenant biz-a --assignee alice")
    kc.run_slash("create 'biz-b task' --tenant biz-b --assignee alice")
    a = kc.run_slash("list --tenant biz-a")
    b = kc.run_slash("list --tenant biz-b")
    assert "biz-a task" in a and "biz-b task" not in a
    assert "biz-b task" in b and "biz-a task" not in b


def test_run_slash_usage_error_returns_message(kanban_home):
    # Missing required argument for create
    out = kc.run_slash("create")
    assert "usage" in out.lower() or "error" in out.lower()


def test_run_slash_assign_reassigns(kanban_home):
    out = kc.run_slash("create 'x' --assignee alice")
    import re
    tid = re.search(r"(t_[a-f0-9]+)", out).group(1)
    assert "Assigned" in kc.run_slash(f"assign {tid} bob")
    show = kc.run_slash(f"show {tid}")
    assert "bob" in show


def test_run_slash_link_unlink(kanban_home):
    a = kc.run_slash("create 'a'")
    b = kc.run_slash("create 'b'")
    import re
    ta = re.search(r"(t_[a-f0-9]+)", a).group(1)
    tb = re.search(r"(t_[a-f0-9]+)", b).group(1)
    assert "Linked" in kc.run_slash(f"link {ta} {tb}")
    # After link, b is todo
    show = kc.run_slash(f"show {tb}")
    assert "todo" in show
    assert "Unlinked" in kc.run_slash(f"unlink {ta} {tb}")


# ---------------------------------------------------------------------------
# Integration with the COMMAND_REGISTRY
# ---------------------------------------------------------------------------

def test_kanban_is_resolvable():
    from hermes_cli.commands import resolve_command

    cmd = resolve_command("kanban")
    assert cmd is not None
    assert cmd.name == "kanban"


def test_kanban_bypasses_active_session_guard():
    from hermes_cli.commands import should_bypass_active_session

    assert should_bypass_active_session("kanban")


def test_kanban_in_autocomplete_table():
    from hermes_cli.commands import COMMANDS, SUBCOMMANDS

    assert "/kanban" in COMMANDS
    subs = SUBCOMMANDS.get("/kanban") or []
    assert "create" in subs
    assert "dispatch" in subs


def test_kanban_not_gateway_only():
    # kanban is available in BOTH CLI and gateway surfaces.
    from hermes_cli.commands import COMMAND_REGISTRY

    cmd = next(c for c in COMMAND_REGISTRY if c.name == "kanban")
    assert not cmd.cli_only
    assert not cmd.gateway_only
@@ -0,0 +1,438 @@
"""Tests for the Kanban DB layer (hermes_cli.kanban_db)."""

from __future__ import annotations

import concurrent.futures
import os
import time
from pathlib import Path

import pytest

from hermes_cli import kanban_db as kb


@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
    """Isolated HERMES_HOME with an empty kanban DB."""
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    kb.init_db()
    return home


# ---------------------------------------------------------------------------
# Schema / init
# ---------------------------------------------------------------------------

def test_init_db_is_idempotent(kanban_home):
    # Second call should not error or drop data.
    with kb.connect() as conn:
        kb.create_task(conn, title="persisted")
    kb.init_db()
    with kb.connect() as conn:
        tasks = kb.list_tasks(conn)
    assert len(tasks) == 1
    assert tasks[0].title == "persisted"


def test_init_creates_expected_tables(kanban_home):
    with kb.connect() as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    names = {r["name"] for r in rows}
    assert {"tasks", "task_links", "task_comments", "task_events"} <= names


# ---------------------------------------------------------------------------
# Task creation + status inference
# ---------------------------------------------------------------------------

def test_create_task_no_parents_is_ready(kanban_home):
    with kb.connect() as conn:
        tid = kb.create_task(conn, title="ship it", assignee="alice")
        t = kb.get_task(conn, tid)
    assert t is not None
    assert t.status == "ready"
    assert t.assignee == "alice"
    assert t.workspace_kind == "scratch"


def test_create_task_with_parent_is_todo_until_parent_done(kanban_home):
    with kb.connect() as conn:
        p = kb.create_task(conn, title="parent")
        c = kb.create_task(conn, title="child", parents=[p])
        assert kb.get_task(conn, c).status == "todo"
        kb.complete_task(conn, p, result="ok")
        assert kb.get_task(conn, c).status == "ready"


def test_create_task_unknown_parent_errors(kanban_home):
    with kb.connect() as conn, pytest.raises(ValueError, match="unknown parent"):
        kb.create_task(conn, title="orphan", parents=["t_ghost"])


def test_workspace_kind_validation(kanban_home):
    with kb.connect() as conn, pytest.raises(ValueError, match="workspace_kind"):
        kb.create_task(conn, title="bad ws", workspace_kind="cloud")


# ---------------------------------------------------------------------------
# Links + dependency resolution
# ---------------------------------------------------------------------------

def test_link_demotes_ready_child_to_todo_when_parent_not_done(kanban_home):
    with kb.connect() as conn:
        a = kb.create_task(conn, title="a")
        b = kb.create_task(conn, title="b")
        assert kb.get_task(conn, b).status == "ready"
        kb.link_tasks(conn, a, b)
        assert kb.get_task(conn, b).status == "todo"


def test_link_keeps_ready_child_when_parent_already_done(kanban_home):
    with kb.connect() as conn:
        a = kb.create_task(conn, title="a")
        kb.complete_task(conn, a)
        b = kb.create_task(conn, title="b")
        assert kb.get_task(conn, b).status == "ready"
        kb.link_tasks(conn, a, b)
        assert kb.get_task(conn, b).status == "ready"


def test_link_rejects_self_loop(kanban_home):
    with kb.connect() as conn:
        a = kb.create_task(conn, title="a")
        with pytest.raises(ValueError, match="itself"):
            kb.link_tasks(conn, a, a)


def test_link_detects_cycle(kanban_home):
    with kb.connect() as conn:
        a = kb.create_task(conn, title="a")
        b = kb.create_task(conn, title="b", parents=[a])
        c = kb.create_task(conn, title="c", parents=[b])
        with pytest.raises(ValueError, match="cycle"):
            kb.link_tasks(conn, c, a)
        with pytest.raises(ValueError, match="cycle"):
            kb.link_tasks(conn, b, a)


def test_recompute_ready_cascades_through_chain(kanban_home):
    with kb.connect() as conn:
        a = kb.create_task(conn, title="a")
        b = kb.create_task(conn, title="b", parents=[a])
        c = kb.create_task(conn, title="c", parents=[b])
        assert [kb.get_task(conn, x).status for x in (a, b, c)] == \
            ["ready", "todo", "todo"]
        kb.complete_task(conn, a)
        assert kb.get_task(conn, b).status == "ready"
        kb.complete_task(conn, b)
        assert kb.get_task(conn, c).status == "ready"


def test_recompute_ready_fan_in_waits_for_all_parents(kanban_home):
    with kb.connect() as conn:
        a = kb.create_task(conn, title="a")
        b = kb.create_task(conn, title="b")
        c = kb.create_task(conn, title="c", parents=[a, b])
        kb.complete_task(conn, a)
        assert kb.get_task(conn, c).status == "todo"
        kb.complete_task(conn, b)
        assert kb.get_task(conn, c).status == "ready"


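The link tests above describe three rules: a task is `ready` only when every parent is `done`, a task cannot depend on itself, and a link that closes a cycle is rejected. The sketch below models those rules in memory. `TaskGraph` and its method names are hypothetical illustrations, not the `kanban_db` implementation, which is SQLite-backed and not shown in this diff.

```python
from collections import defaultdict


class TaskGraph:
    """Hypothetical in-memory model of the parent/child rules tested above."""

    def __init__(self):
        self.status = {}                  # task id -> "todo" | "ready" | "done"
        self.parents = defaultdict(set)   # child id -> parent ids
        self.children = defaultdict(set)  # parent id -> child ids

    def add(self, tid, parents=()):
        for p in parents:
            if p not in self.status:
                raise ValueError(f"unknown parent: {p}")
        self.status[tid] = "todo"
        for p in parents:
            self.parents[tid].add(p)
            self.children[p].add(tid)
        self._recompute(tid)

    def _recompute(self, tid):
        # A finished task never moves back; otherwise ready iff all parents are done.
        if self.status[tid] == "done":
            return
        all_done = all(self.status[p] == "done" for p in self.parents[tid])
        self.status[tid] = "ready" if all_done else "todo"

    def _reaches(self, start, target):
        # Depth-first walk along parent -> child edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.children[node])
        return False

    def link(self, parent, child):
        if parent == child:
            raise ValueError("a task cannot depend on itself")
        # If the child already reaches the parent, the new edge closes a cycle.
        if self._reaches(child, parent):
            raise ValueError("link would create a cycle")
        self.parents[child].add(parent)
        self.children[parent].add(child)
        self._recompute(child)

    def complete(self, tid):
        self.status[tid] = "done"
        for child in self.children[tid]:
            self._recompute(child)
```

Checking reachability from the child to the parent before adding the edge covers both the direct (`b -> a`) and transitive (`c -> a`) cases exercised by `test_link_detects_cycle`.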
# ---------------------------------------------------------------------------
# Atomic claim (CAS)
# ---------------------------------------------------------------------------

def test_claim_once_wins_second_loses(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        first = kb.claim_task(conn, t, claimer="host:1")
        assert first is not None and first.status == "running"
        second = kb.claim_task(conn, t, claimer="host:2")
        assert second is None


def test_claim_fails_on_non_ready(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x")
        # Move to todo by introducing an unsatisfied parent.
        p = kb.create_task(conn, title="p")
        kb.link_tasks(conn, p, t)
        assert kb.get_task(conn, t).status == "todo"
        assert kb.claim_task(conn, t) is None


def test_stale_claim_reclaimed(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        kb.claim_task(conn, t)
        # Rewind claim_expires so it looks stale.
        conn.execute(
            "UPDATE tasks SET claim_expires = ? WHERE id = ?",
            (int(time.time()) - 3600, t),
        )
        reclaimed = kb.release_stale_claims(conn)
        assert reclaimed == 1
        assert kb.get_task(conn, t).status == "ready"


def test_heartbeat_extends_claim(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        claimer = "host:hb"
        kb.claim_task(conn, t, claimer=claimer, ttl_seconds=60)
        original = kb.get_task(conn, t).claim_expires
        # Rewind then heartbeat.
        conn.execute("UPDATE tasks SET claim_expires = ? WHERE id = ?", (0, t))
        ok = kb.heartbeat_claim(conn, t, claimer=claimer, ttl_seconds=3600)
        assert ok
        new = kb.get_task(conn, t).claim_expires
        assert new > int(time.time()) + 3000


def test_concurrent_claims_only_one_wins(kanban_home):
    """Fire N threads claiming the same task; exactly one must win."""
    with kb.connect() as conn:
        t = kb.create_task(conn, title="race", assignee="a")

    def attempt(i):
        with kb.connect() as c:
            return kb.claim_task(c, t, claimer=f"host:{i}")

    n_workers = 8
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as ex:
        results = list(ex.map(attempt, range(n_workers)))
    winners = [r for r in results if r is not None]
    assert len(winners) == 1
    assert winners[0].status == "running"


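The claim tests above describe compare-and-swap semantics: a claim succeeds only while the row is still `ready`, and an expired claim can be released back to `ready`. One common way to get that with SQLite is a conditional `UPDATE` whose `rowcount` reports whether this caller won. The sketch below assumes a simplified four-column `tasks` table; the column names `claim_lock` and `claim_expires` appear in the tests, but the schema, the default TTL, and the boolean return value are assumptions (the real `kb.claim_task` returns a task object or `None`).

```python
import sqlite3
import time


def make_demo_db():
    """Build a throwaway in-memory table; this column set is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        "id TEXT PRIMARY KEY, status TEXT, claim_lock TEXT, claim_expires INTEGER)"
    )
    conn.execute("INSERT INTO tasks VALUES ('t_1', 'ready', NULL, NULL)")
    conn.commit()
    return conn


def claim_task(conn, task_id, claimer, ttl_seconds=900):
    """Flip ready -> running only if the row is still ready (compare-and-swap)."""
    now = int(time.time())
    cur = conn.execute(
        "UPDATE tasks SET status = 'running', claim_lock = ?, claim_expires = ? "
        "WHERE id = ? AND status = 'ready'",
        (claimer, now + ttl_seconds, task_id),
    )
    conn.commit()
    # rowcount is 1 for the winner and 0 for everyone who arrived later.
    return cur.rowcount == 1


def release_stale_claims(conn):
    """Return running tasks whose claim has expired to ready; report how many."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'ready', claim_lock = NULL, claim_expires = NULL "
        "WHERE status = 'running' AND claim_expires < ?",
        (int(time.time()),),
    )
    conn.commit()
    return cur.rowcount
```

Because the status check and the write happen in a single statement, two workers cannot both observe `ready` and both proceed, which is the property `test_concurrent_claims_only_one_wins` exercises.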
# ---------------------------------------------------------------------------
# Complete / block / unblock / archive / assign
# ---------------------------------------------------------------------------

def test_complete_records_result(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x")
        assert kb.complete_task(conn, t, result="done and dusted")
        task = kb.get_task(conn, t)
        assert task.status == "done"
        assert task.result == "done and dusted"
        assert task.completed_at is not None


def test_block_then_unblock(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        kb.claim_task(conn, t)
        assert kb.block_task(conn, t, reason="need input")
        assert kb.get_task(conn, t).status == "blocked"
        assert kb.unblock_task(conn, t)
        assert kb.get_task(conn, t).status == "ready"


def test_assign_refuses_while_running(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        kb.claim_task(conn, t)
        with pytest.raises(RuntimeError, match="currently running"):
            kb.assign_task(conn, t, "b")


def test_assign_reassigns_when_not_running(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        assert kb.assign_task(conn, t, "b")
        assert kb.get_task(conn, t).assignee == "b"


def test_archive_hides_from_default_list(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x")
        kb.complete_task(conn, t)
        assert kb.archive_task(conn, t)
        assert len(kb.list_tasks(conn)) == 0
        assert len(kb.list_tasks(conn, include_archived=True)) == 1


# ---------------------------------------------------------------------------
# Comments / events / worker context
# ---------------------------------------------------------------------------

def test_comments_recorded_in_order(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x")
        kb.add_comment(conn, t, "user", "first")
        kb.add_comment(conn, t, "researcher", "second")
        comments = kb.list_comments(conn, t)
    assert [c.body for c in comments] == ["first", "second"]
    assert [c.author for c in comments] == ["user", "researcher"]


def test_empty_comment_rejected(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x")
        with pytest.raises(ValueError, match="body is required"):
            kb.add_comment(conn, t, "user", "")


def test_events_capture_lifecycle(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="a")
        kb.claim_task(conn, t)
        kb.complete_task(conn, t, result="ok")
        events = kb.list_events(conn, t)
    kinds = [e.kind for e in events]
    assert "created" in kinds
    assert "claimed" in kinds
    assert "completed" in kinds


def test_worker_context_includes_parent_results_and_comments(kanban_home):
    with kb.connect() as conn:
        p = kb.create_task(conn, title="p")
        kb.complete_task(conn, p, result="PARENT_RESULT_MARKER")
        c = kb.create_task(conn, title="child", parents=[p])
        kb.add_comment(conn, c, "user", "CLARIFICATION_MARKER")
        ctx = kb.build_worker_context(conn, c)
    assert "PARENT_RESULT_MARKER" in ctx
    assert "CLARIFICATION_MARKER" in ctx
    assert c in ctx
    assert "child" in ctx


# ---------------------------------------------------------------------------
# Dispatcher
# ---------------------------------------------------------------------------

def test_dispatch_dry_run_does_not_claim(kanban_home):
    with kb.connect() as conn:
        t1 = kb.create_task(conn, title="a", assignee="alice")
        t2 = kb.create_task(conn, title="b", assignee="bob")
        res = kb.dispatch_once(conn, dry_run=True)
    assert {s[0] for s in res.spawned} == {t1, t2}
    with kb.connect() as conn:
        # Dry run must NOT mutate status.
        assert kb.get_task(conn, t1).status == "ready"
        assert kb.get_task(conn, t2).status == "ready"


def test_dispatch_skips_unassigned(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="floater")
        res = kb.dispatch_once(conn, dry_run=True)
    assert t in res.skipped_unassigned
    assert not res.spawned


def test_dispatch_promotes_ready_and_spawns(kanban_home):
    spawns = []

    def fake_spawn(task, workspace):
        spawns.append((task.id, task.assignee, workspace))

    with kb.connect() as conn:
        p = kb.create_task(conn, title="p", assignee="alice")
        c = kb.create_task(conn, title="c", assignee="bob", parents=[p])
        # Finish parent outside dispatch; promotion happens inside.
        kb.complete_task(conn, p)
        res = kb.dispatch_once(conn, spawn_fn=fake_spawn)
    # Spawned c (p was already done when dispatch was called).
    assert len(spawns) == 1
    assert spawns[0][0] == c
    assert spawns[0][1] == "bob"
    # c is now running
    with kb.connect() as conn:
        assert kb.get_task(conn, c).status == "running"


def test_dispatch_spawn_failure_releases_claim(kanban_home):
    def boom(task, workspace):
        raise RuntimeError("spawn failed")

    with kb.connect() as conn:
        t = kb.create_task(conn, title="boom", assignee="alice")
        kb.dispatch_once(conn, spawn_fn=boom)
        # Must return to ready so the next tick can retry.
        assert kb.get_task(conn, t).status == "ready"
        assert kb.get_task(conn, t).claim_lock is None


def test_dispatch_reclaims_stale_before_spawning(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x", assignee="alice")
        kb.claim_task(conn, t)
        conn.execute(
            "UPDATE tasks SET claim_expires = ? WHERE id = ?",
            (int(time.time()) - 1, t),
        )
        res = kb.dispatch_once(conn, dry_run=True)
        assert res.reclaimed == 1


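The dispatcher tests above imply a single pass that skips unassigned tasks, reports without mutating in dry-run mode, claims before spawning, and releases the claim if the spawn callback raises. A minimal in-memory sketch of that control flow follows. The `Task` and `DispatchResult` dataclasses, the one-argument `spawn_fn`, and the `"dispatcher"` lock value are hypothetical simplifications; the real `kb.dispatch_once` works against the database and also passes a workspace to `spawn_fn`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Task:
    id: str
    assignee: Optional[str] = None
    status: str = "ready"
    claim_lock: Optional[str] = None


@dataclass
class DispatchResult:
    spawned: List[tuple] = field(default_factory=list)
    skipped_unassigned: List[str] = field(default_factory=list)


def dispatch_once(
    tasks: List[Task],
    spawn_fn: Optional[Callable[[Task], None]] = None,
    dry_run: bool = False,
) -> DispatchResult:
    """Hypothetical single dispatcher pass over an in-memory task list."""
    result = DispatchResult()
    for task in tasks:
        if task.status != "ready":
            continue
        if not task.assignee:
            result.skipped_unassigned.append(task.id)
            continue
        if dry_run:
            # Report what would be spawned without touching task state.
            result.spawned.append((task.id, task.assignee))
            continue
        # Claim first so no other dispatcher picks the task up mid-spawn.
        task.status, task.claim_lock = "running", "dispatcher"
        try:
            if spawn_fn is not None:
                spawn_fn(task)
        except Exception:
            # Spawn failed: release the claim so the next tick can retry.
            task.status, task.claim_lock = "ready", None
            continue
        result.spawned.append((task.id, task.assignee))
    return result
```

Claiming before spawning and rolling back on failure keeps a crashed spawn from leaving a task stuck in `running`, which is the behaviour `test_dispatch_spawn_failure_releases_claim` checks.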
# ---------------------------------------------------------------------------
# Workspace resolution
# ---------------------------------------------------------------------------

def test_scratch_workspace_created_under_hermes_home(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="x")
        task = kb.get_task(conn, t)
    ws = kb.resolve_workspace(task)
    assert ws.exists()
    assert ws.is_dir()
    assert "kanban" in str(ws)


def test_dir_workspace_honors_given_path(kanban_home, tmp_path):
    target = tmp_path / "my-vault"
    with kb.connect() as conn:
        t = kb.create_task(
            conn, title="biz", workspace_kind="dir", workspace_path=str(target)
        )
        task = kb.get_task(conn, t)
    ws = kb.resolve_workspace(task)
    assert ws == target
    assert ws.exists()


def test_worktree_workspace_returns_intended_path(kanban_home, tmp_path):
    target = str(tmp_path / ".worktrees" / "my-task")
    with kb.connect() as conn:
        t = kb.create_task(
            conn, title="ship", workspace_kind="worktree", workspace_path=target
        )
        task = kb.get_task(conn, t)
    ws = kb.resolve_workspace(task)
    # We do NOT auto-create worktrees; the worker's skill handles that.
    assert str(ws) == target


# ---------------------------------------------------------------------------
# Tenancy
# ---------------------------------------------------------------------------

def test_tenant_column_filters_listings(kanban_home):
    with kb.connect() as conn:
        kb.create_task(conn, title="a1", tenant="biz-a")
        kb.create_task(conn, title="b1", tenant="biz-b")
        kb.create_task(conn, title="shared")  # no tenant
        biz_a = kb.list_tasks(conn, tenant="biz-a")
        biz_b = kb.list_tasks(conn, tenant="biz-b")
    assert [t.title for t in biz_a] == ["a1"]
    assert [t.title for t in biz_b] == ["b1"]


def test_tenant_propagates_to_events(kanban_home):
    with kb.connect() as conn:
        t = kb.create_task(conn, title="tenant-task", tenant="biz-a")
        events = kb.list_events(conn, t)
    # The "created" event should have tenant in its payload.
    created = [e for e in events if e.kind == "created"]
    assert created and created[0].payload.get("tenant") == "biz-a"
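The tenancy tests above show that passing `tenant=` narrows a listing to exact matches and excludes tasks with no tenant. One plausible way to implement that is an optional `WHERE` clause with a bound parameter, sketched below. The two-column table and the title-only return value are assumptions made for illustration; the real `kb.list_tasks` returns task objects.

```python
import sqlite3


def list_task_titles(conn, tenant=None):
    """Hypothetical listing helper: filter by tenant only when one is given."""
    sql = "SELECT title FROM tasks"
    params = ()
    if tenant is not None:
        # Bound parameter, never string interpolation, to avoid SQL injection.
        sql += " WHERE tenant = ?"
        params = (tenant,)
    sql += " ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, params)]
```

Rows whose `tenant` is `NULL` never satisfy `tenant = ?`, so untenanted tasks drop out of any filtered listing, matching what the test expects for the `shared` task.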
@@ -149,6 +149,23 @@ class TestCreateProfile:
        assert (profile_dir / ".env").read_text() == "KEY=val"
        assert (profile_dir / "SOUL.md").read_text() == "Be helpful."

    def test_clone_config_copies_source_skills(self, profile_env):
        tmp_path = profile_env
        default_home = tmp_path / ".hermes"
        skill_dir = default_home / "skills" / "custom" / "installed-skill"
        skill_dir.mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text("---\nname: installed-skill\n---\n")

        profile_dir = create_profile("coder", clone_config=True, no_alias=True)

        assert (
            profile_dir
            / "skills"
            / "custom"
            / "installed-skill"
            / "SKILL.md"
        ).read_text() == "---\nname: installed-skill\n---\n"

    def test_clone_all_copies_entire_tree(self, profile_env):
        tmp_path = profile_env
        default_home = tmp_path / ".hermes"

@@ -591,6 +591,222 @@ class TestNewEndpoints:
        resp = self.client.get("/api/cron/jobs/nonexistent-id")
        assert resp.status_code == 404

    # --- Profiles ---

    def test_profiles_list_includes_default(self):
        from hermes_constants import get_hermes_home
        get_hermes_home().mkdir(parents=True, exist_ok=True)

        resp = self.client.get("/api/profiles")
        assert resp.status_code == 200
        names = [p["name"] for p in resp.json()["profiles"]]
        assert "default" in names

    def test_profiles_list_falls_back_when_profile_listing_fails(self, monkeypatch):
        from hermes_constants import get_hermes_home
        import hermes_cli.profiles as profiles_mod

        hermes_home = get_hermes_home()
        hermes_home.mkdir(parents=True, exist_ok=True)
        (hermes_home / "config.yaml").write_text(
            "model:\n provider: openrouter\n name: anthropic/claude-sonnet-4.6\n",
            encoding="utf-8",
        )
        named = hermes_home / "profiles" / "multi-agent"
        named.mkdir(parents=True)
        (named / ".env").write_text("EXAMPLE=1\n", encoding="utf-8")
        (named / "skills" / "demo").mkdir(parents=True)
        (named / "skills" / "demo" / "SKILL.md").write_text("---\nname: demo\n---\n", encoding="utf-8")

        monkeypatch.setattr(
            profiles_mod,
            "list_profiles",
            lambda: (_ for _ in ()).throw(RuntimeError("boom")),
        )

        resp = self.client.get("/api/profiles")

        assert resp.status_code == 200
        profiles = {p["name"]: p for p in resp.json()["profiles"]}
        assert profiles["default"]["is_default"] is True
        assert profiles["default"]["provider"] == "openrouter"
        assert profiles["multi-agent"]["has_env"] is True
        assert profiles["multi-agent"]["skill_count"] == 1

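The fallback test above stubs `list_profiles` with `lambda: (_ for _ in ()).throw(RuntimeError("boom"))`. A lambda body must be an expression, so `raise` cannot appear in it; throwing into an unstarted empty generator is a way to raise from expression context. The helper below isolates the idiom. The name `raising` is hypothetical and is not part of the test suite.

```python
def raising(exc):
    """Return a zero-argument callable that raises `exc` when called."""
    # `(_ for _ in ())` builds an empty generator; .throw() raises `exc`
    # inside it, and since nothing catches it there, it propagates to the caller.
    return lambda: (_ for _ in ()).throw(exc)
```

With pytest, `monkeypatch.setattr(module, "name", raising(RuntimeError("boom")))` would read the same as the inline lambda in the test while keeping the trick in one documented place.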
    def test_profiles_create_rename_delete_round_trip(self, monkeypatch):
        # Stub gateway service teardown so the test doesn't shell out to
        # launchctl/systemctl on the host.
        import hermes_cli.profiles as profiles_mod
        monkeypatch.setattr(profiles_mod, "_cleanup_gateway_service", lambda *a, **kw: None)

        created = self.client.post("/api/profiles", json={"name": "test-prof"})
        assert created.status_code == 200

        renamed = self.client.patch(
            "/api/profiles/test-prof",
            json={"new_name": "test-prof-2"},
        )
        assert renamed.status_code == 200

        names = [p["name"] for p in self.client.get("/api/profiles").json()["profiles"]]
        assert "test-prof" not in names
        assert "test-prof-2" in names

        deleted = self.client.delete("/api/profiles/test-prof-2")
        assert deleted.status_code == 200
        names = [p["name"] for p in self.client.get("/api/profiles").json()["profiles"]]
        assert "test-prof-2" not in names

    def test_profile_setup_command_uses_named_profile_wrapper(self):
        from hermes_constants import get_hermes_home

        (get_hermes_home() / "profiles" / "coder").mkdir(parents=True)

        resp = self.client.get("/api/profiles/coder/setup-command")

        assert resp.status_code == 200
        assert resp.json()["command"] == "coder setup"

    def test_profile_setup_command_uses_hermes_for_default_profile(self):
        from hermes_constants import get_hermes_home

        get_hermes_home().mkdir(parents=True, exist_ok=True)

        resp = self.client.get("/api/profiles/default/setup-command")

        assert resp.status_code == 200
        assert resp.json()["command"] == "hermes setup"

    def test_profiles_create_creates_wrapper_alias_when_safe(self, monkeypatch, tmp_path):
        import hermes_cli.profiles as profiles_mod

        wrapper_dir = tmp_path / "bin"
        wrapper_dir.mkdir()
        monkeypatch.setattr(profiles_mod, "_get_wrapper_dir", lambda: wrapper_dir)

        resp = self.client.post(
            "/api/profiles",
            json={"name": "writer", "clone_from_default": False},
        )

        assert resp.status_code == 200
        wrapper_path = wrapper_dir / "writer"
        assert wrapper_path.exists()
        assert wrapper_path.read_text() == '#!/bin/sh\nexec hermes -p writer "$@"\n'

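The wrapper-alias test above fixes the exact script body: a two-line POSIX `sh` file that `exec`s `hermes -p <profile>` and forwards all arguments. A minimal sketch of a writer for such a file follows. `write_wrapper` is a hypothetical name, and setting the execute bits is an assumption; the test only checks the file's contents, and the real `create_wrapper_script` is not shown in this diff.

```python
import stat
from pathlib import Path


def write_wrapper(wrapper_dir: Path, profile: str) -> Path:
    """Hypothetical helper: write an sh wrapper that forwards to `hermes -p`."""
    path = wrapper_dir / profile
    # `exec` replaces the shell with hermes; "$@" forwards arguments unchanged.
    path.write_text(f'#!/bin/sh\nexec hermes -p {profile} "$@"\n')
    # Assumption: mark the script executable so it can be run directly from PATH.
    path.chmod(path.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path
```

Quoting `"$@"` matters: unquoted `$@` or `$*` would re-split arguments containing spaces before they reach `hermes`.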
    def test_profiles_create_with_clone_from_default_copies_default_skills(self, monkeypatch):
        from hermes_constants import get_hermes_home
        import hermes_cli.profiles as profiles_mod

        monkeypatch.setattr(profiles_mod, "create_wrapper_script", lambda name: None)
        default_skill = get_hermes_home() / "skills" / "custom" / "new-skill"
        default_skill.mkdir(parents=True)
        (default_skill / "SKILL.md").write_text("---\nname: new-skill\n---\n", encoding="utf-8")

        resp = self.client.post(
            "/api/profiles",
            json={"name": "cloned", "clone_from_default": True},
        )

        assert resp.status_code == 200
        cloned_skill = get_hermes_home() / "profiles" / "cloned" / "skills" / "custom" / "new-skill" / "SKILL.md"
        assert cloned_skill.exists()
        profiles = {p["name"]: p for p in self.client.get("/api/profiles").json()["profiles"]}
        assert profiles["cloned"]["skill_count"] == 1

    def test_profiles_create_without_clone_seeds_bundled_skills(self, monkeypatch):
        from hermes_constants import get_hermes_home
        import hermes_cli.profiles as profiles_mod

        monkeypatch.setattr(profiles_mod, "create_wrapper_script", lambda name: None)

        def fake_seed(profile_dir, quiet=False):
            skill_dir = profile_dir / "skills" / "software-development" / "plan"
            skill_dir.mkdir(parents=True)
            (skill_dir / "SKILL.md").write_text("---\nname: plan\n---\n", encoding="utf-8")
            return {"copied": ["plan"]}

        monkeypatch.setattr(profiles_mod, "seed_profile_skills", fake_seed)

        resp = self.client.post(
            "/api/profiles",
            json={"name": "fresh", "clone_from_default": False},
        )

        assert resp.status_code == 200
        seeded_skill = get_hermes_home() / "profiles" / "fresh" / "skills" / "software-development" / "plan" / "SKILL.md"
        assert seeded_skill.exists()
        profiles = {p["name"]: p for p in self.client.get("/api/profiles").json()["profiles"]}
        assert profiles["fresh"]["skill_count"] == 1

    def test_profile_open_terminal_uses_macos_terminal(self, monkeypatch):
        from hermes_constants import get_hermes_home
        import hermes_cli.web_server as web_server

        (get_hermes_home() / "profiles" / "coder").mkdir(parents=True)
        calls = []
        monkeypatch.setattr(web_server.sys, "platform", "darwin")
        monkeypatch.setattr(web_server.subprocess, "Popen", lambda args, **kwargs: calls.append(args))

        resp = self.client.post("/api/profiles/coder/open-terminal")

        assert resp.status_code == 200
        assert calls
        assert calls[0][0] == "osascript"
        assert "coder setup" in " ".join(calls[0])

    def test_profile_open_terminal_uses_windows_cmd(self, monkeypatch):
        from hermes_constants import get_hermes_home
        import hermes_cli.web_server as web_server

        (get_hermes_home() / "profiles" / "coder").mkdir(parents=True)
        calls = []
        monkeypatch.setattr(web_server.sys, "platform", "win32")
        monkeypatch.setattr(web_server.subprocess, "Popen", lambda args, **kwargs: calls.append(args))

        resp = self.client.post("/api/profiles/coder/open-terminal")

        assert resp.status_code == 200
        assert calls
        assert calls[0][:4] == ["cmd.exe", "/c", "start", ""]
        assert calls[0][-1] == "coder setup"

    def test_profiles_create_rejects_invalid_name(self):
        resp = self.client.post("/api/profiles", json={"name": "Has Spaces"})
        assert resp.status_code == 400

    def test_profiles_delete_default_forbidden(self):
        resp = self.client.delete("/api/profiles/default")
        assert resp.status_code == 400

    def test_profiles_delete_not_found(self):
        resp = self.client.delete("/api/profiles/does-not-exist")
        assert resp.status_code == 404

    def test_profile_soul_round_trip(self, monkeypatch):
        import hermes_cli.profiles as profiles_mod
        monkeypatch.setattr(profiles_mod, "_cleanup_gateway_service", lambda *a, **kw: None)

        self.client.post("/api/profiles", json={"name": "soul-prof"})
        get1 = self.client.get("/api/profiles/soul-prof/soul")
        assert get1.status_code == 200
        assert get1.json()["exists"] is True

        put = self.client.put(
            "/api/profiles/soul-prof/soul",
            json={"content": "# Edited soul"},
        )
        assert put.status_code == 200

        got = self.client.get("/api/profiles/soul-prof/soul").json()
        assert got["content"] == "# Edited soul"

        self.client.delete("/api/profiles/soul-prof")

    def test_profile_soul_unknown_profile_404(self):
        resp = self.client.get("/api/profiles/nonexistent/soul")
        assert resp.status_code == 404

    def test_skills_list(self):
        resp = self.client.get("/api/skills")
        assert resp.status_code == 200

@@ -0,0 +1,889 @@
"""Tests for the Kanban dashboard plugin backend (plugins/kanban/dashboard/plugin_api.py).

The plugin mounts as /api/plugins/kanban/ inside the dashboard's FastAPI app,
but here we attach its router to a bare FastAPI instance so we can test the
REST surface without spinning up the whole dashboard.
"""

from __future__ import annotations

import importlib.util
import os
import sys
import time
from pathlib import Path

import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient

from hermes_cli import kanban_db as kb


# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------


def _load_plugin_router():
    """Dynamically load plugins/kanban/dashboard/plugin_api.py and return its router."""
    repo_root = Path(__file__).resolve().parents[2]
    plugin_file = repo_root / "plugins" / "kanban" / "dashboard" / "plugin_api.py"
    assert plugin_file.exists(), f"plugin file missing: {plugin_file}"

    spec = importlib.util.spec_from_file_location(
        "hermes_dashboard_plugin_kanban_test", plugin_file,
    )
    assert spec is not None and spec.loader is not None
    mod = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = mod
    spec.loader.exec_module(mod)
    return mod.router


@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
    """Isolated HERMES_HOME with an empty kanban DB."""
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    kb.init_db()
    return home


@pytest.fixture
def client(kanban_home):
    app = FastAPI()
    app.include_router(_load_plugin_router(), prefix="/api/plugins/kanban")
    return TestClient(app)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# GET /board on an empty DB
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_board_empty(client):
|
||||
r = client.get("/api/plugins/kanban/board")
|
||||
assert r.status_code == 200
|
||||
data = r.json()
|
||||
# All canonical columns present (triage + the rest), each empty.
|
||||
names = [c["name"] for c in data["columns"]]
|
||||
for expected in ("triage", "todo", "ready", "running", "blocked", "done"):
|
||||
assert expected in names, f"missing column {expected}: {names}"
|
||||
assert all(len(c["tasks"]) == 0 for c in data["columns"])
|
||||
assert data["tenants"] == []
|
||||
assert data["assignees"] == []
|
||||
assert data["latest_event_id"] == 0
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# POST /tasks then GET /board sees it
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_create_task_appears_on_board(client):
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={
|
||||
"title": "Research LLM caching",
|
||||
"assignee": "researcher",
|
||||
"priority": 3,
|
||||
"tenant": "acme",
|
||||
},
|
||||
)
|
||||
assert r.status_code == 200, r.text
|
||||
task = r.json()["task"]
|
||||
assert task["title"] == "Research LLM caching"
|
||||
assert task["assignee"] == "researcher"
|
||||
assert task["status"] == "ready" # no parents -> immediately ready
|
||||
assert task["priority"] == 3
|
||||
assert task["tenant"] == "acme"
|
||||
task_id = task["id"]
|
||||
|
||||
# Board now lists it under 'ready'.
|
||||
r = client.get("/api/plugins/kanban/board")
|
||||
assert r.status_code == 200
|
||||
data = r.json()
|
||||
ready = next(c for c in data["columns"] if c["name"] == "ready")
|
||||
assert len(ready["tasks"]) == 1
|
||||
assert ready["tasks"][0]["id"] == task_id
|
||||
assert "acme" in data["tenants"]
|
||||
assert "researcher" in data["assignees"]
|
||||
|
||||
|
||||
def test_tenant_filter(client):
|
||||
client.post("/api/plugins/kanban/tasks", json={"title": "A", "tenant": "t1"})
|
||||
client.post("/api/plugins/kanban/tasks", json={"title": "B", "tenant": "t2"})
|
||||
|
||||
r = client.get("/api/plugins/kanban/board?tenant=t1")
|
||||
counts = {c["name"]: len(c["tasks"]) for c in r.json()["columns"]}
|
||||
total = sum(counts.values())
|
||||
assert total == 1
|
||||
|
||||
r = client.get("/api/plugins/kanban/board?tenant=t2")
|
||||
total = sum(len(c["tasks"]) for c in r.json()["columns"])
|
||||
assert total == 1
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# GET /tasks/:id returns body + comments + events + links
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_task_detail_includes_links_and_events(client):
|
||||
parent = client.post(
|
||||
"/api/plugins/kanban/tasks", json={"title": "parent"},
|
||||
).json()["task"]
|
||||
child = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "child", "parents": [parent["id"]]},
|
||||
).json()["task"]
|
||||
assert child["status"] == "todo" # parent not done yet
|
||||
|
||||
# Detail for the child shows the parent link.
|
||||
r = client.get(f"/api/plugins/kanban/tasks/{child['id']}")
|
||||
assert r.status_code == 200
|
||||
data = r.json()
|
||||
assert data["task"]["id"] == child["id"]
|
||||
assert parent["id"] in data["links"]["parents"]
|
||||
|
||||
# Detail for the parent shows the child.
|
||||
r = client.get(f"/api/plugins/kanban/tasks/{parent['id']}")
|
||||
assert child["id"] in r.json()["links"]["children"]
|
||||
|
||||
# Events exist from creation.
|
||||
assert len(data["events"]) >= 1
|
||||
|
||||
|
||||
def test_task_detail_404_on_unknown(client):
|
||||
r = client.get("/api/plugins/kanban/tasks/does-not-exist")
|
||||
assert r.status_code == 404
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# PATCH /tasks/:id — status transitions
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_patch_status_complete(client):
|
||||
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}",
|
||||
json={"status": "done", "result": "shipped"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["status"] == "done"
|
||||
|
||||
# Board reflects the move.
|
||||
done = next(
|
||||
c for c in client.get("/api/plugins/kanban/board").json()["columns"]
|
||||
if c["name"] == "done"
|
||||
)
|
||||
assert any(x["id"] == t["id"] for x in done["tasks"])
|
||||
|
||||
|
||||
def test_patch_block_then_unblock(client):
|
||||
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}",
|
||||
json={"status": "blocked", "block_reason": "need input"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["status"] == "blocked"
|
||||
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}",
|
||||
json={"status": "ready"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["status"] == "ready"
|
||||
|
||||
|
||||
def test_patch_drag_drop_move_todo_to_ready(client):
|
||||
"""Direct status write: the drag-drop path for statuses without a
|
||||
dedicated verb (e.g. manually promoting todo -> ready)."""
|
||||
parent = client.post("/api/plugins/kanban/tasks", json={"title": "p"}).json()["task"]
|
||||
child = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "c", "parents": [parent["id"]]},
|
||||
).json()["task"]
|
||||
assert child["status"] == "todo"
|
||||
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{child['id']}",
|
||||
json={"status": "ready"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["status"] == "ready"
|
||||
|
||||
|
||||
def test_patch_reassign(client):
|
||||
t = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "x", "assignee": "a"},
|
||||
).json()["task"]
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}",
|
||||
json={"assignee": "b"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["assignee"] == "b"
|
||||
|
||||
|
||||
def test_patch_priority_and_edit(client):
|
||||
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}",
|
||||
json={"priority": 5, "title": "renamed"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
data = r.json()["task"]
|
||||
assert data["priority"] == 5
|
||||
assert data["title"] == "renamed"
|
||||
|
||||
|
||||
def test_patch_invalid_status(client):
|
||||
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}",
|
||||
json={"status": "banana"},
|
||||
)
|
||||
assert r.status_code == 400
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Comments + Links
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_add_comment(client):
|
||||
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
|
||||
r = client.post(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}/comments",
|
||||
json={"body": "how's progress?", "author": "teknium"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
|
||||
r = client.get(f"/api/plugins/kanban/tasks/{t['id']}")
|
||||
comments = r.json()["comments"]
|
||||
assert len(comments) == 1
|
||||
assert comments[0]["body"] == "how's progress?"
|
||||
assert comments[0]["author"] == "teknium"
|
||||
|
||||
|
||||
def test_add_comment_empty_rejected(client):
|
||||
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
|
||||
r = client.post(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}/comments",
|
||||
json={"body": " "},
|
||||
)
|
||||
assert r.status_code == 400
|
||||
|
||||
|
||||
def test_add_link_and_delete_link(client):
|
||||
a = client.post("/api/plugins/kanban/tasks", json={"title": "a"}).json()["task"]
|
||||
b = client.post("/api/plugins/kanban/tasks", json={"title": "b"}).json()["task"]
|
||||
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/links",
|
||||
json={"parent_id": a["id"], "child_id": b["id"]},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
|
||||
r = client.get(f"/api/plugins/kanban/tasks/{b['id']}")
|
||||
assert a["id"] in r.json()["links"]["parents"]
|
||||
|
||||
r = client.delete(
|
||||
"/api/plugins/kanban/links",
|
||||
params={"parent_id": a["id"], "child_id": b["id"]},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["ok"] is True
|
||||
|
||||
|
||||
def test_add_link_cycle_rejected(client):
|
||||
a = client.post("/api/plugins/kanban/tasks", json={"title": "a"}).json()["task"]
|
||||
b = client.post("/api/plugins/kanban/tasks", json={"title": "b"}).json()["task"]
|
||||
client.post(
|
||||
"/api/plugins/kanban/links",
|
||||
json={"parent_id": a["id"], "child_id": b["id"]},
|
||||
)
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/links",
|
||||
json={"parent_id": b["id"], "child_id": a["id"]},
|
||||
)
|
||||
assert r.status_code == 400
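
The cycle test above expects the reverse link (b -> a) to be rejected once a -> b exists. The following is a minimal sketch of one way such a check could work, assuming links are held as a parent-to-children adjacency map. The name `would_create_cycle` and the dict layout are hypothetical illustrations, not the plugin's actual code or the kanban_db schema.

```python
from collections import defaultdict


def would_create_cycle(children, parent_id, child_id):
    """Return True if adding parent_id -> child_id would close a cycle.

    `children` maps a task id to the set of its direct children
    (hypothetical in-memory layout, not the kanban_db schema).
    """
    if parent_id == child_id:
        return True
    # The new edge closes a cycle iff parent_id is already reachable
    # from child_id by following existing parent -> child edges.
    stack, seen = [child_id], set()
    while stack:
        node = stack.pop()
        if node == parent_id:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


links = defaultdict(set)
links["a"].add("b")  # a -> b already exists, as in the test
```

With this layout, asking whether b -> a may be added walks from `a` and reaches `b`, so the link would be refused; a link to an unrelated task is allowed.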
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Dispatch nudge
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_dispatch_dry_run(client):
|
||||
client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "work", "assignee": "researcher"},
|
||||
)
|
||||
r = client.post("/api/plugins/kanban/dispatch?dry_run=true&max=4")
|
||||
assert r.status_code == 200
|
||||
body = r.json()
|
||||
# DispatchResult is serialized as a dataclass dict.
|
||||
assert isinstance(body, dict)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Triage column (new v1 status)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_create_triage_lands_in_triage_column(client):
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "rough idea, spec me", "triage": True},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
task = r.json()["task"]
|
||||
assert task["status"] == "triage"
|
||||
|
||||
r = client.get("/api/plugins/kanban/board")
|
||||
triage = next(c for c in r.json()["columns"] if c["name"] == "triage")
|
||||
assert len(triage["tasks"]) == 1
|
||||
assert triage["tasks"][0]["title"] == "rough idea, spec me"
|
||||
|
||||
|
||||
def test_triage_task_not_promoted_to_ready(client):
|
||||
"""Triage tasks must stay in triage even when they have no parents."""
|
||||
client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "must stay put", "triage": True},
|
||||
)
|
||||
# Run the dispatcher — it should NOT promote the triage task.
|
||||
client.post("/api/plugins/kanban/dispatch?dry_run=false&max=4")
|
||||
r = client.get("/api/plugins/kanban/board")
|
||||
triage = next(c for c in r.json()["columns"] if c["name"] == "triage")
|
||||
ready = next(c for c in r.json()["columns"] if c["name"] == "ready")
|
||||
assert len(triage["tasks"]) == 1
|
||||
assert len(ready["tasks"]) == 0
|
||||
|
||||
|
||||
def test_patch_status_triage_works(client):
|
||||
"""A user (or specifier) can push a task back into triage, and out of it."""
|
||||
t = client.post(
|
||||
"/api/plugins/kanban/tasks", json={"title": "x"},
|
||||
).json()["task"]
|
||||
# Normal creation is 'ready'; push to triage.
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}", json={"status": "triage"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["status"] == "triage"
|
||||
|
||||
# Now promote to todo.
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{t['id']}", json={"status": "todo"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["status"] == "todo"
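
The tests so far pin down three creation outcomes: a triage task lands in `triage`, a task with no parents is immediately `ready`, and a task whose parent is not done starts as `todo`. A small sketch of that rule follows. The function name `initial_status` is hypothetical, and the case where every parent is already done is an assumption (the tests above do not cover it).

```python
def initial_status(parent_statuses, triage=False):
    """Pick the status a newly created task would start in.

    `parent_statuses` is a list of the parents' current statuses.
    """
    if triage:
        # Triage tasks stay put even when they have no parents.
        return "triage"
    if all(status == "done" for status in parent_statuses):
        # Covers the no-parents case; "all parents done" is an assumption.
        return "ready"
    return "todo"
```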
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Progress rollup (done children / total children)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_board_progress_rollup(client):
    parent = client.post(
        "/api/plugins/kanban/tasks", json={"title": "parent"},
    ).json()["task"]
    child_a = client.post(
        "/api/plugins/kanban/tasks",
        json={"title": "a", "parents": [parent["id"]]},
    ).json()["task"]
    child_b = client.post(
        "/api/plugins/kanban/tasks",
        json={"title": "b", "parents": [parent["id"]]},
    ).json()["task"]
    # Children start as "todo" because the parent isn't done yet; promote
    # them to "ready" so complete_task will accept the transition.
    for cid in (child_a["id"], child_b["id"]):
        r = client.patch(
            f"/api/plugins/kanban/tasks/{cid}", json={"status": "ready"},
        )
        assert r.status_code == 200

    # 0/2 done.
    r = client.get("/api/plugins/kanban/board")
    parent_row = next(
        t for col in r.json()["columns"] for t in col["tasks"]
        if t["id"] == parent["id"]
    )
    assert parent_row["progress"] == {"done": 0, "total": 2}

    # Complete one child. 1/2.
    r = client.patch(
        f"/api/plugins/kanban/tasks/{child_a['id']}",
        json={"status": "done"},
    )
    assert r.status_code == 200
    r = client.get("/api/plugins/kanban/board")
    parent_row = next(
        t for col in r.json()["columns"] for t in col["tasks"]
        if t["id"] == parent["id"]
    )
    assert parent_row["progress"] == {"done": 1, "total": 2}

    # Childless tasks report progress=None, not {0/0}.
    assert next(
        t for col in r.json()["columns"] for t in col["tasks"]
        if t["id"] == child_b["id"]
    )["progress"] is None
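
The rollup asserted above (done children over total children, with `None` for a childless task) can be sketched as a pure function. `progress_rollup` is a hypothetical name, and taking a plain list of child statuses is an assumption made for illustration; the plugin presumably derives this from the database.

```python
def progress_rollup(child_statuses):
    """Summarise child completion as {"done": n, "total": m}.

    Returns None for a task with no children, matching the
    `progress is None` expectation in the test above.
    """
    if not child_statuses:
        return None
    done = sum(1 for status in child_statuses if status == "done")
    return {"done": done, "total": len(child_statuses)}
```

Returning `None` instead of `{"done": 0, "total": 0}` lets a UI distinguish "no subtasks" from "no subtasks finished yet".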
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Auto-init on first board read
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_board_auto_initializes_missing_db(tmp_path, monkeypatch):
|
||||
"""If kanban.db doesn't exist yet, GET /board must create it, not 500."""
|
||||
home = tmp_path / ".hermes"
|
||||
home.mkdir()
|
||||
monkeypatch.setenv("HERMES_HOME", str(home))
|
||||
monkeypatch.setattr(Path, "home", lambda: tmp_path)
|
||||
# Deliberately DO NOT call kb.init_db().
|
||||
|
||||
app = FastAPI()
|
||||
app.include_router(_load_plugin_router(), prefix="/api/plugins/kanban")
|
||||
c = TestClient(app)
|
||||
r = c.get("/api/plugins/kanban/board")
|
||||
assert r.status_code == 200
|
||||
assert (home / "kanban.db").exists(), "init_db wasn't invoked by /board"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# WebSocket auth (query-param token)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_ws_events_rejects_when_token_required(tmp_path, monkeypatch):
    """When _SESSION_TOKEN is set (normal dashboard context), a missing or
    wrong ?token= query param must be rejected with policy-violation."""
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    kb.init_db()

    # Stub web_server so _check_ws_token has a token to compare against.
    import types
    stub = types.SimpleNamespace(_SESSION_TOKEN="secret-xyz")
    monkeypatch.setitem(sys.modules, "hermes_cli.web_server", stub)

    app = FastAPI()
    app.include_router(_load_plugin_router(), prefix="/api/plugins/kanban")
    c = TestClient(app)

    # No token → policy violation close.
    from starlette.websockets import WebSocketDisconnect
    with pytest.raises(WebSocketDisconnect) as exc:
        with c.websocket_connect("/api/plugins/kanban/events"):
            pass
    assert exc.value.code == 1008

    # Wrong token → policy violation close.
    with pytest.raises(WebSocketDisconnect) as exc:
        with c.websocket_connect("/api/plugins/kanban/events?token=nope"):
            pass
    assert exc.value.code == 1008

    # Correct token → accepted (connect then close cleanly from our side).
    with c.websocket_connect(
        "/api/plugins/kanban/events?token=secret-xyz"
    ) as ws:
        assert ws is not None  # handshake succeeded
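
The token comparison the WebSocket test exercises can be sketched as follows. This is not the plugin's `_check_ws_token`; `ws_token_ok` is a hypothetical helper, and allowing connections when no session token is configured is an assumption read from the test's docstring. The sketch uses `hmac.compare_digest` so the comparison takes constant time.

```python
import hmac

# 1008 is the WebSocket "policy violation" close code asserted in the test.
POLICY_VIOLATION = 1008


def ws_token_ok(expected, supplied):
    """Return True if the supplied ?token= value matches the session token."""
    if not expected:
        # Assumption: with no session token configured there is nothing to check.
        return True
    if not supplied:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

A handler built on this would close the socket with `POLICY_VIOLATION` whenever the helper returns `False`.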
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Bulk actions
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_bulk_status_ready(client):
    a = client.post("/api/plugins/kanban/tasks", json={"title": "a"}).json()["task"]
    b = client.post("/api/plugins/kanban/tasks", json={"title": "b"}).json()["task"]
    c2 = client.post("/api/plugins/kanban/tasks", json={"title": "c"}).json()["task"]
    # Parent-less tasks land in "ready" already; push them to blocked first.
    for tid in (a["id"], b["id"], c2["id"]):
        client.patch(f"/api/plugins/kanban/tasks/{tid}",
                     json={"status": "blocked", "block_reason": "wait"})

    r = client.post("/api/plugins/kanban/tasks/bulk",
                    json={"ids": [a["id"], b["id"], c2["id"]], "status": "ready"})
    assert r.status_code == 200
    results = r.json()["results"]
    # Use a distinct loop name so the response `r` is not shadowed.
    assert all(res["ok"] for res in results)
    # All three are now ready.
    board = client.get("/api/plugins/kanban/board").json()
    ready = next(col for col in board["columns"] if col["name"] == "ready")
    ids = {t["id"] for t in ready["tasks"]}
    assert {a["id"], b["id"], c2["id"]}.issubset(ids)
|
||||
|
||||
|
||||
def test_bulk_archive(client):
    a = client.post("/api/plugins/kanban/tasks", json={"title": "a"}).json()["task"]
    b = client.post("/api/plugins/kanban/tasks", json={"title": "b"}).json()["task"]
    r = client.post("/api/plugins/kanban/tasks/bulk",
                    json={"ids": [a["id"], b["id"]], "archive": True})
    assert r.status_code == 200
    # Use a distinct loop name so the response `r` is not shadowed.
    assert all(res["ok"] for res in r.json()["results"])
    # Default board (archived hidden): both gone.
    board = client.get("/api/plugins/kanban/board").json()
    ids = {t["id"] for col in board["columns"] for t in col["tasks"]}
    assert a["id"] not in ids
    assert b["id"] not in ids
|
||||
|
||||
|
||||
def test_bulk_reassign(client):
|
||||
a = client.post("/api/plugins/kanban/tasks",
|
||||
json={"title": "a", "assignee": "old"}).json()["task"]
|
||||
b = client.post("/api/plugins/kanban/tasks",
|
||||
json={"title": "b", "assignee": "old"}).json()["task"]
|
||||
r = client.post("/api/plugins/kanban/tasks/bulk",
|
||||
json={"ids": [a["id"], b["id"]], "assignee": "new"})
|
||||
assert r.status_code == 200
|
||||
for tid in (a["id"], b["id"]):
|
||||
t = client.get(f"/api/plugins/kanban/tasks/{tid}").json()["task"]
|
||||
assert t["assignee"] == "new"
|
||||
|
||||
|
||||
def test_bulk_unassign_via_empty_string(client):
|
||||
a = client.post("/api/plugins/kanban/tasks",
|
||||
json={"title": "a", "assignee": "x"}).json()["task"]
|
||||
r = client.post("/api/plugins/kanban/tasks/bulk",
|
||||
json={"ids": [a["id"]], "assignee": ""})
|
||||
assert r.status_code == 200
|
||||
t = client.get(f"/api/plugins/kanban/tasks/{a['id']}").json()["task"]
|
||||
assert t["assignee"] is None
|
||||
|
||||
|
||||
def test_bulk_partial_failure_doesnt_abort_siblings(client):
    """One bad id in the middle of a batch must not prevent the others
    from being applied."""
    a = client.post("/api/plugins/kanban/tasks", json={"title": "a"}).json()["task"]
    c2 = client.post("/api/plugins/kanban/tasks", json={"title": "c"}).json()["task"]
    r = client.post("/api/plugins/kanban/tasks/bulk",
                    json={"ids": [a["id"], "bogus-id", c2["id"]], "priority": 7})
    assert r.status_code == 200
    results = r.json()["results"]
    assert len(results) == 3
    ok_ids = {res["id"] for res in results if res["ok"]}
    assert a["id"] in ok_ids
    assert c2["id"] in ok_ids
    assert any(not res["ok"] and res["id"] == "bogus-id" for res in results)
    # Good siblings actually got the priority bump.
    for tid in (a["id"], c2["id"]):
        t = client.get(f"/api/plugins/kanban/tasks/{tid}").json()["task"]
        assert t["priority"] == 7
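
The partial-failure behaviour tested above (one bad id gets `ok: False` while its siblings still apply) can be sketched as a loop that records a result for each id. `apply_bulk`, `bump_priority`, and the `error` field are hypothetical; the tests only confirm the `id` and `ok` keys of each result.

```python
def apply_bulk(ids, apply_one):
    """Apply `apply_one` to each id, recording per-id success or failure."""
    results = []
    for task_id in ids:
        try:
            apply_one(task_id)
            results.append({"id": task_id, "ok": True})
        except Exception as exc:
            # One bad id must not abort the rest of the batch.
            results.append({"id": task_id, "ok": False, "error": str(exc)})
    return results


# Toy in-memory store standing in for the task table.
known = {"t1": 0, "t3": 0}


def bump_priority(task_id):
    if task_id not in known:
        raise KeyError(task_id)
    known[task_id] = 7


results = apply_bulk(["t1", "bogus-id", "t3"], bump_priority)
```

Catching the exception per item, not around the whole loop, is what keeps the later ids from being skipped.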
|
||||
|
||||
|
||||
def test_bulk_empty_ids_400(client):
|
||||
r = client.post("/api/plugins/kanban/tasks/bulk", json={"ids": []})
|
||||
assert r.status_code == 400
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# /config endpoint
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_config_returns_defaults_when_section_missing(client):
|
||||
r = client.get("/api/plugins/kanban/config")
|
||||
assert r.status_code == 200
|
||||
data = r.json()
|
||||
# Defaults when dashboard.kanban is missing.
|
||||
assert data["default_tenant"] == ""
|
||||
assert data["lane_by_profile"] is True
|
||||
assert data["include_archived_by_default"] is False
|
||||
assert data["render_markdown"] is True
|
||||
|
||||
|
||||
def test_config_reads_dashboard_kanban_section(tmp_path, monkeypatch, client):
|
||||
home = Path(os.environ["HERMES_HOME"])
|
||||
(home / "config.yaml").write_text(
|
||||
"dashboard:\n"
|
||||
" kanban:\n"
|
||||
" default_tenant: acme\n"
|
||||
" lane_by_profile: false\n"
|
||||
" include_archived_by_default: true\n"
|
||||
" render_markdown: false\n"
|
||||
)
|
||||
r = client.get("/api/plugins/kanban/config")
|
||||
assert r.status_code == 200
|
||||
data = r.json()
|
||||
assert data["default_tenant"] == "acme"
|
||||
assert data["lane_by_profile"] is False
|
||||
assert data["include_archived_by_default"] is True
|
||||
assert data["render_markdown"] is False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Runs surfacing (vulcan-artivus RFC feedback)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def test_task_detail_includes_runs(client):
    """GET /tasks/:id carries a runs[] array with the attempt history."""
    r = client.post("/api/plugins/kanban/tasks",
                    json={"title": "port x", "assignee": "worker"}).json()
    tid = r["task"]["id"]

    # Force a run to be created. PATCHing status to "running" does not
    # call claim_task (the PATCH path uses _set_status_direct), so claim
    # and complete the task through the kanban_db kernel instead.
    import hermes_cli.kanban_db as _kb
    conn = _kb.connect()
    try:
        _kb.claim_task(conn, tid)
        _kb.complete_task(
            conn, tid,
            result="done",
            summary="tested on rate limiter",
            metadata={"changed_files": ["limiter.py"]},
        )
    finally:
        conn.close()

    d = client.get(f"/api/plugins/kanban/tasks/{tid}").json()
    assert "runs" in d
    assert len(d["runs"]) == 1
    run = d["runs"][0]
    assert run["outcome"] == "completed"
    assert run["profile"] == "worker"
    assert run["summary"] == "tested on rate limiter"
    assert run["metadata"] == {"changed_files": ["limiter.py"]}
    assert run["ended_at"] is not None
|
||||
|
||||
|
||||
def test_task_detail_runs_empty_before_claim(client):
|
||||
"""A task that's never been claimed has an empty runs[] list, not
|
||||
a missing key."""
|
||||
r = client.post("/api/plugins/kanban/tasks", json={"title": "fresh"}).json()
|
||||
d = client.get(f"/api/plugins/kanban/tasks/{r['task']['id']}").json()
|
||||
assert d["runs"] == []
|
||||
|
||||
|
||||
def test_patch_status_done_with_summary_and_metadata(client):
|
||||
"""PATCH /tasks/:id with status=done + summary + metadata must
|
||||
reach complete_task, so the dashboard has CLI parity."""
|
||||
# Create + claim.
|
||||
r = client.post("/api/plugins/kanban/tasks", json={"title": "x", "assignee": "worker"})
|
||||
tid = r.json()["task"]["id"]
|
||||
from hermes_cli import kanban_db as kb
|
||||
conn = kb.connect()
|
||||
try:
|
||||
kb.claim_task(conn, tid)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{tid}",
|
||||
json={
|
||||
"status": "done",
|
||||
"summary": "shipped the thing",
|
||||
"metadata": {"changed_files": ["a.py", "b.py"], "tests_run": 7},
|
||||
},
|
||||
)
|
||||
assert r.status_code == 200, r.text
|
||||
|
||||
# The run must have the summary + metadata attached.
|
||||
conn = kb.connect()
|
||||
try:
|
||||
run = kb.latest_run(conn, tid)
|
||||
assert run.outcome == "completed"
|
||||
assert run.summary == "shipped the thing"
|
||||
assert run.metadata == {"changed_files": ["a.py", "b.py"], "tests_run": 7}
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
def test_patch_status_done_without_summary_still_works(client):
|
||||
"""Back-compat: PATCH without the new fields still completes."""
|
||||
r = client.post("/api/plugins/kanban/tasks", json={"title": "y", "assignee": "worker"})
|
||||
tid = r.json()["task"]["id"]
|
||||
from hermes_cli import kanban_db as kb
|
||||
conn = kb.connect()
|
||||
try:
|
||||
kb.claim_task(conn, tid)
|
||||
finally:
|
||||
conn.close()
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{tid}",
|
||||
json={"status": "done", "result": "legacy shape"},
|
||||
)
|
||||
assert r.status_code == 200, r.text
|
||||
conn = kb.connect()
|
||||
try:
|
||||
run = kb.latest_run(conn, tid)
|
||||
assert run.outcome == "completed"
|
||||
assert run.summary == "legacy shape" # falls back to result
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
def test_patch_status_archive_closes_running_run(client):
|
||||
"""PATCH to archived while running must close the in-flight run."""
|
||||
r = client.post("/api/plugins/kanban/tasks", json={"title": "z", "assignee": "worker"})
|
||||
tid = r.json()["task"]["id"]
|
||||
from hermes_cli import kanban_db as kb
|
||||
conn = kb.connect()
|
||||
try:
|
||||
kb.claim_task(conn, tid)
|
||||
open_run = kb.latest_run(conn, tid)
|
||||
assert open_run.ended_at is None
|
||||
finally:
|
||||
conn.close()
|
||||
r = client.patch(
|
||||
f"/api/plugins/kanban/tasks/{tid}",
|
||||
json={"status": "archived"},
|
||||
)
|
||||
assert r.status_code == 200, r.text
|
||||
conn = kb.connect()
|
||||
try:
|
||||
task = kb.get_task(conn, tid)
|
||||
assert task.status == "archived"
|
||||
assert task.current_run_id is None
|
||||
assert kb.latest_run(conn, tid).outcome == "reclaimed"
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
def test_event_dict_includes_run_id(client):
|
||||
"""GET /tasks/:id returns events with run_id populated."""
|
||||
r = client.post("/api/plugins/kanban/tasks", json={"title": "e", "assignee": "worker"})
|
||||
tid = r.json()["task"]["id"]
|
||||
from hermes_cli import kanban_db as kb
|
||||
conn = kb.connect()
|
||||
try:
|
||||
kb.claim_task(conn, tid)
|
||||
run_id = kb.latest_run(conn, tid).id
|
||||
kb.complete_task(conn, tid, summary="wss")
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
r = client.get(f"/api/plugins/kanban/tasks/{tid}")
|
||||
assert r.status_code == 200
|
||||
events = r.json()["events"]
|
||||
# Every event in the response must have a run_id key (None or int).
|
||||
for e in events:
|
||||
assert "run_id" in e, f"missing run_id in event: {e}"
|
||||
# completed event must have the actual run_id.
|
||||
comp = [e for e in events if e["kind"] == "completed"]
|
||||
assert comp[0]["run_id"] == run_id
|
||||
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Per-task force-loaded skills via REST
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def test_create_task_with_skills_roundtrips(client):
|
||||
"""POST /tasks accepts `skills: [...]`, GET /tasks/:id returns it."""
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={
|
||||
"title": "translate docs",
|
||||
"assignee": "linguist",
|
||||
"skills": ["translation", "github-code-review"],
|
||||
},
|
||||
)
|
||||
assert r.status_code == 200, r.text
|
||||
task = r.json()["task"]
|
||||
assert task["skills"] == ["translation", "github-code-review"]
|
||||
|
||||
# Fetch via GET /tasks/:id as the drawer does.
|
||||
got = client.get(f"/api/plugins/kanban/tasks/{task['id']}").json()
|
||||
assert got["task"]["skills"] == ["translation", "github-code-review"]
|
||||
|
||||
|
||||
def test_create_task_without_skills_defaults_to_empty_list(client):
|
||||
"""POST /tasks without `skills` returns `skills` as None or []; the
|
||||
drawer's truthiness guard handles both, so either value is accepted."""
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "no skills", "assignee": "x"},
|
||||
)
|
||||
assert r.status_code == 200, r.text
|
||||
task = r.json()["task"]
|
||||
# Task.skills is None in-memory; _task_dict serializes via
|
||||
# dataclasses.asdict which keeps it None. The drawer's
|
||||
# `t.skills && t.skills.length > 0` guard handles both null and [].
|
||||
assert task.get("skills") in (None, [])
|
||||
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Dispatcher-presence warning in POST /tasks response
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def test_create_task_includes_warning_when_no_dispatcher(client, monkeypatch):
|
||||
"""ready+assigned task + no gateway -> response has `warning` field
|
||||
so the dashboard UI can surface a banner."""
|
||||
# Force the dispatcher probe to report "not running".
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.kanban._check_dispatcher_presence",
|
||||
lambda: (False, "No gateway is running — start `hermes gateway start`."),
|
||||
)
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "warn-me", "assignee": "worker"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
data = r.json()
|
||||
assert data.get("warning")
|
||||
assert "gateway" in data["warning"].lower()
|
||||
|
||||
|
||||
def test_create_task_no_warning_when_dispatcher_up(client, monkeypatch):
|
||||
"""Dispatcher running -> no `warning` field in the response."""
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.kanban._check_dispatcher_presence",
|
||||
lambda: (True, ""),
|
||||
)
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "silent", "assignee": "worker"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert not r.json().get("warning")
|
||||
|
||||
|
||||
def test_create_task_no_warning_on_triage(client, monkeypatch):
|
||||
"""Triage tasks never get the warning (they can't be dispatched
|
||||
anyway until promoted)."""
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.kanban._check_dispatcher_presence",
|
||||
lambda: (False, "oh no"),
|
||||
)
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "triage-task", "assignee": "worker", "triage": True},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert not r.json().get("warning")
|
||||
|
||||
|
||||
def test_create_task_probe_error_does_not_break_create(client, monkeypatch):
|
||||
"""Probe failure must never break task creation."""
|
||||
def _raise():
|
||||
raise RuntimeError("probe crashed")
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.kanban._check_dispatcher_presence", _raise,
|
||||
)
|
||||
r = client.post(
|
||||
"/api/plugins/kanban/tasks",
|
||||
json={"title": "resilient", "assignee": "worker"},
|
||||
)
|
||||
assert r.status_code == 200
|
||||
assert r.json()["task"]["title"] == "resilient"
|
||||
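The dispatcher-presence tests above pin down four behaviours: warn when the probe reports no gateway, stay silent when it reports one, never warn on triage tasks, and never let a crashing probe break creation. A minimal sketch of that decision follows. The helper name `dispatcher_warning` and its parameters are hypothetical and are not taken from `hermes_cli.kanban`; only the `(running, message)` probe shape comes from the tests.

```python
from typing import Callable, Optional, Tuple

# Shape of the probe the tests monkeypatch: returns (running, message).
Probe = Callable[[], Tuple[bool, str]]


def dispatcher_warning(
    probe: Probe, *, triage: bool, assignee: Optional[str]
) -> Optional[str]:
    """Return a warning for the response, or None when no banner is needed.

    Hypothetical helper for illustration only.
    """
    # Triage or unassigned tasks cannot be dispatched yet, so never warn.
    if triage or not assignee:
        return None
    try:
        running, message = probe()
    except Exception:
        # A failing probe must never break task creation.
        return None
    if running:
        return None
    return message or "No dispatcher detected."
```

Keeping the probe call inside the `try` is the design point: the warning is advisory, so any probe error degrades to "no warning" rather than a failed request.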
@@ -127,3 +127,66 @@ def test_background_review_installs_auto_deny_approval_callback(monkeypatch):
|
||||
"Background review leaked its approval callback into the worker "
|
||||
"thread's TLS slot; a recycled thread-id could reuse it."
|
||||
)
|
||||
|
||||
|
||||
def test_background_review_summary_is_attributed_to_self_improvement_loop(monkeypatch):
|
||||
"""The CLI/gateway emission must identify the self-improvement loop.
|
||||
|
||||
Users who miss the line in their terminal have no way to tell that the
|
||||
background review was what modified their skill/memory stores. The
|
||||
summary prefix ``💾 Self-improvement review: …`` makes the origin
|
||||
explicit so both the CLI and gateway deliveries are unambiguous.
|
||||
"""
|
||||
import json
|
||||
|
||||
captured_prints: list = []
|
||||
captured_bg_callback: list = []
|
||||
|
||||
class FakeReviewAgent:
|
||||
def __init__(self, **kwargs):
|
||||
# Simulate a review that successfully updated memory so
|
||||
# _summarize_background_review_actions returns a real action.
|
||||
self._session_messages = [
|
||||
{
|
||||
"role": "tool",
|
||||
"tool_call_id": "call_bg",
|
||||
"content": json.dumps(
|
||||
{"success": True, "message": "Entry added", "target": "memory"}
|
||||
),
|
||||
}
|
||||
]
|
||||
|
||||
def run_conversation(self, **kwargs):
|
||||
pass
|
||||
|
||||
def shutdown_memory_provider(self):
|
||||
pass
|
||||
|
||||
def close(self):
|
||||
pass
|
||||
|
||||
monkeypatch.setattr(run_agent_module, "AIAgent", FakeReviewAgent)
|
||||
monkeypatch.setattr(run_agent_module.threading, "Thread", ImmediateThread)
|
||||
|
||||
agent = _bare_agent()
|
||||
agent._safe_print = lambda *a, **kw: captured_prints.append(" ".join(str(x) for x in a))
|
||||
agent.background_review_callback = lambda msg: captured_bg_callback.append(msg)
|
||||
|
||||
AIAgent._spawn_background_review(
|
||||
agent,
|
||||
messages_snapshot=[{"role": "user", "content": "hi"}],
|
||||
review_memory=True,
|
||||
)
|
||||
|
||||
# Exactly one summary should have been emitted, and it must identify
|
||||
# the self-improvement review explicitly.
|
||||
assert len(captured_prints) == 1, captured_prints
|
||||
printed = captured_prints[0]
|
||||
assert "Self-improvement review" in printed, printed
|
||||
assert "Memory updated" in printed, printed
|
||||
|
||||
# Gateway path gets the same prefix.
|
||||
assert len(captured_bg_callback) == 1
|
||||
assert captured_bg_callback[0].startswith("💾 Self-improvement review:"), (
|
||||
captured_bg_callback[0]
|
||||
)
|
||||
|
||||
@@ -23,6 +23,8 @@ Refs #15250 / #15353.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from types import SimpleNamespace
|
||||
|
||||
import pytest
|
||||
|
||||
from run_agent import AIAgent
|
||||
@@ -33,9 +35,36 @@ def _make_agent(provider: str = "", model: str = "", base_url: str = "") -> AIAg
|
||||
agent.provider = provider
|
||||
agent.model = model
|
||||
agent.base_url = base_url
|
||||
agent.verbose_logging = False
|
||||
agent.reasoning_callback = None
|
||||
agent.stream_delta_callback = None
|
||||
agent._stream_callback = None
|
||||
return agent
|
||||
|
||||
|
||||
_ATTR_ABSENT = object()
|
||||
_EXPECT_NOT_PRESENT = object()
|
||||
|
||||
|
||||
def _sdk_tool_call(call_id: str = "c1", name: str = "terminal", arguments: str = "{}"):
|
||||
"""Minimal SDK-shaped tool_call object that satisfies the builder's iteration."""
|
||||
return SimpleNamespace(
|
||||
id=call_id,
|
||||
call_id=call_id,
|
||||
type="function",
|
||||
function=SimpleNamespace(name=name, arguments=arguments),
|
||||
extra_content=None,
|
||||
)
|
||||
|
||||
|
||||
def _build_sdk_message(reasoning_content=_ATTR_ABSENT, **extra):
|
||||
"""SDK-shaped assistant message; ``reasoning_content`` defaults to absent."""
|
||||
kwargs = {"content": "", **extra}
|
||||
if reasoning_content is not _ATTR_ABSENT:
|
||||
kwargs["reasoning_content"] = reasoning_content
|
||||
return SimpleNamespace(**kwargs)
|
||||
|
||||
|
||||
class TestNeedsDeepSeekToolReasoning:
|
||||
"""_needs_deepseek_tool_reasoning() recognises all three detection signals."""
|
||||
|
||||
@@ -109,16 +138,7 @@ class TestCopyReasoningContentForApi:
|
||||
assert api_msg["reasoning_content"] == "<think>real chain of thought</think>"
|
||||
|
||||
def test_deepseek_reasoning_field_promoted(self) -> None:
|
||||
-"""When only 'reasoning' is set (no tool_calls), it gets promoted to reasoning_content.
|
||||
|
||||
-On DeepSeek/Kimi, tool-call turns with 'reasoning' but no
|
||||
-'reasoning_content' are treated as cross-provider poisoned history
|
||||
-(#15748) and padded with "" instead of promoted. Same-provider
|
||||
-DeepSeek tool-call turns always have reasoning_content pinned at
|
||||
-creation time by _build_assistant_message, so the (reasoning-set,
|
||||
-reasoning_content-absent, tool_calls-present) shape is unreachable
|
||||
-from same-provider history.
|
||||
-"""
|
||||
+"""When only 'reasoning' is set, it gets promoted to reasoning_content."""
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-flash")
|
||||
source = {
|
||||
"role": "assistant",
|
||||
@@ -135,8 +155,8 @@ class TestCopyReasoningContentForApi:
|
||||
|
||||
If the source turn has tool_calls AND a 'reasoning' field but NO
|
||||
'reasoning_content' key, it's from a prior provider (the DeepSeek
|
||||
-build path always pins reasoning_content="" at creation). Inject
|
||||
-"" instead of forwarding the prior provider's chain of thought.
|
||||
+build path pins reasoning_content at creation). Inject "" instead
|
||||
+of forwarding the prior provider's chain of thought.
|
||||
"""
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-flash")
|
||||
source = {
|
||||
@@ -228,6 +248,172 @@ class TestCopyReasoningContentForApi:
|
||||
assert "reasoning_content" not in api_msg
|
||||
|
||||
|
||||
class TestBuildAssistantMessageDeepSeekReasoningContent:
|
||||
"""_build_assistant_message pins replay-safe DeepSeek tool-call state."""
|
||||
|
||||
def test_deepseek_tool_call_reasoning_is_backfilled_into_reasoning_content(self) -> None:
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-flash")
|
||||
assistant_message = SimpleNamespace(
|
||||
content=None,
|
||||
reasoning="DeepSeek tool-call reasoning",
|
||||
reasoning_content=None,
|
||||
reasoning_details=None,
|
||||
codex_reasoning_items=None,
|
||||
codex_message_items=None,
|
||||
tool_calls=[
|
||||
SimpleNamespace(
|
||||
id="call_1",
|
||||
call_id=None,
|
||||
response_item_id=None,
|
||||
type="function",
|
||||
function=SimpleNamespace(name="terminal", arguments="{}"),
|
||||
)
|
||||
],
|
||||
)
|
||||
|
||||
msg = agent._build_assistant_message(assistant_message, "tool_calls")
|
||||
|
||||
assert msg["reasoning_content"] == "DeepSeek tool-call reasoning"
|
||||
assert msg["tool_calls"][0]["id"] == "call_1"
|
||||
|
||||
def test_deepseek_model_extra_reasoning_content_is_preserved(self) -> None:
|
||||
"""OpenAI SDK stores unknown provider fields in model_extra."""
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-flash")
|
||||
assistant_message = SimpleNamespace(
|
||||
content=None,
|
||||
reasoning=None,
|
||||
reasoning_content=None,
|
||||
model_extra={"reasoning_content": "DeepSeek model_extra reasoning"},
|
||||
reasoning_details=None,
|
||||
codex_reasoning_items=None,
|
||||
codex_message_items=None,
|
||||
tool_calls=[
|
||||
SimpleNamespace(
|
||||
id="call_1",
|
||||
call_id=None,
|
||||
response_item_id=None,
|
||||
type="function",
|
||||
function=SimpleNamespace(name="terminal", arguments="{}"),
|
||||
)
|
||||
],
|
||||
)
|
||||
|
||||
msg = agent._build_assistant_message(assistant_message, "tool_calls")
|
||||
|
||||
assert msg["reasoning_content"] == "DeepSeek model_extra reasoning"
|
||||
|
||||
def test_deepseek_tool_call_without_raw_reasoning_content_gets_empty_string(self) -> None:
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-flash")
|
||||
assistant_message = SimpleNamespace(
|
||||
content=None,
|
||||
reasoning=None,
|
||||
reasoning_content=None,
|
||||
reasoning_details=None,
|
||||
codex_reasoning_items=None,
|
||||
codex_message_items=None,
|
||||
tool_calls=[
|
||||
SimpleNamespace(
|
||||
id="call_1",
|
||||
call_id=None,
|
||||
response_item_id=None,
|
||||
type="function",
|
||||
function=SimpleNamespace(name="terminal", arguments="{}"),
|
||||
)
|
||||
],
|
||||
)
|
||||
|
||||
msg = agent._build_assistant_message(assistant_message, "tool_calls")
|
||||
|
||||
assert msg["reasoning_content"] == ""
|
||||
assert msg["tool_calls"][0]["id"] == "call_1"
|
||||
|
||||
|
||||
class TestBuildAssistantMessagePadsStrictProviders:
|
||||
"""Regression for #17400: _build_assistant_message must pin reasoning_content
|
||||
on tool-call turns when the active provider enforces echo-back, regardless
|
||||
of whether the SDK exposed reasoning_content as None, omitted it entirely,
|
||||
or returned an empty thinking block.
|
||||
|
||||
Prior to the fix, the pad branch was guarded by ``msg.get("tool_calls")``,
|
||||
which was always falsy because tool_calls were assigned later in the same
|
||||
method. Persisted history accumulated assistant tool-call turns with no
|
||||
reasoning_content; the next replay 400'd on DeepSeek/Kimi.
|
||||
"""
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"provider,model,base_url,sdk_reasoning_content,expected",
|
||||
[
|
||||
pytest.param(
|
||||
"deepseek", "deepseek-v4-pro", "",
|
||||
None, "",
|
||||
id="deepseek-attr-none",
|
||||
),
|
||||
pytest.param(
|
||||
"deepseek", "deepseek-v4-pro", "",
|
||||
_ATTR_ABSENT, "",
|
||||
id="deepseek-attr-absent",
|
||||
),
|
||||
pytest.param(
|
||||
"kimi-coding", "kimi-k2.6", "",
|
||||
None, "",
|
||||
id="kimi-attr-none",
|
||||
),
|
||||
pytest.param(
|
||||
"custom", "kimi-k2", "https://api.moonshot.ai/v1",
|
||||
_ATTR_ABSENT, "",
|
||||
id="moonshot-base-url",
|
||||
),
|
||||
pytest.param(
|
||||
"openrouter", "anthropic/claude-sonnet-4.6", "https://openrouter.ai/api/v1",
|
||||
_ATTR_ABSENT, _EXPECT_NOT_PRESENT,
|
||||
id="openrouter-no-pad",
|
||||
),
|
||||
],
|
||||
)
|
||||
def test_tool_call_reasoning_content_pad(
|
||||
self, provider, model, base_url, sdk_reasoning_content, expected,
|
||||
) -> None:
|
||||
agent = _make_agent(provider=provider, model=model, base_url=base_url)
|
||||
msg_in = _build_sdk_message(
|
||||
reasoning_content=sdk_reasoning_content,
|
||||
tool_calls=[_sdk_tool_call()],
|
||||
)
|
||||
msg = agent._build_assistant_message(msg_in, finish_reason="tool_calls")
|
||||
if expected is _EXPECT_NOT_PRESENT:
|
||||
assert "reasoning_content" not in msg
|
||||
else:
|
||||
assert msg["reasoning_content"] == expected
|
||||
|
||||
def test_tool_call_preserves_real_reasoning_content(self) -> None:
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-pro")
|
||||
msg_in = _build_sdk_message(
|
||||
reasoning_content="actual chain of thought",
|
||||
tool_calls=[_sdk_tool_call()],
|
||||
)
|
||||
msg = agent._build_assistant_message(msg_in, finish_reason="tool_calls")
|
||||
assert msg["reasoning_content"] == "actual chain of thought"
|
||||
|
||||
def test_text_only_turn_not_padded_by_tool_call_branch(self) -> None:
|
||||
"""Plain-text turns rely on _copy_reasoning_content_for_api at replay
|
||||
time, not on this builder's tool-call pad."""
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-pro")
|
||||
msg_in = SimpleNamespace(content="hello", tool_calls=None)
|
||||
msg = agent._build_assistant_message(msg_in, finish_reason="stop")
|
||||
assert "tool_calls" not in msg
|
||||
assert "reasoning_content" not in msg
|
||||
|
||||
def test_streamed_reasoning_text_promoted_over_pad(self) -> None:
|
||||
"""When ``.reasoning`` carries streamed thinking, it must be promoted
|
||||
to reasoning_content rather than overwritten with the empty pad."""
|
||||
agent = _make_agent(provider="deepseek", model="deepseek-v4-pro")
|
||||
msg_in = _build_sdk_message(
|
||||
reasoning="streamed thoughts",
|
||||
tool_calls=[_sdk_tool_call()],
|
||||
)
|
||||
msg = agent._build_assistant_message(msg_in, finish_reason="tool_calls")
|
||||
assert msg["reasoning_content"] == "streamed thoughts"
|
||||
|
||||
|
||||
class TestNeedsKimiToolReasoning:
|
||||
"""The extracted _needs_kimi_tool_reasoning() helper keeps Kimi behavior intact."""
|
||||
|
||||
|
||||
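The builder tests above imply a precedence order for the `reasoning_content` stored on an assistant tool-call turn: a real `reasoning_content`, then `model_extra["reasoning_content"]`, then streamed `reasoning`, and finally an empty-string pad for providers that enforce echo-back. The sketch below restates that order as a standalone function. The name `resolve_reasoning_content` and the `strict_provider` flag are hypothetical; this is not the body of `_build_assistant_message`, only a reading of what its tests assert.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_reasoning_content(message: Any, *, strict_provider: bool) -> Optional[str]:
    """Pick the reasoning_content to store for an assistant tool-call turn.

    Hypothetical helper; returns None when the key should be left out.
    """
    # 1. A real reasoning_content attribute wins.
    direct = getattr(message, "reasoning_content", None)
    if direct:
        return direct
    # 2. The OpenAI SDK keeps unknown provider fields in model_extra.
    extra = getattr(message, "model_extra", None) or {}
    if isinstance(extra, dict) and extra.get("reasoning_content"):
        return extra["reasoning_content"]
    # 3. Streamed thinking exposed as .reasoning is promoted.
    streamed = getattr(message, "reasoning", None)
    if streamed:
        return streamed
    # 4. Strict providers need the key present, so pad with "".
    return "" if strict_provider else None


# Example: attribute absent entirely on a strict provider -> padded.
padded = resolve_reasoning_content(
    SimpleNamespace(content="", tool_calls=["call_1"]), strict_provider=True
)
```

Under these assumptions `padded` is the empty string, matching the `deepseek-attr-absent` case, while the same message with `strict_provider=False` yields `None`, matching `openrouter-no-pad`.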
@@ -0,0 +1,245 @@
|
||||
"""Live DeepSeek V4 thinking-mode tool-call replay smoke test.
|
||||
|
||||
Opt-in only:
|
||||
HERMES_LIVE_TESTS=1 pytest tests/run_agent/test_deepseek_v4_thinking_live.py -q
|
||||
|
||||
Requires DEEPSEEK_API_KEY in the process environment. The key is captured at
|
||||
module import time because tests/conftest.py intentionally removes credential
|
||||
environment variables before each test body runs.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from typing import Any
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
LIVE = os.environ.get("HERMES_LIVE_TESTS") == "1"
|
||||
DEEPSEEK_KEY = os.environ.get("DEEPSEEK_API_KEY", "")
|
||||
LIVE_MODELS = ("deepseek-v4-flash", "deepseek-v4-pro")
|
||||
LIVE_BASE_URL = "https://api.deepseek.com"
|
||||
|
||||
pytestmark = [
|
||||
pytest.mark.skipif(not LIVE, reason="live-only: set HERMES_LIVE_TESTS=1"),
|
||||
pytest.mark.skipif(not DEEPSEEK_KEY, reason="DEEPSEEK_API_KEY not configured"),
|
||||
]
|
||||
|
||||
TOOL_NAME = "lookup_ticket_status"
|
||||
TOOLS = [
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": TOOL_NAME,
|
||||
"description": "Return the status for a test ticket id.",
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"ticket_id": {
|
||||
"type": "string",
|
||||
"description": "The ticket id to look up.",
|
||||
},
|
||||
},
|
||||
"required": ["ticket_id"],
|
||||
"additionalProperties": False,
|
||||
},
|
||||
},
|
||||
}
|
||||
]
|
||||
|
||||
|
||||
def _thinking_kwargs() -> dict:
|
||||
return {
|
||||
"reasoning_effort": "high",
|
||||
"extra_body": {"thinking": {"type": "enabled"}},
|
||||
}
|
||||
|
||||
|
||||
def _jsonable(value: Any) -> Any:
|
||||
if hasattr(value, "model_dump"):
|
||||
return value.model_dump(mode="json")
|
||||
if isinstance(value, dict):
|
||||
return {k: _jsonable(v) for k, v in value.items()}
|
||||
if isinstance(value, list):
|
||||
return [_jsonable(v) for v in value]
|
||||
return value
|
||||
|
||||
|
||||
def _print_trace(label: str, value: Any) -> None:
|
||||
sys.__stdout__.write(f"\n--- {label} ---\n")
|
||||
sys.__stdout__.write(
|
||||
json.dumps(_jsonable(value), ensure_ascii=False, indent=2, sort_keys=True)
|
||||
)
|
||||
sys.__stdout__.write("\n")
|
||||
sys.__stdout__.flush()
|
||||
|
||||
|
||||
def _message_snapshot(message) -> dict:
|
||||
return {
|
||||
"content": getattr(message, "content", None),
|
||||
"reasoning": getattr(message, "reasoning", None),
|
||||
"reasoning_content": _raw_reasoning_content(message),
|
||||
"model_extra": getattr(message, "model_extra", None),
|
||||
"tool_calls": _jsonable(getattr(message, "tool_calls", None)),
|
||||
}
|
||||
|
||||
|
||||
def _make_live_client():
|
||||
from openai import OpenAI
|
||||
|
||||
return OpenAI(api_key=DEEPSEEK_KEY, base_url=LIVE_BASE_URL)
|
||||
|
||||
|
||||
def _make_agent_for_message_building(model: str):
|
||||
from run_agent import AIAgent
|
||||
|
||||
agent = object.__new__(AIAgent)
|
||||
agent.provider = "deepseek"
|
||||
agent.model = model
|
||||
agent.base_url = LIVE_BASE_URL
|
||||
agent.verbose_logging = False
|
||||
agent.reasoning_callback = None
|
||||
agent.stream_delta_callback = None
|
||||
agent._stream_callback = None
|
||||
return agent
|
||||
|
||||
|
||||
def _raw_reasoning_content(message):
|
||||
direct = getattr(message, "reasoning_content", None)
|
||||
if direct is not None:
|
||||
return direct
|
||||
model_extra = getattr(message, "model_extra", None) or {}
|
||||
if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
|
||||
return model_extra["reasoning_content"]
|
||||
return None
|
||||
|
||||
|
||||
@pytest.mark.parametrize("live_model", LIVE_MODELS)
|
||||
def test_deepseek_v4_thinking_tool_call_replay_round_trip(live_model: str):
|
||||
"""Hit DeepSeek twice and replay the assistant tool-call turn.
|
||||
|
||||
The first request forces a tool call with thinking enabled. The second
|
||||
request replays that assistant message with content, reasoning_content,
|
||||
and tool_calls, then appends the tool result. DeepSeek accepting the
|
||||
second request is the live guardrail for the V4 thinking replay contract.
|
||||
"""
|
||||
|
||||
client = _make_live_client()
|
||||
agent = _make_agent_for_message_building(live_model)
|
||||
|
||||
first_request = {
|
||||
"model": live_model,
|
||||
"messages": [
|
||||
{
|
||||
"role": "user",
|
||||
"content": (
|
||||
"You must use the provided lookup_ticket_status tool "
|
||||
"exactly once with ticket_id 'DS-4242'. Do not answer "
|
||||
"directly."
|
||||
),
|
||||
}
|
||||
],
|
||||
"tools": TOOLS,
|
||||
"max_tokens": 1024,
|
||||
"timeout": 90,
|
||||
**_thinking_kwargs(),
|
||||
}
|
||||
_print_trace(f"{live_model} first request", first_request)
|
||||
first = client.chat.completions.create(**first_request)
|
||||
_print_trace(f"{live_model} first raw response", first)
|
||||
|
||||
first_choice = first.choices[0]
|
||||
first_message = first_choice.message
|
||||
_print_trace(
|
||||
f"{live_model} first assistant message",
|
||||
{
|
||||
"finish_reason": first_choice.finish_reason,
|
||||
**_message_snapshot(first_message),
|
||||
},
|
||||
)
|
||||
assert first_message.tool_calls, "DeepSeek did not return a tool call"
|
||||
first_tool_call = first_message.tool_calls[0]
|
||||
assert first_tool_call.function.name == TOOL_NAME
|
||||
assert isinstance(json.loads(first_tool_call.function.arguments or "{}"), dict)
|
||||
|
||||
raw_reasoning_content = _raw_reasoning_content(first_message)
|
||||
assert raw_reasoning_content is not None, (
|
||||
"DeepSeek did not return reasoning_content; the thinking payload may "
|
||||
"not have been honored"
|
||||
)
|
||||
|
||||
stored_assistant = agent._build_assistant_message(
|
||||
first_message,
|
||||
first_choice.finish_reason or "tool_calls",
|
||||
)
|
||||
_print_trace(f"{live_model} stored assistant message", stored_assistant)
|
||||
assert stored_assistant["reasoning_content"] == raw_reasoning_content
|
||||
|
||||
replay_assistant = {
|
||||
"role": "assistant",
|
||||
"content": stored_assistant.get("content") or "",
|
||||
"tool_calls": stored_assistant["tool_calls"],
|
||||
}
|
||||
agent._copy_reasoning_content_for_api(stored_assistant, replay_assistant)
|
||||
_print_trace(f"{live_model} replay assistant message", replay_assistant)
|
||||
|
||||
tool_call_id = stored_assistant["tool_calls"][0]["id"]
|
||||
messages = [
|
||||
{
|
||||
"role": "user",
|
||||
"content": (
|
||||
"You must use the provided lookup_ticket_status tool "
|
||||
"exactly once with ticket_id 'DS-4242'. Do not answer "
|
||||
"directly."
|
||||
),
|
||||
},
|
||||
replay_assistant,
|
||||
{
|
||||
"role": "tool",
|
||||
"tool_call_id": tool_call_id,
|
||||
"content": json.dumps(
|
||||
{"ticket_id": "DS-4242", "status": "green", "source": "live-test"},
|
||||
separators=(",", ":"),
|
||||
),
|
||||
},
|
||||
]
|
||||
|
||||
from agent.transports.chat_completions import ChatCompletionsTransport
|
||||
|
||||
api_messages = ChatCompletionsTransport().convert_messages(messages)
|
||||
_print_trace(
|
||||
f"{live_model} second request messages after transport conversion",
|
||||
api_messages,
|
||||
)
|
||||
assert api_messages[1]["reasoning_content"] == raw_reasoning_content
|
||||
assert "call_id" not in api_messages[1]["tool_calls"][0]
|
||||
assert "response_item_id" not in api_messages[1]["tool_calls"][0]
|
||||
|
||||
second_request = {
|
||||
"model": live_model,
|
||||
"messages": api_messages,
|
||||
"max_tokens": 1024,
|
||||
"timeout": 90,
|
||||
**_thinking_kwargs(),
|
||||
}
|
||||
_print_trace(f"{live_model} second request", second_request)
|
||||
second = client.chat.completions.create(**second_request)
|
||||
_print_trace(f"{live_model} second raw response", second)
|
||||
_print_trace(
|
||||
f"{live_model} second assistant message",
|
||||
{
|
||||
"finish_reason": second.choices[0].finish_reason,
|
||||
**_message_snapshot(second.choices[0].message),
|
||||
},
|
||||
)
|
||||
|
||||
second_message = second.choices[0].message
|
||||
final_content = second_message.content or ""
|
||||
final_reasoning = _raw_reasoning_content(second_message) or ""
|
||||
assert second.choices[0].finish_reason == "stop"
|
||||
assert final_content.strip() or final_reasoning.strip(), (
|
||||
"DeepSeek returned neither visible content nor reasoning_content"
|
||||
)
|
||||
@@ -0,0 +1,249 @@
|
||||
"""Regression guard for PR #16660 (salvaged as PR #18027): ContextVar
|
||||
propagation into concurrent tool worker threads.
|
||||
|
||||
Background
|
||||
----------
|
||||
Gateway adapters (Slack, Telegram, Discord, ...) set
|
||||
``tools.approval._approval_session_key`` as a ContextVar before calling
|
||||
``agent.run_conversation`` so that dangerous-command approval prompts route
|
||||
back to the channel/session that initiated the tool call. When the agent
|
||||
dispatches multiple tools in parallel, it uses
|
||||
``concurrent.futures.ThreadPoolExecutor.submit(...)`` — and ``submit`` runs
|
||||
the callable in a *fresh* context, NOT the caller's context. Without an
|
||||
explicit ``contextvars.copy_context().run(...)`` wrapper, worker threads
|
||||
observe the ContextVar's default value, fall through to the
|
||||
``os.environ`` legacy fallback (which the gateway overwrites at each
|
||||
agent step), and route the approval card to *whichever session stepped
|
||||
most recently* — not the one that raised the prompt. Confirmed in the
|
||||
wild on Slack with two concurrent channels: session A's `rm -rf`
|
||||
approval card was delivered to session B.
|
||||
|
||||
The fix (4 LOC in ``run_agent.py``) snapshots the caller's context with
|
||||
``copy_context()`` and submits ``ctx.run(_run_tool, …)`` instead of
|
||||
``_run_tool`` directly. Mirrors ``asyncio.to_thread`` semantics.
|
||||
|
||||
This suite follows the ``contextvar-run-in-executor-bridge`` skill's
|
||||
two-test pattern: one end-to-end test proves the fix works at the
|
||||
call-site level, one documents the Python contract that makes the fix
|
||||
necessary. If anyone ever reverts the wrapper, the call-site test
|
||||
fails while the contract test keeps passing — a clear diagnostic
|
||||
signal for *why* the call-site regressed.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import concurrent.futures
|
||||
import contextvars
|
||||
import threading
|
||||
|
||||
|
||||
def test_executor_submit_without_copy_context_does_not_propagate():
|
||||
"""Documents the Python contract the fix relies on.
|
||||
|
||||
``concurrent.futures.ThreadPoolExecutor.submit(fn)`` runs ``fn`` in a
|
||||
worker thread with a fresh, empty context. A ContextVar set by the
|
||||
caller is invisible inside ``fn``. This is the exact trap that made
|
||||
approval-session routing race in the gateway before #16660.
|
||||
|
||||
If this test ever fails — i.e. submit() starts propagating
|
||||
ContextVars by default — the copy_context() wrapper in run_agent.py
|
||||
becomes redundant but not harmful, and the call-site test below
|
||||
should be updated accordingly.
|
||||
"""
|
||||
probe: contextvars.ContextVar[str] = contextvars.ContextVar(
|
||||
"probe_default_propagation", default="unset"
|
||||
)
|
||||
|
||||
def read_in_worker() -> str:
|
||||
return probe.get()
|
||||
|
||||
probe.set("set-in-main")
|
||||
|
||||
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
|
||||
observed = ex.submit(read_in_worker).result(timeout=5)
|
||||
|
||||
assert observed == "unset", (
|
||||
"Unexpected: executor.submit propagated a ContextVar without "
|
||||
"copy_context(). If Python's behavior changed, update "
|
||||
"test_run_tool_worker_sees_parent_approval_session_key below."
|
||||
)
|
||||
|
||||
|
||||
def test_executor_submit_with_copy_context_run_propagates():
|
||||
"""Positive case: the explicit ``copy_context().run(...)`` wrapper the
|
||||
PR adds makes parent-context ContextVar values visible in the worker.
|
||||
"""
|
||||
probe: contextvars.ContextVar[str] = contextvars.ContextVar(
|
||||
"probe_explicit_propagation", default="unset"
|
||||
)
|
||||
|
||||
def read_in_worker() -> str:
|
||||
return probe.get()
|
||||
|
||||
probe.set("set-in-main")
|
||||
|
||||
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
|
||||
ctx = contextvars.copy_context()
|
||||
observed = ex.submit(ctx.run, read_in_worker).result(timeout=5)
|
||||
|
||||
assert observed == "set-in-main", (
|
||||
f"copy_context().run(...) failed to propagate: got {observed!r}"
|
||||
)
|
||||
|
||||
|
||||
def test_run_tool_worker_sees_parent_approval_session_key():
|
||||
"""End-to-end call-site guard.
|
||||
|
||||
Mirrors the exact shape of the fixed call site in
|
||||
``run_agent.py::_execute_tool_calls_concurrent`` — a
|
||||
``ThreadPoolExecutor`` with ``executor.submit(ctx.run, fn, *args)``.
|
||||
Sets the real ``tools.approval._approval_session_key`` ContextVar
|
||||
in the caller and asserts the worker observes it via
|
||||
``tools.approval.get_current_session_key()``.
|
||||
|
||||
Because the worker here is submitted through ``ctx.run``, this test
|
||||
checks the pattern rather than ``run_agent.py`` itself. Without the
|
||||
wrapper the worker would not see ``'session-A'``.
|
||||
"""
|
||||
from tools.approval import (
|
||||
_approval_session_key,
|
||||
get_current_session_key,
|
||||
)
|
||||
|
||||
observed: dict = {}
|
||||
barrier = threading.Event()
|
||||
|
||||
def worker_equivalent_to_run_tool() -> None:
|
||||
# Mirror what real _run_tool does early: read the session key.
|
||||
observed["session_key"] = get_current_session_key(default="FALLBACK")
|
||||
barrier.set()
|
||||
|
||||
# Set the ContextVar the gateway would set before calling agent.run.
|
||||
token = _approval_session_key.set("session-A")
|
||||
try:
|
||||
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
|
||||
ctx = contextvars.copy_context()
|
||||
fut = ex.submit(ctx.run, worker_equivalent_to_run_tool)
|
||||
fut.result(timeout=5)
|
||||
assert barrier.is_set(), "worker did not complete"
|
||||
finally:
|
||||
_approval_session_key.reset(token)
|
||||
    assert observed.get("session_key") == "session-A", (
        f"Worker thread did not inherit _approval_session_key from caller. "
        f"Expected 'session-A', got {observed.get('session_key')!r}. "
        "This is the bug that PR #16660 fixed — approval prompts route to "
        "the wrong session in concurrent gateway traffic. Check whether "
        "the copy_context().run wrapper in _execute_tool_calls_concurrent "
        "was removed."
    )


def test_run_agent_concurrent_executor_wraps_submit_with_copy_context():
    """Source-level guard that the fix stays at the REAL call site.

    The behavioral tests above exercise the pattern in isolation and
    pass regardless of whether ``run_agent.py`` actually uses it.
    This guard inspects ``_execute_tool_calls_concurrent`` directly and
    asserts that ``executor.submit`` is called with ``ctx.run`` (or
    ``copy_context()`` appears within a few lines) — so reverting the
    wrapper in ``run_agent.py`` fails this test with a clear message.
    """
    import ast
    import inspect

    import run_agent

    src_path = inspect.getsourcefile(run_agent)
    assert src_path is not None
    tree = ast.parse(open(src_path, encoding="utf-8").read())

    submit_calls_in_agent: list[ast.Call] = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match executor.submit(...) style calls.
        if isinstance(func, ast.Attribute) and func.attr == "submit":
            submit_calls_in_agent.append(node)

    # Filter to the submit call inside the concurrent tool executor —
    # identifiable by passing `_run_tool` as its target. Other submit()
    # call sites in run_agent.py (e.g. auxiliary client warm-up) are
    # out of scope for this regression.
    tool_submits = []
    for call in submit_calls_in_agent:
        if not call.args:
            continue
        first = call.args[0]
        # Unfixed: executor.submit(_run_tool, ...) → first arg is a Name
        if isinstance(first, ast.Name) and first.id == "_run_tool":
            tool_submits.append(("unfixed", call))
        # Fixed: executor.submit(ctx.run, _run_tool, ...) → first arg is
        # ctx.run (Attribute), and _run_tool is the second arg.
        elif (
            isinstance(first, ast.Attribute)
            and first.attr == "run"
            and len(call.args) >= 2
            and isinstance(call.args[1], ast.Name)
            and call.args[1].id == "_run_tool"
        ):
            tool_submits.append(("fixed", call))

    assert tool_submits, (
        "Could not locate `executor.submit(... _run_tool ...)` in "
        "run_agent.py. The call site may have been renamed — update this "
        "guard along with the refactor."
    )
    unfixed = [c for kind, c in tool_submits if kind == "unfixed"]
    assert not unfixed, (
        "run_agent.py contains `executor.submit(_run_tool, ...)` without a "
        "`ctx.run` wrapper. This is the pre-#16660 shape: worker threads "
        "will read a fresh ContextVar and approval-session routing "
        "collapses to the os.environ fallback. Wrap with "
        "`ctx = contextvars.copy_context(); executor.submit(ctx.run, "
        "_run_tool, ...)`."
    )


def test_two_concurrent_tool_batches_keep_session_keys_isolated():
    """End-to-end guard: two callers each set a different session key
    and submit workers concurrently. Each worker must see its own
    caller's key, not the other's.

    Guards against a future "optimization" that reuses a single context
    snapshot across callers (which would collapse isolation the same way
    the unfixed ``submit`` does).
    """
    from tools.approval import (
        _approval_session_key,
        get_current_session_key,
    )

    results: dict = {}

    def caller(label: str) -> None:
        token = _approval_session_key.set(f"session-{label}")
        try:
            with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
                ctx = contextvars.copy_context()
                fut = ex.submit(
                    ctx.run,
                    lambda: get_current_session_key(default="FALLBACK"),
                )
                results[label] = fut.result(timeout=5)
        finally:
            _approval_session_key.reset(token)

    t_a = threading.Thread(target=caller, args=("A",))
    t_b = threading.Thread(target=caller, args=("B",))
    t_a.start()
    t_b.start()
    t_a.join(timeout=10)
    t_b.join(timeout=10)

    assert results.get("A") == "session-A", (
        f"Session A worker saw {results.get('A')!r}, expected 'session-A'"
    )
    assert results.get("B") == "session-B", (
        f"Session B worker saw {results.get('B')!r}, expected 'session-B'"
    )
@@ -0,0 +1,41 @@
# Stress / battle-test suite

Long-running tests that exercise the Kanban kernel under adversarial
conditions. **Not run by `scripts/run_tests.sh`** because they can
take 30+ seconds each and spawn real subprocesses.

Run manually:

```bash
./venv/bin/python -m pytest tests/stress/ -v -s
# or individual files:
./venv/bin/python tests/stress/test_concurrency.py
./venv/bin/python tests/stress/test_subprocess_e2e.py
./venv/bin/python tests/stress/test_property_fuzzing.py
./venv/bin/python tests/stress/test_benchmarks.py
```

## What's covered

- **test_concurrency.py** — 5 workers, 100 tasks, race-for-claim. Asserts
  no double-claims, no orphan runs, no SQLite errors escape retry.
- **test_concurrency_mixed.py** — 10 workers + 1 reclaimer, 500 tasks,
  random ops (claim/complete/block/unblock/archive). Same invariants
  under adversarial scheduling.
- **test_concurrency_reclaim_race.py** — TTL < work duration so the
  reclaimer intentionally yanks tasks mid-work; verifies the worker's
  late-complete is refused cleanly (CAS guard works).
- **test_subprocess_e2e.py** — dispatcher spawns real Python subprocess
  workers that heartbeat + complete via the CLI; crash detection
  against a real dead PID.
- **test_property_fuzzing.py** — 500 random operation sequences,
  ~40k operations total, 9 invariant checks after each step.
- **test_atypical_scenarios.py** — 28 scenarios covering atypical
  user inputs: unicode/emoji/RTL, 1 MB strings, SQL injection
  attempts, cycles, self-parents, wide fan-in/out, clock skew,
  HERMES_HOME with spaces/unicode/symlinks, 1000 runs on one
  task, idempotency-key race across processes, terminal-state
  resurrection attempts, dashboard REST with weird JSON.
- **test_benchmarks.py** — latency at 100/1k/10k tasks for dispatch,
  recompute_ready, list_tasks, build_worker_context, etc. Results saved
  to JSON for regression diffing.
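
The CAS (compare-and-swap) guard behind the race-for-claim and late-complete checks above can be sketched as follows. This is a minimal illustration under stated assumptions, not the kernel's actual code: the `tasks` table with `status` and `claim_lock` columns mirrors the queries used by the stress scripts, while the `running` status value and the helper names `make_db`, `try_claim`, and `try_complete` are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB holding a single ready task (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        "id INTEGER PRIMARY KEY, status TEXT NOT NULL, claim_lock TEXT)"
    )
    conn.execute("INSERT INTO tasks (id, status, claim_lock) VALUES (1, 'ready', NULL)")
    conn.commit()
    return conn


def try_claim(conn: sqlite3.Connection, task_id: int, claimer: str) -> bool:
    """Claim a task only if it is still ready and unclaimed.

    The WHERE clause is the compare step and the SET is the swap; SQLite
    applies both in one statement, so at most one claimer changes the row.
    """
    cur = conn.execute(
        "UPDATE tasks SET status = 'running', claim_lock = ? "
        "WHERE id = ? AND status = 'ready' AND claim_lock IS NULL",
        (claimer, task_id),
    )
    conn.commit()
    return cur.rowcount == 1


def try_complete(conn: sqlite3.Connection, task_id: int, claimer: str) -> bool:
    """Complete a task only if this claimer still holds the claim.

    A worker whose claim was reclaimed matches zero rows and is refused.
    """
    cur = conn.execute(
        "UPDATE tasks SET status = 'done', claim_lock = NULL "
        "WHERE id = ? AND status = 'running' AND claim_lock = ?",
        (task_id, claimer),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    db = make_db()
    print("worker-a claim:", try_claim(db, 1, "worker-a"))
    print("worker-b claim:", try_claim(db, 1, "worker-b"))
    print("worker-b complete:", try_complete(db, 1, "worker-b"))
    print("worker-a complete:", try_complete(db, 1, "worker-a"))
```

The point of the design is that a losing worker gets a plain "no rows changed" result rather than an exception, which is why the stress scripts can treat a lost claim race as expected contention.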
@@ -0,0 +1,50 @@
#!/usr/bin/env python3
"""Fake worker process that exercises the real subprocess contract.

Reads HERMES_KANBAN_TASK from env, heartbeats periodically, does short
work, completes via the CLI. Designed to be spawned by the dispatcher
exactly the way `hermes chat -q` would be, minus the LLM cost.
"""

import json
import os
import subprocess
import sys
import time


def main():
    tid = os.environ["HERMES_KANBAN_TASK"]
    workspace = os.environ.get("HERMES_KANBAN_WORKSPACE", "")

    # Announce via CLI (goes through real argparse + init_db + etc)
    subprocess.run(
        ["hermes", "kanban", "heartbeat", tid, "--note", "started"],
        check=True, capture_output=True,
    )

    # Simulate work with periodic heartbeats
    for i in range(3):
        time.sleep(0.3)
        subprocess.run(
            ["hermes", "kanban", "heartbeat", tid, "--note", f"progress {i+1}/3"],
            check=True, capture_output=True,
        )

    # Complete with structured handoff
    subprocess.run(
        [
            "hermes", "kanban", "complete", tid,
            "--summary", f"real-subprocess worker finished {tid}",
            "--metadata", json.dumps({
                "workspace": workspace,
                "worker_pid": os.getpid(),
                "iterations": 3,
            }),
        ],
        check=True, capture_output=True,
    )


if __name__ == "__main__":
    main()
@@ -0,0 +1,37 @@
"""pytest config for the stress/ subdirectory.

These tests are slow (30s+), spawn subprocesses, and are not run by
default. Enable via `pytest --run-stress` or by running the scripts
directly.

The scripts are primarily __main__-executable entry points; pytest
isn't expected to collect individual test functions from them.
"""
import pytest


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-stress", default=False):
        return
    skip_stress = pytest.mark.skip(
        reason="stress test (opt-in via --run-stress or run script directly)"
    )
    for item in items:
        if "tests/stress" in str(item.fspath):
            item.add_marker(skip_stress)


def pytest_addoption(parser):
    parser.addoption(
        "--run-stress",
        action="store_true",
        default=False,
        help="Run the stress/battle-test suite (slow, spawns subprocesses).",
    )


collect_ignore_glob = [
    # The stress scripts have top-level code and hard-coded paths; they're
    # meant to run as `python tests/stress/<name>.py`, not as pytest modules.
    "*.py",
]
File diff suppressed because it is too large
@@ -0,0 +1,221 @@
"""Scale benchmarks for the Kanban kernel.

Measures:
  - dispatch_once latency at 100, 1000, 10000 tasks
  - recompute_ready latency at 100, 1000, 10000 todo tasks with wide parent graphs
  - build_worker_context latency with 1, 10, 50 parent dependencies
  - board list/stats query latency
  - task_runs query latency at scale

Results printed as a table. Saved to JSON for regression-diffing in CI
or future reviews. Not a pass/fail test — records numbers so we know
when a change regresses latency by 10x and can decide whether to care.
"""

import json
import os
import random
import sys
import tempfile
import time
from pathlib import Path

WT = str(Path(__file__).resolve().parents[2])


def bench(label, fn, iterations=5):
    """Time fn over `iterations` runs, return (min, median, max) in ms."""
    times = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1000)
    times.sort()
    mn = times[0]
    md = times[len(times) // 2]
    mx = times[-1]
    return {"label": label, "iter": iterations, "min_ms": mn, "median_ms": md, "max_ms": mx}


def seed_tasks(conn, kb, n, assignee="bench-worker", with_parents=False):
    """Seed n tasks. Optionally give each task 5 parents."""
    ids = []
    for i in range(n):
        if with_parents and i >= 5:
            parents = random.sample(ids[:i], 5)
        else:
            parents = ()
        tid = kb.create_task(
            conn, title=f"bench {i}", assignee=assignee,
            tenant="bench", parents=parents,
        )
        ids.append(tid)
    return ids


def main():
    home = tempfile.mkdtemp(prefix="hermes_bench_")
    os.environ["HERMES_HOME"] = home
    os.environ["HOME"] = home
    sys.path.insert(0, WT)
    from hermes_cli import kanban_db as kb

    kb.init_db()

    results = []

    # ============ dispatch_once latency ============
    for n in [100, 1000, 10000]:
        print(f"\n== dispatch_once @ {n} tasks ==")
        # Fresh DB each time so we're not measuring cumulative effects
        import shutil
        shutil.rmtree(home, ignore_errors=True)
        os.makedirs(home)
        kb._INITIALIZED_PATHS.clear()
        kb.init_db()
        conn = kb.connect()
        seed_tasks(conn, kb, n, assignee=None)  # no assignee → won't spawn
        r = bench(
            f"dispatch_once (n={n}, no spawn)",
            lambda: kb.dispatch_once(conn, spawn_fn=lambda *_: None),
            iterations=5,
        )
        print(f"  min={r['min_ms']:.1f}  median={r['median_ms']:.1f}  max={r['max_ms']:.1f} ms")
        r["n"] = n
        results.append(r)
        conn.close()

    # ============ recompute_ready at scale with parent graphs ============
    for n in [100, 1000, 10000]:
        print(f"\n== recompute_ready @ {n} tasks (5 parents each) ==")
        shutil.rmtree(home, ignore_errors=True)
        os.makedirs(home)
        kb._INITIALIZED_PATHS.clear()
        kb.init_db()
        conn = kb.connect()
        ids = seed_tasks(conn, kb, n, assignee=None, with_parents=True)
        # Complete the first min(100, n // 10) tasks so some todo tasks
        # might get promoted
        for tid in ids[:min(100, n // 10)]:
            kb.complete_task(conn, tid, result="bench")
        r = bench(
            f"recompute_ready (n={n}, with parents)",
            lambda: kb.recompute_ready(conn),
            iterations=5,
        )
        print(f"  min={r['min_ms']:.1f}  median={r['median_ms']:.1f}  max={r['max_ms']:.1f} ms")
        r["n"] = n
        results.append(r)
        conn.close()

    # ============ build_worker_context with N parents ============
    for parent_count in [1, 10, 50]:
        print(f"\n== build_worker_context with {parent_count} parents ==")
        shutil.rmtree(home, ignore_errors=True)
        os.makedirs(home)
        kb._INITIALIZED_PATHS.clear()
        kb.init_db()
        conn = kb.connect()
        # Create parents, complete them with summaries+metadata
        parent_ids = []
        for i in range(parent_count):
            pid = kb.create_task(conn, title=f"parent {i}", assignee="p")
            kb.claim_task(conn, pid)
            kb.complete_task(
                conn, pid,
                summary=f"parent {i} result that is longer than a single token "
                        f"so we actually measure the IO",
                metadata={"files": [f"file_{j}.py" for j in range(5)], "i": i},
            )
            parent_ids.append(pid)
        child_id = kb.create_task(
            conn, title="child", assignee="c", parents=parent_ids,
        )
        r = bench(
            f"build_worker_context (parents={parent_count})",
            lambda: kb.build_worker_context(conn, child_id),
            iterations=10,
        )
        print(f"  min={r['min_ms']:.1f}  median={r['median_ms']:.1f}  max={r['max_ms']:.1f} ms")
        r["parent_count"] = parent_count
        results.append(r)
        conn.close()

    # ============ list_tasks at scale ============
    for n in [100, 1000, 10000]:
        print(f"\n== list_tasks @ {n} ==")
        shutil.rmtree(home, ignore_errors=True)
        os.makedirs(home)
        kb._INITIALIZED_PATHS.clear()
        kb.init_db()
        conn = kb.connect()
        seed_tasks(conn, kb, n)
        r = bench(
            f"list_tasks (n={n})",
            lambda: kb.list_tasks(conn),
            iterations=5,
        )
        print(f"  min={r['min_ms']:.1f}  median={r['median_ms']:.1f}  max={r['max_ms']:.1f} ms")
        r["n"] = n
        results.append(r)
        conn.close()

    # ============ board_stats at scale ============
    for n in [100, 1000, 10000]:
        print(f"\n== board_stats @ {n} ==")
        shutil.rmtree(home, ignore_errors=True)
        os.makedirs(home)
        kb._INITIALIZED_PATHS.clear()
        kb.init_db()
        conn = kb.connect()
        seed_tasks(conn, kb, n)
        r = bench(
            f"board_stats (n={n})",
            lambda: kb.board_stats(conn),
            iterations=5,
        )
        print(f"  min={r['min_ms']:.1f}  median={r['median_ms']:.1f}  max={r['max_ms']:.1f} ms")
        r["n"] = n
        results.append(r)
        conn.close()

    # ============ list_runs at scale ============
    for n in [100, 1000]:
        print(f"\n== list_runs for task with {n} attempts ==")
        shutil.rmtree(home, ignore_errors=True)
        os.makedirs(home)
        kb._INITIALIZED_PATHS.clear()
        kb.init_db()
        conn = kb.connect()
        tid = kb.create_task(conn, title="x", assignee="w")
        # Create N attempts via claim/release
        for i in range(n):
            kb.claim_task(conn, tid, ttl_seconds=0)
            kb.release_stale_claims(conn)
        r = bench(
            f"list_runs (runs={n})",
            lambda: kb.list_runs(conn, tid),
            iterations=10,
        )
        print(f"  min={r['min_ms']:.1f}  median={r['median_ms']:.1f}  max={r['max_ms']:.1f} ms")
        r["run_count"] = n
        results.append(r)
        conn.close()

    # ============ SUMMARY TABLE ============
    print()
    print("=" * 60)
    print("SUMMARY")
    print("=" * 60)
    print(f"{'Benchmark':<50} {'min':>8} {'median':>8} {'max':>8}")
    for r in results:
        print(f"{r['label']:<50} {r['min_ms']:>7.1f}ms {r['median_ms']:>7.1f}ms {r['max_ms']:>7.1f}ms")

    # Save for future diffing.
    out_path = "/tmp/kanban_bench_results.json"
    with open(out_path, "w") as f:
        json.dump(results, f, indent=2)
    print(f"\nResults saved to {out_path}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,302 @@
"""Multi-process concurrency stress test for the Kanban kernel.

5 worker processes race for claims on a shared DB with 100 tasks. Each
worker loops: claim -> simulate work -> complete. Asserts the invariants
that make the system worth building:

  - No task claimed by two workers simultaneously
  - No task completed twice
  - Every claim produces exactly one run row
  - Every completion closes exactly one run row
  - Zero SQLite locking errors that escape the retry layer
  - Total run count == total claim events == total completed events

This test is the primary justification for WAL + CAS-based claim. If it
passes, the architecture holds. If it fails, we have a real bug to fix
before anyone runs this in anger.
"""

import json
import multiprocessing as mp
import os
import random
import sqlite3
import subprocess
import sys
import tempfile
import time
from pathlib import Path


NUM_WORKERS = 5
NUM_TASKS = 100
WORKER_TIMEOUT_S = 60
WT = str(Path(__file__).resolve().parents[2])


def worker_loop(worker_id: int, hermes_home: str, result_file: str) -> None:
    """One worker's inner loop. Runs in a fresh Python process.

    Tries to claim a ready task, marks it done with a per-worker summary,
    repeats until the ready pool is empty. Records every claim + complete
    into its own JSON result file for later aggregation.
    """
    os.environ["HERMES_HOME"] = hermes_home
    os.environ["HOME"] = hermes_home
    sys.path.insert(0, WT)

    from hermes_cli import kanban_db as kb

    events = []
    empty_polls = 0
    start = time.monotonic()

    while time.monotonic() - start < WORKER_TIMEOUT_S:
        conn = kb.connect()
        try:
            # Find any ready task (non-deterministic order intentional — we
            # want workers to race on popular assignees).
            row = conn.execute(
                "SELECT id FROM tasks WHERE status = 'ready' "
                "AND claim_lock IS NULL LIMIT 1"
            ).fetchone()
            if row is None:
                empty_polls += 1
                if empty_polls > 20:
                    break  # queue empty long enough, stop
                time.sleep(0.01)
                continue
            empty_polls = 0

            tid = row["id"]
            try:
                claimed = kb.claim_task(
                    conn, tid, claimer=f"worker-{worker_id}",
                )
            except sqlite3.OperationalError as e:
                events.append({"kind": "sqlite_err_on_claim", "task": tid, "err": str(e)})
                continue
            if claimed is None:
                # Someone else beat us — expected contention, not an error.
                events.append({"kind": "lost_claim_race", "task": tid})
                continue

            run = kb.latest_run(conn, tid)
            events.append({
                "kind": "claimed",
                "task": tid,
                "worker": worker_id,
                "run_id": run.id,
                "t": time.monotonic() - start,
            })

            # Simulate short, variable work
            time.sleep(random.uniform(0.001, 0.05))

            try:
                kb.complete_task(
                    conn, tid,
                    result=f"done by worker-{worker_id}",
                    summary=f"worker-{worker_id} finished task {tid}",
                    metadata={"worker_id": worker_id, "run_id": run.id},
                )
            except sqlite3.OperationalError as e:
                events.append({"kind": "sqlite_err_on_complete", "task": tid, "err": str(e)})
                continue
            events.append({
                "kind": "completed",
                "task": tid,
                "worker": worker_id,
                "run_id": run.id,
                "t": time.monotonic() - start,
            })
        finally:
            conn.close()

    with open(result_file, "w") as f:
        json.dump(events, f)


def main():
    home = tempfile.mkdtemp(prefix="hermes_concurrency_")
    print(f"HERMES_HOME = {home}")

    # Seed.
    os.environ["HERMES_HOME"] = home
    os.environ["HOME"] = home
    sys.path.insert(0, WT)
    from hermes_cli import kanban_db as kb

    kb.init_db()
    conn = kb.connect()
    tids = []
    for i in range(NUM_TASKS):
        tid = kb.create_task(
            conn, title=f"task #{i}", assignee="shared",
            tenant="concurrency-test",
        )
        tids.append(tid)
    conn.close()
    print(f"Seeded {NUM_TASKS} tasks.")

    # Spawn workers.
    ctx = mp.get_context("spawn")
    result_files = [f"/tmp/concurrency_worker_{i}.json" for i in range(NUM_WORKERS)]
    procs = []
    start = time.monotonic()
    for i in range(NUM_WORKERS):
        p = ctx.Process(target=worker_loop, args=(i, home, result_files[i]))
        p.start()
        procs.append(p)

    for p in procs:
        p.join(timeout=WORKER_TIMEOUT_S + 30)
        if p.is_alive():
            p.terminate()
            p.join()

    elapsed = time.monotonic() - start
    print(f"All workers done in {elapsed:.1f}s")

    # Aggregate worker events.
    all_events = []
    for i, f in enumerate(result_files):
        if not os.path.isfile(f):
            print(f"  WORKER {i} produced no result file — died?")
            continue
        with open(f) as fh:
            events = json.load(fh)
        all_events.extend(events)

    # ============ INVARIANT CHECKS ============
    print()
    print("=" * 60)
    print("INVARIANT CHECKS")
    print("=" * 60)

    failures = []

    # Check 1: no task claimed by two different workers
    claims_by_task = {}
    for e in all_events:
        if e["kind"] == "claimed":
            if e["task"] in claims_by_task:
                prev = claims_by_task[e["task"]]
                if prev["worker"] != e["worker"]:
                    failures.append(
                        f"DOUBLE CLAIM: task {e['task']} claimed by "
                        f"worker {prev['worker']} AND worker {e['worker']}"
                    )
            claims_by_task[e["task"]] = e

    # Check 2: every completion has a matching claim from the same worker
    for e in all_events:
        if e["kind"] == "completed":
            prev_claim = claims_by_task.get(e["task"])
            if prev_claim is None:
                failures.append(f"COMPLETION WITHOUT CLAIM: task {e['task']}")
            elif prev_claim["worker"] != e["worker"]:
                failures.append(
                    f"WORKER MISMATCH: task {e['task']} claimed by "
                    f"{prev_claim['worker']} but completed by {e['worker']}"
                )

    # Check 3: DB state — every task should be in 'done', no dangling claims
    conn = kb.connect()
    try:
        bad_status = conn.execute(
            "SELECT id, status, claim_lock, current_run_id FROM tasks "
            "WHERE status != 'done' OR claim_lock IS NOT NULL "
            "OR current_run_id IS NOT NULL"
        ).fetchall()
        if bad_status:
            for row in bad_status:
                failures.append(
                    f"BAD FINAL STATE: task {row['id']} status={row['status']} "
                    f"claim_lock={row['claim_lock']} current_run_id={row['current_run_id']}"
                )

        # Check 4: exactly one run per task, all closed as completed
        bad_runs = conn.execute(
            "SELECT task_id, COUNT(*) as n FROM task_runs "
            "GROUP BY task_id HAVING n != 1"
        ).fetchall()
        if bad_runs:
            for row in bad_runs:
                failures.append(
                    f"WRONG RUN COUNT: task {row['task_id']} has {row['n']} runs (expected 1)"
                )

        open_runs = conn.execute(
            "SELECT id, task_id FROM task_runs WHERE ended_at IS NULL"
        ).fetchall()
        for row in open_runs:
            failures.append(f"OPEN RUN: run {row['id']} on task {row['task_id']}")

        wrong_outcomes = conn.execute(
            "SELECT task_id, outcome FROM task_runs "
            "WHERE outcome IS NULL OR outcome != 'completed'"
        ).fetchall()
        for row in wrong_outcomes:
            failures.append(
                f"WRONG OUTCOME: task {row['task_id']} run outcome={row['outcome']}"
            )

        # Check 5: event counts — exactly NUM_TASKS completed events
        completed_events = conn.execute(
            "SELECT COUNT(*) as n FROM task_events WHERE kind='completed'"
        ).fetchone()["n"]
        if completed_events != NUM_TASKS:
            failures.append(
                f"EVENT COUNT MISMATCH: {completed_events} completed events "
                f"expected {NUM_TASKS}"
            )

        # Check 6: count SQLite errors that escaped retry
        sqlite_errs = sum(
            1 for e in all_events if e["kind"].startswith("sqlite_err")
        )
        if sqlite_errs > 0:
            failures.append(f"UNRETRIED SQLITE ERRORS: {sqlite_errs}")

    finally:
        conn.close()

    # ============ STATS ============
    print()
    total_claims = sum(1 for e in all_events if e["kind"] == "claimed")
    total_completes = sum(1 for e in all_events if e["kind"] == "completed")
    total_lost_races = sum(1 for e in all_events if e["kind"] == "lost_claim_race")

    per_worker = {}
    for e in all_events:
        if e["kind"] == "completed":
            per_worker.setdefault(e["worker"], 0)
            per_worker[e["worker"]] += 1

    print(f"Total claims: {total_claims}")
    print(f"Total completes: {total_completes}")
    print(f"Lost claim races: {total_lost_races} (expected contention; not a bug)")
    print(f"Elapsed: {elapsed:.2f}s")
    print(f"Throughput: {NUM_TASKS/elapsed:.1f} tasks/sec")
    print(f"Per-worker completions:")
    for w in sorted(per_worker.keys()):
        print(f"  worker-{w}: {per_worker[w]}")

    if failures:
        print()
        print("=" * 60)
        print(f"FAILURES ({len(failures)}):")
        print("=" * 60)
        for f in failures[:20]:
            print(f"  {f}")
        if len(failures) > 20:
            print(f"  ... and {len(failures) - 20} more")
        sys.exit(1)
    else:
        print()
        print("✔ ALL INVARIANTS HELD")


if __name__ == "__main__":
    main()
@@ -0,0 +1,350 @@
"""Harder concurrency stress: mixed operations + larger scale.

Scales to 500 tasks, 10 workers, 30s runtime (RUN_DURATION_S). Each
worker randomly:
  - unblocks a random blocked task (10%)
  - archives a random done task (5%)
  - claims a ready task (85%), then either blocks it with a reason
    (20% of claims, ~17% overall) or completes it (80% of claims,
    ~68% overall)

Adds a background "reclaimer" process that calls release_stale_claims
every 200ms, racing against the workers to surface TTL reclaim races.

Pass criteria: runs invariant holds, no double-completions, no orphan
runs, no SQLite errors escape the retry layer.
"""

import json
import multiprocessing as mp
import os
import random
import sqlite3
import sys
import tempfile
import time
from pathlib import Path

NUM_WORKERS = 10
NUM_TASKS = 500
RUN_DURATION_S = 30
WT = str(Path(__file__).resolve().parents[2])


def worker_loop(worker_id: int, hermes_home: str, result_file: str) -> None:
    os.environ["HERMES_HOME"] = hermes_home
    os.environ["HOME"] = hermes_home
    sys.path.insert(0, WT)
    from hermes_cli import kanban_db as kb

    events = []
    start = time.monotonic()
    idle_rounds = 0

    while time.monotonic() - start < RUN_DURATION_S:
        conn = kb.connect()
        try:
            op = random.random()

            if op < 0.10:
                # Try to unblock a blocked task.
                row = conn.execute(
                    "SELECT id FROM tasks WHERE status='blocked' "
                    "ORDER BY RANDOM() LIMIT 1"
                ).fetchone()
                if row:
                    try:
                        ok = kb.unblock_task(conn, row["id"])
                        events.append({"kind": "unblocked" if ok else "unblock_noop",
                                       "task": row["id"], "worker": worker_id})
                    except sqlite3.OperationalError as e:
                        events.append({"kind": "sqlite_err", "op": "unblock",
                                       "task": row["id"], "err": str(e)[:100]})
                continue

            if op < 0.15:
                # Try to archive a done task.
                row = conn.execute(
                    "SELECT id FROM tasks WHERE status='done' "
                    "ORDER BY RANDOM() LIMIT 1"
                ).fetchone()
                if row:
                    try:
                        kb.archive_task(conn, row["id"])
                        events.append({"kind": "archived", "task": row["id"],
                                       "worker": worker_id})
                    except sqlite3.OperationalError as e:
                        events.append({"kind": "sqlite_err", "op": "archive",
                                       "task": row["id"], "err": str(e)[:100]})
                continue

            # Default: claim + complete-or-block.
            row = conn.execute(
                "SELECT id FROM tasks WHERE status='ready' "
                "AND claim_lock IS NULL LIMIT 1"
            ).fetchone()
            if row is None:
                idle_rounds += 1
                if idle_rounds > 50:
                    break
                time.sleep(0.02)
                continue
            idle_rounds = 0

            tid = row["id"]
            try:
                claimed = kb.claim_task(
                    conn, tid, claimer=f"worker-{worker_id}",
                    ttl_seconds=5,  # short TTL so reclaim races in
                )
            except sqlite3.OperationalError as e:
                events.append({"kind": "sqlite_err", "op": "claim",
                               "task": tid, "err": str(e)[:100]})
                continue
            if claimed is None:
                events.append({"kind": "lost_claim_race", "task": tid})
                continue

            run = kb.latest_run(conn, tid)
            events.append({"kind": "claimed", "task": tid, "worker": worker_id,
                           "run_id": run.id, "t": time.monotonic() - start})

            time.sleep(random.uniform(0.005, 0.05))

            # 20% of the time, block instead of complete
            if random.random() < 0.20:
                try:
                    kb.block_task(conn, tid,
                                  reason=f"blocked by worker-{worker_id}")
                    events.append({"kind": "blocked", "task": tid,
                                   "worker": worker_id, "run_id": run.id})
                except sqlite3.OperationalError as e:
                    events.append({"kind": "sqlite_err", "op": "block",
                                   "task": tid, "err": str(e)[:100]})
            else:
                try:
                    kb.complete_task(
                        conn, tid,
                        result=f"done by worker-{worker_id}",
                        summary=f"worker-{worker_id} ok",
                        metadata={"worker_id": worker_id},
                    )
                    events.append({"kind": "completed", "task": tid,
                                   "worker": worker_id, "run_id": run.id,
                                   "t": time.monotonic() - start})
                except sqlite3.OperationalError as e:
                    events.append({"kind": "sqlite_err", "op": "complete",
                                   "task": tid, "err": str(e)[:100]})
        finally:
            conn.close()

    with open(result_file, "w") as f:
        json.dump(events, f)


def reclaimer_loop(hermes_home: str, result_file: str) -> None:
    """Background dispatcher-like loop that reclaims stale tasks."""
    os.environ["HERMES_HOME"] = hermes_home
    os.environ["HOME"] = hermes_home
    sys.path.insert(0, WT)
    from hermes_cli import kanban_db as kb

    events = []
    start = time.monotonic()
    while time.monotonic() - start < RUN_DURATION_S + 2:
        conn = kb.connect()
        try:
            try:
                reclaimed = kb.release_stale_claims(conn)
                if reclaimed:
                    events.append({"kind": "reclaimed", "count": reclaimed,
                                   "t": time.monotonic() - start})
            except sqlite3.OperationalError as e:
                events.append({"kind": "sqlite_err", "op": "reclaim",
                               "err": str(e)[:100]})
        finally:
            conn.close()
        time.sleep(0.2)

    with open(result_file, "w") as f:
        json.dump(events, f)


def main():
    home = tempfile.mkdtemp(prefix="hermes_mixed_stress_")
    print(f"HERMES_HOME = {home}")

    os.environ["HERMES_HOME"] = home
    os.environ["HOME"] = home
    sys.path.insert(0, WT)
    from hermes_cli import kanban_db as kb

    kb.init_db()
    conn = kb.connect()
    for i in range(NUM_TASKS):
        kb.create_task(
            conn, title=f"t#{i}", assignee="shared", tenant="mixed-stress",
        )
    conn.close()
    print(f"Seeded {NUM_TASKS} tasks, launching {NUM_WORKERS} workers + 1 reclaimer")

    ctx = mp.get_context("spawn")
    worker_results = [f"/tmp/mixed_worker_{i}.json" for i in range(NUM_WORKERS)]
    reclaim_result = "/tmp/mixed_reclaim.json"

    procs = []
    start = time.monotonic()
    for i in range(NUM_WORKERS):
        p = ctx.Process(target=worker_loop, args=(i, home, worker_results[i]))
        p.start()
        procs.append(p)
    r = ctx.Process(target=reclaimer_loop, args=(home, reclaim_result))
    r.start()
    procs.append(r)

    for p in procs:
        p.join(timeout=RUN_DURATION_S + 30)
        if p.is_alive():
            p.terminate()
            p.join()

    elapsed = time.monotonic() - start
    print(f"Done in {elapsed:.1f}s")

    # Aggregate.
    all_events = []
    for i, f in enumerate(worker_results):
        if os.path.isfile(f):
            with open(f) as fh:
                all_events.extend(json.load(fh))
        else:
            print(f"  WORKER {i} died with no result file!")
    reclaim_events = []
    if os.path.isfile(reclaim_result):
        with open(reclaim_result) as fh:
            reclaim_events = json.load(fh)

    # ============ INVARIANT CHECKS ============
|
||||
print()
|
||||
print("=" * 60)
|
||||
print("INVARIANT CHECKS")
|
||||
print("=" * 60)
|
||||
|
||||
failures = []
|
||||
|
||||
# Per-run attribution tracking
|
||||
claims = [e for e in all_events if e["kind"] == "claimed"]
|
||||
completions = [e for e in all_events if e["kind"] == "completed"]
|
||||
blocks = [e for e in all_events if e["kind"] == "blocked"]
|
||||
|
||||
# Every completion must have a matching claim on the same run_id AND
|
||||
# the same worker (workers don't steal each other's runs).
|
||||
claims_by_run = {c["run_id"]: c for c in claims}
|
||||
for comp in completions:
|
||||
claim = claims_by_run.get(comp["run_id"])
|
||||
if claim is None:
|
||||
# The worker may have seen a reclaimed run from another worker. That is
# still a bug: a worker must not be able to complete a run it did not claim.
|
||||
failures.append(
|
||||
f"COMPLETION WITHOUT CLAIM: task {comp['task']} run {comp['run_id']} "
|
||||
f"by worker {comp['worker']}"
|
||||
)
|
||||
elif claim["worker"] != comp["worker"]:
|
||||
failures.append(
|
||||
f"CROSS-WORKER COMPLETION: run {comp['run_id']} claimed by "
|
||||
f"worker {claim['worker']} but completed by worker {comp['worker']}"
|
||||
)
|
||||
|
||||
# SQLite errors that escaped the retry layer
|
||||
sqlite_errs = [e for e in all_events if e["kind"] == "sqlite_err"]
|
||||
if sqlite_errs:
|
||||
for e in sqlite_errs[:5]:
|
||||
failures.append(f"SQLITE ERROR: op={e.get('op')} err={e.get('err')}")
|
||||
if len(sqlite_errs) > 5:
|
||||
failures.append(f" ... and {len(sqlite_errs) - 5} more sqlite errs")
|
||||
|
||||
# DB final state — every task should be in a clean terminal state.
|
||||
conn = kb.connect()
|
||||
try:
|
||||
# Invariant: current_run_id NULL iff latest run is terminal
|
||||
inconsistent = conn.execute("""
|
||||
SELECT t.id, t.status, t.current_run_id
|
||||
FROM tasks t
|
||||
WHERE t.current_run_id IS NOT NULL
|
||||
AND EXISTS (SELECT 1 FROM task_runs r
|
||||
WHERE r.id = t.current_run_id AND r.ended_at IS NOT NULL)
|
||||
""").fetchall()
|
||||
for row in inconsistent:
|
||||
failures.append(
|
||||
f"INVARIANT VIOLATION: task {row['id']} status={row['status']} "
|
||||
f"has current_run_id={row['current_run_id']} but run is ended"
|
||||
)
|
||||
|
||||
# Invariant: no orphan open runs
|
||||
orphans = conn.execute("""
|
||||
SELECT r.id, r.task_id, r.status
|
||||
FROM task_runs r
|
||||
LEFT JOIN tasks t ON t.current_run_id = r.id
|
||||
WHERE r.ended_at IS NULL AND t.id IS NULL
|
||||
""").fetchall()
|
||||
for row in orphans:
|
||||
failures.append(
|
||||
f"ORPHAN OPEN RUN: run {row['id']} on task {row['task_id']}"
|
||||
)
|
||||
|
||||
# Counts — should roughly balance.
|
||||
status_counts = dict(
|
||||
conn.execute("SELECT status, COUNT(*) FROM tasks GROUP BY status").fetchall()
|
||||
)
|
||||
run_outcome_counts = dict(
|
||||
conn.execute(
|
||||
"SELECT outcome, COUNT(*) FROM task_runs "
|
||||
"WHERE ended_at IS NOT NULL GROUP BY outcome"
|
||||
).fetchall()
|
||||
)
|
||||
active_runs = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_runs WHERE ended_at IS NULL"
|
||||
).fetchone()[0]
|
||||
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
# ============ STATS ============
|
||||
print()
|
||||
print(f"Workers: {NUM_WORKERS}, Tasks: {NUM_TASKS}")
|
||||
print(f"Elapsed: {elapsed:.1f}s")
|
||||
print(f"Events collected: {len(all_events)} (+{len(reclaim_events)} reclaim)")
|
||||
print()
|
||||
print("Operations:")
|
||||
op_counts = {}
|
||||
for e in all_events:
|
||||
op_counts[e["kind"]] = op_counts.get(e["kind"], 0) + 1
|
||||
for k in sorted(op_counts.keys()):
|
||||
print(f" {k:<25} {op_counts[k]}")
|
||||
|
||||
print()
|
||||
print("Final task status:")
|
||||
for s, n in sorted(status_counts.items()):
|
||||
print(f" {s:<10} {n}")
|
||||
print("Final run outcomes:")
|
||||
for o, n in sorted(run_outcome_counts.items(), key=lambda x: (x[0] or '',)):
|
||||
print(f" {str(o):<12} {n}")
|
||||
print(f" active {active_runs}")
|
||||
|
||||
if failures:
|
||||
print()
|
||||
print("=" * 60)
|
||||
print(f"FAILURES ({len(failures)}):")
|
||||
print("=" * 60)
|
||||
for f in failures[:30]:
|
||||
print(f" {f}")
|
||||
if len(failures) > 30:
|
||||
print(f" ... and {len(failures) - 30} more")
|
||||
sys.exit(1)
|
||||
else:
|
||||
print()
|
||||
print("✔ ALL INVARIANTS HELD UNDER MIXED STRESS")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
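The attribution invariant checked in this script (every completion must match a claim on the same run_id by the same worker) can be isolated as a pure function over the collected event dicts. A minimal sketch, assuming the event shape used above (`kind`, `task`, `worker`, `run_id`); `check_attribution` is a hypothetical helper name and is not part of `kanban_db`:

```python
def check_attribution(events):
    """Return failure strings for completions lacking a matching claim."""
    # Index claims by run_id, as the script does with claims_by_run.
    claims_by_run = {e["run_id"]: e for e in events if e["kind"] == "claimed"}
    failures = []
    for comp in (e for e in events if e["kind"] == "completed"):
        claim = claims_by_run.get(comp["run_id"])
        if claim is None:
            failures.append(f"COMPLETION WITHOUT CLAIM: run {comp['run_id']}")
        elif claim["worker"] != comp["worker"]:
            failures.append(f"CROSS-WORKER COMPLETION: run {comp['run_id']}")
    return failures


# Made-up events for illustration: one clean run, one stolen run, one
# completion with no claim at all.
events = [
    {"kind": "claimed", "task": "t1", "worker": 0, "run_id": 1},
    {"kind": "completed", "task": "t1", "worker": 0, "run_id": 1},
    {"kind": "claimed", "task": "t2", "worker": 1, "run_id": 2},
    {"kind": "completed", "task": "t2", "worker": 2, "run_id": 2},
    {"kind": "completed", "task": "t3", "worker": 3, "run_id": 9},
]
for line in check_attribution(events):
    print(line)
```

Keeping the check free of database access makes it easy to unit-test separately from the multi-process run.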
|
||||
@@ -0,0 +1,241 @@
|
||||
"""Target the reclaim race specifically.
|
||||
|
||||
Workers claim tasks with a 1s TTL but sleep about 2s before completing. The
reclaimer runs every 200ms. Scenario: a worker claims a task, the reclaimer
expires the claim mid-work, and the worker then tries to complete AFTER its
run has been reclaimed.

Expected behavior: complete_task does not check claim_lock. It only
transitions the task from 'running' to 'done'. If the reclaimer has already
moved the task back to 'ready', the late worker's complete_task fails because
the CAS on status='running' no longer matches. This is the CORRECT behavior.

Invariant being tested: a race between worker.complete and
dispatcher.reclaim must not produce a double run close or any other
inconsistency.
|
||||
"""
|
||||
|
||||
import json
|
||||
import multiprocessing as mp
|
||||
import os
|
||||
import random
|
||||
import sqlite3
|
||||
import sys
|
||||
import tempfile
|
||||
import time
|
||||
from pathlib import Path
|
||||
|
||||
NUM_WORKERS = 5
|
||||
NUM_TASKS = 50
|
||||
TTL = 1
|
||||
WORK_DURATION_S = 2.0 # longer than TTL => reclaimer wins
|
||||
WT = str(Path(__file__).resolve().parents[2])
|
||||
|
||||
|
||||
def worker_loop(worker_id: int, hermes_home: str, result_file: str) -> None:
|
||||
os.environ["HERMES_HOME"] = hermes_home
|
||||
os.environ["HOME"] = hermes_home
|
||||
sys.path.insert(0, WT)
|
||||
from hermes_cli import kanban_db as kb
|
||||
|
||||
events = []
|
||||
start = time.monotonic()
|
||||
idle = 0
|
||||
|
||||
while time.monotonic() - start < 40:
|
||||
conn = kb.connect()
|
||||
try:
|
||||
row = conn.execute(
|
||||
"SELECT id FROM tasks WHERE status='ready' AND claim_lock IS NULL LIMIT 1"
|
||||
).fetchone()
|
||||
if row is None:
|
||||
idle += 1
|
||||
if idle > 30:
|
||||
break
|
||||
time.sleep(0.05)
|
||||
continue
|
||||
idle = 0
|
||||
tid = row["id"]
|
||||
try:
|
||||
claimed = kb.claim_task(conn, tid, claimer=f"worker-{worker_id}",
|
||||
ttl_seconds=TTL)
|
||||
except sqlite3.OperationalError as e:
|
||||
events.append({"kind": "sqlite_err", "op": "claim", "err": str(e)[:100]})
|
||||
continue
|
||||
if claimed is None:
|
||||
events.append({"kind": "lost_claim", "task": tid})
|
||||
continue
|
||||
run = kb.latest_run(conn, tid)
|
||||
events.append({"kind": "claimed", "task": tid, "worker": worker_id,
|
||||
"run_id": run.id})
|
||||
|
||||
# Sleep longer than TTL so reclaimer has a chance to intervene
|
||||
time.sleep(WORK_DURATION_S + random.uniform(-0.3, 0.3))
|
||||
|
||||
try:
|
||||
ok = kb.complete_task(
|
||||
conn, tid,
|
||||
result=f"by worker-{worker_id}",
|
||||
summary=f"worker-{worker_id} finished",
|
||||
)
|
||||
events.append({"kind": "complete_ok" if ok else "complete_refused",
|
||||
"task": tid, "worker": worker_id, "run_id": run.id})
|
||||
except sqlite3.OperationalError as e:
|
||||
events.append({"kind": "sqlite_err", "op": "complete", "err": str(e)[:100]})
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
with open(result_file, "w") as f:
|
||||
json.dump(events, f)
|
||||
|
||||
|
||||
def reclaimer_loop(hermes_home: str, result_file: str) -> None:
|
||||
os.environ["HERMES_HOME"] = hermes_home
|
||||
os.environ["HOME"] = hermes_home
|
||||
sys.path.insert(0, WT)
|
||||
from hermes_cli import kanban_db as kb
|
||||
|
||||
events = []
|
||||
start = time.monotonic()
|
||||
while time.monotonic() - start < 42:
|
||||
conn = kb.connect()
|
||||
try:
|
||||
try:
|
||||
n = kb.release_stale_claims(conn)
|
||||
if n:
|
||||
events.append({"kind": "reclaimed", "count": n,
|
||||
"t": time.monotonic() - start})
|
||||
except sqlite3.OperationalError as e:
|
||||
events.append({"kind": "sqlite_err", "err": str(e)[:100]})
|
||||
finally:
|
||||
conn.close()
|
||||
time.sleep(0.2)
|
||||
with open(result_file, "w") as f:
|
||||
json.dump(events, f)
|
||||
|
||||
|
||||
def main():
|
||||
home = tempfile.mkdtemp(prefix="hermes_reclaim_race_")
|
||||
os.environ["HERMES_HOME"] = home
|
||||
os.environ["HOME"] = home
|
||||
sys.path.insert(0, WT)
|
||||
from hermes_cli import kanban_db as kb
|
||||
|
||||
kb.init_db()
|
||||
conn = kb.connect()
|
||||
for i in range(NUM_TASKS):
|
||||
kb.create_task(conn, title=f"t{i}", assignee="shared",
|
||||
tenant="reclaim-race")
|
||||
conn.close()
|
||||
print(f"Seeded {NUM_TASKS} tasks. TTL={TTL}s, work_duration={WORK_DURATION_S}s")
|
||||
print("(worker work > TTL guarantees reclaims)")
|
||||
|
||||
ctx = mp.get_context("spawn")
|
||||
worker_results = [f"/tmp/rc_worker_{i}.json" for i in range(NUM_WORKERS)]
|
||||
reclaim_result = "/tmp/rc_reclaim.json"
|
||||
procs = []
|
||||
for i in range(NUM_WORKERS):
|
||||
p = ctx.Process(target=worker_loop, args=(i, home, worker_results[i]))
|
||||
p.start()
|
||||
procs.append(p)
|
||||
r = ctx.Process(target=reclaimer_loop, args=(home, reclaim_result))
|
||||
r.start()
|
||||
procs.append(r)
|
||||
|
||||
for p in procs:
|
||||
p.join(timeout=60)
|
||||
if p.is_alive():
|
||||
p.terminate()
|
||||
p.join()
|
||||
|
||||
# Aggregate.
|
||||
all_events = []
|
||||
for f in worker_results:
|
||||
if os.path.isfile(f):
|
||||
with open(f) as fh:
|
||||
all_events.extend(json.load(fh))
|
||||
reclaim_events = []
|
||||
if os.path.isfile(reclaim_result):
|
||||
with open(reclaim_result) as fh:
|
||||
reclaim_events = json.load(fh)
|
||||
|
||||
op_counts = {}
|
||||
for e in all_events:
|
||||
op_counts[e["kind"]] = op_counts.get(e["kind"], 0) + 1
|
||||
total_reclaims = sum(e.get("count", 0) for e in reclaim_events)
|
||||
print(f"\nReclaimer fired {len(reclaim_events)} times, total tasks reclaimed: {total_reclaims}")
|
||||
print("Worker events:")
|
||||
for k in sorted(op_counts):
|
||||
print(f" {k:<25} {op_counts[k]}")
|
||||
|
||||
# Invariant checks
|
||||
failures = []
|
||||
conn = kb.connect()
|
||||
try:
|
||||
# Any task stuck with current_run_id pointing at a closed run?
|
||||
bad = conn.execute("""
|
||||
SELECT t.id, t.status, t.current_run_id, r.ended_at, r.outcome
|
||||
FROM tasks t
|
||||
JOIN task_runs r ON r.id = t.current_run_id
|
||||
WHERE r.ended_at IS NOT NULL
|
||||
""").fetchall()
|
||||
for row in bad:
|
||||
failures.append(
|
||||
f"INVARIANT VIOLATION: task {row['id']} status={row['status']} "
|
||||
f"current_run_id={row['current_run_id']} but run ended "
|
||||
f"outcome={row['outcome']}"
|
||||
)
|
||||
# Every run with NULL ended_at should still have the task pointing at it
|
||||
orphans = conn.execute("""
|
||||
SELECT r.id, r.task_id
|
||||
FROM task_runs r
|
||||
LEFT JOIN tasks t ON t.current_run_id = r.id
|
||||
WHERE r.ended_at IS NULL AND t.id IS NULL
|
||||
""").fetchall()
|
||||
for row in orphans:
|
||||
failures.append(f"ORPHAN OPEN RUN: run {row['id']} on task {row['task_id']}")
|
||||
# Event counts
|
||||
claim_evts = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_events WHERE kind='claimed'").fetchone()[0]
|
||||
reclaim_evts = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_events WHERE kind='reclaimed'").fetchone()[0]
|
||||
comp_evts = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_events WHERE kind='completed'").fetchone()[0]
|
||||
print(f"\nDB event counts: claimed={claim_evts} reclaimed={reclaim_evts} completed={comp_evts}")
|
||||
# Every reclaimed run must have ended_at set
|
||||
unended_reclaims = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_runs WHERE outcome='reclaimed' AND ended_at IS NULL"
|
||||
).fetchone()[0]
|
||||
if unended_reclaims:
|
||||
failures.append(f"UNENDED RECLAIMED RUNS: {unended_reclaims}")
|
||||
# Count of completed runs
|
||||
comp_runs = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_runs WHERE outcome='completed'"
|
||||
).fetchone()[0]
|
||||
reclaim_runs = conn.execute(
|
||||
"SELECT COUNT(*) FROM task_runs WHERE outcome='reclaimed'"
|
||||
).fetchone()[0]
|
||||
print(f"DB run outcomes: completed={comp_runs} reclaimed={reclaim_runs}")
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
if reclaim_runs == 0:
|
||||
failures.append("NO RECLAIMS HAPPENED — test didn't stress what it was supposed to")
|
||||
|
||||
if failures:
|
||||
print(f"\nFAILURES ({len(failures)}):")
|
||||
for f in failures[:20]:
|
||||
print(f" {f}")
|
||||
sys.exit(1)
|
||||
else:
|
||||
print("\n✔ RECLAIM RACE INVARIANTS HELD")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
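The module docstring for this script relies on complete_task being a compare-and-swap on status='running'. The sketch below shows that pattern against an in-memory SQLite database. It is an illustration under assumptions: the two-column `tasks` table and the `complete_if_running` helper are hypothetical and much simpler than the real `kanban_db` schema and API.

```python
import sqlite3


def complete_if_running(conn, task_id):
    """Compare-and-swap: mark the task done only if it is still 'running'."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'done' WHERE id = ? AND status = 'running'",
        (task_id,),
    )
    conn.commit()
    # rowcount is 0 when the WHERE clause no longer matches.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO tasks VALUES ('t1', 'running'), ('t2', 'running')")

# Simulate the reclaimer expiring t1's claim before the worker finishes.
conn.execute("UPDATE tasks SET status = 'ready' WHERE id = 't1'")

late = complete_if_running(conn, "t1")     # refused: t1 is 'ready' again
on_time = complete_if_running(conn, "t2")  # accepted: t2 is still 'running'
print(late, on_time)
```

Because the status test and the update happen in one statement, the late worker cannot overwrite the state the reclaimer wrote.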
|
||||
@@ -0,0 +1,283 @@
|
||||
"""Randomized property testing for the Kanban kernel.
|
||||
|
||||
Generates NUM_SEQUENCES (500) random operation sequences of up to
OPS_PER_SEQUENCE (100) ops each on small task graphs. After each step, checks
the full invariant set:
|
||||
|
||||
I1. If tasks.current_run_id IS NOT NULL, the run MUST exist AND
|
||||
ended_at MUST be NULL (we never point at a closed run).
|
||||
I2. If a run has ended_at NULL, SOME task MUST have current_run_id
|
||||
pointing at it (no orphan open runs).
|
||||
I3. task.status in the valid set {triage, todo, ready, running,
|
||||
blocked, done, archived}.
|
||||
I4. task.claim_lock is set only while status is 'running' (the check
does not require every running task to hold a lock).
|
||||
I5. Every run has started_at <= ended_at (or ended_at is NULL).
|
||||
I6. If outcome is set, ended_at must also be set.
|
||||
I7. Events are strictly monotonic in (created_at, id). Not checked
here; the autoincrement id is relied on to guarantee it.
|
||||
I8. task_events.run_id references a task_runs.id that exists
|
||||
(or is NULL).
|
||||
I9. Parent completion invariant: if all parents are 'done', the
|
||||
child cannot be in 'todo' status (recompute_ready should have
|
||||
promoted it). This is called out in the comment on
|
||||
recompute_ready; verify it holds after every random seq.
|
||||
|
||||
Not using hypothesis the lib; just Python random for simplicity.
|
||||
"""
|
||||
|
||||
import os
|
||||
import random
|
||||
import sys
|
||||
import tempfile
|
||||
import time
|
||||
from pathlib import Path
|
||||
|
||||
WT = str(Path(__file__).resolve().parents[2])
|
||||
NUM_SEQUENCES = 500
|
||||
OPS_PER_SEQUENCE = 100
|
||||
TASK_POOL = 10
|
||||
|
||||
OPS = [
|
||||
"create", "create_child", "claim", "complete", "block", "unblock",
|
||||
"archive", "heartbeat", "release_stale", "detect_crashed",
|
||||
"recompute_ready", "reassign",
|
||||
]
|
||||
|
||||
|
||||
def assert_invariants(conn, kb, ops_log):
|
||||
"""Run all invariant checks; raise AssertionError with context on any."""
|
||||
failures = []
|
||||
|
||||
# I1: current_run_id → run exists and not ended
|
||||
bad_ptr = conn.execute("""
|
||||
SELECT t.id, t.current_run_id, r.ended_at, r.outcome
|
||||
FROM tasks t
|
||||
LEFT JOIN task_runs r ON r.id = t.current_run_id
|
||||
WHERE t.current_run_id IS NOT NULL
|
||||
AND (r.id IS NULL OR r.ended_at IS NOT NULL)
|
||||
""").fetchall()
|
||||
for row in bad_ptr:
|
||||
if row["ended_at"] is None and row["outcome"] is None:
|
||||
detail = "missing"
|
||||
else:
|
||||
detail = f"closed ({row['outcome']})"
|
||||
failures.append(
|
||||
f"I1: task {row['id']} points at run {row['current_run_id']} "
|
||||
f"which is {detail}"
|
||||
)
|
||||
|
||||
# I2: open run → some task points at it
|
||||
orphans = conn.execute("""
|
||||
SELECT r.id, r.task_id
|
||||
FROM task_runs r
|
||||
WHERE r.ended_at IS NULL
|
||||
AND NOT EXISTS (SELECT 1 FROM tasks t WHERE t.current_run_id = r.id)
|
||||
""").fetchall()
|
||||
for row in orphans:
|
||||
failures.append(f"I2: open run {row['id']} on task {row['task_id']} has no pointer")
|
||||
|
||||
# I3: valid statuses
|
||||
valid = {"triage", "todo", "ready", "running", "blocked", "done", "archived"}
|
||||
bad_status = conn.execute("SELECT id, status FROM tasks").fetchall()
|
||||
for row in bad_status:
|
||||
if row["status"] not in valid:
|
||||
failures.append(f"I3: task {row['id']} has invalid status {row['status']!r}")
|
||||
|
||||
# I4: claim_lock set only when running
|
||||
bad_lock = conn.execute("""
|
||||
SELECT id, status, claim_lock FROM tasks
|
||||
WHERE (status != 'running' AND claim_lock IS NOT NULL)
|
||||
""").fetchall()
|
||||
for row in bad_lock:
|
||||
failures.append(
|
||||
f"I4: task {row['id']} status={row['status']} but claim_lock={row['claim_lock']!r}"
|
||||
)
|
||||
|
||||
# I5: run started_at <= ended_at
|
||||
bad_times = conn.execute("""
|
||||
SELECT id, started_at, ended_at FROM task_runs
|
||||
WHERE ended_at IS NOT NULL AND started_at > ended_at
|
||||
""").fetchall()
|
||||
for row in bad_times:
|
||||
failures.append(
|
||||
f"I5: run {row['id']} started_at={row['started_at']} > ended_at={row['ended_at']}"
|
||||
)
|
||||
|
||||
# I6: outcome set → ended_at set
|
||||
bad_outcome = conn.execute("""
|
||||
SELECT id, outcome, ended_at FROM task_runs
|
||||
WHERE outcome IS NOT NULL AND ended_at IS NULL
|
||||
""").fetchall()
|
||||
for row in bad_outcome:
|
||||
failures.append(f"I6: run {row['id']} outcome={row['outcome']} but ended_at NULL")
|
||||
|
||||
# I7: events monotonic in id (always true for autoincrement)
|
||||
# Skip — autoincrement guarantees it.
|
||||
|
||||
# I8: event.run_id references existing run
|
||||
bad_ev_fk = conn.execute("""
|
||||
SELECT e.id, e.run_id FROM task_events e
|
||||
LEFT JOIN task_runs r ON r.id = e.run_id
|
||||
WHERE e.run_id IS NOT NULL AND r.id IS NULL
|
||||
""").fetchall()
|
||||
for row in bad_ev_fk:
|
||||
failures.append(f"I8: event {row['id']} references missing run {row['run_id']}")
|
||||
|
||||
# I9: if all parents done → child not in todo
|
||||
# (Only applies to children with at least one parent)
|
||||
orphaned_todo = conn.execute("""
|
||||
SELECT c.id AS child_id,
|
||||
COUNT(*) AS n_parents,
|
||||
SUM(CASE WHEN p.status = 'done' THEN 1 ELSE 0 END) AS done_parents
|
||||
FROM tasks c
|
||||
JOIN task_links l ON l.child_id = c.id
|
||||
JOIN tasks p ON p.id = l.parent_id
|
||||
WHERE c.status = 'todo'
|
||||
GROUP BY c.id
|
||||
HAVING n_parents > 0 AND n_parents = done_parents
|
||||
""").fetchall()
|
||||
for row in orphaned_todo:
|
||||
failures.append(
|
||||
f"I9: task {row['child_id']} is todo but all {row['n_parents']} parents are done"
|
||||
)
|
||||
|
||||
if failures:
|
||||
print(f"\n!!! INVARIANT VIOLATION after {len(ops_log)} ops:")
|
||||
for f in failures[:10]:
|
||||
print(f" {f}")
|
||||
if len(failures) > 10:
|
||||
print(f" ... and {len(failures) - 10} more")
|
||||
print("\nLast 10 ops:")
|
||||
for op in ops_log[-10:]:
|
||||
print(f" {op}")
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
def random_op(rng, conn, kb, task_pool):
|
||||
op = rng.choice(OPS)
|
||||
|
||||
if op == "create":
|
||||
tid = kb.create_task(
|
||||
conn,
|
||||
title=f"rand {rng.randint(0, 1000)}",
|
||||
assignee=rng.choice(["w1", "w2", "w3", None]),
|
||||
)
|
||||
task_pool.append(tid)
|
||||
return {"op": "create", "tid": tid}
|
||||
|
||||
if op == "create_child" and task_pool:
|
||||
parent = rng.choice(task_pool)
|
||||
tid = kb.create_task(
|
||||
conn, title=f"child of {parent}",
|
||||
assignee=rng.choice(["w1", "w2", "w3", None]),
|
||||
parents=[parent],
|
||||
)
|
||||
task_pool.append(tid)
|
||||
return {"op": "create_child", "tid": tid, "parent": parent}
|
||||
|
||||
if not task_pool:
|
||||
return None
|
||||
|
||||
tid = rng.choice(task_pool)
|
||||
task = kb.get_task(conn, tid)
|
||||
if task is None:
|
||||
task_pool.remove(tid)
|
||||
return None
|
||||
|
||||
if op == "claim":
|
||||
claimed = kb.claim_task(conn, tid, ttl_seconds=rng.choice([1, 3, 10]))
|
||||
return {"op": "claim", "tid": tid, "ok": claimed is not None}
|
||||
if op == "complete":
|
||||
summary = rng.choice([None, f"done via op {rng.randint(0, 1000)}"])
|
||||
ok = kb.complete_task(conn, tid, summary=summary)
|
||||
return {"op": "complete", "tid": tid, "ok": ok}
|
||||
if op == "block":
|
||||
reason = rng.choice([None, "rand block"])
|
||||
ok = kb.block_task(conn, tid, reason=reason)
|
||||
return {"op": "block", "tid": tid, "ok": ok}
|
||||
if op == "unblock":
|
||||
ok = kb.unblock_task(conn, tid)
|
||||
return {"op": "unblock", "tid": tid, "ok": ok}
|
||||
if op == "archive":
|
||||
ok = kb.archive_task(conn, tid)
|
||||
if ok:
|
||||
task_pool.remove(tid)
|
||||
return {"op": "archive", "tid": tid, "ok": ok}
|
||||
if op == "heartbeat":
|
||||
ok = kb.heartbeat_worker(conn, tid)
|
||||
return {"op": "heartbeat", "tid": tid, "ok": ok}
|
||||
if op == "release_stale":
|
||||
n = kb.release_stale_claims(conn)
|
||||
return {"op": "release_stale", "n": n}
|
||||
if op == "detect_crashed":
|
||||
# No fake PID is killed beforehand, so this normally finds nothing to detect.
|
||||
crashed = kb.detect_crashed_workers(conn)
|
||||
return {"op": "detect_crashed", "n": len(crashed)}
|
||||
if op == "recompute_ready":
|
||||
n = kb.recompute_ready(conn)
|
||||
return {"op": "recompute_ready", "promoted": n}
|
||||
if op == "reassign":
|
||||
# Reassignment isn't a direct API; simulate via assign_task
|
||||
new_a = rng.choice(["w1", "w2", "w3", None])
|
||||
try:
|
||||
kb.assign_task(conn, tid, new_a)
|
||||
return {"op": "reassign", "tid": tid, "to": new_a}
|
||||
except Exception as e:
|
||||
return {"op": "reassign", "tid": tid, "err": str(e)[:50]}
|
||||
|
||||
return None
|
||||
|
||||
|
||||
def main():
|
||||
total_ops = 0
|
||||
total_violations = 0
|
||||
|
||||
for seq_idx in range(NUM_SEQUENCES):
|
||||
seed = random.randint(0, 10**9)
|
||||
rng = random.Random(seed)
|
||||
home = tempfile.mkdtemp(prefix=f"hermes_fuzz_{seq_idx}_")
|
||||
os.environ["HERMES_HOME"] = home
|
||||
os.environ["HOME"] = home
|
||||
sys.path.insert(0, WT)
|
||||
|
||||
# Fresh module state per sequence to avoid cached init paths.
|
||||
for m in list(sys.modules.keys()):
|
||||
if m.startswith("hermes_cli"):
|
||||
del sys.modules[m]
|
||||
from hermes_cli import kanban_db as kb
|
||||
|
||||
kb.init_db()
|
||||
conn = kb.connect()
|
||||
task_pool = []
|
||||
ops_log = []
|
||||
|
||||
try:
|
||||
for i in range(OPS_PER_SEQUENCE):
|
||||
result = random_op(rng, conn, kb, task_pool)
|
||||
if result is None:
|
||||
continue
|
||||
ops_log.append(result)
|
||||
total_ops += 1
|
||||
if not assert_invariants(conn, kb, ops_log):
|
||||
total_violations += 1
|
||||
print(f" sequence {seq_idx} (seed={seed}) failed at op {i}")
|
||||
break
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
if seq_idx % 10 == 0:
|
||||
print(f" seq {seq_idx:3d}: {total_ops} ops so far, {total_violations} violations")
|
||||
|
||||
print()
|
||||
print("=" * 60)
|
||||
print(f"Total sequences: {NUM_SEQUENCES}")
|
||||
print(f"Total operations: {total_ops}")
|
||||
print(f"Invariant violations: {total_violations}")
|
||||
if total_violations == 0:
|
||||
print("\n✔ ALL INVARIANTS HELD ACROSS RANDOMIZED SEQUENCES")
|
||||
else:
|
||||
print("\n✗ INVARIANT VIOLATIONS FOUND")
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
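The fuzz loop prints the per-sequence seed on failure, which allows a failing sequence to be replayed. A minimal sketch of why that works, using only the standard `random` module; the shortened `OPS` list and the `op_sequence` helper are hypothetical and stand in for the real `random_op` driver:

```python
import random

# Shortened, illustrative op list; the real script has more entries.
OPS = ["create", "claim", "complete", "block", "unblock"]


def op_sequence(seed, length):
    """Return the op names that a sequence with this seed would draw."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice(OPS) for _ in range(length)]


first = op_sequence(seed=12345, length=20)
replay = op_sequence(seed=12345, length=20)
print(first == replay)
```

An exact replay of a failing run also assumes the kernel itself behaves deterministically for a given op sequence; TTL-based operations such as stale-claim release depend on wall-clock timing and may not.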
|
||||
@@ -0,0 +1,228 @@
|
||||
"""E2E: dispatcher spawns real Python subprocess workers.
|
||||
|
||||
This validates the IPC + lifecycle story that mocks can't:
|
||||
- spawn_fn returns a real PID
|
||||
- the child process resolves hermes_cli.kanban_db on its own
|
||||
- the child writes heartbeats via the CLI (real argparse, real init_db)
|
||||
- the child completes via the CLI with --summary + --metadata
|
||||
- the dispatcher observes all of this through the DB only
|
||||
- worker logs are captured to HERMES_HOME/kanban/logs/<task>.log
|
||||
- crash detection works against a real dead PID
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
import tempfile
|
||||
import time
from pathlib import Path
|
||||
|
||||
WT = str(Path(__file__).resolve().parents[2])
|
||||
FAKE_WORKER = str(Path(__file__).parent / "_fake_worker.py")
|
||||
PY = sys.executable
|
||||
|
||||
|
||||
def make_spawn_fn(home: str):
    """Return a spawn_fn the dispatcher can call. Launches the fake
    worker as a detached subprocess."""

    def _spawn(task, workspace):
        log_path = os.path.join(home, f"worker_{task.id}.log")
        env = {
            **os.environ,
            "HERMES_HOME": home,
            "HOME": home,
            "PYTHONPATH": WT,
            "HERMES_KANBAN_TASK": task.id,
            "HERMES_KANBAN_WORKSPACE": workspace,
            "PATH": f"{os.path.dirname(PY)}:{os.environ.get('PATH','')}",
        }
        # The child inherits the log descriptor, so the parent's handle can
        # be closed as soon as Popen returns.
        with open(log_path, "ab") as log_f:
            proc = subprocess.Popen(
                [PY, FAKE_WORKER],
                stdin=subprocess.DEVNULL,
                stdout=log_f,
                stderr=subprocess.STDOUT,
                env=env,
                start_new_session=True,
            )
        return proc.pid

    return _spawn
|
||||
|
||||
|
||||
def main():
|
||||
home = tempfile.mkdtemp(prefix="hermes_e2e_")
|
||||
os.environ["HERMES_HOME"] = home
|
||||
os.environ["HOME"] = home
|
||||
sys.path.insert(0, WT)
|
||||
from hermes_cli import kanban_db as kb
|
||||
|
||||
# Child processes invoke the `hermes` CLI; point it at the worktree's
# hermes_cli.main by putting a shim on PATH.
|
||||
shim_dir = os.path.join(home, "bin")
|
||||
os.makedirs(shim_dir, exist_ok=True)
|
||||
shim_path = os.path.join(shim_dir, "hermes")
|
||||
with open(shim_path, "w") as f:
|
||||
f.write(f"""#!/bin/sh
|
||||
exec {PY} -m hermes_cli.main "$@"
|
||||
""")
|
||||
os.chmod(shim_path, 0o755)
|
||||
os.environ["PATH"] = f"{shim_dir}:{os.environ.get('PATH','')}"
|
||||
|
||||
kb.init_db()
|
||||
conn = kb.connect()
|
||||
|
||||
# ============ SCENARIO A: happy path, 3 tasks ============
|
||||
print("=" * 60)
|
||||
print("A. Real-subprocess happy path (3 tasks)")
|
||||
print("=" * 60)
|
||||
|
||||
tids = []
|
||||
for i in range(3):
|
||||
tid = kb.create_task(
|
||||
conn, title=f"real-e2e-{i}", assignee="worker",
|
||||
)
|
||||
tids.append(tid)
|
||||
|
||||
spawn_fn = make_spawn_fn(home)
|
||||
result = kb.dispatch_once(conn, spawn_fn=spawn_fn)
|
||||
print(f" dispatched: {len(result.spawned)} spawned")
|
||||
spawned_pids = []
|
||||
# The dispatcher sets worker_pid on each claimed task via _set_worker_pid.
|
||||
for tid in tids:
|
||||
task = kb.get_task(conn, tid)
|
||||
spawned_pids.append(task.worker_pid)
|
||||
print(f" task {tid}: pid={task.worker_pid} status={task.status}")
|
||||
|
||||
# Wait for all workers to complete (up to 10s).
|
||||
deadline = time.monotonic() + 10
|
||||
while time.monotonic() < deadline:
|
||||
statuses = [kb.get_task(conn, tid).status for tid in tids]
|
||||
if all(s == "done" for s in statuses):
|
||||
break
|
||||
time.sleep(0.2)
|
||||
|
||||
print()
|
||||
failures = []
|
||||
for tid in tids:
|
||||
task = kb.get_task(conn, tid)
|
||||
runs = kb.list_runs(conn, tid)
|
||||
print(f" task {tid}: status={task.status}, current_run_id={task.current_run_id}, "
|
||||
f"runs={[(r.id, r.outcome) for r in runs]}")
|
||||
if task.status != "done":
|
||||
failures.append(f"task {tid} not done: status={task.status}")
|
||||
if task.current_run_id is not None:
|
||||
failures.append(f"task {tid} has dangling current_run_id={task.current_run_id}")
|
||||
if len(runs) != 1:
|
||||
failures.append(f"task {tid} has {len(runs)} runs, expected 1")
|
||||
else:
|
||||
r = runs[0]
|
||||
if r.outcome != "completed":
|
||||
failures.append(f"task {tid} run outcome={r.outcome}, expected completed")
|
||||
if not r.summary or "real-subprocess worker finished" not in r.summary:
|
||||
failures.append(f"task {tid} summary missing: {r.summary!r}")
|
||||
if not r.metadata or r.metadata.get("iterations") != 3:
|
||||
failures.append(f"task {tid} metadata missing iterations: {r.metadata}")
|
||||
# Heartbeat events should be present
|
||||
events = kb.list_events(conn, tid)
|
||||
heartbeats = [e for e in events if e.kind == "heartbeat"]
|
||||
if len(heartbeats) < 3: # lenient lower bound; start + 3 progress heartbeats are expected
|
||||
failures.append(f"task {tid} heartbeats={len(heartbeats)} expected >=3")
|
||||
|
||||
if failures:
|
||||
print("\nFAILURES:")
|
||||
for f in failures:
|
||||
print(f" {f}")
|
||||
sys.exit(1)
|
||||
|
||||
print("\n ✔ Scenario A: all 3 real-subprocess workers completed cleanly")
|
||||
|
||||
# ============ SCENARIO B: crashed worker ============
|
||||
print()
|
||||
print("=" * 60)
|
||||
print("B. Crashed worker (kill -9 mid-heartbeat)")
|
||||
print("=" * 60)
|
||||
|
    crash_tid = kb.create_task(
        conn, title="crash-e2e", assignee="worker",
    )

    # Spawn a worker that sleeps long enough for us to kill it.
    # CRITICAL: spawn through a double-fork so when we kill the child it
    # doesn't zombify under our pid (which would fool kill -0 liveness
    # checks into thinking it's still alive). In production the
    # dispatcher daemon is long-lived but its workers are reaped by init
    # after exit; the test needs to match that orphaning behavior.
    def spawn_sleeper(task, workspace):
        r, w = os.pipe()
        middleman = subprocess.Popen(
            [
                PY, "-c",
                "import os,sys,subprocess;"
                "p=subprocess.Popen(['sleep','30'],"
                "stdin=subprocess.DEVNULL,"
                "stdout=subprocess.DEVNULL,stderr=subprocess.DEVNULL,"
                "start_new_session=True);"
                "os.write(int(sys.argv[1]), str(p.pid).encode());"
                "sys.exit(0)",
                str(w),
            ],
            pass_fds=(w,),
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        os.close(w)
        middleman.wait()  # middleman exits immediately, orphaning the sleep
        grandchild_pid = int(os.read(r, 16))
        os.close(r)
        return grandchild_pid

    result = kb.dispatch_once(conn, spawn_fn=spawn_sleeper)
    task = kb.get_task(conn, crash_tid)
    print(f" spawned sleeper pid={task.worker_pid} for {crash_tid}")
    # Kill the sleeper forcibly
    os.kill(task.worker_pid, 9)
    # Give the OS a moment to reap
    time.sleep(0.5)

    # Simulate the next dispatcher tick; it should detect the crashed PID
    crashed = kb.detect_crashed_workers(conn)
    print(f" detect_crashed_workers returned {len(crashed)} crashed (expected 1)")

    task = kb.get_task(conn, crash_tid)
    runs = kb.list_runs(conn, crash_tid)
    print(f" task status={task.status}, runs={[(r.id, r.outcome) for r in runs]}")

    if len(crashed) < 1:
        print(" ✗ crash NOT detected")
        sys.exit(1)
    if task.status != "ready":
        print(f" ✗ task should be back to ready, got {task.status}")
        sys.exit(1)
    if runs[0].outcome != "crashed":
        print(f" ✗ run outcome should be 'crashed', got {runs[0].outcome!r}")
        sys.exit(1)
    print("\n ✔ Scenario B: crash detected, task re-queued, run outcome=crashed")

    # ============ SCENARIO C: worker log was captured ============
    print()
    print("=" * 60)
    print("C. Worker log captured to disk")
    print("=" * 60)
    # Scenario A workers wrote to /tmp/hermes_e2e_*/worker_*.log
    import glob
    logs = glob.glob(os.path.join(home, "worker_*.log"))
    print(f" {len(logs)} worker log files")
    for lp in logs[:3]:
        size = os.path.getsize(lp)
        print(f" {os.path.basename(lp)}: {size} bytes")
        # Our fake worker is quiet (no prints); size=0 is fine

    conn.close()
    print("\n✔ ALL E2E SCENARIOS PASS")


if __name__ == "__main__":
    main()
|
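The double-fork used in Scenario B can also be tried in isolation. The sketch below is a minimal, POSIX-only illustration and is not part of the E2E script: `spawn_orphan` and `pid_alive` are hypothetical helper names, and the grandchild is a Python one-liner rather than `sleep` so that the example depends only on the interpreter. It shows the two pieces the comment above relies on: a middleman that exits right away (so the sleeper is never our direct child) and a `kill -0` style liveness probe.

```python
import os
import subprocess
import sys

# Source of the short-lived middleman (hypothetical, for illustration only).
# It starts the real sleeper, reports the sleeper's pid over the inherited
# pipe, and then exits, which orphans the sleeper.
_MIDDLEMAN_SRC = """
import os, subprocess, sys
p = subprocess.Popen(
    [sys.executable, "-c", "import sys, time; time.sleep(float(sys.argv[1]))",
     sys.argv[2]],
    stdin=subprocess.DEVNULL, stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL, start_new_session=True,
)
os.write(int(sys.argv[1]), str(p.pid).encode())
"""


def spawn_orphan(seconds: float = 30.0) -> int:
    """Start a sleeper as a grandchild and return its pid (POSIX only)."""
    r, w = os.pipe()
    middleman = subprocess.Popen(
        [sys.executable, "-c", _MIDDLEMAN_SRC, str(w), str(seconds)],
        pass_fds=(w,),  # keep the write end open in the middleman
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    os.close(w)
    middleman.wait()  # reap the middleman; the sleeper is now orphaned
    pid = int(os.read(r, 16))
    os.close(r)
    return pid


def pid_alive(pid: int) -> bool:
    """Equivalent of `kill -0 pid`: probe for existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Note that `pid_alive` also returns `True` for a zombie, which is exactly why the sleeper must not be a direct child of the process doing the check.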
||||
@@ -0,0 +1,494 @@
|
"""Tests for the Kanban tool surface (tools/kanban_tools.py).

Verifies:
  - Tools are gated on HERMES_KANBAN_TASK: a normal chat session sees
    zero kanban tools in its schema; a worker session sees all seven.
  - Each handler's happy path.
  - Error paths (missing required args, bad metadata type, etc).
"""
from __future__ import annotations

import json
import os

import pytest


# ---------------------------------------------------------------------------
# Gating
# ---------------------------------------------------------------------------

def test_kanban_tools_hidden_without_env_var(monkeypatch, tmp_path):
    """Normal `hermes chat` sessions (no HERMES_KANBAN_TASK) must have
    zero kanban_* tools in their schema."""
    monkeypatch.delenv("HERMES_KANBAN_TASK", raising=False)
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))

    import tools.kanban_tools  # ensure registered
    from tools.registry import registry
    from toolsets import resolve_toolset

    schema = registry.get_definitions(set(resolve_toolset("hermes-cli")), quiet=True)
    names = {s["function"].get("name") for s in schema if "function" in s}
    kanban = {n for n in names if n and n.startswith("kanban_")}
    assert kanban == set(), (
        f"kanban tools leaked into normal chat schema: {kanban}"
    )


def test_kanban_tools_visible_with_env_var(monkeypatch, tmp_path):
    """Worker sessions (HERMES_KANBAN_TASK set) must have all 7 tools."""
    monkeypatch.setenv("HERMES_KANBAN_TASK", "t_fake")
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))

    import tools.kanban_tools  # ensure registered
    from tools.registry import registry
    from toolsets import resolve_toolset

    schema = registry.get_definitions(set(resolve_toolset("hermes-cli")), quiet=True)
    names = {s["function"].get("name") for s in schema if "function" in s}
    kanban = {n for n in names if n and n.startswith("kanban_")}
    expected = {
        "kanban_show", "kanban_complete", "kanban_block", "kanban_heartbeat",
        "kanban_comment", "kanban_create", "kanban_link",
    }
    assert kanban == expected, f"expected {expected}, got {kanban}"


# ---------------------------------------------------------------------------
# Handler happy paths
# ---------------------------------------------------------------------------

@pytest.fixture
def worker_env(monkeypatch, tmp_path):
    """Simulate being a worker: HERMES_HOME isolated, HERMES_KANBAN_TASK set
    after we've created the task."""
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setenv("HERMES_PROFILE", "test-worker")
    from pathlib import Path as _Path
    monkeypatch.setattr(_Path, "home", lambda: tmp_path)

    from hermes_cli import kanban_db as kb
    kb._INITIALIZED_PATHS.clear()
    kb.init_db()
    conn = kb.connect()
    try:
        tid = kb.create_task(conn, title="worker-test", assignee="test-worker")
        kb.claim_task(conn, tid)
    finally:
        conn.close()
    monkeypatch.setenv("HERMES_KANBAN_TASK", tid)
    return tid


def test_show_defaults_to_env_task_id(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_show({})
    d = json.loads(out)
    assert "task" in d
    assert d["task"]["id"] == worker_env
    assert d["task"]["status"] == "running"
    assert "worker_context" in d
    assert "runs" in d


def test_show_explicit_task_id(worker_env):
    """Peek at a different task than the one in env."""
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        other = kb.create_task(conn, title="other task", assignee="peer")
    finally:
        conn.close()
    from tools import kanban_tools as kt
    out = kt._handle_show({"task_id": other})
    d = json.loads(out)
    assert d["task"]["id"] == other


def test_complete_happy_path(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_complete({
        "summary": "got the thing done",
        "metadata": {"files": 2},
    })
    d = json.loads(out)
    assert d["ok"] is True
    assert d["task_id"] == worker_env
    # Verify via kernel
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        run = kb.latest_run(conn, worker_env)
        assert run.outcome == "completed"
        assert run.summary == "got the thing done"
        assert run.metadata == {"files": 2}
    finally:
        conn.close()


def test_complete_with_result_only(worker_env):
    """`result` alone (without summary) is accepted for legacy compat."""
    from tools import kanban_tools as kt
    out = kt._handle_complete({"result": "legacy result"})
    d = json.loads(out)
    assert d["ok"] is True


def test_complete_rejects_no_handoff(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_complete({})
    assert json.loads(out).get("error"), "should have errored"


def test_complete_rejects_non_dict_metadata(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_complete({"summary": "x", "metadata": [1, 2, 3]})
    assert json.loads(out).get("error")


def test_block_happy_path(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_block({"reason": "need clarification"})
    d = json.loads(out)
    assert d["ok"] is True
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        assert kb.get_task(conn, worker_env).status == "blocked"
    finally:
        conn.close()


def test_block_rejects_empty_reason(worker_env):
    from tools import kanban_tools as kt
    for bad in ["", " ", None]:
        out = kt._handle_block({"reason": bad})
        assert json.loads(out).get("error")


def test_heartbeat_happy_path(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_heartbeat({"note": "progress"})
    d = json.loads(out)
    assert d["ok"] is True


def test_heartbeat_without_note(worker_env):
    """note is optional."""
    from tools import kanban_tools as kt
    out = kt._handle_heartbeat({})
    d = json.loads(out)
    assert d["ok"] is True


def test_comment_happy_path(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_comment({
        "task_id": worker_env,
        "body": "hello thread",
    })
    d = json.loads(out)
    assert d["ok"] is True
    assert d["comment_id"]
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        comments = kb.list_comments(conn, worker_env)
        assert len(comments) == 1
        # Author defaults to HERMES_PROFILE env we set in the fixture
        assert comments[0].author == "test-worker"
        assert comments[0].body == "hello thread"
    finally:
        conn.close()


def test_comment_rejects_empty_body(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_comment({"task_id": worker_env, "body": " "})
    assert json.loads(out).get("error")


def test_comment_custom_author(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_comment({
        "task_id": worker_env, "body": "hi", "author": "custom-bot",
    })
    assert json.loads(out)["ok"]
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        comments = kb.list_comments(conn, worker_env)
        assert comments[0].author == "custom-bot"
    finally:
        conn.close()


def test_create_happy_path(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_create({
        "title": "child task",
        "assignee": "peer",
        "parents": [worker_env],
    })
    d = json.loads(out)
    assert d["ok"] is True
    assert d["task_id"]
    assert d["status"] == "todo"  # parent isn't done yet
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        child = kb.get_task(conn, d["task_id"])
        assert child.title == "child task"
        assert child.assignee == "peer"
    finally:
        conn.close()


def test_create_rejects_no_title(worker_env):
    from tools import kanban_tools as kt
    assert json.loads(kt._handle_create({"assignee": "x"})).get("error")
    assert json.loads(kt._handle_create({"title": " ", "assignee": "x"})).get("error")


def test_create_rejects_no_assignee(worker_env):
    from tools import kanban_tools as kt
    assert json.loads(kt._handle_create({"title": "t"})).get("error")


def test_create_rejects_non_list_parents(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_create({"title": "t", "assignee": "a", "parents": 42})
    assert json.loads(out).get("error")


def test_create_accepts_string_parent(worker_env):
    """Convenience: a single parent id as string is coerced to [id]."""
    from tools import kanban_tools as kt
    out = kt._handle_create({
        "title": "t", "assignee": "a", "parents": worker_env,
    })
    assert json.loads(out)["ok"]


def test_create_accepts_skills_list(worker_env):
    """Tool writes the per-task skills through to the kernel."""
    from tools import kanban_tools as kt
    from hermes_cli import kanban_db as kb
    out = kt._handle_create({
        "title": "skilled",
        "assignee": "linguist",
        "skills": ["translation", "github-code-review"],
    })
    d = json.loads(out)
    assert d["ok"] is True
    with kb.connect() as conn:
        task = kb.get_task(conn, d["task_id"])
    assert task.skills == ["translation", "github-code-review"]


def test_create_accepts_skills_string(worker_env):
    """Convenience: a single skill name as string is coerced to [name]."""
    from tools import kanban_tools as kt
    from hermes_cli import kanban_db as kb
    out = kt._handle_create({
        "title": "one-skill",
        "assignee": "a",
        "skills": "translation",
    })
    d = json.loads(out)
    assert d["ok"] is True
    with kb.connect() as conn:
        task = kb.get_task(conn, d["task_id"])
    assert task.skills == ["translation"]


def test_create_rejects_non_list_skills(worker_env):
    """skills: 42 must be rejected, not silently dropped."""
    from tools import kanban_tools as kt
    out = kt._handle_create({
        "title": "t", "assignee": "a", "skills": 42,
    })
    assert json.loads(out).get("error")
|
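The create tests above pin down one convention for `parents` and `skills`: a bare string is coerced to a one-element list, a list is passed through, and anything else is rejected rather than silently dropped. A minimal sketch of that rule follows; `coerce_str_list` is a hypothetical helper written for illustration and is not taken from `tools/kanban_tools.py`.

```python
from typing import Any, List, Optional


def coerce_str_list(value: Any, field: str) -> Optional[List[str]]:
    """Normalize a string-or-list argument to a list of strings.

    Returns None when the argument was omitted, and raises ValueError for
    any other shape so the caller can turn it into a structured error.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return [value]  # convenience: a single id or name
    if isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value):
        return list(value)
    raise ValueError(
        f"{field} must be a string or a list of strings, "
        f"got {type(value).__name__}"
    )
```

Raising instead of returning an empty list is the design choice the `skills: 42` test asks for: the model gets an error it can act on instead of a task that silently lacks its skills.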


def test_link_happy_path(worker_env):
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        a = kb.create_task(conn, title="A", assignee="x")
        b = kb.create_task(conn, title="B", assignee="x")
    finally:
        conn.close()
    from tools import kanban_tools as kt
    out = kt._handle_link({"parent_id": a, "child_id": b})
    d = json.loads(out)
    assert d["ok"] is True


def test_link_rejects_self_reference(worker_env):
    from tools import kanban_tools as kt
    out = kt._handle_link({"parent_id": worker_env, "child_id": worker_env})
    assert json.loads(out).get("error")


def test_link_rejects_missing_args(worker_env):
    from tools import kanban_tools as kt
    assert json.loads(kt._handle_link({"parent_id": "x"})).get("error")
    assert json.loads(kt._handle_link({"child_id": "y"})).get("error")


def test_link_rejects_cycle(worker_env):
    """A → B, then try to link B → A."""
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        a = kb.create_task(conn, title="A", assignee="x")
        b = kb.create_task(conn, title="B", assignee="x", parents=[a])
    finally:
        conn.close()
    from tools import kanban_tools as kt
    out = kt._handle_link({"parent_id": b, "child_id": a})
    assert json.loads(out).get("error")
|
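The self-reference and cycle tests above imply a reachability check before a link is stored. How the kernel actually performs it is not shown in this diff, so the following is only a sketch under assumptions: the dependency graph is modeled as a plain `dict` from parent id to child ids, and `would_create_cycle` is a hypothetical name.

```python
from typing import Dict, Set


def would_create_cycle(
    children: Dict[str, Set[str]], parent_id: str, child_id: str
) -> bool:
    """Return True if adding the edge parent_id -> child_id closes a loop."""
    if parent_id == child_id:
        return True  # self-reference
    # The new edge closes a loop iff parent_id is already reachable
    # from child_id by following existing parent -> child edges.
    stack = [child_id]
    seen: Set[str] = set()
    while stack:
        node = stack.pop()
        if node == parent_id:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False
```

With `{"a": {"b"}}` as the existing graph, linking `b -> a` is reported as a cycle while linking `a -> c` is not, which mirrors the `A → B`, then `B → A` case in `test_link_rejects_cycle`.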


# ---------------------------------------------------------------------------
# End-to-end: simulate a full worker lifecycle through the tools
# ---------------------------------------------------------------------------

def test_worker_lifecycle_through_tools(worker_env):
    """Drive the show -> heartbeat -> comment -> create -> complete lifecycle
    of an already-claimed task exclusively through the tools, then verify the
    DB state matches what the dispatcher/notifier expect."""
    from tools import kanban_tools as kt

    # 1. show: worker orientation
    show = json.loads(kt._handle_show({}))
    assert show["task"]["id"] == worker_env

    # 2. heartbeat during long op
    assert json.loads(kt._handle_heartbeat({"note": "warming up"}))["ok"]

    # 3. comment for a future peer
    assert json.loads(kt._handle_comment({
        "task_id": worker_env,
        "body": "note: using stdlib sqlite3 bindings",
    }))["ok"]

    # 4. spawn a child task for follow-up
    child_out = json.loads(kt._handle_create({
        "title": "write integration test",
        "assignee": "qa",
        "parents": [worker_env],
    }))
    assert child_out["ok"]

    # 5. complete with structured handoff
    comp = json.loads(kt._handle_complete({
        "summary": "implemented + spawned QA follow-up",
        "metadata": {"child_task": child_out["task_id"]},
    }))
    assert comp["ok"]

    # Verify final state
    from hermes_cli import kanban_db as kb
    conn = kb.connect()
    try:
        parent = kb.get_task(conn, worker_env)
        assert parent.status == "done"
        assert parent.current_run_id is None
        run = kb.latest_run(conn, worker_env)
        assert run.outcome == "completed"
        assert run.metadata == {"child_task": child_out["task_id"]}
        # The child was created as todo, but its only parent is now done and
        # complete_task runs recompute_ready internally, so it must be ready.
        child = kb.get_task(conn, child_out["task_id"])
        assert child.status == "ready", (
            f"child should be ready after parent done, got {child.status}"
        )
        # Comment is visible
        assert len(kb.list_comments(conn, worker_env)) == 1
        # Heartbeat event recorded
        hb = [e for e in kb.list_events(conn, worker_env) if e.kind == "heartbeat"]
        assert len(hb) == 1
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# System-prompt guidance injection
# ---------------------------------------------------------------------------

def test_kanban_guidance_not_in_normal_prompt(monkeypatch, tmp_path):
    """A normal chat session (no HERMES_KANBAN_TASK) must NOT have
    KANBAN_GUIDANCE in its system prompt."""
    monkeypatch.delenv("HERMES_KANBAN_TASK", raising=False)
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    from pathlib import Path as _P
    monkeypatch.setattr(_P, "home", lambda: tmp_path)

    from run_agent import AIAgent
    a = AIAgent(
        api_key="test",
        base_url="https://openrouter.ai/api/v1",
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    prompt = a._build_system_prompt()
    assert "You are a Kanban worker" not in prompt
    assert "kanban_show()" not in prompt


def test_kanban_guidance_in_worker_prompt(monkeypatch, tmp_path):
    """A worker session (HERMES_KANBAN_TASK set) MUST have the full
    lifecycle guidance in its system prompt."""
    monkeypatch.setenv("HERMES_KANBAN_TASK", "t_fake")
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    from pathlib import Path as _P
    monkeypatch.setattr(_P, "home", lambda: tmp_path)

    from run_agent import AIAgent
    a = AIAgent(
        api_key="test",
        base_url="https://openrouter.ai/api/v1",
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    prompt = a._build_system_prompt()
    # Header phrase
    assert "You are a Kanban worker" in prompt
    # Lifecycle signals
    assert "kanban_show()" in prompt
    assert "kanban_complete" in prompt
    assert "kanban_block" in prompt
    assert "kanban_create" in prompt
    # Anti-shell guidance
    assert "Do not shell out" in prompt or "tools — they work" in prompt


def test_kanban_guidance_prompt_size_bounded(monkeypatch, tmp_path):
    """Sanity: the guidance block is longer than 1,500 characters (so it is
    actually present) and shorter than 4,096 (so it doesn't blow up the
    cached prompt)."""
    monkeypatch.setenv("HERMES_KANBAN_TASK", "t_fake")
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    from pathlib import Path as _P
    monkeypatch.setattr(_P, "home", lambda: tmp_path)

    from agent.prompt_builder import KANBAN_GUIDANCE
    assert 1_500 < len(KANBAN_GUIDANCE) < 4_096, (
        f"KANBAN_GUIDANCE is {len(KANBAN_GUIDANCE)} chars — too short (missing?) or too long"
    )
|
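The gating tests in this file rely on tool visibility being decided at schema-build time from `HERMES_KANBAN_TASK`. The real `tools.registry` API is not shown in this diff, so the sketch below uses a deliberately tiny stand-in: `MiniRegistry`, `register`, `visible_names`, and `in_kanban_mode` are all hypothetical names chosen for illustration.

```python
import os
from typing import Callable, Dict, List


class MiniRegistry:
    """Toy registry: each tool carries a predicate that decides visibility."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[], bool]] = {}

    def register(self, name: str, check_fn: Callable[[], bool]) -> None:
        self._checks[name] = check_fn

    def visible_names(self) -> List[str]:
        # Predicates run on every call, so a change to the environment is
        # reflected the next time the schema is built.
        return sorted(name for name, check in self._checks.items() if check())


def in_kanban_mode() -> bool:
    """True only when the dispatcher-set env var is present and non-empty."""
    return bool(os.environ.get("HERMES_KANBAN_TASK"))


registry = MiniRegistry()
for _name in ("kanban_show", "kanban_complete", "kanban_block"):
    registry.register(_name, in_kanban_mode)
registry.register("read_file", lambda: True)  # an always-on tool
```

Evaluating the predicate lazily, rather than at import time, is what lets the two gating tests import the same module and still see different schemas depending on `monkeypatch.setenv` or `monkeypatch.delenv`.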
||||
@@ -0,0 +1,125 @@
|
"""Tests for the MCP remote-URL validator.

Ported from anomalyco/opencode#25019 (``fix: handle invalid mcp urls``).

Previously, a typo in ``config.yaml`` (missing scheme, wrong scheme, empty
string, dict where a URL was expected) caused the MCP server startup code
to enter httpx's URL-parsing path and crash inside the transport layer.
The reconnect-backoff loop would then retry
``_MAX_INITIAL_CONNECT_RETRIES`` times with doubling backoff (a minute or
more of pointless retries plus a confusing, opaque error message) before
eventually giving up.

The fix validates the URL once, up front, and fails fast with a specific
error message identifying the offending server.
"""

from __future__ import annotations

import pytest

from tools.mcp_tool import (
    InvalidMcpUrlError,
    _validate_remote_mcp_url,
)


class TestValidUrlsAccepted:
    """Every valid http(s) URL must be returned unchanged, apart from
    surrounding whitespace being stripped."""

    @pytest.mark.parametrize(
        "url",
        [
            "http://localhost:3000/mcp",
            "https://example.com/mcp",
            "https://context7.liam.com/mcp",
            "http://127.0.0.1:8080",
            "https://api.example.com:443/v1/mcp?session=abc",
            "http://[::1]:9000/mcp",  # IPv6
            "https://host.example.com",  # no port, no path
        ],
    )
    def test_accepts_valid_http_url(self, url):
        assert _validate_remote_mcp_url("test", url) == url

    def test_strips_surrounding_whitespace(self):
        assert (
            _validate_remote_mcp_url("test", " https://example.com/mcp ")
            == "https://example.com/mcp"
        )


class TestInvalidUrlsRejected:
    """Every broken shape must raise ``InvalidMcpUrlError`` with a clear message."""

    def test_none_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="context7.*expected a string"):
            _validate_remote_mcp_url("context7", None)

    def test_dict_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="expected a string, got dict"):
            _validate_remote_mcp_url("ctx", {"url": "nested"})

    def test_int_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="expected a string, got int"):
            _validate_remote_mcp_url("ctx", 8080)

    def test_empty_string_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="empty url"):
            _validate_remote_mcp_url("ctx", "")

    def test_whitespace_only_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="empty url"):
            _validate_remote_mcp_url("ctx", " \t\n")

    def test_missing_scheme_rejected(self):
        # The most common typo: users copy a bare host from a web page.
        with pytest.raises(
            InvalidMcpUrlError, match="scheme must be http or https"
        ):
            _validate_remote_mcp_url("ctx", "example.com/mcp")

    def test_file_scheme_rejected(self):
        with pytest.raises(
            InvalidMcpUrlError, match="scheme must be http or https"
        ):
            _validate_remote_mcp_url("ctx", "file:///etc/passwd")

    def test_ws_scheme_rejected(self):
        # WebSocket is not MCP's remote transport.
        with pytest.raises(
            InvalidMcpUrlError, match="scheme must be http or https"
        ):
            _validate_remote_mcp_url("ctx", "ws://example.com/mcp")

    def test_stdio_scheme_rejected(self):
        # stdio servers use the ``command`` key, not ``url``.
        with pytest.raises(
            InvalidMcpUrlError, match="scheme must be http or https"
        ):
            _validate_remote_mcp_url("ctx", "stdio:///node server.js")

    def test_empty_host_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="missing host"):
            _validate_remote_mcp_url("ctx", "http:///")

    def test_empty_host_with_path_rejected(self):
        with pytest.raises(InvalidMcpUrlError, match="missing host"):
            _validate_remote_mcp_url("ctx", "https:///path/only")

    def test_error_mentions_server_name(self):
        # So users can find the bad entry when there are multiple configured.
        with pytest.raises(InvalidMcpUrlError, match="my-weird-server"):
            _validate_remote_mcp_url("my-weird-server", "not a url at all")


class TestErrorIsValueError:
    """InvalidMcpUrlError must be a ValueError for broad downstream catch blocks."""

    def test_is_value_error(self):
        # pytest.raises(ValueError) also accepts subclasses, which is the
        # property under test here.
        with pytest.raises(ValueError):
            _validate_remote_mcp_url("ctx", "garbage")
|
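For orientation, here is one way a validator satisfying the assertions above could look. It is a sketch built on the standard library's `urllib.parse.urlsplit`, not the code from `tools/mcp_tool.py` (which this diff does not show): `InvalidUrlError` and `validate_remote_url` are hypothetical stand-ins, and the exact message wording is an assumption modeled on the `match=` patterns in the tests.

```python
from urllib.parse import urlsplit


class InvalidUrlError(ValueError):
    """Hypothetical stand-in for the project's InvalidMcpUrlError."""


def validate_remote_url(server_name: str, url: object) -> str:
    """Return the stripped URL, or raise InvalidUrlError naming the server."""
    prefix = f"MCP server {server_name!r}"
    if not isinstance(url, str):
        raise InvalidUrlError(
            f"{prefix}: expected a string, got {type(url).__name__}"
        )
    cleaned = url.strip()
    if not cleaned:
        raise InvalidUrlError(f"{prefix}: empty url")
    try:
        parts = urlsplit(cleaned)
        host = parts.hostname
    except ValueError as exc:  # e.g. an unbalanced IPv6 bracket
        raise InvalidUrlError(f"{prefix}: unparseable url ({exc})") from exc
    if parts.scheme not in ("http", "https"):
        raise InvalidUrlError(
            f"{prefix}: scheme must be http or https, got {parts.scheme!r}"
        )
    if not host:
        raise InvalidUrlError(f"{prefix}: missing host")
    return cleaned
```

Checking the scheme before the host means a bare `example.com/mcp` is reported as a scheme problem, which is the more actionable message for the common "forgot `https://`" typo.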
||||
@@ -0,0 +1,117 @@
|
"""Tests for tui_gateway background-review summary delivery.

When the self-improvement background review fires and saves a skill or
memory entry, it calls ``agent.background_review_callback(message)``. In
the CLI that routes through a prompt_toolkit-safe ``_cprint``; in the TUI
there is no print surface, so without a callback wired up the review
writes the change silently. ``_init_session`` attaches a callback that
emits a ``review.summary`` event which Ink renders as a persistent
transcript line.
"""

from __future__ import annotations

import sys
from unittest.mock import MagicMock, patch

import pytest


@pytest.fixture()
def server():
    with patch.dict(
        "sys.modules",
        {
            "hermes_constants": MagicMock(
                get_hermes_home=MagicMock(return_value="/tmp/hermes_test_review_summary")
            ),
            "hermes_cli.env_loader": MagicMock(),
            "hermes_cli.banner": MagicMock(),
            "hermes_state": MagicMock(),
        },
    ):
        import importlib

        mod = importlib.import_module("tui_gateway.server")
        yield mod
        mod._sessions.clear()
        mod._pending.clear()
        mod._answers.clear()
        mod._methods.clear()
        importlib.reload(mod)


def test_init_session_attaches_background_review_callback(server, monkeypatch):
    """After _init_session, agent.background_review_callback is set to a
    function that emits 'review.summary' for the session's sid."""
    # Neutralize side-effect calls inside _init_session so we're testing
    # just the callback wiring.
    monkeypatch.setattr(server, "_SlashWorker", lambda *a, **kw: object())
    monkeypatch.setattr(server, "_wire_callbacks", lambda sid: None)
    monkeypatch.setattr(server, "_notify_session_boundary", lambda *a, **kw: None)
    monkeypatch.setattr(server, "_session_info", lambda agent: {"model": "m"})
    monkeypatch.setattr(server, "_load_show_reasoning", lambda: False)
    monkeypatch.setattr(server, "_load_tool_progress_mode", lambda: "all")

    captured_emits: list = []
    monkeypatch.setattr(
        server,
        "_emit",
        lambda event, sid, payload=None: captured_emits.append(
            (event, sid, payload)
        ),
    )

    class FakeAgent:
        model = "fake/model"
        # Presence of the attribute is all the Python side needs; the real
        # AIAgent has it defaulted to None in __init__.
        background_review_callback = None

    agent = FakeAgent()
    server._init_session("sid-abc", "session-key", agent, [], cols=80)

    cb = getattr(agent, "background_review_callback", None)
    assert callable(cb), (
        "_init_session must attach a background_review_callback to the "
        "agent so the self-improvement review is visible in the TUI."
    )

    # Clear the session.info emit captured during _init_session.
    captured_emits.clear()

    # Invoke the callback the way AIAgent._spawn_background_review would.
    cb("💾 Self-improvement review: Skill 'hermes-release' patched")

    # Exactly one review.summary event should have been emitted, bound to
    # the session id we passed in, carrying the full message text.
    matched = [e for e in captured_emits if e[0] == "review.summary"]
    assert len(matched) == 1, captured_emits
    event, sid, payload = matched[0]
    assert sid == "sid-abc"
    assert payload == {
        "text": "💾 Self-improvement review: Skill 'hermes-release' patched"
    }


def test_review_summary_callback_survives_agent_without_attribute(server, monkeypatch):
    """If the agent is a bare object that doesn't allow attribute
    assignment (e.g. some stubbed test double), _init_session must not
    raise, so session startup stays robust."""
    monkeypatch.setattr(server, "_SlashWorker", lambda *a, **kw: object())
    monkeypatch.setattr(server, "_wire_callbacks", lambda sid: None)
    monkeypatch.setattr(server, "_notify_session_boundary", lambda *a, **kw: None)
    monkeypatch.setattr(server, "_session_info", lambda agent: {"model": "m"})
    monkeypatch.setattr(server, "_load_show_reasoning", lambda: False)
    monkeypatch.setattr(server, "_load_tool_progress_mode", lambda: "all")
    monkeypatch.setattr(server, "_emit", lambda *a, **kw: None)

    class LockedAgent:
        __slots__ = ("model",)

        def __init__(self):
            self.model = "fake/model"

    # LockedAgent's __slots__ blocks background_review_callback assignment.
    server._init_session("sid-x", "key-x", LockedAgent(), [], cols=80)
    # If we got here, _init_session swallowed the AttributeError gracefully.
|
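The two tests above describe the wiring contract without showing it: attach a closure bound to the session id, and tolerate agents that refuse attribute assignment. A minimal sketch of that contract follows, assuming an `emit(event, sid, payload)` callable like the one the tests monkeypatch in; `attach_review_callback` is a hypothetical helper name, and the real logic lives inside `_init_session`, which this diff does not show.

```python
from typing import Any, Callable, Optional

EmitFn = Callable[[str, str, Optional[dict]], None]


def attach_review_callback(agent: Any, sid: str, emit: EmitFn) -> bool:
    """Attach a review-summary callback to the agent; report success."""

    def _on_review(message: str) -> None:
        # Bound to this session's sid so the event reaches the right client.
        emit("review.summary", sid, {"text": str(message)})

    try:
        agent.background_review_callback = _on_review
    except AttributeError:
        # Objects using __slots__ (or otherwise locked down) cannot take
        # the attribute; session startup should continue regardless.
        return False
    return True
```

Catching only `AttributeError` keeps the guard narrow: a class with `__slots__` that omits the attribute raises exactly that on assignment, while unrelated failures still surface.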
||||
@@ -0,0 +1,726 @@
|
||||
"""Kanban tools — structured tool-call surface for worker + orchestrator agents.
|
||||
|
||||
These tools are only registered into the model's schema when the agent is
|
||||
running under the dispatcher (env var ``HERMES_KANBAN_TASK`` set). A
|
||||
normal ``hermes chat`` session sees **zero** kanban tools in its schema.
|
||||
|
||||
Why tools instead of just shelling out to ``hermes kanban``?
|
||||
|
||||
1. **Backend portability.** A worker whose terminal tool points at Docker
|
||||
/ Modal / Singularity / SSH would run ``hermes kanban complete …``
|
||||
inside the container, where ``hermes`` isn't installed and the DB
|
||||
isn't mounted. Tools run in the agent's Python process, so they
|
||||
always reach ``~/.hermes/kanban.db`` regardless of terminal backend.
|
||||
|
||||
2. **No shell-quoting footguns.** Passing ``--metadata '{"x": [...]}'``
|
||||
through shlex+argparse is fragile. Structured tool args skip it.
|
||||
|
||||
3. **Better errors.** Tool-call failures return structured JSON the
|
||||
model can reason about, not stderr strings it has to parse.
|
||||
|
||||
Humans continue to use the CLI (``hermes kanban …``), the dashboard
|
||||
(``hermes dashboard``), and the slash command (``/kanban …``) — all
|
||||
three bypass the agent entirely. The tools are ONLY for the worker
|
||||
agent's handoff back to the kernel.
|
||||
"""
|
||||
from __future__ import annotations

import json
import logging
import os
from typing import Any, Optional

from tools.registry import registry, tool_error

logger = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# Gating
# ---------------------------------------------------------------------------

def _check_kanban_mode() -> bool:
    """Tools are available iff the current process has ``HERMES_KANBAN_TASK``
    set in its env, which the dispatcher sets when spawning a worker.

    Humans running ``hermes chat`` see zero kanban tools. Workers spawned
    by the kanban dispatcher (gateway-embedded by default) see all seven.
    """
    return bool(os.environ.get("HERMES_KANBAN_TASK"))


# ---------------------------------------------------------------------------
# Shared helpers
# ---------------------------------------------------------------------------

def _default_task_id(arg: Optional[str]) -> Optional[str]:
    """Resolve ``task_id`` arg or fall back to the env var the dispatcher set."""
    if arg:
        return arg
    env_tid = os.environ.get("HERMES_KANBAN_TASK")
    return env_tid or None


def _connect():
    """Import + connect lazily so the module imports cleanly in non-kanban
    contexts (e.g. test rigs that import every tool module)."""
    from hermes_cli import kanban_db as kb
    return kb, kb.connect()


def _ok(**fields: Any) -> str:
    return json.dumps({"ok": True, **fields})
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Handlers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _handle_show(args: dict, **kw) -> str:
|
||||
"""Read a task's full state: task row, parents, children, comments,
|
||||
runs (attempt history), and up to the 50 most recent events."""
|
||||
tid = _default_task_id(args.get("task_id"))
|
||||
if not tid:
|
||||
return tool_error(
|
||||
"task_id is required (or set HERMES_KANBAN_TASK in the env)"
|
||||
)
|
||||
try:
|
||||
kb, conn = _connect()
|
||||
try:
|
||||
task = kb.get_task(conn, tid)
|
||||
if task is None:
|
||||
return tool_error(f"task {tid} not found")
|
||||
comments = kb.list_comments(conn, tid)
|
||||
events = kb.list_events(conn, tid)
|
||||
runs = kb.list_runs(conn, tid)
|
||||
parents = kb.parent_ids(conn, tid)
|
||||
children = kb.child_ids(conn, tid)
|
||||
|
||||
def _task_dict(t):
|
||||
return {
|
||||
"id": t.id, "title": t.title, "body": t.body,
|
||||
"assignee": t.assignee, "status": t.status,
|
||||
"tenant": t.tenant, "priority": t.priority,
|
||||
"workspace_kind": t.workspace_kind,
|
||||
"workspace_path": t.workspace_path,
|
||||
"created_by": t.created_by, "created_at": t.created_at,
|
||||
"started_at": t.started_at,
|
||||
"completed_at": t.completed_at,
|
||||
"result": t.result,
|
||||
"current_run_id": t.current_run_id,
|
||||
}
|
||||
|
||||
def _run_dict(r):
|
||||
return {
|
||||
"id": r.id, "profile": r.profile,
|
||||
"status": r.status, "outcome": r.outcome,
|
||||
"summary": r.summary, "error": r.error,
|
||||
"metadata": r.metadata,
|
||||
"started_at": r.started_at, "ended_at": r.ended_at,
|
||||
}
|
||||
|
||||
return json.dumps({
|
||||
"task": _task_dict(task),
|
||||
"parents": parents,
|
||||
"children": children,
|
||||
"comments": [
|
||||
{"author": c.author, "body": c.body,
|
||||
"created_at": c.created_at}
|
||||
for c in comments
|
||||
],
|
||||
"events": [
|
||||
{"kind": e.kind, "payload": e.payload,
|
||||
"created_at": e.created_at, "run_id": e.run_id}
|
||||
for e in events[-50:] # cap; full log via CLI
|
||||
],
|
||||
"runs": [_run_dict(r) for r in runs],
|
||||
# Also surface the worker's own context block so the
|
||||
# agent can include it directly if it wants. This is
|
||||
# the same string build_worker_context returns to the
|
||||
# dispatcher at spawn time.
|
||||
"worker_context": kb.build_worker_context(conn, tid),
|
||||
})
|
||||
finally:
|
||||
conn.close()
|
||||
except Exception as e:
|
||||
logger.exception("kanban_show failed")
|
||||
return tool_error(f"kanban_show: {e}")
|
||||
|
||||
|
||||
def _handle_complete(args: dict, **kw) -> str:
|
||||
"""Mark the current task done with a structured handoff."""
|
||||
tid = _default_task_id(args.get("task_id"))
|
||||
if not tid:
|
||||
return tool_error(
|
||||
"task_id is required (or set HERMES_KANBAN_TASK in the env)"
|
||||
)
|
||||
summary = args.get("summary")
|
||||
metadata = args.get("metadata")
|
||||
result = args.get("result")
|
||||
if not (summary or result):
|
||||
return tool_error(
|
||||
"provide at least one of: summary (preferred), result"
|
||||
)
|
||||
if metadata is not None and not isinstance(metadata, dict):
|
||||
return tool_error(
|
||||
f"metadata must be an object/dict, got {type(metadata).__name__}"
|
||||
)
|
||||
try:
|
||||
kb, conn = _connect()
|
||||
try:
|
||||
ok = kb.complete_task(
|
||||
conn, tid,
|
||||
result=result, summary=summary, metadata=metadata,
|
||||
)
|
||||
if not ok:
|
||||
return tool_error(
|
||||
f"could not complete {tid} (unknown id or already terminal)"
|
||||
)
|
||||
run = kb.latest_run(conn, tid)
|
||||
return _ok(task_id=tid, run_id=run.id if run else None)
|
||||
finally:
|
||||
conn.close()
|
||||
except Exception as e:
|
||||
logger.exception("kanban_complete failed")
|
||||
return tool_error(f"kanban_complete: {e}")
|
||||
|
||||
|
||||
def _handle_block(args: dict, **kw) -> str:
|
||||
"""Transition the task to blocked with a reason a human will read."""
|
||||
tid = _default_task_id(args.get("task_id"))
|
||||
if not tid:
|
||||
return tool_error(
|
||||
"task_id is required (or set HERMES_KANBAN_TASK in the env)"
|
||||
)
|
||||
reason = args.get("reason")
|
||||
if not reason or not str(reason).strip():
|
||||
return tool_error("reason is required — explain what input you need")
|
||||
try:
|
||||
kb, conn = _connect()
|
||||
try:
|
||||
ok = kb.block_task(conn, tid, reason=reason)
|
||||
if not ok:
|
||||
return tool_error(
|
||||
f"could not block {tid} (unknown id or not in "
|
||||
f"running/ready)"
|
||||
)
|
||||
run = kb.latest_run(conn, tid)
|
||||
return _ok(task_id=tid, run_id=run.id if run else None)
|
||||
finally:
|
||||
conn.close()
|
||||
except Exception as e:
|
||||
logger.exception("kanban_block failed")
|
||||
return tool_error(f"kanban_block: {e}")
|
||||
|
||||
|
||||
def _handle_heartbeat(args: dict, **kw) -> str:
|
||||
"""Signal that the worker is still alive during a long operation."""
|
||||
tid = _default_task_id(args.get("task_id"))
|
||||
if not tid:
|
||||
return tool_error(
|
||||
"task_id is required (or set HERMES_KANBAN_TASK in the env)"
|
||||
)
|
||||
note = args.get("note")
|
||||
try:
|
||||
kb, conn = _connect()
|
||||
try:
|
||||
ok = kb.heartbeat_worker(conn, tid, note=note)
|
||||
if not ok:
|
||||
return tool_error(
|
||||
f"could not heartbeat {tid} (unknown id or not running)"
|
||||
)
|
||||
return _ok(task_id=tid)
|
||||
finally:
|
||||
conn.close()
|
||||
except Exception as e:
|
||||
logger.exception("kanban_heartbeat failed")
|
||||
return tool_error(f"kanban_heartbeat: {e}")
|
||||
|
||||
|
||||
def _handle_comment(args: dict, **kw) -> str:
    """Append a comment to a task's thread."""
    # Unlike the other handlers, task_id is taken from args only; there is
    # no fallback to HERMES_KANBAN_TASK, so the error message says so.
    tid = args.get("task_id")
    if not tid:
        return tool_error(
            "task_id is required (pass your current task id explicitly; "
            "kanban_comment does not fall back to HERMES_KANBAN_TASK)"
        )
    body = args.get("body")
    if not body or not str(body).strip():
        return tool_error("body is required")
    author = args.get("author") or os.environ.get("HERMES_PROFILE") or "worker"
    try:
        kb, conn = _connect()
        try:
            cid = kb.add_comment(conn, tid, author=author, body=str(body))
            return _ok(task_id=tid, comment_id=cid)
        finally:
            conn.close()
    except Exception as e:
        logger.exception("kanban_comment failed")
        return tool_error(f"kanban_comment: {e}")
|
||||
|
||||
|
||||
def _handle_create(args: dict, **kw) -> str:
|
||||
"""Create a child task. Orchestrator workers use this to fan out.
|
||||
|
||||
``parents`` can be a list of task ids; dependency-gated promotion
|
||||
works as usual.
|
||||
"""
|
||||
title = args.get("title")
|
||||
if not title or not str(title).strip():
|
||||
return tool_error("title is required")
|
||||
assignee = args.get("assignee")
|
||||
if not assignee:
|
||||
return tool_error(
|
||||
"assignee is required — name the profile that should execute this "
|
||||
"task (the dispatcher will only spawn tasks with an assignee)"
|
||||
)
|
||||
body = args.get("body")
|
||||
parents = args.get("parents") or []
|
||||
tenant = args.get("tenant") or os.environ.get("HERMES_TENANT")
|
||||
priority = args.get("priority")
|
||||
workspace_kind = args.get("workspace_kind") or "scratch"
|
||||
workspace_path = args.get("workspace_path")
|
||||
triage = bool(args.get("triage"))
|
||||
idempotency_key = args.get("idempotency_key")
|
||||
max_runtime_seconds = args.get("max_runtime_seconds")
|
||||
skills = args.get("skills")
|
||||
if isinstance(skills, str):
|
||||
# Accept a single skill name as a string for convenience.
|
||||
skills = [skills]
|
||||
if skills is not None and not isinstance(skills, (list, tuple)):
|
||||
return tool_error(
|
||||
f"skills must be a list of skill names, got {type(skills).__name__}"
|
||||
)
|
||||
if isinstance(parents, str):
|
||||
parents = [parents]
|
||||
if not isinstance(parents, (list, tuple)):
|
||||
return tool_error(
|
||||
f"parents must be a list of task ids, got {type(parents).__name__}"
|
||||
)
|
||||
try:
|
||||
kb, conn = _connect()
|
||||
try:
|
||||
new_tid = kb.create_task(
|
||||
conn,
|
||||
title=str(title).strip(),
|
||||
body=body,
|
||||
assignee=str(assignee),
|
||||
parents=tuple(parents),
|
||||
tenant=tenant,
|
||||
priority=int(priority) if priority is not None else 0,
|
||||
workspace_kind=str(workspace_kind),
|
||||
workspace_path=workspace_path,
|
||||
triage=triage,
|
||||
idempotency_key=idempotency_key,
|
||||
max_runtime_seconds=(
|
||||
int(max_runtime_seconds)
|
||||
if max_runtime_seconds is not None else None
|
||||
),
|
||||
skills=skills,
|
||||
created_by=os.environ.get("HERMES_PROFILE") or "worker",
|
||||
)
|
||||
new_task = kb.get_task(conn, new_tid)
|
||||
return _ok(
|
||||
task_id=new_tid,
|
||||
status=new_task.status if new_task else None,
|
||||
)
|
||||
finally:
|
||||
conn.close()
|
||||
except Exception as e:
|
||||
logger.exception("kanban_create failed")
|
||||
return tool_error(f"kanban_create: {e}")
|
||||
|
||||
|
||||
def _handle_link(args: dict, **kw) -> str:
|
||||
"""Add a parent→child dependency edge after the fact."""
|
||||
parent_id = args.get("parent_id")
|
||||
child_id = args.get("child_id")
|
||||
if not parent_id or not child_id:
|
||||
return tool_error("both parent_id and child_id are required")
|
||||
try:
|
||||
kb, conn = _connect()
|
||||
try:
|
||||
kb.link_tasks(conn, parent_id=parent_id, child_id=child_id)
|
||||
return _ok(parent_id=parent_id, child_id=child_id)
|
||||
finally:
|
||||
conn.close()
|
||||
except ValueError as e:
|
||||
# Covers cycle + self-parent rejections
|
||||
return tool_error(f"kanban_link: {e}")
|
||||
except Exception as e:
|
||||
logger.exception("kanban_link failed")
|
||||
return tool_error(f"kanban_link: {e}")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_DESC_TASK_ID_DEFAULT = (
|
||||
"Task id. If omitted, defaults to HERMES_KANBAN_TASK from the env "
|
||||
"(the task the dispatcher spawned you to work on)."
|
||||
)
|
||||
|
||||
KANBAN_SHOW_SCHEMA = {
|
||||
"name": "kanban_show",
|
||||
"description": (
|
||||
"Read a task's full state — title, body, assignee, parent task "
|
||||
"handoffs, your prior attempts on this task if any, comments, "
|
||||
"and recent events. Use this to (re)orient yourself before "
|
||||
"starting work, especially on retries. The response includes a "
|
||||
"pre-formatted ``worker_context`` string suitable for inclusion "
|
||||
"verbatim in your reasoning."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"task_id": {
|
||||
"type": "string",
|
||||
"description": _DESC_TASK_ID_DEFAULT,
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
},
|
||||
}
|
||||
|
||||
KANBAN_COMPLETE_SCHEMA = {
|
||||
"name": "kanban_complete",
|
||||
"description": (
|
||||
"Mark your current task done with a structured handoff for "
|
||||
"downstream workers and humans. Prefer ``summary`` for a "
|
||||
"human-readable 1-3 sentence description of what you did; put "
|
||||
"machine-readable facts in ``metadata`` (changed_files, "
|
||||
"tests_run, decisions, findings, etc). At least one of "
|
||||
"``summary`` or ``result`` is required."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"task_id": {
|
||||
"type": "string",
|
||||
"description": _DESC_TASK_ID_DEFAULT,
|
||||
},
|
||||
"summary": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Human-readable handoff, 1-3 sentences. Appears in "
|
||||
"Run History on the dashboard and in downstream "
|
||||
"workers' context."
|
||||
),
|
||||
},
|
||||
"metadata": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"Free-form dict of structured facts about this "
|
||||
"attempt — {\"changed_files\": [...], \"tests_run\": 12, "
|
||||
"\"findings\": [...]}. Surfaced to downstream "
|
||||
"workers alongside ``summary``."
|
||||
),
|
||||
},
|
||||
"result": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Short result log line (legacy field, maps to "
|
||||
"task.result). Use ``summary`` instead when "
|
||||
"possible; this exists for compatibility with "
|
||||
"callers that still set --result on the CLI."
|
||||
),
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
},
|
||||
}
|
||||
|
||||
KANBAN_BLOCK_SCHEMA = {
|
||||
"name": "kanban_block",
|
||||
"description": (
|
||||
"Transition the task to blocked because you need human input "
|
||||
"to proceed. ``reason`` will be shown to the human on the "
|
||||
"board and included in context when someone unblocks you. "
|
||||
"Use for genuine blockers only — don't block on things you can "
|
||||
"resolve yourself."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"task_id": {
|
||||
"type": "string",
|
||||
"description": _DESC_TASK_ID_DEFAULT,
|
||||
},
|
||||
"reason": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"What you need answered, in one or two sentences. "
|
||||
"Don't paste the whole conversation; the human has "
|
||||
"the board and can ask follow-ups via comments."
|
||||
),
|
||||
},
|
||||
},
|
||||
"required": ["reason"],
|
||||
},
|
||||
}
|
||||
|
||||
KANBAN_HEARTBEAT_SCHEMA = {
|
||||
"name": "kanban_heartbeat",
|
||||
"description": (
|
||||
"Signal that you're still alive during a long operation "
|
||||
"(training, encoding, large crawls). Call every few minutes so "
|
||||
"humans see liveness separately from PID checks. Pure side "
|
||||
"effect — no work changes."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"task_id": {
|
||||
"type": "string",
|
||||
"description": _DESC_TASK_ID_DEFAULT,
|
||||
},
|
||||
"note": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional short note describing current progress. "
|
||||
"Shown in the event log."
|
||||
),
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
},
|
||||
}
|
||||
|
||||
KANBAN_COMMENT_SCHEMA = {
|
||||
"name": "kanban_comment",
|
||||
"description": (
|
||||
"Append a comment to a task's thread. Use for durable notes "
|
||||
"that should outlive this run (questions for the next worker, "
|
||||
"partial findings, rationale). Ephemeral reasoning doesn't "
|
||||
"belong here — use your normal response instead."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"task_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Task id. Required (may be your own task or "
|
||||
"another's — comment threads are per-task)."
|
||||
),
|
||||
},
|
||||
"body": {
|
||||
"type": "string",
|
||||
"description": "Markdown-supported comment body.",
|
||||
},
|
||||
"author": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Override author name. Defaults to the current "
|
||||
"profile (HERMES_PROFILE env)."
|
||||
),
|
||||
},
|
||||
},
|
||||
"required": ["task_id", "body"],
|
||||
},
|
||||
}
|
||||
|
||||
KANBAN_CREATE_SCHEMA = {
|
||||
"name": "kanban_create",
|
||||
"description": (
|
||||
"Create a new kanban task, optionally as a child of the current "
|
||||
"one (pass the current task id in ``parents``). Used by "
|
||||
"orchestrator workers to fan out — decompose work into child "
|
||||
"tasks with specific assignees, link them into a pipeline, "
|
||||
"then complete your own task. The dispatcher picks up the new "
|
||||
"tasks on its next tick and spawns the assigned profiles."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"title": {
|
||||
"type": "string",
|
||||
"description": "Short task title (required).",
|
||||
},
|
||||
"assignee": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Profile name that should execute this task "
|
||||
"(e.g. 'researcher-a', 'reviewer', 'writer'). "
|
||||
"Required — tasks without an assignee are never "
|
||||
"dispatched."
|
||||
),
|
||||
},
|
||||
"body": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Opening post: full spec, acceptance criteria, "
|
||||
"links. The assigned worker reads this as part of "
|
||||
"its context."
|
||||
),
|
||||
},
|
||||
"parents": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": (
|
||||
"Parent task ids. The new task stays in 'todo' "
|
||||
"until every parent reaches 'done'; then it "
|
||||
"auto-promotes to 'ready'. Typical fan-in: list "
|
||||
"all the researcher task ids when creating a "
|
||||
"synthesizer task."
|
||||
),
|
||||
},
|
||||
"tenant": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional namespace for multi-project isolation. "
|
||||
"Defaults to HERMES_TENANT env if set."
|
||||
),
|
||||
},
|
||||
"priority": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Dispatcher tiebreaker. Higher = picked sooner "
|
||||
"when multiple ready tasks share an assignee."
|
||||
),
|
||||
},
|
||||
"workspace_kind": {
|
||||
"type": "string",
|
||||
"enum": ["scratch", "dir", "worktree"],
|
||||
"description": (
|
||||
"Workspace flavor: 'scratch' (fresh tmp dir, "
|
||||
"default), 'dir' (shared directory, requires "
|
||||
"absolute workspace_path), 'worktree' (git worktree)."
|
||||
),
|
||||
},
|
||||
"workspace_path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Absolute path for 'dir' or 'worktree' workspace. "
|
||||
"Relative paths are rejected at dispatch."
|
||||
),
|
||||
},
|
||||
"triage": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"If true, task lands in 'triage' instead of 'todo' "
|
||||
"— a specifier profile is expected to flesh out "
|
||||
"the body before work starts."
|
||||
),
|
||||
},
|
||||
"idempotency_key": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"If a non-archived task with this key already "
|
||||
"exists, return that task's id instead of creating "
|
||||
"a duplicate. Useful for retry-safe automation."
|
||||
),
|
||||
},
|
||||
"max_runtime_seconds": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Per-task runtime cap. When exceeded, the "
|
||||
"dispatcher SIGTERMs the worker and re-queues the "
|
||||
"task with outcome='timed_out'."
|
||||
),
|
||||
},
|
||||
"skills": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": (
|
||||
"Skill names to force-load into the dispatched "
|
||||
"worker (in addition to the built-in kanban-worker "
|
||||
"skill). Use this to pin a task to a specialist "
|
||||
"context — e.g. ['translation'] for a translation "
|
||||
"task, ['github-code-review'] for a reviewer task. "
|
||||
"The names must match skills installed on the "
|
||||
"assignee's profile."
|
||||
),
|
||||
},
|
||||
},
|
||||
"required": ["title", "assignee"],
|
||||
},
|
||||
}
|
||||
|
||||
KANBAN_LINK_SCHEMA = {
|
||||
"name": "kanban_link",
|
||||
"description": (
|
||||
"Add a parent→child dependency edge after both tasks already "
|
||||
"exist. The child won't promote to 'ready' until all parents "
|
||||
"are 'done'. Cycles and self-links are rejected."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"parent_id": {"type": "string", "description": "Parent task id."},
|
||||
"child_id": {"type": "string", "description": "Child task id."},
|
||||
},
|
||||
"required": ["parent_id", "child_id"],
|
||||
},
|
||||
}
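Review note: the `parents` and `kanban_link` descriptions above state two rules: a child stays in 'todo' until every parent is 'done', and cycles and self-links are rejected. The sketch below models those two rules in memory so they can be reasoned about in isolation. `TaskGraph` and its methods are hypothetical names for illustration, not the `kanban_db` API, and treating a parentless task as immediately 'ready' is an assumption.

```python
from typing import Dict, Iterable, Set


class TaskGraph:
    """Hypothetical in-memory model of dependency-gated promotion."""

    def __init__(self) -> None:
        self.status: Dict[str, str] = {}
        self.parents: Dict[str, Set[str]] = {}

    def create(self, tid: str, parents: Iterable[str] = ()) -> str:
        self.status[tid] = "todo"
        self.parents[tid] = set()
        for parent_id in parents:
            self.link(parent_id, tid)
        self._promote(tid)
        return self.status[tid]

    def link(self, parent_id: str, child_id: str) -> None:
        if parent_id == child_id:
            raise ValueError("a task cannot be its own parent")
        if parent_id not in self.status or child_id not in self.status:
            raise ValueError("unknown task id")
        # A cycle would form if the child is already an ancestor of the parent.
        if child_id in self._ancestors(parent_id):
            raise ValueError("link would create a cycle")
        self.parents[child_id].add(parent_id)

    def complete(self, tid: str) -> None:
        self.status[tid] = "done"
        for child_id, parent_ids in self.parents.items():
            if tid in parent_ids:
                self._promote(child_id)

    def _promote(self, tid: str) -> None:
        # Assumption: a task with no parents is vacuously unblocked.
        unblocked = all(self.status[p] == "done" for p in self.parents[tid])
        if self.status[tid] == "todo" and unblocked:
            self.status[tid] = "ready"

    def _ancestors(self, tid: str) -> Set[str]:
        seen: Set[str] = set()
        stack = list(self.parents.get(tid, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen
```

A fan-in in this model: create two researcher tasks, then a synthesizer task listing both as parents; the synthesizer only becomes 'ready' after both researchers are completed.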
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Registration
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
registry.register(
|
||||
name="kanban_show",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_SHOW_SCHEMA,
|
||||
handler=_handle_show,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="📋",
|
||||
)
|
||||
|
||||
registry.register(
|
||||
name="kanban_complete",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_COMPLETE_SCHEMA,
|
||||
handler=_handle_complete,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="✔",
|
||||
)
|
||||
|
||||
registry.register(
|
||||
name="kanban_block",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_BLOCK_SCHEMA,
|
||||
handler=_handle_block,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="⏸",
|
||||
)
|
||||
|
||||
registry.register(
|
||||
name="kanban_heartbeat",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_HEARTBEAT_SCHEMA,
|
||||
handler=_handle_heartbeat,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="💓",
|
||||
)
|
||||
|
||||
registry.register(
|
||||
name="kanban_comment",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_COMMENT_SCHEMA,
|
||||
handler=_handle_comment,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="💬",
|
||||
)
|
||||
|
||||
registry.register(
|
||||
name="kanban_create",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_CREATE_SCHEMA,
|
||||
handler=_handle_create,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="➕",
|
||||
)
|
||||
|
||||
registry.register(
|
||||
name="kanban_link",
|
||||
toolset="kanban",
|
||||
schema=KANBAN_LINK_SCHEMA,
|
||||
handler=_handle_link,
|
||||
check_fn=_check_kanban_mode,
|
||||
emoji="🔗",
|
||||
)
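Review note: every registration above passes `check_fn=_check_kanban_mode`, and most handlers resolve `task_id` through `_default_task_id`. The sketch below restates that gating and fallback logic in a form that can be exercised without the real registry. The `env` parameter, `visible_tools`, and the function names without leading underscores are assumptions added for testability; how `tools.registry` actually consumes `check_fn` is not shown in this diff.

```python
import json
import os
from typing import Any, List, Mapping, Optional

ENV_VAR = "HERMES_KANBAN_TASK"

KANBAN_TOOLS = [
    "kanban_show", "kanban_complete", "kanban_block", "kanban_heartbeat",
    "kanban_comment", "kanban_create", "kanban_link",
]


def check_kanban_mode(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when the dispatcher-provided task id is present and non-empty."""
    env = os.environ if env is None else env
    return bool(env.get(ENV_VAR))


def visible_tools(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Hypothetical stand-in for the registry filtering tools by check_fn."""
    return list(KANBAN_TOOLS) if check_kanban_mode(env) else []


def resolve_task_id(arg: Optional[str],
                    env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """An explicit argument wins; otherwise fall back to the env var."""
    env = os.environ if env is None else env
    return arg or env.get(ENV_VAR) or None


def ok(**fields: Any) -> str:
    """Success envelope with the same shape as _ok in the diff."""
    return json.dumps({"ok": True, **fields})


if __name__ == "__main__":
    worker_env = {ENV_VAR: "t_123"}
    print(visible_tools({}))            # a plain chat session
    print(visible_tools(worker_env))    # a dispatched worker
    print(ok(task_id=resolve_task_id(None, worker_env)))
```

Passing the environment as a parameter is only a testing convenience here; the handlers in the diff read `os.environ` directly.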
|
||||
@@ -83,6 +83,7 @@ import threading
|
||||
import time
|
||||
from datetime import datetime
|
||||
from typing import Any, Dict, List, Optional
|
||||
from urllib.parse import urlparse
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -403,6 +404,67 @@ def _resolve_stdio_command(command: str, env: dict) -> tuple[str, dict]:
|
||||
return resolved_command, resolved_env
|
||||
|
||||
|
||||
class InvalidMcpUrlError(ValueError):
    """Raised when a remote MCP server's ``url`` cannot be parsed as http(s)://.

    Validated once at startup so we fail fast with a clear message instead of
    burning through the reconnect-backoff loop on every attempt. (Ported from
    anomalyco/opencode#25019.)
    """


def _validate_remote_mcp_url(server_name: str, url: Any) -> str:
    """Return the URL as a string if it's a valid http(s) remote MCP URL.

    Raises :class:`InvalidMcpUrlError` otherwise with a message naming the
    offending server, so users can spot the bad entry in their config.

    Accepts:
      - ``http://host`` / ``https://host`` with optional port, path, query
      - IPv4, IPv6 (bracketed), DNS hostnames

    Rejects:
      - Non-string values (``None``, dicts, ints)
      - Missing scheme (``example.com/mcp``)
      - Non-http(s) schemes (``file://``, ``ws://``, ``stdio:``; stdio servers
        use the ``command`` key, not ``url``)
      - Empty host (``http://``, ``https:///path``)
    """
    if not isinstance(url, str):
        raise InvalidMcpUrlError(
            f"Invalid MCP URL for '{server_name}': expected a string, got "
            f"{type(url).__name__}"
        )
    stripped = url.strip()
    if not stripped:
        raise InvalidMcpUrlError(
            f"Invalid MCP URL for '{server_name}': empty url"
        )
    try:
        parsed = urlparse(stripped)
    except Exception as exc:  # urlparse is very permissive; belt and braces
        raise InvalidMcpUrlError(
            f"Invalid MCP URL for '{server_name}': {stripped!r} ({exc})"
        ) from exc
    if parsed.scheme.lower() not in ("http", "https"):
        raise InvalidMcpUrlError(
            f"Invalid MCP URL for '{server_name}': scheme must be http or "
            f"https, got {parsed.scheme!r} ({stripped!r})"
        )
    if not parsed.netloc:
        raise InvalidMcpUrlError(
            f"Invalid MCP URL for '{server_name}': missing host ({stripped!r})"
        )
    # ``urlparse`` accepts ``http://:8080`` (empty host, explicit port).
    # Reject that: we need a real host.
    if not parsed.hostname:
        raise InvalidMcpUrlError(
            f"Invalid MCP URL for '{server_name}': missing hostname "
            f"({stripped!r})"
        )
    return stripped
|
||||
|
||||
|
||||
def _format_connect_error(exc: BaseException) -> str:
|
||||
"""Render nested MCP connection errors into an actionable short message."""
|
||||
|
||||
@@ -1287,6 +1349,21 @@ class MCPServerTask:
|
||||
"this warning.",
|
||||
self.name,
|
||||
)
|
||||
|
||||
# Validate remote URL once, up front. Raising here (rather than
|
||||
# letting it blow up inside the SDK's httpx layer on every retry)
|
||||
# means a typo in config.yaml fails fast with a clear error — and
|
||||
# critically, no reconnect-backoff burn. (Ported from
|
||||
# anomalyco/opencode#25019.)
|
||||
if self._is_http():
|
||||
try:
|
||||
_validate_remote_mcp_url(self.name, config.get("url"))
|
||||
except InvalidMcpUrlError as exc:
|
||||
logger.warning("%s", exc)
|
||||
self._error = exc
|
||||
self._ready.set()
|
||||
return
|
||||
|
||||
retries = 0
|
||||
initial_retries = 0
|
||||
backoff = 1.0
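Review note: the checks in `_validate_remote_mcp_url` each correspond to a specific way `urllib.parse.urlparse` accepts input that is not a usable remote URL. The sketch below runs the same three checks (scheme, netloc, hostname) over sample inputs to show which check catches which case. `classify` and its return labels are hypothetical helpers for illustration; only `urlparse` and its `scheme`, `netloc`, and `hostname` attributes come from the standard library.

```python
from urllib.parse import urlparse


def classify(url: str) -> str:
    """Apply the scheme, netloc, and hostname checks in the same order as the diff."""
    parsed = urlparse(url.strip())
    if parsed.scheme.lower() not in ("http", "https"):
        return "bad scheme"
    if not parsed.netloc:
        return "missing host"
    if not parsed.hostname:
        # netloc can be non-empty while hostname is None, e.g. ":8080"
        return "missing hostname"
    return "ok"


SAMPLES = [
    "https://mcp.example.com/v1?x=1",
    "http://[::1]:8080/mcp",
    "example.com/mcp",
    "ws://example.com",
    "stdio:server",
    "https:///path",
    "http://",
    "http://:8080",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample!r:35} -> {classify(sample)}")
```

The last sample is the reason the hostname check exists in addition to the netloc check: `urlparse("http://:8080")` reports a netloc of `:8080` but no hostname.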
|
||||
|
||||
+23
@@ -60,6 +60,11 @@ _HERMES_CORE_TOOLS = [
|
||||
"send_message",
|
||||
# Home Assistant smart home control (gated on HASS_TOKEN via check_fn)
|
||||
"ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
|
||||
# Kanban multi-agent coordination — only in schema when the agent is
|
||||
# spawned as a kanban worker (HERMES_KANBAN_TASK env set), otherwise
|
||||
# zero schema footprint. Gated via check_fn in tools/kanban_tools.py.
|
||||
"kanban_show", "kanban_complete", "kanban_block", "kanban_heartbeat",
|
||||
"kanban_comment", "kanban_create", "kanban_link",
|
||||
]
|
||||
|
||||
|
||||
@@ -202,6 +207,24 @@ TOOLSETS = {
|
||||
"includes": []
|
||||
},
|
||||
|
||||
"kanban": {
|
||||
"description": (
|
||||
"Kanban multi-agent coordination — only active when the agent "
|
||||
"is spawned by the kanban dispatcher (HERMES_KANBAN_TASK env "
|
||||
"set). The dispatcher runs inside the gateway by default; see "
|
||||
"`kanban.dispatch_in_gateway` in config.yaml. Lets workers mark "
|
||||
"tasks done with structured handoffs, block for human input, "
|
||||
"heartbeat during long ops, comment on threads, and (for "
|
||||
"orchestrators) fan out into child tasks."
|
||||
),
|
||||
"tools": [
|
||||
"kanban_show", "kanban_complete", "kanban_block",
|
||||
"kanban_heartbeat", "kanban_comment",
|
||||
"kanban_create", "kanban_link",
|
||||
],
|
||||
"includes": [],
|
||||
},
|
||||
|
||||
"discord": {
|
||||
"description": "Discord read and participate tools (fetch messages, search members, create threads)",
|
||||
"tools": ["discord"],
|
||||
|
||||
@@ -1806,6 +1806,21 @@ def _init_session(sid: str, key: str, agent, history: list, cols: int = 80):
|
||||
load_permanent_allowlist()
|
||||
except Exception:
|
||||
pass
|
||||
# Surface the self-improvement background review's "💾 …" summary as a
|
||||
# review.summary event so Ink can render it as a persistent system line
|
||||
# in the transcript. In the CLI path this message is printed via
|
||||
# prompt_toolkit; the TUI has no equivalent print surface, so without
|
||||
# this callback the review would write the skill/memory change silently.
|
||||
try:
|
||||
agent.background_review_callback = (
|
||||
lambda message, _sid=sid: _emit(
|
||||
"review.summary", _sid, {"text": str(message)}
|
||||
)
|
||||
)
|
||||
except Exception:
|
||||
# Bare AIAgents that don't expose the attribute (unlikely, but keep
|
||||
# session startup resilient).
|
||||
pass
|
||||
_wire_callbacks(sid)
|
||||
_notify_session_boundary("on_session_reset", key)
|
||||
_emit("session.info", sid, _session_info(agent))
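Review note: the callback added in this hunk binds the session id through a default argument (`_sid=sid`) and forwards the review message as a `review.summary` event. The sketch below reproduces that wiring against stubs so the event shape can be checked in isolation. `StubAgent`, `emit`, `events`, and `wire_review_summary` are hypothetical stand-ins; the real `_emit` and agent class are not shown in this diff.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Recorded events, standing in for whatever transport the real _emit uses.
events: List[Tuple[str, str, Dict[str, Any]]] = []


def emit(kind: str, sid: str, payload: Dict[str, Any]) -> None:
    """Hypothetical stand-in for _emit: record instead of sending."""
    events.append((kind, sid, payload))


class StubAgent:
    """Minimal agent exposing only the attribute the hunk assigns."""

    background_review_callback: Optional[Callable[[Any], None]] = None


def wire_review_summary(agent: StubAgent, sid: str) -> None:
    # The default argument captures sid when the lambda is created, so the
    # callback keeps reporting to this session even if sid is rebound later.
    agent.background_review_callback = (
        lambda message, _sid=sid: emit(
            "review.summary", _sid, {"text": str(message)}
        )
    )


if __name__ == "__main__":
    agent = StubAgent()
    wire_review_summary(agent, "session-1")
    agent.background_review_callback("saved a skill")
    print(events)
```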
|
||||
|
||||
@@ -96,3 +96,41 @@ describe('mouse wheel modifier decoding', () => {
|
||||
expect(key).toMatchObject({ name: 'wheelup', meta: true })
|
||||
})
|
||||
})
|
||||
|
||||
describe('fragmented SGR mouse recovery', () => {
|
||||
it('re-synthesizes bracket-only SGR mouse tails as mouse events', () => {
|
||||
const [[mouse]] = parseMultipleKeypresses(INITIAL_STATE, '[<35;159;11M')
|
||||
|
||||
expect(mouse).toMatchObject({ kind: 'mouse', button: 35, col: 159, row: 11, action: 'press' })
|
||||
})
|
||||
|
||||
it('re-synthesizes angle-only SGR mouse tails as mouse events', () => {
|
||||
const [[mouse]] = parseMultipleKeypresses(INITIAL_STATE, '<35;159;11M')
|
||||
|
||||
expect(mouse).toMatchObject({ kind: 'mouse', button: 35, col: 159, row: 11, action: 'press' })
|
||||
})
|
||||
|
||||
it('re-synthesizes degraded SGR mouse bursts without leaking prompt text', () => {
|
||||
const [events] = parseMultipleKeypresses(INITIAL_STATE, '5;142;11M<35;159;11M35;124;26M35;119;26Mtyped')
|
||||
|
||||
expect(events.slice(0, 4)).toEqual([
|
||||
expect.objectContaining({ kind: 'mouse', button: 5, col: 142, row: 11 }),
|
||||
expect.objectContaining({ kind: 'mouse', button: 35, col: 159, row: 11 }),
|
||||
expect.objectContaining({ kind: 'mouse', button: 35, col: 124, row: 26 }),
|
||||
expect.objectContaining({ kind: 'mouse', button: 35, col: 119, row: 26 })
|
||||
])
|
||||
expect(events[4]).toMatchObject({ kind: 'key', sequence: 'typed' })
|
||||
})
|
||||
|
||||
it('keeps isolated semicolon text that only resembles a prefixless mouse report', () => {
|
||||
const [[key]] = parseMultipleKeypresses(INITIAL_STATE, 'see 1;2;3M for details')
|
||||
|
||||
expect(key).toMatchObject({ kind: 'key', sequence: 'see 1;2;3M for details' })
|
||||
})
|
||||
|
||||
it('does not match prefixless fragments inside longer digit runs', () => {
|
||||
const [[key]] = parseMultipleKeypresses(INITIAL_STATE, '1234;56;78M9;10;11M')
|
||||
|
||||
expect(key).toMatchObject({ kind: 'key', sequence: '1234;56;78M9;10;11M' })
|
||||
})
|
||||
})
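Review note: the recovery tests above depend on the `SGR_MOUSE_FRAGMENT_RE` pattern and the run-grouping loop added to the parser in this change. The sketch below is a Python port of that pattern plus the visible part of the loop: contiguous matches are grouped into runs, and each run gets the two flags the TypeScript computes (`hasExplicitMousePrefix`, `isFragmentBurst`). It is a port for illustration, not the TUI code; `fragment_runs` and `describe` are hypothetical names, and how the flags drive the final decision lies beyond the portion of the function shown here, so the sketch stops at computing them.

```python
import re
from typing import Dict, List, Tuple

# Same pattern as the TypeScript constant; re.ASCII keeps \d to 0-9 as in JS.
SGR_MOUSE_FRAGMENT_RE = re.compile(
    r"(?<!\d)(?:\[<|<)?(?:[0-9]|[1-9][0-9]|1\d{2}|2[0-4]\d|25[0-5]);\d+;\d+[Mm]",
    re.ASCII,
)


def fragment_runs(text: str) -> List[Dict]:
    """Group matches that sit back to back (no gap) into runs."""
    runs: List[Dict] = []
    for match in SGR_MOUSE_FRAGMENT_RE.finditer(text):
        if runs and runs[-1]["end"] == match.start():
            runs[-1]["fragments"].append(match.group())
            runs[-1]["end"] = match.end()
        else:
            runs.append({
                "start": match.start(),
                "end": match.end(),
                "fragments": [match.group()],
            })
    return runs


def describe(run: Dict) -> Tuple[bool, bool]:
    """Return (has_explicit_mouse_prefix, is_fragment_burst) for one run."""
    has_prefix = any(f.startswith(("[<", "<")) for f in run["fragments"])
    is_burst = len(run["fragments"]) > 1
    return has_prefix, is_burst


if __name__ == "__main__":
    for sample in (
        "[<35;159;11M",
        "5;142;11M<35;159;11M35;124;26M35;119;26Mtyped",
        "see 1;2;3M for details",
        "1234;56;78M9;10;11M",
    ):
        for run in fragment_runs(sample):
            print(repr(sample), run["fragments"], describe(run))
```

Note that the pattern by itself still matches `1;2;3M` and `9;10;11M` in the last two samples; both are single, prefixless runs, which is consistent with the tests expecting those inputs to remain ordinary key input.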
|
||||
|
||||
@@ -63,6 +63,7 @@ const XTVERSION_RE = /^\x1bP>\|(.*?)(?:\x07|\x1b\\)$/s
|
||||
// Button 32=left-drag (0x20 | motion-bit). Plain 0/1/2 = left/mid/right click.
|
||||
// eslint-disable-next-line no-control-regex
|
||||
const SGR_MOUSE_RE = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/
|
||||
const SGR_MOUSE_FRAGMENT_RE = /(?<!\d)(?:\[<|<)?(?:[0-9]|[1-9][0-9]|1\d{2}|2[0-4]\d|25[0-5]);\d+;\d+[Mm]/g
|
||||
|
||||
function createPasteKey(content: string): ParsedKey {
|
||||
return {
|
||||
@@ -267,23 +268,22 @@ export function parseMultipleKeypresses(
|
||||
} else if (token.type === 'text') {
|
||||
if (inPaste) {
|
||||
pasteBuffer += token.value
|
||||
} else if (/^\[<\d+;\d+;\d+[Mm]$/.test(token.value) || /^\[M[\x60-\x7f][\x20-\uffff]{2}$/.test(token.value)) {
|
||||
// Orphaned SGR/X10 mouse tail (fullscreen only — mouse tracking is off
|
||||
// otherwise). A heavy render blocked the event loop past App's 50ms
|
||||
// flush timer, so the buffered ESC was flushed as a lone Escape and
|
||||
// the continuation `[<btn;col;rowM` arrived as text. Re-synthesize
|
||||
// with the ESC prefix so the scroll event still fires instead of
|
||||
// leaking into the prompt. The spurious Escape is gone; App.tsx's
|
||||
// readableLength check prevents it. The X10 Cb slot is narrowed to
|
||||
// the wheel range [\x60-\x7f] (0x40|modifiers + 32) — a full [\x20-]
|
||||
// range would match typed input like `[MAX]` batched into one read
|
||||
// and silently drop it as a phantom click. Click/drag orphans leak
|
||||
// as visible garbage instead; deletable garbage beats silent loss.
|
||||
const resynthesized = '\x1b' + token.value
|
||||
const mouse = parseMouseEvent(resynthesized)
|
||||
keys.push(mouse ?? parseKeypress(resynthesized))
|
||||
} else {
|
||||
keys.push(parseKeypress(token.value))
|
||||
const mouseFragments = parseTextWithSgrMouseFragments(token.value)
|
||||
|
||||
if (mouseFragments) {
|
||||
keys.push(...mouseFragments)
|
||||
} else if (/^\[M[\x60-\x7f][\x20-\uffff]{2}$/.test(token.value)) {
|
||||
// Orphaned X10 wheel tail (fullscreen only — mouse tracking is off
|
||||
// otherwise). A heavy render blocked the event loop past App's 50ms
|
||||
// flush timer, so the buffered ESC was flushed as a lone Escape and
|
||||
// the continuation arrived as text. Re-synthesize with ESC so the
|
||||
// scroll event still fires instead of leaking into the prompt.
|
||||
const resynthesized = '\x1b' + token.value
|
||||
keys.push(parseKeypress(resynthesized))
|
||||
} else {
|
||||
keys.push(parseKeypress(token.value))
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -625,6 +625,77 @@ function parseMouseEvent(s: string): ParsedMouse | null {
|
||||
}
|
||||
}
|
||||
|
||||
function normalizeSgrMouseFragment(fragment: string): string {
|
||||
if (fragment.startsWith('[<')) {
|
||||
return `\x1b${fragment}`
|
||||
}
|
||||
|
||||
if (fragment.startsWith('<')) {
|
||||
return `\x1b[${fragment}`
|
||||
}
|
||||
|
||||
return `\x1b[<${fragment}`
|
||||
}
|
||||
|
||||
function parseSgrMouseFragment(fragment: string): ParsedInput {
|
||||
const sequence = normalizeSgrMouseFragment(fragment)
|
||||
return parseMouseEvent(sequence) ?? parseKeypress(sequence)
|
||||
}
|
||||
|
||||
function parseTextWithSgrMouseFragments(text: string): ParsedInput[] | null {
|
||||
SGR_MOUSE_FRAGMENT_RE.lastIndex = 0
|
||||
|
||||
const matches = [...text.matchAll(SGR_MOUSE_FRAGMENT_RE)]
|
||||
if (matches.length === 0) {
|
||||
return null
|
||||
}
|
||||
|
||||
const parsed: ParsedInput[] = []
|
||||
let cursor = 0
|
||||
let consumedAny = false
|
||||
|
||||
for (let i = 0; i < matches.length;) {
|
||||
const first = matches[i]!
|
||||
const run: RegExpMatchArray[] = [first]
|
||||
let runEnd = first.index! + first[0].length
|
||||
i++
|
||||
|
||||
while (i < matches.length && matches[i]!.index === runEnd) {
|
||||
run.push(matches[i]!)
|
||||
runEnd = matches[i]!.index! + matches[i]![0].length
|
||||
i++
|
||||
}
|
||||
|
||||
const hasExplicitMousePrefix = run.some(match => match[0].startsWith('[<') || match[0].startsWith('<'))
|
||||
const isFragmentBurst = run.length > 1
|
||||
|
||||
if (!hasExplicitMousePrefix && !isFragmentBurst) {
|
||||
continue
|
||||
}
|
||||
|
||||
if (first.index! > cursor) {
|
||||
parsed.push(parseKeypress(text.slice(cursor, first.index!)))
|
||||
}
|
||||
|
||||
for (const match of run) {
|
||||
parsed.push(parseSgrMouseFragment(match[0]))
|
||||
}
|
||||
|
||||
cursor = runEnd
|
||||
consumedAny = true
|
||||
}
|
||||
|
||||
if (!consumedAny) {
|
||||
return null
|
||||
}
|
||||
|
||||
if (cursor < text.length) {
|
||||
parsed.push(parseKeypress(text.slice(cursor)))
|
||||
}
|
||||
|
||||
return parsed
|
||||
}
|
||||
|
||||
function parseKeypress(s: string = ''): ParsedKey {
|
||||
let parts
|
||||
|
||||
|
||||
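The fragment-recovery path added in the hunk above (`parseTextWithSgrMouseFragments`) can be exercised in isolation. The sketch below is a simplified stand-in, not the shipped code: it reuses the same fragment pattern and prefix normalization, but `splitSegments` and the `Segment` type are hypothetical names, and it tags segments instead of calling `parseMouseEvent` / `parseKeypress`.

```typescript
// Same pattern as SGR_MOUSE_FRAGMENT_RE in the diff: an optional `[<` or `<` prefix,
// a button code in 0..255, then `;col;row` and a final `M` (press) or `m` (release).
const FRAGMENT_RE = /(?<!\d)(?:\[<|<)?(?:[0-9]|[1-9][0-9]|1\d{2}|2[0-4]\d|25[0-5]);\d+;\d+[Mm]/g

// Hypothetical result type for this sketch only.
type Segment = { kind: 'text' | 'mouse'; value: string }

// Restore whichever part of the `ESC [ <` introducer was lost.
function normalize(fragment: string): string {
  if (fragment.startsWith('[<')) return `\x1b${fragment}`
  if (fragment.startsWith('<')) return `\x1b[${fragment}`
  return `\x1b[<${fragment}`
}

// Hypothetical helper: split text into plain-text and recovered-mouse segments.
function splitSegments(text: string): Segment[] | null {
  const matches = [...text.matchAll(FRAGMENT_RE)]
  const out: Segment[] = []
  let cursor = 0
  let i = 0

  while (i < matches.length) {
    const start = matches[i]!.index!
    let end = start
    const run: string[] = []

    // Collect back-to-back fragments into one run.
    while (i < matches.length && (run.length === 0 || matches[i]!.index === end)) {
      run.push(matches[i]![0])
      end = matches[i]!.index! + matches[i]![0].length
      i++
    }

    const prefixed = run.some(f => f.startsWith('[<') || f.startsWith('<'))

    // A lone, unprefixed `n;n;nM` is more likely typed text than a mouse report.
    if (!prefixed && run.length < 2) continue

    if (start > cursor) out.push({ kind: 'text', value: text.slice(cursor, start) })
    for (const f of run) out.push({ kind: 'mouse', value: normalize(f) })
    cursor = end
  }

  if (out.length === 0) return null
  if (cursor < text.length) out.push({ kind: 'text', value: text.slice(cursor) })

  return out
}
```

Under these assumptions, `splitSegments('hi[<64;10;5M')` keeps `hi` as text and restores the ESC on the mouse tail, while `splitSegments('ratio 1;2;3M')` returns `null` so ordinary typed input is not swallowed.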
@@ -132,6 +132,33 @@ describe('createGatewayEventHandler', () => {
|
||||
expect(ctx.system.sys).toHaveBeenCalledWith('compressing 968 messages (~123,400 tok)…')
|
||||
})
|
||||
|
||||
it('surfaces self-improvement review summaries as a persistent system line', () => {
|
||||
const appended: Msg[] = []
|
||||
const ctx = buildCtx(appended)
|
||||
const onEvent = createGatewayEventHandler(ctx)
|
||||
|
||||
onEvent({
|
||||
payload: { text: "💾 Self-improvement review: Skill 'hermes-release' patched" },
|
||||
type: 'review.summary'
|
||||
} as any)
|
||||
|
||||
expect(ctx.system.sys).toHaveBeenCalledWith(
|
||||
"💾 Self-improvement review: Skill 'hermes-release' patched"
|
||||
)
|
||||
})
|
||||
|
||||
it('ignores review.summary events with empty or missing text', () => {
|
||||
const appended: Msg[] = []
|
||||
const ctx = buildCtx(appended)
|
||||
const onEvent = createGatewayEventHandler(ctx)
|
||||
|
||||
onEvent({ payload: { text: '' }, type: 'review.summary' } as any)
|
||||
onEvent({ payload: { text: ' ' }, type: 'review.summary' } as any)
|
||||
onEvent({ payload: undefined, type: 'review.summary' } as any)
|
||||
|
||||
expect(ctx.system.sys).not.toHaveBeenCalled()
|
||||
})
|
||||
|
||||
it('clears the visible todo list when the todo tool returns an empty list', () => {
|
||||
const appended: Msg[] = []
|
||||
const todos = [{ content: 'Boil water', id: 'boil', status: 'in_progress' }]
|
||||
|
||||
@@ -3,11 +3,19 @@ import { describe, expect, it, vi } from 'vitest'
|
||||
import { resetTerminalModes, TERMINAL_MODE_RESET } from '../lib/terminalModes.js'
|
||||
|
||||
describe('terminal mode reset', () => {
|
||||
it('includes the sticky input modes Hermes enables', () => {
|
||||
it('includes common sticky input modes', () => {
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[0\'z')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[0\'{')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?2029l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1016l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1015l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1006l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1005l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1003l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1002l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1001l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1000l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?9l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1004l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?2004l')
|
||||
expect(TERMINAL_MODE_RESET).toContain('\x1b[?1049l')
|
||||
|
||||
@@ -510,6 +510,20 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
|
||||
|
||||
return
|
||||
|
||||
case 'review.summary': {
|
||||
// Self-improvement background review emitted a persistent summary
|
||||
// of what it saved to memory/skills. Surface it as a system line
|
||||
// in the transcript so it never gets lost to a transient status
|
||||
// flash. Python-side already formats it as "💾 Self-improvement
|
||||
// review: …".
|
||||
const text = String(ev.payload?.text ?? '').trim()
|
||||
if (text) {
|
||||
sys(text)
|
||||
}
|
||||
|
||||
return
|
||||
}
|
||||
|
||||
case 'subagent.spawn_requested':
|
||||
// Child built but not yet running (waiting on ThreadPoolExecutor slot).
|
||||
// Preserve completed state if a later event races in before this one.
|
||||
|
||||
@@ -22,6 +22,7 @@ import { GoodVibesHeart, StatusRule, StickyPromptTracker, TranscriptScrollbar }
|
||||
import { FloatingOverlays, PromptZone } from './appOverlays.js'
|
||||
import { Banner, Panel, SessionPanel } from './branding.js'
|
||||
import { FpsOverlay } from './fpsOverlay.js'
|
||||
import { HelpHint } from './helpHint.js'
|
||||
import { MessageLine } from './messageLine.js'
|
||||
import { QueuedMessages } from './queuedMessages.js'
|
||||
import { LiveTodoPanel, StreamingAssistant } from './streamingAssistant.js'
|
||||
@@ -242,6 +243,8 @@ const ComposerPane = memo(function ComposerPane({
|
||||
pagerPageSize={composer.pagerPageSize}
|
||||
/>
|
||||
|
||||
{composer.input === '?' && !composer.inputBuf.length && <HelpHint t={ui.theme} />}
|
||||
|
||||
{!isBlocked && (
|
||||
<>
|
||||
{composer.inputBuf.map((line, i) => (
|
||||
|
||||
@@ -0,0 +1,73 @@
|
||||
import { Box, Text } from '@hermes/ink'
|
||||
|
||||
import { HOTKEYS } from '../content/hotkeys.js'
|
||||
import type { Theme } from '../theme.js'
|
||||
|
||||
const COMMON_COMMANDS: [string, string][] = [
|
||||
['/help', 'full list of commands + hotkeys'],
|
||||
['/clear', 'start a new session'],
|
||||
['/resume', 'resume a prior session'],
|
||||
['/details', 'control transcript detail level'],
|
||||
['/copy', 'copy selection or last assistant message'],
|
||||
['/quit', 'exit hermes']
|
||||
]
|
||||
|
||||
const HOTKEY_PREVIEW = HOTKEYS.slice(0, 8)
|
||||
|
||||
export function HelpHint({ t }: { t: Theme }) {
|
||||
const labelW = Math.max(
|
||||
...COMMON_COMMANDS.map(([k]) => k.length),
|
||||
...HOTKEY_PREVIEW.map(([k]) => k.length)
|
||||
)
|
||||
|
||||
const pad = (s: string) => s + ' '.repeat(Math.max(0, labelW - s.length + 2))
|
||||
|
||||
return (
|
||||
<Box alignItems="flex-start" bottom="100%" flexDirection="column" left={0} position="absolute" right={0}>
|
||||
<Box
|
||||
alignSelf="flex-start"
|
||||
borderColor={t.color.primary}
|
||||
borderStyle="round"
|
||||
flexDirection="column"
|
||||
marginBottom={1}
|
||||
opaque
|
||||
paddingX={1}
|
||||
>
|
||||
<Text>
|
||||
<Text bold color={t.color.primary}>
|
||||
? quick help
|
||||
</Text>
|
||||
<Text color={t.color.muted}>
|
||||
{' · type /help for the full panel · backspace to dismiss'}
|
||||
</Text>
|
||||
</Text>
|
||||
|
||||
<Box marginTop={1}>
|
||||
<Text bold color={t.color.accent}>
|
||||
Common commands
|
||||
</Text>
|
||||
</Box>
|
||||
|
||||
{COMMON_COMMANDS.map(([k, v]) => (
|
||||
<Text key={k}>
|
||||
<Text color={t.color.label}>{pad(k)}</Text>
|
||||
<Text color={t.color.muted}>{v}</Text>
|
||||
</Text>
|
||||
))}
|
||||
|
||||
<Box marginTop={1}>
|
||||
<Text bold color={t.color.accent}>
|
||||
Hotkeys
|
||||
</Text>
|
||||
</Box>
|
||||
|
||||
{HOTKEY_PREVIEW.map(([k, v]) => (
|
||||
<Text key={k}>
|
||||
<Text color={t.color.label}>{pad(k)}</Text>
|
||||
<Text color={t.color.muted}>{v}</Text>
|
||||
</Text>
|
||||
))}
|
||||
</Box>
|
||||
</Box>
|
||||
)
|
||||
}
|
||||
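The column alignment in `HelpHint` above comes down to padding every label to the longest label plus a two-space gap. A minimal sketch of that idea, assuming plain string length is an acceptable proxy for display width; `makePad` is a hypothetical helper, not part of the diff:

```typescript
// Hypothetical helper: returns a function that right-pads a label so that
// every description starts in the same column.
function makePad(labels: readonly string[], gap = 2): (s: string) => string {
  // The leading 0 keeps Math.max from returning -Infinity on an empty list.
  const width = Math.max(0, ...labels.map(label => label.length))

  return s => s + ' '.repeat(Math.max(0, width - s.length + gap))
}

const pad = makePad(['/help', '/details'])
const line = pad('/help') + 'full list of commands + hotkeys'
```

The inner `Math.max(0, ...)` mirrors the guard in the component: a label longer than the computed width gets no padding instead of a negative repeat count, which would throw.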
@@ -493,6 +493,7 @@ export type GatewayEvent =
|
||||
| { payload: { request_id: string }; session_id?: string; type: 'sudo.request' }
|
||||
| { payload: { env_var: string; prompt: string; request_id: string }; session_id?: string; type: 'secret.request' }
|
||||
| { payload: { task_id: string; text: string }; session_id?: string; type: 'background.complete' }
|
||||
| { payload?: { text?: string }; session_id?: string; type: 'review.summary' }
|
||||
| { payload: SubagentEventPayload; session_id?: string; type: 'subagent.spawn_requested' }
|
||||
| { payload: SubagentEventPayload; session_id?: string; type: 'subagent.start' }
|
||||
| { payload: SubagentEventPayload; session_id?: string; type: 'subagent.thinking' }
|
||||
|
||||
@@ -1,10 +1,18 @@
|
||||
import { writeSync } from 'node:fs'
|
||||
|
||||
export const TERMINAL_MODE_RESET =
|
||||
'\x1b[0\'z' + // DEC locator reporting
|
||||
'\x1b[0\'{' + // selectable locator events
|
||||
'\x1b[?2029l' + // passive mouse
|
||||
'\x1b[?1016l' + // SGR-pixels mouse
|
||||
'\x1b[?1015l' + // urxvt decimal mouse
|
||||
'\x1b[?1006l' + // SGR mouse
|
||||
'\x1b[?1005l' + // UTF-8 extended mouse
|
||||
'\x1b[?1003l' + // any-motion mouse
|
||||
'\x1b[?1002l' + // button-motion mouse
|
||||
'\x1b[?1001l' + // highlight mouse
|
||||
'\x1b[?1000l' + // click mouse
|
||||
'\x1b[?9l' + // X10 mouse
|
||||
'\x1b[?1004l' + // focus events
|
||||
'\x1b[?2004l' + // bracketed paste
|
||||
'\x1b[?1049l' + // alternate screen
|
||||
|
||||
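Each `\x1b[?<n>l` entry in `TERMINAL_MODE_RESET` above is a DEC private mode reset. As a rough sketch of how such a string could be assembled and flushed synchronously on shutdown: the mode list below is an illustrative subset, and `buildReset` / `writeReset` are hypothetical names rather than the module's actual exports.

```typescript
import { writeSync } from 'node:fs'

// Illustrative subset of DEC private modes; the full list is in the diff above.
const PRIVATE_MODES: readonly number[] = [1006, 1003, 1002, 1000, 1004, 2004, 1049]

// `CSI ? <mode> l` resets (disables) one DEC private mode.
function buildReset(modes: readonly number[]): string {
  return modes.map(mode => `\x1b[?${mode}l`).join('')
}

// Hypothetical writer: a synchronous write is used so the bytes are not left
// in a pending async queue when the process is about to exit.
function writeReset(fd: number, modes: readonly number[] = PRIVATE_MODES): boolean {
  try {
    writeSync(fd, buildReset(modes))

    return true
  } catch {
    // The descriptor may already be closed during shutdown.
    return false
  }
}
```

A caller might invoke `writeReset(1)` from an exit handler; whether the real `resetTerminalModes` takes a file descriptor is not shown in this hunk.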
@@ -38,17 +38,16 @@ import {
|
||||
Sparkles,
|
||||
Star,
|
||||
Terminal,
|
||||
Users,
|
||||
Wrench,
|
||||
X,
|
||||
Zap,
|
||||
} from "lucide-react";
|
||||
import {
|
||||
Button,
|
||||
ListItem,
|
||||
SelectionSwitcher,
|
||||
Spinner,
|
||||
Typography,
|
||||
} from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { SelectionSwitcher } from "@nous-research/ui/ui/components/selection-switcher";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { Backdrop } from "@/components/Backdrop";
|
||||
import { SidebarFooter } from "@/components/SidebarFooter";
|
||||
@@ -64,6 +63,7 @@ import LogsPage from "@/pages/LogsPage";
|
||||
import AnalyticsPage from "@/pages/AnalyticsPage";
|
||||
import ModelsPage from "@/pages/ModelsPage";
|
||||
import CronPage from "@/pages/CronPage";
|
||||
import ProfilesPage from "@/pages/ProfilesPage";
|
||||
import SkillsPage from "@/pages/SkillsPage";
|
||||
import ChatPage from "@/pages/ChatPage";
|
||||
import { LanguageSwitcher } from "@/components/LanguageSwitcher";
|
||||
@@ -102,6 +102,7 @@ const BUILTIN_ROUTES_CORE: Record<string, ComponentType> = {
|
||||
"/logs": LogsPage,
|
||||
"/cron": CronPage,
|
||||
"/skills": SkillsPage,
|
||||
"/profiles": ProfilesPage,
|
||||
"/config": ConfigPage,
|
||||
"/env": EnvPage,
|
||||
"/docs": DocsPage,
|
||||
@@ -137,6 +138,7 @@ const BUILTIN_NAV_REST: NavItem[] = [
|
||||
{ path: "/logs", labelKey: "logs", label: "Logs", icon: FileText },
|
||||
{ path: "/cron", labelKey: "cron", label: "Cron", icon: Clock },
|
||||
{ path: "/skills", labelKey: "skills", label: "Skills", icon: Package },
|
||||
{ path: "/profiles", labelKey: "profiles", label: "Profiles", icon: Users },
|
||||
{ path: "/config", labelKey: "config", label: "Config", icon: Settings },
|
||||
{ path: "/env", labelKey: "keys", label: "Keys", icon: KeyRound },
|
||||
{
|
||||
@@ -163,6 +165,7 @@ const ICON_MAP: Record<string, ComponentType<{ className?: string }>> = {
|
||||
Globe,
|
||||
Database,
|
||||
Shield,
|
||||
Users,
|
||||
Wrench,
|
||||
Zap,
|
||||
Heart,
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import { Select, SelectOption, Switch } from "@nous-research/ui";
|
||||
import { Select, SelectOption } from "@nous-research/ui/ui/components/select";
|
||||
import { Switch } from "@nous-research/ui/ui/components/switch";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
|
||||
|
||||
@@ -23,8 +23,8 @@
|
||||
* terminal pane keeps working unimpaired.
|
||||
*/
|
||||
|
||||
import { Button } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Card } from "@/components/ui/card";
|
||||
|
||||
import { ModelPickerDialog } from "@/components/ModelPickerDialog";
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import { Button, Typography } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { useI18n } from "@/i18n/context";
|
||||
|
||||
/**
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
import { useEffect, useRef, useState } from "react";
|
||||
import { Brain, Eye, Gauge, Lightbulb, Wrench } from "lucide-react";
|
||||
import { Spinner } from "@nous-research/ui";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { api } from "@/lib/api";
|
||||
import type { ModelInfoResponse } from "@/lib/api";
|
||||
import { formatTokenCount } from "@/lib/format";
|
||||
|
||||
@@ -1,4 +1,6 @@
|
||||
import { Button, ListItem, Spinner } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import type { GatewayClient } from "@/lib/gatewayClient";
|
||||
import { Check, Search, X } from "lucide-react";
|
||||
|
||||
@@ -0,0 +1,63 @@
|
||||
import { forwardRef, type ElementType, type HTMLAttributes, type ReactNode } from "react";
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
type TypographyProps = HTMLAttributes<HTMLElement> & {
|
||||
as?: ElementType;
|
||||
children?: ReactNode;
|
||||
compressed?: boolean;
|
||||
courier?: boolean;
|
||||
expanded?: boolean;
|
||||
mondwest?: boolean;
|
||||
mono?: boolean;
|
||||
sans?: boolean;
|
||||
variant?: "sm" | "md" | "lg" | "xl";
|
||||
};
|
||||
|
||||
const variantClasses: Record<NonNullable<TypographyProps["variant"]>, string> = {
|
||||
sm: "leading-[1.4] text-[.9375rem] tracking-[0.1875rem]",
|
||||
md: "text-[2.625rem] leading-[1] tracking-[0.0525rem]",
|
||||
lg: "text-[2.625rem] leading-[1] tracking-[0.0525rem]",
|
||||
xl: "text-[4.5rem] leading-[1] tracking-[0.135rem]",
|
||||
};
|
||||
|
||||
export const Typography = forwardRef<HTMLElement, TypographyProps>(function Typography(
|
||||
{
|
||||
as: Component = "span",
|
||||
className,
|
||||
compressed,
|
||||
courier,
|
||||
expanded,
|
||||
mondwest,
|
||||
mono,
|
||||
sans,
|
||||
variant,
|
||||
...props
|
||||
},
|
||||
ref,
|
||||
) {
|
||||
const hasFontVariant = compressed || courier || expanded || mondwest || mono || sans;
|
||||
|
||||
return (
|
||||
<Component
|
||||
className={cn(
|
||||
compressed && "font-compressed",
|
||||
courier && "font-courier",
|
||||
expanded && "font-expanded",
|
||||
mondwest && "font-mondwest tracking-[0.1875rem]",
|
||||
mono && "font-mono",
|
||||
(!hasFontVariant || sans) && "font-sans",
|
||||
variant && variantClasses[variant],
|
||||
className,
|
||||
)}
|
||||
ref={ref}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
});
|
||||
|
||||
export const H2 = forwardRef<HTMLHeadingElement, Omit<TypographyProps, "as">>(function H2(
|
||||
{ className, variant = "lg", ...props },
|
||||
ref,
|
||||
) {
|
||||
return <Typography as="h2" className={cn("font-bold", className)} variant={variant} ref={ref} {...props} />;
|
||||
});
|
||||
@@ -1,6 +1,9 @@
|
||||
import { useEffect, useRef, useState } from "react";
|
||||
import { ExternalLink, X, Check } from "lucide-react";
|
||||
import { Button, CopyButton, H2, Spinner } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { CopyButton } from "@nous-research/ui/ui/components/command-block";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { H2 } from "@/components/NouiTypography";
|
||||
import { api, type OAuthProvider, type OAuthStartResponse } from "@/lib/api";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
@@ -9,7 +9,9 @@ import {
|
||||
LogIn,
|
||||
} from "lucide-react";
|
||||
import { api, type OAuthProvider } from "@/lib/api";
|
||||
import { Button, CopyButton, Spinner } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { CopyButton } from "@nous-research/ui/ui/components/command-block";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import {
|
||||
Card,
|
||||
CardContent,
|
||||
@@ -17,7 +19,7 @@ import {
|
||||
CardHeader,
|
||||
CardTitle,
|
||||
} from "@/components/ui/card";
|
||||
import { Badge } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { OAuthLoginModal } from "@/components/OAuthLoginModal";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import { AlertTriangle, Radio, Wifi, WifiOff } from "lucide-react";
|
||||
import type { PlatformStatus } from "@/lib/api";
|
||||
import { isoTimeAgo } from "@/lib/utils";
|
||||
import { Badge } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { Typography } from "@nous-research/ui";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { useSidebarStatus } from "@/hooks/useSidebarStatus";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
import type { GatewayClient } from "@/lib/gatewayClient";
|
||||
import { ListItem } from "@nous-research/ui";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { ChevronRight } from "lucide-react";
|
||||
import {
|
||||
forwardRef,
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
import { useCallback, useEffect, useRef, useState } from "react";
|
||||
import { Palette, Check } from "lucide-react";
|
||||
import { Button, ListItem, Typography } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { BUILTIN_THEMES, useTheme } from "@/themes";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { ListItem } from "@nous-research/ui";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import {
|
||||
AlertCircle,
|
||||
Check,
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import { useEffect, useRef } from "react";
|
||||
import { createPortal } from "react-dom";
|
||||
import { AlertTriangle } from "lucide-react";
|
||||
import { Button } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
export function ConfirmDialog({
|
||||
|
||||
@@ -75,6 +75,7 @@ export const en: Translations = {
|
||||
keys: "Keys",
|
||||
logs: "Logs",
|
||||
models: "Models",
|
||||
profiles: "profiles : multi agents",
|
||||
sessions: "Sessions",
|
||||
skills: "Skills",
|
||||
},
|
||||
@@ -223,6 +224,38 @@ export const en: Translations = {
|
||||
},
|
||||
},
|
||||
|
||||
profiles: {
|
||||
newProfile: "New Profile",
|
||||
name: "Name",
|
||||
namePlaceholder: "e.g. coder, writer, etc.",
|
||||
nameRequired: "Name is required",
|
||||
nameRule:
|
||||
"Lowercase letters, digits, _ and - only; must start with a letter or digit; up to 64 characters.",
|
||||
invalidName: "Invalid profile name",
|
||||
cloneFromDefault: "Clone config from default profile",
|
||||
allProfiles: "Profiles",
|
||||
noProfiles: "No profiles found.",
|
||||
defaultBadge: "default",
|
||||
hasEnv: "env",
|
||||
model: "Model",
|
||||
skills: "Skills",
|
||||
rename: "Rename",
|
||||
editSoul: "Edit SOUL.md",
|
||||
soulSection: "SOUL.md (personality / system prompt)",
|
||||
soulPlaceholder: "# How this agent should behave…",
|
||||
saveSoul: "Save SOUL",
|
||||
soulSaved: "SOUL.md saved",
|
||||
openInTerminal: "Copy CLI command",
|
||||
commandCopied: "Copied to clipboard",
|
||||
copyFailed: "Could not copy",
|
||||
confirmDeleteTitle: "Delete profile?",
|
||||
confirmDeleteMessage:
|
||||
"This permanently deletes profile '{name}' — config, keys, memories, sessions, skills, cron jobs. Cannot be undone.",
|
||||
created: "Created",
|
||||
deleted: "Deleted",
|
||||
renamed: "Renamed",
|
||||
},
|
||||
|
||||
skills: {
|
||||
title: "Skills",
|
||||
searchPlaceholder: "Search skills and toolsets...",
|
||||
|
||||
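The `nameRule` string above describes the accepted profile names in prose. One way to express that rule as a client-side check is sketched below; the regular expression is derived from the wording only, so treat it as an assumption about what the server enforces, and `isValidProfileName` as a hypothetical helper.

```typescript
// Assumed encoding of the rule text: lowercase letters, digits, `_` and `-` only;
// first character must be a letter or digit; at most 64 characters in total.
const PROFILE_NAME_RE = /^[a-z0-9][a-z0-9_-]{0,63}$/

// Hypothetical helper for validating input before calling the profiles API.
function isValidProfileName(name: string): boolean {
  return PROFILE_NAME_RE.test(name)
}

const examples = ['coder', 'writer-2', '_hidden', 'Coder', '']
const accepted = examples.filter(isValidProfileName)
```

Here `accepted` keeps `coder` and `writer-2` and rejects the names that start with `_`, contain an uppercase letter, or are empty.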
@@ -75,6 +75,7 @@ export interface Translations {
|
||||
keys: string;
|
||||
logs: string;
|
||||
models: string;
|
||||
profiles: string;
|
||||
sessions: string;
|
||||
skills: string;
|
||||
};
|
||||
@@ -227,6 +228,37 @@ export interface Translations {
|
||||
};
|
||||
};
|
||||
|
||||
// ── Profiles page ──
|
||||
profiles: {
|
||||
newProfile: string;
|
||||
name: string;
|
||||
namePlaceholder: string;
|
||||
nameRequired: string;
|
||||
nameRule: string;
|
||||
invalidName: string;
|
||||
cloneFromDefault: string;
|
||||
allProfiles: string;
|
||||
noProfiles: string;
|
||||
defaultBadge: string;
|
||||
hasEnv: string;
|
||||
model: string;
|
||||
skills: string;
|
||||
rename: string;
|
||||
editSoul: string;
|
||||
soulSection: string;
|
||||
soulPlaceholder: string;
|
||||
saveSoul: string;
|
||||
soulSaved: string;
|
||||
openInTerminal: string;
|
||||
commandCopied: string;
|
||||
copyFailed: string;
|
||||
confirmDeleteTitle: string;
|
||||
confirmDeleteMessage: string;
|
||||
created: string;
|
||||
deleted: string;
|
||||
renamed: string;
|
||||
};
|
||||
|
||||
// ── Skills page ──
|
||||
skills: {
|
||||
title: string;
|
||||
|
||||
@@ -74,6 +74,7 @@ export const zh: Translations = {
|
||||
keys: "密钥",
|
||||
logs: "日志",
|
||||
models: "模型",
|
||||
profiles: "多Agent配置",
|
||||
sessions: "会话",
|
||||
skills: "技能",
|
||||
},
|
||||
@@ -220,6 +221,38 @@ export const zh: Translations = {
|
||||
},
|
||||
},
|
||||
|
||||
profiles: {
|
||||
newProfile: "新建多Agent配置",
|
||||
name: "名称",
|
||||
namePlaceholder: "例如:coder, writer 等",
|
||||
nameRequired: "名称必填",
|
||||
nameRule:
|
||||
"仅允许小写字母、数字、下划线和短横线;首字符必须是字母或数字;最多 64 个字符。",
|
||||
invalidName: "多Agent配置名称非法",
|
||||
cloneFromDefault: "从默认多Agent配置克隆配置",
|
||||
allProfiles: "多Agent配置列表",
|
||||
noProfiles: "暂无多Agent配置。",
|
||||
defaultBadge: "默认",
|
||||
hasEnv: "已配置 env",
|
||||
model: "模型",
|
||||
skills: "技能",
|
||||
rename: "重命名",
|
||||
editSoul: "编辑 SOUL.md",
|
||||
soulSection: "SOUL.md(人格 / 系统提示词)",
|
||||
soulPlaceholder: "# 这个代理应当如何工作……",
|
||||
saveSoul: "保存 SOUL",
|
||||
soulSaved: "SOUL.md 已保存",
|
||||
openInTerminal: "复制 CLI 命令",
|
||||
commandCopied: "已复制到剪贴板",
|
||||
copyFailed: "复制失败",
|
||||
confirmDeleteTitle: "删除多Agent配置?",
|
||||
confirmDeleteMessage:
|
||||
"将永久删除多Agent配置 '{name}' — 包括配置、密钥、记忆、会话、技能、定时任务。此操作无法撤销。",
|
||||
created: "已创建",
|
||||
deleted: "已删除",
|
||||
renamed: "已重命名",
|
||||
},
|
||||
|
||||
skills: {
|
||||
title: "技能",
|
||||
searchPlaceholder: "搜索技能和工具集...",
|
||||
|
||||
@@ -132,6 +132,47 @@ export const api = {
|
||||
deleteCronJob: (id: string) =>
|
||||
fetchJSON<{ ok: boolean }>(`/api/cron/jobs/${id}`, { method: "DELETE" }),
|
||||
|
||||
// Profiles (minimal)
|
||||
getProfiles: () =>
|
||||
fetchJSON<{ profiles: ProfileInfo[] }>("/api/profiles"),
|
||||
createProfile: (body: { name: string; clone_from_default: boolean }) =>
|
||||
fetchJSON<{ ok: boolean; name: string; path: string }>("/api/profiles", {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify(body),
|
||||
}),
|
||||
renameProfile: (name: string, newName: string) =>
|
||||
fetchJSON<{ ok: boolean; name: string; path: string }>(
|
||||
`/api/profiles/${encodeURIComponent(name)}`,
|
||||
{
|
||||
method: "PATCH",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ new_name: newName }),
|
||||
},
|
||||
),
|
||||
deleteProfile: (name: string) =>
|
||||
fetchJSON<{ ok: boolean }>(
|
||||
`/api/profiles/${encodeURIComponent(name)}`,
|
||||
{ method: "DELETE" },
|
||||
),
|
||||
getProfileSetupCommand: (name: string) =>
|
||||
fetchJSON<{ command: string }>(
|
||||
`/api/profiles/${encodeURIComponent(name)}/setup-command`,
|
||||
),
|
||||
getProfileSoul: (name: string) =>
|
||||
fetchJSON<{ content: string; exists: boolean }>(
|
||||
`/api/profiles/${encodeURIComponent(name)}/soul`,
|
||||
),
|
||||
updateProfileSoul: (name: string, content: string) =>
|
||||
fetchJSON<{ ok: boolean }>(
|
||||
`/api/profiles/${encodeURIComponent(name)}/soul`,
|
||||
{
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ content }),
|
||||
},
|
||||
),
|
||||
|
||||
// Skills & Toolsets
|
||||
getSkills: () => fetchJSON<SkillInfo[]>("/api/skills"),
|
||||
toggleSkill: (name: string, enabled: boolean) =>
|
||||
@@ -380,6 +421,16 @@ export interface AnalyticsResponse {
|
||||
};
|
||||
}
|
||||
|
||||
export interface ProfileInfo {
|
||||
name: string;
|
||||
path: string;
|
||||
is_default: boolean;
|
||||
model: string | null;
|
||||
provider: string | null;
|
||||
has_env: boolean;
|
||||
skill_count: number;
|
||||
}
|
||||
|
||||
export interface ModelsAnalyticsModelEntry {
|
||||
model: string;
|
||||
provider: string;
|
||||
|
||||
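The profile endpoints above all place the profile name in the URL path through `encodeURIComponent`. The sketch below isolates that request-building step for the rename call so it can be inspected without a network; `RequestSpec` and `renameProfileRequest` are hypothetical names, and the real client passes the same pieces to `fetchJSON`, whose implementation is not part of this hunk.

```typescript
// Hypothetical shape for this sketch: the URL plus the options passed to fetch.
type RequestSpec = {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Builds the PATCH request that the diff's renameProfile sends.
function renameProfileRequest(name: string, newName: string): RequestSpec {
  return {
    // Encoding keeps characters such as `/` or spaces from altering the path.
    url: `/api/profiles/${encodeURIComponent(name)}`,
    init: {
      method: 'PATCH',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ new_name: newName })
    }
  }
}

const spec = renameProfileRequest('coder', 'writer')
```

Valid profile names should never need escaping, but encoding anyway means a malformed name produces a failed lookup on the intended route instead of a request to a different one.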
@@ -8,9 +8,11 @@ import type {
|
||||
AnalyticsSkillEntry,
|
||||
} from "@/lib/api";
|
||||
import { timeAgo } from "@/lib/utils";
|
||||
import { Button, Spinner, Stats } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Stats } from "@nous-research/ui/ui/components/stats";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Badge } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -22,7 +22,8 @@ import { WebLinksAddon } from "@xterm/addon-web-links";
|
||||
import { WebglAddon } from "@xterm/addon-webgl";
|
||||
import { Terminal } from "@xterm/xterm";
|
||||
import "@xterm/xterm/css/xterm.css";
|
||||
import { Button, Typography } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { Copy, PanelRight, X } from "lucide-react";
|
||||
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
|
||||
|
||||
@@ -33,10 +33,12 @@ import { getNestedValue, setNestedValue } from "@/lib/nested";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { AutoField } from "@/components/AutoField";
|
||||
import { Button, ListItem, Spinner } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Badge } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -1,6 +1,10 @@
|
||||
import { useCallback, useEffect, useState } from "react";
|
||||
import { Clock, Pause, Play, Plus, Trash2, Zap } from "lucide-react";
|
||||
import { Badge, Button, H2, Select, SelectOption, Spinner } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Select, SelectOption } from "@nous-research/ui/ui/components/select";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { H2 } from "@/components/NouiTypography";
|
||||
import { api } from "@/lib/api";
|
||||
import type { CronJob } from "@/lib/api";
|
||||
import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
|
||||
|
||||
@@ -21,7 +21,9 @@ import { Toast } from "@/components/Toast";
|
||||
import { useConfirmDelete } from "@/hooks/useConfirmDelete";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { OAuthProvidersCard } from "@/components/OAuthProvidersCard";
|
||||
import { Button, ListItem, Spinner } from "@nous-research/ui";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import {
|
||||
Card,
|
||||
CardContent,
|
||||
@@ -29,7 +31,7 @@ import {
|
||||
CardHeader,
|
||||
CardTitle,
|
||||
} from "@/components/ui/card";
|
||||
import { Badge } from "@nous-research/ui";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
@@ -7,14 +7,11 @@ import {
 } from "react";
 import { FileText, RefreshCw } from "lucide-react";
 import { api } from "@/lib/api";
-import {
-  Badge,
-  Button,
-  FilterGroup,
-  Segmented,
-  Spinner,
-  Switch,
-} from "@nous-research/ui";
+import { Badge } from "@nous-research/ui/ui/components/badge";
+import { Button } from "@nous-research/ui/ui/components/button";
+import { FilterGroup, Segmented } from "@nous-research/ui/ui/components/segmented";
+import { Spinner } from "@nous-research/ui/ui/components/spinner";
+import { Switch } from "@nous-research/ui/ui/components/switch";
 import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
 import { Label } from "@/components/ui/label";
 import { useI18n } from "@/i18n";
@@ -20,9 +20,11 @@ import type {
 } from "@/lib/api";
 import { timeAgo } from "@/lib/utils";
 import { formatTokenCount } from "@/lib/format";
-import { Button, Spinner, Stats } from "@nous-research/ui";
+import { Button } from "@nous-research/ui/ui/components/button";
+import { Spinner } from "@nous-research/ui/ui/components/spinner";
+import { Stats } from "@nous-research/ui/ui/components/stats";
 import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
-import { Badge } from "@nous-research/ui";
+import { Badge } from "@nous-research/ui/ui/components/badge";
 import { usePageHeader } from "@/contexts/usePageHeader";
 import { useI18n } from "@/i18n";
 import { PluginSlot } from "@/plugins";
@@ -0,0 +1,444 @@
|
||||
import { useCallback, useEffect, useRef, useState } from "react";
|
||||
import { ChevronDown, Pencil, Plus, Terminal, Trash2, Users } from "lucide-react";
|
||||
import { H2 } from "@/components/NouiTypography";
|
||||
import { api } from "@/lib/api";
|
||||
import type { ProfileInfo } from "@/lib/api";
|
||||
import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { useConfirmDelete } from "@/hooks/useConfirmDelete";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
// Mirrors hermes_cli/profiles.py::_PROFILE_ID_RE so we can reject obviously
|
||||
// invalid names (uppercase, spaces, …) before round-tripping a doomed POST.
|
||||
const PROFILE_NAME_RE = /^[a-z0-9][a-z0-9_-]{0,63}$/;
|
||||
|
||||
export default function ProfilesPage() {
|
||||
const [profiles, setProfiles] = useState<ProfileInfo[]>([]);
|
||||
const [loading, setLoading] = useState(true);
|
||||
const { toast, showToast } = useToast();
|
||||
const { t } = useI18n();
|
||||
|
||||
// Create form
|
||||
const [newName, setNewName] = useState("");
|
||||
const [cloneFromDefault, setCloneFromDefault] = useState(true);
|
||||
const [creating, setCreating] = useState(false);
|
||||
|
||||
// Inline rename state
|
||||
const [renamingFrom, setRenamingFrom] = useState<string | null>(null);
|
||||
const [renameTo, setRenameTo] = useState("");
|
||||
|
||||
// Inline SOUL editor state
|
||||
const [editingSoulFor, setEditingSoulFor] = useState<string | null>(null);
|
||||
const [soulText, setSoulText] = useState("");
|
||||
const [soulSaving, setSoulSaving] = useState(false);
|
||||
// Tracks the latest SOUL request so out-of-order responses don't overwrite
|
||||
// newer state when the user switches profiles or closes the editor.
|
||||
const activeSoulRequest = useRef<string | null>(null);
|
||||
|
||||
const load = useCallback(() => {
|
||||
api
|
||||
.getProfiles()
|
||||
.then((res) => setProfiles(res.profiles))
|
||||
.catch((e) => showToast(`${t.status.error}: ${e}`, "error"))
|
||||
.finally(() => setLoading(false));
|
||||
}, [showToast, t.status.error]);
|
||||
|
||||
useEffect(() => {
|
||||
load();
|
||||
}, [load]);
|
||||
|
||||
const handleCreate = async () => {
|
||||
const name = newName.trim();
|
||||
if (!name) {
|
||||
showToast(t.profiles.nameRequired, "error");
|
||||
return;
|
||||
}
|
||||
if (!PROFILE_NAME_RE.test(name)) {
|
||||
showToast(`${t.profiles.invalidName}: ${t.profiles.nameRule}`, "error");
|
||||
return;
|
||||
}
|
||||
setCreating(true);
|
||||
try {
|
||||
await api.createProfile({ name, clone_from_default: cloneFromDefault });
|
||||
showToast(`${t.profiles.created}: ${name}`, "success");
|
||||
setNewName("");
|
||||
load();
|
||||
} catch (e) {
|
||||
showToast(`${t.status.error}: ${e}`, "error");
|
||||
} finally {
|
||||
setCreating(false);
|
||||
}
|
||||
};
|
||||
|
||||
const handleRenameSubmit = async () => {
|
||||
if (!renamingFrom) return;
|
||||
const target = renameTo.trim();
|
||||
if (!target || target === renamingFrom) {
|
||||
setRenamingFrom(null);
|
||||
setRenameTo("");
|
||||
return;
|
||||
}
|
||||
if (!PROFILE_NAME_RE.test(target)) {
|
||||
showToast(`${t.profiles.invalidName}: ${t.profiles.nameRule}`, "error");
|
||||
return;
|
||||
}
|
||||
try {
|
||||
await api.renameProfile(renamingFrom, target);
|
||||
showToast(`${t.profiles.renamed}: ${renamingFrom} → ${target}`, "success");
|
||||
setRenamingFrom(null);
|
||||
setRenameTo("");
|
||||
load();
|
||||
} catch (e) {
|
||||
showToast(`${t.status.error}: ${e}`, "error");
|
||||
}
|
||||
};
|
||||
|
||||
const openSoulEditor = useCallback(
|
||||
async (name: string) => {
|
||||
if (editingSoulFor === name) {
|
||||
activeSoulRequest.current = null;
|
||||
setEditingSoulFor(null);
|
||||
return;
|
||||
}
|
||||
setEditingSoulFor(name);
|
||||
setSoulText("");
|
||||
activeSoulRequest.current = name;
|
||||
try {
|
||||
const soul = await api.getProfileSoul(name);
|
||||
if (activeSoulRequest.current === name) {
|
||||
setSoulText(soul.content);
|
||||
}
|
||||
} catch (e) {
|
||||
if (activeSoulRequest.current === name) {
|
||||
showToast(`${t.status.error}: ${e}`, "error");
|
||||
}
|
||||
}
|
||||
},
|
||||
[editingSoulFor, showToast, t.status.error],
|
||||
);
|
||||
|
||||
const handleSaveSoul = async (name: string) => {
|
||||
setSoulSaving(true);
|
||||
try {
|
||||
await api.updateProfileSoul(name, soulText);
|
||||
showToast(`${t.profiles.soulSaved}: ${name}`, "success");
|
||||
} catch (e) {
|
||||
showToast(`${t.status.error}: ${e}`, "error");
|
||||
} finally {
|
||||
setSoulSaving(false);
|
||||
}
|
||||
};
|
||||
|
||||
const handleCopyTerminalCommand = async (name: string) => {
|
||||
let cmd: string;
|
||||
try {
|
||||
const res = await api.getProfileSetupCommand(name);
|
||||
cmd = res.command;
|
||||
} catch (e) {
|
||||
showToast(`${t.status.error}: ${e}`, "error");
|
||||
return;
|
||||
}
|
||||
try {
|
||||
await navigator.clipboard.writeText(cmd);
|
||||
showToast(`${t.profiles.commandCopied}: ${cmd}`, "success");
|
||||
} catch {
|
||||
showToast(`${t.profiles.copyFailed}: ${cmd}`, "error");
|
||||
}
|
||||
};
|
||||
|
||||
const profileDelete = useConfirmDelete<string>({
|
||||
onDelete: useCallback(
|
||||
async (name: string) => {
|
||||
try {
|
||||
await api.deleteProfile(name);
|
||||
showToast(`${t.profiles.deleted}: ${name}`, "success");
|
||||
load();
|
||||
} catch (e) {
|
||||
showToast(`${t.status.error}: ${e}`, "error");
|
||||
throw e;
|
||||
}
|
||||
},
|
||||
[load, showToast, t.profiles.deleted, t.status.error],
|
||||
),
|
||||
});
|
||||
|
||||
const pendingName = profileDelete.pendingId;
|
||||
|
||||
if (loading) {
|
||||
return (
|
||||
<div className="flex items-center justify-center py-24">
|
||||
<div className="h-6 w-6 animate-spin rounded-full border-2 border-primary border-t-transparent" />
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
// Profile names, model slugs, and paths are case-sensitive; opt out of
|
||||
// the app shell's global ``uppercase`` so they render as the user typed.
|
||||
// Children that explicitly opt back in (Badges, etc.) keep their casing.
|
||||
<div className="flex flex-col gap-6 normal-case">
|
||||
<Toast toast={toast} />
|
||||
|
||||
<DeleteConfirmDialog
|
||||
open={profileDelete.isOpen}
|
||||
onCancel={profileDelete.cancel}
|
||||
onConfirm={profileDelete.confirm}
|
||||
title={t.profiles.confirmDeleteTitle}
|
||||
description={
|
||||
pendingName
|
||||
? t.profiles.confirmDeleteMessage.replace("{name}", pendingName)
|
||||
: t.profiles.confirmDeleteMessage
|
||||
}
|
||||
loading={profileDelete.isDeleting}
|
||||
/>
|
||||
|
||||
{/* Create new profile */}
|
||||
<Card>
|
||||
<CardHeader>
|
||||
<CardTitle className="flex items-center gap-2 text-base">
|
||||
<Plus className="h-4 w-4" />
|
||||
{t.profiles.newProfile}
|
||||
</CardTitle>
|
||||
</CardHeader>
|
||||
<CardContent>
|
||||
<div className="grid gap-4">
|
||||
<div className="grid gap-2">
|
||||
<Label htmlFor="profile-name">{t.profiles.name}</Label>
|
||||
<Input
|
||||
id="profile-name"
|
||||
placeholder={t.profiles.namePlaceholder}
|
||||
value={newName}
|
||||
onChange={(e) => setNewName(e.target.value)}
|
||||
aria-invalid={
|
||||
newName.trim() !== "" &&
|
||||
!PROFILE_NAME_RE.test(newName.trim())
|
||||
}
|
||||
/>
|
||||
<p className="text-xs text-muted-foreground">
|
||||
{t.profiles.nameRule}
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<label className="flex items-center gap-2 text-sm cursor-pointer">
|
||||
<input
|
||||
type="checkbox"
|
||||
checked={cloneFromDefault}
|
||||
onChange={(e) => setCloneFromDefault(e.target.checked)}
|
||||
/>
|
||||
{t.profiles.cloneFromDefault}
|
||||
</label>
|
||||
|
||||
<div>
|
||||
<Button onClick={handleCreate} disabled={creating}>
|
||||
<Plus className="h-3 w-3" />
|
||||
{creating ? t.common.creating : t.common.create}
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
|
||||
{/* List */}
|
||||
<div className="flex flex-col gap-3">
|
||||
<H2
|
||||
variant="sm"
|
||||
className="flex items-center gap-2 text-muted-foreground"
|
||||
>
|
||||
<Users className="h-4 w-4" />
|
||||
{t.profiles.allProfiles} ({profiles.length})
|
||||
</H2>
|
||||
|
||||
{profiles.length === 0 && (
|
||||
<Card>
|
||||
<CardContent className="py-8 text-center text-sm text-muted-foreground">
|
||||
{t.profiles.noProfiles}
|
||||
</CardContent>
|
||||
</Card>
|
||||
)}
|
||||
|
||||
{profiles.map((p) => {
|
||||
const isRenaming = renamingFrom === p.name;
|
||||
const isEditingSoul = editingSoulFor === p.name;
|
||||
return (
|
||||
<Card key={p.name}>
|
||||
<CardContent className="flex items-center gap-4 py-4">
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-2 mb-1 flex-wrap">
|
||||
{isRenaming ? (
|
||||
<Input
|
||||
autoFocus
|
||||
value={renameTo}
|
||||
onChange={(e) => setRenameTo(e.target.value)}
|
||||
onKeyDown={(e) => {
|
||||
if (e.key === "Enter") handleRenameSubmit();
|
||||
if (e.key === "Escape") setRenamingFrom(null);
|
||||
}}
|
||||
aria-invalid={
|
||||
renameTo.trim() !== "" &&
|
||||
renameTo.trim() !== p.name &&
|
||||
!PROFILE_NAME_RE.test(renameTo.trim())
|
||||
}
|
||||
className="max-w-xs"
|
||||
/>
|
||||
) : (
|
||||
<span className="font-medium text-sm truncate">
|
||||
{p.name}
|
||||
</span>
|
||||
)}
|
||||
{p.is_default && (
|
||||
<Badge tone="secondary">{t.profiles.defaultBadge}</Badge>
|
||||
)}
|
||||
{p.has_env && (
|
||||
<Badge tone="outline">{t.profiles.hasEnv}</Badge>
|
||||
)}
|
||||
</div>
|
||||
{isRenaming &&
|
||||
(() => {
|
||||
const trimmed = renameTo.trim();
|
||||
const invalid =
|
||||
trimmed !== "" &&
|
||||
trimmed !== p.name &&
|
||||
!PROFILE_NAME_RE.test(trimmed);
|
||||
return (
|
||||
<p
|
||||
className={
|
||||
"text-xs mb-1 " +
|
||||
(invalid
|
||||
? "text-destructive"
|
||||
: "text-muted-foreground")
|
||||
}
|
||||
>
|
||||
{invalid
|
||||
? `${t.profiles.invalidName}: ${t.profiles.nameRule}`
|
||||
: t.profiles.nameRule}
|
||||
</p>
|
||||
);
|
||||
})()}
|
||||
<div className="flex items-center gap-4 text-xs text-muted-foreground flex-wrap">
|
||||
{p.model && (
|
||||
<span>
|
||||
{t.profiles.model}: {p.model}
|
||||
{p.provider ? ` (${p.provider})` : ""}
|
||||
</span>
|
||||
)}
|
||||
<span>
|
||||
{t.profiles.skills}: {p.skill_count}
|
||||
</span>
|
||||
<span className="font-mono truncate max-w-[28rem]">
|
||||
{p.path}
|
||||
</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="flex items-center gap-1 shrink-0">
|
||||
{isRenaming ? (
|
||||
<>
|
||||
<Button
|
||||
size="sm"
|
||||
onClick={handleRenameSubmit}
|
||||
>
|
||||
{t.common.save}
|
||||
</Button>
|
||||
<Button
|
||||
size="sm"
|
||||
ghost
|
||||
onClick={() => setRenamingFrom(null)}
|
||||
>
|
||||
{t.common.cancel}
|
||||
</Button>
|
||||
</>
|
||||
) : (
|
||||
<>
|
||||
<Button
|
||||
ghost
|
||||
size="icon"
|
||||
title={t.profiles.editSoul}
|
||||
aria-label={t.profiles.editSoul}
|
||||
onClick={() => openSoulEditor(p.name)}
|
||||
>
|
||||
{isEditingSoul ? (
|
||||
<ChevronDown className="h-4 w-4" />
|
||||
) : (
|
||||
<span aria-hidden className="text-xs font-bold">
|
||||
S
|
||||
</span>
|
||||
)}
|
||||
</Button>
|
||||
<Button
|
||||
ghost
|
||||
size="icon"
|
||||
title={t.profiles.openInTerminal}
|
||||
aria-label={t.profiles.openInTerminal}
|
||||
onClick={() => handleCopyTerminalCommand(p.name)}
|
||||
>
|
||||
<Terminal className="h-4 w-4" />
|
||||
</Button>
|
||||
{!p.is_default && (
|
||||
<Button
|
||||
ghost
|
||||
size="icon"
|
||||
title={t.profiles.rename}
|
||||
aria-label={t.profiles.rename}
|
||||
onClick={() => {
|
||||
setRenamingFrom(p.name);
|
||||
setRenameTo(p.name);
|
||||
}}
|
||||
>
|
||||
<Pencil className="h-4 w-4" />
|
||||
</Button>
|
||||
)}
|
||||
{!p.is_default && (
|
||||
<Button
|
||||
ghost
|
||||
size="icon"
|
||||
title={t.common.delete}
|
||||
aria-label={t.common.delete}
|
||||
onClick={() => profileDelete.requestDelete(p.name)}
|
||||
>
|
||||
<Trash2 className="h-4 w-4 text-destructive" />
|
||||
</Button>
|
||||
)}
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
</CardContent>
|
||||
|
||||
{isEditingSoul && (
|
||||
<div className="border-t border-border px-4 pb-4 pt-3 flex flex-col gap-2">
|
||||
<Label
|
||||
htmlFor={`soul-editor-${p.name}`}
|
||||
className="flex items-center gap-2 text-xs uppercase tracking-wider text-muted-foreground"
|
||||
>
|
||||
{t.profiles.soulSection}
|
||||
</Label>
|
||||
<textarea
|
||||
id={`soul-editor-${p.name}`}
|
||||
className="flex min-h-[180px] w-full border border-input bg-transparent px-3 py-2 text-sm font-mono shadow-sm placeholder:text-muted-foreground focus-visible:outline-none focus-visible:ring-1 focus-visible:ring-ring"
|
||||
placeholder={t.profiles.soulPlaceholder}
|
||||
value={soulText}
|
||||
onChange={(e) => setSoulText(e.target.value)}
|
||||
/>
|
||||
<div>
|
||||
<Button
|
||||
size="sm"
|
||||
onClick={() => handleSaveSoul(p.name)}
|
||||
disabled={soulSaving}
|
||||
>
|
||||
{soulSaving ? t.common.saving : t.profiles.saveSoul}
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</Card>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
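The client-side name check that `handleCreate` and `handleRenameSubmit` perform in the file above can be sketched in isolation. This is a minimal sketch and not part of the diff: `checkProfileName` and the `NameCheck` type are hypothetical names introduced for illustration, and only the regular expression is copied from `PROFILE_NAME_RE`.

```typescript
// Copied from the diff: a lowercase letter or digit, followed by up to 63
// lowercase letters, digits, underscores, or hyphens (64 characters max).
const PROFILE_NAME_RE = /^[a-z0-9][a-z0-9_-]{0,63}$/;

// Hypothetical result type, used only in this sketch.
type NameCheck =
  | { ok: true; name: string }
  | { ok: false; reason: "empty" | "invalid" };

// Hypothetical helper: trims the input, then applies the same two checks the
// page runs before sending a create or rename request.
function checkProfileName(raw: string): NameCheck {
  const name = raw.trim();
  if (!name) return { ok: false, reason: "empty" };
  if (!PROFILE_NAME_RE.test(name)) return { ok: false, reason: "invalid" };
  return { ok: true, name };
}

// Usage: surrounding whitespace is trimmed; uppercase letters, a leading
// hyphen, and names longer than 64 characters are rejected.
console.log(checkProfileName("  work-1 "));
console.log(checkProfileName("Work"));
console.log(checkProfileName("-scratch"));
```

Rejecting these names in the browser only saves a round trip; the comment in the diff indicates the server-side pattern in `hermes_cli/profiles.py` remains the authority.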
|
||||
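The `activeSoulRequest` ref in `openSoulEditor` guards against out-of-order responses: a response is applied only if its profile is still the most recently requested one. The sketch below models that guard without React, under stated assumptions: `SoulLoader`, `SoulFetcher`, and `fakeFetch` are hypothetical names, and the timer-based fetcher merely stands in for `api.getProfileSoul`.

```typescript
// Hypothetical stand-in for the signature of api.getProfileSoul in the diff.
type SoulFetcher = (name: string) => Promise<{ content: string }>;

// Hypothetical, framework-free model of the guard. The diff keeps the same
// state in a React ref (activeSoulRequest) and in useState (soulText).
class SoulLoader {
  text = "";
  private active: string | null = null;
  private readonly fetchSoul: SoulFetcher;

  constructor(fetchSoul: SoulFetcher) {
    this.fetchSoul = fetchSoul;
  }

  async open(name: string): Promise<void> {
    this.active = name;
    this.text = "";
    const soul = await this.fetchSoul(name);
    // Apply the response only if no newer request has replaced this one.
    if (this.active === name) {
      this.text = soul.content;
    }
  }

  // Closing the editor clears the marker, so any in-flight response is dropped.
  close(): void {
    this.active = null;
  }
}

// Usage: the response for "work" resolves last, but "home" was requested
// after it, so the late "work" response is ignored.
const delays: Record<string, number> = { work: 30, home: 5 };
const fakeFetch: SoulFetcher = (name) =>
  new Promise((resolve) =>
    setTimeout(() => resolve({ content: `soul of ${name}` }), delays[name] ?? 0),
  );

const loader = new SoulLoader(fakeFetch);
const finalText = Promise.all([loader.open("work"), loader.open("home")]).then(
  () => loader.text,
);
```

Comparing against a marker is lighter than cancelling the request itself; the slow request still completes, and only its result is discarded.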
@@ -35,8 +35,10 @@ import { timeAgo } from "@/lib/utils";
 import { Markdown } from "@/components/Markdown";
 import { PlatformsCard } from "@/components/PlatformsCard";
 import { Toast } from "@/components/Toast";
-import { Button, ListItem, Spinner } from "@nous-research/ui";
-import { Badge } from "@nous-research/ui";
+import { Button } from "@nous-research/ui/ui/components/button";
+import { ListItem } from "@nous-research/ui/ui/components/list-item";
+import { Spinner } from "@nous-research/ui/ui/components/spinner";
+import { Badge } from "@nous-research/ui/ui/components/badge";
 import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
 import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
 import { useConfirmDelete } from "@/hooks/useConfirmDelete";
@@ -20,7 +20,11 @@ import type { SkillInfo, ToolsetInfo } from "@/lib/api";
 import { useToast } from "@/hooks/useToast";
 import { Toast } from "@/components/Toast";
 import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
-import { Badge, Button, ListItem, Spinner, Switch } from "@nous-research/ui";
+import { Badge } from "@nous-research/ui/ui/components/badge";
+import { Button } from "@nous-research/ui/ui/components/button";
+import { ListItem } from "@nous-research/ui/ui/components/list-item";
+import { Spinner } from "@nous-research/ui/ui/components/spinner";
+import { Switch } from "@nous-research/ui/ui/components/switch";
 import { cn } from "@/lib/utils";
 import { Input } from "@/components/ui/input";
 import { useI18n } from "@/i18n";
@@ -1,5 +1,5 @@
 import { useSyncExternalStore } from "react";
-import { Spinner } from "@nous-research/ui";
+import { Spinner } from "@nous-research/ui/ui/components/spinner";
 import {
   getPluginComponent,
   getPluginLoadError,