Compare commits

28 Commits

Author SHA1 Message Date
Brooklyn Nicholson 0b5bb9f0b5 fix(windows): bootstrap utf-8 mode at entrypoints
Force UTF-8 defaults on legacy Windows by re-execing Hermes entrypoints with -X utf8, preventing locale codec crashes from implicit text encoding in file and stdio paths.
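A minimal sketch of the re-exec decision this commit describes, not the actual `utf8_bootstrap` code. The function name `utf8_reexec_argv` and the `HERMES_UTF8_REEXEC` loop guard are hypothetical; only the `-X utf8` interpreter option and `sys.flags.utf8_mode` are standard CPython.

```python
def utf8_reexec_argv(platform, utf8_mode, executable, argv, environ):
    """Return the argv to re-exec with, or None when no re-exec is needed.

    Hypothetical helper: the real entrypoint would pass sys.platform,
    bool(sys.flags.utf8_mode), sys.executable, sys.argv and os.environ.
    """
    if platform != "win32":
        return None  # only legacy Windows locale code pages are a concern
    if utf8_mode:
        return None  # already running with -X utf8 (or PYTHONUTF8=1)
    if environ.get("HERMES_UTF8_REEXEC") == "1":
        return None  # hypothetical guard so a failed re-exec cannot loop
    return [executable, "-X", "utf8", *argv]
```

A caller would set the guard variable in the child environment and launch the returned argv (for example with `subprocess.call`), exiting with the child's return code.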
2026-05-07 22:43:17 -04:00
Brooklyn Nicholson 31e3bdee99 fix(windows): harden native CLI and TUI bootstrap
Handle native Windows dependency edge cases by avoiding npm.ps1 execution-policy failures, persisting managed Node resolution, and validating runtime imports per platform.
2026-05-07 22:04:42 -04:00
helix4u faa13e49f8 docs(web): fix SearXNG env configuration 2026-05-07 17:54:47 -07:00
Teknium 1bdacb697c chore(release): add BennetYrWang to AUTHOR_MAP 2026-05-07 17:47:22 -07:00
BennetYrWang 34f7297359 Serialize Hermes config access 2026-05-07 17:47:22 -07:00
Teknium 307c85e5c1 fix(goals): auto-pause when judge model returns unparseable output
Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose
when asked for the strict {done, reason} JSON verdict. The old code
failed-open to continue on every such turn, burning the entire turn
budget with log lines like

  judge returned empty response
  judge reply was not JSON: "Let me analyze whether the goal..."

and /goal clear could not stop it mid-loop without /stop.

After N=3 consecutive *parse* failures (transport/API errors don't
count — those are transient), the loop auto-pauses and prints:

  ⏸ Goal paused — the judge model (3 turns) isn't returning the
  required JSON verdict. Route the judge to a stricter model in
  ~/.hermes/config.yaml:
    auxiliary:
      goal_judge:
        provider: openrouter
        model: google/gemini-3-flash-preview
  Then /goal resume to continue.

The counter resets on any usable reply (both "done"/"continue" and
API errors) and persists across GoalManager reloads so cross-session
resumes carry the correct state.
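The counting rule above can be sketched as follows. This is an illustration under assumptions, not the GoalManager code: `JudgeState`, `parse_verdict`, and `record_judge_turn` are hypothetical names, and only the N=3 threshold, the `{done, reason}` shape, and the reset rules come from the commit message.

```python
import json
from dataclasses import dataclass

MAX_CONSECUTIVE_PARSE_FAILURES = 3  # N=3 in the commit message


@dataclass
class JudgeState:
    """Hypothetical stand-in for the state persisted across reloads."""
    consecutive_parse_failures: int = 0
    paused: bool = False


def parse_verdict(reply):
    """Return the {done, reason} dict, or None when the reply is unusable."""
    try:
        data = json.loads(reply)
    except (TypeError, ValueError):  # JSONDecodeError subclasses ValueError
        return None
    if isinstance(data, dict) and isinstance(data.get("done"), bool):
        return data
    return None


def record_judge_turn(state, reply=None, api_error=False):
    """Update the failure counter for one judge turn; return the loop action."""
    if api_error:
        state.consecutive_parse_failures = 0  # transient, does not count
        return "continue"
    verdict = parse_verdict(reply)
    if verdict is None:
        state.consecutive_parse_failures += 1
        if state.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
            state.paused = True
            return "pause"
        return "continue"  # still fails open below the threshold
    state.consecutive_parse_failures = 0
    return "done" if verdict["done"] else "continue"
```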

Also fixes test_goal_verdict_send.py sharing a hardcoded session_id
across tests — the shared id only worked because the previous
_post_turn_goal_continuation was a never-awaited coroutine. Now that
PR #19160 made it properly awaited, the xdist test-leakage bug
surfaced. Each test gets a unique session_id via uuid suffix.
2026-05-07 17:33:09 -07:00
JC 03ddff8897 fix(gateway): defer goal status notices until after response delivery
Route goal status notices through the platform adapter send API and register post-delivery callbacks so completed-goal notices appear after the final assistant response. Also cancel queued synthetic goal continuations on /goal pause and /goal clear while preserving normal queued user messages.
2026-05-07 17:33:09 -07:00
Teknium 7d66d30d77 feat(kanban): add tooltips and docs link across dashboard (#21541)
Makes first-time use of the kanban view self-explanatory. Every control
that wasn't already labelled now has a `title` tooltip describing what
it does, and a `?` icon next to the board switcher opens the kanban
docs page in a new tab.

Coverage:
- BoardSwitcher: board select, + New board button, docs-link icon
  (both compact and full variants)
- BoardToolbar: Search, Tenant, Assignee, Show archived, Nudge
  dispatcher, Refresh
- BulkActionBar: → ready, Complete, Archive, reassign group, Apply,
  Clear
- Column header: hovering the header now surfaces COLUMN_HELP as a
  tooltip in addition to the visible sub-text; column count also
  labelled
- Card: task id, priority badge, tenant badge, assignee/unassigned,
  comment count, link count, age timestamp
- InlineCreate: assignee, priority, parent-task selectors

Closes the community feedback from @CharlieDePew asking for tooltips
and a docs link in the kanban view.

Relevant docs page:
https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
2026-05-07 16:13:27 -07:00
Austin Pickett 7f92e5506e Merge pull request #20942 from NousResearch/austin/fix/personality
fix(tui): preserve session when switching personality
2026-05-07 18:54:29 -04:00
Austin Pickett b0393af38c Merge pull request #20805 from NousResearch/austin-feat-sessions-skills-menu
feat(tui): add /sessions slash command for browsing and resuming previous sessions
2026-05-07 18:54:16 -04:00
teknium1 7f369bfe55 chore(release): add hllqkb to AUTHOR_MAP for PR #21288 salvage 2026-05-07 15:21:34 -07:00
hllqkb c80fa728bd fix(installer): set UV_NO_CONFIG=1 to avoid permission denied under sudo -u
When the installer is run via `sudo -u`, uv resolves config file
paths against the process owner's (root) home directory rather than the
effective user's, causing a Permission denied error when trying to read
/root/uv.toml.

Setting UV_NO_CONFIG=1 prevents uv from discovering any config files
(uv.toml, pyproject.toml) during installation, which is the correct
behavior for a bootstrap script that manages its own environment.
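For a Python-side bootstrap, the same idea could look like the sketch below. The helper name `uv_bootstrap_env` is hypothetical; `UV_NO_CONFIG` is the variable named in this commit. How the installer script itself sets the variable is not shown here.

```python
import os


def uv_bootstrap_env(base_env=None):
    """Copy the environment and disable uv config-file discovery."""
    env = dict(os.environ if base_env is None else base_env)
    env["UV_NO_CONFIG"] = "1"  # uv skips uv.toml / pyproject.toml lookup
    return env


# Usage sketch (assumes uv is on PATH):
#   subprocess.run(["uv", "--version"], env=uv_bootstrap_env(), check=True)
```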

Fixes #21269
2026-05-07 15:21:34 -07:00
teknium 292f468366 fix(mcp): unwrap platforms key in channels_list
channels_list was iterating directory.items() directly, yielding
("updated_at", str) and ("platforms", dict) pairs — neither passed
the isinstance(entries_list, list) check, so the inner loop never ran
and every call returned count=0 even when channel_directory.json was
populated.

The writer (gateway/channel_directory.py) wraps the payload as
{"updated_at": ..., "platforms": {...}}; every other reader in the
codebase unwraps via directory.get("platforms", {}). This aligns
channels_list with that convention.
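A rough sketch of the corrected read path, assuming each platform maps to a list of dict entries. The entry fields and the return shape beyond `count` are illustrative, not taken from the tool.

```python
def channels_list(directory, platform=None):
    """Count channels from a {"updated_at": ..., "platforms": {...}} payload."""
    platforms = directory.get("platforms", {})  # unwrap before iterating
    channels = []
    for platform_name, entries_list in platforms.items():
        if platform is not None and platform_name != platform:
            continue
        if not isinstance(entries_list, list):
            continue
        for entry in entries_list:
            channels.append({"platform": platform_name, **entry})
    return {"count": len(channels), "channels": channels}
```

Iterating `directory.items()` directly would only ever see the `updated_at` string and the `platforms` dict, which is why the pre-fix code reported `count=0`.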

Also tightens the existing test_channels_with_directory test, which
bypassed the bug by asserting against _load_channel_directory() directly
instead of calling channels_list. It now calls the tool end-to-end and
a new test_channels_with_directory_platform_filter covers the filter
path. Both tests fail against the pre-fix code.

Closes #21474

Co-authored-by: chrisworksai <262485129+chrisworksai@users.noreply.github.com>
2026-05-07 13:41:16 -07:00
Austin Pickett d87c7b99e2 fix(analytics): prevent silent token loss and add Claude 4.5–4.7 pricing (#21455)
- Add pricing entries for Claude Opus 4.5/4.6/4.7, Sonnet 4.5/4.6, and
  Haiku 4.5 with updated source URLs (platform.claude.com)
- Add _normalize_anthropic_model_name() to handle dot-notation variants
  (e.g. claude-opus-4.7 → claude-opus-4-7) for pricing lookups
- Fix silent token loss: ensure session row exists before UPDATE in both
  run_agent.py and hermes_state.py (INSERT OR IGNORE is idempotent)
- Log token persistence failures at DEBUG level instead of swallowing
  them silently — makes undercounted analytics diagnosable
- Surface reasoning tokens in CLI /usage and TUI usage panel
- Add 'reasoning' and 'cost_status' fields to TUI Usage type
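Two of the bullets above can be sketched in a few lines. Both functions are illustrations under assumptions: the regex is one plausible way to normalize dot-notation, and the `sessions` table with `id`, `input_tokens`, and `output_tokens` columns is a hypothetical schema, not the one in hermes_state.py.

```python
import re
import sqlite3


def normalize_anthropic_model_name(model):
    """Map dot-notation versions to dashes, e.g. 4.7 -> 4-7."""
    return re.sub(r"(?<=\d)\.(?=\d)", "-", model.strip().lower())


def add_session_tokens(conn, session_id, input_tokens, output_tokens):
    """Make sure the row exists, then accumulate token counts."""
    # Without this INSERT, an UPDATE against a missing row changes nothing
    # and the tokens are silently lost.
    conn.execute("INSERT OR IGNORE INTO sessions (id) VALUES (?)", (session_id,))
    cur = conn.execute(
        "UPDATE sessions SET input_tokens = input_tokens + ?, "
        "output_tokens = output_tokens + ? WHERE id = ?",
        (input_tokens, output_tokens, session_id),
    )
    conn.commit()
    return cur.rowcount  # number of rows the UPDATE touched
```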
2026-05-07 13:24:31 -07:00
Teknium cff821e2dc docs: register triage_specifier in the aux-models enumerations (#21494)
The kanban specifier landed in #21435 with feature-page docs (the
kanban page itself + the CLI reference table), but three other docs
pages enumerate every auxiliary task slot and were missed:

  user-guide/configuration.md            Auxiliary Models section —
                                         interactive picker example
                                         + full auxiliary config
                                         reference YAML block.
  user-guide/features/fallback-providers.md
                                         Both 'Auxiliary Tasks' and
                                         'Fallback Reference' tables.
  user-guide/features/kanban-tutorial.md
                                         Triage-column bullet now
                                         mentions the Specify
                                         button + CLI + slash command.

No other docs enumerate the aux task slots (verified with
grep -r 'title_generation\|auxiliary.session_search' website/docs/).
2026-05-07 13:07:18 -07:00
teknium1 2214ab1073 chore: fix AUTHOR_MAP for johnsonblake1@gmail.com → voteblake
The existing mapping pointed to the wrong GitHub user (blakejohnson, id
866695, IBM) — the email actually belongs to voteblake (id 5585957),
confirmed via search/commits?author-email. Mis-credited since 323ca7084.
2026-05-07 13:04:42 -07:00
Blake Johnson 9076a2e74e fix(agent): keep Nous GPT-5 fallback on chat completions 2026-05-07 13:04:42 -07:00
Teknium 24d48ffb82 feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435)
* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks

The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires
it up as a dedicated CLI verb.

`hermes kanban specify <id>` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks
go straight to the dispatcher on the same tick.

Surface:

  hermes kanban specify <task_id>               # single task
  hermes kanban specify --all [--tenant T]      # sweep triage column
  hermes kanban specify ... --author NAME       # audit-comment author
  hermes kanban specify ... --json              # one JSON line per task

Design choices:

  - Parent gating is preserved. specify_triage_task flips to 'todo',
    then recompute_ready promotes to 'ready' only when parents are
    done — same rule as a normal parent-gated todo.
  - No daemon, no background watcher. Every invocation is explicit —
    keeps cost predictable and doesn't fight the dispatcher loop.
  - Response parse is lenient: strict JSON preferred, markdown-fence
    tolerated, raw-body fallback on malformed JSON so the LLM can't
    strand a task in triage.
  - All failure modes (no aux client, API error, task moved out of
    triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
    --all continues past individual failures.
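The lenient parse order described above (strict JSON, then a markdown-fenced block, then the raw body) might look roughly like this. `ParsedSpec` and `parse_spec_response` are hypothetical names, and the `title`/`body` keys are an assumed response shape.

```python
import json
import re
from dataclasses import dataclass

FENCE = "`" * 3  # built indirectly so the snippet can live inside a fence
_FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


@dataclass
class ParsedSpec:
    title: str
    body: str


def parse_spec_response(raw, fallback_title):
    """Prefer strict JSON, tolerate a fenced block, else keep the raw text."""
    text = (raw or "").strip()
    candidates = [text]
    match = _FENCED_BLOCK.search(text)
    if match:
        candidates.append(match.group(1).strip())
    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except ValueError:
            continue
        if isinstance(data, dict) and data.get("body"):
            title = str(data.get("title") or fallback_title)
            return ParsedSpec(title, str(data["body"]))
    # Raw-body fallback: malformed output still yields a usable spec body,
    # so the task is not left stranded in triage.
    return ParsedSpec(fallback_title, text)
```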

Changes:

  hermes_cli/kanban_db.py    + specify_triage_task()
  hermes_cli/kanban_specify.py  NEW (~220 LOC — prompt, parse, call)
  hermes_cli/kanban.py       + specify subcommand + _cmd_specify
  hermes_cli/config.py       + auxiliary.triage_specifier task slot
  website/docs/user-guide/features/kanban.md  specify + config notes
  website/docs/reference/cli-commands.md      CLI reference entry
  tests/hermes_cli/test_kanban_specify_db.py    NEW (10 tests)
  tests/hermes_cli/test_kanban_specify.py       NEW (20 tests)

Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.

* feat(kanban): wire specifier into dashboard and gateway slash

Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.

Dashboard (plugins/kanban/dashboard/)
  - POST /tasks/:id/specify  NEW endpoint. Thin wrapper around
    kanban_specify.specify_task(). Returns the CLI outcome shape
    ({ok, task_id, reason, new_title}); ok=false with a human reason
    is a 200, not a 4xx, so the UI can render it inline without
    treating 'no aux client configured' as a crash.
  - Runs sync in FastAPI's threadpool because the LLM call can take
    tens of seconds on reasoning models.
  - Pins HERMES_KANBAN_BOARD around the specify call so the module's
    argless kb.connect() lands on the right board.
  - dist/index.js: doSpecify callback threaded through the drawer →
    TaskDetail → StatusActions prop chain. Specify button appears
    ONLY when task.status === 'triage' (elsewhere the backend would
    reject anyway — hide the button to keep the action row clean).
    Busy state (Specifying…) + inline success/error banner under the
    button using the response.reason text.
  - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
    existing --color vars so themes reskin cleanly.
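Pinning HERMES_KANBAN_BOARD around a call can be sketched with a small context manager. `pinned_env` is a hypothetical helper; the lock is an added assumption, since environment variables are process-wide and the endpoint runs in a threadpool.

```python
import os
import threading
from contextlib import contextmanager

_ENV_LOCK = threading.Lock()  # assumption: serialize env pinning across threads


@contextmanager
def pinned_env(name, value):
    """Temporarily set an environment variable, restoring it afterwards."""
    with _ENV_LOCK:
        previous = os.environ.get(name)
        os.environ[name] = value
        try:
            yield
        finally:
            if previous is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = previous
```

A handler would wrap the specify call in `with pinned_env("HERMES_KANBAN_BOARD", board):` so the argless connect resolves to the requested board.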

Gateway slash (/kanban specify)
  - Already works via the existing run_slash → build_parser →
    kanban_command pipeline. No code change needed — slash commands
    inherit the argparse tree automatically. Added coverage:
    test_run_slash_specify_end_to_end (create --triage, specify, verify
    promotion + retitle) and test_run_slash_specify_help_is_reachable.

Tests
  - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
    REST endpoint — happy path, non-triage rejection as ok=false 200,
    missing aux client as ok=false 200.
  - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.

Docs
  - website/docs/user-guide/features/kanban.md: dashboard action row
    description mentions Specify + all three surfaces. REST table
    gains /tasks/:id/specify. Slash examples include /kanban specify.

Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
2026-05-07 13:04:41 -07:00
adybag14-cyber 732a6c45fa feat: add termux doctor fallback guidance for blocked extras 2026-05-07 13:04:08 -07:00
adybag14-cyber dc5ef1ac8e fix: add termux-all install profile and safe fallbacks 2026-05-07 13:04:08 -07:00
adybag14-cyber da18fd084a fix: strengthen termux install network prerequisites 2026-05-07 13:04:08 -07:00
adybag14-cyber 54c0b10d14 fix(update): add heartbeat during dependency install 2026-05-07 13:04:08 -07:00
Abd0r 04193cf71c feat(web): add Brave Search (free tier) and DDGS search providers
Both implement WebSearchProvider via tools/web_providers/ — matching the
existing SearXNG pattern (PR #5c906d702). Search-only; pair with any
extract provider via web.extract_backend.

- tools/web_providers/brave_free.py — Brave Search API (free tier, 2k
  queries/mo). Uses BRAVE_SEARCH_API_KEY as X-Subscription-Token.
- tools/web_providers/ddgs.py — DuckDuckGo via the ddgs Python package.
  No API key; gated on package importability.
- tools/web_tools.py: both backends added to _get_backend() config list
  and auto-detect chain (trails paid providers), _is_backend_available,
  web_search_tool dispatch, web_extract_tool + web_crawl_tool search-only
  refusals, check_web_api_key, and the __main__ diagnostic. Introduces
  _ddgs_package_importable() helper so tests can monkeypatch a single
  symbol for the ddgs availability check.
- hermes_cli/tools_config.py: picker entries for both providers; ddgs
  gets a post_setup handler that runs `pip install ddgs`.
- hermes_cli/config.py: BRAVE_SEARCH_API_KEY in OPTIONAL_ENV_VARS.
- scripts/release.py: AUTHOR_MAP entry for @Abd0r.
- tests: 14 new tests (brave-free) + 15 new tests (ddgs) covering
  provider unit behavior, backend wiring, and search-only refusals.
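As a sketch of the availability checks listed above: the helper names, the relative order of the two free backends, and the `paid_backends` parameter are assumptions for illustration. Only BRAVE_SEARCH_API_KEY, the X-Subscription-Token header, and the ddgs importability gate come from the commit message.

```python
import importlib.util


def package_importable(name):
    """True when the package can be found without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def brave_headers(api_key):
    """Headers for a Brave Search request; the key goes in the token header."""
    return {"X-Subscription-Token": api_key}


def pick_search_backend(environ, paid_backends=()):
    """Auto-detect a backend; free providers trail any configured paid one."""
    for backend, env_var in paid_backends:  # hypothetical (name, env var) pairs
        if environ.get(env_var):
            return backend
    if environ.get("BRAVE_SEARCH_API_KEY"):
        return "brave-free"
    if package_importable("ddgs"):
        return "ddgs"
    return None
```

Keeping the importability check in one function gives tests a single symbol to monkeypatch, which is the reason the commit gives for its own helper.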

Salvages the brave-free + ddgs portion of PR #19796. Not included: the
in-line helpers in web_tools.py (replaced with provider modules to match
the shipped architecture), the lynx-based extract path (these backends
should refuse extract with a clear error — users pair with a real
extract provider), and scripts/start-llama-server.sh (unrelated).

Co-authored-by: Abd0r <223003280+Abd0r@users.noreply.github.com>
2026-05-07 09:59:17 -07:00
xxxigm cdc0a47dd5 test(hermes_constants): cover parse_reasoning_effort() 2026-05-07 09:59:07 -07:00
Teknium 7e2af0c2e8 feat(acp): pass image file attachments through as image_url parts
Extends PR #21400's resource inlining with image-specific handling: ACP
resource_link and embedded blob resources with an image/* mime (or image
file suffix when mime is missing) now emit an OpenAI image_url part
with a base64 data URL, so vision models actually see the image
instead of a [Binary file omitted] note. Non-image resources keep the
existing text-inlining behavior.

Adds 3 tests: local PNG via resource_link, JPEG mime inferred from
suffix when client omits mimeType, and embedded blob PNG.
2026-05-07 09:24:32 -07:00
HenkDz 733e297b8a fix(acp): inline file attachment resources 2026-05-07 09:24:32 -07:00
Austin Pickett 65c762b2e8 fix(tui): preserve session when switching personality
Previously, /personality in the TUI called _reset_session_agent() which
destroyed the agent, cleared conversation history, and effectively started
a new session. This made personality switching disruptive — users lost
their entire conversation context.

Now /personality updates the agent's ephemeral_system_prompt in-place and
injects a pivot marker into the conversation history. The marker tells
the model to adopt the new persona from that point forward, which is
necessary because LLMs tend to pattern-match their prior responses and
continue the established tone without an explicit signal.

Changes:
- tui_gateway/server.py: Rewrite _apply_personality_to_session to update
  the agent in-place instead of resetting. Inject a user-role pivot
  marker so the model actually switches style mid-conversation.
- ui-tui/src/app/slash/commands/session.ts: Update help text (no longer
  mentions history reset).
- tests/test_tui_gateway_server.py: Update test to verify history is
  preserved, pivot marker is injected, and ephemeral prompt is set.
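The in-place switch can be illustrated as below. This is a sketch, not `_apply_personality_to_session`: the agent is modelled as a plain dict and the marker wording is invented for the example.

```python
def apply_personality(agent, history, name, persona_prompt):
    """Switch persona without resetting the session (illustrative only)."""
    agent["ephemeral_system_prompt"] = persona_prompt
    # User-role pivot marker: an explicit signal so the model stops
    # pattern-matching on the tone of its earlier replies.
    history.append({
        "role": "user",
        "content": f"[Personality switched to '{name}'. "
                   "Adopt the new persona from this point forward.]",
    })
    return history
```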
2026-05-06 19:30:46 -04:00
Austin Pickett 09a491464c feat(tui): add /sessions slash command for browsing and resuming previous sessions 2026-05-06 11:58:53 -04:00
62 changed files with 4602 additions and 348 deletions
+8
@@ -17,7 +17,15 @@ import asyncio
import logging
import sys
from pathlib import Path
from hermes_constants import get_hermes_home
from utf8_bootstrap import ensure_windows_utf8_mode
# Ensure ACP stdio/file defaults are UTF-8 on legacy Windows builds.
ensure_windows_utf8_mode(
    module="acp_adapter.entry",
    entrypoint_markers=("hermes-acp", "entry.py"),
)
# Methods clients send as periodic liveness probes. They are not part of the
+288 -2
@@ -3,13 +3,16 @@
from __future__ import annotations
import asyncio
import base64
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Deque, Optional
from urllib.parse import unquote, urlparse
import acp
from acp.schema import (
@@ -18,6 +21,7 @@ from acp.schema import (
AuthenticateResponse,
AvailableCommand,
AvailableCommandsUpdate,
BlobResourceContents,
ClientCapabilities,
EmbeddedResourceContentBlock,
ForkSessionResponse,
@@ -46,6 +50,7 @@ from acp.schema import (
SessionResumeCapabilities,
SessionInfo,
TextContentBlock,
TextResourceContents,
UnstructuredCommandInput,
Usage,
UsageUpdate,
@@ -83,6 +88,272 @@ _executor = ThreadPoolExecutor(max_workers=4, thread_name_prefix="acp-agent")
# does not expose a client-side limit, so this is a fixed cap that clients
# paginate against using `cursor` / `next_cursor`.
_LIST_SESSIONS_PAGE_SIZE = 50
_MAX_ACP_RESOURCE_BYTES = 512 * 1024
_TEXT_RESOURCE_MIME_PREFIXES = ("text/",)
_TEXT_RESOURCE_MIME_TYPES = {
    "application/json",
    "application/javascript",
    "application/typescript",
    "application/xml",
    "application/x-yaml",
    "application/yaml",
    "application/toml",
    "application/sql",
}


def _resource_display_name(uri: str, name: str | None = None, title: str | None = None) -> str:
    """Human-readable attachment name for prompt context."""
    raw_name = (name or "").strip()
    raw_title = (title or "").strip()
    if raw_title and raw_name and raw_title != raw_name:
        return f"{raw_title} ({raw_name})"
    if raw_title:
        return raw_title
    if raw_name:
        return raw_name
    parsed = urlparse(uri)
    candidate = parsed.path if parsed.scheme else uri
    return Path(unquote(candidate)).name or uri or "resource"


def _is_text_resource(mime_type: str | None) -> bool:
    mime = (mime_type or "").split(";", 1)[0].strip().lower()
    if not mime:
        return False
    return mime.startswith(_TEXT_RESOURCE_MIME_PREFIXES) or mime in _TEXT_RESOURCE_MIME_TYPES


def _is_image_resource(mime_type: str | None) -> bool:
    mime = (mime_type or "").split(";", 1)[0].strip().lower()
    return mime.startswith("image/")


def _guess_image_mime_from_path(path: Path) -> str | None:
    suffix = path.suffix.lower()
    return {
        ".png": "image/png",
        ".jpg": "image/jpeg",
        ".jpeg": "image/jpeg",
        ".gif": "image/gif",
        ".webp": "image/webp",
        ".bmp": "image/bmp",
        ".svg": "image/svg+xml",
    }.get(suffix)


def _image_data_url(data: bytes, mime_type: str) -> str:
    return f"data:{mime_type};base64,{base64.b64encode(data).decode('ascii')}"


def _path_from_file_uri(uri: str) -> Path | None:
    """Convert local file URIs/paths from ACP clients into a readable Path.

    Zed may send POSIX file URIs from Linux/WSL workspaces or Windows-ish paths
    when launched through wsl.exe. Translate the common Windows drive form to
    /mnt/<drive>/... so Hermes running in WSL can read it.
    """
    raw = (uri or "").strip()
    if not raw:
        return None
    parsed = urlparse(raw)
    if parsed.scheme and parsed.scheme != "file":
        return None
    if parsed.scheme == "file":
        if parsed.netloc and parsed.netloc not in {"", "localhost"}:
            return None
        path_text = unquote(parsed.path or "")
    else:
        path_text = unquote(raw)
    # file:///C:/Users/... or C:\Users\...
    if len(path_text) >= 3 and path_text[0] == "/" and path_text[2] == ":" and path_text[1].isalpha():
        drive = path_text[1].lower()
        rest = path_text[3:].lstrip("/\\").replace("\\", "/")
        return Path("/mnt") / drive / rest
    if len(path_text) >= 2 and path_text[1] == ":" and path_text[0].isalpha():
        drive = path_text[0].lower()
        rest = path_text[2:].lstrip("/\\").replace("\\", "/")
        return Path("/mnt") / drive / rest
    return Path(path_text)


def _decode_text_bytes(data: bytes, mime_type: str | None) -> str | None:
    """Decode resource bytes if they are probably text; return None for binary."""
    if b"\x00" in data and not _is_text_resource(mime_type):
        return None
    for encoding in ("utf-8-sig", "utf-8", "latin-1"):
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return data.decode("utf-8", errors="replace")


def _format_resource_text(
    *,
    uri: str,
    body: str,
    name: str | None = None,
    title: str | None = None,
    note: str | None = None,
) -> str:
    display = _resource_display_name(uri, name=name, title=title)
    header = f"[Attached file: {display}]"
    if note:
        header += f" ({note})"
    return f"{header}\nURI: {uri}\n\n{body}"


def _resource_link_to_parts(block: ResourceContentBlock) -> list[dict[str, Any]]:
    """Convert an ACP resource_link block to OpenAI content parts.

    Returns a list of {"type": "text", ...} and/or {"type": "image_url", ...}
    parts. Image resources produce an image_url part with a small text header
    so the model knows which attachment it is. Non-image resources return a
    single text part with the inlined file body (or a binary-omit note).
    """
    uri = str(getattr(block, "uri", "") or "").strip()
    if not uri:
        return []
    name = str(getattr(block, "name", "") or "").strip() or None
    title = str(getattr(block, "title", "") or "").strip() or None
    mime_type = str(getattr(block, "mime_type", "") or "").strip() or None
    path = _path_from_file_uri(uri)
    if path is None:
        return [{
            "type": "text",
            "text": _format_resource_text(
                uri=uri,
                name=name,
                title=title,
                body="[Resource link only; Hermes cannot read non-file ACP resource URIs directly.]",
            ),
        }]
    # Image files: emit a short text header + image_url data URL so vision
    # models can see the attachment instead of a "binary omitted" note.
    image_mime = mime_type if _is_image_resource(mime_type) else _guess_image_mime_from_path(path)
    if image_mime and _is_image_resource(image_mime):
        try:
            size = path.stat().st_size
            if size > _MAX_ACP_RESOURCE_BYTES:
                return [{
                    "type": "text",
                    "text": _format_resource_text(
                        uri=uri,
                        name=name,
                        title=title,
                        body=f"[Image too large to inline: {size} bytes, cap={_MAX_ACP_RESOURCE_BYTES}]",
                    ),
                }]
            with path.open("rb") as fh:
                data = fh.read()
        except OSError as exc:
            logger.warning("ACP image resource read failed: %s", uri, exc_info=True)
            return [{
                "type": "text",
                "text": _format_resource_text(
                    uri=uri,
                    name=name,
                    title=title,
                    body=f"[Could not read attached image: {exc}]",
                ),
            }]
        display = _resource_display_name(uri, name=name, title=title)
        return [
            {"type": "text", "text": f"[Attached image: {display}]\nURI: {uri}"},
            {"type": "image_url", "image_url": {"url": _image_data_url(data, image_mime)}},
        ]
    try:
        size = path.stat().st_size
        read_size = min(size, _MAX_ACP_RESOURCE_BYTES)
        with path.open("rb") as fh:
            data = fh.read(read_size)
        text = _decode_text_bytes(data, mime_type)
        if text is None:
            return [{
                "type": "text",
                "text": _format_resource_text(
                    uri=uri,
                    name=name,
                    title=title,
                    body=f"[Binary file omitted: {size} bytes, mime={mime_type or 'unknown'}]",
                ),
            }]
        note = None
        if size > _MAX_ACP_RESOURCE_BYTES:
            note = f"truncated to {_MAX_ACP_RESOURCE_BYTES} of {size} bytes"
        return [{
            "type": "text",
            "text": _format_resource_text(uri=uri, name=name, title=title, body=text, note=note),
        }]
    except OSError as exc:
        logger.warning("ACP resource read failed: %s", uri, exc_info=True)
        return [{
            "type": "text",
            "text": _format_resource_text(
                uri=uri,
                name=name,
                title=title,
                body=f"[Could not read attached file: {exc}]",
            ),
        }]


def _embedded_resource_to_parts(block: EmbeddedResourceContentBlock) -> list[dict[str, Any]]:
    resource = getattr(block, "resource", None)
    if resource is None:
        return []
    uri = str(getattr(resource, "uri", "") or "").strip()
    mime_type = str(getattr(resource, "mime_type", "") or "").strip() or None
    if isinstance(resource, TextResourceContents):
        return [{"type": "text", "text": _format_resource_text(uri=uri, body=resource.text)}]
    if isinstance(resource, BlobResourceContents):
        blob = resource.blob or ""
        try:
            data = base64.b64decode(blob, validate=True)
        except Exception:
            data = blob.encode("utf-8", errors="replace")
        # Image blobs go through as image_url so vision models can see them.
        if _is_image_resource(mime_type):
            if len(data) > _MAX_ACP_RESOURCE_BYTES:
                return [{
                    "type": "text",
                    "text": _format_resource_text(
                        uri=uri,
                        body=f"[Embedded image too large to inline: {len(data)} bytes, cap={_MAX_ACP_RESOURCE_BYTES}]",
                    ),
                }]
            display = _resource_display_name(uri)
            return [
                {"type": "text", "text": f"[Attached image: {display}]" + (f"\nURI: {uri}" if uri else "")},
                {"type": "image_url", "image_url": {"url": _image_data_url(data, mime_type or "image/png")}},
            ]
        text = _decode_text_bytes(data[:_MAX_ACP_RESOURCE_BYTES], mime_type)
        if text is None:
            body = f"[Binary embedded file omitted: {len(data)} bytes, mime={mime_type or 'unknown'}]"
        else:
            body = text
            if len(data) > _MAX_ACP_RESOURCE_BYTES:
                body += f"\n\n[Truncated to {_MAX_ACP_RESOURCE_BYTES} of {len(data)} bytes]"
        return [{"type": "text", "text": _format_resource_text(uri=uri, body=body)}]
    text = getattr(resource, "text", None)
    if text:
        return [{"type": "text", "text": _format_resource_text(uri=uri, body=str(text))}]
    return []
def _extract_text(
@@ -144,6 +415,20 @@ def _content_blocks_to_openai_user_content(
            if image_part is not None:
                parts.append(image_part)
            continue
        if isinstance(block, ResourceContentBlock):
            resource_parts = _resource_link_to_parts(block)
            for part in resource_parts:
                parts.append(part)
                if part.get("type") == "text":
                    text_parts.append(part["text"])
            continue
        if isinstance(block, EmbeddedResourceContentBlock):
            resource_parts = _embedded_resource_to_parts(block)
            for part in resource_parts:
                parts.append(part)
                if part.get("type") == "text":
                    text_parts.append(part["text"])
            continue
    if not parts:
        return _extract_text(prompt)
@@ -803,6 +1088,7 @@ class HermesACPAgent(acp.Agent):
user_text = _extract_text(prompt).strip()
user_content = _content_blocks_to_openai_user_content(prompt)
text_only_prompt = all(isinstance(block, TextContentBlock) for block in prompt)
has_content = bool(user_text) or (
isinstance(user_content, list) and bool(user_content)
)
@@ -821,7 +1107,7 @@ class HermesACPAgent(acp.Agent):
# silently append to state.queued_prompts and respond with
# "No active turn — queued for the next turn", which looks like
# /queue even though the user never typed /queue.
-if isinstance(user_content, str) and user_text.startswith("/steer"):
+if text_only_prompt and isinstance(user_content, str) and user_text.startswith("/steer"):
steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
interrupted_prompt = ""
rewrite_idle = False
@@ -846,7 +1132,7 @@ class HermesACPAgent(acp.Agent):
# Slash commands are text-only; if the client included images/resources,
# send the whole multimodal prompt to the agent instead of treating it as
# an ACP command.
-if isinstance(user_content, str) and user_text.startswith("/"):
+if text_only_prompt and isinstance(user_content, str) and user_text.startswith("/"):
response_text = self._handle_slash_command(user_text, state)
if response_text is not None:
if self._conn:
+159 -14
@@ -1,5 +1,6 @@
from __future__ import annotations
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
@@ -82,6 +83,121 @@ _UTC_NOW = lambda: datetime.now(timezone.utc)
# Official docs snapshot entries. Models whose published pricing and cache
# semantics are stable enough to encode exactly.
_OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
# ── Anthropic Claude 4.7 ─────────────────────────────────────────────
# Opus 4.5/4.6/4.7 share $5/$25 pricing (new tokenizer, up to 35% more
# tokens for the same text).
# Source: https://platform.claude.com/docs/en/about-claude/pricing
(
"anthropic",
"claude-opus-4-7",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-opus-4-7-20250507",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.6 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-6",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-opus-4-6-20250414",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-6",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-6-20250414",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.5 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-5",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-5",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-haiku-4-5",
): PricingEntry(
input_cost_per_million=Decimal("1.00"),
output_cost_per_million=Decimal("5.00"),
cache_read_cost_per_million=Decimal("0.10"),
cache_write_cost_per_million=Decimal("1.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4 / 4.1 ─────────────────────────────────────────
(
"anthropic",
"claude-opus-4-20250514",
@@ -91,8 +207,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-prompt-caching-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -103,8 +219,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-prompt-caching-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# OpenAI
(
@@ -184,7 +300,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://openai.com/api/pricing/",
pricing_version="openai-pricing-2026-03-16",
),
# Anthropic older models (pre-4.6 generation)
# ── Anthropic older models (pre-4.5 generation) ────────────────────────
(
"anthropic",
"claude-3-5-sonnet-20241022",
@@ -194,8 +310,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -206,8 +322,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -218,8 +334,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -230,8 +346,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.03"),
cache_write_cost_per_million=Decimal("0.30"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# DeepSeek
(
@@ -426,8 +542,37 @@ def resolve_billing_route(
return BillingRoute(provider=provider_name or "unknown", model=model.split("/")[-1] if model else "", base_url=base_url or "", billing_mode="unknown")
def _normalize_anthropic_model_name(model: str) -> str:
"""Normalize Anthropic model name variants to canonical form.
Handles:
- Dot notation: claude-opus-4.7 → claude-opus-4-7
- Short aliases: claude-opus-4.7 → claude-opus-4-7
- Strips anthropic/ prefix if present
"""
name = model.lower().strip()
if name.startswith("anthropic/"):
name = name[len("anthropic/"):]
# Normalize dots to dashes in version numbers (e.g. 4.7 → 4-7, 4.6 → 4-6)
# But preserve the rest of the name structure
name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
return name
def _lookup_official_docs_pricing(route: BillingRoute) -> Optional[PricingEntry]:
return _OFFICIAL_DOCS_PRICING.get((route.provider, route.model.lower()))
model = route.model.lower()
# Direct lookup first
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, model))
if entry:
return entry
# Try normalized name for Anthropic (handles dot-notation like opus-4.7)
if route.provider == "anthropic":
normalized = _normalize_anthropic_model_name(model)
if normalized != model:
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, normalized))
if entry:
return entry
return None
def _openrouter_pricing_entry(route: BillingRoute) -> Optional[PricingEntry]:
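Review note: the lookup change above tries the exact key first and only then a normalized Anthropic name. The sketch below reproduces that fallback on a two-entry stand-in table and shows how per-million rates turn into a cost figure. `PRICING`, `lookup`, and `estimate_cost` are illustrative names, not part of this diff, and the reduced `PricingEntry` carries only two of the real fields.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PricingEntry:
    # Reduced stand-in for the real PricingEntry (assumption: two fields only).
    input_cost_per_million: Decimal
    output_cost_per_million: Decimal


# Stand-in for _OFFICIAL_DOCS_PRICING, using rates listed in the table above.
PRICING: Dict[Tuple[str, str], PricingEntry] = {
    ("anthropic", "claude-opus-4-6"): PricingEntry(Decimal("5.00"), Decimal("25.00")),
    ("anthropic", "claude-haiku-4-5"): PricingEntry(Decimal("1.00"), Decimal("5.00")),
}


def normalize_anthropic_model_name(model: str) -> str:
    """Same steps as the diff: lowercase, strip the prefix, turn 4.6 into 4-6."""
    name = model.lower().strip()
    if name.startswith("anthropic/"):
        name = name[len("anthropic/"):]
    return re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)


def lookup(provider: str, model: str) -> Optional[PricingEntry]:
    """Exact match first, then the normalized Anthropic name."""
    key = model.lower()
    entry = PRICING.get((provider, key))
    if entry is None and provider == "anthropic":
        entry = PRICING.get((provider, normalize_anthropic_model_name(key)))
    return entry


def estimate_cost(entry: PricingEntry, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: convert per-million rates into a dollar amount."""
    total = (
        entry.input_cost_per_million * input_tokens
        + entry.output_cost_per_million * output_tokens
    )
    return total / Decimal(1_000_000)
```

Under these assumptions, `lookup("anthropic", "claude-opus-4.6")` resolves to the `claude-opus-4-6` entry, while an unknown model still returns `None` so the caller can fall through to other pricing sources.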
+10
@@ -34,6 +34,8 @@ from pathlib import Path
from datetime import datetime
from typing import List, Dict, Any, Optional
from utf8_bootstrap import ensure_windows_utf8_mode
logger = logging.getLogger(__name__)
# Suppress startup messages for clean CLI experience
@@ -7991,6 +7993,7 @@ class HermesCLI:
output_tokens = getattr(agent, "session_output_tokens", 0) or 0
cache_read_tokens = getattr(agent, "session_cache_read_tokens", 0) or 0
cache_write_tokens = getattr(agent, "session_cache_write_tokens", 0) or 0
reasoning_tokens = getattr(agent, "session_reasoning_tokens", 0) or 0
prompt = agent.session_prompt_tokens
completion = agent.session_completion_tokens
total = agent.session_total_tokens
@@ -8022,6 +8025,8 @@ class HermesCLI:
print(f" Cache read tokens: {cache_read_tokens:>10,}")
print(f" Cache write tokens: {cache_write_tokens:>10,}")
print(f" Output tokens: {output_tokens:>10,}")
if reasoning_tokens:
print(f" ↳ Reasoning (subset): {reasoning_tokens:>10,}")
print(f" Prompt tokens (total): {prompt:>10,}")
print(f" Completion tokens: {completion:>10,}")
print(f" Total tokens: {total:>10,}")
@@ -12339,6 +12344,11 @@ def main(
"""
global _active_worktree
ensure_windows_utf8_mode(
module="cli",
entrypoint_markers=("hermes", "cli.py"),
)
# Signal to terminal_tool that we're in interactive mode
# This enables interactive sudo password prompts with timeout
os.environ["HERMES_INTERACTIVE"] = "1"
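Review note: the body of `ensure_windows_utf8_mode` is not shown in this diff. Going only by the commit description (re-exec with `-X utf8` on legacy Windows), the decision it has to make might look roughly like the pure function below. The function name, its parameters, and the exact conditions are assumptions for illustration; the real helper also takes `module` and `entrypoint_markers`, which this sketch ignores.

```python
from typing import List, Optional, Sequence


def utf8_reexec_argv(
    platform: str,
    utf8_mode_enabled: bool,
    executable: str,
    argv: Sequence[str],
) -> Optional[List[str]]:
    """Return the command line for a UTF-8 re-exec, or None if none is needed.

    Hypothetical sketch: only Windows without UTF-8 mode triggers a re-exec.
    """
    if platform != "win32" or utf8_mode_enabled:
        return None
    # "-X utf8" switches the interpreter into UTF-8 mode for the child process.
    return [executable, "-X", "utf8", *argv]
```

A caller would pass `sys.platform`, `bool(sys.flags.utf8_mode)`, `sys.executable`, and `sys.argv`, and spawn the returned command only when the result is not `None`.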
+3 -1
@@ -3146,7 +3146,9 @@ class BasePlatformAdapter(ABC):
_post_cb = getattr(self, "_post_delivery_callbacks", {}).pop(session_key, None)
if callable(_post_cb):
try:
_post_cb()
_post_result = _post_cb()
if inspect.isawaitable(_post_result):
await _post_result
except Exception:
pass
# Stop typing indicator
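Review note: the hunk above lets a post-delivery callback be either a plain function or a coroutine function. A minimal standalone sketch of that pattern follows; `run_post_delivery_callback` and the demo callbacks are illustrative names, not code from this repository.

```python
import asyncio
import inspect
from typing import Any, Callable, List


async def run_post_delivery_callback(callback: Callable[[], Any]) -> None:
    """Call the callback and await its result only if it is awaitable."""
    try:
        result = callback()
        if inspect.isawaitable(result):
            await result
    except Exception:
        # Mirrors the diff: callback failures must not break delivery.
        pass


async def demo() -> List[str]:
    calls: List[str] = []

    def sync_cb() -> None:
        calls.append("sync")

    async def async_cb() -> None:
        await asyncio.sleep(0)
        calls.append("async")

    await run_post_delivery_callback(sync_cb)
    await run_post_delivery_callback(async_cb)
    return calls
```

Without the `isawaitable` check, calling `async_cb()` would only create a coroutine object that is never run, which is the failure mode the change guards against.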
+156 -35
@@ -1903,6 +1903,59 @@ class GatewayRunner:
depth += 1
return depth
@staticmethod
def _is_goal_continuation_event(event_or_text: Any) -> bool:
"""Return True for synthetic /goal continuation turns.
Goal continuations are normal queued user-role events, so pause/clear
must distinguish them from real user /queue messages before removing or
suppressing them.
"""
text = getattr(event_or_text, "text", event_or_text) or ""
return str(text).startswith("[Continuing toward your standing goal]\nGoal:")
def _clear_goal_pending_continuations(self, session_key: str, adapter: Any) -> int:
"""Remove queued synthetic /goal continuations for one session.
User-issued /goal pause/clear can race with a continuation already
queued by the judge. Remove only synthetic goal continuations while
preserving normal /queue and user follow-up events.
"""
removed = 0
pending_slot = getattr(adapter, "_pending_messages", None) if adapter is not None else None
if isinstance(pending_slot, dict):
pending_event = pending_slot.get(session_key)
if self._is_goal_continuation_event(pending_event):
pending_slot.pop(session_key, None)
removed += 1
queued_events = getattr(self, "_queued_events", None)
if isinstance(queued_events, dict):
overflow = queued_events.get(session_key) or []
if overflow:
kept = []
for queued_event in overflow:
if self._is_goal_continuation_event(queued_event):
removed += 1
else:
kept.append(queued_event)
if kept:
queued_events[session_key] = kept
else:
queued_events.pop(session_key, None)
return removed
def _goal_still_active_for_session(self, session_id: str) -> bool:
"""Best-effort fresh DB check before running a queued continuation."""
if not session_id:
return False
try:
from hermes_cli.goals import GoalManager
return GoalManager(session_id=session_id).is_active()
except Exception as exc:
logger.debug("goal continuation: active-state recheck failed: %s", exc)
return False
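Review note: the cleanup helper above removes only synthetic goal continuations and keeps real user messages. The sketch below isolates that filtering step on a plain dict. `Event` and `clear_goal_continuations` are hypothetical stand-ins; the prefix string is the one used in `_is_goal_continuation_event`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

GOAL_PREFIX = "[Continuing toward your standing goal]\nGoal:"


@dataclass
class Event:
    # Hypothetical minimal event: the real gateway event carries more fields.
    text: str


def is_goal_continuation(event_or_text: Any) -> bool:
    """Accept either an event object with .text or a raw string."""
    text = getattr(event_or_text, "text", event_or_text) or ""
    return str(text).startswith(GOAL_PREFIX)


def clear_goal_continuations(queued: Dict[str, List[Event]], session_key: str) -> int:
    """Drop synthetic continuations for one session and return how many were removed."""
    overflow = queued.get(session_key) or []
    kept = [event for event in overflow if not is_goal_continuation(event)]
    removed = len(overflow) - len(kept)
    if kept:
        queued[session_key] = kept
    else:
        # No user messages left: remove the session slot entirely.
        queued.pop(session_key, None)
    return removed
```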
def _update_runtime_status(self, gateway_state: Optional[str] = None, exit_reason: Optional[str] = None) -> None:
try:
from gateway.status import write_runtime_status
@@ -5836,7 +5889,7 @@ class GatewayRunner:
except Exception:
session_entry = None
if session_entry is not None:
self._post_turn_goal_continuation(
await self._post_turn_goal_continuation(
session_entry=session_entry,
source=source,
final_response=_final_text,
@@ -8404,6 +8457,13 @@ class GatewayRunner:
state = mgr.pause(reason="user-paused")
if state is None:
return "No goal set."
try:
adapter = self.adapters.get(event.source.platform) if event.source else None
_quick_key = self._session_key_for_source(event.source) if event.source else None
if adapter and _quick_key:
self._clear_goal_pending_continuations(_quick_key, adapter)
except Exception as exc:
logger.debug("goal pause: pending continuation cleanup failed: %s", exc)
return f"⏸ Goal paused: {state.goal}"
if lower == "resume":
@@ -8418,6 +8478,13 @@ class GatewayRunner:
if lower in ("clear", "stop", "done"):
had = mgr.has_goal()
mgr.clear()
try:
adapter = self.adapters.get(event.source.platform) if event.source else None
_quick_key = self._session_key_for_source(event.source) if event.source else None
if adapter and _quick_key:
self._clear_goal_pending_continuations(_quick_key, adapter)
except Exception as exc:
logger.debug("goal clear: pending continuation cleanup failed: %s", exc)
return t("gateway.goal_cleared") if had else t("gateway.no_active_goal")
# Otherwise — treat the remaining text as the new goal.
@@ -8449,7 +8516,69 @@ class GatewayRunner:
"Controls: /goal status · /goal pause · /goal resume · /goal clear"
)
def _post_turn_goal_continuation(
async def _send_goal_status_notice(self, source: Any, message: str) -> None:
"""Send a /goal judge status line back to the originating chat/thread."""
adapter = self.adapters.get(source.platform)
if not adapter:
logger.debug("goal continuation: no adapter for %s", getattr(source, "platform", None))
return
try:
metadata = self._thread_metadata_for_source(source)
except Exception:
metadata = {"thread_id": source.thread_id} if getattr(source, "thread_id", None) else None
result = await adapter.send(source.chat_id, message, metadata=metadata)
if result is not None and not getattr(result, "success", True):
logger.warning(
"goal continuation: status send failed: %s",
getattr(result, "error", "unknown error"),
)
async def _defer_goal_status_notice_after_delivery(self, source: Any, message: str) -> None:
"""Send a /goal status line after the main response is delivered.
The gateway message handler returns the agent response to the platform
adapter, which sends it after this method's caller has returned. For a
natural Discord/Telegram reading order, goal status belongs after that
send. Platform adapters provide a one-shot post-delivery callback for
exactly this boundary; when unavailable, fall back to direct awaited
delivery rather than silently dropping the notice.
"""
adapter = self.adapters.get(source.platform)
if not adapter:
logger.debug("goal continuation: no adapter for %s", getattr(source, "platform", None))
return
async def _deliver() -> None:
try:
await self._send_goal_status_notice(source, message)
except Exception as exc:
logger.warning("goal continuation: status send failed: %s", exc, exc_info=True)
try:
session_key = self._session_key_for_source(source)
except Exception:
session_key = None
if session_key and hasattr(adapter, "register_post_delivery_callback"):
try:
generation = None
active = getattr(adapter, "_active_sessions", {}).get(session_key)
if active is not None:
generation = getattr(active, "_hermes_run_generation", None)
adapter.register_post_delivery_callback(
session_key,
_deliver,
generation=generation,
)
return
except Exception as exc:
logger.debug("goal continuation: post-delivery callback registration failed: %s", exc)
await _deliver()
async def _post_turn_goal_continuation(
self,
*,
session_entry: Any,
@@ -8485,38 +8614,14 @@ class GatewayRunner:
decision = mgr.evaluate_after_turn(final_response or "", user_initiated=True)
msg = decision.get("message") or ""
# Send the status line back to the user so they see the judge's
# verdict. Fire-and-forget via the adapter's ``send()`` method —
# adapters expose ``send(chat_id, content, reply_to, metadata)``,
# not a ``send_message(source, msg)`` wrapper, so an earlier
# ``hasattr(adapter, "send_message")`` gate here was dead code and
# users never saw ``✓ Goal achieved`` / ``⏸ budget exhausted``
# verdicts.
# Defer the status line until after the adapter has delivered the
# agent's visible final response. The judge runs after the response is
# produced but before BasePlatformAdapter sends it, so sending here
# would show "✓ Goal achieved" before the answer itself. Registering
# an awaited post-delivery callback preserves delivery reliability
# without reversing the user-visible ordering.
if msg and source is not None:
try:
adapter = self.adapters.get(source.platform)
if adapter is not None and hasattr(adapter, "send"):
import asyncio as _asyncio
thread_meta = (
{"thread_id": source.thread_id} if source.thread_id else None
)
coro = adapter.send(
chat_id=source.chat_id,
content=msg,
metadata=thread_meta,
)
if _asyncio.iscoroutine(coro):
try:
loop = _asyncio.get_running_loop()
loop.create_task(coro)
except RuntimeError:
# No running loop in this thread — best effort.
try:
_asyncio.run(coro)
except Exception:
pass
except Exception as exc:
logger.debug("goal continuation: status send failed: %s", exc)
await self._defer_goal_status_notice_after_delivery(source, msg)
if not decision.get("should_continue"):
return
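Review note: the ordering argument in this hunk (answer first, judge verdict second) can be checked with a toy adapter. `FakeAdapter` and its `deliver` method are hypothetical; only the method name `register_post_delivery_callback` comes from the diff, and the real signature also accepts a `generation` argument.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class FakeAdapter:
    """Hypothetical adapter: sends the main response, then runs a one-shot callback."""

    def __init__(self) -> None:
        self.sent: List[str] = []
        self._callbacks: Dict[str, Callable[[], Awaitable[None]]] = {}

    def register_post_delivery_callback(
        self, session_key: str, callback: Callable[[], Awaitable[None]]
    ) -> None:
        self._callbacks[session_key] = callback

    async def send(self, text: str) -> None:
        self.sent.append(text)

    async def deliver(self, session_key: str, response: str) -> None:
        await self.send(response)
        # Pop so the callback fires at most once per delivery.
        callback = self._callbacks.pop(session_key, None)
        if callback is not None:
            await callback()


async def demo() -> List[str]:
    adapter = FakeAdapter()

    async def notice() -> None:
        await adapter.send("✓ Goal achieved: tests pass")

    # The judge runs before delivery, so the notice is registered first
    # and still arrives after the main response.
    adapter.register_post_delivery_callback("s1", notice)
    await adapter.deliver("s1", "Here is the final answer.")
    return adapter.sent
```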
@@ -14768,14 +14873,18 @@ class GatewayRunner:
)
if callable(_bg_cb):
try:
_bg_cb()
_bg_result = _bg_cb()
if inspect.isawaitable(_bg_result):
await _bg_result
except Exception:
pass
elif adapter and hasattr(adapter, "_post_delivery_callbacks"):
_bg_cb = adapter._post_delivery_callbacks.pop(session_key, None)
if callable(_bg_cb):
try:
_bg_cb()
_bg_result = _bg_cb()
if inspect.isawaitable(_bg_result):
await _bg_result
except Exception:
pass
# else: interrupted — discard the interrupted response ("Operation
@@ -14789,6 +14898,12 @@ class GatewayRunner:
next_channel_prompt = None
if pending_event is not None:
next_source = getattr(pending_event, "source", None) or source
if self._is_goal_continuation_event(pending_event) and not self._goal_still_active_for_session(session_id):
logger.info(
"Discarding stale goal continuation for session %s — goal is no longer active",
session_key or "?",
)
return result
next_message = await self._prepare_inbound_message_text(
event=pending_event,
source=next_source,
@@ -15385,6 +15500,12 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
def main():
"""CLI entry point for the gateway."""
from utf8_bootstrap import ensure_windows_utf8_mode
ensure_windows_utf8_mode(
module="gateway.run",
entrypoint_markers=("gateway", "run.py"),
)
import argparse
parser = argparse.ArgumentParser(description="Hermes Gateway - Multi-platform messaging")
+3
@@ -109,6 +109,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("resume", "Resume a previously-named session", "Session",
args_hint="[name]"),
# Configuration
CommandDef("sessions", "Browse and resume previous sessions", "Session"),
# Configuration
CommandDef("config", "Show current configuration", "Configuration",
cli_only=True),
+123 -90
@@ -21,6 +21,7 @@ import stat
import subprocess
import sys
import tempfile
import threading
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
@@ -42,6 +43,14 @@ _LOAD_CONFIG_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}
# _LOAD_CONFIG_CACHE but for read_raw_config() — used when callers want
# the user's on-disk values without defaults merged in.
_RAW_CONFIG_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}
# Serializes all config read/write paths. libyaml's C extension is not
# thread-safe for concurrent safe_load() on the same file, and multiple
# tool threads (approval.py, browser_tool.py, setup flows) hit
# load_config / read_raw_config / save_config from different threads
# during long agent runs. RLock (not Lock) because save_config internally
# calls read_raw_config. Also covers mutation of the module-level cache
# dicts above.
_CONFIG_LOCK = threading.RLock()
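Review note: the comment above says an `RLock` is needed because `save_config` calls `read_raw_config` while already holding the lock. A minimal sketch of that re-entrant pattern, with a dict standing in for the config file, is below. `_store`, `read_raw`, and `save` are illustrative names only.

```python
import threading
from typing import Dict

_lock = threading.RLock()
_store: Dict[str, int] = {"value": 1}


def read_raw() -> Dict[str, int]:
    """Return a copy of the shared state under the lock."""
    with _lock:
        return dict(_store)


def save(update: Dict[str, int]) -> None:
    """Merge an update into the shared state.

    The nested read_raw() call re-acquires _lock on the same thread.
    That is allowed for RLock; a plain threading.Lock would deadlock here.
    """
    with _lock:
        merged = read_raw()
        merged.update(update)
        _store.clear()
        _store.update(merged)
```

Because the whole read-merge-write sequence runs under one lock, concurrent `save` calls from several threads cannot lose each other's updates.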
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
# (managed by setup/provider flows directly).
_EXTRA_ENV_KEYS = frozenset({
@@ -780,6 +789,19 @@ DEFAULT_CONFIG = {
"timeout": 30,
"extra_body": {},
},
# Triage specifier — flesh out a rough one-liner in the Kanban
# Triage column into a concrete spec, then promote it to ``todo``.
# Invoked by ``hermes kanban specify`` (single id or --all). Set a
# cheap, capable model here (gemini-flash works well); the main
# model is overkill for short spec expansion.
"triage_specifier": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 120,
"extra_body": {},
},
# Curator — skill-usage review fork. Timeout is generous because the
# review pass can take several minutes on reasoning models (umbrella
# building over hundreds of candidate skills). "auto" = use main chat
@@ -1864,6 +1886,14 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "tool",
},
"BRAVE_SEARCH_API_KEY": {
"description": "Brave Search API subscription token (free tier: 2,000 queries/mo)",
"prompt": "Brave Search subscription token",
"url": "https://brave.com/search/api/",
"tools": ["web_search"],
"password": True,
"category": "tool",
},
"BROWSERBASE_API_KEY": {
"description": "Browserbase API key for cloud browser (optional — local browser works without this)",
"prompt": "Browserbase API key",
@@ -3920,28 +3950,29 @@ def read_raw_config() -> Dict[str, Any]:
``load_config()``. Returns a deepcopy on every call since some callers
mutate the result before passing to ``save_config()``.
"""
try:
config_path = get_config_path()
st = config_path.stat()
cache_key = (st.st_mtime_ns, st.st_size)
except (FileNotFoundError, OSError):
return {}
with _CONFIG_LOCK:
try:
config_path = get_config_path()
st = config_path.stat()
cache_key = (st.st_mtime_ns, st.st_size)
except (FileNotFoundError, OSError):
return {}
path_key = str(config_path)
cached = _RAW_CONFIG_CACHE.get(path_key)
if cached is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
path_key = str(config_path)
cached = _RAW_CONFIG_CACHE.get(path_key)
if cached is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
return {}
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
return {}
if not isinstance(data, dict):
data = {}
_RAW_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(data))
return data
if not isinstance(data, dict):
data = {}
_RAW_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(data))
return data
def load_config() -> Dict[str, Any]:
@@ -3954,46 +3985,47 @@ def load_config() -> Dict[str, Any]:
(which change ``HERMES_HOME`` and therefore ``get_config_path()``)
don't collide.
"""
ensure_hermes_home()
config_path = get_config_path()
path_key = str(config_path)
with _CONFIG_LOCK:
ensure_hermes_home()
config_path = get_config_path()
path_key = str(config_path)
try:
st = config_path.stat()
cache_key: Optional[Tuple[int, int]] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
cache_key = None
cached = _LOAD_CONFIG_CACHE.get(path_key)
if cached is not None and cache_key is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
config = copy.deepcopy(DEFAULT_CONFIG)
if cache_key is not None:
try:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
st = config_path.stat()
cache_key: Optional[Tuple[int, int]] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
cache_key = None
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
cached = _LOAD_CONFIG_CACHE.get(path_key)
if cached is not None and cache_key is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
config = copy.deepcopy(DEFAULT_CONFIG)
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[path_key] = copy.deepcopy(expanded)
if cache_key is not None:
_LOAD_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(expanded))
else:
_LOAD_CONFIG_CACHE.pop(path_key, None)
return expanded
if cache_key is not None:
try:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[path_key] = copy.deepcopy(expanded)
if cache_key is not None:
_LOAD_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(expanded))
else:
_LOAD_CONFIG_CACHE.pop(path_key, None)
return expanded
_SECURITY_COMMENT = """
@@ -4073,45 +4105,46 @@ _COMMENTED_SECTIONS = """
def save_config(config: Dict[str, Any]):
"""Save configuration to ~/.hermes/config.yaml."""
if is_managed():
managed_error("save configuration")
return
from utils import atomic_yaml_write
with _CONFIG_LOCK:
if is_managed():
managed_error("save configuration")
return
from utils import atomic_yaml_write
ensure_hermes_home()
config_path = get_config_path()
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
ensure_hermes_home()
config_path = get_config_path()
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
parts = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
fb_is_valid = False
if isinstance(fb, list):
fb_is_valid = any(isinstance(e, dict) and e.get("provider") and e.get("model") for e in fb)
elif isinstance(fb, dict):
fb_is_valid = bool(fb.get("provider") and fb.get("model"))
if not fb_is_valid:
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
config_path,
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
extra_content="".join(parts) if parts else None,
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
parts = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
fb_is_valid = False
if isinstance(fb, list):
fb_is_valid = any(isinstance(e, dict) and e.get("provider") and e.get("model") for e in fb)
elif isinstance(fb, dict):
fb_is_valid = bool(fb.get("provider") and fb.get("model"))
if not fb_is_valid:
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
config_path,
normalized,
extra_content="".join(parts) if parts else None,
)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
def load_env() -> Dict[str, str]:
+14
@@ -91,6 +91,15 @@ def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
return steps
def _termux_install_all_fallback_notes() -> list[str]:
return [
"Termux install profile: use .[termux-all] for broad compatibility (installer default on Termux).",
"Matrix E2EE extra is excluded on Termux (python-olm currently fails to build).",
"Local faster-whisper extra is excluded on Termux (ctranslate2/av build path unavailable).",
"STT fallback: use Groq Whisper (set GROQ_API_KEY) or OpenAI Whisper (set VOICE_TOOLS_OPENAI_KEY).",
]
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
@@ -1084,6 +1093,11 @@ def run_doctor(args):
except Exception:
pass
if _is_termux():
check_info("Termux compatibility fallbacks:")
for note in _termux_install_all_fallback_notes():
check_info(note)
# =========================================================================
# Check: API connectivity
# =========================================================================
+79 -21
@@ -47,6 +47,14 @@ DEFAULT_MAX_TURNS = 20
DEFAULT_JUDGE_TIMEOUT = 30.0
# Cap how much of the last response + recent messages we send to the judge.
_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
# After this many consecutive judge *parse* failures (empty output / non-JSON),
# the loop auto-pauses and points the user at the goal_judge config. API /
# transport errors do NOT count toward this — those are transient. This guards
# against small models (e.g. deepseek-v4-flash) that cannot follow the strict
# JSON reply contract; without it the loop runs until the turn budget is
# exhausted with every reply shaped like `judge returned empty response` or
# `judge reply was not JSON`.
DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES = 3
CONTINUATION_PROMPT_TEMPLATE = (
@@ -99,6 +107,7 @@ class GoalState:
last_verdict: Optional[str] = None # "done" | "continue" | "skipped"
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
consecutive_parse_failures: int = 0 # judge-output parse failures in a row
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@@ -116,6 +125,7 @@ class GoalState:
last_verdict=data.get("last_verdict"),
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
)
@@ -220,13 +230,17 @@ def _truncate(text: str, limit: int) -> str:
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
def _parse_judge_response(raw: str) -> Tuple[bool, str]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
def _parse_judge_response(raw: str) -> Tuple[bool, str, bool]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>", parse_failed)``.
Returns ``(done, reason)``.
Returns ``(done, reason, parse_failed)``. ``parse_failed`` is True when the
judge returned output that couldn't be interpreted as the expected JSON
verdict (empty body, prose, malformed JSON). Callers use that flag to
auto-pause after N consecutive parse failures so a weak judge model
doesn't silently burn the turn budget.
"""
if not raw:
return False, "judge returned empty response"
return False, "judge returned empty response", True
text = raw.strip()
@@ -252,7 +266,7 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str]:
data = None
if not isinstance(data, dict):
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}", True
done_val = data.get("done")
if isinstance(done_val, str):
@@ -262,7 +276,7 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str]:
reason = str(data.get("reason") or "").strip()
if not reason:
reason = "no reason provided"
return done, reason
return done, reason, False
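Review note: the new third return value separates "the judge said continue" from "the judge's output was unusable". A simplified sketch of that contract follows. It is not the function from this diff: the real parser has extra handling in the lines the hunks elide (for example string-valued `done`), which this version leaves out.

```python
import json
from typing import Tuple


def parse_verdict(raw: str) -> Tuple[bool, str, bool]:
    """Return (done, reason, parse_failed) for a judge reply.

    Simplified assumption: the reply is either a bare JSON object or unusable.
    """
    if not raw:
        return False, "judge returned empty response", True

    try:
        data = json.loads(raw.strip())
    except ValueError:
        data = None

    if not isinstance(data, dict):
        return False, f"judge reply was not JSON: {raw[:200]!r}", True

    done = data.get("done") is True
    reason = str(data.get("reason") or "").strip() or "no reason provided"
    return done, reason, False
```

Both failure branches still fail open with `done=False`; only the flag tells the caller that the turn produced no usable verdict.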
def judge_goal(
@@ -270,36 +284,42 @@ def judge_goal(
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
) -> Tuple[str, str]:
) -> Tuple[str, str, bool]:
"""Ask the auxiliary model whether the goal is satisfied.
Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
or ``"skipped"`` (when the judge couldn't be reached).
Returns ``(verdict, reason, parse_failed)`` where verdict is ``"done"``,
``"continue"``, or ``"skipped"`` (when the judge couldn't be reached).
This is deliberately fail-open: any error returns ``("continue", "...")``
so a broken judge doesn't wedge progress — the turn budget is the
backstop.
``parse_failed`` is True only when the judge call succeeded but its output
was unusable (empty or non-JSON). API/transport errors return False; they
are transient and should fail-open silently. Callers use this flag to
auto-pause after N consecutive parse failures (see
``DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES``).
This is deliberately fail-open: any error returns ``("continue", "...", False)``
so a broken judge doesn't wedge progress — the turn budget and the
consecutive-parse-failures auto-pause are the backstops.
"""
if not goal.strip():
return "skipped", "empty goal"
return "skipped", "empty goal", False
if not last_response.strip():
# No substantive reply this turn — almost certainly not done yet.
return "continue", "empty response (nothing to evaluate)"
return "continue", "empty response (nothing to evaluate)", False
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable"
return "continue", "auxiliary client unavailable", False
try:
client, model = get_text_auxiliary_client("goal_judge")
except Exception as exc:
logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
return "continue", "auxiliary client unavailable"
return "continue", "auxiliary client unavailable", False
if client is None or not model:
return "continue", "no auxiliary client configured"
return "continue", "no auxiliary client configured", False
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
@@ -319,17 +339,17 @@ def judge_goal(
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
return "continue", f"judge error: {type(exc).__name__}"
return "continue", f"judge error: {type(exc).__name__}", False
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
done, reason = _parse_judge_response(raw)
done, reason, parse_failed = _parse_judge_response(raw)
verdict = "done" if done else "continue"
logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
return verdict, reason
return verdict, reason, parse_failed
# ──────────────────────────────────────────────────────────────────────
@@ -473,10 +493,18 @@ class GoalManager:
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason = judge_goal(state.goal, last_response)
verdict, reason, parse_failed = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
# Track consecutive judge parse failures. Reset on any usable reply,
# including API / transport errors (parse_failed=False) so a flaky
# network doesn't trip the auto-pause meant for bad judge models.
if parse_failed:
state.consecutive_parse_failures += 1
else:
state.consecutive_parse_failures = 0
if verdict == "done":
state.status = "done"
save_goal(self.session_id, state)
@@ -489,6 +517,36 @@ class GoalManager:
"message": f"✓ Goal achieved: {reason}",
}
# Auto-pause when the judge model can't produce the expected JSON
# verdict N turns in a row. Points the user at the goal_judge config
# so they can route this side task to a model that follows the
# contract (e.g. google/gemini-3-flash-preview). Without this guard,
# weak judge models burn the entire turn budget returning prose or
# empty strings.
if state.consecutive_parse_failures >= DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES:
state.status = "paused"
state.paused_reason = (
f"judge model returned unparseable output {state.consecutive_parse_failures} turns in a row"
)
save_goal(self.session_id, state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — the judge model ({state.consecutive_parse_failures} turns) "
"isn't returning the required JSON verdict. Route the judge to a stricter "
"model in ~/.hermes/config.yaml:\n"
" auxiliary:\n"
" goal_judge:\n"
" provider: openrouter\n"
" model: google/gemini-3-flash-preview\n"
"Then /goal resume to continue."
),
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
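Reviewer note: the consecutive-parse-failure guard in this hunk can be read in isolation. The sketch below is a minimal illustration and not the Hermes implementation; `GoalStateSketch`, `record_verdict`, and the threshold of 3 (the value quoted in the commit message) are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed threshold; the commit message mentions N=3.
MAX_CONSECUTIVE_PARSE_FAILURES = 3


@dataclass
class GoalStateSketch:
    """Hypothetical stand-in for the persisted goal state."""
    status: str = "active"
    consecutive_parse_failures: int = 0


def record_verdict(state: GoalStateSketch, parse_failed: bool) -> bool:
    """Update the streak counter; return True when the loop should auto-pause."""
    if parse_failed:
        state.consecutive_parse_failures += 1
    else:
        # Any usable reply, including a transport error reported with
        # parse_failed=False, resets the streak.
        state.consecutive_parse_failures = 0
    if state.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
        state.status = "paused"
        return True
    return False
```

Because the counter lives on the state object, it survives a save/reload cycle as long as the field is serialized with the rest of the goal state.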
@@ -570,6 +570,42 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
)
p_ctx.add_argument("task_id")
# --- specify --- (triage → todo via auxiliary LLM)
p_specify = sub.add_parser(
"specify",
help="Flesh out a triage-column task into a concrete spec "
"(title + body) and promote it to todo. Uses the auxiliary "
"LLM configured under auxiliary.triage_specifier.",
)
p_specify.add_argument(
"task_id",
nargs="?",
default=None,
help="Task id to specify (required unless --all is given)",
)
p_specify.add_argument(
"--all",
dest="all_triage",
action="store_true",
help="Specify every task currently in the triage column",
)
p_specify.add_argument(
"--tenant",
default=None,
help="When used with --all, restrict the sweep to this tenant",
)
p_specify.add_argument(
"--author",
default=None,
help="Author name recorded on the audit comment "
"(default: $HERMES_PROFILE or 'specifier')",
)
p_specify.add_argument(
"--json",
action="store_true",
help="Emit one JSON object per task on stdout",
)
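Reviewer note: a quick way to sanity-check the new `specify` arguments is to rebuild them on a throwaway parser. This is a sketch under assumptions: the `kanban` program name and the `action` destination are placeholders, and only the flags shown in this hunk are reproduced.

```python
import argparse

# Throwaway parser that mirrors the flags added in this hunk.
parser = argparse.ArgumentParser(prog="kanban")
sub = parser.add_subparsers(dest="action")

p_specify = sub.add_parser("specify")
p_specify.add_argument("task_id", nargs="?", default=None)
p_specify.add_argument("--all", dest="all_triage", action="store_true")
p_specify.add_argument("--tenant", default=None)
p_specify.add_argument("--author", default=None)
p_specify.add_argument("--json", action="store_true")

# Sweep mode: no positional id, tenant filter applied.
sweep = parser.parse_args(["specify", "--all", "--tenant", "acme"])
# Single-task mode: positional id only.
single = parser.parse_args(["specify", "t_123", "--json"])
```

Note that argparse alone accepts both a task id and `--all`; the mutual-exclusion check has to happen in the command handler.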
# --- gc ---
p_gc = sub.add_parser(
"gc", help="Garbage-collect archived-task workspaces, old events, and old logs",
@@ -684,6 +720,7 @@ def kanban_command(args: argparse.Namespace) -> int:
"notify-list": _cmd_notify_list,
"notify-unsubscribe": _cmd_notify_unsubscribe,
"context": _cmd_context,
"specify": _cmd_specify,
"gc": _cmd_gc,
}
handler = handlers.get(action)
@@ -1980,6 +2017,80 @@ def _cmd_context(args: argparse.Namespace) -> int:
return 0
def _cmd_specify(args: argparse.Namespace) -> int:
"""Flesh out a triage task (or all of them) via auxiliary LLM,
then promote to todo. Thin wrapper over ``kanban_specify``."""
from hermes_cli import kanban_specify as spec
all_flag = bool(getattr(args, "all_triage", False))
tenant = getattr(args, "tenant", None)
author = getattr(args, "author", None) or _profile_author()
want_json = bool(getattr(args, "json", False))
if args.task_id and all_flag:
print(
"kanban: pass either a task id OR --all, not both",
file=sys.stderr,
)
return 2
if all_flag:
ids = spec.list_triage_ids(tenant=tenant)
if not ids:
msg = (
"No triage tasks"
+ (f" for tenant {tenant!r}" if tenant else "")
+ "."
)
if want_json:
print(json.dumps({"specified": 0, "total": 0}))
else:
print(msg)
return 0
elif args.task_id:
ids = [args.task_id]
else:
print(
"kanban: specify requires a task id or --all",
file=sys.stderr,
)
return 2
ok_count = 0
fail_count = 0
for tid in ids:
outcome = spec.specify_task(tid, author=author)
if outcome.ok:
ok_count += 1
else:
fail_count += 1
if want_json:
print(json.dumps({
"task_id": outcome.task_id,
"ok": outcome.ok,
"reason": outcome.reason,
"new_title": outcome.new_title,
}))
else:
if outcome.ok:
title_suffix = (
f" — retitled: {outcome.new_title!r}"
if outcome.new_title
else ""
)
print(f"Specified {outcome.task_id} → todo{title_suffix}")
else:
print(
f"kanban: specify {outcome.task_id}: {outcome.reason}",
file=sys.stderr,
)
if not all_flag:
return 0 if ok_count == 1 else 1
# --all: succeed if at least one promotion landed; exit 1 only when
# every candidate failed (honest signal for scripts).
return 0 if (ok_count > 0 or not ids) else 1
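Reviewer note: the exit-code rules at the end of `_cmd_specify` are easy to misread, so here they are as a pure function. `specify_exit_code` is a hypothetical helper written for this note; it only restates the two return statements above.

```python
def specify_exit_code(ok_count: int, total: int, all_flag: bool) -> int:
    """Restate the exit-code policy of the specify command."""
    if not all_flag:
        # Single-task mode: success only if that one task was promoted.
        return 0 if ok_count == 1 else 1
    # --all mode: an empty sweep or at least one promotion counts as success;
    # exit 1 only when every candidate failed.
    return 0 if (ok_count > 0 or total == 0) else 1
```

Scripts that need per-task results should read the `--json` output rather than rely on the exit code, since a partially failed sweep still exits 0.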
def _cmd_gc(args: argparse.Namespace) -> int:
"""Remove scratch workspaces of archived tasks, prune old events, and
delete old worker logs."""
@@ -2503,6 +2503,91 @@ def unblock_task(conn: sqlite3.Connection, task_id: str) -> bool:
return True
def specify_triage_task(
conn: sqlite3.Connection,
task_id: str,
*,
title: Optional[str] = None,
body: Optional[str] = None,
author: Optional[str] = None,
) -> bool:
"""Flesh out a triage task and promote it to ``todo``.
Atomically updates ``title`` / ``body`` (when provided) and transitions
``status: triage -> todo`` in a single write txn. Returns False when
the task is missing or not in the ``triage`` column; callers should
surface that as "nothing to specify" rather than an error.
``todo`` (not ``ready``) is the correct landing column: ``recompute_ready``
promotes parent-free / parent-done todos to ``ready`` on the next
dispatcher tick, which keeps the normal parent-gating behaviour intact
for specified tasks that happen to have open parents.
``author`` is recorded on an audit comment only when at least one of
``title`` / ``body`` actually changed, which avoids noisy comment spam for
status-only promotions.
"""
if title is not None and not title.strip():
raise ValueError("title cannot be blank")
with write_txn(conn):
existing = conn.execute(
"SELECT title, body FROM tasks WHERE id = ? AND status = 'triage'",
(task_id,),
).fetchone()
if existing is None:
return False
sets: list[str] = ["status = 'todo'"]
params: list[Any] = []
changed_fields: list[str] = []
if title is not None and title.strip() != (existing["title"] or ""):
sets.append("title = ?")
params.append(title.strip())
changed_fields.append("title")
if body is not None and (body or "") != (existing["body"] or ""):
sets.append("body = ?")
params.append(body)
changed_fields.append("body")
params.append(task_id)
cur = conn.execute(
f"UPDATE tasks SET {', '.join(sets)} "
f"WHERE id = ? AND status = 'triage'",
tuple(params),
)
if cur.rowcount != 1:
return False
if changed_fields and author and author.strip():
# Inline INSERT (rather than ``add_comment``) because we're
# already inside this function's write_txn — nested BEGIN
# IMMEDIATE would raise OperationalError. We also skip the
# 'commented' event that ``add_comment`` emits, since the
# 'specified' event below already records the change.
conn.execute(
"INSERT INTO task_comments (task_id, author, body, created_at) "
"VALUES (?, ?, ?, ?)",
(
task_id,
author.strip(),
"Specified — updated "
+ ", ".join(changed_fields)
+ " and promoted to todo.",
int(time.time()),
),
)
_append_event(
conn,
task_id,
"specified",
{"changed_fields": changed_fields} if changed_fields else None,
)
# Outside the write_txn above, so we don't nest BEGIN IMMEDIATE — the
# ready-promotion pass opens its own IMMEDIATE txn. This runs the same
# logic the dispatcher would on its next tick, so a specified task
# with no open parents flips straight to 'ready' here instead of
# idling in 'todo' until the next sweep.
recompute_ready(conn)
return True
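Reviewer note: the race protection in `specify_triage_task` comes from repeating the `status = 'triage'` guard in the UPDATE and checking `rowcount`. The sketch below shows that pattern on an in-memory database. The three-column `tasks` table and the `promote_if_triage` helper are assumptions for illustration; the real schema and `write_txn` helper are not shown in this diff.

```python
import sqlite3


def promote_if_triage(conn: sqlite3.Connection, task_id: str, title: str) -> bool:
    """Guarded promotion: only a row still in 'triage' is updated."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE tasks SET status = 'todo', title = ? "
            "WHERE id = ? AND status = 'triage'",
            (title, task_id),
        )
        # rowcount is 0 when the task is missing or already moved on.
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT, status TEXT)")
conn.execute("INSERT INTO tasks VALUES ('t1', 'rough idea', 'triage')")
conn.commit()

first = promote_if_triage(conn, "t1", "Tightened title")   # task was in triage
second = promote_if_triage(conn, "t1", "Another title")    # already promoted
```

The second call changes nothing, which is the "nothing to specify" case the docstring asks callers to treat as a non-error.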
def archive_task(conn: sqlite3.Connection, task_id: str) -> bool:
with write_txn(conn):
cur = conn.execute(
@@ -0,0 +1,265 @@
"""Kanban triage specifier — flesh out a one-liner into a real spec.
Used by ``hermes kanban specify [task_id | --all]``. Takes a task that
lives in the Triage column (a rough idea, typically only a title), calls
the auxiliary LLM to produce:
* A tightened title (optional; only replaces if the model proposes a
materially different one)
* A concrete body: goal, proposed approach, acceptance criteria
and then flips the task ``triage -> todo`` via
``kanban_db.specify_triage_task``. The dispatcher promotes it to
``ready`` on its next tick (or immediately if there are no open parents).
Design notes
------------
* This module intentionally mirrors ``hermes_cli/goals.py``: same aux
client pattern, same "empty config => skip, don't crash" tolerance.
Keeps the surface area tiny and the failure modes predictable.
* The prompt is a short system + user pair. We ask for JSON with
``{title, body}``; if parsing fails, we fall back to treating the
whole response as the body and leave the title untouched. No
retry loop: one shot, keep cost bounded.
* Structured output / JSON mode is not requested explicitly so the
specifier works on providers that don't implement it. The parse
is lenient (tolerates markdown code fences around the JSON).
"""
from __future__ import annotations
import json
import logging
import os
import re
from dataclasses import dataclass
from typing import Optional
from hermes_cli import kanban_db as kb
logger = logging.getLogger(__name__)
_SYSTEM_PROMPT = """You are the Kanban triage specifier for the Hermes Agent board.
A user dropped a rough idea into the Triage column. Your job is to turn it
into a concrete, actionable task spec that an autonomous worker can pick up
and execute without further clarification.
Output a single JSON object with exactly two keys:
{
"title": "<tightened task title, <= 80 chars, imperative voice>",
"body": "<multi-line spec, see structure below>"
}
The body MUST include these sections, each prefixed with a bold markdown
heading, in this order:
**Goal**: one sentence, user-facing outcome.
**Approach**: 2-5 bullets on how a worker should tackle it.
**Acceptance criteria**: checklist of concrete, verifiable conditions.
**Out of scope**: short list of things NOT to touch (omit if nothing
obvious; never invent scope creep).
Rules:
- Keep the tightened title close in meaning to the original idea; do
NOT invent a different project.
- If the original idea is already detailed, preserve its substance and
just reformat into the sections above.
- Never add invented requirements the user didn't hint at.
- No preamble, no closing remarks, no code fences around the JSON.
- Output only the JSON object and nothing else.
"""
_USER_TEMPLATE = """Task id: {task_id}
Current title: {title}
Current body:
{body}
"""
@dataclass
class SpecifyOutcome:
"""Result of specifying a single triage task."""
task_id: str
ok: bool
reason: str = ""
new_title: Optional[str] = None
def _truncate(text: str, limit: int) -> str:
if len(text) <= limit:
return text
return text[: limit - 1] + "…"
_FENCE_RE = re.compile(r"^\s*```(?:json)?\s*|\s*```\s*$", re.IGNORECASE)
def _extract_json_blob(raw: str) -> Optional[dict]:
"""Lenient JSON extraction — tolerates fenced code blocks and
leading/trailing whitespace. Returns None if nothing parses."""
if not raw:
return None
stripped = _FENCE_RE.sub("", raw.strip())
# Greedy: find the first `{` and last `}` and try that slice.
first = stripped.find("{")
last = stripped.rfind("}")
if first == -1 or last == -1 or last <= first:
return None
candidate = stripped[first : last + 1]
try:
val = json.loads(candidate)
except (ValueError, json.JSONDecodeError):
return None
if not isinstance(val, dict):
return None
return val
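Reviewer note: to see what the lenient parse accepts, here is a standalone copy of the same logic with a few inputs. `extract_json_object` is a renamed re-implementation for this note, and the fence marker is built with string multiplication only so the example does not contain a literal code fence.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # a markdown code-fence marker
FENCE_RE = re.compile(
    r"^\s*" + FENCE + r"(?:json)?\s*|\s*" + FENCE + r"\s*$",
    re.IGNORECASE,
)


def extract_json_object(raw: str) -> Optional[dict]:
    """Return the first-brace-to-last-brace slice parsed as a dict, else None."""
    if not raw:
        return None
    stripped = FENCE_RE.sub("", raw.strip())
    first, last = stripped.find("{"), stripped.rfind("}")
    if first == -1 or last <= first:
        return None
    try:
        val = json.loads(stripped[first : last + 1])
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return None
    return val if isinstance(val, dict) else None


fenced = FENCE + 'json\n{"title": "Fix login", "body": "details"}\n' + FENCE
chatty = 'Sure: {"title": "A"} done'
```

Both `fenced` and `chatty` parse to a dict, while prose without braces or a top-level JSON array yields `None` and triggers the body-only fallback in `specify_task`.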
def _profile_author() -> str:
"""Mirror of ``hermes_cli.kanban._profile_author``. Kept local to
avoid a circular import when kanban.py imports this module."""
return (
os.environ.get("HERMES_PROFILE")
or os.environ.get("USER")
or "specifier"
)
def specify_task(
task_id: str,
*,
author: Optional[str] = None,
timeout: Optional[int] = None,
) -> SpecifyOutcome:
"""Specify a single triage task and promote it to ``todo``.
Returns an outcome describing what happened. Never raises for expected
failure modes (task not in triage, no aux client configured, API
error, malformed response); those surface via ``ok=False`` so the
``--all`` sweep can continue past individual failures.
"""
with kb.connect() as conn:
task = kb.get_task(conn, task_id)
if task is None:
return SpecifyOutcome(task_id, False, "unknown task id")
if task.status != "triage":
return SpecifyOutcome(
task_id, False, f"task is not in triage (status={task.status!r})"
)
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc: # pragma: no cover — import smoke test
logger.debug("specify: auxiliary client import failed: %s", exc)
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
try:
client, model = get_text_auxiliary_client("triage_specifier")
except Exception as exc:
logger.debug("specify: get_text_auxiliary_client failed: %s", exc)
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
if client is None or not model:
return SpecifyOutcome(
task_id, False, "no auxiliary client configured"
)
user_msg = _USER_TEMPLATE.format(
task_id=task.id,
title=_truncate(task.title or "", 400),
body=_truncate(task.body or "(no body)", 4000),
)
try:
resp = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": _SYSTEM_PROMPT},
{"role": "user", "content": user_msg},
],
temperature=0.3,
max_tokens=1500,
timeout=timeout or 120,
)
except Exception as exc:
logger.info(
"specify: API call failed for %s (%s) — skipping",
task_id, exc,
)
return SpecifyOutcome(
task_id, False, f"LLM error: {type(exc).__name__}"
)
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
parsed = _extract_json_blob(raw)
new_title: Optional[str]
new_body: Optional[str]
if parsed is None:
# Fall back: treat the whole reply as the body, leave title as-is.
# Worst case the user edits afterward — still better than stranding
# the task in triage on a malformed LLM reply.
stripped_raw = raw.strip()
if not stripped_raw:
return SpecifyOutcome(
task_id, False, "LLM returned an empty response"
)
new_title = None
new_body = stripped_raw
else:
title_val = parsed.get("title")
body_val = parsed.get("body")
new_title = (
title_val.strip()
if isinstance(title_val, str) and title_val.strip()
else None
)
new_body = (
body_val if isinstance(body_val, str) and body_val.strip() else None
)
if new_body is None and new_title is None:
return SpecifyOutcome(
task_id, False, "LLM response missing title and body"
)
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
task_id,
title=new_title,
body=new_body,
author=author or _profile_author(),
)
if not ok:
# Race: someone else promoted / archived the task between our
# read above and the write. Report, don't crash.
return SpecifyOutcome(
task_id, False, "task moved out of triage before promotion"
)
return SpecifyOutcome(task_id, True, "specified", new_title=new_title)
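Reviewer note: the reply-handling branch in `specify_task` reduces to a small pure function, which makes the fallback rules easier to test. `normalize_reply` is a hypothetical helper written for this note; it follows the branches above but is not part of the diff.

```python
from typing import Optional, Tuple


def normalize_reply(
    parsed: Optional[dict], raw: str
) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (new_title, new_body), or None when nothing usable came back."""
    if parsed is None:
        # Unparseable reply: keep the title, use the whole text as the body.
        body = raw.strip()
        return (None, body) if body else None
    title = parsed.get("title")
    body = parsed.get("body")
    new_title = title.strip() if isinstance(title, str) and title.strip() else None
    new_body = body if isinstance(body, str) and body.strip() else None
    if new_title is None and new_body is None:
        return None
    return new_title, new_body
```

A `None` result corresponds to the two `ok=False` outcomes ("empty response" and "missing title and body"); any other result is passed on to `specify_triage_task`.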
def list_triage_ids(*, tenant: Optional[str] = None) -> list[str]:
"""Return task ids currently in the triage column.
``tenant`` narrows the sweep; ``None`` returns every triage task.
"""
with kb.connect() as conn:
tasks = kb.list_tasks(
conn,
status="triage",
tenant=tenant,
include_archived=False,
)
return [t.id for t in tasks]
@@ -43,12 +43,20 @@ Usage:
hermes claw migrate --dry-run # Preview migration without changes
"""
import os
import sys
from utf8_bootstrap import ensure_windows_utf8_mode
# Force UTF-8 defaults on Windows before any module-level file I/O.
ensure_windows_utf8_mode(
module="hermes_cli.main",
entrypoint_markers=("hermes", "main.py"),
)
import argparse
import json
import os
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Optional
@@ -230,6 +238,7 @@ except Exception:
pass # best-effort — don't crash if config isn't available yet
import logging
import threading
import time as _time
from datetime import datetime
@@ -6445,6 +6454,45 @@ def _load_installable_optional_extras() -> list[str]:
return referenced
def _run_install_with_heartbeat(
cmd: list[str],
*,
env: dict[str, str] | None = None,
heartbeat_interval_seconds: int = 30,
) -> None:
"""Run dependency install command with periodic heartbeat output.
Some resolvers/build backends (especially when compiling Rust/C extensions)
can stay quiet for minutes. Emit a simple elapsed-time heartbeat so users
know ``hermes update`` is still progressing even if pip/uv itself is silent.
"""
done = threading.Event()
start = _time.time()
def _heartbeat() -> None:
# Wait first, then print, so short installs don't emit noise.
while not done.wait(heartbeat_interval_seconds):
elapsed = int(_time.time() - start)
print(
f" … still installing dependencies ({elapsed}s elapsed)"
" — compiling Rust/C extensions can take several minutes",
flush=True,
)
t = threading.Thread(target=_heartbeat, daemon=True)
t.start()
try:
subprocess.run(
cmd,
cwd=PROJECT_ROOT,
check=True,
env=env,
)
finally:
done.set()
t.join(timeout=0.2)
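Reviewer note: the heartbeat relies on `threading.Event.wait(timeout)` returning False on timeout and True once the event is set. The sketch below isolates that pattern with an arbitrary callable instead of `subprocess.run`. `run_with_heartbeat` and the `beats` list are assumptions for the example, and it uses `time.monotonic()` where the diff uses `time.time()`.

```python
import threading
import time
from typing import Callable, List


def run_with_heartbeat(
    work: Callable[[], None], interval: float, beats: List[float]
) -> None:
    """Run work() while a daemon thread records elapsed time every interval."""
    done = threading.Event()
    start = time.monotonic()

    def _heartbeat() -> None:
        # wait() returns False on timeout (emit a beat) and True once
        # done is set (exit the loop), so short jobs emit nothing.
        while not done.wait(interval):
            beats.append(time.monotonic() - start)

    t = threading.Thread(target=_heartbeat, daemon=True)
    t.start()
    try:
        work()
    finally:
        done.set()
        t.join(timeout=1.0)


slow_beats: List[float] = []
run_with_heartbeat(lambda: time.sleep(0.35), 0.1, slow_beats)

fast_beats: List[float] = []
run_with_heartbeat(lambda: None, 5.0, fast_beats)
```

Setting the event in `finally` means the heartbeat also stops when the install command raises, so a failed install does not leave stray progress lines behind.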
def _install_python_dependencies_with_optional_fallback(
install_cmd_prefix: list[str],
*,
@@ -6461,12 +6509,13 @@ def _install_python_dependencies_with_optional_fallback(
Collecting/Building/Installing step), so keeping it visible costs
nothing on fast hardware and prevents the "hermes update hangs" reports
on slow hardware.
We also add periodic heartbeat lines in case the resolver/build backend is
itself silent for long stretches.
"""
try:
subprocess.run(
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "-e", ".[all]"],
cwd=PROJECT_ROOT,
check=True,
env=env,
)
return
@@ -6475,10 +6524,8 @@ def _install_python_dependencies_with_optional_fallback(
" ⚠ Optional extras failed, reinstalling base dependencies and retrying extras individually..."
)
subprocess.run(
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "-e", "."],
cwd=PROJECT_ROOT,
check=True,
env=env,
)
@@ -6486,10 +6533,8 @@ def _install_python_dependencies_with_optional_fallback(
installed_extras: list[str] = []
for extra in _load_installable_optional_extras():
try:
subprocess.run(
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "-e", f".[{extra}]"],
cwd=PROJECT_ROOT,
check=True,
env=env,
)
installed_extras.append(extra)
@@ -308,6 +308,23 @@ TOOL_CATEGORIES = {
{"key": "SEARXNG_URL", "prompt": "Your SearXNG instance URL (e.g., http://localhost:8080)", "url": "https://searxng.github.io/searxng/"},
],
},
{
"name": "Brave Search (Free Tier)",
"badge": "free tier · search only",
"tag": "2,000 queries/mo free — search only (pair with any extract provider)",
"web_backend": "brave-free",
"env_vars": [
{"key": "BRAVE_SEARCH_API_KEY", "prompt": "Brave Search subscription token", "url": "https://brave.com/search/api/"},
],
},
{
"name": "DuckDuckGo (ddgs)",
"badge": "free · no key · search only",
"tag": "Search via the ddgs Python package — no API key (pair with any extract provider)",
"web_backend": "ddgs",
"env_vars": [],
"post_setup": "ddgs",
},
],
},
"image_gen": {
@@ -669,6 +686,32 @@ def _run_post_setup(post_setup_key: str):
_print_info(" Full voice list: https://github.com/OHF-Voice/piper1-gpl/blob/main/docs/VOICES.md")
_print_info(" Switch voices by setting tts.piper.voice in ~/.hermes/config.yaml")
elif post_setup_key == "ddgs":
try:
__import__("ddgs")
_print_success(" ddgs is already installed")
except ImportError:
import subprocess
_print_info(" Installing ddgs (DuckDuckGo search package)...")
try:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", "-U", "ddgs", "--quiet"],
capture_output=True, text=True, timeout=300,
)
if result.returncode == 0:
_print_success(" ddgs installed")
else:
_print_warning(" ddgs install failed:")
_print_info(f" {result.stderr.strip()[:300]}")
_print_info(" Run manually: python -m pip install -U ddgs")
return
except subprocess.TimeoutExpired:
_print_warning(" ddgs install timed out (>5min)")
_print_info(" Run manually: python -m pip install -U ddgs")
return
_print_info(" No API key required. DuckDuckGo enforces server-side rate limits.")
_print_info(" Pair with an extract provider if you also need web_extract.")
elif post_setup_key == "spotify":
# Run the full `hermes auth spotify` flow — if the user has no
# client_id yet, this drops them into the interactive wizard
@@ -612,6 +612,11 @@ class SessionDB:
the caller already holds cumulative totals (gateway path, where the
cached agent accumulates across messages).
"""
# Ensure the session row exists so the UPDATE doesn't silently affect
# 0 rows. Under concurrent load (cron + kanban + delegate_task) the
# initial create_session() may have failed due to SQLite locking.
# INSERT OR IGNORE is cheap and idempotent.
self._insert_session_row(session_id, "unknown", model=model)
if absolute:
sql = """UPDATE sessions SET
input_tokens = ?,
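Reviewer note: the fix in this hunk is the "ensure row, then update" idiom. The sketch below shows it on an in-memory database. The reduced `sessions` schema and the `add_tokens` helper are assumptions for illustration, since `_insert_session_row` and the full table definition are not part of this diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, source TEXT, input_tokens INTEGER DEFAULT 0)"
)


def add_tokens(session_id: str, tokens: int) -> int:
    """Ensure the row exists, then increment; returns the rows updated."""
    with conn:
        # No-op when the row already exists: the primary-key conflict
        # is ignored instead of raising.
        conn.execute(
            "INSERT OR IGNORE INTO sessions (id, source) VALUES (?, ?)",
            (session_id, "unknown"),
        )
        cur = conn.execute(
            "UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
            (tokens, session_id),
        )
        return cur.rowcount


first = add_tokens("s1", 10)   # row is created, then updated
second = add_tokens("s1", 5)   # row already exists, insert is ignored
```

Without the insert, the first call would update zero rows and the token counts would be lost silently.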
@@ -802,7 +802,7 @@ def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
return json.dumps({"count": len(targets), "channels": targets}, indent=2)
channels = []
for plat, entries_list in directory.items():
for plat, entries_list in directory.get("platforms", {}).items():
if platform and plat.lower() != platform.lower():
continue
if isinstance(entries_list, list):
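Reviewer note: the one-line change implies the channel directory nests its per-platform lists under a `platforms` key next to other metadata. The sketch below assumes that shape; the `updated_at` key, the entry fields, and the `list_channels` helper are illustrative guesses, not the actual directory format.

```python
from typing import Optional


def list_channels(directory: dict, platform: Optional[str] = None) -> list:
    """Flatten per-platform channel lists, optionally filtering by platform."""
    channels = []
    # .get with a default tolerates a directory that has no platforms yet.
    for plat, entries in directory.get("platforms", {}).items():
        if platform and plat.lower() != platform.lower():
            continue
        if isinstance(entries, list):
            for entry in entries:
                channels.append({"platform": plat, **entry})
    return channels


directory = {
    "updated_at": "2026-05-07T00:00:00Z",  # assumed metadata key
    "platforms": {
        "discord": [{"id": "1", "name": "general"}],
        "telegram": [{"id": "2", "name": "ops"}],
    },
}
```

Iterating `directory.items()` directly, as the old line did, would treat `updated_at` as a platform name and skip every real platform because its value is a dict rather than a list.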
@@ -97,6 +97,12 @@
const API = "/api/plugins/kanban";
const MIME_TASK = "text/x-hermes-task";
// Docs link — surfaced as a `?` icon next to the board switcher and as
// `title=` hints on unlabelled controls. Kept in one place so rebrands or
// path changes are a single edit.
const DOCS_URL = "https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban";
const DOCS_TUTORIAL_URL = "https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban-tutorial";
// localStorage key for the user's selected board. Independent of the
// CLI's on-disk ``<root>/kanban/current`` pointer so browser users
// can inspect any board without shifting the CLI's active board out
@@ -1128,6 +1134,20 @@
// Board switcher (multi-project)
// -------------------------------------------------------------------------
// Small `?` affordance next to the board controls. Opens the kanban docs
// page in a new tab so users can look up what any of the widgets mean
// without losing the current board view.
function DocsLink() {
return h("a", {
href: DOCS_URL,
target: "_blank",
rel: "noopener noreferrer",
className: "hermes-kanban-docs-link",
title: "Open Hermes Kanban docs in a new tab",
"aria-label": "Hermes Kanban documentation",
}, "?");
}
function BoardSwitcher(props) {
const list = props.boardList || [];
const current = list.find(function (b) { return b.slug === props.board; });
@@ -1152,6 +1172,7 @@
size: "sm",
className: "h-7 text-xs",
}, "+ New board"),
h(DocsLink, null),
);
}
@@ -1165,6 +1186,7 @@
value: props.board,
className: "h-8 min-w-[220px]",
"aria-label": "Switch kanban board",
title: "Boards are independent work streams. Each board has its own tasks, tenants, and assignees.",
}, selectChangeHandler(function (v) { if (v) props.onSwitch(v); })),
list.map(function (b) {
const label = b.total > 0
@@ -1178,10 +1200,12 @@
),
),
h("div", { className: "flex-1" }),
h(DocsLink, null),
h(Button, {
onClick: props.onNewClick,
size: "sm",
className: "h-8",
title: "Create a new board. Useful when you want an unrelated work stream (different project, different team, isolated scratch area).",
}, "+ New board"),
props.board !== "default"
? h(Button, {
@@ -1326,7 +1350,8 @@
const tenants = (props.board && props.board.tenants) || [];
const assignees = (props.board && props.board.assignees) || [];
return h("div", { className: "flex flex-wrap items-end gap-3" },
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Fuzzy-match tasks by id, title, or description. Matches across all columns." },
h(Label, { className: "text-xs text-muted-foreground" }, "Search"),
h(Input, {
placeholder: "Filter cards…",
@@ -1335,7 +1360,8 @@
className: "w-56 h-8",
}),
),
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Tenants are free-form tags on a task (e.g. customer, project, team). Set them via the task drawer or kanban_create." },
h(Label, { className: "text-xs text-muted-foreground" }, "Tenant"),
h(Select, Object.assign({
value: props.tenantFilter,
@@ -1347,7 +1373,8 @@
}),
),
),
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Filter by assigned Hermes profile. Profiles are the named agent identities that claim and work on tasks." },
h(Label, { className: "text-xs text-muted-foreground" }, "Assignee"),
h(Select, Object.assign({
value: props.assigneeFilter,
@@ -1359,7 +1386,8 @@
}),
),
),
h("label", { className: "flex items-center gap-2 text-xs" },
h("label", { className: "flex items-center gap-2 text-xs",
title: "Include archived tasks in the board view. Archived tasks are hidden by default." },
h("input", {
type: "checkbox",
checked: props.includeArchived,
@@ -1380,10 +1408,12 @@
h(Button, {
onClick: props.onNudgeDispatch,
size: "sm",
title: "Wake the dispatcher to claim ready tasks now instead of waiting for the next tick. Use this after adding tasks if you want them picked up immediately.",
}, "Nudge dispatcher"),
h(Button, {
onClick: props.onRefresh,
size: "sm",
title: "Reload the board from the database. The board auto-refreshes on task events; this is for forcing a re-read.",
}, "Refresh"),
);
}
@@ -1400,6 +1430,7 @@
h(Button, {
onClick: function () { props.onApply({ status: "ready" }); },
size: "sm",
title: "Move selected tasks to Ready. Ready tasks are picked up by the dispatcher on the next tick.",
}, "→ ready"),
h(Button, {
onClick: function () {
@@ -1407,6 +1438,7 @@
`Mark ${props.count} task(s) as done?`);
},
size: "sm",
title: "Mark selected tasks as done. Releases any claims and unblocks dependent children. You'll be asked for a completion summary.",
}, "Complete"),
h(Button, {
onClick: function () {
@@ -1414,8 +1446,10 @@
`Archive ${props.count} task(s)?`);
},
size: "sm",
title: "Archive selected tasks. They disappear from the default board view but remain in the database.",
}, "Archive"),
h("div", { className: "hermes-kanban-bulk-reassign" },
h("div", { className: "hermes-kanban-bulk-reassign",
title: "Reassign selected tasks to a different Hermes profile. Pick a profile (or unassign) and click Apply." },
h(Select, {
value: assignee,
onChange: function (e) { setAssignee(e.target.value); },
@@ -1435,12 +1469,14 @@
},
disabled: !assignee,
size: "sm",
title: "Apply the selected assignee to all selected tasks.",
}, "Apply"),
),
h("div", { className: "flex-1" }),
h(Button, {
onClick: props.onClear,
size: "sm",
title: "Deselect all tasks and hide this bar.",
}, "Clear"),
);
}
@@ -1521,11 +1557,13 @@
onDragLeave: handleDragLeave,
onDrop: handleDrop,
},
h("div", { className: "hermes-kanban-column-header" },
h("div", { className: "hermes-kanban-column-header",
title: COLUMN_HELP[props.column.name] || "" },
h("span", { className: cn("hermes-kanban-dot", COLUMN_DOT[props.column.name]) }),
h("span", { className: "hermes-kanban-column-label" },
COLUMN_LABEL[props.column.name] || props.column.name),
h("span", { className: "hermes-kanban-column-count" },
h("span", { className: "hermes-kanban-column-count",
title: `${props.column.tasks.length} task${props.column.tasks.length === 1 ? "" : "s"} in this column` },
props.column.tasks.length),
h("button", {
type: "button",
@@ -1652,7 +1690,8 @@
onClick: function (e) { e.stopPropagation(); },
title: "Select for bulk actions",
}),
h("span", { className: "hermes-kanban-card-id" }, t.id),
h("span", { className: "hermes-kanban-card-id",
title: `Task id: ${t.id}. Use this id with kanban_show, /kanban show, or hermes kanban show.` }, t.id),
t.warnings && t.warnings.count > 0
? h("span", {
className: cn(
@@ -1669,10 +1708,12 @@
t.warnings.highest_severity === "error" ? "!!" : "⚠")
: null,
t.priority > 0
? h(Badge, { className: "hermes-kanban-priority" }, `P${t.priority}`)
? h(Badge, { className: "hermes-kanban-priority",
title: `Priority ${t.priority}. Higher-priority tasks are claimed first by the dispatcher.` }, `P${t.priority}`)
: null,
t.tenant
? h(Badge, { variant: "outline", className: "hermes-kanban-tag" }, t.tenant)
? h(Badge, { variant: "outline", className: "hermes-kanban-tag",
title: `Tenant: ${t.tenant}. Free-form tag for grouping tasks (customer, project, team).` }, t.tenant)
: null,
progress
? h("span", {
@@ -1687,16 +1728,21 @@
h("div", { className: "hermes-kanban-card-title" }, t.title || "(untitled)"),
h("div", { className: "hermes-kanban-card-row hermes-kanban-card-meta" },
t.assignee
? h("span", { className: "hermes-kanban-assignee" }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned" }, "unassigned"),
? h("span", { className: "hermes-kanban-assignee",
title: `Assigned to Hermes profile @${t.assignee}` }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned",
title: "No profile assigned. The dispatcher will pick one from available profiles when the task is Ready." }, "unassigned"),
t.comment_count > 0
? h("span", { className: "hermes-kanban-count" }, "💬 ", t.comment_count)
? h("span", { className: "hermes-kanban-count",
title: `${t.comment_count} comment${t.comment_count === 1 ? "" : "s"} on this task` }, "💬 ", t.comment_count)
: null,
t.link_counts && (t.link_counts.parents + t.link_counts.children) > 0
? h("span", { className: "hermes-kanban-count" },
? h("span", { className: "hermes-kanban-count",
title: `${t.link_counts.parents} parent${t.link_counts.parents === 1 ? "" : "s"}, ${t.link_counts.children} child${t.link_counts.children === 1 ? "" : "ren"}. Children stay blocked until their parent is done.` },
"↔ ", t.link_counts.parents + t.link_counts.children)
: null,
h("span", { className: "hermes-kanban-ago" },
h("span", { className: "hermes-kanban-ago",
title: t.created_at ? `Created ${t.created_at}` : "" },
timeAgo ? timeAgo(t.created_at) : ""),
),
),
@@ -1777,6 +1823,9 @@
onChange: function (e) { setAssignee(e.target.value); },
placeholder: props.columnName === "triage" ? "specifier" : "assignee",
className: "h-7 text-xs flex-1",
title: props.columnName === "triage"
? "Hermes profile that will spec this task (default: the dispatcher's configured specifier). Leave blank to let the dispatcher pick."
: "Hermes profile to assign. Leave blank and the dispatcher will pick from available profiles when the task is Ready.",
}),
h(Input, {
type: "number",
@@ -1784,6 +1833,7 @@
onChange: function (e) { setPriority(e.target.value); },
placeholder: "pri",
className: "h-7 text-xs w-16",
title: "Priority. Higher-priority tasks are claimed first by the dispatcher. 0 = default.",
}),
),
h(Input, {
@@ -1815,6 +1865,7 @@
value: parent,
onChange: function (e) { setParent(e.target.value); },
className: "h-7 text-xs",
title: "Optional parent task. A child stays blocked in its current column until the parent is marked done.",
},
h(SelectOption, { value: "" }, "— no parent —"),
(props.allTasks || []).map(function (t) {
@@ -1905,6 +1956,29 @@
}).then(function () { load(); props.onRefresh(); });
};
// Triage specifier — calls the auxiliary LLM to flesh out a rough
// idea in the Triage column into a concrete spec (title + body with
// goal, approach, acceptance criteria) and promotes it to todo.
// Not a PATCH: runs through a dedicated POST endpoint because the
// LLM call can take tens of seconds, and its outcome is richer than
// a status flip (may update title AND body AND emit an audit
// comment — or fail with a human-readable reason that the UI
// surfaces inline without treating it as an HTTP error).
const doSpecify = function () {
return SDK.fetchJSON(
withBoard(`${API}/tasks/${encodeURIComponent(props.taskId)}/specify`, boardSlug),
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({}),
}
).then(function (res) {
load();
props.onRefresh();
return res;
});
};
const addLink = function (parentId) {
return SDK.fetchJSON(withBoard(`${API}/links`, boardSlug), {
method: "POST",
@@ -1994,6 +2068,7 @@
assignees: props.assignees || [],
boardSlug: boardSlug,
onPatch: doPatch,
onSpecify: doSpecify,
onAddParent: addLink,
onRemoveParent: removeLink,
onAddChild: addChild,
@@ -2062,7 +2137,11 @@
}) : null,
t.created_by ? h(MetaRow, { label: "Created by", value: t.created_by }) : null,
),
h(StatusActions, { task: t, onPatch: props.onPatch }),
h(StatusActions, {
task: t,
onPatch: props.onPatch,
onSpecify: props.onSpecify,
}),
h(DiagnosticsSection, {
task: t,
boardSlug: props.boardSlug,
@@ -2495,6 +2574,8 @@
function StatusActions(props) {
const t = props.task;
const [specifyBusy, setSpecifyBusy] = useState(false);
const [specifyMsg, setSpecifyMsg] = useState(null);
const b = function (label, patch, enabled, confirmMsg) {
return h(Button, {
onClick: function () { if (enabled !== false) props.onPatch(patch, { confirm: confirmMsg }); },
@@ -2502,22 +2583,67 @@
size: "sm",
}, label);
};
return h("div", { className: "hermes-kanban-actions" },
b("→ triage", { status: "triage" }, t.status !== "triage"),
b("→ ready", { status: "ready" }, t.status !== "ready"),
// No direct → running button: /tasks/:id PATCH rejects status=running
// with 400 (issue #19535). Tasks enter running only through the
// dispatcher's claim_task path, which atomically creates the run row,
// claim lock, and worker process metadata.
b("Block", { status: "blocked" },
t.status === "running" || t.status === "ready",
DESTRUCTIVE_TRANSITIONS.blocked),
b("Unblock", { status: "ready" }, t.status === "blocked"),
b("Complete", { status: "done" },
t.status === "running" || t.status === "ready" || t.status === "blocked",
DESTRUCTIVE_TRANSITIONS.done),
b("Archive", { status: "archived" }, t.status !== "archived",
DESTRUCTIVE_TRANSITIONS.archived),
// "Specify" appears only when the task is in the Triage column — the
// one column where an auxiliary LLM pass is meaningful. Elsewhere
// the backend would return ok:false with "not in triage" anyway,
// so hiding the button keeps the action row uncluttered.
const specifyButton = (t.status === "triage" && props.onSpecify)
? h(Button, {
onClick: function () {
if (specifyBusy) return;
setSpecifyBusy(true);
setSpecifyMsg(null);
props.onSpecify().then(function (res) {
if (res && res.ok) {
const suffix = res.new_title
? ` — retitled: ${res.new_title}`
: "";
setSpecifyMsg({ ok: true, text: `Specified${suffix}` });
} else {
setSpecifyMsg({
ok: false,
text: "Specify failed: " + ((res && res.reason) || "unknown error"),
});
}
}).catch(function (err) {
setSpecifyMsg({
ok: false,
text: "Specify failed: " + (err.message || String(err)),
});
}).then(function () {
setSpecifyBusy(false);
});
},
disabled: specifyBusy,
size: "sm",
}, specifyBusy ? "Specifying…" : "✨ Specify")
: null;
return h("div", null,
h("div", { className: "hermes-kanban-actions" },
specifyButton,
b("→ triage", { status: "triage" }, t.status !== "triage"),
b("→ ready", { status: "ready" }, t.status !== "ready"),
// No direct → running button: /tasks/:id PATCH rejects status=running
// with 400 (issue #19535). Tasks enter running only through the
// dispatcher's claim_task path, which atomically creates the run row,
// claim lock, and worker process metadata.
b("Block", { status: "blocked" },
t.status === "running" || t.status === "ready",
DESTRUCTIVE_TRANSITIONS.blocked),
b("Unblock", { status: "ready" }, t.status === "blocked"),
b("Complete", { status: "done" },
t.status === "running" || t.status === "ready" || t.status === "blocked",
DESTRUCTIVE_TRANSITIONS.done),
b("Archive", { status: "archived" }, t.status !== "archived",
DESTRUCTIVE_TRANSITIONS.archived),
),
specifyMsg ? h("div", {
className: specifyMsg.ok
? "hermes-kanban-msg-ok"
: "hermes-kanban-msg-err",
}, specifyMsg.text) : null,
);
}
+46
@@ -402,6 +402,26 @@
gap: 0.3rem;
}
/* Specifier result banner — sits directly under the status action row. */
.hermes-kanban-msg-ok,
.hermes-kanban-msg-err {
margin-top: 0.4rem;
padding: 0.35rem 0.55rem;
border-radius: 0.375rem;
font-size: 0.85rem;
line-height: 1.3;
}
.hermes-kanban-msg-ok {
background: rgba(46, 160, 67, 0.12);
color: #2ea043;
border: 1px solid rgba(46, 160, 67, 0.35);
}
.hermes-kanban-msg-err {
background: rgba(248, 81, 73, 0.12);
color: #f85149;
border: 1px solid rgba(248, 81, 73, 0.35);
}
/* ---- Home channel subscription toggles (per-platform, per-task) ----- */
.hermes-kanban-home-subs {
@@ -871,6 +891,32 @@
display: flex;
justify-content: flex-end;
padding: 0 0.25rem;
gap: 0.5rem;
align-items: center;
}
.hermes-kanban-docs-link {
display: inline-flex;
align-items: center;
justify-content: center;
width: 1.5rem;
height: 1.5rem;
border-radius: 9999px;
font-size: 0.75rem;
font-weight: 600;
line-height: 1;
color: var(--color-muted-foreground, rgba(180, 180, 200, 0.8));
background: var(--color-card-subtle, rgba(255, 255, 255, 0.04));
border: 1px solid var(--color-border, rgba(120, 120, 140, 0.25));
text-decoration: none;
cursor: help;
transition: color 0.15s, background 0.15s, border-color 0.15s;
}
.hermes-kanban-docs-link:hover,
.hermes-kanban-docs-link:focus-visible {
color: var(--color-foreground, #e7e7ee);
background: var(--color-card, rgba(255, 255, 255, 0.08));
border-color: var(--color-border, rgba(160, 160, 190, 0.45));
outline: none;
}
.hermes-kanban-dialog-backdrop {
position: fixed;
+56
@@ -30,6 +30,7 @@ import asyncio
import hmac
import json
import logging
import os
import sqlite3
import time
from dataclasses import asdict
@@ -1011,6 +1012,61 @@ def reclaim_task_endpoint(
conn.close()
class SpecifyBody(BaseModel):
"""Optional author override. Nothing else is configurable from the
dashboard: model + prompt come from ``auxiliary.triage_specifier``
in config.yaml, same as the CLI."""
author: Optional[str] = None
@router.post("/tasks/{task_id}/specify")
def specify_task_endpoint(
task_id: str,
payload: SpecifyBody,
board: Optional[str] = Query(None),
):
"""Flesh out a triage-column task via the auxiliary LLM and promote
it to ``todo``. Maps 1:1 to ``hermes kanban specify <task_id>``.
Returns the outcome shape used by the CLI: ``{ok, task_id, reason,
new_title}``. A non-OK outcome is NOT an HTTP error: the UI renders
the reason inline (e.g. "no auxiliary client configured") so the
operator knows what to fix and can retry without a page reload.
This endpoint runs in FastAPI's threadpool (sync ``def``) because
the underlying LLM call can take tens of seconds to minutes on
reasoning models, which would block the event loop if we used
``async def`` without an explicit ``run_in_executor``.
"""
board = _resolve_board(board)
# Pin the board for the duration of this call so the specifier module
# (which calls ``kb.connect()`` with no args) hits the right DB.
prev_env = os.environ.get("HERMES_KANBAN_BOARD")
try:
os.environ["HERMES_KANBAN_BOARD"] = board or kanban_db.DEFAULT_BOARD
# Import lazily so a missing auxiliary client at import time
# doesn't break plugin load.
from hermes_cli import kanban_specify # noqa: WPS433 (intentional)
outcome = kanban_specify.specify_task(
task_id,
author=(payload.author or None),
)
finally:
if prev_env is None:
os.environ.pop("HERMES_KANBAN_BOARD", None)
else:
os.environ["HERMES_KANBAN_BOARD"] = prev_env
return {
"ok": bool(outcome.ok),
"task_id": outcome.task_id,
"reason": outcome.reason,
"new_title": outcome.new_title,
}
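The pin-and-restore handling of ``HERMES_KANBAN_BOARD`` in the endpoint above can be factored into a small context manager. This is a minimal sketch, not code from this change: the helper name ``pinned_env`` is hypothetical, and only the environment variable name is taken from the diff.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def pinned_env(name: str, value: str) -> Iterator[None]:
    """Temporarily set os.environ[name], restoring the prior state on exit.

    Hypothetical helper: mirrors the try/finally in specify_task_endpoint.
    """
    previous: Optional[str] = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable was unset before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


if __name__ == "__main__":
    with pinned_env("HERMES_KANBAN_BOARD", "alpha"):
        print(os.environ["HERMES_KANBAN_BOARD"])
```

Because ``os.environ`` is process-wide, two overlapping requests for different boards could observe each other's value while the pin is active. A lock around the pinned section is one way to guard against that; whether the real endpoint needs it depends on how it is deployed.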
class ReassignBody(BaseModel):
profile: Optional[str] = None # "" or None = unassign
reclaim_first: bool = False
+23 -4
@@ -68,9 +68,7 @@ acp = ["agent-client-protocol>=0.9.0,<1.0"]
mistral = ["mistralai>=2.3.0,<3"]
bedrock = ["boto3>=1.35.0,<2"]
termux = [
# Tested Android / Termux path: keeps the core CLI feature-rich while
# avoiding extras that currently depend on non-Android wheels (notably
# faster-whisper -> ctranslate2 via the voice extra).
# Baseline Android / Termux path for reliable fresh installs.
"python-telegram-bot[webhooks]>=22.6,<23",
"hermes-agent[cron]",
"hermes-agent[cli]",
@@ -79,6 +77,27 @@ termux = [
"hermes-agent[honcho]",
"hermes-agent[acp]",
]
termux-all = [
# Best-effort "install all" profile for Termux: include broad extras that
# are known to resolve on Android, while intentionally excluding extras that
# currently hard-fail from missing/broken Android wheels/toolchains.
#
# Excluded for now:
# - matrix (mautrix[encryption] -> python-olm build failures on Termux)
# - voice (faster-whisper chain requires ctranslate2/av builds not packaged)
"hermes-agent[termux]",
"hermes-agent[messaging]",
"hermes-agent[slack]",
"hermes-agent[tts-premium]",
"hermes-agent[dingtalk]",
"hermes-agent[feishu]",
"hermes-agent[google]",
"hermes-agent[mistral]",
"hermes-agent[bedrock]",
"hermes-agent[homeassistant]",
"hermes-agent[sms]",
"hermes-agent[web]",
]
dingtalk = ["dingtalk-stream>=0.20,<1", "alibabacloud-dingtalk>=2.0.0", "qrcode>=7.0,<8"]
feishu = ["lark-oapi>=1.5.3,<2", "qrcode>=7.0,<8"]
google = [
@@ -135,7 +154,7 @@ hermes-agent = "run_agent:main"
hermes-acp = "acp_adapter.entry:main"
[tool.setuptools]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils"]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils", "utf8_bootstrap"]
[tool.setuptools.package-data]
hermes_cli = ["web_dist/**/*"]
+26 -2
@@ -58,6 +58,7 @@ from datetime import datetime
from pathlib import Path
from hermes_constants import get_hermes_home
from utf8_bootstrap import ensure_windows_utf8_mode
_OPENAI_CLS_CACHE: Optional[type] = None
@@ -3065,6 +3066,10 @@ class AIAgent:
) -> bool:
"""Return True when this provider/model pair should use Responses API."""
normalized_provider = (provider or "").strip().lower()
# Nous serves GPT-5.x models via its OpenAI-compatible chat
# completions endpoint; its /v1/responses endpoint returns 404.
if normalized_provider == "nous":
return False
if normalized_provider == "copilot":
try:
from hermes_cli.models import _should_use_copilot_responses_api
@@ -12127,6 +12132,14 @@ class AIAgent:
# deltas instead of double-counting them.
if self._session_db and self.session_id:
try:
# Ensure the session row exists before attempting UPDATE.
# Under concurrent load (cron/kanban), the initial
# _ensure_db_session() may have failed due to SQLite
# locking. Retry here so per-call token deltas are
# not silently lost (UPDATE on a non-existent row
# affects 0 rows without error).
if not self._session_db_created:
self._ensure_db_session()
self._session_db.update_token_counts(
self.session_id,
input_tokens=canonical_usage.input_tokens,
@@ -12145,8 +12158,14 @@ class AIAgent:
model=self.model,
api_call_count=1,
)
except Exception:
pass # never block the agent loop
except Exception as e:
# Log token persistence failures so they're
# visible in agent.log — silent loss here is
# the root cause of undercounted analytics.
logger.debug(
"Token persistence failed (session=%s, tokens=%d): %s",
self.session_id, total_tokens, e,
)
if self.verbose_logging:
logging.debug(f"Token usage: prompt={usage_dict['prompt_tokens']:,}, completion={usage_dict['completion_tokens']:,}, total={usage_dict['total_tokens']:,}")
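The comment in the hunk above relies on the fact that an SQL ``UPDATE`` against a missing row succeeds while changing nothing. The sketch below reproduces that with an in-memory SQLite database. The table and column names are illustrative assumptions, not the schema used by ``_session_db``, and ``INSERT OR IGNORE`` merely stands in for whatever ``_ensure_db_session()`` does.

```python
import sqlite3


def add_tokens(conn: sqlite3.Connection, session_id: str, delta: int) -> int:
    """Add delta to a session's counter; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
        (delta, session_id),
    )
    return cur.rowcount


def add_tokens_ensuring_row(conn: sqlite3.Connection, session_id: str, delta: int) -> int:
    """Create the session row if it is missing, then apply the update."""
    conn.execute("INSERT OR IGNORE INTO sessions (id) VALUES (?)", (session_id,))
    return add_tokens(conn, session_id, delta)


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, input_tokens INTEGER DEFAULT 0)"
)

if __name__ == "__main__":
    print(add_tokens(conn, "s1", 10))               # no row yet: nothing changes, no error
    print(add_tokens_ensuring_row(conn, "s1", 10))  # row is created, then updated
```

Checking the returned row count is another way to detect the silent-loss case without relying on a separate "created" flag.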
@@ -14465,6 +14484,11 @@ def main(
Toolset Examples:
- "research": Web search, extract, crawl + vision tools
"""
ensure_windows_utf8_mode(
module="run_agent",
entrypoint_markers=("hermes-agent", "run_agent.py"),
)
print("🤖 AI Agent with Tool Calling")
print("=" * 50)
+152 -30
@@ -65,6 +65,108 @@ function Write-Err {
Write-Host "$Message" -ForegroundColor Red
}
function Add-UserPathEntry {
param(
[string]$CurrentPath,
[string]$Entry
)
if (-not $Entry) {
return $CurrentPath
}
$parts = @()
if ($CurrentPath) {
$parts = $CurrentPath -split ";" | Where-Object { $_ -and $_.Trim() }
}
$normalizedEntry = $Entry.Trim().TrimEnd("\")
foreach ($part in $parts) {
if ($part.Trim().TrimEnd("\") -ieq $normalizedEntry) {
return $CurrentPath
}
}
if ($CurrentPath) {
return "$Entry;$CurrentPath"
}
return $Entry
}
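For readers less familiar with PowerShell, the idempotent prepend performed by ``Add-UserPathEntry`` can be sketched in Python as follows. This is an illustrative port rather than part of the change. The function name is hypothetical, and ``lower()`` is used as an approximation of PowerShell's case-insensitive ``-ieq`` comparison.

```python
def add_path_entry(current_path: str, entry: str) -> str:
    """Prepend entry to a ';'-separated PATH unless it is already present.

    Comparison ignores case, surrounding whitespace and trailing backslashes.
    """
    if not entry:
        return current_path

    parts = []
    if current_path:
        # Drop empty segments such as those produced by ';;'.
        parts = [p for p in current_path.split(";") if p and p.strip()]

    normalized = entry.strip().rstrip("\\").lower()
    for part in parts:
        if part.strip().rstrip("\\").lower() == normalized:
            return current_path  # already present: leave PATH untouched

    return f"{entry};{current_path}" if current_path else entry


if __name__ == "__main__":
    print(add_path_entry("C:\\tools;C:\\bin", "C:\\hermes\\venv\\Scripts"))
```

Returning the original string unchanged when the entry is already present is what lets the caller detect "nothing to do" with a plain inequality check.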
function Resolve-NpmInvocation {
# Prefer npm.cmd to avoid PowerShell execution-policy failures from npm.ps1.
$npmCmd = Get-Command npm.cmd -ErrorAction SilentlyContinue
if ($npmCmd -and $npmCmd.Source) {
return @($npmCmd.Source)
}
$npm = Get-Command npm -ErrorAction SilentlyContinue
if ($npm -and $npm.Source) {
if ($npm.Source -notmatch "\.ps1$") {
return @($npm.Source)
}
$candidateCmd = [System.IO.Path]::ChangeExtension($npm.Source, ".cmd")
if (Test-Path $candidateCmd) {
return @($candidateCmd)
}
}
# Last fallback for odd PATH setups: invoke npm-cli.js directly via node.
$node = Get-Command node -ErrorAction SilentlyContinue
if ($node -and $node.Source) {
$nodeDir = Split-Path -Parent $node.Source
$candidates = @(
(Join-Path $nodeDir "node_modules\npm\bin\npm-cli.js"),
"$HermesHome\node\node_modules\npm\bin\npm-cli.js"
)
foreach ($candidate in $candidates) {
if (Test-Path $candidate) {
return @($node.Source, $candidate)
}
}
}
return $null
}
function Invoke-NpmInstallSilent {
param(
[string]$WorkingDir
)
$npmInvocation = Resolve-NpmInvocation
if (-not $npmInvocation) {
throw "npm command not found in PATH"
}
Push-Location $WorkingDir
try {
$output = @()
if ($npmInvocation.Count -eq 1) {
$output = & $npmInvocation[0] install --silent 2>&1
} else {
$output = & $npmInvocation[0] $npmInvocation[1] install --silent 2>&1
}
if ($LASTEXITCODE -ne 0) {
$lastLine = ""
if ($output) {
$lines = @($output | ForEach-Object { "$_" } | Where-Object { $_ })
if ($lines.Count -gt 0) {
$lastLine = $lines[-1]
}
}
if ($lastLine) {
throw "npm install exited with code $LASTEXITCODE: $lastLine"
}
throw "npm install exited with code $LASTEXITCODE"
}
} finally {
Pop-Location
}
}
# ============================================================================
# Dependency checks
# ============================================================================
@@ -550,11 +652,21 @@ function Install-Dependencies {
$env:VIRTUAL_ENV = "$InstallDir\venv"
}
# Install main package with all extras
try {
& $UvCmd pip install -e ".[all]" 2>&1 | Out-Null
} catch {
& $UvCmd pip install -e "." | Out-Null
# Install main package with all extras first. If that fails (for example
# due to an optional extra on this machine), fall back to the minimum
# dependency profile required for native Windows CLI + TUI operation.
& $UvCmd pip install -e ".[all]" 2>&1 | Out-Null
if ($LASTEXITCODE -ne 0) {
Write-Warn "Full extras install failed. Retrying with Windows CLI/TUI dependency set..."
& $UvCmd pip install -e ".[pty,mcp,honcho,acp]" 2>&1 | Out-Null
if ($LASTEXITCODE -ne 0) {
Write-Warn "Windows CLI/TUI extras install failed. Retrying with base package..."
& $UvCmd pip install -e "." 2>&1 | Out-Null
if ($LASTEXITCODE -ne 0) {
Pop-Location
throw "Failed to install Hermes Python dependencies."
}
}
}
Write-Success "Main package installed"
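The three-tier install fallback above (full extras, then the Windows CLI/TUI set, then the base package) follows a general "try profiles in order, fail only when all fail" pattern. A minimal Python sketch of that pattern is shown below. The ``run_install`` callable is a stand-in assumption for invoking ``uv pip install -e <spec>`` and returning its exit code, and the function name is hypothetical.

```python
from typing import Callable, Sequence


def install_with_fallback(
    profiles: Sequence[str],
    run_install: Callable[[str], int],
) -> str:
    """Try each install spec in order; return the first one that succeeds.

    run_install is assumed to return a process exit code (0 means success).
    """
    for spec in profiles:
        if run_install(spec) == 0:
            return spec
    raise RuntimeError("Failed to install Hermes Python dependencies.")


if __name__ == "__main__":
    # Fake runner for illustration: pretend only the base package installs.
    def fake_runner(spec: str) -> int:
        return 0 if spec == "." else 1

    chosen = install_with_fallback([".[all]", ".[pty,mcp,honcho,acp]", "."], fake_runner)
    print(chosen)
```

Keeping the profile list as data makes it easy to add or reorder tiers without nesting another level of conditionals.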
@@ -586,20 +698,35 @@ function Set-PathVariable {
$hermesBin = "$InstallDir\venv\Scripts"
}
# Add the venv Scripts dir to user PATH so hermes is globally available
# On Windows, the hermes.exe in venv\Scripts\ has the venv Python baked in
# Add required bins to user PATH so hermes and --tui dependencies persist
# across new terminal sessions.
# On Windows, the hermes.exe in venv\Scripts\ has the venv Python baked in.
$currentPath = [Environment]::GetEnvironmentVariable("Path", "User")
if ($currentPath -notlike "*$hermesBin*") {
[Environment]::SetEnvironmentVariable(
"Path",
"$hermesBin;$currentPath",
"User"
)
$newPath = Add-UserPathEntry -CurrentPath $currentPath -Entry $hermesBin
if ($newPath -ne $currentPath) {
Write-Success "Added to user PATH: $hermesBin"
} else {
Write-Info "PATH already configured"
Write-Info "PATH already includes: $hermesBin"
}
$managedNodeDir = "$HermesHome\node"
$managedNodeExe = "$managedNodeDir\node.exe"
if (Test-Path $managedNodeExe) {
$pathWithNode = Add-UserPathEntry -CurrentPath $newPath -Entry $managedNodeDir
if ($pathWithNode -ne $newPath) {
Write-Success "Added managed Node.js to user PATH: $managedNodeDir"
} else {
Write-Info "PATH already includes managed Node.js"
}
$newPath = $pathWithNode
# Hint hermes_cli.main._make_tui_argv() where node lives when a managed
# install is used (it still prefers PATH when available).
[Environment]::SetEnvironmentVariable("HERMES_NODE", $managedNodeExe, "User")
$env:HERMES_NODE = $managedNodeExe
}
[Environment]::SetEnvironmentVariable("Path", $newPath, "User")
# Set HERMES_HOME so the Python code finds config/data in the right place.
# Only needed on Windows where we install to %LOCALAPPDATA%\hermes instead
@@ -612,7 +739,10 @@ function Set-PathVariable {
$env:HERMES_HOME = $HermesHome
# Update current session
$env:Path = "$hermesBin;$env:Path"
$env:Path = Add-UserPathEntry -CurrentPath $env:Path -Entry $hermesBin
if (Test-Path "$HermesHome\node\node.exe") {
$env:Path = Add-UserPathEntry -CurrentPath $env:Path -Entry "$HermesHome\node"
}
Write-Success "hermes command ready"
}
@@ -708,16 +838,14 @@ function Install-NodeDeps {
Write-Info "Skipping Node.js dependencies (Node not installed)"
return
}
Push-Location $InstallDir
if (Test-Path "package.json") {
if (Test-Path "$InstallDir\package.json") {
Write-Info "Installing Node.js dependencies (browser tools)..."
try {
npm install --silent 2>&1 | Out-Null
Invoke-NpmInstallSilent -WorkingDir $InstallDir
Write-Success "Node.js dependencies installed"
} catch {
Write-Warn "npm install failed (browser tools may not work)"
Write-Warn "Browser tools npm install could not be launched: $($_.Exception.Message)"
}
}
@@ -725,19 +853,13 @@ function Install-NodeDeps {
$tuiDir = "$InstallDir\ui-tui"
if (Test-Path "$tuiDir\package.json") {
Write-Info "Installing TUI dependencies..."
Push-Location $tuiDir
try {
npm install --silent 2>&1 | Out-Null
Invoke-NpmInstallSilent -WorkingDir $tuiDir
Write-Success "TUI dependencies installed"
} catch {
Write-Warn "TUI npm install failed (hermes --tui may not work)"
Write-Warn "TUI npm install could not be launched: $($_.Exception.Message)"
}
Pop-Location
}
Pop-Location
}
function Invoke-SetupWizard {
+56 -9
@@ -28,6 +28,10 @@ if [ -n "${PYTHONHOME:-}" ]; then
unset PYTHONHOME
fi
# Prevent uv from discovering config files (uv.toml, pyproject.toml) from the
# wrong user's home directory when running under sudo -u <user>. See #21269.
export UV_NO_CONFIG=1
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
@@ -615,6 +619,41 @@ install_node() {
HAS_NODE=true
}
check_network_prerequisites() {
log_info "Checking internet connectivity for package install and web tools..."
local url
local failed=false
local checks=("https://pypi.org/simple/" "https://duckduckgo.com/")
if ! command -v curl >/dev/null 2>&1; then
log_warn "curl not found; skipping connectivity probes"
return 0
fi
for url in "${checks[@]}"; do
if ! curl -fsSI --max-time 8 "$url" >/dev/null 2>&1; then
failed=true
log_warn "Could not reach $url"
fi
done
if [ "$failed" = false ]; then
log_success "Internet connectivity looks good"
return 0
fi
if [ "$DISTRO" = "termux" ]; then
log_warn "Termux network prerequisites may be incomplete."
log_info "Try: pkg install -y ca-certificates curl && pkg update"
log_info "If mirrors are stale: termux-change-repo"
log_info "Then test: curl -I https://pypi.org/simple/ && curl -I https://duckduckgo.com/"
else
log_warn "Network checks failed. Hermes install may complete, but web search and dependency downloads can fail."
log_info "Verify internet/DNS and retry if pip install fails."
fi
}
install_system_packages() {
# Detect what's missing
HAS_RIPGREP=false
@@ -642,7 +681,7 @@ install_system_packages() {
# Termux always needs the Android build toolchain for the tested pip path,
# even when ripgrep/ffmpeg are already present.
if [ "$DISTRO" = "termux" ]; then
local termux_pkgs=(clang rust make pkg-config libffi openssl)
local termux_pkgs=(clang rust make pkg-config libffi openssl ca-certificates curl)
if [ "$need_ripgrep" = true ]; then
termux_pkgs+=("ripgrep")
fi
@@ -945,17 +984,24 @@ install_deps() {
fi
"$PIP_PYTHON" -m pip install --upgrade pip setuptools wheel >/dev/null
if ! "$PIP_PYTHON" -m pip install -e '.[termux]' -c constraints-termux.txt; then
log_warn "Termux feature install (.[termux]) failed, trying base install..."
if ! "$PIP_PYTHON" -m pip install -e '.' -c constraints-termux.txt; then
log_error "Package installation failed on Termux."
log_info "Ensure these packages are installed: pkg install clang rust make pkg-config libffi openssl"
log_info "Then re-run: cd $INSTALL_DIR && python -m pip install -e '.[termux]' -c constraints-termux.txt"
exit 1
# Try the broad Termux profile first (best-effort "install all" for Android),
# then fall back to the conservative Termux baseline, then base package.
if ! "$PIP_PYTHON" -m pip install -e '.[termux-all]' -c constraints-termux.txt; then
log_warn "Termux broad profile (.[termux-all]) failed, trying baseline Termux profile..."
if ! "$PIP_PYTHON" -m pip install -e '.[termux]' -c constraints-termux.txt; then
log_warn "Termux baseline profile (.[termux]) failed, trying base install..."
if ! "$PIP_PYTHON" -m pip install -e '.' -c constraints-termux.txt; then
log_error "Package installation failed on Termux."
log_info "Ensure these packages are installed: pkg install clang rust make pkg-config libffi openssl ca-certificates curl"
log_info "Then re-run: cd $INSTALL_DIR && python -m pip install -e '.[termux-all]' -c constraints-termux.txt"
exit 1
fi
fi
fi
log_success "Main package installed"
log_info "Termux note: matrix e2ee and local faster-whisper extras are excluded from .[termux-all] due to upstream Android wheel/toolchain blockers."
log_info "Termux note: browser/WhatsApp tooling is not installed by default; see the Termux guide for optional follow-up steps."
if [ -d "tinker-atropos" ] && [ -f "tinker-atropos/pyproject.toml" ]; then
@@ -1047,7 +1093,7 @@ setup_path() {
log_warn "hermes entry point not found at $HERMES_BIN"
log_info "This usually means the pip install didn't complete successfully."
if [ "$DISTRO" = "termux" ]; then
log_info "Try: cd $INSTALL_DIR && python -m pip install -e '.[termux]' -c constraints-termux.txt"
log_info "Try: cd $INSTALL_DIR && python -m pip install -e '.[termux-all]' -c constraints-termux.txt"
else
log_info "Try: cd $INSTALL_DIR && uv pip install -e '.[all]'"
fi
@@ -1570,6 +1616,7 @@ main() {
check_python
check_git
check_node
check_network_prerequisites
install_system_packages
clone_repo
+5 -1
@@ -55,8 +55,10 @@ AUTHOR_MAP = {
"127238744+teknium1@users.noreply.github.com": "teknium1",
"128259593+Gutslabs@users.noreply.github.com": "Gutslabs",
"50326054+nocturnum91@users.noreply.github.com": "nocturnum91",
"223003280+Abd0r@users.noreply.github.com": "Abd0r",
"abdielv@proton.me": "AJV20",
"mason@growagainorchids.com": "masonjames",
"ytchen0719@gmail.com": "liquidchen",
"am@studio1.tailb672fe.ts.net": "subtract0",
"axmaiqiu@gmail.com": "qWaitCrypto",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
@@ -77,6 +79,7 @@ AUTHOR_MAP = {
"dengtaoyuan@dengtaoyuandeMac-mini.local": "dengtaoyuan450-a11y",
"ysfalweshcan@gmail.com": "Junass1",
"bartokmagic@proton.me": "Bartok9",
"androidhtml@yandex.com": "hllqkb",
"25840394+Bongulielmi@users.noreply.github.com": "Bongulielmi",
"jonathan.troyer@overmatch.com": "JTroyerOvermatch",
"harryykyle1@gmail.com": "hharry11",
@@ -424,9 +427,10 @@ AUTHOR_MAP = {
"camilo@tekelala.com": "tekelala",
"vincentcharlebois@gmail.com": "vincentcharlebois",
"aryan@synvoid.com": "aryansingh",
"johnsonblake1@gmail.com": "blakejohnson",
"johnsonblake1@gmail.com": "voteblake",
"hcn518@gmail.com": "pedh",
"haileymarshall005@gmail.com": "haileymarshall",
"bennet.yr.wang@gmail.com": "BennetYrWang",
"greer.guthrie@gmail.com": "g-guthrie",
"kennyx102@gmail.com": "bobashopcashier",
"77253505+bobashopcashier@users.noreply.github.com": "bobashopcashier",
+4
@@ -29,6 +29,10 @@ NC='\033[0m'
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
# Prevent uv from discovering config files (uv.toml, pyproject.toml) from the
# wrong user's home directory when running under sudo -u <user>. See #21269.
export UV_NO_CONFIG=1
PYTHON_VERSION="3.11"
is_termux() {
+124 -1
@@ -1,5 +1,14 @@
import base64
import pytest
from acp.schema import ImageContentBlock, TextContentBlock
from acp.schema import (
BlobResourceContents,
EmbeddedResourceContentBlock,
ImageContentBlock,
ResourceContentBlock,
TextContentBlock,
TextResourceContents,
)
from acp_adapter.server import HermesACPAgent, _content_blocks_to_openai_user_content
@@ -27,6 +36,48 @@ def test_text_only_acp_blocks_stay_string_for_legacy_prompt_path():
assert content == "/help"
def test_acp_resource_link_file_is_inlined_as_text(tmp_path):
attached = tmp_path / "notes.md"
attached.write_text("# Notes\n\nAttached file body", encoding="utf-8")
content = _content_blocks_to_openai_user_content([
TextContentBlock(type="text", text="Please read this file"),
ResourceContentBlock(
type="resource_link",
name="notes.md",
title="Project notes",
uri=attached.as_uri(),
mimeType="text/markdown",
),
])
assert content == (
"Please read this file\n"
"[Attached file: Project notes (notes.md)]\n"
f"URI: {attached.as_uri()}\n\n"
"# Notes\n\nAttached file body"
)
def test_acp_embedded_text_resource_is_inlined_as_text():
content = _content_blocks_to_openai_user_content([
EmbeddedResourceContentBlock(
type="resource",
resource=TextResourceContents(
uri="file:///workspace/todo.txt",
mimeType="text/plain",
text="first\nsecond",
),
),
])
assert content == (
"[Attached file: todo.txt]\n"
"URI: file:///workspace/todo.txt\n\n"
"first\nsecond"
)
@pytest.mark.asyncio
async def test_initialize_advertises_image_prompt_capability():
response = await HermesACPAgent().initialize()
@@ -34,3 +85,75 @@ async def test_initialize_advertises_image_prompt_capability():
assert response.agent_capabilities is not None
assert response.agent_capabilities.prompt_capabilities is not None
assert response.agent_capabilities.prompt_capabilities.image is True
# 1x1 transparent PNG — smallest valid image payload for inlining tests.
_ONE_PX_PNG = bytes.fromhex(
"89504e470d0a1a0a0000000d49484452000000010000000108060000001f15c4"
"890000000a49444154789c6300010000000500010d0a2db40000000049454e44ae426082"
)
def test_acp_resource_link_image_file_is_inlined_as_image_url(tmp_path):
attached = tmp_path / "shot.png"
attached.write_bytes(_ONE_PX_PNG)
content = _content_blocks_to_openai_user_content([
TextContentBlock(type="text", text="Look at this screenshot"),
ResourceContentBlock(
type="resource_link",
name="shot.png",
uri=attached.as_uri(),
mimeType="image/png",
),
])
assert isinstance(content, list)
# [user text, image header, image_url]
assert content[0] == {"type": "text", "text": "Look at this screenshot"}
assert content[1]["type"] == "text"
assert "[Attached image: shot.png]" in content[1]["text"]
assert content[2]["type"] == "image_url"
expected_url = "data:image/png;base64," + base64.b64encode(_ONE_PX_PNG).decode("ascii")
assert content[2]["image_url"]["url"] == expected_url
def test_acp_resource_link_image_mime_inferred_from_suffix(tmp_path):
"""No mimeType sent — should still be recognised as image by file suffix."""
attached = tmp_path / "pic.jpg"
attached.write_bytes(_ONE_PX_PNG) # content doesn't matter for the code path
content = _content_blocks_to_openai_user_content([
ResourceContentBlock(
type="resource_link",
name="pic.jpg",
uri=attached.as_uri(),
),
])
assert isinstance(content, list)
image_parts = [p for p in content if p.get("type") == "image_url"]
assert len(image_parts) == 1
assert image_parts[0]["image_url"]["url"].startswith("data:image/jpeg;base64,")
def test_acp_embedded_blob_image_is_inlined_as_image_url():
b64 = base64.b64encode(_ONE_PX_PNG).decode("ascii")
content = _content_blocks_to_openai_user_content([
EmbeddedResourceContentBlock(
type="resource",
resource=BlobResourceContents(
uri="file:///tmp/embed.png",
mimeType="image/png",
blob=b64,
),
),
])
assert isinstance(content, list)
assert content[0]["type"] == "text"
assert "[Attached image: embed.png]" in content[0]["text"]
assert content[1] == {
"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{b64}"},
}
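The image tests above expect attachments to arrive as ``data:<mime>;base64,<payload>`` URLs, with the MIME type inferred from the file suffix when the client does not send one. The sketch below shows one way to build such a URL using only the standard library. It is not the implementation in ``acp_adapter.server``, which this diff does not show, and the helper name ``to_data_url`` is hypothetical.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional


def to_data_url(path: Path, mime_type: Optional[str] = None) -> str:
    """Encode a file as a data: URL, guessing the MIME type from its suffix."""
    mime = (
        mime_type
        or mimetypes.guess_type(path.name)[0]
        or "application/octet-stream"  # fallback when the suffix is unknown
    )
    payload = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "pic.jpg"
        sample.write_bytes(b"\x00\x01")
        print(to_data_url(sample))
```

Note that the guess is based on the file name only, which matches the "content doesn't matter for the code path" remark in the suffix-inference test.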
+147
@@ -0,0 +1,147 @@
from __future__ import annotations
from types import SimpleNamespace
import pytest
from gateway.config import Platform
from gateway.platforms.base import MessageEvent, MessageType
from gateway.run import GatewayRunner
from gateway.session import SessionSource
from hermes_cli.goals import CONTINUATION_PROMPT_TEMPLATE
class FakeAdapter:
def __init__(self):
self.calls = []
self.callbacks = {}
self._active_sessions = {}
async def send(self, chat_id, content, reply_to=None, metadata=None):
self.calls.append(
{
"chat_id": chat_id,
"content": content,
"reply_to": reply_to,
"metadata": metadata,
}
)
return SimpleNamespace(success=True)
def register_post_delivery_callback(self, session_key, callback, *, generation=None):
self.callbacks[session_key] = (generation, callback)
def _goal_continuation_event(source, goal="finish the task"):
return MessageEvent(
text=CONTINUATION_PROMPT_TEMPLATE.format(goal=goal),
message_type=MessageType.TEXT,
source=source,
)
@pytest.mark.asyncio
async def test_goal_status_notice_uses_adapter_send_with_thread_metadata():
"""Regression: /goal judge status must use BasePlatformAdapter.send().
The old implementation checked for a non-existent send_message() method,
so the goal could be marked done in state_meta without the visible
"✓ Goal achieved" status line being delivered to Discord/Telegram.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
runner.adapters = {Platform.DISCORD: adapter}
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
)
await runner._send_goal_status_notice(source, "✓ Goal achieved: done")
assert adapter.calls == [
{
"chat_id": "parent-channel",
"content": "✓ Goal achieved: done",
"reply_to": None,
"metadata": {"thread_id": "thread-123"},
}
]
@pytest.mark.asyncio
async def test_goal_status_notice_defers_until_post_delivery_callback():
"""Regression: goal status must appear after the agent's visible reply.
_post_turn_goal_continuation runs before BasePlatformAdapter sends the
returned final response. It should therefore register a post-delivery
callback, not send the judge status immediately.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
runner.adapters = {Platform.DISCORD: adapter}
runner.config = SimpleNamespace(group_sessions_per_user=True, thread_sessions_per_user=False)
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
user_id="user-1",
)
await runner._defer_goal_status_notice_after_delivery(source, "✓ Goal achieved: done")
assert adapter.calls == []
assert len(adapter.callbacks) == 1
_, callback = next(iter(adapter.callbacks.values()))
result = callback()
if hasattr(result, "__await__"):
await result
assert adapter.calls == [
{
"chat_id": "parent-channel",
"content": "✓ Goal achieved: done",
"reply_to": None,
"metadata": {"thread_id": "thread-123"},
}
]
def test_clear_goal_pending_continuations_removes_slot_and_overflow_only():
"""Regression: /goal pause/clear must cancel queued self-continuations.
A user-issued /goal pause can arrive after the judge queued the next
continuation but before that queued turn runs. The queued synthetic goal
continuation should be removed without dropping normal user /queue items.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
adapter._pending_messages = {}
runner._queued_events = {}
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
)
session_key = "discord:parent-channel:thread-123"
normal_event = MessageEvent(
text="normal queued user message",
message_type=MessageType.TEXT,
source=source,
)
adapter._pending_messages[session_key] = _goal_continuation_event(source)
runner._queued_events[session_key] = [
normal_event,
_goal_continuation_event(source, goal="second continuation"),
]
removed = runner._clear_goal_pending_continuations(session_key, adapter)
assert removed == 2
assert adapter._pending_messages.get(session_key) is None
assert runner._queued_events[session_key] == [normal_event]
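The last test above describes the cancellation rule for /goal pause and /goal clear: drop the synthetic goal continuation from both the adapter's pending slot and the runner's overflow queue, but keep ordinary user /queue items. The sketch below illustrates that rule in isolation. It is not the real `GatewayRunner._clear_goal_pending_continuations`; `SketchEvent`, its `is_goal_continuation` flag, and `clear_goal_continuations` are hypothetical names, and how the real code recognises a continuation event is not shown in this diff.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchEvent:
    """Hypothetical stand-in for MessageEvent."""
    text: str
    is_goal_continuation: bool = False  # assumed marker, not the real mechanism


def clear_goal_continuations(
    session_key: str,
    pending: Dict[str, SketchEvent],
    queued: Dict[str, List[SketchEvent]],
) -> int:
    """Remove queued goal continuations and return how many were dropped."""
    removed = 0

    # 1. The single pending slot: drop it only if it holds a continuation.
    slot = pending.get(session_key)
    if slot is not None and slot.is_goal_continuation:
        del pending[session_key]
        removed += 1

    # 2. The overflow queue: filter out continuations, keep user messages.
    if session_key in queued:
        overflow = queued[session_key]
        kept = [event for event in overflow if not event.is_goal_continuation]
        removed += len(overflow) - len(kept)
        queued[session_key] = kept

    return removed
```

Returning the number of removed events, as the test's `removed == 2` assertion suggests the real method does, lets the caller report how many queued turns were cancelled.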
+15 -11
@@ -61,8 +61,9 @@ class _RecordingAdapter:
return _R()
def _make_runner_with_adapter():
def _make_runner_with_adapter(session_id: str = None):
from gateway.run import GatewayRunner
import uuid
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
@@ -74,9 +75,12 @@ def _make_runner_with_adapter():
runner._queued_events = {}
src = _make_source()
# Default to a unique session_id so xdist parallel runs on the same worker
# don't see each other's GoalManager state (DEFAULT_DB_PATH gets frozen at
# module-import time, defeating per-test HERMES_HOME monkeypatches).
session_entry = SessionEntry(
session_key=build_session_key(src),
session_id="goal-sess-1",
session_id=session_id or f"goal-sess-{uuid.uuid4().hex[:8]}",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
@@ -103,8 +107,8 @@ async def test_goal_verdict_done_sent_via_adapter_send(hermes_home):
mgr = GoalManager(session_entry.session_id)
mgr.set("ship the feature")
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="I shipped the feature.",
@@ -132,8 +136,8 @@ async def test_goal_verdict_continue_enqueues_continuation(hermes_home):
mgr = GoalManager(session_entry.session_id)
mgr.set("polish the docs")
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="here's a partial edit",
@@ -160,8 +164,8 @@ async def test_goal_verdict_budget_exhausted_sends_pause(hermes_home):
state.turns_used = 2
save_goal(session_entry.session_id, state)
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="still partial",
@@ -181,7 +185,7 @@ async def test_goal_verdict_skipped_when_no_active_goal(hermes_home):
"""No goal set → the hook is a no-op. Nothing is sent, nothing enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
runner._post_turn_goal_continuation(
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="anything",
@@ -207,9 +211,9 @@ async def test_goal_verdict_survives_adapter_without_send(hermes_home):
runner.adapters[Platform.TELEGRAM] = _NoSendAdapter()
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok")):
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok", False)):
# must not raise
runner._post_turn_goal_continuation(
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="whatever",
+5
@@ -378,6 +378,11 @@ def test_run_doctor_termux_treats_docker_and_browser_warnings_as_expected(monkey
assert "1) pkg install nodejs" in out
assert "2) npm install -g agent-browser" in out
assert "3) agent-browser install" in out
assert "Termux compatibility fallbacks:" in out
assert "use .[termux-all] for broad compatibility" in out
assert "Matrix E2EE extra is excluded on Termux" in out
assert "Local faster-whisper extra is excluded on Termux" in out
assert "STT fallback: use Groq Whisper (set GROQ_API_KEY) or OpenAI Whisper (set VOICE_TOOLS_OPENAI_KEY)." in out
assert "docker not found (optional)" not in out
+175 -17
@@ -40,14 +40,14 @@ class TestParseJudgeResponse:
def test_clean_json_done(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response('{"done": true, "reason": "all good"}')
done, reason, _ = _parse_judge_response('{"done": true, "reason": "all good"}')
assert done is True
assert reason == "all good"
def test_clean_json_continue(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response('{"done": false, "reason": "more work needed"}')
done, reason, _ = _parse_judge_response('{"done": false, "reason": "more work needed"}')
assert done is False
assert reason == "more work needed"
@@ -55,7 +55,7 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
raw = '```json\n{"done": true, "reason": "done"}\n```'
done, reason = _parse_judge_response(raw)
done, reason, _ = _parse_judge_response(raw)
assert done is True
assert "done" in reason
@@ -64,7 +64,7 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
raw = 'Looking at this... the agent says X. Verdict: {"done": false, "reason": "partial"}'
done, reason = _parse_judge_response(raw)
done, reason, _ = _parse_judge_response(raw)
assert done is False
assert reason == "partial"
@@ -72,24 +72,24 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
for s in ("true", "yes", "done", "1"):
done, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
done, _, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
assert done is True
for s in ("false", "no", "not yet"):
done, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
done, _, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
assert done is False
def test_malformed_json_fails_open(self):
"""Non-JSON → not done, with error-ish reason (so judge_goal can map to continue)."""
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response("this is not json at all")
done, reason, _ = _parse_judge_response("this is not json at all")
assert done is False
assert reason # non-empty
def test_empty_response(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response("")
done, reason, _ = _parse_judge_response("")
assert done is False
assert reason
@@ -103,13 +103,13 @@ class TestJudgeGoal:
def test_empty_goal_skipped(self):
from hermes_cli.goals import judge_goal
verdict, _ = judge_goal("", "some response")
verdict, _, _ = judge_goal("", "some response")
assert verdict == "skipped"
def test_empty_response_continues(self):
from hermes_cli.goals import judge_goal
verdict, _ = judge_goal("ship the thing", "")
verdict, _, _ = judge_goal("ship the thing", "")
assert verdict == "continue"
def test_no_aux_client_continues(self):
@@ -120,7 +120,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, None),
):
verdict, _ = goals.judge_goal("my goal", "my response")
verdict, _, _ = goals.judge_goal("my goal", "my response")
assert verdict == "continue"
def test_api_error_continues(self):
@@ -133,7 +133,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "response")
verdict, reason, _ = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert "judge error" in reason.lower()
@@ -152,7 +152,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "agent response")
verdict, reason, _ = goals.judge_goal("goal", "agent response")
assert verdict == "done"
assert reason == "achieved"
@@ -171,7 +171,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "agent response")
verdict, reason, _ = goals.judge_goal("goal", "agent response")
assert verdict == "continue"
assert reason == "not yet"
@@ -260,7 +260,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-1")
mgr.set("ship it")
with patch.object(goals, "judge_goal", return_value=("done", "shipped")):
with patch.object(goals, "judge_goal", return_value=("done", "shipped", False)):
decision = mgr.evaluate_after_turn("I shipped the feature.")
assert decision["verdict"] == "done"
@@ -276,7 +276,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-2", default_max_turns=5)
mgr.set("a long goal")
with patch.object(goals, "judge_goal", return_value=("continue", "more work")):
with patch.object(goals, "judge_goal", return_value=("continue", "more work", False)):
decision = mgr.evaluate_after_turn("made some progress")
assert decision["verdict"] == "continue"
@@ -294,7 +294,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-3", default_max_turns=2)
mgr.set("hard goal")
with patch.object(goals, "judge_goal", return_value=("continue", "not yet")):
with patch.object(goals, "judge_goal", return_value=("continue", "not yet", False)):
d1 = mgr.evaluate_after_turn("step 1")
assert d1["should_continue"] is True
assert mgr.state.turns_used == 1
@@ -356,3 +356,161 @@ def test_goal_command_dispatches_in_cli_registry_helpers():
assert "/goal" in COMMANDS
session_cmds = COMMANDS_BY_CATEGORY.get("Session", {})
assert "/goal" in session_cmds
# ──────────────────────────────────────────────────────────────────────
# Auto-pause on consecutive judge parse failures
# ──────────────────────────────────────────────────────────────────────
class TestJudgeParseFailureAutoPause:
"""Regression: weak judge models (e.g. deepseek-v4-flash) that return
empty strings or non-JSON prose must auto-pause the loop after N turns
instead of burning the whole turn budget."""
def test_parse_response_flags_empty_as_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, reason, parse_failed = _parse_judge_response("")
assert done is False
assert parse_failed is True
assert "empty" in reason.lower()
def test_parse_response_flags_non_json_as_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, reason, parse_failed = _parse_judge_response(
"Let me analyze whether the goal is fully satisfied based on the agent's response..."
)
assert done is False
assert parse_failed is True
assert "not json" in reason.lower()
def test_parse_response_clean_json_is_not_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, _, parse_failed = _parse_judge_response(
'{"done": false, "reason": "more work"}'
)
assert done is False
assert parse_failed is False
def test_api_error_does_not_count_as_parse_failure(self):
"""Transient network/API errors must not trip the auto-pause guard."""
from hermes_cli import goals
fake_client = MagicMock()
fake_client.chat.completions.create.side_effect = RuntimeError("connection reset")
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, _, parse_failed = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert parse_failed is False
def test_empty_judge_reply_flagged_as_parse_failure(self):
"""End-to-end: judge returns empty content → parse_failed=True."""
from hermes_cli import goals
fake_client = MagicMock()
fake_client.chat.completions.create.return_value = MagicMock(
choices=[MagicMock(message=MagicMock(content=""))]
)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, _, parse_failed = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert parse_failed is True
def test_auto_pause_after_three_consecutive_parse_failures(self, hermes_home):
"""N=3 consecutive parse failures → auto-pause with config pointer."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager, DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES
assert DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES == 3
mgr = GoalManager(session_id="parse-fail-sid-1", default_max_turns=20)
mgr.set("do a thing")
with patch.object(
goals, "judge_goal", return_value=("continue", "judge returned empty response", True)
):
d1 = mgr.evaluate_after_turn("step 1")
assert d1["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 1
d2 = mgr.evaluate_after_turn("step 2")
assert d2["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 2
d3 = mgr.evaluate_after_turn("step 3")
assert d3["should_continue"] is False
assert d3["status"] == "paused"
assert mgr.state.consecutive_parse_failures == 3
# Message points at the config surface so the user can fix it.
assert "auxiliary" in d3["message"]
assert "goal_judge" in d3["message"]
assert "config.yaml" in d3["message"]
def test_parse_failure_counter_resets_on_good_reply(self, hermes_home):
"""A single good judge reply resets the counter — transient flakes don't pause."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_id="parse-fail-sid-2", default_max_turns=20)
mgr.set("another goal")
# Two parse failures…
with patch.object(
goals, "judge_goal", return_value=("continue", "not json", True)
):
mgr.evaluate_after_turn("step 1")
mgr.evaluate_after_turn("step 2")
assert mgr.state.consecutive_parse_failures == 2
# …then one clean reply resets the counter.
with patch.object(
goals, "judge_goal", return_value=("continue", "making progress", False)
):
d = mgr.evaluate_after_turn("step 3")
assert d["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 0
def test_parse_failure_counter_not_incremented_by_api_errors(self, hermes_home):
"""API/transport errors must NOT count toward the auto-pause threshold."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_id="parse-fail-sid-3", default_max_turns=20)
mgr.set("goal")
with patch.object(
goals, "judge_goal", return_value=("continue", "judge error: RuntimeError", False)
):
for _ in range(5):
d = mgr.evaluate_after_turn("still going")
assert d["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 0
assert mgr.state.status == "active"
def test_consecutive_parse_failures_persists_across_goalmanager_reloads(
self, hermes_home
):
"""The counter must be durable so cross-session resumes see it."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager, load_goal
mgr = GoalManager(session_id="parse-fail-sid-4", default_max_turns=20)
mgr.set("persistent goal")
with patch.object(
goals, "judge_goal", return_value=("continue", "empty", True)
):
mgr.evaluate_after_turn("r")
mgr.evaluate_after_turn("r")
reloaded = load_goal("parse-fail-sid-4")
assert reloaded is not None
assert reloaded.consecutive_parse_failures == 2
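The TestJudgeParseFailureAutoPause cases above pin down a small piece of counter logic: a parse failure increments a streak, any usable reply (including a transport error, which reports parse_failed=False) resets it, and reaching the threshold pauses the goal. A minimal sketch of that rule follows. It is an illustration only, not the code in `hermes_cli.goals`; `SketchGoalState` and `record_judge_result` are hypothetical names, and persistence is left out.

```python
from dataclasses import dataclass

# Mirrors the N=3 threshold asserted in the tests above.
MAX_CONSECUTIVE_PARSE_FAILURES = 3


@dataclass
class SketchGoalState:
    """Hypothetical stand-in for the persisted goal state."""
    status: str = "active"
    consecutive_parse_failures: int = 0


def record_judge_result(state: SketchGoalState, parse_failed: bool) -> bool:
    """Update the failure streak; return True if the loop should continue."""
    if parse_failed:
        state.consecutive_parse_failures += 1
    else:
        # Any usable reply, including an API/transport error, ends the streak.
        state.consecutive_parse_failures = 0

    if state.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
        state.status = "paused"
        return False
    return True
```

Keeping the counter on the state object rather than in a local variable is what allows it to be saved with the goal, which the reload test above depends on.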
+55
@@ -286,3 +286,58 @@ def test_run_slash_reassign_with_reclaim_flag(kanban_home):
assert "Reassigned" in out, out
out2 = kc.run_slash(f"show {tid}")
assert "newbie" in out2
# ---------------------------------------------------------------------------
# /kanban specify — slash surface (same entry point CLI + gateway use)
# ---------------------------------------------------------------------------
def test_run_slash_specify_end_to_end(kanban_home, monkeypatch):
"""The /kanban specify slash command routes through run_slash, which
both the interactive CLI and every gateway platform use. This test
covers both surfaces."""
from unittest.mock import MagicMock
# Create a triage task via the same slash surface.
create_out = kc.run_slash("create 'rough idea' --triage")
import re
m = re.search(r"(t_[a-f0-9]+)", create_out)
assert m, f"no task id in: {create_out!r}"
tid = m.group(1)
# Mock the auxiliary client so we don't hit a real provider.
resp = MagicMock()
resp.choices = [MagicMock()]
resp.choices[0].message.content = (
'{"title": "Spec: rough idea", "body": "**Goal**\\nShip it."}'
)
fake_client = MagicMock()
fake_client.chat.completions.create = MagicMock(return_value=resp)
monkeypatch.setattr(
"agent.auxiliary_client.get_text_auxiliary_client",
lambda *a, **kw: (fake_client, "test-model"),
)
# Specify via slash.
out = kc.run_slash(f"specify {tid}")
assert "Specified" in out
assert tid in out
# Task is promoted and retitled.
with kb.connect() as conn:
task = kb.get_task(conn, tid)
assert task.status in {"todo", "ready"}
assert task.title == "Spec: rough idea"
def test_run_slash_specify_help_is_reachable(kanban_home):
"""`--help` on a subcommand is handled by argparse itself — it prints
to the process stdout and raises SystemExit before run_slash's output
redirection is installed, so the returned string is the usage-error
sentinel. All we're asserting here is that the subcommand is
registered (no "unknown action" error); the shape of the help text
is covered by the direct argparse tests in test_kanban_specify.py."""
out = kc.run_slash("specify --help")
# Either the usage-error sentinel (stdout swallowed by argparse) or
# a real help rendering — both mean the subcommand exists.
assert "usage error" in out.lower() or "specify" in out.lower()
+337
@@ -0,0 +1,337 @@
"""Tests for the specifier module + `hermes kanban specify` CLI surface.
The auxiliary LLM client is mocked; these tests don't hit any network or
real provider. They exercise the prompt plumbing, response parsing, DB
writes, and CLI flag surface.
"""
from __future__ import annotations
import argparse
import json as jsonlib
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from hermes_cli import kanban as kanban_cli
from hermes_cli import kanban_db as kb
from hermes_cli import kanban_specify as spec
@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
kb.init_db()
return home
def _fake_aux_response(content: str):
"""Build a minimal object shaped like an OpenAI chat.completions result.
The specifier only reads ``resp.choices[0].message.content``, so we
avoid importing the openai SDK and build the tree with MagicMock.
"""
resp = MagicMock()
resp.choices = [MagicMock()]
resp.choices[0].message.content = content
return resp
def _mock_client_returning(content: str):
client = MagicMock()
client.chat.completions.create = MagicMock(return_value=_fake_aux_response(content))
return client
def _patch_aux_client(content: str, *, model: str = "test-model"):
"""Patch get_text_auxiliary_client at its source module.
A single patch at the source is sufficient because kanban_specify imports
the function lazily inside the specify_task body, so it is resolved at call time.
"""
client = _mock_client_returning(content)
return patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(client, model),
), client
# ---------------------------------------------------------------------------
# JSON extraction helpers
# ---------------------------------------------------------------------------
def test_extract_json_blob_handles_plain_json():
raw = '{"title": "T", "body": "B"}'
assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"}
def test_extract_json_blob_handles_fenced_json():
raw = '```json\n{"title": "T", "body": "B"}\n```'
assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"}
def test_extract_json_blob_handles_prose_preamble():
raw = 'Sure! Here you go:\n{"title": "T", "body": "B"}\nThanks.'
assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"}
def test_extract_json_blob_returns_none_for_unparseable():
assert spec._extract_json_blob("no json here") is None
assert spec._extract_json_blob("") is None
assert spec._extract_json_blob("{not: valid}") is None
# ---------------------------------------------------------------------------
# specify_task (module-level entry point)
# ---------------------------------------------------------------------------
def test_specify_task_happy_path(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
content = jsonlib.dumps({
"title": "Refined rough",
"body": "**Goal**\nA concrete goal.",
})
p, _ = _patch_aux_client(content)
with p:
outcome = spec.specify_task(tid, author="ace")
assert outcome.ok is True
assert outcome.task_id == tid
assert outcome.new_title == "Refined rough"
with kb.connect() as conn:
task = kb.get_task(conn, tid)
# Parent-free → recompute_ready promotes to ready.
assert task.status == "ready"
assert task.title == "Refined rough"
assert "**Goal**" in (task.body or "")
def test_specify_task_falls_back_to_body_only_on_bad_json(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="keep title", triage=True)
# Model returned plain markdown, no JSON object.
content = "Goal: Do a thing.\nApproach: Steps here."
p, _ = _patch_aux_client(content)
with p:
outcome = spec.specify_task(tid)
assert outcome.ok is True
with kb.connect() as conn:
t = kb.get_task(conn, tid)
# Title preserved (no JSON with a title key).
assert t.title == "keep title"
# Body replaced with the raw response.
assert "Goal:" in (t.body or "")
def test_specify_task_rejects_non_triage_task(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="ready task")
p, client = _patch_aux_client("unused")
with p:
outcome = spec.specify_task(tid)
assert outcome.ok is False
assert "not in triage" in outcome.reason
# LLM must not be invoked for a non-triage task — fail cheap.
assert client.chat.completions.create.call_count == 0
def test_specify_task_unknown_id(kanban_home):
p, client = _patch_aux_client("unused")
with p:
outcome = spec.specify_task("t_nope")
assert outcome.ok is False
assert "unknown task" in outcome.reason
assert client.chat.completions.create.call_count == 0
def test_specify_task_no_aux_client_configured(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, ""),
):
outcome = spec.specify_task(tid)
assert outcome.ok is False
assert "auxiliary client" in outcome.reason
# Task must stay in triage — we never touched it.
with kb.connect() as conn:
assert kb.get_task(conn, tid).status == "triage"
def test_specify_task_llm_api_error_keeps_task_in_triage(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
client = MagicMock()
client.chat.completions.create = MagicMock(side_effect=RuntimeError("429 rate limited"))
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(client, "test-model"),
):
outcome = spec.specify_task(tid)
assert outcome.ok is False
assert "LLM error" in outcome.reason
with kb.connect() as conn:
assert kb.get_task(conn, tid).status == "triage"
def test_specify_task_empty_llm_response(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
p, _ = _patch_aux_client("")
with p:
outcome = spec.specify_task(tid)
assert outcome.ok is False
with kb.connect() as conn:
assert kb.get_task(conn, tid).status == "triage"
def test_list_triage_ids(kanban_home):
with kb.connect() as conn:
a = kb.create_task(conn, title="a", triage=True)
b = kb.create_task(conn, title="b", triage=True, tenant="proj-1")
kb.create_task(conn, title="c") # not triage — excluded
ids_all = spec.list_triage_ids()
assert set(ids_all) == {a, b}
ids_tenant = spec.list_triage_ids(tenant="proj-1")
assert ids_tenant == [b]
# ---------------------------------------------------------------------------
# CLI wiring — argparse + _cmd_specify
# ---------------------------------------------------------------------------
def _run_cli(*argv: str) -> int:
"""Invoke the `hermes kanban …` argparse surface directly."""
root = argparse.ArgumentParser()
subp = root.add_subparsers(dest="cmd")
kanban_cli.build_parser(subp)
ns = root.parse_args(["kanban", *argv])
return kanban_cli.kanban_command(ns)
def test_cli_specify_requires_id_or_all(kanban_home, capsys):
rc = _run_cli("specify")
assert rc == 2
err = capsys.readouterr().err
assert "requires a task id or --all" in err
def test_cli_specify_rejects_both_id_and_all(kanban_home, capsys):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
rc = _run_cli("specify", tid, "--all")
assert rc == 2
err = capsys.readouterr().err
assert "either a task id OR --all" in err
def test_cli_specify_single_id_success(kanban_home, capsys):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
content = jsonlib.dumps({"title": "clean", "body": "body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", tid)
assert rc == 0
out = capsys.readouterr().out
assert tid in out
assert "→ todo" in out or "-> todo" in out or "" in out
def test_cli_specify_all_success_and_json(kanban_home, capsys):
with kb.connect() as conn:
a = kb.create_task(conn, title="a", triage=True)
b = kb.create_task(conn, title="b", triage=True)
content = jsonlib.dumps({"title": "spec", "body": "body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", "--all", "--json")
assert rc == 0
lines = [l for l in capsys.readouterr().out.strip().splitlines() if l]
# One JSON object per task + nothing else.
assert len(lines) == 2
parsed = [jsonlib.loads(l) for l in lines]
ids = {row["task_id"] for row in parsed}
assert ids == {a, b}
assert all(row["ok"] for row in parsed)
def test_cli_specify_all_empty_triage_column(kanban_home, capsys):
rc = _run_cli("specify", "--all")
assert rc == 0
assert "No triage tasks" in capsys.readouterr().out
def test_cli_specify_all_returns_1_when_every_task_fails(kanban_home, capsys):
with kb.connect() as conn:
kb.create_task(conn, title="a", triage=True)
kb.create_task(conn, title="b", triage=True)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, ""), # no aux client → every task fails
):
rc = _run_cli("specify", "--all")
assert rc == 1
def test_cli_specify_tenant_filter(kanban_home, capsys):
with kb.connect() as conn:
outside = kb.create_task(conn, title="outside", triage=True)
inside = kb.create_task(
conn, title="inside", triage=True, tenant="proj-a",
)
content = jsonlib.dumps({"title": "spec", "body": "body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", "--all", "--tenant", "proj-a", "--json")
assert rc == 0
lines = [
jsonlib.loads(l)
for l in capsys.readouterr().out.strip().splitlines()
if l
]
ids = {row["task_id"] for row in lines}
assert ids == {inside}
# The outside task stays in triage.
with kb.connect() as conn:
assert kb.get_task(conn, outside).status == "triage"
# The inside task was promoted.
assert kb.get_task(conn, inside).status in {"todo", "ready"}
def test_cli_specify_author_passed_through(kanban_home, capsys):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
content = jsonlib.dumps({"title": "fresh title", "body": "fresh body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", tid, "--author", "custom-agent")
assert rc == 0
with kb.connect() as conn:
comments = kb.list_comments(conn, tid)
assert comments and comments[0].author == "custom-agent"
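The `_extract_json_blob` tests in this file expect one helper to cope with plain JSON, a fenced json block, and JSON wrapped in prose, and to return None when nothing parses. One way to get that behaviour is sketched below. This is an assumption about the approach, not the implementation in `hermes_cli.kanban_specify`; `extract_json_blob_sketch` is a hypothetical name.

```python
import json
import re
from typing import Optional


def extract_json_blob_sketch(raw: str) -> Optional[dict]:
    """Return the first JSON object found in a model reply, or None."""
    text = raw.strip()
    if not text:
        return None

    # Unwrap a surrounding triple-backtick fence, with or without "json".
    fenced = re.match(r"^`{3}(?:json)?\s*(.*?)\s*`{3}$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)

    # Try the whole text first, then the outermost {...} span inside prose.
    candidates = [text]
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

Returning None instead of raising matches the fallback test above, where a non-JSON reply is still usable as a body-only specification.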
+184
@@ -0,0 +1,184 @@
"""Tests for kb.specify_triage_task — the DB-layer atomic promotion
from the triage column to todo. LLM-free by design."""
from __future__ import annotations
from pathlib import Path
import pytest
from hermes_cli import kanban_db as kb
@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
"""Isolated HERMES_HOME with an empty kanban DB."""
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
kb.init_db()
return home
def _create_triage(conn, title="rough idea", body=None, assignee=None):
return kb.create_task(
conn,
title=title,
body=body,
assignee=assignee,
triage=True,
)
def test_specify_promotes_triage_to_todo(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough idea")
assert kb.get_task(conn, tid).status == "triage"
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
tid,
title="Refined: rough idea",
body="**Goal**\nDo the thing.",
author="specifier-bot",
)
assert ok is True
with kb.connect() as conn:
task = kb.get_task(conn, tid)
# No parents → recompute_ready should have flipped it past todo to ready.
assert task.status == "ready"
assert task.title == "Refined: rough idea"
assert "**Goal**" in (task.body or "")
def test_specify_with_open_parent_lands_in_todo_not_ready(kanban_home):
# Parent-gated specified tasks must not jump the dispatcher — they go
# to todo and wait for parent completion like any other gated task.
with kb.connect() as conn:
parent = kb.create_task(conn, title="parent work")
child = _create_triage(conn, title="child idea")
kb.link_tasks(conn, parent, child)
# After linking with an open parent, triage status should still be
# 'triage' (linking doesn't touch triage tasks).
assert kb.get_task(conn, child).status == "triage"
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
child,
body="full spec",
author="specifier",
)
assert ok is True
with kb.connect() as conn:
t = kb.get_task(conn, child)
# Parent still open → specified child sits in 'todo', not 'ready'.
assert t.status == "todo"
def test_specify_refuses_non_triage_task(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="normal task")
assert kb.get_task(conn, tid).status == "ready"
with kb.connect() as conn:
ok = kb.specify_triage_task(conn, tid, body="won't apply")
assert ok is False
with kb.connect() as conn:
# Status unchanged.
assert kb.get_task(conn, tid).status == "ready"
def test_specify_returns_false_for_unknown_id(kanban_home):
with kb.connect() as conn:
ok = kb.specify_triage_task(conn, "t_does_not_exist", body="x")
assert ok is False
def test_specify_rejects_blank_title(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough")
with kb.connect() as conn, pytest.raises(ValueError):
kb.specify_triage_task(conn, tid, title=" ", body="ok")
def test_specify_emits_event(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough")
with kb.connect() as conn:
kb.specify_triage_task(
conn, tid, title="new", body="b", author="ace"
)
with kb.connect() as conn:
events = kb.list_events(conn, tid)
kinds = [e.kind for e in events]
assert "specified" in kinds
# The specified event records which fields actually changed as a
# JSON payload under task_events.payload.
spec_ev = next(e for e in events if e.kind == "specified")
assert spec_ev.payload is not None
fields = spec_ev.payload.get("changed_fields") or []
assert "title" in fields
assert "body" in fields
def test_specify_records_audit_comment_only_when_author_given(kanban_home):
# With author → comment added.
with kb.connect() as conn:
tid1 = _create_triage(conn, title="a")
kb.specify_triage_task(
conn, tid1, title="A-spec", body="b", author="ace"
)
comments1 = kb.list_comments(conn, tid1)
assert len(comments1) == 1
assert "Specified" in comments1[0].body
assert comments1[0].author == "ace"
# Without author → no comment (silent).
with kb.connect() as conn:
tid2 = _create_triage(conn, title="b")
kb.specify_triage_task(conn, tid2, title="B-spec", body="b")
comments2 = kb.list_comments(conn, tid2)
assert comments2 == []
def test_specify_skips_comment_when_nothing_changed(kanban_home):
# Create triage task with title and body already set; pass identical
# values to specify. Should promote to todo but skip audit comment.
with kb.connect() as conn:
tid = _create_triage(conn, title="same", body="same body")
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
tid,
title="same",
body="same body",
author="ace",
)
assert ok is True
with kb.connect() as conn:
# Promoted.
assert kb.get_task(conn, tid).status in {"todo", "ready"}
# No audit comment because neither field changed.
assert kb.list_comments(conn, tid) == []
def test_specify_with_only_body_preserves_title(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="keep this title")
with kb.connect() as conn:
kb.specify_triage_task(conn, tid, body="new body only")
with kb.connect() as conn:
t = kb.get_task(conn, tid)
assert t.title == "keep this title"
assert t.body == "new body only"
def test_specify_second_call_noop_false(kanban_home):
# Promoting twice must not crash and the second call returns False
# because the task is no longer in triage.
with kb.connect() as conn:
tid = _create_triage(conn, title="once")
with kb.connect() as conn:
assert kb.specify_triage_task(conn, tid, body="spec") is True
with kb.connect() as conn:
assert kb.specify_triage_task(conn, tid, body="spec again") is False
+28 -10
@@ -323,15 +323,15 @@ def test_cmd_update_retries_optional_extras_individually_when_all_fails(monkeypa
return SimpleNamespace(stdout="main\n", stderr="", returncode=0)
if cmd == ["git", "rev-list", "HEAD..origin/main", "--count"]:
return SimpleNamespace(stdout="1\n", stderr="", returncode=0)
if cmd == ["git", "pull", "origin", "main"]:
if cmd == ["git", "pull", "--ff-only", "origin", "main"]:
return SimpleNamespace(stdout="Updating\n", stderr="", returncode=0)
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[all]", "--quiet"]:
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[all]"]:
raise CalledProcessError(returncode=1, cmd=cmd)
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".", "--quiet"]:
if cmd == ["/usr/bin/uv", "pip", "install", "-e", "."]:
return SimpleNamespace(returncode=0)
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[matrix]", "--quiet"]:
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[matrix]"]:
raise CalledProcessError(returncode=1, cmd=cmd)
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[mcp]", "--quiet"]:
if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[mcp]"]:
return SimpleNamespace(returncode=0)
# Catch-all must include stdout/stderr so consumers that parse
# output (e.g. the dashboard-restart `ps -A` scan added in the
@@ -344,10 +344,10 @@ def test_cmd_update_retries_optional_extras_individually_when_all_fails(monkeypa
install_cmds = [c for c in recorded if "pip" in c and "install" in c]
assert install_cmds == [
["/usr/bin/uv", "pip", "install", "-e", ".[all]", "--quiet"],
["/usr/bin/uv", "pip", "install", "-e", ".", "--quiet"],
["/usr/bin/uv", "pip", "install", "-e", ".[matrix]", "--quiet"],
["/usr/bin/uv", "pip", "install", "-e", ".[mcp]", "--quiet"],
["/usr/bin/uv", "pip", "install", "-e", ".[all]"],
["/usr/bin/uv", "pip", "install", "-e", "."],
["/usr/bin/uv", "pip", "install", "-e", ".[matrix]"],
["/usr/bin/uv", "pip", "install", "-e", ".[mcp]"],
]
out = capsys.readouterr().out
@@ -371,7 +371,7 @@ def test_cmd_update_succeeds_with_extras(monkeypatch, tmp_path):
return SimpleNamespace(stdout="main\n", stderr="", returncode=0)
if cmd == ["git", "rev-list", "HEAD..origin/main", "--count"]:
return SimpleNamespace(stdout="1\n", stderr="", returncode=0)
if cmd == ["git", "pull", "origin", "main"]:
if cmd == ["git", "pull", "--ff-only", "origin", "main"]:
return SimpleNamespace(stdout="Updating\n", stderr="", returncode=0)
return SimpleNamespace(returncode=0, stdout="", stderr="")
@@ -384,6 +384,24 @@ def test_cmd_update_succeeds_with_extras(monkeypatch, tmp_path):
assert ".[all]" in install_cmds[0]
def test_install_heartbeat_prints_when_dependency_install_is_silent(monkeypatch, capsys):
"""Long quiet installs should emit periodic heartbeat lines."""
def fake_run(cmd, **kwargs):
hermes_main._time.sleep(1.2)
return SimpleNamespace(returncode=0)
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
hermes_main._run_install_with_heartbeat(
["uv", "pip", "install", "-e", "."],
heartbeat_interval_seconds=1,
)
out = capsys.readouterr().out
assert "still installing dependencies" in out
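The heartbeat test above only pins down the observable behavior: a quiet install that outlasts the interval prints a "still installing dependencies" line. A minimal sketch of one way to get that behavior follows. The helper name `run_with_heartbeat`, its parameters, and the `slow_job` stand-in are hypothetical and are not taken from `hermes_main`; the real `_run_install_with_heartbeat` may be built differently.

```python
import threading
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_heartbeat(
    work: Callable[[], T],
    interval_seconds: float,
    emit: Callable[[str], None] = print,
    message: str = "still installing dependencies...",
) -> T:
    """Run ``work`` while a background thread emits a periodic heartbeat.

    Hypothetical helper for illustration; not the project's implementation.
    """
    done = threading.Event()

    def _beat() -> None:
        # Event.wait() returns False on timeout and True once the event is
        # set, so the loop emits once per interval until the work finishes.
        while not done.wait(interval_seconds):
            emit(message)

    beater = threading.Thread(target=_beat, daemon=True)
    beater.start()
    try:
        return work()
    finally:
        done.set()
        beater.join()


def slow_job() -> int:
    """Stand-in for a silent dependency install (assumed, for the demo)."""
    time.sleep(0.35)
    return 0
```

Using an `Event` instead of `time.sleep` in the heartbeat thread means the thread stops as soon as the work completes rather than sleeping out the remainder of the interval.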
# ---------------------------------------------------------------------------
# ff-only fallback to reset --hard on diverged history
# ---------------------------------------------------------------------------
@@ -1582,3 +1582,104 @@ def test_board_exposes_diagnostics_list_and_summary(client):
assert task_dict["warnings"] is not None
assert task_dict["warnings"]["highest_severity"] == "error"
assert task_dict["diagnostics"][0]["kind"] == "repeated_crashes"
# ---------------------------------------------------------------------------
# POST /tasks/:id/specify — triage specifier endpoint
# ---------------------------------------------------------------------------
def _patch_specifier_response(monkeypatch, *, content, model="test-model"):
"""Helper: install a fake auxiliary client so the specifier endpoint
can run without hitting any real provider."""
from unittest.mock import MagicMock
resp = MagicMock()
resp.choices = [MagicMock()]
resp.choices[0].message.content = content
fake_client = MagicMock()
fake_client.chat.completions.create = MagicMock(return_value=resp)
monkeypatch.setattr(
"agent.auxiliary_client.get_text_auxiliary_client",
lambda *a, **kw: (fake_client, model),
)
return fake_client
def test_specify_happy_path(client, monkeypatch):
import json as jsonlib
# Create a triage task.
t = client.post(
"/api/plugins/kanban/tasks",
json={"title": "one-liner", "triage": True},
).json()["task"]
assert t["status"] == "triage"
_patch_specifier_response(
monkeypatch,
content=jsonlib.dumps(
{"title": "Polished", "body": "**Goal**\nDo the thing."}
),
)
r = client.post(
f"/api/plugins/kanban/tasks/{t['id']}/specify",
json={"author": "ui-tester"},
)
assert r.status_code == 200
body = r.json()
assert body["ok"] is True
assert body["task_id"] == t["id"]
assert body["new_title"] == "Polished"
# Task should have moved off the triage column.
detail = client.get(f"/api/plugins/kanban/tasks/{t['id']}").json()["task"]
assert detail["status"] in {"todo", "ready"}
assert detail["title"] == "Polished"
assert "**Goal**" in (detail["body"] or "")
def test_specify_non_triage_returns_ok_false_not_http_error(client, monkeypatch):
"""The endpoint intentionally returns ``{ok: false, reason: ...}`` for
"task not in triage" rather than a 4xx; the dashboard renders the
reason inline so the user can fix it without a page reload."""
# Create a normal (ready) task — not in triage.
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
_patch_specifier_response(monkeypatch, content="unused")
r = client.post(
f"/api/plugins/kanban/tasks/{t['id']}/specify",
json={},
)
assert r.status_code == 200
body = r.json()
assert body["ok"] is False
assert "not in triage" in body["reason"]
def test_specify_no_aux_client_surfaces_reason(client, monkeypatch):
t = client.post(
"/api/plugins/kanban/tasks",
json={"title": "rough", "triage": True},
).json()["task"]
# Simulate "no auxiliary client configured".
monkeypatch.setattr(
"agent.auxiliary_client.get_text_auxiliary_client",
lambda *a, **kw: (None, ""),
)
r = client.post(
f"/api/plugins/kanban/tasks/{t['id']}/specify",
json={},
)
assert r.status_code == 200
body = r.json()
assert body["ok"] is False
assert "auxiliary client" in body["reason"]
# Task must stay in triage — nothing was touched.
detail = client.get(f"/api/plugins/kanban/tasks/{t['id']}").json()["task"]
assert detail["status"] == "triage"
+21 -2
@@ -3729,8 +3729,8 @@ class TestMaxTokensParam:
assert result == {"max_completion_tokens": 4096}
class TestAzureOpenAIRouting:
"""Verify Azure OpenAI endpoints stay on chat_completions for gpt-5.x."""
class TestGpt5ApiModeRouting:
"""Verify provider-specific GPT-5 API-mode routing."""
def test_azure_gpt5_stays_on_chat_completions(self, agent):
"""Azure serves gpt-5.x on /chat/completions — must not upgrade to codex_responses."""
@@ -3769,6 +3769,25 @@ class TestAzureOpenAIRouting:
agent.api_mode = "codex_responses"
assert agent.api_mode == "codex_responses"
def test_nous_gpt5_stays_on_chat_completions(self, agent):
"""Nous serves gpt-5.x on /chat/completions — must not upgrade to codex_responses."""
agent.provider = "nous"
agent.base_url = "https://inference-api.nousresearch.com/v1"
agent.api_mode = "chat_completions"
agent.model = "openai/gpt-5.5"
if (
agent.api_mode == "chat_completions"
and not agent._is_azure_openai_url()
and (
agent._is_direct_openai_url()
or agent._provider_model_requires_responses_api(
agent.model, provider=agent.provider,
)
)
):
agent.api_mode = "codex_responses"
assert agent.api_mode == "chat_completions"
def test_is_azure_openai_url_detection(self, agent):
assert agent._is_azure_openai_url("https://foo.openai.azure.com/openai/v1") is True
assert agent._is_azure_openai_url("https://api.openai.com/v1") is False
+61 -1
@@ -7,7 +7,12 @@ from unittest.mock import patch
import pytest
import hermes_constants
from hermes_constants import get_default_hermes_root, is_container
from hermes_constants import (
VALID_REASONING_EFFORTS,
get_default_hermes_root,
is_container,
parse_reasoning_effort,
)
class TestGetDefaultHermesRoot:
@@ -17,6 +22,7 @@ class TestGetDefaultHermesRoot:
"""When HERMES_HOME is not set, returns ~/.hermes."""
monkeypatch.delenv("HERMES_HOME", raising=False)
monkeypatch.setattr(Path, "home", lambda: tmp_path)
assert get_default_hermes_root() == tmp_path / ".hermes"
def test_hermes_home_is_native(self, tmp_path, monkeypatch):
@@ -111,3 +117,57 @@ class TestIsContainer:
# Even if we make os.path.exists return False, cached value wins
monkeypatch.setattr(os.path, "exists", lambda p: False)
assert is_container() is True
class TestParseReasoningEffort:
"""Tests for parse_reasoning_effort() — string → reasoning config dict."""
@pytest.mark.parametrize("value", ["", " ", "\t", "\n"])
def test_empty_or_whitespace_returns_none(self, value):
"""Empty / whitespace-only input falls back to caller default (None)."""
assert parse_reasoning_effort(value) is None
def test_none_disables_reasoning(self):
"""The literal "none" disables reasoning explicitly."""
assert parse_reasoning_effort("none") == {"enabled": False}
@pytest.mark.parametrize("level", list(VALID_REASONING_EFFORTS))
def test_each_valid_level(self, level):
"""Every level listed in VALID_REASONING_EFFORTS is accepted as-is."""
assert parse_reasoning_effort(level) == {"enabled": True, "effort": level}
@pytest.mark.parametrize(
"raw, expected_effort",
[
("MEDIUM", "medium"),
("High", "high"),
(" low ", "low"),
("\tXHIGH\n", "xhigh"),
("None", False),
],
)
def test_case_and_whitespace_normalized(self, raw, expected_effort):
"""Mixed case and surrounding whitespace are normalized before lookup."""
result = parse_reasoning_effort(raw)
if expected_effort is False:
assert result == {"enabled": False}
else:
assert result == {"enabled": True, "effort": expected_effort}
@pytest.mark.parametrize(
"value",
["bogus", "very-high", "max", "0", "off", "true", "default"],
)
def test_unknown_levels_return_none(self, value):
"""Unrecognized strings fall back to the caller default (None)."""
assert parse_reasoning_effort(value) is None
def test_known_supported_levels_are_documented(self):
"""Guard against silently dropping a documented level.
The docstring promises "minimal", "low", "medium", "high", "xhigh".
If someone removes one from VALID_REASONING_EFFORTS without updating
the docstring, this test will fail and force the call out.
"""
documented = {"minimal", "low", "medium", "high", "xhigh"}
assert documented.issubset(set(VALID_REASONING_EFFORTS))
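The parametrized cases above amount to a small contract: strip and lowercase the input, map empty input and unknown levels to `None`, map `"none"` to a disabled config, and pass known levels through. A sketch of a function satisfying that contract is shown below. The name `parse_reasoning_effort_sketch` is hypothetical, and the tuple of levels is an assumption based on the five levels the docstring names; the real `VALID_REASONING_EFFORTS` in `hermes_constants` may contain more.

```python
from typing import Optional

# Assumption: only the five levels named in the test docstring. The real
# constant lives in hermes_constants and may differ.
VALID_REASONING_EFFORTS = ("minimal", "low", "medium", "high", "xhigh")


def parse_reasoning_effort_sketch(value: Optional[str]) -> Optional[dict]:
    """Illustrative stand-in for parse_reasoning_effort()."""
    normalized = (value or "").strip().lower()
    if not normalized:
        # Empty or whitespace-only input: let the caller apply its default.
        return None
    if normalized == "none":
        # The literal "none" disables reasoning explicitly.
        return {"enabled": False}
    if normalized in VALID_REASONING_EFFORTS:
        return {"enabled": True, "effort": normalized}
    # Unrecognized strings also fall back to the caller default.
    return None
```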
@@ -0,0 +1,74 @@
"""Regression tests for Windows install.ps1 dependency branch handling.
These assertions lock in the critical control-flow paths needed for native
Windows CLI + TUI installs:
- Node.js install via winget, with managed ZIP fallback
- npm invocation that avoids execution-policy failures on npm.ps1
- Python dependency fallback chain for Windows CLI/TUI
- Managed Node PATH/HERMES_NODE persistence across terminal sessions
"""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
INSTALL_PS1 = REPO_ROOT / "scripts" / "install.ps1"
def test_node_install_keeps_winget_and_zip_fallback_paths() -> None:
text = INSTALL_PS1.read_text()
# Primary path: modern Windows machines with winget.
assert "if (Get-Command winget -ErrorAction SilentlyContinue)" in text
assert "winget install OpenJS.NodeJS.LTS" in text
# Fallback path: no winget / winget failure => managed ZIP install.
assert 'Write-Info "Downloading Node.js $NodeVersion binary..."' in text
assert 'Move-Item $extractedDir.FullName "$HermesHome\\node"' in text
assert '& "$HermesHome\\node\\node.exe" --version' in text
def test_system_packages_keep_winget_choco_scoop_fallback_chain() -> None:
text = INSTALL_PS1.read_text()
assert "$hasWinget = Get-Command winget -ErrorAction SilentlyContinue" in text
assert "$hasChoco = Get-Command choco -ErrorAction SilentlyContinue" in text
assert "$hasScoop = Get-Command scoop -ErrorAction SilentlyContinue" in text
assert "if ($hasWinget)" in text
assert "if ($hasChoco -and ($needRipgrep -or $needFfmpeg))" in text
assert "if ($hasScoop -and ($needRipgrep -or $needFfmpeg))" in text
def test_npm_resolution_avoids_powershell_policy_blocks() -> None:
text = INSTALL_PS1.read_text()
# Prefer npm.cmd and convert npm.ps1 -> npm.cmd when needed.
assert "function Resolve-NpmInvocation" in text
assert "Get-Command npm.cmd -ErrorAction SilentlyContinue" in text
assert '[System.IO.Path]::ChangeExtension($npm.Source, ".cmd")' in text
# Last-resort path should still work by launching npm-cli.js via node.
assert "node_modules\\npm\\bin\\npm-cli.js" in text
assert "Invoke-NpmInstallSilent -WorkingDir $InstallDir" in text
assert "Invoke-NpmInstallSilent -WorkingDir $tuiDir" in text
def test_python_dependency_install_has_windows_cli_tui_fallback() -> None:
text = INSTALL_PS1.read_text()
# Keep broad install attempt first.
assert '& $UvCmd pip install -e ".[all]"' in text
# Then fallback to Windows CLI/TUI essentials if optional extras fail.
assert '& $UvCmd pip install -e ".[pty,mcp,honcho,acp]"' in text
# Final safety fallback to base package.
assert '& $UvCmd pip install -e "."' in text
assert 'throw "Failed to install Hermes Python dependencies."' in text
def test_managed_node_is_persisted_for_future_tui_runs() -> None:
text = INSTALL_PS1.read_text()
assert "Add-UserPathEntry -CurrentPath $newPath -Entry $managedNodeDir" in text
assert '[Environment]::SetEnvironmentVariable("HERMES_NODE", $managedNodeExe, "User")' in text
assert '$env:Path = Add-UserPathEntry -CurrentPath $env:Path -Entry "$HermesHome\\node"' in text
@@ -0,0 +1,22 @@
"""Regression tests for Termux network prerequisite handling in install.sh."""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
INSTALL_SH = REPO_ROOT / "scripts" / "install.sh"
def test_termux_pkg_list_includes_network_basics() -> None:
text = INSTALL_SH.read_text()
assert "local termux_pkgs=(clang rust make pkg-config libffi openssl ca-certificates curl)" in text
def test_install_script_has_connectivity_probe_and_termux_guidance() -> None:
text = INSTALL_SH.read_text()
assert "check_network_prerequisites()" in text
assert "https://pypi.org/simple/" in text
assert "https://duckduckgo.com/" in text
assert "termux-change-repo" in text
assert "pkg install -y ca-certificates curl && pkg update" in text
assert "check_network_prerequisites" in text
+36 -9
@@ -828,18 +828,45 @@ class TestE2EChannelsList:
assert result["channels"][0]["target"] == "slack:C1234"
def test_channels_with_directory(self, mcp_server_e2e, _event_loop, monkeypatch):
"""Populated channel_directory.json should be unwrapped via the 'platforms' key.
Regression test for issue #21474: the writer wraps platforms under
{"updated_at": ..., "platforms": {...}} but the reader was iterating
directory.items() directly, so channels_list always returned 0.
"""
import mcp_serve
monkeypatch.setattr(mcp_serve, "_load_channel_directory", lambda: {
"telegram": [
{"id": "123456", "name": "Alice", "type": "dm"},
{"id": "-100999", "name": "Dev Group", "type": "group"},
],
"updated_at": "2026-05-07T12:00:00",
"platforms": {
"telegram": [
{"id": "123456", "name": "Alice", "type": "dm"},
{"id": "-100999", "name": "Dev Group", "type": "group"},
],
"discord": [
{"id": "789", "name": "general", "type": "text"},
],
},
})
# Need to recreate server to pick up the new mock
server, bridge = mcp_server_e2e
# The tool closure already captured the old mock, so test the function directly
directory = mcp_serve._load_channel_directory()
assert len(directory["telegram"]) == 2
server, _ = mcp_server_e2e
result = _run_tool(server, "channels_list")
assert result["count"] == 3
targets = {c["target"] for c in result["channels"]}
assert targets == {"telegram:123456", "telegram:-100999", "discord:789"}
def test_channels_with_directory_platform_filter(self, mcp_server_e2e, _event_loop, monkeypatch):
"""Platform filter should work against the wrapped 'platforms' payload."""
import mcp_serve
monkeypatch.setattr(mcp_serve, "_load_channel_directory", lambda: {
"updated_at": "2026-05-07T12:00:00",
"platforms": {
"telegram": [{"id": "123456", "name": "Alice", "type": "dm"}],
"discord": [{"id": "789", "name": "general", "type": "text"}],
},
})
server, _ = mcp_server_e2e
result = _run_tool(server, "channels_list", {"platform": "discord"})
assert result["count"] == 1
assert result["channels"][0]["target"] == "discord:789"
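The regression described in the docstring for issue #21474 comes down to reading the channel list from the wrapped `"platforms"` key instead of iterating the top-level dict. A minimal sketch of that unwrapping, including the optional platform filter, could look like the following. `list_channel_targets` is a hypothetical helper for illustration; the actual `channels_list` tool in `mcp_serve` may return additional fields or handle legacy payloads differently.

```python
from typing import Any, Dict, List, Optional


def list_channel_targets(
    directory: Dict[str, Any], platform: Optional[str] = None
) -> List[Dict[str, Any]]:
    """Flatten a wrapped channel directory into ``platform:id`` targets.

    Hypothetical helper; assumes the writer's shape
    {"updated_at": ..., "platforms": {name: [entries]}}.
    """
    platforms = directory.get("platforms")
    if not isinstance(platforms, dict):
        # Missing or malformed wrapper: nothing to list.
        return []
    channels: List[Dict[str, Any]] = []
    for name, entries in platforms.items():
        if platform and name != platform:
            continue
        for entry in entries:
            channels.append(
                {
                    "target": f"{name}:{entry['id']}",
                    "name": entry.get("name"),
                    "type": entry.get("type"),
                }
            )
    return channels
```

Iterating `directory.items()` directly would instead visit the `"updated_at"` string and the `"platforms"` dict as if they were platform entries, which is consistent with the zero-result symptom the docstring reports.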
class TestE2EPermissions:
+23
@@ -0,0 +1,23 @@
"""Regression coverage for the Termux broad install profile."""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
PYPROJECT = REPO_ROOT / "pyproject.toml"
INSTALL_SH = REPO_ROOT / "scripts" / "install.sh"
def test_pyproject_defines_termux_all_without_known_blockers() -> None:
text = PYPROJECT.read_text()
assert "termux-all = [" in text
assert '"hermes-agent[termux]"' in text
assert '"hermes-agent[matrix]"' not in text.split("termux-all = [", 1)[1].split("]", 1)[0]
assert '"hermes-agent[voice]"' not in text.split("termux-all = [", 1)[1].split("]", 1)[0]
def test_install_script_prefers_termux_all_then_fallbacks() -> None:
text = INSTALL_SH.read_text()
assert "pip install -e '.[termux-all]' -c constraints-termux.txt" in text
assert "Termux broad profile (.[termux-all]) failed, trying baseline Termux profile..." in text
assert "Termux baseline profile (.[termux]) failed, trying base install..." in text
+17 -11
@@ -1863,13 +1863,15 @@ def test_config_set_personality_rejects_unknown_name(monkeypatch):
assert "Unknown personality" in resp["error"]["message"]
def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
def test_config_set_personality_preserves_history_and_returns_info(monkeypatch):
agent = types.SimpleNamespace(
ephemeral_system_prompt=None, _cached_system_prompt="old"
)
session = _session(
agent=types.SimpleNamespace(),
agent=agent,
history=[{"role": "user", "text": "hi"}],
history_version=4,
)
new_agent = types.SimpleNamespace(model="x")
emits = []
server._sessions["sid"] = session
@@ -1878,13 +1880,9 @@ def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
"_available_personalities",
lambda cfg=None: {"helpful": "You are helpful."},
)
monkeypatch.setattr(
server, "_make_agent", lambda sid, key, session_id=None: new_agent
)
monkeypatch.setattr(
server, "_session_info", lambda agent: {"model": getattr(agent, "model", "?")}
)
monkeypatch.setattr(server, "_restart_slash_worker", lambda session: None)
monkeypatch.setattr(server, "_emit", lambda *args: emits.append(args))
monkeypatch.setattr(server, "_write_config_key", lambda path, value: None)
@@ -1896,11 +1894,19 @@ def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
}
)
assert resp["result"]["history_reset"] is True
assert resp["result"]["info"] == {"model": "x"}
assert session["history"] == []
assert resp["result"]["history_reset"] is False
assert resp["result"]["info"] == {"model": "?"}
# History is preserved with a pivot marker appended
assert len(session["history"]) == 2
assert session["history"][0] == {"role": "user", "text": "hi"}
assert session["history"][1]["role"] == "user"
assert "personality" in session["history"][1]["content"].lower()
assert "You are helpful." in session["history"][1]["content"]
assert session["history_version"] == 5
assert ("session.info", "sid", {"model": "x"}) in emits
# Agent's system prompt was updated in-place; cached prompt untouched
assert agent.ephemeral_system_prompt == "You are helpful."
assert agent._cached_system_prompt == "old"
assert ("session.info", "sid", {"model": "?"}) in emits
def test_session_compress_uses_compress_helper(monkeypatch):
+182
@@ -0,0 +1,182 @@
"""Unit tests for Windows UTF-8 process bootstrap."""
from __future__ import annotations
import os
from types import SimpleNamespace
import utf8_bootstrap as utf8_bootstrap
def _fake_sys(
*,
platform: str,
utf8_mode: int,
argv: list[str] | None = None,
executable: str = r"C:\Python\python.exe",
) -> SimpleNamespace:
return SimpleNamespace(
platform=platform,
flags=SimpleNamespace(utf8_mode=utf8_mode),
argv=argv or ["hermes"],
executable=executable,
)
def test_non_windows_noop(monkeypatch) -> None:
monkeypatch.setattr(
utf8_bootstrap,
"sys",
_fake_sys(platform="darwin", utf8_mode=0),
)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should not run on non-Windows")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert utf8_bootstrap.ensure_windows_utf8_mode() is False
assert called["exec"] is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
def test_windows_utf8_already_enabled_sets_env_without_reexec(monkeypatch) -> None:
monkeypatch.setattr(
utf8_bootstrap,
"sys",
_fake_sys(platform="win32", utf8_mode=1),
)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should not run when utf8_mode=1")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert utf8_bootstrap.ensure_windows_utf8_mode() is False
assert called["exec"] is False
assert os.environ["PYTHONUTF8"] == "1"
assert os.environ["PYTHONIOENCODING"] == "utf-8"
def test_windows_reexec_attempt_uses_utf8_flag(monkeypatch) -> None:
fake_sys = _fake_sys(platform="win32", utf8_mode=0, argv=["hermes", "--help"])
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
monkeypatch.delenv("_HERMES_UTF8_REEXEC", raising=False)
captured: dict[str, object] = {}
def _fake_exec(executable, argv, env):
captured["executable"] = executable
captured["argv"] = argv
captured["env"] = env
raise OSError("blocked by test")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert (
utf8_bootstrap.ensure_windows_utf8_mode(entrypoint_markers=("hermes",))
is False
)
assert captured["executable"] == fake_sys.executable
assert captured["argv"] == [
fake_sys.executable,
"-X",
"utf8",
*fake_sys.argv,
]
env = captured["env"]
assert isinstance(env, dict)
assert env["PYTHONUTF8"] == "1"
assert env["PYTHONIOENCODING"] == "utf-8"
assert env["_HERMES_UTF8_REEXEC"] == "1"
def test_module_reexec_uses_dash_m_and_drops_argv0(monkeypatch) -> None:
fake_sys = _fake_sys(
platform="win32",
utf8_mode=0,
argv=[r"C:\Users\me\AppData\Local\Programs\Python\Scripts\hermes.exe", "chat", "--verbose"],
)
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.delenv("_HERMES_UTF8_REEXEC", raising=False)
captured: dict[str, object] = {}
def _fake_exec(executable, argv, env):
captured["executable"] = executable
captured["argv"] = argv
captured["env"] = env
raise OSError("blocked by test")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert (
utf8_bootstrap.ensure_windows_utf8_mode(
module="hermes_cli.main",
entrypoint_markers=("hermes",),
)
is False
)
assert captured["executable"] == fake_sys.executable
assert captured["argv"] == [
fake_sys.executable,
"-X",
"utf8",
"-m",
"hermes_cli.main",
"chat",
"--verbose",
]
env = captured["env"]
assert isinstance(env, dict)
assert env["_HERMES_UTF8_REEXEC"] == "1"
def test_marker_mismatch_skips_reexec(monkeypatch) -> None:
fake_sys = _fake_sys(platform="win32", utf8_mode=0, argv=["pytest", "-k", "x"])
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.delenv("_HERMES_UTF8_REEXEC", raising=False)
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should be skipped for non-matching marker")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert (
utf8_bootstrap.ensure_windows_utf8_mode(entrypoint_markers=("hermes",))
is False
)
assert called["exec"] is False
def test_reexec_guard_prevents_loops(monkeypatch) -> None:
fake_sys = _fake_sys(platform="win32", utf8_mode=0, argv=["hermes"])
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.setenv("_HERMES_UTF8_REEXEC", "1")
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should be skipped when guard is set")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert utf8_bootstrap.ensure_windows_utf8_mode() is False
assert called["exec"] is False
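Taken together, the bootstrap tests above describe two decisions: whether to re-exec at all, and what argv to hand to `os.execvpe`. The sketch below restates those decisions as pure functions so they can be read without the monkeypatching. Both function names are hypothetical, and the substring match against `argv[0]` is an assumption about how `entrypoint_markers` is interpreted; the real `ensure_windows_utf8_mode` may match differently.

```python
from typing import List, Mapping, Optional, Sequence


def should_reexec(
    platform: str,
    utf8_mode: int,
    argv: Sequence[str],
    env: Mapping[str, str],
    entrypoint_markers: Sequence[str] = (),
) -> bool:
    """Decide whether a UTF-8 re-exec is warranted (illustrative only)."""
    if platform != "win32" or utf8_mode:
        return False
    if env.get("_HERMES_UTF8_REEXEC") == "1":
        # Guard variable set by a previous re-exec: never loop.
        return False
    if entrypoint_markers:
        # Assumption: a marker matches when it appears in argv[0].
        argv0 = (argv[0] if argv else "").lower()
        return any(marker.lower() in argv0 for marker in entrypoint_markers)
    return True


def build_utf8_reexec_argv(
    executable: str, argv: Sequence[str], module: Optional[str] = None
) -> List[str]:
    """Build the argv for re-running the interpreter with ``-X utf8``."""
    if module is not None:
        # With -m, the launcher path in argv[0] is dropped and the module
        # name takes its place.
        return [executable, "-X", "utf8", "-m", module, *argv[1:]]
    return [executable, "-X", "utf8", *argv]
```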
+33
@@ -328,3 +328,36 @@ class TestSanePathIncludesHomebrew:
result = _make_run_env({})
# Should keep existing PATH unchanged
assert result["PATH"] == "/usr/bin:/bin"
class TestWindowsSanePath:
def test_make_run_env_windows_uses_windows_path_rules(self):
"""Windows mode should use ';' and avoid POSIX /usr/bin injections."""
from tools.environments.local import _make_run_env
with patch("tools.environments.local._IS_WINDOWS", True), patch.dict(
os.environ, {"PATH": r"C:\Users\Test\bin"}, clear=True
):
result = _make_run_env({})
assert "PATH" in result
assert ";" in result["PATH"]
assert "/usr/bin" not in result["PATH"]
parts = [p for p in result["PATH"].split(";") if p]
assert any("git" in p.lower() and "bin" in p.lower() for p in parts)
def test_make_run_env_windows_dedupes_case_insensitive_entries(self):
"""Repeated Windows path entries should not be appended twice."""
from tools.environments.local import _make_run_env
with patch("tools.environments.local._IS_WINDOWS", True), patch.dict(
os.environ, {"PATH": r"C:\Windows\System32;C:\TOOLS\BIN"}, clear=True
):
result = _make_run_env({})
normalized = [
p.replace("/", "\\").lower().rstrip("\\")
for p in result["PATH"].split(";")
if p
]
assert normalized.count(r"c:\windows\system32") == 1
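The second Windows test normalizes each PATH entry (forward slashes to backslashes, lowercase, trailing backslash removed) before counting duplicates. The same normalization can serve as a dedupe key when PATH is assembled, as in this sketch. `dedupe_windows_path` is a hypothetical helper, not the logic inside `_make_run_env`, which is not shown in this diff.

```python
from typing import Iterable, List, Set


def dedupe_windows_path(path_value: str, extra_entries: Iterable[str] = ()) -> str:
    """Join PATH entries with ';', skipping case-insensitive duplicates.

    Hypothetical helper for illustration only.
    """
    seen: Set[str] = set()
    kept: List[str] = []
    for entry in [*path_value.split(";"), *extra_entries]:
        if not entry:
            continue
        # Same normalization the test applies before counting entries.
        key = entry.replace("/", "\\").lower().rstrip("\\")
        if key in seen:
            continue
        seen.add(key)
        kept.append(entry)  # keep the first spelling that was seen
    return ";".join(kept)
```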
@@ -0,0 +1,275 @@
"""Tests for the Brave Search (free tier) web search provider.
Covers:
- BraveFreeSearchProvider.is_configured() env var gating
- BraveFreeSearchProvider.search() happy path, HTTP error, request error, bad JSON
- Result normalization (title, url, description, position)
- Limit truncation + Brave's count cap (20)
- _is_backend_available("brave-free") integration
- _get_backend() recognizes "brave-free" as a valid configured backend
- check_web_api_key() includes brave-free in availability check
- web_extract / web_crawl return search-only errors when brave-free is active
"""
from __future__ import annotations
import json
from unittest.mock import MagicMock, patch
# ---------------------------------------------------------------------------
# BraveFreeSearchProvider unit tests
# ---------------------------------------------------------------------------
class TestBraveFreeProviderIsConfigured:
def test_configured_when_key_set(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().is_configured() is True
def test_not_configured_when_key_missing(self, monkeypatch):
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().is_configured() is False
def test_not_configured_when_key_whitespace(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", " ")
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().is_configured() is False
def test_provider_name(self):
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().provider_name() == "brave-free"
def test_implements_web_search_provider(self):
from tools.web_providers.base import WebSearchProvider
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert issubclass(BraveFreeSearchProvider, WebSearchProvider)
class TestBraveFreeProviderSearch:
_SAMPLE_RESPONSE = {
"web": {
"results": [
{"title": "A", "url": "https://a.example.com", "description": "desc A"},
{"title": "B", "url": "https://b.example.com", "description": "desc B"},
{"title": "C", "url": "https://c.example.com", "description": "desc C"},
]
}
}
@staticmethod
def _mock_resp(json_data, status_code=200):
m = MagicMock()
m.status_code = status_code
m.json.return_value = json_data
m.raise_for_status = MagicMock()
return m
def test_happy_path_normalizes_results(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp(self._SAMPLE_RESPONSE)):
result = BraveFreeSearchProvider().search("test query", limit=5)
assert result["success"] is True
web = result["data"]["web"]
assert len(web) == 3
assert web[0] == {"title": "A", "url": "https://a.example.com", "description": "desc A", "position": 1}
assert web[2]["position"] == 3
def test_sends_subscription_token_header_and_count(self, monkeypatch):
"""Brave uses X-Subscription-Token; count maps from limit."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
captured = {}
def fake_get(url, **kwargs):
captured["url"] = url
captured["headers"] = kwargs.get("headers", {})
captured["params"] = kwargs.get("params", {})
return self._mock_resp({"web": {"results": []}})
with patch("httpx.get", side_effect=fake_get):
BraveFreeSearchProvider().search("q", limit=5)
assert captured["url"] == "https://api.search.brave.com/res/v1/web/search"
assert captured["headers"].get("X-Subscription-Token") == "BSAkey123"
assert captured["params"].get("q") == "q"
assert captured["params"].get("count") == 5
def test_count_is_capped_at_20(self, monkeypatch):
"""Brave caps count at 20 — limit above that clamps."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
captured = {}
def fake_get(url, **kwargs):
captured["params"] = kwargs.get("params", {})
return self._mock_resp({"web": {"results": []}})
with patch("httpx.get", side_effect=fake_get):
BraveFreeSearchProvider().search("q", limit=100)
assert captured["params"].get("count") == 20
def test_limit_is_respected_client_side(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp(self._SAMPLE_RESPONSE)):
result = BraveFreeSearchProvider().search("q", limit=2)
assert result["success"] is True
assert len(result["data"]["web"]) == 2
def test_empty_results(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp({"web": {"results": []}})):
result = BraveFreeSearchProvider().search("nothing", limit=5)
assert result["success"] is True
assert result["data"]["web"] == []
def test_missing_web_key_returns_empty(self, monkeypatch):
"""Responses without a ``web`` block should produce an empty result set, not crash."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp({})):
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is True
assert result["data"]["web"] == []
def test_http_error_returns_failure(self, monkeypatch):
import httpx
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
bad = MagicMock()
bad.status_code = 429
err = httpx.HTTPStatusError("429", request=MagicMock(), response=bad)
with patch("httpx.get", side_effect=err):
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "429" in result["error"]
def test_request_error_returns_failure(self, monkeypatch):
import httpx
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", side_effect=httpx.RequestError("boom")):
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "boom" in result["error"] or "Brave" in result["error"]
def test_missing_key_returns_failure(self, monkeypatch):
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
from tools.web_providers.brave_free import BraveFreeSearchProvider
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "BRAVE_SEARCH_API_KEY" in result["error"]
# ---------------------------------------------------------------------------
# Integration: _is_backend_available / _get_backend / check_web_api_key
# ---------------------------------------------------------------------------
class TestBraveFreeBackendWiring:
def test_is_backend_available_true_when_key_set(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_tools import _is_backend_available
assert _is_backend_available("brave-free") is True
def test_is_backend_available_false_when_key_missing(self, monkeypatch):
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
from tools.web_tools import _is_backend_available
assert _is_backend_available("brave-free") is False
def test_configured_backend_accepted(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
assert web_tools._get_backend() == "brave-free"
def test_auto_detect_picks_brave_free_when_only_key_set(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY",
"TAVILY_API_KEY", "EXA_API_KEY", "SEARXNG_URL"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: False)
assert web_tools._get_backend() == "brave-free"
def test_brave_free_does_not_override_paid_provider(self, monkeypatch):
"""Tavily (higher priority) should win in auto-detect."""
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY", "EXA_API_KEY", "SEARXNG_URL"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setenv("TAVILY_API_KEY", "tvly")
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
assert web_tools._get_backend() == "tavily"
def test_check_web_api_key_true_when_brave_free_configured(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
assert web_tools.check_web_api_key() is True
# ---------------------------------------------------------------------------
# brave-free is search-only: web_extract / web_crawl return clear errors
# ---------------------------------------------------------------------------
class TestBraveFreeSearchOnlyErrors:
def test_web_extract_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.get_event_loop().run_until_complete(
web_tools.web_extract_tool(["https://example.com"])
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "brave" in result["error"].lower()
def test_web_crawl_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "check_firecrawl_api_key", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.get_event_loop().run_until_complete(
web_tools.web_crawl_tool("https://example.com")
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "brave" in result["error"].lower()
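The search-only tests above drive async tools from synchronous test functions via `asyncio.get_event_loop().run_until_complete(...)`. Newer Python releases deprecate relying on `get_event_loop()` to create a loop implicitly, so the following is a minimal sketch of the same pattern using `asyncio.run`. The names `fake_extract_tool` and `run_tool_sync` are hypothetical stand-ins, not part of Hermes.

```python
import asyncio
import json


async def fake_extract_tool(urls: list[str]) -> str:
    """Hypothetical stand-in for an async tool that returns a JSON string."""
    return json.dumps({
        "success": False,
        "error": "Example backend is a search-only backend and cannot extract URL content.",
    })


def run_tool_sync(coro) -> dict:
    """Run one coroutine to completion on a fresh event loop and decode its JSON result."""
    return json.loads(asyncio.run(coro))


result = run_tool_sync(fake_extract_tool(["https://example.com"]))
print(result["success"], "search-only" in result["error"].lower())
```

`asyncio.run` creates and closes its own loop per call, so it fits tests that do not share loop state; it cannot be called from inside an already running loop.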
+246
@@ -0,0 +1,246 @@
"""Tests for the DuckDuckGo (ddgs) web search provider.
Covers:
- DDGSSearchProvider.is_configured() reflects package importability
- DDGSSearchProvider.search() happy path, missing package, runtime error
- Result normalization (title, url, description, position)
- _is_backend_available("ddgs") / _get_backend() integration
- web_extract / web_crawl return search-only errors when ddgs is active
"""
from __future__ import annotations
import json
import sys
import types
from unittest.mock import MagicMock
def _install_fake_ddgs(monkeypatch, *, text_results=None, text_raises=None):
"""Install a stub ``ddgs`` module in sys.modules for the duration of a test.
``text_results``: iterable of dicts to yield from DDGS().text(...).
``text_raises``: if set, DDGS().text raises this exception instead.
"""
fake = types.ModuleType("ddgs")
class _FakeDDGS:
def __enter__(self):
return self
def __exit__(self, *_a):
return False
def text(self, query, max_results=5):
if text_raises is not None:
raise text_raises
for hit in (text_results or []):
yield hit
fake.DDGS = _FakeDDGS
monkeypatch.setitem(sys.modules, "ddgs", fake)
return fake
# ---------------------------------------------------------------------------
# DDGSSearchProvider unit tests
# ---------------------------------------------------------------------------
class TestDDGSProviderIsConfigured:
def test_configured_when_package_importable(self, monkeypatch):
_install_fake_ddgs(monkeypatch)
# Drop any cached ``tools.web_providers.ddgs`` so is_configured re-imports ddgs fresh
monkeypatch.delitem(sys.modules, "tools.web_providers.ddgs", raising=False)
from tools.web_providers.ddgs import DDGSSearchProvider
assert DDGSSearchProvider().is_configured() is True
def test_not_configured_when_package_missing(self, monkeypatch):
monkeypatch.delitem(sys.modules, "ddgs", raising=False)
monkeypatch.delitem(sys.modules, "tools.web_providers.ddgs", raising=False)
# Block the import so ``import ddgs`` raises ImportError even if the package is actually installed
import builtins
orig_import = builtins.__import__
def blocked_import(name, *args, **kwargs):
if name == "ddgs":
raise ImportError("blocked for test")
return orig_import(name, *args, **kwargs)
monkeypatch.setattr(builtins, "__import__", blocked_import)
from tools.web_providers.ddgs import DDGSSearchProvider
assert DDGSSearchProvider().is_configured() is False
def test_provider_name(self):
from tools.web_providers.ddgs import DDGSSearchProvider
assert DDGSSearchProvider().provider_name() == "ddgs"
def test_implements_web_search_provider(self):
from tools.web_providers.base import WebSearchProvider
from tools.web_providers.ddgs import DDGSSearchProvider
assert issubclass(DDGSSearchProvider, WebSearchProvider)
class TestDDGSProviderSearch:
def test_happy_path_normalizes_results(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[
{"title": "A", "href": "https://a.example.com", "body": "desc A"},
{"title": "B", "href": "https://b.example.com", "body": "desc B"},
{"title": "C", "href": "https://c.example.com", "body": "desc C"},
])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is True
web = result["data"]["web"]
assert len(web) == 3
assert web[0] == {"title": "A", "url": "https://a.example.com", "description": "desc A", "position": 1}
assert web[2]["position"] == 3
def test_accepts_url_key_as_fallback_for_href(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[
{"title": "A", "url": "https://a.example.com", "body": "desc A"},
])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is True
assert result["data"]["web"][0]["url"] == "https://a.example.com"
def test_limit_is_respected(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[
{"title": f"R{i}", "href": f"https://r{i}.example.com", "body": ""}
for i in range(10)
])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=3)
assert result["success"] is True
assert len(result["data"]["web"]) == 3
def test_missing_package_returns_failure(self, monkeypatch):
monkeypatch.delitem(sys.modules, "ddgs", raising=False)
monkeypatch.delitem(sys.modules, "tools.web_providers.ddgs", raising=False)
import builtins
orig_import = builtins.__import__
def blocked_import(name, *args, **kwargs):
if name == "ddgs":
raise ImportError("blocked for test")
return orig_import(name, *args, **kwargs)
monkeypatch.setattr(builtins, "__import__", blocked_import)
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "ddgs" in result["error"].lower()
def test_runtime_error_returns_failure(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_raises=RuntimeError("rate limited 202"))
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "rate limited" in result["error"] or "failed" in result["error"].lower()
def test_empty_results(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("nothing", limit=5)
assert result["success"] is True
assert result["data"]["web"] == []
# ---------------------------------------------------------------------------
# Integration: _is_backend_available / _get_backend / check_web_api_key
# ---------------------------------------------------------------------------
class TestDDGSBackendWiring:
def test_is_backend_available_true_when_package_importable(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._is_backend_available("ddgs") is True
def test_is_backend_available_false_when_package_missing(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: False)
assert web_tools._is_backend_available("ddgs") is False
def test_configured_backend_accepted(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._get_backend() == "ddgs"
def test_ddgs_trails_paid_providers_in_auto_detect(self, monkeypatch):
"""Exa (higher priority) should win over ddgs in auto-detect."""
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY",
"TAVILY_API_KEY", "SEARXNG_URL", "BRAVE_SEARCH_API_KEY"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setenv("EXA_API_KEY", "exa-key")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._get_backend() == "exa"
def test_auto_detect_picks_ddgs_as_last_resort(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY",
"TAVILY_API_KEY", "EXA_API_KEY", "SEARXNG_URL", "BRAVE_SEARCH_API_KEY"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._get_backend() == "ddgs"
def test_check_web_api_key_true_when_ddgs_configured(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools.check_web_api_key() is True
# ---------------------------------------------------------------------------
# ddgs is search-only: web_extract / web_crawl return clear errors
# ---------------------------------------------------------------------------
class TestDDGSSearchOnlyErrors:
def test_web_extract_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.get_event_loop().run_until_complete(
web_tools.web_extract_tool(["https://example.com"])
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "duckduckgo" in result["error"].lower() or "ddgs" in result["error"].lower()
def test_web_crawl_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "check_firecrawl_api_key", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.get_event_loop().run_until_complete(
web_tools.web_crawl_tool("https://example.com")
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "duckduckgo" in result["error"].lower() or "ddgs" in result["error"].lower()
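The `_install_fake_ddgs` helper in this test file relies on `monkeypatch.setitem(sys.modules, ...)` to make a stub importable and to undo the change afterwards. Outside pytest the same idea can be sketched with a context manager. This is an illustration only: `stub_module` and the package name `hypothetical_search_pkg` are made up for the example.

```python
import importlib
import sys
import types
from contextlib import contextmanager


@contextmanager
def stub_module(name: str, **attrs):
    """Temporarily register a fake module under ``name`` and restore sys.modules afterwards."""
    fake = types.ModuleType(name)
    for key, value in attrs.items():
        setattr(fake, key, value)
    missing = object()
    previous = sys.modules.get(name, missing)
    sys.modules[name] = fake
    try:
        yield fake
    finally:
        # Put back whatever was registered before, or remove our stub entirely.
        if previous is missing:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = previous


# "hypothetical_search_pkg" is a made-up name used only for this illustration.
with stub_module("hypothetical_search_pkg", search=lambda q: [q.upper()]):
    module = importlib.import_module("hypothetical_search_pkg")
    hits = module.search("hello")

still_registered = "hypothetical_search_pkg" in sys.modules
```

Because the import system consults `sys.modules` first, the stub is returned without any file lookup, which is why the tests can run without the real package installed.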
+43 -1
@@ -1,6 +1,7 @@
"""Local execution environment — spawn-per-call with session snapshot."""
import logging
import ntpath
import os
import platform
import re
@@ -217,6 +218,30 @@ _SANE_PATH = (
"/opt/homebrew/bin:/opt/homebrew/sbin:"
"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
)
_SANE_PATH_WINDOWS = tuple(
p for p in (
os.path.join(os.environ.get("SystemRoot", r"C:\Windows"), "System32"),
os.environ.get("SystemRoot", r"C:\Windows"),
os.path.join(
os.environ.get("SystemRoot", r"C:\Windows"),
"System32",
"WindowsPowerShell",
"v1.0",
),
os.path.join(
os.environ.get("ProgramFiles", r"C:\Program Files"),
"Git",
"bin",
),
os.path.join(
os.environ.get("ProgramFiles(x86)", r"C:\Program Files (x86)"),
"Git",
"bin",
),
os.path.join(os.environ.get("LOCALAPPDATA", ""), "Programs", "Git", "bin"),
)
if p
)
def _make_run_env(env: dict) -> dict:
@@ -235,7 +260,24 @@ def _make_run_env(env: dict) -> dict:
elif k not in _HERMES_PROVIDER_ENV_BLOCKLIST or _is_passthrough(k):
run_env[k] = v
existing_path = run_env.get("PATH", "")
-if "/usr/bin" not in existing_path.split(":"):
+if _IS_WINDOWS:
+# Keep PATH Windows-native (`;` separator, case-insensitive dedupe)
+# and avoid injecting POSIX defaults like /usr/bin.
+parts = [p for p in existing_path.split(";") if p]
+seen = {
+ntpath.normcase(ntpath.normpath(p.rstrip("\\/")))
+for p in parts
+if p
+}
+for candidate in _SANE_PATH_WINDOWS:
+norm = ntpath.normcase(ntpath.normpath(candidate.rstrip("\\/")))
+if norm in seen:
+continue
+parts.append(candidate)
+seen.add(norm)
+if parts:
+run_env["PATH"] = ";".join(parts)
+elif "/usr/bin" not in existing_path.split(":"):
run_env["PATH"] = f"{existing_path}:{_SANE_PATH}" if existing_path else _SANE_PATH
# Per-profile HOME isolation: redirect system tool configs (git, ssh, gh,
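The Windows branch in the hunk above appends default directories to `PATH` while skipping entries that already exist under a different spelling. A self-contained sketch of that dedupe step follows; `merge_windows_path` is a hypothetical helper name, and because it uses `ntpath` directly it behaves the same on any host OS.

```python
import ntpath


def merge_windows_path(existing: str, candidates: list[str]) -> str:
    """Append candidates to a ';'-separated PATH, skipping case-insensitive duplicates."""

    def norm(path: str) -> str:
        # normpath collapses redundant separators; normcase lowercases and
        # converts forward slashes to backslashes.
        return ntpath.normcase(ntpath.normpath(path.rstrip("\\/")))

    parts = [p for p in existing.split(";") if p]
    seen = {norm(p) for p in parts}
    for candidate in candidates:
        key = norm(candidate)
        if key not in seen:
            parts.append(candidate)
            seen.add(key)
    return ";".join(parts)


merged = merge_windows_path(
    r"C:\Windows\System32;C:\Tools",
    [r"c:\windows\system32\\", r"C:\Program Files\Git\bin"],
)
print(merged)
```

Existing entries keep their original spelling and order; only the comparison key is normalized, so the user's `PATH` is extended rather than rewritten.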
+130
@@ -0,0 +1,130 @@
"""Brave Search web search provider (free tier).
Brave Search's Data-for-Search API offers a free tier (2,000 queries/mo at the
time of writing) after signing up at https://brave.com/search/api/. This
provider implements ``WebSearchProvider`` only: the Data-for-Search endpoint
returns search results but does not extract or crawl arbitrary URLs.
Configuration::
# ~/.hermes/.env
BRAVE_SEARCH_API_KEY=your-subscription-token
# ~/.hermes/config.yaml
web:
search_backend: "brave-free"
extract_backend: "firecrawl" # pair with an extract provider if needed
The API uses the ``X-Subscription-Token`` header. Free-tier keys are rate
limited (1 qps) and capped at 2k queries/month; see the Brave dashboard for
current quotas.
"""
from __future__ import annotations
import logging
import os
from typing import Any, Dict
from tools.web_providers.base import WebSearchProvider
logger = logging.getLogger(__name__)
_BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"
class BraveFreeSearchProvider(WebSearchProvider):
"""Search via the Brave Search API (free tier).
Requires ``BRAVE_SEARCH_API_KEY`` to be set. The value is passed as the
``X-Subscription-Token`` header. No extract capability; pair with
Firecrawl/Tavily/Exa/Parallel when you also need ``web_extract``.
"""
def provider_name(self) -> str:
return "brave-free"
def is_configured(self) -> bool:
"""Return True when ``BRAVE_SEARCH_API_KEY`` is set to a non-empty value."""
return bool(os.getenv("BRAVE_SEARCH_API_KEY", "").strip())
def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
"""Execute a search against the Brave Search API.
Returns normalized results::
{
"success": True,
"data": {
"web": [
{
"title": str,
"url": str,
"description": str,
"position": int,
},
...
]
}
}
On failure returns ``{"success": False, "error": str}``.
"""
import httpx
api_key = os.getenv("BRAVE_SEARCH_API_KEY", "").strip()
if not api_key:
return {"success": False, "error": "BRAVE_SEARCH_API_KEY is not set"}
# Brave's `count` is capped at 20.
count = max(1, min(int(limit), 20))
try:
resp = httpx.get(
_BRAVE_ENDPOINT,
params={"q": query, "count": count},
headers={
"X-Subscription-Token": api_key,
"Accept": "application/json",
},
timeout=15,
)
resp.raise_for_status()
except httpx.HTTPStatusError as exc:
logger.warning("Brave Search HTTP error: %s", exc)
return {
"success": False,
"error": f"Brave Search returned HTTP {exc.response.status_code}",
}
except httpx.RequestError as exc:
logger.warning("Brave Search request error: %s", exc)
return {"success": False, "error": f"Could not reach Brave Search: {exc}"}
try:
data = resp.json()
except Exception as exc: # noqa: BLE001
logger.warning("Brave Search response parse error: %s", exc)
return {"success": False, "error": "Could not parse Brave Search response as JSON"}
raw_results = (data.get("web") or {}).get("results", []) or []
truncated = raw_results[:limit]
web_results = [
{
"title": str(r.get("title", "")),
"url": str(r.get("url", "")),
"description": str(r.get("description", "")),
"position": i + 1,
}
for i, r in enumerate(truncated)
]
logger.info(
"Brave Search '%s': %d results (from %d raw, limit %d)",
query,
len(web_results),
len(raw_results),
limit,
)
return {"success": True, "data": {"web": web_results}}
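The two pure steps in `BraveFreeSearchProvider.search`, clamping the requested count and normalizing the response, can be exercised without any network access. The sketch below mirrors that logic under hypothetical names (`clamp_count`, `normalize_brave_results`, `BRAVE_MAX_COUNT`); the cap of 20 is taken from the comment in the file above, not verified against Brave's documentation here.

```python
from typing import Any, Dict, List

BRAVE_MAX_COUNT = 20  # assumption carried over from the provider's own comment


def clamp_count(limit: int) -> int:
    """Map a caller-supplied limit onto the range the API is assumed to accept."""
    return max(1, min(int(limit), BRAVE_MAX_COUNT))


def normalize_brave_results(payload: Dict[str, Any], limit: int) -> List[Dict[str, Any]]:
    """Turn a Brave-style payload into title/url/description/position dicts."""
    # A missing or null "web" block yields an empty list instead of raising.
    raw = (payload.get("web") or {}).get("results", []) or []
    return [
        {
            "title": str(r.get("title", "")),
            "url": str(r.get("url", "")),
            "description": str(r.get("description", "")),
            "position": i + 1,
        }
        for i, r in enumerate(raw[:limit])
    ]


payload = {
    "web": {
        "results": [
            {"title": "A", "url": "https://a.example.com", "description": "desc A"},
            {"title": "B", "url": "https://b.example.com"},
        ]
    }
}
normalized = normalize_brave_results(payload, limit=1)
print(clamp_count(100), normalized)
```

Truncating client-side as well as sending `count` keeps the result size predictable even if the server returns more items than requested.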
+98
@@ -0,0 +1,98 @@
"""DuckDuckGo web search provider via the ``ddgs`` Python package.
DuckDuckGo does not provide an official programmatic search API. The
community-maintained `ddgs <https://pypi.org/project/ddgs/>`_ package (the
renamed successor of ``duckduckgo-search``) scrapes DuckDuckGo's HTML results
page and normalizes them. It implements ``WebSearchProvider`` only; there is
no extract capability.
Configuration::
# No API key required. Enable by installing the package and pointing the
# web backend at ddgs:
pip install ddgs
# ~/.hermes/config.yaml
web:
search_backend: "ddgs"
extract_backend: "firecrawl" # pair with an extract provider if needed
Rate limits are enforced server-side by DuckDuckGo. Expect intermittent
``DuckDuckGoSearchException`` / 202 responses under heavy use; this provider
surfaces them as ``{"success": False, "error": ...}`` rather than crashing
the tool call.
See https://duckduckgo.com/?q=duckduckgo+tos for terms of use.
"""
from __future__ import annotations
import logging
from typing import Any, Dict
from tools.web_providers.base import WebSearchProvider
logger = logging.getLogger(__name__)
class DDGSSearchProvider(WebSearchProvider):
"""Search via the ``ddgs`` package (DuckDuckGo HTML scrape).
No API key required. The provider is considered "configured" when the
``ddgs`` package is importable; there is nothing else to set up.
"""
def provider_name(self) -> str:
return "ddgs"
def is_configured(self) -> bool:
"""Return True when the ``ddgs`` package is importable.
Called at tool-registration time; must not perform network I/O.
"""
try:
import ddgs # noqa: F401
return True
except ImportError:
return False
def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
"""Execute a DuckDuckGo search and return normalized results.
Returns ``{"success": True, "data": {"web": [...]}}`` on success or
``{"success": False, "error": str}`` on failure (missing package,
rate-limited, network error, etc.).
"""
try:
from ddgs import DDGS # type: ignore
except ImportError:
return {
"success": False,
"error": "ddgs package is not installed — run `pip install ddgs`",
}
# DDGS().text yields at most `max_results` items; we cap defensively
# in case the package ignores the hint.
safe_limit = max(1, int(limit))
try:
web_results = []
with DDGS() as client:
for i, hit in enumerate(client.text(query, max_results=safe_limit)):
if i >= safe_limit:
break
url = str(hit.get("href") or hit.get("url") or "")
web_results.append(
{
"title": str(hit.get("title", "")),
"url": url,
"description": str(hit.get("body", "")),
"position": i + 1,
}
)
except Exception as exc: # noqa: BLE001 — ddgs raises its own exceptions
logger.warning("DDGS search error: %s", exc)
return {"success": False, "error": f"DuckDuckGo search failed: {exc}"}
logger.info("DDGS search '%s': %d results (limit %d)", query, len(web_results), limit)
return {"success": True, "data": {"web": web_results}}
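`is_configured` above detects the package by actually importing it. An alternative, which is not what this diff does, is to ask the import system whether a module can be located without executing it. The sketch below uses the standard-library `importlib.util.find_spec`; `package_importable` is a hypothetical helper name.

```python
import importlib.util


def package_importable(name: str) -> bool:
    """Return True when a top-level module named ``name`` can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ValueError is raised when sys.modules holds an entry without a spec.
        return False


has_json = package_importable("json")
has_bogus = package_importable("definitely_not_a_real_package_xyz")
print(has_json, has_bogus)
```

One trade-off to keep in mind: `find_spec` does not go through `builtins.__import__`, so tests that block imports by patching `__import__` (as the ddgs tests in this diff do) would likely not affect it.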
+3 -2
@@ -5,10 +5,11 @@ It implements ``WebSearchProvider`` only — there is no extract capability.
Configuration::
-# ~/.hermes/config.yaml (SEARXNG_URL is a URL, not a secret — use config.yaml not .env)
-SEARXNG_URL: http://localhost:8080
+# ~/.hermes/.env
+SEARXNG_URL=http://localhost:8080
# Use SearXNG for search, pair with any extract provider:
+# ~/.hermes/config.yaml
web:
search_backend: "searxng"
extract_backend: "firecrawl"
+61 -9
@@ -126,18 +126,22 @@ def _get_backend() -> str:
keys manually without running setup.
"""
configured = (_load_web_config().get("backend") or "").lower().strip()
-if configured in ("parallel", "firecrawl", "tavily", "exa", "searxng"):
+if configured in ("parallel", "firecrawl", "tavily", "exa", "searxng", "brave-free", "ddgs"):
return configured
# Fallback for manual / legacy config — pick the highest-priority
# available backend. Firecrawl also counts as available when the managed
# tool gateway is configured for Nous subscribers.
+# Free-tier backends (searxng / brave-free / ddgs) trail the paid ones so
+# existing paid setups are unaffected.
backend_candidates = (
("firecrawl", _has_env("FIRECRAWL_API_KEY") or _has_env("FIRECRAWL_API_URL") or _is_tool_gateway_ready()),
("parallel", _has_env("PARALLEL_API_KEY")),
("tavily", _has_env("TAVILY_API_KEY")),
("exa", _has_env("EXA_API_KEY")),
("searxng", _has_env("SEARXNG_URL")),
+("brave-free", _has_env("BRAVE_SEARCH_API_KEY")),
+("ddgs", _ddgs_package_importable()),
)
for backend, available in backend_candidates:
if available:
@@ -196,8 +200,27 @@ def _is_backend_available(backend: str) -> bool:
return _has_env("TAVILY_API_KEY")
if backend == "searxng":
return _has_env("SEARXNG_URL")
if backend == "brave-free":
return _has_env("BRAVE_SEARCH_API_KEY")
if backend == "ddgs":
return _ddgs_package_importable()
return False
def _ddgs_package_importable() -> bool:
"""Return True when the ``ddgs`` Python package can be imported.
ddgs is the only backend whose availability is driven by a package
presence rather than an env var / config entry. Wrapped in a helper
so auto-detect and ``_is_backend_available`` share the same check
(and tests can monkeypatch a single symbol).
"""
try:
import ddgs # noqa: F401
return True
except ImportError:
return False
# ─── Firecrawl Client ────────────────────────────────────────────────────────
_firecrawl_client = None
@@ -1200,6 +1223,26 @@ def web_search_tool(query: str, limit: int = 5) -> str:
_debug.save()
return result_json
if backend == "brave-free":
from tools.web_providers.brave_free import BraveFreeSearchProvider
response_data = BraveFreeSearchProvider().search(query, limit)
debug_call_data["results_count"] = len(response_data.get("data", {}).get("web", []))
result_json = json.dumps(response_data, indent=2, ensure_ascii=False)
debug_call_data["final_response_size"] = len(result_json)
_debug.log_call("web_search_tool", debug_call_data)
_debug.save()
return result_json
if backend == "ddgs":
from tools.web_providers.ddgs import DDGSSearchProvider
response_data = DDGSSearchProvider().search(query, limit)
debug_call_data["results_count"] = len(response_data.get("data", {}).get("web", []))
result_json = json.dumps(response_data, indent=2, ensure_ascii=False)
debug_call_data["final_response_size"] = len(result_json)
_debug.log_call("web_search_tool", debug_call_data)
_debug.save()
return result_json
if backend == "tavily":
logger.info("Tavily search: '%s' (limit: %d)", query, limit)
raw = _tavily_request("search", {
@@ -1350,11 +1393,12 @@ async def web_extract_tool(
"include_images": False,
})
results = _normalize_tavily_documents(raw, fallback_url=safe_urls[0] if safe_urls else "")
-elif backend == "searxng":
-# SearXNG is search-only — it cannot extract URL content
+elif backend in ("searxng", "brave-free", "ddgs"):
+# These backends are search-only — they cannot extract URL content
+_label = {"searxng": "SearXNG", "brave-free": "Brave Search (free tier)", "ddgs": "DuckDuckGo (ddgs)"}[backend]
return json.dumps({
"success": False,
-"error": "SearXNG is a search-only backend and cannot extract URL content. "
+"error": f"{_label} is a search-only backend and cannot extract URL content. "
"Set web.extract_backend to firecrawl, tavily, exa, or parallel.",
}, ensure_ascii=False)
else:
@@ -1732,10 +1776,11 @@ async def web_crawl_tool(
_debug.save()
return cleaned_result
-# SearXNG is search-only — it cannot crawl
-if backend == "searxng":
+# SearXNG / Brave Search (free tier) / DuckDuckGo (ddgs) are search-only — they cannot crawl
+if backend in ("searxng", "brave-free", "ddgs"):
+_label = {"searxng": "SearXNG", "brave-free": "Brave Search (free tier)", "ddgs": "DuckDuckGo (ddgs)"}[backend]
return json.dumps({
-"error": "SearXNG is a search-only backend and cannot crawl URLs. "
+"error": f"{_label} is a search-only backend and cannot crawl URLs. "
"Set FIRECRAWL_API_KEY for crawling, or use web_search instead.",
"success": False,
}, ensure_ascii=False)
@@ -2035,9 +2080,12 @@ def check_firecrawl_api_key() -> bool:
def check_web_api_key() -> bool:
"""Check whether the configured web backend is available."""
configured = _load_web_config().get("backend", "").lower().strip()
-if configured in ("exa", "parallel", "firecrawl", "tavily", "searxng"):
+if configured in ("exa", "parallel", "firecrawl", "tavily", "searxng", "brave-free", "ddgs"):
return _is_backend_available(configured)
-return any(_is_backend_available(backend) for backend in ("exa", "parallel", "firecrawl", "tavily", "searxng"))
+return any(
+_is_backend_available(backend)
+for backend in ("exa", "parallel", "firecrawl", "tavily", "searxng", "brave-free", "ddgs")
+)
def check_auxiliary_model() -> bool:
@@ -2074,6 +2122,10 @@ if __name__ == "__main__":
print(" Using Tavily API (https://tavily.com)")
elif backend == "searxng":
print(f" Using SearXNG (search only): {os.getenv('SEARXNG_URL', '').strip()}")
elif backend == "brave-free":
print(" Using Brave Search free tier (search only)")
elif backend == "ddgs":
print(" Using DuckDuckGo via ddgs package (search only)")
else:
if firecrawl_url_available:
print(f" Using self-hosted Firecrawl: {os.getenv('FIRECRAWL_API_URL').strip().rstrip('/')}")
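The auto-detect order introduced in `_get_backend` (paid providers first, then searxng, brave-free, and ddgs last) can be sketched as a pure function for experimentation. This is a simplified illustration: `detect_backend` and `has_env` are hypothetical names, the managed tool gateway check is omitted, and the `default` returned when nothing is available is an assumption, since the diff does not show that branch.

```python
from typing import Mapping


def has_env(env: Mapping[str, str], key: str) -> bool:
    """True when ``key`` is present in ``env`` with a non-blank value."""
    return bool((env.get(key) or "").strip())


def detect_backend(env: Mapping[str, str], ddgs_importable: bool, default: str = "firecrawl") -> str:
    """Return the first available backend in priority order."""
    candidates = (
        ("firecrawl", has_env(env, "FIRECRAWL_API_KEY") or has_env(env, "FIRECRAWL_API_URL")),
        ("parallel", has_env(env, "PARALLEL_API_KEY")),
        ("tavily", has_env(env, "TAVILY_API_KEY")),
        ("exa", has_env(env, "EXA_API_KEY")),
        ("searxng", has_env(env, "SEARXNG_URL")),
        ("brave-free", has_env(env, "BRAVE_SEARCH_API_KEY")),
        ("ddgs", ddgs_importable),
    )
    for name, available in candidates:
        if available:
            return name
    return default  # assumed fallback; the real function's behaviour is not shown


only_brave = detect_backend({"BRAVE_SEARCH_API_KEY": "k"}, ddgs_importable=False)
paid_wins = detect_backend({"TAVILY_API_KEY": "t", "BRAVE_SEARCH_API_KEY": "k"}, ddgs_importable=True)
last_resort = detect_backend({}, ddgs_importable=True)
print(only_brave, paid_wins, last_resort)
```

Passing the environment in as a mapping keeps the function testable without `monkeypatch`, at the cost of diverging from the module-level helpers the real code uses.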
+38 -12
@@ -1280,6 +1280,7 @@ def _get_usage(agent) -> dict:
"output": g("session_output_tokens", "session_completion_tokens"),
"cache_read": g("session_cache_read_tokens"),
"cache_write": g("session_cache_write_tokens"),
"reasoning": g("session_reasoning_tokens"),
"prompt": g("session_prompt_tokens"),
"completion": g("session_completion_tokens"),
"total": g("session_total_tokens"),
@@ -1725,21 +1726,46 @@ def _validate_personality(value: str, cfg: dict | None = None) -> tuple[str, str
def _apply_personality_to_session(
sid: str, session: dict, new_prompt: str
) -> tuple[bool, dict | None]:
+"""Apply a personality change to an existing session without resetting history.
+Updates the agent's ephemeral system prompt in-place so the new personality
+takes effect on the next turn. The cached base system prompt is left intact
+(ephemeral_system_prompt is appended at API-call time, not baked into the
+cache), which preserves prompt-cache hits.
+Also injects a ``[System: ...]`` marker message into the conversation history
+so the model knows to pivot its style from this point forward (without this,
+LLMs tend to continue the tone established by earlier messages in the transcript).
+Returns (history_reset, info); history_reset is always False since we
+preserve the conversation.
+"""
if not session:
return False, None
-try:
-info = _reset_session_agent(sid, session)
-return True, info
-except Exception:
-if session.get("agent"):
-agent = session["agent"]
-agent.ephemeral_system_prompt = new_prompt or None
-agent._cached_system_prompt = None
-info = _session_info(agent)
-_emit("session.info", sid, info)
-return False, info
-return False, None
+agent = session.get("agent")
+if agent:
+agent.ephemeral_system_prompt = new_prompt or None
+# Inject a pivot marker into history so the model sees the change point.
+# This prevents it from pattern-matching its prior style.
+if new_prompt:
+marker = (
+"[System: The user has changed the assistant's personality. "
+"From this point forward, adopt the following persona and respond "
+f"accordingly: {new_prompt}]"
+)
+else:
+marker = (
+"[System: The user has cleared the personality overlay. "
+"From this point forward, respond in your normal default style.]"
+)
+with session["history_lock"]:
+session["history"].append({"role": "user", "content": marker})
+session["history_version"] = int(session.get("history_version", 0)) + 1
+info = _session_info(agent)
+_emit("session.info", sid, info)
+return False, info
+return False, None
def _cfg_max_turns(cfg: dict, default: int) -> int:
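The marker-injection step in the hunk above can be read in isolation. The following is a minimal sketch, not the project's code: `build_pivot_marker` and `append_pivot_marker` are hypothetical helper names, and the `session` dict is a stand-in holding only the three keys the hunk touches. It keeps the `user` role that the hunk passes to `append`.

```python
from __future__ import annotations

import threading


def build_pivot_marker(new_prompt: str | None) -> str:
    """Return the marker text that announces a personality change (hypothetical helper)."""
    if new_prompt:
        return (
            "[System: The user has changed the assistant's personality. "
            "From this point forward, adopt the following persona and respond "
            f"accordingly: {new_prompt}]"
        )
    return (
        "[System: The user has cleared the personality overlay. "
        "From this point forward, respond in your normal default style.]"
    )


def append_pivot_marker(session: dict, new_prompt: str | None) -> None:
    """Append the marker and bump the history version while holding the session lock."""
    with session["history_lock"]:
        session["history"].append(
            {"role": "user", "content": build_pivot_marker(new_prompt)}
        )
        session["history_version"] = int(session.get("history_version", 0)) + 1


# Stand-in session; a real session dict carries more keys than these.
session = {"history_lock": threading.Lock(), "history": [], "history_version": 0}
append_pivot_marker(session, "pirate")  # set a personality
append_pivot_marker(session, "")        # clear it again
```

Appending and bumping `history_version` under one lock means a reader holding the same lock never sees the new message paired with the old version number.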
+14 -1
@@ -92,6 +92,19 @@ export const sessionCommands: SlashCommand[] = [
}
},
{
help: 'browse and resume previous sessions',
name: 'sessions',
run: (arg, ctx) => {
if (ctx.session.guardBusySessionSwitch('switch sessions')) {
return
}
if (!arg.trim()) {
return patchOverlayState({ picker: true })
}
}
},
{
help: 'attach an image',
name: 'image',
@@ -109,7 +122,7 @@ export const sessionCommands: SlashCommand[] = [
},
{
help: 'switch or reset personality (history reset on set)',
help: 'switch personality for this session',
name: 'personality',
run: (arg, ctx) => {
if (!arg) {
+2
@@ -164,9 +164,11 @@ export interface Usage {
context_max?: number
context_percent?: number
context_used?: number
cost_status?: string
cost_usd?: number
input: number
output: number
reasoning?: number
total: number
}
+81
@@ -0,0 +1,81 @@
"""Windows UTF-8 bootstrap for Hermes entrypoints.
On older Windows builds, Python may start with a locale codec such as cp1252.
That makes text-mode ``open()`` without ``encoding=`` and stdio defaults prone
to Unicode decode/encode failures. Hermes touches many files in long-running
processes, so we force UTF-8 mode at process start for CLI entrypoints.
"""
from __future__ import annotations
import os
import sys
_UTF8_REEXEC_GUARD = "_HERMES_UTF8_REEXEC"
def ensure_windows_utf8_mode(
*,
reexec: bool = True,
module: str | None = None,
entrypoint_markers: tuple[str, ...] | None = None,
) -> bool:
"""Ensure UTF-8 defaults on Windows.
Behavior:
- Always applies ``PYTHONUTF8=1`` and ``PYTHONIOENCODING=utf-8`` as defaults on Windows (values already present in the environment are kept).
- If Python is already in UTF-8 mode, returns immediately.
- Otherwise re-execs the current interpreter with ``-X utf8`` (once),
unless marker-gated or explicitly disabled via ``reexec=False``.
- When ``module=...`` is supplied, re-execs as ``python -m <module>`` and
forwards original user args (excluding argv0), which avoids Windows
console-script ``.exe`` wrappers being treated as Python scripts.
Returns ``True`` only when a re-exec is attempted and the exec call
unexpectedly returns (e.g. under a patched test double). In normal
operation ``os.execvpe`` never returns on success.
"""
if sys.platform != "win32":
return False
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
if getattr(sys.flags, "utf8_mode", 0) == 1:
return False
if not reexec:
return False
if os.environ.get(_UTF8_REEXEC_GUARD) == "1":
return False
if entrypoint_markers:
argv0 = ""
if getattr(sys, "argv", None):
argv0 = os.path.basename(str(sys.argv[0])).lower()
markers = tuple(marker.lower() for marker in entrypoint_markers if marker)
if markers and not any(marker in argv0 for marker in markers):
return False
executable = getattr(sys, "executable", None)
argv = list(getattr(sys, "argv", []))
if not executable:
return False
child_env = dict(os.environ)
child_env[_UTF8_REEXEC_GUARD] = "1"
child_argv = [executable, "-X", "utf8"]
if module:
child_argv.extend(["-m", module])
if len(argv) > 1:
child_argv.extend(argv[1:])
else:
child_argv.extend(argv)
try:
os.execvpe(executable, child_argv, child_env)
except OSError:
# Best-effort fallback: env vars remain set for child processes.
return False
# ``exec`` should not return on success.
return True
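The argv handling in the function above is the part most easily misread, so here it is as a pure function. This is a sketch under stated assumptions: `build_child_argv` is a hypothetical name, and the module name `pkg.cli` and the file names are placeholders, not Hermes entrypoints.

```python
from __future__ import annotations


def build_child_argv(
    executable: str, argv: list[str], module: str | None = None
) -> list[str]:
    """Build the argv for a ``-X utf8`` re-exec (hypothetical helper)."""
    child_argv = [executable, "-X", "utf8"]
    if module:
        # Re-exec as ``python -m <module>`` and drop argv[0], which on Windows
        # may be a console-script .exe wrapper rather than a Python script.
        child_argv.extend(["-m", module])
        if len(argv) > 1:
            child_argv.extend(argv[1:])
    else:
        # No module given: argv[0] is assumed to be a runnable script path.
        child_argv.extend(argv)
    return child_argv


# Placeholder names for illustration only.
with_module = build_child_argv("python.exe", ["hermes.exe", "chat"], module="pkg.cli")
without_module = build_child_argv("python.exe", ["run.py", "chat"])
```

With `module` set, the wrapper name in `argv[0]` never reaches the interpreter; without it, the original `argv` is forwarded unchanged after the `-X utf8` flag.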
+1
@@ -378,6 +378,7 @@ Multi-profile, multi-project collaboration board. Each install can host many boa
| `tail <id>` | Follow a task's event stream. |
| `dispatch` | One dispatcher pass on the active board. Flags: `--dry-run`, `--max N`, `--json`. |
| `context <id>` | Print the full context a worker would see (title + body + parent results + comments). |
| `specify <id>` / `specify --all` | Flesh out a triage-column task into a concrete spec (title + body with goal, approach, acceptance criteria) via the auxiliary LLM, then promote it to `todo`. Flags: `--tenant` (scope `--all` to one tenant), `--author`, `--json`. Configure the model under `auxiliary.triage_specifier` in `config.yaml`. |
| `gc` | Remove scratch workspaces for archived tasks. |
Examples:
+13
@@ -784,6 +784,7 @@ $ hermes model
[ ] title_generation currently: openrouter / google/gemini-3-flash-preview
[ ] compression currently: auto / main model
[ ] approval currently: auto / main model
[ ] triage_specifier currently: auto / main model
```
Select a task, pick a provider (OAuth flows open a browser; API-key providers prompt), pick a model. The change persists to `auxiliary.<task>.*` in `config.yaml`. Same machinery as the main-model picker — no extra syntax to learn.
@@ -880,6 +881,18 @@ auxiliary:
base_url: ""
api_key: ""
timeout: 30
# Kanban triage specifier — `hermes kanban specify <id>` (or the
# dashboard's ✨ Specify button on Triage-column cards) uses this
# slot to expand a one-liner into a concrete spec and promote the
# task to `todo`. Cheap fast models work well here; spec expansion
# is short and doesn't need reasoning depth.
triage_specifier:
provider: "auto"
model: ""
base_url: ""
api_key: ""
timeout: 120
```
:::tip
@@ -192,6 +192,7 @@ Hermes uses separate lightweight models for side tasks. Each task has its own pr
| MCP | MCP helper operations | `auxiliary.mcp` |
| Approval | Smart command-approval classification | `auxiliary.approval` |
| Title Generation | Session title summaries | `auxiliary.title_generation` |
| Triage Specifier | `hermes kanban specify` / dashboard ✨ button — fleshes out a one-liner triage task into a real spec | `auxiliary.triage_specifier` |
### Auto-Detection Chain
@@ -384,5 +385,6 @@ See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configurat
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
| Approval classification | Auto-detection chain | `auxiliary.approval` |
| Title generation | Auto-detection chain | `auxiliary.title_generation` |
| Triage specifier | Auto-detection chain | `auxiliary.triage_specifier` |
| Delegation | Provider override only (no automatic fallback) | `delegation.provider` / `delegation.model` |
| Cron jobs | Per-job provider override only (no automatic fallback) | Per-job `provider` / `model` |
@@ -22,7 +22,7 @@ Throughout the tutorial, **code blocks labelled `bash` are commands *you* run.**
Six columns, left to right:
- **Triage** — raw ideas, a specifier will flesh out the spec before anyone works on them.
- **Triage** — raw ideas, a specifier will flesh out the spec before anyone works on them. Click the **✨ Specify** button on any triage card (or run `hermes kanban specify <id>` / `/kanban specify <id>` from a chat) to have the auxiliary LLM turn a one-liner into a full spec (goal, approach, acceptance criteria) and promote it to `todo` in one shot. Configure which model runs it under `auxiliary.triage_specifier` in `config.yaml`.
- **Todo** — created but waiting on dependencies, or not yet assigned.
- **Ready** — assigned and waiting for the dispatcher to claim.
- **In progress** — a worker is actively running the task. With "Lanes by profile" on (the default), this column sub-groups by assignee so you can see at a glance what each worker is doing.
+8 -3
@@ -442,7 +442,7 @@ hermes dashboard # "Kanban" tab appears in the nav, after "Skills"
### What the plugin gives you
- A **Kanban** tab showing one column per status: `triage`, `todo`, `ready`, `running`, `blocked`, `done` (plus `archived` when the toggle is on).
- `triage` is the parking column for rough ideas a specifier is expected to flesh out. Tasks created with `hermes kanban create --triage` (or via the Triage column's inline create) land here and the dispatcher leaves them alone until a human or specifier promotes them to `todo` / `ready`.
- `triage` is the parking column for rough ideas a specifier is expected to flesh out. Tasks created with `hermes kanban create --triage` (or via the Triage column's inline create) land here and the dispatcher leaves them alone until a human or specifier promotes them to `todo` / `ready`. Run `hermes kanban specify <id>` to have the auxiliary LLM expand a triage task into a concrete spec (title + body with goal, approach, acceptance criteria) and promote it to `todo` in one shot; `--all` sweeps every triage task at once. Configure which model runs the specifier under `auxiliary.triage_specifier` in `config.yaml`.
- Cards show the task id, title, priority badge, tenant tag, assigned profile, comment/link counts, a **progress pill** (`N/M` children done when the task has dependents), and "created N ago". A per-card checkbox enables multi-select.
- **Per-profile lanes inside Running** — toolbar checkbox toggles sub-grouping of the Running column by assignee.
- **Live updates via WebSocket** — the plugin tails the append-only `task_events` table on a short poll interval; the board reflects changes the instant any profile (CLI, gateway, or another dashboard tab) acts. Reloads are debounced so a burst of events triggers a single refetch.
@@ -454,7 +454,7 @@ hermes dashboard # "Kanban" tab appears in the nav, after "Skills"
- **Editable assignee / priority** — click the meta row to rewrite.
- **Editable description** — markdown-rendered by default (headings, bold, italic, inline code, fenced code, `http(s)` / `mailto:` links, bullet lists), with an "edit" button that swaps in a textarea. Markdown rendering is a tiny, XSS-safe renderer — every substitution runs on HTML-escaped input, only `http(s)` / `mailto:` links pass through, and `target="_blank"` + `rel="noopener noreferrer"` are always set.
- **Dependency editor** — chip list of parents and children, each with an `×` to unlink, plus dropdowns over every other task to add a new parent or child. Cycle attempts are rejected server-side with a clear message.
- **Status action row** (→ triage / → ready / → running / block / unblock / complete / archive) with confirm prompts for destructive transitions.
- **Status action row** (→ triage / → ready / → running / block / unblock / complete / archive) with confirm prompts for destructive transitions. For cards in the **Triage** column the row also exposes a **✨ Specify** button that calls the auxiliary LLM (`auxiliary.triage_specifier` in `config.yaml`) to expand the one-liner into a concrete spec (title + body with goal, approach, acceptance criteria) and promote the task to `todo`. The same behaviour is reachable from the CLI (`hermes kanban specify <id>` / `--all`), from any gateway platform (`/kanban specify <id>`), and programmatically via `POST /api/plugins/kanban/tasks/:id/specify`.
- Result section (also markdown-rendered), comment thread with Enter-to-submit, the last 20 events.
- **Toolbar filters** — free-text search, tenant dropdown (defaults to `dashboard.kanban.default_tenant` from `config.yaml`), assignee dropdown, "show archived" toggle, "lanes by profile" toggle, and a **Nudge dispatcher** button so you don't have to wait for the next 60 s tick.
@@ -496,6 +496,7 @@ All routes are mounted under `/api/plugins/kanban/` and protected by the dashboa
| `PATCH` | `/tasks/:id` | Status / assignee / priority / title / body / result |
| `POST` | `/tasks/bulk` | Apply the same patch (status / archive / assignee / priority) to every id in `ids`. Per-id failures reported without aborting siblings |
| `POST` | `/tasks/:id/comments` | Append a comment |
| `POST` | `/tasks/:id/specify` | Run the triage specifier — auxiliary LLM fleshes out the task body and promotes it from `triage` to `todo`. Returns `{ok, task_id, reason, new_title}`; `ok=false` with a human-readable reason on "not in triage" / no aux client / LLM error is a 200, not a 4xx |
| `POST` | `/links` | Add a dependency (`parent_id` → `child_id`) |
| `DELETE` | `/links?parent_id=…&child_id=…` | Remove a dependency |
| `POST` | `/dispatch?max=…&dry_run=…` | Nudge the dispatcher — skip the 60 s wait |
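Because the `/tasks/:id/specify` row reports soft failures as a 200 with `ok=false`, a client has to inspect the body rather than rely on the status code. The sketch below assumes only the response keys listed in that row (`ok`, `task_id`, `reason`, `new_title`); `summarize_specify_response` is a hypothetical helper, the sample values are invented, and no HTTP client or authentication scheme is implied.

```python
import json


def summarize_specify_response(status_code: int, body: str) -> str:
    """Turn a specify response into a one-line summary (hypothetical helper)."""
    if status_code != 200:
        # Anything other than 200 is treated as a request-level failure.
        return f"request failed (HTTP {status_code})"
    payload = json.loads(body)
    if payload.get("ok"):
        return f"{payload.get('task_id')} promoted to todo as {payload.get('new_title')!r}"
    # ok=false still arrives as a 200; the reason field explains why.
    return f"{payload.get('task_id')} not specified: {payload.get('reason')}"


# Invented sample bodies for illustration.
ok_body = '{"ok": true, "task_id": "t_abcd", "reason": "", "new_title": "Add retry logic"}'
soft_fail_body = '{"ok": false, "task_id": "t_abcd", "reason": "not in triage", "new_title": null}'
print(summarize_specify_response(200, ok_body))
print(summarize_specify_response(200, soft_fail_body))
```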
@@ -588,6 +589,8 @@ hermes kanban notify-list [<id>] [--json]
hermes kanban notify-unsubscribe <id>
--platform <name> --chat-id <id> [--thread-id <id>]
hermes kanban context <id> # what a worker sees
hermes kanban specify [<id> | --all] [--tenant T] # flesh out a triage-column idea
[--author NAME] [--json] # into a full spec and promote to todo
hermes kanban gc [--event-retention-days N] # workspaces + old events + old logs
[--log-retention-days N]
```
@@ -605,6 +608,8 @@ Every `hermes kanban <action>` verb is also reachable as `/kanban <action>` —
/kanban comment t_abcd "looks good, ship it"
/kanban unblock t_abcd
/kanban dispatch --max 3
/kanban specify t_abcd # flesh out a triage one-liner into a real spec
/kanban specify --all --tenant engineering # sweep every triage task in one tenant
```
Quote multi-word arguments the same way you would on a shell — `run_slash` parses the rest of the line with `shlex.split`, so `"..."` and `'...'` both work.
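The quoting rule is easy to check with the standard library directly. This is a small illustration of `shlex.split` on the kind of argument strings shown above; `split_slash_args` is a hypothetical wrapper, and how `run_slash` dispatches the resulting tokens is not shown here.

```python
import shlex


def split_slash_args(rest: str) -> list[str]:
    """Split the text after `/kanban` into shell-style tokens (hypothetical wrapper)."""
    return shlex.split(rest)


double_quoted = split_slash_args('comment t_abcd "looks good, ship it"')
single_quoted = split_slash_args("specify --all --tenant 'platform team'")

print(double_quoted)  # ['comment', 't_abcd', 'looks good, ship it']
print(single_quoted)  # ['specify', '--all', '--tenant', 'platform team']
```

Both quote styles collapse a multi-word argument into a single token, which is why the comment text arrives intact.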
@@ -658,7 +663,7 @@ The board supports these eight patterns without any new primitives:
| **P6 `@mention`** | inline routing from prose | `@reviewer look at this` |
| **P7 Thread-scoped workspace** | `/kanban here` in a thread | per-project gateway threads |
| **P8 Fleet farming** | one profile, N subjects | 50 social accounts |
| **P9 Triage specifier** | rough idea → `triage` → specifier expands body → `todo` | "turn this one-liner into a spec' task" |
| **P9 Triage specifier** | rough idea → `triage` → `hermes kanban specify` expands body → `todo` | "turn this one-liner into a spec'd task" |
For worked examples of each, see `docs/hermes-kanban-v1-spec.pdf`.
+11 -4
@@ -148,8 +148,15 @@ You should see something like `10 results`. If you get a `403 Forbidden`, JSON f
**7. Configure Hermes:**
```bash
# ~/.hermes/config.yaml
SEARXNG_URL: http://localhost:8888
# ~/.hermes/.env
SEARXNG_URL=http://localhost:8888
```
Then select SearXNG as the search backend in `~/.hermes/config.yaml`:
```yaml
web:
search_backend: "searxng"
```
Or set via `hermes tools` → Web Search & Extract → SearXNG.
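To sanity-check the value placed in `~/.hermes/.env`, it can help to see what a JSON search request against that base URL looks like. The sketch below only builds the URL; it assumes the instance exposes SearXNG's `/search` endpoint with `format=json` enabled, and `searxng_search_url` is a hypothetical helper rather than how Hermes itself issues the request.

```python
from __future__ import annotations

import os
from urllib.parse import urlencode


def searxng_search_url(query: str, base_url: str | None = None) -> str:
    """Build a JSON search URL from SEARXNG_URL (hypothetical helper)."""
    base = (base_url or os.environ.get("SEARXNG_URL", "")).strip().rstrip("/")
    if not base:
        raise ValueError("SEARXNG_URL is not set")
    # format=json must be enabled on the instance, otherwise it answers 403.
    return f"{base}/search?{urlencode({'q': query, 'format': 'json'})}"


url = searxng_search_url("hermes agent", base_url="http://localhost:8888/")
print(url)  # http://localhost:8888/search?q=hermes+agent&format=json
```

Stripping the trailing slash first avoids a doubled `//search` path when the configured URL ends in `/`.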
@@ -161,8 +168,8 @@ Or set via `hermes tools` → Web Search & Extract → SearXNG.
Public SearXNG instances are listed at [searx.space](https://searx.space/). Filter by instances that have **JSON format enabled** (shown in the table).
```bash
# ~/.hermes/config.yaml
SEARXNG_URL: https://searx.example.com
# ~/.hermes/.env
SEARXNG_URL=https://searx.example.com
```
:::caution Public instances