Compare commits

17 Commits

Author SHA1 Message Date
Brooklyn Nicholson 0b5bb9f0b5 fix(windows): bootstrap utf-8 mode at entrypoints
Force UTF-8 defaults on legacy Windows by re-execing Hermes entrypoints with -X utf8, preventing locale codec crashes from implicit text encoding in file and stdio paths.
2026-05-07 22:43:17 -04:00
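The re-exec idea in this commit can be sketched roughly as below. This is an illustration only, not the contents of utf8_bootstrap: the helper name utf8_reexec_argv and its parameters are assumptions made for the example. It only computes the command line a bootstrap could relaunch with. The -X utf8 option and sys.flags.utf8_mode are standard CPython features.

```python
from typing import List, Optional, Sequence


def utf8_reexec_argv(
    argv: Sequence[str],
    executable: str,
    platform: str,
    utf8_mode_enabled: bool,
) -> Optional[List[str]]:
    """Return the argv to relaunch with, or None when no re-exec is needed.

    Hypothetical helper: the real bootstrap may apply different checks.
    """
    # Only Windows is targeted by the workaround described in the commit.
    if platform != "win32":
        return None
    # Already in UTF-8 mode (for example after the re-exec), so do not loop.
    if utf8_mode_enabled:
        return None
    # Insert "-X utf8" between the interpreter and the original arguments.
    return [executable, "-X", "utf8", *argv]
```

A caller would pass sys.argv, sys.executable, sys.platform and bool(sys.flags.utf8_mode), and relaunch only when the result is not None.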
Brooklyn Nicholson 31e3bdee99 fix(windows): harden native CLI and TUI bootstrap
Handle native Windows dependency edge cases by avoiding npm.ps1 execution-policy failures, persisting managed Node resolution, and validating runtime imports per platform.
2026-05-07 22:04:42 -04:00
helix4u faa13e49f8 docs(web): fix SearXNG env configuration 2026-05-07 17:54:47 -07:00
Teknium 1bdacb697c chore(release): add BennetYrWang to AUTHOR_MAP 2026-05-07 17:47:22 -07:00
BennetYrWang 34f7297359 Serialize Hermes config access 2026-05-07 17:47:22 -07:00
Teknium 307c85e5c1 fix(goals): auto-pause when judge model returns unparseable output
Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose
when asked for the strict {done, reason} JSON verdict. The old code
failed-open to continue on every such turn, burning the entire turn
budget with log lines like

  judge returned empty response
  judge reply was not JSON: "Let me analyze whether the goal..."

and /goal clear could not stop it mid-loop without /stop.

After N=3 consecutive *parse* failures (transport/API errors don't
count — those are transient), the loop auto-pauses and prints:

  ⏸ Goal paused — the judge model (3 turns) isn't returning the
  required JSON verdict. Route the judge to a stricter model in
  ~/.hermes/config.yaml:
    auxiliary:
      goal_judge:
        provider: openrouter
        model: google/gemini-3-flash-preview
  Then /goal resume to continue.

The counter resets on any usable reply (both "done"/"continue" and
API errors) and persists across GoalManager reloads so cross-session
resumes carry the correct state.

Also fixes test_goal_verdict_send.py sharing a hardcoded session_id
across tests — the shared id only worked because the previous
_post_turn_goal_continuation was a never-awaited coroutine. Now that
PR #19160 made it properly awaited, the xdist test-leakage bug
surfaced. Each test gets a unique session_id via uuid suffix.
2026-05-07 17:33:09 -07:00
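The counting rule described in this commit message can be sketched as follows. The names JudgeFailureTracker, parse_verdict and record_api_error are hypothetical, and the sketch assumes that turns below the threshold still continue; the actual GoalManager code may differ.

```python
import json
from dataclasses import dataclass
from typing import Optional

MAX_CONSECUTIVE_PARSE_FAILURES = 3  # N=3 in the commit message


def parse_verdict(raw: str) -> Optional[bool]:
    """Return the 'done' flag of a {done, reason} reply, or None if unparseable."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if isinstance(data, dict) and isinstance(data.get("done"), bool):
        return data["done"]
    return None


@dataclass
class JudgeFailureTracker:
    consecutive_parse_failures: int = 0

    def record_reply(self, raw: str) -> str:
        """Classify one judge reply as 'done', 'continue' or 'pause'."""
        done = parse_verdict(raw)
        if done is None:
            self.consecutive_parse_failures += 1
            if self.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
                return "pause"
            return "continue"  # assumption: keep going below the threshold
        self.consecutive_parse_failures = 0  # any usable verdict resets the streak
        return "done" if done else "continue"

    def record_api_error(self) -> str:
        """Transport/API errors are treated as transient and reset the streak."""
        self.consecutive_parse_failures = 0
        return "continue"
```

Persisting consecutive_parse_failures alongside the goal state is what would let a resumed session carry the streak forward, as the message describes.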
JC 03ddff8897 fix(gateway): defer goal status notices until after response delivery
Route goal status notices through the platform adapter send API and register post-delivery callbacks so completed-goal notices appear after the final assistant response. Also cancel queued synthetic goal continuations on /goal pause and /goal clear while preserving normal queued user messages.
2026-05-07 17:33:09 -07:00
Teknium 7d66d30d77 feat(kanban): add tooltips and docs link across dashboard (#21541)
Makes first-time use of the kanban view self-explanatory. Every control
that wasn't already labelled now has a `title` tooltip describing what
it does, and a `?` icon next to the board switcher opens the kanban
docs page in a new tab.

Coverage:
- BoardSwitcher: board select, + New board button, docs-link icon
  (both compact and full variants)
- BoardToolbar: Search, Tenant, Assignee, Show archived, Nudge
  dispatcher, Refresh
- BulkActionBar: → ready, Complete, Archive, reassign group, Apply,
  Clear
- Column header: hovering the header now surfaces COLUMN_HELP as a
  tooltip in addition to the visible sub-text; column count also
  labelled
- Card: task id, priority badge, tenant badge, assignee/unassigned,
  comment count, link count, age timestamp
- InlineCreate: assignee, priority, parent-task selectors

Closes the community feedback from @CharlieDePew asking for tooltips
and a docs link in the kanban view.

Relevant docs page:
https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
2026-05-07 16:13:27 -07:00
Austin Pickett 7f92e5506e Merge pull request #20942 from NousResearch/austin/fix/personality
fix(tui): preserve session when switching personality
2026-05-07 18:54:29 -04:00
Austin Pickett b0393af38c Merge pull request #20805 from NousResearch/austin-feat-sessions-skills-menu
feat(tui): add /sessions slash command for browsing and resuming previous sessions
2026-05-07 18:54:16 -04:00
teknium1 7f369bfe55 chore(release): add hllqkb to AUTHOR_MAP for PR #21288 salvage 2026-05-07 15:21:34 -07:00
hllqkb c80fa728bd fix(installer): set UV_NO_CONFIG=1 to avoid permission denied under sudo -u
When the installer is run via sudo -u, uv resolves config file
paths against the process owner's (root) home directory rather than the
effective user's, causing a Permission denied error when trying to read
/root/uv.toml.

Setting UV_NO_CONFIG=1 prevents uv from discovering any config files
(uv.toml, pyproject.toml) during installation, which is the correct
behavior for a bootstrap script that manages its own environment.

Fixes #21269
2026-05-07 15:21:34 -07:00
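As a rough illustration of the fix, a bootstrap script written in Python could build the child environment as shown here. The helper name uv_bootstrap_env is an assumption for the example; the installer touched by this commit may well be a shell script that simply exports the variable.

```python
import os
from typing import Dict, Mapping, Optional


def uv_bootstrap_env(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and disable uv config-file discovery.

    Hypothetical helper; only the UV_NO_CONFIG=1 setting comes from the commit.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["UV_NO_CONFIG"] = "1"
    return env
```

The returned mapping would be passed as env= to subprocess.run when invoking uv, leaving the parent process environment untouched.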
teknium 292f468366 fix(mcp): unwrap platforms key in channels_list
channels_list was iterating directory.items() directly, yielding
("updated_at", str) and ("platforms", dict) pairs — neither passed
the isinstance(entries_list, list) check, so the inner loop never ran
and every call returned count=0 even when channel_directory.json was
populated.

The writer (gateway/channel_directory.py) wraps the payload as
{"updated_at": ..., "platforms": {...}}; every other reader in the
codebase unwraps via directory.get("platforms", {}). This aligns
channels_list with that convention.

Also tightens the existing test_channels_with_directory test, which
bypassed the bug by asserting against _load_channel_directory() directly
instead of calling channels_list. It now calls the tool end-to-end and
a new test_channels_with_directory_platform_filter covers the filter
path. Both tests fail against the pre-fix code.

Closes #21474

Co-authored-by: chrisworksai <262485129+chrisworksai@users.noreply.github.com>
2026-05-07 13:41:16 -07:00
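The bug and the fix can be reproduced with a small stand-in. The function names and the entry fields ("id", "name") below are hypothetical; only the {"updated_at": ..., "platforms": {...}} wrapper shape and the directory.get("platforms", {}) convention come from the commit message.

```python
from typing import Any, Dict, List, Optional


def count_without_unwrap(directory: Dict[str, Any]) -> int:
    """Pre-fix behaviour: iterate the wrapper itself, so no value is a list."""
    count = 0
    for _, entries in directory.items():
        if isinstance(entries, list):
            count += len(entries)
    return count


def list_channels(directory: Dict[str, Any], platform: Optional[str] = None) -> List[Dict[str, Any]]:
    """Post-fix behaviour: unwrap the "platforms" key before iterating."""
    results: List[Dict[str, Any]] = []
    for name, entries in directory.get("platforms", {}).items():
        if platform and name != platform:
            continue
        if not isinstance(entries, list):
            continue
        for entry in entries:
            results.append({"platform": name, **entry})
    return results


directory = {
    "updated_at": "2026-05-07T13:00:00",
    "platforms": {
        "discord": [{"id": "1", "name": "general"}],
        "telegram": [{"id": "2", "name": "ops"}],
    },
}
```

With this data the unwrapped reader sees both channels, while the pre-fix loop finds none.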
Austin Pickett d87c7b99e2 fix(analytics): prevent silent token loss and add Claude 4.5–4.7 pricing (#21455)
- Add pricing entries for Claude Opus 4.5/4.6/4.7, Sonnet 4.5/4.6, and
  Haiku 4.5 with updated source URLs (platform.claude.com)
- Add _normalize_anthropic_model_name() to handle dot-notation variants
  (e.g. claude-opus-4.7 → claude-opus-4-7) for pricing lookups
- Fix silent token loss: ensure session row exists before UPDATE in both
  run_agent.py and hermes_state.py (INSERT OR IGNORE is idempotent)
- Log token persistence failures at DEBUG level instead of swallowing
  them silently — makes undercounted analytics diagnosable
- Surface reasoning tokens in CLI /usage and TUI usage panel
- Add 'reasoning' and 'cost_status' fields to TUI Usage type
2026-05-07 13:24:31 -07:00
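The "ensure the row exists before UPDATE" pattern mentioned above might look like this in SQLite. The table and column names are invented for the example and are not taken from hermes_state.py.

```python
import sqlite3


def add_tokens(conn: sqlite3.Connection, session_id: str, input_tokens: int, output_tokens: int) -> None:
    """Accumulate token counts, creating the session row if it is missing."""
    with conn:  # commits on success, rolls back on error
        # Idempotent: does nothing when the row already exists.
        conn.execute("INSERT OR IGNORE INTO sessions (id) VALUES (?)", (session_id,))
        # Without the row above, this UPDATE would match zero rows and the
        # tokens would be lost without any error being raised.
        conn.execute(
            "UPDATE sessions SET input_tokens = input_tokens + ?, "
            "output_tokens = output_tokens + ? WHERE id = ?",
            (input_tokens, output_tokens, session_id),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, "
    "input_tokens INTEGER NOT NULL DEFAULT 0, "
    "output_tokens INTEGER NOT NULL DEFAULT 0)"
)
add_tokens(conn, "s1", 100, 20)
add_tokens(conn, "s1", 50, 5)
row = conn.execute(
    "SELECT input_tokens, output_tokens FROM sessions WHERE id = ?", ("s1",)
).fetchone()
```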
Teknium cff821e2dc docs: register triage_specifier in the aux-models enumerations (#21494)
The kanban specifier landed in #21435 with feature-page docs (the
kanban page itself + the CLI reference table), but three other docs
pages enumerate every auxiliary task slot and were missed:

  user-guide/configuration.md            Auxiliary Models section —
                                         interactive picker example
                                         + full auxiliary config
                                         reference YAML block.
  user-guide/features/fallback-providers.md
                                         Both 'Auxiliary Tasks' and
                                         'Fallback Reference' tables.
  user-guide/features/kanban-tutorial.md
                                         Triage-column bullet now
                                         mentions the Specify
                                         button + CLI + slash command.

No other docs enumerate the aux task slots (verified with
grep -r 'title_generation\|auxiliary.session_search' website/docs/).
2026-05-07 13:07:18 -07:00
Austin Pickett 65c762b2e8 fix(tui): preserve session when switching personality
Previously, /personality in the TUI called _reset_session_agent() which
destroyed the agent, cleared conversation history, and effectively started
a new session. This made personality switching disruptive — users lost
their entire conversation context.

Now /personality updates the agent's ephemeral_system_prompt in-place and
injects a pivot marker into the conversation history. The marker tells
the model to adopt the new persona from that point forward, which is
necessary because LLMs tend to pattern-match their prior responses and
continue the established tone without an explicit signal.

Changes:
- tui_gateway/server.py: Rewrite _apply_personality_to_session to update
  the agent in-place instead of resetting. Inject a user-role pivot
  marker so the model actually switches style mid-conversation.
- ui-tui/src/app/slash/commands/session.ts: Update help text (no longer
  mentions history reset).
- tests/test_tui_gateway_server.py: Update test to verify history is
  preserved, pivot marker is injected, and ephemeral prompt is set.
2026-05-06 19:30:46 -04:00
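A minimal sketch of the in-place switch described in this commit, using a stand-in agent object. StubAgent, apply_personality and the marker wording are assumptions; the real _apply_personality_to_session in tui_gateway/server.py is not shown in this excerpt.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubAgent:
    """Stand-in for the real agent; only the fields this sketch needs."""
    ephemeral_system_prompt: str = ""
    history: List[Dict[str, str]] = field(default_factory=list)


def apply_personality(agent: StubAgent, name: str, prompt: str) -> None:
    """Switch persona without discarding the conversation."""
    # Update the prompt on the existing agent instead of rebuilding it.
    agent.ephemeral_system_prompt = prompt
    # Append a user-role pivot marker so the model has an explicit signal
    # to stop imitating the tone of its earlier replies.
    agent.history.append({
        "role": "user",
        "content": f"[Personality switched to '{name}'. Adopt this persona from now on.]",
    })


agent = StubAgent(history=[
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
])
apply_personality(agent, "pirate", "You are a cheerful pirate.")
```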
Austin Pickett 09a491464c feat(tui): add /sessions slash command for browsing and resuming previous sessions 2026-05-06 11:58:53 -04:00
37 changed files with 1701 additions and 281 deletions
+8
@@ -17,7 +17,15 @@ import asyncio
import logging
import sys
from pathlib import Path
from hermes_constants import get_hermes_home
from utf8_bootstrap import ensure_windows_utf8_mode
# Ensure ACP stdio/file defaults are UTF-8 on legacy Windows builds.
ensure_windows_utf8_mode(
module="acp_adapter.entry",
entrypoint_markers=("hermes-acp", "entry.py"),
)
# Methods clients send as periodic liveness probes. They are not part of the
+159 -14
@@ -1,5 +1,6 @@
from __future__ import annotations
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
@@ -82,6 +83,121 @@ _UTC_NOW = lambda: datetime.now(timezone.utc)
# Official docs snapshot entries. Models whose published pricing and cache
# semantics are stable enough to encode exactly.
_OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
# ── Anthropic Claude 4.7 ─────────────────────────────────────────────
# Opus 4.5/4.6/4.7 share $5/$25 pricing (new tokenizer, up to 35% more
# tokens for the same text).
# Source: https://platform.claude.com/docs/en/about-claude/pricing
(
"anthropic",
"claude-opus-4-7",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-opus-4-7-20250507",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.6 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-6",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-opus-4-6-20250414",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-6",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-6-20250414",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.5 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-5",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-5",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-haiku-4-5",
): PricingEntry(
input_cost_per_million=Decimal("1.00"),
output_cost_per_million=Decimal("5.00"),
cache_read_cost_per_million=Decimal("0.10"),
cache_write_cost_per_million=Decimal("1.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4 / 4.1 ─────────────────────────────────────────
(
"anthropic",
"claude-opus-4-20250514",
@@ -91,8 +207,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-prompt-caching-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -103,8 +219,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-prompt-caching-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# OpenAI
(
@@ -184,7 +300,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://openai.com/api/pricing/",
pricing_version="openai-pricing-2026-03-16",
),
# Anthropic older models (pre-4.6 generation)
# ── Anthropic older models (pre-4.5 generation) ────────────────────────
(
"anthropic",
"claude-3-5-sonnet-20241022",
@@ -194,8 +310,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -206,8 +322,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -218,8 +334,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -230,8 +346,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.03"),
cache_write_cost_per_million=Decimal("0.30"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# DeepSeek
(
@@ -426,8 +542,37 @@ def resolve_billing_route(
return BillingRoute(provider=provider_name or "unknown", model=model.split("/")[-1] if model else "", base_url=base_url or "", billing_mode="unknown")
def _normalize_anthropic_model_name(model: str) -> str:
"""Normalize Anthropic model name variants to canonical form.
Handles:
- Dot notation: claude-opus-4.7 → claude-opus-4-7
- Short aliases: claude-opus-4.7 → claude-opus-4-7
- Strips anthropic/ prefix if present
"""
name = model.lower().strip()
if name.startswith("anthropic/"):
name = name[len("anthropic/"):]
# Normalize dots to dashes in version numbers (e.g. 4.7 → 4-7, 4.6 → 4-6)
# But preserve the rest of the name structure
name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
return name
def _lookup_official_docs_pricing(route: BillingRoute) -> Optional[PricingEntry]:
return _OFFICIAL_DOCS_PRICING.get((route.provider, route.model.lower()))
model = route.model.lower()
# Direct lookup first
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, model))
if entry:
return entry
# Try normalized name for Anthropic (handles dot-notation like opus-4.7)
if route.provider == "anthropic":
normalized = _normalize_anthropic_model_name(model)
if normalized != model:
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, normalized))
if entry:
return entry
return None
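The two-step lookup above (exact key first, then the normalized Anthropic name) can be exercised in isolation. The one-entry PRICES table and the unprefixed function names are stand-ins for this example; the regular expression is the same one used in the hunk.

```python
import re
from typing import Dict, Optional, Tuple

# Stand-in for _OFFICIAL_DOCS_PRICING, reduced to an input price string.
PRICES: Dict[Tuple[str, str], str] = {("anthropic", "claude-opus-4-7"): "5.00"}


def normalize_anthropic_model_name(model: str) -> str:
    name = model.lower().strip()
    if name.startswith("anthropic/"):
        name = name[len("anthropic/"):]
    # Rewrites every digits.digits pair, e.g. 4.7 becomes 4-7.
    return re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)


def lookup_price(provider: str, model: str) -> Optional[str]:
    key = model.lower()
    hit = PRICES.get((provider, key))
    if hit is None and provider == "anthropic":
        hit = PRICES.get((provider, normalize_anthropic_model_name(key)))
    return hit
```

Note that the substitution applies to any dotted digit pair in the name, so it also turns claude-3.5-sonnet into claude-3-5-sonnet, while dated identifiers without dots pass through unchanged.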
def _openrouter_pricing_entry(route: BillingRoute) -> Optional[PricingEntry]:
+10
@@ -34,6 +34,8 @@ from pathlib import Path
from datetime import datetime
from typing import List, Dict, Any, Optional
from utf8_bootstrap import ensure_windows_utf8_mode
logger = logging.getLogger(__name__)
# Suppress startup messages for clean CLI experience
@@ -7991,6 +7993,7 @@ class HermesCLI:
output_tokens = getattr(agent, "session_output_tokens", 0) or 0
cache_read_tokens = getattr(agent, "session_cache_read_tokens", 0) or 0
cache_write_tokens = getattr(agent, "session_cache_write_tokens", 0) or 0
reasoning_tokens = getattr(agent, "session_reasoning_tokens", 0) or 0
prompt = agent.session_prompt_tokens
completion = agent.session_completion_tokens
total = agent.session_total_tokens
@@ -8022,6 +8025,8 @@ class HermesCLI:
print(f" Cache read tokens: {cache_read_tokens:>10,}")
print(f" Cache write tokens: {cache_write_tokens:>10,}")
print(f" Output tokens: {output_tokens:>10,}")
if reasoning_tokens:
print(f" ↳ Reasoning (subset): {reasoning_tokens:>10,}")
print(f" Prompt tokens (total): {prompt:>10,}")
print(f" Completion tokens: {completion:>10,}")
print(f" Total tokens: {total:>10,}")
@@ -12339,6 +12344,11 @@ def main(
"""
global _active_worktree
ensure_windows_utf8_mode(
module="cli",
entrypoint_markers=("hermes", "cli.py"),
)
# Signal to terminal_tool that we're in interactive mode
# This enables interactive sudo password prompts with timeout
os.environ["HERMES_INTERACTIVE"] = "1"
+3 -1
@@ -3146,7 +3146,9 @@ class BasePlatformAdapter(ABC):
_post_cb = getattr(self, "_post_delivery_callbacks", {}).pop(session_key, None)
if callable(_post_cb):
try:
_post_cb()
_post_result = _post_cb()
if inspect.isawaitable(_post_result):
await _post_result
except Exception:
pass
# Stop typing indicator
+156 -35
@@ -1903,6 +1903,59 @@ class GatewayRunner:
depth += 1
return depth
@staticmethod
def _is_goal_continuation_event(event_or_text: Any) -> bool:
"""Return True for synthetic /goal continuation turns.
Goal continuations are normal queued user-role events, so pause/clear
must distinguish them from real user /queue messages before removing or
suppressing them.
"""
text = getattr(event_or_text, "text", event_or_text) or ""
return str(text).startswith("[Continuing toward your standing goal]\nGoal:")
def _clear_goal_pending_continuations(self, session_key: str, adapter: Any) -> int:
"""Remove queued synthetic /goal continuations for one session.
User-issued /goal pause/clear can race with a continuation already
queued by the judge. Remove only synthetic goal continuations while
preserving normal /queue and user follow-up events.
"""
removed = 0
pending_slot = getattr(adapter, "_pending_messages", None) if adapter is not None else None
if isinstance(pending_slot, dict):
pending_event = pending_slot.get(session_key)
if self._is_goal_continuation_event(pending_event):
pending_slot.pop(session_key, None)
removed += 1
queued_events = getattr(self, "_queued_events", None)
if isinstance(queued_events, dict):
overflow = queued_events.get(session_key) or []
if overflow:
kept = []
for queued_event in overflow:
if self._is_goal_continuation_event(queued_event):
removed += 1
else:
kept.append(queued_event)
if kept:
queued_events[session_key] = kept
else:
queued_events.pop(session_key, None)
return removed
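The filtering rule used by the two helpers above can be tried on a plain list. The free functions and the sample queue are illustrative stand-ins; the prefix string is the one matched in _is_goal_continuation_event.

```python
from types import SimpleNamespace
from typing import Any, List, Tuple

GOAL_PREFIX = "[Continuing toward your standing goal]\nGoal:"


def is_goal_continuation(event_or_text: Any) -> bool:
    """Accept either an event object with .text or a raw string (or None)."""
    text = getattr(event_or_text, "text", event_or_text) or ""
    return str(text).startswith(GOAL_PREFIX)


def drop_goal_continuations(events: List[Any]) -> Tuple[List[Any], int]:
    """Return the events to keep and the number of synthetic ones removed."""
    kept = [e for e in events if not is_goal_continuation(e)]
    return kept, len(events) - len(kept)


queue = [
    SimpleNamespace(text="please also update the README"),
    SimpleNamespace(text=GOAL_PREFIX + " ship the release"),
    "plain string follow-up",
]
kept, removed = drop_goal_continuations(queue)
```

Only the synthetic continuation is dropped, so real user messages queued behind it are still processed after /goal pause or /goal clear.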
def _goal_still_active_for_session(self, session_id: str) -> bool:
"""Best-effort fresh DB check before running a queued continuation."""
if not session_id:
return False
try:
from hermes_cli.goals import GoalManager
return GoalManager(session_id=session_id).is_active()
except Exception as exc:
logger.debug("goal continuation: active-state recheck failed: %s", exc)
return False
def _update_runtime_status(self, gateway_state: Optional[str] = None, exit_reason: Optional[str] = None) -> None:
try:
from gateway.status import write_runtime_status
@@ -5836,7 +5889,7 @@ class GatewayRunner:
except Exception:
session_entry = None
if session_entry is not None:
self._post_turn_goal_continuation(
await self._post_turn_goal_continuation(
session_entry=session_entry,
source=source,
final_response=_final_text,
@@ -8404,6 +8457,13 @@ class GatewayRunner:
state = mgr.pause(reason="user-paused")
if state is None:
return "No goal set."
try:
adapter = self.adapters.get(event.source.platform) if event.source else None
_quick_key = self._session_key_for_source(event.source) if event.source else None
if adapter and _quick_key:
self._clear_goal_pending_continuations(_quick_key, adapter)
except Exception as exc:
logger.debug("goal pause: pending continuation cleanup failed: %s", exc)
return f"⏸ Goal paused: {state.goal}"
if lower == "resume":
@@ -8418,6 +8478,13 @@ class GatewayRunner:
if lower in ("clear", "stop", "done"):
had = mgr.has_goal()
mgr.clear()
try:
adapter = self.adapters.get(event.source.platform) if event.source else None
_quick_key = self._session_key_for_source(event.source) if event.source else None
if adapter and _quick_key:
self._clear_goal_pending_continuations(_quick_key, adapter)
except Exception as exc:
logger.debug("goal clear: pending continuation cleanup failed: %s", exc)
return t("gateway.goal_cleared") if had else t("gateway.no_active_goal")
# Otherwise — treat the remaining text as the new goal.
@@ -8449,7 +8516,69 @@ class GatewayRunner:
"Controls: /goal status · /goal pause · /goal resume · /goal clear"
)
def _post_turn_goal_continuation(
async def _send_goal_status_notice(self, source: Any, message: str) -> None:
"""Send a /goal judge status line back to the originating chat/thread."""
adapter = self.adapters.get(source.platform)
if not adapter:
logger.debug("goal continuation: no adapter for %s", getattr(source, "platform", None))
return
try:
metadata = self._thread_metadata_for_source(source)
except Exception:
metadata = {"thread_id": source.thread_id} if getattr(source, "thread_id", None) else None
result = await adapter.send(source.chat_id, message, metadata=metadata)
if result is not None and not getattr(result, "success", True):
logger.warning(
"goal continuation: status send failed: %s",
getattr(result, "error", "unknown error"),
)
async def _defer_goal_status_notice_after_delivery(self, source: Any, message: str) -> None:
"""Send a /goal status line after the main response is delivered.
The gateway message handler returns the agent response to the platform
adapter, which sends it after this method's caller has returned. For a
natural Discord/Telegram reading order, goal status belongs after that
send. Platform adapters provide a one-shot post-delivery callback for
exactly this boundary; when unavailable, fall back to direct awaited
delivery rather than silently dropping the notice.
"""
adapter = self.adapters.get(source.platform)
if not adapter:
logger.debug("goal continuation: no adapter for %s", getattr(source, "platform", None))
return
async def _deliver() -> None:
try:
await self._send_goal_status_notice(source, message)
except Exception as exc:
logger.warning("goal continuation: status send failed: %s", exc, exc_info=True)
try:
session_key = self._session_key_for_source(source)
except Exception:
session_key = None
if session_key and hasattr(adapter, "register_post_delivery_callback"):
try:
generation = None
active = getattr(adapter, "_active_sessions", {}).get(session_key)
if active is not None:
generation = getattr(active, "_hermes_run_generation", None)
adapter.register_post_delivery_callback(
session_key,
_deliver,
generation=generation,
)
return
except Exception as exc:
logger.debug("goal continuation: post-delivery callback registration failed: %s", exc)
await _deliver()
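The ordering guarantee this method relies on can be shown with a stub adapter. StubAdapter and deliver_response are invented for the sketch; only register_post_delivery_callback, the _post_delivery_callbacks mapping and the inspect.isawaitable check mirror names that appear in the diff.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List, Optional


class StubAdapter:
    """Minimal stand-in for a platform adapter."""

    def __init__(self) -> None:
        self.sent: List[str] = []
        self._post_delivery_callbacks: Dict[str, Callable[[], Any]] = {}

    def register_post_delivery_callback(
        self, session_key: str, callback: Callable[[], Any], generation: Optional[int] = None
    ) -> None:
        self._post_delivery_callbacks[session_key] = callback

    async def send(self, chat_id: str, content: str, metadata: Optional[dict] = None) -> None:
        self.sent.append(content)

    async def deliver_response(self, session_key: str, chat_id: str, response: str) -> None:
        await self.send(chat_id, response)
        # One-shot: pop the callback so it cannot fire twice.
        cb = self._post_delivery_callbacks.pop(session_key, None)
        if callable(cb):
            result = cb()
            if inspect.isawaitable(result):  # supports sync and async callbacks
                await result


async def demo() -> List[str]:
    adapter = StubAdapter()

    async def notice() -> None:
        await adapter.send("chat-1", "✓ Goal achieved")

    # Registered before the response is sent, but runs after it.
    adapter.register_post_delivery_callback("session-1", notice)
    await adapter.deliver_response("session-1", "chat-1", "Here is the final answer.")
    return adapter.sent


sent = asyncio.run(demo())
```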
async def _post_turn_goal_continuation(
self,
*,
session_entry: Any,
@@ -8485,38 +8614,14 @@ class GatewayRunner:
decision = mgr.evaluate_after_turn(final_response or "", user_initiated=True)
msg = decision.get("message") or ""
# Send the status line back to the user so they see the judge's
# verdict. Fire-and-forget via the adapter's ``send()`` method —
# adapters expose ``send(chat_id, content, reply_to, metadata)``,
# not a ``send_message(source, msg)`` wrapper, so an earlier
# ``hasattr(adapter, "send_message")`` gate here was dead code and
# users never saw ``✓ Goal achieved`` / ``⏸ budget exhausted``
# verdicts.
# Defer the status line until after the adapter has delivered the
# agent's visible final response. The judge runs after the response is
# produced but before BasePlatformAdapter sends it, so sending here
# would show "✓ Goal achieved" before the answer itself. Registering
# an awaited post-delivery callback preserves delivery reliability
# without reversing the user-visible ordering.
if msg and source is not None:
try:
adapter = self.adapters.get(source.platform)
if adapter is not None and hasattr(adapter, "send"):
import asyncio as _asyncio
thread_meta = (
{"thread_id": source.thread_id} if source.thread_id else None
)
coro = adapter.send(
chat_id=source.chat_id,
content=msg,
metadata=thread_meta,
)
if _asyncio.iscoroutine(coro):
try:
loop = _asyncio.get_running_loop()
loop.create_task(coro)
except RuntimeError:
# No running loop in this thread — best effort.
try:
_asyncio.run(coro)
except Exception:
pass
except Exception as exc:
logger.debug("goal continuation: status send failed: %s", exc)
await self._defer_goal_status_notice_after_delivery(source, msg)
if not decision.get("should_continue"):
return
@@ -14768,14 +14873,18 @@ class GatewayRunner:
)
if callable(_bg_cb):
try:
_bg_cb()
_bg_result = _bg_cb()
if inspect.isawaitable(_bg_result):
await _bg_result
except Exception:
pass
elif adapter and hasattr(adapter, "_post_delivery_callbacks"):
_bg_cb = adapter._post_delivery_callbacks.pop(session_key, None)
if callable(_bg_cb):
try:
_bg_cb()
_bg_result = _bg_cb()
if inspect.isawaitable(_bg_result):
await _bg_result
except Exception:
pass
# else: interrupted — discard the interrupted response ("Operation
@@ -14789,6 +14898,12 @@ class GatewayRunner:
next_channel_prompt = None
if pending_event is not None:
next_source = getattr(pending_event, "source", None) or source
if self._is_goal_continuation_event(pending_event) and not self._goal_still_active_for_session(session_id):
logger.info(
"Discarding stale goal continuation for session %s — goal is no longer active",
session_key or "?",
)
return result
next_message = await self._prepare_inbound_message_text(
event=pending_event,
source=next_source,
@@ -15385,6 +15500,12 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
def main():
"""CLI entry point for the gateway."""
from utf8_bootstrap import ensure_windows_utf8_mode
ensure_windows_utf8_mode(
module="gateway.run",
entrypoint_markers=("gateway", "run.py"),
)
import argparse
parser = argparse.ArgumentParser(description="Hermes Gateway - Multi-platform messaging")
+3
@@ -109,6 +109,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("resume", "Resume a previously-named session", "Session",
args_hint="[name]"),
# Configuration
CommandDef("sessions", "Browse and resume previous sessions", "Session"),
# Configuration
CommandDef("config", "Show current configuration", "Configuration",
cli_only=True),
+102 -90
@@ -21,6 +21,7 @@ import stat
import subprocess
import sys
import tempfile
import threading
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
@@ -42,6 +43,14 @@ _LOAD_CONFIG_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}
# _LOAD_CONFIG_CACHE but for read_raw_config() — used when callers want
# the user's on-disk values without defaults merged in.
_RAW_CONFIG_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}
# Serializes all config read/write paths. libyaml's C extension is not
# thread-safe for concurrent safe_load() on the same file, and multiple
# tool threads (approval.py, browser_tool.py, setup flows) hit
# load_config / read_raw_config / save_config from different threads
# during long agent runs. RLock (not Lock) because save_config internally
# calls read_raw_config. Also covers mutation of the module-level cache
# dicts above.
_CONFIG_LOCK = threading.RLock()
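# The choice of RLock over Lock noted in the comment above can be illustrated
# with a small stand-alone sketch. The store and function names are
# hypothetical; the point is only that a writer may call a reader that takes
# the same lock on the same thread.

```python
import threading

_LOCK = threading.RLock()
_STORE = {"value": 0}


def read_value() -> int:
    with _LOCK:
        return _STORE["value"]


def increment() -> None:
    with _LOCK:
        # Nested acquire on the same thread. With threading.Lock this call
        # would block forever; RLock tracks the owner and allows re-entry.
        current = read_value()
        _STORE["value"] = current + 1


def worker(times: int) -> None:
    for _ in range(times):
        increment()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_value = read_value()
```

# Because the read and the write happen under one lock, the four workers
# cannot interleave between them and no increment is lost.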
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
# (managed by setup/provider flows directly).
_EXTRA_ENV_KEYS = frozenset({
@@ -3941,28 +3950,29 @@ def read_raw_config() -> Dict[str, Any]:
``load_config()``. Returns a deepcopy on every call since some callers
mutate the result before passing to ``save_config()``.
"""
try:
config_path = get_config_path()
st = config_path.stat()
cache_key = (st.st_mtime_ns, st.st_size)
except (FileNotFoundError, OSError):
return {}
with _CONFIG_LOCK:
try:
config_path = get_config_path()
st = config_path.stat()
cache_key = (st.st_mtime_ns, st.st_size)
except (FileNotFoundError, OSError):
return {}
path_key = str(config_path)
cached = _RAW_CONFIG_CACHE.get(path_key)
if cached is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
path_key = str(config_path)
cached = _RAW_CONFIG_CACHE.get(path_key)
if cached is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
return {}
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
return {}
if not isinstance(data, dict):
data = {}
_RAW_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(data))
return data
if not isinstance(data, dict):
data = {}
_RAW_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(data))
return data
def load_config() -> Dict[str, Any]:
@@ -3975,46 +3985,47 @@ def load_config() -> Dict[str, Any]:
(which change ``HERMES_HOME`` and therefore ``get_config_path()``)
don't collide.
"""
ensure_hermes_home()
config_path = get_config_path()
path_key = str(config_path)
with _CONFIG_LOCK:
ensure_hermes_home()
config_path = get_config_path()
path_key = str(config_path)
try:
st = config_path.stat()
cache_key: Optional[Tuple[int, int]] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
cache_key = None
cached = _LOAD_CONFIG_CACHE.get(path_key)
if cached is not None and cache_key is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
config = copy.deepcopy(DEFAULT_CONFIG)
if cache_key is not None:
try:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
st = config_path.stat()
cache_key: Optional[Tuple[int, int]] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
cache_key = None
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
cached = _LOAD_CONFIG_CACHE.get(path_key)
if cached is not None and cache_key is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
config = copy.deepcopy(DEFAULT_CONFIG)
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[path_key] = copy.deepcopy(expanded)
if cache_key is not None:
_LOAD_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(expanded))
else:
_LOAD_CONFIG_CACHE.pop(path_key, None)
return expanded
if cache_key is not None:
try:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[path_key] = copy.deepcopy(expanded)
if cache_key is not None:
_LOAD_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(expanded))
else:
_LOAD_CONFIG_CACHE.pop(path_key, None)
return expanded
_SECURITY_COMMENT = """
@@ -4094,45 +4105,46 @@ _COMMENTED_SECTIONS = """
def save_config(config: Dict[str, Any]):
"""Save configuration to ~/.hermes/config.yaml."""
if is_managed():
managed_error("save configuration")
return
from utils import atomic_yaml_write
with _CONFIG_LOCK:
if is_managed():
managed_error("save configuration")
return
from utils import atomic_yaml_write
ensure_hermes_home()
config_path = get_config_path()
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
ensure_hermes_home()
config_path = get_config_path()
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
parts = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
fb_is_valid = False
if isinstance(fb, list):
fb_is_valid = any(isinstance(e, dict) and e.get("provider") and e.get("model") for e in fb)
elif isinstance(fb, dict):
fb_is_valid = bool(fb.get("provider") and fb.get("model"))
if not fb_is_valid:
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
config_path,
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
extra_content="".join(parts) if parts else None,
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
parts = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
fb_is_valid = False
if isinstance(fb, list):
fb_is_valid = any(isinstance(e, dict) and e.get("provider") and e.get("model") for e in fb)
elif isinstance(fb, dict):
fb_is_valid = bool(fb.get("provider") and fb.get("model"))
if not fb_is_valid:
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
config_path,
normalized,
extra_content="".join(parts) if parts else None,
)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
def load_env() -> Dict[str, str]:
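The locking pattern in this file can be tried in isolation. The following is a minimal sketch, not the Hermes implementation: it assumes a JSON file instead of YAML (to stay within the standard library), and the names `_LOCK`, `_CACHE`, `read_raw`, `save`, and `demo` are hypothetical. It shows why the lock must be re-entrant when the writer calls the reader while holding it, and how a `(st_mtime_ns, st_size)` key invalidates the cache after each write.

```python
import copy
import json
import os
import tempfile
import threading
from pathlib import Path
from typing import Any, Dict, Tuple

# Hypothetical stand-ins for the module-level lock and cache in the diff.
_LOCK = threading.RLock()
_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}


def read_raw(path: Path) -> Dict[str, Any]:
    """Return the parsed file, reusing the cache while (mtime_ns, size) is unchanged."""
    with _LOCK:
        try:
            st = path.stat()
        except OSError:
            return {}
        key = (st.st_mtime_ns, st.st_size)
        cached = _CACHE.get(str(path))
        if cached is not None and cached[:2] == key:
            return copy.deepcopy(cached[2])
        try:
            data = json.loads(path.read_text(encoding="utf-8")) or {}
        except Exception:
            return {}
        if not isinstance(data, dict):
            data = {}
        _CACHE[str(path)] = (key[0], key[1], copy.deepcopy(data))
        return data


def save(path: Path, updates: Dict[str, Any]) -> None:
    """Merge updates into the file. Re-enters _LOCK through read_raw()."""
    with _LOCK:
        merged = read_raw(path)  # a plain threading.Lock would deadlock here
        merged.update(updates)
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(merged), encoding="utf-8")
        os.replace(tmp, path)  # atomic swap on the same filesystem


def demo() -> Dict[str, Any]:
    """Run eight concurrent writers and return the final file contents."""
    with tempfile.TemporaryDirectory() as d:
        path = Path(d) / "config.json"
        threads = [
            threading.Thread(target=save, args=(path, {f"k{i}": i}))
            for i in range(8)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return read_raw(path)
```

Because every read-modify-write cycle runs under the same lock, no writer can overwrite another writer's key, so `demo()` ends with all eight keys present regardless of thread scheduling.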
+79 -21
@@ -47,6 +47,14 @@ DEFAULT_MAX_TURNS = 20
DEFAULT_JUDGE_TIMEOUT = 30.0
# Cap how much of the last response + recent messages we send to the judge.
_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
# After this many consecutive judge *parse* failures (empty output / non-JSON),
# the loop auto-pauses and points the user at the goal_judge config. API /
# transport errors do NOT count toward this — those are transient. This guards
# against small models (e.g. deepseek-v4-flash) that cannot follow the strict
# JSON reply contract; without it the loop runs until the turn budget is
# exhausted with every reply shaped like `judge returned empty response` or
# `judge reply was not JSON`.
DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES = 3
CONTINUATION_PROMPT_TEMPLATE = (
@@ -99,6 +107,7 @@ class GoalState:
last_verdict: Optional[str] = None # "done" | "continue" | "skipped"
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
consecutive_parse_failures: int = 0 # judge-output parse failures in a row
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@@ -116,6 +125,7 @@ class GoalState:
last_verdict=data.get("last_verdict"),
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
)
@@ -220,13 +230,17 @@ def _truncate(text: str, limit: int) -> str:
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
def _parse_judge_response(raw: str) -> Tuple[bool, str]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
def _parse_judge_response(raw: str) -> Tuple[bool, str, bool]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>", parse_failed)``.
Returns ``(done, reason)``.
Returns ``(done, reason, parse_failed)``. ``parse_failed`` is True when the
judge returned output that couldn't be interpreted as the expected JSON
verdict (empty body, prose, malformed JSON). Callers use that flag to
auto-pause after N consecutive parse failures so a weak judge model
doesn't silently burn the turn budget.
"""
if not raw:
return False, "judge returned empty response"
return False, "judge returned empty response", True
text = raw.strip()
@@ -252,7 +266,7 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str]:
data = None
if not isinstance(data, dict):
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}", True
done_val = data.get("done")
if isinstance(done_val, str):
@@ -262,7 +276,7 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str]:
reason = str(data.get("reason") or "").strip()
if not reason:
reason = "no reason provided"
return done, reason
return done, reason, False
def judge_goal(
@@ -270,36 +284,42 @@ def judge_goal(
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
) -> Tuple[str, str]:
) -> Tuple[str, str, bool]:
"""Ask the auxiliary model whether the goal is satisfied.
Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
or ``"skipped"`` (when the judge couldn't be reached).
Returns ``(verdict, reason, parse_failed)`` where verdict is ``"done"``,
``"continue"``, or ``"skipped"`` (when the judge couldn't be reached).
This is deliberately fail-open: any error returns ``("continue", "...")``
so a broken judge doesn't wedge progress — the turn budget is the
backstop.
``parse_failed`` is True only when the judge call succeeded but its output
was unusable (empty or non-JSON). API/transport errors return False — they
are transient and should fail-open silently. Callers use this flag to
auto-pause after N consecutive parse failures (see
``DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES``).
This is deliberately fail-open: any error returns ``("continue", "...", False)``
so a broken judge doesn't wedge progress — the turn budget and the
consecutive-parse-failures auto-pause are the backstops.
"""
if not goal.strip():
return "skipped", "empty goal"
return "skipped", "empty goal", False
if not last_response.strip():
# No substantive reply this turn — almost certainly not done yet.
return "continue", "empty response (nothing to evaluate)"
return "continue", "empty response (nothing to evaluate)", False
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable"
return "continue", "auxiliary client unavailable", False
try:
client, model = get_text_auxiliary_client("goal_judge")
except Exception as exc:
logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
return "continue", "auxiliary client unavailable"
return "continue", "auxiliary client unavailable", False
if client is None or not model:
return "continue", "no auxiliary client configured"
return "continue", "no auxiliary client configured", False
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
@@ -319,17 +339,17 @@ def judge_goal(
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
return "continue", f"judge error: {type(exc).__name__}"
return "continue", f"judge error: {type(exc).__name__}", False
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
done, reason = _parse_judge_response(raw)
done, reason, parse_failed = _parse_judge_response(raw)
verdict = "done" if done else "continue"
logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
return verdict, reason
return verdict, reason, parse_failed
# ──────────────────────────────────────────────────────────────────────
@@ -473,10 +493,18 @@ class GoalManager:
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason = judge_goal(state.goal, last_response)
verdict, reason, parse_failed = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
# Track consecutive judge parse failures. Reset on any usable reply,
# including API / transport errors (parse_failed=False) so a flaky
# network doesn't trip the auto-pause meant for bad judge models.
if parse_failed:
state.consecutive_parse_failures += 1
else:
state.consecutive_parse_failures = 0
if verdict == "done":
state.status = "done"
save_goal(self.session_id, state)
@@ -489,6 +517,36 @@ class GoalManager:
"message": f"✓ Goal achieved: {reason}",
}
# Auto-pause when the judge model can't produce the expected JSON
# verdict N turns in a row. Points the user at the goal_judge config
# so they can route this side task to a model that follows the
# contract (e.g. google/gemini-3-flash-preview). Without this guard,
# weak judge models burn the entire turn budget returning prose or
# empty strings.
if state.consecutive_parse_failures >= DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES:
state.status = "paused"
state.paused_reason = (
f"judge model returned unparseable output {state.consecutive_parse_failures} turns in a row"
)
save_goal(self.session_id, state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — the judge model ({state.consecutive_parse_failures} turns) "
"isn't returning the required JSON verdict. Route the judge to a stricter "
"model in ~/.hermes/config.yaml:\n"
" auxiliary:\n"
" goal_judge:\n"
" provider: openrouter\n"
" model: google/gemini-3-flash-preview\n"
"Then /goal resume to continue."
),
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
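The parse-failure streak described in this file can be sketched on its own. This is a simplified illustration under stated assumptions: `parse_verdict`, `LoopState`, and `record_turn` are hypothetical names, the parser skips the string coercion of `done` that the real `_parse_judge_response` performs, and `raw=None` is used here to stand in for an API or transport error.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_CONSECUTIVE_PARSE_FAILURES = 3  # mirrors the default in the diff
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)


def parse_verdict(raw: str) -> Tuple[bool, str, bool]:
    """Return (done, reason, parse_failed); simplified relative to the diff."""
    if not raw or not raw.strip():
        return False, "judge returned empty response", True
    match = _JSON_OBJECT_RE.search(raw)
    data = None
    if match:
        try:
            data = json.loads(match.group(0))
        except ValueError:
            data = None
    if not isinstance(data, dict):
        return False, "judge reply was not JSON", True
    done = data.get("done") is True
    reason = str(data.get("reason") or "").strip() or "no reason provided"
    return done, reason, False


@dataclass
class LoopState:
    status: str = "active"
    consecutive_parse_failures: int = 0
    paused_reason: Optional[str] = None


def record_turn(state: LoopState, raw: Optional[str]) -> str:
    """Update state for one judge reply. raw=None models an API/transport error."""
    if raw is None:
        done, parse_failed = False, False  # transient error: resets the streak
    else:
        done, _reason, parse_failed = parse_verdict(raw)
    state.consecutive_parse_failures = (
        state.consecutive_parse_failures + 1 if parse_failed else 0
    )
    if done:
        state.status = "done"
    elif state.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
        state.status = "paused"
        state.paused_reason = "judge output unparseable"
    return state.status
```

Two bad replies followed by a transport error leave the loop active with the streak back at zero; only three unparseable replies in a row pause it.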
+10 -2
@@ -43,12 +43,20 @@ Usage:
hermes claw migrate --dry-run # Preview migration without changes
"""
import os
import sys
from utf8_bootstrap import ensure_windows_utf8_mode
# Force UTF-8 defaults on Windows before any module-level file I/O.
ensure_windows_utf8_mode(
module="hermes_cli.main",
entrypoint_markers=("hermes", "main.py"),
)
import argparse
import json
import os
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Optional
+5
@@ -612,6 +612,11 @@ class SessionDB:
the caller already holds cumulative totals (gateway path, where the
cached agent accumulates across messages).
"""
# Ensure the session row exists so the UPDATE doesn't silently affect
# 0 rows. Under concurrent load (cron + kanban + delegate_task) the
# initial create_session() may have failed due to SQLite locking.
# INSERT OR IGNORE is cheap and idempotent.
self._insert_session_row(session_id, "unknown", model=model)
if absolute:
sql = """UPDATE sessions SET
input_tokens = ?,
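The failure mode this guard addresses is easy to reproduce with the standard `sqlite3` module. The sketch below assumes a hypothetical three-column `sessions` table and a hypothetical `add_tokens` helper; the real schema and `_insert_session_row` signature are not shown in this diff.

```python
import sqlite3


def add_tokens(conn: sqlite3.Connection, session_id: str, tokens: int) -> int:
    """Increment a session's token count, creating the row first if needed.

    Returns the number of rows the UPDATE touched.
    """
    with conn:  # one transaction for both statements
        # Idempotent: does nothing when the row already exists.
        conn.execute(
            "INSERT OR IGNORE INTO sessions (id, source, input_tokens) VALUES (?, ?, 0)",
            (session_id, "unknown"),
        )
        cur = conn.execute(
            "UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
            (tokens, session_id),
        )
        return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, source TEXT, input_tokens INTEGER)"
)

# Without the guard, an UPDATE against a missing row succeeds but changes nothing.
missing = conn.execute(
    "UPDATE sessions SET input_tokens = input_tokens + 5 WHERE id = ?", ("s1",)
).rowcount

first = add_tokens(conn, "s1", 5)
second = add_tokens(conn, "s1", 7)
total = conn.execute("SELECT input_tokens FROM sessions WHERE id = 's1'").fetchone()[0]
```

The unguarded UPDATE raises no error, which is why the lost deltas went unnoticed; the guarded version touches exactly one row on both calls.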
+1 -1
@@ -802,7 +802,7 @@ def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
return json.dumps({"count": len(targets), "channels": targets}, indent=2)
channels = []
for plat, entries_list in directory.items():
for plat, entries_list in directory.get("platforms", {}).items():
if platform and plat.lower() != platform.lower():
continue
if isinstance(entries_list, list):
+66 -15
@@ -97,6 +97,12 @@
const API = "/api/plugins/kanban";
const MIME_TASK = "text/x-hermes-task";
// Docs link — surfaced as a `?` icon next to the board switcher and as
// `title=` hints on unlabelled controls. Kept in one place so rebrands or
// path changes are a single edit.
const DOCS_URL = "https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban";
const DOCS_TUTORIAL_URL = "https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban-tutorial";
// localStorage key for the user's selected board. Independent of the
// CLI's on-disk ``<root>/kanban/current`` pointer so browser users
// can inspect any board without shifting the CLI's active board out
@@ -1128,6 +1134,20 @@
// Board switcher (multi-project)
// -------------------------------------------------------------------------
// Small `?` affordance next to the board controls. Opens the kanban docs
// page in a new tab so users can look up what any of the widgets mean
// without losing the current board view.
function DocsLink() {
return h("a", {
href: DOCS_URL,
target: "_blank",
rel: "noopener noreferrer",
className: "hermes-kanban-docs-link",
title: "Open Hermes Kanban docs in a new tab",
"aria-label": "Hermes Kanban documentation",
}, "?");
}
function BoardSwitcher(props) {
const list = props.boardList || [];
const current = list.find(function (b) { return b.slug === props.board; });
@@ -1152,6 +1172,7 @@
size: "sm",
className: "h-7 text-xs",
}, "+ New board"),
h(DocsLink, null),
);
}
@@ -1165,6 +1186,7 @@
value: props.board,
className: "h-8 min-w-[220px]",
"aria-label": "Switch kanban board",
title: "Boards are independent work streams. Each board has its own tasks, tenants, and assignees.",
}, selectChangeHandler(function (v) { if (v) props.onSwitch(v); })),
list.map(function (b) {
const label = b.total > 0
@@ -1178,10 +1200,12 @@
),
),
h("div", { className: "flex-1" }),
h(DocsLink, null),
h(Button, {
onClick: props.onNewClick,
size: "sm",
className: "h-8",
title: "Create a new board. Useful when you want an unrelated work stream (different project, different team, isolated scratch area).",
}, "+ New board"),
props.board !== "default"
? h(Button, {
@@ -1326,7 +1350,8 @@
const tenants = (props.board && props.board.tenants) || [];
const assignees = (props.board && props.board.assignees) || [];
return h("div", { className: "flex flex-wrap items-end gap-3" },
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Fuzzy-match tasks by id, title, or description. Matches across all columns." },
h(Label, { className: "text-xs text-muted-foreground" }, "Search"),
h(Input, {
placeholder: "Filter cards…",
@@ -1335,7 +1360,8 @@
className: "w-56 h-8",
}),
),
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Tenants are free-form tags on a task (e.g. customer, project, team). Set them via the task drawer or kanban_create." },
h(Label, { className: "text-xs text-muted-foreground" }, "Tenant"),
h(Select, Object.assign({
value: props.tenantFilter,
@@ -1347,7 +1373,8 @@
}),
),
),
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Filter by assigned Hermes profile. Profiles are the named agent identities that claim and work on tasks." },
h(Label, { className: "text-xs text-muted-foreground" }, "Assignee"),
h(Select, Object.assign({
value: props.assigneeFilter,
@@ -1359,7 +1386,8 @@
}),
),
),
h("label", { className: "flex items-center gap-2 text-xs" },
h("label", { className: "flex items-center gap-2 text-xs",
title: "Include archived tasks in the board view. Archived tasks are hidden by default." },
h("input", {
type: "checkbox",
checked: props.includeArchived,
@@ -1380,10 +1408,12 @@
h(Button, {
onClick: props.onNudgeDispatch,
size: "sm",
title: "Wake the dispatcher to claim ready tasks now instead of waiting for the next tick. Use this after adding tasks if you want them picked up immediately.",
}, "Nudge dispatcher"),
h(Button, {
onClick: props.onRefresh,
size: "sm",
title: "Reload the board from the database. The board auto-refreshes on task events; this is for forcing a re-read.",
}, "Refresh"),
);
}
@@ -1400,6 +1430,7 @@
h(Button, {
onClick: function () { props.onApply({ status: "ready" }); },
size: "sm",
title: "Move selected tasks to Ready. Ready tasks are picked up by the dispatcher on the next tick.",
}, "→ ready"),
h(Button, {
onClick: function () {
@@ -1407,6 +1438,7 @@
`Mark ${props.count} task(s) as done?`);
},
size: "sm",
title: "Mark selected tasks as done. Releases any claims and unblocks dependent children. You'll be asked for a completion summary.",
}, "Complete"),
h(Button, {
onClick: function () {
@@ -1414,8 +1446,10 @@
`Archive ${props.count} task(s)?`);
},
size: "sm",
title: "Archive selected tasks. They disappear from the default board view but remain in the database.",
}, "Archive"),
h("div", { className: "hermes-kanban-bulk-reassign" },
h("div", { className: "hermes-kanban-bulk-reassign",
title: "Reassign selected tasks to a different Hermes profile. Pick a profile (or unassign) and click Apply." },
h(Select, {
value: assignee,
onChange: function (e) { setAssignee(e.target.value); },
@@ -1435,12 +1469,14 @@
},
disabled: !assignee,
size: "sm",
title: "Apply the selected assignee to all selected tasks.",
}, "Apply"),
),
h("div", { className: "flex-1" }),
h(Button, {
onClick: props.onClear,
size: "sm",
title: "Deselect all tasks and hide this bar.",
}, "Clear"),
);
}
@@ -1521,11 +1557,13 @@
onDragLeave: handleDragLeave,
onDrop: handleDrop,
},
h("div", { className: "hermes-kanban-column-header" },
h("div", { className: "hermes-kanban-column-header",
title: COLUMN_HELP[props.column.name] || "" },
h("span", { className: cn("hermes-kanban-dot", COLUMN_DOT[props.column.name]) }),
h("span", { className: "hermes-kanban-column-label" },
COLUMN_LABEL[props.column.name] || props.column.name),
h("span", { className: "hermes-kanban-column-count" },
h("span", { className: "hermes-kanban-column-count",
title: `${props.column.tasks.length} task${props.column.tasks.length === 1 ? "" : "s"} in this column` },
props.column.tasks.length),
h("button", {
type: "button",
@@ -1652,7 +1690,8 @@
onClick: function (e) { e.stopPropagation(); },
title: "Select for bulk actions",
}),
h("span", { className: "hermes-kanban-card-id" }, t.id),
h("span", { className: "hermes-kanban-card-id",
title: `Task id: ${t.id}. Use this id with kanban_show, /kanban show, or hermes kanban show.` }, t.id),
t.warnings && t.warnings.count > 0
? h("span", {
className: cn(
@@ -1669,10 +1708,12 @@
t.warnings.highest_severity === "error" ? "!!" : "⚠")
: null,
t.priority > 0
? h(Badge, { className: "hermes-kanban-priority" }, `P${t.priority}`)
? h(Badge, { className: "hermes-kanban-priority",
title: `Priority ${t.priority}. Higher-priority tasks are claimed first by the dispatcher.` }, `P${t.priority}`)
: null,
t.tenant
? h(Badge, { variant: "outline", className: "hermes-kanban-tag" }, t.tenant)
? h(Badge, { variant: "outline", className: "hermes-kanban-tag",
title: `Tenant: ${t.tenant}. Free-form tag for grouping tasks (customer, project, team).` }, t.tenant)
: null,
progress
? h("span", {
@@ -1687,16 +1728,21 @@
h("div", { className: "hermes-kanban-card-title" }, t.title || "(untitled)"),
h("div", { className: "hermes-kanban-card-row hermes-kanban-card-meta" },
t.assignee
? h("span", { className: "hermes-kanban-assignee" }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned" }, "unassigned"),
? h("span", { className: "hermes-kanban-assignee",
title: `Assigned to Hermes profile @${t.assignee}` }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned",
title: "No profile assigned. The dispatcher will pick one from available profiles when the task is Ready." }, "unassigned"),
t.comment_count > 0
? h("span", { className: "hermes-kanban-count" }, "💬 ", t.comment_count)
? h("span", { className: "hermes-kanban-count",
title: `${t.comment_count} comment${t.comment_count === 1 ? "" : "s"} on this task` }, "💬 ", t.comment_count)
: null,
t.link_counts && (t.link_counts.parents + t.link_counts.children) > 0
? h("span", { className: "hermes-kanban-count" },
? h("span", { className: "hermes-kanban-count",
title: `${t.link_counts.parents} parent${t.link_counts.parents === 1 ? "" : "s"}, ${t.link_counts.children} child${t.link_counts.children === 1 ? "" : "ren"}. Children stay blocked until their parent is done.` },
"↔ ", t.link_counts.parents + t.link_counts.children)
: null,
h("span", { className: "hermes-kanban-ago" },
h("span", { className: "hermes-kanban-ago",
title: t.created_at ? `Created ${t.created_at}` : "" },
timeAgo ? timeAgo(t.created_at) : ""),
),
),
@@ -1777,6 +1823,9 @@
onChange: function (e) { setAssignee(e.target.value); },
placeholder: props.columnName === "triage" ? "specifier" : "assignee",
className: "h-7 text-xs flex-1",
title: props.columnName === "triage"
? "Hermes profile that will spec this task (default: the dispatcher's configured specifier). Leave blank to let the dispatcher pick."
: "Hermes profile to assign. Leave blank and the dispatcher will pick from available profiles when the task is Ready.",
}),
h(Input, {
type: "number",
@@ -1784,6 +1833,7 @@
onChange: function (e) { setPriority(e.target.value); },
placeholder: "pri",
className: "h-7 text-xs w-16",
title: "Priority. Higher-priority tasks are claimed first by the dispatcher. 0 = default.",
}),
),
h(Input, {
@@ -1815,6 +1865,7 @@
value: parent,
onChange: function (e) { setParent(e.target.value); },
className: "h-7 text-xs",
title: "Optional parent task. A child stays blocked in its current column until the parent is marked done.",
},
h(SelectOption, { value: "" }, "— no parent —"),
(props.allTasks || []).map(function (t) {
+26
@@ -891,6 +891,32 @@
display: flex;
justify-content: flex-end;
padding: 0 0.25rem;
gap: 0.5rem;
align-items: center;
}
.hermes-kanban-docs-link {
display: inline-flex;
align-items: center;
justify-content: center;
width: 1.5rem;
height: 1.5rem;
border-radius: 9999px;
font-size: 0.75rem;
font-weight: 600;
line-height: 1;
color: var(--color-muted-foreground, rgba(180, 180, 200, 0.8));
background: var(--color-card-subtle, rgba(255, 255, 255, 0.04));
border: 1px solid var(--color-border, rgba(120, 120, 140, 0.25));
text-decoration: none;
cursor: help;
transition: color 0.15s, background 0.15s, border-color 0.15s;
}
.hermes-kanban-docs-link:hover,
.hermes-kanban-docs-link:focus-visible {
color: var(--color-foreground, #e7e7ee);
background: var(--color-card, rgba(255, 255, 255, 0.08));
border-color: var(--color-border, rgba(160, 160, 190, 0.45));
outline: none;
}
.hermes-kanban-dialog-backdrop {
position: fixed;
+1 -1
@@ -154,7 +154,7 @@ hermes-agent = "run_agent:main"
hermes-acp = "acp_adapter.entry:main"
[tool.setuptools]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils"]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils", "utf8_bootstrap"]
[tool.setuptools.package-data]
hermes_cli = ["web_dist/**/*"]
+22 -2
@@ -58,6 +58,7 @@ from datetime import datetime
from pathlib import Path
from hermes_constants import get_hermes_home
from utf8_bootstrap import ensure_windows_utf8_mode
_OPENAI_CLS_CACHE: Optional[type] = None
@@ -12131,6 +12132,14 @@ class AIAgent:
# deltas instead of double-counting them.
if self._session_db and self.session_id:
try:
# Ensure the session row exists before attempting UPDATE.
# Under concurrent load (cron/kanban), the initial
# _ensure_db_session() may have failed due to SQLite
# locking. Retry here so per-call token deltas are
# not silently lost (UPDATE on a non-existent row
# affects 0 rows without error).
if not self._session_db_created:
self._ensure_db_session()
self._session_db.update_token_counts(
self.session_id,
input_tokens=canonical_usage.input_tokens,
@@ -12149,8 +12158,14 @@ class AIAgent:
model=self.model,
api_call_count=1,
)
except Exception:
pass # never block the agent loop
except Exception as e:
# Log token persistence failures so they're
# visible in agent.log — silent loss here is
# the root cause of undercounted analytics.
logger.debug(
"Token persistence failed (session=%s, tokens=%d): %s",
self.session_id, total_tokens, e,
)
if self.verbose_logging:
logging.debug(f"Token usage: prompt={usage_dict['prompt_tokens']:,}, completion={usage_dict['completion_tokens']:,}, total={usage_dict['total_tokens']:,}")
@@ -14469,6 +14484,11 @@ def main(
Toolset Examples:
- "research": Web search, extract, crawl + vision tools
"""
ensure_windows_utf8_mode(
module="run_agent",
entrypoint_markers=("hermes-agent", "run_agent.py"),
)
print("🤖 AI Agent with Tool Calling")
print("=" * 50)
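The body of `ensure_windows_utf8_mode` is not part of this diff. As a rough sketch of how a UTF-8 re-exec of this kind could be structured, assuming the approach named in the commit message (relaunching with `-X utf8`): `plan_utf8_reexec` and `ensure_utf8` are hypothetical names, and the handling of `entrypoint_markers` in the real helper is not reproduced here.

```python
import sys
from typing import List, Optional, Sequence


def plan_utf8_reexec(
    platform: str,
    utf8_mode: int,
    executable: str,
    module: str,
    args: Sequence[str],
) -> Optional[List[str]]:
    """Return the argv for a UTF-8 re-exec, or None when none is needed.

    Hypothetical helper, kept pure so the decision can be checked without
    spawning a process.
    """
    if platform != "win32":
        return None  # assumed: only Windows needs the override
    if utf8_mode:
        return None  # already in UTF-8 mode; this also prevents a re-exec loop
    return [executable, "-X", "utf8", "-m", module, *args]


def ensure_utf8(module: str) -> None:
    """Re-run the current program under -X utf8 if the plan says so."""
    argv = plan_utf8_reexec(
        sys.platform, sys.flags.utf8_mode, sys.executable, module, sys.argv[1:]
    )
    if argv is not None:
        import subprocess

        # Assumption: spawn a child and propagate its exit code. The real
        # helper may relaunch differently.
        raise SystemExit(subprocess.call(argv))
```

Checking `sys.flags.utf8_mode` before relaunching is what keeps the child from relaunching itself again.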
+152 -30
@@ -65,6 +65,108 @@ function Write-Err {
Write-Host "$Message" -ForegroundColor Red
}
function Add-UserPathEntry {
param(
[string]$CurrentPath,
[string]$Entry
)
if (-not $Entry) {
return $CurrentPath
}
$parts = @()
if ($CurrentPath) {
$parts = $CurrentPath -split ";" | Where-Object { $_ -and $_.Trim() }
}
$normalizedEntry = $Entry.Trim().TrimEnd("\")
foreach ($part in $parts) {
if ($part.Trim().TrimEnd("\") -ieq $normalizedEntry) {
return $CurrentPath
}
}
if ($CurrentPath) {
return "$Entry;$CurrentPath"
}
return $Entry
}
function Resolve-NpmInvocation {
# Prefer npm.cmd to avoid PowerShell execution-policy failures from npm.ps1.
$npmCmd = Get-Command npm.cmd -ErrorAction SilentlyContinue
if ($npmCmd -and $npmCmd.Source) {
return @($npmCmd.Source)
}
$npm = Get-Command npm -ErrorAction SilentlyContinue
if ($npm -and $npm.Source) {
if ($npm.Source -notmatch "\.ps1$") {
return @($npm.Source)
}
$candidateCmd = [System.IO.Path]::ChangeExtension($npm.Source, ".cmd")
if (Test-Path $candidateCmd) {
return @($candidateCmd)
}
}
# Last fallback for odd PATH setups: invoke npm-cli.js directly via node.
$node = Get-Command node -ErrorAction SilentlyContinue
if ($node -and $node.Source) {
$nodeDir = Split-Path -Parent $node.Source
$candidates = @(
(Join-Path $nodeDir "node_modules\npm\bin\npm-cli.js"),
"$HermesHome\node\node_modules\npm\bin\npm-cli.js"
)
foreach ($candidate in $candidates) {
if (Test-Path $candidate) {
return @($node.Source, $candidate)
}
}
}
return $null
}
function Invoke-NpmInstallSilent {
param(
[string]$WorkingDir
)
$npmInvocation = Resolve-NpmInvocation
if (-not $npmInvocation) {
throw "npm command not found in PATH"
}
Push-Location $WorkingDir
try {
$output = @()
if ($npmInvocation.Count -eq 1) {
$output = & $npmInvocation[0] install --silent 2>&1
} else {
$output = & $npmInvocation[0] $npmInvocation[1] install --silent 2>&1
}
if ($LASTEXITCODE -ne 0) {
$lastLine = ""
if ($output) {
$lines = @($output | ForEach-Object { "$_" } | Where-Object { $_ })
if ($lines.Count -gt 0) {
$lastLine = $lines[-1]
}
}
if ($lastLine) {
throw "npm install exited with code ${LASTEXITCODE}: $lastLine"
}
throw "npm install exited with code $LASTEXITCODE"
}
} finally {
Pop-Location
}
}
# ============================================================================
# Dependency checks
# ============================================================================
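For readers less familiar with PowerShell, the matching rules of `Add-UserPathEntry` can be restated as a Python port. This is an illustrative translation only; `add_user_path_entry` is a hypothetical name and is not part of the installer.

```python
from typing import Optional


def add_user_path_entry(current_path: Optional[str], entry: Optional[str]) -> str:
    """Prepend entry to a ';'-separated PATH unless an equivalent entry exists."""
    current_path = current_path or ""
    if not entry:
        return current_path
    normalized = entry.strip().rstrip("\\")
    for part in current_path.split(";"):
        if not part.strip():
            continue  # skip empty segments such as a trailing ';'
        # Case-insensitive, like PowerShell's -ieq; trailing backslashes ignored.
        if part.strip().rstrip("\\").lower() == normalized.lower():
            return current_path
    return f"{entry};{current_path}" if current_path else entry
```

Comparing whole segments avoids the false positive of a substring test such as `-notlike "*$hermesBin*"`, which also matches when the directory appears only as part of a longer PATH entry.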
@@ -550,11 +652,21 @@ function Install-Dependencies {
$env:VIRTUAL_ENV = "$InstallDir\venv"
}
# Install main package with all extras
try {
& $UvCmd pip install -e ".[all]" 2>&1 | Out-Null
} catch {
& $UvCmd pip install -e "." | Out-Null
# Install main package with all extras first. If that fails (for example
# due to an optional extra on this machine), fall back to the minimum
# dependency profile required for native Windows CLI + TUI operation.
& $UvCmd pip install -e ".[all]" 2>&1 | Out-Null
if ($LASTEXITCODE -ne 0) {
Write-Warn "Full extras install failed. Retrying with Windows CLI/TUI dependency set..."
& $UvCmd pip install -e ".[pty,mcp,honcho,acp]" 2>&1 | Out-Null
if ($LASTEXITCODE -ne 0) {
Write-Warn "Windows CLI/TUI extras install failed. Retrying with base package..."
& $UvCmd pip install -e "." 2>&1 | Out-Null
if ($LASTEXITCODE -ne 0) {
Pop-Location
throw "Failed to install Hermes Python dependencies."
}
}
}
Write-Success "Main package installed"
@@ -586,20 +698,35 @@ function Set-PathVariable {
$hermesBin = "$InstallDir\venv\Scripts"
}
# Add the venv Scripts dir to user PATH so hermes is globally available
# On Windows, the hermes.exe in venv\Scripts\ has the venv Python baked in
# Add required bins to user PATH so hermes and --tui dependencies persist
# across new terminal sessions.
# On Windows, the hermes.exe in venv\Scripts\ has the venv Python baked in.
$currentPath = [Environment]::GetEnvironmentVariable("Path", "User")
if ($currentPath -notlike "*$hermesBin*") {
[Environment]::SetEnvironmentVariable(
"Path",
"$hermesBin;$currentPath",
"User"
)
$newPath = Add-UserPathEntry -CurrentPath $currentPath -Entry $hermesBin
if ($newPath -ne $currentPath) {
Write-Success "Added to user PATH: $hermesBin"
} else {
Write-Info "PATH already configured"
Write-Info "PATH already includes: $hermesBin"
}
$managedNodeDir = "$HermesHome\node"
$managedNodeExe = "$managedNodeDir\node.exe"
if (Test-Path $managedNodeExe) {
$pathWithNode = Add-UserPathEntry -CurrentPath $newPath -Entry $managedNodeDir
if ($pathWithNode -ne $newPath) {
Write-Success "Added managed Node.js to user PATH: $managedNodeDir"
} else {
Write-Info "PATH already includes managed Node.js"
}
$newPath = $pathWithNode
# Hint hermes_cli.main._make_tui_argv() where node lives when a managed
# install is used (it still prefers PATH when available).
[Environment]::SetEnvironmentVariable("HERMES_NODE", $managedNodeExe, "User")
$env:HERMES_NODE = $managedNodeExe
}
[Environment]::SetEnvironmentVariable("Path", $newPath, "User")
# Set HERMES_HOME so the Python code finds config/data in the right place.
# Only needed on Windows where we install to %LOCALAPPDATA%\hermes instead
@@ -612,7 +739,10 @@ function Set-PathVariable {
$env:HERMES_HOME = $HermesHome
# Update current session
$env:Path = "$hermesBin;$env:Path"
$env:Path = Add-UserPathEntry -CurrentPath $env:Path -Entry $hermesBin
if (Test-Path "$HermesHome\node\node.exe") {
$env:Path = Add-UserPathEntry -CurrentPath $env:Path -Entry "$HermesHome\node"
}
Write-Success "hermes command ready"
}
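Note on `Add-UserPathEntry`: its definition is outside these hunks, but the call sites compare the returned value with the input to decide whether anything changed, which suggests an idempotent "prepend unless present" helper. The Python sketch below shows that behavior under stated assumptions: `add_path_entry` is a hypothetical name, and the case-insensitive, trailing-backslash-tolerant comparison is a guess at what a Windows PATH helper would want, not something the diff confirms.

```python
def add_path_entry(current_path: str, entry: str, sep: str = ";") -> str:
    """Return current_path with entry prepended unless it is already present."""

    def normalize(p: str) -> str:
        # Assumption: Windows paths compare case-insensitively and a
        # trailing backslash does not make an entry distinct.
        return p.strip().rstrip("\\").lower()

    parts = [p for p in current_path.split(sep) if p.strip()]
    if normalize(entry) in {normalize(p) for p in parts}:
        # Unchanged input lets the caller detect "already configured".
        return current_path
    return sep.join([entry, *parts])


path = r"C:\Windows;C:\Tools"
once = add_path_entry(path, r"C:\hermes\venv\Scripts")
twice = add_path_entry(once, r"C:\HERMES\venv\Scripts")
```

Returning the input untouched when the entry exists is what makes a check like `$newPath -ne $currentPath` meaningful at the call site.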
@@ -708,16 +838,14 @@ function Install-NodeDeps {
Write-Info "Skipping Node.js dependencies (Node not installed)"
return
}
Push-Location $InstallDir
if (Test-Path "package.json") {
if (Test-Path "$InstallDir\package.json") {
Write-Info "Installing Node.js dependencies (browser tools)..."
try {
npm install --silent 2>&1 | Out-Null
Invoke-NpmInstallSilent -WorkingDir $InstallDir
Write-Success "Node.js dependencies installed"
} catch {
Write-Warn "npm install failed (browser tools may not work)"
Write-Warn "Browser tools npm install could not be launched: $($_.Exception.Message)"
}
}
@@ -725,19 +853,13 @@ function Install-NodeDeps {
$tuiDir = "$InstallDir\ui-tui"
if (Test-Path "$tuiDir\package.json") {
Write-Info "Installing TUI dependencies..."
Push-Location $tuiDir
try {
npm install --silent 2>&1 | Out-Null
Invoke-NpmInstallSilent -WorkingDir $tuiDir
Write-Success "TUI dependencies installed"
} catch {
Write-Warn "TUI npm install failed (hermes --tui may not work)"
Write-Warn "TUI npm install could not be launched: $($_.Exception.Message)"
}
Pop-Location
}
Pop-Location
}
function Invoke-SetupWizard {
+4
@@ -28,6 +28,10 @@ if [ -n "${PYTHONHOME:-}" ]; then
unset PYTHONHOME
fi
# Prevent uv from discovering config files (uv.toml, pyproject.toml) from the
# wrong user's home directory when running under sudo -u <user>. See #21269.
export UV_NO_CONFIG=1
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
+3
@@ -58,6 +58,7 @@ AUTHOR_MAP = {
"223003280+Abd0r@users.noreply.github.com": "Abd0r",
"abdielv@proton.me": "AJV20",
"mason@growagainorchids.com": "masonjames",
"ytchen0719@gmail.com": "liquidchen",
"am@studio1.tailb672fe.ts.net": "subtract0",
"axmaiqiu@gmail.com": "qWaitCrypto",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
@@ -78,6 +79,7 @@ AUTHOR_MAP = {
"dengtaoyuan@dengtaoyuandeMac-mini.local": "dengtaoyuan450-a11y",
"ysfalweshcan@gmail.com": "Junass1",
"bartokmagic@proton.me": "Bartok9",
"androidhtml@yandex.com": "hllqkb",
"25840394+Bongulielmi@users.noreply.github.com": "Bongulielmi",
"jonathan.troyer@overmatch.com": "JTroyerOvermatch",
"harryykyle1@gmail.com": "hharry11",
@@ -428,6 +430,7 @@ AUTHOR_MAP = {
"johnsonblake1@gmail.com": "voteblake",
"hcn518@gmail.com": "pedh",
"haileymarshall005@gmail.com": "haileymarshall",
"bennet.yr.wang@gmail.com": "BennetYrWang",
"greer.guthrie@gmail.com": "g-guthrie",
"kennyx102@gmail.com": "bobashopcashier",
"77253505+bobashopcashier@users.noreply.github.com": "bobashopcashier",
+4
@@ -29,6 +29,10 @@ NC='\033[0m'
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
# Prevent uv from discovering config files (uv.toml, pyproject.toml) from the
# wrong user's home directory when running under sudo -u <user>. See #21269.
export UV_NO_CONFIG=1
PYTHON_VERSION="3.11"
is_termux() {
+147
@@ -0,0 +1,147 @@
from __future__ import annotations
from types import SimpleNamespace
import pytest
from gateway.config import Platform
from gateway.platforms.base import MessageEvent, MessageType
from gateway.run import GatewayRunner
from gateway.session import SessionSource
from hermes_cli.goals import CONTINUATION_PROMPT_TEMPLATE
class FakeAdapter:
def __init__(self):
self.calls = []
self.callbacks = {}
self._active_sessions = {}
async def send(self, chat_id, content, reply_to=None, metadata=None):
self.calls.append(
{
"chat_id": chat_id,
"content": content,
"reply_to": reply_to,
"metadata": metadata,
}
)
return SimpleNamespace(success=True)
def register_post_delivery_callback(self, session_key, callback, *, generation=None):
self.callbacks[session_key] = (generation, callback)
def _goal_continuation_event(source, goal="finish the task"):
return MessageEvent(
text=CONTINUATION_PROMPT_TEMPLATE.format(goal=goal),
message_type=MessageType.TEXT,
source=source,
)
@pytest.mark.asyncio
async def test_goal_status_notice_uses_adapter_send_with_thread_metadata():
"""Regression: /goal judge status must use BasePlatformAdapter.send().
The old implementation checked for a non-existent send_message() method,
so the goal could be marked done in state_meta without the visible
"✓ Goal achieved" status line being delivered to Discord/Telegram.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
runner.adapters = {Platform.DISCORD: adapter}
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
)
await runner._send_goal_status_notice(source, "✓ Goal achieved: done")
assert adapter.calls == [
{
"chat_id": "parent-channel",
"content": "✓ Goal achieved: done",
"reply_to": None,
"metadata": {"thread_id": "thread-123"},
}
]
@pytest.mark.asyncio
async def test_goal_status_notice_defers_until_post_delivery_callback():
"""Regression: goal status must appear after the agent's visible reply.
_post_turn_goal_continuation runs before BasePlatformAdapter sends the
returned final response. It should therefore register a post-delivery
callback, not send the judge status immediately.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
runner.adapters = {Platform.DISCORD: adapter}
runner.config = SimpleNamespace(group_sessions_per_user=True, thread_sessions_per_user=False)
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
user_id="user-1",
)
await runner._defer_goal_status_notice_after_delivery(source, "✓ Goal achieved: done")
assert adapter.calls == []
assert len(adapter.callbacks) == 1
_, callback = next(iter(adapter.callbacks.values()))
result = callback()
if hasattr(result, "__await__"):
await result
assert adapter.calls == [
{
"chat_id": "parent-channel",
"content": "✓ Goal achieved: done",
"reply_to": None,
"metadata": {"thread_id": "thread-123"},
}
]
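Note on the deferral test above: the ordering guarantee is that the judge status line is sent only after the agent's visible reply has gone out. A self-contained asyncio sketch of that pattern follows. `RecordingAdapter` and its `deliver` method are hypothetical stand-ins; the real `BasePlatformAdapter` delivery path is not shown in this diff.

```python
import asyncio


class RecordingAdapter:
    """Hypothetical adapter that records sends and stores callbacks."""

    def __init__(self):
        self.sent = []
        self.callbacks = {}

    async def send(self, chat_id, content):
        self.sent.append((chat_id, content))

    def register_post_delivery_callback(self, session_key, callback):
        self.callbacks[session_key] = callback

    async def deliver(self, session_key, chat_id, reply):
        # Send the agent's reply first, then fire any deferred callback.
        await self.send(chat_id, reply)
        callback = self.callbacks.pop(session_key, None)
        if callback is not None:
            result = callback()
            if asyncio.iscoroutine(result):
                await result


async def demo():
    adapter = RecordingAdapter()
    # Defer the status line instead of sending it immediately.
    adapter.register_post_delivery_callback(
        "s1", lambda: adapter.send("chat", "✓ Goal achieved: done")
    )
    assert adapter.sent == []  # nothing goes out at registration time
    await adapter.deliver("s1", "chat", "final reply")
    return adapter.sent


sent = asyncio.run(demo())
```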
def test_clear_goal_pending_continuations_removes_slot_and_overflow_only():
"""Regression: /goal pause/clear must cancel queued self-continuations.
A user-issued /goal pause can arrive after the judge queued the next
continuation but before that queued turn runs. The queued synthetic goal
continuation should be removed without dropping normal user /queue items.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
adapter._pending_messages = {}
runner._queued_events = {}
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
)
session_key = "discord:parent-channel:thread-123"
normal_event = MessageEvent(
text="normal queued user message",
message_type=MessageType.TEXT,
source=source,
)
adapter._pending_messages[session_key] = _goal_continuation_event(source)
runner._queued_events[session_key] = [
normal_event,
_goal_continuation_event(source, goal="second continuation"),
]
removed = runner._clear_goal_pending_continuations(session_key, adapter)
assert removed == 2
assert adapter._pending_messages.get(session_key) is None
assert runner._queued_events[session_key] == [normal_event]
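Note on the test above: it expects synthetic goal continuations to be removed from both the single pending slot and the overflow queue while ordinary queued messages survive. The sketch below illustrates one way to do that filtering. Everything in it is hypothetical: the `TEMPLATE` wording, the prefix-based detection, and the function names are assumptions, since the real `CONTINUATION_PROMPT_TEMPLATE` and `_clear_goal_pending_continuations` bodies are not part of this diff.

```python
from dataclasses import dataclass

# Hypothetical template; the real one lives in hermes_cli.goals.
TEMPLATE = "[goal-continuation] Keep working toward: {goal}"
PREFIX = TEMPLATE.split("{goal}")[0]


@dataclass
class Event:
    text: str


def is_goal_continuation(event) -> bool:
    # Assumption: continuations are recognizable by the template prefix.
    return event is not None and event.text.startswith(PREFIX)


def clear_goal_continuations(session_key, pending, queued) -> int:
    """Drop synthetic continuations from the slot and the overflow queue."""
    removed = 0
    if is_goal_continuation(pending.get(session_key)):
        del pending[session_key]
        removed += 1
    events = queued.get(session_key, [])
    kept = [e for e in events if not is_goal_continuation(e)]
    removed += len(events) - len(kept)
    if session_key in queued:
        queued[session_key] = kept
    return removed


normal = Event("normal queued user message")
pending = {"k": Event(TEMPLATE.format(goal="finish the task"))}
queued = {"k": [normal, Event(TEMPLATE.format(goal="second continuation"))]}
removed = clear_goal_continuations("k", pending, queued)
```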
+15 -11
@@ -61,8 +61,9 @@ class _RecordingAdapter:
return _R()
def _make_runner_with_adapter():
def _make_runner_with_adapter(session_id: str = None):
from gateway.run import GatewayRunner
import uuid
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
@@ -74,9 +75,12 @@ def _make_runner_with_adapter():
runner._queued_events = {}
src = _make_source()
# Default to a unique session_id so xdist parallel runs on the same worker
# don't see each other's GoalManager state (DEFAULT_DB_PATH gets frozen at
# module-import time, defeating per-test HERMES_HOME monkeypatches).
session_entry = SessionEntry(
session_key=build_session_key(src),
session_id="goal-sess-1",
session_id=session_id or f"goal-sess-{uuid.uuid4().hex[:8]}",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
@@ -103,8 +107,8 @@ async def test_goal_verdict_done_sent_via_adapter_send(hermes_home):
mgr = GoalManager(session_entry.session_id)
mgr.set("ship the feature")
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="I shipped the feature.",
@@ -132,8 +136,8 @@ async def test_goal_verdict_continue_enqueues_continuation(hermes_home):
mgr = GoalManager(session_entry.session_id)
mgr.set("polish the docs")
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="here's a partial edit",
@@ -160,8 +164,8 @@ async def test_goal_verdict_budget_exhausted_sends_pause(hermes_home):
state.turns_used = 2
save_goal(session_entry.session_id, state)
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="still partial",
@@ -181,7 +185,7 @@ async def test_goal_verdict_skipped_when_no_active_goal(hermes_home):
"""No goal set → the hook is a no-op. Nothing is sent, nothing enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
runner._post_turn_goal_continuation(
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="anything",
@@ -207,9 +211,9 @@ async def test_goal_verdict_survives_adapter_without_send(hermes_home):
runner.adapters[Platform.TELEGRAM] = _NoSendAdapter()
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok")):
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok", False)):
# must not raise
runner._post_turn_goal_continuation(
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="whatever",
+175 -17
@@ -40,14 +40,14 @@ class TestParseJudgeResponse:
def test_clean_json_done(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response('{"done": true, "reason": "all good"}')
done, reason, _ = _parse_judge_response('{"done": true, "reason": "all good"}')
assert done is True
assert reason == "all good"
def test_clean_json_continue(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response('{"done": false, "reason": "more work needed"}')
done, reason, _ = _parse_judge_response('{"done": false, "reason": "more work needed"}')
assert done is False
assert reason == "more work needed"
@@ -55,7 +55,7 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
raw = '```json\n{"done": true, "reason": "done"}\n```'
done, reason = _parse_judge_response(raw)
done, reason, _ = _parse_judge_response(raw)
assert done is True
assert "done" in reason
@@ -64,7 +64,7 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
raw = 'Looking at this... the agent says X. Verdict: {"done": false, "reason": "partial"}'
done, reason = _parse_judge_response(raw)
done, reason, _ = _parse_judge_response(raw)
assert done is False
assert reason == "partial"
@@ -72,24 +72,24 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
for s in ("true", "yes", "done", "1"):
done, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
done, _, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
assert done is True
for s in ("false", "no", "not yet"):
done, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
done, _, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
assert done is False
def test_malformed_json_fails_open(self):
"""Non-JSON → not done, with error-ish reason (so judge_goal can map to continue)."""
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response("this is not json at all")
done, reason, _ = _parse_judge_response("this is not json at all")
assert done is False
assert reason # non-empty
def test_empty_response(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response("")
done, reason, _ = _parse_judge_response("")
assert done is False
assert reason
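Note on the parser tests above: together they describe a tolerant parser that accepts clean JSON, JSON inside a code fence, JSON embedded in prose, and string-valued `done` fields, and that now returns a third element flagging unparseable replies. The sketch below is one possible implementation consistent with those tests, not the actual `_parse_judge_response`; the function name, the regex approach, and the reason strings are assumptions.

```python
import json
import re

TRUTHY = {"true", "yes", "done", "1"}


def parse_judge_response(raw: str):
    """Return (done, reason, parse_failed) from a judge reply."""
    text = (raw or "").strip()
    if not text:
        return False, "judge returned empty response", True
    # Take the outermost {...} span so code fences and leading prose
    # are ignored. This is greedy, so prose containing extra braces
    # would defeat it; a real parser may need to be more careful.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return False, f"judge reply was not JSON: {text[:60]!r}", True
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return False, f"judge reply was not JSON: {text[:60]!r}", True
    done = data.get("done", False)
    if isinstance(done, str):
        # Accept string verdicts such as "yes" or "not yet".
        done = done.strip().lower() in TRUTHY
    return bool(done), str(data.get("reason", "")), False


fenced = parse_judge_response('```json\n{"done": true, "reason": "done"}\n```')
prose = parse_judge_response('Verdict: {"done": false, "reason": "partial"}')
empty = parse_judge_response("")
```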
@@ -103,13 +103,13 @@ class TestJudgeGoal:
def test_empty_goal_skipped(self):
from hermes_cli.goals import judge_goal
verdict, _ = judge_goal("", "some response")
verdict, _, _ = judge_goal("", "some response")
assert verdict == "skipped"
def test_empty_response_continues(self):
from hermes_cli.goals import judge_goal
verdict, _ = judge_goal("ship the thing", "")
verdict, _, _ = judge_goal("ship the thing", "")
assert verdict == "continue"
def test_no_aux_client_continues(self):
@@ -120,7 +120,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, None),
):
verdict, _ = goals.judge_goal("my goal", "my response")
verdict, _, _ = goals.judge_goal("my goal", "my response")
assert verdict == "continue"
def test_api_error_continues(self):
@@ -133,7 +133,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "response")
verdict, reason, _ = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert "judge error" in reason.lower()
@@ -152,7 +152,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "agent response")
verdict, reason, _ = goals.judge_goal("goal", "agent response")
assert verdict == "done"
assert reason == "achieved"
@@ -171,7 +171,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "agent response")
verdict, reason, _ = goals.judge_goal("goal", "agent response")
assert verdict == "continue"
assert reason == "not yet"
@@ -260,7 +260,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-1")
mgr.set("ship it")
with patch.object(goals, "judge_goal", return_value=("done", "shipped")):
with patch.object(goals, "judge_goal", return_value=("done", "shipped", False)):
decision = mgr.evaluate_after_turn("I shipped the feature.")
assert decision["verdict"] == "done"
@@ -276,7 +276,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-2", default_max_turns=5)
mgr.set("a long goal")
with patch.object(goals, "judge_goal", return_value=("continue", "more work")):
with patch.object(goals, "judge_goal", return_value=("continue", "more work", False)):
decision = mgr.evaluate_after_turn("made some progress")
assert decision["verdict"] == "continue"
@@ -294,7 +294,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-3", default_max_turns=2)
mgr.set("hard goal")
with patch.object(goals, "judge_goal", return_value=("continue", "not yet")):
with patch.object(goals, "judge_goal", return_value=("continue", "not yet", False)):
d1 = mgr.evaluate_after_turn("step 1")
assert d1["should_continue"] is True
assert mgr.state.turns_used == 1
@@ -356,3 +356,161 @@ def test_goal_command_dispatches_in_cli_registry_helpers():
assert "/goal" in COMMANDS
session_cmds = COMMANDS_BY_CATEGORY.get("Session", {})
assert "/goal" in session_cmds
# ──────────────────────────────────────────────────────────────────────
# Auto-pause on consecutive judge parse failures
# ──────────────────────────────────────────────────────────────────────
class TestJudgeParseFailureAutoPause:
"""Regression: weak judge models (e.g. deepseek-v4-flash) that return
empty strings or non-JSON prose must auto-pause the loop after N turns
instead of burning the whole turn budget."""
def test_parse_response_flags_empty_as_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, reason, parse_failed = _parse_judge_response("")
assert done is False
assert parse_failed is True
assert "empty" in reason.lower()
def test_parse_response_flags_non_json_as_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, reason, parse_failed = _parse_judge_response(
"Let me analyze whether the goal is fully satisfied based on the agent's response..."
)
assert done is False
assert parse_failed is True
assert "not json" in reason.lower()
def test_parse_response_clean_json_is_not_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, _, parse_failed = _parse_judge_response(
'{"done": false, "reason": "more work"}'
)
assert done is False
assert parse_failed is False
def test_api_error_does_not_count_as_parse_failure(self):
"""Transient network/API errors must not trip the auto-pause guard."""
from hermes_cli import goals
fake_client = MagicMock()
fake_client.chat.completions.create.side_effect = RuntimeError("connection reset")
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, _, parse_failed = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert parse_failed is False
def test_empty_judge_reply_flagged_as_parse_failure(self):
"""End-to-end: judge returns empty content → parse_failed=True."""
from hermes_cli import goals
fake_client = MagicMock()
fake_client.chat.completions.create.return_value = MagicMock(
choices=[MagicMock(message=MagicMock(content=""))]
)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, _, parse_failed = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert parse_failed is True
def test_auto_pause_after_three_consecutive_parse_failures(self, hermes_home):
"""N=3 consecutive parse failures → auto-pause with config pointer."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager, DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES
assert DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES == 3
mgr = GoalManager(session_id="parse-fail-sid-1", default_max_turns=20)
mgr.set("do a thing")
with patch.object(
goals, "judge_goal", return_value=("continue", "judge returned empty response", True)
):
d1 = mgr.evaluate_after_turn("step 1")
assert d1["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 1
d2 = mgr.evaluate_after_turn("step 2")
assert d2["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 2
d3 = mgr.evaluate_after_turn("step 3")
assert d3["should_continue"] is False
assert d3["status"] == "paused"
assert mgr.state.consecutive_parse_failures == 3
# Message points at the config surface so the user can fix it.
assert "auxiliary" in d3["message"]
assert "goal_judge" in d3["message"]
assert "config.yaml" in d3["message"]
def test_parse_failure_counter_resets_on_good_reply(self, hermes_home):
"""A single good judge reply resets the counter — transient flakes don't pause."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_id="parse-fail-sid-2", default_max_turns=20)
mgr.set("another goal")
# Two parse failures…
with patch.object(
goals, "judge_goal", return_value=("continue", "not json", True)
):
mgr.evaluate_after_turn("step 1")
mgr.evaluate_after_turn("step 2")
assert mgr.state.consecutive_parse_failures == 2
# …then one clean reply resets the counter.
with patch.object(
goals, "judge_goal", return_value=("continue", "making progress", False)
):
d = mgr.evaluate_after_turn("step 3")
assert d["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 0
def test_parse_failure_counter_not_incremented_by_api_errors(self, hermes_home):
"""API/transport errors must NOT count toward the auto-pause threshold."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_id="parse-fail-sid-3", default_max_turns=20)
mgr.set("goal")
with patch.object(
goals, "judge_goal", return_value=("continue", "judge error: RuntimeError", False)
):
for _ in range(5):
d = mgr.evaluate_after_turn("still going")
assert d["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 0
assert mgr.state.status == "active"
def test_consecutive_parse_failures_persists_across_goalmanager_reloads(
self, hermes_home
):
"""The counter must be durable so cross-session resumes see it."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager, load_goal
mgr = GoalManager(session_id="parse-fail-sid-4", default_max_turns=20)
mgr.set("persistent goal")
with patch.object(
goals, "judge_goal", return_value=("continue", "empty", True)
):
mgr.evaluate_after_turn("r")
mgr.evaluate_after_turn("r")
reloaded = load_goal("parse-fail-sid-4")
assert reloaded is not None
assert reloaded.consecutive_parse_failures == 2
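Note on the auto-pause tests above: the counter increments only on parse failures, resets on any usable reply (including API errors, which arrive with `parse_failed=False`), and pauses the goal at the threshold of 3. A minimal sketch of that bookkeeping follows. `GoalState` and `record_verdict` here are simplified, hypothetical stand-ins; the real `GoalManager` also tracks the turn budget, handles "done" verdicts, and persists state, none of which is modeled.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_PARSE_FAILURES = 3  # matches the N=3 asserted in the tests


@dataclass
class GoalState:
    status: str = "active"
    consecutive_parse_failures: int = 0


def record_verdict(state: GoalState, parse_failed: bool) -> bool:
    """Update the counter and return True if the loop should continue."""
    if parse_failed:
        state.consecutive_parse_failures += 1
    else:
        # Any usable reply, including an API error, resets the streak.
        state.consecutive_parse_failures = 0
    if state.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
        state.status = "paused"
        return False
    return True


state = GoalState()
results = [record_verdict(state, True) for _ in range(3)]

# Two failures followed by one clean reply should leave the goal active.
flaky = GoalState()
record_verdict(flaky, True)
record_verdict(flaky, True)
record_verdict(flaky, False)
```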
@@ -0,0 +1,74 @@
"""Regression tests for Windows install.ps1 dependency branch handling.
These assertions lock in the critical control-flow paths needed for native
Windows CLI + TUI installs:
- Node.js install via winget, with managed ZIP fallback
- npm invocation that avoids execution-policy failures on npm.ps1
- Python dependency fallback chain for Windows CLI/TUI
- Managed Node PATH/HERMES_NODE persistence across terminal sessions
"""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
INSTALL_PS1 = REPO_ROOT / "scripts" / "install.ps1"
def test_node_install_keeps_winget_and_zip_fallback_paths() -> None:
text = INSTALL_PS1.read_text()
# Primary path: modern Windows machines with winget.
assert "if (Get-Command winget -ErrorAction SilentlyContinue)" in text
assert "winget install OpenJS.NodeJS.LTS" in text
# Fallback path: no winget / winget failure => managed ZIP install.
assert 'Write-Info "Downloading Node.js $NodeVersion binary..."' in text
assert 'Move-Item $extractedDir.FullName "$HermesHome\\node"' in text
assert '& "$HermesHome\\node\\node.exe" --version' in text
def test_system_packages_keep_winget_choco_scoop_fallback_chain() -> None:
text = INSTALL_PS1.read_text()
assert "$hasWinget = Get-Command winget -ErrorAction SilentlyContinue" in text
assert "$hasChoco = Get-Command choco -ErrorAction SilentlyContinue" in text
assert "$hasScoop = Get-Command scoop -ErrorAction SilentlyContinue" in text
assert "if ($hasWinget)" in text
assert "if ($hasChoco -and ($needRipgrep -or $needFfmpeg))" in text
assert "if ($hasScoop -and ($needRipgrep -or $needFfmpeg))" in text
def test_npm_resolution_avoids_powershell_policy_blocks() -> None:
text = INSTALL_PS1.read_text()
# Prefer npm.cmd and convert npm.ps1 -> npm.cmd when needed.
assert "function Resolve-NpmInvocation" in text
assert "Get-Command npm.cmd -ErrorAction SilentlyContinue" in text
assert '[System.IO.Path]::ChangeExtension($npm.Source, ".cmd")' in text
# Last-resort path should still work by launching npm-cli.js via node.
assert "node_modules\\npm\\bin\\npm-cli.js" in text
assert "Invoke-NpmInstallSilent -WorkingDir $InstallDir" in text
assert "Invoke-NpmInstallSilent -WorkingDir $tuiDir" in text
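Note on the npm resolution test above: PowerShell's execution policy applies to `.ps1` scripts but not to `.cmd` shims, so mapping a resolved `npm.ps1` to its `npm.cmd` sibling sidesteps the policy block. The Python sketch below mirrors the `ChangeExtension` step asserted in the test. `prefer_npm_cmd` is a hypothetical name, and the sketch does not check that the `.cmd` file exists, which the real `Resolve-NpmInvocation` may well do.

```python
from pathlib import PureWindowsPath


def prefer_npm_cmd(npm_source: str) -> str:
    """Map an npm.ps1 shim to its npm.cmd sibling; leave other paths alone."""
    path = PureWindowsPath(npm_source)
    if path.suffix.lower() == ".ps1":
        return str(path.with_suffix(".cmd"))
    return str(path)


resolved = prefer_npm_cmd(r"C:\Program Files\nodejs\npm.ps1")
unchanged = prefer_npm_cmd(r"C:\nodejs\npm.cmd")
```

`PureWindowsPath` is used so the example behaves the same on any host OS.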
def test_python_dependency_install_has_windows_cli_tui_fallback() -> None:
text = INSTALL_PS1.read_text()
# Keep broad install attempt first.
assert '& $UvCmd pip install -e ".[all]"' in text
# Then fallback to Windows CLI/TUI essentials if optional extras fail.
assert '& $UvCmd pip install -e ".[pty,mcp,honcho,acp]"' in text
# Final safety fallback to base package.
assert '& $UvCmd pip install -e "."' in text
assert 'throw "Failed to install Hermes Python dependencies."' in text
def test_managed_node_is_persisted_for_future_tui_runs() -> None:
text = INSTALL_PS1.read_text()
assert "Add-UserPathEntry -CurrentPath $newPath -Entry $managedNodeDir" in text
assert '[Environment]::SetEnvironmentVariable("HERMES_NODE", $managedNodeExe, "User")' in text
assert '$env:Path = Add-UserPathEntry -CurrentPath $env:Path -Entry "$HermesHome\\node"' in text
+36 -9
@@ -828,18 +828,45 @@ class TestE2EChannelsList:
assert result["channels"][0]["target"] == "slack:C1234"
def test_channels_with_directory(self, mcp_server_e2e, _event_loop, monkeypatch):
"""Populated channel_directory.json should be unwrapped via the 'platforms' key.
Regression test for issue #21474: the writer wraps platforms under
{"updated_at": ..., "platforms": {...}} but the reader was iterating
directory.items() directly, so channels_list always returned 0.
"""
import mcp_serve
monkeypatch.setattr(mcp_serve, "_load_channel_directory", lambda: {
"telegram": [
{"id": "123456", "name": "Alice", "type": "dm"},
{"id": "-100999", "name": "Dev Group", "type": "group"},
],
"updated_at": "2026-05-07T12:00:00",
"platforms": {
"telegram": [
{"id": "123456", "name": "Alice", "type": "dm"},
{"id": "-100999", "name": "Dev Group", "type": "group"},
],
"discord": [
{"id": "789", "name": "general", "type": "text"},
],
},
})
# Need to recreate server to pick up the new mock
server, bridge = mcp_server_e2e
# The tool closure already captured the old mock, so test the function directly
directory = mcp_serve._load_channel_directory()
assert len(directory["telegram"]) == 2
server, _ = mcp_server_e2e
result = _run_tool(server, "channels_list")
assert result["count"] == 3
targets = {c["target"] for c in result["channels"]}
assert targets == {"telegram:123456", "telegram:-100999", "discord:789"}
def test_channels_with_directory_platform_filter(self, mcp_server_e2e, _event_loop, monkeypatch):
"""Platform filter should work against the wrapped 'platforms' payload."""
import mcp_serve
monkeypatch.setattr(mcp_serve, "_load_channel_directory", lambda: {
"updated_at": "2026-05-07T12:00:00",
"platforms": {
"telegram": [{"id": "123456", "name": "Alice", "type": "dm"}],
"discord": [{"id": "789", "name": "general", "type": "text"}],
},
})
server, _ = mcp_server_e2e
result = _run_tool(server, "channels_list", {"platform": "discord"})
assert result["count"] == 1
assert result["channels"][0]["target"] == "discord:789"
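Note on the two channel-directory tests above: the fix is for the reader to unwrap the `platforms` key rather than iterate the top-level dictionary, which also contains `updated_at`. A minimal sketch of such a reader follows. `list_channels` is a hypothetical name and the returned shape is inferred from the assertions in the tests; the real `channels_list` tool in `mcp_serve` is not shown in this diff.

```python
from typing import Optional


def list_channels(directory: dict, platform: Optional[str] = None) -> dict:
    """Flatten a wrapped channel directory into target strings."""
    # Unwrap the 'platforms' key; metadata such as 'updated_at' is skipped.
    platforms = directory.get("platforms", {})
    channels = []
    for name, entries in platforms.items():
        if platform is not None and name != platform:
            continue
        for entry in entries:
            channels.append(
                {"target": f"{name}:{entry['id']}", "name": entry["name"]}
            )
    return {"count": len(channels), "channels": channels}


directory = {
    "updated_at": "2026-05-07T12:00:00",
    "platforms": {
        "telegram": [
            {"id": "123456", "name": "Alice", "type": "dm"},
            {"id": "-100999", "name": "Dev Group", "type": "group"},
        ],
        "discord": [{"id": "789", "name": "general", "type": "text"}],
    },
}
everything = list_channels(directory)
discord_only = list_channels(directory, platform="discord")
```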
class TestE2EPermissions:
+17 -11
@@ -1863,13 +1863,15 @@ def test_config_set_personality_rejects_unknown_name(monkeypatch):
assert "Unknown personality" in resp["error"]["message"]
def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
def test_config_set_personality_preserves_history_and_returns_info(monkeypatch):
agent = types.SimpleNamespace(
ephemeral_system_prompt=None, _cached_system_prompt="old"
)
session = _session(
agent=types.SimpleNamespace(),
agent=agent,
history=[{"role": "user", "text": "hi"}],
history_version=4,
)
new_agent = types.SimpleNamespace(model="x")
emits = []
server._sessions["sid"] = session
@@ -1878,13 +1880,9 @@ def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
"_available_personalities",
lambda cfg=None: {"helpful": "You are helpful."},
)
monkeypatch.setattr(
server, "_make_agent", lambda sid, key, session_id=None: new_agent
)
monkeypatch.setattr(
server, "_session_info", lambda agent: {"model": getattr(agent, "model", "?")}
)
monkeypatch.setattr(server, "_restart_slash_worker", lambda session: None)
monkeypatch.setattr(server, "_emit", lambda *args: emits.append(args))
monkeypatch.setattr(server, "_write_config_key", lambda path, value: None)
@@ -1896,11 +1894,19 @@ def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
}
)
assert resp["result"]["history_reset"] is True
assert resp["result"]["info"] == {"model": "x"}
assert session["history"] == []
assert resp["result"]["history_reset"] is False
assert resp["result"]["info"] == {"model": "?"}
# History is preserved with a pivot marker appended
assert len(session["history"]) == 2
assert session["history"][0] == {"role": "user", "text": "hi"}
assert session["history"][1]["role"] == "user"
assert "personality" in session["history"][1]["content"].lower()
assert "You are helpful." in session["history"][1]["content"]
assert session["history_version"] == 5
assert ("session.info", "sid", {"model": "x"}) in emits
# Agent's system prompt was updated in-place; cached prompt untouched
assert agent.ephemeral_system_prompt == "You are helpful."
assert agent._cached_system_prompt == "old"
assert ("session.info", "sid", {"model": "?"}) in emits
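Note on the revised personality test above: the behavior changes from "rebuild the agent and wipe history" to "update the agent's ephemeral system prompt in place, append a pivot marker to history, and bump the history version". The sketch below shows that sequence under stated assumptions. `apply_personality` is a hypothetical name and the marker wording is invented; the test only requires that the marker mentions "personality" and contains the new prompt text.

```python
from types import SimpleNamespace


def apply_personality(session: dict, agent, name: str, prompt: str) -> dict:
    """Switch personality in place without discarding the conversation."""
    # Update the live agent rather than building a new one; the cached
    # system prompt is deliberately left alone.
    agent.ephemeral_system_prompt = prompt
    # Hypothetical marker wording (see the note above).
    session["history"].append(
        {"role": "user", "content": f"[Personality switched to {name}] {prompt}"}
    )
    session["history_version"] += 1
    return {"history_reset": False}


agent = SimpleNamespace(ephemeral_system_prompt=None, _cached_system_prompt="old")
session = {"history": [{"role": "user", "text": "hi"}], "history_version": 4}
result = apply_personality(session, agent, "helpful", "You are helpful.")
```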
def test_session_compress_uses_compress_helper(monkeypatch):
+182
@@ -0,0 +1,182 @@
"""Unit tests for Windows UTF-8 process bootstrap."""
from __future__ import annotations
import os
from types import SimpleNamespace
import utf8_bootstrap as utf8_bootstrap
def _fake_sys(
*,
platform: str,
utf8_mode: int,
argv: list[str] | None = None,
executable: str = r"C:\Python\python.exe",
) -> SimpleNamespace:
return SimpleNamespace(
platform=platform,
flags=SimpleNamespace(utf8_mode=utf8_mode),
argv=argv or ["hermes"],
executable=executable,
)
def test_non_windows_noop(monkeypatch) -> None:
monkeypatch.setattr(
utf8_bootstrap,
"sys",
_fake_sys(platform="darwin", utf8_mode=0),
)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should not run on non-Windows")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert utf8_bootstrap.ensure_windows_utf8_mode() is False
assert called["exec"] is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
def test_windows_utf8_already_enabled_sets_env_without_reexec(monkeypatch) -> None:
monkeypatch.setattr(
utf8_bootstrap,
"sys",
_fake_sys(platform="win32", utf8_mode=1),
)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should not run when utf8_mode=1")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert utf8_bootstrap.ensure_windows_utf8_mode() is False
assert called["exec"] is False
assert os.environ["PYTHONUTF8"] == "1"
assert os.environ["PYTHONIOENCODING"] == "utf-8"
def test_windows_reexec_attempt_uses_utf8_flag(monkeypatch) -> None:
fake_sys = _fake_sys(platform="win32", utf8_mode=0, argv=["hermes", "--help"])
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
monkeypatch.delenv("_HERMES_UTF8_REEXEC", raising=False)
captured: dict[str, object] = {}
def _fake_exec(executable, argv, env):
captured["executable"] = executable
captured["argv"] = argv
captured["env"] = env
raise OSError("blocked by test")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert (
utf8_bootstrap.ensure_windows_utf8_mode(entrypoint_markers=("hermes",))
is False
)
assert captured["executable"] == fake_sys.executable
assert captured["argv"] == [
fake_sys.executable,
"-X",
"utf8",
*fake_sys.argv,
]
env = captured["env"]
assert isinstance(env, dict)
assert env["PYTHONUTF8"] == "1"
assert env["PYTHONIOENCODING"] == "utf-8"
assert env["_HERMES_UTF8_REEXEC"] == "1"
def test_module_reexec_uses_dash_m_and_drops_argv0(monkeypatch) -> None:
fake_sys = _fake_sys(
platform="win32",
utf8_mode=0,
argv=[r"C:\Users\me\AppData\Local\Programs\Python\Scripts\hermes.exe", "chat", "--verbose"],
)
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.delenv("_HERMES_UTF8_REEXEC", raising=False)
captured: dict[str, object] = {}
def _fake_exec(executable, argv, env):
captured["executable"] = executable
captured["argv"] = argv
captured["env"] = env
raise OSError("blocked by test")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert (
utf8_bootstrap.ensure_windows_utf8_mode(
module="hermes_cli.main",
entrypoint_markers=("hermes",),
)
is False
)
assert captured["executable"] == fake_sys.executable
assert captured["argv"] == [
fake_sys.executable,
"-X",
"utf8",
"-m",
"hermes_cli.main",
"chat",
"--verbose",
]
env = captured["env"]
assert isinstance(env, dict)
assert env["_HERMES_UTF8_REEXEC"] == "1"
def test_marker_mismatch_skips_reexec(monkeypatch) -> None:
fake_sys = _fake_sys(platform="win32", utf8_mode=0, argv=["pytest", "-k", "x"])
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.delenv("_HERMES_UTF8_REEXEC", raising=False)
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should be skipped for non-matching marker")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert (
utf8_bootstrap.ensure_windows_utf8_mode(entrypoint_markers=("hermes",))
is False
)
assert called["exec"] is False
def test_reexec_guard_prevents_loops(monkeypatch) -> None:
fake_sys = _fake_sys(platform="win32", utf8_mode=0, argv=["hermes"])
monkeypatch.setattr(utf8_bootstrap, "sys", fake_sys)
monkeypatch.setenv("_HERMES_UTF8_REEXEC", "1")
called = {"exec": False}
def _fake_exec(*_args, **_kwargs):
called["exec"] = True
raise AssertionError("exec should be skipped when guard is set")
monkeypatch.setattr(utf8_bootstrap.os, "execvpe", _fake_exec)
assert utf8_bootstrap.ensure_windows_utf8_mode() is False
assert called["exec"] is False
+33
@@ -328,3 +328,36 @@ class TestSanePathIncludesHomebrew:
result = _make_run_env({})
# Should keep existing PATH unchanged
assert result["PATH"] == "/usr/bin:/bin"
class TestWindowsSanePath:
def test_make_run_env_windows_uses_windows_path_rules(self):
"""Windows mode should use ';' and avoid POSIX /usr/bin injections."""
from tools.environments.local import _make_run_env
with patch("tools.environments.local._IS_WINDOWS", True), patch.dict(
os.environ, {"PATH": r"C:\Users\Test\bin"}, clear=True
):
result = _make_run_env({})
assert "PATH" in result
assert ";" in result["PATH"]
assert "/usr/bin" not in result["PATH"]
parts = [p for p in result["PATH"].split(";") if p]
assert any("git" in p.lower() and "bin" in p.lower() for p in parts)
def test_make_run_env_windows_dedupes_case_insensitive_entries(self):
"""Repeated Windows path entries should not be appended twice."""
from tools.environments.local import _make_run_env
with patch("tools.environments.local._IS_WINDOWS", True), patch.dict(
os.environ, {"PATH": r"C:\Windows\System32;C:\TOOLS\BIN"}, clear=True
):
result = _make_run_env({})
normalized = [
p.replace("/", "\\").lower().rstrip("\\")
for p in result["PATH"].split(";")
if p
]
assert normalized.count(r"c:\windows\system32") == 1
+43 -1
@@ -1,6 +1,7 @@
"""Local execution environment — spawn-per-call with session snapshot."""
import logging
import ntpath
import os
import platform
import re
@@ -217,6 +218,30 @@ _SANE_PATH = (
"/opt/homebrew/bin:/opt/homebrew/sbin:"
"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
)
_SANE_PATH_WINDOWS = tuple(
p for p in (
os.path.join(os.environ.get("SystemRoot", r"C:\Windows"), "System32"),
os.environ.get("SystemRoot", r"C:\Windows"),
os.path.join(
os.environ.get("SystemRoot", r"C:\Windows"),
"System32",
"WindowsPowerShell",
"v1.0",
),
os.path.join(
os.environ.get("ProgramFiles", r"C:\Program Files"),
"Git",
"bin",
),
os.path.join(
os.environ.get("ProgramFiles(x86)", r"C:\Program Files (x86)"),
"Git",
"bin",
),
os.path.join(os.environ.get("LOCALAPPDATA", ""), "Programs", "Git", "bin"),
)
if p
)
def _make_run_env(env: dict) -> dict:
@@ -235,7 +260,24 @@ def _make_run_env(env: dict) -> dict:
elif k not in _HERMES_PROVIDER_ENV_BLOCKLIST or _is_passthrough(k):
run_env[k] = v
existing_path = run_env.get("PATH", "")
if "/usr/bin" not in existing_path.split(":"):
if _IS_WINDOWS:
# Keep PATH Windows-native (`;` separator, case-insensitive dedupe)
# and avoid injecting POSIX defaults like /usr/bin.
parts = [p for p in existing_path.split(";") if p]
seen = {
ntpath.normcase(ntpath.normpath(p.rstrip("\\/")))
for p in parts
if p
}
for candidate in _SANE_PATH_WINDOWS:
norm = ntpath.normcase(ntpath.normpath(candidate.rstrip("\\/")))
if norm in seen:
continue
parts.append(candidate)
seen.add(norm)
if parts:
run_env["PATH"] = ";".join(parts)
elif "/usr/bin" not in existing_path.split(":"):
run_env["PATH"] = f"{existing_path}:{_SANE_PATH}" if existing_path else _SANE_PATH
# Per-profile HOME isolation: redirect system tool configs (git, ssh, gh,
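The Windows branch above boils down to a case-insensitive, separator-insensitive merge of PATH entries. A minimal standalone sketch of that idea follows; `merge_windows_path` is a hypothetical helper name used only for illustration, not a function in this diff. It relies on `ntpath`, which applies Windows path rules on any host OS.

```python
import ntpath


def merge_windows_path(existing: str, candidates: list) -> str:
    """Append candidates to a ';'-separated PATH, skipping duplicates.

    Hypothetical helper mirroring the dedupe logic in the hunk above.
    """

    def norm(p: str) -> str:
        # normpath unifies separators; normcase lowercases for comparison.
        return ntpath.normcase(ntpath.normpath(p.rstrip("\\/")))

    parts = [p for p in existing.split(";") if p]
    seen = {norm(p) for p in parts}
    for candidate in candidates:
        key = norm(candidate)
        if key in seen:
            continue  # already present under a different spelling
        parts.append(candidate)
        seen.add(key)
    return ";".join(parts)
```

Original entries keep their spelling and order; only the comparison key is normalized, so `c:/windows/system32/` is treated as a duplicate of `C:\Windows\System32`.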
+3 -2
@@ -5,10 +5,11 @@ It implements ``WebSearchProvider`` only — there is no extract capability.
Configuration::
# ~/.hermes/config.yaml (SEARXNG_URL is a URL, not a secret — use config.yaml not .env)
SEARXNG_URL: http://localhost:8080
# ~/.hermes/.env
SEARXNG_URL=http://localhost:8080
# Use SearXNG for search, pair with any extract provider:
# ~/.hermes/config.yaml
web:
search_backend: "searxng"
extract_backend: "firecrawl"
+38 -12
@@ -1280,6 +1280,7 @@ def _get_usage(agent) -> dict:
"output": g("session_output_tokens", "session_completion_tokens"),
"cache_read": g("session_cache_read_tokens"),
"cache_write": g("session_cache_write_tokens"),
"reasoning": g("session_reasoning_tokens"),
"prompt": g("session_prompt_tokens"),
"completion": g("session_completion_tokens"),
"total": g("session_total_tokens"),
@@ -1725,21 +1726,46 @@ def _validate_personality(value: str, cfg: dict | None = None) -> tuple[str, str
def _apply_personality_to_session(
sid: str, session: dict, new_prompt: str
) -> tuple[bool, dict | None]:
"""Apply a personality change to an existing session without resetting history.
Updates the agent's ephemeral system prompt in-place so the new personality
takes effect on the next turn. The cached base system prompt is left intact
(ephemeral_system_prompt is appended at API-call time, not baked into the
cache), which preserves prompt-cache hits.
Also injects a system-style marker (appended as a user-role message) into the
conversation history so the model knows to pivot its style from this point
forward (without this, LLMs tend to continue the tone established by earlier
messages in the transcript).
Returns (history_reset, info); history_reset is always False since we
preserve the conversation.
"""
if not session:
return False, None
try:
info = _reset_session_agent(sid, session)
return True, info
except Exception:
if session.get("agent"):
agent = session["agent"]
agent.ephemeral_system_prompt = new_prompt or None
agent._cached_system_prompt = None
info = _session_info(agent)
_emit("session.info", sid, info)
return False, info
return False, None
agent = session.get("agent")
if agent:
agent.ephemeral_system_prompt = new_prompt or None
# Inject a pivot marker into history so the model sees the change point.
# This prevents it from pattern-matching its prior style.
if new_prompt:
marker = (
"[System: The user has changed the assistant's personality. "
"From this point forward, adopt the following persona and respond "
f"accordingly: {new_prompt}]"
)
else:
marker = (
"[System: The user has cleared the personality overlay. "
"From this point forward, respond in your normal default style.]"
)
with session["history_lock"]:
session["history"].append({"role": "user", "content": marker})
session["history_version"] = int(session.get("history_version", 0)) + 1
info = _session_info(agent)
_emit("session.info", sid, info)
return False, info
return False, None
def _cfg_max_turns(cfg: dict, default: int) -> int:
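The new in-place personality switch can be exercised in isolation. The sketch below is a simplified illustration, assuming a session dict with `history`, `history_lock`, and `history_version` keys as used in the hunk above; `append_pivot_marker` is a hypothetical name and the marker wording is abbreviated.

```python
import threading


def append_pivot_marker(session: dict, new_prompt: str) -> dict:
    """Append a style-pivot marker to the history and bump its version.

    Hypothetical helper; the marker text is shortened for illustration.
    """
    if new_prompt:
        content = f"[System: personality changed. Adopt this persona: {new_prompt}]"
    else:
        content = "[System: personality cleared. Respond in the default style.]"
    marker = {"role": "user", "content": content}
    # Hold the lock so concurrent turns never see a half-updated history.
    with session["history_lock"]:
        session["history"].append(marker)
        session["history_version"] = int(session.get("history_version", 0)) + 1
    return marker


session = {"history": [], "history_lock": threading.Lock()}
append_pivot_marker(session, "pirate")
append_pivot_marker(session, "")
```

Existing messages are never removed, which is why the function in the diff can always report `history_reset` as `False`.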
+14 -1
@@ -92,6 +92,19 @@ export const sessionCommands: SlashCommand[] = [
}
},
{
help: 'browse and resume previous sessions',
name: 'sessions',
run: (arg, ctx) => {
if (ctx.session.guardBusySessionSwitch('switch sessions')) {
return
}
if (!arg.trim()) {
return patchOverlayState({ picker: true })
}
}
},
{
help: 'attach an image',
name: 'image',
@@ -109,7 +122,7 @@ export const sessionCommands: SlashCommand[] = [
},
{
help: 'switch or reset personality (history reset on set)',
help: 'switch personality for this session',
name: 'personality',
run: (arg, ctx) => {
if (!arg) {
+2
@@ -164,9 +164,11 @@ export interface Usage {
context_max?: number
context_percent?: number
context_used?: number
cost_status?: string
cost_usd?: number
input: number
output: number
reasoning?: number
total: number
}
+81
@@ -0,0 +1,81 @@
"""Windows UTF-8 bootstrap for Hermes entrypoints.
On older Windows builds, Python may start with a locale codec such as cp1252.
That makes text-mode ``open()`` without ``encoding=`` and stdio defaults prone
to Unicode decode/encode failures. Hermes touches many files in long-running
processes, so we force UTF-8 mode at process start for CLI entrypoints.
"""
from __future__ import annotations
import os
import sys
_UTF8_REEXEC_GUARD = "_HERMES_UTF8_REEXEC"
def ensure_windows_utf8_mode(
*,
reexec: bool = True,
module: str | None = None,
entrypoint_markers: tuple[str, ...] | None = None,
) -> bool:
"""Ensure UTF-8 defaults on Windows.
Behavior:
- Sets ``PYTHONUTF8=1`` and ``PYTHONIOENCODING=utf-8`` on Windows unless they
are already set (existing values are left untouched).
- If Python is already in UTF-8 mode, returns immediately.
- Otherwise re-execs the current interpreter with ``-X utf8`` (once),
unless marker-gated or explicitly disabled via ``reexec=False``.
- When ``module=...`` is supplied, re-execs as ``python -m <module>`` and
forwards original user args (excluding argv0), which avoids Windows
console-script ``.exe`` wrappers being treated as Python scripts.
Returns ``True`` only when a re-exec is attempted and the exec call
unexpectedly returns (e.g. under a patched test double). In normal
operation ``os.execvpe`` never returns on success.
"""
if sys.platform != "win32":
return False
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
if getattr(sys.flags, "utf8_mode", 0) == 1:
return False
if not reexec:
return False
if os.environ.get(_UTF8_REEXEC_GUARD) == "1":
return False
if entrypoint_markers:
argv0 = ""
if getattr(sys, "argv", None):
argv0 = os.path.basename(str(sys.argv[0])).lower()
markers = tuple(marker.lower() for marker in entrypoint_markers if marker)
if markers and not any(marker in argv0 for marker in markers):
return False
executable = getattr(sys, "executable", None)
argv = list(getattr(sys, "argv", []))
if not executable:
return False
child_env = dict(os.environ)
child_env[_UTF8_REEXEC_GUARD] = "1"
child_argv = [executable, "-X", "utf8"]
if module:
child_argv.extend(["-m", module])
if len(argv) > 1:
child_argv.extend(argv[1:])
else:
child_argv.extend(argv)
try:
os.execvpe(executable, child_argv, child_env)
except OSError:
# Best-effort fallback: env vars remain set for child processes.
return False
# ``exec`` should not return on success.
return True
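The child command line is the part most worth checking by eye. A small sketch of just that step follows, pulled out into a hypothetical pure function (`build_child_argv` is not part of this diff) so the two cases exercised by the tests can be compared side by side.

```python
from typing import List, Optional


def build_child_argv(
    executable: str, argv: List[str], module: Optional[str] = None
) -> List[str]:
    """Build the argv for a ``-X utf8`` re-exec.

    Hypothetical helper mirroring the argv logic in the function above.
    """
    child = [executable, "-X", "utf8"]
    if module:
        # Run as ``python -m <module>`` and drop argv[0], which on Windows
        # may be a console-script .exe wrapper rather than a Python file.
        child += ["-m", module]
        child += list(argv[1:])
    else:
        # Script mode: forward the original argv unchanged.
        child += list(argv)
    return child


script_mode = build_child_argv("python.exe", ["hermes", "--help"])
module_mode = build_child_argv(
    "python.exe", ["hermes.exe", "chat", "--verbose"], module="hermes_cli.main"
)
```

In module mode the wrapper path never reaches the interpreter, so Python is never asked to execute a `.exe` as a script.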
+13
@@ -784,6 +784,7 @@ $ hermes model
[ ] title_generation currently: openrouter / google/gemini-3-flash-preview
[ ] compression currently: auto / main model
[ ] approval currently: auto / main model
[ ] triage_specifier currently: auto / main model
```
Select a task, pick a provider (OAuth flows open a browser; API-key providers prompt), pick a model. The change persists to `auxiliary.<task>.*` in `config.yaml`. Same machinery as the main-model picker — no extra syntax to learn.
@@ -880,6 +881,18 @@ auxiliary:
base_url: ""
api_key: ""
timeout: 30
# Kanban triage specifier — `hermes kanban specify <id>` (or the
# dashboard's ✨ Specify button on Triage-column cards) uses this
# slot to expand a one-liner into a concrete spec and promote the
# task to `todo`. Cheap fast models work well here; spec expansion
# is short and doesn't need reasoning depth.
triage_specifier:
provider: "auto"
model: ""
base_url: ""
api_key: ""
timeout: 120
```
:::tip
@@ -192,6 +192,7 @@ Hermes uses separate lightweight models for side tasks. Each task has its own pr
| MCP | MCP helper operations | `auxiliary.mcp` |
| Approval | Smart command-approval classification | `auxiliary.approval` |
| Title Generation | Session title summaries | `auxiliary.title_generation` |
| Triage Specifier | `hermes kanban specify` / dashboard ✨ button — fleshes out a one-liner triage task into a real spec | `auxiliary.triage_specifier` |
### Auto-Detection Chain
@@ -384,5 +385,6 @@ See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configurat
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
| Approval classification | Auto-detection chain | `auxiliary.approval` |
| Title generation | Auto-detection chain | `auxiliary.title_generation` |
| Triage specifier | Auto-detection chain | `auxiliary.triage_specifier` |
| Delegation | Provider override only (no automatic fallback) | `delegation.provider` / `delegation.model` |
| Cron jobs | Per-job provider override only (no automatic fallback) | Per-job `provider` / `model` |
@@ -22,7 +22,7 @@ Throughout the tutorial, **code blocks labelled `bash` are commands *you* run.**
Six columns, left to right:
- **Triage** — raw ideas, a specifier will flesh out the spec before anyone works on them.
- **Triage** — raw ideas, a specifier will flesh out the spec before anyone works on them. Click the **✨ Specify** button on any triage card (or run `hermes kanban specify <id>` / `/kanban specify <id>` from a chat) to have the auxiliary LLM turn a one-liner into a full spec (goal, approach, acceptance criteria) and promote it to `todo` in one shot. Configure which model runs it under `auxiliary.triage_specifier` in `config.yaml`.
- **Todo** — created but waiting on dependencies, or not yet assigned.
- **Ready** — assigned and waiting for the dispatcher to claim.
- **In progress** — a worker is actively running the task. With "Lanes by profile" on (the default), this column sub-groups by assignee so you can see at a glance what each worker is doing.
+11 -4
@@ -148,8 +148,15 @@ You should see something like `10 results`. If you get a `403 Forbidden`, JSON f
**7. Configure Hermes:**
```bash
# ~/.hermes/config.yaml
SEARXNG_URL: http://localhost:8888
# ~/.hermes/.env
SEARXNG_URL=http://localhost:8888
```
Then select SearXNG as the search backend in `~/.hermes/config.yaml`:
```yaml
web:
search_backend: "searxng"
```
Or set via `hermes tools` → Web Search & Extract → SearXNG.
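
To sanity-check the configured URL outside Hermes, you can build the JSON search request yourself. This is a minimal sketch, not Hermes's own client code: `searxng_search_url` is a hypothetical helper, and it assumes `SEARXNG_URL` is exported in the environment (Hermes loading it from `~/.hermes/.env` is not modelled here). The `/search` endpoint with `format=json` is SearXNG's standard search API.

```python
import os
from typing import Optional
from urllib.parse import urlencode


def searxng_search_url(query: str, base_url: Optional[str] = None) -> str:
    """Return the SearXNG JSON search URL for a query (hypothetical helper)."""
    base = (base_url or os.environ.get("SEARXNG_URL", "")).rstrip("/")
    if not base:
        raise ValueError("SEARXNG_URL is not set")
    # format=json must be enabled on the instance, otherwise it returns 403.
    return f"{base}/search?{urlencode({'q': query, 'format': 'json'})}"


url = searxng_search_url("hello world", "http://localhost:8888/")
```

Fetching that URL with any HTTP client should return a JSON document with a `results` list when the instance has JSON output enabled.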
@@ -161,8 +168,8 @@ Or set via `hermes tools` → Web Search & Extract → SearXNG.
Public SearXNG instances are listed at [searx.space](https://searx.space/). Filter by instances that have **JSON format enabled** (shown in the table).
```bash
# ~/.hermes/config.yaml
SEARXNG_URL: https://searx.example.com
# ~/.hermes/.env
SEARXNG_URL=https://searx.example.com
```
:::caution Public instances