Compare commits

24 Commits

Author SHA1 Message Date
Brooklyn Nicholson f45b844670 fix(tui): improve learning ledger scanability
Shrink the details panel share and simplify category row labels so the four learning lists are easier to scan.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 2476beac3a feat(tui): split learning ledger into category panels
Stress the shared overlay grid with separate memories, skills, recalls, and connected panels plus a details panel navigated by arrow keys.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 8a0498d41e fix(tui): reserve overlay panel footers
Let overlay grid panels define footer content outside the clipped body so hints and pager controls stay visible under height caps.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 5f749667e2 fix(tui): cap overlay grid height
Give floating overlay panels a shared terminal-derived height cap so long details or pager content clips inside the modal instead of expanding the whole overlay upward.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson ee2cc327cb refactor(tui): use shared overlay grid
Replace bespoke floating boxes with a shared panel grid renderer so overlays and slash completions use stable widths, gaps, and panel ratios.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson e004e1e5e4 fix(tui): simplify learning ledger panes
Drop nested borders from the learning ledger grid so the single floating shell frames the list/details layout without visual clutter.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 2fe2d943b1 fix(tui): let learned overlay use full shell width
Remove the max-width cap from floating overlays and pass the full shell width into the learning ledger grid.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 9a4bc5508a fix(tui): render learning ledger as grid panes
Make the learned overlay a real two-cell grid with a bordered 70% list pane, fixed gap, and bordered 30% details pane.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 5f50f3df0d fix(tui): prevent learning ledger detail overlap
Pass the fixed floating overlay width into the learning ledger and reserve an explicit 70/30 master-detail split when details are open.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 4821b50cfe fix(tui): stabilize floating overlay widths
Give shared floating overlays a stable terminal-derived width and split slash completions into fixed name/meta columns so popups stop resizing around content.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson 5bf688a30b fix(skins): make prompt symbols replace chevrons
Store built-in skin prompt symbols as the actual replacement glyph and let CLI/TUI prompt renderers own spacing.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson d3cb027e17 fix(tui): stabilize skin prompt width
Normalize skin prompt symbols to trimmed single-line text and measure the active prompt width so wide skin glyphs do not wrap or distort the composer.
2026-04-28 17:01:34 -05:00
Brooklyn Nicholson b0c84756ba fix(tui): keep memory tool previews one-line
Avoid malformed multi-line memory tool labels when the model omits a target by keeping the add preview compact and quote-adjacent.
2026-04-28 17:00:37 -05:00
Brooklyn Nicholson bb5c3c1074 refactor(tui): migrate components to semantic theme tokens
Move regular TUI surfaces from palette-specific color names to semantic primary/accent/border/muted/text tokens, leaving raw color values centralized in theme.ts.
2026-04-28 17:00:37 -05:00
Brooklyn Nicholson 185e8ee942 refactor(tui): use semantic theme text colors
Replace decorative/base palette usage in TUI components with semantic theme text tokens and remove hardcoded overlay colors from FPS and heart indicators.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 024cccb9bc fix(tui): dim learning note color
Use the active theme dim color for learning notes so they stay subtle on dark skins instead of inheriting the bright cornsilk text color.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 15115808b1 fix(tui): render learning notes as standalone rows
Give learning ledger notes their own post-turn row instead of routing them through the normal system-message prefix.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson c8a9e1234f fix(tui): include learning notes in turn completion
Carry learning events on the message completion payload so remembered/recalled notes flush deterministically after the assistant response even if standalone event timing is missed.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 14af4ce665 fix(tui): place learning notes after responses
Buffer live learning events until the turn completes so remembered/recalled notes appear after the assistant response, and trim redundant user prefixes from memory titles.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 9ee36e0732 feat(tui): surface live learning events
Emit learning events from memory, recall, and skill tool completions, render them as subtle italic transcript lines, and show learning stats/provenance in the TUI.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 8e6f560fd3 refactor(tui): make learning ledger master-detail
Keep recent learning entries in a left-hand list and show the selected item details in a right-side pane only when expanded.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 281b5ca546 refactor(tui): widen learning ledger layout
Use a wider floating ledger with two-column rows on large terminals while preserving the compact overlay behavior.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson f2a08f7581 refactor(tui): focus learning ledger on recent growth
Keep installed skills as quiet inventory while promoting remembered facts, recalled sessions, reused skills, and connected integrations as the primary ledger rows.
2026-04-28 17:00:04 -05:00
Brooklyn Nicholson 61dc679815 feat(tui): add learning ledger overlay
Surface existing memories, skills, recalls, and integrations as a read-only growth ledger so Hermes' accumulated context is visible without changing agent behavior.
2026-04-28 17:00:04 -05:00
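The layout commits above describe a 70/30 list/details split with a fixed gap, plus a terminal-derived height cap that reserves footer rows outside the clipped body. The Python sketch below illustrates only that arithmetic. The names `split_panes` and `cap_body_rows`, the 2-column gap, the 80% height share, and the single footer row are assumptions made for illustration; none of them come from the TUI source, which this page does not show.

```python
def split_panes(total_width: int, gap: int = 2, list_percent: int = 70) -> tuple[int, int]:
    """Return (list_width, details_width) for a two-pane grid.

    Hypothetical helper: the gap and percentage are illustrative defaults.
    Integer math keeps the two widths plus the gap equal to total_width.
    """
    usable = max(total_width - gap, 0)
    list_width = usable * list_percent // 100
    return list_width, usable - list_width


def cap_body_rows(terminal_rows: int, footer_rows: int = 1, max_percent: int = 80) -> int:
    """Return how many body rows fit under a terminal-derived height cap.

    Footer rows are subtracted from the cap so hints and pager controls
    are never clipped along with long body content.
    """
    cap = max(terminal_rows * max_percent // 100, footer_rows + 1)
    return cap - footer_rows


if __name__ == "__main__":
    print(split_panes(102))
    print(cap_body_rows(40))
```

Deriving both numbers from the terminal size rather than from content is what keeps the panes from resizing as rows change.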
112 changed files with 1481 additions and 7544 deletions
+1 -1
@@ -494,7 +494,7 @@ branding:
agent_name: "My Agent"
welcome: "Welcome message"
response_label: " ⚔ Agent "
-prompt_symbol: "⚔"
+prompt_symbol: "⚔ "
tool_prefix: "╎" # Tool output line prefix
```
-561
@@ -1,561 +0,0 @@
"""Curator — background skill maintenance orchestrator.
The curator is an auxiliary-model task that periodically reviews agent-created
skills and maintains the collection. It runs inactivity-triggered (no cron
daemon): when the agent is idle and the last curator run was longer than
``interval_hours`` ago, ``maybe_run_curator()`` spawns a forked AIAgent to do
the review.
Responsibilities:
- Auto-transition lifecycle states based on last_used_at timestamps
- Spawn a background review agent that can pin / archive / consolidate /
patch agent-created skills via skill_manage
- Persist curator state (last_run_at, paused, etc.) in .curator_state
Strict invariants:
- Only touches agent-created skills (see tools/skill_usage.is_agent_created)
- Never auto-deletes — only archives. Archive is recoverable.
- Pinned skills bypass all auto-transitions
- Uses the auxiliary client; never touches the main session's prompt cache
"""
from __future__ import annotations
import json
import logging
import os
import tempfile
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Callable, Dict, Optional
from hermes_constants import get_hermes_home
from tools import skill_usage
logger = logging.getLogger(__name__)
DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
DEFAULT_ARCHIVE_AFTER_DAYS = 90
# ---------------------------------------------------------------------------
# .curator_state — persistent scheduler + status
# ---------------------------------------------------------------------------
def _state_file() -> Path:
return get_hermes_home() / "skills" / ".curator_state"
def _default_state() -> Dict[str, Any]:
return {
"last_run_at": None,
"last_run_duration_seconds": None,
"last_run_summary": None,
"paused": False,
"run_count": 0,
}
def load_state() -> Dict[str, Any]:
path = _state_file()
if not path.exists():
return _default_state()
try:
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, dict):
base = _default_state()
base.update({k: v for k, v in data.items() if k in base or k.startswith("_")})
return base
except (OSError, json.JSONDecodeError) as e:
logger.debug("Failed to read curator state: %s", e)
return _default_state()
def save_state(data: Dict[str, Any]) -> None:
path = _state_file()
try:
path.parent.mkdir(parents=True, exist_ok=True)
fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=".curator_state_", suffix=".tmp")
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, sort_keys=True, ensure_ascii=False)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
except BaseException:
try:
os.unlink(tmp)
except OSError:
pass
raise
except Exception as e:
logger.debug("Failed to save curator state: %s", e, exc_info=True)
def set_paused(paused: bool) -> None:
state = load_state()
state["paused"] = bool(paused)
save_state(state)
def is_paused() -> bool:
return bool(load_state().get("paused"))
# ---------------------------------------------------------------------------
# Config access
# ---------------------------------------------------------------------------
def _load_config() -> Dict[str, Any]:
"""Read curator.* config from ~/.hermes/config.yaml. Tolerates missing file."""
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e:
logger.debug("Failed to load config for curator: %s", e)
return {}
if not isinstance(cfg, dict):
return {}
cur = cfg.get("curator") or {}
if not isinstance(cur, dict):
return {}
return cur
def is_enabled() -> bool:
"""Default ON when no config says otherwise."""
cfg = _load_config()
return bool(cfg.get("enabled", True))
def get_interval_hours() -> int:
cfg = _load_config()
try:
return int(cfg.get("interval_hours", DEFAULT_INTERVAL_HOURS))
except (TypeError, ValueError):
return DEFAULT_INTERVAL_HOURS
def get_min_idle_hours() -> float:
cfg = _load_config()
try:
return float(cfg.get("min_idle_hours", DEFAULT_MIN_IDLE_HOURS))
except (TypeError, ValueError):
return DEFAULT_MIN_IDLE_HOURS
def get_stale_after_days() -> int:
cfg = _load_config()
try:
return int(cfg.get("stale_after_days", DEFAULT_STALE_AFTER_DAYS))
except (TypeError, ValueError):
return DEFAULT_STALE_AFTER_DAYS
def get_archive_after_days() -> int:
cfg = _load_config()
try:
return int(cfg.get("archive_after_days", DEFAULT_ARCHIVE_AFTER_DAYS))
except (TypeError, ValueError):
return DEFAULT_ARCHIVE_AFTER_DAYS
# ---------------------------------------------------------------------------
# Idle / interval check
# ---------------------------------------------------------------------------
def _parse_iso(ts: Optional[str]) -> Optional[datetime]:
if not ts:
return None
try:
return datetime.fromisoformat(ts)
except (TypeError, ValueError):
return None
def should_run_now(now: Optional[datetime] = None) -> bool:
"""Return True if the curator should run immediately.
Gates:
- curator.enabled == True
- not paused
- last_run_at missing, OR older than interval_hours
The idle check (min_idle_hours) is applied at the call site where we know
whether an agent is actively running — here we only enforce the static
gates.
"""
if not is_enabled():
return False
if is_paused():
return False
state = load_state()
last = _parse_iso(state.get("last_run_at"))
if last is None:
return True
if now is None:
now = datetime.now(timezone.utc)
if last.tzinfo is None:
last = last.replace(tzinfo=timezone.utc)
interval = timedelta(hours=get_interval_hours())
return (now - last) >= interval
# ---------------------------------------------------------------------------
# Automatic state transitions (pure function, no LLM)
# ---------------------------------------------------------------------------
def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int]:
"""Walk every agent-created skill and move active/stale/archived based on
last_used_at. Pinned skills are never touched. Returns a counter dict
describing what changed."""
from tools import skill_usage as _u
if now is None:
now = datetime.now(timezone.utc)
stale_cutoff = now - timedelta(days=get_stale_after_days())
archive_cutoff = now - timedelta(days=get_archive_after_days())
counts = {"marked_stale": 0, "archived": 0, "reactivated": 0, "checked": 0}
for row in _u.agent_created_report():
counts["checked"] += 1
name = row["name"]
if row.get("pinned"):
continue
last_used = _parse_iso(row.get("last_used_at"))
# If never used, treat as using created_at as the anchor so new skills
# don't immediately archive themselves.
anchor = last_used or _parse_iso(row.get("created_at")) or now
if anchor.tzinfo is None:
anchor = anchor.replace(tzinfo=timezone.utc)
current = row.get("state", _u.STATE_ACTIVE)
if anchor <= archive_cutoff and current != _u.STATE_ARCHIVED:
ok, _msg = _u.archive_skill(name)
if ok:
counts["archived"] += 1
elif anchor <= stale_cutoff and current == _u.STATE_ACTIVE:
_u.set_state(name, _u.STATE_STALE)
counts["marked_stale"] += 1
elif anchor > stale_cutoff and current == _u.STATE_STALE:
# Skill got used again after being marked stale — reactivate.
_u.set_state(name, _u.STATE_ACTIVE)
counts["reactivated"] += 1
return counts
# ---------------------------------------------------------------------------
# Review prompt for the forked agent
# ---------------------------------------------------------------------------
CURATOR_REVIEW_PROMPT = (
"You are running as Hermes' background skill CURATOR. This is an "
"UMBRELLA-BUILDING consolidation pass, not a passive audit and not a "
"duplicate-finder.\n\n"
"The goal of the skill collection is a LIBRARY OF CLASS-LEVEL "
"INSTRUCTIONS AND EXPERIENTIAL KNOWLEDGE. A collection of hundreds of "
"narrow skills where each one captures one session's specific bug is "
"a FAILURE of the library — not a feature. An agent searching skills "
"matches on descriptions, not on exact names; one broad umbrella "
"skill with labeled subsections beats five narrow siblings for "
"discoverability, not the other way around.\n\n"
"The right target shape is CLASS-LEVEL skills with rich SKILL.md "
"bodies + `references/`, `templates/`, and `scripts/` subfiles for "
"session-specific detail — not one-session-one-skill micro-entries.\n\n"
"Hard rules — do not violate:\n"
"1. DO NOT touch bundled or hub-installed skills. The candidate list "
"below is already filtered to agent-created skills only.\n"
"2. DO NOT delete any skill. Archiving (moving the skill's directory "
"into ~/.hermes/skills/.archive/) is the maximum destructive action. "
"Archives are recoverable; deletion is not.\n"
"3. DO NOT touch skills shown as pinned=yes. Skip them entirely.\n"
"4. DO NOT use usage counters as a reason to skip consolidation. The "
"counters are new and often mostly zero. Judge overlap on CONTENT, "
"not on use_count. 'use=0' is not evidence a skill is valuable; it's "
"absence of evidence either way.\n"
"5. DO NOT reject consolidation on the grounds that 'each skill has "
"a distinct trigger'. Pairwise distinctness is the wrong bar. The "
"right bar is: 'would a human maintainer write this as N separate "
"skills, or as one skill with N labeled subsections?' When the "
"answer is the latter, merge.\n\n"
"How to work — not optional:\n"
"1. Scan the full candidate list. Identify PREFIX CLUSTERS (skills "
"sharing a first word or domain keyword). Examples you are likely "
"to find: hermes-config-*, hermes-dashboard-*, gateway-*, codex-*, "
"ollama-*, anthropic-*, gemini-*, mcp-*, salvage-*, pr-*, "
"competitor-*, python-*, security-*, etc. Expect 10-25 clusters.\n"
"2. For each cluster with 2+ members, do NOT ask 'are these pairs "
"overlapping?' — ask 'what is the UMBRELLA CLASS these skills all "
"serve? Would a maintainer name that class and write one skill for "
"it?' If yes, pick (or create) the umbrella and absorb the siblings "
"into it.\n"
"3. Three ways to consolidate — use the right one per cluster:\n"
" a. MERGE INTO EXISTING UMBRELLA — one skill in the cluster is "
"already broad enough to be the umbrella (example: `pr-triage-"
"salvage` for the PR review cluster). Patch it to add a labeled "
"section for each sibling's unique insight, then archive the "
"siblings.\n"
" b. CREATE A NEW UMBRELLA SKILL.md — no existing member is broad "
"enough. Use skill_manage action=create to write a new class-level "
"skill whose SKILL.md covers the shared workflow and has short "
"labeled subsections. Archive the now-absorbed narrow siblings.\n"
" c. DEMOTE TO REFERENCES/TEMPLATES/SCRIPTS — a sibling has "
"narrow-but-valuable session-specific content. Move it into the "
"umbrella's appropriate support directory:\n"
" • `references/<topic>.md` for session-specific detail OR "
"condensed knowledge banks (quoted research, API docs excerpts, "
"domain notes, provider quirks, reproduction recipes)\n"
" • `templates/<name>.<ext>` for starter files meant to be "
"copied and modified\n"
" • `scripts/<name>.<ext>` for statically re-runnable actions "
"(verification scripts, fixture generators, probes)\n"
" Then archive the old sibling. Use `terminal` with `mkdir -p "
"~/.hermes/skills/<umbrella>/references/ && mv ... <umbrella>/"
"references/<topic>.md` (or templates/ / scripts/).\n"
"4. Also flag skills whose NAME is too narrow (contains a PR number, "
"a feature codename, a specific error string, an 'audit' / "
"'diagnosis' / 'salvage' session artifact). These almost always "
"belong as a subsection or support file under a class-level umbrella.\n"
"5. Iterate. After one consolidation round, scan the remaining set "
"and look for the NEXT umbrella opportunity. Don't stop after 3 "
"merges.\n\n"
"Your toolset:\n"
" - skills_list, skill_view — read the current landscape\n"
" - skill_manage action=patch — add sections to the umbrella\n"
" - skill_manage action=create — create a new umbrella SKILL.md\n"
" - skill_manage action=write_file — add a references/, templates/, "
"or scripts/ file under an existing skill (the skill must already "
"exist)\n"
" - terminal — mv a sibling into the archive "
"OR move its content into a support subfile\n\n"
"'keep' is a legitimate decision ONLY when the skill is already a "
"class-level umbrella and none of the proposed merges would improve "
"discoverability. 'This is narrow but distinct from its siblings' "
"is NOT a reason to keep — it's a reason to move it under an "
"umbrella as a subsection or support file.\n\n"
"Expected output: real umbrella-ification. Process every obvious "
"cluster. If you end the pass with fewer than 10 archives, you "
"stopped too early — go back and look at the clusters you left "
"alone.\n\n"
"When done, write a summary with: clusters processed, skills "
"patched/absorbed, skills demoted to references/templates/scripts, "
"skills archived, new umbrellas created, and clusters you "
"deliberately left alone with one line each."
)
# ---------------------------------------------------------------------------
# Orchestrator — spawn a forked AIAgent for the LLM review pass
# ---------------------------------------------------------------------------
def _render_candidate_list() -> str:
"""Human/agent-readable list of agent-created skills with usage stats."""
rows = skill_usage.agent_created_report()
if not rows:
return "No agent-created skills to review."
lines = [f"Agent-created skills ({len(rows)}):\n"]
for r in rows:
lines.append(
f"- {r['name']} "
f"state={r['state']} "
f"pinned={'yes' if r.get('pinned') else 'no'} "
f"use={r.get('use_count', 0)} "
f"view={r.get('view_count', 0)} "
f"patches={r.get('patch_count', 0)} "
f"last_used={r.get('last_used_at') or 'never'}"
)
return "\n".join(lines)
def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
Steps:
1. Apply automatic state transitions (pure, no LLM).
2. If there are agent-created skills, spawn a forked AIAgent that runs
the LLM review prompt against the current candidate list.
3. Update .curator_state with last_run_at and a one-line summary.
4. Invoke *on_summary* with a user-visible description.
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
"""
start = datetime.now(timezone.utc)
counts = apply_automatic_transitions(now=start)
auto_summary_parts = []
if counts["marked_stale"]:
auto_summary_parts.append(f"{counts['marked_stale']} marked stale")
if counts["archived"]:
auto_summary_parts.append(f"{counts['archived']} archived")
if counts["reactivated"]:
auto_summary_parts.append(f"{counts['reactivated']} reactivated")
auto_summary = ", ".join(auto_summary_parts) if auto_summary_parts else "no changes"
# Persist state before the LLM pass so a crash mid-review still records
# the run and doesn't immediately re-trigger.
state = load_state()
state["last_run_at"] = start.isoformat()
state["run_count"] = int(state.get("run_count", 0)) + 1
state["last_run_summary"] = f"auto: {auto_summary}"
save_state(state)
def _llm_pass():
nonlocal auto_summary
try:
candidate_list = _render_candidate_list()
if "No agent-created skills" in candidate_list:
final_summary = f"auto: {auto_summary}; llm: skipped (no candidates)"
else:
prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
llm_summary = _run_llm_review(prompt)
final_summary = f"auto: {auto_summary}; llm: {llm_summary}"
except Exception as e:
logger.debug("Curator LLM pass failed: %s", e, exc_info=True)
final_summary = f"auto: {auto_summary}; llm: error ({e})"
elapsed = (datetime.now(timezone.utc) - start).total_seconds()
state2 = load_state()
state2["last_run_duration_seconds"] = elapsed
state2["last_run_summary"] = final_summary
save_state(state2)
if on_summary:
try:
on_summary(f"curator: {final_summary}")
except Exception:
pass
if synchronous:
_llm_pass()
else:
t = threading.Thread(target=_llm_pass, daemon=True, name="curator-review")
t.start()
return {
"started_at": start.isoformat(),
"auto_transitions": counts,
"summary_so_far": auto_summary,
}
def _run_llm_review(prompt: str) -> str:
"""Spawn an AIAgent fork to run the curator review prompt. Returns a short
summary of what the model said in its final response."""
import contextlib
try:
from run_agent import AIAgent
except Exception as e:
return f"AIAgent import failed: {e}"
# Resolve provider + model the same way the CLI does, so the curator
# fork inherits the user's active main config rather than falling
# through to an empty provider/model pair (which sends HTTP 400
# "No models provided"). AIAgent() without explicit provider/model
# arguments hits an auto-resolution path that fails for OAuth-only
# providers and for pool-backed credentials.
_api_key = None
_base_url = None
_api_mode = None
_resolved_provider = None
_model_name = ""
try:
from hermes_cli.config import load_config
from hermes_cli.runtime_provider import resolve_runtime_provider
_cfg = load_config()
_m = _cfg.get("model", {}) if isinstance(_cfg.get("model"), dict) else {}
_provider = _m.get("provider") or "auto"
_model_name = _m.get("default") or _m.get("model") or ""
_rp = resolve_runtime_provider(
requested=_provider, target_model=_model_name
)
_api_key = _rp.get("api_key")
_base_url = _rp.get("base_url")
_api_mode = _rp.get("api_mode")
_resolved_provider = _rp.get("provider") or _provider
except Exception as e:
logger.debug("Curator provider resolution failed: %s", e, exc_info=True)
review_agent = None
try:
review_agent = AIAgent(
model=_model_name,
provider=_resolved_provider,
api_key=_api_key,
base_url=_base_url,
api_mode=_api_mode,
# Umbrella-building over a large skill collection is worth a
# high iteration ceiling — the pass typically takes 50-100
# API calls against hundreds of candidate skills. The
# single-session review path caps itself at a much smaller
# number because it's not doing a curation sweep.
max_iterations=9999,
quiet_mode=True,
platform="curator",
skip_context_files=True,
skip_memory=True,
)
# Disable recursive nudges — the curator must never spawn its own review.
review_agent._memory_nudge_interval = 0
review_agent._skill_nudge_interval = 0
# Redirect the forked agent's stdout/stderr to /dev/null while it
# runs so its tool-call chatter doesn't pollute the foreground
# terminal. The background-thread runner also hides it; this
# belt-and-suspenders path matters when a caller invokes
# run_curator_review(synchronous=True) from the CLI.
with open(os.devnull, "w") as _devnull, \
contextlib.redirect_stdout(_devnull), \
contextlib.redirect_stderr(_devnull):
result = review_agent.run_conversation(user_message=prompt)
final = ""
if isinstance(result, dict):
final = str(result.get("final_response") or "").strip()
return (final[:240] + "…") if len(final) > 240 else (final or "no change")
except Exception as e:
return f"error: {e}"
finally:
if review_agent is not None:
try:
review_agent.close()
except Exception:
pass
# ---------------------------------------------------------------------------
# Public entrypoint for the session-start hook
# ---------------------------------------------------------------------------
def maybe_run_curator(
*,
idle_for_seconds: Optional[float] = None,
on_summary: Optional[Callable[[str], None]] = None,
) -> Optional[Dict[str, Any]]:
"""Best-effort: run a curator pass if all gates pass. Returns the result
dict if a pass was started, else None. Never raises."""
try:
if not should_run_now():
return None
# Idle gating: only enforce when the caller provided a measurement.
if idle_for_seconds is not None:
min_idle_s = get_min_idle_hours() * 3600.0
if idle_for_seconds < min_idle_s:
return None
return run_curator_review(on_summary=on_summary)
except Exception as e:
logger.debug("maybe_run_curator failed: %s", e, exc_info=True)
return None
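The lifecycle rules in `apply_automatic_transitions` can be read as a pure decision function. The sketch below is a minimal restatement under stated assumptions: the state constants are taken to be the plain strings `"active"`, `"stale"`, and `"archived"` (their real values live in `tools/skill_usage` and are not shown in this diff), and `next_state` is a hypothetical name. The 30 and 90 day defaults mirror `DEFAULT_STALE_AFTER_DAYS` and `DEFAULT_ARCHIVE_AFTER_DAYS`. Pinned skills are skipped before this decision is reached, so they are not modeled here.

```python
from datetime import datetime, timedelta, timezone


def next_state(
    current: str,
    anchor: datetime,
    now: datetime,
    stale_after_days: int = 30,
    archive_after_days: int = 90,
) -> str:
    """Return the state a skill should move to, given its last-use anchor.

    Assumed state names; the anchor is last_used_at, or created_at when
    the skill has never been used.
    """
    stale_cutoff = now - timedelta(days=stale_after_days)
    archive_cutoff = now - timedelta(days=archive_after_days)
    if anchor <= archive_cutoff and current != "archived":
        return "archived"
    if anchor <= stale_cutoff and current == "active":
        return "stale"
    if anchor > stale_cutoff and current == "stale":
        # Used again after being marked stale: reactivate.
        return "active"
    return current


if __name__ == "__main__":
    now = datetime(2026, 4, 28, tzinfo=timezone.utc)
    for days in (10, 45, 120):
        print(days, next_state("active", now - timedelta(days=days), now))
```

Because the archive check runs first, a skill that is already stale and then crosses the archive cutoff is archived rather than left stale.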
+2 -1
@@ -223,7 +223,8 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -
target = args.get("target", "")
if action == "add":
content = _oneline(args.get("content", ""))
-return f"+{target}: \"{content[:25]}{'...' if len(content) > 25 else ''}\""
+target_prefix = f"+{target}: " if target else "+"
+return f"{target_prefix}\"{content[:25]}{'...' if len(content) > 25 else ''}\""
elif action == "replace":
old = _oneline(args.get("old_text") or "") or "<missing old_text>"
return f"~{target}: \"{old[:20]}\""
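The effect of the `target_prefix` change in the hunk above is easiest to see with and without a target. This is a standalone sketch, not the real `build_tool_preview`: `memory_add_preview` is a hypothetical name, and whitespace collapsing stands in for `_oneline`, whose implementation is not shown in this diff.

```python
def memory_add_preview(target: str, content: str, limit: int = 25) -> str:
    """Build a one-line preview for a memory 'add' action."""
    # Assumption: _oneline collapses all whitespace runs into single spaces.
    one_line = " ".join(content.split())
    # With no target, emit a bare "+" so the quote follows it directly
    # instead of leaving a dangling "+: " label.
    target_prefix = f"+{target}: " if target else "+"
    suffix = "..." if len(one_line) > limit else ""
    return f'{target_prefix}"{one_line[:limit]}{suffix}"'


if __name__ == "__main__":
    print(memory_add_preview("user", "prefers dark mode"))
    print(memory_add_preview("", "prefers dark mode"))
```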
+3 -51
@@ -184,59 +184,11 @@ _PREFIX_RE = re.compile(
)
-def mask_secret(
-value: str,
-*,
-head: int = 4,
-tail: int = 4,
-floor: int = 12,
-placeholder: str = "***",
-empty: str = "",
-) -> str:
-"""Mask a secret for display, preserving ``head`` and ``tail`` characters.
-Canonical helper for display-time redaction across Hermes — used by
-``hermes config``, ``hermes status``, ``hermes dump``, and anywhere
-a secret needs to be shown truncated for debuggability while still
-keeping the bulk hidden.
-Args:
-value: The secret to mask. ``None``/empty returns ``empty``.
-head: Leading characters to preserve. Default 4.
-tail: Trailing characters to preserve. Default 4.
-floor: Values shorter than ``head + tail + floor_margin`` are
-fully masked (returns ``placeholder``). Default 12 —
-matches the existing config/status/dump convention.
-placeholder: Value returned for too-short inputs. Default ``"***"``.
-empty: Value returned when ``value`` is falsy (None, ""). The
-caller can override this to e.g. ``color("(not set)",
-Colors.DIM)`` for user-facing display.
-Examples:
->>> mask_secret("sk-proj-abcdef1234567890")
-'sk-p...7890'
->>> mask_secret("short") # fully masked
-'***'
->>> mask_secret("") # empty default
-''
->>> mask_secret("", empty="(not set)") # empty override
-'(not set)'
->>> mask_secret("long-token", head=6, tail=4, floor=18)
-'***'
-"""
-if not value:
-return empty
-if len(value) < floor:
-return placeholder
-return f"{value[:head]}...{value[-tail:]}"
def _mask_token(token: str) -> str:
-"""Mask a log token — conservative 18-char floor, preserves 6 prefix / 4 suffix."""
-# Empty input: historically this returned "***" rather than "". Preserve.
-if not token:
+"""Mask a token, preserving prefix for long tokens."""
+if len(token) < 18:
return "***"
-return mask_secret(token, head=6, tail=4, floor=18)
+return f"{token[:6]}...{token[-4:]}"
def _redact_query_string(query: str) -> str:
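In this hunk `_mask_token` stops delegating to `mask_secret` and inlines the 18-character floor. The sketch below restates the resulting logic as a standalone function, named `mask_token` here only so it can run on its own, to show the boundary behavior. The sample tokens are made up.

```python
def mask_token(token: str) -> str:
    """Mask a token, keeping 6 leading and 4 trailing characters."""
    # Anything under 18 characters, including the empty string, is fully
    # masked, so no separate empty-input branch is needed.
    if len(token) < 18:
        return "***"
    return f"{token[:6]}...{token[-4:]}"


if __name__ == "__main__":
    for sample in ("", "short-token", "sk-proj-abcdef1234567890"):
        print(repr(sample), "->", mask_token(sample))
```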
+1 -9
@@ -200,9 +200,6 @@ def get_external_skills_dirs() -> List[Path]:
if not isinstance(raw_dirs, list):
return []
-from hermes_constants import get_hermes_home
-hermes_home = get_hermes_home()
local_skills = get_skills_dir().resolve()
seen: Set[Path] = set()
result: List[Path] = []
@@ -213,12 +210,7 @@ def get_external_skills_dirs() -> List[Path]:
continue
# Expand ~ and environment variables
expanded = os.path.expanduser(os.path.expandvars(entry))
-p = Path(expanded)
-# Resolve relative paths against HERMES_HOME, not cwd
-if not p.is_absolute():
-p = (hermes_home / p).resolve()
-else:
-p = p.resolve()
+p = Path(expanded).resolve()
if p == local_skills:
continue
if p in seen:
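The two variants in this hunk differ only in how a relative entry is anchored: `Path(expanded).resolve()` resolves against the current working directory, while the longer form resolves against a fixed base such as HERMES_HOME. The sketch below shows both behaviors behind one hypothetical helper, `resolve_skills_entry`, whose `base` parameter is an illustrative stand-in for `get_hermes_home()`.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_skills_entry(entry: str, base: Optional[Path] = None) -> Path:
    """Expand ~ and environment variables, then resolve to an absolute path.

    With base=None, relative entries resolve against the current working
    directory. With a base, relative entries resolve against that directory.
    Absolute entries ignore the base either way.
    """
    expanded = os.path.expanduser(os.path.expandvars(entry))
    p = Path(expanded)
    if base is not None and not p.is_absolute():
        return (base / p).resolve()
    return p.resolve()


if __name__ == "__main__":
    print(resolve_skills_entry("shared-skills"))
    print(resolve_skills_entry("shared-skills", base=Path.home()))
```

With cwd-relative resolution, the same config entry can point at different directories depending on where the process was started.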
+1 -1
@@ -927,7 +927,7 @@ display:
# agent_name: "My Agent" # Banner title and branding
# welcome: "Welcome message" # Shown at CLI startup
# response_label: " ⚔ Agent " # Response box header label
-# prompt_symbol: "⚔" # Prompt symbol (bare token; renderers add trailing space)
+# prompt_symbol: "⚔ " # Prompt symbol
# tool_prefix: "╎" # Tool output line prefix (default: ┊)
#
skin: default
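The two comment variants above differ in who owns the trailing space: the "bare token" form stores only the glyph and leaves spacing to the renderer. A minimal sketch of that convention follows, assuming hypothetical helpers `normalize_prompt_symbol` and `render_prompt` and a made-up `">"` fallback; the actual CLI/TUI renderers are not shown in this diff.

```python
def normalize_prompt_symbol(raw: str, fallback: str = ">") -> str:
    """Reduce a configured prompt symbol to trimmed, single-line text."""
    stripped = raw.strip()
    if not stripped:
        return fallback  # assumption: the real default symbol is not shown here
    # Keep only the first line so a multi-line value cannot wrap the prompt.
    return stripped.splitlines()[0].strip() or fallback


def render_prompt(raw_symbol: str) -> str:
    """Renderer-owned spacing: exactly one space after the bare symbol."""
    return f"{normalize_prompt_symbol(raw_symbol)} "


if __name__ == "__main__":
    print(repr(render_prompt("⚔")))
    print(repr(render_prompt("⚔ ")))
```

Normalizing first means a config value with or without the trailing space renders identically.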
+120 -91
@@ -80,11 +80,6 @@ _COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
from hermes_constants import get_hermes_home, display_hermes_home
-from hermes_cli.browser_connect import (
-DEFAULT_BROWSER_CDP_URL,
-manual_chrome_debug_command,
-try_launch_chrome_debug,
-)
from hermes_cli.env_loader import load_hermes_dotenv
from utils import base_url_host_matches
@@ -245,6 +240,65 @@ def _parse_service_tier_config(raw: str) -> str | None:
logger.warning("Unknown service_tier '%s', ignoring", raw)
return None
+def _get_chrome_debug_candidates(system: str) -> list[str]:
+"""Return likely browser executables for local CDP auto-launch."""
+candidates: list[str] = []
+seen: set[str] = set()
+def _add_candidate(path: str | None) -> None:
+if not path:
+return
+normalized = os.path.normcase(os.path.normpath(path))
+if normalized in seen:
+return
+if os.path.isfile(path):
+candidates.append(path)
+seen.add(normalized)
+def _add_from_path(*names: str) -> None:
+for name in names:
+_add_candidate(shutil.which(name))
+if system == "Darwin":
+for app in (
+"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
+"/Applications/Chromium.app/Contents/MacOS/Chromium",
+"/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
+"/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge",
+):
+_add_candidate(app)
+elif system == "Windows":
+_add_from_path(
+"chrome.exe", "msedge.exe", "brave.exe", "chromium.exe",
+"chrome", "msedge", "brave", "chromium",
+)
+for base in (
+os.environ.get("ProgramFiles"),
+os.environ.get("ProgramFiles(x86)"),
+os.environ.get("LOCALAPPDATA"),
+):
+if not base:
+continue
+for parts in (
+("Google", "Chrome", "Application", "chrome.exe"),
+("Chromium", "Application", "chrome.exe"),
+("Chromium", "Application", "chromium.exe"),
+("BraveSoftware", "Brave-Browser", "Application", "brave.exe"),
+("Microsoft", "Edge", "Application", "msedge.exe"),
+):
+_add_candidate(os.path.join(base, *parts))
+else:
+_add_from_path(
+"google-chrome", "google-chrome-stable", "chromium-browser",
+"chromium", "brave-browser", "microsoft-edge",
+)
+return candidates
def load_cli_config() -> Dict[str, Any]:
"""
Load CLI configuration from config files.
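`_get_chrome_debug_candidates` relies on an order-preserving de-duplication keyed on a normalized path, so the same executable reached through two spellings is listed once and the first match stays first. The sketch below isolates that pattern. `dedupe_paths` and its injectable `exists` check are hypothetical, added so the behavior can be exercised without real browser installs.

```python
import os
from typing import Callable, Iterable, List, Optional


def dedupe_paths(
    paths: Iterable[Optional[str]],
    exists: Callable[[str], bool] = os.path.isfile,
) -> List[str]:
    """Keep the first occurrence of each existing path, in input order."""
    seen = set()
    result: List[str] = []
    for path in paths:
        if not path:
            continue  # shutil.which() returns None when a name is not found
        # normpath collapses "a/../a"; normcase folds case on Windows.
        key = os.path.normcase(os.path.normpath(path))
        if key in seen:
            continue
        if exists(path):
            result.append(path)
            seen.add(key)
    return result


if __name__ == "__main__":
    sample = ["/usr/bin/chromium", None, "/usr/bin/../bin/chromium", "/opt/brave"]
    print(dedupe_paths(sample, exists=lambda p: True))
```

As in the source, a path is recorded in `seen` only after it passes the existence check, and the original spelling rather than the normalized key is what gets returned.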
@@ -5925,29 +5979,7 @@ class HermesCLI:
print(f"(._.) Unknown cron command: {subcommand}")
print(" Available: list, add, edit, pause, resume, run, remove")
-def _handle_curator_command(self, cmd: str):
-"""Handle /curator slash command.
-Delegates to hermes_cli.curator so the CLI and the `hermes curator`
-subcommand share the same handler set.
-"""
-import shlex
-tokens = shlex.split(cmd)[1:] if cmd else []
-if not tokens:
-tokens = ["status"]
-try:
-from hermes_cli.curator import cli_main
-cli_main(tokens)
-except SystemExit:
-# argparse calls sys.exit() on --help or errors; swallow so we
-# don't kill the interactive session.
-pass
-except Exception as exc:
-print(f"(._.) curator: {exc}")
def _handle_skills_command(self, cmd: str):
"""Handle /skills slash command — delegates to hermes_cli.skills_hub."""
from hermes_cli.skills_hub import handle_skills_slash
@@ -6191,8 +6223,6 @@ class HermesCLI:
self.save_conversation()
elif canonical == "cron":
self._handle_cron_command(cmd_original)
elif canonical == "curator":
self._handle_curator_command(cmd_original)
elif canonical == "skills":
with self._busy_command(self._slow_command_status(cmd_original)):
self._handle_skills_command(cmd_original)
@@ -6576,7 +6606,34 @@ class HermesCLI:
Returns True if a launch command was executed (doesn't guarantee success).
"""
return try_launch_chrome_debug(port, system)
import subprocess as _sp
candidates = _get_chrome_debug_candidates(system)
if not candidates:
return False
# Dedicated profile dir so debug Chrome won't collide with normal Chrome
data_dir = str(_hermes_home / "chrome-debug")
os.makedirs(data_dir, exist_ok=True)
chrome = candidates[0]
try:
_sp.Popen(
[
chrome,
f"--remote-debugging-port={port}",
f"--user-data-dir={data_dir}",
"--no-first-run",
"--no-default-browser-check",
],
stdout=_sp.DEVNULL,
stderr=_sp.DEVNULL,
start_new_session=True, # detach from terminal
)
return True
except Exception:
return False
def _handle_browser_command(self, cmd: str):
"""Handle /browser connect|disconnect|status — manage live Chrome CDP connection."""
@@ -6585,44 +6642,13 @@ class HermesCLI:
parts = cmd.strip().split(None, 1)
sub = parts[1].lower().strip() if len(parts) > 1 else "status"
_DEFAULT_CDP = DEFAULT_BROWSER_CDP_URL
_DEFAULT_CDP = "http://127.0.0.1:9222"
current = os.environ.get("BROWSER_CDP_URL", "").strip()
if sub.startswith("connect"):
# Optionally accept a custom CDP URL: /browser connect ws://host:port
connect_parts = cmd.strip().split(None, 2) # ["/browser", "connect", "ws://..."]
cdp_url = connect_parts[2].strip() if len(connect_parts) > 2 else _DEFAULT_CDP
parsed_cdp = urlparse(cdp_url if "://" in cdp_url else f"http://{cdp_url}")
if parsed_cdp.scheme not in {"http", "https", "ws", "wss"}:
print()
print(
f" ⚠ Unsupported browser url scheme: {parsed_cdp.scheme or '(missing)'} "
"(expected one of: http, https, ws, wss)"
)
print()
return
try:
_port = parsed_cdp.port or (443 if parsed_cdp.scheme in {"https", "wss"} else 80)
except ValueError:
print()
print(f" ⚠ Invalid port in browser url: {cdp_url}")
print()
return
if not parsed_cdp.hostname:
print()
print(f" ⚠ Missing host in browser url: {cdp_url}")
print()
return
_host = parsed_cdp.hostname
if parsed_cdp.path.startswith("/devtools/browser/"):
cdp_url = parsed_cdp.geturl()
else:
cdp_url = parsed_cdp._replace(
path="",
params="",
query="",
fragment="",
).geturl()
# Clear any existing browser sessions so the next tool call uses the new backend
try:
@@ -6633,13 +6659,20 @@ class HermesCLI:
print()
# Extract port for connectivity checks
_port = 9222
try:
_port = int(cdp_url.rsplit(":", 1)[-1].split("/")[0])
except (ValueError, IndexError):
pass
# Check if Chrome is already listening on the debug port
import socket
_already_open = False
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(1)
s.connect((_host, _port))
s.connect(("127.0.0.1", _port))
s.close()
_already_open = True
except (OSError, socket.timeout):
@@ -6657,7 +6690,7 @@ class HermesCLI:
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(1)
s.connect((_host, _port))
s.connect(("127.0.0.1", _port))
s.close()
_already_open = True
break
@@ -6670,22 +6703,33 @@ class HermesCLI:
print(" Try again in a few seconds — the debug instance may still be starting")
else:
print(" ⚠ Could not auto-launch Chrome")
# Show manual instructions as fallback
_data_dir = str(_hermes_home / "chrome-debug")
sys_name = _plat.system()
chrome_cmd = manual_chrome_debug_command(_port, sys_name)
if chrome_cmd:
print(f" Launch Chrome manually:")
print(f" {chrome_cmd}")
if sys_name == "Darwin":
chrome_cmd = (
'open -a "Google Chrome" --args'
f" --remote-debugging-port=9222"
f' --user-data-dir="{_data_dir}"'
" --no-first-run --no-default-browser-check"
)
elif sys_name == "Windows":
chrome_cmd = (
f'chrome.exe --remote-debugging-port=9222'
f' --user-data-dir="{_data_dir}"'
f" --no-first-run --no-default-browser-check"
)
else:
print(" No Chrome/Chromium executable found in this environment")
chrome_cmd = (
f"google-chrome --remote-debugging-port=9222"
f' --user-data-dir="{_data_dir}"'
f" --no-first-run --no-default-browser-check"
)
print(f" Launch Chrome manually:")
print(f" {chrome_cmd}")
else:
print(f" ⚠ Port {_port} is not reachable at {cdp_url}")
if not _already_open:
print()
print("Browser not connected — start Chrome with remote debugging and retry /browser connect")
print()
return
os.environ["BROWSER_CDP_URL"] = cdp_url
# Eagerly start the CDP supervisor so pending_dialogs + frame_tree
# show up in the next browser_snapshot. No-op if already started.
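Review note: the `urlparse`-based validation in the `/browser connect` hunks above can be exercised on its own. This is a sketch under stated assumptions: the function name `normalize_cdp_url` and the choice to raise `ValueError` (instead of printing a warning and returning) are illustrative and not taken from the diff.

```python
from urllib.parse import urlparse

SUPPORTED_SCHEMES = {"http", "https", "ws", "wss"}


def normalize_cdp_url(raw):
    """Return (url, host, port) for a CDP endpoint, or raise ValueError."""
    # A bare "host:port" would otherwise be parsed with "host" as the scheme.
    parsed = urlparse(raw if "://" in raw else f"http://{raw}")
    if parsed.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parsed.scheme or '(missing)'}")
    # Accessing .port raises ValueError when the port is not a valid number.
    port = parsed.port or (443 if parsed.scheme in {"https", "wss"} else 80)
    if not parsed.hostname:
        raise ValueError(f"missing host: {raw}")
    if parsed.path.startswith("/devtools/browser/"):
        url = parsed.geturl()  # keep a full browser websocket endpoint as-is
    else:
        # Strip path, params, query and fragment down to scheme://host[:port].
        url = parsed._replace(path="", params="", query="", fragment="").geturl()
    return url, parsed.hostname, port


print(normalize_cdp_url("localhost:9222"))
print(normalize_cdp_url("ws://127.0.0.1:9222/devtools/browser/abc"))
```

Returning the parsed host alongside the port lets the reachability probe connect to the host the user actually named, where the `rsplit(":")` variant can only recover a port and probes `127.0.0.1`.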
@@ -9300,21 +9344,6 @@ class HermesCLI:
self._console_print(f"[dim {_tip_color}]✦ Tip: {_tip}[/]")
except Exception:
pass # Tips are non-critical — never break startup
# Curator — kick off a background skill-maintenance pass on startup
# if the schedule says we're due. Runs in a daemon thread so it
# never blocks the interactive loop. Best-effort; any failure is
# swallowed to avoid breaking session startup.
try:
from agent.curator import maybe_run_curator
maybe_run_curator(
idle_for_seconds=float("inf"), # CLI startup = fully idle
on_summary=lambda msg: self._console_print(
f"[dim #6b7684]💾 {msg}[/]"
),
)
except Exception:
pass
if self.preloaded_skills and not self._startup_skills_line_shown:
skills_label = ", ".join(self.preloaded_skills)
self._console_print(
-58
@@ -286,10 +286,6 @@ if _config_path.exists():
# Only bridge explicit absolute paths from config.yaml.
if _cfg_key == "cwd" and str(_val) in (".", "auto", "cwd"):
continue
# Expand shell tilde in cwd so subprocess.Popen never
# receives a literal "~/" which the kernel rejects.
if _cfg_key == "cwd" and isinstance(_val, str):
_val = os.path.expanduser(_val)
if isinstance(_val, list):
os.environ[_env_var] = json.dumps(_val)
else:
@@ -2382,7 +2378,6 @@ class GatewayRunner:
# Discover and load event hooks
self.hooks.discover_and_load()
# Recover background processes from checkpoint (crash recovery)
try:
@@ -10226,20 +10221,6 @@ class GatewayRunner:
if progress_lines:
progress_lines[-1] = f"{base_msg} (×{count + 1})"
msg = progress_lines[-1] if progress_lines else base_msg
elif isinstance(raw, tuple) and len(raw) >= 1 and raw[0] == "__reset__":
# Content bubble just landed on the platform — close off
# the current tool-progress bubble so the next tool
# starts a fresh bubble below the content. Without this,
# tool lines keep editing the ORIGINAL progress message
# above the new content, making the chat appear out of
# order. Mirrors GatewayStreamConsumer.on_segment_break
# on the content side. (Issue: tool + content
# linearization regression after PR #7885.)
progress_msg_id = None
progress_lines = []
last_progress_msg[0] = None
repeat_count[0] = 0
continue
else:
msg = raw
progress_lines.append(msg)
@@ -10309,24 +10290,6 @@ class GatewayRunner:
_, base_msg, count = raw
if progress_lines:
progress_lines[-1] = f"{base_msg} (×{count + 1})"
elif isinstance(raw, tuple) and len(raw) >= 1 and raw[0] == "__reset__":
# Content-bubble marker during drain: close off
# the current progress bubble and start a fresh
# one for any tool lines that arrived after.
if can_edit and progress_lines and progress_msg_id:
_pending_text = "\n".join(progress_lines)
try:
await adapter.edit_message(
chat_id=source.chat_id,
message_id=progress_msg_id,
content=_pending_text,
)
except Exception:
pass
progress_msg_id = None
progress_lines = []
last_progress_msg[0] = None
repeat_count[0] = 0
else:
progress_lines.append(raw)
except Exception:
@@ -10532,11 +10495,6 @@ class GatewayRunner:
chat_id=source.chat_id,
config=_consumer_cfg,
metadata={"thread_id": _progress_thread_id} if _progress_thread_id else None,
on_new_message=(
(lambda: progress_queue.put(("__reset__",)))
if progress_queue is not None
else None
),
)
if _want_stream_deltas:
def _stream_delta_cb(text: str) -> None:
@@ -11744,7 +11702,6 @@ def _start_cron_ticker(stop_event: threading.Event, adapters=None, loop=None, in
IMAGE_CACHE_EVERY = 60 # ticks — once per hour at default 60s interval
CHANNEL_DIR_EVERY = 5 # ticks — every 5 minutes
PASTE_SWEEP_EVERY = 60 # ticks — once per hour
CURATOR_EVERY = 60 # ticks — poll hourly (inner gate handles the real cadence)
logger.info("Cron ticker started (interval=%ds)", interval)
tick_count = 0
@@ -11796,21 +11753,6 @@ def _start_cron_ticker(stop_event: threading.Event, adapters=None, loop=None, in
except Exception as e:
logger.debug("Paste sweep error: %s", e)
# Curator — piggy-back on the existing cron ticker so long-running
# gateways get weekly skill maintenance without needing restarts.
# maybe_run_curator() is internally gated by config.interval_hours
# (7 days by default), so CURATOR_EVERY is just the poll rate — the
# real work only fires once per config interval.
if tick_count % CURATOR_EVERY == 0:
try:
from agent.curator import maybe_run_curator
maybe_run_curator(
idle_for_seconds=float("inf"),
on_summary=lambda msg: logger.info("curator: %s", msg),
)
except Exception as e:
logger.debug("Curator tick error: %s", e)
stop_event.wait(timeout=interval)
logger.info("Cron ticker stopped")
-35
@@ -91,20 +91,11 @@ class GatewayStreamConsumer:
chat_id: str,
config: Optional[StreamConsumerConfig] = None,
metadata: Optional[dict] = None,
on_new_message: Optional[callable] = None,
):
self.adapter = adapter
self.chat_id = chat_id
self.cfg = config or StreamConsumerConfig()
self.metadata = metadata
# Fired whenever a fresh content bubble is created on the platform
# (first-send of a new message, commentary, overflow chunk, or
# fallback continuation). The gateway uses this to linearize the
# tool-progress bubble: when content resumes after a tool batch,
# the next tool.started should open a NEW progress bubble below
# the content, not edit the old bubble above it.
# Called with no arguments. Exceptions are swallowed.
self._on_new_message = on_new_message
self._queue: queue.Queue = queue.Queue()
self._accumulated = ""
self._message_id: Optional[str] = None
@@ -155,16 +146,6 @@ class GatewayStreamConsumer:
if text:
self._queue.put((_COMMENTARY, text))
def _notify_new_message(self) -> None:
"""Fire the on_new_message callback, swallowing any errors."""
cb = self._on_new_message
if cb is None:
return
try:
cb()
except Exception:
logger.debug("on_new_message callback error", exc_info=True)
def _reset_segment_state(self, *, preserve_no_edit: bool = False) -> None:
if preserve_no_edit and self._message_id == "__no_edit__":
return
@@ -548,9 +529,6 @@ class GatewayStreamConsumer:
self._message_id = str(result.message_id)
self._already_sent = True
self._last_sent_text = text
# Fresh content bubble — close off any stale tool bubble
# above so the next tool starts a new bubble below.
self._notify_new_message()
return str(result.message_id)
else:
self._edit_supported = False
@@ -683,9 +661,6 @@ class GatewayStreamConsumer:
sent_any_chunk = True
last_successful_chunk = chunk
last_message_id = result.message_id or last_message_id
# Each fallback chunk is a fresh platform message — notify
# so any stale tool-progress bubble gets closed off.
self._notify_new_message()
self._message_id = last_message_id
self._already_sent = True
@@ -769,11 +744,6 @@ class GatewayStreamConsumer:
# tool..."), not the final response. Setting already_sent would cause
# the final response to be incorrectly suppressed when there are
# multiple tool calls. See: https://github.com/NousResearch/hermes-agent/issues/10454
if result.success:
# Commentary counts as fresh content — close off any
# stale tool bubble above it so the next tool starts a
# new bubble below.
self._notify_new_message()
return result.success
except Exception as e:
logger.error("Commentary send error: %s", e)
@@ -1003,11 +973,6 @@ class GatewayStreamConsumer:
# every delta/tool boundary when platforms accept a
# message but do not return an editable message id.
self._message_id = "__no_edit__"
# Notify the gateway that a fresh content bubble was
# created so any accumulated tool-progress bubble above
# gets closed off — the next tool fires into a new
# bubble below, preserving chronological order.
self._notify_new_message()
return True
else:
# Initial send failed — disable streaming for this session
-138
@@ -1,138 +0,0 @@
"""Shared helpers for attaching Hermes to a local Chrome CDP port."""
from __future__ import annotations
import os
import platform
import shlex
import shutil
import subprocess
from hermes_constants import get_hermes_home
DEFAULT_BROWSER_CDP_PORT = 9222
DEFAULT_BROWSER_CDP_URL = f"http://127.0.0.1:{DEFAULT_BROWSER_CDP_PORT}"
_DARWIN_APPS = (
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
"/Applications/Chromium.app/Contents/MacOS/Chromium",
"/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
"/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge",
)
_WINDOWS_INSTALL_PARTS = (
("Google", "Chrome", "Application", "chrome.exe"),
("Chromium", "Application", "chrome.exe"),
("Chromium", "Application", "chromium.exe"),
("BraveSoftware", "Brave-Browser", "Application", "brave.exe"),
("Microsoft", "Edge", "Application", "msedge.exe"),
)
_LINUX_BIN_NAMES = (
"google-chrome", "google-chrome-stable", "chromium-browser",
"chromium", "brave-browser", "microsoft-edge",
)
_WINDOWS_BIN_NAMES = (
"chrome.exe", "msedge.exe", "brave.exe", "chromium.exe",
"chrome", "msedge", "brave", "chromium",
)
def get_chrome_debug_candidates(system: str) -> list[str]:
candidates: list[str] = []
seen: set[str] = set()
def add(path: str | None) -> None:
if not path:
return
normalized = os.path.normcase(os.path.normpath(path))
if normalized in seen or not os.path.isfile(path):
return
candidates.append(path)
seen.add(normalized)
def add_install_paths(bases: tuple[str | None, ...]) -> None:
for base in filter(None, bases):
for parts in _WINDOWS_INSTALL_PARTS:
add(os.path.join(base, *parts))
if system == "Darwin":
for app in _DARWIN_APPS:
add(app)
return candidates
if system == "Windows":
for name in _WINDOWS_BIN_NAMES:
add(shutil.which(name))
add_install_paths((
os.environ.get("ProgramFiles"),
os.environ.get("ProgramFiles(x86)"),
os.environ.get("LOCALAPPDATA"),
))
return candidates
for name in _LINUX_BIN_NAMES:
add(shutil.which(name))
add_install_paths(("/mnt/c/Program Files", "/mnt/c/Program Files (x86)"))
return candidates
def chrome_debug_data_dir() -> str:
return str(get_hermes_home() / "chrome-debug")
def _chrome_debug_args(port: int) -> list[str]:
return [
f"--remote-debugging-port={port}",
f"--user-data-dir={chrome_debug_data_dir()}",
"--no-first-run",
"--no-default-browser-check",
]
def manual_chrome_debug_command(port: int = DEFAULT_BROWSER_CDP_PORT, system: str | None = None) -> str | None:
system = system or platform.system()
candidates = get_chrome_debug_candidates(system)
if candidates:
argv = [candidates[0], *_chrome_debug_args(port)]
return subprocess.list2cmdline(argv) if system == "Windows" else shlex.join(argv)
if system == "Darwin":
data_dir = chrome_debug_data_dir()
return (
f'open -a "Google Chrome" --args --remote-debugging-port={port} '
f'--user-data-dir="{data_dir}" --no-first-run --no-default-browser-check'
)
return None
def _detach_kwargs(system: str) -> dict:
if system != "Windows":
return {"start_new_session": True}
flags = getattr(subprocess, "DETACHED_PROCESS", 0) | getattr(
subprocess, "CREATE_NEW_PROCESS_GROUP", 0
)
return {"creationflags": flags} if flags else {}
def try_launch_chrome_debug(port: int = DEFAULT_BROWSER_CDP_PORT, system: str | None = None) -> bool:
system = system or platform.system()
candidates = get_chrome_debug_candidates(system)
if not candidates:
return False
os.makedirs(chrome_debug_data_dir(), exist_ok=True)
try:
subprocess.Popen(
[candidates[0], *_chrome_debug_args(port)],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
**_detach_kwargs(system),
)
return True
except Exception:
return False
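Review note: `manual_chrome_debug_command` chooses a quoting function by platform so the printed command can be pasted into a shell. A small sketch of that choice, assuming a hypothetical wrapper name `render_command`; both `shlex.join` (Python 3.8+) and `subprocess.list2cmdline` are the standard-library calls already used in the file above.

```python
import shlex
import subprocess


def render_command(argv, system):
    """Render argv as a copy-pasteable command line for the given platform."""
    if system == "Windows":
        # Follows the MS C runtime rules: double quotes around args with spaces.
        return subprocess.list2cmdline(argv)
    # POSIX shell quoting: single quotes around args that need them.
    return shlex.join(argv)


posix_argv = [
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "--remote-debugging-port=9222",
]
print(render_command(posix_argv, "Darwin"))

# Round-trip check: a POSIX shell would split the string back into argv.
print(shlex.split(render_command(posix_argv, "Darwin")) == posix_argv)
```

The executable paths on macOS and Windows routinely contain spaces, which is why the command is built from an argv list and quoted at the end instead of being assembled with string formatting.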
-6
@@ -128,9 +128,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
subcommands=("normal", "fast", "status", "on", "off")),
CommandDef("skin", "Show or change the display skin/theme", "Configuration",
cli_only=True, args_hint="[name]"),
CommandDef("indicator", "Pick the TUI busy-indicator style", "Configuration",
cli_only=True, args_hint="[kaomoji|emoji|unicode|ascii]",
subcommands=("kaomoji", "emoji", "unicode", "ascii")),
CommandDef("voice", "Toggle voice mode", "Configuration",
args_hint="[on|off|tts|status]", subcommands=("on", "off", "tts", "status")),
CommandDef("busy", "Control what Enter does while Hermes is working", "Configuration",
@@ -148,9 +145,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
cli_only=True, args_hint="[subcommand]",
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),
CommandDef("curator", "Background skill maintenance (status, run, pin, archive)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("status", "run", "pause", "resume", "pin", "unpin", "restore")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
cli_only=True),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
+11 -53
@@ -715,9 +715,6 @@ DEFAULT_CONFIG = {
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
# TUI busy indicator style: kaomoji (default), emoji, unicode (braille
# spinner), or ascii. Live-swappable via `/indicator <style>`.
"tui_status_indicator": "kaomoji",
"user_message_preview": { # CLI: how many submitted user-message lines to echo back in scrollback
"first_lines": 2,
"last_lines": 2,
@@ -915,35 +912,6 @@ DEFAULT_CONFIG = {
"guard_agent_created": False,
},
# Curator — background skill maintenance.
#
# Periodically reviews AGENT-CREATED skills (never bundled or
# hub-installed) and keeps the collection tidy: marks long-unused skills
# as stale, archives genuinely obsolete ones (archive only, never
# deletes), and spawns a forked aux-model agent to consolidate overlaps
# and patch drift. Runs inactivity-triggered from session start — no
# cron daemon.
#
# See `hermes curator status` for the last run summary.
"curator": {
"enabled": True,
# How long to wait between curator runs (hours). Default: 7 days.
"interval_hours": 24 * 7,
# Only run when the agent has been idle at least this long (hours).
"min_idle_hours": 2,
# Mark a skill as "stale" after this many days without use.
"stale_after_days": 30,
# Archive a skill (move to skills/.archive/) after this many days
# without use. Archived skills are recoverable — no auto-deletion.
"archive_after_days": 90,
# Optional per-task override for the curator's aux model. Leave null
# to use Hermes' main auxiliary client resolution.
"auxiliary": {
"provider": None,
"model": None,
},
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
# This section is only needed for hermes-specific overrides; everything else
# (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@@ -3739,27 +3707,18 @@ def _sanitize_env_lines(lines: list) -> list:
# Detect concatenated KEY=VALUE pairs on one line.
# Search for known KEY= patterns at any position in the line.
# We collect full needle ranges so we can drop matches that are
# fully contained within a longer overlapping needle. Without this,
# suffix collisions corrupt the file: e.g. LM_API_KEY= inside
# GLM_API_KEY= would otherwise split the line into "G\nLM_API_KEY=...".
match_ranges: list[tuple[int, int]] = []
split_positions = []
for key_name in known_keys:
needle = key_name + "="
idx = stripped.find(needle)
while idx >= 0:
match_ranges.append((idx, idx + len(needle)))
split_positions.append(idx)
idx = stripped.find(needle, idx + len(needle))
split_positions = sorted({
s for s, e in match_ranges
if not any(
s2 <= s and e2 >= e and (s2, e2) != (s, e)
for s2, e2 in match_ranges
)
})
if len(split_positions) > 1:
split_positions.sort()
# Deduplicate (shouldn't happen, but be safe)
split_positions = sorted(set(split_positions))
for i, pos in enumerate(split_positions):
end = split_positions[i + 1] if i + 1 < len(split_positions) else len(stripped)
part = stripped[pos:end].strip()
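Review note: the suffix-collision case described in the comment (`LM_API_KEY=` matching inside `GLM_API_KEY=`) is easy to reproduce. The sketch below isolates the range-based variant from the hunk above; the function name `split_positions` and the sample keys are illustrative assumptions.

```python
def split_positions(line, known_keys):
    """Return start offsets of KEY= needles, ignoring needles nested in longer ones."""
    ranges = []
    for key in known_keys:
        needle = key + "="
        idx = line.find(needle)
        while idx >= 0:
            ranges.append((idx, idx + len(needle)))
            idx = line.find(needle, idx + len(needle))
    # Drop any match that lies fully inside a different, longer match.
    return sorted({
        s for s, e in ranges
        if not any(s2 <= s and e2 >= e and (s2, e2) != (s, e) for s2, e2 in ranges)
    })


line = "GLM_API_KEY=abcOPENAI_API_KEY=def"
keys = ["LM_API_KEY", "GLM_API_KEY", "OPENAI_API_KEY"]
print(split_positions(line, keys))  # [0, 15]
```

Collecting start offsets alone would also yield offset 1 for `LM_API_KEY=`, which splits the line into `G` followed by `LM_API_KEY=abc`.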
@@ -4051,13 +4010,12 @@ def get_env_value(key: str) -> Optional[str]:
# =============================================================================
def redact_key(key: str) -> str:
"""Redact an API key for display.
Thin wrapper over :func:`agent.redact.mask_secret` — preserves the
"(not set)" placeholder in dim color for the empty case.
"""
from agent.redact import mask_secret
return mask_secret(key, empty=color("(not set)", Colors.DIM))
"""Redact an API key for display."""
if not key:
return color("(not set)", Colors.DIM)
if len(key) < 12:
return "***"
return key[:4] + "..." + key[-4:]
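Review note: the inline masking rule shown in `redact_key` (and repeated in `_redact` in the dump helper) can be stated as one function with a configurable placeholder. This is a sketch only: the name `mask` is hypothetical, and it is not claimed to match `agent.redact.mask_secret`, whose body is not part of this diff.

```python
def mask(key, empty=""):
    """Show the first and last four characters of a secret, hiding the rest."""
    if not key:
        return empty          # caller decides how "unset" is rendered
    if len(key) < 12:
        return "***"          # too short to reveal any characters safely
    return key[:4] + "..." + key[-4:]


print(mask("sk-1234567890abcdef"))
print(mask("short"))
print(mask("", empty="(not set)"))
```

The 12-character threshold guarantees that at least four characters stay hidden whenever eight are revealed.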
def show_config():
-232
@@ -1,232 +0,0 @@
"""CLI subcommand: `hermes curator <subcommand>`.
Thin shell around agent/curator.py and tools/skill_usage.py. Renders a status
table, triggers a run, pauses/resumes, and pins/unpins skills.
This module intentionally has no side effects at import time — main.py wires
the argparse subparsers on demand.
"""
from __future__ import annotations
import argparse
import sys
from datetime import datetime, timezone
from typing import Optional
def _fmt_ts(ts: Optional[str]) -> str:
if not ts:
return "never"
try:
dt = datetime.fromisoformat(ts)
except (TypeError, ValueError):
return str(ts)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
delta = datetime.now(timezone.utc) - dt
secs = int(delta.total_seconds())
if secs < 60:
return f"{secs}s ago"
if secs < 3600:
return f"{secs // 60}m ago"
if secs < 86400:
return f"{secs // 3600}h ago"
return f"{secs // 86400}d ago"
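Review note: `_fmt_ts` reads the clock internally, which makes it awkward to test. A minimal sketch of the same bucketing with an injectable clock; the name `relative_age` and the `now` parameter are assumptions added for illustration.

```python
from datetime import datetime, timezone


def relative_age(ts, now=None):
    """Format an ISO-8601 timestamp as a coarse age such as '5m ago'."""
    if not ts:
        return "never"
    try:
        dt = datetime.fromisoformat(ts)
    except (TypeError, ValueError):
        return str(ts)  # unparseable input is echoed back unchanged
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # naive timestamps are treated as UTC
    now = now or datetime.now(timezone.utc)
    secs = int((now - dt).total_seconds())
    # Largest unit first, so 90 seconds becomes "1m ago" and not "90s ago".
    for unit, size in (("d", 86400), ("h", 3600), ("m", 60)):
        if secs >= size:
            return f"{secs // size}{unit} ago"
    return f"{secs}s ago"


fixed_now = datetime(2026, 4, 28, 12, 0, 0, tzinfo=timezone.utc)
print(relative_age("2026-04-28T11:58:30", fixed_now))
```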
def _cmd_status(args) -> int:
from agent import curator
from tools import skill_usage
state = curator.load_state()
enabled = curator.is_enabled()
paused = state.get("paused", False)
last_run = state.get("last_run_at")
summary = state.get("last_run_summary") or "(none)"
runs = state.get("run_count", 0)
status_line = (
"ENABLED" if enabled and not paused else
"PAUSED" if paused else
"DISABLED"
)
print(f"curator: {status_line}")
print(f" runs: {runs}")
print(f" last run: {_fmt_ts(last_run)}")
print(f" last summary: {summary}")
_ih = curator.get_interval_hours()
_interval_label = (
f"{_ih // 24}d" if _ih % 24 == 0 and _ih >= 24
else f"{_ih}h"
)
print(f" interval: every {_interval_label}")
print(f" stale after: {curator.get_stale_after_days()}d unused")
print(f" archive after: {curator.get_archive_after_days()}d unused")
rows = skill_usage.agent_created_report()
if not rows:
print("\nno agent-created skills")
return 0
by_state = {"active": [], "stale": [], "archived": []}
pinned = []
for r in rows:
state_name = r.get("state", "active")
by_state.setdefault(state_name, []).append(r)
if r.get("pinned"):
pinned.append(r["name"])
print(f"\nagent-created skills: {len(rows)} total")
for state_name in ("active", "stale", "archived"):
bucket = by_state.get(state_name, [])
print(f" {state_name:10s} {len(bucket)}")
if pinned:
print(f"\npinned ({len(pinned)}): {', '.join(pinned)}")
# Show top 5 least-recently-used active skills
active = sorted(
by_state.get("active", []),
key=lambda r: r.get("last_used_at") or r.get("created_at") or "",
)[:5]
if active:
print("\nleast recently used (top 5):")
for r in active:
last = _fmt_ts(r.get("last_used_at"))
print(f" {r['name']:40s} use={r.get('use_count', 0):3d} last_used={last}")
return 0
def _cmd_run(args) -> int:
from agent import curator
if not curator.is_enabled():
print("curator: disabled via config; enable with `curator.enabled: true`")
return 1
print("curator: running review pass...")
def _on_summary(msg: str) -> None:
print(msg)
result = curator.run_curator_review(
on_summary=_on_summary,
synchronous=bool(args.synchronous),
)
auto = result.get("auto_transitions", {})
if auto:
print(
f"auto: checked={auto.get('checked', 0)} "
f"stale={auto.get('marked_stale', 0)} "
f"archived={auto.get('archived', 0)} "
f"reactivated={auto.get('reactivated', 0)}"
)
if not args.synchronous:
print("llm pass running in background — check `hermes curator status` later")
return 0
def _cmd_pause(args) -> int:
from agent import curator
curator.set_paused(True)
print("curator: paused")
return 0
def _cmd_resume(args) -> int:
from agent import curator
curator.set_paused(False)
print("curator: resumed")
return 0
def _cmd_pin(args) -> int:
from tools import skill_usage
if not skill_usage.is_agent_created(args.skill):
print(
f"curator: '{args.skill}' is bundled or hub-installed — cannot pin "
"(only agent-created skills participate in curation)"
)
return 1
skill_usage.set_pinned(args.skill, True)
print(f"curator: pinned '{args.skill}' (will bypass auto-transitions)")
return 0
def _cmd_unpin(args) -> int:
from tools import skill_usage
if not skill_usage.is_agent_created(args.skill):
print(
f"curator: '{args.skill}' is bundled or hub-installed — "
"there's nothing to unpin (curator only tracks agent-created skills)"
)
return 1
skill_usage.set_pinned(args.skill, False)
print(f"curator: unpinned '{args.skill}'")
return 0
def _cmd_restore(args) -> int:
from tools import skill_usage
ok, msg = skill_usage.restore_skill(args.skill)
print(f"curator: {msg}")
return 0 if ok else 1
# ---------------------------------------------------------------------------
# argparse wiring (called from hermes_cli.main)
# ---------------------------------------------------------------------------
def register_cli(parent: argparse.ArgumentParser) -> None:
"""Attach `curator` subcommands to *parent*.
main.py calls this with the ArgumentParser returned by
``subparsers.add_parser("curator", ...)``.
"""
parent.set_defaults(func=lambda a: (parent.print_help(), 0)[1])
subs = parent.add_subparsers(dest="curator_command")
p_status = subs.add_parser("status", help="Show curator status and skill stats")
p_status.set_defaults(func=_cmd_status)
p_run = subs.add_parser("run", help="Trigger a curator review now")
p_run.add_argument(
"--sync", "--synchronous", dest="synchronous", action="store_true",
help="Wait for the LLM review pass to finish (default: background thread)",
)
p_run.set_defaults(func=_cmd_run)
p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
p_pause.set_defaults(func=_cmd_pause)
p_resume = subs.add_parser("resume", help="Resume a paused curator")
p_resume.set_defaults(func=_cmd_resume)
p_pin = subs.add_parser("pin", help="Pin a skill so the curator never auto-transitions it")
p_pin.add_argument("skill", help="Skill name")
p_pin.set_defaults(func=_cmd_pin)
p_unpin = subs.add_parser("unpin", help="Unpin a skill")
p_unpin.add_argument("skill", help="Skill name")
p_unpin.set_defaults(func=_cmd_unpin)
p_restore = subs.add_parser("restore", help="Restore an archived skill")
p_restore.add_argument("skill", help="Skill name")
p_restore.set_defaults(func=_cmd_restore)
def cli_main(argv=None) -> int:
"""Standalone entry (also usable by hermes_cli.main fallthrough)."""
parser = argparse.ArgumentParser(prog="hermes curator")
register_cli(parser)
args = parser.parse_args(argv)
fn = getattr(args, "func", None)
if fn is None:
parser.print_help()
return 0
return int(fn(args) or 0)
if __name__ == "__main__": # pragma: no cover
sys.exit(cli_main())
+12 -67
@@ -293,23 +293,15 @@ def run_doctor(args):
known_providers: set = set()
try:
from hermes_cli.auth import (
PROVIDER_REGISTRY,
resolve_provider as _resolve_auth_provider,
)
from hermes_cli.auth import PROVIDER_REGISTRY
known_providers = set(PROVIDER_REGISTRY.keys()) | {"openrouter", "custom", "auto"}
except Exception:
_resolve_auth_provider = None
pass
try:
from hermes_cli.config import get_compatible_custom_providers as _compatible_custom_providers
from hermes_cli.providers import (
normalize_provider as _normalize_catalog_provider,
resolve_provider_full as _resolve_provider_full,
)
from hermes_cli.providers import resolve_provider_full as _resolve_provider_full
except Exception:
_compatible_custom_providers = None
_normalize_catalog_provider = None
_resolve_provider_full = None
custom_providers = []
@@ -329,43 +321,17 @@ def run_doctor(args):
if name:
known_providers.add("custom:" + name.lower().replace(" ", "-"))
valid_provider_ids = set(known_providers)
provider_ids_to_accept = {provider} if provider else set()
if _normalize_catalog_provider is not None:
for known_provider in known_providers:
try:
valid_provider_ids.add(_normalize_catalog_provider(known_provider))
except Exception:
continue
runtime_provider = provider
if (
provider
and _resolve_auth_provider is not None
and provider not in ("auto", "custom")
):
try:
runtime_provider = _resolve_auth_provider(provider)
provider_ids_to_accept.add(runtime_provider)
except Exception:
runtime_provider = provider
catalog_provider = provider
canonical_provider = provider
if (
provider
and _resolve_provider_full is not None
and provider not in ("auto", "custom")
):
provider_def = _resolve_provider_full(provider, user_providers, custom_providers)
catalog_provider = provider_def.id if provider_def is not None else None
if catalog_provider is not None:
provider_ids_to_accept.add(catalog_provider)
canonical_provider = provider_def.id if provider_def is not None else None
if provider and provider != "auto":
if catalog_provider is None or (
known_providers
and not (provider_ids_to_accept & valid_provider_ids)
):
if canonical_provider is None or (known_providers and canonical_provider not in known_providers):
known_list = ", ".join(sorted(known_providers)) if known_providers else "(unavailable)"
check_fail(
f"model.provider '{provider_raw}' is not a recognised provider",
@@ -378,24 +344,7 @@ def run_doctor(args):
)
# Warn if model is set to a provider-prefixed name on a provider that doesn't use them
provider_for_policy = runtime_provider or catalog_provider
providers_accepting_vendor_slugs = {
"openrouter",
"custom",
"auto",
"ai-gateway",
"kilocode",
"opencode-zen",
"huggingface",
"lmstudio",
"nous",
}
if (
default_model
and "/" in default_model
and provider_for_policy
and provider_for_policy not in providers_accepting_vendor_slugs
):
if default_model and "/" in default_model and canonical_provider and canonical_provider not in ("openrouter", "custom", "auto", "ai-gateway", "kilocode", "opencode-zen", "huggingface", "nous", "lmstudio"):
check_warn(
f"model.default '{default_model}' uses a vendor/model slug but provider is '{provider_raw}'",
"(vendor-prefixed slugs belong to aggregators like openrouter)",
@@ -411,24 +360,20 @@ def run_doctor(args):
# own env-var checks elsewhere in doctor, and get_auth_status()
# returns a bare {logged_in: False} for anything it doesn't
# explicitly dispatch, which would produce false positives.
if runtime_provider and runtime_provider not in ("auto", "custom", "openrouter"):
if canonical_provider and canonical_provider not in ("auto", "custom", "openrouter"):
try:
from hermes_cli.auth import PROVIDER_REGISTRY, get_auth_status
pconfig = PROVIDER_REGISTRY.get(runtime_provider)
pconfig = PROVIDER_REGISTRY.get(canonical_provider)
if pconfig and getattr(pconfig, "auth_type", "") == "api_key":
status = get_auth_status(runtime_provider) or {}
configured = bool(
status.get("configured")
or status.get("logged_in")
or status.get("api_key")
)
status = get_auth_status(canonical_provider) or {}
configured = bool(status.get("configured") or status.get("logged_in") or status.get("api_key"))
if not configured:
check_fail(
f"model.provider '{runtime_provider}' is set but no API key is configured",
f"model.provider '{canonical_provider}' is set but no API key is configured",
"(check ~/.hermes/.env or run 'hermes setup')",
)
issues.append(
f"No credentials found for provider '{runtime_provider}'. "
f"No credentials found for provider '{canonical_provider}'. "
f"Run 'hermes setup' or set the provider's API key in {_DHH}/.env, "
f"or switch providers with 'hermes config set model.provider <name>'"
)
+6 -8
@@ -33,14 +33,12 @@ def _get_git_commit(project_root: Path) -> str:
def _redact(value: str) -> str:
"""Redact all but first 4 and last 4 chars.
Thin wrapper over :func:`agent.redact.mask_secret`. Returns ``""`` for
an empty value (matches the historical behavior of this helper —
``hermes dump`` formats empty values as blank, not as ``"(not set)"``).
"""
from agent.redact import mask_secret
return mask_secret(value)
"""Redact all but first 4 and last 4 chars."""
if not value:
return ""
if len(value) < 12:
return "***"
return value[:4] + "..." + value[-4:]
def _gateway_status() -> str:
+333
@@ -0,0 +1,333 @@
"""Learning ledger: read-only index of how Hermes has grown for this profile."""
from __future__ import annotations
import json
import time
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Any
from hermes_constants import get_hermes_home
@dataclass
class LedgerItem:
type: str
name: str
summary: str
source: str
count: int = 0
learned_from: str | None = None
last_used_at: float | None = None
learned_at: float | None = None
via: str | None = None
def build_learning_ledger(db: Any = None, *, limit: int = 80) -> dict[str, Any]:
"""Build a compact, read-only ledger from existing Hermes artifacts."""
skill_inventory = _skill_inventory()
items = [
*_memory_items(),
*_tool_usage_items(db),
*_integration_items(),
]
items.sort(
key=lambda i: (i.last_used_at or i.learned_at or 0, i.type, i.name),
reverse=True,
)
counts: dict[str, int] = {}
for item in items:
counts[item.type] = counts.get(item.type, 0) + 1
return {
"generated_at": time.time(),
"home": str(get_hermes_home()),
"counts": counts,
"items": [asdict(item) for item in items[: max(1, limit)]],
"inventory": {"skills": skill_inventory},
"total": len(items),
}
def _memory_items() -> list[LedgerItem]:
try:
from tools.memory_tool import MemoryStore, get_memory_dir
mem_dir = get_memory_dir()
pairs = [
("memory", "MEMORY.md", "agent note"),
("user", "USER.md", "user profile"),
]
items: list[LedgerItem] = []
for item_type, filename, label in pairs:
path = mem_dir / filename
for idx, entry in enumerate(MemoryStore._read_file(path), 1):
items.append(
LedgerItem(
type=item_type,
name=f"{label} {idx}",
summary=_one_line(entry),
source=str(path),
learned_at=_mtime(path),
)
)
return items
except Exception:
return []
def _skill_inventory() -> int:
try:
from tools.skills_tool import _find_all_skills
return len(_find_all_skills())
except Exception:
return 0
def _tool_usage_items(db: Any) -> list[LedgerItem]:
if db is None or not getattr(db, "_conn", None):
return []
usage: dict[tuple[str, str], LedgerItem] = {}
def bump(
item_type: str,
name: str,
summary: str,
ts: float | None,
*,
learned_from: str | None = None,
via: str | None = None,
):
key = (item_type, name)
item = usage.get(key)
if not item:
item = usage[key] = LedgerItem(
type=item_type,
name=name,
summary=summary,
source="state.db",
learned_from=learned_from,
via=via,
)
item.count += 1
if ts and (not item.last_used_at or ts > item.last_used_at):
item.last_used_at = ts
item.learned_from = learned_from or item.learned_from
item.via = via or item.via
try:
with db._lock:
rows = db._conn.execute(
"""
SELECT m.role, m.content, m.tool_calls, m.tool_name, m.timestamp,
m.session_id, s.title, s.source AS session_source
FROM messages m
LEFT JOIN sessions s ON s.id = m.session_id
WHERE m.tool_name IS NOT NULL OR m.tool_calls IS NOT NULL
ORDER BY m.timestamp DESC
LIMIT 5000
"""
).fetchall()
except Exception:
return []
for row in rows:
ts = _float(row["timestamp"])
tool_name = row["tool_name"]
content = row["content"] or ""
learned_from = row["title"] or row["session_source"] or row["session_id"]
if tool_name == "memory":
target = _json(content).get("target") or "memory"
bump(str(target), f"{target} writes", "Durable memory updates", ts, learned_from=learned_from, via="memory")
elif tool_name == "session_search":
event = learning_event_from_tool(tool_name, {}, content)
if event:
bump("recall", event["title"], event["summary"], ts, learned_from=learned_from, via="session_search")
elif tool_name in {"skill_view", "skill_manage"}:
data = _json(content)
name = str(data.get("name") or data.get("skill") or tool_name)
bump("skill-use", name, _skill_summary(tool_name, data), ts, learned_from=learned_from, via=tool_name)
for call in _tool_calls(row["tool_calls"]):
name, args = call
if name == "session_search":
event = learning_event_from_tool(name, args, content)
if event:
bump("recall", event["title"], event["summary"], ts, learned_from=learned_from, via=name)
elif name in {"skill_view", "skill_manage"}:
skill_name = str(
args.get("name") or args.get("skill") or args.get("query") or name
)
bump("skill-use", skill_name, _skill_summary(name, args), ts, learned_from=learned_from, via=name)
elif name == "memory":
target = str(args.get("target") or "memory")
bump(target, f"{target} writes", "Durable memory updates", ts, learned_from=learned_from, via=name)
return list(usage.values())
def learning_event_from_tool(
tool_name: str,
args: dict[str, Any] | None = None,
result: str | None = None,
) -> dict[str, Any] | None:
args = args or {}
data = _json(result)
if tool_name == "memory":
target = str(args.get("target") or data.get("target") or "memory")
content = str(args.get("content") or "").strip()
return {
"type": target if target in {"memory", "user"} else "memory",
"verb": "remembered",
"title": _memory_title(content) if content else f"{target} updated",
"summary": "Durable memory updated",
"source": "memory",
"via": "memory",
}
if tool_name == "session_search":
title = _recall_title(data) or str(args.get("query") or "").strip() or "past sessions"
return {
"type": "recall",
"verb": "recalled",
"title": _one_line(title, max_len=120),
"summary": "Past conversations recalled",
"source": "state.db",
"via": "session_search",
}
if tool_name in {"skill_view", "skill_manage"}:
action = str(args.get("action") or data.get("action") or "").strip().lower()
name = str(args.get("name") or args.get("query") or data.get("name") or "skill").strip()
verb = "updated skill" if tool_name == "skill_manage" and action in {"create", "patch", "update", "install"} else "applied skill"
return {
"type": "skill-use",
"verb": verb,
"title": _one_line(name, max_len=120),
"summary": _skill_summary(tool_name, {**args, **(data if isinstance(data, dict) else {})}),
"source": "skills",
"via": tool_name,
}
return None
def _skill_summary(tool_name: str, data: dict[str, Any]) -> str:
action = str(data.get("action") or "").strip().lower()
if tool_name == "skill_manage" and action:
return f"Skill {action.replace('_', ' ')}"
if tool_name == "skill_manage":
return "Skill managed"
return "Skill reused"
def _recall_title(data: Any) -> str:
if not isinstance(data, dict):
return ""
results = data.get("results")
if not isinstance(results, list) or not results:
return str(data.get("query") or "").strip()
first = results[0] if isinstance(results[0], dict) else {}
return str(first.get("title") or first.get("preview") or data.get("query") or "").strip()
def _memory_title(content: str) -> str:
title = _one_line(content, max_len=120)
lowered = title.lower()
for prefix in ("the user ", "user "):
if lowered.startswith(prefix):
return title[len(prefix):].lstrip()
return title
def _integration_items() -> list[LedgerItem]:
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception:
return []
items: list[LedgerItem] = []
provider = ((cfg.get("memory") or {}) if isinstance(cfg, dict) else {}).get(
"provider"
)
if provider:
items.append(
LedgerItem(
type="integration",
name=f"{provider} memory provider",
summary="External memory provider is configured",
source="config.yaml",
)
)
for server in (
sorted(((cfg.get("mcp") or {}).get("servers") or {}).keys())
if isinstance(cfg, dict)
else []
):
items.append(
LedgerItem(
type="integration",
name=f"{server} MCP server",
summary="MCP server is configured",
source="config.yaml",
)
)
return items
def _tool_calls(raw: str | None) -> list[tuple[str, dict[str, Any]]]:
calls = _json(raw)
if not isinstance(calls, list):
return []
parsed = []
for call in calls:
if not isinstance(call, dict):
continue
fn = call.get("function") or {}
name = call.get("name") or fn.get("name")
args = fn.get("arguments") or call.get("arguments") or call.get("args") or {}
if isinstance(args, str):
args = _json(args)
if name:
parsed.append((str(name), args if isinstance(args, dict) else {}))
return parsed
def _json(raw: Any) -> Any:
if not raw:
return {}
if isinstance(raw, (dict, list)):
return raw
try:
return json.loads(raw)
except Exception:
return {}
def _mtime(path: Path) -> float | None:
try:
return path.stat().st_mtime
except OSError:
return None
def _float(value: Any) -> float | None:
try:
return float(value)
except (TypeError, ValueError):
return None
def _one_line(text: str, *, max_len: int = 180) -> str:
line = " ".join(str(text).split())
return line[: max_len - 1] + "…" if len(line) > max_len else line
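The `_tool_calls` helper above accepts several payload shapes: a top-level `name`, a nested `function.name`, and arguments supplied either as a dict or as a JSON-encoded string. The following is a minimal standalone sketch of that normalization, not the module's code; `normalize_tool_calls` and the sample payload are illustrative names and data introduced here.

```python
import json
from typing import Any


def normalize_tool_calls(raw: Any) -> list[tuple[str, dict[str, Any]]]:
    """Illustrative stand-in for _tool_calls: returns (name, args) pairs."""
    try:
        calls = json.loads(raw) if isinstance(raw, str) else raw
    except ValueError:
        return []
    if not isinstance(calls, list):
        return []
    parsed = []
    for call in calls:
        if not isinstance(call, dict):
            continue  # skip entries that are not call objects
        fn = call.get("function") or {}
        name = call.get("name") or fn.get("name")
        args = fn.get("arguments") or call.get("arguments") or call.get("args") or {}
        if isinstance(args, str):
            # Some payloads carry arguments as a JSON-encoded string.
            try:
                args = json.loads(args)
            except ValueError:
                args = {}
        if name:
            parsed.append((str(name), args if isinstance(args, dict) else {}))
    return parsed


# Hypothetical sample mixing the shapes the helper tolerates.
sample = json.dumps([
    {"function": {"name": "session_search", "arguments": "{\"query\": \"nix hash\"}"}},
    {"name": "memory", "args": {"target": "user"}},
    "not-a-dict",
])
pairs = normalize_tool_calls(sample)
```

Malformed input yields an empty list rather than an exception, which matches the ledger's read-only, best-effort stance.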
-20
@@ -9230,26 +9230,6 @@ Examples:
except Exception as _exc:
logging.getLogger(__name__).debug("Plugin CLI discovery failed: %s", _exc)
# =========================================================================
# curator command — background skill maintenance
# =========================================================================
curator_parser = subparsers.add_parser(
"curator",
help="Background skill maintenance (curator) — status, run, pause, pin",
description=(
"The curator is an auxiliary-model background task that "
"periodically reviews agent-created skills, prunes stale ones, "
"consolidates overlaps, and archives obsolete skills. "
"Bundled and hub-installed skills are never touched. "
"Archives are recoverable; auto-deletion never happens."
),
)
try:
from hermes_cli.curator import register_cli as _register_curator_cli
_register_curator_cli(curator_parser)
except Exception as _exc:
logging.getLogger(__name__).debug("curator CLI wiring failed: %s", _exc)
# =========================================================================
# memory command
# =========================================================================
+5 -14
@@ -68,7 +68,7 @@ All fields are optional. Missing values inherit from the ``default`` skin.
welcome: "Welcome message" # Shown at CLI startup
goodbye: "Goodbye! ⚕" # Shown on exit
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: "" # Input prompt symbol (bare token; renderers add trailing space)
prompt_symbol: "" # Input prompt symbol (spacing is added by the UI)
help_header: "(^_^)? Commands" # /help header text
# Tool prefix: character for tool output lines (default: ┊)
@@ -780,21 +780,12 @@ def init_skin_from_config(config: dict) -> None:
# =============================================================================
def get_active_prompt_symbol(fallback: str = "") -> str:
"""Return the interactive prompt symbol with a single trailing space.
Skins store ``prompt_symbol`` as a bare token (no spaces). The trailing
space is appended here so callers can drop it straight into a rendered
prompt without hand-rolling whitespace.
"""
def get_active_prompt_symbol(fallback: str = " ") -> str:
"""Get the interactive prompt symbol from the active skin."""
try:
raw = get_active_skin().get_branding("prompt_symbol", fallback)
return get_active_skin().get_branding("prompt_symbol", fallback)
except Exception:
raw = fallback
cleaned = (raw or fallback).strip()
return f"{cleaned or fallback.strip()} "
return fallback
+6 -9
@@ -26,15 +26,12 @@ def check_mark(ok: bool) -> str:
return color("", Colors.RED)
def redact_key(key: str) -> str:
"""Redact an API key for display.
Thin wrapper over :func:`agent.redact.mask_secret`. Preserves the
"(not set)" placeholder in dim color to match ``hermes config``'s
output (previously this variant was missing the DIM color —
consolidated via PR that also introduced ``mask_secret``).
"""
from agent.redact import mask_secret
return mask_secret(key, empty=color("(not set)", Colors.DIM))
"""Redact an API key for display."""
if not key:
return "(not set)"
if len(key) < 12:
return "***"
return key[:4] + "..." + key[-4:]
def _format_iso_timestamp(value) -> str:
-65
@@ -206,27 +206,6 @@ _LEGACY_TOOLSET_MAP = {
# get_tool_definitions (the main schema provider)
# =============================================================================
# Module-level memoization for get_tool_definitions(). Keyed on
# (frozenset(enabled_toolsets), frozenset(disabled_toolsets), registry._generation).
# Hot callers (gateway runner, AIAgent.__init__) invoke this on every turn
# with quiet_mode=True; caching avoids ~7 ms of registry walking + schema
# filtering + check_fn probing per call. Only active when quiet_mode=True
# because quiet_mode=False has stdout side effects (tool-selection prints).
#
# Invalidation happens transparently via the registry's _generation counter,
# which bumps on register() / deregister() / register_toolset_alias(). The
# inner check_fn TTL cache in registry.py handles environment drift (Docker
# daemon start/stop, env var changes, etc.) on a 30 s horizon.
_tool_defs_cache: Dict[tuple, List[Dict[str, Any]]] = {}
def _clear_tool_defs_cache() -> None:
"""Drop memoized get_tool_definitions() results. Called when dynamic
schema dependencies change (e.g. discord capability cache reset,
execute_code sandbox reconfigured)."""
_tool_defs_cache.clear()
def get_tool_definitions(
enabled_toolsets: List[str] = None,
disabled_toolsets: List[str] = None,
@@ -245,50 +224,6 @@ def get_tool_definitions(
Returns:
Filtered list of OpenAI-format tool definitions.
"""
# Fast path: memoized result when the caller doesn't need stdout prints.
# The cache key captures every argument-level input; the registry
# generation captures registry mutations (MCP refresh, plugin load).
# check_fn results are TTL-cached one level down, inside
# registry.get_definitions. The config-mtime fingerprint below captures
# user-visible config edits that affect dynamic schemas (execute_code
# mode, discord action allowlist, etc.) without needing an explicit
# invalidate hook on every config-writer.
if quiet_mode:
try:
from hermes_cli.config import get_config_path
cfg_path = get_config_path()
cfg_stat = cfg_path.stat()
cfg_fp = (cfg_stat.st_mtime_ns, cfg_stat.st_size)
except (FileNotFoundError, OSError, ImportError):
cfg_fp = None
cache_key = (
frozenset(enabled_toolsets) if enabled_toolsets is not None else None,
frozenset(disabled_toolsets) if disabled_toolsets else None,
registry._generation,
cfg_fp,
)
cached = _tool_defs_cache.get(cache_key)
if cached is not None:
# Update _last_resolved_tool_names so downstream callers see
# consistent state even on a cache hit.
global _last_resolved_tool_names
_last_resolved_tool_names = [t["function"]["name"] for t in cached]
# Return a shallow copy of the list but share the dict references —
# schemas are treated as read-only by all known callers.
return list(cached)
result = _compute_tool_definitions(enabled_toolsets, disabled_toolsets, quiet_mode)
if quiet_mode:
_tool_defs_cache[cache_key] = result
return result
def _compute_tool_definitions(
enabled_toolsets: List[str] = None,
disabled_toolsets: List[str] = None,
quiet_mode: bool = False,
) -> List[Dict[str, Any]]:
"""Uncached implementation of :func:`get_tool_definitions`."""
# Determine which tool names the caller wants
tools_to_include: set = set()
-11
@@ -165,17 +165,6 @@
NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
if [ -z "$NEW_HASH" ]; then
# Magic-Nix-Cache occasionally returns HTTP 418 / cache-throttled
# mid-run; nix then prints "outputs … not valid, so checking is
# not possible" without a `got:` line. That's an infrastructure
# blip, not a stale lockfile — warn + skip rather than failing
# the lint. A real hash mismatch would still surface in the
# primary `.#$ATTR` build, which is a separate CI job.
if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
echo " skipped (transient cache failure; see primary nix build for real status)" >&2
echo "$OUTPUT" | tail -8 >&2
continue
fi
echo " build failed with no hash mismatch:" >&2
echo "$OUTPUT" | tail -40 >&2
exit 1
+1 -1
@@ -4,7 +4,7 @@ let
src = ../web;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-+B2+Fe4djPzHHcUXRx+m0cuyaopAhW0PcHsMgYfV5VE=";
hash = "sha256-AahWmJ9gDQ9pMPa1FYwUjYdO2mOi6JM9Mst27E0vp68=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "web"; attr = "web"; pname = "hermes-web"; };
-346
@@ -1,346 +0,0 @@
---
name: comfyui
description: "Use when generating images/video/audio with ComfyUI — import workflows, run them with friendly parameters, manage models and dependencies. Uses the comfyui-skill CLI over the REST API."
version: 3.0.0
requires: ComfyUI running locally or via Comfy Cloud; comfyui-skill CLI (auto-installed via uvx)
author: kshitijk4poor
license: MIT
platforms: [macos, linux, windows]
prerequisites:
commands: ["uv"]
setup:
help: "CLI auto-runs via uvx. ComfyUI install: https://docs.comfy.org/installation"
metadata:
hermes:
tags:
[
comfyui,
image-generation,
stable-diffusion,
flux,
creative,
generative-ai,
video-generation,
]
related_skills: [stable-diffusion-image-generation, image_gen]
category: creative
---
# ComfyUI
Generate images, video, and audio through ComfyUI using the `comfyui-skill` CLI.
The CLI wraps ComfyUI's REST API into an agent-friendly interface — workflows become
"skills" with named parameters (e.g., `prompt`, `seed`) instead of raw node graphs.
**Reference files in this skill:**
- `references/cli-reference.md` — complete command reference with all subcommands and options
- `references/api-notes.md` — underlying REST API routes (for debugging / advanced use)
- `scripts/comfyui_setup.sh` — workspace initialization script
## When to Use
- User asks to generate images with Stable Diffusion, SDXL, Flux, or other diffusion models
- User wants to run a specific ComfyUI workflow
- User wants to chain generative steps (txt2img → upscale → face restore)
- User needs ControlNet, inpainting, img2img, or other advanced pipelines
- User asks to manage ComfyUI queue, check models, or install custom nodes
- User wants video/audio generation via AnimateDiff, Hunyuan, AudioCraft, etc.
## How It Works
The `comfyui-skill` CLI turns ComfyUI workflows into callable "skills":
1. **Import** a workflow JSON (from editor or API format) → CLI extracts a parameter schema
2. **Run** with friendly args (`--args '{"prompt": "a cat"}'`) → CLI injects values into the right nodes
3. **Retrieve** outputs → CLI downloads generated files locally
The agent never sees raw node IDs or graph wiring. The CLI handles:
- Editor-format → API-format conversion (resolves reroutes, widget ordering via `/object_info`)
- Auto-upload of local images referenced in args
- Dependency checking (missing custom nodes, models)
- WebSocket streaming with polling fallback
- Multi-server routing
- Idempotent execution via `--job-id`
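The import-then-run flow described above can be pictured as a lookup table from friendly parameter names to node inputs. The sketch below is only an illustration of that idea, not the CLI's implementation; the schema layout, node IDs, and the `inject` helper are assumptions made for this example.

```python
import copy
from typing import Any

# Hypothetical schema: friendly parameter name -> (node ID, input name).
SCHEMA: dict[str, tuple[str, str]] = {
    "prompt": ("6", "text"),
    "seed": ("3", "seed"),
}

# Hypothetical API-format workflow with two nodes.
WORKFLOW: dict[str, dict[str, Any]] = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 20}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
}


def inject(
    workflow: dict[str, dict[str, Any]],
    schema: dict[str, tuple[str, str]],
    args: dict[str, Any],
) -> dict[str, dict[str, Any]]:
    """Return a copy of the workflow with each named argument written into its node."""
    patched = copy.deepcopy(workflow)  # leave the imported workflow untouched
    for name, value in args.items():
        if name not in schema:
            raise KeyError(f"unknown parameter: {name}")
        node_id, input_name = schema[name]
        patched[node_id]["inputs"][input_name] = value
    return patched


patched = inject(WORKFLOW, SCHEMA, {"prompt": "a cat", "seed": 42})
```

Working on a deep copy keeps the stored workflow reusable across runs with different arguments.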
## CLI Invocation
The CLI is invoked via `uvx` (no persistent install needed):
```bash
uvx --from comfyui-skill-cli comfyui-skill [OPTIONS] COMMAND [ARGS]
```
For brevity, the examples below store the full command in a shell variable:
```bash
# In execute_code / terminal, always use the full uvx form:
COMFY="uvx --from comfyui-skill-cli comfyui-skill"
```
**Always pass `--json` for structured output** the agent can parse:
```bash
$COMFY --json list
$COMFY --json run my-workflow --args '{"prompt": "a cat"}'
```
If `comfyui-skill` is already installed as a `uv tool` (`uv tool install comfyui-skill-cli`),
it's on PATH directly and `uvx` is not needed.
## Setup & Onboarding
### 1. ComfyUI Must Be Running
The CLI talks to a running ComfyUI server. If the user doesn't have one:
- Point them to https://docs.comfy.org/installation. If they ask for help in onboarding, read the docs and help them set things up.
- Supports: NVIDIA (CUDA), AMD (ROCm), Intel Arc, Apple Silicon (MPS), CPU-only
- Desktop app available for Windows/macOS; manual install for Linux
- Comfy Cloud available for users without a GPU (https://platform.comfy.org)
### 2. Initialize a Workspace
The CLI reads `config.json` and `data/` from its working directory. Run the
setup script or initialize manually:
```bash
bash scripts/comfyui_setup.sh
```
Or manually:
```bash
mkdir -p ~/.hermes/comfyui && cd ~/.hermes/comfyui
```
Then add a server:
```bash
$COMFY --json server add --id local --url http://127.0.0.1:8188 --name "Local ComfyUI"
```
For Comfy Cloud:
```bash
$COMFY --json server add --id cloud --url https://cloud.comfy.org \
--name "Comfy Cloud" --api-key "comfyui-xxxxxxxxxxxx"
```
### 3. Verify Connection
```bash
$COMFY --json server status
```
This should return `{"status": "online", ...}`. If the server is offline, the user needs to start ComfyUI.
### 4. Import a Workflow
Users typically have workflow JSON files from the ComfyUI editor:
```bash
$COMFY --json workflow import /path/to/workflow.json --name my-workflow
```
The CLI auto-detects format (editor or API), converts if needed, and extracts
a parameter schema. Both formats are accepted.
To import from the ComfyUI server's saved workflows:
```bash
$COMFY --json workflow import --from-server
```
## Core Workflow
### Step 1: List Available Skills
```bash
$COMFY --json list
```
Returns all imported workflows with their parameter schemas. Required params
must be provided; optional params have sensible defaults.
### Step 2: Check Dependencies (First Run)
```bash
$COMFY --json deps check my-workflow
```
Reports missing custom nodes and models. If `is_ready` is false:
```bash
# Install missing nodes (requires ComfyUI Manager)
$COMFY --json deps install my-workflow --all
# Missing models must be downloaded manually — CLI tells you which folder
```
### Step 3: Execute
**Blocking (recommended for most use):**
```bash
$COMFY --json run my-workflow --args '{"prompt": "a beautiful sunset", "seed": 42}'
```
Blocks until done, streams progress, downloads outputs.
**Non-blocking (for long jobs):**
```bash
# Submit
$COMFY --json submit my-workflow --args '{"prompt": "..."}'
# Returns: {"prompt_id": "abc-123"}
# Poll (each poll = separate command, do NOT loop in shell)
$COMFY --json status abc-123
# Returns: {"status": "running", "progress": {"value": 15, "max": 25}}
# When status = "success", outputs are in the response
```
### Step 4: Present Results
On success, the response contains output file paths. Show them to the user.
Images referenced in the output can be displayed via `vision_analyze` or
returned as file paths.
## Quick Decision Tree
| User says | Command |
| --------------------------------- | ---------------------------------------------- |
| "generate an image" / "draw" | `run <skill> --args '{"prompt": "..."}'` |
| "import this workflow" | `workflow import <path>` |
| "use this image" (img2img) | `upload <image>` then `run` with the reference |
| "inpaint this" | `upload <mask> --mask` then `run` |
| "what workflows do I have" | `list` |
| "what models are available" | `models list checkpoints` |
| "check if everything's installed" | `deps check <skill>` |
| "what failed" / "show history" | `history list <skill>` |
| "cancel that" | `cancel <prompt_id>` |
| "free up GPU memory" | `free` |
| "which nodes exist for X" | `nodes search <query>` |
## Multi-Server
Skills are addressed as `server_id/workflow_id`:
```bash
$COMFY --json list # all servers
$COMFY --json run local/txt2img --args '{...}' # specific server
$COMFY --json run cloud/flux --args '{...}' # different server
$COMFY --json server stats --all # VRAM/RAM across all servers
```
If `server_id` is omitted, the default server is used.
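As a rough illustration of the addressing rule, a skill ID can be split on the first `/`, falling back to a default server when no prefix is given. The helper name `split_skill_id` and the default of `"local"` are assumptions for this sketch; the CLI's actual default comes from its own configuration.

```python
def split_skill_id(skill_id: str, default_server: str = "local") -> tuple[str, str]:
    """Split 'server_id/workflow_id'; use default_server when no prefix is present."""
    server_id, sep, workflow_id = skill_id.partition("/")
    if not sep:
        # No slash at all: the whole string is the workflow ID.
        return default_server, skill_id
    return server_id, workflow_id


explicit = split_skill_id("cloud/flux")
implicit = split_skill_id("txt2img")
```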
## Image Upload (img2img / Inpainting)
```bash
# Upload input image
$COMFY --json upload /path/to/photo.png
# Returns: {"filename": "photo.png", ...}
# Upload mask for inpainting
$COMFY --json upload /path/to/mask.png --mask --original photo.png
# Use in workflow args — if a param has type "image" and value is a local
# file path (starts with /, ./, ../, ~), the CLI auto-uploads it
$COMFY --json run inpaint --args '{"image": "/path/to/photo.png", "mask": "/path/to/mask.png", "prompt": "fill with flowers"}'
```
## Model Discovery
```bash
$COMFY --json models list # all folder types
$COMFY --json models list checkpoints # checkpoint files
$COMFY --json models list loras # LoRA files
$COMFY --json models list controlnet # ControlNet models
```
Model folders: `checkpoints`, `loras`, `vae`, `controlnet`, `clip`, `clip_vision`,
`upscale_models`, `embeddings`, `unet`, `diffusion_models`.
## Node Discovery
```bash
$COMFY --json nodes list # all nodes, grouped by category
$COMFY --json nodes list -c sampling # filter by category
$COMFY --json nodes info KSampler # full details of one node
$COMFY --json nodes search "upscale" # fuzzy search
```
## Queue & System
```bash
$COMFY --json queue list # running + pending jobs
$COMFY --json queue clear # clear pending
$COMFY --json cancel <prompt_id> # cancel specific job
$COMFY --json free # unload models + free VRAM
$COMFY --json server stats # system info (VRAM, RAM, GPU)
```
## Workflow Management
```bash
$COMFY --json workflow import <path> --name <id> # import from file
$COMFY --json workflow import --from-server # import from ComfyUI server
$COMFY --json workflow enable <skill_id> # enable
$COMFY --json workflow disable <skill_id> # disable
$COMFY --json workflow delete <skill_id> # delete
$COMFY --json info <skill_id> # show schema + details
```
## Idempotent Execution
For retries that shouldn't burn extra GPU:
```bash
$COMFY --json run my-workflow --args '{"prompt": "..."}' --job-id "unique-key-123"
```
If `unique-key-123` was already executed, returns the cached result instantly.
## Pitfalls
1. **Working directory matters** — The CLI reads `config.json` and `data/` from CWD.
Always `cd` to the workspace directory before running commands. If `list` returns
empty or `server status` fails, you're in the wrong directory.
2. **Editor format needs a live server** — Importing editor-format workflows requires
a running ComfyUI instance (calls `/object_info` to resolve widget ordering).
API-format imports work offline.
3. **Missing custom nodes** — Always `deps check` before first run of an imported
workflow. "class_type not found" means missing nodes.
4. **JSON args quoting** — Wrap `--args` in single quotes to prevent bash from
eating the double quotes: `--args '{"prompt": "a cat"}'`.
5. **Comfy Cloud differences** — Cloud uses `/api/` prefix and `X-API-Key` auth.
The CLI handles this transparently when configured with `--api-key`.
6. **Model names are exact** — Case-sensitive, includes extension. Use
`models list checkpoints` to discover installed models.
7. **Long generations** — Video and high-step workflows can take minutes. The `run`
command blocks and streams progress. For very long jobs, use `submit` + `status`.
8. **Concurrent limits (Cloud)** — Free/Standard: 1 job. Creator: 3. Pro: 5.
Extra submits queue automatically.
9. **Config portability** — Use `config export` / `config import` to transfer
setups between machines.
## Verification Checklist
- [ ] `uv` or `uvx` available on PATH
- [ ] `comfyui-skill --json server status` returns online
- [ ] Workspace dir has `config.json` and `data/`
- [ ] At least one workflow imported (`list` returns non-empty)
- [ ] `deps check` passes for imported workflows
- [ ] Test run completes and outputs are saved
@@ -1,103 +0,0 @@
# ComfyUI REST API Notes
The `comfyui-skill` CLI wraps these endpoints. This reference is for debugging,
understanding errors, or advanced use when the CLI doesn't cover a specific need.
## Endpoints the CLI Uses
| Endpoint | Method | CLI Command |
|----------|--------|-------------|
| `/system_stats` | GET | `server status`, `server stats` |
| `/prompt` | POST | `run`, `submit` |
| `/history/{prompt_id}` | GET | `status`, `run` (polling) |
| `/history` | GET | `history list --server` |
| `/queue` | GET | `queue list` |
| `/queue` | POST | `queue clear`, `queue delete` |
| `/interrupt` | POST | `cancel` |
| `/free` | POST | `free` |
| `/object_info` | GET | `nodes list`, `workflow import` (schema extraction) |
| `/object_info/{class}` | GET | `nodes info` |
| `/models` | GET | `models list` |
| `/models/{folder}` | GET | `models list <folder>`, `deps check` |
| `/view` | GET | `run` (output download) |
| `/upload/image` | POST | `upload` |
| `/upload/mask` | POST | `upload --mask` |
| `/node_replacements` | GET | `workflow import` (deprecated node detection) |
| `/internal/logs/raw` | GET | `logs show` |
| `/workflow_templates` | GET | `templates list` |
| `/global_subgraphs` | GET | `templates subgraphs` |
| `/v2/userdata` | GET | `workflow import --from-server` |
| `/ws` | WebSocket | `run` (real-time progress) |
### Cloud-specific
| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/jobs` | GET | Job listing with filtering |
| `/api/jobs/{id}` | GET | Job details |
### ComfyUI Manager (optional plugin)
| Endpoint | Method | CLI Command |
|----------|--------|-------------|
| `/manager/queue/start` | GET | `deps install` |
| `/manager/queue/install` | POST | `deps install` (custom nodes) |
| `/manager/queue/install_model` | POST | `deps install --models` |
| `/manager/queue/status` | GET | `deps install` (progress) |
## Local vs Cloud Differences
| | Local | Cloud |
|---|---|---|
| Base URL | `http://127.0.0.1:8188` | `https://cloud.comfy.org` |
| Route prefix | none | `/api` |
| Auth | none or bearer token | `X-API-Key` header |
| Job status | Poll `/history/{id}` | `/api/jobs/{id}` |
| Output download | Direct bytes from `/view` | 302 redirect → signed URL |
| WebSocket | `ws://host:port/ws?clientId={uuid}` | `wss://host/ws?clientId={uuid}&token={key}` |
| Concurrent jobs | Sequential | Tier-limited (Free: 1, Creator: 3, Pro: 5) |
The CLI handles all of these differences transparently based on the server config.
## Workflow JSON Format (API Format)
```json
{
"node_id_string": {
"class_type": "NodeClassName",
"inputs": {
"param_name": "value",
"linked_input": ["source_node_id", output_index]
}
}
}
```
- Node IDs are strings (`"3"`, not `3`)
- Links: `["node_id", output_index]` — 0-based int
- `class_type` must match exactly (case-sensitive)
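The three rules above lend themselves to a quick structural check before submitting a workflow. The sketch below is a hedged illustration, not part of the CLI: `find_problems` is a hypothetical helper, and it assumes every list-valued input is meant to be a link.

```python
from typing import Any


def find_problems(workflow: dict[Any, Any]) -> list[str]:
    """Return human-readable problems found in an API-format workflow dict."""
    problems: list[str] = []
    for node_id, node in workflow.items():
        if not isinstance(node_id, str):
            problems.append(f"node ID {node_id!r} is not a string")
            continue
        if not isinstance(node, dict) or not isinstance(node.get("class_type"), str):
            problems.append(f"node {node_id}: missing class_type")
            continue
        for input_name, value in (node.get("inputs") or {}).items():
            if not isinstance(value, list):
                continue  # plain literal value, nothing to check
            is_link = (
                len(value) == 2
                and isinstance(value[0], str)
                and isinstance(value[1], int)
                and not isinstance(value[1], bool)
                and value[1] >= 0
            )
            if not is_link:
                problems.append(f"node {node_id}: input {input_name!r} is not a valid link")
            elif value[0] not in workflow:
                problems.append(
                    f"node {node_id}: input {input_name!r} links to unknown node {value[0]!r}"
                )
    return problems


good = {
    "4": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "model.safetensors"}},
    "3": {"class_type": "KSampler", "inputs": {"model": ["4", 0], "seed": 42}},
}
bad = {"3": {"class_type": "KSampler", "inputs": {"model": [4, 0]}}}
```

This only checks shape; whether a `class_type` exists on a given server still has to be confirmed against `/object_info`.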
## POST /prompt Payload
```json
{
"prompt": { "<workflow>" },
"client_id": "uuid",
"extra_data": {
"api_key_comfy_org": "key-for-paid-api-nodes"
}
}
```
The CLI constructs this from the imported workflow + injected parameters.
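For debugging, the payload shape shown above can be assembled by hand. This is a minimal sketch that only builds and serializes the body; `build_prompt_payload` is a hypothetical helper, the workflow is a placeholder, and nothing is sent to a server.

```python
import json
import uuid
from typing import Any, Optional


def build_prompt_payload(
    workflow: dict[str, Any], api_key: Optional[str] = None
) -> dict[str, Any]:
    """Assemble a POST /prompt body from an API-format workflow."""
    payload: dict[str, Any] = {
        "prompt": workflow,
        "client_id": str(uuid.uuid4()),  # fresh ID per submission in this sketch
    }
    if api_key:
        # Only included when paid API nodes need a key.
        payload["extra_data"] = {"api_key_comfy_org": api_key}
    return payload


# Placeholder workflow with a single node.
workflow = {"3": {"class_type": "KSampler", "inputs": {"seed": 42}}}
body = json.dumps(build_prompt_payload(workflow))
```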
## WebSocket Message Types
| Type | When | Key Fields |
|------|------|------------|
| `execution_start` | Prompt begins | `prompt_id` |
| `executing` | Node running (`null` = done) | `node`, `prompt_id` |
| `progress` | Sampling steps | `node`, `value`, `max` |
| `executed` | Node output ready | `node`, `output` |
| `execution_success` | All nodes done | `prompt_id` |
| `execution_error` | Failure | `exception_type`, `exception_message` |
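A client consuming these messages typically dispatches on the message type. The sketch below assumes each message is a JSON object with a `type` string and a `data` object holding the key fields from the table; that envelope and the `summarize` helper are assumptions for illustration.

```python
import json


def summarize(raw: str) -> str:
    """Turn one WebSocket message into a short status string."""
    message = json.loads(raw)
    kind = message.get("type")
    data = message.get("data") or {}
    if kind == "progress":
        return f"step {data['value']}/{data['max']}"
    if kind == "executing":
        # A null node signals that the prompt has finished executing.
        return "done" if data.get("node") is None else f"running node {data['node']}"
    if kind == "execution_error":
        return f"error: {data.get('exception_type')}: {data.get('exception_message')}"
    if kind in {"execution_start", "executed", "execution_success"}:
        return kind
    return "ignored"


progress = summarize(json.dumps({"type": "progress", "data": {"node": "3", "value": 15, "max": 25}}))
finished = summarize(json.dumps({"type": "executing", "data": {"node": None, "prompt_id": "abc-123"}}))
```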
@@ -1,172 +0,0 @@
# comfyui-skill CLI Reference
Complete command map for `comfyui-skill` v0.2.x.
**Invocation:** `uvx --from comfyui-skill-cli comfyui-skill [OPTIONS] COMMAND [ARGS]`
Or if installed as a tool: `comfyui-skill [OPTIONS] COMMAND [ARGS]`
## Global Options
| Option | Short | Description |
|--------|-------|-------------|
| `--version` | `-V` | Show version |
| `--json` | `-j` | JSON output (always use this for agent parsing) |
| `--output-format` | | `text`, `json`, or `stream-json` (NDJSON events) |
| `--server` | `-s` | Server ID override |
| `--dir` | `-d` | Data directory (default: CWD) |
| `--verbose` | `-v` | Verbose output |
| `--no-update-check` | | Skip CLI update check |
## Standalone Commands
### `list`
List all available skills across all enabled servers.
### `info <SKILL_ID>`
Show skill details and parameter schema. Skill ID format: `server_id/workflow_id` or `workflow_id`.
### `run <SKILL_ID> [OPTIONS]`
Execute a skill (blocking — waits for completion, streams progress).
| Option | Short | Description |
|--------|-------|-------------|
| `--args` | `-a` | JSON parameters (default: `{}`) |
| `--only` | | Comma-separated node IDs for partial execution |
| `--priority` | `-p` | Queue priority (lower = first, negative = jump queue; default: 0) |
| `--validate` | | Validate workflow without executing (dry run) |
| `--job-id` | | Idempotency key — reuse cached result if already executed |
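
For callers that shell out to the CLI, the options above map onto an argument list as in the sketch below. The helper name `build_run_command` and the skill ID used in the usage note are hypothetical; only flags listed in this reference are used, and the command assumes `comfyui-skill` is installed as a tool.

```python
import json


def build_run_command(skill_id, args=None, priority=0, validate=False, job_id=""):
    """Build an argv list for `comfyui-skill run` with JSON output enabled."""
    # Global options such as --json go before the command name.
    cmd = ["comfyui-skill", "--json", "run", skill_id, "--args", json.dumps(args or {})]
    if priority != 0:
        cmd += ["--priority", str(priority)]
    if validate:
        cmd.append("--validate")
    if job_id:
        cmd += ["--job-id", job_id]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`; for example, `build_run_command("local/txt2img", {"prompt": "a cat"}, validate=True)` yields a dry-run invocation.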
### `submit <SKILL_ID> [OPTIONS]`
Submit a skill (non-blocking — returns `prompt_id` immediately). Same options as `run` except no streaming.
### `status <PROMPT_ID>`
Check execution status. Returns: `queued` (with `position`), `running` (with `progress`), `success` (with `outputs`), or `error`.
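
A polling loop built on these four states might look like the sketch below. It assumes the status result is a dict with a `status` key holding one of the listed values; that key name, the `fetch_status` callable, and `wait_for_result` itself are illustrative and not part of the CLI.

```python
import time

TERMINAL_STATES = {"success", "error"}


def wait_for_result(fetch_status, prompt_id, interval=1.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status(prompt_id) until it reports a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status(prompt_id)
        # "queued" and "running" mean the job is still in flight.
        if result.get("status") in TERMINAL_STATES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prompt {prompt_id} did not finish within {timeout}s")
        sleep(interval)
```

Passing `fetch_status` and `sleep` as parameters keeps the loop testable without a running server.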
### `upload [FILE_PATH] [OPTIONS]`
Upload a file to ComfyUI for use in workflows.
| Option | Description |
|--------|-------------|
| `--from-output` | Reuse output from a previous prompt_id as input |
| `--mask` | Upload as mask (for inpainting) |
| `--original` | Original image filename (for mask upload) |
### `cancel <PROMPT_ID>`
Cancel a running or queued job.
### `free [OPTIONS]`
Release GPU memory.
| Option | Short | Description |
|--------|-------|-------------|
| `--models` | `-m` | Unload all models from VRAM |
| `--memory` | | Free all cached memory |
## Command Groups
### `server` — Manage ComfyUI Servers
| Subcommand | Description |
|------------|-------------|
| `server list` | List all configured servers |
| `server status [SERVER_ID]` | Check if server is online |
| `server stats [SERVER_ID]` | System stats: VRAM, RAM, GPU, versions (`--all` for all servers) |
| `server add` | Add server (`--id`, `--url` required; `--name`, `--output-dir`, `--auth`, `--api-key` optional) |
| `server enable <SERVER_ID>` | Enable a server |
| `server disable <SERVER_ID>` | Disable a server |
| `server remove <SERVER_ID>` | Remove a server |
### `workflow` — Manage Workflows
| Subcommand | Description |
|------------|-------------|
| `workflow import [JSON_PATH]` | Import workflow (`--name`, `--type` image/audio/video, `--from-server`, `--preview`, `--check-deps`) |
| `workflow enable <SKILL_ID>` | Enable a workflow |
| `workflow disable <SKILL_ID>` | Disable a workflow |
| `workflow delete <SKILL_ID>` | Delete a workflow |
### `models` — Discover Models
| Subcommand | Description |
|------------|-------------|
| `models list [FOLDER]` | List models in a folder (checkpoints, loras, vae, controlnet, etc.) |
### `nodes` — Discover Nodes
| Subcommand | Description |
|------------|-------------|
| `nodes list` | List all node classes (`-c` to filter by category) |
| `nodes info <NODE_CLASS>` | Full details of a node type |
| `nodes search <QUERY>` | Fuzzy search across names/categories |
### `deps` — Dependency Management
| Subcommand | Description |
|------------|-------------|
| `deps check <SKILL_ID>` | Check if dependencies are installed (returns `is_ready`) |
| `deps install <SKILL_ID>` | Install missing deps (`--repos` git URLs, `--models`, `--all`) |
### `history` — Execution History
| Subcommand | Description |
|------------|-------------|
| `history list [SKILL_ID]` | List history (`--server`, `--status`, `--limit`, `--sort`) |
| `history show <SKILL_ID> <RUN_ID>` | Show specific run details |
### `queue` — Queue Management
| Subcommand | Description |
|------------|-------------|
| `queue list` | Show running and pending jobs |
| `queue clear` | Clear all pending jobs |
| `queue delete <PROMPT_IDS...>` | Remove specific jobs from queue |
### `logs` — Server Logs
| Subcommand | Description |
|------------|-------------|
| `logs show` | Show recent server logs (`--lines` / `-n`, default: 50) |
### `templates` — Discover Templates
| Subcommand | Description |
|------------|-------------|
| `templates list` | Workflow templates from custom nodes |
| `templates subgraphs` | Reusable subgraph components |
### `config` — Configuration
| Subcommand | Description |
|------------|-------------|
| `config export` | Export config + workflows as bundle (`--output`, `--portable-only`) |
| `config import <INPUT_PATH>` | Import bundle (`--dry-run`, `--apply-environment`, `--no-overwrite`) |
## Config File Format
Located at `<workspace>/config.json`:
```json
{
"default_server": "local",
"servers": [
{
"id": "local",
"name": "Local ComfyUI",
"url": "http://127.0.0.1:8188",
"enabled": true,
"output_dir": "./outputs",
"auth": "",
"comfy_api_key": ""
}
]
}
```
**Server fields:**
- `id` — unique identifier (no spaces/slashes/dots)
- `url` — ComfyUI base URL
- `enabled` — whether server is active
- `output_dir` — where outputs are saved (relative to workspace)
- `auth` — bearer token for authenticated servers
- `comfy_api_key` — Comfy Cloud API key (also sent as `extra_data.api_key_comfy_org` in prompts)
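
The field rules above lend themselves to a quick sanity check before a config is used. This is a sketch under stated assumptions: `check_config` is a hypothetical helper, and the requirement that `default_server` name a configured server is inferred from the example rather than documented.

```python
import re

# Server ids may not contain whitespace, slashes, or dots.
_BAD_ID_CHARS = re.compile(r"[\s/.]")


def check_config(config: dict) -> list:
    """Return a list of problems found in a parsed config.json dict."""
    problems = []
    seen = set()
    for server in config.get("servers", []):
        server_id = server.get("id", "")
        if not server_id or _BAD_ID_CHARS.search(server_id):
            problems.append(f"invalid server id: {server_id!r}")
        if server_id in seen:
            problems.append(f"duplicate server id: {server_id!r}")
        seen.add(server_id)
        if not server.get("url"):
            problems.append(f"server {server_id!r} has no url")
    default = config.get("default_server")
    # Assumption: default_server should refer to one of the listed servers.
    if default and default not in seen:
        problems.append(f"default_server {default!r} is not configured")
    return problems
```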
@@ -1,41 +0,0 @@
#!/usr/bin/env bash
# Initialize a comfyui-skill workspace directory.
# Usage: bash scripts/comfyui_setup.sh [WORKSPACE_DIR]
#
# Creates the workspace, adds a default local server config,
# and verifies the connection.
set -euo pipefail
WORKSPACE="${1:-$HOME/.hermes/comfyui}"
COMFY="${COMFY:-uvx --from comfyui-skill-cli comfyui-skill}"
echo "==> Initializing ComfyUI skill workspace at: $WORKSPACE"
mkdir -p "$WORKSPACE"
cd "$WORKSPACE"
# If config.json doesn't exist, create it with a default local server
if [ ! -f config.json ]; then
echo "==> Creating default config (local server at 127.0.0.1:8188)"
$COMFY --json server add --id local --url http://127.0.0.1:8188 --name "Local ComfyUI"
echo "==> Config created: $WORKSPACE/config.json"
else
echo "==> config.json already exists, skipping"
fi
# Verify connection
echo "==> Checking server connection..."
if $COMFY --json server status 2>/dev/null | grep -q '"online"'; then
echo "==> ComfyUI is reachable!"
$COMFY --json server stats 2>/dev/null || true
else
echo "==> ComfyUI is not reachable at the configured URL."
echo " Start ComfyUI first, or update the server URL:"
echo " cd $WORKSPACE && $COMFY server add --id local --url <YOUR_URL>"
echo ""
echo " Install ComfyUI: https://docs.comfy.org/installation"
fi
echo ""
echo "==> Workspace ready: $WORKSPACE"
echo " Always cd here before running comfyui-skill commands."
+40 -126
@@ -3230,135 +3230,49 @@ class AIAgent:
)
_SKILL_REVIEW_PROMPT = (
"Review the conversation above and update the skill library. Be "
"ACTIVE — most sessions produce at least one skill update, even if "
"small. A pass that does nothing is a missed learning opportunity, "
"not a neutral outcome.\n\n"
"Target shape of the library: CLASS-LEVEL skills, each with a rich "
"SKILL.md and a `references/` directory for session-specific detail. "
"Not a long flat list of narrow one-session-one-skill entries. This "
"shapes HOW you update, not WHETHER you update.\n\n"
"Signals to look for (any one of these warrants action):\n"
" • User corrected your style, tone, format, legibility, or "
"verbosity. Frustration signals like 'stop doing X', 'this is too "
"verbose', 'don't format like this', 'why are you explaining', "
"'just give me the answer', 'you always do Y and I hate it', or an "
"explicit 'remember this' are FIRST-CLASS skill signals, not just "
"memory signals. Update the relevant skill(s) to embed the "
"preference so the next session starts already knowing.\n"
" • User corrected your workflow, approach, or sequence of steps. "
"Encode the correction as a pitfall or explicit step in the skill "
"that governs that class of task.\n"
" • Non-trivial technique, fix, workaround, debugging path, or "
"tool-usage pattern emerged that a future session would benefit "
"from. Capture it.\n"
" • A skill that got loaded or consulted this session turned out "
"to be wrong, missing a step, or outdated. Patch it NOW.\n\n"
"Preference order — prefer the earliest action that fits, but do "
"pick one when a signal above fired:\n"
" 1. UPDATE A CURRENTLY-LOADED SKILL. Look back through the "
"conversation for skills the user loaded via /skill-name or you "
"read via skill_view. If any of them covers the territory of the "
"new learning, PATCH that one first. It is the skill that was in "
"play, so it's the right one to extend.\n"
" 2. UPDATE AN EXISTING UMBRELLA (via skills_list + skill_view). "
"If no loaded skill fits but an existing class-level skill does, "
"patch it. Add a subsection, a pitfall, or broaden a trigger.\n"
" 3. ADD A SUPPORT FILE under an existing umbrella. Skills can be "
"packaged with three kinds of support files — use the right "
"directory per kind:\n"
" • `references/<topic>.md` — session-specific detail (error "
"transcripts, reproduction recipes, provider quirks) AND "
"condensed knowledge banks: quoted research, API docs, external "
"authoritative excerpts, or domain notes you found while working "
"on the problem. Write it concise and for the value of the task, "
"not as a full mirror of upstream docs.\n"
" • `templates/<name>.<ext>` — starter files meant to be "
"copied and modified (boilerplate configs, scaffolding, a "
"known-good example the agent can `reproduce with modifications`).\n"
" • `scripts/<name>.<ext>` — statically re-runnable actions "
"the skill can invoke directly (verification scripts, fixture "
"generators, deterministic probes, anything the agent should run "
"rather than hand-type each time).\n"
" Add support files via skill_manage action=write_file with "
"file_path starting 'references/', 'templates/', or 'scripts/'. "
"The umbrella's SKILL.md should gain a one-line pointer to any "
"new support file so future agents know it exists.\n"
" 4. CREATE A NEW CLASS-LEVEL UMBRELLA SKILL when no existing "
"skill covers the class. The name MUST be at the class level. "
"The name MUST NOT be a specific PR number, error string, feature "
"codename, library-alone name, or 'fix-X / debug-Y / audit-Z-today' "
"session artifact. If the proposed name only makes sense for "
"today's task, it's wrong — fall back to (1), (2), or (3).\n\n"
"User-preference embedding (important): when the user expressed a "
"style/format/workflow preference, the update belongs in the "
"SKILL.md body, not just in memory. Memory captures 'who the user "
"is and what the current situation and state of your operations "
"are'; skills capture 'how to do this class of task for this "
"user'. When they complain about how you handled a task, the "
"skill that governs that task needs to carry the lesson.\n\n"
"If you notice two existing skills that overlap, note it in your "
"reply — the background curator handles consolidation at scale.\n\n"
"'Nothing to save.' is a real option but should NOT be the "
"default. If the session ran smoothly with no corrections and "
"produced no new technique, just say 'Nothing to save.' and stop. "
"Otherwise, act."
"Review the conversation above and consider whether a skill should be saved or updated.\n\n"
"Work in this order — do not skip steps:\n\n"
"1. SURVEY the existing skill landscape first. Call skills_list to see what you "
"have. If anything looks potentially relevant, skill_view it before deciding. "
"You are looking for the CLASS of task that just happened, not the exact task. "
"Example: a successful Tauri build is in the class \"desktop app build "
"troubleshooting\", not \"fix my specific Tauri error today\".\n\n"
"2. THINK CLASS-FIRST. What general pattern of task did the user just complete? "
"What conditions will trigger this pattern again? Describe the class in one "
"sentence before looking at what to save.\n\n"
"3. PREFER GENERALIZING AN EXISTING SKILL over creating a new one. If a skill "
"already covers the class — even partially — update it (skill_manage patch) "
"with the new insight. Broaden its \"when to use\" trigger if needed.\n\n"
"4. ONLY CREATE A NEW SKILL when no existing skill reasonably covers the class. "
"When you create one, name and scope it at the class level "
"(\"react-i18n-setup\", not \"add-i18n-to-my-dashboard-app\"). The trigger "
"section must describe the class of situations, not this one session.\n\n"
"5. If you notice two existing skills that overlap, note it in your response "
"so a future review can consolidate them. Do not consolidate now unless the "
"overlap is obvious and low-risk.\n\n"
"Only act when something is genuinely worth saving. "
"If nothing stands out, just say 'Nothing to save.' and stop."
)
_COMBINED_REVIEW_PROMPT = (
"Review the conversation above and update two things:\n\n"
"**Memory**: who the user is. Did the user reveal persona, "
"desires, preferences, personal details, or expectations about "
"how you should behave? Save facts about the user and durable "
"preferences with the memory tool.\n\n"
"**Skills**: how to do this class of task. Be ACTIVE — most "
"sessions produce at least one skill update. A pass that does "
"nothing is a missed learning opportunity, not a neutral outcome.\n\n"
"Target shape of the skill library: CLASS-LEVEL skills with a rich "
"SKILL.md and a `references/` directory for session-specific detail. "
"Not a long flat list of narrow one-session-one-skill entries.\n\n"
"Signals that warrant a skill update (any one is enough):\n"
" • User corrected your style, tone, format, legibility, "
"verbosity, or approach. Frustration is a FIRST-CLASS skill "
"signal, not just a memory signal. 'stop doing X', 'don't format "
"like this', 'I hate when you Y' — embed the lesson in the skill "
"that governs that task so the next session starts fixed.\n"
" • Non-trivial technique, fix, workaround, or debugging path "
"emerged.\n"
" • A skill that was loaded or consulted turned out wrong, "
"missing, or outdated — patch it now.\n\n"
"Preference order for skills — pick the earliest that fits:\n"
" 1. UPDATE A CURRENTLY-LOADED SKILL. Check what skills were "
"loaded via /skill-name or skill_view in the conversation. If one "
"of them covers the learning, PATCH it first. It was in play; "
"it's the right place.\n"
" 2. UPDATE AN EXISTING UMBRELLA (skills_list + skill_view to "
"find the right one). Patch it.\n"
" 3. ADD A SUPPORT FILE under an existing umbrella via "
"skill_manage action=write_file. Three kinds: "
"`references/<topic>.md` for session-specific detail OR condensed "
"knowledge banks (quoted research, API docs excerpts, domain "
"notes) written concise and task-focused; `templates/<name>.<ext>` "
"for starter files meant to be copied and modified; "
"`scripts/<name>.<ext>` for statically re-runnable actions "
"(verification, fixture generators, probes). Add a one-line "
"pointer in SKILL.md so future agents find them.\n"
" 4. CREATE A NEW CLASS-LEVEL UMBRELLA when nothing exists. "
"Name at the class level — NOT a PR number, error string, "
"codename, library-alone name, or 'fix-X / debug-Y' session "
"artifact. If the name only fits today's task, fall back to (1), "
"(2), or (3).\n\n"
"User-preference embedding: when the user complains about how "
"you handled a task, update the skill that governs that task — "
"memory alone isn't enough. Memory says 'who the user is and "
"what the current situation and state of your operations are'; "
"skills say 'how to do this class of task for this user'. Both "
"should carry user-preference lessons when relevant.\n\n"
"If you notice overlapping existing skills, mention it — the "
"background curator handles consolidation.\n\n"
"Act on whichever of the two dimensions has real signal. If "
"genuinely nothing stands out on either, say 'Nothing to save.' "
"and stop — but don't reach for that conclusion as a default."
"Review the conversation above and consider two things:\n\n"
"**Memory**: Has the user revealed things about themselves — their persona, "
"desires, preferences, or personal details? Has the user expressed expectations "
"about how you should behave, their work style, or ways they want you to operate? "
"If so, save using the memory tool.\n\n"
"**Skills**: Was a non-trivial approach used to complete a task that required trial "
"and error, changing course due to experiential findings, or a different method "
"or outcome than the user expected? If so, work in this order:\n"
" a. SURVEY existing skills first (skills_list, then skill_view on candidates).\n"
" b. Identify the CLASS of task, not the specific task "
"(\"desktop app build troubleshooting\", not \"fix my Tauri error\").\n"
" c. PREFER UPDATING/GENERALIZING an existing skill that covers the class.\n"
" d. ONLY CREATE A NEW SKILL if no existing one covers the class. Scope at "
"the class level, not this one session.\n"
" e. If you notice overlapping skills during the survey, note it so a future "
"review can consolidate them.\n\n"
"Only act if there's something genuinely worth saving. "
"If nothing stands out, just say 'Nothing to save.' and stop."
)
@staticmethod
-3
@@ -84,7 +84,6 @@ AUTHOR_MAP = {
"6548898+romanornr@users.noreply.github.com": "romanornr",
"foxion37@gmail.com": "foxion37",
"bloodcarter@gmail.com": "bloodcarter",
"scott@scotttrinh.com": "scotttrinh",
# contributors (from noreply pattern)
"david.vv@icloud.com": "davidvv",
"wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
@@ -261,7 +260,6 @@ AUTHOR_MAP = {
"154585401+LeonSGP43@users.noreply.github.com": "LeonSGP43",
"mgparkprint@gmail.com": "vlwkaos",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"wangshengyang2004@163.com": "Wangshengyang2004",
"hasan.ali13381@gmail.com": "H-Ali13381",
"xienb@proton.me": "XieNBi",
@@ -413,7 +411,6 @@ AUTHOR_MAP = {
"tesseracttars@gmail.com": "tesseracttars-creator",
"tianliangjay@gmail.com": "xingkongliang",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"unayung@gmail.com": "Unayung",
"vorvul.danylo@gmail.com": "WorldInnovationsDepartment",
"win4r@outlook.com": "win4r",
-480
@@ -1,480 +0,0 @@
"""Tests for agent/curator.py — orchestrator, idle gating, state transitions.
LLM spawning is never exercised here; `_run_llm_review` is monkeypatched so
tests run fully offline and the curator module doesn't need real credentials.
"""
from __future__ import annotations
import importlib
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path
import pytest
@pytest.fixture
def curator_env(tmp_path, monkeypatch):
"""Isolated HERMES_HOME + freshly reloaded curator + skill_usage modules."""
home = tmp_path / ".hermes"
(home / "skills").mkdir(parents=True)
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
import tools.skill_usage as usage
importlib.reload(usage)
import agent.curator as curator
importlib.reload(curator)
# Neutralize the real LLM pass by default — tests opt in per-case.
monkeypatch.setattr(curator, "_run_llm_review", lambda prompt: "llm-stub")
# Default: no config file → curator defaults. Tests can override.
monkeypatch.setattr(curator, "_load_config", lambda: {})
return {"home": home, "curator": curator, "usage": usage}
def _write_skill(skills_dir: Path, name: str):
d = skills_dir / name
d.mkdir(parents=True, exist_ok=True)
(d / "SKILL.md").write_text(
f"---\nname: {name}\ndescription: x\n---\n", encoding="utf-8",
)
return d
# ---------------------------------------------------------------------------
# Config gates
# ---------------------------------------------------------------------------
def test_curator_enabled_default_true(curator_env):
assert curator_env["curator"].is_enabled() is True
def test_curator_disabled_via_config(curator_env, monkeypatch):
c = curator_env["curator"]
monkeypatch.setattr(c, "_load_config", lambda: {"enabled": False})
assert c.is_enabled() is False
assert c.should_run_now() is False
def test_curator_defaults(curator_env):
c = curator_env["curator"]
assert c.get_interval_hours() == 24 * 7 # 7 days
assert c.get_min_idle_hours() == 2
assert c.get_stale_after_days() == 30
assert c.get_archive_after_days() == 90
def test_curator_config_overrides(curator_env, monkeypatch):
c = curator_env["curator"]
monkeypatch.setattr(c, "_load_config", lambda: {
"interval_hours": 12,
"min_idle_hours": 0.5,
"stale_after_days": 7,
"archive_after_days": 60,
})
assert c.get_interval_hours() == 12
assert c.get_min_idle_hours() == 0.5
assert c.get_stale_after_days() == 7
assert c.get_archive_after_days() == 60
# ---------------------------------------------------------------------------
# should_run_now
# ---------------------------------------------------------------------------
def test_first_run_always_eligible(curator_env):
c = curator_env["curator"]
assert c.should_run_now() is True
def test_recent_run_blocks(curator_env):
c = curator_env["curator"]
c.save_state({
"last_run_at": datetime.now(timezone.utc).isoformat(),
"paused": False,
})
assert c.should_run_now() is False
def test_old_run_eligible(curator_env):
"""A run older than the configured interval should re-trigger. Use a
2x-interval cushion so the test doesn't become coupled to the exact
default; bumping DEFAULT_INTERVAL_HOURS shouldn't break it."""
c = curator_env["curator"]
long_ago = datetime.now(timezone.utc) - timedelta(
hours=c.get_interval_hours() * 2
)
c.save_state({"last_run_at": long_ago.isoformat(), "paused": False})
assert c.should_run_now() is True
def test_paused_blocks_even_if_stale(curator_env):
c = curator_env["curator"]
long_ago = datetime.now(timezone.utc) - timedelta(days=30)
c.save_state({"last_run_at": long_ago.isoformat(), "paused": True})
assert c.should_run_now() is False
def test_set_paused_roundtrip(curator_env):
c = curator_env["curator"]
c.set_paused(True)
assert c.is_paused() is True
c.set_paused(False)
assert c.is_paused() is False
# ---------------------------------------------------------------------------
# Automatic state transitions
# ---------------------------------------------------------------------------
def test_unused_skill_transitions_to_stale(curator_env):
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "old-skill")
# Record last-use well past stale_after_days (30 default)
long_ago = (datetime.now(timezone.utc) - timedelta(days=45)).isoformat()
data = u.load_usage()
data["old-skill"] = u._empty_record()
data["old-skill"]["last_used_at"] = long_ago
data["old-skill"]["created_at"] = long_ago
u.save_usage(data)
counts = c.apply_automatic_transitions()
assert counts["marked_stale"] == 1
assert u.get_record("old-skill")["state"] == "stale"
def test_very_old_skill_gets_archived(curator_env):
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
skill_dir = _write_skill(skills_dir, "ancient")
super_old = (datetime.now(timezone.utc) - timedelta(days=120)).isoformat()
data = u.load_usage()
data["ancient"] = u._empty_record()
data["ancient"]["last_used_at"] = super_old
data["ancient"]["created_at"] = super_old
u.save_usage(data)
counts = c.apply_automatic_transitions()
assert counts["archived"] == 1
assert not skill_dir.exists()
assert (skills_dir / ".archive" / "ancient" / "SKILL.md").exists()
assert u.get_record("ancient")["state"] == "archived"
def test_pinned_skill_is_never_touched(curator_env):
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "precious")
super_old = (datetime.now(timezone.utc) - timedelta(days=365)).isoformat()
data = u.load_usage()
data["precious"] = u._empty_record()
data["precious"]["last_used_at"] = super_old
data["precious"]["created_at"] = super_old
data["precious"]["pinned"] = True
u.save_usage(data)
counts = c.apply_automatic_transitions()
assert counts["archived"] == 0
assert counts["marked_stale"] == 0
rec = u.get_record("precious")
assert rec["state"] == "active" # untouched
assert rec["pinned"] is True
def test_stale_skill_reactivates_on_recent_use(curator_env):
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "revived")
recent = datetime.now(timezone.utc).isoformat()
data = u.load_usage()
data["revived"] = u._empty_record()
data["revived"]["state"] = "stale"
data["revived"]["last_used_at"] = recent
data["revived"]["created_at"] = recent
u.save_usage(data)
counts = c.apply_automatic_transitions()
assert counts["reactivated"] == 1
assert u.get_record("revived")["state"] == "active"
def test_new_skill_without_last_used_not_immediately_archived(curator_env):
"""A freshly-created skill with no use history should not get archived
just because last_used_at is None."""
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "fresh")
# Bump nothing — record doesn't exist yet. Curator should create it
# and fall back to created_at which is ~now.
counts = c.apply_automatic_transitions()
assert counts["archived"] == 0
assert counts["marked_stale"] == 0
assert (skills_dir / "fresh").exists()
def test_bundled_skill_not_touched_by_transitions(curator_env):
c = curator_env["curator"]
u = curator_env["usage"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "bundled")
(skills_dir / ".bundled_manifest").write_text(
"bundled:abc\n", encoding="utf-8",
)
super_old = (datetime.now(timezone.utc) - timedelta(days=500)).isoformat()
data = u.load_usage()
data["bundled"] = u._empty_record()
data["bundled"]["last_used_at"] = super_old
u.save_usage(data)
counts = c.apply_automatic_transitions()
# bundled skills are excluded from the agent-created list entirely
assert counts["checked"] == 0
assert (skills_dir / "bundled").exists() # never moved
# ---------------------------------------------------------------------------
# run_curator_review orchestration
# ---------------------------------------------------------------------------
def test_run_review_records_state(curator_env):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
result = c.run_curator_review(synchronous=True)
assert "started_at" in result
state = c.load_state()
assert state["last_run_at"] is not None
assert state["run_count"] >= 1
assert state["last_run_summary"] is not None
def test_run_review_synchronous_invokes_llm_stub(curator_env, monkeypatch):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
calls = []
monkeypatch.setattr(
c, "_run_llm_review",
lambda prompt: (calls.append(prompt), "stubbed-summary")[1],
)
captured = []
c.run_curator_review(on_summary=lambda s: captured.append(s), synchronous=True)
assert len(calls) == 1
assert "skill CURATOR" in calls[0] or "CURATOR" in calls[0]
assert captured # on_summary was called
assert any("stubbed-summary" in s for s in captured)
def test_run_review_skips_llm_when_no_candidates(curator_env, monkeypatch):
c = curator_env["curator"]
# No skills in the dir → no candidates
calls = []
monkeypatch.setattr(
c, "_run_llm_review",
lambda prompt: (calls.append(prompt), "never-called")[1],
)
captured = []
c.run_curator_review(on_summary=lambda s: captured.append(s), synchronous=True)
assert calls == [] # LLM not invoked
assert any("skipped" in s for s in captured)
def test_maybe_run_curator_respects_disabled(curator_env, monkeypatch):
c = curator_env["curator"]
monkeypatch.setattr(c, "_load_config", lambda: {"enabled": False})
result = c.maybe_run_curator()
assert result is None
def test_maybe_run_curator_enforces_idle_gate(curator_env, monkeypatch):
c = curator_env["curator"]
monkeypatch.setattr(c, "_load_config", lambda: {"min_idle_hours": 2})
# idle less than the threshold
result = c.maybe_run_curator(idle_for_seconds=60.0)
assert result is None
def test_maybe_run_curator_runs_when_eligible(curator_env, monkeypatch):
c = curator_env["curator"]
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "a")
# Force idle over threshold
result = c.maybe_run_curator(idle_for_seconds=99999.0)
assert result is not None
assert "started_at" in result
def test_maybe_run_curator_swallows_exceptions(curator_env, monkeypatch):
c = curator_env["curator"]
def explode():
raise RuntimeError("boom")
monkeypatch.setattr(c, "should_run_now", explode)
# Must not raise
assert c.maybe_run_curator() is None
# ---------------------------------------------------------------------------
# Persistence
# ---------------------------------------------------------------------------
def test_state_file_survives_corrupt_read(curator_env):
c = curator_env["curator"]
c._state_file().write_text("not json", encoding="utf-8")
# Must fall back to default, not raise
assert c.load_state() == c._default_state()
def test_state_atomic_write_no_tmp_leftovers(curator_env):
c = curator_env["curator"]
c.save_state({"paused": True})
parent = c._state_file().parent
for p in parent.iterdir():
assert not p.name.startswith(".curator_state_"), f"tmp leftover: {p.name}"
def test_curator_review_prompt_has_invariants():
"""Core invariants must be in the review prompt text."""
from agent.curator import CURATOR_REVIEW_PROMPT
assert "MUST NOT" in CURATOR_REVIEW_PROMPT or "DO NOT" in CURATOR_REVIEW_PROMPT
assert "bundled" in CURATOR_REVIEW_PROMPT.lower()
assert "delete" in CURATOR_REVIEW_PROMPT.lower()
assert "pinned" in CURATOR_REVIEW_PROMPT.lower()
# Must describe the actions the reviewer can take. The exact vocabulary
# has tightened over time (the umbrella-first prompt drops 'keep' as a
# first-class decision verb, since passive keep-everything is the
# failure mode the prompt is trying to avoid), but the core merge /
# archive / patch trio must remain callable.
for verb in ("patch", "archive"):
assert verb in CURATOR_REVIEW_PROMPT.lower()
# Must mention consolidation (possibly via "merge" or "consolidat")
assert "consolidat" in CURATOR_REVIEW_PROMPT.lower() or "merge" in CURATOR_REVIEW_PROMPT.lower()
def test_curator_review_prompt_points_at_existing_tools_only():
"""The review prompt must rely on existing tools (skill_manage + terminal)
and must NOT reference bespoke curator tools that are not registered
model tools."""
from agent.curator import CURATOR_REVIEW_PROMPT
assert "skill_manage" in CURATOR_REVIEW_PROMPT
assert "skills_list" in CURATOR_REVIEW_PROMPT
assert "skill_view" in CURATOR_REVIEW_PROMPT
assert "terminal" in CURATOR_REVIEW_PROMPT.lower()
# These would be nice but aren't actually registered as tools — the
# curator uses skill_manage + terminal mv instead.
assert "archive_skill" not in CURATOR_REVIEW_PROMPT
assert "pin_skill" not in CURATOR_REVIEW_PROMPT
def test_curator_does_not_instruct_model_to_pin():
"""Pinning is a user opt-out, not a model decision. The prompt should
not tell the reviewer to pin skills autonomously."""
from agent.curator import CURATOR_REVIEW_PROMPT
# "pinned" appears in the invariant ("skip pinned skills"), but "pin"
# as a decision verb should not.
lines = CURATOR_REVIEW_PROMPT.split("\n")
decision_block = "\n".join(
l for l in lines
if l.strip().startswith(("keep", "patch", "archive", "consolidate", "pin "))
)
# No standalone "pin" action line
assert not any(l.strip().startswith("pin ") for l in lines), (
f"Found a pin action line in:\n{decision_block}"
)
def test_curator_review_prompt_is_umbrella_first():
"""The curator prompt must push umbrella-building / class-level thinking,
not pair-level 'are these two the same?' analysis."""
from agent.curator import CURATOR_REVIEW_PROMPT
lower = CURATOR_REVIEW_PROMPT.lower()
# Must frame the task as active umbrella-building, not a passive audit.
assert "umbrella" in lower, (
"must use UMBRELLA framing — the class-first abstraction the curator "
"is designed to produce"
)
# Must tell the reviewer not to stop at pair-level distinctness.
assert "class" in lower, "must reference class-level thinking"
# Must cover the three consolidation methods explicitly
assert "references/" in CURATOR_REVIEW_PROMPT, (
"must name references/ as a demotion target for session-specific content"
)
# templates/ and scripts/ make the umbrella a real class-level skill
assert "templates/" in CURATOR_REVIEW_PROMPT
assert "scripts/" in CURATOR_REVIEW_PROMPT
# Must state the counter-argument: usage=0 is not a reason to skip
assert "use_count" in CURATOR_REVIEW_PROMPT or "counter" in lower, (
"must pre-empt the 'usage counters are zero, I can't judge' bailout"
)
def test_curator_review_prompt_offers_support_file_actions():
"""Support-file demotion (references/templates/scripts) must be one of
the three consolidation methods, alongside merge-into-existing and
create-new-umbrella."""
from agent.curator import CURATOR_REVIEW_PROMPT
# skill_manage action=write_file is how references/ are added to an
# existing skill — this is the create-adjacent action the curator needs
# to demote narrow siblings without touching their SKILL.md.
assert "write_file" in CURATOR_REVIEW_PROMPT
# Must offer creating a brand-new umbrella when no existing one fits
assert "action=create" in CURATOR_REVIEW_PROMPT or "create a new umbrella" in CURATOR_REVIEW_PROMPT.lower()
def test_cli_unpin_refuses_bundled_skill(curator_env, capsys):
"""hermes curator unpin must refuse bundled/hub skills too (matches pin)."""
from hermes_cli import curator as cli
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "ship-skill")
(skills_dir / ".bundled_manifest").write_text(
"ship-skill:abc\n", encoding="utf-8",
)
class _A:
skill = "ship-skill"
rc = cli._cmd_unpin(_A())
captured = capsys.readouterr()
assert rc == 1
assert "bundled" in captured.out.lower() or "hub" in captured.out.lower()
def test_cli_pin_refuses_bundled_skill(curator_env, capsys):
from hermes_cli import curator as cli
skills_dir = curator_env["home"] / "skills"
_write_skill(skills_dir, "ship-skill")
(skills_dir / ".bundled_manifest").write_text(
"ship-skill:abc\n", encoding="utf-8",
)
class _A:
skill = "ship-skill"
rc = cli._cmd_pin(_A())
captured = capsys.readouterr()
assert rc == 1
assert "bundled" in captured.out.lower() or "hub" in captured.out.lower()
+5
@@ -82,6 +82,11 @@ class TestBuildToolPreview:
result = build_tool_preview("memory", {"action": "add", "target": "user", "content": "test note"})
assert result is not None
assert "user" in result
assert "\n" not in result
def test_memory_tool_add_without_target_stays_one_line(self):
result = build_tool_preview("memory", {"action": "add", "content": "User identifies as a cutie patootie."})
assert result == '+"User identifies as a cuti..."'
def test_memory_replace_missing_old_text_marked(self):
# Avoid empty quotes "" in the preview when old_text is missing/None.
+5 -49
@@ -1,11 +1,9 @@
"""Tests for CLI browser CDP auto-launch helpers."""
import os
import subprocess
from unittest.mock import patch
from cli import HermesCLI
from hermes_cli.browser_connect import manual_chrome_debug_command
def _assert_chrome_debug_cmd(cmd, expected_chrome, expected_port):
@@ -28,19 +26,13 @@ class TestChromeDebugLaunch:
captured["kwargs"] = kwargs
return object()
with patch("hermes_cli.browser_connect.shutil.which", side_effect=lambda name: r"C:\Chrome\chrome.exe" if name == "chrome.exe" else None), \
patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == r"C:\Chrome\chrome.exe"), \
with patch("cli.shutil.which", side_effect=lambda name: r"C:\Chrome\chrome.exe" if name == "chrome.exe" else None), \
patch("cli.os.path.isfile", side_effect=lambda path: path == r"C:\Chrome\chrome.exe"), \
patch("subprocess.Popen", side_effect=fake_popen):
assert HermesCLI._try_launch_chrome_debug(9333, "Windows") is True
_assert_chrome_debug_cmd(captured["cmd"], r"C:\Chrome\chrome.exe", 9333)
# Windows uses creationflags (POSIX-only start_new_session would raise).
assert "start_new_session" not in captured["kwargs"]
flags = captured["kwargs"].get("creationflags", 0)
expected = getattr(subprocess, "DETACHED_PROCESS", 0) | getattr(
subprocess, "CREATE_NEW_PROCESS_GROUP", 0
)
assert flags == expected
assert captured["kwargs"]["start_new_session"] is True
def test_windows_launch_falls_back_to_common_install_dirs(self, monkeypatch):
captured = {}
@@ -57,45 +49,9 @@ class TestChromeDebugLaunch:
monkeypatch.delenv("ProgramFiles(x86)", raising=False)
monkeypatch.delenv("LOCALAPPDATA", raising=False)
with patch("hermes_cli.browser_connect.shutil.which", return_value=None), \
patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == installed), \
with patch("cli.shutil.which", return_value=None), \
patch("cli.os.path.isfile", side_effect=lambda path: path == installed), \
patch("subprocess.Popen", side_effect=fake_popen):
assert HermesCLI._try_launch_chrome_debug(9222, "Windows") is True
_assert_chrome_debug_cmd(captured["cmd"], installed, 9222)
def test_manual_command_uses_detected_linux_browser(self):
with patch("hermes_cli.browser_connect.shutil.which", side_effect=lambda name: "/usr/bin/chromium" if name == "chromium" else None), \
patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == "/usr/bin/chromium"):
command = manual_chrome_debug_command(9222, "Linux")
assert command is not None
assert command.startswith("/usr/bin/chromium --remote-debugging-port=9222")
def test_manual_command_uses_wsl_windows_chrome_when_available(self):
chrome = "/mnt/c/Program Files/Google/Chrome/Application/chrome.exe"
with patch("hermes_cli.browser_connect.shutil.which", return_value=None), \
patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == chrome):
command = manual_chrome_debug_command(9222, "Linux")
assert command is not None
# Linux/WSL uses POSIX shell quoting (single quotes around paths with spaces).
assert command.startswith(f"'{chrome}' --remote-debugging-port=9222")
def test_manual_command_uses_windows_quoting_on_windows(self):
chrome = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
with patch("hermes_cli.browser_connect.shutil.which", side_effect=lambda name: chrome if name == "chrome.exe" else None), \
patch("hermes_cli.browser_connect.os.path.isfile", side_effect=lambda path: path == chrome):
command = manual_chrome_debug_command(9222, "Windows")
assert command is not None
# Windows uses cmd.exe-compatible quoting via subprocess.list2cmdline.
assert command.startswith(f'"{chrome}" --remote-debugging-port=9222')
assert "'" not in command
def test_manual_command_returns_none_when_linux_browser_missing(self):
with patch("hermes_cli.browser_connect.shutil.which", return_value=None), \
patch("hermes_cli.browser_connect.os.path.isfile", return_value=False):
assert manual_chrome_debug_command(9222, "Linux") is None
+2 -2
@@ -40,14 +40,14 @@ class TestCliSkinPromptIntegration:
cli = _make_cli_stub()
set_active_skin("ares")
assert cli._get_tui_prompt_fragments() == [("class:prompt", "")]
assert cli._get_tui_prompt_fragments() == [("class:prompt", " ")]
def test_secret_prompt_fragments_preserve_secret_state(self):
cli = _make_cli_stub()
cli._secret_state = {"response_queue": object()}
set_active_skin("ares")
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ")]
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ")]
def test_narrow_terminals_compact_voice_prompt_fragments(self):
cli = _make_cli_stub()
-26
@@ -480,29 +480,3 @@ def _enforce_test_timeout():
yield
signal.alarm(0)
signal.signal(signal.SIGALRM, old)
@pytest.fixture(autouse=True)
def _reset_tool_registry_caches():
"""Clear tool-registry-level caches between tests.
The production registry caches ``check_fn()`` results for 30 s
(see tools/registry.py) and :func:`get_tool_definitions` memoizes
its result (see model_tools.py). Both are keyed on state that tests
routinely mutate (env vars, registry._generation, config.yaml mtime)
but a stale result from test A can still be served to test B
because 30 s covers the entire suite, and xdist worker reuse means
one test's cache lands in another's process. Clearing before every
test keeps hermetic behavior.
"""
try:
from tools.registry import invalidate_check_fn_cache
invalidate_check_fn_cache()
except ImportError:
pass
try:
from model_tools import _clear_tool_defs_cache
_clear_tool_defs_cache()
except ImportError:
pass
yield
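The removed fixture docstring describes a time-based cache that can leak a stale result from one test into the next. A minimal sketch of that failure mode and of the reset follows. The names `cached_check` and `invalidate_cache` are hypothetical stand-ins, not the functions in tools/registry.py, and the 30 s figure is taken from the docstring.

```python
import time
from typing import Callable, Dict, Tuple

_TTL_SECONDS = 30.0  # the 30 s window quoted in the docstring
_cache: Dict[str, Tuple[float, bool]] = {}


def cached_check(name: str, check_fn: Callable[[], bool]) -> bool:
    """Return check_fn()'s result, reusing a cached value within the TTL."""
    now = time.monotonic()
    hit = _cache.get(name)
    if hit is not None and now - hit[0] < _TTL_SECONDS:
        return hit[1]  # may be stale if the underlying state changed
    result = bool(check_fn())
    _cache[name] = (now, result)
    return result


def invalidate_cache() -> None:
    """Drop every cached result so the next call re-runs check_fn."""
    _cache.clear()
```

Under these assumptions, a test that flips the underlying state would still see the old answer until `invalidate_cache()` runs, which is the situation an autouse reset fixture is meant to prevent.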
-35
@@ -41,10 +41,6 @@ def _simulate_config_bridge(cfg: dict, initial_env: dict | None = None):
# TERMINAL_CWD. Mirrors the fix in gateway/run.py.
if cfg_key == "cwd" and str(val) in (".", "auto", "cwd"):
continue
# Expand shell tilde so subprocess.Popen never receives a literal
# "~/" which the kernel rejects.
if cfg_key == "cwd" and isinstance(val, str):
val = os.path.expanduser(val)
if isinstance(val, list):
env[env_var] = json.dumps(val)
else:
@@ -59,8 +55,6 @@ def _simulate_config_bridge(cfg: dict, initial_env: dict | None = None):
if alias_env not in env:
alias_val = cfg.get(alias_key)
if isinstance(alias_val, str) and alias_val.strip():
if alias_key == "cwd":
alias_val = os.path.expanduser(alias_val)
env[alias_env] = alias_val.strip()
# --- Replicate lines 144-147: MESSAGING_CWD fallback ---
@@ -211,32 +205,3 @@ class TestNestedTerminalCwdPlaceholderSkip:
assert result["TERMINAL_ENV"] == "docker"
assert result["TERMINAL_TIMEOUT"] == "300"
assert result["TERMINAL_CWD"] == "/from/env"
class TestTildeExpansion:
"""terminal.cwd values containing shell tilde must be expanded.
subprocess.Popen does not expand shell syntax, so a literal "~/"
causes FileNotFoundError. Regression test for commit 3c42064e.
"""
def test_terminal_cwd_tilde_expanded(self):
"""terminal.cwd: '~/projects' should expand to /home/<user>/projects."""
cfg = {"terminal": {"cwd": "~/projects"}}
result = _simulate_config_bridge(cfg)
assert result["TERMINAL_CWD"] == os.path.expanduser("~/projects")
def test_top_level_cwd_tilde_expanded(self):
"""top-level cwd: '~/' should expand to user's home directory."""
cfg = {"cwd": "~/"}
result = _simulate_config_bridge(cfg)
assert result["TERMINAL_CWD"] == os.path.expanduser("~/")
def test_tilde_with_nested_precedence(self):
"""Nested terminal.cwd should win over top-level, both expanded."""
cfg = {
"cwd": "~/top",
"terminal": {"cwd": "~/nested"},
}
result = _simulate_config_bridge(cfg)
assert result["TERMINAL_CWD"] == os.path.expanduser("~/nested")
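The removed TestTildeExpansion docstring rests on the fact that subprocess.Popen hands `cwd` to the operating system without shell expansion. A minimal standard-library sketch of that behavior follows. The helpers `expand_cwd` and `can_start_in` are hypothetical and are not part of the config bridge.

```python
import os
import subprocess
import sys


def expand_cwd(value: str) -> str:
    """Expand a leading '~' the way a shell would; other paths pass through."""
    return os.path.expanduser(value)


def can_start_in(cwd: str) -> bool:
    """Return True if a trivial child process can be started in cwd."""
    try:
        subprocess.run([sys.executable, "-c", "pass"], cwd=cwd, check=True)
    except OSError:
        # A missing directory surfaces as FileNotFoundError, an OSError subclass.
        return False
    return True
```

With these helpers, `can_start_in("~/")` should be False unless a directory literally named `~` exists in the current directory, while `can_start_in(expand_cwd("~/"))` should be True whenever the home directory exists.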
-156
@@ -1337,159 +1337,3 @@ class TestCursorStrippingOnFallback:
assert consumer._already_sent is True
# _last_sent_text must NOT be updated when the edit failed
assert consumer._last_sent_text == "Hello ▉"
# ── on_new_message callback (tool-progress linearization) ─────────────
class TestOnNewMessageCallback:
"""The on_new_message callback fires whenever a fresh content bubble
lands on the platform. Gateway uses this to close off the current
tool-progress bubble so the next tool.started opens a new bubble
below the content, preserving chronological order in the chat.
Before this callback existed (post PR #7885), content messages got
their own bubbles after segment breaks, but the tool-progress task
kept editing the ORIGINAL progress bubble above all new content.
Result: tool lines appeared stacked in the upper bubble while
content messages lined up below, making the timeline look scrambled.
"""
@pytest.mark.asyncio
async def test_callback_fires_on_first_send(self):
"""First-send of a new content bubble fires on_new_message."""
adapter = MagicMock()
adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg_1"))
adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
adapter.MAX_MESSAGE_LENGTH = 4096
events = []
config = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=1)
consumer = GatewayStreamConsumer(
adapter, "chat", config,
on_new_message=lambda: events.append("reset"),
)
consumer.on_delta("Hello")
consumer.finish()
await consumer.run()
assert events == ["reset"]
@pytest.mark.asyncio
async def test_callback_fires_once_per_segment(self):
"""A new first-send fires the callback again after segment break."""
adapter = MagicMock()
msg_counter = iter(["msg_1", "msg_2", "msg_3"])
adapter.send = AsyncMock(
side_effect=lambda **kw: SimpleNamespace(success=True, message_id=next(msg_counter))
)
adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
adapter.MAX_MESSAGE_LENGTH = 4096
events = []
config = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=1)
consumer = GatewayStreamConsumer(
adapter, "chat", config,
on_new_message=lambda: events.append("reset"),
)
consumer.on_delta("A")
consumer.on_delta(None)
consumer.on_delta("B")
consumer.on_delta(None)
consumer.on_delta("C")
consumer.finish()
await consumer.run()
# Three content bubbles ⇒ three reset notifications
assert events == ["reset", "reset", "reset"]
@pytest.mark.asyncio
async def test_callback_not_fired_on_edit(self):
"""Subsequent edits of the same bubble do NOT fire the callback."""
adapter = MagicMock()
adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg_1"))
adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
adapter.MAX_MESSAGE_LENGTH = 4096
events = []
config = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=1)
consumer = GatewayStreamConsumer(
adapter, "chat", config,
on_new_message=lambda: events.append("reset"),
)
consumer.on_delta("Hello")
task = asyncio.create_task(consumer.run())
await asyncio.sleep(0.05)
consumer.on_delta(" world")
await asyncio.sleep(0.05)
consumer.on_delta(" more")
await asyncio.sleep(0.05)
consumer.finish()
await task
# Only one first-send happened; edits do not re-fire.
assert events == ["reset"]
@pytest.mark.asyncio
async def test_callback_fires_on_commentary(self):
"""Commentary messages are fresh bubbles too — fire the callback."""
adapter = MagicMock()
adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg_1"))
adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
adapter.MAX_MESSAGE_LENGTH = 4096
events = []
config = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=1)
consumer = GatewayStreamConsumer(
adapter, "chat", config,
on_new_message=lambda: events.append("reset"),
)
consumer.on_commentary("I'll search for that first.")
consumer.finish()
await consumer.run()
assert events == ["reset"]
@pytest.mark.asyncio
async def test_callback_error_swallowed(self):
"""Exceptions in the callback do not crash the consumer."""
adapter = MagicMock()
adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg_1"))
adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
adapter.MAX_MESSAGE_LENGTH = 4096
def raiser():
raise RuntimeError("boom")
config = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=1)
consumer = GatewayStreamConsumer(
adapter, "chat", config,
on_new_message=raiser,
)
consumer.on_delta("Hello")
consumer.finish()
await consumer.run() # must not raise
assert consumer.already_sent is True
@pytest.mark.asyncio
async def test_no_callback_when_none(self):
"""Consumer works correctly when on_new_message is None (default)."""
adapter = MagicMock()
adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg_1"))
adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
adapter.MAX_MESSAGE_LENGTH = 4096
config = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=1)
consumer = GatewayStreamConsumer(adapter, "chat", config) # no callback
consumer.on_delta("Hello")
consumer.finish()
await consumer.run()
assert consumer.already_sent is True
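The removed tests pin down four behaviors: the callback fires on the first send of a bubble, fires again after a segment break, does not fire on edits, and cannot crash delivery. A minimal sketch of that contract follows. `BubbleTracker` is a hypothetical stand-in and does not reproduce GatewayStreamConsumer's real internals.

```python
from typing import Callable, Optional


class BubbleTracker:
    """Hypothetical model of the first-send / edit bookkeeping."""

    def __init__(self, on_new_message: Optional[Callable[[], None]] = None) -> None:
        self._on_new_message = on_new_message
        self._message_id: Optional[str] = None
        self._next_id = 0

    def deliver(self, text: str) -> str:
        """Send a new bubble if none is open, otherwise edit the open one."""
        if self._message_id is None:
            self._next_id += 1
            self._message_id = f"msg_{self._next_id}"
            self._notify()
            return "send"
        return "edit"

    def segment_break(self) -> None:
        """Close the open bubble so the next delivery starts a fresh one."""
        self._message_id = None

    def _notify(self) -> None:
        if self._on_new_message is None:
            return
        try:
            self._on_new_message()
        except Exception:
            # A failing callback must not interrupt delivery.
            pass
```

In this sketch the callback is tied to the send path only, so repeated edits of the same bubble never re-fire it.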
-17
@@ -319,23 +319,6 @@ class TestSanitizeEnvLines:
assert result[0].startswith("OPENROUTER_API_KEY=")
assert result[1].startswith("OPENAI_BASE_URL=")
def test_glm_suffix_collision_not_split(self):
"""GLM_API_KEY / GLM_BASE_URL must not be mangled by LM_API_KEY / LM_BASE_URL suffixes (#17138)."""
lines = [
"GLM_API_KEY=glm-secret\n",
"GLM_BASE_URL=https://api.z.ai/api/paas/v4\n",
]
result = _sanitize_env_lines(lines)
assert result == lines, f"GLM_* lines were corrupted by suffix collision: {result}"
def test_suffix_collision_does_not_break_real_concatenation(self):
"""A genuine concatenation that happens to start with a suffix-superset key still splits."""
lines = ["GLM_API_KEY=glmLM_API_KEY=lm-key\n"]
result = _sanitize_env_lines(lines)
assert len(result) == 2
assert result[0].startswith("GLM_API_KEY=")
assert result[1].startswith("LM_API_KEY=")
def test_save_env_value_fixes_corruption_on_write(self, tmp_path):
"""save_env_value sanitizes corrupted lines when writing a new key."""
env_file = tmp_path / ".env"
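The two removed GLM tests describe a suffix collision: `LM_API_KEY=` is a substring of `GLM_API_KEY=`, so a naive search for known keys would split a healthy line. One way to avoid that, sketched below, is to ignore any key match that sits inside a longer key match. This is an illustrative assumption about the approach, not the actual `_sanitize_env_lines` implementation, and `split_concatenated` and `KNOWN_KEYS` are hypothetical names.

```python
from typing import List, Sequence, Tuple

KNOWN_KEYS = ("GLM_API_KEY", "GLM_BASE_URL", "LM_API_KEY", "LM_BASE_URL")


def split_concatenated(line: str, known_keys: Sequence[str] = KNOWN_KEYS) -> List[str]:
    """Split a line wherever a known KEY= begins, ignoring nested matches."""
    body = line.rstrip("\n")
    spans: List[Tuple[int, int]] = []
    for key in known_keys:
        needle = key + "="
        start = body.find(needle)
        while start != -1:
            spans.append((start, start + len(needle)))
            start = body.find(needle, start + 1)
    # Keep only matches that are not contained in a longer match
    # starting earlier (the GLM_/LM_ suffix collision).
    starts = sorted(
        {s for s, e in spans if not any(s2 < s and e <= e2 for s2, e2 in spans)}
    )
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(body)]
    # Every returned piece is newline-terminated.
    return [body[a:b] + "\n" for a, b in zip(bounds, bounds[1:])]
```

Under this rule a plain `GLM_API_KEY=...` line comes back unchanged, while a genuine concatenation such as `GLM_API_KEY=glmLM_API_KEY=lm-key` still splits in two.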
-53
@@ -345,59 +345,6 @@ def test_run_doctor_accepts_bare_custom_provider(monkeypatch, tmp_path):
assert "model.provider 'custom' is not a recognised provider" not in out
@pytest.mark.parametrize(
("provider", "default_model"),
[
("ai-gateway", "anthropic/claude-sonnet-4.6"),
("opencode-zen", "anthropic/claude-sonnet-4.6"),
("kilocode", "anthropic/claude-sonnet-4.6"),
("kimi-coding", "kimi-k2"),
],
)
def test_run_doctor_accepts_hermes_provider_ids_that_catalog_aliases(
monkeypatch, tmp_path, provider, default_model
):
home = tmp_path / ".hermes"
home.mkdir(parents=True, exist_ok=True)
(home / "config.yaml").write_text(
"model:\n"
f" provider: {provider}\n"
f" default: {default_model}\n",
encoding="utf-8",
)
monkeypatch.setattr(doctor_mod, "HERMES_HOME", home)
monkeypatch.setattr(doctor_mod, "PROJECT_ROOT", tmp_path / "project")
monkeypatch.setattr(doctor_mod, "_DHH", str(home))
(tmp_path / "project").mkdir(exist_ok=True)
fake_model_tools = types.SimpleNamespace(
check_tool_availability=lambda *a, **kw: ([], []),
TOOLSET_REQUIREMENTS={},
)
monkeypatch.setitem(sys.modules, "model_tools", fake_model_tools)
try:
from hermes_cli import auth as _auth_mod
monkeypatch.setattr(_auth_mod, "get_nous_auth_status", lambda: {})
monkeypatch.setattr(_auth_mod, "get_codex_auth_status", lambda: {})
except Exception:
pass
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
doctor_mod.run_doctor(Namespace(fix=False))
out = buf.getvalue()
assert f"model.provider '{provider}' is not a recognised provider" not in out
assert f"model.provider '{provider}' is unknown" not in out
if provider in {"ai-gateway", "opencode-zen", "kilocode"}:
assert (
f"model.default '{default_model}' uses a vendor/model slug but provider is '{provider}'"
not in out
)
def test_run_doctor_termux_does_not_mark_browser_available_without_agent_browser(monkeypatch, tmp_path):
home = tmp_path / ".hermes"
home.mkdir(parents=True, exist_ok=True)
+1 -1
@@ -252,7 +252,7 @@ class TestCliBrandingHelpers:
from hermes_cli.skin_engine import set_active_skin, get_active_prompt_symbol
set_active_skin("ares")
assert get_active_prompt_symbol() == ""
assert get_active_prompt_symbol() == " "
def test_active_help_header_ares(self):
from hermes_cli.skin_engine import set_active_skin, get_active_help_header
+34 -147
@@ -1,176 +1,67 @@
"""Behavior tests for the skill review / combined review prompts.
"""Behavior tests for the class-first skill review prompts.
The review prompts steer the background review agent toward actively updating
the skill library after most sessions, with a strong bias toward:
1. Patching currently-loaded skills first,
2. Patching existing umbrellas next,
3. Adding references/ files under an existing umbrella,
4. Creating a new class-level umbrella only when nothing else fits.
User-preference corrections (style, format, verbosity, legibility) are
first-class skill signals, not just memory signals.
These tests assert behavioral *instructions* are present; they do NOT
The skill review / combined review prompts steer the background review agent
toward generalizing existing skills rather than accumulating near-duplicates.
These tests assert the behavioral *instructions* are present; they do NOT
snapshot the full prompt text (change-detector).
"""
from run_agent import AIAgent
# ---------------------------------------------------------------------------
# _SKILL_REVIEW_PROMPT
# ---------------------------------------------------------------------------
def test_skill_review_prompt_biases_toward_active_updates():
"""Prompt must frame updating as the default stance, not something rare."""
def test_skill_review_prompt_instructs_survey_first():
"""Prompt must tell the reviewer to list existing skills before deciding."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "ACTIVE" in prompt or "active" in prompt.lower(), (
"must tell the reviewer to be active"
)
# "missed learning opportunity" or equivalent framing for not acting
assert "missed" in prompt.lower() or "opportunity" in prompt.lower(), (
"must frame inaction as a miss, not a neutral outcome"
)
assert "skills_list" in prompt, "must instruct the reviewer to call skills_list"
assert "skill_view" in prompt, "must instruct the reviewer to skill_view candidates"
assert "SURVEY" in prompt, "must name the survey step explicitly"
def test_skill_review_prompt_treats_user_corrections_as_skill_signal():
"""Style/format/verbosity complaints must be FIRST-CLASS skill signals, not just memory."""
def test_skill_review_prompt_is_class_first():
"""Prompt must steer toward the CLASS of task, not the specific task."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
lower = prompt.lower()
# Must mention style/format/verbosity-family corrections
assert any(k in lower for k in ("style", "format", "verbos", "legib", "tone")), (
"must name style/format/verbosity/legibility as signals"
)
# Must frame these as first-class skill signals (not memory-only)
assert "FIRST-CLASS" in prompt or "first-class" in prompt, (
"must explicitly label user-preference corrections as first-class skill signals"
)
# Must mention the correction-type phrases to tune the model's ear
assert "stop doing" in lower or "don't" in lower or "hate" in lower or "frustrat" in lower, (
"must give concrete phrasing examples so the model recognizes corrections"
)
assert "CLASS" in prompt, "must tell the reviewer to think about the task class"
assert "class level" in prompt, "must anchor naming at the class level"
def test_skill_review_prompt_prefers_loaded_skills_first():
"""Currently-loaded skills must be the first patch target."""
def test_skill_review_prompt_prefers_updating_existing():
"""Prompt must prefer generalizing an existing skill over creating a new one."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "LOADED" in prompt or "loaded" in prompt, (
"must mention currently-loaded skills"
assert "PREFER GENERALIZING" in prompt or "PREFER UPDATING" in prompt, (
"must state the update-over-create preference"
)
# Must name the mechanisms for detecting loaded skills
assert "skill_view" in prompt and "/skill" in prompt, (
"must name skill_view and /skill-name as loaded-skill signals"
assert "ONLY CREATE A NEW SKILL" in prompt, (
"must gate new-skill creation behind a last-resort clause"
)
def test_skill_review_prompt_has_four_step_preference_order():
"""The 4-step patch/support-file/create ladder must be present."""
def test_skill_review_prompt_flags_overlap_for_followup():
"""Prompt must ask the reviewer to note overlapping skills for future review."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "PATCH" in prompt
assert "references/" in prompt or "REFERENCE" in prompt
assert "CREATE" in prompt
assert "UMBRELLA" in prompt or "umbrella" in prompt
assert "overlap" in prompt.lower(), "must mention the overlap-flagging protocol"
def test_skill_review_prompt_names_three_support_file_kinds():
"""Support-file step must name references/, templates/, and scripts/."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "references/" in prompt, "must name references/ as a support-file kind"
assert "templates/" in prompt, "must name templates/ as a support-file kind"
assert "scripts/" in prompt, "must name scripts/ as a support-file kind"
# Purpose hints for each kind
assert "knowledge" in prompt.lower() or "research" in prompt.lower() or "API docs" in prompt, (
"must mention knowledge-bank / research / API-docs role of references/"
)
assert "copied" in prompt.lower() or "starter" in prompt.lower() or "reproduce" in prompt.lower(), (
"must mention that templates/ are starter files to copy/modify"
)
assert "re-runnable" in prompt.lower() or "verification" in prompt.lower() or "probe" in prompt.lower(), (
"must mention that scripts/ are re-runnable actions"
)
def test_skill_review_prompt_has_name_veto_for_create():
"""Creating a new skill must be gated behind class-level naming."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "class level" in prompt.lower() or "CLASS-LEVEL" in prompt
assert "MUST NOT" in prompt or "must not" in prompt, (
"must have a name-veto clause blocking session-artifact names"
)
def test_skill_review_prompt_embeds_user_preferences_in_skills():
"""Must explicitly say user-preference lessons belong in SKILL.md, not only memory."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
lower = prompt.lower()
assert "preference" in lower, "must mention user preferences"
assert "memory" in lower and "skill" in lower, (
"must contrast memory vs skill responsibilities"
)
def test_skill_review_prompt_flags_overlap_and_defers_to_curator():
"""Reviewer should not consolidate live; flag overlap for the curator."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "overlap" in prompt.lower()
assert "curator" in prompt.lower(), "must defer consolidation to the curator"
def test_skill_review_prompt_still_has_opt_out_clause():
"""'Nothing to save.' must remain as a real-but-not-default option."""
def test_skill_review_prompt_preserves_opt_out_clause():
"""The 'Nothing to save.' escape clause must remain."""
prompt = AIAgent._SKILL_REVIEW_PROMPT
assert "Nothing to save." in prompt
# ---------------------------------------------------------------------------
# _COMBINED_REVIEW_PROMPT
# ---------------------------------------------------------------------------
def test_combined_review_prompt_has_memory_section():
"""Memory half must still cover user facts and preferences."""
def test_combined_review_prompt_keeps_memory_section():
"""Combined prompt must still cover memory review."""
prompt = AIAgent._COMBINED_REVIEW_PROMPT
assert "**Memory**" in prompt
assert "memory tool" in prompt
def test_combined_review_prompt_skills_biased_toward_active_updates():
"""Skills half must carry the active-update bias."""
def test_combined_review_prompt_skills_section_is_class_first():
"""The **Skills** half of the combined prompt must follow the same protocol."""
prompt = AIAgent._COMBINED_REVIEW_PROMPT
assert "**Skills**" in prompt
assert "ACTIVE" in prompt or "active" in prompt.lower()
assert "missed" in prompt.lower() or "opportunity" in prompt.lower()
def test_combined_review_prompt_treats_user_corrections_as_skill_signal():
"""Combined prompt must carry the same user-preference-is-skill-signal rule."""
prompt = AIAgent._COMBINED_REVIEW_PROMPT
lower = prompt.lower()
assert any(k in lower for k in ("style", "format", "verbos", "legib", "tone"))
assert "FIRST-CLASS" in prompt or "first-class" in prompt
def test_combined_review_prompt_prefers_loaded_skills_first():
"""Combined prompt must also prefer loaded skills first."""
prompt = AIAgent._COMBINED_REVIEW_PROMPT
assert "LOADED" in prompt or "loaded" in prompt
assert "skill_view" in prompt and "/skill" in prompt
def test_combined_review_prompt_has_four_step_skill_ladder():
"""Combined prompt must keep the patch/support-file/create ladder on the Skills half."""
prompt = AIAgent._COMBINED_REVIEW_PROMPT
assert "PATCH" in prompt
assert "references/" in prompt or "REFERENCE" in prompt
assert "CREATE" in prompt
assert "CLASS-LEVEL" in prompt or "class-level" in prompt or "class level" in prompt.lower()
def test_combined_review_prompt_names_three_support_file_kinds():
"""Combined prompt must also name all three support-file kinds."""
prompt = AIAgent._COMBINED_REVIEW_PROMPT
assert "references/" in prompt
assert "templates/" in prompt
assert "scripts/" in prompt
assert "SURVEY" in prompt
assert "CLASS" in prompt
assert "skills_list" in prompt
assert "ONLY CREATE A NEW SKILL" in prompt
def test_combined_review_prompt_preserves_opt_out_clause():
@@ -178,14 +69,10 @@ def test_combined_review_prompt_preserves_opt_out_clause():
assert "Nothing to save." in prompt
# ---------------------------------------------------------------------------
# _MEMORY_REVIEW_PROMPT — unchanged, still memory-focused
# ---------------------------------------------------------------------------
def test_memory_review_prompt_still_focused_on_user_facts():
def test_memory_review_prompt_unchanged_in_structure():
"""Memory-only review prompt stays focused on user facts — not touched by this change."""
prompt = AIAgent._MEMORY_REVIEW_PROMPT
# The memory-only prompt should NOT drift into skill territory
# Guardrails: the memory-only prompt must NOT mention skills/surveys.
assert "skills_list" not in prompt
assert "SURVEY" not in prompt
assert "memory tool" in prompt
+2 -2
@@ -40,14 +40,14 @@ class TestCliSkinPromptIntegration:
cli = _make_cli_stub()
set_active_skin("ares")
assert cli._get_tui_prompt_fragments() == [("class:prompt", "")]
assert cli._get_tui_prompt_fragments() == [("class:prompt", " ")]
def test_secret_prompt_fragments_preserve_secret_state(self):
cli = _make_cli_stub()
cli._secret_state = {"response_queue": object()}
set_active_skin("ares")
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ")]
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ")]
def test_icon_only_skin_symbol_still_visible_in_special_states(self):
cli = _make_cli_stub()
-763
@@ -944,39 +944,6 @@ def test_config_set_section_rejects_unknown_section_or_mode(tmp_path, monkeypatc
assert bad_mode["error"]["code"] == 4002
def test_config_mouse_uses_documented_key_with_legacy_fallback(monkeypatch):
cfg = {"display": {"tui_mouse": False}}
writes = []
monkeypatch.setattr(server, "_load_cfg", lambda: cfg)
monkeypatch.setattr(
server, "_write_config_key", lambda path, value: writes.append((path, value))
)
get_legacy = server.handle_request(
{"id": "1", "method": "config.get", "params": {"key": "mouse"}}
)
assert get_legacy["result"]["value"] == "off"
set_toggle = server.handle_request(
{"id": "2", "method": "config.set", "params": {"key": "mouse"}}
)
assert set_toggle["result"] == {"key": "mouse", "value": "on"}
assert writes == [("display.mouse_tracking", True)]
cfg["display"] = {"mouse_tracking": 0, "tui_mouse": True}
get_canonical = server.handle_request(
{"id": "3", "method": "config.get", "params": {"key": "mouse"}}
)
assert get_canonical["result"]["value"] == "off"
cfg["display"] = {"mouse_tracking": None, "tui_mouse": False}
get_null = server.handle_request(
{"id": "4", "method": "config.get", "params": {"key": "mouse"}}
)
assert get_null["result"]["value"] == "on"
def test_enable_gateway_prompts_sets_gateway_env(monkeypatch):
monkeypatch.delenv("HERMES_EXEC_ASK", raising=False)
monkeypatch.delenv("HERMES_GATEWAY_SESSION", raising=False)
@@ -2754,733 +2721,3 @@ def test_session_most_recent_handles_db_unavailable(monkeypatch):
)
assert resp["result"]["session_id"] is None
# ── browser.manage ───────────────────────────────────────────────────
def _stub_urlopen(monkeypatch, *, ok: bool):
"""Patch urllib.request.urlopen used by browser.manage to short-circuit probes."""
class _Resp:
status = 200 if ok else 503
def __enter__(self):
return self
def __exit__(self, *_):
return False
def _opener(_url, timeout=2.0): # noqa: ARG001 — match urllib signature
if not ok:
raise OSError("probe failed")
return _Resp()
import urllib.request
monkeypatch.setattr(urllib.request, "urlopen", _opener)
def _stub_urlopen_capture(monkeypatch, *, ok: bool):
urls: list[str] = []
class _Resp:
status = 200
def __enter__(self):
return self
def __exit__(self, *_):
return False
def _opener(url, timeout=2.0): # noqa: ARG001 — match urllib signature
urls.append(url)
if not ok:
raise OSError("probe failed")
return _Resp()
import urllib.request
monkeypatch.setattr(urllib.request, "urlopen", _opener)
return urls
def test_browser_manage_status_reads_env_var(monkeypatch):
"""Status returns the env var verbatim (no network I/O)."""
monkeypatch.setenv("BROWSER_CDP_URL", "http://127.0.0.1:9222")
resp = server.handle_request(
{"id": "1", "method": "browser.manage", "params": {"action": "status"}}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == "http://127.0.0.1:9222"
def test_browser_manage_status_falls_back_to_config_cdp_url(monkeypatch):
"""When env is unset, status surfaces ``browser.cdp_url`` from
config.yaml so users see what the next tool call will read."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake_cfg = types.SimpleNamespace(
read_raw_config=lambda: {"browser": {"cdp_url": "http://lan:9222"}}
)
with patch.dict(sys.modules, {"hermes_cli.config": fake_cfg}):
resp = server.handle_request(
{"id": "1", "method": "browser.manage", "params": {"action": "status"}}
)
assert resp["result"] == {"connected": True, "url": "http://lan:9222"}
def test_browser_manage_status_does_not_call_get_cdp_override(monkeypatch):
"""Regression guard for Copilot's "status must not block" review:
status must NOT route through `_get_cdp_override`, which performs a
`/json/version` HTTP probe with a multi-second timeout."""
monkeypatch.setenv("BROWSER_CDP_URL", "http://127.0.0.1:9222")
fake = types.SimpleNamespace(
_get_cdp_override=lambda: pytest.fail( # noqa: PT015 — fail loudly if called
"_get_cdp_override must not run on /browser status (network I/O)"
)
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
resp = server.handle_request(
{"id": "1", "method": "browser.manage", "params": {"action": "status"}}
)
assert resp["result"]["connected"] is True
def test_browser_manage_connect_sets_env_and_cleans_twice(monkeypatch):
"""`/browser connect` must reach the live process: set env, reap browser
sessions before AND after publishing the new URL. The double-cleanup
closes the supervisor swap window where ``_ensure_cdp_supervisor``
could re-attach to the *old* CDP endpoint between steps."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
cleanup_calls: list[str] = []
def _cleanup_all():
cleanup_calls.append(os.environ.get("BROWSER_CDP_URL", ""))
fake = types.SimpleNamespace(
cleanup_all_browsers=_cleanup_all,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=True)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "http://127.0.0.1:9222"},
}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == "http://127.0.0.1:9222"
assert resp["result"]["messages"] == ["Chrome is already listening on port 9222"]
assert os.environ.get("BROWSER_CDP_URL") == "http://127.0.0.1:9222"
# First cleanup runs against the OLD env (none here), second against the NEW.
assert cleanup_calls == ["", "http://127.0.0.1:9222"]
def test_browser_manage_connect_defaults_to_loopback(monkeypatch):
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
urls = _stub_urlopen_capture(monkeypatch, ok=True)
resp = server.handle_request(
{"id": "1", "method": "browser.manage", "params": {"action": "connect"}}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == "http://127.0.0.1:9222"
assert resp["result"]["messages"] == ["Chrome is already listening on port 9222"]
assert urls[0] == "http://127.0.0.1:9222/json/version"
def test_browser_manage_connect_default_local_reports_launch_hint(monkeypatch):
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
emitted: list[tuple[str, dict]] = []
monkeypatch.setattr(
server,
"_emit",
lambda evt, sid, payload=None: emitted.append((evt, payload or {})),
)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=False)
with (
patch(
"hermes_cli.browser_connect.try_launch_chrome_debug", return_value=False
),
patch(
"hermes_cli.browser_connect.get_chrome_debug_candidates",
return_value=[],
),
):
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {
"action": "connect",
"session_id": "sess-1",
"url": "http://localhost:9222",
},
}
)
assert resp["result"]["connected"] is False
assert resp["result"]["url"] == "http://127.0.0.1:9222"
assert (
resp["result"]["messages"][0]
== "Chrome isn't running with remote debugging — attempting to launch..."
)
assert any(
"No Chrome/Chromium executable was found" in line
for line in resp["result"]["messages"]
)
assert any(
"--remote-debugging-port=9222" in line for line in resp["result"]["messages"]
)
assert "BROWSER_CDP_URL" not in os.environ
progress = [p["message"] for evt, p in emitted if evt == "browser.progress"]
assert progress == resp["result"]["messages"]
def test_browser_manage_connect_no_session_skips_progress_events(monkeypatch):
"""Without a session_id the TUI prints messages from the response;
emitting ``browser.progress`` events would double-render. Gate the
emit so callers without a session see the bundled list only."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
emitted: list[tuple[str, dict]] = []
monkeypatch.setattr(
server,
"_emit",
lambda evt, sid, payload=None: emitted.append((evt, payload or {})),
)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=False)
with (
patch(
"hermes_cli.browser_connect.try_launch_chrome_debug", return_value=False
),
patch(
"hermes_cli.browser_connect.get_chrome_debug_candidates",
return_value=[],
),
):
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "http://localhost:9222"},
}
)
assert resp["result"]["connected"] is False
assert resp["result"]["messages"] # bundled list still populated
assert [evt for evt, _ in emitted if evt == "browser.progress"] == []
def test_browser_manage_connect_handles_null_url(monkeypatch):
"""Explicit ``{"url": null}`` (or empty string) must fall back to the
default loopback URL instead of raising a TypeError that gets swallowed
by the outer 5031 catch."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=True)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": None},
}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == "http://127.0.0.1:9222"
def test_browser_manage_connect_rejects_non_string_url(monkeypatch):
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": 9222},
}
)
assert resp["error"]["code"] == 4015
assert "must be a string" in resp["error"]["message"]
assert "BROWSER_CDP_URL" not in os.environ
def test_browser_manage_connect_default_local_retries_after_launch(monkeypatch):
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
monkeypatch.setattr(server.time, "sleep", lambda _seconds: None)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
class _Resp:
status = 200
def __enter__(self):
return self
def __exit__(self, *_):
return False
attempts = {"n": 0}
def _opener(_url, timeout=2.0): # noqa: ARG001 — match urllib signature
attempts["n"] += 1
if attempts["n"] < 3:
raise OSError("not ready")
return _Resp()
import urllib.request
monkeypatch.setattr(urllib.request, "urlopen", _opener)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
with patch(
"hermes_cli.browser_connect.try_launch_chrome_debug", return_value=True
):
resp = server.handle_request(
{"id": "1", "method": "browser.manage", "params": {"action": "connect"}}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == "http://127.0.0.1:9222"
assert resp["result"]["messages"] == [
"Chrome isn't running with remote debugging — attempting to launch...",
"Chrome launched and listening on port 9222",
]
assert os.environ["BROWSER_CDP_URL"] == "http://127.0.0.1:9222"
def test_browser_manage_connect_rejects_unreachable_endpoint(monkeypatch):
"""An unreachable endpoint must NOT mutate the env or reap sessions."""
monkeypatch.setenv("BROWSER_CDP_URL", "http://existing:9222")
cleanup_calls: list[str] = []
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: cleanup_calls.append(
os.environ.get("BROWSER_CDP_URL", "")
),
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=False)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "http://unreachable:9222"},
}
)
assert "error" in resp
# Env preserved; nothing reaped.
assert os.environ["BROWSER_CDP_URL"] == "http://existing:9222"
assert cleanup_calls == []
def test_browser_manage_connect_normalizes_bare_host_port(monkeypatch):
"""Persist a parsed `scheme://host:port` URL so `_get_cdp_override`
can normalize it; storing a bare host:port would break subsequent
tool calls (Copilot review on #17120)."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=True)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "127.0.0.1:9222"},
}
)
assert resp["result"]["connected"] is True
# Bare host:port got promoted to a full URL with explicit scheme.
assert resp["result"]["url"].startswith("http://")
assert os.environ["BROWSER_CDP_URL"].startswith("http://")
def test_browser_manage_connect_strips_discovery_path(monkeypatch):
"""User-supplied discovery paths like `/json` or `/json/version`
must collapse to bare `scheme://host:port`; otherwise
``_resolve_cdp_override`` will append ``/json/version`` again and
produce a duplicate path (Copilot review round-2 on #17120)."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
_stub_urlopen(monkeypatch, ok=True)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "http://127.0.0.1:9222/json"},
}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == "http://127.0.0.1:9222"
assert os.environ["BROWSER_CDP_URL"] == "http://127.0.0.1:9222"
def test_browser_manage_connect_preserves_devtools_browser_endpoint(monkeypatch):
"""Concrete devtools websocket endpoints (e.g. Browserbase) must
survive verbatim; we only collapse discovery-style paths."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
concrete = "ws://browserbase.example/devtools/browser/abc123"
class _OkSocket:
def __enter__(self):
return self
def __exit__(self, *a):
return False
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
# With _stub_urlopen(ok=True) this test would pass even if urlopen
# were reached for a concrete ws endpoint; make urlopen raise
# instead so the test proves the HTTP probe is skipped.
with patch(
"urllib.request.urlopen", side_effect=AssertionError("urlopen called")
):
with patch("socket.create_connection", return_value=_OkSocket()):
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": concrete},
}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == concrete
assert os.environ["BROWSER_CDP_URL"] == concrete
def test_browser_manage_connect_local_devtools_ws_preserves_path(monkeypatch):
"""Regression: ``ws://127.0.0.1:9222/devtools/browser/<id>`` is a real
connectable endpoint; default-local normalization must not strip the
``/devtools/browser/...`` path or it breaks valid local CDP connects."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
concrete = "ws://127.0.0.1:9222/devtools/browser/abc123"
class _OkSocket:
def __enter__(self):
return self
def __exit__(self, *a):
return False
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
with patch("socket.create_connection", return_value=_OkSocket()):
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": concrete},
}
)
assert resp["result"]["connected"] is True
assert resp["result"]["url"] == concrete
assert os.environ["BROWSER_CDP_URL"] == concrete
def test_browser_manage_connect_rejects_invalid_port(monkeypatch):
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "http://localhost:abc"},
}
)
assert resp["error"]["code"] == 4015
assert "invalid port" in resp["error"]["message"]
assert "BROWSER_CDP_URL" not in os.environ
def test_browser_manage_connect_rejects_missing_host(monkeypatch):
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": "http://:9222"},
}
)
assert resp["error"]["code"] == 4015
assert "missing host" in resp["error"]["message"]
assert "BROWSER_CDP_URL" not in os.environ
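The URL handling that the connect tests above pin down (bare `host:port` promotion, discovery-path stripping, verbatim devtools endpoints, host and port validation) could look roughly like this. It is a sketch under stated assumptions: `normalize_cdp_url` is a hypothetical helper, the set of discovery paths is inferred from the docstrings, and the `localhost` to `127.0.0.1` rewrite seen in one test is deliberately left out.

```python
from typing import Optional
from urllib.parse import urlsplit

DEFAULT_CDP_URL = "http://127.0.0.1:9222"
# Assumed discovery-style paths that collapse to scheme://host:port.
DISCOVERY_PATHS = {"", "/json", "/json/version"}


def normalize_cdp_url(raw: Optional[str]) -> str:
    """Return a canonical CDP URL, or raise ValueError for bad input."""
    text = (raw or "").strip()
    if not text:
        # None and empty string both fall back to the loopback default.
        return DEFAULT_CDP_URL
    if "://" not in text:
        # Promote a bare host:port to a full URL with an explicit scheme.
        text = "http://" + text
    try:
        parts = urlsplit(text)
        port = parts.port  # raises ValueError for a non-numeric port
    except ValueError as exc:
        raise ValueError("invalid port") from exc
    if not parts.hostname:
        raise ValueError("missing host")
    if parts.path.startswith("/devtools/browser/"):
        # Concrete websocket endpoints must survive verbatim.
        return text
    if parts.path.rstrip("/") in DISCOVERY_PATHS:
        netloc = parts.hostname if port is None else f"{parts.hostname}:{port}"
        return f"{parts.scheme}://{netloc}"
    return text
```

Rejecting bad input before touching the environment matches the tests' expectation that `BROWSER_CDP_URL` stays unset on a validation error.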
def test_browser_manage_connect_concrete_ws_skips_http_probe(monkeypatch):
"""Regression for round-2 Copilot review: a hosted CDP endpoint
(no HTTP discovery) must connect via TCP-only reachability check.
The HTTP probe used to reject these even though they're valid."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
concrete = "wss://chrome.browserless.io/devtools/browser/sess-1"
seen_targets: list[tuple[str, int]] = []
class _OkSocket:
def __enter__(self):
return self
def __exit__(self, *a):
return False
def _fake_create_connection(addr, timeout=None):
seen_targets.append(addr)
return _OkSocket()
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
# urlopen would 404/ECONNREFUSED on a real hosted CDP endpoint;
# asserting it's never called proves the probe was skipped.
with patch(
"urllib.request.urlopen", side_effect=AssertionError("urlopen called")
):
with patch("socket.create_connection", side_effect=_fake_create_connection):
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": concrete},
}
)
assert resp["result"] == {"connected": True, "url": concrete}
# wss → port 443, host preserved verbatim.
assert seen_targets == [("chrome.browserless.io", 443)]
def test_browser_manage_connect_concrete_ws_tcp_unreachable(monkeypatch):
"""If the TCP reachability check fails for a concrete ws endpoint,
return a clear 5031 error, with no fallback to the HTTP probe (which
can never succeed for these URLs anyway)."""
monkeypatch.delenv("BROWSER_CDP_URL", raising=False)
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: None,
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
concrete = "ws://offline.example/devtools/browser/missing"
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
with patch("socket.create_connection", side_effect=OSError("ECONNREFUSED")):
resp = server.handle_request(
{
"id": "1",
"method": "browser.manage",
"params": {"action": "connect", "url": concrete},
}
)
assert "error" in resp
assert resp["error"]["code"] == 5031
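A TCP-only reachability check for concrete websocket endpoints, as the two tests above describe, might be sketched like this. `tcp_target` and `is_reachable` are hypothetical names, and the default-port table is an assumption based on the usual scheme defaults; only the `wss` to 443 mapping is confirmed by the test.

```python
import socket
from typing import Callable, Tuple
from urllib.parse import urlsplit

# Assumed scheme defaults, used when the URL carries no explicit port.
DEFAULT_PORTS = {"ws": 80, "wss": 443, "http": 80, "https": 443}


def tcp_target(url: str) -> Tuple[str, int]:
    """Derive the (host, port) pair to dial for a CDP endpoint URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


def is_reachable(
    url: str,
    timeout: float = 2.0,
    connect: Callable = socket.create_connection,
) -> bool:
    """Open and immediately close a TCP connection; no HTTP probe."""
    try:
        with connect(tcp_target(url), timeout=timeout):
            return True
    except OSError:
        return False
```

Passing `connect` as a parameter is only a convenience for exercising the function without network access; the tests in the diff patch `socket.create_connection` directly.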
def test_browser_manage_disconnect_drops_env_and_cleans(monkeypatch):
monkeypatch.setenv("BROWSER_CDP_URL", "http://127.0.0.1:9222")
cleanup_count = {"n": 0}
fake = types.SimpleNamespace(
cleanup_all_browsers=lambda: cleanup_count.__setitem__(
"n", cleanup_count["n"] + 1
),
_get_cdp_override=lambda: os.environ.get("BROWSER_CDP_URL", ""),
)
with patch.dict(sys.modules, {"tools.browser_tool": fake}):
resp = server.handle_request(
{"id": "1", "method": "browser.manage", "params": {"action": "disconnect"}}
)
assert resp["result"] == {"connected": False}
assert "BROWSER_CDP_URL" not in os.environ
# Two cleanups: once before env removal, once after, matching connect.
assert cleanup_count["n"] == 2
# ── config.get indicator normalization ───────────────────────────────
def test_config_get_indicator_returns_known_value_verbatim(monkeypatch):
monkeypatch.setattr(
server, "_load_cfg", lambda: {"display": {"tui_status_indicator": "emoji"}}
)
resp = server.handle_request(
{"id": "1", "method": "config.get", "params": {"key": "indicator"}}
)
assert resp["result"] == {"value": "emoji"}
def test_config_get_indicator_normalizes_casing_and_whitespace(monkeypatch):
"""Hand-edited config.yaml stays consistent with what the TUI shows.
Frontend's `normalizeIndicatorStyle` lowercases + trims, so config.get
must do the same; otherwise `/indicator` prints 'EMOJI ' while the
UI is actually rendering the kaomoji default."""
monkeypatch.setattr(
server, "_load_cfg", lambda: {"display": {"tui_status_indicator": " EMOJI "}}
)
resp = server.handle_request(
{"id": "1", "method": "config.get", "params": {"key": "indicator"}}
)
assert resp["result"] == {"value": "emoji"}
def test_config_get_indicator_falls_back_to_default_for_unknown(monkeypatch):
"""An unknown value in config.yaml falls back to the same default
the frontend uses (`_INDICATOR_DEFAULT`)."""
monkeypatch.setattr(
server, "_load_cfg", lambda: {"display": {"tui_status_indicator": "rainbow"}}
)
resp = server.handle_request(
{"id": "1", "method": "config.get", "params": {"key": "indicator"}}
)
assert resp["result"] == {"value": "kaomoji"}
def test_config_get_indicator_falls_back_when_unset(monkeypatch):
monkeypatch.setattr(server, "_load_cfg", lambda: {"display": {}})
resp = server.handle_request(
{"id": "1", "method": "config.get", "params": {"key": "indicator"}}
)
assert resp["result"] == {"value": "kaomoji"}
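The normalization these four tests describe (trim, lowercase, fall back to the default for unknown or missing values) can be sketched as below. `normalize_indicator` is a hypothetical name, and `KNOWN_STYLES` lists only the two values that appear in the tests; the real set of styles may be larger.

```python
# Assumed subset of valid styles; only these two appear in the tests.
KNOWN_STYLES = {"emoji", "kaomoji"}
DEFAULT_STYLE = "kaomoji"


def normalize_indicator(raw: object) -> str:
    """Return a known indicator style, or the default for anything else."""
    # Non-strings (None, numbers, lists) can never name a style.
    text = raw.strip().lower() if isinstance(raw, str) else ""
    return text if text in KNOWN_STYLES else DEFAULT_STYLE
```

For example, `normalize_indicator(" EMOJI ")` yields `"emoji"`, while `"rainbow"` and `None` both yield `"kaomoji"`.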
# ── config.set indicator validation ──────────────────────────────────
def test_config_set_indicator_accepts_known_value(monkeypatch):
written: dict = {}
monkeypatch.setattr(
server,
"_write_config_key",
lambda k, v: written.update({k: v}),
)
resp = server.handle_request(
{
"id": "1",
"method": "config.set",
"params": {"key": "indicator", "value": "EMOJI"},
}
)
assert resp["result"] == {"key": "indicator", "value": "emoji"}
assert written == {"display.tui_status_indicator": "emoji"}
def test_config_set_indicator_falsy_non_string_surfaces_in_error(monkeypatch):
"""`0` / `False` / `[]` are not valid styles, but the error message
must still tell the user what they sent; `value or ""` would have
erased them to a blank string."""
monkeypatch.setattr(server, "_write_config_key", lambda *a, **k: None)
for bad in (0, False, []):
resp = server.handle_request(
{
"id": "1",
"method": "config.set",
"params": {"key": "indicator", "value": bad},
}
)
assert "error" in resp
msg = resp["error"]["message"]
assert "unknown indicator" in msg
# The exact repr varies; `0`/`False` stringify with content,
# `[]` becomes an empty list — what matters is the diagnostic
# is no longer just `unknown indicator: ` with nothing after.
assert msg.split("; ")[0] != "unknown indicator: ''"
def test_config_set_indicator_none_keeps_blank_repr(monkeypatch):
"""`None` is the genuine 'no value' case — empty raw is acceptable."""
monkeypatch.setattr(server, "_write_config_key", lambda *a, **k: None)
resp = server.handle_request(
{
"id": "1",
"method": "config.set",
"params": {"key": "indicator", "value": None},
}
)
assert "error" in resp
assert "unknown indicator: ''" in resp["error"]["message"]
# ── reload.env ───────────────────────────────────────────────────────
def test_reload_env_rpc_calls_hermes_cli_reload_env(monkeypatch):
"""reload.env mirrors classic CLI's `/reload` — re-reads ~/.hermes/.env
into the gateway process and reports the count of vars updated."""
calls = {"n": 0}
def _fake_reload():
calls["n"] += 1
return 7
fake = types.SimpleNamespace(reload_env=_fake_reload)
with patch.dict(sys.modules, {"hermes_cli.config": fake}):
resp = server.handle_request(
{"id": "1", "method": "reload.env", "params": {}}
)
assert resp["result"] == {"updated": 7}
assert calls["n"] == 1
def test_reload_env_rpc_surfaces_errors(monkeypatch):
def _broken():
raise RuntimeError("env path locked")
fake = types.SimpleNamespace(reload_env=_broken)
with patch.dict(sys.modules, {"hermes_cli.config": fake}):
resp = server.handle_request(
{"id": "1", "method": "reload.env", "params": {}}
)
assert "error" in resp
assert "env path locked" in resp["error"]["message"]
@@ -1,148 +0,0 @@
"""Tests that init_session() respects the configured cwd.
The bug: when terminal.cwd is set in config.yaml, the configured path was
displayed in the TUI banner but actual terminal commands ran in os.getcwd()
(the directory where ``hermes chat`` was started).
Root cause: init_session() captures the login shell environment by running
``pwd -P`` inside a ``bash -l -c`` bootstrap. Profile scripts (.bashrc,
.bash_profile, etc.) can change the working directory before ``pwd -P``
runs, so _update_cwd() overwrites self.cwd with the wrong directory.
Fix: the bootstrap now includes an explicit ``cd`` back to self.cwd before
running ``pwd -P``, so the configured cwd is always what gets recorded.
"""
from tempfile import TemporaryFile
from unittest.mock import MagicMock
from tools.environments.base import BaseEnvironment
class _TestableEnv(BaseEnvironment):
"""Concrete subclass for testing base class methods."""
def __init__(self, cwd="/tmp", timeout=10):
super().__init__(cwd=cwd, timeout=timeout)
def _run_bash(self, cmd_string, *, login=False, timeout=120, stdin_data=None):
raise NotImplementedError("Use mock")
def cleanup(self):
pass
class TestInitSessionCwdRespect:
"""init_session() must preserve the configured cwd."""
def test_bootstrap_contains_cd_to_configured_cwd(self):
"""The bootstrap script must cd to self.cwd before running pwd."""
env = _TestableEnv(cwd="/my/project")
# Capture the bootstrap script that init_session would pass to _run_bash
captured = {}
def mock_run_bash(cmd_string, *, login=False, timeout=120, stdin_data=None):
captured["cmd"] = cmd_string
mock = MagicMock()
mock.poll.return_value = 0
mock.returncode = 0
stdout = TemporaryFile(mode="w+b")
stdout.seek(0)
mock.stdout = stdout
return mock
env._run_bash = mock_run_bash
env.init_session()
assert "cmd" in captured, "init_session did not call _run_bash"
bootstrap = captured["cmd"]
# The cd must appear before pwd -P so the configured cwd is recorded
cd_pos = bootstrap.find("builtin cd")
pwd_pos = bootstrap.find("pwd -P")
assert cd_pos != -1, "bootstrap must contain 'builtin cd'"
assert pwd_pos != -1, "bootstrap must contain 'pwd -P'"
assert cd_pos < pwd_pos, (
"builtin cd must appear before pwd -P in the bootstrap so "
"the configured cwd is what gets recorded"
)
# The cd target must be the configured path (shlex.quote only adds
# quotes when the path contains shell-special characters)
assert "/my/project" in bootstrap, (
"bootstrap cd must target the configured cwd (/my/project)"
)
def test_configured_cwd_survives_init_session(self):
"""self.cwd must be the configured path after init_session completes."""
configured_cwd = "/my/project"
env = _TestableEnv(cwd=configured_cwd)
marker = env._cwd_marker
def mock_run_bash(cmd_string, *, login=False, timeout=120, stdin_data=None):
mock = MagicMock()
mock.poll.return_value = 0
mock.returncode = 0
# Simulate output where pwd reports the configured cwd
output = f"snapshot output\n{marker}{configured_cwd}{marker}\n"
stdout = TemporaryFile(mode="w+b")
stdout.write(output.encode("utf-8"))
stdout.seek(0)
mock.stdout = stdout
return mock
env._run_bash = mock_run_bash
env.init_session()
assert env.cwd == configured_cwd, (
f"Expected cwd={configured_cwd!r} after init_session, got {env.cwd!r}"
)
def test_default_cwd_still_works(self):
"""When no custom cwd is configured, default /tmp behavior is preserved."""
env = _TestableEnv() # default cwd="/tmp"
marker = env._cwd_marker
def mock_run_bash(cmd_string, *, login=False, timeout=120, stdin_data=None):
mock = MagicMock()
mock.poll.return_value = 0
mock.returncode = 0
output = f"snapshot output\n{marker}/tmp{marker}\n"
stdout = TemporaryFile(mode="w+b")
stdout.write(output.encode("utf-8"))
stdout.seek(0)
mock.stdout = stdout
return mock
env._run_bash = mock_run_bash
env.init_session()
assert env.cwd == "/tmp"
def test_bootstrap_cd_uses_shlex_quote(self):
"""Paths with spaces must be properly quoted in the bootstrap cd."""
env = _TestableEnv(cwd="/my project/with spaces")
captured = {}
def mock_run_bash(cmd_string, *, login=False, timeout=120, stdin_data=None):
captured["cmd"] = cmd_string
mock = MagicMock()
mock.poll.return_value = 0
mock.returncode = 0
stdout = TemporaryFile(mode="w+b")
stdout.seek(0)
mock.stdout = stdout
return mock
env._run_bash = mock_run_bash
env.init_session()
bootstrap = captured["cmd"]
# shlex.quote wraps paths with spaces in single quotes
assert "'/my project/with spaces'" in bootstrap, (
"bootstrap cd must properly quote paths with spaces"
)
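The fix these tests guard (an explicit, quoted `cd` placed before `pwd -P`) can be illustrated with a small stand-alone helper. This is a sketch only: `build_bootstrap` is a hypothetical function, and the real bootstrap script in `init_session()` contains more than these two commands.

```python
import shlex


def build_bootstrap(cwd: str) -> str:
    """Build a shell snippet that records the configured cwd.

    Profile scripts may have changed directory already, so the snippet
    returns to `cwd` before asking the shell where it is.
    """
    # shlex.quote leaves simple paths untouched and single-quotes
    # anything containing spaces or other shell-special characters.
    quoted = shlex.quote(cwd)
    return f"builtin cd -- {quoted} || exit 1\npwd -P"
```

Using `builtin cd` sidesteps any `cd` alias or function a profile script may have defined.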
@@ -1,487 +0,0 @@
"""Tests for tools/skill_usage.py — sidecar telemetry + provenance filtering."""
import json
import os
from pathlib import Path
import pytest
@pytest.fixture
def skills_home(tmp_path, monkeypatch):
"""Isolated HERMES_HOME with a clean skills/ dir for each test."""
home = tmp_path / ".hermes"
home.mkdir()
(home / "skills").mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
# Force skill_usage module to re-resolve paths per test
import importlib
import tools.skill_usage as mod
importlib.reload(mod)
return home
def _write_skill(skills_dir: Path, name: str, category: str = ""):
"""Create a minimal SKILL.md with a name: frontmatter field."""
if category:
d = skills_dir / category / name
else:
d = skills_dir / name
d.mkdir(parents=True, exist_ok=True)
(d / "SKILL.md").write_text(
f"""---
name: {name}
description: test skill
---
# body
""",
encoding="utf-8",
)
return d
# ---------------------------------------------------------------------------
# Round-trip
# ---------------------------------------------------------------------------
def test_empty_usage_returns_empty_dict(skills_home):
from tools.skill_usage import load_usage
assert load_usage() == {}
def test_save_and_load_roundtrip(skills_home):
from tools.skill_usage import load_usage, save_usage
data = {"skill-a": {"use_count": 3, "state": "active"}}
save_usage(data)
loaded = load_usage()
assert loaded["skill-a"]["use_count"] == 3
assert loaded["skill-a"]["state"] == "active"
def test_save_is_atomic_no_partial_tmp_files(skills_home):
from tools.skill_usage import save_usage, _usage_file
save_usage({"x": {"use_count": 1}})
skills_dir = _usage_file().parent
# No leftover tempfile
for p in skills_dir.iterdir():
assert not p.name.startswith(".usage_"), f"leftover tmp: {p.name}"
def test_get_record_missing_returns_empty_record(skills_home):
from tools.skill_usage import get_record
rec = get_record("nonexistent")
assert rec["use_count"] == 0
assert rec["view_count"] == 0
assert rec["state"] == "active"
assert rec["pinned"] is False
assert rec["archived_at"] is None
def test_get_record_backfills_missing_keys(skills_home):
from tools.skill_usage import get_record, save_usage
save_usage({"legacy": {"use_count": 5}}) # old-format record
rec = get_record("legacy")
assert rec["use_count"] == 5
assert "view_count" in rec # backfilled
assert "state" in rec
def test_load_usage_handles_corrupt_file(skills_home):
from tools.skill_usage import load_usage, _usage_file
_usage_file().write_text("{ not json }", encoding="utf-8")
assert load_usage() == {}
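The atomic-save and corrupt-file behaviour checked above is commonly implemented with a temporary file plus `os.replace`. The sketch below shows that pattern; `save_json_atomic` and `load_json_or_empty` are hypothetical names, and the `.usage_` prefix is taken from the leftover-file assertion rather than from the module itself.

```python
import json
import os
import tempfile
from pathlib import Path


def save_json_atomic(path: Path, data: dict) -> None:
    """Write JSON next to `path`, then swap it into place in one step."""
    fd, tmp = tempfile.mkstemp(prefix=".usage_", suffix=".tmp", dir=path.parent)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp, path)
    except BaseException:
        # Never leave a partial temp file behind on failure.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_json_or_empty(path: Path) -> dict:
    """Return the stored mapping, or {} when missing or unparseable."""
    try:
        loaded = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}
    return loaded if isinstance(loaded, dict) else {}
```

Creating the temp file in the target directory keeps the rename on one filesystem, which is what makes the swap atomic.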
# ---------------------------------------------------------------------------
# Counter bumps
# ---------------------------------------------------------------------------
def test_bump_view_increments_and_timestamps(skills_home):
from tools.skill_usage import bump_view, get_record
bump_view("my-skill")
bump_view("my-skill")
rec = get_record("my-skill")
assert rec["view_count"] == 2
assert rec["last_viewed_at"] is not None
def test_bump_use_increments_and_timestamps(skills_home):
from tools.skill_usage import bump_use, get_record
bump_use("my-skill")
rec = get_record("my-skill")
assert rec["use_count"] == 1
assert rec["last_used_at"] is not None
def test_bump_patch_increments_and_timestamps(skills_home):
from tools.skill_usage import bump_patch, get_record
bump_patch("my-skill")
rec = get_record("my-skill")
assert rec["patch_count"] == 1
assert rec["last_patched_at"] is not None
def test_bump_on_empty_name_is_noop(skills_home):
from tools.skill_usage import bump_view, load_usage
bump_view("")
assert load_usage() == {}
def test_bumps_do_not_corrupt_other_skills(skills_home):
from tools.skill_usage import bump_view, bump_use, get_record
bump_view("skill-a")
bump_use("skill-b")
bump_view("skill-a")
assert get_record("skill-a")["view_count"] == 2
assert get_record("skill-a")["use_count"] == 0
assert get_record("skill-b")["use_count"] == 1
# ---------------------------------------------------------------------------
# State transitions
# ---------------------------------------------------------------------------
def test_set_state_active(skills_home):
from tools.skill_usage import set_state, get_record, STATE_ACTIVE
set_state("x", STATE_ACTIVE)
assert get_record("x")["state"] == "active"
def test_set_state_archived_records_timestamp(skills_home):
from tools.skill_usage import set_state, get_record, STATE_ARCHIVED
set_state("x", STATE_ARCHIVED)
rec = get_record("x")
assert rec["state"] == "archived"
assert rec["archived_at"] is not None
def test_set_state_invalid_is_noop(skills_home):
from tools.skill_usage import set_state, get_record
set_state("x", "bogus")
# No record created for invalid state
rec = get_record("x")
assert rec["state"] == "active" # default
def test_restoring_from_archive_clears_timestamp(skills_home):
from tools.skill_usage import set_state, get_record, STATE_ARCHIVED, STATE_ACTIVE
set_state("x", STATE_ARCHIVED)
assert get_record("x")["archived_at"] is not None
set_state("x", STATE_ACTIVE)
assert get_record("x")["archived_at"] is None
def test_set_pinned(skills_home):
from tools.skill_usage import set_pinned, get_record
set_pinned("x", True)
assert get_record("x")["pinned"] is True
set_pinned("x", False)
assert get_record("x")["pinned"] is False
def test_forget_removes_record(skills_home):
from tools.skill_usage import bump_view, forget, load_usage
bump_view("x")
assert "x" in load_usage()
forget("x")
assert "x" not in load_usage()
# ---------------------------------------------------------------------------
# Provenance filter — the load-bearing safety check
# ---------------------------------------------------------------------------
def test_agent_created_excludes_bundled(skills_home):
from tools.skill_usage import list_agent_created_skill_names
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "bundled-skill", category="github")
_write_skill(skills_dir, "my-skill")
# Seed a bundled manifest marking bundled-skill as upstream
(skills_dir / ".bundled_manifest").write_text(
"bundled-skill:abc123\n", encoding="utf-8",
)
names = list_agent_created_skill_names()
assert "my-skill" in names
assert "bundled-skill" not in names
def test_agent_created_excludes_hub_installed(skills_home):
from tools.skill_usage import list_agent_created_skill_names
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "hub-skill")
_write_skill(skills_dir, "my-skill")
hub_dir = skills_dir / ".hub"
hub_dir.mkdir()
(hub_dir / "lock.json").write_text(
json.dumps({"version": 1, "installed": {"hub-skill": {"source": "taps/main"}}}),
encoding="utf-8",
)
names = list_agent_created_skill_names()
assert "my-skill" in names
assert "hub-skill" not in names
def test_is_agent_created(skills_home):
from tools.skill_usage import is_agent_created
skills_dir = skills_home / "skills"
(skills_dir / ".bundled_manifest").write_text("bundled:abc\n", encoding="utf-8")
hub_dir = skills_dir / ".hub"
hub_dir.mkdir()
(hub_dir / "lock.json").write_text(
json.dumps({"installed": {"hubbed": {}}}), encoding="utf-8",
)
assert is_agent_created("my-skill") is True
assert is_agent_created("bundled") is False
assert is_agent_created("hubbed") is False
def test_agent_created_skips_archive_and_hub_dirs(skills_home):
from tools.skill_usage import list_agent_created_skill_names
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "real-skill")
# Dot-prefixed dirs must be ignored even if they contain SKILL.md
archive = skills_dir / ".archive" / "old-skill"
archive.mkdir(parents=True)
(archive / "SKILL.md").write_text(
"---\nname: old-skill\n---\n", encoding="utf-8",
)
names = list_agent_created_skill_names()
assert "real-skill" in names
assert "old-skill" not in names
# ---------------------------------------------------------------------------
# Archive / restore
# ---------------------------------------------------------------------------
def test_archive_skill_moves_directory(skills_home):
from tools.skill_usage import archive_skill, get_record, STATE_ARCHIVED
skills_dir = skills_home / "skills"
skill_dir = _write_skill(skills_dir, "old-skill")
assert skill_dir.exists()
ok, msg = archive_skill("old-skill")
assert ok, msg
assert not skill_dir.exists()
assert (skills_dir / ".archive" / "old-skill" / "SKILL.md").exists()
assert get_record("old-skill")["state"] == "archived"
assert get_record("old-skill")["archived_at"] is not None
def test_archive_refuses_bundled_skill(skills_home):
from tools.skill_usage import archive_skill
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "bundled")
(skills_dir / ".bundled_manifest").write_text("bundled:abc\n", encoding="utf-8")
ok, msg = archive_skill("bundled")
assert not ok
assert "bundled" in msg.lower() or "hub" in msg.lower()
def test_archive_refuses_hub_skill(skills_home):
from tools.skill_usage import archive_skill
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "hub-skill")
hub_dir = skills_dir / ".hub"
hub_dir.mkdir()
(hub_dir / "lock.json").write_text(
json.dumps({"installed": {"hub-skill": {}}}), encoding="utf-8",
)
ok, msg = archive_skill("hub-skill")
assert not ok
def test_archive_missing_skill_returns_error(skills_home):
from tools.skill_usage import archive_skill
ok, msg = archive_skill("nonexistent")
assert not ok
assert "not found" in msg.lower()
def test_restore_skill_moves_back(skills_home):
from tools.skill_usage import archive_skill, restore_skill, get_record
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "temp-skill")
archive_skill("temp-skill")
assert not (skills_dir / "temp-skill").exists()
ok, msg = restore_skill("temp-skill")
assert ok, msg
assert (skills_dir / "temp-skill" / "SKILL.md").exists()
assert get_record("temp-skill")["state"] == "active"
def test_archive_collision_gets_suffix(skills_home):
from tools.skill_usage import archive_skill
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "dup")
archive_skill("dup")
_write_skill(skills_dir, "dup") # recreate
ok, msg = archive_skill("dup")
assert ok
# Two entries under .archive/ — second should have a timestamp suffix
archived = sorted(p.name for p in (skills_dir / ".archive").iterdir() if p.is_dir())
assert "dup" in archived
assert any(n.startswith("dup-") and n != "dup" for n in archived)
# ---------------------------------------------------------------------------
# Reporting
# ---------------------------------------------------------------------------
def test_agent_created_report_includes_defaults(skills_home):
from tools.skill_usage import agent_created_report, bump_view
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "a")
_write_skill(skills_dir, "b")
bump_view("a")
rows = agent_created_report()
by_name = {r["name"]: r for r in rows}
assert "a" in by_name and "b" in by_name
assert by_name["a"]["view_count"] == 1
# b has no usage record yet — must still appear with defaults
assert by_name["b"]["view_count"] == 0
assert by_name["b"]["state"] == "active"
def test_agent_created_report_excludes_bundled_and_hub(skills_home):
from tools.skill_usage import agent_created_report
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "mine")
_write_skill(skills_dir, "bundled")
_write_skill(skills_dir, "hubbed")
(skills_dir / ".bundled_manifest").write_text("bundled:abc\n", encoding="utf-8")
hub = skills_dir / ".hub"
hub.mkdir()
(hub / "lock.json").write_text(
json.dumps({"installed": {"hubbed": {}}}), encoding="utf-8",
)
names = {r["name"] for r in agent_created_report()}
assert "mine" in names
assert "bundled" not in names
assert "hubbed" not in names
# ---------------------------------------------------------------------------
# Provenance guard — telemetry must not leak records for bundled/hub skills
# ---------------------------------------------------------------------------
def test_bump_view_no_op_for_bundled_skill(skills_home):
"""Telemetry bumps on bundled skills are dropped — the sidecar must stay
focused on agent-created skills only."""
from tools.skill_usage import bump_view, load_usage
skills_dir = skills_home / "skills"
(skills_dir / ".bundled_manifest").write_text(
"ship-bundled:abc\n", encoding="utf-8",
)
bump_view("ship-bundled")
assert "ship-bundled" not in load_usage(), (
"bundled skill leaked into .usage.json"
)
def test_bump_patch_no_op_for_hub_skill(skills_home):
from tools.skill_usage import bump_patch, load_usage
skills_dir = skills_home / "skills"
hub = skills_dir / ".hub"
hub.mkdir()
(hub / "lock.json").write_text(
json.dumps({"installed": {"from-hub": {}}}), encoding="utf-8",
)
bump_patch("from-hub")
assert "from-hub" not in load_usage()
def test_bump_use_no_op_for_hub_skill(skills_home):
from tools.skill_usage import bump_use, load_usage
skills_dir = skills_home / "skills"
hub = skills_dir / ".hub"
hub.mkdir()
(hub / "lock.json").write_text(
json.dumps({"installed": {"from-hub": {}}}), encoding="utf-8",
)
bump_use("from-hub")
assert "from-hub" not in load_usage()
def test_set_state_no_op_for_bundled_skill(skills_home):
"""State transitions on bundled skills must not land in the sidecar."""
from tools.skill_usage import set_state, load_usage, STATE_ARCHIVED
skills_dir = skills_home / "skills"
(skills_dir / ".bundled_manifest").write_text(
"locked:abc\n", encoding="utf-8",
)
set_state("locked", STATE_ARCHIVED)
assert "locked" not in load_usage()
def test_restore_refuses_to_shadow_bundled_skill(skills_home):
"""If a bundled skill now occupies the name, refuse to restore."""
from tools.skill_usage import archive_skill, restore_skill
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "shared-name")
archive_skill("shared-name")
# Now a bundled skill appears with the same name
(skills_dir / ".bundled_manifest").write_text(
"shared-name:abc\n", encoding="utf-8",
)
_write_skill(skills_dir, "shared-name") # bundled install landed
ok, msg = restore_skill("shared-name")
assert not ok
assert "bundled" in msg.lower() or "shadow" in msg.lower()
def test_end_to_end_no_code_path_mutates_bundled_skill(skills_home):
"""The combined guarantee: no curator code path can archive, mark stale,
set-state, or persist telemetry for a bundled or hub-installed skill."""
from tools.skill_usage import (
bump_view, bump_use, bump_patch, set_state, set_pinned,
archive_skill, load_usage, STATE_STALE, STATE_ARCHIVED,
)
skills_dir = skills_home / "skills"
_write_skill(skills_dir, "bundled-one")
_write_skill(skills_dir, "hub-one")
_write_skill(skills_dir, "mine")
(skills_dir / ".bundled_manifest").write_text(
"bundled-one:abc\n", encoding="utf-8",
)
hub = skills_dir / ".hub"
hub.mkdir()
(hub / "lock.json").write_text(
json.dumps({"installed": {"hub-one": {}}}), encoding="utf-8",
)
# Hammer every mutator at the bundled/hub names
for name in ("bundled-one", "hub-one"):
bump_view(name)
bump_use(name)
bump_patch(name)
set_state(name, STATE_STALE)
set_state(name, STATE_ARCHIVED)
set_pinned(name, True)
ok, _msg = archive_skill(name)
assert not ok, f"archive_skill(\"{name}\") should refuse"
# Sidecar must be clean of both upstream-managed names
data = load_usage()
assert "bundled-one" not in data
assert "hub-one" not in data
# Directories must still be in place on disk
assert (skills_dir / "bundled-one" / "SKILL.md").exists()
assert (skills_dir / "hub-one" / "SKILL.md").exists()
# The agent-created skill can still be mutated normally
bump_view("mine")
assert load_usage()["mine"]["view_count"] == 1
-128
@@ -83,134 +83,6 @@ def test_write_json_broken_pipe(server):
assert server.write_json({"x": 1}) is False
def test_write_json_closed_stream_returns_false(server):
"""ValueError ('I/O on closed file') used to bubble up; treat as gone."""
class _Closed:
def write(self, _): raise ValueError("I/O operation on closed file")
def flush(self): raise ValueError("I/O operation on closed file")
server._real_stdout = _Closed()
assert server.write_json({"x": 1}) is False
def test_write_json_unicode_encode_error_re_raises(server):
"""A non-UTF-8 stdout encoding raises UnicodeEncodeError (a ValueError
subclass). It must NOT be swallowed as 'peer gone'; that would let
`entry.py` exit cleanly via the False path and hide the real config
bug. We re-raise so the existing crash-log infrastructure records it."""
class _AsciiOnly:
def write(self, line):
line.encode("ascii") # raises UnicodeEncodeError on non-ascii
def flush(self): pass
server._real_stdout = _AsciiOnly()
with pytest.raises(UnicodeEncodeError):
server.write_json({"msg": "héllo"})
def test_write_json_unrelated_value_error_re_raises(server):
"""Only ValueError('...closed file...') means peer gone. Other
ValueErrors are programming errors and must surface."""
class _BadValue:
def write(self, _): raise ValueError("something else entirely")
def flush(self): pass
server._real_stdout = _BadValue()
with pytest.raises(ValueError, match="something else entirely"):
server.write_json({"x": 1})
def test_write_json_non_serializable_payload_re_raises(server):
"""Non-JSON-safe payloads are programming errors — they must NOT be
silently dropped via the False path (which would trigger a clean exit
in entry.py and mask the real bug)."""
import io
server._real_stdout = io.StringIO()
with pytest.raises(TypeError):
server.write_json({"obj": object()})
def test_write_json_peer_gone_oserror_on_flush_returns_false(server):
"""A flush that raises a peer-gone OSError (EPIPE) must not strand
the lock or crash; it returns False so the dispatcher exits cleanly."""
import errno
written = []
class _FlushPeerGone:
def write(self, line): written.append(line)
def flush(self): raise OSError(errno.EPIPE, "broken pipe")
server._real_stdout = _FlushPeerGone()
assert server.write_json({"x": 1}) is False
assert written and json.loads(written[0]) == {"x": 1}
def test_write_json_non_peer_gone_oserror_re_raises(server):
"""Host I/O failures (ENOSPC, EACCES, EIO …) are NOT peer-gone — they
must re-raise so the crash log records them instead of looking like
a clean disconnect via the False path."""
import errno
class _DiskFull:
def write(self, _): raise OSError(errno.ENOSPC, "no space left")
def flush(self): pass
server._real_stdout = _DiskFull()
with pytest.raises(OSError, match="no space"):
server.write_json({"x": 1})
def test_write_json_skips_flush_when_disable_flush_true(monkeypatch):
"""`StdioTransport` skips flush when `_DISABLE_FLUSH` is true.
Tests the runtime *behaviour* via direct module-attr patch. The wiring
from the env var to the module constant is covered by the dedicated env test
below; reloading server.py here would re-register atexit hooks and
recreate the worker pool.
"""
import importlib
transport_mod = importlib.import_module("tui_gateway.transport")
monkeypatch.setattr(transport_mod, "_DISABLE_FLUSH", True)
flushed = {"count": 0}
written = []
class _Stream:
def write(self, line): written.append(line)
def flush(self): flushed["count"] += 1
stream = _Stream()
transport = transport_mod.StdioTransport(lambda: stream, threading.Lock())
assert transport.write({"x": 1}) is True
assert flushed["count"] == 0
def test_disable_flush_env_var_actually_wires_to_module_constant(monkeypatch):
"""End-to-end: setting `HERMES_TUI_GATEWAY_NO_FLUSH=1` and importing
`tui_gateway.transport` fresh actually flips `_DISABLE_FLUSH` true.
Reloads only the transport module; server.py is untouched, so its
atexit hooks/worker pool stay intact."""
import importlib
monkeypatch.setenv("HERMES_TUI_GATEWAY_NO_FLUSH", "1")
transport_mod = importlib.reload(importlib.import_module("tui_gateway.transport"))
try:
assert transport_mod._DISABLE_FLUSH is True
finally:
# Unset the env var and reload so other tests see the default.
monkeypatch.delenv("HERMES_TUI_GATEWAY_NO_FLUSH", raising=False)
importlib.reload(transport_mod)
# ── _emit ────────────────────────────────────────────────────────────
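The `write_json` tests above pin down an error policy: a closed stream or a broken pipe means the peer is gone and the call returns False, while encoding errors, unrelated `ValueError`s, host I/O failures, and non-serializable payloads must surface. A minimal sketch of that policy follows. The function name `write_json_line` and the choice of `EPIPE` as the only peer-gone errno are assumptions for illustration, not the project's actual implementation.

```python
import errno
import io
import json

# Assumption: EPIPE is the only errno treated as "peer gone" in this sketch.
_PEER_GONE_ERRNOS = {errno.EPIPE}


def write_json_line(stream, payload) -> bool:
    """Return True on success, False when the peer is gone; re-raise otherwise."""
    # json.dumps raises TypeError for non-serializable payloads; let it surface.
    line = json.dumps(payload, ensure_ascii=False) + "\n"
    try:
        stream.write(line)
        stream.flush()
    except UnicodeEncodeError:
        # A ValueError subclass, but a config bug rather than a disconnect.
        raise
    except ValueError as exc:
        if "closed file" in str(exc):
            return False
        raise
    except OSError as exc:
        if isinstance(exc, BrokenPipeError) or exc.errno in _PEER_GONE_ERRNOS:
            return False
        raise
    return True


closed = io.StringIO()
closed.close()
ok_stream = io.StringIO()
results = (write_json_line(ok_stream, {"x": 1}), write_json_line(closed, {"x": 1}))
```

The `UnicodeEncodeError` branch has to come before the generic `ValueError` branch, otherwise the subclass would be caught by the message check and re-raised only by accident of its wording.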
+4 -23
@@ -164,18 +164,6 @@ HARDLINE_PATTERNS = [
(_CMDPOS + r'telinit\s+[06]\b', "telinit 0/6 (shutdown/reboot)"),
]
# Pre-compiled variant used by the hot-path matcher. Building these at module
# load eliminates the ~2.6 ms cold-cache re.compile fan-out on the first
# terminal() call per process (12 HARDLINE + 47 DANGEROUS patterns, each
# potentially evicted from Python's 512-entry ``re._cache`` by unrelated
# regex work elsewhere in the agent). DANGEROUS_PATTERNS_COMPILED is built
# at the end of this module after DANGEROUS_PATTERNS is defined.
_RE_FLAGS = re.IGNORECASE | re.DOTALL
HARDLINE_PATTERNS_COMPILED = [
(re.compile(pattern, _RE_FLAGS), description)
for pattern, description in HARDLINE_PATTERNS
]
def detect_hardline_command(command: str) -> tuple:
"""Check if a command matches the unconditional hardline blocklist.
@@ -184,8 +172,8 @@ def detect_hardline_command(command: str) -> tuple:
(is_hardline, description) or (False, None)
"""
normalized = _normalize_command_for_detection(command).lower()
for pattern_re, description in HARDLINE_PATTERNS_COMPILED:
if pattern_re.search(normalized):
for pattern, description in HARDLINE_PATTERNS:
if re.search(pattern, normalized, re.IGNORECASE | re.DOTALL):
return (True, description)
return (False, None)
@@ -279,13 +267,6 @@ DANGEROUS_PATTERNS = [
]
# Pre-compiled variant (same rationale as HARDLINE_PATTERNS_COMPILED above).
DANGEROUS_PATTERNS_COMPILED = [
(re.compile(pattern, _RE_FLAGS), description)
for pattern, description in DANGEROUS_PATTERNS
]
def _legacy_pattern_key(pattern: str) -> str:
"""Reproduce the old regex-derived approval key for backwards compatibility."""
return pattern.split(r'\b')[1] if r'\b' in pattern else pattern[:20]
@@ -338,8 +319,8 @@ def detect_dangerous_command(command: str) -> tuple:
(is_dangerous, pattern_key, description) or (False, None, None)
"""
command_lower = _normalize_command_for_detection(command).lower()
for pattern_re, description in DANGEROUS_PATTERNS_COMPILED:
if pattern_re.search(command_lower):
for pattern, description in DANGEROUS_PATTERNS:
if re.search(pattern, command_lower, re.IGNORECASE | re.DOTALL):
pattern_key = description
return (True, pattern_key, description)
return (False, None, None)
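The two matcher styles in this diff (pre-compiled pattern objects versus `re.search` on the raw pattern string at call time) should be behaviourally equivalent; they differ only in when the compile cost is paid. A small sketch, using two hypothetical patterns rather than the real HARDLINE/DANGEROUS lists, shows the equivalence:

```python
import re

_FLAGS = re.IGNORECASE | re.DOTALL

# Hypothetical patterns for illustration only; the real lists are larger.
PATTERNS = [
    (r"\bshutdown\b", "shutdown"),
    (r"\brm\s+-rf\s+/", "recursive delete from root"),
]

# Built once at import time, so no call pays the compile cost later.
COMPILED = [(re.compile(p, _FLAGS), d) for p, d in PATTERNS]


def detect_with_search(command: str) -> tuple:
    """Match using the raw pattern strings (relies on re's internal cache)."""
    for pattern, description in PATTERNS:
        if re.search(pattern, command, _FLAGS):
            return (True, description)
    return (False, None)


def detect_with_compiled(command: str) -> tuple:
    """Match using the pre-compiled pattern objects."""
    for pattern_re, description in COMPILED:
        if pattern_re.search(command):
            return (True, description)
    return (False, None)


samples = ["sudo shutdown now", "RM -RF /tmp", "ls -la"]
agree = all(detect_with_search(s) == detect_with_compiled(s) for s in samples)
```

Because the flags are baked into each compiled object, the compiled variant cannot drift if a call site forgets to pass them.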
-5
@@ -335,10 +335,6 @@ class BaseEnvironment(ABC):
instead of running with ``bash -l``.
"""
# Full capture: env vars, functions (filtered), aliases, shell options.
# Restore configured cwd after login shell profile scripts, which may
# change the working directory (e.g. bashrc `cd ~`). Without this,
# pwd -P captures the profile's directory, not terminal.cwd.
_quoted_cwd = shlex.quote(self.cwd)
bootstrap = (
f"export -p > {self._snapshot_path}\n"
f"declare -f | grep -vE '^_[^_]' >> {self._snapshot_path}\n"
@@ -346,7 +342,6 @@ class BaseEnvironment(ABC):
f"echo 'shopt -s expand_aliases' >> {self._snapshot_path}\n"
f"echo 'set +e' >> {self._snapshot_path}\n"
f"echo 'set +u' >> {self._snapshot_path}\n"
f"builtin cd {_quoted_cwd} 2>/dev/null || true\n"
f"pwd -P > {self._cwd_file} 2>/dev/null || true\n"
f"printf '\\n{self._cwd_marker}%s{self._cwd_marker}\\n' \"$(pwd -P)\"\n"
)
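The `builtin cd` line in this hunk depends on `shlex.quote` to keep a working directory containing spaces or shell metacharacters from being split or interpreted. A minimal sketch of building such a line follows; the helper name `cwd_restore_line` is hypothetical.

```python
import shlex


def cwd_restore_line(cwd: str) -> str:
    """Build a shell line that returns to *cwd* and tolerates failure."""
    # `builtin` bypasses any `cd` function or alias defined by profile scripts.
    return f"builtin cd {shlex.quote(cwd)} 2>/dev/null || true\n"


line = cwd_restore_line("/tmp/my project")
plain = cwd_restore_line("/home/user")
```

Paths made only of safe characters pass through unquoted, so the generated script stays readable in the common case.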
-2
@@ -305,8 +305,6 @@ class LocalEnvironment(BaseEnvironment):
"""
def __init__(self, cwd: str = "", timeout: int = 60, env: dict = None):
if cwd:
cwd = os.path.expanduser(cwd)
super().__init__(cwd=cwd or os.getcwd(), timeout=timeout, env=env)
self.init_session()
+7 -62
@@ -19,7 +19,6 @@ import importlib
import json
import logging
import threading
import time
from pathlib import Path
from typing import Callable, Dict, List, Optional, Set
@@ -98,48 +97,6 @@ class ToolEntry:
self.max_result_size_chars = max_result_size_chars
# ---------------------------------------------------------------------------
# check_fn TTL cache
#
# check_fn callables like tools/terminal_tool.check_terminal_requirements
# probe external state (Docker daemon, Modal SDK install, playwright binary
# availability). For a long-lived CLI or gateway process, calling them on
# every get_definitions() is pure waste — external state changes on human
# timescales. Cache results for ~30 s so env-var flips via ``hermes tools``
# or live credential file changes propagate within a turn or two without
# requiring any explicit invalidation.
# ---------------------------------------------------------------------------
_CHECK_FN_TTL_SECONDS = 30.0
_check_fn_cache: Dict[Callable, tuple[float, bool]] = {}
_check_fn_cache_lock = threading.Lock()
def _check_fn_cached(fn: Callable) -> bool:
"""Return bool(fn()), TTL-cached across calls. Swallows exceptions as False."""
now = time.monotonic()
with _check_fn_cache_lock:
cached = _check_fn_cache.get(fn)
if cached is not None:
ts, value = cached
if now - ts < _CHECK_FN_TTL_SECONDS:
return value
try:
value = bool(fn())
except Exception:
value = False
with _check_fn_cache_lock:
_check_fn_cache[fn] = (now, value)
return value
def invalidate_check_fn_cache() -> None:
"""Drop all cached ``check_fn`` results. Call after config changes that
affect tool availability (e.g. ``hermes tools enable``)."""
with _check_fn_cache_lock:
_check_fn_cache.clear()
class ToolRegistry:
"""Singleton registry that collects tool schemas + handlers from tool files."""
@@ -151,12 +108,6 @@ class ToolRegistry:
# reading tool metadata, so keep mutations serialized and readers on
# stable snapshots.
self._lock = threading.RLock()
# Monotonically-increasing generation counter. Bumped on every
# mutation (register / deregister / register_toolset_alias / MCP
# refresh). External callers (e.g. get_tool_definitions) can memoize
# against it: a cache entry keyed on the generation is valid for as
# long as the generation hasn't changed.
self._generation: int = 0
def _snapshot_state(self) -> tuple[List[ToolEntry], Dict[str, Callable]]:
"""Return a coherent snapshot of registry entries and toolset checks."""
@@ -207,7 +158,6 @@ class ToolRegistry:
alias, existing, toolset,
)
self._toolset_aliases[alias] = toolset
self._generation += 1
def get_registered_toolset_aliases(self) -> Dict[str, str]:
"""Return a snapshot of ``{alias: canonical_toolset}`` mappings."""
@@ -275,7 +225,6 @@ class ToolRegistry:
)
if check_fn and toolset not in self._toolset_checks:
self._toolset_checks[toolset] = check_fn
self._generation += 1
def deregister(self, name: str) -> None:
"""Remove a tool from the registry.
@@ -300,7 +249,6 @@ class ToolRegistry:
for alias, target in self._toolset_aliases.items()
if target != entry.toolset
}
self._generation += 1
logger.debug("Deregistered tool: %s", name)
# ------------------------------------------------------------------
@@ -311,17 +259,9 @@ class ToolRegistry:
"""Return OpenAI-format tool schemas for the requested tool names.
Only tools whose ``check_fn()`` returns True (or have no check_fn)
are included. ``check_fn()`` results are cached for ~30 s via
:func:`_check_fn_cached` to amortize repeat probes (check_terminal_
requirements probes modal/docker, browser checks probe playwright,
etc.); TTL chosen so env-var changes (``hermes tools enable foo``)
still take effect in near-real-time without forcing a full cache
flush on every call.
are included.
"""
result = []
# Per-call cache on top of the 30 s TTL — handles repeat probes of the
# same check_fn within one definitions pass without re-reading the
# TTL clock.
check_results: Dict[Callable, bool] = {}
entries_by_name = {entry.name: entry for entry in self._snapshot_entries()}
for name in sorted(tool_names):
@@ -330,7 +270,12 @@ class ToolRegistry:
continue
if entry.check_fn:
if entry.check_fn not in check_results:
check_results[entry.check_fn] = _check_fn_cached(entry.check_fn)
try:
check_results[entry.check_fn] = bool(entry.check_fn())
except Exception:
check_results[entry.check_fn] = False
if not quiet:
logger.debug("Tool %s check raised; skipping", name)
if not check_results[entry.check_fn]:
if not quiet:
logger.debug("Tool %s unavailable (check failed)", name)
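The generation-counter comment in this file describes how an external caller could memoize against the registry: a cached value stays valid for as long as the generation it was built under is unchanged. The diff does not show such a caller, so the sketch below is an illustration under assumptions. `MiniRegistry` and `DefinitionsCache` are hypothetical names, not part of the project.

```python
import threading
from typing import Dict, List, Optional, Tuple


class MiniRegistry:
    """Hypothetical stand-in for a registry with a generation counter."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._tools: Dict[str, dict] = {}
        self.generation = 0

    def register(self, name: str, schema: dict) -> None:
        with self._lock:
            self._tools[name] = schema
            self.generation += 1  # every mutation bumps the counter

    def definitions(self) -> List[dict]:
        with self._lock:
            return [self._tools[n] for n in sorted(self._tools)]


class DefinitionsCache:
    """Rebuilds the definitions list only when the generation changes."""

    def __init__(self, registry: MiniRegistry) -> None:
        self._registry = registry
        self._cached: Optional[Tuple[int, List[dict]]] = None
        self.rebuilds = 0

    def get(self) -> List[dict]:
        gen = self._registry.generation
        if self._cached is not None and self._cached[0] == gen:
            return self._cached[1]
        value = self._registry.definitions()
        self._cached = (gen, value)
        self.rebuilds += 1
        return value


registry = MiniRegistry()
registry.register("a", {"name": "a"})
cache = DefinitionsCache(registry)
cache.get()
cache.get()  # served from cache: generation unchanged
registry.register("b", {"name": "b"})
latest = cache.get()  # generation changed, so the list is rebuilt
```

Reading the generation before building the value means a concurrent mutation can at worst store fresh data under an older generation, which only causes one extra rebuild on the next call.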
-11
@@ -700,17 +700,6 @@ def skill_manage(
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
# Curator telemetry: bump patch_count on patch/edit/write_file/remove_file
# (the actions that mutate an existing skill), drop the record on delete.
# Best-effort; telemetry failures never break the tool.
try:
from tools.skill_usage import bump_patch, forget
if action in ("patch", "edit", "write_file", "remove_file"):
bump_patch(name)
elif action == "delete":
forget(name)
except Exception:
pass
return json.dumps(result, ensure_ascii=False)
-456
@@ -1,456 +0,0 @@
"""Skill usage telemetry + provenance tracking for the Curator feature.
Tracks per-skill usage metadata in a sidecar JSON file (~/.hermes/skills/.usage.json)
keyed by skill name. Counters are bumped by the existing skill tools (skill_view,
skill_manage); the curator orchestrator reads them to decide lifecycle transitions.
Design notes:
- Sidecar, not frontmatter. Keeps operational telemetry out of user-authored
SKILL.md content and avoids conflict pressure for bundled/hub skills.
- Atomic writes via tempfile + os.replace (same pattern as .bundled_manifest).
- All counter bumps are best-effort: failures log at DEBUG and return silently.
A broken sidecar never breaks the underlying tool call.
- Provenance filter: "agent-created" == not in .bundled_manifest AND not in
.hub/lock.json. The curator only ever mutates agent-created skills.
Lifecycle states:
active -> default
stale -> unused > stale_after_days (config)
archived -> unused > archive_after_days (config); moved to .archive/
pinned -> opt-out from auto transitions (boolean flag, orthogonal to state)
"""
from __future__ import annotations
import json
import logging
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
STATE_ACTIVE = "active"
STATE_STALE = "stale"
STATE_ARCHIVED = "archived"
_VALID_STATES = {STATE_ACTIVE, STATE_STALE, STATE_ARCHIVED}
def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _usage_file() -> Path:
return _skills_dir() / ".usage.json"
def _archive_dir() -> Path:
return _skills_dir() / ".archive"
def _now_iso() -> str:
return datetime.now(timezone.utc).isoformat()
# ---------------------------------------------------------------------------
# Provenance — which skills are agent-created (and thus eligible for curation)
# ---------------------------------------------------------------------------
def _read_bundled_manifest_names() -> Set[str]:
"""Return the set of skill names that were seeded from the bundled repo.
Reads ~/.hermes/skills/.bundled_manifest (format: "name:hash" per line).
Returns empty set if the file is missing or unreadable.
"""
manifest = _skills_dir() / ".bundled_manifest"
if not manifest.exists():
return set()
names: Set[str] = set()
try:
for line in manifest.read_text(encoding="utf-8").splitlines():
line = line.strip()
if not line:
continue
name = line.split(":", 1)[0].strip()
if name:
names.add(name)
except OSError as e:
logger.debug("Failed to read bundled manifest: %s", e)
return names
def _read_hub_installed_names() -> Set[str]:
"""Return the set of skill names installed via the Skills Hub.
Reads ~/.hermes/skills/.hub/lock.json (see tools/skills_hub.py :: HubLockFile).
"""
lock_path = _skills_dir() / ".hub" / "lock.json"
if not lock_path.exists():
return set()
try:
data = json.loads(lock_path.read_text(encoding="utf-8"))
if isinstance(data, dict):
installed = data.get("installed") or {}
if isinstance(installed, dict):
return {str(k) for k in installed.keys()}
except (OSError, json.JSONDecodeError) as e:
logger.debug("Failed to read hub lock file: %s", e)
return set()
def list_agent_created_skill_names() -> List[str]:
"""Enumerate skills that were authored by the agent (or user), NOT by a
bundled or hub-installed source.
The curator operates exclusively on this set. Bundled / hub skills are
maintained by their upstream sources and must never be pruned here.
"""
base = _skills_dir()
if not base.exists():
return []
bundled = _read_bundled_manifest_names()
hub = _read_hub_installed_names()
off_limits = bundled | hub
names: List[str] = []
# Top-level SKILL.md files (flat layout) AND nested category/skill/SKILL.md
for skill_md in base.rglob("SKILL.md"):
# Skip anything under a dot-prefixed dir (.archive, .hub, ...) or node_modules
try:
rel = skill_md.relative_to(base)
except ValueError:
continue
parts = rel.parts
if parts and (parts[0].startswith(".") or parts[0] == "node_modules"):
continue
name = _read_skill_name(skill_md, fallback=skill_md.parent.name)
if name in off_limits:
continue
names.append(name)
return sorted(set(names))
def _read_skill_name(skill_md: Path, fallback: str) -> str:
"""Parse the `name:` field from a SKILL.md YAML frontmatter."""
try:
text = skill_md.read_text(encoding="utf-8", errors="replace")[:4000]
except OSError:
return fallback
in_frontmatter = False
for line in text.split("\n"):
stripped = line.strip()
if stripped == "---":
if in_frontmatter:
break
in_frontmatter = True
continue
if in_frontmatter and stripped.startswith("name:"):
value = stripped.split(":", 1)[1].strip().strip("\"'")
if value:
return value
return fallback
def is_agent_created(skill_name: str) -> bool:
"""Whether *skill_name* is neither bundled nor hub-installed."""
off_limits = _read_bundled_manifest_names() | _read_hub_installed_names()
return skill_name not in off_limits
# ---------------------------------------------------------------------------
# Sidecar I/O
# ---------------------------------------------------------------------------
def _empty_record() -> Dict[str, Any]:
return {
"use_count": 0,
"view_count": 0,
"last_used_at": None,
"last_viewed_at": None,
"patch_count": 0,
"last_patched_at": None,
"created_at": _now_iso(),
"state": STATE_ACTIVE,
"pinned": False,
"archived_at": None,
}
def load_usage() -> Dict[str, Dict[str, Any]]:
"""Read the entire .usage.json map. Returns empty dict on missing/corrupt."""
path = _usage_file()
if not path.exists():
return {}
try:
data = json.loads(path.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError) as e:
logger.debug("Failed to read %s: %s", path, e)
return {}
if not isinstance(data, dict):
return {}
# Defensive: drop any entries whose value is not a dict
clean: Dict[str, Dict[str, Any]] = {}
for k, v in data.items():
if isinstance(v, dict):
clean[str(k)] = v
return clean
def save_usage(data: Dict[str, Dict[str, Any]]) -> None:
"""Write the usage map atomically. Best-effort — errors are logged, not raised."""
path = _usage_file()
try:
path.parent.mkdir(parents=True, exist_ok=True)
fd, tmp_path = tempfile.mkstemp(
dir=str(path.parent), prefix=".usage_", suffix=".tmp"
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, sort_keys=True, ensure_ascii=False)
f.flush()
os.fsync(f.fileno())
os.replace(tmp_path, path)
except BaseException:
try:
os.unlink(tmp_path)
except OSError:
pass
raise
except Exception as e:
logger.debug("Failed to write %s: %s", path, e, exc_info=True)
def get_record(skill_name: str) -> Dict[str, Any]:
"""Return the record for *skill_name*, or a fresh default record (not persisted) if missing."""
data = load_usage()
rec = data.get(skill_name)
if not isinstance(rec, dict):
return _empty_record()
# Backfill any missing keys so callers don't need to handle old files
base = _empty_record()
for k, v in base.items():
rec.setdefault(k, v)
return rec
def _mutate(skill_name: str, mutator) -> None:
"""Load, apply *mutator(record)* in place, save. Best-effort.
Bundled and hub-installed skills are NEVER recorded in the sidecar.
This keeps .usage.json focused on agent-created skills (the only ones
the curator considers) and prevents stale counters from hanging around
for upstream-managed skills.
"""
if not skill_name:
return
try:
if not is_agent_created(skill_name):
return
data = load_usage()
rec = data.get(skill_name)
if not isinstance(rec, dict):
rec = _empty_record()
mutator(rec)
data[skill_name] = rec
save_usage(data)
except Exception as e:
logger.debug("skill_usage._mutate(%s) failed: %s", skill_name, e, exc_info=True)
# ---------------------------------------------------------------------------
# Public counter-bump helpers
# ---------------------------------------------------------------------------
def bump_view(skill_name: str) -> None:
"""Bump view_count and last_viewed_at. Called from skill_view()."""
def _apply(rec: Dict[str, Any]) -> None:
rec["view_count"] = int(rec.get("view_count") or 0) + 1
rec["last_viewed_at"] = _now_iso()
_mutate(skill_name, _apply)
def bump_use(skill_name: str) -> None:
"""Bump use_count and last_used_at. Called when a skill is actively used
(e.g. loaded into the prompt path or referenced from an assistant turn)."""
def _apply(rec: Dict[str, Any]) -> None:
rec["use_count"] = int(rec.get("use_count") or 0) + 1
rec["last_used_at"] = _now_iso()
_mutate(skill_name, _apply)
def bump_patch(skill_name: str) -> None:
"""Bump patch_count and last_patched_at. Called from skill_manage (patch/edit/write_file/remove_file)."""
def _apply(rec: Dict[str, Any]) -> None:
rec["patch_count"] = int(rec.get("patch_count") or 0) + 1
rec["last_patched_at"] = _now_iso()
_mutate(skill_name, _apply)
def set_state(skill_name: str, state: str) -> None:
"""Set lifecycle state. No-op if *state* is invalid."""
if state not in _VALID_STATES:
logger.debug("set_state: invalid state %r for %s", state, skill_name)
return
def _apply(rec: Dict[str, Any]) -> None:
rec["state"] = state
if state == STATE_ARCHIVED:
rec["archived_at"] = _now_iso()
elif state == STATE_ACTIVE:
rec["archived_at"] = None
_mutate(skill_name, _apply)
def set_pinned(skill_name: str, pinned: bool) -> None:
def _apply(rec: Dict[str, Any]) -> None:
rec["pinned"] = bool(pinned)
_mutate(skill_name, _apply)
def forget(skill_name: str) -> None:
"""Drop a skill's usage entry entirely. Called when the skill is deleted."""
if not skill_name:
return
try:
data = load_usage()
if skill_name in data:
del data[skill_name]
save_usage(data)
except Exception as e:
logger.debug("skill_usage.forget(%s) failed: %s", skill_name, e, exc_info=True)
# ---------------------------------------------------------------------------
# Archive / restore
# ---------------------------------------------------------------------------
def archive_skill(skill_name: str) -> Tuple[bool, str]:
"""Move an agent-created skill directory to ~/.hermes/skills/.archive/.
Returns (ok, message). Never archives bundled or hub skills; callers are
responsible for checking provenance, but we double-check here as a safety net.
"""
if not is_agent_created(skill_name):
return False, f"skill '{skill_name}' is bundled or hub-installed; never archive"
skill_dir = _find_skill_dir(skill_name)
if skill_dir is None:
return False, f"skill '{skill_name}' not found"
archive_root = _archive_dir()
try:
archive_root.mkdir(parents=True, exist_ok=True)
except OSError as e:
return False, f"failed to create archive dir: {e}"
# Flatten any category nesting into a single ".archive/<skill>/" so restores
# are simple. If a collision exists, append a timestamp.
dest = archive_root / skill_dir.name
if dest.exists():
dest = archive_root / f"{skill_dir.name}-{datetime.now(timezone.utc).strftime('%Y%m%d%H%M%S')}"
try:
skill_dir.rename(dest)
except OSError as e:
# Cross-device — fall back to shutil.move
import shutil
try:
shutil.move(str(skill_dir), str(dest))
except Exception as e2:
return False, f"failed to archive: {e2}"
set_state(skill_name, STATE_ARCHIVED)
return True, f"archived to {dest}"
def restore_skill(skill_name: str) -> Tuple[bool, str]:
"""Move an archived skill back to ~/.hermes/skills/. Restores to the flat
top-level layout; original category nesting is NOT reconstructed.
Refuses to restore under a name that now collides with a bundled or
hub-installed skill, because that would shadow the upstream version.
"""
# If a bundled or hub skill has since been installed under the same
# name, refuse to restore rather than shadow it.
if not is_agent_created(skill_name):
return False, (
f"skill '{skill_name}' is now bundled or hub-installed; "
"restore would shadow the upstream version"
)
archive_root = _archive_dir()
if not archive_root.exists():
return False, "no archive directory"
# Try exact name match first, then any prefix match (for timestamped dupes)
candidates = [p for p in archive_root.iterdir() if p.is_dir() and p.name == skill_name]
if not candidates:
candidates = sorted(
[p for p in archive_root.iterdir()
if p.is_dir() and p.name.startswith(f"{skill_name}-")],
reverse=True,
)
if not candidates:
return False, f"skill '{skill_name}' not found in archive"
src = candidates[0]
dest = _skills_dir() / skill_name
if dest.exists():
return False, f"destination already exists: {dest}"
try:
src.rename(dest)
except OSError:
import shutil
try:
shutil.move(str(src), str(dest))
except Exception as e:
return False, f"failed to restore: {e}"
set_state(skill_name, STATE_ACTIVE)
return True, f"restored to {dest}"
def _find_skill_dir(skill_name: str) -> Optional[Path]:
"""Locate the directory for a skill by its frontmatter `name:` field.
Handles both flat (~/.hermes/skills/<skill>/SKILL.md) and category-nested
(~/.hermes/skills/<category>/<skill>/SKILL.md) layouts.
"""
base = _skills_dir()
if not base.exists():
return None
for skill_md in base.rglob("SKILL.md"):
try:
rel = skill_md.relative_to(base)
except ValueError:
continue
if rel.parts and rel.parts[0].startswith("."):
continue
if _read_skill_name(skill_md, fallback=skill_md.parent.name) == skill_name:
return skill_md.parent
return None
# ---------------------------------------------------------------------------
# Reporting — for the curator CLI / slash command
# ---------------------------------------------------------------------------
def agent_created_report() -> List[Dict[str, Any]]:
"""Return a list of {name, state, pinned, last_used_at, use_count, ...}
records for every agent-created skill. Missing usage records are backfilled
with defaults so callers can always index fields."""
data = load_usage()
rows: List[Dict[str, Any]] = []
for name in list_agent_created_skill_names():
rec = data.get(name)
if not isinstance(rec, dict):
rec = _empty_record()
base = _empty_record()
for k, v in base.items():
rec.setdefault(k, v)
rows.append({"name": name, **rec})
return rows
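The candidate lookup in `restore_skill` (exact directory name first, then the newest timestamped duplicate) can be sketched in isolation. This is a minimal illustration, not the module's code: `pick_archived_dir` is a hypothetical helper name, and the directory names are made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def pick_archived_dir(archive_root: Path, skill_name: str) -> Optional[Path]:
    """Hypothetical helper: exact name first, else the newest timestamped duplicate."""
    dirs = [p for p in archive_root.iterdir() if p.is_dir()]
    exact = [p for p in dirs if p.name == skill_name]
    if exact:
        return exact[0]
    # Collision suffixes use %Y%m%d%H%M%S, so a reverse lexicographic sort
    # on the directory name puts the most recent archive first.
    dupes = sorted(
        (p for p in dirs if p.name.startswith(f"{skill_name}-")),
        key=lambda p: p.name,
        reverse=True,
    )
    return dupes[0] if dupes else None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes-20260101000000").mkdir()
    (root / "notes-20260301000000").mkdir()
    newest_name = pick_archived_dir(root, "notes").name
    # Once an exact match exists it wins over any timestamped duplicate.
    (root / "notes").mkdir()
    exact_name = pick_archived_dir(root, "notes").name
    missing = pick_archived_dir(root, "other")
```

One caveat of the prefix rule as written: `notes-` would also match a differently named skill such as `notes-extra`, so a stricter variant might require the suffix to be exactly 14 digits.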
+3 -22
@@ -1480,32 +1480,13 @@ registry.register(
check_fn=check_skills_requirements,
emoji="📚",
)
def _skill_view_with_bump(args, **kw):
"""Invoke skill_view, then bump view_count on success. Best-effort: a
telemetry failure never breaks the tool call."""
name = args.get("name", "")
result = skill_view(
name, file_path=args.get("file_path"), task_id=kw.get("task_id")
)
try:
parsed = json.loads(result)
if isinstance(parsed, dict) and parsed.get("success"):
# Use the resolved skill name from the payload when present —
# qualified forms ("plugin:skill") return with the canonical name.
resolved = parsed.get("name") or name
if resolved:
from tools.skill_usage import bump_view
bump_view(str(resolved))
except Exception:
pass
return result
registry.register(
name="skill_view",
toolset="skills",
schema=SKILL_VIEW_SCHEMA,
handler=_skill_view_with_bump,
handler=lambda args, **kw: skill_view(
args.get("name", ""), file_path=args.get("file_path"), task_id=kw.get("task_id")
),
check_fn=check_skills_requirements,
emoji="📚",
)
-2
@@ -925,8 +925,6 @@ def _get_env_config() -> Dict[str, Any]:
# /workspace and track the original host path separately. Otherwise keep the
# normal sandbox behavior and discard host paths.
cwd = os.getenv("TERMINAL_CWD", default_cwd)
if cwd:
cwd = os.path.expanduser(cwd)
host_cwd = None
host_prefixes = ("/Users/", "/home/", "C:\\", "C:/")
if env_type == "docker" and mount_docker_cwd:
+5 -39
@@ -45,47 +45,12 @@ import logging
import os
import re
import asyncio
from typing import List, Dict, Any, Optional, TYPE_CHECKING
from typing import List, Dict, Any, Optional
import httpx
# NOTE: `from firecrawl import Firecrawl` is deliberately NOT at module top —
# the SDK pulls ~200 ms of imports (httpcore, firecrawl.v1/v2 type trees) and
# we only need it when the backend is actually "firecrawl". We expose
# ``Firecrawl`` as a thin proxy that imports the SDK on first call/
# isinstance check, so both (a) the in-module ``Firecrawl(...)`` construction
# site in _get_firecrawl_client() works unchanged, and (b) tests using
# ``patch("tools.web_tools.Firecrawl", ...)`` keep working.
if TYPE_CHECKING:
from firecrawl import Firecrawl # noqa: F401 — type hints only
_FIRECRAWL_CLS_CACHE: Optional[type] = None
def _load_firecrawl_cls() -> type:
"""Import and cache ``firecrawl.Firecrawl``."""
global _FIRECRAWL_CLS_CACHE
if _FIRECRAWL_CLS_CACHE is None:
from firecrawl import Firecrawl as _cls
_FIRECRAWL_CLS_CACHE = _cls
return _FIRECRAWL_CLS_CACHE
class _FirecrawlProxy:
"""Module-level proxy that looks like ``firecrawl.Firecrawl`` but imports lazily."""
__slots__ = ()
def __call__(self, *args, **kwargs):
return _load_firecrawl_cls()(*args, **kwargs)
def __instancecheck__(self, obj):
return isinstance(obj, _load_firecrawl_cls())
def __repr__(self):
return "<lazy firecrawl.Firecrawl proxy>"
Firecrawl = _FirecrawlProxy()
# we only need it when the backend is actually "firecrawl". See
# _get_firecrawl_client() below for the lazy import.
from agent.auxiliary_client import (
async_call_llm,
extract_content_or_reasoning,
@@ -274,7 +239,8 @@ def _get_firecrawl_client():
if _firecrawl_client is not None and _firecrawl_client_config == client_config:
return _firecrawl_client
# Uses the module-level `Firecrawl` name (lazy proxy at module top).
# Lazy import — ~200 ms of SDK init, only paid when firecrawl is actually used.
from firecrawl import Firecrawl # noqa: E402
_firecrawl_client = Firecrawl(**kwargs)
_firecrawl_client_config = client_config
return _firecrawl_client
+1 -56
@@ -29,28 +29,6 @@ def _install_sidecar_publisher() -> None:
)
# How long to wait for orderly shutdown (atexit + finalisers) before
# falling back to ``os._exit(0)`` so a wedged worker mid-flush can't
# strand the process. 1s covers the gateway's own shutdown work
# (thread-pool drain + session finalize) on every machine we've
# tested; override via ``HERMES_TUI_GATEWAY_SHUTDOWN_GRACE_S`` if a
# slower environment needs more headroom (e.g. encrypted disks
# flushing checkpoints) and accept that a longer grace also means a
# longer wait when shutdown actually deadlocks.
_DEFAULT_SHUTDOWN_GRACE_S = 1.0
def _shutdown_grace_seconds() -> float:
raw = (os.environ.get("HERMES_TUI_GATEWAY_SHUTDOWN_GRACE_S") or "").strip()
if not raw:
return _DEFAULT_SHUTDOWN_GRACE_S
try:
value = float(raw)
except ValueError:
return _DEFAULT_SHUTDOWN_GRACE_S
return value if value > 0 else _DEFAULT_SHUTDOWN_GRACE_S
def _log_signal(signum: int, frame) -> None:
"""Capture WHICH thread and WHERE a termination signal hit us.
@@ -60,15 +38,6 @@ def _log_signal(signum: int, frame) -> None:
handler the gateway-exited banner in the TUI has no trace: the crash
log never sees a Python exception because the kernel reaps
the process before the interpreter runs anything.
Termination semantics: ``sys.exit(0)`` here used to race the worker
pool: a thread holding ``_stdout_lock`` mid-flush would block the
interpreter shutdown indefinitely. We now log the stack, give the
process the configured shutdown grace
(``HERMES_TUI_GATEWAY_SHUTDOWN_GRACE_S``, default
``_DEFAULT_SHUTDOWN_GRACE_S``) to drain naturally on a background
thread, and fall back to ``os._exit(0)`` so a wedged write/flush
can never strand the process.
"""
name = {
signal.SIGPIPE: "SIGPIPE",
@@ -93,31 +62,7 @@ def _log_signal(signum: int, frame) -> None:
except Exception:
pass
print(f"[gateway-signal] {name}", file=sys.stderr, flush=True)
import threading as _threading
def _hard_exit() -> None:
# If a worker thread is still mid-flush on a half-closed pipe,
# ``sys.exit(0)`` would wait forever for it to drop the GIL on
# interpreter shutdown. ``os._exit`` skips atexit handlers but
# breaks the deadlock. The crash log + stderr line above are
# the forensic trail.
os._exit(0)
timer = _threading.Timer(_shutdown_grace_seconds(), _hard_exit)
timer.daemon = True
timer.start()
try:
sys.exit(0)
except SystemExit:
# Re-raise so the main-thread interpreter unwinds and runs
# atexit + finalisers inside the grace window. Python signal
# handlers always run on the main thread, but a worker thread
# holding ``_stdout_lock`` mid-flush can keep that unwind
# waiting indefinitely; the daemon timer above is the safety
# net for that exact case.
raise
sys.exit(0)
# SIGPIPE: ignore, don't exit. The old SIG_DFL killed the process
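The grace-period logic in the hunks above combines two pieces: tolerant parsing of `HERMES_TUI_GATEWAY_SHUTDOWN_GRACE_S` and a daemon timer that fires if orderly shutdown stalls. A minimal sketch of both follows, assuming the environment is passed in as a mapping and the timeout action is a harmless callback instead of `os._exit(0)`; `grace_seconds` and `arm_watchdog` are illustrative names, not the gateway's.

```python
import threading
from typing import Callable, Mapping

DEFAULT_GRACE_S = 1.0  # same default as _DEFAULT_SHUTDOWN_GRACE_S in the diff


def grace_seconds(env: Mapping[str, str]) -> float:
    """Parse the grace override; fall back to the default on bad input."""
    raw = (env.get("HERMES_TUI_GATEWAY_SHUTDOWN_GRACE_S") or "").strip()
    if not raw:
        return DEFAULT_GRACE_S
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_GRACE_S
    # Zero or negative values would defeat the watchdog, so use the default.
    return value if value > 0 else DEFAULT_GRACE_S


def arm_watchdog(delay_s: float, on_timeout: Callable[[], None]) -> threading.Timer:
    """Start a daemon timer that calls on_timeout unless cancelled first."""
    timer = threading.Timer(delay_s, on_timeout)
    timer.daemon = True  # never keeps the interpreter alive on its own
    timer.start()
    return timer


# Demo: the callback only sets an event; the real handler would hard-exit.
fired = threading.Event()
watchdog = arm_watchdog(0.05, fired.set)
fired_in_time = fired.wait(timeout=2.0)
```

Marking the timer as a daemon matters here: if the normal unwind finishes inside the grace window, the pending timer does not hold the process open.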
+81 -294
@@ -140,7 +140,6 @@ _SLASH_WORKER_TIMEOUT_S = max(
# response writes are safe.
_LONG_HANDLERS = frozenset(
{
"browser.manage",
"cli.exec",
"session.branch",
"session.resume",
@@ -492,13 +491,6 @@ def _normalize_completion_path(path_part: str) -> str:
# ── Config I/O ────────────────────────────────────────────────────────
# Keep aligned with `INDICATOR_STYLES` / `DEFAULT_INDICATOR_STYLE` in
# ``ui-tui/src/app/interfaces.ts`` — both ends validate against the
# same shape so `config.get indicator` and the live TUI render agree.
_INDICATOR_STYLES: tuple[str, ...] = ("ascii", "emoji", "kaomoji", "unicode")
_INDICATOR_DEFAULT = "kaomoji"
def _load_cfg() -> dict:
global _cfg_cache, _cfg_mtime, _cfg_path
try:
@@ -691,21 +683,6 @@ def _coerce_statusbar(raw) -> str:
return "top"
def _display_mouse_tracking(display: dict) -> bool:
"""Return canonical display.mouse_tracking with legacy tui_mouse fallback."""
if not isinstance(display, dict):
return True
if "mouse_tracking" in display:
raw = display.get("mouse_tracking")
else:
raw = display.get("tui_mouse", True)
if raw is False or raw == 0:
return False
if isinstance(raw, str):
return raw.strip().lower() not in {"0", "false", "no", "off"}
return True
def _load_reasoning_config() -> dict | None:
from hermes_constants import parse_reasoning_effort
@@ -1046,6 +1023,17 @@ def _session_info(agent) -> dict:
info["mcp_servers"] = get_mcp_status()
except Exception:
info["mcp_servers"] = []
try:
from hermes_cli.learning_ledger import build_learning_ledger
ledger = build_learning_ledger(_get_db(), limit=1)
info["learning"] = {
"counts": ledger.get("counts", {}),
"inventory": ledger.get("inventory", {}),
"total": ledger.get("total", 0),
}
except Exception:
pass
try:
from hermes_cli.banner import get_update_result
from hermes_cli.config import recommended_update_command
@@ -1168,6 +1156,16 @@ def _on_tool_complete(sid: str, tool_call_id: str, name: str, args: dict, result
pass
if _tool_progress_enabled(sid) or payload.get("inline_diff"):
_emit("tool.complete", sid, payload)
try:
from hermes_cli.learning_ledger import learning_event_from_tool
event = learning_event_from_tool(name, args, result)
if event:
if session is not None:
session.setdefault("learning_events", []).append(event)
_emit("learning.event", sid, event)
except Exception:
pass
def _on_tool_progress(
@@ -2444,6 +2442,7 @@ def _(rid, params: dict) -> dict:
if session.get("running"):
return _err(rid, 4009, "session busy")
session["running"] = True
session["learning_events"] = []
history = list(session["history"])
history_version = int(session.get("history_version", 0))
images = list(session.get("attached_images", []))
@@ -2607,6 +2606,9 @@ def _(rid, params: dict) -> dict:
payload["reasoning"] = last_reasoning
if status_note:
payload["warning"] = status_note
learning_events = list(session.get("learning_events") or [])
if learning_events:
payload["learning_events"] = learning_events
rendered = render_message(raw, cols)
if rendered:
payload["rendered"] = rendered
@@ -3188,9 +3190,12 @@ def _(rid, params: dict) -> dict:
if key == "mouse":
raw = str(value or "").strip().lower()
cfg = _load_cfg()
display = cfg.get("display") if isinstance(cfg.get("display"), dict) else {}
current = _display_mouse_tracking(display)
display = (
_load_cfg().get("display")
if isinstance(_load_cfg().get("display"), dict)
else {}
)
current = bool(display.get("tui_mouse", True))
if raw in ("", "toggle"):
nv = not current
@@ -3201,23 +3206,9 @@ def _(rid, params: dict) -> dict:
else:
return _err(rid, 4002, f"unknown mouse value: {value}")
_write_config_key("display.mouse_tracking", nv)
_write_config_key("display.tui_mouse", nv)
return _ok(rid, {"key": key, "value": "on" if nv else "off"})
if key == "indicator":
# Use an explicit None check rather than `value or ""` so falsy
# non-string inputs (0, False, []) still surface as themselves
# in the error message instead of looking like a blank value.
raw = ("" if value is None else str(value)).strip().lower()
if raw not in _INDICATOR_STYLES:
return _err(
rid,
4002,
f"unknown indicator: {raw!r}; pick one of {'|'.join(_INDICATOR_STYLES)}",
)
_write_config_key("display.tui_status_indicator", raw)
return _ok(rid, {"key": key, "value": raw})
if key in ("prompt", "personality", "skin"):
try:
cfg = _load_cfg()
@@ -3288,18 +3279,6 @@ def _(rid, params: dict) -> dict:
return _ok(
rid, {"value": (_load_cfg().get("display") or {}).get("skin", "default")}
)
if key == "indicator":
# Normalize so a hand-edited config.yaml with stray casing or
# an unknown value reads back the SAME value the TUI actually
# rendered (frontend's `normalizeIndicatorStyle` falls back to
# `_INDICATOR_DEFAULT` for the same inputs). Otherwise
# `/indicator` would print one thing while the UI shows another.
raw = (_load_cfg().get("display") or {}).get("tui_status_indicator", "")
norm = str(raw).strip().lower()
return _ok(
rid,
{"value": norm if norm in _INDICATOR_STYLES else _INDICATOR_DEFAULT},
)
if key == "personality":
return _ok(
rid,
@@ -3375,7 +3354,7 @@ def _(rid, params: dict) -> dict:
return _ok(rid, {"value": _coerce_statusbar(raw)})
if key == "mouse":
display = _load_cfg().get("display")
on = _display_mouse_tracking(display)
on = display.get("tui_mouse", True) if isinstance(display, dict) else True
return _ok(rid, {"value": "on" if on else "off"})
if key == "mtime":
cfg_path = _hermes_home / "config.yaml"
@@ -3429,27 +3408,6 @@ def _(rid, params: dict) -> dict:
return _err(rid, 5015, str(e))
@method("reload.env")
def _(rid, params: dict) -> dict:
"""Re-read ``~/.hermes/.env`` into the gateway process via
``hermes_cli.config.reload_env``, matching classic CLI's ``/reload``
handler. Newly added API keys take effect on the next agent call
without restarting the TUI.
The credential pool / provider routing for any *already-constructed*
agent does not auto-rebuild; that's the same behaviour as classic
CLI's ``/reload``. Users who want a brand-new credential resolution
should follow with ``/new``.
"""
try:
from hermes_cli.config import reload_env
count = reload_env()
return _ok(rid, {"updated": int(count)})
except Exception as e:
return _err(rid, 5015, str(e))
_TUI_HIDDEN: frozenset[str] = frozenset(
{
"sethome",
@@ -4741,241 +4699,54 @@ def _(rid, params: dict) -> dict:
# ── Methods: browser / plugins / cron / skills ───────────────────────
def _resolve_browser_cdp_url() -> str:
"""Return the configured browser CDP override without network I/O.
``/browser status`` must be fast: calling
``tools.browser_tool._get_cdp_override`` would invoke
``_resolve_cdp_override``, which performs an HTTP probe to
``.../json/version`` for discovery-style URLs. That probe has
a multi-second timeout and would block the TUI on a slow or
unreachable host even though status only needs to report whether
an override is set.
Mirrors the env/config precedence of ``_get_cdp_override`` (env
var first, then ``browser.cdp_url`` from config.yaml) without the
websocket-resolution step, so the answer reflects user intent
even when the configured host is not currently reachable. The
actual WS normalization happens in ``browser_navigate`` on the
next tool call.
"""
env_url = os.environ.get("BROWSER_CDP_URL", "").strip()
if env_url:
return env_url
try:
from hermes_cli.config import read_raw_config
cfg = read_raw_config()
browser_cfg = cfg.get("browser", {}) if isinstance(cfg, dict) else {}
if isinstance(browser_cfg, dict):
return str(browser_cfg.get("cdp_url", "") or "").strip()
except Exception:
pass
return ""
def _is_default_local_cdp(parsed) -> bool:
"""Match the discovery-style local default; never the concrete WS form.
A user-supplied ``ws://127.0.0.1:9222/devtools/browser/<id>`` is a
real, connectable endpoint; collapsing it to bare ``http://...:9222``
would strip the path and break the connect.
"""
try:
port = parsed.port or 80
except ValueError:
return False
discovery_path = parsed.path in {"", "/", "/json", "/json/version"}
return (
parsed.scheme in {"http", "ws"}
and parsed.hostname in {"127.0.0.1", "localhost"}
and port == 9222
and discovery_path
)
def _http_ok(url: str, timeout: float) -> bool:
import urllib.request
try:
with urllib.request.urlopen(url, timeout=timeout) as resp:
return 200 <= getattr(resp, "status", 200) < 300
except Exception:
return False
def _probe_urls(parsed) -> list[str]:
scheme = {"ws": "http", "wss": "https"}.get(parsed.scheme, parsed.scheme)
root = f"{scheme}://{parsed.netloc}".rstrip("/")
return [f"{root}/json/version", f"{root}/json"]
def _normalize_cdp_url(parsed) -> str:
# Concrete ``/devtools/browser/<id>`` endpoints (Browserbase et al.)
# are connectable as-is. Discovery-style inputs collapse to bare
# ``scheme://host:port`` so ``_resolve_cdp_override`` can append
# ``/json/version`` later without doubling the path.
if parsed.path.startswith("/devtools/browser/"):
return parsed.geturl()
return parsed._replace(path="", params="", query="", fragment="").geturl()
def _failure_messages(url: str, port: int, system: str) -> list[str]:
from hermes_cli.browser_connect import manual_chrome_debug_command
command = manual_chrome_debug_command(port, system)
hint = (
["Start Chrome with remote debugging, then retry /browser connect:", command]
if command
else [
"No Chrome/Chromium executable was found in this environment.",
f"Install one or start Chrome with --remote-debugging-port={port}, then retry /browser connect.",
]
)
return [
f"Chrome is not reachable at {url}.",
*hint,
"Browser not connected — start Chrome with remote debugging and retry /browser connect",
]
@method("browser.manage")
def _(rid, params: dict) -> dict:
action = params.get("action", "status")
if action == "status":
url = _resolve_browser_cdp_url()
url = os.environ.get("BROWSER_CDP_URL", "")
return _ok(rid, {"connected": bool(url), "url": url})
if action == "connect":
url = params.get("url", "http://localhost:9222")
try:
import urllib.request
from urllib.parse import urlparse
from tools.browser_tool import cleanup_all_browsers
if action == "disconnect":
return _browser_disconnect(rid)
if action != "connect":
return _err(rid, 4015, f"unknown action: {action}")
return _browser_connect(rid, params)
def _browser_connect(rid, params: dict) -> dict:
import platform
from hermes_cli.browser_connect import DEFAULT_BROWSER_CDP_URL
from tools.browser_tool import cleanup_all_browsers
from urllib.parse import urlparse
raw_url = params.get("url")
if raw_url is not None and not isinstance(raw_url, str):
return _err(rid, 4015, f"browser url must be a string, got {type(raw_url).__name__}")
url = (raw_url or "").strip() or DEFAULT_BROWSER_CDP_URL
sid = params.get("session_id") or ""
system = platform.system()
messages: list[str] = []
def announce(message: str, *, level: str = "info") -> None:
messages.append(message)
# Without a session id the TUI prints `messages` from the
# response; emitting an event would double-render. Only stream
# progress when there's a real session to scope it to.
if sid:
_emit("browser.progress", sid, {"message": message, "level": level})
parsed = urlparse(url if "://" in url else f"http://{url}")
if parsed.scheme not in {"http", "https", "ws", "wss"}:
return _err(rid, 4015, f"unsupported browser url: {url}")
if not parsed.hostname:
return _err(rid, 4015, f"missing host in browser url: {url}")
try:
port = parsed.port or (443 if parsed.scheme in {"https", "wss"} else 80)
except ValueError:
return _err(rid, 4015, f"invalid port in browser url: {url}")
# Always normalize default-local to 127.0.0.1:9222 so downstream
# comparisons + messaging match what we'll actually persist.
if _is_default_local_cdp(parsed):
url = DEFAULT_BROWSER_CDP_URL
parsed = urlparse(url)
port = parsed.port or 9222
try:
# ws[s]://.../devtools/browser/<id> endpoints (hosted CDP
# providers) don't serve the HTTP discovery path; just check
# TCP-level reachability and let browser_navigate handshake.
if parsed.scheme in {"ws", "wss"} and parsed.path.startswith(
"/devtools/browser/"
):
import socket
try:
with socket.create_connection((parsed.hostname, port), timeout=2.0):
pass
except OSError as e:
return _err(rid, 5031, f"could not reach browser CDP at {url}: {e}")
else:
probes = _probe_urls(parsed)
ok = any(_http_ok(p, timeout=2.0) for p in probes)
if not ok and _is_default_local_cdp(parsed):
from hermes_cli.browser_connect import try_launch_chrome_debug
announce(
"Chrome isn't running with remote debugging — attempting to launch..."
)
if try_launch_chrome_debug(port, system):
for _ in range(20):
time.sleep(0.5)
if any(_http_ok(p, timeout=1.0) for p in probes):
parsed = urlparse(url if "://" in url else f"http://{url}")
if parsed.scheme not in {"http", "https", "ws", "wss"}:
return _err(rid, 4015, f"unsupported browser url: {url}")
probe_root = f"{'https' if parsed.scheme == 'wss' else 'http' if parsed.scheme == 'ws' else parsed.scheme}://{parsed.netloc}"
probe_urls = [
f"{probe_root.rstrip('/')}/json/version",
f"{probe_root.rstrip('/')}/json",
]
ok = False
for probe in probe_urls:
try:
with urllib.request.urlopen(probe, timeout=2.0) as resp:
if 200 <= getattr(resp, "status", 200) < 300:
ok = True
break
if ok:
announce(f"Chrome launched and listening on port {port}")
else:
for line in _failure_messages(url, port, system)[1:]:
announce(line, level="error")
return _ok(
rid, {"connected": False, "url": url, "messages": messages}
)
elif not ok:
except Exception:
continue
if not ok:
return _err(rid, 5031, f"could not reach browser CDP at {url}")
elif _is_default_local_cdp(parsed):
announce(f"Chrome is already listening on port {port}")
normalized = _normalize_cdp_url(parsed)
# Order matters: reap sessions BEFORE publishing the new env
# so an in-flight tool call sees the old supervisor closed,
# then again AFTER so the default task's cached supervisor
# is drained against the new URL.
cleanup_all_browsers()
os.environ["BROWSER_CDP_URL"] = normalized
cleanup_all_browsers()
except Exception as e:
return _err(rid, 5031, str(e))
payload: dict[str, object] = {"connected": True, "url": normalized}
if messages:
payload["messages"] = messages
return _ok(rid, payload)
def _browser_disconnect(rid) -> dict:
# Reap, drop the env override, reap again — closes the same swap
# window covered by ``_browser_connect``.
def reap() -> None:
os.environ["BROWSER_CDP_URL"] = url
cleanup_all_browsers()
except Exception as e:
return _err(rid, 5031, str(e))
return _ok(rid, {"connected": True, "url": url})
if action == "disconnect":
os.environ.pop("BROWSER_CDP_URL", None)
try:
from tools.browser_tool import cleanup_all_browsers
cleanup_all_browsers()
except Exception:
pass
reap()
os.environ.pop("BROWSER_CDP_URL", None)
reap()
return _ok(rid, {"connected": False})
return _ok(rid, {"connected": False})
return _err(rid, 4015, f"unknown action: {action}")
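The two URL helpers in this hunk are easier to follow with concrete inputs. The sketch below re-implements `_probe_urls` and `_normalize_cdp_url` as standalone functions under illustrative names; the host `cdp.example.test` and the browser id are placeholders, not real endpoints.

```python
from typing import List
from urllib.parse import ParseResult, urlparse


def probe_urls(parsed: ParseResult) -> List[str]:
    """Map ws/wss to http/https and build the two discovery endpoints."""
    scheme = {"ws": "http", "wss": "https"}.get(parsed.scheme, parsed.scheme)
    root = f"{scheme}://{parsed.netloc}".rstrip("/")
    return [f"{root}/json/version", f"{root}/json"]


def normalize_cdp_url(parsed: ParseResult) -> str:
    """Keep concrete browser endpoints; collapse discovery URLs to scheme://host:port."""
    if parsed.path.startswith("/devtools/browser/"):
        return parsed.geturl()
    return parsed._replace(path="", params="", query="", fragment="").geturl()


local = urlparse("ws://127.0.0.1:9222/json/version")
hosted = urlparse("wss://cdp.example.test/devtools/browser/abc123")

local_probes = probe_urls(local)
local_normalized = normalize_cdp_url(local)
hosted_normalized = normalize_cdp_url(hosted)
```

Collapsing only the discovery-style form means a later resolver can append `/json/version` without doubling the path, while a hosted endpoint keeps the id it needs to connect.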
@method("plugins.list")
@@ -5319,6 +5090,22 @@ def _(rid, params: dict) -> dict:
return _err(rid, 5024, str(e))
@method("learning.ledger")
def _(rid, params: dict) -> dict:
try:
from hermes_cli.learning_ledger import build_learning_ledger
return _ok(
rid,
build_learning_ledger(
_get_db(),
limit=int(params.get("limit", 80) or 80),
),
)
except Exception as e:
return _err(rid, 5025, str(e))
# ── Methods: shell ───────────────────────────────────────────────────
+7 -99
@@ -23,45 +23,10 @@ the stream lazily through a callback.
from __future__ import annotations
import contextvars
import errno
import json
import logging
import os
import threading
from typing import Any, Callable, Optional, Protocol, runtime_checkable
# Errno values that mean "the peer is gone" rather than "the host has a
# real I/O problem". Anything outside this set re-raises so it surfaces
# in the crash log instead of looking like a clean disconnect.
_PEER_GONE_ERRNOS = frozenset({
errno.EPIPE, # write to closed pipe (POSIX)
errno.ECONNRESET, # peer reset the connection
errno.EBADF, # fd closed under us
errno.ESHUTDOWN, # transport endpoint shut down
getattr(errno, "WSAECONNRESET", -1), # win32 mapping (no-op on POSIX)
getattr(errno, "WSAESHUTDOWN", -1),
} - {-1})
logger = logging.getLogger(__name__)
# Optional knob: when true, StdioTransport does not call ``stream.flush``
# after writing. Use this on environments where a half-closed pipe (TUI
# Node parent quit while the gateway is still emitting events) makes
# flush block long enough to starve the rest of the worker pool.
#
# IMPORTANT: Python text stdout is fully buffered when attached to a
# pipe (the TUI case), so this knob ONLY makes sense when the gateway
# is launched with ``-u`` or ``PYTHONUNBUFFERED=1``. Without one of
# those, JSON-RPC frames will accumulate in the buffer and the TUI
# will hang waiting for ``gateway.ready``. Default stays off so the
# existing flush-after-write behaviour is unchanged.
_DISABLE_FLUSH = (os.environ.get("HERMES_TUI_GATEWAY_NO_FLUSH", "") or "").strip().lower() in {
"1",
"true",
"yes",
"on",
}
@runtime_checkable
class Transport(Protocol):
@@ -112,72 +77,15 @@ class StdioTransport:
self._lock = lock
def write(self, obj: dict) -> bool:
"""Return ``True`` on success, ``False`` ONLY when the peer is gone.
Returning ``False`` is the dispatcher's "broken stdout pipe" signal:
``entry.py`` calls ``sys.exit(0)`` when ``write_json`` reports
``False``. So programming errors (non-JSON-safe payloads, encoding
misconfig, unexpected ValueErrors, host I/O bugs like ENOSPC) MUST
NOT return ``False``, otherwise a real bug looks like a clean
disconnect and is harder to diagnose. Those re-raise so the
existing crash-log infrastructure records the traceback.
Peer-gone branches:
* ``BrokenPipeError``
* ``ValueError("...closed file...")``
* ``OSError`` whose errno is in :data:`_PEER_GONE_ERRNOS`
(EPIPE / ECONNRESET / EBADF / ESHUTDOWN; plus WSA mappings
on Windows). Other OSError errnos (ENOSPC, EACCES, ...) are
real host problems and re-raise.
"""
# Serialization is OUTSIDE the lock so a large payload can't
# block other threads emitting their own frames. A non-JSON-safe
# payload is a programming error: re-raise so the crash log
# captures it instead of silently exiting via the False path.
line = json.dumps(obj, ensure_ascii=False) + "\n"
with self._lock:
stream = self._stream_getter()
try:
try:
with self._lock:
stream = self._stream_getter()
stream.write(line)
except BrokenPipeError:
return False
except ValueError as e:
# ValueError("I/O operation on closed file") is the
# ONLY ValueError that means "peer gone". Anything
# else — including UnicodeEncodeError, which is a
# ValueError subclass for misconfigured locales —
# is a real bug; re-raise so it surfaces in the crash log.
if isinstance(e, UnicodeEncodeError) or "closed file" not in str(e):
raise
return False
except OSError as e:
if e.errno not in _PEER_GONE_ERRNOS:
raise
logger.debug("StdioTransport write peer gone: %s", e)
return False
# A flush that *raises* with a peer-gone errno means the
# dispatcher should exit cleanly. A flush that *hangs* on
# a half-closed pipe holds the lock until it returns — see
# ``_DISABLE_FLUSH`` for the "skip flush entirely" escape
# hatch.
if not _DISABLE_FLUSH:
try:
stream.flush()
except BrokenPipeError:
return False
except ValueError as e:
if isinstance(e, UnicodeEncodeError) or "closed file" not in str(e):
raise
return False
except OSError as e:
if e.errno not in _PEER_GONE_ERRNOS:
raise
logger.debug("StdioTransport flush peer gone: %s", e)
return False
return True
stream.flush()
return True
except BrokenPipeError:
return False
def close(self) -> None:
return None
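The longer `write` variant in this hunk separates "peer is gone" from real host errors. A compact sketch of that classification follows, assuming a plain function instead of the `StdioTransport` method; `write_frame` and the `FullDisk` test double are hypothetical names used only for this illustration.

```python
import errno
import io
import json
import threading

# Errnos treated as "peer gone"; anything else is a real host problem.
PEER_GONE = frozenset({errno.EPIPE, errno.ECONNRESET, errno.EBADF, errno.ESHUTDOWN})


def write_frame(stream, obj: dict, lock: threading.Lock) -> bool:
    """Return True on success, False only when the peer has gone away."""
    # Serialize outside the lock; a non-JSON-safe payload raises here.
    line = json.dumps(obj, ensure_ascii=False) + "\n"
    with lock:
        try:
            stream.write(line)
            stream.flush()
        except BrokenPipeError:
            return False
        except ValueError as e:
            # Only "I/O operation on closed file" counts as a disconnect.
            if isinstance(e, UnicodeEncodeError) or "closed file" not in str(e):
                raise
            return False
        except OSError as e:
            if e.errno not in PEER_GONE:
                raise
            return False
    return True


class FullDisk:
    """Test double that fails like a full disk would."""

    def write(self, _line: str) -> None:
        raise OSError(errno.ENOSPC, "no space left on device")

    def flush(self) -> None:
        pass


lock = threading.Lock()
open_stream = io.StringIO()
ok_open = write_frame(open_stream, {"id": 1}, lock)

closed_stream = io.StringIO()
closed_stream.close()
ok_closed = write_frame(closed_stream, {"id": 2}, lock)

try:
    write_frame(FullDisk(), {"id": 3}, lock)
    enospc_raised = False
except OSError:
    enospc_raised = True
```

The simpler variant in the same hunk catches only `BrokenPipeError`, so a closed-file `ValueError` or an `ECONNRESET` would propagate as an exception there.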
+1 -1
@@ -30,7 +30,7 @@ export { useTerminalFocus } from './src/ink/hooks/use-terminal-focus.ts'
export { useTerminalTitle } from './src/ink/hooks/use-terminal-title.ts'
export { useTerminalViewport } from './src/ink/hooks/use-terminal-viewport.ts'
export { default as measureElement } from './src/ink/measure-element.ts'
export { createRoot, forceRedraw, default as render, renderSync } from './src/ink/root.ts'
export { createRoot, default as render, forceRedraw, renderSync } from './src/ink/root.ts'
export type { Instance, RenderOptions, Root } from './src/ink/root.ts'
export { stringWidth } from './src/ink/stringWidth.ts'
export { default as TextInput, UncontrolledTextInput } from 'ink-text-input'
@@ -23,7 +23,7 @@ export { useTerminalTitle } from './ink/hooks/use-terminal-title.js'
export { useTerminalViewport } from './ink/hooks/use-terminal-viewport.js'
export { default as measureElement } from './ink/measure-element.js'
export { scrollFastPathStats, type ScrollFastPathStats } from './ink/render-node-to-output.js'
export { createRoot, forceRedraw, default as render, renderSync } from './ink/root.js'
export { createRoot, default as render, forceRedraw, renderSync } from './ink/root.js'
export { stringWidth } from './ink/stringWidth.js'
export { isXtermJs } from './ink/terminal.js'
export { default as TextInput, UncontrolledTextInput } from 'ink-text-input'
@@ -1,4 +1,4 @@
import { PureComponent, type ReactNode } from 'react'
import React, { PureComponent, type ReactNode } from 'react'
import { updateLastInteractionTime } from '../../bootstrap/state.js'
import { logForDebugging } from '../../utils/debug.js'
@@ -316,10 +316,8 @@ export default class App extends PureComponent<Props, State> {
// Clear the timer reference
this.incompleteEscapeTimer = null
// Only proceed if we have an incomplete escape sequence or an unterminated
// bracketed paste. Missing paste-end markers otherwise leave every later
// keystroke trapped in the paste buffer.
if (!this.keyParseState.incomplete && this.keyParseState.mode !== 'IN_PASTE') {
// Only proceed if we have incomplete sequences
if (!this.keyParseState.incomplete) {
return
}
@@ -332,16 +330,13 @@ export default class App extends PureComponent<Props, State> {
// drain stdin next and clear this timer. Prevents both the spurious
// Escape key and the lost scroll event.
if (this.props.stdin.readableLength > 0) {
this.incompleteEscapeTimer = setTimeout(
this.flushIncomplete,
this.keyParseState.mode === 'IN_PASTE' ? this.PASTE_TIMEOUT : this.NORMAL_TIMEOUT
)
this.incompleteEscapeTimer = setTimeout(this.flushIncomplete, this.NORMAL_TIMEOUT)
return
}
// Process incomplete/paste state as a flush operation (input=null).
// This reuses all existing parsing logic.
// Process incomplete as a flush operation (input=null)
// This reuses all existing parsing logic
this.processInput(null)
}
@@ -360,10 +355,8 @@ export default class App extends PureComponent<Props, State> {
reconciler.discreteUpdates(processKeysInBatch, this, keys, undefined, undefined)
}
// If we have incomplete escape sequences or an unterminated paste, set a
// timer to flush/reset them. Paste starts are complete CSI sequences, so
// checking only `incomplete` would never arm the watchdog.
if (this.keyParseState.incomplete || this.keyParseState.mode === 'IN_PASTE') {
// If we have incomplete escape sequences, set a timer to flush them
if (this.keyParseState.incomplete) {
// Cancel any existing timer first
if (this.incompleteEscapeTimer) {
clearTimeout(this.incompleteEscapeTimer)
@@ -39,15 +39,6 @@ describe('enhanced keyboard modifier parsing', () => {
expect(event.key.super).toBe(true)
})
it('preserves forwarded VS Code/Cursor Cmd+C copy sequence as ctrl+super+c', () => {
const parsed = parseOne('\u001b[99;13u')
const event = new InputEvent(parsed)
expect(parsed.name).toBe('c')
expect(event.key.ctrl).toBe(true)
expect(event.key.super).toBe(true)
})
it('preserves Cmd on word-delete and word-navigation sequences', () => {
const backspace = new InputEvent(parseOne('\u001b[127;9u'))
const left = new InputEvent(parseOne('\u001b[1;9D'))
@@ -35,8 +35,6 @@ export function useSelection(): {
* replaces the old SGR-7 inverse so syntax highlighting stays readable
* under selection). Call once on mount + whenever theme changes. */
setSelectionBgColor: (color: string) => void
/** Monotonic counter incremented on every selection mutation. */
version: () => number
} {
// Look up the Ink instance via stdout — same pattern as instances map.
// StdinContext is available (it's always provided), and the Ink instance
@@ -60,8 +58,7 @@ export function useSelection(): {
shiftSelection: () => {},
moveFocus: () => {},
captureScrolledRows: () => {},
setSelectionBgColor: () => {},
version: () => 0
setSelectionBgColor: () => {}
}
}
@@ -76,8 +73,7 @@ export function useSelection(): {
shiftSelection: (dRow, minRow, maxRow) => ink.shiftSelectionForScroll(dRow, minRow, maxRow),
moveFocus: (move: FocusMove) => ink.moveSelectionFocus(move),
captureScrolledRows: (firstRow, lastRow, side) => ink.captureScrolledRows(firstRow, lastRow, side),
setSelectionBgColor: (color: string) => ink.setSelectionBgColor(color),
version: () => ink.getSelectionVersion()
setSelectionBgColor: (color: string) => ink.setSelectionBgColor(color)
}
}, [ink])
}
+8 -21
@@ -63,7 +63,6 @@ import {
hasSelection,
moveFocus,
selectionBounds,
selectionSignature,
type SelectionState,
selectLineAt,
selectWordAt,
@@ -214,8 +213,7 @@ export default class Ink {
// Fired alongside the terminal repaint whenever the selection mutates
// so UI (e.g. footer hints) can react to selection appearing/clearing.
private readonly selectionListeners = new Set<() => void>()
private selectionVersion = 0
private lastSelectionSignature = ''
private selectionWasActive = false
// DOM nodes currently under the pointer (mode-1003 motion). Held here
// so App.tsx's handleMouseEvent is stateless — dispatchHover diffs
// against this set and mutates it in place.
@@ -1663,16 +1661,9 @@ export default class Ink {
return hasSelection(this.selection)
}
getSelectionVersion(): number {
return this.selectionVersion
}
/**
* Subscribe to selection state changes. Fires whenever the selection
* mutates: anchor/focus moves, drag updates, programmatic clears.
* Does NOT fire on `copySelectionNoClear()` (no mutation, no notify),
* which is why version-based subscribers don't risk re-entrant copies.
* Returns an unsubscribe fn.
* is started, updated, cleared, or copied. Returns an unsubscribe fn.
*/
subscribeToSelectionChange(cb: () => void): () => void {
this.selectionListeners.add(cb)
@@ -1682,18 +1673,14 @@ export default class Ink {
private notifySelectionChange(): void {
this.scheduleRender()
// Only bump version when the selection range actually mutated.
// Listeners still fire unconditionally — useHasSelection() snapshots
// through React, which dedupes via Object.is on the boolean value.
const sig = selectionSignature(this.selection)
const active = hasSelection(this.selection)
if (sig !== this.lastSelectionSignature) {
this.lastSelectionSignature = sig
this.selectionVersion += 1
}
if (active !== this.selectionWasActive) {
this.selectionWasActive = active
for (const cb of this.selectionListeners) {
cb()
for (const cb of this.selectionListeners) {
cb()
}
}
}
@@ -1,41 +0,0 @@
import { describe, expect, it } from 'vitest'
import { INITIAL_STATE, parseMultipleKeypresses } from './parse-keypress.js'
import { PASTE_END, PASTE_START } from './termio/csi.js'
describe('parseMultipleKeypresses bracketed paste recovery', () => {
it('emits empty bracketed pastes when the terminal sends both markers', () => {
const [keys, state] = parseMultipleKeypresses(INITIAL_STATE, PASTE_START + PASTE_END)
expect(keys).toHaveLength(1)
expect(keys[0]).toMatchObject({ isPasted: true, raw: '' })
expect(state.mode).toBe('NORMAL')
})
it('flushes unterminated paste content back to normal input mode', () => {
const [pendingKeys, pendingState] = parseMultipleKeypresses(INITIAL_STATE, PASTE_START + 'hello')
expect(pendingKeys).toEqual([])
expect(pendingState.mode).toBe('IN_PASTE')
const [keys, state] = parseMultipleKeypresses(pendingState, null)
expect(keys).toHaveLength(1)
expect(keys[0]).toMatchObject({ isPasted: true, raw: 'hello' })
expect(state.mode).toBe('NORMAL')
expect(state.pasteBuffer).toBe('')
})
it('resets an empty unterminated paste start instead of staying stuck', () => {
const [pendingKeys, pendingState] = parseMultipleKeypresses(INITIAL_STATE, PASTE_START)
expect(pendingKeys).toEqual([])
expect(pendingState.mode).toBe('IN_PASTE')
const [keys, state] = parseMultipleKeypresses(pendingState, null)
expect(keys).toEqual([])
expect(state.mode).toBe('NORMAL')
expect(state.pasteBuffer).toBe('')
})
})
@@ -288,14 +288,9 @@ export function parseMultipleKeypresses(
}
}
// If a terminal drops the paste-end marker, the App watchdog flushes the
// partial paste and returns to normal input instead of swallowing all future
// keystrokes as paste content.
if (isFlush && inPaste) {
if (pasteBuffer) {
keys.push(createPasteKey(pasteBuffer))
}
// If flushing and still in paste mode, emit what we have
if (isFlush && inPaste && pasteBuffer) {
keys.push(createPasteKey(pasteBuffer))
inPaste = false
pasteBuffer = ''
}
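The two flush variants in this hunk differ only in whether an unterminated paste with an empty buffer still drops back to normal mode. The sketch below illustrates the variant that always resets. It is a simplified stand-in, not the code in `parse-keypress.ts`: `feedPaste`, `PasteState`, `PasteKey`, and `INITIAL` are hypothetical names, ordinary keys are ignored, and markers split across chunks are not handled. The marker strings are the standard bracketed-paste sequences.

```typescript
// Standard bracketed-paste markers (xterm): ESC[200~ opens, ESC[201~ closes.
const PASTE_START = '\u001b[200~'
const PASTE_END = '\u001b[201~'

// Hypothetical, trimmed-down state and key shapes for illustration only.
type PasteState = { mode: 'NORMAL' | 'IN_PASTE'; pasteBuffer: string }
type PasteKey = { isPasted: true; raw: string }

const INITIAL: PasteState = { mode: 'NORMAL', pasteBuffer: '' }

// input === null means "flush": a watchdog timer fired with no new data.
function feedPaste(state: PasteState, input: string | null): [PasteKey[], PasteState] {
  const keys: PasteKey[] = []
  let { mode, pasteBuffer } = state
  let rest = input ?? ''

  while (rest.length > 0) {
    if (mode === 'NORMAL') {
      const start = rest.indexOf(PASTE_START)
      if (start === -1) break // ordinary keys are out of scope for this sketch
      rest = rest.slice(start + PASTE_START.length)
      mode = 'IN_PASTE'
    } else {
      const end = rest.indexOf(PASTE_END)
      if (end === -1) {
        // No end marker yet: keep buffering.
        pasteBuffer += rest
        rest = ''
      } else {
        keys.push({ isPasted: true, raw: pasteBuffer + rest.slice(0, end) })
        rest = rest.slice(end + PASTE_END.length)
        mode = 'NORMAL'
        pasteBuffer = ''
      }
    }
  }

  // Flush: emit whatever was buffered, then reset even if the buffer is empty,
  // so a dropped end marker cannot trap later keystrokes in paste mode.
  if (input === null && mode === 'IN_PASTE') {
    if (pasteBuffer) keys.push({ isPasted: true, raw: pasteBuffer })
    mode = 'NORMAL'
    pasteBuffer = ''
  }

  return [keys, { mode, pasteBuffer }]
}
```

With the other variant (`isFlush && inPaste && pasteBuffer`), a bare paste-start marker followed by a flush would leave the state in paste mode, which is the case the third deleted test above exercises.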
@@ -75,13 +75,11 @@ export type Root = {
export const forceRedraw = (stdout: NodeJS.WriteStream = process.stdout): boolean => {
const instance = instances.get(stdout)
if (!instance) {
return false
}
instance.forceRedraw()
return true
}
@@ -799,20 +799,6 @@ export function hasSelection(s: SelectionState): boolean {
return s.anchor !== null && s.focus !== null
}
/**
* Stable fingerprint of the user-visible selection state. Used by Ink
* to skip incrementing the mutation counter when notifySelectionChange()
* fires without an actual change to anchor/focus/isDragging; this protects
* version-based subscribers (copy-on-select) from re-running for the
* same stable selection.
*/
export function selectionSignature(s: SelectionState): string {
const a = s.anchor ? `${s.anchor.row},${s.anchor.col}` : 'null'
const f = s.focus ? `${s.focus.row},${s.focus.col}` : 'null'
return `${a}|${f}|${s.isDragging ? 1 : 0}`
}
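To make the role of the signature concrete, here is a minimal sketch of how a fingerprint like this can gate a mutation counter. `selectionSignature` mirrors the function in this hunk; `SelectionVersionTracker` is a hypothetical stand-in for the counter logic in `notifySelectionChange()`, and `SelectionState` is trimmed to the three fields the signature reads.

```typescript
type Point = { row: number; col: number }

// Assumption: only the fields used by the signature are modelled here.
type SelectionState = { anchor: Point | null; focus: Point | null; isDragging: boolean }

function selectionSignature(s: SelectionState): string {
  const a = s.anchor ? `${s.anchor.row},${s.anchor.col}` : 'null'
  const f = s.focus ? `${s.focus.row},${s.focus.col}` : 'null'
  return `${a}|${f}|${s.isDragging ? 1 : 0}`
}

// Hypothetical helper: bumps the version only when the fingerprint changes.
class SelectionVersionTracker {
  private version = 0
  private lastSignature = ''

  notify(s: SelectionState): number {
    const sig = selectionSignature(s)
    if (sig !== this.lastSignature) {
      this.lastSignature = sig
      this.version += 1
    }
    return this.version
  }
}
```

Because the last signature starts as an empty string, the first notification always bumps the version, even for an empty selection. Repeated notifications for an unchanged selection then leave the counter alone.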
/**
* Normalized selection bounds: start is always before end in reading order.
* Returns null if no active selection.
@@ -293,19 +293,6 @@ describe('createGatewayEventHandler', () => {
expect(appended[1]).toMatchObject({ role: 'assistant', text: 'final answer' })
})
it('renders browser.progress events as system transcript lines as they stream in', () => {
const appended: Msg[] = []
const ctx = buildCtx(appended)
const handler = createGatewayEventHandler(ctx)
handler({
payload: { message: 'Chrome launched and listening on port 9222' },
type: 'browser.progress'
} as any)
expect(ctx.system.sys).toHaveBeenCalledWith('Chrome launched and listening on port 9222')
})
it('annotates gateway.start_timeout with stderr tail lines so users can diagnose without /logs', () => {
const appended: Msg[] = []
const onEvent = createGatewayEventHandler(buildCtx(appended))
@@ -327,48 +314,6 @@ describe('createGatewayEventHandler', () => {
expect(messages.some(m => m.includes('FileNotFoundError'))).toBe(true)
})
it('prefers raw text over Rich-rendered ANSI on message.complete (#16391)', () => {
const appended: Msg[] = []
const onEvent = createGatewayEventHandler(buildCtx(appended))
const raw = 'Hermes here.\n\nLine two.'
// Rich-rendered ANSI (`final_response_markdown: render`) used to win,
// which left visible escape codes in Ink output. Raw text must win.
const rendered = '\u001b[33mHermes here.\u001b[0m\n\n\u001b[2mLine two.\u001b[0m'
onEvent({ payload: { rendered, text: raw }, type: 'message.complete' } as any)
const assistant = appended.find(msg => msg.role === 'assistant')
expect(assistant?.text).toBe(raw)
expect(assistant?.text).not.toContain('\u001b[')
})
it('falls back to payload.rendered when text is missing on message.complete', () => {
const appended: Msg[] = []
const onEvent = createGatewayEventHandler(buildCtx(appended))
const rendered = 'fallback when gateway omitted text'
onEvent({ payload: { rendered }, type: 'message.complete' } as any)
const assistant = appended.find(msg => msg.role === 'assistant')
expect(assistant?.text).toBe(rendered)
})
it('always accumulates raw text in message.delta and ignores `rendered` (#16391)', () => {
const appended: Msg[] = []
const onEvent = createGatewayEventHandler(buildCtx(appended))
// Stream of partial text deltas; each delta carries an incremental
// Rich-ANSI fragment. Pre-fix code would replace the whole bufRef
// with the latest fragment, dropping prior text.
onEvent({ payload: { rendered: '\u001b[33mFi\u001b[0m', text: 'Fi' }, type: 'message.delta' } as any)
onEvent({ payload: { rendered: '\u001b[33mrst.\u001b[0m', text: 'rst.' }, type: 'message.delta' } as any)
onEvent({ payload: { text: ' second.' }, type: 'message.delta' } as any)
onEvent({ payload: {}, type: 'message.complete' } as any)
const assistant = appended.find(msg => msg.role === 'assistant')
expect(assistant?.text).toBe('First. second.')
})
it('anchors inline_diff as its own segment where the edit happened', () => {
const appended: Msg[] = []
const onEvent = createGatewayEventHandler(buildCtx(appended))
@@ -727,7 +672,9 @@ describe('createGatewayEventHandler', () => {
} as any)
// Pre-interrupt todos should land in turn state.
expect(getTurnState().todos).toEqual([{ content: 'pre-interrupt', id: 'todo-1', status: 'pending' }])
expect(getTurnState().todos).toEqual([
{ content: 'pre-interrupt', id: 'todo-1', status: 'pending' }
])
turnController.interruptTurn({
appendMessage: (msg: Msg) => appended.push(msg),
+11 -51
@@ -85,6 +85,15 @@ describe('createSlashHandler', () => {
expect(ctx.gateway.gw.request).not.toHaveBeenCalled()
})
it('opens the learning ledger locally', () => {
const ctx = buildCtx()
expect(createSlashHandler(ctx)('/learned')).toBe(true)
expect(getOverlayState().learningLedger).toBe(true)
expect(ctx.gateway.rpc).not.toHaveBeenCalled()
expect(ctx.gateway.gw.request).not.toHaveBeenCalled()
})
it('routes /skills install <name> to skills.manage without opening overlay', () => {
const ctx = buildCtx()
@@ -191,14 +200,11 @@ describe('createSlashHandler', () => {
})
it.each([
['/browser status', 'browser.manage', { action: 'status', session_id: null }],
['/browser connect', 'browser.manage', { action: 'connect', session_id: null, url: 'http://127.0.0.1:9222' }],
['/browser status', 'browser.manage', { action: 'status' }],
['/reload-mcp', 'reload.mcp', { session_id: null }],
['/reload', 'reload.env', {}],
['/stop', 'process.stop', {}],
['/fast status', 'config.get', { key: 'fast', session_id: null }],
['/busy status', 'config.get', { key: 'busy' }],
['/indicator', 'config.get', { key: 'indicator' }]
['/busy status', 'config.get', { key: 'busy' }]
])('routes %s through native RPC (no slash worker)', (command, method, params) => {
const rpc = vi.fn(() => Promise.resolve({}))
const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } })
@@ -208,34 +214,6 @@ describe('createSlashHandler', () => {
expect(ctx.gateway.gw.request).not.toHaveBeenCalled()
})
it('renders browser connect progress messages from the gateway', async () => {
const rpc = vi.fn(() =>
Promise.resolve({
connected: false,
messages: [
"Chrome isn't running with remote debugging — attempting to launch...",
'Browser not connected — start Chrome with remote debugging and retry /browser connect'
],
url: 'http://127.0.0.1:9222'
})
)
const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } })
expect(createSlashHandler(ctx)('/browser connect')).toBe(true)
expect(ctx.transcript.sys).toHaveBeenCalledWith('checking Chrome remote debugging at http://127.0.0.1:9222...')
await vi.waitFor(() => {
expect(ctx.transcript.sys).toHaveBeenCalledWith(
"Chrome isn't running with remote debugging — attempting to launch..."
)
expect(ctx.transcript.sys).toHaveBeenCalledWith(
'Browser not connected — start Chrome with remote debugging and retry /browser connect'
)
expect(ctx.transcript.sys).not.toHaveBeenCalledWith('browser connect failed')
})
})
it('routes /rollback through native RPC when a session is active', () => {
patchUiState({ sid: 'sid-abc' })
const rpc = vi.fn(() => Promise.resolve({}))
@@ -246,24 +224,6 @@ describe('createSlashHandler', () => {
expect(ctx.gateway.gw.request).not.toHaveBeenCalled()
})
it('hot-swaps the live indicator when /indicator <style> succeeds', async () => {
const rpc = vi.fn(() => Promise.resolve({ value: 'emoji' }))
const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } })
expect(createSlashHandler(ctx)('/indicator emoji')).toBe(true)
expect(rpc).toHaveBeenCalledWith('config.set', { key: 'indicator', value: 'emoji' })
await vi.waitFor(() => expect(getUiState().indicatorStyle).toBe('emoji'))
})
it('rejects unknown indicator styles before hitting the gateway', () => {
const rpc = vi.fn(() => Promise.resolve({}))
const ctx = buildCtx({ gateway: { ...buildGateway(), rpc } })
expect(createSlashHandler(ctx)('/indicator sparkle')).toBe(true)
expect(rpc).not.toHaveBeenCalled()
expect(ctx.transcript.sys).toHaveBeenCalledWith('usage: /indicator [ascii|emoji|kaomoji|unicode]')
})
it('drops stale slash.exec output after a newer slash', async () => {
let resolveLate: (v: { output?: string }) => void
let slashExecCalls = 0
@@ -1,64 +0,0 @@
import { describe, expect, it } from 'vitest'
const ENV_KEYS = ['COLORTERM', 'FORCE_COLOR', 'HERMES_TUI_TRUECOLOR', 'NO_COLOR'] as const
async function withCleanEnv(setup: () => void, body: () => Promise<void>) {
const saved: Record<string, string | undefined> = {}
for (const k of ENV_KEYS) {
saved[k] = process.env[k]
delete process.env[k]
}
try {
setup()
await body()
} finally {
for (const k of ENV_KEYS) {
if (saved[k] === undefined) {
delete process.env[k]
} else {
process.env[k] = saved[k]
}
}
}
}
describe('forceTruecolor', () => {
it('sets COLORTERM=truecolor and FORCE_COLOR=3 when unset', async () => {
await withCleanEnv(
() => {},
async () => {
await import('../lib/forceTruecolor.js?t=' + Date.now())
expect(process.env.COLORTERM).toBe('truecolor')
expect(process.env.FORCE_COLOR).toBe('3')
}
)
})
it('respects HERMES_TUI_TRUECOLOR=0 opt-out', async () => {
await withCleanEnv(
() => {
process.env.HERMES_TUI_TRUECOLOR = '0'
},
async () => {
await import('../lib/forceTruecolor.js?t=optout-' + Date.now())
expect(process.env.COLORTERM).toBeUndefined()
expect(process.env.FORCE_COLOR).toBeUndefined()
}
)
})
it('respects NO_COLOR', async () => {
await withCleanEnv(
() => {
process.env.NO_COLOR = '1'
},
async () => {
await import('../lib/forceTruecolor.js?t=no-color-' + Date.now())
expect(process.env.COLORTERM).toBeUndefined()
expect(process.env.FORCE_COLOR).toBeUndefined()
}
)
})
})
-6
@@ -51,12 +51,6 @@ describe('isCopyShortcut', () => {
expect(isCopyShortcut({ ctrl: false, meta: true, super: false }, 'c', {})).toBe(false)
})
it('accepts the VS Code/Cursor forwarded Cmd+C copy sequence on macOS', async () => {
const { isCopyShortcut } = await importPlatform('darwin')
expect(isCopyShortcut({ ctrl: true, meta: false, super: true }, 'c', {})).toBe(true)
})
})
describe('isVoiceToggleKey', () => {
@@ -28,12 +28,6 @@ describe('terminalParityHints', () => {
it('suppresses IDE setup hint when keybindings are already configured', async () => {
const readFile = vi.fn().mockResolvedValue(
JSON.stringify([
{
key: 'cmd+c',
command: 'workbench.action.terminal.sendSequence',
when: 'terminalFocus && terminalTextSelected',
args: { text: '\u001b[99;13u' }
},
{
key: 'shift+enter',
command: 'workbench.action.terminal.sendSequence',
-149
@@ -79,34 +79,11 @@ describe('configureTerminalKeybindings', () => {
expect(writeFile).toHaveBeenCalledTimes(1)
expect(copyFile).not.toHaveBeenCalled() // no existing file to back up
const written = writeFile.mock.calls[0]?.[1] as string
expect(written).toContain('cmd+c')
expect(written).toContain('terminalTextSelected')
expect(written).toContain('\\u001b[99;13u')
expect(written).toContain('shift+enter')
expect(written).toContain('cmd+enter')
expect(written).toContain('cmd+z')
})
it('only adds the Cmd+C forwarding binding on macOS', async () => {
const mkdir = vi.fn().mockResolvedValue(undefined)
const readFile = vi.fn().mockRejectedValue(Object.assign(new Error('missing'), { code: 'ENOENT' }))
const writeFile = vi.fn().mockResolvedValue(undefined)
const copyFile = vi.fn().mockResolvedValue(undefined)
const result = await configureTerminalKeybindings('vscode', {
fileOps: { copyFile, mkdir, readFile, writeFile },
homeDir: '/home/me',
platform: 'linux'
})
expect(result.success).toBe(true)
const written = writeFile.mock.calls[0]?.[1] as string
expect(written).not.toContain('cmd+c')
expect(written).not.toContain('terminalTextSelected')
expect(written).not.toContain('\\u001b[99;13u')
expect(written).toContain('shift+enter')
})
it('reports conflicts without overwriting existing bindings', async () => {
const mkdir = vi.fn().mockResolvedValue(undefined)
@@ -136,126 +113,6 @@ describe('configureTerminalKeybindings', () => {
expect(copyFile).not.toHaveBeenCalled() // no backup when not writing
})
it('flags a global (when-less) binding on the same key as a conflict', async () => {
// A user's keybindings.json `cmd+c` with no `when` clause is global —
// it overlaps any context, including our terminal scope. We must NOT
// silently add a terminal-scoped cmd+c that would shadow it.
const mkdir = vi.fn().mockResolvedValue(undefined)
const readFile = vi.fn().mockResolvedValue(
JSON.stringify([
{
key: 'cmd+c',
command: 'myExtension.smartCopy'
}
])
)
const writeFile = vi.fn().mockResolvedValue(undefined)
const copyFile = vi.fn().mockResolvedValue(undefined)
const result = await configureTerminalKeybindings('vscode', {
fileOps: { copyFile, mkdir, readFile, writeFile },
homeDir: '/Users/me',
platform: 'darwin'
})
expect(result.success).toBe(false)
expect(result.message).toContain('cmd+c')
expect(writeFile).not.toHaveBeenCalled()
})
it('flags an overlapping terminal-context binding as a conflict', async () => {
// Existing `cmd+c` scoped to plain `terminalFocus` overlaps with our
// `terminalFocus && terminalTextSelected` — both fire when the
// terminal is focused with text selected, so the existing binding
// would shadow ours. Treat as a conflict even though the strings
// aren't identical.
const mkdir = vi.fn().mockResolvedValue(undefined)
const readFile = vi.fn().mockResolvedValue(
JSON.stringify([
{
key: 'cmd+c',
command: 'workbench.action.terminal.copySelection',
when: 'terminalFocus'
}
])
)
const writeFile = vi.fn().mockResolvedValue(undefined)
const copyFile = vi.fn().mockResolvedValue(undefined)
const result = await configureTerminalKeybindings('vscode', {
fileOps: { copyFile, mkdir, readFile, writeFile },
homeDir: '/Users/me',
platform: 'darwin'
})
expect(result.success).toBe(false)
expect(result.message).toContain('cmd+c')
expect(writeFile).not.toHaveBeenCalled()
})
it('does not flag a negated terminalTextSelected binding as a conflict', async () => {
// A binding scoped to "terminal focused but no selected text" is
// logically disjoint from our copy-forwarding binding, which requires
// terminalTextSelected.
const mkdir = vi.fn().mockResolvedValue(undefined)
const readFile = vi.fn().mockResolvedValue(
JSON.stringify([
{
key: 'cmd+c',
command: 'workbench.action.terminal.sendSequence',
when: 'terminalFocus && !terminalTextSelected',
args: { text: '\u0003' }
}
])
)
const writeFile = vi.fn().mockResolvedValue(undefined)
const copyFile = vi.fn().mockResolvedValue(undefined)
const result = await configureTerminalKeybindings('vscode', {
fileOps: { copyFile, mkdir, readFile, writeFile },
homeDir: '/Users/me',
platform: 'darwin'
})
expect(result.success).toBe(true)
expect(writeFile).toHaveBeenCalledTimes(1)
})
it('does not flag a disjoint-when binding on the same key as a conflict', async () => {
// VS Code allows multiple bindings for the same key when their `when`
// clauses don't overlap. A user's pre-existing cmd+c binding scoped to
// editor focus should NOT block our terminal-scoped cmd+c binding.
const mkdir = vi.fn().mockResolvedValue(undefined)
const readFile = vi.fn().mockResolvedValue(
JSON.stringify([
{
key: 'cmd+c',
command: 'editor.action.clipboardCopyAction',
when: 'editorFocus'
}
])
)
const writeFile = vi.fn().mockResolvedValue(undefined)
const copyFile = vi.fn().mockResolvedValue(undefined)
const result = await configureTerminalKeybindings('vscode', {
fileOps: { copyFile, mkdir, readFile, writeFile },
homeDir: '/Users/me',
platform: 'darwin'
})
expect(result.success).toBe(true)
expect(writeFile).toHaveBeenCalledTimes(1)
})
it('backs up existing keybindings.json only when writing changes', async () => {
const mkdir = vi.fn().mockResolvedValue(undefined)
const readFile = vi.fn().mockResolvedValue(JSON.stringify([]))
@@ -329,12 +186,6 @@ describe('configureTerminalKeybindings', () => {
const readComplete = vi.fn().mockResolvedValue(
JSON.stringify([
{
key: 'cmd+c',
command: 'workbench.action.terminal.sendSequence',
when: 'terminalFocus && terminalTextSelected',
args: { text: '\u001b[99;13u' }
},
{
key: 'shift+enter',
command: 'workbench.action.terminal.sendSequence',
+19 -158
@@ -1,92 +1,46 @@
import { afterEach, describe, expect, it, vi } from 'vitest'
import { describe, expect, it } from 'vitest'
// `theme.js` reads `process.env` at module-load to compute DEFAULT_THEME,
// and `fromSkin` closes over DEFAULT_THEME. A developer shell with
// HERMES_TUI_THEME=light (or HERMES_TUI_BACKGROUND set to something
// bright) would flip the base and turn these assertions into a local-
// only failure. We sterilize the relevant env vars + dynamically
// import the module fresh so EVERY symbol that closes over the env
// (DEFAULT_THEME, DARK_THEME, LIGHT_THEME, fromSkin) is loaded against
// a known-empty environment.
//
// `detectLightMode` takes env as an explicit arg, so it's safe to import
// statically — but we stay consistent and dynamic-import it too.
const RELEVANT_ENV = [
'HERMES_TUI_LIGHT',
'HERMES_TUI_THEME',
'HERMES_TUI_BACKGROUND',
'COLORFGBG',
'TERM_PROGRAM'
] as const
async function importThemeWithCleanEnv() {
for (const key of RELEVANT_ENV) {
vi.stubEnv(key, '')
}
vi.resetModules()
return import('../theme.js')
}
afterEach(() => {
vi.unstubAllEnvs()
vi.resetModules()
})
import { DARK_THEME, DEFAULT_THEME, detectLightMode, fromSkin, LIGHT_THEME } from '../theme.js'
describe('DEFAULT_THEME', () => {
it('has brand defaults', async () => {
const { DEFAULT_THEME } = await importThemeWithCleanEnv()
it('has brand defaults', () => {
expect(DEFAULT_THEME.brand.name).toBe('Hermes Agent')
expect(DEFAULT_THEME.brand.prompt).toBe('')
expect(DEFAULT_THEME.brand.tool).toBe('┊')
})
it('has color palette', async () => {
const { DEFAULT_THEME } = await importThemeWithCleanEnv()
it('has color palette', () => {
expect(DEFAULT_THEME.color.primary).toBe('#FFD700')
expect(DEFAULT_THEME.color.error).toBe('#ef5350')
})
})
describe('LIGHT_THEME', () => {
it('avoids bright-yellow accents unreadable on white backgrounds (#11300)', async () => {
const { LIGHT_THEME } = await importThemeWithCleanEnv()
it('avoids bright-yellow accents unreadable on white backgrounds (#11300)', () => {
expect(LIGHT_THEME.color.primary).not.toBe('#FFD700')
expect(LIGHT_THEME.color.accent).not.toBe('#FFBF00')
expect(LIGHT_THEME.color.muted).not.toBe('#B8860B')
expect(LIGHT_THEME.color.statusWarn).not.toBe('#FFD700')
})
it('keeps the same shape as DARK_THEME', async () => {
const { DARK_THEME, LIGHT_THEME } = await importThemeWithCleanEnv()
it('keeps the same shape as DARK_THEME', () => {
expect(Object.keys(LIGHT_THEME.color).sort()).toEqual(Object.keys(DARK_THEME.color).sort())
expect(LIGHT_THEME.brand).toEqual(DARK_THEME.brand)
})
})
describe('DEFAULT_THEME aliasing', () => {
it('defaults to DARK_THEME when nothing signals light', async () => {
const { DEFAULT_THEME, DARK_THEME: DARK } = await importThemeWithCleanEnv()
expect(DEFAULT_THEME).toBe(DARK)
it('defaults to DARK_THEME when nothing signals light', () => {
expect(DEFAULT_THEME).toBe(DARK_THEME)
})
})
describe('detectLightMode', () => {
it('returns false on empty env', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
it('returns false on empty env', () => {
expect(detectLightMode({})).toBe(false)
})
it('honors HERMES_TUI_LIGHT on/off', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
it('honors HERMES_TUI_LIGHT on/off', () => {
expect(detectLightMode({ HERMES_TUI_LIGHT: '1' })).toBe(true)
expect(detectLightMode({ HERMES_TUI_LIGHT: 'true' })).toBe(true)
expect(detectLightMode({ HERMES_TUI_LIGHT: 'on' })).toBe(true)
@@ -94,9 +48,7 @@ describe('detectLightMode', () => {
expect(detectLightMode({ HERMES_TUI_LIGHT: 'off' })).toBe(false)
})
it('sniffs COLORFGBG bg slots 7 and 15 as light (#11300)', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
it('sniffs COLORFGBG bg slots 7 and 15 as light (#11300)', () => {
expect(detectLightMode({ COLORFGBG: '0;15' })).toBe(true)
expect(detectLightMode({ COLORFGBG: '0;default;15' })).toBe(true)
expect(detectLightMode({ COLORFGBG: '0;7' })).toBe(true)
@@ -104,134 +56,43 @@ describe('detectLightMode', () => {
expect(detectLightMode({ COLORFGBG: '7;default;0' })).toBe(false)
})
it('falls through on malformed COLORFGBG with empty/non-numeric trailing field', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
// `Number('')` is 0, so `'15;'` would have been read as bg=0
// (authoritative dark) and incorrectly blocked TERM_PROGRAM.
// The strict /^\d+$/ guard makes these fall through instead.
const allowList = new Set(['Apple_Terminal'])
expect(detectLightMode({ COLORFGBG: '15;', TERM_PROGRAM: 'Apple_Terminal' }, allowList)).toBe(true)
expect(detectLightMode({ COLORFGBG: 'default;default', TERM_PROGRAM: 'Apple_Terminal' }, allowList)).toBe(true)
// Without an allow-list match, fall-through still defaults to dark.
expect(detectLightMode({ COLORFGBG: '15;' })).toBe(false)
})
it('lets HERMES_TUI_LIGHT=0 override a light COLORFGBG', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
it('lets HERMES_TUI_LIGHT=0 override a light COLORFGBG', () => {
expect(detectLightMode({ COLORFGBG: '0;15', HERMES_TUI_LIGHT: '0' })).toBe(false)
})
it('honors HERMES_TUI_THEME=light/dark as a symmetric explicit override', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
expect(detectLightMode({ HERMES_TUI_THEME: 'light' })).toBe(true)
expect(detectLightMode({ HERMES_TUI_THEME: 'dark' })).toBe(false)
expect(detectLightMode({ COLORFGBG: '0;15', HERMES_TUI_THEME: 'dark' })).toBe(false)
expect(detectLightMode({ COLORFGBG: '15;0', HERMES_TUI_THEME: 'light' })).toBe(true)
})
it('uses HERMES_TUI_BACKGROUND luminance when COLORFGBG is missing', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#ffffff' })).toBe(true)
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#000000' })).toBe(false)
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#1e1e1e' })).toBe(false)
// Three-char hex normalises like CSS.
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#fff' })).toBe(true)
// Garbage falls through to the default-dark path.
expect(detectLightMode({ HERMES_TUI_BACKGROUND: 'not-a-colour' })).toBe(false)
})
it('rejects partially-invalid hex instead of silently truncating', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
// `parseInt('fffgff'.slice(2,4), 16)` would return 15 — the strict
// regex must reject these inputs so they fall through to default-
// dark instead of producing a false-positive light reading.
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#fffgff' })).toBe(false)
expect(detectLightMode({ HERMES_TUI_BACKGROUND: 'ffggff' })).toBe(false)
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#xyz' })).toBe(false)
// Wrong length also rejected (no implicit padding/truncation).
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#fffff' })).toBe(false)
expect(detectLightMode({ HERMES_TUI_BACKGROUND: '#fffffff' })).toBe(false)
})
it('treats COLORFGBG as authoritative when present so it dominates the TERM_PROGRAM allow-list', async () => {
const { detectLightMode } = await importThemeWithCleanEnv()
// Inject a light-default allow-list so the precedence test is
// meaningful even though the production allow-list is empty.
const allowList = new Set(['Apple_Terminal'])
// Sanity: the allow-list alone WOULD turn this terminal light.
expect(detectLightMode({ TERM_PROGRAM: 'Apple_Terminal' }, allowList)).toBe(true)
// Dark COLORFGBG must beat the allow-list.
expect(detectLightMode({ COLORFGBG: '15;0', TERM_PROGRAM: 'Apple_Terminal' }, allowList)).toBe(false)
})
})
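The `HERMES_TUI_BACKGROUND` tests above expect strict hex validation before any luminance check. The sketch below shows one way to satisfy them. It is not the implementation in `theme.js`: `parseHexColor` and `isLightHex` are hypothetical names, and the Rec. 709 luma weights and the 0.5 threshold are assumptions chosen for illustration.

```typescript
// Hypothetical helper: accepts only 3 or 6 hex digits, with an optional '#'.
// Validating the whole string first avoids parseInt's silent truncation
// (for example, parseInt('fg', 16) yields 15 instead of failing).
function parseHexColor(input: string): [number, number, number] | null {
  const match = /^#?([0-9a-f]{3}|[0-9a-f]{6})$/i.exec(input.trim())
  if (!match) return null

  const digits = match[1] ?? ''
  // Expand the CSS-style short form: 'fff' becomes 'ffffff'.
  const hex =
    digits.length === 3
      ? digits
          .split('')
          .map(c => c + c)
          .join('')
      : digits

  return [
    parseInt(hex.slice(0, 2), 16),
    parseInt(hex.slice(2, 4), 16),
    parseInt(hex.slice(4, 6), 16)
  ]
}

// Hypothetical helper: invalid input counts as "not light" (default dark).
function isLightHex(input: string): boolean {
  const rgb = parseHexColor(input)
  if (!rgb) return false

  const [r, g, b] = rgb
  // Assumed weights (Rec. 709) and threshold; the real values may differ.
  const luma = (0.2126 * r + 0.7152 * g + 0.0722 * b) / 255
  return luma > 0.5
}
```

Rejecting wrong-length input outright, rather than padding or truncating it, is what keeps values such as `#fffff` on the default-dark path.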
describe('fromSkin', () => {
// `fromSkin` closes over DEFAULT_THEME (which is env-derived), so we
// must dynamic-import it after sterilizing env — otherwise an ambient
// HERMES_TUI_THEME=light would flip the base palette and make these
// assertions order-dependent on the developer's shell.
it('overrides banner colors', async () => {
const { fromSkin } = await importThemeWithCleanEnv()
it('overrides banner colors', () => {
expect(fromSkin({ banner_title: '#FF0000' }, {}).color.primary).toBe('#FF0000')
})
it('preserves unset colors', async () => {
const { DEFAULT_THEME, fromSkin } = await importThemeWithCleanEnv()
it('preserves unset colors', () => {
expect(fromSkin({ banner_title: '#FF0000' }, {}).color.accent).toBe(DEFAULT_THEME.color.accent)
})
it('derives completion current background from resolved completion background', async () => {
const { fromSkin } = await importThemeWithCleanEnv()
const theme = fromSkin({ banner_accent: '#000000', completion_menu_bg: '#ffffff' }, {})
expect(theme.color.completionBg).toBe('#ffffff')
expect(theme.color.completionCurrentBg).toBe('#bfbfbf')
})
it('overrides branding', async () => {
const { fromSkin } = await importThemeWithCleanEnv()
it('overrides branding', () => {
const { brand } = fromSkin({}, { agent_name: 'TestBot', prompt_symbol: '$' })
expect(brand.name).toBe('TestBot')
expect(brand.prompt).toBe('$')
})
it('normalizes skin prompt symbols to trimmed single-line text', async () => {
const { DEFAULT_THEME, fromSkin } = await importThemeWithCleanEnv()
it('normalizes skin prompt symbols to one trimmed line', () => {
expect(fromSkin({}, { prompt_symbol: ' ⚔ \n' }).brand.prompt).toBe('⚔ ')
expect(fromSkin({}, { prompt_symbol: ' Ψ > \n' }).brand.prompt).toBe('Ψ >')
expect(fromSkin({}, { prompt_symbol: '\n\t' }).brand.prompt).toBe(DEFAULT_THEME.brand.prompt)
})
it('defaults for empty skin', async () => {
const { DEFAULT_THEME, fromSkin } = await importThemeWithCleanEnv()
it('defaults for empty skin', () => {
expect(fromSkin({}, {}).color).toEqual(DEFAULT_THEME.color)
expect(fromSkin({}, {}).brand.icon).toBe(DEFAULT_THEME.brand.icon)
})
it('passes banner logo/hero', async () => {
const { fromSkin } = await importThemeWithCleanEnv()
it('passes banner logo/hero', () => {
expect(fromSkin({}, {}, 'LOGO', 'HERO').bannerLogo).toBe('LOGO')
expect(fromSkin({}, {}, 'LOGO', 'HERO').bannerHero).toBe('HERO')
})
it('maps ui_ color keys + cascades to status', async () => {
const { fromSkin } = await importThemeWithCleanEnv()
it('maps ui_ color keys + cascades to status', () => {
const { color } = fromSkin({ ui_ok: '#008000' }, {})
expect(color.ok).toBe('#008000')
expect(color.statusGood).toBe('#008000')
})
+1 -133
@@ -1,13 +1,7 @@
import { beforeEach, describe, expect, it, vi } from 'vitest'
import { $uiState, resetUiState } from '../app/uiStore.js'
import {
applyDisplay,
normalizeBusyInputMode,
normalizeIndicatorStyle,
normalizeMouseTracking,
normalizeStatusBar
} from '../app/useConfigSync.js'
import { applyDisplay, normalizeStatusBar } from '../app/useConfigSync.js'
describe('applyDisplay', () => {
beforeEach(() => {
@@ -71,19 +65,6 @@ describe('applyDisplay', () => {
expect(s.sections).toEqual({})
})
it('uses documented mouse_tracking with legacy tui_mouse fallback', () => {
const setBell = vi.fn()
applyDisplay({ config: { display: { mouse_tracking: false } } }, setBell)
expect($uiState.get().mouseTracking).toBe(false)
applyDisplay({ config: { display: { mouse_tracking: true, tui_mouse: false } } }, setBell)
expect($uiState.get().mouseTracking).toBe(true)
applyDisplay({ config: { display: { tui_mouse: false } } }, setBell)
expect($uiState.get().mouseTracking).toBe(false)
})
it('parses display.sections into per-section overrides', () => {
const setBell = vi.fn()
@@ -179,116 +160,3 @@ describe('normalizeStatusBar', () => {
expect(normalizeStatusBar('OFF')).toBe('off')
})
})
describe('normalizeMouseTracking', () => {
it('defaults on and prefers canonical mouse_tracking over legacy tui_mouse', () => {
expect(normalizeMouseTracking({})).toBe(true)
expect(normalizeMouseTracking({ mouse_tracking: false })).toBe(false)
expect(normalizeMouseTracking({ mouse_tracking: 0 })).toBe(false)
expect(normalizeMouseTracking({ mouse_tracking: 'off' })).toBe(false)
expect(normalizeMouseTracking({ mouse_tracking: 'false' })).toBe(false)
expect(normalizeMouseTracking({ mouse_tracking: null, tui_mouse: false })).toBe(true)
expect(normalizeMouseTracking({ mouse_tracking: true, tui_mouse: false })).toBe(true)
expect(normalizeMouseTracking({ tui_mouse: false })).toBe(false)
})
})
describe('normalizeBusyInputMode', () => {
it('passes through the canonical CLI parity values', () => {
expect(normalizeBusyInputMode('queue')).toBe('queue')
expect(normalizeBusyInputMode('steer')).toBe('steer')
expect(normalizeBusyInputMode('interrupt')).toBe('interrupt')
})
it('trims and lowercases input', () => {
expect(normalizeBusyInputMode(' Queue ')).toBe('queue')
expect(normalizeBusyInputMode('STEER')).toBe('steer')
})
it('defaults to queue for missing/unknown values (TUI-only override)', () => {
// CLI / messaging adapters keep `interrupt` as the framework default
// (see hermes_cli/config.py + tui_gateway/server.py::_load_busy_input_mode);
// the TUI ships `queue` because typing a follow-up while the agent
// streams is the common authoring pattern and an unintended interrupt
// loses work.
expect(normalizeBusyInputMode(undefined)).toBe('queue')
expect(normalizeBusyInputMode(null)).toBe('queue')
expect(normalizeBusyInputMode('')).toBe('queue')
expect(normalizeBusyInputMode('drop')).toBe('queue')
expect(normalizeBusyInputMode(42)).toBe('queue')
})
})
describe('normalizeIndicatorStyle', () => {
it('passes through the canonical enum', () => {
expect(normalizeIndicatorStyle('kaomoji')).toBe('kaomoji')
expect(normalizeIndicatorStyle('emoji')).toBe('emoji')
expect(normalizeIndicatorStyle('unicode')).toBe('unicode')
expect(normalizeIndicatorStyle('ascii')).toBe('ascii')
})
it('trims and lowercases input', () => {
expect(normalizeIndicatorStyle(' Emoji ')).toBe('emoji')
expect(normalizeIndicatorStyle('UNICODE')).toBe('unicode')
})
it('defaults to kaomoji for missing/unknown values', () => {
expect(normalizeIndicatorStyle(undefined)).toBe('kaomoji')
expect(normalizeIndicatorStyle(null)).toBe('kaomoji')
expect(normalizeIndicatorStyle('')).toBe('kaomoji')
expect(normalizeIndicatorStyle('sparkle')).toBe('kaomoji')
expect(normalizeIndicatorStyle(42)).toBe('kaomoji')
})
})
describe('applyDisplay → busy_input_mode', () => {
beforeEach(() => {
resetUiState()
})
it('threads display.busy_input_mode into $uiState', () => {
const setBell = vi.fn()
applyDisplay({ config: { display: { busy_input_mode: 'queue' } } }, setBell)
expect($uiState.get().busyInputMode).toBe('queue')
applyDisplay({ config: { display: { busy_input_mode: 'steer' } } }, setBell)
expect($uiState.get().busyInputMode).toBe('steer')
})
it('falls back to queue when value is missing or invalid (TUI-only default)', () => {
const setBell = vi.fn()
applyDisplay({ config: { display: {} } }, setBell)
expect($uiState.get().busyInputMode).toBe('queue')
applyDisplay({ config: { display: { busy_input_mode: 'drop' } } }, setBell)
expect($uiState.get().busyInputMode).toBe('queue')
})
})
describe('applyDisplay → tui_status_indicator', () => {
beforeEach(() => {
resetUiState()
})
it('threads display.tui_status_indicator into $uiState', () => {
const setBell = vi.fn()
applyDisplay({ config: { display: { tui_status_indicator: 'emoji' } } }, setBell)
expect($uiState.get().indicatorStyle).toBe('emoji')
applyDisplay({ config: { display: { tui_status_indicator: 'unicode' } } }, setBell)
expect($uiState.get().indicatorStyle).toBe('unicode')
})
it('falls back to kaomoji default when missing or invalid', () => {
const setBell = vi.fn()
applyDisplay({ config: { display: {} } }, setBell)
expect($uiState.get().indicatorStyle).toBe('kaomoji')
applyDisplay({ config: { display: { tui_status_indicator: 'rainbow' } } }, setBell)
expect($uiState.get().indicatorStyle).toBe('kaomoji')
})
})
-27
@@ -28,31 +28,4 @@ describe('stickyPromptFromViewport', () => {
expect(stickyPromptFromViewport(messages, offsets, 16, 20, false)).toBe('current prompt')
})
it('shows the last prompt once the viewport starts after the history tail', () => {
const messages = [
{ role: 'user' as const, text: 'current prompt' },
{ role: 'assistant' as const, text: 'completed answer' }
]
expect(stickyPromptFromViewport(messages, [0, 2, 5], 8, 14, false)).toBe('current prompt')
})
it('shows a prompt as soon as its full row is above the viewport', () => {
const messages = [
{ role: 'user' as const, text: 'current prompt' },
{ role: 'assistant' as const, text: 'current answer' }
]
expect(stickyPromptFromViewport(messages, [0, 2, 10], 2, 8, false)).toBe('current prompt')
})
it('hides the sticky prompt at the bottom', () => {
const messages = [
{ role: 'user' as const, text: 'current prompt' },
{ role: 'assistant' as const, text: 'current answer' }
]
expect(stickyPromptFromViewport(messages, [0, 2, 10], 8, 10, true)).toBe('')
})
})
@@ -35,20 +35,4 @@ describe('viewportStore', () => {
})
expect(viewportSnapshotKey(snap)).toBe('0:16:5:40:3')
})
it('uses fresh scroll height to clear stale non-bottom state', () => {
const handle = {
getFreshScrollHeight: () => 20,
getPendingDelta: () => 0,
getScrollHeight: () => 40,
getScrollTop: () => 15,
getViewportHeight: () => 5,
isSticky: () => false
}
const snap = getViewportSnapshot(handle as any)
expect(snap.atBottom).toBe(true)
expect(snap.scrollHeight).toBe(20)
})
})
+25 -11
@@ -64,6 +64,7 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
let pendingThinkingStatus = ''
let thinkingStatusTimer: null | ReturnType<typeof setTimeout> = null
let pendingLearning: string[] = []
// Inject the disk-save callback into turnController so recordMessageComplete
// can fire-and-forget a persist without having to plumb a gateway ref around.
@@ -269,7 +270,19 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
return
}
case 'learning.event': {
const title = String(ev.payload?.title ?? '').trim()
const verb = String(ev.payload?.verb ?? ev.payload?.type ?? 'learned').trim()
if (title) {
pendingLearning = pushUnique(4)(pendingLearning, `${verb}: ${title}`)
}
return
}
case 'message.start':
pendingLearning = []
turnController.startMessage()
return
@@ -307,16 +320,6 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
return
}
case 'browser.progress': {
const message = String(ev.payload?.message ?? '').trim()
if (message) {
sys(message)
}
return
}
case 'voice.status': {
// Continuous VAD loop reports its internal state so the status bar
// can show listening / transcribing / idle without polling.
@@ -383,7 +386,6 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
// 120-char clip used for `gateway.stderr` activity entries.
const STDERR_LINE_CAP = 120
const STDERR_LINES_MAX = 8
const tailLines = (stderrTail ?? '')
.split('\n')
.map(l => l.trim())
@@ -601,10 +603,22 @@ export function createGatewayEventHandler(ctx: GatewayEventHandlerContext): (ev:
return
case 'message.complete': {
const { finalMessages, finalText, wasInterrupted } = turnController.recordMessageComplete(ev.payload ?? {})
const completedLearning = (ev.payload?.learning_events ?? [])
.map(e => {
const title = String(e?.title ?? '').trim()
const verb = String(e?.verb ?? e?.type ?? 'learned').trim()
return title ? `${verb}: ${title}` : ''
})
.filter(Boolean)
if (!wasInterrupted) {
const msgs: Msg[] = finalMessages.length ? finalMessages : [{ role: 'assistant', text: finalText }]
const learningLines = [...completedLearning, ...pendingLearning].filter((text, i, xs) => xs.indexOf(text) === i)
msgs.forEach(appendMessage)
learningLines.forEach(text => appendMessage({ kind: 'learning', role: 'system', text }))
pendingLearning = []
if (bellOnComplete && stdout?.isTTY) {
stdout.write('\x07')
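The `learning.event` and `message.complete` branches above format each event as `verb: title` and de-duplicate the combined list before appending it to the transcript. A minimal standalone sketch of that merge follows. The `pushUnique` helper is not shown in this diff, so the version here is a hypothetical stand-in (append if absent, keep the last `max` entries), and `formatLearning` and `mergeLearning` are illustrative names, not identifiers from the codebase.

```typescript
interface LearningEvent {
  title?: unknown
  type?: unknown
  verb?: unknown
}

// Hypothetical stand-in for the `pushUnique` helper used in the handler:
// assumed to append only unseen values and cap the list at `max` entries.
const pushUnique =
  (max: number) =>
  (xs: string[], x: string): string[] =>
    xs.includes(x) ? xs : [...xs, x].slice(-max)

// Same formatting rule as the diff: title is required, verb falls back
// to `type`, then to the literal 'learned'.
const formatLearning = (e: LearningEvent | null | undefined): string => {
  const title = String(e?.title ?? '').trim()
  const verb = String(e?.verb ?? e?.type ?? 'learned').trim()

  return title ? `${verb}: ${title}` : ''
}

// Completed events come first, then pending ones; first occurrence wins.
const mergeLearning = (completed: LearningEvent[], pending: string[]): string[] =>
  [...completed.map(formatLearning).filter(Boolean), ...pending].filter((text, i, xs) => xs.indexOf(text) === i)

let pending: string[] = []
pending = pushUnique(4)(pending, 'saved: shell alias')
pending = pushUnique(4)(pending, 'saved: shell alias') // duplicate is ignored

const merged = mergeLearning(
  [{ title: 'shell alias', verb: 'saved' }, { title: 'repo layout', type: 'recalled' }, { title: '   ' }],
  pending
)
```

Here `merged` contains one line per distinct event even though `saved: shell alias` arrived both as a streamed event and in the completion payload.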
+1 -14
@@ -27,23 +27,11 @@ export interface StateSetter<T> {
export type StatusBarMode = 'bottom' | 'off' | 'top'
export type BusyInputMode = 'interrupt' | 'queue' | 'steer'
// Single source of truth for indicator style names. Union type is
// derived from this tuple so adding/removing a style only touches one
// line — `useConfigSync` (validation) and `session.ts` (slash arg
// validation + usage hint) both import it.
export const INDICATOR_STYLES = ['ascii', 'emoji', 'kaomoji', 'unicode'] as const
export type IndicatorStyle = (typeof INDICATOR_STYLES)[number]
export const DEFAULT_INDICATOR_STYLE: IndicatorStyle = 'kaomoji'
export interface SelectionApi {
captureScrolledRows: (firstRow: number, lastRow: number, side: 'above' | 'below') => void
clearSelection: () => void
copySelection: () => Promise<string>
copySelectionNoClear: () => Promise<string>
getState: () => unknown
version: () => number
shiftAnchor: (dRow: number, minRow: number, maxRow: number) => void
shiftSelection: (dRow: number, minRow: number, maxRow: number) => void
}
@@ -74,6 +62,7 @@ export interface OverlayState {
approval: ApprovalReq | null
clarify: ClarifyReq | null
confirm: ConfirmReq | null
learningLedger: boolean
modelPicker: boolean
pager: null | PagerState
picker: boolean
@@ -97,7 +86,6 @@ export interface TranscriptRow {
export interface UiState {
bgTasks: Set<string>
busy: boolean
busyInputMode: BusyInputMode
compact: boolean
detailsMode: DetailsMode
detailsModeCommandOverride: boolean
@@ -107,7 +95,6 @@ export interface UiState {
sections: SectionVisibility
showCost: boolean
showReasoning: boolean
indicatorStyle: IndicatorStyle
sid: null | string
status: string
statusBar: StatusBarMode
+16 -2
@@ -8,6 +8,7 @@ const buildOverlayState = (): OverlayState => ({
approval: null,
clarify: null,
confirm: null,
learningLedger: false,
modelPicker: false,
pager: null,
picker: false,
@@ -20,8 +21,20 @@ export const $overlayState = atom<OverlayState>(buildOverlayState())
export const $isBlocked = computed(
$overlayState,
({ agents, approval, clarify, confirm, modelPicker, pager, picker, secret, skillsHub, sudo }) =>
Boolean(agents || approval || clarify || confirm || modelPicker || pager || picker || secret || skillsHub || sudo)
({ agents, approval, clarify, confirm, learningLedger, modelPicker, pager, picker, secret, skillsHub, sudo }) =>
Boolean(
agents ||
approval ||
clarify ||
confirm ||
learningLedger ||
modelPicker ||
pager ||
picker ||
secret ||
skillsHub ||
sudo
)
)
export const getOverlayState = () => $overlayState.get()
@@ -45,6 +58,7 @@ export const resetFlowOverlays = () =>
...buildOverlayState(),
agents: $overlayState.get().agents,
agentsInitialHistoryIndex: $overlayState.get().agentsInitialHistoryIndex,
learningLedger: $overlayState.get().learningLedger,
modelPicker: $overlayState.get().modelPicker,
picker: $overlayState.get().picker,
skillsHub: $overlayState.get().skillsHub
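The `$isBlocked` change above adds `learningLedger` to the set of overlay flags that block input. As a rough illustration only (this is not how the store is written, and it leaves out the nanostores `computed` wrapper), the same predicate can be expressed over a key list. `BLOCKING_KEYS`, `OverlayFlags`, and `isBlocked` are names invented for this sketch.

```typescript
// Keys taken from the destructuring in the diff; the tuple name is an assumption.
const BLOCKING_KEYS = [
  'agents',
  'approval',
  'clarify',
  'confirm',
  'learningLedger',
  'modelPicker',
  'pager',
  'picker',
  'secret',
  'skillsHub',
  'sudo'
] as const

// Value types are left as `unknown` because the diff does not show all of them.
type OverlayFlags = Record<(typeof BLOCKING_KEYS)[number], unknown>

// Blocked when any listed overlay flag is truthy.
const isBlocked = (state: OverlayFlags): boolean => BLOCKING_KEYS.some(key => Boolean(state[key]))

const closed: OverlayFlags = {
  agents: false,
  approval: null,
  clarify: null,
  confirm: null,
  learningLedger: false,
  modelPicker: false,
  pager: null,
  picker: false,
  secret: null,
  skillsHub: false,
  sudo: null
}

const blockedWhenClosed = isBlocked(closed)
const blockedWithLedger = isBlocked({ ...closed, learningLedger: true })
```

With every overlay closed the predicate is false; opening the learning ledger alone is enough to make it true.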
+1 -1
@@ -503,7 +503,7 @@ export const coreCommands: SlashCommand[] = [
ctx.guarded<SessionSteerResponse>(r => {
if (r?.status === 'queued') {
ctx.transcript.sys(
`steer queued — arrives after next tool call: "${payload.slice(0, 50)}${payload.length > 50 ? '…' : ''}"`
`steer queued — arrives after next tool call: "${payload.slice(0, 50)}${payload.length > 50 ? '…' : ''}"`
)
} else {
ctx.transcript.sys('steer rejected')
+21 -45
@@ -2,7 +2,6 @@ import type {
BrowserManageResponse,
DelegationPauseResponse,
ProcessStopResponse,
ReloadEnvResponse,
ReloadMcpResponse,
RollbackDiffResponse,
RollbackListResponse,
@@ -90,71 +89,41 @@ export const opsCommands: SlashCommand[] = [
}
},
{
help: 're-read ~/.hermes/.env into the running gateway (CLI parity)',
name: 'reload',
run: (_arg, ctx) => {
ctx.gateway
.rpc<ReloadEnvResponse>('reload.env', {})
.then(
ctx.guarded<ReloadEnvResponse>(r => {
const n = Number(r.updated ?? 0)
const noun = n === 1 ? 'var' : 'vars'
ctx.transcript.sys(`reloaded .env (${n} ${noun} updated)`)
})
)
.catch(ctx.guardedErr)
}
},
{
help: 'manage browser CDP connection [connect|disconnect|status]',
name: 'browser',
run: (arg, ctx) => {
const [rawAction = 'status', ...rest] = arg.trim().split(/\s+/).filter(Boolean)
const action = rawAction.toLowerCase()
const trimmed = arg.trim()
const [rawAction, ...rest] = trimmed ? trimmed.split(/\s+/) : ['status']
const action = (rawAction || 'status').toLowerCase()
if (!['connect', 'disconnect', 'status'].includes(action)) {
return ctx.transcript.sys(
'usage: /browser [connect|disconnect|status] [url] · persistent: set browser.cdp_url in config.yaml'
)
return ctx.transcript.sys('usage: /browser [connect|disconnect|status] [url]')
}
const sid = ctx.sid ?? null
const url = action === 'connect' ? rest.join(' ').trim() || 'http://127.0.0.1:9222' : undefined
const payload: Record<string, unknown> = { action }
if (url) {
ctx.transcript.sys(`checking Chrome remote debugging at ${url}...`)
if (action === 'connect') {
payload.url = rest.join(' ').trim() || 'http://localhost:9222'
}
ctx.gateway
.rpc<BrowserManageResponse>('browser.manage', { action, session_id: sid, ...(url && { url }) })
.rpc<BrowserManageResponse>('browser.manage', payload)
.then(
ctx.guarded<BrowserManageResponse>(r => {
// Without a session we can't subscribe to streamed
// browser.progress events, so flush the bundled list.
if (!sid) {
r.messages?.forEach(message => ctx.transcript.sys(message))
}
if (action === 'status') {
return ctx.transcript.sys(
r.connected
? `browser connected: ${r.url || '(url unavailable)'}`
: 'browser not connected (try /browser connect <url> or set browser.cdp_url in config.yaml)'
r.connected ? `browser connected: ${r.url || '(url unavailable)'}` : 'browser not connected'
)
}
if (action === 'disconnect') {
return ctx.transcript.sys('browser disconnected')
if (action === 'connect') {
return ctx.transcript.sys(
r.connected ? `browser connected: ${r.url || '(url unavailable)'}` : 'browser connect failed'
)
}
if (r.connected) {
ctx.transcript.sys('Browser connected to live Chrome via CDP')
ctx.transcript.sys(`Endpoint: ${r.url || '(url unavailable)'}`)
ctx.transcript.sys('next browser tool call will use this CDP endpoint')
}
ctx.transcript.sys('browser disconnected')
})
)
.catch(ctx.guardedErr)
@@ -410,6 +379,13 @@ export const opsCommands: SlashCommand[] = [
}
},
{
aliases: ['growth', 'learned'],
help: 'show memories, skills, recalls, and integrations Hermes has accumulated',
name: 'learning',
run: () => patchOverlayState({ learningLedger: true })
},
{
help: 'browse, inspect, install skills',
name: 'skills',
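The `/browser` handler in this file builds its RPC payload from the raw slash argument. The sketch below isolates the payload-building variant from the diff (empty input means `status`, `connect` falls back to `http://localhost:9222`) as a pure function so the parsing can be checked without a gateway. `buildBrowserPayload`, `ACTIONS`, and `DEFAULT_CDP_URL` are names introduced here for illustration.

```typescript
const ACTIONS: readonly string[] = ['connect', 'disconnect', 'status']
const DEFAULT_CDP_URL = 'http://localhost:9222'

// Returns the payload for `browser.manage`, or null when the caller
// should print the usage line instead.
const buildBrowserPayload = (arg: string): Record<string, unknown> | null => {
  const trimmed = arg.trim()
  const [rawAction, ...rest] = trimmed ? trimmed.split(/\s+/) : ['status']
  const action = (rawAction || 'status').toLowerCase()

  if (!ACTIONS.includes(action)) {
    return null
  }

  const payload: Record<string, unknown> = { action }

  if (action === 'connect') {
    payload.url = rest.join(' ').trim() || DEFAULT_CDP_URL
  }

  return payload
}

const statusPayload = buildBrowserPayload('')
const connectDefault = buildBrowserPayload('connect')
const connectExplicit = buildBrowserPayload('  CONNECT http://host:1  ')
const invalid = buildBrowserPayload('bogus')
```

Keeping the parsing separate from the RPC call makes the default-URL and case-folding rules testable on their own.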
-38
@@ -12,7 +12,6 @@ import type {
} from '../../../gatewayTypes.js'
import { fmtK } from '../../../lib/text.js'
import type { PanelSection } from '../../../types.js'
import { DEFAULT_INDICATOR_STYLE, INDICATOR_STYLES, type IndicatorStyle } from '../../interfaces.js'
import { patchOverlayState } from '../../overlayStore.js'
import { patchUiState } from '../../uiStore.js'
import type { SlashCommand } from '../types.js'
@@ -269,43 +268,6 @@ export const sessionCommands: SlashCommand[] = [
}
},
{
help: 'pick the busy indicator: kaomoji (default), emoji, unicode (braille), or ascii',
name: 'indicator',
usage: `/indicator [${INDICATOR_STYLES.join('|')}]`,
run: (arg, ctx) => {
const value = arg.trim().toLowerCase()
if (!value) {
return ctx.gateway
.rpc<ConfigGetValueResponse>('config.get', { key: 'indicator' })
.then(
ctx.guarded<ConfigGetValueResponse>(r =>
ctx.transcript.sys(`indicator: ${r.value || DEFAULT_INDICATOR_STYLE}`)
)
)
}
if (!(INDICATOR_STYLES as readonly string[]).includes(value)) {
return ctx.transcript.sys(`usage: /indicator [${INDICATOR_STYLES.join('|')}]`)
}
ctx.gateway.rpc<ConfigSetResponse>('config.set', { key: 'indicator', value }).then(
ctx.guarded<ConfigSetResponse>(r => {
if (!r.value) {
return
}
// Hot-swap the running TUI immediately so the next render
// uses the new style without waiting for the 5s mtime poll
// to re-apply config.full.
patchUiState({ indicatorStyle: value as IndicatorStyle })
ctx.transcript.sys(`indicator → ${r.value}`)
})
)
}
},
{
help: 'toggle yolo mode (per-session approvals)',
name: 'yolo',
+3 -14
@@ -431,13 +431,7 @@ class TurnController {
recordMessageComplete(payload: { rendered?: string; reasoning?: string; text?: string }) {
this.closeReasoningSegment()
// Ink renders markdown via <Md>; the gateway's Rich-rendered ANSI
// (`payload.rendered`) is for terminals that can't. Prioritising
// `rendered` here garbles output whenever a user opts into
// `display.final_response_markdown: render` because raw ANSI escapes
// pass through into the React tree. Prefer raw text and fall back
// only when the gateway elected not to send any (#16391).
const rawText = (payload.text ?? payload.rendered ?? this.bufRef).trimStart()
const rawText = (payload.rendered ?? payload.text ?? this.bufRef).trimStart()
const split = splitReasoning(rawText)
const finalText = finalTail(split.text, this.segmentMessages)
const existingReasoning = this.reasoningText.trim() || String(payload.reasoning ?? '').trim()
@@ -522,7 +516,7 @@ class TurnController {
return { finalMessages, finalText, wasInterrupted }
}
recordMessageDelta({ text }: { rendered?: string; text?: string }) {
recordMessageDelta({ rendered, text }: { rendered?: string; text?: string }) {
if (this.interrupted || !text) {
return
}
@@ -530,12 +524,7 @@ class TurnController {
this.pruneTransient()
this.endReasoningPhase()
// Always accumulate the raw text delta. The pre-#16391 path replaced
// the entire buffer with `rendered` (an *incremental* Rich ANSI
// fragment), which on every tick discarded everything streamed so far
// — visible as overlapping coloured text and lost prose under
// `display.final_response_markdown: render`.
this.bufRef += text
this.bufRef = rendered ?? this.bufRef + text
if (getUiState().streaming) {
this.scheduleStreaming()
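The two `recordMessageDelta` variants in this diff differ in how they treat `rendered`. The comment in the diff describes `rendered` as an incremental ANSI fragment; under that assumption, the sketch below shows why assigning it to the buffer keeps only the latest fragment while `+= text` keeps the whole stream. The `Delta` type, both function names, and the sample fragments are illustrative, not taken from the codebase.

```typescript
interface Delta {
  rendered?: string
  text?: string
}

// Variant that always appends the raw text delta.
const accumulate = (buf: string, { text }: Delta): string => (text ? buf + text : buf)

// Variant that prefers `rendered`; `??` binds looser than `+`, so this is
// `rendered ?? (buf + text)`, matching the expression in the diff.
const replaceWithRendered = (buf: string, { rendered, text }: Delta): string =>
  text ? (rendered ?? buf + text) : buf

// Assumed shape of a stream where each delta carries its own ANSI fragment.
const deltas: Delta[] = [
  { rendered: '\x1b[1mHello \x1b[0m', text: 'Hello ' },
  { rendered: '\x1b[1mworld\x1b[0m', text: 'world' }
]

const accumulated = deltas.reduce(accumulate, '')
const replaced = deltas.reduce(replaceWithRendered, '')
```

After both deltas, `accumulated` holds the full prose while `replaced` holds only the second fragment, escape codes included.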
+1 -3
@@ -4,16 +4,14 @@ import { MOUSE_TRACKING } from '../config/env.js'
import { ZERO } from '../domain/usage.js'
import { DEFAULT_THEME } from '../theme.js'
import { DEFAULT_INDICATOR_STYLE, type UiState } from './interfaces.js'
import type { UiState } from './interfaces.js'
const buildUiState = (): UiState => ({
bgTasks: new Set(),
busy: false,
busyInputMode: 'queue',
compact: false,
detailsMode: 'collapsed',
detailsModeCommandOverride: false,
indicatorStyle: DEFAULT_INDICATOR_STYLE,
info: null,
inlineDiffs: true,
mouseTracking: MOUSE_TRACKING,
+2 -56
@@ -10,13 +10,7 @@ import type {
} from '../gatewayTypes.js'
import { asRpcResult } from '../lib/rpc.js'
import {
type BusyInputMode,
DEFAULT_INDICATOR_STYLE,
INDICATOR_STYLES,
type IndicatorStyle,
type StatusBarMode
} from './interfaces.js'
import type { StatusBarMode } from './interfaces.js'
import { turnController } from './turnController.js'
import { patchUiState } from './uiStore.js'
@@ -30,52 +24,6 @@ const STATUSBAR_ALIAS: Record<string, StatusBarMode> = {
export const normalizeStatusBar = (raw: unknown): StatusBarMode =>
raw === false ? 'off' : typeof raw === 'string' ? (STATUSBAR_ALIAS[raw.trim().toLowerCase()] ?? 'top') : 'top'
const BUSY_MODES = new Set<BusyInputMode>(['interrupt', 'queue', 'steer'])
// TUI defaults to `queue` even though the framework default
// (`hermes_cli/config.py`) is `interrupt`. Rationale: in a full-screen
// TUI you're typically authoring the next prompt while the agent is
// still streaming, and an unintended interrupt loses work. Set
// `display.busy_input_mode: interrupt` (or `steer`) explicitly to
// opt out per-config; CLI / messaging adapters keep their `interrupt`
// default unchanged.
const TUI_BUSY_DEFAULT: BusyInputMode = 'queue'
export const normalizeBusyInputMode = (raw: unknown): BusyInputMode => {
if (typeof raw !== 'string') {
return TUI_BUSY_DEFAULT
}
const v = raw.trim().toLowerCase() as BusyInputMode
return BUSY_MODES.has(v) ? v : TUI_BUSY_DEFAULT
}
const INDICATOR_STYLE_SET: ReadonlySet<IndicatorStyle> = new Set(INDICATOR_STYLES)
export const normalizeIndicatorStyle = (raw: unknown): IndicatorStyle => {
if (typeof raw !== 'string') {
return DEFAULT_INDICATOR_STYLE
}
const v = raw.trim().toLowerCase() as IndicatorStyle
return INDICATOR_STYLE_SET.has(v) ? v : DEFAULT_INDICATOR_STYLE
}
const FALSEY_MOUSE = new Set(['0', 'false', 'no', 'off'])
const hasOwn = (obj: object, key: PropertyKey) => Object.prototype.hasOwnProperty.call(obj, key)
export const normalizeMouseTracking = (display: { mouse_tracking?: unknown; tui_mouse?: unknown }): boolean => {
const raw = hasOwn(display, 'mouse_tracking') ? display.mouse_tracking : display.tui_mouse
if (raw === false || raw === 0) {
return false
}
return typeof raw === 'string' ? !FALSEY_MOUSE.has(raw.trim().toLowerCase()) : true
}
const MTIME_POLL_MS = 5000
const quietRpc = async <T extends Record<string, any> = Record<string, any>>(
@@ -95,13 +43,11 @@ export const applyDisplay = (cfg: ConfigFullResponse | null, setBell: (v: boolea
setBell(!!d.bell_on_complete)
patchUiState({
busyInputMode: normalizeBusyInputMode(d.busy_input_mode),
compact: !!d.tui_compact,
detailsMode: resolveDetailsMode(d),
detailsModeCommandOverride: false,
indicatorStyle: normalizeIndicatorStyle(d.tui_status_indicator),
inlineDiffs: d.inline_diffs !== false,
mouseTracking: normalizeMouseTracking(d),
mouseTracking: d.tui_mouse !== false,
sections: resolveSections(d.sections),
showCost: !!d.show_cost,
showReasoning: !!d.show_reasoning,
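The `mouseTracking` line in `applyDisplay` appears in two forms in this diff: `normalizeMouseTracking(d)` and the inline `d.tui_mouse !== false`. The sketch below reproduces the helper as shown in the diff next to the inline check and runs both over a few configs, to make the behavioural difference concrete. `inlineMouseTracking`, `Display`, and `cases` are names invented here.

```typescript
type Display = { mouse_tracking?: unknown; tui_mouse?: unknown }

const FALSEY_MOUSE = new Set(['0', 'false', 'no', 'off'])
const hasOwn = (obj: object, key: PropertyKey) => Object.prototype.hasOwnProperty.call(obj, key)

// Helper as it appears in the diff: `mouse_tracking` wins whenever the key
// is present, otherwise the `tui_mouse` key is consulted.
const normalizeMouseTracking = (display: Display): boolean => {
  const raw = hasOwn(display, 'mouse_tracking') ? display.mouse_tracking : display.tui_mouse

  if (raw === false || raw === 0) {
    return false
  }

  return typeof raw === 'string' ? !FALSEY_MOUSE.has(raw.trim().toLowerCase()) : true
}

// Inline form: only a literal `tui_mouse: false` disables tracking.
const inlineMouseTracking = (display: Display): boolean => display.tui_mouse !== false

const cases: Display[] = [
  {},
  { mouse_tracking: false },
  { mouse_tracking: 'off' },
  { tui_mouse: false },
  { mouse_tracking: true, tui_mouse: false }
]

const comparison = cases.map(input => ({
  inline: inlineMouseTracking(input),
  input,
  normalized: normalizeMouseTracking(input)
}))
```

The two agree on `{}` and `{ tui_mouse: false }`. They disagree whenever `mouse_tracking` is set: the inline check ignores that key entirely.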
+4 -2
@@ -92,6 +92,10 @@ export function useInputHandlers(ctx: InputHandlerContext): InputHandlerResult {
return patchOverlayState({ skillsHub: false })
}
if (overlay.learningLedger) {
return patchOverlayState({ learningLedger: false })
}
if (overlay.picker) {
return patchOverlayState({ picker: false })
}
@@ -366,7 +370,6 @@ export function useInputHandlers(ctx: InputHandlerContext): InputHandlerResult {
if (isCtrl(key, ch, 'x') && cState.queueEditIdx !== null) {
cActions.removeQueue(cState.queueEditIdx)
return cActions.clearIn()
}
@@ -394,7 +397,6 @@ export function useInputHandlers(ctx: InputHandlerContext): InputHandlerResult {
if (isAction(key, ch, 'l')) {
clearSelection()
forceRedraw(terminal.stdout ?? process.stdout)
return
}
-37
@@ -17,7 +17,6 @@ import type {
import { useGitBranch } from '../hooks/useGitBranch.js'
import { useVirtualHistory } from '../hooks/useVirtualHistory.js'
import { appendTranscriptMessage } from '../lib/messages.js'
import { isMac } from '../lib/platform.js'
import { asRpcResult, rpcErrorMessage } from '../lib/rpc.js'
import { terminalParityHints } from '../lib/terminalParity.js'
import { buildToolTrailLine, sameToolTrailGroup, toolTrailLabel } from '../lib/text.js'
@@ -144,47 +143,11 @@ export function useMainApp(gw: GatewayClient) {
const hasSelection = useHasSelection()
const selection = useSelection()
const lastCopiedVersionRef = useRef(-1)
useEffect(() => {
selection.setSelectionBgColor(ui.theme.color.selectionBg)
}, [selection, ui.theme.color.selectionBg])
// macOS Terminal.app does not forward Cmd+C to fullscreen TUIs that enable
// mouse tracking, so the only reliable native-feeling path is iTerm-style
// copy-on-select: once a drag creates a stable TUI selection, write it to
// the system clipboard while keeping the highlight visible.
//
// Subscribe directly via the ink selection bus (not useSyncExternalStore)
// so React doesn't re-render MainApp on every drag-move tick. The version
// ref de-dupes against re-entrant notifications.
useEffect(() => {
if (!isMac) {
return
}
return selection.subscribe(() => {
if (!selection.hasSelection()) {
return
}
const state = selection.getState() as { isDragging?: boolean } | null
if (state?.isDragging) {
return
}
const version = selection.version()
if (version === lastCopiedVersionRef.current) {
return
}
lastCopiedVersionRef.current = version
void selection.copySelectionNoClear()
})
}, [selection])
const clearSelection = useCallback(() => {
selection.clearSelection()
getInputSelection()?.collapseToEnd()
+5 -83
@@ -4,12 +4,7 @@ import { TYPING_IDLE_MS } from '../config/timing.js'
import { attachedImageNotice } from '../domain/messages.js'
import { looksLikeSlashCommand } from '../domain/slash.js'
import type { GatewayClient } from '../gatewayClient.js'
import type {
InputDetectDropResponse,
PromptSubmitResponse,
SessionSteerResponse,
ShellExecResponse
} from '../gatewayTypes.js'
import type { InputDetectDropResponse, PromptSubmitResponse, ShellExecResponse } from '../gatewayTypes.js'
import { asRpcResult } from '../lib/rpc.js'
import { hasInterpolation, INTERPOLATION_RE } from '../protocol/interpolation.js'
import { PASTE_SNIPPET_RE } from '../protocol/paste.js'
@@ -212,72 +207,6 @@ export function useSubmission(opts: UseSubmissionOptions) {
[interpolate, send, shellExec]
)
// Honors `display.busy_input_mode` from config.yaml (CLI parity):
// - 'queue' (legacy): append to queueRef; drains on busy → false
// - 'steer' : inject into the current turn via session.steer; falls
// back to queue when steer is rejected (no agent / no
// tool window).
// - 'interrupt' (default): cancel the in-flight turn, then send the
// new text as a fresh prompt so it actually moves.
//
// `opts.fallbackToFront` controls whether a steer fallback re-inserts
// at the front of the queue (used by the queue-edit path to preserve
// a picked item's position); the mainline submit path always appends.
const handleBusyInput = useCallback(
(full: string, opts: { fallbackToFront?: boolean } = {}) => {
const live = getUiState()
const mode = live.busyInputMode
const fallback = (note: string) => {
if (opts.fallbackToFront) {
composerRefs.queueRef.current.unshift(full)
composerActions.syncQueue()
} else {
composerActions.enqueue(full)
}
sys(note)
}
if (mode === 'queue') {
return composerActions.enqueue(full)
}
if (mode === 'steer' && live.sid) {
gw.request<SessionSteerResponse>('session.steer', { session_id: live.sid, text: full })
.then(raw => {
const r = asRpcResult<SessionSteerResponse>(raw)
if (r?.status !== 'queued') {
fallback('steer rejected — message queued for next turn')
}
})
.catch(() => fallback('steer failed — message queued for next turn'))
return
}
// 'interrupt' (default): tear down the current turn, then send.
// `interruptTurn` fires `session.interrupt` without awaiting; if
// the gateway is still mid-response when `prompt.submit` lands,
// `send()`'s catch path re-queues with a "queued: ..." sys note
// (`isSessionBusyError`) — so a lost race degrades to queue
// semantics, not a dropped message.
if (live.sid) {
turnController.interruptTurn({ appendMessage, gw, sid: live.sid, sys })
}
if (hasInterpolation(full)) {
patchUiState({ busy: true })
return interpolate(full, send)
}
send(full)
},
[appendMessage, composerActions, composerRefs, gw, interpolate, send, sys]
)
const dispatchSubmission = useCallback(
(full: string) => {
if (!full.trim()) {
@@ -323,16 +252,9 @@ export function useSubmission(opts: UseSubmissionOptions) {
}
if (getUiState().busy) {
// 'interrupt' / 'steer' should reach the live turn instead of
// silently going back to the queue. handleBusyInput resolves
// mode-specific behavior (interrupt-and-send, steer, or queue).
if (getUiState().busyInputMode === 'queue') {
composerRefs.queueRef.current.unshift(picked)
composerRefs.queueRef.current.unshift(picked)
return composerActions.syncQueue()
}
return handleBusyInput(picked, { fallbackToFront: true })
return composerActions.syncQueue()
}
return sendQueued(picked)
@@ -341,7 +263,7 @@ export function useSubmission(opts: UseSubmissionOptions) {
composerActions.pushHistory(full)
if (getUiState().busy) {
return handleBusyInput(full)
return composerActions.enqueue(full)
}
if (hasInterpolation(full)) {
@@ -352,7 +274,7 @@ export function useSubmission(opts: UseSubmissionOptions) {
send(full)
},
[appendMessage, composerActions, composerRefs, handleBusyInput, interpolate, send, sendQueued, shellExec, slashRef]
[appendMessage, composerActions, composerRefs, interpolate, send, sendQueued, shellExec, slashRef]
)
const submit = useCallback(
+3 -1
@@ -671,7 +671,9 @@ function DiffView({
<Text color={t.color.text}>
{diffMetricLine('duration', aTotals.totalDuration, bTotals.totalDuration, n => `${n.toFixed(1)}s`)}
</Text>
<Text color={t.color.text}>{diffMetricLine('tokens', sumTokens(aTotals), sumTokens(bTotals), fmtTokens)}</Text>
<Text color={t.color.text}>
{diffMetricLine('tokens', sumTokens(aTotals), sumTokens(bTotals), fmtTokens)}
</Text>
<Text color={t.color.text}>{diffMetricLine('cost', aTotals.costUsd, bTotals.costUsd, dollars)}</Text>
</Box>
</Box>
+5 -99
@@ -1,12 +1,9 @@
import { Box, type ScrollBoxHandle, Text } from '@hermes/ink'
import { useStore } from '@nanostores/react'
import { type ReactNode, type RefObject, useEffect, useMemo, useState } from 'react'
import unicodeSpinners from 'unicode-animations'
import { type RefObject, useEffect, useMemo, useState } from 'react'
import { $delegationState } from '../app/delegationStore.js'
import type { IndicatorStyle } from '../app/interfaces.js'
import { useTurnSelector } from '../app/turnStore.js'
import { $uiState } from '../app/uiStore.js'
import { FACES } from '../content/faces.js'
import { VERBS } from '../content/verbs.js'
import { fmtDuration } from '../domain/messages.js'
@@ -18,98 +15,23 @@ import type { Theme } from '../theme.js'
import type { Msg, Usage } from '../types.js'
const FACE_TICK_MS = 2500
const HEART_COLORS = ['#ff5fa2', '#ff4d6d']
// Compact alternates for the `emoji` and `ascii` indicator styles.
// Each entry is a fixed-width (display-width) glyph.
const EMOJI_FRAMES = ['⚕ ', '🌀', '🤔', '✨', '🍵', '🔮']
const ASCII_FRAMES = ['|', '/', '-', '\\']
// Faster tick for spinner-style indicators — they read as motion only
// at frame rates closer to their authored interval.
const SPINNER_TICK_MS = 100
interface IndicatorRender {
frame: string
intervalMs: number
// When false, FaceTicker hides the rotating verb and just shows the
// glyph + duration. Lets `unicode` stay minimal while the other
// styles keep the verb-rotation flavour users associate with the
// running… status.
showVerb: boolean
}
const renderIndicator = (style: IndicatorStyle, tick: number): IndicatorRender => {
if (style === 'kaomoji') {
return { frame: FACES[tick % FACES.length] ?? '', intervalMs: FACE_TICK_MS, showVerb: true }
}
if (style === 'emoji') {
return {
frame: EMOJI_FRAMES[tick % EMOJI_FRAMES.length] ?? '⚕ ',
intervalMs: SPINNER_TICK_MS * 6,
showVerb: true
}
}
if (style === 'ascii') {
return {
frame: ASCII_FRAMES[tick % ASCII_FRAMES.length] ?? '|',
intervalMs: SPINNER_TICK_MS,
showVerb: true
}
}
// 'unicode' — braille spinner (fixed 1-col). Authored interval is
// ~80ms; honour it but bound below at a safe minimum so React
// re-renders stay reasonable. This style is for users who want
// the cleanest possible status, so no verb rotation either.
const spinner = unicodeSpinners.braille
const frame = spinner.frames[tick % spinner.frames.length] ?? '⠋'
return { frame, intervalMs: Math.max(SPINNER_TICK_MS, spinner.interval), showVerb: false }
}
function FaceTicker({ color, startedAt }: { color: string; startedAt?: null | number }) {
const ui = useStore($uiState)
const style = ui.indicatorStyle
const [tick, setTick] = useState(() => Math.floor(Math.random() * 1000))
const [verbTick, setVerbTick] = useState(() => Math.floor(Math.random() * VERBS.length))
const [now, setNow] = useState(() => Date.now())
// Pre-compute cadence + verb-visibility for the active style so an
// `/indicator` switch re-arms the interval (and skips the verb timer
// for verb-less styles like `unicode`) without leaving the previous
// timer dangling.
const { intervalMs, showVerb } = renderIndicator(style, 0)
useEffect(() => {
const glyph = setInterval(() => setTick(n => n + 1), intervalMs)
const face = setInterval(() => setTick(n => n + 1), FACE_TICK_MS)
const clock = setInterval(() => setNow(Date.now()), 1000)
// Verb timer is gated on `showVerb` — `unicode` style hides the verb
// entirely, so cycling `verbTick` would be an avoidable re-render.
const verb = showVerb ? setInterval(() => setVerbTick(n => n + 1), FACE_TICK_MS) : null
return () => {
clearInterval(glyph)
clearInterval(face)
clearInterval(clock)
if (verb !== null) {
clearInterval(verb)
}
}
}, [intervalMs, showVerb])
const { frame } = renderIndicator(style, tick)
const verb = VERBS[verbTick % VERBS.length] ?? ''
const verbSegment = showVerb ? ` ${verb}` : ''
const durationSegment = startedAt ? ` · ${fmtDuration(now - startedAt)}` : ''
}, [])
return (
<Text color={color}>
{frame}
{verbSegment}
{durationSegment}
{FACES[tick % FACES.length]} {VERBS[tick % VERBS.length]}{startedAt ? ` · ${fmtDuration(now - startedAt)}` : ''}
</Text>
)
}
@@ -338,22 +260,6 @@ export function StatusRule({
)
}
export function FloatBox({ children, color }: { children: ReactNode; color: string }) {
return (
<Box
alignSelf="flex-start"
borderColor={color}
borderStyle="double"
flexDirection="column"
marginTop={1}
opaque
paddingX={1}
>
{children}
</Box>
)
}
export function StickyPromptTracker({ messages, offsets, scrollRef, onChange }: StickyPromptTrackerProps) {
const { atBottom, bottom, top } = useViewportSnapshot(scrollRef)
const text = stickyPromptFromViewport(messages, offsets, top, bottom, atBottom)
+6 -8
@@ -127,6 +127,9 @@ const ComposerPane = memo(function ComposerPane({
const promptText = sh ? '$' : ui.theme.brand.prompt
const promptLabel = `${promptText} `
const promptWidth = Math.max(1, stringWidth(promptLabel))
// ``pw`` retained as the local alias used by the mouse-drag handlers
// below — semantically the same value, kept short for readability there.
const pw = promptWidth
const inputColumns = stableComposerColumns(composer.cols, promptWidth)
const inputHeight = inputVisualHeight(composer.input, inputColumns)
const inputMouseRef = useRef<null | TextInputMouseApi>(null)
@@ -148,7 +151,7 @@ const ComposerPane = memo(function ComposerPane({
}
e.stopImmediatePropagation?.()
inputMouseRef.current?.dragAt(e.localRow ?? 0, (e.localCol ?? 0) - promptWidth)
inputMouseRef.current?.dragAt(e.localRow ?? 0, (e.localCol ?? 0) - pw)
}
// Spacer rows live on a different vertical origin; only the column is
@@ -160,7 +163,7 @@ const ComposerPane = memo(function ComposerPane({
}
e.stopImmediatePropagation?.()
inputMouseRef.current?.dragAt(0, (e.localCol ?? 0) - promptWidth)
inputMouseRef.current?.dragAt(0, (e.localCol ?? 0) - pw)
}
const endInputDrag = () => inputMouseRef.current?.end()
@@ -224,12 +227,7 @@ const ComposerPane = memo(function ComposerPane({
</Box>
))}
<Box
onMouseDown={captureInputDrag}
onMouseDrag={dragFromPromptRow}
onMouseUp={endInputDrag}
position="relative"
>
<Box onMouseDown={captureInputDrag} onMouseDrag={dragFromPromptRow} onMouseUp={endInputDrag} position="relative">
<Box width={promptWidth}>
{sh ? (
<Text color={ui.theme.color.shellDollar}>{promptLabel}</Text>
+157 -65
@@ -1,4 +1,4 @@
import { Box, Text } from '@hermes/ink'
import { Box, Text, useStdout } from '@hermes/ink'
import { useStore } from '@nanostores/react'
import { useGateway } from '../app/gatewayContext.js'
@@ -6,15 +6,18 @@ import type { AppOverlaysProps } from '../app/interfaces.js'
import { $overlayState, patchOverlayState } from '../app/overlayStore.js'
import { $uiState } from '../app/uiStore.js'
import { FloatBox } from './appChrome.js'
import { LearningLedger } from './learningLedger.js'
import { MaskedPrompt } from './maskedPrompt.js'
import { ModelPicker } from './modelPicker.js'
import { OverlayHint } from './overlayControls.js'
import { OverlayGrid } from './overlayGrid.js'
import { ApprovalPrompt, ClarifyPrompt, ConfirmPrompt } from './prompts.js'
import { SessionPicker } from './sessionPicker.js'
import { SkillsHub } from './skillsHub.js'
const COMPLETION_WINDOW = 16
const OVERLAY_GUTTER = 4
const OVERLAY_MIN_WIDTH = 44
export function PromptZone({
cols,
@@ -102,8 +105,15 @@ export function FloatingOverlays({
const { gw } = useGateway()
const overlay = useStore($overlayState)
const ui = useStore($uiState)
const { stdout } = useStdout()
const hasAny = overlay.modelPicker || overlay.pager || overlay.picker || overlay.skillsHub || completions.length
const hasAny =
overlay.learningLedger ||
overlay.modelPicker ||
overlay.pager ||
overlay.picker ||
overlay.skillsHub ||
completions.length
if (!hasAny) {
return null
@@ -115,87 +125,169 @@ export function FloatingOverlays({
const viewportSize = Math.min(COMPLETION_WINDOW, completions.length)
const start = Math.max(0, Math.min(compIdx - Math.floor(COMPLETION_WINDOW / 2), completions.length - viewportSize))
const overlayWidth = Math.max(OVERLAY_MIN_WIDTH, cols - OVERLAY_GUTTER)
const overlayMaxHeight = Math.max(6, Math.min(18, (stdout?.rows ?? 24) - 8))
return (
<Box alignItems="flex-start" bottom="100%" flexDirection="column" left={0} position="absolute" right={0}>
{overlay.picker && (
<FloatBox color={ui.theme.color.border}>
<SessionPicker
gw={gw}
onCancel={() => patchOverlayState({ picker: false })}
onSelect={onPickerSelect}
t={ui.theme}
/>
</FloatBox>
<OverlayGrid
borderColor={ui.theme.color.border}
panels={[
{
content: (
<SessionPicker
gw={gw}
onCancel={() => patchOverlayState({ picker: false })}
onSelect={onPickerSelect}
t={ui.theme}
/>
),
id: 'sessions'
}
]}
maxHeight={overlayMaxHeight}
t={ui.theme}
width={overlayWidth}
/>
)}
{overlay.modelPicker && (
<FloatBox color={ui.theme.color.border}>
<ModelPicker
gw={gw}
onCancel={() => patchOverlayState({ modelPicker: false })}
onSelect={onModelSelect}
sessionId={ui.sid}
t={ui.theme}
/>
</FloatBox>
<OverlayGrid
borderColor={ui.theme.color.border}
panels={[
{
content: (
<ModelPicker
gw={gw}
onCancel={() => patchOverlayState({ modelPicker: false })}
onSelect={onModelSelect}
sessionId={ui.sid}
t={ui.theme}
/>
),
id: 'models'
}
]}
maxHeight={overlayMaxHeight}
t={ui.theme}
width={overlayWidth}
/>
)}
{overlay.skillsHub && (
<FloatBox color={ui.theme.color.border}>
<SkillsHub gw={gw} onClose={() => patchOverlayState({ skillsHub: false })} t={ui.theme} />
</FloatBox>
<OverlayGrid
borderColor={ui.theme.color.border}
panels={[
{
content: <SkillsHub gw={gw} onClose={() => patchOverlayState({ skillsHub: false })} t={ui.theme} />,
id: 'skills'
}
]}
maxHeight={overlayMaxHeight}
t={ui.theme}
width={overlayWidth}
/>
)}
{overlay.learningLedger && (
<LearningLedger
borderColor={ui.theme.color.border}
gw={gw}
onClose={() => patchOverlayState({ learningLedger: false })}
t={ui.theme}
width={overlayWidth}
maxHeight={overlayMaxHeight}
/>
)}
{overlay.pager && (
<FloatBox color={ui.theme.color.border}>
<Box flexDirection="column" paddingX={1} paddingY={1}>
{overlay.pager.title && (
<Box justifyContent="center" marginBottom={1}>
<Text bold color={ui.theme.color.primary}>
{overlay.pager.title}
</Text>
</Box>
)}
<OverlayGrid
borderColor={ui.theme.color.border}
panels={[
{
content: (
<Box flexDirection="column">
{overlay.pager.lines
.slice(overlay.pager.offset, overlay.pager.offset + pagerPageSize)
.map((line, i) => (
<Text key={i}>{line}</Text>
))}
{overlay.pager.lines.slice(overlay.pager.offset, overlay.pager.offset + pagerPageSize).map((line, i) => (
<Text key={i}>{line}</Text>
))}
<Box marginTop={1}>
<OverlayHint t={ui.theme}>
{overlay.pager.offset + pagerPageSize < overlay.pager.lines.length
? `↑↓/jk line · Enter/Space/PgDn page · b/PgUp back · g/G top/bottom · Esc/q close (${Math.min(overlay.pager.offset + pagerPageSize, overlay.pager.lines.length)}/${overlay.pager.lines.length})`
: `end · ↑↓/jk · b/PgUp back · g top · Esc/q close (${overlay.pager.lines.length} lines)`}
</OverlayHint>
</Box>
</Box>
</FloatBox>
</Box>
),
footer: (
<OverlayHint t={ui.theme}>
{overlay.pager.offset + pagerPageSize < overlay.pager.lines.length
? `↑↓/jk line · Enter/Space/PgDn page · b/PgUp back · g/G top/bottom · Esc/q close (${Math.min(overlay.pager.offset + pagerPageSize, overlay.pager.lines.length)}/${overlay.pager.lines.length})`
: `end · ↑↓/jk · b/PgUp back · g top · Esc/q close (${overlay.pager.lines.length} lines)`}
</OverlayHint>
),
id: 'pager',
title: overlay.pager.title
}
]}
maxHeight={overlayMaxHeight}
t={ui.theme}
width={overlayWidth}
/>
)}
{!!completions.length && (
<FloatBox color={ui.theme.color.primary}>
<Box flexDirection="column" width={Math.max(28, cols - 6)}>
{completions.slice(start, start + viewportSize).map((item, i) => {
const active = start + i === compIdx
<OverlayGrid
borderColor={ui.theme.color.primary}
panels={[
{
content: (
<Box flexDirection="column">
{completions.slice(start, start + viewportSize).map((item, i) => {
const active = start + i === compIdx
return (
<Box
backgroundColor={active ? ui.theme.color.completionCurrentBg : undefined}
flexDirection="row"
key={`${start + i}:${item.text}:${item.display}:${item.meta ?? ''}`}
width="100%"
>
<Text bold color={ui.theme.color.label}>
{' '}
{item.display}
</Text>
{item.meta ? <Text color={ui.theme.color.muted}> {item.meta}</Text> : null}
return (
<Box
backgroundColor={active ? ui.theme.color.completionCurrentBg : undefined}
key={`${start + i}:${item.text}`}
width="100%"
>
<Text bold color={ui.theme.color.label} wrap="truncate-end">
{item.display}
</Text>
</Box>
)
})}
</Box>
)
})}
</Box>
</FloatBox>
),
grow: 4,
id: 'completion-list'
},
{
content: (
<Box flexDirection="column">
{completions.slice(start, start + viewportSize).map((item, i) => {
const active = start + i === compIdx
return (
<Box
backgroundColor={active ? ui.theme.color.completionCurrentBg : undefined}
key={`${start + i}:${item.text}:meta`}
width="100%"
>
<Text color={ui.theme.color.muted} wrap="truncate-end">
{item.meta ?? ' '}
</Text>
</Box>
)
})}
</Box>
),
grow: 6,
id: 'completion-meta'
}
]}
maxHeight={overlayMaxHeight}
t={ui.theme}
width={overlayWidth}
/>
)}
</Box>
)
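The sizing arithmetic in the hunk above can be isolated from the JSX. The following is a minimal sketch, not the component itself: the constants and formulas are copied from the diff, while the helper names `overlayDimensions` and `completionWindowStart` are hypothetical and introduced only for illustration.

```typescript
// Constants as they appear in the diff above.
const COMPLETION_WINDOW = 16
const OVERLAY_GUTTER = 4
const OVERLAY_MIN_WIDTH = 44

// Hypothetical helper: shared overlay width and height cap derived from the terminal size.
// `rows` may be undefined when stdout is unavailable; the diff falls back to 24.
function overlayDimensions(cols: number, rows?: number): { maxHeight: number; width: number } {
  return {
    maxHeight: Math.max(6, Math.min(18, (rows ?? 24) - 8)),
    width: Math.max(OVERLAY_MIN_WIDTH, cols - OVERLAY_GUTTER)
  }
}

// Hypothetical helper: first visible completion index, keeping the active
// item roughly centred without scrolling past the end of the list.
function completionWindowStart(compIdx: number, total: number): number {
  const viewportSize = Math.min(COMPLETION_WINDOW, total)

  return Math.max(0, Math.min(compIdx - Math.floor(COMPLETION_WINDOW / 2), total - viewportSize))
}

console.log(overlayDimensions(120, 40)) // width 116, maxHeight 18
console.log(completionWindowStart(20, 40)) // 12
```

Under these formulas the height cap never drops below 6 rows or exceeds 18, so a very short terminal still gets a usable panel and a very tall one does not let the overlay grow without bound.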
+16
@@ -89,6 +89,16 @@ export function SessionPanel({ info, sid, t }: SessionPanelProps) {
</Box>
)
}
const learningLine = (() => {
const counts = info.learning?.counts ?? {}
const parts = [
counts.user || counts.memory ? `${(counts.user ?? 0) + (counts.memory ?? 0)} memories` : '',
counts.recall ? `${counts.recall} recalls` : '',
counts['skill-use'] ? `${counts['skill-use']} applied skills` : ''
].filter(Boolean)
return parts.length ? `learned: ${parts.join(' · ')}` : ''
})()
return (
<Box borderColor={t.color.border} borderStyle="round" marginBottom={1} paddingX={2} paddingY={1}>
@@ -160,6 +170,12 @@ export function SessionPanel({ info, sid, t }: SessionPanelProps) {
<Text color={t.color.muted}>/help for commands</Text>
</Text>
{learningLine && (
<Text color={t.color.text} dimColor italic>
{learningLine} · /learned
</Text>
)}
{typeof info.update_behind === 'number' && info.update_behind > 0 && (
<Text bold color={t.color.warn}>
! {info.update_behind} {info.update_behind === 1 ? 'commit' : 'commits'} behind
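The `learningLine` summary added in this file can be read as a pure function of the counts map. This is a sketch under that assumption: the function name `summarizeLearning` is hypothetical, and the logic mirrors the diff, including the fact that it does not singularize "1 recalls".

```typescript
// Hypothetical standalone version of the `learningLine` IIFE from the diff.
// Keys ('user', 'memory', 'recall', 'skill-use') are the ones the diff reads.
function summarizeLearning(counts: Record<string, number | undefined>): string {
  const parts = [
    counts.user || counts.memory ? `${(counts.user ?? 0) + (counts.memory ?? 0)} memories` : '',
    counts.recall ? `${counts.recall} recalls` : '',
    counts['skill-use'] ? `${counts['skill-use']} applied skills` : ''
  ].filter(Boolean)

  // An empty string lets the caller skip rendering the line entirely.
  return parts.length ? `learned: ${parts.join(' · ')}` : ''
}

console.log(summarizeLearning({ memory: 3, recall: 1, user: 2 }))
```

Returning an empty string for empty counts matches how the panel gates the line with `learningLine && (...)`.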
+316
@@ -0,0 +1,316 @@
import { Box, Text, useInput, useStdout } from '@hermes/ink'
import { useEffect, useState } from 'react'
import type { GatewayClient } from '../gatewayClient.js'
import { rpcErrorMessage } from '../lib/rpc.js'
import type { Theme } from '../theme.js'
import { OverlayGrid } from './overlayGrid.js'
import { OverlayHint, windowItems, windowOffset } from './overlayControls.js'
const EDGE_GUTTER = 10
const MAX_WIDTH = 132
const MIN_WIDTH = 64
const VISIBLE_ROWS = 12
const LISTS = [
{ id: 'memories', title: 'Memories', types: ['user', 'memory'] },
{ id: 'skills', title: 'Skills', types: ['skill-use'] },
{ id: 'recalls', title: 'Recalls', types: ['recall'] },
{ id: 'connected', title: 'Connected', types: ['integration'] }
] as const
const typeIcon: Record<string, string> = {
integration: '◇',
memory: '◆',
recall: '↺',
'skill-use': '✦',
user: '●'
}
const fmtTime = (ts?: null | number) => {
if (!ts) {
return ''
}
const days = Math.floor((Date.now() - ts * 1000) / 86_400_000)
return days <= 0 ? 'today' : `${days}d ago`
}
export function LearningLedger({ borderColor, gw, maxHeight, onClose, t, width: fixedWidth }: LearningLedgerProps) {
const [ledger, setLedger] = useState<LearningLedgerResponse | null>(null)
const [activeList, setActiveList] = useState(0)
const [indices, setIndices] = useState<Record<string, number>>({})
const [expanded, setExpanded] = useState(false)
const [err, setErr] = useState('')
const [loading, setLoading] = useState(true)
const { stdout } = useStdout()
const width = fixedWidth ?? Math.max(MIN_WIDTH, Math.min(MAX_WIDTH, (stdout?.columns ?? 80) - EDGE_GUTTER))
useEffect(() => {
gw.request<LearningLedgerResponse>('learning.ledger', { limit: 120 })
.then(r => {
setLedger(r)
setErr('')
})
.catch((e: unknown) => setErr(rpcErrorMessage(e)))
.finally(() => setLoading(false))
}, [gw])
const items = ledger?.items ?? []
const lists = LISTS.map(list => ({
...list,
items: items.filter(item => list.types.includes(item.type as never))
}))
const active = lists[activeList] ?? lists[0]!
const activeIdx = Math.min(indices[active.id] ?? 0, Math.max(0, active.items.length - 1))
const selected = active.items[activeIdx]
const detailOpen = expanded && !!selected
useInput((ch, key) => {
if (key.escape || ch.toLowerCase() === 'q') {
onClose()
return
}
if (key.leftArrow && activeList > 0) {
setActiveList(v => v - 1)
return
}
if (key.rightArrow && activeList < lists.length - 1) {
setActiveList(v => v + 1)
return
}
if (key.upArrow && activeIdx > 0) {
setIndices(v => ({ ...v, [active.id]: activeIdx - 1 }))
return
}
if (key.downArrow && activeIdx < active.items.length - 1) {
setIndices(v => ({ ...v, [active.id]: activeIdx + 1 }))
return
}
if (key.return || ch === ' ') {
setExpanded(v => !v)
return
}
const n = ch === '0' ? 10 : parseInt(ch, 10)
if (!Number.isNaN(n) && n >= 1 && n <= Math.min(10, active.items.length)) {
const next = windowOffset(active.items.length, activeIdx, VISIBLE_ROWS) + n - 1
if (active.items[next]) {
setIndices(v => ({ ...v, [active.id]: next }))
}
}
})
if (loading) {
return <Text color={t.color.muted}>indexing learning ledger</Text>
}
if (err) {
return (
<Box flexDirection="column" width={width}>
<Text color={t.color.label}>learning ledger error: {err}</Text>
<OverlayHint t={t}>Esc/q close</OverlayHint>
</Box>
)
}
if (!items.length) {
return (
<Box flexDirection="column" width={width}>
<Text bold color={t.color.accent}>
Recent Learning
</Text>
<Text color={t.color.muted}>no memories, recalls, used skills, or integrations found yet</Text>
{ledger?.inventory?.skills ? (
<Text color={t.color.muted}>available knowledge: {ledger.inventory.skills} installed skills</Text>
) : null}
<OverlayHint t={t}>Esc/q close</OverlayHint>
</Box>
)
}
const listPanels = lists.map((list, listIdx) => {
const selectedIndex = Math.min(indices[list.id] ?? 0, Math.max(0, list.items.length - 1))
const { items: visible, offset } = windowItems(list.items, selectedIndex, Math.max(3, Math.floor(VISIBLE_ROWS / 2)))
return {
content: (
<LearningList
active={activeList === listIdx}
items={visible}
offset={offset}
selectedIndex={selectedIndex}
t={t}
total={list.items.length}
/>
),
grow: 1,
id: `learning-${list.id}`,
title: list.title
}
})
return (
<OverlayGrid
borderColor={borderColor}
footer={<OverlayHint t={t}>←/→ panel · ↑/↓ select · Enter/Space details · 1-9,0 quick · Esc/q close</OverlayHint>}
panels={[
...listPanels,
...(detailOpen && selected
? [
{
content: <LedgerDetails item={selected} t={t} />,
grow: 2,
id: 'learning-details',
title: 'Details'
}
]
: [])
]}
maxHeight={maxHeight}
t={t}
width={width}
/>
)
}
function LearningList({ active, items, offset, selectedIndex, t, total }: LearningListProps) {
return (
<Box flexDirection="column">
<Text color={active ? t.color.accent : t.color.muted}>{total} item{total === 1 ? '' : 's'}</Text>
{offset > 0 && <Text color={t.color.muted}> {offset} more</Text>}
<Box flexDirection="column">
{items.map((item, i) => {
const absolute = offset + i
return (
<LedgerRow
active={active && absolute === selectedIndex}
index={i + 1}
item={item}
key={`${item.type}:${item.name}:${i}`}
t={t}
/>
)
})}
</Box>
{offset + items.length < total && (
<Text color={t.color.muted}> {total - offset - items.length} more</Text>
)}
</Box>
)
}
function LedgerRow({ active, index, item, t }: LedgerRowProps) {
const when = fmtTime(item.last_used_at ?? item.learned_at)
const count = item.count ? ` ×${item.count}` : ''
const icon = typeIcon[item.type] ?? '•'
const title = compactTitle(item)
return (
<Box flexShrink={0} width="100%">
<Text bold={active} color={active ? t.color.accent : t.color.muted} inverse={active} wrap="truncate-end">
{active ? '▸ ' : ' '}
{index}. {icon} {title}
<Text color={active ? t.color.accent : t.color.muted}>
{' '}
{count}
{when ? ` · ${when}` : ''}
</Text>
</Text>
</Box>
)
}
function compactTitle(item: LearningLedgerItem) {
const raw = item.type === 'memory' || item.type === 'user' ? item.summary : item.name
return raw
.replace(/^User\s+/i, '')
.replace(/^Durable memory updates$/i, 'memory updated')
.replace(/^session_search$/i, 'past sessions')
}
function LedgerDetails({ item, t }: LedgerDetailsProps) {
const memoryLike = item.type === 'memory' || item.type === 'user'
return (
<Box flexDirection="column">
<Text color={t.color.primary} wrap="truncate-end">
{memoryLike ? item.name : item.summary}
</Text>
{memoryLike ? <Text color={t.color.text}>{item.summary}</Text> : null}
{item.count ? <Text color={t.color.muted}>used: {item.count}×</Text> : null}
{item.learned_from ? <Text color={t.color.muted}>from: {item.learned_from}</Text> : null}
{item.via ? <Text color={t.color.muted}>via: {item.via}</Text> : null}
{item.last_used_at ? <Text color={t.color.muted}>last used: {fmtTime(item.last_used_at)}</Text> : null}
<Text color={t.color.muted}>source: {item.source}</Text>
</Box>
)
}
interface LearningLedgerItem {
count?: number
learned_from?: null | string
last_used_at?: null | number
learned_at?: null | number
name: string
source: string
summary: string
type: string
via?: null | string
}
interface LearningLedgerResponse {
counts?: Record<string, number>
generated_at?: number
home?: string
inventory?: { skills?: number }
items?: LearningLedgerItem[]
total?: number
}
interface LearningListProps {
active: boolean
items: LearningLedgerItem[]
offset: number
selectedIndex: number
t: Theme
total: number
}
interface LedgerRowProps {
active: boolean
index: number
item: LearningLedgerItem
t: Theme
}
interface LedgerDetailsProps {
item: LearningLedgerItem
t: Theme
}
interface LearningLedgerProps {
borderColor: string
gw: GatewayClient
maxHeight?: number
onClose: () => void
t: Theme
width?: number
}
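Two pieces of the ledger logic above are easy to exercise outside React: bucketing items into the four category lists, and the relative-day label. The sketch below assumes a trimmed-down item shape (`LedgerItem`) and hypothetical helper names (`groupByList`, `fmtRelativeDays`); the clock is passed in as a parameter so the output is deterministic, whereas the diff's `fmtTime` reads `Date.now()` directly.

```typescript
// Trimmed-down item shape for illustration; the diff's LearningLedgerItem has more fields.
interface LedgerItem {
  name: string
  type: string
}

// Category lists as declared in the diff.
const LISTS = [
  { id: 'memories', title: 'Memories', types: ['user', 'memory'] },
  { id: 'skills', title: 'Skills', types: ['skill-use'] },
  { id: 'recalls', title: 'Recalls', types: ['recall'] },
  { id: 'connected', title: 'Connected', types: ['integration'] }
] as const

// Hypothetical helper: bucket items by the list whose `types` include the item type.
// Items with an unknown type fall into no list, as in the diff.
function groupByList(items: LedgerItem[]) {
  return LISTS.map(list => ({
    id: list.id,
    items: items.filter(item => (list.types as readonly string[]).includes(item.type)),
    title: list.title
  }))
}

// Hypothetical variant of fmtTime: `ts` is in seconds, `nowMs` in milliseconds.
function fmtRelativeDays(ts: null | number | undefined, nowMs: number): string {
  if (!ts) {
    return ''
  }

  const days = Math.floor((nowMs - ts * 1000) / 86400000)

  return days <= 0 ? 'today' : `${days}d ago`
}

const now = 1700000000000
console.log(fmtRelativeDays(1700000000 - 3 * 86400, now)) // 3d ago
```

Injecting the clock is a testing convenience here, not a claim about how the component should be structured.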
+10
@@ -94,6 +94,16 @@ export const MessageLine = memo(function MessageLine({
)
}
if (msg.kind === 'learning') {
return (
<Box marginLeft={3} marginTop={1}>
<Text color={t.color.muted} italic>
{msg.text}
</Text>
</Box>
)
}
const { body, glyph, prefix } = ROLE[msg.role](t)
const showDetails =
+2 -4
@@ -25,10 +25,8 @@ export function ModelPicker({ gw, onCancel, onSelect, sessionId, t }: ModelPicke
const [stage, setStage] = useState<'model' | 'provider'>('provider')
const { stdout } = useStdout()
// Pin the picker to a stable width so the FloatBox parent (which shrinks-
// to-fit with alignSelf="flex-start") doesn't resize as long provider /
// model names scroll into view, and so `wrap="truncate-end"` on each row
// has an actual constraint to truncate against.
// Pin the picker to a stable width so long provider / model names scroll
// into view without changing the overlay grid's measured layout.
const width = Math.max(MIN_WIDTH, Math.min(MAX_WIDTH, (stdout?.columns ?? 80) - 6))
useEffect(() => {
+79
@@ -0,0 +1,79 @@
import { Box, Text } from '@hermes/ink'
import type { ReactNode } from 'react'
import type { Theme } from '../theme.js'
const GAP = 2
export function OverlayGrid({ borderColor, footer, maxHeight, panels, t, width }: OverlayGridProps) {
const visible = panels.filter(p => p.content)
const innerWidth = Math.max(20, width - 4)
const innerHeight = maxHeight ? Math.max(1, maxHeight - 2) : undefined
const panelHeight = innerHeight ? Math.max(1, innerHeight - (footer ? 1 : 0)) : undefined
const gapTotal = Math.max(0, visible.length - 1) * GAP
const usable = Math.max(1, innerWidth - gapTotal)
const growTotal = visible.reduce((sum, p) => sum + (p.grow ?? 1), 0) || 1
let used = 0
return (
<Box
alignSelf="flex-start"
borderColor={borderColor}
borderStyle="double"
flexDirection="column"
marginTop={1}
opaque
paddingX={1}
width={width}
>
<Box flexDirection="row">
{visible.map((panel, i) => {
const last = i === visible.length - 1
const panelWidth = last
? Math.max(1, usable - used)
: Math.max(1, Math.floor((usable * (panel.grow ?? 1)) / growTotal))
used += panelWidth
return (
<Box flexDirection="row" key={panel.id}>
<Box flexDirection="column" flexShrink={0} width={panelWidth}>
{panel.title ? (
<Text bold color={t.color.accent}>
{panel.title}
</Text>
) : null}
<Box
flexDirection="column"
height={panelHeight ? Math.max(1, panelHeight - (panel.title ? 1 : 0) - (panel.footer ? 1 : 0)) : undefined}
overflow="hidden"
>
{panel.content}
</Box>
{panel.footer ? <Box flexDirection="column">{panel.footer}</Box> : null}
</Box>
{!last ? <Box flexShrink={0} width={GAP} /> : null}
</Box>
)
})}
</Box>
{footer ? <Box flexDirection="column">{footer}</Box> : null}
</Box>
)
}
export interface OverlayGridPanel {
content: ReactNode
footer?: ReactNode
grow?: number
id: string
title?: string
}
interface OverlayGridProps {
borderColor: string
footer?: ReactNode
maxHeight?: number
panels: OverlayGridPanel[]
t: Theme
width: number
}
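The width distribution inside `OverlayGrid` can be sketched as a pure function: subtract border, padding, and gaps, give each panel a floor of its proportional share, and let the last panel absorb the rounding remainder. The function name `allocatePanelWidths` is hypothetical; the arithmetic follows the component above.

```typescript
const GAP = 2

// Hypothetical helper mirroring OverlayGrid's width math.
// `width` is the outer overlay width; `grows` holds each panel's grow weight.
function allocatePanelWidths(width: number, grows: number[]): number[] {
  // 4 columns are reserved for the double border and paddingX={1}.
  const innerWidth = Math.max(20, width - 4)
  const gapTotal = Math.max(0, grows.length - 1) * GAP
  const usable = Math.max(1, innerWidth - gapTotal)
  const growTotal = grows.reduce((sum, g) => sum + g, 0) || 1
  let used = 0

  return grows.map((grow, i) => {
    const last = i === grows.length - 1

    // The last panel takes whatever is left so the widths sum to `usable`.
    const panelWidth = last ? Math.max(1, usable - used) : Math.max(1, Math.floor((usable * grow) / growTotal))

    used += panelWidth

    return panelWidth
  })
}

console.log(allocatePanelWidths(80, [4, 6])) // [ 29, 45 ]
```

For the completion overlay's 4:6 split at width 80, the inner width is 76, one gap leaves 74 usable columns, the list panel gets floor(74 × 4 / 10) = 29, and the meta panel takes the remaining 45.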
+1 -6
@@ -133,12 +133,7 @@ export function SessionPicker({ gw, onCancel, onSelect, t }: SessionPickerProps)
</Text>
</Box>
<Text
bold={selected}
color={selected ? t.color.accent : t.color.muted}
inverse={selected}
wrap="truncate-end"
>
<Text bold={selected} color={selected ? t.color.accent : t.color.muted} inverse={selected} wrap="truncate-end">
{s.title || s.preview || '(untitled)'}
</Text>
</Box>
-12
@@ -360,10 +360,6 @@ export function TextInput({
const nativeCursor = focus && termFocus && !selected && !!stdout?.isTTY
// Placeholder text is just a hint, not a selection — render it dim
// without inverse styling. In a TTY the hardware cursor parks at column
// 0 and visually marks the input start. Non-TTY surfaces still need the
// synthetic inverse first-char to draw a cursor at all.
const rendered = useMemo(() => {
if (!focus) {
return display || dim(placeholder)
@@ -715,14 +711,6 @@ export function TextInput({
if (range && range.start === range.end) {
selRef.current = null
setSel(null)
return
}
const normalized = selRange()
if (isMac && normalized) {
void writeClipboardText(vRef.current.slice(normalized.start, normalized.end))
}
}
+1 -1
@@ -873,7 +873,7 @@ export const ToolTrail = memo(function ToolTrail({
const hasTools = groups.length > 0
const hasSubagents = subagents.length > 0
const hasMeta = meta.length > 0
const hasThinking = !!cot || reasoningActive || reasoningStreaming
const hasThinking = !!cot || reasoningActive || busy
const thinkingLive = reasoningActive || reasoningStreaming
const tokenCount =

Some files were not shown because too many files have changed in this diff.