Compare commits

22 Commits

Author SHA1 Message Date
teknium1 a8f462bc9a docs: add backup and transfer guide for moving installs between machines
hermes backup / hermes import already exist and work, but there was no docs page explaining the end-to-end flow. Add a Guides and Tutorials page covering what is in/left out of the zip, the 5-step transfer flow (backup, move, install, import, verify), quick snapshots vs full backups, security notes, what does not transfer cleanly, and troubleshooting.
2026-05-08 05:18:37 -07:00
brooklyn! 42f9234da3 feat(tui): segment turns with rule above non-first user msgs; trim ticker dead space (#21846)
Multi-turn transcripts ran together visually because every user message
got the same vertical rhythm regardless of position. Adds a short ─── in
the border colour above every user message after the first, so each turn
reads as its own block. Height estimator gains a `withSeparator` flag so
virtual scrolling pre-allocates the extra two rows (rule + top margin)
and avoids a jump on first measurement.

While in the area: the busy-indicator duration was padded with
`padStart(7)`, leaving five visible spaces between `·` and the digits
(`⠋ ·      2s`) — especially loud under the verb-less `unicode` style.
Drop the padding entirely (`⠋ · 2s`); the model label now shifts a few
columns as the duration grows, which is the right trade-off for the
minimal indicator styles. The verb-padding test stays; the
duration-padding test is removed alongside the function it covered.
2026-05-08 05:12:09 -07:00
Siddharth Balyan 7190e20e0b fix: include terminal backend in quick setup wizard (#21842)
The quick setup flow (recommended for first-time users) silently defaulted
terminal.backend to 'local' without ever presenting the choice. This meant
new users who wanted Docker, SSH, Modal, Daytona, or any other backend had
to know about 'hermes setup terminal' — which most wouldn't discover until
later.

Now the quick setup flow is:
  1. Provider selection
  2. API key
  3. Terminal backend (local/Docker/Modal/SSH/Daytona/Vercel/Singularity)
  4. Messaging platform
  5. Done

The terminal backend is a foundational decision (where ALL commands run)
and belongs in the onboarding path alongside provider selection.
2026-05-08 17:36:38 +05:30
Teknium 83c23e8861 fix(google-workspace): cleanup for --check-live salvage
Small follow-ups on top of #19643:
- check_auth() takes a quiet kwarg to suppress its AUTHENTICATED print
  when called from check_auth_live(), so the final status line reflects
  the live-call outcome only.
- Drop redundant _ensure_deps() call in check_auth_live() (check_auth()
  already calls it).
- Add AUTHOR_MAP entry for ygd58 so release attribution script works.
2026-05-08 04:50:43 -07:00
ygd58 617ac0535b fix: correct docstring syntax error in check_auth_live 2026-05-08 04:50:43 -07:00
ygd58 5fa493a2ca fix(google-workspace): detect disabled_client in --check and add --check-live
setup.py --check only validated token shape/expiry but did not detect
when Google had disabled the OAuth client or account. Users got
AUTHENTICATED even when actual API calls failed with disabled_client.

Changes:
- Catch disabled_client and invalid_client in check_auth() refresh
  path with actionable guidance (check Cloud Console, check account
  status, do not retry)
- Add check_auth_live() that performs a real Calendar API call to
  detect disabled_client errors that survive token refresh
- Add --check-live CLI flag backed by check_auth_live()

Fixes #19570
2026-05-08 04:50:43 -07:00
Shannon Sands 80775d7585 test(auth): assert Nous refresh rotation payload 2026-05-08 04:17:42 -07:00
Shannon Sands b32461f6e8 fix(auth): send Nous refresh token via header 2026-05-08 04:17:42 -07:00
Teknium 486b14b423 feat(cron): routing intent — deliver=all fans out to every connected channel (#21495)
Adds one reserved token to the cron `deliver` field:

- `all` — expand to every platform with a configured home channel

Resolves at fire time, not create time, so a job created before Telegram
was wired up picks it up once `TELEGRAM_HOME_CHANNEL` is set. Composes
with existing targets: `origin,all`, `all,telegram:-100:17`.
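
A minimal sketch of the fire-time expansion described above. The `HOME_CHANNELS` mapping and the function names are hypothetical stand-ins, and the sketch de-duplicates plain strings rather than the (platform, chat_id, thread_id) tuples the scheduler uses:

```python
from typing import Dict, List

# Hypothetical stand-in for configured home channels. An empty string means
# the platform has no home channel configured yet.
HOME_CHANNELS: Dict[str, str] = {"telegram": "-100123", "discord": "", "slack": "C042"}

ROUTING_TOKENS = frozenset({"all"})


def expand_routing_token(part: str, home_channels: Dict[str, str]) -> List[str]:
    """Expand ``all`` to platforms with a home channel; pass other values through."""
    if part.lower() not in ROUTING_TOKENS:
        return [part]
    return [name for name, chat_id in home_channels.items() if chat_id]


def resolve_deliver(deliver: str, home_channels: Dict[str, str]) -> List[str]:
    """Split a comma-separated deliver string, expand tokens, drop duplicates."""
    resolved: List[str] = []
    for raw in (p.strip() for p in deliver.split(",")):
        if not raw:
            continue
        for target in expand_routing_token(raw, home_channels):
            if target not in resolved:  # keep first occurrence, preserve order
                resolved.append(target)
    return resolved
```

Because the mapping is read on every call, a platform that gains a home channel after the job was created is picked up on the next resolution.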

Inspired by Vellum Assistant's reminder routing-intent system.

## Changes
- cron/scheduler.py: _expand_routing_tokens + integrate into _resolve_delivery_targets
- tools/cronjob_tools.py: schema description updated
- tests/cron/test_scheduler.py: TestRoutingIntents (5 cases)
- website/docs/user-guide/features/cron.md: docs + table rows

## Validation
- tests/cron/test_scheduler.py -k 'Routing or Deliver' → 57 passed
2026-05-08 04:17:21 -07:00
kshitijk4poor 81928f03ab refactor(gmi): move User-Agent to profile.default_headers
The previous revision of this PR added six GMI-specific branches
(`elif base_url_host_matches(..., 'api.gmi-serving.com')`) across
run_agent.py and agent/auxiliary_client.py, plus a _HERMES_UA_HEADERS
constant in auxiliary_client.py.

ProviderProfile already has a `default_headers: dict[str, str]` field
commented as 'Client-level quirks (set once at client construction)'.
Other plugins (ai-gateway, kimi-coding) already use it. Two of the four
auxiliary_client sites we previously patched already had a generic
`else: profile.default_headers` fallback that picked it up (so did
both run_agent sites).

This revision:

* Sets `default_headers={'User-Agent': 'HermesAgent/<ver>'}` on the
  GMI profile in plugins/model-providers/gmi/__init__.py.
* Reverts all six GMI-specific branches in run_agent.py and
  auxiliary_client.py.
* Adds the generic profile-fallback `else` block to the two
  auxiliary_client sites (`_to_async_client`, `resolve_provider_client`)
  that didn't have it yet. This benefits every provider whose profile
  declares default_headers, not just GMI — e.g. Vercel AI Gateway's
  HTTP-Referer/X-Title now flow through the async client path too.
* Replaces the GMI-specific URL-branch tests with a profile-level
  assertion and keeps the run_agent integration test (with
  `provider='gmi'` so the fallback picks up the profile).

Net diff vs main: +82/-0 across 5 files, touching only the GMI plugin,
two generic fallback blocks in auxiliary_client.py, AUTHOR_MAP, and
tests. No core files change.
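
The generic fallback can be sketched roughly as follows. `ProviderProfile` here is a simplified, hypothetical stand-in with only the `default_headers` field, and `PROFILES`, `get_provider_profile`, and `client_kwargs` are illustrative names, not the real registry or client code:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProviderProfile:
    """Simplified stand-in: only the fields this sketch needs."""
    name: str
    default_headers: Dict[str, str] = field(default_factory=dict)


# Hypothetical registry; the version placeholder is kept as written above.
PROFILES: Dict[str, ProviderProfile] = {
    "gmi": ProviderProfile("gmi", {"User-Agent": "HermesAgent/<ver>"}),
    "plain": ProviderProfile("plain"),
}


def get_provider_profile(name: str) -> Optional[ProviderProfile]:
    return PROFILES.get(name)


def client_kwargs(provider: str, base_url: str) -> Dict[str, object]:
    """Build client kwargs, adding profile headers only when the profile has any."""
    kwargs: Dict[str, object] = {"base_url": base_url}
    profile = get_provider_profile(provider)
    if profile and profile.default_headers:
        # Copy so later mutation of the kwargs never touches the profile.
        kwargs["default_headers"] = dict(profile.default_headers)
    return kwargs
```

Any provider whose profile declares headers gets them through this one path, which is why no per-host branch is needed.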

Based on #20907 by @isaachuangGMICLOUD.
2026-05-08 03:22:11 -07:00
Isaac Huang 5d1bdf11b6 Add AUTHOR_MAP entry for Isaac Huang 2026-05-08 03:22:11 -07:00
kshitij 7338e5d9ba fix(model-switch): prevent stale Ollama credentials after provider switch (#21703)
When switching from a custom local provider (e.g. ollama-launch) to a
cloud provider, three bugs caused the CLI to misbehave:

1. _explicit_api_key/_explicit_base_url were only updated when the switch
   result had non-empty values (guarded by `if result.api_key:` etc.).
   If the previous provider set these to Ollama values ("ollama",
   "http://127.0.0.1:11434/v1"), those stale values leaked into the next
   turn's _ensure_runtime_credentials() call and were forwarded to the
   new provider's API endpoint, causing authentication/routing failures.

   Fix: unconditionally write result.api_key/base_url into the explicit
   fields after every successful switch. An empty string is the correct
   sentinel — it tells _ensure_runtime_credentials to re-resolve from the
   auth store / config rather than forwarding a stale override.

2. In AIAgent.switch_model(), `self.base_url = base_url or self.base_url`
   kept the old Ollama localhost URL whenever the incoming base_url was an
   empty string. For providers that use a native SDK (not an OpenAI-compat
   endpoint), the caller passes base_url="" and expects the agent to clear
   the field — not silently inherit Ollama's address.

   Fix: only update self.base_url when base_url is truthy.

3. _handle_model_picker_selection() was called from the prompt_toolkit
   Enter key binding without any exception guard. Any unexpected error
   in the model-selection code path propagated through prompt_toolkit's
   key-binding dispatcher and caused the entire TUI to exit — which the
   user sees as "the terminal exits when I switch providers".

   Fix: wrap the call in try/except and close the picker on failure.
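
The first bug can be illustrated with a small sketch. `SwitchResult` and `Session` are hypothetical stand-ins for the switch result and the CLI state, and only the guarded versus unconditional assignment is modelled:

```python
from dataclasses import dataclass


@dataclass
class SwitchResult:
    """Hypothetical stand-in for the model-switch result."""
    api_key: str = ""
    base_url: str = ""


class Session:
    """Hypothetical holder for the explicit credential overrides."""

    def __init__(self) -> None:
        # State left behind by a previous local Ollama provider.
        self.explicit_api_key = "ollama"
        self.explicit_base_url = "http://127.0.0.1:11434/v1"

    def apply_guarded(self, result: SwitchResult) -> None:
        # Old behaviour: empty values are skipped, so stale overrides survive.
        if result.api_key:
            self.explicit_api_key = result.api_key
        if result.base_url:
            self.explicit_base_url = result.base_url

    def apply_unconditional(self, result: SwitchResult) -> None:
        # New behaviour: always overwrite; "" signals "re-resolve later".
        self.explicit_api_key = result.api_key
        self.explicit_base_url = result.base_url
```

With an empty `SwitchResult`, the guarded version leaves the Ollama values in place, while the unconditional version clears both fields.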
2026-05-08 14:28:54 +05:30
helix4u faa13e49f8 docs(web): fix SearXNG env configuration 2026-05-07 17:54:47 -07:00
Teknium 1bdacb697c chore(release): add BennetYrWang to AUTHOR_MAP 2026-05-07 17:47:22 -07:00
BennetYrWang 34f7297359 Serialize Hermes config access 2026-05-07 17:47:22 -07:00
Teknium 307c85e5c1 fix(goals): auto-pause when judge model returns unparseable output
Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose
when asked for the strict {done, reason} JSON verdict. The old code
failed-open to continue on every such turn, burning the entire turn
budget with log lines like

  judge returned empty response
  judge reply was not JSON: "Let me analyze whether the goal..."

and /goal clear could not stop it mid-loop without /stop.

After N=3 consecutive *parse* failures (transport/API errors don't
count — those are transient), the loop auto-pauses and prints:

  ⏸ Goal paused — the judge model (3 turns) isn't returning the
  required JSON verdict. Route the judge to a stricter model in
  ~/.hermes/config.yaml:
    auxiliary:
      goal_judge:
        provider: openrouter
        model: google/gemini-3-flash-preview
  Then /goal resume to continue.

The counter resets on any usable verdict ("done" or "continue") and on
API errors, and persists across GoalManager reloads so cross-session
resumes carry the correct state.
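
A rough sketch of the consecutive-failure counter, assuming a verdict counts as usable when it parses as a JSON object with a `done` key. `JudgeTracker` and its method names are hypothetical, and persistence across reloads is not modelled:

```python
import json
from typing import Optional

MAX_PARSE_FAILURES = 3  # N=3 from the description above


class JudgeTracker:
    """Hypothetical tracker for consecutive unparseable judge replies."""

    def __init__(self) -> None:
        self.parse_failures = 0
        self.paused = False

    def record(self, reply: Optional[str], api_error: bool = False) -> None:
        if api_error:
            # Transport/API errors are transient and reset the streak.
            self.parse_failures = 0
            return
        if self._parse(reply) is None:
            self.parse_failures += 1
            if self.parse_failures >= MAX_PARSE_FAILURES:
                self.paused = True
        else:
            self.parse_failures = 0

    @staticmethod
    def _parse(reply: Optional[str]) -> Optional[dict]:
        """Return the verdict dict, or None for empty, prose, or malformed replies."""
        try:
            verdict = json.loads(reply or "")
        except json.JSONDecodeError:
            return None
        if isinstance(verdict, dict) and "done" in verdict:
            return verdict
        return None
```

Only three unparseable replies in a row trigger the pause; a valid verdict or an API error in between restarts the count.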

Also fixes test_goal_verdict_send.py sharing a hardcoded session_id
across tests — the shared id only worked because the previous
_post_turn_goal_continuation was a never-awaited coroutine. Now that
PR #19160 made it properly awaited, the xdist test-leakage bug
surfaced. Each test gets a unique session_id via uuid suffix.
2026-05-07 17:33:09 -07:00
JC 03ddff8897 fix(gateway): defer goal status notices until after response delivery
Route goal status notices through the platform adapter send API and register post-delivery callbacks so completed-goal notices appear after the final assistant response. Also cancel queued synthetic goal continuations on /goal pause and /goal clear while preserving normal queued user messages.
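
A minimal sketch of the post-delivery callback step, assuming callbacks are stored per session key and may be either plain functions or coroutine functions. `run_post_delivery` is an illustrative name, not the adapter's method:

```python
import asyncio
import inspect
from typing import Any, Callable, Dict


async def run_post_delivery(callbacks: Dict[str, Callable[[], Any]], session_key: str) -> None:
    """Pop and run the callback for a session after the response was delivered."""
    cb = callbacks.pop(session_key, None)
    if not callable(cb):
        return
    try:
        result = cb()
        # Coroutine callbacks return an awaitable; await it so it actually runs.
        if inspect.isawaitable(result):
            await result
    except Exception:
        # A failing notice must never break message delivery.
        pass
```

Checking the return value with `inspect.isawaitable` lets the same hook accept synchronous and asynchronous callbacks without the caller knowing which kind was registered.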
2026-05-07 17:33:09 -07:00
Teknium 7d66d30d77 feat(kanban): add tooltips and docs link across dashboard (#21541)
Makes first-time use of the kanban view self-explanatory. Every control
that wasn't already labelled now has a `title` tooltip describing what
it does, and a `?` icon next to the board switcher opens the kanban
docs page in a new tab.

Coverage:
- BoardSwitcher: board select, + New board button, docs-link icon
  (both compact and full variants)
- BoardToolbar: Search, Tenant, Assignee, Show archived, Nudge
  dispatcher, Refresh
- BulkActionBar: → ready, Complete, Archive, reassign group, Apply,
  Clear
- Column header: hovering the header now surfaces COLUMN_HELP as a
  tooltip in addition to the visible sub-text; column count also
  labelled
- Card: task id, priority badge, tenant badge, assignee/unassigned,
  comment count, link count, age timestamp
- InlineCreate: assignee, priority, parent-task selectors

Addresses community feedback from @CharlieDePew asking for tooltips
and a docs link in the kanban view.

Relevant docs page:
https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
2026-05-07 16:13:27 -07:00
Austin Pickett 7f92e5506e Merge pull request #20942 from NousResearch/austin/fix/personality
fix(tui): preserve session when switching personality
2026-05-07 18:54:29 -04:00
Austin Pickett b0393af38c Merge pull request #20805 from NousResearch/austin-feat-sessions-skills-menu
feat(tui): add /sessions slash command for browsing and resuming previous sessions
2026-05-07 18:54:16 -04:00
Austin Pickett 65c762b2e8 fix(tui): preserve session when switching personality
Previously, /personality in the TUI called _reset_session_agent() which
destroyed the agent, cleared conversation history, and effectively started
a new session. This made personality switching disruptive — users lost
their entire conversation context.

Now /personality updates the agent's ephemeral_system_prompt in-place and
injects a pivot marker into the conversation history. The marker tells
the model to adopt the new persona from that point forward, which is
necessary because LLMs tend to pattern-match their prior responses and
continue the established tone without an explicit signal.

Changes:
- tui_gateway/server.py: Rewrite _apply_personality_to_session to update
  the agent in-place instead of resetting. Inject a user-role pivot
  marker so the model actually switches style mid-conversation.
- ui-tui/src/app/slash/commands/session.ts: Update help text (no longer
  mentions history reset).
- tests/test_tui_gateway_server.py: Update test to verify history is
  preserved, pivot marker is injected, and ephemeral prompt is set.
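
The in-place switch can be sketched as below. The marker wording, the `agent_state` dict, and the function name are assumptions made for illustration; only the shape (keep history, set the ephemeral prompt, append a user-role marker) follows the description above:

```python
from typing import Dict, List

Message = Dict[str, str]


def apply_personality(
    history: List[Message],
    agent_state: Dict[str, str],
    name: str,
    prompt: str,
) -> None:
    """Switch persona without resetting: update the prompt and append a pivot marker."""
    agent_state["ephemeral_system_prompt"] = prompt
    # Hypothetical marker text; it tells the model to change tone from here on.
    history.append({
        "role": "user",
        "content": f"[Personality switched to '{name}'. Adopt it from this point forward.]",
    })
```

The existing messages are left untouched, so the conversation context survives the switch.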
2026-05-06 19:30:46 -04:00
Austin Pickett 09a491464c feat(tui): add /sessions slash command for browsing and resuming previous sessions 2026-05-06 11:58:53 -04:00
107 changed files with 1743 additions and 4345 deletions
+2 -14
@@ -30,27 +30,15 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
## Quick Install
### Linux, macOS, WSL2, Termux
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
### Windows (native, PowerShell)
Run this in PowerShell:
```powershell
irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex
```
The installer handles everything: uv, Python 3.11, Node.js, ripgrep, ffmpeg, **and a portable Git Bash** (MinGit, unpacked to `%LOCALAPPDATA%\hermes\git` — no admin required, completely isolated from any system Git install). Hermes uses this bundled Git Bash to run shell commands.
If you already have Git installed, the installer detects it and uses that instead. Otherwise a ~45MB MinGit download is all you need — it won't touch or interfere with any system Git.
Works on Linux, macOS, WSL2, and Android via Termux. The installer handles the platform-specific setup for you.
> **Android / Termux:** The tested manual path is documented in the [Termux guide](https://hermes-agent.nousresearch.com/docs/getting-started/termux). On Termux, Hermes installs a curated `.[termux]` extra because the full `.[all]` extra currently pulls Android-incompatible voice dependencies.
>
> **Windows:** Native Windows is supported — the PowerShell one-liner above installs everything. If you'd rather use WSL2, the Linux command works there too. Native Windows install lives under `%LOCALAPPDATA%\hermes`; WSL2 installs under `~/.hermes` as on Linux. The only Hermes feature that currently needs WSL2 specifically is the browser-based dashboard chat pane (it uses a POSIX PTY — classic CLI and gateway both run natively).
> **Windows:** Native Windows is not supported. Please install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run the command above.
After installation:
-4
@@ -13,10 +13,6 @@ Usage::
hermes-acp
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import asyncio
import logging
import sys
+36
@@ -2141,6 +2141,20 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
)
elif base_url_host_matches(sync_base_url, "api.kimi.com"):
async_kwargs["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
else:
# Fall back to profile.default_headers for providers that declare
# client-level headers on their ProviderProfile (e.g. attribution
# User-Agent strings). Provider is inferred from the hostname.
try:
from agent.model_metadata import _infer_provider_from_url
from providers import get_provider_profile as _gpf_async
_inferred = _infer_provider_from_url(sync_base_url)
if _inferred:
_ph_async = _gpf_async(_inferred)
if _ph_async and _ph_async.default_headers:
async_kwargs["default_headers"] = dict(_ph_async.default_headers)
except Exception:
pass
return AsyncOpenAI(**async_kwargs), model
@@ -2368,6 +2382,16 @@ def resolve_provider_client(
extra["default_headers"] = copilot_request_headers(
is_agent_turn=True, is_vision=is_vision
)
else:
# Fall back to profile.default_headers for providers that
# declare client-level attribution headers on their profile.
try:
from providers import get_provider_profile as _gpf_custom
_ph_custom = _gpf_custom(provider)
if _ph_custom and _ph_custom.default_headers:
extra["default_headers"] = dict(_ph_custom.default_headers)
except Exception:
pass
client = OpenAI(api_key=custom_key, base_url=_clean_base, **extra)
client = _wrap_if_needed(client, final_model, custom_base, custom_key)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
@@ -2556,6 +2580,18 @@ def resolve_provider_client(
headers.update(copilot_request_headers(
is_agent_turn=True, is_vision=is_vision
))
else:
# Fall back to profile.default_headers for providers that declare
# client-level attribution headers on their profile (e.g. GMI
# User-Agent for traffic identification, Vercel AI Gateway
# Referer/Title for analytics).
try:
from providers import get_provider_profile as _gpf_main
_ph_main = _gpf_main(provider)
if _ph_main and _ph_main.default_headers:
headers.update(_ph_main.default_headers)
except Exception:
pass
client = OpenAI(api_key=api_key, base_url=base_url,
**({"default_headers": headers} if headers else {}))
+1 -1
@@ -1607,7 +1607,7 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
# terminal. The background-thread runner also hides it; this
# belt-and-suspenders path matters when a caller invokes
# run_curator_review(synchronous=True) from the CLI.
with open(os.devnull, "w", encoding="utf-8") as _devnull, \
with open(os.devnull, "w") as _devnull, \
contextlib.redirect_stdout(_devnull), \
contextlib.redirect_stderr(_devnull):
conv_result = review_agent.run_conversation(user_message=prompt)
+3 -3
@@ -754,7 +754,7 @@ def _load_context_cache() -> Dict[str, int]:
if not path.exists():
return {}
try:
with open(path, encoding="utf-8") as f:
with open(path) as f:
data = yaml.safe_load(f) or {}
return data.get("context_lengths", {})
except Exception as e:
@@ -776,7 +776,7 @@ def save_context_length(model: str, base_url: str, length: int) -> None:
path = _get_context_cache_path()
try:
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "w", encoding="utf-8") as f:
with open(path, "w") as f:
yaml.dump({"context_lengths": cache}, f, default_flow_style=False)
logger.info("Cached context length %s -> %s tokens", key, f"{length:,}")
except Exception as e:
@@ -800,7 +800,7 @@ def _invalidate_cached_context_length(model: str, base_url: str) -> None:
path = _get_context_cache_path()
try:
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "w", encoding="utf-8") as f:
with open(path, "w") as f:
yaml.dump({"context_lengths": cache}, f, default_flow_style=False)
except Exception as e:
logger.debug("Failed to invalidate context length cache entry %s: %s", key, e)
+1 -1
@@ -144,7 +144,7 @@ def nous_rate_limit_remaining() -> Optional[float]:
"""
path = _state_path()
try:
with open(path, encoding="utf-8") as f:
with open(path) as f:
state = json.load(f)
reset_at = state.get("reset_at", 0)
remaining = reset_at - time.time()
+1 -1
@@ -617,7 +617,7 @@ def _locked_update_approvals() -> Iterator[Dict[str, Any]]:
save_allowlist(data)
return
with open(lock_path, "a+", encoding="utf-8") as lock_fh:
with open(lock_path, "a+") as lock_fh:
fcntl.flock(lock_fh.fileno(), fcntl.LOCK_EX)
try:
data = load_allowlist()
-4
@@ -20,10 +20,6 @@ Usage:
python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run --distribution=image_gen
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import json
import logging
import os
+24 -90
@@ -9,13 +9,10 @@ Usage:
python cli.py # Start interactive mode with all tools
python cli.py --toolsets web,terminal # Start with specific toolsets
python cli.py --skills hermes-agent-dev,github-auth
python cli.py -q "your question" # Single query mode
python cli.py --list-tools # List available tools and exit
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import logging
import os
import shutil
@@ -731,43 +728,8 @@ def _run_cleanup():
_active_worktree: Optional[Dict[str, str]] = None
def _normalize_git_bash_path(p: Optional[str]) -> Optional[str]:
"""Translate a Git Bash-style path (``/c/Users/...``) to the native
Windows form (``C:\\Users\\...``) that Python's ``subprocess.Popen``
and ``pathlib.Path`` accept.
No-op on non-Windows and for paths that already look native. Git on
native Windows normally emits forward-slash Windows paths
(``C:/Users/...``) which both bash and Python handle, but certain
configurations (Git Bash shells, MSYS2, WSL-mounted repos) surface
``/c/...`` or ``/cygdrive/c/...`` variants.
"""
if not p:
return p
if sys.platform != "win32":
return p
import re as _re
# /c/Users/... or /C/Users/...
m = _re.match(r"^/([a-zA-Z])/(.*)$", p)
if m:
drive, rest = m.group(1), m.group(2)
return f"{drive.upper()}:\\{rest.replace('/', chr(92))}"
# /cygdrive/c/... or /mnt/c/...
m = _re.match(r"^/(?:cygdrive|mnt)/([a-zA-Z])/(.*)$", p)
if m:
drive, rest = m.group(1), m.group(2)
return f"{drive.upper()}:\\{rest.replace('/', chr(92))}"
return p
def _git_repo_root() -> Optional[str]:
"""Return the git repo root for CWD, or None if not in a repo.
Runs through :func:`_normalize_git_bash_path` so callers can pass
the result directly to ``Path``/``subprocess.Popen(cwd=...)`` on
Windows without hitting ``C:\\c\\Users\\...`` style resolution
mistakes.
"""
"""Return the git repo root for CWD, or None if not in a repo."""
import subprocess
try:
result = subprocess.run(
@@ -775,7 +737,7 @@ def _git_repo_root() -> Optional[str]:
capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
return _normalize_git_bash_path(result.stdout.strip())
return result.stdout.strip()
except Exception:
pass
return None
@@ -819,7 +781,7 @@ def _setup_worktree(repo_root: str = None) -> Optional[Dict[str, str]]:
try:
existing = gitignore.read_text() if gitignore.exists() else ""
if _ignore_entry not in existing.splitlines():
with open(gitignore, "a", encoding="utf-8") as f:
with open(gitignore, "a") as f:
if existing and not existing.endswith("\n"):
f.write("\n")
f.write(f"{_ignore_entry}\n")
@@ -870,39 +832,10 @@ def _setup_worktree(repo_root: str = None) -> Optional[Dict[str, str]]:
dst.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(str(src), str(dst))
elif src.is_dir():
# Symlink directories (faster, saves disk). On Windows,
# symlink creation requires Developer Mode or elevation,
# and fails with OSError otherwise — fall back to a
# recursive copy so the worktree is still usable. The
# copy is slower and uses disk, but it doesn't require
# admin and matches the Linux/macOS symlink outcome
# functionally.
# Symlink directories (faster, saves disk)
if not dst.exists():
dst.parent.mkdir(parents=True, exist_ok=True)
try:
os.symlink(str(src_resolved), str(dst))
except (OSError, NotImplementedError) as _sym_err:
if sys.platform == "win32":
logger.info(
".worktreeinclude: symlink failed (%s) — "
"falling back to copytree on Windows.",
_sym_err,
)
try:
shutil.copytree(
str(src_resolved),
str(dst),
symlinks=True,
dirs_exist_ok=False,
)
except Exception as _copy_err:
logger.warning(
".worktreeinclude: copy fallback "
"also failed for %s -> %s: %s",
src, dst, _copy_err,
)
else:
raise
os.symlink(str(src_resolved), str(dst))
except Exception as e:
logger.debug("Error copying .worktreeinclude entries: %s", e)
@@ -2147,7 +2080,7 @@ def save_config_value(key_path: str, value: any) -> bool:
# Load existing config
if config_path.exists():
with open(config_path, 'r', encoding="utf-8") as f:
with open(config_path, 'r') as f:
config = yaml.safe_load(f) or {}
else:
config = {}
@@ -5871,12 +5804,15 @@ class HermesCLI:
self.model = result.new_model
self.provider = result.target_provider
self.requested_provider = result.target_provider
# Always overwrite explicit overrides so stale credentials from the
# previous provider (e.g. Ollama api_key/base_url) don't leak into
# the new provider's credential resolution on the next turn.
self._explicit_api_key = result.api_key
self._explicit_base_url = result.base_url
if result.api_key:
self.api_key = result.api_key
self._explicit_api_key = result.api_key
if result.base_url:
self.base_url = result.base_url
self._explicit_base_url = result.base_url
if result.api_mode:
self.api_mode = result.api_mode
@@ -6094,12 +6030,15 @@ class HermesCLI:
self.model = result.new_model
self.provider = result.target_provider
self.requested_provider = result.target_provider
# Always overwrite explicit overrides so stale credentials from the
# previous provider (e.g. Ollama api_key/base_url) don't leak into
# the new provider's credential resolution on the next turn.
self._explicit_api_key = result.api_key
self._explicit_base_url = result.base_url
if result.api_key:
self.api_key = result.api_key
self._explicit_api_key = result.api_key
if result.base_url:
self.base_url = result.base_url
self._explicit_base_url = result.base_url
if result.api_mode:
self.api_mode = result.api_mode
@@ -9773,7 +9712,7 @@ class HermesCLI:
# Debug: log to file (stdout may be devnull from redirect_stdout)
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a", encoding="utf-8") as _f:
with open(_dbg, "a") as _f:
_f.write(f"{time.strftime('%H:%M:%S')} interrupt fired: msg={str(interrupt_msg)[:60]!r}, "
f"children={len(self.agent._active_children)}, "
f"parent._interrupt={self.agent._interrupt_requested}\n")
@@ -10510,7 +10449,11 @@ class HermesCLI:
# --- /model picker modal ---
if self._model_picker_state:
self._handle_model_picker_selection()
try:
self._handle_model_picker_selection()
except Exception as _exc:
_cprint(f" ✗ Model selection failed: {_exc}")
self._close_model_picker()
event.app.current_buffer.reset()
event.app.invalidate()
return
@@ -10605,7 +10548,7 @@ class HermesCLI:
# Debug: log to file when message enters interrupt queue
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a", encoding="utf-8") as _f:
with open(_dbg, "a") as _f:
_f.write(f"{time.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
f"agent_running={self._agent_running}\n")
except Exception:
@@ -12409,15 +12352,6 @@ def main(
"""
global _active_worktree
# Force UTF-8 stdio on Windows before any banner/print() runs — the
# Rich console prints Unicode box-drawing characters that would
# UnicodeEncodeError on cp1252. No-op on Linux/macOS.
try:
from hermes_cli.stdio import configure_windows_stdio
configure_windows_stdio()
except Exception:
pass
# Signal to terminal_tool that we're in interactive mode
# This enables interactive sudo password prompts with timeout
os.environ["HERMES_INTERACTIVE"] = "1"
+45 -20
@@ -14,7 +14,6 @@ import contextvars
import json
import logging
import os
import shutil
import subprocess
import sys
@@ -361,12 +360,52 @@ def _normalize_deliver_value(deliver) -> str:
return str(deliver)
# Routing intent tokens — resolved at fire time, not create time, so a
# job created before Telegram was wired up will pick up Telegram once it
# comes online. ``all`` expands into the set of connected platforms
# (those with a configured home chat_id) in _expand_routing_tokens.
_ROUTING_TOKENS = frozenset({"all"})
def _expand_routing_tokens(part: str) -> List[str]:
"""Expand a routing-intent token to concrete platform names.
``all`` expands to every platform in ``_iter_home_target_platforms()``
that has a configured home chat_id right now. Unknown / non-token
values pass through unchanged as a single-element list, so the caller
can treat every token uniformly.
"""
token = part.lower()
if token not in _ROUTING_TOKENS:
return [part]
expanded: List[str] = []
for platform_name in _iter_home_target_platforms():
if _get_home_target_chat_id(platform_name):
expanded.append(platform_name)
return expanded
def _resolve_delivery_targets(job: dict) -> List[dict]:
"""Resolve all concrete auto-delivery targets for a cron job (supports comma-separated deliver)."""
"""Resolve all concrete auto-delivery targets for a cron job.
Accepts the legacy comma-separated ``deliver`` string plus the
``all`` routing-intent token, which expands to every platform with
a configured home channel. Tokens may be combined with explicit
targets: ``origin,all`` and ``all,telegram:-100:17`` both work.
Duplicate (platform, chat_id, thread_id) tuples are collapsed by the
existing dedup pass.
"""
deliver = _normalize_deliver_value(job.get("deliver", "local"))
if deliver == "local":
return []
parts = [p.strip() for p in deliver.split(",") if p.strip()]
raw_parts = [p.strip() for p in deliver.split(",") if p.strip()]
# Expand routing intents.
parts: List[str] = []
for raw in raw_parts:
parts.extend(_expand_routing_tokens(raw))
seen = set()
targets = []
for part in parts:
@@ -715,21 +754,7 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
# Resolve bash dynamically so Windows (Git Bash) and Linux/macOS
# all work. On native Windows without Git for Windows installed
# shutil.which returns None — fall back to a clear error rather
# than a FileNotFoundError with a confusing "[WinError 2]"
# traceback.
_bash = shutil.which("bash") or (
"/bin/bash" if os.path.isfile("/bin/bash") else None
)
if _bash is None:
return False, (
f"Cannot run .sh/.bash script {path.name!r}: bash not found on PATH. "
"On Windows, install Git for Windows (which ships Git Bash) "
"or rewrite the script as Python (.py)."
)
argv = [_bash, str(path)]
argv = ["/bin/bash", str(path)]
else:
argv = [sys.executable, str(path)]
@@ -1228,7 +1253,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
import yaml
_cfg_path = str(_get_hermes_home() / "config.yaml")
if os.path.exists(_cfg_path):
with open(_cfg_path, encoding="utf-8") as _f:
with open(_cfg_path) as _f:
_cfg = yaml.safe_load(_f) or {}
_cfg = _expand_env_vars(_cfg)
_model_cfg = _cfg.get("model", {})
@@ -1611,7 +1636,7 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
# Cross-platform file locking: fcntl on Unix, msvcrt on Windows
lock_fd = None
try:
lock_fd = open(lock_file, "w", encoding="utf-8")
lock_fd = open(lock_file, "w")
if fcntl:
fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
elif msvcrt:
@@ -365,7 +365,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
os.makedirs(log_dir, exist_ok=True)
run_ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
self._streaming_path = os.path.join(log_dir, f"samples_{run_ts}.jsonl")
self._streaming_file = open(self._streaming_path, "w", encoding="utf-8")
self._streaming_file = open(self._streaming_path, "w")
self._streaming_lock = __import__("threading").Lock()
print(f" Streaming results to: {self._streaming_path}")
@@ -422,7 +422,7 @@ class YCBenchEvalEnv(HermesAgentBaseEnv):
os.makedirs(log_dir, exist_ok=True)
run_ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
self._streaming_path = os.path.join(log_dir, f"samples_{run_ts}.jsonl")
self._streaming_file = open(self._streaming_path, "w", encoding="utf-8")
self._streaming_file = open(self._streaming_path, "w")
self._streaming_lock = threading.Lock()
print(f"\nYC-Bench eval matrix: {len(self.all_eval_items)} runs")
+3 -1
@@ -3146,7 +3146,9 @@ class BasePlatformAdapter(ABC):
_post_cb = getattr(self, "_post_delivery_callbacks", {}).pop(session_key, None)
if callable(_post_cb):
try:
_post_cb()
_post_result = _post_cb()
if inspect.isawaitable(_post_result):
await _post_result
except Exception:
pass
# Stop typing indicator
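The hunk above lets a post-delivery callback be either a plain function or a coroutine function. A minimal sketch of that pattern follows, assuming a hypothetical helper named `run_callback` (not part of the diff) and swallowing errors the same way the hunk does:

```python
import asyncio
import inspect
from typing import Any, Callable, List


async def run_callback(cb: Callable[[], Any]) -> None:
    """Call cb; if the call produced an awaitable, await it before returning."""
    try:
        result = cb()
        if inspect.isawaitable(result):
            await result
    except Exception:
        # Mirrors the hunk: a failing callback must not break delivery.
        pass


calls: List[str] = []


def sync_cb() -> None:
    calls.append("sync")


async def async_cb() -> None:
    await asyncio.sleep(0)  # yield once to show the coroutine really ran
    calls.append("async")


async def demo() -> None:
    await run_callback(sync_cb)
    await run_callback(async_cb)


asyncio.run(demo())
```

Checking the return value rather than the callable itself also covers sync wrappers that hand back a coroutine or future.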
+2 -2
@@ -744,7 +744,7 @@ class TelegramAdapter(BasePlatformAdapter):
return
import yaml as _yaml
with open(config_path, "r", encoding="utf-8") as f:
with open(config_path, "r") as f:
config = _yaml.safe_load(f) or {}
# Navigate to platforms.telegram.extra.dm_topics
@@ -3516,7 +3516,7 @@ class TelegramAdapter(BasePlatformAdapter):
return
import yaml as _yaml
with open(config_path, "r", encoding="utf-8") as f:
with open(config_path, "r") as f:
config = _yaml.safe_load(f) or {}
dm_topics = (
+5 -15
@@ -21,7 +21,6 @@ import logging
import os
import platform
import re
import shutil
import signal
import subprocess
@@ -178,15 +177,10 @@ def check_whatsapp_requirements() -> bool:
WhatsApp requires a Node.js bridge for most implementations.
"""
# Check for Node.js. Resolve via shutil.which so we respect PATHEXT
# (node.exe vs node) and get a meaningful "not installed" signal
# instead of spawning a cmd flash on Windows.
_node = shutil.which("node")
if not _node:
return False
# Check for Node.js
try:
result = subprocess.run(
[_node, "--version"],
["node", "--version"],
capture_output=True,
text=True,
timeout=5
@@ -470,13 +464,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
print(f"[{self.name}] Installing WhatsApp bridge dependencies...")
# Resolve npm path so Windows can execute the .cmd shim.
# shutil.which honours PATHEXT; on POSIX it returns the
# plain executable path.
_npm_bin = shutil.which("npm") or "npm"
try:
install_result = subprocess.run(
[_npm_bin, "install", "--silent"],
["npm", "install", "--silent"],
cwd=str(bridge_dir),
capture_output=True,
text=True,
@@ -526,7 +516,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
# messages are preserved for troubleshooting.
whatsapp_mode = os.getenv("WHATSAPP_MODE", "self-chat")
self._bridge_log = self._session_path.parent / "bridge.log"
bridge_log_fh = open(self._bridge_log, "a", encoding="utf-8")
bridge_log_fh = open(self._bridge_log, "a")
self._bridge_log_fh = bridge_log_fh
# Build bridge subprocess environment.
@@ -1170,7 +1160,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
if file_size > MAX_TEXT_INJECT_BYTES:
print(f"[{self.name}] Skipping text injection for {doc_path} ({file_size} bytes > {MAX_TEXT_INJECT_BYTES})", flush=True)
continue
content = Path(doc_path).read_text(encoding="utf-8", errors="replace")
content = Path(doc_path).read_text(errors="replace")
fname = Path(doc_path).name
# Remove the doc_<hex>_ prefix for display
display_name = fname
+168 -158
@@ -13,10 +13,6 @@ Usage:
python cli.py --gateway
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import asyncio
import dataclasses
import inspect
@@ -1907,6 +1903,59 @@ class GatewayRunner:
depth += 1
return depth
@staticmethod
def _is_goal_continuation_event(event_or_text: Any) -> bool:
"""Return True for synthetic /goal continuation turns.
Goal continuations are normal queued user-role events, so pause/clear
must distinguish them from real user /queue messages before removing or
suppressing them.
"""
text = getattr(event_or_text, "text", event_or_text) or ""
return str(text).startswith("[Continuing toward your standing goal]\nGoal:")
def _clear_goal_pending_continuations(self, session_key: str, adapter: Any) -> int:
"""Remove queued synthetic /goal continuations for one session.
User-issued /goal pause/clear can race with a continuation already
queued by the judge. Remove only synthetic goal continuations while
preserving normal /queue and user follow-up events.
"""
removed = 0
pending_slot = getattr(adapter, "_pending_messages", None) if adapter is not None else None
if isinstance(pending_slot, dict):
pending_event = pending_slot.get(session_key)
if self._is_goal_continuation_event(pending_event):
pending_slot.pop(session_key, None)
removed += 1
queued_events = getattr(self, "_queued_events", None)
if isinstance(queued_events, dict):
overflow = queued_events.get(session_key) or []
if overflow:
kept = []
for queued_event in overflow:
if self._is_goal_continuation_event(queued_event):
removed += 1
else:
kept.append(queued_event)
if kept:
queued_events[session_key] = kept
else:
queued_events.pop(session_key, None)
return removed
def _goal_still_active_for_session(self, session_id: str) -> bool:
"""Best-effort fresh DB check before running a queued continuation."""
if not session_id:
return False
try:
from hermes_cli.goals import GoalManager
return GoalManager(session_id=session_id).is_active()
except Exception as exc:
logger.debug("goal continuation: active-state recheck failed: %s", exc)
return False
def _update_runtime_status(self, gateway_state: Optional[str] = None, exit_reason: Optional[str] = None) -> None:
try:
from gateway.status import write_runtime_status
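The cleanup added in this hunk removes only synthetic goal continuations and keeps real user events. A simplified sketch of that filtering, assuming a stand-in `Event` dataclass and a plain dict queue (both hypothetical; the real gateway types are not shown in this diff):

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Prefix taken from the hunk above.
GOAL_PREFIX = "[Continuing toward your standing goal]\nGoal:"


@dataclass
class Event:  # hypothetical stand-in for the gateway's event type
    text: str


def is_goal_continuation(event_or_text: Any) -> bool:
    """Accept an event object, a raw string, or None."""
    text = getattr(event_or_text, "text", event_or_text) or ""
    return str(text).startswith(GOAL_PREFIX)


def clear_goal_continuations(queued: Dict[str, List[Event]], session_key: str) -> int:
    """Drop synthetic continuations for one session; return how many were removed."""
    overflow = queued.get(session_key) or []
    kept = [e for e in overflow if not is_goal_continuation(e)]
    removed = len(overflow) - len(kept)
    if kept:
        queued[session_key] = kept
    else:
        queued.pop(session_key, None)  # no empty lists left behind
    return removed


queue = {
    "s1": [Event(GOAL_PREFIX + " ship it"), Event("user follow-up")],
    "s2": [Event(GOAL_PREFIX + " refactor")],
}
removed_s1 = clear_goal_continuations(queue, "s1")
removed_s2 = clear_goal_continuations(queue, "s2")
```

Matching on the exact synthetic prefix is what lets `/goal pause` and `/goal clear` leave ordinary `/queue` messages untouched.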
@@ -2788,48 +2837,6 @@ class GatewayRunner:
return
current_pid = os.getpid()
# On Windows there's no bash/setsid chain — spawn a tiny Python
# watcher directly via sys.executable instead. The watcher polls
# current_pid, waits for our exit, then runs `hermes gateway
# restart` with detach flags so the respawn survives the CLI
# that triggered the /restart command closing its console.
if sys.platform == "win32":
import textwrap
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
cmd_argv = [*hermes_cmd, "gateway", "restart"]
watcher = textwrap.dedent(
"""
import os, subprocess, sys, time
pid = int(sys.argv[1])
cmd = sys.argv[2:]
deadline = time.monotonic() + 120
while time.monotonic() < deadline:
try:
os.kill(pid, 0)
except (ProcessLookupError, PermissionError, OSError):
break
time.sleep(0.2)
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000
subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
creationflags=_CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW,
)
"""
).strip()
subprocess.Popen(
[sys.executable, "-c", watcher, str(current_pid), *cmd_argv],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
**windows_detach_popen_kwargs(),
)
return
cmd = " ".join(shlex.quote(part) for part in hermes_cmd)
shell_cmd = (
f"while kill -0 {current_pid} 2>/dev/null; do sleep 0.2; done; "
@@ -5882,7 +5889,7 @@ class GatewayRunner:
except Exception:
session_entry = None
if session_entry is not None:
self._post_turn_goal_continuation(
await self._post_turn_goal_continuation(
session_entry=session_entry,
source=source,
final_response=_final_text,
@@ -8450,6 +8457,13 @@ class GatewayRunner:
state = mgr.pause(reason="user-paused")
if state is None:
return "No goal set."
try:
adapter = self.adapters.get(event.source.platform) if event.source else None
_quick_key = self._session_key_for_source(event.source) if event.source else None
if adapter and _quick_key:
self._clear_goal_pending_continuations(_quick_key, adapter)
except Exception as exc:
logger.debug("goal pause: pending continuation cleanup failed: %s", exc)
return f"⏸ Goal paused: {state.goal}"
if lower == "resume":
@@ -8464,6 +8478,13 @@ class GatewayRunner:
if lower in ("clear", "stop", "done"):
had = mgr.has_goal()
mgr.clear()
try:
adapter = self.adapters.get(event.source.platform) if event.source else None
_quick_key = self._session_key_for_source(event.source) if event.source else None
if adapter and _quick_key:
self._clear_goal_pending_continuations(_quick_key, adapter)
except Exception as exc:
logger.debug("goal clear: pending continuation cleanup failed: %s", exc)
return t("gateway.goal_cleared") if had else t("gateway.no_active_goal")
# Otherwise — treat the remaining text as the new goal.
@@ -8495,7 +8516,69 @@ class GatewayRunner:
"Controls: /goal status · /goal pause · /goal resume · /goal clear"
)
def _post_turn_goal_continuation(
async def _send_goal_status_notice(self, source: Any, message: str) -> None:
"""Send a /goal judge status line back to the originating chat/thread."""
adapter = self.adapters.get(source.platform)
if not adapter:
logger.debug("goal continuation: no adapter for %s", getattr(source, "platform", None))
return
try:
metadata = self._thread_metadata_for_source(source)
except Exception:
metadata = {"thread_id": source.thread_id} if getattr(source, "thread_id", None) else None
result = await adapter.send(source.chat_id, message, metadata=metadata)
if result is not None and not getattr(result, "success", True):
logger.warning(
"goal continuation: status send failed: %s",
getattr(result, "error", "unknown error"),
)
async def _defer_goal_status_notice_after_delivery(self, source: Any, message: str) -> None:
"""Send a /goal status line after the main response is delivered.
The gateway message handler returns the agent response to the platform
adapter, which sends it after this method's caller has returned. For a
natural Discord/Telegram reading order, goal status belongs after that
send. Platform adapters provide a one-shot post-delivery callback for
exactly this boundary; when unavailable, fall back to direct awaited
delivery rather than silently dropping the notice.
"""
adapter = self.adapters.get(source.platform)
if not adapter:
logger.debug("goal continuation: no adapter for %s", getattr(source, "platform", None))
return
async def _deliver() -> None:
try:
await self._send_goal_status_notice(source, message)
except Exception as exc:
logger.warning("goal continuation: status send failed: %s", exc, exc_info=True)
try:
session_key = self._session_key_for_source(source)
except Exception:
session_key = None
if session_key and hasattr(adapter, "register_post_delivery_callback"):
try:
generation = None
active = getattr(adapter, "_active_sessions", {}).get(session_key)
if active is not None:
generation = getattr(active, "_hermes_run_generation", None)
adapter.register_post_delivery_callback(
session_key,
_deliver,
generation=generation,
)
return
except Exception as exc:
logger.debug("goal continuation: post-delivery callback registration failed: %s", exc)
await _deliver()
async def _post_turn_goal_continuation(
self,
*,
session_entry: Any,
@@ -8531,38 +8614,14 @@ class GatewayRunner:
decision = mgr.evaluate_after_turn(final_response or "", user_initiated=True)
msg = decision.get("message") or ""
# Send the status line back to the user so they see the judge's
# verdict. Fire-and-forget via the adapter's ``send()`` method —
# adapters expose ``send(chat_id, content, reply_to, metadata)``,
# not a ``send_message(source, msg)`` wrapper, so an earlier
# ``hasattr(adapter, "send_message")`` gate here was dead code and
# users never saw ``✓ Goal achieved`` / ``⏸ budget exhausted``
# verdicts.
# Defer the status line until after the adapter has delivered the
# agent's visible final response. The judge runs after the response is
# produced but before BasePlatformAdapter sends it, so sending here
# would show "✓ Goal achieved" before the answer itself. Registering
# an awaited post-delivery callback preserves delivery reliability
# without reversing the user-visible ordering.
if msg and source is not None:
try:
adapter = self.adapters.get(source.platform)
if adapter is not None and hasattr(adapter, "send"):
import asyncio as _asyncio
thread_meta = (
{"thread_id": source.thread_id} if source.thread_id else None
)
coro = adapter.send(
chat_id=source.chat_id,
content=msg,
metadata=thread_meta,
)
if _asyncio.iscoroutine(coro):
try:
loop = _asyncio.get_running_loop()
loop.create_task(coro)
except RuntimeError:
# No running loop in this thread — best effort.
try:
_asyncio.run(coro)
except Exception:
pass
except Exception as exc:
logger.debug("goal continuation: status send failed: %s", exc)
await self._defer_goal_status_notice_after_delivery(source, msg)
if not decision.get("should_continue"):
return
@@ -11351,78 +11410,30 @@ class GatewayRunner:
# where systemd-run --user fails due to missing D-Bus session).
# PYTHONUNBUFFERED ensures output is flushed line-by-line so the
# gateway can stream it to the messenger in near-real-time.
# Spawn `hermes update --gateway` detached so it survives gateway restart.
# --gateway enables file-based IPC for interactive prompts (stash
# restore, config migration) so the gateway can forward them to the
# user instead of silently skipping them.
# Use setsid for portable session detach (works under system services
# where systemd-run --user fails due to missing D-Bus session).
# PYTHONUNBUFFERED ensures output is flushed line-by-line so the
# gateway can stream it to the messenger in near-real-time.
#
# Windows: no bash/setsid chain. Run `hermes update --gateway`
# directly via sys.executable; redirect stdout/stderr to the same
# output files via Popen file handles; write the exit code in a
# follow-up write. A tiny Python watcher would be cleaner but
# we're already inside gateway/run.py's update path which is async,
# so the simplest correct thing is: launch an inline Python helper
# that runs the command and writes both outputs.
hermes_cmd_str = " ".join(shlex.quote(part) for part in hermes_cmd)
update_cmd = (
f"PYTHONUNBUFFERED=1 {hermes_cmd_str} update --gateway"
f" > {shlex.quote(str(output_path))} 2>&1; "
f"status=$?; printf '%s' \"$status\" > {shlex.quote(str(exit_code_path))}"
)
try:
if sys.platform == "win32":
import textwrap
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
# hermes_cmd is a list of argv parts we can pass directly
# (no shell-quoting needed).
helper = textwrap.dedent(
"""
import os, subprocess, sys
output_path = sys.argv[1]
exit_code_path = sys.argv[2]
cmd = sys.argv[3:]
env = dict(os.environ)
env["PYTHONUNBUFFERED"] = "1"
with open(output_path, "wb") as f:
proc = subprocess.Popen(cmd, stdout=f, stderr=subprocess.STDOUT, env=env)
rc = proc.wait()
with open(exit_code_path, "w") as f:
f.write(str(rc))
"""
).strip()
setsid_bin = shutil.which("setsid")
if setsid_bin:
# Preferred: setsid creates a new session, fully detached
subprocess.Popen(
[
sys.executable, "-c", helper,
str(output_path), str(exit_code_path),
*hermes_cmd, "update", "--gateway",
],
[setsid_bin, "bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
**windows_detach_popen_kwargs(),
start_new_session=True,
)
else:
hermes_cmd_str = " ".join(shlex.quote(part) for part in hermes_cmd)
update_cmd = (
f"PYTHONUNBUFFERED=1 {hermes_cmd_str} update --gateway"
f" > {shlex.quote(str(output_path))} 2>&1; "
f"status=$?; printf '%s' \"$status\" > {shlex.quote(str(exit_code_path))}"
# Fallback: start_new_session=True calls os.setsid() in child
subprocess.Popen(
["bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
setsid_bin = shutil.which("setsid")
if setsid_bin:
# Preferred: setsid creates a new session, fully detached
subprocess.Popen(
[setsid_bin, "bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
else:
# Fallback: start_new_session=True calls os.setsid() in child
subprocess.Popen(
["bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
except Exception as e:
pending_path.unlink(missing_ok=True)
exit_code_path.unlink(missing_ok=True)
@@ -14862,14 +14873,18 @@ class GatewayRunner:
)
if callable(_bg_cb):
try:
_bg_cb()
_bg_result = _bg_cb()
if inspect.isawaitable(_bg_result):
await _bg_result
except Exception:
pass
elif adapter and hasattr(adapter, "_post_delivery_callbacks"):
_bg_cb = adapter._post_delivery_callbacks.pop(session_key, None)
if callable(_bg_cb):
try:
_bg_cb()
_bg_result = _bg_cb()
if inspect.isawaitable(_bg_result):
await _bg_result
except Exception:
pass
# else: interrupted — discard the interrupted response ("Operation
@@ -14883,6 +14898,12 @@ class GatewayRunner:
next_channel_prompt = None
if pending_event is not None:
next_source = getattr(pending_event, "source", None) or source
if self._is_goal_continuation_event(pending_event) and not self._goal_still_active_for_session(session_id):
logger.info(
"Discarding stale goal continuation for session %s — goal is no longer active",
session_key or "?",
)
return result
next_message = await self._prepare_inbound_message_text(
event=pending_event,
source=next_source,
@@ -15194,10 +15215,7 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
try:
os.kill(existing_pid, 0)
time.sleep(0.5)
except (ProcessLookupError, PermissionError, OSError):
# OSError covers Windows' WinError 87 "invalid parameter"
# for an already-gone PID — without this the probe loop
# busy-spins for the full 10s on every --replace start.
except (ProcessLookupError, PermissionError):
break # Process is gone
else:
# Still alive after 10s — force kill
@@ -15482,14 +15500,6 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
def main():
"""CLI entry point for the gateway."""
# Force UTF-8 stdio on Windows — gateway logs and startup banner would
# otherwise UnicodeEncodeError on cp1252 consoles. No-op on POSIX.
try:
from hermes_cli.stdio import configure_windows_stdio
configure_windows_stdio()
except Exception:
pass
import argparse
parser = argparse.ArgumentParser(description="Hermes Gateway - Multi-platform messaging")
+3 -3
@@ -113,7 +113,7 @@ def _get_process_start_time(pid: int) -> Optional[int]:
stat_path = Path(f"/proc/{pid}/stat")
try:
# Field 22 in /proc/<pid>/stat is process start time (clock ticks).
return int(stat_path.read_text(encoding="utf-8").split()[21])
return int(stat_path.read_text().split()[21])
except (FileNotFoundError, IndexError, PermissionError, ValueError, OSError):
return None
@@ -197,7 +197,7 @@ def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
if not path.exists():
return None
try:
raw = path.read_text(encoding="utf-8").strip()
raw = path.read_text().strip()
except OSError:
return None
if not raw:
@@ -523,7 +523,7 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
try:
_proc_status = Path(f"/proc/{existing_pid}/status")
if _proc_status.exists():
for _line in _proc_status.read_text(encoding="utf-8").splitlines():
for _line in _proc_status.read_text().splitlines():
if _line.startswith("State:"):
_state = _line.split()[1]
if _state in ("T", "t"): # stopped or tracing stop
-129
@@ -1,129 +0,0 @@
"""Windows UTF-8 bootstrap for Hermes entry points.
Python on Windows has two long-standing text-encoding footguns:
1. ``sys.stdout`` / ``sys.stderr`` are bound to the console code page
(``cp1252`` on US-locale installs), so ``print("café")`` crashes with
``UnicodeEncodeError: 'charmap' codec can't encode character``.
2. Child processes spawned via ``subprocess`` don't know to use UTF-8
unless ``PYTHONUTF8`` and/or ``PYTHONIOENCODING`` are set in their
environment, so any Python subprocess (the execute_code sandbox,
delegation children, linter subprocesses, etc.) inherits the same
cp1252 defaults and hits the same UnicodeEncodeError.
This module fixes both on Windows *only*; POSIX is untouched. It
should be imported at the very top of every Hermes entry point
(``hermes``, ``hermes-agent``, ``hermes-acp``, ``python -m gateway.run``,
``batch_runner.py``, ``cron/scheduler.py``) before any other imports
that might do file I/O or print to stdout.
What this module does on Windows:
- Sets ``os.environ["PYTHONUTF8"] = "1"`` (PEP 540 UTF-8 mode) so
every child process we spawn uses UTF-8 for ``open()`` and stdio.
- Sets ``os.environ["PYTHONIOENCODING"] = "utf-8"`` for belt-and-suspenders:
some tools read this instead of / in addition to
``PYTHONUTF8``.
- Reconfigures ``sys.stdout`` / ``sys.stderr`` to UTF-8 in the current
process, using the ``reconfigure()`` API (Python 3.7+). This fixes
``print("café")`` in the parent without a re-exec.
What this module does NOT do:
- It does not re-exec Python with ``-X utf8``, so ``open()`` calls in
the *current* process still default to locale encoding. Those need
an explicit ``encoding="utf-8"`` at the call site (lint rule
``PLW1514`` / ``PYI058``). Ruff is the right tool for that sweep.
What this module does on POSIX:
- Nothing. POSIX systems are already UTF-8 by default in 99% of cases,
and we don't want to touch ``LANG``/``LC_*`` behavior that users may
have configured intentionally. If someone hits a C/POSIX locale on
Linux, they can export ``PYTHONUTF8=1`` themselves; we won't override.
Idempotent: safe to call multiple times. ``_bootstrap_applied`` guards
against double-reconfigure.
"""
from __future__ import annotations
import os
import sys
_IS_WINDOWS = sys.platform == "win32"
_bootstrap_applied = False
def apply_windows_utf8_bootstrap() -> bool:
"""Apply the Windows UTF-8 bootstrap if we're on Windows.
Returns True if bootstrap was applied (i.e. we're on Windows and
haven't already done this), False otherwise. The return value is
advisory; callers normally don't need it, but tests may want to
assert the path was taken.
Idempotent: subsequent calls after the first are a no-op.
"""
global _bootstrap_applied
if not _IS_WINDOWS:
return False
if _bootstrap_applied:
return False
# 1. Child processes inherit these and run in UTF-8 mode.
# We use setdefault() rather than overwriting so the user can
# explicitly opt out by setting PYTHONUTF8=0 in their environment
# (or PYTHONIOENCODING=something-else) if they really want to.
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
# 2. Reconfigure the current process's stdio to UTF-8. Needed
# because os.environ changes don't retroactively rebind sys.stdout
# — those were bound at interpreter startup based on the console
# code page. ``reconfigure`` is a TextIOWrapper method since 3.7.
#
# errors="replace" means that if we ever *read* something from
# stdin that isn't UTF-8 (unlikely but possible with piped input
# from legacy tools), we'll get U+FFFD replacement chars rather
# than a crash. Output is pure UTF-8.
for stream_name in ("stdout", "stderr"):
stream = getattr(sys, stream_name, None)
if stream is None:
continue
reconfigure = getattr(stream, "reconfigure", None)
if reconfigure is None:
# Not a TextIOWrapper (could be redirected to a BytesIO in
# tests, or a non-standard stream in some embedded cases).
# Skip silently — the env-var fix is still in effect for
# child processes, which is the bigger win.
continue
try:
reconfigure(encoding="utf-8", errors="replace")
except (OSError, ValueError):
# Already closed, or someone replaced it with something
# non-reconfigurable. Non-fatal.
pass
# stdin is reconfigured separately with errors="replace" too — input
# from a legacy pipe shouldn't crash the process.
stdin = getattr(sys, "stdin", None)
if stdin is not None:
reconfigure = getattr(stdin, "reconfigure", None)
if reconfigure is not None:
try:
reconfigure(encoding="utf-8", errors="replace")
except (OSError, ValueError):
pass
_bootstrap_applied = True
return True
# Apply on import — entry points just need ``import hermes_bootstrap``
# (or ``from hermes_bootstrap import apply_windows_utf8_bootstrap``) at
# the very top of their module, before importing anything else. The
# import side effect does the right thing.
apply_windows_utf8_bootstrap()
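The module docstring notes that `open()` in the current process still defaults to the locale encoding, so call sites need an explicit `encoding="utf-8"`. A small sketch of what that looks like at a call site; the file name and sample text are illustrative only:

```python
import tempfile
from pathlib import Path

text = "café ✓"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "note.txt"  # hypothetical file, for illustration
    # Explicit encoding: the bytes on disk do not depend on the console
    # code page or locale of the machine that runs this.
    path.write_text(text, encoding="utf-8")
    raw = path.read_bytes()
    round_trip = path.read_text(encoding="utf-8")
```

Without the `encoding` argument, the same calls on a cp1252 Windows install could fail to encode the check mark or decode the file differently.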
-175
@@ -1,175 +0,0 @@
"""Windows subprocess compatibility helpers.
Hermes is developed on Linux / macOS and tested natively on Windows too.
Several common subprocess patterns break silently-or-loudly on Windows:
* ``["npm", "install", ...]``: on Windows ``npm`` is ``npm.cmd``, a batch
shim. ``subprocess.Popen(["npm", ...])`` fails with WinError 193
("not a valid Win32 application") because CreateProcessW can't run a
``.cmd`` file without ``shell=True`` or PATHEXT resolution.
* ``start_new_session=True``: on POSIX, this maps to ``os.setsid()`` and
actually detaches the child. On Windows it's silently ignored; the
Windows equivalent is ``CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS``
creationflags, which Python only applies when you pass them explicitly.
* Console-window flashes: every ``subprocess.Popen`` of a ``.exe`` on
Windows spawns a cmd window briefly unless ``CREATE_NO_WINDOW`` is
passed. Cosmetic but jarring for background daemons.
This module centralizes the platform-branching logic so the rest of the
codebase doesn't sprinkle ``if sys.platform == "win32":`` everywhere.
**All helpers are no-ops on non-Windows**; calling them in Linux/macOS
code paths is safe by design. That's the "do no damage on POSIX"
guarantee.
"""
from __future__ import annotations
import os
import shutil
import subprocess
import sys
from typing import Optional, Sequence
__all__ = [
"IS_WINDOWS",
"resolve_node_command",
"windows_detach_flags",
"windows_hide_flags",
"windows_detach_popen_kwargs",
]
IS_WINDOWS = sys.platform == "win32"
# -----------------------------------------------------------------------------
# Node ecosystem launcher resolution
# -----------------------------------------------------------------------------
def resolve_node_command(name: str, argv: Sequence[str]) -> list[str]:
"""Resolve a Node-ecosystem command name to an absolute-path argv.
On Windows, commands like ``npm``, ``npx``, ``yarn``, ``pnpm``,
``playwright``, ``prettier`` ship as ``.cmd`` files (batch shims).
``subprocess.Popen(["npm", "install"])`` fails with WinError 193
because CreateProcessW doesn't execute batch files directly.
``shutil.which(name)`` *does* resolve ``.cmd`` via PATHEXT and returns
the fully-qualified path, which CreateProcessW accepts because the
extension tells Windows to route through ``cmd.exe /c``.
On POSIX ``shutil.which`` also returns a fully-qualified path when
found. That's a small change from bare-name resolution (the OS does
its own PATH search) but functionally identical and has the side
benefit of making the argv reproducible in logs.
Behavior when the command is not on PATH:
- On Windows: return the bare name; the caller can still try with
``shell=True`` as a last resort, OR the subsequent Popen will
raise FileNotFoundError with a readable error we want to surface.
- On POSIX: same. Bare ``npm`` on a Linux box without npm installed
fails the same way it did before this function existed.
Args:
name: The command name to resolve (e.g. ``npm``, ``npx``, ``node``).
argv: The remaining arguments. Must NOT include ``name`` itself;
this function builds the full argv list.
Returns:
A list suitable for passing to subprocess.Popen/run/call.
"""
resolved = shutil.which(name)
if resolved:
return [resolved, *argv]
return [name, *argv]
# -----------------------------------------------------------------------------
# Detached / hidden process creation
# -----------------------------------------------------------------------------
# Win32 CreationFlags — defined here rather than imported from subprocess
# because CREATE_NO_WINDOW and DETACHED_PROCESS aren't guaranteed to be
# present on stdlib subprocess on older Pythons or non-Windows builds.
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000
def windows_detach_flags() -> int:
"""Return Win32 creationflags that detach a child from the parent
console and process group. 0 on non-Windows.
Pair with ``start_new_session=False`` (default) when calling
subprocess.Popen; on POSIX use ``start_new_session=True`` instead,
which maps to ``os.setsid()`` in the child.
Rationale:
- ``CREATE_NEW_PROCESS_GROUP``: child has its own process group so
Ctrl+C in the parent console doesn't propagate.
- ``DETACHED_PROCESS``: child has no console at all. Necessary for
background daemons (gateway watchers, update respawners) because
without it, closing the console kills the child.
- ``CREATE_NO_WINDOW``: suppress the brief cmd flash that would
otherwise appear when launching a console app. Redundant with
DETACHED_PROCESS but explicit for clarity.
"""
if not IS_WINDOWS:
return 0
return _CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW
def windows_hide_flags() -> int:
"""Return Win32 creationflags that merely hide the child's console
window without detaching the child. 0 on non-Windows.
Use for short-lived console apps spawned as part of a larger
operation (``taskkill``, ``where``, version probes) where we want no
flash but also want to collect stdout/exit code synchronously.
The key difference from :func:`windows_detach_flags`: NO
``DETACHED_PROCESS``; the child still inherits stdio handles so
``capture_output=True`` works. ``DETACHED_PROCESS`` would sever
stdio and break stdout capture.
"""
if not IS_WINDOWS:
return 0
return _CREATE_NO_WINDOW
def windows_detach_popen_kwargs() -> dict:
"""Return a dict of Popen kwargs that detach a child on Windows and
fall back to the POSIX equivalent (``start_new_session=True``) on
Linux/macOS.
Usage pattern:
.. code-block:: python
subprocess.Popen(
argv,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
stdin=subprocess.DEVNULL,
close_fds=True,
**windows_detach_popen_kwargs(),
)
This replaces the unsafe-on-Windows pattern:
.. code-block:: python
subprocess.Popen(..., start_new_session=True)
which silently fails to detach on Windows (the flag is accepted but
has no effect; the child stays attached to the parent's console
and dies when the console closes).
"""
if IS_WINDOWS:
return {"creationflags": windows_detach_flags()}
return {"start_new_session": True}
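The resolution behaviour described in the `resolve_node_command` docstring can be exercised on its own. The sketch below uses a hypothetical `resolve_command` helper with the same shape and runs the current interpreter instead of a Node tool, so it does not assume `npm` is installed:

```python
import shutil
import subprocess
import sys
from typing import List, Sequence


def resolve_command(name: str, argv: Sequence[str]) -> List[str]:
    """Return [resolved-or-bare name, *argv]; shutil.which honours PATHEXT on Windows."""
    resolved = shutil.which(name)
    return [resolved or name, *argv]


# A command that is not on PATH falls back to the bare name, so the
# eventual subprocess error is the same one callers saw before.
missing = resolve_command("hermes-no-such-tool-12345", ["--version"])

# A command that does resolve can be passed straight to subprocess.
found = resolve_command(sys.executable, ["-c", "print('ok')"])
completed = subprocess.run(found, capture_output=True, text=True, timeout=30)
```

Logging the resolved argv shows which binary actually ran, which is the reproducibility benefit the docstring mentions.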
+1 -1
@@ -3117,10 +3117,10 @@ def _refresh_access_token(
) -> Dict[str, Any]:
response = client.post(
f"{portal_base_url}/api/oauth/token",
headers={"x-nous-refresh-token": refresh_token},
data={
"grant_type": "refresh_token",
"client_id": client_id,
"refresh_token": refresh_token,
},
)
+3 -3
@@ -573,7 +573,7 @@ def create_quick_snapshot(
"total_size": sum(manifest.values()),
"files": manifest,
}
with open(snap_dir / "manifest.json", "w", encoding="utf-8") as f:
with open(snap_dir / "manifest.json", "w") as f:
json.dump(meta, f, indent=2)
# Auto-prune
@@ -599,7 +599,7 @@ def list_quick_snapshots(
manifest_path = d / "manifest.json"
if manifest_path.exists():
try:
with open(manifest_path, encoding="utf-8") as f:
with open(manifest_path) as f:
results.append(json.load(f))
except (json.JSONDecodeError, OSError):
results.append({"id": d.name, "file_count": 0, "total_size": 0})
@@ -629,7 +629,7 @@ def restore_quick_snapshot(
if not manifest_path.exists():
return False
with open(manifest_path, encoding="utf-8") as f:
with open(manifest_path) as f:
meta = json.load(f)
restored = 0
+3
@@ -109,6 +109,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("resume", "Resume a previously-named session", "Session",
args_hint="[name]"),
# Configuration
CommandDef("sessions", "Browse and resume previous sessions", "Session"),
# Configuration
CommandDef("config", "Show current configuration", "Configuration",
cli_only=True),
+109 -105
@@ -21,6 +21,7 @@ import stat
import subprocess
import sys
import tempfile
import threading
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
@@ -42,6 +43,14 @@ _LOAD_CONFIG_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}
# _LOAD_CONFIG_CACHE but for read_raw_config() — used when callers want
# the user's on-disk values without defaults merged in.
_RAW_CONFIG_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}
# Serializes all config read/write paths. libyaml's C extension is not
# thread-safe for concurrent safe_load() on the same file, and multiple
# tool threads (approval.py, browser_tool.py, setup flows) hit
# load_config / read_raw_config / save_config from different threads
# during long agent runs. RLock (not Lock) because save_config internally
# calls read_raw_config. Also covers mutation of the module-level cache
# dicts above.
_CONFIG_LOCK = threading.RLock()
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
# (managed by setup/provider flows directly).
_EXTRA_ENV_KEYS = frozenset({
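The comment on `_CONFIG_LOCK` says an `RLock` is needed because `save_config` calls `read_raw_config` while already holding the lock. A minimal sketch of that re-entrancy, using an in-memory dict and hypothetical `read_raw` / `save` functions in place of the real config file code:

```python
import threading
from typing import Any, Dict

_LOCK = threading.RLock()
_STORE: Dict[str, Any] = {}  # stand-in for the on-disk config


def read_raw() -> Dict[str, Any]:
    with _LOCK:
        return dict(_STORE)  # return a copy; callers may mutate it


def save(updates: Dict[str, Any]) -> Dict[str, Any]:
    with _LOCK:
        # Re-acquires _LOCK on the same thread. A plain threading.Lock
        # would deadlock here; RLock allows the nested acquire.
        merged = read_raw()
        merged.update(updates)
        _STORE.clear()
        _STORE.update(merged)
        return merged


threads = [threading.Thread(target=save, args=({f"k{i}": i},)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final = read_raw()
```

Because the whole read-modify-write runs under the lock, concurrent writers cannot lose each other's keys.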
@@ -212,7 +221,7 @@ def get_container_exec_info() -> Optional[dict]:
try:
info = {}
with open(container_mode_file, "r", encoding="utf-8") as f:
with open(container_mode_file, "r") as f:
for line in f:
line = line.strip()
if "=" in line and not line.startswith("#"):
@@ -297,7 +306,7 @@ def _is_container() -> bool:
return True
# LXC / cgroup-based detection
try:
with open("/proc/1/cgroup", "r", encoding="utf-8") as f:
with open("/proc/1/cgroup", "r") as f:
cgroup_content = f.read()
if "docker" in cgroup_content or "lxc" in cgroup_content or "kubepods" in cgroup_content:
return True
@@ -3452,7 +3461,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if not manifest_file.exists():
continue
try:
with open(manifest_file, encoding="utf-8") as _mf:
with open(manifest_file) as _mf:
manifest = yaml.safe_load(_mf) or {}
except Exception:
manifest = {}
@@ -3941,28 +3950,29 @@ def read_raw_config() -> Dict[str, Any]:
``load_config()``. Returns a deepcopy on every call since some callers
mutate the result before passing to ``save_config()``.
"""
try:
config_path = get_config_path()
st = config_path.stat()
cache_key = (st.st_mtime_ns, st.st_size)
except (FileNotFoundError, OSError):
return {}
with _CONFIG_LOCK:
try:
config_path = get_config_path()
st = config_path.stat()
cache_key = (st.st_mtime_ns, st.st_size)
except (FileNotFoundError, OSError):
return {}
path_key = str(config_path)
cached = _RAW_CONFIG_CACHE.get(path_key)
if cached is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
path_key = str(config_path)
cached = _RAW_CONFIG_CACHE.get(path_key)
if cached is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
return {}
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
return {}
if not isinstance(data, dict):
data = {}
_RAW_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(data))
return data
if not isinstance(data, dict):
data = {}
_RAW_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(data))
return data
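The caching pattern in `read_raw_config` (key on `st_mtime_ns` and `st_size`, store a deep copy, return a deep copy on a hit) can be sketched on its own. This is an illustration under stated assumptions: `read_cached` and `_CACHE` are hypothetical names, and JSON replaces YAML so the example needs only the standard library.

```python
import copy
import json
import os
from typing import Any, Dict, Tuple

# path -> (mtime_ns, size, parsed data); hypothetical stand-in for _RAW_CONFIG_CACHE
_CACHE: Dict[str, Tuple[int, int, Dict[str, Any]]] = {}


def read_cached(path: str) -> Dict[str, Any]:
    try:
        st = os.stat(path)
    except OSError:
        return {}
    key = (st.st_mtime_ns, st.st_size)

    cached = _CACHE.get(path)
    if cached is not None and cached[:2] == key:
        # Hand out a copy so callers can mutate the result freely.
        return copy.deepcopy(cached[2])

    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        return {}
    if not isinstance(data, dict):
        data = {}

    # Store a copy so later mutation of `data` cannot corrupt the cache.
    _CACHE[path] = (key[0], key[1], copy.deepcopy(data))
    return data
```

Copying on both store and hit is what lets callers mutate the returned dict before saving without affecting later reads.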
def load_config() -> Dict[str, Any]:
@@ -3975,46 +3985,47 @@ def load_config() -> Dict[str, Any]:
(which change ``HERMES_HOME`` and therefore ``get_config_path()``)
don't collide.
"""
ensure_hermes_home()
config_path = get_config_path()
path_key = str(config_path)
with _CONFIG_LOCK:
ensure_hermes_home()
config_path = get_config_path()
path_key = str(config_path)
try:
st = config_path.stat()
cache_key: Optional[Tuple[int, int]] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
cache_key = None
cached = _LOAD_CONFIG_CACHE.get(path_key)
if cached is not None and cache_key is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
config = copy.deepcopy(DEFAULT_CONFIG)
if cache_key is not None:
try:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
st = config_path.stat()
cache_key: Optional[Tuple[int, int]] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
cache_key = None
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
cached = _LOAD_CONFIG_CACHE.get(path_key)
if cached is not None and cache_key is not None and cached[:2] == cache_key:
return copy.deepcopy(cached[2])
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
config = copy.deepcopy(DEFAULT_CONFIG)
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[path_key] = copy.deepcopy(expanded)
if cache_key is not None:
_LOAD_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(expanded))
else:
_LOAD_CONFIG_CACHE.pop(path_key, None)
return expanded
if cache_key is not None:
try:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[path_key] = copy.deepcopy(expanded)
if cache_key is not None:
_LOAD_CONFIG_CACHE[path_key] = (cache_key[0], cache_key[1], copy.deepcopy(expanded))
else:
_LOAD_CONFIG_CACHE.pop(path_key, None)
return expanded
_SECURITY_COMMENT = """
@@ -4094,45 +4105,46 @@ _COMMENTED_SECTIONS = """
def save_config(config: Dict[str, Any]):
"""Save configuration to ~/.hermes/config.yaml."""
if is_managed():
managed_error("save configuration")
return
from utils import atomic_yaml_write
with _CONFIG_LOCK:
if is_managed():
managed_error("save configuration")
return
from utils import atomic_yaml_write
ensure_hermes_home()
config_path = get_config_path()
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
ensure_hermes_home()
config_path = get_config_path()
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
parts = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
fb_is_valid = False
if isinstance(fb, list):
fb_is_valid = any(isinstance(e, dict) and e.get("provider") and e.get("model") for e in fb)
elif isinstance(fb, dict):
fb_is_valid = bool(fb.get("provider") and fb.get("model"))
if not fb_is_valid:
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
config_path,
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
extra_content="".join(parts) if parts else None,
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
parts = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
parts.append(_SECURITY_COMMENT)
fb = normalized.get("fallback_model", {})
fb_is_valid = False
if isinstance(fb, list):
fb_is_valid = any(isinstance(e, dict) and e.get("provider") and e.get("model") for e in fb)
elif isinstance(fb, dict):
fb_is_valid = bool(fb.get("provider") and fb.get("model"))
if not fb_is_valid:
parts.append(_FALLBACK_COMMENT)
atomic_yaml_write(
config_path,
normalized,
extra_content="".join(parts) if parts else None,
)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
def load_env() -> Dict[str, str]:
@@ -4696,19 +4708,11 @@ def edit_config():
# Find editor
editor = os.getenv('EDITOR') or os.getenv('VISUAL')
if not editor:
# Try common editors — order is platform-aware so Windows users
# land on a working editor (notepad) even without Git Bash or nano
# installed. On POSIX, prefer nano/vim over code/notepad because
# it's more likely to be present on headless / server systems.
import shutil
import sys as _sys
if _sys.platform == "win32":
candidates = ['notepad', 'code', 'vim', 'vi', 'nano']
else:
candidates = ['nano', 'vim', 'vi', 'code', 'notepad']
for cmd in candidates:
# Try common editors
for cmd in ['nano', 'vim', 'vi', 'code', 'notepad']:
import shutil
if shutil.which(cmd):
editor = cmd
break
+4 -7
@@ -598,7 +598,7 @@ def run_doctor(args):
# Detect stale root-level model keys (known bug source — PR #4329)
try:
import yaml
with open(config_path, encoding="utf-8") as f:
with open(config_path) as f:
raw_config = yaml.safe_load(f) or {}
stale_root_keys = [k for k in ("provider", "base_url") if k in raw_config and isinstance(raw_config[k], str)]
if stale_root_keys:
@@ -1059,8 +1059,7 @@ def run_doctor(args):
check_warn("Node.js not found", "(optional, needed for browser tools)")
# npm audit for all Node.js packages
_npm_bin = _safe_which("npm")
if _npm_bin:
if _safe_which("npm"):
npm_dirs = [
(PROJECT_ROOT, "Browser tools (agent-browser)"),
(PROJECT_ROOT / "scripts" / "whatsapp-bridge", "WhatsApp bridge"),
@@ -1069,10 +1068,8 @@ def run_doctor(args):
if not (npm_dir / "node_modules").exists():
continue
try:
# Use resolved absolute path so Windows can execute
# npm.cmd (CreateProcessW can't run bare .cmd names).
audit_result = subprocess.run(
[_npm_bin, "audit", "--json"],
["npm", "audit", "--json"],
cwd=str(npm_dir),
capture_output=True, text=True, timeout=30,
)
@@ -1399,7 +1396,7 @@ def run_doctor(args):
import yaml as _yaml
_mem_cfg_path = HERMES_HOME / "config.yaml"
if _mem_cfg_path.exists():
with open(_mem_cfg_path, encoding="utf-8") as _f:
with open(_mem_cfg_path) as _f:
_raw_cfg = _yaml.safe_load(_f) or {}
_active_memory_provider = (_raw_cfg.get("memory") or {}).get("provider", "")
except Exception:
+8 -49
@@ -232,10 +232,6 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
# Process still exists but we can't signal it. Treat as alive
# so the caller falls back.
pass
except OSError:
# Windows raises OSError (WinError 87 "invalid parameter") for
# a gone PID — treat the same as ProcessLookupError.
return True
_time.sleep(0.5)
# Drain didn't finish in time.
return False
@@ -445,25 +441,6 @@ def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
if old_pid <= 0:
return False
# The watcher is a tiny Python subprocess that polls the old PID and
# respawns the gateway once it's gone. Both legs of the chain need
# platform-appropriate detach semantics:
#
# POSIX — ``start_new_session=True`` (os.setsid in the child) detaches
# from the parent's process group so Ctrl+C in the CLI doesn't
# propagate and the watcher/gateway survive the CLI exiting.
#
# Windows — ``start_new_session`` is silently accepted but does NOT
# detach. The watcher stays attached to the CLI's console and dies
# when the user closes the terminal, leaving ``hermes update`` users
# with no running gateway until they re-invoke ``hermes gateway``
# manually. The Win32 equivalent is the ``CREATE_NEW_PROCESS_GROUP |
# DETACHED_PROCESS | CREATE_NO_WINDOW`` creationflags bundle.
#
# ``windows_detach_popen_kwargs()`` returns the right kwargs for the
# host platform and is a no-op on POSIX (just ``start_new_session=True``).
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
watcher = textwrap.dedent(
"""
import os
@@ -481,39 +458,22 @@ def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
break
except PermissionError:
pass
except OSError:
# Windows: gone PID raises OSError (WinError 87).
break
time.sleep(0.2)
# Platform-appropriate detach for the respawned gateway. On POSIX
# start_new_session=True maps to os.setsid; on Windows we need
# explicit creationflags because start_new_session is a no-op there.
_popen_kwargs = {
"stdout": subprocess.DEVNULL,
"stderr": subprocess.DEVNULL,
}
if sys.platform == "win32":
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000
_popen_kwargs["creationflags"] = (
_CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW
)
else:
_popen_kwargs["start_new_session"] = True
subprocess.Popen(cmd, **_popen_kwargs)
subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
"""
).strip()
try:
# Same platform-aware detach for the watcher process itself — so
# closing the user's terminal doesn't kill the watcher.
subprocess.Popen(
[sys.executable, "-c", watcher, str(old_pid), *_gateway_run_args_for_profile(profile)],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
**windows_detach_popen_kwargs(),
start_new_session=True,
)
except OSError:
return False
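The comments in this hunk describe platform-specific detach semantics for `subprocess.Popen`. The body of `windows_detach_popen_kwargs()` is not shown in the diff, so the following is only a sketch of what such a helper might return; the function name `detach_popen_kwargs` is hypothetical, and the flag values are the ones spelled out in the watcher script above.

```python
import sys
from typing import Any, Dict

# Win32 process creation flags, as listed in the watcher script.
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000


def detach_popen_kwargs() -> Dict[str, Any]:
    """Return Popen kwargs that detach a child from the launching terminal."""
    if sys.platform == "win32":
        # start_new_session does not detach on Windows; use creationflags.
        return {
            "creationflags": (
                _CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW
            )
        }
    # POSIX: run the child in a new session (setsid) so Ctrl+C in the
    # parent's terminal does not propagate to it.
    return {"start_new_session": True}
```

The result would be splatted into the call, for example `subprocess.Popen(cmd, **detach_popen_kwargs())`.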
@@ -975,8 +935,7 @@ def stop_profile_gateway() -> bool:
try:
os.kill(pid, 0)
_time.sleep(0.5)
except (ProcessLookupError, PermissionError, OSError):
# OSError covers Windows' WinError 87 for gone PIDs.
except (ProcessLookupError, PermissionError):
break
if get_running_pid() is None:
+79 -21
@@ -47,6 +47,14 @@ DEFAULT_MAX_TURNS = 20
DEFAULT_JUDGE_TIMEOUT = 30.0
# Cap how much of the last response + recent messages we send to the judge.
_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
# After this many consecutive judge *parse* failures (empty output / non-JSON),
# the loop auto-pauses and points the user at the goal_judge config. API /
# transport errors do NOT count toward this — those are transient. This guards
# against small models (e.g. deepseek-v4-flash) that cannot follow the strict
# JSON reply contract; without it the loop runs until the turn budget is
# exhausted with every reply shaped like `judge returned empty response` or
# `judge reply was not JSON`.
DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES = 3
CONTINUATION_PROMPT_TEMPLATE = (
@@ -99,6 +107,7 @@ class GoalState:
last_verdict: Optional[str] = None # "done" | "continue" | "skipped"
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
consecutive_parse_failures: int = 0 # judge-output parse failures in a row
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@@ -116,6 +125,7 @@ class GoalState:
last_verdict=data.get("last_verdict"),
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
)
@@ -220,13 +230,17 @@ def _truncate(text: str, limit: int) -> str:
_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
def _parse_judge_response(raw: str) -> Tuple[bool, str]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
def _parse_judge_response(raw: str) -> Tuple[bool, str, bool]:
"""Parse the judge's reply. Fail-open to ``(False, "<reason>", parse_failed)``.
Returns ``(done, reason)``.
Returns ``(done, reason, parse_failed)``. ``parse_failed`` is True when the
judge returned output that couldn't be interpreted as the expected JSON
verdict (empty body, prose, malformed JSON). Callers use that flag to
auto-pause after N consecutive parse failures so a weak judge model
doesn't silently burn the turn budget.
"""
if not raw:
return False, "judge returned empty response"
return False, "judge returned empty response", True
text = raw.strip()
@@ -252,7 +266,7 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str]:
data = None
if not isinstance(data, dict):
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}", True
done_val = data.get("done")
if isinstance(done_val, str):
@@ -262,7 +276,7 @@ def _parse_judge_response(raw: str) -> Tuple[bool, str]:
reason = str(data.get("reason") or "").strip()
if not reason:
reason = "no reason provided"
return done, reason
return done, reason, False
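Parts of `_parse_judge_response` are elided between the hunks above, so the following is a simplified sketch of the three-value contract rather than the actual implementation. The name `parse_verdict` and the set of strings treated as true are assumptions made for illustration.

```python
import json
import re
from typing import Tuple

_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)


def parse_verdict(raw: str) -> Tuple[bool, str, bool]:
    """Return (done, reason, parse_failed) for a judge reply."""
    if not raw:
        return False, "judge returned empty response", True

    text = raw.strip()
    data = None
    try:
        data = json.loads(text)
    except ValueError:
        # Fall back to the first {...} object embedded in prose.
        match = _JSON_OBJECT_RE.search(text)
        if match:
            try:
                data = json.loads(match.group(0))
            except ValueError:
                data = None

    if not isinstance(data, dict):
        return False, "judge reply was not JSON", True

    done_val = data.get("done")
    if isinstance(done_val, str):
        # Assumed coercion rule; the real set of accepted strings is not shown.
        done = done_val.strip().lower() in {"true", "yes", "done", "1"}
    else:
        done = bool(done_val)
    reason = str(data.get("reason") or "").strip() or "no reason provided"
    return done, reason, False
```

Only the empty-reply and non-JSON paths set `parse_failed`; a well-formed verdict of either polarity returns `False` in the third slot.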
def judge_goal(
@@ -270,36 +284,42 @@ def judge_goal(
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
) -> Tuple[str, str]:
) -> Tuple[str, str, bool]:
"""Ask the auxiliary model whether the goal is satisfied.
Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
or ``"skipped"`` (when the judge couldn't be reached).
Returns ``(verdict, reason, parse_failed)`` where verdict is ``"done"``,
``"continue"``, or ``"skipped"`` (when the judge couldn't be reached).
This is deliberately fail-open: any error returns ``("continue", "...")``
so a broken judge doesn't wedge progress — the turn budget is the
backstop.
``parse_failed`` is True only when the judge call succeeded but its output
was unusable (empty or non-JSON). API/transport errors return False because
they are transient and should fail-open silently. Callers use this flag to
auto-pause after N consecutive parse failures (see
``DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES``).
This is deliberately fail-open: any error returns ``("continue", "...", False)``
so a broken judge doesn't wedge progress — the turn budget and the
consecutive-parse-failures auto-pause are the backstops.
"""
if not goal.strip():
return "skipped", "empty goal"
return "skipped", "empty goal", False
if not last_response.strip():
# No substantive reply this turn — almost certainly not done yet.
return "continue", "empty response (nothing to evaluate)"
return "continue", "empty response (nothing to evaluate)", False
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable"
return "continue", "auxiliary client unavailable", False
try:
client, model = get_text_auxiliary_client("goal_judge")
except Exception as exc:
logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
return "continue", "auxiliary client unavailable"
return "continue", "auxiliary client unavailable", False
if client is None or not model:
return "continue", "no auxiliary client configured"
return "continue", "no auxiliary client configured", False
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
@@ -319,17 +339,17 @@ def judge_goal(
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
return "continue", f"judge error: {type(exc).__name__}"
return "continue", f"judge error: {type(exc).__name__}", False
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
done, reason = _parse_judge_response(raw)
done, reason, parse_failed = _parse_judge_response(raw)
verdict = "done" if done else "continue"
logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
return verdict, reason
return verdict, reason, parse_failed
# ──────────────────────────────────────────────────────────────────────
@@ -473,10 +493,18 @@ class GoalManager:
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason = judge_goal(state.goal, last_response)
verdict, reason, parse_failed = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
# Track consecutive judge parse failures. Reset on any usable reply,
# including API / transport errors (parse_failed=False) so a flaky
# network doesn't trip the auto-pause meant for bad judge models.
if parse_failed:
state.consecutive_parse_failures += 1
else:
state.consecutive_parse_failures = 0
if verdict == "done":
state.status = "done"
save_goal(self.session_id, state)
@@ -489,6 +517,36 @@ class GoalManager:
"message": f"✓ Goal achieved: {reason}",
}
# Auto-pause when the judge model can't produce the expected JSON
# verdict N turns in a row. Points the user at the goal_judge config
# so they can route this side task to a model that follows the
# contract (e.g. google/gemini-3-flash-preview). Without this guard,
# weak judge models burn the entire turn budget returning prose or
# empty strings.
if state.consecutive_parse_failures >= DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES:
state.status = "paused"
state.paused_reason = (
f"judge model returned unparseable output {state.consecutive_parse_failures} turns in a row"
)
save_goal(self.session_id, state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — the judge model ({state.consecutive_parse_failures} turns) "
"isn't returning the required JSON verdict. Route the judge to a stricter "
"model in ~/.hermes/config.yaml:\n"
" auxiliary:\n"
" goal_judge:\n"
" provider: openrouter\n"
" model: google/gemini-3-flash-preview\n"
"Then /goal resume to continue."
),
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
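The counter logic added to `GoalManager` can be isolated into a small sketch. `State` and `record_judge_result` are hypothetical names; the threshold mirrors `DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES` from the hunk above.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_PARSE_FAILURES = 3  # mirrors DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES


@dataclass
class State:
    status: str = "active"
    consecutive_parse_failures: int = 0


def record_judge_result(state: State, parse_failed: bool) -> bool:
    """Update the counter; return True when the loop should auto-pause."""
    if parse_failed:
        state.consecutive_parse_failures += 1
    else:
        # Any usable reply (including a transport error, which reports
        # parse_failed=False) resets the streak.
        state.consecutive_parse_failures = 0

    if state.consecutive_parse_failures >= MAX_CONSECUTIVE_PARSE_FAILURES:
        state.status = "paused"
        return True
    return False
```

Because the streak resets on every non-failure, two bad replies followed by one good reply start the count over, so only a persistently non-conforming judge model trips the pause.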
+1 -1
@@ -205,7 +205,7 @@ def _cmd_test(args) -> None:
if getattr(args, "payload_file", None):
try:
custom = json.loads(Path(args.payload_file).read_text(encoding="utf-8"))
custom = json.loads(Path(args.payload_file).read_text())
if isinstance(custom, dict):
payload.update(custom)
else:
+14 -26
@@ -2835,7 +2835,7 @@ def _pid_alive(pid: Optional[int]) -> bool:
# where we have a cheap, deterministic process-state probe.
if sys.platform == "linux":
try:
with open(f"/proc/{int(pid)}/status", "r", encoding="utf-8") as f:
with open(f"/proc/{int(pid)}/status", "r") as f:
for line in f:
if line.startswith("State:"):
# "State:\tZ (zombie)" → dead
@@ -2911,10 +2911,7 @@ def _terminate_reclaimed_worker(
if _pid_alive(pid):
try:
# signal.SIGKILL doesn't exist on Windows; fall back to SIGTERM
# (which maps to TerminateProcess via the stdlib shim).
_sigkill = getattr(signal, "SIGKILL", signal.SIGTERM)
kill(int(pid), _sigkill)
kill(int(pid), signal.SIGKILL)
info["sigkill"] = True
except (ProcessLookupError, OSError):
return info
@@ -3038,9 +3035,7 @@ def enforce_max_runtime(
time.sleep(0.5)
if _pid_alive(pid):
try:
# signal.SIGKILL doesn't exist on Windows.
_sigkill = getattr(signal, "SIGKILL", signal.SIGTERM)
kill(pid, _sigkill)
kill(pid, signal.SIGKILL)
killed = True
except (ProcessLookupError, OSError):
pass
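The lines dropped in the two hunks above used a `getattr` fallback because `signal.SIGKILL` is not defined on Windows. A minimal sketch of that pattern, with `force_kill_signal` as a hypothetical helper name:

```python
import signal


def force_kill_signal() -> int:
    """Pick the strongest termination signal available on this platform."""
    # Referencing signal.SIGKILL directly raises AttributeError on Windows,
    # so fall back to SIGTERM, which exists on every platform.
    return getattr(signal, "SIGKILL", signal.SIGTERM)
```

A caller would then write `os.kill(pid, force_kill_signal())` instead of naming `signal.SIGKILL` directly.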
@@ -3519,24 +3514,17 @@ def dispatch_once(
# cleanly without calling ``kanban_complete`` / ``kanban_block``
# (protocol violation — auto-block) from a real crash (OOM killer,
# SIGKILL, non-zero exit — existing counter behavior).
#
# Windows has no zombies / no os.WNOHANG — subprocess.Popen handles
# are freed when the Python object is garbage-collected or .wait() is
# called explicitly. The kanban dispatcher discards the Popen handle
# after spawn (``_default_spawn`` → abandon), so on Windows there's
# nothing to reap here — skip the whole block.
if os.name != "nt":
try:
while True:
try:
_pid, _status = os.waitpid(-1, os.WNOHANG)
except ChildProcessError:
break
if _pid == 0:
break
_record_worker_exit(_pid, _status)
except Exception:
pass
try:
while True:
try:
_pid, _status = os.waitpid(-1, os.WNOHANG)
except ChildProcessError:
break
if _pid == 0:
break
_record_worker_exit(_pid, _status)
except Exception:
pass
result = DispatchResult()
result.reclaimed = release_stale_claims(conn)
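The reaping loop in this hunk relies on `os.waitpid(-1, os.WNOHANG)`, which is only available on POSIX. A self-contained sketch of the guarded variant, where `reap_children` is a hypothetical name and the list it returns stands in for the calls to `_record_worker_exit`:

```python
import os
from typing import List, Tuple


def reap_children() -> List[Tuple[int, int]]:
    """Collect (pid, status) for every child that has already exited."""
    reaped: List[Tuple[int, int]] = []
    if os.name == "nt":
        # os.WNOHANG does not exist on Windows, so skip reaping there.
        return reaped
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            # This process has no children at all.
            break
        if pid == 0:
            # Children exist, but none has exited yet.
            break
        reaped.append((pid, status))
    return reaped
```

With `WNOHANG` the call never blocks, so the loop drains every exited child and then returns immediately.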
+2 -19
@@ -43,11 +43,6 @@ Usage:
hermes claw migrate --dry-run # Preview migration without changes
"""
# IMPORTANT: hermes_bootstrap must be the very first import — it sets up
# UTF-8 stdio on Windows so print()/subprocess children don't hit
# UnicodeEncodeError with non-ASCII characters. No-op on POSIX.
import hermes_bootstrap # noqa: F401
import argparse
import json
import os
@@ -7970,15 +7965,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
print(
f"{len(_stuck)} gateway process(es) ignored SIGTERM — force-killing"
)
from gateway.status import terminate_pid as _terminate_pid
for pid in _stuck:
try:
# Routes through taskkill /T /F on Windows,
# SIGKILL on POSIX — _signal.SIGKILL doesn't
# exist on Windows so the old raw os.kill call
# used to crash the entire update path.
_terminate_pid(pid, force=True)
except (ProcessLookupError, PermissionError, OSError):
os.kill(pid, _signal.SIGKILL)
except (ProcessLookupError, PermissionError):
pass
# Give the OS a beat to reap the processes so the
# watchers see them exit and respawn.
@@ -8564,13 +8554,6 @@ def _build_provider_choices() -> list[str]:
def main():
"""Main entry point for hermes CLI."""
# Force UTF-8 stdio on Windows before anything prints. No-op elsewhere.
try:
from hermes_cli.stdio import configure_windows_stdio
configure_windows_stdio()
except Exception:
pass
from hermes_cli._parser import build_top_level_parser
parser, subparsers, chat_parser = build_top_level_parser()
+2 -2
@@ -69,7 +69,7 @@ def _install_dependencies(provider_name: str) -> None:
try:
import yaml
with open(yaml_path, encoding="utf-8") as f:
with open(yaml_path) as f:
meta = yaml.safe_load(f) or {}
except Exception:
return
@@ -377,7 +377,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
if key not in updated_keys:
new_lines.append(f"{key}={val}")
env_path.write_text("\n".join(new_lines) + "\n", encoding="utf-8")
env_path.write_text("\n".join(new_lines) + "\n")
# ---------------------------------------------------------------------------
+2 -2
@@ -173,7 +173,7 @@ def _read_disk_cache() -> tuple[dict[str, Any] | None, float]:
except (OSError, FileNotFoundError):
return (None, 0.0)
try:
with open(path, encoding="utf-8") as fh:
with open(path) as fh:
data = json.load(fh)
except (OSError, json.JSONDecodeError):
return (None, 0.0)
@@ -187,7 +187,7 @@ def _write_disk_cache(data: dict[str, Any]) -> None:
try:
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(path.suffix + ".tmp")
with open(tmp, "w", encoding="utf-8") as fh:
with open(tmp, "w") as fh:
json.dump(data, fh, indent=2)
fh.write("\n")
atomic_replace(tmp, path)
+1 -1
@@ -174,7 +174,7 @@ def run_oneshot(
# Redirect stderr AND stdout to devnull for the entire call tree.
# We'll print the final response to the real stdout at the end.
real_stdout = sys.stdout
devnull = open(os.devnull, "w", encoding="utf-8")
devnull = open(os.devnull, "w")
try:
with redirect_stdout(devnull), redirect_stderr(devnull):
+1 -1
@@ -870,7 +870,7 @@ class PluginManager:
if yaml is None:
logger.warning("PyYAML not installed; cannot load %s", manifest_file)
return None
data = yaml.safe_load(manifest_file.read_text(encoding="utf-8")) or {}
data = yaml.safe_load(manifest_file.read_text()) or {}
name = data.get("name", plugin_dir.name)
key = f"{prefix}/{plugin_dir.name}" if prefix else name
+2 -2
@@ -127,7 +127,7 @@ def _read_manifest(plugin_dir: Path) -> dict:
try:
import yaml
with open(manifest_file, encoding="utf-8") as f:
with open(manifest_file) as f:
return yaml.safe_load(f) or {}
except Exception as e:
logger.warning("Failed to read plugin.yaml in %s: %s", plugin_dir, e)
@@ -703,7 +703,7 @@ def _discover_all_plugins() -> list:
description = ""
if yaml:
try:
with open(manifest_file, encoding="utf-8") as f:
with open(manifest_file) as f:
manifest = yaml.safe_load(f) or {}
name = manifest.get("name", d.name)
version = manifest.get("version", "")
+6 -13
@@ -354,7 +354,7 @@ def _read_config_model(profile_dir: Path) -> tuple:
return None, None
try:
import yaml
with open(config_path, "r", encoding="utf-8") as f:
with open(config_path, "r") as f:
cfg = yaml.safe_load(f) or {}
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
@@ -758,6 +758,7 @@ def _cleanup_gateway_service(name: str, profile_dir: Path) -> None:
def _stop_gateway_process(profile_dir: Path) -> None:
"""Stop a running gateway process via its PID file."""
import signal as _signal
import time as _time
pid_file = profile_dir / "gateway.pid"
@@ -768,27 +769,19 @@ def _stop_gateway_process(profile_dir: Path) -> None:
raw = pid_file.read_text().strip()
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
# Route through terminate_pid so Windows uses the appropriate
# primitive (taskkill / TerminateProcess) — raw os.kill with
# _signal.SIGKILL raises AttributeError at import time on Windows,
# and raw os.kill with SIGTERM doesn't cascade to child processes
# the same way taskkill /T does.
from gateway.status import terminate_pid as _terminate_pid
_terminate_pid(pid) # graceful first
os.kill(pid, _signal.SIGTERM)
# Wait up to 10s for graceful shutdown
for _ in range(20):
_time.sleep(0.5)
try:
os.kill(pid, 0)
except (ProcessLookupError, OSError):
# OSError covers Windows' WinError 87 "invalid parameter"
# returned for an invalid/gone PID probe.
except ProcessLookupError:
print(f"✓ Gateway stopped (PID {pid})")
return
# Force kill
try:
_terminate_pid(pid, force=True)
except (ProcessLookupError, OSError):
os.kill(pid, _signal.SIGKILL)
except ProcessLookupError:
pass
print(f"✓ Gateway force-stopped (PID {pid})")
except (ProcessLookupError, PermissionError):
+5 -8
@@ -7,14 +7,11 @@ keystrokes can be fed back in. The only caller today is the
Design constraints:
* **POSIX-only.** This module depends on ``fcntl``, ``termios``, and
``ptyprocess``, none of which exist on native Windows Python. Native
Windows ConPTY is a different API (Windows 10 build 17763+) and would
need a separate Windows implementation (``pywinpty``) that's tracked
as a future enhancement. On native Windows, importing this module
raises :class:`ImportError` and the dashboard's ``/chat`` tab shows a
WSL-recommended banner instead of crashing. Every other feature in the
dashboard (sessions, jobs, metrics, config editor) works natively.
* **POSIX-only.** Hermes Agent supports Windows exclusively via WSL, which
exposes a native POSIX PTY via ``openpty(3)``. Native Windows Python
has no PTY; :class:`PtyUnavailableError` is raised with a user-readable
install/platform message so the dashboard can render a banner instead of
crashing.
* **Zero Node dependency on the server side.** We use :mod:`ptyprocess`,
which is a pure-Python wrapper around the OS calls. The browser talks
to the same ``hermes --tui`` binary it would launch from the CLI, so
+4 -60
@@ -84,34 +84,18 @@ def resolve_hermes_bin() -> Optional[str]:
1. ``sys.argv[0]`` if it resolves to a real executable.
2. ``shutil.which("hermes")`` on PATH.
3. ``None``: caller should fall back to ``python -m hermes_cli.main``.
Windows note: ``os.access(path, os.X_OK)`` returns True for ``.py`` and
``.pyc`` files on Windows (the OS treats anything listed in PATHEXT as
executable, and Python files are often registered there). But
``subprocess.run([script.py, ...])`` can't actually execute a .py
directly: CreateProcessW needs a real .exe, not a script associated
with the Python launcher. On Windows we therefore skip the argv[0]
fast-path when it points at a .py file and fall through to either
``hermes.exe`` on PATH or the ``sys.executable -m hermes_cli.main``
fallback.
"""
argv0 = sys.argv[0]
_is_windows = sys.platform == "win32"
def _is_python_script(p: str) -> bool:
return p.lower().endswith((".py", ".pyc"))
# Absolute path to an executable (covers nix store, venv wrappers, etc.)
if os.path.isabs(argv0) and os.path.isfile(argv0) and os.access(argv0, os.X_OK):
if not (_is_windows and _is_python_script(argv0)):
return argv0
return argv0
# Relative path — resolve against CWD
if not argv0.startswith("-") and os.path.isfile(argv0):
abs_path = os.path.abspath(argv0)
if os.access(abs_path, os.X_OK):
if not (_is_windows and _is_python_script(abs_path)):
return abs_path
return abs_path
# PATH lookup
path_bin = shutil.which("hermes")
@@ -158,48 +142,8 @@ def relaunch(
preserve_inherited: bool = True,
original_argv: Optional[Sequence[str]] = None,
) -> None:
"""Replace the current process with a fresh hermes invocation.
On POSIX we use ``os.execvp`` which replaces the running process with
the new one in place: same PID, no double-fork. That's what the
relaunch contract wants: "run hermes again as if the user had typed
the new argv".
Windows has no native exec semantics: ``os.execvp`` on Windows
*emulates* exec by spawning the child and exiting the parent, but
only works when the target is a real Win32 executable. Our target
is usually ``hermes.exe`` (a Python console-script shim that wraps
``python -m hermes_cli.main``) or a ``.cmd`` batch file, and both
raise ``OSError(8, "Exec format error")`` on Windows' execvp.
The Windows-correct pattern is: spawn the child with ``subprocess.run``
(with ``shell=False`` this calls ``CreateProcessW`` directly, not ``cmd.exe``),
wait for it to exit, then propagate its exit code via ``sys.exit``.
That's functionally equivalent — the user sees "hermes exited, then
new hermes started" — just with two PIDs in play instead of one.
"""
"""Replace the current process with a fresh hermes invocation."""
new_argv = build_relaunch_argv(
extra_args, preserve_inherited=preserve_inherited, original_argv=original_argv
)
if sys.platform == "win32":
# Windows: subprocess + exit, because execvp can't swap to .cmd/.exe shims.
import subprocess
try:
result = subprocess.run(new_argv)
sys.exit(result.returncode)
except KeyboardInterrupt:
sys.exit(130)
except OSError as exc:
# Surface a helpful error rather than the raw OSError — the
# caller used to see ``[Errno 8] Exec format error`` which is
# cryptic. Common causes: ``hermes`` not on PATH yet (install
# hasn't propagated User PATH into this shell) or a stale shim.
print(
f"\nHermes relaunch failed: {exc}\n"
f"Command: {' '.join(new_argv)}\n"
f"Fix: open a new terminal so PATH picks up, then re-run hermes.",
file=sys.stderr,
)
sys.exit(1)
else:
os.execvp(new_argv[0], new_argv)
os.execvp(new_argv[0], new_argv)
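The platform split that this hunk removes can be sketched in isolation. This is a minimal illustration, not the project's code: `relaunch_sketch` and its injectable `platform`, `run`, and `execvp` parameters are hypothetical names added here so the two branches can be exercised without replacing the running process.

```python
import os
import subprocess
import sys
from typing import Callable, Sequence


def relaunch_sketch(
    argv: Sequence[str],
    *,
    platform: str = sys.platform,
    run: Callable = subprocess.run,
    execvp: Callable = os.execvp,
) -> int:
    """Hypothetical helper: re-run ``argv`` and return the exit code to propagate.

    A real ``os.execvp`` never returns on success, so the return value only
    matters on the Windows branch (or when ``execvp`` is a test double).
    """
    if platform == "win32":
        try:
            # Spawn the child, wait for it, and hand back its exit code.
            return run(list(argv)).returncode
        except KeyboardInterrupt:
            return 130  # conventional exit code for Ctrl+C
        except OSError as exc:
            print(f"relaunch failed: {exc}", file=sys.stderr)
            return 1
    # POSIX: replace the current process image in place (same PID).
    execvp(argv[0], list(argv))
    return 0  # only reached when execvp was replaced by a test double
```

A caller would typically end with `sys.exit(relaunch_sketch(new_argv))`, which keeps the exit-code propagation in one place.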
+8 -7
@@ -3240,22 +3240,23 @@ def _offer_launch_chat():
def _run_first_time_quick_setup(config: dict, hermes_home, is_existing: bool):
"""Streamlined first-time setup: provider + model only.
"""Streamlined first-time setup: provider, model, terminal & messaging.
Applies sensible defaults for TTS (Edge), terminal (local), agent
settings, and tools; the user can customize later via
``hermes setup <section>``.
Applies sensible defaults for TTS (Edge), agent settings, and tools;
the user can customize later via ``hermes setup <section>``.
"""
# Step 1: Model & Provider (essential — skips rotation/vision/TTS)
setup_model_provider(config, quick=True)
# Step 2: Apply defaults for everything else
# Step 2: Terminal Backend — where commands run is a core decision
setup_terminal_backend(config)
# Step 3: Apply defaults for everything else
_apply_default_agent_settings(config)
config.setdefault("terminal", {}).setdefault("backend", "local")
save_config(config)
# Step 3: Offer messaging gateway setup
# Step 4: Offer messaging gateway setup
print()
gateway_choice = prompt_choice(
"Connect a messaging platform? (Telegram, Discord, etc.)",
+2 -2
@@ -1257,7 +1257,7 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
sys.stdout.write(payload)
else:
out = Path(output_path)
out.write_text(payload, encoding="utf-8")
out.write_text(payload)
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
@@ -1274,7 +1274,7 @@ def do_snapshot_import(input_path: str, force: bool = False,
return
try:
snapshot = json.loads(inp.read_text(encoding="utf-8"))
snapshot = json.loads(inp.read_text())
except json.JSONDecodeError:
c.print(f"[bold red]Error:[/] Invalid JSON in {inp}\n")
return
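The difference between the two `write_text` / `read_text` variants in this hunk only shows up when the payload contains non-ASCII text and the locale encoding is not UTF-8. The sketch below is an illustration under assumptions: `export_snapshot` and `import_snapshot` are hypothetical stand-ins for the functions in the diff, and `ensure_ascii=False` is chosen here so that non-ASCII characters actually reach the file (with the `json.dumps` default they would be escaped to ASCII).

```python
import json
import tempfile
from pathlib import Path


def export_snapshot(path: Path, snapshot: dict) -> None:
    # ensure_ascii=False keeps characters such as "é" literal in the payload.
    payload = json.dumps(snapshot, indent=2, ensure_ascii=False)
    # Explicit encoding: without it, write_text uses the locale encoding,
    # which may be unable to encode some characters.
    path.write_text(payload, encoding="utf-8")


def import_snapshot(path: Path) -> dict:
    # Read back with the same explicit encoding used for writing.
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "snapshot.json"
    original = {"skills": ["résumé-writer", "翻訳"], "taps": []}
    export_snapshot(target, original)
    restored = import_snapshot(target)
```

Pinning the encoding on both sides makes the round trip independent of the machine's locale.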
-252
@@ -1,252 +0,0 @@
"""Windows-safe stdio configuration.
On Windows, Python's ``sys.stdout``/``sys.stderr`` default to the console's
active code page (often ``cp1252``, sometimes ``cp437``, occasionally ``cp932``
on Japanese locales, etc.). Hermes's banners, tool output feed, and slash
command listings all contain Unicode: box-drawing characters,
mathematical and geometric symbols, and user-supplied
text in any language. Printing those to a cp1252 console raises
``UnicodeEncodeError: 'charmap' codec can't encode character…`` and kills the
whole CLI before the REPL even opens.
The fix is to force UTF-8 on the Python side and also flip the console's
code page to UTF-8 (65001). Both matter: Python-level only helps when
Python's stdout is a real TTY; code-page flipping lets subprocesses and
child Python ``print()`` calls agree on encoding.
This module is a no-op on every non-Windows platform, and idempotent.
Entry points (``cli.py`` ``main``, ``hermes_cli/main.py`` CLI dispatch,
``gateway/run.py`` startup) call :func:`configure_windows_stdio` exactly
once early in startup.
Patterns cribbed from Claude Code (``src/utils/platform.ts``), OpenCode
(``packages/opencode/src/pty/index.ts`` env injection), and OpenAI Codex
(``codex-rs/core/src/unified_exec/process_manager.rs``). None of those
actually flip the console code page; they rely on their runtime (Node or
Rust) writing UTF-16 to the Win32 console API and letting the terminal
sort it out. Python doesn't get that luxury.
"""
from __future__ import annotations
import os
import sys
__all__ = ["configure_windows_stdio", "is_windows"]
_CONFIGURED = False
def is_windows() -> bool:
"""Return True iff running on native Windows (not WSL)."""
return sys.platform == "win32"
def _flip_console_code_page_to_utf8() -> None:
"""Set the attached console's input and output code pages to UTF-8.
Uses ``SetConsoleCP`` / ``SetConsoleOutputCP`` via ``ctypes``. Failure
is silent: if there's no attached console (e.g. Hermes is running
behind a redirected stdout, under a service, or inside a PTY-less CI
runner), these calls simply return 0 and we move on.
CP_UTF8 is 65001.
"""
try:
import ctypes
kernel32 = ctypes.windll.kernel32 # type: ignore[attr-defined]
# Best-effort; if there's no console attached these just fail silently.
kernel32.SetConsoleCP(65001)
kernel32.SetConsoleOutputCP(65001)
except Exception:
# ctypes import, missing kernel32, or non-Windows — any failure here
# is non-fatal. We've still reconfigured Python's own streams below.
pass
def _reconfigure_stream(stream, *, encoding: str = "utf-8", errors: str = "replace") -> None:
"""Reconfigure a text stream to UTF-8 in place.
Uses ``TextIOWrapper.reconfigure`` (Python 3.7+). If the stream isn't
a ``TextIOWrapper`` (e.g. it's been redirected to an ``io.StringIO``
during tests), we skip rather than blow up.
"""
try:
reconfigure = getattr(stream, "reconfigure", None)
if reconfigure is None:
return
reconfigure(encoding=encoding, errors=errors)
except Exception:
pass
def configure_windows_stdio() -> bool:
"""Force UTF-8 stdio on Windows. No-op elsewhere.
Idempotent: safe to call multiple times from different entry points.
Returns ``True`` if anything was actually changed, ``False`` on
non-Windows or on a repeat call.
Set ``HERMES_DISABLE_WINDOWS_UTF8=1`` in the environment to opt out
(for diagnosing encoding-related bugs by forcing the old cp1252 path).
Also sets a sensible default ``EDITOR`` on Windows if none is already
set; see :func:`_default_windows_editor`.
"""
global _CONFIGURED
if _CONFIGURED:
return False
if not is_windows():
# Mark configured so repeated calls on POSIX are true no-ops.
_CONFIGURED = True
return False
if os.environ.get("HERMES_DISABLE_WINDOWS_UTF8") in ("1", "true", "True", "yes"):
_CONFIGURED = True
return False
# Encourage every child Python process spawned by the agent to also use
# UTF-8 for its stdio. PYTHONIOENCODING wins over the locale-based
# default in subprocesses. Don't override an explicit user setting.
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
# PYTHONUTF8 = 1 enables UTF-8 Mode globally for any Python subprocess
# (PEP 540). Again, don't override an explicit setting.
os.environ.setdefault("PYTHONUTF8", "1")
# Set EDITOR to a working Windows default if neither EDITOR nor VISUAL
# is set. prompt_toolkit's ``open_in_editor`` falls back to POSIX-only
# paths (``/usr/bin/nano``, ``/usr/bin/vi``) that don't exist on
# Windows — Ctrl+X Ctrl+E and ``/edit`` silently do nothing there
# otherwise. This happens even with full Git for Windows installed,
# so it's not a MinGit-specific issue.
_default_editor = _default_windows_editor()
if _default_editor and not os.environ.get("EDITOR") and not os.environ.get("VISUAL"):
os.environ["EDITOR"] = _default_editor
# Augment PATH with the Hermes-managed Git install directories so
# subprocess calls (bash, rg, grep, etc.) resolve even in sessions
# that started before the User PATH broadcast reached them. When
# install.ps1 adds these to User PATH via SetEnvironmentVariable,
# already-running shells don't see the change — which means hermes
# launched from the install session won't find rg / bash / grep
# even though they're "installed". Prepending the known paths here
# closes that gap. No-op when the paths don't exist (e.g. system-Git
# install without Hermes-managed PortableGit).
_augment_path_with_known_tools()
# Flip the console code page first so that any subprocess that
# inherits the console (e.g. a launched shell) also sees CP_UTF8.
_flip_console_code_page_to_utf8()
# Reconfigure Python's own stdio wrappers so ``print()`` calls from
# this process round-trip emoji / box-drawing / non-Latin text.
# ``errors="replace"`` means a genuinely unencodable byte sequence
# gets a ``?`` rather than crashing the interpreter — we prefer
# degraded output over a stack trace.
_reconfigure_stream(sys.stdout)
_reconfigure_stream(sys.stderr)
# stdin is re-configured for completeness; Hermes's interactive
# input path uses prompt_toolkit which manages its own encoding,
# but batch/pipe input benefits from UTF-8 decoding on stdin too.
_reconfigure_stream(sys.stdin)
_CONFIGURED = True
return True
def _default_windows_editor() -> str:
"""Return a Windows-appropriate default for ``$EDITOR``.
Priority order, first match wins:
1. ``notepad``: ships with every Windows install, no deps, works as a
blocking editor (``subprocess.call(["notepad", file])`` blocks until
the user closes the window). This is the "always-works" default.
The prompt_toolkit buffer's ``open_in_editor`` and Hermes's
``hermes config edit`` both honour ``$EDITOR``. Users who prefer a
different editor can override:
- VSCode: ``$env:EDITOR = "code --wait"`` (``--wait`` is critical;
without it the editor returns immediately and any input is lost)
- Notepad++: ``$env:EDITOR = "'C:\\Program Files\\Notepad++\\notepad++.exe' -multiInst -nosession"``
- Neovim: ``$env:EDITOR = "nvim"`` (if installed)
Set this before launching Hermes (User env var in Windows Settings, or
export in a PowerShell profile) and Hermes picks it up automatically.
"""
import shutil
# notepad.exe is always in %SystemRoot%\System32 on Windows, so shutil.which
# will reliably find it. Return the bare name so prompt_toolkit's shlex
# split doesn't trip over a path containing spaces.
if shutil.which("notepad"):
return "notepad"
# On the extreme off-chance notepad is missing (WinPE, Nano Server), fall
# back to nothing and let prompt_toolkit's silent no-op do its thing.
return ""
def _augment_path_with_known_tools() -> None:
"""Prepend well-known Hermes-managed tool directories to os.environ['PATH'].
Fixes the "User PATH was just updated but my process can't see it" gap on
Windows. When install.ps1 runs, it adds entries like
``%LOCALAPPDATA%\\hermes\\git\\bin`` to the User PATH via
``SetEnvironmentVariable(..., "User")``. That write propagates to newly
*spawned* processes only; already-running shells (including the one the
user invokes ``hermes`` from right after install) retain their old PATH.
Any subprocess Hermes spawns (bash, ``rg``, ``grep``, ``npm``) inherits
that stale PATH and reports commands as missing even though they're on
disk. Symptom: ``search_files`` reports "rg/find not available" when
the user clearly just installed ripgrep.
Patch-up strategy: add the known Hermes-managed tool directories to our
PATH at startup so subprocess calls resolve correctly. No-op on POSIX
and when the directories don't exist. The User PATH broadcast still
happens in the background for future shells; this just smooths over
the first-launch gap.
"""
if not is_windows():
return
import shutil as _shutil
local_appdata = os.environ.get("LOCALAPPDATA", "")
if not local_appdata:
return
# Known tool dirs installed by scripts/install.ps1. Kept in sync with
# the PATH entries that installer adds to User scope — the two lists
# should match so this prefill fully mirrors what a fresh shell would
# see on next launch.
candidate_dirs = [
os.path.join(local_appdata, "hermes", "git", "cmd"),
os.path.join(local_appdata, "hermes", "git", "bin"),
os.path.join(local_appdata, "hermes", "git", "usr", "bin"),
# Hermes venv Scripts directory — host of the hermes.exe shim itself,
# also where any pip-installed console scripts land. Usually already
# on PATH when the user invokes hermes, but harmless to include.
os.path.join(local_appdata, "hermes", "hermes-agent", "venv", "Scripts"),
# WinGet packages directory — where ``winget install`` drops CLI
# shims by default (ripgrep lands here as rg.exe). Covers the case
# of a system-Git install + ripgrep-via-winget that isn't yet on
# the spawning shell's PATH.
os.path.join(local_appdata, "Microsoft", "WinGet", "Links"),
]
existing = os.environ.get("PATH", "")
existing_lower = {p.lower() for p in existing.split(os.pathsep) if p}
prepend = []
for d in candidate_dirs:
if os.path.isdir(d) and d.lower() not in existing_lower:
prepend.append(d)
if prepend:
os.environ["PATH"] = os.pathsep.join([*prepend, existing])
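The prepend-if-missing logic at the end of `_augment_path_with_known_tools` can be expressed as a pure function, which makes it testable without touching `os.environ` or the filesystem. This is a sketch under assumptions: `prepend_missing_dirs` and its `sep` / `is_dir` parameters are hypothetical, and unlike the code above it also de-duplicates within the candidate list and avoids a trailing separator when the existing PATH is empty.

```python
import os
from typing import Callable, Iterable


def prepend_missing_dirs(
    path_value: str,
    candidates: Iterable[str],
    *,
    sep: str = os.pathsep,
    is_dir: Callable[[str], bool] = os.path.isdir,
) -> str:
    """Return ``path_value`` with existing, not-yet-listed candidates prepended."""
    # Case-insensitive comparison, matching Windows path semantics.
    seen = {p.lower() for p in path_value.split(sep) if p}
    prepend = []
    for d in candidates:
        if is_dir(d) and d.lower() not in seen:
            prepend.append(d)
            seen.add(d.lower())
    parts = prepend + ([path_value] if path_value else [])
    return sep.join(parts)
```

Applying it would then be a single assignment, for example `os.environ["PATH"] = prepend_missing_dirs(os.environ.get("PATH", ""), candidate_dirs)`.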
+3 -9
@@ -509,12 +509,8 @@ def _run_post_setup(post_setup_key: str):
if not node_modules.exists() and npm_bin:
_print_info(" Installing Node.js dependencies for browser tools...")
import subprocess
# Use the resolved npm_bin absolute path so subprocess.Popen can
# execute npm.cmd on Windows (CreateProcessW otherwise rejects
# batch shims). On POSIX npm_bin is the plain path — same
# behaviour as before.
result = subprocess.run(
[npm_bin, "install", "--silent"],
["npm", "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
@@ -613,13 +609,11 @@ def _run_post_setup(post_setup_key: str):
elif post_setup_key == "camofox":
camofox_dir = PROJECT_ROOT / "node_modules" / "@askjo" / "camofox-browser"
_npm_bin = shutil.which("npm")
if not camofox_dir.exists() and _npm_bin:
if not camofox_dir.exists() and shutil.which("npm"):
_print_info(" Installing Camofox browser server...")
import subprocess
# Absolute npm path so .cmd shim executes on Windows.
result = subprocess.run(
[_npm_bin, "install", "--silent"],
["npm", "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
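The two variants in these hunks differ only in whether `subprocess.run` receives the bare name `"npm"` or the path returned by `shutil.which`. A small sketch of the resolved-path variant follows; `npm_install_argv` and its injectable `which` parameter are hypothetical names used here for illustration, and the claim that the resolved path is needed for `.cmd` shims on Windows comes from the comments in the diff rather than from anything verified here.

```python
import shutil
from typing import Callable, List, Optional


def npm_install_argv(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Build the argv for ``npm install --silent`` or return None if npm is absent."""
    npm_bin = which("npm")
    if npm_bin is None:
        return None
    # Pass the resolved path (e.g. an npm.cmd shim on Windows) as argv[0].
    return [npm_bin, "install", "--silent"]
```

Resolving once and reusing the result also avoids a second PATH lookup between the availability check and the actual call.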
+2 -27
@@ -692,7 +692,7 @@ def _tail_lines(path: Path, n: int) -> List[str]:
if not path.exists():
return []
try:
text = path.read_text(encoding="utf-8", errors="replace")
text = path.read_text(errors="replace")
except OSError:
return []
lines = text.splitlines()
@@ -2979,20 +2979,7 @@ async def get_models_analytics(days: int = 30):
import re
import asyncio
# PTY bridge is POSIX-only (depends on fcntl/termios/ptyprocess). On native
# Windows the import raises; catch and leave PtyBridge=None so the rest of
# the dashboard (sessions, jobs, metrics, config editor) still loads and the
# /api/pty endpoint cleanly refuses with a WSL-suggested message.
try:
from hermes_cli.pty_bridge import PtyBridge, PtyUnavailableError
_PTY_BRIDGE_AVAILABLE = True
except ImportError as _pty_import_err: # pragma: no cover - Windows-only path
PtyBridge = None # type: ignore[assignment]
_PTY_BRIDGE_AVAILABLE = False
class PtyUnavailableError(RuntimeError): # type: ignore[no-redef]
"""Stub on platforms where pty_bridge can't be imported."""
pass
from hermes_cli.pty_bridge import PtyBridge, PtyUnavailableError
_RESIZE_RE = re.compile(rb"\x1b\[RESIZE:(\d+);(\d+)\]")
_PTY_READ_CHUNK_TIMEOUT = 0.2
@@ -3126,18 +3113,6 @@ async def pty_ws(ws: WebSocket) -> None:
await ws.accept()
# On native Windows, the POSIX PTY bridge can't be imported. Tell the
# client and close cleanly rather than pretending the feature works.
if not _PTY_BRIDGE_AVAILABLE:
await ws.send_text(
"\r\n\x1b[31mChat unavailable: the embedded terminal requires a "
"POSIX PTY, which native Windows Python doesn't provide.\x1b[0m\r\n"
"\x1b[33mInstall Hermes inside WSL2 to use the dashboard's /chat "
"tab — the rest of the dashboard works here.\x1b[0m\r\n"
)
await ws.close(code=1011)
return
# --- spawn PTY ------------------------------------------------------
resume = ws.query_params.get("resume") or None
channel = _channel_or_close_code(ws)
+2 -2
@@ -233,7 +233,7 @@ def is_wsl() -> bool:
if _wsl_detected is not None:
return _wsl_detected
try:
with open("/proc/version", "r", encoding="utf-8") as f:
with open("/proc/version", "r") as f:
_wsl_detected = "microsoft" in f.read().lower()
except Exception:
_wsl_detected = False
@@ -260,7 +260,7 @@ def is_container() -> bool:
_container_detected = True
return True
try:
with open("/proc/1/cgroup", "r", encoding="utf-8") as f:
with open("/proc/1/cgroup", "r") as f:
cgroup = f.read()
if "docker" in cgroup or "podman" in cgroup or "/lxc/" in cgroup:
_container_detected = True
+1 -1
@@ -50,7 +50,7 @@ def _resolve_timezone_name() -> str:
import yaml
config_path = get_config_path()
if config_path.exists():
with open(config_path, encoding="utf-8") as f:
with open(config_path) as f:
cfg = yaml.safe_load(f) or {}
tz_cfg = cfg.get("timezone", "")
if isinstance(tz_cfg, str) and tz_cfg.strip():
+66 -15
@@ -97,6 +97,12 @@
const API = "/api/plugins/kanban";
const MIME_TASK = "text/x-hermes-task";
// Docs link — surfaced as a `?` icon next to the board switcher and as
// `title=` hints on unlabelled controls. Kept in one place so rebrands or
// path changes are a single edit.
const DOCS_URL = "https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban";
const DOCS_TUTORIAL_URL = "https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban-tutorial";
// localStorage key for the user's selected board. Independent of the
// CLI's on-disk ``<root>/kanban/current`` pointer so browser users
// can inspect any board without shifting the CLI's active board out
@@ -1128,6 +1134,20 @@
// Board switcher (multi-project)
// -------------------------------------------------------------------------
// Small `?` affordance next to the board controls. Opens the kanban docs
// page in a new tab so users can look up what any of the widgets mean
// without losing the current board view.
function DocsLink() {
return h("a", {
href: DOCS_URL,
target: "_blank",
rel: "noopener noreferrer",
className: "hermes-kanban-docs-link",
title: "Open Hermes Kanban docs in a new tab",
"aria-label": "Hermes Kanban documentation",
}, "?");
}
function BoardSwitcher(props) {
const list = props.boardList || [];
const current = list.find(function (b) { return b.slug === props.board; });
@@ -1152,6 +1172,7 @@
size: "sm",
className: "h-7 text-xs",
}, "+ New board"),
h(DocsLink, null),
);
}
@@ -1165,6 +1186,7 @@
value: props.board,
className: "h-8 min-w-[220px]",
"aria-label": "Switch kanban board",
title: "Boards are independent work streams. Each board has its own tasks, tenants, and assignees.",
}, selectChangeHandler(function (v) { if (v) props.onSwitch(v); })),
list.map(function (b) {
const label = b.total > 0
@@ -1178,10 +1200,12 @@
),
),
h("div", { className: "flex-1" }),
h(DocsLink, null),
h(Button, {
onClick: props.onNewClick,
size: "sm",
className: "h-8",
title: "Create a new board. Useful when you want an unrelated work stream (different project, different team, isolated scratch area).",
}, "+ New board"),
props.board !== "default"
? h(Button, {
@@ -1326,7 +1350,8 @@
const tenants = (props.board && props.board.tenants) || [];
const assignees = (props.board && props.board.assignees) || [];
return h("div", { className: "flex flex-wrap items-end gap-3" },
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Fuzzy-match tasks by id, title, or description. Matches across all columns." },
h(Label, { className: "text-xs text-muted-foreground" }, "Search"),
h(Input, {
placeholder: "Filter cards…",
@@ -1335,7 +1360,8 @@
className: "w-56 h-8",
}),
),
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Tenants are free-form tags on a task (e.g. customer, project, team). Set them via the task drawer or kanban_create." },
h(Label, { className: "text-xs text-muted-foreground" }, "Tenant"),
h(Select, Object.assign({
value: props.tenantFilter,
@@ -1347,7 +1373,8 @@
}),
),
),
h("div", { className: "flex flex-col gap-1" },
h("div", { className: "flex flex-col gap-1",
title: "Filter by assigned Hermes profile. Profiles are the named agent identities that claim and work on tasks." },
h(Label, { className: "text-xs text-muted-foreground" }, "Assignee"),
h(Select, Object.assign({
value: props.assigneeFilter,
@@ -1359,7 +1386,8 @@
}),
),
),
h("label", { className: "flex items-center gap-2 text-xs" },
h("label", { className: "flex items-center gap-2 text-xs",
title: "Include archived tasks in the board view. Archived tasks are hidden by default." },
h("input", {
type: "checkbox",
checked: props.includeArchived,
@@ -1380,10 +1408,12 @@
h(Button, {
onClick: props.onNudgeDispatch,
size: "sm",
title: "Wake the dispatcher to claim ready tasks now instead of waiting for the next tick. Use this after adding tasks if you want them picked up immediately.",
}, "Nudge dispatcher"),
h(Button, {
onClick: props.onRefresh,
size: "sm",
title: "Reload the board from the database. The board auto-refreshes on task events; this is for forcing a re-read.",
}, "Refresh"),
);
}
@@ -1400,6 +1430,7 @@
h(Button, {
onClick: function () { props.onApply({ status: "ready" }); },
size: "sm",
title: "Move selected tasks to Ready. Ready tasks are picked up by the dispatcher on the next tick.",
}, "→ ready"),
h(Button, {
onClick: function () {
@@ -1407,6 +1438,7 @@
`Mark ${props.count} task(s) as done?`);
},
size: "sm",
title: "Mark selected tasks as done. Releases any claims and unblocks dependent children. You'll be asked for a completion summary.",
}, "Complete"),
h(Button, {
onClick: function () {
@@ -1414,8 +1446,10 @@
`Archive ${props.count} task(s)?`);
},
size: "sm",
title: "Archive selected tasks. They disappear from the default board view but remain in the database.",
}, "Archive"),
h("div", { className: "hermes-kanban-bulk-reassign" },
h("div", { className: "hermes-kanban-bulk-reassign",
title: "Reassign selected tasks to a different Hermes profile. Pick a profile (or unassign) and click Apply." },
h(Select, {
value: assignee,
onChange: function (e) { setAssignee(e.target.value); },
@@ -1435,12 +1469,14 @@
},
disabled: !assignee,
size: "sm",
title: "Apply the selected assignee to all selected tasks.",
}, "Apply"),
),
h("div", { className: "flex-1" }),
h(Button, {
onClick: props.onClear,
size: "sm",
title: "Deselect all tasks and hide this bar.",
}, "Clear"),
);
}
@@ -1521,11 +1557,13 @@
onDragLeave: handleDragLeave,
onDrop: handleDrop,
},
h("div", { className: "hermes-kanban-column-header" },
h("div", { className: "hermes-kanban-column-header",
title: COLUMN_HELP[props.column.name] || "" },
h("span", { className: cn("hermes-kanban-dot", COLUMN_DOT[props.column.name]) }),
h("span", { className: "hermes-kanban-column-label" },
COLUMN_LABEL[props.column.name] || props.column.name),
h("span", { className: "hermes-kanban-column-count" },
h("span", { className: "hermes-kanban-column-count",
title: `${props.column.tasks.length} task${props.column.tasks.length === 1 ? "" : "s"} in this column` },
props.column.tasks.length),
h("button", {
type: "button",
@@ -1652,7 +1690,8 @@
onClick: function (e) { e.stopPropagation(); },
title: "Select for bulk actions",
}),
h("span", { className: "hermes-kanban-card-id" }, t.id),
h("span", { className: "hermes-kanban-card-id",
title: `Task id: ${t.id}. Use this id with kanban_show, /kanban show, or hermes kanban show.` }, t.id),
t.warnings && t.warnings.count > 0
? h("span", {
className: cn(
@@ -1669,10 +1708,12 @@
t.warnings.highest_severity === "error" ? "!!" : "⚠")
: null,
t.priority > 0
? h(Badge, { className: "hermes-kanban-priority" }, `P${t.priority}`)
? h(Badge, { className: "hermes-kanban-priority",
title: `Priority ${t.priority}. Higher-priority tasks are claimed first by the dispatcher.` }, `P${t.priority}`)
: null,
t.tenant
? h(Badge, { variant: "outline", className: "hermes-kanban-tag" }, t.tenant)
? h(Badge, { variant: "outline", className: "hermes-kanban-tag",
title: `Tenant: ${t.tenant}. Free-form tag for grouping tasks (customer, project, team).` }, t.tenant)
: null,
progress
? h("span", {
@@ -1687,16 +1728,21 @@
h("div", { className: "hermes-kanban-card-title" }, t.title || "(untitled)"),
h("div", { className: "hermes-kanban-card-row hermes-kanban-card-meta" },
t.assignee
? h("span", { className: "hermes-kanban-assignee" }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned" }, "unassigned"),
? h("span", { className: "hermes-kanban-assignee",
title: `Assigned to Hermes profile @${t.assignee}` }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned",
title: "No profile assigned. The dispatcher will pick one from available profiles when the task is Ready." }, "unassigned"),
t.comment_count > 0
? h("span", { className: "hermes-kanban-count" }, "💬 ", t.comment_count)
? h("span", { className: "hermes-kanban-count",
title: `${t.comment_count} comment${t.comment_count === 1 ? "" : "s"} on this task` }, "💬 ", t.comment_count)
: null,
t.link_counts && (t.link_counts.parents + t.link_counts.children) > 0
? h("span", { className: "hermes-kanban-count" },
? h("span", { className: "hermes-kanban-count",
title: `${t.link_counts.parents} parent${t.link_counts.parents === 1 ? "" : "s"}, ${t.link_counts.children} child${t.link_counts.children === 1 ? "" : "ren"}. Children stay blocked until their parent is done.` },
"↔ ", t.link_counts.parents + t.link_counts.children)
: null,
h("span", { className: "hermes-kanban-ago" },
h("span", { className: "hermes-kanban-ago",
title: t.created_at ? `Created ${t.created_at}` : "" },
timeAgo ? timeAgo(t.created_at) : ""),
),
),
@@ -1777,6 +1823,9 @@
onChange: function (e) { setAssignee(e.target.value); },
placeholder: props.columnName === "triage" ? "specifier" : "assignee",
className: "h-7 text-xs flex-1",
title: props.columnName === "triage"
? "Hermes profile that will spec this task (default: the dispatcher's configured specifier). Leave blank to let the dispatcher pick."
: "Hermes profile to assign. Leave blank and the dispatcher will pick from available profiles when the task is Ready.",
}),
h(Input, {
type: "number",
@@ -1784,6 +1833,7 @@
onChange: function (e) { setPriority(e.target.value); },
placeholder: "pri",
className: "h-7 text-xs w-16",
title: "Priority. Higher-priority tasks are claimed first by the dispatcher. 0 = default.",
}),
),
h(Input, {
@@ -1815,6 +1865,7 @@
value: parent,
onChange: function (e) { setParent(e.target.value); },
className: "h-7 text-xs",
title: "Optional parent task. A child stays blocked in its current column until the parent is marked done.",
},
h(SelectOption, { value: "" }, "— no parent —"),
(props.allTasks || []).map(function (t) {
+26
@@ -891,6 +891,32 @@
display: flex;
justify-content: flex-end;
padding: 0 0.25rem;
gap: 0.5rem;
align-items: center;
}
.hermes-kanban-docs-link {
display: inline-flex;
align-items: center;
justify-content: center;
width: 1.5rem;
height: 1.5rem;
border-radius: 9999px;
font-size: 0.75rem;
font-weight: 600;
line-height: 1;
color: var(--color-muted-foreground, rgba(180, 180, 200, 0.8));
background: var(--color-card-subtle, rgba(255, 255, 255, 0.04));
border: 1px solid var(--color-border, rgba(120, 120, 140, 0.25));
text-decoration: none;
cursor: help;
transition: color 0.15s, background 0.15s, border-color 0.15s;
}
.hermes-kanban-docs-link:hover,
.hermes-kanban-docs-link:focus-visible {
color: var(--color-foreground, #e7e7ee);
background: var(--color-card, rgba(255, 255, 255, 0.08));
border-color: var(--color-border, rgba(160, 160, 190, 0.45));
outline: none;
}
.hermes-kanban-dialog-backdrop {
position: fixed;
+5
@@ -1,5 +1,6 @@
"""GMI Cloud provider profile."""
from hermes_cli import __version__ as _HERMES_VERSION
from providers import register_provider
from providers.base import ProviderProfile
@@ -12,6 +13,10 @@ gmi = ProviderProfile(
env_vars=("GMI_API_KEY", "GMI_BASE_URL"),
base_url="https://api.gmi-serving.com/v1",
auth_type="api_key",
# Attribution so GMI can identify traffic from Hermes Agent.
# The generic profile.default_headers fallback in run_agent.py and
# agent/auxiliary_client.py picks this up at client construction time.
default_headers={"User-Agent": f"HermesAgent/{_HERMES_VERSION}"},
default_aux_model="google/gemini-3.1-flash-lite-preview",
fallback_models=(
"zai-org/GLM-5.1-FP8",
+2 -26
@@ -36,12 +36,6 @@ dependencies = [
"edge-tts>=7.2.7,<8",
# Skills Hub (GitHub App JWT auth — optional, only needed for bot identity)
"PyJWT[crypto]>=2.12.0,<3", # CVE-2026-32597
# Windows has no IANA tzdata shipped with the OS, so Python's ``zoneinfo``
# (PEP 615) raises ``ZoneInfoNotFoundError`` for every non-UTC timezone
# out of the box. ``tzdata`` ships the Olson database as a data package
# Python resolves automatically. No-op on Linux/macOS (which have
# /usr/share/zoneinfo). Credits: PR #13182 (@sprmn24).
"tzdata>=2023.3; sys_platform == 'win32'",
]
[project.optional-dependencies]
@@ -160,7 +154,7 @@ hermes-agent = "run_agent:main"
hermes-acp = "acp_adapter.entry:main"
[tool.setuptools]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_bootstrap", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils"]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils"]
[tool.setuptools.package-data]
hermes_cli = ["web_dist/**/*"]
@@ -188,25 +182,7 @@ exclude = ["tinker-atropos"]
[tool.ruff]
exclude = ["tinker-atropos"]
preview = true # required for PLW1514 (unspecified-encoding) — preview rule
[tool.ruff.lint]
# All other lints are intentionally disabled (see comment history on this
# file) while we wrangle typechecks — but PLW1514 is too load-bearing to
# keep off. Bare open()/read_text()/write_text() in text mode defaults to
# the system locale encoding on Windows (cp1252 on US-locale installs),
# which silently corrupts any non-ASCII file content. We had three
# separate Windows sandbox regressions in one debug session before
# adding the explicit encoding. This rule keeps new code honest.
select = ["PLW1514"]
[tool.ruff.lint.per-file-ignores]
# Tests can intentionally exercise locale-encoding edge cases.
"tests/**" = ["PLW1514"]
# Skills and plugins are partially user-authored — their own conventions.
"skills/**" = ["PLW1514"]
"optional-skills/**" = ["PLW1514"]
"plugins/**" = ["PLW1514"]
select = [] # disable all lints for now, until we've wrangled typechecks a bit more :3
[tool.uv]
exclude-newer = "7 days"
+1 -1
@@ -82,7 +82,7 @@ def load_hermes_config() -> dict:
if config_path.exists():
try:
with open(config_path, "r", encoding='utf-8') as f:
with open(config_path, "r") as f:
file_config = yaml.safe_load(f) or {}
# Get model from config
+8 -6
@@ -20,10 +20,6 @@ Usage:
response = agent.run_conversation("Tell me about the latest Python updates")
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import asyncio
import base64
import concurrent.futures
@@ -2390,7 +2386,13 @@ class AIAgent:
# ── Swap core runtime fields ──
self.model = new_model
self.provider = new_provider
self.base_url = base_url or self.base_url
# Use new base_url when provided; only fall back to current when the
# new provider genuinely has no endpoint (e.g. native SDK providers).
# Without this guard the old provider's URL (e.g. Ollama's localhost
# address) would persist silently after switching to a cloud provider
# that returns an empty base_url string.
if base_url:
self.base_url = base_url
self.api_mode = api_mode
# Invalidate transport cache — new api_mode may need a different transport
if hasattr(self, "_transport_cache"):
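A minimal sketch of the two assignment forms in this hunk, assuming a plain attribute with no custom setter. `RuntimeState` and both method names are hypothetical stand-ins, and the URLs are illustrative values only. In this sketch both forms behave the same for an empty string: the previous URL stays in place.

```python
from typing import Optional


class RuntimeState:
    """Hypothetical stand-in for the agent's runtime fields."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def switch_guarded(self, base_url: Optional[str]) -> None:
        # Guarded form: assign only when a non-empty URL is supplied.
        if base_url:
            self.base_url = base_url

    def switch_or(self, base_url: Optional[str]) -> None:
        # One-line form: falls back to the current value for falsy input.
        self.base_url = base_url or self.base_url


guarded = RuntimeState("http://localhost:11434/v1")
guarded.switch_guarded("")  # empty string: previous URL is kept
kept_guarded = guarded.base_url
guarded.switch_guarded("https://api.example.com/v1")
new_guarded = guarded.base_url

one_liner = RuntimeState("http://localhost:11434/v1")
one_liner.switch_or("")  # empty string: previous URL is kept here too
kept_or = one_liner.base_url
```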
@@ -3686,7 +3688,7 @@ class AIAgent:
pass
review_agent = None
try:
with open(os.devnull, "w", encoding="utf-8") as _devnull, \
with open(os.devnull, "w") as _devnull, \
contextlib.redirect_stdout(_devnull), \
contextlib.redirect_stderr(_devnull):
# Inherit the parent agent's live runtime (provider, model,
+1 -1
@@ -81,7 +81,7 @@ def build_catalog() -> dict:
def main() -> int:
catalog = build_catalog()
os.makedirs(os.path.dirname(OUTPUT_PATH), exist_ok=True)
with open(OUTPUT_PATH, "w", encoding="utf-8") as fh:
with open(OUTPUT_PATH, "w") as fh:
json.dump(catalog, fh, indent=2)
fh.write("\n")
+1 -1
@@ -304,7 +304,7 @@ def main():
}
os.makedirs(os.path.dirname(OUTPUT_PATH), exist_ok=True)
with open(OUTPUT_PATH, "w", encoding="utf-8") as f:
with open(OUTPUT_PATH, "w") as f:
json.dump(index, f, separators=(",", ":"), ensure_ascii=False)
elapsed = time.time() - overall_start
+1 -1
@@ -291,7 +291,7 @@ def check_release_file(release_file, all_contributors):
missing: set of handles NOT found in the file
"""
try:
content = Path(release_file).read_text(encoding="utf-8")
content = Path(release_file).read_text()
except FileNotFoundError:
print(f" [error] Release file not found: {release_file}", file=sys.stderr)
return set(), set(all_contributors)
+1 -1
@@ -242,7 +242,7 @@ def check_config(groq_key, eleven_key):
if config_path.exists():
try:
import yaml
with open(config_path, encoding="utf-8") as f:
with open(config_path) as f:
cfg = yaml.safe_load(f) or {}
stt_provider = cfg.get("stt", {}).get("provider", "local")
+65 -414
@@ -145,30 +145,15 @@ function Test-Python {
# Python not found — use uv to install it (no admin needed!)
Write-Info "Python $PythonVersion not found, installing via uv..."
try {
# Temporarily relax ErrorActionPreference: uv writes download progress
# to stderr, and with $ErrorActionPreference = "Stop" PowerShell wraps
# those stderr lines as ErrorRecord objects via 2>&1, then throws a
# terminating exception — even when uv exits 0. This caused fresh
# installs to fail on the first run despite Python being installed
# successfully. We verify success with `uv python find` afterwards
# which is the reliable signal regardless of exit code semantics.
$prevEAP = $ErrorActionPreference
$ErrorActionPreference = "Continue"
$uvOutput = & $UvCmd python install $PythonVersion 2>&1
$uvExitCode = $LASTEXITCODE
$ErrorActionPreference = $prevEAP
# Check if Python is now available (more reliable than exit code
# since uv may return non-zero due to "already installed" etc.)
$pythonPath = & $UvCmd python find $PythonVersion 2>$null
if ($pythonPath) {
$ver = & $pythonPath --version 2>$null
Write-Success "Python installed: $ver"
return $true
}
# uv ran but Python still not findable — show what happened
if ($uvExitCode -ne 0) {
if ($LASTEXITCODE -eq 0) {
$pythonPath = & $UvCmd python find $PythonVersion 2>$null
if ($pythonPath) {
$ver = & $pythonPath --version 2>$null
Write-Success "Python installed: $ver"
return $true
}
} else {
Write-Warn "uv python install output:"
Write-Host $uvOutput -ForegroundColor DarkGray
}
@@ -206,213 +191,19 @@ function Test-Python {
return $false
}
function Install-Git {
<#
.SYNOPSIS
Ensure Git (and Git Bash) are installed. Git for Windows bundles bash.exe
which Hermes uses to run shell commands.
Priority order (deliberately simple: no winget, no registry, no system
package manager):
1. Existing ``git`` on PATH: use it as-is (the common fast path).
2. Download **PortableGit** from the official git-for-windows GitHub
release (self-extracting 7z.exe) and unpack it to
``%LOCALAPPDATA%\hermes\git``. This never touches system Git, never
requires admin, and works even on locked-down machines and machines
with a broken system Git install.
**Why PortableGit, not MinGit:** MinGit is the minimal-automation
distribution and ships ONLY ``git.exe``: no bash, no POSIX utilities.
Hermes needs ``bash.exe`` to run shell commands. PortableGit is the
full Git for Windows distribution without the installer UI; it ships
``git.exe`` + ``bash.exe`` + ``sh``, ``awk``, ``sed``, ``grep``, ``curl``,
``ssh``, etc. in ``usr\bin\``.
We deliberately skip winget because it fails badly when the system Git
install is in a half-installed state (partially registered, or uninstall-
blocked). Owning the Hermes copy of Git ourselves is predictable and
recoverable: if it ever breaks, ``Remove-Item %LOCALAPPDATA%\hermes\git``
and re-running this installer fully recovers.
After install we locate ``bash.exe`` and persist the path in
``HERMES_GIT_BASH_PATH`` (User scope) so Hermes can find it in a fresh
shell without a second PATH refresh.
#>
function Test-Git {
Write-Info "Checking Git..."
if (Get-Command git -ErrorAction SilentlyContinue) {
$version = git --version
Write-Success "Git found ($version)"
Set-GitBashEnvVar
return $true
}
# Download PortableGit into $HermesHome\git. Always works as long as
# we can reach github.com — no admin, no winget, no reliance on the
# user's possibly-broken system Git install.
Write-Info "Git not found — downloading PortableGit to $HermesHome\git\ ..."
Write-Info "(no admin rights required; isolated from any system Git install)"
try {
$arch = if ([Environment]::Is64BitOperatingSystem) {
# Detect ARM64 vs x64 explicitly; PortableGit ships separate assets.
if ($env:PROCESSOR_ARCHITECTURE -eq "ARM64" -or $env:PROCESSOR_ARCHITEW6432 -eq "ARM64") {
"arm64"
} else {
"64-bit"
}
} else {
# PortableGit does not ship a 32-bit build — fall back to MinGit 32-bit
# with a warning that bash-based features will be unavailable.
"32-bit-mingit"
}
$releaseApi = "https://api.github.com/repos/git-for-windows/git/releases/latest"
$release = Invoke-RestMethod -Uri $releaseApi -UseBasicParsing -Headers @{ "User-Agent" = "hermes-installer" }
if ($arch -eq "32-bit-mingit") {
Write-Warn "32-bit Windows detected — PortableGit is 64-bit only. Installing MinGit 32-bit as a last resort; bash-dependent Hermes features (terminal tool, agent-browser) will not work on this machine."
$assetPattern = "MinGit-*-32-bit.zip"
$downloadIsZip = $true
} elseif ($arch -eq "arm64") {
$assetPattern = "PortableGit-*-arm64.7z.exe"
$downloadIsZip = $false
} else {
$assetPattern = "PortableGit-*-64-bit.7z.exe"
$downloadIsZip = $false
}
$asset = $release.assets | Where-Object { $_.name -like $assetPattern } | Select-Object -First 1
if (-not $asset) {
throw "Could not find $assetPattern in latest git-for-windows release"
}
$downloadUrl = $asset.browser_download_url
$downloadExt = if ($downloadIsZip) { "zip" } else { "7z.exe" }
$tmpFile = "$env:TEMP\$($asset.name)"
$gitDir = "$HermesHome\git"
Write-Info "Downloading $($asset.name) ($([math]::Round($asset.size / 1MB, 1)) MB)..."
Invoke-WebRequest -Uri $downloadUrl -OutFile $tmpFile -UseBasicParsing
if (Test-Path $gitDir) {
Write-Info "Removing previous Git install at $gitDir ..."
Remove-Item -Recurse -Force $gitDir
}
New-Item -ItemType Directory -Path $gitDir -Force | Out-Null
if ($downloadIsZip) {
Expand-Archive -Path $tmpFile -DestinationPath $gitDir -Force
} else {
# PortableGit is a self-extracting 7z archive. Invoke it with
# `-o<target> -y` (silent) to extract to $gitDir. No 7z install
# required; it's fully self-contained.
Write-Info "Extracting PortableGit to $gitDir ..."
$extractProc = Start-Process -FilePath $tmpFile `
-ArgumentList "-o`"$gitDir`"", "-y" `
-NoNewWindow -Wait -PassThru
if ($extractProc.ExitCode -ne 0) {
throw "PortableGit extraction failed (exit code $($extractProc.ExitCode))"
}
}
Remove-Item -Force $tmpFile -ErrorAction SilentlyContinue
# PortableGit layout: cmd\git.exe + bin\bash.exe + usr\bin\ (coreutils)
# MinGit layout: cmd\git.exe + usr\bin\bash.exe (if present)
$gitExe = "$gitDir\cmd\git.exe"
if (-not (Test-Path $gitExe)) {
throw "Git extraction did not produce git.exe at $gitExe"
}
# Add to session PATH so the rest of this install run can use git.
$env:Path = "$gitDir\cmd;$env:Path"
# Persist to User PATH so fresh shells see it. PortableGit needs
# cmd\ (for git.exe), bin\ (for bash.exe + core tools), and
# usr\bin\ (for perl, ssh, curl, and other POSIX coreutils).
$newPathEntries = @(
"$gitDir\cmd",
"$gitDir\bin",
"$gitDir\usr\bin"
)
$userPath = [Environment]::GetEnvironmentVariable("Path", "User")
$userPathItems = if ($userPath) { $userPath -split ";" } else { @() }
$changed = $false
foreach ($entry in $newPathEntries) {
if ($userPathItems -notcontains $entry) {
$userPathItems += $entry
$changed = $true
}
}
if ($changed) {
[Environment]::SetEnvironmentVariable("Path", ($userPathItems -join ";"), "User")
}
$version = & $gitExe --version
Write-Success "Git $version installed to $gitDir (portable, user-scoped)"
Set-GitBashEnvVar
return $true
} catch {
Write-Err "Could not install portable Git: $_"
Write-Info ""
Write-Info "Fallback: install Git manually from https://git-scm.com/download/win"
Write-Info "then re-run this installer. Hermes needs Git Bash on Windows to run"
Write-Info "shell commands (same as Claude Code and other coding agents)."
return $false
}
}
function Set-GitBashEnvVar {
<#
.SYNOPSIS
Locate ``bash.exe`` from an already-installed Git and persist the path in
``HERMES_GIT_BASH_PATH`` (User env scope) so Hermes can find it even before
PATH propagation completes in a newly-spawned shell.
#>
$candidates = @()
# Our own portable Git install is ALWAYS checked first, so a broken
# system Git doesn't hijack us. If the user had a working system Git
# we'd have returned early from Install-Git's fast path and never called
# this with a system-Git-only installation anyway.
#
# Layouts:
# PortableGit (our default): $HermesHome\git\bin\bash.exe
# MinGit (32-bit fallback): $HermesHome\git\usr\bin\bash.exe
$candidates += "$HermesHome\git\bin\bash.exe" # PortableGit layout (primary)
$candidates += "$HermesHome\git\usr\bin\bash.exe" # MinGit / PortableGit usr\bin fallback
# git.exe on PATH can tell us where the install root is
$gitCmd = Get-Command git -ErrorAction SilentlyContinue
if ($gitCmd) {
$gitExe = $gitCmd.Source
# Git for Windows (full installer): <root>\cmd\git.exe + <root>\bin\bash.exe
# MinGit: <root>\cmd\git.exe + <root>\usr\bin\bash.exe
$gitRoot = Split-Path (Split-Path $gitExe -Parent) -Parent
$candidates += "$gitRoot\bin\bash.exe"
$candidates += "$gitRoot\usr\bin\bash.exe"
}
# Standard system install locations as a final fallback. Note:
# ProgramFiles(x86) can't be referenced via ${env:...} string interpolation
# because of the parens — use [Environment]::GetEnvironmentVariable().
$candidates += "${env:ProgramFiles}\Git\bin\bash.exe"
$pf86 = [Environment]::GetEnvironmentVariable("ProgramFiles(x86)")
if ($pf86) { $candidates += "$pf86\Git\bin\bash.exe" }
$candidates += "${env:LocalAppData}\Programs\Git\bin\bash.exe"
foreach ($candidate in $candidates) {
if ($candidate -and (Test-Path $candidate)) {
[Environment]::SetEnvironmentVariable("HERMES_GIT_BASH_PATH", $candidate, "User")
$env:HERMES_GIT_BASH_PATH = $candidate
Write-Info "Set HERMES_GIT_BASH_PATH=$candidate"
return
}
}
Write-Warn "Could not locate bash.exe — Hermes may not find Git Bash."
Write-Info "If needed, set HERMES_GIT_BASH_PATH manually to your bash.exe path."
Write-Err "Git not found"
Write-Info "Please install Git from:"
Write-Info " https://git-scm.com/download/win"
return $false
}
function Test-Node {
@@ -620,71 +411,21 @@ function Install-SystemPackages {
function Install-Repository {
Write-Info "Installing to $InstallDir..."
$didUpdate = $false
if (Test-Path $InstallDir) {
# Test-Path "$InstallDir\.git" returns True when .git is a file OR a
# directory OR a symlink OR a submodule-style gitfile — and also when
# it's a broken stub left over from a failed previous install (e.g.
# a partial Remove-Item that couldn't delete a locked index.lock).
# Validate the repo properly by asking git itself. Two belt-and-braces
# checks: rev-parse AND git status. If either fails, the repo is broken
# and we fall through to a fresh clone.
$repoValid = $false
if (Test-Path "$InstallDir\.git") {
Push-Location $InstallDir
try {
# Reset $LASTEXITCODE before the probe so we don't pick up
# a stale 0 from an earlier git call in this session.
$global:LASTEXITCODE = 0
$revParseOut = & git -c windows.appendAtomically=false rev-parse --is-inside-work-tree 2>&1
$revParseOk = ($LASTEXITCODE -eq 0) -and ($revParseOut -match "true")
$global:LASTEXITCODE = 0
$null = & git -c windows.appendAtomically=false status --short 2>&1
$statusOk = ($LASTEXITCODE -eq 0)
if ($revParseOk -and $statusOk) {
$repoValid = $true
}
} catch {}
Pop-Location
}
if ($repoValid) {
Write-Info "Existing installation found, updating..."
Push-Location $InstallDir
try {
git -c windows.appendAtomically=false fetch origin
if ($LASTEXITCODE -ne 0) { throw "git fetch failed (exit $LASTEXITCODE)" }
git -c windows.appendAtomically=false checkout $Branch
if ($LASTEXITCODE -ne 0) { throw "git checkout $Branch failed (exit $LASTEXITCODE)" }
git -c windows.appendAtomically=false pull origin $Branch
if ($LASTEXITCODE -ne 0) { throw "git pull failed (exit $LASTEXITCODE)" }
} finally {
Pop-Location
}
$didUpdate = $true
git -c windows.appendAtomically=false fetch origin
git -c windows.appendAtomically=false checkout $Branch
git -c windows.appendAtomically=false pull origin $Branch
Pop-Location
} else {
# Directory exists but isn't a usable git repo. Wipe it and
# fall through to a fresh clone. A leftover ``.git`` stub from
# a partial uninstall used to lock the installer into the
# "update" branch forever, emitting three ``fatal: not a git
# repository`` errors and failing with "not in a git directory".
Write-Warn "Existing directory at $InstallDir is not a valid git repo — replacing it."
try {
Remove-Item -Recurse -Force $InstallDir -ErrorAction Stop
} catch {
Write-Err "Could not remove $InstallDir : $_"
Write-Info "Close any programs that might be using files in $InstallDir (editors,"
Write-Info "terminals, running hermes processes) and try again."
throw
}
Write-Err "Directory exists but is not a git repository: $InstallDir"
Write-Info "Remove it or choose a different directory with -InstallDir"
throw "Directory exists but is not a git repository: $InstallDir"
}
}
if (-not $didUpdate) {
} else {
$cloneSuccess = $false
# Fix Windows git "copy-fd: write returned: Invalid argument" error.
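The two-probe repository validation described in the installer comments can be sketched in Python as follows. This is a translation for illustration, not the installer's code: `is_usable_repo` and `run_git` are hypothetical names, and the sketch treats a missing `git` binary the same as a failed probe.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def run_git(repo: Path, *args: str) -> Optional[subprocess.CompletedProcess]:
    """Run git inside repo; return None if git cannot be launched at all."""
    try:
        return subprocess.run(
            ["git", *args], cwd=repo, capture_output=True, text=True, check=False
        )
    except (FileNotFoundError, NotADirectoryError):
        return None


def is_usable_repo(install_dir: Path) -> bool:
    """True only when both probes succeed; a bare .git entry is not enough."""
    if not (install_dir / ".git").exists():
        return False
    rev = run_git(install_dir, "rev-parse", "--is-inside-work-tree")
    if rev is None or rev.returncode != 0 or rev.stdout.strip() != "true":
        return False
    status = run_git(install_dir, "status", "--short")
    return status is not None and status.returncode == 0


with tempfile.TemporaryDirectory() as tmp:
    empty_dir = Path(tmp) / "empty"
    empty_dir.mkdir()
    stub_dir = Path(tmp) / "stub"
    stub_dir.mkdir()
    # Simulate a leftover stub: a .git file that is not a valid gitfile.
    (stub_dir / ".git").write_text("garbage", encoding="utf-8")
    empty_result = is_usable_repo(empty_dir)
    stub_result = is_usable_repo(stub_dir)
```

The existence check alone would accept the stub directory; asking git itself is what rejects it.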
@@ -705,7 +446,7 @@ function Install-Repository {
if ($LASTEXITCODE -eq 0) { $cloneSuccess = $true }
} catch { }
$env:GIT_SSH_COMMAND = $null
if (-not $cloneSuccess) {
if (Test-Path $InstallDir) { Remove-Item -Recurse -Force $InstallDir -ErrorAction SilentlyContinue }
Write-Info "SSH failed, trying HTTPS..."
@@ -723,18 +464,18 @@ function Install-Repository {
$zipUrl = "https://github.com/NousResearch/hermes-agent/archive/refs/heads/$Branch.zip"
$zipPath = "$env:TEMP\hermes-agent-$Branch.zip"
$extractPath = "$env:TEMP\hermes-agent-extract"
Invoke-WebRequest -Uri $zipUrl -OutFile $zipPath -UseBasicParsing
if (Test-Path $extractPath) { Remove-Item -Recurse -Force $extractPath }
Expand-Archive -Path $zipPath -DestinationPath $extractPath -Force
# GitHub ZIPs extract to repo-branch/ subdirectory
$extractedDir = Get-ChildItem $extractPath -Directory | Select-Object -First 1
if ($extractedDir) {
New-Item -ItemType Directory -Force -Path (Split-Path $InstallDir) -ErrorAction SilentlyContinue | Out-Null
Move-Item $extractedDir.FullName $InstallDir -Force
Write-Success "Downloaded and extracted"
# Initialize git repo so updates work later
Push-Location $InstallDir
git -c windows.appendAtomically=false init 2>$null
@@ -742,10 +483,10 @@ function Install-Repository {
git remote add origin $RepoUrlHttps 2>$null
Pop-Location
Write-Success "Git repo initialized for future updates"
$cloneSuccess = $true
}
# Cleanup temp files
Remove-Item -Force $zipPath -ErrorAction SilentlyContinue
Remove-Item -Recurse -Force $extractPath -ErrorAction SilentlyContinue
@@ -758,7 +499,7 @@ function Install-Repository {
throw "Failed to download repository (tried git clone SSH, HTTPS, and ZIP)"
}
}
# Set per-repo config (harmless if it fails)
Push-Location $InstallDir
git -c windows.appendAtomically=false config windows.appendAtomically false 2>$null
@@ -772,7 +513,7 @@ function Install-Repository {
Write-Success "Submodules ready"
}
Pop-Location
Write-Success "Repository ready"
}
@@ -918,21 +659,13 @@ function Copy-ConfigTemplates {
Write-Info "~/.hermes/config.yaml already exists, keeping it"
}
# Create SOUL.md if it doesn't exist (global persona file).
# IMPORTANT: write without a BOM. Windows PowerShell 5.1's
# ``Set-Content -Encoding UTF8`` writes UTF-8 WITH a byte-order-mark
# (the default PS5 behaviour), and Hermes's prompt-injection scanner
# flags the BOM as an invisible unicode character and refuses to
# load the file. PS7's ``-Encoding utf8NoBOM`` fixes that but we
# don't control which PowerShell version the user has. Go direct
# to .NET with an explicit UTF8Encoding($false) — BOM-free on every
# PowerShell version.
# Create SOUL.md if it doesn't exist (global persona file)
$soulPath = "$HermesHome\SOUL.md"
if (-not (Test-Path $soulPath)) {
$soulContent = @"
@"
# Hermes Agent Persona
<!--
<!--
This file defines the agent's personality and tone.
The agent will embody whatever you write here.
Edit this to customize how Hermes communicates with you.
@@ -945,9 +678,7 @@ Examples:
This file is loaded fresh each message -- no restart needed.
Delete the contents (or this file) to use the default personality.
-->
"@
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($soulPath, $soulContent, $utf8NoBom)
"@ | Set-Content -Path $soulPath -Encoding UTF8
Write-Success "Created ~/.hermes/SOUL.md (edit to customize personality)"
}
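The BOM issue described in the SOUL.md comment can be reproduced outside PowerShell. The sketch below uses Python's `utf-8-sig` codec to stand in for a writer that prepends a byte-order mark and plain `utf-8` for a BOM-free writer; the file names and the `has_bom` helper are hypothetical.

```python
import codecs
import tempfile
from pathlib import Path

CONTENT = "# Hermes Agent Persona\n"


def has_bom(data: bytes) -> bool:
    """Report whether the byte string starts with the UTF-8 byte-order mark."""
    return data.startswith(codecs.BOM_UTF8)


with tempfile.TemporaryDirectory() as tmp:
    with_bom = Path(tmp) / "with_bom.md"
    without_bom = Path(tmp) / "without_bom.md"
    # utf-8-sig prepends EF BB BF, mimicking a BOM-writing encoder.
    with_bom.write_bytes(CONTENT.encode("utf-8-sig"))
    # Plain utf-8 writes the text bytes only.
    without_bom.write_bytes(CONTENT.encode("utf-8"))
    bom_bytes = with_bom.read_bytes()
    plain_bytes = without_bom.read_bytes()

# Decoded as plain UTF-8, the BOM survives as an invisible U+FEFF character.
first_char = bom_bytes.decode("utf-8")[0]
```

A scanner that looks for invisible Unicode characters would see that leading U+FEFF in the first file and nothing unusual in the second.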
@@ -977,94 +708,36 @@ function Install-NodeDeps {
Write-Info "Skipping Node.js dependencies (Node not installed)"
return
}
# Resolve npm explicitly to npm.cmd, NOT npm.ps1. Node.js on Windows
# ships BOTH npm.cmd (a batch shim) and npm.ps1 (a PowerShell shim).
# Get-Command's default ordering picks whichever comes first in PATHEXT,
# and on many systems that's .ps1 — but .ps1 requires scripts to be
# enabled in PowerShell's execution policy, which most Windows users
# don't have (the Restricted / RemoteSigned default blocks unsigned
# .ps1 files). .cmd has no such restriction and works on every box.
#
# Strategy: look next to the npm shim we found and prefer npm.cmd if
# it exists in the same directory. Fall back to whatever Get-Command
# returned if we can't find a .cmd sibling.
$npmCmd = Get-Command npm -ErrorAction SilentlyContinue
if (-not $npmCmd) {
Write-Warn "npm not found on PATH — skipping Node.js dependencies."
Write-Info "Open a new PowerShell window and re-run 'hermes setup tools' later."
return
}
$npmExe = $npmCmd.Source
if ($npmExe -like "*.ps1") {
$npmCmdSibling = Join-Path (Split-Path $npmExe -Parent) "npm.cmd"
if (Test-Path $npmCmdSibling) {
Write-Info "Using npm.cmd (PowerShell execution policy blocks npm.ps1)"
$npmExe = $npmCmdSibling
} else {
Write-Warn "Only npm.ps1 available — install may fail if script execution is disabled."
Write-Info " If it fails, either enable PS script execution or install Node via winget."
}
}
# Helper: run "npm install" in a given directory and surface the real
# error when it fails. Returns $true on success.
#
# Implementation note: ``Start-Process -FilePath npm.cmd`` fails with
# ``%1 is not a valid Win32 application`` on some PowerShell versions
# because Start-Process bypasses cmd.exe / PATHEXT and expects a real
# PE file. The invocation-operator ``& $npmExe`` routes through the
# PowerShell command pipeline which DOES honour .cmd batch shims, so
# it works uniformly for npm.cmd, npx.cmd, and bare .exe files.
function _Run-NpmInstall([string]$label, [string]$installDir, [string]$logPath, [string]$npmPath) {
Push-Location $installDir
try {
# Redirect ALL output streams to the log file with ``*>`` and
# inspect $LASTEXITCODE afterwards to decide success or failure.
& $npmPath install --silent *> $logPath
$code = $LASTEXITCODE
if ($code -eq 0) {
Write-Success "$label dependencies installed"
Remove-Item -Force $logPath -ErrorAction SilentlyContinue
return $true
}
Write-Warn "$label npm install failed — exit code $code"
if (Test-Path $logPath) {
$errText = (Get-Content $logPath -Raw -ErrorAction SilentlyContinue)
if ($errText) {
$snippet = if ($errText.Length -gt 1200) { $errText.Substring(0, 1200) + "..." } else { $errText }
Write-Info " npm output:"
foreach ($line in $snippet -split "`n") {
Write-Host " $line" -ForegroundColor DarkGray
}
Write-Info " Full log: $logPath"
}
}
Write-Info "Run manually later: cd `"$installDir`"; npm install"
return $false
} catch {
Write-Warn "$label npm install could not be launched: $_"
return $false
} finally {
Pop-Location
}
}
# Browser tools
if (Test-Path "$InstallDir\package.json") {
Push-Location $InstallDir
if (Test-Path "package.json") {
Write-Info "Installing Node.js dependencies (browser tools)..."
$browserLog = "$env:TEMP\hermes-npm-browser-$(Get-Random).log"
[void](_Run-NpmInstall "Browser tools" $InstallDir $browserLog $npmExe)
try {
npm install --silent 2>&1 | Out-Null
Write-Success "Node.js dependencies installed"
} catch {
Write-Warn "npm install failed (browser tools may not work)"
}
}
# TUI
# Install TUI dependencies
$tuiDir = "$InstallDir\ui-tui"
if (Test-Path "$tuiDir\package.json") {
Write-Info "Installing TUI dependencies..."
$tuiLog = "$env:TEMP\hermes-npm-tui-$(Get-Random).log"
[void](_Run-NpmInstall "TUI" $tuiDir $tuiLog $npmExe)
Push-Location $tuiDir
try {
npm install --silent 2>&1 | Out-Null
Write-Success "TUI dependencies installed"
} catch {
Write-Warn "TUI npm install failed (hermes --tui may not work)"
}
Pop-Location
}
Pop-Location
}
function Invoke-SetupWizard {
@@ -1213,35 +886,13 @@ function Write-Completion {
function Main {
Write-Banner
# Windows refuses to delete a directory any shell is currently cd'd
# inside — and silently leaves orphan files behind, which then wedge
# "is this a valid git repo" probes on re-install. If the current
# working dir is under $InstallDir, step out to the user's home
# BEFORE doing anything else. Harmless when the user ran the
# installer from somewhere else.
try {
$currentResolved = (Get-Location).ProviderPath
$installResolved = $null
if (Test-Path $InstallDir) {
$installResolved = (Resolve-Path $InstallDir -ErrorAction SilentlyContinue).ProviderPath
}
if ($installResolved -and $currentResolved.ToLower().StartsWith($installResolved.ToLower())) {
Write-Info "Stepping out of $InstallDir so Windows can replace files there if needed..."
Set-Location $env:USERPROFILE
}
} catch {}
if (-not (Install-Uv)) { throw "uv installation failed — cannot continue" }
if (-not (Test-Python)) { throw "Python $PythonVersion not available — cannot continue" }
if (-not (Install-Git)) { throw "Git not available and auto-install failed — install from https://git-scm.com/download/win then re-run" }
# Test-Node always returns $true (sets $script:HasNode on success, emits a
# warning on failure and continues so non-browser installs still work).
# Cast to [void] so the bare return value doesn't print "True" to the
# console between the "Node found" line and the next installer step.
[void](Test-Node)
if (-not (Test-Git)) { throw "Git not found — install from https://git-scm.com/download/win" }
Test-Node # Auto-installs if missing
Install-SystemPackages # ripgrep + ffmpeg in one step
Install-Repository
Install-Venv
Install-Dependencies
@@ -1250,7 +901,7 @@ function Main {
Copy-ConfigTemplates
Invoke-SetupWizard
Start-GatewayIfConfigured
Write-Completion
}
+2 -2
@@ -111,7 +111,7 @@ def summarize(log: Path, since_ts_ms: int) -> dict[str, Any]:
frame_events: list[dict[str, Any]] = []
if not log.exists():
return {"error": f"no log at {log}", "react": [], "frame": []}
for line in log.read_text(encoding="utf-8").splitlines():
for line in log.read_text().splitlines():
line = line.strip()
if not line:
continue
@@ -505,7 +505,7 @@ def main() -> int:
if args.save:
path = Path(f"/tmp/perf-{args.save}.json")
path.write_text(json.dumps(metrics, indent=2), encoding="utf-8")
path.write_text(json.dumps(metrics, indent=2))
print(f"\n• saved: {path}")
if args.compare:
+5 -1
@@ -47,6 +47,7 @@ AUTHOR_MAP = {
"qiyin.zuo@pcitc.com": "qiyin-code",
"oleksii.lisikh@gmail.com": "olisikh",
"leone.parise@gmail.com": "leoneparise",
"buraysandro9@gmail.com": "ygd58",
"teknium@nousresearch.com": "teknium1",
"piyushvp1@gmail.com": "thelumiereguy",
"harish.kukreja@gmail.com": "counterposition",
@@ -58,6 +59,7 @@ AUTHOR_MAP = {
"223003280+Abd0r@users.noreply.github.com": "Abd0r",
"abdielv@proton.me": "AJV20",
"mason@growagainorchids.com": "masonjames",
"ytchen0719@gmail.com": "liquidchen",
"am@studio1.tailb672fe.ts.net": "subtract0",
"axmaiqiu@gmail.com": "qWaitCrypto",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
@@ -429,6 +431,7 @@ AUTHOR_MAP = {
"johnsonblake1@gmail.com": "voteblake",
"hcn518@gmail.com": "pedh",
"haileymarshall005@gmail.com": "haileymarshall",
"bennet.yr.wang@gmail.com": "BennetYrWang",
"greer.guthrie@gmail.com": "g-guthrie",
"kennyx102@gmail.com": "bobashopcashier",
"77253505+bobashopcashier@users.noreply.github.com": "bobashopcashier",
@@ -694,6 +697,7 @@ AUTHOR_MAP = {
"mike@mikewaters.net": "mikewaters",
"65117428+WadydX@users.noreply.github.com": "WadydX",
"216480837+isaachuangGMICLOUD@users.noreply.github.com": "isaachuangGMICLOUD",
"isaac.h@gmicloud.ai": "isaachuangGMICLOUD",
"nukuom976228@gmail.com": "hsy5571616",
"11462216+Nan93@users.noreply.github.com": "Nan93",
"l973401489@126.com": "zhouxiaoya12",
@@ -1359,7 +1363,7 @@ def main():
)
if args.output:
Path(args.output).write_text(changelog, encoding="utf-8")
Path(args.output).write_text(changelog)
print(f"Changelog written to {args.output}")
else:
print(changelog)
@@ -130,7 +130,33 @@ def _ensure_deps():
sys.exit(1)
def check_auth():
def check_auth_live():
"""Check auth with a real API call to detect disabled_client/account issues."""
# quiet=True suppresses the "AUTHENTICATED" print from check_auth so the
# final status line reflects the live-call outcome (OK or FAILED).
if not check_auth(quiet=True):
return False
try:
from googleapiclient.discovery import build
from google.oauth2.credentials import Credentials
creds = Credentials.from_authorized_user_file(str(TOKEN_PATH))
service = build("calendar", "v3", credentials=creds)
service.calendarList().list(maxResults=1).execute()
print("LIVE_CHECK_OK: Real API call succeeded.")
return True
except Exception as e:
err_str = str(e).lower()
if "disabled_client" in err_str or "invalid_client" in err_str:
print(f"LIVE_CHECK_FAILED: OAuth client or account disabled: {e}")
print(" 1. Check Google Cloud Console for disabled OAuth client")
print(" 2. Check myaccount.google.com for account status")
print(" 3. Do NOT retry with a disabled account")
else:
print(f"LIVE_CHECK_FAILED: {e}")
return False
def check_auth(quiet: bool = False):
"""Check if stored credentials are valid. Prints status and returns True or False; the CLI maps that to exit code 0 or 1."""
if not TOKEN_PATH.exists():
print(f"NOT_AUTHENTICATED: No token at {TOKEN_PATH}")
@@ -157,7 +183,8 @@ def check_auth():
print(f"AUTHENTICATED (partial): Token valid but missing {len(missing_scopes)} scopes:")
for s in missing_scopes:
print(f" - {s}")
print(f"AUTHENTICATED: Token valid at {TOKEN_PATH}")
if not quiet:
print(f"AUTHENTICATED: Token valid at {TOKEN_PATH}")
return True
if creds.expired and creds.refresh_token:
@@ -174,10 +201,25 @@ def check_auth():
print(f"AUTHENTICATED (partial): Token refreshed but missing {len(missing_scopes)} scopes:")
for s in missing_scopes:
print(f" - {s}")
print(f"AUTHENTICATED: Token refreshed at {TOKEN_PATH}")
if not quiet:
print(f"AUTHENTICATED: Token refreshed at {TOKEN_PATH}")
return True
except Exception as e:
print(f"REFRESH_FAILED: {e}")
err_str = str(e).lower()
if "disabled_client" in err_str or "invalid_client" in err_str:
print(f"OAUTH_CLIENT_DISABLED: {e}")
print(" The OAuth client or Google account has been disabled.")
print(" Steps to resolve:")
print(" 1. Check your Google Cloud Console — verify the OAuth client is not disabled")
print(" 2. Check if your Google account itself has been disabled at myaccount.google.com")
print(" 3. If the account is disabled, you can appeal at accounts.google.com/signin/recovery")
print(" 4. Do NOT retry API calls with a disabled account — this may worsen the situation")
print(" 5. If the OAuth client is disabled, create a new one in Google Cloud Console")
elif "token_revoked" in err_str or "invalid_grant" in err_str:
print(f"TOKEN_REVOKED: {e}")
print(" Re-run setup to re-authenticate.")
else:
print(f"REFRESH_FAILED: {e}")
return False
print("TOKEN_INVALID: Re-run setup.")
@@ -384,6 +426,7 @@ def main():
parser = argparse.ArgumentParser(description="Google Workspace OAuth setup for Hermes")
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("--check", action="store_true", help="Check if auth is valid (exit 0=yes, 1=no)")
group.add_argument("--check-live", action="store_true", help="Check auth with a real API call (detects disabled_client)")
group.add_argument("--client-secret", metavar="PATH", help="Store OAuth client_secret.json")
group.add_argument("--auth-url", action="store_true", help="Print OAuth URL for user to visit")
group.add_argument("--auth-code", metavar="CODE", help="Exchange auth code for token")
@@ -393,6 +436,8 @@ def main():
if args.check:
sys.exit(0 if check_auth() else 1)
if getattr(args, "check_live", False):
sys.exit(0 if check_auth_live() else 1)
elif args.client_secret:
store_client_secret(args.client_secret)
elif args.auth_url:
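The substring-based error branching used in the refresh and live-check handlers can be factored into one small function. This is a sketch, not part of the diff: `classify_auth_error` is a hypothetical helper, and the labels simply reuse the status prefixes printed by the refresh handler.

```python
def classify_auth_error(exc: Exception) -> str:
    """Map an exception message to one of the status labels used above."""
    err = str(exc).lower()
    if "disabled_client" in err or "invalid_client" in err:
        return "OAUTH_CLIENT_DISABLED"
    if "token_revoked" in err or "invalid_grant" in err:
        return "TOKEN_REVOKED"
    return "REFRESH_FAILED"


# Illustrative messages only; real error text from the Google client may differ.
disabled = classify_auth_error(RuntimeError("('disabled_client: The OAuth client was disabled.')"))
revoked = classify_auth_error(RuntimeError("invalid_grant: Token has been expired or revoked."))
other = classify_auth_error(TimeoutError("connection timed out"))
```

Matching on lowercase substrings is deliberately loose, so a change in the upstream error wording would fall through to the generic label rather than raise.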
+89
@@ -351,6 +351,95 @@ class TestResolveDeliveryTarget:
assert _resolve_delivery_targets({"deliver": []}) == []
class TestRoutingIntents:
"""``all`` routing intent expands at fire time."""
def test_all_expands_to_every_connected_home_channel(self, monkeypatch):
"""deliver='all' fans out to every platform with a configured home channel."""
from cron.scheduler import _resolve_delivery_targets
monkeypatch.setenv("TELEGRAM_HOME_CHANNEL", "-111")
monkeypatch.setenv("DISCORD_HOME_CHANNEL", "-222")
monkeypatch.setenv("SLACK_HOME_CHANNEL", "C333")
# Sanity: platforms without the env var must NOT appear in the expansion.
monkeypatch.delenv("SIGNAL_HOME_CHANNEL", raising=False)
monkeypatch.delenv("MATRIX_HOME_ROOM", raising=False)
targets = _resolve_delivery_targets({"deliver": "all", "origin": None})
platforms = sorted(t["platform"] for t in targets)
assert "telegram" in platforms
assert "discord" in platforms
assert "slack" in platforms
assert "signal" not in platforms
assert "matrix" not in platforms
def test_all_combines_with_explicit_target_and_dedups(self, monkeypatch):
"""'telegram:-999,all' yields every home channel + the explicit target without dupes."""
from cron.scheduler import _resolve_delivery_targets
monkeypatch.setenv("TELEGRAM_HOME_CHANNEL", "-111")
monkeypatch.setenv("DISCORD_HOME_CHANNEL", "-222")
# Explicit telegram target precedes 'all'. Expansion adds discord;
# the dedup pass collapses any (platform, chat_id, thread_id) repeats.
job = {"deliver": "telegram:-999,all", "origin": None}
targets = _resolve_delivery_targets(job)
platforms = sorted(t["platform"].lower() for t in targets)
assert "telegram" in platforms
assert "discord" in platforms
# Every target is unique on (platform, chat_id, thread_id).
keys = [(t["platform"].lower(), str(t["chat_id"]), t.get("thread_id")) for t in targets]
assert len(keys) == len(set(keys))
def test_all_with_no_connected_channels_returns_empty(self, monkeypatch):
"""deliver='all' with nothing connected returns [] — delivery is recorded as failed upstream."""
from cron.scheduler import _resolve_delivery_targets
for var in ("TELEGRAM_HOME_CHANNEL", "DISCORD_HOME_CHANNEL", "SLACK_HOME_CHANNEL",
"SIGNAL_HOME_CHANNEL", "MATRIX_HOME_ROOM", "MATTERMOST_HOME_CHANNEL",
"SMS_HOME_CHANNEL", "EMAIL_HOME_ADDRESS", "DINGTALK_HOME_CHANNEL",
"FEISHU_HOME_CHANNEL", "WECOM_HOME_CHANNEL", "WEIXIN_HOME_CHANNEL",
"BLUEBUBBLES_HOME_CHANNEL", "QQBOT_HOME_CHANNEL", "QQ_HOME_CHANNEL"):
monkeypatch.delenv(var, raising=False)
assert _resolve_delivery_targets({"deliver": "all", "origin": None}) == []
def test_origin_comma_all_preserves_origin_first(self, monkeypatch):
"""'origin,all' delivers to the origin platform plus every other home channel."""
from cron.scheduler import _resolve_delivery_targets
monkeypatch.setenv("TELEGRAM_HOME_CHANNEL", "-111")
monkeypatch.setenv("DISCORD_HOME_CHANNEL", "-222")
job = {
"deliver": "origin,all",
"origin": {"platform": "discord", "chat_id": "888"},
}
targets = _resolve_delivery_targets(job)
platforms = sorted(t["platform"].lower() for t in targets)
assert "telegram" in platforms
assert "discord" in platforms
# The origin's explicit chat_id (888) wins the dedup race over the
# discord home channel (-222) because origin is resolved first.
discord = next(t for t in targets if t["platform"].lower() == "discord")
assert discord["chat_id"] == "888"
def test_all_token_case_insensitive(self, monkeypatch):
"""'ALL' / 'All' / 'all' are all recognized."""
from cron.scheduler import _resolve_delivery_targets
monkeypatch.setenv("TELEGRAM_HOME_CHANNEL", "-111")
monkeypatch.setenv("DISCORD_HOME_CHANNEL", "-222")
for token in ("ALL", "All", "all"):
targets = _resolve_delivery_targets({"deliver": token, "origin": None})
platforms = sorted(t["platform"].lower() for t in targets)
assert platforms == ["discord", "telegram"], f"token={token!r} -> {platforms}"
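Reviewer note: the dedup behaviour these tests rely on can be sketched as a first-seen-wins pass over the resolved targets. This is only an illustration under stated assumptions: `dedup_targets` is a hypothetical helper, not the body of `_resolve_delivery_targets`, and the key normalisation (lower-cased platform, stringified chat_id) simply mirrors what the assertions above compare on.

```python
from typing import Any, Dict, Iterable, List

Target = Dict[str, Any]


def dedup_targets(targets: Iterable[Target]) -> List[Target]:
    """Keep the first target seen for each (platform, chat_id, thread_id).

    Hypothetical helper: order is preserved, so targets resolved earlier
    (for example an explicit target or the origin) win over later repeats.
    """
    seen = set()
    unique: List[Target] = []
    for target in targets:
        key = (
            str(target["platform"]).lower(),
            str(target["chat_id"]),
            target.get("thread_id"),
        )
        if key in seen:
            continue  # a repeat of something already scheduled for delivery
        seen.add(key)
        unique.append(target)
    return unique


resolved = dedup_targets([
    {"platform": "telegram", "chat_id": "-999"},
    {"platform": "Telegram", "chat_id": -999},   # same key after normalisation
    {"platform": "discord", "chat_id": "-222"},
])
```

Because the pass keeps the first occurrence, the position of an entry in the `deliver` string decides which of two identical keys survives.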
class TestDeliverResultWrapping:
"""Verify that cron deliveries are wrapped with header/footer and no longer mirrored."""
+147
@@ -0,0 +1,147 @@
from __future__ import annotations
from types import SimpleNamespace
import pytest
from gateway.config import Platform
from gateway.platforms.base import MessageEvent, MessageType
from gateway.run import GatewayRunner
from gateway.session import SessionSource
from hermes_cli.goals import CONTINUATION_PROMPT_TEMPLATE
class FakeAdapter:
def __init__(self):
self.calls = []
self.callbacks = {}
self._active_sessions = {}
async def send(self, chat_id, content, reply_to=None, metadata=None):
self.calls.append(
{
"chat_id": chat_id,
"content": content,
"reply_to": reply_to,
"metadata": metadata,
}
)
return SimpleNamespace(success=True)
def register_post_delivery_callback(self, session_key, callback, *, generation=None):
self.callbacks[session_key] = (generation, callback)
def _goal_continuation_event(source, goal="finish the task"):
return MessageEvent(
text=CONTINUATION_PROMPT_TEMPLATE.format(goal=goal),
message_type=MessageType.TEXT,
source=source,
)
@pytest.mark.asyncio
async def test_goal_status_notice_uses_adapter_send_with_thread_metadata():
"""Regression: /goal judge status must use BasePlatformAdapter.send().
The old implementation checked for a non-existent send_message() method,
so the goal could be marked done in state_meta without the visible
"✓ Goal achieved" status line being delivered to Discord/Telegram.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
runner.adapters = {Platform.DISCORD: adapter}
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
)
await runner._send_goal_status_notice(source, "✓ Goal achieved: done")
assert adapter.calls == [
{
"chat_id": "parent-channel",
"content": "✓ Goal achieved: done",
"reply_to": None,
"metadata": {"thread_id": "thread-123"},
}
]
@pytest.mark.asyncio
async def test_goal_status_notice_defers_until_post_delivery_callback():
"""Regression: goal status must appear after the agent's visible reply.
_post_turn_goal_continuation runs before BasePlatformAdapter sends the
returned final response. It should therefore register a post-delivery
callback, not send the judge status immediately.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
runner.adapters = {Platform.DISCORD: adapter}
runner.config = SimpleNamespace(group_sessions_per_user=True, thread_sessions_per_user=False)
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
user_id="user-1",
)
await runner._defer_goal_status_notice_after_delivery(source, "✓ Goal achieved: done")
assert adapter.calls == []
assert len(adapter.callbacks) == 1
_, callback = next(iter(adapter.callbacks.values()))
result = callback()
if hasattr(result, "__await__"):
await result
assert adapter.calls == [
{
"chat_id": "parent-channel",
"content": "✓ Goal achieved: done",
"reply_to": None,
"metadata": {"thread_id": "thread-123"},
}
]
def test_clear_goal_pending_continuations_removes_slot_and_overflow_only():
"""Regression: /goal pause/clear must cancel queued self-continuations.
A user-issued /goal pause can arrive after the judge queued the next
continuation but before that queued turn runs. The queued synthetic goal
continuation should be removed without dropping normal user /queue items.
"""
runner = GatewayRunner.__new__(GatewayRunner)
adapter = FakeAdapter()
adapter._pending_messages = {}
runner._queued_events = {}
source = SessionSource(
platform=Platform.DISCORD,
chat_id="parent-channel",
thread_id="thread-123",
)
session_key = "discord:parent-channel:thread-123"
normal_event = MessageEvent(
text="normal queued user message",
message_type=MessageType.TEXT,
source=source,
)
adapter._pending_messages[session_key] = _goal_continuation_event(source)
runner._queued_events[session_key] = [
normal_event,
_goal_continuation_event(source, goal="second continuation"),
]
removed = runner._clear_goal_pending_continuations(session_key, adapter)
assert removed == 2
assert adapter._pending_messages.get(session_key) is None
assert runner._queued_events[session_key] == [normal_event]
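Reviewer note: the last test above expects the pending slot and the overflow queue to lose only synthetic goal continuations, and the return value to count what was removed. A minimal sketch of that filtering follows. Everything in it is an assumption made for illustration: `Event`, `GOAL_PREFIX`, `is_goal_continuation` and `clear_goal_continuations` are hypothetical stand-ins, and the real code works with `MessageEvent` and `CONTINUATION_PROMPT_TEMPLATE` rather than a literal prefix.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical marker used only in this sketch to recognise synthetic events.
GOAL_PREFIX = "[goal-continuation] "


@dataclass
class Event:
    text: str


def is_goal_continuation(event: Optional[Event]) -> bool:
    return event is not None and event.text.startswith(GOAL_PREFIX)


def clear_goal_continuations(
    session_key: str,
    pending: Dict[str, Optional[Event]],
    queued: Dict[str, List[Event]],
) -> int:
    """Drop synthetic continuations for one session; return how many went."""
    removed = 0
    # 1. The single pending slot held for the session.
    if is_goal_continuation(pending.get(session_key)):
        pending.pop(session_key)
        removed += 1
    # 2. The overflow queue: keep ordinary user messages in their order.
    overflow = queued.get(session_key, [])
    kept = [event for event in overflow if not is_goal_continuation(event)]
    removed += len(overflow) - len(kept)
    if session_key in queued:
        queued[session_key] = kept
    return removed
```

Filtering into a new list rather than deleting in place keeps the relative order of the user's own queued messages intact.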
+15 -11
@@ -61,8 +61,9 @@ class _RecordingAdapter:
return _R()
def _make_runner_with_adapter():
def _make_runner_with_adapter(session_id: str = None):
from gateway.run import GatewayRunner
import uuid
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
@@ -74,9 +75,12 @@ def _make_runner_with_adapter():
runner._queued_events = {}
src = _make_source()
# Default to a unique session_id so xdist parallel runs on the same worker
# don't see each other's GoalManager state (DEFAULT_DB_PATH gets frozen at
# module-import time, defeating per-test HERMES_HOME monkeypatches).
session_entry = SessionEntry(
session_key=build_session_key(src),
session_id="goal-sess-1",
session_id=session_id or f"goal-sess-{uuid.uuid4().hex[:8]}",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
@@ -103,8 +107,8 @@ async def test_goal_verdict_done_sent_via_adapter_send(hermes_home):
mgr = GoalManager(session_entry.session_id)
mgr.set("ship the feature")
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="I shipped the feature.",
@@ -132,8 +136,8 @@ async def test_goal_verdict_continue_enqueues_continuation(hermes_home):
mgr = GoalManager(session_entry.session_id)
mgr.set("polish the docs")
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="here's a partial edit",
@@ -160,8 +164,8 @@ async def test_goal_verdict_budget_exhausted_sends_pause(hermes_home):
state.turns_used = 2
save_goal(session_entry.session_id, state)
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going")):
runner._post_turn_goal_continuation(
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going", False)):
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="still partial",
@@ -181,7 +185,7 @@ async def test_goal_verdict_skipped_when_no_active_goal(hermes_home):
"""No goal set → the hook is a no-op. Nothing is sent, nothing enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
runner._post_turn_goal_continuation(
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="anything",
@@ -207,9 +211,9 @@ async def test_goal_verdict_survives_adapter_without_send(hermes_home):
runner.adapters[Platform.TELEGRAM] = _NoSendAdapter()
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok")):
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok", False)):
# must not raise
runner._post_turn_goal_continuation(
await runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="whatever",
+40 -1
@@ -1,7 +1,6 @@
"""Regression tests for Nous OAuth refresh + agent-key mint interactions."""
import json
import os
from datetime import datetime, timezone
from pathlib import Path
@@ -862,6 +861,46 @@ def test_refresh_token_reuse_detection_surfaces_actionable_message():
assert exc_info.value.relogin_required is True
def test_refresh_token_exchange_sends_refresh_token_header():
"""Nous refresh tokens must be sent in a header so sandbox proxies can
substitute placeholder credentials without parsing form bodies.
"""
from hermes_cli.auth import _refresh_access_token
class _FakeResponse:
status_code = 200
def json(self):
return {"access_token": "access-2", "refresh_token": "refresh-2"}
class _FakeClient:
def __init__(self):
self.kwargs = None
def post(self, *args, **kwargs):
del args
self.kwargs = kwargs
return _FakeResponse()
client = _FakeClient()
payload = _refresh_access_token(
client=client,
portal_base_url="https://portal.nousresearch.com",
client_id="hermes-cli",
refresh_token="refresh-1",
)
assert payload["access_token"] == "access-2"
assert payload["refresh_token"] == "refresh-2"
assert client.kwargs is not None
assert client.kwargs["headers"]["x-nous-refresh-token"] == "refresh-1"
assert client.kwargs["data"] == {
"grant_type": "refresh_token",
"client_id": "hermes-cli",
}
def test_refresh_non_reuse_error_keeps_original_description():
"""Non-reuse invalid_grant errors must keep their original description untouched.
+16
@@ -284,6 +284,22 @@ class TestGmiAuxiliary:
assert model == "google/gemini-3.1-flash-lite-preview"
assert mock_openai.call_args.kwargs["api_key"] == "gmi-test-key"
assert mock_openai.call_args.kwargs["base_url"] == "https://api.gmi-serving.com/v1"
# GMI profile declares default_headers with a HermesAgent User-Agent
# for traffic attribution. The generic profile-fallback branch in
# resolve_provider_client should carry it through to the OpenAI client.
headers = mock_openai.call_args.kwargs.get("default_headers", {})
assert headers.get("User-Agent", "").startswith("HermesAgent/")
def test_gmi_profile_declares_hermes_user_agent(self):
"""The GMI plugin sets a HermesAgent/<ver> User-Agent on its profile."""
from providers import get_provider_profile
profile = get_provider_profile("gmi")
assert profile is not None
ua = profile.default_headers.get("User-Agent", "")
assert ua.startswith("HermesAgent/"), (
f"expected GMI profile User-Agent to start with 'HermesAgent/', got {ua!r}"
)
def test_resolve_provider_client_accepts_gmi_alias(self, monkeypatch):
monkeypatch.setenv("GMI_API_KEY", "gmi-test-key")
+175 -17
@@ -40,14 +40,14 @@ class TestParseJudgeResponse:
def test_clean_json_done(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response('{"done": true, "reason": "all good"}')
done, reason, _ = _parse_judge_response('{"done": true, "reason": "all good"}')
assert done is True
assert reason == "all good"
def test_clean_json_continue(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response('{"done": false, "reason": "more work needed"}')
done, reason, _ = _parse_judge_response('{"done": false, "reason": "more work needed"}')
assert done is False
assert reason == "more work needed"
@@ -55,7 +55,7 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
raw = '```json\n{"done": true, "reason": "done"}\n```'
done, reason = _parse_judge_response(raw)
done, reason, _ = _parse_judge_response(raw)
assert done is True
assert "done" in reason
@@ -64,7 +64,7 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
raw = 'Looking at this... the agent says X. Verdict: {"done": false, "reason": "partial"}'
done, reason = _parse_judge_response(raw)
done, reason, _ = _parse_judge_response(raw)
assert done is False
assert reason == "partial"
@@ -72,24 +72,24 @@ class TestParseJudgeResponse:
from hermes_cli.goals import _parse_judge_response
for s in ("true", "yes", "done", "1"):
done, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
done, _, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
assert done is True
for s in ("false", "no", "not yet"):
done, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
done, _, _ = _parse_judge_response(f'{{"done": "{s}", "reason": "r"}}')
assert done is False
def test_malformed_json_fails_open(self):
"""Non-JSON → not done, with error-ish reason (so judge_goal can map to continue)."""
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response("this is not json at all")
done, reason, _ = _parse_judge_response("this is not json at all")
assert done is False
assert reason # non-empty
def test_empty_response(self):
from hermes_cli.goals import _parse_judge_response
done, reason = _parse_judge_response("")
done, reason, _ = _parse_judge_response("")
assert done is False
assert reason
@@ -103,13 +103,13 @@ class TestJudgeGoal:
def test_empty_goal_skipped(self):
from hermes_cli.goals import judge_goal
verdict, _ = judge_goal("", "some response")
verdict, _, _ = judge_goal("", "some response")
assert verdict == "skipped"
def test_empty_response_continues(self):
from hermes_cli.goals import judge_goal
verdict, _ = judge_goal("ship the thing", "")
verdict, _, _ = judge_goal("ship the thing", "")
assert verdict == "continue"
def test_no_aux_client_continues(self):
@@ -120,7 +120,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, None),
):
verdict, _ = goals.judge_goal("my goal", "my response")
verdict, _, _ = goals.judge_goal("my goal", "my response")
assert verdict == "continue"
def test_api_error_continues(self):
@@ -133,7 +133,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "response")
verdict, reason, _ = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert "judge error" in reason.lower()
@@ -152,7 +152,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "agent response")
verdict, reason, _ = goals.judge_goal("goal", "agent response")
assert verdict == "done"
assert reason == "achieved"
@@ -171,7 +171,7 @@ class TestJudgeGoal:
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, reason = goals.judge_goal("goal", "agent response")
verdict, reason, _ = goals.judge_goal("goal", "agent response")
assert verdict == "continue"
assert reason == "not yet"
@@ -260,7 +260,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-1")
mgr.set("ship it")
with patch.object(goals, "judge_goal", return_value=("done", "shipped")):
with patch.object(goals, "judge_goal", return_value=("done", "shipped", False)):
decision = mgr.evaluate_after_turn("I shipped the feature.")
assert decision["verdict"] == "done"
@@ -276,7 +276,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-2", default_max_turns=5)
mgr.set("a long goal")
with patch.object(goals, "judge_goal", return_value=("continue", "more work")):
with patch.object(goals, "judge_goal", return_value=("continue", "more work", False)):
decision = mgr.evaluate_after_turn("made some progress")
assert decision["verdict"] == "continue"
@@ -294,7 +294,7 @@ class TestGoalManager:
mgr = GoalManager(session_id="eval-sid-3", default_max_turns=2)
mgr.set("hard goal")
with patch.object(goals, "judge_goal", return_value=("continue", "not yet")):
with patch.object(goals, "judge_goal", return_value=("continue", "not yet", False)):
d1 = mgr.evaluate_after_turn("step 1")
assert d1["should_continue"] is True
assert mgr.state.turns_used == 1
@@ -356,3 +356,161 @@ def test_goal_command_dispatches_in_cli_registry_helpers():
assert "/goal" in COMMANDS
session_cmds = COMMANDS_BY_CATEGORY.get("Session", {})
assert "/goal" in session_cmds
# ──────────────────────────────────────────────────────────────────────
# Auto-pause on consecutive judge parse failures
# ──────────────────────────────────────────────────────────────────────
class TestJudgeParseFailureAutoPause:
"""Regression: weak judge models (e.g. deepseek-v4-flash) that return
empty strings or non-JSON prose must auto-pause the loop after N turns
instead of burning the whole turn budget."""
def test_parse_response_flags_empty_as_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, reason, parse_failed = _parse_judge_response("")
assert done is False
assert parse_failed is True
assert "empty" in reason.lower()
def test_parse_response_flags_non_json_as_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, reason, parse_failed = _parse_judge_response(
"Let me analyze whether the goal is fully satisfied based on the agent's response..."
)
assert done is False
assert parse_failed is True
assert "not json" in reason.lower()
def test_parse_response_clean_json_is_not_parse_failure(self):
from hermes_cli.goals import _parse_judge_response
done, _, parse_failed = _parse_judge_response(
'{"done": false, "reason": "more work"}'
)
assert done is False
assert parse_failed is False
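Reviewer note: the three-value contract exercised above (done, reason, parse_failed) could look roughly like the sketch below. It is an assumption-laden illustration, not the body of `_parse_judge_response`: the function name `parse_judge_reply`, the regex used to pull a JSON object out of surrounding prose, and the exact reason strings are all hypothetical.

```python
import json
import re
from typing import Tuple

# String values treated as "done", mirroring the ones the tests feed in.
_TRUTHY = {"true", "yes", "done", "1"}


def parse_judge_reply(raw: str) -> Tuple[bool, str, bool]:
    """Return (done, reason, parse_failed) for a raw judge reply."""
    text = (raw or "").strip()
    if not text:
        return False, "judge returned empty response", True
    # Take the outermost {...} so fenced or prose-wrapped JSON still parses.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return False, "judge reply was not JSON", True
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return False, "judge reply was not JSON", True
    done = data.get("done", False)
    if isinstance(done, str):
        done = done.strip().lower() in _TRUTHY
    return bool(done), str(data.get("reason", "")), False
```

Returning the flag separately from the verdict lets the caller keep failing open (continue) while still counting how often the judge produced nothing usable.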
def test_api_error_does_not_count_as_parse_failure(self):
"""Transient network/API errors must not trip the auto-pause guard."""
from hermes_cli import goals
fake_client = MagicMock()
fake_client.chat.completions.create.side_effect = RuntimeError("connection reset")
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, _, parse_failed = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert parse_failed is False
def test_empty_judge_reply_flagged_as_parse_failure(self):
"""End-to-end: judge returns empty content → parse_failed=True."""
from hermes_cli import goals
fake_client = MagicMock()
fake_client.chat.completions.create.return_value = MagicMock(
choices=[MagicMock(message=MagicMock(content=""))]
)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(fake_client, "judge-model"),
):
verdict, _, parse_failed = goals.judge_goal("goal", "response")
assert verdict == "continue"
assert parse_failed is True
def test_auto_pause_after_three_consecutive_parse_failures(self, hermes_home):
"""N=3 consecutive parse failures → auto-pause with config pointer."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager, DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES
assert DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES == 3
mgr = GoalManager(session_id="parse-fail-sid-1", default_max_turns=20)
mgr.set("do a thing")
with patch.object(
goals, "judge_goal", return_value=("continue", "judge returned empty response", True)
):
d1 = mgr.evaluate_after_turn("step 1")
assert d1["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 1
d2 = mgr.evaluate_after_turn("step 2")
assert d2["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 2
d3 = mgr.evaluate_after_turn("step 3")
assert d3["should_continue"] is False
assert d3["status"] == "paused"
assert mgr.state.consecutive_parse_failures == 3
# Message points at the config surface so the user can fix it.
assert "auxiliary" in d3["message"]
assert "goal_judge" in d3["message"]
assert "config.yaml" in d3["message"]
def test_parse_failure_counter_resets_on_good_reply(self, hermes_home):
"""A single good judge reply resets the counter — transient flakes don't pause."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_id="parse-fail-sid-2", default_max_turns=20)
mgr.set("another goal")
# Two parse failures…
with patch.object(
goals, "judge_goal", return_value=("continue", "not json", True)
):
mgr.evaluate_after_turn("step 1")
mgr.evaluate_after_turn("step 2")
assert mgr.state.consecutive_parse_failures == 2
# …then one clean reply resets the counter.
with patch.object(
goals, "judge_goal", return_value=("continue", "making progress", False)
):
d = mgr.evaluate_after_turn("step 3")
assert d["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 0
def test_parse_failure_counter_not_incremented_by_api_errors(self, hermes_home):
"""API/transport errors must NOT count toward the auto-pause threshold."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_id="parse-fail-sid-3", default_max_turns=20)
mgr.set("goal")
with patch.object(
goals, "judge_goal", return_value=("continue", "judge error: RuntimeError", False)
):
for _ in range(5):
d = mgr.evaluate_after_turn("still going")
assert d["should_continue"] is True
assert mgr.state.consecutive_parse_failures == 0
assert mgr.state.status == "active"
def test_consecutive_parse_failures_persists_across_goalmanager_reloads(
self, hermes_home
):
"""The counter must be durable so cross-session resumes see it."""
from hermes_cli import goals
from hermes_cli.goals import GoalManager, load_goal
mgr = GoalManager(session_id="parse-fail-sid-4", default_max_turns=20)
mgr.set("persistent goal")
with patch.object(
goals, "judge_goal", return_value=("continue", "empty", True)
):
mgr.evaluate_after_turn("r")
mgr.evaluate_after_turn("r")
reloaded = load_goal("parse-fail-sid-4")
assert reloaded is not None
assert reloaded.consecutive_parse_failures == 2
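Reviewer note: the counter semantics these tests pin down (increment on a parse failure, reset on any parseable reply, pause at the threshold) fit in a few lines. The sketch below is hypothetical: `GoalState` here is a cut-down stand-in and `record_judge_result` is an invented name. It also ignores the turn budget, the "done" verdict and persistence, all of which the real `GoalManager.evaluate_after_turn` handles.

```python
from dataclasses import dataclass

# Same value the tests assert for DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES.
MAX_CONSECUTIVE_PARSE_FAILURES = 3


@dataclass
class GoalState:
    status: str = "active"
    consecutive_parse_failures: int = 0


def record_judge_result(
    state: GoalState,
    parse_failed: bool,
    limit: int = MAX_CONSECUTIVE_PARSE_FAILURES,
) -> bool:
    """Update the counter; return True if the goal loop should continue."""
    if not parse_failed:
        # Any parseable reply (or an API error, which is not a parse
        # failure) clears the streak.
        state.consecutive_parse_failures = 0
        return True
    state.consecutive_parse_failures += 1
    if state.consecutive_parse_failures >= limit:
        state.status = "paused"
        return False
    return True
```

Counting only consecutive failures means one flaky reply never pauses the loop, while a judge model that cannot produce JSON at all stops after `limit` turns instead of consuming the whole budget.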
+1 -132
@@ -152,135 +152,4 @@ class TestRelaunch:
with pytest.raises(SystemExit):
relaunch_mod.relaunch(["--resume", "abc"])
assert calls == [("/usr/bin/hermes", ["/usr/bin/hermes", "--resume", "abc"])]
def test_windows_uses_subprocess_not_execvp(self, monkeypatch):
"""On Windows, os.execvp raises OSError "Exec format error" when the
target is a .cmd shim or console-script wrapper (both common for
hermes). relaunch() must detect win32 and use subprocess.run +
sys.exit instead."""
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod, "resolve_hermes_bin", lambda: r"C:\Users\test\hermes.exe")
import subprocess as _subprocess
captured_argv = []
def fake_subprocess_run(argv, **kwargs):
captured_argv.append(list(argv))
class _Result:
returncode = 0
return _Result()
monkeypatch.setattr(_subprocess, "run", fake_subprocess_run)
# execvp MUST NOT be called on Windows — route must go through subprocess
execvp_calls = []
def fake_execvp(*args, **kwargs):
execvp_calls.append(args)
raise AssertionError("os.execvp must not be called on Windows")
monkeypatch.setattr(relaunch_mod.os, "execvp", fake_execvp)
with pytest.raises(SystemExit) as exc_info:
relaunch_mod.relaunch(["chat"])
assert exc_info.value.code == 0
assert execvp_calls == []
assert captured_argv == [[r"C:\Users\test\hermes.exe", "chat"]]
def test_windows_propagates_child_exit_code(self, monkeypatch):
"""A non-zero exit from the child should flow through to sys.exit."""
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod, "resolve_hermes_bin", lambda: r"C:\hermes.exe")
import subprocess as _subprocess
def fake_run(argv, **kwargs):
class _Result:
returncode = 42
return _Result()
monkeypatch.setattr(_subprocess, "run", fake_run)
monkeypatch.setattr(relaunch_mod.os, "execvp", lambda *a, **kw: None)
with pytest.raises(SystemExit) as exc_info:
relaunch_mod.relaunch(["chat"])
assert exc_info.value.code == 42
def test_windows_surfaces_oserror_with_help(self, monkeypatch, capsys):
"""When subprocess itself raises OSError (file-not-found / bad format),
we must NOT let it bubble up as a cryptic traceback; instead, print a
user-readable hint and sys.exit(1)."""
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod, "resolve_hermes_bin", lambda: r"C:\missing.exe")
import subprocess as _subprocess
def fake_run(argv, **kwargs):
raise OSError(2, "No such file or directory")
monkeypatch.setattr(_subprocess, "run", fake_run)
monkeypatch.setattr(relaunch_mod.os, "execvp", lambda *a, **kw: None)
with pytest.raises(SystemExit) as exc_info:
relaunch_mod.relaunch(["chat"])
assert exc_info.value.code == 1
err = capsys.readouterr().err
assert "relaunch failed" in err
assert "open a new terminal" in err.lower() or "path" in err.lower()
class TestResolveHermesBinWindowsPyGuard:
"""On Windows, resolve_hermes_bin MUST NOT return a .py path.
os.access(x, os.X_OK) returns True for .py files on Windows because
PATHEXT includes .py when the Python launcher is installed, but
subprocess.run can't actually exec a .py directly, so the relaunch
would fail with the cryptic "%1 is not a valid Win32 application" error.
"""
def test_windows_rejects_py_argv0_falls_through_to_path(self, monkeypatch, tmp_path):
"""On Windows, if sys.argv[0] is a .py file, we must skip the
argv[0] fast-path and fall through to PATH / python -m."""
# Build a fake .py script that "passes" the isfile + X_OK checks.
script = tmp_path / "main.py"
script.write_text("# stub")
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod.sys, "argv", [str(script), "chat"])
# Force PATH lookup to return a hermes.exe so the test doesn't
# exercise the None-fallback path (that's a separate test).
monkeypatch.setattr(
relaunch_mod.shutil, "which",
lambda name: r"C:\venv\Scripts\hermes.exe" if name == "hermes" else None,
)
bin_path = relaunch_mod.resolve_hermes_bin()
# Must NOT be the .py — must be the hermes.exe PATH entry.
assert bin_path == r"C:\venv\Scripts\hermes.exe"
def test_posix_still_accepts_py_argv0(self, monkeypatch, tmp_path):
"""POSIX behaviour unchanged: argv[0] pointing at an executable
script (including .py with a shebang + chmod +x) is fine to return
because POSIX exec can route through the shebang line."""
if sys.platform == "win32":
pytest.skip("POSIX semantics")
script = tmp_path / "hermes"
script.write_text("#!/usr/bin/env python3\n")
script.chmod(0o755)
monkeypatch.setattr(relaunch_mod.sys, "argv", [str(script), "chat"])
assert relaunch_mod.resolve_hermes_bin() == str(script)
def test_windows_py_argv0_with_no_hermes_on_path_returns_none(self, monkeypatch, tmp_path):
"""Bulletproof fallback: if argv0 is .py on Windows AND hermes.exe
isn't on PATH, return None so the caller falls back to
python -m hermes_cli.main."""
script = tmp_path / "main.py"
script.write_text("# stub")
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod.sys, "argv", [str(script), "chat"])
monkeypatch.setattr(relaunch_mod.shutil, "which", lambda name: None)
assert relaunch_mod.resolve_hermes_bin() is None
assert calls == [("/usr/bin/hermes", ["/usr/bin/hermes", "--resume", "abc"])]
@@ -65,6 +65,31 @@ def test_routermint_base_url_applies_user_agent_header(mock_openai):
assert headers["User-Agent"].startswith("HermesAgent/")
@patch("run_agent.OpenAI")
def test_gmi_base_url_picks_up_profile_user_agent(mock_openai):
"""GMI declares User-Agent on its ProviderProfile.default_headers.
The ``_apply_client_headers_for_base_url`` else-branch looks up the
provider profile and applies its default_headers, so no GMI-specific
branch is needed in run_agent.
"""
mock_openai.return_value = MagicMock()
agent = AIAgent(
api_key="test-key",
base_url="https://api.gmi-serving.com/v1",
model="test/model",
provider="gmi",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
agent._apply_client_headers_for_base_url("https://api.gmi-serving.com/v1")
headers = agent._client_kwargs["default_headers"]
assert headers["User-Agent"].startswith("HermesAgent/")
@patch("run_agent.OpenAI")
def test_unknown_base_url_clears_default_headers(mock_openai):
mock_openai.return_value = MagicMock()
-297
@@ -1,297 +0,0 @@
"""Tests for hermes_bootstrap — Windows UTF-8 stdio shim.
The bootstrap module is imported at the top of every Hermes entry point
(hermes, hermes-agent, hermes-acp, gateway, batch_runner, cli.py). It
fixes Python's Windows UTF-8 defaults so print("café") doesn't crash and
subprocess children inherit UTF-8 mode.
Key invariants covered by these tests:
1. Windows: env vars get set, stdio reconfigured, non-ASCII print works
2. POSIX: complete no-op (we don't touch LANG/LC_* or anything else)
3. Idempotent: safe to call multiple times
4. Respects user opt-out: if the user explicitly sets PYTHONUTF8=0 or
PYTHONIOENCODING=something-else, we leave those alone
5. Load order: every Hermes entry point imports hermes_bootstrap as its
first non-docstring import (before anything that might do file I/O
or print to stdout)
"""
from __future__ import annotations
import io
import os
import subprocess
import sys
import textwrap
import unittest.mock as mock
import pytest
# Import the module under test via an import-time side-effect check path.
# We need to be able to reset its state between tests, so we import it
# fresh in each test that manipulates _IS_WINDOWS.
def _fresh_import():
"""Return a freshly-imported hermes_bootstrap module.
Drops any cached copy from sys.modules first so module-level code
runs again and the platform check re-evaluates.
"""
sys.modules.pop("hermes_bootstrap", None)
import hermes_bootstrap # noqa: WPS433
return hermes_bootstrap
class TestWindowsBehavior:
"""Windows: the bootstrap does its job."""
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_env_vars_set_on_windows(self, monkeypatch):
# Clear any pre-existing values and re-run bootstrap.
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
hb = _fresh_import()
# Module-level apply_windows_utf8_bootstrap() ran during import.
assert os.environ.get("PYTHONUTF8") == "1"
assert os.environ.get("PYTHONIOENCODING") == "utf-8"
assert hb._bootstrap_applied is True
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_stdout_reconfigured_to_utf8_on_windows(self):
# The live process's stdout should now be UTF-8 (the Hermes CLI
# runs on Windows with a pytest console that's cp1252 by default).
# If reconfigure succeeded, sys.stdout.encoding is 'utf-8'.
_fresh_import()
# pytest may capture stdout, which makes the encoding check flaky,
# so skip the test when the captured stream cannot be reconfigured.
out = sys.stdout
reconfigure = getattr(out, "reconfigure", None)
if reconfigure is None:
pytest.skip("pytest replaced sys.stdout with a non-reconfigurable stream")
# After bootstrap, encoding should be utf-8 (or the reconfigure
# skipped because pytest's capture already set it to utf-8).
assert out.encoding.lower() in {"utf-8", "utf8"}, (
f"stdout encoding is {out.encoding!r} — bootstrap should have "
"reconfigured it to UTF-8"
)
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_child_process_inherits_utf8_mode(self):
"""A subprocess spawned from this process should inherit
PYTHONUTF8=1 and be able to print non-ASCII to stdout."""
_fresh_import()
# Non-ASCII chars that would crash under cp1252: arrow, emoji.
script = textwrap.dedent("""
import sys
print("em-dash \\u2014 arrow \\u2192 emoji \\U0001f680")
sys.exit(0)
""").strip()
# Don't pass env= — let the child inherit os.environ, which
# now contains PYTHONUTF8=1 courtesy of the bootstrap.
result = subprocess.run(
[sys.executable, "-c", script],
capture_output=True,
timeout=15,
)
assert result.returncode == 0, (
f"Child crashed printing non-ASCII despite UTF-8 bootstrap:\n"
f" stdout: {result.stdout!r}\n"
f" stderr: {result.stderr!r}"
)
decoded = result.stdout.decode("utf-8")
assert "\u2014" in decoded
assert "\u2192" in decoded
assert "\U0001f680" in decoded
class TestUserOptOut:
"""If the user has explicitly set PYTHONUTF8 / PYTHONIOENCODING in
their environment, we respect that (setdefault, not overwrite)."""
@pytest.mark.skipif(
sys.platform != "win32",
reason="Only meaningful on Windows where we'd otherwise set these",
)
def test_user_pythonutf8_zero_preserved(self, monkeypatch):
monkeypatch.setenv("PYTHONUTF8", "0")
_fresh_import()
assert os.environ["PYTHONUTF8"] == "0", (
"bootstrap must not overwrite an explicit user setting"
)
@pytest.mark.skipif(
sys.platform != "win32",
reason="Only meaningful on Windows where we'd otherwise set these",
)
def test_user_pythonioencoding_preserved(self, monkeypatch):
monkeypatch.setenv("PYTHONIOENCODING", "latin-1")
_fresh_import()
assert os.environ["PYTHONIOENCODING"] == "latin-1"
class TestPosixNoOp:
"""POSIX: zero behavior change. We don't touch LANG, LC_*, or any
stdio. The goal is that Linux/macOS behave identically before and
after this module is imported."""
def test_noop_on_fake_posix(self, monkeypatch):
"""Even when imported, the bootstrap function must return False
and leave env untouched when _IS_WINDOWS is False."""
hb = _fresh_import()
# Reset + fake POSIX
hb._IS_WINDOWS = False
hb._bootstrap_applied = False
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
result = hb.apply_windows_utf8_bootstrap()
assert result is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
assert hb._bootstrap_applied is False
@pytest.mark.skipif(
sys.platform == "win32",
reason="Real POSIX required for this check",
)
def test_real_posix_bootstrap_is_noop(self, monkeypatch):
"""On actual Linux/macOS, importing the module must not set
PYTHONUTF8 or reconfigure stdio."""
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
hb = _fresh_import()
assert hb._bootstrap_applied is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
class TestIdempotence:
"""Calling apply_windows_utf8_bootstrap() multiple times must be safe."""
def test_second_call_returns_false(self):
hb = _fresh_import()
# First call already happened at import time.
result = hb.apply_windows_utf8_bootstrap()
assert result is False, (
"Second call should return False (idempotent no-op)"
)
def test_no_exceptions_on_repeated_calls(self):
hb = _fresh_import()
for _ in range(5):
hb.apply_windows_utf8_bootstrap()
class TestStdioReconfigureErrorHandling:
"""If sys.stdout/stderr/stdin have been replaced with streams that
don't support reconfigure (e.g. by a test harness), the bootstrap
must degrade gracefully rather than crash."""
def test_non_reconfigurable_stream_does_not_crash(self, monkeypatch):
"""Replace sys.stdout with a BytesIO (no reconfigure method),
then run the bootstrap and make sure it doesn't raise."""
hb = _fresh_import()
hb._IS_WINDOWS = True
hb._bootstrap_applied = False
fake = io.BytesIO() # no .reconfigure attribute
monkeypatch.setattr(sys, "stdout", fake)
try:
# Must not raise.
hb.apply_windows_utf8_bootstrap()
except Exception as exc:
pytest.fail(f"bootstrap raised on non-reconfigurable stdout: {exc}")
def test_reconfigure_oserror_is_caught(self, monkeypatch):
"""If reconfigure() itself raises (closed stream, etc.), swallow
the error; the env-var half of the fix still applies."""
hb = _fresh_import()
hb._IS_WINDOWS = True
hb._bootstrap_applied = False
class _BrokenStream:
encoding = "utf-8"
def reconfigure(self, **kwargs):
raise OSError("simulated: stream already closed")
monkeypatch.setattr(sys, "stdout", _BrokenStream())
monkeypatch.setattr(sys, "stderr", _BrokenStream())
# Must not raise.
hb.apply_windows_utf8_bootstrap()
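The behaviour these tests pin down (no-op off Windows, `setdefault` rather than overwrite, idempotent second call, tolerant stdio reconfigure) can be sketched as follows. This is a minimal illustration, not the real `hermes_bootstrap` module: the function name `apply_utf8_bootstrap_sketch` and its `is_windows`, `env`, and `streams` parameters are hypothetical, added only so the sketch can be exercised without touching the live process.

```python
import os
import sys

_applied = False  # module-level guard so repeated calls are no-ops


def apply_utf8_bootstrap_sketch(is_windows=None, env=None, streams=None):
    """Hypothetical stand-in for the bootstrap; returns True only when it acts."""
    global _applied
    if is_windows is None:
        is_windows = sys.platform == "win32"
    if env is None:
        env = os.environ
    if streams is None:
        streams = (sys.stdout, sys.stderr)
    if not is_windows or _applied:
        return False
    # setdefault keeps any value the user exported explicitly.
    env.setdefault("PYTHONUTF8", "1")
    env.setdefault("PYTHONIOENCODING", "utf-8")
    for stream in streams:
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is None:
            continue  # stream replaced by something without reconfigure()
        try:
            reconfigure(encoding="utf-8", errors="replace")
        except (OSError, ValueError):
            pass  # closed or broken stream; the env vars still apply
    _applied = True
    return True
```

Passing `env` and `streams` explicitly keeps the sketch testable on any host; the real module, per the tests above, works on `os.environ` and `sys.stdout`/`sys.stderr` directly.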
class TestEntryPointsImportBootstrap:
"""Every Hermes entry point must import hermes_bootstrap as its
first non-docstring import. We check this by scanning source files
rather than invoking the entry points (which would require a full
agent context)."""
# Entry points that invoke Hermes as a process. Each one must
# import hermes_bootstrap before doing any file I/O or stdout writes.
ENTRY_POINTS = [
"hermes_cli/main.py", # hermes CLI (console_script)
"run_agent.py", # hermes-agent (console_script)
"acp_adapter/entry.py", # hermes-acp (console_script)
"gateway/run.py", # gateway
"batch_runner.py", # batch mode
"cli.py", # legacy direct-launch CLI
]
@pytest.mark.parametrize("path", ENTRY_POINTS)
def test_entry_point_imports_bootstrap(self, path):
"""The file must contain 'import hermes_bootstrap' and that
line must appear before the first 'import' of anything else.
We're lenient about the docstring (can be arbitrarily long) and
about comment lines; we just need to verify the first import
statement is the bootstrap.
"""
# Resolve relative to the hermes-agent repo root. Tests live
# at tests/test_hermes_bootstrap.py, so go up one dir.
import pathlib
here = pathlib.Path(__file__).resolve()
repo_root = here.parent.parent # tests/ -> repo root
full_path = repo_root / path
assert full_path.exists(), f"entry point missing: {full_path}"
source = full_path.read_text(encoding="utf-8")
# Find the first non-comment, non-blank line that starts with
# 'import ' or 'from '. It must be 'import hermes_bootstrap'.
import tokenize
import ast
tree = ast.parse(source)
first_import_node = None
for node in ast.iter_child_nodes(tree):
if isinstance(node, (ast.Import, ast.ImportFrom)):
first_import_node = node
break
assert first_import_node is not None, (
f"{path}: no top-level imports found at all"
)
if isinstance(first_import_node, ast.Import):
first_import_name = first_import_node.names[0].name
else: # ImportFrom
first_import_name = first_import_node.module or ""
assert first_import_name == "hermes_bootstrap", (
f"{path}: first top-level import is {first_import_name!r}, "
f"but it must be 'hermes_bootstrap' so UTF-8 stdio is "
f"configured before anything else initializes. Move the "
f"'import hermes_bootstrap' line to be the first import."
)
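The first-import check above reduces to a small helper over the `ast` module. The sketch below isolates that step; the name `first_top_level_import` is hypothetical and does not appear in the test file.

```python
import ast


def first_top_level_import(source):
    """Return the module named by the first top-level import, or None."""
    tree = ast.parse(source)
    # iter_child_nodes only walks direct children of the module, so
    # imports nested in functions or try/except blocks are ignored.
    for node in ast.iter_child_nodes(tree):
        if isinstance(node, ast.Import):
            return node.names[0].name
        if isinstance(node, ast.ImportFrom):
            # "from . import x" has module=None; report it as "".
            return node.module or ""
    return None
```

Because the walk is AST-based, docstrings and comments before the import are skipped for free. One consequence worth noting: a leading `from __future__ import annotations` would be reported as `__future__`, so an entry point using it would fail the check as written.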
-115
@@ -1,115 +0,0 @@
"""Tests for ruff lint config — guards against accidental rule removal.
PLW1514 (unspecified-encoding) was enabled after a debug session on
Windows turned up three separate UTF-8 regressions in execute_code.
The rule catches bare ``open()`` / ``read_text()`` / ``write_text()``
calls that default to the locale encoding (cp1252 on Windows), which
silently corrupts non-ASCII content.
These tests ensure:
1. PLW1514 stays in ``[tool.ruff.lint.select]``
2. The CI workflow's blocking step still invokes ``ruff check .``
3. pyproject.toml has ``preview = true`` (required: PLW1514 is a
preview rule in ruff 0.15.x)
If someone removes any of these, CI stops enforcing UTF-8-explicit
opens and we're back to the original Windows-regression trap.
"""
from __future__ import annotations
import pathlib
import pytest
try:
import tomllib # Python 3.11+
except ImportError: # pragma: no cover — 3.10 and earlier
import tomli as tomllib # type: ignore
REPO_ROOT = pathlib.Path(__file__).resolve().parent.parent
def _load_pyproject() -> dict:
with open(REPO_ROOT / "pyproject.toml", "rb") as fh:
return tomllib.load(fh)
class TestRuffConfig:
def test_plw1514_is_in_select_list(self):
"""pyproject.toml must keep PLW1514 in [tool.ruff.lint.select]."""
cfg = _load_pyproject()
selected = (
cfg.get("tool", {})
.get("ruff", {})
.get("lint", {})
.get("select", [])
)
assert "PLW1514" in selected, (
"PLW1514 (unspecified-encoding) was removed from "
"[tool.ruff.lint.select]. This rule blocks bare open() calls "
"that default to locale encoding on Windows — removing it "
"re-opens a class of UTF-8 bugs we already paid to close. "
"If you genuinely want to remove it, delete this test in the "
"same commit so the intent is deliberate."
)
def test_preview_mode_enabled(self):
"""PLW1514 is a preview rule in ruff 0.15.x — preview=true is
required for it to actually run."""
cfg = _load_pyproject()
ruff_cfg = cfg.get("tool", {}).get("ruff", {})
assert ruff_cfg.get("preview") is True, (
"[tool.ruff] preview=true is required — PLW1514 is a preview "
"rule and silently becomes a no-op without it. If this ever "
"becomes a stable rule, you can drop preview=true but must "
"verify PLW1514 still fires in a sample test run first."
)
class TestLintWorkflow:
WORKFLOW_PATH = REPO_ROOT / ".github" / "workflows" / "lint.yml"
def test_workflow_exists(self):
assert self.WORKFLOW_PATH.exists(), (
f"CI workflow missing: {self.WORKFLOW_PATH}"
)
def test_workflow_has_blocking_ruff_step(self):
"""The workflow must run a blocking ``ruff check .`` step
(one without --exit-zero) so violations fail the job."""
content = self.WORKFLOW_PATH.read_text(encoding="utf-8")
# Look for the blocking step's named line + its command. We want
# at least one ``ruff check .`` that does NOT have ``--exit-zero``
# nearby.
import re
# Split into lines and find ruff check invocations
lines = content.splitlines()
found_blocking = False
for i, line in enumerate(lines):
stripped = line.strip()
if stripped.startswith("ruff check") and "--exit-zero" not in stripped:
# Also check it's not piped to `|| true` which would mask
# the exit code.
window = " ".join(lines[i:i + 3])
if "|| true" not in window:
found_blocking = True
break
assert found_blocking, (
"lint.yml no longer contains a blocking ``ruff check .`` step "
"(one without --exit-zero and not masked by || true). "
"Restore it — the PLW1514 rule is only useful if CI actually "
"fails on violation."
)
def test_workflow_yaml_is_valid(self):
"""Workflow file must parse as valid YAML (can't ship a broken
CI config to main)."""
import yaml
content = self.WORKFLOW_PATH.read_text(encoding="utf-8")
try:
parsed = yaml.safe_load(content)
except yaml.YAMLError as exc:
pytest.fail(f"lint.yml is not valid YAML: {exc}")
assert isinstance(parsed, dict)
assert "jobs" in parsed
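The blocking-step scan in `test_workflow_has_blocking_ruff_step` can be pulled out into a pure function over the workflow text. The sketch below is one way to do that; the name `has_blocking_ruff_step` and the `window` parameter are hypothetical. Unlike the test above, it also strips a leading `- ` and `run:` so single-line steps are recognised, which is an assumption about how `lint.yml` might be written.

```python
def has_blocking_ruff_step(workflow_text, window=3):
    """True if some 'ruff check' line can actually fail the job."""
    lines = workflow_text.splitlines()
    for i, line in enumerate(lines):
        cmd = line.strip()
        # Assumption: tolerate "- run: ruff check ." as well as a bare
        # command inside a "run: |" block scalar.
        for prefix in ("- ", "run:"):
            if cmd.startswith(prefix):
                cmd = cmd[len(prefix):].strip()
        if not cmd.startswith("ruff check"):
            continue
        if "--exit-zero" in cmd:
            continue  # advisory step, never fails
        nearby = " ".join(lines[i:i + window])
        if "|| true" in nearby:
            continue  # exit code masked by the shell
        return True
    return False
```

A text scan like this is deliberately crude: it does not parse YAML, so it cannot tell whether the step sits under `continue-on-error: true`. That trade-off keeps the check dependency-free.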
+17 -11
@@ -1863,13 +1863,15 @@ def test_config_set_personality_rejects_unknown_name(monkeypatch):
assert "Unknown personality" in resp["error"]["message"]
def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
def test_config_set_personality_preserves_history_and_returns_info(monkeypatch):
agent = types.SimpleNamespace(
ephemeral_system_prompt=None, _cached_system_prompt="old"
)
session = _session(
agent=types.SimpleNamespace(),
agent=agent,
history=[{"role": "user", "text": "hi"}],
history_version=4,
)
new_agent = types.SimpleNamespace(model="x")
emits = []
server._sessions["sid"] = session
@@ -1878,13 +1880,9 @@ def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
"_available_personalities",
lambda cfg=None: {"helpful": "You are helpful."},
)
monkeypatch.setattr(
server, "_make_agent", lambda sid, key, session_id=None: new_agent
)
monkeypatch.setattr(
server, "_session_info", lambda agent: {"model": getattr(agent, "model", "?")}
)
monkeypatch.setattr(server, "_restart_slash_worker", lambda session: None)
monkeypatch.setattr(server, "_emit", lambda *args: emits.append(args))
monkeypatch.setattr(server, "_write_config_key", lambda path, value: None)
@@ -1896,11 +1894,19 @@ def test_config_set_personality_resets_history_and_returns_info(monkeypatch):
}
)
assert resp["result"]["history_reset"] is True
assert resp["result"]["info"] == {"model": "x"}
assert session["history"] == []
assert resp["result"]["history_reset"] is False
assert resp["result"]["info"] == {"model": "?"}
# History is preserved with a pivot marker appended
assert len(session["history"]) == 2
assert session["history"][0] == {"role": "user", "text": "hi"}
assert session["history"][1]["role"] == "user"
assert "personality" in session["history"][1]["content"].lower()
assert "You are helpful." in session["history"][1]["content"]
assert session["history_version"] == 5
assert ("session.info", "sid", {"model": "x"}) in emits
# Agent's system prompt was updated in-place; cached prompt untouched
assert agent.ephemeral_system_prompt == "You are helpful."
assert agent._cached_system_prompt == "old"
assert ("session.info", "sid", {"model": "?"}) in emits
def test_session_compress_uses_compress_helper(monkeypatch):
+1 -9
@@ -340,15 +340,7 @@ class TestRunBrowserCommandPathConstruction:
_run_browser_command("test-task", "navigate", ["https://example.com"])
assert captured_cmd is not None
# The prefix must split "npx agent-browser" into two argv items.
# On POSIX shutil.which("npx") returns the absolute path if npx is on
# PATH (which the test's patched PATH always contains when the system
# has it installed). The important invariant is that the second
# argv item is the package name "agent-browser", not a merged
# "npx agent-browser" string — that's what Popen needs.
assert len(captured_cmd) >= 2
assert captured_cmd[0].endswith("npx") or captured_cmd[0] == "npx"
assert captured_cmd[1] == "agent-browser"
assert captured_cmd[:2] == ["npx", "agent-browser"]
assert captured_cmd[2:6] == [
"--session",
"test-session",
+2 -8
@@ -774,17 +774,11 @@ class TestEnvVarFiltering(unittest.TestCase):
class TestExecuteCodeEdgeCases(unittest.TestCase):
def test_windows_returns_error(self):
"""When SANDBOX_AVAILABLE is False (e.g. when the backend deems
the sandbox unusable for this environment), execute_code returns
an error JSON with a readable message pointing the caller at
regular tool calls. Previously this was a Windows-only gate;
execute_code now works on Windows via loopback TCP, so the
error is only emitted when SANDBOX_AVAILABLE is explicitly
flipped off (e.g. for future platform-specific disables)."""
"""On Windows (or when SANDBOX_AVAILABLE is False), returns error JSON."""
with patch("tools.code_execution_tool.SANDBOX_AVAILABLE", False):
result = json.loads(execute_code("print('hi')", task_id="test"))
self.assertIn("error", result)
self.assertIn("unavailable", result["error"].lower())
self.assertIn("Windows", result["error"])
def test_whitespace_only_code(self):
result = json.loads(execute_code(" \n\t ", task_id="test"))
+2 -30
@@ -131,12 +131,6 @@ class TestResolveChildPython(unittest.TestCase):
def test_project_with_virtualenv_picks_venv_python(self):
"""Project mode + VIRTUAL_ENV pointing at a real venv → that python."""
if sys.platform == "win32":
pytest.skip(
"Creates symlinks and assumes POSIX venv layout (bin/python). "
"Windows venvs use Scripts/python.exe and symlink creation "
"requires elevated privileges (WinError 1314)."
)
import tempfile, pathlib
with tempfile.TemporaryDirectory() as td:
fake_venv = pathlib.Path(td)
@@ -160,12 +154,6 @@ class TestResolveChildPython(unittest.TestCase):
def test_project_prefers_virtualenv_over_conda(self):
"""If both VIRTUAL_ENV and CONDA_PREFIX are set, VIRTUAL_ENV wins."""
if sys.platform == "win32":
pytest.skip(
"Creates symlinks and assumes POSIX venv layout (bin/python). "
"Windows venvs use Scripts/python.exe and symlink creation "
"requires elevated privileges (WinError 1314)."
)
import tempfile, pathlib
with tempfile.TemporaryDirectory() as ve_td, tempfile.TemporaryDirectory() as conda_td:
ve = pathlib.Path(ve_td)
@@ -269,15 +257,7 @@ class TestModeAwareSchema(unittest.TestCase):
# Integration: what actually happens when execute_code runs per mode
# ---------------------------------------------------------------------------
@pytest.mark.skipif(
sys.platform == "win32",
reason=(
"Assumes POSIX venv layout (bin/python) and symlink creation "
"privileges. execute_code itself works on Windows — these "
"integration tests just haven't been ported to the Scripts/"
"python.exe layout yet."
),
)
@pytest.mark.skipif(sys.platform == "win32", reason="execute_code is POSIX-only")
class TestExecuteCodeModeIntegration(unittest.TestCase):
"""End-to-end: verify the subprocess actually runs where we expect."""
@@ -371,15 +351,7 @@ class TestExecuteCodeModeIntegration(unittest.TestCase):
# changes CWD + interpreter, not the security posture.
# ---------------------------------------------------------------------------
@pytest.mark.skipif(
sys.platform == "win32",
reason=(
"Assumes POSIX venv layout (bin/python) and symlink creation "
"privileges. execute_code itself works on Windows — these "
"integration tests just haven't been ported to the Scripts/"
"python.exe layout yet."
),
)
@pytest.mark.skipif(sys.platform == "win32", reason="execute_code is POSIX-only")
class TestSecurityInvariantsAcrossModes(unittest.TestCase):
def _run(self, code, mode):
@@ -1,698 +0,0 @@
"""Tests for execute_code env scrubbing on Windows.
On Windows the child process needs a small set of OS-essential env vars
(SYSTEMROOT, WINDIR, COMSPEC, ...) to run. Without SYSTEMROOT in particular,
``socket.socket(AF_INET, SOCK_STREAM)`` fails inside the sandbox with
WinError 10106 (Winsock can't locate mswsock.dll) and no tool call over
loopback TCP can ever succeed.
These tests cover ``_scrub_child_env`` directly so they run on every OS:
the logic is conditional on a passed-in ``is_windows`` flag, not on
the host platform. We also keep a live Winsock smoke test that only runs
on a real Windows host.
Also covers the companion Windows bug: the sandbox writes
``hermes_tools.py`` and ``script.py`` into a temp dir, and those files
must be written as UTF-8 on every platform: the generated stub contains
em-dash/en-dash characters in docstrings, and the default ``open(path, "w")``
on Windows uses the system locale (cp1252 typically), corrupting those
bytes. The child then fails to import with a SyntaxError:
``'utf-8' codec can't decode byte 0x97``.
"""
import os
import socket
import subprocess
import sys
import textwrap
import unittest.mock as mock
import pytest
from tools.code_execution_tool import (
_SAFE_ENV_PREFIXES,
_SECRET_SUBSTRINGS,
_WINDOWS_ESSENTIAL_ENV_VARS,
_scrub_child_env,
)
def _no_passthrough(_name):
return False
class TestWindowsEssentialAllowlist:
"""The allowlist itself — contents, shape, and invariants."""
def test_contains_winsock_required_vars(self):
# Without SYSTEMROOT the child cannot initialize Winsock.
assert "SYSTEMROOT" in _WINDOWS_ESSENTIAL_ENV_VARS
def test_contains_subprocess_required_vars(self):
# Without COMSPEC, subprocess can't resolve the default shell.
assert "COMSPEC" in _WINDOWS_ESSENTIAL_ENV_VARS
def test_contains_user_profile_vars(self):
# os.path.expanduser("~") on Windows uses USERPROFILE.
assert "USERPROFILE" in _WINDOWS_ESSENTIAL_ENV_VARS
assert "APPDATA" in _WINDOWS_ESSENTIAL_ENV_VARS
assert "LOCALAPPDATA" in _WINDOWS_ESSENTIAL_ENV_VARS
def test_contains_only_uppercase_names(self):
# Windows env var names are case-insensitive but we canonicalize to
# uppercase for the membership check (``k.upper() in _WINDOWS_...``).
for name in _WINDOWS_ESSENTIAL_ENV_VARS:
assert name == name.upper(), f"{name!r} should be uppercase"
def test_no_overlap_with_secret_substrings(self):
# Sanity: none of the essential OS vars should look like secrets.
# If this ever fires, we'd have a precedence ordering bug (secrets
# are blocked *before* the essentials check).
for name in _WINDOWS_ESSENTIAL_ENV_VARS:
assert not any(s in name for s in _SECRET_SUBSTRINGS), (
f"{name!r} looks secret-like — would be blocked before the "
"essentials allowlist can match"
)
class TestScrubChildEnvWindows:
"""Verify _scrub_child_env passes Windows essentials through when
is_windows=True and blocks them when is_windows=False (so POSIX hosts
don't inherit pointless Windows vars)."""
def _sample_windows_env(self):
"""A realistic subset of what os.environ looks like on Windows."""
return {
"SYSTEMROOT": r"C:\Windows",
"SystemDrive": "C:", # Windows preserves native case
"WINDIR": r"C:\Windows",
"ComSpec": r"C:\Windows\System32\cmd.exe",
"PATHEXT": ".COM;.EXE;.BAT;.CMD;.PY",
"USERPROFILE": r"C:\Users\alice",
"APPDATA": r"C:\Users\alice\AppData\Roaming",
"LOCALAPPDATA": r"C:\Users\alice\AppData\Local",
"PATH": r"C:\Windows\System32;C:\Python311",
"HOME": r"C:\Users\alice",
"TEMP": r"C:\Users\alice\AppData\Local\Temp",
# Should still be blocked:
"OPENAI_API_KEY": "sk-secret",
"GITHUB_TOKEN": "ghp_secret",
"MY_PASSWORD": "hunter2",
# Not matched by any rule — should be dropped on both OSes:
"RANDOM_UNKNOWN_VAR": "value",
}
def test_windows_essentials_passed_through_when_is_windows_true(self):
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
# Every essential var from the sample env should survive.
assert scrubbed["SYSTEMROOT"] == r"C:\Windows"
assert scrubbed["SystemDrive"] == "C:" # case preserved
assert scrubbed["WINDIR"] == r"C:\Windows"
assert scrubbed["ComSpec"] == r"C:\Windows\System32\cmd.exe"
assert scrubbed["PATHEXT"] == ".COM;.EXE;.BAT;.CMD;.PY"
assert scrubbed["USERPROFILE"] == r"C:\Users\alice"
assert scrubbed["APPDATA"].endswith("Roaming")
assert scrubbed["LOCALAPPDATA"].endswith("Local")
# Safe-prefix vars still pass (baseline behavior).
assert "PATH" in scrubbed
assert "HOME" in scrubbed
assert "TEMP" in scrubbed
def test_secrets_still_blocked_on_windows(self):
"""The Windows allowlist must NOT defeat the secret-substring block.
This is the key security invariant: essentials are allowed by
*exact name*, and the secret-substring block runs before the
essentials check anyway, so a variable named e.g. ``API_KEY`` can
never sneak through just because we added Windows support.
"""
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
assert "OPENAI_API_KEY" not in scrubbed
assert "GITHUB_TOKEN" not in scrubbed
assert "MY_PASSWORD" not in scrubbed
def test_unknown_vars_still_dropped_on_windows(self):
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
assert "RANDOM_UNKNOWN_VAR" not in scrubbed
def test_essentials_blocked_when_is_windows_false(self):
"""On POSIX hosts, Windows-specific vars should not pass — they
have no meaning and could confuse child tooling."""
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=False)
# Safe prefixes still match (PATH, HOME, TEMP).
assert "PATH" in scrubbed
assert "HOME" in scrubbed
assert "TEMP" in scrubbed
# But Windows OS vars should be dropped.
assert "SYSTEMROOT" not in scrubbed
assert "WINDIR" not in scrubbed
assert "ComSpec" not in scrubbed
assert "APPDATA" not in scrubbed
def test_case_insensitive_essential_match(self):
"""Windows env var names are case-insensitive at the OS level but
Python preserves whatever case os.environ reported. The scrubber
must normalize to uppercase for the membership check."""
env = {
"SystemRoot": r"C:\Windows", # mixed case
"comspec": r"C:\Windows\System32\cmd.exe", # lowercase
"APPDATA": r"C:\Users\x\AppData\Roaming", # uppercase
}
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
assert "SystemRoot" in scrubbed
assert "comspec" in scrubbed
assert "APPDATA" in scrubbed
class TestScrubChildEnvPassthroughInteraction:
"""The passthrough hook runs *before* the secret block, so a skill
can legitimately forward a third-party API key. The Windows
essentials addition must not interfere with that."""
def test_passthrough_wins_over_secret_block(self):
env = {"TENOR_API_KEY": "x", "PATH": "/bin"}
scrubbed = _scrub_child_env(env,
is_passthrough=lambda k: k == "TENOR_API_KEY",
is_windows=False)
assert scrubbed.get("TENOR_API_KEY") == "x"
assert scrubbed.get("PATH") == "/bin"
def test_passthrough_still_works_on_windows(self):
env = {
"TENOR_API_KEY": "x",
"SYSTEMROOT": r"C:\Windows",
"OPENAI_API_KEY": "sk-secret", # not passthrough
}
scrubbed = _scrub_child_env(
env,
is_passthrough=lambda k: k == "TENOR_API_KEY",
is_windows=True,
)
assert scrubbed.get("TENOR_API_KEY") == "x"
assert scrubbed.get("SYSTEMROOT") == r"C:\Windows"
assert "OPENAI_API_KEY" not in scrubbed
@pytest.mark.skipif(
sys.platform != "win32",
reason="Winsock-specific regression — only meaningful on Windows",
)
class TestWindowsSocketSmokeTest:
"""Integration-ish smoke test: spawn a child Python with a scrubbed
env and confirm it can create an AF_INET socket. This is the
regression that motivated the fix: without SYSTEMROOT the child
hits WinError 10106 before any RPC is attempted."""
def test_child_can_create_socket_with_scrubbed_env(self):
scrubbed = _scrub_child_env(os.environ, is_passthrough=_no_passthrough)
# Build a tiny child script that simply opens an AF_INET socket.
script = textwrap.dedent("""
import socket, sys
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.close()
print("OK")
sys.exit(0)
except OSError as exc:
print(f"FAIL: {exc}")
sys.exit(1)
""").strip()
result = subprocess.run(
[sys.executable, "-c", script],
env=scrubbed,
capture_output=True,
text=True,
timeout=15,
)
assert result.returncode == 0, (
f"Child failed to create socket with scrubbed env:\n"
f" stdout={result.stdout!r}\n"
f" stderr={result.stderr!r}\n"
f" scrubbed keys={sorted(scrubbed.keys())}"
)
assert "OK" in result.stdout
# ---------------------------------------------------------------------------
# POSIX equivalence guard
# ---------------------------------------------------------------------------
def _legacy_posix_scrubber(source_env, is_passthrough):
"""Verbatim copy of the pre-Windows-fix inline scrubbing logic.
This is the oracle used by TestPosixEquivalence to prove the refactor
did not change POSIX behavior. DO NOT edit this to "match" a future
production change. If _scrub_child_env's POSIX behavior legitimately
needs to evolve, delete this function and adjust the equivalence test
on purpose, so the churn is visible in review.
"""
_SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
"TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
"XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
"HERMES_")
_SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
"PASSWD", "AUTH")
out = {}
for k, v in source_env.items():
if is_passthrough(k):
out[k] = v
continue
if any(s in k.upper() for s in _SECRET_SUBSTRINGS):
continue
if any(k.startswith(p) for p in _SAFE_ENV_PREFIXES):
out[k] = v
return out
class TestPosixEquivalence:
"""Lock in the invariant that _scrub_child_env(env, is_windows=False)
behaves *bit-for-bit identically* to the pre-refactor inline scrubber.
If this ever fails, it means somebody changed POSIX env-scrubbing
behavior, maybe on purpose, maybe not. Either way it should land
as a deliberate, reviewed change (update _legacy_posix_scrubber
above in the same PR).
Rationale: the Windows-essentials patch refactored the scrubber into
a helper. Linux/macOS must not regress. This class gates that.
"""
_POSIX_SYNTHETIC_ENV = {
# Safe-prefix matches
"PATH": "/usr/bin:/bin",
"HOME": "/home/alice",
"USER": "alice",
"LANG": "en_US.UTF-8",
"LC_CTYPE": "en_US.UTF-8",
"TERM": "xterm-256color",
"SHELL": "/bin/zsh",
"LOGNAME": "alice",
"TMPDIR": "/tmp",
"XDG_RUNTIME_DIR": "/run/user/1000",
"XDG_CONFIG_HOME": "/home/alice/.config",
"PYTHONPATH": "/opt/lib",
"VIRTUAL_ENV": "/home/alice/.venv",
"CONDA_PREFIX": "/opt/conda",
"HERMES_HOME": "/home/alice/.hermes",
"HERMES_INTERACTIVE": "1",
# Secret-substring blocks
"OPENAI_API_KEY": "sk-xxx",
"GITHUB_TOKEN": "ghp_xxx",
"AWS_SECRET_ACCESS_KEY": "yyy",
"MY_PASSWORD": "hunter2",
# Uncategorized — must be dropped
"RANDOM_UNKNOWN": "drop-me",
"DISPLAY": ":0",
"SSH_AUTH_SOCK": "/run/user/1000/ssh-agent",
# Passthrough candidate (also matches secret block by default)
"TENOR_API_KEY": "tenor-xxx",
}
_WINDOWS_SYNTHETIC_ENV = {
# Windows-essential names (must be dropped on POSIX, passed on Win)
"SYSTEMROOT": r"C:\Windows",
"SystemDrive": "C:",
"WINDIR": r"C:\Windows",
"ComSpec": r"C:\Windows\System32\cmd.exe",
"PATHEXT": ".COM;.EXE;.BAT",
"USERPROFILE": r"C:\Users\alice",
"APPDATA": r"C:\Users\alice\AppData\Roaming",
"LOCALAPPDATA": r"C:\Users\alice\AppData\Local",
# Safe-prefix matches (cross-platform)
"PATH": r"C:\Python311;C:\Windows\System32",
"HOME": r"C:\Users\alice",
"TEMP": r"C:\Users\alice\AppData\Local\Temp",
# Secret-looking (always blocked)
"OPENAI_API_KEY": "sk-xxx",
"GITHUB_TOKEN": "ghp_xxx",
}
@pytest.mark.parametrize("env_name,env", [
("posix_synthetic", _POSIX_SYNTHETIC_ENV),
("windows_synthetic_on_posix", _WINDOWS_SYNTHETIC_ENV),
])
@pytest.mark.parametrize("pt_name,pt", [
("no_passthrough", lambda _: False),
("tenor_passthrough", lambda k: k == "TENOR_API_KEY"),
("all_passthrough", lambda _: True),
])
def test_posix_behavior_unchanged(self, env_name, env, pt_name, pt):
"""For every combination of (env shape × passthrough rule), the
new helper with is_windows=False must produce the exact same dict
as the legacy inline scrubber.
We parametrize over three passthrough rules to cover the full
surface: no passthrough, single-var passthrough (the common
skill-registered case), and everything-passes (edge case that
could expose precedence bugs)."""
expected = _legacy_posix_scrubber(env, pt)
actual = _scrub_child_env(env, is_passthrough=pt, is_windows=False)
assert actual == expected, (
f"POSIX behavior regressed for env={env_name}, passthrough={pt_name}\n"
f" only in legacy: {sorted(set(expected) - set(actual))}\n"
f" only in new: {sorted(set(actual) - set(expected))}\n"
f" value diffs: {[k for k in expected if k in actual and expected[k] != actual[k]]}"
)
def test_posix_behavior_unchanged_on_real_os_environ(self):
"""Bonus check against the actual os.environ of the host running
the test. This covers vars we might not have thought to put in
the synthetic fixtures."""
expected = _legacy_posix_scrubber(os.environ, lambda _: False)
actual = _scrub_child_env(os.environ,
is_passthrough=lambda _: False,
is_windows=False)
assert actual == expected, (
"POSIX-mode scrubber diverged from legacy behavior on real "
f"os.environ (host platform={sys.platform})"
)
def test_windows_mode_is_strict_superset_of_posix_mode(self):
"""Correctness check on the NEW behavior: is_windows=True must
keep everything POSIX mode keeps, and *may* add Windows
essentials. It must never drop a var that POSIX mode would keep;
if it did, we'd have broken same-host reuse of the scrubber."""
env = {**self._POSIX_SYNTHETIC_ENV, **self._WINDOWS_SYNTHETIC_ENV}
posix_result = _scrub_child_env(env,
is_passthrough=lambda _: False,
is_windows=False)
windows_result = _scrub_child_env(env,
is_passthrough=lambda _: False,
is_windows=True)
missing = set(posix_result) - set(windows_result)
assert not missing, (
f"is_windows=True dropped vars that is_windows=False kept: {missing}"
)
# And any extras must come from the Windows essentials allowlist.
extras = set(windows_result) - set(posix_result)
for k in extras:
assert k.upper() in _WINDOWS_ESSENTIAL_ENV_VARS, (
f"Unexpected extra var in windows-mode output: {k} "
f"(not in _WINDOWS_ESSENTIAL_ENV_VARS)"
)
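The precedence these tests describe (passthrough first, then the secret-substring block, then safe prefixes, then the Windows essentials by exact uppercase name) can be sketched as below. This is an illustration under stated assumptions, not the production `_scrub_child_env`: the function name `scrub_env_sketch` is hypothetical, the prefix and secret tuples are copied from `_legacy_posix_scrubber`, and `WINDOWS_ESSENTIALS` is only the subset of names the tests mention.

```python
SAFE_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
                 "TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
                 "XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
                 "HERMES_")
SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
                     "PASSWD", "AUTH")
# Assumed subset, inferred from the names exercised in the tests.
WINDOWS_ESSENTIALS = frozenset({
    "SYSTEMROOT", "SYSTEMDRIVE", "WINDIR", "COMSPEC",
    "PATHEXT", "USERPROFILE", "APPDATA", "LOCALAPPDATA",
})


def scrub_env_sketch(source_env, is_passthrough, is_windows):
    """Filter an environment mapping for a sandboxed child process."""
    out = {}
    for name, value in source_env.items():
        if is_passthrough(name):
            out[name] = value  # explicit opt-in beats the secret block
        elif any(s in name.upper() for s in SECRET_SUBSTRINGS):
            continue  # secret-looking names never pass implicitly
        elif any(name.startswith(p) for p in SAFE_PREFIXES):
            out[name] = value
        elif is_windows and name.upper() in WINDOWS_ESSENTIALS:
            out[name] = value  # original key case is preserved
    return out
```

Because the Windows branch only ever adds names, the `is_windows=True` result is a superset of the `is_windows=False` result for the same input, which is the property `test_windows_mode_is_strict_superset_of_posix_mode` checks.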
# ---------------------------------------------------------------------------
# UTF-8 file-write regression test
# ---------------------------------------------------------------------------
#
# The sandbox writes two Python files into a temp dir — the generated
# ``hermes_tools.py`` stub, and the LLM's ``script.py``. Both contain
# non-ASCII characters in practice: the stub has em-dashes in docstrings
# ("``tcp://host:port`` — the parent falls back..."), and user scripts
# routinely contain non-ASCII strings, comments, or Unicode identifiers.
#
# On Windows, ``open(path, "w")`` without encoding= uses the system locale
# (cp1252 on US/UK installs), which writes an em-dash as the single byte
# 0x97 rather than its three-byte UTF-8 form. Python then
# tries to decode the file as UTF-8 when importing it (PEP 3120), fails,
# and the sandbox aborts with:
#
# SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x97
# in position N: invalid start byte
#
# This was the *second* Windows-specific bug (WinError 10106 was the first).
# The fix is to always pass ``encoding="utf-8"`` when writing Python source.
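#
# A minimal sketch of the failure mode described above, using only the
# standard codecs rather than the real sandbox write path. The sample
# string is made up for illustration; it is not the actual stub text.

```python
# Hypothetical source line containing an em-dash (U+2014).
text = 'endpoint = "tcp://host:port"  # \u2014 parent falls back'

# cp1252 encodes the em-dash as the single byte 0x97.
cp1252_bytes = text.encode("cp1252")

# 0x97 is not a valid start byte in UTF-8, so decoding as UTF-8 fails.
try:
    cp1252_bytes.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Encoding as UTF-8 round-trips cleanly.
round_tripped = text.encode("utf-8").decode("utf-8")
```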
class TestSandboxWritesUtf8:
"""Verify the file-write call sites use UTF-8 explicitly, not the
platform default. We check the source of ``execute_code`` rather
than spawning a real sandbox because the latter needs a full agent
context, whereas the code inspection is deterministic and fast."""
def test_stub_and_script_writes_specify_utf8(self):
"""Both ``hermes_tools.py`` and ``script.py`` writes in
``_execute_local`` must pass ``encoding="utf-8"``."""
import tools.code_execution_tool as cet
src = open(cet.__file__, encoding="utf-8").read()
# There should be no ``open(path, "w")`` without encoding= for
# the two staging files. Grep-style check: find every write of
# a .py file inside tmpdir and assert the line also contains
# ``encoding="utf-8"`` within a short window.
import re
pattern = re.compile(
r'open\(\s*os\.path\.join\(\s*tmpdir\s*,\s*"[^"]+\.py"\s*\)\s*,\s*"w"[^)]*\)'
)
for match in pattern.finditer(src):
line = match.group(0)
assert 'encoding="utf-8"' in line or "encoding='utf-8'" in line, (
f"Sandbox file write missing encoding=\"utf-8\" on Windows: {line!r}"
)
def test_file_rpc_stub_uses_utf8(self):
"""The file-based RPC transport stub (used by remote backends)
reads/writes JSON response files. Those must also specify UTF-8
so non-ASCII tool results survive the round-trip intact."""
from tools.code_execution_tool import generate_hermes_tools_module
stub = generate_hermes_tools_module(["terminal"], transport="file")
# The generated stub should open response + request files as UTF-8.
assert 'encoding="utf-8"' in stub, (
"File-based RPC stub does not specify encoding=\"utf-8\"; "
"this will corrupt non-ASCII tool results on non-UTF-8 locales."
)
def test_stub_source_roundtrips_through_utf8(self):
"""Concrete regression: write the generated stub to a temp file
using ``encoding="utf-8"``, then parse it. This is what the
sandbox does, and it must succeed even when the stub contains
em-dashes (which it does; check the transport-header docstring).
"""
from tools.code_execution_tool import generate_hermes_tools_module
import tempfile, ast
stub = generate_hermes_tools_module(
["terminal", "read_file", "write_file"], transport="uds"
)
# Sanity: stub actually contains a non-ASCII character, otherwise
# this test wouldn't prove anything meaningful.
non_ascii = [c for c in stub if ord(c) > 127]
assert non_ascii, (
"Generated stub is pure ASCII — test is meaningless. If the "
"stub's docstrings have lost their em-dashes, update this "
"assertion, but be aware the original regression is no longer "
"covered."
)
with tempfile.NamedTemporaryFile(
mode="w", suffix=".py", delete=False, encoding="utf-8"
) as f:
f.write(stub)
tmp_path = f.name
try:
# Re-read and parse exactly like the child Python would.
with open(tmp_path, encoding="utf-8") as fh:
round_tripped = fh.read()
assert round_tripped == stub, "UTF-8 round-trip corrupted the stub"
ast.parse(round_tripped) # must not raise SyntaxError
finally:
os.unlink(tmp_path)
@pytest.mark.skipif(
sys.platform != "win32",
reason="cp1252 default-encoding regression is Windows-specific",
)
def test_windows_default_encoding_would_have_failed(self):
"""Negative control: prove that on Windows, writing the stub
*without* ``encoding="utf-8"`` would corrupt the file. If this
test ever starts failing (i.e. default write succeeds), it means
Python's default encoding has changed and the explicit UTF-8
requirement may be obsolete; reconsider the fix."""
from tools.code_execution_tool import generate_hermes_tools_module
import tempfile
stub = generate_hermes_tools_module(["terminal"], transport="uds")
# Find a non-ASCII character we can use to prove the corruption.
non_ascii = [c for c in stub if ord(c) > 127]
if not non_ascii:
pytest.skip("stub has no non-ASCII chars — nothing to corrupt")
# Write with default encoding (simulating the old buggy code).
with tempfile.NamedTemporaryFile(
mode="w", suffix=".py", delete=False
) as f:
try:
f.write(stub)
tmp_path = f.name
wrote_successfully = True
except UnicodeEncodeError:
# Default encoding can't even encode it — that's the bug
# in a different form. Still proves the point.
tmp_path = f.name
wrote_successfully = False
try:
if not wrote_successfully:
# Default-encoding write raised outright. The bug is real.
return
# Read back as UTF-8 (what Python does on import).
with open(tmp_path, encoding="utf-8") as fh:
try:
fh.read()
# If this succeeds on Windows, the platform default is
# already UTF-8 (e.g. Python 3.15 with UTF-8 mode on).
# In that case the explicit encoding= is belt-and-
# suspenders but no longer strictly required. Skip.
pytest.skip(
"Default text-file encoding is UTF-8-compatible on "
"this Windows build — explicit encoding= is no "
"longer load-bearing, but keep it for belt-and-"
"suspenders."
)
except UnicodeDecodeError:
# Exactly the failure mode that motivated the fix.
pass
finally:
os.unlink(tmp_path)
# ---------------------------------------------------------------------------
# UTF-8 stdio regression test
# ---------------------------------------------------------------------------
#
# The third Windows-specific sandbox bug: after the UTF-8 file-write fix
# let the child import hermes_tools, a user script that printed non-ASCII
# to stdout still crashed with:
#
# UnicodeEncodeError: 'charmap' codec can't encode character '\u2192'
# in position N: character maps to <undefined>
#
# Python's sys.stdout on Windows falls back to the ANSI code page
# (cp1252 on US-locale installs) when the process is attached to a pipe
# without PYTHONIOENCODING set. LLM-generated scripts routinely print
# em-dashes, arrows, accented chars, emoji — all of which break.
#
# Fix: spawn the child with PYTHONIOENCODING=utf-8 and PYTHONUTF8=1.
# The latter also makes open()'s default encoding UTF-8 (PEP 540),
# belt-and-suspenders for user scripts that do their own file I/O.
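#
# A rough sketch of why the stream encoding matters, assuming a text
# stream wrapped around a byte pipe. ``write_like_piped_stdout`` is a
# hypothetical helper for illustration, not part of the sandbox.

```python
import io


def write_like_piped_stdout(text: str, encoding: str) -> bytes:
    """Encode text the way a piped text stream with this encoding would."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    stream.detach()  # keep ``raw`` usable after the wrapper goes away
    return raw.getvalue()


message = "done \u2192 next"  # U+2192 has no cp1252 mapping

try:
    write_like_piped_stdout(message, "cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

utf8_output = write_like_piped_stdout(message, "utf-8")
```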
class TestChildStdioIsUtf8:
"""Verify the sandbox child is spawned with UTF-8 stdio encoding,
so LLM scripts can print non-ASCII without crashing on Windows."""
def test_popen_env_sets_pythonioencoding_utf8(self):
"""Source-level check: the Popen call site must set
PYTHONIOENCODING=utf-8 in child_env."""
import tools.code_execution_tool as cet
src = open(cet.__file__, encoding="utf-8").read()
assert 'child_env["PYTHONIOENCODING"] = "utf-8"' in src, (
"PYTHONIOENCODING=utf-8 missing from child env — Windows "
"scripts that print non-ASCII will crash with "
"UnicodeEncodeError."
)
def test_popen_env_sets_pythonutf8_mode(self):
"""Source-level check: PYTHONUTF8=1 must be set too — it makes
open()'s default encoding UTF-8 in user-written file I/O."""
import tools.code_execution_tool as cet
src = open(cet.__file__, encoding="utf-8").read()
assert 'child_env["PYTHONUTF8"] = "1"' in src, (
"PYTHONUTF8=1 missing from child env — user scripts that "
"call open(path, 'w') without encoding= will produce "
"locale-encoded files on Windows."
)
def test_live_child_can_print_non_ascii(self):
"""Live regression: spawn a Python child with the same env
treatment the sandbox uses (PYTHONIOENCODING=utf-8 + PYTHONUTF8=1)
and verify it can print em-dashes, arrows, and emoji to stdout
without crashing. This is the exact scenario that broke in live
usage.
Runs on every OS; on POSIX the fix is belt-and-suspenders but
still load-bearing for C.ASCII locale environments.
"""
script = textwrap.dedent("""
import sys
# Mix of chars that cp1252 can't encode: arrow, emoji.
print("em-dash \\u2014 arrow \\u2192 emoji \\U0001f680")
sys.exit(0)
""").strip()
# Build a scrubbed env the same way the sandbox does, then apply
# the stdio overrides.
scrubbed = _scrub_child_env(os.environ, is_passthrough=_no_passthrough)
scrubbed["PYTHONIOENCODING"] = "utf-8"
scrubbed["PYTHONUTF8"] = "1"
result = subprocess.run(
[sys.executable, "-c", script],
env=scrubbed,
capture_output=True,
timeout=15,
# Don't decode at the subprocess boundary — we want to check
# the raw bytes match UTF-8, same as what the sandbox does.
)
assert result.returncode == 0, (
f"Child crashed printing non-ASCII:\n"
f" stdout (raw): {result.stdout!r}\n"
f" stderr (raw): {result.stderr!r}"
)
decoded = result.stdout.decode("utf-8")
assert "\u2014" in decoded, f"em-dash missing from output: {decoded!r}"
assert "\u2192" in decoded, f"arrow missing from output: {decoded!r}"
assert "\U0001f680" in decoded, f"emoji missing from output: {decoded!r}"
@pytest.mark.skipif(
sys.platform != "win32",
reason="cp1252 stdout default is Windows-specific",
)
def test_windows_child_without_utf8_env_would_fail(self):
"""Negative control: spawn a Python child *without* our env
overrides and prove that on Windows, printing non-ASCII fails.
If this ever starts passing, Python has changed its default
stdio encoding on Windows and the fix may be obsolete, but
keep the env vars anyway for belt-and-suspenders."""
script = textwrap.dedent("""
import sys
print("em-dash \\u2014 arrow \\u2192")
sys.exit(0)
""").strip()
# Scrubbed env WITHOUT the PYTHONIOENCODING / PYTHONUTF8 overrides.
# Also scrub PYTHONUTF8 and PYTHONIOENCODING from the inherited
# env so we reproduce the buggy state even if the parent test
# runner has them set.
scrubbed = _scrub_child_env(os.environ, is_passthrough=_no_passthrough)
for k in ("PYTHONIOENCODING", "PYTHONUTF8", "PYTHONLEGACYWINDOWSSTDIO"):
scrubbed.pop(k, None)
result = subprocess.run(
[sys.executable, "-c", script],
env=scrubbed,
capture_output=True,
text=False,
timeout=15,
)
# Either the child crashed (expected), or modern Python handled
# it anyway — in which case the fix is still defensive but no
# longer strictly required. Skip with a note if so.
if result.returncode == 0 and b"\xe2\x80\x94" in result.stdout:
pytest.skip(
"This Python/Windows build handles non-ASCII stdout even "
"without PYTHONIOENCODING/PYTHONUTF8 — fix is defensive "
"but no longer strictly load-bearing. Keep the env vars "
"for older Python builds and C.ASCII-locale containers."
)
# Otherwise: crash OR garbled output — both count as proving the
# bug is real on this system.
@@ -1,812 +0,0 @@
"""Behavioral tests for Windows-specific compatibility fixes.
Complements ``tests/tools/test_windows_compat.py`` (which does source-level
pattern linting) with cross-platform-mocked tests that exercise the actual
code paths Hermes takes on native Windows.
Runs on Linux CI: every test mocks ``sys.platform``, ``subprocess.run``,
and ``os.kill`` as needed to simulate Windows behavior without requiring a
Windows runner.
"""
from __future__ import annotations
import importlib
import os
import signal
import subprocess
import sys
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
# ---------------------------------------------------------------------------
# configure_windows_stdio
# ---------------------------------------------------------------------------
class TestConfigureWindowsStdio:
"""``hermes_cli.stdio.configure_windows_stdio`` wiring.
The function must:
- be a no-op on non-Windows
- only configure once per process (idempotent)
- set PYTHONIOENCODING / PYTHONUTF8 without overriding explicit user settings
- reconfigure sys.stdout/stderr/stdin to UTF-8 on Windows
- flip the console code page to CP_UTF8 (65001) via ctypes
- respect HERMES_DISABLE_WINDOWS_UTF8 opt-out
"""
@pytest.fixture(autouse=True)
def _reset_configured(self, monkeypatch):
"""Reload the module before each test so the _CONFIGURED flag resets."""
# Remove from sys.modules so import triggers a fresh load
sys.modules.pop("hermes_cli.stdio", None)
# Fresh import now; tests import from hermes_cli.stdio themselves,
# but this guarantees the module they get is a brand-new copy.
import hermes_cli.stdio as _s
_s._CONFIGURED = False
yield
sys.modules.pop("hermes_cli.stdio", None)
def test_no_op_on_posix(self):
from hermes_cli import stdio
assert stdio.is_windows() is False
result = stdio.configure_windows_stdio()
assert result is False
def test_idempotent(self):
from hermes_cli import stdio
stdio.configure_windows_stdio()
# Second call returns False because _CONFIGURED is set
assert stdio.configure_windows_stdio() is False
def test_windows_path_sets_env_and_reconfigures_streams(self, monkeypatch):
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
# Pretend the user has no prior setting
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("HERMES_DISABLE_WINDOWS_UTF8", raising=False)
monkeypatch.delenv("EDITOR", raising=False)
monkeypatch.delenv("VISUAL", raising=False)
reconfigure_calls = []
def fake_reconfigure(stream, *, encoding="utf-8", errors="replace"):
reconfigure_calls.append((stream, encoding, errors))
cp_calls = []
def fake_flip():
cp_calls.append(True)
monkeypatch.setattr(stdio, "_reconfigure_stream", fake_reconfigure)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", fake_flip)
# Pretend notepad.exe is on PATH (it always is on real Windows hosts,
# but not on the Linux CI runner — mock it so the editor default
# survives).
monkeypatch.setattr(stdio, "_default_windows_editor", lambda: "notepad")
result = stdio.configure_windows_stdio()
assert result is True
assert os.environ.get("PYTHONIOENCODING") == "utf-8"
assert os.environ.get("PYTHONUTF8") == "1"
# EDITOR must be set so prompt_toolkit's open_in_editor finds
# a working program on Windows (it defaults to /usr/bin/nano).
assert os.environ.get("EDITOR") == "notepad"
assert len(cp_calls) == 1 # SetConsoleOutputCP path hit
assert len(reconfigure_calls) == 3 # stdout, stderr, stdin
def test_respects_existing_editor_var(self, monkeypatch):
"""User's explicit EDITOR wins over our default."""
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.setenv("EDITOR", "code --wait")
monkeypatch.setattr(stdio, "_reconfigure_stream", lambda *a, **kw: None)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", lambda: None)
monkeypatch.setattr(stdio, "_default_windows_editor", lambda: "notepad")
stdio.configure_windows_stdio()
assert os.environ["EDITOR"] == "code --wait"
def test_respects_existing_visual_var(self, monkeypatch):
"""VISUAL takes precedence over our EDITOR default too."""
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.delenv("EDITOR", raising=False)
monkeypatch.setenv("VISUAL", "nvim")
monkeypatch.setattr(stdio, "_reconfigure_stream", lambda *a, **kw: None)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", lambda: None)
monkeypatch.setattr(stdio, "_default_windows_editor", lambda: "notepad")
stdio.configure_windows_stdio()
# EDITOR should NOT be set when VISUAL already is (prompt_toolkit
# checks VISUAL first anyway, but we also shouldn't override it).
assert os.environ.get("EDITOR", "") != "notepad"
assert os.environ["VISUAL"] == "nvim"
def test_respects_existing_env_var(self, monkeypatch):
"""User's explicit PYTHONIOENCODING wins over our default."""
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.setenv("PYTHONIOENCODING", "latin-1")
monkeypatch.setattr(stdio, "_reconfigure_stream", lambda *a, **kw: None)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", lambda: None)
stdio.configure_windows_stdio()
assert os.environ["PYTHONIOENCODING"] == "latin-1"
@pytest.mark.parametrize("optout", ["1", "true", "True", "yes"])
def test_disable_flag_short_circuits(self, monkeypatch, optout):
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.setenv("HERMES_DISABLE_WINDOWS_UTF8", optout)
reconfigure_hit = []
monkeypatch.setattr(
stdio,
"_reconfigure_stream",
lambda *a, **kw: reconfigure_hit.append(True),
)
result = stdio.configure_windows_stdio()
assert result is False
assert reconfigure_hit == [], "opt-out must skip stream reconfiguration"
def test_reconfigure_stream_handles_missing_method(self, monkeypatch):
"""StringIO-like objects without .reconfigure() must not blow up."""
from hermes_cli import stdio
import io
buf = io.StringIO()
# Must not raise
stdio._reconfigure_stream(buf)
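# A minimal sketch of the "do not override explicit user settings" rule
# checked above, assuming plain ``dict.setdefault`` semantics.
# ``apply_utf8_env_defaults`` is a hypothetical name; the real
# ``hermes_cli.stdio`` implementation may differ.

```python
def apply_utf8_env_defaults(env):
    """Set UTF-8 defaults without clobbering values the user already chose."""
    env.setdefault("PYTHONIOENCODING", "utf-8")
    env.setdefault("PYTHONUTF8", "1")
    return env


fresh = apply_utf8_env_defaults({})
explicit = apply_utf8_env_defaults({"PYTHONIOENCODING": "latin-1"})
```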
# ---------------------------------------------------------------------------
# terminate_pid — the centralized kill primitive
# ---------------------------------------------------------------------------
class TestTerminatePidRoutingOnWindows:
"""``gateway.status.terminate_pid`` must use taskkill /T /F on Windows.
On Linux we can't reload gateway/status with sys.platform=win32 because
the module unconditionally imports ``msvcrt`` in that branch. Instead
we patch the module-level ``_IS_WINDOWS`` flag and ``subprocess.run``
on the already-loaded module, which exercises the same branching code.
"""
def test_force_uses_taskkill_on_windows(self, monkeypatch):
from gateway import status
captured = {}
def fake_run(args, **kwargs):
captured["args"] = args
result = MagicMock()
result.returncode = 0
result.stderr = ""
result.stdout = ""
return result
monkeypatch.setattr(status, "_IS_WINDOWS", True)
monkeypatch.setattr(status.subprocess, "run", fake_run)
status.terminate_pid(12345, force=True)
assert captured["args"][0] == "taskkill"
assert "/PID" in captured["args"]
assert "12345" in captured["args"]
assert "/T" in captured["args"]
assert "/F" in captured["args"]
def test_force_taskkill_failure_raises_oserror(self, monkeypatch):
from gateway import status
def fake_run(args, **kwargs):
result = MagicMock()
result.returncode = 128
result.stderr = "ERROR: The process cannot be terminated."
result.stdout = ""
return result
monkeypatch.setattr(status, "_IS_WINDOWS", True)
monkeypatch.setattr(status.subprocess, "run", fake_run)
with pytest.raises(OSError, match="cannot be terminated"):
status.terminate_pid(12345, force=True)
def test_graceful_on_windows_uses_os_kill_sigterm(self, monkeypatch):
"""Non-force path calls os.kill with SIGTERM (Windows has no SIGKILL).
``terminate_pid(pid)`` with force=False bypasses the taskkill branch
and uses ``os.kill`` directly, so the platform doesn't actually matter
for the signal choice. Verifies the getattr fallback works.
"""
from gateway import status
captured = {}
def fake_kill(pid, sig):
captured["pid"] = pid
captured["sig"] = sig
monkeypatch.setattr(status.os, "kill", fake_kill)
status.terminate_pid(99, force=False)
assert captured["pid"] == 99
assert captured["sig"] == signal.SIGTERM
def test_taskkill_not_found_falls_back_to_os_kill(self, monkeypatch):
"""On Windows without taskkill (WinPE, containers), fall back gracefully."""
from gateway import status
captured = {}
def fake_run(args, **kwargs):
raise FileNotFoundError(2, "taskkill not found")
def fake_kill(pid, sig):
captured["pid"] = pid
captured["sig"] = sig
monkeypatch.setattr(status, "_IS_WINDOWS", True)
monkeypatch.setattr(status.subprocess, "run", fake_run)
monkeypatch.setattr(status.os, "kill", fake_kill)
status.terminate_pid(42, force=True)
assert captured["pid"] == 42
assert captured["sig"] == signal.SIGTERM
# ---------------------------------------------------------------------------
# SIGKILL fallback pattern
# ---------------------------------------------------------------------------
class TestSigkillFallback:
"""Modules that want SIGKILL must fall back to SIGTERM when absent."""
def test_getattr_fallback_works_when_sigkill_missing(self, monkeypatch):
"""The `getattr(signal, "SIGKILL", signal.SIGTERM)` pattern."""
# Build a stand-in signal module with no SIGKILL attribute
fake_signal = MagicMock()
del fake_signal.SIGKILL # ensure it's absent
fake_signal.SIGTERM = 15
result = getattr(fake_signal, "SIGKILL", fake_signal.SIGTERM)
assert result == 15
def test_getattr_fallback_prefers_sigkill_when_present(self):
"""On POSIX the fallback is a no-op: real SIGKILL wins."""
result = getattr(signal, "SIGKILL", signal.SIGTERM)
assert result == signal.SIGKILL
@pytest.mark.parametrize(
"module_path, line_pattern",
[
("hermes_cli.kanban_db", 'getattr(signal, "SIGKILL", signal.SIGTERM)'),
],
)
def test_module_uses_getattr_fallback(self, module_path, line_pattern):
"""Source-level check that our modules use the safe fallback."""
rel = module_path.replace(".", "/") + ".py"
root = Path(__file__).resolve().parents[2]
source = (root / rel).read_text(encoding="utf-8")
assert line_pattern in source, (
f"{rel} must use the getattr fallback pattern on its SIGKILL site"
)
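# The fallback pattern above can be sketched as a small helper.
# ``pick_kill_signal`` is a hypothetical name used only here; the
# injectable ``signal_module`` argument is an assumption that lets the
# sketch simulate a platform without SIGKILL.

```python
import signal
from types import SimpleNamespace


def pick_kill_signal(signal_module=signal):
    """Return SIGKILL where the platform defines it, else SIGTERM."""
    return getattr(signal_module, "SIGKILL", signal_module.SIGTERM)


# Stand-in for a signal module that has no SIGKILL attribute.
windows_like = SimpleNamespace(SIGTERM=15)
posix_like = SimpleNamespace(SIGTERM=15, SIGKILL=9)
```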
# ---------------------------------------------------------------------------
# OSError widening on os.kill(pid, 0) probes
# ---------------------------------------------------------------------------
class TestProcessRegistryOSErrorWidening:
"""_is_host_pid_alive must treat Windows' OSError as 'not alive'."""
def test_oserror_treated_as_not_alive(self, monkeypatch):
from tools.process_registry import ProcessRegistry
def fake_kill(pid, sig):
# Simulate Windows' WinError 87 for an unknown PID
raise OSError(22, "Invalid argument")
monkeypatch.setattr("tools.process_registry.os.kill", fake_kill)
assert ProcessRegistry._is_host_pid_alive(12345) is False
def test_permission_error_treated_as_not_alive(self, monkeypatch):
"""Conservative: PermissionError also means 'not alive' (matches existing behavior)."""
from tools.process_registry import ProcessRegistry
def fake_kill(pid, sig):
raise PermissionError(1, "Operation not permitted")
monkeypatch.setattr("tools.process_registry.os.kill", fake_kill)
assert ProcessRegistry._is_host_pid_alive(12345) is False
def test_zero_or_none_pid_returns_false_without_calling_kill(self, monkeypatch):
"""No wasted syscall on falsy pids."""
from tools.process_registry import ProcessRegistry
kill_calls = []
monkeypatch.setattr(
"tools.process_registry.os.kill",
lambda pid, sig: kill_calls.append(pid),
)
assert ProcessRegistry._is_host_pid_alive(None) is False
assert ProcessRegistry._is_host_pid_alive(0) is False
assert kill_calls == []
def test_alive_pid_returns_true(self, monkeypatch):
from tools.process_registry import ProcessRegistry
# os.kill returning None (default) means "probe succeeded → pid alive"
monkeypatch.setattr("tools.process_registry.os.kill", lambda pid, sig: None)
assert ProcessRegistry._is_host_pid_alive(os.getpid()) is True
# ---------------------------------------------------------------------------
# tzdata dependency
# ---------------------------------------------------------------------------
class TestTzdataDependencyDeclared:
"""Windows installs must pull tzdata for zoneinfo to work."""
def test_pyproject_declares_tzdata_for_win32(self):
root = Path(__file__).resolve().parents[2]
source = (root / "pyproject.toml").read_text(encoding="utf-8")
# The dependency line should be conditional on sys_platform == 'win32'
# and should NOT be in the core dependencies for Linux/macOS.
assert (
"tzdata>=2023.3; sys_platform == 'win32'" in source
or 'tzdata>=2023.3; sys_platform == "win32"' in source
), "tzdata must be a Windows-only dep in pyproject.toml dependencies"
# ---------------------------------------------------------------------------
# README / docs consistency
# ---------------------------------------------------------------------------
class TestReadmeNoLongerSaysWindowsUnsupported:
"""The README shouldn't claim native Windows isn't supported."""
def test_readme_does_not_say_not_supported(self):
root = Path(__file__).resolve().parents[2]
source = (root / "README.md").read_text(encoding="utf-8")
# Previous string (removed in this PR): "Native Windows is not supported"
assert "Native Windows is not supported" not in source, (
"README.md still says native Windows is not supported — update the "
"install copy to reflect the PowerShell installer."
)
def test_readme_mentions_powershell_installer(self):
root = Path(__file__).resolve().parents[2]
source = (root / "README.md").read_text(encoding="utf-8")
assert "install.ps1" in source, (
"README.md must point at scripts/install.ps1 for Windows users"
)
# ---------------------------------------------------------------------------
# pty_bridge graceful import on Windows
# ---------------------------------------------------------------------------
class TestWebServerPtyBridgeGuard:
"""The web server must not crash if pty_bridge can't import (Windows)."""
def test_import_guard_present_in_source(self):
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "web_server.py").read_text(encoding="utf-8")
assert "_PTY_BRIDGE_AVAILABLE" in source
assert "except ImportError" in source, (
"web_server.py must wrap the pty_bridge import in try/except ImportError"
)
def test_pty_handler_checks_availability_flag(self):
"""The /api/pty handler must short-circuit when the bridge is unavailable."""
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "web_server.py").read_text(encoding="utf-8")
assert "if not _PTY_BRIDGE_AVAILABLE" in source, (
"/api/pty handler must return a friendly error when PTY is unavailable"
)
# ---------------------------------------------------------------------------
# Entry points wire configure_windows_stdio
# ---------------------------------------------------------------------------
class TestEntryPointsConfigureStdio:
"""cli.py, hermes_cli/main.py, gateway/run.py must call configure_windows_stdio."""
@pytest.mark.parametrize(
"relpath",
["cli.py", "hermes_cli/main.py", "gateway/run.py"],
)
def test_entry_point_calls_configure_stdio(self, relpath):
root = Path(__file__).resolve().parents[2]
source = (root / relpath).read_text(encoding="utf-8")
assert "configure_windows_stdio" in source, (
f"{relpath} must call hermes_cli.stdio.configure_windows_stdio() "
"early in startup so Windows consoles render Unicode without crashing"
)
# ---------------------------------------------------------------------------
# _subprocess_compat shared helpers
# ---------------------------------------------------------------------------
class TestSubprocessCompatHelpers:
"""hermes_cli/_subprocess_compat.py: POSIX + Windows behaviour."""
def test_is_windows_matches_sys_platform(self):
from hermes_cli import _subprocess_compat as sc
assert sc.IS_WINDOWS == (sys.platform == "win32")
def test_resolve_node_command_returns_absolute_on_posix(self):
"""On Linux, resolve_node_command('sh', ['-c','echo hi']) picks up /bin/sh."""
from hermes_cli._subprocess_compat import resolve_node_command
# We can't assert "npm is on PATH" portably; use `sh` which is
# guaranteed on POSIX. On Windows the test only confirms the
# no-crash fallback path.
argv = resolve_node_command("sh", ["-c", "echo hi"])
assert argv[1:] == ["-c", "echo hi"]
# First element is either an absolute path (sh found) or the bare
# name (fallback) — both are acceptable behaviours.
def test_resolve_node_command_fallback_when_absent(self):
from hermes_cli._subprocess_compat import resolve_node_command
argv = resolve_node_command(
"zzz-definitely-not-on-path-xyzzy", ["--help"]
)
# Must fall back to the bare name — NOT return None, NOT crash.
assert argv[0] == "zzz-definitely-not-on-path-xyzzy"
assert argv[1:] == ["--help"]
def test_windows_flags_zero_on_posix(self):
from hermes_cli._subprocess_compat import (
windows_detach_flags,
windows_hide_flags,
)
if sys.platform != "win32":
assert windows_detach_flags() == 0
assert windows_hide_flags() == 0
def test_windows_detach_popen_kwargs_is_posix_equivalent_on_posix(self):
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
kwargs = windows_detach_popen_kwargs()
if sys.platform != "win32":
# POSIX path MUST produce start_new_session=True, which maps to
# os.setsid() in the child — identical to the unchanged main
# branch behaviour. Do NOT break Linux/macOS here.
assert kwargs == {"start_new_session": True}
else:
# Windows path must include creationflags with all 3 bits set.
assert "creationflags" in kwargs
assert kwargs["creationflags"] != 0
# No start_new_session on Windows (silently no-op there).
assert "start_new_session" not in kwargs
def test_windows_detach_flags_has_expected_win32_bits(self, monkeypatch):
"""Simulate Windows to verify flag bundle."""
from hermes_cli import _subprocess_compat as sc
monkeypatch.setattr(sc, "IS_WINDOWS", True)
flags = sc.windows_detach_flags()
# CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW
assert flags & 0x00000200, "missing CREATE_NEW_PROCESS_GROUP"
assert flags & 0x00000008, "missing DETACHED_PROCESS"
assert flags & 0x08000000, "missing CREATE_NO_WINDOW"
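# A sketch of a helper with the behaviour these tests expect, using the
# flag values asserted above. ``detach_popen_kwargs`` and its
# ``is_windows`` parameter are hypothetical; the real
# ``windows_detach_popen_kwargs`` may be structured differently.

```python
import sys

# Win32 process-creation flag values.
CREATE_NEW_PROCESS_GROUP = 0x00000200
DETACHED_PROCESS = 0x00000008
CREATE_NO_WINDOW = 0x08000000


def detach_popen_kwargs(is_windows=None):
    """Return Popen kwargs that detach the child on the current platform."""
    if is_windows is None:
        is_windows = sys.platform == "win32"
    if is_windows:
        flags = CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW
        return {"creationflags": flags}
    # POSIX: run the child in a new session.
    return {"start_new_session": True}
```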
# ---------------------------------------------------------------------------
# tui_gateway/entry.py signal installation survives absent POSIX signals
# ---------------------------------------------------------------------------
class TestTuiGatewayEntrySignalGuards:
"""Importing tui_gateway.entry must not crash when SIGPIPE/SIGHUP are absent.
Linux has both signals, so this is mostly a source-level invariant check
(no bare ``signal.SIGPIPE`` at module level without a ``hasattr`` guard).
On Windows the import would have raised AttributeError before this fix.
"""
def test_source_guards_each_signal_installation(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tui_gateway" / "entry.py").read_text(encoding="utf-8")
# Every signal.signal(...) at module scope must be preceded by a
# hasattr check. We look at the text: no bare "signal.signal("
# call should appear outside a function body without a guard.
# Simpler heuristic: all SIGPIPE / SIGHUP references outside the
# dict-building loop must be wrapped in hasattr.
assert 'hasattr(signal, "SIGPIPE")' in source
assert 'hasattr(signal, "SIGHUP")' in source
assert 'hasattr(signal, "SIGTERM")' in source
assert 'hasattr(signal, "SIGINT")' in source
def test_module_imports_cleanly(self):
"""Importing the module must not raise — verifies the guards work."""
# Drop any cached import so the module re-initialises
for mod in list(sys.modules):
if mod.startswith("tui_gateway"):
del sys.modules[mod]
import tui_gateway.entry # noqa: F401 # must not raise
# ---------------------------------------------------------------------------
# hermes_cli/kanban_db.py waitpid guard
# ---------------------------------------------------------------------------
class TestKanbanWaitpidWindowsGuard:
"""os.WNOHANG doesn't exist on Windows — the dispatcher tick reap loop
must be gated behind ``os.name != "nt"``."""
def test_source_gates_waitpid_loop(self):
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "kanban_db.py").read_text(encoding="utf-8")
# Find the waitpid call and confirm it's inside a POSIX gate.
idx = source.find("os.waitpid(-1, os.WNOHANG)")
assert idx != -1, "waitpid call must exist"
# Look backwards up to 400 chars for the gate.
preamble = source[max(0, idx - 400):idx]
assert 'os.name != "nt"' in preamble or "os.name != 'nt'" in preamble, (
"os.waitpid(-1, os.WNOHANG) must sit behind an os.name != 'nt' guard"
)
# ---------------------------------------------------------------------------
# code_execution_tool TCP loopback on Windows
# ---------------------------------------------------------------------------
class TestCodeExecutionTransportTcpFallback:
"""The RPC transport must fall back to TCP on Windows.
We can't easily execute the sandbox on Linux CI in Windows mode, but we
CAN assert that the generated client module supports both AF_UNIX and
AF_INET endpoints based on the HERMES_RPC_SOCKET format.
"""
def test_generated_client_handles_tcp_endpoint(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "code_execution_tool.py").read_text(encoding="utf-8")
# _UDS_TRANSPORT_HEADER body must parse both transports.
assert 'endpoint.startswith("tcp://")' in source, (
"generated sandbox client must accept tcp:// endpoints for Windows"
)
assert "socket.AF_INET" in source, (
"generated sandbox client must be able to open AF_INET sockets"
)
def test_server_side_branches_on_use_tcp_rpc(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "code_execution_tool.py").read_text(encoding="utf-8")
assert "_use_tcp_rpc = _IS_WINDOWS" in source
assert 'rpc_endpoint = f"tcp://{_host}:{_port}"' in source
# ---------------------------------------------------------------------------
# cron/scheduler.py /bin/bash dynamic resolution
# ---------------------------------------------------------------------------
class TestCronSchedulerBashResolution:
"""cron.scheduler must NOT hardcode /bin/bash — .sh scripts need a
dynamically-resolved bash so Windows (Git Bash) works."""
def test_source_uses_shutil_which_for_bash(self):
root = Path(__file__).resolve().parents[2]
source = (root / "cron" / "scheduler.py").read_text(encoding="utf-8")
# The old hardcoded path should be gone as the sole bash source.
# It may still appear as a POSIX fallback after shutil.which(), so
# we check for the shutil.which call near the .sh/.bash branch.
assert 'shutil.which("bash")' in source, (
"cron.scheduler must resolve bash dynamically via shutil.which"
)
def test_error_message_when_bash_missing(self):
root = Path(__file__).resolve().parents[2]
source = (root / "cron" / "scheduler.py").read_text(encoding="utf-8")
# The graceful-failure message must mention "bash not found" so
# Windows users without Git Bash see an actionable error instead
# of a WinError 2 traceback.
assert "bash not found" in source.lower()
# ---------------------------------------------------------------------------
# Node-ecosystem launcher resolution (npm / npx / node)
# ---------------------------------------------------------------------------
class TestNpmBareSpawnsResolved:
"""Every spawn site that launches ``npm``/``npx`` must resolve via
shutil.which / hermes_cli._subprocess_compat.resolve_node_command
so Windows can execute the .cmd batch shims."""
@pytest.mark.parametrize(
"relpath",
[
"hermes_cli/tools_config.py",
"hermes_cli/doctor.py",
"gateway/platforms/whatsapp.py",
"tools/browser_tool.py",
],
)
def test_no_bare_npm_or_npx_in_popen_argv(self, relpath):
"""Reject ``subprocess.run(["npm", ...])`` / ``["npx", ...]`` patterns.
Those fail on Windows with WinError 193. Callers must resolve
via shutil.which(...) and pass the absolute path (or fall back
to the bare name only as a last resort behind a variable).
"""
root = Path(__file__).resolve().parents[2]
source = (root / relpath).read_text(encoding="utf-8")
# The forbidden literal: a subprocess invocation that names npm
# or npx as a bare string inside an argv list.
forbidden_patterns = [
'["npm",',
'["npx",',
"['npm',",
"['npx',",
]
for pat in forbidden_patterns:
# Exception: strings inside comments are fine. We only fail if the
# literal appears in an argv position, which we approximate by
# skipping matches that sit on a commented line. Every other
# occurrence is reported as a failure.
idx = 0
while True:
idx = source.find(pat, idx)
if idx < 0:
break
# Look at the preceding 120 chars: if the current line already
# contains a "#", treat the match as commented out and skip it.
context = source[max(0, idx - 120):idx]
if "#" in context.split("\n")[-1]:
idx += len(pat)
continue
# Argv forms that START with a bare npm/npx are the bug.
raise AssertionError(
f"{relpath}: bare {pat!r} still present at offset {idx}; "
f"resolve via shutil.which(...) so Windows can execute .cmd shims"
)
# ---------------------------------------------------------------------------
# tools/environments/local.py Windows temp dir & PATH injection
# ---------------------------------------------------------------------------
class TestLocalEnvironmentWindowsTempDir:
"""LocalEnvironment.get_temp_dir must return a native Windows path on
Windows, NOT the POSIX ``/tmp`` literal (which Python can't open)."""
def test_posix_path_preserved_on_linux(self):
"""Linux/macOS behaviour MUST be unchanged: return /tmp or a
tempfile.gettempdir()-derived POSIX path. This is the 'do no harm'
test; regressions here break every Unix user's terminal tool."""
from tools.environments.local import LocalEnvironment
env = LocalEnvironment(cwd="/tmp", timeout=10, env={})
tmp_dir = env.get_temp_dir()
if sys.platform != "win32":
assert tmp_dir.startswith("/"), (
f"POSIX temp dir must start with '/'; got {tmp_dir!r}"
)
def test_source_has_windows_branch_using_hermes_home(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "environments" / "local.py").read_text(encoding="utf-8")
assert "if _IS_WINDOWS:" in source
assert "get_hermes_home" in source
assert 'cache_dir = get_hermes_home() / "cache" / "terminal"' in source
class TestLocalEnvironmentPathInjectionGated:
"""The /usr/bin PATH injection in _make_run_env must be POSIX-only."""
def test_source_gates_path_injection(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "environments" / "local.py").read_text(encoding="utf-8")
# The fix wraps the injection in `if not _IS_WINDOWS`.
assert 'not _IS_WINDOWS and "/usr/bin" not in existing_path.split(":")' in source
# ---------------------------------------------------------------------------
# cli.py git path normalization
# ---------------------------------------------------------------------------
class TestGitBashPathNormalization:
"""_normalize_git_bash_path should turn /c/Users/... into C:\\Users\\...
on Windows and leave paths unchanged on POSIX."""
def test_posix_noop(self):
"""Must NOT mutate paths on Linux/macOS."""
from cli import _normalize_git_bash_path
if sys.platform != "win32":
assert _normalize_git_bash_path("/home/teknium/foo") == "/home/teknium/foo"
assert _normalize_git_bash_path("/c/Users/foo") == "/c/Users/foo"
assert _normalize_git_bash_path("C:/Users/foo") == "C:/Users/foo"
assert _normalize_git_bash_path(None) is None
def test_empty_string_preserved(self):
from cli import _normalize_git_bash_path
assert _normalize_git_bash_path("") == ""
def test_windows_translation(self, monkeypatch):
"""Simulate Windows and verify /c/Users/... becomes C:\\Users\\..."""
import cli as cli_mod
monkeypatch.setattr(cli_mod.sys, "platform", "win32")
assert cli_mod._normalize_git_bash_path("/c/Users/foo") == r"C:\Users\foo"
assert cli_mod._normalize_git_bash_path("/C/Users/foo") == r"C:\Users\foo"
assert cli_mod._normalize_git_bash_path("/cygdrive/d/data") == r"D:\data"
assert cli_mod._normalize_git_bash_path("/mnt/c/Users") == r"C:\Users"
# Already-native path is preserved
assert cli_mod._normalize_git_bash_path(r"C:\Users\foo") == r"C:\Users\foo"
# Forward-slash Windows path is preserved (git on Windows often
# returns this form; it's valid for both bash and Python, so we
# don't need to translate).
assert cli_mod._normalize_git_bash_path("C:/Users/foo") == "C:/Users/foo"
class TestWorktreeSymlinkFallback:
""".worktreeinclude directory symlinks must fall back to copytree on
Windows (where symlink creation requires admin / Dev Mode)."""
def test_source_has_symlink_fallback(self):
root = Path(__file__).resolve().parents[2]
source = (root / "cli.py").read_text(encoding="utf-8")
# Look for the try/except that handles OSError around os.symlink
# with a shutil.copytree fallback.
assert "os.symlink(str(src_resolved), str(dst))" in source
assert "except (OSError, NotImplementedError)" in source
assert "shutil.copytree" in source
assert 'sys.platform == "win32"' in source
# ---------------------------------------------------------------------------
# Gateway detached watcher — Windows creationflags
# ---------------------------------------------------------------------------
class TestGatewayDetachedWatcherWindowsFlags:
"""launch_detached_profile_gateway_restart and the in-gateway update
launcher must use CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS on
Windows, not silent start_new_session=True."""
def test_hermes_cli_gateway_uses_compat_kwargs(self):
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "gateway.py").read_text(encoding="utf-8")
assert "windows_detach_popen_kwargs" in source, (
"hermes_cli/gateway.py must use the platform-aware detach helper"
)
# The legacy start_new_session=True on the outer Popen should be
# replaced by **windows_detach_popen_kwargs(). Inside the watcher
# STRING the old pattern is replaced by explicit creationflags.
assert "**windows_detach_popen_kwargs()" in source
def test_gateway_run_update_has_windows_branch(self):
root = Path(__file__).resolve().parents[2]
source = (root / "gateway" / "run.py").read_text(encoding="utf-8")
# Both the /restart and /update paths must have sys.platform=='win32' branches.
assert 'if sys.platform == "win32":' in source
# Windows branch uses windows_detach_popen_kwargs
assert "windows_detach_popen_kwargs" in source
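Review note on TestGitBashPathNormalization above: the expectations in that class can be satisfied by a small regex-based translator. The following is only a sketch written from the test cases, not the actual `cli._normalize_git_bash_path`; the function name, the `is_windows` parameter, and the regex are assumptions introduced for illustration.

```python
import re
import sys

# Hypothetical pattern: /c/..., /cygdrive/c/... and /mnt/c/... all carry a
# single drive letter as their first meaningful path segment.
_GIT_BASH_DRIVE = re.compile(r"^/(?:cygdrive/|mnt/)?([A-Za-z])(/.*)?$")


def normalize_git_bash_path(path, is_windows=None):
    """Translate a Git Bash style path to a native Windows path.

    `is_windows` is an illustrative override so the logic can be
    exercised on any platform; it is not part of the real function.
    """
    if is_windows is None:
        is_windows = sys.platform == "win32"
    # None, "" and every path on POSIX are returned untouched.
    if not path or not is_windows:
        return path
    match = _GIT_BASH_DRIVE.match(path)
    if match is None:
        # Already native (C:\...) or forward-slash Windows form (C:/...).
        return path
    drive = match.group(1)
    rest = match.group(2) or "/"
    return drive.upper() + ":" + rest.replace("/", "\\")
```

Passing the platform in explicitly is one way to avoid monkeypatching `sys.platform`; the test in the diff patches the module attribute instead, which works equally well.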
+11 -34
@@ -708,16 +708,7 @@ def _run_chrome_fallback_command(
)
return {"success": False, "error": hint}
# On Windows npx is npx.cmd — use shutil.which so CreateProcessW can
# execute the batch shim. shutil.which honours PATHEXT on Windows and
# returns the plain executable on POSIX. If npx isn't on PATH (Termux,
# bare container), fall back to the bare name and let Popen raise with
# a readable "FileNotFoundError: 'npx'" rather than WinError 193.
if browser_cmd == "npx agent-browser":
_npx_bin = shutil.which("npx") or "npx"
cmd_prefix = [_npx_bin, "agent-browser"]
else:
cmd_prefix = [browser_cmd]
cmd_prefix = ["npx", "agent-browser"] if browser_cmd == "npx agent-browser" else [browser_cmd]
base_args = cmd_prefix + ["--engine", "chrome", "--session", tmp_session, "--json"]
task_socket_dir = os.path.join(_socket_safe_tmpdir(), f"agent-browser-{tmp_session}")
@@ -751,7 +742,7 @@ def _run_chrome_fallback_command(
proc.wait()
return {"success": False, "error": f"Chrome fallback '{cmd}' timed out"}
try:
with open(stdout_path, "r", encoding="utf-8") as f:
with open(stdout_path, "r") as f:
stdout = f.read().strip()
if stdout:
return json.loads(stdout.split("\n")[-1])
@@ -1110,7 +1101,7 @@ def _write_owner_pid(socket_dir: str, session_name: str) -> None:
"""
try:
path = os.path.join(socket_dir, f"{session_name}.owner_pid")
with open(path, "w", encoding="utf-8") as f:
with open(path, "w") as f:
f.write(str(os.getpid()))
except OSError as exc:
logger.debug("Could not write owner_pid file for %s: %s",
@@ -1174,7 +1165,7 @@ def _reap_orphaned_browser_sessions():
owner_alive: Optional[bool] = None # None = owner_pid missing/unreadable
if os.path.isfile(owner_pid_file):
try:
owner_pid = int(Path(owner_pid_file).read_text(encoding="utf-8").strip())
owner_pid = int(Path(owner_pid_file).read_text().strip())
try:
os.kill(owner_pid, 0)
owner_alive = True
@@ -1184,10 +1175,6 @@ def _reap_orphaned_browser_sessions():
# Owner exists but we can't signal it (different uid).
# Treat as alive — don't reap someone else's session.
owner_alive = True
except OSError:
# Windows: gone PID raises OSError (WinError 87) instead
# of ProcessLookupError. Treat as dead to match POSIX.
owner_alive = False
except (ValueError, OSError):
owner_alive = None # corrupt file — fall through
@@ -1209,7 +1196,7 @@ def _reap_orphaned_browser_sessions():
continue
try:
daemon_pid = int(Path(pid_file).read_text(encoding="utf-8").strip())
daemon_pid = int(Path(pid_file).read_text().strip())
except (ValueError, OSError):
shutil.rmtree(socket_dir, ignore_errors=True)
continue
@@ -1224,11 +1211,6 @@ def _reap_orphaned_browser_sessions():
except PermissionError:
# Alive but owned by someone else — leave it alone
continue
except OSError:
# Windows raises OSError (WinError 87) for a gone PID — treat
# as dead and clean up, mirroring the ProcessLookupError branch.
shutil.rmtree(socket_dir, ignore_errors=True)
continue
# Daemon is alive and its owner is dead (or legacy + untracked). Reap.
try:
@@ -1777,12 +1759,7 @@ def _run_browser_command(
# Keep concrete executable paths intact, even when they contain spaces.
# Only the synthetic npx fallback needs to expand into multiple argv items.
# shutil.which resolves npx → npx.cmd on Windows; bare "npx" stays on POSIX.
if browser_cmd == "npx agent-browser":
_npx_bin = shutil.which("npx") or "npx"
cmd_prefix = [_npx_bin, "agent-browser"]
else:
cmd_prefix = [browser_cmd]
cmd_prefix = ["npx", "agent-browser"] if browser_cmd == "npx agent-browser" else [browser_cmd]
cmd_parts = cmd_prefix + backend_args + [
"--json",
@@ -1834,7 +1811,7 @@ def _run_browser_command(
# Detect AppArmor user namespace restrictions (Ubuntu 23.10+)
_userns_restrict = "/proc/sys/kernel/apparmor_restrict_unprivileged_userns"
try:
with open(_userns_restrict, encoding="utf-8") as _f:
with open(_userns_restrict) as _f:
if _f.read().strip() == "1":
_needs_sandbox_bypass = True
logger.debug(
@@ -1879,9 +1856,9 @@ def _run_browser_command(
result = {"success": False, "error": f"Command timed out after {timeout} seconds"}
# Fall through to fallback check below
else:
with open(stdout_path, "r", encoding="utf-8") as f:
with open(stdout_path, "r") as f:
stdout = f.read()
with open(stderr_path, "r", encoding="utf-8") as f:
with open(stderr_path, "r") as f:
stderr = f.read()
returncode = proc.returncode
@@ -3180,7 +3157,7 @@ def _cleanup_single_browser_session(task_id: str) -> None:
pid_file = os.path.join(socket_dir, f"{session_name}.pid")
if os.path.isfile(pid_file):
try:
daemon_pid = int(Path(pid_file).read_text(encoding="utf-8").strip())
daemon_pid = int(Path(pid_file).read_text().strip())
os.kill(daemon_pid, signal.SIGTERM)
logger.debug("Killed daemon pid %s for %s", daemon_pid, session_name)
except (ProcessLookupError, ValueError, PermissionError, OSError):
@@ -3323,7 +3300,7 @@ def _running_in_docker() -> bool:
if os.path.exists("/.dockerenv"):
return True
try:
with open("/proc/1/cgroup", "rt", encoding="utf-8") as fp:
with open("/proc/1/cgroup", "rt") as fp:
return "docker" in fp.read()
except OSError:
return False
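Review note on the two `cmd_prefix` hunks in this file: the difference between the bare `"npx"` form and the `shutil.which` form can be isolated in a helper. This is a minimal sketch; `build_cmd_prefix` is a hypothetical name, and only `shutil.which` and the `"npx agent-browser"` sentinel come from the diff.

```python
import shutil


def build_cmd_prefix(browser_cmd: str) -> list:
    """Return the argv prefix for a browser command.

    Only the synthetic "npx agent-browser" value expands into two argv
    items. Any other value is treated as one concrete executable path,
    even if it contains spaces.
    """
    if browser_cmd == "npx agent-browser":
        # shutil.which honours PATHEXT on Windows, so it can return the
        # npx.cmd shim; fall back to the bare name if npx is not on PATH.
        npx_bin = shutil.which("npx") or "npx"
        return [npx_bin, "agent-browser"]
    return [browser_cmd]
```

With the fallback in place, a missing `npx` still surfaces as an ordinary "file not found" error from the spawn call rather than being masked by the helper.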
+43 -185
@@ -47,13 +47,10 @@ import uuid
_IS_WINDOWS = platform.system() == "Windows"
from typing import Any, Dict, List, Optional
# Availability gate. On Windows we fall back to loopback TCP for the
# sandbox RPC transport (AF_UNIX is unreliable on Windows Python) — see
# ``_use_tcp_rpc`` in ``_execute_local`` below. That makes execute_code
# available on every platform Hermes itself runs on.
# Availability gate: UDS requires a POSIX OS
logger = logging.getLogger(__name__)
SANDBOX_AVAILABLE = True
SANDBOX_AVAILABLE = sys.platform != "win32"
# The 7 tools allowed inside the sandbox. The intersection of this list
# and the session's enabled tools determines which stubs are generated.
@@ -73,85 +70,6 @@ DEFAULT_MAX_TOOL_CALLS = 50
MAX_STDOUT_BYTES = 50_000 # 50 KB
MAX_STDERR_BYTES = 10_000 # 10 KB
# Environment variable scrubbing rules (shared between the local + remote
# backends). Secret-substring block is applied first; anything left must
# match either a safe prefix or, on Windows, an OS-essential name.
_SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
"TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
"XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
"HERMES_")
_SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
"PASSWD", "AUTH")
# Windows-only: a handful of variables are required by the OS/CRT itself.
# Without them, even stdlib calls like ``socket.socket()`` fail with
# WinError 10106 (Winsock can't locate mswsock.dll) and ``subprocess``
# can't resolve cmd.exe. These are well-known OS paths, not secrets, so
# we allow them through by exact name. The _SECRET_SUBSTRINGS block
# still runs as a safety net (none of these names match those substrings).
_WINDOWS_ESSENTIAL_ENV_VARS = frozenset({
"SYSTEMROOT", # %SYSTEMROOT%\System32 — Winsock needs this
"SYSTEMDRIVE", # C: (or wherever Windows lives)
"WINDIR", # usually same as SYSTEMROOT
"COMSPEC", # cmd.exe path — subprocess shell=True needs it
"PATHEXT", # .COM;.EXE;.BAT;... — shell lookup
"OS", # "Windows_NT" — some tools gate on this
"PROCESSOR_ARCHITECTURE",
"NUMBER_OF_PROCESSORS",
"PUBLIC", # C:\Users\Public
"ALLUSERSPROFILE", # C:\ProgramData — some stdlib paths use it
"PROGRAMDATA", # C:\ProgramData
"PROGRAMFILES",
"PROGRAMFILES(X86)",
"PROGRAMW6432",
"APPDATA", # %USERPROFILE%\AppData\Roaming — Python uses it
"LOCALAPPDATA", # %USERPROFILE%\AppData\Local
"USERPROFILE", # C:\Users\<name> — Python's expanduser uses it
"USERDOMAIN",
"USERNAME",
"HOMEDRIVE", # C:
"HOMEPATH", # \Users\<name>
"COMPUTERNAME",
})
def _scrub_child_env(source_env, is_passthrough=None, is_windows=None):
"""Produce the scrubbed child-process env for execute_code.
Rules (order matters):
1. Passthrough vars (skill- or config-declared) always pass.
2. Secret-substring names (KEY/TOKEN/etc.) are blocked.
3. Names matching a safe prefix pass.
4. On Windows, a small OS-essential allowlist passes by exact name;
without these the child can't even create a socket or spawn a
subprocess.
Extracted into a helper so tests can exercise the logic without
spawning a subprocess.
"""
if is_passthrough is None:
try:
from tools.env_passthrough import is_env_passthrough as _ep
except Exception:
_ep = lambda _: False # noqa: E731
is_passthrough = _ep
if is_windows is None:
is_windows = _IS_WINDOWS
scrubbed = {}
for k, v in source_env.items():
if is_passthrough(k):
scrubbed[k] = v
continue
if any(s in k.upper() for s in _SECRET_SUBSTRINGS):
continue
if any(k.startswith(p) for p in _SAFE_ENV_PREFIXES):
scrubbed[k] = v
continue
if is_windows and k.upper() in _WINDOWS_ESSENTIAL_ENV_VARS:
scrubbed[k] = v
return scrubbed
def check_sandbox_requirements() -> bool:
"""Code execution sandbox requires a POSIX OS for Unix domain sockets."""
@@ -317,27 +235,10 @@ _call_lock = threading.Lock()
''' + _COMMON_HELPERS + '''\
def _connect():
"""Connect to the parent's RPC server via the transport it picked.
HERMES_RPC_SOCKET can be either:
- a filesystem path (POSIX Unix domain socket, the default on
Linux and macOS)
- a string of the form ``tcp://127.0.0.1:<port>`` (Windows, where
AF_UNIX is unreliable, so the parent falls back to loopback TCP)
"""
global _sock
if _sock is None:
endpoint = os.environ["HERMES_RPC_SOCKET"]
if endpoint.startswith("tcp://"):
# tcp://host:port (host is always 127.0.0.1 in practice — we
# only bind loopback server-side)
_host_port = endpoint[len("tcp://"):]
_host, _, _port = _host_port.rpartition(":")
_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
_sock.connect((_host or "127.0.0.1", int(_port)))
else:
_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
_sock.connect(endpoint)
_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
_sock.connect(os.environ["HERMES_RPC_SOCKET"])
_sock.settimeout(300)
return _sock
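Review note on the `_connect` hunk above: the transport choice depends only on the shape of `HERMES_RPC_SOCKET`, so the parsing step can be checked without opening a socket. The sketch below assumes a hypothetical `parse_rpc_endpoint` helper that returns a tag instead of a socket family, which keeps it importable on platforms where `socket.AF_UNIX` is missing.

```python
def parse_rpc_endpoint(endpoint: str):
    """Split an RPC endpoint string into (transport, address).

    "tcp://host:port" yields ("tcp", (host, port)); anything else is
    treated as a filesystem path for a Unix domain socket.
    """
    prefix = "tcp://"
    if endpoint.startswith(prefix):
        # rpartition splits on the last colon, so the port is always the
        # final component; an empty host defaults to loopback.
        host, _, port = endpoint[len(prefix):].rpartition(":")
        return ("tcp", (host or "127.0.0.1", int(port)))
    return ("unix", endpoint)
```

A caller would then map `"tcp"` to `socket.AF_INET` and `"unix"` to `socket.AF_UNIX` before connecting.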
@@ -390,12 +291,9 @@ def _call(tool_name, args):
req_file = os.path.join(_RPC_DIR, f"req_{seq_str}")
res_file = os.path.join(_RPC_DIR, f"res_{seq_str}")
# Write request atomically (write to .tmp, then rename).
# encoding="utf-8" is critical: on Windows-hosted remote backends
# (or any non-UTF-8 locale) the default open() mode would mangle
# non-ASCII chars in tool args when encoding them as JSON.
# Write request atomically (write to .tmp, then rename)
tmp = req_file + ".tmp"
with open(tmp, "w", encoding="utf-8") as f:
with open(tmp, "w") as f:
json.dump({"tool": tool_name, "args": args, "seq": seq}, f)
os.rename(tmp, req_file)
@@ -408,7 +306,7 @@ def _call(tool_name, args):
time.sleep(poll_interval)
poll_interval = min(poll_interval * 1.2, 0.25) # Back off to 250ms
with open(res_file, encoding="utf-8") as f:
with open(res_file) as f:
raw = f.read()
# Clean up response file
@@ -517,7 +415,7 @@ def _rpc_server_loop(
# their status prints don't leak into the CLI spinner.
try:
_real_stdout, _real_stderr = sys.stdout, sys.stderr
devnull = open(os.devnull, "w", encoding="utf-8")
devnull = open(os.devnull, "w")
try:
sys.stdout = devnull
sys.stderr = devnull
@@ -791,7 +689,7 @@ def _rpc_poll_loop(
# Dispatch through the standard tool handler
try:
_real_stdout, _real_stderr = sys.stdout, sys.stderr
devnull = open(os.devnull, "w", encoding="utf-8")
devnull = open(os.devnull, "w")
try:
sys.stdout = devnull
sys.stderr = devnull
@@ -1056,8 +954,7 @@ def execute_code(
"""
if not SANDBOX_AVAILABLE:
return json.dumps({
"error": "execute_code sandbox is unavailable in this environment. "
"Use normal tool calls (terminal, read_file, write_file, ...) instead."
"error": "execute_code is not available on Windows. Use normal tool calls instead."
})
if not code or not code.strip():
@@ -1091,22 +988,8 @@ def execute_code(
# Use /tmp on macOS to avoid the long /var/folders/... path that pushes
# Unix domain socket paths past the 104-byte macOS AF_UNIX limit.
# On Linux, tempfile.gettempdir() already returns /tmp.
#
# Windows: Python 3.9+ added partial AF_UNIX support but the file-backed
# variant is flaky across Windows builds (requires Windows 10 1803+,
# still fails under some configurations, and the socket file can't live
# on the same temp drive as the script). Fall back to loopback TCP —
# same ephemeral port, same 1-connection listen queue, same serialized
# request/response framing. The generated client reads the transport
# selector from HERMES_RPC_SOCKET (path vs. ``tcp://host:port``).
_sock_tmpdir = "/tmp" if sys.platform == "darwin" else tempfile.gettempdir()
_use_tcp_rpc = _IS_WINDOWS
if _use_tcp_rpc:
sock_path = None # not used on Windows; TCP endpoint stored below
rpc_endpoint = None # set after bind()
else:
sock_path = os.path.join(_sock_tmpdir, f"hermes_rpc_{uuid.uuid4().hex}.sock")
rpc_endpoint = sock_path
sock_path = os.path.join(_sock_tmpdir, f"hermes_rpc_{uuid.uuid4().hex}.sock")
tool_call_log: list = []
tool_call_counter = [0] # mutable so the RPC thread can increment
@@ -1114,42 +997,21 @@ def execute_code(
server_sock = None
try:
# Write the auto-generated hermes_tools module.
# encoding="utf-8" is required on Windows — the stub and user code
# both contain non-ASCII characters (em-dashes in docstrings, plus
# whatever the user script carries). Python's default open() uses
# the system locale on Windows (cp1252 typically), which corrupts
# those bytes; the child then fails to import with a SyntaxError
# ("'utf-8' codec can't decode byte 0x97 in position ...") because
# Python source files are decoded as UTF-8 by default (PEP 3120).
# Write the auto-generated hermes_tools module
# sandbox_tools is already the correct set (intersection with session
# tools, or SANDBOX_ALLOWED_TOOLS as fallback — see lines above).
tools_src = generate_hermes_tools_module(list(sandbox_tools))
with open(os.path.join(tmpdir, "hermes_tools.py"), "w", encoding="utf-8") as f:
with open(os.path.join(tmpdir, "hermes_tools.py"), "w") as f:
f.write(tools_src)
# Write the user's script
with open(os.path.join(tmpdir, "script.py"), "w", encoding="utf-8") as f:
with open(os.path.join(tmpdir, "script.py"), "w") as f:
f.write(code)
# --- Start RPC server ---
# Two transports:
# POSIX: AF_UNIX stream socket on sock_path, chmod 0600 for
# owner-only access. Filesystem permissions gate the socket.
# Windows: AF_INET stream socket on 127.0.0.1 with an ephemeral
# port. No filesystem permission story, but loopback-only bind
# means only the current user's processes (not remote) can
# connect. HERMES_RPC_SOCKET is set to ``tcp://127.0.0.1:<port>``
# which the generated client parses to pick AF_INET.
if _use_tcp_rpc:
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.bind(("127.0.0.1", 0)) # ephemeral port
_host, _port = server_sock.getsockname()[:2]
rpc_endpoint = f"tcp://{_host}:{_port}"
else:
server_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server_sock.bind(sock_path)
os.chmod(sock_path, 0o600)
# --- Start UDS server ---
server_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server_sock.bind(sock_path)
os.chmod(sock_path, 0o600)
server_sock.listen(1)
rpc_thread = threading.Thread(
@@ -1168,32 +1030,31 @@ def execute_code(
# generated scripts. The child accesses tools via RPC, not direct API.
# Exception: env vars declared by loaded skills (via env_passthrough
# registry) or explicitly allowed by the user in config.yaml
# (terminal.env_passthrough) are passed through. On Windows, a small
# OS-essential allowlist (SYSTEMROOT, WINDIR, COMSPEC, ...) is also
# passed through — without those, the child can't create a socket
# or spawn a subprocess. See ``_scrub_child_env`` for the rules.
child_env = _scrub_child_env(os.environ)
child_env["HERMES_RPC_SOCKET"] = rpc_endpoint
# (terminal.env_passthrough) are passed through.
_SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
"TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
"XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
"HERMES_")
_SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
"PASSWD", "AUTH")
try:
from tools.env_passthrough import is_env_passthrough as _is_passthrough
except Exception:
_is_passthrough = lambda _: False # noqa: E731
child_env = {}
for k, v in os.environ.items():
# Passthrough vars (skill-declared or user-configured) always pass.
if _is_passthrough(k):
child_env[k] = v
continue
# Block vars with secret-like names.
if any(s in k.upper() for s in _SECRET_SUBSTRINGS):
continue
# Allow vars with known safe prefixes.
if any(k.startswith(p) for p in _SAFE_ENV_PREFIXES):
child_env[k] = v
child_env["HERMES_RPC_SOCKET"] = sock_path
child_env["PYTHONDONTWRITEBYTECODE"] = "1"
# Force UTF-8 for the child's stdio and default file encoding.
#
# Without this, on Windows sys.stdout is bound to the console code
# page (cp1252 on US-locale installs), and any script that does
# ``print("café")`` or ``print("→")`` crashes with:
#
# UnicodeEncodeError: 'charmap' codec can't encode character
# '\u2192' in position N: character maps to <undefined>
#
# PYTHONIOENCODING fixes sys.stdin/stdout/stderr.
# PYTHONUTF8=1 enables "UTF-8 mode" (PEP 540) which additionally
# makes ``open()``'s default encoding UTF-8, so user scripts that
# write files without specifying encoding= also work correctly.
#
# On POSIX both values usually match the locale default already,
# so setting them is harmless belt-and-suspenders for environments
# with a C/POSIX locale (containers, minimal base images).
child_env["PYTHONIOENCODING"] = "utf-8"
child_env["PYTHONUTF8"] = "1"
# Ensure the hermes-agent root is importable in the sandbox so
# repo-root modules are available to child scripts. We also prepend
# the staging tmpdir so ``from hermes_tools import ...`` resolves even
@@ -1441,10 +1302,7 @@ def execute_code(
import shutil
shutil.rmtree(tmpdir, ignore_errors=True)
try:
# Only UDS has a filesystem socket to unlink; TCP sockets are
# freed by server_sock.close() above.
if sock_path:
os.unlink(sock_path)
os.unlink(sock_path)
except OSError:
pass # already cleaned up or never created
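Review note on the environment scrubbing in this file: the ordering of the rules matters, because a name such as `HERMES_API_KEY` matches both a safe prefix and a secret substring. The sketch below is a reduced version of the rules described in the `_scrub_child_env` docstring; the shortened tuples, the `scrub_env` name, and the omission of the passthrough registry are simplifications made for illustration.

```python
# Reduced rule sets, for illustration only.
SAFE_PREFIXES = ("PATH", "HOME", "LANG", "HERMES_")
SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD")
WINDOWS_ESSENTIAL = frozenset({"SYSTEMROOT", "WINDIR", "COMSPEC"})


def scrub_env(source_env, is_windows):
    """Return the subset of source_env that is safe to hand to a child."""
    scrubbed = {}
    for name, value in source_env.items():
        # Secret-looking names are dropped first, even if a safe prefix
        # would otherwise let them through.
        if any(s in name.upper() for s in SECRET_SUBSTRINGS):
            continue
        if any(name.startswith(p) for p in SAFE_PREFIXES):
            scrubbed[name] = value
            continue
        # OS-essential names pass by exact match, on Windows only.
        if is_windows and name.upper() in WINDOWS_ESSENTIAL:
            scrubbed[name] = value
    return scrubbed


parent_env = {
    "PATH": "/usr/bin",
    "HERMES_API_KEY": "secret",
    "SYSTEMROOT": r"C:\Windows",
    "EDITOR": "vim",
}
posix_env = scrub_env(parent_env, is_windows=False)
windows_env = scrub_env(parent_env, is_windows=True)
```

In this example only `PATH` survives on POSIX, and `SYSTEMROOT` is added on Windows; `EDITOR` is dropped in both cases because it matches no rule.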
+1 -1
@@ -541,7 +541,7 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
},
"deliver": {
"type": "string",
"description": "Omit this parameter to auto-deliver back to the current chat and topic (recommended). Auto-detection preserves thread/topic context. Only set explicitly when the user asks to deliver somewhere OTHER than the current conversation. Values: 'origin' (same as omitting), 'local' (no delivery, save only), or platform:chat_id:thread_id for a specific destination. Examples: 'telegram:-1001234567890:17585', 'discord:#engineering', 'sms:+15551234567'. WARNING: 'platform:chat_id' without :thread_id loses topic targeting."
"description": "Omit this parameter to auto-deliver back to the current chat and topic (recommended). Auto-detection preserves thread/topic context. Only set explicitly when the user asks to deliver somewhere OTHER than the current conversation. Values: 'origin' (same as omitting), 'local' (no delivery, save only), 'all' (fan out to every connected home channel), or platform:chat_id:thread_id for a specific destination. Combine with comma: 'origin,all' delivers to the origin plus every other connected channel. Examples: 'telegram:-1001234567890:17585', 'discord:#engineering', 'sms:+15551234567', 'all'. WARNING: 'platform:chat_id' without :thread_id loses topic targeting. 'all' resolves at fire time, so a job created before a channel was wired up will pick it up automatically once connected."
},
"skills": {
"type": "array",
+15 -52
@@ -99,33 +99,12 @@ def get_sandbox_dir() -> Path:
def _pipe_stdin(proc: subprocess.Popen, data: str) -> None:
"""Write *data* to proc.stdin on a daemon thread to avoid pipe-buffer deadlocks.
On Windows, text-mode stdin (``text=True`` / ``encoding="utf-8"``)
translates ``\\n`` to ``\\r\\n`` as the data flows through the pipe,
which corrupts every write_file / patch call because the bytes that
land on disk include injected carriage returns. The file IS created,
but every subsequent byte-count / content compare against the
caller's ``\\n``-only string fails.
Workaround: write through ``proc.stdin.buffer`` (the underlying byte
buffer), encoding to UTF-8 ourselves. That bypasses Python's
newline translation entirely on every platform. No behaviour change
on POSIX: the byte sequence is identical to what text-mode would
produce there.
"""
"""Write *data* to proc.stdin on a daemon thread to avoid pipe-buffer deadlocks."""
def _write():
try:
# proc.stdin is a TextIOWrapper when text=True was set on the
# Popen. Its ``.buffer`` attribute is the raw BufferedWriter
# that bypasses newline translation. When Popen was created
# in byte mode, proc.stdin is already a BufferedWriter with
# no ``.buffer`` attribute — fall back to .write() directly.
raw = data.encode("utf-8") if isinstance(data, str) else data
target = getattr(proc.stdin, "buffer", proc.stdin)
target.write(raw)
target.close()
proc.stdin.write(data)
proc.stdin.close()
except (BrokenPipeError, OSError):
pass
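Review note on the `_pipe_stdin` hunk above: the newline translation can be reproduced on any platform by wrapping an in-memory buffer in a text layer configured with `newline="\r\n"`. That setting is used here as a stand-in for Windows text-mode behaviour and is an assumption of this sketch; `write_via_buffer` is a hypothetical helper mirroring the `getattr(..., "buffer", ...)` approach.

```python
import io


def write_via_buffer(stream, data):
    """Write data as UTF-8 bytes, bypassing any text-layer translation."""
    raw = data.encode("utf-8") if isinstance(data, str) else data
    # Text streams expose the underlying byte stream as .buffer; byte
    # streams have no such attribute and are written to directly.
    target = getattr(stream, "buffer", stream)
    target.write(raw)
    target.flush()


# Text-mode write: the wrapper rewrites every "\n" as "\r\n".
text_sink = io.BytesIO()
text_stream = io.TextIOWrapper(text_sink, encoding="utf-8", newline="\r\n")
text_stream.write("a\nb\n")
text_stream.flush()
translated = text_sink.getvalue()

# Byte-level write through .buffer: the payload arrives unchanged.
raw_sink = io.BytesIO()
raw_stream = io.TextIOWrapper(raw_sink, encoding="utf-8", newline="\r\n")
write_via_buffer(raw_stream, "a\nb\n")
untouched = raw_sink.getvalue()
```

The two buffers differ only by the injected carriage returns, which is the byte-count mismatch the docstring describes.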
@@ -158,7 +137,7 @@ def _load_json_store(path: Path) -> dict:
"""Load a JSON file as a dict, returning ``{}`` on any error."""
if path.exists():
try:
return json.loads(path.read_text(encoding="utf-8"))
return json.loads(path.read_text())
except Exception:
pass
return {}
@@ -167,7 +146,7 @@ def _load_json_store(path: Path) -> dict:
def _save_json_store(path: Path, data: dict) -> None:
"""Write *data* as pretty-printed JSON to *path*."""
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(data, indent=2), encoding="utf-8")
path.write_text(json.dumps(data, indent=2))
def _file_mtime_key(host_path: str) -> tuple[float, int] | None:
@@ -360,24 +339,15 @@ class BaseEnvironment(ABC):
# change the working directory (e.g. bashrc `cd ~`). Without this,
# pwd -P captures the profile's directory, not terminal.cwd.
_quoted_cwd = shlex.quote(self.cwd)
# Quote the snapshot / cwd-file paths so Git Bash on Windows handles
# ``C:/Users/...``-shaped paths without glob-splitting the colon or
# tripping on drive letters. On POSIX this is a no-op (no colons /
# special chars in a /tmp path). Previously unquoted interpolation
# caused ``C:/Users/.../hermes-snap-*.sh: No such file or directory``
# errors on Windows, leaking via stderr (merged into stdout on Linux
# backends) into every terminal-tool response.
_quoted_snap = shlex.quote(self._snapshot_path)
_quoted_cwd_file = shlex.quote(self._cwd_file)
bootstrap = (
f"export -p > {_quoted_snap}\n"
f"declare -f | grep -vE '^_[^_]' >> {_quoted_snap}\n"
f"alias -p >> {_quoted_snap}\n"
f"echo 'shopt -s expand_aliases' >> {_quoted_snap}\n"
f"echo 'set +e' >> {_quoted_snap}\n"
f"echo 'set +u' >> {_quoted_snap}\n"
f"export -p > {self._snapshot_path}\n"
f"declare -f | grep -vE '^_[^_]' >> {self._snapshot_path}\n"
f"alias -p >> {self._snapshot_path}\n"
f"echo 'shopt -s expand_aliases' >> {self._snapshot_path}\n"
f"echo 'set +e' >> {self._snapshot_path}\n"
f"echo 'set +u' >> {self._snapshot_path}\n"
f"builtin cd {_quoted_cwd} 2>/dev/null || true\n"
f"pwd -P > {_quoted_cwd_file} 2>/dev/null || true\n"
f"pwd -P > {self._cwd_file} 2>/dev/null || true\n"
f"printf '\\n{self._cwd_marker}%s{self._cwd_marker}\\n' \"$(pwd -P)\"\n"
)
try:
@@ -419,13 +389,6 @@ class BaseEnvironment(ABC):
re-dumps env vars, and emits CWD markers."""
escaped = command.replace("'", "'\\''")
# Quote the snapshot / cwd-file paths so Git Bash on Windows handles
# ``C:/Users/...``-shaped paths without glob-splitting the colon or
# tripping on drive letters. POSIX paths are unaffected. See
# :meth:`init_session` for the same fix on the bootstrap block.
_quoted_snap = shlex.quote(self._snapshot_path)
_quoted_cwd_file = shlex.quote(self._cwd_file)
parts = []
# Source snapshot (env vars from previous commands).
@@ -436,7 +399,7 @@ class BaseEnvironment(ABC):
# silent here, but the redirect is harmless.
if self._snapshot_ready:
parts.append(
f"source {_quoted_snap} >/dev/null 2>&1 || true"
f"source {self._snapshot_path} >/dev/null 2>&1 || true"
)
# Preserve bare ``~`` expansion, but rewrite ``~/...`` through
@@ -451,10 +414,10 @@ class BaseEnvironment(ABC):
# Re-dump env vars to snapshot (last-writer-wins for concurrent calls)
if self._snapshot_ready:
parts.append(f"export -p > {_quoted_snap} 2>/dev/null || true")
parts.append(f"export -p > {self._snapshot_path} 2>/dev/null || true")
# Write CWD to file (local reads this) and stdout marker (remote parses this)
parts.append(f"pwd -P > {_quoted_cwd_file} 2>/dev/null || true")
parts.append(f"pwd -P > {self._cwd_file} 2>/dev/null || true")
# Use a distinct line for the marker. The leading \n ensures
# the marker starts on its own line even if the command doesn't
# end with a newline (e.g. printf 'exact'). We'll strip this
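The effect of the quoting touched by this hunk can be seen in isolation. The following is a minimal sketch, not the Hermes implementation: `build_snapshot_line` is a hypothetical helper and the path is an illustrative value chosen because it contains a space.

```python
import shlex


def build_snapshot_line(snapshot_path: str) -> str:
    """Return a bash line that dumps exported variables to *snapshot_path*.

    Hypothetical helper for illustration only; the real code assembles a
    larger bootstrap script inline.
    """
    # shlex.quote wraps the path in single quotes when it contains a
    # character outside its safe set (a space, for example), so bash
    # treats the whole path as one word.
    return f"export -p > {shlex.quote(snapshot_path)}"


# Illustrative path with a space in the user name (an assumption).
unquoted = "export -p > C:/Users/Some Name/hermes-snap-1.sh"
quoted = build_snapshot_line("C:/Users/Some Name/hermes-snap-1.sh")

# shlex.split approximates how a POSIX shell would tokenize each line.
unquoted_words = shlex.split(unquoted)
quoted_words = shlex.split(quoted)
```

Tokenizing the two lines shows the difference: the unquoted form splits the path into two words at the space, while the quoted form keeps it as a single redirect target.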
+1 -1
View File
@@ -284,7 +284,7 @@ class FileSyncManager:
# Windows: no flock — run without serialization
self._sync_back_impl()
return
lock_fd = open(lock_path, "w", encoding="utf-8")
lock_fd = open(lock_path, "w")
try:
fcntl.flock(lock_fd, fcntl.LOCK_EX)
self._sync_back_impl()
+3 -53
View File
@@ -9,7 +9,6 @@ import signal
import subprocess
import tempfile
import time
from pathlib import Path
from tools.environments.base import BaseEnvironment, _pipe_stdin
@@ -190,25 +189,6 @@ def _find_bash() -> str:
if custom and os.path.isfile(custom):
return custom
# Prefer our own portable Git install first — this way a broken or
# partially-uninstalled system Git can't hijack the bash lookup. The
# install.ps1 installer always drops portable Git here when the user
# didn't already have a working system Git.
#
# Layouts (both checked so upgrades between MinGit and PortableGit
# installs work transparently):
# PortableGit: %LOCALAPPDATA%\hermes\git\bin\bash.exe (primary)
# MinGit: %LOCALAPPDATA%\hermes\git\usr\bin\bash.exe (legacy/32-bit fallback)
_local_appdata = os.environ.get("LOCALAPPDATA", "")
_hermes_portable_git = os.path.join(_local_appdata, "hermes", "git") if _local_appdata else ""
if _hermes_portable_git:
for candidate in (
os.path.join(_hermes_portable_git, "bin", "bash.exe"), # PortableGit (primary)
os.path.join(_hermes_portable_git, "usr", "bin", "bash.exe"), # MinGit fallback
):
if os.path.isfile(candidate):
return candidate
found = shutil.which("bash")
if found:
return found
@@ -216,7 +196,7 @@ def _find_bash() -> str:
for candidate in (
os.path.join(os.environ.get("ProgramFiles", r"C:\Program Files"), "Git", "bin", "bash.exe"),
os.path.join(os.environ.get("ProgramFiles(x86)", r"C:\Program Files (x86)"), "Git", "bin", "bash.exe"),
os.path.join(_local_appdata, "Programs", "Git", "bin", "bash.exe"),
os.path.join(os.environ.get("LOCALAPPDATA", ""), "Programs", "Git", "bin", "bash.exe"),
):
if candidate and os.path.isfile(candidate):
return candidate
@@ -255,15 +235,7 @@ def _make_run_env(env: dict) -> dict:
elif k not in _HERMES_PROVIDER_ENV_BLOCKLIST or _is_passthrough(k):
run_env[k] = v
existing_path = run_env.get("PATH", "")
# The "/usr/bin not already present → inject sane POSIX path" heuristic
# only makes sense on POSIX. On Windows the PATH separator is ";"
# (the split(":") above turns a full Windows PATH into a single
# unrecognisable chunk, which then triggers prepending POSIX paths
# to a Windows PATH — completely wrong). Skip the injection entirely
# on Windows; the native PATH already points at whatever shell
# Hermes is driving via _find_bash (Git Bash), and Git Bash itself
# prepends its MSYS2 /usr/bin equivalent via the shell-init files.
if not _IS_WINDOWS and "/usr/bin" not in existing_path.split(":"):
if "/usr/bin" not in existing_path.split(":"):
run_env["PATH"] = f"{existing_path}:{_SANE_PATH}" if existing_path else _SANE_PATH
# Per-profile HOME isolation: redirect system tool configs (git, ssh, gh,
@@ -385,29 +357,7 @@ class LocalEnvironment(BaseEnvironment):
Check the environment configured for this backend first so callers can
override the temp root explicitly (for example via terminal.env or a
custom TMPDIR), then fall back to the host process environment.
**Windows:** hardcoded ``/tmp`` is wrong in two ways: native Python
can't open the path, and the Windows default temp (``%TEMP%``) often
contains spaces (``C:\\Users\\Some Name\\AppData\\Local\\Temp``) that
break unquoted bash interpolations. Use a dedicated cache dir under
``HERMES_HOME`` instead: a single-word path, guaranteed to exist, and
the same string resolves in both Git Bash and native Python.
"""
if _IS_WINDOWS:
# Derive a Windows-safe temp dir under HERMES_HOME. Using
# forward slashes makes the same string work unchanged in bash
# command interpolations AND in Python ``open()`` — Windows
# accepts forward slashes in filesystem paths, and we control
# the path so we can guarantee no spaces.
try:
from hermes_constants import get_hermes_home
cache_dir = get_hermes_home() / "cache" / "terminal"
except Exception:
cache_dir = Path(tempfile.gettempdir()) / "hermes_terminal"
cache_dir.mkdir(parents=True, exist_ok=True)
# Force forward slashes so the same string serves both contexts.
return str(cache_dir).replace("\\", "/")
for env_var in ("TMPDIR", "TMP", "TEMP"):
candidate = self.env.get(env_var) or os.environ.get(env_var)
if candidate and candidate.startswith("/"):
@@ -562,7 +512,7 @@ class LocalEnvironment(BaseEnvironment):
``_run_bash`` recovery path will resolve a safe fallback if needed.
"""
try:
with open(self._cwd_file, encoding="utf-8") as f:
with open(self._cwd_file) as f:
cwd_path = f.read().strip()
if cwd_path and os.path.isdir(cwd_path):
self.cwd = cwd_path
+2 -12
View File
@@ -966,21 +966,11 @@ class ShellFileOperations(FileOperations):
verify_result = self._exec(verify_cmd)
if verify_result.exit_code != 0:
return PatchResult(error=f"Post-write verification failed: could not re-read {path}")
# Normalize line endings before comparing. On Windows, Python's
# default text-mode ``open()`` translates ``\n`` → ``\r\n`` on
# write, so the file on disk legitimately holds CRLFs while our
# ``new_content`` string has bare LFs. Without this normalization
# every patch on Windows returns a bogus "wrote 39, read 42"
# false-negative even though the edit landed correctly. POSIX
# backends don't translate, so this is a no-op there.
_verify_stdout_normalized = verify_result.stdout.replace("\r\n", "\n").replace("\r", "\n")
_new_content_normalized = new_content.replace("\r\n", "\n").replace("\r", "\n")
if _verify_stdout_normalized != _new_content_normalized:
if verify_result.stdout != new_content:
return PatchResult(error=(
f"Post-write verification failed for {path}: on-disk content "
f"differs from intended write "
f"(wrote {len(_new_content_normalized)} chars, read back "
f"{len(_verify_stdout_normalized)} chars after normalizing line endings). "
f"(wrote {len(new_content)} chars, read back {len(verify_result.stdout)}). "
"The patch did not persist. Re-read the file and try again."
))
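The line-ending comparison in this hunk can be sketched on its own. This is a minimal illustration under stated assumptions: `normalize_newlines` and `contents_match` are hypothetical names, and the sample strings stand in for the intended write and the content read back from disk.

```python
def normalize_newlines(text: str) -> str:
    """Collapse CRLF and bare CR line endings to LF."""
    # Replace CRLF first so the bare-CR pass does not double-convert it.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def contents_match(intended: str, read_back: str) -> bool:
    """Compare two strings while ignoring line-ending differences."""
    return normalize_newlines(intended) == normalize_newlines(read_back)


intended = "alpha\nbeta\ngamma\n"
# What a write that translates LF to CRLF would leave on disk.
on_disk = "alpha\r\nbeta\r\ngamma\r\n"

strict_equal = intended == on_disk
normalized_equal = contents_match(intended, on_disk)
```

A strict comparison reports a mismatch of three characters (one extra `\r` per line), whereas the normalized comparison treats the two strings as the same content.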
+1 -1
View File
@@ -1992,7 +1992,7 @@ def _snapshot_child_pids() -> set:
# Linux: read from /proc
try:
children_path = f"/proc/{my_pid}/task/{my_pid}/children"
with open(children_path, encoding="utf-8") as f:
with open(children_path) as f:
return {int(p) for p in f.read().split() if p.strip()}
except (FileNotFoundError, OSError, ValueError):
pass
+1 -5
View File
@@ -407,11 +407,7 @@ class ProcessRegistry:
try:
os.kill(pid, 0)
return True
except (ProcessLookupError, PermissionError, OSError):
# OSError covers Windows' WinError 87 for a gone PID, and the
# ``WinError 5 Access denied`` case — treat both as "can't probe
# or process is gone", which matches the conservative
# "not alive" semantics callers already handle.
except (ProcessLookupError, PermissionError):
return False
def _refresh_detached_session(self, session: Optional[ProcessSession]) -> Optional[ProcessSession]:
+7 -7
View File
@@ -169,7 +169,7 @@ def _scan_environments() -> List[EnvironmentInfo]:
continue
try:
with open(py_file, "r", encoding="utf-8") as f:
with open(py_file, "r") as f:
tree = ast.parse(f.read())
for node in ast.walk(tree):
@@ -333,7 +333,7 @@ async def _spawn_training_run(run_state: RunState, config_path: Path):
# File must stay open while the subprocess runs; we store the handle
# on run_state so _stop_training_run() can close it when done.
api_log_file = open(api_log, "w", encoding="utf-8") # closed by _stop_training_run
api_log_file = open(api_log, "w") # closed by _stop_training_run
run_state.api_log_file = api_log_file
run_state.api_process = subprocess.Popen(
["run-api"],
@@ -356,7 +356,7 @@ async def _spawn_training_run(run_state: RunState, config_path: Path):
# Step 2: Start the Tinker trainer
logger.info("[%s] Starting Tinker trainer: launch_training.py --config %s", run_id, config_path)
trainer_log_file = open(trainer_log, "w", encoding="utf-8") # closed by _stop_training_run
trainer_log_file = open(trainer_log, "w") # closed by _stop_training_run
run_state.trainer_log_file = trainer_log_file
run_state.trainer_process = subprocess.Popen(
[sys.executable, "launch_training.py", "--config", str(config_path)],
@@ -397,7 +397,7 @@ async def _spawn_training_run(run_state: RunState, config_path: Path):
logger.info("[%s] Starting environment: %s serve", run_id, env_info.file_path)
env_log_file = open(env_log, "w", encoding="utf-8") # closed by _stop_training_run
env_log_file = open(env_log, "w") # closed by _stop_training_run
run_state.env_log_file = env_log_file
run_state.env_process = subprocess.Popen(
[sys.executable, str(env_info.file_path), "serve", "--config", str(config_path)],
@@ -777,7 +777,7 @@ async def rl_start_training() -> str:
if "wandb_name" in _current_config and _current_config["wandb_name"]:
run_config["env"]["wandb_name"] = _current_config["wandb_name"]
with open(config_path, "w", encoding="utf-8") as f:
with open(config_path, "w") as f:
yaml.dump(run_config, f, default_flow_style=False)
# Create run state
@@ -1206,7 +1206,7 @@ async def rl_test_inference(
stderr_text = "\n".join(stderr_lines)
# Write logs to files for inspection outside CLI
with open(log_file, "w", encoding="utf-8") as f:
with open(log_file, "w") as f:
f.write(f"Command: {cmd_display}\n")
f.write(f"Working dir: {TINKER_ATROPOS_ROOT}\n")
f.write(f"Return code: {process.returncode}\n")
@@ -1238,7 +1238,7 @@ async def rl_test_inference(
# Parse the output JSONL file
if output_file.exists():
# Read JSONL file (one JSON object per line = one step)
with open(output_file, "r", encoding="utf-8") as f:
with open(output_file, "r") as f:
for line in f:
line = line.strip()
if not line:
+2 -2
View File
@@ -219,7 +219,7 @@ class GitHubAuth:
key_file = Path(key_path)
if not key_file.exists():
return None
private_key = key_file.read_text(encoding="utf-8")
private_key = key_file.read_text()
now = int(time.time())
payload = {
@@ -2667,7 +2667,7 @@ def append_audit_log(action: str, skill_name: str, source: str,
parts.append(extra)
line = " ".join(parts) + "\n"
try:
with open(AUDIT_LOG, "a", encoding="utf-8") as f:
with open(AUDIT_LOG, "a") as f:
f.write(line)
except OSError as e:
logger.debug("Could not write audit log: %s", e)
+3 -3
View File
@@ -126,7 +126,7 @@ def _read_failure_reason() -> str | None:
mtime = os.path.getmtime(p)
if (time.time() - mtime) >= _MARKER_TTL:
return None
with open(p, "r", encoding="utf-8") as f:
with open(p, "r") as f:
return f.read().strip()
except OSError:
return None
@@ -160,7 +160,7 @@ def _mark_install_failed(reason: str = ""):
try:
p = _failure_marker_path()
os.makedirs(os.path.dirname(p), exist_ok=True)
with open(p, "w", encoding="utf-8") as f:
with open(p, "w") as f:
f.write(reason)
except OSError:
pass
@@ -257,7 +257,7 @@ def _verify_cosign(checksums_path: str, sig_path: str, cert_path: str) -> bool |
def _verify_checksum(archive_path: str, checksums_path: str, archive_name: str) -> bool:
"""Verify SHA-256 of the archive against checksums.txt."""
expected = None
with open(checksums_path, encoding="utf-8") as f:
with open(checksums_path) as f:
for line in f:
# Format: "<hash> <filename>"
parts = line.strip().split(" ", 1)
+1 -1
View File
@@ -110,7 +110,7 @@ def detect_audio_environment() -> dict:
# WSL detection — PulseAudio bridge makes audio work in WSL.
# Only block if PULSE_SERVER is not configured.
try:
with open('/proc/version', 'r', encoding="utf-8") as f:
with open('/proc/version', 'r') as f:
if 'microsoft' in f.read().lower():
if os.environ.get('PULSE_SERVER'):
notices.append("Running in WSL with PulseAudio bridge")
+3 -2
View File
@@ -5,10 +5,11 @@ It implements ``WebSearchProvider`` only — there is no extract capability.
Configuration::
# ~/.hermes/config.yaml (SEARXNG_URL is a URL, not a secret — use config.yaml not .env)
SEARXNG_URL: http://localhost:8080
# ~/.hermes/.env
SEARXNG_URL=http://localhost:8080
# Use SearXNG for search, pair with any extract provider:
# ~/.hermes/config.yaml
web:
search_backend: "searxng"
extract_backend: "firecrawl"
+2 -2
View File
@@ -125,7 +125,7 @@ class CompressionConfig:
@classmethod
def from_yaml(cls, yaml_path: str) -> "CompressionConfig":
"""Load configuration from YAML file."""
with open(yaml_path, 'r', encoding="utf-8") as f:
with open(yaml_path, 'r') as f:
data = yaml.safe_load(f)
config = cls()
@@ -1174,7 +1174,7 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
# Save metrics
if self.config.metrics_enabled:
metrics_path = output_dir / self.config.metrics_output_file
with open(metrics_path, 'w', encoding="utf-8") as f:
with open(metrics_path, 'w') as f:
json.dump(self.aggregate_metrics.to_dict(), f, indent=2)
console.print(f"\n💾 Metrics saved to {metrics_path}")
+9 -25
View File
@@ -81,14 +81,11 @@ def _log_signal(signum: int, frame) -> None:
thread, and fall back to ``os._exit(0)`` so a wedged write/flush
can never strand the process.
"""
# SIGPIPE and SIGHUP don't exist on Windows — build the lookup
# dict from attributes that actually exist on the current platform.
_signal_names: dict[int, str] = {}
for _attr in ("SIGPIPE", "SIGTERM", "SIGHUP", "SIGINT", "SIGBREAK"):
_sig = getattr(signal, _attr, None)
if _sig is not None:
_signal_names[int(_sig)] = _attr
name = _signal_names.get(signum, f"signal {signum}")
name = {
signal.SIGPIPE: "SIGPIPE",
signal.SIGTERM: "SIGTERM",
signal.SIGHUP: "SIGHUP",
}.get(signum, f"signal {signum}")
try:
os.makedirs(os.path.dirname(_CRASH_LOG), exist_ok=True)
with open(_CRASH_LOG, "a", encoding="utf-8") as f:
@@ -143,23 +140,10 @@ def _log_signal(signum: int, frame) -> None:
# sys.exit(0) + _log_exit), which keeps the gateway alive as long as
# the main command pipe is still readable. Terminal signals still
# route through _log_signal so kills and hangups are diagnosable.
#
# SIGPIPE and SIGHUP don't exist on Windows; guard each installation
# with hasattr so ``python -m tui_gateway.entry`` (spawned by
# ``hermes --tui``) imports cleanly there. SIGBREAK (Windows' Ctrl+Break)
# is installed when available as a weaker equivalent of SIGHUP.
if hasattr(signal, "SIGPIPE"):
signal.signal(signal.SIGPIPE, signal.SIG_IGN)
if hasattr(signal, "SIGTERM"):
signal.signal(signal.SIGTERM, _log_signal)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, _log_signal)
elif hasattr(signal, "SIGBREAK"):
# Windows-only: Ctrl+Break in a console window delivers SIGBREAK.
# Route it through the same handler so kills are diagnosable.
signal.signal(signal.SIGBREAK, _log_signal)
if hasattr(signal, "SIGINT"):
signal.signal(signal.SIGINT, signal.SIG_IGN)
signal.signal(signal.SIGPIPE, signal.SIG_IGN)
signal.signal(signal.SIGTERM, _log_signal)
signal.signal(signal.SIGHUP, _log_signal)
signal.signal(signal.SIGINT, signal.SIG_IGN)
def _log_exit(reason: str) -> None:
+40 -15
View File
@@ -660,7 +660,7 @@ def _load_cfg() -> dict:
if _cfg_cache is not None and _cfg_mtime == mtime and _cfg_path == p:
return copy.deepcopy(_cfg_cache)
if p.exists():
with open(p, encoding="utf-8") as f:
with open(p) as f:
data = yaml.safe_load(f) or {}
else:
data = {}
@@ -679,7 +679,7 @@ def _save_cfg(cfg: dict):
import yaml
path = _hermes_home / "config.yaml"
with open(path, "w", encoding="utf-8") as f:
with open(path, "w") as f:
yaml.safe_dump(cfg, f)
with _cfg_lock:
_cfg_cache = copy.deepcopy(cfg)
@@ -1726,21 +1726,46 @@ def _validate_personality(value: str, cfg: dict | None = None) -> tuple[str, str
def _apply_personality_to_session(
sid: str, session: dict, new_prompt: str
) -> tuple[bool, dict | None]:
"""Apply a personality change to an existing session without resetting history.
Updates the agent's ephemeral system prompt in-place so the new personality
takes effect on the next turn. The cached base system prompt is left intact
(ephemeral_system_prompt is appended at API-call time, not baked into the
cache), which preserves prompt-cache hits.
Also appends a pivot marker to the conversation history (a ``user``-role
message whose text starts with ``[System: ...]``) so the model knows to
pivot its style from this point forward (without this, LLMs tend to
continue the tone established by earlier messages in the transcript).
Returns (history_reset, info); history_reset is always False since we
preserve the conversation.
"""
if not session:
return False, None
try:
info = _reset_session_agent(sid, session)
return True, info
except Exception:
if session.get("agent"):
agent = session["agent"]
agent.ephemeral_system_prompt = new_prompt or None
agent._cached_system_prompt = None
info = _session_info(agent)
_emit("session.info", sid, info)
return False, info
return False, None
agent = session.get("agent")
if agent:
agent.ephemeral_system_prompt = new_prompt or None
# Inject a pivot marker into history so the model sees the change point.
# This prevents it from pattern-matching its prior style.
if new_prompt:
marker = (
"[System: The user has changed the assistant's personality. "
"From this point forward, adopt the following persona and respond "
f"accordingly: {new_prompt}]"
)
else:
marker = (
"[System: The user has cleared the personality overlay. "
"From this point forward, respond in your normal default style.]"
)
with session["history_lock"]:
session["history"].append({"role": "user", "content": marker})
session["history_version"] = int(session.get("history_version", 0)) + 1
info = _session_info(agent)
_emit("session.info", sid, info)
return False, info
return False, None
def _cfg_max_turns(cfg: dict, default: int) -> int:
@@ -2588,7 +2613,7 @@ def _(rid, params: dict) -> dict:
f"hermes_conversation_{_time.strftime('%Y%m%d_%H%M%S')}.json"
)
try:
with open(filename, "w", encoding="utf-8") as f:
with open(filename, "w") as f:
json.dump(
{
"model": getattr(session["agent"], "model", ""),
+1 -10
View File
@@ -1,6 +1,6 @@
import { describe, expect, it } from 'vitest'
import { DURATION_PAD_LEN, padTickerDuration, padVerb, VERB_PAD_LEN } from '../components/appChrome.js'
import { padVerb, VERB_PAD_LEN } from '../components/appChrome.js'
import { VERBS } from '../content/verbs.js'
describe('FaceTicker verb padding', () => {
@@ -16,12 +16,3 @@ describe('FaceTicker verb padding', () => {
}
})
})
describe('FaceTicker duration padding', () => {
it('keeps elapsed segment width stable across second/minute boundaries', () => {
const samples = [9000, 10000, 59000, 60000, 61000, 3599000]
const lens = samples.map(ms => padTickerDuration(ms).length)
expect(new Set(lens)).toEqual(new Set([DURATION_PAD_LEN]))
})
})
@@ -31,4 +31,12 @@ describe('virtual height estimates', () => {
estimatedMsgHeight(msg, 80, { compact: false, details: false })
)
})
it('reserves two extra rows for the inter-turn separator on non-first user messages', () => {
const msg: Msg = { role: 'user', text: 'follow-up question' }
const base = estimatedMsgHeight(msg, 80, { compact: false, details: false })
const withSep = estimatedMsgHeight(msg, 80, { compact: false, details: false, withSeparator: true })
expect(withSep).toBe(base + 2)
})
})
+14 -1
View File
@@ -92,6 +92,19 @@ export const sessionCommands: SlashCommand[] = [
}
},
{
help: 'browse and resume previous sessions',
name: 'sessions',
run: (arg, ctx) => {
if (ctx.session.guardBusySessionSwitch('switch sessions')) {
return
}
if (!arg.trim()) {
return patchOverlayState({ picker: true })
}
}
},
{
help: 'attach an image',
name: 'image',
@@ -109,7 +122,7 @@ export const sessionCommands: SlashCommand[] = [
},
{
help: 'switch or reset personality (history reset on set)',
help: 'switch personality for this session',
name: 'personality',
run: (arg, ctx) => {
if (!arg) {
+8 -2
View File
@@ -264,15 +264,21 @@ export function useMainApp(gw: GatewayClient) {
return cache
}, [heightCacheKey])
// Index of the first user-role message — separator-rendering in
// appLayout.tsx skips this row, so the height estimator must skip it
// too. -1 when no user message exists yet (no row will gate true).
const firstUserIdx = useMemo(() => virtualRows.findIndex(r => r.msg.role === 'user'), [virtualRows])
const estimateRowHeight = useCallback(
(index: number) =>
estimatedMsgHeight(virtualRows[index]!.msg, cols, {
compact: ui.compact,
details: detailsVisible,
limitHistory: index < virtualRows.length - FULL_RENDER_TAIL_ITEMS,
userPrompt: ui.theme.brand.prompt
userPrompt: ui.theme.brand.prompt,
withSeparator: virtualRows[index]!.msg.role === 'user' && firstUserIdx >= 0 && index > firstUserIdx
}),
[cols, detailsVisible, ui.compact, ui.theme.brand.prompt, virtualRows]
[cols, detailsVisible, firstUserIdx, ui.compact, ui.theme.brand.prompt, virtualRows]
)
const syncHeightCache = useCallback(
+1 -3
View File
@@ -23,9 +23,7 @@ const HEART_COLORS = ['#ff5fa2', '#ff4d6d']
// Keep verb segment width stable so status-bar content to the right doesn't
// jitter when the ticker rotates between short/long verbs.
export const VERB_PAD_LEN = VERBS.reduce((max, v) => Math.max(max, v.length), 0) + 1 // + ellipsis
export const DURATION_PAD_LEN = 7 // e.g. " 9s", "1m 05s", "59m 59s"
export const padVerb = (verb: string) => `${verb}`.padEnd(VERB_PAD_LEN, ' ')
export const padTickerDuration = (ms: number) => fmtDuration(ms).padStart(DURATION_PAD_LEN, ' ')
// Compact alternates for the `emoji` and `ascii` indicator styles.
// Each entry is a fixed-width (display-width) glyph.
@@ -114,7 +112,7 @@ function FaceTicker({ color, startedAt }: { color: string; startedAt?: null | nu
// verb segment is hidden (e.g. `unicode` spinner style). When the verb
// IS shown, its trailing padding already provides the gap, so the extra
// space is harmless.
const durationSegment = startedAt ? ` · ${padTickerDuration(now - startedAt)}` : ''
const durationSegment = startedAt ? ` · ${fmtDuration(now - startedAt)}` : ''
return (
<Text color={color}>
+15
View File
@@ -76,6 +76,15 @@ const TranscriptPane = memo(function TranscriptPane({
return -1
}, [transcript.historyItems])
// Index of the first user-role message; every later user message gets a
// small dash above it so multi-turn transcripts visually segment by
// turn. -1 when no user message has been sent yet → no separator ever
// renders.
const firstUserIdx = useMemo(
() => transcript.historyItems.findIndex(m => m.role === 'user'),
[transcript.historyItems]
)
return (
<>
<ScrollBox
@@ -95,6 +104,12 @@ const TranscriptPane = memo(function TranscriptPane({
{transcript.virtualRows.slice(transcript.virtualHistory.start, transcript.virtualHistory.end).map(row => (
<Box flexDirection="column" key={row.key} ref={transcript.virtualHistory.measureRef(row.key)}>
{row.msg.role === 'user' && firstUserIdx >= 0 && row.index > firstUserIdx && (
<Box marginTop={1}>
<Text color={ui.theme.color.border}>───</Text>
</Box>
)}
{row.msg.kind === 'intro' ? (
<Box flexDirection="column" paddingTop={1}>
<Banner t={ui.theme} />
+16 -2
View File
@@ -43,8 +43,15 @@ export const estimatedMsgHeight = (
compact,
details,
limitHistory = false,
userPrompt = ''
}: { compact: boolean; details: boolean; limitHistory?: boolean; userPrompt?: string }
userPrompt = '',
withSeparator = false
}: {
compact: boolean
details: boolean
limitHistory?: boolean
userPrompt?: string
withSeparator?: boolean
}
) => {
if (msg.kind === 'intro') {
return msg.info?.version ? 9 : 5
@@ -80,5 +87,12 @@ export const estimatedMsgHeight = (
h++
}
// Inter-turn separator above non-first user messages (1 rule row + 1
// top-margin row). The render-side gate is in appLayout.tsx; we trust
// the caller to pass `withSeparator` only when it matches that gate.
if (withSeparator) {
h += 2
}
return Math.max(1, h)
}
+1 -11
View File
@@ -95,17 +95,7 @@ pytest tests/ -v
## Cross-Platform Compatibility
Hermes officially supports **Linux, macOS, WSL2, and native Windows** (via PowerShell install). Native Windows uses Git Bash (from [Git for Windows](https://git-scm.com/download/win)) for shell commands. A few features require POSIX kernel primitives and are gated: the dashboard's embedded PTY terminal pane (`/chat` tab) is WSL2-only.
When contributing code, keep these rules in mind:
- **Don't add unguarded `signal.SIGKILL` references.** It's not defined on Windows. Either route through `gateway.status.terminate_pid(pid, force=True)` (the centralized primitive that does `taskkill /T /F` on Windows and SIGKILL on POSIX), or fall back with `getattr(signal, "SIGKILL", signal.SIGTERM)`.
- **Catch `OSError` alongside `ProcessLookupError` on `os.kill(pid, 0)` probes.** Windows raises `OSError` (WinError 87, "parameter is incorrect") for an already-gone PID instead of `ProcessLookupError`.
- **Don't force the terminal to POSIX semantics.** `os.setsid`, `os.killpg`, `os.getpgid`, `os.fork` all raise on Windows — gate them with `if sys.platform != "win32":` or `if os.name != "nt":`.
- **Open files with an explicit `encoding="utf-8"`.** The Python default on Windows is the system locale (often cp1252), which mojibakes or crashes on non-Latin text.
- **Use `pathlib.Path` / `os.path.join` — never manually concat with `/`.** This matters less for strings the OS gives us back and more for strings we construct to hand to subprocesses.
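
The rules above can be sketched together in a few lines of Python. This is an illustrative sketch only: `FORCE_KILL_SIGNAL`, `pid_is_alive`, `current_process_group`, and `read_text_utf8` are hypothetical names, not Hermes APIs, and the centralized `gateway.status.terminate_pid` primitive mentioned above is not reproduced here.

```python
import os
import signal
import tempfile
from typing import Optional

# SIGKILL is not defined on Windows; fall back to SIGTERM there.
FORCE_KILL_SIGNAL = getattr(signal, "SIGKILL", signal.SIGTERM)


def pid_is_alive(pid: int) -> bool:
    """Probe *pid* with signal 0, treating any OSError as "not alive"."""
    try:
        os.kill(pid, 0)
        return True
    except OSError:
        # ProcessLookupError and PermissionError are OSError subclasses,
        # so this also covers the plain OSError that Windows raises.
        return False


def current_process_group() -> Optional[int]:
    """Return this process's group id, or None where the call is missing."""
    # os.getpgid is POSIX-only, so gate it on the platform.
    if os.name != "nt":
        return os.getpgid(0)
    return None


def read_text_utf8(path: str) -> str:
    """Read a text file with an explicit encoding, not the locale default."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Round-trip non-Latin text through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "note.txt")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("héllo ✓")
    round_tripped = read_text_utf8(sample)
```

Treating every `OSError` as "not alive" is the conservative choice: a process that exists but cannot be probed is reported the same way as one that is gone.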
Key patterns:
Hermes officially supports Linux, macOS, and WSL2. Native Windows is **not supported**, but the codebase includes some defensive coding patterns to avoid hard crashes in edge cases. Key rules:
### 1. `termios` and `fcntl` are Unix-only
+3 -32
View File
@@ -1,7 +1,7 @@
---
sidebar_position: 2
title: "Installation"
description: "Install Hermes Agent on Linux, macOS, WSL2, native Windows, or Android via Termux"
description: "Install Hermes Agent on Linux, macOS, WSL2, or Android via Termux"
---
# Installation
@@ -16,26 +16,6 @@ Get Hermes Agent up and running in under two minutes with the one-line installer
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
### Windows (native, PowerShell)
Open PowerShell and run:
```powershell
irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex
```
The installer handles **everything**: `uv`, Python 3.11, Node.js 22, `ripgrep`, `ffmpeg`, **and a portable Git Bash** (MinGit — a slim, self-contained Git for Windows distribution that Hermes uses for shell commands). It clones the repo under `%LOCALAPPDATA%\hermes\hermes-agent`, creates a virtualenv, and adds `hermes` to your **User PATH**. Restart your terminal (or open a new PowerShell window) after the install so PATH picks up.
**How Git is handled:**
1. If `git` is already on your PATH, the installer uses your existing install.
2. Otherwise it downloads portable **MinGit** (~45MB, from the official `git-for-windows` GitHub release) and unpacks it to `%LOCALAPPDATA%\hermes\git`. No admin rights required. Completely isolated — it won't interfere with any system Git install, broken or otherwise.
**Why not use winget?** Earlier designs auto-installed Git via `winget install Git.Git`, but winget fails badly when a system Git install is in a partial or broken state (exactly when users need the installer to just work). The portable MinGit approach sidesteps winget, the Windows installer registry, and any existing system Git entirely. If the Hermes Git install itself ever breaks, run `Remove-Item -Recurse -Force "$env:LOCALAPPDATA\hermes\git"` and re-run the installer; there is no system impact and no uninstall drama.
The installer also sets `HERMES_GIT_BASH_PATH` to the located `bash.exe` so Hermes resolves it deterministically in fresh shells.
If you prefer WSL2, the Linux installer above works inside it; both native and WSL installs can coexist without conflict (native data lives under `%LOCALAPPDATA%\hermes`, WSL data lives under `~/.hermes`).
### Android / Termux
Hermes now ships a Termux-aware installer path too:
@@ -53,17 +33,8 @@ The installer detects Termux automatically and switches to a tested Android flow
If you want the fully explicit path, follow the dedicated [Termux guide](./termux.md).
:::note Windows Feature Parity
Everything except the browser-based dashboard chat terminal runs natively on Windows:
- **CLI (`hermes chat`, `hermes setup`, `hermes gateway`, …)** — native, uses your default terminal
- **Gateway (Telegram, Discord, Slack, …)** — native, runs as a background PowerShell process
- **Cron scheduler** — native
- **Browser tool** — native (Chromium via Node.js)
- **MCP servers** — native (stdio and HTTP transports both supported)
- **Dashboard `/chat` terminal pane**: **WSL2 only** (uses a POSIX PTY; native Windows has no equivalent). The rest of the dashboard (sessions, jobs, metrics) works natively; only the embedded PTY terminal tab is gated.
Set `HERMES_DISABLE_WINDOWS_UTF8=1` in your environment if you hit an encoding-related bug and want to fall back to the legacy cp1252 stdio path (useful for bisecting).
:::warning Windows
Native Windows is **not supported**. Please install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run Hermes Agent from there. The install command above works inside WSL2.
:::
### What the Installer Does

Some files were not shown because too many files have changed in this diff.