Compare commits

36 Commits

Author SHA1 Message Date
alt-glitch ec1714e71f fix(install.ps1): handle uv stderr output with ErrorActionPreference=Stop
On fresh Windows installs, `uv python install` writes download progress to
stderr. With $ErrorActionPreference = 'Stop' (set globally in the script),
PowerShell wraps those stderr lines as ErrorRecord objects when captured via
2>&1, then throws a terminating exception — landing in the catch block even
though uv exits 0 and Python was installed successfully.

Fix: temporarily set ErrorActionPreference to 'Continue' around the native
uv call, then verify success with `uv python find` which is the reliable
signal regardless of exit code / stderr noise.

Tested on Windows 11 (build 26200) with ExecutionPolicy=Restricted,
uv 0.11.11, fresh machine with no prior Python install.
2026-05-08 14:13:06 +05:30
Teknium e0c03defd5 lint: enable PLW1514 as a blocking ruff rule
Turns the existing 'all lints disabled' stance into 'exactly one lint
enabled' — PLW1514 (unspecified-encoding) catches bare open() /
read_text() / write_text() calls that default to locale encoding on
Windows (cp1252), silently corrupting non-ASCII content.

Changes:

1. pyproject.toml
   - Migrate [tool.ruff] top-level select → [tool.ruff.lint].select
     (deprecated config location, ruff was warning on every run)
   - Add preview = true (PLW1514 is a preview rule in ruff 0.15.x)
   - select = ['PLW1514'] (exactly one rule, deliberately minimal)
   - per-file-ignores exempt tests/, plugins/, skills/, optional-skills/ —
     those have their own conventions or intentionally exercise edge cases

2. website/scripts/extract-skills.py
   - Fix 3 remaining bare opens (website/ was excluded from the main
     sweep but needed for ruff check . to go green)

3. tests/test_lint_config.py (new, 5 tests)
   - Guards against accidental rule removal.  If someone deletes PLW1514
     from the select list or disables preview mode, these tests fail
     with a loud message explaining why the rule exists.

Paired with a companion commit (held locally for now, pending a token
with workflow scope) that adds a blocking ruff step to .github/workflows/
lint.yml.  Without that companion commit, ruff is configured correctly
but nothing in CI enforces it yet — the advisory PR comment will still
surface new PLW1514 violations though, so authors see them.

Verified: ruff check . → exit 0, 0 violations across the repo.
Test suite: 90 passed, 14 skipped, 0 failed.
2026-05-07 19:36:13 -07:00
Teknium 9c914c01c8 codebase: add encoding='utf-8' to all bare open() calls (PLW1514)
Closes the last Python-on-Windows UTF-8 exposure by making every
text-mode open() call explicit about its encoding.

Before: on Windows, bare open(path, 'r') defaults to the system
locale encoding (cp1252 on US-locale installs).  That means reading
any config/yaml/markdown/json file with non-ASCII content either
crashes with UnicodeDecodeError or silently mis-decodes bytes.

After: all 89 affected call sites in production code now pass
encoding='utf-8' explicitly.  Works identically on every platform
and every locale, no surprise behavior.

Mechanical sweep via:
  ruff check --preview --extend-select PLW1514 --unsafe-fixes --fix \
    --exclude 'tests,venv,.venv,node_modules,website,optional-skills,skills,tinker-atropos,plugins' .

All 89 fixes have the same shape: open(x) or open(x, mode) became
open(x, encoding='utf-8') or open(x, mode, encoding='utf-8').  Nothing
else changed.  Every modified file still parses and the Windows/sandbox
test suite is still green (85 passed, 14 skipped, 0 failed across
tests/tools/test_code_execution_windows_env.py +
tests/tools/test_code_execution_modes.py + tests/tools/test_env_passthrough.py +
tests/test_hermes_bootstrap.py).

Scope notes:
  - tests/ excluded: test fixtures can use locale encoding intentionally
    (exercising edge cases).  If we want to tighten tests later that's
    a separate PR.
  - plugins/ excluded: plugin-specific conventions may differ; plugin
    authors own their code.
  - optional-skills/ and skills/ excluded: skill scripts are user-authored
    and we don't want to mass-edit them.
  - website/ and tinker-atropos/ excluded: vendored / generated content.

46 files touched, 89 +/- lines (symmetric replacement).  No behavior
change on POSIX or on Windows when the file is ASCII; bug fix on
Windows when the file contains non-ASCII.
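
The failure mode behind this sweep can be reproduced on any platform by naming the encodings explicitly. The following is a minimal sketch, not code from the repository: the file name `note.md` is hypothetical, and cp1252 stands in for the Windows locale default.

```python
import tempfile
from pathlib import Path

# Hypothetical file; cp1252 stands in for the Windows locale default.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "note.md"
    path.write_bytes("café → done".encode("utf-8"))

    # What a bare open(path) does on a cp1252 machine: silent mis-decoding.
    with open(path, encoding="cp1252") as fh:
        mis_decoded = fh.read()

    # What the swept call sites do now: the same result on every platform.
    with open(path, encoding="utf-8") as fh:
        decoded = fh.read()
```

Here `decoded` matches the original text, while `mis_decoded` contains mojibake and no exception is raised to flag it.
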
2026-05-07 19:24:45 -07:00
Teknium 6098272454 hermes_bootstrap: Windows-only UTF-8 stdio shim for all entry points
Codebase-wide fix for Python-on-Windows UTF-8 footguns, complementing
the earlier execute_code sandbox fixes (which remain load-bearing for
when the sandbox explicitly scrubs child env).

Problem: Python on Windows has two long-standing text-encoding pitfalls:

  1. sys.stdout/stderr are bound to the console code page (cp1252 on
     US-locale installs) — print('café') crashes with UnicodeEncodeError.
  2. Subprocess children don't know to use UTF-8 unless PYTHONUTF8 and/or
     PYTHONIOENCODING are set in their env — so any Python we spawn
     (linters, sandbox children, delegation workers) hits the same bug.

Solution: A tiny bootstrap module (hermes_bootstrap.py) imported as the
first statement of every Hermes entry point:

  - hermes_cli/main.py   (hermes / hermes-agent console_script)
  - run_agent.py         (hermes-agent direct)
  - acp_adapter/entry.py (hermes-acp)
  - gateway/run.py       (messaging gateway)
  - batch_runner.py      (parallel batch mode)
  - cli.py               (legacy direct-launch CLI)

On Windows, the bootstrap:
  - os.environ.setdefault('PYTHONUTF8', '1')       (PEP 540 UTF-8 mode)
  - os.environ.setdefault('PYTHONIOENCODING', 'utf-8')
  - sys.stdout/stderr/stdin.reconfigure(encoding='utf-8', errors='replace')

Children inherit the env vars → they run in UTF-8 mode.
Current process's stdio is reconfigured → print('café') works now.

On POSIX (Linux/macOS), the bootstrap is a complete no-op.  We don't
touch LANG, LC_*, or anything else — users who have intentionally
configured a non-UTF-8 locale aren't affected.  POSIX systems are
already UTF-8 by default in 99% of modern setups, so there's nothing
to fix.

setdefault() (not overwrite) means users who explicitly set PYTHONUTF8=0
or PYTHONIOENCODING=cp1252 in their environment are respected.
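
The behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of hermes_bootstrap.py: the function name `apply_utf8_bootstrap` and its injectable parameters are hypothetical and exist only so the logic can be exercised on any OS.

```python
import os
import sys


def apply_utf8_bootstrap(environ=None, streams=None, is_windows=None):
    """Hypothetical sketch: set env defaults and reconfigure stdio on Windows."""
    environ = os.environ if environ is None else environ
    is_windows = (sys.platform == "win32") if is_windows is None else is_windows
    if not is_windows:
        return False  # POSIX: complete no-op

    # setdefault keeps any value the user exported explicitly.
    environ.setdefault("PYTHONUTF8", "1")
    environ.setdefault("PYTHONIOENCODING", "utf-8")

    if streams is None:
        streams = (sys.stdin, sys.stdout, sys.stderr)
    for stream in streams:
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is None:
            continue  # replaced or wrapped stream: skip rather than crash
        try:
            reconfigure(encoding="utf-8", errors="replace")
        except (OSError, ValueError):
            pass  # stream refuses reconfiguration; leave it untouched
    return True
```

Passing a plain dict as `environ` shows the opt-out: a pre-existing `PYTHONUTF8=0` survives the call because `setdefault` never overwrites.
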

What this does NOT fix: bare open(path, 'w') calls in the *parent*
process still default to locale encoding because PYTHONUTF8 is only
read at interpreter init.  A ruff PLW1514 sweep (separate follow-up)
will add explicit encoding='utf-8' at those ~219 call sites for
belt-and-suspenders.

Tests (17): 16 passed, 1 skipped on Windows.
  - Windows: env vars set, stdio reconfigured, child inherits UTF-8 mode
  - POSIX: complete no-op (verified on fake POSIX + skipped on real
    POSIX since we don't have a Linux box in this session)
  - Idempotence: multiple calls safe
  - Graceful degradation: non-reconfigurable streams don't crash
  - User opt-out: explicit PYTHONUTF8=0 is respected
  - Load order: every entry point's FIRST top-level import is
    hermes_bootstrap, enforced by an AST-level parametrized test
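
An AST-level load-order check of the kind mentioned in the last item could look like the sketch below. The helper name `first_top_level_import` is hypothetical, and skipping `from __future__` imports is an assumption of this sketch (they must precede all other statements), not something the commit states.

```python
import ast


def first_top_level_import(source):
    """Return the module named by the first top-level import, or None."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            return node.names[0].name
        if isinstance(node, ast.ImportFrom):
            if node.module == "__future__":
                continue  # assumption: __future__ imports do not count
            return node.module
    return None
```

A parametrized test would read each entry point's source and assert that this returns `"hermes_bootstrap"`.
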

pyproject.toml: added hermes_bootstrap to py-modules so it ships with
pip installs.
2026-05-07 19:09:40 -07:00
Teknium bf43f6cfdd execute_code: set PYTHONIOENCODING=utf-8 + PYTHONUTF8=1 in child env
Third Windows-specific sandbox bug (after WinError 10106 and the UTF-8
file-write bug): user scripts that print non-ASCII to stdout crash with

    UnicodeEncodeError: 'charmap' codec can't encode character '\u2192'
                        in position N: character maps to <undefined>

Root cause: Python's sys.stdout on Windows uses the ANSI code page
(cp1252 on US-locale installs) when the process is attached to a
pipe without PYTHONIOENCODING set.  LLM-generated scripts routinely
print em-dashes, arrows, accented chars, and emoji — all of which cp1252
can't encode.

Fix: spawn the sandbox child with:

    PYTHONIOENCODING=utf-8   # sys.stdin/stdout/stderr all UTF-8
    PYTHONUTF8=1             # PEP 540 UTF-8 mode — open() defaults to UTF-8 too

PYTHONUTF8 is the belt-and-suspenders half: LLM scripts that call
open(path, 'w') without encoding= in user code will now produce UTF-8
files by default, matching what the sandbox already does for its own
staging files.

The parent side already decodes child stdout/stderr as UTF-8 with
errors='replace' (lines 1345-1347) so the end-to-end chain is clean.

On POSIX these values usually match the locale default already, so
setting them is harmless belt-and-suspenders for C/POSIX-locale
containers and minimal base images.
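
A minimal sketch of the spawn-side change, assuming nothing about the real sandbox beyond the two variables named above. The child command here is a stand-in for a user script.

```python
import os
import subprocess
import sys

# Copy the parent env and force UTF-8 for the child's stdio and open() defaults.
child_env = dict(os.environ)
child_env["PYTHONIOENCODING"] = "utf-8"
child_env["PYTHONUTF8"] = "1"

result = subprocess.run(
    [sys.executable, "-c", "print('caf\\u00e9 \\u2192 ok')"],
    env=child_env,
    capture_output=True,
)
# Decode on the parent side as UTF-8, replacing anything undecodable.
output = result.stdout.decode("utf-8", errors="replace").strip()
```

The escapes keep the command line ASCII-only; the child expands them and prints the accented character and the arrow through its UTF-8 stdout.
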

Tests added (4) — total file now at 28 passed, 1 skipped on Windows:
  - test_popen_env_sets_pythonioencoding_utf8 (source grep)
  - test_popen_env_sets_pythonutf8_mode (source grep)
  - test_live_child_can_print_non_ascii (cross-platform live test)
  - test_windows_child_without_utf8_env_would_fail (Windows negative
    control — actually reproduces the bug without our env overrides,
    proving the fix is load-bearing on this system)
2026-05-07 18:59:35 -07:00
Teknium f5ec30dfe6 tests: skip POSIX-venv-layout tests on Windows
test_code_execution_modes.py had two test-level failures and two
class-level stale skip reasons on this Windows-native branch:

  - TestResolveChildPython::test_project_with_virtualenv_picks_venv_python
  - TestResolveChildPython::test_project_prefers_virtualenv_over_conda

Both fail on Windows with OSError: [WinError 1314] — they call
pathlib.Path.symlink_to() to build a fake venv, which requires
developer mode or admin on Windows.  They also assume POSIX venv
layout (bin/python) where Windows uses Scripts/python.exe.  Skip
them with a specific, accurate reason.

Also updated two class-level skipif reasons that said
'execute_code is POSIX-only' — no longer true on this branch.
New reason explains it's the test infrastructure (symlinks + POSIX
venv layout) that's the blocker, not execute_code itself.

Results on Windows Python 3.11:
  Before: 41 passed, 10 skipped, 2 failed
  After:  43 passed, 12 skipped, 0 failed
2026-05-07 18:56:33 -07:00
Teknium 8798bea31f execute_code: write sandbox files as UTF-8 on Windows
Second Windows-specific sandbox bug (WinError 10106 was the first):
after the env-scrub fix let the child start, it immediately failed to
import hermes_tools with:

    SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x97
                 in position 154: invalid start byte

Root cause: _execute_local wrote the generated hermes_tools.py stub and
the user's script.py via open(path, 'w') without encoding=.  On Windows
the default text-mode encoding is cp1252 (system locale), which encodes
em-dashes (used in the stub's docstrings) as 0x97.  Python then decodes
source files as UTF-8 (PEP 3120) on import, chokes on 0x97, and the
sandbox dies before any tool call.

Fix: pass encoding='utf-8' to all four file opens in the code_execution
path — the two staging writes in _execute_local (hermes_tools.py +
script.py) and the two RPC file-transport reads/writes in the generated
remote stub.  JSON is ASCII-safe for most payloads but tool results
(terminal output, web_extract content) routinely carry non-ASCII.
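
The byte-level mechanics can be checked without a Windows machine by naming the encodings explicitly. A small sketch; the stub text is made up for illustration.

```python
stub_line = "# hermes_tools \u2014 generated stub"  # U+2014 is an em-dash

# cp1252 encodes the em-dash as the single byte 0x97.
as_cp1252 = stub_line.encode("cp1252")

# Source files are decoded as UTF-8 on import, where a lone 0x97 is invalid.
try:
    as_cp1252.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Writing with an explicit encoding round-trips cleanly.
round_tripped = stub_line.encode("utf-8").decode("utf-8")
```
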

Tests added (4):
  - test_stub_and_script_writes_specify_utf8 — source grep guard
  - test_file_rpc_stub_uses_utf8 — generated remote stub check
  - test_stub_source_roundtrips_through_utf8 — concrete round-trip
  - test_windows_default_encoding_would_have_failed — negative control
    (skips on modern Python builds where default is already UTF-8
    compatible, but retained for platforms where the regression could
    return)

24/25 tests pass on Windows 3.11 (negative control skips because this
Python build handles em-dashes via cp1252 subset — the fix is still
correct, just the corruption path isn't always triggerable).
2026-05-07 18:52:59 -07:00
Teknium 668e4b8d7e tests: lock in POSIX-equivalence guard for execute_code env scrubber
Adds TestPosixEquivalence to test_code_execution_windows_env.py.  The
class pins the invariant that _scrub_child_env(env, is_windows=False)
produces byte-for-byte identical output to the pre-refactor inline
scrubber, across a matrix of:

  - 2 synthetic envs (POSIX-shaped, Windows-shaped-on-POSIX)
  - 3 passthrough rules (none, single-var, everything)
  - 1 real-os.environ check on whatever platform runs the test

Plus a superset sanity check: is_windows=True must keep everything
is_windows=False keeps, and any extras must come from the
_WINDOWS_ESSENTIAL_ENV_VARS allowlist.

Rationale: the previous commit refactored the env-scrubbing inline
block into a helper.  Future changes to that helper must not silently
regress POSIX behavior — if someone needs to change it, they update
_legacy_posix_scrubber in lockstep so the churn is visible in review.

All 21 tests in the file pass locally on Windows (pytest 9.0.3).  8 of
them are parametrized equivalence checks that run on every OS.
2026-05-07 18:45:34 -07:00
Teknium fab984c7f8 execute_code: pass through Windows OS-essential env vars
The sandbox's env scrubbing was dropping SYSTEMROOT, WINDIR, COMSPEC,
APPDATA, etc. On Windows this broke the child process before any RPC
could happen:

    OSError: [WinError 10106] The requested service provider could not
    be loaded or initialized

Python's socket module uses SYSTEMROOT to locate mswsock.dll during
Winsock initialization. Without it, socket.socket(AF_INET, SOCK_STREAM)
fails — and the existing loopback-TCP fallback for Windows couldn't work.

Fix: add a small Windows-only allowlist (_WINDOWS_ESSENTIAL_ENV_VARS)
matched by exact uppercase name, after the existing secret-substring
block. The secret block still runs first, so the allowlist cannot be
used to exfiltrate credentials. Also extract the env scrubber into a
testable helper (_scrub_child_env) that takes is_windows as a parameter,
so the logic can be unit-tested on any OS.

Live Winsock smoke test verifies that a child spawned with the scrubbed
env can now create an AF_INET socket on a real Windows host; the test
is guarded by sys.platform == 'win32' so POSIX CI stays green.
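
The ordering argument (secret block first, allowlist second) can be sketched as below. All names and rule sets here are hypothetical stand-ins; the real `_scrub_child_env` and `_WINDOWS_ESSENTIAL_ENV_VARS` are not reproduced.

```python
# Hypothetical rule sets; the real lists live in the code_execution tool.
SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD")
SAFE_NAMES = {"PATH", "HOME", "LANG", "TMPDIR"}
WINDOWS_ESSENTIAL = {"SYSTEMROOT", "WINDIR", "COMSPEC", "APPDATA"}


def scrub_child_env(env, is_windows):
    """Return the subset of env that a sandbox child is allowed to see."""
    kept = {}
    for name, value in env.items():
        upper = name.upper()
        # The secret block runs first, so no allowlist can re-admit a credential.
        if any(marker in upper for marker in SECRET_SUBSTRINGS):
            continue
        if upper in SAFE_NAMES:
            kept[name] = value
        elif is_windows and upper in WINDOWS_ESSENTIAL:
            kept[name] = value
    return kept
```

Taking `is_windows` as a parameter is what lets the Windows branch be unit-tested on a Linux CI runner.
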
2026-05-07 18:39:38 -07:00
Teknium f0d2516a30 fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch
Two fixes from teknium1's next install run:

1. **npm install: "npm.ps1 cannot be loaded because running scripts is
   disabled on this system."**  Get-Command's default PATHEXT ordering
   picked up ``npm.ps1`` (the PowerShell shim) ahead of ``npm.cmd`` (the
   batch shim).  Most Windows users have PowerShell's execution policy
   set to Restricted or RemoteSigned, which blocks unsigned ``.ps1``
   files.  ``npm.cmd`` has no such restriction and works universally.

   Install-NodeDeps now detects when Get-Command returned npm.ps1, looks
   for a sibling npm.cmd in the same directory, and prefers it.  Prints
   an info line so the user sees why.  Emits a warning + hint if only
   npm.ps1 is available.

2. **"Launch hermes chat now? Y" crashes with "%1 is not a valid Win32
   application" on Windows installs.**  The setup wizard calls
   ``relaunch(["chat"])``; ``resolve_hermes_bin()`` returned
   ``sys.argv[0]`` which was ``...\hermes_cli\main.py`` (because hermes
   was launched via ``python -m hermes_cli.main`` during setup).

   On Windows, ``os.access(script.py, os.X_OK)`` returns True because
   PATHEXT lists ``.py`` when the Python launcher is registered — but
   ``subprocess.run([script.py, ...])`` can't actually execute a ``.py``
   directly.  CreateProcessW needs a real PE file.

   Fixed ``resolve_hermes_bin`` to reject ``.py``/``.pyc`` argv0 values
   on Windows specifically.  Falls through to ``shutil.which("hermes")``
   (hermes.exe in the venv Scripts dir) or, as a final fallback, lets
   build_relaunch_argv build ``[sys.executable, "-m", "hermes_cli.main"]``
   which is bulletproof.  POSIX behaviour unchanged — ``.py`` argv0 with
   a shebang + chmod+x is still a valid exec target there.

3 new tests cover the Windows paths: .py argv0 + hermes.exe on PATH →
returns hermes.exe; .py argv0 + no PATH → returns None (caller uses
python -m); POSIX + executable .py → still accepted.
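
The resolution order described in item 2 can be sketched as follows. The signatures are hypothetical (the real `resolve_hermes_bin` and `build_relaunch_argv` may differ); `is_windows` and `which` are injected here only to make the logic testable on any OS.

```python
import os
import shutil
import sys


def resolve_hermes_bin(argv0, is_windows, which=shutil.which):
    """Hypothetical sketch: pick an executable to relaunch, or None."""
    is_python_file = argv0.lower().endswith((".py", ".pyc"))
    # On Windows a .py path can pass the X_OK check but is not a PE file,
    # so it is rejected as a direct exec target.
    if not (is_windows and is_python_file):
        if os.path.isfile(argv0) and os.access(argv0, os.X_OK):
            return argv0
    # Fall back to the console script found on PATH, if any.
    return which("hermes")


def build_relaunch_argv(args, hermes_bin):
    """Use the resolved binary, or fall back to `python -m hermes_cli.main`."""
    if hermes_bin:
        return [hermes_bin, *args]
    return [sys.executable, "-m", "hermes_cli.main", *args]
```
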

26 relaunch tests pass, no POSIX regressions.
2026-05-07 18:29:17 -07:00
Teknium 2e403bd0a4 fix(windows): enable execute_code — stale AF_UNIX gate was blocking the tool
teknium1 noticed execute_code was missing from his enabled tools on Windows.
Root cause: tools/code_execution_tool.py set ``SANDBOX_AVAILABLE =
sys.platform != "win32"`` as a module-level constant, originally because
the RPC transport required AF_UNIX.  We added loopback TCP fallback for
the sandbox in commit eeb723fff (and covered it in the Windows TCP tests),
but forgot to lift the availability gate.  So execute_code was still
invisible via the check_fn path on Windows.

- SANDBOX_AVAILABLE is now True unconditionally (it's still checked — a
  future platform could flip it off via monkeypatch/env if needed).
- Error message when disabled no longer mentions Windows specifically,
  just says 'sandbox is unavailable in this environment'.
- test_windows_returns_error updated: patches SANDBOX_AVAILABLE=False
  directly (which was always its real intent) and asserts on 'unavailable'
  instead of 'Windows'.

Tests: 171 code-execution + windows-compat tests pass, no regressions.
2026-05-07 18:17:31 -07:00
Teknium 2c7b479d16 fix(windows): %1 install error, patch CRLF false-negative, SOUL.md BOM
Three bugs from teknium1's successful install + diagnostic chat on Windows:

1. **Start-Process -FilePath npm.cmd fails with "%1 is not a valid Win32
   application".**  Start-Process bypasses cmd.exe and PATHEXT to call
   CreateProcessW directly, which refuses .cmd batch shims.  Switched
   Install-NodeDeps to use PowerShell's invocation operator (``& $npmExe
   install --silent *> $log``) which DOES honour PATHEXT.  Extracted a
   ``_Run-NpmInstall`` helper so the browser + TUI paths share the same
   logic.  Captures $LASTEXITCODE correctly, still surfaces the real
   stderr on failure with a log-file pointer for the full output.

2. **patch tool returns false-negative on Windows due to CRLF round-trip.**
   Root cause was upstream of patch: ``subprocess.Popen(..., text=True,
   stdin=PIPE)`` on Windows translates ``\n`` → ``\r\n`` when data flows
   through the stdin pipe.  ``_pipe_stdin()`` was writing the patch's
   new_content string through a text-mode pipe, bash then wrote those
   CRLF bytes to disk, and patch's post-write verify compared the
   on-disk CRLF bytes against the original LF-only string — fail.

   Fixed in two places for defense in depth:
   - ``_pipe_stdin()`` now writes through ``proc.stdin.buffer`` with
     explicit UTF-8 encoding, bypassing Python's newline translation on
     every platform.  No behaviour change on POSIX (bytes are identical)
     but stops the CRLF injection on Windows.
   - ``patch_replace``'s post-write verify normalizes CRLF→LF on both
     sides before comparing, so even if some future backend still
     translates newlines the patch tool won't report a bogus failure.

3. **SOUL.md gets a UTF-8 BOM on Windows PowerShell 5.1.**  ``Set-Content
   -Encoding UTF8`` on PS5.1 writes UTF-8 WITH a byte-order-mark (changed
   in PS7 via ``utf8NoBOM``).  Hermes's prompt-injection scanner sees
   the BOM (U+FEFF invisible char) and refuses to load the file, so
   SOUL.md's persona instructions never get applied.

   Fixed by writing the file via ``[System.IO.File]::WriteAllText``
   with an explicit ``UTF8Encoding($false)`` — BOM-free on every
   PowerShell version.
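
The two halves of the CRLF fix in item 2 can be illustrated with a short sketch. It is not the project's `_pipe_stdin()`: this version opens the pipe in binary mode instead of reaching for `proc.stdin.buffer`, which has the same effect of bypassing newline translation. The helper names are hypothetical.

```python
import subprocess
import sys


def pipe_bytes(argv, text):
    """Send text to a child's stdin as UTF-8 bytes, with no newline translation."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(text.encode("utf-8"))
    return out


def same_content(on_disk, expected):
    """Compare two strings after normalizing CRLF to LF on both sides."""
    return on_disk.replace("\r\n", "\n") == expected.replace("\r\n", "\n")


# The child echoes stdin to stdout byte-for-byte through the binary buffers.
echo = [sys.executable, "-c",
        "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]
echoed = pipe_bytes(echo, "line one\nline two\n")
```

Because both ends use bytes, the LF characters arrive unchanged on every platform, and `same_content` covers the case where some other backend still rewrites line endings.
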

All POSIX behaviour verified unchanged: 198 tests pass across
test_file_operations, test_local_env_cwd_recovery, test_code_execution,
test_windows_native_support, test_windows_compat.
2026-05-07 18:11:43 -07:00
Teknium 225b57f314 fix(install.ps1): step out of $InstallDir before touching it + harden repo probe
User hit 'fatal: not in a git directory' on re-install because:

1. They ran Remove-Item -Force $env:LOCALAPPDATA\hermes -ErrorAction
   SilentlyContinue WHILE cd'd inside the install dir.  Windows
   silently refuses to delete a directory any shell is currently cd'd
   inside and leaves the skeleton intact, but the -ErrorAction
   SilentlyContinue swallowed every partial-delete failure so they
   thought the wipe succeeded.

2. The installer then walked into Install-Repository, saw $InstallDir
   still exists with a partial .git stub, my repo-validity probe
   returned success (the probe's git rev-parse may have exit-code-zeroed
   in a way I didn't expect), and the real git fetch died with three
   'fatal: not a git repository' errors.

Two belt-and-braces fixes:

- Main() now cds to $env:USERPROFILE at start if the current shell
  is inside $InstallDir.  Harmless when the user ran from elsewhere;
  critical when they didn't.  This alone fixes the user's case.

- Install-Repository's 'is this a valid repo' probe now runs BOTH
  git rev-parse --is-inside-work-tree AND git status, resets
  $LASTEXITCODE before each to avoid picking up a stale 0, and
  requires BOTH to succeed.  Also requires rev-parse's output to
  match 'true' (not just exit 0) to rule out exit-0-with-empty-output
  edge cases.
2026-05-07 18:05:35 -07:00
Teknium 4d7e72e14d fix(install.ps1): validate existing repo via git itself + clean up broken stubs
teknium1 hit "fatal: not in a git directory" on re-install when the previous
install left a $InstallDir\.git stub that Test-Path matched but git didn't
recognize (three "fatal: not a git repository" lines, then the script
exited before touching anything).

Two bugs:

1. Test-Path "$InstallDir\.git" was a weak gate — it matches .git
   whether it's a directory, file, symlink, submodule gitfile, OR a
   broken stub from a failed previous Remove-Item.  Replaced with a
   real repo probe: Push-Location + git rev-parse --is-inside-work-tree
   + $LASTEXITCODE check.  If git itself can't see a repo, we treat
   the directory as not-a-repo and fall through to fresh clone.

2. The original update path ignored $LASTEXITCODE.  fetch/checkout/pull
   all emitted fatals but the script kept going.  Now each command
   checks $LASTEXITCODE and throws with an explicit message.

Also: when the directory exists but isn't a valid repo, the new code
wipes it (Remove-Item -ErrorAction Stop) and falls through to fresh
clone, instead of dying with the old "Directory exists but is not a git
repository" error.  If the wipe itself fails (file locked, hermes still
running), we throw with a user-readable "close any programs using files
in <dir>" hint.

Refactored the function to use a $didUpdate flag instead of my earlier
draft's early `return` — that was skipping the submodule init block at
the bottom of the function.  Both the update and fresh-clone paths now
fall through to the submodule init step, which is correct (git pull
doesn't auto-update submodules).

PowerShell structural check: 21 functions defined, braces balanced.
2026-05-07 18:00:59 -07:00
Teknium 787d964ea1 fix(windows): quote cache paths in bash + augment PATH so rg/bash resolve on first launch
Three interrelated bugs from teknium1's first interactive chat on Windows:

1. **Snapshot/cwd file paths unquoted in bash command strings.**  The session
   bootstrap and per-command wrapper interpolated
   ``self._snapshot_path`` / ``self._cwd_file`` unquoted into bash commands
   like ``export -p > C:/Users/ryanc/.../hermes-snap-xxx.sh``.  Git Bash's
   MSYS2 layer handles ``C:/...`` paths correctly ONLY when quoted; unquoted,
   the colon and forward-slash get glob-parsed and the redirect targets a
   bogus path.  Symptom: every terminal command emitted two
   ``C:/Users/.../hermes-snap-*.sh (No such file or directory)`` lines that
   bled into stdout (``stderr=STDOUT`` on the local backend) and corrupted
   file contents when the agent wrote to scratch paths via the terminal
   tool.  Fix: ``shlex.quote()`` every interpolation of ``_snapshot_path``
   and ``_cwd_file`` in base.py — no-op on POSIX (the paths contain no
   shell-metachars), critical on Windows.

2. **Stale PATH on first hermes launch after install.**  ``install.ps1``
   adds the PortableGit ``cmd`` / ``bin`` / ``usr\bin`` directories to the
   Windows **User** PATH via ``SetEnvironmentVariable(..., "User")``.  That
   write propagates to newly *spawned* processes only — already-running
   shells (including the one the user types ``hermes`` into immediately
   after install) retain their old PATH.  So hermes starts with a PATH that
   doesn't include bash, rg, grep, ssh — and ``search_files`` reports
   "rg/find not available" when the user clearly just installed them.

   Fix: new ``_augment_path_with_known_tools()`` helper called from
   ``configure_windows_stdio()`` on startup.  Prepends the Hermes-managed
   Git directories + the WinGet Links directory (where ripgrep lands) to
   ``os.environ['PATH']`` if they exist on disk but aren't already in
   PATH.  Subsequent subprocess calls (including bash spawns via
   ``_find_bash()``) inherit the augmented PATH and find everything.
   No-op on POSIX and when the directories don't exist.

3. **Root cause of "file content corruption".**  #1 was the proximate cause.
   Errors like ``C:/Users/.../hermes-snap-xxx.sh: No such file or directory``
   were emitted on stderr by the failed redirect, captured into stdout via
   ``stderr=subprocess.STDOUT``, and if the agent used terminal commands
   like ``cat > file`` the leaked error bytes became part of the file.
   Fixing #1 eliminates this entirely.
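
The PATH augmentation from item 2 can be sketched as below. The function name `augment_path` and its parameters are hypothetical; the real helper works on `os.environ` and a fixed list of Hermes-managed directories.

```python
import os


def augment_path(environ, candidate_dirs):
    """Prepend existing candidate directories that are missing from PATH."""
    current = environ.get("PATH", "")
    # Compare case-normalized entries so differently-cased duplicates match.
    present = {os.path.normcase(p) for p in current.split(os.pathsep) if p}
    missing = []
    for directory in candidate_dirs:
        key = os.path.normcase(directory)
        if os.path.isdir(directory) and key not in present:
            missing.append(directory)
            present.add(key)
    if missing:
        environ["PATH"] = os.pathsep.join(missing + ([current] if current else []))
    return missing
```

Calling it twice is harmless: the second call finds every directory already present and leaves PATH untouched.
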

## Tests

All 77 Windows-compat tests still pass on Linux (POSIX path is
shlex.quote('/tmp/foo.sh') → '/tmp/foo.sh' — unchanged).

## Not addressed here (would need a bigger design)

- Python file tools (``write_file``, ``read_file``) and the bash-backed
  terminal tool see DIFFERENT views of ``/tmp`` on Windows.  Python treats
  ``/tmp`` as ``C:\tmp`` (drive-relative), Git Bash's MSYS2 treats it as
  a virtual mount to the PortableGit install's ``tmp\``.  Would need a
  translation shim in the Python tools to resolve bash-virtual paths to
  their native-Windows equivalents.  Workaround for users today: use
  absolute native paths (``C:\Users\you\...``) instead of ``/tmp/...``
  when crossing between terminal and Python file tools.
2026-05-07 17:51:57 -07:00
Teknium cf9b2df57a fix(windows): use PortableGit (not MinGit), fix relaunch os.execvp crash, surface npm errors
Four real bugs from teknium1's first Windows install run:

1. **MinGit has no bash.exe.**  MinGit is the minimal-automation Git for Windows
   distribution — it ships git.exe but deliberately strips bash and the POSIX
   coreutils.  Installer logged "Could not locate bash.exe" and Hermes would
   fail to run any shell command.  Switched to PortableGit — the full Git for
   Windows minus the installer UI.  PortableGit ships bash.exe at
   <root>\bin\bash.exe plus sh, awk, sed, grep, curl, ssh in usr\bin\.  ARM64
   variant is detected separately (PortableGit-*-arm64.7z.exe).  32-bit falls
   back to MinGit-32-bit with a warning (PortableGit is 64-bit only).

   PortableGit ships as a 7z self-extractor (56MB vs MinGit's 38MB).  We
   invoke it with `-o<target> -y` to extract silently — no 7z install needed,
   it's self-contained.

   Updated tools/environments/local.py::_find_bash candidate order to prefer
   the PortableGit layout (<root>\bin\bash.exe) with the MinGit layout
   (<root>\usr\bin\bash.exe) as a fallback so existing installs keep working.

2. **os.execvp "Exec format error" on Windows.**  Setup wizard's "Launch
   hermes chat now? Y" called `os.execvp("hermes", ["hermes", "chat"])` which on
   Windows can only swap to real Win32 .exe files — chokes with OSError(8)
   on .cmd batch shims and Python console-script wrappers.  Added a
   win32 branch in hermes_cli/relaunch.py::relaunch() that uses
   subprocess.run + sys.exit — functionally identical (user sees "hermes
   exited, then new hermes started") with one extra PID in play.  POSIX
   path is UNCHANGED — still uses os.execvp for in-place replacement.
   Catches OSError in the Windows branch and surfaces an "open a new
   terminal so PATH picks up, then re-run hermes" hint instead of a
   cryptic traceback.

3. **npm install failures silent on Windows.**  The install.ps1 was invoking
   `npm install --silent 2>&1 | Out-Null` inside a try/catch.  PowerShell's
   try/catch does NOT trigger on non-zero process exit codes — only on
   unhandled .NET exceptions — so npm failing printed a generic "npm
   install failed" with zero information about WHY.  The silent pipe ate
   the stderr.

   Rewrote Install-NodeDeps to:
   - Resolve npm.cmd via Get-Command (respects PATHEXT) instead of
     relying on bare `npm` name resolution.
   - Use Start-Process with -PassThru to capture the actual exit code.
   - Redirect stderr to a temp log and surface the first ~800 chars of
     the real npm error when install fails, plus the log path for the
     full text.
   - Fail loudly with the right exit code instead of a misleading success.
   - Bail cleanly with a helpful message when npm isn't on PATH at all.

4. **"True" printing to console after Node check.**  `Test-Node` returns $true;
   installer called it as a bare statement (no assignment, no cast).  PowerShell
   prints bare return values.  Wrapped the call in `[void](Test-Node)`.
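
The platform split from item 2 can be sketched like this. It is an illustration rather than the code in hermes_cli/relaunch.py: the `run` and `execvp` parameters are hypothetical injection points so the branches can be tested without replacing the process.

```python
import os
import subprocess
import sys


def relaunch(argv, is_windows=None, run=subprocess.run, execvp=os.execvp):
    """Hypothetical sketch: exec in place on POSIX, spawn-and-exit on Windows."""
    is_windows = (sys.platform == "win32") if is_windows is None else is_windows
    if not is_windows:
        execvp(argv[0], argv)  # in-place replacement; no return on success
        return
    try:
        completed = run(argv)
    except OSError as exc:
        print(f"Could not relaunch ({exc}). Open a new terminal so PATH "
              "picks up the install, then re-run hermes.", file=sys.stderr)
        sys.exit(1)
    sys.exit(completed.returncode)  # propagate the child's exit code
```
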

## Tests

- Added 3 new tests in tests/hermes_cli/test_relaunch.py covering the
  Windows branch: subprocess is called (not execvp), child exit code
  propagates, OSError surfaces a helpful message.  All 23 tests pass
  (20 existing + 3 new).
- 77 Windows-compat tests still pass, POSIX behaviour unchanged.
2026-05-07 17:42:47 -07:00
Teknium eeb723fff2 feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags
Second pass on native Windows support, driven by a systematic audit across
five areas: POSIX-only primitives (signal.SIGKILL/SIGHUP/SIGPIPE, os.WNOHANG,
os.setsid), path translation bugs (/c/Users → C:\Users), subprocess patterns
(npm.cmd batch shims, start_new_session no-op on Windows), subsystem health
(cron, gateway daemon, update flow), and module-level import guards.

Every change is platform-gated — POSIX (Linux/macOS) behaviour is preserved
bit-identical. Explicit "do no harm" tests: test_posix_path_preserved_on_linux,
test_posix_noop, test_windows_detach_popen_kwargs_is_posix_equivalent_on_posix.

## New module

- hermes_cli/_subprocess_compat.py — shared helpers (resolve_node_command,
  windows_detach_flags, windows_hide_flags, windows_detach_popen_kwargs).
  All no-ops on non-Windows.

## CRITICAL fixes (would crash or silently break on Windows)

- tui_gateway/entry.py: SIGPIPE/SIGHUP referenced at module top level would
  AttributeError on import on Windows, breaking `hermes --tui` entirely (it
  spawns this module as a subprocess).  Guard each signal.signal() call with
  hasattr() and add SIGBREAK as Windows' SIGHUP equivalent.

- hermes_cli/kanban_db.py: os.waitpid(-1, os.WNOHANG) in dispatcher tick was
  unguarded.  os.WNOHANG doesn't exist on Windows.  Gate the whole reap loop
  behind `os.name != "nt"` — Windows has no zombies anyway.

- tools/code_execution_tool.py: AF_UNIX socket for execute_code RPC fails on
  most Windows builds.  Fall back to loopback TCP (AF_INET on 127.0.0.1:0
  ephemeral port) when _IS_WINDOWS.  HERMES_RPC_SOCKET env var now accepts
  either a filesystem path (POSIX) or `tcp://127.0.0.1:<port>` (Windows).
  Generated sandbox client parses both.

- cron/scheduler.py: `argv = ["/bin/bash", str(path)]` hardcoded.  Use
  shutil.which("bash") so Windows (Git Bash via MinGit) works, with a
  readable error when bash is genuinely absent.

- 6 bare npm/npx spawn sites: tools_config.py x2, doctor.py, whatsapp.py
  (npm install + node version probe), browser_tool.py x2.  On Windows npm
  is npm.cmd / npx is npx.cmd (batch shims); subprocess.Popen(["npm", ...])
  fails with WinError 193.  shutil.which(...) returns the absolute .cmd
  path which CreateProcessW accepts because the extension routes through
  cmd.exe /c.  POSIX behaviour unchanged (shutil.which still returns the
  same path subprocess would resolve itself).
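
The npm/npx resolution described above can be sketched roughly as follows. This is a minimal illustration, not the actual `resolve_node_command` in hermes_cli/_subprocess_compat.py: the signature and the "fall back to the bare name" behaviour are assumptions.

```python
import shutil


def resolve_node_command(name: str) -> str:
    """Return an absolute path for `name` when PATH has one, else `name`.

    Hypothetical sketch; the real helper may differ. On Windows,
    shutil.which honours PATHEXT, so "npm" resolves to the npm.cmd shim.
    On POSIX it returns the path the OS would have resolved anyway.
    """
    return shutil.which(name) or name


# Build argv with the resolved executable instead of the bare name.
argv = [resolve_node_command("npm"), "--version"]
```

Falling back to the bare name keeps the eventual "command not found" error coming from the spawn site rather than from the helper.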

## HIGH fixes (silent misbehaviour on Windows)

- tools/environments/local.py get_temp_dir: hardcoded /tmp returned on
  Windows meant `_cwd_file = "/tmp/hermes-cwd-*.txt"`, which bash wrote
  via MSYS2's virtual /tmp but native Python couldn't open.  Result: cwd
  tracking silently broken — `cd` in terminal tool did nothing.  Windows
  branch now returns `%HERMES_HOME%/cache/terminal` with forward slashes
  (works in both bash and Python, guaranteed no spaces).

- tools/environments/local.py _make_run_env PATH injection: `/usr/bin not
  in split(":")` heuristic mangles Windows PATH (";" separator).  Gate
  the injection behind `not _IS_WINDOWS`.

- hermes_cli/gateway.py launch_detached_profile_gateway_restart: outer
  Popen + watcher-script Popen both used start_new_session=True, which
  Windows silently ignores.  Watcher stayed attached to CLI's console,
  died when user closed terminal after `hermes update`, left gateway
  stale.  Now branches through windows_detach_popen_kwargs() helper
  (CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW on
  Windows, start_new_session=True on POSIX — identical to main).
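
A sketch of what a helper like `windows_detach_popen_kwargs()` might look like, assuming it only has to return keyword arguments for `subprocess.Popen`. The `is_windows` parameter and the numeric fallbacks are illustrative assumptions so the sketch can be exercised on any platform; they are not taken from the commit.

```python
import subprocess
import sys

# Win32 process-creation flags. The subprocess module only defines these
# on Windows, so fall back to their documented numeric values elsewhere.
CREATE_NEW_PROCESS_GROUP = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0x00000200)
DETACHED_PROCESS = getattr(subprocess, "DETACHED_PROCESS", 0x00000008)
CREATE_NO_WINDOW = getattr(subprocess, "CREATE_NO_WINDOW", 0x08000000)


def windows_detach_popen_kwargs(is_windows: bool = sys.platform == "win32") -> dict:
    """Popen kwargs that detach a child from the parent's console/session."""
    if is_windows:
        flags = CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW
        return {"creationflags": flags}
    # POSIX: same behaviour as before the change.
    return {"start_new_session": True}
```

A call site would then spread the result, for example `subprocess.Popen(cmd, **windows_detach_popen_kwargs())`.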

## MEDIUM fixes

- gateway/run.py /restart and /update handlers: hardcoded bash/setsid
  chain crashes on Windows when user triggers /update in-gateway.  Now
  has sys.platform=="win32" branch using sys.executable + a tiny
  Python watcher with proper detach flags.  POSIX path is unchanged.

- cli.py _git_repo_root: Git on Windows sometimes returns /c/Users/...
  style paths that break subprocess.Popen(cwd=...) and Path().resolve().
  Added _normalize_git_bash_path() helper that translates /c/Users,
  /cygdrive/c, /mnt/c variants to native C:\Users form.  POSIX no-op.
  _git_repo_root() now routes every result through it.

- cli.py worktree .worktreeinclude: os.symlink on directories failed
  hard on Windows (requires admin or Developer Mode).  Falls back to
  shutil.copytree with a warning log.
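
The path translation in the `_git_repo_root` fix could look roughly like this. It is a hedged sketch: the function name mirrors the commit's `_normalize_git_bash_path()`, but the regex, the `is_windows` parameter, and the edge-case handling are assumptions.

```python
import re

# Matches /c/..., /cygdrive/c/... and /mnt/c/... (single-letter drive only).
_DRIVE_PATH = re.compile(r"^/(?:cygdrive/|mnt/)?([A-Za-z])(/.*)?$")


def normalize_git_bash_path(path: str, is_windows: bool) -> str:
    """Translate a Git Bash style path to native C:\\... form on Windows."""
    if not is_windows:
        return path  # POSIX no-op
    match = _DRIVE_PATH.match(path)
    if match is None:
        return path  # not a drive-style path; leave untouched
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return f"{drive}:" + rest.replace("/", "\\")
```

Paths whose first segment is longer than one letter, such as `/usr/bin`, are left alone because they do not match the drive pattern.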

## Tests

- 29 new tests in tests/tools/test_windows_native_support.py covering:
  subprocess_compat helpers, TUI entry signal guards, kanban waitpid
  guard, code_execution TCP fallback source-level invariants, cron bash
  resolution, npm/npx bare-spawn lint per-file, local env Windows temp
  dir, PATH injection gating, git bash path normalization, symlink
  fallback, gateway detached watcher flags.

- One existing test assertion adjusted in test_browser_homebrew_paths:
  it compared captured Popen argv to the BARE `"npx"` literal; after the
  shutil.which() change argv[0] is the absolute path.  New assertion
  checks the shape (two items, second is `agent-browser`) rather than
  the exact first-item string.  Behaviour unchanged; test was too strict.

All 56 tests pass on Linux (30 from previous commits + 26 new).
267 tests from the affected files/dirs (browser, code_exec, local_env,
process_registry, kanban_db, windows_compat) all pass — zero regressions.
tests/hermes_cli/ (3909 pass) and tests/gateway/ (5021 pass) unchanged;
all pre-existing test failures confirmed unrelated via `git stash` re-run.

## What's still deferred (LOW priority)

- Visible cmd-window flashes on short-lived console apps (~14 sites) —
  cosmetic, needs a follow-up pass once we have user reports.
- agent/file_safety.py POSIX-only security deny patterns — separate
  hardening task.
- tools/process_registry.py returning "/tmp" as fallback — theoretical;
  reachable only when all env-var candidates fail.
2026-05-07 17:29:31 -07:00
Teknium 1da89528e7 fix(windows-editor): default EDITOR=notepad so /edit and Ctrl+X Ctrl+E work
Pre-existing Windows bug surfaced while reviewing the portable-MinGit
install: prompt_toolkit's Buffer.open_in_editor() falls back to POSIX
absolute paths (/usr/bin/nano, /usr/bin/vi, /usr/bin/emacs) that don't
exist on native Windows.  When neither $EDITOR nor $VISUAL is set,
Ctrl+X Ctrl+E ("open prompt in editor") and /edit both silently do
nothing on Windows — the user hits the key, nothing happens, no error.

This wasn't caused by MinGit (full Git for Windows doesn't fix it either,
because the Windows Python subprocess call resolves `/usr/bin/nano` as
`C:\usr\bin\nano`, which doesn't exist even with nano installed).

Fixes:
- hermes_cli/stdio.py::configure_windows_stdio now sets EDITOR=notepad
  on Windows if neither EDITOR nor VISUAL is set.  notepad.exe is in
  every Windows install, works as a blocking editor (subprocess.call
  waits for the window to close), and writes back to the file.
- hermes_cli/config.py (hermes config edit): reorder fallback list so
  Windows tries notepad first — previously nano led the list, which
  required Git Bash / WSL to be in PATH.
- Users who want VSCode / Neovim / Notepad++ can still override via
  $env:EDITOR — that's checked before our default kicks in.  Docstring
  spells out the common overrides.
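
The defaulting rule above is small enough to sketch. The helper name and its parameters are hypothetical (the commit puts this logic inside `configure_windows_stdio`); the sketch only shows the precedence: an explicit EDITOR or VISUAL always wins.

```python
from typing import MutableMapping


def apply_default_editor(env: MutableMapping[str, str], is_windows: bool) -> None:
    """Set EDITOR=notepad on Windows when the user has not chosen an editor."""
    if not is_windows:
        return
    if env.get("EDITOR") or env.get("VISUAL"):
        return  # respect the user's explicit choice
    env["EDITOR"] = "notepad"
```

In real use the mapping would be `os.environ`, called once at startup before prompt_toolkit reads it.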

The Ink TUI (`hermes --tui`) already handled Windows correctly via
ui-tui/src/lib/editor.ts falling back to notepad.exe on win32 — this
commit brings the classic prompt_toolkit CLI into parity.

3 new tests in test_windows_native_support.py verify:
- EDITOR=notepad gets set when unset on Windows
- Explicit $EDITOR is respected
- $VISUAL is respected (not overwritten by our default)
2026-05-07 16:46:37 -07:00
Teknium 5486ad2f2a feat(windows-install): bundle portable MinGit instead of relying on winget
User hit a real failure case: their system Git was in a half-installed state
(can neither uninstall nor reinstall) and winget refused to work around it.
We were one step away from shipping an installer that would have left users
with exactly the problem they already had.

What other agents do (reality check):
- Claude Code: requires pre-installed Git; breaks if user doesn't have it.
- OpenCode, Codex: don't need bash at all — PowerShell-first design.
- Cline: uses whatever shell VSCode is configured with; installs nothing.

None of them solve the "broken system Git" problem.  We need to own our Git.

Changes:
- scripts/install.ps1::Install-Git: dropped winget path entirely.  Now:
  (1) use existing git if present; (2) download portable MinGit from the
  official git-for-windows GitHub release to %LOCALAPPDATA%\hermes\git.
  No winget, no admin, no Windows installer registry, no system impact.
- Added %LOCALAPPDATA%\hermes\git\{cmd,usr\bin} to User PATH so git + bash
  + POSIX coreutils (which, env, grep, …) resolve in fresh shells.
- tools/environments/local.py::_find_bash: reorder so Hermes' portable
  MinGit install is checked BEFORE falling through to shutil.which("bash")
  or system install locations.  This way a broken system Git can't
  hijack the bash lookup.
- README + installation docs reworded to reflect the new story: "portable
  Git Bash, isolated from any system install, recoverable via rm -rf if it
  ever breaks."

Recoverability: if Hermes' Git install ever breaks, run
``Remove-Item -Recurse -Force "$env:LOCALAPPDATA\hermes\git"`` in PowerShell
and re-run the installer. No system impact, no uninstall drama, no winget
to fight with.
2026-05-07 16:38:11 -07:00
Teknium fda234a210 feat(windows): close native-Windows install gaps — crash-free startup, UTF-8 stdio, tzdata dep, docs
Native Windows (with Git for Windows installed) can now run the Hermes CLI
and gateway end-to-end without crashing.  install.ps1 already existed and
the Git Bash terminal backend was already wired up — this PR fills the
remaining gaps discovered by auditing every Windows-unsafe primitive
(`signal.SIGKILL`, `os.kill(pid, 0)` probes, bare `fcntl`/`termios`
imports) and by comparing hermes against how Claude Code, OpenCode, Codex,
and Cline handle native Windows.

## What changed

### UTF-8 stdio (new module)
- `hermes_cli/stdio.py` — single `configure_windows_stdio()` entry point.
  Flips the console code page to CP_UTF8 (65001), reconfigures
  `sys.stdout`/`stderr`/`stdin` to UTF-8, sets `PYTHONIOENCODING` + `PYTHONUTF8`
  for subprocesses.  No-op on non-Windows.  Opt out via `HERMES_DISABLE_WINDOWS_UTF8=1`.
- Called early in `cli.py::main`, `hermes_cli/main.py::main`, and
  `gateway/run.py::main` so Unicode banners (box-drawing, geometric
  symbols, non-Latin chat text) don't `UnicodeEncodeError` on cp1252
  consoles.
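
A rough sketch of the steps listed above, assuming the function only needs the Win32 `SetConsoleCP`/`SetConsoleOutputCP` calls plus `reconfigure()` on the standard streams. The boolean return value and the `errors="replace"` choice are assumptions, not details from the commit.

```python
import os
import sys


def configure_windows_stdio() -> bool:
    """Switch console and stdio to UTF-8. Returns False when nothing is done."""
    if sys.platform != "win32":
        return False  # no-op on POSIX
    if os.environ.get("HERMES_DISABLE_WINDOWS_UTF8") == "1":
        return False  # explicit opt-out

    import ctypes  # lazy import: ctypes.windll only exists on Windows

    kernel32 = ctypes.windll.kernel32
    kernel32.SetConsoleCP(65001)        # input code page -> CP_UTF8
    kernel32.SetConsoleOutputCP(65001)  # output code page -> CP_UTF8

    for stream in (sys.stdin, sys.stdout, sys.stderr):
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            reconfigure(encoding="utf-8", errors="replace")

    # Child Python processes inherit these; keep any value the user set.
    os.environ.setdefault("PYTHONIOENCODING", "utf-8")
    os.environ.setdefault("PYTHONUTF8", "1")
    return True
```

Using `setdefault` keeps the call idempotent and preserves environment values the user has already chosen.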

### Crash sites fixed
- `hermes_cli/main.py:7970` (hermes update → stuck gateway sweep): raw
  `os.kill(pid, _signal.SIGKILL)` → `gateway.status.terminate_pid(pid, force=True)`
  which routes through `taskkill /T /F` on Windows.
- `hermes_cli/profiles.py::_stop_gateway_process`: same fix — also
  converted SIGTERM path to `terminate_pid()` and widened OSError catch
  on the intermediate `os.kill(pid, 0)` probe.
- `hermes_cli/kanban_db.py:2914, 3041`: raw `signal.SIGKILL` →
  `getattr(signal, "SIGKILL", signal.SIGTERM)` fallback (matches the
  pattern already used in `gateway/status.py`).
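
The two patterns in this list can be illustrated briefly. This is a sketch under assumptions: `taskkill_argv` is a hypothetical helper showing the kind of command line `terminate_pid(force=True)` is described as using, not its actual code.

```python
import signal

# signal.SIGKILL is not defined on Windows; fall back to SIGTERM there.
FORCE_KILL_SIGNAL = getattr(signal, "SIGKILL", signal.SIGTERM)


def taskkill_argv(pid: int) -> list[str]:
    """argv for force-killing a process tree on Windows.

    /T terminates child processes as well, /F forces termination.
    """
    return ["taskkill", "/PID", str(pid), "/T", "/F"]
```

The `getattr` fallback is resolved at import time, so call sites stay free of platform checks.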

### OSError widening on `os.kill(pid, 0)` probes
Windows raises `OSError` (WinError 87) for a gone PID instead of
`ProcessLookupError`.  Widened the catch at:
- `gateway/run.py:15101` (`--replace` wait-for-exit loop — without this,
  the loop busy-spins the full 10s every Windows gateway start)
- `hermes_cli/gateway.py:228, 460, 940`
- `hermes_cli/profiles.py:777`
- `tools/process_registry.py::_is_host_pid_alive`
- `tools/browser_tool.py:1170, 1206`

### Dashboard PTY graceful degradation
`hermes_cli/pty_bridge.py` depends on `fcntl`/`termios`/`ptyprocess`,
none of which exist on native Windows.  Previously a Windows dashboard
would crash on `import hermes_cli.web_server` because of a top-level
import.  Now:
- `hermes_cli/web_server.py` wraps the pty_bridge import in
  `try/except ImportError` and sets `_PTY_BRIDGE_AVAILABLE=False`.
- The `/api/pty` WebSocket handler returns a friendly "use WSL2 for
  this tab" message instead of exploding.
- Every other dashboard feature (sessions, jobs, metrics, config
  editor) runs natively on Windows.

### Dependency
- `pyproject.toml`: add `tzdata>=2023.3; sys_platform == 'win32'` so
  Python's `zoneinfo` works on Windows (which has no IANA tzdata
  shipped with the OS).  Credits @sprmn24 (PR #13182).

### Docs
- README.md: removed "Native Windows is not supported"; added
  PowerShell one-liner and Git-for-Windows prerequisite note.
- `website/docs/getting-started/installation.md`: new Windows section
  with capability matrix (everything native except the dashboard
  `/chat` PTY tab, which is WSL2-only).
- `website/docs/user-guide/windows-wsl-quickstart.md`: reframed as
  "WSL2 as an alternative to native" rather than "the only way".
- `website/docs/developer-guide/contributing.md`: updated
  cross-platform guidance with the `signal.SIGKILL` / `OSError`
  rules we enforce now.
- `website/docs/user-guide/features/web-dashboard.md`: acknowledged
  native Windows works for everything except the embedded PTY pane.

## Why this shape

Pulled from a survey of how other agent codebases handle native
Windows (Claude Code, OpenCode, Codex, Cline):

- All four treat Git Bash as the canonical shell on Windows, same as
  hermes already does in `tools/environments/local.py::_find_bash()`.
- None of them force `SetConsoleOutputCP` — but they don't have to,
  Node/Rust write UTF-16 to the Win32 console API.  Python does not get
  that for free, so we flip CP_UTF8 via ctypes.
- None of them ship PowerShell-as-primary-shell (Claude Code exposes
  PS as a secondary tool; scope creep for this PR).
- All of them use `taskkill /T /F` for force-kill on Windows, which
  is exactly what `gateway.status.terminate_pid(force=True)` does.

## Non-goals (deliberate scope limits)

- No PowerShell-as-a-second-shell tool — worth designing separately.
- No terminal routing rewrite (#12317, #15461, #19800 cluster) — that's
  the hardest design call and needs a separate doc.
- No wholesale `open()` → `open(..., encoding="utf-8")` sweep (Tianworld
  cluster) — will do as follow-up if users hit actual breakage; most
  modern code already specifies it.

## Validation

- 28 new tests in `tests/tools/test_windows_native_support.py` — all
  platform-mocked, pass on Linux CI.  Cover:
  - `configure_windows_stdio` idempotency, opt-out, env-preservation
  - `terminate_pid` taskkill routing, failure → OSError, FileNotFoundError fallback
  - `getattr(signal, "SIGKILL", …)` fallback shape
  - `_is_host_pid_alive` OSError widening (Windows-gone-PID behavior)
  - Source-level checks that all entry points call `configure_windows_stdio`
  - pty_bridge import-guard present in `web_server.py`
  - README no longer says "not supported"
- 12 pre-existing tests in `tests/tools/test_windows_compat.py` still pass.
- `tests/hermes_cli/` ran fully (3909 passed, 9 failures — all confirmed
  pre-existing on main by stash-test).
- `tests/gateway/` ran fully (5021 passed, 1 pre-existing failure).
- `tests/tools/test_process_registry.py` + `test_browser_*` pass.
- Manual smoke: `import hermes_cli.stdio; import gateway.run;
  import hermes_cli.web_server` — all clean, `_PTY_BRIDGE_AVAILABLE=True`
  on Linux (as expected).

## Files

- New: `hermes_cli/stdio.py`, `tests/tools/test_windows_native_support.py`
- Modified: `cli.py`, `gateway/run.py`, `hermes_cli/main.py`,
  `hermes_cli/profiles.py`, `hermes_cli/gateway.py`,
  `hermes_cli/kanban_db.py`, `hermes_cli/pty_bridge.py`,
  `hermes_cli/web_server.py`, `tools/browser_tool.py`,
  `tools/process_registry.py`, `pyproject.toml`, `README.md`, and 4
  docs pages.

Credits to everyone whose prior PR work informed these fixes — see
the co-author trailers.  All of the PRs listed in
`~/.hermes/plans/windows-support-prs.md` fixing `os.kill` / `signal.SIGKILL`
/ UTF-8 stdio / tzdata / README patterns found the same issues; this PR
consolidates them.

Co-authored-by: Philip D'Souza <9472774+PhilipAD@users.noreply.github.com>
Co-authored-by: Arecanon <42595053+ArecaNon@users.noreply.github.com>
Co-authored-by: XiaoXiao0221 <263113677+XiaoXiao0221@users.noreply.github.com>
Co-authored-by: Lars Hagen <1360677+lars-hagen@users.noreply.github.com>
Co-authored-by: Luan Dias <65574834+luandiasrj@users.noreply.github.com>
Co-authored-by: Ruzzgar <ruzzgarcn@gmail.com>
Co-authored-by: sprmn24 <oncuevtv@gmail.com>
Co-authored-by: adybag14-cyber <252811164+adybag14-cyber@users.noreply.github.com>
Co-authored-by: Prasanna28Devadiga <54196612+Prasanna28Devadiga@users.noreply.github.com>
2026-05-07 16:31:40 -07:00
teknium1 7f369bfe55 chore(release): add hllqkb to AUTHOR_MAP for PR #21288 salvage 2026-05-07 15:21:34 -07:00
hllqkb c80fa728bd fix(installer): set UV_NO_CONFIG=1 to avoid permission denied under sudo -u
When the installer is run via `sudo -u`, uv resolves config file
paths against the process owner's (root) home directory rather than the
effective user's, causing a Permission denied error when trying to read
/root/uv.toml.

Setting UV_NO_CONFIG=1 prevents uv from discovering any config files
(uv.toml, pyproject.toml) during installation, which is the correct
behavior for a bootstrap script that manages its own environment.

Fixes #21269
2026-05-07 15:21:34 -07:00
teknium 292f468366 fix(mcp): unwrap platforms key in channels_list
channels_list was iterating directory.items() directly, yielding
("updated_at", str) and ("platforms", dict) pairs — neither passed
the isinstance(entries_list, list) check, so the inner loop never ran
and every call returned count=0 even when channel_directory.json was
populated.

The writer (gateway/channel_directory.py) wraps the payload as
{"updated_at": ..., "platforms": {...}}; every other reader in the
codebase unwraps via directory.get("platforms", {}). This aligns
channels_list with that convention.

Also tightens the existing test_channels_with_directory test, which
bypassed the bug by asserting against _load_channel_directory() directly
instead of calling channels_list. It now calls the tool end-to-end and
a new test_channels_with_directory_platform_filter covers the filter
path. Both tests fail against the pre-fix code.

Closes #21474

Co-authored-by: chrisworksai <262485129+chrisworksai@users.noreply.github.com>
2026-05-07 13:41:16 -07:00
Austin Pickett d87c7b99e2 fix(analytics): prevent silent token loss and add Claude 4.5–4.7 pricing (#21455)
- Add pricing entries for Claude Opus 4.5/4.6/4.7, Sonnet 4.5/4.6, and
  Haiku 4.5 with updated source URLs (platform.claude.com)
- Add _normalize_anthropic_model_name() to handle dot-notation variants
  (e.g. claude-opus-4.7 → claude-opus-4-7) for pricing lookups
- Fix silent token loss: ensure session row exists before UPDATE in both
  run_agent.py and hermes_state.py (INSERT OR IGNORE is idempotent)
- Log token persistence failures at DEBUG level instead of swallowing
  them silently — makes undercounted analytics diagnosable
- Surface reasoning tokens in CLI /usage and TUI usage panel
- Add 'reasoning' and 'cost_status' fields to TUI Usage type
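
The "ensure the row exists before UPDATE" fix can be sketched with an in-memory SQLite database. The table and column names here are assumptions for illustration, not the actual Hermes schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, input_tokens INTEGER NOT NULL DEFAULT 0)"
)


def add_tokens(conn: sqlite3.Connection, session_id: str, tokens: int) -> int:
    """Add tokens to a session, creating the row first if it is missing.

    Without the INSERT, an UPDATE against a missing row matches zero rows
    and the tokens are dropped without any error.
    """
    conn.execute("INSERT OR IGNORE INTO sessions (id) VALUES (?)", (session_id,))
    cur = conn.execute(
        "UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
        (tokens, session_id),
    )
    conn.commit()
    return cur.rowcount  # number of rows the UPDATE touched
```

Because `INSERT OR IGNORE` does nothing when the row already exists, the helper is safe to call on every turn.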
2026-05-07 13:24:31 -07:00
Teknium cff821e2dc docs: register triage_specifier in the aux-models enumerations (#21494)
The kanban specifier landed in #21435 with feature-page docs (the
kanban page itself + the CLI reference table), but three other docs
pages enumerate every auxiliary task slot and were missed:

  user-guide/configuration.md            Auxiliary Models section —
                                         interactive picker example
                                         + full auxiliary config
                                         reference YAML block.
  user-guide/features/fallback-providers.md
                                         Both 'Auxiliary Tasks' and
                                         'Fallback Reference' tables.
  user-guide/features/kanban-tutorial.md
                                         Triage-column bullet now
                                         mentions the Specify
                                         button + CLI + slash command.

No other docs enumerate the aux task slots (verified with
grep -r 'title_generation\|auxiliary.session_search' website/docs/).
2026-05-07 13:07:18 -07:00
teknium1 2214ab1073 chore: fix AUTHOR_MAP for johnsonblake1@gmail.com → voteblake
The existing mapping pointed to the wrong GitHub user (blakejohnson, id
866695, IBM) — the email actually belongs to voteblake (id 5585957),
confirmed via search/commits?author-email. Mis-credited since 323ca7084.
2026-05-07 13:04:42 -07:00
Blake Johnson 9076a2e74e fix(agent): keep Nous GPT-5 fallback on chat completions 2026-05-07 13:04:42 -07:00
Teknium 24d48ffb82 feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435)
* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks

The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires
it up as a dedicated CLI verb.

`hermes kanban specify <id>` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks
go straight to the dispatcher on the same tick.

Surface:

  hermes kanban specify <task_id>               # single task
  hermes kanban specify --all [--tenant T]      # sweep triage column
  hermes kanban specify ... --author NAME       # audit-comment author
  hermes kanban specify ... --json              # one JSON line per task

Design choices:

  - Parent gating is preserved. specify_triage_task flips to 'todo',
    then recompute_ready promotes to 'ready' only when parents are
    done — same rule as a normal parent-gated todo.
  - No daemon, no background watcher. Every invocation is explicit —
    keeps cost predictable and doesn't fight the dispatcher loop.
  - Response parse is lenient: strict JSON preferred, markdown-fence
    tolerated, raw-body fallback on malformed JSON so the LLM can't
    strand a task in triage.
  - All failure modes (no aux client, API error, task moved out of
    triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
    --all continues past individual failures.
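
The lenient response parse in the design notes above might be sketched like this. The function name, the `{"title", "body"}` return shape, and the fence regex are assumptions; the real parser in hermes_cli/kanban_specify.py may differ.

```python
import json
import re

# A whole-response markdown fence, optionally tagged "json".
_FENCE = re.compile(r"^`{3}(?:json)?\s*\n(.*?)\n`{3}\s*$", re.DOTALL)


def parse_spec_response(raw: str, fallback_title: str) -> dict:
    """Prefer strict JSON, tolerate a fenced block, else use the raw text."""
    text = raw.strip()
    fenced = _FENCE.match(text)
    if fenced:
        text = fenced.group(1).strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict) and isinstance(data.get("body"), str):
        title = str(data.get("title") or fallback_title)
        return {"title": title, "body": data["body"]}
    # Malformed output: keep the original title and use the raw text as body,
    # so the task can still leave triage.
    return {"title": fallback_title, "body": raw.strip()}
```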

Changes:

  hermes_cli/kanban_db.py    + specify_triage_task()
  hermes_cli/kanban_specify.py  NEW (~220 LOC — prompt, parse, call)
  hermes_cli/kanban.py       + specify subcommand + _cmd_specify
  hermes_cli/config.py       + auxiliary.triage_specifier task slot
  website/docs/user-guide/features/kanban.md  specify + config notes
  website/docs/reference/cli-commands.md      CLI reference entry
  tests/hermes_cli/test_kanban_specify_db.py    NEW (10 tests)
  tests/hermes_cli/test_kanban_specify.py       NEW (20 tests)

Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.

* feat(kanban): wire specifier into dashboard and gateway slash

Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.

Dashboard (plugins/kanban/dashboard/)
  - POST /tasks/:id/specify  NEW endpoint. Thin wrapper around
    kanban_specify.specify_task(). Returns the CLI outcome shape
    ({ok, task_id, reason, new_title}); ok=false with a human reason
    is a 200, not a 4xx, so the UI can render it inline without
    treating 'no aux client configured' as a crash.
  - Runs sync in FastAPI's threadpool because the LLM call can take
    tens of seconds on reasoning models.
  - Pins HERMES_KANBAN_BOARD around the specify call so the module's
    argless kb.connect() lands on the right board.
  - dist/index.js: doSpecify callback threaded through the drawer →
    TaskDetail → StatusActions prop chain.  Specify button appears
    ONLY when task.status === 'triage' (elsewhere the backend would
    reject anyway — hide the button to keep the action row clean).
    Busy state (Specifying…) + inline success/error banner under the
    button using the response.reason text.
  - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
    existing --color vars so themes reskin cleanly.

Gateway slash (/kanban specify)
  - Already works via the existing run_slash → build_parser →
    kanban_command pipeline. No code change needed — slash commands
    inherit the argparse tree automatically. Added coverage:
    test_run_slash_specify_end_to_end (create --triage, specify, verify
    promotion + retitle) and test_run_slash_specify_help_is_reachable.

Tests
  - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
    REST endpoint — happy path, non-triage rejection as ok=false 200,
    missing aux client as ok=false 200.
  - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.

Docs
  - website/docs/user-guide/features/kanban.md: dashboard action row
    description mentions Specify + all three surfaces. REST table
    gains /tasks/:id/specify. Slash examples include /kanban specify.

Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
2026-05-07 13:04:41 -07:00
adybag14-cyber 732a6c45fa feat: add termux doctor fallback guidance for blocked extras 2026-05-07 13:04:08 -07:00
adybag14-cyber dc5ef1ac8e fix: add termux-all install profile and safe fallbacks 2026-05-07 13:04:08 -07:00
adybag14-cyber da18fd084a fix: strengthen termux install network prerequisites 2026-05-07 13:04:08 -07:00
adybag14-cyber 54c0b10d14 fix(update): add heartbeat during dependency install 2026-05-07 13:04:08 -07:00
Abd0r 04193cf71c feat(web): add Brave Search (free tier) and DDGS search providers
Both implement WebSearchProvider via tools/web_providers/ — matching the
existing SearXNG pattern (PR #5c906d702). Search-only; pair with any
extract provider via web.extract_backend.

- tools/web_providers/brave_free.py — Brave Search API (free tier, 2k
  queries/mo). Uses BRAVE_SEARCH_API_KEY as X-Subscription-Token.
- tools/web_providers/ddgs.py — DuckDuckGo via the ddgs Python package.
  No API key; gated on package importability.
- tools/web_tools.py: both backends added to _get_backend() config list
  and auto-detect chain (trails paid providers), _is_backend_available,
  web_search_tool dispatch, web_extract_tool + web_crawl_tool search-only
  refusals, check_web_api_key, and the __main__ diagnostic. Introduces
  _ddgs_package_importable() helper so tests can monkeypatch a single
  symbol for the ddgs availability check.
- hermes_cli/tools_config.py: picker entries for both providers; ddgs
  gets a post_setup handler that runs `pip install ddgs`.
- hermes_cli/config.py: BRAVE_SEARCH_API_KEY in OPTIONAL_ENV_VARS.
- scripts/release.py: AUTHOR_MAP entry for @Abd0r.
- tests: 14 new tests (brave-free) + 15 new tests (ddgs) covering
  provider unit behavior, backend wiring, and search-only refusals.

Salvages the brave-free + ddgs portion of PR #19796. Not included: the
in-line helpers in web_tools.py (replaced with provider modules to match
the shipped architecture), the lynx-based extract path (these backends
should refuse extract with a clear error — users pair with a real
extract provider), and scripts/start-llama-server.sh (unrelated).

Co-authored-by: Abd0r <223003280+Abd0r@users.noreply.github.com>
2026-05-07 09:59:17 -07:00
xxxigm cdc0a47dd5 test(hermes_constants): cover parse_reasoning_effort() 2026-05-07 09:59:07 -07:00
Teknium 7e2af0c2e8 feat(acp): pass image file attachments through as image_url parts
Extends PR #21400's resource inlining with image-specific handling: ACP
resource_link and embedded blob resources with an image/* mime (or image
file suffix when mime is missing) now emit an OpenAI image_url part
with a base64 data URL, so vision models actually see the image
instead of a [Binary file omitted] note. Non-image resources keep the
existing text-inlining behavior.

Adds 3 tests: local PNG via resource_link, JPEG mime inferred from
suffix when client omits mimeType, and embedded blob PNG.
2026-05-07 09:24:32 -07:00
HenkDz 733e297b8a fix(acp): inline file attachment resources 2026-05-07 09:24:32 -07:00
110 changed files with 7235 additions and 414 deletions
+14 -2
@@ -30,15 +30,27 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
## Quick Install
### Linux, macOS, WSL2, Termux
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
Works on Linux, macOS, WSL2, and Android via Termux. The installer handles the platform-specific setup for you.
### Windows (native, PowerShell)
Run this in PowerShell:
```powershell
irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex
```
The installer handles everything: uv, Python 3.11, Node.js, ripgrep, ffmpeg, **and a portable Git Bash** (MinGit, unpacked to `%LOCALAPPDATA%\hermes\git` — no admin required, completely isolated from any system Git install). Hermes uses this bundled Git Bash to run shell commands.
If you already have Git installed, the installer detects it and uses that instead. Otherwise a ~45MB MinGit download is all you need — it won't touch or interfere with any system Git.
> **Android / Termux:** The tested manual path is documented in the [Termux guide](https://hermes-agent.nousresearch.com/docs/getting-started/termux). On Termux, Hermes installs a curated `.[termux]` extra because the full `.[all]` extra currently pulls Android-incompatible voice dependencies.
>
> **Windows:** Native Windows is not supported. Please install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run the command above.
> **Windows:** Native Windows is supported — the PowerShell one-liner above installs everything. If you'd rather use WSL2, the Linux command works there too. Native Windows install lives under `%LOCALAPPDATA%\hermes`; WSL2 installs under `~/.hermes` as on Linux. The only Hermes feature that currently needs WSL2 specifically is the browser-based dashboard chat pane (it uses a POSIX PTY — classic CLI and gateway both run natively).
After installation:
+4
@@ -13,6 +13,10 @@ Usage::
hermes-acp
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import asyncio
import logging
import sys
+288 -2
@@ -3,13 +3,16 @@
from __future__ import annotations
import asyncio
import base64
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Deque, Optional
from urllib.parse import unquote, urlparse
import acp
from acp.schema import (
@@ -18,6 +21,7 @@ from acp.schema import (
AuthenticateResponse,
AvailableCommand,
AvailableCommandsUpdate,
BlobResourceContents,
ClientCapabilities,
EmbeddedResourceContentBlock,
ForkSessionResponse,
@@ -46,6 +50,7 @@ from acp.schema import (
SessionResumeCapabilities,
SessionInfo,
TextContentBlock,
TextResourceContents,
UnstructuredCommandInput,
Usage,
UsageUpdate,
@@ -83,6 +88,272 @@ _executor = ThreadPoolExecutor(max_workers=4, thread_name_prefix="acp-agent")
# does not expose a client-side limit, so this is a fixed cap that clients
# paginate against using `cursor` / `next_cursor`.
_LIST_SESSIONS_PAGE_SIZE = 50
_MAX_ACP_RESOURCE_BYTES = 512 * 1024
_TEXT_RESOURCE_MIME_PREFIXES = ("text/",)
_TEXT_RESOURCE_MIME_TYPES = {
"application/json",
"application/javascript",
"application/typescript",
"application/xml",
"application/x-yaml",
"application/yaml",
"application/toml",
"application/sql",
}
def _resource_display_name(uri: str, name: str | None = None, title: str | None = None) -> str:
"""Human-readable attachment name for prompt context."""
raw_name = (name or "").strip()
raw_title = (title or "").strip()
if raw_title and raw_name and raw_title != raw_name:
return f"{raw_title} ({raw_name})"
if raw_title:
return raw_title
if raw_name:
return raw_name
parsed = urlparse(uri)
candidate = parsed.path if parsed.scheme else uri
return Path(unquote(candidate)).name or uri or "resource"
def _is_text_resource(mime_type: str | None) -> bool:
mime = (mime_type or "").split(";", 1)[0].strip().lower()
if not mime:
return False
return mime.startswith(_TEXT_RESOURCE_MIME_PREFIXES) or mime in _TEXT_RESOURCE_MIME_TYPES
def _is_image_resource(mime_type: str | None) -> bool:
mime = (mime_type or "").split(";", 1)[0].strip().lower()
return mime.startswith("image/")
def _guess_image_mime_from_path(path: Path) -> str | None:
suffix = path.suffix.lower()
return {
".png": "image/png",
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".gif": "image/gif",
".webp": "image/webp",
".bmp": "image/bmp",
".svg": "image/svg+xml",
}.get(suffix)
def _image_data_url(data: bytes, mime_type: str) -> str:
return f"data:{mime_type};base64,{base64.b64encode(data).decode('ascii')}"
def _path_from_file_uri(uri: str) -> Path | None:
"""Convert local file URIs/paths from ACP clients into a readable Path.
Zed may send POSIX file URIs from Linux/WSL workspaces or Windows-ish paths
when launched through wsl.exe. Translate the common Windows drive form to
/mnt/<drive>/... so Hermes running in WSL can read it.
"""
raw = (uri or "").strip()
if not raw:
return None
parsed = urlparse(raw)
if parsed.scheme and parsed.scheme != "file":
return None
if parsed.scheme == "file":
if parsed.netloc and parsed.netloc not in {"", "localhost"}:
return None
path_text = unquote(parsed.path or "")
else:
path_text = unquote(raw)
# file:///C:/Users/... or C:\Users\...
if len(path_text) >= 3 and path_text[0] == "/" and path_text[2] == ":" and path_text[1].isalpha():
drive = path_text[1].lower()
rest = path_text[3:].lstrip("/\\").replace("\\", "/")
return Path("/mnt") / drive / rest
if len(path_text) >= 2 and path_text[1] == ":" and path_text[0].isalpha():
drive = path_text[0].lower()
rest = path_text[2:].lstrip("/\\").replace("\\", "/")
return Path("/mnt") / drive / rest
return Path(path_text)
def _decode_text_bytes(data: bytes, mime_type: str | None) -> str | None:
"""Decode resource bytes if they are probably text; return None for binary."""
if b"\x00" in data and not _is_text_resource(mime_type):
return None
for encoding in ("utf-8-sig", "utf-8", "latin-1"):
try:
return data.decode(encoding)
except UnicodeDecodeError:
continue
return data.decode("utf-8", errors="replace")
def _format_resource_text(
*,
uri: str,
body: str,
name: str | None = None,
title: str | None = None,
note: str | None = None,
) -> str:
display = _resource_display_name(uri, name=name, title=title)
header = f"[Attached file: {display}]"
if note:
header += f" ({note})"
return f"{header}\nURI: {uri}\n\n{body}"
def _resource_link_to_parts(block: ResourceContentBlock) -> list[dict[str, Any]]:
"""Convert an ACP resource_link block to OpenAI content parts.
Returns a list of {"type": "text", ...} and/or {"type": "image_url", ...}
parts. Image resources produce an image_url part with a small text header
so the model knows which attachment it is. Non-image resources return a
single text part with the inlined file body (or a binary-omit note).
"""
uri = str(getattr(block, "uri", "") or "").strip()
if not uri:
return []
name = str(getattr(block, "name", "") or "").strip() or None
title = str(getattr(block, "title", "") or "").strip() or None
mime_type = str(getattr(block, "mime_type", "") or "").strip() or None
path = _path_from_file_uri(uri)
if path is None:
return [{
"type": "text",
"text": _format_resource_text(
uri=uri,
name=name,
title=title,
body="[Resource link only; Hermes cannot read non-file ACP resource URIs directly.]",
),
}]
# Image files: emit a short text header + image_url data URL so vision
# models can see the attachment instead of a "binary omitted" note.
image_mime = mime_type if _is_image_resource(mime_type) else _guess_image_mime_from_path(path)
if image_mime and _is_image_resource(image_mime):
try:
size = path.stat().st_size
if size > _MAX_ACP_RESOURCE_BYTES:
return [{
"type": "text",
"text": _format_resource_text(
uri=uri,
name=name,
title=title,
body=f"[Image too large to inline: {size} bytes, cap={_MAX_ACP_RESOURCE_BYTES}]",
),
}]
with path.open("rb") as fh:
data = fh.read()
except OSError as exc:
logger.warning("ACP image resource read failed: %s", uri, exc_info=True)
return [{
"type": "text",
"text": _format_resource_text(
uri=uri,
name=name,
title=title,
body=f"[Could not read attached image: {exc}]",
),
}]
display = _resource_display_name(uri, name=name, title=title)
return [
{"type": "text", "text": f"[Attached image: {display}]\nURI: {uri}"},
{"type": "image_url", "image_url": {"url": _image_data_url(data, image_mime)}},
]
try:
size = path.stat().st_size
read_size = min(size, _MAX_ACP_RESOURCE_BYTES)
with path.open("rb") as fh:
data = fh.read(read_size)
text = _decode_text_bytes(data, mime_type)
if text is None:
return [{
"type": "text",
"text": _format_resource_text(
uri=uri,
name=name,
title=title,
body=f"[Binary file omitted: {size} bytes, mime={mime_type or 'unknown'}]",
),
}]
note = None
if size > _MAX_ACP_RESOURCE_BYTES:
note = f"truncated to {_MAX_ACP_RESOURCE_BYTES} of {size} bytes"
return [{
"type": "text",
"text": _format_resource_text(uri=uri, name=name, title=title, body=text, note=note),
}]
except OSError as exc:
logger.warning("ACP resource read failed: %s", uri, exc_info=True)
return [{
"type": "text",
"text": _format_resource_text(
uri=uri,
name=name,
title=title,
body=f"[Could not read attached file: {exc}]",
),
}]
def _embedded_resource_to_parts(block: EmbeddedResourceContentBlock) -> list[dict[str, Any]]:
resource = getattr(block, "resource", None)
if resource is None:
return []
uri = str(getattr(resource, "uri", "") or "").strip()
mime_type = str(getattr(resource, "mime_type", "") or "").strip() or None
if isinstance(resource, TextResourceContents):
return [{"type": "text", "text": _format_resource_text(uri=uri, body=resource.text)}]
if isinstance(resource, BlobResourceContents):
blob = resource.blob or ""
try:
data = base64.b64decode(blob, validate=True)
except Exception:
data = blob.encode("utf-8", errors="replace")
# Image blobs go through as image_url so vision models can see them.
if _is_image_resource(mime_type):
if len(data) > _MAX_ACP_RESOURCE_BYTES:
return [{
"type": "text",
"text": _format_resource_text(
uri=uri,
body=f"[Embedded image too large to inline: {len(data)} bytes, cap={_MAX_ACP_RESOURCE_BYTES}]",
),
}]
display = _resource_display_name(uri)
return [
{"type": "text", "text": f"[Attached image: {display}]" + (f"\nURI: {uri}" if uri else "")},
{"type": "image_url", "image_url": {"url": _image_data_url(data, mime_type or "image/png")}},
]
text = _decode_text_bytes(data[:_MAX_ACP_RESOURCE_BYTES], mime_type)
if text is None:
body = f"[Binary embedded file omitted: {len(data)} bytes, mime={mime_type or 'unknown'}]"
else:
body = text
if len(data) > _MAX_ACP_RESOURCE_BYTES:
body += f"\n\n[Truncated to {_MAX_ACP_RESOURCE_BYTES} of {len(data)} bytes]"
return [{"type": "text", "text": _format_resource_text(uri=uri, body=body)}]
text = getattr(resource, "text", None)
if text:
return [{"type": "text", "text": _format_resource_text(uri=uri, body=str(text))}]
return []
def _extract_text(
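A minimal sketch of the URI-to-WSL-path translation described in `_path_from_file_uri`, assuming only the standard library. The name `to_wsl_path` and the single `_DRIVE` regex are illustrative and not part of the diff; `PurePosixPath` is used so the example behaves the same on any host. One detail worth seeing in isolation: `urlparse` treats the drive letter of a bare `C:\...` path as a one-character scheme.

```python
from __future__ import annotations

import re
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

# Optional leading slash, a drive letter, a colon, then the rest of the path.
_DRIVE = re.compile(r"^/?([A-Za-z]):[\\/]*(.*)$", re.DOTALL)


def to_wsl_path(uri: str) -> PurePosixPath | None:
    """Hypothetical helper: map a file URI or Windows path to a WSL path."""
    raw = (uri or "").strip()
    if not raw:
        return None
    parsed = urlparse(raw)
    if parsed.scheme == "file":
        if parsed.netloc not in ("", "localhost"):
            return None  # file URI pointing at a remote host
        text = unquote(parsed.path)
    elif len(parsed.scheme) > 1:
        return None  # http:, https:, and other non-file schemes
    else:
        # No scheme, or a one-letter "scheme" that is really a drive letter.
        text = unquote(raw)
    match = _DRIVE.match(text)
    if match:
        drive = match.group(1).lower()
        rest = match.group(2).replace("\\", "/")
        return PurePosixPath("/mnt") / drive / rest
    return PurePosixPath(text)
```

For example, `to_wsl_path("file:///C:/Users/me/a.txt")` and `to_wsl_path(r"C:\Users\me\a.txt")` both map to `/mnt/c/Users/me/a.txt`, while an `https://` URI yields `None`.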
@@ -144,6 +415,20 @@ def _content_blocks_to_openai_user_content(
if image_part is not None:
parts.append(image_part)
continue
if isinstance(block, ResourceContentBlock):
resource_parts = _resource_link_to_parts(block)
for part in resource_parts:
parts.append(part)
if part.get("type") == "text":
text_parts.append(part["text"])
continue
if isinstance(block, EmbeddedResourceContentBlock):
resource_parts = _embedded_resource_to_parts(block)
for part in resource_parts:
parts.append(part)
if part.get("type") == "text":
text_parts.append(part["text"])
continue
if not parts:
return _extract_text(prompt)
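The shape of the parts emitted for an image attachment can be sketched in isolation. This is an illustration of the text-header-plus-`image_url` pairing used above, not the module's code; `image_parts` is a hypothetical name.

```python
import base64


def image_parts(display: str, data: bytes, mime_type: str) -> list[dict]:
    """Hypothetical helper: build a text header and a data-URL image part."""
    encoded = base64.b64encode(data).decode("ascii")
    url = f"data:{mime_type};base64,{encoded}"
    return [
        # The header tells the model which attachment the image belongs to.
        {"type": "text", "text": f"[Attached image: {display}]"},
        {"type": "image_url", "image_url": {"url": url}},
    ]
```

Only the text part would be mirrored into a plain-text fallback; the `image_url` part has no text to contribute.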
@@ -803,6 +1088,7 @@ class HermesACPAgent(acp.Agent):
user_text = _extract_text(prompt).strip()
user_content = _content_blocks_to_openai_user_content(prompt)
text_only_prompt = all(isinstance(block, TextContentBlock) for block in prompt)
has_content = bool(user_text) or (
isinstance(user_content, list) and bool(user_content)
)
@@ -821,7 +1107,7 @@ class HermesACPAgent(acp.Agent):
# silently append to state.queued_prompts and respond with
# "No active turn — queued for the next turn", which looks like
# /queue even though the user never typed /queue.
if isinstance(user_content, str) and user_text.startswith("/steer"):
if text_only_prompt and isinstance(user_content, str) and user_text.startswith("/steer"):
steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
interrupted_prompt = ""
rewrite_idle = False
@@ -846,7 +1132,7 @@ class HermesACPAgent(acp.Agent):
# Slash commands are text-only; if the client included images/resources,
# send the whole multimodal prompt to the agent instead of treating it as
# an ACP command.
if isinstance(user_content, str) and user_text.startswith("/"):
if text_only_prompt and isinstance(user_content, str) and user_text.startswith("/"):
response_text = self._handle_slash_command(user_text, state)
if response_text is not None:
if self._conn:
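The effect of the `text_only_prompt` guard can be sketched with stand-in block types. `TextBlock` and `ImageBlock` below are hypothetical substitutes for the ACP content block classes, and `is_slash_command` is an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:  # hypothetical stand-in for TextContentBlock
    text: str


@dataclass
class ImageBlock:  # hypothetical stand-in for any non-text block
    data: str


def is_slash_command(blocks: list) -> bool:
    """Treat the prompt as a command only if every block is plain text."""
    text_only = all(isinstance(b, TextBlock) for b in blocks)
    text = "\n".join(b.text for b in blocks if isinstance(b, TextBlock)).strip()
    return text_only and text.startswith("/")
```

With this rule, `/help` on its own is a command, while `/help` sent together with an image or resource goes to the agent as an ordinary multimodal prompt.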
+1 -1
@@ -1607,7 +1607,7 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
# terminal. The background-thread runner also hides it; this
# belt-and-suspenders path matters when a caller invokes
# run_curator_review(synchronous=True) from the CLI.
with open(os.devnull, "w") as _devnull, \
with open(os.devnull, "w", encoding="utf-8") as _devnull, \
contextlib.redirect_stdout(_devnull), \
contextlib.redirect_stderr(_devnull):
conv_result = review_agent.run_conversation(user_message=prompt)
+3 -3
@@ -754,7 +754,7 @@ def _load_context_cache() -> Dict[str, int]:
if not path.exists():
return {}
try:
with open(path) as f:
with open(path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
return data.get("context_lengths", {})
except Exception as e:
@@ -776,7 +776,7 @@ def save_context_length(model: str, base_url: str, length: int) -> None:
path = _get_context_cache_path()
try:
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "w") as f:
with open(path, "w", encoding="utf-8") as f:
yaml.dump({"context_lengths": cache}, f, default_flow_style=False)
logger.info("Cached context length %s -> %s tokens", key, f"{length:,}")
except Exception as e:
@@ -800,7 +800,7 @@ def _invalidate_cached_context_length(model: str, base_url: str) -> None:
path = _get_context_cache_path()
try:
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "w") as f:
with open(path, "w", encoding="utf-8") as f:
yaml.dump({"context_lengths": cache}, f, default_flow_style=False)
except Exception as e:
logger.debug("Failed to invalidate context length cache entry %s: %s", key, e)
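Why the explicit `encoding="utf-8"` matters can be shown with a small round trip. The sketch below writes UTF-8 and reads the file back with a chosen encoding; `roundtrip` is an illustrative name, and cp1252 stands in for a typical Windows locale default.

```python
import tempfile
from pathlib import Path


def roundtrip(text: str, read_encoding: str) -> str:
    """Write text as UTF-8, then read it back using read_encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "cache.yaml"
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        with open(path, encoding=read_encoding) as f:
            return f.read()
```

Reading back with `"utf-8"` returns the original text, whereas reading with `"cp1252"` silently turns `café` into `cafÃ©`; no exception is raised, which is why this class of bug is easy to miss.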
+1 -1
@@ -144,7 +144,7 @@ def nous_rate_limit_remaining() -> Optional[float]:
"""
path = _state_path()
try:
with open(path) as f:
with open(path, encoding="utf-8") as f:
state = json.load(f)
reset_at = state.get("reset_at", 0)
remaining = reset_at - time.time()
+1 -1
@@ -617,7 +617,7 @@ def _locked_update_approvals() -> Iterator[Dict[str, Any]]:
save_allowlist(data)
return
with open(lock_path, "a+") as lock_fh:
with open(lock_path, "a+", encoding="utf-8") as lock_fh:
fcntl.flock(lock_fh.fileno(), fcntl.LOCK_EX)
try:
data = load_allowlist()
+159 -14
@@ -1,5 +1,6 @@
from __future__ import annotations
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
@@ -82,6 +83,121 @@ _UTC_NOW = lambda: datetime.now(timezone.utc)
# Official docs snapshot entries. Models whose published pricing and cache
# semantics are stable enough to encode exactly.
_OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
# ── Anthropic Claude 4.7 ─────────────────────────────────────────────
# Opus 4.5/4.6/4.7 share $5/$25 pricing per million input/output tokens
# (new tokenizer: up to 35% more tokens for the same text).
# Source: https://platform.claude.com/docs/en/about-claude/pricing
(
"anthropic",
"claude-opus-4-7",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-opus-4-7-20250507",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.6 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-6",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-opus-4-6-20250414",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-6",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-6-20250414",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.5 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-5",
): PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-sonnet-4-5",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-haiku-4-5",
): PricingEntry(
input_cost_per_million=Decimal("1.00"),
output_cost_per_million=Decimal("5.00"),
cache_read_cost_per_million=Decimal("0.10"),
cache_write_cost_per_million=Decimal("1.25"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4 / 4.1 ─────────────────────────────────────────
(
"anthropic",
"claude-opus-4-20250514",
@@ -91,8 +207,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-prompt-caching-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -103,8 +219,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-prompt-caching-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# OpenAI
(
@@ -184,7 +300,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://openai.com/api/pricing/",
pricing_version="openai-pricing-2026-03-16",
),
# Anthropic older models (pre-4.6 generation)
# ── Anthropic older models (pre-4.5 generation) ────────────────────────
(
"anthropic",
"claude-3-5-sonnet-20241022",
@@ -194,8 +310,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -206,8 +322,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -218,8 +334,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
@@ -230,8 +346,8 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
cache_read_cost_per_million=Decimal("0.03"),
cache_write_cost_per_million=Decimal("0.30"),
source="official_docs_snapshot",
source_url="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching",
pricing_version="anthropic-pricing-2026-03-16",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# DeepSeek
(
@@ -426,8 +542,37 @@ def resolve_billing_route(
return BillingRoute(provider=provider_name or "unknown", model=model.split("/")[-1] if model else "", base_url=base_url or "", billing_mode="unknown")
def _normalize_anthropic_model_name(model: str) -> str:
"""Normalize Anthropic model name variants to canonical form.
Handles:
- Case and surrounding whitespace: " Claude-Opus-4-7 " → claude-opus-4-7
- Dot notation: claude-opus-4.7 → claude-opus-4-7
- An anthropic/ prefix: anthropic/claude-opus-4.7 → claude-opus-4-7
"""
name = model.lower().strip()
if name.startswith("anthropic/"):
name = name[len("anthropic/"):]
# Normalize dots to dashes in version numbers (e.g. 4.7 → 4-7, 4.6 → 4-6)
# But preserve the rest of the name structure
name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
return name
def _lookup_official_docs_pricing(route: BillingRoute) -> Optional[PricingEntry]:
return _OFFICIAL_DOCS_PRICING.get((route.provider, route.model.lower()))
model = route.model.lower()
# Direct lookup first
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, model))
if entry:
return entry
# Try normalized name for Anthropic (handles dot-notation like opus-4.7)
if route.provider == "anthropic":
normalized = _normalize_anthropic_model_name(model)
if normalized != model:
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, normalized))
if entry:
return entry
return None
def _openrouter_pricing_entry(route: BillingRoute) -> Optional[PricingEntry]:
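The direct-then-normalized lookup can be sketched against a one-entry table. The `PRICES` dict, `Price` tuple, and `estimate_cost` helper are illustrative assumptions; only the $5/$25 figures for `claude-opus-4-7` are taken from the entries above.

```python
import re
from decimal import Decimal
from typing import NamedTuple, Optional


class Price(NamedTuple):  # hypothetical, much smaller than PricingEntry
    input_per_million: Decimal
    output_per_million: Decimal


PRICES = {
    ("anthropic", "claude-opus-4-7"): Price(Decimal("5.00"), Decimal("25.00")),
}


def normalize(model: str) -> str:
    """Lowercase, strip an anthropic/ prefix, and turn 4.7 into 4-7."""
    name = model.lower().strip()
    if name.startswith("anthropic/"):
        name = name[len("anthropic/"):]
    return re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)


def lookup(provider: str, model: str) -> Optional[Price]:
    """Try the exact key first, then the normalized Anthropic name."""
    key = model.lower()
    entry = PRICES.get((provider, key))
    if entry is None and provider == "anthropic":
        entry = PRICES.get((provider, normalize(key)))
    return entry


def estimate_cost(price: Price, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in dollars: tokens times the per-million rate, divided by 1e6."""
    million = Decimal(1_000_000)
    return (
        Decimal(input_tokens) * price.input_per_million
        + Decimal(output_tokens) * price.output_per_million
    ) / million
```

Under these rates, 200,000 input tokens and 10,000 output tokens cost 1.00 + 0.25 = 1.25 dollars.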
+4
@@ -20,6 +20,10 @@ Usage:
python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run --distribution=image_gen
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import json
import logging
import os
+88 -9
@@ -9,10 +9,13 @@ Usage:
python cli.py # Start interactive mode with all tools
python cli.py --toolsets web,terminal # Start with specific toolsets
python cli.py --skills hermes-agent-dev,github-auth
python cli.py -q "your question" # Single query mode
python cli.py --list-tools # List available tools and exit
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import logging
import os
import shutil
@@ -728,8 +731,43 @@ def _run_cleanup():
_active_worktree: Optional[Dict[str, str]] = None
def _normalize_git_bash_path(p: Optional[str]) -> Optional[str]:
"""Translate a Git Bash-style path (``/c/Users/...``) to the native
Windows form (``C:\\Users\\...``) that Python's ``subprocess.Popen``
and ``pathlib.Path`` accept.
No-op on non-Windows and for paths that already look native. Git on
native Windows normally emits forward-slash Windows paths
(``C:/Users/...``) which both bash and Python handle, but certain
configurations (Git Bash shells, MSYS2, WSL-mounted repos) surface
``/c/...`` or ``/cygdrive/c/...`` variants.
"""
if not p:
return p
if sys.platform != "win32":
return p
import re as _re
# /c/Users/... or /C/Users/...
m = _re.match(r"^/([a-zA-Z])/(.*)$", p)
if m:
drive, rest = m.group(1), m.group(2)
return f"{drive.upper()}:\\{rest.replace('/', chr(92))}"
# /cygdrive/c/... or /mnt/c/...
m = _re.match(r"^/(?:cygdrive|mnt)/([a-zA-Z])/(.*)$", p)
if m:
drive, rest = m.group(1), m.group(2)
return f"{drive.upper()}:\\{rest.replace('/', chr(92))}"
return p
def _git_repo_root() -> Optional[str]:
"""Return the git repo root for CWD, or None if not in a repo."""
"""Return the git repo root for CWD, or None if not in a repo.
Runs through :func:`_normalize_git_bash_path` so callers can pass
the result directly to ``Path``/``subprocess.Popen(cwd=...)`` on
Windows without hitting ``C:\\c\\Users\\...`` style resolution
mistakes.
"""
import subprocess
try:
result = subprocess.run(
@@ -737,7 +775,7 @@ def _git_repo_root() -> Optional[str]:
capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
return result.stdout.strip()
return _normalize_git_bash_path(result.stdout.strip())
except Exception:
pass
return None
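A compact sketch of the Git Bash path translation, assuming a single regex covers the `/c/...`, `/cygdrive/c/...`, and `/mnt/c/...` forms. Unlike `_normalize_git_bash_path`, this illustrative `to_windows_path` has no `sys.platform` check, so it can be exercised on any OS.

```python
import re

# Optional cygdrive/ or mnt/ segment, then a single drive letter directory.
_MSYS_PATH = re.compile(r"^/(?:(?:cygdrive|mnt)/)?([a-zA-Z])/(.*)$")


def to_windows_path(p: str) -> str:
    """Hypothetical helper: /c/Users/me -> C:\\Users\\me; others unchanged."""
    match = _MSYS_PATH.match(p)
    if not match:
        return p
    drive, rest = match.group(1), match.group(2)
    return f"{drive.upper()}:\\" + rest.replace("/", "\\")
```

Paths that already look native (`C:/Users/me`) and ordinary POSIX paths (`/usr/bin`) pass through untouched because the drive segment must be exactly one letter.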
@@ -781,7 +819,7 @@ def _setup_worktree(repo_root: str = None) -> Optional[Dict[str, str]]:
try:
existing = gitignore.read_text() if gitignore.exists() else ""
if _ignore_entry not in existing.splitlines():
with open(gitignore, "a") as f:
with open(gitignore, "a", encoding="utf-8") as f:
if existing and not existing.endswith("\n"):
f.write("\n")
f.write(f"{_ignore_entry}\n")
@@ -832,10 +870,39 @@ def _setup_worktree(repo_root: str = None) -> Optional[Dict[str, str]]:
dst.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(str(src), str(dst))
elif src.is_dir():
# Symlink directories (faster, saves disk)
# Symlink directories (faster, saves disk). On Windows,
# symlink creation requires Developer Mode or elevation,
# and fails with OSError otherwise — fall back to a
# recursive copy so the worktree is still usable. The
# copy is slower and uses disk, but it doesn't require
# admin and matches the Linux/macOS symlink outcome
# functionally.
if not dst.exists():
dst.parent.mkdir(parents=True, exist_ok=True)
os.symlink(str(src_resolved), str(dst))
try:
os.symlink(str(src_resolved), str(dst))
except (OSError, NotImplementedError) as _sym_err:
if sys.platform == "win32":
logger.info(
".worktreeinclude: symlink failed (%s) — "
"falling back to copytree on Windows.",
_sym_err,
)
try:
shutil.copytree(
str(src_resolved),
str(dst),
symlinks=True,
dirs_exist_ok=False,
)
except Exception as _copy_err:
logger.warning(
".worktreeinclude: copy fallback "
"also failed for %s -> %s: %s",
src, dst, _copy_err,
)
else:
raise
except Exception as e:
logger.debug("Error copying .worktreeinclude entries: %s", e)
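The symlink-then-copy fallback can be sketched as a small helper. This is a simplification under one stated assumption: the illustrative `link_or_copy` falls back on every platform, whereas the diff only falls back on Windows and re-raises elsewhere.

```python
import os
import shutil
import tempfile
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> str:
    """Hypothetical helper: symlink a directory, or copy it if that fails."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    try:
        os.symlink(src, dst, target_is_directory=True)
        return "symlink"
    except (OSError, NotImplementedError):
        # Slower and uses disk, but needs no elevated privileges.
        shutil.copytree(src, dst, symlinks=True)
        return "copy"
```

Returning which strategy was used makes it easy for a caller to log the outcome, as the diff does with `logger.info`.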
@@ -2080,7 +2147,7 @@ def save_config_value(key_path: str, value: any) -> bool:
# Load existing config
if config_path.exists():
with open(config_path, 'r') as f:
with open(config_path, 'r', encoding="utf-8") as f:
config = yaml.safe_load(f) or {}
else:
config = {}
@@ -7991,6 +8058,7 @@ class HermesCLI:
output_tokens = getattr(agent, "session_output_tokens", 0) or 0
cache_read_tokens = getattr(agent, "session_cache_read_tokens", 0) or 0
cache_write_tokens = getattr(agent, "session_cache_write_tokens", 0) or 0
reasoning_tokens = getattr(agent, "session_reasoning_tokens", 0) or 0
prompt = agent.session_prompt_tokens
completion = agent.session_completion_tokens
total = agent.session_total_tokens
@@ -8022,6 +8090,8 @@ class HermesCLI:
print(f" Cache read tokens: {cache_read_tokens:>10,}")
print(f" Cache write tokens: {cache_write_tokens:>10,}")
print(f" Output tokens: {output_tokens:>10,}")
if reasoning_tokens:
print(f" ↳ Reasoning (subset): {reasoning_tokens:>10,}")
print(f" Prompt tokens (total): {prompt:>10,}")
print(f" Completion tokens: {completion:>10,}")
print(f" Total tokens: {total:>10,}")
@@ -9703,7 +9773,7 @@ class HermesCLI:
# Debug: log to file (stdout may be devnull from redirect_stdout)
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a") as _f:
with open(_dbg, "a", encoding="utf-8") as _f:
_f.write(f"{time.strftime('%H:%M:%S')} interrupt fired: msg={str(interrupt_msg)[:60]!r}, "
f"children={len(self.agent._active_children)}, "
f"parent._interrupt={self.agent._interrupt_requested}\n")
@@ -10535,7 +10605,7 @@ class HermesCLI:
# Debug: log to file when message enters interrupt queue
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a") as _f:
with open(_dbg, "a", encoding="utf-8") as _f:
_f.write(f"{time.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
f"agent_running={self._agent_running}\n")
except Exception:
@@ -12339,6 +12409,15 @@ def main(
"""
global _active_worktree
# Force UTF-8 stdio on Windows before any banner/print() runs — the
# Rich console prints Unicode box-drawing characters that would
# UnicodeEncodeError on cp1252. No-op on Linux/macOS.
try:
from hermes_cli.stdio import configure_windows_stdio
configure_windows_stdio()
except Exception:
pass
# Signal to terminal_tool that we're in interactive mode
# This enables interactive sudo password prompts with timeout
os.environ["HERMES_INTERACTIVE"] = "1"
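Neither `hermes_bootstrap` nor `configure_windows_stdio` is shown in this diff, so the following is only a guess at the general technique: reconfiguring the standard streams to UTF-8 on Windows. The name `force_utf8_stdio` is hypothetical; `reconfigure` is the standard `io.TextIOWrapper` method.

```python
import sys


def force_utf8_stdio() -> bool:
    """Switch stdout/stderr to UTF-8 on Windows; do nothing elsewhere."""
    if sys.platform != "win32":
        return False
    for stream in (sys.stdout, sys.stderr):
        # Replaced or wrapped streams may not support reconfigure().
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            reconfigure(encoding="utf-8", errors="replace")
    return True
```

Using `errors="replace"` trades fidelity for robustness: an unencodable character becomes a placeholder instead of raising `UnicodeEncodeError` in the middle of a banner.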
+18 -3
@@ -14,6 +14,7 @@ import contextvars
import json
import logging
import os
import shutil
import subprocess
import sys
@@ -714,7 +715,21 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
argv = ["/bin/bash", str(path)]
# Resolve bash dynamically so Windows (Git Bash) and Linux/macOS
# all work. On native Windows without Git for Windows installed
# shutil.which returns None — fall back to a clear error rather
# than a FileNotFoundError with a confusing "[WinError 2]"
# traceback.
_bash = shutil.which("bash") or (
"/bin/bash" if os.path.isfile("/bin/bash") else None
)
if _bash is None:
return False, (
f"Cannot run .sh/.bash script {path.name!r}: bash not found on PATH. "
"On Windows, install Git for Windows (which ships Git Bash) "
"or rewrite the script as Python (.py)."
)
argv = [_bash, str(path)]
else:
argv = [sys.executable, str(path)]
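The interpreter selection above can be sketched as a standalone function. `build_argv` is an illustrative name, and raising `FileNotFoundError` is an assumption made for brevity; the diff returns a `(False, message)` tuple instead.

```python
import os
import shutil
import sys
from pathlib import Path


def build_argv(script: Path) -> list[str]:
    """Pick bash for .sh/.bash scripts and the current Python otherwise."""
    if script.suffix.lower() in (".sh", ".bash"):
        bash = shutil.which("bash") or (
            "/bin/bash" if os.path.isfile("/bin/bash") else None
        )
        if bash is None:
            raise FileNotFoundError(
                f"bash not found on PATH; cannot run {script.name!r}"
            )
        return [bash, str(script)]
    return [sys.executable, str(script)]
```

Resolving through `shutil.which` first means a Git Bash install on Windows is picked up without hard-coding a POSIX path.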
@@ -1213,7 +1228,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
import yaml
_cfg_path = str(_get_hermes_home() / "config.yaml")
if os.path.exists(_cfg_path):
with open(_cfg_path) as _f:
with open(_cfg_path, encoding="utf-8") as _f:
_cfg = yaml.safe_load(_f) or {}
_cfg = _expand_env_vars(_cfg)
_model_cfg = _cfg.get("model", {})
@@ -1596,7 +1611,7 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
# Cross-platform file locking: fcntl on Unix, msvcrt on Windows
lock_fd = None
try:
lock_fd = open(lock_file, "w")
lock_fd = open(lock_file, "w", encoding="utf-8")
if fcntl:
fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
elif msvcrt:
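A minimal sketch of the cross-platform lock pattern used in `tick`, wrapped in a context manager. The name `exclusive_lock` is illustrative, and locking a single byte with `msvcrt.locking` is an assumption about how the Windows branch might look.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path

try:
    import fcntl  # POSIX only
except ImportError:
    fcntl = None
try:
    import msvcrt  # Windows only
except ImportError:
    msvcrt = None


@contextmanager
def exclusive_lock(lock_file):
    """Hold a non-blocking exclusive lock; raises OSError if already held."""
    fh = open(lock_file, "a+", encoding="utf-8")
    locked = False
    try:
        if fcntl:
            fcntl.flock(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        elif msvcrt:
            fh.seek(0)
            msvcrt.locking(fh.fileno(), msvcrt.LK_NBLCK, 1)
        # With neither module available, this degrades to no locking at all.
        locked = True
        yield fh
    finally:
        if locked:
            if fcntl:
                fcntl.flock(fh.fileno(), fcntl.LOCK_UN)
            elif msvcrt:
                fh.seek(0)
                msvcrt.locking(fh.fileno(), msvcrt.LK_UNLCK, 1)
        fh.close()
```

A second holder fails immediately rather than waiting, which suits a periodic tick: the overlapping run can simply be skipped.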
@@ -365,7 +365,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
os.makedirs(log_dir, exist_ok=True)
run_ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
self._streaming_path = os.path.join(log_dir, f"samples_{run_ts}.jsonl")
self._streaming_file = open(self._streaming_path, "w")
self._streaming_file = open(self._streaming_path, "w", encoding="utf-8")
self._streaming_lock = __import__("threading").Lock()
print(f" Streaming results to: {self._streaming_path}")
@@ -422,7 +422,7 @@ class YCBenchEvalEnv(HermesAgentBaseEnv):
os.makedirs(log_dir, exist_ok=True)
run_ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
self._streaming_path = os.path.join(log_dir, f"samples_{run_ts}.jsonl")
self._streaming_file = open(self._streaming_path, "w")
self._streaming_file = open(self._streaming_path, "w", encoding="utf-8")
self._streaming_lock = threading.Lock()
print(f"\nYC-Bench eval matrix: {len(self.all_eval_items)} runs")
+2 -2
@@ -744,7 +744,7 @@ class TelegramAdapter(BasePlatformAdapter):
return
import yaml as _yaml
with open(config_path, "r") as f:
with open(config_path, "r", encoding="utf-8") as f:
config = _yaml.safe_load(f) or {}
# Navigate to platforms.telegram.extra.dm_topics
@@ -3516,7 +3516,7 @@ class TelegramAdapter(BasePlatformAdapter):
return
import yaml as _yaml
with open(config_path, "r") as f:
with open(config_path, "r", encoding="utf-8") as f:
config = _yaml.safe_load(f) or {}
dm_topics = (
+15 -5
@@ -21,6 +21,7 @@ import logging
import os
import platform
import re
import shutil
import signal
import subprocess
@@ -177,10 +178,15 @@ def check_whatsapp_requirements() -> bool:
WhatsApp requires a Node.js bridge for most implementations.
"""
# Check for Node.js
# Check for Node.js. Resolve via shutil.which so we respect PATHEXT
# (node.exe vs node) and get a meaningful "not installed" signal
# instead of spawning a cmd flash on Windows.
_node = shutil.which("node")
if not _node:
return False
try:
result = subprocess.run(
["node", "--version"],
[_node, "--version"],
capture_output=True,
text=True,
timeout=5
@@ -464,9 +470,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
print(f"[{self.name}] Installing WhatsApp bridge dependencies...")
# Resolve npm path so Windows can execute the .cmd shim.
# shutil.which honours PATHEXT; on POSIX it returns the
# plain executable path.
_npm_bin = shutil.which("npm") or "npm"
try:
install_result = subprocess.run(
["npm", "install", "--silent"],
[_npm_bin, "install", "--silent"],
cwd=str(bridge_dir),
capture_output=True,
text=True,
@@ -516,7 +526,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
# messages are preserved for troubleshooting.
whatsapp_mode = os.getenv("WHATSAPP_MODE", "self-chat")
self._bridge_log = self._session_path.parent / "bridge.log"
bridge_log_fh = open(self._bridge_log, "a")
bridge_log_fh = open(self._bridge_log, "a", encoding="utf-8")
self._bridge_log_fh = bridge_log_fh
# Build bridge subprocess environment.
@@ -1160,7 +1170,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
if file_size > MAX_TEXT_INJECT_BYTES:
print(f"[{self.name}] Skipping text injection for {doc_path} ({file_size} bytes > {MAX_TEXT_INJECT_BYTES})", flush=True)
continue
content = Path(doc_path).read_text(errors="replace")
content = Path(doc_path).read_text(encoding="utf-8", errors="replace")
fname = Path(doc_path).name
# Remove the doc_<hex>_ prefix for display
display_name = fname
+123 -18
@@ -13,6 +13,10 @@ Usage:
python cli.py --gateway
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import asyncio
import dataclasses
import inspect
@@ -2784,6 +2788,48 @@ class GatewayRunner:
return
current_pid = os.getpid()
# On Windows there's no bash/setsid chain — spawn a tiny Python
# watcher directly via sys.executable instead. The watcher polls
# current_pid, waits for our exit, then runs `hermes gateway
# restart` with detach flags so the respawn survives the CLI
# that triggered the /restart command closing its console.
if sys.platform == "win32":
import textwrap
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
cmd_argv = [*hermes_cmd, "gateway", "restart"]
watcher = textwrap.dedent(
"""
import os, subprocess, sys, time
pid = int(sys.argv[1])
cmd = sys.argv[2:]
deadline = time.monotonic() + 120
while time.monotonic() < deadline:
try:
os.kill(pid, 0)
except (ProcessLookupError, PermissionError, OSError):
break
time.sleep(0.2)
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000
subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
creationflags=_CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW,
)
"""
).strip()
subprocess.Popen(
[sys.executable, "-c", watcher, str(current_pid), *cmd_argv],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
**windows_detach_popen_kwargs(),
)
return
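The wait-for-exit polling loop can be sketched on its own. This version is deliberately POSIX-only: Python's documentation states that on Windows `os.kill` with any signal other than `CTRL_C_EVENT` or `CTRL_BREAK_EVENT` terminates the target via `TerminateProcess`, so a signal-0 probe is not a harmless liveness check there. `pid_alive` and `wait_for_exit` are illustrative names.

```python
import os
import subprocess
import sys
import time


def pid_alive(pid: int) -> bool:
    """POSIX-only liveness probe using signal 0."""
    if sys.platform == "win32":
        raise NotImplementedError("signal 0 is not a safe probe on Windows")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pid: int, timeout: float = 120.0, poll: float = 0.2) -> bool:
    """Return True once pid is gone, or False if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return True
        time.sleep(poll)
    return False
```

A caller would start the replacement command only after `wait_for_exit` returns `True`, mirroring the `while kill -0` shell loop used on POSIX.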
cmd = " ".join(shlex.quote(part) for part in hermes_cmd)
shell_cmd = (
f"while kill -0 {current_pid} 2>/dev/null; do sleep 0.2; done; "
@@ -11305,30 +11351,78 @@ class GatewayRunner:
# where systemd-run --user fails due to missing D-Bus session).
# PYTHONUNBUFFERED ensures output is flushed line-by-line so the
# gateway can stream it to the messenger in near-real-time.
hermes_cmd_str = " ".join(shlex.quote(part) for part in hermes_cmd)
update_cmd = (
f"PYTHONUNBUFFERED=1 {hermes_cmd_str} update --gateway"
f" > {shlex.quote(str(output_path))} 2>&1; "
f"status=$?; printf '%s' \"$status\" > {shlex.quote(str(exit_code_path))}"
)
# Spawn `hermes update --gateway` detached so it survives gateway restart.
# --gateway enables file-based IPC for interactive prompts (stash
# restore, config migration) so the gateway can forward them to the
# user instead of silently skipping them.
# Use setsid for portable session detach (works under system services
# where systemd-run --user fails due to missing D-Bus session).
# PYTHONUNBUFFERED ensures output is flushed line-by-line so the
# gateway can stream it to the messenger in near-real-time.
#
# Windows: no bash/setsid chain. Run `hermes update --gateway`
# directly via sys.executable; redirect stdout/stderr to the same
# output files via Popen file handles; write the exit code in a
# follow-up write. A tiny Python watcher would be cleaner but
# we're already inside gateway/run.py's update path which is async,
# so the simplest correct thing is: launch an inline Python helper
# that runs the command and writes both outputs.
try:
setsid_bin = shutil.which("setsid")
if setsid_bin:
# Preferred: setsid creates a new session, fully detached
if sys.platform == "win32":
import textwrap
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
# hermes_cmd is a list of argv parts we can pass directly
# (no shell-quoting needed).
helper = textwrap.dedent(
"""
import os, subprocess, sys
output_path = sys.argv[1]
exit_code_path = sys.argv[2]
cmd = sys.argv[3:]
env = dict(os.environ)
env["PYTHONUNBUFFERED"] = "1"
with open(output_path, "wb") as f:
proc = subprocess.Popen(cmd, stdout=f, stderr=subprocess.STDOUT, env=env)
rc = proc.wait()
with open(exit_code_path, "w") as f:
f.write(str(rc))
"""
).strip()
subprocess.Popen(
[setsid_bin, "bash", "-c", update_cmd],
[
sys.executable, "-c", helper,
str(output_path), str(exit_code_path),
*hermes_cmd, "update", "--gateway",
],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
**windows_detach_popen_kwargs(),
)
else:
# Fallback: start_new_session=True calls os.setsid() in child
subprocess.Popen(
["bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
hermes_cmd_str = " ".join(shlex.quote(part) for part in hermes_cmd)
update_cmd = (
f"PYTHONUNBUFFERED=1 {hermes_cmd_str} update --gateway"
f" > {shlex.quote(str(output_path))} 2>&1; "
f"status=$?; printf '%s' \"$status\" > {shlex.quote(str(exit_code_path))}"
)
setsid_bin = shutil.which("setsid")
if setsid_bin:
# Preferred: setsid creates a new session, fully detached
subprocess.Popen(
[setsid_bin, "bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
else:
# Fallback: start_new_session=True calls os.setsid() in child
subprocess.Popen(
["bash", "-c", update_cmd],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
except Exception as e:
pending_path.unlink(missing_ok=True)
exit_code_path.unlink(missing_ok=True)
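The Windows branch in this hunk hands the work to an inline helper that runs the update command, sends its output to one file, and records the exit code in a second file. A minimal standalone sketch of that pattern follows. The function name `run_and_record` and the file names are illustrative assumptions, not names from the diff.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_record(cmd, output_path, exit_code_path):
    """Run cmd, write stdout+stderr to output_path and the exit code to exit_code_path.

    Hypothetical helper that mirrors the inline script shown in the diff.
    """
    env = dict(os.environ)
    env["PYTHONUNBUFFERED"] = "1"  # flush line-by-line when cmd is a Python program
    with open(output_path, "wb") as out:
        rc = subprocess.Popen(cmd, stdout=out, stderr=subprocess.STDOUT, env=env).wait()
    Path(exit_code_path).write_text(str(rc), encoding="utf-8")
    return rc


# Demo: a child that prints one line and exits with status 3.
workdir = Path(tempfile.mkdtemp())
demo_rc = run_and_record(
    [sys.executable, "-c", "print('updating'); raise SystemExit(3)"],
    workdir / "update.log",
    workdir / "update.exit",
)
```

Writing the exit code only after the output file is closed means a reader that sees the exit-code file can assume the log is complete.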
@@ -15100,7 +15194,10 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
try:
os.kill(existing_pid, 0)
time.sleep(0.5)
-except (ProcessLookupError, PermissionError):
+except (ProcessLookupError, PermissionError, OSError):
# OSError covers Windows' WinError 87 "invalid parameter"
# for an already-gone PID — without this the probe loop
# busy-spins for the full 10s on every --replace start.
break # Process is gone
else:
# Still alive after 10s — force kill
@@ -15385,6 +15482,14 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
def main():
"""CLI entry point for the gateway."""
# Force UTF-8 stdio on Windows — gateway logs and startup banner would
# otherwise UnicodeEncodeError on cp1252 consoles. No-op on POSIX.
try:
from hermes_cli.stdio import configure_windows_stdio
configure_windows_stdio()
except Exception:
pass
import argparse
parser = argparse.ArgumentParser(description="Hermes Gateway - Multi-platform messaging")
+3 -3
@@ -113,7 +113,7 @@ def _get_process_start_time(pid: int) -> Optional[int]:
stat_path = Path(f"/proc/{pid}/stat")
try:
# Field 22 in /proc/<pid>/stat is process start time (clock ticks).
-return int(stat_path.read_text().split()[21])
+return int(stat_path.read_text(encoding="utf-8").split()[21])
except (FileNotFoundError, IndexError, PermissionError, ValueError, OSError):
return None
@@ -197,7 +197,7 @@ def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
if not path.exists():
return None
try:
-raw = path.read_text().strip()
+raw = path.read_text(encoding="utf-8").strip()
except OSError:
return None
if not raw:
@@ -523,7 +523,7 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
try:
_proc_status = Path(f"/proc/{existing_pid}/status")
if _proc_status.exists():
-for _line in _proc_status.read_text().splitlines():
+for _line in _proc_status.read_text(encoding="utf-8").splitlines():
if _line.startswith("State:"):
_state = _line.split()[1]
if _state in ("T", "t"): # stopped or tracing stop
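The first hunk in this file reads field 22 of `/proc/<pid>/stat` with a plain `split()`. As a hedged aside, the sketch below shows a variation that also tolerates process names containing spaces. It is not what the diff does, and `parse_start_time` is a hypothetical name.

```python
def parse_start_time(stat_line: str) -> int:
    """Return field 22 (start time, in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces,
    # so only the text after the last ")" is split on whitespace.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()  # fields[0] is field 3 (state)
    return int(fields[22 - 3])


# Synthetic stat line: fields 4..22 carry their own field number as value.
sample = "42 (my worker) S " + " ".join(str(n) for n in range(4, 23))
start_ticks = parse_start_time(sample)
```

With the sample above, a plain `sample.split()[21]` would land one field early because the space inside the process name shifts every later index.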
+129
@@ -0,0 +1,129 @@
"""Windows UTF-8 bootstrap for Hermes entry points.

Python on Windows has two long-standing text-encoding footguns:

1. ``sys.stdout`` / ``sys.stderr`` are bound to the console code page
   (``cp1252`` on US-locale installs), so ``print("✓")`` crashes with
   ``UnicodeEncodeError: 'charmap' codec can't encode character``.
2. Child processes spawned via ``subprocess`` don't know to use UTF-8
   unless ``PYTHONUTF8`` and/or ``PYTHONIOENCODING`` are set in their
   environment, so any Python subprocess (the execute_code sandbox,
   delegation children, linter subprocesses, etc.) inherits the same
   cp1252 defaults and hits the same UnicodeEncodeError.

This module fixes both on Windows *only*; POSIX is untouched. It
should be imported at the very top of every Hermes entry point
(``hermes``, ``hermes-agent``, ``hermes-acp``, ``python -m gateway.run``,
``batch_runner.py``, ``cron/scheduler.py``) before any other imports
that might do file I/O or print to stdout.

What this module does on Windows:

- Sets ``os.environ["PYTHONUTF8"] = "1"`` (PEP 540 UTF-8 mode) so
  every child process we spawn uses UTF-8 for ``open()`` and stdio.
  The value is applied with ``setdefault``, so an existing user
  setting wins.
- Sets ``os.environ["PYTHONIOENCODING"] = "utf-8"`` as a
  belt-and-suspenders measure; some tools read this instead of, or in
  addition to, ``PYTHONUTF8``.
- Reconfigures ``sys.stdout`` / ``sys.stderr`` / ``sys.stdin`` to UTF-8
  in the current process, using the ``reconfigure()`` API (Python
  3.7+). This fixes ``print("✓")`` in the parent without a re-exec.

What this module does NOT do:

- It does not re-exec Python with ``-X utf8``, so ``open()`` calls in
  the *current* process still default to locale encoding. Those need
  an explicit ``encoding="utf-8"`` at the call site (lint rule
  ``PLW1514``). Ruff is the right tool for that sweep.

What this module does on POSIX:

- Nothing. POSIX systems are already UTF-8 by default in 99% of cases,
  and we don't want to touch ``LANG``/``LC_*`` behavior that users may
  have configured intentionally. If someone hits a C/POSIX locale on
  Linux, they can export ``PYTHONUTF8=1`` themselves; we won't override.

Idempotent: safe to call multiple times. ``_bootstrap_applied`` guards
against double-reconfigure.
"""
from __future__ import annotations
import os
import sys
_IS_WINDOWS = sys.platform == "win32"
_bootstrap_applied = False
def apply_windows_utf8_bootstrap() -> bool:
"""Apply the Windows UTF-8 bootstrap if we're on Windows.
Returns True if bootstrap was applied (i.e. we're on Windows and
haven't already done this), False otherwise. The return value is
advisory; callers normally don't need it, but tests may want to
assert the path was taken.
Idempotent: calls after the first are no-ops.
"""
global _bootstrap_applied
if not _IS_WINDOWS:
return False
if _bootstrap_applied:
return False
# 1. Child processes inherit these and run in UTF-8 mode.
# We use setdefault() rather than overwriting so the user can
# explicitly opt out by setting PYTHONUTF8=0 in their environment
# (or PYTHONIOENCODING=something-else) if they really want to.
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
# 2. Reconfigure the current process's stdio to UTF-8. Needed
# because os.environ changes don't retroactively rebind sys.stdout
# — those were bound at interpreter startup based on the console
# code page. ``reconfigure`` is a TextIOWrapper method since 3.7.
#
# errors="replace" keeps output from raising on text that UTF-8
# cannot encode (lone surrogates, e.g. from surrogateescape'd
# filenames); such characters are written as "?" instead of
# crashing the process. Everything else is emitted as plain UTF-8.
for stream_name in ("stdout", "stderr"):
stream = getattr(sys, stream_name, None)
if stream is None:
continue
reconfigure = getattr(stream, "reconfigure", None)
if reconfigure is None:
# Not a TextIOWrapper (could be redirected to a BytesIO in
# tests, or a non-standard stream in some embedded cases).
# Skip silently — the env-var fix is still in effect for
# child processes, which is the bigger win.
continue
try:
reconfigure(encoding="utf-8", errors="replace")
except (OSError, ValueError):
# Already closed, or someone replaced it with something
# non-reconfigurable. Non-fatal.
pass
# stdin is reconfigured separately with errors="replace" too — input
# from a legacy pipe shouldn't crash the process.
stdin = getattr(sys, "stdin", None)
if stdin is not None:
reconfigure = getattr(stdin, "reconfigure", None)
if reconfigure is not None:
try:
reconfigure(encoding="utf-8", errors="replace")
except (OSError, ValueError):
pass
_bootstrap_applied = True
return True
# Apply on import — entry points just need ``import hermes_bootstrap``
# (or ``from hermes_bootstrap import apply_windows_utf8_bootstrap``) at
# the very top of their module, before importing anything else. The
# import side effect does the right thing.
apply_windows_utf8_bootstrap()
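To illustrate the `reconfigure()` step in isolation, here is a small sketch that switches an in-memory text stream from cp1252 to UTF-8. The `BytesIO`-backed stream is a stand-in assumption for a legacy console stream; the real module operates on `sys.stdout`, `sys.stderr`, and `sys.stdin`.

```python
import io

# Stand-in for a console stream bound to a legacy code page.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp1252", newline="")

# Same call the bootstrap makes on the real stdio streams.
stream.reconfigure(encoding="utf-8", errors="replace")

stream.write("status: ✓\n")  # "✓" has no cp1252 encoding
stream.flush()
written = raw.getvalue()
```

Without the `reconfigure()` call, the `write()` above would raise `UnicodeEncodeError` when the buffered text is encoded.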
+175
@@ -0,0 +1,175 @@
"""Windows subprocess compatibility helpers.

Hermes is developed on Linux / macOS and tested natively on Windows too.
Several common subprocess patterns break silently-or-loudly on Windows:

* ``["npm", "install", ...]``: on Windows ``npm`` is ``npm.cmd``, a batch
  shim. ``subprocess.Popen(["npm", ...])`` fails with WinError 193
  ("not a valid Win32 application") because CreateProcessW can't run a
  ``.cmd`` file without ``shell=True`` or PATHEXT resolution.
* ``start_new_session=True``: on POSIX, this maps to ``os.setsid()`` and
  actually detaches the child. On Windows it's silently ignored; the
  Windows equivalent is ``CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS``
  creationflags, which Python only applies when you pass them explicitly.
* Console-window flashes: every ``subprocess.Popen`` of a ``.exe`` on
  Windows spawns a cmd window briefly unless ``CREATE_NO_WINDOW`` is
  passed. Cosmetic but jarring for background daemons.

This module centralizes the platform-branching logic so the rest of the
codebase doesn't sprinkle ``if sys.platform == "win32":`` everywhere.

**All helpers are safe on non-Windows**: the flag helpers return 0 and
``windows_detach_popen_kwargs`` falls back to ``start_new_session=True``,
so calling them in Linux/macOS code paths is safe by design. That's the
"do no damage on POSIX" guarantee.
"""
from __future__ import annotations
import os
import shutil
import subprocess
import sys
from typing import Optional, Sequence
__all__ = [
"IS_WINDOWS",
"resolve_node_command",
"windows_detach_flags",
"windows_hide_flags",
"windows_detach_popen_kwargs",
]
IS_WINDOWS = sys.platform == "win32"
# -----------------------------------------------------------------------------
# Node ecosystem launcher resolution
# -----------------------------------------------------------------------------
def resolve_node_command(name: str, argv: Sequence[str]) -> list[str]:
"""Resolve a Node-ecosystem command name to an absolute-path argv.

On Windows, commands like ``npm``, ``npx``, ``yarn``, ``pnpm``,
``playwright``, ``prettier`` ship as ``.cmd`` files (batch shims).
``subprocess.Popen(["npm", "install"])`` fails with WinError 193
because CreateProcessW doesn't execute batch files directly.
``shutil.which(name)`` *does* resolve ``.cmd`` via PATHEXT and returns
the fully-qualified path, which CreateProcessW accepts because the
extension tells Windows to route through ``cmd.exe /c``.

On POSIX ``shutil.which`` also returns a fully-qualified path when
found. That's a small change from bare-name resolution (the OS does
its own PATH search) but functionally identical and has the side
benefit of making the argv reproducible in logs.

Behavior when the command is not on PATH:

- On Windows: return the bare name. The caller can still try with
  ``shell=True`` as a last resort, OR the subsequent Popen will
  raise FileNotFoundError with a readable error we want to surface.
- On POSIX: same. Bare ``npm`` on a Linux box without npm installed
  fails the same way it did before this function existed.

Args:
    name: The command name to resolve (e.g. ``npm``, ``npx``, ``node``).
    argv: The remaining arguments. Must NOT include ``name`` itself;
        this function builds the full argv list.

Returns:
    A list suitable for passing to subprocess.Popen/run/call.
"""
resolved = shutil.which(name)
if resolved:
return [resolved, *argv]
return [name, *argv]
# -----------------------------------------------------------------------------
# Detached / hidden process creation
# -----------------------------------------------------------------------------
# Win32 CreationFlags — defined here rather than imported from subprocess
# because CREATE_NO_WINDOW and DETACHED_PROCESS aren't guaranteed to be
# present on stdlib subprocess on older Pythons or non-Windows builds.
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000
def windows_detach_flags() -> int:
"""Return Win32 creationflags that detach a child from the parent
console and process group. 0 on non-Windows.

Pair with ``start_new_session=False`` (the default) when calling
subprocess.Popen. On POSIX, use ``start_new_session=True`` instead,
which maps to ``os.setsid()`` in the child.

Rationale:

- ``CREATE_NEW_PROCESS_GROUP``: the child has its own process group, so
  Ctrl+C in the parent console doesn't propagate.
- ``DETACHED_PROCESS``: the child has no console at all. Necessary for
  background daemons (gateway watchers, update respawners) because
  without it, closing the console kills the child.
- ``CREATE_NO_WINDOW``: suppresses the brief cmd flash that would
  otherwise appear when launching a console app. Redundant with
  DETACHED_PROCESS but explicit for clarity.
"""
if not IS_WINDOWS:
return 0
return _CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW
def windows_hide_flags() -> int:
"""Return Win32 creationflags that merely hide the child's console
window without detaching the child. 0 on non-Windows.
Use for short-lived console apps spawned as part of a larger
operation (``taskkill``, ``where``, version probes) where we want no
flash but also want to collect stdout/exit code synchronously.
The key difference from :func:`windows_detach_flags`: NO
``DETACHED_PROCESS``, so the child still inherits stdio handles and
``capture_output=True`` works. ``DETACHED_PROCESS`` would sever
stdio and break stdout capture.
"""
if not IS_WINDOWS:
return 0
return _CREATE_NO_WINDOW
def windows_detach_popen_kwargs() -> dict:
"""Return a dict of Popen kwargs that detach a child on Windows and
fall back to the POSIX equivalent (``start_new_session=True``) on
Linux/macOS.
Usage pattern:
.. code-block:: python
subprocess.Popen(
argv,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
stdin=subprocess.DEVNULL,
close_fds=True,
**windows_detach_popen_kwargs(),
)
This replaces the unsafe-on-Windows pattern:
.. code-block:: python
subprocess.Popen(..., start_new_session=True)
which silently fails to detach on Windows (the flag is accepted but
has no effect, so the child stays attached to the parent's console
and dies when the console closes).
"""
if IS_WINDOWS:
return {"creationflags": windows_detach_flags()}
return {"start_new_session": True}
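As a self-contained sketch of the platform branch in `windows_detach_popen_kwargs()`, the snippet below takes the platform string as a parameter so both branches can be exercised on any host. The name `detach_kwargs` and its `platform` parameter are assumptions made for illustration; the flag values are the ones listed in the module above.

```python
import subprocess
import sys

# Win32 creation flags, as defined in the module above.
CREATE_NEW_PROCESS_GROUP = 0x00000200
DETACHED_PROCESS = 0x00000008
CREATE_NO_WINDOW = 0x08000000


def detach_kwargs(platform: str = sys.platform) -> dict:
    """Return Popen kwargs that detach a child on the given platform."""
    if platform == "win32":
        flags = CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW
        return {"creationflags": flags}
    return {"start_new_session": True}  # os.setsid() in the child on POSIX


# Usage: spawn a trivial child with the kwargs for the current host.
child = subprocess.Popen(
    [sys.executable, "-c", "pass"],
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
    **detach_kwargs(),
)
child.wait()
```

Returning a kwargs dict rather than a bare flag value lets one call site serve both platforms without its own `if sys.platform == "win32"` branch.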
+3 -3
@@ -573,7 +573,7 @@ def create_quick_snapshot(
"total_size": sum(manifest.values()),
"files": manifest,
}
-with open(snap_dir / "manifest.json", "w") as f:
+with open(snap_dir / "manifest.json", "w", encoding="utf-8") as f:
json.dump(meta, f, indent=2)
# Auto-prune
@@ -599,7 +599,7 @@ def list_quick_snapshots(
manifest_path = d / "manifest.json"
if manifest_path.exists():
try:
-with open(manifest_path) as f:
+with open(manifest_path, encoding="utf-8") as f:
results.append(json.load(f))
except (json.JSONDecodeError, OSError):
results.append({"id": d.name, "file_count": 0, "total_size": 0})
@@ -629,7 +629,7 @@ def restore_quick_snapshot(
if not manifest_path.exists():
return False
-with open(manifest_path) as f:
+with open(manifest_path, encoding="utf-8") as f:
meta = json.load(f)
restored = 0
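The three hunks in this file all add an explicit `encoding="utf-8"` to manifest reads and writes. A minimal round-trip sketch of why that matters for non-ASCII content follows; the manifest fields and the temporary path are made up for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical manifest containing non-ASCII text.
manifest_path = Path(tempfile.mkdtemp()) / "manifest.json"
meta = {"id": "snap-001", "label": "naïve résumé ✓", "file_count": 2}

# Explicit encoding on both sides: the bytes on disk no longer depend
# on the locale encoding of whichever machine wrote or reads the file.
with open(manifest_path, "w", encoding="utf-8") as f:
    json.dump(meta, f, indent=2, ensure_ascii=False)

with open(manifest_path, encoding="utf-8") as f:
    loaded = json.load(f)
```

`json.dump` escapes non-ASCII by default, so the mismatch only bites when `ensure_ascii=False` is used or when the file is edited by hand; being explicit covers both cases.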
+36 -7
@@ -212,7 +212,7 @@ def get_container_exec_info() -> Optional[dict]:
try:
info = {}
-with open(container_mode_file, "r") as f:
+with open(container_mode_file, "r", encoding="utf-8") as f:
for line in f:
line = line.strip()
if "=" in line and not line.startswith("#"):
@@ -297,7 +297,7 @@ def _is_container() -> bool:
return True
# LXC / cgroup-based detection
try:
-with open("/proc/1/cgroup", "r") as f:
+with open("/proc/1/cgroup", "r", encoding="utf-8") as f:
cgroup_content = f.read()
if "docker" in cgroup_content or "lxc" in cgroup_content or "kubepods" in cgroup_content:
return True
@@ -780,6 +780,19 @@ DEFAULT_CONFIG = {
"timeout": 30,
"extra_body": {},
},
# Triage specifier — flesh out a rough one-liner in the Kanban
# Triage column into a concrete spec, then promote it to ``todo``.
# Invoked by ``hermes kanban specify`` (single id or --all). Set a
# cheap, capable model here (gemini-flash works well); the main
# model is overkill for short spec expansion.
"triage_specifier": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 120,
"extra_body": {},
},
# Curator — skill-usage review fork. Timeout is generous because the
# review pass can take several minutes on reasoning models (umbrella
# building over hundreds of candidate skills). "auto" = use main chat
@@ -1864,6 +1877,14 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "tool",
},
"BRAVE_SEARCH_API_KEY": {
"description": "Brave Search API subscription token (free tier: 2,000 queries/mo)",
"prompt": "Brave Search subscription token",
"url": "https://brave.com/search/api/",
"tools": ["web_search"],
"password": True,
"category": "tool",
},
"BROWSERBASE_API_KEY": {
"description": "Browserbase API key for cloud browser (optional — local browser works without this)",
"prompt": "Browserbase API key",
@@ -3431,7 +3452,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if not manifest_file.exists():
continue
try:
-with open(manifest_file) as _mf:
+with open(manifest_file, encoding="utf-8") as _mf:
manifest = yaml.safe_load(_mf) or {}
except Exception:
manifest = {}
@@ -4675,11 +4696,19 @@ def edit_config():
# Find editor
editor = os.getenv('EDITOR') or os.getenv('VISUAL')
if not editor:
-# Try common editors
-for cmd in ['nano', 'vim', 'vi', 'code', 'notepad']:
-import shutil
# Try common editors — order is platform-aware so Windows users
# land on a working editor (notepad) even without Git Bash or nano
# installed. On POSIX, prefer nano/vim over code/notepad because
# it's more likely to be present on headless / server systems.
import shutil
import sys as _sys
if _sys.platform == "win32":
candidates = ['notepad', 'code', 'vim', 'vi', 'nano']
else:
candidates = ['nano', 'vim', 'vi', 'code', 'notepad']
for cmd in candidates:
if shutil.which(cmd):
editor = cmd
break
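The platform-aware editor lookup in the last hunk can be sketched as a small pure function. The name `pick_editor` and its injectable `platform` / `which` parameters are assumptions added so the logic can be exercised without touching the real PATH; the candidate lists are the ones in the diff.

```python
import shutil
import sys


def pick_editor(platform=sys.platform, which=shutil.which):
    """Return the first available editor, preferring notepad on Windows."""
    if platform == "win32":
        candidates = ["notepad", "code", "vim", "vi", "nano"]
    else:
        candidates = ["nano", "vim", "vi", "code", "notepad"]
    for cmd in candidates:
        if which(cmd):
            return cmd
    return None


# Fake lookup for the demo: pretend only vim and notepad are installed.
installed = {"vim", "notepad"}


def fake_which(cmd):
    return cmd if cmd in installed else None


windows_choice = pick_editor("win32", fake_which)
posix_choice = pick_editor("linux", fake_which)
```

The same set of installed editors yields a different choice per platform, which is the point of reordering the candidates.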
+21 -4
@@ -91,6 +91,15 @@ def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
return steps
def _termux_install_all_fallback_notes() -> list[str]:
return [
"Termux install profile: use .[termux-all] for broad compatibility (installer default on Termux).",
"Matrix E2EE extra is excluded on Termux (python-olm currently fails to build).",
"Local faster-whisper extra is excluded on Termux (ctranslate2/av build path unavailable).",
"STT fallback: use Groq Whisper (set GROQ_API_KEY) or OpenAI Whisper (set VOICE_TOOLS_OPENAI_KEY).",
]
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
@@ -589,7 +598,7 @@ def run_doctor(args):
# Detect stale root-level model keys (known bug source — PR #4329)
try:
import yaml
-with open(config_path) as f:
+with open(config_path, encoding="utf-8") as f:
raw_config = yaml.safe_load(f) or {}
stale_root_keys = [k for k in ("provider", "base_url") if k in raw_config and isinstance(raw_config[k], str)]
if stale_root_keys:
@@ -1050,7 +1059,8 @@ def run_doctor(args):
check_warn("Node.js not found", "(optional, needed for browser tools)")
# npm audit for all Node.js packages
-if _safe_which("npm"):
+_npm_bin = _safe_which("npm")
+if _npm_bin:
npm_dirs = [
(PROJECT_ROOT, "Browser tools (agent-browser)"),
(PROJECT_ROOT / "scripts" / "whatsapp-bridge", "WhatsApp bridge"),
@@ -1059,8 +1069,10 @@ def run_doctor(args):
if not (npm_dir / "node_modules").exists():
continue
try:
# Use resolved absolute path so Windows can execute
# npm.cmd (CreateProcessW can't run bare .cmd names).
audit_result = subprocess.run(
-["npm", "audit", "--json"],
+[_npm_bin, "audit", "--json"],
cwd=str(npm_dir),
capture_output=True, text=True, timeout=30,
)
@@ -1084,6 +1096,11 @@ def run_doctor(args):
except Exception:
pass
if _is_termux():
check_info("Termux compatibility fallbacks:")
for note in _termux_install_all_fallback_notes():
check_info(note)
# =========================================================================
# Check: API connectivity
# =========================================================================
@@ -1382,7 +1399,7 @@ def run_doctor(args):
import yaml as _yaml
_mem_cfg_path = HERMES_HOME / "config.yaml"
if _mem_cfg_path.exists():
-with open(_mem_cfg_path) as _f:
+with open(_mem_cfg_path, encoding="utf-8") as _f:
_raw_cfg = _yaml.safe_load(_f) or {}
_active_memory_provider = (_raw_cfg.get("memory") or {}).get("provider", "")
except Exception:
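The `npm audit` change above swaps a bare command name for a resolved path. A hedged sketch of that resolve-or-fall-back step is below. `resolve_command` and its injectable `which` parameter are illustrative names, and the Windows path in the demo is invented.

```python
import shutil


def resolve_command(name, argv, which=shutil.which):
    """Build an argv list, using the resolved path of name when one is found."""
    resolved = which(name)
    return [resolved or name, *argv]


# Found on PATH (simulated with a made-up Windows install location).
found = resolve_command("npm", ["audit", "--json"], which=lambda n: r"C:\nodejs\npm.cmd")

# Not on PATH: the bare name is kept so the later spawn fails with a clear error.
missing = resolve_command("definitely-not-a-real-tool-xyz", ["audit", "--json"])
```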
+49 -8
@@ -232,6 +232,10 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
# Process still exists but we can't signal it. Treat as alive
# so the caller falls back.
pass
except OSError:
# Windows raises OSError (WinError 87 "invalid parameter") for
# a gone PID — treat the same as ProcessLookupError.
return True
_time.sleep(0.5)
# Drain didn't finish in time.
return False
@@ -441,6 +445,25 @@ def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
if old_pid <= 0:
return False
# The watcher is a tiny Python subprocess that polls the old PID and
# respawns the gateway once it's gone. Both legs of the chain need
# platform-appropriate detach semantics:
#
# POSIX — ``start_new_session=True`` (os.setsid in the child) detaches
# from the parent's process group so Ctrl+C in the CLI doesn't
# propagate and the watcher/gateway survive the CLI exiting.
#
# Windows — ``start_new_session`` is silently accepted but does NOT
# detach. The watcher stays attached to the CLI's console and dies
# when the user closes the terminal, leaving ``hermes update`` users
# with no running gateway until they re-invoke ``hermes gateway``
# manually. The Win32 equivalent is the ``CREATE_NEW_PROCESS_GROUP |
# DETACHED_PROCESS | CREATE_NO_WINDOW`` creationflags bundle.
#
# ``windows_detach_popen_kwargs()`` returns the right kwargs for the
# host platform and is a no-op on POSIX (just ``start_new_session=True``).
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
watcher = textwrap.dedent(
"""
import os
@@ -458,22 +481,39 @@ def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
break
except PermissionError:
pass
except OSError:
# Windows: gone PID raises OSError (WinError 87).
break
time.sleep(0.2)
-subprocess.Popen(
-cmd,
-stdout=subprocess.DEVNULL,
-stderr=subprocess.DEVNULL,
-start_new_session=True,
-)
# Platform-appropriate detach for the respawned gateway. On POSIX
# start_new_session=True maps to os.setsid; on Windows we need
# explicit creationflags because start_new_session is a no-op there.
_popen_kwargs = {
"stdout": subprocess.DEVNULL,
"stderr": subprocess.DEVNULL,
}
if sys.platform == "win32":
_CREATE_NEW_PROCESS_GROUP = 0x00000200
_DETACHED_PROCESS = 0x00000008
_CREATE_NO_WINDOW = 0x08000000
_popen_kwargs["creationflags"] = (
_CREATE_NEW_PROCESS_GROUP | _DETACHED_PROCESS | _CREATE_NO_WINDOW
)
else:
_popen_kwargs["start_new_session"] = True
subprocess.Popen(cmd, **_popen_kwargs)
"""
).strip()
try:
# Same platform-aware detach for the watcher process itself — so
# closing the user's terminal doesn't kill the watcher.
subprocess.Popen(
[sys.executable, "-c", watcher, str(old_pid), *_gateway_run_args_for_profile(profile)],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
-start_new_session=True,
+**windows_detach_popen_kwargs(),
)
except OSError:
return False
@@ -935,7 +975,8 @@ def stop_profile_gateway() -> bool:
try:
os.kill(pid, 0)
_time.sleep(0.5)
-except (ProcessLookupError, PermissionError):
+except (ProcessLookupError, PermissionError, OSError):
# OSError covers Windows' WinError 87 for gone PIDs.
break
if get_running_pid() is None:
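Several hunks in this file widen the exception handling around the `os.kill(pid, 0)` probe. The sketch below shows one way to write that probe on POSIX; it is an illustration rather than the project's helper, and `pid_alive` is a hypothetical name. Because `ProcessLookupError` and `PermissionError` are both subclasses of `OSError`, the specific handler has to come first.

```python
import os
import subprocess
import sys


def pid_alive(pid: int) -> bool:
    """Best-effort liveness probe for a PID (POSIX semantics assumed)."""
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except PermissionError:
        return True  # the process exists but belongs to another user
    except OSError:
        return False  # includes ProcessLookupError: no such process
    return True


# Demo: our own PID is alive; a child that has exited and been reaped is not.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
self_alive = pid_alive(os.getpid())
child_alive = pid_alive(child.pid)
```

Note that this sketch treats `PermissionError` as "alive", which matches the drain loop in the first hunk; the `stop_profile_gateway` loop in the last hunk breaks on it instead.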
+1 -1
@@ -205,7 +205,7 @@ def _cmd_test(args) -> None:
if getattr(args, "payload_file", None):
try:
-custom = json.loads(Path(args.payload_file).read_text())
+custom = json.loads(Path(args.payload_file).read_text(encoding="utf-8"))
if isinstance(custom, dict):
payload.update(custom)
else:
+111
@@ -570,6 +570,42 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
)
p_ctx.add_argument("task_id")
# --- specify --- (triage → todo via auxiliary LLM)
p_specify = sub.add_parser(
"specify",
help="Flesh out a triage-column task into a concrete spec "
"(title + body) and promote it to todo. Uses the auxiliary "
"LLM configured under auxiliary.triage_specifier.",
)
p_specify.add_argument(
"task_id",
nargs="?",
default=None,
help="Task id to specify (required unless --all is given)",
)
p_specify.add_argument(
"--all",
dest="all_triage",
action="store_true",
help="Specify every task currently in the triage column",
)
p_specify.add_argument(
"--tenant",
default=None,
help="When used with --all, restrict the sweep to this tenant",
)
p_specify.add_argument(
"--author",
default=None,
help="Author name recorded on the audit comment "
"(default: $HERMES_PROFILE or 'specifier')",
)
p_specify.add_argument(
"--json",
action="store_true",
help="Emit one JSON object per task on stdout",
)
# --- gc ---
p_gc = sub.add_parser(
"gc", help="Garbage-collect archived-task workspaces, old events, and old logs",
@@ -684,6 +720,7 @@ def kanban_command(args: argparse.Namespace) -> int:
"notify-list": _cmd_notify_list,
"notify-unsubscribe": _cmd_notify_unsubscribe,
"context": _cmd_context,
"specify": _cmd_specify,
"gc": _cmd_gc,
}
handler = handlers.get(action)
@@ -1980,6 +2017,80 @@ def _cmd_context(args: argparse.Namespace) -> int:
return 0
def _cmd_specify(args: argparse.Namespace) -> int:
"""Flesh out a triage task (or all of them) via auxiliary LLM,
then promote to todo. Thin wrapper over ``kanban_specify``."""
from hermes_cli import kanban_specify as spec
all_flag = bool(getattr(args, "all_triage", False))
tenant = getattr(args, "tenant", None)
author = getattr(args, "author", None) or _profile_author()
want_json = bool(getattr(args, "json", False))
if args.task_id and all_flag:
print(
"kanban: pass either a task id OR --all, not both",
file=sys.stderr,
)
return 2
if all_flag:
ids = spec.list_triage_ids(tenant=tenant)
if not ids:
msg = (
"No triage tasks"
+ (f" for tenant {tenant!r}" if tenant else "")
+ "."
)
if want_json:
print(json.dumps({"specified": 0, "total": 0}))
else:
print(msg)
return 0
elif args.task_id:
ids = [args.task_id]
else:
print(
"kanban: specify requires a task id or --all",
file=sys.stderr,
)
return 2
ok_count = 0
fail_count = 0
for tid in ids:
outcome = spec.specify_task(tid, author=author)
if outcome.ok:
ok_count += 1
else:
fail_count += 1
if want_json:
print(json.dumps({
"task_id": outcome.task_id,
"ok": outcome.ok,
"reason": outcome.reason,
"new_title": outcome.new_title,
}))
else:
if outcome.ok:
title_suffix = (
f" — retitled: {outcome.new_title!r}"
if outcome.new_title
else ""
)
print(f"Specified {outcome.task_id} → todo{title_suffix}")
else:
print(
f"kanban: specify {outcome.task_id}: {outcome.reason}",
file=sys.stderr,
)
if not all_flag:
return 0 if ok_count == 1 else 1
# --all: succeed if at least one promotion landed; exit 1 only when
# every candidate failed (honest signal for scripts).
return 0 if (ok_count > 0 or not ids) else 1
def _cmd_gc(args: argparse.Namespace) -> int:
"""Remove scratch workspaces of archived tasks, prune old events, and
delete old worker logs."""
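The exit-code rules at the end of `_cmd_specify` can be restated as a small pure function, which makes them easy to check. `specify_exit_code` is a hypothetical name used only for this sketch.

```python
def specify_exit_code(all_flag: bool, ok_count: int, total: int) -> int:
    """Exit status for `kanban specify`, following the rules in _cmd_specify."""
    if not all_flag:
        # Single-task mode: success only if that one task was promoted.
        return 0 if ok_count == 1 else 1
    # --all mode: fail only when there were candidates and none succeeded.
    return 0 if (ok_count > 0 or total == 0) else 1


single_ok = specify_exit_code(False, ok_count=1, total=1)
single_fail = specify_exit_code(False, ok_count=0, total=1)
sweep_partial = specify_exit_code(True, ok_count=2, total=5)
sweep_all_failed = specify_exit_code(True, ok_count=0, total=5)
```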
+111 -14
@@ -2503,6 +2503,91 @@ def unblock_task(conn: sqlite3.Connection, task_id: str) -> bool:
return True
def specify_triage_task(
conn: sqlite3.Connection,
task_id: str,
*,
title: Optional[str] = None,
body: Optional[str] = None,
author: Optional[str] = None,
) -> bool:
"""Flesh out a triage task and promote it to ``todo``.
Atomically updates ``title`` / ``body`` (when provided) and transitions
``status: triage -> todo`` in a single write txn. Returns False when
the task is missing or not in the ``triage`` column; callers should
surface that as "nothing to specify" rather than an error.
``todo`` (not ``ready``) is the correct landing column: ``recompute_ready``
promotes parent-free / parent-done todos to ``ready`` on the next
dispatcher tick, which keeps the normal parent-gating behaviour intact
for specified tasks that happen to have open parents.
``author`` is recorded on an audit comment only when at least one of
``title`` / ``body`` actually changed, which avoids noisy comment spam
for status-only promotions.
"""
if title is not None and not title.strip():
raise ValueError("title cannot be blank")
with write_txn(conn):
existing = conn.execute(
"SELECT title, body FROM tasks WHERE id = ? AND status = 'triage'",
(task_id,),
).fetchone()
if existing is None:
return False
sets: list[str] = ["status = 'todo'"]
params: list[Any] = []
changed_fields: list[str] = []
if title is not None and title.strip() != (existing["title"] or ""):
sets.append("title = ?")
params.append(title.strip())
changed_fields.append("title")
if body is not None and (body or "") != (existing["body"] or ""):
sets.append("body = ?")
params.append(body)
changed_fields.append("body")
params.append(task_id)
cur = conn.execute(
f"UPDATE tasks SET {', '.join(sets)} "
f"WHERE id = ? AND status = 'triage'",
tuple(params),
)
if cur.rowcount != 1:
return False
if changed_fields and author and author.strip():
# Inline INSERT (rather than ``add_comment``) because we're
# already inside this function's write_txn — nested BEGIN
# IMMEDIATE would raise OperationalError. We also skip the
# 'commented' event that ``add_comment`` emits, since the
# 'specified' event below already records the change.
conn.execute(
"INSERT INTO task_comments (task_id, author, body, created_at) "
"VALUES (?, ?, ?, ?)",
(
task_id,
author.strip(),
"Specified — updated "
+ ", ".join(changed_fields)
+ " and promoted to todo.",
int(time.time()),
),
)
_append_event(
conn,
task_id,
"specified",
{"changed_fields": changed_fields} if changed_fields else None,
)
# Outside the write_txn above, so we don't nest BEGIN IMMEDIATE — the
# ready-promotion pass opens its own IMMEDIATE txn. This runs the same
# logic the dispatcher would on its next tick, so a specified task
# with no open parents flips straight to 'ready' here instead of
# idling in 'todo' until the next sweep.
recompute_ready(conn)
return True
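Reviewer sketch of the guarded-UPDATE pattern used above, assuming a minimal in-memory ``tasks`` table. The helper name ``promote_if_in_triage`` and the schema are hypothetical and are not part of ``kanban_db``. The point it illustrates: repeating ``status = 'triage'`` in the ``WHERE`` clause and checking ``rowcount`` makes the promotion safe against a concurrent move out of triage.

```python
import sqlite3


def promote_if_in_triage(conn: sqlite3.Connection, task_id: str, new_title: str) -> bool:
    """Hypothetical helper: promote triage -> todo, True only if one row moved."""
    with conn:  # the connection context manager commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE tasks SET status = 'todo', title = ? "
            "WHERE id = ? AND status = 'triage'",
            (new_title, task_id),
        )
    # rowcount is 0 when the task was missing or had already left triage.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT, status TEXT)")
conn.execute("INSERT INTO tasks VALUES ('t1', 'rough idea', 'triage')")
conn.commit()

first = promote_if_in_triage(conn, "t1", "Tightened title")   # task is in triage
second = promote_if_in_triage(conn, "t1", "Another title")    # task is already todo
row = conn.execute("SELECT title, status FROM tasks WHERE id = 't1'").fetchone()
```

The second call changes nothing, which matches the ``return False`` path in ``specify_triage_task`` when the row is no longer in triage.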
def archive_task(conn: sqlite3.Connection, task_id: str) -> bool:
with write_txn(conn):
cur = conn.execute(
@@ -2750,7 +2835,7 @@ def _pid_alive(pid: Optional[int]) -> bool:
# where we have a cheap, deterministic process-state probe.
if sys.platform == "linux":
try:
with open(f"/proc/{int(pid)}/status", "r") as f:
with open(f"/proc/{int(pid)}/status", "r", encoding="utf-8") as f:
for line in f:
if line.startswith("State:"):
# "State:\tZ (zombie)" → dead
@@ -2826,7 +2911,10 @@ def _terminate_reclaimed_worker(
if _pid_alive(pid):
try:
kill(int(pid), signal.SIGKILL)
# signal.SIGKILL doesn't exist on Windows; fall back to SIGTERM
# (which maps to TerminateProcess via the stdlib shim).
_sigkill = getattr(signal, "SIGKILL", signal.SIGTERM)
kill(int(pid), _sigkill)
info["sigkill"] = True
except (ProcessLookupError, OSError):
return info
@@ -2950,7 +3038,9 @@ def enforce_max_runtime(
time.sleep(0.5)
if _pid_alive(pid):
try:
kill(pid, signal.SIGKILL)
# signal.SIGKILL doesn't exist on Windows.
_sigkill = getattr(signal, "SIGKILL", signal.SIGTERM)
kill(pid, _sigkill)
killed = True
except (ProcessLookupError, OSError):
pass
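A minimal sketch of the fallback used in both hunks above, for readers who want to try it in isolation. The function name ``force_kill_signal`` is hypothetical; the ``getattr`` expression is the one from the diff.

```python
import signal


def force_kill_signal() -> signal.Signals:
    """Return SIGKILL where the platform defines it, otherwise SIGTERM."""
    # The signal module has no SIGKILL attribute on Windows, so a bare
    # signal.SIGKILL reference raises AttributeError there.
    return getattr(signal, "SIGKILL", signal.SIGTERM)


chosen = force_kill_signal()
```

Resolving the signal through ``getattr`` keeps the attribute lookup from failing before the surrounding ``try`` block gets a chance to handle a missing process.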
@@ -3429,17 +3519,24 @@ def dispatch_once(
# cleanly without calling ``kanban_complete`` / ``kanban_block``
# (protocol violation — auto-block) from a real crash (OOM killer,
# SIGKILL, non-zero exit — existing counter behavior).
try:
while True:
try:
_pid, _status = os.waitpid(-1, os.WNOHANG)
except ChildProcessError:
break
if _pid == 0:
break
_record_worker_exit(_pid, _status)
except Exception:
pass
#
# Windows has no zombies / no os.WNOHANG — subprocess.Popen handles
# are freed when the Python object is garbage-collected or .wait() is
# called explicitly. The kanban dispatcher discards the Popen handle
# after spawn (``_default_spawn`` → abandon), so on Windows there's
# nothing to reap here — skip the whole block.
if os.name != "nt":
try:
while True:
try:
_pid, _status = os.waitpid(-1, os.WNOHANG)
except ChildProcessError:
break
if _pid == 0:
break
_record_worker_exit(_pid, _status)
except Exception:
pass
result = DispatchResult()
result.reclaimed = release_stale_claims(conn)
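The reap loop above can be sketched as a standalone function. This is an illustration under assumptions, not the dispatcher's code: ``reap_exited_children`` is a hypothetical name, and it returns the collected pairs instead of calling ``_record_worker_exit``.

```python
import os


def reap_exited_children() -> list[tuple[int, int]]:
    """Collect (pid, raw_status) for children that have already exited."""
    reaped: list[tuple[int, int]] = []
    if os.name == "nt":
        # os.WNOHANG is not available on Windows, so there is nothing to do.
        return reaped
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


exited = reap_exited_children()
```

Because ``WNOHANG`` makes ``waitpid`` return immediately, the loop never blocks the dispatcher tick.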
+265
@@ -0,0 +1,265 @@
"""Kanban triage specifier — flesh out a one-liner into a real spec.
Used by ``hermes kanban specify [task_id | --all]``. Takes a task that
lives in the Triage column (a rough idea, typically only a title), calls
the auxiliary LLM to produce:
* A tightened title (optional; only replaces if the model proposes a
materially different one)
* A concrete body: goal, proposed approach, acceptance criteria
and then flips the task ``triage -> todo`` via
``kanban_db.specify_triage_task``. The dispatcher promotes it to
``ready`` on its next tick (or immediately if there are no open parents).
Design notes
------------
* This module intentionally mirrors ``hermes_cli/goals.py``: same aux
client pattern, same "empty config => skip, don't crash" tolerance.
Keeps the surface area tiny and the failure modes predictable.
* The prompt is a short system + user pair. We ask for JSON with
``{title, body}``; if parsing fails, we fall back to treating the
whole response as the body and leave the title untouched. No
retry loop: one shot, keep cost bounded.
* Structured output / JSON mode is not requested explicitly so the
specifier works on providers that don't implement it. The parse
is lenient (tolerates markdown code fences around the JSON).
"""
from __future__ import annotations
import json
import logging
import os
import re
from dataclasses import dataclass
from typing import Optional
from hermes_cli import kanban_db as kb
logger = logging.getLogger(__name__)
_SYSTEM_PROMPT = """You are the Kanban triage specifier for the Hermes Agent board.
A user dropped a rough idea into the Triage column. Your job is to turn it
into a concrete, actionable task spec that an autonomous worker can pick up
and execute without further clarification.
Output a single JSON object with exactly two keys:
{
"title": "<tightened task title, <= 80 chars, imperative voice>",
"body": "<multi-line spec, see structure below>"
}
The body MUST include these sections, each prefixed with a bold markdown
heading, in this order:
**Goal**: one sentence, user-facing outcome.
**Approach**: 2-5 bullets on how a worker should tackle it.
**Acceptance criteria**: checklist of concrete, verifiable conditions.
**Out of scope**: short list of things NOT to touch (omit if nothing
obvious; never invent scope creep).
Rules:
- Keep the tightened title close in meaning to the original idea; do
NOT invent a different project.
- If the original idea is already detailed, preserve its substance and
just reformat into the sections above.
- Never add invented requirements the user didn't hint at.
- No preamble, no closing remarks, no code fences around the JSON.
- Output only the JSON object and nothing else.
"""
_USER_TEMPLATE = """Task id: {task_id}
Current title: {title}
Current body:
{body}
"""
@dataclass
class SpecifyOutcome:
"""Result of specifying a single triage task."""
task_id: str
ok: bool
reason: str = ""
new_title: Optional[str] = None
def _truncate(text: str, limit: int) -> str:
if len(text) <= limit:
return text
return text[: limit - 1] + "…"
_FENCE_RE = re.compile(r"^\s*```(?:json)?\s*|\s*```\s*$", re.IGNORECASE)
def _extract_json_blob(raw: str) -> Optional[dict]:
"""Lenient JSON extraction — tolerates fenced code blocks and
leading/trailing whitespace. Returns None if nothing parses."""
if not raw:
return None
stripped = _FENCE_RE.sub("", raw.strip())
# Greedy: find the first `{` and last `}` and try that slice.
first = stripped.find("{")
last = stripped.rfind("}")
if first == -1 or last == -1 or last <= first:
return None
candidate = stripped[first : last + 1]
try:
val = json.loads(candidate)
except (ValueError, json.JSONDecodeError):
return None
if not isinstance(val, dict):
return None
return val
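To make the lenient parse concrete, here is a self-contained sketch that mirrors ``_extract_json_blob`` and runs it on a fenced reply and on a chatty reply. The name ``extract_json_object`` and the sample replies are illustrative assumptions; the regex is written with a backtick quantifier but is meant to match the same fences as ``_FENCE_RE``.

```python
import json
import re
from typing import Optional

# Same intent as _FENCE_RE above: strip a leading and a trailing code fence.
FENCE_RE = re.compile(r"^\s*`{3}(?:json)?\s*|\s*`{3}\s*$", re.IGNORECASE)
FENCE = "`" * 3


def extract_json_object(raw: str) -> Optional[dict]:
    """Return the first-'{' to last-'}' slice parsed as a dict, else None."""
    if not raw:
        return None
    stripped = FENCE_RE.sub("", raw.strip())
    first, last = stripped.find("{"), stripped.rfind("}")
    if first == -1 or last <= first:
        return None
    try:
        val = json.loads(stripped[first : last + 1])
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return None
    return val if isinstance(val, dict) else None


fenced = FENCE + 'json\n{"title": "Add retry", "body": "x"}\n' + FENCE
chatty = 'Sure! Here it is: {"title": "Add retry"} Hope that helps.'

from_fenced = extract_json_object(fenced)
from_chatty = extract_json_object(chatty)
from_prose = extract_json_object("no json here")
```

A reply with no braces yields ``None``, which is the case where ``specify_task`` falls back to using the whole reply as the body.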
def _profile_author() -> str:
"""Mirror of ``hermes_cli.kanban._profile_author``. Kept local to
avoid a circular import when kanban.py imports this module."""
return (
os.environ.get("HERMES_PROFILE")
or os.environ.get("USER")
or "specifier"
)
def specify_task(
task_id: str,
*,
author: Optional[str] = None,
timeout: Optional[int] = None,
) -> SpecifyOutcome:
"""Specify a single triage task and promote it to ``todo``.
Returns an outcome describing what happened. Never raises for expected
failure modes (task not in triage, no aux client configured, API
error, malformed response); those surface via ``ok=False`` so the
``--all`` sweep can continue past individual failures.
"""
with kb.connect() as conn:
task = kb.get_task(conn, task_id)
if task is None:
return SpecifyOutcome(task_id, False, "unknown task id")
if task.status != "triage":
return SpecifyOutcome(
task_id, False, f"task is not in triage (status={task.status!r})"
)
try:
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc: # pragma: no cover — import smoke test
logger.debug("specify: auxiliary client import failed: %s", exc)
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
try:
client, model = get_text_auxiliary_client("triage_specifier")
except Exception as exc:
logger.debug("specify: get_text_auxiliary_client failed: %s", exc)
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
if client is None or not model:
return SpecifyOutcome(
task_id, False, "no auxiliary client configured"
)
user_msg = _USER_TEMPLATE.format(
task_id=task.id,
title=_truncate(task.title or "", 400),
body=_truncate(task.body or "(no body)", 4000),
)
try:
resp = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": _SYSTEM_PROMPT},
{"role": "user", "content": user_msg},
],
temperature=0.3,
max_tokens=1500,
timeout=timeout or 120,
)
except Exception as exc:
logger.info(
"specify: API call failed for %s (%s) — skipping",
task_id, exc,
)
return SpecifyOutcome(
task_id, False, f"LLM error: {type(exc).__name__}"
)
try:
raw = resp.choices[0].message.content or ""
except Exception:
raw = ""
parsed = _extract_json_blob(raw)
new_title: Optional[str]
new_body: Optional[str]
if parsed is None:
# Fall back: treat the whole reply as the body, leave title as-is.
# Worst case the user edits afterward — still better than stranding
# the task in triage on a malformed LLM reply.
stripped_raw = raw.strip()
if not stripped_raw:
return SpecifyOutcome(
task_id, False, "LLM returned an empty response"
)
new_title = None
new_body = stripped_raw
else:
title_val = parsed.get("title")
body_val = parsed.get("body")
new_title = (
title_val.strip()
if isinstance(title_val, str) and title_val.strip()
else None
)
new_body = (
body_val if isinstance(body_val, str) and body_val.strip() else None
)
if new_body is None and new_title is None:
return SpecifyOutcome(
task_id, False, "LLM response missing title and body"
)
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
task_id,
title=new_title,
body=new_body,
author=author or _profile_author(),
)
if not ok:
# Race: someone else promoted / archived the task between our
# read above and the write. Report, don't crash.
return SpecifyOutcome(
task_id, False, "task moved out of triage before promotion"
)
return SpecifyOutcome(task_id, True, "specified", new_title=new_title)
def list_triage_ids(*, tenant: Optional[str] = None) -> list[str]:
"""Return task ids currently in the triage column.
``tenant`` narrows the sweep; ``None`` returns every triage task.
"""
with kb.connect() as conn:
tasks = kb.list_tasks(
conn,
status="triage",
tenant=tenant,
include_archived=False,
)
return [t.id for t in tasks]
+65 -11
@@ -43,6 +43,11 @@ Usage:
hermes claw migrate --dry-run # Preview migration without changes
"""
# IMPORTANT: hermes_bootstrap must be the very first import — it sets up
# UTF-8 stdio on Windows so print()/subprocess children don't hit
# UnicodeEncodeError with non-ASCII characters. No-op on POSIX.
import hermes_bootstrap # noqa: F401
import argparse
import json
import os
@@ -230,6 +235,7 @@ except Exception:
pass # best-effort — don't crash if config isn't available yet
import logging
import threading
import time as _time
from datetime import datetime
@@ -6445,6 +6451,45 @@ def _load_installable_optional_extras() -> list[str]:
return referenced
def _run_install_with_heartbeat(
cmd: list[str],
*,
env: dict[str, str] | None = None,
heartbeat_interval_seconds: int = 30,
) -> None:
"""Run dependency install command with periodic heartbeat output.
Some resolvers/build backends (especially when compiling Rust/C extensions)
can stay quiet for minutes. Emit a simple elapsed-time heartbeat so users
know ``hermes update`` is still progressing even if pip/uv itself is silent.
"""
done = threading.Event()
start = _time.time()
def _heartbeat() -> None:
# Wait first, then print, so short installs don't emit noise.
while not done.wait(heartbeat_interval_seconds):
elapsed = int(_time.time() - start)
print(
f" … still installing dependencies ({elapsed}s elapsed)"
" — compiling Rust/C extensions can take several minutes",
flush=True,
)
t = threading.Thread(target=_heartbeat, daemon=True)
t.start()
try:
subprocess.run(
cmd,
cwd=PROJECT_ROOT,
check=True,
env=env,
)
finally:
done.set()
t.join(timeout=0.2)
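The heartbeat pattern above can be tried in isolation with the sketch below. It is an illustration under assumptions: ``run_with_heartbeat`` is a hypothetical name, it returns the number of heartbeats so the behaviour can be checked, and it uses ``time.monotonic()`` rather than the ``time.time()`` call in the diff.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd: list[str], interval: float = 30.0) -> int:
    """Run cmd, printing an elapsed-time line every `interval` seconds."""
    done = threading.Event()
    start = time.monotonic()
    beats = 0

    def heartbeat() -> None:
        nonlocal beats
        # Event.wait returns False on timeout and True once `done` is set,
        # so a command that finishes within one interval prints nothing.
        while not done.wait(interval):
            beats += 1
            print(f"  still running ({int(time.monotonic() - start)}s elapsed)", flush=True)

    t = threading.Thread(target=heartbeat, daemon=True)
    t.start()
    try:
        subprocess.run(cmd, check=True)
    finally:
        done.set()
        t.join(timeout=0.2)
    return beats


quick = run_with_heartbeat([sys.executable, "-c", "pass"], interval=30.0)
slow = run_with_heartbeat(
    [sys.executable, "-c", "import time; time.sleep(0.5)"], interval=0.1
)
```

Waiting on the event instead of sleeping is what lets the ``finally`` block stop the thread promptly once the install command returns.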
def _install_python_dependencies_with_optional_fallback(
install_cmd_prefix: list[str],
*,
@@ -6461,12 +6506,13 @@ def _install_python_dependencies_with_optional_fallback(
Collecting/Building/Installing step), so keeping it visible costs
nothing on fast hardware and prevents the "hermes update hangs" reports
on slow hardware.
We also add periodic heartbeat lines in case the resolver/build backend is
itself silent for long stretches.
"""
try:
subprocess.run(
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "-e", ".[all]"],
cwd=PROJECT_ROOT,
check=True,
env=env,
)
return
@@ -6475,10 +6521,8 @@ def _install_python_dependencies_with_optional_fallback(
" ⚠ Optional extras failed, reinstalling base dependencies and retrying extras individually..."
)
subprocess.run(
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "-e", "."],
cwd=PROJECT_ROOT,
check=True,
env=env,
)
@@ -6486,10 +6530,8 @@ def _install_python_dependencies_with_optional_fallback(
installed_extras: list[str] = []
for extra in _load_installable_optional_extras():
try:
subprocess.run(
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "-e", f".[{extra}]"],
cwd=PROJECT_ROOT,
check=True,
env=env,
)
installed_extras.append(extra)
@@ -7928,10 +7970,15 @@ def _cmd_update_impl(args, gateway_mode: bool):
print(
f"{len(_stuck)} gateway process(es) ignored SIGTERM — force-killing"
)
from gateway.status import terminate_pid as _terminate_pid
for pid in _stuck:
try:
os.kill(pid, _signal.SIGKILL)
except (ProcessLookupError, PermissionError):
# Routes through taskkill /T /F on Windows,
# SIGKILL on POSIX — _signal.SIGKILL doesn't
# exist on Windows so the old raw os.kill call
# used to crash the entire update path.
_terminate_pid(pid, force=True)
except (ProcessLookupError, PermissionError, OSError):
pass
# Give the OS a beat to reap the processes so the
# watchers see them exit and respawn.
@@ -8517,6 +8564,13 @@ def _build_provider_choices() -> list[str]:
def main():
"""Main entry point for hermes CLI."""
# Force UTF-8 stdio on Windows before anything prints. No-op elsewhere.
try:
from hermes_cli.stdio import configure_windows_stdio
configure_windows_stdio()
except Exception:
pass
from hermes_cli._parser import build_top_level_parser
parser, subparsers, chat_parser = build_top_level_parser()
+2 -2
@@ -69,7 +69,7 @@ def _install_dependencies(provider_name: str) -> None:
try:
import yaml
with open(yaml_path) as f:
with open(yaml_path, encoding="utf-8") as f:
meta = yaml.safe_load(f) or {}
except Exception:
return
@@ -377,7 +377,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
if key not in updated_keys:
new_lines.append(f"{key}={val}")
env_path.write_text("\n".join(new_lines) + "\n")
env_path.write_text("\n".join(new_lines) + "\n", encoding="utf-8")
# ---------------------------------------------------------------------------
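A small sketch of why the explicit ``encoding="utf-8"`` arguments in these hunks matter. The file name and contents are made up for illustration; reading with ``cp1252`` stands in for what a bare ``read_text()`` can do on a Windows machine whose locale encoding is cp1252.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.env"
    # Write with an explicit encoding so the bytes on disk are predictable.
    path.write_text("NAME=café\n", encoding="utf-8")
    as_utf8 = path.read_text(encoding="utf-8")
    # Decoding the same bytes as cp1252 silently produces mojibake.
    as_cp1252 = path.read_text(encoding="cp1252")
```

No exception is raised in the mismatched read, which is why the corruption tends to go unnoticed until the value is used.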
+2 -2
@@ -173,7 +173,7 @@ def _read_disk_cache() -> tuple[dict[str, Any] | None, float]:
except (OSError, FileNotFoundError):
return (None, 0.0)
try:
with open(path) as fh:
with open(path, encoding="utf-8") as fh:
data = json.load(fh)
except (OSError, json.JSONDecodeError):
return (None, 0.0)
@@ -187,7 +187,7 @@ def _write_disk_cache(data: dict[str, Any]) -> None:
try:
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(path.suffix + ".tmp")
with open(tmp, "w") as fh:
with open(tmp, "w", encoding="utf-8") as fh:
json.dump(data, fh, indent=2)
fh.write("\n")
atomic_replace(tmp, path)
+1 -1
@@ -174,7 +174,7 @@ def run_oneshot(
# Redirect stderr AND stdout to devnull for the entire call tree.
# We'll print the final response to the real stdout at the end.
real_stdout = sys.stdout
devnull = open(os.devnull, "w")
devnull = open(os.devnull, "w", encoding="utf-8")
try:
with redirect_stdout(devnull), redirect_stderr(devnull):
+1 -1
@@ -870,7 +870,7 @@ class PluginManager:
if yaml is None:
logger.warning("PyYAML not installed; cannot load %s", manifest_file)
return None
data = yaml.safe_load(manifest_file.read_text()) or {}
data = yaml.safe_load(manifest_file.read_text(encoding="utf-8")) or {}
name = data.get("name", plugin_dir.name)
key = f"{prefix}/{plugin_dir.name}" if prefix else name
+2 -2
@@ -127,7 +127,7 @@ def _read_manifest(plugin_dir: Path) -> dict:
try:
import yaml
with open(manifest_file) as f:
with open(manifest_file, encoding="utf-8") as f:
return yaml.safe_load(f) or {}
except Exception as e:
logger.warning("Failed to read plugin.yaml in %s: %s", plugin_dir, e)
@@ -703,7 +703,7 @@ def _discover_all_plugins() -> list:
description = ""
if yaml:
try:
with open(manifest_file) as f:
with open(manifest_file, encoding="utf-8") as f:
manifest = yaml.safe_load(f) or {}
name = manifest.get("name", d.name)
version = manifest.get("version", "")
+13 -6
@@ -354,7 +354,7 @@ def _read_config_model(profile_dir: Path) -> tuple:
return None, None
try:
import yaml
with open(config_path, "r") as f:
with open(config_path, "r", encoding="utf-8") as f:
cfg = yaml.safe_load(f) or {}
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
@@ -758,7 +758,6 @@ def _cleanup_gateway_service(name: str, profile_dir: Path) -> None:
def _stop_gateway_process(profile_dir: Path) -> None:
"""Stop a running gateway process via its PID file."""
import signal as _signal
import time as _time
pid_file = profile_dir / "gateway.pid"
@@ -769,19 +768,27 @@ def _stop_gateway_process(profile_dir: Path) -> None:
raw = pid_file.read_text().strip()
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, _signal.SIGTERM)
# Route through terminate_pid so Windows uses the appropriate
# primitive (taskkill / TerminateProcess). Raw os.kill with
# _signal.SIGKILL raises AttributeError on Windows, where the signal
# module does not define SIGKILL, and raw os.kill with SIGTERM doesn't
# cascade to child processes the same way taskkill /T does.
from gateway.status import terminate_pid as _terminate_pid
_terminate_pid(pid) # graceful first
# Wait up to 10s for graceful shutdown
for _ in range(20):
_time.sleep(0.5)
try:
os.kill(pid, 0)
except ProcessLookupError:
except (ProcessLookupError, OSError):
# OSError covers Windows' WinError 87 "invalid parameter"
# returned for an invalid/gone PID probe.
print(f"✓ Gateway stopped (PID {pid})")
return
# Force kill
try:
os.kill(pid, _signal.SIGKILL)
except ProcessLookupError:
_terminate_pid(pid, force=True)
except (ProcessLookupError, OSError):
pass
print(f"✓ Gateway force-stopped (PID {pid})")
except (ProcessLookupError, PermissionError):
+8 -5
@@ -7,11 +7,14 @@ keystrokes can be fed back in. The only caller today is the
Design constraints:
* **POSIX-only.** Hermes Agent supports Windows exclusively via WSL, which
exposes a native POSIX PTY via ``openpty(3)``. Native Windows Python
has no PTY; :class:`PtyUnavailableError` is raised with a user-readable
install/platform message so the dashboard can render a banner instead of
crashing.
* **POSIX-only.** This module depends on ``fcntl``, ``termios``, and
``ptyprocess``, none of which exist on native Windows Python. Native
Windows ConPTY is a different API (Windows 10 build 17763+) and would
need a separate Windows implementation (``pywinpty``) that's tracked
as a future enhancement. On native Windows, importing this module
raises :class:`ImportError` and the dashboard's ``/chat`` tab shows a
WSL-recommended banner instead of crashing. Every other feature in the
dashboard (sessions, jobs, metrics, config editor) works natively.
* **Zero Node dependency on the server side.** We use :mod:`ptyprocess`,
which is a pure-Python wrapper around the OS calls. The browser talks
to the same ``hermes --tui`` binary it would launch from the CLI, so
+60 -4
@@ -84,18 +84,34 @@ def resolve_hermes_bin() -> Optional[str]:
1. ``sys.argv[0]`` if it resolves to a real executable.
2. ``shutil.which("hermes")`` on PATH.
3. ``None``: caller should fall back to ``python -m hermes_cli.main``.
Windows note: ``os.access(path, os.X_OK)`` returns True for ``.py`` and
``.pyc`` files on Windows, where the check does not reliably distinguish
real executables from other existing files. But
``subprocess.run([script.py, ...])`` can't actually execute a .py
directly: CreateProcessW needs a real .exe, not a script associated
with the Python launcher. On Windows we therefore skip the argv[0]
fast-path when it points at a .py file and fall through to either
``hermes.exe`` on PATH or the ``sys.executable -m hermes_cli.main``
fallback.
"""
argv0 = sys.argv[0]
_is_windows = sys.platform == "win32"
def _is_python_script(p: str) -> bool:
return p.lower().endswith((".py", ".pyc"))
# Absolute path to an executable (covers nix store, venv wrappers, etc.)
if os.path.isabs(argv0) and os.path.isfile(argv0) and os.access(argv0, os.X_OK):
return argv0
if not (_is_windows and _is_python_script(argv0)):
return argv0
# Relative path — resolve against CWD
if not argv0.startswith("-") and os.path.isfile(argv0):
abs_path = os.path.abspath(argv0)
if os.access(abs_path, os.X_OK):
return abs_path
if not (_is_windows and _is_python_script(abs_path)):
return abs_path
# PATH lookup
path_bin = shutil.which("hermes")
@@ -142,8 +158,48 @@ def relaunch(
preserve_inherited: bool = True,
original_argv: Optional[Sequence[str]] = None,
) -> None:
"""Replace the current process with a fresh hermes invocation."""
"""Replace the current process with a fresh hermes invocation.
On POSIX we use ``os.execvp`` which replaces the running process with
the new one in place: same PID, no double-fork. That's what the
relaunch contract wants: "run hermes again as if the user had typed
the new argv".
Windows has no native exec semantics: ``os.execvp`` on Windows
*emulates* exec by spawning the child and exiting the parent, but
only works when the target is a real Win32 executable. Our target
is usually ``hermes.exe`` (a Python console-script shim that wraps
``python -m hermes_cli.main``) or a ``.cmd`` batch file, and both
raise ``OSError(8, "Exec format error")`` on Windows' execvp.
The Windows-correct pattern is: spawn the child with ``subprocess.run``
(which, with the default ``shell=False``, passes the command line to
``CreateProcess`` instead of going through a shell),
wait for it to exit, then propagate its exit code via ``sys.exit``.
That's functionally equivalent (the user sees "hermes exited, then
new hermes started"), just with two PIDs in play instead of one.
"""
new_argv = build_relaunch_argv(
extra_args, preserve_inherited=preserve_inherited, original_argv=original_argv
)
os.execvp(new_argv[0], new_argv)
if sys.platform == "win32":
# Windows: subprocess + exit, because execvp can't swap to .cmd/.exe shims.
import subprocess
try:
result = subprocess.run(new_argv)
sys.exit(result.returncode)
except KeyboardInterrupt:
sys.exit(130)
except OSError as exc:
# Surface a helpful error rather than the raw OSError — the
# caller used to see ``[Errno 8] Exec format error`` which is
# cryptic. Common causes: ``hermes`` not on PATH yet (install
# hasn't propagated User PATH into this shell) or a stale shim.
print(
f"\nHermes relaunch failed: {exc}\n"
f"Command: {' '.join(new_argv)}\n"
f"Fix: open a new terminal so PATH picks up, then re-run hermes.",
file=sys.stderr,
)
sys.exit(1)
else:
os.execvp(new_argv[0], new_argv)
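A minimal sketch of the Windows branch's "spawn, wait, propagate" idea, written so it can be exercised on any platform. ``spawn_and_get_exit_code`` is a hypothetical helper: it returns the code where ``relaunch`` would pass it to ``sys.exit``.

```python
import subprocess
import sys


def spawn_and_get_exit_code(argv: list[str]) -> int:
    """Run argv as a child process and return the code the parent should exit with."""
    try:
        return subprocess.run(argv).returncode
    except KeyboardInterrupt:
        # 130 is the conventional shell exit status for SIGINT (128 + 2).
        return 130


code = spawn_and_get_exit_code([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Passing the child's return code through unchanged is what keeps callers (shells, service managers) seeing the same status they would have seen from a true exec.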
+2 -2
@@ -1257,7 +1257,7 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
sys.stdout.write(payload)
else:
out = Path(output_path)
out.write_text(payload)
out.write_text(payload, encoding="utf-8")
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
@@ -1274,7 +1274,7 @@ def do_snapshot_import(input_path: str, force: bool = False,
return
try:
snapshot = json.loads(inp.read_text())
snapshot = json.loads(inp.read_text(encoding="utf-8"))
except json.JSONDecodeError:
c.print(f"[bold red]Error:[/] Invalid JSON in {inp}\n")
return
+252
@@ -0,0 +1,252 @@
"""Windows-safe stdio configuration.
On Windows, Python's ``sys.stdout``/``sys.stderr`` default to the console's
active code page (often ``cp1252``, sometimes ``cp437``, occasionally ``cp932``
on Japanese locales, etc.). Hermes's banners, tool output feed, and slash
command listings all contain Unicode: box-drawing characters,
mathematical and geometric symbols, and user-supplied
text in any language. Printing those to a cp1252 console raises
``UnicodeEncodeError: 'charmap' codec can't encode character…`` and kills the
whole CLI before the REPL even opens.
The fix is to force UTF-8 on the Python side and also flip the console's
code page to UTF-8 (65001). Both matter: Python-level only helps when
Python's stdout is a real TTY; code-page flipping lets subprocesses and
child Python ``print()`` calls agree on encoding.
This module is a no-op on every non-Windows platform, and idempotent.
Entry points (``cli.py`` ``main``, ``hermes_cli/main.py`` CLI dispatch,
``gateway/run.py`` startup) call :func:`configure_windows_stdio` exactly
once early in startup.
Patterns cribbed from Claude Code (``src/utils/platform.ts``), OpenCode
(``packages/opencode/src/pty/index.ts`` env injection), and OpenAI Codex
(``codex-rs/core/src/unified_exec/process_manager.rs``). None of those
actually flip the console code page; they rely on their runtime (Node or
Rust) writing UTF-16 to the Win32 console API and letting the terminal
sort it out. Python doesn't get that luxury.
"""
from __future__ import annotations
import os
import sys
__all__ = ["configure_windows_stdio", "is_windows"]
_CONFIGURED = False
def is_windows() -> bool:
"""Return True iff running on native Windows (not WSL)."""
return sys.platform == "win32"
def _flip_console_code_page_to_utf8() -> None:
"""Set the attached console's input and output code pages to UTF-8.
Uses ``SetConsoleCP`` / ``SetConsoleOutputCP`` via ``ctypes``. Failure
is silent: if there's no attached console (e.g. Hermes is running
behind a redirected stdout, under a service, or inside a PTY-less CI
runner), these calls simply return 0 and we move on.
CP_UTF8 is 65001.
"""
try:
import ctypes
kernel32 = ctypes.windll.kernel32 # type: ignore[attr-defined]
# Best-effort; if there's no console attached these just fail silently.
kernel32.SetConsoleCP(65001)
kernel32.SetConsoleOutputCP(65001)
except Exception:
# ctypes import, missing kernel32, or non-Windows — any failure here
# is non-fatal. We've still reconfigured Python's own streams below.
pass
def _reconfigure_stream(stream, *, encoding: str = "utf-8", errors: str = "replace") -> None:
"""Reconfigure a text stream to UTF-8 in place.
Uses ``TextIOWrapper.reconfigure`` (Python 3.7+). If the stream isn't
a ``TextIOWrapper`` (e.g. it's been redirected to an ``io.StringIO``
during tests), we skip rather than blow up.
"""
try:
reconfigure = getattr(stream, "reconfigure", None)
if reconfigure is None:
return
reconfigure(encoding=encoding, errors=errors)
except Exception:
pass
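The effect of ``_reconfigure_stream`` can be seen on an in-memory stream. This sketch assumes a ``TextIOWrapper`` over ``BytesIO`` as a stand-in for a cp1252 console stream; the variable names are illustrative.

```python
import io

raw = io.BytesIO()
# Stand-in for a console stream that starts out with a legacy code page.
stream = io.TextIOWrapper(raw, encoding="cp1252", errors="strict")

# Same call the helper makes: switch the existing wrapper to UTF-8 in place.
stream.reconfigure(encoding="utf-8", errors="replace")
stream.write("─ ok")  # U+2500 is not encodable in cp1252
stream.flush()
encoded = raw.getvalue()

# StringIO has no reconfigure(), which is why the helper probes with getattr.
stringio_has_reconfigure = hasattr(io.StringIO(), "reconfigure")
```

Without the ``reconfigure`` call, the ``write`` would raise ``UnicodeEncodeError`` under ``errors="strict"``, which is the crash described in the module docstring.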
def configure_windows_stdio() -> bool:
"""Force UTF-8 stdio on Windows. No-op elsewhere.
Idempotent: safe to call multiple times from different entry points.
Returns ``True`` if anything was actually changed, ``False`` on
non-Windows or on a repeat call.
Set ``HERMES_DISABLE_WINDOWS_UTF8=1`` in the environment to opt out
(for diagnosing encoding-related bugs by forcing the old cp1252 path).
Also sets a sensible default ``EDITOR`` on Windows if none is already
set; see :func:`_default_windows_editor`.
"""
global _CONFIGURED
if _CONFIGURED:
return False
if not is_windows():
# Mark configured so repeated calls on POSIX are true no-ops.
_CONFIGURED = True
return False
if os.environ.get("HERMES_DISABLE_WINDOWS_UTF8") in ("1", "true", "True", "yes"):
_CONFIGURED = True
return False
# Encourage every child Python process spawned by the agent to also use
# UTF-8 for its stdio. PYTHONIOENCODING wins over the locale-based
# default in subprocesses. Don't override an explicit user setting.
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
# PYTHONUTF8 = 1 enables UTF-8 Mode globally for any Python subprocess
# (PEP 540). Again, don't override an explicit setting.
os.environ.setdefault("PYTHONUTF8", "1")
# Set EDITOR to a working Windows default if neither EDITOR nor VISUAL
# is set. prompt_toolkit's ``open_in_editor`` falls back to POSIX-only
# paths (``/usr/bin/nano``, ``/usr/bin/vi``) that don't exist on
# Windows — Ctrl+X Ctrl+E and ``/edit`` silently do nothing there
# otherwise. This happens even with full Git for Windows installed,
# so it's not a MinGit-specific issue.
_default_editor = _default_windows_editor()
if _default_editor and not os.environ.get("EDITOR") and not os.environ.get("VISUAL"):
os.environ["EDITOR"] = _default_editor
# Augment PATH with the Hermes-managed Git install directories so
# subprocess calls (bash, rg, grep, etc.) resolve even in sessions
# that started before the User PATH broadcast reached them. When
# install.ps1 adds these to User PATH via SetEnvironmentVariable,
# already-running shells don't see the change — which means hermes
# launched from the install session won't find rg / bash / grep
# even though they're "installed". Prepending the known paths here
# closes that gap. No-op when the paths don't exist (e.g. system-Git
# install without Hermes-managed PortableGit).
_augment_path_with_known_tools()
# Flip the console code page first so that any subprocess that
# inherits the console (e.g. a launched shell) also sees CP_UTF8.
_flip_console_code_page_to_utf8()
# Reconfigure Python's own stdio wrappers so ``print()`` calls from
# this process round-trip emoji / box-drawing / non-Latin text.
# ``errors="replace"`` means a genuinely unencodable character
# gets a ``?`` rather than crashing the interpreter; we prefer
# degraded output over a stack trace.
_reconfigure_stream(sys.stdout)
_reconfigure_stream(sys.stderr)
# stdin is re-configured for completeness; Hermes's interactive
# input path uses prompt_toolkit which manages its own encoding,
# but batch/pipe input benefits from UTF-8 decoding on stdin too.
_reconfigure_stream(sys.stdin)
_CONFIGURED = True
return True
def _default_windows_editor() -> str:
"""Return a Windows-appropriate default for ``$EDITOR``.
Priority order, first match wins:
1. ``notepad``: ships with every Windows install, no deps, works as a
blocking editor (``subprocess.call(["notepad", file])`` blocks until
the user closes the window). This is the "always-works" default.
The prompt_toolkit buffer's ``open_in_editor`` and Hermes's
``hermes config edit`` both honour ``$EDITOR``. Users who prefer a
different editor can override:
- VSCode: ``$env:EDITOR = "code --wait"`` (``--wait`` is critical;
without it the editor returns immediately and any input is lost)
- Notepad++: ``$env:EDITOR = "'C:\\Program Files\\Notepad++\\notepad++.exe' -multiInst -nosession"``
- Neovim: ``$env:EDITOR = "nvim"`` (if installed)
Set this before launching Hermes (User env var in Windows Settings, or
export in a PowerShell profile) and Hermes picks it up automatically.
"""
import shutil
# notepad.exe is always in %SystemRoot%\System32 on Windows, so shutil.which
# will reliably find it. Return the bare name so prompt_toolkit's shlex
# split doesn't trip over a path containing spaces.
if shutil.which("notepad"):
return "notepad"
# On the extreme off-chance notepad is missing (WinPE, Nano Server), fall
# back to nothing and let prompt_toolkit's silent no-op do its thing.
return ""
def _augment_path_with_known_tools() -> None:
"""Prepend well-known Hermes-managed tool directories to os.environ['PATH'].
Fixes the "User PATH was just updated but my process can't see it" gap on
Windows. When install.ps1 runs, it adds entries like
``%LOCALAPPDATA%\\hermes\\git\\bin`` to the User PATH via
``SetEnvironmentVariable(..., "User")``. That write propagates to newly
*spawned* processes only; already-running shells (including the one the
user invokes ``hermes`` from right after install) retain their old PATH.
Any subprocess Hermes spawns (bash, ``rg``, ``grep``, ``npm``) inherits
that stale PATH and reports commands as missing even though they're on
disk. Symptom: ``search_files`` reports "rg/find not available" when
the user clearly just installed ripgrep.
Patch-up strategy: add the known Hermes-managed tool directories to our
PATH at startup so subprocess calls resolve correctly. No-op on POSIX
and when the directories don't exist. The User PATH broadcast still
happens in the background for future shells; this just smooths over
the first-launch gap.
"""
if not is_windows():
return
import shutil as _shutil
local_appdata = os.environ.get("LOCALAPPDATA", "")
if not local_appdata:
return
# Known tool dirs installed by scripts/install.ps1. Kept in sync with
# the PATH entries that installer adds to User scope — the two lists
# should match so this prefill fully mirrors what a fresh shell would
# see on next launch.
candidate_dirs = [
os.path.join(local_appdata, "hermes", "git", "cmd"),
os.path.join(local_appdata, "hermes", "git", "bin"),
os.path.join(local_appdata, "hermes", "git", "usr", "bin"),
# Hermes venv Scripts directory — host of the hermes.exe shim itself,
# also where any pip-installed console scripts land. Usually already
# on PATH when the user invokes hermes, but harmless to include.
os.path.join(local_appdata, "hermes", "hermes-agent", "venv", "Scripts"),
# WinGet packages directory — where ``winget install`` drops CLI
# shims by default (ripgrep lands here as rg.exe). Covers the case
# of a system-Git install + ripgrep-via-winget that isn't yet on
# the spawning shell's PATH.
os.path.join(local_appdata, "Microsoft", "WinGet", "Links"),
]
existing = os.environ.get("PATH", "")
existing_lower = {p.lower() for p in existing.split(os.pathsep) if p}
prepend = []
for d in candidate_dirs:
if os.path.isdir(d) and d.lower() not in existing_lower:
prepend.append(d)
if prepend:
os.environ["PATH"] = os.pathsep.join([*prepend, existing])
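The prepend-and-dedupe step above can be sketched as a pure function, which makes it easy to check without touching the real environment. This is a minimal sketch, not the project's code: `prepend_missing_dirs` is a hypothetical helper name, and unlike the hunk above it also drops empty PATH entries and dedupes repeated candidates.

```python
import os


def prepend_missing_dirs(path_value: str, candidates: list) -> str:
    """Return path_value with existing, not-yet-listed candidate dirs prepended.

    Hypothetical helper for illustration only.
    """
    entries = [p for p in path_value.split(os.pathsep) if p]
    # Compare case-insensitively, mirroring the lower-casing in the hunk above.
    seen = {p.lower() for p in entries}
    prepend = []
    for d in candidates:
        if os.path.isdir(d) and d.lower() not in seen:
            prepend.append(d)
            seen.add(d.lower())
    return os.pathsep.join([*prepend, *entries])
```

Keeping the logic separate from `os.environ` means the caller decides when to assign the result back to `os.environ["PATH"]`.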
+52 -3
@@ -308,6 +308,23 @@ TOOL_CATEGORIES = {
{"key": "SEARXNG_URL", "prompt": "Your SearXNG instance URL (e.g., http://localhost:8080)", "url": "https://searxng.github.io/searxng/"},
],
},
{
"name": "Brave Search (Free Tier)",
"badge": "free tier · search only",
"tag": "2,000 queries/mo free — search only (pair with any extract provider)",
"web_backend": "brave-free",
"env_vars": [
{"key": "BRAVE_SEARCH_API_KEY", "prompt": "Brave Search subscription token", "url": "https://brave.com/search/api/"},
],
},
{
"name": "DuckDuckGo (ddgs)",
"badge": "free · no key · search only",
"tag": "Search via the ddgs Python package — no API key (pair with any extract provider)",
"web_backend": "ddgs",
"env_vars": [],
"post_setup": "ddgs",
},
],
},
"image_gen": {
@@ -492,8 +509,12 @@ def _run_post_setup(post_setup_key: str):
if not node_modules.exists() and npm_bin:
_print_info(" Installing Node.js dependencies for browser tools...")
import subprocess
# Use the resolved npm_bin absolute path so subprocess.Popen can
# execute npm.cmd on Windows (CreateProcessW otherwise rejects
# batch shims). On POSIX npm_bin is the plain path — same
# behaviour as before.
result = subprocess.run(
["npm", "install", "--silent"],
[npm_bin, "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
@@ -592,11 +613,13 @@ def _run_post_setup(post_setup_key: str):
elif post_setup_key == "camofox":
camofox_dir = PROJECT_ROOT / "node_modules" / "@askjo" / "camofox-browser"
if not camofox_dir.exists() and shutil.which("npm"):
_npm_bin = shutil.which("npm")
if not camofox_dir.exists() and _npm_bin:
_print_info(" Installing Camofox browser server...")
import subprocess
# Absolute npm path so .cmd shim executes on Windows.
result = subprocess.run(
["npm", "install", "--silent"],
[_npm_bin, "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
@@ -669,6 +692,32 @@ def _run_post_setup(post_setup_key: str):
_print_info(" Full voice list: https://github.com/OHF-Voice/piper1-gpl/blob/main/docs/VOICES.md")
_print_info(" Switch voices by setting tts.piper.voice in ~/.hermes/config.yaml")
elif post_setup_key == "ddgs":
try:
__import__("ddgs")
_print_success(" ddgs is already installed")
except ImportError:
import subprocess
_print_info(" Installing ddgs (DuckDuckGo search package)...")
try:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", "-U", "ddgs", "--quiet"],
capture_output=True, text=True, timeout=300,
)
if result.returncode == 0:
_print_success(" ddgs installed")
else:
_print_warning(" ddgs install failed:")
_print_info(f" {result.stderr.strip()[:300]}")
_print_info(" Run manually: python -m pip install -U ddgs")
return
except subprocess.TimeoutExpired:
_print_warning(" ddgs install timed out (>5min)")
_print_info(" Run manually: python -m pip install -U ddgs")
return
_print_info(" No API key required. DuckDuckGo enforces server-side rate limits.")
_print_info(" Pair with an extract provider if you also need web_extract.")
elif post_setup_key == "spotify":
# Run the full `hermes auth spotify` flow — if the user has no
# client_id yet, this drops them into the interactive wizard
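The npm changes above share one idea: resolve the tool with `shutil.which` once, then hand the resolved path to `subprocess.run` instead of the bare name. A minimal sketch of that pattern follows; `run_resolved` is a hypothetical helper, not a function from this codebase.

```python
import shutil
import subprocess
import sys
from typing import List, Optional


def run_resolved(tool: str, args: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run `tool` by its resolved path, or return None when it is not found.

    Hypothetical helper for illustration only.
    """
    resolved = shutil.which(tool)
    if resolved is None:
        return None
    # Pass the resolved path rather than the bare name, as the hunks above do.
    return subprocess.run([resolved, *args], capture_output=True, text=True)


# Usage: sys.executable stands in for npm so the sketch runs anywhere.
result = run_resolved(sys.executable, ["-c", "print('ok')"])
```

Returning `None` for a missing tool lets the caller skip the install step, which matches the `if ... and npm_bin:` guards in the diff.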
+27 -2
@@ -692,7 +692,7 @@ def _tail_lines(path: Path, n: int) -> List[str]:
if not path.exists():
return []
try:
text = path.read_text(errors="replace")
text = path.read_text(encoding="utf-8", errors="replace")
except OSError:
return []
lines = text.splitlines()
@@ -2979,7 +2979,20 @@ async def get_models_analytics(days: int = 30):
import re
import asyncio
from hermes_cli.pty_bridge import PtyBridge, PtyUnavailableError
# PTY bridge is POSIX-only (depends on fcntl/termios/ptyprocess). On native
# Windows the import raises; catch and leave PtyBridge=None so the rest of
# the dashboard (sessions, jobs, metrics, config editor) still loads and the
# /api/pty endpoint cleanly refuses with a WSL-suggested message.
try:
from hermes_cli.pty_bridge import PtyBridge, PtyUnavailableError
_PTY_BRIDGE_AVAILABLE = True
except ImportError as _pty_import_err: # pragma: no cover - Windows-only path
PtyBridge = None # type: ignore[assignment]
_PTY_BRIDGE_AVAILABLE = False
class PtyUnavailableError(RuntimeError): # type: ignore[no-redef]
"""Stub on platforms where pty_bridge can't be imported."""
pass
_RESIZE_RE = re.compile(rb"\x1b\[RESIZE:(\d+);(\d+)\]")
_PTY_READ_CHUNK_TIMEOUT = 0.2
@@ -3113,6 +3126,18 @@ async def pty_ws(ws: WebSocket) -> None:
await ws.accept()
# On native Windows, the POSIX PTY bridge can't be imported. Tell the
# client and close cleanly rather than pretending the feature works.
if not _PTY_BRIDGE_AVAILABLE:
await ws.send_text(
"\r\n\x1b[31mChat unavailable: the embedded terminal requires a "
"POSIX PTY, which native Windows Python doesn't provide.\x1b[0m\r\n"
"\x1b[33mInstall Hermes inside WSL2 to use the dashboard's /chat "
"tab — the rest of the dashboard works here.\x1b[0m\r\n"
)
await ws.close(code=1011)
return
# --- spawn PTY ------------------------------------------------------
resume = ws.query_params.get("resume") or None
channel = _channel_or_close_code(ws)
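The guarded import above follows a common optional-dependency pattern: try the import, record availability in a flag, and let callers branch on the flag. A minimal sketch, assuming a made-up module name (`hypothetical_posix_only_bridge`) that is not installed:

```python
try:
    # Assumption: this module does not exist, standing in for a POSIX-only import.
    from hypothetical_posix_only_bridge import Bridge
    BRIDGE_AVAILABLE = True
except ImportError:
    Bridge = None
    BRIDGE_AVAILABLE = False


def open_terminal() -> str:
    """Refuse cleanly when the optional bridge could not be imported."""
    if not BRIDGE_AVAILABLE:
        return "unavailable: the embedded terminal requires a POSIX PTY"
    return "ok"
```

The rest of the module keeps loading either way; only the feature that needs the bridge reports that it is unavailable.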
+2 -2
@@ -233,7 +233,7 @@ def is_wsl() -> bool:
if _wsl_detected is not None:
return _wsl_detected
try:
with open("/proc/version", "r") as f:
with open("/proc/version", "r", encoding="utf-8") as f:
_wsl_detected = "microsoft" in f.read().lower()
except Exception:
_wsl_detected = False
@@ -260,7 +260,7 @@ def is_container() -> bool:
_container_detected = True
return True
try:
with open("/proc/1/cgroup", "r") as f:
with open("/proc/1/cgroup", "r", encoding="utf-8") as f:
cgroup = f.read()
if "docker" in cgroup or "podman" in cgroup or "/lxc/" in cgroup:
_container_detected = True
+5
@@ -612,6 +612,11 @@ class SessionDB:
the caller already holds cumulative totals (gateway path, where the
cached agent accumulates across messages).
"""
# Ensure the session row exists so the UPDATE doesn't silently affect
# 0 rows. Under concurrent load (cron + kanban + delegate_task) the
# initial create_session() may have failed due to SQLite locking.
# INSERT OR IGNORE is cheap and idempotent.
self._insert_session_row(session_id, "unknown", model=model)
if absolute:
sql = """UPDATE sessions SET
input_tokens = ?,
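The comment above relies on two SQLite behaviours: an `UPDATE` that matches no row succeeds silently, and `INSERT OR IGNORE` is safe to repeat. A minimal sketch with an assumed, simplified schema (the real `sessions` table and `_insert_session_row` are not shown in this diff):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed toy schema for illustration only.
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, source TEXT, input_tokens INTEGER DEFAULT 0)"
)


def add_tokens(conn: sqlite3.Connection, session_id: str, delta: int) -> int:
    """Ensure the row exists, then add delta; return the number of rows updated."""
    conn.execute(
        "INSERT OR IGNORE INTO sessions (id, source) VALUES (?, ?)",
        (session_id, "unknown"),
    )
    cur = conn.execute(
        "UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
        (delta, session_id),
    )
    conn.commit()
    return cur.rowcount


# A bare UPDATE against a missing row raises nothing and touches 0 rows.
missing_rows = conn.execute(
    "UPDATE sessions SET input_tokens = 1 WHERE id = ?", ("missing",)
).rowcount
```

Checking `rowcount` is one way to notice the silent zero-row case; the diff instead guarantees the row exists before updating.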
+1 -1
@@ -50,7 +50,7 @@ def _resolve_timezone_name() -> str:
import yaml
config_path = get_config_path()
if config_path.exists():
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
cfg = yaml.safe_load(f) or {}
tz_cfg = cfg.get("timezone", "")
if isinstance(tz_cfg, str) and tz_cfg.strip():
+1 -1
@@ -802,7 +802,7 @@ def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
return json.dumps({"count": len(targets), "channels": targets}, indent=2)
channels = []
for plat, entries_list in directory.items():
for plat, entries_list in directory.get("platforms", {}).items():
if platform and plat.lower() != platform.lower():
continue
if isinstance(entries_list, list):
+92 -17
@@ -1905,6 +1905,29 @@
}).then(function () { load(); props.onRefresh(); });
};
// Triage specifier — calls the auxiliary LLM to flesh out a rough
// idea in the Triage column into a concrete spec (title + body with
// goal, approach, acceptance criteria) and promotes it to todo.
// Not a PATCH: runs through a dedicated POST endpoint because the
// LLM call can take tens of seconds, and its outcome is richer than
// a status flip (may update title AND body AND emit an audit
// comment — or fail with a human-readable reason that the UI
// surfaces inline without treating it as an HTTP error).
const doSpecify = function () {
return SDK.fetchJSON(
withBoard(`${API}/tasks/${encodeURIComponent(props.taskId)}/specify`, boardSlug),
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({}),
}
).then(function (res) {
load();
props.onRefresh();
return res;
});
};
const addLink = function (parentId) {
return SDK.fetchJSON(withBoard(`${API}/links`, boardSlug), {
method: "POST",
@@ -1994,6 +2017,7 @@
assignees: props.assignees || [],
boardSlug: boardSlug,
onPatch: doPatch,
onSpecify: doSpecify,
onAddParent: addLink,
onRemoveParent: removeLink,
onAddChild: addChild,
@@ -2062,7 +2086,11 @@
}) : null,
t.created_by ? h(MetaRow, { label: "Created by", value: t.created_by }) : null,
),
h(StatusActions, { task: t, onPatch: props.onPatch }),
h(StatusActions, {
task: t,
onPatch: props.onPatch,
onSpecify: props.onSpecify,
}),
h(DiagnosticsSection, {
task: t,
boardSlug: props.boardSlug,
@@ -2495,6 +2523,8 @@
function StatusActions(props) {
const t = props.task;
const [specifyBusy, setSpecifyBusy] = useState(false);
const [specifyMsg, setSpecifyMsg] = useState(null);
const b = function (label, patch, enabled, confirmMsg) {
return h(Button, {
onClick: function () { if (enabled !== false) props.onPatch(patch, { confirm: confirmMsg }); },
@@ -2502,22 +2532,67 @@
size: "sm",
}, label);
};
return h("div", { className: "hermes-kanban-actions" },
b("→ triage", { status: "triage" }, t.status !== "triage"),
b("→ ready", { status: "ready" }, t.status !== "ready"),
// No direct → running button: /tasks/:id PATCH rejects status=running
// with 400 (issue #19535). Tasks enter running only through the
// dispatcher's claim_task path, which atomically creates the run row,
// claim lock, and worker process metadata.
b("Block", { status: "blocked" },
t.status === "running" || t.status === "ready",
DESTRUCTIVE_TRANSITIONS.blocked),
b("Unblock", { status: "ready" }, t.status === "blocked"),
b("Complete", { status: "done" },
t.status === "running" || t.status === "ready" || t.status === "blocked",
DESTRUCTIVE_TRANSITIONS.done),
b("Archive", { status: "archived" }, t.status !== "archived",
DESTRUCTIVE_TRANSITIONS.archived),
// "Specify" appears only when the task is in the Triage column — the
// one column where an auxiliary LLM pass is meaningful. Elsewhere
// the backend would return ok:false with "not in triage" anyway,
// so hiding the button keeps the action row uncluttered.
const specifyButton = (t.status === "triage" && props.onSpecify)
? h(Button, {
onClick: function () {
if (specifyBusy) return;
setSpecifyBusy(true);
setSpecifyMsg(null);
props.onSpecify().then(function (res) {
if (res && res.ok) {
const suffix = res.new_title
? ` — retitled: ${res.new_title}`
: "";
setSpecifyMsg({ ok: true, text: `Specified${suffix}` });
} else {
setSpecifyMsg({
ok: false,
text: "Specify failed: " + ((res && res.reason) || "unknown error"),
});
}
}).catch(function (err) {
setSpecifyMsg({
ok: false,
text: "Specify failed: " + (err.message || String(err)),
});
}).then(function () {
setSpecifyBusy(false);
});
},
disabled: specifyBusy,
size: "sm",
}, specifyBusy ? "Specifying…" : "✨ Specify")
: null;
return h("div", null,
h("div", { className: "hermes-kanban-actions" },
specifyButton,
b("→ triage", { status: "triage" }, t.status !== "triage"),
b("→ ready", { status: "ready" }, t.status !== "ready"),
// No direct → running button: /tasks/:id PATCH rejects status=running
// with 400 (issue #19535). Tasks enter running only through the
// dispatcher's claim_task path, which atomically creates the run row,
// claim lock, and worker process metadata.
b("Block", { status: "blocked" },
t.status === "running" || t.status === "ready",
DESTRUCTIVE_TRANSITIONS.blocked),
b("Unblock", { status: "ready" }, t.status === "blocked"),
b("Complete", { status: "done" },
t.status === "running" || t.status === "ready" || t.status === "blocked",
DESTRUCTIVE_TRANSITIONS.done),
b("Archive", { status: "archived" }, t.status !== "archived",
DESTRUCTIVE_TRANSITIONS.archived),
),
specifyMsg ? h("div", {
className: specifyMsg.ok
? "hermes-kanban-msg-ok"
: "hermes-kanban-msg-err",
}, specifyMsg.text) : null,
);
}
+20
@@ -402,6 +402,26 @@
gap: 0.3rem;
}
/* Specifier result banner — sits directly under the status action row. */
.hermes-kanban-msg-ok,
.hermes-kanban-msg-err {
margin-top: 0.4rem;
padding: 0.35rem 0.55rem;
border-radius: 0.375rem;
font-size: 0.85rem;
line-height: 1.3;
}
.hermes-kanban-msg-ok {
background: rgba(46, 160, 67, 0.12);
color: #2ea043;
border: 1px solid rgba(46, 160, 67, 0.35);
}
.hermes-kanban-msg-err {
background: rgba(248, 81, 73, 0.12);
color: #f85149;
border: 1px solid rgba(248, 81, 73, 0.35);
}
/* ---- Home channel subscription toggles (per-platform, per-task) ----- */
.hermes-kanban-home-subs {
+56
@@ -30,6 +30,7 @@ import asyncio
import hmac
import json
import logging
import os
import sqlite3
import time
from dataclasses import asdict
@@ -1011,6 +1012,61 @@ def reclaim_task_endpoint(
conn.close()
class SpecifyBody(BaseModel):
"""Optional author override. Nothing else is configurable from the
dashboard; model + prompt come from ``auxiliary.triage_specifier``
in config.yaml, same as the CLI."""
author: Optional[str] = None
@router.post("/tasks/{task_id}/specify")
def specify_task_endpoint(
task_id: str,
payload: SpecifyBody,
board: Optional[str] = Query(None),
):
"""Flesh out a triage-column task via the auxiliary LLM and promote
it to ``todo``. Maps 1:1 to ``hermes kanban specify <task_id>``.
Returns the outcome shape used by the CLI: ``{ok, task_id, reason,
new_title}``. A non-OK outcome is NOT an HTTP error; the UI renders
the reason inline (e.g. "no auxiliary client configured") so the
operator knows what to fix, and retries without a page reload.
This endpoint runs in FastAPI's threadpool (sync ``def``) because
the underlying LLM call can take tens of seconds to minutes on
reasoning models, which would block the event loop if we used
``async def`` without an explicit ``run_in_executor``.
"""
board = _resolve_board(board)
# Pin the board for the duration of this call so the specifier module
# (which calls ``kb.connect()`` with no args) hits the right DB.
prev_env = os.environ.get("HERMES_KANBAN_BOARD")
try:
os.environ["HERMES_KANBAN_BOARD"] = board or kanban_db.DEFAULT_BOARD
# Import lazily so a missing auxiliary client at import time
# doesn't break plugin load.
from hermes_cli import kanban_specify # noqa: WPS433 (intentional)
outcome = kanban_specify.specify_task(
task_id,
author=(payload.author or None),
)
finally:
if prev_env is None:
os.environ.pop("HERMES_KANBAN_BOARD", None)
else:
os.environ["HERMES_KANBAN_BOARD"] = prev_env
return {
"ok": bool(outcome.ok),
"task_id": outcome.task_id,
"reason": outcome.reason,
"new_title": outcome.new_title,
}
class ReassignBody(BaseModel):
profile: Optional[str] = None # "" or None = unassign
reclaim_first: bool = False
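The save, set, restore sequence around `HERMES_KANBAN_BOARD` in the endpoint above can be expressed as a small context manager. This is a sketch under assumptions: `pinned_env` is a hypothetical name, and the variable used below is a placeholder rather than the real one.

```python
import os
from contextlib import contextmanager


@contextmanager
def pinned_env(name: str, value: str):
    """Set an environment variable for the duration of a block, then restore it."""
    prev = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the previous state exactly: unset if it was unset before.
        if prev is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = prev


# Usage with a placeholder variable name.
with pinned_env("DEMO_BOARD", "alpha"):
    inside = os.environ["DEMO_BOARD"]
```

Note that `os.environ` is process-wide, so two requests running in different threads could observe each other's value; whether that matters here depends on how the endpoint is called in practice.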
+48 -5
@@ -36,6 +36,12 @@ dependencies = [
"edge-tts>=7.2.7,<8",
# Skills Hub (GitHub App JWT auth — optional, only needed for bot identity)
"PyJWT[crypto]>=2.12.0,<3", # CVE-2026-32597
# Windows has no IANA tzdata shipped with the OS, so Python's ``zoneinfo``
# (PEP 615) raises ``ZoneInfoNotFoundError`` for every non-UTC timezone
# out of the box. ``tzdata`` ships the Olson database as a data package
# Python resolves automatically. No-op on Linux/macOS (which have
# /usr/share/zoneinfo). Credits: PR #13182 (@sprmn24).
"tzdata>=2023.3; sys_platform == 'win32'",
]
[project.optional-dependencies]
@@ -68,9 +74,7 @@ acp = ["agent-client-protocol>=0.9.0,<1.0"]
mistral = ["mistralai>=2.3.0,<3"]
bedrock = ["boto3>=1.35.0,<2"]
termux = [
# Tested Android / Termux path: keeps the core CLI feature-rich while
# avoiding extras that currently depend on non-Android wheels (notably
# faster-whisper -> ctranslate2 via the voice extra).
# Baseline Android / Termux path for reliable fresh installs.
"python-telegram-bot[webhooks]>=22.6,<23",
"hermes-agent[cron]",
"hermes-agent[cli]",
@@ -79,6 +83,27 @@ termux = [
"hermes-agent[honcho]",
"hermes-agent[acp]",
]
termux-all = [
# Best-effort "install all" profile for Termux: include broad extras that
# are known to resolve on Android, while intentionally excluding extras that
# currently hard-fail from missing/broken Android wheels/toolchains.
#
# Excluded for now:
# - matrix (mautrix[encryption] -> python-olm build failures on Termux)
# - voice (faster-whisper chain requires ctranslate2/av builds not packaged)
"hermes-agent[termux]",
"hermes-agent[messaging]",
"hermes-agent[slack]",
"hermes-agent[tts-premium]",
"hermes-agent[dingtalk]",
"hermes-agent[feishu]",
"hermes-agent[google]",
"hermes-agent[mistral]",
"hermes-agent[bedrock]",
"hermes-agent[homeassistant]",
"hermes-agent[sms]",
"hermes-agent[web]",
]
dingtalk = ["dingtalk-stream>=0.20,<1", "alibabacloud-dingtalk>=2.0.0", "qrcode>=7.0,<8"]
feishu = ["lark-oapi>=1.5.3,<2", "qrcode>=7.0,<8"]
google = [
@@ -135,7 +160,7 @@ hermes-agent = "run_agent:main"
hermes-acp = "acp_adapter.entry:main"
[tool.setuptools]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils"]
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_bootstrap", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "rl_cli", "utils"]
[tool.setuptools.package-data]
hermes_cli = ["web_dist/**/*"]
@@ -163,7 +188,25 @@ exclude = ["tinker-atropos"]
[tool.ruff]
exclude = ["tinker-atropos"]
select = [] # disable all lints for now, until we've wrangled typechecks a bit more :3
preview = true # required for PLW1514 (unspecified-encoding) — preview rule
[tool.ruff.lint]
# All other lints are intentionally disabled (see comment history on this
# file) while we wrangle typechecks — but PLW1514 is too load-bearing to
# keep off. Bare open()/read_text()/write_text() in text mode defaults to
# the system locale encoding on Windows (cp1252 on US-locale installs),
# which silently corrupts any non-ASCII file content. We had three
# separate Windows sandbox regressions in one debug session before
# adding the explicit encoding. This rule keeps new code honest.
select = ["PLW1514"]
[tool.ruff.lint.per-file-ignores]
# Tests can intentionally exercise locale-encoding edge cases.
"tests/**" = ["PLW1514"]
# Skills and plugins are partially user-authored — their own conventions.
"skills/**" = ["PLW1514"]
"optional-skills/**" = ["PLW1514"]
"plugins/**" = ["PLW1514"]
[tool.uv]
exclude-newer = "7 days"
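The corruption that PLW1514 guards against is easy to reproduce without a Windows machine: write UTF-8 bytes, then decode them as cp1252, which is what a locale-default read would do on a cp1252 system. A minimal sketch; the file name and sample text are made up.

```python
import tempfile
from pathlib import Path

text = "café"
path = Path(tempfile.mkdtemp()) / "note.txt"
path.write_text(text, encoding="utf-8")

raw = path.read_bytes()
as_utf8 = raw.decode("utf-8")      # round-trips correctly
as_cp1252 = raw.decode("cp1252")   # simulates a cp1252 locale default
```

Here `as_cp1252` comes back as `cafÃ©`: no exception is raised, which is why this class of bug tends to go unnoticed until someone reads the output.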
+1 -1
@@ -82,7 +82,7 @@ def load_hermes_config() -> dict:
if config_path.exists():
try:
with open(config_path, "r") as f:
with open(config_path, "r", encoding='utf-8') as f:
file_config = yaml.safe_load(f) or {}
# Get model from config
+25 -3
View File
@@ -20,6 +20,10 @@ Usage:
response = agent.run_conversation("Tell me about the latest Python updates")
"""
# IMPORTANT: hermes_bootstrap must be the very first import — UTF-8 stdio
# on Windows. No-op on POSIX. See hermes_bootstrap.py for full rationale.
import hermes_bootstrap # noqa: F401
import asyncio
import base64
import concurrent.futures
@@ -3065,6 +3069,10 @@ class AIAgent:
) -> bool:
"""Return True when this provider/model pair should use Responses API."""
normalized_provider = (provider or "").strip().lower()
# Nous serves GPT-5.x models via its OpenAI-compatible chat
# completions endpoint; its /v1/responses endpoint returns 404.
if normalized_provider == "nous":
return False
if normalized_provider == "copilot":
try:
from hermes_cli.models import _should_use_copilot_responses_api
@@ -3678,7 +3686,7 @@ class AIAgent:
pass
review_agent = None
try:
with open(os.devnull, "w") as _devnull, \
with open(os.devnull, "w", encoding="utf-8") as _devnull, \
contextlib.redirect_stdout(_devnull), \
contextlib.redirect_stderr(_devnull):
# Inherit the parent agent's live runtime (provider, model,
@@ -12127,6 +12135,14 @@ class AIAgent:
# deltas instead of double-counting them.
if self._session_db and self.session_id:
try:
# Ensure the session row exists before attempting UPDATE.
# Under concurrent load (cron/kanban), the initial
# _ensure_db_session() may have failed due to SQLite
# locking. Retry here so per-call token deltas are
# not silently lost (UPDATE on a non-existent row
# affects 0 rows without error).
if not self._session_db_created:
self._ensure_db_session()
self._session_db.update_token_counts(
self.session_id,
input_tokens=canonical_usage.input_tokens,
@@ -12145,8 +12161,14 @@ class AIAgent:
model=self.model,
api_call_count=1,
)
except Exception:
pass # never block the agent loop
except Exception as e:
# Log token persistence failures so they're
# visible in agent.log — silent loss here is
# the root cause of undercounted analytics.
logger.debug(
"Token persistence failed (session=%s, tokens=%d): %s",
self.session_id, total_tokens, e,
)
if self.verbose_logging:
logging.debug(f"Token usage: prompt={usage_dict['prompt_tokens']:,}, completion={usage_dict['completion_tokens']:,}, total={usage_dict['total_tokens']:,}")
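The change from `except Exception: pass` to a logged failure can be illustrated in isolation. A minimal sketch, assuming a hypothetical `persist_tokens` wrapper and a caller-supplied `store` callable; neither exists in this codebase.

```python
import logging

logger = logging.getLogger("demo.tokens")


def persist_tokens(store, session_id: str, total_tokens: int) -> bool:
    """Call store(); on failure, log the error and keep going.

    Hypothetical wrapper for illustration only.
    """
    try:
        store(session_id, total_tokens)
        return True
    except Exception as exc:
        # Never re-raise: the calling loop must not be blocked by persistence errors.
        logger.debug(
            "Token persistence failed (session=%s, tokens=%d): %s",
            session_id, total_tokens, exc,
        )
        return False
```

The loop still never blocks on a persistence error, but the failure now leaves a trace whenever debug logging is enabled.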
+1 -1
@@ -81,7 +81,7 @@ def build_catalog() -> dict:
def main() -> int:
catalog = build_catalog()
os.makedirs(os.path.dirname(OUTPUT_PATH), exist_ok=True)
with open(OUTPUT_PATH, "w") as fh:
with open(OUTPUT_PATH, "w", encoding="utf-8") as fh:
json.dump(catalog, fh, indent=2)
fh.write("\n")
+1 -1
@@ -304,7 +304,7 @@ def main():
}
os.makedirs(os.path.dirname(OUTPUT_PATH), exist_ok=True)
with open(OUTPUT_PATH, "w") as f:
with open(OUTPUT_PATH, "w", encoding="utf-8") as f:
json.dump(index, f, separators=(",", ":"), ensure_ascii=False)
elapsed = time.time() - overall_start
+1 -1
@@ -291,7 +291,7 @@ def check_release_file(release_file, all_contributors):
missing: set of handles NOT found in the file
"""
try:
content = Path(release_file).read_text()
content = Path(release_file).read_text(encoding="utf-8")
except FileNotFoundError:
print(f" [error] Release file not found: {release_file}", file=sys.stderr)
return set(), set(all_contributors)
+1 -1
@@ -242,7 +242,7 @@ def check_config(groq_key, eleven_key):
if config_path.exists():
try:
import yaml
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
cfg = yaml.safe_load(f) or {}
stt_provider = cfg.get("stt", {}).get("provider", "local")
+414 -65
@@ -145,15 +145,30 @@ function Test-Python {
# Python not found — use uv to install it (no admin needed!)
Write-Info "Python $PythonVersion not found, installing via uv..."
try {
# Temporarily relax ErrorActionPreference: uv writes download progress
# to stderr, and with $ErrorActionPreference = "Stop" PowerShell wraps
# those stderr lines as ErrorRecord objects via 2>&1, then throws a
# terminating exception — even when uv exits 0. This caused fresh
# installs to fail on the first run despite Python being installed
# successfully. We verify success with `uv python find` afterwards
# which is the reliable signal regardless of exit code semantics.
$prevEAP = $ErrorActionPreference
$ErrorActionPreference = "Continue"
$uvOutput = & $UvCmd python install $PythonVersion 2>&1
if ($LASTEXITCODE -eq 0) {
$pythonPath = & $UvCmd python find $PythonVersion 2>$null
if ($pythonPath) {
$ver = & $pythonPath --version 2>$null
Write-Success "Python installed: $ver"
return $true
}
} else {
$uvExitCode = $LASTEXITCODE
$ErrorActionPreference = $prevEAP
# Check if Python is now available (more reliable than exit code
# since uv may return non-zero due to "already installed" etc.)
$pythonPath = & $UvCmd python find $PythonVersion 2>$null
if ($pythonPath) {
$ver = & $pythonPath --version 2>$null
Write-Success "Python installed: $ver"
return $true
}
# uv ran but Python still not findable — show what happened
if ($uvExitCode -ne 0) {
Write-Warn "uv python install output:"
Write-Host $uvOutput -ForegroundColor DarkGray
}
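The underlying point of this hunk, that output on stderr is not the same thing as failure, is language-independent. A Python analogue (not a replacement for the PowerShell fix) is sketched below, using a child process that writes progress to stderr and still exits 0:

```python
import subprocess
import sys


def run_noisy(cmd):
    """Run cmd and return (exit code, stdout, stderr) without judging stderr."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


# The child writes a progress line to stderr, prints a result, and exits 0.
child = "import sys; sys.stderr.write('downloading...\\n'); print('done')"
code, out, err = run_noisy([sys.executable, "-c", child])

# Success is decided by the exit code (or a follow-up check), not by stderr text.
succeeded = code == 0
```

The installer goes one step further and treats a follow-up `uv python find` as the deciding signal, so a misleading exit code is tolerated as well.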
@@ -191,19 +206,213 @@ function Test-Python {
return $false
}
function Test-Git {
function Install-Git {
<#
.SYNOPSIS
Ensure Git (and Git Bash) are installed. Git for Windows bundles bash.exe
which Hermes uses to run shell commands.
Priority order (deliberately simple: no winget, no registry, no system
package manager):
1. Existing ``git`` on PATH: use it as-is (the common fast path).
2. Download **PortableGit** from the official git-for-windows GitHub
release (self-extracting 7z.exe) and unpack it to
``%LOCALAPPDATA%\hermes\git``. This never touches system Git, never
requires admin, and works even on locked-down machines and machines
with a broken system Git install.
**Why PortableGit, not MinGit:** MinGit is the minimal-automation
distribution and ships ONLY ``git.exe``: no bash, no POSIX utilities.
Hermes needs ``bash.exe`` to run shell commands. PortableGit is the
full Git for Windows distribution without the installer UI; it ships
``git.exe`` + ``bash.exe`` + ``sh``, ``awk``, ``sed``, ``grep``, ``curl``,
``ssh``, etc. in ``usr\bin\``.
We deliberately skip winget because it fails badly when the system Git
install is in a half-installed state (partially registered, or uninstall-
blocked). Owning the Hermes copy of Git ourselves is predictable and
recoverable: if it ever breaks, ``Remove-Item %LOCALAPPDATA%\hermes\git``
and re-running this installer fully recovers.
After install we locate ``bash.exe`` and persist the path in
``HERMES_GIT_BASH_PATH`` (User scope) so Hermes can find it in a fresh
shell without a second PATH refresh.
#>
Write-Info "Checking Git..."
if (Get-Command git -ErrorAction SilentlyContinue) {
$version = git --version
Write-Success "Git found ($version)"
Set-GitBashEnvVar
return $true
}
Write-Err "Git not found"
Write-Info "Please install Git from:"
Write-Info " https://git-scm.com/download/win"
return $false
# Download PortableGit into $HermesHome\git. Always works as long as
# we can reach github.com — no admin, no winget, no reliance on the
# user's possibly-broken system Git install.
Write-Info "Git not found — downloading PortableGit to $HermesHome\git\ ..."
Write-Info "(no admin rights required; isolated from any system Git install)"
try {
$arch = if ([Environment]::Is64BitOperatingSystem) {
# Detect ARM64 vs x64 explicitly; PortableGit ships separate assets.
if ($env:PROCESSOR_ARCHITECTURE -eq "ARM64" -or $env:PROCESSOR_ARCHITEW6432 -eq "ARM64") {
"arm64"
} else {
"64-bit"
}
} else {
# PortableGit does not ship a 32-bit build — fall back to MinGit 32-bit
# with a warning that bash-based features will be unavailable.
"32-bit-mingit"
}
$releaseApi = "https://api.github.com/repos/git-for-windows/git/releases/latest"
$release = Invoke-RestMethod -Uri $releaseApi -UseBasicParsing -Headers @{ "User-Agent" = "hermes-installer" }
if ($arch -eq "32-bit-mingit") {
Write-Warn "32-bit Windows detected — PortableGit is 64-bit only. Installing MinGit 32-bit as a last resort; bash-dependent Hermes features (terminal tool, agent-browser) will not work on this machine."
$assetPattern = "MinGit-*-32-bit.zip"
$downloadIsZip = $true
} elseif ($arch -eq "arm64") {
$assetPattern = "PortableGit-*-arm64.7z.exe"
$downloadIsZip = $false
} else {
$assetPattern = "PortableGit-*-64-bit.7z.exe"
$downloadIsZip = $false
}
$asset = $release.assets | Where-Object { $_.name -like $assetPattern } | Select-Object -First 1
if (-not $asset) {
throw "Could not find $assetPattern in latest git-for-windows release"
}
$downloadUrl = $asset.browser_download_url
$downloadExt = if ($downloadIsZip) { "zip" } else { "7z.exe" }
$tmpFile = "$env:TEMP\$($asset.name)"
$gitDir = "$HermesHome\git"
Write-Info "Downloading $($asset.name) ($([math]::Round($asset.size / 1MB, 1)) MB)..."
Invoke-WebRequest -Uri $downloadUrl -OutFile $tmpFile -UseBasicParsing
if (Test-Path $gitDir) {
Write-Info "Removing previous Git install at $gitDir ..."
Remove-Item -Recurse -Force $gitDir
}
New-Item -ItemType Directory -Path $gitDir -Force | Out-Null
if ($downloadIsZip) {
Expand-Archive -Path $tmpFile -DestinationPath $gitDir -Force
} else {
# PortableGit is a self-extracting 7z archive. Invoke it with
# `-o<target> -y` (silent) to extract to $gitDir. No 7z install
# required; it's fully self-contained.
Write-Info "Extracting PortableGit to $gitDir ..."
$extractProc = Start-Process -FilePath $tmpFile `
-ArgumentList "-o`"$gitDir`"", "-y" `
-NoNewWindow -Wait -PassThru
if ($extractProc.ExitCode -ne 0) {
throw "PortableGit extraction failed (exit code $($extractProc.ExitCode))"
}
}
Remove-Item -Force $tmpFile -ErrorAction SilentlyContinue
# PortableGit layout: cmd\git.exe + bin\bash.exe + usr\bin\ (coreutils)
# MinGit layout: cmd\git.exe + usr\bin\bash.exe (if present)
$gitExe = "$gitDir\cmd\git.exe"
if (-not (Test-Path $gitExe)) {
throw "Git extraction did not produce git.exe at $gitExe"
}
# Add to session PATH so the rest of this install run can use git.
$env:Path = "$gitDir\cmd;$env:Path"
# Persist to User PATH so fresh shells see it. PortableGit needs
# cmd\ (for git.exe), bin\ (for bash.exe + core tools), and
# usr\bin\ (for perl, ssh, curl, and other POSIX coreutils).
$newPathEntries = @(
"$gitDir\cmd",
"$gitDir\bin",
"$gitDir\usr\bin"
)
$userPath = [Environment]::GetEnvironmentVariable("Path", "User")
$userPathItems = if ($userPath) { $userPath -split ";" } else { @() }
$changed = $false
foreach ($entry in $newPathEntries) {
if ($userPathItems -notcontains $entry) {
$userPathItems += $entry
$changed = $true
}
}
if ($changed) {
[Environment]::SetEnvironmentVariable("Path", ($userPathItems -join ";"), "User")
}
$version = & $gitExe --version
Write-Success "Git $version installed to $gitDir (portable, user-scoped)"
Set-GitBashEnvVar
return $true
} catch {
Write-Err "Could not install portable Git: $_"
Write-Info ""
Write-Info "Fallback: install Git manually from https://git-scm.com/download/win"
Write-Info "then re-run this installer. Hermes needs Git Bash on Windows to run"
Write-Info "shell commands (same as Claude Code and other coding agents)."
return $false
}
}
function Set-GitBashEnvVar {
<#
.SYNOPSIS
Locate ``bash.exe`` from an already-installed Git and persist the path in
``HERMES_GIT_BASH_PATH`` (User env scope) so Hermes can find it even before
PATH propagation completes in a newly-spawned shell.
#>
$candidates = @()
# Our own portable Git install is ALWAYS checked first, so a broken
# system Git doesn't hijack us. If the user had a working system Git
# we'd have returned early from Install-Git's fast path and never called
# this with a system-Git-only installation anyway.
#
# Layouts:
# PortableGit (our default): $HermesHome\git\bin\bash.exe
# MinGit (32-bit fallback): $HermesHome\git\usr\bin\bash.exe
$candidates += "$HermesHome\git\bin\bash.exe" # PortableGit layout (primary)
$candidates += "$HermesHome\git\usr\bin\bash.exe" # MinGit / PortableGit usr\bin fallback
# git.exe on PATH can tell us where the install root is
$gitCmd = Get-Command git -ErrorAction SilentlyContinue
if ($gitCmd) {
$gitExe = $gitCmd.Source
# Git for Windows (full installer): <root>\cmd\git.exe + <root>\bin\bash.exe
# MinGit: <root>\cmd\git.exe + <root>\usr\bin\bash.exe
$gitRoot = Split-Path (Split-Path $gitExe -Parent) -Parent
$candidates += "$gitRoot\bin\bash.exe"
$candidates += "$gitRoot\usr\bin\bash.exe"
}
# Standard system install locations as a final fallback. Note:
# ProgramFiles(x86) can't be referenced via ${env:...} string interpolation
# because of the parens — use [Environment]::GetEnvironmentVariable().
$candidates += "${env:ProgramFiles}\Git\bin\bash.exe"
$pf86 = [Environment]::GetEnvironmentVariable("ProgramFiles(x86)")
if ($pf86) { $candidates += "$pf86\Git\bin\bash.exe" }
$candidates += "${env:LocalAppData}\Programs\Git\bin\bash.exe"
foreach ($candidate in $candidates) {
if ($candidate -and (Test-Path $candidate)) {
[Environment]::SetEnvironmentVariable("HERMES_GIT_BASH_PATH", $candidate, "User")
$env:HERMES_GIT_BASH_PATH = $candidate
Write-Info "Set HERMES_GIT_BASH_PATH=$candidate"
return
}
}
Write-Warn "Could not locate bash.exe — Hermes may not find Git Bash."
Write-Info "If needed, set HERMES_GIT_BASH_PATH manually to your bash.exe path."
}
function Test-Node {
@@ -411,21 +620,71 @@ function Install-SystemPackages {
function Install-Repository {
Write-Info "Installing to $InstallDir..."
$didUpdate = $false
if (Test-Path $InstallDir) {
# Test-Path "$InstallDir\.git" returns True when .git is a file OR a
# directory OR a symlink OR a submodule-style gitfile — and also when
# it's a broken stub left over from a failed previous install (e.g.
# a partial Remove-Item that couldn't delete a locked index.lock).
# Validate the repo properly by asking git itself, using two
# belt-and-braces checks: rev-parse AND git status. If either fails,
# the repo is treated as broken and we fall through to a fresh clone.
$repoValid = $false
if (Test-Path "$InstallDir\.git") {
Push-Location $InstallDir
try {
# Reset $LASTEXITCODE before the probe so we don't pick up
# a stale 0 from an earlier git call in this session.
$global:LASTEXITCODE = 0
$revParseOut = & git -c windows.appendAtomically=false rev-parse --is-inside-work-tree 2>&1
$revParseOk = ($LASTEXITCODE -eq 0) -and ($revParseOut -match "true")
$global:LASTEXITCODE = 0
$null = & git -c windows.appendAtomically=false status --short 2>&1
$statusOk = ($LASTEXITCODE -eq 0)
if ($revParseOk -and $statusOk) {
$repoValid = $true
}
} catch {}
Pop-Location
}
if ($repoValid) {
Write-Info "Existing installation found, updating..."
Push-Location $InstallDir
git -c windows.appendAtomically=false fetch origin
git -c windows.appendAtomically=false checkout $Branch
git -c windows.appendAtomically=false pull origin $Branch
Pop-Location
try {
git -c windows.appendAtomically=false fetch origin
if ($LASTEXITCODE -ne 0) { throw "git fetch failed (exit $LASTEXITCODE)" }
git -c windows.appendAtomically=false checkout $Branch
if ($LASTEXITCODE -ne 0) { throw "git checkout $Branch failed (exit $LASTEXITCODE)" }
git -c windows.appendAtomically=false pull origin $Branch
if ($LASTEXITCODE -ne 0) { throw "git pull failed (exit $LASTEXITCODE)" }
} finally {
Pop-Location
}
$didUpdate = $true
} else {
Write-Err "Directory exists but is not a git repository: $InstallDir"
Write-Info "Remove it or choose a different directory with -InstallDir"
throw "Directory exists but is not a git repository: $InstallDir"
# Directory exists but isn't a usable git repo. Wipe it and
# fall through to a fresh clone. A leftover ``.git`` stub from
# a partial uninstall used to lock the installer into the
# "update" branch forever, emitting three ``fatal: not a git
# repository`` errors and failing with "not in a git directory".
Write-Warn "Existing directory at $InstallDir is not a valid git repo — replacing it."
try {
Remove-Item -Recurse -Force $InstallDir -ErrorAction Stop
} catch {
Write-Err "Could not remove $InstallDir : $_"
Write-Info "Close any programs that might be using files in $InstallDir (editors,"
Write-Info "terminals, running hermes processes) and try again."
throw
}
}
} else {
}
if (-not $didUpdate) {
$cloneSuccess = $false
# Fix Windows git "copy-fd: write returned: Invalid argument" error.
@@ -446,7 +705,7 @@ function Install-Repository {
if ($LASTEXITCODE -eq 0) { $cloneSuccess = $true }
} catch { }
$env:GIT_SSH_COMMAND = $null
if (-not $cloneSuccess) {
if (Test-Path $InstallDir) { Remove-Item -Recurse -Force $InstallDir -ErrorAction SilentlyContinue }
Write-Info "SSH failed, trying HTTPS..."
@@ -464,18 +723,18 @@ function Install-Repository {
$zipUrl = "https://github.com/NousResearch/hermes-agent/archive/refs/heads/$Branch.zip"
$zipPath = "$env:TEMP\hermes-agent-$Branch.zip"
$extractPath = "$env:TEMP\hermes-agent-extract"
Invoke-WebRequest -Uri $zipUrl -OutFile $zipPath -UseBasicParsing
if (Test-Path $extractPath) { Remove-Item -Recurse -Force $extractPath }
Expand-Archive -Path $zipPath -DestinationPath $extractPath -Force
# GitHub ZIPs extract to repo-branch/ subdirectory
$extractedDir = Get-ChildItem $extractPath -Directory | Select-Object -First 1
if ($extractedDir) {
New-Item -ItemType Directory -Force -Path (Split-Path $InstallDir) -ErrorAction SilentlyContinue | Out-Null
Move-Item $extractedDir.FullName $InstallDir -Force
Write-Success "Downloaded and extracted"
# Initialize git repo so updates work later
Push-Location $InstallDir
git -c windows.appendAtomically=false init 2>$null
@@ -483,10 +742,10 @@ function Install-Repository {
git remote add origin $RepoUrlHttps 2>$null
Pop-Location
Write-Success "Git repo initialized for future updates"
$cloneSuccess = $true
}
# Cleanup temp files
Remove-Item -Force $zipPath -ErrorAction SilentlyContinue
Remove-Item -Recurse -Force $extractPath -ErrorAction SilentlyContinue
@@ -499,7 +758,7 @@ function Install-Repository {
throw "Failed to download repository (tried git clone SSH, HTTPS, and ZIP)"
}
}
# Set per-repo config (harmless if it fails)
Push-Location $InstallDir
git -c windows.appendAtomically=false config windows.appendAtomically false 2>$null
@@ -513,7 +772,7 @@ function Install-Repository {
Write-Success "Submodules ready"
}
Pop-Location
Write-Success "Repository ready"
}
@@ -659,13 +918,21 @@ function Copy-ConfigTemplates {
Write-Info "~/.hermes/config.yaml already exists, keeping it"
}
# Create SOUL.md if it doesn't exist (global persona file)
# Create SOUL.md if it doesn't exist (global persona file).
# IMPORTANT: write without a BOM. Windows PowerShell 5.1's
# ``Set-Content -Encoding UTF8`` writes UTF-8 WITH a byte-order-mark
# (the default PS5 behaviour), and Hermes's prompt-injection scanner
# flags the BOM as an invisible unicode character and refuses to
# load the file. PS7's ``-Encoding utf8NoBOM`` fixes that but we
# don't control which PowerShell version the user has. Go direct
# to .NET with an explicit UTF8Encoding($false) — BOM-free on every
# PowerShell version.
$soulPath = "$HermesHome\SOUL.md"
if (-not (Test-Path $soulPath)) {
@"
$soulContent = @"
# Hermes Agent Persona
<!--
<!--
This file defines the agent's personality and tone.
The agent will embody whatever you write here.
Edit this to customize how Hermes communicates with you.
@@ -678,7 +945,9 @@ Examples:
This file is loaded fresh each message -- no restart needed.
Delete the contents (or this file) to use the default personality.
-->
"@ | Set-Content -Path $soulPath -Encoding UTF8
"@
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($soulPath, $soulContent, $utf8NoBom)
Write-Success "Created ~/.hermes/SOUL.md (edit to customize personality)"
}
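The BOM issue described in the SOUL.md comment above can be reproduced outside PowerShell. The following is a minimal sketch in Python (a language choice made for illustration; the installer itself is PowerShell), and the file names are hypothetical. It writes the same text once with a byte-order mark and once without, then inspects the leading bytes.

```python
import codecs
import tempfile
from pathlib import Path


def starts_with_bom(path: Path) -> bool:
    """Return True when the file begins with the UTF-8 byte-order mark."""
    return path.read_bytes().startswith(codecs.BOM_UTF8)


def write_both(directory: Path, text: str) -> tuple[Path, Path]:
    """Write `text` twice: once with a BOM (utf-8-sig), once without (utf-8)."""
    with_bom = directory / "soul_with_bom.md"      # hypothetical file name
    without_bom = directory / "soul_no_bom.md"     # hypothetical file name
    with_bom.write_text(text, encoding="utf-8-sig")
    without_bom.write_text(text, encoding="utf-8")
    return with_bom, without_bom


with tempfile.TemporaryDirectory() as tmp:
    bom_file, plain_file = write_both(Path(tmp), "# Hermes Agent Persona\n")
    BOM_DETECTED = starts_with_bom(bom_file)
    PLAIN_DETECTED = starts_with_bom(plain_file)
```

A scanner that rejects invisible characters would see the three BOM bytes only in the first file, which is the situation the explicit `UTF8Encoding($false)` call is meant to avoid.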
@@ -708,36 +977,94 @@ function Install-NodeDeps {
Write-Info "Skipping Node.js dependencies (Node not installed)"
return
}
Push-Location $InstallDir
if (Test-Path "package.json") {
Write-Info "Installing Node.js dependencies (browser tools)..."
try {
npm install --silent 2>&1 | Out-Null
Write-Success "Node.js dependencies installed"
} catch {
Write-Warn "npm install failed (browser tools may not work)"
# Resolve npm explicitly to npm.cmd, NOT npm.ps1. Node.js on Windows
# ships BOTH npm.cmd (a batch shim) and npm.ps1 (a PowerShell shim).
# Get-Command's default ordering picks whichever comes first in PATHEXT,
# and on many systems that's .ps1 — but .ps1 requires scripts to be
# enabled in PowerShell's execution policy, which most Windows users
# don't have (the Restricted / RemoteSigned default blocks unsigned
# .ps1 files). .cmd has no such restriction and works on every box.
#
# Strategy: look next to the npm shim we found and prefer npm.cmd if
# it exists in the same directory. Fall back to whatever Get-Command
# returned if we can't find a .cmd sibling.
$npmCmd = Get-Command npm -ErrorAction SilentlyContinue
if (-not $npmCmd) {
Write-Warn "npm not found on PATH — skipping Node.js dependencies."
Write-Info "Open a new PowerShell window and re-run 'hermes setup tools' later."
return
}
$npmExe = $npmCmd.Source
if ($npmExe -like "*.ps1") {
$npmCmdSibling = Join-Path (Split-Path $npmExe -Parent) "npm.cmd"
if (Test-Path $npmCmdSibling) {
Write-Info "Using npm.cmd (PowerShell execution policy blocks npm.ps1)"
$npmExe = $npmCmdSibling
} else {
Write-Warn "Only npm.ps1 available — install may fail if script execution is disabled."
Write-Info " If it fails, either enable PS script execution or install Node via winget."
}
}
# Install TUI dependencies
# Helper: run "npm install" in a given directory and surface the real
# error when it fails. Returns $true on success.
#
# Implementation note: ``Start-Process -FilePath npm.cmd`` fails with
# ``%1 is not a valid Win32 application`` on some PowerShell versions
# because Start-Process bypasses cmd.exe / PATHEXT and expects a real
# PE file. The invocation-operator ``& $npmExe`` routes through the
# PowerShell command pipeline which DOES honour .cmd batch shims, so
# it works uniformly for npm.cmd, npx.cmd, and bare .exe files.
function _Run-NpmInstall([string]$label, [string]$installDir, [string]$logPath, [string]$npmPath) {
Push-Location $installDir
try {
# Redirect ALL output streams to the log file with ``*>`` (simpler
# than piping 2>&1 through ``Tee-Object`` / ``Out-File``), then
# inspect $LASTEXITCODE afterwards.
& $npmPath install --silent *> $logPath
$code = $LASTEXITCODE
if ($code -eq 0) {
Write-Success "$label dependencies installed"
Remove-Item -Force $logPath -ErrorAction SilentlyContinue
return $true
}
Write-Warn "$label npm install failed — exit code $code"
if (Test-Path $logPath) {
$errText = (Get-Content $logPath -Raw -ErrorAction SilentlyContinue)
if ($errText) {
$snippet = if ($errText.Length -gt 1200) { $errText.Substring(0, 1200) + "..." } else { $errText }
Write-Info " npm output:"
foreach ($line in $snippet -split "`n") {
Write-Host " $line" -ForegroundColor DarkGray
}
Write-Info " Full log: $logPath"
}
}
Write-Info "Run manually later: cd `"$installDir`"; npm install"
return $false
} catch {
Write-Warn "$label npm install could not be launched: $_"
return $false
} finally {
Pop-Location
}
}
# Browser tools
if (Test-Path "$InstallDir\package.json") {
Write-Info "Installing Node.js dependencies (browser tools)..."
$browserLog = "$env:TEMP\hermes-npm-browser-$(Get-Random).log"
[void](_Run-NpmInstall "Browser tools" $InstallDir $browserLog $npmExe)
}
# TUI
$tuiDir = "$InstallDir\ui-tui"
if (Test-Path "$tuiDir\package.json") {
Write-Info "Installing TUI dependencies..."
Push-Location $tuiDir
try {
npm install --silent 2>&1 | Out-Null
Write-Success "TUI dependencies installed"
} catch {
Write-Warn "TUI npm install failed (hermes --tui may not work)"
}
Pop-Location
$tuiLog = "$env:TEMP\hermes-npm-tui-$(Get-Random).log"
[void](_Run-NpmInstall "TUI" $tuiDir $tuiLog $npmExe)
}
Pop-Location
}
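The shim-selection rule in Install-NodeDeps (keep whatever was resolved unless it is a `.ps1`, in which case prefer a `.cmd` sibling if one exists) can be sketched as a pure function. This is an illustrative Python sketch, not the installer's code: `prefer_cmd_shim` is a hypothetical name, and the `existing` set stands in for the `Test-Path` check.

```python
from pathlib import PureWindowsPath


def prefer_cmd_shim(resolved: str, existing: set[str]) -> str:
    """Pick the .cmd sibling of a resolved .ps1 shim when that sibling exists.

    `existing` is a stand-in for a filesystem existence check.
    """
    path = PureWindowsPath(resolved)
    if path.suffix.lower() != ".ps1":
        # Already a .cmd or .exe: nothing to swap.
        return resolved
    sibling = str(path.with_suffix(".cmd"))
    # Fall back to the original .ps1 when no .cmd sits next to it.
    return sibling if sibling in existing else resolved
```

Keeping the decision separate from the filesystem lookup makes the three cases (swap, no sibling, already fine) easy to exercise without a Node.js install.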
function Invoke-SetupWizard {
@@ -886,13 +1213,35 @@ function Write-Completion {
function Main {
Write-Banner
# Windows refuses to delete a directory any shell is currently cd'd
# inside — and silently leaves orphan files behind, which then wedge
# "is this a valid git repo" probes on re-install. If the current
# working dir is under $InstallDir, step out to the user's home
# BEFORE doing anything else. Harmless when the user ran the
# installer from somewhere else.
try {
$currentResolved = (Get-Location).ProviderPath
$installResolved = $null
if (Test-Path $InstallDir) {
$installResolved = (Resolve-Path $InstallDir -ErrorAction SilentlyContinue).ProviderPath
}
if ($installResolved -and $currentResolved.ToLower().StartsWith($installResolved.ToLower())) {
Write-Info "Stepping out of $InstallDir so Windows can replace files there if needed..."
Set-Location $env:USERPROFILE
}
} catch {}
if (-not (Install-Uv)) { throw "uv installation failed — cannot continue" }
if (-not (Test-Python)) { throw "Python $PythonVersion not available — cannot continue" }
if (-not (Test-Git)) { throw "Git not found — install from https://git-scm.com/download/win" }
Test-Node # Auto-installs if missing
if (-not (Install-Git)) { throw "Git not available and auto-install failed — install from https://git-scm.com/download/win then re-run" }
# Test-Node always returns $true (sets $script:HasNode on success, emits a
# warning on failure and continues so non-browser installs still work).
# Cast to [void] so the bare return value doesn't print "True" to the
# console between the "Node found" line and the next installer step.
[void](Test-Node)
Install-SystemPackages # ripgrep + ffmpeg in one step
Install-Repository
Install-Venv
Install-Dependencies
@@ -901,7 +1250,7 @@ function Main {
Copy-ConfigTemplates
Invoke-SetupWizard
Start-GatewayIfConfigured
Write-Completion
}
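The "is the current directory under $InstallDir" test at the top of Main compares lowercased strings by prefix. A plain prefix comparison also matches a sibling directory whose name merely starts with the install path (for example `hermes-agent` next to `hermes`), which is harmless here because stepping out to the home directory costs nothing. A component-wise check avoids that false positive. The sketch below is in Python for illustration, and `is_inside` is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def is_inside(current: str, install_dir: str) -> bool:
    """Return True when `current` is `install_dir` or a descendant of it.

    PureWindowsPath compares case-insensitively and by path component,
    so a sibling sharing a name prefix does not count as inside.
    """
    cur = PureWindowsPath(current)
    root = PureWindowsPath(install_dir)
    return cur == root or root in cur.parents
```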
+56 -9
@@ -28,6 +28,10 @@ if [ -n "${PYTHONHOME:-}" ]; then
unset PYTHONHOME
fi
# Prevent uv from discovering config files (uv.toml, pyproject.toml) from the
# wrong user's home directory when running under sudo -u <user>. See #21269.
export UV_NO_CONFIG=1
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
@@ -615,6 +619,41 @@ install_node() {
HAS_NODE=true
}
check_network_prerequisites() {
log_info "Checking internet connectivity for package install and web tools..."
local url
local failed=false
local checks=("https://pypi.org/simple/" "https://duckduckgo.com/")
if ! command -v curl >/dev/null 2>&1; then
log_warn "curl not found; skipping connectivity probes"
return 0
fi
for url in "${checks[@]}"; do
if ! curl -fsSI --max-time 8 "$url" >/dev/null 2>&1; then
failed=true
log_warn "Could not reach $url"
fi
done
if [ "$failed" = false ]; then
log_success "Internet connectivity looks good"
return 0
fi
if [ "$DISTRO" = "termux" ]; then
log_warn "Termux network prerequisites may be incomplete."
log_info "Try: pkg install -y ca-certificates curl && pkg update"
log_info "If mirrors are stale: termux-change-repo"
log_info "Then test: curl -I https://pypi.org/simple/ && curl -I https://duckduckgo.com/"
else
log_warn "Network checks failed. Hermes install may complete, but web search and dependency downloads can fail."
log_info "Verify internet/DNS and retry if pip install fails."
fi
}
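The probing loop in check_network_prerequisites collects failures rather than stopping at the first one, and never aborts the install. A rough Python equivalent is sketched below, under the assumption that a HEAD request is an acceptable stand-in for `curl -fsSI`. Both function names are hypothetical, and the probe is injectable so the logic can be exercised without network access.

```python
import urllib.request
from typing import Callable, Iterable


def unreachable_urls(urls: Iterable[str], probe: Callable[[str], bool]) -> list[str]:
    """Return the URLs for which `probe` fails or raises a network error."""
    failed = []
    for url in urls:
        try:
            ok = probe(url)
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            ok = False
        if not ok:
            failed.append(url)
    return failed


def head_probe(url: str, timeout: float = 8.0) -> bool:
    """Send a HEAD request and report whether the final status is 2xx or 3xx."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return 200 <= response.status < 400
```

Calling `unreachable_urls(["https://pypi.org/simple/", "https://duckduckgo.com/"], head_probe)` would mirror the two checks in the shell function; an empty result corresponds to the "connectivity looks good" branch.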
install_system_packages() {
# Detect what's missing
HAS_RIPGREP=false
@@ -642,7 +681,7 @@ install_system_packages() {
# Termux always needs the Android build toolchain for the tested pip path,
# even when ripgrep/ffmpeg are already present.
if [ "$DISTRO" = "termux" ]; then
local termux_pkgs=(clang rust make pkg-config libffi openssl)
local termux_pkgs=(clang rust make pkg-config libffi openssl ca-certificates curl)
if [ "$need_ripgrep" = true ]; then
termux_pkgs+=("ripgrep")
fi
@@ -945,17 +984,24 @@ install_deps() {
fi
"$PIP_PYTHON" -m pip install --upgrade pip setuptools wheel >/dev/null
if ! "$PIP_PYTHON" -m pip install -e '.[termux]' -c constraints-termux.txt; then
log_warn "Termux feature install (.[termux]) failed, trying base install..."
if ! "$PIP_PYTHON" -m pip install -e '.' -c constraints-termux.txt; then
log_error "Package installation failed on Termux."
log_info "Ensure these packages are installed: pkg install clang rust make pkg-config libffi openssl"
log_info "Then re-run: cd $INSTALL_DIR && python -m pip install -e '.[termux]' -c constraints-termux.txt"
exit 1
# Try the broad Termux profile first (best-effort "install all" for Android),
# then fall back to the conservative Termux baseline, then base package.
if ! "$PIP_PYTHON" -m pip install -e '.[termux-all]' -c constraints-termux.txt; then
log_warn "Termux broad profile (.[termux-all]) failed, trying baseline Termux profile..."
if ! "$PIP_PYTHON" -m pip install -e '.[termux]' -c constraints-termux.txt; then
log_warn "Termux baseline profile (.[termux]) failed, trying base install..."
if ! "$PIP_PYTHON" -m pip install -e '.' -c constraints-termux.txt; then
log_error "Package installation failed on Termux."
log_info "Ensure these packages are installed: pkg install clang rust make pkg-config libffi openssl ca-certificates curl"
log_info "Then re-run: cd $INSTALL_DIR && python -m pip install -e '.[termux-all]' -c constraints-termux.txt"
exit 1
fi
fi
fi
log_success "Main package installed"
log_info "Termux note: matrix e2ee and local faster-whisper extras are excluded from .[termux-all] due to upstream Android wheel/toolchain blockers."
log_info "Termux note: browser/WhatsApp tooling is not installed by default; see the Termux guide for optional follow-up steps."
if [ -d "tinker-atropos" ] && [ -f "tinker-atropos/pyproject.toml" ]; then
@@ -1047,7 +1093,7 @@ setup_path() {
log_warn "hermes entry point not found at $HERMES_BIN"
log_info "This usually means the pip install didn't complete successfully."
if [ "$DISTRO" = "termux" ]; then
log_info "Try: cd $INSTALL_DIR && python -m pip install -e '.[termux]' -c constraints-termux.txt"
log_info "Try: cd $INSTALL_DIR && python -m pip install -e '.[termux-all]' -c constraints-termux.txt"
else
log_info "Try: cd $INSTALL_DIR && uv pip install -e '.[all]'"
fi
@@ -1570,6 +1616,7 @@ main() {
check_python
check_git
check_node
check_network_prerequisites
install_system_packages
clone_repo
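The Termux branch of install_deps tries `.[termux-all]`, then `.[termux]`, then the base package, and stops at the first target that installs. The nested `if !` chain amounts to "first success wins", which can be sketched in Python as follows. `first_successful` and the `install` callback are hypothetical names used for illustration only; the real script shells out to pip.

```python
from typing import Callable, Optional, Sequence


def first_successful(targets: Sequence[str], install: Callable[[str], bool]) -> Optional[str]:
    """Try each install target in order and return the first that succeeds.

    Returns None when every target fails, which corresponds to the
    `exit 1` path in the shell script.
    """
    for target in targets:
        if install(target):
            return target
    return None
```

Expressing the order as a list keeps the broadest profile first and makes it simple to add or drop a tier without re-nesting conditionals.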
+2 -2
@@ -111,7 +111,7 @@ def summarize(log: Path, since_ts_ms: int) -> dict[str, Any]:
frame_events: list[dict[str, Any]] = []
if not log.exists():
return {"error": f"no log at {log}", "react": [], "frame": []}
for line in log.read_text().splitlines():
for line in log.read_text(encoding="utf-8").splitlines():
line = line.strip()
if not line:
continue
@@ -505,7 +505,7 @@ def main() -> int:
if args.save:
path = Path(f"/tmp/perf-{args.save}.json")
path.write_text(json.dumps(metrics, indent=2))
path.write_text(json.dumps(metrics, indent=2), encoding="utf-8")
print(f"\n• saved: {path}")
if args.compare:
+4 -2
@@ -55,6 +55,7 @@ AUTHOR_MAP = {
"127238744+teknium1@users.noreply.github.com": "teknium1",
"128259593+Gutslabs@users.noreply.github.com": "Gutslabs",
"50326054+nocturnum91@users.noreply.github.com": "nocturnum91",
"223003280+Abd0r@users.noreply.github.com": "Abd0r",
"abdielv@proton.me": "AJV20",
"mason@growagainorchids.com": "masonjames",
"am@studio1.tailb672fe.ts.net": "subtract0",
@@ -77,6 +78,7 @@ AUTHOR_MAP = {
"dengtaoyuan@dengtaoyuandeMac-mini.local": "dengtaoyuan450-a11y",
"ysfalweshcan@gmail.com": "Junass1",
"bartokmagic@proton.me": "Bartok9",
"androidhtml@yandex.com": "hllqkb",
"25840394+Bongulielmi@users.noreply.github.com": "Bongulielmi",
"jonathan.troyer@overmatch.com": "JTroyerOvermatch",
"harryykyle1@gmail.com": "hharry11",
@@ -424,7 +426,7 @@ AUTHOR_MAP = {
"camilo@tekelala.com": "tekelala",
"vincentcharlebois@gmail.com": "vincentcharlebois",
"aryan@synvoid.com": "aryansingh",
"johnsonblake1@gmail.com": "blakejohnson",
"johnsonblake1@gmail.com": "voteblake",
"hcn518@gmail.com": "pedh",
"haileymarshall005@gmail.com": "haileymarshall",
"greer.guthrie@gmail.com": "g-guthrie",
@@ -1357,7 +1359,7 @@ def main():
)
if args.output:
Path(args.output).write_text(changelog)
Path(args.output).write_text(changelog, encoding="utf-8")
print(f"Changelog written to {args.output}")
else:
print(changelog)
+4
@@ -29,6 +29,10 @@ NC='\033[0m'
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
# Prevent uv from discovering config files (uv.toml, pyproject.toml) from the
# wrong user's home directory when running under sudo -u <user>. See #21269.
export UV_NO_CONFIG=1
PYTHON_VERSION="3.11"
is_termux() {
+124 -1
@@ -1,5 +1,14 @@
import base64
import pytest
from acp.schema import ImageContentBlock, TextContentBlock
from acp.schema import (
BlobResourceContents,
EmbeddedResourceContentBlock,
ImageContentBlock,
ResourceContentBlock,
TextContentBlock,
TextResourceContents,
)
from acp_adapter.server import HermesACPAgent, _content_blocks_to_openai_user_content
@@ -27,6 +36,48 @@ def test_text_only_acp_blocks_stay_string_for_legacy_prompt_path():
assert content == "/help"
def test_acp_resource_link_file_is_inlined_as_text(tmp_path):
attached = tmp_path / "notes.md"
attached.write_text("# Notes\n\nAttached file body", encoding="utf-8")
content = _content_blocks_to_openai_user_content([
TextContentBlock(type="text", text="Please read this file"),
ResourceContentBlock(
type="resource_link",
name="notes.md",
title="Project notes",
uri=attached.as_uri(),
mimeType="text/markdown",
),
])
assert content == (
"Please read this file\n"
"[Attached file: Project notes (notes.md)]\n"
f"URI: {attached.as_uri()}\n\n"
"# Notes\n\nAttached file body"
)
def test_acp_embedded_text_resource_is_inlined_as_text():
content = _content_blocks_to_openai_user_content([
EmbeddedResourceContentBlock(
type="resource",
resource=TextResourceContents(
uri="file:///workspace/todo.txt",
mimeType="text/plain",
text="first\nsecond",
),
),
])
assert content == (
"[Attached file: todo.txt]\n"
"URI: file:///workspace/todo.txt\n\n"
"first\nsecond"
)
@pytest.mark.asyncio
async def test_initialize_advertises_image_prompt_capability():
response = await HermesACPAgent().initialize()
@@ -34,3 +85,75 @@ async def test_initialize_advertises_image_prompt_capability():
assert response.agent_capabilities is not None
assert response.agent_capabilities.prompt_capabilities is not None
assert response.agent_capabilities.prompt_capabilities.image is True
# 1x1 transparent PNG — smallest valid image payload for inlining tests.
_ONE_PX_PNG = bytes.fromhex(
"89504e470d0a1a0a0000000d49484452000000010000000108060000001f15c4"
"890000000a49444154789c6300010000000500010d0a2db40000000049454e44ae426082"
)
def test_acp_resource_link_image_file_is_inlined_as_image_url(tmp_path):
attached = tmp_path / "shot.png"
attached.write_bytes(_ONE_PX_PNG)
content = _content_blocks_to_openai_user_content([
TextContentBlock(type="text", text="Look at this screenshot"),
ResourceContentBlock(
type="resource_link",
name="shot.png",
uri=attached.as_uri(),
mimeType="image/png",
),
])
assert isinstance(content, list)
# [user text, image header, image_url]
assert content[0] == {"type": "text", "text": "Look at this screenshot"}
assert content[1]["type"] == "text"
assert "[Attached image: shot.png]" in content[1]["text"]
assert content[2]["type"] == "image_url"
expected_url = "data:image/png;base64," + base64.b64encode(_ONE_PX_PNG).decode("ascii")
assert content[2]["image_url"]["url"] == expected_url
def test_acp_resource_link_image_mime_inferred_from_suffix(tmp_path):
"""No mimeType sent — should still be recognised as image by file suffix."""
attached = tmp_path / "pic.jpg"
attached.write_bytes(_ONE_PX_PNG) # content doesn't matter for the code path
content = _content_blocks_to_openai_user_content([
ResourceContentBlock(
type="resource_link",
name="pic.jpg",
uri=attached.as_uri(),
),
])
assert isinstance(content, list)
image_parts = [p for p in content if p.get("type") == "image_url"]
assert len(image_parts) == 1
assert image_parts[0]["image_url"]["url"].startswith("data:image/jpeg;base64,")
def test_acp_embedded_blob_image_is_inlined_as_image_url():
b64 = base64.b64encode(_ONE_PX_PNG).decode("ascii")
content = _content_blocks_to_openai_user_content([
EmbeddedResourceContentBlock(
type="resource",
resource=BlobResourceContents(
uri="file:///tmp/embed.png",
mimeType="image/png",
blob=b64,
),
),
])
assert isinstance(content, list)
assert content[0]["type"] == "text"
assert "[Attached image: embed.png]" in content[0]["text"]
assert content[1] == {
"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{b64}"},
}
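The image tests above expect a `data:` URL whose MIME type comes from the block's `mimeType` when present and from the file suffix otherwise. A minimal sketch of that construction is shown below. `to_data_url` is a hypothetical helper, not the adapter's actual function, and the `application/octet-stream` fallback is an assumption.

```python
import base64
import mimetypes
from typing import Optional


def to_data_url(name: str, payload: bytes, mime_type: Optional[str] = None) -> str:
    """Build a base64 data URL, inferring the MIME type from `name` if needed."""
    # Explicit MIME type wins; otherwise guess from the file suffix.
    mime = mime_type or mimetypes.guess_type(name)[0] or "application/octet-stream"
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```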
+5
@@ -378,6 +378,11 @@ def test_run_doctor_termux_treats_docker_and_browser_warnings_as_expected(monkey
assert "1) pkg install nodejs" in out
assert "2) npm install -g agent-browser" in out
assert "3) agent-browser install" in out
assert "Termux compatibility fallbacks:" in out
assert "use .[termux-all] for broad compatibility" in out
assert "Matrix E2EE extra is excluded on Termux" in out
assert "Local faster-whisper extra is excluded on Termux" in out
assert "STT fallback: use Groq Whisper (set GROQ_API_KEY) or OpenAI Whisper (set VOICE_TOOLS_OPENAI_KEY)." in out
assert "docker not found (optional)" not in out
+55
@@ -286,3 +286,58 @@ def test_run_slash_reassign_with_reclaim_flag(kanban_home):
assert "Reassigned" in out, out
out2 = kc.run_slash(f"show {tid}")
assert "newbie" in out2
# ---------------------------------------------------------------------------
# /kanban specify — slash surface (same entry point CLI + gateway use)
# ---------------------------------------------------------------------------
def test_run_slash_specify_end_to_end(kanban_home, monkeypatch):
"""The /kanban specify slash command routes through run_slash, which
both the interactive CLI and every gateway platform use. This test
covers both surfaces."""
from unittest.mock import MagicMock
# Create a triage task via the same slash surface.
create_out = kc.run_slash("create 'rough idea' --triage")
import re
m = re.search(r"(t_[a-f0-9]+)", create_out)
assert m, f"no task id in: {create_out!r}"
tid = m.group(1)
# Mock the auxiliary client so we don't hit a real provider.
resp = MagicMock()
resp.choices = [MagicMock()]
resp.choices[0].message.content = (
'{"title": "Spec: rough idea", "body": "**Goal**\\nShip it."}'
)
fake_client = MagicMock()
fake_client.chat.completions.create = MagicMock(return_value=resp)
monkeypatch.setattr(
"agent.auxiliary_client.get_text_auxiliary_client",
lambda *a, **kw: (fake_client, "test-model"),
)
# Specify via slash.
out = kc.run_slash(f"specify {tid}")
assert "Specified" in out
assert tid in out
# Task is promoted and retitled.
with kb.connect() as conn:
task = kb.get_task(conn, tid)
assert task.status in {"todo", "ready"}
assert task.title == "Spec: rough idea"
def test_run_slash_specify_help_is_reachable(kanban_home):
"""`--help` on a subcommand is handled by argparse itself — it prints
to the process stdout and raises SystemExit before run_slash's output
redirection is installed, so the returned string is the usage-error
sentinel. All we're asserting here is that the subcommand is
registered (no "unknown action" error); the shape of the help text
is covered by the direct argparse tests in test_kanban_specify.py."""
out = kc.run_slash("specify --help")
# Either the usage-error sentinel (stdout swallowed by argparse) or
# a real help rendering — both mean the subcommand exists.
assert "usage error" in out.lower() or "specify" in out.lower()
+337
@@ -0,0 +1,337 @@
"""Tests for the specifier module + `hermes kanban specify` CLI surface.
The auxiliary LLM client is mocked, so these tests don't hit any network or
real provider. They exercise the prompt plumbing, response parsing, DB
writes, and CLI flag surface.
"""
from __future__ import annotations
import argparse
import json as jsonlib
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from hermes_cli import kanban as kanban_cli
from hermes_cli import kanban_db as kb
from hermes_cli import kanban_specify as spec
@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
kb.init_db()
return home
def _fake_aux_response(content: str):
"""Build a minimal object shaped like an OpenAI chat.completions result.
The specifier only reads ``resp.choices[0].message.content``, so we
avoid importing the openai SDK and build the tree with MagicMock.
"""
resp = MagicMock()
resp.choices = [MagicMock()]
resp.choices[0].message.content = content
return resp
def _mock_client_returning(content: str):
client = MagicMock()
client.chat.completions.create = MagicMock(return_value=_fake_aux_response(content))
return client
def _patch_aux_client(content: str, *, model: str = "test-model"):
"""Patch get_text_auxiliary_client at its source + at the module that
imported it lazily inside specify_task. A single patch at the source
covers both, because kanban_specify imports the function inside the
function body and so resolves it at call time.
Returns a ``(patcher, client)`` tuple.
"""
client = _mock_client_returning(content)
return patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(client, model),
), client
# ---------------------------------------------------------------------------
# JSON extraction helpers
# ---------------------------------------------------------------------------
def test_extract_json_blob_handles_plain_json():
raw = '{"title": "T", "body": "B"}'
assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"}
def test_extract_json_blob_handles_fenced_json():
raw = '```json\n{"title": "T", "body": "B"}\n```'
assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"}
def test_extract_json_blob_handles_prose_preamble():
raw = 'Sure! Here you go:\n{"title": "T", "body": "B"}\nThanks.'
assert spec._extract_json_blob(raw) == {"title": "T", "body": "B"}
def test_extract_json_blob_returns_none_for_unparseable():
assert spec._extract_json_blob("no json here") is None
assert spec._extract_json_blob("") is None
assert spec._extract_json_blob("{not: valid}") is None
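The four tests above pin down the expected behaviour of the extraction helper: plain JSON, fenced JSON, and JSON wrapped in prose all yield a dict, while unparseable input yields None. One way to satisfy those cases is to slice from the first `{` to the last `}` and attempt a parse. The sketch below is an assumption about a possible approach, not the actual body of `spec._extract_json_blob`, and `extract_json_object` is a hypothetical name.

```python
import json
from typing import Any, Optional


def extract_json_object(raw: str) -> Optional[dict[str, Any]]:
    """Return the outermost JSON object embedded in `raw`, or None."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        # No brace pair at all, so there is nothing to parse.
        return None
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    # Only a JSON object counts; this guard rejects any other parsed type.
    return parsed if isinstance(parsed, dict) else None
```

This approach ignores code fences and surrounding prose rather than parsing them, which keeps it short but means two separate objects in one response would fail to parse and return None.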
# ---------------------------------------------------------------------------
# specify_task (module-level entry point)
# ---------------------------------------------------------------------------
def test_specify_task_happy_path(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
content = jsonlib.dumps({
"title": "Refined rough",
"body": "**Goal**\nA concrete goal.",
})
p, _ = _patch_aux_client(content)
with p:
outcome = spec.specify_task(tid, author="ace")
assert outcome.ok is True
assert outcome.task_id == tid
assert outcome.new_title == "Refined rough"
with kb.connect() as conn:
task = kb.get_task(conn, tid)
# Parent-free → recompute_ready promotes to ready.
assert task.status == "ready"
assert task.title == "Refined rough"
assert "**Goal**" in (task.body or "")
def test_specify_task_falls_back_to_body_only_on_bad_json(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="keep title", triage=True)
# Model returned plain markdown, no JSON object.
content = "Goal: Do a thing.\nApproach: Steps here."
p, _ = _patch_aux_client(content)
with p:
outcome = spec.specify_task(tid)
assert outcome.ok is True
with kb.connect() as conn:
t = kb.get_task(conn, tid)
# Title preserved (no JSON with a title key).
assert t.title == "keep title"
# Body replaced with the raw response.
assert "Goal:" in (t.body or "")
def test_specify_task_rejects_non_triage_task(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="ready task")
p, client = _patch_aux_client("unused")
with p:
outcome = spec.specify_task(tid)
assert outcome.ok is False
assert "not in triage" in outcome.reason
# LLM must not be invoked for a non-triage task — fail cheap.
assert client.chat.completions.create.call_count == 0
def test_specify_task_unknown_id(kanban_home):
p, client = _patch_aux_client("unused")
with p:
outcome = spec.specify_task("t_nope")
assert outcome.ok is False
assert "unknown task" in outcome.reason
assert client.chat.completions.create.call_count == 0
def test_specify_task_no_aux_client_configured(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, ""),
):
outcome = spec.specify_task(tid)
assert outcome.ok is False
assert "auxiliary client" in outcome.reason
# Task must stay in triage — we never touched it.
with kb.connect() as conn:
assert kb.get_task(conn, tid).status == "triage"
def test_specify_task_llm_api_error_keeps_task_in_triage(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
client = MagicMock()
client.chat.completions.create = MagicMock(side_effect=RuntimeError("429 rate limited"))
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(client, "test-model"),
):
outcome = spec.specify_task(tid)
assert outcome.ok is False
assert "LLM error" in outcome.reason
with kb.connect() as conn:
assert kb.get_task(conn, tid).status == "triage"
def test_specify_task_empty_llm_response(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
p, _ = _patch_aux_client("")
with p:
outcome = spec.specify_task(tid)
assert outcome.ok is False
with kb.connect() as conn:
assert kb.get_task(conn, tid).status == "triage"
def test_list_triage_ids(kanban_home):
with kb.connect() as conn:
a = kb.create_task(conn, title="a", triage=True)
b = kb.create_task(conn, title="b", triage=True, tenant="proj-1")
kb.create_task(conn, title="c") # not triage — excluded
ids_all = spec.list_triage_ids()
assert set(ids_all) == {a, b}
ids_tenant = spec.list_triage_ids(tenant="proj-1")
assert ids_tenant == [b]
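Reviewer sketch (not part of the diff): the specifier tests above imply a parse step that accepts a JSON object with "title" and "body", falls back to body-only on plain markdown, and treats an empty response as a failure. A minimal illustration of that rule follows, assuming a hypothetical helper name `parse_spec_response`; the real implementation in the specifier module may differ.

```python
import json
from typing import Optional, Tuple


def parse_spec_response(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: return (title, body) from a model response.

    Assumed behavior, inferred from the tests only: a JSON object may
    carry "title" and "body"; anything else is treated as a body-only
    response, and an empty response yields (None, None).
    """
    text = (raw or "").strip()
    if not text:
        return None, None  # caller would report failure and keep triage
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None, text  # plain markdown: keep the existing title
    if not isinstance(data, dict):
        return None, text
    title = data.get("title")
    body = data.get("body")
    title = title.strip() if isinstance(title, str) and title.strip() else None
    body = body if isinstance(body, str) and body.strip() else None
    return title, body
```

A `None` title here stands for "leave the stored title unchanged", which matches the body-only fallback test.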
# ---------------------------------------------------------------------------
# CLI wiring — argparse + _cmd_specify
# ---------------------------------------------------------------------------
def _run_cli(*argv: str) -> int:
"""Invoke the `hermes kanban …` argparse surface directly."""
root = argparse.ArgumentParser()
subp = root.add_subparsers(dest="cmd")
kanban_cli.build_parser(subp)
ns = root.parse_args(["kanban", *argv])
return kanban_cli.kanban_command(ns)
def test_cli_specify_requires_id_or_all(kanban_home, capsys):
rc = _run_cli("specify")
assert rc == 2
err = capsys.readouterr().err
assert "requires a task id or --all" in err
def test_cli_specify_rejects_both_id_and_all(kanban_home, capsys):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
rc = _run_cli("specify", tid, "--all")
assert rc == 2
err = capsys.readouterr().err
assert "either a task id OR --all" in err
def test_cli_specify_single_id_success(kanban_home, capsys):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
content = jsonlib.dumps({"title": "clean", "body": "body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", tid)
assert rc == 0
out = capsys.readouterr().out
assert tid in out
assert "→ todo" in out or "-> todo" in out or "" in out
def test_cli_specify_all_success_and_json(kanban_home, capsys):
with kb.connect() as conn:
a = kb.create_task(conn, title="a", triage=True)
b = kb.create_task(conn, title="b", triage=True)
content = jsonlib.dumps({"title": "spec", "body": "body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", "--all", "--json")
assert rc == 0
lines = [l for l in capsys.readouterr().out.strip().splitlines() if l]
# One JSON object per task + nothing else.
assert len(lines) == 2
parsed = [jsonlib.loads(l) for l in lines]
ids = {row["task_id"] for row in parsed}
assert ids == {a, b}
assert all(row["ok"] for row in parsed)
def test_cli_specify_all_empty_triage_column(kanban_home, capsys):
rc = _run_cli("specify", "--all")
assert rc == 0
assert "No triage tasks" in capsys.readouterr().out
def test_cli_specify_all_returns_1_when_every_task_fails(kanban_home, capsys):
with kb.connect() as conn:
kb.create_task(conn, title="a", triage=True)
kb.create_task(conn, title="b", triage=True)
with patch(
"agent.auxiliary_client.get_text_auxiliary_client",
return_value=(None, ""), # no aux client → every task fails
):
rc = _run_cli("specify", "--all")
assert rc == 1
def test_cli_specify_tenant_filter(kanban_home, capsys):
with kb.connect() as conn:
outside = kb.create_task(conn, title="outside", triage=True)
inside = kb.create_task(
conn, title="inside", triage=True, tenant="proj-a",
)
content = jsonlib.dumps({"title": "spec", "body": "body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", "--all", "--tenant", "proj-a", "--json")
assert rc == 0
lines = [
jsonlib.loads(l)
for l in capsys.readouterr().out.strip().splitlines()
if l
]
ids = {row["task_id"] for row in lines}
assert ids == {inside}
# The outside task stays in triage.
with kb.connect() as conn:
assert kb.get_task(conn, outside).status == "triage"
# The inside task was promoted.
assert kb.get_task(conn, inside).status in {"todo", "ready"}
def test_cli_specify_author_passed_through(kanban_home, capsys):
with kb.connect() as conn:
tid = kb.create_task(conn, title="rough", triage=True)
content = jsonlib.dumps({"title": "fresh title", "body": "fresh body"})
p, _ = _patch_aux_client(content)
with p:
rc = _run_cli("specify", tid, "--author", "custom-agent")
assert rc == 0
with kb.connect() as conn:
comments = kb.list_comments(conn, tid)
assert comments and comments[0].author == "custom-agent"
+184
@@ -0,0 +1,184 @@
"""Tests for kb.specify_triage_task — the DB-layer atomic promotion
from the triage column to todo. LLM-free by design."""
from __future__ import annotations
from pathlib import Path
import pytest
from hermes_cli import kanban_db as kb
@pytest.fixture
def kanban_home(tmp_path, monkeypatch):
"""Isolated HERMES_HOME with an empty kanban DB."""
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
kb.init_db()
return home
def _create_triage(conn, title="rough idea", body=None, assignee=None):
return kb.create_task(
conn,
title=title,
body=body,
assignee=assignee,
triage=True,
)
def test_specify_promotes_triage_to_todo(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough idea")
assert kb.get_task(conn, tid).status == "triage"
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
tid,
title="Refined: rough idea",
body="**Goal**\nDo the thing.",
author="specifier-bot",
)
assert ok is True
with kb.connect() as conn:
task = kb.get_task(conn, tid)
# No parents → recompute_ready should have flipped it past todo to ready.
assert task.status == "ready"
assert task.title == "Refined: rough idea"
assert "**Goal**" in (task.body or "")
def test_specify_with_open_parent_lands_in_todo_not_ready(kanban_home):
# Parent-gated specified tasks must not jump the dispatcher — they go
# to todo and wait for parent completion like any other gated task.
with kb.connect() as conn:
parent = kb.create_task(conn, title="parent work")
child = _create_triage(conn, title="child idea")
kb.link_tasks(conn, parent, child)
# After linking with an open parent, triage status should still be
# 'triage' (linking doesn't touch triage tasks).
assert kb.get_task(conn, child).status == "triage"
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
child,
body="full spec",
author="specifier",
)
assert ok is True
with kb.connect() as conn:
t = kb.get_task(conn, child)
# Parent still open → specified child sits in 'todo', not 'ready'.
assert t.status == "todo"
def test_specify_refuses_non_triage_task(kanban_home):
with kb.connect() as conn:
tid = kb.create_task(conn, title="normal task")
assert kb.get_task(conn, tid).status == "ready"
with kb.connect() as conn:
ok = kb.specify_triage_task(conn, tid, body="won't apply")
assert ok is False
with kb.connect() as conn:
# Status unchanged.
assert kb.get_task(conn, tid).status == "ready"
def test_specify_returns_false_for_unknown_id(kanban_home):
with kb.connect() as conn:
ok = kb.specify_triage_task(conn, "t_does_not_exist", body="x")
assert ok is False
def test_specify_rejects_blank_title(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough")
with kb.connect() as conn, pytest.raises(ValueError):
kb.specify_triage_task(conn, tid, title=" ", body="ok")
def test_specify_emits_event(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="rough")
with kb.connect() as conn:
kb.specify_triage_task(
conn, tid, title="new", body="b", author="ace"
)
with kb.connect() as conn:
events = kb.list_events(conn, tid)
kinds = [e.kind for e in events]
assert "specified" in kinds
# The specified event records which fields actually changed as a
# JSON payload under task_events.payload.
spec_ev = next(e for e in events if e.kind == "specified")
assert spec_ev.payload is not None
fields = spec_ev.payload.get("changed_fields") or []
assert "title" in fields
assert "body" in fields
def test_specify_records_audit_comment_only_when_author_given(kanban_home):
# With author → comment added.
with kb.connect() as conn:
tid1 = _create_triage(conn, title="a")
kb.specify_triage_task(
conn, tid1, title="A-spec", body="b", author="ace"
)
comments1 = kb.list_comments(conn, tid1)
assert len(comments1) == 1
assert "Specified" in comments1[0].body
assert comments1[0].author == "ace"
# Without author → no comment (silent).
with kb.connect() as conn:
tid2 = _create_triage(conn, title="b")
kb.specify_triage_task(conn, tid2, title="B-spec", body="b")
comments2 = kb.list_comments(conn, tid2)
assert comments2 == []
def test_specify_skips_comment_when_nothing_changed(kanban_home):
# Create triage task with title and body already set; pass identical
# values to specify. Should promote to todo but skip audit comment.
with kb.connect() as conn:
tid = _create_triage(conn, title="same", body="same body")
with kb.connect() as conn:
ok = kb.specify_triage_task(
conn,
tid,
title="same",
body="same body",
author="ace",
)
assert ok is True
with kb.connect() as conn:
# Promoted.
assert kb.get_task(conn, tid).status in {"todo", "ready"}
# No audit comment because neither field changed.
assert kb.list_comments(conn, tid) == []
def test_specify_with_only_body_preserves_title(kanban_home):
with kb.connect() as conn:
tid = _create_triage(conn, title="keep this title")
with kb.connect() as conn:
kb.specify_triage_task(conn, tid, body="new body only")
with kb.connect() as conn:
t = kb.get_task(conn, tid)
assert t.title == "keep this title"
assert t.body == "new body only"
def test_specify_second_call_noop_false(kanban_home):
# Promoting twice must not crash and the second call returns False
# because the task is no longer in triage.
with kb.connect() as conn:
tid = _create_triage(conn, title="once")
with kb.connect() as conn:
assert kb.specify_triage_task(conn, tid, body="spec") is True
with kb.connect() as conn:
assert kb.specify_triage_task(conn, tid, body="spec again") is False
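Reviewer sketch (not part of the diff): the DB-layer tests above pin down where a specified task lands. The rule can be summarized as a pure function, assuming a hypothetical name `status_after_specify` and assuming that "done" is the status of a completed parent (the tests do not show that value).

```python
from typing import Iterable


def status_after_specify(current_status: str, parent_statuses: Iterable[str]) -> str:
    """Hypothetical summary of the promotion rule exercised by the tests.

    Assumption: a parent counts as open unless its status is "done".
    """
    if current_status != "triage":
        # Non-triage tasks are refused; status stays as it was.
        return current_status
    if any(status != "done" for status in parent_statuses):
        # Parent-gated tasks wait in todo.
        return "todo"
    # No open parents: recompute_ready moves the task past todo.
    return "ready"
```

This also explains why a second specify call returns False: after the first call the task is no longer in triage.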
+132 -1
@@ -152,4 +152,135 @@ class TestRelaunch:
with pytest.raises(SystemExit):
relaunch_mod.relaunch(["--resume", "abc"])
assert calls == [("/usr/bin/hermes", ["/usr/bin/hermes", "--resume", "abc"])]
def test_windows_uses_subprocess_not_execvp(self, monkeypatch):
"""On Windows, os.execvp raises OSError "Exec format error" when the
target is a .cmd shim or console-script wrapper (both common for
hermes). relaunch() must detect win32 and use subprocess.run +
sys.exit instead."""
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod, "resolve_hermes_bin", lambda: r"C:\Users\test\hermes.exe")
import subprocess as _subprocess
captured_argv = []
def fake_subprocess_run(argv, **kwargs):
captured_argv.append(list(argv))
class _Result:
returncode = 0
return _Result()
monkeypatch.setattr(_subprocess, "run", fake_subprocess_run)
# execvp MUST NOT be called on Windows — route must go through subprocess
execvp_calls = []
def fake_execvp(*args, **kwargs):
execvp_calls.append(args)
raise AssertionError("os.execvp must not be called on Windows")
monkeypatch.setattr(relaunch_mod.os, "execvp", fake_execvp)
with pytest.raises(SystemExit) as exc_info:
relaunch_mod.relaunch(["chat"])
assert exc_info.value.code == 0
assert execvp_calls == []
assert captured_argv == [[r"C:\Users\test\hermes.exe", "chat"]]
def test_windows_propagates_child_exit_code(self, monkeypatch):
"""A non-zero exit from the child should flow through to sys.exit."""
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod, "resolve_hermes_bin", lambda: r"C:\hermes.exe")
import subprocess as _subprocess
def fake_run(argv, **kwargs):
class _Result:
returncode = 42
return _Result()
monkeypatch.setattr(_subprocess, "run", fake_run)
monkeypatch.setattr(relaunch_mod.os, "execvp", lambda *a, **kw: None)
with pytest.raises(SystemExit) as exc_info:
relaunch_mod.relaunch(["chat"])
assert exc_info.value.code == 42
def test_windows_surfaces_oserror_with_help(self, monkeypatch, capsys):
"""When subprocess itself raises OSError (file-not-found / bad format),
we must NOT let it bubble up as a cryptic traceback; instead print a
user-readable hint and sys.exit(1)."""
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod, "resolve_hermes_bin", lambda: r"C:\missing.exe")
import subprocess as _subprocess
def fake_run(argv, **kwargs):
raise OSError(2, "No such file or directory")
monkeypatch.setattr(_subprocess, "run", fake_run)
monkeypatch.setattr(relaunch_mod.os, "execvp", lambda *a, **kw: None)
with pytest.raises(SystemExit) as exc_info:
relaunch_mod.relaunch(["chat"])
assert exc_info.value.code == 1
err = capsys.readouterr().err
assert "relaunch failed" in err
assert "open a new terminal" in err.lower() or "path" in err.lower()
class TestResolveHermesBinWindowsPyGuard:
"""On Windows, resolve_hermes_bin MUST NOT return a .py path.
os.access(x, os.X_OK) returns True for .py files on Windows because
PATHEXT includes .py when the Python launcher is installed, but
subprocess.run can't actually exec a .py directly, so the relaunch
would fail with the cryptic "%1 is not a valid Win32 application" error.
"""
def test_windows_rejects_py_argv0_falls_through_to_path(self, monkeypatch, tmp_path):
"""On Windows, if sys.argv[0] is a .py file, we must skip the
argv[0] fast-path and fall through to PATH / python -m."""
# Build a fake .py script that "passes" the isfile + X_OK checks.
script = tmp_path / "main.py"
script.write_text("# stub")
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod.sys, "argv", [str(script), "chat"])
# Force PATH lookup to return a hermes.exe so the test doesn't
# exercise the None-fallback path (that's a separate test).
monkeypatch.setattr(
relaunch_mod.shutil, "which",
lambda name: r"C:\venv\Scripts\hermes.exe" if name == "hermes" else None,
)
bin_path = relaunch_mod.resolve_hermes_bin()
# Must NOT be the .py — must be the hermes.exe PATH entry.
assert bin_path == r"C:\venv\Scripts\hermes.exe"
def test_posix_still_accepts_py_argv0(self, monkeypatch, tmp_path):
"""POSIX behaviour unchanged: argv[0] pointing at an executable
script (including .py with a shebang + chmod +x) is fine to return
because POSIX exec can route through the shebang line."""
if sys.platform == "win32":
pytest.skip("POSIX semantics")
script = tmp_path / "hermes"
script.write_text("#!/usr/bin/env python3\n")
script.chmod(0o755)
monkeypatch.setattr(relaunch_mod.sys, "argv", [str(script), "chat"])
assert relaunch_mod.resolve_hermes_bin() == str(script)
def test_windows_py_argv0_with_no_hermes_on_path_returns_none(self, monkeypatch, tmp_path):
"""Bulletproof fallback: if argv0 is .py on Windows AND hermes.exe
isn't on PATH, return None so the caller falls back to
python -m hermes_cli.main."""
script = tmp_path / "main.py"
script.write_text("# stub")
monkeypatch.setattr(relaunch_mod.sys, "platform", "win32")
monkeypatch.setattr(relaunch_mod.sys, "argv", [str(script), "chat"])
monkeypatch.setattr(relaunch_mod.shutil, "which", lambda name: None)
assert relaunch_mod.resolve_hermes_bin() is None
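Reviewer sketch (not part of the diff): the relaunch tests describe a platform split, with subprocess plus exit-code propagation on Windows and exec on POSIX. A minimal, dependency-injected illustration follows, assuming a hypothetical function `relaunch_sketch` that returns the exit code instead of calling sys.exit so it can be checked without monkeypatching.

```python
import os
import subprocess
import sys
from typing import Callable, List, Sequence


def relaunch_sketch(
    bin_path: str,
    args: Sequence[str],
    *,
    platform: str = sys.platform,
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
    execvp: Callable[[str, List[str]], None] = os.execvp,
) -> int:
    """Hypothetical relaunch logic; returns the code the caller would exit with."""
    argv = [bin_path, *args]
    if platform == "win32":
        try:
            # Run the child and propagate its exit code.
            return run(argv).returncode
        except OSError as exc:
            # Readable hint instead of a traceback.
            print(
                f"relaunch failed: {exc}; open a new terminal and check PATH",
                file=sys.stderr,
            )
            return 1
    # POSIX: replace the current process. A real execvp never returns.
    execvp(bin_path, argv)
    return 0  # only reached when execvp is stubbed out
```

Passing `run` and `execvp` as parameters is a choice made for this illustration only; the tests above patch the module attributes instead.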
+28 -10
@@ -323,15 +323,15 @@ def test_cmd_update_retries_optional_extras_individually_when_all_fails(monkeypa
return SimpleNamespace(stdout="main\n", stderr="", returncode=0)
if cmd == ["git", "rev-list", "HEAD..origin/main", "--count"]:
return SimpleNamespace(stdout="1\n", stderr="", returncode=0)
-if cmd == ["git", "pull", "origin", "main"]:
+if cmd == ["git", "pull", "--ff-only", "origin", "main"]:
return SimpleNamespace(stdout="Updating\n", stderr="", returncode=0)
-if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[all]", "--quiet"]:
+if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[all]"]:
raise CalledProcessError(returncode=1, cmd=cmd)
-if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".", "--quiet"]:
+if cmd == ["/usr/bin/uv", "pip", "install", "-e", "."]:
return SimpleNamespace(returncode=0)
-if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[matrix]", "--quiet"]:
+if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[matrix]"]:
raise CalledProcessError(returncode=1, cmd=cmd)
-if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[mcp]", "--quiet"]:
+if cmd == ["/usr/bin/uv", "pip", "install", "-e", ".[mcp]"]:
return SimpleNamespace(returncode=0)
# Catch-all must include stdout/stderr so consumers that parse
# output (e.g. the dashboard-restart `ps -A` scan added in the
@@ -344,10 +344,10 @@ def test_cmd_update_retries_optional_extras_individually_when_all_fails(monkeypa
install_cmds = [c for c in recorded if "pip" in c and "install" in c]
assert install_cmds == [
-["/usr/bin/uv", "pip", "install", "-e", ".[all]", "--quiet"],
-["/usr/bin/uv", "pip", "install", "-e", ".", "--quiet"],
-["/usr/bin/uv", "pip", "install", "-e", ".[matrix]", "--quiet"],
-["/usr/bin/uv", "pip", "install", "-e", ".[mcp]", "--quiet"],
+["/usr/bin/uv", "pip", "install", "-e", ".[all]"],
+["/usr/bin/uv", "pip", "install", "-e", "."],
+["/usr/bin/uv", "pip", "install", "-e", ".[matrix]"],
+["/usr/bin/uv", "pip", "install", "-e", ".[mcp]"],
]
out = capsys.readouterr().out
@@ -371,7 +371,7 @@ def test_cmd_update_succeeds_with_extras(monkeypatch, tmp_path):
return SimpleNamespace(stdout="main\n", stderr="", returncode=0)
if cmd == ["git", "rev-list", "HEAD..origin/main", "--count"]:
return SimpleNamespace(stdout="1\n", stderr="", returncode=0)
-if cmd == ["git", "pull", "origin", "main"]:
+if cmd == ["git", "pull", "--ff-only", "origin", "main"]:
return SimpleNamespace(stdout="Updating\n", stderr="", returncode=0)
return SimpleNamespace(returncode=0, stdout="", stderr="")
@@ -384,6 +384,24 @@ def test_cmd_update_succeeds_with_extras(monkeypatch, tmp_path):
assert ".[all]" in install_cmds[0]
def test_install_heartbeat_prints_when_dependency_install_is_silent(monkeypatch, capsys):
"""Long quiet installs should emit periodic heartbeat lines."""
def fake_run(cmd, **kwargs):
hermes_main._time.sleep(1.2)
return SimpleNamespace(returncode=0)
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
hermes_main._run_install_with_heartbeat(
["uv", "pip", "install", "-e", "."],
heartbeat_interval_seconds=1,
)
out = capsys.readouterr().out
assert "still installing dependencies" in out
# ---------------------------------------------------------------------------
# ff-only fallback to reset --hard on diverged history
# ---------------------------------------------------------------------------
@@ -1582,3 +1582,104 @@ def test_board_exposes_diagnostics_list_and_summary(client):
assert task_dict["warnings"] is not None
assert task_dict["warnings"]["highest_severity"] == "error"
assert task_dict["diagnostics"][0]["kind"] == "repeated_crashes"
# ---------------------------------------------------------------------------
# POST /tasks/:id/specify — triage specifier endpoint
# ---------------------------------------------------------------------------
def _patch_specifier_response(monkeypatch, *, content, model="test-model"):
"""Helper: install a fake auxiliary client so the specifier endpoint
can run without hitting any real provider."""
from unittest.mock import MagicMock
resp = MagicMock()
resp.choices = [MagicMock()]
resp.choices[0].message.content = content
fake_client = MagicMock()
fake_client.chat.completions.create = MagicMock(return_value=resp)
monkeypatch.setattr(
"agent.auxiliary_client.get_text_auxiliary_client",
lambda *a, **kw: (fake_client, model),
)
return fake_client
def test_specify_happy_path(client, monkeypatch):
import json as jsonlib
# Create a triage task.
t = client.post(
"/api/plugins/kanban/tasks",
json={"title": "one-liner", "triage": True},
).json()["task"]
assert t["status"] == "triage"
_patch_specifier_response(
monkeypatch,
content=jsonlib.dumps(
{"title": "Polished", "body": "**Goal**\nDo the thing."}
),
)
r = client.post(
f"/api/plugins/kanban/tasks/{t['id']}/specify",
json={"author": "ui-tester"},
)
assert r.status_code == 200
body = r.json()
assert body["ok"] is True
assert body["task_id"] == t["id"]
assert body["new_title"] == "Polished"
# Task should have moved off the triage column.
detail = client.get(f"/api/plugins/kanban/tasks/{t['id']}").json()["task"]
assert detail["status"] in {"todo", "ready"}
assert detail["title"] == "Polished"
assert "**Goal**" in (detail["body"] or "")
def test_specify_non_triage_returns_ok_false_not_http_error(client, monkeypatch):
"""The endpoint intentionally returns ``{ok: false, reason: ...}`` for
"task not in triage" rather than a 4xx; the dashboard renders the
reason inline so the user can fix it without a page reload."""
# Create a normal (ready) task — not in triage.
t = client.post("/api/plugins/kanban/tasks", json={"title": "x"}).json()["task"]
_patch_specifier_response(monkeypatch, content="unused")
r = client.post(
f"/api/plugins/kanban/tasks/{t['id']}/specify",
json={},
)
assert r.status_code == 200
body = r.json()
assert body["ok"] is False
assert "not in triage" in body["reason"]
def test_specify_no_aux_client_surfaces_reason(client, monkeypatch):
t = client.post(
"/api/plugins/kanban/tasks",
json={"title": "rough", "triage": True},
).json()["task"]
# Simulate "no auxiliary client configured".
monkeypatch.setattr(
"agent.auxiliary_client.get_text_auxiliary_client",
lambda *a, **kw: (None, ""),
)
r = client.post(
f"/api/plugins/kanban/tasks/{t['id']}/specify",
json={},
)
assert r.status_code == 200
body = r.json()
assert body["ok"] is False
assert "auxiliary client" in body["reason"]
# Task must stay in triage — nothing was touched.
detail = client.get(f"/api/plugins/kanban/tasks/{t['id']}").json()["task"]
assert detail["status"] == "triage"
+21 -2
@@ -3729,8 +3729,8 @@ class TestMaxTokensParam:
assert result == {"max_completion_tokens": 4096}
-class TestAzureOpenAIRouting:
-"""Verify Azure OpenAI endpoints stay on chat_completions for gpt-5.x."""
+class TestGpt5ApiModeRouting:
+"""Verify provider-specific GPT-5 API-mode routing."""
def test_azure_gpt5_stays_on_chat_completions(self, agent):
"""Azure serves gpt-5.x on /chat/completions — must not upgrade to codex_responses."""
@@ -3769,6 +3769,25 @@ class TestAzureOpenAIRouting:
agent.api_mode = "codex_responses"
assert agent.api_mode == "codex_responses"
def test_nous_gpt5_stays_on_chat_completions(self, agent):
"""Nous serves gpt-5.x on /chat/completions — must not upgrade to codex_responses."""
agent.provider = "nous"
agent.base_url = "https://inference-api.nousresearch.com/v1"
agent.api_mode = "chat_completions"
agent.model = "openai/gpt-5.5"
if (
agent.api_mode == "chat_completions"
and not agent._is_azure_openai_url()
and (
agent._is_direct_openai_url()
or agent._provider_model_requires_responses_api(
agent.model, provider=agent.provider,
)
)
):
agent.api_mode = "codex_responses"
assert agent.api_mode == "chat_completions"
def test_is_azure_openai_url_detection(self, agent):
assert agent._is_azure_openai_url("https://foo.openai.azure.com/openai/v1") is True
assert agent._is_azure_openai_url("https://api.openai.com/v1") is False
+297
@@ -0,0 +1,297 @@
"""Tests for hermes_bootstrap — Windows UTF-8 stdio shim.
The bootstrap module is imported at the top of every Hermes entry point
(hermes, hermes-agent, hermes-acp, gateway, batch_runner, cli.py). It
fixes Python's Windows UTF-8 defaults so print("café") doesn't crash and
subprocess children inherit UTF-8 mode.
Key invariants covered by these tests:
1. Windows: env vars get set, stdio reconfigured, non-ASCII print works
2. POSIX: complete no-op (we don't touch LANG/LC_* or anything else)
3. Idempotent: safe to call multiple times
4. Respects user opt-out: if the user explicitly sets PYTHONUTF8=0 or
PYTHONIOENCODING=something-else, we leave those alone
5. Load order: every Hermes entry point imports hermes_bootstrap as its
first non-docstring import (before anything that might do file I/O
or print to stdout)
"""
from __future__ import annotations
import io
import os
import subprocess
import sys
import textwrap
import unittest.mock as mock
import pytest
# Import the module under test via an import-time side-effect check path.
# We need to be able to reset its state between tests, so we import it
# fresh in each test that manipulates _IS_WINDOWS.
def _fresh_import():
"""Return a freshly-imported hermes_bootstrap module.
Drops any cached copy from sys.modules first so module-level code
runs again and the platform check re-evaluates.
"""
sys.modules.pop("hermes_bootstrap", None)
import hermes_bootstrap # noqa: WPS433
return hermes_bootstrap
class TestWindowsBehavior:
"""Windows: the bootstrap does its job."""
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_env_vars_set_on_windows(self, monkeypatch):
# Clear any pre-existing values and re-run bootstrap.
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
hb = _fresh_import()
# Module-level apply_windows_utf8_bootstrap() ran during import.
assert os.environ.get("PYTHONUTF8") == "1"
assert os.environ.get("PYTHONIOENCODING") == "utf-8"
assert hb._bootstrap_applied is True
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_stdout_reconfigured_to_utf8_on_windows(self):
# The live process's stdout should now be UTF-8 (the Hermes CLI
# runs on Windows with a pytest console that's cp1252 by default).
# If reconfigure succeeded, sys.stdout.encoding is 'utf-8'.
_fresh_import()
# pytest may capture stdout, which makes encoding check flaky —
# so instead verify the reconfigure call succeeded on the real
# stream by attempting the failure case.
out = sys.stdout
reconfigure = getattr(out, "reconfigure", None)
if reconfigure is None:
pytest.skip("pytest replaced sys.stdout with a non-reconfigurable stream")
# After bootstrap, encoding should be utf-8 (or the reconfigure
# skipped because pytest's capture already set it to utf-8).
assert out.encoding.lower() in {"utf-8", "utf8"}, (
f"stdout encoding is {out.encoding!r} — bootstrap should have "
"reconfigured it to UTF-8"
)
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_child_process_inherits_utf8_mode(self):
"""A subprocess spawned from this process should inherit
PYTHONUTF8=1 and be able to print non-ASCII to stdout."""
_fresh_import()
# Non-ASCII chars that would crash under cp1252: arrow, emoji.
script = textwrap.dedent("""
import sys
print("em-dash \\u2014 arrow \\u2192 emoji \\U0001f680")
sys.exit(0)
""").strip()
# Don't pass env= — let the child inherit os.environ, which
# now contains PYTHONUTF8=1 courtesy of the bootstrap.
result = subprocess.run(
[sys.executable, "-c", script],
capture_output=True,
timeout=15,
)
assert result.returncode == 0, (
f"Child crashed printing non-ASCII despite UTF-8 bootstrap:\n"
f" stdout: {result.stdout!r}\n"
f" stderr: {result.stderr!r}"
)
decoded = result.stdout.decode("utf-8")
assert "\u2014" in decoded
assert "\u2192" in decoded
assert "\U0001f680" in decoded
class TestUserOptOut:
"""If the user has explicitly set PYTHONUTF8 / PYTHONIOENCODING in
their environment, we respect that (setdefault, not overwrite)."""
@pytest.mark.skipif(
sys.platform != "win32",
reason="Only meaningful on Windows where we'd otherwise set these",
)
def test_user_pythonutf8_zero_preserved(self, monkeypatch):
monkeypatch.setenv("PYTHONUTF8", "0")
_fresh_import()
assert os.environ["PYTHONUTF8"] == "0", (
"bootstrap must not overwrite an explicit user setting"
)
@pytest.mark.skipif(
sys.platform != "win32",
reason="Only meaningful on Windows where we'd otherwise set these",
)
def test_user_pythonioencoding_preserved(self, monkeypatch):
monkeypatch.setenv("PYTHONIOENCODING", "latin-1")
_fresh_import()
assert os.environ["PYTHONIOENCODING"] == "latin-1"
class TestPosixNoOp:
"""POSIX: zero behavior change. We don't touch LANG, LC_*, or any
stdio. The goal is that Linux/macOS behave identically before and
after this module is imported."""
def test_noop_on_fake_posix(self, monkeypatch):
"""Even when imported, the bootstrap function must return False
and leave env untouched when _IS_WINDOWS is False."""
hb = _fresh_import()
# Reset + fake POSIX
hb._IS_WINDOWS = False
hb._bootstrap_applied = False
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
result = hb.apply_windows_utf8_bootstrap()
assert result is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
assert hb._bootstrap_applied is False
@pytest.mark.skipif(
sys.platform == "win32",
reason="Real POSIX required for this check",
)
def test_real_posix_bootstrap_is_noop(self, monkeypatch):
"""On actual Linux/macOS, importing the module must not set
PYTHONUTF8 or reconfigure stdio."""
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
hb = _fresh_import()
assert hb._bootstrap_applied is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
class TestIdempotence:
"""Calling apply_windows_utf8_bootstrap() multiple times must be safe."""
def test_second_call_returns_false(self):
hb = _fresh_import()
# First call already happened at import time.
result = hb.apply_windows_utf8_bootstrap()
assert result is False, (
"Second call should return False (idempotent no-op)"
)
def test_no_exceptions_on_repeated_calls(self):
hb = _fresh_import()
for _ in range(5):
hb.apply_windows_utf8_bootstrap()
class TestStdioReconfigureErrorHandling:
"""If sys.stdout/stderr/stdin have been replaced with streams that
don't support reconfigure (e.g. by a test harness), the bootstrap
must degrade gracefully rather than crash."""
def test_non_reconfigurable_stream_does_not_crash(self, monkeypatch):
"""Replace sys.stdout with a BytesIO (no reconfigure method),
then run the bootstrap and make sure it doesn't raise."""
hb = _fresh_import()
hb._IS_WINDOWS = True
hb._bootstrap_applied = False
fake = io.BytesIO() # no .reconfigure attribute
monkeypatch.setattr(sys, "stdout", fake)
try:
# Must not raise.
hb.apply_windows_utf8_bootstrap()
except Exception as exc:
pytest.fail(f"bootstrap raised on non-reconfigurable stdout: {exc}")
def test_reconfigure_oserror_is_caught(self, monkeypatch):
"""If reconfigure() itself raises (closed stream, etc.), swallow
the error; the env-var half of the fix still applies."""
hb = _fresh_import()
hb._IS_WINDOWS = True
hb._bootstrap_applied = False
class _BrokenStream:
encoding = "utf-8"
def reconfigure(self, **kwargs):
raise OSError("simulated: stream already closed")
monkeypatch.setattr(sys, "stdout", _BrokenStream())
monkeypatch.setattr(sys, "stderr", _BrokenStream())
# Must not raise.
hb.apply_windows_utf8_bootstrap()
class TestEntryPointsImportBootstrap:
"""Every Hermes entry point must import hermes_bootstrap as its
first non-docstring import. We check this by scanning source files
rather than invoking the entry points (which would require a full
agent context)."""
# Entry points that invoke Hermes as a process. Each one must
# import hermes_bootstrap before doing any file I/O or stdout writes.
ENTRY_POINTS = [
"hermes_cli/main.py", # hermes CLI (console_script)
"run_agent.py", # hermes-agent (console_script)
"acp_adapter/entry.py", # hermes-acp (console_script)
"gateway/run.py", # gateway
"batch_runner.py", # batch mode
"cli.py", # legacy direct-launch CLI
]
@pytest.mark.parametrize("path", ENTRY_POINTS)
def test_entry_point_imports_bootstrap(self, path):
"""The file must contain 'import hermes_bootstrap' and that
line must appear before the first 'import' of anything else.
We're lenient about the docstring (can be arbitrarily long) and
about comment lines; we just need to verify the first import
statement is the bootstrap.
"""
# Resolve relative to the hermes-agent repo root. Tests live
# at tests/test_hermes_bootstrap.py, so go up one dir.
import pathlib
here = pathlib.Path(__file__).resolve()
repo_root = here.parent.parent # tests/ -> repo root
full_path = repo_root / path
assert full_path.exists(), f"entry point missing: {full_path}"
source = full_path.read_text(encoding="utf-8")
# Find the first top-level import statement (``import x`` or
# ``from x import y``) via the AST. It must be 'import hermes_bootstrap'.
import ast
tree = ast.parse(source)
first_import_node = None
for node in ast.iter_child_nodes(tree):
if isinstance(node, (ast.Import, ast.ImportFrom)):
first_import_node = node
break
assert first_import_node is not None, (
f"{path}: no top-level imports found at all"
)
if isinstance(first_import_node, ast.Import):
first_import_name = first_import_node.names[0].name
else: # ImportFrom
first_import_name = first_import_node.module or ""
assert first_import_name == "hermes_bootstrap", (
f"{path}: first top-level import is {first_import_name!r}, "
f"but it must be 'hermes_bootstrap' so UTF-8 stdio is "
f"configured before anything else initializes. Move the "
f"'import hermes_bootstrap' line to be the first import."
)
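The AST scan used by the test above can be tried in isolation. The sketch below repeats the same idea on an in-memory string; the function name `first_top_level_import` and the `SAMPLE` source are made up for the example.

```python
import ast
from typing import Optional


def first_top_level_import(source: str) -> Optional[str]:
    """Return the module named by the first top-level import, or None."""
    tree = ast.parse(source)
    for node in ast.iter_child_nodes(tree):
        if isinstance(node, ast.Import):
            return node.names[0].name
        if isinstance(node, ast.ImportFrom):
            return node.module or ""
    return None


# Docstrings and comments that mention "import" are ignored by the AST walk.
SAMPLE = '''"""Docstring that mentions import os."""
# import sys  (comment only)
import hermes_bootstrap
import os
'''

print(first_top_level_import(SAMPLE))
```

One consequence of this approach worth keeping in mind: a file that begins with `from __future__ import annotations` would report `__future__` as its first import, so under this check an entry point cannot put a `__future__` import ahead of the bootstrap.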
+61 -1
@@ -7,7 +7,12 @@ from unittest.mock import patch
import pytest
import hermes_constants
from hermes_constants import get_default_hermes_root, is_container
from hermes_constants import (
VALID_REASONING_EFFORTS,
get_default_hermes_root,
is_container,
parse_reasoning_effort,
)
class TestGetDefaultHermesRoot:
@@ -17,6 +22,7 @@ class TestGetDefaultHermesRoot:
"""When HERMES_HOME is not set, returns ~/.hermes."""
monkeypatch.delenv("HERMES_HOME", raising=False)
monkeypatch.setattr(Path, "home", lambda: tmp_path)
assert get_default_hermes_root() == tmp_path / ".hermes"
def test_hermes_home_is_native(self, tmp_path, monkeypatch):
@@ -111,3 +117,57 @@ class TestIsContainer:
# Even if we make os.path.exists return False, cached value wins
monkeypatch.setattr(os.path, "exists", lambda p: False)
assert is_container() is True
class TestParseReasoningEffort:
"""Tests for parse_reasoning_effort() — string → reasoning config dict."""
@pytest.mark.parametrize("value", ["", " ", "\t", "\n"])
def test_empty_or_whitespace_returns_none(self, value):
"""Empty / whitespace-only input falls back to caller default (None)."""
assert parse_reasoning_effort(value) is None
def test_none_disables_reasoning(self):
"""The literal "none" disables reasoning explicitly."""
assert parse_reasoning_effort("none") == {"enabled": False}
@pytest.mark.parametrize("level", list(VALID_REASONING_EFFORTS))
def test_each_valid_level(self, level):
"""Every level listed in VALID_REASONING_EFFORTS is accepted as-is."""
assert parse_reasoning_effort(level) == {"enabled": True, "effort": level}
@pytest.mark.parametrize(
"raw, expected_effort",
[
("MEDIUM", "medium"),
("High", "high"),
(" low ", "low"),
("\tXHIGH\n", "xhigh"),
("None", False),
],
)
def test_case_and_whitespace_normalized(self, raw, expected_effort):
"""Mixed case and surrounding whitespace are normalized before lookup."""
result = parse_reasoning_effort(raw)
if expected_effort is False:
assert result == {"enabled": False}
else:
assert result == {"enabled": True, "effort": expected_effort}
@pytest.mark.parametrize(
"value",
["bogus", "very-high", "max", "0", "off", "true", "default"],
)
def test_unknown_levels_return_none(self, value):
"""Unrecognized strings fall back to the caller default (None)."""
assert parse_reasoning_effort(value) is None
def test_known_supported_levels_are_documented(self):
"""Guard against silently dropping a documented level.
The docstring promises "minimal", "low", "medium", "high", "xhigh".
If someone removes one from VALID_REASONING_EFFORTS without updating
the docstring, this test will fail and force the call out.
"""
documented = {"minimal", "low", "medium", "high", "xhigh"}
assert documented.issubset(set(VALID_REASONING_EFFORTS))
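For orientation, a parser that would satisfy the tests above could look like the sketch below. It is not the code in `hermes_constants`: the function name `parse_reasoning_effort_sketch` and the exact contents and type of `VALID_REASONING_EFFORTS` are assumptions inferred from the assertions.

```python
from typing import Optional

# Assumed to match the documented levels; the real constant may differ.
VALID_REASONING_EFFORTS = ("minimal", "low", "medium", "high", "xhigh")


def parse_reasoning_effort_sketch(value: str) -> Optional[dict]:
    """Map a user-supplied string to a reasoning config dict.

    Returns None for empty or unrecognized input so the caller can
    fall back to its own default.
    """
    normalized = value.strip().lower()
    if not normalized:
        return None
    if normalized == "none":
        return {"enabled": False}
    if normalized in VALID_REASONING_EFFORTS:
        return {"enabled": True, "effort": normalized}
    return None


print(parse_reasoning_effort_sketch("\tXHIGH\n"))
print(parse_reasoning_effort_sketch("None"))
print(parse_reasoning_effort_sketch("bogus"))
```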
@@ -0,0 +1,22 @@
"""Regression tests for Termux network prerequisite handling in install.sh."""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
INSTALL_SH = REPO_ROOT / "scripts" / "install.sh"
def test_termux_pkg_list_includes_network_basics() -> None:
text = INSTALL_SH.read_text()
assert "local termux_pkgs=(clang rust make pkg-config libffi openssl ca-certificates curl)" in text
def test_install_script_has_connectivity_probe_and_termux_guidance() -> None:
text = INSTALL_SH.read_text()
assert "check_network_prerequisites()" in text
assert "https://pypi.org/simple/" in text
assert "https://duckduckgo.com/" in text
assert "termux-change-repo" in text
assert "pkg install -y ca-certificates curl && pkg update" in text
assert "check_network_prerequisites" in text
+115
@@ -0,0 +1,115 @@
"""Tests for ruff lint config — guards against accidental rule removal.
PLW1514 (unspecified-encoding) was enabled after a debug session on
Windows turned up three separate UTF-8 regressions in execute_code.
The rule catches bare ``open()`` / ``read_text()`` / ``write_text()``
calls that default to the locale encoding (cp1252 on Windows), which
silently corrupts non-ASCII content.
These tests ensure:
1. PLW1514 stays in ``[tool.ruff.lint.select]``
2. The CI workflow's blocking step still invokes ``ruff check .``
3. pyproject.toml has ``preview = true`` (required: PLW1514 is a
preview rule in ruff 0.15.x)
If someone removes any of these, CI stops enforcing UTF-8-explicit
opens and we're back to the original Windows-regression trap.
"""
from __future__ import annotations
import pathlib
import pytest
try:
import tomllib # Python 3.11+
except ImportError: # pragma: no cover — 3.10 and earlier
import tomli as tomllib # type: ignore
REPO_ROOT = pathlib.Path(__file__).resolve().parent.parent
def _load_pyproject() -> dict:
with open(REPO_ROOT / "pyproject.toml", "rb") as fh:
return tomllib.load(fh)
class TestRuffConfig:
def test_plw1514_is_in_select_list(self):
"""pyproject.toml must keep PLW1514 in [tool.ruff.lint.select]."""
cfg = _load_pyproject()
selected = (
cfg.get("tool", {})
.get("ruff", {})
.get("lint", {})
.get("select", [])
)
assert "PLW1514" in selected, (
"PLW1514 (unspecified-encoding) was removed from "
"[tool.ruff.lint.select]. This rule blocks bare open() calls "
"that default to locale encoding on Windows — removing it "
"re-opens a class of UTF-8 bugs we already paid to close. "
"If you genuinely want to remove it, delete this test in the "
"same commit so the intent is deliberate."
)
def test_preview_mode_enabled(self):
"""PLW1514 is a preview rule in ruff 0.15.x — preview=true is
required for it to actually run."""
cfg = _load_pyproject()
ruff_cfg = cfg.get("tool", {}).get("ruff", {})
assert ruff_cfg.get("preview") is True, (
"[tool.ruff] preview=true is required — PLW1514 is a preview "
"rule and silently becomes a no-op without it. If this ever "
"becomes a stable rule, you can drop preview=true but must "
"verify PLW1514 still fires in a sample test run first."
)
class TestLintWorkflow:
WORKFLOW_PATH = REPO_ROOT / ".github" / "workflows" / "lint.yml"
def test_workflow_exists(self):
assert self.WORKFLOW_PATH.exists(), (
f"CI workflow missing: {self.WORKFLOW_PATH}"
)
def test_workflow_has_blocking_ruff_step(self):
"""The workflow must run a blocking ``ruff check .`` step
(one without --exit-zero) so violations fail the job."""
content = self.WORKFLOW_PATH.read_text(encoding="utf-8")
# Look for the blocking step's named line + its command. We want
# at least one ``ruff check .`` that does NOT have ``--exit-zero``
# nearby.
# Split into lines and find ruff check invocations
lines = content.splitlines()
found_blocking = False
for i, line in enumerate(lines):
stripped = line.strip()
if stripped.startswith("ruff check") and "--exit-zero" not in stripped:
# Also check it's not piped to `|| true` which would mask
# the exit code.
window = " ".join(lines[i:i + 3])
if "|| true" not in window:
found_blocking = True
break
assert found_blocking, (
"lint.yml no longer contains a blocking ``ruff check .`` step "
"(one without --exit-zero and not masked by || true). "
"Restore it — the PLW1514 rule is only useful if CI actually "
"fails on violation."
)
def test_workflow_yaml_is_valid(self):
"""Workflow file must parse as valid YAML (can't ship a broken
CI config to main)."""
import yaml
content = self.WORKFLOW_PATH.read_text(encoding="utf-8")
try:
parsed = yaml.safe_load(content)
except yaml.YAMLError as exc:
pytest.fail(f"lint.yml is not valid YAML: {exc}")
assert isinstance(parsed, dict)
assert "jobs" in parsed
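The line scan in `test_workflow_has_blocking_ruff_step` can be exercised against small YAML snippets. The sketch below lifts the same logic into a standalone function; the name `has_blocking_ruff_step` and the sample workflow fragments are invented for the example.

```python
def has_blocking_ruff_step(workflow_text: str) -> bool:
    """Return True if some ``ruff check`` line can actually fail the job."""
    lines = workflow_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("ruff check") and "--exit-zero" not in stripped:
            # Look a couple of lines ahead for a masking ``|| true``.
            window = " ".join(lines[i:i + 3])
            if "|| true" not in window:
                return True
    return False


BLOCKING = "steps:\n  - name: Lint\n    run: |\n      ruff check .\n"
ADVISORY = "steps:\n  - name: Lint\n    run: |\n      ruff check . --exit-zero\n"
MASKED = "steps:\n  - name: Lint\n    run: |\n      ruff check . || true\n"
INLINE = "steps:\n  - name: Lint\n    run: ruff check .\n"

for name, text in [("blocking", BLOCKING), ("advisory", ADVISORY),
                   ("masked", MASKED), ("inline", INLINE)]:
    print(name, has_blocking_ruff_step(text))
```

Note that the scan only recognizes commands that sit on their own line, as in a `run: |` block. A single-line `run: ruff check .` starts with `run:` after stripping and is therefore not counted, which may or may not be intended.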
+36 -9
@@ -828,18 +828,45 @@ class TestE2EChannelsList:
assert result["channels"][0]["target"] == "slack:C1234"
def test_channels_with_directory(self, mcp_server_e2e, _event_loop, monkeypatch):
"""Populated channel_directory.json should be unwrapped via the 'platforms' key.
Regression test for issue #21474: the writer wraps platforms under
{"updated_at": ..., "platforms": {...}} but the reader was iterating
directory.items() directly, so channels_list always returned 0.
"""
import mcp_serve
monkeypatch.setattr(mcp_serve, "_load_channel_directory", lambda: {
"telegram": [
{"id": "123456", "name": "Alice", "type": "dm"},
{"id": "-100999", "name": "Dev Group", "type": "group"},
],
"updated_at": "2026-05-07T12:00:00",
"platforms": {
"telegram": [
{"id": "123456", "name": "Alice", "type": "dm"},
{"id": "-100999", "name": "Dev Group", "type": "group"},
],
"discord": [
{"id": "789", "name": "general", "type": "text"},
],
},
})
# Need to recreate server to pick up the new mock
server, bridge = mcp_server_e2e
# The tool closure already captured the old mock, so test the function directly
directory = mcp_serve._load_channel_directory()
assert len(directory["telegram"]) == 2
server, _ = mcp_server_e2e
result = _run_tool(server, "channels_list")
assert result["count"] == 3
targets = {c["target"] for c in result["channels"]}
assert targets == {"telegram:123456", "telegram:-100999", "discord:789"}
def test_channels_with_directory_platform_filter(self, mcp_server_e2e, _event_loop, monkeypatch):
"""Platform filter should work against the wrapped 'platforms' payload."""
import mcp_serve
monkeypatch.setattr(mcp_serve, "_load_channel_directory", lambda: {
"updated_at": "2026-05-07T12:00:00",
"platforms": {
"telegram": [{"id": "123456", "name": "Alice", "type": "dm"}],
"discord": [{"id": "789", "name": "general", "type": "text"}],
},
})
server, _ = mcp_server_e2e
result = _run_tool(server, "channels_list", {"platform": "discord"})
assert result["count"] == 1
assert result["channels"][0]["target"] == "discord:789"
class TestE2EPermissions:
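The reader-side fix these regression tests describe amounts to iterating the `platforms` mapping rather than the top-level dict. A minimal sketch, assuming a hypothetical helper `list_channels` (the real tool in `mcp_serve` is not shown here and may return more fields):

```python
from typing import Optional


def list_channels(directory: dict, platform: Optional[str] = None) -> dict:
    """Flatten a wrapped channel directory into ``platform:id`` targets."""
    # The writer wraps entries as {"updated_at": ..., "platforms": {...}}.
    platforms = directory.get("platforms", {})
    channels = []
    for name, entries in platforms.items():
        if platform is not None and name != platform:
            continue
        for entry in entries:
            channels.append({
                "target": f"{name}:{entry['id']}",
                "name": entry["name"],
                "type": entry["type"],
            })
    return {"count": len(channels), "channels": channels}


DIRECTORY = {
    "updated_at": "2026-05-07T12:00:00",
    "platforms": {
        "telegram": [
            {"id": "123456", "name": "Alice", "type": "dm"},
            {"id": "-100999", "name": "Dev Group", "type": "group"},
        ],
        "discord": [{"id": "789", "name": "general", "type": "text"}],
    },
}

print(list_channels(DIRECTORY)["count"])
print(list_channels(DIRECTORY, platform="discord")["channels"])
```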
+23
@@ -0,0 +1,23 @@
"""Regression coverage for the Termux broad install profile."""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
PYPROJECT = REPO_ROOT / "pyproject.toml"
INSTALL_SH = REPO_ROOT / "scripts" / "install.sh"
def test_pyproject_defines_termux_all_without_known_blockers() -> None:
text = PYPROJECT.read_text()
assert "termux-all = [" in text
assert '"hermes-agent[termux]"' in text
assert '"hermes-agent[matrix]"' not in text.split("termux-all = [", 1)[1].split("]", 1)[0]
assert '"hermes-agent[voice]"' not in text.split("termux-all = [", 1)[1].split("]", 1)[0]
def test_install_script_prefers_termux_all_then_fallbacks() -> None:
text = INSTALL_SH.read_text()
assert "pip install -e '.[termux-all]' -c constraints-termux.txt" in text
assert "Termux broad profile (.[termux-all]) failed, trying baseline Termux profile..." in text
assert "Termux baseline profile (.[termux]) failed, trying base install..." in text
+9 -1
@@ -340,7 +340,15 @@ class TestRunBrowserCommandPathConstruction:
_run_browser_command("test-task", "navigate", ["https://example.com"])
assert captured_cmd is not None
assert captured_cmd[:2] == ["npx", "agent-browser"]
# The prefix must split "npx agent-browser" into two argv items.
# On POSIX shutil.which("npx") returns the absolute path if npx is on
# PATH (which the test's patched PATH always contains when the system
# has it installed). The important invariant is that the second
# argv item is the package name "agent-browser", not a merged
# "npx agent-browser" string — that's what Popen needs.
assert len(captured_cmd) >= 2
assert captured_cmd[0].endswith("npx") or captured_cmd[0] == "npx"
assert captured_cmd[1] == "agent-browser"
assert captured_cmd[2:6] == [
"--session",
"test-session",
+8 -2
@@ -774,11 +774,17 @@ class TestEnvVarFiltering(unittest.TestCase):
class TestExecuteCodeEdgeCases(unittest.TestCase):
def test_windows_returns_error(self):
"""On Windows (or when SANDBOX_AVAILABLE is False), returns error JSON."""
"""When SANDBOX_AVAILABLE is False (e.g. when the backend deems
the sandbox unusable for this environment), execute_code returns
an error JSON with a readable message pointing the caller at
regular tool calls. Previously this was a Windows-only gate;
execute_code now works on Windows via loopback TCP, so the
error is only emitted when SANDBOX_AVAILABLE is explicitly
flipped off (e.g. for future platform-specific disables)."""
with patch("tools.code_execution_tool.SANDBOX_AVAILABLE", False):
result = json.loads(execute_code("print('hi')", task_id="test"))
self.assertIn("error", result)
self.assertIn("Windows", result["error"])
self.assertIn("unavailable", result["error"].lower())
def test_whitespace_only_code(self):
result = json.loads(execute_code(" \n\t ", task_id="test"))
+30 -2
@@ -131,6 +131,12 @@ class TestResolveChildPython(unittest.TestCase):
def test_project_with_virtualenv_picks_venv_python(self):
"""Project mode + VIRTUAL_ENV pointing at a real venv → that python."""
if sys.platform == "win32":
pytest.skip(
"Creates symlinks and assumes POSIX venv layout (bin/python). "
"Windows venvs use Scripts/python.exe and symlink creation "
"requires elevated privileges (WinError 1314)."
)
import tempfile, pathlib
with tempfile.TemporaryDirectory() as td:
fake_venv = pathlib.Path(td)
@@ -154,6 +160,12 @@ class TestResolveChildPython(unittest.TestCase):
def test_project_prefers_virtualenv_over_conda(self):
"""If both VIRTUAL_ENV and CONDA_PREFIX are set, VIRTUAL_ENV wins."""
if sys.platform == "win32":
pytest.skip(
"Creates symlinks and assumes POSIX venv layout (bin/python). "
"Windows venvs use Scripts/python.exe and symlink creation "
"requires elevated privileges (WinError 1314)."
)
import tempfile, pathlib
with tempfile.TemporaryDirectory() as ve_td, tempfile.TemporaryDirectory() as conda_td:
ve = pathlib.Path(ve_td)
@@ -257,7 +269,15 @@ class TestModeAwareSchema(unittest.TestCase):
# Integration: what actually happens when execute_code runs per mode
# ---------------------------------------------------------------------------
@pytest.mark.skipif(sys.platform == "win32", reason="execute_code is POSIX-only")
@pytest.mark.skipif(
sys.platform == "win32",
reason=(
"Assumes POSIX venv layout (bin/python) and symlink creation "
"privileges. execute_code itself works on Windows — these "
"integration tests just haven't been ported to the Scripts/"
"python.exe layout yet."
),
)
class TestExecuteCodeModeIntegration(unittest.TestCase):
"""End-to-end: verify the subprocess actually runs where we expect."""
@@ -351,7 +371,15 @@ class TestExecuteCodeModeIntegration(unittest.TestCase):
# changes CWD + interpreter, not the security posture.
# ---------------------------------------------------------------------------
@pytest.mark.skipif(sys.platform == "win32", reason="execute_code is POSIX-only")
@pytest.mark.skipif(
sys.platform == "win32",
reason=(
"Assumes POSIX venv layout (bin/python) and symlink creation "
"privileges. execute_code itself works on Windows — these "
"integration tests just haven't been ported to the Scripts/"
"python.exe layout yet."
),
)
class TestSecurityInvariantsAcrossModes(unittest.TestCase):
def _run(self, code, mode):
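The skip reasons above hinge on the difference between the POSIX and Windows virtualenv layouts. Porting the skipped tests would likely start with a helper along these lines; `venv_python` is a hypothetical name, not something defined in the repository.

```python
from pathlib import PurePath


def venv_python(venv_dir: str, is_windows: bool) -> PurePath:
    """Return the interpreter path inside a virtualenv for the given OS.

    POSIX venvs use ``bin/python``; Windows venvs use ``Scripts/python.exe``.
    """
    if is_windows:
        return PurePath(venv_dir, "Scripts", "python.exe")
    return PurePath(venv_dir, "bin", "python")


print(venv_python("venv", is_windows=False))
print(venv_python("venv", is_windows=True))
```

Passing `is_windows` explicitly instead of reading `sys.platform` keeps the helper testable on any host, the same approach the env-scrubbing tests take.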
@@ -0,0 +1,698 @@
"""Tests for execute_code env scrubbing on Windows.
On Windows the child process needs a small set of OS-essential env vars
(SYSTEMROOT, WINDIR, COMSPEC, ...) to run. Without SYSTEMROOT in particular,
``socket.socket(AF_INET, SOCK_STREAM)`` fails inside the sandbox with
WinError 10106 (Winsock can't locate mswsock.dll) and no tool call over
loopback TCP can ever succeed.
These tests cover ``_scrub_child_env`` directly so they run on every OS:
the logic is conditional on a passed-in ``is_windows`` flag, not on
the host platform. We also keep a live Winsock smoke test that only runs
on a real Windows host.
Also covers the companion Windows bug: the sandbox writes
``hermes_tools.py`` and ``script.py`` into a temp dir, and those files
must be written as UTF-8 on every platform. The generated stub contains
em-dash/en-dash characters in docstrings, and the default ``open(path, "w")``
on Windows uses the system locale (cp1252 typically), corrupting those
bytes. The child then fails to import with a SyntaxError:
``'utf-8' codec can't decode byte 0x97``.
"""
import os
import socket
import subprocess
import sys
import textwrap
import unittest.mock as mock
import pytest
from tools.code_execution_tool import (
_SAFE_ENV_PREFIXES,
_SECRET_SUBSTRINGS,
_WINDOWS_ESSENTIAL_ENV_VARS,
_scrub_child_env,
)
def _no_passthrough(_name):
return False
class TestWindowsEssentialAllowlist:
"""The allowlist itself — contents, shape, and invariants."""
def test_contains_winsock_required_vars(self):
# Without SYSTEMROOT the child cannot initialize Winsock.
assert "SYSTEMROOT" in _WINDOWS_ESSENTIAL_ENV_VARS
def test_contains_subprocess_required_vars(self):
# Without COMSPEC, subprocess can't resolve the default shell.
assert "COMSPEC" in _WINDOWS_ESSENTIAL_ENV_VARS
def test_contains_user_profile_vars(self):
# os.path.expanduser("~") on Windows uses USERPROFILE.
assert "USERPROFILE" in _WINDOWS_ESSENTIAL_ENV_VARS
assert "APPDATA" in _WINDOWS_ESSENTIAL_ENV_VARS
assert "LOCALAPPDATA" in _WINDOWS_ESSENTIAL_ENV_VARS
def test_contains_only_uppercase_names(self):
# Windows env var names are case-insensitive but we canonicalize to
# uppercase for the membership check (``k.upper() in _WINDOWS_...``).
for name in _WINDOWS_ESSENTIAL_ENV_VARS:
assert name == name.upper(), f"{name!r} should be uppercase"
def test_no_overlap_with_secret_substrings(self):
# Sanity: none of the essential OS vars should look like secrets.
# If this ever fires, we'd have a precedence ordering bug (secrets
# are blocked *before* the essentials check).
for name in _WINDOWS_ESSENTIAL_ENV_VARS:
assert not any(s in name for s in _SECRET_SUBSTRINGS), (
f"{name!r} looks secret-like — would be blocked before the "
"essentials allowlist can match"
)
class TestScrubChildEnvWindows:
"""Verify _scrub_child_env passes Windows essentials through when
is_windows=True and blocks them when is_windows=False (so POSIX hosts
don't inherit pointless Windows vars)."""
def _sample_windows_env(self):
"""A realistic subset of what os.environ looks like on Windows."""
return {
"SYSTEMROOT": r"C:\Windows",
"SystemDrive": "C:", # Windows preserves native case
"WINDIR": r"C:\Windows",
"ComSpec": r"C:\Windows\System32\cmd.exe",
"PATHEXT": ".COM;.EXE;.BAT;.CMD;.PY",
"USERPROFILE": r"C:\Users\alice",
"APPDATA": r"C:\Users\alice\AppData\Roaming",
"LOCALAPPDATA": r"C:\Users\alice\AppData\Local",
"PATH": r"C:\Windows\System32;C:\Python311",
"HOME": r"C:\Users\alice",
"TEMP": r"C:\Users\alice\AppData\Local\Temp",
# Should still be blocked:
"OPENAI_API_KEY": "sk-secret",
"GITHUB_TOKEN": "ghp_secret",
"MY_PASSWORD": "hunter2",
# Not matched by any rule — should be dropped on both OSes:
"RANDOM_UNKNOWN_VAR": "value",
}
def test_windows_essentials_passed_through_when_is_windows_true(self):
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
# Every essential var from the sample env should survive.
assert scrubbed["SYSTEMROOT"] == r"C:\Windows"
assert scrubbed["SystemDrive"] == "C:" # case preserved
assert scrubbed["WINDIR"] == r"C:\Windows"
assert scrubbed["ComSpec"] == r"C:\Windows\System32\cmd.exe"
assert scrubbed["PATHEXT"] == ".COM;.EXE;.BAT;.CMD;.PY"
assert scrubbed["USERPROFILE"] == r"C:\Users\alice"
assert scrubbed["APPDATA"].endswith("Roaming")
assert scrubbed["LOCALAPPDATA"].endswith("Local")
# Safe-prefix vars still pass (baseline behavior).
assert "PATH" in scrubbed
assert "HOME" in scrubbed
assert "TEMP" in scrubbed
def test_secrets_still_blocked_on_windows(self):
"""The Windows allowlist must NOT defeat the secret-substring block.
This is the key security invariant: essentials are allowed by
*exact name*, and the secret-substring block runs before the
essentials check anyway, so a variable named e.g. ``API_KEY`` can
never sneak through just because we added Windows support.
"""
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
assert "OPENAI_API_KEY" not in scrubbed
assert "GITHUB_TOKEN" not in scrubbed
assert "MY_PASSWORD" not in scrubbed
def test_unknown_vars_still_dropped_on_windows(self):
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
assert "RANDOM_UNKNOWN_VAR" not in scrubbed
def test_essentials_blocked_when_is_windows_false(self):
"""On POSIX hosts, Windows-specific vars should not pass — they
have no meaning and could confuse child tooling."""
env = self._sample_windows_env()
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=False)
# Safe prefixes still match (PATH, HOME, TEMP).
assert "PATH" in scrubbed
assert "HOME" in scrubbed
assert "TEMP" in scrubbed
# But Windows OS vars should be dropped.
assert "SYSTEMROOT" not in scrubbed
assert "WINDIR" not in scrubbed
assert "ComSpec" not in scrubbed
assert "APPDATA" not in scrubbed
def test_case_insensitive_essential_match(self):
"""Windows env var names are case-insensitive at the OS level but
Python preserves whatever case os.environ reported. The scrubber
must normalize to uppercase for the membership check."""
env = {
"SystemRoot": r"C:\Windows", # mixed case
"comspec": r"C:\Windows\System32\cmd.exe", # lowercase
"APPDATA": r"C:\Users\x\AppData\Roaming", # uppercase
}
scrubbed = _scrub_child_env(env,
is_passthrough=_no_passthrough,
is_windows=True)
assert "SystemRoot" in scrubbed
assert "comspec" in scrubbed
assert "APPDATA" in scrubbed
class TestScrubChildEnvPassthroughInteraction:
"""The passthrough hook runs *before* the secret block, so a skill
can legitimately forward a third-party API key. The Windows
essentials addition must not interfere with that."""
def test_passthrough_wins_over_secret_block(self):
env = {"TENOR_API_KEY": "x", "PATH": "/bin"}
scrubbed = _scrub_child_env(env,
is_passthrough=lambda k: k == "TENOR_API_KEY",
is_windows=False)
assert scrubbed.get("TENOR_API_KEY") == "x"
assert scrubbed.get("PATH") == "/bin"
def test_passthrough_still_works_on_windows(self):
env = {
"TENOR_API_KEY": "x",
"SYSTEMROOT": r"C:\Windows",
"OPENAI_API_KEY": "sk-secret", # not passthrough
}
scrubbed = _scrub_child_env(
env,
is_passthrough=lambda k: k == "TENOR_API_KEY",
is_windows=True,
)
assert scrubbed.get("TENOR_API_KEY") == "x"
assert scrubbed.get("SYSTEMROOT") == r"C:\Windows"
assert "OPENAI_API_KEY" not in scrubbed
@pytest.mark.skipif(
sys.platform != "win32",
reason="Winsock-specific regression — only meaningful on Windows",
)
class TestWindowsSocketSmokeTest:
"""Integration-ish smoke test: spawn a child Python with a scrubbed
env and confirm it can create an AF_INET socket. This is the
regression that motivated the fix: without SYSTEMROOT the child
hits WinError 10106 before any RPC is attempted."""
def test_child_can_create_socket_with_scrubbed_env(self):
scrubbed = _scrub_child_env(os.environ, is_passthrough=_no_passthrough)
# Build a tiny child script that simply opens an AF_INET socket.
script = textwrap.dedent("""
import socket, sys
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.close()
print("OK")
sys.exit(0)
except OSError as exc:
print(f"FAIL: {exc}")
sys.exit(1)
""").strip()
result = subprocess.run(
[sys.executable, "-c", script],
env=scrubbed,
capture_output=True,
text=True,
timeout=15,
)
assert result.returncode == 0, (
f"Child failed to create socket with scrubbed env:\n"
f" stdout={result.stdout!r}\n"
f" stderr={result.stderr!r}\n"
f" scrubbed keys={sorted(scrubbed.keys())}"
)
assert "OK" in result.stdout
# ---------------------------------------------------------------------------
# POSIX equivalence guard
# ---------------------------------------------------------------------------
def _legacy_posix_scrubber(source_env, is_passthrough):
"""Verbatim copy of the pre-Windows-fix inline scrubbing logic.
This is the oracle used by TestPosixEquivalence to prove the refactor
did not change POSIX behavior. DO NOT edit this to "match" a future
production change. If _scrub_child_env's POSIX behavior legitimately
needs to evolve, delete this function and adjust the equivalence test
on purpose, so the churn is visible in review.
"""
_SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
"TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
"XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
"HERMES_")
_SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
"PASSWD", "AUTH")
out = {}
for k, v in source_env.items():
if is_passthrough(k):
out[k] = v
continue
if any(s in k.upper() for s in _SECRET_SUBSTRINGS):
continue
if any(k.startswith(p) for p in _SAFE_ENV_PREFIXES):
out[k] = v
return out
class TestPosixEquivalence:
"""Lock in the invariant that _scrub_child_env(env, is_windows=False)
behaves *bit-for-bit identically* to the pre-refactor inline scrubber.
If this ever fails, it means somebody changed POSIX env-scrubbing
behavior, maybe on purpose, maybe not. Either way it should land
as a deliberate, reviewed change (update _legacy_posix_scrubber
above in the same PR).
Rationale: the Windows-essentials patch refactored the scrubber into
a helper. Linux/macOS must not regress. This class gates that.
"""
_POSIX_SYNTHETIC_ENV = {
# Safe-prefix matches
"PATH": "/usr/bin:/bin",
"HOME": "/home/alice",
"USER": "alice",
"LANG": "en_US.UTF-8",
"LC_CTYPE": "en_US.UTF-8",
"TERM": "xterm-256color",
"SHELL": "/bin/zsh",
"LOGNAME": "alice",
"TMPDIR": "/tmp",
"XDG_RUNTIME_DIR": "/run/user/1000",
"XDG_CONFIG_HOME": "/home/alice/.config",
"PYTHONPATH": "/opt/lib",
"VIRTUAL_ENV": "/home/alice/.venv",
"CONDA_PREFIX": "/opt/conda",
"HERMES_HOME": "/home/alice/.hermes",
"HERMES_INTERACTIVE": "1",
# Secret-substring blocks
"OPENAI_API_KEY": "sk-xxx",
"GITHUB_TOKEN": "ghp_xxx",
"AWS_SECRET_ACCESS_KEY": "yyy",
"MY_PASSWORD": "hunter2",
# Uncategorized — must be dropped
"RANDOM_UNKNOWN": "drop-me",
"DISPLAY": ":0",
"SSH_AUTH_SOCK": "/run/user/1000/ssh-agent",
# Passthrough candidate (also matches secret block by default)
"TENOR_API_KEY": "tenor-xxx",
}
_WINDOWS_SYNTHETIC_ENV = {
# Windows-essential names (must be dropped on POSIX, passed on Win)
"SYSTEMROOT": r"C:\Windows",
"SystemDrive": "C:",
"WINDIR": r"C:\Windows",
"ComSpec": r"C:\Windows\System32\cmd.exe",
"PATHEXT": ".COM;.EXE;.BAT",
"USERPROFILE": r"C:\Users\alice",
"APPDATA": r"C:\Users\alice\AppData\Roaming",
"LOCALAPPDATA": r"C:\Users\alice\AppData\Local",
# Safe-prefix matches (cross-platform)
"PATH": r"C:\Python311;C:\Windows\System32",
"HOME": r"C:\Users\alice",
"TEMP": r"C:\Users\alice\AppData\Local\Temp",
# Secret-looking (always blocked)
"OPENAI_API_KEY": "sk-xxx",
"GITHUB_TOKEN": "ghp_xxx",
}
@pytest.mark.parametrize("env_name,env", [
("posix_synthetic", _POSIX_SYNTHETIC_ENV),
("windows_synthetic_on_posix", _WINDOWS_SYNTHETIC_ENV),
])
@pytest.mark.parametrize("pt_name,pt", [
("no_passthrough", lambda _: False),
("tenor_passthrough", lambda k: k == "TENOR_API_KEY"),
("all_passthrough", lambda _: True),
])
def test_posix_behavior_unchanged(self, env_name, env, pt_name, pt):
"""For every combination of (env shape × passthrough rule), the
new helper with is_windows=False must produce the exact same dict
as the legacy inline scrubber.
We parametrize over three passthrough rules to cover the full
surface: no passthrough, single-var passthrough (the common
skill-registered case), and everything-passes (edge case that
could expose precedence bugs)."""
expected = _legacy_posix_scrubber(env, pt)
actual = _scrub_child_env(env, is_passthrough=pt, is_windows=False)
assert actual == expected, (
f"POSIX behavior regressed for env={env_name}, passthrough={pt_name}\n"
f" only in legacy: {sorted(set(expected) - set(actual))}\n"
f" only in new: {sorted(set(actual) - set(expected))}\n"
f" value diffs: {[k for k in expected if k in actual and expected[k] != actual[k]]}"
)
def test_posix_behavior_unchanged_on_real_os_environ(self):
"""Bonus check against the actual os.environ of the host running
the test. This covers vars we might not have thought to put in
the synthetic fixtures."""
expected = _legacy_posix_scrubber(os.environ, lambda _: False)
actual = _scrub_child_env(os.environ,
is_passthrough=lambda _: False,
is_windows=False)
assert actual == expected, (
"POSIX-mode scrubber diverged from legacy behavior on real "
f"os.environ (host platform={sys.platform})"
)
def test_windows_mode_is_strict_superset_of_posix_mode(self):
"""Correctness check on the NEW behavior: is_windows=True must
keep everything POSIX mode keeps, and *may* add Windows
essentials. It must never drop a var that POSIX mode would keep;
if it did, we'd have broken same-host reuse of the scrubber."""
env = {**self._POSIX_SYNTHETIC_ENV, **self._WINDOWS_SYNTHETIC_ENV}
posix_result = _scrub_child_env(env,
is_passthrough=lambda _: False,
is_windows=False)
windows_result = _scrub_child_env(env,
is_passthrough=lambda _: False,
is_windows=True)
missing = set(posix_result) - set(windows_result)
assert not missing, (
f"is_windows=True dropped vars that is_windows=False kept: {missing}"
)
# And any extras must come from the Windows essentials allowlist.
extras = set(windows_result) - set(posix_result)
for k in extras:
assert k.upper() in _WINDOWS_ESSENTIAL_ENV_VARS, (
f"Unexpected extra var in windows-mode output: {k} "
f"(not in _WINDOWS_ESSENTIAL_ENV_VARS)"
)
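Taken together, the tests above imply a precedence order of passthrough, then secret block, then safe prefixes, then (on Windows only) the essentials allowlist. The sketch below is one way to satisfy that order. It is not the production `_scrub_child_env`: the function name `scrub_env_sketch` is hypothetical, and the contents of `WINDOWS_ESSENTIALS` are an assumption based on the variables the tests mention.

```python
from typing import Callable, Dict, Mapping

SAFE_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
                 "TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
                 "XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA", "HERMES_")
SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
                     "PASSWD", "AUTH")
# Assumed allowlist; the real _WINDOWS_ESSENTIAL_ENV_VARS may contain more.
WINDOWS_ESSENTIALS = frozenset({
    "SYSTEMROOT", "SYSTEMDRIVE", "WINDIR", "COMSPEC", "PATHEXT",
    "USERPROFILE", "APPDATA", "LOCALAPPDATA",
})


def scrub_env_sketch(source_env: Mapping[str, str],
                     is_passthrough: Callable[[str], bool],
                     is_windows: bool) -> Dict[str, str]:
    """Filter an environment mapping for a sandboxed child process."""
    out: Dict[str, str] = {}
    for key, value in source_env.items():
        if is_passthrough(key):
            out[key] = value              # explicit opt-in wins over everything
        elif any(s in key.upper() for s in SECRET_SUBSTRINGS):
            continue                      # secret-looking names are dropped
        elif any(key.startswith(p) for p in SAFE_PREFIXES):
            out[key] = value              # cross-platform safe prefixes
        elif is_windows and key.upper() in WINDOWS_ESSENTIALS:
            out[key] = value              # exact-name, case-insensitive match
    return out


env = {"SystemRoot": r"C:\Windows", "comspec": "cmd.exe", "PATH": "/bin",
       "OPENAI_API_KEY": "sk-secret", "RANDOM_UNKNOWN_VAR": "value"}
print(sorted(scrub_env_sketch(env, lambda _: False, is_windows=True)))
print(sorted(scrub_env_sketch(env, lambda _: False, is_windows=False)))
```

Because the Windows branch comes last and only ever adds keys, the Windows-mode result is a superset of the POSIX-mode result by construction, which is the property `test_windows_mode_is_strict_superset_of_posix_mode` checks.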
# ---------------------------------------------------------------------------
# UTF-8 file-write regression test
# ---------------------------------------------------------------------------
#
# The sandbox writes two Python files into a temp dir — the generated
# ``hermes_tools.py`` stub, and the LLM's ``script.py``. Both contain
# non-ASCII characters in practice: the stub has em-dashes in docstrings
# ("``tcp://host:port`` — the parent falls back..."), and user scripts
# routinely contain non-ASCII strings, comments, or Unicode identifiers.
#
# On Windows, ``open(path, "w")`` without encoding= uses the system locale
# (cp1252 on US/UK installs), which cannot encode em-dashes. Python then
# tries to decode the file as UTF-8 when importing it (PEP 3120), fails,
# and the sandbox aborts with:
#
# SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x97
# in position N: invalid start byte
#
# This was the *second* Windows-specific bug (WinError 10106 was the first).
# The fix is to always pass ``encoding="utf-8"`` when writing Python source.
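The failure mode described in the comment block above is easy to reproduce without a Windows machine by passing `encoding="cp1252"` explicitly. The sketch below is only a demonstration; the helper name `survives_utf8_read` and the sample `STUB` text are made up for the example.

```python
import os
import tempfile

# A one-line "module" whose docstring contains an em-dash (U+2014).
STUB = '"""tcp://host:port \u2014 the parent falls back."""\n'


def survives_utf8_read(text: str, write_encoding: str) -> bool:
    """Write text with the given encoding, then decode the bytes as UTF-8.

    Python decodes source files as UTF-8 by default, so a False result
    means the file would fail to import.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "stub.py")
        # newline="" avoids \n -> \r\n translation on Windows hosts.
        with open(path, "w", encoding=write_encoding, newline="") as fh:
            fh.write(text)
        with open(path, "rb") as fh:
            raw = fh.read()
    try:
        return raw.decode("utf-8") == text
    except UnicodeDecodeError:
        return False


print(survives_utf8_read(STUB, "utf-8"))
print(survives_utf8_read(STUB, "cp1252"))
```

In cp1252 the em-dash is the single byte 0x97, which is not a valid UTF-8 start byte; that is where the `can't decode byte 0x97` message comes from.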
class TestSandboxWritesUtf8:
"""Verify the file-write call sites use UTF-8 explicitly, not the
platform default. We check the source of ``execute_code`` rather
than spawning a real sandbox because the latter needs a full agent
context, whereas the code inspection is deterministic and fast."""
def test_stub_and_script_writes_specify_utf8(self):
"""Both ``hermes_tools.py`` and ``script.py`` writes in
``_execute_local`` must pass ``encoding="utf-8"``."""
import tools.code_execution_tool as cet
src = open(cet.__file__, encoding="utf-8").read()
# There should be no ``open(path, "w")`` without encoding= for
# the two staging files. Grep-style check: find every write of
# a .py file inside tmpdir and assert the line also contains
# ``encoding="utf-8"`` within a short window.
import re
pattern = re.compile(
r'open\(\s*os\.path\.join\(\s*tmpdir\s*,\s*"[^"]+\.py"\s*\)\s*,\s*"w"[^)]*\)'
)
for match in pattern.finditer(src):
line = match.group(0)
assert 'encoding="utf-8"' in line or "encoding='utf-8'" in line, (
f"Sandbox file write missing encoding=\"utf-8\" on Windows: {line!r}"
)
def test_file_rpc_stub_uses_utf8(self):
"""The file-based RPC transport stub (used by remote backends)
reads/writes JSON response files. Those must also specify UTF-8
so non-ASCII tool results survive the round-trip intact."""
from tools.code_execution_tool import generate_hermes_tools_module
stub = generate_hermes_tools_module(["terminal"], transport="file")
# The generated stub should open response + request files as UTF-8.
assert 'encoding="utf-8"' in stub, (
"File-based RPC stub does not specify encoding=\"utf-8\"; "
"will corrupt non-ASCII tool results on non-UTF-8 locales."
)
def test_stub_source_roundtrips_through_utf8(self):
"""Concrete regression: write the generated stub to a temp file
using ``encoding="utf-8"``, then parse it. This is what the
sandbox does, and it must succeed even when the stub contains
em-dashes (which it does; check the transport-header docstring).
"""
from tools.code_execution_tool import generate_hermes_tools_module
import tempfile, ast
stub = generate_hermes_tools_module(
["terminal", "read_file", "write_file"], transport="uds"
)
# Sanity: stub actually contains a non-ASCII character, otherwise
# this test wouldn't prove anything meaningful.
non_ascii = [c for c in stub if ord(c) > 127]
assert non_ascii, (
"Generated stub is pure ASCII — test is meaningless. If the "
"stub's docstrings have lost their em-dashes, update this "
"assertion, but be aware the original regression is no longer "
"covered."
)
with tempfile.NamedTemporaryFile(
mode="w", suffix=".py", delete=False, encoding="utf-8"
) as f:
f.write(stub)
tmp_path = f.name
try:
# Re-read and parse exactly like the child Python would.
with open(tmp_path, encoding="utf-8") as fh:
round_tripped = fh.read()
assert round_tripped == stub, "UTF-8 round-trip corrupted the stub"
ast.parse(round_tripped) # must not raise SyntaxError
finally:
os.unlink(tmp_path)
@pytest.mark.skipif(
sys.platform != "win32",
reason="cp1252 default-encoding regression is Windows-specific",
)
def test_windows_default_encoding_would_have_failed(self):
"""Negative control: prove that on Windows, writing the stub
*without* ``encoding="utf-8"`` would corrupt the file. If this
test ever starts failing (i.e. default write succeeds), it means
Python's default encoding has changed and the explicit UTF-8
requirement may be obsolete; reconsider the fix."""
from tools.code_execution_tool import generate_hermes_tools_module
import tempfile
stub = generate_hermes_tools_module(["terminal"], transport="uds")
# Find a non-ASCII character we can use to prove the corruption.
non_ascii = [c for c in stub if ord(c) > 127]
if not non_ascii:
pytest.skip("stub has no non-ASCII chars — nothing to corrupt")
# Write with default encoding (simulating the old buggy code).
with tempfile.NamedTemporaryFile(
mode="w", suffix=".py", delete=False
) as f:
try:
f.write(stub)
tmp_path = f.name
wrote_successfully = True
except UnicodeEncodeError:
# Default encoding can't even encode it — that's the bug
# in a different form. Still proves the point.
tmp_path = f.name
wrote_successfully = False
try:
if not wrote_successfully:
# Default-encoding write raised outright. The bug is real.
return
# Read back as UTF-8 (what Python does on import).
with open(tmp_path, encoding="utf-8") as fh:
try:
fh.read()
# If this succeeds on Windows, the platform default is
# already UTF-8 (e.g. Python 3.15 with UTF-8 mode on).
# In that case the explicit encoding= is belt-and-
# suspenders but no longer strictly required. Skip.
pytest.skip(
"Default text-file encoding is UTF-8-compatible on "
"this Windows build — explicit encoding= is no "
"longer load-bearing, but keep it for belt-and-"
"suspenders."
)
except UnicodeDecodeError:
# Exactly the failure mode that motivated the fix.
pass
finally:
os.unlink(tmp_path)
# ---------------------------------------------------------------------------
# UTF-8 stdio regression test
# ---------------------------------------------------------------------------
#
# The third Windows-specific sandbox bug: after the UTF-8 file-write fix
# let the child import hermes_tools, a user script that printed non-ASCII
# to stdout still crashed with:
#
# UnicodeEncodeError: 'charmap' codec can't encode character '\u2192'
# in position N: character maps to <undefined>
#
# When the process is attached to a pipe and PYTHONIOENCODING is unset,
# Python's sys.stdout on Windows uses the ANSI code page (cp1252 on
# US-locale installs). LLM-generated scripts routinely print
# em-dashes, arrows, accented chars, emoji — all of which break.
#
# Fix: spawn the child with PYTHONIOENCODING=utf-8 and PYTHONUTF8=1.
# The latter also makes open()'s default encoding UTF-8 (PEP 540),
# belt-and-suspenders for user scripts that do their own file I/O.
class TestChildStdioIsUtf8:
"""Verify the sandbox child is spawned with UTF-8 stdio encoding,
so LLM scripts can print non-ASCII without crashing on Windows."""
def test_popen_env_sets_pythonioencoding_utf8(self):
"""Source-level check: the Popen call site must set
PYTHONIOENCODING=utf-8 in child_env."""
import tools.code_execution_tool as cet
src = open(cet.__file__, encoding="utf-8").read()
assert 'child_env["PYTHONIOENCODING"] = "utf-8"' in src, (
"PYTHONIOENCODING=utf-8 missing from child env — Windows "
"scripts that print non-ASCII will crash with "
"UnicodeEncodeError."
)
def test_popen_env_sets_pythonutf8_mode(self):
"""Source-level check: PYTHONUTF8=1 must be set too — it makes
open()'s default encoding UTF-8 in user-written file I/O."""
import tools.code_execution_tool as cet
src = open(cet.__file__, encoding="utf-8").read()
assert 'child_env["PYTHONUTF8"] = "1"' in src, (
"PYTHONUTF8=1 missing from child env — user scripts that "
"call open(path, 'w') without encoding= will produce "
"locale-encoded files on Windows."
)
def test_live_child_can_print_non_ascii(self):
"""Live regression: spawn a Python child with the same env
treatment the sandbox uses (PYTHONIOENCODING=utf-8 + PYTHONUTF8=1)
and verify it can print em-dashes, arrows, and emoji to stdout
without crashing. This is the exact scenario that broke in live
usage.
Runs on every OS; on POSIX the fix is belt-and-suspenders but
still load-bearing for C.ASCII locale environments.
"""
script = textwrap.dedent("""
import sys
# Mix of chars that cp1252 can't encode: arrow, emoji.
print("em-dash \\u2014 arrow \\u2192 emoji \\U0001f680")
sys.exit(0)
""").strip()
# Build a scrubbed env the same way the sandbox does, then apply
# the stdio overrides.
scrubbed = _scrub_child_env(os.environ, is_passthrough=_no_passthrough)
scrubbed["PYTHONIOENCODING"] = "utf-8"
scrubbed["PYTHONUTF8"] = "1"
result = subprocess.run(
[sys.executable, "-c", script],
env=scrubbed,
capture_output=True,
timeout=15,
# Don't decode at the subprocess boundary — we want to check
# the raw bytes match UTF-8, same as what the sandbox does.
)
assert result.returncode == 0, (
f"Child crashed printing non-ASCII:\n"
f" stdout (raw): {result.stdout!r}\n"
f" stderr (raw): {result.stderr!r}"
)
decoded = result.stdout.decode("utf-8")
assert "\u2014" in decoded, f"em-dash missing from output: {decoded!r}"
assert "\u2192" in decoded, f"arrow missing from output: {decoded!r}"
assert "\U0001f680" in decoded, f"emoji missing from output: {decoded!r}"
@pytest.mark.skipif(
sys.platform != "win32",
reason="cp1252 stdout default is Windows-specific",
)
def test_windows_child_without_utf8_env_would_fail(self):
"""Negative control: spawn a Python child *without* our env
overrides and prove that on Windows, printing non-ASCII fails.
If this ever starts passing, Python has changed its default
stdio encoding on Windows and the fix may be obsolete, but
keep the env vars anyway for belt-and-suspenders."""
script = textwrap.dedent("""
import sys
print("em-dash \\u2014 arrow \\u2192")
sys.exit(0)
""").strip()
# Scrubbed env WITHOUT the PYTHONIOENCODING / PYTHONUTF8 overrides.
# Also scrub PYTHONUTF8, PYTHONIOENCODING, and PYTHONLEGACYWINDOWSSTDIO
# from the inherited env so we reproduce the buggy state even if the
# parent test runner has them set.
scrubbed = _scrub_child_env(os.environ, is_passthrough=_no_passthrough)
for k in ("PYTHONIOENCODING", "PYTHONUTF8", "PYTHONLEGACYWINDOWSSTDIO"):
scrubbed.pop(k, None)
result = subprocess.run(
[sys.executable, "-c", script],
env=scrubbed,
capture_output=True,
text=False,
timeout=15,
)
# Either the child crashed (expected), or modern Python handled
# it anyway — in which case the fix is still defensive but no
# longer strictly required. Skip with a note if so.
if result.returncode == 0 and b"\xe2\x80\x94" in result.stdout:
pytest.skip(
"This Python/Windows build handles non-ASCII stdout even "
"without PYTHONIOENCODING/PYTHONUTF8 — fix is defensive "
"but no longer strictly load-bearing. Keep the env vars "
"for older Python builds and C.ASCII-locale containers."
)
# Otherwise: crash OR garbled output — both count as proving the
# bug is real on this system.
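The two encoding failures described in the comments above can be reproduced on any OS by naming the codecs explicitly instead of relying on the platform default. This is a minimal sketch and not part of the test suite; the helper names `decodes_as_utf8` and `encodable` are hypothetical.

```python
# Illustrative only: explicit codecs stand in for the Windows locale default,
# so this behaves the same on Linux, macOS, and Windows.

sample = "tcp://host:port \u2014 fallback \u2192 done"

# File-write bug: cp1252 encodes the em-dash (U+2014) as the single byte 0x97,
# which is not a valid UTF-8 start byte, so a later UTF-8 decode fails.
em_dash_bytes = "\u2014".encode("cp1252")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if ``data`` is valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def encodable(text: str, codec: str) -> bool:
    """Return True if ``text`` can be encoded with ``codec``."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Stdio bug: cp1252 has no mapping for the arrow (U+2192), so printing it to a
# cp1252-encoded stream raises UnicodeEncodeError.
arrow_ok_in_cp1252 = encodable("\u2192", "cp1252")
```

Writing and reading with an explicit `encoding="utf-8"` avoids both cases, because UTF-8 can represent every character in `sample`.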
@@ -0,0 +1,275 @@
"""Tests for the Brave Search (free tier) web search provider.
Covers:
- BraveFreeSearchProvider.is_configured() env var gating
- BraveFreeSearchProvider.search() happy path, HTTP error, request error, bad JSON
- Result normalization (title, url, description, position)
- Limit truncation + Brave's count cap (20)
- _is_backend_available("brave-free") integration
- _get_backend() recognizes "brave-free" as a valid configured backend
- check_web_api_key() includes brave-free in availability check
- web_extract / web_crawl return search-only errors when brave-free is active
"""
from __future__ import annotations
import json
from unittest.mock import MagicMock, patch
# ---------------------------------------------------------------------------
# BraveFreeSearchProvider unit tests
# ---------------------------------------------------------------------------
class TestBraveFreeProviderIsConfigured:
def test_configured_when_key_set(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().is_configured() is True
def test_not_configured_when_key_missing(self, monkeypatch):
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().is_configured() is False
def test_not_configured_when_key_whitespace(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", " ")
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().is_configured() is False
def test_provider_name(self):
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert BraveFreeSearchProvider().provider_name() == "brave-free"
def test_implements_web_search_provider(self):
from tools.web_providers.base import WebSearchProvider
from tools.web_providers.brave_free import BraveFreeSearchProvider
assert issubclass(BraveFreeSearchProvider, WebSearchProvider)
class TestBraveFreeProviderSearch:
_SAMPLE_RESPONSE = {
"web": {
"results": [
{"title": "A", "url": "https://a.example.com", "description": "desc A"},
{"title": "B", "url": "https://b.example.com", "description": "desc B"},
{"title": "C", "url": "https://c.example.com", "description": "desc C"},
]
}
}
@staticmethod
def _mock_resp(json_data, status_code=200):
m = MagicMock()
m.status_code = status_code
m.json.return_value = json_data
m.raise_for_status = MagicMock()
return m
def test_happy_path_normalizes_results(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp(self._SAMPLE_RESPONSE)):
result = BraveFreeSearchProvider().search("test query", limit=5)
assert result["success"] is True
web = result["data"]["web"]
assert len(web) == 3
assert web[0] == {"title": "A", "url": "https://a.example.com", "description": "desc A", "position": 1}
assert web[2]["position"] == 3
def test_sends_subscription_token_header_and_count(self, monkeypatch):
"""Brave uses X-Subscription-Token; count maps from limit."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
captured = {}
def fake_get(url, **kwargs):
captured["url"] = url
captured["headers"] = kwargs.get("headers", {})
captured["params"] = kwargs.get("params", {})
return self._mock_resp({"web": {"results": []}})
with patch("httpx.get", side_effect=fake_get):
BraveFreeSearchProvider().search("q", limit=5)
assert captured["url"] == "https://api.search.brave.com/res/v1/web/search"
assert captured["headers"].get("X-Subscription-Token") == "BSAkey123"
assert captured["params"].get("q") == "q"
assert captured["params"].get("count") == 5
def test_count_is_capped_at_20(self, monkeypatch):
"""Brave caps count at 20 — limit above that clamps."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
captured = {}
def fake_get(url, **kwargs):
captured["params"] = kwargs.get("params", {})
return self._mock_resp({"web": {"results": []}})
with patch("httpx.get", side_effect=fake_get):
BraveFreeSearchProvider().search("q", limit=100)
assert captured["params"].get("count") == 20
def test_limit_is_respected_client_side(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp(self._SAMPLE_RESPONSE)):
result = BraveFreeSearchProvider().search("q", limit=2)
assert result["success"] is True
assert len(result["data"]["web"]) == 2
def test_empty_results(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp({"web": {"results": []}})):
result = BraveFreeSearchProvider().search("nothing", limit=5)
assert result["success"] is True
assert result["data"]["web"] == []
def test_missing_web_key_returns_empty(self, monkeypatch):
"""Responses without a ``web`` block should produce an empty result set, not crash."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", return_value=self._mock_resp({})):
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is True
assert result["data"]["web"] == []
def test_http_error_returns_failure(self, monkeypatch):
import httpx
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
bad = MagicMock()
bad.status_code = 429
err = httpx.HTTPStatusError("429", request=MagicMock(), response=bad)
with patch("httpx.get", side_effect=err):
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "429" in result["error"]
def test_request_error_returns_failure(self, monkeypatch):
import httpx
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_providers.brave_free import BraveFreeSearchProvider
with patch("httpx.get", side_effect=httpx.RequestError("boom")):
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "boom" in result["error"] or "Brave" in result["error"]
def test_missing_key_returns_failure(self, monkeypatch):
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
from tools.web_providers.brave_free import BraveFreeSearchProvider
result = BraveFreeSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "BRAVE_SEARCH_API_KEY" in result["error"]
# ---------------------------------------------------------------------------
# Integration: _is_backend_available / _get_backend / check_web_api_key
# ---------------------------------------------------------------------------
class TestBraveFreeBackendWiring:
def test_is_backend_available_true_when_key_set(self, monkeypatch):
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
from tools.web_tools import _is_backend_available
assert _is_backend_available("brave-free") is True
def test_is_backend_available_false_when_key_missing(self, monkeypatch):
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
from tools.web_tools import _is_backend_available
assert _is_backend_available("brave-free") is False
def test_configured_backend_accepted(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
assert web_tools._get_backend() == "brave-free"
def test_auto_detect_picks_brave_free_when_only_key_set(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY",
"TAVILY_API_KEY", "EXA_API_KEY", "SEARXNG_URL"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: False)
assert web_tools._get_backend() == "brave-free"
def test_brave_free_does_not_override_paid_provider(self, monkeypatch):
"""Tavily (higher priority) should win in auto-detect."""
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY", "EXA_API_KEY", "SEARXNG_URL"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setenv("TAVILY_API_KEY", "tvly")
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
assert web_tools._get_backend() == "tavily"
def test_check_web_api_key_true_when_brave_free_configured(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
assert web_tools.check_web_api_key() is True
# ---------------------------------------------------------------------------
# brave-free is search-only: web_extract / web_crawl return clear errors
# ---------------------------------------------------------------------------
class TestBraveFreeSearchOnlyErrors:
def test_web_extract_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.run(
web_tools.web_extract_tool(["https://example.com"])
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "brave" in result["error"].lower()
def test_web_crawl_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "brave-free"})
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "BSAkey123")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "check_firecrawl_api_key", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.run(
web_tools.web_crawl_tool("https://example.com")
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "brave" in result["error"].lower()
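The normalization and count-cap behavior that the Brave tests above assert can be summarized in a small sketch. The real code in `tools/web_providers/brave_free.py` is not shown in this diff, so the function names `clamp_count` and `normalize_results` and the constant `BRAVE_MAX_COUNT` are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# Assumed constant: the tests only establish that limit=100 is sent as count=20.
BRAVE_MAX_COUNT = 20


def clamp_count(limit: int) -> int:
    """Map the caller's limit onto the ``count`` query parameter."""
    return min(limit, BRAVE_MAX_COUNT)


def normalize_results(payload: dict[str, Any], limit: int) -> list[dict[str, Any]]:
    """Flatten a Brave-style response into title/url/description/position dicts.

    A missing ``web`` block yields an empty list instead of raising.
    """
    raw = (payload.get("web") or {}).get("results") or []
    return [
        {
            "title": hit.get("title", ""),
            "url": hit.get("url", ""),
            "description": hit.get("description", ""),
            "position": position,  # 1-based rank, as the tests expect
        }
        for position, hit in enumerate(raw[:limit], start=1)
    ]


sample_payload = {
    "web": {
        "results": [
            {"title": "A", "url": "https://a.example.com", "description": "desc A"},
            {"title": "B", "url": "https://b.example.com", "description": "desc B"},
            {"title": "C", "url": "https://c.example.com", "description": "desc C"},
        ]
    }
}
top_two = normalize_results(sample_payload, limit=2)
```

Truncating client-side as well as sending `count` mirrors `test_limit_is_respected_client_side`, where the mocked response returns more hits than were requested.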
@@ -0,0 +1,246 @@
"""Tests for the DuckDuckGo (ddgs) web search provider.
Covers:
- DDGSSearchProvider.is_configured() reflects package importability
- DDGSSearchProvider.search() happy path, missing package, runtime error
- Result normalization (title, url, description, position)
- _is_backend_available("ddgs") / _get_backend() integration
- web_extract / web_crawl return search-only errors when ddgs is active
"""
from __future__ import annotations
import json
import sys
import types
from unittest.mock import MagicMock
def _install_fake_ddgs(monkeypatch, *, text_results=None, text_raises=None):
"""Install a stub ``ddgs`` module in sys.modules for the duration of a test.
``text_results``: iterable of dicts to yield from DDGS().text(...).
``text_raises``: if set, DDGS().text raises this exception instead.
"""
fake = types.ModuleType("ddgs")
class _FakeDDGS:
def __enter__(self):
return self
def __exit__(self, *_a):
return False
def text(self, query, max_results=5):
if text_raises is not None:
raise text_raises
for hit in (text_results or []):
yield hit
fake.DDGS = _FakeDDGS
monkeypatch.setitem(sys.modules, "ddgs", fake)
return fake
# ---------------------------------------------------------------------------
# DDGSSearchProvider unit tests
# ---------------------------------------------------------------------------
class TestDDGSProviderIsConfigured:
def test_configured_when_package_importable(self, monkeypatch):
_install_fake_ddgs(monkeypatch)
# Drop any cached ``tools.web_providers.ddgs`` so is_configured re-imports ddgs fresh
monkeypatch.delitem(sys.modules, "tools.web_providers.ddgs", raising=False)
from tools.web_providers.ddgs import DDGSSearchProvider
assert DDGSSearchProvider().is_configured() is True
def test_not_configured_when_package_missing(self, monkeypatch):
monkeypatch.delitem(sys.modules, "ddgs", raising=False)
monkeypatch.delitem(sys.modules, "tools.web_providers.ddgs", raising=False)
# Block the import so ``import ddgs`` raises ImportError even if the package is actually installed
import builtins
orig_import = builtins.__import__
def blocked_import(name, *args, **kwargs):
if name == "ddgs":
raise ImportError("blocked for test")
return orig_import(name, *args, **kwargs)
monkeypatch.setattr(builtins, "__import__", blocked_import)
from tools.web_providers.ddgs import DDGSSearchProvider
assert DDGSSearchProvider().is_configured() is False
def test_provider_name(self):
from tools.web_providers.ddgs import DDGSSearchProvider
assert DDGSSearchProvider().provider_name() == "ddgs"
def test_implements_web_search_provider(self):
from tools.web_providers.base import WebSearchProvider
from tools.web_providers.ddgs import DDGSSearchProvider
assert issubclass(DDGSSearchProvider, WebSearchProvider)
class TestDDGSProviderSearch:
def test_happy_path_normalizes_results(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[
{"title": "A", "href": "https://a.example.com", "body": "desc A"},
{"title": "B", "href": "https://b.example.com", "body": "desc B"},
{"title": "C", "href": "https://c.example.com", "body": "desc C"},
])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is True
web = result["data"]["web"]
assert len(web) == 3
assert web[0] == {"title": "A", "url": "https://a.example.com", "description": "desc A", "position": 1}
assert web[2]["position"] == 3
def test_accepts_url_key_as_fallback_for_href(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[
{"title": "A", "url": "https://a.example.com", "body": "desc A"},
])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is True
assert result["data"]["web"][0]["url"] == "https://a.example.com"
def test_limit_is_respected(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[
{"title": f"R{i}", "href": f"https://r{i}.example.com", "body": ""}
for i in range(10)
])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=3)
assert result["success"] is True
assert len(result["data"]["web"]) == 3
def test_missing_package_returns_failure(self, monkeypatch):
monkeypatch.delitem(sys.modules, "ddgs", raising=False)
monkeypatch.delitem(sys.modules, "tools.web_providers.ddgs", raising=False)
import builtins
orig_import = builtins.__import__
def blocked_import(name, *args, **kwargs):
if name == "ddgs":
raise ImportError("blocked for test")
return orig_import(name, *args, **kwargs)
monkeypatch.setattr(builtins, "__import__", blocked_import)
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "ddgs" in result["error"].lower()
def test_runtime_error_returns_failure(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_raises=RuntimeError("rate limited 202"))
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("q", limit=5)
assert result["success"] is False
assert "rate limited" in result["error"] or "failed" in result["error"].lower()
def test_empty_results(self, monkeypatch):
_install_fake_ddgs(monkeypatch, text_results=[])
from tools.web_providers.ddgs import DDGSSearchProvider
result = DDGSSearchProvider().search("nothing", limit=5)
assert result["success"] is True
assert result["data"]["web"] == []
# ---------------------------------------------------------------------------
# Integration: _is_backend_available / _get_backend / check_web_api_key
# ---------------------------------------------------------------------------
class TestDDGSBackendWiring:
def test_is_backend_available_true_when_package_importable(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._is_backend_available("ddgs") is True
def test_is_backend_available_false_when_package_missing(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: False)
assert web_tools._is_backend_available("ddgs") is False
def test_configured_backend_accepted(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._get_backend() == "ddgs"
def test_ddgs_trails_paid_providers_in_auto_detect(self, monkeypatch):
"""Exa (higher priority) should win over ddgs in auto-detect."""
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY",
"TAVILY_API_KEY", "SEARXNG_URL", "BRAVE_SEARCH_API_KEY"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setenv("EXA_API_KEY", "exa-key")
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._get_backend() == "exa"
def test_auto_detect_picks_ddgs_as_last_resort(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {})
for key in ("FIRECRAWL_API_KEY", "FIRECRAWL_API_URL", "PARALLEL_API_KEY",
"TAVILY_API_KEY", "EXA_API_KEY", "SEARXNG_URL", "BRAVE_SEARCH_API_KEY"):
monkeypatch.delenv(key, raising=False)
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools._get_backend() == "ddgs"
def test_check_web_api_key_true_when_ddgs_configured(self, monkeypatch):
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
assert web_tools.check_web_api_key() is True
# ---------------------------------------------------------------------------
# ddgs is search-only: web_extract / web_crawl return clear errors
# ---------------------------------------------------------------------------
class TestDDGSSearchOnlyErrors:
def test_web_extract_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.run(
web_tools.web_extract_tool(["https://example.com"])
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "duckduckgo" in result["error"].lower() or "ddgs" in result["error"].lower()
def test_web_crawl_returns_search_only_error(self, monkeypatch):
import asyncio
from tools import web_tools
monkeypatch.setattr(web_tools, "_load_web_config", lambda: {"backend": "ddgs"})
monkeypatch.setattr(web_tools, "_ddgs_package_importable", lambda: True)
monkeypatch.setattr(web_tools, "_is_tool_gateway_ready", lambda: False)
monkeypatch.setattr(web_tools, "check_firecrawl_api_key", lambda: False)
monkeypatch.setattr("tools.interrupt.is_interrupted", lambda: False, raising=False)
result_str = asyncio.run(
web_tools.web_crawl_tool("https://example.com")
)
result = json.loads(result_str)
assert result["success"] is False
assert "search-only" in result["error"].lower()
assert "duckduckgo" in result["error"].lower() or "ddgs" in result["error"].lower()
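The ddgs tests above imply a mapping from the package's `href`/`body` keys onto the shared `url`/`description` shape, with `url` accepted as a fallback and the limit applied to a lazily yielded result stream. The sketch below shows one way to express that; `normalize_ddgs_hits` is a hypothetical name, and the real provider code is not part of this diff.

```python
from __future__ import annotations

from itertools import islice
from typing import Any, Iterable


def normalize_ddgs_hits(hits: Iterable[dict[str, Any]], limit: int) -> list[dict[str, Any]]:
    """Convert ddgs-style hits into title/url/description/position dicts.

    ``islice`` stops consuming the iterable after ``limit`` items, so a
    generator that would yield more results is never fully drained.
    """
    normalized = []
    for position, hit in enumerate(islice(hits, limit), start=1):
        normalized.append(
            {
                "title": hit.get("title", ""),
                # Prefer ``href``; fall back to ``url`` when ``href`` is absent.
                "url": hit.get("href") or hit.get("url", ""),
                "description": hit.get("body", ""),
                "position": position,
            }
        )
    return normalized


def fake_text_results():
    """Stand-in for a streaming result source, similar to the test stub."""
    for i in range(10):
        yield {"title": f"R{i}", "href": f"https://r{i}.example.com", "body": ""}


first_three = normalize_ddgs_hits(fake_text_results(), limit=3)
fallback = normalize_ddgs_hits(
    [{"title": "A", "url": "https://a.example.com", "body": "desc A"}], limit=5
)
```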
@@ -0,0 +1,812 @@
"""Behavioral tests for Windows-specific compatibility fixes.
Complements ``tests/tools/test_windows_compat.py`` (which does source-level
pattern linting) with cross-platform-mocked tests that exercise the actual
code paths Hermes takes on native Windows.
Runs on Linux CI: every test mocks ``sys.platform``, ``subprocess.run``,
and ``os.kill`` as needed to simulate Windows behavior without requiring a
Windows runner.
"""
from __future__ import annotations
import importlib
import os
import signal
import subprocess
import sys
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
# ---------------------------------------------------------------------------
# configure_windows_stdio
# ---------------------------------------------------------------------------
class TestConfigureWindowsStdio:
"""``hermes_cli.stdio.configure_windows_stdio`` wiring.
The function must:
- be a no-op on non-Windows
- only configure once per process (idempotent)
- set PYTHONIOENCODING / PYTHONUTF8 without overriding explicit user settings
- reconfigure sys.stdout/stderr/stdin to UTF-8 on Windows
- flip the console code page to CP_UTF8 (65001) via ctypes
- respect HERMES_DISABLE_WINDOWS_UTF8 opt-out
"""
@pytest.fixture(autouse=True)
def _reset_configured(self, monkeypatch):
"""Reload the module before each test so the _CONFIGURED flag resets."""
# Remove from sys.modules so import triggers a fresh load
sys.modules.pop("hermes_cli.stdio", None)
# Fresh import now; tests import from hermes_cli.stdio themselves,
# but this guarantees the module they get is a brand-new copy.
import hermes_cli.stdio as _s
_s._CONFIGURED = False
yield
sys.modules.pop("hermes_cli.stdio", None)
def test_no_op_on_posix(self):
from hermes_cli import stdio
assert stdio.is_windows() is False
result = stdio.configure_windows_stdio()
assert result is False
def test_idempotent(self):
from hermes_cli import stdio
stdio.configure_windows_stdio()
# Second call returns False because _CONFIGURED is set
assert stdio.configure_windows_stdio() is False
def test_windows_path_sets_env_and_reconfigures_streams(self, monkeypatch):
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
# Pretend the user has no prior setting
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("HERMES_DISABLE_WINDOWS_UTF8", raising=False)
monkeypatch.delenv("EDITOR", raising=False)
monkeypatch.delenv("VISUAL", raising=False)
reconfigure_calls = []
def fake_reconfigure(stream, *, encoding="utf-8", errors="replace"):
reconfigure_calls.append((stream, encoding, errors))
cp_calls = []
def fake_flip():
cp_calls.append(True)
monkeypatch.setattr(stdio, "_reconfigure_stream", fake_reconfigure)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", fake_flip)
# Pretend notepad.exe is on PATH (it always is on real Windows hosts,
# but not on the Linux CI runner — mock it so the editor default
# survives).
monkeypatch.setattr(stdio, "_default_windows_editor", lambda: "notepad")
result = stdio.configure_windows_stdio()
assert result is True
assert os.environ.get("PYTHONIOENCODING") == "utf-8"
assert os.environ.get("PYTHONUTF8") == "1"
# EDITOR must be set so prompt_toolkit's open_in_editor finds
# a working program on Windows (it defaults to /usr/bin/nano).
assert os.environ.get("EDITOR") == "notepad"
assert len(cp_calls) == 1 # SetConsoleOutputCP path hit
assert len(reconfigure_calls) == 3 # stdout, stderr, stdin
def test_respects_existing_editor_var(self, monkeypatch):
"""User's explicit EDITOR wins over our default."""
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.setenv("EDITOR", "code --wait")
monkeypatch.setattr(stdio, "_reconfigure_stream", lambda *a, **kw: None)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", lambda: None)
monkeypatch.setattr(stdio, "_default_windows_editor", lambda: "notepad")
stdio.configure_windows_stdio()
assert os.environ["EDITOR"] == "code --wait"
def test_respects_existing_visual_var(self, monkeypatch):
"""VISUAL takes precedence over our EDITOR default too."""
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.delenv("EDITOR", raising=False)
monkeypatch.setenv("VISUAL", "nvim")
monkeypatch.setattr(stdio, "_reconfigure_stream", lambda *a, **kw: None)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", lambda: None)
monkeypatch.setattr(stdio, "_default_windows_editor", lambda: "notepad")
stdio.configure_windows_stdio()
# EDITOR should NOT be set when VISUAL already is (prompt_toolkit
# checks VISUAL first anyway, but we also shouldn't override it).
assert os.environ.get("EDITOR", "") != "notepad"
assert os.environ["VISUAL"] == "nvim"
def test_respects_existing_env_var(self, monkeypatch):
"""User's explicit PYTHONIOENCODING wins over our default."""
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.setenv("PYTHONIOENCODING", "latin-1")
monkeypatch.setattr(stdio, "_reconfigure_stream", lambda *a, **kw: None)
monkeypatch.setattr(stdio, "_flip_console_code_page_to_utf8", lambda: None)
stdio.configure_windows_stdio()
assert os.environ["PYTHONIOENCODING"] == "latin-1"
@pytest.mark.parametrize("optout", ["1", "true", "True", "yes"])
def test_disable_flag_short_circuits(self, monkeypatch, optout):
from hermes_cli import stdio
monkeypatch.setattr(stdio, "is_windows", lambda: True)
monkeypatch.setenv("HERMES_DISABLE_WINDOWS_UTF8", optout)
reconfigure_hit = []
monkeypatch.setattr(
stdio,
"_reconfigure_stream",
lambda *a, **kw: reconfigure_hit.append(True),
)
result = stdio.configure_windows_stdio()
assert result is False
assert reconfigure_hit == [], "opt-out must skip stream reconfiguration"
def test_reconfigure_stream_handles_missing_method(self, monkeypatch):
"""StringIO-like objects without .reconfigure() must not blow up."""
from hermes_cli import stdio
import io
buf = io.StringIO()
# Must not raise
stdio._reconfigure_stream(buf)
# ---------------------------------------------------------------------------
# terminate_pid — the centralized kill primitive
# ---------------------------------------------------------------------------
class TestTerminatePidRoutingOnWindows:
"""``gateway.status.terminate_pid`` must use taskkill /T /F on Windows.
On Linux we can't reload gateway/status with sys.platform=win32 because
the module unconditionally imports ``msvcrt`` in that branch. Instead
we patch the module-level ``_IS_WINDOWS`` flag and ``subprocess.run``
on the already-loaded module, which exercises the same branching code.
"""
def test_force_uses_taskkill_on_windows(self, monkeypatch):
from gateway import status
captured = {}
def fake_run(args, **kwargs):
captured["args"] = args
result = MagicMock()
result.returncode = 0
result.stderr = ""
result.stdout = ""
return result
monkeypatch.setattr(status, "_IS_WINDOWS", True)
monkeypatch.setattr(status.subprocess, "run", fake_run)
status.terminate_pid(12345, force=True)
assert captured["args"][0] == "taskkill"
assert "/PID" in captured["args"]
assert "12345" in captured["args"]
assert "/T" in captured["args"]
assert "/F" in captured["args"]
def test_force_taskkill_failure_raises_oserror(self, monkeypatch):
from gateway import status
def fake_run(args, **kwargs):
result = MagicMock()
result.returncode = 128
result.stderr = "ERROR: The process cannot be terminated."
result.stdout = ""
return result
monkeypatch.setattr(status, "_IS_WINDOWS", True)
monkeypatch.setattr(status.subprocess, "run", fake_run)
with pytest.raises(OSError, match="cannot be terminated"):
status.terminate_pid(12345, force=True)
def test_graceful_on_windows_uses_os_kill_sigterm(self, monkeypatch):
"""Non-force path calls os.kill with SIGTERM (Windows has no SIGKILL).
``terminate_pid(pid)`` with force=False bypasses the taskkill branch
and uses ``os.kill`` directly, so the platform doesn't actually matter
for the signal choice. Verifies the getattr fallback works.
"""
from gateway import status
captured = {}
def fake_kill(pid, sig):
captured["pid"] = pid
captured["sig"] = sig
monkeypatch.setattr(status.os, "kill", fake_kill)
status.terminate_pid(99, force=False)
assert captured["pid"] == 99
assert captured["sig"] == signal.SIGTERM
def test_taskkill_not_found_falls_back_to_os_kill(self, monkeypatch):
"""On Windows without taskkill (WinPE, containers), fall back gracefully."""
from gateway import status
captured = {}
def fake_run(args, **kwargs):
raise FileNotFoundError(2, "taskkill not found")
def fake_kill(pid, sig):
captured["pid"] = pid
captured["sig"] = sig
monkeypatch.setattr(status, "_IS_WINDOWS", True)
monkeypatch.setattr(status.subprocess, "run", fake_run)
monkeypatch.setattr(status.os, "kill", fake_kill)
status.terminate_pid(42, force=True)
assert captured["pid"] == 42
assert captured["sig"] == signal.SIGTERM
# ---------------------------------------------------------------------------
# SIGKILL fallback pattern
# ---------------------------------------------------------------------------
class TestSigkillFallback:
"""Modules that want SIGKILL must fall back to SIGTERM when absent."""
def test_getattr_fallback_works_when_sigkill_missing(self, monkeypatch):
"""The `getattr(signal, "SIGKILL", signal.SIGTERM)` pattern."""
# Build a stand-in signal module with no SIGKILL attribute
fake_signal = MagicMock()
del fake_signal.SIGKILL # ensure it's absent
fake_signal.SIGTERM = 15
result = getattr(fake_signal, "SIGKILL", fake_signal.SIGTERM)
assert result == 15
def test_getattr_fallback_prefers_sigkill_when_present(self):
"""On POSIX the fallback is a no-op: real SIGKILL wins."""
result = getattr(signal, "SIGKILL", signal.SIGTERM)
assert result == signal.SIGKILL
@pytest.mark.parametrize(
"module_path, line_pattern",
[
("hermes_cli.kanban_db", 'getattr(signal, "SIGKILL", signal.SIGTERM)'),
],
)
def test_module_uses_getattr_fallback(self, module_path, line_pattern):
"""Source-level check that our modules use the safe fallback."""
rel = module_path.replace(".", "/") + ".py"
root = Path(__file__).resolve().parents[2]
source = (root / rel).read_text(encoding="utf-8")
assert line_pattern in source, (
f"{rel} must use the getattr fallback pattern on its SIGKILL site"
)
# ---------------------------------------------------------------------------
# OSError widening on os.kill(pid, 0) probes
# ---------------------------------------------------------------------------
class TestProcessRegistryOSErrorWidening:
"""_is_host_pid_alive must treat Windows' OSError as 'not alive'."""
def test_oserror_treated_as_not_alive(self, monkeypatch):
from tools.process_registry import ProcessRegistry
def fake_kill(pid, sig):
# Simulate Windows' WinError 87 for an unknown PID
raise OSError(22, "Invalid argument")
monkeypatch.setattr("tools.process_registry.os.kill", fake_kill)
assert ProcessRegistry._is_host_pid_alive(12345) is False
def test_permission_error_treated_as_not_alive(self, monkeypatch):
"""Conservative: PermissionError also means 'not alive' (matches existing behavior)."""
from tools.process_registry import ProcessRegistry
def fake_kill(pid, sig):
raise PermissionError(1, "Operation not permitted")
monkeypatch.setattr("tools.process_registry.os.kill", fake_kill)
assert ProcessRegistry._is_host_pid_alive(12345) is False
def test_zero_or_none_pid_returns_false_without_calling_kill(self, monkeypatch):
"""No wasted syscall on falsy pids."""
from tools.process_registry import ProcessRegistry
kill_calls = []
monkeypatch.setattr(
"tools.process_registry.os.kill",
lambda pid, sig: kill_calls.append(pid),
)
assert ProcessRegistry._is_host_pid_alive(None) is False
assert ProcessRegistry._is_host_pid_alive(0) is False
assert kill_calls == []
def test_alive_pid_returns_true(self, monkeypatch):
from tools.process_registry import ProcessRegistry
# os.kill returning None (default) means "probe succeeded → pid alive"
monkeypatch.setattr("tools.process_registry.os.kill", lambda pid, sig: None)
assert ProcessRegistry._is_host_pid_alive(os.getpid()) is True
# ---------------------------------------------------------------------------
# tzdata dependency
# ---------------------------------------------------------------------------
class TestTzdataDependencyDeclared:
"""Windows installs must pull tzdata for zoneinfo to work."""
def test_pyproject_declares_tzdata_for_win32(self):
root = Path(__file__).resolve().parents[2]
source = (root / "pyproject.toml").read_text(encoding="utf-8")
# The dependency line should be conditional on sys_platform == 'win32'
# and should NOT be in the core dependencies for Linux/macOS.
assert (
"tzdata>=2023.3; sys_platform == 'win32'" in source
or 'tzdata>=2023.3; sys_platform == "win32"' in source
), "tzdata must be a Windows-only dep in pyproject.toml dependencies"
# ---------------------------------------------------------------------------
# README / docs consistency
# ---------------------------------------------------------------------------
class TestReadmeNoLongerSaysWindowsUnsupported:
"""The README shouldn't claim native Windows isn't supported."""
def test_readme_does_not_say_not_supported(self):
root = Path(__file__).resolve().parents[2]
source = (root / "README.md").read_text(encoding="utf-8")
# Previous string (removed in this PR): "Native Windows is not supported"
assert "Native Windows is not supported" not in source, (
"README.md still says native Windows is not supported — update the "
"install copy to reflect the PowerShell installer."
)
def test_readme_mentions_powershell_installer(self):
root = Path(__file__).resolve().parents[2]
source = (root / "README.md").read_text(encoding="utf-8")
assert "install.ps1" in source, (
"README.md must point at scripts/install.ps1 for Windows users"
)
# ---------------------------------------------------------------------------
# pty_bridge graceful import on Windows
# ---------------------------------------------------------------------------
class TestWebServerPtyBridgeGuard:
"""The web server must not crash if pty_bridge can't import (Windows)."""
def test_import_guard_present_in_source(self):
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "web_server.py").read_text(encoding="utf-8")
assert "_PTY_BRIDGE_AVAILABLE" in source
assert "except ImportError" in source, (
"web_server.py must wrap the pty_bridge import in try/except ImportError"
)
def test_pty_handler_checks_availability_flag(self):
"""The /api/pty handler must short-circuit when the bridge is unavailable."""
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "web_server.py").read_text(encoding="utf-8")
assert "if not _PTY_BRIDGE_AVAILABLE" in source, (
"/api/pty handler must return a friendly error when PTY is unavailable"
)
# ---------------------------------------------------------------------------
# Entry points wire configure_windows_stdio
# ---------------------------------------------------------------------------
class TestEntryPointsConfigureStdio:
"""cli.py, hermes_cli/main.py, gateway/run.py must call configure_windows_stdio."""
@pytest.mark.parametrize(
"relpath",
["cli.py", "hermes_cli/main.py", "gateway/run.py"],
)
def test_entry_point_calls_configure_stdio(self, relpath):
root = Path(__file__).resolve().parents[2]
source = (root / relpath).read_text(encoding="utf-8")
assert "configure_windows_stdio" in source, (
f"{relpath} must call hermes_cli.stdio.configure_windows_stdio() "
"early in startup so Windows consoles render Unicode without crashing"
)
# ---------------------------------------------------------------------------
# _subprocess_compat shared helpers
# ---------------------------------------------------------------------------
class TestSubprocessCompatHelpers:
"""hermes_cli/_subprocess_compat.py POSIX + Windows behaviour."""
def test_is_windows_matches_sys_platform(self):
from hermes_cli import _subprocess_compat as sc
assert sc.IS_WINDOWS == (sys.platform == "win32")
def test_resolve_node_command_returns_absolute_on_posix(self):
"""On Linux, resolve_node_command('sh', ['-c','echo hi']) picks up /bin/sh."""
from hermes_cli._subprocess_compat import resolve_node_command
# We can't assert "npm is on PATH" portably; use `sh` which is
# guaranteed on POSIX. On Windows the test only confirms the
# no-crash fallback path.
argv = resolve_node_command("sh", ["-c", "echo hi"])
assert argv[1:] == ["-c", "echo hi"]
# First element is either an absolute path (sh found) or the bare
# name (fallback) — both are acceptable behaviours.
def test_resolve_node_command_fallback_when_absent(self):
from hermes_cli._subprocess_compat import resolve_node_command
argv = resolve_node_command(
"zzz-definitely-not-on-path-xyzzy", ["--help"]
)
# Must fall back to the bare name — NOT return None, NOT crash.
assert argv[0] == "zzz-definitely-not-on-path-xyzzy"
assert argv[1:] == ["--help"]
def test_windows_flags_zero_on_posix(self):
from hermes_cli._subprocess_compat import (
windows_detach_flags,
windows_hide_flags,
)
if sys.platform != "win32":
assert windows_detach_flags() == 0
assert windows_hide_flags() == 0
def test_windows_detach_popen_kwargs_is_posix_equivalent_on_posix(self):
from hermes_cli._subprocess_compat import windows_detach_popen_kwargs
kwargs = windows_detach_popen_kwargs()
if sys.platform != "win32":
# POSIX path MUST produce start_new_session=True, which maps to
# os.setsid() in the child — identical to the unchanged main
# branch behaviour. Do NOT break Linux/macOS here.
assert kwargs == {"start_new_session": True}
else:
# Windows path must include creationflags with all 3 bits set.
assert "creationflags" in kwargs
assert kwargs["creationflags"] != 0
# No start_new_session on Windows (silently no-op there).
assert "start_new_session" not in kwargs
def test_windows_detach_flags_has_expected_win32_bits(self, monkeypatch):
"""Simulate Windows to verify flag bundle."""
from hermes_cli import _subprocess_compat as sc
monkeypatch.setattr(sc, "IS_WINDOWS", True)
flags = sc.windows_detach_flags()
# CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW
assert flags & 0x00000200, "missing CREATE_NEW_PROCESS_GROUP"
assert flags & 0x00000008, "missing DETACHED_PROCESS"
assert flags & 0x08000000, "missing CREATE_NO_WINDOW"
# ---------------------------------------------------------------------------
# tui_gateway/entry.py signal installation survives absent POSIX signals
# ---------------------------------------------------------------------------
class TestTuiGatewayEntrySignalGuards:
"""Importing tui_gateway.entry must not crash when SIGPIPE/SIGHUP absent.
Linux has both signals, so this is mostly a source-level invariant check
(no bare ``signal.SIGPIPE`` at module level without a ``hasattr`` guard).
On Windows the import would have raised AttributeError before this fix.
"""
def test_source_guards_each_signal_installation(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tui_gateway" / "entry.py").read_text(encoding="utf-8")
# Every signal.signal(...) at module scope must be preceded by a
# hasattr check. We look at the text: no bare "signal.signal("
# call should appear outside a function body without a guard.
# Simpler heuristic: all SIGPIPE / SIGHUP references outside the
# dict-building loop must be wrapped in hasattr.
assert 'hasattr(signal, "SIGPIPE")' in source
assert 'hasattr(signal, "SIGHUP")' in source
assert 'hasattr(signal, "SIGTERM")' in source
assert 'hasattr(signal, "SIGINT")' in source
def test_module_imports_cleanly(self):
"""Importing the module must not raise — verifies the guards work."""
# Drop any cached import so the module re-initialises
for mod in list(sys.modules):
if mod.startswith("tui_gateway"):
del sys.modules[mod]
import tui_gateway.entry # noqa: F401 # must not raise
# ---------------------------------------------------------------------------
# hermes_cli/kanban_db.py waitpid guard
# ---------------------------------------------------------------------------
class TestKanbanWaitpidWindowsGuard:
"""os.WNOHANG doesn't exist on Windows — the dispatcher tick reap loop
must be gated behind ``os.name != "nt"``."""
def test_source_gates_waitpid_loop(self):
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "kanban_db.py").read_text(encoding="utf-8")
# Find the waitpid call and confirm it's inside a POSIX gate.
idx = source.find("os.waitpid(-1, os.WNOHANG)")
assert idx >= 0, "waitpid call must exist"
# Look backwards up to 400 chars for the gate.
preamble = source[max(0, idx - 400):idx]
assert 'os.name != "nt"' in preamble or "os.name != 'nt'" in preamble, (
"os.waitpid(-1, os.WNOHANG) must sit behind an os.name != 'nt' guard"
)
# ---------------------------------------------------------------------------
# code_execution_tool TCP loopback on Windows
# ---------------------------------------------------------------------------
class TestCodeExecutionTransportTcpFallback:
"""The RPC transport must fall back to TCP on Windows.
We can't easily execute the sandbox on Linux CI in Windows mode, but we
CAN assert that the generated client module supports both AF_UNIX and
AF_INET endpoints based on the HERMES_RPC_SOCKET format.
"""
def test_generated_client_handles_tcp_endpoint(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "code_execution_tool.py").read_text(encoding="utf-8")
# _UDS_TRANSPORT_HEADER body must parse both transports.
assert 'endpoint.startswith("tcp://")' in source, (
"generated sandbox client must accept tcp:// endpoints for Windows"
)
assert "socket.AF_INET" in source, (
"generated sandbox client must be able to open AF_INET sockets"
)
def test_server_side_branches_on_use_tcp_rpc(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "code_execution_tool.py").read_text(encoding="utf-8")
assert "_use_tcp_rpc = _IS_WINDOWS" in source
assert 'rpc_endpoint = f"tcp://{_host}:{_port}"' in source
# ---------------------------------------------------------------------------
# cron/scheduler.py /bin/bash dynamic resolution
# ---------------------------------------------------------------------------
class TestCronSchedulerBashResolution:
"""cron.scheduler must NOT hardcode /bin/bash — .sh scripts need a
dynamically-resolved bash so Windows (Git Bash) works."""
def test_source_uses_shutil_which_for_bash(self):
root = Path(__file__).resolve().parents[2]
source = (root / "cron" / "scheduler.py").read_text(encoding="utf-8")
# The old hardcoded path should be gone as the sole bash source.
# It may still appear as a POSIX fallback after shutil.which(), so
# we check for the shutil.which call near the .sh/.bash branch.
assert 'shutil.which("bash")' in source, (
"cron.scheduler must resolve bash dynamically via shutil.which"
)
def test_error_message_when_bash_missing(self):
root = Path(__file__).resolve().parents[2]
source = (root / "cron" / "scheduler.py").read_text(encoding="utf-8")
# The graceful-failure message must mention "bash not found" so
# Windows users without Git Bash see an actionable error instead
# of a WinError 2 traceback.
assert "bash not found" in source.lower()
# ---------------------------------------------------------------------------
# Node-ecosystem launcher resolution (npm / npx / node)
# ---------------------------------------------------------------------------
class TestNpmBareSpawnsResolved:
"""Every spawn site that launches ``npm``/``npx`` must resolve via
shutil.which / hermes_cli._subprocess_compat.resolve_node_command
so Windows can execute the .cmd batch shims."""
@pytest.mark.parametrize(
"relpath",
[
"hermes_cli/tools_config.py",
"hermes_cli/doctor.py",
"gateway/platforms/whatsapp.py",
"tools/browser_tool.py",
],
)
def test_no_bare_npm_or_npx_in_popen_argv(self, relpath):
"""Reject ``subprocess.run(["npm", ...])`` / ``["npx", ...]`` patterns.
Those fail on Windows with WinError 193. Callers must resolve
via shutil.which(...) and pass the absolute path (or fall back
to the bare name only as a last resort behind a variable).
"""
root = Path(__file__).resolve().parents[2]
source = (root / relpath).read_text(encoding="utf-8")
# The forbidden literal: a subprocess invocation that names npm
# or npx as a bare string inside an argv list.
forbidden_patterns = [
'["npm",',
'["npx",',
"['npm',",
"['npx',",
]
for pat in forbidden_patterns:
# Exception: an occurrence on a comment line is fine. Every other
# occurrence is treated as a bare argv literal and fails the test.
idx = 0
while True:
idx = source.find(pat, idx)
if idx < 0:
break
# Look at the preceding 120 chars: if the current line has a "#"
# before the match, treat the match as part of a comment and skip it.
context = source[max(0, idx - 120):idx]
if "#" in context.split("\n")[-1]:
idx += len(pat)
continue
# Argv forms that START with a bare npm/npx are the bug.
raise AssertionError(
f"{relpath}: bare {pat!r} still present at offset {idx}; "
f"resolve via shutil.which(...) so Windows can execute .cmd shims"
)
# ---------------------------------------------------------------------------
# tools/environments/local.py Windows temp dir & PATH injection
# ---------------------------------------------------------------------------
class TestLocalEnvironmentWindowsTempDir:
"""LocalEnvironment.get_temp_dir must return a native Windows path on
Windows, NOT the POSIX ``/tmp`` literal (which Python can't open)."""
def test_posix_path_preserved_on_linux(self):
"""Linux/macOS behaviour MUST be unchanged: return /tmp or a
tempfile.gettempdir()-derived POSIX path. This is the 'do no harm'
test: regressions here break every Unix user's terminal tool."""
from tools.environments.local import LocalEnvironment
env = LocalEnvironment(cwd="/tmp", timeout=10, env={})
tmp_dir = env.get_temp_dir()
if sys.platform != "win32":
assert tmp_dir.startswith("/"), (
f"POSIX temp dir must start with '/'; got {tmp_dir!r}"
)
def test_source_has_windows_branch_using_hermes_home(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "environments" / "local.py").read_text(encoding="utf-8")
assert "if _IS_WINDOWS:" in source
assert "get_hermes_home" in source
assert 'cache_dir = get_hermes_home() / "cache" / "terminal"' in source
class TestLocalEnvironmentPathInjectionGated:
"""The /usr/bin PATH injection in _make_run_env must be POSIX-only."""
def test_source_gates_path_injection(self):
root = Path(__file__).resolve().parents[2]
source = (root / "tools" / "environments" / "local.py").read_text(encoding="utf-8")
# The fix wraps the injection in `if not _IS_WINDOWS`.
assert 'not _IS_WINDOWS and "/usr/bin" not in existing_path.split(":")' in source
# ---------------------------------------------------------------------------
# cli.py git path normalization
# ---------------------------------------------------------------------------
class TestGitBashPathNormalization:
"""_normalize_git_bash_path should turn /c/Users/... into C:\\Users\\...
on Windows and leave paths unchanged on POSIX."""
def test_posix_noop(self):
"""Must NOT mutate paths on Linux/macOS."""
from cli import _normalize_git_bash_path
if sys.platform != "win32":
assert _normalize_git_bash_path("/home/teknium/foo") == "/home/teknium/foo"
assert _normalize_git_bash_path("/c/Users/foo") == "/c/Users/foo"
assert _normalize_git_bash_path("C:/Users/foo") == "C:/Users/foo"
assert _normalize_git_bash_path(None) is None
def test_empty_string_preserved(self):
from cli import _normalize_git_bash_path
assert _normalize_git_bash_path("") == ""
def test_windows_translation(self, monkeypatch):
"""Simulate Windows and verify /c/Users/... becomes C:\\Users\\..."""
import cli as cli_mod
monkeypatch.setattr(cli_mod.sys, "platform", "win32")
assert cli_mod._normalize_git_bash_path("/c/Users/foo") == r"C:\Users\foo"
assert cli_mod._normalize_git_bash_path("/C/Users/foo") == r"C:\Users\foo"
assert cli_mod._normalize_git_bash_path("/cygdrive/d/data") == r"D:\data"
assert cli_mod._normalize_git_bash_path("/mnt/c/Users") == r"C:\Users"
# Already-native path is preserved
assert cli_mod._normalize_git_bash_path(r"C:\Users\foo") == r"C:\Users\foo"
# Forward-slash Windows path is preserved (git on Windows often
# returns this form; it's valid for both bash and Python, so we
# don't need to translate).
assert cli_mod._normalize_git_bash_path("C:/Users/foo") == "C:/Users/foo"
class TestWorktreeSymlinkFallback:
""".worktreeinclude directory symlinks must fall back to copytree on
Windows (where symlink creation requires admin / Dev Mode)."""
def test_source_has_symlink_fallback(self):
root = Path(__file__).resolve().parents[2]
source = (root / "cli.py").read_text(encoding="utf-8")
# Look for the try/except that handles OSError around os.symlink
# with a shutil.copytree fallback.
assert "os.symlink(str(src_resolved), str(dst))" in source
assert "except (OSError, NotImplementedError)" in source
assert "shutil.copytree" in source
assert 'sys.platform == "win32"' in source
# ---------------------------------------------------------------------------
# Gateway detached watcher — Windows creationflags
# ---------------------------------------------------------------------------
class TestGatewayDetachedWatcherWindowsFlags:
"""launch_detached_profile_gateway_restart and the in-gateway update
launcher must use CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS on
Windows, not silent start_new_session=True."""
def test_hermes_cli_gateway_uses_compat_kwargs(self):
root = Path(__file__).resolve().parents[2]
source = (root / "hermes_cli" / "gateway.py").read_text(encoding="utf-8")
assert "windows_detach_popen_kwargs" in source, (
"hermes_cli/gateway.py must use the platform-aware detach helper"
)
# The legacy start_new_session=True on the outer Popen should be
# replaced by **windows_detach_popen_kwargs(). Inside the watcher
# STRING the old pattern is replaced by explicit creationflags.
assert "**windows_detach_popen_kwargs()" in source
def test_gateway_run_update_has_windows_branch(self):
root = Path(__file__).resolve().parents[2]
source = (root / "gateway" / "run.py").read_text(encoding="utf-8")
# Both the /restart and /update paths must have sys.platform=='win32' branches.
assert 'if sys.platform == "win32":' in source
# Windows branch uses windows_detach_popen_kwargs
assert "windows_detach_popen_kwargs" in source
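The tests above pin down the observable contract of the detach helper without showing its body. The following is a minimal sketch consistent with those assertions; it is an assumption about the helper's shape, not the actual contents of hermes_cli/_subprocess_compat.py. The names `detach_flags` and `detach_popen_kwargs` and the `is_windows` parameter are hypothetical, chosen so both branches can be exercised on any platform. The three flag values are the Win32 constants the tests check for.

```python
import sys

# Win32 process-creation flags, spelled out numerically because the
# subprocess.CREATE_* constants only exist on Windows builds of Python.
CREATE_NEW_PROCESS_GROUP = 0x00000200
DETACHED_PROCESS = 0x00000008
CREATE_NO_WINDOW = 0x08000000


def detach_flags(is_windows=None):
    """Return the creationflags bundle on Windows, 0 elsewhere."""
    if is_windows is None:
        is_windows = sys.platform == "win32"
    if not is_windows:
        return 0
    return CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW


def detach_popen_kwargs(is_windows=None):
    """Return Popen keyword arguments that detach a child process."""
    if is_windows is None:
        is_windows = sys.platform == "win32"
    if is_windows:
        # start_new_session is deliberately left out on this branch.
        return {"creationflags": detach_flags(True)}
    # POSIX: start_new_session=True makes the child call setsid().
    return {"start_new_session": True}
```

A caller would spread the result into the spawn call, for example `subprocess.Popen(argv, **detach_popen_kwargs())`, so the call site stays identical on both platforms.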
+34 -11
@@ -708,7 +708,16 @@ def _run_chrome_fallback_command(
)
return {"success": False, "error": hint}
cmd_prefix = ["npx", "agent-browser"] if browser_cmd == "npx agent-browser" else [browser_cmd]
# On Windows npx is npx.cmd — use shutil.which so CreateProcessW can
# execute the batch shim. shutil.which honours PATHEXT on Windows and
# returns the plain executable on POSIX. If npx isn't on PATH (Termux,
# bare container), fall back to the bare name and let Popen raise with
# a readable "FileNotFoundError: 'npx'" rather than WinError 193.
if browser_cmd == "npx agent-browser":
_npx_bin = shutil.which("npx") or "npx"
cmd_prefix = [_npx_bin, "agent-browser"]
else:
cmd_prefix = [browser_cmd]
base_args = cmd_prefix + ["--engine", "chrome", "--session", tmp_session, "--json"]
task_socket_dir = os.path.join(_socket_safe_tmpdir(), f"agent-browser-{tmp_session}")
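The hunk above resolves npx through shutil.which and keeps the bare name as a last resort. The same pattern can be sketched as a small standalone function. `resolve_launcher` and its `which` parameter are hypothetical names used only for illustration (the injectable `which` is there so the Windows case can be simulated on Linux); this is not the project's `resolve_node_command`.

```python
import shutil


def resolve_launcher(name, args, which=shutil.which):
    """Build an argv list whose first element is resolved on PATH.

    shutil.which consults PATHEXT on Windows, so a batch shim such as
    npx.cmd resolves to its full path. When nothing is found, the bare
    name is kept so the eventual spawn fails with a readable error.
    """
    resolved = which(name) or name
    return [resolved, *args]
```

Keeping the bare-name fallback means the function never returns None, so callers can pass the result straight to subprocess without an extra check.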
@@ -742,7 +751,7 @@ def _run_chrome_fallback_command(
proc.wait()
return {"success": False, "error": f"Chrome fallback '{cmd}' timed out"}
try:
with open(stdout_path, "r") as f:
with open(stdout_path, "r", encoding="utf-8") as f:
stdout = f.read().strip()
if stdout:
return json.loads(stdout.split("\n")[-1])
@@ -1101,7 +1110,7 @@ def _write_owner_pid(socket_dir: str, session_name: str) -> None:
"""
try:
path = os.path.join(socket_dir, f"{session_name}.owner_pid")
with open(path, "w") as f:
with open(path, "w", encoding="utf-8") as f:
f.write(str(os.getpid()))
except OSError as exc:
logger.debug("Could not write owner_pid file for %s: %s",
@@ -1165,7 +1174,7 @@ def _reap_orphaned_browser_sessions():
owner_alive: Optional[bool] = None # None = owner_pid missing/unreadable
if os.path.isfile(owner_pid_file):
try:
owner_pid = int(Path(owner_pid_file).read_text().strip())
owner_pid = int(Path(owner_pid_file).read_text(encoding="utf-8").strip())
try:
os.kill(owner_pid, 0)
owner_alive = True
@@ -1175,6 +1184,10 @@ def _reap_orphaned_browser_sessions():
# Owner exists but we can't signal it (different uid).
# Treat as alive — don't reap someone else's session.
owner_alive = True
except OSError:
# Windows: gone PID raises OSError (WinError 87) instead
# of ProcessLookupError. Treat as dead to match POSIX.
owner_alive = False
except (ValueError, OSError):
owner_alive = None # corrupt file — fall through
@@ -1196,7 +1209,7 @@ def _reap_orphaned_browser_sessions():
continue
try:
daemon_pid = int(Path(pid_file).read_text().strip())
daemon_pid = int(Path(pid_file).read_text(encoding="utf-8").strip())
except (ValueError, OSError):
shutil.rmtree(socket_dir, ignore_errors=True)
continue
@@ -1211,6 +1224,11 @@ def _reap_orphaned_browser_sessions():
except PermissionError:
# Alive but owned by someone else — leave it alone
continue
except OSError:
# Windows raises OSError (WinError 87) for a gone PID — treat
# as dead and clean up, mirroring the ProcessLookupError branch.
shutil.rmtree(socket_dir, ignore_errors=True)
continue
# Daemon is alive and its owner is dead (or legacy + untracked). Reap.
try:
@@ -1759,7 +1777,12 @@ def _run_browser_command(
# Keep concrete executable paths intact, even when they contain spaces.
# Only the synthetic npx fallback needs to expand into multiple argv items.
cmd_prefix = ["npx", "agent-browser"] if browser_cmd == "npx agent-browser" else [browser_cmd]
# shutil.which resolves npx → npx.cmd on Windows; bare "npx" stays on POSIX.
if browser_cmd == "npx agent-browser":
_npx_bin = shutil.which("npx") or "npx"
cmd_prefix = [_npx_bin, "agent-browser"]
else:
cmd_prefix = [browser_cmd]
cmd_parts = cmd_prefix + backend_args + [
"--json",
@@ -1811,7 +1834,7 @@ def _run_browser_command(
# Detect AppArmor user namespace restrictions (Ubuntu 23.10+)
_userns_restrict = "/proc/sys/kernel/apparmor_restrict_unprivileged_userns"
try:
with open(_userns_restrict) as _f:
with open(_userns_restrict, encoding="utf-8") as _f:
if _f.read().strip() == "1":
_needs_sandbox_bypass = True
logger.debug(
@@ -1856,9 +1879,9 @@ def _run_browser_command(
result = {"success": False, "error": f"Command timed out after {timeout} seconds"}
# Fall through to fallback check below
else:
with open(stdout_path, "r") as f:
with open(stdout_path, "r", encoding="utf-8") as f:
stdout = f.read()
with open(stderr_path, "r") as f:
with open(stderr_path, "r", encoding="utf-8") as f:
stderr = f.read()
returncode = proc.returncode
@@ -3157,7 +3180,7 @@ def _cleanup_single_browser_session(task_id: str) -> None:
pid_file = os.path.join(socket_dir, f"{session_name}.pid")
if os.path.isfile(pid_file):
try:
daemon_pid = int(Path(pid_file).read_text().strip())
daemon_pid = int(Path(pid_file).read_text(encoding="utf-8").strip())
os.kill(daemon_pid, signal.SIGTERM)
logger.debug("Killed daemon pid %s for %s", daemon_pid, session_name)
except (ProcessLookupError, ValueError, PermissionError, OSError):
@@ -3300,7 +3323,7 @@ def _running_in_docker() -> bool:
if os.path.exists("/.dockerenv"):
return True
try:
with open("/proc/1/cgroup", "rt") as fp:
with open("/proc/1/cgroup", "rt", encoding="utf-8") as fp:
return "docker" in fp.read()
except OSError:
return False
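The PID-probing branches above map each exception raised by `os.kill(pid, 0)` to an alive/dead verdict. A minimal sketch of that mapping follows, assuming a hypothetical helper `probe_pid` with an injectable `kill` callable (neither is part of the change) so each branch can be exercised without real processes. The order of the `except` clauses matters: `ProcessLookupError` and `PermissionError` are subclasses of `OSError`, so the generic clause has to come last.

```python
import os
from typing import Callable


def probe_pid(pid: int, kill: Callable[[int, int], None] = os.kill) -> bool:
    """Return True if *pid* looks alive, False if it looks gone.

    Hypothetical helper; it mirrors the branch order used in the diff.
    """
    try:
        kill(pid, 0)  # signal 0 is an existence check on POSIX
    except ProcessLookupError:
        return False  # POSIX: no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    except OSError:
        return False  # per the diff, Windows reports a gone PID this way
    return True
```

Passing a fake `kill` that raises a chosen exception is enough to check every branch in a unit test.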
+185 -43
@@ -47,10 +47,13 @@ import uuid
_IS_WINDOWS = platform.system() == "Windows"
from typing import Any, Dict, List, Optional
# Availability gate: UDS requires a POSIX OS
# Availability gate. On Windows we fall back to loopback TCP for the
# sandbox RPC transport (AF_UNIX is unreliable on Windows Python) — see
# ``_use_tcp_rpc`` in ``_execute_local`` below. That makes execute_code
# available on every platform Hermes itself runs on.
logger = logging.getLogger(__name__)
SANDBOX_AVAILABLE = sys.platform != "win32"
SANDBOX_AVAILABLE = True
# The 7 tools allowed inside the sandbox. The intersection of this list
# and the session's enabled tools determines which stubs are generated.
@@ -70,6 +73,85 @@ DEFAULT_MAX_TOOL_CALLS = 50
MAX_STDOUT_BYTES = 50_000 # 50 KB
MAX_STDERR_BYTES = 10_000 # 10 KB
# Environment variable scrubbing rules (shared between the local + remote
# backends). Secret-substring block is applied first; anything left must
# match either a safe prefix or, on Windows, an OS-essential name.
_SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
"TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
"XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
"HERMES_")
_SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
"PASSWD", "AUTH")
# Windows-only: a handful of variables are required by the OS/CRT itself.
# Without them, even stdlib calls like ``socket.socket()`` fail with
# WinError 10106 (Winsock can't locate mswsock.dll) and ``subprocess``
# can't resolve cmd.exe. These are well-known OS paths, not secrets, so
# we allow them through by exact name. The _SECRET_SUBSTRINGS block
# still runs as a safety net (none of these names match those substrings).
_WINDOWS_ESSENTIAL_ENV_VARS = frozenset({
"SYSTEMROOT", # %SYSTEMROOT%\System32 — Winsock needs this
"SYSTEMDRIVE", # C: (or wherever Windows lives)
"WINDIR", # usually same as SYSTEMROOT
"COMSPEC", # cmd.exe path — subprocess shell=True needs it
"PATHEXT", # .COM;.EXE;.BAT;... — shell lookup
"OS", # "Windows_NT" — some tools gate on this
"PROCESSOR_ARCHITECTURE",
"NUMBER_OF_PROCESSORS",
"PUBLIC", # C:\Users\Public
"ALLUSERSPROFILE", # C:\ProgramData — some stdlib paths use it
"PROGRAMDATA", # C:\ProgramData
"PROGRAMFILES",
"PROGRAMFILES(X86)",
"PROGRAMW6432",
"APPDATA", # %USERPROFILE%\AppData\Roaming — Python uses it
"LOCALAPPDATA", # %USERPROFILE%\AppData\Local
"USERPROFILE", # C:\Users\<name> — Python's expanduser uses it
"USERDOMAIN",
"USERNAME",
"HOMEDRIVE", # C:
"HOMEPATH", # \Users\<name>
"COMPUTERNAME",
})
def _scrub_child_env(source_env, is_passthrough=None, is_windows=None):
"""Produce the scrubbed child-process env for execute_code.
Rules (order matters):
1. Passthrough vars (skill- or config-declared) always pass.
2. Secret-substring names (KEY/TOKEN/etc.) are blocked.
3. Names matching a safe prefix pass.
4. On Windows, a small OS-essential allowlist passes by exact name;
without these the child can't even create a socket or spawn a
subprocess.
Extracted into a helper so tests can exercise the logic without
spawning a subprocess.
"""
if is_passthrough is None:
try:
from tools.env_passthrough import is_env_passthrough as _ep
except Exception:
_ep = lambda _: False # noqa: E731
is_passthrough = _ep
if is_windows is None:
is_windows = _IS_WINDOWS
scrubbed = {}
for k, v in source_env.items():
if is_passthrough(k):
scrubbed[k] = v
continue
if any(s in k.upper() for s in _SECRET_SUBSTRINGS):
continue
if any(k.startswith(p) for p in _SAFE_ENV_PREFIXES):
scrubbed[k] = v
continue
if is_windows and k.upper() in _WINDOWS_ESSENTIAL_ENV_VARS:
scrubbed[k] = v
return scrubbed
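The rule order in `_scrub_child_env` can be illustrated with a condensed, self-contained sketch. The names `scrub_env`, `SAFE_PREFIXES`, `SECRET_SUBSTRINGS` and `WINDOWS_ESSENTIAL` are shortened stand-ins chosen for this example, and the tuples are deliberately truncated; only the ordering of the checks is taken from the diff.

```python
from typing import Callable, Dict, Mapping

# Truncated stand-ins for the module-level constants in the diff.
SAFE_PREFIXES = ("PATH", "HOME", "LANG", "HERMES_")
SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET")
WINDOWS_ESSENTIAL = frozenset({"SYSTEMROOT", "COMSPEC"})


def scrub_env(
    source: Mapping[str, str],
    is_passthrough: Callable[[str], bool] = lambda _name: False,
    is_windows: bool = False,
) -> Dict[str, str]:
    """Apply the four rules in order: passthrough, secrets, prefixes, Windows."""
    kept: Dict[str, str] = {}
    for name, value in source.items():
        if is_passthrough(name):
            kept[name] = value  # rule 1: declared passthrough always wins
        elif any(s in name.upper() for s in SECRET_SUBSTRINGS):
            continue            # rule 2: secret-looking names are dropped
        elif any(name.startswith(p) for p in SAFE_PREFIXES):
            kept[name] = value  # rule 3: safe prefix
        elif is_windows and name.upper() in WINDOWS_ESSENTIAL:
            kept[name] = value  # rule 4: Windows-only exact-name allowlist
    return kept
```

Because the secret check runs before the prefix check, a variable such as `HERMES_API_KEY` is dropped even though it starts with `HERMES_`, unless it is declared as passthrough.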
def check_sandbox_requirements() -> bool:
"""Code execution sandbox requires a POSIX OS for Unix domain sockets."""
@@ -235,10 +317,27 @@ _call_lock = threading.Lock()
''' + _COMMON_HELPERS + '''\
def _connect():
"""Connect to the parent's RPC server via the transport it picked.
HERMES_RPC_SOCKET can be either:
- a filesystem path (POSIX Unix domain socket, the default on
Linux and macOS)
- a string of the form ``tcp://127.0.0.1:<port>`` (Windows, where
AF_UNIX is unreliable, so the parent falls back to loopback TCP)
"""
global _sock
if _sock is None:
_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
_sock.connect(os.environ["HERMES_RPC_SOCKET"])
endpoint = os.environ["HERMES_RPC_SOCKET"]
if endpoint.startswith("tcp://"):
# tcp://host:port (host is always 127.0.0.1 in practice — we
# only bind loopback server-side)
_host_port = endpoint[len("tcp://"):]
_host, _, _port = _host_port.rpartition(":")
_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
_sock.connect((_host or "127.0.0.1", int(_port)))
else:
_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
_sock.connect(endpoint)
_sock.settimeout(300)
return _sock
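The transport selection in `_connect` boils down to parsing one string. A small sketch of that parsing step, assuming a hypothetical helper `parse_rpc_endpoint` that returns a transport label instead of opening a socket (the labels `"tcp"` and `"unix"` are illustrative, not names from the change):

```python
from typing import Tuple, Union

Address = Union[str, Tuple[str, int]]


def parse_rpc_endpoint(endpoint: str) -> Tuple[str, Address]:
    """Split a HERMES_RPC_SOCKET value into (transport, address)."""
    prefix = "tcp://"
    if endpoint.startswith(prefix):
        # rpartition splits on the last colon, so the port is always the tail.
        host, _, port = endpoint[len(prefix):].rpartition(":")
        return "tcp", (host or "127.0.0.1", int(port))
    # Anything else is treated as a filesystem path for a Unix domain socket.
    return "unix", endpoint
```

Keeping the parse separate from the `socket.socket(...)` call makes the fallback host and the port conversion testable on any platform.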
@@ -291,9 +390,12 @@ def _call(tool_name, args):
req_file = os.path.join(_RPC_DIR, f"req_{seq_str}")
res_file = os.path.join(_RPC_DIR, f"res_{seq_str}")
# Write request atomically (write to .tmp, then rename)
# Write request atomically (write to .tmp, then rename).
# encoding="utf-8" is critical: on Windows-hosted remote backends
# (or any non-UTF-8 locale) the default open() mode would mangle
# non-ASCII chars in tool args when encoding them as JSON.
tmp = req_file + ".tmp"
with open(tmp, "w") as f:
with open(tmp, "w", encoding="utf-8") as f:
json.dump({"tool": tool_name, "args": args, "seq": seq}, f)
os.rename(tmp, req_file)
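The write-then-rename pattern used for request files can be sketched in isolation. This version is an illustration under stated assumptions: `write_json_atomically` is a hypothetical name, it uses `os.replace` rather than the `os.rename` seen in the diff (a swapped-in call that also overwrites an existing target on Windows), and it passes `ensure_ascii=False` so the non-ASCII text actually reaches the file as UTF-8 bytes.

```python
import json
import os
import tempfile


def write_json_atomically(path: str, payload: dict) -> None:
    """Write *payload* to a sibling .tmp file, then move it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False)
    # A reader polling for *path* never sees a half-written file.
    os.replace(tmp, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as rpc_dir:
        request = os.path.join(rpc_dir, "req_000001")
        write_json_atomically(request, {"tool": "read_file", "args": {"path": "café.txt"}})
        with open(request, encoding="utf-8") as f:
            print(json.load(f))
```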
@@ -306,7 +408,7 @@ def _call(tool_name, args):
time.sleep(poll_interval)
poll_interval = min(poll_interval * 1.2, 0.25) # Back off to 250ms
with open(res_file) as f:
with open(res_file, encoding="utf-8") as f:
raw = f.read()
# Clean up response file
@@ -415,7 +517,7 @@ def _rpc_server_loop(
# their status prints don't leak into the CLI spinner.
try:
_real_stdout, _real_stderr = sys.stdout, sys.stderr
devnull = open(os.devnull, "w")
devnull = open(os.devnull, "w", encoding="utf-8")
try:
sys.stdout = devnull
sys.stderr = devnull
@@ -689,7 +791,7 @@ def _rpc_poll_loop(
# Dispatch through the standard tool handler
try:
_real_stdout, _real_stderr = sys.stdout, sys.stderr
devnull = open(os.devnull, "w")
devnull = open(os.devnull, "w", encoding="utf-8")
try:
sys.stdout = devnull
sys.stderr = devnull
@@ -954,7 +1056,8 @@ def execute_code(
"""
if not SANDBOX_AVAILABLE:
return json.dumps({
"error": "execute_code is not available on Windows. Use normal tool calls instead."
"error": "execute_code sandbox is unavailable in this environment. "
"Use normal tool calls (terminal, read_file, write_file, ...) instead."
})
if not code or not code.strip():
@@ -988,8 +1091,22 @@ def execute_code(
# Use /tmp on macOS to avoid the long /var/folders/... path that pushes
# Unix domain socket paths past the 104-byte macOS AF_UNIX limit.
# On Linux, tempfile.gettempdir() already returns /tmp.
#
# Windows: Python 3.9+ added partial AF_UNIX support but the file-backed
# variant is flaky across Windows builds (requires Windows 10 1803+,
# still fails under some configurations, and the socket file can't live
# on the same temp drive as the script). Fall back to loopback TCP —
# same ephemeral port, same 1-connection listen queue, same serialized
# request/response framing. The generated client reads the transport
# selector from HERMES_RPC_SOCKET (path vs. ``tcp://host:port``).
_sock_tmpdir = "/tmp" if sys.platform == "darwin" else tempfile.gettempdir()
sock_path = os.path.join(_sock_tmpdir, f"hermes_rpc_{uuid.uuid4().hex}.sock")
_use_tcp_rpc = _IS_WINDOWS
if _use_tcp_rpc:
sock_path = None # not used on Windows; TCP endpoint stored below
rpc_endpoint = None # set after bind()
else:
sock_path = os.path.join(_sock_tmpdir, f"hermes_rpc_{uuid.uuid4().hex}.sock")
rpc_endpoint = sock_path
tool_call_log: list = []
tool_call_counter = [0] # mutable so the RPC thread can increment
@@ -997,21 +1114,42 @@ def execute_code(
server_sock = None
try:
# Write the auto-generated hermes_tools module
# Write the auto-generated hermes_tools module.
# encoding="utf-8" is required on Windows — the stub and user code
# both contain non-ASCII characters (em-dashes in docstrings, plus
# whatever the user script carries). Python's default open() uses
# the system locale on Windows (cp1252 typically), which corrupts
# those bytes; the child then fails to import with a SyntaxError
# ("'utf-8' codec can't decode byte 0x97 in position ...") because
# Python source files are decoded as UTF-8 by default (PEP 3120).
# sandbox_tools is already the correct set (intersection with session
# tools, or SANDBOX_ALLOWED_TOOLS as fallback — see lines above).
tools_src = generate_hermes_tools_module(list(sandbox_tools))
with open(os.path.join(tmpdir, "hermes_tools.py"), "w") as f:
with open(os.path.join(tmpdir, "hermes_tools.py"), "w", encoding="utf-8") as f:
f.write(tools_src)
# Write the user's script
with open(os.path.join(tmpdir, "script.py"), "w") as f:
with open(os.path.join(tmpdir, "script.py"), "w", encoding="utf-8") as f:
f.write(code)
# --- Start UDS server ---
server_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server_sock.bind(sock_path)
os.chmod(sock_path, 0o600)
# --- Start RPC server ---
# Two transports:
# POSIX: AF_UNIX stream socket on sock_path, chmod 0600 for
# owner-only access. Filesystem permissions gate the socket.
# Windows: AF_INET stream socket on 127.0.0.1 with an ephemeral
# port. No filesystem permission story, but loopback-only bind
# means only the current user's processes (not remote) can
# connect. HERMES_RPC_SOCKET is set to ``tcp://127.0.0.1:<port>``
# which the generated client parses to pick AF_INET.
if _use_tcp_rpc:
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.bind(("127.0.0.1", 0)) # ephemeral port
_host, _port = server_sock.getsockname()[:2]
rpc_endpoint = f"tcp://{_host}:{_port}"
else:
server_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server_sock.bind(sock_path)
os.chmod(sock_path, 0o600)
server_sock.listen(1)
rpc_thread = threading.Thread(
@@ -1030,31 +1168,32 @@ def execute_code(
# generated scripts. The child accesses tools via RPC, not direct API.
# Exception: env vars declared by loaded skills (via env_passthrough
# registry) or explicitly allowed by the user in config.yaml
# (terminal.env_passthrough) are passed through.
_SAFE_ENV_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM",
"TMPDIR", "TMP", "TEMP", "SHELL", "LOGNAME",
"XDG_", "PYTHONPATH", "VIRTUAL_ENV", "CONDA",
"HERMES_")
_SECRET_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL",
"PASSWD", "AUTH")
try:
from tools.env_passthrough import is_env_passthrough as _is_passthrough
except Exception:
_is_passthrough = lambda _: False # noqa: E731
child_env = {}
for k, v in os.environ.items():
# Passthrough vars (skill-declared or user-configured) always pass.
if _is_passthrough(k):
child_env[k] = v
continue
# Block vars with secret-like names.
if any(s in k.upper() for s in _SECRET_SUBSTRINGS):
continue
# Allow vars with known safe prefixes.
if any(k.startswith(p) for p in _SAFE_ENV_PREFIXES):
child_env[k] = v
child_env["HERMES_RPC_SOCKET"] = sock_path
# (terminal.env_passthrough) are passed through. On Windows, a small
# OS-essential allowlist (SYSTEMROOT, WINDIR, COMSPEC, ...) is also
# passed through — without those, the child can't create a socket
# or spawn a subprocess. See ``_scrub_child_env`` for the rules.
child_env = _scrub_child_env(os.environ)
child_env["HERMES_RPC_SOCKET"] = rpc_endpoint
child_env["PYTHONDONTWRITEBYTECODE"] = "1"
# Force UTF-8 for the child's stdio and default file encoding.
#
# Without this, on Windows sys.stdout is bound to the console code
# page (cp1252 on US-locale installs), and any script that does
# ``print("café")`` or ``print("→")`` crashes with:
#
# UnicodeEncodeError: 'charmap' codec can't encode character
# '\u2192' in position N: character maps to <undefined>
#
# PYTHONIOENCODING fixes sys.stdin/stdout/stderr.
# PYTHONUTF8=1 enables "UTF-8 mode" (PEP 540) which additionally
# makes ``open()``'s default encoding UTF-8, so user scripts that
# write files without specifying encoding= also work correctly.
#
# On POSIX both values usually match the locale default already,
# so setting them is harmless belt-and-suspenders for environments
# with a C/POSIX locale (containers, minimal base images).
child_env["PYTHONIOENCODING"] = "utf-8"
child_env["PYTHONUTF8"] = "1"
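The effect of the two variables can be checked with a short experiment. This is a sketch, not the sandbox launcher: `run_child_utf8` is a hypothetical helper, and it copies the parent's environment for brevity instead of building a scrubbed one as the diff does.

```python
import os
import subprocess
import sys


def run_child_utf8(code: str) -> bytes:
    """Run *code* in a child interpreter with UTF-8 stdio forced on."""
    env = dict(os.environ)  # simplification: the diff starts from a scrubbed env
    env["PYTHONIOENCODING"] = "utf-8"
    env["PYTHONUTF8"] = "1"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        check=True,
    )
    return result.stdout  # raw bytes, so the encoding is observable
```

Capturing bytes rather than text keeps the parent's own locale out of the picture: whatever the child wrote can be decoded explicitly as UTF-8 and compared.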
# Ensure the hermes-agent root is importable in the sandbox so
# repo-root modules are available to child scripts. We also prepend
# the staging tmpdir so ``from hermes_tools import ...`` resolves even
@@ -1302,7 +1441,10 @@ def execute_code(
import shutil
shutil.rmtree(tmpdir, ignore_errors=True)
try:
os.unlink(sock_path)
# Only UDS has a filesystem socket to unlink; TCP sockets are
# freed by server_sock.close() above.
if sock_path:
os.unlink(sock_path)
except OSError:
pass # already cleaned up or never created
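The loopback TCP fallback relies on binding port 0 and reading back the port the OS assigned. A minimal round-trip sketch, assuming hypothetical helper names (`start_loopback_server`, `echo_once`, `round_trip`) and a trivial echo in place of the real RPC framing:

```python
import socket
import threading


def start_loopback_server():
    """Bind an ephemeral loopback port and return (socket, endpoint string)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    host, port = server.getsockname()[:2]
    return server, f"tcp://{host}:{port}"


def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo back a single small message."""
    conn, _addr = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def round_trip(message: bytes) -> bytes:
    server, endpoint = start_loopback_server()
    worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
    worker.start()
    host, _, port = endpoint[len("tcp://"):].rpartition(":")
    with socket.create_connection((host, int(port)), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join(timeout=5)
    server.close()  # closing the socket is the only cleanup; there is no file to unlink
    return reply
```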
+52 -15
@@ -99,12 +99,33 @@ def get_sandbox_dir() -> Path:
def _pipe_stdin(proc: subprocess.Popen, data: str) -> None:
"""Write *data* to proc.stdin on a daemon thread to avoid pipe-buffer deadlocks."""
"""Write *data* to proc.stdin on a daemon thread to avoid pipe-buffer deadlocks.
On Windows, text-mode stdin (``text=True`` / ``encoding="utf-8"``)
translates ``\\n`` to ``\\r\\n`` as the data flows through the pipe,
which corrupts every write_file / patch call because the bytes that
land on disk include injected carriage returns. The file IS created,
but every subsequent byte-count / content compare against the
caller's ``\\n``-only string fails.
Workaround: write through ``proc.stdin.buffer`` (the underlying byte
buffer), encoding to UTF-8 ourselves. That bypasses Python's
newline translation entirely on every platform. No behaviour change
on POSIX: the byte sequence is identical to what text-mode would
produce there.
"""
def _write():
try:
proc.stdin.write(data)
proc.stdin.close()
# proc.stdin is a TextIOWrapper when text=True was set on the
# Popen. Its ``.buffer`` attribute is the raw BufferedWriter
# that bypasses newline translation. When Popen was created
# in byte mode, proc.stdin is already a BufferedWriter with
# no ``.buffer`` attribute — fall back to .write() directly.
raw = data.encode("utf-8") if isinstance(data, str) else data
target = getattr(proc.stdin, "buffer", proc.stdin)
target.write(raw)
target.close()
except (BrokenPipeError, OSError):
pass
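The `.buffer` workaround can be tried against a child that echoes its stdin byte for byte. This is a sketch under assumptions: `send_exact` and `_ECHO_CHILD` are hypothetical names, the payload is small enough that writing inline cannot fill the pipe (the diff uses a daemon thread for the general case), and the Popen is opened in text mode on purpose to show that `.buffer` still gives byte-level control.

```python
import subprocess
import sys

# Child that copies stdin bytes to stdout unchanged.
_ECHO_CHILD = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"


def send_exact(data: str) -> bytes:
    """Send *data* to a child as UTF-8 bytes and return what the child received."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _ECHO_CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        encoding="utf-8",
    )
    # In text mode proc.stdin is a TextIOWrapper; .buffer is the byte layer
    # beneath it. In byte mode there is no .buffer, so fall back to proc.stdin.
    target = getattr(proc.stdin, "buffer", proc.stdin)
    target.write(data.encode("utf-8"))
    target.close()
    echoed = proc.stdout.buffer.read()
    proc.stdout.close()
    proc.wait()
    return echoed
```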
@@ -137,7 +158,7 @@ def _load_json_store(path: Path) -> dict:
"""Load a JSON file as a dict, returning ``{}`` on any error."""
if path.exists():
try:
return json.loads(path.read_text())
return json.loads(path.read_text(encoding="utf-8"))
except Exception:
pass
return {}
@@ -146,7 +167,7 @@ def _load_json_store(path: Path) -> dict:
def _save_json_store(path: Path, data: dict) -> None:
"""Write *data* as pretty-printed JSON to *path*."""
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(data, indent=2))
path.write_text(json.dumps(data, indent=2), encoding="utf-8")
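Why the explicit `encoding="utf-8"` matters can be shown by reading the same bytes back under two encodings. The helper name `roundtrip` is hypothetical, and cp1252 stands in for a typical Windows locale default.

```python
import tempfile
from pathlib import Path


def roundtrip(text: str, write_enc: str, read_enc: str) -> str:
    """Write *text* with one encoding and read it back with another."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "store.txt"
        store.write_text(text, encoding=write_enc)
        return store.read_text(encoding=read_enc)
```

With matching encodings the text survives; with a UTF-8 writer and a cp1252 reader the two bytes of `é` are decoded as two separate characters, which is the silent corruption the lint rule is meant to prevent.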
def _file_mtime_key(host_path: str) -> tuple[float, int] | None:
@@ -339,15 +360,24 @@ class BaseEnvironment(ABC):
# change the working directory (e.g. bashrc `cd ~`). Without this,
# pwd -P captures the profile's directory, not terminal.cwd.
_quoted_cwd = shlex.quote(self.cwd)
# Quote the snapshot / cwd-file paths so Git Bash on Windows handles
# ``C:/Users/...``-shaped paths without glob-splitting the colon or
# tripping on drive letters. On POSIX this is a no-op (no colons /
# special chars in a /tmp path). Previously unquoted interpolation
# caused ``C:/Users/.../hermes-snap-*.sh: No such file or directory``
# errors on Windows, leaking via stderr (merged into stdout on Linux
# backends) into every terminal-tool response.
_quoted_snap = shlex.quote(self._snapshot_path)
_quoted_cwd_file = shlex.quote(self._cwd_file)
bootstrap = (
f"export -p > {self._snapshot_path}\n"
f"declare -f | grep -vE '^_[^_]' >> {self._snapshot_path}\n"
f"alias -p >> {self._snapshot_path}\n"
f"echo 'shopt -s expand_aliases' >> {self._snapshot_path}\n"
f"echo 'set +e' >> {self._snapshot_path}\n"
f"echo 'set +u' >> {self._snapshot_path}\n"
f"export -p > {_quoted_snap}\n"
f"declare -f | grep -vE '^_[^_]' >> {_quoted_snap}\n"
f"alias -p >> {_quoted_snap}\n"
f"echo 'shopt -s expand_aliases' >> {_quoted_snap}\n"
f"echo 'set +e' >> {_quoted_snap}\n"
f"echo 'set +u' >> {_quoted_snap}\n"
f"builtin cd {_quoted_cwd} 2>/dev/null || true\n"
f"pwd -P > {self._cwd_file} 2>/dev/null || true\n"
f"pwd -P > {_quoted_cwd_file} 2>/dev/null || true\n"
f"printf '\\n{self._cwd_marker}%s{self._cwd_marker}\\n' \"$(pwd -P)\"\n"
)
try:
@@ -389,6 +419,13 @@ class BaseEnvironment(ABC):
re-dumps env vars, and emits CWD markers."""
escaped = command.replace("'", "'\\''")
# Quote the snapshot / cwd-file paths so Git Bash on Windows handles
# ``C:/Users/...``-shaped paths without glob-splitting the colon or
# tripping on drive letters. POSIX paths are unaffected. See
# :meth:`init_session` for the same fix on the bootstrap block.
_quoted_snap = shlex.quote(self._snapshot_path)
_quoted_cwd_file = shlex.quote(self._cwd_file)
parts = []
# Source snapshot (env vars from previous commands).
@@ -399,7 +436,7 @@ class BaseEnvironment(ABC):
# silent here, but the redirect is harmless.
if self._snapshot_ready:
parts.append(
f"source {self._snapshot_path} >/dev/null 2>&1 || true"
f"source {_quoted_snap} >/dev/null 2>&1 || true"
)
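What `shlex.quote` does to the interpolated paths can be seen with a small sketch. The helper `source_snapshot_command` and both example paths are hypothetical; the point is that a path made only of shell-safe characters comes back unchanged, while one containing a space is wrapped in single quotes so the shell sees a single word.

```python
import shlex


def source_snapshot_command(snapshot_path: str) -> str:
    """Build the 'source <snapshot>' line with the path quoted for the shell."""
    return f"source {shlex.quote(snapshot_path)} >/dev/null 2>&1 || true"
```

`shlex.quote` is documented for POSIX-style shells, which covers the bash that these command strings are handed to.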
# Preserve bare ``~`` expansion, but rewrite ``~/...`` through
@@ -414,10 +451,10 @@ class BaseEnvironment(ABC):
# Re-dump env vars to snapshot (last-writer-wins for concurrent calls)
if self._snapshot_ready:
parts.append(f"export -p > {self._snapshot_path} 2>/dev/null || true")
parts.append(f"export -p > {_quoted_snap} 2>/dev/null || true")
# Write CWD to file (local reads this) and stdout marker (remote parses this)
parts.append(f"pwd -P > {self._cwd_file} 2>/dev/null || true")
parts.append(f"pwd -P > {_quoted_cwd_file} 2>/dev/null || true")
# Use a distinct line for the marker. The leading \n ensures
# the marker starts on its own line even if the command doesn't
# end with a newline (e.g. printf 'exact'). We'll strip this
+1 -1
@@ -284,7 +284,7 @@ class FileSyncManager:
# Windows: no flock — run without serialization
self._sync_back_impl()
return
lock_fd = open(lock_path, "w")
lock_fd = open(lock_path, "w", encoding="utf-8")
try:
fcntl.flock(lock_fd, fcntl.LOCK_EX)
self._sync_back_impl()
+53 -3
@@ -9,6 +9,7 @@ import signal
import subprocess
import tempfile
import time
from pathlib import Path
from tools.environments.base import BaseEnvironment, _pipe_stdin
@@ -189,6 +190,25 @@ def _find_bash() -> str:
if custom and os.path.isfile(custom):
return custom
# Prefer our own portable Git install first — this way a broken or
# partially-uninstalled system Git can't hijack the bash lookup. The
# install.ps1 installer always drops portable Git here when the user
# didn't already have a working system Git.
#
# Layouts (both checked so upgrades between MinGit and PortableGit
# installs work transparently):
# PortableGit: %LOCALAPPDATA%\hermes\git\bin\bash.exe (primary)
# MinGit: %LOCALAPPDATA%\hermes\git\usr\bin\bash.exe (legacy/32-bit fallback)
_local_appdata = os.environ.get("LOCALAPPDATA", "")
_hermes_portable_git = os.path.join(_local_appdata, "hermes", "git") if _local_appdata else ""
if _hermes_portable_git:
for candidate in (
os.path.join(_hermes_portable_git, "bin", "bash.exe"), # PortableGit (primary)
os.path.join(_hermes_portable_git, "usr", "bin", "bash.exe"), # MinGit fallback
):
if os.path.isfile(candidate):
return candidate
found = shutil.which("bash")
if found:
return found
@@ -196,7 +216,7 @@ def _find_bash() -> str:
for candidate in (
os.path.join(os.environ.get("ProgramFiles", r"C:\Program Files"), "Git", "bin", "bash.exe"),
os.path.join(os.environ.get("ProgramFiles(x86)", r"C:\Program Files (x86)"), "Git", "bin", "bash.exe"),
os.path.join(os.environ.get("LOCALAPPDATA", ""), "Programs", "Git", "bin", "bash.exe"),
os.path.join(_local_appdata, "Programs", "Git", "bin", "bash.exe"),
):
if candidate and os.path.isfile(candidate):
return candidate
@@ -235,7 +255,15 @@ def _make_run_env(env: dict) -> dict:
elif k not in _HERMES_PROVIDER_ENV_BLOCKLIST or _is_passthrough(k):
run_env[k] = v
existing_path = run_env.get("PATH", "")
if "/usr/bin" not in existing_path.split(":"):
# The "/usr/bin not already present → inject sane POSIX path" heuristic
# only makes sense on POSIX. On Windows the PATH separator is ";"
# (the split(":") above turns a full Windows PATH into a single
# unrecognisable chunk, which then triggers prepending POSIX paths
# to a Windows PATH — completely wrong). Skip the injection entirely
# on Windows; the native PATH already points at whatever shell
# Hermes is driving via _find_bash (Git Bash), and Git Bash itself
# prepends its MSYS2 /usr/bin equivalent via the shell-init files.
if not _IS_WINDOWS and "/usr/bin" not in existing_path.split(":"):
run_env["PATH"] = f"{existing_path}:{_SANE_PATH}" if existing_path else _SANE_PATH
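The guarded PATH heuristic can be expressed as a pure function. In this sketch `with_sane_path` is a hypothetical name and `SANE_PATH` is an illustrative stand-in for the module's `_SANE_PATH`, whose real value is not shown in this hunk.

```python
SANE_PATH = "/usr/local/bin:/usr/bin:/bin"  # illustrative stand-in for _SANE_PATH


def with_sane_path(existing: str, is_windows: bool) -> str:
    """Append POSIX defaults only on POSIX, and only when /usr/bin is missing."""
    if is_windows or "/usr/bin" in existing.split(":"):
        return existing
    return f"{existing}:{SANE_PATH}" if existing else SANE_PATH
```

Splitting a Windows PATH on `":"` never yields a `/usr/bin` entry (the only colons are the ones after drive letters), so without the `is_windows` guard the POSIX defaults would always be appended.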
# Per-profile HOME isolation: redirect system tool configs (git, ssh, gh,
@@ -357,7 +385,29 @@ class LocalEnvironment(BaseEnvironment):
Check the environment configured for this backend first so callers can
override the temp root explicitly (for example via terminal.env or a
custom TMPDIR), then fall back to the host process environment.
**Windows:** hardcoded ``/tmp`` is wrong in two ways: native Python
can't open the path, and the Windows default temp (``%TEMP%``) often
contains spaces (``C:\\Users\\Some Name\\AppData\\Local\\Temp``) that
break unquoted bash interpolations. Use a dedicated cache dir under
``HERMES_HOME`` instead: single-word path, guaranteed to exist, same
string resolves in both Git Bash and native Python.
"""
if _IS_WINDOWS:
# Derive a Windows-safe temp dir under HERMES_HOME. Using
# forward slashes makes the same string work unchanged in bash
# command interpolations AND in Python ``open()`` — Windows
# accepts forward slashes in filesystem paths, and we control
# the path so we can guarantee no spaces.
try:
from hermes_constants import get_hermes_home
cache_dir = get_hermes_home() / "cache" / "terminal"
except Exception:
cache_dir = Path(tempfile.gettempdir()) / "hermes_terminal"
cache_dir.mkdir(parents=True, exist_ok=True)
# Force forward slashes so the same string serves both contexts.
return str(cache_dir).replace("\\", "/")
for env_var in ("TMPDIR", "TMP", "TEMP"):
candidate = self.env.get(env_var) or os.environ.get(env_var)
if candidate and candidate.startswith("/"):
@@ -512,7 +562,7 @@ class LocalEnvironment(BaseEnvironment):
``_run_bash`` recovery path will resolve a safe fallback if needed.
"""
try:
with open(self._cwd_file) as f:
with open(self._cwd_file, encoding="utf-8") as f:
cwd_path = f.read().strip()
if cwd_path and os.path.isdir(cwd_path):
self.cwd = cwd_path
+12 -2
@@ -966,11 +966,21 @@ class ShellFileOperations(FileOperations):
verify_result = self._exec(verify_cmd)
if verify_result.exit_code != 0:
return PatchResult(error=f"Post-write verification failed: could not re-read {path}")
if verify_result.stdout != new_content:
# Normalize line endings before comparing. On Windows, Python's
# default text-mode ``open()`` translates ``\n`` → ``\r\n`` on
# write, so the file on disk legitimately holds CRLFs while our
# ``new_content`` string has bare LFs. Without this normalization
# every patch on Windows returns a bogus "wrote 39, read 42"
# false-negative even though the edit landed correctly. POSIX
# backends don't translate, so this is a no-op there.
_verify_stdout_normalized = verify_result.stdout.replace("\r\n", "\n").replace("\r", "\n")
_new_content_normalized = new_content.replace("\r\n", "\n").replace("\r", "\n")
if _verify_stdout_normalized != _new_content_normalized:
return PatchResult(error=(
f"Post-write verification failed for {path}: on-disk content "
f"differs from intended write "
f"(wrote {len(new_content)} chars, read back {len(verify_result.stdout)}). "
f"(wrote {len(_new_content_normalized)} chars, read back "
f"{len(_verify_stdout_normalized)} chars after normalizing line endings). "
"The patch did not persist. Re-read the file and try again."
))
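The line-ending normalization used in the post-write check is easy to isolate. A minimal sketch, with `normalize_newlines` and `contents_match` as hypothetical helper names:

```python
def normalize_newlines(text: str) -> str:
    """Collapse CRLF and bare CR to LF; CRLF must be replaced first."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def contents_match(written: str, read_back: str) -> bool:
    """Compare two strings while ignoring line-ending style only."""
    return normalize_newlines(written) == normalize_newlines(read_back)
```

Replacing `\r\n` before `\r` keeps a CRLF pair from turning into two newlines; genuine content differences, such as an extra blank line, still fail the comparison.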
+1 -1
@@ -1992,7 +1992,7 @@ def _snapshot_child_pids() -> set:
# Linux: read from /proc
try:
children_path = f"/proc/{my_pid}/task/{my_pid}/children"
with open(children_path) as f:
with open(children_path, encoding="utf-8") as f:
return {int(p) for p in f.read().split() if p.strip()}
except (FileNotFoundError, OSError, ValueError):
pass
+5 -1
@@ -407,7 +407,11 @@ class ProcessRegistry:
try:
os.kill(pid, 0)
return True
except (ProcessLookupError, PermissionError):
except (ProcessLookupError, PermissionError, OSError):
# OSError covers Windows' WinError 87 for a gone PID, and the
# ``WinError 5 Access denied`` case — treat both as "can't probe
# or process is gone", which matches the conservative
# "not alive" semantics callers already handle.
return False
def _refresh_detached_session(self, session: Optional[ProcessSession]) -> Optional[ProcessSession]:
+7 -7
@@ -169,7 +169,7 @@ def _scan_environments() -> List[EnvironmentInfo]:
continue
try:
with open(py_file, "r") as f:
with open(py_file, "r", encoding="utf-8") as f:
tree = ast.parse(f.read())
for node in ast.walk(tree):
@@ -333,7 +333,7 @@ async def _spawn_training_run(run_state: RunState, config_path: Path):
# File must stay open while the subprocess runs; we store the handle
# on run_state so _stop_training_run() can close it when done.
api_log_file = open(api_log, "w") # closed by _stop_training_run
api_log_file = open(api_log, "w", encoding="utf-8") # closed by _stop_training_run
run_state.api_log_file = api_log_file
run_state.api_process = subprocess.Popen(
["run-api"],
@@ -356,7 +356,7 @@ async def _spawn_training_run(run_state: RunState, config_path: Path):
# Step 2: Start the Tinker trainer
logger.info("[%s] Starting Tinker trainer: launch_training.py --config %s", run_id, config_path)
trainer_log_file = open(trainer_log, "w") # closed by _stop_training_run
trainer_log_file = open(trainer_log, "w", encoding="utf-8") # closed by _stop_training_run
run_state.trainer_log_file = trainer_log_file
run_state.trainer_process = subprocess.Popen(
[sys.executable, "launch_training.py", "--config", str(config_path)],
@@ -397,7 +397,7 @@ async def _spawn_training_run(run_state: RunState, config_path: Path):
logger.info("[%s] Starting environment: %s serve", run_id, env_info.file_path)
env_log_file = open(env_log, "w") # closed by _stop_training_run
env_log_file = open(env_log, "w", encoding="utf-8") # closed by _stop_training_run
run_state.env_log_file = env_log_file
run_state.env_process = subprocess.Popen(
[sys.executable, str(env_info.file_path), "serve", "--config", str(config_path)],
@@ -777,7 +777,7 @@ async def rl_start_training() -> str:
if "wandb_name" in _current_config and _current_config["wandb_name"]:
run_config["env"]["wandb_name"] = _current_config["wandb_name"]
with open(config_path, "w") as f:
with open(config_path, "w", encoding="utf-8") as f:
yaml.dump(run_config, f, default_flow_style=False)
# Create run state
@@ -1206,7 +1206,7 @@ async def rl_test_inference(
stderr_text = "\n".join(stderr_lines)
# Write logs to files for inspection outside CLI
with open(log_file, "w") as f:
with open(log_file, "w", encoding="utf-8") as f:
f.write(f"Command: {cmd_display}\n")
f.write(f"Working dir: {TINKER_ATROPOS_ROOT}\n")
f.write(f"Return code: {process.returncode}\n")
@@ -1238,7 +1238,7 @@ async def rl_test_inference(
# Parse the output JSONL file
if output_file.exists():
# Read JSONL file (one JSON object per line = one step)
with open(output_file, "r") as f:
with open(output_file, "r", encoding="utf-8") as f:
for line in f:
line = line.strip()
if not line:
+2 -2
@@ -219,7 +219,7 @@ class GitHubAuth:
key_file = Path(key_path)
if not key_file.exists():
return None
private_key = key_file.read_text()
private_key = key_file.read_text(encoding="utf-8")
now = int(time.time())
payload = {
@@ -2667,7 +2667,7 @@ def append_audit_log(action: str, skill_name: str, source: str,
parts.append(extra)
line = " ".join(parts) + "\n"
try:
with open(AUDIT_LOG, "a") as f:
with open(AUDIT_LOG, "a", encoding="utf-8") as f:
f.write(line)
except OSError as e:
logger.debug("Could not write audit log: %s", e)
+3 -3
@@ -126,7 +126,7 @@ def _read_failure_reason() -> str | None:
mtime = os.path.getmtime(p)
if (time.time() - mtime) >= _MARKER_TTL:
return None
with open(p, "r") as f:
with open(p, "r", encoding="utf-8") as f:
return f.read().strip()
except OSError:
return None
@@ -160,7 +160,7 @@ def _mark_install_failed(reason: str = ""):
try:
p = _failure_marker_path()
os.makedirs(os.path.dirname(p), exist_ok=True)
with open(p, "w") as f:
with open(p, "w", encoding="utf-8") as f:
f.write(reason)
except OSError:
pass
@@ -257,7 +257,7 @@ def _verify_cosign(checksums_path: str, sig_path: str, cert_path: str) -> bool |
def _verify_checksum(archive_path: str, checksums_path: str, archive_name: str) -> bool:
"""Verify SHA-256 of the archive against checksums.txt."""
expected = None
with open(checksums_path) as f:
with open(checksums_path, encoding="utf-8") as f:
for line in f:
# Format: "<hash> <filename>"
parts = line.strip().split(" ", 1)
+1 -1
@@ -110,7 +110,7 @@ def detect_audio_environment() -> dict:
# WSL detection — PulseAudio bridge makes audio work in WSL.
# Only block if PULSE_SERVER is not configured.
try:
with open('/proc/version', 'r') as f:
with open('/proc/version', 'r', encoding="utf-8") as f:
if 'microsoft' in f.read().lower():
if os.environ.get('PULSE_SERVER'):
notices.append("Running in WSL with PulseAudio bridge")
+130
@@ -0,0 +1,130 @@
"""Brave Search web search provider (free tier).
Brave Search's Data-for-Search API offers a free tier (2,000 queries/mo at the
time of writing) after signing up at https://brave.com/search/api/. This
provider implements ``WebSearchProvider`` only: the Data-for-Search endpoint
returns search results; it does not extract/crawl arbitrary URLs.
Configuration::
# ~/.hermes/.env
BRAVE_SEARCH_API_KEY=your-subscription-token
# ~/.hermes/config.yaml
web:
search_backend: "brave-free"
extract_backend: "firecrawl" # pair with an extract provider if needed
The API uses the ``X-Subscription-Token`` header. Free-tier keys are rate
limited (1 qps) and capped at 2k queries/month; see the Brave dashboard for
current quotas.
"""
from __future__ import annotations
import logging
import os
from typing import Any, Dict
from tools.web_providers.base import WebSearchProvider
logger = logging.getLogger(__name__)
_BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"
class BraveFreeSearchProvider(WebSearchProvider):
"""Search via the Brave Search API (free tier).
Requires ``BRAVE_SEARCH_API_KEY`` to be set. The value is passed as the
``X-Subscription-Token`` header. No extract capability; pair with
Firecrawl/Tavily/Exa/Parallel when you also need ``web_extract``.
"""
def provider_name(self) -> str:
return "brave-free"
def is_configured(self) -> bool:
"""Return True when ``BRAVE_SEARCH_API_KEY`` is set to a non-empty value."""
return bool(os.getenv("BRAVE_SEARCH_API_KEY", "").strip())
def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
"""Execute a search against the Brave Search API.
Returns normalized results::
{
"success": True,
"data": {
"web": [
{
"title": str,
"url": str,
"description": str,
"position": int,
},
...
]
}
}
On failure returns ``{"success": False, "error": str}``.
"""
import httpx
api_key = os.getenv("BRAVE_SEARCH_API_KEY", "").strip()
if not api_key:
return {"success": False, "error": "BRAVE_SEARCH_API_KEY is not set"}
# Brave's `count` is capped at 20.
count = max(1, min(int(limit), 20))
try:
resp = httpx.get(
_BRAVE_ENDPOINT,
params={"q": query, "count": count},
headers={
"X-Subscription-Token": api_key,
"Accept": "application/json",
},
timeout=15,
)
resp.raise_for_status()
except httpx.HTTPStatusError as exc:
logger.warning("Brave Search HTTP error: %s", exc)
return {
"success": False,
"error": f"Brave Search returned HTTP {exc.response.status_code}",
}
except httpx.RequestError as exc:
logger.warning("Brave Search request error: %s", exc)
return {"success": False, "error": f"Could not reach Brave Search: {exc}"}
try:
data = resp.json()
except Exception as exc: # noqa: BLE001
logger.warning("Brave Search response parse error: %s", exc)
return {"success": False, "error": "Could not parse Brave Search response as JSON"}
raw_results = (data.get("web") or {}).get("results", []) or []
truncated = raw_results[:limit]
web_results = [
{
"title": str(r.get("title", "")),
"url": str(r.get("url", "")),
"description": str(r.get("description", "")),
"position": i + 1,
}
for i, r in enumerate(truncated)
]
logger.info(
"Brave Search '%s': %d results (from %d raw, limit %d)",
query,
len(web_results),
len(raw_results),
limit,
)
return {"success": True, "data": {"web": web_results}}
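The normalization step in the new provider can be exercised without network access. The sketch below is an illustration under assumptions: `normalize_brave_results` is a hypothetical helper name, and the payload shape (`web.results` with `title`, `url`, `description`) is taken from the code above. Unlike the committed code, it slices with the clamped count instead of the raw `limit`, so a zero or negative limit still yields one result.

```python
from typing import Any, Dict, List


def normalize_brave_results(payload: Dict[str, Any], limit: int = 5) -> List[Dict[str, Any]]:
    """Map a Brave-style payload onto the normalized result dicts."""
    # "web" may be missing or null; fall back to an empty result list.
    raw = (payload.get("web") or {}).get("results") or []
    # Clamp to the 1..20 range, matching the cap on Brave's `count` noted above.
    count = max(1, min(int(limit), 20))
    return [
        {
            "title": str(r.get("title", "")),
            "url": str(r.get("url", "")),
            "description": str(r.get("description", "")),
            "position": i + 1,  # 1-based rank in the returned list
        }
        for i, r in enumerate(raw[:count])
    ]
```

Keeping the mapping in a pure function like this makes it easy to test against recorded payloads.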
+98
@@ -0,0 +1,98 @@
"""DuckDuckGo web search provider via the ``ddgs`` Python package.
DuckDuckGo does not provide an official programmatic search API. The
community-maintained `ddgs <https://pypi.org/project/ddgs/>`_ package (the
renamed successor of ``duckduckgo-search``) scrapes DuckDuckGo's HTML results
page and normalizes them. It implements ``WebSearchProvider`` only; there is
no extract capability.
Configuration::
# No API key required. Enable by installing the package and pointing the
# web backend at ddgs:
pip install ddgs
# ~/.hermes/config.yaml
web:
search_backend: "ddgs"
extract_backend: "firecrawl" # pair with an extract provider if needed
Rate limits are enforced server-side by DuckDuckGo. Expect intermittent
``DuckDuckGoSearchException`` / 202 responses under heavy use; this provider
surfaces them as ``{"success": False, "error": ...}`` rather than crashing
the tool call.
See https://duckduckgo.com/?q=duckduckgo+tos for terms of use.
"""
from __future__ import annotations
import logging
from typing import Any, Dict
from tools.web_providers.base import WebSearchProvider
logger = logging.getLogger(__name__)
class DDGSSearchProvider(WebSearchProvider):
"""Search via the ``ddgs`` package (DuckDuckGo HTML scrape).
No API key required. The provider is considered "configured" when the
``ddgs`` package is importable; there is nothing else to set up.
"""
def provider_name(self) -> str:
return "ddgs"
def is_configured(self) -> bool:
"""Return True when the ``ddgs`` package is importable.
Called at tool-registration time; must not perform network I/O.
"""
try:
import ddgs # noqa: F401
return True
except ImportError:
return False
def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
"""Execute a DuckDuckGo search and return normalized results.
Returns ``{"success": True, "data": {"web": [...]}}`` on success or
``{"success": False, "error": str}`` on failure (missing package,
rate-limited, network error, etc.).
"""
try:
from ddgs import DDGS # type: ignore
except ImportError:
return {
"success": False,
"error": "ddgs package is not installed — run `pip install ddgs`",
}
# DDGS().text yields at most `max_results` items; we cap defensively
# in case the package ignores the hint.
safe_limit = max(1, int(limit))
try:
web_results = []
with DDGS() as client:
for i, hit in enumerate(client.text(query, max_results=safe_limit)):
if i >= safe_limit:
break
url = str(hit.get("href") or hit.get("url") or "")
web_results.append(
{
"title": str(hit.get("title", "")),
"url": url,
"description": str(hit.get("body", "")),
"position": i + 1,
}
)
except Exception as exc: # noqa: BLE001 — ddgs raises its own exceptions
logger.warning("DDGS search error: %s", exc)
return {"success": False, "error": f"DuckDuckGo search failed: {exc}"}
logger.info("DDGS search '%s': %d results (limit %d)", query, len(web_results), limit)
return {"success": True, "data": {"web": web_results}}
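The defensive cap in `search` (stop after `safe_limit` hits even if the package ignores `max_results`) can also be expressed with `itertools.islice`. This is a sketch, not the committed code: `collect_hits` is a hypothetical helper, and the hit keys (`title`, `href` or `url`, `body`) are those read in the loop above.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def collect_hits(hits: Iterable[Dict[str, Any]], limit: int) -> List[Dict[str, Any]]:
    """Normalize at most `limit` hits from any iterable of result dicts."""
    safe_limit = max(1, int(limit))
    results = []
    # islice stops pulling from the iterator once safe_limit items are seen,
    # so an over-eager generator is never drained past the cap.
    for position, hit in enumerate(islice(hits, safe_limit), start=1):
        results.append(
            {
                "title": str(hit.get("title", "")),
                "url": str(hit.get("href") or hit.get("url") or ""),
                "description": str(hit.get("body", "")),
                "position": position,
            }
        )
    return results
```

The behavior matches the explicit `break` in the loop above; `islice` only moves the cap out of the loop body.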
+61 -9
@@ -126,18 +126,22 @@ def _get_backend() -> str:
keys manually without running setup.
"""
configured = (_load_web_config().get("backend") or "").lower().strip()
if configured in ("parallel", "firecrawl", "tavily", "exa", "searxng"):
if configured in ("parallel", "firecrawl", "tavily", "exa", "searxng", "brave-free", "ddgs"):
return configured
# Fallback for manual / legacy config — pick the highest-priority
# available backend. Firecrawl also counts as available when the managed
# tool gateway is configured for Nous subscribers.
# Free-tier backends (searxng / brave-free / ddgs) trail the paid ones so
# existing paid setups are unaffected.
backend_candidates = (
("firecrawl", _has_env("FIRECRAWL_API_KEY") or _has_env("FIRECRAWL_API_URL") or _is_tool_gateway_ready()),
("parallel", _has_env("PARALLEL_API_KEY")),
("tavily", _has_env("TAVILY_API_KEY")),
("exa", _has_env("EXA_API_KEY")),
("searxng", _has_env("SEARXNG_URL")),
("brave-free", _has_env("BRAVE_SEARCH_API_KEY")),
("ddgs", _ddgs_package_importable()),
)
for backend, available in backend_candidates:
if available:
@@ -196,8 +200,27 @@ def _is_backend_available(backend: str) -> bool:
return _has_env("TAVILY_API_KEY")
if backend == "searxng":
return _has_env("SEARXNG_URL")
if backend == "brave-free":
return _has_env("BRAVE_SEARCH_API_KEY")
if backend == "ddgs":
return _ddgs_package_importable()
return False
def _ddgs_package_importable() -> bool:
"""Return True when the ``ddgs`` Python package can be imported.
ddgs is the only backend whose availability is driven by a package
presence rather than an env var / config entry. Wrapped in a helper
so auto-detect and ``_is_backend_available`` share the same check
(and tests can monkeypatch a single symbol).
"""
try:
import ddgs # noqa: F401
return True
except ImportError:
return False
# ─── Firecrawl Client ────────────────────────────────────────────────────────
_firecrawl_client = None
@@ -1200,6 +1223,26 @@ def web_search_tool(query: str, limit: int = 5) -> str:
_debug.save()
return result_json
if backend == "brave-free":
from tools.web_providers.brave_free import BraveFreeSearchProvider
response_data = BraveFreeSearchProvider().search(query, limit)
debug_call_data["results_count"] = len(response_data.get("data", {}).get("web", []))
result_json = json.dumps(response_data, indent=2, ensure_ascii=False)
debug_call_data["final_response_size"] = len(result_json)
_debug.log_call("web_search_tool", debug_call_data)
_debug.save()
return result_json
if backend == "ddgs":
from tools.web_providers.ddgs import DDGSSearchProvider
response_data = DDGSSearchProvider().search(query, limit)
debug_call_data["results_count"] = len(response_data.get("data", {}).get("web", []))
result_json = json.dumps(response_data, indent=2, ensure_ascii=False)
debug_call_data["final_response_size"] = len(result_json)
_debug.log_call("web_search_tool", debug_call_data)
_debug.save()
return result_json
if backend == "tavily":
logger.info("Tavily search: '%s' (limit: %d)", query, limit)
raw = _tavily_request("search", {
@@ -1350,11 +1393,12 @@ async def web_extract_tool(
"include_images": False,
})
results = _normalize_tavily_documents(raw, fallback_url=safe_urls[0] if safe_urls else "")
elif backend == "searxng":
# SearXNG is search-only — it cannot extract URL content
elif backend in ("searxng", "brave-free", "ddgs"):
# These backends are search-only — they cannot extract URL content
_label = {"searxng": "SearXNG", "brave-free": "Brave Search (free tier)", "ddgs": "DuckDuckGo (ddgs)"}[backend]
return json.dumps({
"success": False,
"error": "SearXNG is a search-only backend and cannot extract URL content. "
"error": f"{_label} is a search-only backend and cannot extract URL content. "
"Set web.extract_backend to firecrawl, tavily, exa, or parallel.",
}, ensure_ascii=False)
else:
@@ -1732,10 +1776,11 @@ async def web_crawl_tool(
_debug.save()
return cleaned_result
# SearXNG is search-only — it cannot crawl
if backend == "searxng":
# SearXNG / Brave Search (free tier) / DuckDuckGo (ddgs) are search-only — they cannot crawl
if backend in ("searxng", "brave-free", "ddgs"):
_label = {"searxng": "SearXNG", "brave-free": "Brave Search (free tier)", "ddgs": "DuckDuckGo (ddgs)"}[backend]
return json.dumps({
"error": "SearXNG is a search-only backend and cannot crawl URLs. "
"error": f"{_label} is a search-only backend and cannot crawl URLs. "
"Set FIRECRAWL_API_KEY for crawling, or use web_search instead.",
"success": False,
}, ensure_ascii=False)
@@ -2035,9 +2080,12 @@ def check_firecrawl_api_key() -> bool:
def check_web_api_key() -> bool:
"""Check whether the configured web backend is available."""
configured = _load_web_config().get("backend", "").lower().strip()
if configured in ("exa", "parallel", "firecrawl", "tavily", "searxng"):
if configured in ("exa", "parallel", "firecrawl", "tavily", "searxng", "brave-free", "ddgs"):
return _is_backend_available(configured)
return any(_is_backend_available(backend) for backend in ("exa", "parallel", "firecrawl", "tavily", "searxng"))
return any(
_is_backend_available(backend)
for backend in ("exa", "parallel", "firecrawl", "tavily", "searxng", "brave-free", "ddgs")
)
def check_auxiliary_model() -> bool:
@@ -2074,6 +2122,10 @@ if __name__ == "__main__":
print(" Using Tavily API (https://tavily.com)")
elif backend == "searxng":
print(f" Using SearXNG (search only): {os.getenv('SEARXNG_URL', '').strip()}")
elif backend == "brave-free":
print(" Using Brave Search free tier (search only)")
elif backend == "ddgs":
print(" Using DuckDuckGo via ddgs package (search only)")
else:
if firecrawl_url_available:
print(f" Using self-hosted Firecrawl: {os.getenv('FIRECRAWL_API_URL').strip().rstrip('/')}")
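The backend selection in `_get_backend` follows a simple rule: an explicitly configured, known backend wins; otherwise the first available candidate in priority order is used. A minimal sketch of that rule, assuming availability has already been computed for each candidate, is shown below. `pick_backend` is a hypothetical name, and returning `None` when nothing is available is an assumption, since the hunk does not show the real fallback value.

```python
from typing import Iterable, Optional, Tuple


def pick_backend(configured: str, candidates: Iterable[Tuple[str, bool]]) -> Optional[str]:
    """Return the configured backend if known, else the first available candidate."""
    ordered = list(candidates)
    known = {name for name, _ in ordered}
    normalized = (configured or "").lower().strip()
    if normalized in known:
        # An explicit choice is honored even when its credentials are missing.
        return normalized
    for name, available in ordered:
        if available:
            return name
    # Assumption: signal "nothing available" with None.
    return None
```

Because the free-tier entries sit at the end of the candidate tuple, a machine with both a paid key and `ddgs` installed keeps resolving to the paid backend.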
+2 -2
@@ -125,7 +125,7 @@ class CompressionConfig:
@classmethod
def from_yaml(cls, yaml_path: str) -> "CompressionConfig":
"""Load configuration from YAML file."""
with open(yaml_path, 'r') as f:
with open(yaml_path, 'r', encoding="utf-8") as f:
data = yaml.safe_load(f)
config = cls()
@@ -1174,7 +1174,7 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
# Save metrics
if self.config.metrics_enabled:
metrics_path = output_dir / self.config.metrics_output_file
with open(metrics_path, 'w') as f:
with open(metrics_path, 'w', encoding="utf-8") as f:
json.dump(self.aggregate_metrics.to_dict(), f, indent=2)
console.print(f"\n💾 Metrics saved to {metrics_path}")
+25 -9
@@ -81,11 +81,14 @@ def _log_signal(signum: int, frame) -> None:
thread, and fall back to ``os._exit(0)`` so a wedged write/flush
can never strand the process.
"""
name = {
signal.SIGPIPE: "SIGPIPE",
signal.SIGTERM: "SIGTERM",
signal.SIGHUP: "SIGHUP",
}.get(signum, f"signal {signum}")
# SIGPIPE and SIGHUP don't exist on Windows — build the lookup
# dict from attributes that actually exist on the current platform.
_signal_names: dict[int, str] = {}
for _attr in ("SIGPIPE", "SIGTERM", "SIGHUP", "SIGINT", "SIGBREAK"):
_sig = getattr(signal, _attr, None)
if _sig is not None:
_signal_names[int(_sig)] = _attr
name = _signal_names.get(signum, f"signal {signum}")
try:
os.makedirs(os.path.dirname(_CRASH_LOG), exist_ok=True)
with open(_CRASH_LOG, "a", encoding="utf-8") as f:
@@ -140,10 +143,23 @@ def _log_signal(signum: int, frame) -> None:
# sys.exit(0) + _log_exit), which keeps the gateway alive as long as
# the main command pipe is still readable. Terminal signals still
# route through _log_signal so kills and hangups are diagnosable.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)
signal.signal(signal.SIGTERM, _log_signal)
signal.signal(signal.SIGHUP, _log_signal)
signal.signal(signal.SIGINT, signal.SIG_IGN)
#
# SIGPIPE and SIGHUP don't exist on Windows; guard each installation
# with hasattr so ``python -m tui_gateway.entry`` (spawned by
# ``hermes --tui``) imports cleanly there. SIGBREAK (Windows' Ctrl+Break)
# is installed when available as a weaker equivalent of SIGHUP.
if hasattr(signal, "SIGPIPE"):
signal.signal(signal.SIGPIPE, signal.SIG_IGN)
if hasattr(signal, "SIGTERM"):
signal.signal(signal.SIGTERM, _log_signal)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, _log_signal)
elif hasattr(signal, "SIGBREAK"):
# Windows-only: Ctrl+Break in a console window delivers SIGBREAK.
# Route it through the same handler so kills are diagnosable.
signal.signal(signal.SIGBREAK, _log_signal)
if hasattr(signal, "SIGINT"):
signal.signal(signal.SIGINT, signal.SIG_IGN)
def _log_exit(reason: str) -> None:
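The platform-safe lookup built inside `_log_signal` can be isolated as a small function for testing. The sketch below assumes nothing beyond the standard `signal` module; `build_signal_names` and its `candidates` parameter are hypothetical names introduced for illustration.

```python
import signal
from typing import Dict, Iterable

_DEFAULT_CANDIDATES = ("SIGPIPE", "SIGTERM", "SIGHUP", "SIGINT", "SIGBREAK")


def build_signal_names(candidates: Iterable[str] = _DEFAULT_CANDIDATES) -> Dict[int, str]:
    """Map signal numbers to names, skipping signals the platform lacks."""
    names: Dict[int, str] = {}
    for attr in candidates:
        # getattr with a default avoids AttributeError on platforms where
        # the signal is not defined (for example SIGHUP on Windows).
        sig = getattr(signal, attr, None)
        if sig is not None:
            names[int(sig)] = attr
    return names
```

A handler can then resolve a name with `build_signal_names().get(signum, f"signal {signum}")`, which never touches a missing attribute.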
+4 -3
@@ -660,7 +660,7 @@ def _load_cfg() -> dict:
if _cfg_cache is not None and _cfg_mtime == mtime and _cfg_path == p:
return copy.deepcopy(_cfg_cache)
if p.exists():
with open(p) as f:
with open(p, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
else:
data = {}
@@ -679,7 +679,7 @@ def _save_cfg(cfg: dict):
import yaml
path = _hermes_home / "config.yaml"
with open(path, "w") as f:
with open(path, "w", encoding="utf-8") as f:
yaml.safe_dump(cfg, f)
with _cfg_lock:
_cfg_cache = copy.deepcopy(cfg)
@@ -1280,6 +1280,7 @@ def _get_usage(agent) -> dict:
"output": g("session_output_tokens", "session_completion_tokens"),
"cache_read": g("session_cache_read_tokens"),
"cache_write": g("session_cache_write_tokens"),
"reasoning": g("session_reasoning_tokens"),
"prompt": g("session_prompt_tokens"),
"completion": g("session_completion_tokens"),
"total": g("session_total_tokens"),
@@ -2587,7 +2588,7 @@ def _(rid, params: dict) -> dict:
f"hermes_conversation_{_time.strftime('%Y%m%d_%H%M%S')}.json"
)
try:
with open(filename, "w") as f:
with open(filename, "w", encoding="utf-8") as f:
json.dump(
{
"model": getattr(session["agent"], "model", ""),
+2
@@ -164,9 +164,11 @@ export interface Usage {
context_max?: number
context_percent?: number
context_used?: number
cost_status?: string
cost_usd?: number
input: number
output: number
reasoning?: number
total: number
}

Some files were not shown because too many files have changed in this diff.