Compare commits
29 Commits
| SHA1 |
|---|
| 7e6d4b982e |
| 3c6e70aef1 |
| 2f0f03c40d |
| 5c2170a7c6 |
| d77d877665 |
| ac8e238bc8 |
| 8d129d013b |
| 300140e006 |
| e71a2bd11b |
| 769ee86cd2 |
| 1b1e30510a |
| f3acdd94fe |
| 78a54d2c00 |
| e7c99651fb |
| fb51253620 |
| ef009a987a |
| 130396c658 |
| a5c1f925b5 |
| 11d93096b3 |
| d464d08a5f |
| 5a95fb2e14 |
| 0acb7f4583 |
| a3cd974ee7 |
| ea5a6c216b |
| 4df62d239e |
| 490b3e76b1 |
| 102eb4adc0 |
| c661fefa08 |
| c9e5a9bb08 |
@@ -22,7 +22,12 @@ concurrency:
|
||||
|
||||
jobs:
|
||||
deploy-vercel:
|
||||
if: github.event_name == 'release'
|
||||
# Triggered automatically on release publish (production cuts) and
|
||||
# manually via `gh workflow run deploy-site.yml` when an out-of-band
|
||||
# main commit needs to ship live before the next release tag — e.g.
|
||||
# a skills-index PR that doesn't touch website/** paths and so
|
||||
# doesn't auto-deploy via the deploy-docs path.
|
||||
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Trigger Vercel Deploy
|
||||
|
||||
@@ -196,10 +196,26 @@ jobs:
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
|
||||
|
||||
# Build once, load into the local daemon for smoke testing. Cached
|
||||
# to gha with a per-arch scope; the push step below reuses every
|
||||
# layer from this build.
|
||||
- name: Build image (arm64, smoke test)
|
||||
# Build once, load into the local daemon for smoke testing. PR arm64
|
||||
# builds deliberately avoid the gha cache: cold-cache arm64 builds can
|
||||
# outlive GitHub's short-lived Azure cache SAS token, then fail while
|
||||
# reading or writing cache blobs before the smoke test can run.
|
||||
- name: Build image (arm64, smoke test, uncached PR)
|
||||
if: github.event_name == 'pull_request'
|
||||
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile
|
||||
load: true
|
||||
platforms: linux/arm64
|
||||
tags: ${{ env.IMAGE_NAME }}:test
|
||||
build-args: |
|
||||
HERMES_GIT_SHA=${{ github.sha }}
|
||||
|
||||
# Main/release builds still use the per-arch gha cache so the digest
|
||||
# push below can reuse layers from this smoke-test build.
|
||||
- name: Build image (arm64, smoke test, cached publish)
|
||||
if: github.event_name != 'pull_request'
|
||||
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
|
||||
with:
|
||||
context: .
|
||||
|
||||
@@ -0,0 +1,110 @@
# Hermes Agent v0.15.1 (v2026.5.29)

**Release Date:** May 29, 2026
**Since v0.15.0:** 28 commits · 21 merged PRs · hotfix release · 9 contributors

> **The Patch Release.** A same-day hotfix for v0.15.0. Headline fix: the dashboard infinite-reload loop that hit anyone running v0.15.0 in loopback mode (Docker, hosted Hermes, fresh installs). A handful of other v0.15.0 follow-ups go along for the ride — kanban worker SIGTERM, `/model` picker unification, `/yolo` session bypass, the full 19,932-entry skills.sh catalog, `.md` media delivery restoration, gateway probe-stepdown safety, web-URL redaction passthrough, kanban worker vision on referenced images, hindsight observation-default. Docker users get an explicit `--insecure` opt-in env var (no more bind-host inference), MCP server bare-command PATH resolution, and arm64 PR-build cache fixes.

---

## ✨ Highlights

- **Dashboard 401 reload loop fixed** — In loopback mode the dashboard's identity probe (`/api/auth/me`) returns 401 by design, but v0.15.0's stale-token reload guard treated every 401 as a rotated session token and full-page-reloaded to pick up a fresh one. Every successful sibling call cleared the one-shot reload guard, so the page reload-looped forever (Firefox: "Navigated to /sessions" storm; Chrome: React re-render storm). Fix adds an `allowUnauthorized` opt-out to `fetchJSON` that skips only the loopback stale-token reload — 401 still throws so `AuthWidget` swallows it, gated-mode `login_url` redirects are unaffected. Closes [#34206](https://github.com/NousResearch/hermes-agent/issues/34206), [#34202](https://github.com/NousResearch/hermes-agent/issues/34202). ([#30698](https://github.com/NousResearch/hermes-agent/pull/30698) — @austinpickett)

- **Docker dashboard `--insecure` is now an explicit env opt-in, never derived from bind host** — Previously the Docker entrypoint inferred `--insecure` when the dashboard bound to a non-loopback host. That conflated "I want LAN access" with "I want to disable the same-origin guard." The fix splits them: bind host is bind host, and disabling the dashboard's loopback auth requires an explicit `HERMES_DASHBOARD_INSECURE=1`. Existing setups that genuinely wanted insecure binding must now set the env var. ([#34188](https://github.com/NousResearch/hermes-agent/pull/34188), [#34204](https://github.com/NousResearch/hermes-agent/pull/34204) — @benbarclay)

- **MCP bare command resolution under Docker** — MCP servers configured with bare commands (`npx`, `npm`, `node`) now resolve against `/usr/local/bin` so they actually launch inside the Docker image where those binaries live. v0.15.0 left these failing silently in containers when the agent's effective PATH didn't include the Node toolchain location. ([#34186](https://github.com/NousResearch/hermes-agent/pull/34186) — @benbarclay)

- **Skills page sidebar / source pills restored** — A stale `useMemo` dependency in the new dashboard skills page collapsed the source pills and category sidebar to "All" only. Fixed; both surfaces now reflect the live catalog state. ([#34194](https://github.com/NousResearch/hermes-agent/pull/34194))

- **Kanban worker can be killed again** — `SIGTERM` on a kanban worker was being absorbed by an intermediate process and the worker stayed running. Closes [#28181](https://github.com/NousResearch/hermes-agent/issues/28181). ([#34045](https://github.com/NousResearch/hermes-agent/pull/34045))

- **Full skills.sh catalog (858 → 19,932 entries)** — The skills hub page was pulling a partial paginated catalog. The fetch now walks the sitemap, so all 19,932 skills.sh entries surface in the picker instead of just the first 858. ([#34025](https://github.com/NousResearch/hermes-agent/pull/34025))

---

## 🐛 Bug Fixes

### Dashboard / Web

- **`/api/auth/me` 401 no longer triggers reload loop** in loopback mode — ([#30698](https://github.com/NousResearch/hermes-agent/pull/30698) — @austinpickett)
- **Skills page source pills + category sidebar restored** — stale `useMemo` dep ([#34194](https://github.com/NousResearch/hermes-agent/pull/34194))

### Docker

- **`--insecure` is now explicit opt-in via env var**, not derived from bind host ([#34188](https://github.com/NousResearch/hermes-agent/pull/34188) — @benbarclay)
- **Dashboard test suite repaired** to match the insecure-opt-in fix ([#34204](https://github.com/NousResearch/hermes-agent/pull/34204) — @benbarclay)
- **arm64 PR builds skip the GHA cache** so cold-cache builds no longer fail when they outlive GitHub's short-lived cache SAS token before the smoke test can run ([#33704](https://github.com/NousResearch/hermes-agent/pull/33704) — @BROCCOLO1D)
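
The opt-in check can be sketched in Python as follows. This is a minimal illustration, assuming the accepted values are exactly those listed in the entrypoint's `case` statement elsewhere in this comparison. The function name `dashboard_insecure_flag` is hypothetical; the shipped logic is the shell script, not this code.

```python
from typing import Mapping

# Values accepted by the entrypoint's `case` statement (copied from the diff).
_TRUTHY = frozenset({"1", "true", "TRUE", "True", "yes", "YES", "Yes"})


def dashboard_insecure_flag(env: Mapping[str, str]) -> str:
    """Return "--insecure" only when HERMES_DASHBOARD_INSECURE is set truthy.

    The bind host (HERMES_DASHBOARD_HOST) is deliberately not consulted.
    """
    value = env.get("HERMES_DASHBOARD_INSECURE", "")
    return "--insecure" if value in _TRUTHY else ""
```

Under this sketch, a container that sets only `HERMES_DASHBOARD_HOST=0.0.0.0` gets an empty flag, which matches the release note's point that bind host and insecure mode are now independent.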

### MCP

- **Bare `npx`/`npm`/`node` resolve against `/usr/local/bin`** for Docker compatibility ([#34186](https://github.com/NousResearch/hermes-agent/pull/34186) — @benbarclay)
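
One plausible reading of this rule is sketched below, under stated assumptions: a bare command is looked up on the effective `PATH` first and retried with `/usr/local/bin` appended. The helper name `resolve_bare_command`, its parameters, and the lookup order are illustrative guesses, not the code from the PR.

```python
import os
import shutil
from typing import Optional

# Assumption: the fallback directory named in the release note.
FALLBACK_DIR = "/usr/local/bin"


def resolve_bare_command(
    command: str,
    path: Optional[str] = None,
    fallback_dir: str = FALLBACK_DIR,
) -> Optional[str]:
    """Resolve a bare command name, retrying with ``fallback_dir`` on PATH."""
    if os.sep in command:
        # Not a bare command: leave explicit paths untouched.
        return command
    search_path = path if path is not None else os.environ.get("PATH", "")
    found = shutil.which(command, path=search_path)
    if found:
        return found
    # Retry with the fallback directory appended to the search path.
    extended = os.pathsep.join(p for p in (search_path, fallback_dir) if p)
    return shutil.which(command, path=extended)
```

Returning `None` when nothing is found lets the caller report a clear error instead of failing silently at launch time.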

### Kanban

- **Worker SIGTERM actually terminates the process** ([#34045](https://github.com/NousResearch/hermes-agent/pull/34045))
- **Workers receive images referenced in task bodies** for vision-capable models ([#34210](https://github.com/NousResearch/hermes-agent/pull/34210))
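
The URL half of the task-body scan can be sketched as below. The extension list, URL pattern, code-span patterns, and trailing-punctuation strip are copied from the `extract_image_refs` hunk in this comparison. The function name `image_urls_outside_code` is hypothetical, and the sketch omits the local-path branch, which also checks the filesystem.

```python
import re
from typing import List

# Extension list and URL pattern as they appear in the image-routing hunk.
_IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff", ".tif", ".heic")
_EXT_PATTERN = "|".join(e.lstrip(".") for e in _IMAGE_EXTS)
_IMAGE_URL_RE = re.compile(
    r"https?://[^\s<>\"']+?\.(?:" + _EXT_PATTERN + r")(?:\?[^\s<>\"']*)?",
    re.IGNORECASE,
)


def image_urls_outside_code(text: str) -> List[str]:
    """Return deduplicated image URLs, skipping fenced and inline code spans."""
    spans = [m.span() for m in re.finditer(r"```[^\n]*\n.*?```", text, re.DOTALL)]
    spans += [m.span() for m in re.finditer(r"`[^`\n]+`", text)]
    urls: List[str] = []
    for m in _IMAGE_URL_RE.finditer(text):
        if any(start <= m.start() < end for start, end in spans):
            continue  # Reference sits inside example code, not a live attachment.
        url = m.group(0).rstrip(".,;:!?)]>")
        if url not in urls:
            urls.append(url)
    return urls


body = "Fix the header, see https://example.com/shot.png. Ignore `https://example.com/old.png`."
print(image_urls_outside_code(body))  # ['https://example.com/shot.png']
```

The backticked URL is dropped because its match starts inside an inline code span, so only the prose reference would reach the worker's vision input.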

### Gateway

- **`.md` files deliver again** — media-delivery validation defaults to denylist-only instead of an overly-narrow allowlist ([#34022](https://github.com/NousResearch/hermes-agent/pull/34022))
- **Probe stepdown safety** — on a context-overflow without an explicit provider context limit, the agent no longer steps down to a smaller model based on an unknown ceiling (salvage of [#33673](https://github.com/NousResearch/hermes-agent/pull/33673)) ([#33826](https://github.com/NousResearch/hermes-agent/pull/33826))

### CLI

- **`/yolo` mid-session enables the per-session bypass** instead of just toggling the env var (which the running agent had already snapshotted) ([#33931](https://github.com/NousResearch/hermes-agent/pull/33931) — @kshitijk4poor)
- **`/model` and `hermes model` show the same list**, plus disk cache for picker startup ([#33867](https://github.com/NousResearch/hermes-agent/pull/33867))

### Skills

- **Full skills.sh catalog via sitemap** — 858 → 19,932 entries ([#34025](https://github.com/NousResearch/hermes-agent/pull/34025))

### Redaction

- **Web URLs pass through unchanged** — the redactor was eating query parameters that looked credential-shaped ([#34029](https://github.com/NousResearch/hermes-agent/pull/34029))

---

## ✨ Small Features

- **Hindsight default narrowed to observation-only** for `recall_types` — tool path is also narrowed ([#34079](https://github.com/NousResearch/hermes-agent/pull/34079) — @nicoloboschi, follow-up [#34091](https://github.com/NousResearch/hermes-agent/pull/4df62d239e38bf8c212a595721c9c01e176f6c3a) — @kshitijk4poor)
- **Memory providers receive completed-turn message context** — salvage of [#28065](https://github.com/NousResearch/hermes-agent/pull/28065) ([#34097](https://github.com/NousResearch/hermes-agent/pull/34097) — @kshitijk4poor, credit to @devwdave)
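
The backward-compatible dispatch behind the completed-turn context can be sketched as below. It follows the signature check in the `MemoryManager` hunk of this comparison: pass `messages` only to providers whose `sync_turn` accepts it. The two provider classes and the `sync` helper are hypothetical stand-ins for illustration.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional


def accepts_messages(sync_turn: Callable[..., Any]) -> bool:
    """Return True if the callable takes ``messages`` or arbitrary keywords."""
    try:
        signature = inspect.signature(sync_turn)
    except (TypeError, ValueError):
        return True  # Uninspectable callables are assumed to accept it.
    params = signature.parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    return "messages" in params


class LegacyProvider:
    """Hypothetical provider written before ``messages`` existed."""

    def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
        self.last = (user_content, assistant_content)


class ContextProvider:
    """Hypothetical provider that wants the raw turn context."""

    def sync_turn(self, user_content, assistant_content, *, session_id="", messages=None) -> None:
        self.last_messages = messages


def sync(provider: Any, user: str, assistant: str,
         messages: Optional[List[Dict[str, Any]]] = None) -> None:
    """Call sync_turn, adding ``messages`` only when the provider can take it."""
    kwargs: Dict[str, Any] = {"session_id": "s1"}
    if messages is not None and accepts_messages(provider.sync_turn):
        kwargs["messages"] = messages
    provider.sync_turn(user, assistant, **kwargs)
```

Inspecting the signature keeps older third-party providers working, since they never receive a keyword they do not declare.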

---

## 📚 Documentation

- **`--no-supervise` / `HERMES_GATEWAY_NO_SUPERVISE` documented** in the reference docs (follow-up to [#33583](https://github.com/NousResearch/hermes-agent/pull/33583)) ([#33751](https://github.com/NousResearch/hermes-agent/pull/33751) — @r266-tech)

---

## 🛠️ Infrastructure

- **Vercel deploy workflow accepts `workflow_dispatch`** so docs deploys can be manually triggered ([#34081](https://github.com/NousResearch/hermes-agent/pull/34081))
- **`@nous-research/ui` bumped to 0.18.2** (Nix `npmDepsHash` also updated to match) ([#34193](https://github.com/NousResearch/hermes-agent/pull/34193) follow-ups — @austinpickett)

---

## 👥 Contributors

### Core
- @teknium1

### Community
- @austinpickett — dashboard 401 reload-loop fix (the headline), `@nous-research/ui` bump, Nix `npmDepsHash` updates
- @benbarclay — Docker `--insecure` opt-in, MCP bare-command resolution, dashboard test repair
- @kshitijk4poor — `/yolo` session bypass, completed-turn memory context salvage, hindsight follow-up docs
- @nicoloboschi — hindsight `recall_types` observation default
- @BROCCOLO1D — arm64 PR build cache fix
- @r266-tech — `--no-supervise` reference docs
- @yangguangjin — probe stepdown safety (salvage of @yanghd's #33673)
- @devwdave — completed-turn memory context (credited via salvage)
- @andrewhosf — co-author

### Issue Reporters (the 401 loop)
- @routesmith ([#34206](https://github.com/NousResearch/hermes-agent/issues/34206))
- @beeaton ([#34202](https://github.com/NousResearch/hermes-agent/issues/34202))

---

**Full Changelog**: [v2026.5.28...v2026.5.29](https://github.com/NousResearch/hermes-agent/compare/v2026.5.28...v2026.5.29)
@@ -1,7 +1,7 @@
|
||||
{
|
||||
"id": "hermes-agent",
|
||||
"name": "Hermes Agent",
|
||||
"version": "0.15.0",
|
||||
"version": "0.15.1",
|
||||
"description": "Self-improving open-source AI agent by Nous Research with ACP editor integration, persistent memory, skills, and rich tool support.",
|
||||
"repository": "https://github.com/NousResearch/hermes-agent",
|
||||
"website": "https://hermes-agent.nousresearch.com/docs/user-guide/features/acp",
|
||||
@@ -9,7 +9,7 @@
|
||||
"license": "MIT",
|
||||
"distribution": {
|
||||
"uvx": {
|
||||
"package": "hermes-agent[acp]==0.15.0",
|
||||
"package": "hermes-agent[acp]==0.15.1",
|
||||
"args": ["hermes-acp"]
|
||||
}
|
||||
}
|
||||
|
||||
@@ -4561,6 +4561,7 @@ def run_conversation(
|
||||
original_user_message=original_user_message,
|
||||
final_response=final_response,
|
||||
interrupted=interrupted,
|
||||
messages=messages,
|
||||
)
|
||||
|
||||
# Background memory/skill review — runs AFTER the response is delivered
|
||||
|
||||
+134
-14
@@ -37,6 +37,8 @@ from __future__ import annotations
|
||||
import base64
|
||||
import logging
|
||||
import mimetypes
|
||||
import os
|
||||
import re
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional, Tuple
|
||||
|
||||
@@ -46,6 +48,102 @@ logger = logging.getLogger(__name__)
|
||||
_VALID_MODES = frozenset({"auto", "native", "text"})
|
||||
|
||||
|
||||
# Image extensions used by extract_image_refs(). Kept tight on purpose — we
|
||||
# only auto-attach things the model can actually see. Documents/archives are
|
||||
# excluded because the gateway's broader extract_local_files() also routes
|
||||
# them differently (send_document), and we don't want to attach a PDF as a
|
||||
# vision part.
|
||||
_IMAGE_EXTS = (
|
||||
".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff", ".tif", ".heic",
|
||||
)
|
||||
_IMAGE_EXT_PATTERN = "|".join(e.lstrip(".") for e in _IMAGE_EXTS)
|
||||
|
||||
# Absolute / home-relative local image path. Matches the same shape gateway's
|
||||
# extract_local_files() uses: anchors to ``~/`` or ``/``, ignores matches inside
|
||||
# URLs (the ``(?<![/:\w.])`` lookbehind), and case-insensitive on the extension.
|
||||
_LOCAL_IMAGE_PATH_RE = re.compile(
|
||||
r"(?<![/:\w.])(?:~/|/)(?:[\w.\-]+/)*[\w.\-]+\.(?:" + _IMAGE_EXT_PATTERN + r")\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# http(s) URL ending in an image extension (optionally followed by a
|
||||
# query string). Case-insensitive on the extension. Strict ``http(s)://``
|
||||
# scheme so we don't accidentally grab ``file://`` URLs or other shapes.
|
||||
_IMAGE_URL_RE = re.compile(
|
||||
r"https?://[^\s<>\"']+?\.(?:" + _IMAGE_EXT_PATTERN + r")(?:\?[^\s<>\"']*)?",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
|
||||
def extract_image_refs(text: str) -> Tuple[List[str], List[str]]:
|
||||
"""Scan free-form text for image references the model should see.
|
||||
|
||||
Returns ``(local_paths, urls)``:
|
||||
|
||||
* ``local_paths`` — absolute (``/``) or home-relative (``~/``) paths
|
||||
whose suffix is an image extension AND whose expanded form exists
|
||||
on disk as a file. Order-preserving, deduplicated.
|
||||
* ``urls`` — ``http(s)://…`` URLs whose path ends in an image
|
||||
extension (a ``?query`` is allowed after the extension).
|
||||
Order-preserving, deduplicated.
|
||||
|
||||
Matches inside fenced code blocks (``` ``` ```) and inline backticks
|
||||
(`` `…` ``) are skipped so that snippets pasted into a task body for
|
||||
reference aren't mistaken for live attachments. This mirrors the
|
||||
behaviour of ``gateway.platforms.base.BaseAdapter.extract_local_files``.
|
||||
|
||||
Local paths are validated against the filesystem; URLs are not
|
||||
(the provider fetches them at request time).
|
||||
"""
|
||||
if not isinstance(text, str) or not text:
|
||||
return [], []
|
||||
|
||||
# Build spans covered by fenced code blocks and inline code so we can
|
||||
# ignore references the author embedded purely as example text.
|
||||
code_spans: list[tuple[int, int]] = []
|
||||
for m in re.finditer(r"```[^\n]*\n.*?```", text, re.DOTALL):
|
||||
code_spans.append((m.start(), m.end()))
|
||||
for m in re.finditer(r"`[^`\n]+`", text):
|
||||
code_spans.append((m.start(), m.end()))
|
||||
|
||||
def _in_code(pos: int) -> bool:
|
||||
return any(s <= pos < e for s, e in code_spans)
|
||||
|
||||
local_paths: list[str] = []
|
||||
seen_paths: set[str] = set()
|
||||
for match in _LOCAL_IMAGE_PATH_RE.finditer(text):
|
||||
if _in_code(match.start()):
|
||||
continue
|
||||
raw = match.group(0)
|
||||
expanded = os.path.expanduser(raw)
|
||||
try:
|
||||
if not os.path.isfile(expanded):
|
||||
continue
|
||||
except OSError:
|
||||
# ENAMETOOLONG / EINVAL on pathological inputs — skip rather than crash.
|
||||
continue
|
||||
if expanded in seen_paths:
|
||||
continue
|
||||
seen_paths.add(expanded)
|
||||
local_paths.append(expanded)
|
||||
|
||||
urls: list[str] = []
|
||||
seen_urls: set[str] = set()
|
||||
for match in _IMAGE_URL_RE.finditer(text):
|
||||
if _in_code(match.start()):
|
||||
continue
|
||||
url = match.group(0)
|
||||
# Strip trailing punctuation that's almost certainly prose, not part
|
||||
# of the URL (e.g. "see https://x.com/a.png." or "/a.png)").
|
||||
url = url.rstrip(".,;:!?)]>")
|
||||
if url in seen_urls:
|
||||
continue
|
||||
seen_urls.add(url)
|
||||
urls.append(url)
|
||||
|
||||
return local_paths, urls
|
||||
|
||||
|
||||
# Strict YAML/JSON boolean coercion for capability overrides.
|
||||
#
|
||||
# ``bool("false")`` is True in Python because non-empty strings are truthy, so
|
||||
@@ -320,20 +418,29 @@ def _file_to_data_url(path: Path) -> Optional[str]:
|
||||
def build_native_content_parts(
|
||||
user_text: str,
|
||||
image_paths: List[str],
|
||||
image_urls: Optional[List[str]] = None,
|
||||
) -> Tuple[List[Dict[str, Any]], List[str]]:
|
||||
"""Build an OpenAI-style ``content`` list for a user turn.
|
||||
|
||||
Shape:
|
||||
[{"type": "text", "text": "...\\n\\n[Image attached at: /local/path]"},
|
||||
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
|
||||
{"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
|
||||
...]
|
||||
|
||||
The local path of each successfully attached image is appended to the
|
||||
text part as ``[Image attached at: <path>]``. The model still sees the
|
||||
pixels via the ``image_url`` part (full native vision); the path note
|
||||
just gives it a string handle so MCP/skill tools that take an image
|
||||
path or URL argument can be invoked on the same image without an
|
||||
extra round-trip. This parallels the text-mode hint produced by
|
||||
Local paths are read from disk and embedded as base64 ``data:`` URLs.
|
||||
Remote URLs (``http(s)://``) are passed through verbatim — the provider
|
||||
fetches them server-side. The model still sees the pixels either way.
|
||||
|
||||
For each successfully attached image, a hint is appended to the text
|
||||
part:
|
||||
|
||||
* local path → ``[Image attached at: <path>]``
|
||||
* URL → ``[Image attached: <url>]``
|
||||
|
||||
The hint gives the model a string handle so MCP/skill tools that take
|
||||
an image path or URL argument can be invoked on the same image without
|
||||
an extra round-trip. This parallels the text-mode hint produced by
|
||||
``Runner._enrich_message_with_vision`` (``vision_analyze using image_url:
|
||||
<path>``) so behaviour is consistent across both image input modes.
|
||||
|
||||
@@ -342,12 +449,14 @@ def build_native_content_parts(
|
||||
ceiling), the agent's retry loop transparently shrinks and retries
|
||||
once — see ``run_agent._try_shrink_image_parts_in_messages``.
|
||||
|
||||
Returns (content_parts, skipped_paths). Skipped paths are files that
|
||||
couldn't be read from disk and are NOT advertised in the path hints.
|
||||
Returns (content_parts, skipped). Skipped entries are local paths
|
||||
that couldn't be read from disk; URLs are never skipped (they're
|
||||
not validated here).
|
||||
"""
|
||||
skipped: List[str] = []
|
||||
image_parts: List[Dict[str, Any]] = []
|
||||
attached_paths: List[str] = []
|
||||
attached_urls: List[str] = []
|
||||
|
||||
for raw_path in image_paths:
|
||||
p = Path(raw_path)
|
||||
@@ -364,16 +473,26 @@ def build_native_content_parts(
|
||||
})
|
||||
attached_paths.append(str(raw_path))
|
||||
|
||||
for url in image_urls or []:
|
||||
url = (url or "").strip()
|
||||
if not url:
|
||||
continue
|
||||
image_parts.append({
|
||||
"type": "image_url",
|
||||
"image_url": {"url": url},
|
||||
})
|
||||
attached_urls.append(url)
|
||||
|
||||
text = (user_text or "").strip()
|
||||
|
||||
# If at least one image attached, build a single text part that combines
|
||||
# the user's caption (or a neutral default) with one path hint per image.
|
||||
if attached_paths:
|
||||
# the user's caption (or a neutral default) with one hint per image.
|
||||
if attached_paths or attached_urls:
|
||||
base_text = text or "What do you see in this image?"
|
||||
path_hints = "\n".join(
|
||||
f"[Image attached at: {p}]" for p in attached_paths
|
||||
)
|
||||
combined_text = f"{base_text}\n\n{path_hints}"
|
||||
hint_lines: List[str] = []
|
||||
hint_lines.extend(f"[Image attached at: {p}]" for p in attached_paths)
|
||||
hint_lines.extend(f"[Image attached: {u}]" for u in attached_urls)
|
||||
combined_text = f"{base_text}\n\n" + "\n".join(hint_lines)
|
||||
parts: List[Dict[str, Any]] = [{"type": "text", "text": combined_text}]
|
||||
parts.extend(image_parts)
|
||||
return parts, skipped
|
||||
@@ -388,4 +507,5 @@ def build_native_content_parts(
|
||||
__all__ = [
|
||||
"decide_image_input_mode",
|
||||
"build_native_content_parts",
|
||||
"extract_image_refs",
|
||||
]
|
||||
|
||||
+33
-2
@@ -368,11 +368,42 @@ class MemoryManager:
|
||||
|
||||
# -- Sync ----------------------------------------------------------------
|
||||
|
||||
def sync_all(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
|
||||
@staticmethod
|
||||
def _provider_sync_accepts_messages(provider: MemoryProvider) -> bool:
|
||||
"""Return whether sync_turn accepts a messages keyword."""
|
||||
try:
|
||||
signature = inspect.signature(provider.sync_turn)
|
||||
except (TypeError, ValueError):
|
||||
return True
|
||||
params = list(signature.parameters.values())
|
||||
if any(p.kind == inspect.Parameter.VAR_KEYWORD for p in params):
|
||||
return True
|
||||
return "messages" in signature.parameters
|
||||
|
||||
def sync_all(
|
||||
self,
|
||||
user_content: str,
|
||||
assistant_content: str,
|
||||
*,
|
||||
session_id: str = "",
|
||||
messages: Optional[List[Dict[str, Any]]] = None,
|
||||
) -> None:
|
||||
"""Sync a completed turn to all providers."""
|
||||
for provider in self._providers:
|
||||
try:
|
||||
provider.sync_turn(user_content, assistant_content, session_id=session_id)
|
||||
if messages is not None and self._provider_sync_accepts_messages(provider):
|
||||
provider.sync_turn(
|
||||
user_content,
|
||||
assistant_content,
|
||||
session_id=session_id,
|
||||
messages=messages,
|
||||
)
|
||||
else:
|
||||
provider.sync_turn(
|
||||
user_content,
|
||||
assistant_content,
|
||||
session_id=session_id,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
"Memory provider '%s' sync_turn failed: %s",
|
||||
|
||||
@@ -112,11 +112,22 @@ class MemoryProvider(ABC):
|
||||
that do background prefetching should override this.
|
||||
"""
|
||||
|
||||
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
|
||||
def sync_turn(
|
||||
self,
|
||||
user_content: str,
|
||||
assistant_content: str,
|
||||
*,
|
||||
session_id: str = "",
|
||||
messages: Optional[List[Dict[str, Any]]] = None,
|
||||
) -> None:
|
||||
"""Persist a completed turn to the backend.
|
||||
|
||||
Called after each turn. Should be non-blocking — queue for
|
||||
background processing if the backend has latency.
|
||||
|
||||
``messages`` is the OpenAI-style conversation message list as of the
|
||||
completed turn, including any assistant tool calls and tool results.
|
||||
Providers that do not need raw turn context can ignore it.
|
||||
"""
|
||||
|
||||
@abstractmethod
|
||||
|
||||
@@ -576,6 +576,8 @@ def load_cli_config() -> Dict[str, Any]:
|
||||
"docker_env": "TERMINAL_DOCKER_ENV",
|
||||
"docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
|
||||
"docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
|
||||
"docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
|
||||
"docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
|
||||
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
|
||||
# Persistent shell (non-local backends)
|
||||
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
|
||||
@@ -15125,13 +15127,50 @@ def main(
|
||||
# Handle single query mode
|
||||
if query or image:
|
||||
query, single_query_images = _collect_query_images(query, image)
|
||||
# Kanban workers spawn with ``hermes chat -q "work kanban task <id>"``;
|
||||
# the actual task description lives in the task body. Mirror the
|
||||
# gateway/CLI behaviour for inbound images by scanning the body for
|
||||
# local image paths and http(s) image URLs and attaching them to the
|
||||
# worker's first turn. Without this, users who paste a screenshot
|
||||
# path or URL into a kanban task body never get it routed to the
|
||||
# model's vision input.
|
||||
single_query_image_urls: list[str] = []
|
||||
_kanban_task_id = os.environ.get("HERMES_KANBAN_TASK", "").strip()
|
||||
if _kanban_task_id:
|
||||
try:
|
||||
from hermes_cli import kanban_db as _kb
|
||||
from agent.image_routing import extract_image_refs as _extract_refs
|
||||
|
||||
_conn = _kb.connect()
|
||||
try:
|
||||
_task = _kb.get_task(_conn, _kanban_task_id)
|
||||
finally:
|
||||
try:
|
||||
_conn.close()
|
||||
except Exception:
|
||||
pass
|
||||
_body = getattr(_task, "body", "") if _task is not None else ""
|
||||
if _body:
|
||||
_kb_paths, _kb_urls = _extract_refs(_body)
|
||||
if _kb_paths:
|
||||
# Dedupe against any --image the user already passed.
|
||||
_seen = {str(p) for p in single_query_images}
|
||||
for _p in _kb_paths:
|
||||
if _p not in _seen:
|
||||
_seen.add(_p)
|
||||
single_query_images.append(Path(_p))
|
||||
if _kb_urls:
|
||||
single_query_image_urls.extend(_kb_urls)
|
||||
except Exception as _exc:
|
||||
# Best-effort enrichment; never block worker startup on it.
|
||||
logger.debug("kanban image-ref extraction failed: %s", _exc)
|
||||
if quiet:
|
||||
# Quiet mode: suppress banner, spinner, tool previews.
|
||||
# Only print the final response and parseable session info.
|
||||
cli.tool_progress_mode = "off"
|
||||
if cli._ensure_runtime_credentials():
|
||||
effective_query: Any = query
|
||||
if single_query_images:
|
||||
if single_query_images or single_query_image_urls:
|
||||
# Honour the same image-routing decision used by the
|
||||
# interactive path. With a vision-capable model (incl.
|
||||
# custom-provider models declared via
|
||||
@@ -15160,19 +15199,26 @@ def main(
|
||||
_parts, _skipped = _build_parts(
|
||||
query if isinstance(query, str) else "",
|
||||
[str(p) for p in single_query_images],
|
||||
image_urls=list(single_query_image_urls) or None,
|
||||
)
|
||||
if any(p.get("type") == "image_url" for p in _parts):
|
||||
effective_query = _parts
|
||||
else:
|
||||
# All images unreadable — text fallback.
|
||||
# ``_preprocess_images_with_vision`` only knows
|
||||
# about local files; URLs would be lost there,
|
||||
# so keep the original query text intact when
|
||||
# only URLs were supplied.
|
||||
if single_query_images:
|
||||
effective_query = cli._preprocess_images_with_vision(
|
||||
query, single_query_images, announce=False,
|
||||
)
|
||||
except Exception:
|
||||
if single_query_images:
|
||||
effective_query = cli._preprocess_images_with_vision(
|
||||
query, single_query_images, announce=False,
|
||||
)
|
||||
except Exception:
|
||||
effective_query = cli._preprocess_images_with_vision(
|
||||
query, single_query_images, announce=False,
|
||||
)
|
||||
else:
|
||||
elif single_query_images:
|
||||
effective_query = cli._preprocess_images_with_vision(
|
||||
query,
|
||||
single_query_images,
|
||||
|
||||
@@ -30,13 +30,21 @@ cd /opt/data
|
||||
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
|
||||
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
|
||||
|
||||
# Binding to anything other than localhost requires --insecure — the
|
||||
# dashboard refuses otherwise because it exposes API keys. Inside a
|
||||
# container this is the expected deployment.
|
||||
# `--insecure` is opt-in via HERMES_DASHBOARD_INSECURE. The dashboard's
|
||||
# OAuth auth gate engages automatically on non-loopback binds when a
|
||||
# DashboardAuthProvider is registered (e.g. the bundled dashboard_auth/nous
|
||||
# provider, which auto-registers when HERMES_DASHBOARD_OAUTH_CLIENT_ID is
|
||||
# set). If no provider is registered, start_server fails closed with a
|
||||
# specific operator-facing error.
|
||||
#
|
||||
# This used to derive --insecure from the bind host ("anything non-loopback
|
||||
# implies insecure"), but that predates the OAuth gate and silently
|
||||
# disabled it on every container-deployed dashboard. The gate is now the
|
||||
# authority; operators on trusted LANs / behind a reverse proxy without
|
||||
# the OAuth contract opt in explicitly.
|
||||
insecure=""
|
||||
case "$dash_host" in
|
||||
127.0.0.1|localhost) ;;
|
||||
*) insecure="--insecure" ;;
|
||||
case "${HERMES_DASHBOARD_INSECURE:-}" in
|
||||
1|true|TRUE|True|yes|YES|Yes) insecure="--insecure" ;;
|
||||
esac
|
||||
|
||||
# shellcheck disable=SC2086 # word-splitting of $insecure is intentional
|
||||
|
||||
@@ -831,6 +831,8 @@ if _config_path.exists():
|
||||
"docker_env": "TERMINAL_DOCKER_ENV",
|
||||
"docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
|
||||
"docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
|
||||
"docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
|
||||
"docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
|
||||
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
|
||||
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
|
||||
}
|
||||
|
||||
@@ -14,8 +14,8 @@ Provides subcommands for:
|
||||
import os
|
||||
import sys
|
||||
|
||||
__version__ = "0.15.0"
|
||||
__release_date__ = "2026.5.28"
|
||||
__version__ = "0.15.1"
|
||||
__release_date__ = "2026.5.29"
|
||||
|
||||
|
||||
def _ensure_utf8():
|
||||
|
||||
@@ -5551,6 +5551,8 @@ def set_config_value(key: str, value: str):
|
||||
"terminal.daytona_image": "TERMINAL_DAYTONA_IMAGE",
|
||||
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
|
||||
"terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
|
||||
"terminal.docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
|
||||
"terminal.docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
|
||||
"terminal.docker_env": "TERMINAL_DOCKER_ENV",
|
||||
# terminal.cwd intentionally excluded — CLI resolves at runtime,
|
||||
# gateway bridges it in gateway/run.py. Persisting to .env causes
|
||||
|
||||
@@ -26,10 +26,15 @@ from hermes_cli.dashboard_auth import list_providers
|
||||
from hermes_cli.dashboard_auth.audit import AuditEvent, audit_log
|
||||
from hermes_cli.dashboard_auth.base import ProviderError
|
||||
from hermes_cli.dashboard_auth.cookies import read_session_cookies
|
||||
from hermes_cli.dashboard_auth.public_paths import PUBLIC_API_PATHS
|
||||
|
||||
_log = logging.getLogger(__name__)
|
||||
|
||||
# Paths that bypass the auth gate. Order matters: prefix match.
|
||||
# Prefixes that bypass the auth gate. Match via ``path == prefix`` or
|
||||
# ``path.startswith(prefix)`` — so ``/assets/`` (with trailing slash)
|
||||
# matches ``/assets/foo.css`` but not ``/assetsleak``. Auth-bootstrap
|
||||
# (login page, OAuth round trip, provider listing) and static asset
|
||||
# mounts go here.
|
||||
_GATE_PUBLIC_PREFIXES: tuple[str, ...] = (
|
||||
"/auth/login",
|
||||
"/auth/callback",
|
||||
@@ -45,6 +50,20 @@ _GATE_PUBLIC_PREFIXES: tuple[str, ...] = (
|
||||
|
||||
|
||||
def _path_is_public(path: str) -> bool:
|
||||
"""True if ``path`` bypasses the OAuth auth gate.
|
||||
|
||||
Two sources of public-ness:
|
||||
|
||||
* :data:`PUBLIC_API_PATHS` — the shared ``/api/*`` allowlist that
|
||||
the legacy ``_SESSION_TOKEN`` middleware also honours. Matched
|
||||
exactly (no prefix expansion) so adding ``/api/status`` doesn't
|
||||
accidentally expose ``/api/status/secret-extension``.
|
||||
* :data:`_GATE_PUBLIC_PREFIXES` — auth-bootstrap routes and static
|
||||
mounts. Prefix-matched so ``/assets/foo.css`` lights up via
|
||||
``/assets/``.
|
||||
"""
|
||||
if path in PUBLIC_API_PATHS:
|
||||
return True
|
||||
return any(
|
||||
path == prefix or path.startswith(prefix)
|
||||
for prefix in _GATE_PUBLIC_PREFIXES
|
||||
|
||||
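The two-tier matching described in the `_path_is_public` docstring can be sketched as follows. This is a minimal illustration, not the project's code: the allowlist contents are hypothetical stand-ins (a couple of entries borrowed from this diff, with `/assets/` assumed from the comment), and the function name drops the leading underscore to mark it as a sketch.

```python
from __future__ import annotations

# Hypothetical stand-ins for the real allowlists; the full production
# values are not shown in this hunk.
PUBLIC_API_PATHS: frozenset[str] = frozenset({"/api/status", "/api/config/schema"})
GATE_PUBLIC_PREFIXES: tuple[str, ...] = ("/auth/login", "/auth/callback", "/assets/")


def path_is_public(path: str) -> bool:
    """Return True if ``path`` should bypass the auth gate."""
    # Tier 1: exact match only, so "/api/status" does not expose
    # "/api/status/secret-extension".
    if path in PUBLIC_API_PATHS:
        return True
    # Tier 2: prefix match. A trailing slash on the prefix is what keeps
    # "/assets/" from matching "/assetsleak".
    return any(path.startswith(prefix) for prefix in GATE_PUBLIC_PREFIXES)
```

Note that `path.startswith(prefix)` already covers the `path == prefix` case, so the explicit equality test in the hunk is redundant but harmless. A prefix without a trailing slash (such as `/auth/login`) would also match `/auth/loginx`, which is worth keeping in mind when adding entries.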
@@ -0,0 +1,49 @@
"""Shared allowlist of ``/api/*`` paths that bypass dashboard auth.

Two middlewares enforce dashboard auth and previously kept independent
copies of this list:

* ``hermes_cli.web_server.auth_middleware`` — loopback / ``--insecure``
  mode, gates on the ephemeral ``_SESSION_TOKEN``.
* ``hermes_cli.dashboard_auth.middleware.gated_auth_middleware`` —
  non-loopback mode, gates on the OAuth session cookie.

When the lists drifted, ``/api/status`` ended up public under the legacy
gate but 401'd under the OAuth gate. That broke the portal's wildcard
liveness probe (``nous-account-service`` ``fly-provider.ts``
``getInstanceRuntimeStatus``), which fetches ``/api/status`` without a
cookie as its sole signal of "agent dashboard is alive": every healthy
wildcard-subdomain agent surfaced as STARTING/down in the portal UI even
though the dashboard was serving correctly.

Centralising the allowlist here so both middlewares import the same
frozenset prevents the next drift. Keep this list minimal — only truly
non-sensitive, read-only endpoints belong here. As a sanity check, every
entry should be safe to expose to:

* external uptime probes (Pingdom, Better Stack, NAS),
* the dashboard SPA before the user has logged in,
* anyone who happens to ``curl`` the hostname.

If a new endpoint doesn't pass all three tests, it should be gated and
the SPA should bootstrap it after login instead.
"""
from __future__ import annotations

PUBLIC_API_PATHS: frozenset[str] = frozenset({
    # Liveness probe target. Returns version, gateway state, active
    # session count, and the dashboard auth-gate shape. No bodies, no
    # session content, no secrets. Documented as the portal's wildcard
    # liveness probe in
    # ``docs/agent-dashboard-public-url-contract.md`` (NAS side).
    "/api/status",
    # Read-only config-defaults / schema feeds for the SPA's Config page.
    "/api/config/defaults",
    "/api/config/schema",
    # Read-only model metadata (context windows, etc.) — same shape as
    # provider catalogs already exposed on the public internet.
    "/api/model/info",
    # Read-only theme + plugin manifests for the dashboard skin engine.
    "/api/dashboard/themes",
    "/api/dashboard/plugins",
})
@@ -13391,11 +13391,6 @@ Examples:
|
||||
"--yes", "-y", action="store_true", help="Skip confirmation"
|
||||
)
|
||||
|
||||
sessions_subparsers.add_parser(
|
||||
"optimize",
|
||||
help="Reclaim disk space: merge FTS5 segments + VACUUM (no data change)",
|
||||
)
|
||||
|
||||
sessions_subparsers.add_parser("stats", help="Show session store statistics")
|
||||
|
||||
sessions_rename = sessions_subparsers.add_parser(
|
||||
@@ -13568,39 +13563,6 @@ Examples:
|
||||
relaunch(["--resume", selected_id])
|
||||
return # won't reach here after execvp
|
||||
|
||||
elif action == "optimize":
|
||||
db_path = db.db_path
|
||||
before_mb = (
|
||||
os.path.getsize(db_path) / (1024 * 1024)
|
||||
if db_path.exists()
|
||||
else 0.0
|
||||
)
|
||||
print("Optimizing session store (FTS merge + VACUUM)…")
|
||||
try:
|
||||
# vacuum() merges FTS5 segments (optimize_fts) then VACUUMs.
|
||||
# Probe the index count first for the summary line.
|
||||
n = sum(
|
||||
1
|
||||
for t in db._FTS_TABLES
|
||||
if db._fts_table_exists(t)
|
||||
)
|
||||
db.vacuum()
|
||||
except Exception as e:
|
||||
print(f"Error: optimization failed: {e}")
|
||||
db.close()
|
||||
return
|
||||
after_mb = (
|
||||
os.path.getsize(db_path) / (1024 * 1024)
|
||||
if db_path.exists()
|
||||
else 0.0
|
||||
)
|
||||
saved = before_mb - after_mb
|
||||
print(f"Optimized {n} FTS index(es).")
|
||||
print(
|
||||
f"Database size: {before_mb:.1f} MB -> {after_mb:.1f} MB "
|
||||
f"(reclaimed {saved:.1f} MB)"
|
||||
)
|
||||
|
||||
elif action == "stats":
|
||||
total = db.session_count()
|
||||
msgs = db.message_count()
|
||||
|
||||
+13
-10
@@ -110,17 +110,20 @@ app.add_middleware(
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Endpoints that do NOT require the session token. Everything else under
|
||||
# /api/ is gated by the auth middleware below. Keep this list minimal —
|
||||
# only truly non-sensitive, read-only endpoints belong here.
|
||||
# /api/ is gated by the auth middleware below.
|
||||
#
|
||||
# This list is defined in ``hermes_cli.dashboard_auth.public_paths`` so the
|
||||
# OAuth gate middleware can honour the same allowlist — keeping the two
|
||||
# gates in lockstep avoids drift like the wildcard-subdomain regression
|
||||
# where ``/api/status`` was public under the legacy gate but 401'd under
|
||||
# the OAuth gate (breaking the portal's liveness probe).
|
||||
#
|
||||
# Keep the upstream list minimal — only truly non-sensitive, read-only
|
||||
# endpoints belong there.
|
||||
# ---------------------------------------------------------------------------
|
||||
_PUBLIC_API_PATHS: frozenset = frozenset({
|
||||
"/api/status",
|
||||
"/api/config/defaults",
|
||||
"/api/config/schema",
|
||||
"/api/model/info",
|
||||
"/api/dashboard/themes",
|
||||
"/api/dashboard/plugins",
|
||||
})
|
||||
from hermes_cli.dashboard_auth.public_paths import (
|
||||
PUBLIC_API_PATHS as _PUBLIC_API_PATHS,
|
||||
)
|
||||
|
||||
|
||||
def _has_valid_session_token(request: Request) -> bool:
|
||||
|
||||
@@ -3116,58 +3116,6 @@ class SessionDB:
|
||||
|
||||
# ── Space reclamation ──
|
||||
|
||||
# FTS5 virtual tables whose b-tree segments we merge on optimize. The
|
||||
# trigram table is created lazily / may be disabled, so we probe before
|
||||
# touching it (see optimize_fts).
|
||||
_FTS_TABLES = ("messages_fts", "messages_fts_trigram")
|
||||
|
||||
def _fts_table_exists(self, name: str) -> bool:
|
||||
"""True if an FTS5 virtual table is queryable in this DB."""
|
||||
try:
|
||||
self._conn.execute(f"SELECT 1 FROM {name} LIMIT 0")
|
||||
return True
|
||||
except sqlite3.OperationalError:
|
||||
return False
|
||||
|
||||
def optimize_fts(self) -> int:
|
||||
"""Merge fragmented FTS5 b-tree segments into one per index.
|
||||
|
||||
FTS5 indexes grow as a series of incremental segments — one per
|
||||
``INSERT`` batch driven by the message triggers. Over tens of
|
||||
thousands of messages these segments accumulate, which both bloats
|
||||
the ``*_data`` shadow tables and slows ``MATCH`` queries that must
|
||||
scan every segment. The special ``'optimize'`` command rewrites each
|
||||
index as a single merged segment.
|
||||
|
||||
This is purely a maintenance operation — it changes neither search
|
||||
results nor ``snippet()`` output, only on-disk layout and query
|
||||
speed. It is complementary to VACUUM: ``optimize`` compacts the FTS
|
||||
index internally, then VACUUM returns the freed pages to the OS.
|
||||
|
||||
Skips any FTS table that does not exist (e.g. the trigram index when
|
||||
disabled via ``HERMES_DISABLE_FTS_TRIGRAM`` or not yet created), so
|
||||
it is safe to call unconditionally.
|
||||
|
||||
Returns the number of FTS indexes that were optimized.
|
||||
"""
|
||||
optimized = 0
|
||||
with self._lock:
|
||||
for tbl in self._FTS_TABLES:
|
||||
if not self._fts_table_exists(tbl):
|
||||
continue
|
||||
try:
|
||||
# The column name in the INSERT must match the table name
|
||||
# for FTS5 special commands.
|
||||
self._conn.execute(
|
||||
f"INSERT INTO {tbl}({tbl}) VALUES('optimize')"
|
||||
)
|
||||
optimized += 1
|
||||
except sqlite3.OperationalError as exc:
|
||||
logger.warning(
|
||||
"FTS optimize failed for %s: %s", tbl, exc
|
||||
)
|
||||
return optimized
|
||||
|
||||
def vacuum(self) -> None:
|
||||
"""Run VACUUM to reclaim disk space after large deletes.
|
||||
|
||||
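For reference, the FTS5 `'optimize'` special command used by the removed `optimize_fts` helper can be exercised in isolation. The sketch below is illustrative only: it uses an in-memory database, a single-column table, and made-up rows, and it assumes the local SQLite build ships with FTS5 enabled (common, but not guaranteed).

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and insert rows one commit at a time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE messages_fts USING fts5(content)")
    for row in rows:
        # Committing per row lets the index accumulate several segments.
        conn.execute("INSERT INTO messages_fts(content) VALUES (?)", (row,))
        conn.commit()
    return conn


def search(conn, term):
    """Return matching rows in rowid order."""
    cur = conn.execute(
        "SELECT content FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rowid",
        (term,),
    )
    return [r[0] for r in cur]


def optimize(conn, table="messages_fts"):
    """Merge the index's segments via the FTS5 'optimize' special command."""
    # For FTS5 special commands the column named in the INSERT is the
    # table name itself. The table name is trusted input in this sketch.
    conn.execute(f"INSERT INTO {table}({table}) VALUES('optimize')")
    conn.commit()


conn = build_index([
    "dashboard auth gate",
    "session store vacuum",
    "dashboard liveness probe",
])
before = search(conn, "dashboard")
optimize(conn)
after = search(conn, "dashboard")
```

The merge only changes on-disk layout, so `before` and `after` hold the same rows; on a file-backed database a subsequent `VACUUM` is what returns freed pages to the filesystem.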
@@ -3181,17 +3129,7 @@ class SessionDB:
|
||||
exclusive lock, so callers must ensure no other writers are
|
||||
active. Safe to call at startup before the gateway/CLI starts
|
||||
serving traffic.
|
||||
|
||||
FTS5 segments are merged first via :meth:`optimize_fts` so the
|
||||
subsequent VACUUM reclaims the pages freed by the merge. This is a
|
||||
layout-only optimization — search results are unchanged.
|
||||
"""
|
||||
# Merge FTS5 segments before VACUUM so the freed pages are returned
|
||||
# to the OS in the same pass. optimize_fts() manages its own lock.
|
||||
try:
|
||||
self.optimize_fts()
|
||||
except Exception as exc:
|
||||
logger.warning("FTS optimize before VACUUM failed: %s", exc)
|
||||
# VACUUM cannot be executed inside a transaction.
|
||||
with self._lock:
|
||||
# Best-effort WAL checkpoint first, then VACUUM.
|
||||
|
||||
+1
-1
@@ -4,7 +4,7 @@ let
|
||||
src = ../web;
|
||||
npmDeps = pkgs.fetchNpmDeps {
|
||||
inherit src;
|
||||
hash = "sha256-6qhGuifHVtCeep1SiQdCUxBMr7UGhYpdMTvXhrQu/zA=";
|
||||
hash = "sha256-HV0aISBVjwbGqDj8qQynSxGFrrZDzuYAW3D3lB/x3zo=";
|
||||
};
|
||||
|
||||
npm = hermesNpmLib.mkNpmPassthru { folder = "web"; attr = "web"; pname = "hermes-web"; };
|
||||
|
||||
@@ -75,8 +75,17 @@ Config file: `~/.hermes/hindsight/config.json`
|
||||
| `recall_prompt_preamble` | — | Custom preamble for recalled memories in context |
|
||||
| `recall_tags` | — | Tags to filter when searching memories |
|
||||
| `recall_tags_match` | `any` | Tag matching mode: `any` / `all` / `any_strict` / `all_strict` |
|
||||
| `recall_types` | `observation` | Fact types surfaced by recall (both auto-recall and the `hindsight_recall` tool). Comma-separated string or JSON list. **Default narrowed to `observation` only** (see "Behavior change" below). Set to `observation,world,experience` to also include raw facts. |
|
||||
| `auto_recall` | `true` | Automatically recall memories before each turn |
|
||||
|
||||
> **Behavior change — `recall_types` defaults to `observation` only.**
|
||||
>
|
||||
> Previously recall returned all three fact types. It now returns only observations.
|
||||
>
|
||||
> Per [Hindsight's docs](https://hindsight.vectorize.io/developer/observations), observations are the **consolidated** knowledge layer Hindsight builds on top of raw facts: deduplicated beliefs grounded in evidence, refined as new facts arrive, with proof counts and freshness signals. Raw `world` / `experience` facts are the individual supporting evidence that feeds them. For per-turn context injection, observations are denser per token and avoid feeding the model multiple raw facts that one observation already summarizes.
|
||||
>
|
||||
> Restore the broad recall with `"recall_types": "observation,world,experience"` (string or JSON list) in `~/.hermes/hindsight/config.json`. This applies to **both** auto-recall and the `hindsight_recall` tool — both read the same `recall_types` setting (the tool schema has no per-call `types` argument), so narrowing the default narrows both paths.
|
||||
|
||||
### Retain
|
||||
|
||||
| Key | Default | Description |
|
||||
|
||||
@@ -579,7 +579,15 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
# Recall controls
|
||||
self._auto_recall = True
|
||||
self._recall_max_tokens = 4096
|
||||
self._recall_types: list[str] | None = None
|
||||
# Default to observation-only recall. Observations are Hindsight's
|
||||
# consolidated knowledge layer — deduplicated, evidence-grounded
|
||||
# beliefs built from many raw facts, with proof counts and
|
||||
# freshness signals (see hindsight.vectorize.io/developer/observations).
|
||||
# Including raw world/experience facts re-ships the supporting
|
||||
# evidence that observations already summarize, burning the
|
||||
# `recall_max_tokens` budget. Users can restore the broader
|
||||
# recall via the `recall_types` config key.
|
||||
self._recall_types: list[str] = ["observation"]
|
||||
self._recall_prompt_preamble = ""
|
||||
self._recall_max_input_chars = 800
|
||||
|
||||
@@ -856,6 +864,7 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
{"key": "retain_assistant_prefix", "description": "Label used before assistant turns in retained transcripts", "default": "Assistant"},
|
||||
{"key": "recall_tags", "description": "Tags to filter when searching memories (comma-separated)", "default": ""},
|
||||
{"key": "recall_tags_match", "description": "Tag matching mode for recall", "default": "any", "choices": ["any", "all", "any_strict", "all_strict"]},
|
||||
{"key": "recall_types", "description": "Fact types to surface on recall — applies to both auto-recall and the hindsight_recall tool (comma-separated or list). Defaults to observation-only — observations are Hindsight's consolidated, deduplicated, evidence-grounded knowledge layer; raw world/experience facts are the supporting evidence observations already summarize. Set to e.g. 'observation,world,experience' to also include raw facts.", "default": "observation"},
|
||||
{"key": "auto_recall", "description": "Automatically recall memories before each turn", "default": True},
|
||||
{"key": "auto_retain", "description": "Automatically retain conversation turns", "default": True},
|
||||
{"key": "retain_every_n_turns", "description": "Retain every N turns (1 = every turn)", "default": 1},
|
||||
@@ -1187,7 +1196,17 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
# Recall controls
|
||||
self._auto_recall = self._config.get("auto_recall", True)
|
||||
self._recall_max_tokens = int(self._config.get("recall_max_tokens", 4096))
|
||||
self._recall_types = self._config.get("recall_types") or None
|
||||
# Default narrows recall to observation-only; pass an explicit
|
||||
# `recall_types` list in config.json to broaden (e.g. include
|
||||
# "world" / "experience") or to disable the filter entirely.
|
||||
configured_types = self._config.get("recall_types")
|
||||
if configured_types is None:
|
||||
self._recall_types = ["observation"]
|
||||
elif isinstance(configured_types, str):
|
||||
# Allow comma-separated strings for parity with recall_tags.
|
||||
self._recall_types = [t.strip() for t in configured_types.split(",") if t.strip()]
|
||||
else:
|
||||
self._recall_types = list(configured_types) or ["observation"]
|
||||
self._recall_prompt_preamble = self._config.get("recall_prompt_preamble", "")
|
||||
self._recall_max_input_chars = int(self._config.get("recall_max_input_chars", 800))
|
||||
self._retain_async = self._config.get("retain_async", True)
|
||||
|
||||
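The `recall_types` normalisation in this hunk can be read as a small pure function. The sketch below mirrors the three branches shown above; the function name and the module-level default constant are hypothetical, introduced only for illustration.

```python
from __future__ import annotations

from typing import Any

# Hypothetical constant; the hunk inlines ["observation"] instead.
DEFAULT_RECALL_TYPES = ["observation"]


def normalize_recall_types(configured: Any) -> list[str]:
    """Turn the raw ``recall_types`` config value into a list of type names."""
    if configured is None:
        # Key absent: fall back to observation-only recall.
        return list(DEFAULT_RECALL_TYPES)
    if isinstance(configured, str):
        # Comma-separated string, for parity with recall_tags.
        return [t.strip() for t in configured.split(",") if t.strip()]
    # JSON list (or any iterable); an empty list falls back to the default.
    return list(configured) or list(DEFAULT_RECALL_TYPES)
```

One asymmetry is visible in this form: an empty list falls back to `["observation"]`, whereas an empty string yields an empty list. How an empty list is interpreted downstream is not shown in this diff.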
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "hermes-agent"
|
||||
version = "0.15.0"
|
||||
version = "0.15.1"
|
||||
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
|
||||
+7
-2
@@ -2302,6 +2302,7 @@ class AIAgent:
|
||||
original_user_message: Any,
|
||||
final_response: Any,
|
||||
interrupted: bool,
|
||||
messages: list | None = None,
|
||||
) -> None:
|
||||
"""Mirror a completed turn into external memory providers.
|
||||
|
||||
@@ -2334,9 +2335,13 @@ class AIAgent:
|
||||
if not (self._memory_manager and final_response and original_user_message):
|
||||
return
|
||||
try:
|
||||
sync_kwargs = {"session_id": self.session_id or ""}
|
||||
if messages is not None:
|
||||
sync_kwargs["messages"] = messages
|
||||
self._memory_manager.sync_all(
|
||||
original_user_message, final_response,
|
||||
session_id=self.session_id or "",
|
||||
original_user_message,
|
||||
final_response,
|
||||
**sync_kwargs,
|
||||
)
|
||||
self._memory_manager.queue_prefetch_all(
|
||||
original_user_message,
|
||||
|
||||
@@ -101,6 +101,8 @@ AUTHOR_MAP = {
|
||||
"kronexoi13@gmail.com": "kronexoi",
|
||||
"hua.zhong@kingsmith.com": "vgocoder",
|
||||
"hermes@marian.local": "Schrotti77",
|
||||
"david@memorilabs.ai": "devwdave",
|
||||
"dave@devwdave.com": "devwdave",
|
||||
"1920071390@campus.ouj.ac.jp": "zapabob",
|
||||
"gaia@gaia.local": "jfuenmayor",
|
||||
"jiahuigu@users.noreply.github.com": "Jiahui-Gu",
|
||||
|
||||
@@ -16,6 +16,7 @@ from agent.image_routing import (
|
||||
_supports_vision_override,
|
||||
build_native_content_parts,
|
||||
decide_image_input_mode,
|
||||
extract_image_refs,
|
||||
)
|
||||
|
||||
|
||||
@@ -449,3 +450,190 @@ class TestLargeImageHandling:
|
||||
assert len(parts) == 2
|
||||
assert parts[0]["type"] == "text"
|
||||
assert parts[1]["type"] == "image_url"
|
||||
|
||||
|
||||
# ─── extract_image_refs ──────────────────────────────────────────────────────
|
||||
|
||||
|
||||
class TestExtractImageRefs:
|
||||
"""Scan task body / inbound text for image paths and URLs (kanban worker
|
||||
enrichment, issue raised May 2026)."""
|
||||
|
||||
def test_empty_or_none_returns_empty(self):
|
||||
assert extract_image_refs("") == ([], [])
|
||||
assert extract_image_refs(None) == ([], []) # type: ignore[arg-type]
|
||||
|
||||
def test_finds_absolute_path(self, tmp_path: Path):
|
||||
img = tmp_path / "screenshot.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
body = f"Look at {img} and tell me what's wrong."
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == []
|
||||
|
||||
def test_finds_home_relative_path(self, tmp_path: Path, monkeypatch):
|
||||
# Simulate ~/foo.png by pointing HOME at tmp_path and creating the file
|
||||
monkeypatch.setenv("HOME", str(tmp_path))
|
||||
img = tmp_path / "foo.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
paths, urls = extract_image_refs("see ~/foo.png please")
|
||||
assert paths == [str(img)]
|
||||
assert urls == []
|
||||
|
||||
def test_skips_nonexistent_paths(self, tmp_path: Path):
|
||||
# Path-shaped but no file on disk → skipped.
|
||||
body = f"What's at {tmp_path}/never_created.png ?"
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == []
|
||||
assert urls == []
|
||||
|
||||
def test_finds_http_image_url(self):
|
||||
body = "Check out https://example.com/photos/cat.png — cute right?"
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == []
|
||||
assert urls == ["https://example.com/photos/cat.png"]
|
||||
|
||||
def test_finds_https_url_with_query_string(self):
|
||||
body = "Diagram: https://cdn.example.com/img.jpeg?size=large&v=2 here"
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert urls == ["https://cdn.example.com/img.jpeg?size=large&v=2"]
|
||||
|
||||
def test_url_trailing_punctuation_stripped(self):
|
||||
# Prose punctuation right after the URL must not be part of the URL.
|
||||
body = "See https://example.com/a.png."
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert urls == ["https://example.com/a.png"]
|
||||
|
||||
def test_ignores_non_image_urls(self):
|
||||
body = "See https://example.com/page.html and https://x.com/y.pdf"
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert urls == []
|
||||
|
||||
def test_dedupes_paths_and_urls(self, tmp_path: Path):
|
||||
img = tmp_path / "dup.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
body = (
|
||||
f"First {img} then again {img}. "
|
||||
"Also https://example.com/x.png and https://example.com/x.png again."
|
||||
)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == ["https://example.com/x.png"]
|
||||
|
||||
def test_ignores_paths_in_fenced_code_block(self, tmp_path: Path):
|
||||
img = tmp_path / "real.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
body = (
|
||||
"Outside the block, attach this:\n"
|
||||
f"{img}\n"
|
||||
"But not these examples:\n"
|
||||
"```\n"
|
||||
"some_other_image: /tmp/example.png\n"
|
||||
"url: https://example.com/example.png\n"
|
||||
"```\n"
|
||||
)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == []
|
||||
|
||||
def test_ignores_paths_in_inline_code(self, tmp_path: Path):
|
||||
img = tmp_path / "real.jpg"
|
||||
img.write_bytes(_png_bytes())
|
||||
body = (
|
||||
f"Attach {img}, but ignore the example "
|
||||
"`https://example.com/skip.png` in backticks."
|
||||
)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == []
|
||||
|
||||
def test_does_not_match_paths_inside_urls(self, tmp_path: Path):
|
||||
# The lookbehind in the regex prevents matching the path-portion of
|
||||
# a URL as a local path. Only the URL should be detected.
|
||||
body = "Just the URL: https://example.com/some/dir/image.png"
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == []
|
||||
assert urls == ["https://example.com/some/dir/image.png"]
|
||||
|
||||
def test_mixed_paths_and_urls(self, tmp_path: Path):
|
||||
img = tmp_path / "local.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
body = (
|
||||
f"Compare local {img} against the design at "
|
||||
"https://example.com/design/v2.png — does it match?"
|
||||
)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == ["https://example.com/design/v2.png"]
|
||||
|
||||
def test_case_insensitive_extension(self, tmp_path: Path):
|
||||
img = tmp_path / "shouty.PNG"
|
||||
img.write_bytes(_png_bytes())
|
||||
body = f"see {img}"
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
|
||||
|
||||
# ─── build_native_content_parts with URLs ────────────────────────────────────
|
||||
|
||||
|
||||
class TestBuildNativeContentPartsURLs:
|
||||
"""URL pass-through support added so kanban task bodies (and other
|
||||
inbound surfaces) can route remote image URLs straight to the model."""
|
||||
|
||||
def test_url_only_no_local_paths(self):
|
||||
parts, skipped = build_native_content_parts(
|
||||
"what is this?",
|
||||
[],
|
||||
image_urls=["https://example.com/diagram.png"],
|
||||
)
|
||||
assert skipped == []
|
||||
assert len(parts) == 2
|
||||
assert parts[0]["type"] == "text"
|
||||
assert "[Image attached: https://example.com/diagram.png]" in parts[0]["text"]
|
||||
assert parts[0]["text"].startswith("what is this?")
|
||||
assert parts[1] == {
|
||||
"type": "image_url",
|
||||
"image_url": {"url": "https://example.com/diagram.png"},
|
||||
}
|
||||
|
||||
def test_mixed_path_and_url(self, tmp_path: Path):
|
||||
img = tmp_path / "local.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
parts, skipped = build_native_content_parts(
|
||||
"compare these",
|
||||
[str(img)],
|
||||
image_urls=["https://example.com/remote.jpg"],
|
||||
)
|
||||
assert skipped == []
|
||||
# 1 text + 2 image parts (local data URL first, then remote URL).
|
||||
image_parts = [p for p in parts if p.get("type") == "image_url"]
|
||||
assert len(image_parts) == 2
|
||||
assert image_parts[0]["image_url"]["url"].startswith("data:image/png;base64,")
|
||||
assert image_parts[1]["image_url"]["url"] == "https://example.com/remote.jpg"
|
||||
text = parts[0]["text"]
|
||||
assert "[Image attached at:" in text
|
||||
assert "[Image attached: https://example.com/remote.jpg]" in text
|
||||
|
||||
def test_empty_url_list_is_no_op(self, tmp_path: Path):
|
||||
img = tmp_path / "x.png"
|
||||
img.write_bytes(_png_bytes())
|
||||
# image_urls=[] should behave the same as not passing it at all.
|
||||
parts_no_urls, _ = build_native_content_parts("hi", [str(img)])
|
||||
parts_empty_urls, _ = build_native_content_parts("hi", [str(img)], image_urls=[])
|
||||
assert parts_no_urls == parts_empty_urls
|
||||
|
||||
def test_blank_url_strings_are_dropped(self):
|
||||
parts, _ = build_native_content_parts(
|
||||
"x", [], image_urls=["", " ", "https://example.com/a.png"]
|
||||
)
|
||||
image_parts = [p for p in parts if p.get("type") == "image_url"]
|
||||
assert len(image_parts) == 1
|
||||
assert image_parts[0]["image_url"]["url"] == "https://example.com/a.png"
|
||||
|
||||
def test_url_only_inserts_default_prompt_when_text_empty(self):
|
||||
parts, _ = build_native_content_parts(
|
||||
"", [], image_urls=["https://example.com/a.png"]
|
||||
)
|
||||
assert parts[0]["type"] == "text"
|
||||
assert parts[0]["text"].startswith("What do you see in this image?")
|
||||
|
||||
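The URL half of the behaviour these tests pin down (code spans ignored, trailing prose punctuation stripped, non-image URLs dropped, duplicates collapsed) could be implemented roughly as below. This is a sketch under stated assumptions, not the actual `extract_image_refs`: the function name, the regexes, and the extension list are all hypothetical, and local-path detection is left out entirely.

```python
import re
from urllib.parse import urlsplit

# Assumed extension list; the real set is not shown in this diff.
_IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".webp")
# Fenced blocks are tried first, then single-line inline code spans.
_CODE_RE = re.compile(r"```.*?```|`[^`\n]*`", re.DOTALL)
_URL_RE = re.compile(r"https?://[^\s<>\"'`]+", re.IGNORECASE)


def extract_image_urls(text):
    """Return de-duplicated image URLs found in prose, in order of appearance."""
    if not text:
        return []
    # Blank out code so example URLs inside it are never considered.
    prose = _CODE_RE.sub(" ", text)
    urls = []
    for match in _URL_RE.finditer(prose):
        # Sentence punctuation right after a URL is not part of the URL.
        url = match.group(0).rstrip(".,;:!?)")
        # Check the extension on the path only, so query strings are allowed.
        path = urlsplit(url).path.lower()
        if path.endswith(_IMAGE_EXTS) and url not in urls:
            urls.append(url)
    return urls
```

Checking the extension on the parsed path rather than inside the regex keeps URLs such as `a.png.html` from being mistaken for images and makes the match case-insensitive for free.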
@@ -84,6 +84,13 @@ class MetadataMemoryProvider(FakeMemoryProvider):
|
||||
self.memory_writes.append((action, target, content, metadata or {}))
|
||||
|
||||
|
||||
class MessagesMemoryProvider(FakeMemoryProvider):
|
||||
"""Provider that opts into completed-turn message context."""
|
||||
|
||||
def sync_turn(self, user_content, assistant_content, *, session_id="", messages=None):
|
||||
self.synced_turns.append((user_content, assistant_content, session_id, messages))
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# MemoryProvider ABC tests
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -236,6 +243,28 @@ class TestMemoryManager:
|
||||
assert p1.synced_turns == [("user msg", "assistant msg")]
|
||||
assert p2.synced_turns == [("user msg", "assistant msg")]
|
||||
|
||||
def test_sync_all_passes_messages_to_opted_in_provider(self):
|
||||
mgr = MemoryManager()
|
||||
p = MessagesMemoryProvider("external")
|
||||
mgr.add_provider(p)
|
||||
messages = [
|
||||
{"role": "assistant", "tool_calls": [{"id": "call-1"}]},
|
||||
{"role": "tool", "tool_call_id": "call-1", "content": "ok"},
|
||||
]
|
||||
|
||||
mgr.sync_all("user msg", "assistant msg", session_id="sess-1", messages=messages)
|
||||
|
||||
assert p.synced_turns == [("user msg", "assistant msg", "sess-1", messages)]
|
||||
|
||||
def test_sync_all_omits_messages_for_legacy_provider(self):
|
||||
mgr = MemoryManager()
|
||||
p = FakeMemoryProvider("external")
|
||||
mgr.add_provider(p)
|
||||
|
||||
mgr.sync_all("user msg", "assistant msg", messages=[{"role": "tool"}])
|
||||
|
||||
assert p.synced_turns == [("user msg", "assistant msg")]
|
||||
|
||||
def test_sync_failure_doesnt_block_others(self):
|
||||
"""If one provider's sync fails, others still run."""
|
||||
mgr = MemoryManager()
|
||||
|
||||
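The two tests above imply that `sync_all` forwards `messages` only to providers whose `sync_turn` can accept it. The manager's actual mechanism is not shown in this diff; one plausible approach, sketched here with hypothetical names throughout, is to inspect the method signature before building the keyword arguments.

```python
import inspect


def accepts_keyword(func, name):
    """True if ``func`` can be called with ``name=...`` as a keyword."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return False
    if name in params:
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs catch-all also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class LegacyProvider:
    """Hypothetical provider written before ``messages`` existed."""

    def __init__(self):
        self.synced = []

    def sync_turn(self, user_content, assistant_content, *, session_id=""):
        self.synced.append((user_content, assistant_content, session_id))


class MessagesProvider(LegacyProvider):
    """Hypothetical provider that opts into completed-turn messages."""

    def sync_turn(self, user_content, assistant_content, *, session_id="", messages=None):
        self.synced.append((user_content, assistant_content, session_id, messages))


def sync_all(providers, user_content, assistant_content, *, session_id="", messages=None):
    """Fan a completed turn out to every provider, passing only supported kwargs."""
    for provider in providers:
        kwargs = {"session_id": session_id}
        if messages is not None and accepts_keyword(provider.sync_turn, "messages"):
            kwargs["messages"] = messages
        try:
            provider.sync_turn(user_content, assistant_content, **kwargs)
        except Exception:
            # One failing provider should not block the others.
            continue
```

Signature inspection keeps older third-party providers working unchanged, at the cost of a small reflection step per call; caching the result per provider would be a natural refinement.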
@@ -227,6 +227,8 @@ _HERMES_BEHAVIORAL_VARS = frozenset({
|
||||
"TERMINAL_CONTAINER_DISK",
|
||||
"TERMINAL_CONTAINER_MEMORY",
|
||||
"TERMINAL_CONTAINER_PERSISTENT",
|
||||
"TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
|
||||
"TERMINAL_DOCKER_ORPHAN_REAPER",
|
||||
"TERMINAL_DOCKER_RUN_AS_HOST_USER",
|
||||
"BROWSER_CDP_URL",
|
||||
"CAMOFOX_URL",
|
||||
|
||||
@@ -12,6 +12,7 @@ the realistic runtime context. See the conftest module docstring.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import subprocess
|
||||
import time
|
||||
|
||||
@@ -87,7 +88,15 @@ def test_dashboard_slot_reports_up_when_enabled(
|
||||
"""Symmetry: with HERMES_DASHBOARD=1, s6-svstat reports the slot as up."""
|
||||
subprocess.run(
|
||||
["docker", "run", "-d", "--name", container_name,
|
||||
"-e", "HERMES_DASHBOARD=1", built_image, "sleep", "120"],
|
||||
"-e", "HERMES_DASHBOARD=1",
|
||||
# The default dashboard host is 0.0.0.0, which now engages the
|
||||
# OAuth auth gate. Without a provider registered (no
|
||||
# HERMES_DASHBOARD_OAUTH_CLIENT_ID in this test env), start_server
|
||||
# would fail closed and the slot would never come up. Pin the
|
||||
# explicit insecure opt-in to keep this test focused on the s6
|
||||
# supervision contract, not the auth gate.
|
||||
"-e", "HERMES_DASHBOARD_INSECURE=1",
|
||||
built_image, "sleep", "120"],
|
||||
check=True, capture_output=True, timeout=30,
|
||||
)
|
||||
# uvicorn takes a moment to bind; poll svstat.
|
||||
@@ -112,7 +121,12 @@ def test_dashboard_opt_in_starts(
|
||||
"""With HERMES_DASHBOARD=1, a dashboard process should be visible."""
|
||||
subprocess.run(
|
||||
["docker", "run", "-d", "--name", container_name,
|
||||
"-e", "HERMES_DASHBOARD=1", built_image, "sleep", "120"],
|
||||
"-e", "HERMES_DASHBOARD=1",
|
||||
# Default bind is 0.0.0.0; pin insecure opt-in so the auth gate
|
||||
# doesn't fail-closed before the process can come up. See
|
||||
# test_dashboard_slot_reports_up_when_enabled for the full rationale.
|
||||
"-e", "HERMES_DASHBOARD_INSECURE=1",
|
||||
built_image, "sleep", "120"],
|
||||
check=True, capture_output=True, timeout=30,
|
||||
)
|
||||
# Poll for the dashboard subprocess to appear — the entrypoint
|
||||
@@ -131,6 +145,10 @@ def test_dashboard_port_override(
|
||||
subprocess.run(
|
||||
["docker", "run", "-d", "--name", container_name,
|
||||
"-e", "HERMES_DASHBOARD=1", "-e", "HERMES_DASHBOARD_PORT=9120",
|
||||
# Default bind is 0.0.0.0; pin insecure opt-in so the auth gate
|
||||
# doesn't fail-closed before the port is bound. See
|
||||
# test_dashboard_slot_reports_up_when_enabled for the full rationale.
|
||||
"-e", "HERMES_DASHBOARD_INSECURE=1",
|
||||
built_image, "sleep", "120"],
|
||||
check=True, capture_output=True, timeout=30,
|
||||
)
|
||||
@@ -160,7 +178,13 @@ def test_dashboard_restarts_after_crash(
|
||||
"""
|
||||
subprocess.run(
|
||||
["docker", "run", "-d", "--name", container_name,
|
||||
"-e", "HERMES_DASHBOARD=1", built_image, "sleep", "120"],
|
||||
"-e", "HERMES_DASHBOARD=1",
|
||||
# Default bind is 0.0.0.0; pin insecure opt-in so the auth gate
|
||||
# doesn't fail-closed before the supervised dashboard can come up.
|
||||
# See test_dashboard_slot_reports_up_when_enabled for the full
|
||||
# rationale.
|
||||
"-e", "HERMES_DASHBOARD_INSECURE=1",
|
||||
built_image, "sleep", "120"],
|
||||
check=True, capture_output=True, timeout=30,
|
||||
)
|
||||
# Wait for the first dashboard to come up.
|
||||
@@ -201,3 +225,191 @@ def test_dashboard_restarts_after_crash(
|
||||
raise AssertionError(
|
||||
f"Dashboard not restarted after kill (first_pid={first_pid})"
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# OAuth auth-gate behaviour — regression guard for the dashboard-insecure
|
||||
# auto-injection bug. Pre-fix, the s6 run script appended `--insecure`
|
||||
# whenever `HERMES_DASHBOARD_HOST` was non-loopback, silently disabling
|
||||
# the OAuth gate on every container-deployed dashboard. The matching
|
||||
# static-text guard lives in tests/test_docker_home_override_scripts.py;
|
||||
# this is the behavioural end-to-end check.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def _http_probe(
|
||||
container: str,
|
||||
path: str,
|
||||
*,
|
||||
deadline_s: float = 60.0,
|
||||
) -> tuple[int, str]:
|
||||
"""Poll ``http://127.0.0.1:9119<path>`` from inside the container.
|
||||
|
||||
Returns ``(status_code, body)`` as soon as the dashboard answers any
|
||||
HTTP response — 200, 401, 503, anything. The image doesn't ship
|
||||
``curl`` but the venv's stdlib ``urllib`` is good enough; we use a
|
||||
proper ``try``/``except`` to intercept ``HTTPError`` because
|
||||
``urlopen`` raises on 4xx/5xx, and we treat those as legitimate
|
||||
responses (the OAuth gate's 401 IS the success signal for the
|
||||
gate-engaged test).
|
||||
|
||||
Connection errors (uvicorn still starting, fail-closed exited) keep
|
||||
the poll loop running until ``deadline_s`` elapses.
|
||||
|
||||
The probe Python program is fed over stdin (``python -``) rather
|
||||
than ``python -c`` so we can use proper multi-line syntax with
|
||||
``try``/``except`` blocks without escaping hell.
|
||||
|
||||
Raises ``AssertionError`` on timeout.
|
||||
"""
|
||||
py_program = f"""\
|
||||
import urllib.request, urllib.error
|
||||
req = urllib.request.Request("http://127.0.0.1:9119{path}")
|
||||
try:
|
||||
r = urllib.request.urlopen(req, timeout=5)
|
||||
print(r.status)
|
||||
print(r.read().decode(), end="")
|
||||
except urllib.error.HTTPError as h:
|
||||
print(h.code)
|
||||
print(h.read().decode(), end="")
|
||||
"""
|
||||
# Feed the program over stdin via a heredoc so docker_exec_sh's
|
||||
# single bash string stays clean. The 'PY' delimiter is quoted to
|
||||
# disable shell expansion inside the heredoc body.
|
||||
probe = (
|
||||
"/opt/hermes/.venv/bin/python - <<'PY'\n"
|
||||
f"{py_program}"
|
||||
"PY"
|
||||
)
|
||||
end = time.monotonic() + deadline_s
|
||||
last_err = ""
|
||||
while time.monotonic() < end:
|
||||
r = docker_exec_sh(container, probe, timeout=10)
|
||||
if r.returncode == 0 and r.stdout.strip():
|
||||
lines = r.stdout.split("\n", 1)
|
||||
try:
|
||||
status = int(lines[0].strip())
|
||||
body = lines[1] if len(lines) > 1 else ""
|
||||
return status, body
|
||||
except (ValueError, IndexError) as exc:
|
||||
last_err = f"parse: {exc!r} / stdout={r.stdout!r}"
|
||||
else:
|
||||
last_err = f"rc={r.returncode} stderr={r.stderr!r}"
|
||||
time.sleep(0.5)
|
||||
raise AssertionError(
|
||||
f"Probe of {path} never returned HTTP within {deadline_s}s; "
|
||||
f"last error: {last_err}"
|
||||
)
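The polling pattern that `_http_probe` relies on can be sketched on its own, without Docker. This is a minimal illustration, not the test suite's code: `parse_probe_output` and `poll_until_http` are hypothetical names, and `run_probe` stands in for whatever returns the probe's stdout (here, the `docker_exec_sh` call). Any parseable status line, including 401 or 503, counts as an answer; only empty or malformed output keeps the loop going.

```python
import time
from typing import Callable, Optional, Tuple


def parse_probe_output(stdout: str) -> Optional[Tuple[int, str]]:
    """Split probe output (status line, then body) into (status, body).

    Returns None when the output is empty or the first line is not an int,
    which the caller treats as "no HTTP answer yet".
    """
    if not stdout.strip():
        return None
    first, _, rest = stdout.partition("\n")
    try:
        return int(first.strip()), rest
    except ValueError:
        return None


def poll_until_http(
    run_probe: Callable[[], str],
    *,
    deadline_s: float = 60.0,
    interval_s: float = 0.5,
) -> Tuple[int, str]:
    """Call run_probe() until it yields a parseable HTTP status line."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        parsed = parse_probe_output(run_probe())
        if parsed is not None:
            return parsed
        time.sleep(interval_s)
    raise AssertionError(f"no HTTP response within {deadline_s}s")
```

Using `time.monotonic()` for the deadline, as the test does, keeps the timeout immune to wall-clock adjustments on the CI host.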
|
||||
|
||||
|
||||
def test_dashboard_oauth_gate_engages_on_non_loopback_bind(
|
||||
built_image: str, container_name: str,
|
||||
) -> None:
|
||||
"""The s6 dashboard run script must NOT auto-add ``--insecure`` when the
|
||||
dashboard binds to ``0.0.0.0``. The OAuth auth gate engages on its own
|
||||
when a ``DashboardAuthProvider`` is registered (the bundled nous
|
||||
provider activates whenever ``HERMES_DASHBOARD_OAUTH_CLIENT_ID`` is
|
||||
set).
|
||||
|
||||
Regression guard for the wildcard-subdomain rollout where every
|
||||
portal-provisioned agent binds ``0.0.0.0`` and relies on the OAuth
|
||||
gate to authenticate browser callers. Before this fix, the run script
|
||||
flipped ``--insecure`` on for any non-loopback bind, which routed
|
||||
``start_server`` straight back into the legacy ``allow_public=True``
|
||||
branch and disabled the gate every time.
|
||||
|
||||
We verify two independent observable consequences of the gate being
|
||||
on:
|
||||
|
||||
1. ``/api/auth/providers`` (publicly reachable through the gate so
|
||||
the login page can bootstrap) returns 200 with ``nous`` in the
|
||||
provider list — proves the bundled provider registered.
|
||||
2. ``/api/sessions`` (a gated route under both the legacy
|
||||
``_SESSION_TOKEN`` middleware and the OAuth gate) returns 401
|
||||
to an unauthenticated caller — proves the OAuth gate is actively
|
||||
intercepting browser traffic. We deliberately probe a gated route
|
||||
here rather than ``/api/status``: status sits in the shared
|
||||
``PUBLIC_API_PATHS`` allowlist (portal liveness probe target) and
|
||||
responds 200 without a cookie under both gates, so it cannot
|
||||
distinguish "gate on" from "gate off".
|
||||
"""
|
||||
subprocess.run(
|
||||
["docker", "run", "-d", "--name", container_name,
|
||||
"-e", "HERMES_DASHBOARD=1",
|
||||
"-e", "HERMES_DASHBOARD_HOST=0.0.0.0",
|
||||
"-e", "HERMES_DASHBOARD_OAUTH_CLIENT_ID=agent:test-instance",
|
||||
built_image, "sleep", "120"],
|
||||
check=True, capture_output=True, timeout=30,
|
||||
)
|
||||
|
||||
# (1) Provider registry visible via the public bootstrap endpoint.
|
||||
status_code, body = _http_probe(container_name, "/api/auth/providers")
|
||||
assert status_code == 200, (
|
||||
f"/api/auth/providers should return 200 when a provider is "
|
||||
f"registered; got {status_code} body={body!r}"
|
||||
)
|
||||
payload = json.loads(body)
|
||||
provider_names = [p.get("name") for p in payload.get("providers", [])]
|
||||
assert "nous" in provider_names, (
|
||||
"Bundled dashboard_auth/nous provider should register when "
|
||||
f"HERMES_DASHBOARD_OAUTH_CLIENT_ID is set. Got: {payload!r}"
|
||||
)
|
||||
|
||||
# (2) A gated route (``/api/sessions``) returns 401 to an
|
||||
# unauthenticated caller — the OAuth gate is intercepting.
|
||||
status_code, body = _http_probe(container_name, "/api/sessions")
|
||||
assert status_code == 401, (
|
||||
"OAuth gate must intercept gated /api/* routes on 0.0.0.0 bind "
|
||||
"when a provider is registered and HERMES_DASHBOARD_INSECURE "
|
||||
f"is unset. Got: status={status_code} body={body!r}"
|
||||
)
|
||||
|
||||
# (3) ``/api/status`` remains 200 under the gate — it's in the shared
|
||||
# ``PUBLIC_API_PATHS`` allowlist so NAS's wildcard-subdomain
|
||||
# liveness probe (``fly-provider.ts`` ``getInstanceRuntimeStatus``)
|
||||
# can reach it without a cookie. Regression guard: this allowlist
|
||||
# drifted once already and surfaced every healthy agent as
|
||||
# STARTING/down in the portal UI.
|
||||
status_code, body = _http_probe(container_name, "/api/status")
|
||||
assert status_code == 200, (
|
||||
"/api/status must remain publicly reachable under the OAuth gate "
|
||||
"— the portal uses it as the wildcard-subdomain liveness probe. "
|
||||
f"Got: status={status_code} body={body!r}"
|
||||
)
|
||||
status = json.loads(body)
|
||||
assert status.get("auth_required") is True, (
|
||||
"/api/status must report auth_required=True when the OAuth gate "
|
||||
f"is engaged so the SPA/portal can distinguish modes. Got: {status!r}"
|
||||
)
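The three checks above amount to a small decision table: allowlisted paths pass, everything else under `/api/` needs a session. The sketch below is an assumption-laden illustration of that table, not the dashboard's middleware. `gate_decision` is a hypothetical name, and the allowlist contents are only the paths these tests mention as members of `PUBLIC_API_PATHS`.

```python
from typing import Optional

# Assumption: illustrative allowlist built from the paths named in the tests.
PUBLIC_API_PATHS = frozenset({
    "/api/status",
    "/api/config/defaults",
    "/api/config/schema",
    "/api/model/info",
    "/api/dashboard/themes",
    "/api/dashboard/plugins",
})


def gate_decision(
    path: str,
    *,
    has_valid_session: bool,
    gate_enabled: bool = True,
) -> Optional[int]:
    """Return 401 when the gate rejects the request, else None (pass through).

    None means "let the route handler answer"; the route may still return
    its own error status.
    """
    if not gate_enabled:
        return None  # insecure opt-in: legacy no-gate mode
    if path in PUBLIC_API_PATHS:
        return None  # liveness probe and pre-login bootstrap stay public
    if has_valid_session:
        return None
    return 401
```

This is why `/api/status` cannot tell "gate on" from "gate off" by status code alone, and why the test probes `/api/sessions` for the 401.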
|
||||
|
||||
|
||||
def test_dashboard_insecure_env_var_opts_out_of_gate(
|
||||
built_image: str, container_name: str,
|
||||
) -> None:
|
||||
"""``HERMES_DASHBOARD_INSECURE=1`` re-enables the legacy no-gate mode
|
||||
for operators running on trusted LANs behind a reverse proxy without
|
||||
the OAuth contract. Same opt-out shape as the rest of the s6 boolean
|
||||
envs (``HERMES_DASHBOARD``, ``HERMES_DASHBOARD_TUI``).
|
||||
|
||||
With the gate off, ``/api/status`` (a public endpoint under the
|
||||
legacy ``_SESSION_TOKEN`` middleware) returns 200 with the
|
||||
``auth_required: false`` body — proves the gate is bypassed.
|
||||
"""
|
||||
subprocess.run(
|
||||
["docker", "run", "-d", "--name", container_name,
|
||||
"-e", "HERMES_DASHBOARD=1",
|
||||
"-e", "HERMES_DASHBOARD_HOST=0.0.0.0",
|
||||
"-e", "HERMES_DASHBOARD_INSECURE=1",
|
||||
built_image, "sleep", "120"],
|
||||
check=True, capture_output=True, timeout=30,
|
||||
)
|
||||
status_code, body = _http_probe(container_name, "/api/status")
|
||||
assert status_code == 200, (
|
||||
f"/api/status should return 200 with the auth gate disabled; "
|
||||
f"got {status_code} body={body!r}"
|
||||
)
|
||||
status = json.loads(body)
|
||||
assert status.get("auth_required") is False, (
|
||||
"HERMES_DASHBOARD_INSECURE=1 must disable the auth gate (explicit "
|
||||
f"opt-in for trusted-LAN deployments). Got: {status!r}"
|
||||
)
|
||||
|
||||
@@ -131,8 +131,13 @@ class TestRefreshTokenCookieDeprecation:
 
 
 class TestApi401Envelope:
+    # NOTE: probe a gated route (``/api/sessions``) here rather than
+    # ``/api/status`` — status is in the shared ``PUBLIC_API_PATHS``
+    # allowlist (portal liveness probe) so it would 200 even without a
+    # cookie and never exercise the 401-envelope code path.
+
     def test_no_cookie_returns_unauthenticated_envelope(self, gated_app):
-        r = gated_app.get("/api/status")
+        r = gated_app.get("/api/sessions")
         assert r.status_code == 401
         body = r.json()
         assert body["error"] == "unauthenticated"
@@ -141,7 +146,7 @@ class TestApi401Envelope:
 
     def test_invalid_cookie_returns_session_expired_envelope(self, gated_app):
         gated_app.cookies.set(SESSION_AT_COOKIE, "garbage")
-        r = gated_app.get("/api/status")
+        r = gated_app.get("/api/sessions")
         assert r.status_code == 401
         body = r.json()
         assert body["error"] == "session_expired"
@@ -151,7 +156,7 @@ class TestApi401Envelope:
         """Dead-cookie cleanup — Phase 6 requirement so the browser
         doesn't keep replaying the stale token on every request."""
         gated_app.cookies.set(SESSION_AT_COOKIE, "garbage")
-        r = gated_app.get("/api/status")
+        r = gated_app.get("/api/sessions")
         set_cookies = r.headers.get_list("set-cookie")
         assert any(
             c.startswith(f"{SESSION_AT_COOKIE}=") and "Max-Age=0" in c
@@ -56,10 +56,61 @@ def gated_app():
 # ---------------------------------------------------------------------------
 
 
-def test_gated_status_now_requires_auth(gated_app):
-    """When gate is on, /api/status is NOT public — login bootstrap uses /api/auth/providers."""
+def test_gated_status_is_public(gated_app):
+    """``/api/status`` MUST be public under the OAuth gate.
+
+    Regression guard for the wildcard-subdomain rollout: NAS
+    (``fly-provider.ts`` ``getInstanceRuntimeStatus``) hits
+    ``/api/status`` without a cookie as its sole liveness probe. A 401
+    here surfaces every healthy agent as STARTING/down in the portal
+    UI. The endpoint returns only version + gateway/auth-gate metadata
+    (no user data, no session content), so it stays in the shared
+    ``PUBLIC_API_PATHS`` allowlist under both the legacy ``_SESSION_TOKEN``
+    gate and the OAuth gate.
+
+    The body also reports the gate's shape (``auth_required``,
+    ``auth_providers``) so the SPA's StatusPage and external monitors
+    can distinguish loopback / gated / no-providers without a separate
+    round trip.
+    """
     r = gated_app.get("/api/status")
-    assert r.status_code == 401
+    assert r.status_code == 200, (
+        f"Expected 200, got {r.status_code}: {r.text}"
+    )
+    body = r.json()
+    assert body["auth_required"] is True
+    assert "version" in body
+    assert "gateway_state" in body
+
+
+@pytest.mark.parametrize("path", [
+    "/api/config/defaults",
+    "/api/config/schema",
+    "/api/model/info",
+    "/api/dashboard/themes",
+    "/api/dashboard/plugins",
+])
+def test_other_public_api_paths_are_public_under_gate(gated_app, path):
+    """The remaining ``PUBLIC_API_PATHS`` entries must also bypass the
+    gate. They're documented as non-sensitive read-only endpoints that
+    the SPA pre-loads before login (themes, config schema, model
+    metadata). A 401 / 302-to-login here would block the dashboard
+    shell from rendering pre-auth.
+
+    Accept any non-auth-failure status: 200 when the route succeeds,
+    or any route-specific error (e.g. 400 / 404 / 500 from a missing
+    dependency) — but NEVER 401, and NEVER a 302 to ``/login``.
+    """
+    r = gated_app.get(path, follow_redirects=False)
+    assert r.status_code != 401, (
+        f"{path} returned 401 under the OAuth gate — should be public"
+    )
+    if r.status_code == 302:
+        location = r.headers.get("location", "")
+        assert "/login" not in location, (
+            f"{path} redirected to {location} — should be public, "
+            "not bounced to /login"
+        )
 
 
 def test_gated_html_redirects_to_login(gated_app):
@@ -98,7 +149,7 @@ def test_gated_static_asset_path_is_public(gated_app):
 # ---------------------------------------------------------------------------
 
 
-def test_full_login_round_trip_unlocks_api_status(gated_app):
+def test_full_login_round_trip_unlocks_gated_api(gated_app):
     # 1) Click "Sign in with Stub IdP" — /auth/login redirects to the stub
     # with a PKCE cookie on the response.
     r1 = gated_app.get("/auth/login?provider=stub", follow_redirects=False)
@@ -128,11 +179,16 @@ def test_full_login_round_trip_unlocks_api_status(gated_app):
     assert any("hermes_session_at" in c for c in set_cookies)
     assert any("hermes_session_rt" in c for c in set_cookies)
 
-    # 3) /api/status now succeeds because we're authenticated.
-    r3 = gated_app.get("/api/status")
-    assert r3.status_code == 200
-    body = r3.json()
-    assert "version" in body
+    # 3) A gated API route (``/api/sessions``) now succeeds because we
+    # have a valid session cookie. (We deliberately don't probe
+    # ``/api/status`` here — it's in the shared PUBLIC_API_PATHS
+    # allowlist and would 200 even without a login, so it can't
+    # distinguish "logged in" from "gate accidentally disabled".)
+    r3 = gated_app.get("/api/sessions")
+    assert r3.status_code == 200, (
+        f"Expected 200 for /api/sessions post-login, got {r3.status_code}: "
+        f"{r3.text}"
+    )
 
 
 def test_login_unknown_provider_returns_404(gated_app):
@@ -59,19 +59,11 @@ def loopback_client():
         web_server.app.state.auth_required = prev_required
 
 
-def _login(client: TestClient) -> None:
-    """Drive the stub OAuth round trip so the gated client is authed."""
-    r1 = client.get("/auth/login?provider=stub", follow_redirects=False)
-    assert r1.status_code == 302
-    state = r1.headers["location"].split("state=")[1]
-    r2 = client.get(
-        f"/auth/callback?code=stub_code&state={state}", follow_redirects=False
-    )
-    assert r2.status_code == 302
-
-
 def test_status_reports_auth_required_in_gated_mode(gated_client):
-    _login(gated_client)
+    # No ``_login()`` call — ``/api/status`` is in the shared
+    # ``PUBLIC_API_PATHS`` allowlist precisely so external probes (and
+    # the SPA's pre-login bootstrap) can read the gate's shape without
+    # a cookie. Hit it cold.
     r = gated_client.get("/api/status")
     assert r.status_code == 200
     body = r.json()
@@ -0,0 +1,238 @@
|
||||
"""Worker-side image enrichment for kanban tasks.
|
||||
|
||||
When a kanban task body contains a local image path or an ``http(s)://``
|
||||
image URL, the worker must surface that image to the model on its first
|
||||
user turn — matching the CLI/gateway behaviour for inbound images.
|
||||
|
||||
The dispatcher spawns the worker as
|
||||
``hermes -p <profile> chat -q "work kanban task <id>"``. The task body
|
||||
itself never appears in argv; the worker has to read it from the kanban
|
||||
DB during startup. These tests cover the round-trip:
|
||||
|
||||
task body → kanban_db.get_task → extract_image_refs →
|
||||
build_native_content_parts → multimodal user turn
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import base64
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
from hermes_cli import kanban_db as kb
|
||||
from agent.image_routing import (
|
||||
build_native_content_parts,
|
||||
extract_image_refs,
|
||||
)
|
||||
|
||||
|
||||
# Tiny 1×1 transparent PNG used to back any path the tests stick into a
|
||||
# task body. extract_image_refs validates the path exists on disk, so the
|
||||
# byte content has to be a real readable file (any image bytes will do).
|
||||
_PNG = base64.b64decode(
|
||||
"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR4nGNgYGBgAAAABQABpfZFQAAAAABJRU5ErkJggg=="
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def kanban_home(tmp_path: Path, monkeypatch):
|
||||
"""Isolated HERMES_HOME with a fresh kanban DB for each test."""
|
||||
home = tmp_path / ".hermes"
|
||||
home.mkdir()
|
||||
monkeypatch.setenv("HERMES_HOME", str(home))
|
||||
monkeypatch.setattr(Path, "home", lambda: tmp_path)
|
||||
kb.init_db()
|
||||
return home
|
||||
|
||||
|
||||
def _add_task_with_body(body: str, *, title: str = "Look at this") -> str:
|
||||
conn = kb.connect()
|
||||
try:
|
||||
task_id = kb.create_task(
|
||||
conn,
|
||||
title=title,
|
||||
body=body,
|
||||
assignee="worker-a",
|
||||
tenant=None,
|
||||
)
|
||||
finally:
|
||||
conn.close()
|
||||
return task_id
|
||||
|
||||
|
||||
def _read_body(task_id: str) -> str:
|
||||
conn = kb.connect()
|
||||
try:
|
||||
task = kb.get_task(conn, task_id)
|
||||
return (task.body if task is not None else "") or ""
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
class TestExtractFromTaskBody:
|
||||
"""Read a real kanban task body and run it through extract_image_refs."""
|
||||
|
||||
def test_local_path_in_body_round_trips(self, kanban_home, tmp_path):
|
||||
img = tmp_path / "screenshot.png"
|
||||
img.write_bytes(_PNG)
|
||||
tid = _add_task_with_body(
|
||||
f"Please review the screenshot at {img} and confirm "
|
||||
"the alignment is right."
|
||||
)
|
||||
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == []
|
||||
|
||||
def test_url_in_body_round_trips(self, kanban_home):
|
||||
tid = _add_task_with_body(
|
||||
"The design lives at https://example.com/mock/v3.png — "
|
||||
"make the implementation match it."
|
||||
)
|
||||
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == []
|
||||
assert urls == ["https://example.com/mock/v3.png"]
|
||||
|
||||
def test_mixed_path_and_url_in_body(self, kanban_home, tmp_path):
|
||||
img = tmp_path / "current.png"
|
||||
img.write_bytes(_PNG)
|
||||
tid = _add_task_with_body(
|
||||
f"Compare the current screenshot {img} against the design at "
|
||||
"https://example.com/target.png and write a diff."
|
||||
)
|
||||
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == [str(img)]
|
||||
assert urls == ["https://example.com/target.png"]
|
||||
|
||||
def test_body_without_images_yields_nothing(self, kanban_home):
|
||||
tid = _add_task_with_body(
|
||||
"Refactor the auth module to use the new session helper."
|
||||
)
|
||||
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == []
|
||||
assert urls == []
|
||||
|
||||
def test_empty_body_is_safe(self, kanban_home):
|
||||
tid = _add_task_with_body("")
|
||||
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
assert paths == []
|
||||
assert urls == []
|
||||
|
||||
|
||||
class TestBuildPartsFromTaskBody:
|
||||
"""Verify the full pipeline produces a multimodal user turn."""
|
||||
|
||||
def test_local_path_becomes_native_image_part(self, kanban_home, tmp_path):
|
||||
img = tmp_path / "design.png"
|
||||
img.write_bytes(_PNG)
|
||||
tid = _add_task_with_body(f"Check out {img} — what's broken?")
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
|
||||
# Mirrors the cli.py wiring: pass the worker's literal -q argument
|
||||
# (the dispatcher uses ``"work kanban task <id>"``) plus the
|
||||
# extracted refs through build_native_content_parts.
|
||||
parts, skipped = build_native_content_parts(
|
||||
f"work kanban task {tid}",
|
||||
paths,
|
||||
image_urls=urls or None,
|
||||
)
|
||||
|
||||
assert skipped == []
|
||||
# text part + one image_url part
|
||||
assert len(parts) == 2
|
||||
assert parts[0]["type"] == "text"
|
||||
assert parts[0]["text"].startswith(f"work kanban task {tid}")
|
||||
assert f"[Image attached at: {img}]" in parts[0]["text"]
|
||||
assert parts[1]["type"] == "image_url"
|
||||
assert parts[1]["image_url"]["url"].startswith("data:image/png;base64,")
|
||||
|
||||
def test_url_becomes_image_url_part(self, kanban_home):
|
||||
tid = _add_task_with_body(
|
||||
"Reference: https://example.com/target.jpg — match it."
|
||||
)
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
|
||||
parts, skipped = build_native_content_parts(
|
||||
f"work kanban task {tid}",
|
||||
paths,
|
||||
image_urls=urls or None,
|
||||
)
|
||||
|
||||
assert skipped == []
|
||||
assert len(parts) == 2
|
||||
assert parts[0]["type"] == "text"
|
||||
assert "[Image attached: https://example.com/target.jpg]" in parts[0]["text"]
|
||||
assert parts[1] == {
|
||||
"type": "image_url",
|
||||
"image_url": {"url": "https://example.com/target.jpg"},
|
||||
}
|
||||
|
||||
def test_body_with_both_yields_two_image_parts(self, kanban_home, tmp_path):
|
||||
img = tmp_path / "local.png"
|
||||
img.write_bytes(_PNG)
|
||||
tid = _add_task_with_body(
|
||||
f"Diff {img} vs https://example.com/target.png — explain it."
|
||||
)
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
|
||||
parts, skipped = build_native_content_parts(
|
||||
f"work kanban task {tid}",
|
||||
paths,
|
||||
image_urls=urls or None,
|
||||
)
|
||||
|
||||
assert skipped == []
|
||||
image_parts = [p for p in parts if p.get("type") == "image_url"]
|
||||
assert len(image_parts) == 2
|
||||
# Local file is embedded as a data URL; remote URL passes through.
|
||||
assert image_parts[0]["image_url"]["url"].startswith("data:image/png;base64,")
|
||||
assert image_parts[1]["image_url"]["url"] == "https://example.com/target.png"
|
||||
|
||||
def test_body_with_no_images_leaves_query_untouched(self, kanban_home):
|
||||
tid = _add_task_with_body(
|
||||
"Rewrite the README intro paragraph to focus on use cases."
|
||||
)
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
|
||||
parts, skipped = build_native_content_parts(
|
||||
f"work kanban task {tid}",
|
||||
paths,
|
||||
image_urls=urls or None,
|
||||
)
|
||||
|
||||
# No images → plain text-only return (single part, no list mutation).
|
||||
assert skipped == []
|
||||
assert len(parts) == 1
|
||||
assert parts[0]["type"] == "text"
|
||||
assert parts[0]["text"] == f"work kanban task {tid}"
|
||||
|
||||
def test_code_block_example_is_not_attached(self, kanban_home, tmp_path):
|
||||
# Only the real image outside the fenced code block should attach.
|
||||
real = tmp_path / "real.png"
|
||||
real.write_bytes(_PNG)
|
||||
tid = _add_task_with_body(
|
||||
f"Real screenshot:\n{real}\n\n"
|
||||
"Example we DON'T want attached:\n"
|
||||
"```\n"
|
||||
"image: /tmp/example_only.png\n"
|
||||
"url: https://example.com/example.png\n"
|
||||
"```\n"
|
||||
)
|
||||
body = _read_body(tid)
|
||||
paths, urls = extract_image_refs(body)
|
||||
|
||||
assert paths == [str(real)]
|
||||
assert urls == []
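The last test expects references inside a fenced code block to be ignored. One way to get that behaviour is to drop fenced spans before scanning for image URLs. The sketch below illustrates that idea only; it is not `agent.image_routing.extract_image_refs`, whose implementation is not shown in this diff. `strip_fenced_blocks`, `find_image_urls`, and the extension list are hypothetical.

```python
import re
from typing import List

# Build the fence marker programmatically to keep this snippet fence-safe.
FENCE = "`" * 3

# Non-greedy match from an opening fence to the next closing fence.
_FENCE_RE = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)

# Assumption: a small, illustrative set of image extensions.
_URL_RE = re.compile(r"https?://\S+\.(?:png|jpe?g|gif|webp)\b", re.IGNORECASE)


def strip_fenced_blocks(text: str) -> str:
    """Remove fenced code blocks so examples inside them are not scanned."""
    return _FENCE_RE.sub("", text)


def find_image_urls(text: str) -> List[str]:
    """Return image URLs that appear outside fenced code blocks."""
    return _URL_RE.findall(strip_fenced_blocks(text))
```

Stripping first and scanning second keeps the URL pattern simple, at the cost of treating an unterminated fence as ordinary text.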
|
||||
@@ -197,10 +197,32 @@ class TestConfig:
|
||||
assert provider._recall_max_input_chars == 800
|
||||
assert provider._tags is None
|
||||
assert provider._recall_tags is None
|
||||
# Default recall narrowed to observation-only; world/experience are
|
||||
# aggregate facts that often crowd out concrete-event signal during
|
||||
# auto-recall. Users opt back in via the recall_types config key.
|
||||
assert provider._recall_types == ["observation"]
|
||||
assert provider._bank_mission == ""
|
||||
assert provider._bank_retain_mission is None
|
||||
assert provider._retain_context == "conversation between Hermes Agent and the User"
|
||||
|
||||
def test_recall_types_default_is_observation_only(self, provider):
|
||||
"""Auto-recall must filter to observation by default."""
|
||||
assert provider._recall_types == ["observation"]
|
||||
|
||||
def test_recall_types_explicit_list_overrides_default(self, provider_with_config):
|
||||
p = provider_with_config(recall_types=["world", "experience", "observation"])
|
||||
assert p._recall_types == ["world", "experience", "observation"]
|
||||
|
||||
def test_recall_types_csv_string_accepted(self, provider_with_config):
|
||||
"""For parity with recall_tags, comma-separated strings work too."""
|
||||
p = provider_with_config(recall_types="observation, world")
|
||||
assert p._recall_types == ["observation", "world"]
|
||||
|
||||
def test_recall_types_empty_list_falls_back_to_default(self, provider_with_config):
|
||||
"""An empty list shouldn't disable the filter (would be wider than default)."""
|
||||
p = provider_with_config(recall_types=[])
|
||||
assert p._recall_types == ["observation"]
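Taken together, the four `recall_types` tests describe a normalization rule: accept a list or a comma-separated string, trim whitespace, and fall back to the default when nothing usable remains. A minimal sketch of that rule follows; `normalize_recall_types` and `DEFAULT_RECALL_TYPES` are hypothetical names, and the provider's real parsing code is not part of this diff.

```python
from typing import List, Sequence, Union

DEFAULT_RECALL_TYPES = ["observation"]


def normalize_recall_types(value: Union[None, str, Sequence[str]]) -> List[str]:
    """Coerce a config value into a non-empty list of recall types."""
    if value is None:
        return list(DEFAULT_RECALL_TYPES)
    # A plain string is treated as CSV, mirroring the recall_tags parity test.
    items = value.split(",") if isinstance(value, str) else list(value)
    cleaned = [str(item).strip() for item in items if str(item).strip()]
    # An empty result must not widen the filter, so fall back to the default.
    return cleaned or list(DEFAULT_RECALL_TYPES)
```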
|
||||
|
||||
def test_custom_config_values(self, provider_with_config):
|
||||
p = provider_with_config(
|
||||
retain_tags=["tag1", "tag2"],
|
||||
|
||||
@@ -91,6 +91,45 @@ class TestSyncExternalMemoryForTurn:
|
||||
session_id="test_session_001",
|
||||
)
|
||||
|
||||
def test_completed_turn_syncs_messages_when_present(self):
|
||||
agent = _bare_agent()
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": None,
|
||||
"tool_calls": [
|
||||
{
|
||||
"id": "call-1",
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "terminal",
|
||||
"arguments": "{\"command\":\"pytest\"}",
|
||||
},
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
"role": "tool",
|
||||
"name": "terminal",
|
||||
"tool_call_id": "call-1",
|
||||
"content": "final Hermes-processed output",
|
||||
}
|
||||
]
|
||||
|
||||
agent._sync_external_memory_for_turn(
|
||||
original_user_message="run tests",
|
||||
final_response="tests passed",
|
||||
interrupted=False,
|
||||
messages=messages,
|
||||
)
|
||||
|
||||
agent._memory_manager.sync_all.assert_called_once_with(
|
||||
"run tests",
|
||||
"tests passed",
|
||||
session_id="test_session_001",
|
||||
messages=messages,
|
||||
)
|
||||
|
||||
# --- Edge cases (pre-existing behaviour preserved) ------------------
|
||||
|
||||
def test_no_final_response_skips(self):
|
||||
|
||||
@@ -13,3 +13,36 @@ def test_dashboard_run_resets_home_before_dropping_privileges() -> None:
|
||||
assert "#!/command/with-contenv sh" in text
|
||||
assert "export HOME=/opt/data" in text
|
||||
assert "exec s6-setuidgid hermes hermes dashboard" in text
|
||||
|
||||
|
||||
def test_dashboard_run_does_not_derive_insecure_from_bind_host() -> None:
|
||||
"""The s6 dashboard run script MUST NOT auto-add ``--insecure`` based on
|
||||
``HERMES_DASHBOARD_HOST``. Doing so disables the OAuth auth gate on
|
||||
every non-loopback bind even when an auth provider is registered —
|
||||
the exact regression that exposed every wildcard-subdomain agent
|
||||
dashboard publicly until early 2026.
|
||||
|
||||
The opt-in is now explicit: ``HERMES_DASHBOARD_INSECURE=1`` (truthy).
|
||||
The auth gate is the authority on whether non-loopback binds are safe.
|
||||
"""
|
||||
text = DASHBOARD_RUN.read_text(encoding="utf-8")
|
||||
|
||||
# No legacy host-derived flip.
|
||||
assert '127.0.0.1|localhost' not in text, (
|
||||
"Run script still derives --insecure from the bind host. The gate "
|
||||
"is the authority now — opt in via HERMES_DASHBOARD_INSECURE instead."
|
||||
)
|
||||
assert 'case "$dash_host" in' not in text, (
|
||||
"Legacy host-derived --insecure case-statement is back."
|
||||
)
|
||||
|
||||
# New opt-in env var present.
|
||||
assert "HERMES_DASHBOARD_INSECURE" in text, (
|
||||
"Explicit HERMES_DASHBOARD_INSECURE opt-in is missing."
|
||||
)
|
||||
# Truthy values aligned with the rest of the s6 scripts
|
||||
# (HERMES_DASHBOARD, HERMES_DASHBOARD_TUI).
|
||||
for truthy in ("1", "true", "TRUE", "True", "yes", "YES", "Yes"):
|
||||
assert truthy in text, (
|
||||
f"HERMES_DASHBOARD_INSECURE should accept truthy value {truthy!r}"
|
||||
)
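The static-text guard above encodes a simple contract: `--insecure` is added only when `HERMES_DASHBOARD_INSECURE` holds one of the listed truthy spellings, never because of the bind host. The real run script is shell and is not shown here. The Python mirror below is only an illustration of the contract; `env_flag` and `dashboard_args` are hypothetical helpers.

```python
import os
from typing import List, Mapping, Optional

# The truthy spellings the static test looks for in the run script.
TRUTHY = frozenset({"1", "true", "TRUE", "True", "yes", "YES", "Yes"})


def env_flag(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only for an explicit truthy value; unset means False."""
    env = os.environ if environ is None else environ
    return env.get(name, "") in TRUTHY


def dashboard_args(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Build the dashboard argv; the bind host never influences --insecure."""
    args = ["hermes", "dashboard"]
    if env_flag("HERMES_DASHBOARD_INSECURE", environ):
        args.append("--insecure")
    return args
```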
|
||||
|
||||
@@ -2679,64 +2679,6 @@ class TestVacuum:
         db.vacuum()
 
 
-class TestOptimizeFts:
-    def test_optimize_returns_index_count(self, db):
-        """A fresh DB has both FTS indexes; optimize merges both."""
-        db.create_session(session_id="s1", source="cli")
-        db.append_message(session_id="s1", role="user", content="hello world")
-        assert db.optimize_fts() == 2
-
-    def test_optimize_preserves_search_and_snippet(self, db):
-        """Optimize is layout-only: MATCH results + snippets are unchanged."""
-        db.create_session(session_id="s1", source="cli")
-        for i in range(50):
-            db.append_message(
-                session_id="s1",
-                role="user",
-                content=f"needle alpha bravo charlie message {i}",
-            )
-        before = db.search_messages("needle")
-        n = db.optimize_fts()
-        assert n == 2
-        after = db.search_messages("needle")
-        assert len(after) == len(before)
-        assert len(after) > 0
-        # Snippet must still be populated (would be empty/None if the FTS
-        # content shadow were lost during optimize).
-        assert all(row.get("snippet") for row in after)
-        # IDs and snippets are identical before/after — pure layout change.
-        assert [r["id"] for r in after] == [r["id"] for r in before]
-        assert [r["snippet"] for r in after] == [r["snippet"] for r in before]
-
-    def test_optimize_skips_missing_trigram_table(self, db):
-        """When the trigram index is absent, optimize handles only the porter
-        index and does not raise."""
-        db.create_session(session_id="s1", source="cli")
-        db.append_message(session_id="s1", role="user", content="hello")
-        # Drop the trigram table + triggers to simulate a disabled/absent index.
-        with db._lock:
-            for trig in (
-                "messages_fts_trigram_insert",
-                "messages_fts_trigram_delete",
-                "messages_fts_trigram_update",
-            ):
-                db._conn.execute(f"DROP TRIGGER IF EXISTS {trig}")
-            db._conn.execute("DROP TABLE IF EXISTS messages_fts_trigram")
-        assert db._fts_table_exists("messages_fts_trigram") is False
-        assert db._fts_table_exists("messages_fts") is True
-        # Only the porter index remains -> 1 optimized, no error.
-        assert db.optimize_fts() == 1
-
-    def test_optimize_idempotent(self, db):
-        """Running optimize twice is safe (second pass is a no-op merge)."""
-        db.create_session(session_id="s1", source="cli")
-        db.append_message(session_id="s1", role="user", content="repeat me")
-        assert db.optimize_fts() == 2
-        assert db.optimize_fts() == 2
-        # Search still works after repeated optimization.
-        assert len(db.search_messages("repeat")) == 1
-
-
 class TestAutoMaintenance:
     def _make_old_ended(self, db, sid: str, days_old: int = 100):
         """Create a session that is ended and was started `days_old` days ago."""
@@ -203,25 +203,43 @@ def test_auto_mount_replaces_persistent_workspace_bind(monkeypatch, tmp_path):
 
 
 def test_non_persistent_cleanup_removes_container(monkeypatch):
-    """When persistent=false, cleanup() must schedule docker stop + rm."""
+    """When persist_across_processes=false, cleanup() must docker stop AND
+    docker rm so containers don't leak across hermes processes.
+
+    Updated for issue #20561: the previous implementation used fire-and-forget
+    ``subprocess.Popen("... &", shell=True)`` which raced with parent exit;
+    the new implementation uses ``subprocess.run`` on a daemon thread with
+    bounded timeouts. See test_cleanup_with_persist_disabled_stops_and_rms
+    for the full behavior contract.
+    """
     monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
-    calls = _mock_subprocess_run(monkeypatch)
+    monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
+    _mock_subprocess_run(monkeypatch)
+    # Run the worker thread synchronously so assertions can observe its work.
+    import threading
+    monkeypatch.setattr(threading, "Thread", _FakeThread)
 
-    popen_cmds = []
-    monkeypatch.setattr(
-        docker_env.subprocess, "Popen",
-        lambda cmd, **kw: (popen_cmds.append(cmd), type("P", (), {"poll": lambda s: 0, "wait": lambda s, **k: None, "returncode": 0, "stdout": iter([]), "stdin": None})())[1],
+    env = docker_env.DockerEnvironment(
+        image="python:3.11", cwd="/root", timeout=60,
+        task_id="ephemeral-task", persistent_filesystem=False,
+        persist_across_processes=False,
     )
-
-    env = _make_dummy_env(persistent_filesystem=False, task_id="ephemeral-task")
-    assert env._container_id
     container_id = env._container_id
+    assert container_id
 
+    # Capture cleanup-time docker calls (everything before this was init).
+    cleanup_calls = []
+    real_run = docker_env.subprocess.run
+
+    def _capture(cmd, **kw):
+        cleanup_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kw))
+        return real_run(cmd, **kw)
+
+    monkeypatch.setattr(docker_env.subprocess, "run", _capture)
     env.cleanup()
 
-    # Should have stop and rm calls via Popen
-    stop_cmds = [c for c in popen_cmds if container_id in str(c) and "stop" in str(c)]
-    assert len(stop_cmds) >= 1, f"cleanup() should schedule docker stop for {container_id}"
+    stops = [c for c in cleanup_calls if isinstance(c[0], list) and c[0][1:2] == ["stop"]]
+    assert stops, f"cleanup() should docker stop {container_id}; got {cleanup_calls}"
 
 
 class _FakePopen:
|
||||
@@ -514,3 +532,839 @@ def test_run_as_host_user_warns_and_skips_when_no_posix_ids(monkeypatch, caplog)
|
||||
"does not expose POSIX uid/gid" in rec.getMessage()
|
||||
for rec in caplog.records
|
||||
), "expected a warning when POSIX ids are unavailable"
|
||||
|
||||
|
||||
# ── Docker labels (issue #20561) ──────────────────────────────────
|
||||
|
||||
|
||||
def _run_args_from_calls(calls):
|
||||
"""Pull the argv list passed to the first ``docker run`` invocation."""
|
||||
run_calls = [
|
||||
c for c in calls
|
||||
if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "run"
|
||||
]
|
||||
assert run_calls, "docker run should have been called"
|
||||
return run_calls[0][0]
|
||||
|
||||
|
||||
def _labels_in_run_args(run_args):
|
||||
"""Return the set of ``key=value`` strings passed via ``--label``."""
|
||||
return {
|
||||
run_args[i + 1]
|
||||
for i, flag in enumerate(run_args[:-1])
|
||||
if flag == "--label"
|
||||
}
|
||||
|
||||
|
||||
def test_run_command_tags_hermes_agent_label(monkeypatch):
|
||||
"""Every container hermes-agent starts must carry the hermes-agent=1 label
|
||||
so the orphan reaper (and external operators) can identify them with a
|
||||
single ``docker ps --filter label=hermes-agent=1`` call. Regression test
|
||||
for issue #20561 — without the label there is no global sweep target."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
calls = _mock_subprocess_run(monkeypatch)
|
||||
|
||||
_make_dummy_env(task_id="my-task")
|
||||
|
||||
labels = _labels_in_run_args(_run_args_from_calls(calls))
|
||||
assert "hermes-agent=1" in labels, (
|
||||
f"hermes-agent=1 label missing; got labels: {sorted(labels)}"
|
||||
)
|
||||
|
||||
|
||||
def test_run_command_tags_task_and_profile_labels(monkeypatch):
|
||||
"""task_id and the active profile name are surfaced as labels so future
|
||||
cross-process reuse logic can filter to a specific (task, profile) pair
|
||||
without parsing container names. Profile resolution uses the helper that
|
||||
returns ``"default"`` for the root Hermes home."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "research-bot")
|
||||
calls = _mock_subprocess_run(monkeypatch)
|
||||
|
||||
_make_dummy_env(task_id="kanban-42")
|
||||
|
||||
labels = _labels_in_run_args(_run_args_from_calls(calls))
|
||||
assert "hermes-task-id=kanban-42" in labels, (
|
||||
f"hermes-task-id=kanban-42 missing; got: {sorted(labels)}"
|
||||
)
|
||||
assert "hermes-profile=research-bot" in labels, (
|
||||
f"hermes-profile=research-bot missing; got: {sorted(labels)}"
|
||||
)
|
||||
|
||||
|
||||
def test_label_sanitizer_rejects_invalid_characters():
|
||||
"""Docker label values must be alnum + ``_.-`` and ≤63 chars. Profile or
|
||||
task names containing slashes, colons, or unicode would otherwise emit
|
||||
invalid labels that round-trip badly through ``docker ps --filter``."""
|
||||
assert docker_env._sanitize_label_value("plain-name_1.0") == "plain-name_1.0"
|
||||
assert docker_env._sanitize_label_value("with/slash") == "with_slash"
|
||||
assert docker_env._sanitize_label_value("with:colon") == "with_colon"
|
||||
assert docker_env._sanitize_label_value("emoji-😀-here") == "emoji-_-here"
|
||||
# Empty / non-string inputs must collapse to a queryable token, not "".
|
||||
assert docker_env._sanitize_label_value("") == "unknown"
|
||||
assert docker_env._sanitize_label_value(None) == "unknown" # type: ignore[arg-type]
|
||||
# >63 chars must truncate, not error.
|
||||
long_value = "x" * 100
|
||||
assert len(docker_env._sanitize_label_value(long_value)) == 63
|
||||
|
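The assertions above pin down the sanitizer's contract without showing the function itself. Below is a minimal sketch that would satisfy them, assuming a regex-based character replacement. `sanitize_label_value` and `MAX_LABEL_LEN` are hypothetical names used for illustration, not the actual `docker_env._sanitize_label_value` implementation.

```python
import re

# Hypothetical constant: the 63-character cap the test above asserts.
MAX_LABEL_LEN = 63

# Anything outside alnum plus "_", "." and "-" is treated as unsafe.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_label_value(value, max_len=MAX_LABEL_LEN):
    """Return a label-safe version of ``value`` (illustrative sketch only)."""
    # Empty or non-string input collapses to a queryable placeholder.
    if not isinstance(value, str) or not value:
        return "unknown"
    # Replace each unsafe character with an underscore, then truncate.
    return _UNSAFE_CHARS.sub("_", value)[:max_len]
```

Replacing each unsafe character one-for-one (rather than collapsing runs) keeps the output length predictable, which is what the `task_with_weird_chars` expectation in the next test relies on.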
||||
|
||||
def test_run_command_sanitizes_unsafe_task_id(monkeypatch):
|
||||
"""A task_id containing characters Docker rejects in label values must be
|
||||
sanitized before reaching ``docker run --label``; otherwise the daemon
|
||||
refuses the run with an inscrutable error and the agent's first command
|
||||
blows up."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
calls = _mock_subprocess_run(monkeypatch)
|
||||
|
||||
_make_dummy_env(task_id="task/with:weird*chars")
|
||||
|
||||
labels = _labels_in_run_args(_run_args_from_calls(calls))
|
||||
# Each non-OK character becomes an underscore; the safe chars survive.
|
||||
assert "hermes-task-id=task_with_weird_chars" in labels, (
|
||||
f"sanitized task-id label missing; got: {sorted(labels)}"
|
||||
)
|
||||
|
||||
|
||||
def test_labels_attribute_populated_after_init(monkeypatch):
|
||||
"""``self._labels`` must be set to the same key/value pairs that went onto
|
||||
docker run, so subsequent reuse / reaper paths can match without re-running
|
||||
the sanitizer or re-importing the profile module."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
|
||||
env = _make_dummy_env(task_id="abc")
|
||||
|
||||
assert env._labels == {
|
||||
"hermes-agent": "1",
|
||||
"hermes-task-id": "abc",
|
||||
"hermes-profile": "default",
|
||||
}
|
||||
|
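For readers following the label tests, here is one way a label dict like `env._labels` could be turned into `docker run --label` arguments and matching `docker ps --filter` arguments. This is a sketch under assumptions: `label_run_args` and `label_filter_args` are hypothetical helpers, and the real module may build its argv differently.

```python
def label_run_args(labels):
    """Expand a label dict into ``--label key=value`` pairs for docker run."""
    args = []
    for key, value in labels.items():
        args += ["--label", f"{key}={value}"]
    return args


def label_filter_args(labels):
    """Expand the same dict into ``--filter label=key=value`` pairs for docker ps."""
    args = []
    for key, value in labels.items():
        args += ["--filter", f"label={key}={value}"]
    return args
```

Keeping one dict as the source for both argv lists is the point of the test above: the probe and the run command cannot drift apart if they are derived from the same values.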
||||
|
||||
# ── Cross-process container reuse (issue #20561) ──────────────────
|
||||
|
||||
|
||||
def _mock_subprocess_run_with_reuse(monkeypatch, ps_state: str | None,
|
||||
start_succeeds: bool = True):
|
||||
"""Reuse-aware subprocess.run mock.
|
||||
|
||||
``ps_state`` controls what ``docker ps -a --filter ...`` returns:
|
||||
* ``None`` → no match (empty stdout). Forces a fresh ``docker run``.
|
||||
* ``"running"`` / ``"exited"`` / ... → emit ``CID\\tSTATE`` so the reuse
|
||||
path picks it up. ``"running"`` skips ``docker start``; other states
|
||||
trigger ``docker start`` (which can be forced to fail via
|
||||
``start_succeeds=False``).
|
||||
|
||||
Returns the captured call list so the test can verify which docker
|
||||
commands actually ran.
|
||||
"""
|
||||
calls = []
|
||||
|
||||
def _run(cmd, **kwargs):
|
||||
calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
if isinstance(cmd, list) and len(cmd) >= 2:
|
||||
sub = cmd[1]
|
||||
if sub == "version":
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="Docker version", stderr="")
|
||||
if sub == "ps":
|
||||
if ps_state is None:
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
|
||||
return subprocess.CompletedProcess(
|
||||
cmd, 0, stdout=f"reused-cid\t{ps_state}\n", stderr="",
|
||||
)
|
||||
if sub == "start":
|
||||
if not start_succeeds:
|
||||
# Real subprocess.run with check=True raises on non-zero exit;
|
||||
# mirror that so the production code's except clause fires.
|
||||
raise subprocess.CalledProcessError(1, cmd, output="", stderr="no such container")
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="reused-cid\n", stderr="")
|
||||
if sub == "run":
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="fresh-cid\n", stderr="")
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _run)
|
||||
return calls
|
||||
|
||||
|
||||
def test_reuse_attaches_to_running_container_without_docker_run(monkeypatch):
|
||||
"""When a labeled container is already ``running``, the reuse probe
|
||||
must pick it up and skip ``docker run`` entirely. Regression for the
|
||||
issue #20561 root cause: every Hermes process spawning a new container
|
||||
despite docs claiming "ONE long-lived container shared across sessions"."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
calls = _mock_subprocess_run_with_reuse(monkeypatch, ps_state="running")
|
||||
|
||||
env = _make_dummy_env(task_id="reuse-test")
|
||||
|
||||
# The reuse path must populate _container_id from the ps probe output.
|
||||
assert env._container_id == "reused-cid", (
|
||||
f"expected reused container id, got {env._container_id!r}"
|
||||
)
|
||||
# And it must NOT have run `docker run`.
|
||||
run_invocations = [c for c in calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "run"]
|
||||
assert not run_invocations, (
|
||||
f"docker run should be skipped on reuse, got: {run_invocations}"
|
||||
)
|
||||
# And it must NOT have issued a `docker start` for an already-running container.
|
||||
start_invocations = [c for c in calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "start"]
|
||||
assert not start_invocations, (
|
||||
f"docker start should be skipped when container already running, got: {start_invocations}"
|
||||
)
|
||||
|
||||
|
||||
def test_reuse_starts_stopped_container_before_attaching(monkeypatch):
|
||||
"""A labeled container in ``exited`` state must be restarted via
|
||||
``docker start`` before the new Hermes process uses it. Without this
|
||||
step, ``docker exec`` against a stopped container errors out and the
|
||||
first agent command fails opaquely."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
calls = _mock_subprocess_run_with_reuse(monkeypatch, ps_state="exited")
|
||||
|
||||
env = _make_dummy_env(task_id="reuse-stopped")
|
||||
|
||||
assert env._container_id == "reused-cid"
|
||||
start_invocations = [c for c in calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "start"]
|
||||
assert start_invocations, "expected docker start for exited container"
|
||||
run_invocations = [c for c in calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "run"]
|
||||
assert not run_invocations, "should not docker run when reusing an exited container"
|
||||
|
||||
|
||||
def test_reuse_falls_back_to_fresh_run_when_start_fails(monkeypatch):
|
||||
"""If ``docker start`` on the matched container fails (container was
|
||||
removed between probe and start, daemon paused, etc.), the code must
|
||||
silently fall through to a fresh ``docker run`` rather than leaving the
|
||||
user with a broken environment. Defensive recovery — the probe is best-
|
||||
effort, not authoritative."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
calls = _mock_subprocess_run_with_reuse(
|
||||
monkeypatch, ps_state="exited", start_succeeds=False,
|
||||
)
|
||||
|
||||
env = _make_dummy_env(task_id="reuse-broken-start")
|
||||
|
||||
# docker start should be attempted then fail; code falls through to run.
|
||||
assert env._container_id == "fresh-cid", (
|
||||
f"expected fresh container id after fallback, got {env._container_id!r}"
|
||||
)
|
||||
run_invocations = [c for c in calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "run"]
|
||||
assert run_invocations, "fallback to fresh docker run must happen on start failure"
|
||||
|
||||
|
||||
def test_no_reuse_when_persist_across_processes_disabled(monkeypatch):
|
||||
"""Opt-out path: ``persist_across_processes=False`` skips the ps probe
|
||||
entirely and always starts a fresh container, matching the pre-fix
|
||||
behavior for users who want hard per-process isolation."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
# ps_state=running would trigger reuse if the probe ran — assert it doesn't.
|
||||
calls = _mock_subprocess_run_with_reuse(monkeypatch, ps_state="running")
|
||||
|
||||
env = docker_env.DockerEnvironment(
|
||||
image="python:3.11", cwd="/root", timeout=60,
|
||||
task_id="no-reuse", persist_across_processes=False,
|
||||
)
|
||||
|
||||
# Must NOT have issued docker ps (the probe is gated by the flag).
|
||||
ps_invocations = [c for c in calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "ps"]
|
||||
assert not ps_invocations, (
|
||||
f"docker ps probe should be skipped when persist_across_processes=False, got: {ps_invocations}"
|
||||
)
|
||||
# Should have started a fresh container.
|
||||
assert env._container_id == "fresh-cid"
|
||||
|
||||
|
||||
def test_find_reusable_container_prefers_running_over_stopped(monkeypatch):
|
||||
"""When the probe returns multiple matches (shouldn't normally happen,
|
||||
but can after a crash leaves stale duplicates), a ``running`` container
|
||||
is preferred over any stopped one. The duplicate gets reaped later by
|
||||
the orphan reaper; we don't try to be heroic about it here."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
|
||||
def _run(cmd, **kwargs):
|
||||
if isinstance(cmd, list) and len(cmd) >= 2:
|
||||
if cmd[1] == "version":
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="ok", stderr="")
|
||||
if cmd[1] == "ps":
|
||||
# Two matches: stopped first, running second.
|
||||
return subprocess.CompletedProcess(
|
||||
cmd, 0,
|
||||
stdout="stopped-cid\texited\nrunning-cid\trunning\n",
|
||||
stderr="",
|
||||
)
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="fresh-cid\n", stderr="")
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _run)
|
||||
|
||||
env = _make_dummy_env(task_id="dup-match")
|
||||
assert env._container_id == "running-cid", (
|
||||
f"running container should win over stopped duplicate, got {env._container_id!r}"
|
||||
)
|
||||
|
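The selection rule these reuse tests describe (prefer a `running` match, otherwise fall back to the first stopped one) can be sketched as a small parser over the `CID\tSTATE` lines the mocks emit. `pick_reusable` is a hypothetical name for illustration; the actual probe in `docker_env` is not shown in this diff.

```python
def pick_reusable(ps_stdout):
    """Pick a container from tab-separated ``CID<TAB>STATE`` lines.

    Returns ``(cid, state)`` or ``None`` when nothing matched.
    """
    first_stopped = None
    for line in ps_stdout.splitlines():
        cid, sep, state = line.strip().partition("\t")
        if not sep or not cid:
            continue  # skip blank or malformed lines
        if state == "running":
            return cid, state  # a running container always wins
        if first_stopped is None:
            first_stopped = (cid, state)
    return first_stopped
```

A caller would then skip `docker start` when the returned state is `running` and attempt it otherwise, falling back to a fresh `docker run` if the start fails.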
||||
|
||||
# ── Cleanup correctness (issue #20561) ────────────────────────────
|
||||
|
||||
|
||||
class _FakeThread:
|
||||
"""Stand-in for threading.Thread that captures target/args and calls
|
||||
target() synchronously when .start() runs, so cleanup behavior is
|
||||
observable without actually backgrounding subprocess calls."""
|
||||
|
||||
def __init__(self, target=None, daemon=None, name=None):
|
||||
self._target = target
|
||||
self.daemon = daemon
|
||||
self.name = name
|
||||
self._done = False
|
||||
|
||||
def start(self):
|
||||
if self._target is not None:
|
||||
self._target()
|
||||
self._done = True
|
||||
|
||||
def is_alive(self):
|
||||
return not self._done
|
||||
|
||||
def join(self, timeout=None):
|
||||
self._done = True
|
||||
|
||||
|
||||
def _install_fake_thread(monkeypatch):
|
||||
import threading
|
||||
monkeypatch.setattr(threading, "Thread", _FakeThread)
|
||||
|
||||
|
||||
def test_cleanup_with_persist_is_noop_for_container(monkeypatch):
|
||||
"""``persist_across_processes=True`` (default) cleanup must NEITHER stop
|
||||
NOR remove the container — the docs promise "ONE long-lived container
|
||||
shared across sessions", and any docker stop would kill background
|
||||
processes inside the container (npm watchers, pytest watchers, etc.).
|
||||
|
||||
Resource reclamation in this mode happens via the orphan reaper on next
|
||||
Hermes startup, not on graceful exit. Issue #20561 — the first iteration
|
||||
of this PR did docker stop here, which Ben caught as contradicting the
|
||||
"ONE long-lived container" semantics."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
env = _make_dummy_env(task_id="cleanup-persist", persistent_filesystem=False)
|
||||
# Default persist_across_processes=True.
|
||||
container_id = env._container_id
|
||||
assert container_id
|
||||
|
||||
cleanup_calls = []
|
||||
real_run = docker_env.subprocess.run
|
||||
|
||||
def _capturing_run(cmd, **kwargs):
|
||||
cleanup_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
return real_run(cmd, **kwargs)
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _capturing_run)
|
||||
|
||||
env.cleanup()
|
||||
|
||||
stops = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "stop"]
|
||||
rms = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "rm"]
|
||||
assert not stops, (
|
||||
f"docker stop must NOT be called when persist_across_processes=True; "
|
||||
f"container has to stay running so background processes survive. "
|
||||
f"Got: {stops}"
|
||||
)
|
||||
assert not rms, (
|
||||
f"docker rm must NOT be called when persist_across_processes=True; "
|
||||
f"reuse would be impossible. Got: {rms}"
|
||||
)
|
||||
# The in-process handle must still be cleared so the next __init__
|
||||
# re-probes via labels (and reuses the still-running container).
|
||||
assert env._container_id is None, (
|
||||
"in-process container_id should be cleared even in no-op cleanup"
|
||||
)
|
||||
|
||||
|
||||
def test_cleanup_force_remove_stops_and_rms_even_in_persist_mode(monkeypatch):
|
||||
"""``cleanup(force_remove=True)`` must stop AND rm the container even
|
||||
when ``persist_across_processes=True``. This is the explicit-teardown
|
||||
path for ``/reset``, ``cleanup_vm(task_id, force_remove=True)``, and any
|
||||
future caller that wants a guaranteed fresh container.
|
||||
|
||||
Without this kwarg, callers in persist mode would have no way to force a
|
||||
fresh container without also flipping the global config — too coarse for
|
||||
a per-task reset.
|
||||
"""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
env = _make_dummy_env(task_id="cleanup-force", persistent_filesystem=False)
|
||||
assert env._container_id
|
||||
|
||||
cleanup_calls = []
|
||||
real_run = docker_env.subprocess.run
|
||||
|
||||
def _capturing_run(cmd, **kwargs):
|
||||
cleanup_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
return real_run(cmd, **kwargs)
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _capturing_run)
|
||||
|
||||
env.cleanup(force_remove=True)
|
||||
|
||||
stops = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "stop"]
|
||||
rms = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "rm"]
|
||||
assert stops, f"force_remove must docker stop; got: {cleanup_calls}"
|
||||
assert rms, f"force_remove must docker rm; got: {cleanup_calls}"
|
||||
|
||||
|
||||
def test_cleanup_vm_default_honors_persist_mode(monkeypatch):
|
||||
"""``cleanup_vm(task_id)`` without ``force_remove=True`` must be a no-op
|
||||
for a persist-mode container.
|
||||
|
||||
Regression for the bug Ben caught after commit 4: ``AIAgent.close()``
|
||||
(which is called from ``tui_gateway/server.py`` on session.close, from
|
||||
``gateway/run.py`` on per-session teardown, and from per-turn cleanup)
|
||||
calls ``cleanup_vm(task_id)``. If that defaulted to ``force_remove=True``
|
||||
we'd tear down the container on every TUI session close, defeating the
|
||||
"ONE long-lived container shared across sessions" contract.
|
||||
"""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
from tools import terminal_tool
|
||||
|
||||
env = _make_dummy_env(task_id="session-close-test")
|
||||
container_id = env._container_id
|
||||
terminal_tool._active_environments["session-close-test"] = env
|
||||
|
||||
cleanup_calls = []
|
||||
real_run = docker_env.subprocess.run
|
||||
|
||||
def _capturing_run(cmd, **kwargs):
|
||||
cleanup_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
return real_run(cmd, **kwargs)
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _capturing_run)
|
||||
|
||||
try:
|
||||
terminal_tool.cleanup_vm("session-close-test")
|
||||
finally:
|
||||
terminal_tool._active_environments.pop("session-close-test", None)
|
||||
|
||||
stops = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "stop"]
|
||||
rms = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "rm"]
|
||||
assert not stops, (
|
||||
f"cleanup_vm() default must not docker stop a persist-mode container; "
|
||||
f"got: {stops}"
|
||||
)
|
||||
assert not rms, (
|
||||
f"cleanup_vm() default must not docker rm a persist-mode container; "
|
||||
f"got: {rms}"
|
||||
)
|
||||
|
||||
|
||||
def test_cleanup_vm_force_remove_tears_down_persist_container(monkeypatch):
|
||||
"""``cleanup_vm(task_id, force_remove=True)`` tears down a persist-mode
|
||||
container — the explicit-teardown path for ``/reset``-style flows.
|
||||
|
||||
Also pins the runtime-signature-inspection plumbing: the kwarg must
|
||||
actually flow through ``cleanup_vm`` into the backend's ``cleanup()``.
|
||||
"""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
from tools import terminal_tool
|
||||
|
||||
env = _make_dummy_env(task_id="explicit-teardown-test")
|
||||
terminal_tool._active_environments["explicit-teardown-test"] = env
|
||||
|
||||
cleanup_calls = []
|
||||
real_run = docker_env.subprocess.run
|
||||
|
||||
def _capturing_run(cmd, **kwargs):
|
||||
cleanup_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
return real_run(cmd, **kwargs)
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _capturing_run)
|
||||
|
||||
try:
|
||||
terminal_tool.cleanup_vm("explicit-teardown-test", force_remove=True)
|
||||
finally:
|
||||
terminal_tool._active_environments.pop("explicit-teardown-test", None)
|
||||
|
||||
stops = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "stop"]
|
||||
rms = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "rm"]
|
||||
assert stops, f"force_remove must reach docker stop; got: {cleanup_calls}"
|
||||
assert rms, f"force_remove must reach docker rm; got: {cleanup_calls}"
|
||||
|
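The "runtime-signature-inspection plumbing" mentioned in the docstring above is not visible in this hunk. One plausible shape, assuming the dispatcher checks whether a backend's `cleanup()` accepts the kwarg before passing it, is sketched here. `call_cleanup` is a hypothetical helper, not the real `cleanup_vm` code.

```python
import inspect


def call_cleanup(env, force_remove=False):
    """Call ``env.cleanup()``, forwarding ``force_remove`` only if supported."""
    cleanup = env.cleanup
    try:
        params = inspect.signature(cleanup).parameters
    except (TypeError, ValueError):
        params = {}  # signature unavailable: assume the legacy no-arg form
    accepts_kwarg = "force_remove" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_kwarg:
        return cleanup(force_remove=force_remove)
    return cleanup()
```

This keeps backends whose `cleanup()` takes no arguments working unchanged while letting the Docker backend receive the explicit-teardown flag.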
||||
|
||||
def test_cleanup_with_persist_disabled_stops_and_rms(monkeypatch):
|
||||
"""``persist_across_processes=False`` cleanup must docker stop AND docker
|
||||
rm so containers don't leak. Crucially, this runs regardless of the
|
||||
``persistent_filesystem`` setting — the original code only rm'd when
|
||||
``not self._persistent``, which meant the default-on ``container_persistent:
|
||||
true`` users (the documented happy path) leaked Exited containers forever.
|
||||
Issue #20561 root-cause fix."""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
# Note: persistent_filesystem=True (the prior-leak scenario) + the new
|
||||
# cross-process toggle OFF must still result in a clean rm.
|
||||
env = docker_env.DockerEnvironment(
|
||||
image="python:3.11", cwd="/root", timeout=60,
|
||||
task_id="cleanup-no-persist", persistent_filesystem=True,
|
||||
persist_across_processes=False,
|
||||
)
|
||||
|
||||
cleanup_calls = []
|
||||
real_run = docker_env.subprocess.run
|
||||
|
||||
def _capturing_run(cmd, **kwargs):
|
||||
cleanup_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
return real_run(cmd, **kwargs)
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _capturing_run)
|
||||
|
||||
env.cleanup()
|
||||
|
||||
stops = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "stop"]
|
||||
rms = [c for c in cleanup_calls if isinstance(c[0], list) and len(c[0]) >= 2 and c[0][1] == "rm"]
|
||||
assert stops, "expected docker stop"
|
||||
assert rms, (
|
||||
"docker rm MUST run when persist_across_processes=False, even with "
|
||||
"persistent_filesystem=True — that gating was the leak source in #20561."
|
||||
)
|
||||
|
||||
|
||||
def test_cleanup_uses_subprocess_run_not_detached_shell(monkeypatch):
|
||||
"""The pre-fix code used ``subprocess.Popen("... &", shell=True)`` which
|
||||
raced with parent-process exit and silently dropped cleanup work. The
|
||||
new code must use ``subprocess.run`` with bounded ``timeout=`` so the
|
||||
work actually completes within the process lifetime.
|
||||
|
||||
Asserts cleanup never reaches into shell-mode Popen. Uses
|
||||
``force_remove=True`` so cleanup actually issues docker calls — the
|
||||
default persist-mode path is now a no-op (commit 4) and would trivially
|
||||
pass this assertion without exercising the docker code at all.
|
||||
"""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
def _forbidden_popen(*args, **kwargs):
|
||||
raise AssertionError(
|
||||
f"cleanup must not use subprocess.Popen anymore (issue #20561); "
|
||||
f"got args={args} kwargs={kwargs}"
|
||||
)
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "Popen", _forbidden_popen)
|
||||
|
||||
env = _make_dummy_env(task_id="no-popen-cleanup")
|
||||
env.cleanup(force_remove=True) # must not raise
|
||||
|
||||
|
||||
def test_wait_for_cleanup_returns_true_when_no_thread_started():
|
||||
"""``wait_for_cleanup`` must be a no-op when ``cleanup`` was never called
|
||||
(or the env has no live cleanup thread) — atexit calls it unconditionally
|
||||
across all active envs, so a False return would falsely flag healthy
|
||||
shutdowns."""
|
||||
env = docker_env.DockerEnvironment.__new__(docker_env.DockerEnvironment)
|
||||
# No _cleanup_thread set — simulates an env that was never cleanup()'d.
|
||||
assert env.wait_for_cleanup(timeout=1.0) is True
|
||||
|
||||
|
||||
def test_wait_for_cleanup_after_cleanup_returns_true(monkeypatch):
|
||||
"""End-to-end: cleanup() starts a thread, wait_for_cleanup() joins it
|
||||
and reports completion. Atexit relies on this contract to ensure docker
|
||||
stop/rm actually finishes before the Python interpreter exits.
|
||||
|
||||
Uses ``force_remove=True`` so cleanup actually starts a worker thread —
|
||||
the default persist-mode cleanup is a no-op (commit 4) and never spawns
|
||||
a thread, so the trivial "no thread" branch of wait_for_cleanup is
|
||||
already covered by the previous test.
|
||||
"""
|
||||
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
|
||||
monkeypatch.setattr(docker_env, "_get_active_profile_name", lambda: "default")
|
||||
_mock_subprocess_run(monkeypatch)
|
||||
_install_fake_thread(monkeypatch)
|
||||
|
||||
env = _make_dummy_env(task_id="wait-test")
|
||||
env.cleanup(force_remove=True)
|
||||
assert env.wait_for_cleanup(timeout=5.0) is True
|
||||
|
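The cleanup/wait contract exercised by the two tests above can be illustrated with a small standalone class. This is a sketch only: `BackgroundCleanup` and its injected `runner` callable are hypothetical stand-ins for `DockerEnvironment` and `subprocess.run`, and the 30-second timeout is an assumed value.

```python
import threading


class BackgroundCleanup:
    """Run stop + rm on a daemon thread and let callers wait for completion."""

    def __init__(self, runner):
        self._runner = runner  # e.g. subprocess.run in real code (assumption)
        self._thread = None

    def cleanup(self, container_id):
        def _work():
            for sub in ("stop", "rm"):
                try:
                    # Bounded timeout so a hung daemon cannot block forever.
                    self._runner(["docker", sub, container_id], timeout=30)
                except Exception:
                    pass  # best-effort: a failed stop must not skip the rm

        self._thread = threading.Thread(target=_work, daemon=True, name="cleanup")
        self._thread.start()

    def wait_for_cleanup(self, timeout=None):
        thread = self._thread
        if thread is None:
            return True  # nothing was started, so nothing to wait for
        thread.join(timeout)
        return not thread.is_alive()
```

An atexit hook can call `wait_for_cleanup()` on every environment unconditionally, since the no-thread case reports success rather than a false failure.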
||||
|
||||
def test_cleanup_on_env_with_no_container_id_does_not_raise(monkeypatch):
|
||||
"""A DockerEnvironment whose ``__init__`` failed before the container_id
|
||||
was set (image-pull error, docker daemon down) should still be safe to
|
||||
cleanup() — the post-creation failure path in callers always tries.
|
||||
Without this guard the daemon-down case used to raise NameError on the cleanup
|
||||
branch."""
|
||||
env = docker_env.DockerEnvironment.__new__(docker_env.DockerEnvironment)
|
||||
env._container_id = None
|
||||
env._persistent = False
|
||||
env._workspace_dir = None
|
||||
env._home_dir = None
|
||||
# No exception expected.
|
||||
env.cleanup()
|
||||
|
||||
|
||||
# ── Orphan reaper (issue #20561) ──────────────────────────────────
|
||||
|
||||
|
||||
def _now_iso(offset_seconds: int = 0) -> str:
|
||||
"""Return an RFC3339 timestamp ``offset_seconds`` in the past."""
|
||||
import datetime
|
||||
t = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(seconds=offset_seconds)
|
||||
# Format like Docker emits — with nanoseconds-style trailing digits.
|
||||
return t.replace(microsecond=0).isoformat().replace("+00:00", ".123456789Z")
|
||||
|
||||
|
||||
def _reaper_run_mock(monkeypatch, ps_ids: list[str], inspect_responses: dict[str, str],
|
||||
rm_succeeds: bool = True):
|
||||
"""Build a subprocess.run mock for reaper tests.
|
||||
|
||||
* ``ps_ids`` — what ``docker ps -a --filter ... --format '{{.ID}}'`` returns
|
||||
* ``inspect_responses[cid]`` — what ``docker inspect ... FinishedAt`` returns
|
||||
for each cid; ``""`` means "field unset".
|
||||
* ``rm_succeeds`` — whether ``docker rm -f`` returns 0.
|
||||
|
||||
Captures every call so tests can assert which containers were rm'd.
|
||||
"""
|
||||
calls = []
|
||||
|
||||
def _run(cmd, **kwargs):
|
||||
calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
|
||||
if not isinstance(cmd, list) or len(cmd) < 2:
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
|
||||
sub = cmd[1]
|
||||
if sub == "ps":
|
||||
return subprocess.CompletedProcess(
|
||||
cmd, 0, stdout="\n".join(ps_ids) + ("\n" if ps_ids else ""), stderr="",
|
||||
)
|
||||
if sub == "inspect":
|
||||
# cmd is [docker, inspect, --format, '{{.State.FinishedAt}}', cid]
|
||||
cid = cmd[-1]
|
||||
return subprocess.CompletedProcess(
|
||||
cmd, 0, stdout=inspect_responses.get(cid, "") + "\n", stderr="",
|
||||
)
|
||||
if sub == "rm":
|
||||
return subprocess.CompletedProcess(
|
||||
cmd, 0 if rm_succeeds else 1,
|
||||
stdout="", stderr="" if rm_succeeds else "no such container",
|
||||
)
|
||||
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
|
||||
|
||||
monkeypatch.setattr(docker_env.subprocess, "run", _run)
|
||||
return calls
|
||||
|
||||
|
||||
def test_reap_orphan_returns_zero_when_no_matches(monkeypatch):
    """No labeled containers → no rm calls, returns 0. Establishes the
    happy-path baseline for the orphan reaper (issue #20561)."""
    calls = _reaper_run_mock(monkeypatch, ps_ids=[], inspect_responses={})

    removed = docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="default", docker_exe="/usr/bin/docker",
    )

    assert removed == 0
    rms = [c for c in calls if isinstance(c[0], list) and c[0][1:2] == ["rm"]]
    assert not rms, "no rm calls expected when ps returns empty"


def test_reap_orphan_removes_stale_exited_container(monkeypatch):
    """An Exited container older than max_age_seconds must be removed.
    This is the core repair path for issue #20561 — without the reaper,
    SIGKILL'd Hermes processes leak containers permanently."""
    old = _now_iso(offset_seconds=900)  # 15 minutes ago
    calls = _reaper_run_mock(
        monkeypatch, ps_ids=["old-cid"], inspect_responses={"old-cid": old},
    )

    removed = docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="default", docker_exe="/usr/bin/docker",
    )

    assert removed == 1
    rms = [c for c in calls if isinstance(c[0], list) and c[0][1:2] == ["rm"]]
    assert len(rms) == 1
    assert "old-cid" in rms[0][0], f"expected rm of old-cid, got {rms[0][0]}"


def test_reap_orphan_spares_recently_exited_container(monkeypatch):
    """A container exited within max_age_seconds must NOT be reaped — that
    container belongs to a Hermes process that just finished and may be
    about to be replaced. Conservative window prevents racing sibling
    processes."""
    recent = _now_iso(offset_seconds=60)  # 1 minute ago
    calls = _reaper_run_mock(
        monkeypatch, ps_ids=["recent-cid"], inspect_responses={"recent-cid": recent},
    )

    removed = docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="default", docker_exe="/usr/bin/docker",
    )

    assert removed == 0
    rms = [c for c in calls if isinstance(c[0], list) and c[0][1:2] == ["rm"]]
    assert not rms, f"recent container must not be reaped, got rm calls: {rms}"


def test_reap_orphan_scopes_to_profile_filter_via_label(monkeypatch):
    """The reaper must pass ``--filter label=hermes-profile=<profile>`` to
    docker ps so it never sweeps another profile's containers. A research
    profile must not tear down the default profile's stragglers."""
    calls = _reaper_run_mock(monkeypatch, ps_ids=[], inspect_responses={})

    docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="research-bot", docker_exe="/usr/bin/docker",
    )

    ps_calls = [c for c in calls if isinstance(c[0], list) and c[0][1:2] == ["ps"]]
    assert ps_calls, "expected at least one docker ps call"
    flat = " ".join(ps_calls[0][0])
    assert "label=hermes-profile=research-bot" in flat, (
        f"profile filter not applied to docker ps; got args: {ps_calls[0][0]}"
    )
    assert "label=hermes-agent=1" in flat, (
        f"hermes-agent label filter must also be applied; got: {ps_calls[0][0]}"
    )
    assert "status=exited" in flat, (
        "must filter to exited containers only — running containers may "
        "belong to a sibling Hermes process and must NEVER be reaped"
    )


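The filter set this test checks can be sketched as a small pure function. This is only an illustration of the argument list that `reap_orphan_containers` assembles in this diff; `build_ps_command` is a hypothetical name, not a function in the codebase, and it skips the label sanitizing that the real code applies to the profile.

```python
def build_ps_command(docker, profile=None):
    """Hypothetical helper: assemble the `docker ps` argv used to list reap candidates."""
    # Always restrict to hermes-created containers that have already exited.
    filters = ["--filter", "label=hermes-agent=1", "--filter", "status=exited"]
    if profile:
        # Optional scoping so one profile never sweeps another profile's containers.
        filters += ["--filter", f"label=hermes-profile={profile}"]
    return [docker, "ps", "-a", *filters, "--format", "{{.ID}}"]


if __name__ == "__main__":
    print(" ".join(build_ps_command("docker", "research-bot")))
```

Keeping `status=exited` unconditional is the important design choice: a running container may belong to a sibling process and is never a candidate.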
def test_reap_orphan_skips_container_with_unparseable_finished_at(monkeypatch):
    """If docker inspect returns the zero-value ``0001-01-01T00:00:00Z`` (no
    FinishedAt yet) or an unparseable timestamp, the reaper must leave the
    container alone. Defensive — never reap a container whose age we can't
    determine."""
    calls = _reaper_run_mock(
        monkeypatch,
        ps_ids=["never-finished", "garbage-ts"],
        inspect_responses={
            "never-finished": "0001-01-01T00:00:00Z",
            "garbage-ts": "not-a-timestamp",
        },
    )

    removed = docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="default", docker_exe="/usr/bin/docker",
    )

    assert removed == 0
    rms = [c for c in calls if isinstance(c[0], list) and c[0][1:2] == ["rm"]]
    assert not rms, (
        f"reaper must NOT remove containers with unparseable FinishedAt; got: {rms}"
    )


def test_reap_orphan_handles_docker_ps_failure_gracefully(monkeypatch):
    """If docker ps itself fails (daemon down, permission denied), the
    reaper returns 0 without crashing. The reaper is best-effort plumbing,
    not a critical path — it must never block container creation."""
    def _failing_ps(cmd, **kwargs):
        if isinstance(cmd, list) and len(cmd) >= 2 and cmd[1] == "ps":
            return subprocess.CompletedProcess(cmd, 1, stdout="", stderr="Cannot connect to daemon")
        return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")

    monkeypatch.setattr(docker_env.subprocess, "run", _failing_ps)

    # Must not raise
    removed = docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="default", docker_exe="/usr/bin/docker",
    )
    assert removed == 0


def test_reap_orphan_continues_after_individual_rm_failure(monkeypatch):
    """If ``docker rm -f`` fails on one container (already removed by a
    concurrent process, container locked, etc.), the reaper must log and
    continue to the next candidate rather than aborting the whole sweep."""
    old = _now_iso(offset_seconds=900)
    rm_calls = []

    def _run(cmd, **kwargs):
        if not isinstance(cmd, list) or len(cmd) < 2:
            return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
        sub = cmd[1]
        if sub == "ps":
            return subprocess.CompletedProcess(
                cmd, 0, stdout="cid-a\ncid-b\ncid-c\n", stderr="",
            )
        if sub == "inspect":
            return subprocess.CompletedProcess(cmd, 0, stdout=old + "\n", stderr="")
        if sub == "rm":
            rm_calls.append(cmd[-1])
            # cid-b fails; cid-a and cid-c succeed.
            if cmd[-1] == "cid-b":
                return subprocess.CompletedProcess(cmd, 1, stdout="", stderr="no such container")
            return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
        return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")

    monkeypatch.setattr(docker_env.subprocess, "run", _run)

    removed = docker_env.reap_orphan_containers(
        max_age_seconds=600, profile_filter="default", docker_exe="/usr/bin/docker",
    )

    # All three were attempted, two succeeded.
    assert removed == 2
    assert set(rm_calls) == {"cid-a", "cid-b", "cid-c"}, (
        f"reaper must attempt all candidates even when one fails; got: {rm_calls}"
    )


def test_container_finished_at_parses_nanosecond_timestamp(monkeypatch):
    """Docker emits FinishedAt with nanosecond precision (RFC3339 with up to
    9 fractional digits), but Python's fromisoformat caps at microseconds.
    The helper must trim the extra digits without raising — otherwise every
    candidate gets skipped and the reaper does nothing."""

    def _run(cmd, **kwargs):
        return subprocess.CompletedProcess(
            cmd, 0,
            stdout="2026-05-28T13:45:00.123456789Z\n",
            stderr="",
        )

    monkeypatch.setattr(docker_env.subprocess, "run", _run)

    result = docker_env._container_finished_at("/usr/bin/docker", "test-cid")
    assert result is not None, "must parse RFC3339 with nanoseconds"
    import datetime
    assert result.tzinfo == datetime.timezone.utc
    assert result.year == 2026 and result.month == 5 and result.day == 28


def test_container_finished_at_returns_none_on_zero_value():
    """Docker's zero-value ``0001-01-01T00:00:00Z`` (never finished) must
    map to None so the reaper treats the container as unreapable."""
    # Direct test of the parsing helper — no subprocess needed since the
    # check happens after the inspect call returns.
    import subprocess as _subprocess

    class _MockRun:
        def __init__(self, stdout):
            self.returncode = 0
            self.stdout = stdout
            self.stderr = ""

    import unittest.mock
    with unittest.mock.patch.object(
        docker_env.subprocess, "run", return_value=_MockRun("0001-01-01T00:00:00Z\n"),
    ):
        result = docker_env._container_finished_at("/usr/bin/docker", "never-finished")
    assert result is None

@@ -0,0 +1,139 @@
|
||||
"""Integration tests for the docker orphan-reaper wiring in terminal_tool.

The reaper itself is unit-tested in tests/tools/test_docker_environment.py
under the "Orphan reaper" section. These tests cover the terminal_tool-side
gates: once-per-process behavior, the disable flag, and the
``lifetime_seconds`` doubling that determines the reaper's age threshold.

Issue #20561 — without these gates, parallel subagents would each fire the
reaper on container creation, and the ``terminal.docker_orphan_reaper: false``
opt-out would silently do nothing.
"""

import os
from unittest.mock import patch

import tools.terminal_tool as terminal_tool


def _reset_reaper_gate():
    """Clear the once-per-process flag between tests."""
    terminal_tool._docker_orphan_reaper_ran = False


def test_maybe_reap_runs_once_per_process(monkeypatch):
    """The reaper sweep must run at most once per Python interpreter.
    Parallel subagents that each call _create_environment(env_type='docker')
    would otherwise fire N concurrent docker ps + inspect storms against the
    daemon and waste 5–10s of startup."""
    _reset_reaper_gate()
    call_count = {"reap": 0}

    def _fake_reap(**kwargs):
        call_count["reap"] += 1
        return 0

    with patch("tools.environments.docker.reap_orphan_containers", _fake_reap):
        config = {"docker_orphan_reaper": True}
        terminal_tool._maybe_reap_docker_orphans(config)
        terminal_tool._maybe_reap_docker_orphans(config)
        terminal_tool._maybe_reap_docker_orphans(config)

    assert call_count["reap"] == 1, (
        f"reaper must run exactly once per process; got {call_count['reap']} calls"
    )


def test_maybe_reap_respects_disable_flag(monkeypatch):
    """``terminal.docker_orphan_reaper: false`` (via container_config) must
    skip the sweep entirely — no docker ps, no inspect, no rm. The escape
    hatch for operators running multiple Hermes processes in the same
    profile."""
    _reset_reaper_gate()
    call_count = {"reap": 0}

    def _fake_reap(**kwargs):
        call_count["reap"] += 1
        return 0

    with patch("tools.environments.docker.reap_orphan_containers", _fake_reap):
        terminal_tool._maybe_reap_docker_orphans({"docker_orphan_reaper": False})

    assert call_count["reap"] == 0, "disabled reaper must not run any docker calls"
    # The once-per-process gate must NOT be tripped when the reaper is
    # disabled — that would prevent a subsequent toggle to true from working.
    assert terminal_tool._docker_orphan_reaper_ran is False


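The two gates exercised above (run once per process, and leave the gate untouched when disabled) can be sketched as follows. The body of `_maybe_reap_docker_orphans` is not part of this excerpt, so `ReaperGate`, its `ran` attribute, and the default of `True` for a missing config key are all assumptions made for illustration; the sketch also assumes the gate is set before the sweep runs, so a failed sweep is not retried.

```python
import logging

logger = logging.getLogger(__name__)


class ReaperGate:
    """Hypothetical once-per-process gate around a reaper callable."""

    def __init__(self):
        self.ran = False

    def maybe_reap(self, config, reap):
        # Disabled: return before touching the gate so a later toggle still works.
        if not config.get("docker_orphan_reaper", True):
            return
        if self.ran:
            return
        self.ran = True
        try:
            reap()
        except Exception as exc:
            # Best-effort plumbing: log and continue rather than block env creation.
            logger.debug("orphan reaper failed: %s", exc)
```

Checking the disable flag before the gate is what keeps `ran` false in the disabled case, which is the property the second assertion in the test above pins down.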
def test_maybe_reap_doubles_lifetime_for_max_age(monkeypatch):
    """The reaper's age threshold is ``2 × lifetime_seconds`` (with a 60s
    floor). Generous default — gives sibling Hermes processes ample grace
    to be replaced without their just-exited containers being yanked."""
    _reset_reaper_gate()
    captured_args = {}

    def _fake_reap(**kwargs):
        captured_args.update(kwargs)
        return 0

    monkeypatch.setenv("TERMINAL_LIFETIME_SECONDS", "300")
    with patch("tools.environments.docker.reap_orphan_containers", _fake_reap):
        terminal_tool._maybe_reap_docker_orphans({"docker_orphan_reaper": True})

    assert captured_args.get("max_age_seconds") == 600, (
        f"expected 2 × 300 = 600, got {captured_args.get('max_age_seconds')}"
    )


def test_maybe_reap_floors_at_60_seconds(monkeypatch):
    """A user pinning TERMINAL_LIFETIME_SECONDS=0 (or any value below 60) would
    otherwise get an age threshold at or near zero, which would race the
    user's own just-started container creation. The lifetime is floored at
    60s before doubling, so the minimum threshold is 60s × 2 = 120s."""
    _reset_reaper_gate()
    captured_args = {}

    def _fake_reap(**kwargs):
        captured_args.update(kwargs)
        return 0

    monkeypatch.setenv("TERMINAL_LIFETIME_SECONDS", "0")
    with patch("tools.environments.docker.reap_orphan_containers", _fake_reap):
        terminal_tool._maybe_reap_docker_orphans({"docker_orphan_reaper": True})

    assert captured_args.get("max_age_seconds") == 120, (
        f"expected floored 60 × 2 = 120, got {captured_args.get('max_age_seconds')}"
    )


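The arithmetic these two tests pin down (300 gives 600, 0 gives 120) is consistent with flooring the lifetime at 60 seconds and then doubling it. A minimal sketch under that assumption follows; `reaper_max_age`, the `default_lifetime` of 300, and the fallback on unparseable values are hypothetical and not taken from `terminal_tool`.

```python
import os


def reaper_max_age(env=None, floor_seconds=60, default_lifetime=300):
    """Hypothetical helper: derive the reaper age threshold from TERMINAL_LIFETIME_SECONDS."""
    env = os.environ if env is None else env
    try:
        lifetime = int(env.get("TERMINAL_LIFETIME_SECONDS", default_lifetime))
    except (TypeError, ValueError):
        # Assumed behavior: fall back to the default on a malformed value.
        lifetime = default_lifetime
    # Floor first, then double, so the threshold never drops below 2 * floor.
    return 2 * max(lifetime, floor_seconds)


if __name__ == "__main__":
    print(reaper_max_age({"TERMINAL_LIFETIME_SECONDS": "300"}))
    print(reaper_max_age({"TERMINAL_LIFETIME_SECONDS": "0"}))
```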
def test_maybe_reap_passes_current_profile_as_filter(monkeypatch):
    """The reaper must be scoped to the current Hermes profile — a research
    profile must NEVER reap default's containers. Verifies the
    profile-filter wiring."""
    _reset_reaper_gate()
    captured_args = {}

    def _fake_reap(**kwargs):
        captured_args.update(kwargs)
        return 0

    with patch("tools.environments.docker.reap_orphan_containers", _fake_reap), \
         patch("tools.environments.docker._get_active_profile_name", return_value="research-bot"):
        terminal_tool._maybe_reap_docker_orphans({"docker_orphan_reaper": True})

    assert captured_args.get("profile_filter") == "research-bot", (
        f"expected profile_filter='research-bot', got {captured_args.get('profile_filter')!r}"
    )


def test_maybe_reap_swallows_exceptions(monkeypatch):
    """A reaper crash (docker daemon down, parse error in helper) must NOT
    block env creation. The reaper is best-effort plumbing, not a critical
    path; failures get logged at debug level and execution continues."""
    _reset_reaper_gate()

    def _exploding_reap(**kwargs):
        raise RuntimeError("docker daemon ate the cat")

    with patch("tools.environments.docker.reap_orphan_containers", _exploding_reap):
        # Must not raise
        terminal_tool._maybe_reap_docker_orphans({"docker_orphan_reaper": True})
@@ -34,6 +34,39 @@ def test_resolve_stdio_command_falls_back_to_hermes_node_bin(tmp_path):
|
||||
assert env["PATH"].split(os.pathsep)[0] == str(node_bin)
|
||||
|
||||
|
||||
def test_resolve_stdio_command_falls_back_to_usr_local_bin():
    """When ``npx`` isn't on the filtered PATH and isn't under ``$HERMES_HOME/node/bin``
    or ``~/.local/bin``, the resolver should still locate it at ``/usr/local/bin/npx``.

    This is the canonical install location for Node on Linux from-source builds,
    the upstream ``node:bookworm-slim`` image (which the Hermes Docker image
    copies ``node + npm + corepack`` from since #4977), and macOS Homebrew on
    Intel. Without this candidate, MCP servers run with an ``env.PATH`` that
    omits ``/usr/local/bin`` (common when users hand-author PATH for sandboxing)
    fail with ENOENT at ``execvp``.
    """
    target = os.path.join(os.sep, "usr", "local", "bin", "npx")

    # Pretend ONLY the /usr/local/bin/npx candidate exists and is executable —
    # the other candidates ($HERMES_HOME/node/bin/npx and ~/.local/bin/npx)
    # should fail isfile() and the resolver must fall through to /usr/local/bin.
    def _fake_isfile(path):
        return path == target

    def _fake_access(path, _mode):
        return path == target

    with patch("tools.mcp_tool.shutil.which", return_value=None), \
         patch("tools.mcp_tool.os.path.isfile", side_effect=_fake_isfile), \
         patch("tools.mcp_tool.os.access", side_effect=_fake_access):
        command, env = _resolve_stdio_command("npx", {"PATH": "/opt/data/bin:/usr/bin:/bin"})

    assert command == target
    # /usr/local/bin must be prepended so npx's shebang (`/usr/bin/env node`)
    # can find node in the same directory.
    assert env["PATH"].split(os.pathsep)[0] == os.path.dirname(target)


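The fallback this test describes (try PATH first, then a fixed list of candidate directories, and prepend the winning directory to PATH) might look roughly like the sketch below. The real `_resolve_stdio_command` is not shown in this excerpt, so `resolve_command` and its `candidate_dirs` parameter are hypothetical names and the logic is an assumption based only on the test's assertions.

```python
import os
import shutil


def resolve_command(command, env, candidate_dirs):
    """Hypothetical resolver: PATH lookup first, then well-known install directories."""
    path = env.get("PATH", "")
    found = shutil.which(command, path=path)
    if found:
        return found, env
    for directory in candidate_dirs:
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            new_env = dict(env)
            # Prepend the directory so a `#!/usr/bin/env node` shebang can find
            # an interpreter installed next to the resolved binary.
            new_env["PATH"] = os.pathsep.join(p for p in (directory, path) if p)
            return candidate, new_env
    # Nothing matched: hand the bare command back and let exec report ENOENT.
    return command, env
```

Returning a copy of `env` instead of mutating it keeps the caller's hand-authored PATH intact when the same mapping is reused for other servers.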
def test_resolve_stdio_command_respects_explicit_empty_path():
    seen_paths = []

@@ -224,3 +224,39 @@ def test_docker_env_is_bridged_everywhere():
|
||||
assert "docker_env" in _gateway_env_map_keys()
|
||||
assert "docker_env" in _save_config_env_sync_keys()
|
||||
assert "TERMINAL_DOCKER_ENV" in _terminal_tool_env_var_names()
|
||||
|
||||
|
||||
def test_docker_persist_across_processes_is_bridged_everywhere():
    """Regression pin for the cross-process container reuse toggle.

    ``terminal.docker_persist_across_processes`` (issue #20561) controls
    whether ``DockerEnvironment.__init__`` probes for and reuses an existing
    labeled container at startup, and whether ``cleanup()`` removes the
    container on Hermes exit or just stops it (keeping it for the next
    process). Same four-bridge invariant as docker_run_as_host_user /
    docker_env / docker_mount_cwd_to_workspace — drift between any of the
    four sites means ``terminal.docker_persist_across_processes: false`` in
    config.yaml silently does nothing for that entry point, leaving the
    user unable to opt out of the documented "ONE long-lived container
    shared across sessions" behavior.
    """
    assert "docker_persist_across_processes" in _cli_env_map_keys()
    assert "docker_persist_across_processes" in _gateway_env_map_keys()
    assert "docker_persist_across_processes" in _save_config_env_sync_keys()
    assert "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES" in _terminal_tool_env_var_names()


def test_docker_orphan_reaper_is_bridged_everywhere():
    """Regression pin for the startup orphan reaper toggle (issue #20561).

    ``terminal.docker_orphan_reaper`` controls whether Hermes sweeps stale
    Exited containers from prior SIGKILL'd processes at startup. Same
    four-site bridge invariant — drift means
    ``terminal.docker_orphan_reaper: false`` silently does nothing for one
    entry point, and the reaper either runs when the operator disabled it
    or fails to run when they enabled it.
    """
    assert "docker_orphan_reaper" in _cli_env_map_keys()
    assert "docker_orphan_reaper" in _gateway_env_map_keys()
    assert "docker_orphan_reaper" in _save_config_env_sync_keys()
    assert "TERMINAL_DOCKER_ORPHAN_REAPER" in _terminal_tool_env_var_names()

@@ -44,11 +44,17 @@ def server(hermes_home):
|
||||
):
|
||||
mod = importlib.import_module("tui_gateway.server")
|
||||
yield mod
|
||||
# Reset module-level session state without re-importing. importlib.reload
|
||||
# would re-register the module's atexit hooks (ThreadPoolExecutor
|
||||
# shutdown, _shutdown_sessions); the duplicates race the stderr
|
||||
# buffer at interpreter shutdown and surface as Fatal Python error:
|
||||
# _enter_buffered_busy. Clearing the per-session dicts gives the
|
||||
# next test a clean slate; _methods is NOT cleared because it's
|
||||
# populated at module import time and re-registration only happens
|
||||
# via reload (which we don't do).
|
||||
mod._sessions.clear()
|
||||
mod._pending.clear()
|
||||
mod._answers.clear()
|
||||
mod._methods.clear()
|
||||
importlib.reload(mod)
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
|
||||
@@ -30,11 +30,17 @@ def server():
|
||||
import importlib
|
||||
mod = importlib.import_module("tui_gateway.server")
|
||||
yield mod
|
||||
# Reset module-level session state without re-importing. importlib.reload
|
||||
# would re-register the module's atexit hooks (ThreadPoolExecutor
|
||||
# shutdown, _shutdown_sessions); the duplicates race the stderr
|
||||
# buffer at interpreter shutdown and surface as Fatal Python error:
|
||||
# _enter_buffered_busy. Clearing the per-session dicts gives the
|
||||
# next test a clean slate; _methods is NOT cleared because it's
|
||||
# populated at module import time and re-registration only happens
|
||||
# via reload (which we don't do).
|
||||
mod._sessions.clear()
|
||||
mod._pending.clear()
|
||||
mod._answers.clear()
|
||||
mod._methods.clear()
|
||||
importlib.reload(mod)
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
|
||||
@@ -34,11 +34,17 @@ def server():
|
||||
|
||||
mod = importlib.import_module("tui_gateway.server")
|
||||
yield mod
|
||||
# Reset module-level session state without re-importing. importlib.reload
|
||||
# would re-register the module's atexit hooks (ThreadPoolExecutor
|
||||
# shutdown, _shutdown_sessions); the duplicates race the stderr
|
||||
# buffer at interpreter shutdown and surface as Fatal Python error:
|
||||
# _enter_buffered_busy. Clearing the per-session dicts gives the
|
||||
# next test a clean slate; _methods is NOT cleared because it's
|
||||
# populated at module import time and re-registration only happens
|
||||
# via reload (which we don't do).
|
||||
mod._sessions.clear()
|
||||
mod._pending.clear()
|
||||
mod._answers.clear()
|
||||
mod._methods.clear()
|
||||
importlib.reload(mod)
|
||||
|
||||
|
||||
def test_init_session_attaches_background_review_callback(server, monkeypatch):
|
||||
|
||||
+427
-40
@@ -98,6 +98,167 @@ def _load_hermes_env_vars() -> dict[str, str]:
|
||||
return {}
|
||||
|
||||
|
||||
# Docker label values must match [a-zA-Z0-9_.-] and stay ≤63 chars to round-trip
# safely through `docker ps --filter label=key=value`. Profile and task names
# can technically contain other characters; sanitize defensively.
_LABEL_VALUE_OK_RE = re.compile(r"[^A-Za-z0-9_.-]")


def _sanitize_label_value(value: str) -> str:
    """Coerce *value* into a Docker label-safe form (alnum + ``_.-``, ≤63 chars).

    Empty or non-string inputs collapse to ``"unknown"``; every character
    outside the safe set is replaced with ``_``, so the resulting label is
    always queryable. Used at container-create time; never round-trip
    a sanitized value back into application logic.
    """
    if not isinstance(value, str) or not value:
        return "unknown"
    cleaned = _LABEL_VALUE_OK_RE.sub("_", value)
    cleaned = cleaned[:63] or "unknown"
    return cleaned


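A standalone copy of the sanitizing rule makes its edge cases easier to see. The sketch below mirrors the regex and the 63-character cap from the function above; `sanitize_label_value` and `_INVALID` are illustrative names used only here.

```python
import re

# Anything outside the label-safe set is replaced, not dropped.
_INVALID = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_label_value(value):
    """Illustrative copy of the rule: replace unsafe characters, cap at 63 chars."""
    if not isinstance(value, str) or not value:
        return "unknown"
    return _INVALID.sub("_", value)[:63] or "unknown"


if __name__ == "__main__":
    for raw in ("research bot/v2", "", "a/b", "a_b"):
        print(repr(raw), "->", sanitize_label_value(raw))
```

Because `a/b` and `a_b` both map to `a_b`, the mapping is lossy, which is one reason a sanitized value should not be fed back into application logic.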
def _get_active_profile_name() -> str:
    """Return the active Hermes profile name, or ``"default"`` on any error.

    Resolved at container-create time so a single container is permanently
    tagged with the profile that created it. Profile switches inside the
    same process don't retroactively relabel running containers.
    """
    try:
        from hermes_cli.profiles import get_active_profile_name

        return get_active_profile_name() or "default"
    except Exception:
        return "default"


def reap_orphan_containers(
    *,
    max_age_seconds: int = 600,
    profile_filter: str | None = None,
    docker_exe: str | None = None,
) -> int:
    """Remove stale hermes-tagged containers left behind by prior processes.

    Targets containers that match all of:

    * ``label=hermes-agent=1`` (created by this codebase)
    * ``status=exited`` (running containers are NEVER reaped — they may
      belong to a sibling Hermes process whose reuse path will pick them
      up; killing them would crash the sibling mid-command)
    * (optional) ``label=hermes-profile=<profile_filter>`` (sweep only the
      caller's profile by default; a hermes process in profile A must not
      tear down profile B's containers)
    * ``State.FinishedAt`` older than *max_age_seconds* ago (so a sibling
      process that just exited and is about to be replaced doesn't get
      its container yanked out from under it)

    Returns the number of containers removed. Best-effort: any failure
    (docker daemon unreachable, slow inspect, parse error) is logged at
    debug level and the function returns whatever it managed before the
    failure. Safe to call repeatedly; idempotent.

    Issue #20561 — this is the safety net for SIGKILL / OOM / crashed
    terminal exits that bypass the ``atexit`` cleanup hook. Without it,
    even with the cleanup-fix in the prior commit, a hard-killed Hermes
    process leaves its container behind permanently because there's no
    subsequent Hermes process scheduled to reuse that exact (task, profile)
    pair.
    """
    docker = docker_exe or find_docker() or "docker"
    filters = ["--filter", "label=hermes-agent=1", "--filter", "status=exited"]
    if profile_filter:
        filters.extend(["--filter", f"label=hermes-profile={_sanitize_label_value(profile_filter)}"])

    try:
        listing = subprocess.run(
            [docker, "ps", "-a", *filters, "--format", "{{.ID}}"],
            capture_output=True, text=True, timeout=15, check=False,
        )
    except (subprocess.TimeoutExpired, OSError) as e:
        logger.debug("orphan reaper docker ps failed: %s", e)
        return 0
    if listing.returncode != 0:
        logger.debug(
            "orphan reaper docker ps returned %d: %s",
            listing.returncode, listing.stderr.strip(),
        )
        return 0

    candidate_ids = [ln.strip() for ln in listing.stdout.splitlines() if ln.strip()]
    if not candidate_ids:
        return 0

    # Inspect each candidate to get FinishedAt; reap only those exited
    # long enough ago. Doing this per-container (rather than bulk inspect)
    # keeps the failure blast radius to one container at a time.
    import datetime
    now = datetime.datetime.now(datetime.timezone.utc)
    removed = 0
    for cid in candidate_ids:
        finished_at = _container_finished_at(docker, cid)
        if finished_at is None:
            # Couldn't determine age — be conservative and leave it alone.
            continue
        age = (now - finished_at).total_seconds()
        if age < max_age_seconds:
            continue
        try:
            result = subprocess.run(
                [docker, "rm", "-f", cid],
                capture_output=True, text=True, timeout=30,
            )
            if result.returncode == 0:
                removed += 1
                logger.info(
                    "Reaped orphan container %s (exited %d seconds ago)",
                    cid[:12], int(age),
                )
            else:
                logger.debug(
                    "docker rm -f %s failed: %s",
                    cid[:12], result.stderr.strip(),
                )
        except (subprocess.TimeoutExpired, OSError) as e:
            logger.debug("orphan reaper docker rm %s failed: %s", cid[:12], e)
    return removed


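Stripped of the subprocess calls, the reaper's selection rule is a pure age comparison. The sketch below isolates that rule so it can be reasoned about with fixed timestamps; `select_reapable` is a hypothetical helper, not part of the diff, and it takes already-parsed `FinishedAt` values instead of calling `docker inspect`.

```python
import datetime


def select_reapable(finished_at_by_id, now, max_age_seconds=600):
    """Hypothetical helper: pick container ids whose exit is old enough to reap."""
    reapable = []
    for cid, finished_at in finished_at_by_id.items():
        if finished_at is None:
            # Unknown age: stay conservative and leave the container alone.
            continue
        age = (now - finished_at).total_seconds()
        if age >= max_age_seconds:
            reapable.append(cid)
    return reapable


if __name__ == "__main__":
    now = datetime.datetime(2026, 5, 28, 14, 0, tzinfo=datetime.timezone.utc)
    ages = {
        "old-cid": now - datetime.timedelta(seconds=900),
        "recent-cid": now - datetime.timedelta(seconds=60),
        "never-finished": None,
    }
    print(select_reapable(ages, now))
```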
def _container_finished_at(docker_exe: str, container_id: str):
    """Parse ``docker inspect`` FinishedAt for *container_id*.

    Returns a timezone-aware datetime, or ``None`` if the field is missing,
    unparseable, or the zero-value ``0001-01-01T00:00:00Z`` Docker emits
    for never-finished containers. ``None`` means "don't reap" — the caller
    leaves the container alone.
    """
    try:
        result = subprocess.run(
            [docker_exe, "inspect", "--format", "{{.State.FinishedAt}}", container_id],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (subprocess.TimeoutExpired, OSError) as e:
        logger.debug("orphan reaper docker inspect %s failed: %s", container_id[:12], e)
        return None
    if result.returncode != 0:
        return None
    raw = result.stdout.strip()
    if not raw or raw.startswith("0001-01-01"):
        return None
    # Docker emits RFC3339 with nanoseconds (e.g. "2026-05-28T13:45:00.123456789Z").
    # Python's fromisoformat handles microseconds but not nanoseconds; trim.
    import re as _re
    raw = _re.sub(r"(\.\d{6})\d+", r"\1", raw)
    raw = raw.replace("Z", "+00:00")
    try:
        import datetime
        return datetime.datetime.fromisoformat(raw)
    except ValueError as e:
        logger.debug("could not parse FinishedAt %r for %s: %s", raw, container_id[:12], e)
        return None


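The timestamp handling above can be tried in isolation from `docker inspect`. The sketch below repeats the same three steps (reject the zero value, trim fractional digits past microseconds, swap `Z` for an explicit offset); `parse_finished_at` is an illustrative name, not the function in the diff.

```python
import datetime
import re


def parse_finished_at(raw):
    """Illustrative parser for Docker's RFC3339 FinishedAt strings."""
    raw = raw.strip()
    # Docker reports 0001-01-01T00:00:00Z for containers that never finished.
    if not raw or raw.startswith("0001-01-01"):
        return None
    # Keep at most six fractional digits, then make the UTC offset explicit.
    raw = re.sub(r"(\.\d{6})\d+", r"\1", raw).replace("Z", "+00:00")
    try:
        return datetime.datetime.fromisoformat(raw)
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_finished_at("2026-05-28T13:45:00.123456789Z"))
    print(parse_finished_at("0001-01-01T00:00:00Z"))
    print(parse_finished_at("not-a-timestamp"))
```

Trimming and rewriting the suffix keeps the parser independent of the Python version, since `fromisoformat` only learned to accept a trailing `Z` in newer releases.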
def find_docker() -> Optional[str]:
    """Locate the docker (or podman) CLI binary.

@@ -304,15 +465,18 @@ class DockerEnvironment(BaseEnvironment):
|
||||
        auto_mount_cwd: bool = False,
        run_as_host_user: bool = False,
        extra_args: list = None,
        persist_across_processes: bool = True,
    ):
        if cwd == "~":
            cwd = "/root"
        super().__init__(cwd=cwd, timeout=timeout)
        self._persistent = persistent_filesystem
        self._persist_across_processes = persist_across_processes
        self._task_id = task_id
        self._forward_env = _normalize_forward_env_names(forward_env)
        self._env = _normalize_env_dict(env)
        self._container_id: Optional[str] = None
        self._labels: dict[str, str] = {}
        logger.info(f"DockerEnvironment volumes: {volumes}")
        # Ensure volumes is a list (config.yaml could be malformed)
        if volumes is not None and not isinstance(volumes, list):
@@ -506,25 +670,88 @@ class DockerEnvironment(BaseEnvironment):
|
||||
|
||||
# Start the container directly via `docker run -d`.
|
||||
container_name = f"hermes-{uuid.uuid4().hex[:8]}"
|
||||
run_cmd = [
|
||||
self._docker_exe, "run", "-d",
|
||||
"--init", # tini/catatonit as PID 1 — reaps zombie children
|
||||
"--name", container_name,
|
||||
"-w", cwd,
|
||||
*all_run_args,
|
||||
image,
|
||||
"sleep", "infinity", # no fixed lifetime — idle reaper handles cleanup
|
||||
# Labels make hermes-created containers identifiable to:
|
||||
# * the orphan reaper (`hermes-agent=1` for the global sweep filter)
|
||||
# * future cross-process reuse (`hermes-task-id`, `hermes-profile`)
|
||||
# * operators running `docker ps --filter label=hermes-agent=1`
|
||||
# Values are limited to the safe character set defined by
|
||||
# _sanitize_label_value(); the active Hermes profile is captured at
|
||||
# container-start time and never changes for the container's lifetime.
|
||||
profile_name = _sanitize_label_value(_get_active_profile_name())
|
||||
task_label = _sanitize_label_value(task_id)
|
||||
label_args = [
|
||||
"--label", "hermes-agent=1",
|
||||
"--label", f"hermes-task-id={task_label}",
|
||||
"--label", f"hermes-profile={profile_name}",
|
||||
]
|
||||
logger.debug(f"Starting container: {' '.join(run_cmd)}")
|
||||
result = subprocess.run(
|
||||
run_cmd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=120, # image pull may take a while
|
||||
check=True,
|
||||
)
|
||||
self._container_id = result.stdout.strip()
|
||||
logger.info(f"Started container {container_name} ({self._container_id[:12]})")
|
||||
self._labels = {
|
||||
"hermes-agent": "1",
|
||||
"hermes-task-id": task_label,
|
||||
"hermes-profile": profile_name,
|
||||
}
|
||||
|
||||
# Cross-process container reuse (issue #20561 — docs claim "ONE long-lived
|
||||
# container shared across sessions"). If a prior Hermes process
|
||||
# already started a container for this (task_id, profile) and it
|
||||
# still exists, attach to it instead of starting a fresh one. This
|
||||
# restores the documented contract; opt out via
|
||||
# ``terminal.docker_persist_across_processes: false``.
|
||||
#
|
||||
# Reuse matches on labels only — we deliberately do NOT compare image
|
||||
# / mounts / resources. Operators who need a fresh container after
|
||||
# changing those settings should set ``docker_persist_across_processes:
|
||||
# false`` (or run ``docker rm -f`` against the labeled container) to
|
||||
# force a clean start.
|
||||
reused = False
|
||||
if persist_across_processes:
|
||||
existing = self._find_reusable_container(task_label, profile_name)
|
||||
if existing is not None:
|
||||
container_id, state = existing
|
||||
self._container_id = container_id
|
||||
if state != "running":
|
||||
try:
|
||||
subprocess.run(
|
||||
[self._docker_exe, "start", container_id],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=30,
|
||||
check=True,
|
||||
)
|
||||
except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
|
||||
logger.warning(
|
||||
"Failed to start existing container %s (state=%s): "
|
||||
"%s — falling back to a fresh container.",
|
||||
container_id[:12], state, e,
|
||||
)
|
||||
self._container_id = None
|
||||
if self._container_id:
|
||||
logger.info(
|
||||
"Reusing container %s (task=%s, profile=%s, prior state=%s)",
|
||||
container_id[:12], task_label, profile_name, state,
|
||||
)
|
||||
reused = True
|
||||
|
||||
if not reused:
|
||||
run_cmd = [
|
||||
self._docker_exe, "run", "-d",
|
||||
"--init", # tini/catatonit as PID 1 — reaps zombie children
|
||||
"--name", container_name,
|
||||
*label_args,
|
||||
"-w", cwd,
|
||||
*all_run_args,
|
||||
image,
|
||||
"sleep", "infinity", # no fixed lifetime — idle reaper handles cleanup
|
||||
]
|
||||
logger.debug(f"Starting container: {' '.join(run_cmd)}")
|
||||
result = subprocess.run(
|
||||
run_cmd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=120, # image pull may take a while
|
||||
check=True,
|
||||
)
|
||||
self._container_id = result.stdout.strip()
|
||||
logger.info(f"Started container {container_name} ({self._container_id[:12]})")
|
||||
|
||||
# Build the init-time env forwarding args (used only by init_session
|
||||
# to inject host env vars into the snapshot; subsequent commands get
|
||||
@@ -629,31 +856,191 @@ class DockerEnvironment(BaseEnvironment):
|
||||
logger.debug("Docker --storage-opt support: %s", _storage_opt_ok)
|
||||
return _storage_opt_ok
|
||||
|
||||
def cleanup(self):
|
||||
"""Stop and remove the container. Bind-mount dirs persist if persistent=True."""
|
||||
if self._container_id:
|
||||
try:
|
||||
# Stop in background so cleanup doesn't block
|
||||
stop_cmd = (
|
||||
f"(timeout 60 {self._docker_exe} stop {self._container_id} || "
|
||||
f"{self._docker_exe} rm -f {self._container_id}) >/dev/null 2>&1 &"
|
||||
)
|
||||
subprocess.Popen(stop_cmd, shell=True)
|
||||
except Exception as e:
|
||||
logger.warning("Failed to stop container %s: %s", self._container_id, e)
|
||||
    def _find_reusable_container(self, task_label: str, profile_label: str) -> Optional[tuple[str, str]]:
        """Look for an existing container labeled for this (task, profile).

        Returns ``(container_id, state)`` on hit, ``None`` on miss / on any
        failure (including ``docker ps`` itself failing). State is one of the
        values Docker reports via ``{{.State}}`` — e.g. ``running``, ``exited``,
        ``created``, ``paused``, ``restarting``, ``dead``. The caller decides
        whether the state warrants ``docker start`` before reuse.

        Restricted to the docker-stored label set this class creates; never
        matches containers that happened to be named ``hermes-*`` but were
        started by some other tool.
        """
        try:
            result = subprocess.run(
                [
                    self._docker_exe, "ps", "-a",
                    "--filter", "label=hermes-agent=1",
                    "--filter", f"label=hermes-task-id={task_label}",
                    "--filter", f"label=hermes-profile={profile_label}",
                    "--format", "{{.ID}}\t{{.State}}",
                ],
                capture_output=True,
                text=True,
                timeout=10,
                check=False,
            )
        except (subprocess.TimeoutExpired, OSError) as e:
            logger.debug("docker ps probe failed: %s — will start a fresh container", e)
            return None
        if result.returncode != 0:
            logger.debug(
                "docker ps probe returned %d: %s — will start a fresh container",
                result.returncode, result.stderr.strip(),
            )
            return None
        lines = [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]
        if not lines:
            return None
        # Multiple matches are unusual (one (task, profile) should produce one
        # container) but can happen if a previous Hermes process crashed
        # mid-cleanup. Prefer a running one if present; otherwise pick the
        # first listed. Stale duplicates get reaped by the orphan-reaper in a
        # follow-up commit; we don't try to be heroic about them here.
        running = None
        first = None
        for ln in lines:
            parts = ln.split("\t", 1)
            if len(parts) != 2:
                continue
            cid, state = parts[0], parts[1].lower()
            if first is None:
                first = (cid, state)
            if state == "running" and running is None:
                running = (cid, state)
        return running or first

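The selection rule in `_find_reusable_container` (prefer a running container, otherwise take the first one listed) can be exercised without a Docker daemon. The sketch below is an illustration only: `pick_container` is a hypothetical helper name that does not appear in the PR, and it assumes the input is the tab-separated `{{.ID}}\t{{.State}}` text that `docker ps` would print.

```python
from typing import Optional, Tuple


def pick_container(ps_output: str) -> Optional[Tuple[str, str]]:
    """Pick (container_id, state) from tab-separated `docker ps` output.

    Hypothetical helper: mirrors the "running first, else first listed"
    rule, and skips lines that do not split into exactly two fields.
    """
    running = None
    first = None
    for line in ps_output.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split("\t", 1)
        if len(parts) != 2:
            continue  # malformed row, ignore it
        cid, state = parts[0], parts[1].lower()
        if first is None:
            first = (cid, state)
        if state == "running" and running is None:
            running = (cid, state)
    return running or first
```

Keeping the parsing in a pure function like this is one way to unit-test the duplicate-container case (for example, a crashed process that left an exited container next to a running one) without shelling out.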
def cleanup(self, *, force_remove: bool = False):
|
||||
"""Tear down the container according to persist mode and *force_remove*.
|
||||
|
||||
Persist-mode (``persist_across_processes=True``, the default) leaves the
|
||||
container **running** and untouched. The docs promise "ONE long-lived
|
||||
container shared across sessions" and stopping it on every Hermes exit
|
||||
breaks that promise:
|
||||
|
||||
* Background processes inside the container (``npm run dev``, watchers,
|
||||
long-running pytest) get killed every time the user runs ``/quit``.
|
||||
* Every reuse requires ``docker start`` + waiting for the container to
|
||||
come back up, adding 1–2s to the first tool call of the new session.
|
||||
* The user-visible difference between "ONE long-lived container" and
|
||||
"a new container that happens to share state" is exactly this:
|
||||
processes survive in the former, die in the latter.
|
||||
|
||||
Resource reclamation for the persist-mode case lives in the
|
||||
``reap_orphan_containers()`` path (see issue #20561 commit 3): if no
|
||||
Hermes process touches a labeled container for ``2 × lifetime_seconds``
|
||||
it gets ``docker rm -f``'d at the next Hermes startup. That covers the
|
||||
SIGKILL / OOM / abandoned-laptop cases without us needing to stop the
|
||||
container on every graceful exit.
|
||||
|
||||
Opt-out mode (``persist_across_processes=False``) still does
|
||||
``docker stop`` + ``docker rm -f`` on every cleanup, matching the
|
||||
pre-PR behavior for users who explicitly want per-process isolation.
|
||||
|
||||
``force_remove=True`` overrides persist mode and always tears the
|
||||
container down (``docker stop`` + ``docker rm -f``). This is the
|
||||
explicit-teardown path for ``/reset``, ``cleanup_vm(task_id)``-driven
|
||||
resets, or any caller that wants a guaranteed fresh container on next
|
||||
``DockerEnvironment(task_id=...)``. No current caller passes
|
||||
``force_remove=True``; the parameter is here so the explicit-teardown
|
||||
semantics can be wired up later without changing this method's
|
||||
signature.
|
||||
|
||||
Cleanup runs on a daemon thread with bounded ``subprocess.run`` calls
|
||||
(not the racy ``Popen(... &)`` pattern from before PR #33645). The
|
||||
atexit hook in ``tools/terminal_tool.py`` waits up to 15s for the
|
||||
thread to finish before the interpreter exits, so ``docker stop`` /
|
||||
``docker rm`` actually completes when we do trigger it.
|
||||
"""
|
||||
container_id = self._container_id
|
||||
if not container_id:
|
||||
# Still drop the bind-mount dirs if any were allocated and we're
|
||||
# NOT in persist mode (persist mode preserves them).
|
||||
if not self._persistent:
|
||||
# Also schedule removal (stop only leaves it as stopped)
|
||||
try:
|
||||
subprocess.Popen(
|
||||
f"sleep 3 && {self._docker_exe} rm -f {self._container_id} >/dev/null 2>&1 &",
|
||||
shell=True,
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
self._container_id = None
|
||||
for d in (self._workspace_dir, self._home_dir):
|
||||
if d:
|
||||
shutil.rmtree(d, ignore_errors=True)
|
||||
return
|
||||
|
||||
if not self._persistent:
|
||||
# Decide what to actually do. Three cases:
|
||||
#
|
||||
# force_remove=True → stop + rm (explicit teardown)
|
||||
# persist_across_processes=True → no-op (leave container running)
|
||||
# persist_across_processes=False → stop + rm (per-process isolation)
|
||||
#
|
||||
# The persist-mode no-op is the issue-#20561 contract: the container
|
||||
# outlives Hermes processes, processes inside it stay alive, and
|
||||
# reuse on next startup is instant.
|
||||
if force_remove:
|
||||
should_stop = True
|
||||
should_remove = True
|
||||
elif self._persist_across_processes:
|
||||
# No-op for the container. Drop the in-process handle so a fresh
|
||||
# __init__ will re-probe via labels (and find the running
|
||||
# container) instead of trying to reuse a stale Python reference.
|
||||
self._container_id = None
|
||||
return
|
||||
else:
|
||||
should_stop = True
|
||||
should_remove = True
|
||||
|
||||
# Capture state needed by the worker before we null out the attrs —
|
||||
# the worker thread can outlive ``self``.
|
||||
docker_exe = self._docker_exe
|
||||
log_id = container_id[:12]
|
||||
|
||||
def _do_cleanup() -> None:
|
||||
if should_stop:
|
||||
try:
|
||||
subprocess.run(
|
||||
[docker_exe, "stop", "-t", "10", container_id],
|
||||
capture_output=True, timeout=30,
|
||||
)
|
||||
except (subprocess.TimeoutExpired, OSError) as e:
|
||||
logger.warning("docker stop %s timed out / failed: %s", log_id, e)
|
||||
if should_remove:
|
||||
try:
|
||||
subprocess.run(
|
||||
[docker_exe, "rm", "-f", container_id],
|
||||
capture_output=True, timeout=30,
|
||||
)
|
||||
except (subprocess.TimeoutExpired, OSError) as e:
|
||||
logger.warning("docker rm -f %s failed: %s", log_id, e)
|
||||
|
||||
# Daemon thread: doesn't block interpreter exit (atexit returns
|
||||
# promptly), but unlike the old ``Popen(... &)`` shell trick the
|
||||
# Python-level join semantics let the thread actually run to
|
||||
# completion if the interpreter is still alive. atexit registers
|
||||
# ``_atexit_cleanup`` in terminal_tool.py which waits up to 15s per env for
|
||||
# outstanding cleanups, so most exits complete the work cleanly.
|
||||
import threading
|
||||
t = threading.Thread(target=_do_cleanup, daemon=True, name=f"hermes-cleanup-{log_id}")
|
||||
t.start()
|
||||
self._cleanup_thread = t
|
||||
self._container_id = None
|
||||
|
||||
# Bind-mount dir teardown only runs when we actually removed the
|
||||
# container (the dirs are the container's filesystem state; keeping
|
||||
# them around with no container would orphan the data on disk).
|
||||
if should_remove and not self._persistent:
|
||||
for d in (self._workspace_dir, self._home_dir):
|
||||
if d:
|
||||
shutil.rmtree(d, ignore_errors=True)
|
||||
|
||||
    def wait_for_cleanup(self, timeout: float = 30.0) -> bool:
        """Block up to *timeout* seconds for the cleanup worker thread.

        Returns ``True`` if the thread finished (or no thread was started),
        ``False`` on timeout. The atexit hook in terminal_tool.py calls this
        on every active environment so docker stop/rm actually completes
        before the Python process exits — without this, ``hermes /quit``
        races the interpreter shutdown and leaves stopped containers behind.
        """
        thread = getattr(self, "_cleanup_thread", None)
        if thread is None or not thread.is_alive():
            return True
        thread.join(timeout=timeout)
        return not thread.is_alive()

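The daemon-thread-plus-bounded-join pattern used by `cleanup()` and `wait_for_cleanup()` can be sketched in isolation. This is a minimal illustration under stated assumptions, not the PR's code: `CleanupHandle` is a hypothetical class, and `time.sleep` stands in for the bounded `docker stop` / `docker rm -f` calls.

```python
import threading
import time
from typing import Optional


class CleanupHandle:
    """Hypothetical stand-in for an environment with threaded cleanup."""

    def __init__(self) -> None:
        self._cleanup_thread: Optional[threading.Thread] = None

    def cleanup(self, work_seconds: float) -> None:
        # Daemon thread: it never blocks interpreter exit on its own.
        # time.sleep is a placeholder for the real teardown work.
        t = threading.Thread(target=time.sleep, args=(work_seconds,), daemon=True)
        t.start()
        self._cleanup_thread = t

    def wait_for_cleanup(self, timeout: float = 30.0) -> bool:
        """Return True if cleanup finished (or never started), False on timeout."""
        thread = self._cleanup_thread
        if thread is None or not thread.is_alive():
            return True
        thread.join(timeout=timeout)
        return not thread.is_alive()
```

An exit hook can then call `wait_for_cleanup` with a short timeout on each handle, which bounds total shutdown delay while still giving the worker a chance to finish.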
@@ -422,6 +422,17 @@ def _resolve_stdio_command(command: str, env: dict) -> tuple[str, dict]:
|
||||
candidates = [
|
||||
os.path.join(hermes_home, "node", "bin", resolved_command),
|
||||
os.path.join(os.path.expanduser("~"), ".local", "bin", resolved_command),
|
||||
# /usr/local/bin is the canonical install location for Node on
|
||||
# Linux from-source builds, the upstream node:bookworm-slim
|
||||
# image (which the Hermes Docker image copies node + npm +
|
||||
# corepack from since #4977), and macOS Homebrew on Intel.
|
||||
# Without this candidate, any MCP server configured with an
|
||||
# env.PATH that omits /usr/local/bin (a common pattern when
|
||||
# users hand-author PATH for sandboxing) fails with ENOENT
|
||||
# at execvp, and a naive symlink workaround into the user's
|
||||
# PATH only fails one layer deeper because npx's shebang
|
||||
# re-execs /usr/bin/env node which needs the same directory.
|
||||
os.path.join(os.sep, "usr", "local", "bin", resolved_command),
|
||||
]
|
||||
for candidate in candidates:
|
||||
if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
|
||||
|
||||
+143
-3
@@ -861,6 +861,78 @@ _creation_locks_lock = threading.Lock() # Protects _creation_locks dict itself
|
||||
_cleanup_thread = None
|
||||
_cleanup_running = False
|
||||
|
||||
# Once-per-process guard for the docker orphan reaper (issue #20561).
# Set when _maybe_reap_docker_orphans first runs; concurrent _create_environment
# calls for parallel subagents won't re-trigger the sweep.
_docker_orphan_reaper_ran = False
_docker_orphan_reaper_lock = threading.Lock()


def _maybe_reap_docker_orphans(container_config: Dict[str, Any]) -> None:
    """Run the docker orphan reaper once per process, if enabled.

    Sweeps long-Exited containers labeled ``hermes-agent=1`` for the current
    profile that match the issue #20561 leak class — containers left behind
    by Hermes processes that exited without firing ``atexit`` (SIGKILL,
    OOM, terminal-window-close). The reaper is conservative by default:
    only Exited containers older than ``2 × lifetime_seconds`` and scoped to
    the current profile.

    Gates:

    * ``terminal.docker_orphan_reaper: false`` disables it entirely (the
      operator opted out — usually because they're running multiple
      Hermes processes in the same profile and don't trust the
      conservative defaults).
    * ``_docker_orphan_reaper_ran`` flag — sweep runs once per Python
      interpreter, not on every subagent / RL-rollout / parallel
      ``terminal()`` call.
    """
    global _docker_orphan_reaper_ran
    if not container_config.get("docker_orphan_reaper", True):
        return
    # Cheap double-checked-locking: read without the lock, take the lock
    # only on first run, recheck inside.
    if _docker_orphan_reaper_ran:
        return
    with _docker_orphan_reaper_lock:
        if _docker_orphan_reaper_ran:
            return
        _docker_orphan_reaper_ran = True

    # 2 × lifetime_seconds gives sibling Hermes processes a generous grace
    # window. Floor at 60s so an operator with TERMINAL_LIFETIME_SECONDS=0
    # doesn't get an instant-reap that races their own setup.
    # ``container_config`` only carries container_* keys, so read
    # lifetime_seconds from the env var the rest of the module uses.
    try:
        lifetime = int(os.getenv("TERMINAL_LIFETIME_SECONDS", "300"))
    except (TypeError, ValueError):
        lifetime = 300
    lifetime = max(60, lifetime)
    max_age = lifetime * 2

    try:
        from tools.environments.docker import (
            reap_orphan_containers, _get_active_profile_name,
        )
    except ImportError:
        return
    try:
        profile = _get_active_profile_name()
        removed = reap_orphan_containers(
            max_age_seconds=max_age, profile_filter=profile,
        )
        if removed:
            logger.info(
                "Docker orphan reaper removed %d stale container(s) for profile %s",
                removed, profile,
            )
    except Exception as e:
        # Never fail the env-creation path because of a janitor problem.
        logger.debug("Docker orphan reaper raised: %s", e)


# Per-task environment overrides registry.
|
||||
# Allows environments (e.g., TerminalBench2Env) to specify a custom Docker/Modal
|
||||
# image for a specific task_id BEFORE the agent loop starts. When the terminal or
|
||||
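The once-per-process gate in `_maybe_reap_docker_orphans` is a double-checked lock around a module-level flag. The following is a minimal sketch of that idea on its own; `run_once`, `_ran`, and `_ran_lock` are hypothetical names chosen for illustration, and the `sweep` callable stands in for the real reaper.

```python
import threading
from typing import Callable

_ran = False
_ran_lock = threading.Lock()


def run_once(sweep: Callable[[], None]) -> bool:
    """Run sweep() at most once per process.

    Returns True only for the single call that actually ran it.
    """
    global _ran
    if _ran:  # fast path: no lock once the flag is set
        return False
    with _ran_lock:
        if _ran:  # recheck: another thread may have won the race
            return False
        _ran = True
    # Run outside the lock so a slow sweep does not serialize callers.
    sweep()
    return True
```

Setting the flag before running the sweep means a sweep that raises is not retried within the same process, which matches the "janitor must never block environment creation" intent described in the function's comments.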
@@ -1024,6 +1096,22 @@ def _get_env_config() -> Dict[str, Any]:
|
||||
"docker_env": _parse_env_var("TERMINAL_DOCKER_ENV", "{}", json.loads, "valid JSON"),
|
||||
"docker_run_as_host_user": os.getenv("TERMINAL_DOCKER_RUN_AS_HOST_USER", "false").lower() in {"true", "1", "yes"},
|
||||
"docker_extra_args": _parse_env_var("TERMINAL_DOCKER_EXTRA_ARGS", "[]", json.loads, "valid JSON"),
|
||||
# Cross-process container reuse (issue #20561). The docs claim
|
||||
# "ONE long-lived container shared across sessions" — this toggle
|
||||
# makes that real by probing for a labeled container at startup and
|
||||
# attaching to it instead of always starting a fresh one. Set to
|
||||
# ``false`` for hard per-process isolation (no reuse, container is
|
||||
# removed on exit).
|
||||
"docker_persist_across_processes": os.getenv(
|
||||
"TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES", "true"
|
||||
).lower() in {"true", "1", "yes"},
|
||||
# Startup orphan reaper for hermes-tagged containers left behind by
|
||||
# crashed / SIGKILL'd previous processes that bypassed atexit.
|
||||
# Conservative: only sweeps Exited containers older than 2× the
|
||||
# idle-reap window AND scoped to the current profile. Issue #20561.
|
||||
"docker_orphan_reaper": os.getenv(
|
||||
"TERMINAL_DOCKER_ORPHAN_REAPER", "true"
|
||||
).lower() in {"true", "1", "yes"},
|
||||
}
|
||||
|
||||
|
||||
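Both new toggles are parsed with the same truthy-string check, and the reaper's age threshold is derived from `TERMINAL_LIFETIME_SECONDS` with a 60-second floor. A small sketch of those two rules follows; `env_flag` and `reaper_max_age` are hypothetical helper names, and the functions take a plain mapping instead of reading `os.environ` so they are easy to test.

```python
from typing import Mapping

_TRUTHY = {"true", "1", "yes"}


def env_flag(environ: Mapping[str, str], name: str, default: str = "true") -> bool:
    """Return True when the variable (or its default) is a truthy string."""
    return environ.get(name, default).lower() in _TRUTHY


def reaper_max_age(environ: Mapping[str, str], default: int = 300, floor: int = 60) -> int:
    """Compute 2 x lifetime, never letting lifetime drop below the floor."""
    try:
        lifetime = int(environ.get("TERMINAL_LIFETIME_SECONDS", str(default)))
    except (TypeError, ValueError):
        lifetime = default  # unparsable value: fall back to the default
    return max(floor, lifetime) * 2
```

Note that under this rule any value outside `true` / `1` / `yes` (case-insensitive), such as `on`, is treated as false, so operators need to use one of those three spellings to keep a feature enabled explicitly.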
@@ -1072,6 +1160,13 @@ def _create_environment(env_type: str, image: str, cwd: str, timeout: int,
|
||||
return _LocalEnvironment(cwd=cwd, timeout=timeout)
|
||||
|
||||
elif env_type == "docker":
|
||||
# One-shot orphan reaper: clean up labeled containers left behind by
|
||||
# prior Hermes processes that hit SIGKILL / OOM / a closed terminal
|
||||
# before the atexit cleanup hook could run. Gated to once per
|
||||
# process so concurrent _create_environment calls (parallel
|
||||
# subagents, RL benchmarks) don't run the reaper N times.
|
||||
# Disable via ``terminal.docker_orphan_reaper: false`` (issue #20561).
|
||||
_maybe_reap_docker_orphans(cc)
|
||||
return _DockerEnvironment(
|
||||
image=image, cwd=cwd, timeout=timeout,
|
||||
cpu=cpu, memory=memory, disk=disk,
|
||||
@@ -1083,6 +1178,7 @@ def _create_environment(env_type: str, image: str, cwd: str, timeout: int,
|
||||
env=docker_env,
|
||||
run_as_host_user=cc.get("docker_run_as_host_user", False),
|
||||
extra_args=docker_extra_args,
|
||||
persist_across_processes=cc.get("docker_persist_across_processes", True),
|
||||
)
|
||||
|
||||
elif env_type == "singularity":
|
||||
@@ -1330,8 +1426,27 @@ def cleanup_all_environments():
|
||||
return cleaned
|
||||
|
||||
|
||||
def cleanup_vm(task_id: str):
|
||||
"""Manually clean up a specific environment by task_id."""
|
||||
def cleanup_vm(task_id: str, *, force_remove: bool = False):
|
||||
"""Manually clean up a specific environment by task_id.
|
||||
|
||||
*force_remove* (default False) is forwarded to backends that accept it
|
||||
— currently only ``DockerEnvironment``. The default of False matches
|
||||
session-lifecycle semantics: this function is called from
|
||||
``AIAgent.close()`` (TUI session close, gateway session teardown) and the
|
||||
per-turn cleanup branch for non-persistent envs, both of which should
|
||||
honor the user's persist-mode preference. Stopping the container here
|
||||
would defeat the "ONE long-lived container shared across sessions"
|
||||
contract — exactly the bug Ben reported when the container was killed
|
||||
on every TUI session close.
|
||||
|
||||
Pass ``force_remove=True`` for actual user-initiated teardown
|
||||
(e.g. ``/reset``-style flows that haven't been wired yet, or future
|
||||
"destroy my sandbox" commands).
|
||||
|
||||
The idle reaper passes the env through ``env.cleanup()`` directly (not
|
||||
via this function), so persist-mode idle envs are similarly no-op'd —
|
||||
only the orphan reaper at next startup reclaims them.
|
||||
"""
|
||||
# Remove from tracking dicts while holding the lock, but defer the
|
||||
# actual (potentially slow) env.cleanup() call to outside the lock
|
||||
# so other tool calls aren't blocked.
|
||||
@@ -1356,7 +1471,14 @@ def cleanup_vm(task_id: str):
|
||||
|
||||
try:
|
||||
if hasattr(env, 'cleanup'):
|
||||
env.cleanup()
|
||||
# Pass force_remove only if the env's cleanup() accepts it
|
||||
# (DockerEnvironment after issue #20561; other backends don't).
|
||||
import inspect
|
||||
sig = inspect.signature(env.cleanup)
|
||||
if "force_remove" in sig.parameters:
|
||||
env.cleanup(force_remove=force_remove)
|
||||
else:
|
||||
env.cleanup()
|
||||
elif hasattr(env, 'stop'):
|
||||
env.stop()
|
||||
elif hasattr(env, 'terminate'):
|
||||
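The `inspect.signature` check in `cleanup_vm` forwards `force_remove` only to backends whose `cleanup()` declares it. A self-contained sketch of that dispatch is below; `DockerLikeEnv`, `PlainEnv`, and `cleanup_env` are hypothetical names used for illustration and are not part of the PR.

```python
import inspect


class DockerLikeEnv:
    """Hypothetical backend whose cleanup() accepts force_remove."""

    def __init__(self) -> None:
        self.calls = []

    def cleanup(self, *, force_remove: bool = False) -> None:
        self.calls.append(("cleanup", force_remove))


class PlainEnv:
    """Hypothetical backend with the older, argument-free cleanup()."""

    def __init__(self) -> None:
        self.calls = []

    def cleanup(self) -> None:
        self.calls.append(("cleanup", None))


def cleanup_env(env, *, force_remove: bool = False) -> None:
    """Call env.cleanup(), passing force_remove only if it is accepted."""
    if hasattr(env, "cleanup"):
        params = inspect.signature(env.cleanup).parameters
        if "force_remove" in params:
            env.cleanup(force_remove=force_remove)
        else:
            env.cleanup()
    elif hasattr(env, "stop"):
        env.stop()
```

Inspecting the signature keeps older backends working unchanged, at the cost of a small reflection call per cleanup; a shared base-class signature would be the alternative design.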
@@ -1378,7 +1500,23 @@ def _atexit_cleanup():
|
||||
if _active_environments:
|
||||
count = len(_active_environments)
|
||||
logger.info("Shutting down %d remaining sandbox(es)...", count)
|
||||
# Snapshot the env objects BEFORE cleanup_all_environments empties
|
||||
# the dict; we need them to wait on docker cleanup threads after the
|
||||
# registry has been cleared.
|
||||
envs_to_wait = list(_active_environments.values())
|
||||
cleanup_all_environments()
|
||||
# Block briefly so docker stop/rm actually completes before the
|
||||
# interpreter exits. Issue #20561 — without this join, the daemon
|
||||
# cleanup threads were getting torn down mid-`docker stop`, leaving
|
||||
# Exited containers piled up on the host.
|
||||
for env in envs_to_wait:
|
||||
wait_fn = getattr(env, "wait_for_cleanup", None)
|
||||
if wait_fn is None:
|
||||
continue
|
||||
try:
|
||||
wait_fn(timeout=15.0)
|
||||
except Exception as e: # never block shutdown on a bad backend
|
||||
logger.debug("wait_for_cleanup raised on exit: %s", e)
|
||||
|
||||
atexit.register(_atexit_cleanup)
|
||||
|
||||
@@ -1746,6 +1884,8 @@ def terminal_tool(
|
||||
"docker_env": config.get("docker_env", {}),
|
||||
"docker_run_as_host_user": config.get("docker_run_as_host_user", False),
|
||||
"docker_extra_args": config.get("docker_extra_args", []),
|
||||
"docker_persist_across_processes": config.get("docker_persist_across_processes", True),
|
||||
"docker_orphan_reaper": config.get("docker_orphan_reaper", True),
|
||||
}
|
||||
|
||||
local_config = None
|
||||
|
||||
@@ -1589,7 +1589,7 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "hermes-agent"
|
||||
version = "0.15.0"
|
||||
version = "0.15.1"
|
||||
source = { editable = "." }
|
||||
dependencies = [
|
||||
{ name = "croniter" },
|
||||
|
||||
Generated
+2786
-6
File diff suppressed because it is too large
+1
-1
@@ -10,7 +10,7 @@
|
||||
"preview": "vite preview"
|
||||
},
|
||||
"dependencies": {
|
||||
"@nous-research/ui": "0.16.0",
|
||||
"@nous-research/ui": "0.18.2",
|
||||
"@observablehq/plot": "^0.6.17",
|
||||
"@react-three/fiber": "^9.6.0",
|
||||
"@tailwindcss/vite": "^4.2.1",
|
||||
|
||||
+2
-2
@@ -50,12 +50,12 @@ import {
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { SelectionSwitcher } from "@nous-research/ui/ui/components/selection-switcher";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { Typography } from "@nous-research/ui/ui/components/typography/index";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { Backdrop } from "@/components/Backdrop";
|
||||
import { SidebarFooter } from "@/components/SidebarFooter";
|
||||
import { SidebarStatusStrip, gatewayLine } from "@/components/SidebarStatusStrip";
|
||||
import { useBelowBreakpoint } from "@/hooks/useBelowBreakpoint";
|
||||
import { useBelowBreakpoint } from "@nous-research/ui/hooks/use-below-breakpoint";
|
||||
import { useSidebarStatus } from "@/hooks/useSidebarStatus";
|
||||
import { AuthWidget } from "@/components/AuthWidget";
|
||||
import { PageHeaderProvider } from "@/contexts/PageHeaderProvider";
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import { Select, SelectOption } from "@nous-research/ui/ui/components/select";
|
||||
import { Switch } from "@nous-research/ui/ui/components/switch";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
|
||||
function FieldHint({ schema, schemaKey }: { schema: Record<string, unknown>; schemaKey: string }) {
|
||||
const keyPath = schemaKey.includes(".") ? schemaKey : "";
|
||||
|
||||
@@ -1,225 +0,0 @@
|
||||
import {
|
||||
type PointerEvent as ReactPointerEvent,
|
||||
type ReactNode,
|
||||
useEffect,
|
||||
useRef,
|
||||
useState,
|
||||
} from "react";
|
||||
import { createPortal } from "react-dom";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { cn, themedBody } from "@/lib/utils";
|
||||
|
||||
const CLOSE_DRAG_MIN_PX = 72;
|
||||
const CLOSE_DRAG_RATIO = 0.18;
|
||||
const SHEET_TRANSITION_MS = 280;
|
||||
|
||||
/**
|
||||
* Mobile-first picker shell: fixed backdrop + bottom sheet, portaled to `body`
|
||||
* so nested overflow/transform in the sidebar cannot clip menus (theme /
|
||||
* language switchers). Open/close uses slide + fade; teardown is delayed until
|
||||
* the exit animation finishes so animations can complete.
|
||||
*
|
||||
* Drag the header/handle downward to dismiss (skipped when reduced motion is on).
|
||||
*/
|
||||
export function BottomPickSheet({
|
||||
backdropDismissLabel = "Dismiss",
|
||||
children,
|
||||
onClose,
|
||||
open,
|
||||
title,
|
||||
}: BottomPickSheetProps) {
|
||||
const [renderPortal, setRenderPortal] = useState(open);
|
||||
const [entered, setEntered] = useState(false);
|
||||
const [dragOffsetPx, setDragOffsetPx] = useState(0);
|
||||
const [dragActive, setDragActive] = useState(false);
|
||||
|
||||
const closeTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
|
||||
const sheetRef = useRef<HTMLDivElement>(null);
|
||||
const dragTrackingRef = useRef(false);
|
||||
const dragStartYRef = useRef(0);
|
||||
const dragOffsetRef = useRef(0);
|
||||
|
||||
const reducedMotion =
|
||||
typeof window !== "undefined" &&
|
||||
window.matchMedia("(prefers-reduced-motion: reduce)").matches;
|
||||
|
||||
const syncDragPx = (next: number) => {
|
||||
dragOffsetRef.current = next;
|
||||
setDragOffsetPx(next);
|
||||
};
|
||||
|
||||
useEffect(() => {
|
||||
if (closeTimerRef.current) {
|
||||
clearTimeout(closeTimerRef.current);
|
||||
closeTimerRef.current = null;
|
||||
}
|
||||
|
||||
const ms = reducedMotion ? 0 : SHEET_TRANSITION_MS;
|
||||
|
||||
let openRafId = 0;
|
||||
let exitRafId = 0;
|
||||
|
||||
if (open) {
|
||||
openRafId = requestAnimationFrame(() => {
|
||||
dragTrackingRef.current = false;
|
||||
dragOffsetRef.current = 0;
|
||||
setDragActive(false);
|
||||
setDragOffsetPx(0);
|
||||
setRenderPortal(true);
|
||||
requestAnimationFrame(() => {
|
||||
requestAnimationFrame(() => setEntered(true));
|
||||
});
|
||||
});
|
||||
} else {
|
||||
exitRafId = requestAnimationFrame(() => {
|
||||
dragTrackingRef.current = false;
|
||||
setDragActive(false);
|
||||
setEntered(false);
|
||||
closeTimerRef.current = window.setTimeout(() => {
|
||||
dragOffsetRef.current = 0;
|
||||
setDragOffsetPx(0);
|
||||
setRenderPortal(false);
|
||||
closeTimerRef.current = null;
|
||||
}, ms);
|
||||
});
|
||||
}
|
||||
|
||||
return () => {
|
||||
cancelAnimationFrame(openRafId);
|
||||
cancelAnimationFrame(exitRafId);
|
||||
if (closeTimerRef.current) {
|
||||
clearTimeout(closeTimerRef.current);
|
||||
closeTimerRef.current = null;
|
||||
}
|
||||
};
|
||||
}, [open, reducedMotion]);
|
||||
|
||||
useEffect(() => {
|
||||
if (!renderPortal) return;
|
||||
const prev = document.body.style.overflow;
|
||||
document.body.style.overflow = "hidden";
|
||||
return () => {
|
||||
document.body.style.overflow = prev;
|
||||
};
|
||||
}, [renderPortal]);
|
||||
|
||||
if (!renderPortal || typeof document === "undefined") return null;
|
||||
|
||||
const durationClass = reducedMotion ? "duration-0" : "duration-[280ms]";
|
||||
|
||||
const draggingVisual = dragActive || dragOffsetPx > 0;
|
||||
|
||||
const onDragPointerDown = (e: ReactPointerEvent<HTMLDivElement>) => {
|
||||
if (reducedMotion || !entered) return;
|
||||
if (e.pointerType === "mouse" && e.button !== 0) return;
|
||||
|
||||
dragTrackingRef.current = true;
|
||||
setDragActive(true);
|
||||
dragStartYRef.current = e.clientY;
|
||||
syncDragPx(0);
|
||||
e.currentTarget.setPointerCapture(e.pointerId);
|
||||
};
|
||||
|
||||
const onDragPointerMove = (e: ReactPointerEvent<HTMLDivElement>) => {
|
||||
if (!dragTrackingRef.current) return;
|
||||
const dy = e.clientY - dragStartYRef.current;
|
||||
const next = Math.max(0, dy);
|
||||
const sheetH = sheetRef.current?.offsetHeight ?? 560;
|
||||
syncDragPx(Math.min(next, sheetH));
|
||||
};
|
||||
|
||||
const endDrag = (e: ReactPointerEvent<HTMLDivElement>) => {
|
||||
if (!dragTrackingRef.current) return;
|
||||
dragTrackingRef.current = false;
|
||||
setDragActive(false);
|
||||
try {
|
||||
e.currentTarget.releasePointerCapture(e.pointerId);
|
||||
} catch {
|
||||
/* already released */
|
||||
}
|
||||
|
||||
const sheetH = sheetRef.current?.offsetHeight ?? 560;
|
||||
const threshold = Math.max(CLOSE_DRAG_MIN_PX, sheetH * CLOSE_DRAG_RATIO);
|
||||
const d = dragOffsetRef.current;
|
||||
|
||||
if (d >= threshold) {
|
||||
onClose();
|
||||
return;
|
||||
}
|
||||
syncDragPx(0);
|
||||
};
|
||||
|
||||
return createPortal(
|
||||
<div className="fixed inset-0 z-[200] flex flex-col justify-end">
|
||||
<button
|
||||
type="button"
|
||||
aria-label={backdropDismissLabel}
|
||||
className={cn(
|
||||
"absolute inset-0 bg-black/55 backdrop-blur-[2px]",
|
||||
"transition-opacity ease-out motion-reduce:transition-none",
|
||||
durationClass,
|
||||
entered ? "opacity-100" : "opacity-0",
|
||||
)}
|
||||
onClick={onClose}
|
||||
/>
|
||||
|
||||
<div
|
||||
aria-label={title}
|
||||
aria-modal="true"
|
||||
ref={sheetRef}
|
||||
className={cn(
|
||||
themedBody,
|
||||
"relative flex max-h-[85dvh] min-h-0 flex-col rounded-t-xl border border-current/20",
|
||||
"bg-background-base/98 pb-[max(1rem,env(safe-area-inset-bottom))]",
|
||||
"shadow-[0_-12px_40px_-8px_rgba(0,0,0,0.55)] backdrop-blur-md",
|
||||
"ease-out motion-reduce:transition-none transform-gpu",
|
||||
draggingVisual ? "transition-none" : cn("transition-transform", durationClass),
|
||||
entered ? "translate-y-0" : "translate-y-full",
|
||||
)}
|
||||
role="dialog"
|
||||
style={
|
||||
entered && dragOffsetPx > 0
|
||||
? { transform: `translateY(${dragOffsetPx}px)` }
|
||||
: undefined
|
||||
}
|
||||
>
|
||||
<div
|
||||
className={cn(
|
||||
"flex shrink-0 flex-col gap-2 border-b border-current/15 px-4 pb-3 pt-2",
|
||||
"touch-none select-none",
|
||||
reducedMotion ? "cursor-default" : "cursor-grab active:cursor-grabbing",
|
||||
)}
|
||||
onPointerCancel={endDrag}
|
||||
onPointerDown={onDragPointerDown}
|
||||
onPointerMove={onDragPointerMove}
|
||||
onPointerUp={endDrag}
|
||||
>
|
||||
<div
|
||||
aria-hidden
|
||||
className="mx-auto h-1 w-10 shrink-0 rounded-full bg-current/20"
|
||||
/>
|
||||
|
||||
<Typography
|
||||
mondwest
|
||||
className="text-display text-xs tracking-[0.12em] text-text-tertiary"
|
||||
>
|
||||
{title}
|
||||
</Typography>
|
||||
</div>
|
||||
|
||||
<div className="min-h-0 flex-1 overflow-y-auto overscroll-contain">
|
||||
{children}
|
||||
</div>
|
||||
</div>
|
||||
</div>,
|
||||
document.body,
|
||||
);
|
||||
}
|
||||
|
||||
interface BottomPickSheetProps {
|
||||
backdropDismissLabel?: string;
|
||||
children: ReactNode;
|
||||
onClose: () => void;
|
||||
open: boolean;
|
||||
title: string;
|
||||
}
|
||||
@@ -25,7 +25,7 @@
|
||||
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Card } from "@/components/ui/card";
|
||||
import { Card } from "@nous-research/ui/ui/components/card";
|
||||
|
||||
import { ModelPickerDialog } from "@/components/ModelPickerDialog";
|
||||
import { ToolCall, type ToolEntry } from "@/components/ToolCall";
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { ConfirmDialog } from "@/components/ui/confirm-dialog";
|
||||
import { ConfirmDialog } from "@nous-research/ui/ui/components/confirm-dialog";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
export function DeleteConfirmDialog({
|
||||
|
||||
@@ -2,9 +2,9 @@ import { useState, useRef, useEffect } from "react";
|
||||
import { createPortal } from "react-dom";
|
||||
import { Check } from "lucide-react";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { BottomPickSheet } from "@/components/BottomPickSheet";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { useBelowBreakpoint } from "@/hooks/useBelowBreakpoint";
|
||||
import { BottomSheet } from "@nous-research/ui/ui/components/bottom-sheet";
|
||||
import { Typography } from "@nous-research/ui/ui/components/typography/index";
|
||||
import { useBelowBreakpoint } from "@nous-research/ui/hooks/use-below-breakpoint";
|
||||
import { useI18n } from "@/i18n/context";
|
||||
import { LOCALE_META } from "@/i18n";
|
||||
import type { Locale } from "@/i18n";
|
||||
@@ -87,7 +87,7 @@ export function LanguageSwitcher({ collapsed = false, dropUp = false }: Language
|
||||
</Button>
|
||||
|
||||
{useMobileSheet && (
|
||||
<BottomPickSheet
|
||||
<BottomSheet
|
||||
backdropDismissLabel={t.common.close}
|
||||
onClose={() => setOpen(false)}
|
||||
open={open}
|
||||
@@ -101,7 +101,7 @@ export function LanguageSwitcher({ collapsed = false, dropUp = false }: Language
|
||||
setOpen={setOpen}
|
||||
/>
|
||||
</div>
|
||||
</BottomPickSheet>
|
||||
</BottomSheet>
|
||||
)}
|
||||
|
||||
{open && !useMobileSheet && (() => {
|
||||
|
||||
@@ -2,8 +2,8 @@ import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Checkbox } from "@nous-research/ui/ui/components/checkbox";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import type { GatewayClient } from "@/lib/gatewayClient";
|
||||
import { Check, Search, X } from "lucide-react";
|
||||
import { useEffect, useMemo, useRef, useState } from "react";
|
||||
|
||||
@@ -1,63 +0,0 @@
|
||||
import { forwardRef, type ElementType, type HTMLAttributes, type ReactNode } from "react";
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
type TypographyProps = HTMLAttributes<HTMLElement> & {
|
||||
as?: ElementType;
|
||||
children?: ReactNode;
|
||||
compressed?: boolean;
|
||||
courier?: boolean;
|
||||
expanded?: boolean;
|
||||
mondwest?: boolean;
|
||||
mono?: boolean;
|
||||
sans?: boolean;
|
||||
variant?: "sm" | "md" | "lg" | "xl";
|
||||
};
|
||||
|
||||
const variantClasses: Record<NonNullable<TypographyProps["variant"]>, string> = {
|
||||
sm: "leading-[1.4] text-[.9375rem] tracking-[0.1875rem]",
|
||||
md: "text-[2.625rem] leading-[1] tracking-[0.0525rem]",
|
||||
lg: "text-[2.625rem] leading-[1] tracking-[0.0525rem]",
|
||||
xl: "text-[4.5rem] leading-[1] tracking-[0.135rem]",
|
||||
};
|
||||
|
||||
export const Typography = forwardRef<HTMLElement, TypographyProps>(function Typography(
|
||||
{
|
||||
as: Component = "span",
|
||||
className,
|
||||
compressed,
|
||||
courier,
|
||||
expanded,
|
||||
mondwest,
|
||||
mono,
|
||||
sans,
|
||||
variant,
|
||||
...props
|
||||
},
|
||||
ref,
|
||||
) {
|
||||
const hasFontVariant = compressed || courier || expanded || mondwest || mono || sans;
|
||||
|
||||
return (
|
||||
<Component
|
||||
className={cn(
|
||||
compressed && "font-compressed",
|
||||
courier && "font-courier",
|
||||
expanded && "font-expanded",
|
||||
mondwest && "font-mondwest tracking-[0.1875rem]",
|
||||
mono && "font-mono",
|
||||
(!hasFontVariant || sans) && "font-sans",
|
||||
variant && variantClasses[variant],
|
||||
className,
|
||||
)}
|
||||
ref={ref}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
});
|
||||
|
||||
export const H2 = forwardRef<HTMLHeadingElement, Omit<TypographyProps, "as">>(function H2(
|
||||
{ className, variant = "lg", ...props },
|
||||
ref,
|
||||
) {
|
||||
return <Typography as="h2" className={cn("font-bold", className)} variant={variant} ref={ref} {...props} />;
|
||||
});
|
||||
@@ -3,9 +3,9 @@ import { ExternalLink, X, Check } from "lucide-react";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { CopyButton } from "@nous-research/ui/ui/components/command-block";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { H2 } from "@/components/NouiTypography";
|
||||
import { H2 } from "@nous-research/ui/ui/components/typography/h2";
|
||||
import { api, type OAuthProvider, type OAuthStartResponse } from "@/lib/api";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { cn, themedBody } from "@/lib/utils";
|
||||
|
||||
|
||||
@@ -16,9 +16,9 @@ import {
|
||||
CardDescription,
|
||||
CardHeader,
|
||||
CardTitle,
|
||||
} from "@/components/ui/card";
|
||||
} from "@nous-research/ui/ui/components/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { ConfirmDialog } from "@/components/ui/confirm-dialog";
|
||||
import { ConfirmDialog } from "@nous-research/ui/ui/components/confirm-dialog";
|
||||
import { OAuthLoginModal } from "@/components/OAuthLoginModal";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@ import { AlertTriangle, Radio, Wifi, WifiOff } from "lucide-react";
|
||||
import type { PlatformStatus } from "@/lib/api";
|
||||
import { isoTimeAgo } from "@/lib/utils";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
export function PlatformsCard({ platforms }: PlatformsCardProps) {
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { Typography } from "@nous-research/ui/ui/components/typography/index";
|
||||
import type { StatusResponse } from "@/lib/api";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
@@ -3,9 +3,9 @@ import { createPortal } from "react-dom";
|
||||
import { Palette, Check } from "lucide-react";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { BottomPickSheet } from "@/components/BottomPickSheet";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { useBelowBreakpoint } from "@/hooks/useBelowBreakpoint";
|
||||
import { BottomSheet } from "@nous-research/ui/ui/components/bottom-sheet";
|
||||
import { Typography } from "@nous-research/ui/ui/components/typography/index";
|
||||
import { useBelowBreakpoint } from "@nous-research/ui/hooks/use-below-breakpoint";
|
||||
import { BUILTIN_THEMES, useTheme } from "@/themes";
|
||||
import type { DashboardTheme, ThemeListEntry } from "@/themes";
|
||||
import { useI18n } from "@/i18n";
|
||||
@@ -91,7 +91,7 @@ export function ThemeSwitcher({ collapsed = false, dropUp = false }: ThemeSwitch
|
||||
</Button>
|
||||
|
||||
{useMobileSheet && (
|
||||
<BottomPickSheet
|
||||
<BottomSheet
|
||||
backdropDismissLabel={t.common.close}
|
||||
onClose={close}
|
||||
open={open}
|
||||
@@ -105,7 +105,7 @@ export function ThemeSwitcher({ collapsed = false, dropUp = false }: ThemeSwitch
|
||||
themeName={themeName}
|
||||
/>
|
||||
</div>
|
||||
</BottomPickSheet>
|
||||
</BottomSheet>
|
||||
)}
|
||||
|
||||
{open && !useMobileSheet && (() => {
|
||||
|
||||
@@ -1,40 +0,0 @@
|
||||
import { useEffect, useState } from "react";
|
||||
import { createPortal } from "react-dom";
|
||||
|
||||
export function Toast({ toast }: { toast: { message: string; type: "success" | "error" } | null }) {
|
||||
const [visible, setVisible] = useState(false);
|
||||
const [current, setCurrent] = useState(toast);
|
||||
|
||||
useEffect(() => {
|
||||
if (toast) {
|
||||
setCurrent(toast);
|
||||
setVisible(true);
|
||||
} else {
|
||||
setVisible(false);
|
||||
const timer = setTimeout(() => setCurrent(null), 200);
|
||||
return () => clearTimeout(timer);
|
||||
}
|
||||
}, [toast]);
|
||||
|
||||
if (!current) return null;
|
||||
|
||||
// Portal to document.body so the toast escapes any ancestor stacking context
|
||||
// (e.g. <main> has `relative z-2`, which would trap z-50 below the header's z-40).
|
||||
return createPortal(
|
||||
<div
|
||||
role="status"
|
||||
aria-live="polite"
|
||||
className={`fixed top-16 right-4 z-50 border px-4 py-2.5 font-courier text-xs tracking-wider uppercase backdrop-blur-sm ${
|
||||
current.type === "success"
|
||||
? "bg-success/15 text-success border-success/30"
|
||||
: "bg-destructive/15 text-destructive border-destructive/30"
|
||||
}`}
|
||||
style={{
|
||||
animation: visible ? "toast-in 200ms ease-out forwards" : "toast-out 200ms ease-in forwards",
|
||||
}}
|
||||
>
|
||||
{current.message}
|
||||
</div>,
|
||||
document.body,
|
||||
);
|
||||
}
|
||||
@@ -1,63 +0,0 @@
|
||||
import { cn, themedBody } from "@/lib/utils";
|
||||
|
||||
/**
|
||||
* Themed card primitive. Themes can restyle every card without touching
|
||||
* call sites by setting CSS vars under the `card` component-style bucket:
|
||||
*
|
||||
* componentStyles:
|
||||
* card:
|
||||
* clipPath: "polygon(10px 0, 100% 0, 100% calc(100% - 10px), calc(100% - 10px) 100%, 0 100%, 0 10px)"
|
||||
* border: "1px solid var(--color-ring)"
|
||||
* background: "linear-gradient(180deg, var(--color-card) 0%, transparent 100%)"
|
||||
* boxShadow: "0 0 0 1px var(--color-ring) inset, 0 0 24px -8px var(--warm-glow)"
|
||||
*
|
||||
* All properties are optional — vars that aren't set compute to their
|
||||
* CSS initial value, so the default shadcn-y card keeps looking normal
|
||||
* for themes that don't override anything.
|
||||
*/
|
||||
const CARD_STYLE: React.CSSProperties = {
|
||||
clipPath: "var(--component-card-clip-path)",
|
||||
borderImage: "var(--component-card-border-image)",
|
||||
background: "var(--component-card-background)",
|
||||
boxShadow: "var(--component-card-box-shadow)",
|
||||
};
|
||||
|
||||
export function Card({ className, style, ...props }: React.HTMLAttributes<HTMLDivElement>) {
|
||||
return (
|
||||
<div
|
||||
className={cn(
|
||||
"border border-border bg-card/80 text-card-foreground w-full",
|
||||
themedBody,
|
||||
className,
|
||||
)}
|
||||
style={{ ...CARD_STYLE, ...style }}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
}
|
||||
|
||||
export function CardHeader({ className, ...props }: React.HTMLAttributes<HTMLDivElement>) {
|
||||
return <div className={cn("flex flex-col gap-1.5 p-4 border-b border-border", className)} {...props} />;
|
||||
}
|
||||
|
||||
export function CardTitle({ className, ...props }: React.HTMLAttributes<HTMLHeadingElement>) {
|
||||
return (
|
||||
<h3
|
||||
className={cn(
|
||||
"font-mondwest text-display text-sm tracking-[0.12em] text-text-primary",
|
||||
className,
|
||||
)}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
}
|
||||
|
||||
export function CardDescription({ className, ...props }: React.HTMLAttributes<HTMLParagraphElement>) {
|
||||
return (
|
||||
<p className={cn("font-mondwest normal-case text-xs text-muted-foreground", className)} {...props} />
|
||||
);
|
||||
}
|
||||
|
||||
export function CardContent({ className, ...props }: React.HTMLAttributes<HTMLDivElement>) {
|
||||
return <div className={cn("p-4", className)} {...props} />;
|
||||
}
|
||||
@@ -1,137 +0,0 @@
|
||||
import { useEffect, useRef } from "react";
|
||||
import { createPortal } from "react-dom";
|
||||
import { AlertTriangle } from "lucide-react";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { cn, themedBody } from "@/lib/utils";
|
||||
|
||||
export function ConfirmDialog({
|
||||
cancelLabel = "Cancel",
|
||||
confirmLabel = "Confirm",
|
||||
description,
|
||||
destructive = false,
|
||||
loading = false,
|
||||
onCancel,
|
||||
onConfirm,
|
||||
open,
|
||||
title,
|
||||
}: ConfirmDialogProps) {
|
||||
const dialogRef = useRef<HTMLDivElement>(null);
|
||||
|
||||
// Focus the confirm button when opened; trap ESC to cancel.
|
||||
useEffect(() => {
|
||||
if (!open) return;
|
||||
|
||||
const prevActive = document.activeElement as HTMLElement | null;
|
||||
dialogRef.current
|
||||
?.querySelector<HTMLButtonElement>("[data-confirm]")
|
||||
?.focus();
|
||||
|
||||
const onKey = (e: KeyboardEvent) => {
|
||||
if (e.key === "Escape") {
|
||||
e.preventDefault();
|
||||
onCancel();
|
||||
}
|
||||
};
|
||||
|
||||
document.addEventListener("keydown", onKey);
|
||||
const prevOverflow = document.body.style.overflow;
|
||||
document.body.style.overflow = "hidden";
|
||||
|
||||
return () => {
|
||||
document.removeEventListener("keydown", onKey);
|
||||
document.body.style.overflow = prevOverflow;
|
||||
prevActive?.focus?.();
|
||||
};
|
||||
}, [open, onCancel]);
|
||||
|
||||
if (!open) return null;
|
||||
|
||||
return createPortal(
|
||||
<div
|
||||
role="dialog"
|
||||
aria-modal="true"
|
||||
aria-labelledby="confirm-dialog-title"
|
||||
aria-describedby={description ? "confirm-dialog-desc" : undefined}
|
||||
onClick={(e) => {
|
||||
if (e.target === e.currentTarget) onCancel();
|
||||
}}
|
||||
className={cn(
|
||||
"fixed inset-0 z-50 flex items-center justify-center",
|
||||
"bg-black/60 backdrop-blur-sm",
|
||||
"animate-[fade-in_150ms_ease-out]",
|
||||
)}
|
||||
>
|
||||
<div
|
||||
ref={dialogRef}
|
||||
className={cn(
|
||||
themedBody,
|
||||
"relative w-full max-w-md mx-4",
|
||||
"border border-border bg-card shadow-lg",
|
||||
"animate-[dialog-in_180ms_ease-out]",
|
||||
)}
|
||||
>
|
||||
<div className="flex items-start gap-3 p-4 border-b border-border">
|
||||
{destructive && (
|
||||
<div
|
||||
aria-hidden
|
||||
className="mt-0.5 shrink-0 text-destructive"
|
||||
>
|
||||
<AlertTriangle className="h-4 w-4" />
|
||||
</div>
|
||||
)}
|
||||
|
||||
<div className="flex-1 min-w-0 flex flex-col gap-1">
|
||||
<h2
|
||||
id="confirm-dialog-title"
|
||||
className="font-mondwest text-display text-sm font-bold tracking-[0.12em] blend-lighter"
|
||||
>
|
||||
{title}
|
||||
</h2>
|
||||
|
||||
{description && (
|
||||
<p
|
||||
id="confirm-dialog-desc"
|
||||
className="font-mondwest normal-case text-xs text-muted-foreground leading-relaxed"
|
||||
>
|
||||
{description}
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="flex items-center justify-end gap-2 p-3">
|
||||
<Button
|
||||
type="button"
|
||||
outlined
|
||||
onClick={onCancel}
|
||||
disabled={loading}
|
||||
>
|
||||
{cancelLabel}
|
||||
</Button>
|
||||
<Button
|
||||
data-confirm
|
||||
type="button"
|
||||
destructive={destructive}
|
||||
onClick={onConfirm}
|
||||
disabled={loading}
|
||||
>
|
||||
{loading ? "…" : confirmLabel}
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
</div>,
|
||||
document.body,
|
||||
);
|
||||
}
|
||||
|
||||
interface ConfirmDialogProps {
|
||||
cancelLabel?: string;
|
||||
confirmLabel?: string;
|
||||
description?: string;
|
||||
destructive?: boolean;
|
||||
loading?: boolean;
|
||||
onCancel: () => void;
|
||||
onConfirm: () => void;
|
||||
open: boolean;
|
||||
title: string;
|
||||
}
|
||||
@@ -1,16 +0,0 @@
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
export function Input({ className, ...props }: React.InputHTMLAttributes<HTMLInputElement>) {
|
||||
return (
|
||||
<input
|
||||
className={cn(
|
||||
"flex h-9 w-full border border-border bg-background/40 px-3 py-1 font-courier text-sm transition-colors",
|
||||
"placeholder:text-muted-foreground",
|
||||
"focus-visible:outline-none focus-visible:ring-1 focus-visible:ring-foreground/30 focus-visible:border-foreground/25",
|
||||
"disabled:cursor-not-allowed disabled:opacity-50",
|
||||
className,
|
||||
)}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
}
|
||||
@@ -1,13 +0,0 @@
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
export function Label({ className, ...props }: React.LabelHTMLAttributes<HTMLLabelElement>) {
|
||||
return (
|
||||
<label
|
||||
className={cn(
|
||||
"font-mondwest text-xs tracking-[0.1em] uppercase leading-none peer-disabled:cursor-not-allowed peer-disabled:opacity-70",
|
||||
className,
|
||||
)}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
}
|
||||
@@ -1,19 +0,0 @@
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
export function Separator({
|
||||
className,
|
||||
orientation = "horizontal",
|
||||
...props
|
||||
}: React.HTMLAttributes<HTMLDivElement> & { orientation?: "horizontal" | "vertical" }) {
|
||||
return (
|
||||
<div
|
||||
role="separator"
|
||||
className={cn(
|
||||
"shrink-0 bg-border",
|
||||
orientation === "horizontal" ? "h-px w-full" : "h-full w-px",
|
||||
className,
|
||||
)}
|
||||
{...props}
|
||||
/>
|
||||
);
|
||||
}
|
||||
@@ -1,7 +1,7 @@
|
||||
import { useCallback, useEffect, useState } from "react";
|
||||
import { api } from "@/lib/api";
|
||||
import type { ActionStatusResponse } from "@/lib/api";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { useI18n } from "@/i18n";
|
||||
import {
|
||||
SystemActionsContext,
|
||||
|
||||
@@ -1,19 +0,0 @@
-import { useEffect, useState } from "react";
-
-/** True when viewport width is strictly below `px` (matches Tailwind `min-width: px`). */
-export function useBelowBreakpoint(px: number) {
-  const query = `(max-width: ${px - 1}px)`;
-  const [matches, setMatches] = useState(() =>
-    typeof window !== "undefined" ? window.matchMedia(query).matches : false,
-  );
-
-  useEffect(() => {
-    const mql = window.matchMedia(query);
-    const sync = () => setMatches(mql.matches);
-    sync();
-    mql.addEventListener("change", sync);
-    return () => mql.removeEventListener("change", sync);
-  }, [query]);
-
-  return matches;
-}
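The query that the removed `useBelowBreakpoint` hook hands to `matchMedia` can be checked without a browser. The sketch below is illustrative only: `belowBreakpointQuery` and `isBelowBreakpoint` are hypothetical names that do not come from this diff or from `@nous-research/ui`, and the width check assumes whole-pixel viewport widths.

```typescript
// Hypothetical helper: rebuilds the media query string used by the hook.
function belowBreakpointQuery(px: number): string {
  return `(max-width: ${px - 1}px)`;
}

// Hypothetical helper: the same condition as a plain comparison.
// Assumes integer viewport widths; fractional widths are not modelled here.
function isBelowBreakpoint(viewportWidth: number, px: number): boolean {
  return viewportWidth <= px - 1;
}

const query768 = belowBreakpointQuery(768);
const narrow = isBelowBreakpoint(767, 768);
const atBreakpoint = isBelowBreakpoint(768, 768);
```

With `px = 768` the condition holds up to 767px, which is exactly the range where a `min-width: 768px` rule does not yet apply.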
@@ -1,41 +0,0 @@
-import { useCallback, useState } from "react";
-
-export function useConfirmDelete<TId>({
-  onDelete,
-}: {
-  onDelete: (id: TId) => Promise<void>;
-}) {
-  const [pendingId, setPendingId] = useState<TId | null>(null);
-  const [isDeleting, setIsDeleting] = useState(false);
-
-  const requestDelete = useCallback((id: TId) => {
-    setPendingId(id);
-  }, []);
-
-  const cancel = useCallback(() => {
-    if (!isDeleting) setPendingId(null);
-  }, [isDeleting]);
-
-  const confirm = useCallback(async () => {
-    if (pendingId === null) return;
-    const id = pendingId;
-    setIsDeleting(true);
-    try {
-      await onDelete(id);
-      setPendingId(null);
-    } catch {
-      // Dialog stays open; caller can surface errors in onDelete before rethrowing
-    } finally {
-      setIsDeleting(false);
-    }
-  }, [pendingId, onDelete]);
-
-  return {
-    cancel,
-    confirm,
-    isDeleting,
-    isOpen: pendingId !== null,
-    pendingId,
-    requestDelete,
-  } as const;
-}
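The state transitions of the removed `useConfirmDelete` hook are easier to follow outside React. The class below is a framework-free sketch of the same flow, not the implementation shipped in `@nous-research/ui`; `ConfirmDeleteFlow` and `confirmDelete` are hypothetical names chosen for illustration.

```typescript
// Hypothetical, framework-free model of the request -> confirm/cancel flow.
class ConfirmDeleteFlow<TId> {
  pendingId: TId | null = null;
  isDeleting = false;
  private onDelete: (id: TId) => Promise<void>;

  constructor(onDelete: (id: TId) => Promise<void>) {
    this.onDelete = onDelete;
  }

  // The dialog is open whenever an id is waiting for confirmation.
  get isOpen(): boolean {
    return this.pendingId !== null;
  }

  requestDelete(id: TId): void {
    this.pendingId = id;
  }

  // Cancelling is ignored while a delete is in flight.
  cancel(): void {
    if (!this.isDeleting) this.pendingId = null;
  }

  async confirmDelete(): Promise<void> {
    if (this.pendingId === null) return;
    const id = this.pendingId;
    this.isDeleting = true;
    try {
      await this.onDelete(id);
      this.pendingId = null; // success closes the dialog
    } catch {
      // failure leaves pendingId set, so the dialog stays open
    } finally {
      this.isDeleting = false;
    }
  }
}
```

A failed `onDelete` leaves the dialog open with `isDeleting` reset, so the user can retry or cancel.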
@@ -1,15 +0,0 @@
-import { useCallback, useState } from "react";
-
-export function useToast(duration = 3000) {
-  const [toast, setToast] = useState<{ message: string; type: "success" | "error" } | null>(null);
-
-  const showToast = useCallback(
-    (message: string, type: "success" | "error") => {
-      setToast({ message, type });
-      setTimeout(() => setToast(null), duration);
-    },
-    [duration],
-  );
-
-  return { toast, showToast };
-}
+27
-3
@@ -41,7 +41,11 @@ function setSessionHeader(headers: Headers, token: string): void {
   }
 }
 
-export async function fetchJSON<T>(url: string, init?: RequestInit): Promise<T> {
+export async function fetchJSON<T>(
+  url: string,
+  init?: RequestInit,
+  options?: FetchJSONOptions,
+): Promise<T> {
   // Inject the session token into all /api/ requests.
   const headers = new Headers(init?.headers);
   const token = window.__HERMES_SESSION_TOKEN__;
@@ -100,7 +104,7 @@ export async function fetchJSON<T>(url: string, init?: RequestInit): Promise<T>
 // that reload once on the first stale-token 401 — gated mode is
 // handled above, so reaching here in gated mode means a real
 // middleware failure that should not reload-loop.
-if (!window.__HERMES_AUTH_REQUIRED__) {
+if (!window.__HERMES_AUTH_REQUIRED__ && !options?.allowUnauthorized) {
   let alreadyReloaded = false;
   try {
     alreadyReloaded =
@@ -198,8 +202,19 @@ export const api = {
    * still exists but is never useful there (no Session, no cookie). The
    * AuthWidget component swallows 401s from this call: if the gate isn't
    * engaged, /api/auth/me returns 401 and the widget renders nothing.
+   *
+   * ``allowUnauthorized`` is load-bearing: in loopback mode this endpoint
+   * 401s by design, and fetchJSON's default loopback behaviour treats a
+   * 401 as a rotated session token and full-page-reloads to pick up a
+   * fresh one. Because every *other* dashboard request succeeds (and so
+   * clears the one-shot reload guard), that turns this expected 401 into
+   * an infinite reload loop. Opting out keeps the 401 a plain throw the
+   * widget can catch.
    */
-  getAuthMe: () => fetchJSON<AuthMeResponse>("/api/auth/me"),
+  getAuthMe: () =>
+    fetchJSON<AuthMeResponse>("/api/auth/me", undefined, {
+      allowUnauthorized: true,
+    }),
   logout: () =>
     fetch(`${BASE}/auth/logout`, {
       method: "POST",
@@ -514,6 +529,15 @@ export interface ActionResponse {
   pid: number;
 }
 
+/** Per-call overrides for {@link fetchJSON}. */
+interface FetchJSONOptions {
+  /** When true, a 401 response is surfaced as a normal thrown error rather
+   * than triggering the loopback stale-token page reload. Use for probes
+   * whose 401 is an expected signal (e.g. /api/auth/me in non-gated mode)
+   * rather than evidence of a rotated session token. */
+  allowUnauthorized?: boolean;
+}
+
 export interface ActionStatusResponse {
   exit_code: number | null;
   lines: string[];
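The effect of `allowUnauthorized` on the 401 path can be summarised as a small decision function. This is a sketch under stated assumptions rather than the code in `fetchJSON`: the diff elides how the one-shot reload guard is stored, so `alreadyReloaded` is passed in as a plain boolean, and `ReloadDecisionInput` and `shouldReloadForStaleToken` are hypothetical names.

```typescript
// Hypothetical input shape; the real code reads these from window globals
// and from the per-call options object.
interface ReloadDecisionInput {
  status: number;
  authRequired: boolean; // stands in for window.__HERMES_AUTH_REQUIRED__
  allowUnauthorized?: boolean; // mirrors FetchJSONOptions.allowUnauthorized
  alreadyReloaded: boolean; // assumed form of the one-shot reload guard
}

// Returns true only when a 401 should be treated as a rotated session token.
function shouldReloadForStaleToken(input: ReloadDecisionInput): boolean {
  if (input.status !== 401) return false; // only 401s are candidates
  if (input.authRequired) return false; // gated mode never reloads here
  if (input.allowUnauthorized) return false; // caller expects this 401
  return !input.alreadyReloaded; // reload at most once
}

const loopbackDefault = shouldReloadForStaleToken({
  status: 401,
  authRequired: false,
  alreadyReloaded: false,
});
const authMeProbe = shouldReloadForStaleToken({
  status: 401,
  authRequired: false,
  allowUnauthorized: true,
  alreadyReloaded: false,
});
```

Under these assumptions the `/api/auth/me` probe never requests a reload, so its expected 401 reaches the widget as an ordinary thrown error.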
@@ -20,7 +20,7 @@ import { timeAgo } from "@/lib/utils";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Stats } from "@nous-research/ui/ui/components/stats";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
@@ -23,7 +23,7 @@ import { WebglAddon } from "@xterm/addon-webgl";
|
||||
import { Terminal } from "@xterm/xterm";
|
||||
import "@xterm/xterm/css/xterm.css";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Typography } from "@/components/NouiTypography";
|
||||
import { Typography } from "@nous-research/ui/ui/components/typography/index";
|
||||
import { HERMES_BASE_PATH, buildWsAuthParam } from "@/lib/api";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { Copy, PanelRight, X } from "lucide-react";
|
||||
|
||||
@@ -38,15 +38,15 @@ import {
|
||||
} from "lucide-react";
|
||||
import { api } from "@/lib/api";
|
||||
import { getNestedValue, setNestedValue } from "@/lib/nested";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { AutoField } from "@/components/AutoField";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { ConfirmDialog } from "@/components/ui/confirm-dialog";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { ConfirmDialog } from "@nous-research/ui/ui/components/confirm-dialog";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
|
||||
@@ -4,17 +4,17 @@ import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Select, SelectOption } from "@nous-research/ui/ui/components/select";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { H2 } from "@/components/NouiTypography";
|
||||
import { H2 } from "@nous-research/ui/ui/components/typography/h2";
|
||||
import { api } from "@/lib/api";
|
||||
import type { CronJob, ProfileInfo } from "@/lib/api";
|
||||
import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { useConfirmDelete } from "@/hooks/useConfirmDelete";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { useConfirmDelete } from "@nous-research/ui/hooks/use-confirm-delete";
|
||||
import { useModalBehavior } from "@/hooks/useModalBehavior";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { Card, CardContent } from "@nous-research/ui/ui/components/card";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -17,9 +17,9 @@ import {
|
||||
import { api } from "@/lib/api";
|
||||
import type { EnvVarInfo } from "@/lib/api";
|
||||
import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { useConfirmDelete } from "@/hooks/useConfirmDelete";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { useConfirmDelete } from "@nous-research/ui/hooks/use-confirm-delete";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { OAuthProvidersCard } from "@/components/OAuthProvidersCard";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
@@ -30,10 +30,10 @@ import {
|
||||
CardDescription,
|
||||
CardHeader,
|
||||
CardTitle,
|
||||
} from "@/components/ui/card";
|
||||
} from "@nous-research/ui/ui/components/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -12,8 +12,8 @@ import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { FilterGroup, Segmented } from "@nous-research/ui/ui/components/segmented";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Switch } from "@nous-research/ui/ui/components/switch";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -24,9 +24,9 @@ import { formatTokenCount } from "@/lib/format";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Stats } from "@nous-research/ui/ui/components/stats";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { ConfirmDialog } from "@/components/ui/confirm-dialog";
|
||||
import { ConfirmDialog } from "@nous-research/ui/ui/components/confirm-dialog";
|
||||
import { useModalBehavior } from "@/hooks/useModalBehavior";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { useI18n } from "@/i18n";
|
||||
|
||||
@@ -10,12 +10,12 @@ import { Select, SelectOption } from "@nous-research/ui/ui/components/select";
|
||||
import { Switch } from "@nous-research/ui/ui/components/switch";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { CommandBlock } from "@nous-research/ui/ui/components/command-block";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { ConfirmDialog } from "@/components/ui/confirm-dialog";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { ConfirmDialog } from "@nous-research/ui/ui/components/confirm-dialog";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
@@ -14,19 +14,19 @@ import {
|
||||
X,
|
||||
} from "lucide-react";
|
||||
import spinners from "unicode-animations";
|
||||
import { H2 } from "@/components/NouiTypography";
|
||||
import { H2 } from "@nous-research/ui/ui/components/typography/h2";
|
||||
import { api } from "@/lib/api";
|
||||
import type { ProfileInfo } from "@/lib/api";
|
||||
import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { useConfirmDelete } from "@/hooks/useConfirmDelete";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { useConfirmDelete } from "@nous-research/ui/hooks/use-confirm-delete";
|
||||
import { useModalBehavior } from "@/hooks/useModalBehavior";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { Card, CardContent } from "@nous-research/ui/ui/components/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import { Checkbox } from "@nous-research/ui/ui/components/checkbox";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
|
||||
@@ -34,18 +34,18 @@ import type {
|
||||
import { timeAgo } from "@/lib/utils";
|
||||
import { Markdown } from "@/components/Markdown";
|
||||
import { PlatformsCard } from "@/components/PlatformsCard";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Segmented } from "@nous-research/ui/ui/components/segmented";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { DeleteConfirmDialog } from "@/components/DeleteConfirmDialog";
|
||||
import { useConfirmDelete } from "@/hooks/useConfirmDelete";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { useConfirmDelete } from "@nous-research/ui/hooks/use-confirm-delete";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { useSystemActions } from "@/contexts/useSystemActions";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -17,16 +17,16 @@ import {
|
||||
} from "lucide-react";
|
||||
import { api } from "@/lib/api";
|
||||
import type { SkillInfo, ToolsetInfo } from "@/lib/api";
|
||||
import { useToast } from "@/hooks/useToast";
|
||||
import { Toast } from "@/components/Toast";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
|
||||
import { useToast } from "@nous-research/ui/hooks/use-toast";
|
||||
import { Toast } from "@nous-research/ui/ui/components/toast";
|
||||
import { Card, CardContent, CardHeader, CardTitle } from "@nous-research/ui/ui/components/card";
|
||||
import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { ListItem } from "@nous-research/ui/ui/components/list-item";
|
||||
import { Spinner } from "@nous-research/ui/ui/components/spinner";
|
||||
import { Switch } from "@nous-research/ui/ui/components/switch";
|
||||
import { cn } from "@/lib/utils";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { usePageHeader } from "@/contexts/usePageHeader";
|
||||
import { PluginSlot } from "@/plugins";
|
||||
|
||||
@@ -23,10 +23,10 @@ import { Badge } from "@nous-research/ui/ui/components/badge";
|
||||
import { Button } from "@nous-research/ui/ui/components/button";
|
||||
import { Checkbox } from "@nous-research/ui/ui/components/checkbox";
|
||||
import { Select, SelectOption } from "@nous-research/ui/ui/components/select";
|
||||
import { Card, CardHeader, CardTitle, CardContent } from "@/components/ui/card";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Separator } from "@/components/ui/separator";
|
||||
import { Card, CardHeader, CardTitle, CardContent } from "@nous-research/ui/ui/components/card";
|
||||
import { Input } from "@nous-research/ui/ui/components/input";
|
||||
import { Label } from "@nous-research/ui/ui/components/label";
|
||||
import { Separator } from "@nous-research/ui/ui/components/separator";
|
||||
import { Tabs, TabsList, TabsTrigger } from "@nous-research/ui/ui/components/tabs";
|
||||
import { useI18n } from "@/i18n";
|
||||
import { registerSlot, PluginSlot } from "./slots";
|
||||
|
||||
@@ -154,10 +154,10 @@ hooks:
|
||||
**`sync_turn()` MUST be non-blocking.** If your backend has latency (API calls, LLM processing), run the work in a daemon thread:
|
||||
|
||||
```python
|
||||
def sync_turn(self, user_content, assistant_content):
|
||||
def sync_turn(self, user_content, assistant_content, *, session_id="", messages=None):
|
||||
def _sync():
|
||||
try:
|
||||
self._api.ingest(user_content, assistant_content)
|
||||
self._api.ingest(user_content, assistant_content, session_id=session_id, messages=messages)
|
||||
except Exception as e:
|
||||
logger.warning("Sync failed: %s", e)
|
||||
|
||||
@@ -167,6 +167,16 @@ def sync_turn(self, user_content, assistant_content):
|
||||
self._sync_thread.start()
|
||||
```
|
||||
|
||||
`messages` is optional OpenAI-style conversation context as of the completed
|
||||
turn. When present, it includes user/assistant messages, assistant tool calls,
|
||||
and tool result messages. Providers that do not need raw turn context can omit
|
||||
the `messages` parameter; Hermes will continue calling them with the legacy
|
||||
signature.
|
||||
|
||||
Cloud providers should document what parts of `messages` are sent off-device.
|
||||
Tool calls and tool results may contain file paths, command output, or other
|
||||
workspace data.
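
A minimal sketch of a provider that keeps `sync_turn()` non-blocking and withholds tool-result bodies before anything leaves the device. The `ExampleProvider` class, the `redact_tool_messages` helper, and the shape of the `api.ingest()` client are hypothetical and only illustrate the pattern described above; the sketch covers tool results only, not assistant tool-call arguments.

```python
import logging
import threading

logger = logging.getLogger(__name__)


def redact_tool_messages(messages):
    """Return a copy of `messages` with tool-result bodies replaced by a placeholder."""
    redacted = []
    for msg in messages or []:
        if msg.get("role") == "tool":
            # Tool results may contain file paths or command output.
            redacted.append({**msg, "content": "[tool output withheld]"})
        else:
            redacted.append(msg)
    return redacted


class ExampleProvider:
    """Hypothetical provider; `api` is any client exposing ingest()."""

    def __init__(self, api):
        self._api = api
        self._sync_thread = None

    def sync_turn(self, user_content, assistant_content, *, session_id="", messages=None):
        safe_messages = redact_tool_messages(messages)

        def _sync():
            try:
                self._api.ingest(
                    user_content,
                    assistant_content,
                    session_id=session_id,
                    messages=safe_messages,
                )
            except Exception as e:
                logger.warning("Sync failed: %s", e)

        # Daemon thread: the caller returns immediately and the upload
        # never blocks interpreter shutdown.
        self._sync_thread = threading.Thread(target=_sync, daemon=True)
        self._sync_thread.start()
```

The redaction runs before the thread starts, so the original `messages` list is never mutated and the background worker only sees the copy.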
|
||||
|
||||
## Profile Isolation
|
||||
|
||||
All storage paths **must** use the `hermes_home` kwarg from `initialize()`, not hardcoded `~/.hermes`:
|
||||
|
||||
@@ -227,6 +227,7 @@ Options:
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--all` | On `start` / `restart` / `stop`: act on **every profile's** gateway, not just the active `HERMES_HOME`. Useful if you run multiple profiles side-by-side and want to restart them all after `hermes update`. |
|
||||
| `--no-supervise` | On `run`: inside the s6-overlay Docker image, opt out of auto-supervision and use pre-s6 foreground semantics — gateway runs as the container's main process with no auto-restart. No-op outside the s6 image. Equivalent to setting `HERMES_GATEWAY_NO_SUPERVISE=1`. |
|
||||
|
||||
:::tip WSL users
|
||||
Use `hermes gateway run` instead of `hermes gateway start` — WSL's systemd support is unreliable. Wrap it in tmux for persistence: `tmux new -s hermes 'hermes gateway run'`. See [WSL FAQ](/reference/faq#wsl-gateway-keeps-disconnecting-or-hermes-gateway-start-fails) for details.
|
||||
|
||||
@@ -518,6 +518,7 @@ Advanced per-platform knobs for throttling the outbound message batcher. Most us
|
||||
| `HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT` | Per-platform connect timeout during gateway startup (seconds). |
|
||||
| `HERMES_GATEWAY_BUSY_INPUT_MODE` | Default gateway busy-input behavior: `queue`, `steer`, or `interrupt`. Can be overridden per chat with `/busy`. |
|
||||
| `HERMES_GATEWAY_BUSY_ACK_ENABLED` | Whether the gateway sends an acknowledgment message (⚡/⏳/⏩) when a user sends input while the agent is busy (default: `true`). Set to `false` to suppress these messages entirely — the input is still queued/steered/interrupts as normal, only the chat reply is silenced. Bridged from `display.busy_ack_enabled` in `config.yaml`. |
|
||||
| `HERMES_GATEWAY_NO_SUPERVISE` | Inside the s6-overlay Docker image, opt out of auto-supervision when running `hermes gateway run` and use pre-s6 foreground semantics (no auto-restart, gateway is the container's main process). Truthy values: `1`, `true`, `yes`. Equivalent to the `--no-supervise` CLI flag. No-op outside the s6 image. |
|
||||
| `HERMES_FILE_MUTATION_VERIFIER` | Enable the per-turn file-mutation verifier footer (default: `true`). When enabled, Hermes appends an advisory listing any `write_file` / `patch` calls that failed during the turn and were not superseded by a successful write. Set to `0`, `false`, `no`, or `off` to suppress. Mirrors `display.file_mutation_verifier` in `config.yaml`; the env var wins when set. |
|
||||
| `HERMES_CRON_TIMEOUT` | Inactivity timeout for cron job agent runs in seconds (default: `600`). The agent can run indefinitely while actively calling tools or receiving stream tokens — this only triggers when idle. Set to `0` for unlimited. |
|
||||
| `HERMES_CRON_SCRIPT_TIMEOUT` | Timeout for pre-run scripts attached to cron jobs in seconds (default: `120`). Override for scripts that need longer execution (e.g., randomized delays for anti-bot timing). Also configurable via `cron.script_timeout_seconds` in `config.yaml`. |
|
||||
|
||||
@@ -130,7 +130,7 @@ The agent has the same filesystem access as your user account. Use `hermes tools
|
||||
|
||||
Runs commands inside a Docker container with security hardening (all capabilities dropped, no privilege escalation, PID limits).
|
||||
|
||||
**Single persistent container, not per-command.** Hermes starts ONE long-lived container on first use and routes every terminal, file, and `execute_code` call through `docker exec` into that same container — across sessions, `/new`, `/reset`, and `delegate_task` subagents — for the lifetime of the Hermes process. Working-directory changes, installed packages, and files in `/workspace` carry over from one tool call to the next, just like a local shell. The container is stopped and removed on shutdown. See **Container lifecycle** below for details.
|
||||
**Single persistent container, shared across Hermes processes.** Hermes starts ONE long-lived container on first use and routes every terminal, file, and `execute_code` call through `docker exec` into that same container — across sessions, `/new`, `/reset`, and `delegate_task` subagents. Working-directory changes, installed packages, files in `/workspace`, and **background processes** all carry over from one tool call to the next, and from one Hermes process to the next. When you close a TUI session, run `/quit`, or start a new `hermes` invocation, the container keeps running and the next Hermes process reuses it via a labeled lookup. See **Container lifecycle** below for the exact teardown rules.
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
@@ -138,8 +138,11 @@ terminal:
|
||||
docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
|
||||
docker_mount_cwd_to_workspace: false # Mount launch dir into /workspace
|
||||
docker_run_as_host_user: false # See "Running container as host user" below
|
||||
docker_forward_env: # Env vars to forward into container
|
||||
docker_forward_env: # Host env vars to forward into container
|
||||
- "GITHUB_TOKEN"
|
||||
docker_env: # Literal env vars to inject (KEY=value)
|
||||
DEBUG: "1"
|
||||
PYTHONUNBUFFERED: "1"
|
||||
docker_volumes: # Host directory mounts
|
||||
- "/home/user/projects:/workspace/projects"
|
||||
- "/home/user/data:/data:ro" # :ro for read-only
|
||||
@@ -151,14 +154,49 @@ terminal:
|
||||
container_cpu: 1 # CPU cores (0 = unlimited)
|
||||
container_memory: 5120 # MB (0 = unlimited)
|
||||
container_disk: 51200 # MB (requires overlay2 on XFS+pquota)
|
||||
container_persistent: true # Persist /workspace and /root across sessions
|
||||
container_persistent: true # Persist /workspace and /root bind-mount dirs
|
||||
|
||||
# Cross-process container reuse (defaults match the "one long-lived
|
||||
# container shared across sessions" contract — see Container lifecycle).
|
||||
docker_persist_across_processes: true # Reuse container across Hermes restarts
|
||||
docker_orphan_reaper: true # Sweep abandoned Exited containers at startup
|
||||
|
||||
# Cross-backend lifecycle settings (apply to docker as well)
|
||||
timeout: 180 # Per-command timeout in seconds
|
||||
lifetime_seconds: 300 # Idle-reaper window; also feeds 2× orphan-reaper threshold
|
||||
```
|
||||
|
||||
**`docker_env`** vs **`docker_forward_env`**: the former injects literal `KEY=value` pairs you specify in the config (the values live in your `config.yaml` or are passed as a JSON dict via `TERMINAL_DOCKER_ENV='{"DEBUG":"1"}'`). The latter forwards values from your shell or `~/.hermes/.env`, so the actual secret never appears in the config file. Use `docker_forward_env` for tokens and `docker_env` for static knobs the container needs.
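
To make the distinction concrete, here is a small sketch of how the two settings could be merged into one container environment. The function name `resolve_container_env`, its arguments, and the choice to skip names found in neither source are assumptions for illustration; this is not Hermes' actual resolver.

```python
def resolve_container_env(docker_env, docker_forward_env, shell_env, dotenv_values):
    """Merge literal and forwarded variables into one dict (illustrative only).

    docker_env:         literal KEY -> value pairs from config.yaml
    docker_forward_env: list of variable NAMES to look up on the host
    shell_env:          the host shell environment (e.g. dict(os.environ))
    dotenv_values:      values parsed from ~/.hermes/.env
    """
    resolved = dict(docker_env)  # literal values are copied as-is
    for name in docker_forward_env:
        if name in shell_env:
            # The shell environment is consulted first.
            resolved[name] = shell_env[name]
        elif name in dotenv_values:
            # Then the .env file.
            resolved[name] = dotenv_values[name]
        # Assumption: names found in neither source are silently skipped.
    return resolved
```

Only the variable name appears in the config for forwarded entries; the value is looked up at container start.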
|
||||
|
||||
**`terminal.docker_extra_args`** (also overridable via `TERMINAL_DOCKER_EXTRA_ARGS='["--gpus=all"]'`) lets you pass arbitrary `docker run` flags that Hermes doesn't surface as first-class keys — `--gpus`, `--network`, `--add-host`, alternative `--security-opt` overrides, etc. Each entry must be a string; the list is appended last to the assembled `docker run` invocation so it can override Hermes' defaults if needed. Use sparingly — flags that conflict with the sandbox hardening (capability drops, `--user`, the workspace bind mount) will silently weaken isolation.
|
||||
|
||||
**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). Podman is supported out of the box: set `HERMES_DOCKER_BINARY=podman` (or the full path) to force it when both are installed.
|
||||
|
||||
**Container lifecycle:** Hermes reuses a single long-lived container (`docker run -d ... sleep 2h`) for every terminal and file-tool call, across sessions, `/new`, `/reset`, and `delegate_task` subagents, for the lifetime of the Hermes process. Commands run via `docker exec` with a login shell, so working-directory changes, installed packages, and files in `/workspace` all persist from one tool call to the next. The container is stopped and removed on Hermes shutdown (or when the idle-sweep reclaims it).
|
||||
#### Container lifecycle
|
||||
|
||||
Every Hermes-managed container is tagged with three labels so subsequent processes (and the orphan reaper) can identify it:
|
||||
|
||||
- `hermes-agent=1` — marks it as Hermes-managed
|
||||
- `hermes-task-id=<sanitized task_id>` — keys the per-task reuse probe
|
||||
- `hermes-profile=<sanitized profile name>` — scopes reuse and reaping to the active Hermes profile
|
||||
|
||||
On startup, Hermes runs `docker ps --filter label=hermes-task-id=<id> --filter label=hermes-profile=<profile>` and **attaches to the existing container** when it finds one. If the container is `exited` (e.g. after a Docker daemon restart), it's `docker start`'d and reused — filesystem state and any installed packages survive, but in-container background processes do not.
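
The label probe above can be sketched as a function that assembles the `docker ps` argument list. The helper name `build_reuse_probe` and the `include_stopped` switch are hypothetical; the switch is included because plain `docker ps` lists only running containers and `--all` is the standard Docker flag for listing stopped ones too. Label sanitization is left out.

```python
def build_reuse_probe(task_id, profile, binary="docker", include_stopped=False):
    """Build the argv for looking up a Hermes-labeled container (illustrative)."""
    argv = [binary, "ps"]
    if include_stopped:
        # Assumption: exited containers are wanted as well.
        argv.append("--all")
    argv += [
        "--filter", f"label=hermes-task-id={task_id}",
        "--filter", f"label=hermes-profile={profile}",
    ]
    return argv
```

The list can be passed to `subprocess.run(argv, capture_output=True, text=True)`; passing `binary="podman"` mirrors the `HERMES_DOCKER_BINARY` override.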
|
||||
|
||||
When a Hermes process exits — `/quit`, closing a TUI session, gateway shutdown, even SIGKILL — the cleanup path is a **no-op for the container in default mode**. The container keeps running. The next Hermes process attaches to it in milliseconds via the label probe. This is the behavior the "one long-lived container shared across sessions" contract requires: it's the only way background processes (npm watchers, dev servers, long-running pytest) survive across sessions.
|
||||
|
||||
**The container is only torn down (stopped and `docker rm -f`'d) in these cases:**
|
||||
|
||||
| Trigger | When it fires |
|
||||
|---|---|
|
||||
| `docker_persist_across_processes: false` | Explicit per-process isolation. Every `cleanup()` does `stop` + `rm -f`. Matches pre-issue-#20561 behavior. |
|
||||
| Idle reaper (`lifetime_seconds`, default 300s) | Only when the env is `persist_across_processes=false`. Persist-mode envs are no-op'd; container survives the idle sweep. |
|
||||
| Orphan reaper at next startup | Sweeps **Exited** hermes-labeled containers older than `2 × lifetime_seconds` (default 600s = 10 min), scoped to the current profile. **Running containers are never touched** — sibling-process safety. Set `docker_orphan_reaper: false` to disable. |
|
||||
| Direct user action | `docker rm -f`, `docker system prune`, Docker Desktop restart. We don't set `--restart=always`, so a host reboot leaves the container `Exited` (its CoW layer survives and gets reused on next startup, but bg processes are gone). |
|
||||
|
||||
Edge cases worth knowing:
|
||||
|
||||
- **OOM kill of in-container PID 1** transitions the container to `Exited`. Next reuse will `docker start` it; filesystem state survives, bg processes do not.
|
||||
- **Switching profiles** isolates containers from each other — a container labeled `hermes-profile=work` is invisible to a Hermes process running under `hermes-profile=research`. The orphan reaper is profile-scoped too, so cross-profile containers don't get reaped accidentally, but they also won't get cleaned up automatically until you start Hermes again under their original profile.
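
The orphan-reaper rule from the teardown table can be written as a single predicate. This is a sketch under stated assumptions: the function name and arguments are hypothetical, and `age_seconds` stands for whatever age measurement the reaper uses, which the text does not specify.

```python
def should_reap(state, age_seconds, container_profile, current_profile,
                lifetime_seconds=300, reaper_enabled=True):
    """Decide whether the startup orphan reaper may remove a container (illustrative)."""
    if not reaper_enabled:
        # docker_orphan_reaper: false disables the sweep entirely.
        return False
    if container_profile != current_profile:
        # Reaping is scoped to the active profile.
        return False
    if state != "exited":
        # Running containers are never touched (sibling-process safety).
        return False
    # Only containers older than 2 x lifetime_seconds qualify.
    return age_seconds > 2 * lifetime_seconds
```

With the default `lifetime_seconds` of 300, the threshold is 600 seconds, matching the 10-minute figure in the table.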
|
||||
|
||||
Parallel subagents spawned via `delegate_task(tasks=[...])` share this one container — concurrent `cd`, env mutations, and writes to the same path will collide. If a subagent needs an isolated sandbox, it must register a per-task image override via `register_task_env_overrides()`, which RL and benchmark environments (TerminalBench2, HermesSweEnv, etc.) do automatically for their per-task Docker images.
|
||||
|
||||
@@ -170,6 +208,29 @@ Parallel subagents spawned via `delegate_task(tasks=[...])` share this one conta
|
||||
|
||||
**Credential forwarding:** Env vars listed in `docker_forward_env` are resolved from your shell environment first, then `~/.hermes/.env`. Skills can also declare `required_environment_variables` which are merged automatically.
|
||||
|
||||
#### Environment variable overrides
|
||||
|
||||
Every key under `terminal:` has an env-var override of the form `TERMINAL_<KEY_UPPERCASE>`. The most useful ones for the Docker backend:
|
||||
|
||||
| Env var | Maps to | Notes |
|
||||
|---|---|---|
|
||||
| `TERMINAL_DOCKER_IMAGE` | `docker_image` | Base image |
|
||||
| `TERMINAL_DOCKER_FORWARD_ENV` | `docker_forward_env` | JSON array: `'["GITHUB_TOKEN","OPENAI_API_KEY"]'` |
|
||||
| `TERMINAL_DOCKER_ENV` | `docker_env` | JSON dict: `'{"DEBUG":"1"}'` |
|
||||
| `TERMINAL_DOCKER_VOLUMES` | `docker_volumes` | JSON array of `"host:container[:ro]"` strings |
|
||||
| `TERMINAL_DOCKER_EXTRA_ARGS` | `docker_extra_args` | JSON array |
|
||||
| `TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE` | `docker_mount_cwd_to_workspace` | `true` / `false` |
|
||||
| `TERMINAL_DOCKER_RUN_AS_HOST_USER` | `docker_run_as_host_user` | `true` / `false` |
|
||||
| `TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES` | `docker_persist_across_processes` | `true` / `false` — default `true` |
|
||||
| `TERMINAL_DOCKER_ORPHAN_REAPER` | `docker_orphan_reaper` | `true` / `false` — default `true` |
|
||||
| `TERMINAL_CONTAINER_CPU` | `container_cpu` | CPU cores |
|
||||
| `TERMINAL_CONTAINER_MEMORY` | `container_memory` | MB |
|
||||
| `TERMINAL_CONTAINER_DISK` | `container_disk` | MB |
|
||||
| `TERMINAL_CONTAINER_PERSISTENT` | `container_persistent` | `true` / `false` — controls the bind-mount workspace dirs, distinct from `docker_persist_across_processes` |
|
||||
| `TERMINAL_LIFETIME_SECONDS` | `lifetime_seconds` | Idle reaper window |
|
||||
| `TERMINAL_TIMEOUT` | `timeout` | Per-command timeout |
|
||||
| `HERMES_DOCKER_BINARY` | _none_ | Force a specific docker/podman binary path |
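
As a rough illustration of the `TERMINAL_<KEY_UPPERCASE>` convention, the sketch below reads one override and decodes it according to the value shapes listed in the table. The helper `terminal_override` and the key groupings are assumptions for illustration, not Hermes' actual parser.

```python
import json

# Keys whose overrides are JSON arrays or dicts, per the table above.
JSON_KEYS = {"docker_forward_env", "docker_env", "docker_volumes", "docker_extra_args"}
# Keys whose overrides are `true` / `false`.
BOOL_KEYS = {
    "docker_mount_cwd_to_workspace",
    "docker_run_as_host_user",
    "docker_persist_across_processes",
    "docker_orphan_reaper",
    "container_persistent",
}
# Keys treated as whole numbers in this sketch (MB or seconds).
INT_KEYS = {"container_memory", "container_disk", "lifetime_seconds", "timeout"}


def terminal_override(key, environ):
    """Return the decoded override for `key`, or None if the env var is unset."""
    raw = environ.get(f"TERMINAL_{key.upper()}")
    if raw is None:
        return None
    if key in JSON_KEYS:
        return json.loads(raw)
    if key in BOOL_KEYS:
        return raw.strip().lower() == "true"
    if key in INT_KEYS:
        return int(raw)
    return raw  # e.g. docker_image stays a plain string
```

Passing `os.environ` as `environ` reads the real process environment; a plain dict works for testing.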
|
||||
|
||||
### SSH Backend
|
||||
|
||||
Runs commands on a remote server over SSH. Uses ControlMaster for connection reuse (5-minute idle keepalive). Persistent shell is enabled by default — state (cwd, env vars) survives across commands.
|
||||
|
||||
@@ -101,8 +101,29 @@ The entrypoint starts `hermes dashboard` in the background (running as the non-r
|
||||
| `HERMES_DASHBOARD_HOST` | Bind address for the dashboard HTTP server | `127.0.0.1` |
|
||||
| `HERMES_DASHBOARD_PORT` | Port for the dashboard HTTP server | `9119` |
|
||||
| `HERMES_DASHBOARD_TUI` | Set to `1` to expose the in-browser Chat tab (embedded `hermes --tui` via PTY/WebSocket) | *(unset)* |
|
||||
| `HERMES_DASHBOARD_INSECURE` | Set to `1` (or `true` / `yes`) to bind without the OAuth auth gate. Only use on trusted networks behind a reverse proxy without the OAuth contract — the dashboard exposes API keys and session data | *(unset — gate enforced when a `DashboardAuthProvider` is registered)* |
|
||||
|
||||
By default, the dashboard stays on loopback to avoid exposing the unauthenticated web surface over the network. To publish it intentionally, set `HERMES_DASHBOARD_HOST=0.0.0.0` and configure your own trusted network boundary/reverse proxy. In that case you must explicitly add `--insecure` behavior by passing host/flags in your command path (the entrypoint no longer auto-enables insecure mode).
|
||||
By default, the dashboard stays on loopback (`127.0.0.1`) to avoid exposing
|
||||
the web surface over the network. To publish it intentionally, set
|
||||
`HERMES_DASHBOARD_HOST=0.0.0.0`. The dashboard's OAuth auth gate engages
|
||||
automatically whenever:
|
||||
|
||||
1. The bind host is non-loopback, **and**
|
||||
2. A `DashboardAuthProvider` plugin is registered.
|
||||
|
||||
The bundled `dashboard_auth/nous` provider activates whenever
|
||||
`HERMES_DASHBOARD_OAUTH_CLIENT_ID` is set (see
|
||||
[Web Dashboard → Authentication](features/web-dashboard.md)). With the
|
||||
gate engaged, browser callers are redirected to the configured portal's
|
||||
OAuth flow before they can reach any protected route.
|
||||
|
||||
If no provider is registered and the bind is non-loopback, the dashboard
|
||||
**fails closed at startup** with a specific error pointing at the
|
||||
missing env var. To opt out of the gate explicitly — for a trusted-LAN
|
||||
deployment behind your own reverse proxy without the OAuth contract —
|
||||
set `HERMES_DASHBOARD_INSECURE=1`. This re-enables the legacy "no auth,
|
||||
loud warning" mode and is the only path that disables the gate; the bind
|
||||
host does not implicitly determine `--insecure` anymore.
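
The gate rules above reduce to a small decision table. The sketch below is one way to express it; the function names and the returned mode strings are hypothetical, and the ordering of the checks (the insecure opt-out first) is an assumption rather than the dashboard's actual startup code.

```python
import ipaddress


def is_loopback(host):
    """True for `localhost` and loopback IP literals such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; treat other hostnames as non-loopback.
        return False


def dashboard_auth_mode(host, provider_registered, insecure=False):
    """Classify how the dashboard would start for a given bind configuration."""
    if insecure:
        # HERMES_DASHBOARD_INSECURE=1 is the only way to disable the gate.
        return "insecure"
    if is_loopback(host):
        # The gate only engages for non-loopback binds.
        return "loopback"
    if provider_registered:
        return "oauth-gate"
    # Non-loopback bind with no provider: refuse to start.
    return "fail-closed"
```

For example, `dashboard_auth_mode("0.0.0.0", provider_registered=False)` yields `"fail-closed"`, which corresponds to the startup error described above.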
|
||||
|
||||
:::note
|
||||
The dashboard runs as a supervised s6 service inside the container. If
|
||||
@@ -228,7 +249,7 @@ docker run -it --rm \
|
||||
Direct `-e` flags override values from `.env`. This is useful for CI/CD or secrets-manager integrations where you don't want keys on disk.
|
||||
|
||||
:::note Looking for Docker as the **terminal backend**?
|
||||
This page covers running Hermes itself inside Docker. If you want Hermes to execute the agent's `terminal` / `execute_code` calls inside a Docker sandbox container (one persistent container per Hermes process), that's a separate config block — `terminal.backend: docker` plus `terminal.docker_image`, `terminal.docker_volumes`, `terminal.docker_forward_env`, `terminal.docker_run_as_host_user`, and `terminal.docker_extra_args`. See [Configuration → Docker Backend](configuration.md#docker-backend) for the full set.
|
||||
This page covers running Hermes itself inside Docker. If you want Hermes to execute the agent's `terminal` / `execute_code` calls inside a Docker sandbox container (one long-lived container shared across Hermes processes — see issue #20561), that's a separate config block — `terminal.backend: docker` plus `terminal.docker_image`, `terminal.docker_volumes`, `terminal.docker_forward_env`, `terminal.docker_env`, `terminal.docker_run_as_host_user`, `terminal.docker_extra_args`, `terminal.docker_persist_across_processes`, and `terminal.docker_orphan_reaper`. See [Configuration → Docker Backend](configuration.md#docker-backend) for the full set including container-lifecycle rules.
|
||||
:::
|
||||
|
||||
## Docker Compose example
|
||||
|
||||
@@ -520,6 +520,27 @@ echo 'SUPERMEMORY_API_KEY=***' >> ~/.hermes/.env
|
||||
|
||||
**Support:** [Discord](https://supermemory.link/discord) · [support@supermemory.com](mailto:support@supermemory.com)
|
||||
|
||||
### Memori
|
||||
|
||||
Structured long-term memory using Memori Cloud, with background completed-turn capture, tool-aware turn context, and explicit recall tools for facts, summaries, quota, signup, and feedback.
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| **Best for** | Agent-controlled recall with structured project and session attribution |
|
||||
| **Requires** | `pip install hermes-memori` + `hermes-memori install` + [Memori API key](https://app.memorilabs.ai/signup) |
|
||||
| **Data storage** | Memori Cloud |
|
||||
| **Cost** | Memori pricing |
|
||||
|
||||
**Tools:** `memori_recall` (search long-term memory), `memori_recall_summary` (summarized context), `memori_quota` (usage/quota), `memori_signup` (request signup email), `memori_feedback` (send integration feedback)
|
||||
|
||||
**Setup:**
|
||||
```bash
|
||||
pip install hermes-memori
|
||||
hermes-memori install
|
||||
hermes config set memory.provider memori
|
||||
hermes memory setup
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Provider Comparison
|
||||
@@ -534,6 +555,7 @@ echo 'SUPERMEMORY_API_KEY=***' >> ~/.hermes/.env
|
||||
| **RetainDB** | Cloud | $20/mo | 5 | `requests` | Delta compression |
|
||||
| **ByteRover** | Local/Cloud | Free/Paid | 3 | `brv` CLI | Pre-compression extraction |
|
||||
| **Supermemory** | Cloud | Paid | 4 | `supermemory` | Context fencing + session graph ingest + multi-container |
|
||||
| **Memori** | Cloud | Free/Paid | 5 | `hermes-memori` | Tool-aware memory + structured recall |
|
||||
|
||||
## Profile Isolation
|
||||
|
||||
|
||||
@@ -224,6 +224,7 @@ hermes gateway <subcommand>
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--all` | On `start` / `restart` / `stop`: act on **every profile's** gateway, not just the active `HERMES_HOME`. Useful when you run multiple profiles side by side and want to restart them all after `hermes update`. |
|
||||
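| `--no-supervise` | On `run`: inside the s6-overlay Docker image, skip s6 auto-supervision and fall back to pre-s6 foreground semantics, where the gateway runs as the container's main process with no auto-restart. No-op outside the s6 image. Equivalent to setting `HERMES_GATEWAY_NO_SUPERVISE=1`. |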
|
||||
|
||||
:::tip WSL users
|
||||
Use `hermes gateway run` instead of `hermes gateway start`, because WSL's systemd support is unreliable. Wrap it in tmux for persistence: `tmux new -s hermes 'hermes gateway run'`. See the [WSL FAQ](/reference/faq#wsl-gateway-keeps-disconnecting-or-hermes-gateway-start-fails) for details.
|
||||
|
||||
@@ -518,6 +518,7 @@ Inbound change-notification listener for Graph events (Teams meetings, calendar, chat, etc.)
|
||||
| `HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT` | Per-platform connect timeout during gateway startup (seconds). |
|
||||
| `HERMES_GATEWAY_BUSY_INPUT_MODE` | Default gateway busy-input behavior: `queue`, `steer`, or `interrupt`. Can be overridden per chat with `/busy`. |
|
||||
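| `HERMES_GATEWAY_BUSY_ACK_ENABLED` | Whether the gateway sends an acknowledgment message (⚡/⏳/⏩) when a user sends input while the agent is busy (default: `true`). Set to `false` to suppress these messages entirely; the input is still queued, steered, or treated as an interrupt as normal, and only the chat reply is silenced. Bridged from `display.busy_ack_enabled` in `config.yaml`. |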
|
||||
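| `HERMES_GATEWAY_NO_SUPERVISE` | Inside the s6-overlay Docker image, skip s6 auto-supervision when running `hermes gateway run` and fall back to pre-s6 foreground semantics (no auto-restart, gateway is the container's main process). Truthy values: `1`, `true`, `yes`. Equivalent to the `--no-supervise` CLI flag. No-op outside the s6 image. |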
|
||||
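| `HERMES_FILE_MUTATION_VERIFIER` | Enable the per-turn file-mutation verifier footer (default: `true`). When enabled, Hermes appends an advisory listing any `write_file` / `patch` calls that failed during the turn and were not superseded by a successful write. Set to `0`, `false`, `no`, or `off` to suppress. Mirrors `display.file_mutation_verifier` in `config.yaml`; the env var wins when set. |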
|
||||
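| `HERMES_CRON_TIMEOUT` | Inactivity timeout for cron job agent runs in seconds (default: `600`). The agent can run indefinitely while actively calling tools or receiving stream tokens; the timeout only triggers when idle. Set to `0` for unlimited. |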
|
||||
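| `HERMES_CRON_SCRIPT_TIMEOUT` | Timeout for pre-run scripts attached to cron jobs in seconds (default: `120`). Increase it for scripts that need longer execution (e.g., randomized delays for anti-bot timing). Also configurable via `cron.script_timeout_seconds` in `config.yaml`. |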
|
||||
|
||||
@@ -80,8 +80,28 @@ docker run -d \
|
||||
| `HERMES_DASHBOARD_HOST` | Bind address for the dashboard HTTP server | `127.0.0.1` |
|
||||
| `HERMES_DASHBOARD_PORT` | Port for the dashboard HTTP server | `9119` |
|
||||
| `HERMES_DASHBOARD_TUI` | Set to `1` to enable the in-browser Chat tab (embedded `hermes --tui` via PTY/WebSocket) | *(unset)* |
|
||||
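| `HERMES_DASHBOARD_INSECURE` | Set to `1` (or `true` / `yes`) to bind without the OAuth auth gate. Only use on trusted networks, behind a reverse proxy without the OAuth contract, because the dashboard exposes API keys and session data | *(unset: the gate is enforced when a `DashboardAuthProvider` is registered)* |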
|
||||
|
||||
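By default, the dashboard stays on loopback to avoid exposing the unauthenticated web surface over the network. To publish it intentionally, set `HERMES_DASHBOARD_HOST=0.0.0.0` and configure your own trusted network boundary or reverse proxy. In that case you must explicitly add `--insecure` behavior via the host/flags in your command path (the entrypoint no longer auto-enables insecure mode).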
|
||||
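By default, the dashboard stays on loopback (`127.0.0.1`) to avoid exposing
|
||||
the web surface over the network. To publish it intentionally, set
|
||||
`HERMES_DASHBOARD_HOST=0.0.0.0`. The dashboard's OAuth auth gate engages
|
||||
automatically when both of the following hold: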
|
||||
|
||||
1. The bind host is non-loopback, **and**
|
||||
2. A `DashboardAuthProvider` plugin is registered.
|
||||
|
||||
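The bundled `dashboard_auth/nous` provider activates automatically when
|
||||
`HERMES_DASHBOARD_OAUTH_CLIENT_ID` is set (see
|
||||
[Web Dashboard → Authentication](features/web-dashboard.md)). With the gate engaged,
|
||||
browser callers are first redirected to the configured portal's OAuth flow before they can
|
||||
reach any protected route.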
|
||||
|
||||
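If no provider is registered and the bind is non-loopback, the dashboard **fails closed
|
||||
at startup** with a specific error pointing at the missing environment variable. To opt out
|
||||
of the gate explicitly, for a trusted-LAN deployment behind your own reverse proxy without
|
||||
the OAuth contract, set `HERMES_DASHBOARD_INSECURE=1`.
|
||||
This restores the legacy "no auth, loud warning" mode and is the only path that disables the gate;
|
||||
the bind host no longer implicitly determines `--insecure`.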
|
||||
|
||||
:::note
|
||||
The dashboard runs as a supervised s6 service inside the container. If
|
||||
|
||||
@@ -500,7 +500,7 @@ export default function SkillsDashboard() {
|
||||
const sources = useMemo(() => {
|
||||
const set = new Set(allSkillsLocal.map((s) => s.source));
|
||||
return SOURCE_ORDER.filter((s) => s === "all" || set.has(s));
|
||||
}, []);
|
||||
}, [allSkillsLocal]);
|
||||
|
||||
const categoryEntries = useMemo(() => {
|
||||
const pool =
|
||||
@@ -523,7 +523,7 @@ export default function SkillsDashboard() {
|
||||
return Array.from(map.entries())
|
||||
.sort((a, b) => b[1].count - a[1].count)
|
||||
.map(([key, { label, count }]) => ({ key, label, count }));
|
||||
}, [sourceFilter]);
|
||||
}, [sourceFilter, allSkillsLocal]);
|
||||
|
||||
const filtered = useMemo(() => {
|
||||
const q = debouncedSearch.toLowerCase().trim();
|
||||
|
||||