Compare commits

...

16 Commits

Author SHA1 Message Date
Test
cab6fb5a09 fix: allow agent-created skills with caution-level findings
Agent-created skills were using the same policy as community hub
installs, blocking any skill with medium/high severity findings
(e.g. docker pull, pip install, git clone). This meant the agent
couldn't create skills that reference Docker or other common tools.

Changed agent-created policy from (allow, block, block) to
(allow, allow, block) — matching the trusted policy. Caution-level
findings (medium/high severity) are now allowed through, while
dangerous findings (critical severity like exfiltration, prompt
injection, reverse shells) remain blocked.

Added 4 tests covering the agent-created policy: safe allowed,
caution allowed, dangerous blocked, force override.
2026-03-17 12:18:53 -07:00
teknium1
c881209b92 Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection"
This reverts commit a1c81360a5.
2026-03-17 10:04:53 -07:00
Teknium
d7a2e3ddae fix: handle hyphenated FTS5 queries and preserve quoted literals (#1776)
_sanitize_fts5_query() was stripping ALL double quotes (including
properly paired ones), breaking user-provided quoted phrases like
"exact phrase".  Hyphenated terms like chat-send also silently
expanded to chat AND send, returning unexpected or zero results.

Fix:
1. Extract balanced quoted phrases into placeholders before
   stripping FTS5-special characters, then restore them.
2. Wrap unquoted hyphenated terms (word-word) in double quotes so
   FTS5 matches them as exact phrases instead of splitting on
   the hyphen.
3. Unmatched quotes are still stripped as before.

Based on issue report by @bailob (#1770) and PR #1773 by @Jah-yee
(whose branch contained unrelated changes and couldn't be merged
directly).

Closes #1770
Closes #1773

Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
2026-03-17 09:44:01 -07:00
Teknium
d5af593769 Merge pull request #1769 from sai-samarth/fix/whatsapp-send-message-support
Clean merge — PR is current against main, tests pass, implementation matches existing gateway WhatsApp bridge pattern.
2026-03-17 09:42:01 -07:00
Teknium
df74f86955 Merge pull request #1767 from sai-samarth/fix/systemd-node-path-whatsapp
Clean fix for nvm/non-standard Node.js paths in systemd units. Merges cleanly.
2026-03-17 09:41:39 -07:00
sai-samarth
a3de843fdb test: replace real-looking WhatsApp jid in regression test 2026-03-17 15:38:37 +00:00
sai-samarth
dc15bc508f fix(tools): add outbound WhatsApp send_message routing 2026-03-17 15:31:13 +00:00
sai-samarth
b8eb7c5fed fix(gateway): include resolved node path in systemd unit 2026-03-17 15:11:28 +00:00
Teknium
548cedb869 fix(context_compressor): prevent consecutive same-role messages after compression (#1743)
compress() checks both the head and tail neighbors when choosing the
summary message role.  When only the tail collides, the role is flipped.
When BOTH roles would create consecutive same-role messages (e.g.
head=assistant, tail=user), the summary is merged into the first tail
message instead of inserting a standalone message that breaks role
alternation and causes API 400 errors.

The previous code handled head-side collision but left the tail-side
uncovered — long conversations would crash mid-reply with no useful
error, forcing the user to /reset and lose session history.

Based on PR #1186 by @alireza78a, with improved double-collision
handling (merge into tail instead of unconditional 'user' fallback).

Co-authored-by: alireza78a <alireza78.crypto@gmail.com>
2026-03-17 05:18:52 -07:00
Teknium
702191049f fix(session): skip corrupt lines in load_transcript instead of crashing (#1744)
Wrap json.loads() in load_transcript() with try/except JSONDecodeError
so that partial JSONL lines (from mid-write crashes like OOM/SIGKILL)
are skipped with a warning instead of crashing the entire transcript
load. The rest of the history loads fine.

Adds a logger.warning with the session ID and truncated corrupt line
content for debugging visibility.

Salvaged from PR #1193 by alireza78a.
Closes #1193
2026-03-17 05:18:12 -07:00
Teknium
aea39eeafb Merge pull request #1736 from NousResearch/fix/gateway-platform-hardening
fix(gateway): SMS session-per-send + Matrix bare media types break downstream processing
2026-03-17 04:46:25 -07:00
Teknium
23a3f01b2b Merge pull request #1735 from NousResearch/fix/tool-handler-safety
fix(tools): browser handlers TypeError on unexpected LLM params + fuzzy_match docstring
2026-03-17 04:46:22 -07:00
Teknium
af118501b9 Merge pull request #1733 from NousResearch/fix/defensive-hardening
fix: defensive hardening — logging, dedup, locks, dead code
2026-03-17 04:46:20 -07:00
Teknium
d1d17f4f0a feat(compression): add summary_base_url + move compression config to YAML-only
- Add summary_base_url config option to compression block for custom
  OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama)
- Remove compression env var bridges from cli.py and gateway/run.py
  (CONTEXT_COMPRESSION_* env vars no longer set from config)
- Switch run_agent.py to read compression config directly from
  config.yaml instead of env vars
- Fix backwards-compat block in _resolve_task_provider_model to also
  fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG
  sets this, which was silently preventing the compression section's
  summary_* keys from being read)
- Add test for summary_base_url config-to-client flow
- Update docs to show compression as config.yaml-only

Closes #1591
Based on PR #1702 by @uzaylisak
2026-03-17 04:46:15 -07:00
teknium1
6832d60bc0 fix(gateway): SMS persistent HTTP session + Matrix MIME media types
1. sms.py: Replace per-send aiohttp.ClientSession with a persistent
   session created in connect() and closed in disconnect(). Each
   outbound SMS no longer pays the TCP+TLS handshake cost. Falls back
   to a temporary session if the persistent one isn't available.

2. matrix.py: Use proper MIME types (image/png, audio/ogg, video/mp4)
   instead of bare category words (image, audio, video). The gateway's
   media processing checks startswith('image/') and startswith('audio/')
   so bare words caused Matrix images to skip vision enrichment and
   Matrix audio to skip transcription. Now extracts the actual MIME
   type from the nio event's content info when available.
2026-03-17 04:35:14 -07:00
teknium1
847ee20390 fix: defensive hardening — logging, dedup, locks, dead code
Four small fixes:

1. model_tools.py: Tool import failures logged at WARNING instead of
   DEBUG. If a tool module fails to import (syntax error, missing dep),
   the user now sees a warning instead of the tool silently vanishing.

2. hermes_cli/config.py: Remove duplicate 'import sys' (lines 19, 21).

3. agent/model_metadata.py: Remove 6 duplicate entries in
   DEFAULT_CONTEXT_LENGTHS dict. Python keeps the last value, so no
   functional change, but removes maintenance confusion.

4. hermes_state.py: Add missing self._lock to the LIKE query in
   resolve_session_id(). The exact-match path used get_session()
   (which locks internally), but the prefix fallback queried _conn
   without the lock.
2026-03-17 04:31:26 -07:00
30 changed files with 636 additions and 495 deletions

View File

@@ -1248,12 +1248,16 @@ def _resolve_task_provider_model(
cfg_base_url = str(task_config.get("base_url", "")).strip() or None
cfg_api_key = str(task_config.get("api_key", "")).strip() or None
# Backwards compat: compression section has its own keys
if task == "compression" and not cfg_provider:
# Backwards compat: compression section has its own keys.
# The auxiliary.compression defaults to provider="auto", so treat
# both None and "auto" as "not explicitly configured".
if task == "compression" and (not cfg_provider or cfg_provider == "auto"):
comp = config.get("compression", {}) if isinstance(config, dict) else {}
if isinstance(comp, dict):
cfg_provider = comp.get("summary_provider", "").strip() or None
cfg_model = cfg_model or comp.get("summary_model", "").strip() or None
_sbu = comp.get("summary_base_url") or ""
cfg_base_url = cfg_base_url or _sbu.strip() or None
env_model = _get_auxiliary_env_override(task, "MODEL") if task else None
resolved_model = model or env_model or cfg_model

View File

@@ -311,6 +311,7 @@ Write only the summary body. Do not include any preamble or prefix; the system w
)
compressed.append(msg)
_merge_summary_into_tail = False
if summary:
last_head_role = messages[compress_start - 1].get("role", "user") if compress_start > 0 else "user"
first_tail_role = messages[compress_end].get("role", "user") if compress_end < n_messages else "user"
@@ -326,13 +327,25 @@ Write only the summary body. Do not include any preamble or prefix; the system w
flipped = "assistant" if summary_role == "user" else "user"
if flipped != last_head_role:
summary_role = flipped
compressed.append({"role": summary_role, "content": summary})
else:
# Both roles would create consecutive same-role messages
# (e.g. head=assistant, tail=user — neither role works).
# Merge the summary into the first tail message instead
# of inserting a standalone message that breaks alternation.
_merge_summary_into_tail = True
if not _merge_summary_into_tail:
compressed.append({"role": summary_role, "content": summary})
else:
if not self.quiet_mode:
print(" ⚠️ No summary model available — middle turns dropped without summary")
for i in range(compress_end, n_messages):
compressed.append(messages[i].copy())
msg = messages[i].copy()
if _merge_summary_into_tail and i == compress_end:
original = msg.get("content") or ""
msg["content"] = summary + "\n\n" + original
_merge_summary_into_tail = False
compressed.append(msg)
self.compression_count += 1

View File

@@ -94,10 +94,9 @@ DEFAULT_CONTEXT_LENGTHS = {
"gpt-5": 128000,
"gpt-5-codex": 128000,
"gpt-5-nano": 128000,
"claude-opus-4-6": 200000,
# Bare model IDs without provider prefix (avoid duplicates with entries above)
"claude-opus-4-5": 200000,
"claude-opus-4-1": 200000,
"claude-sonnet-4-6": 200000,
"claude-sonnet-4-5": 200000,
"claude-sonnet-4": 200000,
"claude-haiku-4-5": 200000,
@@ -108,11 +107,7 @@ DEFAULT_CONTEXT_LENGTHS = {
"minimax-m2.5": 204800,
"minimax-m2.5-free": 204800,
"minimax-m2.1": 204800,
"glm-5": 202752,
"glm-4.7": 202752,
"glm-4.6": 202752,
"kimi-k2.5": 262144,
"kimi-k2-thinking": 262144,
"kimi-k2": 262144,
"qwen3-coder": 32768,
"big-pickle": 128000,

17
cli.py
View File

@@ -219,7 +219,6 @@ def load_cli_config() -> Dict[str, Any]:
"streaming": False,
"skin": "default",
"theme_mode": "auto",
},
"clarify": {
"timeout": 120, # Seconds to wait for a clarify answer before auto-proceeding
@@ -380,22 +379,10 @@ def load_cli_config() -> Dict[str, Any]:
if config_key in browser_config:
os.environ[env_var] = str(browser_config[config_key])
# Apply compression config to environment variables
compression_config = defaults.get("compression", {})
compression_env_mappings = {
"enabled": "CONTEXT_COMPRESSION_ENABLED",
"threshold": "CONTEXT_COMPRESSION_THRESHOLD",
"summary_model": "CONTEXT_COMPRESSION_MODEL",
"summary_provider": "CONTEXT_COMPRESSION_PROVIDER",
}
for config_key, env_var in compression_env_mappings.items():
if config_key in compression_config:
os.environ[env_var] = str(compression_config[config_key])
# Apply auxiliary model/direct-endpoint overrides to environment variables.
# Vision and web_extract each have their own provider/model/base_url/api_key tuple.
# (Compression is handled in the compression section above.)
# Compression config is read directly from config.yaml by run_agent.py and
# auxiliary_client.py — no env var bridging needed.
# Only set env vars for non-empty / non-default values so auto-detection
# still works.
auxiliary_config = defaults.get("auxiliary", {})

View File

@@ -662,17 +662,24 @@ class MatrixAdapter(BasePlatformAdapter):
http_url = self._mxc_to_http(url)
# Determine message type from event class.
media_type = "document"
# Use the MIME type from the event's content info when available,
# falling back to category-level MIME types for downstream matching
# (gateway/run.py checks startswith("image/"), startswith("audio/"), etc.)
content_info = getattr(event, "content", {}) if isinstance(getattr(event, "content", None), dict) else {}
event_mimetype = (content_info.get("info") or {}).get("mimetype", "")
media_type = "application/octet-stream"
msg_type = MessageType.DOCUMENT
if isinstance(event, nio.RoomMessageImage):
msg_type = MessageType.PHOTO
media_type = "image"
media_type = event_mimetype or "image/png"
elif isinstance(event, nio.RoomMessageAudio):
msg_type = MessageType.AUDIO
media_type = "audio"
media_type = event_mimetype or "audio/ogg"
elif isinstance(event, nio.RoomMessageVideo):
msg_type = MessageType.VIDEO
media_type = "video"
media_type = event_mimetype or "video/mp4"
elif event_mimetype:
media_type = event_mimetype
is_dm = self._dm_rooms.get(room.room_id, False)
if not is_dm and room.member_count == 2:

View File

@@ -79,6 +79,7 @@ class SmsAdapter(BasePlatformAdapter):
os.getenv("SMS_WEBHOOK_PORT", str(DEFAULT_WEBHOOK_PORT))
)
self._runner = None
self._http_session: Optional["aiohttp.ClientSession"] = None
def _basic_auth_header(self) -> str:
"""Build HTTP Basic auth header value for Twilio."""
@@ -106,6 +107,7 @@ class SmsAdapter(BasePlatformAdapter):
await self._runner.setup()
site = web.TCPSite(self._runner, "0.0.0.0", self._webhook_port)
await site.start()
self._http_session = aiohttp.ClientSession()
self._running = True
logger.info(
@@ -116,6 +118,9 @@ class SmsAdapter(BasePlatformAdapter):
return True
async def disconnect(self) -> None:
if self._http_session:
await self._http_session.close()
self._http_session = None
if self._runner:
await self._runner.cleanup()
self._runner = None
@@ -140,7 +145,8 @@ class SmsAdapter(BasePlatformAdapter):
"Authorization": self._basic_auth_header(),
}
async with aiohttp.ClientSession() as session:
session = self._http_session or aiohttp.ClientSession()
try:
for chunk in chunks:
form_data = aiohttp.FormData()
form_data.add_field("From", self._from_number)
@@ -167,6 +173,10 @@ class SmsAdapter(BasePlatformAdapter):
except Exception as e:
logger.error("[sms] send error to %s: %s", _redact_phone(chat_id), e)
return SendResult(success=False, error=str(e))
finally:
# Close session only if we created a fallback (no persistent session)
if not self._http_session and session:
await session.close()
return last_result

View File

@@ -130,17 +130,8 @@ if _config_path.exists():
os.environ[_env_var] = json.dumps(_val)
else:
os.environ[_env_var] = str(_val)
_compression_cfg = _cfg.get("compression", {})
if _compression_cfg and isinstance(_compression_cfg, dict):
_compression_env_map = {
"enabled": "CONTEXT_COMPRESSION_ENABLED",
"threshold": "CONTEXT_COMPRESSION_THRESHOLD",
"summary_model": "CONTEXT_COMPRESSION_MODEL",
"summary_provider": "CONTEXT_COMPRESSION_PROVIDER",
}
for _cfg_key, _env_var in _compression_env_map.items():
if _cfg_key in _compression_cfg:
os.environ[_env_var] = str(_compression_cfg[_cfg_key])
# Compression config is read directly from config.yaml by run_agent.py
# and auxiliary_client.py — no env var bridging needed.
# Auxiliary model/direct-endpoint overrides (vision, web_extract).
# Each task has provider/model/base_url/api_key; bridge non-default values to env vars.
_auxiliary_cfg = _cfg.get("auxiliary", {})
@@ -1632,10 +1623,6 @@ class GatewayRunner:
except Exception:
pass
# Check env override for disabling compression entirely
if os.getenv("CONTEXT_COMPRESSION_ENABLED", "").lower() in ("false", "0", "no"):
_hyg_compression_enabled = False
if _hyg_compression_enabled:
_hyg_context_length = get_model_context_length(_hyg_model)
_compress_token_threshold = int(

View File

@@ -944,7 +944,13 @@ class SessionStore:
for line in f:
line = line.strip()
if line:
messages.append(json.loads(line))
try:
messages.append(json.loads(line))
except json.JSONDecodeError:
logger.warning(
"Skipping corrupt line in transcript %s: %s",
session_id, line[:120],
)
return messages

View File

@@ -1,6 +1,5 @@
"""Shared ANSI color utilities for Hermes CLI modules."""
import os
import sys
@@ -21,123 +20,3 @@ def color(text: str, *codes) -> str:
if not sys.stdout.isatty():
return text
return "".join(codes) + text + Colors.RESET
# =============================================================================
# Terminal background detection (light vs dark)
# =============================================================================
def _detect_via_colorfgbg() -> str:
"""Check the COLORFGBG environment variable.
Some terminals (rxvt, xterm, iTerm2) set COLORFGBG to ``<fg>;<bg>``
where bg >= 8 usually means a dark background.
Returns "light", "dark", or "unknown".
"""
val = os.environ.get("COLORFGBG", "")
if not val:
return "unknown"
parts = val.split(";")
try:
bg = int(parts[-1])
except (ValueError, IndexError):
return "unknown"
# Standard terminal colors 0-6 are dark, 7+ are light.
# bg < 7 → dark background; bg >= 7 → light background.
if bg >= 7:
return "light"
return "dark"
def _detect_via_macos_appearance() -> str:
"""Check macOS AppleInterfaceStyle via ``defaults read``.
Returns "light", "dark", or "unknown".
"""
if sys.platform != "darwin":
return "unknown"
try:
import subprocess
result = subprocess.run(
["defaults", "read", "-g", "AppleInterfaceStyle"],
capture_output=True, text=True, timeout=2,
)
if result.returncode == 0 and "dark" in result.stdout.lower():
return "dark"
# If the key doesn't exist, macOS is in light mode.
return "light"
except Exception:
return "unknown"
def _detect_via_osc11() -> str:
"""Query the terminal background colour via the OSC 11 escape sequence.
Writes ``\\e]11;?\\a`` and reads the response to determine luminance.
Only works when stdin/stdout are connected to a real TTY (not piped).
Returns "light", "dark", or "unknown".
"""
if sys.platform == "win32":
return "unknown"
if not (sys.stdin.isatty() and sys.stdout.isatty()):
return "unknown"
try:
import select
import termios
import tty
fd = sys.stdin.fileno()
old_attrs = termios.tcgetattr(fd)
try:
tty.setraw(fd)
# Send OSC 11 query
sys.stdout.write("\x1b]11;?\x07")
sys.stdout.flush()
# Wait briefly for response
if not select.select([fd], [], [], 0.1)[0]:
return "unknown"
response = b""
while select.select([fd], [], [], 0.05)[0]:
response += os.read(fd, 128)
finally:
termios.tcsetattr(fd, termios.TCSADRAIN, old_attrs)
# Parse response: \x1b]11;rgb:RRRR/GGGG/BBBB\x07 (or \x1b\\)
text = response.decode("latin-1", errors="replace")
if "rgb:" not in text:
return "unknown"
rgb_part = text.split("rgb:")[-1].split("\x07")[0].split("\x1b")[0]
channels = rgb_part.split("/")
if len(channels) < 3:
return "unknown"
# Each channel is 2 or 4 hex digits; normalise to 0-255
vals = []
for ch in channels[:3]:
ch = ch.strip()
if len(ch) <= 2:
vals.append(int(ch, 16))
else:
vals.append(int(ch[:2], 16)) # take high byte
# Perceived luminance (ITU-R BT.601)
luminance = 0.299 * vals[0] + 0.587 * vals[1] + 0.114 * vals[2]
return "light" if luminance > 128 else "dark"
except Exception:
return "unknown"
def detect_terminal_background() -> str:
"""Detect whether the terminal has a light or dark background.
Tries three strategies in order:
1. COLORFGBG environment variable
2. macOS appearance setting
3. OSC 11 escape sequence query
Returns "light", "dark", or "unknown" if detection fails.
"""
for detector in (_detect_via_colorfgbg, _detect_via_macos_appearance, _detect_via_osc11):
result = detector()
if result != "unknown":
return result
return "unknown"

View File

@@ -16,7 +16,6 @@ import os
import platform
import re
import stat
import sys
import subprocess
import sys
import tempfile
@@ -162,6 +161,7 @@ DEFAULT_CONFIG = {
"threshold": 0.50,
"summary_model": "google/gemini-3-flash-preview",
"summary_provider": "auto",
"summary_base_url": None,
},
"smart_model_routing": {
"enabled": False,
@@ -236,7 +236,6 @@ DEFAULT_CONFIG = {
"streaming": False,
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
"theme_mode": "auto",
},
# Privacy settings

View File

@@ -6,6 +6,7 @@ Handles: hermes gateway [run|start|stop|restart|status|install|uninstall|setup]
import asyncio
import os
import shutil
import signal
import subprocess
import sys
@@ -401,8 +402,14 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
venv_bin = str(PROJECT_ROOT / "venv" / "bin")
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
# Build a PATH that includes the venv, node_modules, and standard system dirs
sane_path = f"{venv_bin}:{node_bin}:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
path_entries = [venv_bin, node_bin]
resolved_node = shutil.which("node")
if resolved_node:
resolved_node_dir = str(Path(resolved_node).resolve().parent)
if resolved_node_dir not in path_entries:
path_entries.append(resolved_node_dir)
path_entries.extend(["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"])
sane_path = ":".join(path_entries)
hermes_home = str(Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")).resolve())

View File

@@ -114,7 +114,6 @@ class SkinConfig:
name: str
description: str = ""
colors: Dict[str, str] = field(default_factory=dict)
colors_light: Dict[str, str] = field(default_factory=dict)
spinner: Dict[str, Any] = field(default_factory=dict)
branding: Dict[str, str] = field(default_factory=dict)
tool_prefix: str = ""
@@ -123,12 +122,7 @@ class SkinConfig:
banner_hero: str = "" # Rich-markup hero art (replaces HERMES_CADUCEUS)
def get_color(self, key: str, fallback: str = "") -> str:
"""Get a color value with fallback.
In light theme mode, returns the light override if available.
"""
if get_theme_mode() == "light" and key in self.colors_light:
return self.colors_light[key]
"""Get a color value with fallback."""
return self.colors.get(key, fallback)
def get_spinner_list(self, key: str) -> List[str]:
@@ -174,21 +168,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#DAA520",
"session_border": "#8B8682",
},
"colors_light": {
"banner_border": "#7A5A00",
"banner_title": "#6B4C00",
"banner_accent": "#7A5500",
"banner_dim": "#8B7355",
"banner_text": "#3D2B00",
"prompt": "#3D2B00",
"ui_accent": "#7A5500",
"ui_label": "#01579B",
"ui_ok": "#1B5E20",
"input_rule": "#7A5A00",
"response_border": "#6B4C00",
"session_label": "#5C4300",
"session_border": "#8B7355",
},
"spinner": {
# Empty = use hardcoded defaults in display.py
},
@@ -222,21 +201,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#C7A96B",
"session_border": "#6E584B",
},
"colors_light": {
"banner_border": "#6B1010",
"banner_title": "#5C4300",
"banner_accent": "#8B1A1A",
"banner_dim": "#5C4030",
"banner_text": "#3A1800",
"prompt": "#3A1800",
"ui_accent": "#8B1A1A",
"ui_label": "#5C4300",
"ui_ok": "#1B5E20",
"input_rule": "#6B1010",
"response_border": "#7A1515",
"session_label": "#5C4300",
"session_border": "#5C4A3A",
},
"spinner": {
"waiting_faces": ["(⚔)", "(⛨)", "(▲)", "(<>)", "(/)"],
"thinking_faces": ["(⚔)", "(⛨)", "(▲)", "(⌁)", "(<>)"],
@@ -301,22 +265,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#888888",
"session_border": "#555555",
},
"colors_light": {
"banner_border": "#333333",
"banner_title": "#222222",
"banner_accent": "#333333",
"banner_dim": "#555555",
"banner_text": "#333333",
"prompt": "#222222",
"ui_accent": "#333333",
"ui_label": "#444444",
"ui_ok": "#444444",
"ui_error": "#333333",
"input_rule": "#333333",
"response_border": "#444444",
"session_label": "#444444",
"session_border": "#666666",
},
"spinner": {},
"branding": {
"agent_name": "Hermes Agent",
@@ -348,21 +296,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#7eb8f6",
"session_border": "#4b5563",
},
"colors_light": {
"banner_border": "#1A3A7A",
"banner_title": "#1A3570",
"banner_accent": "#1E4090",
"banner_dim": "#3B4555",
"banner_text": "#1A2A50",
"prompt": "#1A2A50",
"ui_accent": "#1A3570",
"ui_label": "#1E3A80",
"ui_ok": "#1B5E20",
"input_rule": "#1A3A7A",
"response_border": "#2A4FA0",
"session_label": "#1A3570",
"session_border": "#5A6070",
},
"spinner": {},
"branding": {
"agent_name": "Hermes Agent",
@@ -394,21 +327,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#A9DFFF",
"session_border": "#496884",
},
"colors_light": {
"banner_border": "#0D3060",
"banner_title": "#0D3060",
"banner_accent": "#154080",
"banner_dim": "#2A4565",
"banner_text": "#0A2850",
"prompt": "#0A2850",
"ui_accent": "#0D3060",
"ui_label": "#0D3060",
"ui_ok": "#1B5E20",
"input_rule": "#0D3060",
"response_border": "#1A5090",
"session_label": "#0D3060",
"session_border": "#3A5575",
},
"spinner": {
"waiting_faces": ["(≈)", "(Ψ)", "(∿)", "(◌)", "(◠)"],
"thinking_faces": ["(Ψ)", "(∿)", "(≈)", "(⌁)", "(◌)"],
@@ -473,23 +391,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#919191",
"session_border": "#656565",
},
"colors_light": {
"banner_border": "#666666",
"banner_title": "#222222",
"banner_accent": "#333333",
"banner_dim": "#555555",
"banner_text": "#333333",
"prompt": "#222222",
"ui_accent": "#333333",
"ui_label": "#444444",
"ui_ok": "#444444",
"ui_error": "#333333",
"ui_warn": "#444444",
"input_rule": "#666666",
"response_border": "#555555",
"session_label": "#444444",
"session_border": "#777777",
},
"spinner": {
"waiting_faces": ["(◉)", "(◌)", "(◬)", "(⬤)", "(::)"],
"thinking_faces": ["(◉)", "(◬)", "(◌)", "(○)", "(●)"],
@@ -555,21 +456,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"session_label": "#FFD39A",
"session_border": "#6C4724",
},
"colors_light": {
"banner_border": "#7A3511",
"banner_title": "#5C2D00",
"banner_accent": "#8B4000",
"banner_dim": "#5A3A1A",
"banner_text": "#3A1E00",
"prompt": "#3A1E00",
"ui_accent": "#8B4000",
"ui_label": "#5C2D00",
"ui_ok": "#1B5E20",
"input_rule": "#7A3511",
"response_border": "#8B4513",
"session_label": "#5C2D00",
"session_border": "#6B5540",
},
"spinner": {
"waiting_faces": ["(✦)", "(▲)", "(◇)", "(<>)", "(🔥)"],
"thinking_faces": ["(✦)", "(▲)", "(◇)", "(⌁)", "(🔥)"],
@@ -623,8 +509,6 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
_active_skin: Optional[SkinConfig] = None
_active_skin_name: str = "default"
_theme_mode: str = "auto"
_resolved_theme_mode: Optional[str] = None
def _skins_dir() -> Path:
@@ -652,8 +536,6 @@ def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
default = _BUILTIN_SKINS["default"]
colors = dict(default.get("colors", {}))
colors.update(data.get("colors", {}))
colors_light = dict(default.get("colors_light", {}))
colors_light.update(data.get("colors_light", {}))
spinner = dict(default.get("spinner", {}))
spinner.update(data.get("spinner", {}))
branding = dict(default.get("branding", {}))
@@ -663,7 +545,6 @@ def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
name=data.get("name", "unknown"),
description=data.get("description", ""),
colors=colors,
colors_light=colors_light,
spinner=spinner,
branding=branding,
tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "")),
@@ -744,39 +625,6 @@ def get_active_skin_name() -> str:
return _active_skin_name
def get_theme_mode() -> str:
"""Return the resolved theme mode: "light" or "dark".
When ``_theme_mode`` is ``"auto"``, detection is attempted once and cached.
If detection returns ``"unknown"``, defaults to ``"dark"``.
"""
global _resolved_theme_mode
if _theme_mode in ("light", "dark"):
return _theme_mode
# Auto mode — detect and cache
if _resolved_theme_mode is None:
try:
from hermes_cli.colors import detect_terminal_background
detected = detect_terminal_background()
except Exception:
detected = "unknown"
_resolved_theme_mode = detected if detected in ("light", "dark") else "dark"
return _resolved_theme_mode
def set_theme_mode(mode: str) -> None:
"""Set the theme mode to "light", "dark", or "auto"."""
global _theme_mode, _resolved_theme_mode
_theme_mode = mode
# Reset cached detection so it re-runs on next get_theme_mode() if auto
_resolved_theme_mode = None
def get_theme_mode_setting() -> str:
"""Return the raw theme mode setting (may be "auto", "light", or "dark")."""
return _theme_mode
def init_skin_from_config(config: dict) -> None:
"""Initialize the active skin from CLI config at startup.
@@ -789,13 +637,6 @@ def init_skin_from_config(config: dict) -> None:
else:
set_active_skin("default")
# Theme mode
theme_mode = display.get("theme_mode", "auto")
if isinstance(theme_mode, str) and theme_mode.strip():
set_theme_mode(theme_mode.strip())
else:
set_theme_mode("auto")
# =============================================================================
# Convenience helpers for CLI modules
@@ -849,14 +690,6 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
warn = skin.get_color("ui_warn", "#FF8C00")
error = skin.get_color("ui_error", "#FF6B6B")
# Use lighter background colours for completion menus in light mode
if get_theme_mode() == "light":
menu_bg = "bg:#e8e8e8"
menu_sel_bg = "bg:#d0d0d0"
else:
menu_bg = "bg:#1a1a2e"
menu_sel_bg = "bg:#333355"
return {
"input-area": prompt,
"placeholder": f"{dim} italic",
@@ -865,11 +698,11 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
"hint": f"{dim} italic",
"input-rule": input_rule,
"image-badge": f"{label} bold",
"completion-menu": f"{menu_bg} {text}",
"completion-menu.completion": f"{menu_bg} {text}",
"completion-menu.completion.current": f"{menu_sel_bg} {title}",
"completion-menu.meta.completion": f"{menu_bg} {dim}",
"completion-menu.meta.completion.current": f"{menu_sel_bg} {label}",
"completion-menu": f"bg:#1a1a2e {text}",
"completion-menu.completion": f"bg:#1a1a2e {text}",
"completion-menu.completion.current": f"bg:#333355 {title}",
"completion-menu.meta.completion": f"bg:#1a1a2e {dim}",
"completion-menu.meta.completion.current": f"bg:#333355 {label}",
"clarify-border": input_rule,
"clarify-title": f"{title} bold",
"clarify-question": f"{text} bold",

View File

@@ -350,11 +350,12 @@ class SessionDB:
.replace("%", "\\%")
.replace("_", "\\_")
)
cursor = self._conn.execute(
"SELECT id FROM sessions WHERE id LIKE ? ESCAPE '\\' ORDER BY started_at DESC LIMIT 2",
(f"{escaped}%",),
)
matches = [row["id"] for row in cursor.fetchall()]
with self._lock:
cursor = self._conn.execute(
"SELECT id FROM sessions WHERE id LIKE ? ESCAPE '\\' ORDER BY started_at DESC LIMIT 2",
(f"{escaped}%",),
)
matches = [row["id"] for row in cursor.fetchall()]
if len(matches) == 1:
return matches[0]
return None
@@ -688,21 +689,45 @@ class SessionDB:
``NOT``) have special meaning. Passing raw user input directly to
MATCH can cause ``sqlite3.OperationalError``.
Strategy: strip characters that are only meaningful as FTS5 operators
and would otherwise cause syntax errors. This preserves normal keyword
search while preventing crashes on inputs like ``C++``, ``"unterminated``,
or ``hello AND``.
Strategy:
- Preserve properly paired quoted phrases (``"exact phrase"``)
- Strip unmatched FTS5-special characters that would cause errors
- Wrap unquoted hyphenated terms in quotes so FTS5 matches them
as exact phrases instead of splitting on the hyphen
"""
# Remove FTS5-special characters that are not useful in keyword search
sanitized = re.sub(r'[+{}()"^]', " ", query)
# Collapse repeated * (e.g. "***") into a single one, and remove
# leading * (prefix-only matching requires at least one char before *)
# Step 1: Extract balanced double-quoted phrases and protect them
# from further processing via numbered placeholders.
_quoted_parts: list = []
def _preserve_quoted(m: re.Match) -> str:
_quoted_parts.append(m.group(0))
return f"\x00Q{len(_quoted_parts) - 1}\x00"
sanitized = re.sub(r'"[^"]*"', _preserve_quoted, query)
# Step 2: Strip remaining (unmatched) FTS5-special characters
sanitized = re.sub(r'[+{}()\"^]', " ", sanitized)
# Step 3: Collapse repeated * (e.g. "***") into a single one,
# and remove leading * (prefix-only needs at least one char before *)
sanitized = re.sub(r"\*+", "*", sanitized)
sanitized = re.sub(r"(^|\s)\*", r"\1", sanitized)
# Remove dangling boolean operators at start/end that would cause
# syntax errors (e.g. "hello AND" or "OR world")
# Step 4: Remove dangling boolean operators at start/end that would
# cause syntax errors (e.g. "hello AND" or "OR world")
sanitized = re.sub(r"(?i)^(AND|OR|NOT)\b\s*", "", sanitized.strip())
sanitized = re.sub(r"(?i)\s+(AND|OR|NOT)\s*$", "", sanitized.strip())
# Step 5: Wrap unquoted hyphenated terms (e.g. ``chat-send``) in
# double quotes. FTS5's tokenizer splits on hyphens, turning
# ``chat-send`` into ``chat AND send``. Quoting preserves the
# intended phrase match.
sanitized = re.sub(r"\b(\w+(?:-\w+)+)\b", r'"\1"', sanitized)
# Step 6: Restore preserved quoted phrases
for i, quoted in enumerate(_quoted_parts):
sanitized = sanitized.replace(f"\x00Q{i}\x00", quoted)
return sanitized.strip()
def search_messages(

View File

@@ -101,7 +101,7 @@ def _discover_tools():
try:
importlib.import_module(mod_name)
except Exception as e:
logger.debug("Could not import %s: %s", mod_name, e)
logger.warning("Could not import tool module %s: %s", mod_name, e)
_discover_tools()

View File

@@ -837,10 +837,17 @@ class AIAgent:
# Initialize context compressor for automatic context management
# Compresses conversation when approaching model's context limit
# Configuration via config.yaml (compression section) or environment variables
compression_threshold = float(os.getenv("CONTEXT_COMPRESSION_THRESHOLD", "0.50"))
compression_enabled = os.getenv("CONTEXT_COMPRESSION_ENABLED", "true").lower() in ("true", "1", "yes")
compression_summary_model = os.getenv("CONTEXT_COMPRESSION_MODEL") or None
# Configuration via config.yaml (compression section)
try:
from hermes_cli.config import load_config as _load_compression_config
_compression_cfg = _load_compression_config().get("compression", {})
if not isinstance(_compression_cfg, dict):
_compression_cfg = {}
except ImportError:
_compression_cfg = {}
compression_threshold = float(_compression_cfg.get("threshold", 0.50))
compression_enabled = str(_compression_cfg.get("enabled", True)).lower() in ("true", "1", "yes")
compression_summary_model = _compression_cfg.get("summary_model") or None
self.context_compressor = ContextCompressor(
model=self.model,

View File

@@ -525,14 +525,16 @@ class TestTaskSpecificOverrides:
assert model == "google/gemini-3-flash-preview" # OpenRouter, not Nous
def test_compression_task_reads_context_prefix(self, monkeypatch):
"""Compression task should check CONTEXT_COMPRESSION_PROVIDER."""
"""Compression task should check CONTEXT_COMPRESSION_PROVIDER env var."""
monkeypatch.setenv("CONTEXT_COMPRESSION_PROVIDER", "nous")
monkeypatch.setenv("OPENROUTER_API_KEY", "or-key") # would win in auto
with patch("agent.auxiliary_client._read_nous_auth") as mock_nous, \
patch("agent.auxiliary_client.OpenAI"):
mock_nous.return_value = {"access_token": "nous-tok"}
mock_nous.return_value = {"access_token": "***"}
client, model = get_text_auxiliary_client("compression")
assert model == "gemini-3-flash" # forced to Nous, not OpenRouter
# Config-first: model comes from config.yaml summary_model default,
# but provider is forced to Nous via env var
assert client is not None
def test_web_extract_task_override(self, monkeypatch):
monkeypatch.setenv("AUXILIARY_WEB_EXTRACT_PROVIDER", "openrouter")
@@ -566,6 +568,25 @@ class TestTaskSpecificOverrides:
client, model = get_text_auxiliary_client("compression")
assert model == "google/gemini-3-flash-preview" # auto → OpenRouter
def test_compression_summary_base_url_from_config(self, monkeypatch, tmp_path):
"""compression.summary_base_url should produce a custom-endpoint client."""
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "config.yaml").write_text(
"""compression:
summary_provider: custom
summary_model: glm-4.7
summary_base_url: https://api.z.ai/api/coding/paas/v4
"""
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
# Custom endpoints need an API key to build the client
monkeypatch.setenv("OPENAI_API_KEY", "test-key")
with patch("agent.auxiliary_client.OpenAI") as mock_openai:
client, model = get_text_auxiliary_client("compression")
assert model == "glm-4.7"
assert mock_openai.call_args.kwargs["base_url"] == "https://api.z.ai/api/coding/paas/v4"
class TestAuxiliaryMaxTokensParam:
def test_codex_fallback_uses_max_tokens(self, monkeypatch):

View File

@@ -111,7 +111,11 @@ class TestCompress:
# First 2 messages should be preserved (protect_first_n=2)
# Last 2 messages should be preserved (protect_last_n=2)
assert result[-1]["content"] == msgs[-1]["content"]
assert result[-2]["content"] == msgs[-2]["content"]
# The second-to-last tail message may have the summary merged
# into it when a double-collision prevents a standalone summary
# (head=assistant, tail=user in this fixture). Verify the
# original content is present in either case.
assert msgs[-2]["content"] in result[-2]["content"]
class TestGenerateSummaryNoneContent:
@@ -329,6 +333,146 @@ class TestCompressWithClient:
assert len(summary_msg) == 1
assert summary_msg[0]["role"] == "assistant"
def test_summary_role_flips_to_avoid_tail_collision(self):
"""When summary role collides with the first tail message but flipping
doesn't collide with head, the role should be flipped."""
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "summary text"
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=2, protect_last_n=2)
# Head ends with tool (index 1), tail starts with user (index 6).
# Default: tool → summary_role="user" → collides with tail.
# Flip to "assistant" → tool→assistant is fine.
msgs = [
{"role": "user", "content": "msg 0"},
{"role": "assistant", "content": "", "tool_calls": [
{"id": "call_1", "type": "function", "function": {"name": "t", "arguments": "{}"}},
]},
{"role": "tool", "tool_call_id": "call_1", "content": "result 1"},
{"role": "assistant", "content": "msg 3"},
{"role": "user", "content": "msg 4"},
{"role": "assistant", "content": "msg 5"},
{"role": "user", "content": "msg 6"},
{"role": "assistant", "content": "msg 7"},
]
with patch("agent.context_compressor.call_llm", return_value=mock_response):
result = c.compress(msgs)
# Verify no consecutive user or assistant messages
for i in range(1, len(result)):
r1 = result[i - 1].get("role")
r2 = result[i].get("role")
if r1 in ("user", "assistant") and r2 in ("user", "assistant"):
assert r1 != r2, f"consecutive {r1} at indices {i-1},{i}"
def test_double_collision_merges_summary_into_tail(self):
"""When neither role avoids collision with both neighbors, the summary
should be merged into the first tail message rather than creating a
standalone message that breaks role alternation.
Common scenario: head ends with 'assistant', tail starts with 'user'.
summary='user' collides with tail, summary='assistant' collides with head.
"""
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "summary text"
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=3, protect_last_n=3)
# Head: [system, user, assistant] → last head = assistant
# Tail: [user, assistant, user] → first tail = user
# summary_role="user" collides with tail, "assistant" collides with head → merge
msgs = [
{"role": "system", "content": "system prompt"},
{"role": "user", "content": "msg 1"},
{"role": "assistant", "content": "msg 2"},
{"role": "user", "content": "msg 3"}, # compressed
{"role": "assistant", "content": "msg 4"}, # compressed
{"role": "user", "content": "msg 5"}, # compressed
{"role": "user", "content": "msg 6"}, # tail start
{"role": "assistant", "content": "msg 7"},
{"role": "user", "content": "msg 8"},
]
with patch("agent.context_compressor.call_llm", return_value=mock_response):
result = c.compress(msgs)
# Verify no consecutive user or assistant messages
for i in range(1, len(result)):
r1 = result[i - 1].get("role")
r2 = result[i].get("role")
if r1 in ("user", "assistant") and r2 in ("user", "assistant"):
assert r1 != r2, f"consecutive {r1} at indices {i-1},{i}"
# The summary text should be merged into the first tail message
first_tail = [m for m in result if "msg 6" in (m.get("content") or "")]
assert len(first_tail) == 1
assert "summary text" in first_tail[0]["content"]
def test_double_collision_user_head_assistant_tail(self):
"""Reverse double collision: head ends with 'user', tail starts with 'assistant'.
summary='assistant' collides with tail, 'user' collides with head → merge."""
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "summary text"
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=2, protect_last_n=2)
# Head: [system, user] → last head = user
# Tail: [assistant, user] → first tail = assistant
# summary_role="assistant" collides with tail, "user" collides with head → merge
msgs = [
{"role": "system", "content": "system prompt"},
{"role": "user", "content": "msg 1"},
{"role": "assistant", "content": "msg 2"}, # compressed
{"role": "user", "content": "msg 3"}, # compressed
{"role": "assistant", "content": "msg 4"}, # compressed
{"role": "assistant", "content": "msg 5"}, # tail start
{"role": "user", "content": "msg 6"},
]
with patch("agent.context_compressor.call_llm", return_value=mock_response):
result = c.compress(msgs)
# Verify no consecutive user or assistant messages
for i in range(1, len(result)):
r1 = result[i - 1].get("role")
r2 = result[i].get("role")
if r1 in ("user", "assistant") and r2 in ("user", "assistant"):
assert r1 != r2, f"consecutive {r1} at indices {i-1},{i}"
# The summary should be merged into the first tail message (assistant)
first_tail = [m for m in result if "msg 5" in (m.get("content") or "")]
assert len(first_tail) == 1
assert "summary text" in first_tail[0]["content"]
def test_no_collision_scenarios_still_work(self):
"""Verify that the common no-collision cases (head=assistant/tail=assistant,
head=user/tail=user) still produce a standalone summary message."""
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "summary text"
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=2, protect_last_n=2)
# Head=assistant, Tail=assistant → summary_role="user", no collision
msgs = [
{"role": "user", "content": "msg 0"},
{"role": "assistant", "content": "msg 1"},
{"role": "user", "content": "msg 2"},
{"role": "assistant", "content": "msg 3"},
{"role": "assistant", "content": "msg 4"},
{"role": "user", "content": "msg 5"},
]
with patch("agent.context_compressor.call_llm", return_value=mock_response):
result = c.compress(msgs)
summary_msgs = [m for m in result if (m.get("content") or "").startswith(SUMMARY_PREFIX)]
assert len(summary_msgs) == 1, "should have a standalone summary message"
assert summary_msgs[0]["role"] == "user"
def test_summarization_does_not_start_tail_with_tool_outputs(self):
mock_response = MagicMock()
mock_response.choices = [MagicMock()]

View File

@@ -336,6 +336,56 @@ class TestSessionStoreRewriteTranscript:
assert reloaded == []
class TestLoadTranscriptCorruptLines:
"""Regression: corrupt JSONL lines (e.g. from mid-write crash) must be
skipped instead of crashing the entire transcript load. GH-1193."""
@pytest.fixture()
def store(self, tmp_path):
config = GatewayConfig()
with patch("gateway.session.SessionStore._ensure_loaded"):
s = SessionStore(sessions_dir=tmp_path, config=config)
s._db = None
s._loaded = True
return s
def test_corrupt_line_skipped(self, store, tmp_path):
session_id = "corrupt_test"
transcript_path = store.get_transcript_path(session_id)
transcript_path.parent.mkdir(parents=True, exist_ok=True)
with open(transcript_path, "w") as f:
f.write('{"role": "user", "content": "hello"}\n')
f.write('{"role": "assistant", "content": "hi th') # truncated
f.write("\n")
f.write('{"role": "user", "content": "goodbye"}\n')
messages = store.load_transcript(session_id)
assert len(messages) == 2
assert messages[0]["content"] == "hello"
assert messages[1]["content"] == "goodbye"
def test_all_lines_corrupt_returns_empty(self, store, tmp_path):
session_id = "all_corrupt"
transcript_path = store.get_transcript_path(session_id)
transcript_path.parent.mkdir(parents=True, exist_ok=True)
with open(transcript_path, "w") as f:
f.write("not json at all\n")
f.write("{truncated\n")
messages = store.load_transcript(session_id)
assert messages == []
def test_valid_transcript_unaffected(self, store, tmp_path):
session_id = "valid_test"
store.append_to_transcript(session_id, {"role": "user", "content": "a"})
store.append_to_transcript(session_id, {"role": "assistant", "content": "b"})
messages = store.load_transcript(session_id)
assert len(messages) == 2
assert messages[0]["content"] == "a"
assert messages[1]["content"] == "b"
class TestWhatsAppDMSessionKeyConsistency:
"""Regression: all session-key construction must go through build_session_key
so DMs are isolated by chat_id across platforms."""

View File

@@ -85,6 +85,13 @@ class TestGeneratedSystemdUnits:
assert "ExecStop=" not in unit
assert "TimeoutStopSec=60" in unit
def test_user_unit_includes_resolved_node_directory_in_path(self, monkeypatch):
monkeypatch.setattr(gateway_cli.shutil, "which", lambda cmd: "/home/test/.nvm/versions/node/v24.14.0/bin/node" if cmd == "node" else None)
unit = gateway_cli.generate_systemd_unit(system=False)
assert "/home/test/.nvm/versions/node/v24.14.0/bin" in unit
def test_system_unit_avoids_recursive_execstop_and_uses_extended_stop_timeout(self):
unit = gateway_cli.generate_systemd_unit(system=True)

View File

@@ -13,13 +13,9 @@ def reset_skin_state():
from hermes_cli import skin_engine
skin_engine._active_skin = None
skin_engine._active_skin_name = "default"
skin_engine._theme_mode = "auto"
skin_engine._resolved_theme_mode = None
yield
skin_engine._active_skin = None
skin_engine._active_skin_name = "default"
skin_engine._theme_mode = "auto"
skin_engine._resolved_theme_mode = None
class TestSkinConfig:
@@ -316,65 +312,3 @@ class TestCliBrandingHelpers:
assert overrides["clarify-title"] == f"{skin.get_color('banner_title')} bold"
assert overrides["sudo-prompt"] == f"{skin.get_color('ui_error')} bold"
assert overrides["approval-title"] == f"{skin.get_color('ui_warn')} bold"
class TestThemeMode:
def test_get_theme_mode_defaults_to_dark_on_unknown(self):
from hermes_cli.skin_engine import get_theme_mode, set_theme_mode
set_theme_mode("auto")
# In a test env, detection returns "unknown" → defaults to "dark"
with patch("hermes_cli.colors.detect_terminal_background", return_value="unknown"):
from hermes_cli import skin_engine
skin_engine._resolved_theme_mode = None # force re-detection
assert get_theme_mode() == "dark"
def test_set_theme_mode_light(self):
from hermes_cli.skin_engine import get_theme_mode, set_theme_mode
set_theme_mode("light")
assert get_theme_mode() == "light"
def test_set_theme_mode_dark(self):
from hermes_cli.skin_engine import get_theme_mode, set_theme_mode
set_theme_mode("dark")
assert get_theme_mode() == "dark"
def test_get_color_respects_light_mode(self):
from hermes_cli.skin_engine import SkinConfig, set_theme_mode
skin = SkinConfig(
name="test",
colors={"banner_title": "#FFD700", "prompt": "#FFF8DC"},
colors_light={"banner_title": "#6B4C00"},
)
set_theme_mode("light")
assert skin.get_color("banner_title") == "#6B4C00"
# Key not in colors_light falls back to colors
assert skin.get_color("prompt") == "#FFF8DC"
def test_get_color_falls_back_in_dark_mode(self):
from hermes_cli.skin_engine import SkinConfig, set_theme_mode
skin = SkinConfig(
name="test",
colors={"banner_title": "#FFD700", "prompt": "#FFF8DC"},
colors_light={"banner_title": "#6B4C00"},
)
set_theme_mode("dark")
assert skin.get_color("banner_title") == "#FFD700"
assert skin.get_color("prompt") == "#FFF8DC"
def test_init_skin_from_config_reads_theme_mode(self):
from hermes_cli.skin_engine import init_skin_from_config, get_theme_mode_setting
init_skin_from_config({"display": {"skin": "default", "theme_mode": "light"}})
assert get_theme_mode_setting() == "light"
def test_builtin_skins_have_colors_light(self):
from hermes_cli.skin_engine import _BUILTIN_SKINS, _build_skin_config
for name, data in _BUILTIN_SKINS.items():
skin = _build_skin_config(data)
assert len(skin.colors_light) > 0, f"Skin '{name}' has empty colors_light"

View File

@@ -28,22 +28,10 @@ def _run_auxiliary_bridge(config_dict, monkeypatch):
"AUXILIARY_VISION_BASE_URL", "AUXILIARY_VISION_API_KEY",
"AUXILIARY_WEB_EXTRACT_PROVIDER", "AUXILIARY_WEB_EXTRACT_MODEL",
"AUXILIARY_WEB_EXTRACT_BASE_URL", "AUXILIARY_WEB_EXTRACT_API_KEY",
"CONTEXT_COMPRESSION_PROVIDER", "CONTEXT_COMPRESSION_MODEL",
):
monkeypatch.delenv(key, raising=False)
# Compression bridge
compression_cfg = config_dict.get("compression", {})
if compression_cfg and isinstance(compression_cfg, dict):
compression_env_map = {
"enabled": "CONTEXT_COMPRESSION_ENABLED",
"threshold": "CONTEXT_COMPRESSION_THRESHOLD",
"summary_model": "CONTEXT_COMPRESSION_MODEL",
"summary_provider": "CONTEXT_COMPRESSION_PROVIDER",
}
for cfg_key, env_var in compression_env_map.items():
if cfg_key in compression_cfg:
os.environ[env_var] = str(compression_cfg[cfg_key])
# Compression config is read directly from config.yaml — no env var bridging.
# Auxiliary bridge
auxiliary_cfg = config_dict.get("auxiliary", {})
@@ -134,17 +122,6 @@ class TestAuxiliaryConfigBridge:
assert os.environ.get("AUXILIARY_VISION_API_KEY") == "local-key"
assert os.environ.get("AUXILIARY_VISION_MODEL") == "qwen2.5-vl"
def test_compression_provider_bridged(self, monkeypatch):
config = {
"compression": {
"summary_provider": "nous",
"summary_model": "gemini-3-flash",
}
}
_run_auxiliary_bridge(config, monkeypatch)
assert os.environ.get("CONTEXT_COMPRESSION_PROVIDER") == "nous"
assert os.environ.get("CONTEXT_COMPRESSION_MODEL") == "gemini-3-flash"
def test_empty_values_not_bridged(self, monkeypatch):
config = {
"auxiliary": {
@@ -186,18 +163,12 @@ class TestAuxiliaryConfigBridge:
def test_all_tasks_with_overrides(self, monkeypatch):
config = {
"compression": {
"summary_provider": "main",
"summary_model": "local-model",
},
"auxiliary": {
"vision": {"provider": "openrouter", "model": "google/gemini-2.5-flash"},
"web_extract": {"provider": "nous", "model": "gemini-3-flash"},
}
}
_run_auxiliary_bridge(config, monkeypatch)
assert os.environ.get("CONTEXT_COMPRESSION_PROVIDER") == "main"
assert os.environ.get("CONTEXT_COMPRESSION_MODEL") == "local-model"
assert os.environ.get("AUXILIARY_VISION_PROVIDER") == "openrouter"
assert os.environ.get("AUXILIARY_VISION_MODEL") == "google/gemini-2.5-flash"
assert os.environ.get("AUXILIARY_WEB_EXTRACT_PROVIDER") == "nous"
@@ -240,12 +211,12 @@ class TestGatewayBridgeCodeParity:
assert "AUXILIARY_WEB_EXTRACT_BASE_URL" in content
assert "AUXILIARY_WEB_EXTRACT_API_KEY" in content
def test_gateway_has_compression_provider(self):
"""Gateway must bridge compression.summary_provider."""
def test_gateway_no_compression_env_bridge(self):
"""Gateway should NOT bridge compression config to env vars (config-only)."""
gateway_path = Path(__file__).parent.parent / "gateway" / "run.py"
content = gateway_path.read_text()
assert "summary_provider" in content
assert "CONTEXT_COMPRESSION_PROVIDER" in content
assert "CONTEXT_COMPRESSION_PROVIDER" not in content
assert "CONTEXT_COMPRESSION_MODEL" not in content
# ── Vision model override tests ──────────────────────────────────────────────
@@ -308,6 +279,12 @@ class TestDefaultConfigShape:
assert "summary_provider" in compression
assert compression["summary_provider"] == "auto"
def test_compression_base_url_default(self):
from hermes_cli.config import DEFAULT_CONFIG
compression = DEFAULT_CONFIG["compression"]
assert "summary_base_url" in compression
assert compression["summary_base_url"] is None
# ── CLI defaults parity ─────────────────────────────────────────────────────

View File

@@ -261,6 +261,30 @@ class TestFTS5Search:
# The word "C" appears in the content, so FTS5 should find it
assert isinstance(results, list)
def test_search_hyphenated_term_does_not_crash(self, db):
"""Hyphenated terms like 'chat-send' must not crash FTS5."""
db.create_session(session_id="s1", source="cli")
db.append_message("s1", role="user", content="Run the chat-send command")
results = db.search_messages("chat-send")
assert isinstance(results, list)
assert len(results) >= 1
assert any("chat-send" in (r.get("snippet") or r.get("content", "")).lower()
for r in results)
def test_search_quoted_phrase_preserved(self, db):
"""User-provided quoted phrases should be preserved for exact matching."""
db.create_session(session_id="s1", source="cli")
db.append_message("s1", role="user", content="docker networking is complex")
db.append_message("s1", role="assistant", content="networking docker tips")
# Quoted phrase should match only the exact order
results = db.search_messages('"docker networking"')
assert isinstance(results, list)
# Should find the user message (exact phrase) but may or may not find
# the assistant message depending on FTS5 phrase matching
assert len(results) >= 1
def test_sanitize_fts5_query_strips_dangerous_chars(self):
"""Unit test for _sanitize_fts5_query static method."""
from hermes_state import SessionDB
@@ -278,6 +302,43 @@ class TestFTS5Search:
# Valid prefix kept
assert s('deploy*') == 'deploy*'
def test_sanitize_fts5_preserves_quoted_phrases(self):
"""Properly paired double-quoted phrases should be preserved."""
from hermes_state import SessionDB
s = SessionDB._sanitize_fts5_query
# Simple quoted phrase
assert s('"exact phrase"') == '"exact phrase"'
# Quoted phrase alongside unquoted terms
assert '"docker networking"' in s('"docker networking" setup')
# Multiple quoted phrases
result = s('"hello world" OR "foo bar"')
assert '"hello world"' in result
assert '"foo bar"' in result
# Unmatched quote still stripped
assert '"' not in s('"unterminated')
def test_sanitize_fts5_quotes_hyphenated_terms(self):
"""Hyphenated terms should be wrapped in quotes for exact matching."""
from hermes_state import SessionDB
s = SessionDB._sanitize_fts5_query
# Simple hyphenated term
assert s('chat-send') == '"chat-send"'
# Multiple hyphens
assert s('docker-compose-up') == '"docker-compose-up"'
# Hyphenated term with other words
result = s('fix chat-send bug')
assert '"chat-send"' in result
assert 'fix' in result
assert 'bug' in result
# Multiple hyphenated terms with OR
result = s('chat-send OR deploy-prod')
assert '"chat-send"' in result
assert '"deploy-prod"' in result
# Already-quoted hyphenated term — no double quoting
assert s('"chat-send"') == '"chat-send"'
# Hyphenated inside a quoted phrase stays as-is
assert s('"my chat-send thing"') == '"my chat-send thing"'
# =========================================================================
# Session search and listing

View File

@@ -216,6 +216,34 @@ def test_auto_mount_replaces_persistent_workspace_bind(monkeypatch, tmp_path):
assert "/sandboxes/docker/test-persistent-auto-mount/workspace:/workspace" not in run_args_str
def test_non_persistent_cleanup_removes_container(monkeypatch):
"""When container_persistent=false, cleanup() must run docker rm -f so the container is removed (Fixes #1679)."""
run_calls = []
def _run(cmd, **kwargs):
run_calls.append((list(cmd) if isinstance(cmd, list) else cmd, kwargs))
if cmd and getattr(cmd[0], "__str__", None) and "docker" in str(cmd[0]):
if len(cmd) >= 2 and cmd[1] == "run":
return subprocess.CompletedProcess(cmd, 0, stdout="abc123container\n", stderr="")
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
monkeypatch.setattr(docker_env, "find_docker", lambda: "/usr/bin/docker")
monkeypatch.setattr(docker_env.subprocess, "run", _run)
monkeypatch.setattr(docker_env.subprocess, "Popen", lambda *a, **k: type("P", (), {"poll": lambda: None, "wait": lambda **kw: None, "returncode": 0, "stdout": iter([]), "stdin": None})())
captured_run_args = []
_install_fake_minisweagent(monkeypatch, captured_run_args)
env = _make_dummy_env(persistent_filesystem=False, task_id="ephemeral-task")
assert env._container_id
container_id = env._container_id
env.cleanup()
rm_calls = [c for c in run_calls if isinstance(c[0], list) and len(c[0]) >= 4 and c[0][1:4] == ["rm", "-f", container_id]]
assert len(rm_calls) >= 1, "cleanup() should run docker rm -f <container_id> when container_persistent=false"
class _FakePopen:
def __init__(self, cmd, **kwargs):
self.cmd = cmd

View File

@@ -398,6 +398,25 @@ class TestSendToPlatformChunking:
# ---------------------------------------------------------------------------
class TestSendToPlatformWhatsapp:
def test_whatsapp_routes_via_local_bridge_sender(self):
chat_id = "test-user@lid"
async_mock = AsyncMock(return_value={"success": True, "platform": "whatsapp", "chat_id": chat_id, "message_id": "abc123"})
with patch("tools.send_message_tool._send_whatsapp", async_mock):
result = asyncio.run(
_send_to_platform(
Platform.WHATSAPP,
SimpleNamespace(enabled=True, token=None, extra={"bridge_port": 3000}),
chat_id,
"hello from hermes",
)
)
assert result["success"] is True
async_mock.assert_awaited_once_with({"bridge_port": 3000}, chat_id, "hello from hermes")
class TestSendTelegramHtmlDetection:
"""Verify that messages containing HTML tags are sent with parse_mode=HTML
and that plain / markdown messages use MarkdownV2."""

View File

@@ -154,6 +154,34 @@ class TestShouldAllowInstall:
assert allowed is True
assert "Force-installed" in reason
# -- agent-created policy --
def test_safe_agent_created_allowed(self):
allowed, _ = should_allow_install(self._result("agent-created", "safe"))
assert allowed is True
def test_caution_agent_created_allowed(self):
"""Agent-created skills with caution verdict (e.g. docker refs) should pass."""
f = [Finding("docker_pull", "medium", "supply_chain", "SKILL.md", 1, "docker pull img", "pulls Docker image")]
allowed, reason = should_allow_install(self._result("agent-created", "caution", f))
assert allowed is True
assert "agent-created" in reason
def test_dangerous_agent_created_blocked(self):
"""Agent-created skills with dangerous verdict (critical findings) stay blocked."""
f = [Finding("env_exfil_curl", "critical", "exfiltration", "SKILL.md", 1, "curl $TOKEN", "exfiltration")]
allowed, reason = should_allow_install(self._result("agent-created", "dangerous", f))
assert allowed is False
assert "Blocked" in reason
def test_force_overrides_dangerous_for_agent_created(self):
f = [Finding("x", "critical", "c", "f", 1, "m", "d")]
allowed, reason = should_allow_install(
self._result("agent-created", "dangerous", f), force=True
)
assert allowed is True
assert "Force-installed" in reason
# ---------------------------------------------------------------------------
# scan_file — pattern detection

View File

@@ -331,6 +331,8 @@ async def _send_to_platform(platform, pconfig, chat_id, message, thread_id=None,
result = await _send_discord(pconfig.token, chat_id, chunk)
elif platform == Platform.SLACK:
result = await _send_slack(pconfig.token, chat_id, chunk)
elif platform == Platform.WHATSAPP:
result = await _send_whatsapp(pconfig.extra, chat_id, chunk)
elif platform == Platform.SIGNAL:
result = await _send_signal(pconfig.extra, chat_id, chunk)
elif platform == Platform.EMAIL:
@@ -514,6 +516,34 @@ async def _send_slack(token, chat_id, message):
return {"error": f"Slack send failed: {e}"}
async def _send_whatsapp(extra, chat_id, message):
"""Send via the local WhatsApp bridge HTTP API."""
try:
import aiohttp
except ImportError:
return {"error": "aiohttp not installed. Run: pip install aiohttp"}
try:
bridge_port = extra.get("bridge_port", 3000)
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://localhost:{bridge_port}/send",
json={"chatId": chat_id, "message": message},
timeout=aiohttp.ClientTimeout(total=30),
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"success": True,
"platform": "whatsapp",
"chat_id": chat_id,
"message_id": data.get("messageId"),
}
body = await resp.text()
return {"error": f"WhatsApp bridge error ({resp.status}): {body}"}
except Exception as e:
return {"error": f"WhatsApp send failed: {e}"}
async def _send_signal(extra, chat_id, message):
"""Send via signal-cli JSON-RPC API."""
try:

View File

@@ -43,7 +43,7 @@ INSTALL_POLICY = {
"builtin": ("allow", "allow", "allow"),
"trusted": ("allow", "allow", "block"),
"community": ("allow", "block", "block"),
"agent-created": ("allow", "block", "block"),
"agent-created": ("allow", "allow", "block"),
}
VERDICT_INDEX = {"safe": 0, "caution": 1, "dangerous": 2}

View File

@@ -218,13 +218,18 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| `SESSION_IDLE_MINUTES` | Reset sessions after N minutes of inactivity (default: 1440) |
| `SESSION_RESET_HOUR` | Daily reset hour in 24h format (default: 4 = 4am) |
## Context Compression
## Context Compression (config.yaml only)
| Variable | Description |
|----------|-------------|
| `CONTEXT_COMPRESSION_ENABLED` | Enable auto-compression (default: `true`) |
| `CONTEXT_COMPRESSION_THRESHOLD` | Trigger at this % of limit (default: 0.50) |
| `CONTEXT_COMPRESSION_MODEL` | Model for summaries |
Context compression is configured exclusively through the `compression` section in `config.yaml` — there are no environment variables for it.
```yaml
compression:
enabled: true
threshold: 0.50
summary_model: google/gemini-3-flash-preview
summary_provider: auto
summary_base_url: null # Custom OpenAI-compatible endpoint for summaries
```
## Auxiliary Task Overrides
@@ -238,8 +243,6 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| `AUXILIARY_WEB_EXTRACT_MODEL` | Override model for web extraction/summarization |
| `AUXILIARY_WEB_EXTRACT_BASE_URL` | Direct OpenAI-compatible endpoint for web extraction/summarization |
| `AUXILIARY_WEB_EXTRACT_API_KEY` | API key paired with `AUXILIARY_WEB_EXTRACT_BASE_URL` |
| `CONTEXT_COMPRESSION_PROVIDER` | Override provider for context compression summaries |
| `CONTEXT_COMPRESSION_MODEL` | Override model for context compression summaries |
For task-specific direct endpoints, Hermes uses the task's configured API key or `OPENAI_API_KEY`. It does not reuse `OPENROUTER_API_KEY` for those custom endpoints.

View File

@@ -681,13 +681,54 @@ node_modules/
## Context Compression
Hermes automatically compresses long conversations to stay within your model's context window. The compression summarizer is a separate LLM call — you can point it at any provider or endpoint.
All compression settings live in `config.yaml` (no environment variables).
### Full reference
```yaml
compression:
enabled: true # Toggle compression on/off
threshold: 0.50 # Compress at this % of context limit
summary_model: "google/gemini-3-flash-preview" # Model for summarization
summary_provider: "auto" # Provider: "auto", "openrouter", "nous", "codex", "main", etc.
summary_base_url: null # Custom OpenAI-compatible endpoint (overrides provider)
```
### Common setups
**Default (auto-detect) — no configuration needed:**
```yaml
compression:
enabled: true
threshold: 0.50 # Compress at 50% of context limit by default
summary_model: "google/gemini-3-flash-preview" # Model for summarization
# summary_provider: "auto" # "auto", "openrouter", "nous", "main"
threshold: 0.50
```
Uses the first available provider (OpenRouter → Nous → Codex) with Gemini Flash.
**Force a specific provider** (OAuth or API-key based):
```yaml
compression:
summary_provider: nous
summary_model: gemini-3-flash
```
Works with any provider: `nous`, `openrouter`, `codex`, `anthropic`, `main`, etc.
**Custom endpoint** (self-hosted, Ollama, zai, DeepSeek, etc.):
```yaml
compression:
summary_model: glm-4.7
summary_base_url: https://api.z.ai/api/coding/paas/v4
```
Points at a custom OpenAI-compatible endpoint. Uses `OPENAI_API_KEY` for auth.
### How the three knobs interact
| `summary_provider` | `summary_base_url` | Result |
|---------------------|---------------------|--------|
| `auto` (default) | not set | Auto-detect best available provider |
| `nous` / `openrouter` / etc. | not set | Force that provider, use its auth |
| any | set | Use the custom endpoint directly (provider ignored) |
The `summary_model` must support a context length at least as large as your main model's, since it receives the full middle section of the conversation for compression.
@@ -711,17 +752,31 @@ Budget pressure is enabled by default. The agent sees warnings naturally as part
## Auxiliary Models
Hermes uses lightweight "auxiliary" models for side tasks like image analysis, web page summarization, and browser screenshot analysis. By default, these use **Gemini Flash** via OpenRouter or Nous Portal — you don't need to configure anything.
Hermes uses lightweight "auxiliary" models for side tasks like image analysis, web page summarization, and browser screenshot analysis. By default, these use **Gemini Flash** via auto-detection — you don't need to configure anything.
To use a different model, add an `auxiliary` section to `~/.hermes/config.yaml`:
### The universal config pattern
Every model slot in Hermes — auxiliary tasks, compression, fallback — uses the same three knobs:
| Key | What it does | Default |
|-----|-------------|---------|
| `provider` | Which provider to use for auth and routing | `"auto"` |
| `model` | Which model to request | provider's default |
| `base_url` | Custom OpenAI-compatible endpoint (overrides provider) | not set |
When `base_url` is set, Hermes ignores the provider and calls that endpoint directly (using `api_key` or `OPENAI_API_KEY` for auth). When only `provider` is set, Hermes uses that provider's built-in auth and base URL.
Available providers: `auto`, `openrouter`, `nous`, `codex`, `anthropic`, `main`, `zai`, `kimi-coding`, `minimax`, and any provider registered in the [provider registry](/docs/reference/environment-variables).
### Full auxiliary config reference
```yaml
auxiliary:
# Image analysis (vision_analyze tool + browser screenshots)
vision:
provider: "auto" # "auto", "openrouter", "nous", "main"
provider: "auto" # "auto", "openrouter", "nous", "codex", "main", etc.
model: "" # e.g. "openai/gpt-4o", "google/gemini-2.5-flash"
base_url: "" # direct OpenAI-compatible endpoint (takes precedence over provider)
base_url: "" # Custom OpenAI-compatible endpoint (overrides provider)
api_key: "" # API key for base_url (falls back to OPENAI_API_KEY)
# Web page summarization + browser page text extraction
@@ -730,8 +785,19 @@ auxiliary:
model: "" # e.g. "google/gemini-2.5-flash"
base_url: ""
api_key: ""
# Dangerous command approval classifier
approval:
provider: "auto"
model: ""
base_url: ""
api_key: ""
```
:::info
Context compression has its own top-level `compression:` block with `summary_provider`, `summary_model`, and `summary_base_url` — see [Context Compression](#context-compression) above. The fallback model uses a `fallback_model:` block — see [Fallback Model](#fallback-model) above. All three follow the same provider/model/base_url pattern.
:::
### Changing the Vision Model
To use GPT-4o instead of Gemini Flash for image analysis:
@@ -817,18 +883,22 @@ If you use Codex OAuth as your main model provider, vision works automatically
**Vision requires a multimodal model.** If you set `provider: "main"`, make sure your endpoint supports multimodal/vision — otherwise image analysis will fail.
:::
### Environment Variables
### Environment Variables (legacy)
You can also configure auxiliary models via environment variables instead of `config.yaml`:
Auxiliary models can also be configured via environment variables. However, `config.yaml` is the preferred method — it's easier to manage and supports all options including `base_url` and `api_key`.
| Setting | Environment Variable |
|---------|---------------------|
| Vision provider | `AUXILIARY_VISION_PROVIDER` |
| Vision model | `AUXILIARY_VISION_MODEL` |
| Vision endpoint | `AUXILIARY_VISION_BASE_URL` |
| Vision API key | `AUXILIARY_VISION_API_KEY` |
| Web extract provider | `AUXILIARY_WEB_EXTRACT_PROVIDER` |
| Web extract model | `AUXILIARY_WEB_EXTRACT_MODEL` |
| Compression provider | `CONTEXT_COMPRESSION_PROVIDER` |
| Compression model | `CONTEXT_COMPRESSION_MODEL` |
| Web extract endpoint | `AUXILIARY_WEB_EXTRACT_BASE_URL` |
| Web extract API key | `AUXILIARY_WEB_EXTRACT_API_KEY` |
Compression and fallback model settings are config.yaml-only.
:::tip
Run `hermes config` to see your current auxiliary model settings. Overrides only show up when they differ from the defaults.

View File

@@ -210,16 +210,26 @@ auxiliary:
model: ""
```
Or via environment variables:
Every task above follows the same **provider / model / base_url** pattern. Context compression uses its own top-level block:
```bash
AUXILIARY_VISION_PROVIDER=openrouter
AUXILIARY_VISION_MODEL=openai/gpt-4o
AUXILIARY_WEB_EXTRACT_PROVIDER=nous
CONTEXT_COMPRESSION_PROVIDER=main
CONTEXT_COMPRESSION_MODEL=google/gemini-3-flash-preview
```yaml
compression:
summary_provider: main # Same provider options as auxiliary tasks
summary_model: google/gemini-3-flash-preview
summary_base_url: null # Custom OpenAI-compatible endpoint
```
And the fallback model uses:
```yaml
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
# base_url: http://localhost:8000/v1 # Optional custom endpoint
```
All three — auxiliary, compression, fallback — work the same way: set `provider` to pick who handles the request, `model` to pick which model, and `base_url` to point at a custom endpoint (overrides provider).
### Provider Options for Auxiliary Tasks
| Provider | Description | Requirements |