Move container detection to hermes_constants

Remove `_is_inside_container()` from `hermes_cli/config.py` and migrate callers to use `is_container()` from `hermes_constants`. This centralizes container environment detection in a single, reusable location.
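
For orientation, here is a minimal sketch of what a centralized container check could look like. The `hermes_constants` implementation is not shown in this diff, so the function name `is_container_sketch`, the marker files, and the cgroup keywords below are all assumptions chosen for illustration. They are not the project's actual logic.

```python
from pathlib import Path

# Hypothetical signals; the real is_container() may check different ones.
_MARKER_FILES = (".dockerenv", "run/.containerenv")
_CGROUP_KEYWORDS = ("docker", "kubepods", "containerd", "lxc", "podman")


def is_container_sketch(root: Path = Path("/")) -> bool:
    """Best-effort guess at whether the process runs inside a container.

    ``root`` is parameterized only so the check can be exercised against a
    fake filesystem tree in tests.
    """
    # Docker and Podman typically drop a marker file into the container root.
    if any((root / marker).exists() for marker in _MARKER_FILES):
        return True
    # Otherwise look for a container runtime name in PID 1's cgroup listing.
    try:
        cgroup_text = (root / "proc/1/cgroup").read_text(errors="ignore")
    except OSError:
        return False
    return any(keyword in cgroup_text for keyword in _CGROUP_KEYWORDS)
```

Keeping a helper like this in one module means callers such as `hermes_cli/config.py` import it rather than carrying their own copy, which is the consolidation this change describes.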
Add container detection utility to hermes_constants
2026-04-12 13:08:23 -07:00 · 2026-04-12 13:08:15 -07:00 · 2026-04-12 11:58:02 -07:00 · 2026-04-12 04:52:59 -07:00 · 2026-04-12 04:17:18 -07:00 · 2026-04-12 03:53:30 -07:00
43 changed files with 2642 additions and 389 deletions
@@ -69,9 +69,7 @@ jobs:
          file: Dockerfile
          push: true
          platforms: linux/amd64,linux/arm64
-          tags: |
-            nousresearch/hermes-agent:latest
-            nousresearch/hermes-agent:${{ github.sha }}
+          tags: nousresearch/hermes-agent:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

@@ -83,9 +81,6 @@ jobs:
          file: Dockerfile
          push: true
          platforms: linux/amd64,linux/arm64
-          tags: |
-            nousresearch/hermes-agent:latest
-            nousresearch/hermes-agent:${{ github.event.release.tag_name }}
-            nousresearch/hermes-agent:${{ github.sha }}
+          tags: nousresearch/hermes-agent:${{ github.event.release.tag_name }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
@@ -0,0 +1,4 @@
+FROM nousresearch/hermes-agent:latest
+COPY hermes_cli/ /opt/hermes/hermes_cli/
+COPY hermes_constants.py /opt/hermes/hermes_constants.py
+COPY tools/voice_mode.py /opt/hermes/tools/voice_mode.py
@@ -1021,6 +1021,23 @@ _AUTO_PROVIDER_LABELS = {

 _AGGREGATOR_PROVIDERS = frozenset({"openrouter", "nous"})

+_MAIN_RUNTIME_FIELDS = ("provider", "model", "base_url", "api_key", "api_mode")
+
+
+def _normalize_main_runtime(main_runtime: Optional[Dict[str, Any]]) -> Dict[str, str]:
+    """Return a sanitized copy of a live main-runtime override."""
+    if not isinstance(main_runtime, dict):
+        return {}
+    normalized: Dict[str, str] = {}
+    for field in _MAIN_RUNTIME_FIELDS:
+        value = main_runtime.get(field)
+        if isinstance(value, str) and value.strip():
+            normalized[field] = value.strip()
+    provider = normalized.get("provider")
+    if provider:
+        normalized["provider"] = provider.lower()
+    return normalized
+

 def _get_provider_chain() -> List[tuple]:
    """Return the ordered provider detection chain.
@@ -1130,7 +1147,7 @@ def _try_payment_fallback(
    return None, None, ""


-def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
+def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Optional[OpenAI], Optional[str]]:
    """Full auto-detection chain.

    Priority:
@@ -1142,6 +1159,12 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
    """
    global auxiliary_is_nous, _stale_base_url_warned
    auxiliary_is_nous = False  # Reset — _try_nous() will set True if it wins
+    runtime = _normalize_main_runtime(main_runtime)
+    runtime_provider = runtime.get("provider", "")
+    runtime_model = runtime.get("model", "")
+    runtime_base_url = runtime.get("base_url", "")
+    runtime_api_key = runtime.get("api_key", "")
+    runtime_api_mode = runtime.get("api_mode", "")

    # ── Warn once if OPENAI_BASE_URL is set but config.yaml uses a named
    #    provider (not 'custom').  This catches the common "env poisoning"
@@ -1149,7 +1172,7 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
    #    old OPENAI_BASE_URL lingers in ~/.hermes/.env. ──
    if not _stale_base_url_warned:
        _env_base = os.getenv("OPENAI_BASE_URL", "").strip()
-        _cfg_provider = _read_main_provider()
+        _cfg_provider = runtime_provider or _read_main_provider()
        if (_env_base and _cfg_provider
                and _cfg_provider != "custom"
                and not _cfg_provider.startswith("custom:")):
@@ -1163,12 +1186,25 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
            _stale_base_url_warned = True

    # ── Step 1: non-aggregator main provider → use main model directly ──
-    main_provider = _read_main_provider()
-    main_model = _read_main_model()
+    main_provider = runtime_provider or _read_main_provider()
+    main_model = runtime_model or _read_main_model()
    if (main_provider and main_model
            and main_provider not in _AGGREGATOR_PROVIDERS
            and main_provider not in ("auto", "")):
-        client, resolved = resolve_provider_client(main_provider, main_model)
+        resolved_provider = main_provider
+        explicit_base_url = None
+        explicit_api_key = None
+        if runtime_base_url and (main_provider == "custom" or main_provider.startswith("custom:")):
+            resolved_provider = "custom"
+            explicit_base_url = runtime_base_url
+            explicit_api_key = runtime_api_key or None
+        client, resolved = resolve_provider_client(
+            resolved_provider,
+            main_model,
+            explicit_base_url=explicit_base_url,
+            explicit_api_key=explicit_api_key,
+            api_mode=runtime_api_mode or None,
+        )
        if client is not None:
            logger.info("Auxiliary auto-detect: using main provider %s (%s)",
                        main_provider, resolved or main_model)
@@ -1249,6 +1285,7 @@ def resolve_provider_client(
    explicit_base_url: str = None,
    explicit_api_key: str = None,
    api_mode: str = None,
+    main_runtime: Optional[Dict[str, Any]] = None,
 ) -> Tuple[Optional[Any], Optional[str]]:
    """Central router: given a provider name and optional model, return a
    configured client with the correct auth, base URL, and API format.
@@ -1319,7 +1356,7 @@ def resolve_provider_client(

    # ── Auto: try all providers in priority order ────────────────────
    if provider == "auto":
-        client, resolved = _resolve_auto()
+        client, resolved = _resolve_auto(main_runtime=main_runtime)
        if client is None:
            return None, None
        # When auto-detection lands on a non-OpenRouter provider (e.g. a
@@ -1543,7 +1580,11 @@ def resolve_provider_client(

 # ── Public API ──────────────────────────────────────────────────────────────

-def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optional[str]]:
+def get_text_auxiliary_client(
+    task: str = "",
+    *,
+    main_runtime: Optional[Dict[str, Any]] = None,
+) -> Tuple[Optional[OpenAI], Optional[str]]:
    """Return (client, default_model_slug) for text-only auxiliary tasks.

    Args:
@@ -1560,10 +1601,11 @@ def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optiona
        explicit_base_url=base_url,
        explicit_api_key=api_key,
        api_mode=api_mode,
+        main_runtime=main_runtime,
    )


-def get_async_text_auxiliary_client(task: str = ""):
+def get_async_text_auxiliary_client(task: str = "", *, main_runtime: Optional[Dict[str, Any]] = None):
    """Return (async_client, model_slug) for async consumers.

    For standard providers returns (AsyncOpenAI, model). For Codex returns
@@ -1578,6 +1620,7 @@ def get_async_text_auxiliary_client(task: str = ""):
        explicit_base_url=base_url,
        explicit_api_key=api_key,
        api_mode=api_mode,
+        main_runtime=main_runtime,
    )


@@ -1892,6 +1935,7 @@ def _get_cached_client(
    base_url: str = None,
    api_key: str = None,
    api_mode: str = None,
+    main_runtime: Optional[Dict[str, Any]] = None,
 ) -> Tuple[Optional[Any], Optional[str]]:
    """Get or create a cached client for the given provider.

@@ -1915,7 +1959,9 @@ def _get_cached_client(
            loop_id = id(current_loop)
        except RuntimeError:
            pass
-    cache_key = (provider, async_mode, base_url or "", api_key or "", api_mode or "", loop_id)
+    runtime = _normalize_main_runtime(main_runtime)
+    runtime_key = tuple(runtime.get(field, "") for field in _MAIN_RUNTIME_FIELDS) if provider == "auto" else ()
+    cache_key = (provider, async_mode, base_url or "", api_key or "", api_mode or "", loop_id, runtime_key)
    with _client_cache_lock:
        if cache_key in _client_cache:
            cached_client, cached_default, cached_loop = _client_cache[cache_key]
@@ -1940,6 +1986,7 @@ def _get_cached_client(
        explicit_base_url=base_url,
        explicit_api_key=api_key,
        api_mode=api_mode,
+        main_runtime=runtime,
    )
    if client is not None:
        # For async clients, remember which loop they were created on so we
@@ -2065,6 +2112,75 @@ def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float
    return default


+# ---------------------------------------------------------------------------
+# Anthropic-compatible endpoint detection + image block conversion
+# ---------------------------------------------------------------------------
+
+# Providers that use Anthropic-compatible endpoints (via OpenAI SDK wrapper).
+# Their image content blocks must use Anthropic format, not OpenAI format.
+_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-cn"})
+
+
+def _is_anthropic_compat_endpoint(provider: str, base_url: str) -> bool:
+    """Detect if an endpoint expects Anthropic-format content blocks.
+
+    Returns True for known Anthropic-compatible providers (MiniMax) and
+    any endpoint whose URL contains ``/anthropic`` in the path.
+    """
+    if provider in _ANTHROPIC_COMPAT_PROVIDERS:
+        return True
+    url_lower = (base_url or "").lower()
+    return "/anthropic" in url_lower
+
+
+def _convert_openai_images_to_anthropic(messages: list) -> list:
+    """Convert OpenAI ``image_url`` content blocks to Anthropic ``image`` blocks.
+
+    Only touches messages that have list-type content with ``image_url`` blocks;
+    plain text messages pass through unchanged.
+    """
+    converted = []
+    for msg in messages:
+        content = msg.get("content")
+        if not isinstance(content, list):
+            converted.append(msg)
+            continue
+        new_content = []
+        changed = False
+        for block in content:
+            if block.get("type") == "image_url":
+                image_url_val = (block.get("image_url") or {}).get("url", "")
+                if image_url_val.startswith("data:"):
+                    # Parse data URI: data:<media_type>;base64,<data>
+                    header, _, b64data = image_url_val.partition(",")
+                    media_type = "image/png"
+                    if ":" in header and ";" in header:
+                        media_type = header.split(":", 1)[1].split(";", 1)[0]
+                    new_content.append({
+                        "type": "image",
+                        "source": {
+                            "type": "base64",
+                            "media_type": media_type,
+                            "data": b64data,
+                        },
+                    })
+                else:
+                    # URL-based image
+                    new_content.append({
+                        "type": "image",
+                        "source": {
+                            "type": "url",
+                            "url": image_url_val,
+                        },
+                    })
+                changed = True
+            else:
+                new_content.append(block)
+        converted.append({**msg, "content": new_content} if changed else msg)
+    return converted
+
+
+
 def _build_call_kwargs(
    provider: str,
    model: str,
@@ -2149,6 +2265,7 @@ def call_llm(
    model: str = None,
    base_url: str = None,
    api_key: str = None,
+    main_runtime: Optional[Dict[str, Any]] = None,
    messages: list,
    temperature: float = None,
    max_tokens: int = None,
@@ -2214,6 +2331,7 @@ def call_llm(
            base_url=resolved_base_url,
            api_key=resolved_api_key,
            api_mode=resolved_api_mode,
+            main_runtime=main_runtime,
        )
        if client is None:
            # When the user explicitly chose a non-OpenRouter provider but no
@@ -2234,7 +2352,7 @@ def call_llm(
            if not resolved_base_url:
                logger.info("Auxiliary %s: provider %s unavailable, trying auto-detection chain",
                            task or "call", resolved_provider)
-                client, final_model = _get_cached_client("auto")
+                client, final_model = _get_cached_client("auto", main_runtime=main_runtime)
        if client is None:
            raise RuntimeError(
                f"No LLM provider configured for task={task} provider={resolved_provider}. "
@@ -2255,6 +2373,11 @@ def call_llm(
        tools=tools, timeout=effective_timeout, extra_body=extra_body,
        base_url=resolved_base_url)

+    # Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
+    _client_base = str(getattr(client, "base_url", "") or "")
+    if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
+        kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
+
    # Handle max_tokens vs max_completion_tokens retry, then payment fallback.
    try:
        return _validate_llm_response(
@@ -2443,6 +2566,11 @@ async def async_call_llm(
        tools=tools, timeout=effective_timeout, extra_body=extra_body,
        base_url=resolved_base_url)

+    # Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
+    _client_base = str(getattr(client, "base_url", "") or "")
+    if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
+        kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
+
    try:
        return _validate_llm_response(
            await client.chat.completions.create(**kwargs), task)
@@ -86,12 +86,14 @@ class ContextCompressor(ContextEngine):
        base_url: str = "",
        api_key: str = "",
        provider: str = "",
+        api_mode: str = "",
    ) -> None:
        """Update model info after a model switch or fallback activation."""
        self.model = model
        self.base_url = base_url
        self.api_key = api_key
        self.provider = provider
+        self.api_mode = api_mode
        self.context_length = context_length
        self.threshold_tokens = max(
            int(context_length * self.threshold_percent),
@@ -111,11 +113,13 @@ class ContextCompressor(ContextEngine):
        api_key: str = "",
        config_context_length: int | None = None,
        provider: str = "",
+        api_mode: str = "",
    ):
        self.model = model
        self.base_url = base_url
        self.api_key = api_key
        self.provider = provider
+        self.api_mode = api_mode
        self.threshold_percent = threshold_percent
        self.protect_first_n = protect_first_n
        self.protect_last_n = protect_last_n
@@ -438,6 +442,13 @@ The user has requested that this compaction PRIORITISE preserving all informatio
        try:
            call_kwargs = {
                "task": "compression",
+                "main_runtime": {
+                    "model": self.model,
+                    "provider": self.provider,
+                    "base_url": self.base_url,
+                    "api_key": self.api_key,
+                    "api_mode": self.api_mode,
+                },
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": summary_budget * 2,
                # timeout resolved from auxiliary.compression.timeout config by call_llm
@@ -24,6 +24,7 @@ from hermes_cli.auth import (
    _codex_access_token_is_expiring,
    _decode_jwt_claims,
    _import_codex_cli_tokens,
+    _write_codex_cli_tokens,
    _load_auth_store,
    _load_provider_state,
    _resolve_kimi_base_url,
@@ -693,6 +694,14 @@ class CredentialPool:
                        self._replace_entry(synced, updated)
                        self._persist()
                        self._sync_device_code_entry_to_auth_store(updated)
+                        try:
+                            _write_codex_cli_tokens(
+                                updated.access_token,
+                                updated.refresh_token,
+                                last_refresh=updated.last_refresh,
+                            )
+                        except Exception as wexc:
+                            logger.debug("Failed to write refreshed Codex tokens to CLI file (retry): %s", wexc)
                        return updated
                    except Exception as retry_exc:
                        logger.debug("Codex retry refresh also failed: %s", retry_exc)
@@ -718,6 +727,17 @@ class CredentialPool:
        # _seed_from_singletons() on the next load_pool() sees fresh state
        # instead of re-seeding stale/consumed tokens.
        self._sync_device_code_entry_to_auth_store(updated)
+        # Write refreshed tokens back to ~/.codex/auth.json so Codex CLI
+        # and VS Code don't hit "refresh_token_reused" on their next refresh.
+        if self.provider == "openai-codex":
+            try:
+                _write_codex_cli_tokens(
+                    updated.access_token,
+                    updated.refresh_token,
+                    last_refresh=updated.last_refresh,
+                )
+            except Exception as wexc:
+                logger.debug("Failed to write refreshed Codex tokens to CLI file: %s", wexc)
        return updated

    def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
@@ -144,6 +144,8 @@ class ProviderInfo:
 PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
    "openrouter": "openrouter",
    "anthropic": "anthropic",
+    "openai": "openai",
+    "openai-codex": "openai",
    "zai": "zai",
    "kimi-coding": "kimi-for-coding",
    "minimax": "minimax",
@@ -12,7 +12,7 @@ import threading
 from collections import OrderedDict
 from pathlib import Path

-from hermes_constants import get_hermes_home, get_skills_dir
+from hermes_constants import get_hermes_home, get_skills_dir, is_wsl
 from typing import Optional

 from agent.skill_utils import (
@@ -366,6 +366,36 @@ PLATFORM_HINTS = {
    ),
 }

+# ---------------------------------------------------------------------------
+# Environment hints — execution-environment awareness for the agent.
+# Unlike PLATFORM_HINTS (which describe the messaging channel), these describe
+# the machine/OS the agent's tools actually run on.
+# ---------------------------------------------------------------------------
+
+WSL_ENVIRONMENT_HINT = (
+    "You are running inside WSL (Windows Subsystem for Linux). "
+    "The Windows host filesystem is mounted under /mnt/ — "
+    "/mnt/c/ is the C: drive, /mnt/d/ is D:, etc. "
+    "The user's Windows files are typically at "
+    "/mnt/c/Users/<username>/Desktop/, Documents/, Downloads/, etc. "
+    "When the user references Windows paths or desktop files, translate "
+    "to the /mnt/c/ equivalent. You can list /mnt/c/Users/ to discover "
+    "the Windows username if needed."
+)
+
+
+def build_environment_hints() -> str:
+    """Return environment-specific guidance for the system prompt.
+
+    Detects WSL, and can be extended for Termux, Docker, etc.
+    Returns an empty string when no special environment is detected.
+    """
+    hints: list[str] = []
+    if is_wsl():
+        hints.append(WSL_ENVIRONMENT_HINT)
+    return "\n\n".join(hints)
+
+
 CONTEXT_FILE_MAX_CHARS = 20_000
 CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
 CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
@@ -726,8 +756,16 @@ def build_skills_system_prompt(

        result = (
            "## Skills (mandatory)\n"
-            "Before replying, scan the skills below. If one clearly matches your task, "
-            "load it with skill_view(name) and follow its instructions. "
+            "Before replying, scan the skills below. If a skill matches or is even partially relevant "
+            "to your task, you MUST load it with skill_view(name) and follow its instructions. "
+            "Err on the side of loading — it is always better to have context you don't need "
+            "than to miss critical steps, pitfalls, or established workflows. "
+            "Skills contain specialized knowledge — API endpoints, tool-specific commands, "
+            "and proven workflows that outperform general-purpose approaches. Load the skill "
+            "even if you think you could handle the task with basic tools like web_search or terminal. "
+            "Skills also encode the user's preferred approach, conventions, and quality standards "
+            "for tasks like code review, planning, and testing — load them even for tasks you "
+            "already know how to do, because the skill defines how it should be done here.\n"
            "If a skill has issues, fix it with skill_manage(action='patch').\n"
            "After difficult/iterative tasks, offer to save as a skill. "
            "If a skill you loaded was missing steps, had wrong commands, or needed "
@@ -737,7 +775,7 @@ def build_skills_system_prompt(
            + "\n".join(index_lines) + "\n"
            "</available_skills>\n"
            "\n"
-            "If none match, proceed normally without loading a skill."
+            "Only proceed without loading a skill if genuinely none are relevant to the task."
        )

    # ── Store in LRU cache ────────────────────────────────────────────
@@ -36,7 +36,7 @@ def generate_title(user_message: str, assistant_response: str, timeout: float =

    try:
        response = call_llm(
-            task="compression",  # reuse compression task config (cheap/fast model)
+            task="title_generation",
            messages=messages,
            max_tokens=30,
            temperature=0.3,
@@ -2735,6 +2735,22 @@ class HermesCLI:
        if runtime_model and isinstance(runtime_model, str):
            self.model = runtime_model

+        # If model is still empty (e.g. user ran `hermes auth add openai-codex`
+        # without `hermes model`), fall back to the provider's first catalog
+        # model so the API call doesn't fail with "model must be non-empty".
+        if not self.model and resolved_provider:
+            try:
+                from hermes_cli.models import get_default_model_for_provider
+                _default = get_default_model_for_provider(resolved_provider)
+                if _default:
+                    self.model = _default
+                    logger.info(
+                        "No model configured — defaulting to %s for provider %s",
+                        _default, resolved_provider,
+                    )
+            except Exception:
+                pass
+
        # Normalize model for the resolved provider (e.g. swap non-Codex
        # models when provider is openai-codex).  Fixes #651.
        model_changed = self._normalize_model_for_provider(resolved_provider)
@@ -18,6 +18,7 @@ Environment variables:
    MATRIX_REQUIRE_MENTION      Require @mention in rooms (default: true)
    MATRIX_FREE_RESPONSE_ROOMS  Comma-separated room IDs exempt from mention requirement
    MATRIX_AUTO_THREAD          Auto-create threads for room messages (default: true)
+    MATRIX_RECOVERY_KEY         Recovery key for cross-signing verification after device key rotation
    MATRIX_DM_MENTION_THREADS   Create a thread when bot is @mentioned in a DM (default: false)
 """

@@ -508,6 +509,19 @@ class MatrixAdapter(BasePlatformAdapter):
                    await api.session.close()
                    return False

+                # Import cross-signing private keys from SSSS and self-sign
+                # the current device. Required after any device-key rotation
+                # (fresh crypto.db, share_keys re-upload) — otherwise the
+                # device's self-signing signature is stale and peers refuse
+                # to share Megolm sessions with the rotated device.
+                recovery_key = os.getenv("MATRIX_RECOVERY_KEY", "").strip()
+                if recovery_key:
+                    try:
+                        await olm.verify_with_recovery_key(recovery_key)
+                        logger.info("Matrix: cross-signing verified via recovery key")
+                    except Exception as exc:
+                        logger.warning("Matrix: recovery key verification failed: %s", exc)
+
                client.crypto = olm
                logger.info(
                    "Matrix: E2EE enabled (store: %s%s)",
@@ -876,13 +876,47 @@ class GatewayRunner:
                "api_mode": override.get("api_mode"),
            }
            if override_runtime.get("api_key"):
+                logger.debug(
+                    "Session model override (fast): session=%s config_model=%s -> override_model=%s provider=%s",
+                    (resolved_session_key or "")[:30], model, override_model,
+                    override_runtime.get("provider"),
+                )
                return override_model, override_runtime
+            # Override exists but has no api_key — fall through to env-based
+            # resolution and apply model/provider from the override on top.
+            logger.debug(
+                "Session model override (no api_key, fallback): session=%s config_model=%s override_model=%s",
+                (resolved_session_key or "")[:30], model, override_model,
+            )
+        else:
+            logger.debug(
+                "No session model override: session=%s config_model=%s override_keys=%s",
+                (resolved_session_key or "")[:30], model,
+                list(self._session_model_overrides.keys())[:5] if self._session_model_overrides else "[]",
+            )

        runtime_kwargs = _resolve_runtime_agent_kwargs()
        if override and resolved_session_key:
            model, runtime_kwargs = self._apply_session_model_override(
                resolved_session_key, model, runtime_kwargs
            )
+
+        # When the config has no model.default but a provider was resolved
+        # (e.g. user ran `hermes auth add openai-codex` without `hermes model`),
+        # fall back to the provider's first catalog model so the API call
+        # doesn't fail with "model must be a non-empty string".
+        if not model and runtime_kwargs.get("provider"):
+            try:
+                from hermes_cli.models import get_default_model_for_provider
+                model = get_default_model_for_provider(runtime_kwargs["provider"])
+                if model:
+                    logger.info(
+                        "No model configured — defaulting to %s for provider %s",
+                        model, runtime_kwargs["provider"],
+                    )
+            except Exception:
+                pass
+
        return model, runtime_kwargs

    def _resolve_turn_agent_config(self, user_message: str, model: str, runtime_kwargs: dict) -> dict:
@@ -1501,12 +1535,25 @@ class GatewayRunner:
        # This prevents stuck sessions from being blindly resumed on restart,
        # which can create an unrecoverable loop (#7536).  Suspended sessions
        # auto-reset on the next incoming message, giving the user a clean start.
-        try:
-            suspended = self.session_store.suspend_recently_active()
-            if suspended:
-                logger.info("Suspended %d in-flight session(s) from previous run", suspended)
-        except Exception as e:
-            logger.warning("Session suspension on startup failed: %s", e)
+        #
+        # SKIP suspension after a clean (graceful) shutdown — the previous
+        # process already drained active agents, so sessions aren't stuck.
+        # This prevents unwanted auto-resets after `hermes update`,
+        # `hermes gateway restart`, or `/restart`.
+        _clean_marker = _hermes_home / ".clean_shutdown"
+        if _clean_marker.exists():
+            logger.info("Previous gateway exited cleanly — skipping session suspension")
+            try:
+                _clean_marker.unlink()
+            except Exception:
+                pass
+        else:
+            try:
+                suspended = self.session_store.suspend_recently_active()
+                if suspended:
+                    logger.info("Suspended %d in-flight session(s) from previous run", suspended)
+            except Exception as e:
+                logger.warning("Session suspension on startup failed: %s", e)

        connected_count = 0
        enabled_platform_count = 0
@@ -2032,6 +2079,15 @@ class GatewayRunner:
            from gateway.status import remove_pid_file
            remove_pid_file()

+            # Write a clean-shutdown marker so the next startup knows this
+            # wasn't a crash.  suspend_recently_active() only needs to run
+            # after unexpected exits — graceful shutdowns already drain
+            # active agents, so there's no stuck-session risk.
+            try:
+                (_hermes_home / ".clean_shutdown").touch()
+            except Exception:
+                pass
+
            if self._restart_requested and self._restart_via_service:
                self._exit_code = GATEWAY_SERVICE_RESTART_EXIT_CODE
                self._exit_reason = self._exit_reason or "Gateway restart requested"
@@ -4304,6 +4360,11 @@ class GatewayRunner:
                            "api_mode": result.api_mode,
                        }

+                        # Evict cached agent so the next turn creates a fresh
+                        # agent from the override rather than relying on the
+                        # stale cache signature to trigger a rebuild.
+                        _self._evict_cached_agent(_session_key)
+
                        # Build confirmation text
                        plabel = result.provider_label or result.target_provider
                        lines = [f"Model switched to `{result.new_model}`"]
@@ -4417,6 +4478,10 @@ class GatewayRunner:
            "api_mode": result.api_mode,
        }

+        # Evict cached agent so the next turn creates a fresh agent from the
+        # override rather than relying on cache signature mismatch detection.
+        self._evict_cached_agent(session_key)
+
        # Persist to config if --global
        if persist_global:
            try:
@@ -6603,8 +6668,12 @@ class GatewayRunner:
            if buffer.strip() and (loop.time() - last_stream_time) >= stream_interval:
                await _flush_buffer()

-            # Check for prompts
-            if prompt_path.exists() and session_key:
+            # Check for prompts — only forward if we haven't already sent
+            # one that's still awaiting a response.  Without this guard the
+            # watcher would re-read the same .update_prompt.json every poll
+            # cycle and spam the user with duplicate prompt messages.
+            if (prompt_path.exists() and session_key
+                    and not self._update_prompt_pending.get(session_key)):
                try:
                    prompt_data = json.loads(prompt_path.read_text())
                    prompt_text = prompt_data.get("prompt", "")
@@ -6636,6 +6705,11 @@ class GatewayRunner:
                                f"or type your answer directly."
                            )
                        self._update_prompt_pending[session_key] = True
+                        # Remove the prompt file so it isn't re-read on the
+                        # next poll cycle.  The update process only needs
+                        # .update_response to continue — it doesn't re-check
+                        # .update_prompt.json while waiting.
+                        prompt_path.unlink(missing_ok=True)
                        logger.info("Forwarded update prompt to %s: %s", session_key, prompt_text[:80])
                except (json.JSONDecodeError, OSError) as e:
                    logger.debug("Failed to read update prompt: %s", e)
@@ -7545,6 +7619,10 @@ class GatewayRunner:
                    session_key=session_key,
                    user_config=user_config,
                )
+                logger.debug(
+                    "run_agent resolved: model=%s provider=%s session=%s",
+                    model, runtime_kwargs.get("provider"), (session_key or "")[:30],
+                )
            except Exception as exc:
                return {
                    "final_response": f"⚠️ Provider authentication failed: {exc}",
@@ -8046,8 +8124,16 @@ class GatewayRunner:
                    if hasattr(_adapter, 'has_pending_interrupt') and _adapter.has_pending_interrupt(session_key):
                        agent = agent_holder[0]
                        if agent:
-                            pending_event = _adapter.get_pending_message(session_key)
-                            pending_text = pending_event.text if pending_event else None
+                            # Peek at the pending message text WITHOUT consuming it.
+                            # The message must remain in _pending_messages so the
+                            # post-run dequeue at _dequeue_pending_event() can
+                            # retrieve the full MessageEvent (with media metadata).
+                            # If we pop here, a race exists: the agent may finish
+                            # before checking _interrupt_requested, and the message
+                            # is lost — neither the interrupt path nor the dequeue
+                            # path finds it.
+                            _peek_event = _adapter._pending_messages.get(session_key)
+                            pending_text = _peek_event.text if _peek_event else None
                            logger.debug("Interrupt detected from adapter, signaling agent...")
                            agent.interrupt(pending_text)
                            _interrupt_detected.set()
@@ -8138,7 +8224,7 @@ class GatewayRunner:
                        if (_backup_adapter and _backup_agent
                                and hasattr(_backup_adapter, 'has_pending_interrupt')
                                and _backup_adapter.has_pending_interrupt(session_key)):
-                            _bp_event = _backup_adapter.get_pending_message(session_key)
+                            _bp_event = _backup_adapter._pending_messages.get(session_key)
                            _bp_text = _bp_event.text if _bp_event else None
                            logger.info(
                                "Backup interrupt detected for session %s "
@@ -8198,7 +8284,7 @@ class GatewayRunner:
                        if (_backup_adapter and _backup_agent
                                and hasattr(_backup_adapter, 'has_pending_interrupt')
                                and _backup_adapter.has_pending_interrupt(session_key)):
-                            _bp_event = _backup_adapter.get_pending_message(session_key)
+                            _bp_event = _backup_adapter._pending_messages.get(session_key)
                            _bp_text = _bp_event.text if _bp_event else None
                            logger.info(
                                "Backup interrupt detected for session %s "
@@ -1303,6 +1303,49 @@ def _read_codex_tokens(*, _lock: bool = True) -> Dict[str, Any]:
    }


+def _write_codex_cli_tokens(
+    access_token: str,
+    refresh_token: str,
+    *,
+    last_refresh: Optional[str] = None,
+) -> None:
+    """Write refreshed tokens back to ~/.codex/auth.json.
+
+    OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
+    When Hermes refreshes a token it consumes the old refresh_token; if we
+    don't write the new pair back, the Codex CLI (or VS Code extension) will
+    fail with ``refresh_token_reused`` on its next refresh attempt.
+
+    This mirrors the Anthropic write-back to ~/.claude/.credentials.json
+    via ``_write_claude_code_credentials()``.
+    """
+    codex_home = os.getenv("CODEX_HOME", "").strip()
+    if not codex_home:
+        codex_home = str(Path.home() / ".codex")
+    auth_path = Path(codex_home).expanduser() / "auth.json"
+    try:
+        existing: Dict[str, Any] = {}
+        if auth_path.is_file():
+            existing = json.loads(auth_path.read_text(encoding="utf-8"))
+        if not isinstance(existing, dict):
+            existing = {}
+
+        tokens_dict = existing.get("tokens")
+        if not isinstance(tokens_dict, dict):
+            tokens_dict = {}
+        tokens_dict["access_token"] = access_token
+        tokens_dict["refresh_token"] = refresh_token
+        existing["tokens"] = tokens_dict
+        if last_refresh is not None:
+            existing["last_refresh"] = last_refresh
+
+        auth_path.parent.mkdir(parents=True, exist_ok=True)
+        auth_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
+        auth_path.chmod(0o600)
+    except (OSError, ValueError) as exc:
+        logger.debug("Failed to write refreshed tokens to %s: %s", auth_path, exc)
+
+
 def _save_codex_tokens(tokens: Dict[str, str], last_refresh: str = None) -> None:
    """Save Codex OAuth tokens to Hermes auth store (~/.hermes/auth.json)."""
    if last_refresh is None:
@@ -1425,6 +1468,12 @@ def _refresh_codex_auth_tokens(
    updated_tokens["refresh_token"] = refreshed["refresh_token"]

    _save_codex_tokens(updated_tokens)
+    # Write back to ~/.codex/auth.json so Codex CLI / VS Code stay in sync.
+    _write_codex_cli_tokens(
+        refreshed["access_token"],
+        refreshed["refresh_token"],
+        last_refresh=refreshed.get("last_refresh"),
+    )
    return updated_tokens
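
Editor's sketch (not part of the diff): the write-back above is a read-merge-write on a shared JSON file, so that rotating the token pair does not discard other keys the Codex CLI may store there. The helper name `merge_tokens_into_auth_file` and the choice to treat an unparseable file as empty are assumptions made for illustration. They are not taken from the Hermes code base.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def merge_tokens_into_auth_file(
    auth_path: Path, access_token: str, refresh_token: str
) -> Dict[str, Any]:
    """Hypothetical helper: update only the token pair, keep all other keys."""
    existing: Dict[str, Any] = {}
    if auth_path.is_file():
        try:
            loaded = json.loads(auth_path.read_text(encoding="utf-8"))
        except ValueError:
            # Assumption: a corrupt file is treated as empty rather than fatal.
            loaded = {}
        if isinstance(loaded, dict):
            existing = loaded

    tokens = existing.get("tokens")
    if not isinstance(tokens, dict):
        tokens = {}
    tokens["access_token"] = access_token
    tokens["refresh_token"] = refresh_token
    existing["tokens"] = tokens

    auth_path.parent.mkdir(parents=True, exist_ok=True)
    auth_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
    auth_path.chmod(0o600)  # credentials: owner read/write only
    return existing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "auth.json"
        path.write_text(json.dumps({"tokens": {"refresh_token": "old"}, "note": "keep"}))
        print(merge_tokens_into_auth_file(path, "new-access", "new-refresh"))
```

Unrelated keys such as `note` survive the rewrite, which is the property the single-use refresh-token rotation depends on.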


@@ -50,6 +50,7 @@ _EXTRA_ENV_KEYS = frozenset({
    "MATTERMOST_HOME_CHANNEL", "MATTERMOST_REPLY_MODE",
    "MATRIX_PASSWORD", "MATRIX_ENCRYPTION", "MATRIX_DEVICE_ID", "MATRIX_HOME_ROOM",
    "MATRIX_REQUIRE_MENTION", "MATRIX_FREE_RESPONSE_ROOMS", "MATRIX_AUTO_THREAD",
+    "MATRIX_RECOVERY_KEY",
 })
 import yaml

@@ -147,25 +148,6 @@ def managed_error(action: str = "modify configuration"):
 # Container-aware CLI (NixOS container mode)
 # =============================================================================

-def _is_inside_container() -> bool:
-    """Detect if we're already running inside a Docker/Podman container."""
-    # Standard Docker/Podman indicators
-    if os.path.exists("/.dockerenv"):
-        return True
-    # Podman uses /run/.containerenv
-    if os.path.exists("/run/.containerenv"):
-        return True
-    # Check cgroup for container runtime evidence (works for both Docker & Podman)
-    try:
-        with open("/proc/1/cgroup", "r") as f:
-            cgroup = f.read()
-            if "docker" in cgroup or "podman" in cgroup or "/lxc/" in cgroup:
-                return True
-    except OSError:
-        pass
-    return False
-
-
 def get_container_exec_info() -> Optional[dict]:
    """Read container mode metadata from HERMES_HOME/.container-mode.

@@ -180,7 +162,8 @@ def get_container_exec_info() -> Optional[dict]:
    if os.environ.get("HERMES_DEV") == "1":
        return None

-    if _is_inside_container():
+    from hermes_constants import is_container
+    if is_container():
        return None

    container_mode_file = get_hermes_home() / ".container-mode"
@@ -1293,6 +1276,14 @@ OPTIONAL_ENV_VARS = {
        "category": "messaging",
        "advanced": True,
    },
+    "MATRIX_RECOVERY_KEY": {
+        "description": "Matrix recovery key for cross-signing verification after device key rotation (from Element: Settings → Security → Recovery Key)",
+        "prompt": "Matrix recovery key",
+        "url": None,
+        "password": True,
+        "category": "messaging",
+        "advanced": True,
+    },
    "BLUEBUBBLES_SERVER_URL": {
        "description": "BlueBubbles server URL for iMessage integration (e.g. http://192.168.1.10:1234)",
        "prompt": "BlueBubbles server URL",
@@ -44,6 +44,16 @@ def _redact(value: str) -> str:
 def _gateway_status() -> str:
    """Return a short gateway status string."""
    if sys.platform.startswith("linux"):
+        from hermes_constants import is_container
+        if is_container():
+            try:
+                from hermes_cli.gateway import find_gateway_pids
+                pids = find_gateway_pids()
+                if pids:
+                    return f"running (docker, pid {pids[0]})"
+                return "stopped (docker)"
+            except Exception:
+                return "stopped (docker)"
        try:
            from hermes_cli.gateway import get_service_name
            svc = get_service_name()
@@ -331,7 +331,7 @@ def is_linux() -> bool:
    return sys.platform.startswith('linux')


-from hermes_constants import is_termux, is_wsl
+from hermes_constants import is_container, is_termux, is_wsl


 def _wsl_systemd_operational() -> bool:
@@ -353,7 +353,9 @@ def _wsl_systemd_operational() -> bool:


 def supports_systemd_services() -> bool:
-    if not is_linux() or is_termux():
+    if not is_linux() or is_termux() or is_container():
+        return False
+    if shutil.which("systemctl") is None:
        return False
    if is_wsl():
        return _wsl_systemd_operational()
@@ -483,6 +485,21 @@ def _journalctl_cmd(system: bool = False) -> list[str]:
    return ["journalctl"] if system else ["journalctl", "--user"]


+def _run_systemctl(args: list[str], *, system: bool = False, **kwargs) -> subprocess.CompletedProcess:
+    """Run a systemctl command, raising RuntimeError if systemctl is missing.
+
+    Defense-in-depth: callers are gated by ``supports_systemd_services()``,
+    but this ensures any future caller that bypasses the gate still gets a
+    clear error instead of a raw ``FileNotFoundError`` traceback.
+    """
+    try:
+        return subprocess.run(_systemctl_cmd(system) + args, **kwargs)
+    except FileNotFoundError:
+        raise RuntimeError(
+            "systemctl is not available on this system"
+        ) from None
+
+
 def _service_scope_label(system: bool = False) -> str:
    return "system" if system else "user"
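
Editor's sketch (not part of the diff): the wrapper pattern behind `_run_systemctl`, reduced to a standalone function. A missing executable makes `subprocess.run` raise `FileNotFoundError`, which the wrapper converts into a `RuntimeError` with a readable message. The name `run_tool` is hypothetical and stands in for the real helper.

```python
import subprocess
import sys
from typing import List


def run_tool(cmd: List[str], **kwargs) -> subprocess.CompletedProcess:
    """Hypothetical stand-in for _run_systemctl: run cmd, hide the raw traceback."""
    try:
        return subprocess.run(cmd, **kwargs)
    except FileNotFoundError:
        # "from None" suppresses the chained FileNotFoundError in the traceback.
        raise RuntimeError(f"{cmd[0]} is not available on this system") from None


if __name__ == "__main__":
    ok = run_tool([sys.executable, "-c", "print('ok')"], capture_output=True, text=True)
    print(ok.stdout.strip())
    try:
        run_tool(["definitely-not-a-real-binary-xyz"])
    except RuntimeError as exc:
        print(exc)
```

Callers that previously caught only `subprocess.TimeoutExpired` need to catch `RuntimeError` as well, as the `_is_service_running()` hunk further down does.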

@@ -929,7 +946,7 @@ def refresh_systemd_unit_if_needed(system: bool = False) -> bool:

    expected_user = _read_systemd_user_from_unit(unit_path) if system else None
    unit_path.write_text(generate_systemd_unit(system=system, run_as_user=expected_user), encoding="utf-8")
-    subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True, timeout=30)
+    _run_systemctl(["daemon-reload"], system=system, check=True, timeout=30)
    print(f"↻ Updated gateway {_service_scope_label(system)} service definition to match the current Hermes install")
    return True

@@ -1025,7 +1042,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
        if not systemd_unit_is_current(system=system):
            print(f"↻ Repairing outdated {_service_scope_label(system)} systemd service at: {unit_path}")
            refresh_systemd_unit_if_needed(system=system)
-            subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True, timeout=30)
+            _run_systemctl(["enable", get_service_name()], system=system, check=True, timeout=30)
            print(f"✓ {_service_scope_label(system).capitalize()} service definition updated")
            return
        print(f"Service already installed at: {unit_path}")
@@ -1036,8 +1053,8 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
    print(f"Installing {_service_scope_label(system)} systemd service to: {unit_path}")
    unit_path.write_text(generate_systemd_unit(system=system, run_as_user=run_as_user), encoding="utf-8")

-    subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True, timeout=30)
-    subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True, timeout=30)
+    _run_systemctl(["daemon-reload"], system=system, check=True, timeout=30)
+    _run_systemctl(["enable", get_service_name()], system=system, check=True, timeout=30)

    print()
    print(f"✓ {_service_scope_label(system).capitalize()} service installed and enabled!")
@@ -1063,15 +1080,15 @@ def systemd_uninstall(system: bool = False):
    if system:
        _require_root_for_system_service("uninstall")

-    subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=False, timeout=90)
-    subprocess.run(_systemctl_cmd(system) + ["disable", get_service_name()], check=False, timeout=30)
+    _run_systemctl(["stop", get_service_name()], system=system, check=False, timeout=90)
+    _run_systemctl(["disable", get_service_name()], system=system, check=False, timeout=30)

    unit_path = get_systemd_unit_path(system=system)
    if unit_path.exists():
        unit_path.unlink()
        print(f"✓ Removed {unit_path}")

-    subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True, timeout=30)
+    _run_systemctl(["daemon-reload"], system=system, check=True, timeout=30)
    print(f"✓ {_service_scope_label(system).capitalize()} service uninstalled")


@@ -1080,7 +1097,7 @@ def systemd_start(system: bool = False):
    if system:
        _require_root_for_system_service("start")
    refresh_systemd_unit_if_needed(system=system)
-    subprocess.run(_systemctl_cmd(system) + ["start", get_service_name()], check=True, timeout=30)
+    _run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
    print(f"✓ {_service_scope_label(system).capitalize()} service started")


@@ -1089,7 +1106,7 @@ def systemd_stop(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("stop")
-    subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=True, timeout=90)
+    _run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
    print(f"✓ {_service_scope_label(system).capitalize()} service stopped")


@@ -1105,7 +1122,7 @@ def systemd_restart(system: bool = False):
    if pid is not None and _request_gateway_self_restart(pid):
        print(f"✓ {_service_scope_label(system).capitalize()} service restart requested")
        return
-    subprocess.run(_systemctl_cmd(system) + ["reload-or-restart", get_service_name()], check=True, timeout=90)
+    _run_systemctl(["reload-or-restart", get_service_name()], system=system, check=True, timeout=90)
    print(f"✓ {_service_scope_label(system).capitalize()} service restarted")


@@ -1129,14 +1146,16 @@ def systemd_status(deep: bool = False, system: bool = False):
        print(f"  Run: {'sudo ' if system else ''}hermes gateway restart{scope_flag}  # auto-refreshes the unit")
        print()

-    subprocess.run(
-        _systemctl_cmd(system) + ["status", get_service_name(), "--no-pager"],
+    _run_systemctl(
+        ["status", get_service_name(), "--no-pager"],
+        system=system,
        capture_output=False,
        timeout=10,
    )

-    result = subprocess.run(
-        _systemctl_cmd(system) + ["is-active", get_service_name()],
+    result = _run_systemctl(
+        ["is-active", get_service_name()],
+        system=system,
        capture_output=True,
        text=True,
        timeout=10,
@@ -2129,24 +2148,24 @@ def _is_service_running() -> bool:

        if user_unit_exists:
            try:
-                result = subprocess.run(
-                    _systemctl_cmd(False) + ["is-active", get_service_name()],
-                    capture_output=True, text=True, timeout=10,
+                result = _run_systemctl(
+                    ["is-active", get_service_name()],
+                    system=False, capture_output=True, text=True, timeout=10,
                )
                if result.stdout.strip() == "active":
                    return True
-            except subprocess.TimeoutExpired:
+            except (RuntimeError, subprocess.TimeoutExpired):
                pass

        if system_unit_exists:
            try:
-                result = subprocess.run(
-                    _systemctl_cmd(True) + ["is-active", get_service_name()],
-                    capture_output=True, text=True, timeout=10,
+                result = _run_systemctl(
+                    ["is-active", get_service_name()],
+                    system=True, capture_output=True, text=True, timeout=10,
                )
                if result.stdout.strip() == "active":
                    return True
-            except subprocess.TimeoutExpired:
+            except (RuntimeError, subprocess.TimeoutExpired):
                pass

        return False
@@ -2606,6 +2625,15 @@ def gateway_command(args):
            print("  tmux new -s hermes 'hermes gateway run'         # persistent via tmux")
            print("  nohup hermes gateway run > ~/.hermes/logs/gateway.log 2>&1 &  # background")
            sys.exit(1)
+        elif is_container():
+            print("Service installation is not needed inside a Docker container.")
+            print("The container runtime is your service manager — use Docker restart policies instead:")
+            print()
+            print("  docker run --restart unless-stopped ...   # auto-restart on crash/reboot")
+            print("  docker restart <container>                # manual restart")
+            print()
+            print("To run the gateway: hermes gateway run")
+            sys.exit(0)
        else:
            print("Service installation not supported on this platform.")
            print("Run manually: hermes gateway run")
@@ -2624,10 +2652,17 @@ def gateway_command(args):
            systemd_uninstall(system=system)
        elif is_macos():
            launchd_uninstall()
+        elif is_container():
+            print("Service uninstall is not applicable inside a Docker container.")
+            print("To stop the gateway, stop or remove the container:")
+            print()
+            print("  docker stop <container>")
+            print("  docker rm <container>")
+            sys.exit(0)
        else:
            print("Not supported on this platform.")
            sys.exit(1)
-    
+
    elif subcmd == "start":
        system = getattr(args, 'system', False)
        if is_termux():
@@ -2648,10 +2683,19 @@ def gateway_command(args):
            print()
            print("To enable systemd: add systemd=true to /etc/wsl.conf and run 'wsl --shutdown' from PowerShell.")
            sys.exit(1)
+        elif is_container():
+            print("Service start is not applicable inside a Docker container.")
+            print("The gateway runs as the container's main process.")
+            print()
+            print("  docker start <container>     # start a stopped container")
+            print("  docker restart <container>   # restart a running container")
+            print()
+            print("Or run the gateway directly: hermes gateway run")
+            sys.exit(0)
        else:
            print("Not supported on this platform.")
            sys.exit(1)
-    
+
    elif subcmd == "stop":
        stop_all = getattr(args, 'all', False)
        system = getattr(args, 'system', False)
@@ -1107,6 +1107,7 @@ def select_provider_and_model(args=None):
                "base_url": base_url,
                "api_key": entry.get("api_key", ""),
                "model": entry.get("model", ""),
+                "api_mode": entry.get("api_mode", ""),
            }
        return custom_provider_map

@@ -1955,6 +1956,12 @@ def _model_flow_named_custom(config, provider_info):
    model["base_url"] = base_url
    if api_key:
        model["api_key"] = api_key
+    # Apply api_mode from custom_providers entry, or clear stale value
+    custom_api_mode = provider_info.get("api_mode", "")
+    if custom_api_mode:
+        model["api_mode"] = custom_api_mode
+    else:
+        model.pop("api_mode", None)  # let runtime auto-detect from URL
    save_config(cfg)
    deactivate_provider()

@@ -2492,8 +2499,11 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
        print()
        override = ""
    if override and base_url_env:
-        save_env_value(base_url_env, override)
-        effective_base = override
+        if not override.startswith(("http://", "https://")):
+            print("  Invalid URL — must start with http:// or https://. Keeping current value.")
+        else:
+            save_env_value(base_url_env, override)
+            effective_base = override

    # Model selection — resolution order:
    #   1. models.dev registry (cached, filtered for agentic/tool-capable models)
@@ -3929,6 +3939,26 @@ def cmd_update(args):
        print()
        print("✓ Update complete!")
        
+        # Write exit code *before* the gateway restart attempt.
+        # When running as ``hermes update --gateway`` (spawned by the gateway's
+        # /update command), this process lives inside the gateway's systemd
+        # cgroup.  ``systemctl restart hermes-gateway`` kills everything in the
+        # cgroup (KillMode=mixed → SIGKILL to remaining processes), including
+        # us and the wrapping bash shell.  The shell never reaches its
+        # ``printf $status > .update_exit_code`` epilogue, so the exit-code
+        # marker file is never created.  The new gateway's update watcher then
+        # polls for 30 minutes and sends a spurious timeout message.
+        #
+        # Writing the marker here — after git pull + pip install succeed but
+        # before we attempt the restart — ensures the new gateway sees it
+        # regardless of how we die.
+        if gateway_mode:
+            _exit_code_path = get_hermes_home() / ".update_exit_code"
+            try:
+                _exit_code_path.write_text("0")
+            except OSError:
+                pass
+        
        # Auto-restart ALL gateways after update.
        # The code update (git pull) is shared across all profiles, so every
        # running gateway needs restarting to pick up the new code.
@@ -546,6 +546,20 @@ _PROVIDER_ALIASES = {
 }


+def get_default_model_for_provider(provider: str) -> str:
+    """Return the default model for a provider, or empty string if unknown.
+
+    Uses the first entry in _PROVIDER_MODELS as the default.  This is the
+    model a user would be offered first in the ``hermes model`` picker.
+
+    Used as a fallback when the user has configured a provider but never
+    selected a model (e.g. ``hermes auth add openai-codex`` without
+    ``hermes model``).
+    """
+    models = _PROVIDER_MODELS.get(provider, [])
+    return models[0] if models else ""
+
+
 def _openrouter_model_is_free(pricing: Any) -> bool:
    """Return True when both prompt and completion pricing are zero."""
    if not isinstance(pricing, dict):
@@ -2232,6 +2232,7 @@ def setup_gateway(config: dict):
        from hermes_cli.gateway import (
            _is_service_installed,
            _is_service_running,
+            supports_systemd_services,
            has_conflicting_systemd_units,
            install_linux_gateway_from_setup,
            print_systemd_scope_conflict_warning,
@@ -2244,16 +2245,18 @@ def setup_gateway(config: dict):

        service_installed = _is_service_installed()
        service_running = _is_service_running()
+        supports_systemd = supports_systemd_services()
+        supports_service_manager = supports_systemd or _is_macos

        print()
-        if _is_linux and has_conflicting_systemd_units():
+        if supports_systemd and has_conflicting_systemd_units():
            print_systemd_scope_conflict_warning()
            print()

        if service_running:
            if prompt_yes_no("  Restart the gateway to pick up changes?", True):
                try:
-                    if _is_linux:
+                    if supports_systemd:
                        systemd_restart()
                    elif _is_macos:
                        launchd_restart()
@@ -2262,14 +2265,14 @@ def setup_gateway(config: dict):
        elif service_installed:
            if prompt_yes_no("  Start the gateway service?", True):
                try:
-                    if _is_linux:
+                    if supports_systemd:
                        systemd_start()
                    elif _is_macos:
                        launchd_start()
                except Exception as e:
                    print_error(f"  Start failed: {e}")
-        elif _is_linux or _is_macos:
-            svc_name = "systemd" if _is_linux else "launchd"
+        elif supports_service_manager:
+            svc_name = "systemd" if supports_systemd else "launchd"
            if prompt_yes_no(
                f"  Install the gateway as a {svc_name} service? (runs in background, starts on boot)",
                True,
@@ -2277,7 +2280,7 @@ def setup_gateway(config: dict):
                try:
                    installed_scope = None
                    did_install = False
-                    if _is_linux:
+                    if supports_systemd:
                        installed_scope, did_install = install_linux_gateway_from_setup(force=False)
                    else:
                        launchd_install(force=False)
@@ -2285,7 +2288,7 @@ def setup_gateway(config: dict):
                    print()
                    if did_install and prompt_yes_no("  Start the service now?", True):
                        try:
-                            if _is_linux:
+                            if supports_systemd:
                                systemd_start(system=installed_scope == "system")
                            elif _is_macos:
                                launchd_start()
@@ -2296,12 +2299,21 @@ def setup_gateway(config: dict):
                    print_info("  You can try manually: hermes gateway install")
            else:
                print_info("  You can install later: hermes gateway install")
-                if _is_linux:
+                if supports_systemd:
                    print_info("  Or as a boot-time service: sudo hermes gateway install --system")
                print_info("  Or run in foreground:  hermes gateway")
        else:
-            print_info("Start the gateway to bring your bots online:")
-            print_info("   hermes gateway              # Run in foreground")
+            from hermes_constants import is_container
+            if is_container():
+                print_info("Start the gateway to bring your bots online:")
+                print_info("   hermes gateway run          # Run as container main process")
+                print_info("")
+                print_info("For automatic restarts, use a Docker restart policy:")
+                print_info("   docker run --restart unless-stopped ...")
+                print_info("   docker restart <container>  # Manual restart")
+            else:
+                print_info("Start the gateway to bring your bots online:")
+                print_info("   hermes gateway              # Run in foreground")

        print_info("━" * 50)

@@ -346,23 +346,35 @@ def show_status(args):
            print("  Note:         Android may stop background jobs when Termux is suspended")

    elif sys.platform.startswith('linux'):
-        try:
-            from hermes_cli.gateway import get_service_name
-            _gw_svc = get_service_name()
-        except Exception:
-            _gw_svc = "hermes-gateway"
-        try:
-            result = subprocess.run(
-                ["systemctl", "--user", "is-active", _gw_svc],
-                capture_output=True,
-                text=True,
-                timeout=5
-            )
-            is_active = result.stdout.strip() == "active"
-        except (FileNotFoundError, subprocess.TimeoutExpired):
-            is_active = False
-        print(f"  Status:       {check_mark(is_active)} {'running' if is_active else 'stopped'}")
-        print("  Manager:      systemd (user)")
+        from hermes_constants import is_container
+        if is_container():
+            # Docker/Podman: no systemd — check for running gateway processes
+            try:
+                from hermes_cli.gateway import find_gateway_pids
+                gateway_pids = find_gateway_pids()
+                is_active = len(gateway_pids) > 0
+            except Exception:
+                is_active = False
+            print(f"  Status:       {check_mark(is_active)} {'running' if is_active else 'stopped'}")
+            print("  Manager:      docker (foreground)")
+        else:
+            try:
+                from hermes_cli.gateway import get_service_name
+                _gw_svc = get_service_name()
+            except Exception:
+                _gw_svc = "hermes-gateway"
+            try:
+                result = subprocess.run(
+                    ["systemctl", "--user", "is-active", _gw_svc],
+                    capture_output=True,
+                    text=True,
+                    timeout=5
+                )
+                is_active = result.stdout.strip() == "active"
+            except (FileNotFoundError, subprocess.TimeoutExpired):
+                is_active = False
+            print(f"  Status:       {check_mark(is_active)} {'running' if is_active else 'stopped'}")
+            print("  Manager:      systemd (user)")
        
    elif sys.platform == 'darwin':
        from hermes_cli.gateway import get_launchd_label
@@ -189,6 +189,37 @@ def is_wsl() -> bool:
    return _wsl_detected


+_container_detected: bool | None = None
+
+
+def is_container() -> bool:
+    """Return True when running inside a Docker/Podman container.
+
+    Checks ``/.dockerenv`` (Docker), ``/run/.containerenv`` (Podman),
+    and ``/proc/1/cgroup`` for container runtime markers.  Result is
+    cached for the process lifetime.  Import-safe — no heavy deps.
+    """
+    global _container_detected
+    if _container_detected is not None:
+        return _container_detected
+    if os.path.exists("/.dockerenv"):
+        _container_detected = True
+        return True
+    if os.path.exists("/run/.containerenv"):
+        _container_detected = True
+        return True
+    try:
+        with open("/proc/1/cgroup", "r") as f:
+            cgroup = f.read()
+            if "docker" in cgroup or "podman" in cgroup or "/lxc/" in cgroup:
+                _container_detected = True
+                return True
+    except OSError:
+        pass
+    _container_detected = False
+    return False
+
+
 # ─── Well-Known Paths ─────────────────────────────────────────────────────────
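
Editor's sketch (not part of the diff): the same three checks as `is_container()`, written without the module-level cache and with an injectable filesystem root so the logic can be exercised against a temporary directory. `detect_container` and its `root` parameter are hypothetical names introduced for illustration only.

```python
import tempfile
from pathlib import Path

# Substrings the diff looks for in /proc/1/cgroup.
_CGROUP_MARKERS = ("docker", "podman", "/lxc/")


def detect_container(root: str = "/") -> bool:
    """Hypothetical uncached variant of is_container() with a configurable root."""
    base = Path(root)
    # Marker files: /.dockerenv (Docker) and /run/.containerenv (Podman).
    if (base / ".dockerenv").exists() or (base / "run" / ".containerenv").exists():
        return True
    try:
        cgroup = (base / "proc" / "1" / "cgroup").read_text()
    except OSError:
        # No readable cgroup file (for example on non-Linux hosts).
        return False
    return any(marker in cgroup for marker in _CGROUP_MARKERS)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as fake_root:
        print(detect_container(fake_root))  # empty tree, no markers present
        Path(fake_root, ".dockerenv").touch()
        print(detect_container(fake_root))  # marker file now exists
```

The version in the diff additionally caches the result for the lifetime of the process, which this sketch omits so that repeated calls can observe filesystem changes.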


@@ -94,7 +94,7 @@ from agent.model_metadata import (
 from agent.context_compressor import ContextCompressor
 from agent.subdirectory_hints import SubdirectoryHintTracker
 from agent.prompt_caching import apply_anthropic_cache_control
-from agent.prompt_builder import build_skills_system_prompt, build_context_files_prompt, load_soul_md, TOOL_USE_ENFORCEMENT_GUIDANCE, TOOL_USE_ENFORCEMENT_MODELS, DEVELOPER_ROLE_MODELS, GOOGLE_MODEL_OPERATIONAL_GUIDANCE, OPENAI_MODEL_EXECUTION_GUIDANCE
+from agent.prompt_builder import build_skills_system_prompt, build_context_files_prompt, build_environment_hints, load_soul_md, TOOL_USE_ENFORCEMENT_GUIDANCE, TOOL_USE_ENFORCEMENT_MODELS, DEVELOPER_ROLE_MODELS, GOOGLE_MODEL_OPERATIONAL_GUIDANCE, OPENAI_MODEL_EXECUTION_GUIDANCE
 from agent.usage_pricing import estimate_usage_cost, normalize_usage
 from agent.display import (
    KawaiiSpinner, build_tool_preview as _build_tool_preview,
@@ -1307,6 +1307,7 @@ class AIAgent:
                api_key=getattr(self, "api_key", ""),
                config_context_length=_config_context_length,
                provider=self.provider,
+                api_mode=self.api_mode,
            )
        self.compression_enabled = compression_enabled

@@ -1563,6 +1564,7 @@ class AIAgent:
                base_url=self.base_url,
                api_key=getattr(self, "api_key", ""),
                provider=self.provider,
+                api_mode=self.api_mode,
            )

        # ── Invalidate cached system prompt so it rebuilds next turn ──
@@ -1696,6 +1698,16 @@ class AIAgent:
            except Exception:
                logger.debug("status_callback error in _emit_status", exc_info=True)

+    def _current_main_runtime(self) -> Dict[str, str]:
+        """Return the live main runtime for session-scoped auxiliary routing."""
+        return {
+            "model": getattr(self, "model", "") or "",
+            "provider": getattr(self, "provider", "") or "",
+            "base_url": getattr(self, "base_url", "") or "",
+            "api_key": getattr(self, "api_key", "") or "",
+            "api_mode": getattr(self, "api_mode", "") or "",
+        }
+
    def _check_compression_model_feasibility(self) -> None:
        """Warn at session start if the auxiliary compression model's context
        window is smaller than the main model's compression threshold.
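A rough sketch of how the dict returned by `_current_main_runtime()` might be sanitized on the receiving side. The diff shows only the opening lines of `_normalize_main_runtime`, so the loop below (keep known fields, drop non-string and blank values, strip whitespace) is an assumption rather than the actual implementation; the names `MAIN_RUNTIME_FIELDS` and `normalize_main_runtime` are illustrative stand-ins.

```python
from typing import Any, Dict, Optional

# Same field names as _MAIN_RUNTIME_FIELDS in the diff.
MAIN_RUNTIME_FIELDS = ("provider", "model", "base_url", "api_key", "api_mode")


def normalize_main_runtime(main_runtime: Optional[Dict[str, Any]]) -> Dict[str, str]:
    """Return a sanitized copy holding only known, non-empty string fields."""
    if not isinstance(main_runtime, dict):
        return {}
    normalized: Dict[str, str] = {}
    for field in MAIN_RUNTIME_FIELDS:
        value = main_runtime.get(field)
        # Assumed policy: ignore None, non-strings, and whitespace-only values.
        if isinstance(value, str) and value.strip():
            normalized[field] = value.strip()
    return normalized
```

Under this policy, the `or ""` fallbacks in `_current_main_runtime()` would simply disappear from the normalized result, so callers could treat a missing key as "no override".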
@@ -1716,7 +1728,10 @@ class AIAgent:
            from agent.auxiliary_client import get_text_auxiliary_client
            from agent.model_metadata import get_model_context_length

-            client, aux_model = get_text_auxiliary_client("compression")
+            client, aux_model = get_text_auxiliary_client(
+                "compression",
+                main_runtime=self._current_main_runtime(),
+            )
            if client is None or not aux_model:
                msg = (
                    "⚠ No auxiliary LLM provider configured — context "
@@ -3178,6 +3193,12 @@ class AIAgent:
                f"not on any model name returned by the API."
            )

+        # Environment hints (WSL, Termux, etc.) — tell the agent about the
+        # execution environment so it can translate paths and adapt behavior.
+        _env_hints = build_environment_hints()
+        if _env_hints:
+            prompt_parts.append(_env_hints)
+
        platform_key = (self.platform or "").lower().strip()
        if platform_key in PLATFORM_HINTS:
            prompt_parts.append(PLATFORM_HINTS[platform_key])
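A minimal sketch of what an environment-hint builder could look like. The body of `build_environment_hints()` is not part of this diff, so the function name `sketch_environment_hints`, its parameters, and the hint wording are all hypothetical. The two detection signals are general conventions: WSL kernels report a release string containing "microsoft", and Termux exports `TERMUX_VERSION`.

```python
import os
import platform
from typing import Mapping, Optional


def sketch_environment_hints(
    env: Optional[Mapping[str, str]] = None,
    kernel_release: Optional[str] = None,
) -> str:
    """Return newline-joined hints, or an empty string if nothing applies."""
    env = os.environ if env is None else env
    release = platform.release() if kernel_release is None else kernel_release

    hints = []
    if "microsoft" in release.lower():
        # WSL mounts Windows drives under /mnt/<letter> by default.
        hints.append("Environment: WSL. Windows paths like C:\\Users map to /mnt/c/Users.")
    if "TERMUX_VERSION" in env:
        hints.append("Environment: Termux on Android. There is no /usr; use $PREFIX instead.")
    return "\n".join(hints)
```

Returning an empty string when nothing matches fits the `if _env_hints:` guard in the hunk above, so the prompt stays unchanged on ordinary hosts.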
@@ -1,29 +1,51 @@
 ---
 name: github-code-review
-description: Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Uses GitHub MCP tools (mcp_github_*) as the primary interface, with git CLI for local diff operations.
-version: 2.0.0
+description: Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl.
+version: 1.1.0
 author: Hermes Agent
 license: MIT
 metadata:
  hermes:
-    tags: [GitHub, Code-Review, Pull-Requests, Git, Quality, MCP]
+    tags: [GitHub, Code-Review, Pull-Requests, Git, Quality]
    related_skills: [github-auth, github-pr-workflow]
 ---

 # GitHub Code Review

-Perform code reviews on local changes before pushing, or review open PRs on GitHub. This skill uses **GitHub MCP tools** (`mcp_github_*`) as the primary interface for all GitHub API interactions, with plain `git` for local diff operations.
+Perform code reviews on local changes before pushing, or review open PRs on GitHub. Most of this skill uses plain `git` — the `gh`/`curl` split only matters for PR-level interactions.

 ## Prerequisites

-- GitHub MCP server configured (provides `mcp_github_*` tools)
-- Inside a git repository (for local diff operations)
+- Authenticated with GitHub (see `github-auth` skill)
+- Inside a git repository
+
+### Setup (for PR interactions)
+
+```bash
+if command -v gh &>/dev/null && gh auth status &>/dev/null; then
+  AUTH="gh"
+else
+  AUTH="git"
+  if [ -z "$GITHUB_TOKEN" ]; then
+    if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
+      GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
+    elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
+      GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
+    fi
+  fi
+fi
+
+REMOTE_URL=$(git remote get-url origin)
+OWNER_REPO=$(echo "$REMOTE_URL" | sed -E 's|.*github\.com[:/]||; s|\.git$||')
+OWNER=$(echo "$OWNER_REPO" | cut -d/ -f1)
+REPO=$(echo "$OWNER_REPO" | cut -d/ -f2)
+```

 ---

 ## 1. Reviewing Local Changes (Pre-Push)

-Local diffs use plain `git` — no API needed.
+This is pure `git` — works everywhere, no API needed.

 ### Get the Diff

@@ -100,206 +122,158 @@ When reviewing local changes, present findings in this structure:

 ---

-## 2. Reviewing a Pull Request on GitHub (MCP Tools)
+## 2. Reviewing a Pull Request on GitHub

-### Step 1: Gather PR Context
+### View PR Details

-Use MCP tools to get PR metadata, description, and changed files:
-
-```
-# Get PR details (title, author, description, branch, status)
-mcp_github_pull_request_read(method="get", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-
-# Get the diff
-mcp_github_pull_request_read(method="get_diff", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-
-# Get list of changed files with additions/deletions
-mcp_github_pull_request_read(method="get_files", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-
-# Get CI/CD status
-mcp_github_pull_request_read(method="get_status", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-
-# Get check runs (individual CI jobs)
-mcp_github_pull_request_read(method="get_check_runs", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-```
-
-### Step 2: Read File Contents for Context
-
-For each changed file, read the full file to understand the surrounding context:
-
-```
-# Read specific files from the PR branch
-mcp_github_get_file_contents(owner=OWNER, repo=REPO, path="src/auth/login.py", ref="refs/pull/PR_NUMBER/head")
-```
-
-### Step 3: Check Out Locally (Optional — for running tests)
-
-If you need to run tests or linters locally:
+**With gh:**

 ```bash
-git fetch origin pull/PR_NUMBER/head:pr-PR_NUMBER
-git checkout pr-PR_NUMBER
-
-# Run tests
-python -m pytest 2>&1 | tail -20
-
-# Run linter
-ruff check . 2>&1 | head -30
+gh pr view 123
+gh pr diff 123
+gh pr diff 123 --name-only
 ```

-### Step 4: Get Existing Review Comments
-
-Check what's already been discussed:
-
-```
-# Get review threads (grouped comments on code locations)
-mcp_github_pull_request_read(method="get_review_comments", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-
-# Get general PR comments
-mcp_github_pull_request_read(method="get_comments", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-
-# Get formal reviews (approvals, change requests)
-mcp_github_pull_request_read(method="get_reviews", owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-```
-
-### Step 5: Apply the Review Checklist (Section 3)
-
-Go through each category systematically.
-
-### Step 6: Submit a Formal Review with Inline Comments
-
-Use the MCP review tools to submit findings:
-
-**Create a pending review, add inline comments, then submit:**
-
-```
-# Step A: Create a pending review (omit "event" to keep it pending)
-mcp_github_pull_request_review_write(
-    method="create",
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER
-)
-
-# Step B: Add inline comments to the pending review
-mcp_github_add_comment_to_pending_review(
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER,
-    path="src/auth.py",
-    line=45,
-    body="🔴 **Critical:** User input passed directly to SQL query — use parameterized queries.",
-    subjectType="LINE",
-    side="RIGHT"
-)
-
-mcp_github_add_comment_to_pending_review(
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER,
-    path="src/models/user.py",
-    line=23,
-    body="⚠️ **Warning:** Password stored without hashing. Use bcrypt or argon2.",
-    subjectType="LINE",
-    side="RIGHT"
-)
-
-# Step C: Submit the pending review
-mcp_github_pull_request_review_write(
-    method="submit_pending",
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER,
-    event="REQUEST_CHANGES",  # or "APPROVE" or "COMMENT"
-    body="## Hermes Agent Review\n\nFound 2 issues. See inline comments."
-)
-```
-
-**Or submit a review directly (no pending step):**
-
-```
-# Approve
-mcp_github_pull_request_review_write(
-    method="create",
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER,
-    event="APPROVE",
-    body="LGTM! Code looks clean — good test coverage, no security concerns."
-)
-
-# Request changes
-mcp_github_pull_request_review_write(
-    method="create",
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER,
-    event="REQUEST_CHANGES",
-    body="Found a few issues — see inline comments."
-)
-```
-
-### Step 7: Post a Summary Comment
-
-Leave a top-level summary so the PR author gets the full picture:
-
-```
-mcp_github_add_issue_comment(
-    owner=OWNER,
-    repo=REPO,
-    issue_number=PR_NUMBER,
-    body="""## Code Review Summary
-
-**Verdict: Changes Requested** (2 issues, 1 suggestion)
-
-### 🔴 Critical
-- **src/auth.py:45** — SQL injection vulnerability
-
-### ⚠️ Warnings
-- **src/models.py:23** — Plaintext password storage
-
-### 💡 Suggestions
-- **src/utils.py:8** — Duplicated logic, consider consolidating
-
-### ✅ Looks Good
-- Clean API design
-- Good error handling in the middleware layer
-
----
-*Reviewed by Hermes Agent*"""
-)
-```
-
-### Step 8: Reply to Existing Comments
-
-If the PR author responds to your review:
-
-```
-# Reply to a specific review comment
-mcp_github_add_reply_to_pull_request_comment(
-    owner=OWNER,
-    repo=REPO,
-    pullNumber=PR_NUMBER,
-    commentId=COMMENT_ID,
-    body="Good point! That approach works too."
-)
-```
-
-### Step 9: Request Copilot Review (Optional)
-
-For automated AI feedback before your review:
-
-```
-mcp_github_request_copilot_review(owner=OWNER, repo=REPO, pullNumber=PR_NUMBER)
-```
-
-### Step 10: Clean Up (if checked out locally)
+**With git + curl:**

 ```bash
-git checkout main
-git branch -D pr-PR_NUMBER
+PR_NUMBER=123
+
+# Get PR details
+curl -s \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \
+  | python3 -c "
+import sys, json
+pr = json.load(sys.stdin)
+print(f\"Title: {pr['title']}\")
+print(f\"Author: {pr['user']['login']}\")
+print(f\"Branch: {pr['head']['ref']} -> {pr['base']['ref']}\")
+print(f\"State: {pr['state']}\")
+print(f\"Body:\n{pr['body']}\")"
+
+# List changed files
+curl -s \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/files \
+  | python3 -c "
+import sys, json
+for f in json.load(sys.stdin):
+    print(f\"{f['status']:10} +{f['additions']:-4} -{f['deletions']:-4}  {f['filename']}\")"
 ```

+### Check Out PR Locally for Full Review
+
+This works with plain `git` — no `gh` needed:
+
+```bash
+# Fetch the PR branch and check it out
+git fetch origin pull/123/head:pr-123
+git checkout pr-123
+
+# Now you can use read_file, search_files, run tests, etc.
+
+# View diff against the base branch
+git diff main...pr-123
+```
+
+**With gh (shortcut):**
+
+```bash
+gh pr checkout 123
+```
+
+### Leave Comments on a PR
+
+**General PR comment — with gh:**
+
+```bash
+gh pr comment 123 --body "Overall looks good, a few suggestions below."
+```
+
+**General PR comment — with curl:**
+
+```bash
+curl -s -X POST \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/issues/$PR_NUMBER/comments \
+  -d '{"body": "Overall looks good, a few suggestions below."}'
+```
+
+### Leave Inline Review Comments
+
+**Single inline comment — with gh (via API):**
+
+```bash
+HEAD_SHA=$(gh pr view 123 --json headRefOid --jq '.headRefOid')
+
+gh api repos/$OWNER/$REPO/pulls/123/comments \
+  --method POST \
+  -f body="This could be simplified with a list comprehension." \
+  -f path="src/auth/login.py" \
+  -f commit_id="$HEAD_SHA" \
+  -F line=45 \
+  -f side="RIGHT"
+```
+
+**Single inline comment — with curl:**
+
+```bash
+# Get the head commit SHA
+HEAD_SHA=$(curl -s \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \
+  | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])")
+
+curl -s -X POST \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/comments \
+  -d "{
+    \"body\": \"This could be simplified with a list comprehension.\",
+    \"path\": \"src/auth/login.py\",
+    \"commit_id\": \"$HEAD_SHA\",
+    \"line\": 45,
+    \"side\": \"RIGHT\"
+  }"
+```
+
+### Submit a Formal Review (Approve / Request Changes)
+
+**With gh:**
+
+```bash
+gh pr review 123 --approve --body "LGTM!"
+gh pr review 123 --request-changes --body "See inline comments."
+gh pr review 123 --comment --body "Some suggestions, nothing blocking."
+```
+
+**With curl — multi-comment review submitted atomically:**
+
+```bash
+HEAD_SHA=$(curl -s \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \
+  | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])")
+
+curl -s -X POST \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/reviews \
+  -d "{
+    \"commit_id\": \"$HEAD_SHA\",
+    \"event\": \"COMMENT\",
+    \"body\": \"Code review from Hermes Agent\",
+    \"comments\": [
+      {\"path\": \"src/auth.py\", \"line\": 45, \"body\": \"Use parameterized queries to prevent SQL injection.\"},
+      {\"path\": \"src/models/user.py\", \"line\": 23, \"body\": \"Hash passwords with bcrypt before storing.\"},
+      {\"path\": \"tests/test_auth.py\", \"line\": 1, \"body\": \"Add test for expired token edge case.\"}
+    ]
+  }"
+```
+
+Event values: `"APPROVE"`, `"REQUEST_CHANGES"`, `"COMMENT"`
+
+The `line` field refers to the line number in the *new* version of the file. For deleted lines, use `"side": "LEFT"`.
+
 ---

 ## 3. Review Checklist
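The curl examples in the hunk above hand-escape JSON inside bash double quotes, which gets fragile once comment bodies contain quotes or newlines. A minimal sketch of building the same payload in Python follows; `parse_owner_repo` and `build_review_payload` are hypothetical helper names, not part of the skill, and the code only constructs the request body (sending it is left to curl or any HTTP client).

```python
import json
import re
from typing import Dict, List, Tuple

VALID_EVENTS = {"APPROVE", "REQUEST_CHANGES", "COMMENT"}


def parse_owner_repo(remote_url: str) -> Tuple[str, str]:
    """Mirror the sed expression from the setup block: strip host and .git."""
    match = re.search(r"github\.com[:/]([^/]+)/(.+?)(?:\.git)?/?$", remote_url)
    if match is None:
        raise ValueError(f"not a GitHub remote: {remote_url!r}")
    return match.group(1), match.group(2)


def build_review_payload(
    commit_id: str, event: str, body: str, comments: List[Dict[str, object]]
) -> str:
    """Return the JSON body for POST /repos/{owner}/{repo}/pulls/{n}/reviews."""
    if event not in VALID_EVENTS:
        raise ValueError(f"event must be one of {sorted(VALID_EVENTS)}")
    payload = {
        "commit_id": commit_id,
        "event": event,
        "body": body,
        # Default each comment to the new side of the diff, as in the examples.
        "comments": [{"side": "RIGHT", **c} for c in comments],
    }
    return json.dumps(payload)
```

The resulting string can be written to a file and passed to curl with `-d @file`, which avoids shell quoting entirely.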
@@ -316,7 +290,6 @@ When performing a code review (local or PR), systematically check:
 - Input validation on user-facing inputs
 - No SQL injection, XSS, or path traversal
 - Auth/authz checks where needed
-- Use `mcp_github_run_secret_scanning` on changed files for automated secret detection

 ### Code Quality
 - Clear naming (variables, functions, classes)
@@ -354,30 +327,151 @@ When the user asks you to "review the code" or "check before pushing":

 ---

-## 5. PR Review Workflow (End-to-End with MCP Tools)
+## 5. PR Review Workflow (End-to-End)

-When the user asks you to "review PR #N", "look at this PR", or gives you a PR URL:
+When the user asks you to "review PR #N", "look at this PR", or gives you a PR URL, follow this recipe:

-### Quick Reference
+### Step 1: Set up environment

-| Task | MCP Tool |
-|------|----------|
-| Get PR details | `mcp_github_pull_request_read(method="get")` |
-| Get PR diff | `mcp_github_pull_request_read(method="get_diff")` |
-| Get changed files | `mcp_github_pull_request_read(method="get_files")` |
-| Get CI status | `mcp_github_pull_request_read(method="get_status")` |
-| Get check runs | `mcp_github_pull_request_read(method="get_check_runs")` |
-| Read file contents | `mcp_github_get_file_contents(ref="refs/pull/N/head")` |
-| Get review threads | `mcp_github_pull_request_read(method="get_review_comments")` |
-| Get PR comments | `mcp_github_pull_request_read(method="get_comments")` |
-| Get reviews | `mcp_github_pull_request_read(method="get_reviews")` |
-| Create pending review | `mcp_github_pull_request_review_write(method="create")` |
-| Add inline comment | `mcp_github_add_comment_to_pending_review()` |
-| Submit review | `mcp_github_pull_request_review_write(method="submit_pending")` |
-| Add PR comment | `mcp_github_add_issue_comment()` |
-| Reply to comment | `mcp_github_add_reply_to_pull_request_comment()` |
-| Scan for secrets | `mcp_github_run_secret_scanning()` |
-| Request Copilot review | `mcp_github_request_copilot_review()` |
+```bash
+source ~/.hermes/skills/github/github-auth/scripts/gh-env.sh
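+# Or run the inline setup block from the top of this skill. It sets OWNER/REPO,
+# while the curl commands below read GH_OWNER/GH_REPO, so alias them:
+# GH_OWNER=$OWNER GH_REPO=$REPO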
+```
+
+### Step 2: Gather PR context
+
+Get the PR metadata, description, and list of changed files to understand scope before diving into code.
+
+**With gh:**
+```bash
+gh pr view 123
+gh pr diff 123 --name-only
+gh pr checks 123
+```
+
+**With curl:**
+```bash
+PR_NUMBER=123
+
+# PR details (title, author, description, branch)
+curl -s -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER
+
+# Changed files with line counts
+curl -s -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER/files
+```
+
+### Step 3: Check out the PR locally
+
+This gives you full access to `read_file`, `search_files`, and the ability to run tests.
+
+```bash
+git fetch origin pull/$PR_NUMBER/head:pr-$PR_NUMBER
+git checkout pr-$PR_NUMBER
+```
+
+### Step 4: Read the diff and understand changes
+
+```bash
+# Full diff against the base branch
+git diff main...HEAD
+
+# Or file-by-file for large PRs
+git diff main...HEAD --name-only
+# Then for each file:
+git diff main...HEAD -- path/to/file.py
+```
+
+For each changed file, use `read_file` to see full context around the changes — diffs alone can miss issues visible only with surrounding code.
+
+### Step 5: Run automated checks locally (if applicable)
+
+```bash
+# Run tests if there's a test suite
+python -m pytest 2>&1 | tail -20
+# or: npm test, cargo test, go test ./..., etc.
+
+# Run linter if configured
+ruff check . 2>&1 | head -30
+# or: eslint, clippy, etc.
+```
+
+### Step 6: Apply the review checklist (Section 3)
+
+Go through each category: Correctness, Security, Code Quality, Testing, Performance, Documentation.
+
+### Step 7: Post the review to GitHub
+
+Collect your findings and submit them as a formal review with inline comments.
+
+**With gh:**
+```bash
+# If no issues — approve
+gh pr review $PR_NUMBER --approve --body "Reviewed by Hermes Agent. Code looks clean — good test coverage, no security concerns."
+
+# If issues found — request changes with inline comments
+gh pr review $PR_NUMBER --request-changes --body "Found a few issues — see inline comments."
+```
+
+**With curl — atomic review with multiple inline comments:**
+```bash
+HEAD_SHA=$(curl -s -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER \
+  | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])")
+
+# Build the review JSON — event is APPROVE, REQUEST_CHANGES, or COMMENT
+curl -s -X POST \
+  -H "Authorization: token $GITHUB_TOKEN" \
+  https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER/reviews \
+  -d "{
+    \"commit_id\": \"$HEAD_SHA\",
+    \"event\": \"REQUEST_CHANGES\",
+    \"body\": \"## Hermes Agent Review\n\nFound 2 issues, 1 suggestion. See inline comments.\",
+    \"comments\": [
+      {\"path\": \"src/auth.py\", \"line\": 45, \"body\": \"🔴 **Critical:** User input passed directly to SQL query — use parameterized queries.\"},
+      {\"path\": \"src/models.py\", \"line\": 23, \"body\": \"⚠️ **Warning:** Password stored without hashing.\"},
+      {\"path\": \"src/utils.py\", \"line\": 8, \"body\": \"💡 **Suggestion:** This duplicates logic in core/utils.py:34.\"}
+    ]
+  }"
+```
+
+### Step 8: Also post a summary comment
+
+In addition to inline comments, leave a top-level summary so the PR author gets the full picture at a glance. Use the review output format from `references/review-output-template.md`.
+
+**With gh:**
+```bash
+gh pr comment $PR_NUMBER --body "$(cat <<'EOF'
+## Code Review Summary
+
+**Verdict: Changes Requested** (2 issues, 1 suggestion)
+
+### 🔴 Critical
+- **src/auth.py:45** — SQL injection vulnerability
+
+### ⚠️ Warnings
+- **src/models.py:23** — Plaintext password storage
+
+### 💡 Suggestions
+- **src/utils.py:8** — Duplicated logic, consider consolidating
+
+### ✅ Looks Good
+- Clean API design
+- Good error handling in the middleware layer
+
+---
+*Reviewed by Hermes Agent*
+EOF
+)"
+```
+
+### Step 9: Clean up
+
+```bash
+git checkout main
+git branch -D pr-$PR_NUMBER
+```

 ### Decision: Approve vs Request Changes vs Comment

@@ -0,0 +1,374 @@
+# Python Module Taste Guide
+
+_Opinionated notes on structuring Python projects where many people (and agents) contribute. Not a style guide — a taste document._
+
+---
+
+## 1. The file is the unit of understanding
+
+Every `.py` file should be explainable in one sentence. If you can't say "this file handles X" without using the word "and", split it.
+
+```
+# Good: one sentence each
+backends/docker.py      → "Docker execution backend"
+backends/ssh.py         → "SSH execution backend"
+agent/planner.py        → "Step planning and decomposition"
+agent/executor.py       → "Tool dispatch and result collection"
+
+# Bad: needs "and"
+agent/core.py           → "Planning and execution and tool dispatch and error handling"
+utils/helpers.py        → "Retry logic and string formatting and path resolution"
+```
+
+A 400-line file with one clear purpose is better than four 100-line files with fuzzy purposes.
+
+---
+
+## 2. Directory = namespace = concept boundary
+
+A directory exists to group files that share a concept AND need to import each other. If the files don't need each other, they don't need a directory — they can be siblings.
+
+```
+# This directory earns its existence:
+backends/
+├── __init__.py     # re-exports Backend, DockerBackend, etc.
+├── base.py         # Protocol/ABC
+├── docker.py       # imports base
+├── ssh.py          # imports base
+└── local.py        # imports base
+
+# This directory shouldn't exist:
+utils/
+├── __init__.py
+├── retry.py        # used by backends
+├── formatting.py   # used by cli
+└── paths.py        # used by config
+# These have nothing to do with each other. Just put them where they're used.
+```
+
+**Test:** if you delete the `__init__.py` and the directory, would each file work as a top-level module? If yes, the directory is probably just cosmetic grouping, not a real namespace.
+
+---
+
+## 3. Flat until it hurts (the two-level rule)
+
+Start with at most two levels of nesting under `src/`. Add a third level only when a directory has 7+ files AND they cluster into obvious sub-groups.
+
+```
+# Good: two levels
+src/hermes_agent/
+├── backends/
+├── agent/
+├── tools/
+├── config/
+└── cli/
+
+# Premature: three levels when you only have 2 files
+src/hermes_agent/
+└── backends/
+    └── docker/
+        ├── __init__.py
+        ├── container.py    # only 80 lines
+        └── image.py        # only 60 lines
+        # Just keep this as backends/docker.py until it's 300+ lines
+```
+
+Depth costs cognitive overhead. Every nested directory is a question: "do I look in `docker/` or `docker/container/`?" Flat trees answer questions faster.
+
+---
+
+## 4. `__init__.py` is your public API
+
+Treat `__init__.py` as the **only** file external consumers should import from. Everything else is internal.
+
+```python
+# backends/__init__.py
+from .base import Backend, ExecResult
+from .docker import DockerBackend
+from .local import LocalBackend
+from .ssh import SSHBackend
+
+__all__ = ["Backend", "ExecResult", "DockerBackend", "LocalBackend", "SSHBackend"]
+```
+
+This gives you freedom to refactor internals. You can split `docker.py` into `docker_container.py` + `docker_network.py` without changing any external imports — because everyone imports from `backends`, not from `backends.docker`.
+
+**Rule:** if you see `from hermes_agent.backends.docker import DockerBackend` in the agent code, that's a smell. It should be `from hermes_agent.backends import DockerBackend`.
+
+---
+
+## 5. Dependency arrows flow one way
+
+Draw the import graph. It should be a DAG with clear layers:
+
+```
+cli
+ ↓
+agent
+ ↓  ↘
+tools  backends
+ ↓       ↓
+config  config
+```
+
+**Hard rules:**
+
+- `config` imports nothing from the project (it's the leaf)
+- `backends` never imports from `agent`
+- `tools` never imports from `agent`
+- `agent` imports from `tools` and `backends`
+- `cli` imports from `agent` (and maybe `config`)
+
+If you're tempted to create a circular import, you're missing an interface. Extract the shared type into `config` or a `types.py` at the appropriate level.
+
+```python
+# Bad: circular
+# agent/executor.py imports backends.docker
+# backends/docker.py imports agent.state  ← circular!
+
+# Fix: extract the shared type
+# types.py (or config/types.py)
+from dataclasses import dataclass
+
+@dataclass
+class AgentState:
+    ...
+
+# Now both agent and backends can import from types
+```
+
+---
+
+## 6. One file owns each type
+
+Every important class/dataclass/Protocol should live in exactly one file, and that file should be obvious from the type name.
+
+```
+BackendConfig      → config/backends.py   (or config.py if config is flat)
+DockerBackend      → backends/docker.py
+AgentLoop          → agent/loop.py
+ToolRegistry       → tools/registry.py
+```
+
+Anti-pattern: putting `BackendConfig` in `backends/base.py` because "it's related to backends." No — config lives in config. Backends _use_ config, they don't _define_ config. This keeps the dependency arrows clean.
+
+---
+
+## 7. Protocols over ABCs for external contracts
+
+Use `typing.Protocol` when you want to define "what shape does this thing have" without forcing inheritance. Use ABCs when you want to share implementation.
+
+```python
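+from abc import ABC, abstractmethod
+from pathlib import Path
+from typing import Protocol, runtime_checkable
+
+# Protocol: structural subtyping, no inheritance needed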
+# Good for: interfaces consumed by other packages
+@runtime_checkable
+class Backend(Protocol):
+    async def execute(self, cmd: str) -> ExecResult: ...
+    async def upload(self, local: Path, remote: str) -> None: ...
+
+# ABC: nominal subtyping, shared implementation
+# Good for: when backends share 50+ lines of common logic
+class BaseBackend(ABC):
+    def __init__(self, config: BackendConfig):
+        self.config = config
+        self._setup_logging()  # shared
+
+    @abstractmethod
+    async def execute(self, cmd: str) -> ExecResult: ...
+
+    def _setup_logging(self):  # shared implementation
+        ...
+```
+
+**Default to Protocol.** Reach for ABC only when you have real shared code, not just shared signatures.
+
+---
+
+## 8. Config is typed, loaded once, passed explicitly
+
+```python
+# config.py
+from pathlib import Path
+
+import yaml
+from pydantic import BaseModel
+
+class BackendConfig(BaseModel):
+    type: str = "local"
+    docker_image: str | None = None
+    ssh_host: str | None = None
+    timeout: float = 30.0
+
+class AgentConfig(BaseModel):
+    model: str = "gpt-4o"
+    max_steps: int = 50
+    backend: BackendConfig = BackendConfig()
+
+# Loading happens once, at the edge
+def load_config(path: Path) -> AgentConfig:
+    raw = yaml.safe_load(path.read_text()) or {}  # empty file parses to None
+    return AgentConfig(**raw)
+```
+
+**Rules:**
+
+- Config classes import nothing from the project
+- Config is loaded in `cli/` or `main()`, never inside library code
+- No module-level globals like `CONFIG = load_config()`. Pass it through constructors.
+- No `os.environ.get()` scattered through library code. Read env vars in config loading only.
+
+---
+
+## 9. The "where does new code go?" test
+
+Before committing to a structure, simulate these scenarios:
+
+| Scenario                                 | Should be obvious where to add it                       |
+| ---------------------------------------- | ------------------------------------------------------- |
+| New execution backend (e.g., Kubernetes) | `backends/kubernetes.py` + register in `__init__.py`    |
+| New CLI subcommand                       | `cli/new_command.py` or a function in existing cli file |
+| New tool for the agent                   | `tools/new_tool.py` + register in tool registry         |
+| New config option                        | Add field to existing config model in `config.py`       |
+| Bug fix in SSH execution                 | `backends/ssh.py`, nowhere else                         |
+| New eval benchmark                       | `eval/new_benchmark.py`                                 |
+
+If any of these require touching 5+ files or the answer is "I'm not sure," the structure needs work.
+
+---
+
+## 10. Files that earn their existence
+
+Every file in the project should pass one of these tests:
+
+1. **It's the single home for a concept** (e.g., `docker.py` owns DockerBackend)
+2. **It's a boundary** (e.g., `__init__.py` defines the public API)
+3. **It's an entrypoint** (e.g., `__main__.py`, CLI commands)
+4. **It's config/constants** (e.g., `config.py`, `defaults.py`)
+
+Files that don't pass: `helpers.py`, `misc.py`, `common.py`, `base.py` (when it has no ABC/Protocol), `types.py` (when it has 2 types that belong in their respective modules).
+
+---
+
+## 11. Naming that communicates
+
+**Files:** noun or noun_phrase, lowercase_snake. The name should tell you what's _in_ the file, not what it _does_.
+
+```
+# Good: tells you what's inside
+registry.py         # contains ToolRegistry
+docker.py           # contains DockerBackend
+planner.py          # contains Planner, PlanStep
+
+# Bad: tells you what it does (vague)
+run.py              # run what?
+process.py          # process what?
+handle.py           # handle what?
+```
+
+**Directories:** plural nouns for collections, singular for a single concern.
+
+```
+backends/           # plural: collection of backend implementations
+config/             # singular: one concern
+tools/              # plural: collection of tools
+agent/              # singular: one agent system
+```
+
+---
+
+## 12. Tests: organize by confidence, not by source
+
+```
+tests/
+├── unit/              # fast, isolated, mock everything external
+│   ├── test_planner.py
+│   └── test_config.py
+├── integration/       # real backends, real I/O, but controlled
+│   ├── test_docker_backend.py
+│   └── test_ssh_backend.py
+├── e2e/               # full agent runs, slow, CI-only
+│   ├── test_ctf_solve.py
+│   └── test_migration.py
+└── fixtures/          # shared test data
+    ├── sample_config.yaml
+    └── mock_responses/
+```
+
+Don't mirror `src/` 1:1. Test files group by _what you're verifying_, not which source file they exercise. `test_agent_can_recover_from_backend_failure.py` might touch `agent/`, `backends/`, and `config/` — and that's fine.
+
+---
+
+## 13. The import order tells a story
+
+Within a file, imports should read top-down as: stdlib → third-party → project internals, with project internals going from "far away" to "nearby."
+
+```python
+# stdlib
+import asyncio
+from pathlib import Path
+
+# third-party
+from pydantic import BaseModel
+
+# project: far away (config is a leaf, used everywhere)
+from hermes_agent.config import BackendConfig
+
+# project: nearby (same package)
+from .base import Backend, ExecResult
+```
+
+This isn't just aesthetics — it makes dependency direction visible at a glance.
+
+---
+
+## 14. When to split a file
+
+Split when ANY of these are true:
+
+- File exceeds ~400 lines AND has 2+ distinct responsibilities
+- Two people frequently have merge conflicts in the same file
+- You find yourself adding `# --- Section: X ---` comments to navigate
+- The file has internal classes/functions that another module wants to import
+
+Do NOT split just because:
+
+- The file is "long" (a 600-line file with one clear purpose is fine)
+- You "might need to" someday
+- A linter told you to
+
+---
+
+## 15. Module-level code is a liability
+
+Every line that runs at import time is a line that can break `import hermes_agent`.
+
+```python
+# Bad: runs at import time
+import docker
+client = docker.from_env()  # crashes if Docker isn't running
+
+# Good: lazy, runs when needed
+def get_docker_client():
+    import docker
+    return docker.from_env()
+
+# Also good: runs in __init__, not at module level
+class DockerBackend:
+    def __init__(self, config: BackendConfig):
+        import docker
+        self._client = docker.from_env()
+```
+
+**Rule:** module-level code should only be: imports, type definitions, constants, and function/class definitions. Never side effects.
+
+---
+
+## Reading list
+
+These are worth reading not for rules but for _calibrating your taste_:
+
+- **Hynek Schlawack — [Testing & Packaging](https://hynek.me/articles/testing-packaging/)**: Best single article on src layout and why it matters.
+- **Python Packaging Guide — [src layout vs flat layout](https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/)**: The official take.
+- **Brandon Rhodes — [The Clean Architecture in Python](https://rhodesmill.org/brandon/talks/#clean-architecture-python)** (PyCon talk): Good for understanding dependency direction without going full enterprise.
+- **Cosmicpython — [Architecture Patterns with Python](https://www.cosmicpython.com/)**: Free online book. Chapters 1-4 on the repository pattern and dependency inversion are relevant; skip the CQRS/event-sourcing stuff unless you need it.
+- **Hatch documentation**: Modern Python project management. Reading how Hatch structures things will passively teach you good layout conventions.
+- **Any well-structured open source project**: `httpx`, `pydantic`, `ruff` (written in Rust, but the Python wrapper layout is instructive), `textual`. Read their `src/` trees and `__init__.py` files.
+
+---
+
+_The goal is not elegance. The goal is that a new contributor — human or agent — can go from "I need to change X" to "I know which file to open" in under 10 seconds._
@@ -971,6 +971,74 @@ class TestTaskSpecificOverrides:
            client, model = get_text_auxiliary_client("compression")
        assert model == "google/gemini-3-flash-preview"  # auto → OpenRouter

+    def test_resolve_auto_prefers_live_main_runtime_over_persisted_config(self, monkeypatch, tmp_path):
+        """Session-only live model switches should override persisted config for auto routing."""
+        hermes_home = tmp_path / "hermes"
+        hermes_home.mkdir(parents=True, exist_ok=True)
+        (hermes_home / "config.yaml").write_text(
+            """model:
+  default: glm-5.1
+  provider: opencode-go
+compression:
+  summary_provider: auto
+"""
+        )
+        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+
+        calls = []
+
+        def _fake_resolve(provider, model=None, *args, **kwargs):
+            calls.append((provider, model, kwargs))
+            return MagicMock(), model or "resolved-model"
+
+        with patch("agent.auxiliary_client.resolve_provider_client", side_effect=_fake_resolve):
+            client, model = _resolve_auto(
+                main_runtime={
+                    "provider": "openai-codex",
+                    "model": "gpt-5.4",
+                    "api_mode": "codex_responses",
+                }
+            )
+
+        assert client is not None
+        assert model == "gpt-5.4"
+        assert calls[0][0] == "openai-codex"
+        assert calls[0][1] == "gpt-5.4"
+        assert calls[0][2]["api_mode"] == "codex_responses"
+
+    def test_explicit_compression_pin_still_wins_over_live_main_runtime(self, monkeypatch, tmp_path):
+        """Task-level compression config should beat a live session override."""
+        hermes_home = tmp_path / "hermes"
+        hermes_home.mkdir(parents=True, exist_ok=True)
+        (hermes_home / "config.yaml").write_text(
+            """auxiliary:
+  compression:
+    provider: openrouter
+    model: google/gemini-3-flash-preview
+model:
+  default: glm-5.1
+  provider: opencode-go
+"""
+        )
+        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+
+        with patch("agent.auxiliary_client.resolve_provider_client", return_value=(MagicMock(), "google/gemini-3-flash-preview")) as mock_resolve:
+            client, model = get_text_auxiliary_client(
+                "compression",
+                main_runtime={
+                    "provider": "openai-codex",
+                    "model": "gpt-5.4",
+                },
+            )
+
+        assert client is not None
+        assert model == "google/gemini-3-flash-preview"
+        assert mock_resolve.call_args.args[0] == "openrouter"
+        assert mock_resolve.call_args.kwargs["main_runtime"] == {
+            "provider": "openai-codex",
+            "model": "gpt-5.4",
+        }
+
    def test_compression_summary_base_url_from_config(self, monkeypatch, tmp_path):
        """compression.summary_base_url should produce a custom-endpoint client."""
        hermes_home = tmp_path / "hermes"
@@ -1560,3 +1628,74 @@ class TestStaleBaseUrlWarning:

        assert not any("OPENAI_BASE_URL is set" in rec.message for rec in caplog.records), \
            "Warning should not fire a second time"
+
+
+# ---------------------------------------------------------------------------
+# Anthropic-compatible image block conversion
+# ---------------------------------------------------------------------------
+
+class TestAnthropicCompatImageConversion:
+    """Tests for _is_anthropic_compat_endpoint and _convert_openai_images_to_anthropic."""
+
+    def test_known_providers_detected(self):
+        from agent.auxiliary_client import _is_anthropic_compat_endpoint
+        assert _is_anthropic_compat_endpoint("minimax", "")
+        assert _is_anthropic_compat_endpoint("minimax-cn", "")
+
+    def test_openrouter_not_detected(self):
+        from agent.auxiliary_client import _is_anthropic_compat_endpoint
+        assert not _is_anthropic_compat_endpoint("openrouter", "")
+        assert not _is_anthropic_compat_endpoint("anthropic", "")
+
+    def test_url_based_detection(self):
+        from agent.auxiliary_client import _is_anthropic_compat_endpoint
+        assert _is_anthropic_compat_endpoint("custom", "https://api.minimax.io/anthropic")
+        assert _is_anthropic_compat_endpoint("custom", "https://example.com/anthropic/v1")
+        assert not _is_anthropic_compat_endpoint("custom", "https://api.openai.com/v1")
+
+    def test_base64_image_converted(self):
+        from agent.auxiliary_client import _convert_openai_images_to_anthropic
+        messages = [{
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "describe"},
+                {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBOR="}}
+            ]
+        }]
+        result = _convert_openai_images_to_anthropic(messages)
+        img_block = result[0]["content"][1]
+        assert img_block["type"] == "image"
+        assert img_block["source"]["type"] == "base64"
+        assert img_block["source"]["media_type"] == "image/png"
+        assert img_block["source"]["data"] == "iVBOR="
+
+    def test_url_image_converted(self):
+        from agent.auxiliary_client import _convert_openai_images_to_anthropic
+        messages = [{
+            "role": "user",
+            "content": [
+                {"type": "image_url", "image_url": {"url": "https://example.com/img.jpg"}}
+            ]
+        }]
+        result = _convert_openai_images_to_anthropic(messages)
+        img_block = result[0]["content"][0]
+        assert img_block["type"] == "image"
+        assert img_block["source"]["type"] == "url"
+        assert img_block["source"]["url"] == "https://example.com/img.jpg"
+
+    def test_text_only_messages_unchanged(self):
+        from agent.auxiliary_client import _convert_openai_images_to_anthropic
+        messages = [{"role": "user", "content": "Hello"}]
+        result = _convert_openai_images_to_anthropic(messages)
+        assert result[0] is messages[0]  # same object, not copied
+
+    def test_jpeg_media_type_parsed(self):
+        from agent.auxiliary_client import _convert_openai_images_to_anthropic
+        messages = [{
+            "role": "user",
+            "content": [
+                {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,/9j/="}}
+            ]
+        }]
+        result = _convert_openai_images_to_anthropic(messages)
+        assert result[0]["content"][0]["source"]["media_type"] == "image/jpeg"
@@ -191,6 +191,37 @@ class TestNonStringContent:
        kwargs = mock_call.call_args.kwargs
        assert "temperature" not in kwargs

+    def test_summary_call_passes_live_main_runtime(self):
+        mock_response = MagicMock()
+        mock_response.choices = [MagicMock()]
+        mock_response.choices[0].message.content = "ok"
+
+        with patch("agent.context_compressor.get_model_context_length", return_value=100000):
+            c = ContextCompressor(
+                model="gpt-5.4",
+                provider="openai-codex",
+                base_url="https://chatgpt.com/backend-api/codex",
+                api_key="codex-token",
+                api_mode="codex_responses",
+                quiet_mode=True,
+            )
+
+        messages = [
+            {"role": "user", "content": "do something"},
+            {"role": "assistant", "content": "ok"},
+        ]
+
+        with patch("agent.context_compressor.call_llm", return_value=mock_response) as mock_call:
+            c._generate_summary(messages)
+
+        assert mock_call.call_args.kwargs["main_runtime"] == {
+            "model": "gpt-5.4",
+            "provider": "openai-codex",
+            "base_url": "https://chatgpt.com/backend-api/codex",
+            "api_key": "codex-token",
+            "api_mode": "codex_responses",
+        }
+

 class TestSummaryFailureCooldown:
    def test_summary_failure_enters_cooldown_and_skips_retry(self):
@@ -87,7 +87,10 @@ class TestProviderMapping:

    def test_unmapped_provider_not_in_dict(self):
        assert "nous" not in PROVIDER_TO_MODELS_DEV
-        assert "openai-codex" not in PROVIDER_TO_MODELS_DEV
+
+    def test_openai_codex_mapped_to_openai(self):
+        assert PROVIDER_TO_MODELS_DEV["openai"] == "openai"
+        assert PROVIDER_TO_MODELS_DEV["openai-codex"] == "openai"


 class TestExtractContext:
@@ -18,6 +18,7 @@ from agent.prompt_builder import (
    build_skills_system_prompt,
    build_nous_subscription_prompt,
    build_context_files_prompt,
+    build_environment_hints,
    CONTEXT_FILE_MAX_CHARS,
    DEFAULT_AGENT_IDENTITY,
    TOOL_USE_ENFORCEMENT_GUIDANCE,
@@ -26,6 +27,7 @@ from agent.prompt_builder import (
    MEMORY_GUIDANCE,
    SESSION_SEARCH_GUIDANCE,
    PLATFORM_HINTS,
+    WSL_ENVIRONMENT_HINT,
 )
 from hermes_cli.nous_subscription import NousFeatureState, NousSubscriptionFeatures

@@ -770,6 +772,29 @@ class TestPromptBuilderConstants:
        assert "cli" in PLATFORM_HINTS


+# =========================================================================
+# Environment hints
+# =========================================================================
+
+class TestEnvironmentHints:
+    def test_wsl_hint_constant_mentions_mnt(self):
+        assert "/mnt/c/" in WSL_ENVIRONMENT_HINT
+        assert "WSL" in WSL_ENVIRONMENT_HINT
+
+    def test_build_environment_hints_on_wsl(self, monkeypatch):
+        import agent.prompt_builder as _pb
+        monkeypatch.setattr(_pb, "is_wsl", lambda: True)
+        result = _pb.build_environment_hints()
+        assert "/mnt/" in result
+        assert "WSL" in result
+
+    def test_build_environment_hints_not_wsl(self, monkeypatch):
+        import agent.prompt_builder as _pb
+        monkeypatch.setattr(_pb, "is_wsl", lambda: False)
+        result = _pb.build_environment_hints()
+        assert result == ""
+
+
 # =========================================================================
 # Conditional skill activation
 # =========================================================================
@@ -0,0 +1,226 @@
+"""Tests for the clean shutdown marker that prevents unwanted session auto-resets.
+
+When the gateway shuts down gracefully (hermes update, gateway restart, /restart),
+it writes a .clean_shutdown marker.  On the next startup, if the marker exists,
+suspend_recently_active() is skipped so users don't lose their sessions.
+
+After a crash (no marker), suspension still fires as a safety net for stuck sessions.
+"""
+
+import os
+from datetime import datetime, timedelta
+from pathlib import Path
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import pytest
+
+from gateway.config import GatewayConfig, Platform, PlatformConfig, SessionResetPolicy
+from gateway.session import SessionEntry, SessionSource, SessionStore
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _make_source(platform=Platform.TELEGRAM, chat_id="123", user_id="u1"):
+    return SessionSource(platform=platform, chat_id=chat_id, user_id=user_id)
+
+
+def _make_store(tmp_path, policy=None):
+    config = GatewayConfig()
+    if policy:
+        config.default_reset_policy = policy
+    return SessionStore(sessions_dir=tmp_path, config=config)
+
+
+# ---------------------------------------------------------------------------
+# SessionStore.suspend_recently_active
+# ---------------------------------------------------------------------------
+
+class TestSuspendRecentlyActive:
+    """Verify suspend_recently_active only marks recent sessions."""
+
+    def test_suspends_recently_active_sessions(self, tmp_path):
+        store = _make_store(tmp_path)
+        source = _make_source()
+        entry = store.get_or_create_session(source)
+        assert not entry.suspended
+
+        count = store.suspend_recently_active()
+        assert count == 1
+
+        # Re-fetch: the suspended session is auto-reset on its next access
+        refreshed = store.get_or_create_session(source)
+        assert refreshed.was_auto_reset
+
+    def test_does_not_suspend_old_sessions(self, tmp_path):
+        store = _make_store(tmp_path)
+        source = _make_source()
+        entry = store.get_or_create_session(source)
+
+        # Backdate the session's updated_at beyond the cutoff
+        with store._lock:
+            entry.updated_at = datetime.now() - timedelta(seconds=300)
+            store._save()
+
+        count = store.suspend_recently_active(max_age_seconds=120)
+        assert count == 0
+
+    def test_already_suspended_not_double_counted(self, tmp_path):
+        store = _make_store(tmp_path)
+        source = _make_source()
+        entry = store.get_or_create_session(source)
+
+        # Suspend once
+        count1 = store.suspend_recently_active()
+        assert count1 == 1
+
+        # Access the session again; the suspended one is reset into a new session
+        store.get_or_create_session(source)
+
+        # Suspend again — the new session is recent but not yet suspended
+        count2 = store.suspend_recently_active()
+        assert count2 == 1
+
+
+# ---------------------------------------------------------------------------
+# Clean shutdown marker integration
+# ---------------------------------------------------------------------------
+
+class TestCleanShutdownMarker:
+    """Test that the marker file controls session suspension on startup."""
+
+    def test_marker_written_on_graceful_stop(self, tmp_path, monkeypatch):
+        """stop() should write .clean_shutdown marker."""
+        monkeypatch.setattr("gateway.run._hermes_home", tmp_path)
+        marker = tmp_path / ".clean_shutdown"
+        assert not marker.exists()
+
+        # Create a minimal runner and call the shutdown logic directly
+        from gateway.run import GatewayRunner
+        runner = object.__new__(GatewayRunner)
+        runner._restart_requested = False
+        runner._restart_detached = False
+        runner._restart_via_service = False
+        runner._restart_task_started = False
+        runner._running = True
+        runner._draining = False
+        runner._stop_task = None
+        runner._running_agents = {}
+        runner._pending_messages = {}
+        runner._pending_approvals = {}
+        runner._background_tasks = set()
+        runner._shutdown_event = MagicMock()
+        runner._restart_drain_timeout = 5
+        runner._exit_code = None
+        runner._exit_reason = None
+        runner.adapters = {}
+        runner.config = GatewayConfig()
+
+        # Mock heavy dependencies
+        with patch("gateway.run.GatewayRunner._drain_active_agents", new_callable=AsyncMock, return_value=([], False)), \
+             patch("gateway.run.GatewayRunner._finalize_shutdown_agents"), \
+             patch("gateway.run.GatewayRunner._update_runtime_status"), \
+             patch("gateway.status.remove_pid_file"), \
+             patch("tools.process_registry.process_registry") as mock_proc_reg, \
+             patch("tools.terminal_tool.cleanup_all_environments"), \
+             patch("tools.browser_tool.cleanup_all_browsers"):
+            mock_proc_reg.kill_all = MagicMock()
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(runner.stop())
+
+        assert marker.exists(), ".clean_shutdown marker should exist after graceful stop"
+
+    def test_marker_skips_suspension_on_startup(self, tmp_path, monkeypatch):
+        """If .clean_shutdown exists, suspend_recently_active should NOT be called."""
+        monkeypatch.setattr("gateway.run._hermes_home", tmp_path)
+
+        # Create the marker
+        marker = tmp_path / ".clean_shutdown"
+        marker.touch()
+
+        # Create a store with a recently active session
+        store = _make_store(tmp_path)
+        source = _make_source()
+        entry = store.get_or_create_session(source)
+        assert not entry.suspended
+
+        # Simulate what start() does:
+        if marker.exists():
+            marker.unlink()
+            # Should NOT call suspend_recently_active
+        else:
+            store.suspend_recently_active()
+
+        # Session should NOT be suspended
+        with store._lock:
+            store._ensure_loaded_locked()
+            for e in store._entries.values():
+                assert not e.suspended, "Session should NOT be suspended after clean shutdown"
+
+        assert not marker.exists(), "Marker should be cleaned up"
+
+    def test_no_marker_triggers_suspension(self, tmp_path, monkeypatch):
+        """Without .clean_shutdown marker (crash), suspension should fire."""
+        monkeypatch.setattr("gateway.run._hermes_home", tmp_path)
+
+        marker = tmp_path / ".clean_shutdown"
+        assert not marker.exists()
+
+        # Create a store with a recently active session
+        store = _make_store(tmp_path)
+        source = _make_source()
+        entry = store.get_or_create_session(source)
+        assert not entry.suspended
+
+        # Simulate what start() does:
+        if marker.exists():
+            marker.unlink()
+        else:
+            store.suspend_recently_active()
+
+        # Session SHOULD be suspended (crash recovery)
+        with store._lock:
+            store._ensure_loaded_locked()
+            suspended_count = sum(1 for e in store._entries.values() if e.suspended)
+        assert suspended_count == 1, "Session should be suspended after crash (no marker)"
+
+    def test_marker_written_on_restart_stop(self, tmp_path, monkeypatch):
+        """stop(restart=True) should also write the marker."""
+        monkeypatch.setattr("gateway.run._hermes_home", tmp_path)
+        marker = tmp_path / ".clean_shutdown"
+
+        from gateway.run import GatewayRunner
+        runner = object.__new__(GatewayRunner)
+        runner._restart_requested = False
+        runner._restart_detached = False
+        runner._restart_via_service = False
+        runner._restart_task_started = False
+        runner._running = True
+        runner._draining = False
+        runner._stop_task = None
+        runner._running_agents = {}
+        runner._pending_messages = {}
+        runner._pending_approvals = {}
+        runner._background_tasks = set()
+        runner._shutdown_event = MagicMock()
+        runner._restart_drain_timeout = 5
+        runner._exit_code = None
+        runner._exit_reason = None
+        runner.adapters = {}
+        runner.config = GatewayConfig()
+
+        with patch("gateway.run.GatewayRunner._drain_active_agents", new_callable=AsyncMock, return_value=([], False)), \
+             patch("gateway.run.GatewayRunner._finalize_shutdown_agents"), \
+             patch("gateway.run.GatewayRunner._update_runtime_status"), \
+             patch("gateway.status.remove_pid_file"), \
+             patch("tools.process_registry.process_registry") as mock_proc_reg, \
+             patch("tools.terminal_tool.cleanup_all_environments"), \
+             patch("tools.browser_tool.cleanup_all_browsers"):
+            mock_proc_reg.kill_all = MagicMock()
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(runner.stop(restart=True))
+
+        assert marker.exists(), ".clean_shutdown marker should exist after restart-stop too"
@@ -403,6 +403,56 @@ class TestWatchUpdateProgress:

        # Should not crash; legacy notification handles this case

+    @pytest.mark.asyncio
+    async def test_prompt_forwarded_only_once(self, tmp_path):
+        """Regression: prompt must not be re-sent on every poll cycle.
+
+        Before the fix, the watcher never deleted .update_prompt.json after
+        forwarding, causing the same prompt to be sent every poll_interval.
+        """
+        runner = _make_runner()
+        hermes_home = tmp_path / "hermes"
+        hermes_home.mkdir()
+
+        pending = {"platform": "telegram", "chat_id": "111", "user_id": "222",
+                   "session_key": "agent:main:telegram:dm:111"}
+        (hermes_home / ".update_pending.json").write_text(json.dumps(pending))
+        (hermes_home / ".update_output.txt").write_text("")
+
+        mock_adapter = AsyncMock()
+        runner.adapters = {Platform.TELEGRAM: mock_adapter}
+
+        # Write the prompt file up front (before the watcher starts).
+        # The watcher should forward it exactly once, then delete it.
+        prompt = {"prompt": "Would you like to configure new options now? Y/n",
+                  "default": "n", "id": "dup-test"}
+        (hermes_home / ".update_prompt.json").write_text(json.dumps(prompt))
+
+        async def finish_after_polls():
+            # Wait long enough for multiple poll cycles to occur, then
+            # simulate a response + completion.
+            await asyncio.sleep(1.0)
+            (hermes_home / ".update_response").write_text("n")
+            await asyncio.sleep(0.3)
+            (hermes_home / ".update_exit_code").write_text("0")
+
+        with patch("gateway.run._hermes_home", hermes_home):
+            task = asyncio.create_task(finish_after_polls())
+            await runner._watch_update_progress(
+                poll_interval=0.1,
+                stream_interval=0.2,
+                timeout=10.0,
+            )
+            await task
+
+        # Count how many times the prompt text was sent
+        all_sent = [str(c) for c in mock_adapter.send.call_args_list]
+        prompt_sends = [s for s in all_sent if "configure new options" in s]
+        assert len(prompt_sends) == 1, (
+            f"Prompt was sent {len(prompt_sends)} times (expected 1). "
+            f"All sends: {all_sent}"
+        )
+

 # ---------------------------------------------------------------------------
 # Message interception for update prompts
@@ -14,6 +14,7 @@ from hermes_cli.auth import (
    PROVIDER_REGISTRY,
    _read_codex_tokens,
    _save_codex_tokens,
+    _write_codex_cli_tokens,
    _import_codex_cli_tokens,
    get_codex_auth_status,
    get_provider_auth_state,
@@ -161,7 +162,7 @@ def test_import_codex_cli_tokens_missing(tmp_path, monkeypatch):


 def test_codex_tokens_not_written_to_shared_file(tmp_path, monkeypatch):
-    """Verify Hermes never writes to ~/.codex/auth.json."""
+    """Verify _save_codex_tokens writes only to Hermes auth store, not ~/.codex/."""
    hermes_home = tmp_path / "hermes"
    codex_home = tmp_path / "codex-cli"
    hermes_home.mkdir(parents=True, exist_ok=True)
@@ -173,7 +174,7 @@ def test_codex_tokens_not_written_to_shared_file(tmp_path, monkeypatch):

    _save_codex_tokens({"access_token": "hermes-at", "refresh_token": "hermes-rt"})

-    # ~/.codex/auth.json should NOT exist
+    # ~/.codex/auth.json should NOT exist — _save_codex_tokens only touches Hermes store
    assert not (codex_home / "auth.json").exists()

    # Hermes auth store should have the tokens
@@ -181,6 +182,98 @@ def test_codex_tokens_not_written_to_shared_file(tmp_path, monkeypatch):
    assert data["tokens"]["access_token"] == "hermes-at"


+def test_write_codex_cli_tokens_creates_file(tmp_path, monkeypatch):
+    """_write_codex_cli_tokens creates ~/.codex/auth.json with refreshed tokens."""
+    codex_home = tmp_path / "codex-cli"
+    monkeypatch.setenv("CODEX_HOME", str(codex_home))
+
+    _write_codex_cli_tokens("new-access", "new-refresh", last_refresh="2026-04-12T00:00:00Z")
+
+    auth_path = codex_home / "auth.json"
+    assert auth_path.exists()
+    data = json.loads(auth_path.read_text())
+    assert data["tokens"]["access_token"] == "new-access"
+    assert data["tokens"]["refresh_token"] == "new-refresh"
+    assert data["last_refresh"] == "2026-04-12T00:00:00Z"
+    # Verify file permissions are restricted
+    assert (auth_path.stat().st_mode & 0o777) == 0o600
+
+
+def test_write_codex_cli_tokens_preserves_existing(tmp_path, monkeypatch):
+    """_write_codex_cli_tokens preserves extra fields in existing auth.json."""
+    codex_home = tmp_path / "codex-cli"
+    codex_home.mkdir(parents=True, exist_ok=True)
+    monkeypatch.setenv("CODEX_HOME", str(codex_home))
+
+    existing = {
+        "tokens": {
+            "access_token": "old-access",
+            "refresh_token": "old-refresh",
+            "extra_field": "preserved",
+        },
+        "last_refresh": "2026-01-01T00:00:00Z",
+        "custom_key": "keep_me",
+    }
+    (codex_home / "auth.json").write_text(json.dumps(existing))
+
+    _write_codex_cli_tokens("updated-access", "updated-refresh")
+
+    data = json.loads((codex_home / "auth.json").read_text())
+    assert data["tokens"]["access_token"] == "updated-access"
+    assert data["tokens"]["refresh_token"] == "updated-refresh"
+    assert data["tokens"]["extra_field"] == "preserved"
+    assert data["custom_key"] == "keep_me"
+    # last_refresh not updated since we didn't pass it
+    assert data["last_refresh"] == "2026-01-01T00:00:00Z"
+
+
+def test_write_codex_cli_tokens_handles_missing_dir(tmp_path, monkeypatch):
+    """_write_codex_cli_tokens creates parent directories if missing."""
+    codex_home = tmp_path / "does" / "not" / "exist"
+    monkeypatch.setenv("CODEX_HOME", str(codex_home))
+
+    _write_codex_cli_tokens("at", "rt")
+
+    assert (codex_home / "auth.json").exists()
+    data = json.loads((codex_home / "auth.json").read_text())
+    assert data["tokens"]["access_token"] == "at"
+
+
+def test_refresh_codex_auth_tokens_writes_back_to_cli(tmp_path, monkeypatch):
+    """After refreshing, _refresh_codex_auth_tokens writes back to ~/.codex/auth.json."""
+    from hermes_cli.auth import _refresh_codex_auth_tokens
+
+    hermes_home = tmp_path / "hermes"
+    codex_home = tmp_path / "codex-cli"
+    hermes_home.mkdir(parents=True, exist_ok=True)
+    codex_home.mkdir(parents=True, exist_ok=True)
+    (hermes_home / "auth.json").write_text(json.dumps({"version": 1, "providers": {}}))
+    monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+    monkeypatch.setenv("CODEX_HOME", str(codex_home))
+
+    # Write initial CLI tokens
+    (codex_home / "auth.json").write_text(json.dumps({
+        "tokens": {"access_token": "old-at", "refresh_token": "old-rt"},
+    }))
+
+    # Mock the pure refresh to return new tokens
+    monkeypatch.setattr("hermes_cli.auth.refresh_codex_oauth_pure", lambda *a, **kw: {
+        "access_token": "refreshed-at",
+        "refresh_token": "refreshed-rt",
+        "last_refresh": "2026-04-12T01:00:00Z",
+    })
+
+    _refresh_codex_auth_tokens(
+        {"access_token": "old-at", "refresh_token": "old-rt"},
+        timeout_seconds=10,
+    )
+
+    # Verify CLI file was updated
+    cli_data = json.loads((codex_home / "auth.json").read_text())
+    assert cli_data["tokens"]["access_token"] == "refreshed-at"
+    assert cli_data["tokens"]["refresh_token"] == "refreshed-rt"
+
+
 def test_resolve_returns_hermes_auth_store_source(tmp_path, monkeypatch):
    hermes_home = tmp_path / "hermes"
    _setup_hermes_auth(hermes_home)
@@ -12,49 +12,10 @@ from unittest.mock import MagicMock, patch
 import pytest

 from hermes_cli.config import (
-    _is_inside_container,
    get_container_exec_info,
 )


-# =============================================================================
-# _is_inside_container
-# =============================================================================
-
-
-def test_is_inside_container_dockerenv():
-    """Detects /.dockerenv marker file."""
-    with patch("os.path.exists") as mock_exists:
-        mock_exists.side_effect = lambda p: p == "/.dockerenv"
-        assert _is_inside_container() is True
-
-
-def test_is_inside_container_containerenv():
-    """Detects Podman's /run/.containerenv marker."""
-    with patch("os.path.exists") as mock_exists:
-        mock_exists.side_effect = lambda p: p == "/run/.containerenv"
-        assert _is_inside_container() is True
-
-
-def test_is_inside_container_cgroup_docker():
-    """Detects 'docker' in /proc/1/cgroup."""
-    with patch("os.path.exists", return_value=False), \
-         patch("builtins.open", create=True) as mock_open:
-        mock_open.return_value.__enter__ = lambda s: s
-        mock_open.return_value.__exit__ = MagicMock(return_value=False)
-        mock_open.return_value.read = MagicMock(
-            return_value="12:memory:/docker/abc123\n"
-        )
-        assert _is_inside_container() is True
-
-
-def test_is_inside_container_false_on_host():
-    """Returns False when none of the container indicators are present."""
-    with patch("os.path.exists", return_value=False), \
-         patch("builtins.open", side_effect=OSError("no such file")):
-        assert _is_inside_container() is False
-
-
 # =============================================================================
 # get_container_exec_info
 # =============================================================================
@@ -81,7 +42,7 @@ def container_env(tmp_path, monkeypatch):

 def test_get_container_exec_info_returns_metadata(container_env):
    """Reads .container-mode and returns all fields including exec_user."""
-    with patch("hermes_cli.config._is_inside_container", return_value=False):
+    with patch("hermes_constants.is_container", return_value=False):
        info = get_container_exec_info()

    assert info is not None
@@ -93,7 +54,7 @@ def test_get_container_exec_info_returns_metadata(container_env):

 def test_get_container_exec_info_none_inside_container(container_env):
    """Returns None when we're already inside a container."""
-    with patch("hermes_cli.config._is_inside_container", return_value=True):
+    with patch("hermes_constants.is_container", return_value=True):
        info = get_container_exec_info()

    assert info is None
@@ -106,7 +67,7 @@ def test_get_container_exec_info_none_without_file(tmp_path, monkeypatch):
    monkeypatch.setenv("HERMES_HOME", str(hermes_home))
    monkeypatch.delenv("HERMES_DEV", raising=False)

-    with patch("hermes_cli.config._is_inside_container", return_value=False):
+    with patch("hermes_constants.is_container", return_value=False):
        info = get_container_exec_info()

    assert info is None
@@ -116,7 +77,7 @@ def test_get_container_exec_info_skipped_when_hermes_dev(container_env, monkeypa
    """Returns None when HERMES_DEV=1 is set (dev mode bypass)."""
    monkeypatch.setenv("HERMES_DEV", "1")

-    with patch("hermes_cli.config._is_inside_container", return_value=False):
+    with patch("hermes_constants.is_container", return_value=False):
        info = get_container_exec_info()

    assert info is None
@@ -126,7 +87,7 @@ def test_get_container_exec_info_not_skipped_when_hermes_dev_zero(container_env,
    """HERMES_DEV=0 does NOT trigger bypass — only '1' does."""
    monkeypatch.setenv("HERMES_DEV", "0")

-    with patch("hermes_cli.config._is_inside_container", return_value=False):
+    with patch("hermes_constants.is_container", return_value=False):
        info = get_container_exec_info()

    assert info is not None
@@ -143,7 +104,7 @@ def test_get_container_exec_info_defaults():
            "# minimal file with no keys\n"
        )

-        with patch("hermes_cli.config._is_inside_container", return_value=False), \
+        with patch("hermes_constants.is_container", return_value=False), \
             patch("hermes_cli.config.get_hermes_home", return_value=hermes_home), \
             patch.dict(os.environ, {}, clear=False):
            os.environ.pop("HERMES_DEV", None)
@@ -165,7 +126,7 @@ def test_get_container_exec_info_docker_backend(container_env):
        "hermes_bin=/opt/hermes/bin/hermes\n"
    )

-    with patch("hermes_cli.config._is_inside_container", return_value=False):
+    with patch("hermes_constants.is_container", return_value=False):
        info = get_container_exec_info()

    assert info["backend"] == "docker"
@@ -176,7 +137,7 @@ def test_get_container_exec_info_docker_backend(container_env):

 def test_get_container_exec_info_crashes_on_permission_error(container_env):
    """PermissionError propagates instead of being silently swallowed."""
-    with patch("hermes_cli.config._is_inside_container", return_value=False), \
+    with patch("hermes_constants.is_container", return_value=False), \
         patch("builtins.open", side_effect=PermissionError("permission denied")):
        with pytest.raises(PermissionError):
            get_container_exec_info()
@@ -122,3 +122,54 @@ class TestCustomProviderModelSwitch:
        model = config.get("model")
        assert isinstance(model, dict)
        assert model["default"] == "model-X"
+
+    def test_api_mode_set_from_provider_info(self, config_home):
+        """When custom_providers entry has api_mode, it should be applied."""
+        import yaml
+        from hermes_cli.main import _model_flow_named_custom
+
+        provider_info = {
+            "name": "Anthropic Proxy",
+            "base_url": "https://proxy.example.com/anthropic",
+            "api_key": "***",
+            "model": "claude-3",
+            "api_mode": "anthropic_messages",
+        }
+
+        with patch("hermes_cli.models.fetch_api_models", return_value=["claude-3"]), \
+             patch.dict("sys.modules", {"simple_term_menu": None}), \
+             patch("builtins.input", return_value="1"), \
+             patch("builtins.print"):
+            _model_flow_named_custom({}, provider_info)
+
+        config = yaml.safe_load((config_home / "config.yaml").read_text()) or {}
+        model = config.get("model")
+        assert isinstance(model, dict)
+        assert model.get("api_mode") == "anthropic_messages"
+
+    def test_api_mode_cleared_when_not_specified(self, config_home):
+        """When custom_providers entry has no api_mode, stale api_mode is removed."""
+        import yaml
+        from hermes_cli.main import _model_flow_named_custom
+
+        # Pre-seed a stale api_mode in config
+        config_path = config_home / "config.yaml"
+        config_path.write_text(yaml.dump({"model": {"api_mode": "anthropic_messages"}}))
+
+        provider_info = {
+            "name": "My vLLM",
+            "base_url": "https://vllm.example.com/v1",
+            "api_key": "***",
+            "model": "llama-3",
+        }
+
+        with patch("hermes_cli.models.fetch_api_models", return_value=["llama-3"]), \
+             patch.dict("sys.modules", {"simple_term_menu": None}), \
+             patch("builtins.input", return_value="1"), \
+             patch("builtins.print"):
+            _model_flow_named_custom({}, provider_info)
+
+        config = yaml.safe_load((config_home / "config.yaml").read_text()) or {}
+        model = config.get("model")
+        assert isinstance(model, dict)
+        assert "api_mode" not in model, "Stale api_mode should be removed"
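Review note: the two tests above pin down a set-or-clear rule for `api_mode`. A minimal sketch of that rule follows, assuming a hypothetical helper named `apply_api_mode`; the actual logic inside `_model_flow_named_custom` is not shown in this diff and may differ.

```python
from typing import Any, Dict


def apply_api_mode(model_cfg: Dict[str, Any], provider_info: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: sync model_cfg["api_mode"] with a custom provider entry.

    Returns a new dict so the caller's config is not mutated in place.
    """
    updated = dict(model_cfg)
    api_mode = provider_info.get("api_mode")
    if api_mode:
        # The provider entry declares an API mode: apply it.
        updated["api_mode"] = api_mode
    else:
        # No api_mode on the provider entry: drop any stale value left
        # behind by a previously selected provider.
        updated.pop("api_mode", None)
    return updated
```

Clearing the key when it is absent matters because a leftover `anthropic_messages` value would otherwise be applied to an endpoint that never asked for it.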
@@ -394,6 +394,21 @@ class TestLaunchdServiceRecovery:


 class TestGatewayServiceDetection:
+    def test_supports_systemd_services_requires_systemctl_binary(self, monkeypatch):
+        monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+        monkeypatch.setattr(gateway_cli.shutil, "which", lambda name: None)
+
+        assert gateway_cli.supports_systemd_services() is False
+
+    def test_supports_systemd_services_returns_true_when_systemctl_present(self, monkeypatch):
+        monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_wsl", lambda: False)
+        monkeypatch.setattr(gateway_cli.shutil, "which", lambda name: "/usr/bin/systemctl")
+
+        assert gateway_cli.supports_systemd_services() is True
+
    def test_is_service_running_checks_system_scope_when_user_scope_is_inactive(self, monkeypatch):
        user_unit = SimpleNamespace(exists=lambda: True)
        system_unit = SimpleNamespace(exists=lambda: True)
@@ -418,6 +433,23 @@ class TestGatewayServiceDetection:

        assert gateway_cli._is_service_running() is True

+    def test_is_service_running_returns_false_when_systemctl_missing(self, monkeypatch):
+        unit = SimpleNamespace(exists=lambda: True)
+
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
+        monkeypatch.setattr(
+            gateway_cli,
+            "get_systemd_unit_path",
+            lambda system=False: unit,
+        )
+
+        def fake_run(*args, **kwargs):
+            raise FileNotFoundError("systemctl")
+
+        monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
+
+        assert gateway_cli._is_service_running() is False
+

 class TestGatewaySystemServiceRouting:
    def test_systemd_restart_self_requests_graceful_restart_without_reload_or_restart(self, monkeypatch, capsys):
@@ -1001,3 +1033,91 @@ class TestSystemUnitPathRemapping:
        # Target user paths should be present
        assert "/home/alice" in unit
        assert "WorkingDirectory=/home/alice/.hermes/hermes-agent" in unit
+
+
+class TestDockerAwareGateway:
+    """Tests for Docker container awareness in gateway commands."""
+
+    def test_run_systemctl_raises_runtimeerror_when_missing(self, monkeypatch):
+        """_run_systemctl raises RuntimeError with container guidance when systemctl is absent."""
+        import pytest
+
+        def fake_run(cmd, **kwargs):
+            raise FileNotFoundError("systemctl")
+
+        monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
+
+        with pytest.raises(RuntimeError, match="systemctl is not available"):
+            gateway_cli._run_systemctl(["start", "hermes-gateway"])
+
+    def test_run_systemctl_passes_through_on_success(self, monkeypatch):
+        """_run_systemctl delegates to subprocess.run when systemctl exists."""
+        calls = []
+
+        def fake_run(cmd, **kwargs):
+            calls.append(cmd)
+            return SimpleNamespace(returncode=0, stdout="", stderr="")
+
+        monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
+
+        result = gateway_cli._run_systemctl(["status", "hermes-gateway"])
+        assert result.returncode == 0
+        assert len(calls) == 1
+        assert "status" in calls[0]
+
+    def test_install_in_container_prints_docker_guidance(self, monkeypatch, capsys):
+        """'hermes gateway install' inside Docker exits 0 with container guidance."""
+        import pytest
+
+        monkeypatch.setattr(gateway_cli, "is_managed", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_wsl", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_container", lambda: True)
+
+        args = SimpleNamespace(gateway_command="install", force=False, system=False, run_as_user=None)
+        with pytest.raises(SystemExit) as exc_info:
+            gateway_cli.gateway_command(args)
+
+        assert exc_info.value.code == 0
+        out = capsys.readouterr().out
+        assert "docker" in out.lower()
+        assert "restart" in out.lower()
+
+    def test_uninstall_in_container_prints_docker_guidance(self, monkeypatch, capsys):
+        """'hermes gateway uninstall' inside Docker exits 0 with container guidance."""
+        import pytest
+
+        monkeypatch.setattr(gateway_cli, "is_managed", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_container", lambda: True)
+
+        args = SimpleNamespace(gateway_command="uninstall", system=False)
+        with pytest.raises(SystemExit) as exc_info:
+            gateway_cli.gateway_command(args)
+
+        assert exc_info.value.code == 0
+        out = capsys.readouterr().out
+        assert "docker" in out.lower()
+
+    def test_start_in_container_prints_docker_guidance(self, monkeypatch, capsys):
+        """'hermes gateway start' inside Docker exits 0 with container guidance."""
+        import pytest
+
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_wsl", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_container", lambda: True)
+
+        args = SimpleNamespace(gateway_command="start", system=False)
+        with pytest.raises(SystemExit) as exc_info:
+            gateway_cli.gateway_command(args)
+
+        assert exc_info.value.code == 0
+        out = capsys.readouterr().out
+        assert "docker" in out.lower()
+        assert "hermes gateway run" in out
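Review note: the `_run_systemctl` tests above expect a missing binary to surface as a `RuntimeError` whose message contains "systemctl is not available". One way to get that behavior is sketched below. The function name `run_systemctl`, the injectable `runner` parameter, and the rest of the message wording are assumptions made for illustration, not the code in `hermes_cli/gateway.py`.

```python
import subprocess
from typing import Any, Callable, List


def run_systemctl(
    args: List[str],
    runner: Callable[..., Any] = subprocess.run,
    **kwargs: Any,
) -> Any:
    """Hypothetical wrapper: run systemctl, translating a missing binary
    into an actionable RuntimeError."""
    try:
        return runner(["systemctl", *args], **kwargs)
    except FileNotFoundError as exc:
        # subprocess.run raises FileNotFoundError when the executable is
        # not on PATH, which is the common case inside containers.
        raise RuntimeError(
            "systemctl is not available on this system; "
            "inside a container, manage the gateway with the container runtime instead"
        ) from exc
```

Chaining with `from exc` keeps the original `FileNotFoundError` in the traceback while giving callers a single exception type to handle.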
@@ -257,3 +257,76 @@ class TestProviderPersistsAfterModelSave:
        assert model.get("provider") == "opencode-go"
        assert model.get("default") == "minimax-m2.5"
        assert model.get("api_mode") == "anthropic_messages"
+
+
+class TestBaseUrlValidation:
+    """Reject non-URL values in the base URL prompt (e.g. shell commands)."""
+
+    def test_invalid_base_url_rejected(self, config_home, monkeypatch, capsys):
+        """Typing a non-URL string should not be saved as the base URL."""
+        from hermes_cli.auth import PROVIDER_REGISTRY
+
+        pconfig = PROVIDER_REGISTRY.get("zai")
+        if not pconfig:
+            pytest.skip("zai not in PROVIDER_REGISTRY")
+
+        monkeypatch.setenv("GLM_API_KEY", "test-key")
+
+        from hermes_cli.main import _model_flow_api_key_provider
+        from hermes_cli.config import load_config, get_env_value
+
+        # User types a shell command instead of a URL at the base URL prompt
+        with patch("hermes_cli.auth._prompt_model_selection", return_value="glm-5"), \
+             patch("hermes_cli.auth.deactivate_provider"), \
+             patch("builtins.input", return_value="nano ~/.hermes/.env"):
+            _model_flow_api_key_provider(load_config(), "zai", "old-model")
+
+        # The garbage value should NOT have been saved
+        saved = get_env_value("GLM_BASE_URL") or ""
+        assert not saved or saved.startswith(("http://", "https://")), \
+            f"Non-URL value was saved as GLM_BASE_URL: {saved}"
+        captured = capsys.readouterr()
+        assert "Invalid URL" in captured.out
+
+    def test_valid_base_url_accepted(self, config_home, monkeypatch):
+        """A proper URL should be saved normally."""
+        from hermes_cli.auth import PROVIDER_REGISTRY
+
+        pconfig = PROVIDER_REGISTRY.get("zai")
+        if not pconfig:
+            pytest.skip("zai not in PROVIDER_REGISTRY")
+
+        monkeypatch.setenv("GLM_API_KEY", "test-key")
+
+        from hermes_cli.main import _model_flow_api_key_provider
+        from hermes_cli.config import load_config, get_env_value
+
+        with patch("hermes_cli.auth._prompt_model_selection", return_value="glm-5"), \
+             patch("hermes_cli.auth.deactivate_provider"), \
+             patch("builtins.input", return_value="https://custom.z.ai/api/paas/v4"):
+            _model_flow_api_key_provider(load_config(), "zai", "old-model")
+
+        saved = get_env_value("GLM_BASE_URL") or ""
+        assert saved == "https://custom.z.ai/api/paas/v4"
+
+    def test_empty_base_url_keeps_default(self, config_home, monkeypatch):
+        """Pressing Enter (empty) should not change the base URL."""
+        from hermes_cli.auth import PROVIDER_REGISTRY
+
+        pconfig = PROVIDER_REGISTRY.get("zai")
+        if not pconfig:
+            pytest.skip("zai not in PROVIDER_REGISTRY")
+
+        monkeypatch.setenv("GLM_API_KEY", "test-key")
+        monkeypatch.delenv("GLM_BASE_URL", raising=False)
+
+        from hermes_cli.main import _model_flow_api_key_provider
+        from hermes_cli.config import load_config, get_env_value
+
+        with patch("hermes_cli.auth._prompt_model_selection", return_value="glm-5"), \
+             patch("hermes_cli.auth.deactivate_provider"), \
+             patch("builtins.input", return_value=""):
+            _model_flow_api_key_provider(load_config(), "zai", "old-model")
+
+        saved = get_env_value("GLM_BASE_URL") or ""
+        assert saved == "", "Empty input should not save a base URL"
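Review note: the three `TestBaseUrlValidation` cases above (garbage rejected, proper URL saved, empty input ignored) can be satisfied by a small predicate. The sketch below assumes a hypothetical helper `is_valid_base_url`; the validation actually added to `_model_flow_api_key_provider` is not visible in this diff.

```python
from urllib.parse import urlparse


def is_valid_base_url(value: str) -> bool:
    """Hypothetical check: accept only http(s) URLs that include a host."""
    candidate = value.strip()
    if not candidate.startswith(("http://", "https://")):
        return False
    try:
        # urlparse can raise ValueError on malformed input such as an
        # unterminated IPv6 literal.
        return bool(urlparse(candidate).netloc)
    except ValueError:
        return False
```

Under this sketch the caller would treat empty input as "keep the current value" before calling the predicate, and print the "Invalid URL" message only for non-empty input that fails the check.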
@@ -1,5 +1,4 @@
-"""Tests for setup_model_provider — verifies the delegation to
-select_provider_and_model() and config dict sync."""
+"""Tests for setup.py configuration flows."""
 import json
 import sys
 import types
@@ -8,6 +7,7 @@ import pytest

 from hermes_cli.auth import get_active_provider
 from hermes_cli.config import load_config, save_config
+from hermes_cli import setup as setup_mod
 from hermes_cli.setup import setup_model_provider


@@ -144,6 +144,85 @@ def test_setup_custom_providers_synced(tmp_path, monkeypatch):
    assert reloaded.get("custom_providers") == [{"name": "Local", "base_url": "http://localhost:8080/v1"}]


+def test_setup_gateway_skips_service_install_when_systemctl_missing(monkeypatch, capsys):
+    env = {
+        "TELEGRAM_BOT_TOKEN": "",
+        "TELEGRAM_HOME_CHANNEL": "",
+        "DISCORD_BOT_TOKEN": "",
+        "DISCORD_HOME_CHANNEL": "",
+        "SLACK_BOT_TOKEN": "",
+        "SLACK_HOME_CHANNEL": "",
+        "MATRIX_HOMESERVER": "https://matrix.example.com",
+        "MATRIX_USER_ID": "@alice:example.com",
+        "MATRIX_PASSWORD": "",
+        "MATRIX_ACCESS_TOKEN": "token",
+        "BLUEBUBBLES_SERVER_URL": "",
+        "BLUEBUBBLES_HOME_CHANNEL": "",
+        "WHATSAPP_ENABLED": "",
+        "WEBHOOK_ENABLED": "",
+    }
+
+    monkeypatch.setattr(setup_mod, "get_env_value", lambda key: env.get(key, ""))
+    monkeypatch.setattr(setup_mod, "prompt_yes_no", lambda *args, **kwargs: False)
+    monkeypatch.setattr("platform.system", lambda: "Linux")
+
+    import hermes_cli.gateway as gateway_mod
+
+    monkeypatch.setattr(gateway_mod, "supports_systemd_services", lambda: False)
+    monkeypatch.setattr(gateway_mod, "is_macos", lambda: False)
+    monkeypatch.setattr(gateway_mod, "_is_service_installed", lambda: False)
+    monkeypatch.setattr(gateway_mod, "_is_service_running", lambda: False)
+
+    setup_mod.setup_gateway({})
+
+    out = capsys.readouterr().out
+    assert "Messaging platforms configured!" in out
+    assert "Start the gateway to bring your bots online:" in out
+    assert "hermes gateway" in out
+
+
+def test_setup_gateway_in_container_shows_docker_guidance(monkeypatch, capsys):
+    """setup_gateway() in a Docker container shows Docker-specific restart instructions."""
+    env = {
+        "TELEGRAM_BOT_TOKEN": "",
+        "TELEGRAM_HOME_CHANNEL": "",
+        "DISCORD_BOT_TOKEN": "",
+        "DISCORD_HOME_CHANNEL": "",
+        "SLACK_BOT_TOKEN": "",
+        "SLACK_HOME_CHANNEL": "",
+        "MATRIX_HOMESERVER": "https://matrix.example.com",
+        "MATRIX_USER_ID": "@alice:example.com",
+        "MATRIX_PASSWORD": "",
+        "MATRIX_ACCESS_TOKEN": "token",
+        "BLUEBUBBLES_SERVER_URL": "",
+        "BLUEBUBBLES_HOME_CHANNEL": "",
+        "WHATSAPP_ENABLED": "",
+        "WEBHOOK_ENABLED": "",
+    }
+
+    monkeypatch.setattr(setup_mod, "get_env_value", lambda key: env.get(key, ""))
+    monkeypatch.setattr(setup_mod, "prompt_yes_no", lambda *args, **kwargs: False)
+    monkeypatch.setattr("platform.system", lambda: "Linux")
+
+    import hermes_cli.gateway as gateway_mod
+
+    monkeypatch.setattr(gateway_mod, "supports_systemd_services", lambda: False)
+    monkeypatch.setattr(gateway_mod, "is_macos", lambda: False)
+    monkeypatch.setattr(gateway_mod, "_is_service_installed", lambda: False)
+    monkeypatch.setattr(gateway_mod, "_is_service_running", lambda: False)
+
+    # Patch is_container on hermes_constants, the module setup.py imports it from
+    import hermes_constants
+    monkeypatch.setattr(hermes_constants, "is_container", lambda: True)
+
+    setup_mod.setup_gateway({})
+
+    out = capsys.readouterr().out
+    assert "Messaging platforms configured!" in out
+    assert "docker" in out.lower()
+    assert "restart" in out.lower()
+
+
 def test_setup_syncs_custom_provider_removal_from_disk(tmp_path, monkeypatch):
    """Removing the last custom provider in model setup should persist."""
    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
@@ -798,3 +798,120 @@ class TestFindGatewayPidsExclude:
        pids = gateway_cli.find_gateway_pids()

        assert pids == [100]
+
+
+# ---------------------------------------------------------------------------
+# Gateway mode writes exit code before restart (#8300)
+# ---------------------------------------------------------------------------
+
+
+class TestGatewayModeWritesExitCodeEarly:
+    """When running as ``hermes update --gateway``, the exit code marker must be
+    written *before* the gateway restart attempt.  Without this, systemd's
+    ``KillMode=mixed`` kills the update process (and its wrapping shell) during
+    the cgroup teardown, so the shell epilogue that normally writes the exit
+    code never executes.  The new gateway's update watcher then polls for 30
+    minutes and sends a spurious timeout message.
+    """
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_exit_code_written_in_gateway_mode(
+        self, mock_run, _mock_which, capsys, tmp_path, monkeypatch,
+    ):
+        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+
+        # Point HERMES_HOME at a temp dir so the marker file lands there
+        hermes_home = tmp_path / ".hermes"
+        hermes_home.mkdir()
+        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+        import hermes_cli.config as _cfg
+        monkeypatch.setattr(_cfg, "get_hermes_home", lambda: hermes_home)
+        # Also patch the module-level ref used by cmd_update
+        import hermes_cli.main as _main_mod
+        monkeypatch.setattr(_main_mod, "get_hermes_home", lambda: hermes_home)
+
+        mock_run.side_effect = _make_run_side_effect(commit_count="1")
+
+        args = SimpleNamespace(gateway=True)
+
+        with patch.object(gateway_cli, "find_gateway_pids", return_value=[]):
+            cmd_update(args)
+
+        exit_code_path = hermes_home / ".update_exit_code"
+        assert exit_code_path.exists(), ".update_exit_code not written in gateway mode"
+        assert exit_code_path.read_text() == "0"
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_exit_code_not_written_in_normal_mode(
+        self, mock_run, _mock_which, capsys, tmp_path, monkeypatch,
+    ):
+        """Non-gateway mode should NOT write the exit code (the shell does it)."""
+        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+
+        hermes_home = tmp_path / ".hermes"
+        hermes_home.mkdir()
+        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+        import hermes_cli.config as _cfg
+        monkeypatch.setattr(_cfg, "get_hermes_home", lambda: hermes_home)
+        import hermes_cli.main as _main_mod
+        monkeypatch.setattr(_main_mod, "get_hermes_home", lambda: hermes_home)
+
+        mock_run.side_effect = _make_run_side_effect(commit_count="1")
+
+        args = SimpleNamespace(gateway=False)
+
+        with patch.object(gateway_cli, "find_gateway_pids", return_value=[]):
+            cmd_update(args)
+
+        exit_code_path = hermes_home / ".update_exit_code"
+        assert not exit_code_path.exists(), ".update_exit_code should not be written outside gateway mode"
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_exit_code_written_before_restart_call(
+        self, mock_run, _mock_which, capsys, tmp_path, monkeypatch,
+    ):
+        """Exit code must exist BEFORE systemctl restart is called."""
+        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
+        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
+        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
+
+        hermes_home = tmp_path / ".hermes"
+        hermes_home.mkdir()
+        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+        import hermes_cli.config as _cfg
+        monkeypatch.setattr(_cfg, "get_hermes_home", lambda: hermes_home)
+        import hermes_cli.main as _main_mod
+        monkeypatch.setattr(_main_mod, "get_hermes_home", lambda: hermes_home)
+
+        exit_code_path = hermes_home / ".update_exit_code"
+
+        # Track whether exit code exists when systemctl restart is called
+        exit_code_existed_at_restart = []
+
+        original_side_effect = _make_run_side_effect(
+            commit_count="1", systemd_active=True,
+        )
+
+        def tracking_side_effect(cmd, **kwargs):
+            joined = " ".join(str(c) for c in cmd)
+            if "systemctl" in joined and "restart" in joined:
+                exit_code_existed_at_restart.append(exit_code_path.exists())
+            return original_side_effect(cmd, **kwargs)
+
+        mock_run.side_effect = tracking_side_effect
+
+        args = SimpleNamespace(gateway=True)
+
+        with patch.object(gateway_cli, "find_gateway_pids", return_value=[]):
+            cmd_update(args)
+
+        assert exit_code_existed_at_restart, "systemctl restart was never called"
+        assert exit_code_existed_at_restart[0] is True, \
+            ".update_exit_code must exist BEFORE systemctl restart (cgroup kill race)"
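Review note: the ordering requirement in the class docstring above (write the marker, then restart) is sketched below. The helper `finish_gateway_update` and its `restart` callback are hypothetical names; only the `.update_exit_code` file name and the "0" payload come from the tests.

```python
import tempfile
from pathlib import Path
from typing import Callable


def finish_gateway_update(hermes_home: Path, restart: Callable[[], None], exit_code: int = 0) -> None:
    """Hypothetical helper: persist the exit code, then trigger the restart."""
    marker = hermes_home / ".update_exit_code"
    # Write first: once the restart begins, the process running this code
    # may be killed along with the rest of its cgroup.
    marker.write_text(str(exit_code))
    restart()


def demo() -> bool:
    """Usage example: confirm the marker already exists when restart runs."""
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp)
        seen = []
        finish_gateway_update(home, lambda: seen.append((home / ".update_exit_code").exists()))
        return seen == [True]
```

The callback in `demo` plays the same role as `tracking_side_effect` in the test: it records whether the marker was on disk at the moment the restart was requested.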
@@ -26,6 +26,7 @@ def _make_agent(
    agent.provider = "openrouter"
    agent.base_url = "https://openrouter.ai/api/v1"
    agent.api_key = "sk-test"
+    agent.api_mode = "chat_completions"
    agent.quiet_mode = True
    agent.log_prefix = ""
    agent.compression_enabled = compression_enabled
@@ -99,6 +100,36 @@ def test_no_warning_when_aux_context_sufficient(mock_get_client, mock_ctx_len):
    assert agent._compression_warning is None


+def test_feasibility_check_passes_live_main_runtime():
+    """Compression feasibility should probe using the live session runtime."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    agent.model = "gpt-5.4"
+    agent.provider = "openai-codex"
+    agent.base_url = "https://chatgpt.com/backend-api/codex"
+    agent.api_key = "codex-token"
+    agent.api_mode = "codex_responses"
+
+    mock_client = MagicMock()
+    mock_client.base_url = "https://chatgpt.com/backend-api/codex"
+    mock_client.api_key = "codex-token"
+
+    with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(mock_client, "gpt-5.4")) as mock_get_client, \
+         patch("agent.model_metadata.get_model_context_length", return_value=200_000):
+        agent._emit_status = lambda msg: None
+        agent._check_compression_model_feasibility()
+
+    mock_get_client.assert_called_once_with(
+        "compression",
+        main_runtime={
+            "model": "gpt-5.4",
+            "provider": "openai-codex",
+            "base_url": "https://chatgpt.com/backend-api/codex",
+            "api_key": "codex-token",
+            "api_mode": "codex_responses",
+        },
+    )
+
+
 @patch("agent.auxiliary_client.get_text_auxiliary_client")
 def test_warns_when_no_auxiliary_provider(mock_get_client):
    """Warning emitted when no auxiliary provider is configured."""
@@ -0,0 +1,120 @@
+"""Tests for empty model fallback — when provider is configured but model is missing."""
+
+from unittest.mock import MagicMock, patch
+import pytest
+
+
+class TestGetDefaultModelForProvider:
+    """Unit tests for hermes_cli.models.get_default_model_for_provider."""
+
+    def test_known_provider_returns_first_model(self):
+        from hermes_cli.models import get_default_model_for_provider
+        result = get_default_model_for_provider("openai-codex")
+        # Should return first model from _PROVIDER_MODELS["openai-codex"]
+        assert result
+        assert isinstance(result, str)
+
+    def test_openrouter_returns_empty(self):
+        """OpenRouter uses dynamic model fetch, no static catalog entry."""
+        from hermes_cli.models import get_default_model_for_provider
+        # OpenRouter is not in _PROVIDER_MODELS — it uses live fetching
+        result = get_default_model_for_provider("openrouter")
+        assert result == ""
+
+    def test_unknown_provider_returns_empty(self):
+        from hermes_cli.models import get_default_model_for_provider
+        assert get_default_model_for_provider("nonexistent-provider") == ""
+
+    def test_custom_provider_returns_empty(self):
+        """Custom provider has no model catalog — should return empty."""
+        from hermes_cli.models import get_default_model_for_provider
+        # Custom providers don't have entries in _PROVIDER_MODELS
+        assert get_default_model_for_provider("some-random-custom") == ""
+
+
+class TestGatewayEmptyModelFallback:
+    """Test that _resolve_session_agent_runtime fills in empty model from provider catalog."""
+
+    def test_empty_model_filled_from_provider(self):
+        """When config has no model but provider is openai-codex, use first codex model."""
+        from gateway.run import GatewayRunner
+
+        runner = object.__new__(GatewayRunner)
+        runner._session_model_overrides = {}
+
+        # Mock _resolve_gateway_model to return empty string
+        # Mock _resolve_runtime_agent_kwargs to return openai-codex provider
+        with patch("gateway.run._resolve_gateway_model", return_value=""), \
+             patch("gateway.run._resolve_runtime_agent_kwargs", return_value={
+                 "provider": "openai-codex",
+                 "api_key": "test-key",
+                 "base_url": "https://chatgpt.com/backend-api/codex",
+                 "api_mode": "codex_responses",
+             }):
+            model, kwargs = runner._resolve_session_agent_runtime()
+
+        # Model should have been filled in from provider catalog
+        assert model, "Model should not be empty when provider is known"
+        assert isinstance(model, str)
+        assert kwargs["provider"] == "openai-codex"
+
+    def test_nonempty_model_not_overridden(self):
+        """When config has a model set, don't override it."""
+        from gateway.run import GatewayRunner
+
+        runner = object.__new__(GatewayRunner)
+        runner._session_model_overrides = {}
+
+        with patch("gateway.run._resolve_gateway_model", return_value="gpt-5.4"), \
+             patch("gateway.run._resolve_runtime_agent_kwargs", return_value={
+                 "provider": "openai-codex",
+                 "api_key": "test-key",
+                 "base_url": "https://chatgpt.com/backend-api/codex",
+                 "api_mode": "codex_responses",
+             }):
+            model, kwargs = runner._resolve_session_agent_runtime()
+
+        assert model == "gpt-5.4", "Explicit model should not be overridden"
+
+    def test_empty_model_no_provider_stays_empty(self):
+        """When both model and provider are empty, model stays empty."""
+        from gateway.run import GatewayRunner
+
+        runner = object.__new__(GatewayRunner)
+        runner._session_model_overrides = {}
+
+        with patch("gateway.run._resolve_gateway_model", return_value=""), \
+             patch("gateway.run._resolve_runtime_agent_kwargs", return_value={
+                 "provider": "",
+                 "api_key": "test-key",
+                 "base_url": "https://example.com",
+                 "api_mode": "chat_completions",
+             }):
+            model, kwargs = runner._resolve_session_agent_runtime()
+
+        # Can't fill in a default without knowing the provider
+        assert model == ""
+
+
+class TestResolveGatewayModel:
+    """Test _resolve_gateway_model reads model from config correctly."""
+
+    def test_returns_default_key(self):
+        from gateway.run import _resolve_gateway_model
+        assert _resolve_gateway_model({"model": {"default": "gpt-5.4"}}) == "gpt-5.4"
+
+    def test_returns_model_key_fallback(self):
+        from gateway.run import _resolve_gateway_model
+        assert _resolve_gateway_model({"model": {"model": "gpt-5.4"}}) == "gpt-5.4"
+
+    def test_returns_empty_when_missing(self):
+        from gateway.run import _resolve_gateway_model
+        assert _resolve_gateway_model({"model": {}}) == ""
+
+    def test_returns_empty_when_no_model_section(self):
+        from gateway.run import _resolve_gateway_model
+        assert _resolve_gateway_model({}) == ""
+
+    def test_string_model_config(self):
+        from gateway.run import _resolve_gateway_model
+        assert _resolve_gateway_model({"model": "my-model"}) == "my-model"
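Review note: the five `TestResolveGatewayModel` cases above fully describe the lookup order. A minimal function consistent with them is sketched here under the hypothetical name `resolve_gateway_model`; the real `gateway.run._resolve_gateway_model` may handle additional cases not covered by these tests.

```python
from typing import Any, Dict


def resolve_gateway_model(config: Dict[str, Any]) -> str:
    """Hypothetical re-implementation matching the tested behavior."""
    model_cfg = config.get("model")
    if isinstance(model_cfg, str):
        # Shorthand form: model: "my-model"
        return model_cfg
    if isinstance(model_cfg, dict):
        # Prefer "default", fall back to "model", else empty string.
        return model_cfg.get("default") or model_cfg.get("model") or ""
    # No model section at all.
    return ""
```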
@@ -6,7 +6,8 @@ from unittest.mock import patch

 import pytest

-from hermes_constants import get_default_hermes_root
+import hermes_constants
+from hermes_constants import get_default_hermes_root, is_container


 class TestGetDefaultHermesRoot:
@@ -60,3 +61,53 @@ class TestGetDefaultHermesRoot:
        monkeypatch.setattr(Path, "home", lambda: tmp_path)
        monkeypatch.setenv("HERMES_HOME", str(profile))
        assert get_default_hermes_root() == docker_root
+
+
+class TestIsContainer:
+    """Tests for is_container() — Docker/Podman detection."""
+
+    def _reset_cache(self, monkeypatch):
+        """Reset the cached detection result before each test."""
+        monkeypatch.setattr(hermes_constants, "_container_detected", None)
+
+    def test_detects_dockerenv(self, monkeypatch, tmp_path):
+        """/.dockerenv triggers container detection."""
+        self._reset_cache(monkeypatch)
+        monkeypatch.setattr(os.path, "exists", lambda p: p == "/.dockerenv")
+        assert is_container() is True
+
+    def test_detects_containerenv(self, monkeypatch, tmp_path):
+        """/run/.containerenv triggers container detection (Podman)."""
+        self._reset_cache(monkeypatch)
+        monkeypatch.setattr(os.path, "exists", lambda p: p == "/run/.containerenv")
+        assert is_container() is True
+
+    def test_detects_cgroup_docker(self, monkeypatch, tmp_path):
+        """/proc/1/cgroup containing 'docker' triggers detection."""
+        import builtins
+        self._reset_cache(monkeypatch)
+        monkeypatch.setattr(os.path, "exists", lambda p: False)
+        cgroup_file = tmp_path / "cgroup"
+        cgroup_file.write_text("12:memory:/docker/abc123\n")
+        _real_open = builtins.open
+        monkeypatch.setattr("builtins.open", lambda p, *a, **kw: _real_open(str(cgroup_file), *a, **kw) if p == "/proc/1/cgroup" else _real_open(p, *a, **kw))
+        assert is_container() is True
+
+    def test_negative_case(self, monkeypatch, tmp_path):
+        """Returns False on a regular Linux host."""
+        import builtins
+        self._reset_cache(monkeypatch)
+        monkeypatch.setattr(os.path, "exists", lambda p: False)
+        cgroup_file = tmp_path / "cgroup"
+        cgroup_file.write_text("12:memory:/\n")
+        _real_open = builtins.open
+        monkeypatch.setattr("builtins.open", lambda p, *a, **kw: _real_open(str(cgroup_file), *a, **kw) if p == "/proc/1/cgroup" else _real_open(p, *a, **kw))
+        assert is_container() is False
+
+    def test_caches_result(self, monkeypatch):
+        """Second call uses cached value without re-probing."""
+        monkeypatch.setattr(hermes_constants, "_container_detected", True)
+        assert is_container() is True
+        # Even if we make os.path.exists return False, cached value wins
+        monkeypatch.setattr(os.path, "exists", lambda p: False)
+        assert is_container() is True
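The tests above pin down the observable behavior of `is_container()`: two marker files, a `/proc/1/cgroup` scan, and a module-level cache named `_container_detected`. A minimal sketch consistent with those tests might look like the following. It is not the actual `hermes_constants` code: the marker list beyond `"docker"` and the exact error handling are assumptions made for illustration.

```python
import os
from typing import Optional

# Cache for the probe result; None means "not probed yet".
# The name matches the attribute the tests reset via monkeypatch.
_container_detected: Optional[bool] = None


def is_container() -> bool:
    """Best-effort Docker/Podman detection, cached after the first probe."""
    global _container_detected
    if _container_detected is not None:
        return _container_detected

    detected = False
    # Marker files created by Docker and Podman respectively.
    if os.path.exists("/.dockerenv") or os.path.exists("/run/.containerenv"):
        detected = True
    else:
        try:
            with open("/proc/1/cgroup", "r") as fh:
                cgroup = fh.read()
            # Assumption: the tests only exercise "docker"; "podman" is
            # included here as a plausible second marker.
            detected = any(marker in cgroup for marker in ("docker", "podman"))
        except OSError:
            # File missing or unreadable (e.g. non-Linux host): not a container.
            detected = False

    _container_detected = detected
    return detected
```

Because the result is cached at module level, tests have to reset `_container_detected` to `None` before each probe, which is what `_reset_cache` does above.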
@@ -106,8 +106,9 @@ def detect_audio_environment() -> dict:
    if any(os.environ.get(v) for v in ('SSH_CLIENT', 'SSH_TTY', 'SSH_CONNECTION')):
        warnings.append("Running over SSH -- no audio devices available")

-    # Docker detection
-    if os.path.exists('/.dockerenv'):
+    # Docker/Podman container detection
+    from hermes_constants import is_container
+    if is_container():
        warnings.append("Running inside Docker container -- no audio devices")

    # WSL detection — PulseAudio bridge makes audio work in WSL.
@@ -165,6 +165,15 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/fb/76/641ae371508676492379f16e2fa48f4e2c11741bd63c48be4b12a6b09cba/aiosignal-1.4.0-py3-none-any.whl", hash = "sha256:053243f8b92b990551949e63930a839ff0cf0b0ebbe0597b0f3fb19e1a0fe82e", size = 7490, upload-time = "2025-07-03T22:54:42.156Z" },
 ]

+[[package]]
+name = "aiosqlite"
+version = "0.22.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/4e/8a/64761f4005f17809769d23e518d915db74e6310474e733e3593cfc854ef1/aiosqlite-0.22.1.tar.gz", hash = "sha256:043e0bd78d32888c0a9ca90fc788b38796843360c855a7262a532813133a0650", size = 14821, upload-time = "2025-12-23T19:25:43.997Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/00/b7/e3bf5133d697a08128598c8d0abc5e16377b51465a33756de24fa7dee953/aiosqlite-0.22.1-py3-none-any.whl", hash = "sha256:21c002eb13823fad740196c5a2e9d8e62f6243bd9e7e4a1f87fb5e44ecb4fceb", size = 17405, upload-time = "2025-12-23T19:25:42.139Z" },
+]
+
 [[package]]
 name = "altair"
 version = "6.0.0"
@@ -240,6 +249,54 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/38/0e/27be9fdef66e72d64c0cdc3cc2823101b80585f8119b5c112c2e8f5f7dab/anyio-4.12.1-py3-none-any.whl", hash = "sha256:d405828884fc140aa80a3c667b8beed277f1dfedec42ba031bd6ac3db606ab6c", size = 113592, upload-time = "2026-01-06T11:45:19.497Z" },
 ]

+[[package]]
+name = "asyncpg"
+version = "0.31.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/fe/cc/d18065ce2380d80b1bcce927c24a2642efd38918e33fd724bc4bca904877/asyncpg-0.31.0.tar.gz", hash = "sha256:c989386c83940bfbd787180f2b1519415e2d3d6277a70d9d0f0145ac73500735", size = 993667, upload-time = "2025-11-24T23:27:00.812Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/08/17/cc02bc49bc350623d050fa139e34ea512cd6e020562f2a7312a7bcae4bc9/asyncpg-0.31.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:eee690960e8ab85063ba93af2ce128c0f52fd655fdff9fdb1a28df01329f031d", size = 643159, upload-time = "2025-11-24T23:25:36.443Z" },
+    { url = "https://files.pythonhosted.org/packages/a4/62/4ded7d400a7b651adf06f49ea8f73100cca07c6df012119594d1e3447aa6/asyncpg-0.31.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2657204552b75f8288de08ca60faf4a99a65deef3a71d1467454123205a88fab", size = 638157, upload-time = "2025-11-24T23:25:37.89Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/5b/4179538a9a72166a0bf60ad783b1ef16efb7960e4d7b9afe9f77a5551680/asyncpg-0.31.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a429e842a3a4b4ea240ea52d7fe3f82d5149853249306f7ff166cb9948faa46c", size = 2918051, upload-time = "2025-11-24T23:25:39.461Z" },
+    { url = "https://files.pythonhosted.org/packages/e6/35/c27719ae0536c5b6e61e4701391ffe435ef59539e9360959240d6e47c8c8/asyncpg-0.31.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c0807be46c32c963ae40d329b3a686356e417f674c976c07fa49f1b30303f109", size = 2972640, upload-time = "2025-11-24T23:25:41.512Z" },
+    { url = "https://files.pythonhosted.org/packages/43/f4/01ebb9207f29e645a64699b9ce0eefeff8e7a33494e1d29bb53736f7766b/asyncpg-0.31.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:e5d5098f63beeae93512ee513d4c0c53dc12e9aa2b7a1af5a81cddf93fe4e4da", size = 2851050, upload-time = "2025-11-24T23:25:43.153Z" },
+    { url = "https://files.pythonhosted.org/packages/3e/f4/03ff1426acc87be0f4e8d40fa2bff5c3952bef0080062af9efc2212e3be8/asyncpg-0.31.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:37fc6c00a814e18eef51833545d1891cac9aa69140598bb076b4cd29b3e010b9", size = 2962574, upload-time = "2025-11-24T23:25:44.942Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/39/cc788dfca3d4060f9d93e67be396ceec458dfc429e26139059e58c2c244d/asyncpg-0.31.0-cp311-cp311-win32.whl", hash = "sha256:5a4af56edf82a701aece93190cc4e094d2df7d33f6e915c222fb09efbb5afc24", size = 521076, upload-time = "2025-11-24T23:25:46.486Z" },
+    { url = "https://files.pythonhosted.org/packages/28/fc/735af5384c029eb7f1ca60ccb8fa95521dbdaeef788edf4cecfc604c3cab/asyncpg-0.31.0-cp311-cp311-win_amd64.whl", hash = "sha256:480c4befbdf079c14c9ca43c8c5e1fe8b6296c96f1f927158d4f1e750aacc047", size = 584980, upload-time = "2025-11-24T23:25:47.938Z" },
+    { url = "https://files.pythonhosted.org/packages/2a/a6/59d0a146e61d20e18db7396583242e32e0f120693b67a8de43f1557033e2/asyncpg-0.31.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b44c31e1efc1c15188ef183f287c728e2046abb1d26af4d20858215d50d91fad", size = 662042, upload-time = "2025-11-24T23:25:49.578Z" },
+    { url = "https://files.pythonhosted.org/packages/36/01/ffaa189dcb63a2471720615e60185c3f6327716fdc0fc04334436fbb7c65/asyncpg-0.31.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0c89ccf741c067614c9b5fc7f1fc6f3b61ab05ae4aaa966e6fd6b93097c7d20d", size = 638504, upload-time = "2025-11-24T23:25:51.501Z" },
+    { url = "https://files.pythonhosted.org/packages/9f/62/3f699ba45d8bd24c5d65392190d19656d74ff0185f42e19d0bbd973bb371/asyncpg-0.31.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:12b3b2e39dc5470abd5e98c8d3373e4b1d1234d9fbdedf538798b2c13c64460a", size = 3426241, upload-time = "2025-11-24T23:25:53.278Z" },
+    { url = "https://files.pythonhosted.org/packages/8c/d1/a867c2150f9c6e7af6462637f613ba67f78a314b00db220cd26ff559d532/asyncpg-0.31.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:aad7a33913fb8bcb5454313377cc330fbb19a0cd5faa7272407d8a0c4257b671", size = 3520321, upload-time = "2025-11-24T23:25:54.982Z" },
+    { url = "https://files.pythonhosted.org/packages/7a/1a/cce4c3f246805ecd285a3591222a2611141f1669d002163abef999b60f98/asyncpg-0.31.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3df118d94f46d85b2e434fd62c84cb66d5834d5a890725fe625f498e72e4d5ec", size = 3316685, upload-time = "2025-11-24T23:25:57.43Z" },
+    { url = "https://files.pythonhosted.org/packages/40/ae/0fc961179e78cc579e138fad6eb580448ecae64908f95b8cb8ee2f241f67/asyncpg-0.31.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:bd5b6efff3c17c3202d4b37189969acf8927438a238c6257f66be3c426beba20", size = 3471858, upload-time = "2025-11-24T23:25:59.636Z" },
+    { url = "https://files.pythonhosted.org/packages/52/b2/b20e09670be031afa4cbfabd645caece7f85ec62d69c312239de568e058e/asyncpg-0.31.0-cp312-cp312-win32.whl", hash = "sha256:027eaa61361ec735926566f995d959ade4796f6a49d3bde17e5134b9964f9ba8", size = 527852, upload-time = "2025-11-24T23:26:01.084Z" },
+    { url = "https://files.pythonhosted.org/packages/b5/f0/f2ed1de154e15b107dc692262395b3c17fc34eafe2a78fc2115931561730/asyncpg-0.31.0-cp312-cp312-win_amd64.whl", hash = "sha256:72d6bdcbc93d608a1158f17932de2321f68b1a967a13e014998db87a72ed3186", size = 597175, upload-time = "2025-11-24T23:26:02.564Z" },
+    { url = "https://files.pythonhosted.org/packages/95/11/97b5c2af72a5d0b9bc3fa30cd4b9ce22284a9a943a150fdc768763caf035/asyncpg-0.31.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:c204fab1b91e08b0f47e90a75d1b3c62174dab21f670ad6c5d0f243a228f015b", size = 661111, upload-time = "2025-11-24T23:26:04.467Z" },
+    { url = "https://files.pythonhosted.org/packages/1b/71/157d611c791a5e2d0423f09f027bd499935f0906e0c2a416ce712ba51ef3/asyncpg-0.31.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:54a64f91839ba59008eccf7aad2e93d6e3de688d796f35803235ea1c4898ae1e", size = 636928, upload-time = "2025-11-24T23:26:05.944Z" },
+    { url = "https://files.pythonhosted.org/packages/2e/fc/9e3486fb2bbe69d4a867c0b76d68542650a7ff1574ca40e84c3111bb0c6e/asyncpg-0.31.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c0e0822b1038dc7253b337b0f3f676cadc4ac31b126c5d42691c39691962e403", size = 3424067, upload-time = "2025-11-24T23:26:07.957Z" },
+    { url = "https://files.pythonhosted.org/packages/12/c6/8c9d076f73f07f995013c791e018a1cd5f31823c2a3187fc8581706aa00f/asyncpg-0.31.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bef056aa502ee34204c161c72ca1f3c274917596877f825968368b2c33f585f4", size = 3518156, upload-time = "2025-11-24T23:26:09.591Z" },
+    { url = "https://files.pythonhosted.org/packages/ae/3b/60683a0baf50fbc546499cfb53132cb6835b92b529a05f6a81471ab60d0c/asyncpg-0.31.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:0bfbcc5b7ffcd9b75ab1558f00db2ae07db9c80637ad1b2469c43df79d7a5ae2", size = 3319636, upload-time = "2025-11-24T23:26:11.168Z" },
+    { url = "https://files.pythonhosted.org/packages/50/dc/8487df0f69bd398a61e1792b3cba0e47477f214eff085ba0efa7eac9ce87/asyncpg-0.31.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:22bc525ebbdc24d1261ecbf6f504998244d4e3be1721784b5f64664d61fbe602", size = 3472079, upload-time = "2025-11-24T23:26:13.164Z" },
+    { url = "https://files.pythonhosted.org/packages/13/a1/c5bbeeb8531c05c89135cb8b28575ac2fac618bcb60119ee9696c3faf71c/asyncpg-0.31.0-cp313-cp313-win32.whl", hash = "sha256:f890de5e1e4f7e14023619399a471ce4b71f5418cd67a51853b9910fdfa73696", size = 527606, upload-time = "2025-11-24T23:26:14.78Z" },
+    { url = "https://files.pythonhosted.org/packages/91/66/b25ccb84a246b470eb943b0107c07edcae51804912b824054b3413995a10/asyncpg-0.31.0-cp313-cp313-win_amd64.whl", hash = "sha256:dc5f2fa9916f292e5c5c8b2ac2813763bcd7f58e130055b4ad8a0531314201ab", size = 596569, upload-time = "2025-11-24T23:26:16.189Z" },
+    { url = "https://files.pythonhosted.org/packages/3c/36/e9450d62e84a13aea6580c83a47a437f26c7ca6fa0f0fd40b6670793ea30/asyncpg-0.31.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:f6b56b91bb0ffc328c4e3ed113136cddd9deefdf5f79ab448598b9772831df44", size = 660867, upload-time = "2025-11-24T23:26:17.631Z" },
+    { url = "https://files.pythonhosted.org/packages/82/4b/1d0a2b33b3102d210439338e1beea616a6122267c0df459ff0265cd5807a/asyncpg-0.31.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:334dec28cf20d7f5bb9e45b39546ddf247f8042a690bff9b9573d00086e69cb5", size = 638349, upload-time = "2025-11-24T23:26:19.689Z" },
+    { url = "https://files.pythonhosted.org/packages/41/aa/e7f7ac9a7974f08eff9183e392b2d62516f90412686532d27e196c0f0eeb/asyncpg-0.31.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:98cc158c53f46de7bb677fd20c417e264fc02b36d901cc2a43bd6cb0dc6dbfd2", size = 3410428, upload-time = "2025-11-24T23:26:21.275Z" },
+    { url = "https://files.pythonhosted.org/packages/6f/de/bf1b60de3dede5c2731e6788617a512bc0ebd9693eac297ee74086f101d7/asyncpg-0.31.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9322b563e2661a52e3cdbc93eed3be7748b289f792e0011cb2720d278b366ce2", size = 3471678, upload-time = "2025-11-24T23:26:23.627Z" },
+    { url = "https://files.pythonhosted.org/packages/46/78/fc3ade003e22d8bd53aaf8f75f4be48f0b460fa73738f0391b9c856a9147/asyncpg-0.31.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:19857a358fc811d82227449b7ca40afb46e75b33eb8897240c3839dd8b744218", size = 3313505, upload-time = "2025-11-24T23:26:25.235Z" },
+    { url = "https://files.pythonhosted.org/packages/bf/e9/73eb8a6789e927816f4705291be21f2225687bfa97321e40cd23055e903a/asyncpg-0.31.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:ba5f8886e850882ff2c2ace5732300e99193823e8107e2c53ef01c1ebfa1e85d", size = 3434744, upload-time = "2025-11-24T23:26:26.944Z" },
+    { url = "https://files.pythonhosted.org/packages/08/4b/f10b880534413c65c5b5862f79b8e81553a8f364e5238832ad4c0af71b7f/asyncpg-0.31.0-cp314-cp314-win32.whl", hash = "sha256:cea3a0b2a14f95834cee29432e4ddc399b95700eb1d51bbc5bfee8f31fa07b2b", size = 532251, upload-time = "2025-11-24T23:26:28.404Z" },
+    { url = "https://files.pythonhosted.org/packages/d3/2d/7aa40750b7a19efa5d66e67fc06008ca0f27ba1bd082e457ad82f59aba49/asyncpg-0.31.0-cp314-cp314-win_amd64.whl", hash = "sha256:04d19392716af6b029411a0264d92093b6e5e8285ae97a39957b9a9c14ea72be", size = 604901, upload-time = "2025-11-24T23:26:30.34Z" },
+    { url = "https://files.pythonhosted.org/packages/ce/fe/b9dfe349b83b9dee28cc42360d2c86b2cdce4cb551a2c2d27e156bcac84d/asyncpg-0.31.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:bdb957706da132e982cc6856bb2f7b740603472b54c3ebc77fe60ea3e57e1bd2", size = 702280, upload-time = "2025-11-24T23:26:32Z" },
+    { url = "https://files.pythonhosted.org/packages/6a/81/e6be6e37e560bd91e6c23ea8a6138a04fd057b08cf63d3c5055c98e81c1d/asyncpg-0.31.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:6d11b198111a72f47154fa03b85799f9be63701e068b43f84ac25da0bda9cb31", size = 682931, upload-time = "2025-11-24T23:26:33.572Z" },
+    { url = "https://files.pythonhosted.org/packages/a6/45/6009040da85a1648dd5bc75b3b0a062081c483e75a1a29041ae63a0bf0dc/asyncpg-0.31.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:18c83b03bc0d1b23e6230f5bf8d4f217dc9bc08644ce0502a9d91dc9e634a9c7", size = 3581608, upload-time = "2025-11-24T23:26:35.638Z" },
+    { url = "https://files.pythonhosted.org/packages/7e/06/2e3d4d7608b0b2b3adbee0d0bd6a2d29ca0fc4d8a78f8277df04e2d1fd7b/asyncpg-0.31.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e009abc333464ff18b8f6fd146addffd9aaf63e79aa3bb40ab7a4c332d0c5e9e", size = 3498738, upload-time = "2025-11-24T23:26:37.275Z" },
+    { url = "https://files.pythonhosted.org/packages/7d/aa/7d75ede780033141c51d83577ea23236ba7d3a23593929b32b49db8ed36e/asyncpg-0.31.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:3b1fbcb0e396a5ca435a8826a87e5c2c2cc0c8c68eb6fadf82168056b0e53a8c", size = 3401026, upload-time = "2025-11-24T23:26:39.423Z" },
+    { url = "https://files.pythonhosted.org/packages/ba/7a/15e37d45e7f7c94facc1e9148c0e455e8f33c08f0b8a0b1deb2c5171771b/asyncpg-0.31.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:8df714dba348efcc162d2adf02d213e5fab1bd9f557e1305633e851a61814a7a", size = 3429426, upload-time = "2025-11-24T23:26:41.032Z" },
+    { url = "https://files.pythonhosted.org/packages/13/d5/71437c5f6ae5f307828710efbe62163974e71237d5d46ebd2869ea052d10/asyncpg-0.31.0-cp314-cp314t-win32.whl", hash = "sha256:1b41f1afb1033f2b44f3234993b15096ddc9cd71b21a42dbd87fc6a57b43d65d", size = 614495, upload-time = "2025-11-24T23:26:42.659Z" },
+    { url = "https://files.pythonhosted.org/packages/3c/d7/8fb3044eaef08a310acfe23dae9a8e2e07d305edc29a53497e52bc76eca7/asyncpg-0.31.0-cp314-cp314t-win_amd64.whl", hash = "sha256:bd4107bb7cdd0e9e65fae66a62afd3a249663b844fa34d479f6d5b3bef9c04c3", size = 706062, upload-time = "2025-11-24T23:26:44.086Z" },
+]
+
 [[package]]
 name = "atroposlib"
 version = "0.4.0"
@@ -1672,6 +1729,8 @@ acp = [
 all = [
    { name = "agent-client-protocol" },
    { name = "aiohttp" },
+    { name = "aiosqlite", marker = "sys_platform == 'linux'" },
+    { name = "asyncpg", marker = "sys_platform == 'linux'" },
    { name = "croniter" },
    { name = "daytona" },
    { name = "debugpy" },
@@ -1727,6 +1786,8 @@ honcho = [
    { name = "honcho-ai" },
 ]
 matrix = [
+    { name = "aiosqlite" },
+    { name = "asyncpg" },
    { name = "markdown" },
    { name = "mautrix", extra = ["encryption"] },
 ]
@@ -1791,7 +1852,9 @@ requires-dist = [
    { name = "aiohttp", marker = "extra == 'homeassistant'", specifier = ">=3.9.0,<4" },
    { name = "aiohttp", marker = "extra == 'messaging'", specifier = ">=3.13.3,<4" },
    { name = "aiohttp", marker = "extra == 'sms'", specifier = ">=3.9.0,<4" },
+    { name = "aiosqlite", marker = "extra == 'matrix'", specifier = ">=0.20" },
    { name = "anthropic", specifier = ">=0.39.0,<1" },
+    { name = "asyncpg", marker = "extra == 'matrix'", specifier = ">=0.29" },
    { name = "atroposlib", marker = "extra == 'rl'", git = "https://github.com/NousResearch/atropos.git" },
    { name = "croniter", marker = "extra == 'cron'", specifier = ">=6.0.0,<7" },
    { name = "daytona", marker = "extra == 'daytona'", specifier = ">=0.148.0,<1" },
@@ -277,6 +277,7 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
 | `MATRIX_FREE_RESPONSE_ROOMS` | Comma-separated room IDs where bot responds without `@mention` |
 | `MATRIX_AUTO_THREAD` | Auto-create threads for room messages (default: `true`) |
 | `MATRIX_DM_MENTION_THREADS` | Create a thread when bot is `@mentioned` in a DM (default: `false`) |
+| `MATRIX_RECOVERY_KEY` | Recovery key for cross-signing verification after device key rotation. Recommended for E2EE setups with cross-signing enabled. |
 | `HASS_TOKEN` | Home Assistant Long-Lived Access Token (enables HA platform + tools) |
 | `HASS_URL` | Home Assistant URL (default: `http://homeassistant.local:8123`) |
 | `WEBHOOK_ENABLED` | Enable the webhook platform adapter (`true`/`false`) |
@@ -272,6 +272,18 @@ When E2EE is enabled, Hermes:
 - Decrypts incoming messages and encrypts outgoing messages automatically
 - Auto-joins encrypted rooms when invited

+### Cross-Signing Verification (Recommended)
+
+If your Matrix account has cross-signing enabled (the default in Element), set the recovery key so the bot can self-sign its device on startup. Without this, other Matrix clients may refuse to share encryption sessions with the bot after a device key rotation.
+
+```bash
+MATRIX_RECOVERY_KEY=EsT... your recovery key here
+```
+
+**Where to find it:** In Element, go to **Settings** → **Security & Privacy** → **Encryption** → your recovery key (also called the "Security Key"). This is the key you were asked to save when you first set up cross-signing.
+
+On each startup, if `MATRIX_RECOVERY_KEY` is set, Hermes imports cross-signing keys from the homeserver's secure secret storage and signs the current device. This is idempotent and safe to leave enabled permanently.
+
 :::warning
 If you delete the `~/.hermes/platforms/matrix/store/` directory, the bot loses its encryption keys. You'll need to verify the device again in your Matrix client. Back up this directory if you want to preserve encrypted sessions.
 :::
@@ -374,7 +386,7 @@ changed identity keys for the same device as suspicious.
     -d '{
       "type": "m.login.password",
       "identifier": {"type": "m.id.user", "user": "@hermes:your-server.org"},
-       "password": "your-password",
+       "password": "***",
       "initial_device_display_name": "Hermes Agent"
     }'
   ```
@@ -388,17 +400,27 @@ changed identity keys for the same device as suspicious.
   rm -f ~/.hermes/platforms/matrix/store/crypto_store.*
   ```

-3. **Force your Matrix client to rotate the encryption session**. In Element,
+3. **Set your recovery key** (if you use cross-signing — most Element users do). Add to `~/.hermes/.env`:
+
+   ```bash
+   MATRIX_RECOVERY_KEY=EsT... your recovery key here
+   ```
+
+   This lets the bot self-sign with cross-signing keys on startup, so Element trusts the new device immediately. Without this, Element may see the new device as unverified and refuse to share encryption sessions. Find your recovery key in Element under **Settings** → **Security & Privacy** → **Encryption**.
+
+4. **Force your Matrix client to rotate the encryption session**. In Element,
   open the DM room with the bot and type `/discardsession`. This forces Element
   to create a new encryption session and share it with the bot's new device.

-4. **Restart the gateway**:
+5. **Restart the gateway**:

   ```bash
   hermes gateway run
   ```

-5. **Send a new message**. The bot should decrypt and respond normally.
+   If `MATRIX_RECOVERY_KEY` is set, you should see `Matrix: cross-signing verified via recovery key` in the logs.
+
+6. **Send a new message**. The bot should decrypt and respond normally.

 :::note
 After migration, messages sent *before* the upgrade cannot be decrypted -- the old
Author	SHA1	Message	Date
alt-glitch	c71b09be77	Move container detection to hermes_constants Remove `_is_inside_container()` from `hermes_cli/config.py` and migrate callers to use `is_container()` from `hermes_constants`. This centralizes container environment detection in a single, reusable location.	2026-04-12 13:08:23 -07:00
alt-glitch	2f2eeffb96	Add container detection utility to hermes_constants Extract `is_container()` detection logic from scattered locations (`config.py`, `voice_mode.py`) into a centralized, cached function in `hermes_constants.py`. This follows the same pattern as `is_wsl()` and `is_termux()` — checking `/.dockerenv`, `/run/.containerenv`, and cgroup markers. Update gateway status detection (`status.py`, `dump.py`) to use the new utility and handle Docker/Podman differently from systemd-based systems. Update setup guidance (`setup.py`) to show Docker restart instructions when running in a container. Add Dockerfile.test for CI integration testing and spec.md as a Python module taste guide for contributors.	2026-04-12 13:08:15 -07:00
Dilee	e89b9d9732	fix(gateway): handle Linux setups without systemctl	2026-04-12 11:58:02 -07:00
Teknium	4eecaf06e4	fix: prevent duplicate update prompt spam in gateway watcher (#8343) The _watch_update_progress() poll loop never deleted .update_prompt.json after forwarding the prompt to the user, causing the same prompt to be re-sent every poll cycle (2s). Two fixes: 1. Delete .update_prompt.json after forwarding — the update process only polls for .update_response, it doesn't need the prompt file to persist. 2. Guard re-sends with _update_prompt_pending check — belt-and-suspenders to prevent duplicates even under race conditions. Add regression test asserting the prompt is sent exactly once.	2026-04-12 04:52:59 -07:00
Teknium	7a67b13506	fix: title_generator no longer logs as 'compression' task Changed task='compression' to task='title_generation' so auto-title calls don't pollute logs with false compression alarms.	2026-04-12 04:17:18 -07:00
Teknium	45e60904c6	fix: fall back to provider's default model when model config is empty (#8303) When a user configures a provider (e.g. `hermes auth add openai-codex`) but never selects a model via `hermes model`, the gateway and CLI would pass an empty model string to the API, causing: 'Codex Responses request model must be a non-empty string' Now both gateway (_resolve_session_agent_runtime) and CLI (_ensure_runtime_credentials) detect an empty model and fill it from the provider's first catalog entry in _PROVIDER_MODELS. This covers all providers that have a static model list (openai-codex, anthropic, gemini, copilot, etc.). The fix is conservative: it only triggers when model is truly empty and a known provider was resolved. Explicit model choices are never overridden.	2026-04-12 03:53:30 -07:00
Teknium	17c72f176d	fix: make skill loading instructions more aggressive in system prompt (#8286) The previous wording ('If one clearly matches') set too high a threshold, and 'If none match, proceed normally' was an easy escape hatch for lazy models. Now: - Lowered threshold: 'matches or is even partially relevant' - Added MUST directive and 'err on the side of loading' guidance - Replaced permissive closer with 'only proceed without if genuinely none are relevant' This should reduce cases where the agent skips loading relevant skills unless explicitly forced.	2026-04-12 03:03:16 -07:00
Teknium	b6b6b02f0f	fix: prevent unwanted session auto-reset after graceful gateway restarts (#8299) When the gateway shuts down gracefully (hermes update, gateway restart, /restart), it now writes a .clean_shutdown marker file. On the next startup, if this marker exists, suspend_recently_active() is skipped and the marker is cleaned up. Previously, suspend_recently_active() fired on EVERY startup — including planned restarts from hermes update or hermes gateway restart. This caused users to lose their conversation history unexpectedly: the session would be marked as suspended, and the next message would trigger an auto-reset with a notification the user never asked for. The original purpose of suspend_recently_active() is crash recovery — preventing stuck sessions that were mid-processing when the gateway died unexpectedly. Graceful shutdowns already drain active agents via _drain_active_agents(), so there is no stuck-session risk. After a crash (no marker written), suspension still fires as before. Fixes the scenario where a user asks the agent to run hermes update, the gateway restarts, and the user's next message gets an unwanted 'Session automatically reset' notification with their history cleared.	2026-04-12 03:03:07 -07:00
Teknium	56e3ee2440	fix: write update exit code before gateway restart (cgroup kill race) (#8288) When /update runs via Telegram, hermes update --gateway is spawned inside the gateway's systemd cgroup. The update process itself calls systemctl restart hermes-gateway, which tears down the cgroup with KillMode=mixed — SIGKILL to all remaining processes. The wrapping bash shell is killed before it can execute the exit-code epilogue, so .update_exit_code is never created. The new gateway's update watcher then polls for 30 minutes and sends a spurious timeout message. Fix: write .update_exit_code from Python inside cmd_update() immediately after the git pull + pip install succeed ("Update complete!"), before attempting the gateway restart. The shell epilogue still writes it too (idempotent overwrite), but now the marker exists even when the process is killed mid-restart.	2026-04-12 02:33:21 -07:00
Teknium	b321330362	feat: add WSL environment hint to system prompt (#8285) When running inside WSL (Windows Subsystem for Linux), inject a hint into the system prompt explaining that the Windows host filesystem is mounted at /mnt/c/, /mnt/d/, etc. This lets the agent naturally translate Windows paths (Desktop, Documents) to their /mnt/ equivalents without the user needing to configure anything. Uses the existing is_wsl() detection from hermes_constants (cached, checks /proc/version for 'microsoft'). Adds build_environment_hints() in prompt_builder.py — extensible for Termux, Docker, etc. later. Closes the UX gap where WSL users had to manually explain path translation to the agent every session.	2026-04-12 02:26:28 -07:00
Teknium	dd5b1063d0	fix: register MATRIX_RECOVERY_KEY env var + document migration path Follow-up for cherry-picked PR #8272: - Add MATRIX_RECOVERY_KEY to module docstring header in matrix.py - Register in OPTIONAL_ENV_VARS (config.py) with password=True, advanced=True - Add to _NON_SETUP_ENV_VARS set - Document cross-signing verification in matrix.md E2EE section - Update migration guide with recovery key step (step 3) - Add to environment-variables.md reference	2026-04-12 02:18:03 -07:00
elkimek	b9af4955b9	fix(matrix): restore verify_with_recovery_key after device key rotation After the PgCryptoStore migration in v0.8.0, the verify_with_recovery_key call that previously ran after share_keys() was dropped. On any rotation that uploads fresh device keys (fresh crypto.db, server had stale keys from a prior install, etc.), the new device keys carry no valid self- signing signature because the bot has no access to the self-signing private key. Peers like Element then refuse to share Megolm sessions with the rotated device, so the bot silently stops decrypting incoming messages. This restores the recovery-key bootstrap: on startup, if MATRIX_RECOVERY_KEY is set, import the cross-signing private keys from SSSS and sign_own_device(), producing a valid signature server-side. Idempotent and gated on MATRIX_RECOVERY_KEY — no behavior change for users who don't configure a recovery key. Verified end-to-end by deleting crypto.db and restarting: the bot rotates device identity keys, re-uploads, self-signs via recovery key, and decrypts+replies to fresh messages from a paired Element client.	2026-04-12 02:18:03 -07:00
Ben Barclay	b0d65c333a	Merge pull request #8279 from NousResearch/chore/simplify-docker-tags chore: simplify Docker image tags	2026-04-12 19:09:05 +10:00
Ben	00adbd0de0	chore: simplify Docker image tags - Main branch push: only push :latest (remove SHA tag) - Release push: only push release tag name (remove :latest and SHA tag)	2026-04-12 19:08:16 +10:00
Teknium	95fa78eb6c	fix: write refreshed Codex tokens back to ~/.codex/auth.json (#8277) OpenAI OAuth refresh tokens are single-use and rotate on every refresh. When Hermes refreshes a Codex token, it consumed the old refresh_token but never wrote the new pair back to ~/.codex/auth.json. This caused Codex CLI and VS Code to fail with 'refresh_token_reused' on their next refresh attempt. This mirrors the existing Anthropic write-back pattern where refreshed tokens are written to ~/.claude/.credentials.json via _write_claude_code_credentials(). Changes: - Add _write_codex_cli_tokens() in hermes_cli/auth.py (parallel to _write_claude_code_credentials in anthropic_adapter.py) - Call it from _refresh_codex_auth_tokens() (non-pool refresh path) - Call it from credential_pool._refresh_entry() (pool happy path + retry) - Add tests for the new write-back behavior - Update existing test docstring to clarify _save_codex_tokens vs _write_codex_cli_tokens separation Fixes refresh token conflict reported by @ec12edfae2cb221	2026-04-12 02:05:20 -07:00
Teknium	6d05e3d56f	fix(gateway): evict cached agent on /model switch + add diagnostic logging (#8276) After /model switches the model (both picker and text paths), the cached agent's config signature becomes stale — the agent was updated in-place via switch_model() but the cache tuple's signature was never refreshed. The next turn should detect the signature mismatch and create a fresh agent, but this relies on the new model's signature differing from the old one in _agent_config_signature(). Evicting the cached agent explicitly after storing the session override is more defensive — the next turn is guaranteed to create a fresh agent from the override without depending on signature mismatch detection. Also adds debug logging at three key decision points so we can trace exactly what happens when /model + /retry interact: - _resolve_session_agent_runtime: which override path is taken (fast with api_key vs fallback), or why no override was found - _run_agent.run_sync: final resolved model/provider before agent creation Reported: /model switch to xiaomi/mimo-v2-pro followed by /retry still used the old model (glm-5.1).	2026-04-12 01:58:17 -07:00
Teknium	4aa534eae5	fix(gateway): peek at pending message during interrupt instead of consuming it The monitor_for_interrupt() and backup interrupt checks were calling get_pending_message() which pops the message from the adapter's queue. This created a race condition: if the agent finished naturally before checking _interrupt_requested, the pending message was permanently lost. Timeline of the race: 1. Agent near completion, user sends message 2. Level 1 guard stores message in adapter._pending_messages, sets event 3. monitor_for_interrupt() detects event, POPS message, calls agent.interrupt() 4. Agent's run_conversation() was already returning (interrupted=False) 5. Post-run dequeue finds nothing (monitor already consumed it) 6. result.get('interrupted') is False so interrupt_message fallback doesn't fire 7. User message permanently lost — agent finishes without processing it Fix: change all three interrupt detection sites (primary monitor + two backup checks) from get_pending_message() (pop) to _pending_messages.get() (peek). The message stays in the adapter's queue until _dequeue_pending_event() consumes it in the post-run handler, which runs regardless of whether the agent was interrupted or finished naturally. Reported by @_SushantSays — intermittent message loss during long terminal command execution, persisting after the previous fix (`73f970fa`) which addressed monitor task death but not this consumption race.	2026-04-12 01:57:34 -07:00
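The consumption race described in the commit above can be illustrated with a minimal sketch. The `Adapter` class, `simulate`, and the session key are hypothetical; only the names `get_pending_message` and `_pending_messages` come from the commit message, and their real signatures in the gateway are assumptions here.

```python
from typing import Callable, Dict, Optional


class Adapter:
    """Hypothetical stand-in for a gateway adapter's pending-message store."""

    def __init__(self) -> None:
        self._pending_messages: Dict[str, str] = {}

    def get_pending_message(self, session_key: str) -> Optional[str]:
        # Pop semantics: the message is removed, so a later reader finds nothing.
        return self._pending_messages.pop(session_key, None)


def monitor_pop(adapter: Adapter, session_key: str) -> bool:
    """Old behaviour: detecting the interrupt also consumes the message."""
    return adapter.get_pending_message(session_key) is not None


def monitor_peek(adapter: Adapter, session_key: str) -> bool:
    """Fixed behaviour: detecting the interrupt leaves the message queued."""
    return adapter._pending_messages.get(session_key) is not None


def post_run_dequeue(adapter: Adapter, session_key: str) -> Optional[str]:
    # Stand-in for the post-run handler, which runs whether or not
    # the agent was actually interrupted.
    return adapter.get_pending_message(session_key)


def simulate(monitor: Callable[[Adapter, str], bool]) -> Optional[str]:
    """Replay the race: the monitor fires, but the agent finishes naturally."""
    adapter = Adapter()
    adapter._pending_messages["s1"] = "user follow-up"
    monitor(adapter, "s1")  # monitor notices the pending message
    # The agent returns with interrupted=False, so only the post-run
    # dequeue can still deliver the message.
    return post_run_dequeue(adapter, "s1")
```

With `monitor_pop` the post-run dequeue finds nothing and the message is lost; with `monitor_peek` the same sequence still delivers it.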
Teknium	ae6820a45a	fix(setup): validate base URL input in hermes model flow (#8264) Reject non-URL values (e.g. shell commands typed by mistake) in the base URL prompt during provider setup. Previously any string was saved as-is to .env, breaking connectivity when the garbage value was used as the API endpoint. Adds http:// / https:// prefix check with a clear error message. The custom-endpoint flow already had this validation (line 1620); this brings the generic API-key provider flow to parity. Triggered by a user support case where 'nano ~/.hermes/.env' was accidentally entered as GLM_BASE_URL during Z.AI setup.	2026-04-12 01:51:57 -07:00
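A prefix check of the kind this commit describes might look like the sketch below. The function name `is_valid_base_url` is hypothetical, and trimming whitespace and ignoring case are assumptions rather than details confirmed by the commit.

```python
def is_valid_base_url(value: str) -> bool:
    """Return True if the input looks like an http(s) URL.

    Hypothetical helper: only the http:// / https:// prefix rule comes
    from the commit message; stripping and lowercasing are assumptions.
    """
    candidate = value.strip().lower()
    return candidate.startswith(("http://", "https://"))
```

A caller would re-prompt or print an error instead of writing the value to `.env` when this returns `False`, which is what rejects an input such as `nano ~/.hermes/.env`.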
Teknium	a1220977d3	fix: make skill loading instructions more aggressive in system prompt (#8209) The previous wording ('If one clearly matches') set too high a threshold, and 'If none match, proceed normally' was an easy escape hatch for lazy models. Now: - Lowered threshold: 'matches or is even partially relevant' - Added MUST directive and 'err on the side of loading' guidance - Replaced permissive closer with 'only proceed without if genuinely none are relevant' This should reduce cases where the agent skips loading relevant skills unless explicitly forced.	2026-04-12 01:46:34 -07:00
Teknium	078dba015d	fix: three provider-related bugs (#8161, #8181, #8147) (#8243) - Add openai/openai-codex -> openai mapping to PROVIDER_TO_MODELS_DEV so context-length lookups use models.dev data instead of 128k fallback. Fixes #8161. - Set api_mode from custom_providers entry when switching via hermes model, and clear stale api_mode when the entry has none. Also extract api_mode in _named_custom_provider_map(). Fixes #8181. - Convert OpenAI image_url content blocks to Anthropic image blocks when the endpoint is Anthropic-compatible (MiniMax, MiniMax-CN, or any URL containing /anthropic). Fixes #8147.	2026-04-12 01:44:18 -07:00
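The image-block conversion mentioned for #8147 can be sketched as follows. This is not the code from the commit: the function name `openai_image_to_anthropic` is hypothetical, and the `image/png` fallback and the handling of non-data URLs are assumptions. The block shapes follow the public OpenAI chat and Anthropic Messages content formats.

```python
from typing import Any, Dict


def openai_image_to_anthropic(block: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one OpenAI-style image_url block to an Anthropic image block.

    Hypothetical helper; blocks of any other type are returned unchanged.
    """
    if block.get("type") != "image_url":
        return block

    image_url = block.get("image_url")
    # OpenAI nests the URL as {"image_url": {"url": ...}}; tolerate a bare string too.
    url = image_url.get("url", "") if isinstance(image_url, dict) else str(image_url or "")

    if url.startswith("data:") and ";base64," in url:
        # data:<media_type>;base64,<payload>
        header, data = url.split(";base64,", 1)
        media_type = header[len("data:"):] or "image/png"  # assumed fallback
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": data},
        }

    # Plain http(s) URL: pass it through as a URL source.
    return {"type": "image", "source": {"type": "url", "url": url}}
```

In practice this would be applied to each entry of a message's `content` list only when the resolved endpoint is Anthropic-compatible.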
Harish Kukreja	b1f13a8c5f	fix(agent): route compression aux through live session runtime	2026-04-12 01:34:52 -07:00