feat(nix): container-aware CLI — auto-route hermes chat into managed container

When container.enable = true in the NixOS module, running 'hermes chat' on the host now automatically execs into the managed container via docker/podman exec. This means the interactive CLI runs in the same environment as the gateway service, with access to all container-installed packages and tools.

Implementation:
- NixOS activation script writes .container-mode metadata file to HERMES_HOME with backend, container_name, and hermes_bin path
- File is removed when container mode is disabled (nixos-rebuild switch)
- hermes_cli/config.py: _is_inside_container() detects Docker/Podman indicators (/.dockerenv, /run/.containerenv, cgroup)
- hermes_cli/config.py: get_container_exec_info() reads .container-mode metadata, returns None when already inside a container
- hermes_cli/main.py: _exec_in_container() validates the container is running, then os.execvp() replaces the process with the container exec
- cmd_chat intercepts before normal flow, checks container info, execs

Safety:
- --host flag bypasses container routing (run on host regardless)
- Falls back to host CLI if: container runtime not found, container not running, inspect fails, or any detection error
- Strips --host from forwarded args (not meaningful inside container)
- Already-inside-container detection prevents infinite exec loops

Closes #7380
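
The routing described in this commit message can be sketched as two pure functions. This is a minimal illustration under stated assumptions, not the code from the commit: `parse_container_mode` and `build_exec_argv` are hypothetical names, the runtime path is made up, and the sample metadata values are assumed from the fallback defaults visible in get_container_exec_info(). The real file written by the activation script may differ.

```python
from typing import Dict, List

# Hypothetical sample of a .container-mode file (key=value lines).
# Values are assumptions based on the parser's fallback defaults.
SAMPLE_METADATA = """\
# example metadata
backend=podman
container_name=hermes-agent
hermes_bin=/data/current-package/bin/hermes
"""


def parse_container_mode(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping comments and lines without '='."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            info[key.strip()] = value.strip()
    return info


def build_exec_argv(runtime: str, info: Dict[str, str], cli_args: List[str]) -> List[str]:
    """Build the argv for '<runtime> exec', dropping the host-only --host flag."""
    forwarded = [arg for arg in cli_args if arg != "--host"]
    return [runtime, "exec", "-it", info["container_name"], info["hermes_bin"], *forwarded]


info = parse_container_mode(SAMPLE_METADATA)
# "/usr/bin/podman" is an assumed path; the commit resolves it via shutil.which().
argv = build_exec_argv("/usr/bin/podman", info, ["chat", "--host"])
print(argv)
```

Keeping the argv construction separate from the os.execvp() call makes the flag stripping easy to check without a container runtime installed.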
fix(nix): gate matrix extra to Linux in [all] profile (#7461)
2026-04-11 06:15:44 +05:30 · 2026-04-11 05:59:56 +05:30 · 2026-04-10 17:27:32 -07:00 · 2026-04-10 17:04:38 -07:00 · 2026-04-10 16:51:44 -07:00 · 2026-04-10 16:51:44 -07:00
49 changed files with 2527 additions and 360 deletions
@@ -857,7 +857,7 @@ def _read_main_provider() -> str:
    return ""


-def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
+def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Resolve the active custom/main endpoint the same way the main CLI does.

    This covers both env-driven OPENAI_BASE_URL setups and config-saved custom
@@ -870,18 +870,29 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
        runtime = resolve_runtime_provider(requested="custom")
    except Exception as exc:
        logger.debug("Auxiliary client: custom runtime resolution failed: %s", exc)
-        return None, None
+        runtime = None
+
+    if not isinstance(runtime, dict):
+        openai_base = os.getenv("OPENAI_BASE_URL", "").strip().rstrip("/")
+        openai_key = os.getenv("OPENAI_API_KEY", "").strip()
+        if not openai_base:
+            return None, None, None
+        runtime = {
+            "base_url": openai_base,
+            "api_key": openai_key,
+        }

    custom_base = runtime.get("base_url")
    custom_key = runtime.get("api_key")
+    custom_mode = runtime.get("api_mode")
    if not isinstance(custom_base, str) or not custom_base.strip():
-        return None, None
+        return None, None, None

    custom_base = custom_base.strip().rstrip("/")
    if "openrouter.ai" in custom_base.lower():
        # requested='custom' falls back to OpenRouter when no custom endpoint is
        # configured. Treat that as "no custom endpoint" for auxiliary routing.
-        return None, None
+        return None, None, None

    # Local servers (Ollama, llama.cpp, vLLM, LM Studio) don't require auth.
    # Use a placeholder key — the OpenAI SDK requires a non-empty string but
@@ -890,20 +901,33 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
    if not isinstance(custom_key, str) or not custom_key.strip():
        custom_key = "no-key-required"

-    return custom_base, custom_key.strip()
+    if not isinstance(custom_mode, str) or not custom_mode.strip():
+        custom_mode = None
+
+    return custom_base, custom_key.strip(), custom_mode


 def _current_custom_base_url() -> str:
-    custom_base, _ = _resolve_custom_runtime()
+    custom_base, _, _ = _resolve_custom_runtime()
    return custom_base or ""


 def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
-    custom_base, custom_key = _resolve_custom_runtime()
+    runtime = _resolve_custom_runtime()
+    if len(runtime) == 2:
+        custom_base, custom_key = runtime
+        custom_mode = None
+    else:
+        custom_base, custom_key, custom_mode = runtime
    if not custom_base or not custom_key:
        return None, None
+    if custom_base.lower().startswith(_CODEX_AUX_BASE_URL.lower()):
+        return None, None
    model = _read_main_model() or "gpt-4o-mini"
-    logger.debug("Auxiliary client: custom endpoint (%s)", model)
+    logger.debug("Auxiliary client: custom endpoint (%s, api_mode=%s)", model, custom_mode or "chat_completions")
+    if custom_mode == "codex_responses":
+        real_client = OpenAI(api_key=custom_key, base_url=custom_base)
+        return CodexAuxiliaryClient(real_client, model), model
    return OpenAI(api_key=custom_key, base_url=custom_base), model


@@ -487,7 +487,7 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
    (True, {}, "") to err on the side of showing the skill.
    """
    try:
-        raw = skill_file.read_text(encoding="utf-8")[:2000]
+        raw = skill_file.read_text(encoding="utf-8")
        frontmatter, _ = parse_frontmatter(raw)

        if not skill_matches_platform(frontmatter):
@@ -495,7 +495,7 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:

        return True, frontmatter, extract_skill_description(frontmatter)
    except Exception as e:
-        logger.debug("Failed to parse skill file %s: %s", skill_file, e)
+        logger.warning("Failed to parse skill file %s: %s", skill_file, e)
        return True, {}, ""


@@ -558,9 +558,10 @@ def build_skills_system_prompt(
    # ── Layer 1: in-process LRU cache ─────────────────────────────────
    # Include the resolved platform so per-platform disabled-skill lists
    # produce distinct cache entries (gateway serves multiple platforms).
+    from gateway.session_context import get_session_env
    _platform_hint = (
        os.environ.get("HERMES_PLATFORM")
-        or os.environ.get("HERMES_SESSION_PLATFORM")
+        or get_session_env("HERMES_SESSION_PLATFORM")
        or ""
    )
    cache_key = (
@@ -145,10 +145,11 @@ def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
    if not isinstance(skills_cfg, dict):
        return set()

+    from gateway.session_context import get_session_env
    resolved_platform = (
        platform
        or os.getenv("HERMES_PLATFORM")
-        or os.getenv("HERMES_SESSION_PLATFORM")
+        or get_session_env("HERMES_SESSION_PLATFORM")
    )
    if resolved_platform:
        platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
@@ -31,7 +31,7 @@ except ImportError:
 # Configuration
 # =============================================================================

-HERMES_DIR = get_hermes_home()
+HERMES_DIR = get_hermes_home().resolve()
 CRON_DIR = HERMES_DIR / "cron"
 JOBS_FILE = CRON_DIR / "jobs.json"
 OUTPUT_DIR = CRON_DIR / "output"
@@ -338,10 +338,12 @@ def load_jobs() -> List[Dict[str, Any]]:
                    save_jobs(jobs)
                    logger.warning("Auto-repaired jobs.json (had invalid control characters)")
                return jobs
-        except Exception:
-            return []
-    except IOError:
-        return []
+        except Exception as e:
+            logger.error("Failed to auto-repair jobs.json: %s", e)
+            raise RuntimeError(f"Cron database corrupted and unrepairable: {e}") from e
+    except IOError as e:
+        logger.error("IOError reading jobs.json: %s", e)
+        raise RuntimeError(f"Failed to read cron database: {e}") from e


 def save_jobs(jobs: List[Dict[str, Any]]):
@@ -452,6 +454,7 @@ def create_job(
        "last_run_at": None,
        "last_status": None,
        "last_error": None,
+        "last_delivery_error": None,
        # Delivery configuration
        "deliver": deliver,
        "origin": origin,  # Tracks where job was created for "origin" delivery
@@ -620,8 +623,8 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None,

            save_jobs(jobs)
            return
-    
-    save_jobs(jobs)
+
+    logger.warning("mark_job_run: job_id %s not found, skipping save", job_id)


 def advance_next_run(job_id: str) -> bool:
@@ -769,7 +769,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            _cron_pool.shutdown(wait=False, cancel_futures=True)
            raise
        finally:
-            _cron_pool.shutdown(wait=False)
+            _cron_pool.shutdown(wait=False, cancel_futures=True)

        if _inactivity_timeout:
            # Build diagnostic summary from the agent's activity tracker.
@@ -76,10 +76,15 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
        except Exception as e:
            logger.warning("Channel directory: failed to build %s: %s", platform.value, e)

-    # Telegram, WhatsApp & Signal can't enumerate chats -- pull from session history
-    for plat_name in ("telegram", "whatsapp", "signal", "weixin", "email", "sms", "bluebubbles"):
-        if plat_name not in platforms:
-            platforms[plat_name] = _build_from_sessions(plat_name)
+    # Platforms that don't support direct channel enumeration get session-based
+    # discovery automatically.  Skip infrastructure entries that aren't messaging
+    # platforms — everything else falls through to _build_from_sessions().
+    _SKIP_SESSION_DISCOVERY = frozenset({"local", "api_server", "webhook"})
+    for plat in Platform:
+        plat_name = plat.value
+        if plat_name in _SKIP_SESSION_DISCOVERY or plat_name in platforms:
+            continue
+        platforms[plat_name] = _build_from_sessions(plat_name)

    directory = {
        "updated_at": datetime.now().isoformat(),
@@ -642,6 +642,8 @@ def load_gateway_config() -> GatewayConfig:
                    os.environ["MATRIX_FREE_RESPONSE_ROOMS"] = str(frc)
                if "auto_thread" in matrix_cfg and not os.getenv("MATRIX_AUTO_THREAD"):
                    os.environ["MATRIX_AUTO_THREAD"] = str(matrix_cfg["auto_thread"]).lower()
+                if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
+                    os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()

    except Exception as e:
        logger.warning(
@@ -25,6 +25,7 @@ import hmac
 import json
 import logging
 import os
+import socket as _socket
 import re
 import sqlite3
 import time
@@ -42,6 +43,7 @@ from gateway.config import Platform, PlatformConfig
 from gateway.platforms.base import (
    BasePlatformAdapter,
    SendResult,
+    is_network_accessible,
 )

 logger = logging.getLogger(__name__)
@@ -406,7 +408,8 @@ class APIServerAdapter(BasePlatformAdapter):
        Validate Bearer token from Authorization header.

        Returns None if auth is OK, or a 401 web.Response on failure.
-        If no API key is configured, all requests are allowed.
+        If no API key is configured, all requests are allowed (only when API
+        server is local).
        """
        if not self._api_key:
            return None  # No key configured — allow all (local-only use)
@@ -1713,8 +1716,16 @@ class APIServerAdapter(BasePlatformAdapter):
            if hasattr(sweep_task, "add_done_callback"):
                sweep_task.add_done_callback(self._background_tasks.discard)

+            # Refuse to start network-accessible without authentication
+            if is_network_accessible(self._host) and not self._api_key:
+                logger.error(
+                    "[%s] Refusing to start: binding to %s requires API_SERVER_KEY. "
+                    "Set API_SERVER_KEY or use the default 127.0.0.1.",
+                    self.name, self._host,
+                )
+                return False
+
            # Port conflict detection — fail fast if port is already in use
-            import socket as _socket
            try:
                with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
                    _s.settimeout(1)
@@ -6,10 +6,12 @@ and implement the required methods.
 """

 import asyncio
+import ipaddress
 import logging
 import os
 import random
 import re
+import socket as _socket
 import subprocess
 import sys
 import uuid
@@ -19,6 +21,41 @@ from urllib.parse import urlsplit
 logger = logging.getLogger(__name__)


+def is_network_accessible(host: str) -> bool:
+    """Return True if *host* would expose the server beyond loopback.
+
+    Loopback addresses (127.0.0.1, ::1, IPv4-mapped ::ffff:127.0.0.1)
+    are local-only.  Unspecified addresses (0.0.0.0, ::) bind all
+    interfaces.  Hostnames are resolved; DNS failure fails closed.
+    """
+    try:
+        addr = ipaddress.ip_address(host)
+        if addr.is_loopback:
+            return False
+        # ::ffff:127.0.0.1 — Python reports is_loopback=False for mapped
+        # addresses, so check the underlying IPv4 explicitly.
+        if getattr(addr, "ipv4_mapped", None) and addr.ipv4_mapped.is_loopback:
+            return False
+        return True
+    except ValueError:
+        # when host variable is a hostname, we should try to resolve below
+        pass
+
+    try:
+        resolved = _socket.getaddrinfo(
+            host, None, _socket.AF_UNSPEC, _socket.SOCK_STREAM,
+        )
+        # if the hostname resolves into at least one non-loopback address,
+        # then we consider it to be network accessible
+        for _family, _type, _proto, _canonname, sockaddr in resolved:
+            addr = ipaddress.ip_address(sockaddr[0])
+            if not addr.is_loopback:
+                return True
+        return False
+    except (_socket.gaierror, OSError):
+        return True
+
+
 def _detect_macos_system_proxy() -> str | None:
    """Read the macOS system HTTP(S) proxy via ``scutil --proxy``.

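
The IP-literal branch of is_network_accessible() above can be exercised in isolation. The sketch below is an illustration only: `literal_is_loopback` is a hypothetical helper, not part of this change, and it deliberately skips the hostname-resolution branch so that it needs no DNS. It checks the IPv4-mapped case explicitly, so it gives the same answer whether or not the running Python version treats mapped addresses as loopback.

```python
import ipaddress


def literal_is_loopback(host: str) -> bool:
    """Return True if an IP literal is loopback, including IPv4-mapped IPv6."""
    addr = ipaddress.ip_address(host)  # raises ValueError for hostnames
    if addr.is_loopback:
        return True
    # ipv4_mapped exists only on IPv6Address and is None for plain IPv6.
    mapped = getattr(addr, "ipv4_mapped", None)
    return mapped is not None and mapped.is_loopback


for host in ("127.0.0.1", "::1", "::ffff:127.0.0.1", "0.0.0.0", "::", "192.168.1.10"):
    label = "loopback" if literal_is_loopback(host) else "network-accessible"
    print(f"{host}: {label}")
```

Under the startup guard in this change, the hosts reported as network-accessible are the ones that would require API_SERVER_KEY to be set.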
@@ -1190,6 +1190,8 @@ class FeishuAdapter(BasePlatformAdapter):
                lambda data: self._on_reaction_event("im.message.reaction.deleted_v1", data)
            )
            .register_p2_card_action_trigger(self._on_card_action_trigger)
+            .register_p2_im_chat_member_bot_added_v1(self._on_bot_added_to_chat)
+            .register_p2_im_chat_member_bot_deleted_v1(self._on_bot_removed_from_chat)
            .build()
        )

@@ -18,6 +18,7 @@ Environment variables:
    MATRIX_REQUIRE_MENTION      Require @mention in rooms (default: true)
    MATRIX_FREE_RESPONSE_ROOMS  Comma-separated room IDs exempt from mention requirement
    MATRIX_AUTO_THREAD          Auto-create threads for room messages (default: true)
+    MATRIX_DM_MENTION_THREADS   Create a thread when bot is @mentioned in a DM (default: false)
 """

 from __future__ import annotations
@@ -1043,6 +1044,13 @@ class MatrixAdapter(BasePlatformAdapter):
                if not self._is_bot_mentioned(body, formatted_body):
                    return

+        # DM mention-thread: when enabled, @mentioning bot in a DM creates a thread.
+        if is_dm and not thread_id:
+            dm_mention_threads = os.getenv("MATRIX_DM_MENTION_THREADS", "false").lower() in ("true", "1", "yes")
+            if dm_mention_threads and self._is_bot_mentioned(body, source_content.get("formatted_body")):
+                thread_id = event.event_id
+                self._track_thread(thread_id)
+
        # Strip mention from body when present (including in DMs).
        if self._is_bot_mentioned(body, source_content.get("formatted_body")):
            body = self._strip_mention(body)
@@ -1360,6 +1368,13 @@ class MatrixAdapter(BasePlatformAdapter):
                if not self._is_bot_mentioned(body, formatted_body):
                    return

+        # DM mention-thread: when enabled, @mentioning bot in a DM creates a thread.
+        if is_dm and not thread_id:
+            dm_mention_threads = os.getenv("MATRIX_DM_MENTION_THREADS", "false").lower() in ("true", "1", "yes")
+            if dm_mention_threads and self._is_bot_mentioned(body, source_content.get("formatted_body")):
+                thread_id = event.event_id
+                self._track_thread(thread_id)
+
        # Strip mention from body when present (including in DMs).
        if self._is_bot_mentioned(body, source_content.get("formatted_body")):
            body = self._strip_mention(body)
@@ -1348,12 +1348,28 @@ class GatewayRunner:
                for key, entry in _expired_entries:
                    try:
                        await self._async_flush_memories(entry.session_id)
-                        # Shut down memory provider on the cached agent
-                        cached_agent = self._running_agents.get(key)
-                        if cached_agent and cached_agent is not _AGENT_PENDING_SENTINEL:
+                        # Shut down memory provider and close tool resources
+                        # on the cached agent.  Idle agents live in
+                        # _agent_cache (not _running_agents), so look there.
+                        _cached_agent = None
+                        _cache_lock = getattr(self, "_agent_cache_lock", None)
+                        if _cache_lock is not None:
+                            with _cache_lock:
+                                _cached = self._agent_cache.get(key)
+                                _cached_agent = _cached[0] if isinstance(_cached, tuple) else _cached if _cached else None
+                        # Fall back to _running_agents in case the agent is
+                        # still mid-turn when the expiry fires.
+                        if _cached_agent is None:
+                            _cached_agent = self._running_agents.get(key)
+                        if _cached_agent and _cached_agent is not _AGENT_PENDING_SENTINEL:
                            try:
-                                if hasattr(cached_agent, 'shutdown_memory_provider'):
-                                    cached_agent.shutdown_memory_provider()
+                                if hasattr(_cached_agent, 'shutdown_memory_provider'):
+                                    _cached_agent.shutdown_memory_provider()
+                            except Exception:
+                                pass
+                            try:
+                                if hasattr(_cached_agent, 'close'):
+                                    _cached_agent.close()
                            except Exception:
                                pass
                        # Mark as flushed and persist to disk so the flag
@@ -1536,6 +1552,14 @@ class GatewayRunner:
                    agent.shutdown_memory_provider()
            except Exception:
                pass
+            # Close tool resources (terminal sandboxes, browser daemons,
+            # background processes, httpx clients) to prevent zombie
+            # process accumulation.
+            try:
+                if hasattr(agent, 'close'):
+                    agent.close()
+            except Exception:
+                pass

        for platform, adapter in list(self.adapters.items()):
            try:
@@ -1558,7 +1582,25 @@ class GatewayRunner:
        self._pending_messages.clear()
        self._pending_approvals.clear()
        self._shutdown_event.set()
-        
+
+        # Global cleanup: kill any remaining tool subprocesses not tied
+        # to a specific agent (catch-all for zombie prevention).
+        try:
+            from tools.process_registry import process_registry
+            process_registry.kill_all()
+        except Exception:
+            pass
+        try:
+            from tools.terminal_tool import cleanup_all_environments
+            cleanup_all_environments()
+        except Exception:
+            pass
+        try:
+            from tools.browser_tool import cleanup_all_browsers
+            cleanup_all_browsers()
+        except Exception:
+            pass
+
        from gateway.status import remove_pid_file, write_runtime_status
        remove_pid_file()
        try:
@@ -2400,8 +2442,8 @@ class GatewayRunner:
        # Build session context
        context = build_session_context(source, self.config, session_entry)
        
-        # Set environment variables for tools
-        self._set_session_env(context)
+        # Set session context variables for tools (task-local, concurrency-safe)
+        _session_env_tokens = self._set_session_env(context)
        
        # Read privacy.redact_pii from config (re-read per message)
        _redact_pii = False
@@ -3234,8 +3276,8 @@ class GatewayRunner:
                "Try again or use /reset to start a fresh session."
            )
        finally:
-            # Clear session env
-            self._clear_session_env()
+            # Restore session context variables to their pre-handler state
+            self._clear_session_env(_session_env_tokens)
    
    def _format_session_info(self) -> str:
        """Resolve current model config and return a formatted info block.
@@ -3335,8 +3377,22 @@ class GatewayRunner:
                _flush_task.add_done_callback(self._background_tasks.discard)
        except Exception as e:
            logger.debug("Gateway memory flush on reset failed: %s", e)
+        # Close tool resources on the old agent (terminal sandboxes, browser
+        # daemons, background processes) before evicting from cache.
+        # Guard with getattr because test fixtures may skip __init__.
+        _cache_lock = getattr(self, "_agent_cache_lock", None)
+        if _cache_lock is not None:
+            with _cache_lock:
+                _cached = self._agent_cache.get(session_key)
+                _old_agent = _cached[0] if isinstance(_cached, tuple) else _cached if _cached else None
+            if _old_agent is not None:
+                try:
+                    if hasattr(_old_agent, "close"):
+                        _old_agent.close()
+                except Exception:
+                    pass
        self._evict_cached_agent(session_key)
-        
+
        try:
            from tools.env_passthrough import clear_env_passthrough
            clear_env_passthrough()
@@ -6120,20 +6176,27 @@ class GatewayRunner:

        return True

-    def _set_session_env(self, context: SessionContext) -> None:
-        """Set environment variables for the current session."""
-        os.environ["HERMES_SESSION_PLATFORM"] = context.source.platform.value
-        os.environ["HERMES_SESSION_CHAT_ID"] = context.source.chat_id
-        if context.source.chat_name:
-            os.environ["HERMES_SESSION_CHAT_NAME"] = context.source.chat_name
-        if context.source.thread_id:
-            os.environ["HERMES_SESSION_THREAD_ID"] = str(context.source.thread_id)
-    
-    def _clear_session_env(self) -> None:
-        """Clear session environment variables."""
-        for var in ["HERMES_SESSION_PLATFORM", "HERMES_SESSION_CHAT_ID", "HERMES_SESSION_CHAT_NAME", "HERMES_SESSION_THREAD_ID"]:
-            if var in os.environ:
-                del os.environ[var]
+    def _set_session_env(self, context: SessionContext) -> list:
+        """Set session context variables for the current async task.
+
+        Uses ``contextvars`` instead of ``os.environ`` so that concurrent
+        gateway messages cannot overwrite each other's session state.
+
+        Returns a list of reset tokens; pass them to ``_clear_session_env``
+        in a ``finally`` block.
+        """
+        from gateway.session_context import set_session_vars
+        return set_session_vars(
+            platform=context.source.platform.value,
+            chat_id=context.source.chat_id,
+            chat_name=context.source.chat_name or "",
+            thread_id=str(context.source.thread_id) if context.source.thread_id else "",
+        )
+
+    def _clear_session_env(self, tokens: list) -> None:
+        """Restore session context variables to their pre-handler values."""
+        from gateway.session_context import clear_session_vars
+        clear_session_vars(tokens)
    
    async def _enrich_message_with_vision(
        self,
@@ -0,0 +1,113 @@
+"""
+Session-scoped context variables for the Hermes gateway.
+
+Replaces the previous ``os.environ``-based session state
+(``HERMES_SESSION_PLATFORM``, ``HERMES_SESSION_CHAT_ID``, etc.) with
+Python's ``contextvars.ContextVar``.
+
+**Why this matters**
+
+The gateway processes messages concurrently via ``asyncio``.  When two
+messages arrive at the same time the old code did:
+
+    os.environ["HERMES_SESSION_THREAD_ID"] = str(context.source.thread_id)
+
+Because ``os.environ`` is *process-global*, Message A's value was
+silently overwritten by Message B before Message A's agent finished
+running.  Background-task notifications and tool calls therefore routed
+to the wrong thread.
+
+``contextvars.ContextVar`` values are *task-local*: each ``asyncio``
+task gets its own copy (worker threads see it only if the context is
+propagated, e.g. via ``asyncio.to_thread``), so messages never interfere.
+
+**Backward compatibility**
+
+The public helper ``get_session_env(name, default="")`` mirrors the old
+``os.getenv("HERMES_SESSION_*", ...)`` calls.  Existing tool code only
+needs to replace the import + call site:
+
+    # before
+    import os
+    platform = os.getenv("HERMES_SESSION_PLATFORM", "")
+
+    # after
+    from gateway.session_context import get_session_env
+    platform = get_session_env("HERMES_SESSION_PLATFORM", "")
+"""
+
+from contextvars import ContextVar
+
+# ---------------------------------------------------------------------------
+# Per-task session variables
+# ---------------------------------------------------------------------------
+
+_SESSION_PLATFORM: ContextVar[str] = ContextVar("HERMES_SESSION_PLATFORM", default="")
+_SESSION_CHAT_ID: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_ID", default="")
+_SESSION_CHAT_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_NAME", default="")
+_SESSION_THREAD_ID: ContextVar[str] = ContextVar("HERMES_SESSION_THREAD_ID", default="")
+
+_VAR_MAP = {
+    "HERMES_SESSION_PLATFORM": _SESSION_PLATFORM,
+    "HERMES_SESSION_CHAT_ID": _SESSION_CHAT_ID,
+    "HERMES_SESSION_CHAT_NAME": _SESSION_CHAT_NAME,
+    "HERMES_SESSION_THREAD_ID": _SESSION_THREAD_ID,
+}
+
+
+def set_session_vars(
+    platform: str = "",
+    chat_id: str = "",
+    chat_name: str = "",
+    thread_id: str = "",
+) -> list:
+    """Set all session context variables and return reset tokens.
+
+    Call ``clear_session_vars(tokens)`` in a ``finally`` block to restore
+    the previous values when the handler exits.
+
+    Returns a list of ``Token`` objects (one per variable) that can be
+    passed to ``clear_session_vars``.
+    """
+    tokens = [
+        _SESSION_PLATFORM.set(platform),
+        _SESSION_CHAT_ID.set(chat_id),
+        _SESSION_CHAT_NAME.set(chat_name),
+        _SESSION_THREAD_ID.set(thread_id),
+    ]
+    return tokens
+
+
+def clear_session_vars(tokens: list) -> None:
+    """Restore session context variables to their pre-handler values."""
+    if not tokens:
+        return
+    vars_in_order = [
+        _SESSION_PLATFORM,
+        _SESSION_CHAT_ID,
+        _SESSION_CHAT_NAME,
+        _SESSION_THREAD_ID,
+    ]
+    for var, token in zip(vars_in_order, tokens):
+        var.reset(token)
+
+
+def get_session_env(name: str, default: str = "") -> str:
+    """Read a session context variable by its legacy ``HERMES_SESSION_*`` name.
+
+    Drop-in replacement for ``os.getenv("HERMES_SESSION_*", default)``.
+
+    Resolution order:
+    1. Context variable (set by the gateway for concurrency-safe access)
+    2. ``os.environ`` (used by CLI, cron scheduler, and tests)
+    3. *default*
+    """
+    import os
+
+    var = _VAR_MAP.get(name)
+    if var is not None:
+        value = var.get()
+        if value:
+            return value
+    # Fall back to os.environ for CLI, cron, and test compatibility
+    return os.getenv(name, default)
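
The task isolation that the module docstring above relies on can be shown with a small standalone script. This is a sketch, not gateway code: `SESSION_THREAD_ID` and `handle_message` are hypothetical stand-ins for the session variables and the message handler. It uses the same set/reset token pattern as set_session_vars() and clear_session_vars().

```python
import asyncio
from contextvars import ContextVar

# Hypothetical stand-in for one of the session variables.
SESSION_THREAD_ID: ContextVar[str] = ContextVar("SESSION_THREAD_ID", default="")


async def handle_message(thread_id: str, delay: float) -> str:
    """Set the variable, yield to the event loop, then read it back."""
    token = SESSION_THREAD_ID.set(thread_id)
    try:
        await asyncio.sleep(delay)  # the other handler runs during this await
        return SESSION_THREAD_ID.get()
    finally:
        SESSION_THREAD_ID.reset(token)  # restore the pre-handler value


async def main() -> list:
    # gather() wraps each coroutine in its own Task, and every Task runs
    # in a copy of the current context.
    return await asyncio.gather(
        handle_message("thread-A", 0.02),
        handle_message("thread-B", 0.01),
    )


results = asyncio.run(main())
print(results)  # ['thread-A', 'thread-B']
```

Handler B sets its value while handler A is still sleeping, yet A reads back its own value. With a shared os.environ entry, A would have observed B's write.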
@@ -141,6 +141,68 @@ def managed_error(action: str = "modify configuration"):
    print(format_managed_message(action), file=sys.stderr)


+# =============================================================================
+# Container-aware CLI (NixOS container mode)
+# =============================================================================
+
+def _is_inside_container() -> bool:
+    """Detect if we're already running inside a Docker/Podman container."""
+    # Standard Docker/Podman indicators
+    if os.path.exists("/.dockerenv"):
+        return True
+    # Podman uses /run/.containerenv
+    if os.path.exists("/run/.containerenv"):
+        return True
+    # Check cgroup for container runtime evidence (works for both Docker & Podman)
+    try:
+        with open("/proc/1/cgroup", "r") as f:
+            cgroup = f.read()
+            if "docker" in cgroup or "podman" in cgroup or "/lxc/" in cgroup:
+                return True
+    except (OSError, IOError):
+        pass
+    return False
+
+
+def get_container_exec_info() -> Optional[dict]:
+    """Read container mode metadata from HERMES_HOME/.container-mode.
+
+    Returns a dict with keys: backend, container_name, hermes_bin
+    or None if container mode is not active or we're already inside the container.
+
+    The .container-mode file is written by the NixOS activation script when
+    container.enable = true. It tells the host CLI to exec into the container
+    instead of running locally.
+    """
+    if _is_inside_container():
+        return None
+
+    container_mode_file = get_hermes_home() / ".container-mode"
+    if not container_mode_file.exists():
+        return None
+
+    try:
+        info = {}
+        with open(container_mode_file, "r") as f:
+            for line in f:
+                line = line.strip()
+                if "=" in line and not line.startswith("#"):
+                    key, _, value = line.partition("=")
+                    info[key.strip()] = value.strip()
+
+        backend = info.get("backend", "docker")
+        container_name = info.get("container_name", "hermes-agent")
+        hermes_bin = info.get("hermes_bin", "/data/current-package/bin/hermes")
+
+        return {
+            "backend": backend,
+            "container_name": container_name,
+            "hermes_bin": hermes_bin,
+        }
+    except (OSError, IOError):
+        return None
+
+
 # =============================================================================
 # Config paths
 # =============================================================================
@@ -1209,8 +1271,8 @@ OPTIONAL_ENV_VARS = {
        "advanced": True,
    },
    "API_SERVER_KEY": {
-        "description": "Bearer token for API server authentication. If empty, all requests are allowed (local use only).",
-        "prompt": "API server auth key (optional)",
+        "description": "Bearer token for API server authentication. Required for non-loopback binding; server refuses to start without it. On loopback (127.0.0.1), all requests are allowed if empty.",
+        "prompt": "API server auth key (required for network access)",
        "url": None,
        "password": True,
        "category": "messaging",
@@ -1225,7 +1287,7 @@ OPTIONAL_ENV_VARS = {
        "advanced": True,
    },
    "API_SERVER_HOST": {
-        "description": "Host/bind address for the API server (default: 127.0.0.1). Use 0.0.0.0 for network access — requires API_SERVER_KEY for security.",
+        "description": "Host/bind address for the API server (default: 127.0.0.1). Use 0.0.0.0 for network access — server refuses to start without API_SERVER_KEY.",
        "prompt": "API server host",
        "url": None,
        "password": False,
@@ -528,6 +528,56 @@ def _resolve_last_cli_session() -> Optional[str]:
    return None


+def _exec_in_container(container_info: dict, cli_args: list):
+    """Replace the current process with a command inside the managed container.
+
+    Uses os.execvp to hand off to docker/podman exec, preserving the TTY.
+    Returns normally if the runtime is missing or the container is not running.
+
+    Args:
+        container_info: dict with backend, container_name, hermes_bin
+        cli_args: the original CLI arguments (everything after 'hermes')
+    """
+    import shutil
+    import subprocess
+
+    backend = container_info["backend"]
+    container_name = container_info["container_name"]
+    hermes_bin = container_info["hermes_bin"]
+
+    # Find the container runtime on PATH
+    runtime = shutil.which(backend)
+    if not runtime:
+        print(f"Warning: {backend} not found on PATH, falling back to host CLI.",
+              file=sys.stderr)
+        return  # Fall through to normal CLI
+
+    # Check if the container is actually running
+    try:
+        result = subprocess.run(
+            [runtime, "inspect", "--format", "{{.State.Running}}", container_name],
+            capture_output=True, text=True, timeout=5
+        )
+        if result.returncode != 0 or result.stdout.strip().lower() != "true":
+            print(f"Warning: container '{container_name}' is not running, falling back to host CLI.",
+                  file=sys.stderr)
+            return
+    except (subprocess.TimeoutExpired, OSError):
+        return  # Fall through on any error
+
+    # Filter out --host flag from forwarded args (it's not meaningful inside)
+    forwarded_args = [a for a in cli_args if a != "--host"]
+
+    # Build the exec command
+    exec_cmd = [runtime, "exec", "-it", container_name, hermes_bin] + forwarded_args
+
+    print(f"Routing to container '{container_name}' via {backend}...",
+          file=sys.stderr)
+
+    # Replace the current process — this never returns on success
+    os.execvp(runtime, exec_cmd)
+
+
 def _resolve_session_by_name_or_id(name_or_id: str) -> Optional[str]:
    """Resolve a session name (title) or ID to a session ID.

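Reviewer sketch: the argv assembly in `_exec_in_container` can be isolated into a pure function, which makes the `--host` stripping easy to unit-test without a container runtime. `build_exec_cmd` is a hypothetical helper, not part of this PR. The flag list mirrors the hunk above.

```python
from typing import List, Sequence


def build_exec_cmd(
    runtime: str,
    container_name: str,
    hermes_bin: str,
    cli_args: Sequence[str],
) -> List[str]:
    """Build the argv for `<runtime> exec -it <container> <hermes_bin> ...`."""
    # --host only has meaning on the host side, so it is not forwarded.
    forwarded = [arg for arg in cli_args if arg != "--host"]
    return [runtime, "exec", "-it", container_name, hermes_bin] + forwarded
```

The result can be handed to `os.execvp(cmd[0], cmd)` unchanged, since `os.execvp` expects the program name as the first element of the argument list.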
@@ -556,6 +606,21 @@ def _resolve_session_by_name_or_id(name_or_id: str) -> Optional[str]:

 def cmd_chat(args):
    """Run interactive chat CLI."""
+    # ── Container-aware routing ──────────────────────────────────────────
+    # When NixOS container mode is active and we're on the host, exec into
+    # the managed container instead of running locally. --host bypasses this.
+    if not getattr(args, "host", False):
+        try:
+            from hermes_cli.config import get_container_exec_info
+            container_info = get_container_exec_info()
+            if container_info:
+                _exec_in_container(container_info, sys.argv[1:])
+                # On success os.execvp replaces this process and never
+                # returns. If we get here, the runtime or container was
+                # unavailable, so fall through to the host CLI.
+        except Exception:
+            pass  # Fall through to normal CLI on any detection error
+
    # Resolve --continue into --resume with the latest CLI session or by name
    continue_val = getattr(args, "continue_last", None)
    if continue_val and not getattr(args, "resume", None):
@@ -4386,6 +4451,12 @@ For more help on a command:
        default=None,
        help="Session source tag for filtering (default: cli). Use 'tool' for third-party integrations that should not appear in user session lists."
    )
+    chat_parser.add_argument(
+        "--host",
+        action="store_true",
+        default=False,
+        help="Run on the host even when NixOS container mode is active (bypass container exec)"
+    )
    chat_parser.set_defaults(func=cmd_chat)

    # =========================================================================
@@ -611,6 +611,22 @@
          chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/.managed
          chmod 0644 ${cfg.stateDir}/.hermes/.managed

+          # Container mode metadata — tells the host CLI to exec into the
+          # container instead of running locally. Removed when container mode
+          # is disabled so the host CLI falls back to native execution.
+          ${if cfg.container.enable then ''
+            cat > ${cfg.stateDir}/.hermes/.container-mode <<'HERMES_CONTAINER_MODE_EOF'
+# Written by NixOS activation script. Do not edit manually.
+backend=${cfg.container.backend}
+container_name=${containerName}
+hermes_bin=${containerDataDir}/current-package/bin/hermes
+HERMES_CONTAINER_MODE_EOF
+            chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/.container-mode
+            chmod 0644 ${cfg.stateDir}/.hermes/.container-mode
+          '' else ''
+            rm -f ${cfg.stateDir}/.hermes/.container-mode
+          ''}
+
          # Seed auth file if provided
          ${lib.optionalString (cfg.authFile != null) ''
            ${if cfg.authFileForceOverwrite then ''
@@ -88,10 +88,10 @@ all = [
  "hermes-agent[modal]",
  "hermes-agent[daytona]",
  "hermes-agent[messaging]",
-  # matrix excluded: python-olm (required by matrix-nio[e2e]) is upstream-broken
-  # on modern macOS (archived libolm, C++ errors with Clang 21+). Including it
-  # here causes the entire [all] install to fail, dropping all other extras.
-  # Users who need Matrix can install manually: pip install 'hermes-agent[matrix]'
+  # matrix: python-olm (required by matrix-nio[e2e]) is upstream-broken on
+  # modern macOS (archived libolm, C++ errors with Clang 21+).  On Linux the
+  # [matrix] extra's own marker pulls in the [e2e] variant automatically.
+  "hermes-agent[matrix]; sys_platform == 'linux'",
  "hermes-agent[cron]",
  "hermes-agent[cli]",
  "hermes-agent[dev]",
@@ -1977,19 +1977,14 @@ class AIAgent:
            except Exception as e:
                logger.debug("Background memory/skill review failed: %s", e)
            finally:
-                # Explicitly close the OpenAI/httpx client so GC doesn't
-                # try to clean it up on a dead asyncio event loop (which
-                # produces "Event loop is closed" errors in the terminal).
+                # Close all resources (httpx client, subprocesses, etc.) so
+                # GC doesn't try to clean them up on a dead asyncio event
+                # loop (which produces "Event loop is closed" errors).
                if review_agent is not None:
-                    client = getattr(review_agent, "client", None)
-                    if client is not None:
-                        try:
-                            review_agent._close_openai_client(
-                                client, reason="bg_review_done", shared=True
-                            )
-                            review_agent.client = None
-                        except Exception:
-                            pass
+                    try:
+                        review_agent.close()
+                    except Exception:
+                        pass

        t = threading.Thread(target=_run_review, daemon=True, name="bg-review")
        t.start()
@@ -2729,6 +2724,64 @@ class AIAgent:
            except Exception:
                pass
    
+    def close(self) -> None:
+        """Release all resources held by this agent instance.
+
+        Cleans up resources that would otherwise leak or become orphaned:
+        - Background processes tracked in ProcessRegistry
+        - Terminal sandbox environments
+        - Browser daemon sessions
+        - Active child agents (subagent delegation)
+        - OpenAI/httpx client connections
+
+        Safe to call multiple times (idempotent).  Each cleanup step is
+        independently guarded so a failure in one does not prevent the rest.
+        """
+        task_id = getattr(self, "session_id", None) or ""
+
+        # 1. Kill background processes for this task
+        try:
+            from tools.process_registry import process_registry
+            process_registry.kill_all(task_id=task_id)
+        except Exception:
+            pass
+
+        # 2. Clean terminal sandbox environments
+        try:
+            from tools.terminal_tool import cleanup_vm
+            cleanup_vm(task_id)
+        except Exception:
+            pass
+
+        # 3. Clean browser daemon sessions
+        try:
+            from tools.browser_tool import cleanup_browser
+            cleanup_browser(task_id)
+        except Exception:
+            pass
+
+        # 4. Close active child agents
+        try:
+            with self._active_children_lock:
+                children = list(self._active_children)
+                self._active_children.clear()
+            for child in children:
+                try:
+                    child.close()
+                except Exception:
+                    pass
+        except Exception:
+            pass
+
+        # 5. Close the OpenAI/httpx client
+        try:
+            client = getattr(self, "client", None)
+            if client is not None:
+                self._close_openai_client(client, reason="agent_close", shared=True)
+                self.client = None
+        except Exception:
+            pass
+
    def _hydrate_todo_store(self, history: List[Dict[str, Any]]) -> None:
        """
        Recover todo state from conversation history.
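Reviewer sketch: `close()` above wraps each cleanup step in its own `try/except` so one failure cannot block the rest. The same pattern can be written as a small loop. `run_cleanup_steps` is a hypothetical helper, and unlike `close()` it returns the names of failed steps for illustration instead of discarding them silently.

```python
from typing import Callable, Iterable, List, Tuple


def run_cleanup_steps(steps: Iterable[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run every step in order; return the names of the steps that raised."""
    failed: List[str] = []
    for name, step in steps:
        try:
            step()
        except Exception:
            # Record the failure and keep going so later steps still run.
            failed.append(name)
    return failed
```

Returning the failed names keeps the best-effort behaviour while still giving a caller something to log at debug level.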
@@ -658,6 +658,19 @@ class TestGetTextAuxiliaryClient:
        assert client is None
        assert model is None

+    def test_custom_endpoint_uses_codex_wrapper_when_runtime_requests_responses_api(self):
+        with patch("agent.auxiliary_client._resolve_custom_runtime",
+                   return_value=("https://api.openai.com/v1", "sk-test", "codex_responses")), \
+             patch("agent.auxiliary_client._read_main_model", return_value="gpt-5.3-codex"), \
+             patch("agent.auxiliary_client.OpenAI") as mock_openai:
+            client, model = get_text_auxiliary_client()
+
+        from agent.auxiliary_client import CodexAuxiliaryClient
+        assert isinstance(client, CodexAuxiliaryClient)
+        assert model == "gpt-5.3-codex"
+        assert mock_openai.call_args.kwargs["base_url"] == "https://api.openai.com/v1"
+        assert mock_openai.call_args.kwargs["api_key"] == "sk-test"
+

 class TestVisionClientFallback:
    """Vision client auto mode resolves known-good multimodal backends."""
@@ -1,4 +1,4 @@
-"""Shared fixtures for Telegram gateway e2e tests.
+"""Shared fixtures for gateway e2e tests (Telegram, Discord, Slack).

 These tests exercise the full async message flow:
    adapter.handle_message(event)
@@ -14,19 +14,22 @@ import sys
 import uuid
 from datetime import datetime
 from types import SimpleNamespace
-from unittest.mock import AsyncMock, MagicMock
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import pytest

 from gateway.config import GatewayConfig, Platform, PlatformConfig
 from gateway.platforms.base import MessageEvent, SendResult
 from gateway.session import SessionEntry, SessionSource, build_session_key


-#Ensure telegram module is available (mock it if not installed)
+# Platform library mocks

+# Ensure telegram module is available (mock it if not installed)
 def _ensure_telegram_mock():
    """Install mock telegram modules so TelegramAdapter can be imported."""
    if "telegram" in sys.modules and hasattr(sys.modules["telegram"], "__file__"):
         return  # Real library installed

    telegram_mod = MagicMock()
    telegram_mod.Update = MagicMock()
@@ -51,24 +54,118 @@ def _ensure_telegram_mock():
        sys.modules.setdefault(name, telegram_mod)


-_ensure_telegram_mock()
+# Ensure discord module is available (mock it if not installed)
+def _ensure_discord_mock():
+    """Install mock discord modules so DiscordAdapter can be imported."""
+    if "discord" in sys.modules and hasattr(sys.modules["discord"], "__file__"):
+        return  # Real library installed

+    discord_mod = MagicMock()
+    discord_mod.Intents.default.return_value = MagicMock()
+    discord_mod.DMChannel = type("DMChannel", (), {})
+    discord_mod.Thread = type("Thread", (), {})
+    discord_mod.ForumChannel = type("ForumChannel", (), {})
+    discord_mod.Interaction = object
+    discord_mod.app_commands = SimpleNamespace(
+        describe=lambda **kwargs: (lambda fn: fn),
+        choices=lambda **kwargs: (lambda fn: fn),
+        Choice=lambda **kwargs: SimpleNamespace(**kwargs),
+    )
+    discord_mod.opus.is_loaded.return_value = True
+
+    ext_mod = MagicMock()
+    commands_mod = MagicMock()
+    commands_mod.Bot = MagicMock
+    ext_mod.commands = commands_mod
+
+    sys.modules.setdefault("discord", discord_mod)
+    sys.modules.setdefault("discord.ext", ext_mod)
+    sys.modules.setdefault("discord.ext.commands", commands_mod)
+    sys.modules.setdefault("discord.opus", discord_mod.opus)
+
+
+def _ensure_slack_mock():
+    """Install mock slack modules so SlackAdapter can be imported."""
+    if "slack_bolt" in sys.modules and hasattr(sys.modules["slack_bolt"], "__file__"):
+        return  # Real library installed
+
+    slack_bolt = MagicMock()
+    slack_bolt.async_app.AsyncApp = MagicMock
+    slack_bolt.adapter.socket_mode.async_handler.AsyncSocketModeHandler = MagicMock
+
+    slack_sdk = MagicMock()
+    slack_sdk.web.async_client.AsyncWebClient = MagicMock
+
+    for name, mod in [
+        ("slack_bolt", slack_bolt),
+        ("slack_bolt.async_app", slack_bolt.async_app),
+        ("slack_bolt.adapter", slack_bolt.adapter),
+        ("slack_bolt.adapter.socket_mode", slack_bolt.adapter.socket_mode),
+        ("slack_bolt.adapter.socket_mode.async_handler", slack_bolt.adapter.socket_mode.async_handler),
+        ("slack_sdk", slack_sdk),
+        ("slack_sdk.web", slack_sdk.web),
+        ("slack_sdk.web.async_client", slack_sdk.web.async_client),
+    ]:
+        sys.modules.setdefault(name, mod)
+
+
+_ensure_telegram_mock()
+_ensure_discord_mock()
+_ensure_slack_mock()
+
+from gateway.platforms.discord import DiscordAdapter   # noqa: E402
 from gateway.platforms.telegram import TelegramAdapter  # noqa: E402

+import gateway.platforms.slack as _slack_mod  # noqa: E402
+_slack_mod.SLACK_AVAILABLE = True
+from gateway.platforms.slack import SlackAdapter  # noqa: E402

-#GatewayRunner factory (based on tests/gateway/test_status_command.py)

-def make_runner(session_entry: SessionEntry) -> "GatewayRunner":
+# Platform-generic factories
+
+def make_source(platform: Platform, chat_id: str = "e2e-chat-1", user_id: str = "e2e-user-1") -> SessionSource:
+    return SessionSource(
+        platform=platform,
+        chat_id=chat_id,
+        user_id=user_id,
+        user_name="e2e_tester",
+        chat_type="dm",
+    )
+
+
+def make_session_entry(platform: Platform, source: SessionSource = None) -> SessionEntry:
+    source = source or make_source(platform)
+    return SessionEntry(
+        session_key=build_session_key(source),
+        session_id=f"sess-{uuid.uuid4().hex[:8]}",
+        created_at=datetime.now(),
+        updated_at=datetime.now(),
+        platform=platform,
+        chat_type="dm",
+    )
+
+
+def make_event(platform: Platform, text: str = "/help", chat_id: str = "e2e-chat-1", user_id: str = "e2e-user-1") -> MessageEvent:
+    return MessageEvent(
+        text=text,
+        source=make_source(platform, chat_id, user_id),
+        message_id=f"msg-{uuid.uuid4().hex[:8]}",
+    )
+
+
+def make_runner(platform: Platform, session_entry: SessionEntry = None) -> "GatewayRunner":
    """Create a GatewayRunner with mocked internals for e2e testing.

    Skips __init__ to avoid filesystem/network side effects.
-    All command-dispatch dependencies are wired manually.
    """
    from gateway.run import GatewayRunner

+    if session_entry is None:
+        session_entry = make_session_entry(platform)
+
    runner = object.__new__(GatewayRunner)
    runner.config = GatewayConfig(
-        platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="e2e-test-token")}
+        platforms={platform: PlatformConfig(enabled=True, token="e2e-test-token")}
    )
    runner.adapters = {}
    runner._voice_mode = {}
@@ -99,7 +196,6 @@ def make_runner(session_entry: SessionEntry) -> "GatewayRunner":
    runner._capture_gateway_honcho_if_configured = lambda *a, **kw: None
    runner._emit_gateway_run_progress = AsyncMock()

-    # Pairing store (used by authorization rejection path)
    runner.pairing_store = MagicMock()
    runner.pairing_store._is_rate_limited = MagicMock(return_value=False)
    runner.pairing_store.generate_code = MagicMock(return_value="ABC123")
@@ -107,67 +203,63 @@ def make_runner(session_entry: SessionEntry) -> "GatewayRunner":
    return runner


-#TelegramAdapter factory
+def make_adapter(platform: Platform, runner=None):
+    """Create a platform adapter wired to *runner*, with send methods mocked."""
+    if runner is None:
+        runner = make_runner(platform)

-def make_adapter(runner) -> TelegramAdapter:
-    """Create a TelegramAdapter wired to *runner*, with send methods mocked.
-
-    connect() is NOT called — no polling, no token lock, no real HTTP.
-    """
    config = PlatformConfig(enabled=True, token="e2e-test-token")
-    adapter = TelegramAdapter(config)

-    # Mock outbound methods so tests can capture what was sent
+    if platform == Platform.DISCORD:
+        with patch.object(DiscordAdapter, "_load_participated_threads", return_value=set()):
+            adapter = DiscordAdapter(config)
+        platform_key = Platform.DISCORD
+    elif platform == Platform.SLACK:
+        adapter = SlackAdapter(config)
+        platform_key = Platform.SLACK
+    else:
+        adapter = TelegramAdapter(config)
+        platform_key = Platform.TELEGRAM
+
    adapter.send = AsyncMock(return_value=SendResult(success=True, message_id="e2e-resp-1"))
    adapter.send_typing = AsyncMock()

-    # Wire adapter ↔ runner
    adapter.set_message_handler(runner._handle_message)
-    runner.adapters[Platform.TELEGRAM] = adapter
+    runner.adapters[platform_key] = adapter

    return adapter


-#Helpers
-
-def make_source(chat_id: str = "e2e-chat-1", user_id: str = "e2e-user-1") -> SessionSource:
-    return SessionSource(
-        platform=Platform.TELEGRAM,
-        chat_id=chat_id,
-        user_id=user_id,
-        user_name="e2e_tester",
-        chat_type="dm",
-    )
-
-
-def make_event(text: str, chat_id: str = "e2e-chat-1", user_id: str = "e2e-user-1") -> MessageEvent:
-    return MessageEvent(
-        text=text,
-        source=make_source(chat_id, user_id),
-        message_id=f"msg-{uuid.uuid4().hex[:8]}",
-    )
-
-
-def make_session_entry(source: SessionSource = None) -> SessionEntry:
-    source = source or make_source()
-    return SessionEntry(
-        session_key=build_session_key(source),
-        session_id=f"sess-{uuid.uuid4().hex[:8]}",
-        created_at=datetime.now(),
-        updated_at=datetime.now(),
-        platform=Platform.TELEGRAM,
-        chat_type="dm",
-    )
-
-
-async def send_and_capture(adapter: TelegramAdapter, text: str, **event_kwargs) -> AsyncMock:
-    """Send a message through the full e2e flow and return the send mock.
-
-    Drives: adapter.handle_message → background task → runner dispatch → adapter.send.
-    """
-    event = make_event(text, **event_kwargs)
+async def send_and_capture(adapter, text: str, platform: Platform, **event_kwargs) -> AsyncMock:
+    """Send a message through the full e2e flow and return the send mock."""
+    event = make_event(platform, text, **event_kwargs)
    adapter.send.reset_mock()
    await adapter.handle_message(event)
-    # Let the background task complete
    await asyncio.sleep(0.3)
    return adapter.send
+
+
+# Parametrized fixtures for platform-generic tests
+@pytest.fixture(params=[Platform.TELEGRAM, Platform.DISCORD, Platform.SLACK], ids=["telegram", "discord", "slack"])
+def platform(request):
+    return request.param
+
+
+@pytest.fixture()
+def source(platform):
+    return make_source(platform)
+
+
+@pytest.fixture()
+def session_entry(platform, source):
+    return make_session_entry(platform, source)
+
+
+@pytest.fixture()
+def runner(platform, session_entry):
+    return make_runner(platform, session_entry)
+
+
+@pytest.fixture()
+def adapter(platform, runner):
+    return make_adapter(platform, runner)
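Reviewer sketch: the three `_ensure_*_mock` helpers in this conftest share one pattern: skip if the real library is importable, otherwise register a `MagicMock` for the package and each dotted submodule in `sys.modules`. A generic version might look like the following. `ensure_mock_package` and the package name `hermes_sketch_fakelib` are hypothetical and exist only for this illustration.

```python
import importlib
import sys
from typing import Iterable
from unittest.mock import MagicMock


def ensure_mock_package(name: str, submodules: Iterable[str] = ()) -> None:
    """Register MagicMock stand-ins for *name* and its dotted submodules."""
    existing = sys.modules.get(name)
    if existing is not None and hasattr(existing, "__file__"):
        return  # Real library installed

    sys.modules.setdefault(name, MagicMock())
    for dotted in submodules:
        # Walk the attribute path so that sys.modules["pkg.sub"] and
        # pkg.sub refer to the same mock object.
        mod = sys.modules[name]
        for part in dotted.split("."):
            mod = getattr(mod, part)
        sys.modules.setdefault(f"{name}.{dotted}", mod)


# Usage: after this call, importing the fake package returns the mocks.
ensure_mock_package("hermes_sketch_fakelib", ["ext", "ext.commands"])
commands = importlib.import_module("hermes_sketch_fakelib.ext.commands")
```

The `__file__` check works because `MagicMock` raises `AttributeError` for dunder attributes it does not configure, so an already-registered mock is never mistaken for the real library.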
@@ -1,4 +1,4 @@
-"""E2E tests for Telegram gateway slash commands.
+"""E2E tests for gateway slash commands (Telegram, Discord, Slack).

 Each test drives a message through the full async pipeline:
    adapter.handle_message(event)
@@ -7,6 +7,7 @@ Each test drives a message through the full async pipeline:
        → adapter.send() (captured for assertions)

 No LLM involved — only gateway-level commands are tested.
+Tests are parametrized over platforms via the ``platform`` fixture in conftest.
 """

 import asyncio
@@ -15,46 +16,15 @@ from unittest.mock import AsyncMock
 import pytest

 from gateway.platforms.base import SendResult
-from tests.e2e.conftest import (
-    make_adapter,
-    make_event,
-    make_runner,
-    make_session_entry,
-    make_source,
-    send_and_capture,
-)
+from tests.e2e.conftest import make_event, send_and_capture


-#Fixtures
-
-@pytest.fixture()
-def source():
-    return make_source()
-
-
-@pytest.fixture()
-def session_entry(source):
-    return make_session_entry(source)
-
-
-@pytest.fixture()
-def runner(session_entry):
-    return make_runner(session_entry)
-
-
-@pytest.fixture()
-def adapter(runner):
-    return make_adapter(runner)
-
-
-#Tests
-
-class TestTelegramSlashCommands:
+class TestSlashCommands:
    """Gateway slash commands dispatched through the full adapter pipeline."""

    @pytest.mark.asyncio
-    async def test_help_returns_command_list(self, adapter):
-        send = await send_and_capture(adapter, "/help")
+    async def test_help_returns_command_list(self, adapter, platform):
+        send = await send_and_capture(adapter, "/help", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
@@ -62,24 +32,23 @@ class TestTelegramSlashCommands:
        assert "/status" in response_text

    @pytest.mark.asyncio
-    async def test_status_shows_session_info(self, adapter):
-        send = await send_and_capture(adapter, "/status")
+    async def test_status_shows_session_info(self, adapter, platform):
+        send = await send_and_capture(adapter, "/status", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
-        # Status output includes session metadata
        assert "session" in response_text.lower() or "Session" in response_text

    @pytest.mark.asyncio
-    async def test_new_resets_session(self, adapter, runner):
-        send = await send_and_capture(adapter, "/new")
+    async def test_new_resets_session(self, adapter, runner, platform):
+        send = await send_and_capture(adapter, "/new", platform)

        send.assert_called_once()
        runner.session_store.reset_session.assert_called_once()

    @pytest.mark.asyncio
-    async def test_stop_when_no_agent_running(self, adapter):
-        send = await send_and_capture(adapter, "/stop")
+    async def test_stop_when_no_agent_running(self, adapter, platform):
+        send = await send_and_capture(adapter, "/stop", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
@@ -87,8 +56,8 @@ class TestTelegramSlashCommands:
        assert "no" in response_lower or "stop" in response_lower or "not running" in response_lower

    @pytest.mark.asyncio
-    async def test_commands_shows_listing(self, adapter):
-        send = await send_and_capture(adapter, "/commands")
+    async def test_commands_shows_listing(self, adapter, platform):
+        send = await send_and_capture(adapter, "/commands", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
@@ -96,25 +65,25 @@ class TestTelegramSlashCommands:
        assert "/" in response_text

    @pytest.mark.asyncio
-    async def test_sequential_commands_share_session(self, adapter):
+    async def test_sequential_commands_share_session(self, adapter, platform):
        """Two commands from the same chat_id should both succeed."""
-        send_help = await send_and_capture(adapter, "/help")
+        send_help = await send_and_capture(adapter, "/help", platform)
        send_help.assert_called_once()

-        send_status = await send_and_capture(adapter, "/status")
+        send_status = await send_and_capture(adapter, "/status", platform)
        send_status.assert_called_once()

    @pytest.mark.asyncio
-    async def test_provider_shows_current_provider(self, adapter):
-        send = await send_and_capture(adapter, "/provider")
+    async def test_provider_shows_current_provider(self, adapter, platform):
+        send = await send_and_capture(adapter, "/provider", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
        assert "provider" in response_text.lower()

    @pytest.mark.asyncio
-    async def test_verbose_responds(self, adapter):
-        send = await send_and_capture(adapter, "/verbose")
+    async def test_verbose_responds(self, adapter, platform):
+        send = await send_and_capture(adapter, "/verbose", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
@@ -122,42 +91,50 @@ class TestTelegramSlashCommands:
        assert "verbose" in response_text.lower() or "tool_progress" in response_text

    @pytest.mark.asyncio
-    async def test_personality_lists_options(self, adapter):
-        send = await send_and_capture(adapter, "/personality")
+    async def test_personality_lists_options(self, adapter, platform):
+        send = await send_and_capture(adapter, "/personality", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
        assert "personalit" in response_text.lower()  # matches "personality" or "personalities"

    @pytest.mark.asyncio
-    async def test_yolo_toggles_mode(self, adapter):
-        send = await send_and_capture(adapter, "/yolo")
+    async def test_yolo_toggles_mode(self, adapter, platform):
+        send = await send_and_capture(adapter, "/yolo", platform)

        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
        assert "yolo" in response_text.lower()

+    @pytest.mark.asyncio
+    async def test_compress_command(self, adapter, platform):
+        send = await send_and_capture(adapter, "/compress", platform)
+
+        send.assert_called_once()
+        response_text = send.call_args[1].get("content") or send.call_args[0][1]
+        assert "compress" in response_text.lower() or "context" in response_text.lower()
+

 class TestSessionLifecycle:
    """Verify session state changes across command sequences."""

    @pytest.mark.asyncio
-    async def test_new_then_status_reflects_reset(self, adapter, runner, session_entry):
+    async def test_new_then_status_reflects_reset(self, adapter, runner, session_entry, platform):
        """After /new, /status should report the fresh session."""
-        await send_and_capture(adapter, "/new")
+        await send_and_capture(adapter, "/new", platform)
        runner.session_store.reset_session.assert_called_once()

-        send = await send_and_capture(adapter, "/status")
+        send = await send_and_capture(adapter, "/status", platform)
        send.assert_called_once()
        response_text = send.call_args[1].get("content") or send.call_args[0][1]
        # Session ID from the entry should appear in the status output
        assert session_entry.session_id[:8] in response_text

    @pytest.mark.asyncio
-    async def test_new_is_idempotent(self, adapter, runner):
+    async def test_new_is_idempotent(self, adapter, runner, platform):
        """/new called twice should not crash."""
-        await send_and_capture(adapter, "/new")
-        await send_and_capture(adapter, "/new")
+        await send_and_capture(adapter, "/new", platform)
+        await send_and_capture(adapter, "/new", platform)
        assert runner.session_store.reset_session.call_count == 2


@@ -165,11 +142,11 @@ class TestAuthorization:
    """Verify the pipeline handles unauthorized users."""

    @pytest.mark.asyncio
-    async def test_unauthorized_user_gets_pairing_response(self, adapter, runner):
+    async def test_unauthorized_user_gets_pairing_response(self, adapter, runner, platform):
        """Unauthorized DM should trigger pairing code, not a command response."""
        runner._is_user_authorized = lambda _source: False

-        event = make_event("/help")
+        event = make_event(platform, "/help")
        adapter.send.reset_mock()
        await adapter.handle_message(event)
        await asyncio.sleep(0.3)
@@ -181,11 +158,11 @@ class TestAuthorization:
        assert "recognize" in response_text.lower() or "pair" in response_text.lower() or "ABC123" in response_text

    @pytest.mark.asyncio
-    async def test_unauthorized_user_does_not_get_help(self, adapter, runner):
+    async def test_unauthorized_user_does_not_get_help(self, adapter, runner, platform):
        """Unauthorized user should NOT see the help command output."""
        runner._is_user_authorized = lambda _source: False

-        event = make_event("/help")
+        event = make_event(platform, "/help")
        adapter.send.reset_mock()
        await adapter.handle_message(event)
        await asyncio.sleep(0.3)
@@ -200,12 +177,12 @@ class TestSendFailureResilience:
    """Verify the pipeline handles send failures gracefully."""

    @pytest.mark.asyncio
-    async def test_send_failure_does_not_crash_pipeline(self, adapter):
+    async def test_send_failure_does_not_crash_pipeline(self, adapter, platform):
        """If send() returns failure, the pipeline should not raise."""
        adapter.send = AsyncMock(return_value=SendResult(success=False, error="network timeout"))
         adapter.set_message_handler(adapter._message_handler)  # re-wire with same handler

-        event = make_event("/help")
+        event = make_event(platform, "/help")
        # Should not raise — pipeline handles send failures internally
        await adapter.handle_message(event)
        await asyncio.sleep(0.3)
@@ -0,0 +1,132 @@
+"""Tests for the API server bind-address startup guard.
+
+Validates that is_network_accessible() correctly classifies addresses and
+that connect() refuses to start on non-loopback without API_SERVER_KEY.
+"""
+
+import socket
+from unittest.mock import AsyncMock, patch
+
+import pytest
+
+from gateway.config import PlatformConfig
+from gateway.platforms.api_server import APIServerAdapter
+from gateway.platforms.base import is_network_accessible
+
+
+# ---------------------------------------------------------------------------
+# Unit tests: is_network_accessible()
+# ---------------------------------------------------------------------------
+
+
+class TestIsNetworkAccessible:
+    """Direct tests for the address classification helper."""
+
+    # -- Loopback (safe, should return False) --
+
+    def test_ipv4_loopback(self):
+        assert is_network_accessible("127.0.0.1") is False
+
+    def test_ipv6_loopback(self):
+        assert is_network_accessible("::1") is False
+
+    def test_ipv4_mapped_loopback(self):
+        # ::ffff:127.0.0.1 — Python's is_loopback returns False for mapped
+        # addresses; the helper must unwrap and check ipv4_mapped.
+        assert is_network_accessible("::ffff:127.0.0.1") is False
+
+    # -- Network-accessible (should return True) --
+
+    def test_ipv4_wildcard(self):
+        assert is_network_accessible("0.0.0.0") is True
+
+    def test_ipv6_wildcard(self):
+        # This is the bypass vector that the string-based check missed.
+        assert is_network_accessible("::") is True
+
+    def test_ipv4_mapped_unspecified(self):
+        assert is_network_accessible("::ffff:0.0.0.0") is True
+
+    def test_private_ipv4(self):
+        assert is_network_accessible("10.0.0.1") is True
+
+    def test_private_ipv4_class_c(self):
+        assert is_network_accessible("192.168.1.1") is True
+
+    def test_public_ipv4(self):
+        assert is_network_accessible("8.8.8.8") is True
+
+    # -- Hostname resolution --
+
+    def test_localhost_resolves_to_loopback(self):
+        loopback_result = [
+            (socket.AF_INET, socket.SOCK_STREAM, 0, "", ("127.0.0.1", 0)),
+        ]
+        with patch("gateway.platforms.base._socket.getaddrinfo", return_value=loopback_result):
+            assert is_network_accessible("localhost") is False
+
+    def test_hostname_resolving_to_non_loopback(self):
+        non_loopback_result = [
+            (socket.AF_INET, socket.SOCK_STREAM, 0, "", ("10.0.0.1", 0)),
+        ]
+        with patch("gateway.platforms.base._socket.getaddrinfo", return_value=non_loopback_result):
+            assert is_network_accessible("my-server.local") is True
+
+    def test_hostname_mixed_resolution(self):
+        """If a hostname resolves to both loopback and non-loopback, it's
+        network-accessible (any non-loopback address is enough)."""
+        mixed_result = [
+            (socket.AF_INET, socket.SOCK_STREAM, 0, "", ("127.0.0.1", 0)),
+            (socket.AF_INET, socket.SOCK_STREAM, 0, "", ("10.0.0.1", 0)),
+        ]
+        with patch("gateway.platforms.base._socket.getaddrinfo", return_value=mixed_result):
+            assert is_network_accessible("dual-host.local") is True
+
+    def test_dns_failure_fails_closed(self):
+        """Unresolvable hostnames should require an API key (fail closed)."""
+        with patch(
+            "gateway.platforms.base._socket.getaddrinfo",
+            side_effect=socket.gaierror("Name resolution failed"),
+        ):
+            assert is_network_accessible("nonexistent.invalid") is True
+
+
+# ---------------------------------------------------------------------------
+# Integration tests: connect() startup guard
+# ---------------------------------------------------------------------------
+
+
+class TestConnectBindGuard:
+    """Verify that connect() refuses dangerous configurations."""
+
+    @pytest.mark.asyncio
+    async def test_refuses_ipv4_wildcard_without_key(self):
+        adapter = APIServerAdapter(PlatformConfig(enabled=True, extra={"host": "0.0.0.0"}))
+        result = await adapter.connect()
+        assert result is False
+
+    @pytest.mark.asyncio
+    async def test_refuses_ipv6_wildcard_without_key(self):
+        adapter = APIServerAdapter(PlatformConfig(enabled=True, extra={"host": "::"}))
+        result = await adapter.connect()
+        assert result is False
+
+    def test_allows_loopback_without_key(self):
+        """Loopback with no key should pass the guard."""
+        adapter = APIServerAdapter(PlatformConfig(enabled=True, extra={"host": "127.0.0.1"}))
+        assert adapter._api_key == ""
+        # The guard condition: is_network_accessible(host) AND NOT api_key
+        # For loopback, is_network_accessible is False so the guard does not block.
+        assert is_network_accessible(adapter._host) is False
+
+    @pytest.mark.asyncio
+    async def test_allows_wildcard_with_key(self):
+        """Non-loopback with a key should pass the guard."""
+        adapter = APIServerAdapter(
+            PlatformConfig(enabled=True, extra={"host": "0.0.0.0", "key": "sk-test"})
+        )
+        # The guard checks: is_network_accessible(host) AND NOT api_key
+        # With a key set, the guard should not block.
+        assert adapter._api_key == "sk-test"
+        assert is_network_accessible("0.0.0.0") is True
+        # Combined: the guard condition is False (key is set), so it passes
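Reviewer note: the tests above pin down the expected behaviour of the address classifier without showing it. The following is a minimal sketch of a helper that would satisfy them, assuming it is built on the standard `ipaddress` and `socket` modules. The names `is_network_accessible_sketch` and `_is_loopback` are hypothetical and are not taken from `gateway.platforms.base`; the real implementation may differ.

```python
import ipaddress
import socket


def _is_loopback(addr) -> bool:
    """Loopback check that also unwraps IPv4-mapped IPv6 addresses."""
    # IPv6Address exposes ipv4_mapped (None when not mapped); IPv4Address
    # has no such attribute, hence the getattr default.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_loopback


def is_network_accessible_sketch(host: str) -> bool:
    """Return True unless every address `host` maps to is loopback."""
    try:
        literal = ipaddress.ip_address(host)
    except ValueError:
        literal = None  # not an IP literal, treat as a hostname

    if literal is not None:
        return not _is_loopback(literal)

    try:
        infos = socket.getaddrinfo(host, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return True  # fail closed: unresolvable hosts require a key

    if not infos:
        return True

    for info in infos:
        try:
            resolved = ipaddress.ip_address(info[4][0])
        except ValueError:
            return True  # unparseable result, fail closed
        if not _is_loopback(resolved):
            return True  # any non-loopback address is enough
    return False
```

Parsing with `ipaddress` rather than comparing strings is what lets `::` and `::ffff:0.0.0.0` be classified the same way as `0.0.0.0`.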
@@ -436,6 +436,95 @@ class TestThreadPersistence:
        assert len(data) == 5


+# ---------------------------------------------------------------------------
+# DM mention-thread feature
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_dm_mention_thread_disabled_by_default(monkeypatch):
+    """Default (dm_mention_threads=false): DM with mention should NOT create a thread."""
+    monkeypatch.delenv("MATRIX_DM_MENTION_THREADS", raising=False)
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    room = _make_room(member_count=2)
+    event = _make_event("@hermes:example.org help me", event_id="$dm1")
+
+    await adapter._on_room_message(room, event)
+    adapter.handle_message.assert_awaited_once()
+    msg = adapter.handle_message.await_args.args[0]
+    assert msg.source.thread_id is None
+
+
+@pytest.mark.asyncio
+async def test_dm_mention_thread_creates_thread(monkeypatch):
+    """MATRIX_DM_MENTION_THREADS=true: DM with @mention creates a thread."""
+    monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", "true")
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    room = _make_room(member_count=2)
+    event = _make_event("@hermes:example.org help me", event_id="$dm1")
+
+    with patch.object(adapter, "_save_participated_threads"):
+        await adapter._on_room_message(room, event)
+
+    adapter.handle_message.assert_awaited_once()
+    msg = adapter.handle_message.await_args.args[0]
+    assert msg.source.thread_id == "$dm1"
+    assert msg.text == "help me"
+
+
+@pytest.mark.asyncio
+async def test_dm_mention_thread_no_mention_no_thread(monkeypatch):
+    """MATRIX_DM_MENTION_THREADS=true: DM without mention does NOT create a thread."""
+    monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", "true")
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    room = _make_room(member_count=2)
+    event = _make_event("hello without mention", event_id="$dm1")
+
+    await adapter._on_room_message(room, event)
+    adapter.handle_message.assert_awaited_once()
+    msg = adapter.handle_message.await_args.args[0]
+    assert msg.source.thread_id is None
+
+
+@pytest.mark.asyncio
+async def test_dm_mention_thread_preserves_existing_thread(monkeypatch):
+    """MATRIX_DM_MENTION_THREADS=true: DM already in a thread keeps that thread_id."""
+    monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", "true")
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    adapter._bot_participated_threads.add("$existing_thread")
+    room = _make_room(member_count=2)
+    event = _make_event("@hermes:example.org help me", thread_id="$existing_thread")
+
+    await adapter._on_room_message(room, event)
+    adapter.handle_message.assert_awaited_once()
+    msg = adapter.handle_message.await_args.args[0]
+    assert msg.source.thread_id == "$existing_thread"
+
+
+@pytest.mark.asyncio
+async def test_dm_mention_thread_tracks_participation(monkeypatch):
+    """DM mention-thread tracks the thread in _bot_participated_threads."""
+    monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", "true")
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    room = _make_room(member_count=2)
+    event = _make_event("@hermes:example.org help", event_id="$dm1")
+
+    with patch.object(adapter, "_save_participated_threads"):
+        await adapter._on_room_message(room, event)
+
+    assert "$dm1" in adapter._bot_participated_threads
+
+
 # ---------------------------------------------------------------------------
 # YAML config bridge
 # ---------------------------------------------------------------------------
@@ -480,6 +569,25 @@ class TestMatrixConfigBridge:
        assert os.getenv("MATRIX_FREE_RESPONSE_ROOMS") == "!room1:example.org,!room2:example.org"
        assert os.getenv("MATRIX_AUTO_THREAD") == "false"

+    def test_yaml_bridge_sets_dm_mention_threads(self, monkeypatch, tmp_path):
+        """Matrix YAML dm_mention_threads should bridge to env var."""
+        monkeypatch.delenv("MATRIX_DM_MENTION_THREADS", raising=False)
+
+        import os
+        import yaml
+
+        yaml_content = {"matrix": {"dm_mention_threads": True}}
+        config_file = tmp_path / "config.yaml"
+        config_file.write_text(yaml.dump(yaml_content))
+
+        yaml_cfg = yaml.safe_load(config_file.read_text())
+        matrix_cfg = yaml_cfg.get("matrix", {})
+        if isinstance(matrix_cfg, dict):
+            if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
+                monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", str(matrix_cfg["dm_mention_threads"]).lower())
+
+        assert os.getenv("MATRIX_DM_MENTION_THREADS") == "true"
+
    def test_env_vars_take_precedence_over_yaml(self, monkeypatch):
        """Env vars should not be overwritten by YAML values."""
        monkeypatch.setenv("MATRIX_REQUIRE_MENTION", "true")
@@ -3,9 +3,15 @@ import os
 from gateway.config import Platform
 from gateway.run import GatewayRunner
 from gateway.session import SessionContext, SessionSource
+from gateway.session_context import (
+    get_session_env,
+    set_session_vars,
+    clear_session_vars,
+)


-def test_set_session_env_includes_thread_id(monkeypatch):
+def test_set_session_env_sets_contextvars(monkeypatch):
+    """_set_session_env should populate contextvars, not os.environ."""
    runner = object.__new__(GatewayRunner)
    source = SessionSource(
        platform=Platform.TELEGRAM,
@@ -21,25 +27,93 @@ def test_set_session_env_includes_thread_id(monkeypatch):
    monkeypatch.delenv("HERMES_SESSION_CHAT_NAME", raising=False)
    monkeypatch.delenv("HERMES_SESSION_THREAD_ID", raising=False)

-    runner._set_session_env(context)
+    tokens = runner._set_session_env(context)

-    assert os.getenv("HERMES_SESSION_PLATFORM") == "telegram"
-    assert os.getenv("HERMES_SESSION_CHAT_ID") == "-1001"
-    assert os.getenv("HERMES_SESSION_CHAT_NAME") == "Group"
-    assert os.getenv("HERMES_SESSION_THREAD_ID") == "17585"
+    # Values should be readable via get_session_env (contextvar path)
+    assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
+    assert get_session_env("HERMES_SESSION_CHAT_ID") == "-1001"
+    assert get_session_env("HERMES_SESSION_CHAT_NAME") == "Group"
+    assert get_session_env("HERMES_SESSION_THREAD_ID") == "17585"
+
+    # os.environ should NOT be touched
+    assert os.getenv("HERMES_SESSION_PLATFORM") is None
+    assert os.getenv("HERMES_SESSION_THREAD_ID") is None
+
+    # Clean up
+    runner._clear_session_env(tokens)


-def test_clear_session_env_removes_thread_id(monkeypatch):
+def test_clear_session_env_restores_previous_state(monkeypatch):
+    """_clear_session_env should restore contextvars to their pre-handler values."""
    runner = object.__new__(GatewayRunner)

-    monkeypatch.setenv("HERMES_SESSION_PLATFORM", "telegram")
-    monkeypatch.setenv("HERMES_SESSION_CHAT_ID", "-1001")
-    monkeypatch.setenv("HERMES_SESSION_CHAT_NAME", "Group")
-    monkeypatch.setenv("HERMES_SESSION_THREAD_ID", "17585")
+    monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_CHAT_ID", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_CHAT_NAME", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_THREAD_ID", raising=False)

-    runner._clear_session_env()
+    source = SessionSource(
+        platform=Platform.TELEGRAM,
+        chat_id="-1001",
+        chat_name="Group",
+        chat_type="group",
+        thread_id="17585",
+    )
+    context = SessionContext(source=source, connected_platforms=[], home_channels={})

-    assert os.getenv("HERMES_SESSION_PLATFORM") is None
-    assert os.getenv("HERMES_SESSION_CHAT_ID") is None
-    assert os.getenv("HERMES_SESSION_CHAT_NAME") is None
-    assert os.getenv("HERMES_SESSION_THREAD_ID") is None
+    tokens = runner._set_session_env(context)
+    assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
+
+    runner._clear_session_env(tokens)
+
+    # After clear, contextvars should return to defaults (empty)
+    assert get_session_env("HERMES_SESSION_PLATFORM") == ""
+    assert get_session_env("HERMES_SESSION_CHAT_ID") == ""
+    assert get_session_env("HERMES_SESSION_CHAT_NAME") == ""
+    assert get_session_env("HERMES_SESSION_THREAD_ID") == ""
+
+
+def test_get_session_env_falls_back_to_os_environ(monkeypatch):
+    """get_session_env should fall back to os.environ when contextvar is unset."""
+    monkeypatch.setenv("HERMES_SESSION_PLATFORM", "discord")
+
+    # No contextvar set — should read from os.environ
+    assert get_session_env("HERMES_SESSION_PLATFORM") == "discord"
+
+    # Now set a contextvar — should prefer it
+    tokens = set_session_vars(platform="telegram")
+    assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
+
+    # Restore — should fall back to os.environ again
+    clear_session_vars(tokens)
+    assert get_session_env("HERMES_SESSION_PLATFORM") == "discord"
+
+
+def test_get_session_env_default_when_nothing_set(monkeypatch):
+    """get_session_env returns default when neither contextvar nor env is set."""
+    monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
+
+    assert get_session_env("HERMES_SESSION_PLATFORM") == ""
+    assert get_session_env("HERMES_SESSION_PLATFORM", "fallback") == "fallback"
+
+
+def test_set_session_env_handles_missing_optional_fields():
+    """_set_session_env should handle None chat_name and thread_id gracefully."""
+    runner = object.__new__(GatewayRunner)
+    source = SessionSource(
+        platform=Platform.TELEGRAM,
+        chat_id="-1001",
+        chat_name=None,
+        chat_type="private",
+        thread_id=None,
+    )
+    context = SessionContext(source=source, connected_platforms=[], home_channels={})
+
+    tokens = runner._set_session_env(context)
+
+    assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
+    assert get_session_env("HERMES_SESSION_CHAT_ID") == "-1001"
+    assert get_session_env("HERMES_SESSION_CHAT_NAME") == ""
+    assert get_session_env("HERMES_SESSION_THREAD_ID") == ""
+
+    runner._clear_session_env(tokens)
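Reviewer note: the tests above describe a lookup that prefers a contextvar and falls back to `os.environ`, with token-based restore. A minimal sketch of that pattern follows, assuming one `ContextVar` per variable name and a private sentinel for "unset". The names `set_vars_sketch`, `clear_vars_sketch` and `get_env_sketch` are hypothetical and do not mirror the signatures in `gateway.session_context`; how the real module maps `None` fields to empty strings is not shown here.

```python
import os
from contextvars import ContextVar, Token
from typing import Dict

_UNSET = object()  # sentinel: distinguishes "never set" from an empty string

_VARS: Dict[str, ContextVar] = {
    name: ContextVar(name, default=_UNSET)
    for name in (
        "HERMES_SESSION_PLATFORM",
        "HERMES_SESSION_CHAT_ID",
        "HERMES_SESSION_CHAT_NAME",
        "HERMES_SESSION_THREAD_ID",
    )
}


def set_vars_sketch(**values: str) -> Dict[str, Token]:
    """Set the given variables and return tokens for a later restore."""
    return {name: _VARS[name].set(value) for name, value in values.items()}


def clear_vars_sketch(tokens: Dict[str, Token]) -> None:
    """Restore each variable to the value it had before set_vars_sketch."""
    for name, token in tokens.items():
        _VARS[name].reset(token)


def get_env_sketch(name: str, default: str = "") -> str:
    """Prefer the contextvar; fall back to os.environ, then to default."""
    var = _VARS.get(name)
    if var is not None:
        value = var.get()
        if value is not _UNSET:
            return value
    return os.environ.get(name, default)
```

Because each asyncio task runs in its own copy of the context, values set this way stay local to one handler, whereas writes to `os.environ` are process-wide.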
@@ -0,0 +1,275 @@
+"""Tests for container-aware CLI routing (NixOS container mode).
+
+When container.enable = true in the NixOS module, the activation script
+writes a .container-mode metadata file. The host CLI detects this and
+execs into the container instead of running locally.
+"""
+import os
+from pathlib import Path
+from types import SimpleNamespace
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from hermes_cli.config import (
+    _is_inside_container,
+    get_container_exec_info,
+)
+
+
+# =============================================================================
+# _is_inside_container
+# =============================================================================
+
+
+def test_is_inside_container_dockerenv(tmp_path):
+    """Detects /.dockerenv marker file."""
+    with patch("os.path.exists") as mock_exists:
+        mock_exists.side_effect = lambda p: p == "/.dockerenv"
+        assert _is_inside_container() is True
+
+
+def test_is_inside_container_containerenv(tmp_path):
+    """Detects Podman's /run/.containerenv marker."""
+    with patch("os.path.exists") as mock_exists:
+        mock_exists.side_effect = lambda p: p == "/run/.containerenv"
+        assert _is_inside_container() is True
+
+
+def test_is_inside_container_cgroup_docker():
+    """Detects 'docker' in /proc/1/cgroup."""
+    with patch("os.path.exists", return_value=False), \
+         patch("builtins.open", create=True) as mock_open:
+        mock_open.return_value.__enter__ = lambda s: s
+        mock_open.return_value.__exit__ = MagicMock(return_value=False)
+        mock_open.return_value.read = MagicMock(
+            return_value="12:memory:/docker/abc123\n"
+        )
+        assert _is_inside_container() is True
+
+
+def test_is_inside_container_false_on_host():
+    """Returns False when none of the container indicators are present."""
+    with patch("os.path.exists", return_value=False), \
+         patch("builtins.open", side_effect=OSError("no such file")):
+        assert _is_inside_container() is False
+
+
+# =============================================================================
+# get_container_exec_info
+# =============================================================================
+
+
+@pytest.fixture
+def container_env(tmp_path, monkeypatch):
+    """Set up a fake HERMES_HOME with .container-mode file."""
+    hermes_home = tmp_path / ".hermes"
+    hermes_home.mkdir()
+    monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+
+    container_mode = hermes_home / ".container-mode"
+    container_mode.write_text(
+        "# Written by NixOS activation script. Do not edit manually.\n"
+        "backend=podman\n"
+        "container_name=hermes-agent\n"
+        "hermes_bin=/data/current-package/bin/hermes\n"
+    )
+    return hermes_home
+
+
+def test_get_container_exec_info_returns_metadata(container_env):
+    """Reads .container-mode and returns backend/name/bin."""
+    with patch("hermes_cli.config._is_inside_container", return_value=False):
+        info = get_container_exec_info()
+
+    assert info is not None
+    assert info["backend"] == "podman"
+    assert info["container_name"] == "hermes-agent"
+    assert info["hermes_bin"] == "/data/current-package/bin/hermes"
+
+
+def test_get_container_exec_info_none_inside_container(container_env):
+    """Returns None when we're already inside a container."""
+    with patch("hermes_cli.config._is_inside_container", return_value=True):
+        info = get_container_exec_info()
+
+    assert info is None
+
+
+def test_get_container_exec_info_none_without_file(tmp_path, monkeypatch):
+    """Returns None when .container-mode doesn't exist (native mode)."""
+    hermes_home = tmp_path / ".hermes"
+    hermes_home.mkdir()
+    monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+
+    with patch("hermes_cli.config._is_inside_container", return_value=False):
+        info = get_container_exec_info()
+
+    assert info is None
+
+
+def test_get_container_exec_info_defaults():
+    """Falls back to defaults for missing keys."""
+    import tempfile
+
+    with tempfile.TemporaryDirectory() as tmpdir:
+        hermes_home = Path(tmpdir) / ".hermes"
+        hermes_home.mkdir()
+        (hermes_home / ".container-mode").write_text(
+            "# minimal file with no keys\n"
+        )
+
+        with patch("hermes_cli.config._is_inside_container", return_value=False), \
+             patch("hermes_cli.config.get_hermes_home", return_value=hermes_home):
+            info = get_container_exec_info()
+
+        assert info is not None
+        assert info["backend"] == "docker"
+        assert info["container_name"] == "hermes-agent"
+        assert info["hermes_bin"] == "/data/current-package/bin/hermes"
+
+
+def test_get_container_exec_info_docker_backend(container_env):
+    """Correctly reads docker backend."""
+    (container_env / ".container-mode").write_text(
+        "backend=docker\n"
+        "container_name=hermes-custom\n"
+        "hermes_bin=/opt/hermes/bin/hermes\n"
+    )
+
+    with patch("hermes_cli.config._is_inside_container", return_value=False):
+        info = get_container_exec_info()
+
+    assert info["backend"] == "docker"
+    assert info["container_name"] == "hermes-custom"
+    assert info["hermes_bin"] == "/opt/hermes/bin/hermes"
+
+
+# =============================================================================
+# _exec_in_container
+# =============================================================================
+
+
+def test_exec_in_container_calls_execvp():
+    """Verifies os.execvp is called with the correct command."""
+    from hermes_cli.main import _exec_in_container
+
+    container_info = {
+        "backend": "podman",
+        "container_name": "hermes-agent",
+        "hermes_bin": "/data/current-package/bin/hermes",
+    }
+
+    with patch("shutil.which", return_value="/usr/bin/podman"), \
+         patch("subprocess.run") as mock_run, \
+         patch("os.execvp") as mock_exec:
+        # Simulate running container
+        mock_result = MagicMock()
+        mock_result.returncode = 0
+        mock_result.stdout = "true\n"
+        mock_run.return_value = mock_result
+
+        _exec_in_container(container_info, ["chat", "-m", "claude-sonnet-4"])
+
+        mock_exec.assert_called_once_with(
+            "/usr/bin/podman",
+            ["/usr/bin/podman", "exec", "-it", "hermes-agent",
+             "/data/current-package/bin/hermes", "chat", "-m", "claude-sonnet-4"]
+        )
+
+
+def test_exec_in_container_strips_host_flag():
+    """The --host flag is not forwarded into the container."""
+    from hermes_cli.main import _exec_in_container
+
+    container_info = {
+        "backend": "podman",
+        "container_name": "hermes-agent",
+        "hermes_bin": "/data/current-package/bin/hermes",
+    }
+
+    with patch("shutil.which", return_value="/usr/bin/podman"), \
+         patch("subprocess.run") as mock_run, \
+         patch("os.execvp") as mock_exec:
+        mock_result = MagicMock()
+        mock_result.returncode = 0
+        mock_result.stdout = "true\n"
+        mock_run.return_value = mock_result
+
+        _exec_in_container(container_info, ["chat", "--host", "-q", "hello"])
+
+        # --host should be stripped
+        exec_args = mock_exec.call_args[0][1]
+        assert "--host" not in exec_args
+        assert "-q" in exec_args
+        assert "hello" in exec_args
+
+
+def test_exec_in_container_fallback_no_runtime(capsys):
+    """Falls back gracefully when container runtime is not found."""
+    from hermes_cli.main import _exec_in_container
+
+    container_info = {
+        "backend": "podman",
+        "container_name": "hermes-agent",
+        "hermes_bin": "/data/current-package/bin/hermes",
+    }
+
+    with patch("shutil.which", return_value=None), \
+         patch("os.execvp") as mock_exec:
+        _exec_in_container(container_info, ["chat"])
+
+        # Should NOT call execvp — graceful fallback
+        mock_exec.assert_not_called()
+
+    captured = capsys.readouterr()
+    assert "not found on PATH" in captured.err
+
+
+def test_exec_in_container_fallback_container_not_running(capsys):
+    """Falls back when container exists but is not running."""
+    from hermes_cli.main import _exec_in_container
+
+    container_info = {
+        "backend": "docker",
+        "container_name": "hermes-agent",
+        "hermes_bin": "/data/current-package/bin/hermes",
+    }
+
+    with patch("shutil.which", return_value="/usr/bin/docker"), \
+         patch("subprocess.run") as mock_run, \
+         patch("os.execvp") as mock_exec:
+        mock_result = MagicMock()
+        mock_result.returncode = 0
+        mock_result.stdout = "false\n"
+        mock_run.return_value = mock_result
+
+        _exec_in_container(container_info, ["chat"])
+
+        mock_exec.assert_not_called()
+
+    captured = capsys.readouterr()
+    assert "not running" in captured.err
+
+
+def test_exec_in_container_fallback_inspect_fails():
+    """Falls back when docker inspect fails entirely."""
+    from hermes_cli.main import _exec_in_container
+
+    container_info = {
+        "backend": "docker",
+        "container_name": "hermes-agent",
+        "hermes_bin": "/data/current-package/bin/hermes",
+    }
+
+    with patch("shutil.which", return_value="/usr/bin/docker"), \
+         patch("subprocess.run") as mock_run, \
+         patch("os.execvp") as mock_exec:
+        mock_result = MagicMock()
+        mock_result.returncode = 1
+        mock_result.stdout = ""
+        mock_run.return_value = mock_result
+
+        _exec_in_container(container_info, ["chat"])
+
+        mock_exec.assert_not_called()
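Reviewer note: the tests above imply two small pieces of logic: parsing the `key=value` lines of `.container-mode` with defaults, and building the exec argv with `--host` removed. A minimal sketch of both follows. The function names are hypothetical, and the default values are taken from `test_get_container_exec_info_defaults`; the actual code in `hermes_cli/config.py` and `hermes_cli/main.py` may be structured differently.

```python
from typing import Dict, List

# Defaults as asserted in test_get_container_exec_info_defaults.
_DEFAULTS = {
    "backend": "docker",
    "container_name": "hermes-agent",
    "hermes_bin": "/data/current-package/bin/hermes",
}


def parse_container_mode_sketch(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping comments and unknown keys."""
    info = dict(_DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in info:
            info[key] = value.strip()
    return info


def build_exec_argv_sketch(
    runtime_path: str, info: Dict[str, str], cli_args: List[str]
) -> List[str]:
    """Build the argv for `<runtime> exec -it <container> <hermes> ...`."""
    # --host only makes sense on the host side, so it is not forwarded.
    forwarded = [arg for arg in cli_args if arg != "--host"]
    return [
        runtime_path,
        "exec",
        "-it",
        info["container_name"],
        info["hermes_bin"],
        *forwarded,
    ]
```

The resulting list is what would be handed to `os.execvp(runtime_path, argv)` once the container has been confirmed to be running.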
@@ -11,12 +11,19 @@ def _load_optional_dependencies():
    return project["optional-dependencies"]


-def test_matrix_extra_exists_but_excluded_from_all():
+def test_matrix_extra_linux_only_in_all():
    """matrix-nio[e2e] depends on python-olm which is upstream-broken on modern
    macOS (archived libolm, C++ errors with Clang 21+).  The [matrix] extra is
-    kept for opt-in install but deliberately excluded from [all] so one broken
-    upstream dep doesn't nuke every other extra during ``hermes update``."""
+    included in [all] but gated to Linux via a platform marker so that
+    ``hermes update`` doesn't fail on macOS."""
    optional_dependencies = _load_optional_dependencies()

    assert "matrix" in optional_dependencies
+    # Must NOT be unconditional — python-olm has no macOS wheels.
    assert "hermes-agent[matrix]" not in optional_dependencies["all"]
+    # Must be present with a Linux platform marker.
+    linux_gated = [
+        dep for dep in optional_dependencies["all"]
+        if "matrix" in dep and "linux" in dep
+    ]
+    assert linux_gated, "expected hermes-agent[matrix] with sys_platform=='linux' marker in [all]"
@@ -333,3 +333,25 @@ class TestShellFileOpsWriteDenied:
        result = file_ops.patch_replace("~/.ssh/authorized_keys", "old", "new")
        assert result.error is not None
        assert "denied" in result.error.lower()
+
+    def test_delete_file_denied_path(self, file_ops):
+        result = file_ops.delete_file("~/.ssh/authorized_keys")
+        assert result.error is not None
+        assert "denied" in result.error.lower()
+
+    def test_move_file_src_denied(self, file_ops):
+        result = file_ops.move_file("~/.ssh/id_rsa", "/tmp/dest.txt")
+        assert result.error is not None
+        assert "denied" in result.error.lower()
+
+    def test_move_file_dst_denied(self, file_ops):
+        result = file_ops.move_file("/tmp/src.txt", "~/.aws/credentials")
+        assert result.error is not None
+        assert "denied" in result.error.lower()
+
+    def test_move_file_failure_path(self, mock_env):
+        mock_env.execute.return_value = {"output": "No such file or directory", "returncode": 1}
+        ops = ShellFileOperations(mock_env)
+        result = ops.move_file("/tmp/nonexistent.txt", "/tmp/dest.txt")
+        assert result.error is not None
+        assert "Failed to move" in result.error
@@ -6,31 +6,31 @@ from tools.fuzzy_match import fuzzy_find_and_replace
 class TestExactMatch:
    def test_single_replacement(self):
        content = "hello world"
-        new, count, err = fuzzy_find_and_replace(content, "hello", "hi")
+        new, count, _, err = fuzzy_find_and_replace(content, "hello", "hi")
        assert err is None
        assert count == 1
        assert new == "hi world"

    def test_no_match(self):
        content = "hello world"
-        new, count, err = fuzzy_find_and_replace(content, "xyz", "abc")
+        new, count, _, err = fuzzy_find_and_replace(content, "xyz", "abc")
        assert count == 0
        assert err is not None
        assert new == content

    def test_empty_old_string(self):
-        new, count, err = fuzzy_find_and_replace("abc", "", "x")
+        new, count, _, err = fuzzy_find_and_replace("abc", "", "x")
        assert count == 0
        assert err is not None

    def test_identical_strings(self):
-        new, count, err = fuzzy_find_and_replace("abc", "abc", "abc")
+        new, count, _, err = fuzzy_find_and_replace("abc", "abc", "abc")
        assert count == 0
        assert "identical" in err

    def test_multiline_exact(self):
        content = "line1\nline2\nline3"
-        new, count, err = fuzzy_find_and_replace(content, "line1\nline2", "replaced")
+        new, count, _, err = fuzzy_find_and_replace(content, "line1\nline2", "replaced")
        assert err is None
        assert count == 1
        assert new == "replaced\nline3"
@@ -39,7 +39,7 @@ class TestExactMatch:
 class TestWhitespaceDifference:
    def test_extra_spaces_match(self):
        content = "def  foo(  x,  y  ):"
-        new, count, err = fuzzy_find_and_replace(content, "def foo( x, y ):", "def bar(x, y):")
+        new, count, _, err = fuzzy_find_and_replace(content, "def foo( x, y ):", "def bar(x, y):")
        assert count == 1
        assert "bar" in new

@@ -47,7 +47,7 @@ class TestWhitespaceDifference:
 class TestIndentDifference:
    def test_different_indentation(self):
        content = "    def foo():\n        pass"
-        new, count, err = fuzzy_find_and_replace(content, "def foo():\n    pass", "def bar():\n    return 1")
+        new, count, _, err = fuzzy_find_and_replace(content, "def foo():\n    pass", "def bar():\n    return 1")
        assert count == 1
        assert "bar" in new

@@ -55,13 +55,96 @@ class TestIndentDifference:
 class TestReplaceAll:
    def test_multiple_matches_without_flag_errors(self):
        content = "aaa bbb aaa"
-        new, count, err = fuzzy_find_and_replace(content, "aaa", "ccc", replace_all=False)
+        new, count, _, err = fuzzy_find_and_replace(content, "aaa", "ccc", replace_all=False)
        assert count == 0
        assert "Found 2 matches" in err

    def test_multiple_matches_with_flag(self):
        content = "aaa bbb aaa"
-        new, count, err = fuzzy_find_and_replace(content, "aaa", "ccc", replace_all=True)
+        new, count, _, err = fuzzy_find_and_replace(content, "aaa", "ccc", replace_all=True)
        assert err is None
        assert count == 2
        assert new == "ccc bbb ccc"
+
+
+class TestUnicodeNormalized:
+    """Tests for the unicode_normalized strategy (Bug 5)."""
+
+    def test_em_dash_matched(self):
+        """Em-dash in content should match ASCII '--' in pattern."""
+        content = "return value\u2014fallback"
+        new, count, strategy, err = fuzzy_find_and_replace(
+            content, "return value--fallback", "return value or fallback"
+        )
+        assert count == 1, f"Expected match via unicode_normalized, got err={err}"
+        assert strategy == "unicode_normalized"
+        assert "return value or fallback" in new
+
+    def test_smart_quotes_matched(self):
+        """Smart double quotes in content should match straight quotes in pattern."""
+        content = 'print(\u201chello\u201d)'
+        new, count, strategy, err = fuzzy_find_and_replace(
+            content, 'print("hello")', 'print("world")'
+        )
+        assert count == 1, f"Expected match via unicode_normalized, got err={err}"
+        assert "world" in new
+
+    def test_no_unicode_skips_strategy(self):
+        """When content and pattern have no Unicode variants, strategy is skipped."""
+        content = "hello world"
+        # Should match via exact, not unicode_normalized
+        new, count, strategy, err = fuzzy_find_and_replace(content, "hello", "hi")
+        assert count == 1
+        assert strategy == "exact"
+
+
+class TestBlockAnchorThreshold:
+    """Tests for the raised block_anchor threshold (Bug 4)."""
+
+    def test_high_similarity_matches(self):
+        """A block with >50% middle similarity should match."""
+        content = "def foo():\n    x = 1\n    y = 2\n    return x + y\n"
+        pattern = "def foo():\n    x = 1\n    y = 9\n    return x + y"
+        new, count, strategy, err = fuzzy_find_and_replace(content, pattern, "def foo():\n    return 0\n")
+        # Should match via block_anchor or earlier strategy
+        assert count == 1
+
+    def test_completely_different_middle_does_not_match(self):
+        """A block where only first+last lines match but middle is completely different
+        should NOT match under the raised 0.50 threshold."""
+        content = (
+            "class Foo:\n"
+            "    completely = 'unrelated'\n"
+            "    content = 'here'\n"
+            "    nothing = 'in common'\n"
+            "    pass\n"
+        )
+        # Pattern has same first/last lines but completely different middle
+        pattern = (
+            "class Foo:\n"
+            "    x = 1\n"
+            "    y = 2\n"
+            "    z = 3\n"
+            "    pass"
+        )
+        new, count, strategy, err = fuzzy_find_and_replace(content, pattern, "replaced")
+        # With threshold=0.50, this near-zero-similarity middle should not match
+        assert count == 0, (
+            f"Block with unrelated middle should not match under threshold=0.50, "
+            f"but matched via strategy={strategy}"
+        )
+
+
+class TestStrategyNameSurfaced:
+    """Tests for the strategy name in the 4-tuple return (Bug 6)."""
+
+    def test_exact_strategy_name(self):
+        new, count, strategy, err = fuzzy_find_and_replace("hello", "hello", "world")
+        assert strategy == "exact"
+        assert count == 1
+
+    def test_failed_match_returns_none_strategy(self):
+        new, count, strategy, err = fuzzy_find_and_replace("hello", "xyz", "world")
+        assert count == 0
+        assert strategy is None
+        assert err is not None
@@ -104,6 +104,45 @@ class TestStdioPidTracking:
        with _lock:
            assert fake_pid not in _stdio_pids

+    def test_kill_orphaned_uses_sigkill_when_available(self, monkeypatch):
+        """Unix-like platforms should keep using SIGKILL for orphan cleanup."""
+        from tools.mcp_tool import _kill_orphaned_mcp_children, _stdio_pids, _lock
+
+        fake_pid = 424242
+        with _lock:
+            _stdio_pids.clear()
+            _stdio_pids.add(fake_pid)
+
+        fake_sigkill = 9
+        monkeypatch.setattr(signal, "SIGKILL", fake_sigkill, raising=False)
+
+        with patch("tools.mcp_tool.os.kill") as mock_kill:
+            _kill_orphaned_mcp_children()
+
+        mock_kill.assert_called_once_with(fake_pid, fake_sigkill)
+
+        with _lock:
+            assert fake_pid not in _stdio_pids
+
+    def test_kill_orphaned_falls_back_without_sigkill(self, monkeypatch):
+        """Windows-like signal modules without SIGKILL should fall back to SIGTERM."""
+        from tools.mcp_tool import _kill_orphaned_mcp_children, _stdio_pids, _lock
+
+        fake_pid = 434343
+        with _lock:
+            _stdio_pids.clear()
+            _stdio_pids.add(fake_pid)
+
+        monkeypatch.delattr(signal, "SIGKILL", raising=False)
+
+        with patch("tools.mcp_tool.os.kill") as mock_kill:
+            _kill_orphaned_mcp_children()
+
+        mock_kill.assert_called_once_with(fake_pid, signal.SIGTERM)
+
+        with _lock:
+            assert fake_pid not in _stdio_pids
+

 # ---------------------------------------------------------------------------
 # Fix 3: MCP reload timeout (cli.py)
@@ -159,7 +159,7 @@ class TestApplyUpdate:
            def __init__(self):
                self.written = None

-            def read_file(self, path, offset=1, limit=500):
+            def read_file_raw(self, path):
                return SimpleNamespace(
                    content=(
                        'def run():\n'
@@ -211,7 +211,7 @@ class TestAdditionOnlyHunks:
        # Apply to a file that contains the context hint
        class FakeFileOps:
            written = None
-            def read_file(self, path, **kw):
+            def read_file_raw(self, path):
                return SimpleNamespace(
                    content="def main():\n    pass\n",
                    error=None,
@@ -239,7 +239,7 @@ class TestAdditionOnlyHunks:

        class FakeFileOps:
            written = None
-            def read_file(self, path, **kw):
+            def read_file_raw(self, path):
                return SimpleNamespace(
                    content="existing = True\n",
                    error=None,
@@ -253,3 +253,259 @@ class TestAdditionOnlyHunks:
        assert result.success is True
        assert file_ops.written.endswith("def new_func():\n    return True\n")
        assert "existing = True" in file_ops.written
+
+
+class TestReadFileRaw:
+    """Bug 1 regression tests — files > 2000 lines and lines > 2000 chars."""
+
+    def test_apply_update_file_over_2000_lines(self):
+        """A hunk targeting line 2200 must not truncate the file to 2000 lines."""
+        patch = """\
+*** Begin Patch
+*** Update File: big.py
+@@ marker_at_2200 @@
+ line_2200
+-old_value
+new_value
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+
+        # Build a 2500-line file; the hunk targets a region at line 2200
+        lines = [f"line_{i}" for i in range(1, 2501)]
+        lines[2199] = "line_2200"   # index 2199 = line 2200
+        lines[2200] = "old_value"
+        file_content = "\n".join(lines)
+
+        class FakeFileOps:
+            written = None
+            def read_file_raw(self, path):
+                return SimpleNamespace(content=file_content, error=None)
+            def write_file(self, path, content):
+                self.written = content
+                return SimpleNamespace(error=None)
+
+        file_ops = FakeFileOps()
+        result = apply_v4a_operations(ops, file_ops)
+        assert result.success is True
+        written_lines = file_ops.written.split("\n")
+        assert len(written_lines) == 2500, (
+            f"Expected 2500 lines, got {len(written_lines)}"
+        )
+        assert "new_value" in file_ops.written
+        assert "old_value" not in file_ops.written
+
+    def test_apply_update_preserves_long_lines(self):
+        """A line > 2000 chars must be preserved verbatim after an unrelated hunk."""
+        long_line = "x" * 3000
+        patch = """\
+*** Begin Patch
+*** Update File: wide.py
+@@ short_func @@
+ def short_func():
+-    return 1
+    return 2
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+
+        file_content = f"def short_func():\n    return 1\n{long_line}\n"
+
+        class FakeFileOps:
+            written = None
+            def read_file_raw(self, path):
+                return SimpleNamespace(content=file_content, error=None)
+            def write_file(self, path, content):
+                self.written = content
+                return SimpleNamespace(error=None)
+
+        file_ops = FakeFileOps()
+        result = apply_v4a_operations(ops, file_ops)
+        assert result.success is True
+        assert long_line in file_ops.written, "Long line was truncated"
+        assert "... [truncated]" not in file_ops.written
+
+
+class TestValidationPhase:
+    """Bug 2 regression tests — validation prevents partial apply."""
+
+    def test_validation_failure_writes_nothing(self):
+        """If one hunk is invalid, no files should be written."""
+        patch = """\
+*** Begin Patch
+*** Update File: a.py
+ def good():
+-    return 1
+    return 2
+*** Update File: b.py
+ THIS LINE DOES NOT EXIST
+-    old
+    new
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+
+        written = {}
+
+        class FakeFileOps:
+            def read_file_raw(self, path):
+                files = {
+                    "a.py": "def good():\n    return 1\n",
+                    "b.py": "completely different content\n",
+                }
+                content = files.get(path)
+                if content is None:
+                    return SimpleNamespace(content=None, error=f"File not found: {path}")
+                return SimpleNamespace(content=content, error=None)
+
+            def write_file(self, path, content):
+                written[path] = content
+                return SimpleNamespace(error=None)
+
+        result = apply_v4a_operations(ops, FakeFileOps())
+        assert result.success is False
+        assert written == {}, f"No files should have been written, got: {list(written.keys())}"
+        assert "validation failed" in result.error.lower()
+
+    def test_all_valid_operations_applied(self):
+        """When all operations are valid, all files are written."""
+        patch = """\
+*** Begin Patch
+*** Update File: a.py
+ def foo():
+-    return 1
+    return 2
+*** Update File: b.py
+ def bar():
+-    pass
+    return True
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+
+        written = {}
+
+        class FakeFileOps:
+            def read_file_raw(self, path):
+                files = {
+                    "a.py": "def foo():\n    return 1\n",
+                    "b.py": "def bar():\n    pass\n",
+                }
+                return SimpleNamespace(content=files[path], error=None)
+
+            def write_file(self, path, content):
+                written[path] = content
+                return SimpleNamespace(error=None)
+
+        result = apply_v4a_operations(ops, FakeFileOps())
+        assert result.success is True
+        assert set(written.keys()) == {"a.py", "b.py"}
+
+
+class TestApplyDelete:
+    """Tests for _apply_delete producing a real unified diff."""
+
+    def test_delete_diff_contains_removed_lines(self):
+        """_apply_delete must embed the actual file content in the diff, not a placeholder."""
+        patch = """\
+*** Begin Patch
+*** Delete File: old/stuff.py
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+
+        class FakeFileOps:
+            deleted = False
+
+            def read_file_raw(self, path):
+                return SimpleNamespace(
+                    content="def old_func():\n    return 42\n",
+                    error=None,
+                )
+
+            def delete_file(self, path):
+                self.deleted = True
+                return SimpleNamespace(error=None)
+
+        file_ops = FakeFileOps()
+        result = apply_v4a_operations(ops, file_ops)
+
+        assert result.success is True
+        assert file_ops.deleted is True
+        # Diff must contain the actual removed lines, not a bare comment
+        assert "-def old_func():" in result.diff
+        assert "-    return 42" in result.diff
+        assert "/dev/null" in result.diff
+
+    def test_delete_diff_fallback_on_empty_file(self):
+        """An empty file should produce the fallback comment diff."""
+        patch = """\
+*** Begin Patch
+*** Delete File: empty.py
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+
+        class FakeFileOps:
+            def read_file_raw(self, path):
+                return SimpleNamespace(content="", error=None)
+
+            def delete_file(self, path):
+                return SimpleNamespace(error=None)
+
+        result = apply_v4a_operations(ops, FakeFileOps())
+        assert result.success is True
+        # unified_diff produces nothing for two empty inputs — fallback comment expected
+        assert "Deleted" in result.diff or result.diff.strip() == ""
+
+
+class TestCountOccurrences:
+    def test_basic(self):
+        from tools.patch_parser import _count_occurrences
+        assert _count_occurrences("aaa", "a") == 3
+        assert _count_occurrences("aaa", "aa") == 2
+        assert _count_occurrences("hello world", "xyz") == 0
+        assert _count_occurrences("", "x") == 0
+
+
+class TestParseErrorSignalling:
+    """Bug 3 regression tests — parse_v4a_patch must signal errors, not swallow them."""
+
+    def test_update_with_no_hunks_returns_error(self):
+        """An UPDATE with no hunk lines is a malformed patch and should error."""
+        patch = """\
+*** Begin Patch
+*** Update File: foo.py
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is not None, "Expected a parse error for hunk-less UPDATE"
+        assert ops == []
+
+    def test_move_without_destination_returns_error(self):
+        """A MOVE without '->' syntax should not silently produce a broken operation."""
+        # The move regex requires '->' so this will be treated as an unrecognised
+        # line and the op is never created.  Confirm nothing crashes and ops is empty.
+        patch = """\
+*** Begin Patch
+*** Move File: src/foo.py
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        # Either parse sees zero ops (fine) or returns an error (also fine).
+        # What is NOT acceptable is ops=[MOVE op with empty new_path] + err=None.
+        if ops:
+            assert err is not None, (
+                "MOVE with missing destination must either produce empty ops or an error"
+            )
+
+    def test_valid_patch_returns_no_error(self):
+        """A well-formed patch must still return err=None."""
+        patch = """\
+*** Begin Patch
+*** Update File: f.py
+ ctx
+-old
+new
+*** End Patch"""
+        ops, err = parse_v4a_patch(patch)
+        assert err is None
+        assert len(ops) == 1
@@ -0,0 +1,274 @@
+"""Tests for zombie process cleanup — verifies processes spawned by tools
+are properly reaped when agent sessions end.
+
+Reproduction for issue #7131: zombie process accumulation on long-running
+gateway deployments.
+"""
+
+import os
+import signal
+import subprocess
+import sys
+import time
+import threading
+
+import pytest
+
+
+def _spawn_sleep(seconds: float = 60) -> subprocess.Popen:
+    """Spawn a portable long-lived Python sleep process (no shell wrapper)."""
+    return subprocess.Popen(
+        [sys.executable, "-c", f"import time; time.sleep({seconds})"],
+    )
+
+
+def _pid_alive(pid: int) -> bool:
+    """Return True if a process with the given PID is still running."""
+    try:
+        os.kill(pid, 0)
+    except OSError as exc:
+        return isinstance(exc, PermissionError)  # EPERM: PID exists, not ours
+    return True
+
+
+class TestZombieReproduction:
+    """Demonstrate that subprocesses survive when cleanup is not called."""
+
+    def test_orphaned_processes_survive_without_cleanup(self):
+        """REPRODUCTION: processes spawned directly survive if no one kills
+        them — this models the gap that causes zombie accumulation when
+        the gateway drops agent references without calling close()."""
+        pids = []
+
+        try:
+            for _ in range(3):
+                proc = _spawn_sleep(60)
+                pids.append(proc.pid)
+
+            for pid in pids:
+                assert _pid_alive(pid), f"PID {pid} should be alive after spawn"
+
+            # Simulate "session end" by dropping the last remaining Popen handle
+            del proc  # noqa: F821
+
+            # BUG: processes are still alive after reference is dropped
+            for pid in pids:
+                assert _pid_alive(pid), (
+                    f"PID {pid} died after ref drop — "
+                    f"expected it to survive (demonstrating the bug)"
+                )
+        finally:
+            for pid in pids:
+                try:
+                    os.kill(pid, signal.SIGKILL)
+                except (ProcessLookupError, PermissionError):
+                    pass
+
+    def test_explicit_terminate_reaps_processes(self):
+        """Explicitly terminating+waiting on Popen handles works.
+        This models what ProcessRegistry.kill_process does internally."""
+        procs = []
+
+        try:
+            for _ in range(3):
+                proc = _spawn_sleep(60)
+                procs.append(proc)
+
+            for proc in procs:
+                assert _pid_alive(proc.pid)
+
+            for proc in procs:
+                proc.terminate()
+                proc.wait(timeout=5)
+
+            for proc in procs:
+                assert proc.returncode is not None, (
+                    f"PID {proc.pid} should have exited after terminate+wait"
+                )
+        finally:
+            for proc in procs:
+                try:
+                    proc.kill()
+                    proc.wait(timeout=1)
+                except Exception:
+                    pass
+
+
+class TestAgentCloseMethod:
+    """Verify AIAgent.close() exists, is idempotent, and calls cleanup."""
+
+    def test_close_calls_cleanup_functions(self):
+        """close() should call kill_all, cleanup_vm, cleanup_browser."""
+        from unittest.mock import patch
+
+        with patch("run_agent.AIAgent.__init__", return_value=None):
+            from run_agent import AIAgent
+            agent = AIAgent.__new__(AIAgent)
+            agent.session_id = "test-close-cleanup"
+            agent._active_children = []
+            agent._active_children_lock = threading.Lock()
+            agent.client = None
+
+            with patch("tools.process_registry.process_registry") as mock_registry, \
+                 patch("tools.terminal_tool.cleanup_vm") as mock_cleanup_vm, \
+                 patch("tools.browser_tool.cleanup_browser") as mock_cleanup_browser:
+                agent.close()
+
+                mock_registry.kill_all.assert_called_once_with(
+                    task_id="test-close-cleanup"
+                )
+                mock_cleanup_vm.assert_called_once_with("test-close-cleanup")
+                mock_cleanup_browser.assert_called_once_with("test-close-cleanup")
+
+    def test_close_is_idempotent(self):
+        """close() can be called multiple times without error."""
+        from unittest.mock import patch
+
+        with patch("run_agent.AIAgent.__init__", return_value=None):
+            from run_agent import AIAgent
+            agent = AIAgent.__new__(AIAgent)
+            agent.session_id = "test-close-idempotent"
+            agent._active_children = []
+            agent._active_children_lock = threading.Lock()
+            agent.client = None
+
+            agent.close()
+            agent.close()
+            agent.close()
+
+    def test_close_propagates_to_children(self):
+        """close() should call close() on all active child agents."""
+        from unittest.mock import MagicMock, patch
+
+        with patch("run_agent.AIAgent.__init__", return_value=None):
+            from run_agent import AIAgent
+            agent = AIAgent.__new__(AIAgent)
+            agent.session_id = "test-close-children"
+            agent._active_children_lock = threading.Lock()
+            agent.client = None
+
+            child_1 = MagicMock()
+            child_2 = MagicMock()
+            agent._active_children = [child_1, child_2]
+
+            agent.close()
+
+            child_1.close.assert_called_once()
+            child_2.close.assert_called_once()
+            assert agent._active_children == []
+
+    def test_close_survives_partial_failures(self):
+        """close() continues cleanup even if one step fails."""
+        from unittest.mock import patch
+
+        with patch("run_agent.AIAgent.__init__", return_value=None):
+            from run_agent import AIAgent
+            agent = AIAgent.__new__(AIAgent)
+            agent.session_id = "test-close-partial"
+            agent._active_children = []
+            agent._active_children_lock = threading.Lock()
+            agent.client = None
+
+            with patch(
+                "tools.process_registry.process_registry"
+            ) as mock_reg, patch(
+                "tools.terminal_tool.cleanup_vm"
+            ) as mock_vm, patch(
+                "tools.browser_tool.cleanup_browser"
+            ) as mock_browser:
+                mock_reg.kill_all.side_effect = RuntimeError("boom")
+
+                agent.close()
+
+                mock_vm.assert_called_once()
+                mock_browser.assert_called_once()
+
+
+class TestGatewayCleanupWiring:
+    """Verify gateway lifecycle calls close() on agents."""
+
+    def test_gateway_stop_calls_close(self):
+        """gateway stop() should call close() on all running agents."""
+        import asyncio
+        from unittest.mock import MagicMock, patch
+
+        runner = MagicMock()
+        runner._running = True
+        runner._running_agents = {}
+        runner.adapters = {}
+        runner._background_tasks = set()
+        runner._pending_messages = {}
+        runner._pending_approvals = {}
+        runner._shutdown_event = asyncio.Event()
+        runner._exit_reason = None
+
+        mock_agent_1 = MagicMock()
+        mock_agent_2 = MagicMock()
+        runner._running_agents = {
+            "session-1": mock_agent_1,
+            "session-2": mock_agent_2,
+        }
+
+        from gateway.run import GatewayRunner
+
+        loop = asyncio.new_event_loop()
+        try:
+            with patch("gateway.status.remove_pid_file"), \
+                 patch("gateway.status.write_runtime_status"), \
+                 patch("tools.terminal_tool.cleanup_all_environments"), \
+                 patch("tools.browser_tool.cleanup_all_browsers"):
+                loop.run_until_complete(GatewayRunner.stop(runner))
+        finally:
+            loop.close()
+
+        mock_agent_1.close.assert_called()
+        mock_agent_2.close.assert_called()
+
+    def test_evict_does_not_call_close(self):
+        """_evict_cached_agent() should NOT call close() — it's also used
+        for non-destructive refreshes (model switch, branch, fallback)."""
+        import threading
+        from unittest.mock import MagicMock
+
+        from gateway.run import GatewayRunner
+
+        runner = object.__new__(GatewayRunner)
+        runner._agent_cache_lock = threading.Lock()
+
+        mock_agent = MagicMock()
+        runner._agent_cache = {"session-key": (mock_agent, 12345)}
+
+        GatewayRunner._evict_cached_agent(runner, "session-key")
+
+        mock_agent.close.assert_not_called()
+        assert "session-key" not in runner._agent_cache
+
+
+class TestDelegationCleanup:
+    """Verify subagent delegation cleans up child agents."""
+
+    def test_run_single_child_calls_close(self):
+        """_run_single_child finally block should call close() on child."""
+        from unittest.mock import MagicMock
+        from tools.delegate_tool import _run_single_child
+
+        parent = MagicMock()
+        parent._active_children = []
+        parent._active_children_lock = threading.Lock()
+
+        child = MagicMock()
+        child._delegate_saved_tool_names = ["tool1"]
+        child.run_conversation.side_effect = RuntimeError("test abort")
+
+        parent._active_children.append(child)
+
+        result = _run_single_child(
+            task_index=0,
+            goal="test goal",
+            child=child,
+            parent_agent=parent,
+        )
+
+        child.close.assert_called_once()
+        assert child not in parent._active_children
+        assert result["status"] == "error"
@@ -64,14 +64,15 @@ def _scan_cron_prompt(prompt: str) -> str:


 def _origin_from_env() -> Optional[Dict[str, str]]:
-    origin_platform = os.getenv("HERMES_SESSION_PLATFORM")
-    origin_chat_id = os.getenv("HERMES_SESSION_CHAT_ID")
+    from gateway.session_context import get_session_env
+    origin_platform = get_session_env("HERMES_SESSION_PLATFORM")
+    origin_chat_id = get_session_env("HERMES_SESSION_CHAT_ID")
    if origin_platform and origin_chat_id:
        return {
            "platform": origin_platform,
            "chat_id": origin_chat_id,
-            "chat_name": os.getenv("HERMES_SESSION_CHAT_NAME"),
-            "thread_id": os.getenv("HERMES_SESSION_THREAD_ID"),
+            "chat_name": get_session_env("HERMES_SESSION_CHAT_NAME") or None,
+            "thread_id": get_session_env("HERMES_SESSION_THREAD_ID") or None,
        }
    return None

@@ -578,6 +578,15 @@ def _run_single_child(
            except (ValueError, UnboundLocalError) as e:
                logger.debug("Could not remove child from active_children: %s", e)

+        # Close tool resources (terminal sandboxes, browser daemons,
+        # background processes, httpx clients) so subagent subprocesses
+        # don't outlive the delegation.
+        try:
+            if hasattr(child, 'close'):
+                child.close()
+        except Exception:
+            logger.debug("Failed to close child agent after delegation", exc_info=True)
+
 def delegate_task(
    goal: Optional[str] = None,
    context: Optional[str] = None,
@@ -252,23 +252,43 @@ class FileOperations(ABC):
    def read_file(self, path: str, offset: int = 1, limit: int = 500) -> ReadResult:
        """Read a file with pagination support."""
        ...
-    
+
+    @abstractmethod
+    def read_file_raw(self, path: str) -> ReadResult:
+        """Read the complete file content as a plain string.
+
+        No pagination, no line-number prefixes, no per-line truncation.
+        Returns ReadResult with .content = full file text, .error set on
+        failure. Always reads to EOF regardless of file size.
+        """
+        ...
+
    @abstractmethod
    def write_file(self, path: str, content: str) -> WriteResult:
        """Write content to a file, creating directories as needed."""
        ...
-    
+
    @abstractmethod
-    def patch_replace(self, path: str, old_string: str, new_string: str, 
+    def patch_replace(self, path: str, old_string: str, new_string: str,
                      replace_all: bool = False) -> PatchResult:
        """Replace text in a file using fuzzy matching."""
        ...
-    
+
    @abstractmethod
    def patch_v4a(self, patch_content: str) -> PatchResult:
        """Apply a V4A format patch."""
        ...
-    
+
+    @abstractmethod
+    def delete_file(self, path: str) -> WriteResult:
+        """Delete a file. Returns WriteResult with .error set on failure."""
+        ...
+
+    @abstractmethod
+    def move_file(self, src: str, dst: str) -> WriteResult:
+        """Move/rename a file from src to dst. Returns WriteResult with .error set on failure."""
+        ...
+
    @abstractmethod
    def search(self, pattern: str, path: str = ".", target: str = "content",
               file_glob: Optional[str] = None, limit: int = 50, offset: int = 0,
@@ -561,10 +581,62 @@ class ShellFileOperations(FileOperations):
            similar_files=similar[:5]  # Limit to 5 suggestions
        )
    
+    def read_file_raw(self, path: str) -> ReadResult:
+        """Read the complete file content as a plain string.
+
+        No pagination, no line-number prefixes, no per-line truncation.
+        Uses cat so the full file is returned regardless of size.
+        """
+        path = self._expand_path(path)
+        stat_cmd = f"wc -c < {self._escape_shell_arg(path)} 2>/dev/null"
+        stat_result = self._exec(stat_cmd)
+        if stat_result.exit_code != 0:
+            return self._suggest_similar_files(path)
+        try:
+            file_size = int(stat_result.stdout.strip())
+        except ValueError:
+            file_size = 0
+        if self._is_image(path):
+            return ReadResult(is_image=True, is_binary=True, file_size=file_size)
+        sample_result = self._exec(f"head -c 1000 {self._escape_shell_arg(path)} 2>/dev/null")
+        if self._is_likely_binary(path, sample_result.stdout):
+            return ReadResult(
+                is_binary=True, file_size=file_size,
+                error="Binary file — cannot display as text."
+            )
+        cat_result = self._exec(f"cat {self._escape_shell_arg(path)}")
+        if cat_result.exit_code != 0:
+            return ReadResult(error=f"Failed to read file: {cat_result.stdout}")
+        return ReadResult(content=cat_result.stdout, file_size=file_size)
+
+    def delete_file(self, path: str) -> WriteResult:
+        """Delete a file via rm."""
+        path = self._expand_path(path)
+        if _is_write_denied(path):
+            return WriteResult(error=f"Delete denied: {path} is a protected path")
+        result = self._exec(f"rm -f {self._escape_shell_arg(path)}")
+        if result.exit_code != 0:
+            return WriteResult(error=f"Failed to delete {path}: {result.stdout}")
+        return WriteResult()
+
+    def move_file(self, src: str, dst: str) -> WriteResult:
+        """Move a file via mv."""
+        src = self._expand_path(src)
+        dst = self._expand_path(dst)
+        for p in (src, dst):
+            if _is_write_denied(p):
+                return WriteResult(error=f"Move denied: {p} is a protected path")
+        result = self._exec(
+            f"mv {self._escape_shell_arg(src)} {self._escape_shell_arg(dst)}"
+        )
+        if result.exit_code != 0:
+            return WriteResult(error=f"Failed to move {src} -> {dst}: {result.stdout}")
+        return WriteResult()
+
    # =========================================================================
    # WRITE Implementation
    # =========================================================================
-    
+
    def write_file(self, path: str, content: str) -> WriteResult:
        """
        Write content to a file, creating parent directories as needed.
@@ -656,7 +728,7 @@ class ShellFileOperations(FileOperations):
        # Import and use fuzzy matching
        from tools.fuzzy_match import fuzzy_find_and_replace
        
-        new_content, match_count, error = fuzzy_find_and_replace(
+        new_content, match_count, _strategy, error = fuzzy_find_and_replace(
            content, old_string, new_string, replace_all
        )
        
@@ -21,7 +21,7 @@ Multi-occurrence matching is handled via the replace_all flag.
 Usage:
    from tools.fuzzy_match import fuzzy_find_and_replace
    
-    new_content, match_count, error = fuzzy_find_and_replace(
+    new_content, match_count, strategy, error = fuzzy_find_and_replace(
        content="def foo():\\n    pass",
        old_string="def foo():",
        new_string="def bar():",
@@ -48,27 +48,27 @@ def _unicode_normalize(text: str) -> str:


 def fuzzy_find_and_replace(content: str, old_string: str, new_string: str,
-                            replace_all: bool = False) -> Tuple[str, int, Optional[str]]:
+                            replace_all: bool = False) -> Tuple[str, int, Optional[str], Optional[str]]:
    """
    Find and replace text using a chain of increasingly fuzzy matching strategies.
-    
+
    Args:
        content: The file content to search in
        old_string: The text to find
        new_string: The replacement text
        replace_all: If True, replace all occurrences; if False, require uniqueness
-    
+
    Returns:
-        Tuple of (new_content, match_count, error_message)
-        - If successful: (modified_content, number_of_replacements, None)
-        - If failed: (original_content, 0, error_description)
+        Tuple of (new_content, match_count, strategy_name, error_message)
+        - If successful: (modified_content, number_of_replacements, strategy_used, None)
+        - If failed: (original_content, 0, None, error_description)
    """
    if not old_string:
-        return content, 0, "old_string cannot be empty"
-    
+        return content, 0, None, "old_string cannot be empty"
+
    if old_string == new_string:
-        return content, 0, "old_string and new_string are identical"
-    
+        return content, 0, None, "old_string and new_string are identical"
+
    # Try each matching strategy in order
    strategies: List[Tuple[str, Callable]] = [
        ("exact", _strategy_exact),
@@ -77,27 +77,28 @@ def fuzzy_find_and_replace(content: str, old_string: str, new_string: str,
        ("indentation_flexible", _strategy_indentation_flexible),
        ("escape_normalized", _strategy_escape_normalized),
        ("trimmed_boundary", _strategy_trimmed_boundary),
+        ("unicode_normalized", _strategy_unicode_normalized),
        ("block_anchor", _strategy_block_anchor),
        ("context_aware", _strategy_context_aware),
    ]
-    
-    for _strategy_name, strategy_fn in strategies:
+
+    for strategy_name, strategy_fn in strategies:
        matches = strategy_fn(content, old_string)
-        
+
        if matches:
            # Found matches with this strategy
            if len(matches) > 1 and not replace_all:
-                return content, 0, (
+                return content, 0, None, (
                    f"Found {len(matches)} matches for old_string. "
                    f"Provide more context to make it unique, or use replace_all=True."
                )
-            
+
            # Perform replacement
            new_content = _apply_replacements(content, matches, new_string)
-            return new_content, len(matches), None
-    
+            return new_content, len(matches), strategy_name, None
+
    # No strategy found a match
-    return content, 0, "Could not find a match for old_string in the file"
+    return content, 0, None, "Could not find a match for old_string in the file"


 def _apply_replacements(content: str, matches: List[Tuple[int, int]], new_string: str) -> str:
@@ -258,9 +259,90 @@ def _strategy_trimmed_boundary(content: str, pattern: str) -> List[Tuple[int, in
    return matches


+def _build_orig_to_norm_map(original: str) -> List[int]:
+    """Build a list mapping each original character index to its normalized index.
+
+    Because UNICODE_MAP replacements may expand characters (e.g. em-dash → '--',
+    ellipsis → '...'), the normalised string can be longer than the original.
+    This map lets us convert positions in the normalised string back to the
+    corresponding positions in the original string.
+
+    Returns a list of length ``len(original) + 1``; entry ``i`` is the
+    normalised index that character ``i`` maps to.
+    """
+    result: List[int] = []
+    norm_pos = 0
+    for char in original:
+        result.append(norm_pos)
+        repl = UNICODE_MAP.get(char)
+        norm_pos += len(repl) if repl is not None else 1
+    result.append(norm_pos)  # sentinel: one past the last character
+    return result
+
+
+def _map_positions_norm_to_orig(
+    orig_to_norm: List[int],
+    norm_matches: List[Tuple[int, int]],
+) -> List[Tuple[int, int]]:
+    """Convert (start, end) positions in the normalised string to original positions."""
+    # Invert the map: norm_pos -> first original position with that norm_pos
+    norm_to_orig_start: dict[int, int] = {}
+    for orig_pos, norm_pos in enumerate(orig_to_norm[:-1]):
+        if norm_pos not in norm_to_orig_start:
+            norm_to_orig_start[norm_pos] = orig_pos
+
+    results: List[Tuple[int, int]] = []
+    orig_len = len(orig_to_norm) - 1  # number of original characters
+
+    for norm_start, norm_end in norm_matches:
+        if norm_start not in norm_to_orig_start:
+            continue
+        orig_start = norm_to_orig_start[norm_start]
+
+        # Walk forward until orig_to_norm[orig_end] >= norm_end
+        orig_end = orig_start
+        while orig_end < orig_len and orig_to_norm[orig_end] < norm_end:
+            orig_end += 1
+
+        results.append((orig_start, orig_end))
+
+    return results
+
+
+def _strategy_unicode_normalized(content: str, pattern: str) -> List[Tuple[int, int]]:
+    """Strategy 7: Unicode normalisation.
+
+    Normalises smart quotes, em/en-dashes, ellipsis, and non-breaking spaces
+    to their ASCII equivalents in both *content* and *pattern*, then runs
+    exact and line_trimmed matching on the normalised copies.
+
+    Positions are mapped back to the *original* string via
+    ``_build_orig_to_norm_map`` — necessary because some UNICODE_MAP
+    replacements expand a single character into multiple ASCII characters,
+    making a naïve position copy incorrect.
+    """
+    # Normalize both sides. Either the content or the pattern (or both) may
+    # carry unicode variants — e.g. content has an em-dash that should match
+    # the LLM's ASCII '--', or vice-versa.  Skip only when neither changes.
+    norm_pattern = _unicode_normalize(pattern)
+    norm_content = _unicode_normalize(content)
+    if norm_content == content and norm_pattern == pattern:
+        return []
+
+    norm_matches = _strategy_exact(norm_content, norm_pattern)
+    if not norm_matches:
+        norm_matches = _strategy_line_trimmed(norm_content, norm_pattern)
+
+    if not norm_matches:
+        return []
+
+    orig_to_norm = _build_orig_to_norm_map(content)
+    return _map_positions_norm_to_orig(orig_to_norm, norm_matches)
+
+
 def _strategy_block_anchor(content: str, pattern: str) -> List[Tuple[int, int]]:
    """
-    Strategy 7: Match by anchoring on first and last lines.
+    Strategy 8: Match by anchoring on first and last lines.
    Adjusted with permissive thresholds and unicode normalization.
    """
    # Normalize both strings for comparison while keeping original content for offset calculation
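Reviewer sketch: the position mapping that `_strategy_unicode_normalized` relies on can be illustrated in isolation. This is a minimal sketch, not the module's code. The `UNICODE_MAP` table below is an assumed stand-in (the real table may contain different entries), and `normalize`, `build_orig_to_norm`, and `norm_span_to_orig` are hypothetical names chosen for the example.

```python
from typing import Dict, List, Tuple

# Assumed stand-in for the module's UNICODE_MAP; the real table may differ.
UNICODE_MAP: Dict[str, str] = {
    "\u2014": "--",   # em-dash expands to two characters
    "\u2026": "...",  # ellipsis expands to three characters
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
}


def normalize(text: str) -> str:
    """Replace each mapped character with its ASCII equivalent."""
    return "".join(UNICODE_MAP.get(ch, ch) for ch in text)


def build_orig_to_norm(original: str) -> List[int]:
    """offsets[i] is the normalized index where original character i starts."""
    offsets: List[int] = []
    norm_pos = 0
    for ch in original:
        offsets.append(norm_pos)
        norm_pos += len(UNICODE_MAP.get(ch, ch))
    offsets.append(norm_pos)  # sentinel: one past the last character
    return offsets


def norm_span_to_orig(offsets: List[int], span: Tuple[int, int]) -> Tuple[int, int]:
    """Map a (start, end) span in the normalized string back to the original."""
    norm_start, norm_end = span
    # Raises ValueError if the span starts in the middle of an expansion.
    orig_start = offsets.index(norm_start)
    orig_end = orig_start
    while orig_end < len(offsets) - 1 and offsets[orig_end] < norm_end:
        orig_end += 1
    return orig_start, orig_end


content = "wait\u2026 a\u2014b"           # original text with ellipsis and em-dash
pattern = "a--b"                          # ASCII pattern, as an LLM might send it
start = normalize(content).find(pattern)  # position in the normalized copy
span = norm_span_to_orig(build_orig_to_norm(content), (start, start + len(pattern)))
matched = content[span[0]:span[1]]        # slice of the *original* string
```

Copying the normalized span directly onto the original would overshoot here, because the ellipsis and em-dash each occupy one character in the original but several in the normalized copy.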
@@ -290,8 +372,10 @@ def _strategy_block_anchor(content: str, pattern: str) -> List[Tuple[int, int]]:
    matches = []
    candidate_count = len(potential_matches)
    
-    # Thresholding logic: 0.10 for unique matches (max flexibility), 0.30 for multiple candidates
-    threshold = 0.10 if candidate_count == 1 else 0.30
+    # Thresholding logic: 0.50 for unique matches, 0.70 for multiple candidates.
+    # Previous values (0.10 / 0.30) were dangerously loose — a 10% middle-section
+    # similarity could match completely unrelated blocks.
+    threshold = 0.50 if candidate_count == 1 else 0.70

    for i in potential_matches:
        if pattern_line_count <= 2:
@@ -314,7 +398,7 @@ def _strategy_block_anchor(content: str, pattern: str) -> List[Tuple[int, int]]:

 def _strategy_context_aware(content: str, pattern: str) -> List[Tuple[int, int]]:
    """
-    Strategy 8: Line-by-line similarity with 50% threshold.
+    Strategy 9: Line-by-line similarity with 50% threshold.
    
    Finds blocks where at least 50% of lines have high similarity.
    """
@@ -2160,6 +2160,7 @@ def _kill_orphaned_mcp_children() -> None:
    Only kills PIDs tracked in ``_stdio_pids`` — never arbitrary children.
    """
    import signal as _signal
+    kill_signal = getattr(_signal, "SIGKILL", _signal.SIGTERM)

    with _lock:
        pids = list(_stdio_pids)
@@ -2167,7 +2168,7 @@ def _kill_orphaned_mcp_children() -> None:

    for pid in pids:
        try:
-            os.kill(pid, _signal.SIGKILL)
+            os.kill(pid, kill_signal)
            logger.debug("Force-killed orphaned MCP stdio process %d", pid)
        except (ProcessLookupError, PermissionError, OSError):
            pass  # Already exited or inaccessible
@@ -28,6 +28,7 @@ Usage:
        result = apply_v4a_operations(operations, file_ops)
 """

+import difflib
 import re
 from dataclasses import dataclass, field
 from typing import List, Optional, Tuple, Any
@@ -202,31 +203,162 @@ def parse_v4a_patch(patch_content: str) -> Tuple[List[PatchOperation], Optional[
        if current_hunk and current_hunk.lines:
            current_op.hunks.append(current_hunk)
        operations.append(current_op)
-    
+
+    # Validate the parsed result
+    if not operations:
+        # Empty patch is not an error — callers get [] and can decide
+        return operations, None
+
+    parse_errors: List[str] = []
+    for op in operations:
+        if not op.file_path:
+            parse_errors.append("Operation with empty file path")
+        if op.operation == OperationType.UPDATE and not op.hunks:
+            parse_errors.append(f"UPDATE {op.file_path!r}: no hunks found")
+        if op.operation == OperationType.MOVE and not op.new_path:
+            parse_errors.append(f"MOVE {op.file_path!r}: missing destination path (expected 'src -> dst')")
+
+    if parse_errors:
+        return [], "Parse error: " + "; ".join(parse_errors)
+
    return operations, None


-def apply_v4a_operations(operations: List[PatchOperation], 
-                          file_ops: Any) -> 'PatchResult':
+def _count_occurrences(text: str, pattern: str) -> int:
+    """Count occurrences of *pattern* in *text* (overlapping matches included)."""
+    count = 0
+    start = 0
+    while True:
+        pos = text.find(pattern, start)
+        if pos == -1:
+            break
+        count += 1
+        start = pos + 1
+    return count
+
+
+def _validate_operations(
+    operations: List[PatchOperation],
+    file_ops: Any,
+) -> List[str]:
+    """Validate all operations without writing any files.
+
+    Returns a list of error strings; an empty list means all operations
+    are valid and the apply phase can proceed safely.
+
+    For UPDATE operations, hunks are simulated in order so that later
+    hunks validate against post-earlier-hunk content (matching apply order).
    """
-    Apply V4A patch operations using a file operations interface.
-    
+    # Deferred import: breaks the patch_parser ↔ fuzzy_match circular dependency
+    from tools.fuzzy_match import fuzzy_find_and_replace
+
+    errors: List[str] = []
+
+    for op in operations:
+        if op.operation == OperationType.UPDATE:
+            read_result = file_ops.read_file_raw(op.file_path)
+            if read_result.error:
+                errors.append(f"{op.file_path}: {read_result.error}")
+                continue
+
+            simulated = read_result.content
+            for hunk in op.hunks:
+                search_lines = [l.content for l in hunk.lines if l.prefix in (' ', '-')]
+                if not search_lines:
+                    # Addition-only hunk: validate context hint uniqueness
+                    if hunk.context_hint:
+                        occurrences = _count_occurrences(simulated, hunk.context_hint)
+                        if occurrences == 0:
+                            errors.append(
+                                f"{op.file_path}: addition-only hunk context hint "
+                                f"'{hunk.context_hint}' not found"
+                            )
+                        elif occurrences > 1:
+                            errors.append(
+                                f"{op.file_path}: addition-only hunk context hint "
+                                f"'{hunk.context_hint}' is ambiguous "
+                                f"({occurrences} occurrences)"
+                            )
+                    continue
+
+                search_pattern = '\n'.join(search_lines)
+                replace_lines = [l.content for l in hunk.lines if l.prefix in (' ', '+')]
+                replacement = '\n'.join(replace_lines)
+
+                new_simulated, count, _strategy, match_error = fuzzy_find_and_replace(
+                    simulated, search_pattern, replacement, replace_all=False
+                )
+                if count == 0:
+                    label = f"'{hunk.context_hint}'" if hunk.context_hint else "(no hint)"
+                    errors.append(
+                        f"{op.file_path}: hunk {label} not found"
+                        + (f" — {match_error}" if match_error else "")
+                    )
+                else:
+                    # Advance simulation so subsequent hunks validate correctly.
+                    # Reuse the result from the call above — no second fuzzy run.
+                    simulated = new_simulated
+
+        elif op.operation == OperationType.DELETE:
+            read_result = file_ops.read_file_raw(op.file_path)
+            if read_result.error:
+                errors.append(f"{op.file_path}: file not found for deletion")
+
+        elif op.operation == OperationType.MOVE:
+            if not op.new_path:
+                errors.append(f"{op.file_path}: MOVE operation missing destination path")
+                continue
+            src_result = file_ops.read_file_raw(op.file_path)
+            if src_result.error:
+                errors.append(f"{op.file_path}: source file not found for move")
+            dst_result = file_ops.read_file_raw(op.new_path)
+            if not dst_result.error:
+                errors.append(
+                    f"{op.new_path}: destination already exists — move would overwrite"
+                )
+
+        # ADD: parent directory creation handled by write_file; no pre-check needed.
+
+    return errors
+
+
+def apply_v4a_operations(operations: List[PatchOperation],
+                          file_ops: Any) -> 'PatchResult':
+    """Apply V4A patch operations using a file operations interface.
+
+    Uses a two-phase validate-then-apply approach:
+    - Phase 1: validate all operations against current file contents without
+      writing anything. If any validation error is found, return immediately
+      with no filesystem changes.
+    - Phase 2: apply all operations. A failure here (e.g. a race between
+      validation and apply) is reported with a note to run ``git diff``.
+
    Args:
        operations: List of PatchOperation from parse_v4a_patch
-        file_ops: Object with read_file, write_file methods
-    
+        file_ops: Object with read_file_raw, write_file, delete_file, move_file methods
+
    Returns:
        PatchResult with results of all operations
    """
    # Import here to avoid circular imports
    from tools.file_operations import PatchResult
-    
+
+    # ---- Phase 1: validate ----
+    validation_errors = _validate_operations(operations, file_ops)
+    if validation_errors:
+        return PatchResult(
+            success=False,
+            error="Patch validation failed (no files were modified):\n"
+                  + "\n".join(f"  • {e}" for e in validation_errors),
+        )
+
+    # ---- Phase 2: apply ----
    files_modified = []
    files_created = []
    files_deleted = []
    all_diffs = []
    errors = []
-    
+
    for op in operations:
        try:
            if op.operation == OperationType.ADD:
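Reviewer sketch: the two-phase validate-then-apply approach described in the `apply_v4a_operations` docstring can be shown with an in-memory model. This is a simplified illustration under stated assumptions: files live in a plain dict, an edit is a hypothetical `(path, old_text, new_text)` tuple, and matching uses exact `str.count` rather than the fuzzy matcher. The names `Edit`, `validate`, and `apply_all` are invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified operation: (path, old_text, new_text).
Edit = Tuple[str, str, str]


def validate(files: Dict[str, str], edits: List[Edit]) -> List[str]:
    """Simulate every edit on a copy and return error strings; writes nothing."""
    simulated = dict(files)
    errors: List[str] = []
    for path, old, new in edits:
        if path not in simulated:
            errors.append(f"{path}: file not found")
            continue
        count = simulated[path].count(old)
        if count != 1:
            errors.append(f"{path}: expected 1 match for {old!r}, found {count}")
            continue
        # Advance the simulation so later edits see earlier results.
        simulated[path] = simulated[path].replace(old, new, 1)
    return errors


def apply_all(files: Dict[str, str], edits: List[Edit]) -> List[str]:
    """Phase 1: validate everything. Phase 2: apply only if validation passed."""
    errors = validate(files, edits)
    if errors:
        return errors  # nothing was modified
    for path, old, new in edits:
        files[path] = files[path].replace(old, new, 1)
    return []


files = {"a.txt": "x = 1\n", "b.txt": "y = 2\n"}
# The second edit cannot match, so the first one must not be applied either.
failed = apply_all(files, [("a.txt", "x = 1", "x = 10"), ("b.txt", "missing", "z")])
# Two edits to the same file validate in order, each against the prior result.
ok = apply_all(files, [("a.txt", "x = 1", "x = 10"), ("a.txt", "x = 10", "x = 100")])
```

Simulating edits in order is what lets a later hunk validate against the content an earlier hunk produces, which mirrors the apply order.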
@@ -236,7 +368,7 @@ def apply_v4a_operations(operations: List[PatchOperation],
                    all_diffs.append(result[1])
                else:
                    errors.append(f"Failed to add {op.file_path}: {result[1]}")
-                    
+
            elif op.operation == OperationType.DELETE:
                result = _apply_delete(op, file_ops)
                if result[0]:
@@ -244,7 +376,7 @@ def apply_v4a_operations(operations: List[PatchOperation],
                    all_diffs.append(result[1])
                else:
                    errors.append(f"Failed to delete {op.file_path}: {result[1]}")
-                    
+
            elif op.operation == OperationType.MOVE:
                result = _apply_move(op, file_ops)
                if result[0]:
@@ -252,7 +384,7 @@ def apply_v4a_operations(operations: List[PatchOperation],
                    all_diffs.append(result[1])
                else:
                    errors.append(f"Failed to move {op.file_path}: {result[1]}")
-                    
+
            elif op.operation == OperationType.UPDATE:
                result = _apply_update(op, file_ops)
                if result[0]:
@@ -260,19 +392,19 @@ def apply_v4a_operations(operations: List[PatchOperation],
                    all_diffs.append(result[1])
                else:
                    errors.append(f"Failed to update {op.file_path}: {result[1]}")
-                    
+
        except Exception as e:
            errors.append(f"Error processing {op.file_path}: {str(e)}")
-    
+
    # Run lint on all modified/created files
    lint_results = {}
    for f in files_modified + files_created:
        if hasattr(file_ops, '_check_lint'):
            lint_result = file_ops._check_lint(f)
            lint_results[f] = lint_result.to_dict()
-    
+
    combined_diff = '\n'.join(all_diffs)
-    
+
    if errors:
        return PatchResult(
            success=False,
@@ -281,16 +413,17 @@ def apply_v4a_operations(operations: List[PatchOperation],
            files_created=files_created,
            files_deleted=files_deleted,
            lint=lint_results if lint_results else None,
-            error='; '.join(errors)
+            error="Apply phase failed (state may be inconsistent — run `git diff` to assess):\n"
+                  + "\n".join(f"  • {e}" for e in errors),
        )
-    
+
    return PatchResult(
        success=True,
        diff=combined_diff,
        files_modified=files_modified,
        files_created=files_created,
        files_deleted=files_deleted,
-        lint=lint_results if lint_results else None
+        lint=lint_results if lint_results else None,
    )


@@ -317,68 +450,56 @@ def _apply_add(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:

 def _apply_delete(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
    """Apply a delete file operation."""
-    # Read file first for diff
-    read_result = file_ops.read_file(op.file_path)
-    
-    if read_result.error and "not found" in read_result.error.lower():
-        # File doesn't exist, nothing to delete
-        return True, f"# {op.file_path} already deleted or doesn't exist"
-    
-    # Delete directly via shell command using the underlying environment
-    rm_result = file_ops._exec(f"rm -f {file_ops._escape_shell_arg(op.file_path)}")
-    
-    if rm_result.exit_code != 0:
-        return False, rm_result.stdout
-    
-    diff = f"--- a/{op.file_path}\n+++ /dev/null\n# File deleted"
-    return True, diff
+    # Read before deleting so we can produce a real unified diff.
+    # Validation already confirmed existence; this guards against races.
+    read_result = file_ops.read_file_raw(op.file_path)
+    if read_result.error:
+        return False, f"Cannot delete {op.file_path}: file not found"
+
+    result = file_ops.delete_file(op.file_path)
+    if result.error:
+        return False, result.error
+
+    removed_lines = read_result.content.splitlines(keepends=True)
+    diff = ''.join(difflib.unified_diff(
+        removed_lines, [],
+        fromfile=f"a/{op.file_path}",
+        tofile="/dev/null",
+    ))
+    return True, diff or f"# Deleted: {op.file_path}"


 def _apply_move(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
    """Apply a move file operation."""
-    # Use shell mv command
-    mv_result = file_ops._exec(
-        f"mv {file_ops._escape_shell_arg(op.file_path)} {file_ops._escape_shell_arg(op.new_path)}"
-    )
-    
-    if mv_result.exit_code != 0:
-        return False, mv_result.stdout
-    
+    result = file_ops.move_file(op.file_path, op.new_path)
+    if result.error:
+        return False, result.error
+
    diff = f"# Moved: {op.file_path} -> {op.new_path}"
    return True, diff


 def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
    """Apply an update file operation."""
-    # Read current content
-    read_result = file_ops.read_file(op.file_path, limit=10000)
-    
+    # Deferred import: breaks the patch_parser ↔ fuzzy_match circular dependency
+    from tools.fuzzy_match import fuzzy_find_and_replace
+
+    # Read current content — raw so no line-number prefixes or per-line truncation
+    read_result = file_ops.read_file_raw(op.file_path)
+
    if read_result.error:
        return False, f"Cannot read file: {read_result.error}"
-    
-    # Parse content (remove line numbers)
-    current_lines = []
-    for line in read_result.content.split('\n'):
-        if re.match(r'^\s*\d+\|', line):
-            # Line format: "    123|content"
-            parts = line.split('|', 1)
-            if len(parts) == 2:
-                current_lines.append(parts[1])
-            else:
-                current_lines.append(line)
-        else:
-            current_lines.append(line)
-    
-    current_content = '\n'.join(current_lines)
-    
+
+    current_content = read_result.content
+
    # Apply each hunk
    new_content = current_content
-    
+
    for hunk in op.hunks:
        # Build search pattern from context and removed lines
        search_lines = []
        replace_lines = []
-        
+
        for line in hunk.lines:
            if line.prefix == ' ':
                search_lines.append(line.content)
@@ -387,17 +508,15 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
                search_lines.append(line.content)
            elif line.prefix == '+':
                replace_lines.append(line.content)
-        
+
        if search_lines:
            search_pattern = '\n'.join(search_lines)
            replacement = '\n'.join(replace_lines)
-            
-            # Use fuzzy matching
-            from tools.fuzzy_match import fuzzy_find_and_replace
-            new_content, count, error = fuzzy_find_and_replace(
+
+            new_content, count, _strategy, error = fuzzy_find_and_replace(
                new_content, search_pattern, replacement, replace_all=False
            )
-            
+
            if error and count == 0:
                # Try with context hint if available
                if hunk.context_hint:
@@ -408,8 +527,8 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
                        window_start = max(0, hint_pos - 500)
                        window_end = min(len(new_content), hint_pos + 2000)
                        window = new_content[window_start:window_end]
-                        
-                        window_new, count, error = fuzzy_find_and_replace(
+
+                        window_new, count, _strategy, error = fuzzy_find_and_replace(
                            window, search_pattern, replacement, replace_all=False
                        )
                        
@@ -424,16 +543,23 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
            # Insert at the location indicated by the context hint, or at end of file.
            insert_text = '\n'.join(replace_lines)
            if hunk.context_hint:
-                hint_pos = new_content.find(hunk.context_hint)
-                if hint_pos != -1:
+                occurrences = _count_occurrences(new_content, hunk.context_hint)
+                if occurrences == 0:
+                    # Hint not found — append at end as a safe fallback
+                    new_content = new_content.rstrip('\n') + '\n' + insert_text + '\n'
+                elif occurrences > 1:
+                    return False, (
+                        f"Addition-only hunk: context hint '{hunk.context_hint}' is ambiguous "
+                        f"({occurrences} occurrences) — provide a more unique hint"
+                    )
+                else:
+                    hint_pos = new_content.find(hunk.context_hint)
                    # Insert after the line containing the context hint
                    eol = new_content.find('\n', hint_pos)
                    if eol != -1:
                        new_content = new_content[:eol + 1] + insert_text + '\n' + new_content[eol + 1:]
                    else:
                        new_content = new_content + '\n' + insert_text
-                else:
-                    new_content = new_content.rstrip('\n') + '\n' + insert_text + '\n'
            else:
                new_content = new_content.rstrip('\n') + '\n' + insert_text + '\n'
    
@@ -443,7 +569,6 @@ def _apply_update(op: PatchOperation, file_ops: Any) -> Tuple[bool, str]:
        return False, write_result.error
    
    # Generate diff
-    import difflib
    diff_lines = difflib.unified_diff(
        current_content.splitlines(keepends=True),
        new_content.splitlines(keepends=True),
@@ -585,7 +585,10 @@ class ProcessRegistry:
        from tools.ansi_strip import strip_ansi
        from tools.terminal_tool import _interrupt_event

-        default_timeout = int(os.getenv("TERMINAL_TIMEOUT", "180"))
+        try:
+            default_timeout = int(os.getenv("TERMINAL_TIMEOUT", "180"))
+        except (ValueError, TypeError):
+            default_timeout = 180
        max_timeout = default_timeout
        requested_timeout = timeout
        timeout_note = None
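Reviewer sketch: the same parse-with-fallback pattern appears in this change for `TERMINAL_TIMEOUT`, `EMAIL_SMTP_PORT`, and the `retry-after` header. A small helper could express it once. `env_int` is a hypothetical name and is not part of the codebase, and `HERMES_EXAMPLE_TIMEOUT` is an invented variable used only for the demonstration.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to *default*
    when the variable is unset, empty, or not a valid integer."""
    try:
        return int(os.getenv(name, str(default)))
    except (ValueError, TypeError):
        return default


os.environ["HERMES_EXAMPLE_TIMEOUT"] = "not-a-number"
bad_value = env_int("HERMES_EXAMPLE_TIMEOUT", 180)

os.environ["HERMES_EXAMPLE_TIMEOUT"] = "42"
good_value = env_int("HERMES_EXAMPLE_TIMEOUT", 180)

del os.environ["HERMES_EXAMPLE_TIMEOUT"]
unset_value = env_int("HERMES_EXAMPLE_TIMEOUT", 180)
```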
@@ -212,7 +212,8 @@ def _handle_send(args):
        if isinstance(result, dict) and result.get("success") and mirror_text:
            try:
                from gateway.mirror import mirror_to_session
-                source_label = os.getenv("HERMES_SESSION_PLATFORM", "cli")
+                from gateway.session_context import get_session_env
+                source_label = get_session_env("HERMES_SESSION_PLATFORM", "cli")
                if mirror_to_session(platform_name, chat_id, mirror_text, source_label=source_label, thread_id=thread_id):
                    result["mirrored"] = True
            except Exception:
@@ -689,7 +690,10 @@ async def _send_email(extra, chat_id, message):
    address = extra.get("address") or os.getenv("EMAIL_ADDRESS", "")
    password = os.getenv("EMAIL_PASSWORD", "")
    smtp_host = extra.get("smtp_host") or os.getenv("EMAIL_SMTP_HOST", "")
-    smtp_port = int(os.getenv("EMAIL_SMTP_PORT", "587"))
+    try:
+        smtp_port = int(os.getenv("EMAIL_SMTP_PORT", "587"))
+    except (ValueError, TypeError):
+        smtp_port = 587

    if not all([address, password, smtp_host]):
        return {"error": "Email not configured (EMAIL_ADDRESS, EMAIL_PASSWORD, EMAIL_SMTP_HOST required)"}
@@ -1020,7 +1024,8 @@ async def _send_feishu(pconfig, chat_id, message, media_files=None, thread_id=No

 def _check_send_message():
    """Gate send_message on gateway running (always available on messaging platforms)."""
-    platform = os.getenv("HERMES_SESSION_PLATFORM", "")
+    from gateway.session_context import get_session_env
+    platform = get_session_env("HERMES_SESSION_PLATFORM", "")
    if platform and platform != "local":
        return True
    try:
@@ -426,7 +426,7 @@ def _patch_skill(
    # from exact-match failures on minor formatting mismatches.
    from tools.fuzzy_match import fuzzy_find_and_replace

-    new_content, match_count, match_error = fuzzy_find_and_replace(
+    new_content, match_count, _strategy, match_error = fuzzy_find_and_replace(
        content, old_string, new_string, replace_all
    )
    if match_error:
@@ -1788,7 +1788,10 @@ class ClawHubSource(SkillSource):
                    follow_redirects=True,
                )
                if resp.status_code == 429:
-                    retry_after = int(resp.headers.get("retry-after", "5"))
+                    try:
+                        retry_after = int(resp.headers.get("retry-after", "5"))
+                    except (ValueError, TypeError):
+                        retry_after = 5
                    retry_after = min(retry_after, 15)  # Cap wait time
                    logger.debug(
                        "ClawHub download rate-limited for %s, retrying in %ds (attempt %d/%d)",
@@ -347,7 +347,8 @@ def _capture_required_environment_variables(
 def _is_gateway_surface() -> bool:
    if os.getenv("HERMES_GATEWAY_SESSION"):
        return True
-    return bool(os.getenv("HERMES_SESSION_PLATFORM"))
+    from gateway.session_context import get_session_env
+    return bool(get_session_env("HERMES_SESSION_PLATFORM"))


 def _get_terminal_backend_name() -> str:
@@ -1420,10 +1420,11 @@ def terminal_tool(
                    # In gateway mode, auto-register a fast watcher so the
                    # gateway can detect completion and trigger a new agent
                    # turn.  CLI mode uses the completion_queue directly.
-                    _gw_platform = os.getenv("HERMES_SESSION_PLATFORM", "")
+                    from gateway.session_context import get_session_env as _gse
+                    _gw_platform = _gse("HERMES_SESSION_PLATFORM", "")
                    if _gw_platform and not check_interval:
-                        _gw_chat_id = os.getenv("HERMES_SESSION_CHAT_ID", "")
-                        _gw_thread_id = os.getenv("HERMES_SESSION_THREAD_ID", "")
+                        _gw_chat_id = _gse("HERMES_SESSION_CHAT_ID", "")
+                        _gw_thread_id = _gse("HERMES_SESSION_THREAD_ID", "")
                        proc_session.watcher_platform = _gw_platform
                        proc_session.watcher_chat_id = _gw_chat_id
                        proc_session.watcher_thread_id = _gw_thread_id
@@ -1445,9 +1446,10 @@ def terminal_tool(
                        result_data["check_interval_note"] = (
                            f"Requested {check_interval}s raised to minimum 30s"
                        )
-                    watcher_platform = os.getenv("HERMES_SESSION_PLATFORM", "")
-                    watcher_chat_id = os.getenv("HERMES_SESSION_CHAT_ID", "")
-                    watcher_thread_id = os.getenv("HERMES_SESSION_THREAD_ID", "")
+                    from gateway.session_context import get_session_env as _gse2
+                    watcher_platform = _gse2("HERMES_SESSION_PLATFORM", "")
+                    watcher_chat_id = _gse2("HERMES_SESSION_CHAT_ID", "")
+                    watcher_thread_id = _gse2("HERMES_SESSION_THREAD_ID", "")

                    # Store on session for checkpoint persistence
                    proc_session.watcher_platform = watcher_platform
@@ -480,7 +480,8 @@ def text_to_speech_tool(
    # Telegram voice bubbles require Opus (.ogg); OpenAI and ElevenLabs can
    # produce Opus natively (no ffmpeg needed).  Edge TTS always outputs MP3
    # and needs ffmpeg for conversion.
-    platform = os.getenv("HERMES_SESSION_PLATFORM", "").lower()
+    from gateway.session_context import get_session_env
+    platform = get_session_env("HERMES_SESSION_PLATFORM", "").lower()
    want_opus = (platform == "telegram")

    # Determine output path
@@ -1661,7 +1661,7 @@ dependencies = [
    { name = "fal-client" },
    { name = "fire" },
    { name = "firecrawl-py" },
-    { name = "httpx" },
+    { name = "httpx", extra = ["socks"] },
    { name = "jinja2" },
    { name = "openai" },
    { name = "parallel-web" },
@@ -1691,6 +1691,8 @@ all = [
    { name = "faster-whisper" },
    { name = "honcho-ai" },
    { name = "lark-oapi" },
+    { name = "markdown", marker = "sys_platform == 'linux'" },
+    { name = "matrix-nio", extra = ["e2e"], marker = "sys_platform == 'linux'" },
    { name = "mcp" },
    { name = "mistralai" },
    { name = "modal" },
@@ -1827,6 +1829,7 @@ requires-dist = [
    { name = "hermes-agent", extras = ["homeassistant"], marker = "extra == 'all'" },
    { name = "hermes-agent", extras = ["honcho"], marker = "extra == 'all'" },
    { name = "hermes-agent", extras = ["honcho"], marker = "extra == 'termux'" },
+    { name = "hermes-agent", extras = ["matrix"], marker = "sys_platform == 'linux' and extra == 'all'" },
    { name = "hermes-agent", extras = ["mcp"], marker = "extra == 'all'" },
    { name = "hermes-agent", extras = ["mcp"], marker = "extra == 'termux'" },
    { name = "hermes-agent", extras = ["messaging"], marker = "extra == 'all'" },
@@ -1839,7 +1842,7 @@ requires-dist = [
    { name = "hermes-agent", extras = ["tts-premium"], marker = "extra == 'all'" },
    { name = "hermes-agent", extras = ["voice"], marker = "extra == 'all'" },
    { name = "honcho-ai", marker = "extra == 'honcho'", specifier = ">=2.0.1,<3" },
-    { name = "httpx", specifier = ">=0.28.1,<1" },
+    { name = "httpx", extras = ["socks"], specifier = ">=0.28.1,<1" },
    { name = "jinja2", specifier = ">=3.1.5,<4" },
    { name = "lark-oapi", marker = "extra == 'feishu'", specifier = ">=1.5.3,<2" },
    { name = "markdown", marker = "extra == 'matrix'", specifier = ">=3.6,<4" },
@@ -2033,6 +2036,9 @@ wheels = [
 http2 = [
    { name = "h2" },
 ]
+socks = [
+    { name = "socksio" },
+]

 [[package]]
 name = "httpx-sse"
@@ -4500,6 +4506,15 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
 ]

+[[package]]
+name = "socksio"
+version = "1.0.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f8/5c/48a7d9495be3d1c651198fd99dbb6ce190e2274d0f28b9051307bdec6b85/socksio-1.0.0.tar.gz", hash = "sha256:f88beb3da5b5c38b9890469de67d0cb0f9d494b78b106ca1845f96c10b91c4ac", size = 19055, upload-time = "2020-04-17T15:50:34.664Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/37/c3/6eeb6034408dac0fa653d126c9204ade96b819c936e136c5e8a6897eee9c/socksio-1.0.0-py3-none-any.whl", hash = "sha256:95dc1f15f9b34e8d7b16f06d74b8ccf48f609af32ab33c608d08761c5dcbb1f3", size = 12763, upload-time = "2020-04-17T15:50:31.878Z" },
+]
+
 [[package]]
 name = "sounddevice"
 version = "0.5.5"
@@ -122,6 +122,17 @@ services.hermes-agent.environmentFiles = [ "/var/lib/hermes/env" ];
 Setting `addToSystemPackages = true` does two things: puts the `hermes` CLI on your system PATH **and** sets `HERMES_HOME` system-wide so the interactive CLI shares state (sessions, skills, cron) with the gateway service. Without it, running `hermes` in your shell creates a separate `~/.hermes/` directory.
 :::

+:::info Container-aware CLI
+When `container.enable = true` and `addToSystemPackages = true`, running `hermes chat` on the host **automatically routes into the managed container**. This means your interactive CLI session runs inside the same environment as the gateway service — with access to all container-installed packages and tools.
+
+- The routing is transparent: `hermes chat` detects container mode and does `podman exec` / `docker exec` under the hood
+- All CLI flags are forwarded: `-m`, `--resume`, `--query`, etc. work as normal
+- Use `hermes chat --host` to bypass container routing and run directly on the host
+- If the container isn't running, the CLI falls back to host execution automatically
+
+Other `hermes` subcommands (`version`, `config`, `sessions`, `setup`) always run on the host since they only need access to shared state files.
+:::
+
 ### Verify It Works

 After `nixos-rebuild switch`, check that the service is running:
@@ -262,16 +262,17 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
 | `MATRIX_REQUIRE_MENTION` | Require `@mention` in rooms (default: `true`). Set to `false` to respond to all messages. |
 | `MATRIX_FREE_RESPONSE_ROOMS` | Comma-separated room IDs where bot responds without `@mention` |
 | `MATRIX_AUTO_THREAD` | Auto-create threads for room messages (default: `true`) |
+| `MATRIX_DM_MENTION_THREADS` | Create a thread when bot is `@mentioned` in a DM (default: `false`) |
 | `HASS_TOKEN` | Home Assistant Long-Lived Access Token (enables HA platform + tools) |
 | `HASS_URL` | Home Assistant URL (default: `http://homeassistant.local:8123`) |
 | `WEBHOOK_ENABLED` | Enable the webhook platform adapter (`true`/`false`) |
 | `WEBHOOK_PORT` | HTTP server port for receiving webhooks (default: `8644`) |
 | `WEBHOOK_SECRET` | Global HMAC secret for webhook signature validation (used as fallback when routes don't specify their own) |
 | `API_SERVER_ENABLED` | Enable the OpenAI-compatible API server (`true`/`false`). Runs alongside other platforms. |
-| `API_SERVER_KEY` | Bearer token for API server authentication. Strongly recommended; required for any network-accessible deployment. |
+| `API_SERVER_KEY` | Bearer token for API server authentication. Enforced for non-loopback binding. |
 | `API_SERVER_CORS_ORIGINS` | Comma-separated browser origins allowed to call the API server directly (for example `http://localhost:3000,http://127.0.0.1:3000`). Default: disabled. |
 | `API_SERVER_PORT` | Port for the API server (default: `8642`) |
-| `API_SERVER_HOST` | Host/bind address for the API server (default: `127.0.0.1`). Use `0.0.0.0` for network access only with `API_SERVER_KEY` and a narrow `API_SERVER_CORS_ORIGINS` allowlist. |
+| `API_SERVER_HOST` | Host/bind address for the API server (default: `127.0.0.1`). Use `0.0.0.0` for network access — requires `API_SERVER_KEY` and a narrow `API_SERVER_CORS_ORIGINS` allowlist. |
 | `API_SERVER_MODEL_NAME` | Model name advertised on `/v1/models`. Defaults to the profile name (or `hermes-agent` for the default profile). Useful for multi-user setups where frontends like Open WebUI need distinct model names per connection. |
 | `MESSAGING_CWD` | Working directory for terminal commands in messaging mode (default: `~`) |
 | `GATEWAY_ALLOWED_USERS` | Comma-separated user IDs allowed across all platforms |
@@ -177,7 +177,7 @@ Authorization: Bearer ***
 Configure the key via `API_SERVER_KEY` env var. If you need a browser to call Hermes directly, also set `API_SERVER_CORS_ORIGINS` to an explicit allowlist.

 :::warning Security
-The API server gives full access to hermes-agent's toolset, **including terminal commands**. If you change the bind address to `0.0.0.0` (network-accessible), **always set `API_SERVER_KEY`** and keep `API_SERVER_CORS_ORIGINS` narrow — without that, remote callers may be able to execute arbitrary commands on your machine.
+The API server gives full access to hermes-agent's toolset, **including terminal commands**. When binding to a non-loopback address like `0.0.0.0`, `API_SERVER_KEY` is **required**. Also keep `API_SERVER_CORS_ORIGINS` narrow to control browser access.

 The default bind address (`127.0.0.1`) is for local-only use. Browser access is disabled by default; enable it only for explicit trusted origins.
 :::
@@ -16,7 +16,7 @@ Before setup, here's the part most people want to know: how Hermes behaves once

 | Context | Behavior |
 |---------|----------|
-| **DMs** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. |
+| **DMs** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. Set `MATRIX_DM_MENTION_THREADS=true` to start a thread when the bot is `@mentioned` in a DM. |
 | **Rooms** | By default, Hermes requires an `@mention` to respond. Set `MATRIX_REQUIRE_MENTION=false` or add room IDs to `MATRIX_FREE_RESPONSE_ROOMS` for free-response rooms. Room invites are auto-accepted. |
 | **Threads** | Hermes supports Matrix threads (MSC3440). If you reply in a thread, Hermes keeps the thread context isolated from the main room timeline. Threads where the bot has already participated do not require a mention. |
 | **Auto-threading** | By default, Hermes auto-creates a thread for each message it responds to in a room. This keeps conversations isolated. Set `MATRIX_AUTO_THREAD=false` to disable. |
@@ -62,6 +62,7 @@ matrix:
  free_response_rooms:            # Rooms exempt from mention requirement
    - "!abc123:matrix.org"
  auto_thread: true               # Auto-create threads for responses (default: true)
+  dm_mention_threads: false       # Create thread when @mentioned in DM (default: false)
 ```

 Or via environment variables:
@@ -70,6 +71,7 @@ Or via environment variables:
 MATRIX_REQUIRE_MENTION=true
 MATRIX_FREE_RESPONSE_ROOMS=!abc123:matrix.org,!def456:matrix.org
 MATRIX_AUTO_THREAD=true
+MATRIX_DM_MENTION_THREADS=false
 ```

 :::note
Author	SHA1	Message	Date
alt-glitch	611b89c2a7	feat(nix): container-aware CLI — auto-route hermes chat into managed container When container.enable = true in the NixOS module, running 'hermes chat' on the host now automatically execs into the managed container via docker/podman exec. This means the interactive CLI runs in the same environment as the gateway service, with access to all container-installed packages and tools. Implementation: - NixOS activation script writes .container-mode metadata file to HERMES_HOME with backend, container_name, and hermes_bin path - File is removed when container mode is disabled (nixos-rebuild switch) - hermes_cli/config.py: _is_inside_container() detects Docker/Podman indicators (/.dockerenv, /run/.containerenv, cgroup) - hermes_cli/config.py: get_container_exec_info() reads .container-mode metadata, returns None when already inside a container - hermes_cli/main.py: _exec_in_container() validates the container is running, then os.execvp() replaces the process with the container exec - cmd_chat intercepts before normal flow, checks container info, execs Safety: - --host flag bypasses container routing (run on host regardless) - Falls back to host CLI if: container runtime not found, container not running, inspect fails, or any detection error - Strips --host from forwarded args (not meaningful inside container) - Already-inside-container detection prevents infinite exec loops Closes #7380	2026-04-11 06:15:44 +05:30
Siddharth Balyan	9a0c44f908	fix(nix): gate matrix extra to Linux in [all] profile (#7461) * fix(nix): gate matrix extra to Linux in [all] profile matrix-nio[e2e] depends on python-olm which is upstream-broken on modern macOS (Clang 21+, archived libolm). Previously the [matrix] extra was completely excluded from [all], meaning NixOS users (who install via [all]) had no Matrix support at all. Add a sys_platform == 'linux' marker so [all] pulls in [matrix] on Linux (where python-olm builds fine) while still skipping it on macOS. This fixes the NixOS setup path without breaking macOS installs. Update the regression test to verify the Linux-gated marker is present rather than just checking matrix is absent from [all]. Fixes #4594 * chore: regenerate uv.lock with matrix-on-linux in [all]	2026-04-11 05:59:56 +05:30
Teknium	baddb6f717	fix(gateway): derive channel directory platforms from enum instead of hardcoded list (#7450) Six platforms (matrix, mattermost, dingtalk, feishu, wecom, homeassistant) were missing from the session-based discovery loop, causing /channels and send_message to return empty results on those platforms. Instead of adding them to the hardcoded tuple (which would break again when new platforms are added), derive the list dynamically from the Platform enum. Only infrastructure entries (local, api_server, webhook) are excluded; Discord and Slack are skipped automatically because their direct builders already populate the platforms dict. Reported by sprmn24 in PR #7416.	2026-04-10 17:27:32 -07:00
0xFrank-eth	e8034e2f6a	fix(gateway): replace os.environ session state with contextvars for concurrency safety When two gateway messages arrived concurrently, _set_session_env wrote HERMES_SESSION_PLATFORM/CHAT_ID/CHAT_NAME/THREAD_ID into the process-global os.environ. Because asyncio tasks share the same process, Message B would overwrite Message A's values mid-flight, causing background-task notifications and tool calls to route to the wrong thread/chat. Replace os.environ with Python's contextvars.ContextVar. Each asyncio task (and any run_in_executor thread it spawns) gets its own copy, so concurrent messages never interfere. Changes: - New gateway/session_context.py with ContextVar definitions, set/clear/get helpers, and os.environ fallback for CLI/cron/test backward compatibility - gateway/run.py: _set_session_env returns reset tokens, _clear_session_env accepts them for proper cleanup in finally blocks - All tool consumers updated: cronjob_tools, send_message_tool, skills_tool, terminal_tool (both notify_on_complete AND check_interval blocks), tts_tool, agent/skill_utils, agent/prompt_builder - Tests updated for new contextvar-based API Fixes #7358 Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-04-10 17:04:38 -07:00
Dylan Socolobsky	dab5ec8245	test(e2e): add Slack to parametrized e2e platform tests	2026-04-10 16:51:44 -07:00
Dylan Socolobsky	79565630b0	refactor(e2e): unify Telegram and Discord e2e tests into parametrized platform fixtures	2026-04-10 16:51:44 -07:00
Dylan Socolobsky	7033dbf5d6	test(e2e): add Discord e2e integration tests	2026-04-10 16:51:44 -07:00
pefontana	9555a0cf31	fix(gateway): look up expired agents in _agent_cache, add global kill_all Two fixes from PR review: 1. Session expiry was looking in _running_agents for the cached agent, but idle expired sessions live in _agent_cache. Now checks _agent_cache first, falls back to _running_agents. 2. Global cleanup in stop() was missing process_registry.kill_all(), so background processes from agents evicted without close() (branch, fallback) survived shutdown.	2026-04-10 16:51:44 -07:00
pefontana	f00dd3169f	fix(gateway): guard _agent_cache_lock access in reset handler Use getattr guard for _agent_cache_lock in _handle_reset_command because test fixtures may create GatewayRunner without calling __init__, leaving the attribute unset. Fixes e2e test failure: test_new_resets_session, test_new_then_status_reflects_reset, test_new_is_idempotent.	2026-04-10 16:51:44 -07:00
pefontana	8414f41856	test: add zombie process cleanup tests Add 9 tests covering the full zombie process prevention chain: - TestZombieReproduction: demonstrates that processes survive when references are dropped without explicit cleanup (the original bug) - TestAgentCloseMethod: verifies close() calls all cleanup functions, is idempotent, propagates to children, and continues cleanup even when individual steps fail - TestGatewayCleanupWiring: verifies stop() calls close() and that _evict_cached_agent() does NOT call close() (since it's also used for non-destructive cache refreshes) - TestDelegationCleanup: calls the real _run_single_child function and verifies close() is called on the child agent Ref: #7131	2026-04-10 16:51:44 -07:00
pefontana	672cc80915	fix(delegate): close child agent after delegation completes Call child.close() in the _run_single_child finally block after unregistering the child from the parent's active children list. Previously child AIAgent instances were only removed from the tracking list but never had their resources released — the OpenAI/httpx client and any tool subprocesses relied entirely on garbage collection. Ref: #7131	2026-04-10 16:51:44 -07:00
pefontana	fbe28352e4	fix(gateway): call agent.close() on session end to prevent zombies Wire AIAgent.close() into every gateway code path where an agent's session is actually ending: - stop(): close all running agents after interrupt + memory shutdown, then call cleanup_all_environments() and cleanup_all_browsers() as a global catch-all - _session_expiry_watcher(): close agents when sessions expire after the 5-minute idle timeout - _handle_reset_command(): close the old agent before evicting it from cache on /new or /reset Note: _evict_cached_agent() intentionally does NOT call close() because it is also used for non-destructive cache refreshes (model switch, branch, fallback) where tool resources should persist. Ref: #7131	2026-04-10 16:51:44 -07:00
pefontana	5b42aecfa7	feat(agent): add AIAgent.close() for subprocess cleanup Add a close() method to AIAgent that acts as a single entry point for releasing all resources held by an agent instance. This prevents zombie process accumulation on long-running gateway deployments by explicitly cleaning up: - Background processes tracked in ProcessRegistry - Terminal sandbox environments - Browser daemon sessions - Active child agents (subagent delegation) - OpenAI/httpx client connections Each cleanup step is independently guarded so a failure in one does not prevent the rest. The method is idempotent and safe to call multiple times. Also simplifies the background review cleanup to use close() instead of manually closing the OpenAI client. Ref: #7131	2026-04-10 16:51:44 -07:00
entropidelic	989b950fbc	fix(security): enforce API_SERVER_KEY for non-loopback binding Add is_network_accessible() helper using Python's ipaddress module to robustly classify bind addresses (IPv4/IPv6 loopback, wildcards, mapped addresses, hostname resolution with DNS-failure-fails-closed). The API server connect() now refuses to start when the bind address is network-accessible and no API_SERVER_KEY is set, preventing RCE from other machines on the network. Co-authored-by: entropidelic <entropidelic@users.noreply.github.com>	2026-04-10 16:51:44 -07:00
Devorun	2a6cbf52d0	fix(cron): prevent silent data loss by raising exceptions on unrecoverable jobs.json read failures (#6797)	2026-04-10 16:51:35 -07:00
coffee	c5ab760528	fix(cron): missing field init, unnecessary save, and shutdown cleanup 1. Add missing `last_delivery_error` field initialization in `create_job()`. `mark_job_run()` sets this field on line 596 but it was never initialized, causing inconsistent job schemas between new and executed jobs. 2. Replace unnecessary `save_jobs()` call with a warning log when `mark_job_run()` is called with a non-existent job_id. Previously the function would silently write unchanged data to disk. 3. Add `cancel_futures=True` to the `finally` block in cron scheduler's thread pool shutdown. The `except` path already passes this flag but the normal exit path did not, leaving futures running after inactivity timeout detection.	2026-04-10 16:51:35 -07:00
Teknium	a4fc38c5b1	test: remove dead TestResolveForcedProvider tests (function doesn't exist on main)	2026-04-10 16:47:44 -07:00
KUSH42	0e939af7c2	fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs - Bug 1: replace read_file(limit=10000) with read_file_raw in _apply_update, preventing silent truncation of files >2000 lines and corruption of lines >2000 chars; add read_file_raw to FileOperations abstract interface and ShellFileOperations - Bug 2: split apply_v4a_operations into validate-then-apply phases; if any hunk fails validation, zero writes occur (was: continue after failure, leaving filesystem partially modified) - Bug 3: parse_v4a_patch now returns an error for begin-marker-with-no-ops, empty file paths, and moves missing a destination (was: always returned error=None) - Bug 4: raise strategy 7 (block anchor) single-candidate similarity threshold from 0.10 to 0.50, eliminating false-positive matches in repetitive code - Bug 5: add _strategy_unicode_normalized (new strategy 7) with position mapping via _build_orig_to_norm_map; smart quotes and em-dashes in LLM-generated patches now match via strategies 1-6 before falling through to fuzzy strategies - Bug 6: extend fuzzy_find_and_replace to return 4-tuple (content, count, error, strategy); update all 5 call sites across patch_parser.py, file_operations.py, and skill_manager_tool.py - Bug 7: guard in _apply_update returns error when addition-only context hint is ambiguous (>1 occurrences); validation phase errors on both 0 and >1 - Bug 8: _apply_delete returns error (not silent success) on missing file - Bug 9: _validate_operations checks source existence and destination absence for MOVE operations before any write occurs	2026-04-10 16:47:44 -07:00
Billard	475cbce775	fix(aux): honor api_mode for custom auxiliary endpoints	2026-04-10 16:47:44 -07:00
coffee	c1f832a610	fix(tools): guard against ValueError on int() env var and header parsing Three locations perform `int()` conversion on environment variables or HTTP headers without error handling, causing unhandled `ValueError` crashes when the values are non-numeric: 1. `send_message_tool.py` — `EMAIL_SMTP_PORT` env var parsed outside the try/except block; a non-numeric value crashes `_send_email()` instead of returning a user-friendly error. 2. `process_registry.py` — `TERMINAL_TIMEOUT` env var parsed without protection; a non-numeric value crashes the `wait()` method. 3. `skills_hub.py` — HTTP `Retry-After` header can contain date strings per RFC 7231; `int()` conversion crashes on non-numeric values. All three now fall back to their default values on `ValueError`/`TypeError`.	2026-04-10 16:47:44 -07:00
Awsh1	6f63ba9c8f	fix(mcp): fall back when SIGKILL is unavailable	2026-04-10 16:47:44 -07:00
Fran Fitzpatrick	3e24ba1656	feat(matrix): add MATRIX_DM_MENTION_THREADS env var When enabled, @mentioning the bot in a DM creates a thread (default: false). Supports both env var and YAML config (matrix.dm_mention_threads). 6 new tests, docs updated. From #6957	2026-04-10 15:46:20 -07:00
buray	d8cd7974d8	fix(feishu): register group chat member event handlers Bot-added and bot-removed events were silently dropped because _on_bot_added_to_chat and _on_bot_removed_from_chat were not registered in _build_event_handler(). From #6975	2026-04-10 15:46:20 -07:00
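
Commit 989b950fbc in the list above describes an `is_network_accessible()` helper built on Python's `ipaddress` module, with the API server refusing to start on a network-accessible bind address when `API_SERVER_KEY` is unset. The sketch below is one way such a check could look. It is not the code from that commit: the function bodies, the `_is_loopback` and `require_api_key` names, and the error message are all assumptions made for illustration.

```python
import ipaddress
import socket
from typing import Optional


def _is_loopback(addr) -> bool:
    # Hypothetical helper: unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1)
    # so the embedded IPv4 address is what gets classified.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_loopback


def is_network_accessible(host: str) -> bool:
    """Return True unless `host` can be shown to be loopback-only."""
    try:
        # IP literal: wildcards such as 0.0.0.0 and :: are not loopback.
        return not _is_loopback(ipaddress.ip_address(host))
    except ValueError:
        pass  # Not an IP literal, so treat it as a hostname.
    try:
        infos = socket.getaddrinfo(host, None)
    except (OSError, UnicodeError):
        return True  # DNS failure fails closed: assume reachable.
    try:
        # Drop any IPv6 scope ID ("%eth0") before parsing.
        resolved = [ipaddress.ip_address(i[4][0].split("%")[0]) for i in infos]
    except ValueError:
        return True
    return not resolved or any(not _is_loopback(a) for a in resolved)


def require_api_key(host: str, api_key: Optional[str]) -> None:
    # Hypothetical startup guard mirroring the behavior the commit describes.
    if is_network_accessible(host) and not api_key:
        raise RuntimeError(
            f"API_SERVER_KEY must be set when binding to {host!r}"
        )
```

Under this sketch, `require_api_key("127.0.0.1", None)` returns without error, while `require_api_key("0.0.0.0", None)` raises. Treating every unresolved or unparsable host as reachable keeps the check conservative, at the cost of demanding a key in some local-only setups.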