fix(cron): pass skip_context_files=True to AIAgent in run_job

Cron jobs run from whatever directory the scheduler process lives in (typically the hermes-agent install dir). Without this flag, the agent picks up AGENTS.md, SOUL.md, or .cursorrules from that cwd and injects irrelevant project context into the cron job's system prompt.

batch_runner.py and gateway boot_md already pass skip_context_files=True for the same reason. This aligns cron with the established pattern for autonomous/headless agent runs.
fix(whatsapp): pin Baileys to fix/abprops-abt-fetch for bad-request fix
2026-04-11 14:48:11 -07:00 · 2026-04-11 14:03:37 -07:00 · 2026-04-11 14:02:58 -07:00 · 2026-04-11 14:02:46 -07:00 · 2026-04-11 14:02:33 -07:00 · 2026-04-11 13:59:52 -07:00
102 changed files with 3859 additions and 1303 deletions
@@ -89,6 +89,15 @@
 # Optional base URL override:
 # HERMES_QWEN_BASE_URL=https://portal.qwen.ai/v1

+# =============================================================================
+# LLM PROVIDER (Xiaomi MiMo)
+# =============================================================================
+# Xiaomi MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash).
+# Get your key at: https://platform.xiaomimimo.com
+# XIAOMI_API_KEY=your_key_here
+# Optional base URL override:
+# XIAOMI_BASE_URL=https://api.xiaomimimo.com/v1
+
 # =============================================================================
 # TOOL API KEYS
 # =============================================================================
@@ -23,17 +23,13 @@ Resolution order for vision/multimodal tasks (auto mode):
  6. Custom endpoint (for local vision models: Qwen-VL, LLaVA, Pixtral, etc.)
  7. None

-Per-task provider overrides (e.g. AUXILIARY_VISION_PROVIDER,
-CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task.
+Per-task overrides are configured in config.yaml under the ``auxiliary:`` section
+(e.g. ``auxiliary.vision.provider``, ``auxiliary.compression.model``).
 Default "auto" follows the chains above.

-Per-task model overrides (e.g. AUXILIARY_VISION_MODEL,
-AUXILIARY_WEB_EXTRACT_MODEL) let callers use a different model slug
-than the provider's default.
-
-Per-task direct endpoint overrides (e.g. AUXILIARY_VISION_BASE_URL,
-AUXILIARY_VISION_API_KEY) let callers route a specific auxiliary task to a
-custom OpenAI-compatible endpoint without touching the main model settings.
+Legacy env var overrides (AUXILIARY_{TASK}_PROVIDER, AUXILIARY_{TASK}_MODEL,
+AUXILIARY_{TASK}_BASE_URL, etc.) are still read as a backward-compat fallback
+but config.yaml takes priority.  New configuration should always use config.yaml.

 Payment / credit exhaustion fallback:
  When a resolved provider returns HTTP 402 or a credit-related error,
@@ -111,6 +107,14 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
    "kilocode": "google/gemini-3-flash-preview",
 }

+# Vision-specific model overrides for direct providers.
+# When the user's main provider has a dedicated vision/multimodal model that
+# differs from their main chat model, map it here.  The vision auto-detect
+# "exotic provider" branch checks this before falling back to the main model.
+_PROVIDER_VISION_MODELS: Dict[str, str] = {
+    "xiaomi": "mimo-v2-omni",
+}
+
 # OpenRouter app attribution headers
 _OR_HEADERS = {
    "HTTP-Referer": "https://hermes-agent.nousresearch.com",
@@ -1687,16 +1691,18 @@ def resolve_vision_provider_client(
                if sync_client is not None:
                    return _finalize(main_provider, sync_client, default_model)
            else:
-                # Exotic provider (DeepSeek, Alibaba, named custom, etc.)
+                # Exotic provider (DeepSeek, Alibaba, Xiaomi, named custom, etc.)
+                # Use provider-specific vision model if available, otherwise main model.
+                vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
                rpc_client, rpc_model = resolve_provider_client(
-                    main_provider, main_model)
+                    main_provider, vision_model)
                if rpc_client is not None:
                    logger.info(
                        "Vision auto-detect: using active provider %s (%s)",
-                        main_provider, rpc_model or main_model,
+                        main_provider, rpc_model or vision_model,
                    )
                    return _finalize(
-                        main_provider, rpc_client, rpc_model or main_model)
+                        main_provider, rpc_client, rpc_model or vision_model)

        # Fall back through aggregators.
        for candidate in _VISION_AUTO_PROVIDER_ORDER:
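
The exotic-provider change above reduces to a dictionary lookup with a fallback to the main model. A minimal sketch, assuming only the `_PROVIDER_VISION_MODELS` table from this diff; `pick_vision_model` is a hypothetical helper name used for illustration, not a function in the codebase:

```python
from typing import Dict

# Mirrors the table added in this diff.
_PROVIDER_VISION_MODELS: Dict[str, str] = {
    "xiaomi": "mimo-v2-omni",
}


def pick_vision_model(main_provider: str, main_model: str) -> str:
    """Return the provider's dedicated vision model, else the main model."""
    return _PROVIDER_VISION_MODELS.get(main_provider, main_model)


print(pick_vision_model("xiaomi", "mimo-v2-pro"))      # mimo-v2-omni
print(pick_vision_model("deepseek", "deepseek-chat"))  # deepseek-chat
```

Providers without an entry keep their previous behavior, so the table only needs rows for providers whose chat and vision models differ.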
@@ -1958,8 +1964,8 @@ def _resolve_task_provider_model(

    Priority:
      1. Explicit provider/model/base_url/api_key args (always win)
-      2. Env var overrides (AUXILIARY_{TASK}_*, CONTEXT_{TASK}_*)
-      3. Config file (auxiliary.{task}.* or compression.*)
+      2. Config file (auxiliary.{task}.* or compression.*)
+      3. Env var overrides (backward-compat: AUXILIARY_{TASK}_*, CONTEXT_{TASK}_*)
      4. "auto" (full auto-detection chain)

    Returns (provider, model, base_url, api_key, api_mode) where model may
@@ -2002,10 +2008,11 @@ def _resolve_task_provider_model(
                _sbu = comp.get("summary_base_url") or ""
                cfg_base_url = cfg_base_url or _sbu.strip() or None

+    # Env vars are backward-compat fallback only — config.yaml is primary.
    env_model = _get_auxiliary_env_override(task, "MODEL") if task else None
    env_api_mode = _get_auxiliary_env_override(task, "API_MODE") if task else None
-    resolved_model = model or env_model or cfg_model
-    resolved_api_mode = env_api_mode or cfg_api_mode
+    resolved_model = model or cfg_model or env_model
+    resolved_api_mode = cfg_api_mode or env_api_mode

    if base_url:
        return "custom", resolved_model, base_url, api_key, resolved_api_mode
@@ -2013,19 +2020,23 @@ def _resolve_task_provider_model(
        return provider, resolved_model, base_url, api_key, resolved_api_mode

    if task:
+        # Config.yaml is the primary source for per-task overrides.
+        if cfg_base_url:
+            return "custom", resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
+        if cfg_provider and cfg_provider != "auto":
+            return cfg_provider, resolved_model, None, None, resolved_api_mode
+
+        # Env vars are backward-compat fallback for users who haven't
+        # migrated to config.yaml yet.
        env_base_url = _get_auxiliary_env_override(task, "BASE_URL")
        env_api_key = _get_auxiliary_env_override(task, "API_KEY")
        if env_base_url:
-            return "custom", resolved_model, env_base_url, env_api_key or cfg_api_key, resolved_api_mode
+            return "custom", resolved_model, env_base_url, env_api_key, resolved_api_mode

        env_provider = _get_auxiliary_provider(task)
        if env_provider != "auto":
            return env_provider, resolved_model, None, None, resolved_api_mode

-        if cfg_base_url:
-            return "custom", resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
-        if cfg_provider and cfg_provider != "auto":
-            return cfg_provider, resolved_model, None, None, resolved_api_mode
        return "auto", resolved_model, None, None, resolved_api_mode

    return "auto", resolved_model, None, None, resolved_api_mode
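
The reordered priority (explicit args, then config.yaml, then legacy env vars, then "auto") can be sketched as a first-match scan. This is an illustration only: `resolve_override` and its `(value, source)` return shape are hypothetical and much simpler than `_resolve_task_provider_model`, which also handles base URLs, API keys, and API mode.

```python
from typing import Optional, Tuple


def resolve_override(
    explicit: Optional[str],
    config_value: Optional[str],
    env_value: Optional[str],
) -> Tuple[str, str]:
    """Pick the first usable value in priority order.

    Returns (value, source) so the caller can see which layer won.
    An empty value or the literal "auto" means "not set at this layer".
    """
    for value, source in (
        (explicit, "explicit"),
        (config_value, "config"),
        (env_value, "env"),
    ):
        if value and value != "auto":
            return value, source
    return "auto", "auto"


# config.yaml now beats a legacy env var when both are set.
print(resolve_override(None, "openrouter", "nous"))
# The env var still applies when config.yaml is silent.
print(resolve_override(None, None, "nous"))
```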
@@ -4,7 +4,6 @@ Pure display functions and classes with no AIAgent dependency.
 Used by AIAgent._execute_tool_calls for CLI feedback.
 """

-import json
 import logging
 import os
 import sys
@@ -14,6 +13,8 @@ from dataclasses import dataclass, field
 from difflib import unified_diff
 from pathlib import Path

+from utils import safe_json_loads
+
 # ANSI escape codes for coloring tool failure indicators
 _RED = "\033[31m"
 _RESET = "\033[0m"
@@ -372,9 +373,8 @@ def _result_succeeded(result: str | None) -> bool:
    """Conservatively detect whether a tool result represents success."""
    if not result:
        return False
-    try:
-        data = json.loads(result)
-    except (json.JSONDecodeError, TypeError):
+    data = safe_json_loads(result)
+    if data is None:
        return False
    if not isinstance(data, dict):
        return False
@@ -423,10 +423,7 @@ def extract_edit_diff(
 ) -> str | None:
    """Extract a unified diff from a file-edit tool result."""
    if tool_name == "patch" and result:
-        try:
-            data = json.loads(result)
-        except (json.JSONDecodeError, TypeError):
-            data = None
+        data = safe_json_loads(result)
        if isinstance(data, dict):
            diff = data.get("diff")
            if isinstance(diff, str) and diff.strip():
@@ -780,23 +777,19 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
        return False, ""

    if tool_name == "terminal":
-        try:
-            data = json.loads(result)
+        data = safe_json_loads(result)
+        if isinstance(data, dict):
            exit_code = data.get("exit_code")
            if exit_code is not None and exit_code != 0:
                return True, f" [exit {exit_code}]"
-        except (json.JSONDecodeError, TypeError, AttributeError):
-            logger.debug("Could not parse terminal result as JSON for exit code check")
        return False, ""

    # Memory-specific: distinguish "full" from real errors
    if tool_name == "memory":
-        try:
-            data = json.loads(result)
+        data = safe_json_loads(result)
+        if isinstance(data, dict):
            if data.get("success") is False and "exceed the limit" in data.get("error", ""):
                return True, " [full]"
-        except (json.JSONDecodeError, TypeError, AttributeError):
-            logger.debug("Could not parse memory result as JSON for capacity check")

    # Generic heuristic for non-terminal tools
    lower = result[:500].lower()
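
The hunks above replace repeated `try`/`except` blocks with `safe_json_loads` from `utils`, whose body is not part of this diff. A plausible sketch, assuming it returns `None` on any parse failure (which is what the `if data is None` and `isinstance(data, dict)` checks at the call sites rely on):

```python
import json
from typing import Any


def safe_json_loads(text: Any, default: Any = None) -> Any:
    """Parse JSON, returning `default` instead of raising on bad input.

    Assumption: the real helper may differ in signature; the `default`
    parameter here is illustrative.
    """
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError, ValueError):
        return default


data = safe_json_loads('{"exit_code": 2}')
if isinstance(data, dict) and data.get("exit_code"):
    print("failed with exit", data["exit_code"])
```

Because valid JSON such as `null` or `[1, 2]` also parses without error, callers still need the `isinstance(data, dict)` guard before calling `.get()`, as the rewritten hunks do.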
@@ -27,12 +27,14 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
    "gemini", "zai", "kimi-coding", "minimax", "minimax-cn", "anthropic", "deepseek",
    "opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
    "qwen-oauth",
+    "xiaomi",
    "custom", "local",
    # Common aliases
    "google", "google-gemini", "google-ai-studio",
    "glm", "z-ai", "z.ai", "zhipu", "github", "github-copilot",
    "github-models", "kimi", "moonshot", "claude", "deep-seek",
    "opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
+    "mimo", "xiaomi-mimo",
    "qwen-portal",
 })

@@ -149,9 +151,10 @@ DEFAULT_CONTEXT_LENGTHS = {
    "moonshotai/Kimi-K2.5": 262144,
    "moonshotai/Kimi-K2-Thinking": 262144,
    "MiniMaxAI/MiniMax-M2.5": 204800,
-    "XiaomiMiMo/MiMo-V2-Flash": 32768,
-    "mimo-v2-pro": 1048576,
-    "mimo-v2-omni": 1048576,
+    "XiaomiMiMo/MiMo-V2-Flash": 256000,
+    "mimo-v2-pro": 1000000,
+    "mimo-v2-omni": 256000,
+    "mimo-v2-flash": 256000,
    "zai-org/GLM-5": 202752,
 }

@@ -211,6 +214,8 @@ _URL_TO_PROVIDER: Dict[str, str] = {
    "api.fireworks.ai": "fireworks",
    "opencode.ai": "opencode-go",
    "api.x.ai": "xai",
+    "api.xiaomimimo.com": "xiaomi",
+    "xiaomimimo.com": "xiaomi",
 }


@@ -161,6 +161,7 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
    "gemini": "google",
    "google": "google",
    "xai": "xai",
+    "xiaomi": "xiaomi",
    "nvidia": "nvidia",
    "groq": "groq",
    "mistral": "mistral",
@@ -383,7 +384,14 @@ def get_model_capabilities(provider: str, model: str) -> Optional[ModelCapabilit

    # Extract capability flags (default to False if missing)
    supports_tools = bool(entry.get("tool_call", False))
-    supports_vision = bool(entry.get("attachment", False))
+    # Vision: check both the `attachment` flag and `modalities.input` for "image".
+    # Some models (e.g. gemma-4) list image in input modalities but not attachment.
+    input_mods = entry.get("modalities", {})
+    if isinstance(input_mods, dict):
+        input_mods = input_mods.get("input", [])
+    else:
+        input_mods = []
+    supports_vision = bool(entry.get("attachment", False)) or "image" in input_mods
    supports_reasoning = bool(entry.get("reasoning", False))

    # Extract limits
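
The new vision check can be isolated into a small function to see how it treats different registry entries. A sketch, where `supports_vision` is a hypothetical standalone name; it additionally treats a non-list `input` value as empty, which is an assumption about malformed entries rather than something the diff does:

```python
from typing import Any, Dict


def supports_vision(entry: Dict[str, Any]) -> bool:
    """True if `attachment` is set or "image" is listed as an input modality."""
    modalities = entry.get("modalities")
    input_mods = modalities.get("input") if isinstance(modalities, dict) else None
    if not isinstance(input_mods, (list, tuple)):
        input_mods = []
    return bool(entry.get("attachment", False)) or "image" in input_mods


print(supports_vision({"attachment": True}))                          # True
print(supports_vision({"modalities": {"input": ["text", "image"]}}))  # True
print(supports_vision({"modalities": {"input": ["text"]}}))           # False
```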
@@ -12,7 +12,7 @@ import threading
 from collections import OrderedDict
 from pathlib import Path

-from hermes_constants import get_hermes_home
+from hermes_constants import get_hermes_home, get_skills_dir
 from typing import Optional

 from agent.skill_utils import (
@@ -548,8 +548,7 @@ def build_skills_system_prompt(
    are read-only — they appear in the index but new skills are always created
    in the local dir.  Local skills take precedence when names collide.
    """
-    hermes_home = get_hermes_home()
-    skills_dir = hermes_home / "skills"
+    skills_dir = get_skills_dir()
    external_dirs = get_all_skills_dirs()[1:]  # skip local (index 0)

    if not skills_dir.exists() and not external_dirs:
@@ -12,7 +12,7 @@ import sys
 from pathlib import Path
 from typing import Any, Dict, List, Set, Tuple

-from hermes_constants import get_hermes_home
+from hermes_constants import get_config_path, get_skills_dir

 logger = logging.getLogger(__name__)

@@ -130,7 +130,7 @@ def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
    Reads the config file directly (no CLI config imports) to stay
    lightweight.
    """
-    config_path = get_hermes_home() / "config.yaml"
+    config_path = get_config_path()
    if not config_path.exists():
        return set()
    try:
@@ -178,7 +178,7 @@ def get_external_skills_dirs() -> List[Path]:
    path.  Only directories that actually exist are returned.  Duplicates and
    paths that resolve to the local ``~/.hermes/skills/`` are silently skipped.
    """
-    config_path = get_hermes_home() / "config.yaml"
+    config_path = get_config_path()
    if not config_path.exists():
        return []
    try:
@@ -200,7 +200,7 @@ def get_external_skills_dirs() -> List[Path]:
    if not isinstance(raw_dirs, list):
        return []

-    local_skills = (get_hermes_home() / "skills").resolve()
+    local_skills = get_skills_dir().resolve()
    seen: Set[Path] = set()
    result: List[Path] = []

@@ -230,7 +230,7 @@ def get_all_skills_dirs() -> List[Path]:
    The local dir is always first (and always included even if it doesn't exist
    yet — callers handle that).  External dirs follow in config order.
    """
-    dirs = [get_hermes_home() / "skills"]
+    dirs = [get_skills_dir()]
    dirs.extend(get_external_skills_dirs())
    return dirs

@@ -384,7 +384,7 @@ def resolve_skill_config_values(
    current values (or the declared default if the key isn't set).
    Path values are expanded via ``os.path.expanduser``.
    """
-    config_path = get_hermes_home() / "config.yaml"
+    config_path = get_config_path()
    config: Dict[str, Any] = {}
    if config_path.exists():
        try:
@@ -24,6 +24,7 @@ model:
  #   "minimax"      - MiniMax global (requires: MINIMAX_API_KEY)
  #   "minimax-cn"   - MiniMax China (requires: MINIMAX_CN_API_KEY)
  #   "huggingface"  - Hugging Face Inference (requires: HF_TOKEN)
+  #   "xiaomi"       - Xiaomi MiMo (requires: XIAOMI_API_KEY)
  #   "kilocode"     - KiloCode gateway (requires: KILOCODE_API_KEY)
  #   "ai-gateway"   - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
  #
@@ -722,6 +722,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            provider_sort=pr.get("sort"),
            disabled_toolsets=["cronjob", "messaging", "clarify"],
            quiet_mode=True,
+            skip_context_files=True,  # Don't inject SOUL.md/AGENTS.md from scheduler cwd
            skip_memory=True,  # Cron system prompts would corrupt user representations
            platform="cron",
            session_id=_cron_session_id,
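
To illustrate why the flag matters, here is a rough sketch of cwd-based context discovery. The file names come from the commit message; `find_context_files` and its logic are hypothetical and are not how `AIAgent` is known to implement discovery.

```python
from pathlib import Path
from typing import List

# File names taken from the commit message; the discovery logic is hypothetical.
CONTEXT_FILE_NAMES = ("AGENTS.md", "SOUL.md", ".cursorrules")


def find_context_files(cwd: Path, skip_context_files: bool = False) -> List[Path]:
    """Return context files found directly in cwd, or nothing when skipped."""
    if skip_context_files:
        return []
    return [cwd / name for name in CONTEXT_FILE_NAMES if (cwd / name).is_file()]


# A scheduler started from a project checkout would see that project's files
# unless the flag is set.
print(find_context_files(Path.cwd(), skip_context_files=True))  # []
```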
@@ -11,12 +11,14 @@ When you run `hermes setup` for the first time and Hermes detects `~/.openclaw`,
 ### 2. CLI Command (quick, scriptable)

 ```bash
-hermes claw migrate                      # Full migration with confirmation prompt
-hermes claw migrate --dry-run            # Preview what would happen
+hermes claw migrate                      # Preview then migrate (always shows preview first)
+hermes claw migrate --dry-run            # Preview only, no changes
 hermes claw migrate --preset user-data   # Migrate without API keys/secrets
 hermes claw migrate --yes                # Skip confirmation prompt
 ```

+The migration always shows a full preview of what will be imported before making any changes. You review the preview and confirm before anything is written.
+
 **All options:**

 | Flag | Description |
@@ -39,7 +41,7 @@ Ask the agent to run the migration for you:
 ```

 The agent will use the `openclaw-migration` skill to:
-1. Run a dry-run first to preview changes
+1. Run a preview first to show what would change
 2. Ask about conflict resolution (SOUL.md, skills, etc.)
 3. Let you choose between `user-data` and `full` presets
 4. Execute the migration with your choices
@@ -58,16 +60,31 @@ The agent will use the `openclaw-migration` skill to:
 | Messaging settings | `~/.openclaw/config.yaml` (TELEGRAM_ALLOWED_USERS, MESSAGING_CWD) | `~/.hermes/.env` |
 | TTS assets | `~/.openclaw/workspace/tts/` | `~/.hermes/tts/` |

+Workspace files are also checked at `workspace.default/` and `workspace-main/` as fallback paths (OpenClaw renamed `workspace/` to `workspace-main/` in recent versions).
+
 ### `full` preset (adds to `user-data`)
 | Item | Source | Destination |
 |------|--------|-------------|
-| Telegram bot token | `~/.openclaw/config.yaml` | `~/.hermes/.env` |
-| OpenRouter API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
-| OpenAI API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
-| Anthropic API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
-| ElevenLabs API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
+| Telegram bot token | `openclaw.json` channels config | `~/.hermes/.env` |
+| OpenRouter API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
+| OpenAI API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
+| Anthropic API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
+| ElevenLabs API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |

-Only these 6 allowlisted secrets are ever imported. Other credentials are skipped and reported.
+API keys are searched across four sources: inline config values, `~/.openclaw/.env`, the `openclaw.json` `"env"` sub-object, and per-agent auth profiles.
+
+Only allowlisted secrets are ever imported. Other credentials are skipped and reported.
+
+## OpenClaw Schema Compatibility
+
+The migration handles both old and current OpenClaw config layouts:
+
+- **Channel tokens**: Reads from flat paths (`channels.telegram.botToken`) and the newer `accounts.default` layout (`channels.telegram.accounts.default.botToken`)
+- **TTS provider**: OpenClaw renamed "edge" to "microsoft" — both are recognized and mapped to Hermes' "edge"
+- **Provider API types**: Both short (`openai`, `anthropic`) and hyphenated (`openai-completions`, `anthropic-messages`, `google-generative-ai`) values are mapped correctly
+- **thinkingDefault**: All enum values are handled including newer ones (`minimal`, `xhigh`, `adaptive`)
+- **Matrix**: Uses `accessToken` field (not `botToken`)
+- **SecretRef formats**: Plain strings, env templates (`${VAR}`), and `source: "env"` SecretRefs are resolved. `source: "file"` and `source: "exec"` SecretRefs produce a warning — add those keys manually after migration.

 ## Conflict Handling

@@ -84,18 +101,24 @@ For skills, you can also use `--skill-conflict rename` to import conflicting ski

 ## Migration Report

-Every migration (including dry runs) produces a report showing:
+Every migration produces a report showing:
 - **Migrated items** — what was successfully imported
 - **Conflicts** — items skipped because they already exist
 - **Skipped items** — items not found in the source
 - **Errors** — items that failed to import

-For execute runs, the full report is saved to `~/.hermes/migration/openclaw/<timestamp>/`.
+For executed migrations, the full report is saved to `~/.hermes/migration/openclaw/<timestamp>/`.
+
+## Post-Migration Notes
+
+- **Skills require a new session** — imported skills take effect after restarting your agent or starting a new chat.
+- **WhatsApp requires re-pairing** — WhatsApp uses QR-code pairing, not token-based auth. Run `hermes whatsapp` to pair.
+- **Archive cleanup** — after migration, you'll be offered to rename `~/.openclaw/` to `.openclaw.pre-migration/` to prevent state confusion. You can also run `hermes claw cleanup` later.

 ## Troubleshooting

 ### "OpenClaw directory not found"
-The migration looks for `~/.openclaw` by default. If your OpenClaw is installed elsewhere, use `--source`:
+The migration looks for `~/.openclaw` by default, then tries `~/.clawdbot` and `~/.moldbot`. If your OpenClaw is installed elsewhere, use `--source`:
 ```bash
 hermes claw migrate --source /path/to/.openclaw
 ```
@@ -108,3 +131,12 @@ hermes skills install openclaw-migration

 ### Memory overflow
 If your OpenClaw MEMORY.md or USER.md exceeds Hermes' character limits, excess entries are exported to an overflow file in the migration report directory. You can manually review and add the most important ones.
+
+### API keys not found
+Keys might be stored in different places depending on your OpenClaw setup:
+- `~/.openclaw/.env` file
+- Inline in `openclaw.json` under `models.providers.*.apiKey`
+- In `openclaw.json` under the `"env"` or `"env.vars"` sub-objects
+- In `~/.openclaw/agents/main/agent/auth-profiles.json`
+
+The migration checks all four. If keys use `source: "file"` or `source: "exec"` SecretRefs, they can't be resolved automatically — add them via `hermes config set`.
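
The SecretRef handling described in the documentation above can be sketched as follows. This is an illustration under stated assumptions: `resolve_secret` is a hypothetical name, and the `"id"` key holding the variable name is assumed, since the doc only names the `source` field.

```python
import os
import re
from typing import Any, Mapping, Optional

_ENV_TEMPLATE = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def resolve_secret(value: Any, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Resolve a secret value; return None when it needs manual handling."""
    if isinstance(value, str):
        match = _ENV_TEMPLATE.match(value.strip())
        if match:
            # Env template such as ${OPENAI_API_KEY}.
            return env.get(match.group(1))
        return value or None  # plain string
    if isinstance(value, dict):
        if value.get("source") == "env":
            # Assumption: the variable name lives under "id".
            name = value.get("id")
            return env.get(name) if isinstance(name, str) else None
        # "file" and "exec" refs cannot be resolved automatically.
        return None
    return None


fake_env = {"OPENAI_API_KEY": "sk-demo"}
print(resolve_secret("${OPENAI_API_KEY}", fake_env))
print(resolve_secret({"source": "exec", "id": "op read ..."}, fake_env))
```

A `None` result is the case where the migration would warn and leave the key to be added manually.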
@@ -190,7 +190,7 @@ class StreamingConfig:
    """Configuration for real-time token streaming to messaging platforms."""
    enabled: bool = False
    transport: str = "edit"       # "edit" (progressive editMessageText) or "off"
-    edit_interval: float = 0.3    # Seconds between message edits
+    edit_interval: float = 1.0    # Seconds between message edits (Telegram rate-limits at ~1/s)
    buffer_threshold: int = 40    # Chars before forcing an edit
    cursor: str = " ▉"           # Cursor shown during streaming

@@ -210,7 +210,7 @@ class StreamingConfig:
        return cls(
            enabled=data.get("enabled", False),
            transport=data.get("transport", "edit"),
-            edit_interval=float(data.get("edit_interval", 0.3)),
+            edit_interval=float(data.get("edit_interval", 1.0)),
            buffer_threshold=int(data.get("buffer_threshold", 40)),
            cursor=data.get("cursor", " ▉"),
        )
@@ -1017,6 +1017,9 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
        weixin_group_allowed_users = os.getenv("WEIXIN_GROUP_ALLOWED_USERS", "").strip()
        if weixin_group_allowed_users:
            extra["group_allow_from"] = weixin_group_allowed_users
+        weixin_split_multiline = os.getenv("WEIXIN_SPLIT_MULTILINE_MESSAGES", "").strip()
+        if weixin_split_multiline:
+            extra["split_multiline_messages"] = weixin_split_multiline
        weixin_home = os.getenv("WEIXIN_HOME_CHANNEL", "").strip()
        if weixin_home:
            config.platforms[Platform.WEIXIN].home_channel = HomeChannel(
@@ -53,6 +53,7 @@ DEFAULT_HOST = "127.0.0.1"
 DEFAULT_PORT = 8642
 MAX_STORED_RESPONSES = 100
 MAX_REQUEST_BYTES = 1_000_000  # 1 MB default limit for POST bodies
+CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS = 30.0


 def check_api_server_requirements() -> bool:
@@ -762,7 +763,11 @@ class APIServerAdapter(BasePlatformAdapter):
        """
        import queue as _q

-        sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
+        sse_headers = {
+            "Content-Type": "text/event-stream",
+            "Cache-Control": "no-cache",
+            "X-Accel-Buffering": "no",
+        }
        # CORS middleware can't inject headers into StreamResponse after
        # prepare() flushes them, so resolve CORS headers up front.
        origin = request.headers.get("Origin", "")
@@ -775,6 +780,8 @@ class APIServerAdapter(BasePlatformAdapter):
        await response.prepare(request)

        try:
+            last_activity = time.monotonic()
+
            # Role chunk
            role_chunk = {
                "id": completion_id, "object": "chat.completion.chunk",
@@ -782,6 +789,7 @@ class APIServerAdapter(BasePlatformAdapter):
                "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
            }
            await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
+            last_activity = time.monotonic()

            # Helper — route a queue item to the correct SSE event.
            async def _emit(item):
@@ -805,6 +813,7 @@ class APIServerAdapter(BasePlatformAdapter):
                        "choices": [{"index": 0, "delta": {"content": item}, "finish_reason": None}],
                    }
                    await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
+                return time.monotonic()

            # Stream content chunks as they arrive from the agent
            loop = asyncio.get_event_loop()
@@ -819,16 +828,19 @@ class APIServerAdapter(BasePlatformAdapter):
                                delta = stream_q.get_nowait()
                                if delta is None:
                                    break
-                                await _emit(delta)
+                                last_activity = await _emit(delta)
                            except _q.Empty:
                                break
                        break
+                    if time.monotonic() - last_activity >= CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS:
+                        await response.write(b": keepalive\n\n")
+                        last_activity = time.monotonic()
                    continue

                if delta is None:  # End of stream sentinel
                    break

-                await _emit(delta)
+                last_activity = await _emit(delta)

            # Get usage from completed agent
            usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
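
The keepalive added above writes an SSE comment line (a line starting with `:`), which clients ignore but which keeps idle connections from being closed by intermediaries; `X-Accel-Buffering: no` asks nginx not to buffer the stream. The same idea as a self-contained sketch, using `asyncio.wait_for` on an `asyncio.Queue` instead of the diff's `last_activity` polling. `sse_frames` and the frame payload shape are hypothetical.

```python
import asyncio
import json
from typing import AsyncIterator, Optional

KEEPALIVE_SECONDS = 30.0  # same value as CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS


async def sse_frames(
    queue: "asyncio.Queue[Optional[str]]",
    keepalive: float = KEEPALIVE_SECONDS,
) -> AsyncIterator[bytes]:
    """Yield SSE data frames from the queue, with comment frames when idle.

    A None item is the end-of-stream sentinel.
    """
    while True:
        try:
            item = await asyncio.wait_for(queue.get(), timeout=keepalive)
        except asyncio.TimeoutError:
            # SSE comment: ignored by clients, keeps the connection warm.
            yield b": keepalive\n\n"
            continue
        if item is None:
            return
        yield f"data: {json.dumps({'content': item})}\n\n".encode()
```

In a handler, each yielded frame would be passed to `response.write(...)` in turn.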
@@ -823,7 +823,36 @@ class BasePlatformAdapter(ABC):
        result = handler(self)
        if asyncio.iscoroutine(result):
            await result
-    
+
+    def _acquire_platform_lock(self, scope: str, identity: str, resource_desc: str) -> bool:
+        """Acquire a scoped lock for this adapter. Returns True on success."""
+        from gateway.status import acquire_scoped_lock
+        self._platform_lock_scope = scope
+        self._platform_lock_identity = identity
+        acquired, existing = acquire_scoped_lock(
+            scope, identity, metadata={'platform': self.platform.value}
+        )
+        if acquired:
+            return True
+        owner_pid = existing.get('pid') if isinstance(existing, dict) else None
+        message = (
+            f'{resource_desc} already in use'
+            + (f' (PID {owner_pid})' if owner_pid else '')
+            + '. Stop the other gateway first.'
+        )
+        logger.error('[%s] %s', self.name, message)
+        self._set_fatal_error(f'{scope}_lock', message, retryable=False)
+        return False
+
+    def _release_platform_lock(self) -> None:
+        """Release the scoped lock acquired by _acquire_platform_lock."""
+        identity = getattr(self, '_platform_lock_identity', None)
+        if not identity:
+            return
+        from gateway.status import release_scoped_lock
+        release_scoped_lock(self._platform_lock_scope, identity)
+        self._platform_lock_identity = None
+
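
`_acquire_platform_lock` depends on `acquire_scoped_lock` returning an `(acquired, existing)` pair, where `existing` may carry the owner's `pid`. The real functions live in `gateway.status` and are not shown here; the in-memory stand-ins below (`acquire_demo_lock`, `release_demo_lock`) are hypothetical and only illustrate that contract within a single process.

```python
import os
from typing import Any, Dict, Optional, Tuple

# Hypothetical in-process registry; a real implementation would need to
# coordinate across processes.
_LOCKS: Dict[Tuple[str, str], Dict[str, Any]] = {}


def acquire_demo_lock(
    scope: str, identity: str, metadata: Optional[Dict[str, Any]] = None
) -> Tuple[bool, Optional[Dict[str, Any]]]:
    """Return (True, None) on success, else (False, existing_owner_record)."""
    key = (scope, identity)
    existing = _LOCKS.get(key)
    if existing is not None:
        return False, existing
    _LOCKS[key] = {"pid": os.getpid(), **(metadata or {})}
    return True, None


def release_demo_lock(scope: str, identity: str) -> None:
    """Drop the lock if held; releasing an unheld lock is a no-op."""
    _LOCKS.pop((scope, identity), None)


ok, _ = acquire_demo_lock("telegram-bot-token", "abc123", {"platform": "telegram"})
again, owner = acquire_demo_lock("telegram-bot-token", "abc123")
print(ok, again, owner["pid"] == os.getpid())  # True False True
release_demo_lock("telegram-bot-token", "abc123")
```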
    @property
    def name(self) -> str:
        """Human-readable name for this adapter."""
@@ -30,6 +30,7 @@ from gateway.platforms.base import (
    cache_audio_from_bytes,
    cache_document_from_bytes,
 )
+from gateway.platforms.helpers import strip_markdown

 logger = logging.getLogger(__name__)

@@ -89,18 +90,7 @@ def _normalize_server_url(raw: str) -> str:
    return value.rstrip("/")


-def _strip_markdown(text: str) -> str:
-    """Strip common markdown formatting for iMessage plain-text delivery."""
-    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text, flags=re.DOTALL)
-    text = re.sub(r"\*(.+?)\*", r"\1", text, flags=re.DOTALL)
-    text = re.sub(r"__(.+?)__", r"\1", text, flags=re.DOTALL)
-    text = re.sub(r"_(.+?)_", r"\1", text, flags=re.DOTALL)
-    text = re.sub(r"```[a-zA-Z0-9_+-]*\n?", "", text)
-    text = re.sub(r"`(.+?)`", r"\1", text)
-    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
-    text = re.sub(r"\[([^\]]+)\]\(([^\)]+)\)", r"\1", text)
-    text = re.sub(r"\n{3,}", "\n\n", text)
-    return text.strip()
+


 # ---------------------------------------------------------------------------
@@ -393,7 +383,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
        reply_to: Optional[str] = None,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> SendResult:
-        text = _strip_markdown(content or "")
+        text = strip_markdown(content or "")
        if not text:
            return SendResult(success=False, error="BlueBubbles send requires text")
        chunks = self.truncate_message(text, max_length=self.MAX_MESSAGE_LENGTH)
@@ -679,7 +669,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
        return info

    def format_message(self, content: str) -> str:
-        return _strip_markdown(content)
+        return strip_markdown(content)

    # ------------------------------------------------------------------
    # Inbound attachment downloading (from #4588)
@@ -42,6 +42,7 @@ except ImportError:
    httpx = None  # type: ignore[assignment]

 from gateway.config import Platform, PlatformConfig
+from gateway.platforms.helpers import MessageDeduplicator
 from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
@@ -52,8 +53,6 @@ from gateway.platforms.base import (
 logger = logging.getLogger(__name__)

 MAX_MESSAGE_LENGTH = 20000
-DEDUP_WINDOW_SECONDS = 300
-DEDUP_MAX_SIZE = 1000
 RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
 _SESSION_WEBHOOKS_MAX = 500
 _DINGTALK_WEBHOOK_RE = re.compile(r'^https://api\.dingtalk\.com/')
@@ -89,8 +88,8 @@ class DingTalkAdapter(BasePlatformAdapter):
        self._stream_task: Optional[asyncio.Task] = None
        self._http_client: Optional["httpx.AsyncClient"] = None

-        # Message deduplication: msg_id -> timestamp
-        self._seen_messages: Dict[str, float] = {}
+        # Message deduplication
+        self._dedup = MessageDeduplicator(max_size=1000)
        # Map chat_id -> session_webhook for reply routing
        self._session_webhooks: Dict[str, str] = {}

@@ -170,7 +169,7 @@ class DingTalkAdapter(BasePlatformAdapter):

        self._stream_client = None
        self._session_webhooks.clear()
-        self._seen_messages.clear()
+        self._dedup.clear()
        logger.info("[%s] Disconnected", self.name)

    # -- Inbound message processing -----------------------------------------
@@ -178,7 +177,7 @@ class DingTalkAdapter(BasePlatformAdapter):
    async def _on_message(self, message: "ChatbotMessage") -> None:
        """Process an incoming DingTalk chatbot message."""
        msg_id = getattr(message, "message_id", None) or uuid.uuid4().hex
-        if self._is_duplicate(msg_id):
+        if self._dedup.is_duplicate(msg_id):
            logger.debug("[%s] Duplicate message %s, skipping", self.name, msg_id)
            return

@@ -256,20 +255,6 @@ class DingTalkAdapter(BasePlatformAdapter):
                content = " ".join(parts).strip()
        return content

-    # -- Deduplication ------------------------------------------------------
-
-    def _is_duplicate(self, msg_id: str) -> bool:
-        """Check and record a message ID. Returns True if already seen."""
-        now = time.time()
-        if len(self._seen_messages) > DEDUP_MAX_SIZE:
-            cutoff = now - DEDUP_WINDOW_SECONDS
-            self._seen_messages = {k: v for k, v in self._seen_messages.items() if v > cutoff}
-
-        if msg_id in self._seen_messages:
-            return True
-        self._seen_messages[msg_id] = now
-        return False
-
    # -- Outbound messaging -------------------------------------------------

    async def send(
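
The DingTalk hunks above swap the adapter's private `_seen_messages` dict for a shared deduplicator. A minimal standalone sketch of that check-and-record pattern follows. The class name `DedupSketch` and the injectable `clock` parameter are hypothetical and exist only so the timing can be exercised deterministically; this is not the code in `gateway/platforms/helpers.py`.

```python
import time
from typing import Callable, Dict


class DedupSketch:
    """Illustrative check-and-record cache with lazy, size-triggered pruning."""

    def __init__(self, max_size: int = 2000, ttl_seconds: float = 300,
                 clock: Callable[[], float] = time.time):
        self._seen: Dict[str, float] = {}
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # hypothetical hook, used here for testing only

    def is_duplicate(self, msg_id: str) -> bool:
        if not msg_id:
            return False  # empty IDs are never treated as duplicates
        now = self._clock()
        if msg_id in self._seen:
            return True
        self._seen[msg_id] = now
        # Expired entries are dropped only once the cache outgrows max_size.
        if len(self._seen) > self._max_size:
            cutoff = now - self._ttl
            self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
        return False


clock = [1000.0]
dedup = DedupSketch(max_size=2, ttl_seconds=300, clock=lambda: clock[0])

first_seen = dedup.is_duplicate("a")   # first sighting is recorded
replayed = dedup.is_duplicate("a")     # replay is flagged
empty_id = dedup.is_duplicate("")      # empty ID is passed through

clock[0] += 301                        # "a" is now older than the TTL
stale_hit = dedup.is_duplicate("a")    # still cached: no prune has run yet
dedup.is_duplicate("b")
dedup.is_duplicate("c")                # third entry exceeds max_size, prune runs
after_prune = dedup.is_duplicate("a")  # "a" was evicted, so it counts as new
```

In this sketch the TTL only takes effect when a prune runs, so a small cache can keep flagging an ID well past the TTL. That matches the membership-only lookup in the per-adapter code being removed.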
@@ -45,6 +45,7 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
 from gateway.config import Platform, PlatformConfig
 import re

+from gateway.platforms.helpers import MessageDeduplicator, ThreadParticipationTracker
 from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
@@ -450,18 +451,14 @@ class DiscordAdapter(BasePlatformAdapter):
        # Track threads where the bot has participated so follow-up messages
        # in those threads don't require @mention.  Persisted to disk so the
        # set survives gateway restarts.
-        self._bot_participated_threads: set = self._load_participated_threads()
+        self._threads = ThreadParticipationTracker("discord")
        # Persistent typing indicator loops per channel (DMs don't reliably
        # show the standard typing gateway event for bots)
        self._typing_tasks: Dict[str, asyncio.Task] = {}
        self._bot_task: Optional[asyncio.Task] = None
-        # Cap to prevent unbounded growth (Discord threads get archived).
-        self._MAX_TRACKED_THREADS = 500
-        # Dedup cache: message_id → timestamp.  Prevents duplicate bot
-        # responses when Discord RESUME replays events after reconnects.
-        self._seen_messages: Dict[str, float] = {}
-        self._SEEN_TTL = 300   # 5 minutes
-        self._SEEN_MAX = 2000  # prune threshold
+        # Dedup cache: prevents duplicate bot responses when Discord
+        # RESUME replays events after reconnects.
+        self._dedup = MessageDeduplicator()
        # Reply threading mode: "off" (no replies), "first" (reply on first
        # chunk only, default), "all" (reply-reference on every chunk).
        self._reply_to_mode: str = getattr(config, 'reply_to_mode', 'first') or 'first'
@@ -502,18 +499,9 @@ class DiscordAdapter(BasePlatformAdapter):
            return False

        try:
-            # Acquire scoped lock to prevent duplicate bot token usage
-            from gateway.status import acquire_scoped_lock
-            self._token_lock_identity = self.config.token
-            acquired, existing = acquire_scoped_lock('discord-bot-token', self._token_lock_identity, metadata={'platform': 'discord'})
-            if not acquired:
-                owner_pid = existing.get('pid') if isinstance(existing, dict) else None
-                message = f'Discord bot token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
-                logger.error('[%s] %s', self.name, message)
-                self._set_fatal_error('discord_token_lock', message, retryable=False)
+            if not self._acquire_platform_lock('discord-bot-token', self.config.token, 'Discord bot token'):
                return False

-
            # Parse allowed user entries (may contain usernames or IDs)
            allowed_env = os.getenv("DISCORD_ALLOWED_USERS", "")
            if allowed_env:
@@ -569,17 +557,8 @@ class DiscordAdapter(BasePlatformAdapter):
            @self._client.event
            async def on_message(message: DiscordMessage):
                # Dedup: Discord RESUME replays events after reconnects (#4777)
-                msg_id = str(message.id)
-                now = time.time()
-                if msg_id in adapter_self._seen_messages:
+                if adapter_self._dedup.is_duplicate(str(message.id)):
                    return
-                adapter_self._seen_messages[msg_id] = now
-                if len(adapter_self._seen_messages) > adapter_self._SEEN_MAX:
-                    cutoff = now - adapter_self._SEEN_TTL
-                    adapter_self._seen_messages = {
-                        k: v for k, v in adapter_self._seen_messages.items()
-                        if v > cutoff
-                    }

                # Always ignore our own messages
                if message.author == self._client.user:
@@ -685,23 +664,11 @@ class DiscordAdapter(BasePlatformAdapter):

        except asyncio.TimeoutError:
            logger.error("[%s] Timeout waiting for connection to Discord", self.name, exc_info=True)
-            try:
-                from gateway.status import release_scoped_lock
-                if getattr(self, '_token_lock_identity', None):
-                    release_scoped_lock('discord-bot-token', self._token_lock_identity)
-                    self._token_lock_identity = None
-            except Exception:
-                pass
+            self._release_platform_lock()
            return False
        except Exception as e:  # pragma: no cover - defensive logging
            logger.error("[%s] Failed to connect to Discord: %s", self.name, e, exc_info=True)
-            try:
-                from gateway.status import release_scoped_lock
-                if getattr(self, '_token_lock_identity', None):
-                    release_scoped_lock('discord-bot-token', self._token_lock_identity)
-                    self._token_lock_identity = None
-            except Exception:
-                pass
+            self._release_platform_lock()
            return False

    async def disconnect(self) -> None:
@@ -723,14 +690,7 @@ class DiscordAdapter(BasePlatformAdapter):
        self._client = None
        self._ready_event.clear()

-        # Release the token lock
-        try:
-            from gateway.status import release_scoped_lock
-            if getattr(self, '_token_lock_identity', None):
-                release_scoped_lock('discord-bot-token', self._token_lock_identity)
-                self._token_lock_identity = None
-        except Exception:
-            pass
+        self._release_platform_lock()

        logger.info("[%s] Disconnected", self.name)

@@ -1870,7 +1830,7 @@ class DiscordAdapter(BasePlatformAdapter):

        # Track thread participation so follow-ups don't require @mention
        if thread_id:
-            self._track_thread(thread_id)
+            self._threads.mark(thread_id)

        # If a message was provided, kick off a new Hermes session in the thread
        starter = (message or "").strip()
@@ -2241,49 +2201,6 @@ class DiscordAdapter(BasePlatformAdapter):
            return f"{parent_name} / {thread_name}"
        return thread_name

-    # ------------------------------------------------------------------
-    # Thread participation persistence
-    # ------------------------------------------------------------------
-
-    @staticmethod
-    def _thread_state_path() -> Path:
-        """Path to the persisted thread participation set."""
-        from hermes_cli.config import get_hermes_home
-        return get_hermes_home() / "discord_threads.json"
-
-    @classmethod
-    def _load_participated_threads(cls) -> set:
-        """Load persisted thread IDs from disk."""
-        path = cls._thread_state_path()
-        try:
-            if path.exists():
-                data = json.loads(path.read_text(encoding="utf-8"))
-                if isinstance(data, list):
-                    return set(data)
-        except Exception as e:
-            logger.debug("Could not load discord thread state: %s", e)
-        return set()
-
-    def _save_participated_threads(self) -> None:
-        """Persist the current thread set to disk (best-effort)."""
-        path = self._thread_state_path()
-        try:
-            # Trim to most recent entries if over cap
-            thread_list = list(self._bot_participated_threads)
-            if len(thread_list) > self._MAX_TRACKED_THREADS:
-                thread_list = thread_list[-self._MAX_TRACKED_THREADS:]
-                self._bot_participated_threads = set(thread_list)
-            path.parent.mkdir(parents=True, exist_ok=True)
-            path.write_text(json.dumps(thread_list), encoding="utf-8")
-        except Exception as e:
-            logger.debug("Could not save discord thread state: %s", e)
-
-    def _track_thread(self, thread_id: str) -> None:
-        """Add a thread to the participation set and persist."""
-        if thread_id not in self._bot_participated_threads:
-            self._bot_participated_threads.add(thread_id)
-            self._save_participated_threads()
-
    async def _handle_message(self, message: DiscordMessage) -> None:
        """Handle incoming Discord messages."""
        # In server channels (not DMs), require the bot to be @mentioned
@@ -2335,7 +2252,7 @@ class DiscordAdapter(BasePlatformAdapter):

            # Skip the mention check if the message is in a thread where
            # the bot has previously participated (auto-created or replied in).
-            in_bot_thread = is_thread and thread_id in self._bot_participated_threads
+            in_bot_thread = is_thread and thread_id in self._threads

            if require_mention and not is_free_channel and not in_bot_thread:
                if self._client.user not in message.mentions:
@@ -2361,7 +2278,7 @@ class DiscordAdapter(BasePlatformAdapter):
                    is_thread = True
                    thread_id = str(thread.id)
                    auto_threaded_channel = thread
-                    self._track_thread(thread_id)
+                    self._threads.mark(thread_id)

        # Determine message type
        msg_type = MessageType.TEXT
@@ -2545,7 +2462,7 @@ class DiscordAdapter(BasePlatformAdapter):
        # Track thread participation so the bot won't require @mention for
        # follow-up messages in threads it has already engaged in.
        if thread_id:
-            self._track_thread(thread_id)
+            self._threads.mark(thread_id)

        # Only batch plain text messages — commands, media, etc. dispatch
        # immediately since they won't be split by the Discord client.
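
The Discord hunks above replace the `_bot_participated_threads` set with an object that supports `in` checks and a `mark()` call. A rough sketch of how such a tracker could behave across a restart is given below. `ThreadTrackerSketch` and its explicit `state_file` argument are assumptions for illustration (the real helper derives its path from the Hermes home directory), and the size cap is omitted for brevity.

```python
import json
import tempfile
from pathlib import Path


class ThreadTrackerSketch:
    """Illustrative persistent set of thread IDs (hypothetical, simplified)."""

    def __init__(self, state_file: Path):
        self._path = state_file
        self._threads = self._load()

    def _load(self) -> set:
        try:
            data = json.loads(self._path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            return set()  # missing or corrupt state file: start empty
        return set(data) if isinstance(data, list) else set()

    def mark(self, thread_id: str) -> None:
        """Record participation and persist it (best-effort)."""
        if thread_id in self._threads:
            return
        self._threads.add(thread_id)
        try:
            self._path.parent.mkdir(parents=True, exist_ok=True)
            self._path.write_text(json.dumps(sorted(self._threads)), encoding="utf-8")
        except OSError:
            pass  # a failed write should not break message handling

    def __contains__(self, thread_id: str) -> bool:
        return thread_id in self._threads


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "discord_threads.json"
    before_restart = ThreadTrackerSketch(state)
    before_restart.mark("1234")

    # A second instance reading the same file stands in for a gateway restart.
    after_restart = ThreadTrackerSketch(state)
    survived = "1234" in after_restart
    unknown = "9999" in after_restart
```

Defining `__contains__` is what lets call sites keep the `thread_id in self._threads` form used in the mention-gating check.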
@@ -360,19 +360,21 @@ def _render_code_block_element(element: Dict[str, Any]) -> str:


 def _strip_markdown_to_plain_text(text: str) -> str:
+    """Strip markdown formatting to plain text for Feishu text fallbacks.
+
+    Delegates common markdown stripping to the shared helper and adds
+    Feishu-specific patterns (blockquotes, strikethrough, underline tags,
+    horizontal rules, \\r\\n normalisation).
+    """
+    from gateway.platforms.helpers import strip_markdown
    plain = text.replace("\r\n", "\n")
    plain = _MARKDOWN_LINK_RE.sub(lambda m: f"{m.group(1)} ({m.group(2).strip()})", plain)
-    plain = re.sub(r"^#{1,6}\s+", "", plain, flags=re.MULTILINE)
    plain = re.sub(r"^>\s?", "", plain, flags=re.MULTILINE)
    plain = re.sub(r"^\s*---+\s*$", "---", plain, flags=re.MULTILINE)
-    plain = re.sub(r"```(?:[^\n]*\n)?([\s\S]*?)```", lambda m: m.group(1).strip("\n"), plain)
-    plain = re.sub(r"`([^`\n]+)`", r"\1", plain)
-    plain = re.sub(r"\*\*([^*\n]+)\*\*", r"\1", plain)
-    plain = re.sub(r"\*([^*\n]+)\*", r"\1", plain)
    plain = re.sub(r"~~([^~\n]+)~~", r"\1", plain)
    plain = re.sub(r"<u>([\s\S]*?)</u>", r"\1", plain)
-    plain = re.sub(r"\n{3,}", "\n\n", plain)
-    return plain.strip()
+    plain = strip_markdown(plain)
+    return plain


 def _coerce_int(value: Any, default: Optional[int] = None, min_value: int = 0) -> Optional[int]:
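
The Feishu function above now hands the common patterns to a shared `strip_markdown` helper. The sketch below exercises a subset of that helper's regexes in isolation (bold, italic, inline code, headings, links), applied in the same relative order. The function name `strip_markdown_subset` is hypothetical, and the code-fence and newline-collapsing rules are left out.

```python
import re

_RE_BOLD = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)
_RE_ITALIC_STAR = re.compile(r"\*(.+?)\*", re.DOTALL)
_RE_BOLD_UNDER = re.compile(r"__(.+?)__", re.DOTALL)
_RE_ITALIC_UNDER = re.compile(r"_(.+?)_", re.DOTALL)
_RE_INLINE_CODE = re.compile(r"`(.+?)`")
_RE_HEADING = re.compile(r"^#{1,6}\s+", re.MULTILINE)
_RE_LINK = re.compile(r"\[([^\]]+)\]\([^\)]+\)")


def strip_markdown_subset(text: str) -> str:
    """Apply a subset of the shared patterns, bold before italic."""
    text = _RE_BOLD.sub(r"\1", text)
    text = _RE_ITALIC_STAR.sub(r"\1", text)
    text = _RE_BOLD_UNDER.sub(r"\1", text)
    text = _RE_ITALIC_UNDER.sub(r"\1", text)
    text = _RE_INLINE_CODE.sub(r"\1", text)
    text = _RE_HEADING.sub("", text)
    text = _RE_LINK.sub(r"\1", text)  # keeps the label, drops the URL
    return text.strip()


heading_and_bold = strip_markdown_subset("# Title\n\n**bold** and `code`")
link_and_italic = strip_markdown_subset("See [docs](https://example.com) for *details*")
# The underscore-italic rule also fires inside identifiers.
snake_case = strip_markdown_subset("call my_func_name now")
```

Two points follow from these patterns. The link rule discards the URL, which may be why the Feishu wrapper rewrites links to `label (url)` before delegating. The underscore-italic rule also matches inside snake_case words, so `my_func_name` comes out as `myfuncname`.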
@@ -0,0 +1,261 @@
+"""Shared helper classes for gateway platform adapters.
+
+Extracts common patterns that were duplicated across several adapters:
+message deduplication, text batch aggregation, markdown stripping,
+thread participation tracking, and phone number redaction.
+"""
+
+import asyncio
+import json
+import logging
+import re
+import time
+from pathlib import Path
+from typing import TYPE_CHECKING, Dict, Optional
+
+if TYPE_CHECKING:
+    from gateway.platforms.base import BasePlatformAdapter, MessageEvent
+
+logger = logging.getLogger(__name__)
+
+
+# ─── Message Deduplication ────────────────────────────────────────────────────
+
+
+class MessageDeduplicator:
+    """TTL-based message deduplication cache.
+
+    Replaces the identical ``_seen_messages`` / ``_is_duplicate()`` pattern
+    previously duplicated in discord, slack, dingtalk, wecom, weixin,
+    mattermost, and feishu adapters.
+
+    Usage::
+
+        self._dedup = MessageDeduplicator()
+
+        # In message handler:
+        if self._dedup.is_duplicate(msg_id):
+            return
+    """
+
+    def __init__(self, max_size: int = 2000, ttl_seconds: float = 300):
+        self._seen: Dict[str, float] = {}
+        self._max_size = max_size
+        self._ttl = ttl_seconds
+
+    def is_duplicate(self, msg_id: str) -> bool:
+        """Record *msg_id*; return True if already cached (pruned only on overflow)."""
+        if not msg_id:
+            return False
+        now = time.time()
+        if msg_id in self._seen:
+            return True
+        self._seen[msg_id] = now
+        if len(self._seen) > self._max_size:
+            cutoff = now - self._ttl
+            self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
+        return False
+
+    def clear(self):
+        """Clear all tracked messages."""
+        self._seen.clear()
+
+
+# ─── Text Batch Aggregation ──────────────────────────────────────────────────
+
+
+class TextBatchAggregator:
+    """Aggregates rapid-fire text events into single messages.
+
+    Replaces the ``_enqueue_text_event`` / ``_flush_text_batch`` pattern
+    previously duplicated in telegram, discord, matrix, wecom, and feishu.
+
+    Usage::
+
+        self._text_batcher = TextBatchAggregator(
+            handler=self._message_handler,
+            batch_delay=0.6,
+            split_threshold=1900,
+        )
+
+        # In message dispatch:
+        if msg_type == MessageType.TEXT and self._text_batcher.is_enabled():
+            self._text_batcher.enqueue(event, session_key)
+            return
+    """
+
+    def __init__(
+        self,
+        handler,
+        *,
+        batch_delay: float = 0.6,
+        split_delay: float = 2.0,
+        split_threshold: int = 4000,
+    ):
+        self._handler = handler
+        self._batch_delay = batch_delay
+        self._split_delay = split_delay
+        self._split_threshold = split_threshold
+        self._pending: Dict[str, "MessageEvent"] = {}
+        self._pending_tasks: Dict[str, asyncio.Task] = {}
+
+    def is_enabled(self) -> bool:
+        """Return True if batching is active (delay > 0)."""
+        return self._batch_delay > 0
+
+    def enqueue(self, event: "MessageEvent", key: str) -> None:
+        """Add *event* to the pending batch for *key*."""
+        chunk_len = len(event.text or "")
+        existing = self._pending.get(key)
+        if not existing:
+            event._last_chunk_len = chunk_len  # type: ignore[attr-defined]
+            self._pending[key] = event
+        else:
+            existing.text = f"{existing.text or ''}\n{event.text or ''}"
+            existing._last_chunk_len = chunk_len  # type: ignore[attr-defined]
+
+        # Cancel prior flush timer, start a new one
+        prior = self._pending_tasks.get(key)
+        if prior and not prior.done():
+            prior.cancel()
+        self._pending_tasks[key] = asyncio.create_task(self._flush(key))
+
+    async def _flush(self, key: str) -> None:
+        """Wait then dispatch the batched event for *key*."""
+        current_task = self._pending_tasks.get(key)
+        pending = self._pending.get(key)
+        last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
+
+        # Use longer delay when the last chunk looks like a split message
+        delay = self._split_delay if last_len >= self._split_threshold else self._batch_delay
+        await asyncio.sleep(delay)
+
+        event = self._pending.pop(key, None)
+        if event:
+            try:
+                await self._handler(event)
+            except Exception:
+                logger.exception("[TextBatchAggregator] Error dispatching batched event for %s", key)
+
+        if self._pending_tasks.get(key) is current_task:
+            self._pending_tasks.pop(key, None)
+
+    def cancel_all(self) -> None:
+        """Cancel all pending flush tasks."""
+        for task in self._pending_tasks.values():
+            if not task.done():
+                task.cancel()
+        self._pending_tasks.clear()
+        self._pending.clear()
+
+
+# ─── Markdown Stripping ──────────────────────────────────────────────────────
+
+# Pre-compiled regexes for performance
+_RE_BOLD = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)
+_RE_ITALIC_STAR = re.compile(r"\*(.+?)\*", re.DOTALL)
+_RE_BOLD_UNDER = re.compile(r"__(.+?)__", re.DOTALL)
+_RE_ITALIC_UNDER = re.compile(r"_(.+?)_", re.DOTALL)
+_RE_CODE_BLOCK = re.compile(r"```[a-zA-Z0-9_+-]*\n?")
+_RE_INLINE_CODE = re.compile(r"`(.+?)`")
+_RE_HEADING = re.compile(r"^#{1,6}\s+", re.MULTILINE)
+_RE_LINK = re.compile(r"\[([^\]]+)\]\([^\)]+\)")
+_RE_MULTI_NEWLINE = re.compile(r"\n{3,}")
+
+
+def strip_markdown(text: str) -> str:
+    """Strip markdown formatting for plain-text platforms (SMS, iMessage, etc.).
+
+    Replaces the identical ``_strip_markdown()`` functions previously
+    duplicated in sms.py, bluebubbles.py, and feishu.py.
+    """
+    text = _RE_BOLD.sub(r"\1", text)
+    text = _RE_ITALIC_STAR.sub(r"\1", text)
+    text = _RE_BOLD_UNDER.sub(r"\1", text)
+    text = _RE_ITALIC_UNDER.sub(r"\1", text)
+    text = _RE_CODE_BLOCK.sub("", text)
+    text = _RE_INLINE_CODE.sub(r"\1", text)
+    text = _RE_HEADING.sub("", text)
+    text = _RE_LINK.sub(r"\1", text)
+    text = _RE_MULTI_NEWLINE.sub("\n\n", text)
+    return text.strip()
+
+
+# ─── Thread Participation Tracking ───────────────────────────────────────────
+
+
+class ThreadParticipationTracker:
+    """Persistent tracking of threads the bot has participated in.
+
+    Replaces the near-identical ``_load_participated_threads`` /
+    ``_save_participated_threads`` / ``_track_thread`` pattern previously
+    duplicated in discord.py and matrix.py.
+
+    Usage::
+
+        self._threads = ThreadParticipationTracker("discord")
+
+        # Check membership:
+        if thread_id in self._threads:
+            ...
+
+        # Mark participation:
+        self._threads.mark(thread_id)
+    """
+
+    _MAX_TRACKED = 500
+
+    def __init__(self, platform_name: str, max_tracked: int = 500):
+        self._platform = platform_name
+        self._max_tracked = max_tracked
+        self._threads: set = self._load()
+
+    def _state_path(self) -> Path:
+        from hermes_constants import get_hermes_home
+        return get_hermes_home() / f"{self._platform}_threads.json"
+
+    def _load(self) -> set:
+        path = self._state_path()
+        if path.exists():
+            try:
+                return set(json.loads(path.read_text(encoding="utf-8")))
+            except Exception:
+                pass
+        return set()
+
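+    def _save(self) -> None:
+        """Persist the thread set to disk (best-effort, never raises)."""
+        thread_list = list(self._threads)
+        if len(thread_list) > self._max_tracked:
+            thread_list = thread_list[-self._max_tracked:]
+            self._threads = set(thread_list)
+        try:
+            path = self._state_path()
+            path.parent.mkdir(parents=True, exist_ok=True)
+            path.write_text(json.dumps(thread_list), encoding="utf-8")
+        except Exception as e:
+            logger.debug("Could not save %s thread state: %s", self._platform, e)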
+
+    def mark(self, thread_id: str) -> None:
+        """Mark *thread_id* as participated and persist."""
+        if thread_id not in self._threads:
+            self._threads.add(thread_id)
+            self._save()
+
+    def __contains__(self, thread_id: str) -> bool:
+        return thread_id in self._threads
+
+    def clear(self) -> None:
+        self._threads.clear()
+
+
+# ─── Phone Number Redaction ──────────────────────────────────────────────────
+
+
+def redact_phone(phone: str) -> str:
+    """Redact a phone number for logging: +15551234567 -> +155****4567.
+
+    Replaces the identical ``_redact_phone()`` functions in signal.py,
+    sms.py, and bluebubbles.py.
+    """
+    if not phone:
+        return "<none>"
+    if len(phone) <= 8:
+        return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
+    return phone[:4] + "****" + phone[-4:]
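
The branches of `redact_phone` are easiest to read with concrete inputs. The snippet below copies the function body from the hunk above into a standalone form so it can run without the gateway package. The sample numbers are made up for illustration.

```python
def redact_phone(phone: str) -> str:
    """Standalone copy of the helper above, for illustration only."""
    if not phone:
        return "<none>"
    if len(phone) <= 8:
        # Short values keep two characters on each side, or nothing at all.
        return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
    # Longer values keep the first four and last four characters.
    return phone[:4] + "****" + phone[-4:]


full_number = redact_phone("+15551234567")
short_number = redact_phone("12345678")
tiny_value = redact_phone("1234")
missing = redact_phone("")
```

The retained prefix is a fixed four characters, not a parsed country code, so how much of the national number stays visible depends on the length of the dialing prefix.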
@@ -92,6 +92,7 @@ from gateway.platforms.base import (
    ProcessingOutcome,
    SendResult,
 )
+from gateway.platforms.helpers import ThreadParticipationTracker

 logger = logging.getLogger(__name__)

@@ -216,8 +217,7 @@ class MatrixAdapter(BasePlatformAdapter):
        self._pending_megolm: list = []

        # Thread participation tracking (for require_mention bypass)
-        self._bot_participated_threads: set = self._load_participated_threads()
-        self._MAX_TRACKED_THREADS = 500
+        self._threads = ThreadParticipationTracker("matrix")

        # Mention/thread gating — parsed once from env vars.
        self._require_mention: bool = os.getenv("MATRIX_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
@@ -352,7 +352,16 @@ class MatrixAdapter(BasePlatformAdapter):
                from mautrix.crypto import OlmMachine
                from mautrix.crypto.store import MemoryCryptoStore

-                crypto_store = MemoryCryptoStore()
+                # account_id and pickle_key are required by mautrix ≥0.21.
+                # Use the Matrix user ID as account_id for stable identity.
+                # pickle_key secures in-memory serialisation; derive from
+                # the same user_id:device_id pair used for the on-disk HMAC.
+                _acct_id = self._user_id or "hermes"
+                _pickle_key = f"{_acct_id}:{self._device_id}"
+                crypto_store = MemoryCryptoStore(
+                    account_id=_acct_id,
+                    pickle_key=_pickle_key,
+                )

                # Restore persisted crypto state from a previous run.
                # Uses HMAC to verify integrity before unpickling.
@@ -418,6 +427,11 @@ class MatrixAdapter(BasePlatformAdapter):
            if isinstance(sync_data, dict):
                rooms_join = sync_data.get("rooms", {}).get("join", {})
                self._joined_rooms = set(rooms_join.keys())
+                # Store the next_batch token so incremental syncs start
+                # from where the initial sync left off.
+                nb = sync_data.get("next_batch")
+                if nb:
+                    await client.sync_store.put_next_batch(nb)
                logger.info(
                    "Matrix: initial sync complete, joined %d rooms",
                    len(self._joined_rooms),
@@ -809,19 +823,40 @@ class MatrixAdapter(BasePlatformAdapter):

    async def _sync_loop(self) -> None:
        """Continuously sync with the homeserver."""
+        client = self._client
+        # Resume from the token stored during the initial sync.
+        next_batch = await client.sync_store.get_next_batch()
        while not self._closing:
            try:
-                sync_data = await self._client.sync(timeout=30000)
+                sync_data = await client.sync(
+                    since=next_batch, timeout=30000,
+                )
                if isinstance(sync_data, dict):
                    # Update joined rooms from sync response.
                    rooms_join = sync_data.get("rooms", {}).get("join", {})
                    if rooms_join:
                        self._joined_rooms.update(rooms_join.keys())

-                # Share keys periodically if E2EE is enabled.
-                if self._encryption and getattr(self._client, "crypto", None):
+                    # Advance the sync token so the next request is
+                    # incremental instead of a full initial sync.
+                    nb = sync_data.get("next_batch")
+                    if nb:
+                        next_batch = nb
+                        await client.sync_store.put_next_batch(nb)
+
+                    # Dispatch events to registered handlers so that
+                    # _on_room_message / _on_reaction / _on_invite fire.
                    try:
-                        await self._client.crypto.share_keys()
+                        tasks = client.handle_sync(sync_data)
+                        if tasks:
+                            await asyncio.gather(*tasks)
+                    except Exception as exc:
+                        logger.warning("Matrix: sync event dispatch error: %s", exc)
+
+                # Share keys periodically if E2EE is enabled.
+                if self._encryption and getattr(client, "crypto", None):
+                    try:
+                        await client.crypto.share_keys()
                    except Exception as exc:
                        logger.warning("Matrix: E2EE key share failed: %s", exc)
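
The `_sync_loop` change above threads a `next_batch` token through successive `sync` calls so that each request is incremental. A minimal sketch of that token handling is shown below, under stated assumptions: `FakeSyncClient` is a hypothetical stand-in, not the mautrix client, and the sketch leaves out the sync store, event dispatch, and key sharing.

```python
import asyncio
from typing import List, Optional


class FakeSyncClient:
    """Hypothetical client that replays canned sync responses."""

    def __init__(self, responses: List[dict]):
        self._responses = list(responses)
        self.since_seen: List[Optional[str]] = []

    async def sync(self, since: Optional[str] = None, timeout: int = 30000) -> dict:
        self.since_seen.append(since)  # record which token each request used
        return self._responses.pop(0)


async def run_sync_loop(client: FakeSyncClient, rounds: int,
                        start: Optional[str]) -> Optional[str]:
    """Advance the token only when a response actually carries one."""
    next_batch = start
    for _ in range(rounds):
        sync_data = await client.sync(since=next_batch, timeout=30000)
        if isinstance(sync_data, dict):
            nb = sync_data.get("next_batch")
            if nb:
                next_batch = nb
    return next_batch


client = FakeSyncClient([{"next_batch": "s2"}, {}, {"next_batch": "s3"}])
final_token = asyncio.run(run_sync_loop(client, rounds=3, start="s1"))
tokens_sent = client.since_seen
```

A response without `next_batch` leaves the token unchanged, so the following request repeats the last known position and does not fall back to a full initial sync.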

@@ -984,7 +1019,7 @@ class MatrixAdapter(BasePlatformAdapter):
        # Require-mention gating.
        if not is_dm:
            is_free_room = room_id in self._free_rooms
-            in_bot_thread = bool(thread_id and thread_id in self._bot_participated_threads)
+            in_bot_thread = bool(thread_id and thread_id in self._threads)
            if self._require_mention and not is_free_room and not in_bot_thread:
                if not is_mentioned:
                    return None
@@ -992,7 +1027,7 @@ class MatrixAdapter(BasePlatformAdapter):
        # DM mention-thread.
        if is_dm and not thread_id and self._dm_mention_threads and is_mentioned:
            thread_id = event_id
-            self._track_thread(thread_id)
+            self._threads.mark(thread_id)

        # Strip mention from body.
        if is_mentioned:
@@ -1001,7 +1036,7 @@ class MatrixAdapter(BasePlatformAdapter):
        # Auto-thread.
        if not is_dm and not thread_id and self._auto_thread:
            thread_id = event_id
-            self._track_thread(thread_id)
+            self._threads.mark(thread_id)

        display_name = await self._get_display_name(room_id, sender)
        source = self.build_source(
@@ -1013,7 +1048,7 @@ class MatrixAdapter(BasePlatformAdapter):
        )

        if thread_id:
-            self._track_thread(thread_id)
+            self._threads.mark(thread_id)

        self._background_read_receipt(room_id, event_id)

@@ -1662,48 +1697,6 @@ class MatrixAdapter(BasePlatformAdapter):
            for rid in self._joined_rooms
        }

-    # ------------------------------------------------------------------
-    # Thread participation tracking
-    # ------------------------------------------------------------------
-
-    @staticmethod
-    def _thread_state_path() -> Path:
-        """Path to the persisted thread participation set."""
-        from hermes_cli.config import get_hermes_home
-        return get_hermes_home() / "matrix_threads.json"
-
-    @classmethod
-    def _load_participated_threads(cls) -> set:
-        """Load persisted thread IDs from disk."""
-        path = cls._thread_state_path()
-        try:
-            if path.exists():
-                data = json.loads(path.read_text(encoding="utf-8"))
-                if isinstance(data, list):
-                    return set(data)
-        except Exception as e:
-            logger.debug("Could not load matrix thread state: %s", e)
-        return set()
-
-    def _save_participated_threads(self) -> None:
-        """Persist the current thread set to disk (best-effort)."""
-        path = self._thread_state_path()
-        try:
-            thread_list = list(self._bot_participated_threads)
-            if len(thread_list) > self._MAX_TRACKED_THREADS:
-                thread_list = thread_list[-self._MAX_TRACKED_THREADS:]
-                self._bot_participated_threads = set(thread_list)
-            path.parent.mkdir(parents=True, exist_ok=True)
-            path.write_text(json.dumps(thread_list), encoding="utf-8")
-        except Exception as e:
-            logger.debug("Could not save matrix thread state: %s", e)
-
-    def _track_thread(self, thread_id: str) -> None:
-        """Add a thread to the participation set and persist."""
-        if thread_id not in self._bot_participated_threads:
-            self._bot_participated_threads.add(thread_id)
-            self._save_participated_threads()
-
    # ------------------------------------------------------------------
    # Mention detection helpers
    # ------------------------------------------------------------------
@@ -18,11 +18,11 @@ import json
 import logging
 import os
 import re
-import time
 from pathlib import Path
 from typing import Any, Dict, List, Optional

 from gateway.config import Platform, PlatformConfig
+from gateway.platforms.helpers import MessageDeduplicator
 from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
@@ -96,10 +96,8 @@ class MattermostAdapter(BasePlatformAdapter):
            or os.getenv("MATTERMOST_REPLY_MODE", "off")
        ).lower()

-        # Dedup cache: post_id → timestamp (prevent reprocessing)
-        self._seen_posts: Dict[str, float] = {}
-        self._SEEN_MAX = 2000
-        self._SEEN_TTL = 300  # 5 minutes
+        # Dedup cache (prevent reprocessing)
+        self._dedup = MessageDeduplicator()

    # ------------------------------------------------------------------
    # HTTP helpers
@@ -604,10 +602,8 @@ class MattermostAdapter(BasePlatformAdapter):
        post_id = post.get("id", "")

        # Dedup.
-        self._prune_seen()
-        if post_id in self._seen_posts:
+        if self._dedup.is_duplicate(post_id):
            return
-        self._seen_posts[post_id] = time.time()

        # Build message event.
        channel_id = post.get("channel_id", "")
@@ -734,13 +730,4 @@ class MattermostAdapter(BasePlatformAdapter):

        await self.handle_message(msg_event)

-    def _prune_seen(self) -> None:
-        """Remove expired entries from the dedup cache."""
-        if len(self._seen_posts) < self._SEEN_MAX:
-            return
-        now = time.time()
-        self._seen_posts = {
-            pid: ts
-            for pid, ts in self._seen_posts.items()
-            if now - ts < self._SEEN_TTL
-        }
+
@@ -37,6 +37,7 @@ from gateway.platforms.base import (
    cache_document_from_bytes,
    cache_image_from_url,
 )
+from gateway.platforms.helpers import redact_phone

 logger = logging.getLogger(__name__)

@@ -51,22 +52,10 @@ SSE_RETRY_DELAY_MAX = 60.0
 HEALTH_CHECK_INTERVAL = 30.0  # seconds between health checks
 HEALTH_CHECK_STALE_THRESHOLD = 120.0  # seconds without SSE activity before concern

-# E.164 phone number pattern for redaction
-_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
-
-
 # ---------------------------------------------------------------------------
 # Helpers
 # ---------------------------------------------------------------------------

-def _redact_phone(phone: str) -> str:
-    """Redact a phone number for logging: +15551234567 -> +155****4567."""
-    if not phone:
-        return "<none>"
-    if len(phone) <= 8:
-        return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
-    return phone[:4] + "****" + phone[-4:]
-

 def _parse_comma_list(value: str) -> List[str]:
    """Split a comma-separated string into a list, stripping whitespace."""
@@ -184,10 +173,8 @@ class SignalAdapter(BasePlatformAdapter):
        self._recent_sent_timestamps: set = set()
        self._max_recent_timestamps = 50

-        self._phone_lock_identity: Optional[str] = None
-
        logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
-                     self.http_url, _redact_phone(self.account),
+                     self.http_url, redact_phone(self.account),
                     "enabled" if self.group_allow_from else "disabled")

    # ------------------------------------------------------------------
@@ -202,23 +189,7 @@ class SignalAdapter(BasePlatformAdapter):

        # Acquire scoped lock to prevent duplicate Signal listeners for the same phone
        try:
-            from gateway.status import acquire_scoped_lock
-
-            self._phone_lock_identity = self.account
-            acquired, existing = acquire_scoped_lock(
-                "signal-phone",
-                self._phone_lock_identity,
-                metadata={"platform": self.platform.value},
-            )
-            if not acquired:
-                owner_pid = existing.get("pid") if isinstance(existing, dict) else None
-                message = (
-                    "Another local Hermes gateway is already using this Signal account"
-                    + (f" (PID {owner_pid})." if owner_pid else ".")
-                    + " Stop the other gateway before starting a second Signal listener."
-                )
-                logger.error("Signal: %s", message)
-                self._set_fatal_error("signal_phone_lock", message, retryable=False)
+            if not self._acquire_platform_lock('signal-phone', self.account, 'Signal account'):
                return False
        except Exception as e:
            logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
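The `_acquire_platform_lock` / `_release_platform_lock` helpers that replace the per-adapter lock code live on `BasePlatformAdapter` and are not shown in this diff. The following is only a sketch of what such a pair might look like, reconstructed from the removed Signal code. The in-memory `acquire_scoped_lock` / `release_scoped_lock` stand-ins, the `PlatformLockMixin` class name, the `_platform_lock` attribute, and the `"platform_lock"` error code are all assumptions made for illustration; the real functions come from `gateway.status`.

```python
import os
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for gateway.status (not shown in this diff).
_LOCKS: Dict[Tuple[str, str], dict] = {}


def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict] = None):
    """Return (True, None) on success, or (False, existing_record) if held."""
    key = (scope, identity)
    if key in _LOCKS:
        return False, _LOCKS[key]
    _LOCKS[key] = {"pid": os.getpid(), **(metadata or {})}
    return True, None


def release_scoped_lock(scope: str, identity: str) -> None:
    _LOCKS.pop((scope, identity), None)


class PlatformLockMixin:
    """Hypothetical home for the shared lock helpers."""

    name = "adapter"

    def __init__(self) -> None:
        self._platform_lock: Optional[Tuple[str, str]] = None
        self.fatal_error: Optional[Tuple[str, str, bool]] = None

    def _set_fatal_error(self, code: str, message: str, retryable: bool) -> None:
        self.fatal_error = (code, message, retryable)

    def _acquire_platform_lock(self, scope: str, identity: str, resource_desc: str) -> bool:
        acquired, existing = acquire_scoped_lock(
            scope, identity, metadata={"platform": self.name}
        )
        if not acquired:
            owner_pid = existing.get("pid") if isinstance(existing, dict) else None
            message = (
                f"Another local Hermes gateway is already using this {resource_desc}"
                + (f" (PID {owner_pid})." if owner_pid else ".")
                + " Stop the other gateway first."
            )
            self._set_fatal_error("platform_lock", message, retryable=False)
            return False
        # Remember what was locked so release needs no arguments.
        self._platform_lock = (scope, identity)
        return True

    def _release_platform_lock(self) -> None:
        if self._platform_lock is None:
            return  # Nothing held; calling release twice is harmless.
        scope, identity = self._platform_lock
        self._platform_lock = None
        release_scoped_lock(scope, identity)
```

Storing the `(scope, identity)` pair on the instance is what would let each adapter drop its own `_phone_lock_identity` / `_token_lock_identity` attribute, as the hunks in this diff do.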
@@ -270,13 +241,7 @@ class SignalAdapter(BasePlatformAdapter):
            await self.client.aclose()
            self.client = None

-        if self._phone_lock_identity:
-            try:
-                from gateway.status import release_scoped_lock
-                release_scoped_lock("signal-phone", self._phone_lock_identity)
-            except Exception as e:
-                logger.warning("Signal: Error releasing phone lock: %s", e, exc_info=True)
-            self._phone_lock_identity = None
+        self._release_platform_lock()

        logger.info("Signal: disconnected")

@@ -542,7 +507,7 @@ class SignalAdapter(BasePlatformAdapter):
        )

        logger.debug("Signal: message from %s in %s: %s",
-                      _redact_phone(sender), chat_id[:20], (text or "")[:50])
+                      redact_phone(sender), chat_id[:20], (text or "")[:50])

        await self.handle_message(event)

@@ -33,6 +33,7 @@ from pathlib import Path as _Path
 sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))

 from gateway.config import Platform, PlatformConfig
+from gateway.platforms.helpers import MessageDeduplicator
 from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
@@ -89,11 +90,9 @@ class SlackAdapter(BasePlatformAdapter):
        self._team_clients: Dict[str, AsyncWebClient] = {}   # team_id → WebClient
        self._team_bot_user_ids: Dict[str, str] = {}          # team_id → bot_user_id
        self._channel_team: Dict[str, str] = {}                # channel_id → team_id
-        # Dedup cache: event_ts → timestamp.  Prevents duplicate bot
-        # responses when Socket Mode reconnects redeliver events.
-        self._seen_messages: Dict[str, float] = {}
-        self._SEEN_TTL = 300   # 5 minutes
-        self._SEEN_MAX = 2000  # prune threshold
+        # Dedup cache: prevents duplicate bot responses when Socket Mode
+        # reconnects redeliver events.
+        self._dedup = MessageDeduplicator()
        # Track pending approval message_ts → resolved flag to prevent
        # double-clicks on approval buttons.
        self._approval_resolved: Dict[str, bool] = {}
@@ -152,15 +151,7 @@ class SlackAdapter(BasePlatformAdapter):
                logger.warning("[Slack] Failed to read %s: %s", tokens_file, e)

        try:
-            # Acquire scoped lock to prevent duplicate app token usage
-            from gateway.status import acquire_scoped_lock
-            self._token_lock_identity = app_token
-            acquired, existing = acquire_scoped_lock('slack-app-token', app_token, metadata={'platform': 'slack'})
-            if not acquired:
-                owner_pid = existing.get('pid') if isinstance(existing, dict) else None
-                message = f'Slack app token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
-                logger.error('[%s] %s', self.name, message)
-                self._set_fatal_error('slack_token_lock', message, retryable=False)
+            if not self._acquire_platform_lock('slack-app-token', app_token, 'Slack app token'):
                return False

            # First token is the primary — used for AsyncApp / Socket Mode
@@ -247,14 +238,7 @@ class SlackAdapter(BasePlatformAdapter):
                logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
        self._running = False

-        # Release the token lock (use stored identity, not re-read env)
-        try:
-            from gateway.status import release_scoped_lock
-            if getattr(self, '_token_lock_identity', None):
-                release_scoped_lock('slack-app-token', self._token_lock_identity)
-                self._token_lock_identity = None
-        except Exception:
-            pass
+        self._release_platform_lock()

        logger.info("[Slack] Disconnected")

@@ -953,17 +937,8 @@ class SlackAdapter(BasePlatformAdapter):
        """Handle an incoming Slack message event."""
        # Dedup: Slack Socket Mode can redeliver events after reconnects (#4777)
        event_ts = event.get("ts", "")
-        if event_ts:
-            now = time.time()
-            if event_ts in self._seen_messages:
-                return
-            self._seen_messages[event_ts] = now
-            if len(self._seen_messages) > self._SEEN_MAX:
-                cutoff = now - self._SEEN_TTL
-                self._seen_messages = {
-                    k: v for k, v in self._seen_messages.items()
-                    if v > cutoff
-                }
+        if event_ts and self._dedup.is_duplicate(event_ts):
+            return

        # Bot message filtering (SLACK_ALLOW_BOTS / config allow_bots):
        #   "none"     — ignore all bot messages (default, backward-compatible)
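`MessageDeduplicator` itself is defined in `gateway/platforms/helpers.py`, outside this diff. The sketch below shows one way a class with the interface used here (`is_duplicate`, `clear`, and the `ttl_seconds` / `max_size` keyword arguments) could work. The default values are borrowed from the removed Slack constants, and the internals are an assumption, not the actual implementation.

```python
import time
from typing import Dict


class MessageDeduplicator:
    """Hypothetical TTL-bounded cache of recently seen message IDs."""

    def __init__(self, ttl_seconds: float = 300, max_size: int = 2000) -> None:
        self._ttl = ttl_seconds
        self._max_size = max_size
        self._seen: Dict[str, float] = {}

    def is_duplicate(self, message_id: str) -> bool:
        """Return True if message_id was recorded within the TTL window."""
        now = time.time()
        seen_at = self._seen.get(message_id)
        if seen_at is not None and now - seen_at < self._ttl:
            return True
        self._seen[message_id] = now
        # Prune lazily: only scan the cache once it outgrows max_size.
        if len(self._seen) > self._max_size:
            cutoff = now - self._ttl
            self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
        return False

    def clear(self) -> None:
        self._seen.clear()
```

Note one deliberate difference from the removed Slack code: this sketch treats an expired entry as new even before it has been pruned, whereas the old inline check returned early for any ID still present in the dictionary.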
@@ -19,7 +19,6 @@ import asyncio
 import base64
 import logging
 import os
-import re
 import urllib.parse
 from typing import Any, Dict, Optional

@@ -30,6 +29,7 @@ from gateway.platforms.base import (
    MessageType,
    SendResult,
 )
+from gateway.platforms.helpers import redact_phone, strip_markdown

 logger = logging.getLogger(__name__)

@@ -37,18 +37,6 @@ TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
 MAX_SMS_LENGTH = 1600  # ~10 SMS segments
 DEFAULT_WEBHOOK_PORT = 8080

-# E.164 phone number pattern for redaction
-_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
-
-
-def _redact_phone(phone: str) -> str:
-    """Redact a phone number for logging: +15551234567 -> +1555***4567."""
-    if not phone:
-        return "<none>"
-    if len(phone) <= 8:
-        return phone[:2] + "***" + phone[-2:] if len(phone) > 4 else "****"
-    return phone[:5] + "***" + phone[-4:]
-

 def check_sms_requirements() -> bool:
    """Check if SMS adapter dependencies are available."""
@@ -114,7 +102,7 @@ class SmsAdapter(BasePlatformAdapter):
        logger.info(
            "[sms] Twilio webhook server listening on port %d, from: %s",
            self._webhook_port,
-            _redact_phone(self._from_number),
+            redact_phone(self._from_number),
        )
        return True

@@ -163,7 +151,7 @@ class SmsAdapter(BasePlatformAdapter):
                            error_msg = body.get("message", str(body))
                            logger.error(
                                "[sms] send failed to %s: %s %s",
-                                _redact_phone(chat_id),
+                                redact_phone(chat_id),
                                resp.status,
                                error_msg,
                            )
@@ -174,7 +162,7 @@ class SmsAdapter(BasePlatformAdapter):
                        msg_sid = body.get("sid", "")
                        last_result = SendResult(success=True, message_id=msg_sid)
                except Exception as e:
-                    logger.error("[sms] send error to %s: %s", _redact_phone(chat_id), e)
+                    logger.error("[sms] send error to %s: %s", redact_phone(chat_id), e)
                    return SendResult(success=False, error=str(e))
        finally:
            # Close session only if we created a fallback (no persistent session)
@@ -192,16 +180,7 @@ class SmsAdapter(BasePlatformAdapter):

    def format_message(self, content: str) -> str:
        """Strip markdown — SMS renders it as literal characters."""
-        content = re.sub(r"\*\*(.+?)\*\*", r"\1", content, flags=re.DOTALL)
-        content = re.sub(r"\*(.+?)\*", r"\1", content, flags=re.DOTALL)
-        content = re.sub(r"__(.+?)__", r"\1", content, flags=re.DOTALL)
-        content = re.sub(r"_(.+?)_", r"\1", content, flags=re.DOTALL)
-        content = re.sub(r"```[a-z]*\n?", "", content)
-        content = re.sub(r"`(.+?)`", r"\1", content)
-        content = re.sub(r"^#{1,6}\s+", "", content, flags=re.MULTILINE)
-        content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)
-        content = re.sub(r"\n{3,}", "\n\n", content)
-        return content.strip()
+        return strip_markdown(content)

    # ------------------------------------------------------------------
    # Twilio webhook handler
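The shared `strip_markdown` helper is not part of this diff. Assuming it simply hosts the regex chain removed from `SmsAdapter.format_message` above, it might look like the sketch below. The only intentional change is writing the code-fence pattern as `` `{3} `` instead of three literal backticks, which matches the same text.

```python
import re


def strip_markdown(content: str) -> str:
    """Remove common markdown syntax so plain-text channels show clean text."""
    content = re.sub(r"\*\*(.+?)\*\*", r"\1", content, flags=re.DOTALL)  # **bold**
    content = re.sub(r"\*(.+?)\*", r"\1", content, flags=re.DOTALL)      # *italic*
    content = re.sub(r"__(.+?)__", r"\1", content, flags=re.DOTALL)      # __bold__
    content = re.sub(r"_(.+?)_", r"\1", content, flags=re.DOTALL)        # _italic_
    content = re.sub(r"`{3}[a-z]*\n?", "", content)                      # code fences
    content = re.sub(r"`(.+?)`", r"\1", content)                         # inline code
    content = re.sub(r"^#{1,6}\s+", "", content, flags=re.MULTILINE)     # headings
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)          # [text](url)
    content = re.sub(r"\n{3,}", "\n\n", content)                         # blank runs
    return content.strip()
```

Because the underscore rule is not word-boundary aware, identifiers such as `a_b_c` lose their underscores; that behavior is inherited from the removed code.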
@@ -236,7 +215,7 @@ class SmsAdapter(BasePlatformAdapter):

        # Ignore messages from our own number (echo prevention)
        if from_number == self._from_number:
-            logger.debug("[sms] ignoring echo from own number %s", _redact_phone(from_number))
+            logger.debug("[sms] ignoring echo from own number %s", redact_phone(from_number))
            return web.Response(
                text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
                content_type="application/xml",
@@ -244,8 +223,8 @@ class SmsAdapter(BasePlatformAdapter):

        logger.info(
            "[sms] inbound from %s -> %s: %s",
-            _redact_phone(from_number),
-            _redact_phone(to_number),
+            redact_phone(from_number),
+            redact_phone(to_number),
            text[:80],
        )

@@ -147,7 +147,6 @@ class TelegramAdapter(BasePlatformAdapter):
        self._text_batch_split_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
        self._pending_text_batches: Dict[str, MessageEvent] = {}
        self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
-        self._token_lock_identity: Optional[str] = None
        self._polling_error_task: Optional[asyncio.Task] = None
        self._polling_conflict_count: int = 0
        self._polling_network_error_count: int = 0
@@ -497,23 +496,7 @@ class TelegramAdapter(BasePlatformAdapter):
            return False
        
        try:
-            from gateway.status import acquire_scoped_lock
-
-            self._token_lock_identity = self.config.token
-            acquired, existing = acquire_scoped_lock(
-                "telegram-bot-token",
-                self._token_lock_identity,
-                metadata={"platform": self.platform.value},
-            )
-            if not acquired:
-                owner_pid = existing.get("pid") if isinstance(existing, dict) else None
-                message = (
-                    "Another local Hermes gateway is already using this Telegram bot token"
-                    + (f" (PID {owner_pid})." if owner_pid else ".")
-                    + " Stop the other gateway before starting a second Telegram poller."
-                )
-                logger.error("[%s] %s", self.name, message)
-                self._set_fatal_error("telegram_token_lock", message, retryable=False)
+            if not self._acquire_platform_lock('telegram-bot-token', self.config.token, 'Telegram bot token'):
                return False

            # Build the application
@@ -737,12 +720,7 @@ class TelegramAdapter(BasePlatformAdapter):
            return True
            
        except Exception as e:
-            if self._token_lock_identity:
-                try:
-                    from gateway.status import release_scoped_lock
-                    release_scoped_lock("telegram-bot-token", self._token_lock_identity)
-                except Exception:
-                    pass
+            self._release_platform_lock()
            message = f"Telegram startup failed: {e}"
            self._set_fatal_error("telegram_connect_error", message, retryable=True)
            logger.error("[%s] Failed to connect to Telegram: %s", self.name, e, exc_info=True)
@@ -768,12 +746,7 @@ class TelegramAdapter(BasePlatformAdapter):
                await self._app.shutdown()
            except Exception as e:
                logger.warning("[%s] Error during Telegram disconnect: %s", self.name, e, exc_info=True)
-        if self._token_lock_identity:
-            try:
-                from gateway.status import release_scoped_lock
-                release_scoped_lock("telegram-bot-token", self._token_lock_identity)
-            except Exception as e:
-                logger.warning("[%s] Error releasing Telegram token lock: %s", self.name, e, exc_info=True)
+        self._release_platform_lock()

        for task in self._pending_photo_batch_tasks.values():
            if task and not task.done():
@@ -784,7 +757,6 @@ class TelegramAdapter(BasePlatformAdapter):
        self._mark_disconnected()
        self._app = None
        self._bot = None
-        self._token_lock_identity = None
        logger.info("[%s] Disconnected from Telegram", self.name)

    def _should_thread_reply(self, reply_to: Optional[str], chunk_index: int) -> bool:
@@ -59,6 +59,7 @@ except ImportError:
    httpx = None  # type: ignore[assignment]

 from gateway.config import Platform, PlatformConfig
+from gateway.platforms.helpers import MessageDeduplicator
 from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
@@ -92,7 +93,6 @@ REQUEST_TIMEOUT_SECONDS = 15.0
 HEARTBEAT_INTERVAL_SECONDS = 30.0
 RECONNECT_BACKOFF = [2, 5, 10, 30, 60]

-DEDUP_WINDOW_SECONDS = 300
 DEDUP_MAX_SIZE = 1000

 IMAGE_MAX_BYTES = 10 * 1024 * 1024
@@ -172,7 +172,7 @@ class WeComAdapter(BasePlatformAdapter):
        self._listen_task: Optional[asyncio.Task] = None
        self._heartbeat_task: Optional[asyncio.Task] = None
        self._pending_responses: Dict[str, asyncio.Future] = {}
-        self._seen_messages: Dict[str, float] = {}
+        self._dedup = MessageDeduplicator(max_size=DEDUP_MAX_SIZE)
        self._reply_req_ids: Dict[str, str] = {}

        # Text batching: merge rapid successive messages (Telegram-style).
@@ -250,7 +250,7 @@ class WeComAdapter(BasePlatformAdapter):
            await self._http_client.aclose()
            self._http_client = None

-        self._seen_messages.clear()
+        self._dedup.clear()
        logger.info("[%s] Disconnected", self.name)

    async def _cleanup_ws(self) -> None:
@@ -476,7 +476,7 @@ class WeComAdapter(BasePlatformAdapter):
            return

        msg_id = str(body.get("msgid") or self._payload_req_id(payload) or uuid.uuid4().hex)
-        if self._is_duplicate(msg_id):
+        if self._dedup.is_duplicate(msg_id):
            logger.debug("[%s] Duplicate message %s ignored", self.name, msg_id)
            return
        self._remember_reply_req_id(msg_id, self._payload_req_id(payload))
@@ -636,6 +636,13 @@ class WeComAdapter(BasePlatformAdapter):
                if voice_text:
                    text_parts.append(voice_text)

+            # Extract appmsg title (filename) for WeCom AI Bot attachments
+            if msgtype == "appmsg":
+                appmsg = body.get("appmsg") if isinstance(body.get("appmsg"), dict) else {}
+                title = str(appmsg.get("title") or "").strip()
+                if title:
+                    text_parts.append(title)
+
        quote = body.get("quote") if isinstance(body.get("quote"), dict) else {}
        quote_type = str(quote.get("msgtype") or "").lower()
        if quote_type == "text":
@@ -668,6 +675,13 @@ class WeComAdapter(BasePlatformAdapter):
                refs.append(("image", body["image"]))
            if msgtype == "file" and isinstance(body.get("file"), dict):
                refs.append(("file", body["file"]))
+            # Handle appmsg (WeCom AI Bot attachments with PDF/Word/Excel)
+            if msgtype == "appmsg" and isinstance(body.get("appmsg"), dict):
+                appmsg = body["appmsg"]
+                if isinstance(appmsg.get("file"), dict):
+                    refs.append(("file", appmsg["file"]))
+                elif isinstance(appmsg.get("image"), dict):
+                    refs.append(("image", appmsg["image"]))

        quote = body.get("quote") if isinstance(body.get("quote"), dict) else {}
        quote_type = str(quote.get("msgtype") or "").lower()
@@ -825,24 +839,6 @@ class WeComAdapter(BasePlatformAdapter):
        wildcard = self._groups.get("*")
        return wildcard if isinstance(wildcard, dict) else {}

-    def _is_duplicate(self, msg_id: str) -> bool:
-        now = time.time()
-        if len(self._seen_messages) > DEDUP_MAX_SIZE:
-            cutoff = now - DEDUP_WINDOW_SECONDS
-            self._seen_messages = {
-                key: ts for key, ts in self._seen_messages.items() if ts > cutoff
-            }
-            if self._reply_req_ids:
-                self._reply_req_ids = {
-                    key: value for key, value in self._reply_req_ids.items() if key in self._seen_messages
-                }
-
-        if msg_id in self._seen_messages:
-            return True
-
-        self._seen_messages[msg_id] = now
-        return False
-
    def _remember_reply_req_id(self, message_id: str, req_id: str) -> None:
        normalized_message_id = str(message_id or "").strip()
        normalized_req_id = str(req_id or "").strip()
@@ -53,6 +53,7 @@ except ImportError:  # pragma: no cover - dependency gate
    CRYPTO_AVAILABLE = False

 from gateway.config import Platform, PlatformConfig
+from gateway.platforms.helpers import MessageDeduplicator
 from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
@@ -63,6 +64,7 @@ from gateway.platforms.base import (
    cache_image_from_bytes,
 )
 from hermes_constants import get_hermes_home
+from utils import atomic_json_write

 ILINK_BASE_URL = "https://ilinkai.weixin.qq.com"
 WEIXIN_CDN_BASE_URL = "https://novac2c.cdn.weixin.qq.com/c2c"
@@ -206,7 +208,7 @@ def save_weixin_account(
        "saved_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    path = _account_file(hermes_home, account_id)
-    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
+    atomic_json_write(path, payload)
    try:
        path.chmod(0o600)
    except OSError:
@@ -269,7 +271,7 @@ class ContextTokenStore:
            if key.startswith(prefix)
        }
        try:
-            self._path(account_id).write_text(json.dumps(payload), encoding="utf-8")
+            atomic_json_write(self._path(account_id), payload)
        except Exception as exc:
            logger.warning("weixin: failed to persist context tokens for %s: %s", _safe_id(account_id), exc)
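`atomic_json_write` is imported from `utils` and its body is not shown here. A common way to implement such a helper is to write to a temporary file in the same directory and then rename it over the target, so readers never observe a half-written file. The sketch below follows that pattern; the `indent` parameter and the fsync step are assumptions, not confirmed details of the real function.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Union


def atomic_json_write(path: Union[str, Path], payload: Any, indent: int = 2) -> None:
    """Serialize payload as JSON and replace path in a single rename."""
    path = Path(path)
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=indent)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if serialization or the rename failed.
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise
```

Compared with `path.write_text(json.dumps(...))`, a crash mid-write leaves the previous file contents intact rather than a truncated JSON document.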

@@ -755,23 +757,58 @@ def _pack_markdown_blocks_for_weixin(content: str, max_length: int) -> List[str]
    return packed


-def _split_text_for_weixin_delivery(content: str, max_length: int) -> List[str]:
+def _split_text_for_weixin_delivery(
+    content: str, max_length: int, split_per_line: bool = False,
+) -> List[str]:
    """Split content into sequential Weixin messages.

-    Prefer one message per top-level line/markdown unit when the author used
-    explicit line breaks. Oversized units fall back to block-aware packing so
-    long code fences still split safely.
-    """
-    if len(content) <= max_length and "\n" not in content:
-        return [content]
+    *compact* (default): Keep everything in a single message whenever it fits
+    within the platform limit, even when the author used explicit line breaks.
+    Only fall back to block-aware packing when the payload exceeds
+    ``max_length``.

-    chunks: List[str] = []
-    for unit in _split_delivery_units_for_weixin(content):
-        if len(unit) <= max_length:
-            chunks.append(unit)
-            continue
-        chunks.extend(_pack_markdown_blocks_for_weixin(unit, max_length))
-    return chunks or [content]
+    *per_line* (``split_per_line=True``): Legacy behavior — top-level line
+    breaks become separate chat messages; oversized units still use
+    block-aware packing.
+
+    The active mode is controlled via ``config.yaml`` ->
+    ``platforms.weixin.extra.split_multiline_messages`` (``true`` / ``false``)
+    or the env var ``WEIXIN_SPLIT_MULTILINE_MESSAGES``.
+    """
+    if split_per_line:
+        # Legacy: one message per top-level delivery unit.
+        if len(content) <= max_length and "\n" not in content:
+            return [content]
+        chunks: List[str] = []
+        for unit in _split_delivery_units_for_weixin(content):
+            if len(unit) <= max_length:
+                chunks.append(unit)
+                continue
+            chunks.extend(_pack_markdown_blocks_for_weixin(unit, max_length))
+        return chunks or [content]
+
+    # Compact (default): single message when under the limit.
+    if len(content) <= max_length:
+        return [content]
+    return _pack_markdown_blocks_for_weixin(content, max_length) or [content]
+
+
+def _coerce_bool(value: Any, default: bool = True) -> bool:
+    """Coerce a config value to bool, tolerating strings like ``"true"``."""
+    if value is None:
+        return default
+    if isinstance(value, bool):
+        return value
+    if isinstance(value, (int, float)):
+        return bool(value)
+    text = str(value).strip().lower()
+    if not text:
+        return default
+    if text in {"1", "true", "yes", "on"}:
+        return True
+    if text in {"0", "false", "no", "off"}:
+        return False
+    return default


 def _extract_text(item_list: List[Dict[str, Any]]) -> str:
@@ -833,7 +870,7 @@ def _load_sync_buf(hermes_home: str, account_id: str) -> str:

 def _save_sync_buf(hermes_home: str, account_id: str, sync_buf: str) -> None:
    path = _sync_buf_path(hermes_home, account_id)
-    path.write_text(json.dumps({"get_updates_buf": sync_buf}), encoding="utf-8")
+    atomic_json_write(path, {"get_updates_buf": sync_buf})


 async def qr_login(
@@ -972,8 +1009,7 @@ class WeixinAdapter(BasePlatformAdapter):
        self._typing_cache = TypingTicketCache()
        self._session: Optional[aiohttp.ClientSession] = None
        self._poll_task: Optional[asyncio.Task] = None
-        self._seen_messages: Dict[str, float] = {}
-        self._token_lock_identity: Optional[str] = None
+        self._dedup = MessageDeduplicator(ttl_seconds=MESSAGE_DEDUP_TTL_SECONDS)

        self._account_id = str(extra.get("account_id") or os.getenv("WEIXIN_ACCOUNT_ID", "")).strip()
        self._token = str(config.token or extra.get("token") or os.getenv("WEIXIN_TOKEN", "")).strip()
@@ -981,6 +1017,16 @@ class WeixinAdapter(BasePlatformAdapter):
        self._cdn_base_url = str(
            extra.get("cdn_base_url") or os.getenv("WEIXIN_CDN_BASE_URL", WEIXIN_CDN_BASE_URL)
        ).strip().rstrip("/")
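+        chunk_delay = extra.get("send_chunk_delay_seconds")
+        chunk_retries = extra.get("send_chunk_retries")
+        retry_delay = extra.get("send_chunk_retry_delay_seconds")
+        # Consult the env vars only when a key is unset, so an explicit 0 in config is honored.
+        self._send_chunk_delay_seconds = float(
+            os.getenv("WEIXIN_SEND_CHUNK_DELAY_SECONDS", "0.35") if chunk_delay is None else chunk_delay)
+        self._send_chunk_retries = int(
+            os.getenv("WEIXIN_SEND_CHUNK_RETRIES", "2") if chunk_retries is None else chunk_retries)
+        self._send_chunk_retry_delay_seconds = float(
+            os.getenv("WEIXIN_SEND_CHUNK_RETRY_DELAY_SECONDS", "1.0") if retry_delay is None else retry_delay)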
        self._dm_policy = str(extra.get("dm_policy") or os.getenv("WEIXIN_DM_POLICY", "open")).strip().lower()
        self._group_policy = str(extra.get("group_policy") or os.getenv("WEIXIN_GROUP_POLICY", "disabled")).strip().lower()
        allow_from = extra.get("allow_from")
@@ -991,6 +1037,11 @@ class WeixinAdapter(BasePlatformAdapter):
            group_allow_from = os.getenv("WEIXIN_GROUP_ALLOWED_USERS", "")
        self._allow_from = self._coerce_list(allow_from)
        self._group_allow_from = self._coerce_list(group_allow_from)
+        split_cfg = extra.get("split_multiline_messages")  # `is None`, not `or`: keep an explicit False
+        self._split_multiline_messages = _coerce_bool(
+            os.getenv("WEIXIN_SPLIT_MULTILINE_MESSAGES") if split_cfg is None else split_cfg,
+            default=False,
+        )

        if self._account_id and not self._token:
            persisted = load_weixin_account(hermes_home, self._account_id)
@@ -1026,23 +1077,7 @@ class WeixinAdapter(BasePlatformAdapter):
            return False

        try:
-            from gateway.status import acquire_scoped_lock
-
-            self._token_lock_identity = self._token
-            acquired, existing = acquire_scoped_lock(
-                "weixin-bot-token",
-                self._token_lock_identity,
-                metadata={"platform": self.platform.value},
-            )
-            if not acquired:
-                owner_pid = existing.get("pid") if isinstance(existing, dict) else None
-                message = (
-                    "Another local Hermes gateway is already using this Weixin token"
-                    + (f" (PID {owner_pid})." if owner_pid else ".")
-                    + " Stop the other gateway before starting a second Weixin poller."
-                )
-                logger.error("[%s] %s", self.name, message)
-                self._set_fatal_error("weixin_token_lock", message, retryable=False)
+            if not self._acquire_platform_lock('weixin-bot-token', self._token, 'Weixin bot token'):
                return False
        except Exception as exc:
            logger.debug("[%s] Token lock unavailable (non-fatal): %s", self.name, exc)
@@ -1066,12 +1101,7 @@ class WeixinAdapter(BasePlatformAdapter):
        if self._session and not self._session.closed:
            await self._session.close()
        self._session = None
-        if self._token_lock_identity:
-            try:
-                from gateway.status import release_scoped_lock
-                release_scoped_lock("weixin-bot-token", self._token_lock_identity)
-            except Exception as exc:
-                logger.warning("[%s] Error releasing Weixin token lock: %s", self.name, exc, exc_info=True)
+        self._release_platform_lock()
        self._mark_disconnected()
        logger.info("[%s] Disconnected", self.name)

@@ -1149,16 +1179,8 @@ class WeixinAdapter(BasePlatformAdapter):
            return

        message_id = str(message.get("message_id") or "").strip()
-        if message_id:
-            now = time.time()
-            self._seen_messages = {
-                key: value
-                for key, value in self._seen_messages.items()
-                if now - value < MESSAGE_DEDUP_TTL_SECONDS
-            }
-            if message_id in self._seen_messages:
-                return
-            self._seen_messages[message_id] = now
+        if message_id and self._dedup.is_duplicate(message_id):
+            return

        chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
        if chat_type == "group":
@@ -1330,7 +1352,50 @@ class WeixinAdapter(BasePlatformAdapter):
            logger.debug("[%s] getConfig failed for %s: %s", self.name, _safe_id(user_id), exc)

    def _split_text(self, content: str) -> List[str]:
-        return _split_text_for_weixin_delivery(content, self.MAX_MESSAGE_LENGTH)
+        return _split_text_for_weixin_delivery(
+            content, self.MAX_MESSAGE_LENGTH, self._split_multiline_messages,
+        )
+
+    async def _send_text_chunk(
+        self,
+        *,
+        chat_id: str,
+        chunk: str,
+        context_token: Optional[str],
+        client_id: str,
+    ) -> None:
+        """Send a single text chunk, retrying with a linearly increasing delay."""
+        last_error: Optional[Exception] = None
+        for attempt in range(self._send_chunk_retries + 1):
+            try:
+                await _send_message(
+                    self._session,
+                    base_url=self._base_url,
+                    token=self._token,
+                    to=chat_id,
+                    text=chunk,
+                    context_token=context_token,
+                    client_id=client_id,
+                )
+                return
+            except Exception as exc:
+                last_error = exc
+                if attempt >= self._send_chunk_retries:
+                    break
+                wait = self._send_chunk_retry_delay_seconds * (attempt + 1)
+                logger.warning(
+                    "[%s] send chunk failed to=%s attempt=%d/%d, retrying in %.2fs: %s",
+                    self.name,
+                    _safe_id(chat_id),
+                    attempt + 1,
+                    self._send_chunk_retries + 1,
+                    wait,
+                    exc,
+                )
+                if wait > 0:
+                    await asyncio.sleep(wait)
+        assert last_error is not None
+        raise last_error

    async def send(
        self,
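The retry loop in `_send_text_chunk` can be exercised in isolation. The sketch below extracts the same pattern (up to `retries + 1` attempts, waiting `retry_delay * attempt_number` between them) into a standalone coroutine. The name `send_with_retry`, its parameters, and the returned attempt count are illustrative and not part of the adapter.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def send_with_retry(
    send_once: Callable[[], Awaitable[None]],
    retries: int = 2,
    retry_delay: float = 1.0,
) -> int:
    """Call send_once until it succeeds; return the number of attempts used."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            await send_once()
            return attempt + 1
        except Exception as exc:
            last_error = exc
            if attempt >= retries:
                break
            # Linear backoff: 1x, 2x, 3x ... the base delay.
            wait = retry_delay * (attempt + 1)
            if wait > 0:
                await asyncio.sleep(wait)
    if last_error is None:
        # Only reachable when retries is negative and the loop never ran.
        raise ValueError("retries must be >= 0")
    raise last_error
```

With the defaults used in the adapter (2 retries, 1.0 s base delay), a chunk that keeps failing is attempted three times with waits of 1.0 s and 2.0 s before the last exception propagates to `send`.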
@@ -1344,18 +1409,18 @@ class WeixinAdapter(BasePlatformAdapter):
        context_token = self._token_store.get(self._account_id, chat_id)
        last_message_id: Optional[str] = None
        try:
-            for chunk in self._split_text(self.format_message(content)):
+            chunks = self._split_text(self.format_message(content))
+            for idx, chunk in enumerate(chunks):
                client_id = f"hermes-weixin-{uuid.uuid4().hex}"
-                await _send_message(
-                    self._session,
-                    base_url=self._base_url,
-                    token=self._token,
-                    to=chat_id,
-                    text=chunk,
+                await self._send_text_chunk(
+                    chat_id=chat_id,
+                    chunk=chunk,
                    context_token=context_token,
                    client_id=client_id,
                )
                last_message_id = client_id
+                if idx < len(chunks) - 1 and self._send_chunk_delay_seconds > 0:
+                    await asyncio.sleep(self._send_chunk_delay_seconds)
            return SendResult(success=True, message_id=last_message_id)
        except Exception as exc:
            logger.error("[%s] send failed to=%s: %s", self.name, _safe_id(chat_id), exc)
@@ -145,7 +145,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        self._bridge_log: Optional[Path] = None
        self._poll_task: Optional[asyncio.Task] = None
        self._http_session: Optional["aiohttp.ClientSession"] = None
-        self._session_lock_identity: Optional[str] = None

    def _whatsapp_require_mention(self) -> bool:
        configured = self.config.extra.get("require_mention")
@@ -290,23 +289,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
        
        # Acquire scoped lock to prevent duplicate sessions
        try:
-            from gateway.status import acquire_scoped_lock
-
-            self._session_lock_identity = str(self._session_path)
-            acquired, existing = acquire_scoped_lock(
-                "whatsapp-session",
-                self._session_lock_identity,
-                metadata={"platform": self.platform.value},
-            )
-            if not acquired:
-                owner_pid = existing.get("pid") if isinstance(existing, dict) else None
-                message = (
-                    "Another local Hermes gateway is already using this WhatsApp session"
-                    + (f" (PID {owner_pid})." if owner_pid else ".")
-                    + " Stop the other gateway before starting a second WhatsApp bridge."
-                )
-                logger.error("[%s] %s", self.name, message)
-                self._set_fatal_error("whatsapp_session_lock", message, retryable=False)
+            if not self._acquire_platform_lock('whatsapp-session', str(self._session_path), 'WhatsApp session'):
                return False
        except Exception as e:
            logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)
@@ -468,12 +451,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
            return True
            
        except Exception as e:
-            if self._session_lock_identity:
-                try:
-                    from gateway.status import release_scoped_lock
-                    release_scoped_lock("whatsapp-session", self._session_lock_identity)
-                except Exception:
-                    pass
+            self._release_platform_lock()
            logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
            self._close_bridge_log()
            return False
@@ -546,17 +524,11 @@ class WhatsAppAdapter(BasePlatformAdapter):
            await self._http_session.close()
        self._http_session = None

-        if self._session_lock_identity:
-            try:
-                from gateway.status import release_scoped_lock
-                release_scoped_lock("whatsapp-session", self._session_lock_identity)
-            except Exception as e:
-                logger.warning("[%s] Error releasing WhatsApp session lock: %s", self.name, e, exc_info=True)
+        self._release_platform_lock()

        self._mark_disconnected()
        self._bridge_process = None
        self._close_bridge_log()
-        self._session_lock_identity = None
        print(f"[{self.name}] Disconnected")
    
    async def send(
@@ -1465,7 +1465,18 @@ class GatewayRunner:
                logger.info("Recovered %s background process(es) from previous run", recovered)
        except Exception as e:
            logger.warning("Process checkpoint recovery: %s", e)
-        
+
+        # Suspend sessions that were active when the gateway last exited.
+        # This prevents stuck sessions from being blindly resumed on restart,
+        # which can create an unrecoverable loop (#7536).  Suspended sessions
+        # auto-reset on the next incoming message, giving the user a clean start.
+        try:
+            suspended = self.session_store.suspend_recently_active()
+            if suspended:
+                logger.info("Suspended %d in-flight session(s) from previous run", suspended)
+        except Exception as e:
+            logger.warning("Session suspension on startup failed: %s", e)
+
        connected_count = 0
        enabled_platform_count = 0
        startup_nonretryable_errors: list[str] = []
@@ -2221,6 +2232,13 @@ class GatewayRunner:
        # are system-generated and must skip user authorization.
        if getattr(event, "internal", False):
            pass
+        elif source.user_id is None:
+            # Messages with no user identity (Telegram service messages,
+            # channel forwards, anonymous admin actions) cannot be
+            # authorized — drop silently instead of triggering the pairing
+            # flow with a None user_id.
+            logger.debug("Ignoring message with no user_id from %s", source.platform.value)
+            return None
        elif not self._is_user_authorized(source):
            logger.warning("Unauthorized user: %s (%s) on %s", source.user_id, source.user_name, source.platform.value)
            # In DMs: offer pairing code. In groups: silently ignore.
@@ -2370,8 +2388,11 @@ class GatewayRunner:
                self._pending_messages.pop(_quick_key, None)
                if _quick_key in self._running_agents:
                    del self._running_agents[_quick_key]
-                logger.info("HARD STOP for session %s — session lock released", _quick_key[:20])
-                return "⚡ Force-stopped. The session is unlocked — you can send a new message."
+                # Mark session suspended so the next message starts fresh
+                # instead of resuming the stuck context (#7536).
+                self.session_store.suspend_session(_quick_key)
+                logger.info("HARD STOP for session %s — suspended, session lock released", _quick_key[:20])
+                return "⚡ Force-stopped. The session is suspended — your next message will start fresh."

            # /reset and /new must bypass the running-agent guard so they
            # actually dispatch as commands instead of being queued as user
@@ -2805,7 +2826,9 @@ class GatewayRunner:
        # so the agent knows this is a fresh conversation (not an intentional /reset).
        if getattr(session_entry, 'was_auto_reset', False):
            reset_reason = getattr(session_entry, 'auto_reset_reason', None) or 'idle'
-            if reset_reason == "daily":
+            if reset_reason == "suspended":
+                context_note = "[System note: The user's previous session was stopped and suspended. This is a fresh conversation with no prior context.]"
+            elif reset_reason == "daily":
                context_note = "[System note: The user's session was automatically reset by the daily schedule. This is a fresh conversation with no prior context.]"
            else:
                context_note = "[System note: The user's previous session expired due to inactivity. This is a fresh conversation with no prior context.]"
@@ -2822,7 +2845,9 @@ class GatewayRunner:
                )
                platform_name = source.platform.value if source.platform else ""
                had_activity = getattr(session_entry, 'reset_had_activity', False)
-                should_notify = (
+                # Suspended sessions always notify (they were explicitly stopped
+                # or crashed mid-operation) — skip the policy check.
+                should_notify = reset_reason == "suspended" or (
                    policy.notify
                    and had_activity
                    and platform_name not in policy.notify_exclude_platforms
@@ -2830,7 +2855,9 @@ class GatewayRunner:
                if should_notify:
                    adapter = self.adapters.get(source.platform)
                    if adapter:
-                        if reset_reason == "daily":
+                        if reset_reason == "suspended":
+                            reason_text = "previous session was stopped or interrupted"
+                        elif reset_reason == "daily":
                            reason_text = f"daily schedule at {policy.at_hour}:00"
                        else:
                            hours = policy.idle_minutes // 60
@@ -3913,25 +3940,31 @@ class GatewayRunner:
        handles /stop before this method is reached.  This handler fires
        only through normal command dispatch (no running agent) or as a
        fallback.  Force-clean the session lock in all cases for safety.
+
+        When there IS a running/pending agent, the session is also marked
+        as *suspended* so the next message starts a fresh session instead
+        of resuming the stuck context (#7536).
        """
        source = event.source
        session_entry = self.session_store.get_or_create_session(source)
        session_key = session_entry.session_key
-        
+
        agent = self._running_agents.get(session_key)
        if agent is _AGENT_PENDING_SENTINEL:
            # Force-clean the sentinel so the session is unlocked.
            if session_key in self._running_agents:
                del self._running_agents[session_key]
-            logger.info("HARD STOP (pending) for session %s — sentinel cleared", session_key[:20])
-            return "⚡ Force-stopped. The agent was still starting — session unlocked."
+            self.session_store.suspend_session(session_key)
+            logger.info("HARD STOP (pending) for session %s — suspended, sentinel cleared", session_key[:20])
+            return "⚡ Force-stopped. The agent was still starting — your next message will start fresh."
        if agent:
            agent.interrupt("Stop requested")
            # Force-clean the session lock so a truly hung agent doesn't
            # keep it locked forever.
            if session_key in self._running_agents:
                del self._running_agents[session_key]
-            return "⚡ Force-stopped. The session is unlocked — you can send a new message."
+            self.session_store.suspend_session(session_key)
+            return "⚡ Force-stopped. Your next message will start a fresh session."
        else:
            return "No active task to stop."

@@ -6597,6 +6630,8 @@ class GatewayRunner:
            chat_id=context.source.chat_id,
            chat_name=context.source.chat_name or "",
            thread_id=str(context.source.thread_id) if context.source.thread_id else "",
+            user_id=str(context.source.user_id) if context.source.user_id else "",
+            user_name=str(context.source.user_name) if context.source.user_name else "",
        )

    def _clear_session_env(self, tokens: list) -> None:
@@ -6809,6 +6844,8 @@ class GatewayRunner:
        platform_name = watcher.get("platform", "")
        chat_id = watcher.get("chat_id", "")
        thread_id = watcher.get("thread_id", "")
+        user_id = watcher.get("user_id", "")
+        user_name = watcher.get("user_name", "")
        agent_notify = watcher.get("notify_on_complete", False)
        notify_mode = self._load_background_notifications_mode()

@@ -6864,6 +6901,8 @@ class GatewayRunner:
                                platform=_platform_enum,
                                chat_id=chat_id,
                                thread_id=thread_id or None,
+                                user_id=user_id or None,
+                                user_name=user_name or None,
                            )
                            synth_event = MessageEvent(
                                text=synth_text,
@@ -368,6 +368,11 @@ class SessionEntry:
    # survives gateway restarts (the old in-memory _pre_flushed_sessions
    # set was lost on restart, causing redundant re-flushes).
    memory_flushed: bool = False
+
+    # When True the next call to get_or_create_session() will auto-reset
+    # this session (create a new session_id) so the user starts fresh.
+    # Set by /stop and on gateway startup to break stuck-resume loops (#7536).
+    suspended: bool = False
    
    def to_dict(self) -> Dict[str, Any]:
        result = {
@@ -387,6 +392,7 @@ class SessionEntry:
            "estimated_cost_usd": self.estimated_cost_usd,
            "cost_status": self.cost_status,
            "memory_flushed": self.memory_flushed,
+            "suspended": self.suspended,
        }
        if self.origin:
            result["origin"] = self.origin.to_dict()
@@ -423,6 +429,7 @@ class SessionEntry:
            estimated_cost_usd=data.get("estimated_cost_usd", 0.0),
            cost_status=data.get("cost_status", "unknown"),
            memory_flushed=data.get("memory_flushed", False),
+            suspended=data.get("suspended", False),
        )


@@ -698,7 +705,12 @@ class SessionStore:
            if session_key in self._entries and not force_new:
                entry = self._entries[session_key]

-                reset_reason = self._should_reset(entry, source)
+                # Auto-reset sessions marked as suspended (e.g. after /stop
+                # broke a stuck loop — #7536).
+                if entry.suspended:
+                    reset_reason = "suspended"
+                else:
+                    reset_reason = self._should_reset(entry, source)
                if not reset_reason:
                    entry.updated_at = now
                    self._save()
@@ -771,6 +783,44 @@ class SessionStore:
                    entry.last_prompt_tokens = last_prompt_tokens
                self._save()

+    def suspend_session(self, session_key: str) -> bool:
+        """Mark a session as suspended so it auto-resets on next access.
+
+        Used by ``/stop`` to prevent stuck sessions from being resumed
+        after a gateway restart (#7536).  Returns True if the session
+        existed and was marked.
+        """
+        with self._lock:
+            self._ensure_loaded_locked()
+            if session_key in self._entries:
+                self._entries[session_key].suspended = True
+                self._save()
+                return True
+        return False
+
+    def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
+        """Mark recently-active sessions as suspended.
+
+        Called on gateway startup to prevent sessions that were likely
+        in-flight when the gateway last exited from being blindly resumed
+        (#7536).  Only suspends sessions updated within *max_age_seconds*
+        to avoid resetting long-idle sessions that are harmless to resume.
+        Returns the number of sessions that were suspended.
+        """
+        import time as _time
+
+        cutoff = _time.time() - max_age_seconds
+        count = 0
+        with self._lock:
+            self._ensure_loaded_locked()
+            for entry in self._entries.values():
+                if not entry.suspended and entry.updated_at >= cutoff:
+                    entry.suspended = True
+                    count += 1
+            if count:
+                self._save()
+        return count
+
    def reset_session(self, session_key: str) -> Optional[SessionEntry]:
        """Force reset a session, creating a new session ID."""
        db_end_session_id = None
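
The suspension flow in the hunks above can be sketched in isolation. This is a minimal illustration, not the gateway's implementation: `Entry` and `MiniStore` are hypothetical stand-ins for `SessionEntry` and `SessionStore`, with no locking or persistence, and the explicit `now` parameter is an assumption added so the behaviour can be checked deterministically.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for SessionEntry (only the fields used here)."""
    session_id: int
    updated_at: float
    suspended: bool = False


class MiniStore:
    """Hypothetical in-memory store; the real SessionStore locks and saves."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}
        self._next_id = 1

    def get_or_create(self, key: str, now: Optional[float] = None) -> Entry:
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None or entry.suspended:
            # A suspended entry is replaced by a fresh one (new session_id),
            # which is the "auto-reset" described in the diff.
            entry = Entry(session_id=self._next_id, updated_at=now)
            self._next_id += 1
            self._entries[key] = entry
        else:
            entry.updated_at = now
        return entry

    def suspend_session(self, key: str) -> bool:
        """Mark one session as suspended; return False if it is unknown."""
        entry = self._entries.get(key)
        if entry is None:
            return False
        entry.suspended = True
        return True

    def suspend_recently_active(
        self, max_age_seconds: int = 120, now: Optional[float] = None
    ) -> int:
        """Suspend sessions updated within the window; return how many."""
        now = time.time() if now is None else now
        cutoff = now - max_age_seconds
        count = 0
        for entry in self._entries.values():
            if not entry.suspended and entry.updated_at >= cutoff:
                entry.suspended = True
                count += 1
        return count
```

Only sessions touched inside the window are suspended, so a long-idle session keeps its identifier across a restart while a likely in-flight one starts fresh on the next message.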
@@ -46,12 +46,16 @@ _SESSION_PLATFORM: ContextVar[str] = ContextVar("HERMES_SESSION_PLATFORM", defau
 _SESSION_CHAT_ID: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_ID", default="")
 _SESSION_CHAT_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_NAME", default="")
 _SESSION_THREAD_ID: ContextVar[str] = ContextVar("HERMES_SESSION_THREAD_ID", default="")
+_SESSION_USER_ID: ContextVar[str] = ContextVar("HERMES_SESSION_USER_ID", default="")
+_SESSION_USER_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_USER_NAME", default="")

 _VAR_MAP = {
    "HERMES_SESSION_PLATFORM": _SESSION_PLATFORM,
    "HERMES_SESSION_CHAT_ID": _SESSION_CHAT_ID,
    "HERMES_SESSION_CHAT_NAME": _SESSION_CHAT_NAME,
    "HERMES_SESSION_THREAD_ID": _SESSION_THREAD_ID,
+    "HERMES_SESSION_USER_ID": _SESSION_USER_ID,
+    "HERMES_SESSION_USER_NAME": _SESSION_USER_NAME,
 }


@@ -60,6 +64,8 @@ def set_session_vars(
    chat_id: str = "",
    chat_name: str = "",
    thread_id: str = "",
+    user_id: str = "",
+    user_name: str = "",
 ) -> list:
    """Set all session context variables and return reset tokens.

@@ -74,6 +80,8 @@ def set_session_vars(
        _SESSION_CHAT_ID.set(chat_id),
        _SESSION_CHAT_NAME.set(chat_name),
        _SESSION_THREAD_ID.set(thread_id),
+        _SESSION_USER_ID.set(user_id),
+        _SESSION_USER_NAME.set(user_name),
    ]
    return tokens

@@ -87,6 +95,8 @@ def clear_session_vars(tokens: list) -> None:
        _SESSION_CHAT_ID,
        _SESSION_CHAT_NAME,
        _SESSION_THREAD_ID,
+        _SESSION_USER_ID,
+        _SESSION_USER_NAME,
    ]
    for var, token in zip(vars_in_order, tokens):
        var.reset(token)
@@ -36,7 +36,7 @@ _NEW_SEGMENT = object()
 @dataclass
 class StreamConsumerConfig:
    """Runtime config for a single stream consumer instance."""
-    edit_interval: float = 0.3
+    edit_interval: float = 1.0
    buffer_threshold: int = 40
    cursor: str = " ▉"

@@ -56,6 +56,10 @@ class GatewayStreamConsumer:
        await task         # wait for final edit
    """

+    # After this many consecutive flood-control failures, permanently disable
+    # progressive edits for the remainder of the stream.
+    _MAX_FLOOD_STRIKES = 3
+
    def __init__(
        self,
        adapter: Any,
@@ -76,6 +80,8 @@ class GatewayStreamConsumer:
        self._last_sent_text = ""   # Track last-sent text to skip redundant edits
        self._fallback_final_send = False
        self._fallback_prefix = ""
+        self._flood_strikes = 0         # Consecutive flood-control edit failures
+        self._current_edit_interval = self.cfg.edit_interval  # Adaptive backoff

    @property
    def already_sent(self) -> bool:
@@ -129,7 +135,7 @@ class GatewayStreamConsumer:
                should_edit = (
                    got_done
                    or got_segment_break
-                    or (elapsed >= self.cfg.edit_interval
+                    or (elapsed >= self._current_edit_interval
                        and self._accumulated)
                    or len(self._accumulated) >= self.cfg.buffer_threshold
                )
@@ -173,12 +179,13 @@ class GatewayStreamConsumer:
                        if split_at < _safe_limit // 2:
                            split_at = _safe_limit
                        chunk = self._accumulated[:split_at]
-                        await self._send_or_edit(chunk)
-                        if self._fallback_final_send:
-                            # Edit failed while attempting to split an oversized
-                            # message. Keep the full accumulated text intact so
-                            # the fallback final-send path can deliver the
-                            # remaining continuation without dropping content.
+                        ok = await self._send_or_edit(chunk)
+                        if self._fallback_final_send or not ok:
+                            # Edit failed (or backed off due to flood control)
+                            # while attempting to split an oversized message.
+                            # Keep the full accumulated text intact so the
+                            # fallback final-send path can deliver the remaining
+                            # continuation without dropping content.
                            break
                        self._accumulated = self._accumulated[split_at:].lstrip("\n")
                        self._message_id = None
@@ -322,7 +329,10 @@ class GatewayStreamConsumer:
        return chunks

    async def _send_fallback_final(self, text: str) -> None:
-        """Send the final continuation after streaming edits stop working."""
+        """Send the final continuation after streaming edits stop working.
+
+        Retries each chunk once on flood-control failures with a short delay.
+        """
        final_text = self._clean_for_display(text)
        continuation = self._continuation_text(final_text)
        self._fallback_final_send = False
@@ -339,12 +349,25 @@ class GatewayStreamConsumer:
        last_successful_chunk = ""
        sent_any_chunk = False
        for chunk in chunks:
-            result = await self.adapter.send(
-                chat_id=self.chat_id,
-                content=chunk,
-                metadata=self.metadata,
-            )
-            if not result.success:
+            # Try sending with one retry on flood-control errors.
+            result = None
+            for attempt in range(2):
+                result = await self.adapter.send(
+                    chat_id=self.chat_id,
+                    content=chunk,
+                    metadata=self.metadata,
+                )
+                if result.success:
+                    break
+                if attempt == 0 and self._is_flood_error(result):
+                    logger.debug(
+                        "Flood control on fallback send, retrying in 3s"
+                    )
+                    await asyncio.sleep(3.0)
+                else:
+                    break  # non-flood error or second attempt failed
+
+            if not result or not result.success:
                if sent_any_chunk:
                    # Some continuation text already reached the user. Suppress
                    # the base gateway final-send path so we don't resend the
@@ -370,20 +393,52 @@ class GatewayStreamConsumer:
        self._last_sent_text = chunks[-1]
        self._fallback_prefix = ""

-    async def _send_or_edit(self, text: str) -> None:
-        """Send or edit the streaming message."""
+    def _is_flood_error(self, result) -> bool:
+        """Check if a SendResult failure is due to flood control / rate limiting."""
+        err = getattr(result, "error", "") or ""
+        err_lower = err.lower()
+        return "flood" in err_lower or "retry after" in err_lower or "rate" in err_lower
+
+    async def _try_strip_cursor(self) -> None:
+        """Best-effort edit to remove the cursor from the last visible message.
+
+        Called when entering fallback mode so the user doesn't see a stuck
+        cursor (▉) in the partial message.
+        """
+        if not self._message_id or self._message_id == "__no_edit__":
+            return
+        prefix = self._visible_prefix()
+        if not prefix or not prefix.strip():
+            return
+        try:
+            await self.adapter.edit_message(
+                chat_id=self.chat_id,
+                message_id=self._message_id,
+                content=prefix,
+            )
+            self._last_sent_text = prefix
+        except Exception:
+            pass  # best-effort — don't let this block the fallback path
+
+    async def _send_or_edit(self, text: str) -> bool:
+        """Send or edit the streaming message.
+
+        Returns True if the text was successfully delivered (sent or edited),
+        False otherwise.  Callers like the overflow split loop use this to
+        decide whether to advance past the delivered chunk.
+        """
        # Strip MEDIA: directives so they don't appear as visible text.
        # Media files are delivered as native attachments after the stream
        # finishes (via _deliver_media_from_response in gateway/run.py).
        text = self._clean_for_display(text)
        if not text.strip():
-            return
+            return True  # nothing to send is "success"
        try:
            if self._message_id is not None:
                if self._edit_supported:
                    # Skip if text is identical to what we last sent
                    if text == self._last_sent_text:
-                        return
+                        return True
                    # Edit existing message
                    result = await self.adapter.edit_message(
                        chat_id=self.chat_id,
@@ -393,19 +448,52 @@ class GatewayStreamConsumer:
                    if result.success:
                        self._already_sent = True
                        self._last_sent_text = text
+                        # Successful edit — reset flood strike counter
+                        self._flood_strikes = 0
+                        return True
                    else:
-                        # If an edit fails mid-stream (especially Telegram flood control),
-                        # stop progressive edits and send only the missing tail once the
+                        # Edit failed.  If this looks like flood control / rate
+                        # limiting, use adaptive backoff: double the edit interval
+                        # and retry on the next cycle.  Only permanently disable
+                        # edits after _MAX_FLOOD_STRIKES consecutive failures.
+                        if self._is_flood_error(result):
+                            self._flood_strikes += 1
+                            self._current_edit_interval = min(
+                                self._current_edit_interval * 2, 10.0,
+                            )
+                            logger.debug(
+                                "Flood control on edit (strike %d/%d), "
+                                "backoff interval → %.1fs",
+                                self._flood_strikes,
+                                self._MAX_FLOOD_STRIKES,
+                                self._current_edit_interval,
+                            )
+                            if self._flood_strikes < self._MAX_FLOOD_STRIKES:
+                                # Don't disable edits yet — just slow down.
+                                # Update _last_edit_time so the next edit
+                                # respects the new interval.
+                                self._last_edit_time = time.monotonic()
+                                return False
+
+                        # Non-flood error OR flood strikes exhausted: enter
+                        # fallback mode — send only the missing tail once the
                        # final response is available.
-                        logger.debug("Edit failed, disabling streaming for this adapter")
+                        logger.debug(
+                            "Edit failed (strikes=%d), entering fallback mode",
+                            self._flood_strikes,
+                        )
                        self._fallback_prefix = self._visible_prefix()
                        self._fallback_final_send = True
                        self._edit_supported = False
                        self._already_sent = True
+                        # Best-effort: strip the cursor from the last visible
+                        # message so the user doesn't see a stuck ▉.
+                        await self._try_strip_cursor()
+                        return False
                else:
                    # Editing not supported — skip intermediate updates.
                    # The final response will be sent by the fallback path.
-                    pass
+                    return False
            else:
                # First message — send new
                result = await self.adapter.send(
@@ -417,6 +505,7 @@ class GatewayStreamConsumer:
                    self._message_id = result.message_id
                    self._already_sent = True
                    self._last_sent_text = text
+                    return True
                elif result.success:
                    # Platform accepted the message but returned no message_id
                    # (e.g. Signal).  Can't edit without an ID — switch to
@@ -428,8 +517,11 @@ class GatewayStreamConsumer:
                    self._fallback_final_send = True
                    # Sentinel prevents re-entering this branch on every delta
                    self._message_id = "__no_edit__"
+                    return True  # platform accepted, just can't edit
                else:
                    # Initial send failed — disable streaming for this session
                    self._edit_supported = False
+                    return False
        except Exception as e:
            logger.error("Stream send/edit error: %s", e)
+            return False
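
The strike counting and interval doubling in `_send_or_edit` can be reduced to a small state machine. The sketch below is an illustration under assumptions: `BackoffState` and `record_edit_result` are hypothetical names, and the real consumer also updates message state and strips the cursor, which is left out here. As in the diff, a successful edit resets the strike counter but leaves the widened interval in place.

```python
from dataclasses import dataclass

MAX_FLOOD_STRIKES = 3   # mirrors _MAX_FLOOD_STRIKES in the diff
MAX_INTERVAL = 10.0     # cap on the edit interval, as in the diff


@dataclass
class BackoffState:
    """Hypothetical container for the consumer's backoff fields."""
    interval: float = 1.0
    strikes: int = 0
    edits_enabled: bool = True


def is_flood_error(error: str) -> bool:
    """Same substring heuristic as _is_flood_error in the diff."""
    err = (error or "").lower()
    return "flood" in err or "retry after" in err or "rate" in err


def record_edit_result(state: BackoffState, success: bool, error: str = "") -> None:
    """Update backoff state after one edit attempt."""
    if success:
        state.strikes = 0
        return
    if is_flood_error(error):
        state.strikes += 1
        state.interval = min(state.interval * 2, MAX_INTERVAL)
        if state.strikes < MAX_FLOOD_STRIKES:
            return  # slow down, but keep progressive edits enabled
    # Non-flood error, or flood strikes exhausted: enter fallback mode.
    state.edits_enabled = False
```

Note that the `"rate"` substring check is broad and would also match unrelated messages containing words such as "generate"; the sketch keeps it only to stay faithful to the diff.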
@@ -250,9 +250,39 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
        api_key_env_vars=("HF_TOKEN",),
        base_url_env_var="HF_BASE_URL",
    ),
+    "xiaomi": ProviderConfig(
+        id="xiaomi",
+        name="Xiaomi MiMo",
+        auth_type="api_key",
+        inference_base_url="https://api.xiaomimimo.com/v1",
+        api_key_env_vars=("XIAOMI_API_KEY",),
+        base_url_env_var="XIAOMI_BASE_URL",
+    ),
 }


+# =============================================================================
+# Anthropic Key Helper
+# =============================================================================
+
+def get_anthropic_key() -> str:
+    """Return the first usable Anthropic credential, or ``""``.
+
+    Checks both the ``.env`` file (via ``get_env_value``) and the process
+    environment (``os.getenv``).  The fallback order mirrors the
+    ``PROVIDER_REGISTRY["anthropic"].api_key_env_vars`` tuple:
+
+        ANTHROPIC_API_KEY -> ANTHROPIC_TOKEN -> CLAUDE_CODE_OAUTH_TOKEN
+    """
+    from hermes_cli.config import get_env_value
+
+    for var in PROVIDER_REGISTRY["anthropic"].api_key_env_vars:
+        value = get_env_value(var) or os.getenv(var, "")
+        if value:
+            return value
+    return ""
+
+
 # =============================================================================
 # Kimi Code Endpoint Detection
 # =============================================================================
@@ -908,6 +938,7 @@ def resolve_provider(
        "opencode": "opencode-zen", "zen": "opencode-zen",
        "qwen-portal": "qwen-oauth", "qwen-cli": "qwen-oauth", "qwen-oauth": "qwen-oauth",
        "hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
+        "mimo": "xiaomi", "xiaomi-mimo": "xiaomi",
        "go": "opencode-go", "opencode-go-sub": "opencode-go",
        "kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
        # Local server aliases — route through the generic custom provider
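
The lookup order in `get_anthropic_key` (per variable: `.env` value first, then process environment; across variables: tuple order) can be shown without the CLI config module. This is a sketch under assumptions: `first_credential` is a hypothetical helper, and a plain mapping stands in for `get_env_value`, whose real behaviour is not shown in this diff.

```python
import os
from typing import Mapping, Optional, Sequence


def first_credential(
    var_names: Sequence[str],
    dotenv_values: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first non-empty value, checking names in the given order.

    For each name, the .env mapping is consulted before the process
    environment; an empty string counts as "not set".
    """
    environ = os.environ if environ is None else environ
    for name in var_names:
        value = dotenv_values.get(name) or environ.get(name, "")
        if value:
            return value
    return ""
```

With the order `("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN")`, a token found only in the `.env` mapping under `ANTHROPIC_TOKEN` still wins over an OAuth token in the process environment, because the earlier name is resolved first.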
@@ -1,8 +1,9 @@
 """hermes claw — OpenClaw migration commands.

 Usage:
-    hermes claw migrate              # Interactive migration from ~/.openclaw
-    hermes claw migrate --dry-run    # Preview what would be migrated
+    hermes claw migrate              # Preview then migrate (always shows preview first)
+    hermes claw migrate --dry-run    # Preview only, no changes
+    hermes claw migrate --yes        # Skip confirmation prompt
    hermes claw migrate --preset full --overwrite  # Full migration, overwrite conflicts
    hermes claw cleanup              # Archive leftover OpenClaw directories
    hermes claw cleanup --dry-run    # Preview what would be archived
@@ -237,12 +238,12 @@ def _cmd_migrate(args):

    # Show what we're doing
    hermes_home = get_hermes_home()
+    auto_yes = getattr(args, "yes", False)
    print()
    print_header("Migration Settings")
    print_info(f"Source:      {source_dir}")
    print_info(f"Target:      {hermes_home}")
    print_info(f"Preset:      {preset}")
-    print_info(f"Mode:        {'dry run (preview only)' if dry_run else 'execute'}")
    print_info(f"Overwrite:   {'yes' if overwrite else 'no (skip conflicts)'}")
    print_info(f"Secrets:     {'yes (allowlisted only)' if migrate_secrets else 'no'}")
    if skill_conflict != "skip":
@@ -251,31 +252,81 @@ def _cmd_migrate(args):
        print_info(f"Workspace:   {workspace_target}")
    print()

-    # For execute mode (non-dry-run), confirm unless --yes was passed
-    if not dry_run and not getattr(args, "yes", False):
-        if not prompt_yes_no("Proceed with migration?", default=True):
-            print_info("Migration cancelled.")
-            return
-
    # Ensure config.yaml exists before migration tries to read it
    config_path = get_config_path()
    if not config_path.exists():
        save_config(load_config())

-    # Load and run the migration
+    # Load the migration module
    try:
        mod = _load_migration_module(script_path)
        if mod is None:
            print_error("Could not load migration script.")
            return
+    except Exception as e:
+        print()
+        print_error(f"Could not load migration script: {e}")
+        logger.debug("OpenClaw migration error", exc_info=True)
+        return

-        selected = mod.resolve_selected_options(None, None, preset=preset)
-        ws_target = Path(workspace_target).resolve() if workspace_target else None
+    selected = mod.resolve_selected_options(None, None, preset=preset)
+    ws_target = Path(workspace_target).resolve() if workspace_target else None

+    # ── Phase 1: Always preview first ──────────────────────────
+    try:
+        preview = mod.Migrator(
+            source_root=source_dir.resolve(),
+            target_root=hermes_home.resolve(),
+            execute=False,
+            workspace_target=ws_target,
+            overwrite=overwrite,
+            migrate_secrets=migrate_secrets,
+            output_dir=None,
+            selected_options=selected,
+            preset_name=preset,
+            skill_conflict_mode=skill_conflict,
+        )
+        preview_report = preview.migrate()
+    except Exception as e:
+        print()
+        print_error(f"Migration preview failed: {e}")
+        logger.debug("OpenClaw migration preview error", exc_info=True)
+        return
+
+    preview_summary = preview_report.get("summary", {})
+    preview_count = preview_summary.get("migrated", 0)
+
+    if preview_count == 0:
+        print()
+        print_info("Nothing to migrate from OpenClaw.")
+        _print_migration_report(preview_report, dry_run=True)
+        return
+
+    print()
+    print_header(f"Migration Preview — {preview_count} item(s) would be imported")
+    print_info("No changes have been made yet. Review the list below:")
+    _print_migration_report(preview_report, dry_run=True)
+
+    # If --dry-run, stop here
+    if dry_run:
+        return
+
+    # ── Phase 2: Confirm and execute ───────────────────────────
+    print()
+    if not auto_yes:
+        if not sys.stdin.isatty():
+            print_info("Non-interactive session — preview only.")
+            print_info("To execute, re-run with: hermes claw migrate --yes")
+            return
+        if not prompt_yes_no("Proceed with migration?", default=True):
+            print_info("Migration cancelled.")
+            return
+
+    try:
        migrator = mod.Migrator(
            source_root=source_dir.resolve(),
            target_root=hermes_home.resolve(),
-            execute=not dry_run,
+            execute=True,
            workspace_target=ws_target,
            overwrite=overwrite,
            migrate_secrets=migrate_secrets,
@@ -292,11 +343,11 @@ def _cmd_migrate(args):
        return

    # Print results
-    _print_migration_report(report, dry_run)
+    _print_migration_report(report, dry_run=False)

-    # After successful non-dry-run migration, offer to archive the source directory
-    if not dry_run and report.get("summary", {}).get("migrated", 0) > 0:
-        _offer_source_archival(source_dir, getattr(args, "yes", False))
+    # After successful migration, offer to archive the source directory
+    if report.get("summary", {}).get("migrated", 0) > 0:
+        _offer_source_archival(source_dir, auto_yes)


 def _offer_source_archival(source_dir: Path, auto_yes: bool = False):
@@ -330,6 +381,11 @@ def _offer_source_archival(source_dir: Path, auto_yes: bool = False):
    print_info("You can always rename it back if needed.")
    print()

+    if not auto_yes and not sys.stdin.isatty():
+        print_info("Non-interactive session — skipping archival.")
+        print_info("Run later with: hermes claw cleanup")
+        return
+
    if auto_yes or prompt_yes_no(f"Archive {source_dir} now?", default=True):
        try:
            archive_path = _archive_directory(source_dir)
@@ -433,6 +489,9 @@ def _cmd_cleanup(args):
        if dry_run:
            archive_path = _archive_directory(source_dir, dry_run=True)
            print_info(f"Would archive: {source_dir} → {archive_path}")
+        elif not auto_yes and not sys.stdin.isatty():
+            print_info(f"Non-interactive session — would archive: {source_dir}")
+            print_info("To execute, re-run with: hermes claw cleanup --yes")
        else:
            if auto_yes or prompt_yes_no(f"Archive {source_dir}?", default=True):
                try:
@@ -0,0 +1,79 @@
+"""Shared CLI output helpers for Hermes CLI modules.
+
+Extracts the identical ``print_info/success/warning/error`` and ``prompt()``
+functions previously duplicated across setup.py, tools_config.py,
+mcp_config.py, and memory_setup.py.
+"""
+
+import getpass
+import sys
+
+from hermes_cli.colors import Colors, color
+
+
+# ─── Print Helpers ────────────────────────────────────────────────────────────
+
+
+def print_info(text: str) -> None:
+    """Print a dim informational message."""
+    print(color(f"  {text}", Colors.DIM))
+
+
+def print_success(text: str) -> None:
+    """Print a green success message with ✓ prefix."""
+    print(color(f"✓ {text}", Colors.GREEN))
+
+
+def print_warning(text: str) -> None:
+    """Print a yellow warning message with ⚠ prefix."""
+    print(color(f"⚠ {text}", Colors.YELLOW))
+
+
+def print_error(text: str) -> None:
+    """Print a red error message with ✗ prefix."""
+    print(color(f"✗ {text}", Colors.RED))
+
+
+def print_header(text: str) -> None:
+    """Print a yellow header preceded by a blank line."""
+    print(color(f"\n  {text}", Colors.YELLOW))
+
+
+# ─── Input Prompts ────────────────────────────────────────────────────────────
+
+
+def prompt(
+    question: str,
+    default: str | None = None,
+    password: bool = False,
+) -> str:
+    """Prompt the user for input with optional default and password masking.
+
+    Replaces the four independent ``_prompt()`` / ``prompt()`` implementations
+    in setup.py, tools_config.py, mcp_config.py, and memory_setup.py.
+
+    Returns the user's input (stripped), or *default* if the user presses Enter.
+    Returns empty string on Ctrl-C or EOF.
+    """
+    suffix = f" [{default}]" if default else ""
+    display = color(f"  {question}{suffix}: ", Colors.YELLOW)
+
+    try:
+        if password:
+            value = getpass.getpass(display)
+        else:
+            value = input(display)
+        value = value.strip()
+        return value if value else (default or "")
+    except (KeyboardInterrupt, EOFError):
+        print()
+        return ""
+
+
+def prompt_yes_no(question: str, default: bool = True) -> bool:
+    """Prompt for a yes/no answer; empty input (or Ctrl-C/EOF) returns *default*."""
+    hint = "Y/n" if default else "y/N"
+    answer = prompt(f"{question} ({hint})")
+    if not answer:
+        return default
+    return answer.lower().startswith("y")
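The answer handling in `prompt_yes_no` above can be exercised without a terminal. The following is a minimal sketch, assuming a hypothetical pure helper `parse_yes_no` (not part of this diff) that mirrors the same parsing with the stdin read removed:

```python
def parse_yes_no(answer: str, default: bool = True) -> bool:
    """Hypothetical helper mirroring prompt_yes_no's parsing, minus the I/O."""
    answer = answer.strip()
    if not answer:
        # Empty input (plain Enter, or the "" that prompt() returns on
        # Ctrl-C/EOF) falls back to the default.
        return default
    # Anything starting with "y" or "Y" counts as yes; everything else is no.
    return answer.lower().startswith("y")
```

Because an interrupted prompt yields an empty string, a caller that passes `default=True` treats Ctrl-C the same as pressing Enter; callers that need a safe cancel may prefer `default=False`.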
@@ -32,7 +32,6 @@ _ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
 _EXTRA_ENV_KEYS = frozenset({
    "OPENAI_API_KEY", "OPENAI_BASE_URL",
    "ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN",
-    "AUXILIARY_VISION_MODEL",
    "DISCORD_HOME_CHANNEL", "TELEGRAM_HOME_CHANNEL",
    "SIGNAL_ACCOUNT", "SIGNAL_HTTP_URL",
    "SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
@@ -381,7 +380,7 @@ DEFAULT_CONFIG = {
            "model": "",           # e.g. "google/gemini-2.5-flash", "gpt-4o"
            "base_url": "",        # direct OpenAI-compatible endpoint (takes precedence over provider)
            "api_key": "",         # API key for base_url (falls back to OPENAI_API_KEY)
-            "timeout": 30,         # seconds — LLM API call timeout; increase for slow local vision models
+            "timeout": 120,        # seconds — LLM API call timeout; vision payloads need generous timeout
            "download_timeout": 30,  # seconds — image HTTP download timeout; increase for slow connections
        },
        "web_extract": {
@@ -868,6 +867,21 @@ OPTIONAL_ENV_VARS = {
        "category": "provider",
        "advanced": True,
    },
+    "XIAOMI_API_KEY": {
+        "description": "Xiaomi MiMo API key for MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash)",
+        "prompt": "Xiaomi MiMo API Key",
+        "url": "https://platform.xiaomimimo.com",
+        "password": True,
+        "category": "provider",
+    },
+    "XIAOMI_BASE_URL": {
+        "description": "Xiaomi MiMo base URL override (default: https://api.xiaomimimo.com/v1)",
+        "prompt": "Xiaomi base URL (leave empty for default)",
+        "url": None,
+        "password": False,
+        "category": "provider",
+        "advanced": True,
+    },

    # ── Tool API keys ──
    "EXA_API_KEY": {
@@ -2568,7 +2582,8 @@ def show_config():
    for env_key, name in keys:
        value = get_env_value(env_key)
        print(f"  {name:<14} {redact_key(value)}")
-    anthropic_value = get_env_value("ANTHROPIC_TOKEN") or get_env_value("ANTHROPIC_API_KEY")
+    from hermes_cli.auth import get_anthropic_key
+    anthropic_value = get_anthropic_key()
    print(f"  {'Anthropic':<14} {redact_key(anthropic_value)}")
    
    # Model settings
@@ -2784,8 +2799,8 @@ def set_config_value(key: str, value: str):
    
    # Write only user config back (not the full merged defaults)
    ensure_hermes_home()
-    with open(config_path, 'w', encoding="utf-8") as f:
-        yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
+    from utils import atomic_yaml_write
+    atomic_yaml_write(config_path, user_config, sort_keys=False)
    
    # Keep .env in sync for keys that terminal_tool reads directly from env vars.
    # config.yaml is authoritative, but terminal_tool only reads TERMINAL_ENV etc.
@@ -51,6 +51,7 @@ _PROVIDER_ENV_HINTS = (
    "AI_GATEWAY_API_KEY",
    "OPENCODE_ZEN_API_KEY",
    "OPENCODE_GO_API_KEY",
+    "XIAOMI_API_KEY",
 )


@@ -335,8 +336,8 @@ def run_doctor(args):
                            model_section[k] = raw_config.pop(k)
                        else:
                            raw_config.pop(k)
-                    with open(config_path, "w") as f:
-                        yaml.dump(raw_config, f, default_flow_style=False)
+                    from utils import atomic_yaml_write
+                    atomic_yaml_write(config_path, raw_config)
                    check_ok("Migrated stale root-level keys into model section")
                    fixed_count += 1
                else:
@@ -685,7 +686,8 @@ def run_doctor(args):
    else:
        check_warn("OpenRouter API", "(not configured)")
    
-    anthropic_key = os.getenv("ANTHROPIC_TOKEN") or os.getenv("ANTHROPIC_API_KEY")
+    from hermes_cli.auth import get_anthropic_key
+    anthropic_key = get_anthropic_key()
    if anthropic_key:
        print("  Checking Anthropic API...", end="", flush=True)
        try:
@@ -934,6 +934,7 @@ def select_provider_and_model(args=None):
        "kilocode": "Kilo Code",
        "alibaba": "Alibaba Cloud (DashScope)",
        "huggingface": "Hugging Face",
+        "xiaomi": "Xiaomi MiMo",
        "custom": "Custom endpoint",
    }
    active_label = provider_labels.get(active, active) if active else "none"
@@ -966,6 +967,7 @@ def select_provider_and_model(args=None):
        ("opencode-go", "OpenCode Go (open models, $10/month subscription)"),
        ("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
        ("alibaba", "Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
+        ("xiaomi", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
    ]

    def _named_custom_provider_map(cfg) -> dict[str, dict[str, str]]:
@@ -1077,7 +1079,7 @@ def select_provider_and_model(args=None):
        _model_flow_anthropic(config, current_model)
    elif selected_provider == "kimi-coding":
        _model_flow_kimi(config, current_model)
-    elif selected_provider in ("gemini", "zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface"):
+    elif selected_provider in ("gemini", "zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface", "xiaomi"):
        _model_flow_api_key_provider(config, selected_provider, current_model)

    # ── Post-switch cleanup: clear stale OPENAI_BASE_URL ──────────────
@@ -2547,13 +2549,8 @@ def _model_flow_anthropic(config, current_model=""):
    from hermes_cli.models import _PROVIDER_MODELS

    # Check ALL credential sources
-    existing_key = (
-        get_env_value("ANTHROPIC_TOKEN")
-        or os.getenv("ANTHROPIC_TOKEN", "")
-        or get_env_value("ANTHROPIC_API_KEY")
-        or os.getenv("ANTHROPIC_API_KEY", "")
-        or os.getenv("CLAUDE_CODE_OAUTH_TOKEN", "")
-    )
+    from hermes_cli.auth import get_anthropic_key
+    existing_key = get_anthropic_key()
    cc_available = False
    try:
        from agent.anthropic_adapter import read_claude_code_credentials, is_claude_code_token_valid
@@ -4357,7 +4354,7 @@ For more help on a command:
    )
    chat_parser.add_argument(
        "--provider",
-        choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "gemini", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
+        choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "gemini", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "xiaomi"],
        default=None,
        help="Inference provider (default: auto)"
    )
@@ -5411,7 +5408,8 @@ For more help on a command:
    claw_migrate = claw_subparsers.add_parser(
        "migrate",
        help="Migrate from OpenClaw to Hermes",
-        description="Import settings, memories, skills, and API keys from an OpenClaw installation"
+        description="Import settings, memories, skills, and API keys from an OpenClaw installation. "
+                    "Always shows a preview before making changes."
    )
    claw_migrate.add_argument(
        "--source",
@@ -5420,7 +5418,7 @@ For more help on a command:
    claw_migrate.add_argument(
        "--dry-run",
        action="store_true",
-        help="Preview what would be migrated without making changes"
+        help="Preview only — stop after showing what would be migrated"
    )
    claw_migrate.add_argument(
        "--preset",
@@ -57,19 +57,8 @@ def _confirm(question: str, default: bool = True) -> bool:


 def _prompt(question: str, *, password: bool = False, default: str = "") -> str:
-    display = f"  {question}"
-    if default:
-        display += f" [{default}]"
-    display += ": "
-    try:
-        if password:
-            value = getpass.getpass(color(display, Colors.YELLOW))
-        else:
-            value = input(color(display, Colors.YELLOW))
-        return value.strip() or default
-    except (KeyboardInterrupt, EOFError):
-        print()
-        return default
+    from hermes_cli.cli_output import prompt as _shared_prompt
+    return _shared_prompt(question, default=default, password=password)


 # ─── Config Helpers ───────────────────────────────────────────────────────────
@@ -25,85 +25,13 @@ def _curses_select(title: str, items: list[tuple[str, str]], default: int = 0) -
    items: list of (label, description) tuples.
    Returns selected index, or default on escape/quit.
    """
-    try:
-        import curses
-        result = [default]
-
-        def _menu(stdscr):
-            curses.curs_set(0)
-            if curses.has_colors():
-                curses.start_color()
-                curses.use_default_colors()
-                curses.init_pair(1, curses.COLOR_GREEN, -1)
-                curses.init_pair(2, curses.COLOR_YELLOW, -1)
-                curses.init_pair(3, curses.COLOR_CYAN, -1)
-            cursor = default
-
-            while True:
-                stdscr.clear()
-                max_y, max_x = stdscr.getmaxyx()
-
-                # Title
-                try:
-                    stdscr.addnstr(0, 0, title, max_x - 1,
-                                   curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0))
-                    stdscr.addnstr(1, 0, "  ↑↓ navigate  ⏎ select  q quit", max_x - 1,
-                                   curses.color_pair(3) if curses.has_colors() else curses.A_DIM)
-                except curses.error:
-                    pass
-
-                for i, (label, desc) in enumerate(items):
-                    y = i + 3
-                    if y >= max_y - 1:
-                        break
-                    arrow = "→" if i == cursor else " "
-                    line = f" {arrow}  {label}"
-                    if desc:
-                        line += f"  {desc}"
-
-                    attr = curses.A_NORMAL
-                    if i == cursor:
-                        attr = curses.A_BOLD
-                        if curses.has_colors():
-                            attr |= curses.color_pair(1)
-                    try:
-                        stdscr.addnstr(y, 0, line[:max_x - 1], max_x - 1, attr)
-                    except curses.error:
-                        pass
-
-                stdscr.refresh()
-                key = stdscr.getch()
-
-                if key in (curses.KEY_UP, ord('k')):
-                    cursor = (cursor - 1) % len(items)
-                elif key in (curses.KEY_DOWN, ord('j')):
-                    cursor = (cursor + 1) % len(items)
-                elif key in (curses.KEY_ENTER, 10, 13):
-                    result[0] = cursor
-                    return
-                elif key in (27, ord('q')):
-                    return
-
-        curses.wrapper(_menu)
-        return result[0]
-
-    except Exception:
-        # Fallback: numbered input
-        print(f"\n  {title}\n")
-        for i, (label, desc) in enumerate(items):
-            marker = "→" if i == default else " "
-            d = f"  {desc}" if desc else ""
-            print(f"  {marker} {i + 1}. {label}{d}")
-        while True:
-            try:
-                val = input(f"\n  Select [1-{len(items)}] ({default + 1}): ")
-                if not val:
-                    return default
-                idx = int(val) - 1
-                if 0 <= idx < len(items):
-                    return idx
-            except (ValueError, EOFError):
-                return default
+    from hermes_cli.curses_ui import curses_radiolist
+    # Format (label, desc) tuples into display strings
+    display_items = [
+        f"{label}  {desc}" if desc else label
+        for label, desc in items
+    ]
+    return curses_radiolist(title, display_items, selected=default, cancel_returns=default)


 def _prompt(label: str, default: str | None = None, secret: bool = False) -> str:
@@ -92,6 +92,7 @@ _MATCHING_PREFIX_STRIP_PROVIDERS: frozenset[str] = frozenset({
    "minimax-cn",
    "alibaba",
    "qwen-oauth",
+    "xiaomi",
    "custom",
 })

@@ -56,6 +56,18 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [

 _openrouter_catalog_cache: list[tuple[str, str]] | None = None

+
+def _codex_curated_models() -> list[str]:
+    """Derive the openai-codex curated list from codex_models.py.
+
+    Single source of truth: DEFAULT_CODEX_MODELS + forward-compat synthesis.
+    This keeps the gateway /model picker in sync with the CLI `hermes model`
+    flow without maintaining a separate static list.
+    """
+    from hermes_cli.codex_models import DEFAULT_CODEX_MODELS, _add_forward_compat_models
+    return _add_forward_compat_models(list(DEFAULT_CODEX_MODELS))
+
+
 _PROVIDER_MODELS: dict[str, list[str]] = {
    "nous": [
        "anthropic/claude-opus-4.6",
@@ -86,14 +98,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "openai/gpt-5.4-pro",
        "openai/gpt-5.4-nano",
    ],
-    "openai-codex": [
-        "gpt-5.4",
-        "gpt-5.4-mini",
-        "gpt-5.3-codex",
-        "gpt-5.2-codex",
-        "gpt-5.1-codex-mini",
-        "gpt-5.1-codex-max",
-    ],
+    "openai-codex": _codex_curated_models(),
    "copilot-acp": [
        "copilot-acp",
    ],
@@ -183,6 +188,11 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "deepseek-chat",
        "deepseek-reasoner",
    ],
+    "xiaomi": [
+        "mimo-v2-pro",
+        "mimo-v2-omni",
+        "mimo-v2-flash",
+    ],
    "opencode-zen": [
        "gpt-5.4-pro",
        "gpt-5.4",
@@ -488,6 +498,7 @@ _PROVIDER_LABELS = {
    "alibaba": "Alibaba Cloud (DashScope)",
    "qwen-oauth": "Qwen OAuth (Portal)",
    "huggingface": "Hugging Face",
+    "xiaomi": "Xiaomi MiMo",
    "custom": "Custom endpoint",
 }

@@ -530,6 +541,8 @@ _PROVIDER_ALIASES = {
    "hf": "huggingface",
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",
+    "mimo": "xiaomi",
+    "xiaomi-mimo": "xiaomi",
 }


@@ -814,7 +827,7 @@ def list_available_providers() -> list[dict[str, str]]:
        "openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
        "gemini", "huggingface",
        "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
-        "qwen-oauth",
+        "qwen-oauth", "xiaomi",
        "opencode-zen", "opencode-go",
        "ai-gateway", "deepseek", "custom",
    ]
@@ -0,0 +1,45 @@
+"""
+Shared platform registry for Hermes Agent.
+
+Single source of truth for platform metadata consumed by both
+skills_config (label display) and tools_config (default toolset
+resolution).  Import ``PLATFORMS`` from here instead of maintaining
+duplicate dicts in each module.
+"""
+
+from collections import OrderedDict
+from typing import NamedTuple
+
+
+class PlatformInfo(NamedTuple):
+    """Metadata for a single platform entry."""
+    label: str
+    default_toolset: str
+
+
+# Ordered so that TUI menus are deterministic.
+PLATFORMS: OrderedDict[str, PlatformInfo] = OrderedDict([
+    ("cli",            PlatformInfo(label="🖥️  CLI",            default_toolset="hermes-cli")),
+    ("telegram",       PlatformInfo(label="📱 Telegram",        default_toolset="hermes-telegram")),
+    ("discord",        PlatformInfo(label="💬 Discord",         default_toolset="hermes-discord")),
+    ("slack",          PlatformInfo(label="💼 Slack",           default_toolset="hermes-slack")),
+    ("whatsapp",       PlatformInfo(label="📱 WhatsApp",        default_toolset="hermes-whatsapp")),
+    ("signal",         PlatformInfo(label="📡 Signal",          default_toolset="hermes-signal")),
+    ("bluebubbles",    PlatformInfo(label="💙 BlueBubbles",     default_toolset="hermes-bluebubbles")),
+    ("email",          PlatformInfo(label="📧 Email",           default_toolset="hermes-email")),
+    ("homeassistant",  PlatformInfo(label="🏠 Home Assistant",  default_toolset="hermes-homeassistant")),
+    ("mattermost",     PlatformInfo(label="💬 Mattermost",      default_toolset="hermes-mattermost")),
+    ("matrix",         PlatformInfo(label="💬 Matrix",          default_toolset="hermes-matrix")),
+    ("dingtalk",       PlatformInfo(label="💬 DingTalk",        default_toolset="hermes-dingtalk")),
+    ("feishu",         PlatformInfo(label="🪽 Feishu",          default_toolset="hermes-feishu")),
+    ("wecom",          PlatformInfo(label="💬 WeCom",           default_toolset="hermes-wecom")),
+    ("weixin",         PlatformInfo(label="💬 Weixin",          default_toolset="hermes-weixin")),
+    ("webhook",        PlatformInfo(label="🔗 Webhook",         default_toolset="hermes-webhook")),
+    ("api_server",     PlatformInfo(label="🌐 API Server",      default_toolset="hermes-api-server")),
+])
+
+
+def platform_label(key: str, default: str = "") -> str:
+    """Return the display label for a platform key, or *default*."""
+    info = PLATFORMS.get(key)
+    return info.label if info is not None else default
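The registry above is meant to feed the per-module `PLATFORMS` views in skills_config and tools_config. The following is a minimal sketch of that derivation, assuming a two-entry stand-in registry with plain-text labels (illustrative only, not the real data):

```python
from collections import OrderedDict
from typing import NamedTuple


class PlatformInfo(NamedTuple):
    """Same shape as the registry entry in the diff."""
    label: str
    default_toolset: str


# Two-entry stand-in for the full registry (labels here are illustrative).
PLATFORMS = OrderedDict([
    ("cli", PlatformInfo(label="CLI", default_toolset="hermes-cli")),
    ("api_server", PlatformInfo(label="API Server", default_toolset="hermes-api-server")),
])

# Label-only view in the style of skills_config, which omits api_server.
labels = {k: info.label for k, info in PLATFORMS.items() if k != "api_server"}

# Dict-of-dicts view in the style of tools_config, keeping every entry.
as_dicts = {
    k: {"label": info.label, "default_toolset": info.default_toolset}
    for k, info in PLATFORMS.items()
}
```

Both views preserve the registry's insertion order, so menus built from either one list platforms in the same sequence.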
@@ -132,6 +132,10 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
        base_url_override="https://api.x.ai/v1",
        base_url_env_var="XAI_BASE_URL",
    ),
+    "xiaomi": HermesOverlay(
+        transport="openai_chat",
+        base_url_env_var="XIAOMI_BASE_URL",
+    ),
 }


@@ -222,6 +226,10 @@ ALIASES: Dict[str, str] = {
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",

+    # xiaomi
+    "mimo": "xiaomi",
+    "xiaomi-mimo": "xiaomi",
+
    # Local server aliases → virtual "local" concept (resolved via user config)
    "lmstudio": "lmstudio",
    "lm-studio": "lmstudio",
@@ -242,6 +250,7 @@ _LABEL_OVERRIDES: Dict[str, str] = {
    "nous": "Nous Portal",
    "openai-codex": "OpenAI Codex",
    "copilot-acp": "GitHub Copilot ACP",
+    "xiaomi": "Xiaomi MiMo",
    "local": "Local endpoint",
 }

@@ -197,24 +197,12 @@ def print_header(title: str):
    print(color(f"◆ {title}", Colors.CYAN, Colors.BOLD))


-def print_info(text: str):
-    """Print info text."""
-    print(color(f"  {text}", Colors.DIM))
-
-
-def print_success(text: str):
-    """Print success message."""
-    print(color(f"✓ {text}", Colors.GREEN))
-
-
-def print_warning(text: str):
-    """Print warning message."""
-    print(color(f"⚠ {text}", Colors.YELLOW))
-
-
-def print_error(text: str):
-    """Print error message."""
-    print(color(f"✗ {text}", Colors.RED))
+from hermes_cli.cli_output import (  # noqa: E402
+    print_error,
+    print_info,
+    print_success,
+    print_warning,
+)


 def is_interactive_stdin() -> bool:
@@ -269,80 +257,9 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:


 def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int:
-    """Single-select menu using curses to avoid simple_term_menu rendering bugs."""
-    try:
-        import curses
-        result_holder = [default]
-
-        def _curses_menu(stdscr):
-            curses.curs_set(0)
-            if curses.has_colors():
-                curses.start_color()
-                curses.use_default_colors()
-                curses.init_pair(1, curses.COLOR_GREEN, -1)
-                curses.init_pair(2, curses.COLOR_YELLOW, -1)
-            cursor = default
-            scroll_offset = 0
-
-            while True:
-                stdscr.clear()
-                max_y, max_x = stdscr.getmaxyx()
-
-                # Rows available for list items: rows 2..(max_y-2) inclusive.
-                visible = max(1, max_y - 3)
-
-                # Scroll the viewport so the cursor is always visible.
-                if cursor < scroll_offset:
-                    scroll_offset = cursor
-                elif cursor >= scroll_offset + visible:
-                    scroll_offset = cursor - visible + 1
-                scroll_offset = max(0, min(scroll_offset, max(0, len(choices) - visible)))
-
-                try:
-                    stdscr.addnstr(
-                        0,
-                        0,
-                        question,
-                        max_x - 1,
-                        curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0),
-                    )
-                except curses.error:
-                    pass
-
-                for row, i in enumerate(range(scroll_offset, min(scroll_offset + visible, len(choices)))):
-                    y = row + 2
-                    if y >= max_y - 1:
-                        break
-                    arrow = "→" if i == cursor else " "
-                    line = f" {arrow}  {choices[i]}"
-                    attr = curses.A_NORMAL
-                    if i == cursor:
-                        attr = curses.A_BOLD
-                        if curses.has_colors():
-                            attr |= curses.color_pair(1)
-                    try:
-                        stdscr.addnstr(y, 0, line, max_x - 1, attr)
-                    except curses.error:
-                        pass
-
-                stdscr.refresh()
-                key = stdscr.getch()
-                if key in (curses.KEY_UP, ord("k")):
-                    cursor = (cursor - 1) % len(choices)
-                elif key in (curses.KEY_DOWN, ord("j")):
-                    cursor = (cursor + 1) % len(choices)
-                elif key in (curses.KEY_ENTER, 10, 13):
-                    result_holder[0] = cursor
-                    return
-                elif key in (27, ord("q")):
-                    return
-
-        curses.wrapper(_curses_menu)
-        from hermes_cli.curses_ui import flush_stdin
-        flush_stdin()
-        return result_holder[0]
-    except Exception:
-        return -1
+    """Single-select menu using curses. Delegates to curses_radiolist."""
+    from hermes_cli.curses_ui import curses_radiolist
+    return curses_radiolist(question, choices, selected=default, cancel_returns=-1)



@@ -15,25 +15,12 @@ from typing import List, Optional, Set

 from hermes_cli.config import load_config, save_config
 from hermes_cli.colors import Colors, color
+from hermes_cli.platforms import PLATFORMS as _PLATFORMS, platform_label

-PLATFORMS = {
-    "cli":      "🖥️  CLI",
-    "telegram": "📱 Telegram",
-    "discord":  "💬 Discord",
-    "slack":    "💼 Slack",
-    "whatsapp": "📱 WhatsApp",
-    "signal":   "📡 Signal",
-    "bluebubbles": "💬 BlueBubbles",
-    "email":    "📧 Email",
-    "homeassistant": "🏠 Home Assistant",
-    "mattermost": "💬 Mattermost",
-    "matrix":   "💬 Matrix",
-    "dingtalk": "💬 DingTalk",
-    "feishu": "🪽 Feishu",
-    "wecom": "💬 WeCom",
-    "weixin": "💬 Weixin",
-    "webhook": "🔗 Webhook",
-}
+# Backward-compatible view: {key: label_string} so existing code that
+# iterates ``PLATFORMS.items()`` or calls ``PLATFORMS.get(key)`` keeps
+# working without changes to every call site.
+PLATFORMS = {k: info.label for k, info in _PLATFORMS.items() if k != "api_server"}

 # ─── Config Helpers ───────────────────────────────────────────────────────────

@@ -141,11 +141,8 @@ def show_status(args):
        display = redact_key(value) if not show_all else value
        print(f"  {name:<12}  {check_mark(has_key)} {display}")

-    anthropic_value = (
-        get_env_value("ANTHROPIC_TOKEN")
-        or get_env_value("ANTHROPIC_API_KEY")
-        or ""
-    )
+    from hermes_cli.auth import get_anthropic_key
+    anthropic_value = get_anthropic_key()
    anthropic_display = redact_key(anthropic_value) if not show_all else anthropic_value
    print(f"  {'Anthropic':<12}  {check_mark(bool(anthropic_value))} {anthropic_display}")

@@ -33,33 +33,13 @@ PROJECT_ROOT = Path(__file__).parent.parent.resolve()

 # ─── UI Helpers (shared with setup.py) ────────────────────────────────────────

-def _print_info(text: str):
-    print(color(f"  {text}", Colors.DIM))
-
-def _print_success(text: str):
-    print(color(f"✓ {text}", Colors.GREEN))
-
-def _print_warning(text: str):
-    print(color(f"⚠ {text}", Colors.YELLOW))
-
-def _print_error(text: str):
-    print(color(f"✗ {text}", Colors.RED))
-
-def _prompt(question: str, default: str = None, password: bool = False) -> str:
-    if default:
-        display = f"{question} [{default}]: "
-    else:
-        display = f"{question}: "
-    try:
-        if password:
-            import getpass
-            value = getpass.getpass(color(display, Colors.YELLOW))
-        else:
-            value = input(color(display, Colors.YELLOW))
-        return value.strip() or default or ""
-    except (KeyboardInterrupt, EOFError):
-        print()
-        return default or ""
+from hermes_cli.cli_output import (  # noqa: E402 — late import block
+    print_error as _print_error,
+    print_info as _print_info,
+    print_success as _print_success,
+    print_warning as _print_warning,
+    prompt as _prompt,
+)

 # ─── Toolset Registry ─────────────────────────────────────────────────────────

@@ -118,25 +98,14 @@ def _get_plugin_toolset_keys() -> set:
    except Exception:
        return set()

-# Platform display config
+# Platform display config — derived from the canonical registry so every
+# module shares the same data.  Kept as dict-of-dicts for backward
+# compatibility with existing ``PLATFORMS[key]["label"]`` access patterns.
+from hermes_cli.platforms import PLATFORMS as _PLATFORMS_REGISTRY
+
 PLATFORMS = {
-    "cli":      {"label": "🖥️  CLI",       "default_toolset": "hermes-cli"},
-    "telegram": {"label": "📱 Telegram",   "default_toolset": "hermes-telegram"},
-    "discord":  {"label": "💬 Discord",    "default_toolset": "hermes-discord"},
-    "slack":    {"label": "💼 Slack",      "default_toolset": "hermes-slack"},
-    "whatsapp": {"label": "📱 WhatsApp",   "default_toolset": "hermes-whatsapp"},
-    "signal":   {"label": "📡 Signal",     "default_toolset": "hermes-signal"},
-    "bluebubbles": {"label": "💙 BlueBubbles", "default_toolset": "hermes-bluebubbles"},
-    "homeassistant": {"label": "🏠 Home Assistant", "default_toolset": "hermes-homeassistant"},
-    "email":    {"label": "📧 Email",      "default_toolset": "hermes-email"},
-    "matrix":   {"label": "💬 Matrix",     "default_toolset": "hermes-matrix"},
- "dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
-    "feishu": {"label": "🪽 Feishu", "default_toolset": "hermes-feishu"},
-    "wecom": {"label": "💬 WeCom", "default_toolset": "hermes-wecom"},
-    "weixin": {"label": "💬 Weixin", "default_toolset": "hermes-weixin"},
-    "api_server": {"label": "🌐 API Server", "default_toolset": "hermes-api-server"},
-    "mattermost": {"label": "💬 Mattermost", "default_toolset": "hermes-mattermost"},
-    "webhook": {"label": "🔗 Webhook", "default_toolset": "hermes-webhook"},
+    k: {"label": info.label, "default_toolset": info.default_toolset}
+    for k, info in _PLATFORMS_REGISTRY.items()
 }


@@ -677,86 +646,9 @@ def _toolset_has_keys(ts_key: str, config: dict = None) -> bool:
 # ─── Menu Helpers ─────────────────────────────────────────────────────────────

 def _prompt_choice(question: str, choices: list, default: int = 0) -> int:
-    """Single-select menu (arrow keys). Uses curses to avoid simple_term_menu
-    rendering bugs in tmux, iTerm, and other non-standard terminals."""
-
-    # Curses-based single-select — works in tmux, iTerm, and standard terminals
-    try:
-        import curses
-        result_holder = [default]
-
-        def _curses_menu(stdscr):
-            curses.curs_set(0)
-            if curses.has_colors():
-                curses.start_color()
-                curses.use_default_colors()
-                curses.init_pair(1, curses.COLOR_GREEN, -1)
-                curses.init_pair(2, curses.COLOR_YELLOW, -1)
-            cursor = default
-
-            while True:
-                stdscr.clear()
-                max_y, max_x = stdscr.getmaxyx()
-                try:
-                    stdscr.addnstr(0, 0, question, max_x - 1,
-                                   curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0))
-                except curses.error:
-                    pass
-
-                for i, c in enumerate(choices):
-                    y = i + 2
-                    if y >= max_y - 1:
-                        break
-                    arrow = "→" if i == cursor else " "
-                    line = f" {arrow}  {c}"
-                    attr = curses.A_NORMAL
-                    if i == cursor:
-                        attr = curses.A_BOLD
-                        if curses.has_colors():
-                            attr |= curses.color_pair(1)
-                    try:
-                        stdscr.addnstr(y, 0, line, max_x - 1, attr)
-                    except curses.error:
-                        pass
-
-                stdscr.refresh()
-                key = stdscr.getch()
-
-                if key in (curses.KEY_UP, ord('k')):
-                    cursor = (cursor - 1) % len(choices)
-                elif key in (curses.KEY_DOWN, ord('j')):
-                    cursor = (cursor + 1) % len(choices)
-                elif key in (curses.KEY_ENTER, 10, 13):
-                    result_holder[0] = cursor
-                    return
-                elif key in (27, ord('q')):
-                    return
-
-        curses.wrapper(_curses_menu)
-        from hermes_cli.curses_ui import flush_stdin
-        flush_stdin()
-        return result_holder[0]
-
-    except Exception:
-        pass
-
-    # Fallback: numbered input (Windows without curses, etc.)
-    print(color(question, Colors.YELLOW))
-    for i, c in enumerate(choices):
-        marker = "●" if i == default else "○"
-        style = Colors.GREEN if i == default else ""
-        print(color(f"  {marker} {i+1}. {c}", style) if style else f"  {marker} {i+1}. {c}")
-    while True:
-        try:
-            val = input(color(f"  Select [1-{len(choices)}] ({default + 1}): ", Colors.DIM))
-            if not val:
-                return default
-            idx = int(val) - 1
-            if 0 <= idx < len(choices):
-                return idx
-        except (ValueError, KeyboardInterrupt, EOFError):
-            print()
-            return default
+    """Single-select menu (arrow keys). Delegates to curses_radiolist."""
+    from hermes_cli.curses_ui import curses_radiolist
+    return curses_radiolist(question, choices, selected=default, cancel_returns=default)


 # ─── Token Estimation ────────────────────────────────────────────────────────
@@ -189,6 +189,33 @@ def is_wsl() -> bool:
    return _wsl_detected


+# ─── Well-Known Paths ─────────────────────────────────────────────────────────
+
+
+def get_config_path() -> Path:
+    """Return the path to ``config.yaml`` under HERMES_HOME.
+
+    Replaces the ``get_hermes_home() / "config.yaml"`` pattern repeated
+    in 7+ files (skill_utils.py, hermes_logging.py, hermes_time.py, etc.).
+    """
+    return get_hermes_home() / "config.yaml"
+
+
+def get_skills_dir() -> Path:
+    """Return the path to the skills directory under HERMES_HOME."""
+    return get_hermes_home() / "skills"
+
+
+def get_logs_dir() -> Path:
+    """Return the path to the logs directory under HERMES_HOME."""
+    return get_hermes_home() / "logs"
+
+
+def get_env_path() -> Path:
+    """Return the path to the ``.env`` file under HERMES_HOME."""
+    return get_hermes_home() / ".env"
+
+
 OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
 OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"

@@ -18,7 +18,7 @@ from logging.handlers import RotatingFileHandler
 from pathlib import Path
 from typing import Optional

-from hermes_constants import get_hermes_home
+from hermes_constants import get_config_path, get_hermes_home

 # Sentinel to track whether setup_logging() has already run.  The function
 # is idempotent — calling it twice is safe but the second call is a no-op
@@ -246,7 +246,7 @@ def _read_logging_config():
    """
    try:
        import yaml
-        config_path = get_hermes_home() / "config.yaml"
+        config_path = get_config_path()
        if config_path.exists():
            with open(config_path, "r", encoding="utf-8") as f:
                cfg = yaml.safe_load(f) or {}
@@ -16,7 +16,7 @@ crashes due to a bad timezone string.
 import logging
 import os
 from datetime import datetime
-from hermes_constants import get_hermes_home
+from hermes_constants import get_config_path
 from typing import Optional

 logger = logging.getLogger(__name__)
@@ -48,8 +48,7 @@ def _resolve_timezone_name() -> str:
    # 2. config.yaml ``timezone`` key
    try:
        import yaml
-        hermes_home = get_hermes_home()
-        config_path = hermes_home / "config.yaml"
+        config_path = get_config_path()
        if config_path.exists():
            with open(config_path) as f:
                cfg = yaml.safe_load(f) or {}
@@ -617,6 +617,19 @@ class Migrator:
            candidate = self.source_root / rel
            if candidate.exists():
                return candidate
+            # OpenClaw renamed workspace/ to workspace-main/ (and workspace-{agentId}
+            # for multi-agent).  Try the new path as a fallback.
+            if rel.startswith("workspace/"):
+                suffix = rel[len("workspace/"):]
+                for variant in ("workspace-main", "workspace-assistant"):
+                    alt = self.source_root / variant / suffix
+                    if alt.exists():
+                        return alt
+            elif rel.startswith("workspace.default/"):
+                suffix = rel[len("workspace.default/"):]
+                alt = self.source_root / "workspace-main" / suffix
+                if alt.exists():
+                    return alt
        return None
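The workspace fallback added in this hunk can be exercised in isolation. The following is a minimal sketch, assuming a single relative path rather than the candidate loop used by the real method; the function name `resolve_with_fallback` and the file name `notes.md` are hypothetical and exist only for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_with_fallback(source_root: Path, rel: str) -> Optional[Path]:
    """Try the literal path first, then the renamed workspace directories."""
    candidate = source_root / rel
    if candidate.exists():
        return candidate
    if rel.startswith("workspace/"):
        suffix = rel[len("workspace/"):]
        for variant in ("workspace-main", "workspace-assistant"):
            alt = source_root / variant / suffix
            if alt.exists():
                return alt
    elif rel.startswith("workspace.default/"):
        suffix = rel[len("workspace.default/"):]
        alt = source_root / "workspace-main" / suffix
        if alt.exists():
            return alt
    return None


# Hypothetical layout: only the renamed directory exists on disk.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "workspace-main").mkdir()
    (root / "workspace-main" / "notes.md").write_text("hello")
    found = resolve_with_fallback(root, "workspace/notes.md")
    found_name = found.relative_to(root).as_posix() if found else None
    missing = resolve_with_fallback(root, "workspace/absent.md")
```

A request for the old `workspace/` path resolves to the file under `workspace-main/`, while a path that exists under neither name still yields `None`.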

    def resolve_skill_destination(self, destination: Path) -> Path:
@@ -1033,11 +1046,8 @@ class Migrator:
    def migrate_secret_settings(self, config: Dict[str, Any]) -> None:
        secret_additions: Dict[str, str] = {}

-        telegram_token = (
-            config.get("channels", {})
-            .get("telegram", {})
-            .get("botToken")
-        )
+        tg_cfg = config.get("channels", {}).get("telegram", {})
+        telegram_token = self._get_channel_field(tg_cfg, "botToken") if isinstance(tg_cfg, dict) else None
        if isinstance(telegram_token, str) and telegram_token.strip():
            secret_additions["TELEGRAM_BOT_TOKEN"] = telegram_token.strip()

@@ -1057,15 +1067,28 @@ class Migrator:
        """Resolve a channel config value that may be a SecretRef."""
        return resolve_secret_input(value, self.load_openclaw_env())

+    @staticmethod
+    def _get_channel_field(ch_cfg: Dict[str, Any], field: str) -> Any:
+        """Get a field from channel config, checking both flat and accounts.default layout."""
+        val = ch_cfg.get(field)
+        if val is not None:
+            return val
+        accounts = ch_cfg.get("accounts")
+        if isinstance(accounts, dict):
+            default = accounts.get("default")
+            if isinstance(default, dict):
+                return default.get(field)
+        return None
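To make the two supported layouts concrete, here is a minimal standalone sketch of the same lookup order. The free function `get_channel_field` and the sample configs are hypothetical; only the order (flat key first, then `accounts.default`) is taken from the helper above.

```python
from typing import Any, Dict


def get_channel_field(ch_cfg: Dict[str, Any], field: str) -> Any:
    """Return the flat value if present, else the accounts.default value."""
    val = ch_cfg.get(field)
    if val is not None:
        return val
    accounts = ch_cfg.get("accounts")
    if isinstance(accounts, dict):
        default = accounts.get("default")
        if isinstance(default, dict):
            return default.get(field)
    return None


# Hypothetical sample configs, for illustration only.
flat = {"token": "flat-token"}
nested = {"accounts": {"default": {"token": "nested-token"}}}
both = {"token": "flat-token", "accounts": {"default": {"token": "nested-token"}}}
```

When both layouts carry the field, the flat value wins, so configs that already worked before this change keep resolving to the same token.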
+
    def migrate_discord_settings(self, config: Optional[Dict[str, Any]] = None) -> None:
        config = config or self.load_openclaw_config()
        additions: Dict[str, str] = {}
        discord = config.get("channels", {}).get("discord", {})
        if isinstance(discord, dict):
-            token = discord.get("token")
+            token = self._get_channel_field(discord, "token")
            if isinstance(token, str) and token.strip():
                additions["DISCORD_BOT_TOKEN"] = token.strip()
-            allow_from = discord.get("allowFrom", [])
+            allow_from = self._get_channel_field(discord, "allowFrom") or []
            if isinstance(allow_from, list):
                users = [str(u).strip() for u in allow_from if str(u).strip()]
                if users:
@@ -1080,13 +1103,13 @@ class Migrator:
        additions: Dict[str, str] = {}
        slack = config.get("channels", {}).get("slack", {})
        if isinstance(slack, dict):
-            bot_token = slack.get("botToken")
+            bot_token = self._get_channel_field(slack, "botToken")
            if isinstance(bot_token, str) and bot_token.strip():
                additions["SLACK_BOT_TOKEN"] = bot_token.strip()
-            app_token = slack.get("appToken")
+            app_token = self._get_channel_field(slack, "appToken")
            if isinstance(app_token, str) and app_token.strip():
                additions["SLACK_APP_TOKEN"] = app_token.strip()
-            allow_from = slack.get("allowFrom", [])
+            allow_from = self._get_channel_field(slack, "allowFrom") or []
            if isinstance(allow_from, list):
                users = [str(u).strip() for u in allow_from if str(u).strip()]
                if users:
@@ -1101,7 +1124,7 @@ class Migrator:
        additions: Dict[str, str] = {}
        whatsapp = config.get("channels", {}).get("whatsapp", {})
        if isinstance(whatsapp, dict):
-            allow_from = whatsapp.get("allowFrom", [])
+            allow_from = self._get_channel_field(whatsapp, "allowFrom") or []
            if isinstance(allow_from, list):
                users = [str(u).strip() for u in allow_from if str(u).strip()]
                if users:
@@ -1116,13 +1139,13 @@ class Migrator:
        additions: Dict[str, str] = {}
        signal = config.get("channels", {}).get("signal", {})
        if isinstance(signal, dict):
-            account = signal.get("account")
+            account = self._get_channel_field(signal, "account")
            if isinstance(account, str) and account.strip():
                additions["SIGNAL_ACCOUNT"] = account.strip()
-            http_url = signal.get("httpUrl")
+            http_url = self._get_channel_field(signal, "httpUrl")
            if isinstance(http_url, str) and http_url.strip():
                additions["SIGNAL_HTTP_URL"] = http_url.strip()
-            allow_from = signal.get("allowFrom", [])
+            allow_from = self._get_channel_field(signal, "allowFrom") or []
            if isinstance(allow_from, list):
                users = [str(u).strip() for u in allow_from if str(u).strip()]
                if users:
@@ -1161,6 +1184,16 @@ class Migrator:
                raw_key = provider_cfg.get("apiKey")
                api_key = resolve_secret_input(raw_key, openclaw_env)
                if not api_key:
+                    # Warn if a SecretRef with file/exec source was silently unresolvable
+                    if isinstance(raw_key, dict) and raw_key.get("source") in ("file", "exec"):
+                        self.record(
+                            "provider-keys",
+                            self.source_root / "openclaw.json",
+                            None,
+                            "skipped",
+                            f"Provider '{provider_name}' uses a {raw_key['source']}-backed SecretRef "
+                            f"that cannot be auto-migrated. Add this key manually via: hermes config set",
+                        )
                    continue

                base_url = provider_cfg.get("baseUrl", "")
@@ -1224,6 +1257,21 @@ class Migrator:
            if val and hermes_key not in secret_additions:
                secret_additions[hermes_key] = val

+        # Check the openclaw.json "env" sub-object — some OpenClaw setups
+        # store API keys here instead of in a separate .env file.
+        # Keys can be at env.<KEY> or env.vars.<KEY>.
+        json_env = config.get("env")
+        if isinstance(json_env, dict):
+            env_vars = json_env.get("vars")
+            sources = [json_env]
+            if isinstance(env_vars, dict):
+                sources.append(env_vars)
+            for src in sources:
+                for oc_key, hermes_key in env_key_mapping.items():
+                    val = src.get(oc_key)
+                    if isinstance(val, str) and val.strip() and hermes_key not in secret_additions:
+                        secret_additions[hermes_key] = val.strip()
+
        # Check per-agent auth-profiles.json for additional credentials
        auth_profiles_path = self.source_root / "agents" / "main" / "agent" / "auth-profiles.json"
        if auth_profiles_path.exists():
@@ -1324,8 +1372,9 @@ class Migrator:
        tts_data: Dict[str, Any] = {}

        provider = tts.get("provider")
-        if isinstance(provider, str) and provider in ("elevenlabs", "openai", "edge"):
-            tts_data["provider"] = provider
+        if isinstance(provider, str) and provider in ("elevenlabs", "openai", "edge", "microsoft"):
+            # OpenClaw renamed "edge" to "microsoft"; Hermes still uses "edge"
+            tts_data["provider"] = "edge" if provider == "microsoft" else provider

        # TTS provider settings live under messages.tts.providers.{provider}
        # in OpenClaw (not messages.tts.elevenlabs directly)
@@ -1374,9 +1423,9 @@ class Migrator:
                tts_data["openai"] = oai_settings

        edge_tts = (
-            (providers.get("edge") or {})
-            if isinstance(providers.get("edge"), dict) else
-            (tts.get("edge") or {})
+            (providers.get("edge") or providers.get("microsoft") or {})
+            if isinstance(providers.get("edge"), dict) or isinstance(providers.get("microsoft"), dict) else
+            (tts.get("edge") or tts.get("microsoft") or {})
        )
        if isinstance(edge_tts, dict):
            edge_voice = edge_tts.get("voice")
@@ -1890,11 +1939,11 @@ class Migrator:
        if defaults.get("thinkingDefault"):
            # Map OpenClaw thinking -> Hermes reasoning_effort
            thinking = defaults["thinkingDefault"]
-            if thinking in ("always", "high"):
+            if thinking in ("always", "high", "xhigh"):
                agent_cfg["reasoning_effort"] = "high"
-            elif thinking in ("auto", "medium"):
+            elif thinking in ("auto", "medium", "adaptive"):
                agent_cfg["reasoning_effort"] = "medium"
-            elif thinking in ("off", "low", "none"):
+            elif thinking in ("off", "low", "none", "minimal"):
                agent_cfg["reasoning_effort"] = "low"
            changes = True

@@ -2099,10 +2148,14 @@ class Migrator:
                                f"Provider '{prov_name}' already exists")
                    continue

-                api_type = prov_cfg.get("apiType") or prov_cfg.get("type") or "openai"
+                api_type = prov_cfg.get("apiType") or prov_cfg.get("api") or prov_cfg.get("type") or "openai"
                api_mode_map = {
                    "openai": "chat_completions",
+                    "openai-completions": "chat_completions",
+                    "openai-responses": "chat_completions",
                    "anthropic": "anthropic_messages",
+                    "anthropic-messages": "anthropic_messages",
+                    "google-generative-ai": "chat_completions",
                    "cohere": "chat_completions",
                }
                entry = {
@@ -2142,7 +2195,7 @@ class Migrator:

        # Extended channel token/allowlist mapping
        CHANNEL_ENV_MAP = {
-            "matrix": {"token": "MATRIX_ACCESS_TOKEN", "allowFrom": "MATRIX_ALLOWED_USERS",
+            "matrix": {"token": "MATRIX...OKEN", "tokenField": "accessToken", "allowFrom": "MATRIX_ALLOWED_USERS",
                        "extras": {"homeserverUrl": "MATRIX_HOMESERVER_URL", "userId": "MATRIX_USER_ID"}},
            "mattermost": {"token": "MATTERMOST_BOT_TOKEN", "allowFrom": "MATTERMOST_ALLOWED_USERS",
                           "extras": {"url": "MATTERMOST_URL", "teamId": "MATTERMOST_TEAM_ID"}},
@@ -2160,19 +2213,21 @@ class Migrator:
            if not ch_cfg:
                continue

-            # Extract tokens
-            if ch_mapping.get("token") and ch_cfg.get("botToken") and self.migrate_secrets:
-                self._set_env_var(ch_mapping["token"], ch_cfg["botToken"],
-                                  f"channels.{ch_name}.botToken")
-            if ch_mapping.get("allowFrom") and ch_cfg.get("allowFrom"):
-                allow_val = ch_cfg["allowFrom"]
+            # Extract tokens (check flat path, then accounts.default)
+            token_field = ch_mapping.get("tokenField", "botToken")
+            bot_token = self._get_channel_field(ch_cfg, token_field)
+            if ch_mapping.get("token") and bot_token and self.migrate_secrets:
+                self._set_env_var(ch_mapping["token"], str(bot_token),
+                                  f"channels.{ch_name}.{token_field}")
+            allow_val = self._get_channel_field(ch_cfg, "allowFrom")
+            if ch_mapping.get("allowFrom") and allow_val:
                if isinstance(allow_val, list):
                    allow_val = ",".join(str(x) for x in allow_val)
                self._set_env_var(ch_mapping["allowFrom"], str(allow_val),
                                  f"channels.{ch_name}.allowFrom")
            # Extra fields
            for oc_key, env_key in (ch_mapping.get("extras") or {}).items():
-                val = ch_cfg.get(oc_key)
+                val = self._get_channel_field(ch_cfg, oc_key)
                if val:
                    if isinstance(val, list):
                        val = ",".join(str(x) for x in val)
@@ -2495,6 +2550,33 @@ class Migrator:
        elif has_cron_store_archive:
            notes.append("- Run `hermes cron` to recreate scheduled tasks (see archived cron-store)")

+        # Check if skills were imported
+        has_skills = any(i.kind == "skills" and i.status == "migrated" for i in self.items)
+        if has_skills:
+            notes.extend([
+                "",
+                "## Imported Skills",
+                "",
+                "Imported skills require a new session to take effect. After migration,",
+                "restart your agent or start a new chat session, then run `/skills`",
+                "to verify they loaded correctly.",
+                "",
+            ])
+
+        # Check if WhatsApp was detected
+        has_whatsapp = any(i.kind == "whatsapp-settings" and i.status == "migrated" for i in self.items)
+        if has_whatsapp:
+            notes.extend([
+                "",
+                "## WhatsApp Requires Re-Pairing",
+                "",
+                "WhatsApp uses QR-code pairing, not token-based auth. Your allowlist",
+                "was migrated, but you must re-pair the device by running:",
+                "",
+                "    hermes whatsapp",
+                "",
+            ])
+
        notes.extend([
            "- Run `hermes gateway install` if you need the gateway service",
            "- Review `~/.hermes/config.yaml` for any adjustments",
@@ -700,10 +700,14 @@ class AIAgent:
        except Exception:
            pass

-        # Direct OpenAI sessions use the Responses API path.  GPT-5.x tool
-        # calls with reasoning are rejected on /v1/chat/completions, and
-        # Hermes is a tool-using client by default.
-        if self.api_mode == "chat_completions" and self._is_direct_openai_url():
+        # GPT-5.x models require the Responses API path — they are rejected
+        # on /v1/chat/completions by both OpenAI and OpenRouter.  Also
+        # auto-upgrade for direct OpenAI URLs (api.openai.com) since all
+        # newer tool-calling models prefer Responses there.
+        if self.api_mode == "chat_completions" and (
+            self._is_direct_openai_url()
+            or self._model_requires_responses_api(self.model)
+        ):
            self.api_mode = "codex_responses"

        # Pre-warm OpenRouter model metadata cache in a background thread.
@@ -735,6 +739,7 @@ class AIAgent:
        # Interrupt mechanism for breaking out of tool loops
        self._interrupt_requested = False
        self._interrupt_message = None  # Optional message that triggered interrupt
+        self._execution_thread_id: int | None = None  # Set at run_conversation() start
        self._client_lock = threading.RLock()
        
        # Subagent delegation state
@@ -1402,6 +1407,12 @@ class AIAgent:
            else:
                print(f"📊 Context limit: {self.context_compressor.context_length:,} tokens (auto-compression disabled)")

+        # Check immediately so CLI users see the warning at startup.
+        # Gateway status_callback is not yet wired, so any warning is stored
+        # in _compression_warning and replayed in the first run_conversation().
+        self._compression_warning = None
+        self._check_compression_model_feasibility()
+
        # Snapshot primary runtime for per-turn restoration.  When fallback
        # activates during a turn, the next turn restores these values so the
        # preferred model gets a fresh attempt each time.  Uses a single dict
@@ -1693,6 +1704,104 @@ class AIAgent:
            except Exception:
                logger.debug("status_callback error in _emit_status", exc_info=True)

+    def _check_compression_model_feasibility(self) -> None:
+        """Warn at session start if the auxiliary compression model's context
+        window is smaller than the main model's compression threshold.
+
+        When the auxiliary model cannot fit the content that needs summarising,
+        compression will either fail outright (the LLM call errors) or produce
+        a severely truncated summary.
+
+        Called during ``__init__`` so CLI users see the warning immediately
+        (via ``_vprint``).  The gateway sets ``status_callback`` *after*
+        construction, so ``_replay_compression_warning()`` re-sends the
+        stored warning through the callback on the first
+        ``run_conversation()`` call.
+        """
+        if not self.compression_enabled:
+            return
+        try:
+            from agent.auxiliary_client import get_text_auxiliary_client
+            from agent.model_metadata import get_model_context_length
+
+            client, aux_model = get_text_auxiliary_client("compression")
+            if client is None or not aux_model:
+                msg = (
+                    "⚠ No auxiliary LLM provider configured — context "
+                    "compression will drop middle turns without a summary. "
+                    "Run `hermes setup` or set OPENROUTER_API_KEY."
+                )
+                self._compression_warning = msg
+                self._emit_status(msg)
+                logger.warning(
+                    "No auxiliary LLM provider for compression — "
+                    "summaries will be unavailable."
+                )
+                return
+
+            aux_base_url = str(getattr(client, "base_url", ""))
+            aux_api_key = str(getattr(client, "api_key", ""))
+            aux_context = get_model_context_length(
+                aux_model,
+                base_url=aux_base_url,
+                api_key=aux_api_key,
+            )
+
+            threshold = self.context_compressor.threshold_tokens
+            if aux_context < threshold:
+                # Suggest a threshold that would fit the aux model,
+                # rounded down to a clean percentage.
+                safe_pct = int((aux_context / self.context_compressor.context_length) * 100)
+                msg = (
+                    f"⚠ Compression model ({aux_model}) context "
+                    f"is {aux_context:,} tokens, but the main model's "
+                    f"compression threshold is {threshold:,} tokens. "
+                    f"Context compression will not be possible — the "
+                    f"content to summarise will exceed the auxiliary "
+                    f"model's context window.\n"
+                    f"  Fix options (config.yaml):\n"
+                    f"  1. Use a larger compression model:\n"
+                    f"       auxiliary:\n"
+                    f"         compression:\n"
+                    f"           model: <model-with-{threshold:,}+-context>\n"
+                    f"  2. Lower the compression threshold to fit "
+                    f"the current model:\n"
+                    f"       compression:\n"
+                    f"         threshold: 0.{safe_pct:02d}"
+                )
+                self._compression_warning = msg
+                self._emit_status(msg)
+                logger.warning(
+                    "Auxiliary compression model %s has %d token context, "
+                    "below the main model's compression threshold of %d "
+                    "tokens — compression summaries will fail or be "
+                    "severely truncated.",
+                    aux_model,
+                    aux_context,
+                    threshold,
+                )
+        except Exception as exc:
+            logger.debug(
+                "Compression feasibility check failed (non-fatal): %s", exc
+            )
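The suggested threshold in the warning is the auxiliary model's context expressed as a fraction of the main model's context, truncated to a whole percentage. A minimal sketch of that arithmetic follows; the function name `suggest_threshold` and the token counts are hypothetical, and it assumes the auxiliary context is smaller than the main context so the percentage stays below 100.

```python
def suggest_threshold(aux_context: int, main_context: int) -> str:
    """Format the largest whole-percent threshold that fits the aux model.

    Assumes aux_context < main_context, so the result has two digits.
    """
    safe_pct = int((aux_context / main_context) * 100)  # truncates toward zero
    return f"0.{safe_pct:02d}"


# Hypothetical token counts, for illustration only.
small_aux = suggest_threshold(32_768, 200_000)
tiny_aux = suggest_threshold(8_192, 128_000)
```

The `:02d` padding matters for single-digit percentages: without it a 6% threshold would be rendered as `0.6` instead of `0.06`.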
+
+    def _replay_compression_warning(self) -> None:
+        """Re-send the compression warning through ``status_callback``.
+
+        During ``__init__`` the gateway's ``status_callback`` is not yet
+        wired, so ``_emit_status`` only reaches ``_vprint`` (CLI).  This
+        method is called once at the start of the first
+        ``run_conversation()`` — by then the gateway has set the callback,
+        so every platform (Telegram, Discord, Slack, etc.) receives the
+        warning.
+        """
+        msg = getattr(self, "_compression_warning", None)
+        if msg and self.status_callback:
+            try:
+                self.status_callback("lifecycle", msg)
+            except Exception:
+                pass
+
    def _is_direct_openai_url(self, base_url: str = None) -> bool:
        """Return True when a base URL targets OpenAI's native API."""
        url = (base_url or self._base_url_lower).lower()
@@ -1702,6 +1811,21 @@ class AIAgent:
        """Return True when the base URL targets OpenRouter."""
        return "openrouter" in self._base_url_lower

+    @staticmethod
+    def _model_requires_responses_api(model: str) -> bool:
+        """Return True for models that require the Responses API path.
+
+        GPT-5.x models are rejected on /v1/chat/completions by both
+        OpenAI and OpenRouter (error: ``unsupported_api_for_model``).
+        Detect these so the correct api_mode is set regardless of
+        which provider is serving the model.
+        """
+        m = model.lower()
+        # Strip vendor prefix (e.g. "openai/gpt-5.4" → "gpt-5.4")
+        if "/" in m:
+            m = m.rsplit("/", 1)[-1]
+        return m.startswith("gpt-5")
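As a quick check of the prefix handling, the detection can be reproduced as a free function. This is a sketch only: the name `model_requires_responses_api` (without the leading underscore) and the non-GPT model string are hypothetical, while the logic mirrors the static method above.

```python
def model_requires_responses_api(model: str) -> bool:
    """Return True when the bare model name starts with 'gpt-5'."""
    m = model.lower()
    # Drop any vendor prefix, keeping only the part after the last slash.
    if "/" in m:
        m = m.rsplit("/", 1)[-1]
    return m.startswith("gpt-5")


prefixed = model_requires_responses_api("openai/gpt-5.4")
uppercase = model_requires_responses_api("GPT-5")
older = model_requires_responses_api("gpt-4o")
other_vendor = model_requires_responses_api("some-vendor/other-model")
```

Because the check runs on the model name rather than the base URL, the same result is returned whichever provider serves the model.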
+
    def _max_tokens_param(self, value: int) -> dict:
        """Return the correct max tokens kwarg for the current provider.
        
@@ -2709,8 +2833,10 @@ class AIAgent:
        """
        self._interrupt_requested = True
        self._interrupt_message = message
-        # Signal all tools to abort any in-flight operations immediately
-        _set_interrupt(True)
+        # Signal all tools to abort any in-flight operations immediately.
+        # Scope the interrupt to this agent's execution thread so other
+        # agents running in the same process (gateway) are not affected.
+        _set_interrupt(True, self._execution_thread_id)
        # Propagate interrupt to any running child agents (subagent delegation)
        with self._active_children_lock:
            children_copy = list(self._active_children)
@@ -2723,10 +2849,10 @@ class AIAgent:
            print("\n⚡ Interrupt requested" + (f": '{message[:40]}...'" if message and len(message) > 40 else f": '{message}'" if message else ""))
    
    def clear_interrupt(self) -> None:
-        """Clear any pending interrupt request and the global tool interrupt signal."""
+        """Clear any pending interrupt request and the per-thread tool interrupt signal."""
        self._interrupt_requested = False
        self._interrupt_message = None
-        _set_interrupt(False)
+        _set_interrupt(False, self._execution_thread_id)

    def _touch_activity(self, desc: str) -> None:
        """Update the last-activity timestamp and description (thread-safe)."""
@@ -5251,7 +5377,7 @@ class AIAgent:
            except Exception:
                pass

-            # Determine api_mode from provider / base URL
+            # Determine api_mode from provider / base URL / model
            fb_api_mode = "chat_completions"
            fb_base_url = str(fb_client.base_url)
            if fb_provider == "openai-codex":
@@ -5260,6 +5386,10 @@ class AIAgent:
                fb_api_mode = "anthropic_messages"
            elif self._is_direct_openai_url(fb_base_url):
                fb_api_mode = "codex_responses"
+            elif self._model_requires_responses_api(fb_model):
+                # GPT-5.x models need Responses API on every provider
+                # (OpenRouter, Copilot, direct OpenAI, etc.)
+                fb_api_mode = "codex_responses"

            old_model = self.model
            self.model = fb_model
@@ -5348,8 +5478,8 @@ class AIAgent:
        to the fallback provider for every subsequent turn.  Calling this at
        the top of ``run_conversation()`` makes fallback turn-scoped.

-        The gateway creates a fresh agent per message so this is a no-op
-        there (``_fallback_activated`` is always False at turn start).
+        The gateway caches agents across messages (``_agent_cache`` in
+        ``gateway/run.py``), so this restoration IS needed there too.
        """
        if not self._fallback_activated:
            return False
@@ -7445,6 +7575,12 @@ class AIAgent:
                    )
            except Exception:
                pass
+        # Replay compression warning through status_callback for gateway
+        # platforms (the callback was not wired during __init__).
+        if self._compression_warning:
+            self._replay_compression_warning()
+            self._compression_warning = None  # send once
+
        # NOTE: _turns_since_memory and _iters_since_skill are NOT reset here.
        # They are initialized in __init__ and must persist across run_conversation
        # calls so that nudge logic accumulates correctly in CLI mode.
@@ -7666,6 +7802,11 @@ class AIAgent:
        compression_attempts = 0
        _turn_exit_reason = "unknown"  # Diagnostic: why the loop ended
        
+        # Record the execution thread so interrupt()/clear_interrupt() can
+        # scope the tool-level interrupt signal to THIS agent's thread only.
+        # Must be set before clear_interrupt() which uses it.
+        self._execution_thread_id = threading.current_thread().ident
+
        # Clear any stale interrupt state at start
        self.clear_interrupt()

@@ -8144,8 +8285,24 @@ class AIAgent:
                                    _text_parts.append(getattr(_blk, "text", ""))
                            _trunc_content = "\n".join(_text_parts) if _text_parts else None

+                        # A response is "thinking exhausted" only when the model
+                        # actually produced reasoning blocks but no visible text after
+                        # them.  Models that do not use <think> tags (e.g. GLM-4.7 on
+                        # NVIDIA Build, MiniMax) may return content=None or an empty
+                        # string for unrelated reasons — treat those as normal
+                        # truncations that deserve continuation retries, not as
+                        # thinking-budget exhaustion.
+                        _has_think_tags = bool(
+                            _trunc_content and re.search(
+                                r'<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)[^>]*>',
+                                _trunc_content,
+                                re.IGNORECASE,
+                            )
+                        )
                        _thinking_exhausted = (
-                            not _trunc_has_tool_calls and (
+                            not _trunc_has_tool_calls
+                            and _has_think_tags
+                            and (
                                (_trunc_content is not None and not self._has_content_after_think_block(_trunc_content))
                                or _trunc_content is None
                            )
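The think-tag gate added in this hunk can be exercised in isolation. The following is a minimal sketch, not the agent's actual code: `has_think_tags` is a hypothetical helper name, and only the regular expression is taken from the diff.

```python
import re

# Pattern copied from the diff; the helper name is hypothetical.
_THINK_TAG_RE = re.compile(
    r'<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)[^>]*>',
    re.IGNORECASE,
)


def has_think_tags(content):
    """Return True when content contains an opening reasoning tag."""
    # None and "" are falsy, so they short-circuit to False before the search.
    return bool(content and _THINK_TAG_RE.search(content))
```

Because `None` and empty content short-circuit to `False`, a response with no content at all is never classified as thinking-exhausted under this gate; it falls through to the normal truncation handling instead.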
@@ -9373,12 +9530,41 @@ class AIAgent:
                            invalid_json_args.append((tc.function.name, str(e)))
                    
                    if invalid_json_args:
+                        # Check if the invalid JSON is due to truncation rather
+                        # than a model formatting mistake.  Routers sometimes
+                        # rewrite finish_reason from "length" to "tool_calls",
+                        # hiding the truncation from the length handler above.
+                        # Detect truncation: args that don't end with } or ]
+                        # (after stripping whitespace) are cut off mid-stream.
+                        _truncated = any(
+                            not (tc.function.arguments or "").rstrip().endswith(("}", "]"))
+                            for tc in assistant_message.tool_calls
+                            if tc.function.name in {n for n, _ in invalid_json_args}
+                        )
+                        if _truncated:
+                            self._vprint(
+                                f"{self.log_prefix}⚠️  Truncated tool call arguments detected "
+                                f"(finish_reason={finish_reason!r}) — refusing to execute.",
+                                force=True,
+                            )
+                            self._invalid_json_retries = 0
+                            self._cleanup_task_resources(effective_task_id)
+                            self._persist_session(messages, conversation_history)
+                            return {
+                                "final_response": None,
+                                "messages": messages,
+                                "api_calls": api_call_count,
+                                "completed": False,
+                                "partial": True,
+                                "error": "Response truncated due to output length limit",
+                            }
+
                        # Track retries for invalid JSON arguments
                        self._invalid_json_retries += 1
-                        
+
                        tool_name, error_msg = invalid_json_args[0]
                        self._vprint(f"{self.log_prefix}⚠️  Invalid JSON in tool call arguments for '{tool_name}': {error_msg}")
-                        
+
                        if self._invalid_json_retries < 3:
                            self._vprint(f"{self.log_prefix}🔄 Retrying API call ({self._invalid_json_retries}/3)...")
                            # Don't add anything to messages, just retry the API call
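The truncation heuristic in this hunk reduces to a single string check. A minimal sketch, assuming a hypothetical helper name `looks_truncated` (the diff inlines the expression rather than naming it):

```python
def looks_truncated(arguments):
    """Heuristic from the diff: args not ending in } or ] were cut off."""
    # Trailing whitespace is ignored; None is treated as an empty string.
    return not (arguments or "").rstrip().endswith(("}", "]"))
```

In this sketch, empty or missing arguments also count as truncated, since an empty string ends with neither bracket. In the diff the check only runs for tool calls whose arguments already failed JSON parsing, so well-formed calls are never affected.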
@@ -8,7 +8,7 @@
      "name": "hermes-whatsapp-bridge",
      "version": "1.0.0",
      "dependencies": {
-        "@whiskeysockets/baileys": "7.0.0-rc.9",
+        "@whiskeysockets/baileys": "WhiskeySockets/Baileys#fix/abprops-abt-fetch",
        "express": "^4.21.0",
        "pino": "^9.0.0",
        "qrcode-terminal": "^0.12.0"
@@ -730,21 +730,22 @@
      }
    },
    "node_modules/@whiskeysockets/baileys": {
+      "name": "baileys",
      "version": "7.0.0-rc.9",
-      "resolved": "https://registry.npmjs.org/@whiskeysockets/baileys/-/baileys-7.0.0-rc.9.tgz",
-      "integrity": "sha512-YFm5gKXfDP9byCXCW3OPHKXLzrAKzolzgVUlRosHHgwbnf2YOO3XknkMm6J7+F0ns8OA0uuSBhgkRHTDtqkacw==",
+      "resolved": "git+ssh://git@github.com/WhiskeySockets/Baileys.git#01047debd81beb20da7b7779b08edcb06aa03770",
      "hasInstallScript": true,
      "license": "MIT",
      "dependencies": {
        "@cacheable/node-cache": "^1.4.0",
        "@hapi/boom": "^9.1.3",
        "async-mutex": "^0.5.0",
-        "libsignal": "git+https://github.com/whiskeysockets/libsignal-node.git",
+        "libsignal": "git+https://github.com/whiskeysockets/libsignal-node",
        "lru-cache": "^11.1.0",
        "music-metadata": "^11.7.0",
        "p-queue": "^9.0.0",
        "pino": "^9.6",
        "protobufjs": "^7.2.4",
+        "whatsapp-rust-bridge": "0.5.2",
        "ws": "^8.13.0"
      },
      "engines": {
@@ -2125,6 +2126,12 @@
        "node": ">= 0.8"
      }
    },
+    "node_modules/whatsapp-rust-bridge": {
+      "version": "0.5.2",
+      "resolved": "https://registry.npmjs.org/whatsapp-rust-bridge/-/whatsapp-rust-bridge-0.5.2.tgz",
+      "integrity": "sha512-6KBRNvxg6WMIwZ/euA8qVzj16qxMBzLllfmaJIP1JGAAfSvwn6nr8JDOMXeqpXPEOl71UfOG+79JwKEoT2b1Fw==",
+      "license": "MIT"
+    },
    "node_modules/win-guid": {
      "version": "0.2.1",
      "resolved": "https://registry.npmjs.org/win-guid/-/win-guid-0.2.1.tgz",
@@ -8,7 +8,7 @@
    "start": "node bridge.js"
  },
  "dependencies": {
-    "@whiskeysockets/baileys": "7.0.0-rc.9",
+    "@whiskeysockets/baileys": "WhiskeySockets/Baileys#fix/abprops-abt-fetch",
    "express": "^4.21.0",
    "qrcode-terminal": "^0.12.0",
    "pino": "^9.0.0"
@@ -7,6 +7,7 @@ from agent.models_dev import (
    PROVIDER_TO_MODELS_DEV,
    _extract_context,
    fetch_models_dev,
+    get_model_capabilities,
    lookup_models_dev_context,
 )

@@ -195,3 +196,88 @@ class TestFetchModelsDev:
        result = fetch_models_dev()
        mock_get.assert_not_called()
        assert result == SAMPLE_REGISTRY
+
+
+# ---------------------------------------------------------------------------
+# get_model_capabilities — vision via modalities.input
+# ---------------------------------------------------------------------------
+
+
+CAPS_REGISTRY = {
+    "google": {
+        "id": "google",
+        "models": {
+            "gemma-4-31b-it": {
+                "id": "gemma-4-31b-it",
+                "attachment": False,
+                "tool_call": True,
+                "modalities": {"input": ["text", "image"]},
+                "limit": {"context": 128000, "output": 8192},
+            },
+            "gemma-3-1b": {
+                "id": "gemma-3-1b",
+                "tool_call": True,
+                "limit": {"context": 32000, "output": 8192},
+            },
+        },
+    },
+    "anthropic": {
+        "id": "anthropic",
+        "models": {
+            "claude-sonnet-4": {
+                "id": "claude-sonnet-4",
+                "attachment": True,
+                "tool_call": True,
+                "limit": {"context": 200000, "output": 64000},
+            },
+        },
+    },
+}
+
+
+class TestGetModelCapabilities:
+    """Tests for get_model_capabilities vision detection."""
+
+    def test_vision_from_attachment_flag(self):
+        """Models with attachment=True should report supports_vision=True."""
+        with patch("agent.models_dev.fetch_models_dev", return_value=CAPS_REGISTRY):
+            caps = get_model_capabilities("anthropic", "claude-sonnet-4")
+        assert caps is not None
+        assert caps.supports_vision is True
+
+    def test_vision_from_modalities_input_image(self):
+        """Models with 'image' in modalities.input but attachment=False should
+        still report supports_vision=True (the core fix in this PR)."""
+        with patch("agent.models_dev.fetch_models_dev", return_value=CAPS_REGISTRY):
+            caps = get_model_capabilities("google", "gemma-4-31b-it")
+        assert caps is not None
+        assert caps.supports_vision is True
+
+    def test_no_vision_without_attachment_or_modalities(self):
+        """Models with neither attachment nor image modality should be non-vision."""
+        with patch("agent.models_dev.fetch_models_dev", return_value=CAPS_REGISTRY):
+            caps = get_model_capabilities("google", "gemma-3-1b")
+        assert caps is not None
+        assert caps.supports_vision is False
+
+    def test_modalities_non_dict_handled(self):
+        """Non-dict modalities field should not crash."""
+        registry = {
+            "google": {"id": "google", "models": {
+                "weird-model": {
+                    "id": "weird-model",
+                    "modalities": "text",  # not a dict
+                    "limit": {"context": 200000, "output": 8192},
+                },
+            }},
+        }
+        with patch("agent.models_dev.fetch_models_dev", return_value=registry):
+            caps = get_model_capabilities("gemini", "weird-model")
+        assert caps is not None
+        assert caps.supports_vision is False
+
+    def test_model_not_found_returns_none(self):
+        """Unknown model should return None."""
+        with patch("agent.models_dev.fetch_models_dev", return_value=CAPS_REGISTRY):
+            caps = get_model_capabilities("anthropic", "nonexistent-model")
+        assert caps is None
@@ -211,7 +211,8 @@ def make_adapter(platform: Platform, runner=None):
    config = PlatformConfig(enabled=True, token="e2e-test-token")

    if platform == Platform.DISCORD:
-        with patch.object(DiscordAdapter, "_load_participated_threads", return_value=set()):
+        from gateway.platforms.helpers import ThreadParticipationTracker
+        with patch.object(ThreadParticipationTracker, "_load", return_value=set()):
            adapter = DiscordAdapter(config)
        platform_key = Platform.DISCORD
    elif platform == Platform.SLACK:
@@ -409,11 +409,50 @@ class TestChatCompletionsEndpoint:
                )
                assert resp.status == 200
                assert "text/event-stream" in resp.headers.get("Content-Type", "")
+                assert resp.headers.get("X-Accel-Buffering") == "no"
                body = await resp.text()
                assert "data: " in body
                assert "[DONE]" in body
                assert "Hello!" in body

+    @pytest.mark.asyncio
+    async def test_stream_sends_keepalive_during_quiet_tool_gap(self, adapter):
+        """Idle SSE streams should send keepalive comments while tools run silently."""
+        import asyncio
+        import gateway.platforms.api_server as api_server_mod
+
+        app = _create_app(adapter)
+        async with TestClient(TestServer(app)) as cli:
+            async def _mock_run_agent(**kwargs):
+                cb = kwargs.get("stream_delta_callback")
+                if cb:
+                    cb("Working")
+                    await asyncio.sleep(0.65)
+                    cb("...done")
+                return (
+                    {"final_response": "Working...done", "messages": [], "api_calls": 1},
+                    {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
+                )
+
+            with (
+                patch.object(api_server_mod, "CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS", 0.01),
+                patch.object(adapter, "_run_agent", side_effect=_mock_run_agent),
+            ):
+                resp = await cli.post(
+                    "/v1/chat/completions",
+                    json={
+                        "model": "test",
+                        "messages": [{"role": "user", "content": "do the thing"}],
+                        "stream": True,
+                    },
+                )
+                assert resp.status == 200
+                body = await resp.text()
+                assert ": keepalive" in body
+                assert "Working" in body
+                assert "...done" in body
+                assert "[DONE]" in body
+
    @pytest.mark.asyncio
    async def test_stream_survives_tool_call_none_sentinel(self, adapter):
        """stream_delta_callback(None) mid-stream (tool calls) must NOT kill the SSE stream.
@@ -119,28 +119,29 @@ class TestDeduplication:
    def test_first_message_not_duplicate(self):
        from gateway.platforms.dingtalk import DingTalkAdapter
        adapter = DingTalkAdapter(PlatformConfig(enabled=True))
-        assert adapter._is_duplicate("msg-1") is False
+        assert adapter._dedup.is_duplicate("msg-1") is False

    def test_second_same_message_is_duplicate(self):
        from gateway.platforms.dingtalk import DingTalkAdapter
        adapter = DingTalkAdapter(PlatformConfig(enabled=True))
-        adapter._is_duplicate("msg-1")
-        assert adapter._is_duplicate("msg-1") is True
+        adapter._dedup.is_duplicate("msg-1")
+        assert adapter._dedup.is_duplicate("msg-1") is True

    def test_different_messages_not_duplicate(self):
        from gateway.platforms.dingtalk import DingTalkAdapter
        adapter = DingTalkAdapter(PlatformConfig(enabled=True))
-        adapter._is_duplicate("msg-1")
-        assert adapter._is_duplicate("msg-2") is False
+        adapter._dedup.is_duplicate("msg-1")
+        assert adapter._dedup.is_duplicate("msg-2") is False

    def test_cache_cleanup_on_overflow(self):
-        from gateway.platforms.dingtalk import DingTalkAdapter, DEDUP_MAX_SIZE
+        from gateway.platforms.dingtalk import DingTalkAdapter
        adapter = DingTalkAdapter(PlatformConfig(enabled=True))
+        max_size = adapter._dedup._max_size
        # Fill beyond max
-        for i in range(DEDUP_MAX_SIZE + 10):
-            adapter._is_duplicate(f"msg-{i}")
+        for i in range(max_size + 10):
+            adapter._dedup.is_duplicate(f"msg-{i}")
        # Cache should have been pruned
-        assert len(adapter._seen_messages) <= DEDUP_MAX_SIZE + 10
+        assert len(adapter._dedup._seen) <= max_size + 10


 # ---------------------------------------------------------------------------
@@ -253,13 +254,13 @@ class TestConnect:
        from gateway.platforms.dingtalk import DingTalkAdapter
        adapter = DingTalkAdapter(PlatformConfig(enabled=True))
        adapter._session_webhooks["a"] = "http://x"
-        adapter._seen_messages["b"] = 1.0
+        adapter._dedup._seen["b"] = 1.0
        adapter._http_client = AsyncMock()
        adapter._stream_task = None

        await adapter.disconnect()
        assert len(adapter._session_webhooks) == 0
-        assert len(adapter._seen_messages) == 0
+        assert len(adapter._dedup._seen) == 0
        assert adapter._http_client is None
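The tests above only touch `is_duplicate`, `_seen`, and `_max_size` on the shared dedup helper. A minimal sketch of a cache with that shape follows; the class name `DedupSketch`, the `ttl_seconds` parameter, its 300-second default, and the injectable `now` argument are all assumptions made for illustration, not the real helper's signature.

```python
import time


class DedupSketch:
    """Hypothetical TTL-based message dedup cache (not the real helper)."""

    def __init__(self, max_size=1000, ttl_seconds=300.0):
        self._seen = {}  # message id -> time first seen
        self._max_size = max_size
        self._ttl = ttl_seconds

    def is_duplicate(self, msg_id, now=None):
        now = time.time() if now is None else now
        seen_at = self._seen.get(msg_id)
        if seen_at is not None and now - seen_at < self._ttl:
            return True
        self._seen[msg_id] = now
        # Assumption: pruning only runs on overflow and only drops expired
        # entries, so the cache may briefly exceed max_size.
        if len(self._seen) > self._max_size:
            cutoff = now - self._ttl
            self._seen = {k: t for k, t in self._seen.items() if t > cutoff}
        return False
```

Pruning only expired entries would explain why the overflow test asserts a loose upper bound rather than an exact size.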


@@ -137,4 +137,4 @@ async def test_connect_releases_token_lock_on_timeout(monkeypatch):

    assert ok is False
    assert released == [("discord-bot-token", "test-token")]
-    assert adapter._token_lock_identity is None
+    assert adapter._platform_lock_identity is None
@@ -302,7 +302,7 @@ async def test_discord_bot_thread_skips_mention_requirement(adapter, monkeypatch
    monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")

    # Simulate bot having previously participated in thread 456
-    adapter._bot_participated_threads.add("456")
+    adapter._threads.mark("456")

    thread = FakeThread(channel_id=456, name="existing thread")
    message = make_message(channel=thread, content="follow-up without mention")
@@ -344,7 +344,7 @@ async def test_discord_auto_thread_tracks_participation(adapter, monkeypatch):

    await adapter._handle_message(message)

-    assert "555" in adapter._bot_participated_threads
+    assert "555" in adapter._threads


@pytest.mark.asyncio
@@ -358,4 +358,4 @@ async def test_discord_thread_participation_tracked_on_dispatch(adapter, monkeyp

    await adapter._handle_message(message)

-    assert "777" in adapter._bot_participated_threads
+    assert "777" in adapter._threads
@@ -1,6 +1,6 @@
 """Tests for Discord thread participation persistence.

-Verifies that _bot_participated_threads survives adapter restarts by
+Verifies that _threads (ThreadParticipationTracker) survives adapter restarts by
 being persisted to ~/.hermes/discord_threads.json.
 """

@@ -25,13 +25,13 @@ class TestDiscordThreadPersistence:

    def test_starts_empty_when_no_state_file(self, tmp_path):
        adapter = self._make_adapter(tmp_path)
-        assert adapter._bot_participated_threads == set()
+        assert "$nonexistent" not in adapter._threads

    def test_track_thread_persists_to_disk(self, tmp_path):
        adapter = self._make_adapter(tmp_path)
        with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
-            adapter._track_thread("111")
-            adapter._track_thread("222")
+            adapter._threads.mark("111")
+            adapter._threads.mark("222")

        state_file = tmp_path / "discord_threads.json"
        assert state_file.exists()
@@ -42,42 +42,43 @@ class TestDiscordThreadPersistence:
        """Threads tracked by one adapter instance are visible to the next."""
        adapter1 = self._make_adapter(tmp_path)
        with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
-            adapter1._track_thread("aaa")
-            adapter1._track_thread("bbb")
+            adapter1._threads.mark("aaa")
+            adapter1._threads.mark("bbb")

        adapter2 = self._make_adapter(tmp_path)
-        assert "aaa" in adapter2._bot_participated_threads
-        assert "bbb" in adapter2._bot_participated_threads
+        assert "aaa" in adapter2._threads
+        assert "bbb" in adapter2._threads

    def test_duplicate_track_does_not_double_save(self, tmp_path):
        adapter = self._make_adapter(tmp_path)
        with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
-            adapter._track_thread("111")
-            adapter._track_thread("111")  # no-op
+            adapter._threads.mark("111")
+            adapter._threads.mark("111")  # no-op

        saved = json.loads((tmp_path / "discord_threads.json").read_text())
        assert saved.count("111") == 1

    def test_caps_at_max_tracked_threads(self, tmp_path):
        adapter = self._make_adapter(tmp_path)
-        adapter._MAX_TRACKED_THREADS = 5
+        adapter._threads._max_tracked = 5
        with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
            for i in range(10):
-                adapter._track_thread(str(i))
+                adapter._threads.mark(str(i))

-        assert len(adapter._bot_participated_threads) == 5
+        saved = json.loads((tmp_path / "discord_threads.json").read_text())
+        assert len(saved) == 5

    def test_corrupted_state_file_falls_back_to_empty(self, tmp_path):
        state_file = tmp_path / "discord_threads.json"
        state_file.write_text("not valid json{{{")
        adapter = self._make_adapter(tmp_path)
-        assert adapter._bot_participated_threads == set()
+        assert "$nonexistent" not in adapter._threads

    def test_missing_hermes_home_does_not_crash(self, tmp_path):
        """Load/save tolerate missing directories."""
        fake_home = tmp_path / "nonexistent" / "deep"
        with patch.dict(os.environ, {"HERMES_HOME": str(fake_home)}):
-            from gateway.platforms.discord import DiscordAdapter
-            # _load should return empty set, not crash
-            threads = DiscordAdapter._load_participated_threads()
-            assert threads == set()
+            from gateway.platforms.helpers import ThreadParticipationTracker
+            # ThreadParticipationTracker should start empty, not crash
+            tracker = ThreadParticipationTracker("discord")
+            assert "$test" not in tracker
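The behaviour these tests exercise (membership checks, `mark()`, a size cap, and tolerance of missing or corrupted state files) can be sketched as follows. `TrackerSketch` is a hypothetical stand-in, not the real `ThreadParticipationTracker`: the constructor arguments, the default cap of 500, and the choice to keep the most recently marked IDs are all assumptions.

```python
import json
import tempfile
from pathlib import Path


class TrackerSketch:
    """Hypothetical stand-in for ThreadParticipationTracker."""

    def __init__(self, state_path, max_tracked=500):
        self._path = Path(state_path)
        self._max_tracked = max_tracked
        self._ids = self._load()

    def _load(self):
        # A missing or corrupted file falls back to an empty tracker.
        try:
            data = json.loads(self._path.read_text())
        except (OSError, ValueError):
            return {}
        if not isinstance(data, list):
            return {}
        # A dict preserves insertion order, unlike a set.
        return dict.fromkeys(str(item) for item in data)

    def _save(self):
        # Assumption: keep only the most recently marked IDs when over the cap.
        ids = list(self._ids)[-self._max_tracked:]
        self._ids = dict.fromkeys(ids)
        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._path.write_text(json.dumps(ids))

    def mark(self, thread_id):
        if thread_id in self._ids:
            return  # already tracked; skip the redundant write
        self._ids[thread_id] = None
        self._save()

    def __contains__(self, thread_id):
        return thread_id in self._ids
```

Exposing `__contains__` instead of a raw set is what lets the updated tests write `"555" in adapter._threads` without reaching into the tracker's storage.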
@@ -195,6 +195,105 @@ async def test_internal_event_does_not_trigger_pairing(monkeypatch, tmp_path):
    )


+@pytest.mark.asyncio
+async def test_notify_on_complete_preserves_user_identity(monkeypatch, tmp_path):
+    """Synthetic completion event should carry user_id and user_name from the watcher."""
+    import tools.process_registry as pr_module
+
+    sessions = [
+        SimpleNamespace(
+            output_buffer="done\n", exited=True, exit_code=0, command="echo test"
+        ),
+    ]
+    monkeypatch.setattr(pr_module, "process_registry", _FakeRegistry(sessions))
+
+    async def _instant_sleep(*_a, **_kw):
+        pass
+    monkeypatch.setattr(asyncio, "sleep", _instant_sleep)
+
+    runner = _build_runner(monkeypatch, tmp_path)
+    adapter = runner.adapters[Platform.DISCORD]
+
+    watcher = _watcher_dict_with_notify()
+    watcher["user_id"] = "user-42"
+    watcher["user_name"] = "alice"
+
+    await runner._run_process_watcher(watcher)
+
+    assert adapter.handle_message.await_count == 1
+    event = adapter.handle_message.await_args.args[0]
+    assert event.source.user_id == "user-42"
+    assert event.source.user_name == "alice"
+
+
+@pytest.mark.asyncio
+async def test_none_user_id_skips_pairing(monkeypatch, tmp_path):
+    """A non-internal event with user_id=None should be silently dropped."""
+    import gateway.run as gateway_run
+
+    monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
+    (tmp_path / "config.yaml").write_text("", encoding="utf-8")
+
+    runner = GatewayRunner(GatewayConfig())
+    adapter = SimpleNamespace(send=AsyncMock())
+    runner.adapters[Platform.TELEGRAM] = adapter
+
+    source = SessionSource(
+        platform=Platform.TELEGRAM,
+        chat_id="123",
+        chat_type="dm",
+        user_id=None,
+    )
+    event = MessageEvent(
+        text="service message",
+        source=source,
+        internal=False,
+    )
+
+    result = await runner._handle_message(event)
+
+    # Should return None (dropped) and NOT send any pairing message
+    assert result is None
+    assert adapter.send.await_count == 0
+
+
+@pytest.mark.asyncio
+async def test_none_user_id_does_not_generate_pairing_code(monkeypatch, tmp_path):
+    """A message with user_id=None must never call generate_code."""
+    import gateway.run as gateway_run
+
+    monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
+    (tmp_path / "config.yaml").write_text("", encoding="utf-8")
+
+    runner = GatewayRunner(GatewayConfig())
+    adapter = SimpleNamespace(send=AsyncMock())
+    runner.adapters[Platform.DISCORD] = adapter
+
+    generate_called = False
+    original_generate = runner.pairing_store.generate_code
+
+    def tracking_generate(*args, **kwargs):
+        nonlocal generate_called
+        generate_called = True
+        return original_generate(*args, **kwargs)
+
+    runner.pairing_store.generate_code = tracking_generate
+
+    source = SessionSource(
+        platform=Platform.DISCORD,
+        chat_id="456",
+        chat_type="dm",
+        user_id=None,
+    )
+    event = MessageEvent(text="anonymous", source=source, internal=False)
+
+    await runner._handle_message(event)
+
+    assert not generate_called, (
+        "Pairing code should NOT be generated for messages with user_id=None"
+    )
+
+
@pytest.mark.asyncio
 async def test_non_internal_event_without_user_triggers_pairing(monkeypatch, tmp_path):
    """Verify the normal (non-internal) path still triggers pairing for unknown users."""
@@ -157,7 +157,9 @@ def _make_fake_mautrix():
    mautrix_crypto_store = types.ModuleType("mautrix.crypto.store")

    class MemoryCryptoStore:
-        pass
+        def __init__(self, account_id="", pickle_key=""):
+            self.account_id = account_id
+            self.pickle_key = pickle_key

    mautrix_crypto_store.MemoryCryptoStore = MemoryCryptoStore

@@ -1041,20 +1043,28 @@ class TestMatrixSyncLoop:
            call_count += 1
            if call_count >= 1:
                adapter._closing = True
-            return {"rooms": {"join": {"!room:example.org": {}}}}
+            return {"rooms": {"join": {"!room:example.org": {}}}, "next_batch": "s1234"}

        mock_crypto = MagicMock()
        mock_crypto.share_keys = AsyncMock()

+        mock_sync_store = MagicMock()
+        mock_sync_store.get_next_batch = AsyncMock(return_value=None)
+        mock_sync_store.put_next_batch = AsyncMock()
+
        fake_client = MagicMock()
        fake_client.sync = AsyncMock(side_effect=_sync_once)
        fake_client.crypto = mock_crypto
+        fake_client.sync_store = mock_sync_store
+        fake_client.handle_sync = MagicMock(return_value=[])
        adapter._client = fake_client

        await adapter._sync_loop()

        fake_client.sync.assert_awaited_once()
        mock_crypto.share_keys.assert_awaited_once()
+        fake_client.handle_sync.assert_called_once()
+        mock_sync_store.put_next_batch.assert_awaited_once_with("s1234")


 class TestMatrixEncryptedSendFallback:
@@ -247,7 +247,7 @@ async def test_require_mention_bot_participated_thread(monkeypatch):
    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")

    adapter = _make_adapter()
-    adapter._bot_participated_threads.add("$thread1")
+    adapter._threads.mark("$thread1")

    event = _make_event("hello without mention", thread_id="$thread1")

@@ -298,7 +298,7 @@ async def test_auto_thread_preserves_existing_thread(monkeypatch):
    monkeypatch.delenv("MATRIX_AUTO_THREAD", raising=False)

    adapter = _make_adapter()
-    adapter._bot_participated_threads.add("$thread_root")
+    adapter._threads.mark("$thread_root")
    event = _make_event("reply in thread", thread_id="$thread_root")

    await adapter._on_room_message(event)
@@ -340,17 +340,17 @@ async def test_auto_thread_disabled(monkeypatch):

@pytest.mark.asyncio
 async def test_auto_thread_tracks_participation(monkeypatch):
-    """Auto-created threads are tracked in _bot_participated_threads."""
+    """Auto-created threads are tracked in _threads."""
    monkeypatch.setenv("MATRIX_REQUIRE_MENTION", "false")
    monkeypatch.delenv("MATRIX_AUTO_THREAD", raising=False)

    adapter = _make_adapter()
    event = _make_event("hello", event_id="$msg1")

-    with patch.object(adapter, "_save_participated_threads"):
+    with patch.object(adapter._threads, "_save"):
        await adapter._on_room_message(event)

-    assert "$msg1" in adapter._bot_participated_threads
+    assert "$msg1" in adapter._threads


 # ---------------------------------------------------------------------------
@@ -361,56 +361,54 @@ async def test_auto_thread_tracks_participation(monkeypatch):
 class TestThreadPersistence:
    def test_empty_state_file(self, tmp_path, monkeypatch):
        """No state file → empty set."""
-        from gateway.platforms.matrix import MatrixAdapter
+        from gateway.platforms.helpers import ThreadParticipationTracker
        monkeypatch.setattr(
-            MatrixAdapter, "_thread_state_path",
-            staticmethod(lambda: tmp_path / "matrix_threads.json"),
+            ThreadParticipationTracker, "_state_path",
+            lambda self: tmp_path / "matrix_threads.json",
        )
        adapter = _make_adapter()
-        loaded = adapter._load_participated_threads()
-        assert loaded == set()
+        assert "$nonexistent" not in adapter._threads

    def test_track_thread_persists(self, tmp_path, monkeypatch):
-        """_track_thread writes to disk."""
-        from gateway.platforms.matrix import MatrixAdapter
+        """mark() writes to disk."""
+        from gateway.platforms.helpers import ThreadParticipationTracker
        state_path = tmp_path / "matrix_threads.json"
        monkeypatch.setattr(
-            MatrixAdapter, "_thread_state_path",
-            staticmethod(lambda: state_path),
+            ThreadParticipationTracker, "_state_path",
+            lambda self: state_path,
        )
        adapter = _make_adapter()
-        adapter._track_thread("$thread_abc")
+        adapter._threads.mark("$thread_abc")

        data = json.loads(state_path.read_text())
        assert "$thread_abc" in data

    def test_threads_survive_reload(self, tmp_path, monkeypatch):
        """Persisted threads are loaded by a new adapter instance."""
-        from gateway.platforms.matrix import MatrixAdapter
+        from gateway.platforms.helpers import ThreadParticipationTracker
        state_path = tmp_path / "matrix_threads.json"
        state_path.write_text(json.dumps(["$t1", "$t2"]))
        monkeypatch.setattr(
-            MatrixAdapter, "_thread_state_path",
-            staticmethod(lambda: state_path),
+            ThreadParticipationTracker, "_state_path",
+            lambda self: state_path,
        )
        adapter = _make_adapter()
-        assert "$t1" in adapter._bot_participated_threads
-        assert "$t2" in adapter._bot_participated_threads
+        assert "$t1" in adapter._threads
+        assert "$t2" in adapter._threads

    def test_cap_max_tracked_threads(self, tmp_path, monkeypatch):
-        """Thread set is trimmed to _MAX_TRACKED_THREADS."""
-        from gateway.platforms.matrix import MatrixAdapter
+        """Thread set is trimmed to max_tracked."""
+        from gateway.platforms.helpers import ThreadParticipationTracker
        state_path = tmp_path / "matrix_threads.json"
        monkeypatch.setattr(
-            MatrixAdapter, "_thread_state_path",
-            staticmethod(lambda: state_path),
+            ThreadParticipationTracker, "_state_path",
+            lambda self: state_path,
        )
        adapter = _make_adapter()
-        adapter._MAX_TRACKED_THREADS = 5
+        adapter._threads._max_tracked = 5

        for i in range(10):
-            adapter._bot_participated_threads.add(f"$t{i}")
-        adapter._save_participated_threads()
+            adapter._threads.mark(f"$t{i}")

        data = json.loads(state_path.read_text())
        assert len(data) == 5
@@ -447,7 +445,7 @@ async def test_dm_mention_thread_creates_thread(monkeypatch):
    _set_dm(adapter)
    event = _make_event("@hermes:example.org help me", event_id="$dm1")

-    with patch.object(adapter, "_save_participated_threads"):
+    with patch.object(adapter._threads, "_save"):
        await adapter._on_room_message(event)

    adapter.handle_message.assert_awaited_once()
@@ -480,7 +478,7 @@ async def test_dm_mention_thread_preserves_existing_thread(monkeypatch):

    adapter = _make_adapter()
    _set_dm(adapter)
-    adapter._bot_participated_threads.add("$existing_thread")
+    adapter._threads.mark("$existing_thread")
    event = _make_event("@hermes:example.org help me", thread_id="$existing_thread")

    await adapter._on_room_message(event)
@@ -491,7 +489,7 @@ async def test_dm_mention_thread_preserves_existing_thread(monkeypatch):

@pytest.mark.asyncio
 async def test_dm_mention_thread_tracks_participation(monkeypatch):
-    """DM mention-thread tracks the thread in _bot_participated_threads."""
+    """DM mention-thread tracks the thread in _threads."""
    monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", "true")
    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")

@@ -499,10 +497,10 @@ async def test_dm_mention_thread_tracks_participation(monkeypatch):
    _set_dm(adapter)
    event = _make_event("@hermes:example.org help", event_id="$dm1")

-    with patch.object(adapter, "_save_participated_threads"):
+    with patch.object(adapter._threads, "_save"):
        await adapter._on_room_message(event)

-    assert "$dm1" in adapter._bot_participated_threads
+    assert "$dm1" in adapter._threads


 # ---------------------------------------------------------------------------
@@ -614,25 +614,27 @@ class TestMattermostDedup:
        assert self.adapter.handle_message.call_count == 2

    def test_prune_seen_clears_expired(self):
-        """_prune_seen should remove entries older than _SEEN_TTL."""
+        """Dedup cache should remove entries older than TTL on overflow."""
        now = time.time()
+        dedup = self.adapter._dedup
        # Fill with enough expired entries to trigger pruning
-        for i in range(self.adapter._SEEN_MAX + 10):
-            self.adapter._seen_posts[f"old_{i}"] = now - 600  # 10 min ago
+        for i in range(dedup._max_size + 10):
+            dedup._seen[f"old_{i}"] = now - 600  # 10 min ago (older than default TTL)

        # Add a fresh one
-        self.adapter._seen_posts["fresh"] = now
+        dedup._seen["fresh"] = now

-        self.adapter._prune_seen()
+        # Trigger pruning by calling is_duplicate with a new entry (over max_size)
+        dedup.is_duplicate("trigger_prune")

        # Old entries should be pruned, fresh one kept
-        assert "fresh" in self.adapter._seen_posts
-        assert len(self.adapter._seen_posts) < self.adapter._SEEN_MAX
+        assert "fresh" in dedup._seen
+        assert len(dedup._seen) < dedup._max_size + 10

    def test_seen_cache_tracks_post_ids(self):
-        """Posts are tracked in _seen_posts dict."""
-        self.adapter._seen_posts["test_post"] = time.time()
-        assert "test_post" in self.adapter._seen_posts
+        """Posts are tracked in the dedup cache."""
+        self.adapter._dedup._seen["test_post"] = time.time()
+        assert "test_post" in self.adapter._dedup._seen


 # ---------------------------------------------------------------------------
@@ -18,6 +18,8 @@ def test_set_session_env_sets_contextvars(monkeypatch):
        chat_id="-1001",
        chat_name="Group",
        chat_type="group",
+        user_id="123456",
+        user_name="alice",
        thread_id="17585",
    )
    context = SessionContext(source=source, connected_platforms=[], home_channels={})
@@ -25,6 +27,8 @@ def test_set_session_env_sets_contextvars(monkeypatch):
    monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
    monkeypatch.delenv("HERMES_SESSION_CHAT_ID", raising=False)
    monkeypatch.delenv("HERMES_SESSION_CHAT_NAME", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_USER_ID", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_USER_NAME", raising=False)
    monkeypatch.delenv("HERMES_SESSION_THREAD_ID", raising=False)

    tokens = runner._set_session_env(context)
@@ -33,6 +37,8 @@ def test_set_session_env_sets_contextvars(monkeypatch):
    assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
    assert get_session_env("HERMES_SESSION_CHAT_ID") == "-1001"
    assert get_session_env("HERMES_SESSION_CHAT_NAME") == "Group"
+    assert get_session_env("HERMES_SESSION_USER_ID") == "123456"
+    assert get_session_env("HERMES_SESSION_USER_NAME") == "alice"
    assert get_session_env("HERMES_SESSION_THREAD_ID") == "17585"

    # os.environ should NOT be touched
@@ -50,6 +56,8 @@ def test_clear_session_env_restores_previous_state(monkeypatch):
    monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
    monkeypatch.delenv("HERMES_SESSION_CHAT_ID", raising=False)
    monkeypatch.delenv("HERMES_SESSION_CHAT_NAME", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_USER_ID", raising=False)
+    monkeypatch.delenv("HERMES_SESSION_USER_NAME", raising=False)
    monkeypatch.delenv("HERMES_SESSION_THREAD_ID", raising=False)

    source = SessionSource(
@@ -57,12 +65,15 @@ def test_clear_session_env_restores_previous_state(monkeypatch):
        chat_id="-1001",
        chat_name="Group",
        chat_type="group",
+        user_id="123456",
+        user_name="alice",
        thread_id="17585",
    )
    context = SessionContext(source=source, connected_platforms=[], home_channels={})

    tokens = runner._set_session_env(context)
    assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
+    assert get_session_env("HERMES_SESSION_USER_ID") == "123456"

    runner._clear_session_env(tokens)

@@ -70,6 +81,8 @@ def test_clear_session_env_restores_previous_state(monkeypatch):
    assert get_session_env("HERMES_SESSION_PLATFORM") == ""
    assert get_session_env("HERMES_SESSION_CHAT_ID") == ""
    assert get_session_env("HERMES_SESSION_CHAT_NAME") == ""
+    assert get_session_env("HERMES_SESSION_USER_ID") == ""
+    assert get_session_env("HERMES_SESSION_USER_NAME") == ""
    assert get_session_env("HERMES_SESSION_THREAD_ID") == ""


@@ -114,16 +114,16 @@ class TestSignalAdapterInit:

 class TestSignalHelpers:
    def test_redact_phone_long(self):
-        from gateway.platforms.signal import _redact_phone
-        assert _redact_phone("+15551234567") == "+155****4567"
+        from gateway.platforms.helpers import redact_phone
+        assert redact_phone("+15551234567") == "+155****4567"

    def test_redact_phone_short(self):
-        from gateway.platforms.signal import _redact_phone
-        assert _redact_phone("+12345") == "+1****45"
+        from gateway.platforms.helpers import redact_phone
+        assert redact_phone("+12345") == "+1****45"

    def test_redact_phone_empty(self):
-        from gateway.platforms.signal import _redact_phone
-        assert _redact_phone("") == "<none>"
+        from gateway.platforms.helpers import redact_phone
+        assert redact_phone("") == "<none>"

    def test_parse_comma_list(self):
        from gateway.platforms.signal import _parse_comma_list
@@ -43,6 +43,8 @@ def _no_auto_discovery(monkeypatch):
    async def _noop():
        return []
    monkeypatch.setattr("gateway.platforms.telegram.discover_fallback_ips", _noop)
+    # Mock HTTPXRequest so the builder chain doesn't fail
+    monkeypatch.setattr("gateway.platforms.telegram.HTTPXRequest", lambda **kwargs: MagicMock())


 @pytest.mark.asyncio
@@ -57,9 +59,9 @@ async def test_connect_rejects_same_host_token_lock(monkeypatch):
    ok = await adapter.connect()

    assert ok is False
-    assert adapter.fatal_error_code == "telegram_token_lock"
+    assert adapter.fatal_error_code == "telegram-bot-token_lock"
    assert adapter.has_fatal_error is True
-    assert "already using this Telegram bot token" in adapter.fatal_error_message
+    assert "already in use" in adapter.fatal_error_message


 @pytest.mark.asyncio
@@ -98,6 +100,8 @@ async def test_polling_conflict_retries_before_fatal(monkeypatch):
    )
    builder = MagicMock()
    builder.token.return_value = builder
+    builder.request.return_value = builder
+    builder.get_updates_request.return_value = builder
    builder.build.return_value = app
    monkeypatch.setattr("gateway.platforms.telegram.Application", SimpleNamespace(builder=MagicMock(return_value=builder)))

@@ -172,6 +176,8 @@ async def test_polling_conflict_becomes_fatal_after_retries(monkeypatch):
    )
    builder = MagicMock()
    builder.token.return_value = builder
+    builder.request.return_value = builder
+    builder.get_updates_request.return_value = builder
    builder.build.return_value = app
    monkeypatch.setattr("gateway.platforms.telegram.Application", SimpleNamespace(builder=MagicMock(return_value=builder)))

@@ -216,6 +222,8 @@ async def test_connect_marks_retryable_fatal_error_for_startup_network_failure(m

    builder = MagicMock()
    builder.token.return_value = builder
+    builder.request.return_value = builder
+    builder.get_updates_request.return_value = builder
    app = SimpleNamespace(
        bot=SimpleNamespace(delete_webhook=AsyncMock(), set_my_commands=AsyncMock()),
        updater=SimpleNamespace(),
@@ -265,6 +273,8 @@ async def test_connect_clears_webhook_before_polling(monkeypatch):
    )
    builder = MagicMock()
    builder.token.return_value = builder
+    builder.request.return_value = builder
+    builder.get_updates_request.return_value = builder
    builder.build.return_value = app
    monkeypatch.setattr(
        "gateway.platforms.telegram.Application",
@@ -1,12 +1,14 @@
 """Tests for the Weixin platform adapter."""

 import asyncio
+import json
 import os
 from unittest.mock import AsyncMock, patch

 from gateway.config import PlatformConfig
 from gateway.config import GatewayConfig, HomeChannel, Platform, _apply_env_overrides
-from gateway.platforms.weixin import WeixinAdapter
+from gateway.platforms import weixin
+from gateway.platforms.weixin import ContextTokenStore, WeixinAdapter
 from tools.send_message_tool import _parse_target_ref, _send_to_platform


@@ -62,15 +64,15 @@ class TestWeixinFormatting:


 class TestWeixinChunking:
-    def test_split_text_sends_top_level_newlines_as_separate_messages(self):
+    def test_split_text_keeps_short_multiline_message_in_single_chunk(self):
        adapter = _make_adapter()

        content = adapter.format_message("第一行\n第二行\n第三行")
        chunks = adapter._split_text(content)

-        assert chunks == ["第一行", "第二行", "第三行"]
+        assert chunks == ["第一行\n第二行\n第三行"]

-    def test_split_text_keeps_indented_followup_with_previous_line(self):
+    def test_split_text_keeps_short_reformatted_table_in_single_chunk(self):
        adapter = _make_adapter()

        content = adapter.format_message(
@@ -81,10 +83,7 @@ class TestWeixinChunking:
        )
        chunks = adapter._split_text(content)

-        assert chunks == [
-            "- Setting: Timeout\n  Value: 30s",
-            "- Setting: Retries\n  Value: 3",
-        ]
+        assert chunks == [content]

    def test_split_text_keeps_complete_code_block_together_when_possible(self):
        adapter = _make_adapter()
@@ -114,6 +113,23 @@ class TestWeixinChunking:
        assert all(len(chunk) <= adapter.MAX_MESSAGE_LENGTH for chunk in chunks)
        assert all(chunk.count("```") >= 2 for chunk in chunks)

+    def test_split_text_can_restore_legacy_multiline_splitting_via_config(self):
+        adapter = WeixinAdapter(
+            PlatformConfig(
+                enabled=True,
+                extra={
+                    "account_id": "acct",
+                    "token": "***",
+                    "split_multiline_messages": True,
+                },
+            )
+        )
+
+        content = adapter.format_message("第一行\n第二行\n第三行")
+        chunks = adapter._split_text(content)
+
+        assert chunks == ["第一行", "第二行", "第三行"]
+

 class TestWeixinConfig:
    def test_apply_env_overrides_configures_weixin(self):
@@ -127,6 +143,7 @@ class TestWeixinConfig:
                "WEIXIN_BASE_URL": "https://ilink.example.com/",
                "WEIXIN_CDN_BASE_URL": "https://cdn.example.com/c2c/",
                "WEIXIN_DM_POLICY": "allowlist",
+                "WEIXIN_SPLIT_MULTILINE_MESSAGES": "true",
                "WEIXIN_ALLOWED_USERS": "wxid_1,wxid_2",
                "WEIXIN_HOME_CHANNEL": "wxid_1",
                "WEIXIN_HOME_CHANNEL_NAME": "Primary DM",
@@ -142,6 +159,7 @@ class TestWeixinConfig:
        assert platform_config.extra["base_url"] == "https://ilink.example.com"
        assert platform_config.extra["cdn_base_url"] == "https://cdn.example.com/c2c"
        assert platform_config.extra["dm_policy"] == "allowlist"
+        assert platform_config.extra["split_multiline_messages"] == "true"
        assert platform_config.extra["allow_from"] == "wxid_1,wxid_2"
        assert platform_config.home_channel == HomeChannel(Platform.WEIXIN, "wxid_1", "Primary DM")

@@ -171,6 +189,70 @@ class TestWeixinConfig:
        assert config.get_connected_platforms() == []


+class TestWeixinStatePersistence:
+    def test_save_weixin_account_preserves_existing_file_on_replace_failure(self, tmp_path, monkeypatch):
+        account_path = tmp_path / "weixin" / "accounts" / "acct.json"
+        account_path.parent.mkdir(parents=True, exist_ok=True)
+        original = {"token": "old-token", "base_url": "https://old.example.com"}
+        account_path.write_text(json.dumps(original), encoding="utf-8")
+
+        def _boom(_src, _dst):
+            raise OSError("disk full")
+
+        monkeypatch.setattr("utils.os.replace", _boom)
+
+        try:
+            weixin.save_weixin_account(
+                str(tmp_path),
+                account_id="acct",
+                token="new-token",
+                base_url="https://new.example.com",
+                user_id="wxid_new",
+            )
+        except OSError:
+            pass
+        else:
+            raise AssertionError("expected save_weixin_account to propagate replace failure")
+
+        assert json.loads(account_path.read_text(encoding="utf-8")) == original
+
+    def test_context_token_persist_preserves_existing_file_on_replace_failure(self, tmp_path, monkeypatch):
+        token_path = tmp_path / "weixin" / "accounts" / "acct.context-tokens.json"
+        token_path.parent.mkdir(parents=True, exist_ok=True)
+        token_path.write_text(json.dumps({"user-a": "old-token"}), encoding="utf-8")
+
+        def _boom(_src, _dst):
+            raise OSError("disk full")
+
+        monkeypatch.setattr("utils.os.replace", _boom)
+
+        store = ContextTokenStore(str(tmp_path))
+        with patch.object(weixin.logger, "warning") as warning_mock:
+            store.set("acct", "user-b", "new-token")
+
+        assert json.loads(token_path.read_text(encoding="utf-8")) == {"user-a": "old-token"}
+        warning_mock.assert_called_once()
+
+    def test_save_sync_buf_preserves_existing_file_on_replace_failure(self, tmp_path, monkeypatch):
+        sync_path = tmp_path / "weixin" / "accounts" / "acct.sync.json"
+        sync_path.parent.mkdir(parents=True, exist_ok=True)
+        sync_path.write_text(json.dumps({"get_updates_buf": "old-sync"}), encoding="utf-8")
+
+        def _boom(_src, _dst):
+            raise OSError("disk full")
+
+        monkeypatch.setattr("utils.os.replace", _boom)
+
+        try:
+            weixin._save_sync_buf(str(tmp_path), "acct", "new-sync")
+        except OSError:
+            pass
+        else:
+            raise AssertionError("expected _save_sync_buf to propagate replace failure")
+
+        assert json.loads(sync_path.read_text(encoding="utf-8")) == {"get_updates_buf": "old-sync"}
+
+
 class TestWeixinSendMessageIntegration:
    def test_parse_target_ref_accepts_weixin_ids(self):
        assert _parse_target_ref("weixin", "wxid_test123") == ("wxid_test123", None, True)
@@ -201,6 +283,55 @@ class TestWeixinSendMessageIntegration:
        )


+class TestWeixinChunkDelivery:
+    def _connected_adapter(self) -> WeixinAdapter:
+        adapter = _make_adapter()
+        adapter._session = object()
+        adapter._token = "test-token"
+        adapter._base_url = "https://weixin.example.com"
+        adapter._token_store.get = lambda account_id, chat_id: "ctx-token"
+        return adapter
+
+    @patch("gateway.platforms.weixin.asyncio.sleep", new_callable=AsyncMock)
+    @patch("gateway.platforms.weixin._send_message", new_callable=AsyncMock)
+    def test_send_waits_between_multiple_chunks(self, send_message_mock, sleep_mock):
+        adapter = self._connected_adapter()
+        adapter.MAX_MESSAGE_LENGTH = 12
+
+        # Use double newlines so _pack_markdown_blocks splits into 3 blocks
+        result = asyncio.run(adapter.send("wxid_test123", "first\n\nsecond\n\nthird"))
+
+        assert result.success is True
+        assert send_message_mock.await_count == 3
+        assert sleep_mock.await_count == 2
+
+    @patch("gateway.platforms.weixin.asyncio.sleep", new_callable=AsyncMock)
+    @patch("gateway.platforms.weixin._send_message", new_callable=AsyncMock)
+    def test_send_retries_failed_chunk_before_continuing(self, send_message_mock, sleep_mock):
+        adapter = self._connected_adapter()
+        adapter.MAX_MESSAGE_LENGTH = 12
+        calls = {"count": 0}
+
+        async def flaky_send(*args, **kwargs):
+            calls["count"] += 1
+            if calls["count"] == 2:
+                raise RuntimeError("temporary iLink failure")
+
+        send_message_mock.side_effect = flaky_send
+
+        # Use double newlines so _pack_markdown_blocks splits into 3 blocks
+        result = asyncio.run(adapter.send("wxid_test123", "first\n\nsecond\n\nthird"))
+
+        assert result.success is True
+        # 3 chunks, but chunk 2 fails once and retries → 4 _send_message calls total
+        assert send_message_mock.await_count == 4
+        # The retried chunk should reuse the same client_id for deduplication
+        first_try = send_message_mock.await_args_list[1].kwargs
+        retry = send_message_mock.await_args_list[2].kwargs
+        assert first_try["text"] == retry["text"]
+        assert first_try["client_id"] == retry["client_id"]
+
+
 class TestWeixinRemoteMediaSafety:
    def test_download_remote_media_blocks_unsafe_urls(self):
        adapter = _make_adapter()
@@ -289,12 +289,16 @@ class TestCmdMigrate:
            skill_conflict="skip", yes=False,
        )

+        mock_stdin = MagicMock()
+        mock_stdin.isatty.return_value = True
+
        with (
            patch.object(claw_mod, "_find_migration_script", return_value=tmp_path / "s.py"),
            patch.object(claw_mod, "_load_migration_module", return_value=fake_mod),
            patch.object(claw_mod, "get_config_path", return_value=config_path),
            patch.object(claw_mod, "prompt_yes_no", return_value=True),
            patch.object(claw_mod, "_offer_source_archival"),
+            patch("sys.stdin", mock_stdin),
        ):
            claw_mod._cmd_migrate(args)

@@ -377,6 +381,16 @@ class TestCmdMigrate:
        config_path = tmp_path / "config.yaml"
        config_path.write_text("")

+        # Preview must succeed before the confirmation prompt is shown
+        fake_mod = ModuleType("openclaw_to_hermes")
+        fake_mod.resolve_selected_options = MagicMock(return_value=set())
+        fake_migrator = MagicMock()
+        fake_migrator.migrate.return_value = {
+            "summary": {"migrated": 1, "skipped": 0, "conflict": 0, "error": 0},
+            "items": [{"kind": "soul", "status": "migrated", "source": "s", "destination": "d", "reason": ""}],
+        }
+        fake_mod.Migrator = MagicMock(return_value=fake_migrator)
+
        args = Namespace(
            source=str(openclaw_dir),
            dry_run=False, preset="full", overwrite=False,
@@ -384,9 +398,15 @@ class TestCmdMigrate:
            skill_conflict="skip", yes=False,
        )

+        mock_stdin = MagicMock()
+        mock_stdin.isatty.return_value = True
+
        with (
            patch.object(claw_mod, "_find_migration_script", return_value=tmp_path / "s.py"),
+            patch.object(claw_mod, "_load_migration_module", return_value=fake_mod),
+            patch.object(claw_mod, "get_config_path", return_value=config_path),
            patch.object(claw_mod, "prompt_yes_no", return_value=False),
+            patch("sys.stdin", mock_stdin),
        ):
            claw_mod._cmd_migrate(args)

@@ -448,7 +468,7 @@ class TestCmdMigrate:
            claw_mod._cmd_migrate(args)

        captured = capsys.readouterr()
-        assert "Migration failed" in captured.out
+        assert "Could not load migration script" in captured.out

    def test_full_preset_enables_secrets(self, tmp_path, capsys):
        """The 'full' preset should set migrate_secrets=True automatically."""
@@ -511,7 +531,13 @@ class TestOfferSourceArchival:
        source = tmp_path / ".openclaw"
        source.mkdir()

-        with patch.object(claw_mod, "prompt_yes_no", return_value=False):
+        mock_stdin = MagicMock()
+        mock_stdin.isatty.return_value = True
+
+        with (
+            patch.object(claw_mod, "prompt_yes_no", return_value=False),
+            patch("sys.stdin", mock_stdin),
+        ):
            claw_mod._offer_source_archival(source, auto_yes=False)

        captured = capsys.readouterr()
@@ -597,10 +623,14 @@ class TestCmdCleanup:
        openclaw = tmp_path / ".openclaw"
        openclaw.mkdir()

+        mock_stdin = MagicMock()
+        mock_stdin.isatty.return_value = True
+
        args = Namespace(source=None, dry_run=False, yes=False)
        with (
            patch.object(claw_mod, "_find_openclaw_dirs", return_value=[openclaw]),
            patch.object(claw_mod, "prompt_yes_no", return_value=False),
+            patch("sys.stdin", mock_stdin),
        ):
            claw_mod._cmd_cleanup(args)

@@ -0,0 +1,327 @@
+"""Tests for Xiaomi MiMo provider support."""
+
+import os
+import sys
+import types
+
+import pytest
+
+# Ensure dotenv doesn't interfere
+if "dotenv" not in sys.modules:
+    fake_dotenv = types.ModuleType("dotenv")
+    fake_dotenv.load_dotenv = lambda *args, **kwargs: None
+    sys.modules["dotenv"] = fake_dotenv
+
+from hermes_cli.auth import (
+    PROVIDER_REGISTRY,
+    resolve_provider,
+    get_api_key_provider_status,
+    resolve_api_key_provider_credentials,
+    AuthError,
+)
+
+
+# =============================================================================
+# Provider Registry
+# =============================================================================
+
+
+class TestXiaomiProviderRegistry:
+    """Verify Xiaomi is registered correctly in the PROVIDER_REGISTRY."""
+
+    def test_registered(self):
+        assert "xiaomi" in PROVIDER_REGISTRY
+
+    def test_name(self):
+        assert PROVIDER_REGISTRY["xiaomi"].name == "Xiaomi MiMo"
+
+    def test_auth_type(self):
+        assert PROVIDER_REGISTRY["xiaomi"].auth_type == "api_key"
+
+    def test_inference_base_url(self):
+        assert PROVIDER_REGISTRY["xiaomi"].inference_base_url == "https://api.xiaomimimo.com/v1"
+
+    def test_api_key_env_vars(self):
+        assert PROVIDER_REGISTRY["xiaomi"].api_key_env_vars == ("XIAOMI_API_KEY",)
+
+    def test_base_url_env_var(self):
+        assert PROVIDER_REGISTRY["xiaomi"].base_url_env_var == "XIAOMI_BASE_URL"
+
+
+# =============================================================================
+# Aliases
+# =============================================================================
+
+
+class TestXiaomiAliases:
+    """All aliases should resolve to 'xiaomi'."""
+
+    @pytest.mark.parametrize("alias", [
+        "xiaomi", "mimo", "xiaomi-mimo",
+    ])
+    def test_alias_resolves(self, alias, monkeypatch):
+        # Reset XIAOMI_API_KEY so the test fully controls its value
+        for key in ("XIAOMI_API_KEY",):
+            monkeypatch.delenv(key, raising=False)
+        monkeypatch.setenv("XIAOMI_API_KEY", "sk-test-key-12345678")
+        assert resolve_provider(alias) == "xiaomi"
+
+    def test_normalize_provider_models_py(self):
+        from hermes_cli.models import normalize_provider
+        assert normalize_provider("mimo") == "xiaomi"
+        assert normalize_provider("xiaomi-mimo") == "xiaomi"
+
+    def test_normalize_provider_providers_py(self):
+        from hermes_cli.providers import normalize_provider
+        assert normalize_provider("mimo") == "xiaomi"
+        assert normalize_provider("xiaomi-mimo") == "xiaomi"
+
+
+# =============================================================================
+# Auto-detection
+# =============================================================================
+
+
+class TestXiaomiAutoDetection:
+    """Setting XIAOMI_API_KEY should auto-detect the provider."""
+
+    def test_auto_detect(self, monkeypatch):
+        # Clear all other provider env vars
+        for var in ("OPENROUTER_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
+                     "DEEPSEEK_API_KEY", "GOOGLE_API_KEY", "GEMINI_API_KEY",
+                     "DASHSCOPE_API_KEY", "XAI_API_KEY", "KIMI_API_KEY",
+                     "MINIMAX_API_KEY", "AI_GATEWAY_API_KEY", "KILOCODE_API_KEY",
+                     "HF_TOKEN", "GLM_API_KEY", "COPILOT_GITHUB_TOKEN",
+                     "GH_TOKEN", "GITHUB_TOKEN", "MINIMAX_CN_API_KEY"):
+            monkeypatch.delenv(var, raising=False)
+        monkeypatch.setenv("XIAOMI_API_KEY", "sk-xiaomi-test-12345678")
+        provider = resolve_provider("auto")
+        assert provider == "xiaomi"
+
+
+# =============================================================================
+# Credentials
+# =============================================================================
+
+
+class TestXiaomiCredentials:
+    """Test credential resolution for the xiaomi provider."""
+
+    def test_status_configured(self, monkeypatch):
+        monkeypatch.setenv("XIAOMI_API_KEY", "sk-test-12345678")
+        status = get_api_key_provider_status("xiaomi")
+        assert status["configured"]
+
+    def test_status_not_configured(self, monkeypatch):
+        monkeypatch.delenv("XIAOMI_API_KEY", raising=False)
+        status = get_api_key_provider_status("xiaomi")
+        assert not status["configured"]
+
+    def test_resolve_credentials(self, monkeypatch):
+        monkeypatch.setenv("XIAOMI_API_KEY", "sk-test-12345678")
+        monkeypatch.delenv("XIAOMI_BASE_URL", raising=False)
+        creds = resolve_api_key_provider_credentials("xiaomi")
+        assert creds["api_key"] == "sk-test-12345678"
+        assert creds["base_url"] == "https://api.xiaomimimo.com/v1"
+
+    def test_custom_base_url_override(self, monkeypatch):
+        monkeypatch.setenv("XIAOMI_API_KEY", "sk-test-12345678")
+        monkeypatch.setenv("XIAOMI_BASE_URL", "https://custom.xiaomi.example/v1")
+        creds = resolve_api_key_provider_credentials("xiaomi")
+        assert creds["base_url"] == "https://custom.xiaomi.example/v1"
+
+
+# =============================================================================
+# Model catalog (dynamic via models.dev, with a static fallback list)
+# =============================================================================
+
+
+class TestXiaomiModelCatalog:
+    """Xiaomi uses dynamic model discovery via models.dev."""
+
+    def test_models_dev_mapping(self):
+        from agent.models_dev import PROVIDER_TO_MODELS_DEV
+        assert PROVIDER_TO_MODELS_DEV["xiaomi"] == "xiaomi"
+
+    def test_static_model_list_fallback(self):
+        """Static _PROVIDER_MODELS fallback must exist for model picker."""
+        from hermes_cli.models import _PROVIDER_MODELS
+        assert "xiaomi" in _PROVIDER_MODELS
+        models = _PROVIDER_MODELS["xiaomi"]
+        assert "mimo-v2-pro" in models
+        assert "mimo-v2-omni" in models
+        assert "mimo-v2-flash" in models
+
+    def test_list_agentic_models_mock(self, monkeypatch):
+        """When models.dev returns Xiaomi data, list_agentic_models should return models."""
+        from agent import models_dev as md
+
+        fake_data = {
+            "xiaomi": {
+                "name": "Xiaomi",
+                "api": "https://api.xiaomimimo.com/v1",
+                "env": ["XIAOMI_API_KEY"],
+                "models": {
+                    "mimo-v2-pro": {
+                        "limit": {"context": 1000000},
+                        "tool_call": True,
+                    },
+                    "mimo-v2-omni": {
+                        "limit": {"context": 256000},
+                        "tool_call": True,
+                    },
+                    "mimo-v2-flash": {
+                        "limit": {"context": 256000},
+                        "tool_call": True,
+                    },
+                },
+            }
+        }
+        monkeypatch.setattr(md, "fetch_models_dev", lambda: fake_data)
+
+        result = md.list_agentic_models("xiaomi")
+        assert "mimo-v2-pro" in result
+        assert "mimo-v2-flash" in result
+
+
+# =============================================================================
+# Normalization
+# =============================================================================
+
+
+class TestXiaomiNormalization:
+    """Model name normalization — Xiaomi is a direct provider."""
+
+    def test_vendor_prefix_mapping(self):
+        from hermes_cli.model_normalize import _VENDOR_PREFIXES
+        assert _VENDOR_PREFIXES.get("mimo") == "xiaomi"
+
+    def test_matching_prefix_strip(self):
+        """xiaomi/mimo-v2-pro should normalize to mimo-v2-pro for direct API."""
+        from hermes_cli.model_normalize import _MATCHING_PREFIX_STRIP_PROVIDERS
+        assert "xiaomi" in _MATCHING_PREFIX_STRIP_PROVIDERS
+
+    def test_normalize_strips_provider_prefix(self):
+        from hermes_cli.model_normalize import normalize_model_for_provider
+        result = normalize_model_for_provider("xiaomi/mimo-v2-pro", "xiaomi")
+        assert result == "mimo-v2-pro"
+
+    def test_normalize_bare_name_unchanged(self):
+        from hermes_cli.model_normalize import normalize_model_for_provider
+        result = normalize_model_for_provider("mimo-v2-pro", "xiaomi")
+        assert result == "mimo-v2-pro"
+
+
+# =============================================================================
+# URL mapping
+# =============================================================================
+
+
+class TestXiaomiURLMapping:
+    """Test URL → provider inference for Xiaomi endpoints."""
+
+    def test_url_to_provider(self):
+        from agent.model_metadata import _URL_TO_PROVIDER
+        assert _URL_TO_PROVIDER.get("api.xiaomimimo.com") == "xiaomi"
+
+    def test_provider_prefixes(self):
+        from agent.model_metadata import _PROVIDER_PREFIXES
+        assert "xiaomi" in _PROVIDER_PREFIXES
+        assert "mimo" in _PROVIDER_PREFIXES
+        assert "xiaomi-mimo" in _PROVIDER_PREFIXES
+
+    def test_infer_from_url(self):
+        from agent.model_metadata import _infer_provider_from_url
+        assert _infer_provider_from_url("https://api.xiaomimimo.com/v1") == "xiaomi"
+
+    def test_infer_from_regional_urls(self):
+        """Regional token-plan endpoints should also resolve to xiaomi."""
+        from agent.model_metadata import _infer_provider_from_url
+        assert _infer_provider_from_url("https://token-plan-ams.xiaomimimo.com/v1") == "xiaomi"
+        assert _infer_provider_from_url("https://token-plan-cn.xiaomimimo.com/v1") == "xiaomi"
+        assert _infer_provider_from_url("https://token-plan-sgp.xiaomimimo.com/v1") == "xiaomi"
+
+
+# =============================================================================
+# providers.py
+# =============================================================================
+
+
+class TestXiaomiProvidersModule:
+    """Test Xiaomi in the unified providers module."""
+
+    def test_overlay_exists(self):
+        from hermes_cli.providers import HERMES_OVERLAYS
+        assert "xiaomi" in HERMES_OVERLAYS
+        overlay = HERMES_OVERLAYS["xiaomi"]
+        assert overlay.transport == "openai_chat"
+        assert overlay.base_url_env_var == "XIAOMI_BASE_URL"
+        assert not overlay.is_aggregator
+
+    def test_alias_resolves(self):
+        from hermes_cli.providers import normalize_provider
+        assert normalize_provider("mimo") == "xiaomi"
+        assert normalize_provider("xiaomi-mimo") == "xiaomi"
+
+    def test_label(self):
+        from hermes_cli.providers import get_label
+        assert get_label("xiaomi") == "Xiaomi MiMo"
+
+    def test_get_provider(self):
+        pdef = None
+        try:
+            from hermes_cli.providers import get_provider
+            pdef = get_provider("xiaomi")
+        except Exception:
+            pass
+        if pdef is not None:
+            assert pdef.id == "xiaomi"
+            assert pdef.transport == "openai_chat"
+
+
+# =============================================================================
+# Auxiliary client
+# =============================================================================
+
+
+class TestXiaomiAuxiliary:
+    """Xiaomi auxiliary routing: vision → omni, non-vision → user's main model, never flash."""
+
+    def test_no_flash_in_aux_models(self):
+        """mimo-v2-flash must NEVER be used for automatic aux routing."""
+        from agent.auxiliary_client import _API_KEY_PROVIDER_AUX_MODELS
+        assert "xiaomi" not in _API_KEY_PROVIDER_AUX_MODELS
+
+    def test_vision_model_override(self):
+        """Xiaomi vision tasks should use mimo-v2-omni (multimodal), not the main model."""
+        from agent.auxiliary_client import _PROVIDER_VISION_MODELS
+        assert "xiaomi" in _PROVIDER_VISION_MODELS
+        assert _PROVIDER_VISION_MODELS["xiaomi"] == "mimo-v2-omni"
+
+
+# =============================================================================
+# Doctor and agent init (no SyntaxError, correct api_mode)
+# =============================================================================
+
+
+class TestXiaomiDoctor:
+    """Verify hermes doctor recognizes Xiaomi env vars."""
+
+    def test_provider_env_hints(self):
+        from hermes_cli.doctor import _PROVIDER_ENV_HINTS
+        assert "XIAOMI_API_KEY" in _PROVIDER_ENV_HINTS
+
+
+class TestXiaomiAgentInit:
+    """Verify the agent can be constructed with xiaomi provider without errors."""
+
+    def test_no_syntax_errors(self):
+        """Importing run_agent after the Xiaomi changes should not raise (e.g. SyntaxError)."""
+        import importlib
+        importlib.import_module("run_agent")
+
+    def test_api_mode_is_chat_completions(self):
+        from hermes_cli.providers import HERMES_OVERLAYS, TRANSPORT_TO_API_MODE
+        overlay = HERMES_OVERLAYS["xiaomi"]
+        api_mode = TRANSPORT_TO_API_MODE[overlay.transport]
+        assert api_mode == "chat_completions"
@@ -0,0 +1,279 @@
+"""Tests for _check_compression_model_feasibility() — warns when the
+auxiliary compression model's context is smaller than the main model's
+compression threshold.
+
+Two-phase design:
+  1. __init__  → runs the check, prints via _vprint (CLI), stores warning
+  2. run_conversation (first call) → replays stored warning through
+     status_callback (gateway platforms)
+"""
+
+from unittest.mock import MagicMock, patch
+
+from run_agent import AIAgent
+from agent.context_compressor import ContextCompressor
+
+
+def _make_agent(
+    *,
+    compression_enabled: bool = True,
+    threshold_percent: float = 0.50,
+    main_context: int = 200_000,
+) -> AIAgent:
+    """Build a minimal AIAgent with a compressor, skipping __init__."""
+    agent = AIAgent.__new__(AIAgent)
+    agent.model = "test-main-model"
+    agent.provider = "openrouter"
+    agent.base_url = "https://openrouter.ai/api/v1"
+    agent.api_key = "sk-test"
+    agent.quiet_mode = True
+    agent.log_prefix = ""
+    agent.compression_enabled = compression_enabled
+    agent._print_fn = None
+    agent.suppress_status_output = False
+    agent._stream_consumers = []
+    agent._executing_tools = False
+    agent._mute_post_response = False
+    agent.status_callback = None
+    agent.tool_progress_callback = None
+    agent._compression_warning = None
+
+    compressor = MagicMock(spec=ContextCompressor)
+    compressor.context_length = main_context
+    compressor.threshold_tokens = int(main_context * threshold_percent)
+    agent.context_compressor = compressor
+
+    return agent
+
+
+# ── Core warning logic ──────────────────────────────────────────────
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=32_768)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_warns_when_aux_context_below_threshold(mock_get_client, mock_ctx_len):
+    """Warning emitted when aux model context < main model threshold."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    # threshold = 100,000 — aux has only 32,768
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "google/gemini-3-flash-preview")
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    agent._check_compression_model_feasibility()
+
+    assert len(messages) == 1
+    assert "Compression model" in messages[0]
+    assert "32,768" in messages[0]
+    assert "100,000" in messages[0]
+    assert "will not be possible" in messages[0]
+    # Actionable fix guidance included
+    assert "Fix options" in messages[0]
+    assert "auxiliary:" in messages[0]
+    assert "compression:" in messages[0]
+    assert "threshold:" in messages[0]
+    # Warning stored for gateway replay
+    assert agent._compression_warning is not None
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=200_000)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_no_warning_when_aux_context_sufficient(mock_get_client, mock_ctx_len):
+    """No warning when aux model context >= main model threshold."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    # threshold = 100,000 — aux has 200,000 (sufficient)
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "google/gemini-2.5-flash")
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    agent._check_compression_model_feasibility()
+
+    assert len(messages) == 0
+    assert agent._compression_warning is None
+
+
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_warns_when_no_auxiliary_provider(mock_get_client):
+    """Warning emitted when no auxiliary provider is configured."""
+    agent = _make_agent()
+    mock_get_client.return_value = (None, None)
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    agent._check_compression_model_feasibility()
+
+    assert len(messages) == 1
+    assert "No auxiliary LLM provider" in messages[0]
+    assert agent._compression_warning is not None
+
+
+def test_skips_check_when_compression_disabled():
+    """No check performed when compression is disabled."""
+    agent = _make_agent(compression_enabled=False)
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    agent._check_compression_model_feasibility()
+
+    assert len(messages) == 0
+    assert agent._compression_warning is None
+
+
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_exception_does_not_crash(mock_get_client):
+    """Exceptions in the check are caught — never blocks startup."""
+    agent = _make_agent()
+    mock_get_client.side_effect = RuntimeError("boom")
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    # Should not raise
+    agent._check_compression_model_feasibility()
+
+    # No user-facing message (error is debug-logged)
+    assert len(messages) == 0
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=100_000)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_exact_threshold_boundary_no_warning(mock_get_client, mock_ctx_len):
+    """No warning when aux context exactly equals the threshold."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "test-model")
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    agent._check_compression_model_feasibility()
+
+    assert len(messages) == 0
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=99_999)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_just_below_threshold_warns(mock_get_client, mock_ctx_len):
+    """Warning fires when aux context is one token below the threshold."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "small-model")
+
+    messages = []
+    agent._emit_status = lambda msg: messages.append(msg)
+
+    agent._check_compression_model_feasibility()
+
+    assert len(messages) == 1
+    assert "small-model" in messages[0]
+
+
+# ── Two-phase: __init__ + run_conversation replay ───────────────────
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=32_768)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_warning_stored_for_gateway_replay(mock_get_client, mock_ctx_len):
+    """The check stores the warning; _replay_compression_warning() sends it via status_callback."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "google/gemini-3-flash-preview")
+
+    # Phase 1: __init__ — _emit_status prints (CLI) but callback is None
+    vprint_messages = []
+    agent._emit_status = lambda msg: vprint_messages.append(msg)
+    agent._check_compression_model_feasibility()
+
+    assert len(vprint_messages) == 1  # CLI got it
+    assert agent._compression_warning is not None  # stored for replay
+
+    # Phase 2: gateway wires callback post-init, then run_conversation replays
+    callback_events = []
+    agent.status_callback = lambda ev, msg: callback_events.append((ev, msg))
+    agent._replay_compression_warning()
+
+    assert any(
+        ev == "lifecycle" and "will not be possible" in msg
+        for ev, msg in callback_events
+    )
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=200_000)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_no_replay_when_no_warning(mock_get_client, mock_ctx_len):
+    """_replay_compression_warning is a no-op when there's no stored warning."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "big-model")
+
+    agent._emit_status = lambda msg: None
+    agent._check_compression_model_feasibility()
+
+    assert agent._compression_warning is None
+
+    callback_events = []
+    agent.status_callback = lambda ev, msg: callback_events.append((ev, msg))
+    agent._replay_compression_warning()
+
+    assert len(callback_events) == 0
+
+
+def test_replay_without_callback_is_noop():
+    """_replay_compression_warning doesn't crash when status_callback is None."""
+    agent = _make_agent()
+    agent._compression_warning = "some warning"
+    agent.status_callback = None
+
+    # Should not raise
+    agent._replay_compression_warning()
+
+
+@patch("agent.model_metadata.get_model_context_length", return_value=32_768)
+@patch("agent.auxiliary_client.get_text_auxiliary_client")
+def test_run_conversation_clears_warning_after_replay(mock_get_client, mock_ctx_len):
+    """After replay in run_conversation, _compression_warning is cleared
+    so the warning is not sent again on subsequent turns."""
+    agent = _make_agent(main_context=200_000, threshold_percent=0.50)
+    mock_client = MagicMock()
+    mock_client.base_url = "https://openrouter.ai/api/v1"
+    mock_client.api_key = "sk-aux"
+    mock_get_client.return_value = (mock_client, "small-model")
+
+    agent._emit_status = lambda msg: None
+    agent._check_compression_model_feasibility()
+
+    assert agent._compression_warning is not None
+
+    # Simulate what run_conversation does
+    callback_events = []
+    agent.status_callback = lambda ev, msg: callback_events.append((ev, msg))
+    if agent._compression_warning:
+        agent._replay_compression_warning()
+        agent._compression_warning = None  # as in run_conversation
+
+    assert len(callback_events) == 1
+
+    # Second turn — nothing replayed
+    callback_events.clear()
+    if agent._compression_warning:
+        agent._replay_compression_warning()
+        agent._compression_warning = None
+
+    assert len(callback_events) == 0
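The two-phase design exercised by these tests can be sketched in isolation. This is a minimal illustration, not the `AIAgent` implementation: the class `WarningReplayer` and its method names are hypothetical, and the clearing step that the tests attribute to `run_conversation` is folded into `replay_once` for brevity.

```python
from typing import Callable, Optional


class WarningReplayer:
    """Hypothetical stand-in for the store-then-replay behaviour."""

    def __init__(self) -> None:
        # The gateway wires this callback only after construction.
        self.status_callback: Optional[Callable[[str, str], None]] = None
        self._pending_warning: Optional[str] = None

    def check(self, aux_context: int, threshold_tokens: int) -> None:
        # Phase 1: runs before any callback exists, so the warning is stored.
        if aux_context < threshold_tokens:
            self._pending_warning = (
                f"Compression model context {aux_context:,} is below "
                f"the threshold {threshold_tokens:,}"
            )

    def replay_once(self) -> None:
        # Phase 2: deliver the stored warning if a callback is wired,
        # then clear it so later turns stay silent.
        if self._pending_warning and self.status_callback:
            self.status_callback("lifecycle", self._pending_warning)
        self._pending_warning = None
```

Calling `replay_once()` a second time delivers nothing, which mirrors the "second turn" assertion in the last test above.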
@@ -22,23 +22,22 @@ class TestInterruptPropagationToChild(unittest.TestCase):
    def tearDown(self):
        set_interrupt(False)

+    def _make_bare_agent(self):
+        """Create a bare AIAgent via __new__ with all interrupt-related attrs."""
+        from run_agent import AIAgent
+        agent = AIAgent.__new__(AIAgent)
+        agent._interrupt_requested = False
+        agent._interrupt_message = None
+        agent._execution_thread_id = None  # defaults to current thread in set_interrupt
+        agent._active_children = []
+        agent._active_children_lock = threading.Lock()
+        agent.quiet_mode = True
+        return agent
+
    def test_parent_interrupt_sets_child_flag(self):
        """When parent.interrupt() is called, child._interrupt_requested should be set."""
-        from run_agent import AIAgent
-
-        parent = AIAgent.__new__(AIAgent)
-        parent._interrupt_requested = False
-        parent._interrupt_message = None
-        parent._active_children = []
-        parent._active_children_lock = threading.Lock()
-        parent.quiet_mode = True
-
-        child = AIAgent.__new__(AIAgent)
-        child._interrupt_requested = False
-        child._interrupt_message = None
-        child._active_children = []
-        child._active_children_lock = threading.Lock()
-        child.quiet_mode = True
+        parent = self._make_bare_agent()
+        child = self._make_bare_agent()

        parent._active_children.append(child)

@@ -49,40 +48,26 @@ class TestInterruptPropagationToChild(unittest.TestCase):
        assert child._interrupt_message == "new user message"
        assert is_interrupted() is True

-    def test_child_clear_interrupt_at_start_clears_global(self):
-        """child.clear_interrupt() at start of run_conversation clears the GLOBAL event.
-        
-        This is the intended behavior at startup, but verify it doesn't
-        accidentally clear an interrupt intended for a running child.
+    def test_child_clear_interrupt_at_start_clears_thread(self):
+        """child.clear_interrupt() at start of run_conversation clears the
+        per-thread interrupt flag for the current thread.
        """
-        from run_agent import AIAgent
-
-        child = AIAgent.__new__(AIAgent)
+        child = self._make_bare_agent()
        child._interrupt_requested = True
        child._interrupt_message = "msg"
-        child.quiet_mode = True
-        child._active_children = []
-        child._active_children_lock = threading.Lock()

-        # Global is set
+        # Interrupt for current thread is set
        set_interrupt(True)
        assert is_interrupted() is True

-        # child.clear_interrupt() clears both
+        # child.clear_interrupt() clears both instance flag and thread flag
        child.clear_interrupt()
        assert child._interrupt_requested is False
        assert is_interrupted() is False

    def test_interrupt_during_child_api_call_detected(self):
        """Interrupt set during _interruptible_api_call is detected within 0.5s."""
-        from run_agent import AIAgent
-
-        child = AIAgent.__new__(AIAgent)
-        child._interrupt_requested = False
-        child._interrupt_message = None
-        child._active_children = []
-        child._active_children_lock = threading.Lock()
-        child.quiet_mode = True
+        child = self._make_bare_agent()
        child.api_mode = "chat_completions"
        child.log_prefix = ""
        child._client_kwargs = {"api_key": "test", "base_url": "http://localhost:1234"}
@@ -117,21 +102,8 @@ class TestInterruptPropagationToChild(unittest.TestCase):

    def test_concurrent_interrupt_propagation(self):
        """Simulates exact CLI flow: parent runs delegate in thread, main thread interrupts."""
-        from run_agent import AIAgent
-
-        parent = AIAgent.__new__(AIAgent)
-        parent._interrupt_requested = False
-        parent._interrupt_message = None
-        parent._active_children = []
-        parent._active_children_lock = threading.Lock()
-        parent.quiet_mode = True
-
-        child = AIAgent.__new__(AIAgent)
-        child._interrupt_requested = False
-        child._interrupt_message = None
-        child._active_children = []
-        child._active_children_lock = threading.Lock()
-        child.quiet_mode = True
+        parent = self._make_bare_agent()
+        child = self._make_bare_agent()

        # Register child (simulating what _run_single_child does)
        parent._active_children.append(child)
@@ -157,5 +129,79 @@ class TestInterruptPropagationToChild(unittest.TestCase):
        set_interrupt(False)


+class TestPerThreadInterruptIsolation(unittest.TestCase):
+    """Verify that interrupting one agent does NOT affect another agent's thread.
+
+    This is the core fix for the gateway cross-session interrupt leak:
+    multiple agents run in separate threads within the same process, and
+    interrupting agent A must not kill agent B's running tools.
+    """
+
+    def setUp(self):
+        set_interrupt(False)
+
+    def tearDown(self):
+        set_interrupt(False)
+
+    def test_interrupt_only_affects_target_thread(self):
+        """set_interrupt(True, tid) only makes is_interrupted() True on that thread."""
+        results = {}
+        barrier = threading.Barrier(2)
+
+        def thread_a():
+            """Agent A's execution thread — will be interrupted."""
+            tid = threading.current_thread().ident
+            results["a_tid"] = tid
+            barrier.wait(timeout=5)  # sync with thread B
+            time.sleep(0.2)  # let the interrupt arrive
+            results["a_interrupted"] = is_interrupted()
+
+        def thread_b():
+            """Agent B's execution thread — should NOT be affected."""
+            tid = threading.current_thread().ident
+            results["b_tid"] = tid
+            barrier.wait(timeout=5)  # sync with thread A
+            time.sleep(0.2)
+            results["b_interrupted"] = is_interrupted()
+
+        ta = threading.Thread(target=thread_a)
+        tb = threading.Thread(target=thread_b)
+        ta.start()
+        tb.start()
+
+        # Wait for both threads to register their TIDs
+        time.sleep(0.05)
+        while "a_tid" not in results or "b_tid" not in results:
+            time.sleep(0.01)
+
+        # Interrupt ONLY thread A (simulates gateway interrupting agent A)
+        set_interrupt(True, results["a_tid"])
+
+        ta.join(timeout=3)
+        tb.join(timeout=3)
+        set_interrupt(False, results["a_tid"])  # cleanup: don't leave a stale flag for a reused thread ID
+        assert results["a_interrupted"] is True, "Thread A should see the interrupt"
+        assert results["b_interrupted"] is False, "Thread B must NOT see thread A's interrupt"
+
+    def test_clear_interrupt_only_clears_target_thread(self):
+        """Clearing one thread's interrupt doesn't clear another's."""
+        tid_a = 99990001
+        tid_b = 99990002
+        set_interrupt(True, tid_a)
+        set_interrupt(True, tid_b)
+
+        # Clear only A
+        set_interrupt(False, tid_a)
+
+        # Inspect the registry directly: only thread B's flag should remain
+        from tools.interrupt import _interrupted_threads, _lock
+        with _lock:
+            assert tid_a not in _interrupted_threads
+            assert tid_b in _interrupted_threads
+
+        # Cleanup
+        set_interrupt(False, tid_b)
+
+
 if __name__ == "__main__":
    unittest.main()
@@ -2087,8 +2087,9 @@ class TestRunConversation:
        assert "Thinking Budget Exhausted" in result["final_response"]
        assert "/thinkon" in result["final_response"]

-    def test_length_empty_content_detected_as_thinking_exhausted(self, agent):
-        """When finish_reason='length' and content is None/empty, detect exhaustion."""
+    def test_length_empty_content_without_think_tags_retries_normally(self, agent):
+        """When finish_reason='length' and content is None but no think tags,
+        fall through to normal continuation retry (not thinking-exhaustion)."""
        self._setup_agent(agent)
        resp = _mock_response(content=None, finish_reason="length")
        agent.client.chat.completions.create.return_value = resp
@@ -2100,12 +2101,10 @@ class TestRunConversation:
        ):
            result = agent.run_conversation("hello")

+        # Without think tags, the agent should attempt continuation retries
+        # (up to 3), not immediately fire thinking-exhaustion.
+        assert result["api_calls"] == 3
        assert result["completed"] is False
-        assert result["api_calls"] == 1
-        assert "reasoning" in result["error"].lower()
-        # User-friendly message is returned
-        assert result["final_response"] is not None
-        assert "Thinking Budget Exhausted" in result["final_response"]

    def test_length_with_tool_calls_returns_partial_without_executing_tools(self, agent):
        self._setup_agent(agent)
@@ -2169,6 +2168,35 @@ class TestRunConversation:
        mock_hfc.assert_called_once()
        assert result["final_response"] == "Done!"

+    def test_truncated_tool_args_detected_when_finish_reason_not_length(self, agent):
+        """When a router rewrites finish_reason from 'length' to 'tool_calls',
+        truncated JSON arguments should still be detected and refused rather
+        than wasting 3 retry attempts."""
+        self._setup_agent(agent)
+        agent.valid_tool_names.add("write_file")
+        bad_tc = _mock_tool_call(
+            name="write_file",
+            arguments='{"path":"report.md","content":"partial',
+            call_id="c1",
+        )
+        resp = _mock_response(
+            content="", finish_reason="tool_calls", tool_calls=[bad_tc],
+        )
+        agent.client.chat.completions.create.return_value = resp
+
+        with (
+            patch("run_agent.handle_function_call") as mock_handle_function_call,
+            patch.object(agent, "_persist_session"),
+            patch.object(agent, "_save_trajectory"),
+            patch.object(agent, "_cleanup_task_resources"),
+        ):
+            result = agent.run_conversation("write the report")
+
+        assert result["completed"] is False
+        assert result["partial"] is True
+        assert "truncated due to output length limit" in result["error"]
+        mock_handle_function_call.assert_not_called()
+

 class TestRetryExhaustion:
    """Regression: retry_count > max_retries was dead code (off-by-one).
@@ -222,6 +222,12 @@ def test_api_mode_normalizes_provider_case(monkeypatch):


 def test_api_mode_respects_explicit_openrouter_provider_over_codex_url(monkeypatch):
+    """GPT-5.x models need codex_responses even on OpenRouter.
+
+    OpenRouter rejects GPT-5 models on /v1/chat/completions with
+    ``unsupported_api_for_model``.  The model-level check overrides
+    the provider default.
+    """
    _patch_agent_bootstrap(monkeypatch)
    agent = run_agent.AIAgent(
        model="gpt-5-codex",
@@ -233,7 +239,7 @@ def test_api_mode_respects_explicit_openrouter_provider_over_codex_url(monkeypat
        skip_context_files=True,
        skip_memory=True,
    )
-    assert agent.api_mode == "chat_completions"
+    assert agent.api_mode == "codex_responses"
    assert agent.provider == "openrouter"


@@ -0,0 +1,156 @@
+"""Tests for _reap_orphaned_browser_sessions() — kills orphaned agent-browser
+daemons whose Python parent exited without cleaning up."""
+
+import os
+import signal
+from unittest.mock import patch
+
+import pytest
+
+
+@pytest.fixture
+def fake_tmpdir(tmp_path):
+    """Patch _socket_safe_tmpdir to return a temp dir we control."""
+    with patch("tools.browser_tool._socket_safe_tmpdir", return_value=str(tmp_path)):
+        yield tmp_path
+
+
+@pytest.fixture(autouse=True)
+def _isolate_sessions():
+    """Ensure _active_sessions is empty for each test."""
+    import tools.browser_tool as bt
+    orig = bt._active_sessions.copy()
+    bt._active_sessions.clear()
+    yield
+    bt._active_sessions.clear()
+    bt._active_sessions.update(orig)
+
+
+def _make_socket_dir(tmpdir, session_name, pid=None):
+    """Create a fake agent-browser socket directory with optional PID file."""
+    d = tmpdir / f"agent-browser-{session_name}"
+    d.mkdir()
+    if pid is not None:
+        (d / f"{session_name}.pid").write_text(str(pid))
+    return d
+
+
+class TestReapOrphanedBrowserSessions:
+    """Tests for the orphan reaper function."""
+
+    def test_no_socket_dirs_is_noop(self, fake_tmpdir):
+        """No socket dirs => nothing happens, no errors."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+        _reap_orphaned_browser_sessions()  # should not raise
+
+    def test_stale_dir_without_pid_file_is_removed(self, fake_tmpdir):
+        """Socket dir with no PID file is cleaned up."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+        d = _make_socket_dir(fake_tmpdir, "h_abc1234567")
+        assert d.exists()
+        _reap_orphaned_browser_sessions()
+        assert not d.exists()
+
+    def test_stale_dir_with_dead_pid_is_removed(self, fake_tmpdir):
+        """Socket dir whose daemon PID is dead gets cleaned up."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+        d = _make_socket_dir(fake_tmpdir, "h_dead123456", pid=999999999)
+        assert d.exists()
+        _reap_orphaned_browser_sessions()
+        assert not d.exists()
+
+    def test_orphaned_alive_daemon_is_killed(self, fake_tmpdir):
+        """Alive daemon not tracked by _active_sessions gets SIGTERM."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+
+        d = _make_socket_dir(fake_tmpdir, "h_orphan12345", pid=12345)
+
+        kill_calls = []
+        original_kill = os.kill
+
+        def mock_kill(pid, sig):
+            kill_calls.append((pid, sig))
+            if sig == 0:
+                return  # pretend process exists
+            # Don't actually kill anything
+
+        with patch("os.kill", side_effect=mock_kill):
+            _reap_orphaned_browser_sessions()
+
+        # Should have checked existence (sig 0) then killed (SIGTERM)
+        assert (12345, 0) in kill_calls
+        assert (12345, signal.SIGTERM) in kill_calls
+
+    def test_tracked_session_is_not_reaped(self, fake_tmpdir):
+        """Sessions tracked in _active_sessions are left alone."""
+        import tools.browser_tool as bt
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+
+        session_name = "h_tracked1234"
+        d = _make_socket_dir(fake_tmpdir, session_name, pid=12345)
+
+        # Register the session as actively tracked
+        bt._active_sessions["some_task"] = {"session_name": session_name}
+
+        kill_calls = []
+
+        def mock_kill(pid, sig):
+            kill_calls.append((pid, sig))
+
+        with patch("os.kill", side_effect=mock_kill):
+            _reap_orphaned_browser_sessions()
+
+        # Should NOT have tried to kill anything
+        assert len(kill_calls) == 0
+        # Dir should still exist
+        assert d.exists()
+
+    def test_permission_error_on_kill_check_skips(self, fake_tmpdir):
+        """If we can't check the PID (PermissionError), skip it."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+
+        d = _make_socket_dir(fake_tmpdir, "h_perm1234567", pid=12345)
+
+        def mock_kill(pid, sig):
+            if sig == 0:
+                raise PermissionError("not our process")
+
+        with patch("os.kill", side_effect=mock_kill):
+            _reap_orphaned_browser_sessions()
+
+        # Dir should still exist (we didn't touch someone else's process)
+        assert d.exists()
+
+    def test_cdp_sessions_are_also_reaped(self, fake_tmpdir):
+        """CDP sessions (cdp_ prefix) are also scanned."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+
+        d = _make_socket_dir(fake_tmpdir, "cdp_abc1234567")
+        assert d.exists()
+        _reap_orphaned_browser_sessions()
+        # No PID file → cleaned up
+        assert not d.exists()
+
+    def test_non_hermes_dirs_are_ignored(self, fake_tmpdir):
+        """Socket dirs that don't match our naming pattern are left alone."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+
+        # Create a dir that doesn't match h_* or cdp_* pattern
+        d = fake_tmpdir / "agent-browser-other_session"
+        d.mkdir()
+        (d / "other_session.pid").write_text("12345")
+
+        _reap_orphaned_browser_sessions()
+
+        # Should NOT be touched
+        assert d.exists()
+
+    def test_corrupt_pid_file_is_cleaned(self, fake_tmpdir):
+        """PID file with non-integer content is cleaned up."""
+        from tools.browser_tool import _reap_orphaned_browser_sessions
+
+        d = _make_socket_dir(fake_tmpdir, "h_corrupt1234")
+        (d / "h_corrupt1234.pid").write_text("not-a-number")
+
+        _reap_orphaned_browser_sessions()
+        assert not d.exists()
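The behaviour these tests pin down can be summarised with a small sketch. This is not the code in `tools.browser_tool`: the function `reap_orphans`, its parameters, and the injectable `kill` hook are hypothetical, and removing the directory after sending SIGTERM is an assumption the tests above do not confirm.

```python
import os
import shutil
import signal
from pathlib import Path
from typing import Callable, Iterable

PREFIX = "agent-browser-"


def reap_orphans(
    tmpdir: Path,
    tracked: Iterable[str] = (),
    kill: Callable[[int, int], None] = os.kill,
) -> list:
    """Return the session names whose socket dirs were removed."""
    tracked_names = set(tracked)
    removed = []
    for d in sorted(tmpdir.glob(PREFIX + "*")):
        name = d.name[len(PREFIX):]
        if not d.is_dir() or not name.startswith(("h_", "cdp_")):
            continue  # not one of our sessions
        if name in tracked_names:
            continue  # still owned by a live session
        try:
            pid = int((d / f"{name}.pid").read_text().strip())
            if pid <= 0:
                raise ValueError("non-positive PID")
        except (OSError, ValueError):
            pid = None  # missing or corrupt PID file: nothing to signal
        if pid is not None:
            try:
                kill(pid, 0)  # signal 0 only probes for existence
            except ProcessLookupError:
                pass  # daemon already gone
            except PermissionError:
                continue  # someone else's process: leave it alone
            else:
                try:
                    kill(pid, signal.SIGTERM)
                except OSError:
                    pass  # process exited between probe and kill
        shutil.rmtree(d, ignore_errors=True)
        removed.append(name)
    return removed
```

Rejecting non-positive PIDs matters because `os.kill(0, sig)` would signal the caller's whole process group. Passing `kill` as a parameter lets a test observe the probe-then-terminate sequence without touching real processes.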
@@ -780,14 +780,18 @@ class TestLoadConfig(unittest.TestCase):
 @unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
 class TestInterruptHandling(unittest.TestCase):
    def test_interrupt_event_stops_execution(self):
-        """When _interrupt_event is set, execute_code should stop the script."""
+        """When interrupt is set for the execution thread, execute_code should stop."""
        code = "import time; time.sleep(60); print('should not reach')"
+        from tools.interrupt import set_interrupt
+
+        # Capture the main thread ID so we can target the interrupt correctly.
+        # execute_code runs in the current thread; set_interrupt needs its ID.
+        main_tid = threading.current_thread().ident

        def set_interrupt_after_delay():
            import time as _t
            _t.sleep(1)
-            from tools.terminal_tool import _interrupt_event
-            _interrupt_event.set()
+            set_interrupt(True, main_tid)

        t = threading.Thread(target=set_interrupt_after_delay, daemon=True)
        t.start()
@@ -804,8 +808,7 @@ class TestInterruptHandling(unittest.TestCase):
            self.assertEqual(result["status"], "interrupted")
            self.assertIn("interrupted", result["output"])
        finally:
-            from tools.terminal_tool import _interrupt_event
-            _interrupt_event.clear()
+            set_interrupt(False, main_tid)
            t.join(timeout=3)


@@ -227,6 +227,8 @@ class TestCheckpointNotify:
            "session_key": "sk1",
            "watcher_platform": "telegram",
            "watcher_chat_id": "123",
+            "watcher_user_id": "u123",
+            "watcher_user_name": "alice",
            "watcher_thread_id": "42",
            "watcher_interval": 5,
            "notify_on_complete": True,
@@ -236,6 +238,8 @@ class TestCheckpointNotify:
            assert recovered == 1
            assert len(registry.pending_watchers) == 1
            assert registry.pending_watchers[0]["notify_on_complete"] is True
+            assert registry.pending_watchers[0]["user_id"] == "u123"
+            assert registry.pending_watchers[0]["user_name"] == "alice"

    def test_recover_defaults_false(self, registry, tmp_path):
        """Old checkpoint entries without the field default to False."""
@@ -438,6 +438,8 @@ class TestCheckpoint:
            s = _make_session()
            s.watcher_platform = "telegram"
            s.watcher_chat_id = "999"
+            s.watcher_user_id = "u123"
+            s.watcher_user_name = "alice"
            s.watcher_thread_id = "42"
            s.watcher_interval = 60
            registry._running[s.id] = s
@@ -447,6 +449,8 @@ class TestCheckpoint:
            assert len(data) == 1
            assert data[0]["watcher_platform"] == "telegram"
            assert data[0]["watcher_chat_id"] == "999"
+            assert data[0]["watcher_user_id"] == "u123"
+            assert data[0]["watcher_user_name"] == "alice"
            assert data[0]["watcher_thread_id"] == "42"
            assert data[0]["watcher_interval"] == 60

@@ -460,6 +464,8 @@ class TestCheckpoint:
            "session_key": "sk1",
            "watcher_platform": "telegram",
            "watcher_chat_id": "123",
+            "watcher_user_id": "u123",
+            "watcher_user_name": "alice",
            "watcher_thread_id": "42",
            "watcher_interval": 60,
        }]))
@@ -471,6 +477,8 @@ class TestCheckpoint:
            assert w["session_id"] == "proc_live"
            assert w["platform"] == "telegram"
            assert w["chat_id"] == "123"
+            assert w["user_id"] == "u123"
+            assert w["user_name"] == "alice"
            assert w["thread_id"] == "42"
            assert w["check_interval"] == 60

@@ -348,7 +348,7 @@ word word
            result = _patch_skill("my-skill", "old text", "new text", file_path="references/evil.md")

        assert result["success"] is False
-        assert "boundary" in result["error"].lower()
+        assert "escapes" in result["error"].lower()
        assert outside_file.read_text() == "old text here"


@@ -412,7 +412,7 @@ class TestWriteFile:
            result = _write_file("my-skill", "references/escape/owned.md", "malicious")

        assert result["success"] is False
-        assert "boundary" in result["error"].lower()
+        assert "escapes" in result["error"].lower()
        assert not (outside_dir / "owned.md").exists()


@@ -449,7 +449,7 @@ class TestRemoveFile:
            result = _remove_file("my-skill", "references/escape/keep.txt")

        assert result["success"] is False
-        assert "boundary" in result["error"].lower()
+        assert "escapes" in result["error"].lower()
        assert outside_file.exists()


@@ -15,6 +15,10 @@ from tools.vision_tools import (
    _handle_vision_analyze,
    _determine_mime_type,
    _image_to_base64_data_url,
+    _resize_image_for_vision,
+    _is_image_size_error,
+    _MAX_BASE64_BYTES,
+    _RESIZE_TARGET_BYTES,
    vision_analyze_tool,
    check_vision_requirements,
    get_debug_session_info,
@@ -590,11 +594,13 @@ class TestBase64SizeLimit:

    @pytest.mark.asyncio
    async def test_oversized_image_rejected_before_api_call(self, tmp_path):
-        """Images exceeding 5 MB base64 should fail with a clear size error."""
+        """Images exceeding the 20 MB hard limit should fail with a clear error."""
        img = tmp_path / "huge.png"
        img.write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * (4 * 1024 * 1024))

-        with patch("tools.vision_tools.async_call_llm", new_callable=AsyncMock) as mock_llm:
+        # Patch the hard limit to a small value so the test runs fast.
+        with patch("tools.vision_tools._MAX_BASE64_BYTES", 1000), \
+             patch("tools.vision_tools.async_call_llm", new_callable=AsyncMock) as mock_llm:
            result = json.loads(await vision_analyze_tool(str(img), "describe this"))

        assert result["success"] is False
@@ -686,3 +692,180 @@ class TestVisionRegistration:

        entry = registry._tools.get("vision_analyze")
        assert callable(entry.handler)
+
+
+# ---------------------------------------------------------------------------
+# _resize_image_for_vision — auto-resize oversized images
+# ---------------------------------------------------------------------------
+
+
+class TestResizeImageForVision:
+    """Tests for the auto-resize function."""
+
+    def test_small_image_returned_as_is(self, tmp_path):
+        """Images under the limit should be returned unchanged."""
+        # Create a small 10x10 red PNG
+        try:
+            from PIL import Image
+        except ImportError:
+            pytest.skip("Pillow not installed")
+        img = Image.new("RGB", (10, 10), (255, 0, 0))
+        path = tmp_path / "small.png"
+        img.save(path, "PNG")
+
+        result = _resize_image_for_vision(path, mime_type="image/png")
+        assert result.startswith("data:image/png;base64,")
+        assert len(result) < _MAX_BASE64_BYTES
+
+    def test_large_image_is_resized(self, tmp_path):
+        """Images over the default target should be auto-resized to fit."""
+        try:
+            from PIL import Image
+        except ImportError:
+            pytest.skip("Pillow not installed")
+        # A 4000x4000 RGB image holds ~48 MB of raw pixels, but PNG compresses a
+        # solid color heavily, so this file may stay under the 5 MB resize target.
+        img = Image.new("RGB", (4000, 4000), (128, 200, 50))
+        path = tmp_path / "large.png"
+        img.save(path, "PNG")
+
+        result = _resize_image_for_vision(path, mime_type="image/png")
+        assert result.startswith("data:image/png;base64,")
+        # Default target is _RESIZE_TARGET_BYTES (5 MB), not _MAX_BASE64_BYTES (20 MB)
+        assert len(result) <= _RESIZE_TARGET_BYTES
+
+    def test_custom_max_bytes(self, tmp_path):
+        """The max_base64_bytes parameter should be respected."""
+        try:
+            from PIL import Image
+        except ImportError:
+            pytest.skip("Pillow not installed")
+        img = Image.new("RGB", (200, 200), (0, 128, 255))
+        path = tmp_path / "medium.png"
+        img.save(path, "PNG")
+
+        # Set a very low limit to force resizing
+        result = _resize_image_for_vision(path, max_base64_bytes=500)
+        # Should still return a valid data URL
+        assert result.startswith("data:image/")
+
+    def test_jpeg_output_for_non_png(self, tmp_path):
+        """Non-PNG images should be resized as JPEG."""
+        try:
+            from PIL import Image
+        except ImportError:
+            pytest.skip("Pillow not installed")
+        img = Image.new("RGB", (2000, 2000), (255, 128, 0))
+        path = tmp_path / "photo.jpg"
+        img.save(path, "JPEG", quality=95)
+
+        result = _resize_image_for_vision(path, mime_type="image/jpeg",
+                                           max_base64_bytes=50_000)
+        assert result.startswith("data:image/jpeg;base64,")
+
+    def test_constants_sane(self):
+        """Hard limit should be larger than resize target."""
+        assert _MAX_BASE64_BYTES == 20 * 1024 * 1024
+        assert _RESIZE_TARGET_BYTES == 5 * 1024 * 1024
+        assert _MAX_BASE64_BYTES > _RESIZE_TARGET_BYTES
+
+    def test_extreme_aspect_ratio_preserved(self, tmp_path):
+        """Extreme aspect ratios should be preserved during resize."""
+        try:
+            from PIL import Image
+        except ImportError:
+            pytest.skip("Pillow not installed")
+        # Very wide panorama: 8000x200
+        img = Image.new("RGB", (8000, 200), (100, 150, 200))
+        path = tmp_path / "panorama.png"
+        img.save(path, "PNG")
+
+        result = _resize_image_for_vision(path, mime_type="image/png",
+                                           max_base64_bytes=50_000)
+        assert result.startswith("data:image/")
+        # Decode and check aspect ratio is roughly preserved
+        import base64
+        header, b64data = result.split(",", 1)
+        raw = base64.b64decode(b64data)
+        from io import BytesIO
+        resized = Image.open(BytesIO(raw))
+        original_ratio = 8000 / 200  # 40:1
+        resized_ratio = resized.width / resized.height if resized.height > 0 else 0
+        # Allow some tolerance (floor clamping), but ratio should stay above 10:1
+        # With independent halving, ratio would collapse to ~1:1. Proportional
+        # scaling should keep it well above 10.
+        assert resized_ratio > 10, (
+            f"Aspect ratio collapsed: {resized.width}x{resized.height} "
+            f"(ratio {resized_ratio:.1f}, expected >10)"
+        )
+
+    def test_tall_narrow_image_preserved(self, tmp_path):
+        """Tall narrow images should also preserve aspect ratio."""
+        try:
+            from PIL import Image
+        except ImportError:
+            pytest.skip("Pillow not installed")
+        # Very tall: 200x6000
+        img = Image.new("RGB", (200, 6000), (200, 100, 50))
+        path = tmp_path / "tall.png"
+        img.save(path, "PNG")
+
+        result = _resize_image_for_vision(path, mime_type="image/png",
+                                           max_base64_bytes=50_000)
+        assert result.startswith("data:image/")
+        import base64
+        from io import BytesIO
+        header, b64data = result.split(",", 1)
+        raw = base64.b64decode(b64data)
+        resized = Image.open(BytesIO(raw))
+        original_ratio = 6000 / 200  # 30:1 (h/w)
+        resized_ratio = resized.height / resized.width if resized.width > 0 else 0
+        assert resized_ratio > 5, (
+            f"Aspect ratio collapsed: {resized.width}x{resized.height} "
+            f"(h/w ratio {resized_ratio:.1f}, expected >5)"
+        )
+
+    def test_no_pillow_returns_original(self, tmp_path):
+        """Without Pillow, oversized images should be returned as-is."""
+        # Create a dummy file
+        path = tmp_path / "test.png"
+        # Write enough bytes to exceed a tiny limit
+        path.write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 1000)
+
+        with patch("tools.vision_tools._image_to_base64_data_url") as mock_b64:
+            # Simulate a large base64 result
+            mock_b64.return_value = "data:image/png;base64," + "A" * 200
+            with patch.dict("sys.modules", {"PIL": None, "PIL.Image": None}):
+                result = _resize_image_for_vision(path, max_base64_bytes=100)
+                # Should return the original (oversized) data url
+                assert len(result) > 100
+
+
+# ---------------------------------------------------------------------------
+# _is_image_size_error — detect size-related API errors
+# ---------------------------------------------------------------------------
+
+
+class TestIsImageSizeError:
+    """Tests for the size-error detection helper."""
+
+    def test_too_large_message(self):
+        assert _is_image_size_error(Exception("Request payload too large"))
+
+    def test_413_status(self):
+        assert _is_image_size_error(Exception("HTTP 413 Payload Too Large"))
+
+    def test_invalid_request(self):
+        assert _is_image_size_error(Exception("invalid_request_error: image too big"))
+
+    def test_exceeds_limit(self):
+        assert _is_image_size_error(Exception("Image exceeds maximum size"))
+
+    def test_unrelated_error(self):
+        assert not _is_image_size_error(Exception("Connection refused"))
+
+    def test_auth_error(self):
+        assert not _is_image_size_error(Exception("401 Unauthorized"))
+
+    def test_empty_message(self):
+        assert not _is_image_size_error(Exception(""))
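The `TestIsImageSizeError` cases above pin down the behaviour expected of `_is_image_size_error`, but its body is not part of this diff. As a rough sketch only, a substring heuristic like the one below would satisfy those seven cases. The function name `looks_like_image_size_error` and the hint list are hypothetical and are not taken from `tools/vision_tools.py`.

```python
# Hypothetical hint list, chosen only to agree with the test cases in the diff.
_SIZE_ERROR_HINTS = ("too large", "too big", "413", "exceeds", "payload")


def looks_like_image_size_error(exc: Exception) -> bool:
    """Guess whether an API error was caused by an oversized image payload.

    Assumption: the provider mentions the size problem in the exception text.
    """
    message = str(exc).lower()
    return any(hint in message for hint in _SIZE_ERROR_HINTS)


size_related = [
    Exception("Request payload too large"),
    Exception("HTTP 413 Payload Too Large"),
    Exception("invalid_request_error: image too big"),
    Exception("Image exceeds maximum size"),
]
unrelated = [
    Exception("Connection refused"),
    Exception("401 Unauthorized"),
    Exception(""),
]
size_flags = [looks_like_image_size_error(e) for e in size_related]
other_flags = [looks_like_image_size_error(e) for e in unrelated]
```

A substring match is deliberately loose: a bare `"413"` could also appear in an unrelated message. That is probably why the caller in `browser_vision` additionally checks that the payload is larger than `_RESIZE_TARGET_BYTES` before retrying.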
@@ -473,13 +473,104 @@ def _cleanup_inactive_browser_sessions():
            logger.warning("Error cleaning up inactive session %s: %s", task_id, e)


+def _reap_orphaned_browser_sessions():
+    """Scan for orphaned agent-browser daemon processes from previous runs.
+
+    When the Python process that created a browser session exits uncleanly
+    (SIGKILL, crash, gateway restart), the in-memory ``_active_sessions``
+    tracking is lost but the node + Chromium processes keep running.
+
+    This function scans the tmp directory for ``agent-browser-h_*`` and
+    ``agent-browser-cdp_*`` socket dirs left by previous runs, reads the daemon PID
+    files, and kills any daemons still alive but not tracked by this process.
+
+    Called once on cleanup-thread startup — not every 30 seconds — to avoid
+    races with sessions being actively created.
+    """
+    import glob
+
+    tmpdir = _socket_safe_tmpdir()
+    pattern = os.path.join(tmpdir, "agent-browser-h_*")
+    socket_dirs = glob.glob(pattern)
+    # Also pick up CDP sessions
+    socket_dirs += glob.glob(os.path.join(tmpdir, "agent-browser-cdp_*"))
+
+    if not socket_dirs:
+        return
+
+    # Build set of session_names currently tracked by this process
+    with _cleanup_lock:
+        tracked_names = {
+            info.get("session_name")
+            for info in _active_sessions.values()
+            if info.get("session_name")
+        }
+
+    reaped = 0
+    for socket_dir in socket_dirs:
+        dir_name = os.path.basename(socket_dir)
+        # dir_name is "agent-browser-{session_name}"
+        session_name = dir_name.removeprefix("agent-browser-")
+        if not session_name:
+            continue
+
+        # Skip sessions that we are actively tracking
+        if session_name in tracked_names:
+            continue
+
+        pid_file = os.path.join(socket_dir, f"{session_name}.pid")
+        if not os.path.isfile(pid_file):
+            # No PID file — just a stale dir, remove it
+            shutil.rmtree(socket_dir, ignore_errors=True)
+            continue
+
+        try:
+            daemon_pid = int(Path(pid_file).read_text().strip())
+        except (ValueError, OSError):
+            shutil.rmtree(socket_dir, ignore_errors=True)
+            continue
+
+        # Check if the daemon is still alive
+        try:
+            os.kill(daemon_pid, 0)  # signal 0 = existence check
+        except ProcessLookupError:
+            # Already dead, just clean up the dir
+            shutil.rmtree(socket_dir, ignore_errors=True)
+            continue
+        except PermissionError:
+            # Alive but owned by someone else — leave it alone
+            continue
+
+        # Daemon is alive and not tracked — orphan. Kill it.
+        try:
+            os.kill(daemon_pid, signal.SIGTERM)
+            logger.info("Reaped orphaned browser daemon PID %d (session %s)",
+                        daemon_pid, session_name)
+            reaped += 1
+        except (ProcessLookupError, PermissionError, OSError):
+            pass
+
+        # Clean up the socket directory
+        shutil.rmtree(socket_dir, ignore_errors=True)
+
+    if reaped:
+        logger.info("Reaped %d orphaned browser session(s) from previous run(s)", reaped)
+
+
 def _browser_cleanup_thread_worker():
    """
    Background thread that periodically cleans up inactive browser sessions.
    
    Runs every 30 seconds and checks for sessions that haven't been used
    within the BROWSER_SESSION_INACTIVITY_TIMEOUT period.
+    On first run, also reaps orphaned sessions from previous process lifetimes.
    """
+    # One-time orphan reap on startup
+    try:
+        _reap_orphaned_browser_sessions()
+    except Exception as e:
+        logger.warning("Orphan reap error: %s", e)
+
    while _cleanup_running:
        try:
            _cleanup_inactive_browser_sessions()
@@ -1873,10 +1964,10 @@ def browser_vision(question: str, annotate: bool = False, task_id: Optional[str]
                ),
            }, ensure_ascii=False)
        
-        # Read and convert to base64
-        image_data = screenshot_path.read_bytes()
-        image_base64 = base64.b64encode(image_data).decode("ascii")
-        data_url = f"data:image/png;base64,{image_base64}"
+        # Convert screenshot to base64 at full resolution.
+        _screenshot_bytes = screenshot_path.read_bytes()
+        _screenshot_b64 = base64.b64encode(_screenshot_bytes).decode("ascii")
+        data_url = f"data:image/png;base64,{_screenshot_b64}"
        
        vision_prompt = (
            f"You are analyzing a screenshot of a web browser.\n\n"
@@ -1890,7 +1981,7 @@ def browser_vision(question: str, annotate: bool = False, task_id: Optional[str]
        # Use the centralized LLM router
        vision_model = _get_vision_model()
        logger.debug("browser_vision: analysing screenshot (%d bytes)",
-                     len(image_data))
+                     len(_screenshot_bytes))

        # Read vision timeout from config (auxiliary.vision.timeout), default 120s.
        # Local vision models (llama.cpp, ollama) can take well over 30s for
@@ -1922,7 +2013,27 @@ def browser_vision(question: str, annotate: bool = False, task_id: Optional[str]
        }
        if vision_model:
            call_kwargs["model"] = vision_model
-        response = call_llm(**call_kwargs)
+        # Try full-size screenshot; on size-related rejection, downscale and retry.
+        try:
+            response = call_llm(**call_kwargs)
+        except Exception as _api_err:
+            from tools.vision_tools import (
+                _is_image_size_error, _resize_image_for_vision, _RESIZE_TARGET_BYTES,
+            )
+            if (_is_image_size_error(_api_err)
+                    and len(data_url) > _RESIZE_TARGET_BYTES):
+                logger.info(
+                    "Vision API rejected screenshot (%.1f MB); "
+                    "auto-resizing to ~%.0f MB and retrying...",
+                    len(data_url) / (1024 * 1024),
+                    _RESIZE_TARGET_BYTES / (1024 * 1024),
+                )
+                data_url = _resize_image_for_vision(
+                    screenshot_path, mime_type="image/png")
+                call_kwargs["messages"][0]["content"][1]["image_url"]["url"] = data_url
+                response = call_llm(**call_kwargs)
+            else:
+                raise
        
        analysis = (response.choices[0].message.content or "").strip()
        # Redact secrets the vision LLM may have read from the screenshot.
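`_reap_orphaned_browser_sessions` in this file relies on `os.kill(pid, 0)` as a liveness probe before sending `SIGTERM`. The sketch below isolates that probe so its three outcomes are easier to see. The helper name `pid_is_alive` is hypothetical, and the example assumes a POSIX system.

```python
import os
import subprocess
import sys


def pid_is_alive(pid: int) -> bool:
    """Probe a PID with signal 0, which checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process: it already exited and was reaped.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


# Start a short-lived child and wait for it, so its PID is known to be gone.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()

alive_self = pid_is_alive(os.getpid())
alive_child = pid_is_alive(child.pid)
```

The reaper treats the `PermissionError` case as "leave it alone" rather than as an orphan, which matches the `True` returned here: the process is alive, just not ours to kill.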
@@ -924,8 +924,8 @@ def execute_code(

    # --- Local execution path (UDS) --- below this line is unchanged ---

-    # Import interrupt event from terminal_tool (cooperative cancellation)
-    from tools.terminal_tool import _interrupt_event
+    # Import per-thread interrupt check (cooperative cancellation)
+    from tools.interrupt import is_interrupted as _is_interrupted

    # Resolve config
    _cfg = _load_config()
@@ -1114,7 +1114,7 @@ def execute_code(

        status = "success"
        while proc.poll() is None:
-            if _interrupt_event.is_set():
+            if _is_interrupted():
                _kill_process_group(proc)
                status = "interrupted"
                break
@@ -80,20 +80,18 @@ def register_credential_file(

    # Resolve symlinks and normalise ``..`` before the containment check so
    # that traversal like ``../.ssh/id_rsa`` cannot escape HERMES_HOME.
-    try:
-        resolved = host_path.resolve()
-        hermes_home_resolved = hermes_home.resolve()
-        resolved.relative_to(hermes_home_resolved)  # raises ValueError if outside
-    except ValueError:
+    from tools.path_security import validate_within_dir
+
+    containment_error = validate_within_dir(host_path, hermes_home)
+    if containment_error:
        logger.warning(
-            "credential_files: rejected path traversal %r "
-            "(resolves to %s, outside HERMES_HOME %s)",
+            "credential_files: rejected path traversal %r (%s)",
            relative_path,
-            resolved,
-            hermes_home_resolved,
+            containment_error,
        )
        return False

+    resolved = host_path.resolve()
    if not resolved.is_file():
        logger.debug("credential_files: skipping %s (not found)", resolved)
        return False
@@ -142,7 +140,8 @@ def _load_config_files() -> List[Dict[str, str]]:
        cfg = read_raw_config()
        cred_files = cfg.get("terminal", {}).get("credential_files")
        if isinstance(cred_files, list):
-            hermes_home_resolved = hermes_home.resolve()
+            from tools.path_security import validate_within_dir
+
            for item in cred_files:
                if isinstance(item, str) and item.strip():
                    rel = item.strip()
@@ -151,20 +150,19 @@ def _load_config_files() -> List[Dict[str, str]]:
                            "credential_files: rejected absolute config path %r", rel,
                        )
                        continue
-                    host_path = (hermes_home / rel).resolve()
-                    try:
-                        host_path.relative_to(hermes_home_resolved)
-                    except ValueError:
+                    host_path = hermes_home / rel
+                    containment_error = validate_within_dir(host_path, hermes_home)
+                    if containment_error:
                        logger.warning(
-                            "credential_files: rejected config path traversal %r "
-                            "(resolves to %s, outside HERMES_HOME %s)",
-                            rel, host_path, hermes_home_resolved,
+                            "credential_files: rejected config path traversal %r (%s)",
+                            rel, containment_error,
                        )
                        continue
-                    if host_path.is_file():
+                    resolved_path = host_path.resolve()
+                    if resolved_path.is_file():
                        container_path = f"/root/.hermes/{rel}"
                        result.append({
-                            "host_path": str(host_path),
+                            "host_path": str(resolved_path),
                            "container_path": container_path,
                        })
    except Exception as e:
@@ -165,12 +165,12 @@ def _validate_cron_script_path(script: Optional[str]) -> Optional[str]:
        )

    # Validate containment after resolution
+    from tools.path_security import validate_within_dir
+
    scripts_dir = get_hermes_home() / "scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)
-    resolved = (scripts_dir / raw).resolve()
-    try:
-        resolved.relative_to(scripts_dir.resolve())
-    except ValueError:
+    containment_error = validate_within_dir(scripts_dir / raw, scripts_dir)
+    if containment_error:
        return (
            f"Script path escapes the scripts directory via traversal: {raw!r}"
        )
@@ -1,8 +1,12 @@
-"""Shared interrupt signaling for all tools.
+"""Per-thread interrupt signaling for all tools.

-Provides a global threading.Event that any tool can check to determine
-if the user has requested an interrupt. The agent's interrupt() method
-sets this event, and tools poll it during long-running operations.
+Provides thread-scoped interrupt tracking so that interrupting one agent
+session does not kill tools running in other sessions.  This is critical
+in the gateway where multiple agents run concurrently in the same process.
+
+The agent stores its execution thread ID at the start of run_conversation()
+and passes it to set_interrupt()/clear_interrupt().  Tools call
+is_interrupted() which checks the CURRENT thread — no argument needed.

 Usage in tools:
    from tools.interrupt import is_interrupted
@@ -12,17 +16,61 @@ Usage in tools:

 import threading

-_interrupt_event = threading.Event()
+# Set of thread idents that have been interrupted.
+_interrupted_threads: set[int] = set()
+_lock = threading.Lock()


-def set_interrupt(active: bool) -> None:
-    """Called by the agent to signal or clear the interrupt."""
-    if active:
-        _interrupt_event.set()
-    else:
-        _interrupt_event.clear()
+def set_interrupt(active: bool, thread_id: int | None = None) -> None:
+    """Set or clear interrupt for a specific thread.
+
+    Args:
+        active: True to signal interrupt, False to clear it.
+        thread_id: Target thread ident.  When None, targets the
+                   current thread (backward compat for CLI/tests).
+    """
+    tid = thread_id if thread_id is not None else threading.current_thread().ident
+    with _lock:
+        if active:
+            _interrupted_threads.add(tid)
+        else:
+            _interrupted_threads.discard(tid)


 def is_interrupted() -> bool:
-    """Check if an interrupt has been requested. Safe to call from any thread."""
-    return _interrupt_event.is_set()
+    """Check if an interrupt has been requested for the current thread.
+
+    Safe to call from any thread — each thread only sees its own
+    interrupt state.
+    """
+    tid = threading.current_thread().ident
+    with _lock:
+        return tid in _interrupted_threads
+
+
+# ---------------------------------------------------------------------------
+# Backward-compatible _interrupt_event proxy
+# ---------------------------------------------------------------------------
+# Some legacy call sites (code_execution_tool, process_registry, tests)
+# import _interrupt_event directly and call .is_set() / .set() / .clear().
+# This shim maps those calls to the per-thread functions above so existing
+# code keeps working while the underlying mechanism is thread-scoped.
+
+class _ThreadAwareEventProxy:
+    """Drop-in proxy that maps threading.Event methods to per-thread state."""
+
+    def is_set(self) -> bool:
+        return is_interrupted()
+
+    def set(self) -> None:  # noqa: A003
+        set_interrupt(True)
+
+    def clear(self) -> None:
+        set_interrupt(False)
+
+    def wait(self, timeout: float | None = None) -> bool:
+        """Not truly supported — returns current state immediately."""
+        return self.is_set()
+
+
+_interrupt_event = _ThreadAwareEventProxy()
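To illustrate the per-thread behaviour described in the module docstring, here is a minimal, self-contained sketch. It re-implements `set_interrupt` and `is_interrupted` along the lines of the diff (using `threading.get_ident()` in place of `current_thread().ident`) and adds a hypothetical `demo()` driver that is not part of the change.

```python
import threading
from typing import Dict, Optional, Set

_interrupted_threads: Set[int] = set()
_lock = threading.Lock()


def set_interrupt(active: bool, thread_id: Optional[int] = None) -> None:
    """Set or clear the interrupt flag for one thread (default: the caller)."""
    tid = thread_id if thread_id is not None else threading.get_ident()
    with _lock:
        if active:
            _interrupted_threads.add(tid)
        else:
            _interrupted_threads.discard(tid)


def is_interrupted() -> bool:
    """Report whether the calling thread has been interrupted."""
    with _lock:
        return threading.get_ident() in _interrupted_threads


def demo() -> Dict[str, bool]:
    """Hypothetical driver: interrupt worker "a" only and record what each sees."""
    seen: Dict[str, bool] = {}
    go = threading.Event()

    def worker(name: str) -> None:
        go.wait(timeout=5)            # hold until the main thread has set the flag
        seen[name] = is_interrupted()

    a = threading.Thread(target=worker, args=("a",))
    b = threading.Thread(target=worker, args=("b",))
    a.start()
    b.start()
    set_interrupt(True, a.ident)      # ident is available once start() returns
    go.set()
    a.join()
    b.join()
    set_interrupt(False, a.ident)     # clear so a recycled ident starts clean
    return seen


result = demo()
```

Python may recycle thread identifiers after a thread exits, so clearing the flag when a run finishes (as the `finally:` block in the interrupt test does) helps keep a stale entry from affecting a later thread.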
@@ -0,0 +1,43 @@
+"""Shared path validation helpers for tool implementations.
+
+Extracts the ``resolve() + relative_to()`` and ``..`` traversal check
+patterns previously duplicated across skill_manager_tool, skills_tool,
+skills_hub, cronjob_tools, and credential_files.
+"""
+
+import logging
+from pathlib import Path
+from typing import Optional
+
+logger = logging.getLogger(__name__)
+
+
+def validate_within_dir(path: Path, root: Path) -> Optional[str]:
+    """Ensure *path* resolves to a location within *root*.
+
+    Returns an error message string if validation fails, or ``None`` if the
+    path is safe.  Uses ``Path.resolve()`` to follow symlinks and normalize
+    ``..`` components.
+
+    Usage::
+
+        error = validate_within_dir(user_path, allowed_root)
+        if error:
+            return json.dumps({"error": error})
+    """
+    try:
+        resolved = path.resolve()
+        root_resolved = root.resolve()
+        resolved.relative_to(root_resolved)
+    except (ValueError, OSError) as exc:
+        return f"Path escapes allowed directory: {exc}"
+    return None
+
+
+def has_traversal_component(path_str: str) -> bool:
+    """Return True if *path_str* contains ``..`` traversal components.
+
+    Quick check for obvious traversal attempts before doing full resolution.
+    """
+    parts = Path(path_str).parts
+    return ".." in parts
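The call sites touched by this change combine the two helpers in the same order: a cheap lexical `..` check first, then full resolution. The sketch below reproduces both helpers from the new module in condensed form and exercises them against a temporary directory. `check_user_path` is a hypothetical wrapper added only for illustration, and the symlink case assumes a platform where unprivileged symlink creation is allowed.

```python
import tempfile
from pathlib import Path
from typing import Optional


def validate_within_dir(path: Path, root: Path) -> Optional[str]:
    """Return an error string if *path* resolves outside *root*, else None."""
    try:
        path.resolve().relative_to(root.resolve())
    except (ValueError, OSError) as exc:
        return f"Path escapes allowed directory: {exc}"
    return None


def has_traversal_component(path_str: str) -> bool:
    """Return True if *path_str* contains a ``..`` component."""
    return ".." in Path(path_str).parts


def check_user_path(root: Path, user_path: str) -> Optional[str]:
    """Hypothetical caller: lexical check first, then symlink-aware resolution."""
    if has_traversal_component(user_path):
        return "Path traversal ('..') is not allowed."
    return validate_within_dir(root / user_path, root)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "skill"
    root.mkdir()
    outside = Path(tmp) / "outside"
    outside.mkdir()
    # A symlink inside the root that points outside it.
    (root / "link").symlink_to(outside, target_is_directory=True)

    ok = check_user_path(root, "references/notes.md")
    dotdot = check_user_path(root, "../outside.txt")
    via_symlink = check_user_path(root, "link/owned.md")
```

The symlink case is the reason the lexical check alone is not enough: `link/owned.md` contains no `..`, yet it resolves outside the root and is only caught by `validate_within_dir`.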
@@ -85,6 +85,8 @@ class ProcessSession:
    # Watcher/notification metadata (persisted for crash recovery)
    watcher_platform: str = ""
    watcher_chat_id: str = ""
+    watcher_user_id: str = ""
+    watcher_user_name: str = ""
    watcher_thread_id: str = ""
    watcher_interval: int = 0                   # 0 = no watcher configured
    notify_on_complete: bool = False             # Queue agent notification on exit
@@ -684,7 +686,7 @@ class ProcessRegistry:
            and output snapshot.
        """
        from tools.ansi_strip import strip_ansi
-        from tools.terminal_tool import _interrupt_event
+        from tools.interrupt import is_interrupted as _is_interrupted

        try:
            default_timeout = int(os.getenv("TERMINAL_TIMEOUT", "180"))
@@ -721,7 +723,7 @@ class ProcessRegistry:
                    result["timeout_note"] = timeout_note
                return result

-            if _interrupt_event.is_set():
+            if _is_interrupted():
                result = {
                    "status": "interrupted",
                    "output": strip_ansi(session.output_buffer[-1000:]),
@@ -970,6 +972,8 @@ class ProcessRegistry:
                            "session_key": s.session_key,
                            "watcher_platform": s.watcher_platform,
                            "watcher_chat_id": s.watcher_chat_id,
+                            "watcher_user_id": s.watcher_user_id,
+                            "watcher_user_name": s.watcher_user_name,
                            "watcher_thread_id": s.watcher_thread_id,
                            "watcher_interval": s.watcher_interval,
                            "notify_on_complete": s.notify_on_complete,
@@ -1031,6 +1035,8 @@ class ProcessRegistry:
                    detached=True,  # Can't read output, but can report status + kill
                    watcher_platform=entry.get("watcher_platform", ""),
                    watcher_chat_id=entry.get("watcher_chat_id", ""),
+                    watcher_user_id=entry.get("watcher_user_id", ""),
+                    watcher_user_name=entry.get("watcher_user_name", ""),
                    watcher_thread_id=entry.get("watcher_thread_id", ""),
                    watcher_interval=entry.get("watcher_interval", 0),
                    notify_on_complete=entry.get("notify_on_complete", False),
@@ -1049,6 +1055,8 @@ class ProcessRegistry:
                        "session_key": session.session_key,
                        "platform": session.watcher_platform,
                        "chat_id": session.watcher_chat_id,
+                        "user_id": session.watcher_user_id,
+                        "user_name": session.watcher_user_name,
                        "thread_id": session.watcher_thread_id,
                        "notify_on_complete": session.notify_on_complete,
                    })
@@ -219,13 +219,15 @@ def _validate_file_path(file_path: str) -> Optional[str]:
    Validate a file path for write_file/remove_file.
    Must be under an allowed subdirectory and not escape the skill dir.
    """
+    from tools.path_security import has_traversal_component
+
    if not file_path:
        return "file_path is required."

    normalized = Path(file_path)

    # Prevent path traversal
-    if ".." in normalized.parts:
+    if has_traversal_component(file_path):
        return "Path traversal ('..') is not allowed."

    # Must be under an allowed subdirectory
@@ -242,15 +244,12 @@ def _validate_file_path(file_path: str) -> Optional[str]:

 def _resolve_skill_target(skill_dir: Path, file_path: str) -> Tuple[Optional[Path], Optional[str]]:
    """Resolve a supporting-file path and ensure it stays within the skill directory."""
+    from tools.path_security import validate_within_dir
+
    target = skill_dir / file_path
-    try:
-        resolved = target.resolve(strict=False)
-        skill_dir_resolved = skill_dir.resolve()
-        resolved.relative_to(skill_dir_resolved)
-    except ValueError:
-        return None, "Path escapes skill directory boundary."
-    except OSError as e:
-        return None, f"Invalid file path '{file_path}': {e}"
+    error = validate_within_dir(target, skill_dir)
+    if error:
+        return None, error
    return target, None


@@ -447,17 +447,8 @@ def _get_category_from_path(skill_path: Path) -> Optional[str]:
    return None


-def _estimate_tokens(content: str) -> int:
-    """
-    Rough token estimate (4 chars per token average).
-
-    Args:
-        content: Text content
-
-    Returns:
-        Estimated token count
-    """
-    return len(content) // 4
+# Token estimation — use the shared implementation from model_metadata.
+from agent.model_metadata import estimate_tokens_rough as _estimate_tokens


 def _parse_tags(tags_value) -> List[str]:
@@ -947,9 +938,10 @@ def skill_view(name: str, file_path: str = None, task_id: str = None) -> str:

        # If a specific file path is requested, read that instead
        if file_path and skill_dir:
+            from tools.path_security import validate_within_dir, has_traversal_component
+
            # Security: Prevent path traversal attacks
-            normalized_path = Path(file_path)
-            if ".." in normalized_path.parts:
+            if has_traversal_component(file_path):
                return json.dumps(
                    {
                        "success": False,
@@ -962,24 +954,13 @@ def skill_view(name: str, file_path: str = None, task_id: str = None) -> str:
            target_file = skill_dir / file_path

            # Security: Verify resolved path is still within skill directory
-            try:
-                resolved = target_file.resolve()
-                skill_dir_resolved = skill_dir.resolve()
-                if not resolved.is_relative_to(skill_dir_resolved):
-                    return json.dumps(
-                        {
-                            "success": False,
-                            "error": "Path escapes skill directory boundary.",
-                            "hint": "Use a relative path within the skill directory",
-                        },
-                        ensure_ascii=False,
-                    )
-            except (OSError, ValueError):
+            traversal_error = validate_within_dir(target_file, skill_dir)
+            if traversal_error:
                return json.dumps(
                    {
                        "success": False,
-                        "error": f"Invalid file path: '{file_path}'",
-                        "hint": "Use a valid relative path within the skill directory",
+                        "error": traversal_error,
+                        "hint": "Use a relative path within the skill directory",
                    },
                    ensure_ascii=False,
                )
@@ -1427,8 +1427,12 @@ def terminal_tool(
                    if _gw_platform and not check_interval:
                        _gw_chat_id = _gse("HERMES_SESSION_CHAT_ID", "")
                        _gw_thread_id = _gse("HERMES_SESSION_THREAD_ID", "")
+                        _gw_user_id = _gse("HERMES_SESSION_USER_ID", "")
+                        _gw_user_name = _gse("HERMES_SESSION_USER_NAME", "")
                        proc_session.watcher_platform = _gw_platform
                        proc_session.watcher_chat_id = _gw_chat_id
+                        proc_session.watcher_user_id = _gw_user_id
+                        proc_session.watcher_user_name = _gw_user_name
                        proc_session.watcher_thread_id = _gw_thread_id
                        proc_session.watcher_interval = 5
                        process_registry.pending_watchers.append({
@@ -1437,6 +1441,8 @@ def terminal_tool(
                            "session_key": session_key,
                            "platform": _gw_platform,
                            "chat_id": _gw_chat_id,
+                            "user_id": _gw_user_id,
+                            "user_name": _gw_user_name,
                            "thread_id": _gw_thread_id,
                            "notify_on_complete": True,
                        })
@@ -1457,10 +1463,14 @@ def terminal_tool(
                    watcher_platform = _gse2("HERMES_SESSION_PLATFORM", "")
                    watcher_chat_id = _gse2("HERMES_SESSION_CHAT_ID", "")
                    watcher_thread_id = _gse2("HERMES_SESSION_THREAD_ID", "")
+                    watcher_user_id = _gse2("HERMES_SESSION_USER_ID", "")
+                    watcher_user_name = _gse2("HERMES_SESSION_USER_NAME", "")

                    # Store on session for checkpoint persistence
                    proc_session.watcher_platform = watcher_platform
                    proc_session.watcher_chat_id = watcher_chat_id
+                    proc_session.watcher_user_id = watcher_user_id
+                    proc_session.watcher_user_name = watcher_user_name
                    proc_session.watcher_thread_id = watcher_thread_id
                    proc_session.watcher_interval = effective_interval

@@ -1470,6 +1480,8 @@ def terminal_tool(
                        "session_key": session_key,
                        "platform": watcher_platform,
                        "chat_id": watcher_chat_id,
+                        "user_id": watcher_user_id,
+                        "user_name": watcher_user_name,
                        "thread_id": watcher_thread_id,
                    })

@@ -277,6 +277,131 @@ def _image_to_base64_data_url(image_path: Path, mime_type: Optional[str] = None)
    return data_url


+# Hard limit for vision API payloads (20 MB), chosen to match the most
+# restrictive major provider (Gemini inline data limit).  Images above this
+# are auto-resized first and rejected only if they still exceed the limit.
+_MAX_BASE64_BYTES = 20 * 1024 * 1024
+
+# Target size when auto-resizing on API failure (5 MB).  After a provider
+# rejects an image, we downscale to this target and retry once.
+_RESIZE_TARGET_BYTES = 5 * 1024 * 1024
+
+
+def _is_image_size_error(error: Exception) -> bool:
+    """Detect if an API error is related to image or payload size."""
+    err_str = str(error).lower()
+    return any(hint in err_str for hint in (
+        "too large", "payload", "413", "content_too_large",
+        "request_too_large", "image_url", "invalid_request",
+        "exceeds", "size limit",
+    ))
+
+
+def _resize_image_for_vision(image_path: Path, mime_type: Optional[str] = None,
+                              max_base64_bytes: int = _RESIZE_TARGET_BYTES) -> str:
+    """Convert an image to a base64 data URL, auto-resizing if too large.
+
+    Tries Pillow first to progressively downscale oversized images.  If Pillow
+    is not installed or resizing still exceeds the limit, falls back to the raw
+    bytes and lets the caller handle the size check.
+
+    Returns the base64 data URL string.
+    """
+    # Quick file-size estimate: base64 expands by ~4/3, plus data URL header.
+    # Skip the expensive full-read + encode if Pillow can resize directly.
+    file_size = image_path.stat().st_size
+    estimated_b64 = (file_size * 4) // 3 + 100  # ~header overhead
+    if estimated_b64 <= max_base64_bytes:
+        # Small enough — just encode directly.
+        data_url = _image_to_base64_data_url(image_path, mime_type=mime_type)
+        if len(data_url) <= max_base64_bytes:
+            return data_url
+    else:
+        data_url = None  # defer full encode; try Pillow resize first
+
+    # Attempt auto-resize with Pillow (soft dependency)
+    try:
+        from PIL import Image
+        import io as _io
+    except ImportError:
+        logger.info("Pillow not installed — cannot auto-resize oversized image")
+        if data_url is None:
+            data_url = _image_to_base64_data_url(image_path, mime_type=mime_type)
+        return data_url  # caller will raise the size error
+
+    logger.info("Image file is %.1f MB (estimated base64 %.1f MB, limit %.1f MB), auto-resizing...",
+                file_size / (1024 * 1024), estimated_b64 / (1024 * 1024),
+                max_base64_bytes / (1024 * 1024))
+
+    mime = mime_type or _determine_mime_type(image_path)
+    # Choose output format: JPEG for photos (smaller), PNG for transparency
+    pil_format = "PNG" if mime == "image/png" else "JPEG"
+    out_mime = "image/png" if pil_format == "PNG" else "image/jpeg"
+
+    try:
+        img = Image.open(image_path)
+    except Exception as exc:
+        logger.info("Pillow cannot open image for resizing: %s", exc)
+        if data_url is None:
+            data_url = _image_to_base64_data_url(image_path, mime_type=mime_type)
+        return data_url  # fall through to size-check in caller
+    # JPEG cannot store alpha or palette images: convert RGBA/P to RGB
+    if pil_format == "JPEG" and img.mode in ("RGBA", "P"):
+        img = img.convert("RGB")
+
+    # Strategy: attempt 0 re-encodes at the original dimensions; each of the
+    # next 4 attempts halves the dimensions until the base64 output fits.
+    # For JPEG, also try reducing quality at each size step.
+    # For PNG, quality is irrelevant; only dimension reduction helps.
+    quality_steps = (85, 70, 50) if pil_format == "JPEG" else (None,)
+    prev_dims = (img.width, img.height)
+    candidate = None  # will be set on first loop iteration
+
+    for attempt in range(5):
+        if attempt > 0:
+            # Proportional scaling: halve both dimensions, which preserves
+            # the aspect ratio (each side is floored at 64 px).
+            scale = 0.5
+            new_w = max(int(img.width * scale), 64)
+            new_h = max(int(img.height * scale), 64)
+            # Re-derive the scale from whichever dimension hit the floor
+            # so both axes shrink by the same factor.
+            if new_w == 64 and img.width > 0:
+                effective_scale = 64 / img.width
+                new_h = max(int(img.height * effective_scale), 64)
+            elif new_h == 64 and img.height > 0:
+                effective_scale = 64 / img.height
+                new_w = max(int(img.width * effective_scale), 64)
+            # Stop if dimensions can't shrink further
+            if (new_w, new_h) == prev_dims:
+                break
+            img = img.resize((new_w, new_h), Image.LANCZOS)
+            prev_dims = (new_w, new_h)
+            logger.info("Resized to %dx%d (attempt %d)", new_w, new_h, attempt)
+
+        for q in quality_steps:
+            buf = _io.BytesIO()
+            save_kwargs = {"format": pil_format}
+            if q is not None:
+                save_kwargs["quality"] = q
+            img.save(buf, **save_kwargs)
+            encoded = base64.b64encode(buf.getvalue()).decode("ascii")
+            candidate = f"data:{out_mime};base64,{encoded}"
+            if len(candidate) <= max_base64_bytes:
+                logger.info("Auto-resized image fits: %.1f MB (quality=%s, %dx%d)",
+                            len(candidate) / (1024 * 1024), q,
+                            img.width, img.height)
+                return candidate
+
+    # If we still can't get it small enough, return the best attempt
+    # and let the caller decide
+    if candidate is not None:
+        logger.warning("Auto-resize could not fit image under %.1f MB (best: %.1f MB)",
+                       max_base64_bytes / (1024 * 1024), len(candidate) / (1024 * 1024))
+        return candidate
+
+    # Shouldn't reach here, but fall back to full encode
+    return data_url or _image_to_base64_data_url(image_path, mime_type=mime_type)
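
The pre-check in `_resize_image_for_vision` relies on base64 expanding data by about 4/3. A small sketch of that arithmetic follows; the function names `estimate_base64_len` and `exact_base64_len` are illustrative and do not exist in the codebase.

```python
import base64


def estimate_base64_len(raw_len: int, header_overhead: int = 100) -> int:
    """Same estimate as the diff: ~4/3 expansion plus a header allowance."""
    return (raw_len * 4) // 3 + header_overhead


def exact_base64_len(raw_len: int) -> int:
    """Exact length of padded base64 output: 4 chars per 3-byte group."""
    return 4 * ((raw_len + 2) // 3)


if __name__ == "__main__":
    # Compare the measured, exact, and estimated lengths for a few sizes.
    for n in (1, 2, 3, 1000, 5 * 1024 * 1024):
        actual = len(base64.b64encode(b"\x00" * n))
        print(n, actual, exact_base64_len(n), estimate_base64_len(n))
```

Because the 100-byte allowance exceeds the padding, the estimate never undershoots the encoded payload itself. In practice a raw file of roughly 3.75 MB already reaches the 5 MB resize target once encoded.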
+
+
 async def vision_analyze_tool(
    image_url: str,
    user_prompt: str,
@@ -376,24 +501,27 @@ async def vision_analyze_tool(
        if not detected_mime_type:
            raise ValueError("Only real image files are supported for vision analysis.")
        
-        # Convert image to base64 data URL
+        # Convert image to base64 — send at full resolution first.
+        # If the provider rejects it as too large, we auto-resize and retry.
        logger.info("Converting image to base64...")
        image_data_url = _image_to_base64_data_url(temp_image_path, mime_type=detected_mime_type)
-        # Calculate size in KB for better readability
        data_size_kb = len(image_data_url) / 1024
        logger.info("Image converted to base64 (%.1f KB)", data_size_kb)

-        # Pre-flight size check: most vision APIs cap base64 payloads at 5 MB.
-        # Reject early with a clear message instead of a cryptic provider 400.
-        _MAX_BASE64_BYTES = 5 * 1024 * 1024  # 5 MB
-        # The data URL includes the header (e.g. "data:image/jpeg;base64,") which
-        # is negligible, but measure the full string to be safe.
+        # Hard limit (20 MB) — no provider accepts payloads this large.
        if len(image_data_url) > _MAX_BASE64_BYTES:
-            raise ValueError(
-                f"Image too large for vision API: base64 payload is "
-                f"{len(image_data_url) / (1024 * 1024):.1f} MB (limit 5 MB). "
-                f"Resize or compress the image and try again."
-            )
+            # Try to resize down to 5 MB before giving up.
+            image_data_url = _resize_image_for_vision(
+                temp_image_path, mime_type=detected_mime_type)
+            if len(image_data_url) > _MAX_BASE64_BYTES:
+                raise ValueError(
+                    f"Image too large for vision API: base64 payload is "
+                    f"{len(image_data_url) / (1024 * 1024):.1f} MB "
+                    f"(limit {_MAX_BASE64_BYTES / (1024 * 1024):.0f} MB) "
+                    f"even after resizing. "
+                    f"Install Pillow (`pip install Pillow`) for better auto-resize, "
+                    f"or compress the image manually."
+                )

        debug_call_data["image_size_bytes"] = image_size_bytes
        
@@ -442,7 +570,24 @@ async def vision_analyze_tool(
        }
        if model:
            call_kwargs["model"] = model
-        response = await async_call_llm(**call_kwargs)
+        # Try full-size image first; on size-related rejection, downscale and retry.
+        try:
+            response = await async_call_llm(**call_kwargs)
+        except Exception as _api_err:
+            if (_is_image_size_error(_api_err)
+                    and len(image_data_url) > _RESIZE_TARGET_BYTES):
+                logger.info(
+                    "API rejected image (%.1f MB, likely too large); "
+                    "auto-resizing to ~%.0f MB and retrying...",
+                    len(image_data_url) / (1024 * 1024),
+                    _RESIZE_TARGET_BYTES / (1024 * 1024),
+                )
+                image_data_url = _resize_image_for_vision(
+                    temp_image_path, mime_type=detected_mime_type)
+                messages[0]["content"][1]["image_url"]["url"] = image_data_url
+                response = await async_call_llm(**call_kwargs)
+            else:
+                raise
        
        # Extract the analysis — fall back to reasoning if content is empty
        analysis = extract_content_or_reasoning(response)
@@ -498,8 +643,8 @@ async def vision_analyze_tool(
        elif "invalid_request" in err_str or "image_url" in err_str:
            analysis = (
                "The vision API rejected the image. This can happen when the "
-                "image is too large, in an unsupported format, or corrupted. "
-                "Try a smaller JPEG/PNG (under 3.5 MB) and retry. "
+                "image is in an unsupported format, corrupted, or still too "
+                "large after auto-resize. Try a smaller JPEG/PNG and retry. "
                f"Error: {e}"
            )
        else:
@@ -1,13 +1,16 @@
 """Shared utility functions for hermes-agent."""

 import json
+import logging
 import os
 import tempfile
 from pathlib import Path
-from typing import Any, Union
+from typing import Any, List, Optional, Union

 import yaml

+logger = logging.getLogger(__name__)
+

 TRUTHY_STRINGS = frozenset({"1", "true", "yes", "on"})

@@ -124,3 +127,88 @@ def atomic_yaml_write(
        except OSError:
            pass
        raise
+
+
+# ─── JSON Helpers ─────────────────────────────────────────────────────────────
+
+
+def safe_json_loads(text: str, default: Any = None) -> Any:
+    """Parse JSON, returning *default* on any parse error.
+
+    Replaces the ``try: json.loads(x) except (JSONDecodeError, TypeError)``
+    pattern duplicated across display.py, anthropic_adapter.py,
+    auxiliary_client.py, and others.
+    """
+    try:
+        return json.loads(text)
+    except (json.JSONDecodeError, TypeError, ValueError):
+        return default
+
+
+def read_json_file(path: Path, default: Any = None) -> Any:
+    """Read and parse a JSON file, returning *default* on any error.
+
+    Replaces the repeated ``try: json.loads(path.read_text()) except ...``
+    pattern in anthropic_adapter.py, auxiliary_client.py, credential_pool.py,
+    and skill_utils.py.
+    """
+    try:
+        return json.loads(Path(path).read_text(encoding="utf-8"))
+    except (json.JSONDecodeError, OSError, ValueError) as exc:  # IOError is an alias of OSError
+        logger.debug("Failed to read %s: %s", path, exc)
+        return default
+
+
+def read_jsonl(path: Path) -> List[dict]:
+    """Read a JSONL file (one JSON object per line).
+
+    Returns a list of parsed objects, skipping blank lines.
+    """
+    entries = []
+    with open(path, "r", encoding="utf-8") as f:
+        for line in f:
+            line = line.strip()
+            if line:
+                entries.append(json.loads(line))
+    return entries
+
+
+def append_jsonl(path: Path, entry: dict) -> None:
+    """Append a single JSON object as a new line to a JSONL file."""
+    path = Path(path)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    with open(path, "a", encoding="utf-8") as f:
+        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
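
A quick round trip through the two JSONL helpers, as a sketch. The helper bodies are copied from this diff so the block runs on its own; `roundtrip_demo` and the file name are made up for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def append_jsonl(path: Path, entry: dict) -> None:
    """Append a single JSON object as a new line to a JSONL file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_jsonl(path: Path) -> List[dict]:
    """Read a JSONL file, skipping blank lines."""
    entries = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries


def roundtrip_demo() -> List[dict]:
    """Write two entries to a fresh file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        # The parent directory does not exist yet; append_jsonl creates it.
        log_path = Path(tmp) / "nested" / "events.jsonl"
        append_jsonl(log_path, {"event": "start", "n": 1})
        append_jsonl(log_path, {"event": "stop", "note": "café"})
        return read_jsonl(log_path)
```

Unlike `read_json_file`, `read_jsonl` has no `default` fallback: a missing file or a malformed line propagates as `OSError` or `json.JSONDecodeError`, so callers that need leniency have to wrap it themselves.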
+
+
+# ─── Environment Variable Helpers ─────────────────────────────────────────────
+
+
+def env_str(key: str, default: str = "") -> str:
+    """Read an environment variable, stripped of whitespace.
+
+    Replaces the ``os.getenv("X", "").strip()`` pattern repeated 50+ times
+    across runtime_provider.py, anthropic_adapter.py, models.py, etc.
+    """
+    return os.getenv(key, default).strip()
+
+
+def env_lower(key: str, default: str = "") -> str:
+    """Read an environment variable, stripped and lowercased."""
+    return os.getenv(key, default).strip().lower()
+
+
+def env_int(key: str, default: int = 0) -> int:
+    """Read an environment variable as an integer, with fallback."""
+    raw = os.getenv(key, "").strip()
+    if not raw:
+        return default
+    try:
+        return int(raw)
+    except (ValueError, TypeError):
+        return default
+
+
+def env_bool(key: str, default: bool = False) -> bool:
+    """Read an environment variable as a boolean."""
+    return is_truthy_value(os.getenv(key, ""), default=default)
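
To illustrate the fallback behavior of the environment helpers, here is `env_int` copied from this diff with a small demonstration. The variable name `HERMES_DEMO_RETRIES` is invented for the example and is not a real Hermes setting.

```python
import os


def env_int(key: str, default: int = 0) -> int:
    """Read an environment variable as an integer, with fallback."""
    raw = os.getenv(key, "").strip()
    if not raw:
        return default
    try:
        return int(raw)
    except (ValueError, TypeError):
        return default


if __name__ == "__main__":
    os.environ["HERMES_DEMO_RETRIES"] = " 42 "   # surrounding whitespace is stripped
    print(env_int("HERMES_DEMO_RETRIES", default=3))
    os.environ["HERMES_DEMO_RETRIES"] = "abc"    # unparsable, so the default is used
    print(env_int("HERMES_DEMO_RETRIES", default=3))
```

An unset variable, an empty string, and an unparsable value all collapse to `default`, so callers cannot distinguish "not configured" from "misconfigured" through this helper alone.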
@@ -143,6 +143,10 @@ to find the parent assistant message, keeping groups intact.

 ### Phase 3: Generate Structured Summary

+:::warning Summary model context length
+The summary model must have a context window **at least as large** as the main agent model's. The entire middle section is sent to the summary model in a single `call_llm(task="compression")` call. If the summary model's context is smaller, the API returns a context-length error — `_generate_summary()` catches it, logs a warning, and returns `None`. The compressor then drops the middle turns **without a summary**, silently losing conversation context. This is the most common cause of degraded compaction quality.
+:::
+
 The middle turns are summarized using the auxiliary LLM with a structured
 template:

@@ -11,30 +11,32 @@ description: "Complete guide to migrating your OpenClaw / Clawdbot setup to Herm
 ## Quick start

 ```bash
-# Preview what would happen (no files changed)
-hermes claw migrate --dry-run
-
-# Run the migration (secrets excluded by default)
+# Preview then migrate (always shows a preview first, then asks to confirm)
 hermes claw migrate

-# Full migration including API keys
-hermes claw migrate --preset full
+# Preview only, no changes
+hermes claw migrate --dry-run
+
+# Full migration including API keys, skip confirmation
+hermes claw migrate --preset full --yes
 ```

-The migration reads from `~/.openclaw/` by default. If you still have a legacy `~/.clawdbot/` or `~/.moldbot/` directory, it's detected automatically. Same for legacy config filenames (`clawdbot.json`, `moldbot.json`).
+The migration always shows a full preview of what will be imported before making any changes. Review the list, then confirm to proceed.
+
+The migration reads from `~/.openclaw/` by default. Legacy `~/.clawdbot/` or `~/.moldbot/` directories are detected automatically, as are legacy config filenames (`clawdbot.json`, `moldbot.json`).

 ## Options

 | Option | Description |
 |--------|-------------|
-| `--dry-run` | Preview what would be migrated without writing anything. |
+| `--dry-run` | Preview only — stop after showing what would be migrated. |
 | `--preset <name>` | `full` (default, includes secrets) or `user-data` (excludes API keys). |
 | `--overwrite` | Overwrite existing Hermes files on conflicts (default: skip). |
 | `--migrate-secrets` | Include API keys (on by default with `--preset full`). |
 | `--source <path>` | Custom OpenClaw directory. |
 | `--workspace-target <path>` | Where to place `AGENTS.md`. |
 | `--skill-conflict <mode>` | `skip` (default), `overwrite`, or `rename`. |
-| `--yes` | Skip confirmation prompt. |
+| `--yes` | Skip the confirmation prompt after preview. |

 ## What gets migrated

@@ -48,7 +50,7 @@ The migration reads from `~/.openclaw/` by default. If you still have a legacy `
 | User profile | `workspace/USER.md` | `~/.hermes/memories/USER.md` | Same entry-merge logic as memory. |
 | Daily memory files | `workspace/memory/*.md` | `~/.hermes/memories/MEMORY.md` | All daily files merged into main memory. |

-All workspace files also check `workspace.default/` as a fallback path.
+Workspace files are also checked at `workspace.default/` and `workspace-main/` as fallback paths (OpenClaw renamed `workspace/` to `workspace-main/` in recent versions, and uses `workspace-{agentId}` for multi-agent setups).

 ### Skills (4 sources)

@@ -66,7 +68,7 @@ Skill conflicts are handled by `--skill-conflict`: `skip` leaves the existing He
 | What | OpenClaw config path | Hermes destination | Notes |
 |------|---------------------|-------------------|-------|
 | Default model | `agents.defaults.model` | `config.yaml` → `model` | Can be a string or `{primary, fallbacks}` object |
-| Custom providers | `models.providers.*` | `config.yaml` → `custom_providers` | Maps `baseUrl`, `apiType` ("openai"→"chat_completions", "anthropic"→"anthropic_messages") |
+| Custom providers | `models.providers.*` | `config.yaml` → `custom_providers` | Maps `baseUrl`, `apiType`/`api` — handles both short ("openai", "anthropic") and hyphenated ("openai-completions", "anthropic-messages", "google-generative-ai") values |
 | Provider API keys | `models.providers.*.apiKey` | `~/.hermes/.env` | Requires `--migrate-secrets`. See [API key resolution](#api-key-resolution) below. |

 ### Agent behavior
@@ -75,7 +77,7 @@ Skill conflicts are handled by `--skill-conflict`: `skip` leaves the existing He
 |------|---------------------|-------------------|---------|
 | Max turns | `agents.defaults.timeoutSeconds` | `agent.max_turns` | `timeoutSeconds / 10`, capped at 200 |
 | Verbose mode | `agents.defaults.verboseDefault` | `agent.verbose` | "off" / "on" / "full" |
-| Reasoning effort | `agents.defaults.thinkingDefault` | `agent.reasoning_effort` | "always"/"high" → "high", "auto"/"medium" → "medium", "off"/"low"/"none"/"minimal" → "low" |
+| Reasoning effort | `agents.defaults.thinkingDefault` | `agent.reasoning_effort` | "always"/"high"/"xhigh" → "high", "auto"/"medium"/"adaptive" → "medium", "off"/"low"/"none"/"minimal" → "low" |
 | Compression | `agents.defaults.compaction.mode` | `compression.enabled` | "off" → false, anything else → true |
 | Compression model | `agents.defaults.compaction.model` | `compression.summary_model` | Direct string copy |
 | Human delay | `agents.defaults.humanDelay.mode` | `human_delay.mode` | "natural" / "custom" / "off" |
@@ -122,26 +124,26 @@ TTS settings are read from **two** OpenClaw config locations with this priority:
 | ElevenLabs model ID | `config.yaml` → `tts.elevenlabs.model_id` |
 | OpenAI model | `config.yaml` → `tts.openai.model` |
 | OpenAI voice | `config.yaml` → `tts.openai.voice` |
-| Edge TTS voice | `config.yaml` → `tts.edge.voice` |
+| Edge TTS voice | `config.yaml` → `tts.edge.voice` (OpenClaw renamed "edge" to "microsoft" — both are recognized) |
 | TTS assets | `~/.hermes/tts/` (file copy) |

 ### Messaging platforms

 | Platform | OpenClaw config path | Hermes `.env` variable | Notes |
 |----------|---------------------|----------------------|-------|
-| Telegram | `channels.telegram.botToken` | `TELEGRAM_BOT_TOKEN` | Token can be string or [SecretRef](#secretref-handling) |
+| Telegram | `channels.telegram.botToken` or `.accounts.default.botToken` | `TELEGRAM_BOT_TOKEN` | Token can be string or [SecretRef](#secretref-handling). Both flat and accounts layout supported. |
 | Telegram | `credentials/telegram-default-allowFrom.json` | `TELEGRAM_ALLOWED_USERS` | Comma-joined from `allowFrom[]` array |
-| Discord | `channels.discord.token` | `DISCORD_BOT_TOKEN` | |
-| Discord | `channels.discord.allowFrom` | `DISCORD_ALLOWED_USERS` | |
-| Slack | `channels.slack.botToken` | `SLACK_BOT_TOKEN` | |
-| Slack | `channels.slack.appToken` | `SLACK_APP_TOKEN` | |
-| Slack | `channels.slack.allowFrom` | `SLACK_ALLOWED_USERS` | |
-| WhatsApp | `channels.whatsapp.allowFrom` | `WHATSAPP_ALLOWED_USERS` | Auth via Baileys QR pairing (not a token) |
-| Signal | `channels.signal.account` | `SIGNAL_ACCOUNT` | |
-| Signal | `channels.signal.httpUrl` | `SIGNAL_HTTP_URL` | |
-| Signal | `channels.signal.allowFrom` | `SIGNAL_ALLOWED_USERS` | |
-| Matrix | `channels.matrix.botToken` | `MATRIX_ACCESS_TOKEN` | Via deep-channels migration |
-| Mattermost | `channels.mattermost.botToken` | `MATTERMOST_BOT_TOKEN` | Via deep-channels migration |
+| Discord | `channels.discord.token` or `.accounts.default.token` | `DISCORD_BOT_TOKEN` | |
+| Discord | `channels.discord.allowFrom` or `.accounts.default.allowFrom` | `DISCORD_ALLOWED_USERS` | |
+| Slack | `channels.slack.botToken` or `.accounts.default.botToken` | `SLACK_BOT_TOKEN` | |
+| Slack | `channels.slack.appToken` or `.accounts.default.appToken` | `SLACK_APP_TOKEN` | |
+| Slack | `channels.slack.allowFrom` or `.accounts.default.allowFrom` | `SLACK_ALLOWED_USERS` | |
+| WhatsApp | `channels.whatsapp.allowFrom` or `.accounts.default.allowFrom` | `WHATSAPP_ALLOWED_USERS` | Auth via Baileys QR pairing — requires re-pairing after migration |
+| Signal | `channels.signal.account` or `.accounts.default.account` | `SIGNAL_ACCOUNT` | |
+| Signal | `channels.signal.httpUrl` or `.accounts.default.httpUrl` | `SIGNAL_HTTP_URL` | |
+| Signal | `channels.signal.allowFrom` or `.accounts.default.allowFrom` | `SIGNAL_ALLOWED_USERS` | |
+| Matrix | `channels.matrix.accessToken` or `.accounts.default.accessToken` | `MATRIX_ACCESS_TOKEN` | Uses `accessToken` (not `botToken`) |
+| Mattermost | `channels.mattermost.botToken` or `.accounts.default.botToken` | `MATTERMOST_BOT_TOKEN` | |

 ### Other config

@@ -178,13 +180,14 @@ These are saved to `~/.hermes/migration/openclaw/<timestamp>/archive/` for manua

 ## API key resolution

-When `--migrate-secrets` is enabled, API keys are collected from **three sources** in priority order:
+When `--migrate-secrets` is enabled, API keys are collected from **four sources** in priority order:

 1. **Config values** — `models.providers.*.apiKey` and TTS provider keys in `openclaw.json`
 2. **Environment file** — `~/.openclaw/.env` (keys like `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, etc.)
-3. **Auth profiles** — `~/.openclaw/agents/main/agent/auth-profiles.json` (per-agent credentials)
+3. **Config env sub-object** — `openclaw.json` → `"env"` or `"env"."vars"` (some setups store keys here instead of a separate `.env` file)
+4. **Auth profiles** — `~/.openclaw/agents/main/agent/auth-profiles.json` (per-agent credentials)

-Config values take priority. The `.env` fills any gaps. Auth profiles fill whatever remains.
+Config values take priority. Each subsequent source fills any remaining gaps.
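
The "first source wins, later sources fill gaps" rule can be sketched as below. This is an illustration of the priority order described above, not the migration's actual code; `merge_by_priority` and the sample dictionaries are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def merge_by_priority(sources: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Merge key sources; earlier sources win, later ones only fill gaps."""
    merged: Dict[str, str] = {}
    for source in sources:
        for name, value in source.items():
            # Skip empty values so a blank entry never blocks a later source.
            if value and name not in merged:
                merged[name] = value
    return merged


if __name__ == "__main__":
    config_values = {"OPENROUTER_API_KEY": "from-config"}
    env_file = {"OPENROUTER_API_KEY": "from-env", "ANTHROPIC_API_KEY": "from-env"}
    config_env = {"ANTHROPIC_API_KEY": "from-config-env", "DEEPSEEK_API_KEY": ""}
    auth_profiles = {"DEEPSEEK_API_KEY": "from-auth-profile"}
    print(merge_by_priority([config_values, env_file, config_env, auth_profiles]))
```

Treating empty strings as missing is a design choice of this sketch; whether the real migration does the same is not stated here.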

 ### Supported key targets

@@ -207,7 +210,7 @@ OpenClaw config values for tokens and API keys can be in three formats:
 "channels": { "telegram": { "botToken": { "source": "env", "id": "TELEGRAM_BOT_TOKEN" } } }
 ```

-The migration resolves all three formats. For env templates and SecretRef objects with `source: "env"`, it looks up the value in `~/.openclaw/.env`. SecretRef objects with `source: "file"` or `source: "exec"` can't be resolved automatically — those values must be added to Hermes manually after migration.
+The migration resolves all three formats. For env templates and SecretRef objects with `source: "env"`, it looks up the value in `~/.openclaw/.env` and the `openclaw.json` env sub-object. SecretRef objects with `source: "file"` or `source: "exec"` can't be resolved automatically — the migration warns about these, and those values must be added to Hermes manually via `hermes config set`.

 ## After migration

@@ -215,13 +218,17 @@ The migration resolves all three formats. For env templates and SecretRef object

 2. **Review archived files** — anything in `~/.hermes/migration/openclaw/<timestamp>/archive/` needs manual attention.

-3. **Verify API keys** — run `hermes status` to check provider authentication.
+3. **Start a new session** — imported skills and memory entries take effect in new sessions, not the current one.

-4. **Test messaging** — if you migrated platform tokens, restart the gateway: `systemctl --user restart hermes-gateway`
+4. **Verify API keys** — run `hermes status` to check provider authentication.

-5. **Check session policies** — verify `hermes config get session_reset` matches your expectations.
+5. **Test messaging** — if you migrated platform tokens, restart the gateway: `systemctl --user restart hermes-gateway`

-6. **Re-pair WhatsApp** — WhatsApp uses QR code pairing (Baileys), not token migration. Run `hermes whatsapp` to pair.
+6. **Check session policies** — verify `hermes config get session_reset` matches your expectations.
+
+7. **Re-pair WhatsApp** — WhatsApp uses QR code pairing (Baileys), not token migration. Run `hermes whatsapp` to pair.
+
+8. **Archive cleanup** — after confirming everything works, run `hermes claw cleanup` to rename leftover OpenClaw directories to `.pre-migration/` (prevents state confusion).

 ## Troubleshooting

@@ -231,7 +238,7 @@ The migration checks `~/.openclaw/`, then `~/.clawdbot/`, then `~/.moldbot/`. If

 ### "No provider API keys found"

-Keys might be in your `.env` file instead of `openclaw.json`. The migration checks both — make sure `~/.openclaw/.env` exists and has the keys. If keys use `source: "file"` or `source: "exec"` SecretRefs, they can't be resolved automatically.
+Keys might be stored in several places depending on your OpenClaw version: inline in `openclaw.json` under `models.providers.*.apiKey`, in `~/.openclaw/.env`, in the `openclaw.json` `"env"` sub-object, or in `agents/main/agent/auth-profiles.json`. The migration checks all four. If keys use `source: "file"` or `source: "exec"` SecretRefs, they can't be resolved automatically — add them via `hermes config set`.

 ### Skills not appearing after migration

@@ -27,6 +27,7 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
 | **MiniMax China** | `MINIMAX_CN_API_KEY` in `~/.hermes/.env` (provider: `minimax-cn`) |
 | **Alibaba Cloud** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`, aliases: `dashscope`, `qwen`) |
 | **Kilo Code** | `KILOCODE_API_KEY` in `~/.hermes/.env` (provider: `kilocode`) |
+| **Xiaomi MiMo** | `XIAOMI_API_KEY` in `~/.hermes/.env` (provider: `xiaomi`, aliases: `mimo`, `xiaomi-mimo`) |
 | **OpenCode Zen** | `OPENCODE_ZEN_API_KEY` in `~/.hermes/.env` (provider: `opencode-zen`) |
 | **OpenCode Go** | `OPENCODE_GO_API_KEY` in `~/.hermes/.env` (provider: `opencode-go`) |
 | **DeepSeek** | `DEEPSEEK_API_KEY` in `~/.hermes/.env` (provider: `deepseek`) |
@@ -157,16 +158,20 @@ hermes chat --provider minimax-cn --model MiniMax-M2.7
 # Alibaba Cloud / DashScope (Qwen models)
 hermes chat --provider alibaba --model qwen3.5-plus
 # Requires: DASHSCOPE_API_KEY in ~/.hermes/.env
+
+# Xiaomi MiMo
+hermes chat --provider xiaomi --model mimo-v2-pro
+# Requires: XIAOMI_API_KEY in ~/.hermes/.env
 ```

 Or set the provider permanently in `config.yaml`:
 ```yaml
 model:
-  provider: "zai"       # or: kimi-coding, minimax, minimax-cn, alibaba
+  provider: "zai"       # or: kimi-coding, minimax, minimax-cn, alibaba, xiaomi
  default: "glm-5"
 ```

-Base URLs can be overridden with `GLM_BASE_URL`, `KIMI_BASE_URL`, `MINIMAX_BASE_URL`, `MINIMAX_CN_BASE_URL`, or `DASHSCOPE_BASE_URL` environment variables.
+Base URLs can be overridden with `GLM_BASE_URL`, `KIMI_BASE_URL`, `MINIMAX_BASE_URL`, `MINIMAX_CN_BASE_URL`, `DASHSCOPE_BASE_URL`, or `XIAOMI_BASE_URL` environment variables.

 :::note Z.AI Endpoint Auto-Detection
 When using the Z.AI / GLM provider, Hermes automatically probes multiple endpoints (global, China, coding variants) to find one that accepts your API key. You don't need to set `GLM_BASE_URL` manually — the working endpoint is detected and cached automatically.
@@ -849,7 +854,7 @@ You can also select named custom providers from the interactive `hermes model` m
 | **Cost optimization** | ClawRouter or OpenRouter with `sort: "price"` |
 | **Maximum privacy** | Ollama, vLLM, or llama.cpp (fully local) |
 | **Enterprise / Azure** | Azure OpenAI with custom endpoint |
-| **Chinese AI models** | z.ai (GLM), Kimi/Moonshot, or MiniMax (first-class providers) |
+| **Chinese AI models** | z.ai (GLM), Kimi/Moonshot, MiniMax, or Xiaomi MiMo (first-class providers) |

 :::tip
 You can switch between providers at any time with `hermes model` — no restart required. Your conversation history, memory, and skills carry over regardless of which provider you use.
@@ -924,7 +929,7 @@ fallback_model:

 When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.

-Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `deepseek`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `alibaba`, `custom`.
+Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `deepseek`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `alibaba`, `custom`.

 :::tip
 Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
@@ -76,7 +76,7 @@ Common options:
 | `-q`, `--query "..."` | One-shot, non-interactive prompt. |
 | `-m`, `--model <model>` | Override the model for this run. |
 | `-t`, `--toolsets <csv>` | Enable a comma-separated set of toolsets. |
-| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `deepseek`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `alibaba`. |
+| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `deepseek`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `alibaba`. |
 | `-s`, `--skills <name>` | Preload one or more skills for the session (can be repeated or comma-separated). |
 | `-v`, `--verbose` | Verbose output. |
 | `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. |
@@ -37,6 +37,8 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
 | `MINIMAX_CN_BASE_URL` | Override MiniMax China base URL (default: `https://api.minimaxi.com/v1`) |
 | `KILOCODE_API_KEY` | Kilo Code API key ([kilo.ai](https://kilo.ai)) |
 | `KILOCODE_BASE_URL` | Override Kilo Code base URL (default: `https://api.kilo.ai/api/gateway`) |
+| `XIAOMI_API_KEY` | Xiaomi MiMo API key ([platform.xiaomimimo.com](https://platform.xiaomimimo.com)) |
+| `XIAOMI_BASE_URL` | Override Xiaomi MiMo base URL (default: `https://api.xiaomimimo.com/v1`) |
 | `HF_TOKEN` | Hugging Face token for Inference Providers ([huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)) |
 | `HF_BASE_URL` | Override Hugging Face base URL (default: `https://router.huggingface.co/v1`) |
 | `GOOGLE_API_KEY` | Google AI Studio API key ([aistudio.google.com/app/apikey](https://aistudio.google.com/app/apikey)) |
@@ -65,7 +67,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe

 | Variable | Description |
 |----------|-------------|
-| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`, `alibaba`, `deepseek`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) |
+| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`, `xiaomi`, `alibaba`, `deepseek`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) |
 | `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
 | `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
 | `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
@@ -480,7 +480,9 @@ Points at a custom OpenAI-compatible endpoint. Uses `OPENAI_API_KEY` for auth.
 | `nous` / `openrouter` / etc. | not set | Force that provider, use its auth |
 | any | set | Use the custom endpoint directly (provider ignored) |

-The `summary_model` must support a context length at least as large as your main model's, since it receives the full middle section of the conversation for compression.
+:::warning Summary model context length requirement
+The `summary_model` **must** have a context window at least as large as your main agent model's. The compressor sends the full middle section of the conversation to the summary model; if that model's context window is smaller than the main model's, the summarization call can fail with a context length error. When this happens, the middle turns are **dropped without a summary**, silently losing conversation context. If you override `summary_model`, verify that its context length meets or exceeds your main model's.
+:::

 ## Context Engine

@@ -50,6 +50,7 @@ Both `provider` and `model` are **required**. If either is missing, the fallback
 | OpenCode Zen | `opencode-zen` | `OPENCODE_ZEN_API_KEY` |
 | OpenCode Go | `opencode-go` | `OPENCODE_GO_API_KEY` |
 | Kilo Code | `kilocode` | `KILOCODE_API_KEY` |
+| Xiaomi MiMo | `xiaomi` | `XIAOMI_API_KEY` |
 | Alibaba / DashScope | `alibaba` | `DASHSCOPE_API_KEY` |
 | Hugging Face | `huggingface` | `HF_TOKEN` |
 | Custom endpoint | `custom` | `base_url` + `api_key_env` (see below) |
@@ -169,7 +170,7 @@ When a task's provider is set to `"auto"` (the default), Hermes tries providers

 ```text
 OpenRouter → Nous Portal → Custom endpoint → Codex OAuth →
-API-key providers (z.ai, Kimi, MiniMax, Hugging Face, Anthropic) → give up
+API-key providers (z.ai, Kimi, MiniMax, Xiaomi MiMo, Hugging Face, Anthropic) → give up
 ```

 **For vision tasks:**