fix: 24h cooldown for 401/403 auth failures + user notification

Previously, credentials exhausted due to 401 (invalid token) or 403 (forbidden) used the same 1-hour cooldown as 429 rate limits. This meant the system would retry an invalid token every hour forever, burning API calls and confusing users who had no idea why their primary provider wasn't being used.

Changes:
- credential_pool: EXHAUSTED_TTL_AUTH_SECONDS = 24h for 401/403 errors (rate limits keep the 1h cooldown; provider reset_at timestamps still override both)
- run_agent: emit an actionable status message via _emit_status() when all pool credentials are rejected. It tells the user to run `hermes auth reset <provider>` or `hermes model` to re-authenticate. The message propagates to both the CLI (force-printed) and the gateway (Telegram, Discord, etc.)
- Tests for all three TTL cases (401 stays exhausted at 1h, 401 resets at 24h, 403 stays exhausted at 1h) and for the auth exhaustion notification (emits when the pool is exhausted, silent when rotation succeeds)

Addresses user report: Copilot 401 + Codex 429 caused silent fallback with no recovery path visible to the user.
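The cooldown selection described in this commit can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the TTL constants and the status-code mapping mirror the diff, while `cooldown_deadline` and its `exhausted_at` / `reset_at` parameters are hypothetical names used only to show how a provider-supplied reset timestamp might take precedence over the defaults.

```python
from typing import Optional

EXHAUSTED_TTL_429_SECONDS = 60 * 60          # 1 hour (rate limits)
EXHAUSTED_TTL_AUTH_SECONDS = 24 * 60 * 60    # 24 hours (401/403, token invalid)
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60      # 1 hour (everything else)


def exhausted_ttl(error_code: Optional[int]) -> int:
    """Return cooldown seconds for the HTTP status that caused exhaustion."""
    if error_code == 429:
        return EXHAUSTED_TTL_429_SECONDS
    if error_code in (401, 403):
        return EXHAUSTED_TTL_AUTH_SECONDS
    return EXHAUSTED_TTL_DEFAULT_SECONDS


def cooldown_deadline(
    exhausted_at: float,
    error_code: Optional[int],
    reset_at: Optional[float] = None,
) -> float:
    """Hypothetical helper: timestamp after which a credential may be retried.

    Assumes a provider-supplied reset_at (epoch seconds) overrides the
    status-based default, as the commit message states.
    """
    if reset_at is not None:
        return reset_at
    return exhausted_at + exhausted_ttl(error_code)
```

Under these assumptions, a credential rejected with a 401 at time `t` stays out of rotation until `t + 86400` unless the provider reports an earlier or later reset time.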
feat: add --all flag to gateway start and restart commands (#10043)
2026-04-14 21:00:45 -07:00 · 2026-04-14 20:52:18 -07:00 · 2026-04-14 20:51:52 -07:00 · 2026-04-14 20:51:52 -07:00 · 2026-04-14 20:51:52 -07:00 · 2026-04-14 20:51:52 -07:00
46 changed files with 3017 additions and 1160 deletions
@@ -69,10 +69,10 @@ SUPPORTED_POOL_STRATEGIES = {
 }

 # Cooldown before retrying an exhausted credential.
-# 429 (rate-limited) and 402 (billing/quota) both cool down after 1 hour.
 # Provider-supplied reset_at timestamps override these defaults.
-EXHAUSTED_TTL_429_SECONDS = 60 * 60          # 1 hour
-EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60      # 1 hour
+EXHAUSTED_TTL_429_SECONDS = 60 * 60          # 1 hour  (rate limits)
+EXHAUSTED_TTL_AUTH_SECONDS = 24 * 60 * 60    # 24 hours (401/403 — token invalid)
+EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60      # 1 hour  (everything else)

 # Pool key prefix for custom OpenAI-compatible endpoints.
 # Custom endpoints all share provider='custom' but are keyed by their
@@ -193,6 +193,10 @@ def _exhausted_ttl(error_code: Optional[int]) -> int:
    """Return cooldown seconds based on the HTTP status that caused exhaustion."""
    if error_code == 429:
        return EXHAUSTED_TTL_429_SECONDS
+    if error_code in (401, 403):
+        # Auth failures are permanent until the user re-authenticates.
+        # Use a long cooldown to avoid retrying dead tokens every hour.
+        return EXHAUSTED_TTL_AUTH_SECONDS
    return EXHAUSTED_TTL_DEFAULT_SECONDS


@@ -36,6 +36,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
    "opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
    "mimo", "xiaomi-mimo",
    "arcee-ai", "arceeai",
+    "xai", "x-ai", "x.ai", "grok",
    "qwen-portal",
 })

@@ -989,6 +989,7 @@ def _prune_orphaned_branches(repo_root: str) -> None:
 _ACCENT_ANSI_DEFAULT = "\033[1;38;2;255;215;0m"  # True-color #FFD700 bold — fallback
 _BOLD = "\033[1m"
 _RST = "\033[0m"
+_STREAM_PAD = "    "  # 4-space indent for streamed response text (matches Panel padding)


 def _hex_to_ansi(hex_color: str, *, bold: bool = False) -> str:
@@ -1712,9 +1713,9 @@ class HermesCLI:
        # Parse and validate toolsets
        self.enabled_toolsets = toolsets
        if toolsets and "all" not in toolsets and "*" not in toolsets:
-            # Validate each toolset — MCP server names are added by
-            # _get_platform_tools() but aren't registered in TOOLSETS yet
-            # (that happens later in _sync_mcp_toolsets), so exclude them.
+            # Validate each toolset — MCP server names are resolved via
+            # live registry aliases (registered during discover_mcp_tools),
+            # but discovery hasn't run yet at this point, so exclude them.
            mcp_names = set((CLI_CONFIG.get("mcp_servers") or {}).keys())
            invalid = [t for t in toolsets if not validate_toolset(t) and t not in mcp_names]
            if invalid:
@@ -2580,7 +2581,7 @@ class HermesCLI:
        _tc = getattr(self, "_stream_text_ansi", "")
        while "\n" in self._stream_buf:
            line, self._stream_buf = self._stream_buf.split("\n", 1)
-            _cprint(f"{_tc}{line}{_RST}" if _tc else line)
+            _cprint(f"{_STREAM_PAD}{_tc}{line}{_RST}" if _tc else f"{_STREAM_PAD}{line}")

    def _flush_stream(self) -> None:
        """Emit any remaining partial line from the stream buffer and close the box."""
@@ -2597,7 +2598,7 @@ class HermesCLI:

        if self._stream_buf:
            _tc = getattr(self, "_stream_text_ansi", "")
-            _cprint(f"{_tc}{self._stream_buf}{_RST}" if _tc else self._stream_buf)
+            _cprint(f"{_STREAM_PAD}{_tc}{self._stream_buf}{_RST}" if _tc else f"{_STREAM_PAD}{self._stream_buf}")
            self._stream_buf = ""

        # Close the response box
@@ -5761,7 +5762,7 @@ class HermesCLI:
                        border_style=_resp_color,
                        style=_resp_text,
                        box=rich_box.HORIZONTALS,
-                        padding=(1, 2),
+                        padding=(1, 4),
                    ))
                else:
                    _cprint("  (No response generated)")
@@ -5885,7 +5886,7 @@ class HermesCLI:
                        title_align="left",
                        border_style=_resp_color,
                        box=rich_box.HORIZONTALS,
-                        padding=(1, 2),
+                        padding=(1, 4),
                    ))
                else:
                    _cprint("  💬 /btw: (no response)")
@@ -7648,7 +7649,7 @@ class HermesCLI:
                        label = " ⚕ Hermes "
                        fill = w - 2 - len(label)
                        _cprint(f"\n{_ACCENT}╭─{label}{'─' * max(fill - 1, 0)}╮{_RST}")
-                    _cprint(sentence.rstrip())
+                    _cprint(f"{_STREAM_PAD}{sentence.rstrip()}")

                tts_thread = threading.Thread(
                    target=stream_tts_to_speaker,
@@ -7879,7 +7880,7 @@ class HermesCLI:
                        border_style=_resp_color,
                        style=_resp_text,
                        box=rich_box.HORIZONTALS,
-                        padding=(1, 2),
+                        padding=(1, 4),
                    ))


@@ -3,12 +3,11 @@ Event Hook System

 A lightweight event-driven system that fires handlers at key lifecycle points.
 Hooks are discovered from ~/.hermes/hooks/ directories, each containing:
-  - HOOK.yaml  (metadata: name, description, events list, optional startup_readiness)
+  - HOOK.yaml  (metadata: name, description, events list)
  - handler.py (Python handler with async def handle(event_type, context))

 Events:
  - gateway:startup     -- Gateway process starts
-  - gateway:shutdown    -- Gateway process is shutting down
  - session:start       -- New session created (first message of a new session)
  - session:end         -- Session ends (user ran /new or /reset)
  - session:reset       -- Session reset completed (new session entry created)
@@ -32,26 +31,6 @@ from hermes_cli.config import get_hermes_home
 HOOKS_DIR = get_hermes_home() / "hooks"


-def _normalize_startup_readiness(hook_name: str, manifest: dict[str, Any]) -> Optional[dict[str, Any]]:
-    """Validate and normalize optional startup readiness metadata."""
-    readiness = manifest.get("startup_readiness")
-    if readiness is None:
-        return None
-    if not isinstance(readiness, dict):
-        print(f"[hooks] Ignoring startup_readiness for {hook_name}: expected mapping", flush=True)
-        return None
-
-    check_id = str(readiness.get("id", "")).strip()
-    if not check_id:
-        print(f"[hooks] Ignoring startup_readiness for {hook_name}: missing id", flush=True)
-        return None
-
-    return {
-        "id": check_id,
-        "required": bool(readiness.get("required", True)),
-    }
-
-
 class HookRegistry:
    """
    Discovers, loads, and fires event hooks.
@@ -83,7 +62,6 @@ class HookRegistry:
                "description": "Run ~/.hermes/BOOT.md on gateway startup",
                "events": ["gateway:startup"],
                "path": "(builtin)",
-                "startup_readiness": None,
            })
        except Exception as e:
            print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
@@ -124,7 +102,6 @@ class HookRegistry:
                if not events:
                    print(f"[hooks] Skipping {hook_name}: no events declared", flush=True)
                    continue
-                startup_readiness = _normalize_startup_readiness(hook_name, manifest)

                # Dynamically load the handler module
                spec = importlib.util.spec_from_file_location(
@@ -151,7 +128,6 @@ class HookRegistry:
                    "description": manifest.get("description", ""),
                    "events": events,
                    "path": str(hook_dir),
-                    "startup_readiness": startup_readiness,
                })

                print(f"[hooks] Loaded hook '{hook_name}' for events: {events}", flush=True)
@@ -515,6 +515,8 @@ class APIServerAdapter(BasePlatformAdapter):
        session_id: Optional[str] = None,
        stream_delta_callback=None,
        tool_progress_callback=None,
+        tool_start_callback=None,
+        tool_complete_callback=None,
    ) -> Any:
        """
        Create an AIAgent instance using the gateway's runtime config.
@@ -553,6 +555,8 @@ class APIServerAdapter(BasePlatformAdapter):
            platform="api_server",
            stream_delta_callback=stream_delta_callback,
            tool_progress_callback=tool_progress_callback,
+            tool_start_callback=tool_start_callback,
+            tool_complete_callback=tool_complete_callback,
            session_db=self._ensure_session_db(),
            fallback_model=fallback_model,
        )
@@ -965,6 +969,426 @@ class APIServerAdapter(BasePlatformAdapter):

        return response

+    async def _write_sse_responses(
+        self,
+        request: "web.Request",
+        response_id: str,
+        model: str,
+        created_at: int,
+        stream_q,
+        agent_task,
+        agent_ref,
+        conversation_history: List[Dict[str, str]],
+        user_message: str,
+        instructions: Optional[str],
+        conversation: Optional[str],
+        store: bool,
+        session_id: str,
+    ) -> "web.StreamResponse":
+        """Write an SSE stream for POST /v1/responses (OpenAI Responses API).
+
+        Emits spec-compliant event types as the agent runs:
+
+        - ``response.created`` — initial envelope (status=in_progress)
+        - ``response.output_text.delta`` / ``response.output_text.done`` —
+          streamed assistant text
+        - ``response.output_item.added`` / ``response.output_item.done``
+          with ``item.type == "function_call"`` — when the agent invokes a
+          tool (both events fire; the ``done`` event carries the finalized
+          ``arguments`` string)
+        - ``response.output_item.added`` / ``response.output_item.done``
+          with ``item.type == "function_call_output"`` — tool result with
+          ``{call_id, output, status}``
+        - ``response.completed`` — terminal event carrying the full
+          response object with all output items + usage (same payload
+          shape as the non-streaming path for parity)
+        - ``response.failed`` — terminal event on agent error
+
+        If the client disconnects mid-stream, ``agent.interrupt()`` is
+        called so the agent stops issuing upstream LLM calls, then the
+        asyncio task is cancelled.  When ``store=True`` the full response
+        is persisted to the ResponseStore after ``response.completed`` is
+        written, so GET /v1/responses/{id} and ``previous_response_id``
+        chaining work the same as the batch path.
+        """
+        import queue as _q
+
+        sse_headers = {
+            "Content-Type": "text/event-stream",
+            "Cache-Control": "no-cache",
+            "X-Accel-Buffering": "no",
+        }
+        origin = request.headers.get("Origin", "")
+        cors = self._cors_headers_for_origin(origin) if origin else None
+        if cors:
+            sse_headers.update(cors)
+        if session_id:
+            sse_headers["X-Hermes-Session-Id"] = session_id
+        response = web.StreamResponse(status=200, headers=sse_headers)
+        await response.prepare(request)
+
+        # State accumulated during the stream
+        final_text_parts: List[str] = []
+        # Track open function_call items by call_id so we can emit a matching
+        # ``done`` event when the tool completes.  Order preserved.
+        pending_tool_calls: List[Dict[str, Any]] = []
+        # Output items we've emitted so far (used to build the terminal
+        # response.completed payload).  Kept in the order they appeared.
+        emitted_items: List[Dict[str, Any]] = []
+        # Monotonic counter for output_index (spec requires it).
+        output_index = 0
+        # Monotonic counter for call_id generation if the agent doesn't
+        # provide one (it doesn't, from tool_progress_callback).
+        call_counter = 0
+        # Canonical Responses SSE events include a monotonically increasing
+        # sequence_number. Add it server-side for every emitted event so
+        # clients that validate the OpenAI event schema can parse our stream.
+        sequence_number = 0
+        # Track the assistant message item id + content index for text
+        # delta events — the spec ties deltas to a specific item.
+        message_item_id = f"msg_{uuid.uuid4().hex[:24]}"
+        message_output_index: Optional[int] = None
+        message_opened = False
+
+        async def _write_event(event_type: str, data: Dict[str, Any]) -> None:
+            nonlocal sequence_number
+            if "sequence_number" not in data:
+                data["sequence_number"] = sequence_number
+            sequence_number += 1
+            payload = f"event: {event_type}\ndata: {json.dumps(data)}\n\n"
+            await response.write(payload.encode())
+
+        def _envelope(status: str) -> Dict[str, Any]:
+            env: Dict[str, Any] = {
+                "id": response_id,
+                "object": "response",
+                "status": status,
+                "created_at": created_at,
+                "model": model,
+            }
+            return env
+
+        final_response_text = ""
+        agent_error: Optional[str] = None
+        usage: Dict[str, int] = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
+
+        try:
+            # response.created — initial envelope, status=in_progress
+            created_env = _envelope("in_progress")
+            created_env["output"] = []
+            await _write_event("response.created", {
+                "type": "response.created",
+                "response": created_env,
+            })
+            last_activity = time.monotonic()
+
+            async def _open_message_item() -> None:
+                """Emit response.output_item.added for the assistant message
+                the first time any text delta arrives."""
+                nonlocal message_opened, message_output_index, output_index
+                if message_opened:
+                    return
+                message_opened = True
+                message_output_index = output_index
+                output_index += 1
+                item = {
+                    "id": message_item_id,
+                    "type": "message",
+                    "status": "in_progress",
+                    "role": "assistant",
+                    "content": [],
+                }
+                await _write_event("response.output_item.added", {
+                    "type": "response.output_item.added",
+                    "output_index": message_output_index,
+                    "item": item,
+                })
+
+            async def _emit_text_delta(delta_text: str) -> None:
+                await _open_message_item()
+                final_text_parts.append(delta_text)
+                await _write_event("response.output_text.delta", {
+                    "type": "response.output_text.delta",
+                    "item_id": message_item_id,
+                    "output_index": message_output_index,
+                    "content_index": 0,
+                    "delta": delta_text,
+                    "logprobs": [],
+                })
+
+            async def _emit_tool_started(payload: Dict[str, Any]) -> str:
+                """Emit response.output_item.added for a function_call.
+
+                Returns the call_id so the matching completion event can
+                reference it.  Prefer the real ``tool_call_id`` from the
+                agent when available; fall back to a generated call id for
+                safety in tests or older code paths.
+                """
+                nonlocal output_index, call_counter
+                call_counter += 1
+                call_id = payload.get("tool_call_id") or f"call_{response_id[5:]}_{call_counter}"
+                args = payload.get("arguments", {})
+                if isinstance(args, dict):
+                    arguments_str = json.dumps(args)
+                else:
+                    arguments_str = str(args)
+                item = {
+                    "id": f"fc_{uuid.uuid4().hex[:24]}",
+                    "type": "function_call",
+                    "status": "in_progress",
+                    "name": payload.get("name", ""),
+                    "call_id": call_id,
+                    "arguments": arguments_str,
+                }
+                idx = output_index
+                output_index += 1
+                pending_tool_calls.append({
+                    "call_id": call_id,
+                    "name": payload.get("name", ""),
+                    "arguments": arguments_str,
+                    "item_id": item["id"],
+                    "output_index": idx,
+                })
+                emitted_items.append({
+                    "type": "function_call",
+                    "name": payload.get("name", ""),
+                    "arguments": arguments_str,
+                    "call_id": call_id,
+                })
+                await _write_event("response.output_item.added", {
+                    "type": "response.output_item.added",
+                    "output_index": idx,
+                    "item": item,
+                })
+                return call_id
+
+            async def _emit_tool_completed(payload: Dict[str, Any]) -> None:
+                """Emit response.output_item.done (function_call) followed
+                by response.output_item.added (function_call_output)."""
+                nonlocal output_index
+                call_id = payload.get("tool_call_id")
+                result = payload.get("result", "")
+                pending = None
+                if call_id:
+                    for i, p in enumerate(pending_tool_calls):
+                        if p["call_id"] == call_id:
+                            pending = pending_tool_calls.pop(i)
+                            break
+                if pending is None:
+                    # Completion without a matching start — skip to avoid
+                    # emitting orphaned done events.
+                    return
+
+                # function_call done
+                done_item = {
+                    "id": pending["item_id"],
+                    "type": "function_call",
+                    "status": "completed",
+                    "name": pending["name"],
+                    "call_id": pending["call_id"],
+                    "arguments": pending["arguments"],
+                }
+                await _write_event("response.output_item.done", {
+                    "type": "response.output_item.done",
+                    "output_index": pending["output_index"],
+                    "item": done_item,
+                })
+
+                # function_call_output added (result)
+                result_str = result if isinstance(result, str) else json.dumps(result)
+                output_parts = [{"type": "input_text", "text": result_str}]
+                output_item = {
+                    "id": f"fco_{uuid.uuid4().hex[:24]}",
+                    "type": "function_call_output",
+                    "call_id": pending["call_id"],
+                    "output": output_parts,
+                    "status": "completed",
+                }
+                idx = output_index
+                output_index += 1
+                emitted_items.append({
+                    "type": "function_call_output",
+                    "call_id": pending["call_id"],
+                    "output": output_parts,
+                })
+                await _write_event("response.output_item.added", {
+                    "type": "response.output_item.added",
+                    "output_index": idx,
+                    "item": output_item,
+                })
+                await _write_event("response.output_item.done", {
+                    "type": "response.output_item.done",
+                    "output_index": idx,
+                    "item": output_item,
+                })
+
+            # Main drain loop — thread-safe queue fed by agent callbacks.
+            async def _dispatch(it) -> None:
+                """Route a queue item to the correct SSE emitter.
+
+                Plain strings are text deltas.  Tagged tuples with
+                ``__tool_started__`` / ``__tool_completed__`` prefixes
+                are tool lifecycle events.
+                """
+                if isinstance(it, tuple) and len(it) == 2 and isinstance(it[0], str):
+                    tag, payload = it
+                    if tag == "__tool_started__":
+                        await _emit_tool_started(payload)
+                    elif tag == "__tool_completed__":
+                        await _emit_tool_completed(payload)
+                    # Unknown tags are silently ignored (forward-compat).
+                elif isinstance(it, str):
+                    await _emit_text_delta(it)
+                # Other types (non-string, non-tuple) are silently dropped.
+
+            loop = asyncio.get_running_loop()
+            while True:
+                try:
+                    item = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
+                except _q.Empty:
+                    if agent_task.done():
+                        # Drain remaining
+                        while True:
+                            try:
+                                item = stream_q.get_nowait()
+                                if item is None:
+                                    break
+                                await _dispatch(item)
+                                last_activity = time.monotonic()
+                            except _q.Empty:
+                                break
+                        break
+                    if time.monotonic() - last_activity >= CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS:
+                        await response.write(b": keepalive\n\n")
+                        last_activity = time.monotonic()
+                    continue
+
+                if item is None:  # EOS sentinel
+                    break
+
+                await _dispatch(item)
+                last_activity = time.monotonic()
+
+            # Pick up agent result + usage from the completed task
+            try:
+                result, agent_usage = await agent_task
+                usage = agent_usage or usage
+                # If the agent produced a final_response but no text
+                # deltas were streamed (e.g. some providers only emit
+                # the full response at the end), emit a single fallback
+                # delta so Responses clients still receive a live text part.
+                agent_final = result.get("final_response", "") if isinstance(result, dict) else ""
+                if agent_final and not final_text_parts:
+                    await _emit_text_delta(agent_final)
+                if agent_final and not final_response_text:
+                    final_response_text = agent_final
+                if isinstance(result, dict) and result.get("error") and not final_response_text:
+                    agent_error = result["error"]
+            except Exception as e:  # noqa: BLE001
+                logger.error("Error running agent for streaming responses: %s", e, exc_info=True)
+                agent_error = str(e)
+
+            # Close the message item if it was opened
+            final_response_text = "".join(final_text_parts) or final_response_text
+            if message_opened:
+                await _write_event("response.output_text.done", {
+                    "type": "response.output_text.done",
+                    "item_id": message_item_id,
+                    "output_index": message_output_index,
+                    "content_index": 0,
+                    "text": final_response_text,
+                    "logprobs": [],
+                })
+                msg_done_item = {
+                    "id": message_item_id,
+                    "type": "message",
+                    "status": "completed",
+                    "role": "assistant",
+                    "content": [
+                        {"type": "output_text", "text": final_response_text}
+                    ],
+                }
+                await _write_event("response.output_item.done", {
+                    "type": "response.output_item.done",
+                    "output_index": message_output_index,
+                    "item": msg_done_item,
+                })
+
+            # Always append a final message item in the completed
+            # response envelope so clients that only parse the terminal
+            # payload still see the assistant text.  This mirrors the
+            # shape produced by _extract_output_items in the batch path.
+            final_items: List[Dict[str, Any]] = list(emitted_items)
+            final_items.append({
+                "type": "message",
+                "role": "assistant",
+                "content": [
+                    {"type": "output_text", "text": final_response_text or (agent_error or "")}
+                ],
+            })
+
+            if agent_error:
+                failed_env = _envelope("failed")
+                failed_env["output"] = final_items
+                failed_env["error"] = {"message": agent_error, "type": "server_error"}
+                failed_env["usage"] = {
+                    "input_tokens": usage.get("input_tokens", 0),
+                    "output_tokens": usage.get("output_tokens", 0),
+                    "total_tokens": usage.get("total_tokens", 0),
+                }
+                await _write_event("response.failed", {
+                    "type": "response.failed",
+                    "response": failed_env,
+                })
+            else:
+                completed_env = _envelope("completed")
+                completed_env["output"] = final_items
+                completed_env["usage"] = {
+                    "input_tokens": usage.get("input_tokens", 0),
+                    "output_tokens": usage.get("output_tokens", 0),
+                    "total_tokens": usage.get("total_tokens", 0),
+                }
+                await _write_event("response.completed", {
+                    "type": "response.completed",
+                    "response": completed_env,
+                })
+
+                # Persist for future chaining / GET retrieval, mirroring
+                # the batch path behavior.
+                if store:
+                    full_history = list(conversation_history)
+                    full_history.append({"role": "user", "content": user_message})
+                    if isinstance(result, dict) and result.get("messages"):
+                        full_history.extend(result["messages"])
+                    else:
+                        full_history.append({"role": "assistant", "content": final_response_text})
+                    self._response_store.put(response_id, {
+                        "response": completed_env,
+                        "conversation_history": full_history,
+                        "instructions": instructions,
+                    })
+                    if conversation:
+                        self._response_store.set_conversation(conversation, response_id)
+
+        except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError, OSError):
+            # Client disconnected — interrupt the agent so it stops
+            # making upstream LLM calls, then cancel the task.
+            agent = agent_ref[0] if agent_ref else None
+            if agent is not None:
+                try:
+                    agent.interrupt("SSE client disconnected")
+                except Exception:
+                    pass
+            if not agent_task.done():
+                agent_task.cancel()
+                try:
+                    await agent_task
+                except (asyncio.CancelledError, Exception):
+                    pass
+            logger.info("SSE client disconnected; interrupted agent task %s", response_id)
+
+        return response
+
    async def _handle_responses(self, request: "web.Request") -> "web.Response":
        """POST /v1/responses — OpenAI Responses API format."""
        auth_err = self._check_auth(request)
@@ -1060,6 +1484,80 @@ class APIServerAdapter(BasePlatformAdapter):
        # Run the agent (with Idempotency-Key support)
        session_id = str(uuid.uuid4())

+        stream = bool(body.get("stream", False))
+        if stream:
+            # Streaming branch — emit OpenAI Responses SSE events as the
+            # agent runs so frontends can render text deltas and tool
+            # calls in real time.  See _write_sse_responses for details.
+            import queue as _q
+            _stream_q: _q.Queue = _q.Queue()
+
+            def _on_delta(delta):
+                # None from the agent is a CLI box-close signal, not EOS.
+                # Forwarding would kill the SSE stream prematurely; the
+                # SSE writer detects completion via agent_task.done().
+                if delta is not None:
+                    _stream_q.put(delta)
+
+            def _on_tool_progress(event_type, name, preview, args, **kwargs):
+                """Queue non-start tool progress events if needed in future.
+
+                The structured Responses stream uses ``tool_start_callback``
+                and ``tool_complete_callback`` for exact call-id correlation,
+                so progress events are currently ignored here.
+                """
+                return
+
+            def _on_tool_start(tool_call_id, function_name, function_args):
+                """Queue a started tool for live function_call streaming."""
+                _stream_q.put(("__tool_started__", {
+                    "tool_call_id": tool_call_id,
+                    "name": function_name,
+                    "arguments": function_args or {},
+                }))
+
+            def _on_tool_complete(tool_call_id, function_name, function_args, function_result):
+                """Queue a completed tool result for live function_call_output streaming."""
+                _stream_q.put(("__tool_completed__", {
+                    "tool_call_id": tool_call_id,
+                    "name": function_name,
+                    "arguments": function_args or {},
+                    "result": function_result,
+                }))
+
+            agent_ref = [None]
+            agent_task = asyncio.ensure_future(self._run_agent(
+                user_message=user_message,
+                conversation_history=conversation_history,
+                ephemeral_system_prompt=instructions,
+                session_id=session_id,
+                stream_delta_callback=_on_delta,
+                tool_progress_callback=_on_tool_progress,
+                tool_start_callback=_on_tool_start,
+                tool_complete_callback=_on_tool_complete,
+                agent_ref=agent_ref,
+            ))
+
+            response_id = f"resp_{uuid.uuid4().hex[:28]}"
+            model_name = body.get("model", self._model_name)
+            created_at = int(time.time())
+
+            return await self._write_sse_responses(
+                request=request,
+                response_id=response_id,
+                model=model_name,
+                created_at=created_at,
+                stream_q=_stream_q,
+                agent_task=agent_task,
+                agent_ref=agent_ref,
+                conversation_history=conversation_history,
+                user_message=user_message,
+                instructions=instructions,
+                conversation=conversation,
+                store=store,
+                session_id=session_id,
+            )
+
        async def _compute_response():
            return await self._run_agent(
                user_message=user_message,
@@ -1486,6 +1984,8 @@ class APIServerAdapter(BasePlatformAdapter):
        session_id: Optional[str] = None,
        stream_delta_callback=None,
        tool_progress_callback=None,
+        tool_start_callback=None,
+        tool_complete_callback=None,
        agent_ref: Optional[list] = None,
    ) -> tuple:
        """
@@ -1507,6 +2007,8 @@ class APIServerAdapter(BasePlatformAdapter):
                session_id=session_id,
                stream_delta_callback=stream_delta_callback,
                tool_progress_callback=tool_progress_callback,
+                tool_start_callback=tool_start_callback,
+                tool_complete_callback=tool_complete_callback,
            )
            if agent_ref is not None:
                agent_ref[0] = agent
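
Note on the streaming hunk above: the callbacks and the SSE writer communicate only through a thread-safe queue, and completion is detected from the agent task rather than from a sentinel value. The following is a minimal sketch of that pattern, not the adapter's actual writer. `make_callbacks`, `drain`, `fake_agent` and `run_demo` are hypothetical names, and the polling interval is an assumption.

```python
import asyncio
import queue


def make_callbacks(stream_q):
    """Build callbacks that push text deltas and tool events onto one queue."""
    def on_delta(delta):
        # None is not an end-of-stream marker here, so it is dropped.
        if delta is not None:
            stream_q.put(delta)

    def on_tool_start(tool_call_id, name, args):
        # Tool events travel as tagged tuples so the consumer can tell
        # them apart from plain text deltas.
        stream_q.put(("__tool_started__", {
            "tool_call_id": tool_call_id,
            "name": name,
            "arguments": args or {},
        }))

    return on_delta, on_tool_start


async def drain(stream_q, agent_task, poll_interval=0.01):
    """Collect queued events until the agent task is done and the queue is empty."""
    events = []
    while True:
        try:
            events.append(stream_q.get_nowait())
            continue
        except queue.Empty:
            pass
        if agent_task.done():
            # Final sweep: the agent may have queued items just before finishing.
            while True:
                try:
                    events.append(stream_q.get_nowait())
                except queue.Empty:
                    return events
        await asyncio.sleep(poll_interval)


async def fake_agent(on_delta, on_tool_start):
    """Stand-in for the real agent: emits two deltas, a None, and a tool start."""
    on_delta("Hel")
    on_delta(None)
    on_tool_start("call_1", "search", None)
    await asyncio.sleep(0)
    on_delta("lo")


def run_demo():
    async def main():
        q = queue.Queue()
        on_delta, on_tool_start = make_callbacks(q)
        task = asyncio.ensure_future(fake_agent(on_delta, on_tool_start))
        return await drain(q, task)
    return asyncio.run(main())
```

Checking `agent_task.done()` only after the queue is empty, followed by one last sweep, avoids losing events queued between the last read and task completion.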
@@ -1696,6 +1696,10 @@ class DiscordAdapter(BasePlatformAdapter):
        async def slash_update(interaction: discord.Interaction):
            await self._run_simple_slash(interaction, "/update", "Update initiated~")

+        @tree.command(name="restart", description="Gracefully restart the Hermes gateway")
+        async def slash_restart(interaction: discord.Interaction):
+            await self._run_simple_slash(interaction, "/restart", "Restart requested~")
+
        @tree.command(name="approve", description="Approve a pending dangerous command")
        @discord.app_commands.describe(scope="Optional: 'all', 'session', 'always', 'all session', 'all always'")
        async def slash_approve(interaction: discord.Interaction, scope: str = ""):
@@ -1405,7 +1405,7 @@ class GatewayRunner:
        action = "restarting" if self._restart_requested else "shutting down"
        hint = (
            "Your current task will be interrupted. "
-            "Use /retry after restart to continue."
+            "Send any message after restart to resume where it left off."
            if self._restart_requested
            else "Your current task will be interrupted."
        )
@@ -1475,6 +1475,106 @@ class GatewayRunner:
            except Exception:
                pass

+    _STUCK_LOOP_THRESHOLD = 3  # restarts while active before auto-suspend
+    _STUCK_LOOP_FILE = ".restart_failure_counts"
+
+    def _increment_restart_failure_counts(self, active_session_keys: set) -> None:
+        """Increment restart-failure counters for sessions active at shutdown.
+
+        Persists to a JSON file so counters survive across restarts.
+        Sessions NOT in active_session_keys are removed (they completed
+        successfully, so the loop is broken).
+        """
+        import json
+
+        path = _hermes_home / self._STUCK_LOOP_FILE
+        try:
+            counts = json.loads(path.read_text()) if path.exists() else {}
+        except Exception:
+            counts = {}
+
+        # Increment active sessions, remove inactive ones (loop broken)
+        new_counts = {}
+        for key in active_session_keys:
+            new_counts[key] = counts.get(key, 0) + 1
+        # Sessions that were not active at this shutdown are dropped on
+        # purpose: new_counts is rebuilt from active_session_keys only.
+
+        try:
+            path.write_text(json.dumps(new_counts))
+        except Exception:
+            pass
+
+    def _suspend_stuck_loop_sessions(self) -> int:
+        """Suspend sessions that have been active across too many restarts.
+
+        Returns the number of sessions suspended.  Called on gateway startup
+        AFTER suspend_recently_active() to catch the stuck-loop pattern:
+        session loads → agent gets stuck → gateway restarts → repeat.
+        """
+        import json
+
+        path = _hermes_home / self._STUCK_LOOP_FILE
+        if not path.exists():
+            return 0
+
+        try:
+            counts = json.loads(path.read_text())
+        except Exception:
+            return 0
+
+        suspended = 0
+        stuck_keys = [k for k, v in counts.items() if v >= self._STUCK_LOOP_THRESHOLD]
+
+        for session_key in stuck_keys:
+            try:
+                entry = self.session_store._entries.get(session_key)
+                if entry and not entry.suspended:
+                    entry.suspended = True
+                    suspended += 1
+                    logger.warning(
+                        "Auto-suspended stuck session %s (active across %d "
+                        "consecutive restarts — likely a stuck loop)",
+                        session_key[:30], counts[session_key],
+                    )
+            except Exception:
+                pass
+
+        if suspended:
+            try:
+                self.session_store._save()
+            except Exception:
+                pass
+
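+        # Drop only the counters that hit the threshold; unlinking the file
+        # on every startup would stop any counter from ever reaching it.
+        try:
+            remaining = {k: v for k, v in counts.items() if k not in stuck_keys}
+            if remaining:
+                path.write_text(json.dumps(remaining))
+            else:
+                path.unlink(missing_ok=True)
+        except Exception:
+            pass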
+
+        return suspended
+
+    def _clear_restart_failure_count(self, session_key: str) -> None:
+        """Clear the restart-failure counter for a session that completed OK.
+
+        Called after a successful agent turn to signal the loop is broken.
+        """
+        import json
+
+        path = _hermes_home / self._STUCK_LOOP_FILE
+        if not path.exists():
+            return
+        try:
+            counts = json.loads(path.read_text())
+            if session_key in counts:
+                del counts[session_key]
+                if counts:
+                    path.write_text(json.dumps(counts))
+                else:
+                    path.unlink(missing_ok=True)
+        except Exception:
+            pass
+
    async def _launch_detached_restart_command(self) -> None:
        import shutil
        import subprocess
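
Note on the stuck-loop counters above: the shutdown-side bookkeeping can be exercised in isolation. The sketch below is an illustration under assumptions, not the gateway code. It models only the increment at shutdown and the threshold check, uses a temporary directory in place of the Hermes home directory, and `load_counts`, `record_shutdown`, `stuck_sessions` and `demo` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path

THRESHOLD = 3  # mirrors _STUCK_LOOP_THRESHOLD in the diff


def load_counts(path):
    """Read the counter file, treating a missing or corrupt file as empty."""
    try:
        return json.loads(path.read_text()) if path.exists() else {}
    except (OSError, ValueError):
        return {}


def record_shutdown(path, active_keys):
    """Increment counters for sessions active at shutdown; drop all others."""
    counts = load_counts(path)
    new_counts = {key: counts.get(key, 0) + 1 for key in active_keys}
    path.write_text(json.dumps(new_counts))


def stuck_sessions(path):
    """Return the session keys whose counter has reached the threshold."""
    counts = load_counts(path)
    return sorted(k for k, v in counts.items() if v >= THRESHOLD)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "restart_failure_counts.json"
        record_shutdown(path, {"a", "b"})   # both active at first shutdown
        record_shutdown(path, {"a"})        # "b" finished, so it is dropped
        after_two = load_counts(path)
        record_shutdown(path, {"a", "b"})   # "b" starts again from 1
        return after_two, load_counts(path), stuck_sessions(path)
```

Because a session that is absent from one shutdown loses its entry, only sessions active across consecutive shutdowns can reach the threshold.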
@@ -1540,7 +1640,7 @@ class GatewayRunner:
            pass
        try:
            from gateway.status import write_runtime_status
-            write_runtime_status(gateway_state="starting", exit_reason=None, startup_checks={})
+            write_runtime_status(gateway_state="starting", exit_reason=None)
        except Exception:
            pass
        
@@ -1582,23 +1682,8 @@ class GatewayRunner:
                "or configure platform allowlists (e.g., TELEGRAM_ALLOWED_USERS=your_id)."
            )
        
-        # Discover plugins before hooks so plugin-owned hook bundles can
-        # participate in this same startup cycle.
-        try:
-            from hermes_cli.plugins import discover_plugins
-
-            discover_plugins()
-        except Exception as e:
-            logger.warning("Plugin discovery during gateway startup failed: %s", e)
-
        # Discover and load event hooks
        self.hooks.discover_and_load()
-        try:
-            from gateway.status import reset_startup_checks
-
-            reset_startup_checks(self.hooks.loaded_hooks)
-        except Exception as e:
-            logger.warning("Startup readiness initialization failed: %s", e)
        
        # Recover background processes from checkpoint (crash recovery)
        try:
@@ -1633,6 +1718,17 @@ class GatewayRunner:
            except Exception as e:
                logger.warning("Session suspension on startup failed: %s", e)

+        # Stuck-loop detection (#7536): if a session has been active across
+        # 3+ consecutive restarts, it's probably stuck in a loop (the same
+        # history keeps causing the agent to hang).  Auto-suspend it so the
+        # user gets a clean slate on the next message.
+        try:
+            stuck = self._suspend_stuck_loop_sessions()
+            if stuck:
+                logger.warning("Auto-suspended %d stuck-loop session(s)", stuck)
+        except Exception as e:
+            logger.debug("Stuck-loop detection failed: %s", e)
+
        connected_count = 0
        enabled_platform_count = 0
        startup_nonretryable_errors: list[str] = []
@@ -2119,11 +2215,6 @@ class GatewayRunner:
                    logger.error("Failed to launch detached gateway restart: %s", e)

            self._finalize_shutdown_agents(active_agents)
-            await self.hooks.emit("gateway:shutdown", {
-                "restart": self._restart_requested,
-                "service_restart": self._restart_via_service,
-                "detached_restart": self._restart_detached,
-            })

            for platform, adapter in list(self.adapters.items()):
                try:
@@ -2189,6 +2280,14 @@ class GatewayRunner:
                    "active sessions."
                )

+            # Track sessions that were active at shutdown for stuck-loop
+            # detection (#7536).  On each restart, the counter increments
+            # for sessions that were running.  If a session hits the
+            # threshold (3 consecutive restarts while active), the next
+            # startup auto-suspends it — breaking the loop.
+            if active_agents:
+                self._increment_restart_failure_counts(set(active_agents.keys()))
+
            if self._restart_requested and self._restart_via_service:
                self._exit_code = GATEWAY_SERVICE_RESTART_EXIT_CODE
                self._exit_reason = self._exit_reason or "Gateway restart requested"
@@ -3687,6 +3786,12 @@ class GatewayRunner:
                _response_time, _api_calls, _resp_len,
            )

+            # Successful turn — clear any stuck-loop counter for this session.
+            # This ensures the counter only accumulates across CONSECUTIVE
+            # restarts where the session was active (never completed).
+            if session_key:
+                self._clear_restart_failure_count(session_key)
+
            # Surface error details when the agent failed silently (final_response=None)
            if not response and agent_result.get("failed"):
                error_detail = agent_result.get("error", "unknown error")
@@ -8470,6 +8575,21 @@ class GatewayRunner:
            if _msn:
                message = _msn + "\n\n" + message

+            # Auto-continue: if the loaded history ends with a tool result,
+            # the previous agent turn was interrupted mid-work (gateway
+            # restart, crash, SIGTERM).  Prepend a system note so the model
+            # finishes processing the pending tool results before addressing
+            # the user's new message.  (#4493)
+            if agent_history and agent_history[-1].get("role") == "tool":
+                message = (
+                    "[System note: Your previous turn was interrupted before you could "
+                    "process the last tool result(s). The conversation history contains "
+                    "tool outputs you haven't responded to yet. Please finish processing "
+                    "those results and summarize what was accomplished, then address the "
+                    "user's new message below.]\n\n"
+                    + message
+                )
+
            _approval_session_key = session_key or ""
            _approval_session_token = set_current_session_key(_approval_session_key)
            register_gateway_notify(_approval_session_key, _approval_notify_sync)
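
Note on the auto-continue hunk above: the check depends only on the role of the last history entry. A minimal sketch of that rule follows, with a shortened placeholder note instead of the full system note. `prepend_resume_note` and `RESUME_NOTE` are hypothetical names.

```python
# Shortened placeholder; the diff above uses a longer system note.
RESUME_NOTE = "[System note: your previous turn was interrupted.]"


def prepend_resume_note(history, message, note=RESUME_NOTE):
    """Prefix the note only when the history ends with an unanswered tool result."""
    if history and history[-1].get("role") == "tool":
        return note + "\n\n" + message
    return message
```

An empty history, or one that ends with an assistant or user message, leaves the incoming message unchanged.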
@@ -27,7 +27,6 @@ _RUNTIME_STATUS_FILE = "gateway_state.json"
 _LOCKS_DIRNAME = "gateway-locks"
 _IS_WINDOWS = sys.platform == "win32"
 _UNSET = object()
-_VALID_STARTUP_CHECK_STATES = {"pending", "ready", "failed"}


 def _get_pid_path() -> Path:
@@ -163,39 +162,11 @@ def _build_runtime_status_record() -> dict[str, Any]:
        "restart_requested": False,
        "active_agents": 0,
        "platforms": {},
-        "startup_checks": {},
        "updated_at": _utc_now_iso(),
    })
    return payload


-def _normalize_startup_check_entries(
-    startup_checks: Optional[dict[str, Any]],
-) -> dict[str, dict[str, Any]]:
-    """Normalize persisted startup readiness entries."""
-    if not isinstance(startup_checks, dict):
-        return {}
-
-    now = _utc_now_iso()
-    normalized: dict[str, dict[str, Any]] = {}
-    for raw_id, raw_payload in startup_checks.items():
-        check_id = str(raw_id).strip()
-        if not check_id:
-            continue
-        payload = raw_payload if isinstance(raw_payload, dict) else {}
-        state = str(payload.get("state", "pending")).strip().lower()
-        if state not in _VALID_STARTUP_CHECK_STATES:
-            state = "pending"
-        normalized[check_id] = {
-            "state": state,
-            "required": bool(payload.get("required", True)),
-            "source": payload.get("source"),
-            "detail": payload.get("detail"),
-            "updated_at": payload.get("updated_at") or now,
-        }
-    return normalized
-
-
 def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
    if not path.exists():
        return None
@@ -252,7 +223,6 @@ def write_runtime_status(
    exit_reason: Any = _UNSET,
    restart_requested: Any = _UNSET,
    active_agents: Any = _UNSET,
-    startup_checks: Any = _UNSET,
    platform: Any = _UNSET,
    platform_state: Any = _UNSET,
    error_code: Any = _UNSET,
@@ -275,8 +245,6 @@ def write_runtime_status(
        payload["restart_requested"] = bool(restart_requested)
    if active_agents is not _UNSET:
        payload["active_agents"] = max(0, int(active_agents))
-    if startup_checks is not _UNSET:
-        payload["startup_checks"] = _normalize_startup_check_entries(startup_checks)

    if platform is not _UNSET:
        platform_payload = payload["platforms"].get(platform, {})
@@ -294,109 +262,7 @@ def write_runtime_status(

 def read_runtime_status() -> Optional[dict[str, Any]]:
    """Read the persisted gateway runtime health/status information."""
-    payload = _read_json_file(_get_runtime_status_path())
-    if payload is None:
-        return None
-    payload.setdefault("platforms", {})
-    payload["startup_checks"] = _normalize_startup_check_entries(payload.get("startup_checks"))
-    return payload
-
-
-def reset_startup_checks(checks: Optional[list[dict[str, Any]]] = None) -> dict[str, dict[str, Any]]:
-    """Replace persisted startup readiness checks for the current run."""
-    normalized: dict[str, dict[str, Any]] = {}
-    now = _utc_now_iso()
-
-    for hook in checks or []:
-        if not isinstance(hook, dict):
-            continue
-        readiness = hook.get("startup_readiness")
-        if not isinstance(readiness, dict):
-            continue
-        check_id = str(readiness.get("id", "")).strip()
-        if not check_id:
-            continue
-        normalized[check_id] = {
-            "state": "pending",
-            "required": bool(readiness.get("required", True)),
-            "source": hook.get("name"),
-            "detail": None,
-            "updated_at": now,
-        }
-
-    write_runtime_status(startup_checks=normalized)
-    return normalized
-
-
-def update_startup_check(
-    check_id: str,
-    state: str,
-    *,
-    detail: Any = _UNSET,
-    required: Any = _UNSET,
-    source: Any = _UNSET,
-) -> dict[str, Any]:
-    """Update a single startup readiness check in the runtime status file."""
-    normalized_id = str(check_id).strip()
-    if not normalized_id:
-        raise ValueError("startup readiness check id is required")
-
-    normalized_state = str(state).strip().lower()
-    if normalized_state not in _VALID_STARTUP_CHECK_STATES:
-        raise ValueError(f"invalid startup readiness state: {state}")
-
-    path = _get_runtime_status_path()
-    payload = _read_json_file(path) or _build_runtime_status_record()
-    checks = _normalize_startup_check_entries(payload.get("startup_checks"))
-    existing = checks.get(normalized_id, {})
-    now = _utc_now_iso()
-
-    checks[normalized_id] = {
-        "state": normalized_state,
-        "required": bool(existing.get("required", True) if required is _UNSET else required),
-        "source": existing.get("source") if source is _UNSET else source,
-        "detail": existing.get("detail") if detail is _UNSET else detail,
-        "updated_at": now,
-    }
-
-    payload["startup_checks"] = checks
-    payload.setdefault("platforms", {})
-    payload.setdefault("kind", _GATEWAY_KIND)
-    payload["pid"] = os.getpid()
-    payload["start_time"] = _get_process_start_time(os.getpid())
-    payload["updated_at"] = now
-    _write_json_file(path, payload)
-    return checks[normalized_id]
-
-
-def mark_startup_check_pending(
-    check_id: str,
-    *,
-    detail: Any = _UNSET,
-    required: Any = _UNSET,
-    source: Any = _UNSET,
-) -> dict[str, Any]:
-    return update_startup_check(check_id, "pending", detail=detail, required=required, source=source)
-
-
-def mark_startup_check_ready(
-    check_id: str,
-    *,
-    detail: Any = _UNSET,
-    required: Any = _UNSET,
-    source: Any = _UNSET,
-) -> dict[str, Any]:
-    return update_startup_check(check_id, "ready", detail=detail, required=required, source=source)
-
-
-def mark_startup_check_failed(
-    check_id: str,
-    *,
-    detail: Any = _UNSET,
-    required: Any = _UNSET,
-    source: Any = _UNSET,
-) -> dict[str, Any]:
-    return update_startup_check(check_id, "failed", detail=detail, required=required, source=source)
+    return _read_json_file(_get_runtime_status_path())


 def remove_pid_file() -> None:
@@ -844,8 +844,7 @@ class SlashCommandCompleter(Completer):
            return None
        return word

-    @staticmethod
-    def _context_completions(word: str, limit: int = 30):
+    def _context_completions(self, word: str, limit: int = 30):
        """Yield Claude Code-style @ context completions.

        Bare ``@`` or ``@partial`` shows static references and matching
@@ -2766,6 +2766,47 @@ def sanitize_env_file() -> int:
    return fixes


+def _check_non_ascii_credential(key: str, value: str) -> str:
+    """Warn and strip non-ASCII characters from credential values.
+
+    API keys and tokens must be pure ASCII — they are sent as HTTP header
+    values which httpx/httpcore encode as ASCII.  Non-ASCII characters
+    (commonly introduced by copy-pasting from rich-text editors or PDFs
+    that substitute lookalike Unicode glyphs for ASCII letters) cause
+    ``UnicodeEncodeError: 'ascii' codec can't encode character`` at
+    request time.
+
+    Returns the sanitized (ASCII-only) value.  Prints a warning if any
+    non-ASCII characters were found and removed.
+    """
+    try:
+        value.encode("ascii")
+        return value  # all ASCII — nothing to do
+    except UnicodeEncodeError:
+        pass
+
+    # Build a readable list of the offending characters
+    bad_chars: list[str] = []
+    for i, ch in enumerate(value):
+        if ord(ch) > 127:
+            bad_chars.append(f"  position {i}: {ch!r} (U+{ord(ch):04X})")
+    sanitized = value.encode("ascii", errors="ignore").decode("ascii")
+
+    import sys
+    print(
+        f"\n  Warning: {key} contains non-ASCII characters that will break API requests.\n"
+        f"  This usually happens when copy-pasting from a PDF, rich-text editor,\n"
+        f"  or web page that substitutes lookalike Unicode glyphs for ASCII letters.\n"
+        f"\n"
+        + "\n".join(f"  {line}" for line in bad_chars[:5])
+        + ("\n  ... and more" if len(bad_chars) > 5 else "")
+        + f"\n\n  The non-ASCII characters have been stripped automatically.\n"
+        f"  If authentication fails, re-copy the key from the provider's dashboard.\n",
+        file=sys.stderr,
+    )
+    return sanitized
+
+
 def save_env_value(key: str, value: str):
    """Save or update a value in ~/.hermes/.env."""
    if is_managed():
@@ -2774,6 +2815,8 @@ def save_env_value(key: str, value: str):
    if not _ENV_VAR_NAME_RE.match(key):
        raise ValueError(f"Invalid environment variable name: {key!r}")
    value = value.replace("\n", "").replace("\r", "")
+    # API keys / tokens must be ASCII — strip non-ASCII with a warning.
+    value = _check_non_ascii_credential(key, value)
    ensure_hermes_home()
    env_path = get_env_path()
    
@@ -8,11 +8,40 @@ from pathlib import Path
 from dotenv import load_dotenv


+# Env var name suffixes that indicate credential values.  These are the
+# only env vars whose values we sanitize on load — we must not silently
+# alter arbitrary user env vars, but credentials are known to require
+# pure ASCII (they become HTTP header values).
+_CREDENTIAL_SUFFIXES = ("_API_KEY", "_TOKEN", "_SECRET", "_KEY")
+
+
+def _sanitize_loaded_credentials() -> None:
+    """Strip non-ASCII characters from credential env vars in os.environ.
+
+    Called after dotenv loads so the rest of the codebase never sees
+    non-ASCII API keys.  Only touches env vars whose names end with
+    known credential suffixes (``_API_KEY``, ``_TOKEN``, etc.).
+    """
+    for key, value in list(os.environ.items()):
+        if not any(key.endswith(suffix) for suffix in _CREDENTIAL_SUFFIXES):
+            continue
+        try:
+            value.encode("ascii")
+        except UnicodeEncodeError:
+            os.environ[key] = value.encode("ascii", errors="ignore").decode("ascii")
+
+
 def _load_dotenv_with_fallback(path: Path, *, override: bool) -> None:
    try:
        load_dotenv(dotenv_path=path, override=override, encoding="utf-8")
    except UnicodeDecodeError:
        load_dotenv(dotenv_path=path, override=override, encoding="latin-1")
+    # Strip non-ASCII characters from credential env vars that were just
+    # loaded.  API keys must be pure ASCII since they're sent as HTTP
+    # header values (httpx encodes headers as ASCII).  Non-ASCII chars
+    # typically come from copy-pasting keys from PDFs or rich-text editors
+    # that substitute Unicode lookalike glyphs (e.g. ʋ U+028B for v).
+    _sanitize_loaded_credentials()


 def _sanitize_env_file_if_needed(path: Path) -> None:
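
Note on the two credential sanitizers above: both use the same encode-and-ignore round trip, and the load-time variant additionally filters by variable-name suffix. The sketch below shows both steps on plain values without touching `os.environ`. `strip_non_ascii` and `is_credential_name` are hypothetical helper names; the suffix tuple is copied from the diff.

```python
CREDENTIAL_SUFFIXES = ("_API_KEY", "_TOKEN", "_SECRET", "_KEY")


def is_credential_name(name):
    """True when the env var name ends with a known credential suffix."""
    return name.endswith(CREDENTIAL_SUFFIXES)


def strip_non_ascii(value):
    """Return (ASCII-only value, list of (position, character) that were removed)."""
    offenders = [(i, ch) for i, ch in enumerate(value) if ord(ch) > 127]
    cleaned = value.encode("ascii", errors="ignore").decode("ascii")
    return cleaned, offenders
```

For example, a key pasted with the lookalike glyph U+028B in place of `v` comes back with that character removed and its position reported, while a name such as `HOME` is never considered for sanitizing.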
@@ -10,7 +10,6 @@ import shutil
 import signal
 import subprocess
 import sys
-import time
 from pathlib import Path

 PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -38,10 +37,6 @@ from hermes_cli.setup import (
 from hermes_cli.colors import Colors, color


-_SERVICE_READINESS_TIMEOUT = 30.0
-_SERVICE_READINESS_POLL_INTERVAL = 0.2
-
-
 # =============================================================================
 # Process Management (for manual gateway runs)
 # =============================================================================
@@ -1105,123 +1100,12 @@ def systemd_uninstall(system: bool = False):
    print(f"✓ {_service_scope_label(system).capitalize()} service uninstalled")


-def _describe_startup_check(check_id: str, check: dict) -> str:
-    source = check.get("source")
-    detail = check.get("detail")
-    label = f"{check_id} ({source})" if source and source != check_id else check_id
-    return f"{label}: {detail}" if detail else label
-
-
-def _classify_startup_checks(state: dict | None) -> tuple[list[str], list[str], list[str]]:
-    checks = (state or {}).get("startup_checks") or {}
-    pending_required: list[str] = []
-    failed_required: list[str] = []
-    optional_warnings: list[str] = []
-
-    if not isinstance(checks, dict):
-        return pending_required, failed_required, optional_warnings
-
-    for check_id, raw_check in checks.items():
-        check = raw_check if isinstance(raw_check, dict) else {}
-        label = _describe_startup_check(str(check_id), check)
-        check_state = str(check.get("state", "pending")).strip().lower()
-        required = bool(check.get("required", True))
-
-        if check_state == "ready":
-            continue
-        if required:
-            if check_state == "failed":
-                failed_required.append(label)
-            else:
-                pending_required.append(label)
-        else:
-            prefix = "failed" if check_state == "failed" else "pending"
-            optional_warnings.append(f"{prefix}: {label}")
-
-    return pending_required, failed_required, optional_warnings
-
-
-def _wait_for_service_readiness(
-    *,
-    action: str,
-    previous_pid: int | None = None,
-    timeout: float = _SERVICE_READINESS_TIMEOUT,
-    poll_interval: float = _SERVICE_READINESS_POLL_INTERVAL,
-) -> list[str]:
-    from gateway.status import get_running_pid, read_runtime_status
-
-    deadline = time.monotonic() + timeout
-    last_pending: list[str] = []
-
-    while time.monotonic() < deadline:
-        live_pid = get_running_pid()
-        if live_pid is None or (previous_pid is not None and live_pid == previous_pid):
-            time.sleep(poll_interval)
-            continue
-
-        runtime = read_runtime_status() or {}
-        try:
-            runtime_pid = int(runtime.get("pid"))
-        except (TypeError, ValueError):
-            runtime_pid = None
-        if runtime_pid != live_pid:
-            time.sleep(poll_interval)
-            continue
-
-        gateway_state = runtime.get("gateway_state")
-        pending_required, failed_required, optional_warnings = _classify_startup_checks(runtime)
-        last_pending = pending_required
-
-        if gateway_state == "startup_failed":
-            reason = runtime.get("exit_reason") or f"gateway {action} failed during startup"
-            raise RuntimeError(reason)
-        if failed_required:
-            raise RuntimeError(
-                "required startup checks failed: " + "; ".join(failed_required)
-            )
-        if gateway_state == "running" and not pending_required:
-            return optional_warnings
-
-        time.sleep(poll_interval)
-
-    if last_pending:
-        raise RuntimeError(
-            "timed out waiting for required startup checks: " + "; ".join(last_pending)
-        )
-    if previous_pid is not None:
-        raise RuntimeError(
-            f"timed out waiting for gateway {action}; previous process is still active or no new runtime became ready"
-        )
-    raise RuntimeError(f"timed out waiting for gateway {action} readiness")
-
-
-def _await_service_ready_or_exit(
-    *,
-    action: str,
-    previous_pid: int | None = None,
-    timeout: float = _SERVICE_READINESS_TIMEOUT,
-) -> None:
-    try:
-        optional_warnings = _wait_for_service_readiness(
-            action=action,
-            previous_pid=previous_pid,
-            timeout=timeout,
-        )
-    except RuntimeError as exc:
-        print_error(f"  Gateway {action} did not become ready: {exc}")
-        raise SystemExit(1) from exc
-
-    for warning in optional_warnings:
-        print_warning(f"  Optional startup check {warning}")
-
-
 def systemd_start(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("start")
    refresh_systemd_unit_if_needed(system=system)
    _run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
-    _await_service_ready_or_exit(action="start")
    print(f"✓ {_service_scope_label(system).capitalize()} service started")


@@ -1244,11 +1128,64 @@ def systemd_restart(system: bool = False):

    pid = get_running_pid()
    if pid is not None and _request_gateway_self_restart(pid):
-        _await_service_ready_or_exit(action="restart", previous_pid=pid)
-        print(f"✓ {_service_scope_label(system).capitalize()} service restarted")
+        # SIGUSR1 sent — the gateway will drain active agents, exit with
+        # code 75, and systemd will restart it after RestartSec (30s).
+        # Wait for the old process to die and the new one to become active
+        # so the CLI doesn't return while the service is still restarting.
+        import time
+        scope_label = _service_scope_label(system).capitalize()
+        svc = get_service_name()
+        scope_cmd = _systemctl_cmd(system)
+
+        # Phase 1: wait for old process to exit (drain + shutdown)
+        print(f"⏳ {scope_label} service draining active work...")
+        deadline = time.time() + 90
+        while time.time() < deadline:
+            try:
+                os.kill(pid, 0)
+                time.sleep(1)
+            except (ProcessLookupError, PermissionError):
+                # Old process is gone.  PermissionError after a successful
+                # signal most likely means the PID was reused.
+                break
+        else:
+            print(f"⚠ Old process (PID {pid}) still alive after 90s")
+
+        # Phase 2: wait for systemd to start the new process
+        print(f"⏳ Waiting for {svc} to restart...")
+        deadline = time.time() + 60
+        while time.time() < deadline:
+            try:
+                result = subprocess.run(
+                    scope_cmd + ["is-active", svc],
+                    capture_output=True, text=True, timeout=5,
+                )
+                if result.stdout.strip() == "active":
+                    # Verify it's a NEW process, not the old one somehow
+                    new_pid = get_running_pid()
+                    if new_pid and new_pid != pid:
+                        print(f"✓ {scope_label} service restarted (PID {new_pid})")
+                        return
+            except (subprocess.TimeoutExpired, FileNotFoundError):
+                pass
+            time.sleep(2)
+
+        # Timed out — check final state
+        try:
+            result = subprocess.run(
+                scope_cmd + ["is-active", svc],
+                capture_output=True, text=True, timeout=5,
+            )
+            if result.stdout.strip() == "active":
+                print(f"✓ {scope_label} service restarted")
+                return
+        except Exception:
+            pass
+        print(
+            f"⚠ {scope_label} service did not become active within 60s.\n"
+            f"  Check status: {'sudo ' if system else ''}hermes gateway status\n"
+            f"  Check logs:   journalctl {'--user ' if not system else ''}-u {svc} --since '2 min ago'"
+        )
        return
    _run_systemctl(["reload-or-restart", get_service_name()], system=system, check=True, timeout=90)
-    _await_service_ready_or_exit(action="restart", previous_pid=pid)
    print(f"✓ {_service_scope_label(system).capitalize()} service restarted")
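
Note on the restart wait above: both phases are deadline-bounded polling loops, first until the old process is gone and then until the service reports active under a new PID. A generic sketch of such a loop follows. It is not the CLI's code: `wait_until` is a hypothetical helper, it uses `time.monotonic` where the diff uses `time.time`, and the injectable `clock` and `sleep` parameters are assumptions added so the loop can be tested without real delays.

```python
import time


def wait_until(predicate, timeout, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True on success. After the deadline, the predicate is checked
    one final time, which corresponds to the "check final state" step.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if predicate():
            return True
        sleep(interval)
    return bool(predicate())
```

With a helper like this, the two phases would become two calls with different predicates and timeouts (90s and 60s in the diff), and a `False` result would trigger the warning path.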


@@ -1507,7 +1444,6 @@ def launchd_start():
        plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
        subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
        subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
-        _await_service_ready_or_exit(action="start")
        print("✓ Service started")
        return

@@ -1520,7 +1456,6 @@ def launchd_start():
        print("↻ launchd job was unloaded; reloading service definition")
        subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
        subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
-    _await_service_ready_or_exit(action="start")
    print("✓ Service started")

 def launchd_stop():
@@ -1591,8 +1526,7 @@ def launchd_restart():
    try:
        pid = get_running_pid()
        if pid is not None and _request_gateway_self_restart(pid):
-            _await_service_ready_or_exit(action="restart", previous_pid=pid)
-            print("✓ Service restarted")
+            print("✓ Service restart requested")
            return
        if pid is not None:
            try:
@@ -1604,7 +1538,6 @@ def launchd_restart():
                if not exited:
                    print(f"⚠ Gateway drain timed out after {drain_timeout:.0f}s — forcing launchd restart")
        subprocess.run(["launchctl", "kickstart", "-k", target], check=True, timeout=90)
-        _await_service_ready_or_exit(action="restart", previous_pid=pid)
        print("✓ Service restarted")
    except subprocess.CalledProcessError as e:
        if e.returncode not in (3, 113):
@@ -1614,7 +1547,6 @@ def launchd_restart():
        plist_path = get_launchd_plist_path()
        subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
        subprocess.run(["launchctl", "kickstart", target], check=True, timeout=30)
-        _await_service_ready_or_exit(action="restart", previous_pid=pid)
        print("✓ Service restarted")

 def launchd_status(deep: bool = False):
@@ -2987,6 +2919,15 @@ def gateway_command(args):

    elif subcmd == "start":
        system = getattr(args, 'system', False)
+        start_all = getattr(args, 'all', False)
+
+        if start_all:
+            # Kill all stale gateway processes across all profiles before starting
+            killed = kill_gateway_processes(all_profiles=True)
+            if killed:
+                print(f"✓ Killed {killed} stale gateway process(es) across all profiles")
+                _wait_for_gateway_exit(timeout=10.0, force_after=5.0)
+
        if is_termux():
            print("Gateway service start is not supported on Termux because there is no system service manager.")
            print("Run manually: hermes gateway")
@@ -3072,7 +3013,39 @@ def gateway_command(args):
        # Try service first, fall back to killing and restarting
        service_available = False
        system = getattr(args, 'system', False)
+        restart_all = getattr(args, 'all', False)
        service_configured = False
+
+        if restart_all:
+            # --all: stop every gateway process across all profiles, then start fresh
+            service_stopped = False
+            if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
+                try:
+                    systemd_stop(system=system)
+                    service_stopped = True
+                except subprocess.CalledProcessError:
+                    pass
+            elif is_macos() and get_launchd_plist_path().exists():
+                try:
+                    launchd_stop()
+                    service_stopped = True
+                except subprocess.CalledProcessError:
+                    pass
+            killed = kill_gateway_processes(all_profiles=True)
+            total = killed + (1 if service_stopped else 0)
+            if total:
+                print(f"✓ Stopped {total} gateway process(es) across all profiles")
+            _wait_for_gateway_exit(timeout=10.0, force_after=5.0)
+
+            # Start the current profile's service fresh
+            print("Starting gateway...")
+            if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
+                systemd_start(system=system)
+            elif is_macos() and get_launchd_plist_path().exists():
+                launchd_start()
+            else:
+                run_gateway(verbose=0)
+            return
        
        if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
            service_configured = True
@@ -4749,6 +4749,7 @@ For more help on a command:
    # gateway start
    gateway_start = gateway_subparsers.add_parser("start", help="Start the installed systemd/launchd background service")
    gateway_start.add_argument("--system", action="store_true", help="Target the Linux system-level gateway service")
+    gateway_start.add_argument("--all", action="store_true", help="Kill ALL stale gateway processes across all profiles before starting")
    
    # gateway stop
    gateway_stop = gateway_subparsers.add_parser("stop", help="Stop gateway service")
@@ -4758,6 +4759,7 @@ For more help on a command:
    # gateway restart
    gateway_restart = gateway_subparsers.add_parser("restart", help="Restart gateway service")
    gateway_restart.add_argument("--system", action="store_true", help="Target the Linux system-level gateway service")
+    gateway_restart.add_argument("--all", action="store_true", help="Kill ALL gateway processes across all profiles before restarting")
    
    # gateway status
    gateway_status = gateway_subparsers.add_parser("status", help="Show gateway status")
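
Reviewer note: the `--all` flag is registered with `action="store_true"` and read back with `getattr(args, 'all', False)`. The sketch below illustrates that pattern in isolation. The trimmed-down parser, the `build_parser` name, and the `dest` names are assumptions made for illustration and are not the real Hermes parser; `stop` is defined without the flag here only to show the `getattr` fallback.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, trimmed-down parser (not the real Hermes CLI definition).
    parser = argparse.ArgumentParser(prog="hermes")
    sub = parser.add_subparsers(dest="command")
    gateway = sub.add_parser("gateway")
    gateway_sub = gateway.add_subparsers(dest="gateway_command")
    for name in ("start", "restart"):
        cmd = gateway_sub.add_parser(name)
        cmd.add_argument("--system", action="store_true")
        cmd.add_argument("--all", action="store_true")
    # Defined without --all in this sketch, to exercise the getattr default.
    gateway_sub.add_parser("stop")
    return parser


def wants_all(argv: list) -> bool:
    args = build_parser().parse_args(argv)
    # getattr with a default keeps subcommands without the flag working.
    return getattr(args, "all", False)


if __name__ == "__main__":
    print(wants_all(["gateway", "restart", "--all"]))
    print(wants_all(["gateway", "stop"]))
```

Reading the flag through `getattr` means a subcommand that never registered `--all` falls through to `False` instead of raising `AttributeError`.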
@@ -121,6 +121,7 @@ TOOL_CATEGORIES = {
        "providers": [
            {
                "name": "Nous Subscription",
+                "badge": "subscription",
                "tag": "Managed OpenAI TTS billed to your subscription",
                "env_vars": [],
                "tts_provider": "openai",
@@ -130,13 +131,15 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Microsoft Edge TTS",
-                "tag": "Free - no API key needed",
+                "badge": "★ recommended · free",
+                "tag": "Good quality, no API key needed",
                "env_vars": [],
                "tts_provider": "edge",
            },
            {
                "name": "OpenAI TTS",
-                "tag": "Premium - high quality voices",
+                "badge": "paid",
+                "tag": "High quality voices",
                "env_vars": [
                    {"key": "VOICE_TOOLS_OPENAI_KEY", "prompt": "OpenAI API key", "url": "https://platform.openai.com/api-keys"},
                ],
@@ -144,7 +147,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "ElevenLabs",
-                "tag": "Premium - most natural voices",
+                "badge": "paid",
+                "tag": "Most natural voices",
                "env_vars": [
                    {"key": "ELEVENLABS_API_KEY", "prompt": "ElevenLabs API key", "url": "https://elevenlabs.io/app/settings/api-keys"},
                ],
@@ -152,7 +156,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Mistral (Voxtral TTS)",
-                "tag": "Multilingual, native Opus, needs MISTRAL_API_KEY",
+                "badge": "paid",
+                "tag": "Multilingual, native Opus",
                "env_vars": [
                    {"key": "MISTRAL_API_KEY", "prompt": "Mistral API key", "url": "https://console.mistral.ai/"},
                ],
@@ -168,6 +173,7 @@ TOOL_CATEGORIES = {
        "providers": [
            {
                "name": "Nous Subscription",
+                "badge": "subscription",
                "tag": "Managed Firecrawl billed to your subscription",
                "web_backend": "firecrawl",
                "env_vars": [],
@@ -177,7 +183,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Firecrawl Cloud",
-                "tag": "Hosted service - search, extract, and crawl",
+                "badge": "★ recommended",
+                "tag": "Full-featured search, extract, and crawl",
                "web_backend": "firecrawl",
                "env_vars": [
                    {"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
@@ -185,7 +192,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Exa",
-                "tag": "AI-native search and contents",
+                "badge": "paid",
+                "tag": "Neural search with semantic understanding",
                "web_backend": "exa",
                "env_vars": [
                    {"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
@@ -193,7 +201,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Parallel",
-                "tag": "AI-native search and extract",
+                "badge": "paid",
+                "tag": "AI-powered search and extract",
                "web_backend": "parallel",
                "env_vars": [
                    {"key": "PARALLEL_API_KEY", "prompt": "Parallel API key", "url": "https://parallel.ai"},
@@ -201,7 +210,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Tavily",
-                "tag": "AI-native search, extract, and crawl",
+                "badge": "free tier",
+                "tag": "Search, extract, and crawl — 1000 free searches/mo",
                "web_backend": "tavily",
                "env_vars": [
                    {"key": "TAVILY_API_KEY", "prompt": "Tavily API key", "url": "https://app.tavily.com/home"},
@@ -209,7 +219,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Firecrawl Self-Hosted",
-                "tag": "Free - run your own instance",
+                "badge": "free · self-hosted",
+                "tag": "Run your own Firecrawl instance (Docker)",
                "web_backend": "firecrawl",
                "env_vars": [
                    {"key": "FIRECRAWL_API_URL", "prompt": "Your Firecrawl instance URL (e.g., http://localhost:3002)"},
@@ -223,6 +234,7 @@ TOOL_CATEGORIES = {
        "providers": [
            {
                "name": "Nous Subscription",
+                "badge": "subscription",
                "tag": "Managed FAL image generation billed to your subscription",
                "env_vars": [],
                "requires_nous_auth": True,
@@ -231,6 +243,7 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "FAL.ai",
+                "badge": "paid",
                "tag": "FLUX 2 Pro with auto-upscaling",
                "env_vars": [
                    {"key": "FAL_KEY", "prompt": "FAL API key", "url": "https://fal.ai/dashboard/keys"},
@@ -244,6 +257,7 @@ TOOL_CATEGORIES = {
        "providers": [
            {
                "name": "Nous Subscription (Browser Use cloud)",
+                "badge": "subscription",
                "tag": "Managed Browser Use billed to your subscription",
                "env_vars": [],
                "browser_provider": "browser-use",
@@ -254,14 +268,16 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Local Browser",
-                "tag": "Free headless Chromium (no API key needed)",
+                "badge": "★ recommended · free",
+                "tag": "Headless Chromium, no API key needed",
                "env_vars": [],
                "browser_provider": "local",
                "post_setup": "agent_browser",
            },
            {
                "name": "Browserbase",
-                "tag": "Cloud browser with stealth & proxies",
+                "badge": "paid",
+                "tag": "Cloud browser with stealth and proxies",
                "env_vars": [
                    {"key": "BROWSERBASE_API_KEY", "prompt": "Browserbase API key", "url": "https://browserbase.com"},
                    {"key": "BROWSERBASE_PROJECT_ID", "prompt": "Browserbase project ID"},
@@ -271,6 +287,7 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Browser Use",
+                "badge": "paid",
                "tag": "Cloud browser with remote execution",
                "env_vars": [
                    {"key": "BROWSER_USE_API_KEY", "prompt": "Browser Use API key", "url": "https://browser-use.com"},
@@ -280,6 +297,7 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Firecrawl",
+                "badge": "paid",
                "tag": "Cloud browser with remote execution",
                "env_vars": [
                    {"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
@@ -289,7 +307,8 @@ TOOL_CATEGORIES = {
            },
            {
                "name": "Camofox",
-                "tag": "Local anti-detection browser (Firefox/Camoufox)",
+                "badge": "free · local",
+                "tag": "Anti-detection browser (Firefox/Camoufox)",
                "env_vars": [
                    {"key": "CAMOFOX_URL", "prompt": "Camofox server URL", "default": "http://localhost:9377",
                     "url": "https://github.com/jo-inc/camofox-browser"},
@@ -838,7 +857,8 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
        # Plain text labels only (no ANSI codes in menu items)
        provider_choices = []
        for p in providers:
-            tag = f" ({p['tag']})" if p.get("tag") else ""
+            badge = f" [{p['badge']}]" if p.get("badge") else ""
+            tag = f" — {p['tag']}" if p.get("tag") else ""
            configured = ""
            env_vars = p.get("env_vars", [])
            if not env_vars or all(get_env_value(v["key"]) for v in env_vars):
@@ -848,7 +868,7 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
                    configured = ""
                else:
                    configured = " [configured]"
-            provider_choices.append(f"{p['name']}{tag}{configured}")
+            provider_choices.append(f"{p['name']}{badge}{tag}{configured}")

        # Add skip option
        provider_choices.append("Skip — keep defaults / configure later")
@@ -1104,7 +1124,8 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):

        provider_choices = []
        for p in providers:
-            tag = f" ({p['tag']})" if p.get("tag") else ""
+            badge = f" [{p['badge']}]" if p.get("badge") else ""
+            tag = f" — {p['tag']}" if p.get("tag") else ""
            configured = ""
            env_vars = p.get("env_vars", [])
            if not env_vars or all(get_env_value(v["key"]) for v in env_vars):
@@ -1114,7 +1135,7 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
                    configured = ""
                else:
                    configured = " [configured]"
-            provider_choices.append(f"{p['name']}{tag}{configured}")
+            provider_choices.append(f"{p['name']}{badge}{tag}{configured}")

        default_idx = _detect_active_provider_index(providers, config)
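
Reviewer note: both menu builders now compose each label as name, bracketed badge, tag, then the configured marker. A minimal sketch of that composition follows. The `provider_label` helper and its `configured` parameter are hypothetical; the real code builds the string inline and derives the configured state from environment variables.

```python
def provider_label(p: dict, configured: bool = False) -> str:
    """Compose a plain-text menu label from a provider entry (illustrative helper)."""
    badge = f" [{p['badge']}]" if p.get("badge") else ""
    tag = f" — {p['tag']}" if p.get("tag") else ""
    suffix = " [configured]" if configured else ""
    return f"{p['name']}{badge}{tag}{suffix}"


if __name__ == "__main__":
    edge = {
        "name": "Microsoft Edge TTS",
        "badge": "★ recommended · free",
        "tag": "Good quality, no API key needed",
    }
    print(provider_label(edge))
    print(provider_label({"name": "FAL.ai", "badge": "paid"}, configured=True))
```

Entries without a `badge` key still render, because both the badge and the tag fall back to an empty string.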

@@ -358,6 +358,7 @@ def _add_rotating_handler(
    path.parent.mkdir(parents=True, exist_ok=True)
    handler = _ManagedRotatingFileHandler(
        str(path), maxBytes=max_bytes, backupCount=backup_count,
+        encoding="utf-8",
    )
    handler.setLevel(level)
    handler.setFormatter(formatter)
@@ -4754,6 +4754,14 @@ class AIAgent:
                )
                self._swap_credential(next_entry)
                return True, False
+            # All credentials for this provider are exhausted due to auth failure.
+            # Emit an actionable notification so the user knows how to fix it.
+            _provider_label = getattr(self, "provider", "unknown")
+            self._emit_status(
+                f"🔐 All {_provider_label} credentials rejected (HTTP {rotate_status}). "
+                f"Run `hermes auth reset {_provider_label}` to clear, "
+                f"or `hermes model` to re-authenticate."
+            )

        return False, has_retried_429

@@ -6975,6 +6983,31 @@ class AIAgent:
                skip_pre_tool_call_hook=True,
            )

+    @staticmethod
+    def _wrap_verbose(label: str, text: str, indent: str = "     ") -> str:
+        """Word-wrap verbose tool output to fit the terminal width.
+
+        Splits *text* on existing newlines and wraps each line individually,
+        preserving intentional line breaks (e.g. pretty-printed JSON).
+        Returns a ready-to-print string with *label* on the first line and
+        continuation lines indented.
+        """
+        import shutil as _shutil
+        import textwrap as _tw
+        cols = _shutil.get_terminal_size((120, 24)).columns
+        wrap_width = max(40, cols - len(indent))
+        out_lines: list[str] = []
+        for raw_line in text.split("\n"):
+            if len(raw_line) <= wrap_width:
+                out_lines.append(raw_line)
+            else:
+                wrapped = _tw.wrap(raw_line, width=wrap_width,
+                                   break_long_words=True,
+                                   break_on_hyphens=False)
+                out_lines.extend(wrapped or [raw_line])
+        body = ("\n" + indent).join(out_lines)
+        return f"{indent}{label}{body}"
+
    def _execute_tool_calls_concurrent(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
        """Execute multiple tool calls concurrently using a thread pool.

@@ -7045,7 +7078,7 @@ class AIAgent:
                args_str = json.dumps(args, ensure_ascii=False)
                if self.verbose_logging:
                    print(f"  📞 Tool {i}: {name}({list(args.keys())})")
-                    print(f"     Args: {args_str}")
+                    print(self._wrap_verbose("Args: ", json.dumps(args, indent=2, ensure_ascii=False)))
                else:
                    args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
                    print(f"  📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
@@ -7143,7 +7176,7 @@ class AIAgent:
            elif not self.quiet_mode:
                if self.verbose_logging:
                    print(f"  ✅ Tool {i+1} completed in {tool_duration:.2f}s")
-                    print(f"     Result: {function_result}")
+                    print(self._wrap_verbose("Result: ", function_result))
                else:
                    response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
                    print(f"  ✅ Tool {i+1} completed in {tool_duration:.2f}s - {response_preview}")
@@ -7236,7 +7269,7 @@ class AIAgent:
                args_str = json.dumps(function_args, ensure_ascii=False)
                if self.verbose_logging:
                    print(f"  📞 Tool {i}: {function_name}({list(function_args.keys())})")
-                    print(f"     Args: {args_str}")
+                    print(self._wrap_verbose("Args: ", json.dumps(function_args, indent=2, ensure_ascii=False)))
                else:
                    args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
                    print(f"  📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
@@ -7524,7 +7557,7 @@ class AIAgent:
            if not self.quiet_mode:
                if self.verbose_logging:
                    print(f"  ✅ Tool {i} completed in {tool_duration:.2f}s")
-                    print(f"     Result: {function_result}")
+                    print(self._wrap_verbose("Result: ", function_result))
                else:
                    response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
                    print(f"  ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
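
Reviewer note: a standalone sketch of the wrapping behaviour introduced by `_wrap_verbose`, for trying it outside the agent class. The explicit `width` parameter is an assumption added here so the result does not depend on the current terminal; the method in the diff always queries the terminal size.

```python
import shutil
import textwrap
from typing import Optional


def wrap_verbose(label: str, text: str, indent: str = "     ",
                 width: Optional[int] = None) -> str:
    """Wrap each input line separately so existing line breaks survive."""
    # `width` is a hypothetical override; fall back to the terminal size.
    cols = width if width is not None else shutil.get_terminal_size((120, 24)).columns
    wrap_width = max(40, cols - len(indent))
    out_lines = []
    for raw_line in text.split("\n"):
        if len(raw_line) <= wrap_width:
            out_lines.append(raw_line)
        else:
            wrapped = textwrap.wrap(raw_line, width=wrap_width,
                                    break_long_words=True,
                                    break_on_hyphens=False)
            out_lines.extend(wrapped or [raw_line])
    # The label sits on the first line; continuation lines are indented.
    body = ("\n" + indent).join(out_lines)
    return f"{indent}{label}{body}"


if __name__ == "__main__":
    print(wrap_verbose("Result: ", "x" * 100, width=45))
```

Because `wrap_width` subtracts only the indent, the first line (indent plus label plus text) can exceed the terminal width by up to the label's length. That appears to be true of the method in the diff as well.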
@@ -8962,12 +8995,35 @@ class AIAgent:
                            if isinstance(_default_headers, dict):
                                _headers_sanitized = _sanitize_structure_non_ascii(_default_headers)

+                            # Sanitize the API key — non-ASCII characters in
+                            # credentials (e.g. ʋ instead of v from a bad
+                            # copy-paste) cause httpx to fail when encoding
+                            # the Authorization header as ASCII.  This is the
+                            # most common cause of persistent UnicodeEncodeError
+                            # that survives message/tool sanitization (#6843).
+                            _credential_sanitized = False
+                            _raw_key = getattr(self, "api_key", None) or ""
+                            if _raw_key:
+                                _clean_key = _strip_non_ascii(_raw_key)
+                                if _clean_key != _raw_key:
+                                    self.api_key = _clean_key
+                                    if isinstance(getattr(self, "_client_kwargs", None), dict):
+                                        self._client_kwargs["api_key"] = _clean_key
+                                    _credential_sanitized = True
+                                    self._vprint(
+                                        f"{self.log_prefix}⚠️  API key contained non-ASCII characters "
+                                        f"(bad copy-paste?) — stripped them. If auth fails, "
+                                        f"re-copy the key from your provider's dashboard.",
+                                        force=True,
+                                    )
+
                            if (
                                _messages_sanitized
                                or _prefill_sanitized
                                or _tools_sanitized
                                or _system_sanitized
                                or _headers_sanitized
+                                or _credential_sanitized
                            ):
                                self._unicode_sanitization_passes += 1
                                self._vprint(
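
Reviewer note: the credential sanitization above relies on `_strip_non_ascii`, whose body is not part of this diff. The sketch below shows one plausible way such a helper could behave and why a stray non-ASCII character breaks ASCII header encoding. The `strip_non_ascii` implementation and the sample key are assumptions for illustration, not the project's code.

```python
def strip_non_ascii(value: str) -> str:
    """Drop every character outside the ASCII range (assumed behaviour)."""
    return value.encode("ascii", errors="ignore").decode("ascii")


def sanitize_key(raw_key: str):
    """Return the cleaned key and whether anything was removed."""
    clean_key = strip_non_ascii(raw_key)
    return clean_key, clean_key != raw_key


if __name__ == "__main__":
    raw = "sk-liʋe-1234"  # made-up key with a bad copy-paste character
    try:
        raw.encode("ascii")  # an ASCII-only header would hit this error
    except UnicodeEncodeError as exc:
        print(f"encoding failed: {exc.reason}")
    print(sanitize_key(raw))
```

Stripping silently changes the key, so authentication may still fail afterwards. That is presumably why the diff prints a forced warning telling the user to re-copy the key.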
@@ -95,7 +95,9 @@ AUTHOR_MAP = {
    "vincentcharlebois@gmail.com": "vincentcharlebois",
    "aryan@synvoid.com": "aryansingh",
    "johnsonblake1@gmail.com": "blakejohnson",
+    "greer.guthrie@gmail.com": "g-guthrie",
    "kennyx102@gmail.com": "bobashopcashier",
+    "shokatalishaikh95@gmail.com": "areu01or00",
    "bryan@intertwinesys.com": "bryanyoung",
    "christo.mitov@gmail.com": "christomitov",
    "hermes@nousresearch.com": "NousResearch",
@@ -115,6 +117,7 @@ AUTHOR_MAP = {
    "m@statecraft.systems": "mbierling",
    "balyan.sid@gmail.com": "balyansid",
    "oluwadareab12@gmail.com": "bennytimz",
+    "simon@simonmarcus.org": "simon-marcus",
    # ── bulk addition: 75 emails resolved via API, PR salvage bodies, noreply
    #    crossref, and GH contributor list matching (April 2026 audit) ──
    "1115117931@qq.com": "aaronagent",
@@ -1,35 +1,19 @@
 ---
 name: google-workspace
-description: Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration via gws CLI (googleworkspace/cli). Uses OAuth2 with automatic token refresh via bridge script. Requires gws binary.
-version: 2.0.0
+description: Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses Hermes-managed OAuth2 setup, prefers the Google Workspace CLI (`gws`) when available for broader API coverage, and falls back to the Python client libraries otherwise.
+version: 1.0.0
 author: Nous Research
 license: MIT
-required_credential_files:
-  - path: google_token.json
-    description: Google OAuth2 token (created by setup script)
-  - path: google_client_secret.json
-    description: Google OAuth2 client credentials (downloaded from Google Cloud Console)
 metadata:
  hermes:
-    tags: [Google, Gmail, Calendar, Drive, Sheets, Docs, Contacts, Email, OAuth, gws]
+    tags: [Google, Gmail, Calendar, Drive, Sheets, Docs, Contacts, Email, OAuth]
    homepage: https://github.com/NousResearch/hermes-agent
    related_skills: [himalaya]
 ---

 # Google Workspace

-Gmail, Calendar, Drive, Contacts, Sheets, and Docs — powered by `gws` (Google's official Rust CLI). The skill provides a backward-compatible Python wrapper that handles OAuth token refresh and delegates to `gws`.
-
-## Architecture
-
-```
-google_api.py  →  gws_bridge.py  →  gws CLI
-(argparse compat)  (token refresh)    (Google APIs)
-```
-
- `setup.py` handles OAuth2 (headless-compatible, works on CLI/Telegram/Discord)
- `gws_bridge.py` refreshes the Hermes token and injects it into `gws` via `GOOGLE_WORKSPACE_CLI_TOKEN`
- `google_api.py` provides the same CLI interface as v1 but delegates to `gws`
+Gmail, Calendar, Drive, Contacts, Sheets, and Docs — through Hermes-managed OAuth and a thin CLI wrapper. When `gws` is installed, the skill uses it as the execution backend for broader Google Workspace coverage; otherwise it falls back to the bundled Python client implementation.

 ## References

@@ -38,22 +22,7 @@ google_api.py  →  gws_bridge.py  →  gws CLI
 ## Scripts

 - `scripts/setup.py` — OAuth2 setup (run once to authorize)
- `scripts/gws_bridge.py` — Token refresh bridge to gws CLI
- `scripts/google_api.py` — Backward-compatible API wrapper (delegates to gws)
-
-## Prerequisites
-
-Install `gws`:
-
-```bash
-cargo install google-workspace-cli
-# or via npm (recommended, downloads prebuilt binary):
-npm install -g @googleworkspace/cli
-# or via Homebrew:
-brew install googleworkspace-cli
-```
-
-Verify: `gws --version`
+- `scripts/google_api.py` — compatibility wrapper CLI. It prefers `gws` for operations when available, while preserving Hermes' existing JSON output contract.

 ## First-Time Setup

@@ -63,13 +32,7 @@ on CLI, Telegram, Discord, or any platform.
 Define a shorthand first:

 ```bash
-HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
-GWORKSPACE_SKILL_DIR="$HERMES_HOME/skills/productivity/google-workspace"
-PYTHON_BIN="${HERMES_PYTHON:-python3}"
-if [ -x "$HERMES_HOME/hermes-agent/venv/bin/python" ]; then
-  PYTHON_BIN="$HERMES_HOME/hermes-agent/venv/bin/python"
-fi
-GSETUP="$PYTHON_BIN $GWORKSPACE_SKILL_DIR/scripts/setup.py"
+GSETUP="python ~/.hermes/skills/productivity/google-workspace/scripts/setup.py"
 ```

 ### Step 0: Check if already set up
@@ -82,88 +45,166 @@ If it prints `AUTHENTICATED`, skip to Usage — setup is already done.

 ### Step 1: Triage — ask the user what they need

+Before starting OAuth setup, ask the user TWO questions:
+
 **Question 1: "What Google services do you need? Just email, or also
 Calendar/Drive/Sheets/Docs?"**

- **Email only** → Use the `himalaya` skill instead — simpler setup.
- **Calendar, Drive, Sheets, Docs (or email + these)** → Continue below.
+- **Email only** → They don't need this skill at all. Use the `himalaya` skill
+  instead — it works with a Gmail App Password (Settings → Security → App
+  Passwords) and takes 2 minutes to set up. No Google Cloud project needed.
+  Load the himalaya skill and follow its setup instructions.

-**Partial scopes**: Users can authorize only a subset of services. The setup
-script accepts partial scopes and warns about missing ones.
+- **Email + Calendar** → Continue with this skill, but use
+  `--services email,calendar` during auth so the consent screen only asks for
+  the scopes they actually need.

-**Question 2: "Does your Google account use Advanced Protection?"**
+- **Calendar/Drive/Sheets/Docs only** → Continue with this skill and use a
+  narrower `--services` set like `calendar,drive,sheets,docs`.

- **No / Not sure** → Normal setup.
- **Yes** → Workspace admin must add the OAuth client ID to allowed apps first.
+- **Full Workspace access** → Continue with this skill and use the default
+  `all` service set.
+
+**Question 2: "Does your Google account use Advanced Protection (hardware
+security keys required to sign in)? If you're not sure, you probably don't
+— it's something you would have explicitly enrolled in."**
+
+- **No / Not sure** → Normal setup. Continue below.
+- **Yes** → Their Workspace admin must add the OAuth client ID to the org's
+  allowed apps list before Step 4 will work. Let them know upfront.

 ### Step 2: Create OAuth credentials (one-time, ~5 minutes)

 Tell the user:

-> 1. Go to https://console.cloud.google.com/apis/credentials
-> 2. Create a project (or use an existing one)
-> 3. Enable the APIs you need (Gmail, Calendar, Drive, Sheets, Docs, People)
-> 4. Credentials → Create Credentials → OAuth 2.0 Client ID → Desktop app
-> 5. Download JSON and tell me the file path
+> You need a Google Cloud OAuth client. This is a one-time setup:
+>
+> 1. Create or select a project:
+>    https://console.cloud.google.com/projectselector2/home/dashboard
+> 2. Enable the required APIs from the API Library:
+>    https://console.cloud.google.com/apis/library
+>    Enable: Gmail API, Google Calendar API, Google Drive API,
+>    Google Sheets API, Google Docs API, People API
+> 3. Create the OAuth client here:
+>    https://console.cloud.google.com/apis/credentials
+>    Credentials → Create Credentials → OAuth 2.0 Client ID
+> 4. Application type: "Desktop app" → Create
+> 5. If the app is still in Testing, add the user's Google account as a test user here:
+>    https://console.cloud.google.com/auth/audience
+>    Audience → Test users → Add users
+> 6. Download the JSON file and tell me the file path
+>
+> Important Hermes CLI note: if the file path starts with `/`, do NOT send only the bare path as its own message in the CLI, because it can be mistaken for a slash command. Send it in a sentence instead, like:
+> `The JSON file path is: /home/user/Downloads/client_secret_....json`
+
+Once they provide the path:

 ```bash
 $GSETUP --client-secret /path/to/client_secret.json
 ```

+If they paste the raw client ID / client secret values instead of a file path,
+write a valid Desktop OAuth JSON file for them yourself, save it somewhere
+explicit (for example `~/Downloads/hermes-google-client-secret.json`), then run
+`--client-secret` against that file.
+
 ### Step 3: Get authorization URL

+Use the service set chosen in Step 1. Examples:
+
 ```bash
-$GSETUP --auth-url
+$GSETUP --auth-url --services email,calendar --format json
+$GSETUP --auth-url --services calendar,drive,sheets,docs --format json
+$GSETUP --auth-url --services all --format json
 ```

-Send the URL to the user. After authorizing, they paste back the redirect URL or code.
+This returns JSON with an `auth_url` field and also saves the exact URL to
+`~/.hermes/google_oauth_last_url.txt`.
+
+Agent rules for this step:
+- Extract the `auth_url` field and send that exact URL to the user as a single line.
+- Tell the user that the browser will likely fail on `http://localhost:1` after approval, and that this is expected.
+- Tell them to copy the ENTIRE redirected URL from the browser address bar.
+- If the user gets `Error 403: access_denied`, send them directly to `https://console.cloud.google.com/auth/audience` to add themselves as a test user.

 ### Step 4: Exchange the code

+The user will paste back either a URL like `http://localhost:1/?code=4/0A...&scope=...`
+or just the code string. Either works. The `--auth-url` step stores a temporary
+pending OAuth session locally so `--auth-code` can complete the PKCE exchange
+later, even on headless systems:
+
 ```bash
-$GSETUP --auth-code "THE_URL_OR_CODE_THE_USER_PASTED"
+$GSETUP --auth-code "THE_URL_OR_CODE_THE_USER_PASTED" --format json
 ```

+If `--auth-code` fails because the code expired, was already used, or came from
+an older browser tab, it now returns a `fresh_auth_url`. In that case,
+immediately send the new URL to the user and have them retry with the newest
+browser redirect only.
+
 ### Step 5: Verify

 ```bash
 $GSETUP --check
 ```

-Should print `AUTHENTICATED`. Token refreshes automatically from now on.
+Should print `AUTHENTICATED`. Setup is complete — token refreshes automatically from now on.
+
+### Notes
+
+- Token is stored at `~/.hermes/google_token.json` and auto-refreshes.
+- Pending OAuth session state/verifier are stored temporarily at `~/.hermes/google_oauth_pending.json` until exchange completes.
+- If `gws` is installed, `google_api.py` points it at the same `~/.hermes/google_token.json` credentials file. Users do not need to run a separate `gws auth login` flow.
+- To revoke: `$GSETUP --revoke`

 ## Usage

-All commands go through the API script:
+All commands go through the API script. Set `GAPI` as a shorthand:

 ```bash
-HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
-GWORKSPACE_SKILL_DIR="$HERMES_HOME/skills/productivity/google-workspace"
-PYTHON_BIN="${HERMES_PYTHON:-python3}"
-if [ -x "$HERMES_HOME/hermes-agent/venv/bin/python" ]; then
-  PYTHON_BIN="$HERMES_HOME/hermes-agent/venv/bin/python"
-fi
-GAPI="$PYTHON_BIN $GWORKSPACE_SKILL_DIR/scripts/google_api.py"
+GAPI="python $HOME/.hermes/skills/productivity/google-workspace/scripts/google_api.py"
 ```

 ### Gmail

 ```bash
+# Search (returns JSON array with id, from, subject, date, snippet)
 $GAPI gmail search "is:unread" --max 10
+$GAPI gmail search "from:boss@company.com newer_than:1d"
+$GAPI gmail search "has:attachment filename:pdf newer_than:7d"
+
+# Read full message (returns JSON with body text)
 $GAPI gmail get MESSAGE_ID
+
+# Send
 $GAPI gmail send --to user@example.com --subject "Hello" --body "Message text"
-$GAPI gmail send --to user@example.com --subject "Report" --body "<h1>Q4</h1>" --html
+$GAPI gmail send --to user@example.com --subject "Report" --body "<h1>Q4</h1><p>Details...</p>" --html
+$GAPI gmail send --to user@example.com --subject "Hello" --from '"Research Agent" <user@example.com>' --body "Message text"
+
+# Reply (automatically threads and sets In-Reply-To)
 $GAPI gmail reply MESSAGE_ID --body "Thanks, that works for me."
+$GAPI gmail reply MESSAGE_ID --from '"Support Bot" <user@example.com>' --body "Thanks"
+
+# Labels
 $GAPI gmail labels
 $GAPI gmail modify MESSAGE_ID --add-labels LABEL_ID
+$GAPI gmail modify MESSAGE_ID --remove-labels UNREAD
 ```

 ### Calendar

 ```bash
+# List events (defaults to next 7 days)
 $GAPI calendar list
-$GAPI calendar create --summary "Standup" --start 2026-03-01T10:00:00+01:00 --end 2026-03-01T10:30:00+01:00
-$GAPI calendar create --summary "Review" --start ... --end ... --attendees "alice@co.com,bob@co.com"
+$GAPI calendar list --start 2026-03-01T00:00:00Z --end 2026-03-07T23:59:59Z
+
+# Create event (ISO 8601 with timezone required)
+$GAPI calendar create --summary "Team Standup" --start 2026-03-01T10:00:00-06:00 --end 2026-03-01T10:30:00-06:00
+$GAPI calendar create --summary "Lunch" --start 2026-03-01T12:00:00Z --end 2026-03-01T13:00:00Z --location "Cafe"
+$GAPI calendar create --summary "Review" --start 2026-03-01T14:00:00Z --end 2026-03-01T15:00:00Z --attendees "alice@co.com,bob@co.com"
+
+# Delete event
 $GAPI calendar delete EVENT_ID
 ```

@@ -183,8 +224,13 @@ $GAPI contacts list --max 20
 ### Sheets

 ```bash
+# Read
 $GAPI sheets get SHEET_ID "Sheet1!A1:D10"
+
+# Write
 $GAPI sheets update SHEET_ID "Sheet1!A1:B2" --values '[["Name","Score"],["Alice","95"]]'
+
+# Append rows
 $GAPI sheets append SHEET_ID "Sheet1!A:C" --values '[["new","row","data"]]'
 ```

@@ -194,52 +240,37 @@ $GAPI sheets append SHEET_ID "Sheet1!A:C" --values '[["new","row","data"]]'
 $GAPI docs get DOC_ID
 ```

-### Direct gws access (advanced)
-
-For operations not covered by the wrapper, use `gws_bridge.py` directly:
-
-```bash
-GBRIDGE="$PYTHON_BIN $GWORKSPACE_SKILL_DIR/scripts/gws_bridge.py"
-$GBRIDGE calendar +agenda --today --format table
-$GBRIDGE gmail +triage --labels --format json
-$GBRIDGE drive +upload ./report.pdf
-$GBRIDGE sheets +read --spreadsheet SHEET_ID --range "Sheet1!A1:D10"
-```
-
 ## Output Format

-All commands return JSON via `gws --format json`. Key output shapes:
+All commands return JSON. Parse with `jq` or read directly. Key fields:

- **Gmail search/triage**: Array of message summaries (sender, subject, date, snippet)
- **Gmail get/read**: Message object with headers and body text
- **Gmail send/reply**: Confirmation with message ID
- **Calendar list/agenda**: Array of event objects (summary, start, end, location)
- **Calendar create**: Confirmation with event ID and htmlLink
- **Drive search**: Array of file objects (id, name, mimeType, webViewLink)
- **Sheets get/read**: 2D array of cell values
- **Docs get**: Full document JSON (use `body.content` for text extraction)
- **Contacts list**: Array of person objects with names, emails, phones
-
-Parse output with `jq` or read JSON directly.
+- **Gmail search**: `[{id, threadId, from, to, subject, date, snippet, labels}]`
+- **Gmail get**: `{id, threadId, from, to, subject, date, labels, body}`
+- **Gmail send/reply**: `{status: "sent", id, threadId}`
+- **Calendar list**: `[{id, summary, start, end, location, description, htmlLink}]`
+- **Calendar create**: `{status: "created", id, summary, htmlLink}`
+- **Drive search**: `[{id, name, mimeType, modifiedTime, webViewLink}]`
+- **Contacts list**: `[{name, emails: [...], phones: [...]}]`
+- **Sheets get**: `[[cell, cell, ...], ...]`
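
A consumer of these shapes might parse the Gmail search output as in the sketch below. The sample payload is made up for illustration and uses only the fields listed above; `unread_subjects` is a hypothetical helper, not part of the skill.

```python
import json

# Made-up sample in the documented Gmail search shape; not real output.
SAMPLE_SEARCH_OUTPUT = """
[
  {"id": "m1", "threadId": "t1", "from": "alice@example.com",
   "to": "me@example.com", "subject": "Q4 report",
   "date": "Mon, 02 Mar 2026 09:00:00 +0000",
   "snippet": "Draft attached", "labels": ["INBOX", "UNREAD"]},
  {"id": "m2", "threadId": "t2", "from": "bob@example.com",
   "to": "me@example.com", "subject": "Lunch?",
   "date": "Mon, 02 Mar 2026 10:00:00 +0000",
   "snippet": "Are you free", "labels": ["INBOX"]}
]
"""


def unread_subjects(raw_json: str) -> list[str]:
    """Return the subjects of messages that still carry the UNREAD label."""
    messages = json.loads(raw_json)
    return [m["subject"] for m in messages if "UNREAD" in m.get("labels", [])]


print(unread_subjects(SAMPLE_SEARCH_OUTPUT))
```

The same pattern applies to the other array-shaped outputs: load the JSON once, then filter on the documented keys.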

 ## Rules

-1. **Never send email or create/delete events without confirming with the user first.**
-2. **Check auth before first use** — run `setup.py --check`.
-3. **Use the Gmail search syntax reference** for complex queries.
-4. **Calendar times must include timezone** — ISO 8601 with offset or UTC.
-5. **Respect rate limits** — avoid rapid-fire sequential API calls.
+1. **Never send email or create/delete events without confirming with the user first.** Show the draft content and ask for approval.
+2. **Check auth before first use** — run `setup.py --check`. If it fails, guide the user through setup.
+3. **Use the Gmail search syntax reference** for complex queries — load it with `skill_view("google-workspace", file_path="references/gmail-search-syntax.md")`.
+4. **Calendar times must include timezone** — always use ISO 8601 with offset (e.g., `2026-03-01T10:00:00-06:00`) or UTC (`Z`).
+5. **Respect rate limits** — avoid rapid-fire sequential API calls. Batch reads when possible.
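
Rule 4 can be checked before calling `calendar create`. The sketch below is one possible pre-flight check, not something the skill ships: `has_timezone` is a hypothetical helper built only on the standard library.

```python
from datetime import datetime


def has_timezone(value: str) -> bool:
    """Return True if an ISO 8601 datetime string carries an offset or `Z`."""
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    normalized = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(normalized)
    except ValueError:
        # Not a parseable ISO 8601 datetime at all.
        return False
    return parsed.tzinfo is not None
```

Rejecting naive timestamps up front avoids creating events in an unintended timezone.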

 ## Troubleshooting

 | Problem | Fix |
 |---------|-----|
-| `NOT_AUTHENTICATED` | Run setup Steps 2-5 |
-| `REFRESH_FAILED` | Token revoked — redo Steps 3-5 |
-| `gws: command not found` | Install: `npm install -g @googleworkspace/cli` |
-| `HttpError 403` | Missing scope — `$GSETUP --revoke` then redo Steps 3-5 |
-| `HttpError 403: Access Not Configured` | Enable API in Google Cloud Console |
-| Advanced Protection blocks auth | Admin must allowlist the OAuth client ID |
+| `NOT_AUTHENTICATED` | Run setup Steps 2-5 above |
+| `REFRESH_FAILED` | Token revoked or expired — redo Steps 3-5 |
+| `HttpError 403: Insufficient Permission` | Missing API scope — `$GSETUP --revoke` then redo Steps 3-5 |
+| `HttpError 403: Access Not Configured` | API not enabled — user needs to enable it in Google Cloud Console |
+| `ModuleNotFoundError` | Run `$GSETUP --install-deps` |
+| Advanced Protection blocks auth | Workspace admin must allowlist the OAuth client ID |

 ## Revoking Access

@@ -1,17 +1,17 @@
 #!/usr/bin/env python3
 """Google Workspace API CLI for Hermes Agent.

-Thin wrapper that delegates to gws (googleworkspace/cli) via gws_bridge.py.
-Maintains the same CLI interface for backward compatibility with Hermes skills.
+Uses the Google Workspace CLI (`gws`) when available, but preserves the
+existing Hermes-facing JSON contract and falls back to the Python client
+libraries if `gws` is not installed.

 Usage:
  python google_api.py gmail search "is:unread" [--max 10]
  python google_api.py gmail get MESSAGE_ID
  python google_api.py gmail send --to user@example.com --subject "Hi" --body "Hello"
  python google_api.py gmail reply MESSAGE_ID --body "Thanks"
-  python google_api.py calendar list [--start DATE] [--end DATE] [--calendar primary]
+  python google_api.py calendar list [--from DATE] [--to DATE] [--calendar primary]
  python google_api.py calendar create --summary "Meeting" --start DATETIME --end DATETIME
-  python google_api.py calendar delete EVENT_ID
  python google_api.py drive search "budget report" [--max 10]
  python google_api.py contacts list [--max 20]
  python google_api.py sheets get SHEET_ID RANGE
@@ -21,47 +21,396 @@ Usage:
 """

 import argparse
+import base64
 import json
 import os
+import shutil
 import subprocess
 import sys
+from datetime import datetime, timedelta, timezone
+from email.mime.text import MIMEText
 from pathlib import Path

-BRIDGE = Path(__file__).parent / "gws_bridge.py"
-PYTHON = sys.executable
+HERMES_HOME = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
+TOKEN_PATH = HERMES_HOME / "google_token.json"
+CLIENT_SECRET_PATH = HERMES_HOME / "google_client_secret.json"
+
+SCOPES = [
+    "https://www.googleapis.com/auth/gmail.readonly",
+    "https://www.googleapis.com/auth/gmail.send",
+    "https://www.googleapis.com/auth/gmail.modify",
+    "https://www.googleapis.com/auth/calendar",
+    "https://www.googleapis.com/auth/drive.readonly",
+    "https://www.googleapis.com/auth/contacts.readonly",
+    "https://www.googleapis.com/auth/spreadsheets",
+    "https://www.googleapis.com/auth/documents.readonly",
+]


-def gws(*args: str) -> None:
-    """Call gws via the bridge and exit with its return code."""
+def _ensure_authenticated():
+    if not TOKEN_PATH.exists():
+        print("Not authenticated. Run the setup script first:", file=sys.stderr)
+        print(f"  python {Path(__file__).parent / 'setup.py'}", file=sys.stderr)
+        sys.exit(1)
+
+
+def _stored_token_scopes() -> list[str]:
+    try:
+        data = json.loads(TOKEN_PATH.read_text())
+    except Exception:
+        return list(SCOPES)
+    scopes = data.get("scopes")
+    if isinstance(scopes, list) and scopes:
+        return scopes
+    return list(SCOPES)
+
+
+def _gws_binary() -> str | None:
+    override = os.getenv("HERMES_GWS_BIN")
+    if override:
+        return override
+    return shutil.which("gws")
+
+
+def _gws_env() -> dict[str, str]:
+    env = os.environ.copy()
+    env["GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE"] = str(TOKEN_PATH)
+    return env
+
+
+def _run_gws(parts: list[str], *, params: dict | None = None, body: dict | None = None):
+    binary = _gws_binary()
+    if not binary:
+        raise RuntimeError("gws not installed")
+
+    _ensure_authenticated()
+
+    cmd = [binary, *parts]
+    if params is not None:
+        cmd.extend(["--params", json.dumps(params)])
+    if body is not None:
+        cmd.extend(["--json", json.dumps(body)])
+
    result = subprocess.run(
-        [PYTHON, str(BRIDGE)] + list(args),
-        env={**os.environ, "HERMES_HOME": os.environ.get("HERMES_HOME", str(Path.home() / ".hermes"))},
+        cmd,
+        capture_output=True,
+        text=True,
+        env=_gws_env(),
    )
-    sys.exit(result.returncode)
+    if result.returncode != 0:
+        err = result.stderr.strip() or result.stdout.strip() or "Unknown gws error"
+        print(err, file=sys.stderr)
+        sys.exit(result.returncode or 1)
+
+    stdout = result.stdout.strip()
+    if not stdout:
+        return {}
+
+    try:
+        return json.loads(stdout)
+    except json.JSONDecodeError:
+        print("ERROR: Unexpected non-JSON output from gws:", file=sys.stderr)
+        print(stdout, file=sys.stderr)
+        sys.exit(1)


-# -- Gmail --
+def _headers_dict(msg: dict) -> dict[str, str]:
+    return {h["name"]: h["value"] for h in msg.get("payload", {}).get("headers", [])}
+
+
+def _extract_message_body(msg: dict) -> str:
+    body = ""
+    payload = msg.get("payload", {})
+    if payload.get("body", {}).get("data"):
+        body = base64.urlsafe_b64decode(payload["body"]["data"]).decode("utf-8", errors="replace")
+    elif payload.get("parts"):
+        for part in payload["parts"]:
+            if part.get("mimeType") == "text/plain" and part.get("body", {}).get("data"):
+                body = base64.urlsafe_b64decode(part["body"]["data"]).decode("utf-8", errors="replace")
+                break
+        if not body:
+            for part in payload["parts"]:
+                if part.get("mimeType") == "text/html" and part.get("body", {}).get("data"):
+                    body = base64.urlsafe_b64decode(part["body"]["data"]).decode("utf-8", errors="replace")
+                    break
+    return body
+
+
+def _extract_doc_text(doc: dict) -> str:
+    text_parts = []
+    for element in doc.get("body", {}).get("content", []):
+        paragraph = element.get("paragraph", {})
+        for pe in paragraph.get("elements", []):
+            text_run = pe.get("textRun", {})
+            if text_run.get("content"):
+                text_parts.append(text_run["content"])
+    return "".join(text_parts)
+
+
+def _datetime_with_timezone(value: str) -> str:
+    if not value:
+        return value
+    if "T" not in value:
+        return value
+    if value.endswith("Z"):
+        return value
+    tail = value[10:]
+    if "+" in tail or "-" in tail:
+        return value
+    return value + "Z"
+
+
+def get_credentials():
+    """Load and refresh credentials from token file."""
+    _ensure_authenticated()
+
+    from google.oauth2.credentials import Credentials
+    from google.auth.transport.requests import Request
+
+    creds = Credentials.from_authorized_user_file(str(TOKEN_PATH), _stored_token_scopes())
+    if creds.expired and creds.refresh_token:
+        creds.refresh(Request())
+        TOKEN_PATH.write_text(creds.to_json())
+    if not creds.valid:
+        print("Token is invalid. Re-run setup.", file=sys.stderr)
+        sys.exit(1)
+    return creds
+
+
+def build_service(api, version):
+    from googleapiclient.discovery import build
+
+    return build(api, version, credentials=get_credentials())
+
+
+# =========================================================================
+# Gmail
+# =========================================================================
+

 def gmail_search(args):
-    cmd = ["gmail", "+triage", "--query", args.query, "--max", str(args.max), "--format", "json"]
-    gws(*cmd)
+    if _gws_binary():
+        results = _run_gws(
+            ["gmail", "users", "messages", "list"],
+            params={"userId": "me", "q": args.query, "maxResults": args.max},
+        )
+        messages = results.get("messages", [])
+        output = []
+        for msg_meta in messages:
+            msg = _run_gws(
+                ["gmail", "users", "messages", "get"],
+                params={
+                    "userId": "me",
+                    "id": msg_meta["id"],
+                    "format": "metadata",
+                    "metadataHeaders": ["From", "To", "Subject", "Date"],
+                },
+            )
+            headers = _headers_dict(msg)
+            output.append(
+                {
+                    "id": msg["id"],
+                    "threadId": msg["threadId"],
+                    "from": headers.get("From", ""),
+                    "to": headers.get("To", ""),
+                    "subject": headers.get("Subject", ""),
+                    "date": headers.get("Date", ""),
+                    "snippet": msg.get("snippet", ""),
+                    "labels": msg.get("labelIds", []),
+                }
+            )
+        print(json.dumps(output, indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("gmail", "v1")
+    results = service.users().messages().list(
+        userId="me", q=args.query, maxResults=args.max
+    ).execute()
+    messages = results.get("messages", [])
+    if not messages:
+        print("[]")  # empty JSON array keeps the documented output contract
+        return
+
+    output = []
+    for msg_meta in messages:
+        msg = service.users().messages().get(
+            userId="me", id=msg_meta["id"], format="metadata",
+            metadataHeaders=["From", "To", "Subject", "Date"],
+        ).execute()
+        headers = _headers_dict(msg)
+        output.append({
+            "id": msg["id"],
+            "threadId": msg["threadId"],
+            "from": headers.get("From", ""),
+            "to": headers.get("To", ""),
+            "subject": headers.get("Subject", ""),
+            "date": headers.get("Date", ""),
+            "snippet": msg.get("snippet", ""),
+            "labels": msg.get("labelIds", []),
+        })
+    print(json.dumps(output, indent=2, ensure_ascii=False))
+
+

 def gmail_get(args):
-    gws("gmail", "+read", "--id", args.message_id, "--headers", "--format", "json")
+    if _gws_binary():
+        msg = _run_gws(
+            ["gmail", "users", "messages", "get"],
+            params={"userId": "me", "id": args.message_id, "format": "full"},
+        )
+        headers = _headers_dict(msg)
+        result = {
+            "id": msg["id"],
+            "threadId": msg["threadId"],
+            "from": headers.get("From", ""),
+            "to": headers.get("To", ""),
+            "subject": headers.get("Subject", ""),
+            "date": headers.get("Date", ""),
+            "labels": msg.get("labelIds", []),
+            "body": _extract_message_body(msg),
+        }
+        print(json.dumps(result, indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("gmail", "v1")
+    msg = service.users().messages().get(
+        userId="me", id=args.message_id, format="full"
+    ).execute()
+
+    headers = _headers_dict(msg)
+    result = {
+        "id": msg["id"],
+        "threadId": msg["threadId"],
+        "from": headers.get("From", ""),
+        "to": headers.get("To", ""),
+        "subject": headers.get("Subject", ""),
+        "date": headers.get("Date", ""),
+        "labels": msg.get("labelIds", []),
+        "body": _extract_message_body(msg),
+    }
+    print(json.dumps(result, indent=2, ensure_ascii=False))
+
+

 def gmail_send(args):
-    cmd = ["gmail", "+send", "--to", args.to, "--subject", args.subject, "--body", args.body, "--format", "json"]
+    if _gws_binary():
+        message = MIMEText(args.body, "html" if args.html else "plain")
+        message["to"] = args.to
+        message["subject"] = args.subject
+        if args.cc:
+            message["cc"] = args.cc
+        if args.from_header:
+            message["from"] = args.from_header
+
+        raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
+        body = {"raw": raw}
+        if args.thread_id:
+            body["threadId"] = args.thread_id
+
+        result = _run_gws(
+            ["gmail", "users", "messages", "send"],
+            params={"userId": "me"},
+            body=body,
+        )
+        print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
+        return
+
+    service = build_service("gmail", "v1")
+    message = MIMEText(args.body, "html" if args.html else "plain")
+    message["to"] = args.to
+    message["subject"] = args.subject
    if args.cc:
-        cmd += ["--cc", args.cc]
-    if args.html:
-        cmd.append("--html")
-    gws(*cmd)
+        message["cc"] = args.cc
+    if args.from_header:
+        message["from"] = args.from_header
+
+    raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
+    body = {"raw": raw}
+
+    if args.thread_id:
+        body["threadId"] = args.thread_id
+
+    result = service.users().messages().send(userId="me", body=body).execute()
+    print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
+
+

 def gmail_reply(args):
-    gws("gmail", "+reply", "--message-id", args.message_id, "--body", args.body, "--format", "json")
+    if _gws_binary():
+        original = _run_gws(
+            ["gmail", "users", "messages", "get"],
+            params={
+                "userId": "me",
+                "id": args.message_id,
+                "format": "metadata",
+                "metadataHeaders": ["From", "Subject", "Message-ID"],
+            },
+        )
+        headers = _headers_dict(original)
+
+        subject = headers.get("Subject", "")
+        if not subject.startswith("Re:"):
+            subject = f"Re: {subject}"
+
+        message = MIMEText(args.body)
+        message["to"] = headers.get("From", "")
+        message["subject"] = subject
+        if args.from_header:
+            message["from"] = args.from_header
+        if headers.get("Message-ID"):
+            message["In-Reply-To"] = headers["Message-ID"]
+            message["References"] = headers["Message-ID"]
+
+        raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
+        result = _run_gws(
+            ["gmail", "users", "messages", "send"],
+            params={"userId": "me"},
+            body={"raw": raw, "threadId": original["threadId"]},
+        )
+        print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
+        return
+
+    service = build_service("gmail", "v1")
+    original = service.users().messages().get(
+        userId="me", id=args.message_id, format="metadata",
+        metadataHeaders=["From", "Subject", "Message-ID"],
+    ).execute()
+    headers = _headers_dict(original)
+
+    subject = headers.get("Subject", "")
+    if not subject.startswith("Re:"):
+        subject = f"Re: {subject}"
+
+    message = MIMEText(args.body)
+    message["to"] = headers.get("From", "")
+    message["subject"] = subject
+    if args.from_header:
+        message["from"] = args.from_header
+    if headers.get("Message-ID"):
+        message["In-Reply-To"] = headers["Message-ID"]
+        message["References"] = headers["Message-ID"]
+
+    raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
+    body = {"raw": raw, "threadId": original["threadId"]}
+
+    result = service.users().messages().send(userId="me", body=body).execute()
+    print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
+
+

 def gmail_labels(args):
-    gws("gmail", "users", "labels", "list", "--params", json.dumps({"userId": "me"}), "--format", "json")
+    if _gws_binary():
+        results = _run_gws(["gmail", "users", "labels", "list"], params={"userId": "me"})
+        labels = [{"id": lbl["id"], "name": lbl["name"], "type": lbl.get("type", "")} for lbl in results.get("labels", [])]
+        print(json.dumps(labels, indent=2))
+        return
+
+    service = build_service("gmail", "v1")
+    results = service.users().labels().list(userId="me").execute()
+    labels = [{"id": lbl["id"], "name": lbl["name"], "type": lbl.get("type", "")} for lbl in results.get("labels", [])]
+    print(json.dumps(labels, indent=2))
+
+

 def gmail_modify(args):
    body = {}
@@ -69,145 +418,310 @@ def gmail_modify(args):
        body["addLabelIds"] = args.add_labels.split(",")
    if args.remove_labels:
        body["removeLabelIds"] = args.remove_labels.split(",")
-    gws(
-        "gmail", "users", "messages", "modify",
-        "--params", json.dumps({"userId": "me", "id": args.message_id}),
-        "--json", json.dumps(body),
-        "--format", "json",
-    )
+
+    if _gws_binary():
+        result = _run_gws(
+            ["gmail", "users", "messages", "modify"],
+            params={"userId": "me", "id": args.message_id},
+            body=body,
+        )
+        print(json.dumps({"id": result["id"], "labels": result.get("labelIds", [])}, indent=2))
+        return
+
+    service = build_service("gmail", "v1")
+    result = service.users().messages().modify(userId="me", id=args.message_id, body=body).execute()
+    print(json.dumps({"id": result["id"], "labels": result.get("labelIds", [])}, indent=2))


-# -- Calendar --
+# =========================================================================
+# Calendar
+# =========================================================================
+

 def calendar_list(args):
-    if args.start or args.end:
-        # Specific date range — use raw Calendar API for precise timeMin/timeMax
-        from datetime import datetime, timedelta, timezone as tz
-        now = datetime.now(tz.utc)
-        time_min = args.start or now.isoformat()
-        time_max = args.end or (now + timedelta(days=7)).isoformat()
-        gws(
-            "calendar", "events", "list",
-            "--params", json.dumps({
+    now = datetime.now(timezone.utc)
+    time_min = _datetime_with_timezone(args.start or now.isoformat())
+    time_max = _datetime_with_timezone(args.end or (now + timedelta(days=7)).isoformat())
+
+    if _gws_binary():
+        results = _run_gws(
+            ["calendar", "events", "list"],
+            params={
                "calendarId": args.calendar,
                "timeMin": time_min,
                "timeMax": time_max,
                "maxResults": args.max,
                "singleEvents": True,
                "orderBy": "startTime",
-            }),
-            "--format", "json",
+            },
        )
-    else:
-        # No date range — use +agenda helper (defaults to 7 days)
-        cmd = ["calendar", "+agenda", "--days", "7", "--format", "json"]
-        if args.calendar != "primary":
-            cmd += ["--calendar", args.calendar]
-        gws(*cmd)
+        events = []
+        for e in results.get("items", []):
+            events.append({
+                "id": e["id"],
+                "summary": e.get("summary", "(no title)"),
+                "start": e.get("start", {}).get("dateTime", e.get("start", {}).get("date", "")),
+                "end": e.get("end", {}).get("dateTime", e.get("end", {}).get("date", "")),
+                "location": e.get("location", ""),
+                "description": e.get("description", ""),
+                "status": e.get("status", ""),
+                "htmlLink": e.get("htmlLink", ""),
+            })
+        print(json.dumps(events, indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("calendar", "v3")
+    results = service.events().list(
+        calendarId=args.calendar, timeMin=time_min, timeMax=time_max,
+        maxResults=args.max, singleEvents=True, orderBy="startTime",
+    ).execute()
+
+    events = []
+    for e in results.get("items", []):
+        events.append({
+            "id": e["id"],
+            "summary": e.get("summary", "(no title)"),
+            "start": e.get("start", {}).get("dateTime", e.get("start", {}).get("date", "")),
+            "end": e.get("end", {}).get("dateTime", e.get("end", {}).get("date", "")),
+            "location": e.get("location", ""),
+            "description": e.get("description", ""),
+            "status": e.get("status", ""),
+            "htmlLink": e.get("htmlLink", ""),
+        })
+    print(json.dumps(events, indent=2, ensure_ascii=False))
+
+

 def calendar_create(args):
-    cmd = [
-        "calendar", "+insert",
-        "--summary", args.summary,
-        "--start", args.start,
-        "--end", args.end,
-        "--format", "json",
-    ]
+    event = {
+        "summary": args.summary,
+        "start": {"dateTime": args.start},
+        "end": {"dateTime": args.end},
+    }
    if args.location:
-        cmd += ["--location", args.location]
+        event["location"] = args.location
    if args.description:
-        cmd += ["--description", args.description]
+        event["description"] = args.description
    if args.attendees:
-        for email in args.attendees.split(","):
-            cmd += ["--attendee", email.strip()]
-    if args.calendar != "primary":
-        cmd += ["--calendar", args.calendar]
-    gws(*cmd)
+        event["attendees"] = [{"email": e.strip()} for e in args.attendees.split(",") if e.strip()]
+
+    if _gws_binary():
+        result = _run_gws(
+            ["calendar", "events", "insert"],
+            params={"calendarId": args.calendar},
+            body=event,
+        )
+        print(json.dumps({
+            "status": "created",
+            "id": result["id"],
+            "summary": result.get("summary", ""),
+            "htmlLink": result.get("htmlLink", ""),
+        }, indent=2))
+        return
+
+    service = build_service("calendar", "v3")
+    result = service.events().insert(calendarId=args.calendar, body=event).execute()
+    print(json.dumps({
+        "status": "created",
+        "id": result["id"],
+        "summary": result.get("summary", ""),
+        "htmlLink": result.get("htmlLink", ""),
+    }, indent=2))
+
+

 def calendar_delete(args):
-    gws(
-        "calendar", "events", "delete",
-        "--params", json.dumps({"calendarId": args.calendar, "eventId": args.event_id}),
-        "--format", "json",
-    )
+    if _gws_binary():
+        _run_gws(["calendar", "events", "delete"], params={"calendarId": args.calendar, "eventId": args.event_id})
+        print(json.dumps({"status": "deleted", "eventId": args.event_id}))
+        return
+
+    service = build_service("calendar", "v3")
+    service.events().delete(calendarId=args.calendar, eventId=args.event_id).execute()
+    print(json.dumps({"status": "deleted", "eventId": args.event_id}))


-# -- Drive --
+# =========================================================================
+# Drive
+# =========================================================================
+

 def drive_search(args):
    query = args.query if args.raw_query else f"fullText contains '{args.query}'"
-    gws(
-        "drive", "files", "list",
-        "--params", json.dumps({
-            "q": query,
-            "pageSize": args.max,
-            "fields": "files(id,name,mimeType,modifiedTime,webViewLink)",
-        }),
-        "--format", "json",
-    )
+    if _gws_binary():
+        results = _run_gws(
+            ["drive", "files", "list"],
+            params={
+                "q": query,
+                "pageSize": args.max,
+                "fields": "files(id, name, mimeType, modifiedTime, webViewLink)",
+            },
+        )
+        print(json.dumps(results.get("files", []), indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("drive", "v3")
+    results = service.files().list(
+        q=query, pageSize=args.max, fields="files(id, name, mimeType, modifiedTime, webViewLink)",
+    ).execute()
+    files = results.get("files", [])
+    print(json.dumps(files, indent=2, ensure_ascii=False))


-# -- Contacts --
+# =========================================================================
+# Contacts
+# =========================================================================
+

 def contacts_list(args):
-    gws(
-        "people", "people", "connections", "list",
-        "--params", json.dumps({
-            "resourceName": "people/me",
-            "pageSize": args.max,
-            "personFields": "names,emailAddresses,phoneNumbers",
-        }),
-        "--format", "json",
-    )
+    if _gws_binary():
+        results = _run_gws(
+            ["people", "people", "connections", "list"],
+            params={
+                "resourceName": "people/me",
+                "pageSize": args.max,
+                "personFields": "names,emailAddresses,phoneNumbers",
+            },
+        )
+        contacts = []
+        for person in results.get("connections", []):
+            names = person.get("names", [{}])
+            emails = person.get("emailAddresses", [])
+            phones = person.get("phoneNumbers", [])
+            contacts.append({
+                "name": names[0].get("displayName", "") if names else "",
+                "emails": [e.get("value", "") for e in emails],
+                "phones": [p.get("value", "") for p in phones],
+            })
+        print(json.dumps(contacts, indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("people", "v1")
+    results = service.people().connections().list(
+        resourceName="people/me",
+        pageSize=args.max,
+        personFields="names,emailAddresses,phoneNumbers",
+    ).execute()
+    contacts = []
+    for person in results.get("connections", []):
+        names = person.get("names", [{}])
+        emails = person.get("emailAddresses", [])
+        phones = person.get("phoneNumbers", [])
+        contacts.append({
+            "name": names[0].get("displayName", "") if names else "",
+            "emails": [e.get("value", "") for e in emails],
+            "phones": [p.get("value", "") for p in phones],
+        })
+    print(json.dumps(contacts, indent=2, ensure_ascii=False))


-# -- Sheets --
+# =========================================================================
+# Sheets
+# =========================================================================
+

 def sheets_get(args):
-    gws(
-        "sheets", "+read",
-        "--spreadsheet", args.sheet_id,
-        "--range", args.range,
-        "--format", "json",
-    )
+    if _gws_binary():
+        result = _run_gws(
+            ["sheets", "spreadsheets", "values", "get"],
+            params={"spreadsheetId": args.sheet_id, "range": args.range},
+        )
+        print(json.dumps(result.get("values", []), indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("sheets", "v4")
+    result = service.spreadsheets().values().get(
+        spreadsheetId=args.sheet_id, range=args.range,
+    ).execute()
+    print(json.dumps(result.get("values", []), indent=2, ensure_ascii=False))
+
+

 def sheets_update(args):
     values = json.loads(args.values)
-    gws(
-        "sheets", "spreadsheets", "values", "update",
-        "--params", json.dumps({
-            "spreadsheetId": args.sheet_id,
-            "range": args.range,
-            "valueInputOption": "USER_ENTERED",
-        }),
-        "--json", json.dumps({"values": values}),
-        "--format", "json",
-    )
+    body = {"values": values}
+
+    if _gws_binary():
+        result = _run_gws(
+            ["sheets", "spreadsheets", "values", "update"],
+            params={
+                "spreadsheetId": args.sheet_id,
+                "range": args.range,
+                "valueInputOption": "USER_ENTERED",
+            },
+            body=body,
+        )
+        print(json.dumps({"updatedCells": result.get("updatedCells", 0), "updatedRange": result.get("updatedRange", "")}, indent=2))
+        return
+
+    service = build_service("sheets", "v4")
+    result = service.spreadsheets().values().update(
+        spreadsheetId=args.sheet_id, range=args.range,
+        valueInputOption="USER_ENTERED", body=body,
+    ).execute()
+    print(json.dumps({"updatedCells": result.get("updatedCells", 0), "updatedRange": result.get("updatedRange", "")}, indent=2))
+
+

 def sheets_append(args):
     values = json.loads(args.values)
-    gws(
-        "sheets", "+append",
-        "--spreadsheet", args.sheet_id,
-        "--json-values", json.dumps(values),
-        "--format", "json",
-    )
+    body = {"values": values}
+
+    if _gws_binary():
+        result = _run_gws(
+            ["sheets", "spreadsheets", "values", "append"],
+            params={
+                "spreadsheetId": args.sheet_id,
+                "range": args.range,
+                "valueInputOption": "USER_ENTERED",
+                "insertDataOption": "INSERT_ROWS",
+            },
+            body=body,
+        )
+        print(json.dumps({"updatedCells": result.get("updates", {}).get("updatedCells", 0)}, indent=2))
+        return
+
+    service = build_service("sheets", "v4")
+    result = service.spreadsheets().values().append(
+        spreadsheetId=args.sheet_id, range=args.range,
+        valueInputOption="USER_ENTERED", insertDataOption="INSERT_ROWS", body=body,
+    ).execute()
+    print(json.dumps({"updatedCells": result.get("updates", {}).get("updatedCells", 0)}, indent=2))


-# -- Docs --
+# =========================================================================
+# Docs
+# =========================================================================
+

 def docs_get(args):
-    gws(
-        "docs", "documents", "get",
-        "--params", json.dumps({"documentId": args.doc_id}),
-        "--format", "json",
-    )
+    if _gws_binary():
+        doc = _run_gws(["docs", "documents", "get"], params={"documentId": args.doc_id})
+        result = {
+            "title": doc.get("title", ""),
+            "documentId": doc.get("documentId", ""),
+            "body": _extract_doc_text(doc),
+        }
+        print(json.dumps(result, indent=2, ensure_ascii=False))
+        return
+
+    service = build_service("docs", "v1")
+    doc = service.documents().get(documentId=args.doc_id).execute()
+    result = {
+        "title": doc.get("title", ""),
+        "documentId": doc.get("documentId", ""),
+        "body": _extract_doc_text(doc),
+    }
+    print(json.dumps(result, indent=2, ensure_ascii=False))


-# -- CLI parser (backward-compatible interface) --
+# =========================================================================
+# CLI parser
+# =========================================================================
+

 def main():
-    parser = argparse.ArgumentParser(description="Google Workspace API for Hermes Agent (gws backend)")
+    parser = argparse.ArgumentParser(description="Google Workspace API for Hermes Agent")
     sub = parser.add_subparsers(dest="service", required=True)

     # --- Gmail ---
@@ -228,13 +742,15 @@ def main():
     p.add_argument("--subject", required=True)
     p.add_argument("--body", required=True)
     p.add_argument("--cc", default="")
+    p.add_argument("--from", dest="from_header", default="", help="Custom From header (e.g. '\"Agent Name\" <user@example.com>')")
     p.add_argument("--html", action="store_true", help="Send body as HTML")
-    p.add_argument("--thread-id", default="", help="Thread ID (unused with gws, kept for compat)")
+    p.add_argument("--thread-id", default="", help="Thread ID for threading")
     p.set_defaults(func=gmail_send)

     p = gmail_sub.add_parser("reply")
     p.add_argument("message_id", help="Message ID to reply to")
     p.add_argument("--body", required=True)
+    p.add_argument("--from", dest="from_header", default="", help="Custom From header (e.g. '\"Agent Name\" <user@example.com>')")
     p.set_defaults(func=gmail_reply)

     p = gmail_sub.add_parser("labels")
@@ -1156,3 +1156,117 @@ def test_load_pool_does_not_seed_qwen_oauth_when_no_token(tmp_path, monkeypatch)

     assert not pool.has_credentials()
     assert pool.entries() == []
+
+
+# ---------------------------------------------------------------------------
+# Auth failure TTL — 401/403 credentials stay exhausted for 24 hours
+# ---------------------------------------------------------------------------
+
+
+def test_exhausted_401_entry_stays_exhausted_after_one_hour(tmp_path, monkeypatch):
+    """401-exhausted credentials should NOT reset after just 1 hour (token invalid)."""
+    monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
+    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
+    _write_auth_store(
+        tmp_path,
+        {
+            "version": 1,
+            "credential_pool": {
+                "openrouter": [
+                    {
+                        "id": "cred-1",
+                        "label": "primary",
+                        "auth_type": "api_key",
+                        "priority": 0,
+                        "source": "manual",
+                        "access_token": "***",
+                        "base_url": "https://openrouter.ai/api/v1",
+                        "last_status": "exhausted",
+                        "last_status_at": time.time() - 3700,  # ~1h2m ago
+                        "last_error_code": 401,
+                    }
+                ]
+            },
+        },
+    )
+
+    from agent.credential_pool import load_pool
+
+    pool = load_pool("openrouter")
+    entry = pool.select()
+
+    # 401 uses a 24-hour cooldown — 1 hour is NOT enough to reset
+    assert entry is None
+
+
+def test_exhausted_401_entry_resets_after_24_hours(tmp_path, monkeypatch):
+    """401-exhausted credentials should reset after 24 hours."""
+    monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
+    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
+    _write_auth_store(
+        tmp_path,
+        {
+            "version": 1,
+            "credential_pool": {
+                "openrouter": [
+                    {
+                        "id": "cred-1",
+                        "label": "primary",
+                        "auth_type": "api_key",
+                        "priority": 0,
+                        "source": "manual",
+                        "access_token": "***",
+                        "base_url": "https://openrouter.ai/api/v1",
+                        "last_status": "exhausted",
+                        "last_status_at": time.time() - 90000,  # ~25 hours ago
+                        "last_error_code": 401,
+                    }
+                ]
+            },
+        },
+    )
+
+    from agent.credential_pool import load_pool
+
+    pool = load_pool("openrouter")
+    entry = pool.select()
+
+    assert entry is not None
+    assert entry.id == "cred-1"
+    assert entry.last_status == "ok"
+
+
+def test_exhausted_403_entry_stays_exhausted_after_one_hour(tmp_path, monkeypatch):
+    """403-exhausted credentials should NOT reset after just 1 hour."""
+    monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
+    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
+    _write_auth_store(
+        tmp_path,
+        {
+            "version": 1,
+            "credential_pool": {
+                "openrouter": [
+                    {
+                        "id": "cred-1",
+                        "label": "primary",
+                        "auth_type": "api_key",
+                        "priority": 0,
+                        "source": "manual",
+                        "access_token": "***",
+                        "base_url": "https://openrouter.ai/api/v1",
+                        "last_status": "exhausted",
+                        "last_status_at": time.time() - 3700,  # ~1h2m ago
+                        "last_error_code": 403,
+                    }
+                ]
+            },
+        },
+    )
+
+    from agent.credential_pool import load_pool
+
+    pool = load_pool("openrouter")
+    entry = pool.select()
+
+    # 403 uses a 24-hour cooldown — 1 hour is NOT enough to reset
+    assert entry is None
@@ -348,3 +348,79 @@ class TestPoolRotationCycle:
         )
         assert recovered is False
         assert has_retried is False
+
+
+class TestAuthExhaustionNotification:
+    """Verify user-facing notification when all credentials are rejected (401)."""
+
+    def _make_agent_with_empty_auth_pool(self):
+        from run_agent import AIAgent
+
+        with patch.object(AIAgent, "__init__", lambda self, **kw: None):
+            agent = AIAgent()
+
+        pool = MagicMock()
+        pool.has_credentials.return_value = True
+        pool.try_refresh_current.return_value = None
+        pool.mark_exhausted_and_rotate.return_value = None  # no more credentials
+        agent._credential_pool = pool
+        agent._swap_credential = MagicMock()
+        agent.log_prefix = ""
+        agent.provider = "copilot"
+        agent.status_callback = None
+
+        # Capture _emit_status calls
+        agent._emit_status_calls = []
+        original_emit = getattr(AIAgent, "_emit_status", None)
+
+        def capture_emit(self_inner, msg):
+            agent._emit_status_calls.append(msg)
+        agent._emit_status = lambda msg: capture_emit(agent, msg)
+
+        return agent, pool
+
+    def test_auth_failure_emits_notification_when_pool_exhausted(self):
+        """When all credentials are 401'd, user should see actionable message."""
+        from agent.error_classifier import FailoverReason
+
+        agent, pool = self._make_agent_with_empty_auth_pool()
+
+        recovered, _ = agent._recover_with_credential_pool(
+            status_code=401, has_retried_429=False,
+            classified_reason=FailoverReason.auth,
+        )
+        assert recovered is False
+        assert len(agent._emit_status_calls) == 1
+        msg = agent._emit_status_calls[0]
+        assert "copilot" in msg
+        assert "401" in msg
+        assert "hermes auth reset" in msg
+
+    def test_auth_failure_no_notification_when_rotation_succeeds(self):
+        """When rotation succeeds, no exhaustion warning should be emitted."""
+        from agent.error_classifier import FailoverReason
+        from run_agent import AIAgent
+
+        with patch.object(AIAgent, "__init__", lambda self, **kw: None):
+            agent = AIAgent()
+
+        next_entry = MagicMock()
+        next_entry.id = "cred-2"
+        pool = MagicMock()
+        pool.has_credentials.return_value = True
+        pool.try_refresh_current.return_value = None
+        pool.mark_exhausted_and_rotate.return_value = next_entry
+        agent._credential_pool = pool
+        agent._swap_credential = MagicMock()
+        agent.log_prefix = ""
+        agent.provider = "copilot"
+
+        agent._emit_status_calls = []
+        agent._emit_status = lambda msg: agent._emit_status_calls.append(msg)
+
+        recovered, _ = agent._recover_with_credential_pool(
+            status_code=401, has_retried_429=False,
+            classified_reason=FailoverReason.auth,
+        )
+        assert recovered is True
+        assert len(agent._emit_status_calls) == 0
@@ -1115,6 +1115,134 @@ class TestResponsesEndpoint:
             assert resp.status == 400


+class TestResponsesStreaming:
+    @pytest.mark.asyncio
+    async def test_stream_true_returns_responses_sse(self, adapter):
+        app = _create_app(adapter)
+        async with TestClient(TestServer(app)) as cli:
+            async def _mock_run_agent(**kwargs):
+                cb = kwargs.get("stream_delta_callback")
+                if cb:
+                    cb("Hello")
+                    cb(" world")
+                return (
+                    {"final_response": "Hello world", "messages": [], "api_calls": 1},
+                    {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
+                )
+
+            with patch.object(adapter, "_run_agent", side_effect=_mock_run_agent):
+                resp = await cli.post(
+                    "/v1/responses",
+                    json={"model": "hermes-agent", "input": "hi", "stream": True},
+                )
+                assert resp.status == 200
+                assert "text/event-stream" in resp.headers.get("Content-Type", "")
+                body = await resp.text()
+                assert "event: response.created" in body
+                assert "event: response.output_text.delta" in body
+                assert "event: response.output_text.done" in body
+                assert "event: response.completed" in body
+                assert '"sequence_number":' in body
+                assert '"logprobs": []' in body
+                assert "Hello" in body
+                assert " world" in body
+
+    @pytest.mark.asyncio
+    async def test_stream_emits_function_call_and_output_items(self, adapter):
+        app = _create_app(adapter)
+        async with TestClient(TestServer(app)) as cli:
+            async def _mock_run_agent(**kwargs):
+                start_cb = kwargs.get("tool_start_callback")
+                complete_cb = kwargs.get("tool_complete_callback")
+                text_cb = kwargs.get("stream_delta_callback")
+                if start_cb:
+                    start_cb("call_123", "read_file", {"path": "/tmp/test.txt"})
+                if complete_cb:
+                    complete_cb("call_123", "read_file", {"path": "/tmp/test.txt"}, '{"content":"hello"}')
+                if text_cb:
+                    text_cb("Done.")
+                return (
+                    {
+                        "final_response": "Done.",
+                        "messages": [
+                            {
+                                "role": "assistant",
+                                "tool_calls": [
+                                    {
+                                        "id": "call_123",
+                                        "function": {
+                                            "name": "read_file",
+                                            "arguments": '{"path":"/tmp/test.txt"}',
+                                        },
+                                    }
+                                ],
+                            },
+                            {
+                                "role": "tool",
+                                "tool_call_id": "call_123",
+                                "content": '{"content":"hello"}',
+                            },
+                        ],
+                        "api_calls": 1,
+                    },
+                    {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
+                )
+
+            with patch.object(adapter, "_run_agent", side_effect=_mock_run_agent):
+                resp = await cli.post(
+                    "/v1/responses",
+                    json={"model": "hermes-agent", "input": "read the file", "stream": True},
+                )
+                assert resp.status == 200
+                body = await resp.text()
+                assert "event: response.output_item.added" in body
+                assert "event: response.output_item.done" in body
+                assert body.count("event: response.output_item.done") >= 2
+                assert '"type": "function_call"' in body
+                assert '"type": "function_call_output"' in body
+                assert '"call_id": "call_123"' in body
+                assert '"name": "read_file"' in body
+                assert '"output": [{"type": "input_text", "text": "{\\"content\\":\\"hello\\"}"}]' in body
+
+    @pytest.mark.asyncio
+    async def test_streamed_response_is_stored_for_get(self, adapter):
+        app = _create_app(adapter)
+        async with TestClient(TestServer(app)) as cli:
+            async def _mock_run_agent(**kwargs):
+                cb = kwargs.get("stream_delta_callback")
+                if cb:
+                    cb("Stored response")
+                return (
+                    {"final_response": "Stored response", "messages": [], "api_calls": 1},
+                    {"input_tokens": 1, "output_tokens": 2, "total_tokens": 3},
+                )
+
+            with patch.object(adapter, "_run_agent", side_effect=_mock_run_agent):
+                resp = await cli.post(
+                    "/v1/responses",
+                    json={"model": "hermes-agent", "input": "store this", "stream": True},
+                )
+                body = await resp.text()
+                response_id = None
+                for line in body.splitlines():
+                    if line.startswith("data: "):
+                        try:
+                            payload = json.loads(line[len("data: "):])
+                        except json.JSONDecodeError:
+                            continue
+                        if payload.get("type") == "response.completed":
+                            response_id = payload["response"]["id"]
+                            break
+                assert response_id
+
+                get_resp = await cli.get(f"/v1/responses/{response_id}")
+                assert get_resp.status == 200
+                data = await get_resp.json()
+                assert data["id"] == response_id
+                assert data["status"] == "completed"
+                assert data["output"][-1]["content"][0]["text"] == "Stored response"
+
+
 # ---------------------------------------------------------------------------
 # Auth on endpoints
 # ---------------------------------------------------------------------------
@@ -0,0 +1,95 @@
+"""Tests for the auto-continue feature (#4493).
+
+When the gateway restarts mid-agent-work, the session transcript ends on a
+tool result that the agent never processed.  The auto-continue logic detects
+this and prepends a system note to the next user message so the model
+finishes the interrupted work before addressing the new input.
+"""
+
+import pytest
+
+
+def _simulate_auto_continue(agent_history: list, user_message: str) -> str:
+    """Reproduce the auto-continue injection logic from _run_agent().
+
+    This mirrors the exact code in gateway/run.py so we can test the
+    detection and message transformation without spinning up a full
+    gateway runner.
+    """
+    message = user_message
+    if agent_history and agent_history[-1].get("role") == "tool":
+        message = (
+            "[System note: Your previous turn was interrupted before you could "
+            "process the last tool result(s). The conversation history contains "
+            "tool outputs you haven't responded to yet. Please finish processing "
+            "those results and summarize what was accomplished, then address the "
+            "user's new message below.]\n\n"
+            + message
+        )
+    return message
+
+
+class TestAutoDetection:
+    """Test that trailing tool results are correctly detected."""
+
+    def test_trailing_tool_result_triggers_note(self):
+        history = [
+            {"role": "user", "content": "deploy the app"},
+            {"role": "assistant", "content": None, "tool_calls": [
+                {"id": "call_1", "function": {"name": "terminal", "arguments": "{}"}}
+            ]},
+            {"role": "tool", "tool_call_id": "call_1", "content": "deployed successfully"},
+        ]
+        result = _simulate_auto_continue(history, "what happened?")
+        assert "[System note:" in result
+        assert "interrupted" in result
+        assert "what happened?" in result
+
+    def test_trailing_assistant_message_no_note(self):
+        history = [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": "Hi there!"},
+        ]
+        result = _simulate_auto_continue(history, "how are you?")
+        assert "[System note:" not in result
+        assert result == "how are you?"
+
+    def test_empty_history_no_note(self):
+        result = _simulate_auto_continue([], "hello")
+        assert result == "hello"
+
+    def test_trailing_user_message_no_note(self):
+        """Shouldn't happen in practice, but ensure no false positive."""
+        history = [
+            {"role": "user", "content": "hello"},
+        ]
+        result = _simulate_auto_continue(history, "hello again")
+        assert result == "hello again"
+
+    def test_multiple_tool_results_still_triggers(self):
+        """Multiple tool calls in a row — last one is still role=tool."""
+        history = [
+            {"role": "user", "content": "search and read"},
+            {"role": "assistant", "content": None, "tool_calls": [
+                {"id": "call_1", "function": {"name": "search", "arguments": "{}"}},
+                {"id": "call_2", "function": {"name": "read", "arguments": "{}"}},
+            ]},
+            {"role": "tool", "tool_call_id": "call_1", "content": "found it"},
+            {"role": "tool", "tool_call_id": "call_2", "content": "file content here"},
+        ]
+        result = _simulate_auto_continue(history, "continue")
+        assert "[System note:" in result
+
+    def test_original_message_preserved_after_note(self):
+        """The user's actual message must appear after the system note."""
+        history = [
+            {"role": "assistant", "content": None, "tool_calls": [
+                {"id": "c1", "function": {"name": "t", "arguments": "{}"}}
+            ]},
+            {"role": "tool", "tool_call_id": "c1", "content": "done"},
+        ]
+        result = _simulate_auto_continue(history, "now do X")
+        # System note comes first, then user's message
+        note_end = result.index("]\n\n")
+        user_msg_start = result.index("now do X")
+        assert user_msg_start > note_end
@@ -117,6 +117,23 @@ async def test_registers_native_thread_slash_command(adapter):
     adapter._handle_thread_create_slash.assert_awaited_once_with(interaction, "Planning", "", 1440)


+@pytest.mark.asyncio
+async def test_registers_native_restart_slash_command(adapter):
+    adapter._run_simple_slash = AsyncMock()
+    adapter._register_slash_commands()
+
+    assert "restart" in adapter._client.tree.commands
+
+    interaction = SimpleNamespace()
+    await adapter._client.tree.commands["restart"](interaction)
+
+    adapter._run_simple_slash.assert_awaited_once_with(
+        interaction,
+        "/restart",
+        "Restart requested~",
+    )
+
+
 # ------------------------------------------------------------------
 # _handle_thread_create_slash — success, session dispatch, failure
 # ------------------------------------------------------------------
@@ -125,25 +125,6 @@ async def test_gateway_stop_service_restart_sets_named_exit_code():
     assert runner._exit_code == GATEWAY_SERVICE_RESTART_EXIT_CODE


-@pytest.mark.asyncio
-async def test_gateway_stop_emits_shutdown_hook_after_drain(monkeypatch):
-    runner, adapter = make_restart_runner()
-    adapter.disconnect = AsyncMock()
-    runner.hooks.emit = AsyncMock()
-
-    with patch("gateway.status.remove_pid_file"), patch("gateway.status.write_runtime_status"):
-        await runner.stop(restart=True, service_restart=True)
-
-    runner.hooks.emit.assert_awaited_once_with(
-        "gateway:shutdown",
-        {
-            "restart": True,
-            "service_restart": True,
-            "detached_restart": False,
-        },
-    )
-
-
 @pytest.mark.asyncio
 async def test_drain_active_agents_throttles_status_updates():
     runner, _adapter = make_restart_runner()
@@ -9,7 +9,7 @@ import pytest
 from gateway.hooks import HookRegistry


-def _create_hook(hooks_dir, hook_name, events, handler_code, *, manifest_extra=""):
+def _create_hook(hooks_dir, hook_name, events, handler_code):
     """Helper to create a hook directory with HOOK.yaml and handler.py."""
     hook_dir = hooks_dir / hook_name
     hook_dir.mkdir(parents=True)
@@ -17,7 +17,6 @@ def _create_hook(hooks_dir, hook_name, events, handler_code, *, manifest_extra="
         f"name: {hook_name}\n"
         f"description: Test hook\n"
         f"events: {events}\n"
-        f"{manifest_extra}"
     )
     (hook_dir / "handler.py").write_text(handler_code)
     return hook_dir
@@ -113,24 +112,6 @@ class TestDiscoverAndLoad:

         assert len(reg.loaded_hooks) == 2

-    def test_preserves_optional_startup_readiness_metadata(self, tmp_path):
-        _create_hook(
-            tmp_path,
-            "ready-hook",
-            '["gateway:startup"]',
-            "def handle(e, c): pass\n",
-            manifest_extra="startup_readiness:\n  id: beam-runtime\n  required: false\n",
-        )
-
-        reg = HookRegistry()
-        with patch("gateway.hooks.HOOKS_DIR", tmp_path), _patch_no_builtins(reg):
-            reg.discover_and_load()
-
-        assert reg.loaded_hooks[0]["startup_readiness"] == {
-            "id": "beam-runtime",
-            "required": False,
-        }
-

 class TestEmit:
     @pytest.mark.asyncio
@@ -193,7 +193,7 @@ async def test_shutdown_notification_says_restarting_when_restart_requested():

     assert len(adapter.sent) == 1
     assert "restarting" in adapter.sent[0]
-    assert "/retry" in adapter.sent[0]
+    assert "resume" in adapter.sent[0]


 @pytest.mark.asyncio
@@ -132,68 +132,6 @@ async def test_runner_records_connected_platform_state_on_success(monkeypatch, t
     assert state["platforms"]["discord"]["error_message"] is None


-@pytest.mark.asyncio
-async def test_runner_discovers_plugins_before_loading_hooks(monkeypatch, tmp_path):
-    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-    config = GatewayConfig(
-        platforms={
-            Platform.DISCORD: PlatformConfig(enabled=True, token="***")
-        },
-        sessions_dir=tmp_path / "sessions",
-    )
-    runner = GatewayRunner(config)
-    order: list[str] = []
-
-    monkeypatch.setattr(runner, "_create_adapter", lambda platform, platform_config: _SuccessfulAdapter())
-    monkeypatch.setattr("hermes_cli.plugins.discover_plugins", lambda: order.append("plugins"))
-    monkeypatch.setattr(runner.hooks, "discover_and_load", lambda: order.append("hooks"))
-    monkeypatch.setattr(runner.hooks, "emit", AsyncMock())
-
-    ok = await runner.start()
-
-    assert ok is True
-    assert order == ["plugins", "hooks"]
-
-
-@pytest.mark.asyncio
-async def test_runner_initializes_startup_checks_before_gateway_startup_emit(monkeypatch, tmp_path):
-    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-    config = GatewayConfig(
-        platforms={
-            Platform.DISCORD: PlatformConfig(enabled=True, token="***")
-        },
-        sessions_dir=tmp_path / "sessions",
-    )
-    runner = GatewayRunner(config)
-
-    runner.hooks._loaded_hooks = [
-        {
-            "name": "beam-runtime",
-            "events": ["gateway:startup"],
-            "path": str(tmp_path / "hook"),
-            "startup_readiness": {
-                "id": "beam-runtime",
-                "required": True,
-            },
-        }
-    ]
-    monkeypatch.setattr(runner, "_create_adapter", lambda platform, platform_config: _SuccessfulAdapter())
-    monkeypatch.setattr("hermes_cli.plugins.discover_plugins", lambda: None)
-    monkeypatch.setattr(runner.hooks, "discover_and_load", lambda: None)
-
-    async def _assert_checks(event_type, context):
-        state = read_runtime_status()
-        assert event_type == "gateway:startup"
-        assert state["startup_checks"]["beam-runtime"]["state"] == "pending"
-        assert state["startup_checks"]["beam-runtime"]["required"] is True
-
-    monkeypatch.setattr(runner.hooks, "emit", _assert_checks)
-
-    ok = await runner.start()
-
-    assert ok is True
-
-
 @pytest.mark.asyncio
 async def test_start_gateway_verbosity_imports_redacting_formatter(monkeypatch, tmp_path):
     """Verbosity != None must not crash with NameError on RedactingFormatter (#8044)."""
@@ -132,72 +132,6 @@ class TestGatewayRuntimeStatus:
         assert payload["platforms"]["discord"]["error_code"] is None
         assert payload["platforms"]["discord"]["error_message"] is None

-    def test_reset_startup_checks_replaces_previous_run_entries(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-
-        status.write_runtime_status(
-            gateway_state="running",
-            startup_checks={
-                "old-check": {
-                    "state": "ready",
-                    "required": True,
-                    "source": "old-hook",
-                    "detail": None,
-                }
-            },
-        )
-
-        status.reset_startup_checks([
-            {
-                "name": "new-hook",
-                "startup_readiness": {
-                    "id": "new-check",
-                    "required": False,
-                },
-            }
-        ])
-
-        payload = status.read_runtime_status()
-        assert set(payload["startup_checks"]) == {"new-check"}
-        assert payload["startup_checks"]["new-check"]["state"] == "pending"
-        assert payload["startup_checks"]["new-check"]["required"] is False
-        assert payload["startup_checks"]["new-check"]["source"] == "new-hook"
-
-    def test_mark_startup_check_ready_persists_detail(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-
-        status.reset_startup_checks([
-            {
-                "name": "beam",
-                "startup_readiness": {
-                    "id": "beam-runtime",
-                    "required": True,
-                },
-            }
-        ])
-
-        status.mark_startup_check_ready("beam-runtime", detail="ready for RPC")
-
-        payload = status.read_runtime_status()
-        assert payload["startup_checks"]["beam-runtime"]["state"] == "ready"
-        assert payload["startup_checks"]["beam-runtime"]["detail"] == "ready for RPC"
-
-    def test_mark_startup_check_failed_creates_missing_entry(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-
-        status.mark_startup_check_failed(
-            "late-hook",
-            detail="startup hook crashed",
-            required=False,
-            source="late-hook",
-        )
-
-        payload = status.read_runtime_status()
-        assert payload["startup_checks"]["late-hook"]["state"] == "failed"
-        assert payload["startup_checks"]["late-hook"]["required"] is False
-        assert payload["startup_checks"]["late-hook"]["source"] == "late-hook"
-        assert payload["startup_checks"]["late-hook"]["detail"] == "startup hook crashed"
-

 class TestTerminatePid:
    def test_force_uses_taskkill_on_windows(self, monkeypatch):
@@ -0,0 +1,116 @@
+"""Tests for stuck-session loop detection (#7536).
+
+When a session is active across 3+ consecutive gateway restarts (the agent
+gets stuck, gateway restarts, same session gets stuck again), the session
+is auto-suspended on startup so the user gets a clean slate.
+"""
+
+import json
+from pathlib import Path
+from unittest.mock import MagicMock
+
+import pytest
+
+from tests.gateway.restart_test_helpers import make_restart_runner
+
+
+@pytest.fixture
+def runner_with_home(tmp_path, monkeypatch):
+    """Create a runner with a writable HERMES_HOME."""
+    monkeypatch.setattr("gateway.run._hermes_home", tmp_path)
+    runner, adapter = make_restart_runner()
+    return runner, tmp_path
+
+
+class TestStuckLoopDetection:
+
+    def test_increment_creates_file(self, runner_with_home):
+        runner, home = runner_with_home
+        runner._increment_restart_failure_counts({"session:a", "session:b"})
+        path = home / runner._STUCK_LOOP_FILE
+        assert path.exists()
+        counts = json.loads(path.read_text())
+        assert counts["session:a"] == 1
+        assert counts["session:b"] == 1
+
+    def test_increment_accumulates(self, runner_with_home):
+        runner, home = runner_with_home
+        runner._increment_restart_failure_counts({"session:a"})
+        runner._increment_restart_failure_counts({"session:a"})
+        runner._increment_restart_failure_counts({"session:a"})
+        counts = json.loads((home / runner._STUCK_LOOP_FILE).read_text())
+        assert counts["session:a"] == 3
+
+    def test_increment_drops_inactive_sessions(self, runner_with_home):
+        runner, home = runner_with_home
+        runner._increment_restart_failure_counts({"session:a", "session:b"})
+        runner._increment_restart_failure_counts({"session:a"})  # b not active
+        counts = json.loads((home / runner._STUCK_LOOP_FILE).read_text())
+        assert "session:a" in counts
+        assert "session:b" not in counts
+
+    def test_suspend_at_threshold(self, runner_with_home):
+        runner, home = runner_with_home
+        # Simulate 3 restarts with session:a active each time
+        for _ in range(3):
+            runner._increment_restart_failure_counts({"session:a"})
+
+        # Create a mock session entry
+        mock_entry = MagicMock()
+        mock_entry.suspended = False
+        runner.session_store._entries = {"session:a": mock_entry}
+        runner.session_store._save = MagicMock()
+
+        suspended = runner._suspend_stuck_loop_sessions()
+        assert suspended == 1
+        assert mock_entry.suspended is True
+
+    def test_no_suspend_below_threshold(self, runner_with_home):
+        runner, home = runner_with_home
+        runner._increment_restart_failure_counts({"session:a"})
+        runner._increment_restart_failure_counts({"session:a"})
+        # Only 2 restarts — below threshold of 3
+
+        mock_entry = MagicMock()
+        mock_entry.suspended = False
+        runner.session_store._entries = {"session:a": mock_entry}
+
+        suspended = runner._suspend_stuck_loop_sessions()
+        assert suspended == 0
+        assert mock_entry.suspended is False
+
+    def test_clear_on_success(self, runner_with_home):
+        runner, home = runner_with_home
+        runner._increment_restart_failure_counts({"session:a", "session:b"})
+        runner._clear_restart_failure_count("session:a")
+
+        path = home / runner._STUCK_LOOP_FILE
+        counts = json.loads(path.read_text())
+        assert "session:a" not in counts
+        assert "session:b" in counts
+
+    def test_clear_removes_file_when_empty(self, runner_with_home):
+        runner, home = runner_with_home
+        runner._increment_restart_failure_counts({"session:a"})
+        runner._clear_restart_failure_count("session:a")
+        assert not (home / runner._STUCK_LOOP_FILE).exists()
+
+    def test_suspend_clears_file(self, runner_with_home):
+        runner, home = runner_with_home
+        for _ in range(3):
+            runner._increment_restart_failure_counts({"session:a"})
+
+        mock_entry = MagicMock()
+        mock_entry.suspended = False
+        runner.session_store._entries = {"session:a": mock_entry}
+        runner.session_store._save = MagicMock()
+
+        runner._suspend_stuck_loop_sessions()
+        assert not (home / runner._STUCK_LOOP_FILE).exists()
+
+    def test_no_file_no_crash(self, runner_with_home):
+        runner, home = runner_with_home
+        # No file exists — should return 0 and not crash
+        assert runner._suspend_stuck_loop_sessions() == 0
+        # Clear on nonexistent file — should not crash
+        runner._clear_restart_failure_count("nonexistent")
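The bookkeeping these tests exercise can be sketched roughly as follows. This is a minimal illustration inferred only from the assertions above, not the gateway's actual implementation: the class name `StuckLoopTracker`, the file name `stuck_loop_counts.json`, and the method names are all hypothetical, and the threshold of 3 is taken from the test comments.

```python
import json
import tempfile
from pathlib import Path

THRESHOLD = 3  # assumed from the tests: suspend after 3 consecutive restarts


class StuckLoopTracker:
    """Hypothetical stand-in for the runner's restart-failure bookkeeping."""

    def __init__(self, home: Path, filename: str = "stuck_loop_counts.json"):
        # The real file name lives in runner._STUCK_LOOP_FILE; this one is illustrative.
        self.path = home / filename

    def _load(self) -> dict:
        try:
            return json.loads(self.path.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            return {}

    def _store(self, counts: dict) -> None:
        # An empty mapping removes the file, matching test_clear_removes_file_when_empty.
        if counts:
            self.path.write_text(json.dumps(counts))
        else:
            self.path.unlink(missing_ok=True)

    def increment(self, active: set) -> None:
        # Only sessions active at this restart survive; inactive ones are dropped.
        old = self._load()
        self._store({key: old.get(key, 0) + 1 for key in active})

    def clear(self, key: str) -> None:
        counts = self._load()
        counts.pop(key, None)
        self._store(counts)

    def stuck(self) -> list:
        # Sessions that reached the threshold and would be suspended on startup.
        return sorted(k for k, n in self._load().items() if n >= THRESHOLD)
```

Rebuilding the mapping from the active set on every increment is what makes the counter track consecutive restarts: a session that is absent from one restart loses its count entirely.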
@@ -6,21 +6,12 @@ from pathlib import Path
 from types import SimpleNamespace

 import hermes_cli.gateway as gateway_cli
-import pytest
 from gateway.restart import (
    DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT,
    GATEWAY_SERVICE_RESTART_EXIT_CODE,
 )


-_REAL_AWAIT_SERVICE_READY = gateway_cli._await_service_ready_or_exit
-
-
-@pytest.fixture(autouse=True)
-def _stub_service_readiness(monkeypatch):
-    monkeypatch.setattr(gateway_cli, "_await_service_ready_or_exit", lambda **kwargs: None)
-
-
 class TestSystemdServiceRefresh:
    def test_systemd_install_repairs_outdated_unit_without_force(self, tmp_path, monkeypatch):
        unit_path = tmp_path / "hermes-gateway.service"
@@ -91,30 +82,6 @@ class TestSystemdServiceRefresh:
            ["systemctl", "--user", "reload-or-restart", gateway_cli.get_service_name()],
        ]

-    def test_systemd_start_waits_for_readiness_before_reporting_success(self, monkeypatch):
-        calls = []
-
-        monkeypatch.setattr(gateway_cli, "_select_systemd_scope", lambda system=False: False)
-        monkeypatch.setattr(gateway_cli, "refresh_systemd_unit_if_needed", lambda system=False: calls.append(("refresh", system)))
-        monkeypatch.setattr(
-            gateway_cli,
-            "_run_systemctl",
-            lambda cmd, system=False, check=True, timeout=30, **kwargs: calls.append((tuple(cmd), system, timeout)),
-        )
-        monkeypatch.setattr(
-            gateway_cli,
-            "_await_service_ready_or_exit",
-            lambda **kwargs: calls.append(("ready", kwargs)),
-        )
-
-        gateway_cli.systemd_start()
-
-        assert calls == [
-            ("refresh", False),
-            (("start", gateway_cli.get_service_name()), False, 30),
-            ("ready", {"action": "start"}),
-        ]
-

 class TestGeneratedSystemdUnits:
    def test_user_unit_avoids_recursive_execstop_and_uses_extended_stop_timeout(self):
@@ -301,32 +268,6 @@ class TestLaunchdServiceRecovery:
            ["launchctl", "kickstart", target],
        ]

-    def test_launchd_start_waits_for_readiness_before_reporting_success(self, tmp_path, monkeypatch):
-        plist_path = tmp_path / "ai.hermes.gateway.plist"
-        plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
-        label = gateway_cli.get_launchd_label()
-        calls = []
-
-        monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
-        monkeypatch.setattr(gateway_cli, "refresh_launchd_plist_if_needed", lambda: None)
-        monkeypatch.setattr(
-            gateway_cli.subprocess,
-            "run",
-            lambda cmd, check=False, **kwargs: calls.append(cmd) or SimpleNamespace(returncode=0, stdout="", stderr=""),
-        )
-        monkeypatch.setattr(
-            gateway_cli,
-            "_await_service_ready_or_exit",
-            lambda **kwargs: calls.append(("ready", kwargs)),
-        )
-
-        gateway_cli.launchd_start()
-
-        assert calls == [
-            ["launchctl", "kickstart", f"{gateway_cli._launchd_domain()}/{label}"],
-            ("ready", {"action": "start"}),
-        ]
-
    def test_launchd_restart_drains_running_gateway_before_kickstart(self, monkeypatch):
        calls = []
        target = f"{gateway_cli._launchd_domain()}/{gateway_cli.get_launchd_label()}"
@@ -374,7 +315,7 @@ class TestLaunchdServiceRecovery:
        gateway_cli.launchd_restart()

        assert calls == [("self", 321)]
-        assert "service restarted" in capsys.readouterr().out.lower()
+        assert "restart requested" in capsys.readouterr().out.lower()

    def test_launchd_stop_uses_bootout_not_kill(self, monkeypatch):
        """launchd_stop must bootout the service so KeepAlive doesn't respawn it."""
@@ -452,109 +393,6 @@ class TestLaunchdServiceRecovery:
        assert "not loaded" in output.lower()


-class TestGatewayServiceReadiness:
-    def test_wait_for_service_readiness_accepts_running_gateway_without_checks(self, monkeypatch):
-        monkeypatch.setattr("gateway.status.get_running_pid", lambda: 123)
-        monkeypatch.setattr(
-            "gateway.status.read_runtime_status",
-            lambda: {"pid": 123, "gateway_state": "running", "startup_checks": {}},
-        )
-
-        warnings = gateway_cli._wait_for_service_readiness(action="start", timeout=0.1, poll_interval=0.0)
-
-        assert warnings == []
-
-    def test_wait_for_service_readiness_ignores_stale_runtime_state_until_pid_matches(self, monkeypatch):
-        runtime_states = iter(
-            [
-                {"pid": 999, "gateway_state": "running", "startup_checks": {}},
-                {"pid": 123, "gateway_state": "running", "startup_checks": {}},
-            ]
-        )
-
-        monkeypatch.setattr("gateway.status.get_running_pid", lambda: 123)
-        monkeypatch.setattr("gateway.status.read_runtime_status", lambda: next(runtime_states))
-
-        warnings = gateway_cli._wait_for_service_readiness(action="start", timeout=0.1, poll_interval=0.0)
-
-        assert warnings == []
-
-    def test_wait_for_service_readiness_returns_optional_pending_warnings(self, monkeypatch):
-        monkeypatch.setattr("gateway.status.get_running_pid", lambda: 123)
-        monkeypatch.setattr(
-            "gateway.status.read_runtime_status",
-            lambda: {
-                "pid": 123,
-                "gateway_state": "running",
-                "startup_checks": {
-                    "optional-check": {
-                        "state": "pending",
-                        "required": False,
-                        "source": "test-hook",
-                        "detail": "still warming",
-                    }
-                },
-            },
-        )
-
-        warnings = gateway_cli._wait_for_service_readiness(action="start", timeout=0.1, poll_interval=0.0)
-
-        assert warnings == ["pending: optional-check (test-hook): still warming"]
-
-    def test_wait_for_service_readiness_fails_when_required_check_fails(self, monkeypatch):
-        monkeypatch.setattr("gateway.status.get_running_pid", lambda: 123)
-        monkeypatch.setattr(
-            "gateway.status.read_runtime_status",
-            lambda: {
-                "pid": 123,
-                "gateway_state": "running",
-                "startup_checks": {
-                    "beam-runtime": {
-                        "state": "failed",
-                        "required": True,
-                        "source": "beam",
-                        "detail": "RPC boot failed",
-                    }
-                },
-            },
-        )
-
-        with pytest.raises(RuntimeError, match=r"required startup checks failed: beam-runtime \(beam\): RPC boot failed"):
-            gateway_cli._wait_for_service_readiness(action="start", timeout=0.1, poll_interval=0.0)
-
-    def test_wait_for_service_readiness_times_out_on_pending_required_check(self, monkeypatch):
-        monkeypatch.setattr("gateway.status.get_running_pid", lambda: 123)
-        monkeypatch.setattr(
-            "gateway.status.read_runtime_status",
-            lambda: {
-                "pid": 123,
-                "gateway_state": "running",
-                "startup_checks": {
-                    "beam-runtime": {
-                        "state": "pending",
-                        "required": True,
-                        "source": "beam",
-                        "detail": "waiting for runtime",
-                    }
-                },
-            },
-        )
-
-        with pytest.raises(RuntimeError, match=r"timed out waiting for required startup checks: beam-runtime \(beam\): waiting for runtime"):
-            gateway_cli._wait_for_service_readiness(action="start", timeout=0.01, poll_interval=0.0)
-
-    def test_await_service_ready_or_exit_raises_system_exit_when_not_ready(self, monkeypatch):
-        monkeypatch.setattr(gateway_cli, "_await_service_ready_or_exit", _REAL_AWAIT_SERVICE_READY)
-        monkeypatch.setattr(
-            gateway_cli,
-            "_wait_for_service_readiness",
-            lambda **kwargs: (_ for _ in ()).throw(RuntimeError("not ready")),
-        )
-
-        with pytest.raises(SystemExit, match="1"):
-            gateway_cli._await_service_ready_or_exit(action="start")
-
-
 class TestGatewayServiceDetection:
    def test_supports_systemd_services_requires_systemctl_binary(self, monkeypatch):
        monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
@@ -614,7 +452,7 @@ class TestGatewayServiceDetection:


 class TestGatewaySystemServiceRouting:
-    def test_systemd_restart_self_requests_graceful_restart_without_reload_or_restart(self, monkeypatch, capsys):
+    def test_systemd_restart_self_requests_graceful_restart_and_waits(self, monkeypatch, capsys):
        calls = []

        monkeypatch.setattr(gateway_cli, "_select_systemd_scope", lambda system=False: False)
@@ -628,16 +466,37 @@ class TestGatewaySystemServiceRouting:
            "_request_gateway_self_restart",
            lambda pid: calls.append(("self", pid)) or True,
        )
-        monkeypatch.setattr(
-            gateway_cli.subprocess,
-            "run",
-            lambda *args, **kwargs: (_ for _ in ()).throw(AssertionError("systemctl should not run")),
-        )
+
+        # Simulate: old process dies immediately, new process becomes active
+        kill_call_count = [0]
+        def fake_kill(pid, sig):
+            kill_call_count[0] += 1
+            if kill_call_count[0] >= 2:  # first call: still alive; later calls: process gone
+                raise ProcessLookupError()
+        monkeypatch.setattr(os, "kill", fake_kill)
+
+        # Simulate systemctl is-active returning "active" with a new PID
+        new_pid = [None]
+        def fake_subprocess_run(cmd, **kwargs):
+            if "is-active" in cmd:
+                result = SimpleNamespace(stdout="active\n", returncode=0)
+                new_pid[0] = 999  # new PID
+                return result
+            raise AssertionError(f"Unexpected systemctl call: {cmd}")
+
+        monkeypatch.setattr(gateway_cli.subprocess, "run", fake_subprocess_run)
+        # get_running_pid returns new PID after restart
+        pid_calls = [0]
+        def fake_get_pid():
+            pid_calls[0] += 1
+            return 999 if pid_calls[0] > 1 else 654
+        monkeypatch.setattr("gateway.status.get_running_pid", fake_get_pid)

        gateway_cli.systemd_restart()

-        assert calls == [("refresh", False), ("self", 654)]
-        assert "service restarted" in capsys.readouterr().out.lower()
+        assert ("self", 654) in calls
+        out = capsys.readouterr().out.lower()
+        assert "restarted" in out

    def test_gateway_install_passes_system_flags(self, monkeypatch):
        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
@@ -0,0 +1,83 @@
+"""Tests for non-ASCII credential detection and sanitization.
+
+Covers the fix for issue #6843 — API keys containing Unicode lookalike
+characters (e.g. ʋ U+028B instead of v) cause UnicodeEncodeError when
+httpx tries to encode the Authorization header as ASCII.
+"""
+
+import os
+import sys
+import tempfile
+
+import pytest
+
+from hermes_cli.config import _check_non_ascii_credential
+
+
+class TestCheckNonAsciiCredential:
+    """Tests for _check_non_ascii_credential()."""
+
+    def test_ascii_key_unchanged(self):
+        key = "sk-proj-" + "a" * 100
+        result = _check_non_ascii_credential("TEST_API_KEY", key)
+        assert result == key
+
+    def test_strips_unicode_v_lookalike(self, capsys):
+        """The exact scenario from issue #6843: ʋ instead of v."""
+        key = "sk-proj-abc" + "ʋ" + "def"  # \u028b
+        result = _check_non_ascii_credential("OPENROUTER_API_KEY", key)
+        assert result == "sk-proj-abcdef"
+        assert "ʋ" not in result
+        # Should print a warning
+        captured = capsys.readouterr()
+        assert "non-ASCII" in captured.err
+
+    def test_strips_multiple_non_ascii(self, capsys):
+        key = "sk-proj-aʋbécd"
+        result = _check_non_ascii_credential("OPENAI_API_KEY", key)
+        assert result == "sk-proj-abcd"
+        captured = capsys.readouterr()
+        assert "U+028B" in captured.err  # warning names the offending code point
+
+    def test_empty_key(self):
+        result = _check_non_ascii_credential("TEST_KEY", "")
+        assert result == ""
+
+    def test_all_ascii_no_warning(self, capsys):
+        result = _check_non_ascii_credential("KEY", "all-ascii-value-123")
+        assert result == "all-ascii-value-123"
+        captured = capsys.readouterr()
+        assert captured.err == ""
+
+
+class TestEnvLoaderSanitization:
+    """Tests for _sanitize_loaded_credentials in env_loader."""
+
+    def test_strips_non_ascii_from_api_key(self, monkeypatch):
+        from hermes_cli.env_loader import _sanitize_loaded_credentials
+
+        monkeypatch.setenv("OPENROUTER_API_KEY", "sk-proj-abcʋdef")
+        _sanitize_loaded_credentials()
+        assert os.environ["OPENROUTER_API_KEY"] == "sk-proj-abcdef"
+
+    def test_strips_non_ascii_from_token(self, monkeypatch):
+        from hermes_cli.env_loader import _sanitize_loaded_credentials
+
+        monkeypatch.setenv("DISCORD_BOT_TOKEN", "tokénvalue")
+        _sanitize_loaded_credentials()
+        assert os.environ["DISCORD_BOT_TOKEN"] == "toknvalue"
+
+    def test_ignores_non_credential_vars(self, monkeypatch):
+        from hermes_cli.env_loader import _sanitize_loaded_credentials
+
+        monkeypatch.setenv("MY_UNICODE_VAR", "héllo wörld")
+        _sanitize_loaded_credentials()
+        # Not a credential suffix — should be left alone
+        assert os.environ["MY_UNICODE_VAR"] == "héllo wörld"
+
+    def test_ascii_credentials_untouched(self, monkeypatch):
+        from hermes_cli.env_loader import _sanitize_loaded_credentials
+
+        monkeypatch.setenv("OPENAI_API_KEY", "sk-proj-allascii123")
+        _sanitize_loaded_credentials()
+        assert os.environ["OPENAI_API_KEY"] == "sk-proj-allascii123"
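For orientation, the behavior these tests pin down could look something like the sketch below. It is an assumption-laden illustration, not the code in `hermes_cli.config`: the function name `strip_non_ascii_credential` and the exact warning wording are hypothetical, chosen only so that the message contains "non-ASCII" and the `U+XXXX` code points the tests look for.

```python
import sys


def strip_non_ascii_credential(name: str, value: str) -> str:
    """Hypothetical sketch: drop non-ASCII characters and warn on stderr."""
    bad = [ch for ch in value if ord(ch) > 127]
    if not bad:
        # Pure-ASCII (or empty) values pass through silently.
        return value
    codes = ", ".join(f"U+{ord(ch):04X}" for ch in bad)
    print(
        f"warning: {name} contains non-ASCII characters ({codes}); stripping them",
        file=sys.stderr,
    )
    return "".join(ch for ch in value if ord(ch) <= 127)
```

Stripping rather than rejecting keeps the `Authorization` header encodable as ASCII, at the cost of possibly producing a key the provider still rejects; the warning is what lets the user notice and re-paste the credential.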
@@ -142,6 +142,33 @@ class TestSurrogateVsAsciiSanitization:
        assert _sanitize_messages_surrogates(messages) is False


+class TestApiKeyNonAsciiSanitization:
+    """Tests for API key sanitization in the UnicodeEncodeError recovery.
+
+    Covers the root cause of issue #6843: a non-ASCII character (ʋ U+028B)
+    in the API key causes httpx to fail when encoding the Authorization
+    header as ASCII.  The recovery block must strip non-ASCII from the key.
+    """
+
+    def test_strip_non_ascii_from_api_key(self):
+        """_strip_non_ascii removes ʋ from an API key string."""
+        key = "sk-proj-abc" + "ʋ" + "def"
+        assert _strip_non_ascii(key) == "sk-proj-abcdef"
+
+    def test_api_key_at_position_153(self):
+        """Reproduce the exact error: ʋ at position 153 in 'Bearer <key>'."""
+        key = "sk-proj-" + "a" * 138 + "ʋ" + "bcd"
+        auth_value = f"Bearer {key}"
+        # This is what httpx does — and it fails:
+        with pytest.raises(UnicodeEncodeError) as exc_info:
+            auth_value.encode("ascii")
+        assert exc_info.value.start == 153
+        # After sanitization, it should work:
+        sanitized_key = _strip_non_ascii(key)
+        sanitized_auth = f"Bearer {sanitized_key}"
+        sanitized_auth.encode("ascii")  # should not raise
+
+
 class TestSanitizeToolsNonAscii:
    """Tests for _sanitize_tools_non_ascii."""

@@ -116,6 +116,22 @@ class TestValidateToolset:
    def test_invalid(self):
        assert validate_toolset("nonexistent") is False

+    def test_mcp_alias_uses_live_registry(self, monkeypatch):
+        reg = ToolRegistry()
+        reg.register(
+            name="mcp_dynserver_ping",
+            toolset="mcp-dynserver",
+            schema=_make_schema("mcp_dynserver_ping", "Ping"),
+            handler=_dummy_handler,
+        )
+        reg.register_toolset_alias("dynserver", "mcp-dynserver")
+
+        monkeypatch.setattr("tools.registry.registry", reg)
+
+        assert validate_toolset("dynserver") is True
+        assert validate_toolset("mcp-dynserver") is True
+        assert "mcp_dynserver_ping" in resolve_toolset("dynserver")
+

 class TestGetToolsetInfo:
    def test_leaf(self):
@@ -150,6 +166,23 @@ class TestCreateCustomToolset:
            del TOOLSETS["_test_custom"]


+class TestRegistryOwnedToolsets:
+    def test_registry_membership_is_live(self, monkeypatch):
+        reg = ToolRegistry()
+        reg.register(
+            name="test_live_toolset_tool",
+            toolset="test-live-toolset",
+            schema=_make_schema("test_live_toolset_tool", "Live"),
+            handler=_dummy_handler,
+        )
+
+        monkeypatch.setattr("tools.registry.registry", reg)
+
+        assert validate_toolset("test-live-toolset") is True
+        assert get_toolset("test-live-toolset")["tools"] == ["test_live_toolset_tool"]
+        assert resolve_toolset("test-live-toolset") == ["test_live_toolset_tool"]
+
+
 class TestToolsetConsistency:
    """Verify structural integrity of the built-in TOOLSETS dict."""

@@ -31,18 +31,25 @@ def _clear_browser_caches():


 class TestSanePath:
-    """Verify _SANE_PATH includes Homebrew directories."""
+    """Verify _SANE_PATH includes fallback directories used by browser_tool."""
+
+    def test_includes_termux_bin(self):
+        assert "/data/data/com.termux/files/usr/bin" in _SANE_PATH.split(os.pathsep)
+
+    def test_includes_termux_sbin(self):
+        assert "/data/data/com.termux/files/usr/sbin" in _SANE_PATH.split(os.pathsep)

    def test_includes_homebrew_bin(self):
-        assert "/opt/homebrew/bin" in _SANE_PATH
+        assert "/opt/homebrew/bin" in _SANE_PATH.split(os.pathsep)

    def test_includes_homebrew_sbin(self):
-        assert "/opt/homebrew/sbin" in _SANE_PATH
+        assert "/opt/homebrew/sbin" in _SANE_PATH.split(os.pathsep)

    def test_includes_standard_dirs(self):
-        assert "/usr/local/bin" in _SANE_PATH
-        assert "/usr/bin" in _SANE_PATH
-        assert "/bin" in _SANE_PATH
+        path_parts = _SANE_PATH.split(os.pathsep)
+        assert "/usr/local/bin" in path_parts
+        assert "/usr/bin" in path_parts
+        assert "/bin" in path_parts


 class TestDiscoverHomebrewNodeDirs:
@@ -143,6 +150,44 @@ class TestFindAgentBrowser:
            result = _find_agent_browser()
            assert result == "npx agent-browser"

+    def test_finds_npx_in_termux_fallback_path(self):
+        """Should find npx when only Termux fallback dirs are available."""
+        def mock_which(cmd, path=None):
+            if cmd == "agent-browser":
+                return None
+            if cmd == "npx":
+                if path and "/data/data/com.termux/files/usr/bin" in path:
+                    return "/data/data/com.termux/files/usr/bin/npx"
+                return None
+            return None
+
+        original_path_exists = Path.exists
+
+        def mock_path_exists(self):
+            if "node_modules" in str(self) and "agent-browser" in str(self):
+                return False
+            return original_path_exists(self)
+
+        real_isdir = os.path.isdir
+
+        def selective_isdir(path):
+            if path in (
+                "/data/data/com.termux/files/usr/bin",
+                "/data/data/com.termux/files/usr/sbin",
+            ):
+                return True
+            return real_isdir(path)
+
+        with patch("shutil.which", side_effect=mock_which), \
+             patch("os.path.isdir", side_effect=selective_isdir), \
+             patch.object(Path, "exists", mock_path_exists), \
+             patch(
+                 "tools.browser_tool._discover_homebrew_node_dirs",
+                 return_value=[],
+             ):
+            result = _find_agent_browser()
+            assert result == "npx agent-browser"
+
    def test_raises_when_not_found(self):
        """Should raise FileNotFoundError when nothing works."""
        original_path_exists = Path.exists
@@ -399,3 +444,51 @@ class TestRunBrowserCommandPathConstruction:
        result_path = captured_env.get("PATH", "")
        assert "/opt/homebrew/bin" in result_path
        assert "/opt/homebrew/sbin" in result_path
+
+    def test_subprocess_path_includes_termux_fallback_dirs(self, tmp_path):
+        """Termux fallback dirs should survive browser PATH rebuilding."""
+        captured_env = {}
+
+        mock_proc = MagicMock()
+        mock_proc.returncode = 0
+        mock_proc.wait.return_value = 0
+
+        def capture_popen(cmd, **kwargs):
+            captured_env.update(kwargs.get("env", {}))
+            return mock_proc
+
+        fake_session = {
+            "session_name": "test-session",
+            "session_id": "test-id",
+            "cdp_url": None,
+        }
+
+        fake_json = json.dumps({"success": True})
+        real_isdir = os.path.isdir
+
+        def selective_isdir(path):
+            if path in (
+                "/data/data/com.termux/files/usr/bin",
+                "/data/data/com.termux/files/usr/sbin",
+            ):
+                return True
+            if path.startswith(str(tmp_path)):
+                return True
+            return real_isdir(path)
+
+        with patch("tools.browser_tool._find_agent_browser", return_value="/usr/local/bin/agent-browser"), \
+             patch("tools.browser_tool._get_session_info", return_value=fake_session), \
+             patch("tools.browser_tool._socket_safe_tmpdir", return_value=str(tmp_path)), \
+             patch("tools.browser_tool._discover_homebrew_node_dirs", return_value=[]), \
+             patch("os.path.isdir", side_effect=selective_isdir), \
+             patch("subprocess.Popen", side_effect=capture_popen), \
+             patch("os.open", return_value=99), \
+             patch("os.close"), \
+             patch("tools.interrupt.is_interrupted", return_value=False), \
+             patch.dict(os.environ, {"PATH": "/usr/bin:/bin", "HOME": "/home/test"}, clear=True):
+            with patch("builtins.open", mock_open(read_data=fake_json)):
+                _run_browser_command("test-task", "navigate", ["https://example.com"])
+
+        result_path = captured_env.get("PATH", "")
+        assert "/data/data/com.termux/files/usr/bin" in result_path
+        assert "/data/data/com.termux/files/usr/sbin" in result_path
@@ -21,34 +21,19 @@ class TestRegisterServerTools:
    def mock_registry(self):
        return ToolRegistry()

-    @pytest.fixture
-    def mock_toolsets(self):
-        return {
-            "hermes-cli": {"tools": ["terminal"], "description": "CLI", "includes": []},
-            "hermes-telegram": {"tools": ["terminal"], "description": "TG", "includes": []},
-            "custom-toolset": {"tools": [], "description": "Other", "includes": []},
-        }
-
-    def test_injects_hermes_toolsets(self, mock_registry, mock_toolsets):
-        """Tools are injected into hermes-* toolsets but not custom ones."""
+    def test_exposes_live_server_aliases(self, mock_registry):
+        """Registered MCP tools are reachable via live raw-server aliases."""
        server = MCPServerTask("my_srv")
        server._tools = [_make_mcp_tool("my_tool", "desc")]
        server.session = MagicMock()
+        from toolsets import resolve_toolset, validate_toolset

-        with patch("tools.registry.registry", mock_registry), \
-            patch("toolsets.create_custom_toolset"), \
-            patch.dict("toolsets.TOOLSETS", mock_toolsets, clear=True):
-
+        with patch("tools.registry.registry", mock_registry):
            registered = _register_server_tools("my_srv", server, {})
-
-        assert "mcp_my_srv_my_tool" in registered
-        assert "mcp_my_srv_my_tool" in mock_registry.get_all_tool_names()
-
-        # Injected into hermes-* toolsets
-        assert "mcp_my_srv_my_tool" in mock_toolsets["hermes-cli"]["tools"]
-        assert "mcp_my_srv_my_tool" in mock_toolsets["hermes-telegram"]["tools"]
-        # NOT into non-hermes toolsets
-        assert "mcp_my_srv_my_tool" not in mock_toolsets["custom-toolset"]["tools"]
+            assert "mcp_my_srv_my_tool" in registered
+            assert "mcp_my_srv_my_tool" in mock_registry.get_all_tool_names()
+            assert validate_toolset("my_srv") is True
+            assert "mcp_my_srv_my_tool" in resolve_toolset("my_srv")


 class TestRefreshTools:
@@ -58,19 +43,13 @@ class TestRefreshTools:
    def mock_registry(self):
        return ToolRegistry()

-    @pytest.fixture
-    def mock_toolsets(self):
-        return {
-            "hermes-cli": {"tools": ["terminal"], "description": "CLI", "includes": []},
-            "hermes-telegram": {"tools": ["terminal"], "description": "TG", "includes": []},
-        }
-
    @pytest.mark.asyncio
-    async def test_nuke_and_repave(self, mock_registry, mock_toolsets):
+    async def test_nuke_and_repave(self, mock_registry):
        """Old tools are removed and new tools registered on refresh."""
        server = MCPServerTask("live_srv")
        server._refresh_lock = asyncio.Lock()
        server._config = {}
+        from toolsets import resolve_toolset

        # Seed initial state: one old tool registered
        mock_registry.register(
@@ -79,7 +58,6 @@ class TestRefreshTools:
            description="", emoji="",
        )
        server._registered_tool_names = ["mcp_live_srv_old_tool"]
-        mock_toolsets["hermes-cli"]["tools"].append("mcp_live_srv_old_tool")

        # New tool list from server
        new_tool = _make_mcp_tool("new_tool", "new behavior")
@@ -89,20 +67,13 @@ class TestRefreshTools:
            )
        )

-        with patch("tools.registry.registry", mock_registry), \
-            patch("toolsets.create_custom_toolset"), \
-            patch.dict("toolsets.TOOLSETS", mock_toolsets, clear=True):
-
+        with patch("tools.registry.registry", mock_registry):
            await server._refresh_tools()
-
-        # Old tool completely gone
-        assert "mcp_live_srv_old_tool" not in mock_registry.get_all_tool_names()
-        assert "mcp_live_srv_old_tool" not in mock_toolsets["hermes-cli"]["tools"]
-
-        # New tool registered
-        assert "mcp_live_srv_new_tool" in mock_registry.get_all_tool_names()
-        assert "mcp_live_srv_new_tool" in mock_toolsets["hermes-cli"]["tools"]
-        assert server._registered_tool_names == ["mcp_live_srv_new_tool"]
+            assert "mcp_live_srv_old_tool" not in mock_registry.get_all_tool_names()
+            assert "mcp_live_srv_old_tool" not in resolve_toolset("live_srv")
+            assert "mcp_live_srv_new_tool" in mock_registry.get_all_tool_names()
+            assert "mcp_live_srv_new_tool" in resolve_toolset("live_srv")
+            assert server._registered_tool_names == ["mcp_live_srv_new_tool"]


 class TestMessageHandler:
@@ -165,6 +136,25 @@ class TestDeregister:
        # bar still in ts1, so check should remain
        assert "ts1" in reg._toolset_checks

+    def test_removes_toolset_alias_when_last_tool_is_removed(self):
+        reg = ToolRegistry()
+        reg.register(name="foo", toolset="mcp-srv", schema={}, handler=lambda x: x)
+        reg.register_toolset_alias("srv", "mcp-srv")
+
+        reg.deregister("foo")
+
+        assert reg.get_toolset_alias_target("srv") is None
+
+    def test_preserves_toolset_alias_while_toolset_still_exists(self):
+        reg = ToolRegistry()
+        reg.register(name="foo", toolset="mcp-srv", schema={}, handler=lambda x: x)
+        reg.register(name="bar", toolset="mcp-srv", schema={}, handler=lambda x: x)
+        reg.register_toolset_alias("srv", "mcp-srv")
+
+        reg.deregister("foo")
+
+        assert reg.get_toolset_alias_target("srv") == "mcp-srv"
+
    def test_noop_for_unknown_tool(self):
        reg = ToolRegistry()
        reg.deregister("nonexistent")  # Should not raise
@@ -184,11 +184,7 @@ class TestToolHandler:
    def _patch_mcp_loop(self, coro_side_effect=None):
        """Return a patch for _run_on_mcp_loop that runs the coroutine directly."""
        def fake_run(coro, timeout=30):
-            loop = asyncio.new_event_loop()
-            try:
-                return loop.run_until_complete(coro)
-            finally:
-                loop.close()
+            return asyncio.run(coro)
        if coro_side_effect:
            return patch("tools.mcp_tool._run_on_mcp_loop", side_effect=coro_side_effect)
        return patch("tools.mcp_tool._run_on_mcp_loop", side_effect=fake_run)
@@ -365,10 +361,13 @@ class TestDiscoverAndRegister:

        _servers.pop("fs", None)

-    def test_toolset_created(self):
-        """A custom toolset is created for the MCP server."""
+    def test_toolset_resolves_live_from_registry(self):
+        """MCP toolsets resolve through the live registry without TOOLSETS mutation."""
+        from tools.registry import ToolRegistry
        from tools.mcp_tool import _discover_and_register_server, _servers, MCPServerTask
+        from toolsets import resolve_toolset, validate_toolset

+        mock_registry = ToolRegistry()
        mock_tools = [_make_mcp_tool("ping", "Ping")]
        mock_session = MagicMock()

@@ -378,16 +377,16 @@ class TestDiscoverAndRegister:
            server._tools = mock_tools
            return server

-        mock_create = MagicMock()
        with patch("tools.mcp_tool._connect_server", side_effect=fake_connect), \
-             patch("toolsets.create_custom_toolset", mock_create):
+             patch("tools.registry.registry", mock_registry):
            asyncio.run(
                _discover_and_register_server("myserver", {"command": "test"})
            )

-        mock_create.assert_called_once()
-        call_kwargs = mock_create.call_args
-        assert call_kwargs[1]["name"] == "mcp-myserver" or call_kwargs[0][0] == "mcp-myserver"
+            assert validate_toolset("myserver") is True
+            assert validate_toolset("mcp-myserver") is True
+            assert "mcp_myserver_ping" in resolve_toolset("myserver")
+            assert "mcp_myserver_ping" in resolve_toolset("mcp-myserver")

        _servers.pop("myserver", None)

@@ -550,12 +549,15 @@ class TestMCPServerTask:
 # ---------------------------------------------------------------------------

 class TestToolsetInjection:
-    def test_mcp_tools_added_to_all_hermes_toolsets(self):
-        """Discovered MCP tools are dynamically injected into all hermes-* toolsets."""
+    def test_mcp_tools_resolve_through_server_aliases(self):
+        """Discovered MCP tools resolve through raw server-name aliases."""
        from tools.mcp_tool import MCPServerTask
+        from tools.registry import ToolRegistry
+        from toolsets import resolve_toolset, validate_toolset

        mock_tools = [_make_mcp_tool("list_files", "List files")]
        mock_session = MagicMock()
+        mock_registry = ToolRegistry()

        fresh_servers = {}

@@ -565,43 +567,32 @@ class TestToolsetInjection:
            server._tools = mock_tools
            return server

-        fake_toolsets = {
-            "hermes-cli": {"tools": ["terminal"], "description": "CLI", "includes": []},
-            "hermes-telegram": {"tools": ["terminal"], "description": "TG", "includes": []},
-            "hermes-gateway": {"tools": [], "description": "GW", "includes": []},
-            "non-hermes": {"tools": [], "description": "other", "includes": []},
-        }
        fake_config = {"fs": {"command": "npx", "args": []}}

        with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
             patch("tools.mcp_tool._servers", fresh_servers), \
             patch("tools.mcp_tool._load_mcp_config", return_value=fake_config), \
             patch("tools.mcp_tool._connect_server", side_effect=fake_connect), \
-             patch("toolsets.TOOLSETS", fake_toolsets):
+             patch("tools.registry.registry", mock_registry):
            from tools.mcp_tool import discover_mcp_tools
            result = discover_mcp_tools()

-        assert "mcp_fs_list_files" in result
-        # All hermes-* toolsets get injection
-        assert "mcp_fs_list_files" in fake_toolsets["hermes-cli"]["tools"]
-        assert "mcp_fs_list_files" in fake_toolsets["hermes-telegram"]["tools"]
-        assert "mcp_fs_list_files" in fake_toolsets["hermes-gateway"]["tools"]
-        # Non-hermes toolset should NOT get injection
-        assert "mcp_fs_list_files" not in fake_toolsets["non-hermes"]["tools"]
-        # Original tools preserved
-        assert "terminal" in fake_toolsets["hermes-cli"]["tools"]
-        # Server name becomes a standalone toolset
-        assert "fs" in fake_toolsets
-        assert "mcp_fs_list_files" in fake_toolsets["fs"]["tools"]
-        assert fake_toolsets["fs"]["description"].startswith("MCP server '")
+            assert "mcp_fs_list_files" in result
+            assert validate_toolset("fs") is True
+            assert validate_toolset("mcp-fs") is True
+            assert "mcp_fs_list_files" in resolve_toolset("fs")
+            assert "mcp_fs_list_files" in resolve_toolset("mcp-fs")

    def test_server_toolset_skips_builtin_collision(self):
-        """MCP server named after a built-in toolset shouldn't overwrite it."""
+        """MCP raw aliases never overwrite a built-in toolset name."""
        from tools.mcp_tool import MCPServerTask
+        from tools.registry import ToolRegistry
+        from toolsets import resolve_toolset, validate_toolset

        mock_tools = [_make_mcp_tool("run", "Run command")]
        mock_session = MagicMock()
        fresh_servers = {}
+        mock_registry = ToolRegistry()

        async def fake_connect(name, config):
            server = MCPServerTask(name)
@@ -620,12 +611,15 @@ class TestToolsetInjection:
             patch("tools.mcp_tool._servers", fresh_servers), \
             patch("tools.mcp_tool._load_mcp_config", return_value=fake_config), \
             patch("tools.mcp_tool._connect_server", side_effect=fake_connect), \
+             patch("tools.registry.registry", mock_registry), \
             patch("toolsets.TOOLSETS", fake_toolsets):
            from tools.mcp_tool import discover_mcp_tools
            discover_mcp_tools()

-        # Built-in toolset preserved — description unchanged
-        assert fake_toolsets["terminal"]["description"] == "Terminal tools"
+            assert fake_toolsets["terminal"]["description"] == "Terminal tools"
+            assert "mcp_terminal_run" not in resolve_toolset("terminal")
+            assert validate_toolset("mcp-terminal") is True
+            assert "mcp_terminal_run" in resolve_toolset("mcp-terminal")

    def test_server_connection_failure_skipped(self):
        """If one server fails to connect, others still proceed."""
@@ -776,6 +770,42 @@ class TestShutdown:
        assert len(_servers) == 0
        mock_server.shutdown.assert_called_once()

+    def test_shutdown_deregisters_registered_tools(self):
+        """shutdown_mcp_servers removes MCP tools and their raw alias."""
+        import tools.mcp_tool as mcp_mod
+        from tools.mcp_tool import MCPServerTask, shutdown_mcp_servers, _servers
+        from tools.registry import registry
+        from toolsets import resolve_toolset, validate_toolset
+
+        _servers.clear()
+        registry.register(
+            name="mcp_test_ping",
+            toolset="mcp-test",
+            schema={
+                "name": "mcp_test_ping",
+                "description": "Ping",
+                "parameters": {"type": "object", "properties": {}},
+            },
+            handler=lambda *_args, **_kwargs: "{}",
+        )
+        registry.register_toolset_alias("test", "mcp-test")
+
+        server = MCPServerTask("test")
+        server._registered_tool_names = ["mcp_test_ping"]
+        _servers["test"] = server
+
+        mcp_mod._ensure_mcp_loop()
+        try:
+            assert validate_toolset("test") is True
+            assert "mcp_test_ping" in resolve_toolset("test")
+            shutdown_mcp_servers()
+        finally:
+            mcp_mod._mcp_loop = None
+            mcp_mod._mcp_thread = None
+
+        assert "mcp_test_ping" not in registry.get_all_tool_names()
+        assert validate_toolset("test") is False
+
    def test_shutdown_handles_errors(self):
        """shutdown_mcp_servers handles errors during close gracefully."""
        import tools.mcp_tool as mcp_mod
@@ -1179,7 +1209,11 @@ class TestConfigurableTimeouts:
        try:
            handler = _make_tool_handler("test_srv", "my_tool", 180)
            with patch("tools.mcp_tool._run_on_mcp_loop") as mock_run:
-                mock_run.return_value = json.dumps({"result": "ok"})
+                def fake_run(coro, timeout=30):
+                    coro.close()
+                    return json.dumps({"result": "ok"})
+
+                mock_run.side_effect = fake_run
                handler({})
                # Verify timeout=180 was passed
                call_kwargs = mock_run.call_args
@@ -1279,11 +1313,7 @@ class TestUtilityHandlers:
    def _patch_mcp_loop(self):
        """Return a patch for _run_on_mcp_loop that runs the coroutine directly."""
        def fake_run(coro, timeout=30):
-            loop = asyncio.new_event_loop()
-            try:
-                return loop.run_until_complete(coro)
-            finally:
-                loop.close()
+            return asyncio.run(coro)
        return patch("tools.mcp_tool._run_on_mcp_loop", side_effect=fake_run)

    # -- list_resources --
@@ -3038,14 +3068,23 @@ class TestSanitizeMcpNameComponent:
            assert "/" not in name
            assert "." not in name

-    def test_slash_in_sync_mcp_toolsets(self):
-        """_sync_mcp_toolsets uses sanitize consistently with _convert_mcp_schema."""
-        from tools.mcp_tool import sanitize_mcp_name_component
+    def test_slash_in_server_alias_resolution(self):
+        """Server names with slashes resolve through their live MCP alias."""
+        from tools.registry import ToolRegistry
+        from toolsets import resolve_toolset, validate_toolset

-        # Verify the prefix generation matches what _convert_mcp_schema produces
-        server_name = "ai.exa/exa"
-        safe_prefix = f"mcp_{sanitize_mcp_name_component(server_name)}_"
-        assert safe_prefix == "mcp_ai_exa_exa_"
+        reg = ToolRegistry()
+        reg.register(
+            name="mcp_ai_exa_exa_search",
+            toolset="mcp-ai.exa/exa",
+            schema={"name": "mcp_ai_exa_exa_search", "description": "Search", "parameters": {"type": "object", "properties": {}}},
+            handler=lambda *_args, **_kwargs: "{}",
+        )
+        reg.register_toolset_alias("ai.exa/exa", "mcp-ai.exa/exa")
+
+        with patch("tools.registry.registry", reg):
+            assert validate_toolset("ai.exa/exa") is True
+            assert "mcp_ai_exa_exa_search" in resolve_toolset("ai.exa/exa")


 # ---------------------------------------------------------------------------
@@ -94,11 +94,21 @@ except ImportError:
 logger = logging.getLogger(__name__)

 # Standard PATH entries for environments with minimal PATH (e.g. systemd services).
-# Includes macOS Homebrew paths (/opt/homebrew/* for Apple Silicon).
-_SANE_PATH = (
-    "/opt/homebrew/bin:/opt/homebrew/sbin:"
-    "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
+# Includes Android/Termux and macOS Homebrew locations needed for agent-browser,
+# npx, node, and Android's glibc runner (grun).
+_SANE_PATH_DIRS = (
+    "/data/data/com.termux/files/usr/bin",
+    "/data/data/com.termux/files/usr/sbin",
+    "/opt/homebrew/bin",
+    "/opt/homebrew/sbin",
+    "/usr/local/sbin",
+    "/usr/local/bin",
+    "/usr/sbin",
+    "/usr/bin",
+    "/sbin",
+    "/bin",
 )
+_SANE_PATH = os.pathsep.join(_SANE_PATH_DIRS)


@functools.lru_cache(maxsize=1)
@@ -123,6 +133,28 @@ def _discover_homebrew_node_dirs() -> tuple[str, ...]:
        pass
    return tuple(dirs)

+
+def _browser_candidate_path_dirs() -> list[str]:
+    """Return ordered browser CLI PATH candidates shared by discovery and execution."""
+    hermes_home = get_hermes_home()
+    hermes_node_bin = str(hermes_home / "node" / "bin")
+    return [hermes_node_bin, *_discover_homebrew_node_dirs(), *_SANE_PATH_DIRS]
+
+
+def _merge_browser_path(existing_path: str = "") -> str:
+    """Prepend browser-specific PATH fallbacks without reordering existing entries."""
+    path_parts = [p for p in (existing_path or "").split(os.pathsep) if p]
+    existing_parts = set(path_parts)
+    prefix_parts: list[str] = []
+
+    for part in _browser_candidate_path_dirs():
+        if not part or part in existing_parts or part in prefix_parts:
+            continue
+        if os.path.isdir(part):
+            prefix_parts.append(part)
+
+    return os.pathsep.join(prefix_parts + path_parts)
+
 # Throttle screenshot cleanup to avoid repeated full directory scans.
 _last_screenshot_cleanup_by_dir: dict[str, float] = {}
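A minimal standalone sketch of the prepend-without-reordering merge that `_merge_browser_path` performs in this hunk. The name `merge_path` and its explicit `candidates` parameter are hypothetical, introduced only so the example runs outside Hermes; the real helper takes its candidates from `_browser_candidate_path_dirs()`.

```python
import os
import tempfile


def merge_path(existing_path: str, candidates: list[str]) -> str:
    """Prepend candidate dirs that exist on disk and are not already on PATH.

    Hypothetical stand-in for _merge_browser_path: entries already present in
    existing_path keep their original order and position.
    """
    path_parts = [p for p in (existing_path or "").split(os.pathsep) if p]
    existing_parts = set(path_parts)
    prefix_parts: list[str] = []

    for part in candidates:
        # Skip empty entries, entries already on PATH, and duplicates.
        if not part or part in existing_parts or part in prefix_parts:
            continue
        # Only directories that actually exist are added.
        if os.path.isdir(part):
            prefix_parts.append(part)

    return os.pathsep.join(prefix_parts + path_parts)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        missing = os.path.join(a, "does-not-exist")
        # b is already on PATH, so only a is prepended; missing is dropped.
        print(merge_path(b, [a, b, missing]).split(os.pathsep))
```

Because the merge only ever prepends, a user's own PATH ordering is preserved, which is presumably why the same helper can be shared between discovery (empty existing PATH) and subprocess execution (inherited PATH).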

@@ -895,21 +927,10 @@ def _find_agent_browser() -> str:
        _agent_browser_resolved = True
        return which_result

-    # Build an extended search PATH including Homebrew and Hermes-managed dirs.
-    # This covers macOS where the process PATH may not include Homebrew paths.
-    extra_dirs: list[str] = []
-    for d in ["/opt/homebrew/bin", "/usr/local/bin"]:
-        if os.path.isdir(d):
-            extra_dirs.append(d)
-    extra_dirs.extend(_discover_homebrew_node_dirs())
-
-    hermes_home = get_hermes_home()
-    hermes_node_bin = str(hermes_home / "node" / "bin")
-    if os.path.isdir(hermes_node_bin):
-        extra_dirs.append(hermes_node_bin)
-
-    if extra_dirs:
-        extended_path = os.pathsep.join(extra_dirs)
+    # Build an extended search PATH including Hermes-managed Node, macOS
+    # versioned Homebrew installs, and fallback system dirs like Termux.
+    extended_path = _merge_browser_path("")
+    if extended_path:
        which_result = shutil.which("agent-browser", path=extended_path)
        if which_result:
            _cached_agent_browser = which_result
@@ -924,10 +945,10 @@ def _find_agent_browser() -> str:
        _agent_browser_resolved = True
        return _cached_agent_browser
    
-    # Check common npx locations (also search extended dirs)
+    # Check common npx locations (also search the extended fallback PATH)
    npx_path = shutil.which("npx")
-    if not npx_path and extra_dirs:
-        npx_path = shutil.which("npx", path=os.pathsep.join(extra_dirs))
+    if not npx_path and extended_path:
+        npx_path = shutil.which("npx", path=extended_path)
    if npx_path:
        _cached_agent_browser = "npx agent-browser"
        _agent_browser_resolved = True
@@ -1046,24 +1067,9 @@ def _run_browser_command(
        
        browser_env = {**os.environ}

-        # Ensure PATH includes Hermes-managed Node first, Homebrew versioned
-        # node dirs (for macOS ``brew install node@24``), then standard system dirs.
-        hermes_home = get_hermes_home()
-        hermes_node_bin = str(hermes_home / "node" / "bin")
-
-        existing_path = browser_env.get("PATH", "")
-        path_parts = [p for p in existing_path.split(":") if p]
-        candidate_dirs = (
-            [hermes_node_bin]
-            + list(_discover_homebrew_node_dirs())
-            + [p for p in _SANE_PATH.split(":") if p]
-        )
-
-        for part in reversed(candidate_dirs):
-            if os.path.isdir(part) and part not in path_parts:
-                path_parts.insert(0, part)
-
-        browser_env["PATH"] = ":".join(path_parts)
+        # Ensure subprocesses inherit the same browser-specific PATH fallbacks
+        # used during CLI discovery.
+        browser_env["PATH"] = _merge_browser_path(browser_env.get("PATH", ""))
        browser_env["AGENT_BROWSER_SOCKET_DIR"] = task_socket_dir
        
        # Use temp files for stdout/stderr instead of pipes.
@@ -846,8 +846,7 @@ class MCPServerTask:
        After the initial ``await`` (list_tools), all mutations are synchronous
        — atomic from the event loop's perspective.
        """
-        from tools.registry import registry, tool_error
-        from toolsets import TOOLSETS
+        from tools.registry import registry

        async with self._refresh_lock:
            # Capture old tool names for change diff
@@ -857,16 +856,11 @@ class MCPServerTask:
            tools_result = await self.session.list_tools()
            new_mcp_tools = tools_result.tools if hasattr(tools_result, "tools") else []

-            # 2. Remove old tools from hermes-* umbrella toolsets
-            for ts_name, ts in TOOLSETS.items():
-                if ts_name.startswith("hermes-"):
-                    ts["tools"] = [t for t in ts["tools"] if t not in self._registered_tool_names]
-
-            # 3. Deregister old tools from the central registry
+            # 2. Deregister old tools from the central registry
            for prefixed_name in self._registered_tool_names:
                registry.deregister(prefixed_name)

-            # 4. Re-register with fresh tool list
+            # 3. Re-register with fresh tool list
            self._tools = new_mcp_tools
            self._registered_tool_names = _register_server_tools(
                self.name, self, self._config
@@ -1144,6 +1138,8 @@ class MCPServerTask:

    async def shutdown(self):
        """Signal the Task to exit and wait for clean resource teardown."""
+        from tools.registry import registry
+
        self._shutdown_event.set()
        if self._task and not self._task.done():
            try:
@@ -1158,6 +1154,9 @@ class MCPServerTask:
                    await self._task
                except asyncio.CancelledError:
                    pass
+        for tool_name in list(getattr(self, "_registered_tool_names", [])):
+            registry.deregister(tool_name)
+        self._registered_tool_names = []
        self.session = None


@@ -1671,57 +1670,6 @@ def _convert_mcp_schema(server_name: str, mcp_tool) -> dict:
    }


-def _sync_mcp_toolsets(server_names: Optional[List[str]] = None) -> None:
-    """Expose each MCP server as a standalone toolset and inject into hermes-* sets.
-
-    Creates a real toolset entry in TOOLSETS for each server name (e.g.
-    TOOLSETS["github"] = {"tools": ["mcp_github_list_files", ...]}). This
-    makes raw server names resolvable in platform_toolsets overrides.
-
-    Also injects all MCP tools into hermes-* umbrella toolsets for the
-    default behavior.
-
-    Skips server names that collide with built-in toolsets.
-    """
-    from toolsets import TOOLSETS
-
-    if server_names is None:
-        server_names = list(_load_mcp_config().keys())
-
-    existing = _existing_tool_names()
-    all_mcp_tools: List[str] = []
-
-    for server_name in server_names:
-        safe_prefix = f"mcp_{sanitize_mcp_name_component(server_name)}_"
-        server_tools = sorted(
-            t for t in existing if t.startswith(safe_prefix)
-        )
-        all_mcp_tools.extend(server_tools)
-
-        # Don't overwrite a built-in toolset that happens to share the name.
-        existing_ts = TOOLSETS.get(server_name)
-        if existing_ts and not str(existing_ts.get("description", "")).startswith("MCP server '"):
-            logger.warning(
-                "Skipping MCP toolset alias '%s' — a built-in toolset already uses that name",
-                server_name,
-            )
-            continue
-
-        TOOLSETS[server_name] = {
-            "description": f"MCP server '{server_name}' tools",
-            "tools": server_tools,
-            "includes": [],
-        }
-
-    # Also inject into hermes-* umbrella toolsets for default behavior.
-    for ts_name, ts in TOOLSETS.items():
-        if not ts_name.startswith("hermes-"):
-            continue
-        for tool_name in all_mcp_tools:
-            if tool_name not in ts["tools"]:
-                ts["tools"].append(tool_name)
-
-
 def _build_utility_schemas(server_name: str) -> List[dict]:
    """Build schemas for the MCP utility tools (resources & prompts).

@@ -1874,16 +1822,16 @@ def _existing_tool_names() -> List[str]:
 def _register_server_tools(name: str, server: MCPServerTask, config: dict) -> List[str]:
    """Register tools from an already-connected server into the registry.

-    Handles include/exclude filtering, utility tools, toolset creation,
-    and hermes-* umbrella toolset injection.
+    Handles include/exclude filtering and utility tools. Toolset resolution
+    for ``mcp-{server}`` and raw server-name aliases is derived from the live
+    registry, rather than mutating ``toolsets.TOOLSETS`` at runtime.

    Used by both initial discovery and dynamic refresh (list_changed).

    Returns:
        List of registered prefixed tool names.
    """
-    from tools.registry import registry, tool_error
-    from toolsets import create_custom_toolset, TOOLSETS
+    from tools.registry import registry

    registered_names: List[str] = []
    toolset_name = f"mcp-{name}"
@@ -1973,19 +1921,8 @@ def _register_server_tools(name: str, server: MCPServerTask, config: dict) -> Li
        )
        registered_names.append(util_name)

-    # Create a custom toolset so these tools are discoverable
    if registered_names:
-        create_custom_toolset(
-            name=toolset_name,
-            description=f"MCP tools from {name} server",
-            tools=registered_names,
-        )
-        # Inject into hermes-* umbrella toolsets for default behavior
-        for ts_name, ts in TOOLSETS.items():
-            if ts_name.startswith("hermes-"):
-                for tool_name in registered_names:
-                    if tool_name not in ts["tools"]:
-                        ts["tools"].append(tool_name)
+        registry.register_toolset_alias(name, toolset_name)

    return registered_names

@@ -2049,7 +1986,6 @@ def register_mcp_servers(servers: Dict[str, dict]) -> List[str]:
        }

    if not new_servers:
-        _sync_mcp_toolsets(list(servers.keys()))
        return _existing_tool_names()

    # Start the background event loop for MCP connections
@@ -2080,8 +2016,6 @@ def register_mcp_servers(servers: Dict[str, dict]) -> List[str]:
    # The outer timeout is generous: 120s total for parallel discovery.
    _run_on_mcp_loop(_discover_all(), timeout=120)

-    _sync_mcp_toolsets(list(servers.keys()))
-
    # Log a summary so ACP callers get visibility into what was registered.
    with _lock:
        connected = [n for n in new_servers if n in _servers]
@@ -52,6 +52,7 @@ class ToolRegistry:
    def __init__(self):
        self._tools: Dict[str, ToolEntry] = {}
        self._toolset_checks: Dict[str, Callable] = {}
+        self._toolset_aliases: Dict[str, str] = {}
        # MCP dynamic refresh can mutate the registry while other threads are
        # reading tool metadata, so keep mutations serialized and readers on
        # stable snapshots.
@@ -96,6 +97,27 @@ class ToolRegistry:
            if entry.toolset == toolset
        )

+    def register_toolset_alias(self, alias: str, toolset: str) -> None:
+        """Register an explicit alias for a canonical toolset name."""
+        with self._lock:
+            existing = self._toolset_aliases.get(alias)
+            if existing and existing != toolset:
+                logger.warning(
+                    "Toolset alias collision: '%s' (%s) overwritten by %s",
+                    alias, existing, toolset,
+                )
+            self._toolset_aliases[alias] = toolset
+
+    def get_registered_toolset_aliases(self) -> Dict[str, str]:
+        """Return a snapshot of ``{alias: canonical_toolset}`` mappings."""
+        with self._lock:
+            return dict(self._toolset_aliases)
+
+    def get_toolset_alias_target(self, alias: str) -> Optional[str]:
+        """Return the canonical toolset name for an alias, or None."""
+        with self._lock:
+            return self._toolset_aliases.get(alias)
+
    # ------------------------------------------------------------------
    # Registration
    # ------------------------------------------------------------------
@@ -164,11 +186,18 @@ class ToolRegistry:
            entry = self._tools.pop(name, None)
            if entry is None:
                return
-            # Drop the toolset check if this was the last tool in that toolset
-            if entry.toolset in self._toolset_checks and not any(
+            # Drop the toolset check and aliases if this was the last tool in
+            # that toolset.
+            toolset_still_exists = any(
                e.toolset == entry.toolset for e in self._tools.values()
-            ):
+            )
+            if not toolset_still_exists:
                self._toolset_checks.pop(entry.toolset, None)
+                self._toolset_aliases = {
+                    alias: target
+                    for alias, target in self._toolset_aliases.items()
+                    if target != entry.toolset
+                }
        logger.debug("Deregistered tool: %s", name)

    # ------------------------------------------------------------------
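The alias lifecycle introduced in this hunk can be sketched in isolation. `MiniRegistry` below is a hypothetical, stripped-down stand-in for `ToolRegistry` (it tracks only a tool-to-toolset mapping, with no schemas, handlers, or checks); it mirrors only the rule that aliases are dropped once the last tool of their target toolset is deregistered.

```python
import threading
from typing import Dict, Optional


class MiniRegistry:
    """Hypothetical reduced registry: tool -> toolset plus alias -> toolset."""

    def __init__(self) -> None:
        self._tools: Dict[str, str] = {}
        self._toolset_aliases: Dict[str, str] = {}
        self._lock = threading.Lock()

    def register(self, name: str, toolset: str) -> None:
        with self._lock:
            self._tools[name] = toolset

    def register_toolset_alias(self, alias: str, toolset: str) -> None:
        with self._lock:
            self._toolset_aliases[alias] = toolset

    def get_toolset_alias_target(self, alias: str) -> Optional[str]:
        with self._lock:
            return self._toolset_aliases.get(alias)

    def deregister(self, name: str) -> None:
        with self._lock:
            toolset = self._tools.pop(name, None)
            if toolset is None:
                return  # unknown tool: no-op
            # Drop aliases only when no remaining tool belongs to the toolset.
            if toolset not in self._tools.values():
                self._toolset_aliases = {
                    alias: target
                    for alias, target in self._toolset_aliases.items()
                    if target != toolset
                }


if __name__ == "__main__":
    reg = MiniRegistry()
    reg.register("foo", "mcp-srv")
    reg.register("bar", "mcp-srv")
    reg.register_toolset_alias("srv", "mcp-srv")
    reg.deregister("foo")
    print(reg.get_toolset_alias_target("srv"))  # alias survives while bar remains
    reg.deregister("bar")
    print(reg.get_toolset_alias_target("srv"))  # alias removed with the last tool
```

Tying alias cleanup to `deregister` means refresh and shutdown paths need no separate alias bookkeeping; they only deregister tools.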
@@ -409,8 +409,39 @@ def get_toolset(name: str) -> Optional[Dict[str, Any]]:
        Dict: Toolset definition with description, tools, and includes
        None: If toolset not found
    """
-    # Return toolset definition
-    return TOOLSETS.get(name)
+    toolset = TOOLSETS.get(name)
+    if toolset:
+        return toolset
+
+    try:
+        from tools.registry import registry
+    except Exception:
+        return None
+
+    registry_toolset = name
+    description = f"Plugin toolset: {name}"
+    alias_target = registry.get_toolset_alias_target(name)
+
+    if name not in _get_plugin_toolset_names():
+        registry_toolset = alias_target
+        if not registry_toolset:
+            return None
+        description = f"MCP server '{name}' tools"
+    else:
+        reverse_aliases = {
+            canonical: alias
+            for alias, canonical in _get_registry_toolset_aliases().items()
+            if alias not in TOOLSETS
+        }
+        alias = reverse_aliases.get(name)
+        if alias:
+            description = f"MCP server '{alias}' tools"
+
+    return {
+        "description": description,
+        "tools": registry.get_tool_names_for_toolset(registry_toolset),
+        "includes": [],
+    }


 def resolve_toolset(name: str, visited: Set[str] = None) -> List[str]:
@@ -438,7 +469,7 @@ def resolve_toolset(name: str, visited: Set[str] = None) -> List[str]:
            # Use a fresh visited set per branch to avoid cross-branch contamination
            resolved = resolve_toolset(toolset_name, visited.copy())
            all_tools.update(resolved)
-        return list(all_tools)
+        return sorted(all_tools)

    # Check for cycles / already-resolved (diamond deps).
    # Silently return [] — either this is a diamond (not a bug, tools already
@@ -449,15 +480,8 @@ def resolve_toolset(name: str, visited: Set[str] = None) -> List[str]:
    visited.add(name)

    # Get toolset definition
-    toolset = TOOLSETS.get(name)
+    toolset = get_toolset(name)
    if not toolset:
-        # Fall back to tool registry for plugin-provided toolsets
-        if name in _get_plugin_toolset_names():
-            try:
-                from tools.registry import registry
-                return registry.get_tool_names_for_toolset(name)
-            except Exception:
-                pass
        return []

    # Collect direct tools
@@ -470,7 +494,7 @@ def resolve_toolset(name: str, visited: Set[str] = None) -> List[str]:
        included_tools = resolve_toolset(included_name, visited)
        tools.update(included_tools)
    
-    return list(tools)
+    return sorted(tools)


 def resolve_multiple_toolsets(toolset_names: List[str]) -> List[str]:
@@ -489,7 +513,7 @@ def resolve_multiple_toolsets(toolset_names: List[str]) -> List[str]:
        tools = resolve_toolset(name)
        all_tools.update(tools)
    
-    return list(all_tools)
+    return sorted(all_tools)


 def _get_plugin_toolset_names() -> Set[str]:
@@ -509,6 +533,15 @@ def _get_plugin_toolset_names() -> Set[str]:
        return set()


+def _get_registry_toolset_aliases() -> Dict[str, str]:
+    """Return explicit toolset aliases registered in the live registry."""
+    try:
+        from tools.registry import registry
+        return registry.get_registered_toolset_aliases()
+    except Exception:
+        return {}
+
+
 def get_all_toolsets() -> Dict[str, Dict[str, Any]]:
    """
    Get all available toolsets with their definitions.
@@ -518,19 +551,19 @@ def get_all_toolsets() -> Dict[str, Dict[str, Any]]:
    Returns:
        Dict: All toolset definitions
    """
-    result = TOOLSETS.copy()
-    # Add plugin-provided toolsets (synthetic entries)
+    result = dict(TOOLSETS)
+    aliases = _get_registry_toolset_aliases()
    for ts_name in _get_plugin_toolset_names():
-        if ts_name not in result:
-            try:
-                from tools.registry import registry
-                tools = registry.get_tool_names_for_toolset(ts_name)
-                result[ts_name] = {
-                    "description": f"Plugin toolset: {ts_name}",
-                    "tools": tools,
-                }
-            except Exception:
-                pass
+        display_name = ts_name
+        for alias, canonical in aliases.items():
+            if canonical == ts_name and alias not in TOOLSETS:
+                display_name = alias
+                break
+        if display_name in result:
+            continue
+        toolset = get_toolset(display_name)
+        if toolset:
+            result[display_name] = toolset
    return result


@@ -544,7 +577,14 @@ def get_toolset_names() -> List[str]:
        List[str]: List of toolset names
    """
    names = set(TOOLSETS.keys())
-    names |= _get_plugin_toolset_names()
+    aliases = _get_registry_toolset_aliases()
+    for ts_name in _get_plugin_toolset_names():
+        for alias, canonical in aliases.items():
+            if canonical == ts_name and alias not in TOOLSETS:
+                names.add(alias)
+                break
+        else:
+            names.add(ts_name)
    return sorted(names)


@@ -565,8 +605,9 @@ def validate_toolset(name: str) -> bool:
        return True
    if name in TOOLSETS:
        return True
-    # Check tool registry for plugin-provided toolsets
-    return name in _get_plugin_toolset_names()
+    if name in _get_plugin_toolset_names():
+        return True
+    return name in _get_registry_toolset_aliases()


 def create_custom_toolset(
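A rough sketch of the lookup order the `toolsets` changes above appear to implement: static `TOOLSETS` entries win, then canonical registry toolsets, then registry aliases. All names and data here (`STATIC_TOOLSETS`, `REGISTRY_TOOLS`, `ALIASES`, `resolve`) are hypothetical and flattened for illustration; the real `resolve_toolset` also handles `includes`, cycle detection, and the `all` case, and how `_get_plugin_toolset_names()` is computed is assumed, not shown in this diff.

```python
from typing import Dict, List

# Assumed static built-ins (stand-in for toolsets.TOOLSETS).
STATIC_TOOLSETS: Dict[str, List[str]] = {"terminal": ["terminal"]}

# Assumed live registry contents: tool name -> canonical toolset.
REGISTRY_TOOLS: Dict[str, str] = {
    "mcp_fs_list_files": "mcp-fs",
    "mcp_terminal_run": "mcp-terminal",
}

# Assumed raw server-name aliases: alias -> canonical toolset.
ALIASES: Dict[str, str] = {"fs": "mcp-fs", "terminal": "mcp-terminal"}


def resolve(name: str) -> List[str]:
    """Resolve a toolset name to a sorted list of tool names."""
    # 1. A static definition always wins, so an MCP server named after a
    #    built-in toolset cannot shadow it through its raw alias.
    if name in STATIC_TOOLSETS:
        return sorted(STATIC_TOOLSETS[name])

    # 2. Canonical registry toolset, else 3. alias to a canonical toolset.
    live_toolsets = set(REGISTRY_TOOLS.values())
    target = name if name in live_toolsets else ALIASES.get(name)
    if target is None:
        return []
    return sorted(t for t, ts in REGISTRY_TOOLS.items() if ts == target)


if __name__ == "__main__":
    for query in ("fs", "mcp-fs", "terminal", "mcp-terminal", "unknown"):
        print(query, resolve(query))
```

This ordering matches what `test_server_toolset_skips_builtin_collision` asserts: `terminal` still resolves to the built-in tools, while the MCP tools remain reachable through `mcp-terminal`.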
@@ -152,12 +152,15 @@ hermes setup

 ### Install optional Node dependencies manually

-The tested Termux path skips Node/browser bootstrap on purpose. If you want to experiment later:
+The tested Termux path skips Node/browser bootstrap on purpose. If you want to experiment with browser tooling later:

 ```bash
+pkg install nodejs-lts
 npm install
 ```

+The browser tool automatically includes Termux directories (`/data/data/com.termux/files/usr/bin`) in its PATH search, so `agent-browser` and `npx` are discovered without any extra PATH configuration.
+
 Treat browser / WhatsApp tooling on Android as experimental until documented otherwise.

 ---
@@ -83,9 +83,11 @@ Standard OpenAI Chat Completions format. Stateless — the full conversation is
 }
 ```

-**Streaming** (`"stream": true`): Returns Server-Sent Events (SSE) with token-by-token response chunks. When streaming is enabled in config, tokens are emitted live as the LLM generates them. When disabled, the full response is sent as a single SSE chunk.
+**Streaming** (`"stream": true`): Returns Server-Sent Events (SSE) with token-by-token response chunks. For **Chat Completions**, the stream uses standard `chat.completion.chunk` events plus Hermes' custom `hermes.tool.progress` event for tool-start UX. For **Responses**, the stream uses OpenAI Responses event types such as `response.created`, `response.output_text.delta`, `response.output_item.added`, `response.output_item.done`, and `response.completed`.

-**Tool progress in streams**: When the agent calls tools during a streaming request, brief progress indicators are injected into the content stream as the tools start executing (e.g. `` `💻 pwd` ``, `` `🔍 Python docs` ``). These appear as inline markdown before the agent's response text, giving frontends like Open WebUI real-time visibility into tool execution.
+**Tool progress in streams**:
+- **Chat Completions**: Hermes emits `event: hermes.tool.progress` for tool-start visibility without polluting persisted assistant text.
+- **Responses**: Hermes emits spec-native `function_call` and `function_call_output` output items during the SSE stream, so clients can render structured tool UI in real time.

 ### POST /v1/responses

@@ -134,10 +134,10 @@ To use the Responses API mode:
 3. Change **API Type** from "Chat Completions" to **"Responses (Experimental)"**
 4. Save

-With the Responses API, Open WebUI sends requests in the Responses format (`input` array + `instructions`), and Hermes Agent can preserve full tool call history across turns via `previous_response_id`.
+With the Responses API, Open WebUI sends requests in the Responses format (`input` array + `instructions`), and Hermes Agent can preserve full tool call history across turns via `previous_response_id`. When `stream: true`, Hermes also streams spec-native `function_call` and `function_call_output` items, which enables custom structured tool-call UI in clients that render Responses events.

 :::note
-Open WebUI currently manages conversation history client-side even in Responses mode — it sends the full message history in each request rather than using `previous_response_id`. The Responses API mode is mainly useful for future compatibility as frontends evolve.
+Open WebUI currently manages conversation history client-side even in Responses mode — it sends the full message history in each request rather than using `previous_response_id`. The main advantage of Responses mode today is the structured event stream: text deltas, `function_call`, and `function_call_output` items arrive as OpenAI Responses SSE events instead of Chat Completions chunks.
 :::

 ## How It Works
@@ -0,0 +1,191 @@
+---
+sidebar_position: 2
+sidebar_label: "Google Workspace"
+title: "Google Workspace — Gmail, Calendar, Drive, Sheets & Docs"
+description: "Send email, manage calendar events, search Drive, read/write Sheets, and access Docs — all through OAuth2-authenticated Google APIs"
+---
+
+# Google Workspace Skill
+
+Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses OAuth2 with automatic token refresh. Prefers the [Google Workspace CLI (`gws`)](https://github.com/nicholasgasior/gws) when available for broader coverage, and falls back to Google's Python client libraries otherwise.
+
+**Skill path:** `skills/productivity/google-workspace/`
+
+## Setup
+
+The setup is fully agent-driven — ask Hermes to set up Google Workspace and it walks you through each step. The flow:
+
+1. **Create a Google Cloud project** and enable the required APIs (Gmail, Calendar, Drive, Sheets, Docs, People)
+2. **Create OAuth 2.0 credentials** (Desktop app type) and download the client secret JSON
+3. **Authorize** — Hermes generates an auth URL, you approve in the browser, paste back the redirect URL
+4. **Done** — token auto-refreshes from that point on
+
+:::tip Email-only users
+If you only need email (no Calendar/Drive/Sheets), use the **himalaya** skill instead — it works with a Gmail App Password and takes 2 minutes. No Google Cloud project needed.
+:::
+
+## Gmail
+
+### Searching
+
+```bash
+$GAPI gmail search "is:unread" --max 10
+$GAPI gmail search "from:boss@company.com newer_than:1d"
+$GAPI gmail search "has:attachment filename:pdf newer_than:7d"
+```
+
+Returns JSON with `id`, `from`, `subject`, `date`, `snippet`, and `labels` for each message.
+
+### Reading
+
+```bash
+$GAPI gmail get MESSAGE_ID
+```
+
+Returns the full message body as text (prefers plain text, falls back to HTML).
+
+### Sending
+
+```bash
+# Basic send
+$GAPI gmail send --to user@example.com --subject "Hello" --body "Message text"
+
+# HTML email
+$GAPI gmail send --to user@example.com --subject "Report" \
+  --body "<h1>Q4 Results</h1><p>Details here</p>" --html
+
+# Custom From header (display name + email)
+$GAPI gmail send --to user@example.com --subject "Hello" \
+  --from '"Research Agent" <user@example.com>' --body "Message text"
+
+# With CC
+$GAPI gmail send --to user@example.com --cc "team@example.com" \
+  --subject "Update" --body "FYI"
+```
+
+### Custom From Header
+
+The `--from` flag lets you customize the sender display name on outgoing emails. This is useful when multiple agents share the same Gmail account but you want recipients to see different names:
+
+```bash
+# Agent 1
+$GAPI gmail send --to client@co.com --subject "Research Summary" \
+  --from '"Research Agent" <shared@company.com>' --body "..."
+
+# Agent 2
+$GAPI gmail send --to client@co.com --subject "Code Review" \
+  --from '"Code Assistant" <shared@company.com>' --body "..."
+```
+
+**How it works:** The `--from` value is set as the RFC 5322 `From` header on the MIME message. Gmail allows customizing the display name on your own authenticated email address without any additional configuration. Recipients see the custom display name (e.g. "Research Agent") while the email address stays the same.
+
+**Important:** If you use a *different email address* in `--from` (not the authenticated account), Gmail requires that address to be configured as a [Send As alias](https://support.google.com/mail/answer/22370) in Gmail Settings → Accounts → Send mail as.
+
+The `--from` flag works on both `send` and `reply`:
+
+```bash
+$GAPI gmail reply MESSAGE_ID \
+  --from '"Support Bot" <shared@company.com>' --body "We're on it"
+```
+
+### Replying
+
+```bash
+$GAPI gmail reply MESSAGE_ID --body "Thanks, that works for me."
+```
+
+Automatically threads the reply (sets `In-Reply-To` and `References` headers) and uses the original message's thread ID.
+
+### Labels
+
+```bash
+# List all labels
+$GAPI gmail labels
+
+# Add/remove labels
+$GAPI gmail modify MESSAGE_ID --add-labels LABEL_ID
+$GAPI gmail modify MESSAGE_ID --remove-labels UNREAD
+```
+
+## Calendar
+
+```bash
+# List events (defaults to next 7 days)
+$GAPI calendar list
+$GAPI calendar list --start 2026-03-01T00:00:00Z --end 2026-03-07T23:59:59Z
+
+# Create event (timezone required)
+$GAPI calendar create --summary "Team Standup" \
+  --start 2026-03-01T10:00:00-07:00 --end 2026-03-01T10:30:00-07:00
+
+# With location and attendees
+$GAPI calendar create --summary "Lunch" \
+  --start 2026-03-01T12:00:00Z --end 2026-03-01T13:00:00Z \
+  --location "Cafe" --attendees "alice@co.com,bob@co.com"
+
+# Delete event
+$GAPI calendar delete EVENT_ID
+```
+
+:::warning
+Calendar times **must** include a timezone offset (e.g. `-07:00`) or use UTC (`Z`). Bare datetimes like `2026-03-01T10:00:00` are ambiguous and will be treated as UTC.
+:::
+
+## Drive
+
+```bash
+$GAPI drive search "quarterly report" --max 10
+$GAPI drive search "mimeType='application/pdf'" --raw-query --max 5
+```
+
+## Sheets
+
+```bash
+# Read a range
+$GAPI sheets get SHEET_ID "Sheet1!A1:D10"
+
+# Write to a range
+$GAPI sheets update SHEET_ID "Sheet1!A1:B2" --values '[["Name","Score"],["Alice","95"]]'
+
+# Append rows
+$GAPI sheets append SHEET_ID "Sheet1!A:C" --values '[["new","row","data"]]'
+```
+
+## Docs
+
+```bash
+$GAPI docs get DOC_ID
+```
+
+Returns the document title and full text content.
+
+## Contacts
+
+```bash
+$GAPI contacts list --max 20
+```
+
+## Output Format
+
+All commands return JSON. Key fields per service:
+
+| Command | Fields |
+|---------|--------|
+| `gmail search` | `id`, `threadId`, `from`, `to`, `subject`, `date`, `snippet`, `labels` |
+| `gmail get` | `id`, `threadId`, `from`, `to`, `subject`, `date`, `labels`, `body` |
+| `gmail send/reply` | `status`, `id`, `threadId` |
+| `calendar list` | `id`, `summary`, `start`, `end`, `location`, `description`, `htmlLink` |
+| `calendar create` | `status`, `id`, `summary`, `htmlLink` |
+| `drive search` | `id`, `name`, `mimeType`, `modifiedTime`, `webViewLink` |
+| `contacts list` | `name`, `emails`, `phones` |
+| `sheets get` | 2D array of cell values |
+
+## Troubleshooting
+
+| Problem | Fix |
+|---------|-----|
+| `NOT_AUTHENTICATED` | Run setup (ask Hermes to set up Google Workspace) |
+| `REFRESH_FAILED` | Token revoked — re-run authorization steps |
+| `HttpError 403: Insufficient Permission` | Missing scope — revoke and re-authorize with the right services |
+| `HttpError 403: Access Not Configured` | API not enabled in Google Cloud Console |
+| `ModuleNotFoundError` | Run setup script with `--install-deps` |
@@ -92,6 +92,7 @@ const sidebars: SidebarsConfig = {
          label: 'Skills',
          items: [
            'user-guide/skills/godmode',
+            'user-guide/skills/google-workspace',
          ],
        },
      ],
@@ -118,7 +119,6 @@ const sidebars: SidebarsConfig = {
        'user-guide/messaging/wecom-callback',
        'user-guide/messaging/weixin',
        'user-guide/messaging/bluebubbles',
-        'user-guide/messaging/qqbot',
        'user-guide/messaging/open-webui',
        'user-guide/messaging/webhooks',
      ],
@@ -153,7 +153,6 @@ const sidebars: SidebarsConfig = {
        'guides/use-voice-mode-with-hermes',
        'guides/build-a-hermes-plugin',
        'guides/automate-with-cron',
-        'guides/automation-templates',
        'guides/cron-troubleshooting',
        'guides/work-with-skills',
        'guides/delegation-patterns',
Author	SHA1	Message	Date
Teknium	0c9715f2ff	fix: 24h cooldown for 401/403 auth failures + user notification Previously, credentials exhausted due to 401 (invalid token) or 403 (forbidden) used the same 1-hour cooldown as 429 rate limits. This meant the system would retry an invalid token every hour forever — burning API calls and confusing users who had no idea why their primary provider wasn't being used. Changes: - credential_pool: EXHAUSTED_TTL_AUTH_SECONDS = 24h for 401/403 errors (rate limits keep 1h cooldown, provider reset_at timestamps still override both) - run_agent: emit actionable status message via _emit_status() when all pool credentials are rejected — tells the user to run `hermes auth reset <provider>` or `hermes model` to re-authenticate. Message propagates to both CLI (force-printed) and gateway (Telegram, Discord, etc.) - Tests for all three TTL cases (401 stays exhausted at 1h, 401 resets at 24h, 403 stays exhausted at 1h) and auth exhaustion notification (emits when pool exhausted, silent when rotation succeeds) Addresses user report: Copilot 401 + Codex 429 caused silent fallback with no recovery path visible to the user.	2026-04-14 21:00:45 -07:00
Teknium	82f364ffd1	feat: add --all flag to gateway start and restart commands (#10043) - gateway start --all: kills all stale gateway processes across all profiles before starting the current profile's service - gateway restart --all: stops all gateway processes across all profiles, then starts the current profile's service fresh - gateway stop --all: already existed, unchanged The --all flag was only available on 'stop' but not on 'start' or 'restart', causing 'unrecognized arguments' errors for users.	2026-04-14 20:52:18 -07:00
Teknium	31d0620663	chore: add simon-marcus to AUTHOR_MAP	2026-04-14 20:51:52 -07:00
Teknium	cf1d718823	fix: keep batch-path function_call_output.output as string per OpenAI spec The streaming path emits output as content-part arrays for Open WebUI compatibility, but the batch (non-streaming) Responses API path must return output as a plain string per the OpenAI Responses API spec. Reverts the _extract_output_items change from the cherry-picked commits while preserving the streaming path's array format.	2026-04-14 20:51:52 -07:00
simon-marcus	302554b158	fix(api-server): format responses tool outputs for open webui	2026-04-14 20:51:52 -07:00
simon-marcus	d6c09ab94a	feat(api-server): stream /v1/responses SSE tool events	2026-04-14 20:51:52 -07:00
Teknium	da528a8207	fix: detect and strip non-ASCII characters from API keys (#6843) API keys containing Unicode lookalike characters (e.g. ʋ U+028B instead of v) cause UnicodeEncodeError when httpx encodes the Authorization header as ASCII. This commonly happens when users copy-paste keys from PDFs, rich-text editors, or web pages with decorative fonts. Three layers of defense: 1. Save-time validation (hermes_cli/config.py): _check_non_ascii_credential() strips non-ASCII from credential values when saving to .env, with a clear warning explaining the issue. 2. Load-time sanitization (hermes_cli/env_loader.py): _sanitize_loaded_credentials() strips non-ASCII from credential env vars (those ending in _API_KEY, _TOKEN, _SECRET, _KEY) after dotenv loads them, so the rest of the codebase never sees non-ASCII keys. 3. Runtime recovery (run_agent.py): The UnicodeEncodeError recovery block now also sanitizes self.api_key and self._client_kwargs['api_key'], fixing the gap where message/tool sanitization succeeded but the API key still caused httpx to fail on the Authorization header. Also: hermes_logging.py RotatingFileHandler now explicitly sets encoding='utf-8' instead of relying on locale default (defensive hardening for ASCII-locale systems).	2026-04-14 20:20:31 -07:00
kshitijk4poor	677f1227c3	fix: remove @staticmethod from _context_completions — crashes on @ mention PR #9467 added a call to self._fuzzy_file_completions() inside _context_completions(), but the method was still decorated with @staticmethod and didn't receive self. Every @ mention in the input triggers 'name self is not defined' from prompt_toolkit's async completer, spamming the error on every keystroke. Fix: remove @staticmethod, add self parameter. The method already uses self._fuzzy_file_completions() and self._get_project_files() via that call chain, so it was never meant to stay static after the fuzzy search feature was added.	2026-04-14 19:43:42 -07:00
Teknium	4610551d74	fix: update stale comment referencing removed _sync_mcp_toolsets	2026-04-14 17:19:20 -07:00
Greer Guthrie	498cb7a0fc	chore(release): map greer guthrie attribution	2026-04-14 17:19:20 -07:00
Greer Guthrie	c10fea8d26	fix(mcp): make server aliases explicit	2026-04-14 17:19:20 -07:00
Greer Guthrie	cda64a5961	fix(mcp): resolve toolsets from live registry	2026-04-14 17:19:20 -07:00
Teknium	2a98098035	fix: hermes gateway restart waits for service to come back up (#8260 ) Previously, systemd_restart() sent SIGUSR1 to the gateway, printed 'restart requested', and returned immediately. The gateway still needed to drain active agents, exit with code 75, wait for systemd's RestartSec=30, and start the new process. The user saw 'success' but the gateway was actually down for 30-60 seconds. Now the SIGUSR1 path blocks with progress feedback: Phase 1 — wait for old process to die: ⏳ User service draining active work... Polls os.kill(pid, 0) until ProcessLookupError (up to 90s) Phase 2 — wait for new process to become active: ⏳ Waiting for hermes-gateway to restart... Polls systemctl is-active + verifies new PID (up to 60s) Success: ✓ User service restarted (PID 12345) Timeout: ⚠ User service did not become active within 60s. Check status: hermes gateway status Check logs: journalctl --user -u hermes-gateway --since '2 min ago' The reload-or-restart fallback path (line 1189) already blocks because systemctl reload-or-restart is synchronous. Test plan: - Updated test to verify wait-for-restart behavior - All 118 gateway CLI tests pass	2026-04-14 17:12:58 -07:00
Teknium	6c89306437	fix: break stuck session resume loops after repeated restarts (#7536 ) When a session gets stuck (hung terminal, runaway tool loop) and the user restarts the gateway, the same session history loads and puts the agent right back in the stuck state. The user is trapped in a loop: restart → stuck → restart → stuck. Fix: track restart-failure counts per session using a simple JSON file (.restart_failure_counts). On each shutdown with active agents, the counter increments for those sessions. On startup, if any session has been active across 3+ consecutive restarts, it's auto-suspended — giving the user a clean slate on their next message. The counter resets to 0 when a session completes a turn successfully (response delivered), so normal sessions that happen to be active during planned restarts (/restart, hermes update) won't accumulate false counts. Implementation: - _increment_restart_failure_counts(): called during stop() when agents are active. Writes {session_key: count} to JSON file. Sessions NOT active are dropped (loop broken). - _suspend_stuck_loop_sessions(): called on startup. Reads the file, suspends sessions at threshold (3), clears the file. - _clear_restart_failure_count(): called after successful response delivery. Removes the session from the counter file. No SessionEntry schema changes. No database migration. Pure file-based tracking that naturally cleans up. Test plan: - 9 new stuck-loop tests (increment, accumulate, threshold, clear, suspend, file cleanup, edge cases) - All 28 gateway lifecycle tests pass (restart drain + auto-continue + stuck loop)	2026-04-14 17:08:35 -07:00
Teknium	847d7cbea5	fix: improve CLI text padding, word-wrap for responses and verbose tool output (#9920) * feat(skills): add fitness-nutrition skill to optional-skills Cherry-picked from PR #9177 by @haileymarshall. Adds a fitness and nutrition skill for gym-goers and health-conscious users: - Exercise search via wger API (690+ exercises, free, no auth) - Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback) - Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %) - Pure stdlib Python, no pip dependencies Changes from original PR: - Moved from skills/ to optional-skills/health/ (correct location) - Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5) - Fixed author attribution to match PR submitter - Marked USDA_API_KEY as optional (DEMO_KEY works without signup) Also adds optional env var support to the skill readiness checker: - New 'optional: true' field in required_environment_variables entries - Optional vars are preserved in metadata but don't block skill readiness - Optional vars skip the CLI capture prompt flow - Skills with only optional missing vars show as 'available' not 'setup_needed' * fix: increase CLI response text padding to 4-space tab indent Increases horizontal padding on all response display paths: - Rich Panel responses (main, background, /btw): padding (1,2) -> (1,4) - Streaming text: add 4-space indent prefix to each line - Streaming TTS: add 4-space indent prefix to sentences Gives response text proper breathing room with a tab-width indent. Rich Panel word wrapping automatically adjusts for the wider padding. Requested by AriesTheCoder. * fix: word-wrap verbose tool call args and results to terminal width Verbose mode (tool_progress: verbose) printed tool args and results as single unwrapped lines that could be thousands of characters long. Adds _wrap_verbose() helper that: - Pretty-prints JSON args with indent=2 instead of one-line dumps - Splits text on existing newlines (preserves JSON/structured output) - Wraps lines exceeding terminal width with 5-char continuation indent - Uses break_long_words=True for URLs and paths without spaces Applied to all 4 verbose print sites: - Concurrent tool call args - Concurrent tool results - Sequential tool call args - Sequential tool results --------- Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>	2026-04-14 16:58:23 -07:00
Teknium	a9c78d0eb0	feat(setup): add recommendation badges to tool provider selection (#9929 ) New users don't know which tool providers to pick during setup. Add [badge] labels to each provider in the selection menu: - [★ recommended · free] for best default choices (Edge TTS, Local Browser) - [★ recommended] for top-tier paid options (Firecrawl Cloud) - [paid] for options requiring an API key - [free tier] for services with a free tier (Tavily) - [free · self-hosted] / [free · local] for self-run options - [subscription] for Nous subscription-managed options Also improves vague tag descriptions — e.g. 'AI-native search and contents' becomes 'Neural search with semantic understanding' and Tavily gets '1000 free searches/mo'. Both hermes setup and hermes tools share the same rendering path, so badges appear in both flows. Addresses user feedback about setup being confusing for newcomers.	2026-04-14 16:58:10 -07:00
Teknium	e7475b1582	feat: auto-continue interrupted agent work after gateway restart (#4493 ) When the gateway restarts mid-agent-work, the session transcript ends on a tool result the agent never processed. Previously, the user had to type 'continue' or use /retry (which replays from scratch, losing all prior work). Now, when the next user message arrives and the loaded history ends with role='tool', a system note is prepended: [System note: Your previous turn was interrupted before you could process the last tool result(s). Please finish processing those results and summarize what was accomplished, then address the user's new message below.] This is injected in _run_agent()'s run_sync closure, right before calling agent.run_conversation(). The agent sees the full history (including the pending tool results) and the system note, so it can summarize what was accomplished and then handle the user's new input. Design decisions: - No new session flags or schema changes — purely detects trailing tool messages in the loaded history - Works for any restart scenario (clean, crash, SIGTERM, drain timeout) as long as the session wasn't suspended (suspended = fresh start) - The user's actual message is preserved after the note - If the session WAS suspended (unclean shutdown), the old history is abandoned and the user starts fresh — no false auto-continue Also updates the shutdown notification message from 'Use /retry after restart to continue' to 'Send any message after restart to resume where it left off' — which is now accurate. Test plan: - 6 new auto-continue tests (trailing tool detection, no false positives for assistant/user/empty history, multi-tool, message preservation) - All 13 restart drain tests pass (updated /retry assertion)	2026-04-14 16:56:49 -07:00
Teknium	ac1f8fcccd	docs(termux): note browser tool PATH auto-discovery Update the Termux guide to mention that the browser tool now automatically discovers Termux directories, and add the missing pkg install nodejs-lts step.	2026-04-14 16:55:55 -07:00
adybag14-cyber	56c34ac4f7	fix(browser): add termux PATH fallbacks Refactor browser tool PATH construction to include Termux directories (/data/data/com.termux/files/usr/bin, /data/data/com.termux/files/usr/sbin) so agent-browser and npx are discoverable on Android/Termux. Extracts _browser_candidate_path_dirs() and _merge_browser_path() helpers to centralize PATH construction shared between _find_agent_browser() and _run_browser_command(), replacing duplicated inline logic. Also fixes os.pathsep usage (was hardcoded ':') for cross-platform correctness. Cherry-picked from PR #9846.	2026-04-14 16:55:55 -07:00
Teknium	3ca7417c2a	chore: add areu01or00 to AUTHOR_MAP	2026-04-14 16:55:48 -07:00
areu01or00	cfa24532d3	fix(discord): register native /restart slash command	2026-04-14 16:55:48 -07:00
Teknium	b24e5ee4b0	feat(google-workspace): add --from flag for custom sender display name (#9931) Adds --from flag to gmail send and gmail reply commands, allowing agents to customize the From header display name when sharing the same email account. Usage: --from '"Agent Name" <user@example.com>' Also syncs repo google_api.py with the deployed standalone implementation (replaces outdated gws_bridge thin wrapper), adds dedicated docs page under Features > Skills, and updates sidebar navigation. Requested by community user @Maxime44.	2026-04-14 16:55:34 -07:00
Julien Talbot	3b50821555	feat(xai): add xAI/Grok to provider prefix stripping Add 'xai', 'x-ai', 'x.ai', 'grok' to _PROVIDER_PREFIXES so that colon-prefixed model names (e.g. xai:grok-4.20) are stripped correctly for context length lookups. Cherry-picked from PR #9184 by @Julientalbot.	2026-04-14 16:43:42 -07:00
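
The cooldown rule described in commit 0c9715f2ff at the top of this list can be sketched roughly as follows. The three `EXHAUSTED_TTL_*` constants are taken from the diff. The helper names `cooldown_seconds` and `is_still_exhausted`, and the way `reset_at` is passed in as a Unix timestamp, are assumptions for illustration and are not the actual `credential_pool` code.

```python
import time
from typing import Optional

# Constants as they appear in the credential_pool diff.
EXHAUSTED_TTL_429_SECONDS = 60 * 60          # 1 hour  (rate limits)
EXHAUSTED_TTL_AUTH_SECONDS = 24 * 60 * 60    # 24 hours (401/403, token invalid)
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60      # 1 hour  (everything else)


def cooldown_seconds(status_code: int) -> int:
    """Hypothetical helper: default cooldown for a given HTTP error code."""
    if status_code in (401, 403):
        return EXHAUSTED_TTL_AUTH_SECONDS
    if status_code == 429:
        return EXHAUSTED_TTL_429_SECONDS
    return EXHAUSTED_TTL_DEFAULT_SECONDS


def is_still_exhausted(
    status_code: int,
    exhausted_at: float,
    now: Optional[float] = None,
    reset_at: Optional[float] = None,
) -> bool:
    """Hypothetical helper: should this credential still be skipped?

    A provider-supplied reset_at (assumed here to be a Unix timestamp)
    overrides the default cooldown, as the commit message states.
    """
    now = time.time() if now is None else now
    if reset_at is not None:
        return now < reset_at
    return (now - exhausted_at) < cooldown_seconds(status_code)
```

Under these assumptions, the three TTL cases named in the commit message map to `is_still_exhausted(401, t, now=t + 3600)` being true, `is_still_exhausted(401, t, now=t + 86400)` being false, and `is_still_exhausted(403, t, now=t + 3600)` being true.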