feat(sessions): add --sanitize flag to sessions export

Port from anomalyco/opencode#22489: redact user/model content from session exports before sharing for bug reports or training data.

Adds hermes_state.sanitize_session_export() which returns a deep-copied session with:

- Message content, reasoning, and reasoning_details replaced with [redacted:<kind>:<id>] tokens
- Tool-call arguments redacted (tool id, type, and function name preserved)
- Session title and system_prompt redacted
- All structural/metric fields preserved: ids, timestamps, token counts, tool names, finish reasons, model info, cost data, message counts

Wired into 'hermes sessions export --sanitize' (applies to both --session-id and full exports). The flag is opt-in; default behaviour is unchanged. User sees '(sanitized)' suffix on the export summary when the flag is active.

5 new tests covering content redaction, reasoning/tool-call redaction, empty-value preservation, input immutability, and reasoning_details block structure.

E2E verified: raw export still leaks sk-proj-* API keys and usernames, sanitized export replaces them with redaction tokens while preserving model names, tool names, and tool call ids.

Authored-by: Hermes Agent (autonomous weekly OpenCode PR scout)
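
The redaction behaviour described for --sanitize can be sketched roughly as below. This is an illustration only: the real hermes_state.sanitize_session_export() is not shown in this change summary and may differ. The function name sanitize_session_sketch, the session/message field layout, and the token kinds ("title", "content", "tool_args", ...) are assumptions made for the example, and reasoning_details handling is left out for brevity.

```python
import copy
from typing import Any, Dict


def _token(kind: str, ident: Any) -> str:
    """Build a placeholder in the [redacted:<kind>:<id>] form from the commit message."""
    return f"[redacted:{kind}:{ident}]"


def sanitize_session_sketch(session: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for sanitize_session_export(); returns a deep copy."""
    out = copy.deepcopy(session)  # the caller's data is never mutated
    sid = out.get("id", "session")

    # Session-level free text. Empty values are left untouched.
    for key in ("title", "system_prompt"):
        if out.get(key):
            out[key] = _token(key, sid)

    for msg in out.get("messages") or []:
        mid = msg.get("id", "msg")
        for key in ("content", "reasoning"):
            if msg.get(key):
                msg[key] = _token(key, mid)
        # Tool calls keep id, type, and function name; only arguments are replaced.
        for call in msg.get("tool_calls") or []:
            fn = call.get("function") or {}
            if fn.get("arguments"):
                fn["arguments"] = _token("tool_args", call.get("id", "call"))
    return out
```

Calling sanitize_session_sketch(raw) on an exported session dict returns a new dict in which ids, roles, and tool names are intact and free-text fields carry placeholders, while raw itself stays unchanged.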
chore(release): map mbelleau@Michels-MacBook-Pro.local to @malaiwah
2026-04-16 17:11:11 -07:00 · 2026-04-16 16:50:15 -07:00 · 2026-04-16 16:50:15 -07:00 · 2026-04-16 16:49:22 -07:00 · 2026-04-16 16:49:00 -07:00 · 2026-04-16 16:48:14 -07:00
56 changed files with 6623 additions and 471 deletions
@@ -0,0 +1,764 @@
+"""OpenAI-compatible facade that talks to Google's Cloud Code Assist backend.
+
+This adapter lets Hermes use the ``google-gemini-cli`` provider as if it were
+a standard OpenAI-shaped chat completion endpoint, while the underlying HTTP
+traffic goes to ``cloudcode-pa.googleapis.com/v1internal:{generateContent,
+streamGenerateContent}`` with a Bearer access token obtained via OAuth PKCE.
+
+Architecture
+------------
+- ``GeminiCloudCodeClient`` exposes ``.chat.completions.create(**kwargs)``
+  mirroring the subset of the OpenAI SDK that ``run_agent.py`` uses.
+- Incoming OpenAI ``messages[]`` / ``tools[]`` / ``tool_choice`` are translated
+  to Gemini's native ``contents[]`` / ``tools[].functionDeclarations`` /
+  ``toolConfig`` / ``systemInstruction`` shape.
+- The request body is wrapped ``{project, model, user_prompt_id, request}``
+  per Code Assist API expectations.
+- Responses (``candidates[].content.parts[]``) are converted back to
+  OpenAI ``choices[0].message`` shape with ``content`` + ``tool_calls``.
+- Streaming uses SSE (``?alt=sse``) and yields OpenAI-shaped delta chunks.
+
+Attribution
+-----------
+Translation semantics follow jenslys/opencode-gemini-auth (MIT) and the public
+Gemini API docs. Request envelope shape
+(``{project, model, user_prompt_id, request}``) is not publicly documented; it is
+reverse-engineered from the opencode-gemini-auth and clawdbot implementations.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import os
+import time
+import uuid
+from types import SimpleNamespace
+from typing import Any, Dict, Iterator, List, Optional
+
+import httpx
+
+from agent import google_oauth
+from agent.google_code_assist import (
+    CODE_ASSIST_ENDPOINT,
+    FREE_TIER_ID,
+    CodeAssistError,
+    ProjectContext,
+    resolve_project_context,
+)
+
+logger = logging.getLogger(__name__)
+
+
+# =============================================================================
+# Request translation: OpenAI → Gemini
+# =============================================================================
+
+_ROLE_MAP_OPENAI_TO_GEMINI = {
+    "user": "user",
+    "assistant": "model",
+    "system": "user",   # handled separately via systemInstruction
+    "tool": "user",     # functionResponse is wrapped in a user-role turn
+    "function": "user",
+}
+
+
+def _coerce_content_to_text(content: Any) -> str:
+    """OpenAI content may be str or a list of parts; reduce to plain text."""
+    if content is None:
+        return ""
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        pieces: List[str] = []
+        for p in content:
+            if isinstance(p, str):
+                pieces.append(p)
+            elif isinstance(p, dict):
+                if p.get("type") == "text" and isinstance(p.get("text"), str):
+                    pieces.append(p["text"])
+                # Multimodal (image_url, etc.) — stub for now; log and skip
+                elif p.get("type") in ("image_url", "input_audio"):
+                    logger.debug("Dropping multimodal part (not yet supported): %s", p.get("type"))
+        return "\n".join(pieces)
+    return str(content)
+
+
+def _translate_tool_call_to_gemini(tool_call: Dict[str, Any]) -> Dict[str, Any]:
+    """OpenAI tool_call -> Gemini functionCall part."""
+    fn = tool_call.get("function") or {}
+    args_raw = fn.get("arguments", "")
+    try:
+        args = json.loads(args_raw) if isinstance(args_raw, str) and args_raw else {}
+    except json.JSONDecodeError:
+        args = {"_raw": args_raw}
+    if not isinstance(args, dict):
+        args = {"_value": args}
+    return {
+        "functionCall": {
+            "name": fn.get("name") or "",
+            "args": args,
+        },
+        # Sentinel signature — matches opencode-gemini-auth's approach.
+        # Without this, Code Assist rejects function calls that originated
+        # outside its own chain.
+        "thoughtSignature": "skip_thought_signature_validator",
+    }
+
+
+def _translate_tool_result_to_gemini(message: Dict[str, Any]) -> Dict[str, Any]:
+    """OpenAI tool-role message -> Gemini functionResponse part.
+
+    An OpenAI tool message does not reliably carry the function name; the name
+    lives on the assistant message that issued the call. For simplicity we use
+    ``name`` from the message when the caller set it, then fall back to the raw
+    ``tool_call_id`` (not a resolved function name), and finally to ``"tool"``.
+    """
+    name = str(message.get("name") or message.get("tool_call_id") or "tool")
+    content = _coerce_content_to_text(message.get("content"))
+    # Gemini expects the response as a dict under `response`. We wrap plain
+    # text in {"output": "..."}.
+    try:
+        parsed = json.loads(content) if content.strip().startswith(("{", "[")) else None
+    except json.JSONDecodeError:
+        parsed = None
+    response = parsed if isinstance(parsed, dict) else {"output": content}
+    return {
+        "functionResponse": {
+            "name": name,
+            "response": response,
+        },
+    }
+
+
+def _build_gemini_contents(
+    messages: List[Dict[str, Any]],
+) -> tuple[List[Dict[str, Any]], Optional[Dict[str, Any]]]:
+    """Convert OpenAI messages[] to Gemini contents[] + systemInstruction."""
+    system_text_parts: List[str] = []
+    contents: List[Dict[str, Any]] = []
+
+    for msg in messages:
+        if not isinstance(msg, dict):
+            continue
+        role = str(msg.get("role") or "user")
+
+        if role == "system":
+            system_text_parts.append(_coerce_content_to_text(msg.get("content")))
+            continue
+
+        # Tool result message — emit a user-role turn with functionResponse
+        if role == "tool" or role == "function":
+            contents.append({
+                "role": "user",
+                "parts": [_translate_tool_result_to_gemini(msg)],
+            })
+            continue
+
+        gemini_role = _ROLE_MAP_OPENAI_TO_GEMINI.get(role, "user")
+        parts: List[Dict[str, Any]] = []
+
+        text = _coerce_content_to_text(msg.get("content"))
+        if text:
+            parts.append({"text": text})
+
+        # Assistant messages can carry tool_calls
+        tool_calls = msg.get("tool_calls") or []
+        if isinstance(tool_calls, list):
+            for tc in tool_calls:
+                if isinstance(tc, dict):
+                    parts.append(_translate_tool_call_to_gemini(tc))
+
+        if not parts:
+            # Gemini rejects empty parts; skip the turn entirely
+            continue
+
+        contents.append({"role": gemini_role, "parts": parts})
+
+    system_instruction: Optional[Dict[str, Any]] = None
+    joined_system = "\n".join(p for p in system_text_parts if p).strip()
+    if joined_system:
+        system_instruction = {
+            "role": "system",
+            "parts": [{"text": joined_system}],
+        }
+
+    return contents, system_instruction
+
+
+def _translate_tools_to_gemini(tools: Any) -> List[Dict[str, Any]]:
+    """OpenAI tools[] -> Gemini tools[].functionDeclarations[]."""
+    if not isinstance(tools, list) or not tools:
+        return []
+    declarations: List[Dict[str, Any]] = []
+    for t in tools:
+        if not isinstance(t, dict):
+            continue
+        fn = t.get("function") or {}
+        if not isinstance(fn, dict):
+            continue
+        name = fn.get("name")
+        if not name:
+            continue
+        decl = {"name": str(name)}
+        if fn.get("description"):
+            decl["description"] = str(fn["description"])
+        params = fn.get("parameters")
+        if isinstance(params, dict):
+            decl["parameters"] = params
+        declarations.append(decl)
+    if not declarations:
+        return []
+    return [{"functionDeclarations": declarations}]
+
+
+def _translate_tool_choice_to_gemini(tool_choice: Any) -> Optional[Dict[str, Any]]:
+    """OpenAI tool_choice -> Gemini toolConfig.functionCallingConfig."""
+    if tool_choice is None:
+        return None
+    if isinstance(tool_choice, str):
+        if tool_choice == "auto":
+            return {"functionCallingConfig": {"mode": "AUTO"}}
+        if tool_choice == "required":
+            return {"functionCallingConfig": {"mode": "ANY"}}
+        if tool_choice == "none":
+            return {"functionCallingConfig": {"mode": "NONE"}}
+    if isinstance(tool_choice, dict):
+        fn = tool_choice.get("function") or {}
+        name = fn.get("name")
+        if name:
+            return {
+                "functionCallingConfig": {
+                    "mode": "ANY",
+                    "allowedFunctionNames": [str(name)],
+                },
+            }
+    return None
+
+
+def _normalize_thinking_config(config: Any) -> Optional[Dict[str, Any]]:
+    """Accept thinkingBudget / thinkingLevel / includeThoughts (+ snake_case)."""
+    if not isinstance(config, dict) or not config:
+        return None
+    budget = config.get("thinkingBudget", config.get("thinking_budget"))
+    level = config.get("thinkingLevel", config.get("thinking_level"))
+    include = config.get("includeThoughts", config.get("include_thoughts"))
+    normalized: Dict[str, Any] = {}
+    if isinstance(budget, (int, float)):
+        normalized["thinkingBudget"] = int(budget)
+    if isinstance(level, str) and level.strip():
+        normalized["thinkingLevel"] = level.strip().lower()
+    if isinstance(include, bool):
+        normalized["includeThoughts"] = include
+    return normalized or None
+
+
+def build_gemini_request(
+    *,
+    messages: List[Dict[str, Any]],
+    tools: Any = None,
+    tool_choice: Any = None,
+    temperature: Optional[float] = None,
+    max_tokens: Optional[int] = None,
+    top_p: Optional[float] = None,
+    stop: Any = None,
+    thinking_config: Any = None,
+) -> Dict[str, Any]:
+    """Build the inner Gemini request body (goes inside ``request`` wrapper)."""
+    contents, system_instruction = _build_gemini_contents(messages)
+
+    body: Dict[str, Any] = {"contents": contents}
+    if system_instruction is not None:
+        body["systemInstruction"] = system_instruction
+
+    gemini_tools = _translate_tools_to_gemini(tools)
+    if gemini_tools:
+        body["tools"] = gemini_tools
+    tool_cfg = _translate_tool_choice_to_gemini(tool_choice)
+    if tool_cfg is not None:
+        body["toolConfig"] = tool_cfg
+
+    generation_config: Dict[str, Any] = {}
+    if isinstance(temperature, (int, float)):
+        generation_config["temperature"] = float(temperature)
+    if isinstance(max_tokens, int) and max_tokens > 0:
+        generation_config["maxOutputTokens"] = max_tokens
+    if isinstance(top_p, (int, float)):
+        generation_config["topP"] = float(top_p)
+    if isinstance(stop, str) and stop:
+        generation_config["stopSequences"] = [stop]
+    elif isinstance(stop, list) and stop:
+        generation_config["stopSequences"] = [str(s) for s in stop if s]
+    normalized_thinking = _normalize_thinking_config(thinking_config)
+    if normalized_thinking:
+        generation_config["thinkingConfig"] = normalized_thinking
+    if generation_config:
+        body["generationConfig"] = generation_config
+
+    return body
+
+
+def wrap_code_assist_request(
+    *,
+    project_id: str,
+    model: str,
+    inner_request: Dict[str, Any],
+    user_prompt_id: Optional[str] = None,
+) -> Dict[str, Any]:
+    """Wrap the inner Gemini request in the Code Assist envelope."""
+    return {
+        "project": project_id,
+        "model": model,
+        "user_prompt_id": user_prompt_id or str(uuid.uuid4()),
+        "request": inner_request,
+    }
+
+
+# =============================================================================
+# Response translation: Gemini → OpenAI
+# =============================================================================
+
+def _translate_gemini_response(
+    resp: Dict[str, Any],
+    model: str,
+) -> SimpleNamespace:
+    """Non-streaming Gemini response -> OpenAI-shaped SimpleNamespace.
+
+    Code Assist wraps the actual Gemini response inside ``response``, so we
+    unwrap it first if present.
+    """
+    inner = resp.get("response") if isinstance(resp.get("response"), dict) else resp
+
+    candidates = inner.get("candidates") or []
+    if not isinstance(candidates, list) or not candidates:
+        return _empty_response(model)
+
+    cand = candidates[0]
+    content_obj = cand.get("content") if isinstance(cand, dict) else {}
+    parts = content_obj.get("parts") if isinstance(content_obj, dict) else []
+
+    text_pieces: List[str] = []
+    reasoning_pieces: List[str] = []
+    tool_calls: List[SimpleNamespace] = []
+
+    for i, part in enumerate(parts or []):
+        if not isinstance(part, dict):
+            continue
+        # Thought parts are model's internal reasoning — surface as reasoning,
+        # don't mix into content.
+        if part.get("thought") is True:
+            if isinstance(part.get("text"), str):
+                reasoning_pieces.append(part["text"])
+            continue
+        if isinstance(part.get("text"), str):
+            text_pieces.append(part["text"])
+            continue
+        fc = part.get("functionCall")
+        if isinstance(fc, dict) and fc.get("name"):
+            try:
+                args_str = json.dumps(fc.get("args") or {}, ensure_ascii=False)
+            except (TypeError, ValueError):
+                args_str = "{}"
+            tool_calls.append(SimpleNamespace(
+                id=f"call_{uuid.uuid4().hex[:12]}",
+                type="function",
+                index=len(tool_calls),  # position among tool calls, not among parts
+                function=SimpleNamespace(name=str(fc["name"]), arguments=args_str),
+            ))
+
+    finish_reason = "tool_calls" if tool_calls else _map_gemini_finish_reason(
+        str(cand.get("finishReason") or "")
+    )
+
+    usage_meta = inner.get("usageMetadata") or {}
+    usage = SimpleNamespace(
+        prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
+        completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
+        total_tokens=int(usage_meta.get("totalTokenCount") or 0),
+        prompt_tokens_details=SimpleNamespace(
+            cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
+        ),
+    )
+
+    message = SimpleNamespace(
+        role="assistant",
+        content="".join(text_pieces) if text_pieces else None,
+        tool_calls=tool_calls or None,
+        reasoning="".join(reasoning_pieces) or None,
+        reasoning_content="".join(reasoning_pieces) or None,
+        reasoning_details=None,
+    )
+    choice = SimpleNamespace(
+        index=0,
+        message=message,
+        finish_reason=finish_reason,
+    )
+    return SimpleNamespace(
+        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
+        object="chat.completion",
+        created=int(time.time()),
+        model=model,
+        choices=[choice],
+        usage=usage,
+    )
+
+
+def _empty_response(model: str) -> SimpleNamespace:
+    message = SimpleNamespace(
+        role="assistant", content="", tool_calls=None,
+        reasoning=None, reasoning_content=None, reasoning_details=None,
+    )
+    choice = SimpleNamespace(index=0, message=message, finish_reason="stop")
+    usage = SimpleNamespace(
+        prompt_tokens=0, completion_tokens=0, total_tokens=0,
+        prompt_tokens_details=SimpleNamespace(cached_tokens=0),
+    )
+    return SimpleNamespace(
+        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
+        object="chat.completion",
+        created=int(time.time()),
+        model=model,
+        choices=[choice],
+        usage=usage,
+    )
+
+
+def _map_gemini_finish_reason(reason: str) -> str:
+    mapping = {
+        "STOP": "stop",
+        "MAX_TOKENS": "length",
+        "SAFETY": "content_filter",
+        "RECITATION": "content_filter",
+        "OTHER": "stop",
+    }
+    return mapping.get(reason.upper(), "stop")
+
+
+# =============================================================================
+# Streaming SSE iterator
+# =============================================================================
+
+class _GeminiStreamChunk(SimpleNamespace):
+    """Mimics an OpenAI ChatCompletionChunk with .choices[0].delta."""
+    pass
+
+
+def _make_stream_chunk(
+    *,
+    model: str,
+    content: str = "",
+    tool_call_delta: Optional[Dict[str, Any]] = None,
+    finish_reason: Optional[str] = None,
+    reasoning: str = "",
+) -> _GeminiStreamChunk:
+    delta_kwargs: Dict[str, Any] = {"role": "assistant"}
+    if content:
+        delta_kwargs["content"] = content
+    if tool_call_delta is not None:
+        delta_kwargs["tool_calls"] = [SimpleNamespace(
+            index=tool_call_delta.get("index", 0),
+            id=tool_call_delta.get("id") or f"call_{uuid.uuid4().hex[:12]}",
+            type="function",
+            function=SimpleNamespace(
+                name=tool_call_delta.get("name") or "",
+                arguments=tool_call_delta.get("arguments") or "",
+            ),
+        )]
+    if reasoning:
+        delta_kwargs["reasoning"] = reasoning
+        delta_kwargs["reasoning_content"] = reasoning
+    delta = SimpleNamespace(**delta_kwargs)
+    choice = SimpleNamespace(index=0, delta=delta, finish_reason=finish_reason)
+    return _GeminiStreamChunk(
+        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
+        object="chat.completion.chunk",
+        created=int(time.time()),
+        model=model,
+        choices=[choice],
+        usage=None,
+    )
+
+
+def _iter_sse_events(response: httpx.Response) -> Iterator[Dict[str, Any]]:
+    """Parse Server-Sent Events from an httpx streaming response."""
+    buffer = ""
+    for chunk in response.iter_text():
+        if not chunk:
+            continue
+        buffer += chunk
+        while "\n" in buffer:
+            line, buffer = buffer.split("\n", 1)
+            line = line.rstrip("\r")
+            if not line:
+                continue
+            if line.startswith("data:"):  # the space after the colon is optional in SSE
+                data = line[5:].lstrip()
+                if data == "[DONE]":
+                    return
+                try:
+                    yield json.loads(data)
+                except json.JSONDecodeError:
+                    logger.debug("Non-JSON SSE line: %s", data[:200])
+
+
+def _translate_stream_event(
+    event: Dict[str, Any],
+    model: str,
+    tool_call_indices: Dict[str, int],
+) -> List[_GeminiStreamChunk]:
+    """Unwrap Code Assist envelope and emit OpenAI-shaped chunk(s)."""
+    inner = event.get("response") if isinstance(event.get("response"), dict) else event
+    candidates = inner.get("candidates") or []
+    if not candidates:
+        return []
+    cand = candidates[0]
+    if not isinstance(cand, dict):
+        return []
+
+    chunks: List[_GeminiStreamChunk] = []
+
+    content = cand.get("content") or {}
+    parts = content.get("parts") if isinstance(content, dict) else []
+    for part in parts or []:
+        if not isinstance(part, dict):
+            continue
+        if part.get("thought") is True and isinstance(part.get("text"), str):
+            chunks.append(_make_stream_chunk(
+                model=model, reasoning=part["text"],
+            ))
+            continue
+        if isinstance(part.get("text"), str) and part["text"]:
+            chunks.append(_make_stream_chunk(model=model, content=part["text"]))
+        fc = part.get("functionCall")
+        if isinstance(fc, dict) and fc.get("name"):
+            name = str(fc["name"])
+            idx = tool_call_indices.setdefault(f"{len(tool_call_indices)}:{name}", len(tool_call_indices))  # one index per call, even for a repeated tool name
+            try:
+                args_str = json.dumps(fc.get("args") or {}, ensure_ascii=False)
+            except (TypeError, ValueError):
+                args_str = "{}"
+            chunks.append(_make_stream_chunk(
+                model=model,
+                tool_call_delta={
+                    "index": idx,
+                    "name": name,
+                    "arguments": args_str,
+                },
+            ))
+
+    finish_reason_raw = str(cand.get("finishReason") or "")
+    if finish_reason_raw:
+        mapped = _map_gemini_finish_reason(finish_reason_raw)
+        if tool_call_indices:
+            mapped = "tool_calls"
+        chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
+    return chunks
+
+
+# =============================================================================
+# GeminiCloudCodeClient — OpenAI-compatible facade
+# =============================================================================
+
+MARKER_BASE_URL = "cloudcode-pa://google"
+
+
+class _GeminiChatCompletions:
+    def __init__(self, client: "GeminiCloudCodeClient"):
+        self._client = client
+
+    def create(self, **kwargs: Any) -> Any:
+        return self._client._create_chat_completion(**kwargs)
+
+
+class _GeminiChatNamespace:
+    def __init__(self, client: "GeminiCloudCodeClient"):
+        self.completions = _GeminiChatCompletions(client)
+
+
+class GeminiCloudCodeClient:
+    """Minimal OpenAI-SDK-compatible facade over Code Assist v1internal."""
+
+    def __init__(
+        self,
+        *,
+        api_key: Optional[str] = None,
+        base_url: Optional[str] = None,
+        default_headers: Optional[Dict[str, str]] = None,
+        project_id: str = "",
+        **_: Any,
+    ):
+        # `api_key` here is a dummy — real auth is the OAuth access token
+        # fetched on every call via agent.google_oauth.get_valid_access_token().
+        # We accept the kwarg for openai.OpenAI interface parity.
+        self.api_key = api_key or "google-oauth"
+        self.base_url = base_url or MARKER_BASE_URL
+        self._default_headers = dict(default_headers or {})
+        self._configured_project_id = project_id
+        self._project_context: Optional[ProjectContext] = None
+        self._project_context_lock = False  # simple single-thread guard
+        self.chat = _GeminiChatNamespace(self)
+        self.is_closed = False
+        self._http = httpx.Client(timeout=httpx.Timeout(connect=15.0, read=600.0, write=30.0, pool=30.0))
+
+    def close(self) -> None:
+        self.is_closed = True
+        try:
+            self._http.close()
+        except Exception:
+            pass
+
+    # Context-manager support, for parity with the OpenAI SDK client
+    def __enter__(self):
+        return self
+
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        self.close()
+
+    def _ensure_project_context(self, access_token: str, model: str) -> ProjectContext:
+        """Lazily resolve and cache the project context for this client."""
+        if self._project_context is not None:
+            return self._project_context
+
+        env_project = google_oauth.resolve_project_id_from_env()
+        creds = google_oauth.load_credentials()
+        stored_project = creds.project_id if creds else ""
+
+        # Prefer what's already baked into the creds
+        if stored_project:
+            self._project_context = ProjectContext(
+                project_id=stored_project,
+                managed_project_id=creds.managed_project_id if creds else "",
+                tier_id="",
+                source="stored",
+            )
+            return self._project_context
+
+        ctx = resolve_project_context(
+            access_token,
+            configured_project_id=self._configured_project_id,
+            env_project_id=env_project,
+            user_agent_model=model,
+        )
+        # Persist discovered project back to the creds file so the next
+        # session doesn't re-run the discovery.
+        if ctx.project_id or ctx.managed_project_id:
+            google_oauth.update_project_ids(
+                project_id=ctx.project_id,
+                managed_project_id=ctx.managed_project_id,
+            )
+        self._project_context = ctx
+        return ctx
+
+    def _create_chat_completion(
+        self,
+        *,
+        model: str = "gemini-2.5-flash",
+        messages: Optional[List[Dict[str, Any]]] = None,
+        stream: bool = False,
+        tools: Any = None,
+        tool_choice: Any = None,
+        temperature: Optional[float] = None,
+        max_tokens: Optional[int] = None,
+        top_p: Optional[float] = None,
+        stop: Any = None,
+        extra_body: Optional[Dict[str, Any]] = None,
+        timeout: Any = None,
+        **_: Any,
+    ) -> Any:
+        access_token = google_oauth.get_valid_access_token()
+        ctx = self._ensure_project_context(access_token, model)
+
+        thinking_config = None
+        if isinstance(extra_body, dict):
+            thinking_config = extra_body.get("thinking_config") or extra_body.get("thinkingConfig")
+
+        inner = build_gemini_request(
+            messages=messages or [],
+            tools=tools,
+            tool_choice=tool_choice,
+            temperature=temperature,
+            max_tokens=max_tokens,
+            top_p=top_p,
+            stop=stop,
+            thinking_config=thinking_config,
+        )
+        wrapped = wrap_code_assist_request(
+            project_id=ctx.project_id,
+            model=model,
+            inner_request=inner,
+        )
+
+        headers = {
+            "Content-Type": "application/json",
+            "Accept": "application/json",
+            "Authorization": f"Bearer {access_token}",
+            "User-Agent": "hermes-agent (gemini-cli-compat)",
+            "X-Goog-Api-Client": "gl-python/hermes",
+            "x-activity-request-id": str(uuid.uuid4()),
+        }
+        headers.update(self._default_headers)
+
+        if stream:
+            return self._stream_completion(model=model, wrapped=wrapped, headers=headers)
+
+        url = f"{CODE_ASSIST_ENDPOINT}/v1internal:generateContent"
+        response = self._http.post(url, json=wrapped, headers=headers)
+        if response.status_code != 200:
+            raise _gemini_http_error(response)
+        try:
+            payload = response.json()
+        except ValueError as exc:
+            raise CodeAssistError(
+                f"Invalid JSON from Code Assist: {exc}",
+                code="code_assist_invalid_json",
+            ) from exc
+        return _translate_gemini_response(payload, model=model)
+
+    def _stream_completion(
+        self,
+        *,
+        model: str,
+        wrapped: Dict[str, Any],
+        headers: Dict[str, str],
+    ) -> Iterator[_GeminiStreamChunk]:
+        """Generator that yields OpenAI-shaped streaming chunks."""
+        url = f"{CODE_ASSIST_ENDPOINT}/v1internal:streamGenerateContent?alt=sse"
+        stream_headers = dict(headers)
+        stream_headers["Accept"] = "text/event-stream"
+
+        def _generator() -> Iterator[_GeminiStreamChunk]:
+            try:
+                with self._http.stream("POST", url, json=wrapped, headers=stream_headers) as response:
+                    if response.status_code != 200:
+                        # Materialize error body for better diagnostics
+                        response.read()
+                        raise _gemini_http_error(response)
+                    tool_call_indices: Dict[str, int] = {}
+                    for event in _iter_sse_events(response):
+                        for chunk in _translate_stream_event(event, model, tool_call_indices):
+                            yield chunk
+            except httpx.HTTPError as exc:
+                raise CodeAssistError(
+                    f"Streaming request failed: {exc}",
+                    code="code_assist_stream_error",
+                ) from exc
+
+        return _generator()
+
+
+def _gemini_http_error(response: httpx.Response) -> CodeAssistError:
+    status = response.status_code
+    try:
+        body = response.text[:500]
+    except Exception:
+        body = ""
+    # Let run_agent's retry logic see auth errors as rotatable via `api_key`
+    code = f"code_assist_http_{status}"
+    if status == 401:
+        code = "code_assist_unauthorized"
+    elif status == 429:
+        code = "code_assist_rate_limited"
+    return CodeAssistError(
+        f"Code Assist returned HTTP {status}: {body}",
+        code=code,
+    )
@@ -0,0 +1,417 @@
+"""Google Code Assist API client — project discovery, onboarding, quota.
+
+The Code Assist API powers Google's official gemini-cli. It sits at
+``cloudcode-pa.googleapis.com`` and provides:
+
+- Free tier access (generous daily quota) for personal Google accounts
+- Paid tier access via GCP projects with billing / Workspace / Standard / Enterprise
+
+This module handles the control-plane dance needed before inference:
+
+1. ``load_code_assist()`` — probe the user's account to learn what tier they're on
+   and whether a ``cloudaicompanionProject`` is already assigned.
+2. ``onboard_user()`` — if the user hasn't been onboarded yet (new account, fresh
+   free tier, etc.), call this with the chosen tier + project id. Supports LRO
+   polling for slow provisioning.
+3. ``retrieve_user_quota()`` — fetch the ``buckets[]`` array showing remaining
+   quota per model, used by the ``/gquota`` slash command.
+
+VPC-SC handling: enterprise accounts under a VPC Service Controls perimeter
+will get ``SECURITY_POLICY_VIOLATED`` on ``load_code_assist``. We catch this
+and force the account to ``standard-tier`` so the call chain still succeeds.
+
+Derived from opencode-gemini-auth (MIT) and clawdbot/extensions/google. The
+request/response shapes are specific to Google's internal Code Assist API,
+which is not publicly documented; we copy them from the reference implementations.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import os
+import time
+import urllib.error
+import urllib.parse
+import urllib.request
+import uuid
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional
+
+logger = logging.getLogger(__name__)
+
+
+# =============================================================================
+# Constants
+# =============================================================================
+
+CODE_ASSIST_ENDPOINT = "https://cloudcode-pa.googleapis.com"
+
+# Fallback endpoints tried when prod returns an error during project discovery
+FALLBACK_ENDPOINTS = [
+    "https://daily-cloudcode-pa.sandbox.googleapis.com",
+    "https://autopush-cloudcode-pa.sandbox.googleapis.com",
+]
+
+# Tier identifiers that Google's API uses
+FREE_TIER_ID = "free-tier"
+LEGACY_TIER_ID = "legacy-tier"
+STANDARD_TIER_ID = "standard-tier"
+
+# Default HTTP headers matching gemini-cli's fingerprint.
+# Google may reject unrecognized User-Agents on these internal endpoints.
+_GEMINI_CLI_USER_AGENT = "google-api-nodejs-client/9.15.1 (gzip)"
+_X_GOOG_API_CLIENT = "gl-node/24.0.0"
+_DEFAULT_REQUEST_TIMEOUT = 30.0
+_ONBOARDING_POLL_ATTEMPTS = 12
+_ONBOARDING_POLL_INTERVAL_SECONDS = 5.0
+
+
+class CodeAssistError(RuntimeError):
+    def __init__(self, message: str, *, code: str = "code_assist_error") -> None:
+        super().__init__(message)
+        self.code = code
+
+
+class ProjectIdRequiredError(CodeAssistError):
+    def __init__(self, message: str = "GCP project id required for this tier") -> None:
+        super().__init__(message, code="code_assist_project_id_required")
+
+
+# =============================================================================
+# HTTP primitive (auth via Bearer token passed per-call)
+# =============================================================================
+
+def _build_headers(access_token: str, *, user_agent_model: str = "") -> Dict[str, str]:
+    ua = _GEMINI_CLI_USER_AGENT
+    if user_agent_model:
+        ua = f"{ua} model/{user_agent_model}"
+    return {
+        "Content-Type": "application/json",
+        "Accept": "application/json",
+        "Authorization": f"Bearer {access_token}",
+        "User-Agent": ua,
+        "X-Goog-Api-Client": _X_GOOG_API_CLIENT,
+        "x-activity-request-id": str(uuid.uuid4()),
+    }
+
+
+def _client_metadata() -> Dict[str, str]:
+    """Match Google's gemini-cli exactly — unrecognized metadata may be rejected."""
+    return {
+        "ideType": "IDE_UNSPECIFIED",
+        "platform": "PLATFORM_UNSPECIFIED",
+        "pluginType": "GEMINI",
+    }
+
+
+def _post_json(
+    url: str,
+    body: Dict[str, Any],
+    access_token: str,
+    *,
+    timeout: float = _DEFAULT_REQUEST_TIMEOUT,
+    user_agent_model: str = "",
+) -> Dict[str, Any]:
+    data = json.dumps(body).encode("utf-8")
+    request = urllib.request.Request(
+        url, data=data, method="POST",
+        headers=_build_headers(access_token, user_agent_model=user_agent_model),
+    )
+    try:
+        with urllib.request.urlopen(request, timeout=timeout) as response:
+            raw = response.read().decode("utf-8", errors="replace")
+            return json.loads(raw) if raw else {}
+    except urllib.error.HTTPError as exc:
+        detail = ""
+        try:
+            detail = exc.read().decode("utf-8", errors="replace")
+        except Exception:
+            pass
+        # Special case: VPC-SC violation should be distinguishable
+        if _is_vpc_sc_violation(detail):
+            raise CodeAssistError(
+                f"VPC-SC policy violation: {detail}",
+                code="code_assist_vpc_sc",
+            ) from exc
+        raise CodeAssistError(
+            f"Code Assist HTTP {exc.code}: {detail or exc.reason}",
+            code=f"code_assist_http_{exc.code}",
+        ) from exc
+    except urllib.error.URLError as exc:
+        raise CodeAssistError(
+            f"Code Assist request failed: {exc}",
+            code="code_assist_network_error",
+        ) from exc
+
+
+def _is_vpc_sc_violation(body: str) -> bool:
+    """Detect a VPC Service Controls violation from a response body."""
+    if not body:
+        return False
+    try:
+        parsed = json.loads(body)
+    except (json.JSONDecodeError, ValueError):
+        return "SECURITY_POLICY_VIOLATED" in body
+    # Walk the nested error structure Google uses
+    error = parsed.get("error") if isinstance(parsed, dict) else None
+    if not isinstance(error, dict):
+        return False
+    details = error.get("details") or []
+    if isinstance(details, list):
+        for item in details:
+            if isinstance(item, dict):
+                reason = item.get("reason") or ""
+                if reason == "SECURITY_POLICY_VIOLATED":
+                    return True
+    msg = str(error.get("message", ""))
+    return "SECURITY_POLICY_VIOLATED" in msg
+
+
+# =============================================================================
+# load_code_assist — discovers current tier + assigned project
+# =============================================================================
+
+@dataclass
+class CodeAssistProjectInfo:
+    """Result from ``load_code_assist``."""
+    current_tier_id: str = ""
+    cloudaicompanion_project: str = ""   # Google-managed project (free tier)
+    allowed_tiers: List[str] = field(default_factory=list)
+    raw: Dict[str, Any] = field(default_factory=dict)
+
+
+def load_code_assist(
+    access_token: str,
+    *,
+    project_id: str = "",
+    user_agent_model: str = "",
+) -> CodeAssistProjectInfo:
+    """Call ``POST /v1internal:loadCodeAssist`` with prod → sandbox fallback.
+
+    Returns whatever tier + project info Google reports. On VPC-SC violations,
+    returns a synthetic ``standard-tier`` result so the chain can continue.
+    """
+    body: Dict[str, Any] = {
+        "metadata": {
+            "duetProject": project_id,
+            **_client_metadata(),
+        },
+    }
+    if project_id:
+        body["cloudaicompanionProject"] = project_id
+
+    endpoints = [CODE_ASSIST_ENDPOINT] + FALLBACK_ENDPOINTS
+    last_err: Optional[Exception] = None
+    for endpoint in endpoints:
+        url = f"{endpoint}/v1internal:loadCodeAssist"
+        try:
+            resp = _post_json(url, body, access_token, user_agent_model=user_agent_model)
+            return _parse_load_response(resp)
+        except CodeAssistError as exc:
+            if exc.code == "code_assist_vpc_sc":
+                logger.info("VPC-SC violation on %s — defaulting to standard-tier", endpoint)
+                return CodeAssistProjectInfo(
+                    current_tier_id=STANDARD_TIER_ID,
+                    cloudaicompanion_project=project_id,
+                )
+            last_err = exc
+            logger.warning("loadCodeAssist failed on %s: %s", endpoint, exc)
+            continue
+    if last_err:
+        raise last_err
+    return CodeAssistProjectInfo()
+
+
+def _parse_load_response(resp: Dict[str, Any]) -> CodeAssistProjectInfo:
+    current_tier = resp.get("currentTier") or {}
+    tier_id = str(current_tier.get("id") or "") if isinstance(current_tier, dict) else ""
+    project = str(resp.get("cloudaicompanionProject") or "")
+    allowed = resp.get("allowedTiers") or []
+    allowed_ids: List[str] = []
+    if isinstance(allowed, list):
+        for t in allowed:
+            if isinstance(t, dict):
+                tid = str(t.get("id") or "")
+                if tid:
+                    allowed_ids.append(tid)
+    return CodeAssistProjectInfo(
+        current_tier_id=tier_id,
+        cloudaicompanion_project=project,
+        allowed_tiers=allowed_ids,
+        raw=resp,
+    )
+
+
+# =============================================================================
+# onboard_user — provisions a new user on a tier (with LRO polling)
+# =============================================================================
+
+def onboard_user(
+    access_token: str,
+    *,
+    tier_id: str,
+    project_id: str = "",
+    user_agent_model: str = "",
+) -> Dict[str, Any]:
+    """Call ``POST /v1internal:onboardUser`` to provision the user.
+
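+    For paid tiers, ``project_id`` is REQUIRED; ``ProjectIdRequiredError`` is
+    raised when it is missing. For the free and legacy tiers, ``project_id`` is
+    optional (Google assigns one on the free tier).
+
+    Returns the final operation response. Polls ``/v1internal/<name>`` for up
+    to ``_ONBOARDING_POLL_ATTEMPTS`` × ``_ONBOARDING_POLL_INTERVAL_SECONDS``
+    (default: 12 × 5s = 1 min). If the operation is still not done after the
+    last attempt, the initial (unfinished) response is returned instead.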
+    """
+    if tier_id not in (FREE_TIER_ID, LEGACY_TIER_ID) and not project_id:
+        raise ProjectIdRequiredError(
+            f"Tier {tier_id!r} requires a GCP project id. "
+            "Set HERMES_GEMINI_PROJECT_ID or GOOGLE_CLOUD_PROJECT."
+        )
+
+    body: Dict[str, Any] = {
+        "tierId": tier_id,
+        "metadata": _client_metadata(),
+    }
+    if project_id:
+        body["cloudaicompanionProject"] = project_id
+
+    endpoint = CODE_ASSIST_ENDPOINT
+    url = f"{endpoint}/v1internal:onboardUser"
+    resp = _post_json(url, body, access_token, user_agent_model=user_agent_model)
+
+    # Poll if LRO (long-running operation)
+    if not resp.get("done"):
+        op_name = resp.get("name", "")
+        if not op_name:
+            return resp
+        for attempt in range(_ONBOARDING_POLL_ATTEMPTS):
+            time.sleep(_ONBOARDING_POLL_INTERVAL_SECONDS)
+            poll_url = f"{endpoint}/v1internal/{op_name}"
+            try:
+                poll_resp = _post_json(poll_url, {}, access_token, user_agent_model=user_agent_model)
+            except CodeAssistError as exc:
+                logger.warning("Onboarding poll attempt %d failed: %s", attempt + 1, exc)
+                continue
+            if poll_resp.get("done"):
+                return poll_resp
+        logger.warning("Onboarding did not complete within %d attempts", _ONBOARDING_POLL_ATTEMPTS)
+    return resp
+
+
+# =============================================================================
+# retrieve_user_quota — for /gquota
+# =============================================================================
+
+@dataclass
+class QuotaBucket:
+    model_id: str
+    token_type: str = ""
+    remaining_fraction: float = 0.0
+    reset_time_iso: str = ""
+    raw: Dict[str, Any] = field(default_factory=dict)
+
+
+def retrieve_user_quota(
+    access_token: str,
+    *,
+    project_id: str = "",
+    user_agent_model: str = "",
+) -> List[QuotaBucket]:
+    """Call ``POST /v1internal:retrieveUserQuota`` and parse ``buckets[]``."""
+    body: Dict[str, Any] = {}
+    if project_id:
+        body["project"] = project_id
+    url = f"{CODE_ASSIST_ENDPOINT}/v1internal:retrieveUserQuota"
+    resp = _post_json(url, body, access_token, user_agent_model=user_agent_model)
+    raw_buckets = resp.get("buckets") or []
+    buckets: List[QuotaBucket] = []
+    if not isinstance(raw_buckets, list):
+        return buckets
+    for b in raw_buckets:
+        if not isinstance(b, dict):
+            continue
+        buckets.append(QuotaBucket(
+            model_id=str(b.get("modelId") or ""),
+            token_type=str(b.get("tokenType") or ""),
+            remaining_fraction=float(b.get("remainingFraction") or 0.0),
+            reset_time_iso=str(b.get("resetTime") or ""),
+            raw=b,
+        ))
+    return buckets
+
+
+# =============================================================================
+# Project context resolution
+# =============================================================================
+
+@dataclass
+class ProjectContext:
+    """Resolved state for a given OAuth session."""
+    project_id: str = ""           # effective project id sent on requests
+    managed_project_id: str = ""   # Google-assigned project (free tier)
+    tier_id: str = ""
+    source: str = ""               # "env", "config", "discovered", "onboarded"
+
+
+def resolve_project_context(
+    access_token: str,
+    *,
+    configured_project_id: str = "",
+    env_project_id: str = "",
+    user_agent_model: str = "",
+) -> ProjectContext:
+    """Figure out what project id + tier to use for requests.
+
+    Priority:
+      1. If configured_project_id is set, use it directly; otherwise, if
+         env_project_id is set, use that. Either case short-circuits with
+         an assumed standard tier (no discovery needed).
+      2. Otherwise call loadCodeAssist to see what Google says.
+      3. If no tier assigned yet, onboard the user (free tier default).
+    """
+    # Short-circuit: caller provided a project id
+    if configured_project_id:
+        return ProjectContext(
+            project_id=configured_project_id,
+            tier_id=STANDARD_TIER_ID,  # assume paid since they specified one
+            source="config",
+        )
+    if env_project_id:
+        return ProjectContext(
+            project_id=env_project_id,
+            tier_id=STANDARD_TIER_ID,
+            source="env",
+        )
+
+    # Discover via loadCodeAssist
+    info = load_code_assist(access_token, user_agent_model=user_agent_model)
+
+    effective_project = info.cloudaicompanion_project
+    tier = info.current_tier_id
+
+    if not tier:
+        # User hasn't been onboarded — provision them on free tier
+        onboard_resp = onboard_user(
+            access_token,
+            tier_id=FREE_TIER_ID,
+            project_id="",
+            user_agent_model=user_agent_model,
+        )
+        # Re-parse from the onboard response
+        response_body = onboard_resp.get("response") or {}
+        if isinstance(response_body, dict):
+            effective_project = (
+                effective_project
+                or str(response_body.get("cloudaicompanionProject") or "")
+            )
+        tier = FREE_TIER_ID
+        source = "onboarded"
+    else:
+        source = "discovered"
+
+    return ProjectContext(
+        project_id=effective_project,
+        managed_project_id=effective_project if tier == FREE_TIER_ID else "",
+        tier_id=tier,
+        source=source,
+    )
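The resolution order used by `resolve_project_context` can be sketched in isolation. This is a minimal sketch, not the module's code: `pick_project` and its `discover` callable are hypothetical names, with `discover` standing in for the `load_code_assist` call and returning a `(project, tier)` pair. The onboarding call itself is left out.

```python
from typing import Callable, Tuple


def pick_project(
    configured: str,
    env: str,
    discover: Callable[[], Tuple[str, str]],
) -> Tuple[str, str, str]:
    """Return (project_id, tier_id, source) in config > env > discovery order."""
    # An explicit project id short-circuits discovery; a paid tier is assumed.
    if configured:
        return configured, "standard-tier", "config"
    if env:
        return env, "standard-tier", "env"
    # Hypothetical stand-in for load_code_assist(): yields (project, tier).
    project, tier = discover()
    if not tier:
        # No tier reported yet: the real code onboards onto the free tier here.
        return project, "free-tier", "onboarded"
    return project, tier, "discovered"
```

Passing discovery in as a callable keeps the sketch free of network calls, so the ordering can be checked without credentials.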
@@ -4924,6 +4924,52 @@ class HermesCLI:
            return "\n".join(p for p in parts if p)
        return str(value)

+    def _handle_gquota_command(self, cmd_original: str) -> None:
+        """Show Google Gemini Code Assist quota usage for the current OAuth account."""
+        try:
+            from agent.google_oauth import get_valid_access_token, GoogleOAuthError, load_credentials
+            from agent.google_code_assist import retrieve_user_quota, CodeAssistError
+        except ImportError as exc:
+            self.console.print(f"  [red]Gemini modules unavailable: {exc}[/]")
+            return
+
+        try:
+            access_token = get_valid_access_token()
+        except GoogleOAuthError as exc:
+            self.console.print(f"  [yellow]{exc}[/]")
+            self.console.print("  Run [bold]/model[/] and pick 'Google Gemini (OAuth)' to sign in.")
+            return
+
+        creds = load_credentials()
+        project_id = (creds.project_id if creds else "") or ""
+
+        try:
+            buckets = retrieve_user_quota(access_token, project_id=project_id)
+        except CodeAssistError as exc:
+            self.console.print(f"  [red]Quota lookup failed:[/] {exc}")
+            return
+
+        if not buckets:
+            self.console.print("  [dim]No quota buckets reported (account may be on legacy/unmetered tier).[/]")
+            return
+
+        # Sort for stable display, group by model
+        buckets.sort(key=lambda b: (b.model_id, b.token_type))
+        self.console.print()
+        self.console.print(f"  [bold]Gemini Code Assist quota[/]  (project: {project_id or '(auto / free-tier)'})")
+        self.console.print()
+        for b in buckets:
+            pct = max(0.0, min(1.0, b.remaining_fraction))
+            width = 20
+            filled = round(pct * width)
+            bar = "▓" * filled + "░" * (width - filled)
+            pct_str = f"{round(pct * 100):3d}%"  # round, not truncate (0.29 -> 29%, not 28%)
+            header = b.model_id
+            if b.token_type:
+                header += f" [{b.token_type}]"
+            self.console.print(f"    {header:40s}  {bar}  {pct_str}")
+        self.console.print()
+
    def _handle_personality_command(self, cmd: str):
        """Handle the /personality command to set predefined personalities."""
        parts = cmd.split(maxsplit=1)
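The bar rendering in the `/gquota` handler above reduces to clamping a fraction and filling a fixed-width strip. A minimal standalone sketch, assuming a hypothetical helper name `render_quota_bar` (the handler inlines this logic rather than calling a helper):

```python
def render_quota_bar(remaining_fraction: float, width: int = 20) -> str:
    """Render a fixed-width bar for a remaining-quota fraction."""
    # Clamp to [0, 1] so out-of-range API values cannot break the layout.
    pct = max(0.0, min(1.0, remaining_fraction))
    filled = round(pct * width)
    return "▓" * filled + "░" * (width - filled)
```

Clamping first means the returned string is always exactly `width` characters, which keeps the columns aligned across buckets.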
@@ -5433,6 +5479,8 @@ class HermesCLI:
            self._handle_model_switch(cmd_original)
        elif canonical == "provider":
            self._show_model_and_providers()
+        elif canonical == "gquota":
+            self._handle_gquota_command(cmd_original)

        elif canonical == "personality":
            # Use original case (handler lowercases the personality name itself)
@@ -7411,7 +7459,15 @@ class HermesCLI:
        self._invalidate()

    def _get_approval_display_fragments(self):
-        """Render the dangerous-command approval panel for the prompt_toolkit UI."""
+        """Render the dangerous-command approval panel for the prompt_toolkit UI.
+
+        Layout priority: title + command + choices must always render, even if
+        the terminal is short or the description is long. Description is placed
+        at the bottom of the panel and gets truncated to fit the remaining row
+        budget. This prevents HSplit from clipping approve/deny off-screen when
+        tirith findings produce multi-paragraph descriptions or when the user
+        runs in a compact terminal pane.
+        """
        state = self._approval_state
        if not state:
            return []
@@ -7470,22 +7526,89 @@ class HermesCLI:
        box_width = _panel_box_width(title, preview_lines)
        inner_text_width = max(8, box_width - 2)

+        # Pre-wrap the mandatory content — command + choices must always render.
+        cmd_wrapped = _wrap_panel_text(cmd_display, inner_text_width)
+
+        # (choice_index, wrapped_line) so we can re-apply selected styling below
+        choice_wrapped: list[tuple[int, str]] = []
+        for i, choice in enumerate(choices):
+            label = choice_labels.get(choice, choice)
+            prefix = '❯ ' if i == selected else '  '
+            for wrapped in _wrap_panel_text(f"{prefix}{label}", inner_text_width, subsequent_indent="  "):
+                choice_wrapped.append((i, wrapped))
+
+        # Budget vertical space so HSplit never clips the command or choices.
+        # Panel chrome (full layout with separators):
+        #   top border + title + blank_after_title
+        #   + blank_between_cmd_choices + bottom border = 5 rows.
+        # In tight terminals we collapse to:
+        #   top border + title + bottom border = 3 rows (no blanks).
+        #
+        # reserved_below: rows consumed below the approval panel by the
+        # spinner/tool-progress line, status bar, input area, separators, and
+        # prompt symbol. Measured at ~6 rows during live PTY approval prompts;
+        # budget 6 so we don't overestimate the panel's room.
+        term_rows = shutil.get_terminal_size((100, 24)).lines
+        chrome_full = 5
+        chrome_tight = 3
+        reserved_below = 6
+
+        available = max(0, term_rows - reserved_below)
+        mandatory_full = chrome_full + len(cmd_wrapped) + len(choice_wrapped)
+
+        # If the full-chrome panel doesn't fit, drop the separator blanks.
+        # This keeps the command and every choice on-screen in compact terminals.
+        use_compact_chrome = mandatory_full > available
+        chrome_rows = chrome_tight if use_compact_chrome else chrome_full
+
+        # If the command itself is too long to leave room for choices (e.g. user
+        # hit "view" on a multi-hundred-character command), truncate it so the
+        # approve/deny buttons still render. Keep at least 1 row of command.
+        max_cmd_rows = max(1, available - chrome_rows - len(choice_wrapped))
+        if len(cmd_wrapped) > max_cmd_rows:
+            keep = max(1, max_cmd_rows - 1)
+            cmd_wrapped = cmd_wrapped[:keep] + ["… (command truncated — use /logs or /debug for full text)"]
+
+        # Allocate any remaining rows to description. The extra -1 in full mode
+        # accounts for the blank separator between choices and description.
+        mandatory_no_desc = chrome_rows + len(cmd_wrapped) + len(choice_wrapped)
+        desc_sep_cost = 0 if use_compact_chrome else 1
+        available_for_desc = available - mandatory_no_desc - desc_sep_cost
+        # Even on huge terminals, cap description height so the panel stays compact.
+        available_for_desc = max(0, min(available_for_desc, 10))
+
+        desc_wrapped = _wrap_panel_text(description, inner_text_width) if description else []
+        if available_for_desc < 1 or not desc_wrapped:
+            desc_wrapped = []
+        elif len(desc_wrapped) > available_for_desc:
+            keep = max(1, available_for_desc - 1)
+            desc_wrapped = desc_wrapped[:keep] + ["… (description truncated)"]
+
+        # Render: title → command → choices → description (description last so
+        # any remaining overflow clips from the bottom of the least-critical
+        # content, never from the command or choices). Use compact chrome (no
+        # blank separators) when the terminal is tight.
        lines = []
        lines.append(('class:approval-border', '╭' + ('─' * box_width) + '╮\n'))
        _append_panel_line(lines, 'class:approval-border', 'class:approval-title', title, box_width)
-        _append_blank_panel_line(lines, 'class:approval-border', box_width)
-        for wrapped in _wrap_panel_text(description, inner_text_width):
-            _append_panel_line(lines, 'class:approval-border', 'class:approval-desc', wrapped, box_width)
-        for wrapped in _wrap_panel_text(cmd_display, inner_text_width):
+        if not use_compact_chrome:
+            _append_blank_panel_line(lines, 'class:approval-border', box_width)
+
+        for wrapped in cmd_wrapped:
            _append_panel_line(lines, 'class:approval-border', 'class:approval-cmd', wrapped, box_width)
-        _append_blank_panel_line(lines, 'class:approval-border', box_width)
-        for i, choice in enumerate(choices):
-            label = choice_labels.get(choice, choice)
+        if not use_compact_chrome:
+            _append_blank_panel_line(lines, 'class:approval-border', box_width)
+
+        for i, wrapped in choice_wrapped:
            style = 'class:approval-selected' if i == selected else 'class:approval-choice'
-            prefix = '❯ ' if i == selected else '  '
-            for wrapped in _wrap_panel_text(f"{prefix}{label}", inner_text_width, subsequent_indent="  "):
-                _append_panel_line(lines, 'class:approval-border', style, wrapped, box_width)
-        _append_blank_panel_line(lines, 'class:approval-border', box_width)
+            _append_panel_line(lines, 'class:approval-border', style, wrapped, box_width)
+
+        if desc_wrapped:
+            if not use_compact_chrome:
+                _append_blank_panel_line(lines, 'class:approval-border', box_width)
+            for wrapped in desc_wrapped:
+                _append_panel_line(lines, 'class:approval-border', 'class:approval-desc', wrapped, box_width)
+
        lines.append(('class:approval-border', '╰' + ('─' * box_width) + '╯\n'))
        return lines
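The row budgeting in the approval panel can be checked with plain integers. The following is a rough sketch that counts rows only and does no rendering; `budget_panel_rows` and its parameters are hypothetical names, and the constants (5 and 3 chrome rows, 6 reserved rows, a cap of 10 description rows) are copied from the hunk above.

```python
from typing import Dict


def budget_panel_rows(
    term_rows: int,
    cmd_rows: int,
    choice_rows: int,
    desc_rows: int,
    reserved_below: int = 6,
    desc_cap: int = 10,
) -> Dict[str, int]:
    """Return how many rows each panel part gets, mirroring the hunk's arithmetic."""
    available = max(0, term_rows - reserved_below)
    # Drop the blank separator rows when the full layout would not fit.
    compact = 5 + cmd_rows + choice_rows > available
    chrome = 3 if compact else 5
    max_cmd = max(1, available - chrome - choice_rows)
    if cmd_rows > max_cmd:
        # Kept command rows plus one truncation-marker row.
        cmd_rows = max(1, max_cmd - 1) + 1
    sep = 0 if compact else 1
    room = available - (chrome + cmd_rows + choice_rows) - sep
    room = max(0, min(room, desc_cap))
    if room < 1:
        desc_rows = 0
    elif desc_rows > room:
        # Kept description rows plus one truncation-marker row.
        desc_rows = max(1, room - 1) + 1
    return {"compact": int(compact), "cmd": cmd_rows,
            "choices": choice_rows, "desc": desc_rows}
```

On a 24-row terminal everything fits with full chrome; on a 14-row terminal the description is dropped first and the command is truncated, while every choice row survives.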

@@ -9137,7 +9260,13 @@ class HermesCLI:
            lines.append((border_style, "│" + (" " * box_width) + "│\n"))

        def _get_clarify_display():
-            """Build styled text for the clarify question/choices panel."""
+            """Build styled text for the clarify question/choices panel.
+
+            Layout priority: choices + Other option must always render even if
+            the question is very long. The question is budgeted to leave enough
+            rows for the choices and trailing chrome; anything over the budget
+            is truncated with a marker.
+            """
            state = cli_ref._clarify_state
            if not state:
                return []
@@ -9158,48 +9287,97 @@ class HermesCLI:
            box_width = _panel_box_width("Hermes needs your input", preview_lines)
            inner_text_width = max(8, box_width - 2)

+            # Pre-wrap choices + Other option — these are mandatory.
+            choice_wrapped: list[tuple[int, str]] = []
+            if choices:
+                for i, choice in enumerate(choices):
+                    prefix = '❯ ' if i == selected and not cli_ref._clarify_freetext else '  '
+                    for wrapped in _wrap_panel_text(f"{prefix}{choice}", inner_text_width, subsequent_indent="  "):
+                        choice_wrapped.append((i, wrapped))
+                # Trailing Other row(s)
+                other_idx = len(choices)
+                if selected == other_idx and not cli_ref._clarify_freetext:
+                    other_label_mand = '❯ Other (type your answer)'
+                elif cli_ref._clarify_freetext:
+                    other_label_mand = '❯ Other (type below)'
+                else:
+                    other_label_mand = '  Other (type your answer)'
+                other_wrapped = _wrap_panel_text(other_label_mand, inner_text_width, subsequent_indent="  ")
+            elif cli_ref._clarify_freetext:
+                # Freetext-only mode: the guidance line takes the place of choices.
+                other_wrapped = _wrap_panel_text(
+                    "Type your answer in the prompt below, then press Enter.",
+                    inner_text_width,
+                )
+            else:
+                other_wrapped = []
+
+            # Budget the question so mandatory rows always render.
+            # Chrome layouts:
+            #   full : top border + blank_after_title + blank_after_question
+            #          + blank_before_bottom + bottom border = 5 rows
+            #   tight: top border + bottom border = 2 rows (drop all blanks)
+            #
+            # reserved_below matches the approval-panel budget (~6 rows for
+            # spinner/tool-progress + status + input + separators + prompt).
+            term_rows = shutil.get_terminal_size((100, 24)).lines
+            chrome_full = 5
+            chrome_tight = 2
+            reserved_below = 6
+
+            available = max(0, term_rows - reserved_below)
+            mandatory_full = chrome_full + len(choice_wrapped) + len(other_wrapped)
+
+            use_compact_chrome = mandatory_full > available
+            chrome_rows = chrome_tight if use_compact_chrome else chrome_full
+
+            max_question_rows = max(1, available - chrome_rows - len(choice_wrapped) - len(other_wrapped))
+            max_question_rows = min(max_question_rows, 12)  # soft cap on huge terminals
+
+            question_wrapped = _wrap_panel_text(question, inner_text_width)
+            if len(question_wrapped) > max_question_rows:
+                keep = max(1, max_question_rows - 1)
+                question_wrapped = question_wrapped[:keep] + ["… (question truncated)"]
+
            lines = []
            # Box top border
            lines.append(('class:clarify-border', '╭─ '))
            lines.append(('class:clarify-title', 'Hermes needs your input'))
            lines.append(('class:clarify-border', ' ' + ('─' * max(0, box_width - len("Hermes needs your input") - 3)) + '╮\n'))
-            _append_blank_panel_line(lines, 'class:clarify-border', box_width)
+            if not use_compact_chrome:
+                _append_blank_panel_line(lines, 'class:clarify-border', box_width)

-            # Question text
-            for wrapped in _wrap_panel_text(question, inner_text_width):
+            # Question text (bounded)
+            for wrapped in question_wrapped:
                _append_panel_line(lines, 'class:clarify-border', 'class:clarify-question', wrapped, box_width)
-            _append_blank_panel_line(lines, 'class:clarify-border', box_width)
+            if not use_compact_chrome:
+                _append_blank_panel_line(lines, 'class:clarify-border', box_width)

            if cli_ref._clarify_freetext and not choices:
-                guidance = "Type your answer in the prompt below, then press Enter."
-                for wrapped in _wrap_panel_text(guidance, inner_text_width):
+                for wrapped in other_wrapped:
                    _append_panel_line(lines, 'class:clarify-border', 'class:clarify-choice', wrapped, box_width)
-                _append_blank_panel_line(lines, 'class:clarify-border', box_width)
+                if not use_compact_chrome:
+                    _append_blank_panel_line(lines, 'class:clarify-border', box_width)

            if choices:
                # Multiple-choice mode: show selectable options
-                for i, choice in enumerate(choices):
+                for i, wrapped in choice_wrapped:
                    style = 'class:clarify-selected' if i == selected and not cli_ref._clarify_freetext else 'class:clarify-choice'
-                    prefix = '❯ ' if i == selected and not cli_ref._clarify_freetext else '  '
-                    wrapped_lines = _wrap_panel_text(f"{prefix}{choice}", inner_text_width, subsequent_indent="  ")
-                    for wrapped in wrapped_lines:
-                        _append_panel_line(lines, 'class:clarify-border', style, wrapped, box_width)
+                    _append_panel_line(lines, 'class:clarify-border', style, wrapped, box_width)

-                # "Other" option (5th line, only shown when choices exist)
+                # "Other" option (trailing row(s), only shown when choices exist)
                other_idx = len(choices)
                if selected == other_idx and not cli_ref._clarify_freetext:
                    other_style = 'class:clarify-selected'
-                    other_label = '❯ Other (type your answer)'
                elif cli_ref._clarify_freetext:
                    other_style = 'class:clarify-active-other'
-                    other_label = '❯ Other (type below)'
                else:
                    other_style = 'class:clarify-choice'
-                    other_label = '  Other (type your answer)'
-                for wrapped in _wrap_panel_text(other_label, inner_text_width, subsequent_indent="  "):
+                for wrapped in other_wrapped:
                    _append_panel_line(lines, 'class:clarify-border', other_style, wrapped, box_width)

-            _append_blank_panel_line(lines, 'class:clarify-border', box_width)
+            if not use_compact_chrome:
+                _append_blank_panel_line(lines, 'class:clarify-border', box_width)
            lines.append(('class:clarify-border', '╰' + ('─' * box_width) + '╯\n'))
            return lines

@@ -1291,7 +1291,7 @@ class BasePlatformAdapter(ABC):
                path = path[1:-1].strip()
            path = path.lstrip("`\"'").rstrip("`\"',.;:)}]")
            if path:
-                media.append((path, has_voice_tag))
+                media.append((os.path.expanduser(path), has_voice_tag))

        # Remove MEDIA tags from content (including surrounding quote/backtick wrappers)
        if media:
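The change above runs extracted media paths through `os.path.expanduser`, so a path such as `~/clip.ogg` resolves against the user's home directory before any later file handling. A small sketch of the effect, assuming a hypothetical wrapper name `normalize_media_path` (the adapter calls `os.path.expanduser` inline):

```python
import os


def normalize_media_path(path: str) -> str:
    """Expand a leading ``~`` to the home directory; other paths pass through."""
    return os.path.expanduser(path)
```

Absolute paths and paths without a leading `~` are returned unchanged, so the call is safe to apply to every extracted path.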
@@ -1579,7 +1579,7 @@ class BasePlatformAdapter(ABC):
            # session lifecycle and its cleanup races with the running task
            # (see PR #4926).
            cmd = event.get_command()
-            if cmd in ("approve", "deny", "status", "stop", "new", "reset", "background", "restart"):
+            if cmd in ("approve", "deny", "status", "stop", "new", "reset", "background", "restart", "queue", "q"):
                logger.debug(
                    "[%s] Command '/%s' bypassing active-session guard for %s",
                    self.name, cmd, session_key,
@@ -235,6 +235,7 @@ class VoiceReceiver:
        # Calculate dynamic RTP header size (RFC 9335 / rtpsize mode)
        cc = first_byte & 0x0F  # CSRC count
        has_extension = bool(first_byte & 0x10)  # extension bit
+        has_padding = bool(first_byte & 0x20)  # padding bit (RFC 3550 §5.1)
        header_size = 12 + (4 * cc) + (4 if has_extension else 0)

        if len(data) < header_size + 4:  # need at least header + nonce
@@ -278,6 +279,31 @@ class VoiceReceiver:
        if ext_data_len and len(decrypted) > ext_data_len:
            decrypted = decrypted[ext_data_len:]

+        # --- Strip RTP padding (RFC 3550 §5.1) ---
+        # When the P bit is set, the last payload byte holds the count of
+        # trailing padding bytes (including itself) that must be removed
+        # before further processing. Skipping this passes padding-contaminated
+        # bytes into DAVE/Opus and corrupts inbound audio.
+        if has_padding:
+            if not decrypted:
+                if self._packet_debug_count <= 10:
+                    logger.warning(
+                        "RTP padding bit set but no payload (ssrc=%d)", ssrc,
+                    )
+                return
+            pad_len = decrypted[-1]
+            if pad_len == 0 or pad_len > len(decrypted):
+                if self._packet_debug_count <= 10:
+                    logger.warning(
+                        "Invalid RTP padding length %d for payload size %d (ssrc=%d)",
+                        pad_len, len(decrypted), ssrc,
+                    )
+                return
+            decrypted = decrypted[:-pad_len]
+            if not decrypted:
+                # Padding consumed entire payload — nothing to decode
+                return
+
        # --- DAVE E2EE decrypt ---
        if self._dave_session:
            with self._lock:
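Review note: the padding rule added in the hunk above can be checked in isolation. The following is a minimal sketch, not the `VoiceReceiver` code itself; the helper name `strip_rtp_padding` and its return convention (`None` means "drop the packet") are assumptions made for illustration.

```python
from typing import Optional


def strip_rtp_padding(first_byte: int, payload: bytes) -> Optional[bytes]:
    """Return the payload without RTP padding, or None if the packet should be dropped.

    Hypothetical helper mirroring the rule in RFC 3550 section 5.1: when the
    P bit (0x20) of the first header byte is set, the last payload byte is the
    number of trailing padding bytes, counting itself.
    """
    if not first_byte & 0x20:
        # P bit clear: nothing to strip.
        return payload
    if not payload:
        # P bit set but there is no byte to read the padding length from.
        return None
    pad_len = payload[-1]
    if pad_len == 0 or pad_len > len(payload):
        # A zero or oversized count cannot be valid padding.
        return None
    stripped = payload[:-pad_len]
    # Padding that consumes the whole payload leaves nothing to decode.
    return stripped or None


if __name__ == "__main__":
    # 0xA0 = version 2 (0x80) with the padding bit (0x20) set.
    print(strip_rtp_padding(0xA0, b"abc\x00\x00\x03"))
```

Applying this after decryption and header-extension stripping, as the hunk does, matters because the padding count sits at the end of the decrypted payload rather than in the RTP header.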
@@ -6889,7 +6889,7 @@ class GatewayRunner:
            except Exception as exc:
                return f"✗ Failed to upload debug report: {exc}"

-            # Schedule auto-deletion after 1 hour
+            # Schedule auto-deletion after 6 hours
            _schedule_auto_delete(list(urls.values()))

            lines = [_GATEWAY_PRIVACY_NOTICE, "", "**Debug report uploaded:**", ""]
@@ -6898,7 +6898,7 @@ class GatewayRunner:
                lines.append(f"`{label:<{label_width}}`  {url}")

            lines.append("")
-            lines.append("⏱ Pastes will auto-delete in 1 hour.")
+            lines.append("⏱ Pastes will auto-delete in 6 hours.")
            lines.append("For full log uploads, use `hermes debug share` from the CLI.")
            lines.append("Share these links with the Hermes team for support.")
            return "\n".join(lines)
@@ -7982,12 +7982,15 @@ class GatewayRunner:
                if _adapter:
                    _adapter_supports_edit = getattr(_adapter, "SUPPORTS_MESSAGE_EDITING", True)
                    _effective_cursor = _scfg.cursor if _adapter_supports_edit else ""
+                    _buffer_only = False
                    if source.platform == Platform.MATRIX:
                        _effective_cursor = ""
+                        _buffer_only = True
                    _consumer_cfg = StreamConsumerConfig(
                        edit_interval=_scfg.edit_interval,
                        buffer_threshold=_scfg.buffer_threshold,
                        cursor=_effective_cursor,
+                        buffer_only=_buffer_only,
                    )
                    _stream_consumer = GatewayStreamConsumer(
                        adapter=_adapter,
@@ -8553,12 +8556,15 @@ class GatewayRunner:
                        # Some Matrix clients render the streaming cursor
                        # as a visible tofu/white-box artifact.  Keep
                        # streaming text on Matrix, but suppress the cursor.
+                        _buffer_only = False
                        if source.platform == Platform.MATRIX:
                            _effective_cursor = ""
+                            _buffer_only = True
                        _consumer_cfg = StreamConsumerConfig(
                            edit_interval=_scfg.edit_interval,
                            buffer_threshold=_scfg.buffer_threshold,
                            cursor=_effective_cursor,
+                            buffer_only=_buffer_only,
                        )
                        _stream_consumer = GatewayStreamConsumer(
                            adapter=_adapter,
@@ -43,6 +43,7 @@ class StreamConsumerConfig:
    edit_interval: float = 1.0
    buffer_threshold: int = 40
    cursor: str = " ▉"
+    buffer_only: bool = False


 class GatewayStreamConsumer:
@@ -295,10 +296,13 @@ class GatewayStreamConsumer:
                    got_done
                    or got_segment_break
                    or commentary_text is not None
-                    or (elapsed >= self._current_edit_interval
-                        and self._accumulated)
-                    or len(self._accumulated) >= self.cfg.buffer_threshold
                )
+                if not self.cfg.buffer_only:
+                    should_edit = should_edit or (
+                        (elapsed >= self._current_edit_interval
+                            and self._accumulated)
+                        or len(self._accumulated) >= self.cfg.buffer_threshold
+                    )

                current_update_visible = False
                if should_edit and self._accumulated:
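Review note: the effect of `buffer_only` on the flush decision can be sketched as a pure function. This is an illustration under assumptions, not the `GatewayStreamConsumer` code: the function name `should_flush` and its keyword parameters are hypothetical stand-ins for the instance state used in the hunk above.

```python
def should_flush(
    *,
    got_done: bool = False,
    got_segment_break: bool = False,
    has_commentary: bool = False,
    elapsed: float = 0.0,
    edit_interval: float = 1.0,
    accumulated: str = "",
    buffer_threshold: int = 40,
    buffer_only: bool = False,
) -> bool:
    """Decide whether buffered stream text should be sent or edited now."""
    # Hard boundaries always flush, regardless of buffer_only.
    flush = got_done or got_segment_break or has_commentary
    if not buffer_only:
        # Progressive edits: flush on the timer or when the buffer is large enough.
        flush = (
            flush
            or (elapsed >= edit_interval and bool(accumulated))
            or len(accumulated) >= buffer_threshold
        )
    return bool(flush)


if __name__ == "__main__":
    # With buffer_only set, an expired timer alone does not trigger an edit.
    print(should_flush(elapsed=5.0, accumulated="hello", buffer_only=True))
    print(should_flush(elapsed=5.0, accumulated="hello", buffer_only=False))
```

With `buffer_only=True` (set for Matrix in the gateway hunks), text is only emitted at done, segment-break, or commentary boundaries, so intermediate timer and threshold edits are skipped.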
@@ -78,6 +78,10 @@ QWEN_OAUTH_CLIENT_ID = "f0304373b74a44d2b584a3fb70ca9e56"
 QWEN_OAUTH_TOKEN_URL = "https://chat.qwen.ai/api/v1/oauth2/token"
 QWEN_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120

+# Google Gemini OAuth (google-gemini-cli provider, Cloud Code Assist backend)
+DEFAULT_GEMINI_CLOUDCODE_BASE_URL = "cloudcode-pa://google"
+GEMINI_OAUTH_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 60  # refresh 60s before expiry
+

 # =============================================================================
 # Provider Registry
@@ -122,6 +126,12 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
        auth_type="oauth_external",
        inference_base_url=DEFAULT_QWEN_BASE_URL,
    ),
+    "google-gemini-cli": ProviderConfig(
+        id="google-gemini-cli",
+        name="Google Gemini (OAuth)",
+        auth_type="oauth_external",
+        inference_base_url=DEFAULT_GEMINI_CLOUDCODE_BASE_URL,
+    ),
    "copilot": ProviderConfig(
        id="copilot",
        name="GitHub Copilot",
@@ -939,7 +949,7 @@ def resolve_provider(
        "github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
        "aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
        "opencode": "opencode-zen", "zen": "opencode-zen",
-        "qwen-portal": "qwen-oauth", "qwen-cli": "qwen-oauth", "qwen-oauth": "qwen-oauth",
+        "qwen-portal": "qwen-oauth", "qwen-cli": "qwen-oauth", "qwen-oauth": "qwen-oauth", "google-gemini-cli": "google-gemini-cli", "gemini-cli": "google-gemini-cli", "gemini-oauth": "google-gemini-cli",
        "hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
        "mimo": "xiaomi", "xiaomi-mimo": "xiaomi",
        "aws": "bedrock", "aws-bedrock": "bedrock", "amazon-bedrock": "bedrock", "amazon": "bedrock",
@@ -1251,6 +1261,83 @@ def get_qwen_auth_status() -> Dict[str, Any]:
        }


+# =============================================================================
+# Google Gemini OAuth (google-gemini-cli) — PKCE flow + Cloud Code Assist.
+#
+# Tokens live in ~/.hermes/auth/google_oauth.json (managed by agent.google_oauth).
+# The `base_url` here is the marker "cloudcode-pa://google" that run_agent.py
+# uses to construct a GeminiCloudCodeClient instead of the default OpenAI SDK.
+# Actual HTTP traffic goes to https://cloudcode-pa.googleapis.com/v1internal:*.
+# =============================================================================
+
+def resolve_gemini_oauth_runtime_credentials(
+    *,
+    force_refresh: bool = False,
+) -> Dict[str, Any]:
+    """Resolve runtime OAuth creds for google-gemini-cli."""
+    try:
+        from agent.google_oauth import (
+            GoogleOAuthError,
+            _credentials_path,
+            get_valid_access_token,
+            load_credentials,
+        )
+    except ImportError as exc:
+        raise AuthError(
+            f"agent.google_oauth is not importable: {exc}",
+            provider="google-gemini-cli",
+            code="google_oauth_module_missing",
+        ) from exc
+
+    try:
+        access_token = get_valid_access_token(force_refresh=force_refresh)
+    except GoogleOAuthError as exc:
+        raise AuthError(
+            str(exc),
+            provider="google-gemini-cli",
+            code=exc.code,
+        ) from exc
+
+    creds = load_credentials()
+    base_url = DEFAULT_GEMINI_CLOUDCODE_BASE_URL
+    return {
+        "provider": "google-gemini-cli",
+        "base_url": base_url,
+        "api_key": access_token,
+        "source": "google-oauth",
+        "expires_at_ms": (creds.expires_ms if creds else None),
+        "auth_file": str(_credentials_path()),
+        "email": (creds.email if creds else "") or "",
+        "project_id": (creds.project_id if creds else "") or "",
+    }
+
+
+def get_gemini_oauth_auth_status() -> Dict[str, Any]:
+    """Return a status dict for `hermes auth list` / `hermes status`."""
+    try:
+        from agent.google_oauth import _credentials_path, load_credentials
+    except ImportError:
+        return {"logged_in": False, "error": "agent.google_oauth unavailable"}
+    auth_path = _credentials_path()
+    creds = load_credentials()
+    if creds is None or not creds.access_token:
+        return {
+            "logged_in": False,
+            "auth_file": str(auth_path),
+            "error": "not logged in",
+        }
+    return {
+        "logged_in": True,
+        "auth_file": str(auth_path),
+        "source": "google-oauth",
+        "api_key": creds.access_token,
+        "expires_at_ms": creds.expires_ms,
+        "email": creds.email,
+        "project_id": creds.project_id,
+    }
+
+
+
 # =============================================================================
 # SSH / remote session detection
 # =============================================================================
@@ -2469,6 +2556,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
        return get_codex_auth_status()
    if target == "qwen-oauth":
        return get_qwen_auth_status()
+    if target == "google-gemini-cli":
+        return get_gemini_oauth_auth_status()
    if target == "copilot-acp":
        return get_external_process_provider_status(target)
    # API-key providers
@@ -33,7 +33,7 @@ from hermes_constants import OPENROUTER_BASE_URL


 # Providers that support OAuth login in addition to API keys.
-_OAUTH_CAPABLE_PROVIDERS = {"anthropic", "nous", "openai-codex", "qwen-oauth"}
+_OAUTH_CAPABLE_PROVIDERS = {"anthropic", "nous", "openai-codex", "qwen-oauth", "google-gemini-cli"}


 def _get_custom_provider_names() -> list:
@@ -148,7 +148,7 @@ def auth_add_command(args) -> None:
        if provider.startswith(CUSTOM_POOL_PREFIX):
            requested_type = AUTH_TYPE_API_KEY
        else:
-            requested_type = AUTH_TYPE_OAUTH if provider in {"anthropic", "nous", "openai-codex", "qwen-oauth"} else AUTH_TYPE_API_KEY
+            requested_type = AUTH_TYPE_OAUTH if provider in {"anthropic", "nous", "openai-codex", "qwen-oauth", "google-gemini-cli"} else AUTH_TYPE_API_KEY

    pool = load_pool(provider)

@@ -254,6 +254,27 @@ def auth_add_command(args) -> None:
        print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
        return

+    if provider == "google-gemini-cli":
+        from agent.google_oauth import run_gemini_oauth_login_pure
+
+        creds = run_gemini_oauth_login_pure()
+        label = (getattr(args, "label", None) or "").strip() or (
+            creds.get("email") or _oauth_default_label(provider, len(pool.entries()) + 1)
+        )
+        entry = PooledCredential(
+            provider=provider,
+            id=uuid.uuid4().hex[:6],
+            label=label,
+            auth_type=AUTH_TYPE_OAUTH,
+            priority=0,
+            source=f"{SOURCE_MANUAL}:google_pkce",
+            access_token=creds["access_token"],
+            refresh_token=creds.get("refresh_token"),
+        )
+        pool.add_entry(entry)
+        print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
+        return
+
    if provider == "qwen-oauth":
        creds = auth_mod.resolve_qwen_runtime_credentials(refresh_if_expiring=False)
        label = (getattr(args, "label", None) or "").strip() or label_from_token(
@@ -102,6 +102,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
    CommandDef("model", "Switch model for this session", "Configuration", args_hint="[model] [--global]"),
    CommandDef("provider", "Show available providers and current provider",
               "Configuration"),
+    CommandDef("gquota", "Show Google Gemini Code Assist quota usage", "Info"),

    CommandDef("personality", "Set a predefined personality", "Configuration",
               args_hint="[name]"),
@@ -1002,6 +1002,30 @@ OPTIONAL_ENV_VARS = {
        "category": "provider",
        "advanced": True,
    },
+    "HERMES_GEMINI_CLIENT_ID": {
+        "description": "Google OAuth client ID for google-gemini-cli (optional; defaults to Google's public gemini-cli client)",
+        "prompt": "Google OAuth client ID (optional — leave empty to use the public default)",
+        "url": "https://console.cloud.google.com/apis/credentials",
+        "password": False,
+        "category": "provider",
+        "advanced": True,
+    },
+    "HERMES_GEMINI_CLIENT_SECRET": {
+        "description": "Google OAuth client secret for google-gemini-cli (optional)",
+        "prompt": "Google OAuth client secret (optional)",
+        "url": "https://console.cloud.google.com/apis/credentials",
+        "password": True,
+        "category": "provider",
+        "advanced": True,
+    },
+    "HERMES_GEMINI_PROJECT_ID": {
+        "description": "GCP project ID for paid Gemini tiers (free tier auto-provisions)",
+        "prompt": "GCP project ID for Gemini OAuth (leave empty for free tier)",
+        "url": None,
+        "password": False,
+        "category": "provider",
+        "advanced": True,
+    },
    "OPENCODE_ZEN_API_KEY": {
        "description": "OpenCode Zen API key (pay-as-you-go access to curated models)",
        "prompt": "OpenCode Zen API key",
@@ -27,8 +27,8 @@ _DPASTE_COM_URL = "https://dpaste.com/api/"
 # paste.rs caps at ~1 MB; we stay under that with headroom.
 _MAX_LOG_BYTES = 512_000

-# Auto-delete pastes after this many seconds (1 hour).
-_AUTO_DELETE_SECONDS = 3600
+# Auto-delete pastes after this many seconds (6 hours).
+_AUTO_DELETE_SECONDS = 21600


 # ---------------------------------------------------------------------------
@@ -44,7 +44,7 @@ _PRIVACY_NOTICE = """\
  • Full agent.log and gateway.log (up to 512 KB each — likely contains
    conversation content, tool outputs, and file paths)

-Pastes auto-delete after 1 hour.
+Pastes auto-delete after 6 hours.
 """

 _GATEWAY_PRIVACY_NOTICE = (
@@ -52,7 +52,7 @@ _GATEWAY_PRIVACY_NOTICE = (
    "(may contain conversation fragments) to a public paste service. "
    "Full logs are NOT included from the gateway — use `hermes debug share` "
    "from the CLI for full log uploads.\n"
-    "Pastes auto-delete after 1 hour."
+    "Pastes auto-delete after 6 hours."
 )


@@ -422,9 +422,9 @@ def run_debug_share(args):
    if failures:
        print(f"\n  (failed to upload: {', '.join(failures)})")

-    # Schedule auto-deletion after 1 hour
+    # Schedule auto-deletion after 6 hours
    _schedule_auto_delete(list(urls.values()))
-    print(f"\n⏱  Pastes will auto-delete in 1 hour.")
+    print(f"\n⏱  Pastes will auto-delete in 6 hours.")

    # Manual delete fallback
    print(f"To delete now:  hermes debug delete <url>")
@@ -373,7 +373,11 @@ def run_doctor(args):
    print(color("◆ Auth Providers", Colors.CYAN, Colors.BOLD))

    try:
-        from hermes_cli.auth import get_nous_auth_status, get_codex_auth_status
+        from hermes_cli.auth import (
+            get_nous_auth_status,
+            get_codex_auth_status,
+            get_gemini_oauth_auth_status,
+        )

        nous_status = get_nous_auth_status()
        if nous_status.get("logged_in"):
@@ -388,6 +392,20 @@ def run_doctor(args):
            check_warn("OpenAI Codex auth", "(not logged in)")
            if codex_status.get("error"):
                check_info(codex_status["error"])
+
+        gemini_status = get_gemini_oauth_auth_status()
+        if gemini_status.get("logged_in"):
+            email = gemini_status.get("email") or ""
+            project = gemini_status.get("project_id") or ""
+            pieces = []
+            if email:
+                pieces.append(email)
+            if project:
+                pieces.append(f"project={project}")
+            suffix = f" ({', '.join(pieces)})" if pieces else ""
+            check_ok("Google Gemini OAuth", f"(logged in{suffix})")
+        else:
+            check_warn("Google Gemini OAuth", "(not logged in)")
    except Exception as e:
        check_warn("Auth provider status", f"(could not check: {e})")

@@ -1118,6 +1118,8 @@ def select_provider_and_model(args=None):
        _model_flow_openai_codex(config, current_model)
    elif selected_provider == "qwen-oauth":
        _model_flow_qwen_oauth(config, current_model)
+    elif selected_provider == "google-gemini-cli":
+        _model_flow_google_gemini_cli(config, current_model)
    elif selected_provider == "copilot-acp":
        _model_flow_copilot_acp(config, current_model)
    elif selected_provider == "copilot":
@@ -1520,6 +1522,76 @@ def _model_flow_qwen_oauth(_config, current_model=""):
        print("No change.")


+def _model_flow_google_gemini_cli(_config, current_model=""):
+    """Google Gemini OAuth (PKCE) via Cloud Code Assist — supports free AND paid tiers.
+
+    Flow:
+      1. Show upfront warning about Google's ToS stance (per opencode-gemini-auth).
+      2. If creds missing, run PKCE browser OAuth via agent.google_oauth.
+      3. Resolve project context (env -> config -> auto-discover -> free tier).
+      4. Prompt user to pick a model.
+      5. Save to ~/.hermes/config.yaml.
+    """
+    from hermes_cli.auth import (
+        DEFAULT_GEMINI_CLOUDCODE_BASE_URL,
+        get_gemini_oauth_auth_status,
+        resolve_gemini_oauth_runtime_credentials,
+        _prompt_model_selection,
+        _save_model_choice,
+        _update_config_for_provider,
+    )
+    from hermes_cli.models import _PROVIDER_MODELS
+
+    print()
+    print("⚠  Google considers using the Gemini CLI OAuth client with third-party")
+    print("   software a policy violation. Some users have reported account")
+    print("   restrictions. You can use your own API key via the 'gemini' provider")
+    print("   for the lowest-risk experience.")
+    print()
+    try:
+        proceed = input("Continue with OAuth login? [y/N]: ").strip().lower()
+    except (EOFError, KeyboardInterrupt):
+        print("Cancelled.")
+        return
+    if proceed not in {"y", "yes"}:
+        print("Cancelled.")
+        return
+
+    status = get_gemini_oauth_auth_status()
+    if not status.get("logged_in"):
+        try:
+            from agent.google_oauth import resolve_project_id_from_env, start_oauth_flow
+
+            env_project = resolve_project_id_from_env()
+            start_oauth_flow(force_relogin=True, project_id=env_project)
+        except Exception as exc:
+            print(f"OAuth login failed: {exc}")
+            return
+
+    # Verify creds resolve + trigger project discovery
+    try:
+        creds = resolve_gemini_oauth_runtime_credentials(force_refresh=False)
+        project_id = creds.get("project_id", "")
+        if project_id:
+            print(f"  Using GCP project: {project_id}")
+        else:
+            print("  No GCP project configured — free tier will be auto-provisioned on first request.")
+    except Exception as exc:
+        print(f"Failed to resolve Gemini credentials: {exc}")
+        return
+
+    models = list(_PROVIDER_MODELS.get("google-gemini-cli") or [])
+    default = current_model or (models[0] if models else "gemini-2.5-flash")
+    selected = _prompt_model_selection(models, current_model=default)
+    if selected:
+        _save_model_choice(selected)
+        _update_config_for_provider("google-gemini-cli", DEFAULT_GEMINI_CLOUDCODE_BASE_URL)
+        print(f"Default model set to: {selected} (via Google Gemini OAuth / Code Assist)")
+    else:
+        print("No change.")
+
+
+

 def _model_flow_custom(config):
    """Custom endpoint: collect URL, API key, and model name.
@@ -5856,6 +5928,13 @@ Examples:
    sessions_export.add_argument("output", help="Output JSONL file path (use - for stdout)")
    sessions_export.add_argument("--source", help="Filter by source")
    sessions_export.add_argument("--session-id", help="Export a specific session")
+    sessions_export.add_argument(
+        "--sanitize",
+        action="store_true",
+        help="Redact user/model content (message text, reasoning, tool args/output, titles, "
+             "system prompt) before export. Structure and metrics are preserved. "
+             "Use when sharing exports for bug reports or training data.",
+    )

    sessions_delete = sessions_subparsers.add_parser("delete", help="Delete a specific session")
    sessions_delete.add_argument("session_id", help="Session ID to delete")
@@ -5925,6 +6004,19 @@ Examples:
                    print(f"{preview:<50} {last_active:<13} {s['source']:<6} {sid}")

        elif action == "export":
+            sanitize = getattr(args, "sanitize", False)
+            if sanitize:
+                try:
+                    from hermes_state import sanitize_session_export as _sanitize_fn
+                except Exception:
+                    _sanitize_fn = None
+                    print("Warning: sanitize_session_export unavailable — exporting raw data.")
+            else:
+                _sanitize_fn = None
+
+            def _maybe_sanitize(d):
+                return _sanitize_fn(d) if _sanitize_fn else d
+
            if args.session_id:
                resolved_session_id = db.resolve_session_id(args.session_id)
                if not resolved_session_id:
@@ -5934,6 +6026,7 @@ Examples:
                if not data:
                    print(f"Session '{args.session_id}' not found.")
                    return
+                data = _maybe_sanitize(data)
                line = _json.dumps(data, ensure_ascii=False) + "\n"
                if args.output == "-":
                    import sys
@@ -5941,18 +6034,20 @@ Examples:
                else:
                    with open(args.output, "w", encoding="utf-8") as f:
                        f.write(line)
-                    print(f"Exported 1 session to {args.output}")
+                    suffix = " (sanitized)" if sanitize and _sanitize_fn else ""
+                    print(f"Exported 1 session to {args.output}{suffix}")
            else:
                sessions = db.export_all(source=args.source)
                if args.output == "-":
                    import sys
                    for s in sessions:
-                        sys.stdout.write(_json.dumps(s, ensure_ascii=False) + "\n")
+                        sys.stdout.write(_json.dumps(_maybe_sanitize(s), ensure_ascii=False) + "\n")
                else:
                    with open(args.output, "w", encoding="utf-8") as f:
                        for s in sessions:
-                            f.write(_json.dumps(s, ensure_ascii=False) + "\n")
-                    print(f"Exported {len(sessions)} sessions to {args.output}")
+                            f.write(_json.dumps(_maybe_sanitize(s), ensure_ascii=False) + "\n")
+                    suffix = " (sanitized)" if sanitize and _sanitize_fn else ""
+                    print(f"Exported {len(sessions)} sessions to {args.output}{suffix}")

        elif action == "delete":
            resolved_session_id = db.resolve_session_id(args.session_id)
@@ -136,6 +136,11 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "gemma-4-31b-it",
        "gemma-4-26b-it",
    ],
+    "google-gemini-cli": [
+        "gemini-2.5-pro",
+        "gemini-2.5-flash",
+        "gemini-2.5-flash-lite",
+    ],
    "zai": [
        "glm-5.1",
        "glm-5",
@@ -244,6 +249,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "big-pickle",
    ],
    "opencode-go": [
+        "glm-5.1",
        "glm-5",
        "kimi-k2.5",
        "mimo-v2-pro",
@@ -534,6 +540,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("copilot-acp",    "GitHub Copilot ACP",       "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
    ProviderEntry("huggingface",    "Hugging Face",             "Hugging Face Inference Providers (20+ open models)"),
    ProviderEntry("gemini",         "Google AI Studio",         "Google AI Studio (Gemini models — OpenAI-compatible endpoint)"),
+    ProviderEntry("google-gemini-cli", "Google Gemini (OAuth)",   "Google Gemini via OAuth + Code Assist (free tier supported; no API key needed)"),
    ProviderEntry("deepseek",       "DeepSeek",                 "DeepSeek (DeepSeek-V3, R1, coder — direct API)"),
    ProviderEntry("xai",            "xAI",                      "xAI (Grok models — direct API)"),
    ProviderEntry("zai",            "Z.AI / GLM",               "Z.AI / GLM (Zhipu AI direct API)"),
@@ -596,6 +603,8 @@ _PROVIDER_ALIASES = {
    "qwen": "alibaba",
    "alibaba-cloud": "alibaba",
    "qwen-portal": "qwen-oauth",
+    "gemini-cli": "google-gemini-cli",
+    "gemini-oauth": "google-gemini-cli",
    "hf": "huggingface",
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",
@@ -64,6 +64,11 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
        base_url_override="https://portal.qwen.ai/v1",
        base_url_env_var="HERMES_QWEN_BASE_URL",
    ),
+    "google-gemini-cli": HermesOverlay(
+        transport="openai_chat",
+        auth_type="oauth_external",
+        base_url_override="cloudcode-pa://google",
+    ),
    "copilot-acp": HermesOverlay(
        transport="codex_responses",
        auth_type="external_process",
@@ -232,6 +237,11 @@ ALIASES: Dict[str, str] = {
    "qwen": "alibaba",
    "alibaba-cloud": "alibaba",

+    # google-gemini-cli (OAuth + Code Assist)
+    "gemini-cli": "google-gemini-cli",
+    "gemini-oauth": "google-gemini-cli",
+
+
    # huggingface
    "hf": "huggingface",
    "hugging-face": "huggingface",
@@ -22,6 +22,7 @@ from hermes_cli.auth import (
    resolve_nous_runtime_credentials,
    resolve_codex_runtime_credentials,
    resolve_qwen_runtime_credentials,
+    resolve_gemini_oauth_runtime_credentials,
    resolve_api_key_provider_credentials,
    resolve_external_process_provider_credentials,
    has_usable_secret,
@@ -156,6 +157,9 @@ def _resolve_runtime_from_pool_entry(
    elif provider == "qwen-oauth":
        api_mode = "chat_completions"
        base_url = base_url or DEFAULT_QWEN_BASE_URL
+    elif provider == "google-gemini-cli":
+        api_mode = "chat_completions"
+        base_url = base_url or "cloudcode-pa://google"
    elif provider == "anthropic":
        api_mode = "anthropic_messages"
        cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
@@ -804,6 +808,26 @@ def resolve_runtime_provider(
            logger.info("Qwen OAuth credentials failed; "
                        "falling through to next provider.")

+    if provider == "google-gemini-cli":
+        try:
+            creds = resolve_gemini_oauth_runtime_credentials()
+            return {
+                "provider": "google-gemini-cli",
+                "api_mode": "chat_completions",
+                "base_url": creds.get("base_url", ""),
+                "api_key": creds.get("api_key", ""),
+                "source": creds.get("source", "google-oauth"),
+                "expires_at_ms": creds.get("expires_at_ms"),
+                "email": creds.get("email", ""),
+                "project_id": creds.get("project_id", ""),
+                "requested_provider": requested_provider,
+            }
+        except AuthError:
+            if requested_provider != "auto":
+                raise
+            logger.info("Google Gemini OAuth credentials failed; "
+                        "falling through to next provider.")
+
    if provider == "copilot-acp":
        creds = resolve_external_process_provider_credentials(provider)
        return {
@@ -102,7 +102,7 @@ _DEFAULT_PROVIDER_MODELS = {
    "ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
    "kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
    "opencode-zen": ["gpt-5.4", "gpt-5.3-codex", "claude-sonnet-4-6", "gemini-3-flash", "glm-5", "kimi-k2.5", "minimax-m2.7"],
-    "opencode-go": ["glm-5", "kimi-k2.5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.5", "minimax-m2.7"],
+    "opencode-go": ["glm-5.1", "glm-5", "kimi-k2.5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.5", "minimax-m2.7"],
    "huggingface": [
        "Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
        "Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
@@ -430,6 +430,8 @@ def _print_setup_summary(config: dict, hermes_home):
        tool_status.append(("Text-to-Speech (MiniMax)", True, None))
    elif tts_provider == "mistral" and get_env_value("MISTRAL_API_KEY"):
        tool_status.append(("Text-to-Speech (Mistral Voxtral)", True, None))
+    elif tts_provider == "gemini" and (get_env_value("GEMINI_API_KEY") or get_env_value("GOOGLE_API_KEY")):
+        tool_status.append(("Text-to-Speech (Google Gemini)", True, None))
    elif tts_provider == "neutts":
        try:
            import importlib.util
@@ -913,6 +915,7 @@ def _setup_tts_provider(config: dict):
        "xai": "xAI TTS",
        "minimax": "MiniMax TTS",
        "mistral": "Mistral Voxtral TTS",
+        "gemini": "Google Gemini TTS",
        "neutts": "NeuTTS",
    }
    current_label = provider_labels.get(current_provider, current_provider)
@@ -935,10 +938,11 @@ def _setup_tts_provider(config: dict):
            "xAI TTS (Grok voices, needs API key)",
            "MiniMax TTS (high quality with voice cloning, needs API key)",
            "Mistral Voxtral TTS (multilingual, native Opus, needs API key)",
+            "Google Gemini TTS (30 prebuilt voices, prompt-controllable, needs API key)",
            "NeuTTS (local on-device, free, ~300MB model download)",
        ]
    )
-    providers.extend(["edge", "elevenlabs", "openai", "xai", "minimax", "mistral", "neutts"])
+    providers.extend(["edge", "elevenlabs", "openai", "xai", "minimax", "mistral", "gemini", "neutts"])
    choices.append(f"Keep current ({current_label})")
    keep_current_idx = len(choices) - 1
    idx = prompt_choice("Select TTS provider:", choices, keep_current_idx)
@@ -1045,6 +1049,19 @@ def _setup_tts_provider(config: dict):
                print_warning("No API key provided. Falling back to Edge TTS.")
                selected = "edge"

+    elif selected == "gemini":
+        existing = get_env_value("GEMINI_API_KEY") or get_env_value("GOOGLE_API_KEY")
+        if not existing:
+            print()
+            print_info("Get a free API key at https://aistudio.google.com/app/apikey")
+            api_key = prompt("Gemini API key for TTS", password=True)
+            if api_key:
+                save_env_value("GEMINI_API_KEY", api_key)
+                print_success("Gemini TTS API key saved")
+            else:
+                print_warning("No API key provided. Falling back to Edge TTS.")
+                selected = "edge"
+
    # Save the selection
    if "tts" not in config:
        config["tts"] = {}
@@ -172,6 +172,15 @@ TOOL_CATEGORIES = {
                ],
                "tts_provider": "mistral",
            },
+            {
+                "name": "Google Gemini TTS",
+                "badge": "preview",
+                "tag": "30 prebuilt voices, controllable via prompts",
+                "env_vars": [
+                    {"key": "GEMINI_API_KEY", "prompt": "Gemini API key", "url": "https://aistudio.google.com/app/apikey"},
+                ],
+                "tts_provider": "gemini",
+            },
        ],
    },
    "web": {
@@ -467,6 +467,7 @@ async def get_status():
        "latest_config_version": latest_ver,
        "gateway_running": gateway_running,
        "gateway_pid": gateway_pid,
+        "gateway_health_url": _GATEWAY_HEALTH_URL,
        "gateway_state": gateway_state,
        "gateway_platforms": gateway_platforms,
        "gateway_exit_reason": gateway_exit_reason,
@@ -1160,6 +1160,23 @@ class SessionDB:
            results.append({**session, "messages": messages})
        return results

+    # ---------------------------------------------------------------
+    # Export sanitization
+    # ---------------------------------------------------------------
+    #
+    # When users share session exports for debugging or training, the
+    # raw JSON contains every user message, tool output, and reasoning
+    # trace — which often includes file contents, command output, env
+    # variables, paths, and other confidential information.
+    #
+    # ``sanitize_session_export`` produces a sanitized copy of the export
+    # with all content fields replaced by opaque ``[redacted:<kind>:<id>]``
+    # tokens. Structural metadata (IDs, roles, timestamps, token counts,
+    # tool names, finish reasons, model info, cost data) is preserved
+    # so that the shape of a conversation is still analysable.
+    #
+    # Inspired by anomalyco/opencode#22489 (opencode's ``export --sanitize``).
+
    def clear_messages(self, session_id: str) -> None:
        """Delete all messages for a session and reset its counters."""
        def _do(conn):
@@ -1236,3 +1253,136 @@ class SessionDB:
            return len(session_ids)

        return self._execute_write(_do)
+
+
+# =========================================================================
+# Session export sanitization
+# =========================================================================
+#
+# Ported from anomalyco/opencode#22489 — users often want to share a
+# session export for bug reports, feature requests, or training data
+# collection, but the raw export contains every user prompt, tool
+# output, file content, and reasoning trace. ``sanitize_session_export``
+# replaces content fields with opaque tokens while preserving the
+# conversation's structure and metrics.
+
+# Message-level string fields that are always redacted.
+_REDACT_MSG_STRING_FIELDS = (
+    "content",
+    "reasoning",
+)
+
+# Session-level fields that can contain user-facing text.
+_REDACT_SESSION_STRING_FIELDS = (
+    "system_prompt",
+    "title",
+)
+
+
+def _redact_token(kind: str, id_: Any, value: Any) -> Any:
+    """Produce an opaque redaction token. Preserves empty/None values."""
+    if value in (None, "", b""):
+        return value
+    return f"[redacted:{kind}:{id_}]"
+
+
+def _redact_tool_call(call: Any, msg_id: Any, index: int) -> Any:
+    """Redact arguments inside a tool_call while preserving structure (id, name)."""
+    if not isinstance(call, dict):
+        return call
+    out = dict(call)
+    tcid = out.get("id") or f"{msg_id}-{index}"
+    fn = out.get("function")
+    if isinstance(fn, dict):
+        new_fn = dict(fn)
+        if "arguments" in new_fn and new_fn["arguments"] not in (None, "", "{}"):
+            new_fn["arguments"] = _redact_token("tool-input", tcid, new_fn["arguments"])
+        out["function"] = new_fn
+    # Some schemas put args at the top level rather than under ``function``.
+    if "arguments" in out and out["arguments"] not in (None, "", "{}"):
+        out["arguments"] = _redact_token("tool-input", tcid, out["arguments"])
+    return out
+
+
+def _redact_reasoning_details(details: Any, msg_id: Any) -> Any:
+    """Redact text inside OpenAI / Anthropic reasoning_details blocks.
+
+    ``reasoning_details`` is a list of dicts with shapes like::
+
+        {"type": "reasoning.text", "text": "..."}
+        {"type": "reasoning.encrypted", "data": "..."}
+        {"type": "reasoning.summary", "summary": "..."}
+
+    We preserve the block type/structure and redact the inner payload.
+    """
+    if not isinstance(details, list):
+        return details
+    out = []
+    for idx, block in enumerate(details):
+        if not isinstance(block, dict):
+            out.append(block)
+            continue
+        new_block = dict(block)
+        for key in ("text", "data", "summary", "content"):
+            if key in new_block and new_block[key] not in (None, ""):
+                new_block[key] = _redact_token(f"reasoning-{key}", f"{msg_id}-{idx}", new_block[key])
+        out.append(new_block)
+    return out
+
+
+def _redact_message(msg: Dict[str, Any]) -> Dict[str, Any]:
+    """Return a sanitized copy of a single message row."""
+    if not isinstance(msg, dict):
+        return msg
+    msg_id = msg.get("id", "msg")
+    out = dict(msg)
+
+    # Plain string content fields.
+    for field in _REDACT_MSG_STRING_FIELDS:
+        if field in out and out[field] not in (None, ""):
+            out[field] = _redact_token(field.replace("_", "-"), msg_id, out[field])
+
+    # Tool calls: keep structure (id, name) but redact arguments.
+    tcs = out.get("tool_calls")
+    if isinstance(tcs, list):
+        out["tool_calls"] = [_redact_tool_call(tc, msg_id, i) for i, tc in enumerate(tcs)]
+
+    # Reasoning details: preserve block structure, redact text/data.
+    if "reasoning_details" in out:
+        out["reasoning_details"] = _redact_reasoning_details(out["reasoning_details"], msg_id)
+
+    # Codex reasoning items follow the same shape as reasoning_details.
+    if "codex_reasoning_items" in out:
+        out["codex_reasoning_items"] = _redact_reasoning_details(out["codex_reasoning_items"], msg_id)
+
+    return out
+
+
+def sanitize_session_export(session: Dict[str, Any]) -> Dict[str, Any]:
+    """Return a deep-sanitized copy of a session export.
+
+    All user-facing content (message text, reasoning, tool arguments and
+    outputs, system prompt, title) is replaced by ``[redacted:<kind>:<id>]``
+    tokens. Structural metadata (ids, timestamps, token counts, tool names,
+    model/provider info, cost data, finish reasons) is preserved so the
+    export remains useful for debugging schema issues, analysing tool-use
+    patterns, or counting sessions without leaking confidential data.
+
+    The input dict is not mutated. Fields that are not redacted are shared
+    by reference with the input rather than deep-copied.
+    """
+    if not isinstance(session, dict):
+        return session
+    sid = session.get("id", "session")
+    out = dict(session)
+
+    # Session-level text fields (title, system prompt).
+    for field in _REDACT_SESSION_STRING_FIELDS:
+        if field in out and out[field] not in (None, ""):
+            out[field] = _redact_token(field.replace("_", "-"), sid, out[field])
+
+    # Messages list: sanitize each row.
+    msgs = out.get("messages")
+    if isinstance(msgs, list):
+        out["messages"] = [_redact_message(m) for m in msgs]
+
+    return out
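The redaction scheme in this hunk can be tried outside the repository with a simplified, standalone sketch. The names `redact_token`, `redact_message`, and `sanitize` below are illustrative stand-ins for the private helpers in the diff, not the module's actual API, and the sample session is invented. Handling of `reasoning_details` is left out for brevity.

```python
import copy
from typing import Any, Dict


def redact_token(kind: str, id_: Any, value: Any) -> Any:
    """Return an opaque token, leaving empty/None values untouched."""
    if value in (None, "", b""):
        return value
    return f"[redacted:{kind}:{id_}]"


def redact_message(msg: Dict[str, Any]) -> Dict[str, Any]:
    """Copy one message row, redacting text fields and tool-call arguments."""
    out = dict(msg)
    msg_id = out.get("id", "msg")
    for field in ("content", "reasoning"):
        if out.get(field) not in (None, ""):
            out[field] = redact_token(field, msg_id, out[field])
    calls = out.get("tool_calls")
    if isinstance(calls, list):
        new_calls = []
        for i, call in enumerate(calls):
            call = dict(call)
            fn = dict(call.get("function") or {})
            tcid = call.get("id") or f"{msg_id}-{i}"
            if fn.get("arguments") not in (None, "", "{}"):
                fn["arguments"] = redact_token("tool-input", tcid, fn["arguments"])
            call["function"] = fn
            new_calls.append(call)
        out["tool_calls"] = new_calls
    return out


def sanitize(session: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a session, redacting session-level text and every message."""
    out = dict(session)
    sid = out.get("id", "session")
    for field in ("system_prompt", "title"):
        if out.get(field) not in (None, ""):
            out[field] = redact_token(field.replace("_", "-"), sid, out[field])
    out["messages"] = [redact_message(m) for m in out.get("messages", [])]
    return out


# Invented sample data, for illustration only.
session = {
    "id": "s1",
    "title": "Debug deploy",
    "model": "example-model",
    "messages": [
        {"id": 1, "role": "user", "content": "my key is sk-proj-123"},
        {
            "id": 2,
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "terminal", "arguments": '{"command": "ls"}'},
                }
            ],
        },
    ],
}
original = copy.deepcopy(session)
sanitized = sanitize(session)
```

In this sketch the tool name, tool-call id, and model name survive, while message text and arguments become tokens keyed by the message or tool-call id, so two redacted exports can still be compared structurally.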
@@ -4365,6 +4365,22 @@ class AIAgent:
                self._client_log_context(),
            )
            return client
+        if self.provider == "google-gemini-cli" or str(client_kwargs.get("base_url", "")).startswith("cloudcode-pa://"):
+            from agent.gemini_cloudcode_adapter import GeminiCloudCodeClient
+
+            # Strip OpenAI-specific kwargs the Gemini client doesn't accept
+            safe_kwargs = {
+                k: v for k, v in client_kwargs.items()
+                if k in {"api_key", "base_url", "default_headers", "project_id", "timeout"}
+            }
+            client = GeminiCloudCodeClient(**safe_kwargs)
+            logger.info(
+                "Gemini Cloud Code Assist client created (%s, shared=%s) %s",
+                reason,
+                shared,
+                self._client_log_context(),
+            )
+            return client
        client = OpenAI(**client_kwargs)
        logger.info(
            "OpenAI client created (%s, shared=%s) %s",
@@ -69,6 +69,7 @@ AUTHOR_MAP = {
    "241404605+MestreY0d4-Uninter@users.noreply.github.com": "MestreY0d4-Uninter",
    "109555139+davetist@users.noreply.github.com": "davetist",
    # contributors (manual mapping from git names)
+    "ahmedsherif95@gmail.com": "asheriif",
    "dmayhem93@gmail.com": "dmahan93",
    "samherring99@gmail.com": "samherring99",
    "desaiaum08@gmail.com": "Aum08Desai",
@@ -226,6 +227,7 @@ AUTHOR_MAP = {
    "zzn+pa@zzn.im": "xinbenlv",
    "zaynjarvis@gmail.com": "ZaynJarvis",
    "zhiheng.liu@bytedance.com": "ZaynJarvis",
+    "mbelleau@Michels-MacBook-Pro.local": "malaiwah",
 }


@@ -9,11 +9,6 @@ metadata:
    tags: [wiki, knowledge-base, research, notes, markdown, rag-alternative]
    category: research
    related_skills: [obsidian, arxiv, agentic-research-ideas]
-    config:
-      - key: wiki.path
-        description: Path to the LLM Wiki knowledge base directory
-        default: "~/wiki"
-        prompt: Wiki directory path
 ---

 # Karpathy's LLM Wiki
@@ -39,19 +34,14 @@ Use this skill when the user:

 ## Wiki Location

-Configured via `skills.config.wiki.path` in `~/.hermes/config.yaml` (prompted
-during `hermes config migrate` or `hermes setup`):
+**Location:** Set via the `WIKI_PATH` environment variable (e.g. in `~/.hermes/.env`).

-```yaml
-skills:
-  config:
-    wiki:
-      path: ~/wiki
+If unset, defaults to `~/wiki`.
+
+```bash
+WIKI="${WIKI_PATH:-$HOME/wiki}"
 ```

-Falls back to `~/wiki` default. The resolved path is injected when this
-skill loads — check the `[Skill config: ...]` block above for the active value.
-
 The wiki is just a directory of markdown files — open it in Obsidian, VS Code, or
 any editor. No database, no special tooling required.
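For tooling written in Python rather than shell, the same fallback rule can be sketched as below. The function name `resolve_wiki_path` and its `env` parameter are hypothetical and not part of the skill. One difference from the shell form is an assumption of this sketch: it also expands a leading `~` in the configured value.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_wiki_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Mirror ${WIKI_PATH:-$HOME/wiki}: use WIKI_PATH unless unset or empty."""
    env = os.environ if env is None else env
    raw = env.get("WIKI_PATH") or "~/wiki"
    # expanduser() turns a leading "~" into the home directory.
    return Path(raw).expanduser()
```

Passing a plain dict as `env` makes the rule easy to check without touching the real environment, for example `resolve_wiki_path({"WIKI_PATH": "/srv/wiki"})`.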

@@ -87,7 +77,7 @@ When the user has an existing wiki, **always orient yourself before doing anythi
 ③ **Scan recent `log.md`** — read the last 20-30 entries to understand recent activity.

 ```bash
-WIKI="${wiki_path:-$HOME/wiki}"
+WIKI="${WIKI_PATH:-$HOME/wiki}"
 # Orientation reads at session start
 read_file "$WIKI/SCHEMA.md"
 read_file "$WIKI/index.md"
@@ -107,7 +97,7 @@ at hand before creating anything new.

 When the user asks to create or start a wiki:

-1. Determine the wiki path (from config, env var, or ask the user; default `~/wiki`)
+1. Determine the wiki path (from `$WIKI_PATH` env var, or ask the user; default `~/wiki`)
 2. Create the directory structure above
 3. Ask the user what domain the wiki covers — be specific
 4. Write `SCHEMA.md` customized to the domain (see template below)
@@ -141,3 +141,116 @@ class TestCliApprovalUi:
        assert "archive-" in rendered
        assert "keyring.gpg" in rendered
        assert "status=progress" in rendered
+
+    def test_approval_display_preserves_command_and_choices_with_long_description(self):
+        """Regression: long tirith descriptions used to push approve/deny off-screen.
+
+        The panel must always render the command and every choice, even when
+        the description would otherwise wrap into 10+ lines. The description
+        gets truncated with a marker instead.
+        """
+        cli = _make_cli_stub()
+        long_desc = (
+            "Security scan — [CRITICAL] Destructive shell command with wildcard expansion: "
+            "The command performs a recursive deletion of log files which may contain "
+            "audit information relevant to active incident investigations, running services "
+            "that rely on log files for state, rotated archives, and other system artifacts. "
+            "Review whether this is intended before approving. Consider whether a targeted "
+            "deletion with more specific filters would better match the intent."
+        )
+        cli._approval_state = {
+            "command": "rm -rf /var/log/apache2/*.log",
+            "description": long_desc,
+            "choices": ["once", "session", "always", "deny"],
+            "selected": 0,
+            "response_queue": queue.Queue(),
+        }
+
+        # Simulate a compact terminal where the old unbounded panel would overflow.
+        import shutil as _shutil
+
+        with patch("cli.shutil.get_terminal_size",
+                   return_value=_shutil.os.terminal_size((100, 20))):
+            fragments = cli._get_approval_display_fragments()
+
+        rendered = "".join(text for _style, text in fragments)
+
+        # Command must be fully visible (rm -rf /var/log/apache2/*.log is short).
+        assert "rm -rf /var/log/apache2/*.log" in rendered
+
+        # Every choice must render — this is the core bug: approve/deny were
+        # getting clipped off the bottom of the panel.
+        assert "Allow once" in rendered
+        assert "Allow for this session" in rendered
+        assert "Add to permanent allowlist" in rendered
+        assert "Deny" in rendered
+
+        # The bottom border must render (i.e. the panel is self-contained).
+        assert rendered.rstrip().endswith("╯")
+
+        # The description gets truncated — marker should appear.
+        assert "(description truncated)" in rendered
+
+    def test_approval_display_skips_description_on_very_short_terminal(self):
+        """On a 12-row terminal, only the command and choices have room.
+
+        The description is dropped entirely rather than partially shown, so the
+        choices never get clipped.
+        """
+        cli = _make_cli_stub()
+        cli._approval_state = {
+            "command": "rm -rf /var/log/apache2/*.log",
+            "description": "recursive delete",
+            "choices": ["once", "session", "always", "deny"],
+            "selected": 0,
+            "response_queue": queue.Queue(),
+        }
+
+        import shutil as _shutil
+
+        with patch("cli.shutil.get_terminal_size",
+                   return_value=_shutil.os.terminal_size((100, 12))):
+            fragments = cli._get_approval_display_fragments()
+
+        rendered = "".join(text for _style, text in fragments)
+
+        # Command visible.
+        assert "rm -rf /var/log/apache2/*.log" in rendered
+        # All four choices visible.
+        for label in ("Allow once", "Allow for this session",
+                      "Add to permanent allowlist", "Deny"):
+            assert label in rendered, f"choice {label!r} missing"
+
+    def test_approval_display_truncates_giant_command_in_view_mode(self):
+        """If the user hits /view on a massive command, choices still render.
+
+        The command gets truncated with a marker; the description gets dropped
+        if there's no remaining row budget.
+        """
+        cli = _make_cli_stub()
+        # Roughly 50 lines of command when wrapped at ~64 chars.
+        giant_cmd = "bash -c 'echo " + ("x" * 3000) + "'"
+        cli._approval_state = {
+            "command": giant_cmd,
+            "description": "shell command via -c/-lc flag",
+            "choices": ["once", "session", "always", "deny"],
+            "selected": 0,
+            "show_full": True,
+            "response_queue": queue.Queue(),
+        }
+
+        import shutil as _shutil
+
+        with patch("cli.shutil.get_terminal_size",
+                   return_value=_shutil.os.terminal_size((100, 24))):
+            fragments = cli._get_approval_display_fragments()
+
+        rendered = "".join(text for _style, text in fragments)
+
+        # All four choices visible even with a huge command.
+        for label in ("Allow once", "Allow for this session",
+                      "Add to permanent allowlist", "Deny"):
+            assert label in rendered, f"choice {label!r} missing"
+
+        # Command got truncated with a marker.
+        assert "(command truncated" in rendered
@@ -176,6 +176,22 @@ class TestCommandBypassActiveSession:
            "/background response was not sent back to the user"
        )

+    @pytest.mark.asyncio
+    async def test_queue_bypasses_guard(self):
+        """/queue must bypass so it can queue without interrupting."""
+        adapter = _make_adapter()
+        sk = _session_key()
+        adapter._active_sessions[sk] = asyncio.Event()
+
+        await adapter.handle_message(_make_event("/queue follow up"))
+
+        assert sk not in adapter._pending_messages, (
+            "/queue was queued as a pending message instead of being dispatched"
+        )
+        assert any("handled:queue" in r for r in adapter.sent_responses), (
+            "/queue response was not sent back to the user"
+        )
+

 # ---------------------------------------------------------------------------
 # Tests: non-bypass messages still get queued
@@ -108,6 +108,9 @@ def _make_fake_mautrix():
        def add_event_handler(self, event_type, handler):
            self._event_handlers.setdefault(event_type, []).append(handler)

+        def add_dispatcher(self, dispatcher_type):
+            pass
+
    class InternalEventType:
        INVITE = "internal.invite"

@@ -115,6 +118,14 @@ def _make_fake_mautrix():
    mautrix_client.InternalEventType = InternalEventType
    mautrix.client = mautrix_client

+    # --- mautrix.client.dispatcher ---
+    mautrix_client_dispatcher = types.ModuleType("mautrix.client.dispatcher")
+
+    class MembershipEventDispatcher:
+        pass
+
+    mautrix_client_dispatcher.MembershipEventDispatcher = MembershipEventDispatcher
+
    # --- mautrix.client.state_store ---
    mautrix_client_state_store = types.ModuleType("mautrix.client.state_store")

@@ -163,6 +174,19 @@ def _make_fake_mautrix():

    mautrix_crypto_store.MemoryCryptoStore = MemoryCryptoStore

+    # --- mautrix.crypto.attachments ---
+    mautrix_crypto_attachments = types.ModuleType("mautrix.crypto.attachments")
+
+    def encrypt_attachment(data):
+        encrypted_file = MagicMock()
+        encrypted_file.serialize.return_value = {
+            "key": {"k": "testkey"}, "iv": "testiv",
+            "hashes": {"sha256": "testhash"}, "v": "v2",
+        }
+        return (b"ciphertext_" + data, encrypted_file)
+
+    mautrix_crypto_attachments.encrypt_attachment = encrypt_attachment
+
    # --- mautrix.crypto.store.asyncpg ---
    mautrix_crypto_store_asyncpg = types.ModuleType("mautrix.crypto.store.asyncpg")

@@ -200,8 +224,10 @@ def _make_fake_mautrix():
        "mautrix.api": mautrix_api,
        "mautrix.types": mautrix_types,
        "mautrix.client": mautrix_client,
+        "mautrix.client.dispatcher": mautrix_client_dispatcher,
        "mautrix.client.state_store": mautrix_client_state_store,
        "mautrix.crypto": mautrix_crypto,
+        "mautrix.crypto.attachments": mautrix_crypto_attachments,
        "mautrix.crypto.store": mautrix_crypto_store,
        "mautrix.crypto.store.asyncpg": mautrix_crypto_store_asyncpg,
        "mautrix.util": mautrix_util,
@@ -357,6 +383,16 @@ class TestMatrixTypingIndicator:
            timeout=0,
        )

+    @pytest.mark.asyncio
+    async def test_stop_typing_no_client_is_noop(self):
+        self.adapter._client = None
+        await self.adapter.stop_typing("!room:example.org")  # should not raise
+
+    @pytest.mark.asyncio
+    async def test_stop_typing_suppresses_exceptions(self):
+        self.adapter._client.set_typing = AsyncMock(side_effect=Exception("network"))
+        await self.adapter.stop_typing("!room:example.org")  # should not raise
+

 # ---------------------------------------------------------------------------
 # mxc:// URL conversion
@@ -835,6 +871,41 @@ class TestMatrixAccessTokenAuth:
        await adapter.disconnect()


+class TestDeviceKeyReVerification:
+    @pytest.mark.asyncio
+    async def test_verify_fails_when_server_keys_mismatch_after_upload(self):
+        """share_keys() succeeds but server still has old keys -> should return False."""
+        adapter = _make_adapter()
+
+        mock_client = MagicMock()
+        mock_client.mxid = "@bot:example.org"
+        mock_client.device_id = "TESTDEVICE"
+
+        # First query: keys missing -> triggers share_keys
+        # Second query: keys still don't match -> should fail
+        mock_keys_missing = MagicMock()
+        mock_keys_missing.device_keys = {"@bot:example.org": {}}
+
+        mock_keys_mismatch = MagicMock()
+        mock_device = MagicMock()
+        mock_device.keys = {"ed25519:TESTDEVICE": "server_old_key"}
+        mock_keys_mismatch.device_keys = {"@bot:example.org": {"TESTDEVICE": mock_device}}
+
+        mock_client.query_keys = AsyncMock(side_effect=[mock_keys_missing, mock_keys_mismatch])
+
+        mock_olm = MagicMock()
+        mock_olm.account = MagicMock()
+        mock_olm.account.shared = False
+        mock_olm.account.identity_keys = {"ed25519": "local_new_key"}
+        mock_olm.share_keys = AsyncMock()
+
+        from gateway.platforms.matrix import MatrixAdapter
+        result = await adapter._verify_device_keys_on_server(mock_client, mock_olm)
+
+        assert result is False
+        mock_olm.share_keys.assert_awaited_once()
+
+
 class TestMatrixE2EEHardFail:
    """connect() must refuse to start when E2EE is requested but deps are missing."""

@@ -1139,6 +1210,56 @@ class TestMatrixSyncLoop:
        mock_sync_store.put_next_batch.assert_awaited_once_with("s1234")


+class TestMatrixUploadAndSend:
+    @pytest.mark.asyncio
+    async def test_upload_unencrypted_room_uses_plain_url(self):
+        """Unencrypted rooms should use plain 'url' key."""
+        adapter = _make_adapter()
+        adapter._encryption = True
+        mock_client = MagicMock()
+        mock_client.crypto = object()
+        mock_client.state_store = MagicMock()
+        mock_client.state_store.is_encrypted = AsyncMock(return_value=False)
+        mock_client.upload_media = AsyncMock(return_value="mxc://example.org/plain")
+        mock_client.send_message_event = AsyncMock(return_value="$event")
+        adapter._client = mock_client
+
+        result = await adapter._upload_and_send(
+            "!room:example.org", b"hello", "test.txt", "text/plain", "m.file",
+        )
+
+        assert result.success is True
+        sent = mock_client.send_message_event.await_args.args[2]
+        assert sent["url"] == "mxc://example.org/plain"
+        assert "file" not in sent
+
+    @pytest.mark.asyncio
+    async def test_upload_encrypted_room_uses_file_payload(self):
+        """Encrypted rooms should use 'file' key with crypto metadata."""
+        adapter = _make_adapter()
+        adapter._encryption = True
+        mock_client = MagicMock()
+        mock_client.crypto = object()
+        mock_client.state_store = MagicMock()
+        mock_client.state_store.is_encrypted = AsyncMock(return_value=True)
+        mock_client.upload_media = AsyncMock(return_value="mxc://example.org/enc")
+        mock_client.send_message_event = AsyncMock(return_value="$event")
+        adapter._client = mock_client
+
+        result = await adapter._upload_and_send(
+            "!room:example.org", b"secret", "secret.txt", "text/plain", "m.file",
+        )
+
+        assert result.success is True
+        # Should have uploaded ciphertext, not plaintext
+        uploaded_data = mock_client.upload_media.await_args.args[0]
+        assert uploaded_data != b"secret"
+        sent = mock_client.send_message_event.await_args.args[2]
+        assert "url" not in sent
+        assert "file" in sent
+        assert sent["file"]["url"] == "mxc://example.org/enc"
+
+
 class TestMatrixEncryptedSendFallback:
    @pytest.mark.asyncio
    async def test_send_retries_after_e2ee_error(self):
@@ -1165,128 +1286,24 @@ class TestMatrixEncryptedSendFallback:


 # ---------------------------------------------------------------------------
-# E2EE: MegolmEvent key request + buffering via _on_encrypted_event
+# E2EE: _joined_rooms reference preservation for CryptoStateStore
 # ---------------------------------------------------------------------------

-class TestMatrixMegolmEventHandling:
-    @pytest.mark.asyncio
-    async def test_encrypted_event_buffers_for_retry(self):
-        """_on_encrypted_event should buffer undecrypted events for retry."""
-        adapter = _make_adapter()
-        adapter._user_id = "@bot:example.org"
-        adapter._startup_ts = 0.0
-        adapter._dm_rooms = {}
+class TestJoinedRoomsReference:
+    def test_joined_rooms_reference_preserved_after_reassignment(self):
+        """_CryptoStateStore must see updates after initial sync populates rooms."""
+        from gateway.platforms.matrix import _CryptoStateStore

-        fake_event = MagicMock()
-        fake_event.room_id = "!room:example.org"
-        fake_event.event_id = "$encrypted_event"
-        fake_event.sender = "@alice:example.org"
+        joined = set()
+        store = _CryptoStateStore(MagicMock(), joined)

-        await adapter._on_encrypted_event(fake_event)
+        # Simulate what connect() should do: mutate in place, not reassign.
+        joined.clear()
+        joined.update(["!room1:example.org", "!room2:example.org"])

-        # Should have buffered the event
-        assert len(adapter._pending_megolm) == 1
-        room_id, event, ts = adapter._pending_megolm[0]
-        assert room_id == "!room:example.org"
-        assert event is fake_event
-
-    @pytest.mark.asyncio
-    async def test_encrypted_event_buffer_capped(self):
-        """Buffer should not grow past _MAX_PENDING_EVENTS."""
-        adapter = _make_adapter()
-        adapter._user_id = "@bot:example.org"
-        adapter._startup_ts = 0.0
-        adapter._dm_rooms = {}
-
-        from gateway.platforms.matrix import _MAX_PENDING_EVENTS
-
-        for i in range(_MAX_PENDING_EVENTS + 10):
-            evt = MagicMock()
-            evt.room_id = "!room:example.org"
-            evt.event_id = f"$event_{i}"
-            evt.sender = "@alice:example.org"
-            await adapter._on_encrypted_event(evt)
-
-        assert len(adapter._pending_megolm) == _MAX_PENDING_EVENTS
-
-
-# ---------------------------------------------------------------------------
-# E2EE: Retry pending decryptions
-# ---------------------------------------------------------------------------
-
-class TestMatrixRetryPendingDecryptions:
-    @pytest.mark.asyncio
-    async def test_successful_decryption_routes_to_handler(self):
-        adapter = _make_adapter()
-        adapter._user_id = "@bot:example.org"
-        adapter._startup_ts = 0.0
-        adapter._dm_rooms = {}
-
-        fake_encrypted = MagicMock()
-        fake_encrypted.event_id = "$encrypted"
-
-        decrypted_event = MagicMock()
-
-        mock_crypto = MagicMock()
-        mock_crypto.decrypt_megolm_event = AsyncMock(return_value=decrypted_event)
-
-        fake_client = MagicMock()
-        fake_client.crypto = mock_crypto
-        adapter._client = fake_client
-
-        now = time.time()
-        adapter._pending_megolm = [("!room:ex.org", fake_encrypted, now)]
-
-        with patch.object(adapter, "_on_room_message", AsyncMock()) as mock_handler:
-            await adapter._retry_pending_decryptions()
-            mock_handler.assert_awaited_once_with(decrypted_event)
-
-        # Buffer should be empty now
-        assert len(adapter._pending_megolm) == 0
-
-    @pytest.mark.asyncio
-    async def test_still_undecryptable_stays_in_buffer(self):
-        adapter = _make_adapter()
-
-        fake_encrypted = MagicMock()
-        fake_encrypted.event_id = "$still_encrypted"
-
-        mock_crypto = MagicMock()
-        mock_crypto.decrypt_megolm_event = AsyncMock(side_effect=Exception("missing key"))
-
-        fake_client = MagicMock()
-        fake_client.crypto = mock_crypto
-        adapter._client = fake_client
-
-        now = time.time()
-        adapter._pending_megolm = [("!room:ex.org", fake_encrypted, now)]
-
-        await adapter._retry_pending_decryptions()
-
-        assert len(adapter._pending_megolm) == 1
-
-    @pytest.mark.asyncio
-    async def test_expired_events_dropped(self):
-        adapter = _make_adapter()
-
-        from gateway.platforms.matrix import _PENDING_EVENT_TTL
-
-        fake_event = MagicMock()
-        fake_event.event_id = "$old_event"
-
-        mock_crypto = MagicMock()
-        fake_client = MagicMock()
-        fake_client.crypto = mock_crypto
-        adapter._client = fake_client
-
-        # Timestamp well past TTL
-        old_ts = time.time() - _PENDING_EVENT_TTL - 60
-        adapter._pending_megolm = [("!room:ex.org", fake_event, old_ts)]
-
-        await adapter._retry_pending_decryptions()
-
-        # Should have been dropped
-        assert len(adapter._pending_megolm) == 0
+        import asyncio
+        # asyncio.run() creates and closes its own loop; get_event_loop() is
+        # deprecated when no loop is running.
+        rooms = asyncio.run(store.find_shared_rooms("@user:ex"))
+        assert set(rooms) == {"!room1:example.org", "!room2:example.org"}


 # ---------------------------------------------------------------------------
@@ -1354,11 +1371,70 @@ class TestMatrixEncryptedEventHandler:
        handler_calls = mock_client.add_event_handler.call_args_list
        registered_types = [call.args[0] for call in handler_calls]

-        # Should have registered handlers for ROOM_MESSAGE, REACTION, INVITE, and ROOM_ENCRYPTED
-        assert len(handler_calls) >= 4  # At minimum these four
+        # Should have registered handlers for ROOM_MESSAGE, REACTION, INVITE
+        assert len(handler_calls) >= 3

        await adapter.disconnect()

+    @pytest.mark.asyncio
+    async def test_connect_fails_on_stale_otk_conflict(self):
+        """connect() must refuse E2EE when OTK upload hits 'already exists'."""
+        from gateway.platforms.matrix import MatrixAdapter
+
+        config = PlatformConfig(
+            enabled=True,
+            token="syt_test_token",
+            extra={
+                "homeserver": "https://matrix.example.org",
+                "user_id": "@bot:example.org",
+                "encryption": True,
+            },
+        )
+        adapter = MatrixAdapter(config)
+
+        fake_mautrix_mods = _make_fake_mautrix()
+
+        mock_client = MagicMock()
+        mock_client.mxid = "@bot:example.org"
+        mock_client.device_id = None
+        mock_client.state_store = MagicMock()
+        mock_client.sync_store = MagicMock()
+        mock_client.crypto = None
+        mock_client.whoami = AsyncMock(return_value=MagicMock(user_id="@bot:example.org", device_id="DEV123"))
+        mock_client.add_event_handler = MagicMock()
+        mock_client.add_dispatcher = MagicMock()
+        mock_client.query_keys = AsyncMock(return_value={
+            "device_keys": {"@bot:example.org": {"DEV123": {
+                "keys": {"ed25519:DEV123": "fake_ed25519_key"},
+            }}},
+        })
+        mock_client.api = MagicMock()
+        mock_client.api.token = "syt_test_token"
+        mock_client.api.session = MagicMock()
+        mock_client.api.session.close = AsyncMock()
+
+        # share_keys succeeds on first call (from _verify_device_keys_on_server),
+        # then raises "already exists" on the proactive OTK flush in connect().
+        mock_olm = MagicMock()
+        mock_olm.load = AsyncMock()
+        mock_olm.share_keys = AsyncMock(
+            side_effect=[None, Exception("One time key signed_curve25519:AAAAAQ already exists")]
+        )
+        mock_olm.share_keys_min_trust = None
+        mock_olm.send_keys_min_trust = None
+        mock_olm.account = MagicMock()
+        mock_olm.account.identity_keys = {"ed25519": "fake_ed25519_key"}
+
+        fake_mautrix_mods["mautrix.client"].Client = MagicMock(return_value=mock_client)
+        fake_mautrix_mods["mautrix.crypto"].OlmMachine = MagicMock(return_value=mock_olm)
+
+        from gateway.platforms import matrix as matrix_mod
+        with patch.object(matrix_mod, "_check_e2ee_deps", return_value=True):
+            with patch.dict("sys.modules", fake_mautrix_mods):
+                result = await adapter.connect()
+
+        assert result is False
+

 # ---------------------------------------------------------------------------
 # Disconnect
@@ -1740,16 +1816,49 @@ class TestMatrixReadReceipts:
    def setup_method(self):
        self.adapter = _make_adapter()

+    @pytest.mark.asyncio
+    async def test_accepted_message_schedules_read_receipt(self):
+        self.adapter._is_dm_room = AsyncMock(return_value=True)
+        self.adapter._get_display_name = AsyncMock(return_value="Alice")
+        self.adapter._background_read_receipt = MagicMock()
+
+        ctx = await self.adapter._resolve_message_context(
+            room_id="!room:ex",
+            sender="@alice:ex",
+            event_id="$event1",
+            body="hello",
+            source_content={"body": "hello"},
+            relates_to={},
+        )
+
+        assert ctx is not None
+        self.adapter._background_read_receipt.assert_called_once_with(
+            "!room:ex", "$event1"
+        )
+
    @pytest.mark.asyncio
    async def test_send_read_receipt(self):
-        """send_read_receipt should call client.set_read_markers."""
+        """send_read_receipt should call mautrix's real read-marker API."""
        mock_client = MagicMock()
-        mock_client.set_read_markers = AsyncMock(return_value=None)
+        mock_client.set_fully_read_marker = AsyncMock(return_value=None)
        self.adapter._client = mock_client

        result = await self.adapter.send_read_receipt("!room:ex", "$event1")
        assert result is True
-        mock_client.set_read_markers.assert_called_once()
+        mock_client.set_fully_read_marker.assert_awaited_once_with(
+            "!room:ex", "$event1", "$event1"
+        )
+
+    @pytest.mark.asyncio
+    async def test_send_read_receipt_falls_back_to_receipt_only(self):
+        """send_read_receipt should still work with clients lacking read markers."""
+        mock_client = MagicMock(spec=["send_receipt"])
+        mock_client.send_receipt = AsyncMock(return_value=None)
+        self.adapter._client = mock_client
+
+        result = await self.adapter.send_read_receipt("!room:ex", "$event1")
+        assert result is True
+        mock_client.send_receipt.assert_awaited_once_with("!room:ex", "$event1")

    @pytest.mark.asyncio
    async def test_read_receipt_no_client(self):
@@ -1852,5 +1961,3 @@ class TestMatrixPresence:
        self.adapter._client = None
        result = await self.adapter.set_presence("online")
        assert result is False
-
-
@@ -10,7 +10,6 @@ import pytest

 from gateway.config import PlatformConfig

-
 # The matrix adapter module is importable without mautrix installed
 # (module-level imports use try/except with stubs).  No need for
 # module-level mock installation — tests that call adapter methods
@@ -159,9 +158,15 @@ class TestStripMention:
        result = self.adapter._strip_mention("@hermes:example.org help me")
        assert result == "help me"

-    def test_strip_localpart(self):
+    def test_localpart_preserved(self):
+        """Localpart-only text is no longer stripped — avoids false positives in paths."""
        result = self.adapter._strip_mention("hermes help me")
-        assert result == "help me"
+        assert result == "hermes help me"
+
+    def test_localpart_in_path_preserved(self):
+        """Localpart inside a file path must not be damaged."""
+        result = self.adapter._strip_mention("read /home/hermes/config.yaml")
+        assert result == "read /home/hermes/config.yaml"

    def test_strip_returns_empty_for_mention_only(self):
        result = self.adapter._strip_mention("@hermes:example.org")
@@ -273,8 +278,8 @@ async def test_require_mention_dm_always_responds(monkeypatch):


@pytest.mark.asyncio
-async def test_dm_strips_mention(monkeypatch):
-    """DMs strip mention from body, matching Discord behavior."""
+async def test_dm_strips_full_mxid(monkeypatch):
+    """DMs strip the full MXID from body when require_mention is on (default)."""
    monkeypatch.delenv("MATRIX_REQUIRE_MENTION", raising=False)
    monkeypatch.delenv("MATRIX_FREE_RESPONSE_ROOMS", raising=False)
    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
@@ -289,6 +294,23 @@ async def test_dm_strips_mention(monkeypatch):
    assert msg.text == "help me"


+@pytest.mark.asyncio
+async def test_dm_preserves_localpart_in_body(monkeypatch):
+    """DMs no longer strip bare localpart — only the full MXID is removed."""
+    monkeypatch.delenv("MATRIX_REQUIRE_MENTION", raising=False)
+    monkeypatch.delenv("MATRIX_FREE_RESPONSE_ROOMS", raising=False)
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    _set_dm(adapter)
+    event = _make_event("hermes help me")
+
+    await adapter._on_room_message(event)
+    adapter.handle_message.assert_awaited_once()
+    msg = adapter.handle_message.await_args.args[0]
+    assert msg.text == "hermes help me"
+
+
@pytest.mark.asyncio
 async def test_bare_mention_passes_empty_string(monkeypatch):
    """A message that is only a mention should pass through as empty, not be dropped."""
@@ -309,7 +331,9 @@ async def test_bare_mention_passes_empty_string(monkeypatch):
 async def test_require_mention_free_response_room(monkeypatch):
    """Free-response rooms bypass mention requirement."""
    monkeypatch.delenv("MATRIX_REQUIRE_MENTION", raising=False)
-    monkeypatch.setenv("MATRIX_FREE_RESPONSE_ROOMS", "!room1:example.org,!room2:example.org")
+    monkeypatch.setenv(
+        "MATRIX_FREE_RESPONSE_ROOMS", "!room1:example.org,!room2:example.org"
+    )
    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")

    adapter = _make_adapter()
@@ -351,6 +375,22 @@ async def test_require_mention_disabled(monkeypatch):
    assert msg.text == "hello without mention"


+@pytest.mark.asyncio
+async def test_require_mention_disabled_skips_stripping(monkeypatch):
+    """MATRIX_REQUIRE_MENTION=false: mention text is NOT stripped from body."""
+    monkeypatch.setenv("MATRIX_REQUIRE_MENTION", "false")
+    monkeypatch.delenv("MATRIX_FREE_RESPONSE_ROOMS", raising=False)
+    monkeypatch.setenv("MATRIX_AUTO_THREAD", "false")
+
+    adapter = _make_adapter()
+    event = _make_event("@hermes:example.org help me")
+
+    await adapter._on_room_message(event)
+    adapter.handle_message.assert_awaited_once()
+    msg = adapter.handle_message.await_args.args[0]
+    assert msg.text == "@hermes:example.org help me"
+
+
 # ---------------------------------------------------------------------------
 # Auto-thread in _on_room_message
 # ---------------------------------------------------------------------------
@@ -442,8 +482,10 @@ class TestThreadPersistence:
    def test_empty_state_file(self, tmp_path, monkeypatch):
        """No state file → empty set."""
        from gateway.platforms.helpers import ThreadParticipationTracker
+
        monkeypatch.setattr(
-            ThreadParticipationTracker, "_state_path",
+            ThreadParticipationTracker,
+            "_state_path",
            lambda self: tmp_path / "matrix_threads.json",
        )
        adapter = _make_adapter()
@@ -452,9 +494,11 @@ class TestThreadPersistence:
    def test_track_thread_persists(self, tmp_path, monkeypatch):
        """mark() writes to disk."""
        from gateway.platforms.helpers import ThreadParticipationTracker
+
        state_path = tmp_path / "matrix_threads.json"
        monkeypatch.setattr(
-            ThreadParticipationTracker, "_state_path",
+            ThreadParticipationTracker,
+            "_state_path",
            lambda self: state_path,
        )
        adapter = _make_adapter()
@@ -466,10 +510,12 @@ class TestThreadPersistence:
    def test_threads_survive_reload(self, tmp_path, monkeypatch):
        """Persisted threads are loaded by a new adapter instance."""
        from gateway.platforms.helpers import ThreadParticipationTracker
+
        state_path = tmp_path / "matrix_threads.json"
        state_path.write_text(json.dumps(["$t1", "$t2"]))
        monkeypatch.setattr(
-            ThreadParticipationTracker, "_state_path",
+            ThreadParticipationTracker,
+            "_state_path",
            lambda self: state_path,
        )
        adapter = _make_adapter()
@@ -479,9 +525,11 @@ class TestThreadPersistence:
    def test_cap_max_tracked_threads(self, tmp_path, monkeypatch):
        """Thread set is trimmed to max_tracked."""
        from gateway.platforms.helpers import ThreadParticipationTracker
+
        state_path = tmp_path / "matrix_threads.json"
        monkeypatch.setattr(
-            ThreadParticipationTracker, "_state_path",
+            ThreadParticipationTracker,
+            "_state_path",
            lambda self: state_path,
        )
        adapter = _make_adapter()
@@ -604,6 +652,7 @@ class TestMatrixConfigBridge:
        }

        import os
+
        import yaml

        config_file = tmp_path / "config.yaml"
@@ -613,18 +662,27 @@ class TestMatrixConfigBridge:
        yaml_cfg = yaml.safe_load(config_file.read_text())
        matrix_cfg = yaml_cfg.get("matrix", {})
        if isinstance(matrix_cfg, dict):
-            if "require_mention" in matrix_cfg and not os.getenv("MATRIX_REQUIRE_MENTION"):
-                monkeypatch.setenv("MATRIX_REQUIRE_MENTION", str(matrix_cfg["require_mention"]).lower())
+            if "require_mention" in matrix_cfg and not os.getenv(
+                "MATRIX_REQUIRE_MENTION"
+            ):
+                monkeypatch.setenv(
+                    "MATRIX_REQUIRE_MENTION", str(matrix_cfg["require_mention"]).lower()
+                )
            frc = matrix_cfg.get("free_response_rooms")
            if frc is not None and not os.getenv("MATRIX_FREE_RESPONSE_ROOMS"):
                if isinstance(frc, list):
                    frc = ",".join(str(v) for v in frc)
                monkeypatch.setenv("MATRIX_FREE_RESPONSE_ROOMS", str(frc))
            if "auto_thread" in matrix_cfg and not os.getenv("MATRIX_AUTO_THREAD"):
-                monkeypatch.setenv("MATRIX_AUTO_THREAD", str(matrix_cfg["auto_thread"]).lower())
+                monkeypatch.setenv(
+                    "MATRIX_AUTO_THREAD", str(matrix_cfg["auto_thread"]).lower()
+                )

        assert os.getenv("MATRIX_REQUIRE_MENTION") == "false"
-        assert os.getenv("MATRIX_FREE_RESPONSE_ROOMS") == "!room1:example.org,!room2:example.org"
+        assert (
+            os.getenv("MATRIX_FREE_RESPONSE_ROOMS")
+            == "!room1:example.org,!room2:example.org"
+        )
        assert os.getenv("MATRIX_AUTO_THREAD") == "false"

    def test_yaml_bridge_sets_dm_mention_threads(self, monkeypatch, tmp_path):
@@ -632,6 +690,7 @@ class TestMatrixConfigBridge:
        monkeypatch.delenv("MATRIX_DM_MENTION_THREADS", raising=False)

        import os
+
        import yaml

        yaml_content = {"matrix": {"dm_mention_threads": True}}
@@ -641,8 +700,13 @@ class TestMatrixConfigBridge:
        yaml_cfg = yaml.safe_load(config_file.read_text())
        matrix_cfg = yaml_cfg.get("matrix", {})
        if isinstance(matrix_cfg, dict):
-            if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
-                monkeypatch.setenv("MATRIX_DM_MENTION_THREADS", str(matrix_cfg["dm_mention_threads"]).lower())
+            if "dm_mention_threads" in matrix_cfg and not os.getenv(
+                "MATRIX_DM_MENTION_THREADS"
+            ):
+                monkeypatch.setenv(
+                    "MATRIX_DM_MENTION_THREADS",
+                    str(matrix_cfg["dm_mention_threads"]).lower(),
+                )

        assert os.getenv("MATRIX_DM_MENTION_THREADS") == "true"

@@ -651,9 +715,12 @@ class TestMatrixConfigBridge:
        monkeypatch.setenv("MATRIX_REQUIRE_MENTION", "true")

        import os
+
        yaml_cfg = {"matrix": {"require_mention": False}}
        matrix_cfg = yaml_cfg.get("matrix", {})
        if "require_mention" in matrix_cfg and not os.getenv("MATRIX_REQUIRE_MENTION"):
-            monkeypatch.setenv("MATRIX_REQUIRE_MENTION", str(matrix_cfg["require_mention"]).lower())
+            monkeypatch.setenv(
+                "MATRIX_REQUIRE_MENTION", str(matrix_cfg["require_mention"]).lower()
+            )

        assert os.getenv("MATRIX_REQUIRE_MENTION") == "true"
@@ -1013,3 +1013,106 @@ class TestFilterAndAccumulateIntegration:
            await task
        except asyncio.CancelledError:
            pass
+
+
+# ── buffer_only mode tests ─────────────────────────────────────────────
+
+
+class TestBufferOnlyMode:
+    """Verify buffer_only mode suppresses intermediate edits and only
+    flushes on structural boundaries (done, segment break, commentary)."""
+
+    @pytest.mark.asyncio
+    async def test_suppresses_intermediate_edits(self):
+        """Time-based and size-based edits are skipped; only got_done flushes."""
+        adapter = MagicMock()
+        adapter.MAX_MESSAGE_LENGTH = 4096
+        adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg1"))
+        adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
+
+        cfg = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=5, cursor="", buffer_only=True)
+        consumer = GatewayStreamConsumer(adapter, "!room:server", config=cfg)
+
+        for word in ["Hello", " world", ", this", " is", " a", " test"]:
+            consumer.on_delta(word)
+        consumer.finish()
+
+        await consumer.run()
+
+        adapter.send.assert_called_once()
+        adapter.edit_message.assert_not_called()
+        assert "Hello world, this is a test" in adapter.send.call_args_list[0][1]["content"]
+
+    @pytest.mark.asyncio
+    async def test_flushes_on_segment_break(self):
+        """A segment break (tool call boundary) flushes accumulated text."""
+        adapter = MagicMock()
+        adapter.MAX_MESSAGE_LENGTH = 4096
+        adapter.send = AsyncMock(side_effect=[
+            SimpleNamespace(success=True, message_id="msg1"),
+            SimpleNamespace(success=True, message_id="msg2"),
+        ])
+        adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
+
+        cfg = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=5, cursor="", buffer_only=True)
+        consumer = GatewayStreamConsumer(adapter, "!room:server", config=cfg)
+
+        consumer.on_delta("Before tool call")
+        consumer.on_delta(None)
+        consumer.on_delta("After tool call")
+        consumer.finish()
+
+        await consumer.run()
+
+        assert adapter.send.call_count == 2
+        assert "Before tool call" in adapter.send.call_args_list[0][1]["content"]
+        assert "After tool call" in adapter.send.call_args_list[1][1]["content"]
+        adapter.edit_message.assert_not_called()
+
+    @pytest.mark.asyncio
+    async def test_flushes_on_commentary(self):
+        """An interim commentary message flushes in buffer_only mode."""
+        adapter = MagicMock()
+        adapter.MAX_MESSAGE_LENGTH = 4096
+        adapter.send = AsyncMock(side_effect=[
+            SimpleNamespace(success=True, message_id="msg1"),
+            SimpleNamespace(success=True, message_id="msg2"),
+            SimpleNamespace(success=True, message_id="msg3"),
+        ])
+        adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
+
+        cfg = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=5, cursor="", buffer_only=True)
+        consumer = GatewayStreamConsumer(adapter, "!room:server", config=cfg)
+
+        consumer.on_delta("Working on it...")
+        consumer.on_commentary("I'll search for that first.")
+        consumer.on_delta("Here are the results.")
+        consumer.finish()
+
+        await consumer.run()
+
+        # Expected sends: accumulated text, commentary, final text. The
+        # assertion below only requires at least two of them.
+        assert adapter.send.call_count >= 2
+        adapter.edit_message.assert_not_called()
+
+    @pytest.mark.asyncio
+    async def test_default_mode_still_triggers_intermediate_edits(self):
+        """Regression: buffer_only=False (default) still does progressive edits."""
+        adapter = MagicMock()
+        adapter.MAX_MESSAGE_LENGTH = 4096
+        adapter.send = AsyncMock(return_value=SimpleNamespace(success=True, message_id="msg1"))
+        adapter.edit_message = AsyncMock(return_value=SimpleNamespace(success=True))
+
+        # buffer_threshold=5 means any 5+ chars can trigger an early edit
+        cfg = StreamConsumerConfig(edit_interval=0.01, buffer_threshold=5, cursor="")
+        consumer = GatewayStreamConsumer(adapter, "!room:server", config=cfg)
+
+        consumer.on_delta("Hello world, this is long enough to trigger edits")
+        consumer.finish()
+
+        await consumer.run()
+
+        # Should have at least one send. With buffer_threshold=5 and this much
+        # text, the consumer may send then edit, or just send once at got_done.
+        # The key assertion: this doesn't break.
+        assert adapter.send.call_count >= 1
@@ -370,6 +370,8 @@ class TestCopilotNormalization:
        assert opencode_model_api_mode("opencode-zen", "minimax-m2.5") == "chat_completions"

    def test_opencode_go_api_modes_match_docs(self):
+        assert opencode_model_api_mode("opencode-go", "glm-5.1") == "chat_completions"
+        assert opencode_model_api_mode("opencode-go", "opencode-go/glm-5.1") == "chat_completions"
        assert opencode_model_api_mode("opencode-go", "glm-5") == "chat_completions"
        assert opencode_model_api_mode("opencode-go", "opencode-go/glm-5") == "chat_completions"
        assert opencode_model_api_mode("opencode-go", "kimi-k2.5") == "chat_completions"
@@ -15,7 +15,7 @@ def test_opencode_go_appears_when_api_key_set():
    opencode_go = next((p for p in providers if p["slug"] == "opencode-go"), None)
    
    assert opencode_go is not None, "opencode-go should appear when OPENCODE_GO_API_KEY is set"
-    assert opencode_go["models"] == ["glm-5", "kimi-k2.5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.7", "minimax-m2.5"]
+    assert opencode_go["models"] == ["glm-5.1", "glm-5", "kimi-k2.5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.7", "minimax-m2.5"]
    # opencode-go can appear as "built-in" (from PROVIDER_TO_MODELS_DEV when
    # models.dev is reachable) or "hermes" (from HERMES_OVERLAYS fallback when
    # the API is unavailable, e.g. in CI).
@@ -1122,6 +1122,7 @@ class TestStatusRemoteGateway:
        assert data["gateway_running"] is True
        assert data["gateway_pid"] == 999
        assert data["gateway_state"] == "running"
+        assert data["gateway_health_url"] == "http://gw:8642"

    def test_status_remote_probe_not_attempted_when_local_pid_found(self, monkeypatch):
        """When local PID check succeeds, the remote probe is never called."""
@@ -1158,6 +1159,7 @@ class TestStatusRemoteGateway:
        assert resp.status_code == 200
        data = resp.json()
        assert data["gateway_running"] is False
+        assert data["gateway_health_url"] is None

    def test_status_remote_running_null_pid(self, monkeypatch):
        """Remote gateway running but PID not in response — pid should be None."""
@@ -73,6 +73,50 @@ def _build_encrypted_rtp_packet(secret_key, opus_payload, ssrc=100, seq=1, times
    return header + ciphertext + nonce_counter


+def _build_padded_rtp_packet(
+    secret_key, opus_payload, pad_len, ssrc=100, seq=1, timestamp=960,
+    declared_pad_len=None, ext_words=0,
+):
+    """Build a NaCl-encrypted RTP packet with the P bit set and padding appended.
+
+    Per RFC 3550 §5.1, the last padding byte declares how many trailing bytes
+    (including itself) to discard. ``pad_len`` is the actual padding appended;
+    ``declared_pad_len`` lets a test forge a mismatched declared length to
+    exercise the validation path. ``ext_words`` > 0 also sets the X bit and
+    prepends a synthetic extension block (4-byte preamble in cleartext header,
+    ext_words*4 bytes of encrypted extension data prepended to the payload).
+    """
+    if pad_len < 1:
+        raise ValueError("pad_len must be >= 1 (last byte includes itself)")
+    declared = pad_len if declared_pad_len is None else declared_pad_len
+    if declared < 0 or declared > 255:
+        raise ValueError("declared_pad_len must fit in one byte")
+
+    has_extension = ext_words > 0
+    first_byte = 0xA0 | (0x10 if has_extension else 0)  # V=2, P=1, [X=?], CC=0
+    fixed_header = struct.pack(">BBHII", first_byte, 0x78, seq, timestamp, ssrc)
+    if has_extension:
+        # 4-byte extension preamble: 2 bytes "defined by profile" + 2 bytes length-in-words
+        ext_preamble = struct.pack(">HH", 0xBEDE, ext_words)
+        header = fixed_header + ext_preamble
+        ext_data = b"\xab" * (ext_words * 4)
+    else:
+        header = fixed_header
+        ext_data = b""
+
+    padding = b"\x00" * (pad_len - 1) + bytes([declared])
+    plaintext = ext_data + opus_payload + padding
+
+    box = nacl.secret.Aead(secret_key)
+    nonce_counter = struct.pack(">I", seq)
+    full_nonce = nonce_counter + b"\x00" * 20
+
+    enc_msg = box.encrypt(plaintext, header, full_nonce)
+    ciphertext = enc_msg.ciphertext
+
+    return header + ciphertext + nonce_counter
+
+
 def _make_voice_receiver(secret_key, dave_session=None, bot_ssrc=9999,
                         allowed_user_ids=None, members=None):
    """Create a VoiceReceiver with real secret key."""
@@ -212,6 +256,113 @@ class TestRealNaClWithDAVE:
        assert len(receiver._buffers.get(100, b"")) == 0


+class TestRTPPaddingStrip:
+    """RFC 3550 §5.1 — strip RTP padding before DAVE/Opus decode."""
+
+    def test_padded_packet_stripped_and_buffered(self):
+        """P bit set → trailing padding stripped → opus payload decoded."""
+        key = _make_secret_key()
+        opus_silence = b"\xf8\xff\xfe"
+        receiver = _make_voice_receiver(key)
+
+        # 5 bytes of padding (4 zeros + count byte = 5)
+        packet = _build_padded_rtp_packet(key, opus_silence, pad_len=5, ssrc=100)
+        receiver._on_packet(packet)
+
+        assert 100 in receiver._buffers
+        assert len(receiver._buffers[100]) > 0
+
+    def test_padded_packet_matches_unpadded_output(self):
+        """Same opus payload with/without padding → same decoded PCM."""
+        key = _make_secret_key()
+        opus_silence = b"\xf8\xff\xfe"
+
+        recv_plain = _make_voice_receiver(key)
+        recv_plain._on_packet(
+            _build_encrypted_rtp_packet(key, opus_silence, ssrc=100)
+        )
+
+        recv_padded = _make_voice_receiver(key)
+        recv_padded._on_packet(
+            _build_padded_rtp_packet(key, opus_silence, pad_len=7, ssrc=100)
+        )
+
+        assert bytes(recv_plain._buffers[100]) == bytes(recv_padded._buffers[100])
+
+    def test_padding_with_dave_passthrough(self):
+        """Padding stripped before DAVE → passthrough buffers cleanly."""
+        key = _make_secret_key()
+        opus_silence = b"\xf8\xff\xfe"
+        dave = MagicMock()  # SSRC unmapped → DAVE skipped, passthrough used
+        receiver = _make_voice_receiver(key, dave_session=dave)
+
+        packet = _build_padded_rtp_packet(key, opus_silence, pad_len=4, ssrc=100)
+        receiver._on_packet(packet)
+
+        dave.decrypt.assert_not_called()
+        assert 100 in receiver._buffers
+        assert len(receiver._buffers[100]) > 0
+
+    def test_invalid_padding_length_zero_dropped(self):
+        """Declared pad_len=0 is invalid (RFC 3550 requires the count to include itself)."""
+        key = _make_secret_key()
+        opus_silence = b"\xf8\xff\xfe"
+        receiver = _make_voice_receiver(key)
+
+        packet = _build_padded_rtp_packet(
+            key, opus_silence, pad_len=4, declared_pad_len=0, ssrc=100
+        )
+        receiver._on_packet(packet)
+
+        assert len(receiver._buffers.get(100, b"")) == 0
+
+    def test_invalid_padding_length_overflow_dropped(self):
+        """Declared pad_len > payload size → packet dropped."""
+        key = _make_secret_key()
+        opus_silence = b"\xf8\xff\xfe"
+        receiver = _make_voice_receiver(key)
+
+        packet = _build_padded_rtp_packet(
+            key, opus_silence, pad_len=4, declared_pad_len=255, ssrc=100
+        )
+        receiver._on_packet(packet)
+
+        assert len(receiver._buffers.get(100, b"")) == 0
+
+    def test_padding_consuming_entire_payload_dropped(self):
+        """Padding consumes entire payload → no opus data → dropped."""
+        key = _make_secret_key()
+        receiver = _make_voice_receiver(key)
+
+        # Empty opus payload, 6 bytes of padding (count byte declares 6)
+        packet = _build_padded_rtp_packet(key, b"", pad_len=6, ssrc=100)
+        receiver._on_packet(packet)
+
+        assert len(receiver._buffers.get(100, b"")) == 0
+
+    def test_padding_with_extension_stripped_correctly(self):
+        """X+P bits both set → strip extension from start, padding from end."""
+        key = _make_secret_key()
+        opus_silence = b"\xf8\xff\xfe"
+
+        # Same opus payload sent two ways: plain, and with both ext+padding
+        recv_plain = _make_voice_receiver(key)
+        recv_plain._on_packet(
+            _build_encrypted_rtp_packet(key, opus_silence, ssrc=100)
+        )
+
+        recv_ext_pad = _make_voice_receiver(key)
+        recv_ext_pad._on_packet(
+            _build_padded_rtp_packet(
+                key, opus_silence, pad_len=5, ext_words=2, ssrc=100
+            )
+        )
+
+        # Both must yield identical decoded PCM — ext data and padding both
+        # stripped before opus decode.
+        assert bytes(recv_plain._buffers[100]) == bytes(recv_ext_pad._buffers[100])
+
+
 class TestFullVoiceFlow:
    """End-to-end: encrypt → receive → buffer → silence detect → complete."""

@@ -0,0 +1,186 @@
+"""Regression guardrail: sequential _create_openai_client calls must not
+share a closed transport across invocations.
+
+This is the behavioral twin of test_create_openai_client_kwargs_isolation.py.
+That test pins "don't mutate input kwargs" at the syntactic level — it catches
+#10933 specifically because the bug mutated ``client_kwargs`` in place. This
+test pins the user-visible invariant at the behavioral level: no matter HOW a
+future keepalive / transport reimplementation plumbs sockets in, the Nth call
+to ``_create_openai_client`` must not hand back a client wrapping a
+now-closed httpx transport from an earlier call.
+
+AlexKucera's Discord report (2026-04-16): after ``hermes update`` pulled
+#10933, the first chat on a session worked, every subsequent chat failed
+with ``APIConnectionError('Connection error.')`` whose cause was
+``RuntimeError: Cannot send a request, as the client has been closed``.
+That is the exact scenario this test reproduces at object level without a
+network, so it runs in CI on every PR.
+"""
+from unittest.mock import MagicMock, patch
+
+from run_agent import AIAgent
+
+
+def _make_agent():
+    return AIAgent(
+        model="test/model",
+        quiet_mode=True,
+        skip_context_files=True,
+        skip_memory=True,
+    )
+
+
+def _make_fake_openai_factory(constructed):
+    """Return a fake ``OpenAI`` class that records every constructed instance
+    along with whatever ``http_client`` it was handed (or ``None`` if the
+    caller did not inject one).
+
+    The fake also forwards ``.close()`` calls down to the http_client if one
+    is present, mirroring what the real OpenAI SDK does during teardown and
+    what would expose the #10933 bug.
+    """
+
+    class _FakeOpenAI:
+        def __init__(self, **kwargs):
+            self._kwargs = kwargs
+            self._http_client = kwargs.get("http_client")
+            self._closed = False
+            constructed.append(self)
+
+        def close(self):
+            self._closed = True
+            hc = self._http_client
+            if hc is not None and hasattr(hc, "close"):
+                try:
+                    hc.close()
+                except Exception:
+                    pass
+
+    return _FakeOpenAI
+
+
+def test_second_create_does_not_wrap_closed_transport_from_first():
+    """Back-to-back _create_openai_client calls on the same _client_kwargs
+    must not hand call N a closed http_client from call N-1.
+
+    The bug class: call 1 injects an httpx.Client into self._client_kwargs,
+    client 1 closes (SDK teardown), its http_client closes with it, call 2
+    reads the SAME now-closed http_client from self._client_kwargs and wraps
+    it. Every request through client 2 then fails.
+    """
+    agent = _make_agent()
+    constructed: list = []
+    fake_openai = _make_fake_openai_factory(constructed)
+
+    # Seed a baseline kwargs dict resembling real runtime state.
+    agent._client_kwargs = {
+        "api_key": "test-key-value",
+        "base_url": "https://api.example.com/v1",
+    }
+
+    with patch("run_agent.OpenAI", fake_openai):
+        # Call 1 — what _replace_primary_openai_client does at init/rebuild.
+        client_a = agent._create_openai_client(
+            agent._client_kwargs, reason="initial", shared=True
+        )
+        # Simulate the SDK teardown that follows a rebuild: the old client's
+        # close() is invoked, which closes its underlying http_client if one
+        # was injected. This is exactly what _replace_primary_openai_client
+        # does via _close_openai_client after a successful rebuild.
+        client_a.close()
+
+        # Call 2 — the rebuild path. This is where #10933 crashed on the
+        # next real request.
+        client_b = agent._create_openai_client(
+            agent._client_kwargs, reason="rebuild", shared=True
+        )
+
+    assert len(constructed) == 2, f"expected 2 OpenAI constructions, got {len(constructed)}"
+    assert constructed[0] is client_a
+    assert constructed[1] is client_b
+
+    hc_a = constructed[0]._http_client
+    hc_b = constructed[1]._http_client
+
+    # If the implementation does not inject http_client at all, we're safely
+    # past the bug class — nothing to share, nothing to close. That's fine.
+    if hc_a is None and hc_b is None:
+        return
+
+    # If ANY http_client is injected, the two calls MUST NOT share the same
+    # object, because call 1's object was closed between calls.
+    if hc_a is not None and hc_b is not None:
+        assert hc_a is not hc_b, (
+            "Regression of #10933: _create_openai_client handed the same "
+            "http_client to two sequential constructions. After the first "
+            "client is closed (normal SDK teardown on rebuild), the second "
+            "wraps a closed transport and every subsequent chat raises "
+            "'Cannot send a request, as the client has been closed'."
+        )
+
+    # And whatever http_client the LATEST call handed out must not be closed
+    # already. This catches implementations that cache the injected client on
+    # ``self`` (under any attribute name) and rebuild the SDK client around
+    # it even after the previous SDK close closed the cached transport.
+    if hc_b is not None:
+        is_closed_attr = getattr(hc_b, "is_closed", None)
+        if is_closed_attr is not None:
+            assert not is_closed_attr, (
+                "Regression of #10933: second _create_openai_client returned "
+                "a client whose http_client is already closed. New chats on "
+                "this session will fail with 'Cannot send a request, as the "
+                "client has been closed'."
+            )
+
+
+def test_replace_primary_openai_client_survives_repeated_rebuilds():
+    """Full rebuild path: exercise _replace_primary_openai_client three times
+    back-to-back and confirm every resulting ``self.client`` is a fresh,
+    usable construction rather than a wrapper around a previously-closed
+    transport.
+
+    _replace_primary_openai_client is the real rebuild entrypoint — it is
+    what runs on 401 credential refresh, pool rotation, and model switch.
+    If a future keepalive tweak stores state on ``self`` between calls,
+    this test is what notices.
+    """
+    agent = _make_agent()
+    constructed: list = []
+    fake_openai = _make_fake_openai_factory(constructed)
+
+    agent._client_kwargs = {
+        "api_key": "test-key-value",
+        "base_url": "https://api.example.com/v1",
+    }
+
+    with patch("run_agent.OpenAI", fake_openai):
+        # Seed the initial client so _replace has something to tear down.
+        agent.client = agent._create_openai_client(
+            agent._client_kwargs, reason="seed", shared=True
+        )
+        # Three rebuilds in a row. Each one must install a fresh live client.
+        for label in ("rebuild_1", "rebuild_2", "rebuild_3"):
+            ok = agent._replace_primary_openai_client(reason=label)
+            assert ok, f"rebuild {label} returned False"
+            cur = agent.client
+            assert not cur._closed, (
+                f"after rebuild {label}, self.client is already closed — "
+                "this breaks the very next chat turn"
+            )
+            hc = cur._http_client
+            if hc is not None:
+                is_closed_attr = getattr(hc, "is_closed", None)
+                if is_closed_attr is not None:
+                    assert not is_closed_attr, (
+                        f"after rebuild {label}, self.client.http_client is "
+                        "closed — reproduces #10933 (AlexKucera report, "
+                        "Discord 2026-04-16)"
+                    )
+
+    # All four constructions (seed + 3 rebuilds) should be distinct objects.
+    # If two are the same, the rebuild is caching the SDK client across
+    # teardown, which also reproduces the bug class.
+    assert len({id(c) for c in constructed}) == len(constructed), (
+        "Some _create_openai_client calls returned the same object across "
+        "a teardown — rebuild is not producing fresh clients"
+    )
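The bug class these tests guard against can be illustrated without the real SDK. The sketch below is a minimal model, not the `run_agent` implementation: `FakeTransport`, `FakeSDKClient`, `build_buggy`, and `build_fixed` are hypothetical names, and the assumption is that the SDK client closes whatever transport it was handed when it is torn down.

```python
class FakeTransport:
    """Hypothetical stand-in for an HTTP transport (e.g. an httpx client)."""

    def __init__(self):
        self.is_closed = False

    def close(self):
        self.is_closed = True


class FakeSDKClient:
    """Hypothetical SDK client that closes its transport on teardown."""

    def __init__(self, http_client):
        self._http_client = http_client

    def send(self):
        if self._http_client.is_closed:
            raise RuntimeError(
                "Cannot send a request, as the client has been closed"
            )
        return "ok"

    def close(self):
        self._http_client.close()


def build_buggy(kwargs):
    # Bug: the transport is cached in the shared kwargs dict, so every
    # later construction wraps the same (possibly closed) transport.
    if "http_client" not in kwargs:
        kwargs["http_client"] = FakeTransport()
    return FakeSDKClient(kwargs["http_client"])


def build_fixed(kwargs):
    # Fix: never write back into the shared dict; make a new transport
    # for every construction.
    local = dict(kwargs)
    local["http_client"] = FakeTransport()
    return FakeSDKClient(local["http_client"])


def rebuild_twice(build):
    """Build, use, tear down, rebuild, and report what the second send does."""
    kwargs = {}
    first = build(kwargs)
    first.send()
    first.close()
    second = build(kwargs)
    try:
        return second.send()
    except RuntimeError as exc:
        return str(exc)
```

With `build_buggy`, the second client fails on its first send; with `build_fixed`, it succeeds. This is the same contract the assertions above pin against the mocked `OpenAI` factory.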
@@ -0,0 +1,137 @@
+"""Live regression guardrail for the keepalive/transport bug class (#10933).
+
+AlexKucera reported on Discord (2026-04-16) that after ``hermes update`` pulled
+#10933, the FIRST chat in a session worked and EVERY subsequent chat failed
+with ``APIConnectionError('Connection error.')`` whose cause was
+``RuntimeError: Cannot send a request, as the client has been closed``.
+
+The companion ``test_create_openai_client_reuse.py`` pins this contract at
+object level with mocked ``OpenAI``. This file runs the same shape of
+reproduction against a real provider so we have a true end-to-end smoke test
+for any future keepalive / transport plumbing.
+
+Opt-in — not part of default CI:
+    HERMES_LIVE_TESTS=1 pytest tests/run_agent/test_sequential_chats_live.py -v
+
+Requires ``OPENROUTER_API_KEY`` to be set (or sourced via ~/.hermes/.env).
+"""
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+import pytest
+
+
+# Load ~/.hermes/.env so live runs pick up OPENROUTER_API_KEY without
+# needing the runner to shell-source it first. Silent if the file is absent.
+def _load_user_env() -> None:
+    env_file = Path.home() / ".hermes" / ".env"
+    if not env_file.exists():
+        return
+    for raw in env_file.read_text(encoding="utf-8").splitlines():
+        line = raw.strip()
+        if not line or line.startswith("#") or "=" not in line:
+            continue
+        k, v = line.split("=", 1)
+        k = k.strip()
+        v = v.strip().strip('"').strip("'")
+        # Don't clobber an already-set env var — lets the caller override.
+        os.environ.setdefault(k, v)
+
+
+_load_user_env()
+
+
+LIVE = os.environ.get("HERMES_LIVE_TESTS") == "1"
+OR_KEY = os.environ.get("OPENROUTER_API_KEY", "")
+
+pytestmark = [
+    pytest.mark.skipif(not LIVE, reason="live-only — set HERMES_LIVE_TESTS=1"),
+    pytest.mark.skipif(not OR_KEY, reason="OPENROUTER_API_KEY not configured"),
+]
+
+# Cheap, fast, tool-capable. Swap if it ever goes dark.
+LIVE_MODEL = "google/gemini-2.5-flash"
+
+
+def _make_live_agent():
+    from run_agent import AIAgent
+
+    return AIAgent(
+        model=LIVE_MODEL,
+        provider="openrouter",
+        api_key=OR_KEY,
+        base_url="https://openrouter.ai/api/v1",
+        max_iterations=3,
+        quiet_mode=True,
+        skip_context_files=True,
+        skip_memory=True,
+        # All toolsets off so the agent just produces a single text reply
+        # per turn — we want to test the HTTP client lifecycle, not tools.
+        disabled_toolsets=["*"],
+    )
+
+
+def _looks_like_error_reply(reply: str) -> tuple[bool, str]:
+    """AIAgent returns an error-sentinel string (not an exception) when the
+    underlying API call fails past retries. A naive ``assert reply and
+    reply.strip()`` misses this because the sentinel is truthy. This
+    checker enumerates the known-bad shapes so the live test actually
+    catches #10933 instead of rubber-stamping the error response.
+    """
+    lowered = reply.lower().strip()
+    bad_substrings = (
+        "api call failed",
+        "connection error",
+        "client has been closed",
+        "cannot send a request",
+        "max retries",
+    )
+    for marker in bad_substrings:
+        if marker in lowered:
+            return True, marker
+    return False, ""
+
+
+def _assert_healthy_reply(reply, turn_label: str) -> None:
+    assert reply and reply.strip(), f"{turn_label} returned empty: {reply!r}"
+    is_err, marker = _looks_like_error_reply(reply)
+    assert not is_err, (
+        f"{turn_label} returned an error-sentinel string instead of a real "
+        f"model reply — matched marker {marker!r}. This is the exact shape "
+        f"of #10933 (AlexKucera Discord report, 2026-04-16): the agent's "
+        f"retry loop burned three attempts against a closed httpx transport "
+        f"and surfaced 'API call failed after 3 retries: Connection error.' "
+        f"to the user. Reply was: {reply!r}"
+    )
+
+
+def test_three_sequential_chats_across_client_rebuild():
+    """Reproduces AlexKucera's exact failure shape end-to-end.
+
+    Turn 1 always worked under #10933. Turn 2 was the one that failed
+    because the shared httpx transport had been torn down between turns.
+    Turn 3 is here as extra insurance against any lazy-init shape where
+    the failure only shows up on call N>=3.
+
+    We also deliberately trigger ``_replace_primary_openai_client`` between
+    turn 2 and turn 3 — that is the real rebuild entrypoint (401 refresh,
+    credential rotation, model switch) and is the path that actually
+    stored the closed transport into ``self._client_kwargs`` in #10933.
+    """
+    agent = _make_live_agent()
+
+    r1 = agent.chat("Respond with only the word: ONE")
+    _assert_healthy_reply(r1, "turn 1")
+
+    r2 = agent.chat("Respond with only the word: TWO")
+    _assert_healthy_reply(r2, "turn 2")
+
+    # Force a client rebuild through the real path — mimics 401 refresh /
+    # credential rotation / model switch lifecycle.
+    rebuilt = agent._replace_primary_openai_client(reason="regression_test_rebuild")
+    assert rebuilt, "rebuild via _replace_primary_openai_client returned False"
+
+    r3 = agent.chat("Respond with only the word: THREE")
+    _assert_healthy_reply(r3, "turn 3 (post-rebuild)")
@@ -1,5 +1,6 @@
 """Tests for hermes_state.py — SessionDB SQLite CRUD, FTS5 search, export."""

+import json
 import time
 import pytest
 from pathlib import Path
@@ -609,6 +610,156 @@ class TestDeleteAndExport:
        assert exports[0]["source"] == "cli"


+# =========================================================================
+# Export sanitization (ported from anomalyco/opencode#22489)
+# =========================================================================
+
+class TestSanitizeSessionExport:
+    """Validate that sanitize_session_export redacts user content while
+    preserving structural metadata useful for analysis."""
+
+    def test_redacts_message_content(self, db):
+        from hermes_state import sanitize_session_export
+
+        db.create_session(session_id="s1", source="cli", model="test", system_prompt="secret prompt")
+        db.set_session_title("s1", "my confidential task")
+        db.append_message("s1", role="user", content="what is my password?")
+        db.append_message("s1", role="assistant", content="Here's your secret: XYZ")
+
+        raw = db.export_session("s1")
+        sanitized = sanitize_session_export(raw)
+
+        # Structural / metric fields are preserved.
+        assert sanitized["id"] == "s1"
+        assert sanitized["source"] == "cli"
+        assert sanitized["model"] == "test"
+        assert len(sanitized["messages"]) == 2
+        for msg in sanitized["messages"]:
+            assert "role" in msg
+            assert msg["role"] in ("user", "assistant")
+            assert "id" in msg
+            assert "timestamp" in msg
+
+        # Content is redacted.
+        assert "password" not in json.dumps(sanitized)
+        assert "XYZ" not in json.dumps(sanitized)
+        assert "confidential" not in json.dumps(sanitized)
+        assert "secret prompt" not in json.dumps(sanitized)
+        for msg in sanitized["messages"]:
+            assert msg["content"].startswith("[redacted:content:")
+
+        # Title and system_prompt are redacted.
+        assert sanitized["title"].startswith("[redacted:title:")
+        assert sanitized["system_prompt"].startswith("[redacted:system-prompt:")
+
+    def test_redacts_reasoning_and_tool_calls(self, db):
+        from hermes_state import sanitize_session_export
+
+        db.create_session(session_id="s1", source="cli")
+        db.append_message(
+            "s1",
+            role="assistant",
+            content="let me search",
+            reasoning="user asked about their private API key",
+            tool_calls=[{
+                "id": "tc_1",
+                "type": "function",
+                "function": {
+                    "name": "terminal",
+                    "arguments": '{"command": "cat /etc/passwd"}',
+                },
+            }],
+        )
+        db.append_message(
+            "s1",
+            role="tool",
+            content="root:x:0:0:root:/root:/bin/bash",
+            tool_call_id="tc_1",
+            tool_name="terminal",
+        )
+
+        raw = db.export_session("s1")
+        sanitized = sanitize_session_export(raw)
+        dumped = json.dumps(sanitized)
+
+        # No leaked content.
+        assert "private API key" not in dumped
+        assert "/etc/passwd" not in dumped
+        assert "root:x:0:0" not in dumped
+        assert "cat" not in dumped  # the command body should not leak
+
+        # Tool call structure preserved (id, type, function name).
+        asst = sanitized["messages"][0]
+        assert asst["tool_calls"][0]["id"] == "tc_1"
+        assert asst["tool_calls"][0]["type"] == "function"
+        assert asst["tool_calls"][0]["function"]["name"] == "terminal"
+        assert asst["tool_calls"][0]["function"]["arguments"].startswith("[redacted:tool-input:")
+
+        # Reasoning field redacted but present.
+        assert asst["reasoning"].startswith("[redacted:reasoning:")
+
+        # Tool response metadata preserved (tool_call_id, tool_name).
+        tool_msg = sanitized["messages"][1]
+        assert tool_msg["tool_call_id"] == "tc_1"
+        assert tool_msg["tool_name"] == "terminal"
+        assert tool_msg["content"].startswith("[redacted:content:")
+
+    def test_preserves_empty_values(self, db):
+        """Empty/None content should pass through untouched so consumers
+        don't mistake a redaction token for data that was never there."""
+        from hermes_state import sanitize_session_export
+
+        db.create_session(session_id="s1", source="cli")
+        db.append_message("s1", role="user", content="")
+        raw = db.export_session("s1")
+        sanitized = sanitize_session_export(raw)
+
+        # Empty content stays empty (not a fake redaction token).
+        assert sanitized["messages"][0]["content"] in ("", None)
+
+    def test_does_not_mutate_input(self, db):
+        from hermes_state import sanitize_session_export
+
+        db.create_session(session_id="s1", source="cli")
+        db.append_message("s1", role="user", content="original text")
+        raw = db.export_session("s1")
+        original_content = raw["messages"][0]["content"]
+
+        sanitize_session_export(raw)
+
+        # Original dict is unchanged.
+        assert raw["messages"][0]["content"] == original_content
+
+    def test_redacts_reasoning_details_blocks(self):
+        """reasoning_details is a list of typed blocks — preserve type, redact payload."""
+        from hermes_state import sanitize_session_export
+
+        session = {
+            "id": "s1",
+            "source": "cli",
+            "messages": [{
+                "id": "m1",
+                "role": "assistant",
+                "content": "done",
+                "reasoning_details": [
+                    {"type": "reasoning.text", "text": "sensitive internal thought"},
+                    {"type": "reasoning.encrypted", "data": "encrypted_blob_XYZ"},
+                ],
+            }],
+        }
+        sanitized = sanitize_session_export(session)
+        dumped = json.dumps(sanitized)
+
+        assert "sensitive internal thought" not in dumped
+        assert "encrypted_blob_XYZ" not in dumped
+        # Block types preserved.
+        blocks = sanitized["messages"][0]["reasoning_details"]
+        assert blocks[0]["type"] == "reasoning.text"
+        assert blocks[0]["text"].startswith("[redacted:reasoning-text:")
+        assert blocks[1]["type"] == "reasoning.encrypted"
+        assert blocks[1]["data"].startswith("[redacted:reasoning-data:")
+
+
 # =========================================================================
 # Prune
 # =========================================================================
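The redaction contract exercised by `TestSanitizeSessionExport` can be sketched as follows. This is an illustrative approximation, not the code in `hermes_state`: `sanitize_sketch` and `_token` are hypothetical names, the identifier used inside each token is an assumption (the tests only check the `[redacted:<kind>:` prefix), and `reasoning_details` handling is omitted for brevity.

```python
import copy


def _token(kind, ident):
    # Assumed token layout; the tests above only pin the prefix.
    return f"[redacted:{kind}:{ident}]"


def sanitize_sketch(session):
    """Return a redacted deep copy; the input dict is left untouched."""
    out = copy.deepcopy(session)
    sid = out.get("id")

    # Session-level free text. Empty or None values pass through as-is.
    for field, kind in (("title", "title"), ("system_prompt", "system-prompt")):
        if out.get(field):
            out[field] = _token(kind, sid)

    for index, msg in enumerate(out.get("messages", [])):
        mid = msg.get("id", index)
        if msg.get("content"):
            msg["content"] = _token("content", mid)
        if msg.get("reasoning"):
            msg["reasoning"] = _token("reasoning", mid)
        # Keep tool-call id, type, and function name; redact only arguments.
        for call in msg.get("tool_calls") or []:
            fn = call.get("function") or {}
            if fn.get("arguments"):
                fn["arguments"] = _token("tool-input", call.get("id"))
    return out
```

Working on a deep copy is what makes the immutability test pass, and skipping falsy values is what keeps empty content from turning into a misleading token.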
@@ -0,0 +1,200 @@
+"""Tests for the activity-heartbeat behavior of the blocking gateway approval wait.
+
+Regression test for false gateway inactivity timeouts firing while the agent
+is legitimately blocked waiting for a user to respond to a dangerous-command
+approval prompt.  Before the fix, ``entry.event.wait(timeout=...)`` blocked
+silently — no ``_touch_activity()`` calls — and the gateway's inactivity
+watchdog (``agent.gateway_timeout``, default 1800s) would kill the agent
+while the user was still choosing whether to approve.
+
+The fix polls the event in short slices and fires ``touch_activity_if_due``
+between slices, mirroring ``_wait_for_process`` in ``tools/environments/base.py``.
+"""
+
+import os
+import threading
+import time
+from unittest.mock import patch
+
+
+def _clear_approval_state():
+    """Reset all module-level approval state between tests."""
+    from tools import approval as mod
+    mod._gateway_queues.clear()
+    mod._gateway_notify_cbs.clear()
+    mod._session_approved.clear()
+    mod._permanent_approved.clear()
+    mod._pending.clear()
+
+
+class TestApprovalHeartbeat:
+    """The blocking gateway approval wait must fire activity heartbeats.
+
+    Without heartbeats, the gateway's inactivity watchdog kills the agent
+    thread while it's legitimately waiting for a slow user to respond to
+    an approval prompt (observed in real user logs: MRB, April 2026).
+    """
+
+    SESSION_KEY = "heartbeat-test-session"
+
+    def setup_method(self):
+        _clear_approval_state()
+        self._saved_env = {
+            k: os.environ.get(k)
+            for k in ("HERMES_GATEWAY_SESSION", "HERMES_YOLO_MODE",
+                      "HERMES_SESSION_KEY")
+        }
+        os.environ.pop("HERMES_YOLO_MODE", None)
+        os.environ["HERMES_GATEWAY_SESSION"] = "1"
+        # The blocking wait path reads the session key via contextvar OR
+        # os.environ fallback.  Contextvars don't propagate across threads
+        # by default, so env var is the portable way to drive this in tests.
+        os.environ["HERMES_SESSION_KEY"] = self.SESSION_KEY
+
+    def teardown_method(self):
+        for k, v in self._saved_env.items():
+            if v is None:
+                os.environ.pop(k, None)
+            else:
+                os.environ[k] = v
+        _clear_approval_state()
+
+    def test_heartbeat_fires_while_waiting_for_approval(self):
+        """touch_activity_if_due is called repeatedly during the wait."""
+        from tools.approval import (
+            check_all_command_guards,
+            register_gateway_notify,
+            resolve_gateway_approval,
+        )
+
+        register_gateway_notify(self.SESSION_KEY, lambda _payload: None)
+
+        # Use an Event to signal from _fake_touch back to the main thread
+        # so we can resolve as soon as the first heartbeat fires — avoids
+        # flakiness from fixed sleeps racing against thread startup.
+        first_heartbeat = threading.Event()
+        heartbeat_calls: list[str] = []
+
+        def _fake_touch(state, label):
+            # Bypass the 10s throttle so the heartbeat fires every loop
+            # iteration; we're measuring whether the call happens at all.
+            heartbeat_calls.append(label)
+            state["last_touch"] = 0.0
+            first_heartbeat.set()
+
+        result_holder: dict = {}
+
+        def _run_check():
+            try:
+                with patch(
+                    "tools.environments.base.touch_activity_if_due",
+                    side_effect=_fake_touch,
+                ):
+                    result_holder["result"] = check_all_command_guards(
+                        "rm -rf /tmp/nonexistent-heartbeat-target", "local"
+                    )
+            except Exception as exc:  # pragma: no cover
+                result_holder["exc"] = exc
+
+        thread = threading.Thread(target=_run_check, daemon=True)
+        thread.start()
+
+        # Wait for at least one heartbeat to fire — bounded at 10s to catch
+        # a genuinely hung worker thread without making a green run slow.
+        assert first_heartbeat.wait(timeout=10.0), (
+            "no heartbeat fired within 10s — the approval wait is blocking "
+            "without firing activity pings, which is the exact bug this "
+            "test exists to catch"
+        )
+
+        # Resolve the approval so the thread exits cleanly.
+        resolve_gateway_approval(self.SESSION_KEY, "once")
+        thread.join(timeout=5)
+
+        assert not thread.is_alive(), "approval wait did not exit after resolve"
+        assert "exc" not in result_holder, (
+            f"check_all_command_guards raised: {result_holder.get('exc')!r}"
+        )
+
+        # The fix: heartbeats fire while waiting.  Before the fix this list
+        # was empty because event.wait() blocked for the full timeout with
+        # no activity pings.
+        assert heartbeat_calls, "expected at least one heartbeat"
+        assert all(
+            call == "waiting for user approval" for call in heartbeat_calls
+        ), f"unexpected heartbeat labels: {set(heartbeat_calls)}"
+
+        # Sanity: the approval was resolved with "once" → command approved.
+        assert result_holder["result"]["approved"] is True
+
+    def test_wait_returns_immediately_on_user_response(self):
+        """Polling slices don't delay responsiveness — resolve is near-instant."""
+        from tools.approval import (
+            check_all_command_guards,
+            register_gateway_notify,
+            resolve_gateway_approval,
+        )
+
+        register_gateway_notify(self.SESSION_KEY, lambda _payload: None)
+
+        start_time = time.monotonic()
+        result_holder: dict = {}
+
+        def _run_check():
+            result_holder["result"] = check_all_command_guards(
+                "rm -rf /tmp/nonexistent-fast-target", "local"
+            )
+
+        thread = threading.Thread(target=_run_check, daemon=True)
+        thread.start()
+
+        # Resolve almost immediately — the wait loop should return within
+        # its current 1s poll slice.
+        time.sleep(0.1)
+        resolve_gateway_approval(self.SESSION_KEY, "once")
+        thread.join(timeout=5)
+        elapsed = time.monotonic() - start_time
+
+        assert not thread.is_alive()
+        assert result_holder["result"]["approved"] is True
+        # Generous bound to tolerate CI load. The previous single-wait
+        # impl returned in <10ms; the polling impl is bounded by the 1s
+        # slice length.
+        assert elapsed < 3.0, f"resolution took {elapsed:.2f}s, expected <3s"
+
+    def test_heartbeat_import_failure_does_not_break_wait(self):
+        """If tools.environments.base can't be imported, the wait still works."""
+        from tools.approval import (
+            check_all_command_guards,
+            register_gateway_notify,
+            resolve_gateway_approval,
+        )
+
+        register_gateway_notify(self.SESSION_KEY, lambda _payload: None)
+
+        result_holder: dict = {}
+        import builtins
+        real_import = builtins.__import__
+
+        def _fail_environments_base(name, *args, **kwargs):
+            if name == "tools.environments.base":
+                raise ImportError("simulated")
+            return real_import(name, *args, **kwargs)
+
+        def _run_check():
+            with patch.object(builtins, "__import__",
+                              side_effect=_fail_environments_base):
+                result_holder["result"] = check_all_command_guards(
+                    "rm -rf /tmp/nonexistent-import-fail-target", "local"
+                )
+
+        thread = threading.Thread(target=_run_check, daemon=True)
+        thread.start()
+
+        time.sleep(0.2)
+        resolve_gateway_approval(self.SESSION_KEY, "once")
+        thread.join(timeout=5)
+
+        assert not thread.is_alive()
+        # Even when heartbeat import fails, the approval flow completes.
+        assert result_holder["result"]["approved"] is True
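The polling pattern these tests describe (wait on the event in short slices and ping activity between slices) can be sketched like this. It is a minimal illustration under assumptions: `wait_with_heartbeat` and its parameters are hypothetical, and the real code in `tools/approval.py` throttles pings through `touch_activity_if_due` rather than calling a callback on every slice.

```python
import threading
import time


def wait_with_heartbeat(event, timeout, heartbeat, slice_seconds=1.0):
    """Wait for ``event`` up to ``timeout`` seconds, pinging between slices.

    Returns True if the event was set, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return event.is_set()
        # Event.wait returns True as soon as the event is set, so a user
        # response ends the wait immediately instead of after a full slice.
        if event.wait(timeout=min(slice_seconds, remaining)):
            return True
        # The slice expired with no response: report that we are still alive.
        heartbeat("waiting for user approval")
```

Because `Event.wait` returns early when the event is set, slicing the wait adds no latency to a fast response; it only adds a heartbeat opportunity once per slice while nothing is happening.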
@@ -587,3 +587,112 @@ class TestSecurity:
        
        result = mgr.restore(str(work_dir), target_hash, file_path="subdir/test.txt")
        assert result["success"] is True
+
+
+# =========================================================================
+# GPG / global git config isolation
+# =========================================================================
+# Regression tests for the bug where users with ``commit.gpgsign = true``
+# in their global git config got a pinentry popup (or a failed commit)
+# every time the agent took a background snapshot.
+
+import os as _os
+
+
+class TestGpgAndGlobalConfigIsolation:
+    def test_git_env_isolates_global_and_system_config(self, tmp_path):
+        """_git_env must null out GIT_CONFIG_GLOBAL / GIT_CONFIG_SYSTEM so the
+        shadow repo does not inherit user-level gpgsign, hooks, aliases, etc."""
+        env = _git_env(tmp_path / "shadow", str(tmp_path))
+        assert env["GIT_CONFIG_GLOBAL"] == _os.devnull
+        assert env["GIT_CONFIG_SYSTEM"] == _os.devnull
+        assert env["GIT_CONFIG_NOSYSTEM"] == "1"
+
+    def test_init_sets_commit_gpgsign_false(self, work_dir, checkpoint_base, monkeypatch):
+        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
+        shadow = _shadow_repo_path(str(work_dir))
+        _init_shadow_repo(shadow, str(work_dir))
+        # Inspect the shadow's own config directly — the settings must be
+        # written into the repo, not just inherited via env vars.
+        result = subprocess.run(
+            ["git", "config", "--file", str(shadow / "config"), "--get", "commit.gpgsign"],
+            capture_output=True, text=True,
+        )
+        assert result.stdout.strip() == "false"
+
+    def test_init_sets_tag_gpgsign_false(self, work_dir, checkpoint_base, monkeypatch):
+        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
+        shadow = _shadow_repo_path(str(work_dir))
+        _init_shadow_repo(shadow, str(work_dir))
+        result = subprocess.run(
+            ["git", "config", "--file", str(shadow / "config"), "--get", "tag.gpgSign"],
+            capture_output=True, text=True,
+        )
+        assert result.stdout.strip() == "false"
+
+    def test_checkpoint_works_with_global_gpgsign_and_broken_gpg(
+        self, work_dir, checkpoint_base, monkeypatch, tmp_path
+    ):
+        """The real bug scenario: user has global commit.gpgsign=true but GPG
+        is broken or pinentry is unavailable.  Before the fix, every snapshot
+        either failed or spawned a pinentry window.  After the fix, snapshots
+        succeed without ever invoking GPG."""
+        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
+
+        # Fake HOME with global gpgsign=true and a deliberately broken GPG
+        # binary.  If isolation fails, the commit will try to exec this
+        # nonexistent path and the checkpoint will fail.
+        fake_home = tmp_path / "fake_home"
+        fake_home.mkdir()
+        (fake_home / ".gitconfig").write_text(
+            "[user]\n    email = real@user.com\n    name = Real User\n"
+            "[commit]\n    gpgsign = true\n"
+            "[tag]\n    gpgSign = true\n"
+            "[gpg]\n    program = /nonexistent/fake-gpg-binary\n"
+        )
+        monkeypatch.setenv("HOME", str(fake_home))
+        monkeypatch.delenv("GPG_TTY", raising=False)
+        monkeypatch.delenv("DISPLAY", raising=False)  # block GUI pinentry
+
+        mgr = CheckpointManager(enabled=True)
+        assert mgr.ensure_checkpoint(str(work_dir), reason="with-global-gpgsign") is True
+        assert len(mgr.list_checkpoints(str(work_dir))) == 1
+
+    def test_checkpoint_works_on_pre_fix_shadow_without_local_gpgsign(
+        self, work_dir, checkpoint_base, monkeypatch, tmp_path
+    ):
+        """Users with shadow repos created before the fix will not have
+        commit.gpgsign=false in their shadow's own config.  The inline
+        ``--no-gpg-sign`` flag on the commit call must cover them."""
+        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
+
+        # Simulate a pre-fix shadow repo: init without commit.gpgsign=false
+        # in its own config.  _init_shadow_repo now writes it, so we must
+        # manually remove it to mimic the pre-fix state.
+        shadow = _shadow_repo_path(str(work_dir))
+        _init_shadow_repo(shadow, str(work_dir))
+        subprocess.run(
+            ["git", "config", "--file", str(shadow / "config"),
+             "--unset", "commit.gpgsign"],
+            capture_output=True, text=True, check=False,
+        )
+        subprocess.run(
+            ["git", "config", "--file", str(shadow / "config"),
+             "--unset", "tag.gpgSign"],
+            capture_output=True, text=True, check=False,
+        )
+
+        # And simulate hostile global config
+        fake_home = tmp_path / "fake_home"
+        fake_home.mkdir()
+        (fake_home / ".gitconfig").write_text(
+            "[commit]\n    gpgsign = true\n"
+            "[gpg]\n    program = /nonexistent/fake-gpg-binary\n"
+        )
+        monkeypatch.setenv("HOME", str(fake_home))
+        monkeypatch.delenv("GPG_TTY", raising=False)
+        monkeypatch.delenv("DISPLAY", raising=False)
+
+        mgr = CheckpointManager(enabled=True)
+        assert mgr.ensure_checkpoint(str(work_dir), reason="pre-fix-shadow") is True
+        assert len(mgr.list_checkpoints(str(work_dir))) == 1
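The two layers of protection tested above (environment isolation plus an explicit flag on the commit) can be sketched as below. This is not the `_git_env` implementation: `isolated_git_env` and `snapshot_commit_argv` are hypothetical helpers, and setting `GIT_DIR` / `GIT_WORK_TREE` is an assumption about how the shadow repo is addressed. The `GIT_CONFIG_*` variables and `--no-gpg-sign` are standard git features.

```python
import os


def isolated_git_env(git_dir, work_tree, base_env=None):
    """Build an environment in which git ignores user and system config."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the shadow repo is selected through these two variables.
    env["GIT_DIR"] = str(git_dir)
    env["GIT_WORK_TREE"] = str(work_tree)
    # Point global and system config at the null device so settings such as
    # commit.gpgsign, hooks, and aliases are never read.
    env["GIT_CONFIG_GLOBAL"] = os.devnull
    env["GIT_CONFIG_SYSTEM"] = os.devnull
    env["GIT_CONFIG_NOSYSTEM"] = "1"
    return env


def snapshot_commit_argv(message):
    """Commit command that skips signing even if config asks for it."""
    return ["git", "commit", "--no-gpg-sign", "--allow-empty", "-m", message]
```

The env vars cover newly created shadow repos, while the inline `--no-gpg-sign` flag covers repos whose own config predates the fix, which is the case the last test simulates.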
@@ -0,0 +1,287 @@
+"""Tests for the Google Gemini TTS provider in tools/tts_tool.py."""
+
+import base64
+import struct
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+
+@pytest.fixture(autouse=True)
+def clean_env(monkeypatch):
+    for key in (
+        "GEMINI_API_KEY",
+        "GOOGLE_API_KEY",
+        "GEMINI_BASE_URL",
+        "HERMES_SESSION_PLATFORM",
+    ):
+        monkeypatch.delenv(key, raising=False)
+
+
+@pytest.fixture
+def fake_pcm_bytes():
+    # 0.1s of silence at 24kHz mono 16-bit = 4800 bytes
+    return b"\x00" * 4800
+
+
+@pytest.fixture
+def mock_gemini_response(fake_pcm_bytes):
+    """A successful Gemini generateContent response."""
+    resp = MagicMock()
+    resp.status_code = 200
+    resp.json.return_value = {
+        "candidates": [
+            {
+                "content": {
+                    "parts": [
+                        {
+                            "inlineData": {
+                                "mimeType": "audio/L16;codec=pcm;rate=24000",
+                                "data": base64.b64encode(fake_pcm_bytes).decode(),
+                            }
+                        }
+                    ]
+                }
+            }
+        ]
+    }
+    return resp
+
+
+class TestWrapPcmAsWav:
+    def test_riff_header_structure(self):
+        from tools.tts_tool import _wrap_pcm_as_wav
+
+        pcm = b"\x01\x02\x03\x04" * 10
+        wav = _wrap_pcm_as_wav(pcm, sample_rate=24000, channels=1, sample_width=2)
+
+        assert wav[:4] == b"RIFF"
+        assert wav[8:12] == b"WAVE"
+        assert wav[12:16] == b"fmt "
+        # Audio format (PCM=1)
+        assert struct.unpack("<H", wav[20:22])[0] == 1
+        # Channels
+        assert struct.unpack("<H", wav[22:24])[0] == 1
+        # Sample rate
+        assert struct.unpack("<I", wav[24:28])[0] == 24000
+        # Bits per sample
+        assert struct.unpack("<H", wav[34:36])[0] == 16
+        assert wav[36:40] == b"data"
+        assert wav[44:] == pcm
+
+    def test_header_size_is_44(self):
+        from tools.tts_tool import _wrap_pcm_as_wav
+
+        pcm = b"\xff" * 100
+        wav = _wrap_pcm_as_wav(pcm)
+        assert len(wav) == 44 + len(pcm)
+
+
+class TestGenerateGeminiTts:
+    def test_missing_api_key_raises_value_error(self, tmp_path):
+        from tools.tts_tool import _generate_gemini_tts
+
+        output_path = str(tmp_path / "test.wav")
+        with pytest.raises(ValueError, match="GEMINI_API_KEY"):
+            _generate_gemini_tts("Hello", output_path, {})
+
+    def test_google_api_key_fallback(self, tmp_path, monkeypatch, mock_gemini_response):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GOOGLE_API_KEY", "from-google-env")
+        output_path = str(tmp_path / "test.wav")
+
+        with patch("requests.post", return_value=mock_gemini_response) as mock_post:
+            _generate_gemini_tts("Hi", output_path, {})
+
+        # Confirm it used the GOOGLE_API_KEY as the query parameter
+        _, kwargs = mock_post.call_args
+        assert kwargs["params"]["key"] == "from-google-env"
+
+    def test_wav_output_fast_path(self, tmp_path, monkeypatch, mock_gemini_response, fake_pcm_bytes):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        output_path = str(tmp_path / "test.wav")
+
+        with patch("requests.post", return_value=mock_gemini_response):
+            result = _generate_gemini_tts("Hi", output_path, {})
+
+        assert result == output_path
+        data = (tmp_path / "test.wav").read_bytes()
+        assert data[:4] == b"RIFF"
+        assert data[8:12] == b"WAVE"
+        # Audio payload should match the PCM we put in
+        assert data[44:] == fake_pcm_bytes
+
+    def test_default_voice_and_model(self, tmp_path, monkeypatch, mock_gemini_response):
+        from tools.tts_tool import (
+            DEFAULT_GEMINI_TTS_MODEL,
+            DEFAULT_GEMINI_TTS_VOICE,
+            _generate_gemini_tts,
+        )
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+
+        with patch("requests.post", return_value=mock_gemini_response) as mock_post:
+            _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), {})
+
+        args, kwargs = mock_post.call_args
+        assert DEFAULT_GEMINI_TTS_MODEL in args[0]
+        payload = kwargs["json"]
+        voice = (
+            payload["generationConfig"]["speechConfig"]["voiceConfig"]
+            ["prebuiltVoiceConfig"]["voiceName"]
+        )
+        assert voice == DEFAULT_GEMINI_TTS_VOICE
+
+    def test_custom_voice(self, tmp_path, monkeypatch, mock_gemini_response):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        config = {"gemini": {"voice": "Puck"}}
+
+        with patch("requests.post", return_value=mock_gemini_response) as mock_post:
+            _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), config)
+
+        payload = mock_post.call_args[1]["json"]
+        voice = (
+            payload["generationConfig"]["speechConfig"]["voiceConfig"]
+            ["prebuiltVoiceConfig"]["voiceName"]
+        )
+        assert voice == "Puck"
+
+    def test_custom_model(self, tmp_path, monkeypatch, mock_gemini_response):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        config = {"gemini": {"model": "gemini-2.5-pro-preview-tts"}}
+
+        with patch("requests.post", return_value=mock_gemini_response) as mock_post:
+            _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), config)
+
+        endpoint = mock_post.call_args[0][0]
+        assert "gemini-2.5-pro-preview-tts" in endpoint
+
+    def test_response_modality_is_audio(self, tmp_path, monkeypatch, mock_gemini_response):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+
+        with patch("requests.post", return_value=mock_gemini_response) as mock_post:
+            _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), {})
+
+        payload = mock_post.call_args[1]["json"]
+        assert payload["generationConfig"]["responseModalities"] == ["AUDIO"]
+
+    def test_http_error_raises_runtime_error(self, tmp_path, monkeypatch):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        err_resp = MagicMock()
+        err_resp.status_code = 400
+        err_resp.json.return_value = {"error": {"message": "Invalid voice"}}
+
+        with patch("requests.post", return_value=err_resp):
+            with pytest.raises(RuntimeError, match="HTTP 400.*Invalid voice"):
+                _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), {})
+
+    def test_empty_audio_raises(self, tmp_path, monkeypatch):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        resp = MagicMock()
+        resp.status_code = 200
+        resp.json.return_value = {
+            "candidates": [
+                {"content": {"parts": [{"inlineData": {"data": ""}}]}}
+            ]
+        }
+
+        with patch("requests.post", return_value=resp):
+            with pytest.raises(RuntimeError, match="empty audio"):
+                _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), {})
+
+    def test_malformed_response_raises(self, tmp_path, monkeypatch):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        resp = MagicMock()
+        resp.status_code = 200
+        resp.json.return_value = {"candidates": []}  # empty list, so indexing [0] fails
+
+        with patch("requests.post", return_value=resp):
+            with pytest.raises(RuntimeError, match="malformed"):
+                _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), {})
+
+    def test_snake_case_inline_data_accepted(self, tmp_path, monkeypatch, fake_pcm_bytes):
+        """Some Gemini SDK versions return inline_data instead of inlineData."""
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        resp = MagicMock()
+        resp.status_code = 200
+        resp.json.return_value = {
+            "candidates": [
+                {
+                    "content": {
+                        "parts": [
+                            {
+                                "inline_data": {
+                                    "data": base64.b64encode(fake_pcm_bytes).decode()
+                                }
+                            }
+                        ]
+                    }
+                }
+            ]
+        }
+
+        output_path = str(tmp_path / "test.wav")
+        with patch("requests.post", return_value=resp):
+            _generate_gemini_tts("Hi", output_path, {})
+
+        data = (tmp_path / "test.wav").read_bytes()
+        assert data[:4] == b"RIFF"
+
+    def test_custom_base_url_env(self, tmp_path, monkeypatch, mock_gemini_response):
+        from tools.tts_tool import _generate_gemini_tts
+
+        monkeypatch.setenv("GEMINI_API_KEY", "test-key")
+        monkeypatch.setenv("GEMINI_BASE_URL", "https://custom-gemini.example.com/v1beta")
+
+        with patch("requests.post", return_value=mock_gemini_response) as mock_post:
+            _generate_gemini_tts("Hi", str(tmp_path / "test.wav"), {})
+
+        assert mock_post.call_args[0][0].startswith("https://custom-gemini.example.com/v1beta/")
+
+
+class TestGeminiInCheckRequirements:
+    def test_gemini_api_key_satisfies_requirements(self, monkeypatch):
+        from tools.tts_tool import check_tts_requirements
+
+        # Strip everything else
+        for key in (
+            "ELEVENLABS_API_KEY",
+            "OPENAI_API_KEY",
+            "VOICE_TOOLS_OPENAI_KEY",
+            "MINIMAX_API_KEY",
+            "XAI_API_KEY",
+            "MISTRAL_API_KEY",
+            "GOOGLE_API_KEY",
+        ):
+            monkeypatch.delenv(key, raising=False)
+        monkeypatch.setenv("GEMINI_API_KEY", "k")
+
+        # Force edge_tts import to fail so we actually hit the gemini check
+        import builtins
+
+        real_import = builtins.__import__
+
+        def fake_import(name, *args, **kwargs):
+            if name == "edge_tts":
+                raise ImportError("simulated")
+            return real_import(name, *args, **kwargs)
+
+        with patch("builtins.__import__", side_effect=fake_import):
+            assert check_tts_requirements() is True
@@ -14,6 +14,7 @@ import os
 import re
 import sys
 import threading
+import time
 import unicodedata
 from typing import Optional

@@ -834,13 +835,43 @@ def check_all_command_guards(command: str, env_type: str,
                    "description": combined_desc,
                }

-            # Block until the user responds or timeout (default 5 min)
+            # Block until the user responds or timeout (default 5 min).
+            # Poll in short slices so we can fire activity heartbeats every
+            # ~10s to the agent's inactivity tracker.  Without this, the
+            # blocking event.wait() never touches activity, and the
+            # gateway's inactivity watchdog (agent.gateway_timeout, default
+            # 1800s) kills the agent while the user is still responding to
+            # the approval prompt.  Mirrors the _wait_for_process() cadence
+            # in tools/environments/base.py.
            timeout = _get_approval_config().get("gateway_timeout", 300)
            try:
                timeout = int(timeout)
            except (ValueError, TypeError):
                timeout = 300
-            resolved = entry.event.wait(timeout=timeout)
+
+            try:
+                from tools.environments.base import touch_activity_if_due
+            except Exception:  # pragma: no cover
+                touch_activity_if_due = None
+
+            _now = time.monotonic()
+            _deadline = _now + max(timeout, 0)
+            _activity_state = {"last_touch": _now, "start": _now}
+            resolved = False
+            while True:
+                _remaining = _deadline - time.monotonic()
+                if _remaining <= 0:
+                    break
+                # 1s poll slice — the event is set immediately when the
+                # user responds, so slice length only controls heartbeat
+                # cadence, not user-visible responsiveness.
+                if entry.event.wait(timeout=min(1.0, _remaining)):
+                    resolved = True
+                    break
+                if touch_activity_if_due is not None:
+                    touch_activity_if_due(
+                        _activity_state, "waiting for user approval"
+                    )

            # Clean up this entry from the queue
            with _lock:
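Note on the approval wait loop above: the slice-and-heartbeat pattern can be sketched on its own. This is a minimal illustration, not the Hermes implementation. `wait_with_heartbeat` and its `on_heartbeat` callback are hypothetical stand-ins for the loop and for `touch_activity_if_due`, whose real signature lives in `tools/environments/base.py` and is not shown in this diff.

```python
import threading
import time


def wait_with_heartbeat(event, timeout, on_heartbeat,
                        poll_slice=1.0, heartbeat_every=10.0):
    """Wait for `event` up to `timeout` seconds, calling `on_heartbeat`
    roughly every `heartbeat_every` seconds while still waiting.

    Returns True if the event was set, False on timeout.
    """
    now = time.monotonic()
    deadline = now + max(timeout, 0)
    last_beat = now
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Event.wait returns as soon as the event is set, so the slice
        # length bounds heartbeat latency, not responsiveness.
        if event.wait(timeout=min(poll_slice, remaining)):
            return True
        if time.monotonic() - last_beat >= heartbeat_every:
            on_heartbeat()
            last_beat = time.monotonic()


# Demo with short intervals: the "user" responds after 0.2 s.
approval = threading.Event()
beats = []
threading.Timer(0.2, approval.set).start()
resolved = wait_with_heartbeat(
    approval, timeout=5.0,
    on_heartbeat=lambda: beats.append(time.monotonic()),
    poll_slice=0.01, heartbeat_every=0.03,
)
```

Using `time.monotonic()` for the deadline keeps the loop immune to wall-clock adjustments, which is the same choice the patch makes.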
@@ -126,7 +126,22 @@ def _shadow_repo_path(working_dir: str) -> Path:


 def _git_env(shadow_repo: Path, working_dir: str) -> dict:
-    """Build env dict that redirects git to the shadow repo."""
+    """Build env dict that redirects git to the shadow repo.
+
+    The shadow repo is internal Hermes infrastructure — it must NOT inherit
+    the user's global or system git config.  User-level settings like
+    ``commit.gpgsign = true``, signing hooks, or credential helpers would
+    either break background snapshots or, worse, spawn interactive prompts
+    (pinentry GUI windows) mid-session every time a file is written.
+
+    Isolation strategy:
+    * ``GIT_CONFIG_GLOBAL=<os.devnull>`` — ignore ``~/.gitconfig`` (git 2.32+).
+    * ``GIT_CONFIG_SYSTEM=<os.devnull>`` — ignore ``/etc/gitconfig`` (git 2.32+).
+    * ``GIT_CONFIG_NOSYSTEM=1`` — legacy belt-and-suspenders for older git.
+
+    The shadow repo still has its own per-repo config (user.email, user.name,
+    commit.gpgsign=false) set in ``_init_shadow_repo``.
+    """
    normalized_working_dir = _normalize_path(working_dir)
    env = os.environ.copy()
    env["GIT_DIR"] = str(shadow_repo)
@@ -134,6 +149,13 @@ def _git_env(shadow_repo: Path, working_dir: str) -> dict:
    env.pop("GIT_INDEX_FILE", None)
    env.pop("GIT_NAMESPACE", None)
    env.pop("GIT_ALTERNATE_OBJECT_DIRECTORIES", None)
+    # Isolate the shadow repo from the user's global/system git config.
+    # Prevents commit.gpgsign, hooks, aliases, credential helpers, etc. from
+    # leaking into background snapshots.  Uses os.devnull for cross-platform
+    # support (``/dev/null`` on POSIX, ``nul`` on Windows).
+    env["GIT_CONFIG_GLOBAL"] = os.devnull
+    env["GIT_CONFIG_SYSTEM"] = os.devnull
+    env["GIT_CONFIG_NOSYSTEM"] = "1"
    return env


@@ -211,6 +233,13 @@ def _init_shadow_repo(shadow_repo: Path, working_dir: str) -> Optional[str]:

    _run_git(["config", "user.email", "hermes@local"], shadow_repo, working_dir)
    _run_git(["config", "user.name", "Hermes Checkpoint"], shadow_repo, working_dir)
+    # Explicitly disable commit/tag signing in the shadow repo.  _git_env
+    # already isolates from the user's global config, but writing these into
+    # the shadow's own config is belt-and-suspenders — it guarantees the
+    # shadow repo is correct even if someone inspects or runs git against it
+    # directly (without the GIT_CONFIG_* env vars).
+    _run_git(["config", "commit.gpgsign", "false"], shadow_repo, working_dir)
+    _run_git(["config", "tag.gpgSign", "false"], shadow_repo, working_dir)

    info_dir = shadow_repo / "info"
    info_dir.mkdir(exist_ok=True)
@@ -552,9 +581,11 @@ class CheckpointManager:
            logger.debug("Checkpoint skipped: no changes in %s", working_dir)
            return False

-        # Commit
+        # Commit.  ``--no-gpg-sign`` inline covers shadow repos created before
+        # the commit.gpgsign=false config was added to _init_shadow_repo — so
+        # users with existing checkpoints never hit a GPG pinentry popup.
        ok, _, err = _run_git(
-            ["commit", "-m", reason, "--allow-empty-message"],
+            ["commit", "-m", reason, "--allow-empty-message", "--no-gpg-sign"],
            shadow, working_dir, timeout=_GIT_TIMEOUT * 2,
        )
        if not ok:
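Note on the config isolation in `_git_env`: the environment construction can be sketched as a standalone function. This is an illustration under assumptions: `isolated_git_env` is a hypothetical name, and setting `GIT_WORK_TREE` is assumed here because that line of `_git_env` falls outside the visible hunks.

```python
import os


def isolated_git_env(git_dir, work_tree, base_env=None):
    """Build an env dict for git that ignores user and system config."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_DIR"] = git_dir
    env["GIT_WORK_TREE"] = work_tree  # assumption, see lead-in
    # Drop variables that could redirect git to other state.
    for key in ("GIT_INDEX_FILE", "GIT_NAMESPACE",
                "GIT_ALTERNATE_OBJECT_DIRECTORIES"):
        env.pop(key, None)
    # git 2.32+: point global and system config at the null device.
    env["GIT_CONFIG_GLOBAL"] = os.devnull
    env["GIT_CONFIG_SYSTEM"] = os.devnull
    # Older git: at least skip the system-wide config file.
    env["GIT_CONFIG_NOSYSTEM"] = "1"
    return env


env = isolated_git_env(
    "/tmp/shadow", "/tmp/work",
    base_env={"PATH": "/usr/bin", "GIT_INDEX_FILE": "stale-index"},
)
```

The resulting dict would be passed as `env=` to `subprocess.run`. Per-repo config inside the shadow repo is still read, which is why the patch also writes `commit.gpgsign=false` there.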
@@ -2,12 +2,13 @@
 """
 Text-to-Speech Tool Module

-Supports six TTS providers:
+Supports seven TTS providers:
 - Edge TTS (default, free, no API key): Microsoft Edge neural voices
 - ElevenLabs (premium): High-quality voices, needs ELEVENLABS_API_KEY
 - OpenAI TTS: Good quality, needs OPENAI_API_KEY
 - MiniMax TTS: High-quality with voice cloning, needs MINIMAX_API_KEY
 - Mistral (Voxtral TTS): Multilingual, native Opus, needs MISTRAL_API_KEY
+- Google Gemini TTS: Controllable, 30 prebuilt voices, needs GEMINI_API_KEY (or GOOGLE_API_KEY)
 - NeuTTS (local, free, no API key): On-device TTS via neutts_cli, needs neutts installed

 Output formats:
@@ -99,6 +100,13 @@ DEFAULT_XAI_LANGUAGE = "en"
 DEFAULT_XAI_SAMPLE_RATE = 24000
 DEFAULT_XAI_BIT_RATE = 128000
 DEFAULT_XAI_BASE_URL = "https://api.x.ai/v1"
+DEFAULT_GEMINI_TTS_MODEL = "gemini-2.5-flash-preview-tts"
+DEFAULT_GEMINI_TTS_VOICE = "Kore"
+DEFAULT_GEMINI_TTS_BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
+# PCM output specs for Gemini TTS (fixed by the API)
+GEMINI_TTS_SAMPLE_RATE = 24000
+GEMINI_TTS_CHANNELS = 1
+GEMINI_TTS_SAMPLE_WIDTH = 2  # 16-bit PCM (L16)

 def _get_default_output_dir() -> str:
    from hermes_constants import get_hermes_dir
@@ -506,6 +514,174 @@ def _generate_mistral_tts(text: str, output_path: str, tts_config: Dict[str, Any
    return output_path


+# ===========================================================================
+# Provider: Google Gemini TTS
+# ===========================================================================
+def _wrap_pcm_as_wav(
+    pcm_bytes: bytes,
+    sample_rate: int = GEMINI_TTS_SAMPLE_RATE,
+    channels: int = GEMINI_TTS_CHANNELS,
+    sample_width: int = GEMINI_TTS_SAMPLE_WIDTH,
+) -> bytes:
+    """Wrap raw signed-little-endian PCM with a standard WAV RIFF header.
+
+    Gemini TTS returns audio/L16;codec=pcm;rate=24000 -- raw PCM samples with
+    no container. We add a minimal WAV header so the file is playable and
+    ffmpeg can re-encode it to MP3/Opus downstream.
+    """
+    import struct
+
+    byte_rate = sample_rate * channels * sample_width
+    block_align = channels * sample_width
+    data_size = len(pcm_bytes)
+    fmt_chunk = struct.pack(
+        "<4sIHHIIHH",
+        b"fmt ",
+        16,             # fmt chunk size (PCM)
+        1,              # audio format (PCM)
+        channels,
+        sample_rate,
+        byte_rate,
+        block_align,
+        sample_width * 8,
+    )
+    data_chunk_header = struct.pack("<4sI", b"data", data_size)
+    riff_size = 4 + len(fmt_chunk) + len(data_chunk_header) + data_size
+    riff_header = struct.pack("<4sI4s", b"RIFF", riff_size, b"WAVE")
+    return riff_header + fmt_chunk + data_chunk_header + pcm_bytes
+
+
+def _generate_gemini_tts(text: str, output_path: str, tts_config: Dict[str, Any]) -> str:
+    """Generate audio using Google Gemini TTS.
+
+    Gemini's generateContent endpoint with responseModalities=["AUDIO"] returns
+    raw 24kHz mono 16-bit PCM (L16) as base64. We wrap it with a WAV RIFF
+    header to produce a playable file, then ffmpeg-convert to MP3 / Opus if
+    the caller requested those formats (same pattern as NeuTTS).
+
+    Args:
+        text: Text to convert (prompt-style; supports inline direction like
+              "Say cheerfully:" and audio tags like [whispers]).
+        output_path: Where to save the audio file (.wav, .mp3, or .ogg).
+        tts_config: TTS config dict.
+
+    Returns:
+        Path to the saved audio file.
+    """
+    import requests
+
+    api_key = (os.getenv("GEMINI_API_KEY") or os.getenv("GOOGLE_API_KEY") or "").strip()
+    if not api_key:
+        raise ValueError(
+            "GEMINI_API_KEY not set (GOOGLE_API_KEY also accepted). Get one at https://aistudio.google.com/app/apikey"
+        )
+
+    gemini_config = tts_config.get("gemini", {})
+    model = str(gemini_config.get("model", DEFAULT_GEMINI_TTS_MODEL)).strip() or DEFAULT_GEMINI_TTS_MODEL
+    voice = str(gemini_config.get("voice", DEFAULT_GEMINI_TTS_VOICE)).strip() or DEFAULT_GEMINI_TTS_VOICE
+    base_url = str(
+        gemini_config.get("base_url")
+        or os.getenv("GEMINI_BASE_URL")
+        or DEFAULT_GEMINI_TTS_BASE_URL
+    ).strip().rstrip("/")
+
+    payload: Dict[str, Any] = {
+        "contents": [{"parts": [{"text": text}]}],
+        "generationConfig": {
+            "responseModalities": ["AUDIO"],
+            "speechConfig": {
+                "voiceConfig": {
+                    "prebuiltVoiceConfig": {"voiceName": voice},
+                },
+            },
+        },
+    }
+
+    endpoint = f"{base_url}/models/{model}:generateContent"
+    response = requests.post(
+        endpoint,
+        params={"key": api_key},
+        headers={"Content-Type": "application/json"},
+        json=payload,
+        timeout=60,
+    )
+    if response.status_code != 200:
+        # Surface the API error message when present
+        try:
+            err = response.json().get("error", {})
+            detail = err.get("message") or response.text[:300]
+        except Exception:
+            detail = response.text[:300]
+        raise RuntimeError(
+            f"Gemini TTS API error (HTTP {response.status_code}): {detail}"
+        )
+
+    try:
+        data = response.json()
+        parts = data["candidates"][0]["content"]["parts"]
+        audio_part = next((p for p in parts if "inlineData" in p or "inline_data" in p), None)
+        if audio_part is None:
+            raise RuntimeError("Gemini TTS response contained no audio data")
+        inline = audio_part.get("inlineData") or audio_part.get("inline_data") or {}
+        audio_b64 = inline.get("data", "")
+    except (KeyError, IndexError, TypeError) as e:
+        raise RuntimeError(f"Gemini TTS response was malformed: {e}") from e
+
+    if not audio_b64:
+        raise RuntimeError("Gemini TTS returned empty audio data")
+
+    pcm_bytes = base64.b64decode(audio_b64)
+    wav_bytes = _wrap_pcm_as_wav(pcm_bytes)
+
+    # Fast path: caller wants WAV directly, just write.
+    if output_path.lower().endswith(".wav"):
+        with open(output_path, "wb") as f:
+            f.write(wav_bytes)
+        return output_path
+
+    # Otherwise write WAV to a temp file and ffmpeg-convert to the target
+    # format (.mp3 or .ogg). If ffmpeg is missing, fall back to copying the
+    # WAV bytes to output_path unchanged -- this matches the NeuTTS behavior
+    # and keeps the tool usable on systems without ffmpeg (audio still plays,
+    # just with a misleading extension).
+    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
+        tmp.write(wav_bytes)
+        wav_path = tmp.name
+
+    try:
+        ffmpeg = shutil.which("ffmpeg")
+        if ffmpeg:
+            # For .ogg output, force libopus encoding (Telegram voice bubbles
+            # require Opus specifically; ffmpeg's default for .ogg is Vorbis).
+            if output_path.lower().endswith(".ogg"):
+                cmd = [
+                    ffmpeg, "-i", wav_path,
+                    "-acodec", "libopus", "-ac", "1",
+                    "-b:a", "64k", "-vbr", "off",
+                    "-y", "-loglevel", "error",
+                    output_path,
+                ]
+            else:
+                cmd = [ffmpeg, "-i", wav_path, "-y", "-loglevel", "error", output_path]
+            result = subprocess.run(cmd, capture_output=True, timeout=30)
+            if result.returncode != 0:
+                stderr = result.stderr.decode("utf-8", errors="ignore")[:300]
+                raise RuntimeError(f"ffmpeg conversion failed: {stderr}")
+        else:
+            logger.warning(
+                "ffmpeg not found; writing raw WAV to %s (extension may be misleading)",
+                output_path,
+            )
+            shutil.copyfile(wav_path, output_path)
+    finally:
+        try:
+            os.remove(wav_path)
+        except OSError:
+            pass
+
+    return output_path
+
+
 # ===========================================================================
 # NeuTTS (local, on-device TTS via neutts_cli)
 # ===========================================================================
@@ -634,7 +810,7 @@ def text_to_speech_tool(
        out_dir.mkdir(parents=True, exist_ok=True)
        # Use .ogg for Telegram with providers that support native Opus output,
        # otherwise fall back to .mp3 (Edge TTS will attempt ffmpeg conversion later).
-        if want_opus and provider in ("openai", "elevenlabs", "mistral"):
+        if want_opus and provider in ("openai", "elevenlabs", "mistral", "gemini"):
            file_path = out_dir / f"tts_{timestamp}.ogg"
        else:
            file_path = out_dir / f"tts_{timestamp}.mp3"
@@ -687,6 +863,10 @@ def text_to_speech_tool(
            logger.info("Generating speech with Mistral Voxtral TTS...")
            _generate_mistral_tts(text, file_str, tts_config)

+        elif provider == "gemini":
+            logger.info("Generating speech with Google Gemini TTS...")
+            _generate_gemini_tts(text, file_str, tts_config)
+
        elif provider == "neutts":
            if not _check_neutts_available():
                return json.dumps({
@@ -741,7 +921,7 @@ def text_to_speech_tool(
            if opus_path:
                file_str = opus_path
                voice_compatible = True
-        elif provider in ("elevenlabs", "openai", "mistral"):
+        elif provider in ("elevenlabs", "openai", "mistral", "gemini"):
            voice_compatible = file_str.endswith(".ogg")

        file_size = os.path.getsize(file_str)
@@ -811,6 +991,8 @@ def check_tts_requirements() -> bool:
        return True
    if os.getenv("XAI_API_KEY"):
        return True
+    if os.getenv("GEMINI_API_KEY") or os.getenv("GOOGLE_API_KEY"):
+        return True
    try:
        _import_mistral_client()
        if os.getenv("MISTRAL_API_KEY"):
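Note on `_wrap_pcm_as_wav`: the hand-packed header can be cross-checked against Python's stdlib `wave` module, which writes the same canonical 44-byte PCM header. The sketch below is an independent illustration, not a proposed replacement; `pcm_to_wav` is a hypothetical name, and the defaults mirror the `GEMINI_TTS_*` constants from the patch.

```python
import io
import struct
import wave


def pcm_to_wav(pcm, rate=24000, channels=1, width=2):
    """Wrap raw little-endian PCM in a WAV container using stdlib `wave`."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(width)   # bytes per sample: 2 means 16-bit
        w.setframerate(rate)
        w.writeframes(pcm)
    return buf.getvalue()


# Four 16-bit samples, little-endian, as stand-in audio.
pcm = struct.pack("<4h", 0, 1000, -1000, 0)
wav = pcm_to_wav(pcm)

# The RIFF size field covers everything after the first 8 bytes:
# "WAVE" (4) + fmt chunk (24) + data chunk header (8) + payload.
riff_size = struct.unpack("<I", wav[4:8])[0]
```

This layout is why `test_wav_output_fast_path` can compare `data[44:]` directly against the PCM payload.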
@@ -17,10 +17,10 @@ export function LanguageSwitcher() {
      title={t.language.switchTo}
      aria-label={t.language.switchTo}
    >
-      {/* Show the *other* language's flag as the clickable target */}
-      <span className="text-base leading-none">{locale === "en" ? "🇨🇳" : "🇬🇧"}</span>
+      {/* Show the *current* language's flag — tooltip advertises the click action */}
+      <span className="text-base leading-none">{locale === "en" ? "🇬🇧" : "🇨🇳"}</span>
      <span className="hidden sm:inline font-display tracking-wide uppercase text-[0.65rem]">
-        {locale === "en" ? "中文" : "EN"}
+        {locale === "en" ? "EN" : "中文"}
      </span>
    </button>
  );
@@ -213,6 +213,7 @@ export interface StatusResponse {
  config_version: number;
  env_path: string;
  gateway_exit_reason: string | null;
+  gateway_health_url: string | null;
  gateway_pid: number | null;
  gateway_platforms: Record<string, PlatformStatus>;
  gateway_running: boolean;
@@ -53,6 +53,7 @@ export default function StatusPage() {
  };

  function gatewayValue(): string {
+    if (status!.gateway_running && status!.gateway_health_url) return status!.gateway_health_url;
    if (status!.gateway_running && status!.gateway_pid) return `${t.status.pid} ${status!.gateway_pid}`;
    if (status!.gateway_running) return t.status.runningRemote;
    if (status!.gateway_state === "startup_failed") return t.status.startFailed;
@@ -137,14 +138,14 @@ export default function StatusPage() {

      <div className="grid gap-4 sm:grid-cols-3">
        {items.map(({ icon: Icon, label, value, badgeText, badgeVariant }) => (
-          <Card key={label}>
+          <Card key={label} className="min-w-0 overflow-hidden">
            <CardHeader className="flex flex-row items-center justify-between pb-2">
              <CardTitle className="text-sm font-medium">{label}</CardTitle>
              <Icon className="h-4 w-4 text-muted-foreground" />
            </CardHeader>

            <CardContent>
-              <div className="text-2xl font-bold font-display">{value}</div>
+              <div className="text-2xl font-bold font-display truncate" title={value}>{value}</div>

              {badgeText && (
                <Badge variant={badgeVariant} className="mt-2">
@@ -186,18 +186,18 @@ Skills can declare non-secret settings that are stored in `config.yaml` under th
 metadata:
  hermes:
    config:
-      - key: wiki.path
-        description: Path to the LLM Wiki knowledge base directory
-        default: "~/wiki"
-        prompt: Wiki directory path
-      - key: wiki.domain
-        description: Domain the wiki covers
+      - key: myplugin.path
+        description: Path to the plugin data directory
+        default: "~/myplugin-data"
+        prompt: Plugin data directory path
+      - key: myplugin.domain
+        description: Domain the plugin operates on
        default: ""
-        prompt: Wiki domain (e.g., AI/ML research)
+        prompt: Plugin domain (e.g., AI/ML research)
 ```

 Each entry supports:
-- `key` (required) — dotpath for the setting (e.g., `wiki.path`)
+- `key` (required) — dotpath for the setting (e.g., `myplugin.path`)
 - `description` (required) — explains what the setting controls
 - `default` (optional) — default value if the user doesn't configure it
 - `prompt` (optional) — prompt text shown during `hermes config migrate`; falls back to `description`
@@ -208,8 +208,8 @@ Each entry supports:
   ```yaml
   skills:
     config:
-       wiki:
-         path: ~/my-research
+       myplugin:
+         path: ~/my-data
   ```

 2. **Discovery:** `hermes config migrate` scans all enabled skills, finds unconfigured settings, and prompts the user. Settings also appear in `hermes config show` under "Skill Settings."
@@ -217,14 +217,14 @@ Each entry supports:
 3. **Runtime injection:** When a skill loads, its config values are resolved and appended to the skill message:
   ```
   [Skill config (from ~/.hermes/config.yaml):
-     wiki.path = /home/user/my-research
+     myplugin.path = /home/user/my-data
   ]
   ```
   The agent sees the configured values without needing to read `config.yaml` itself.

 4. **Manual setup:** Users can also set values directly:
   ```bash
-   hermes config set skills.config.wiki.path ~/my-wiki
+   hermes config set skills.config.myplugin.path ~/my-data
   ```

 :::tip When to use which
@@ -35,12 +35,99 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
 | **DeepSeek** | `DEEPSEEK_API_KEY` in `~/.hermes/.env` (provider: `deepseek`) |
 | **Hugging Face** | `HF_TOKEN` in `~/.hermes/.env` (provider: `huggingface`, aliases: `hf`) |
 | **Google / Gemini** | `GOOGLE_API_KEY` (or `GEMINI_API_KEY`) in `~/.hermes/.env` (provider: `gemini`) |
+| **Google Gemini (OAuth)** | `hermes model` → "Google Gemini (OAuth)" (provider: `google-gemini-cli`, free tier supported, browser PKCE login) |
 | **Custom Endpoint** | `hermes model` → choose "Custom endpoint" (saved in `config.yaml`) |

 :::tip Model key alias
 In the `model:` config section, you can use either `default:` or `model:` as the key name for your model ID. Both `model: { default: my-model }` and `model: { model: my-model }` work identically.
 :::

+
+### Google Gemini via OAuth (`google-gemini-cli`)
+
+The `google-gemini-cli` provider uses Google's Cloud Code Assist backend — the
+same API that Google's own `gemini-cli` tool uses. This supports both the
+**free tier** (generous daily quota for personal accounts) and **paid tiers**
+(Standard/Enterprise via a GCP project).
+
+**Quick start:**
+
+```bash
+hermes model
+# → pick "Google Gemini (OAuth)"
+# → see policy warning, confirm
+# → browser opens to accounts.google.com, sign in
+# → done — Hermes auto-provisions your free tier on first request
+```
+
+Hermes ships Google's **public** `gemini-cli` desktop OAuth client by default —
+the same credentials Google includes in their open-source `gemini-cli`. Desktop
+OAuth clients are not confidential (PKCE provides the security). You do not
+need to install `gemini-cli` or register your own GCP OAuth client.
+
+**How auth works:**
+- PKCE Authorization Code flow against `accounts.google.com`
+- Browser callback at `http://127.0.0.1:8085/oauth2callback` (with ephemeral-port fallback if busy)
+- Tokens stored at `~/.hermes/auth/google_oauth.json` (chmod 0600, atomic write, cross-process `fcntl` lock)
+- Automatic refresh 60 s before expiry
+- Headless environments (SSH, `HERMES_HEADLESS=1`) → paste-mode fallback
+- Inflight refresh deduplication — two concurrent requests won't double-refresh
+- `invalid_grant` (revoked refresh) → credential file wiped, user prompted to re-login
+
+**How inference works:**
+- Traffic goes to `https://cloudcode-pa.googleapis.com/v1internal:generateContent`
+  (or `:streamGenerateContent?alt=sse` for streaming), NOT the paid `v1beta/openai` endpoint
+- Request body is wrapped as `{project, model, user_prompt_id, request}`
+- OpenAI-shaped `messages[]`, `tools[]`, `tool_choice` are translated to Gemini's native
+  `contents[]`, `tools[].functionDeclarations`, `toolConfig` shape
+- Responses translated back to OpenAI shape so the rest of Hermes works unchanged
+
+**Tiers & project IDs:**
+
+| Your situation | What to do |
+|---|---|
+| Personal Google account, want free tier | Nothing — sign in, start chatting |
+| Workspace / Standard / Enterprise account | Set `HERMES_GEMINI_PROJECT_ID` or `GOOGLE_CLOUD_PROJECT` to your GCP project ID |
+| VPC-SC-protected org | Hermes detects `SECURITY_POLICY_VIOLATED` and forces `standard-tier` automatically |
+
+Free tier auto-provisions a Google-managed project on first use. No GCP setup required.
+
+**Quota monitoring:**
+
+```
+/gquota
+```
+
+Shows remaining Code Assist quota per model with progress bars:
+
+```
+Gemini Code Assist quota  (project: 123-abc)
+
+  gemini-2.5-pro                      ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░   85%
+  gemini-2.5-flash [input]            ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░   92%
+```
+
+:::warning Policy risk
+Google considers using the Gemini CLI OAuth client with third-party software a
+policy violation. Some users have reported account restrictions. For the lowest-risk
+experience, use your own API key via the `gemini` provider instead. Hermes shows
+an upfront warning and requires explicit confirmation before OAuth begins.
+:::
+
+**Custom OAuth client (optional):**
+
+If you'd rather register your own Google OAuth client — e.g., to keep quota
+and consent scoped to your own GCP project — set:
+
+```bash
+HERMES_GEMINI_CLIENT_ID=your-client.apps.googleusercontent.com
+HERMES_GEMINI_CLIENT_SECRET=...   # optional for Desktop clients
+```
+
+Register a **Desktop app** OAuth client at
+[console.cloud.google.com/apis/credentials](https://console.cloud.google.com/apis/credentials)
+with the Generative Language API enabled.
+
 :::info Codex Note
 The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Hermes stores the resulting credentials in its own auth store under `~/.hermes/auth.json` and can import existing Codex CLI credentials from `~/.codex/auth.json` when present. No Codex CLI installation is required.
 :::
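Note on the "How inference works" list in the provider docs: the message translation and request envelope can be sketched roughly as follows. This is a simplified illustration and not the adapter's actual code. It handles plain-text `system`, `user`, and `assistant` messages only; tool calls are omitted, the function names are hypothetical, and generating `user_prompt_id` from a UUID is an assumption.

```python
import uuid

# OpenAI role names mapped to Gemini's native role names.
_ROLE_MAP = {"user": "user", "assistant": "model"}


def to_gemini_request(messages):
    """Translate OpenAI-shaped text messages to a Gemini-style request."""
    system_texts = []
    contents = []
    for msg in messages:
        text = msg.get("content") or ""
        if msg["role"] == "system":
            # Gemini carries system prompts outside contents[].
            system_texts.append(text)
            continue
        # Other roles (e.g. "tool") raise KeyError in this sketch.
        contents.append({
            "role": _ROLE_MAP[msg["role"]],
            "parts": [{"text": text}],
        })
    request = {"contents": contents}
    if system_texts:
        request["systemInstruction"] = {
            "parts": [{"text": "\n".join(system_texts)}]
        }
    return request


def wrap_for_code_assist(project, model, request):
    """Build the {project, model, user_prompt_id, request} envelope."""
    return {
        "project": project,
        "model": model,
        "user_prompt_id": uuid.uuid4().hex,  # assumption: any unique id
        "request": request,
    }


body = wrap_for_code_assist(
    "123-abc", "gemini-2.5-flash",
    to_gemini_request([
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
    ]),
)
```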
@@ -47,6 +47,9 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
 | `GOOGLE_API_KEY` | Google AI Studio API key ([aistudio.google.com/app/apikey](https://aistudio.google.com/app/apikey)) |
 | `GEMINI_API_KEY` | Alias for `GOOGLE_API_KEY` |
 | `GEMINI_BASE_URL` | Override Google AI Studio base URL |
+| `HERMES_GEMINI_CLIENT_ID` | OAuth client ID for `google-gemini-cli` PKCE login (optional; defaults to Google's public gemini-cli client) |
+| `HERMES_GEMINI_CLIENT_SECRET` | OAuth client secret for `google-gemini-cli` (optional) |
+| `HERMES_GEMINI_PROJECT_ID` | GCP project ID for paid Gemini tiers (free tier auto-provisions) |
 | `ANTHROPIC_API_KEY` | Anthropic Console API key ([console.anthropic.com](https://console.anthropic.com/)) |
 | `ANTHROPIC_TOKEN` | Manual or legacy Anthropic OAuth/setup-token override |
 | `DASHSCOPE_API_KEY` | Alibaba Cloud DashScope API key for Qwen models ([modelstudio.console.alibabacloud.com](https://modelstudio.console.alibabacloud.com/)) |
@@ -253,7 +253,7 @@ Skills for academic research, paper discovery, literature review, domain reconna
 |-------|-------------|------|
 | `arxiv` | Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content. | `research/arxiv` |
 | `blogwatcher` | Monitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI. Add blogs, scan for new articles, and track what you've read. | `research/blogwatcher` |
-| `llm-wiki` | Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base. Ingest sources, query compiled knowledge, and lint for consistency. Unlike RAG, the wiki compiles knowledge once and keeps it current. Works as an Obsidian vault. Configurable via `skills.config.wiki.path`. | `research/llm-wiki` |
+| `llm-wiki` | Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base. Ingest sources, query compiled knowledge, and lint for consistency. Unlike RAG, the wiki compiles knowledge once and keeps it current. Works as an Obsidian vault. Wiki path is controlled by the `WIKI_PATH` env var (defaults to `~/wiki`). | `research/llm-wiki` |
 | `domain-intel` | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required. | `research/domain-intel` |
 | `duckduckgo-search` | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Prefer the `ddgs` CLI when installed; use the Python DDGS library only after verifying that `ddgs` is available in the current runtime. | `research/duckduckgo-search` |
 | `ml-paper-writing` | Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verificatio… | `research/ml-paper-writing` |
@@ -359,8 +359,8 @@ Skills can declare their own configuration settings via their SKILL.md frontmatt
 ```yaml
 skills:
  config:
-    wiki:
-      path: ~/wiki          # Used by the llm-wiki skill
+    myplugin:
+      path: ~/myplugin-data   # Example — each skill defines its own keys
 ```

 **How skill settings work:**
@@ -372,7 +372,7 @@ skills:
 **Setting values manually:**

 ```bash
-hermes config set skills.config.wiki.path ~/my-research-wiki
+hermes config set skills.config.myplugin.path ~/myplugin-data
 ```

 For details on declaring config settings in your own skills, see [Creating Skills — Config Settings](/docs/developer-guide/creating-skills#config-settings-configyaml).
@@ -155,10 +155,10 @@ Skills can also declare non-secret config settings (paths, preferences) stored i
 metadata:
  hermes:
    config:
-      - key: wiki.path
-        description: Path to the wiki directory
-        default: "~/wiki"
-        prompt: Wiki directory path
+      - key: myplugin.path
+        description: Path to the plugin data directory
+        default: "~/myplugin-data"
+        prompt: Plugin data directory path
 ```

 Settings are stored under `skills.config` in your config.yaml. `hermes config migrate` prompts for unconfigured settings, and `hermes config show` displays them. When a skill loads, its resolved config values are injected into the context so the agent knows the configured values automatically.
@@ -14,7 +14,7 @@ If you have a paid [Nous Portal](https://portal.nousresearch.com) subscription,

 ## Text-to-Speech

-Convert text to speech with six providers:
+Convert text to speech with seven providers:

 | Provider | Quality | Cost | API Key |
 |----------|---------|------|---------|
@@ -23,6 +23,7 @@ Convert text to speech with six providers:
 | **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
 | **MiniMax TTS** | Excellent | Paid | `MINIMAX_API_KEY` |
 | **Mistral (Voxtral TTS)** | Excellent | Paid | `MISTRAL_API_KEY` |
+| **Google Gemini TTS** | Excellent | Free tier | `GEMINI_API_KEY` |
 | **NeuTTS** | Good | Free | None needed |

 ### Platform Delivery
@@ -39,7 +40,7 @@ Convert text to speech with six providers:
 ```yaml
 # In ~/.hermes/config.yaml
 tts:
-  provider: "edge"              # "edge" | "elevenlabs" | "openai" | "minimax" | "mistral" | "neutts"
+  provider: "edge"              # "edge" | "elevenlabs" | "openai" | "minimax" | "mistral" | "gemini" | "neutts"
  speed: 1.0                    # Global speed multiplier (provider-specific settings override this)
  edge:
    voice: "en-US-AriaNeural"   # 322 voices, 74 languages
@@ -61,6 +62,9 @@ tts:
  mistral:
    model: "voxtral-mini-tts-2603"
    voice_id: "c69964a6-ab8b-4f8a-9465-ec0925096ec8"  # Paul - Neutral (default)
+  gemini:
+    model: "gemini-2.5-flash-preview-tts"  # or gemini-2.5-pro-preview-tts
+    voice: "Kore"               # 30 prebuilt voices: Zephyr, Puck, Kore, Enceladus, Gacrux, etc.
  neutts:
    ref_audio: ''
    ref_text: ''
@@ -77,6 +81,7 @@ Telegram voice bubbles require Opus/OGG audio format:
 - **OpenAI, ElevenLabs, and Mistral** produce Opus natively — no extra setup
 - **Edge TTS** (default) outputs MP3 and needs **ffmpeg** to convert:
 - **MiniMax TTS** outputs MP3 and needs **ffmpeg** to convert for Telegram voice bubbles
+- **Google Gemini TTS** outputs raw PCM and uses **ffmpeg** to encode Opus directly for Telegram voice bubbles
 - **NeuTTS** outputs WAV and also needs **ffmpeg** to convert for Telegram voice bubbles

 ```bash
@@ -284,8 +284,40 @@ MATRIX_RECOVERY_KEY=EsT... your recovery key here

 On each startup, if `MATRIX_RECOVERY_KEY` is set, Hermes imports cross-signing keys from the homeserver's secure secret storage and signs the current device. This is idempotent and safe to leave enabled permanently.

-:::warning
-If you delete the `~/.hermes/platforms/matrix/store/` directory, the bot loses its encryption keys. You'll need to verify the device again in your Matrix client. Back up this directory if you want to preserve encrypted sessions.
+:::warning[Deleting the crypto store]
+If you delete `~/.hermes/platforms/matrix/store/crypto.db`, the bot loses its encryption identity. Simply restarting with the same device ID will **not** restore E2EE: the homeserver still holds one-time keys signed with the old identity key, so peers cannot establish new Olm sessions.
+
+Hermes detects this condition on startup and refuses to enable E2EE, logging: `device XXXX has stale one-time keys on the server signed with a previous identity key`.
+
+**Easiest recovery: generate a new access token** (which gets a fresh device ID with no stale key history). See the "Upgrading from a previous version with E2EE" section below. This is the most reliable path and avoids touching the homeserver database.
+
+**Manual recovery** (advanced — keeps the same device ID):
+
+1. Stop Synapse and delete the old device from its database (the commands below assume a SQLite-backed Synapse at `/var/lib/matrix-synapse/homeserver.db`):
+   ```bash
+   sudo systemctl stop matrix-synapse
+   sudo sqlite3 /var/lib/matrix-synapse/homeserver.db "
+     DELETE FROM e2e_device_keys_json WHERE device_id = 'DEVICE_ID' AND user_id = '@hermes:your-server';
+     DELETE FROM e2e_one_time_keys_json WHERE device_id = 'DEVICE_ID' AND user_id = '@hermes:your-server';
+     DELETE FROM e2e_fallback_keys_json WHERE device_id = 'DEVICE_ID' AND user_id = '@hermes:your-server';
+     DELETE FROM devices WHERE device_id = 'DEVICE_ID' AND user_id = '@hermes:your-server';
+   "
+   sudo systemctl start matrix-synapse
+   ```
+   Or via the Synapse admin API (note the URL-encoded user ID):
+   ```bash
+   curl -X DELETE -H "Authorization: Bearer ADMIN_TOKEN" \
+     'https://your-server/_synapse/admin/v2/users/%40hermes%3Ayour-server/devices/DEVICE_ID'
+   ```
+   Note: deleting a device via the admin API may also invalidate the associated access token. You may need to generate a new token afterward.
+
+2. Delete the local crypto store and restart Hermes:
+   ```bash
+   rm -f ~/.hermes/platforms/matrix/store/crypto.db*
+   # restart hermes
+   ```
+
+Other Matrix clients (Element, matrix-commander) may cache the old device keys. After recovery, type `/discardsession` in Element to force a new encryption session with the bot.
 :::

 :::info
@@ -361,6 +393,10 @@ pip install 'hermes-agent[matrix]'

 ### Upgrading from a previous version with E2EE

+:::tip
+If you also manually deleted `crypto.db`, see the "Deleting the crypto store" warning in the E2EE section above for the additional steps needed to clear stale one-time keys from the homeserver.
+:::
+
 If you previously used Hermes with `MATRIX_ENCRYPTION=true` and are upgrading to
 a version that uses the new SQLite-based crypto store, the bot's encryption
 identity has changed. Your Matrix client (Element) may cache the old device keys